Screaming Frog Explained: Overview Tab Definitions

July 27, 2016 | Sean McQuaide


Screaming Frog is an endlessly useful tool that allows you to quickly identify issues your website might have. While it provides you with an immense amount of data, it doesn’t do the best job of explaining the implications of each item it counts.

I found myself repeatedly Googling explanations for some of the data Screaming Frog provides and thought it was time to create a single resource that explains them all. The list below corresponds with the overview tab in Screaming Frog. I have included a brief explanation of each and relevant resources I find myself using again and again.


Jump to Sections

Doc Types
Protocol
Response Codes
URI
Page Titles
Meta Descriptions
H1
H2
Images
Directives
AJAX
Depth
Response Time

Doc Types

HTML

This is a count of HTML files Screaming Frog found during its crawl. Minifying HTML files can help improve page speed. Also, using HTML5 elements can help search engines better understand content.
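For illustration only (this markup is not from the original post), HTML5 semantic elements like these give search engines more structural context than generic div wrappers:

<header>Site branding and navigation</header>
<main>
  <article>
    <h1>Post title</h1>
    <p>Post content goes here.</p>
  </article>
</main>
<footer>Contact and copyright details</footer>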

Helpful resource about HTML standards and performance:

JavaScript

This is a count of JavaScript files Screaming Frog found during its crawl. Minifying JS files and optimizing file delivery can improve page speed.
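As a hedged example of optimizing delivery (the file names are placeholders), non-critical scripts can be loaded with the async or defer attribute so they don't block rendering:

<script src="analytics.js" async></script>
<script src="main.min.js" defer></script>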

Helpful resource about JavaScript standards and performance:

CSS

This is a count of CSS files Screaming Frog found during its crawl. Minifying CSS files and optimizing file delivery can improve page speed.
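As a small illustration (the file name is a placeholder), referencing a minified stylesheet, optionally preloaded, helps the browser fetch styles sooner:

<link rel="preload" href="styles.min.css" as="style">
<link rel="stylesheet" href="styles.min.css">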

Helpful resources for improving CSS:

Images

This is a count of images Screaming Frog found during its crawl. Optimizing images can drastically reduce page load times.
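For example (illustrative markup with placeholder file names), serving scaled images with srcset lets the browser download the smallest file that fits the viewport:

<img src="dress-800.jpg"
     srcset="dress-400.jpg 400w, dress-800.jpg 800w, dress-1600.jpg 1600w"
     sizes="(max-width: 600px) 400px, 800px"
     alt="Green summer dress">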

Here’s a helpful resource for optimizing images:

PDF

This is a count of PDFs Screaming Frog found during its crawl. Google is very good at crawling and ranking PDFs. The downside is that because a PDF is not a webpage, it cannot pass along the link authority it acquires. Another downside is that you can’t add a Google Analytics tag to a PDF. See the resources below to improve the SEO of PDFs and track them in GA.
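As a rough sketch of the tracking workaround, assuming the site runs Universal Analytics (analytics.js), a click on the PDF link can send an event; the file name and event labels below are placeholders:

<a href="/downloads/pricing-guide.pdf"
   onclick="ga('send', 'event', 'PDF', 'Download', 'Pricing Guide');">Download the pricing guide</a>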

Helpful resources for improving PDF SEO & Tracking:

Flash

This is a count of Flash files Screaming Frog found during its crawl; these represent the types of plugins and content that may still use Flash.

Other

This is a count of any other file types Screaming Frog found during its crawl that do not fit into the categories above.


Protocol

HTTP vs. HTTPS

This is a count of the URLs Screaming Frog crawled under each protocol. There is nothing wrong with having a count for both. HTTPS adds latency to page load time, so many sites secure only certain sections, mainly logins and checkout carts, while the rest of the site is served over HTTP. The concern here is control. You should be in control of which URL Google is ranking. Make sure you’re forcing the correct version of the page to load, and use canonical tags correctly.
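For example, if both the HTTP and HTTPS versions of a page resolve, a canonical tag on both versions pointing at the preferred URL (a hypothetical secure checkout page is shown here) signals which version should rank:

<link rel="canonical" href="https://www.example.com/checkout/" />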

Helpful resources for controlling HTTP and HTTPS URLs:

 


Response Codes

Blocked by Robots

This is a count of URIs that are blocked by a site’s robots.txt file. Whether a URL should be blocked from being indexed needs to be evaluated on a case-by-case basis, but if you want a page to be indexed by search engines, it should not be blocked in the robots.txt file.
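As a minimal robots.txt sketch (the directory is a placeholder), a rule like this is what produces entries in this count:

User-agent: *
Disallow: /private/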

Here’s a helpful resource for using robots.txt:

No Response

This is a count of URIs that did not return a response code, typically because the server was slow or busy and Screaming Frog timed out and moved on. By default, the SEO Spider will wait 10 seconds to get any kind of HTTP response from a URL. You can increase the time Screaming Frog waits for a page to return a response code; see the resource link below for how to do that.

Here’s a helpful resource for resolving no response codes:

Success (2xx)

This is a count of URIs with a successful response code. This class of status codes indicates the action requested by the client was received, understood, accepted, and processed successfully. The most common response code is 200, but there are 9 others that describe different types of successful responses. From an SEO perspective, we want this category to have the highest count of all the response code classes.

Here are some helpful resources for response codes:

Redirection (3xx)

This is a count of URIs that redirect to another URI. This class of status code indicates the client must take additional action to complete the request; many of these status codes are used in URL redirection. 301 and 302 redirects are the most common. For SEO purposes it is still best practice to use a 301 when permanently redirecting a URL and a 302 when temporarily redirecting a URL. You will sometimes see a 307 response code; a 307 is also a temporary redirect. A high number of 302 redirects is a cause for concern and should be investigated further.
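As an illustration only, assuming an Apache server and placeholder paths, a permanent and a temporary redirect can be declared like this:

Redirect 301 /old-page/ https://www.example.com/new-page/
Redirect 302 /summer-sale/ https://www.example.com/current-promotion/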

Here are some helpful resources for response codes:

Client Error (4xx)

This is a count of URIs that link to pages that no longer exist. The 4xx class of status code is intended for situations in which the client seems to have erred. Errors can occur for a variety of reasons, but most commonly a 404 response code is returned to indicate that the URL crawled points to a resource that no longer exists. Webmasters with a high number of 404 pages would be wise to analyze how important those pages were and 301 redirect the URLs accordingly.

Here are some helpful resources for response codes:

Server Error (5xx)

Response status codes beginning with the digit “5” indicate cases in which the server is aware that it has encountered an error or is otherwise incapable of performing the request.

Here are some helpful resources for response codes:

 


URI

Non-ASCII Characters

This is a count of the URIs that contain non-ASCII characters. URLs can only contain characters from the ASCII character set, so characters outside that set have to be converted into a valid ASCII format. URL encoding replaces unsafe characters with a “%” followed by two hexadecimal digits. You’ll commonly see this when a URL contains a space, which is encoded as %20, and with international URLs.
Example:
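http://www.example.com/summer dresses (unencoded, illustrative URL)
http://www.example.com/summer%20dresses (encoded)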

Helpful Resources:

Underscores

This is a count of the number of URLs that contain underscores. The general consensus is that dashes are easier for users to read and understand than underscores. But underscores are still considered an acceptable way to separate words in a URL.

Uppercase

This is a count of the number of URLs that contain an uppercase letter. The best practice for URLs is to force the CMS to use lowercase URLs. This is recommended because technically the following URLs are unique, and if they render the same content, they may be creating a duplicate content issue.

Example:
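http://www.example.com/Blue-Widgets
http://www.example.com/blue-widgets
(Illustrative URLs: both may return the same content, but they are counted as two unique URLs.)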

Duplicate

This is a count of the number of URLs that potentially contain duplicate content. This count filters for all duplicate pages found via the hash value. If two hash values match, the pages are exactly the same in content.

Parameters

This is a count of URLs that contain URL parameters. URLs that use parameters could be an indication that content is being dynamically generated on the site or that the site has faceted navigation, among many other possible reasons. This can lead to content not being indexed, or to duplicate content issues. If you have legitimate URL parameters, they should be added to Google Search Console so that Google knows how to handle them.

Example: 

http://example.com/shop/index.php?product_id=32&highlight=green+dress&cat_id=1&sessionid=123&affid=431

Helpful Resources: 

Over 115 Characters

This is a count of URIs longer than 115 characters. Shorter URLs tend to have a higher click-through rate. Keeping URLs short is not always possible, e.g. for news articles, but if you find that a business’s service pages exceed this length, you may want to consider shortening them.

 


Page Titles

Missing

This is a count of pages missing a title tag. Titles are one of the most important things on a page. A page without a title is at a severe disadvantage when it comes to ranking for any search query. You’ll want to export this list and make sure that titles are added to these pages.

Duplicate

This is a count of pages that have exactly the same title as another page. Duplicate titles can make it hard for Google to choose between pages when deciding which page to rank. It is a best practice to make sure all titles are unique. Oftentimes you’ll see paginated pages in this duplicate title list, e.g. page 2 of a blog post list.

Over 55 Characters

This is the count of titles that exceed 55 characters. Character length does not affect rankings, but it does impact click-through rate. By default, Screaming Frog has this set to 65 characters, but 55 characters is a more accurate representation of the 512px cutoff, so you should modify this before you begin your crawl (Configuration > Spider > Preferences > Page Title Width). While 512px is a more accurate representation of how Google will truncate the title of a page, the character count is easier for people to wrap their heads around, and comparing the two will help with your analysis. Exporting this list and then removing the branding at the end of each title (as there should be) will give you a more accurate picture of the types of pages that exceed the character length.

Below 30 Characters

This is the count of titles that are below 30 characters. Short titles are an indication that a title has not been properly optimized. Commonly you’ll see a list of single word titles (followed by branding) in this list. These are titles which will require more keyword research. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Over 512px

Similar to “Over 55 Characters”, this count shows the number of titles that exceed 512 pixels. 512 pixels is the commonly used measurement for title truncation. By default, Screaming Frog has this set to a max of 486px; you’ll want to change this to 512px before crawling your site. Titles that exceed this pixel width are at risk of not showing relevant keywords to the user, which can reduce click-through rate. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Below 200px

Similar to “Below 30 Characters”, this count shows the number of titles that are below 200 pixels. This is Screaming Frog’s default pixel width. We recommend using the count of titles below 30 characters as a measure of titles that need more keyword research. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Same as H1

This is a count of titles that are identical to the H1. Sometimes a CMS will pull the title from the H1 of a page. Sometimes this is an acceptable practice, but it is recommended that the title be a unique element of the page, not a copy of another element.

Multiple

This is a count of the pages that have more than one title tag on the page. There is a great article on Moz.com testing how Google interprets multiple title tags.

 


Meta Descriptions

Missing

This is a count of pages missing a meta description. Missing meta descriptions are a missed opportunity to differentiate your page from other pages in search results.

Duplicate

This is the count of duplicate meta descriptions. Each meta description should be unique to each page. Duplicates indicate that unique meta descriptions have not been written for each page. We recommend tackling the more important pages of the site, like the homepage and service pages, first. Then move on to lesser pages.

Over 155 Characters

This is the count of pages that have meta descriptions that exceed 155 characters. 155 characters is the average truncation point for meta descriptions, but we have seen longer descriptions displayed. Remember that meta descriptions are meant to be an ad for your organic listing, so a lengthy meta description is an indication that it is not doing the best job of selling the page. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Below 70 Characters

This is the count of pages that have meta descriptions below 70 characters. Meta descriptions below a certain character count are also an indication that little optimization has been done in this area. If high-value pages are in this list, meta descriptions should be written for each. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Over 928px

This is a count of the number of meta descriptions over 928 pixels. A high count here means that descriptions are probably being generated programmatically from page content. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Below 400px

This is a count of the number of meta descriptions below 400 pixels. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Multiple

This is a count of pages that contain more than one meta description. Like a title, there should only be one meta description on each page. Look at the source code to determine which is the correct description, then have your developer remove the rest.

 


H1

Missing

This is a count of the pages that do not contain an H1 tag. Heading tags exist as part of a hierarchy, and the H1 is king. Each page should have an H1 that is being used in a meaningful way.

Duplicate

This is a count of pages whose H1 is identical to the H1 on another page. Like duplicate titles, duplicate H1s can make it harder for search engines to differentiate pages, so identify the duplicates and rewrite any unnecessary ones.

Over 70 Characters

This is a count of the number of pages whose H1 tag is greater than 70 characters in length. This is just data for data’s sake. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Multiple

This is a count of the number of pages that have more than one H1 tag. If the site is not using the HTML5 standard, as indicated by the doctype declaration in the first line of the document, and/or is not properly outlining its documents with sections, then there should only be one H1 tag on each page. If the site is using the HTML5 standard, then these multiple H1s need to be contained within sections of the page, as in the sketch below.
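A hedged sketch of that HTML5 pattern, with each H1 scoped to its own section:

<!DOCTYPE html>
<html>
<body>
  <section>
    <h1>First section heading</h1>
  </section>
  <section>
    <h1>Second section heading</h1>
  </section>
</body>
</html>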

 


H2

Missing

This is a count of the pages that do not contain an H2 tag. H2s are a part of proper page structure. They are also the second most important part of the page after the H1. Thus not having them can be a missed opportunity.

Duplicate

This is a count of the pages that contain identical H2 tags. Sometimes H2s are used to wrap template sections like navigation elements or logos. This is not a problem.

Over 70 Characters

This is the number of pages where the H2 content exceeds 70 characters. You should not be worried about the length of your H2s. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Multiple

This is a count of the pages that contain more than one H2 tag. Multiple H2s are appropriate and expected on a site with good page structure.

 


Images

Over 100kb

This is a count of the number of images that exceed 100kb in size. The file size of an image plays a large role in how quickly a page loads, because the browser must download the images before the page fully renders. You’re able to change this number using Screaming Frog’s advanced configuration settings (Configuration >>> Preferences).

Missing Alt Text

This is a count of images that are missing alt text. Alt text helps search engines and visually impaired individuals understand the content of your images. Keep that in mind and create short, descriptive alt text.
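For example (illustrative file name and wording), short, descriptive alt text might look like this:

<img src="red-running-shoes.jpg" alt="Pair of red running shoes on a white background">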

Alt text over 100 Characters

In some rare cases, alt text will be over 100 characters. Whether it’s due to someone stuffing keywords into this attribute or the CMS programmatically filling it, it’s an indication that little thought has been put into what is being added as alt text. Alt text is used by screen readers to help the visually impaired understand the contents of an image. Keep that in mind and create short, descriptive alt text.

 


Directives

Canonical

This is a list of pages that contain a canonical tag.

Example: If the URL is http://www.example.com/abc.html, the canonical element would be:

<link rel="canonical" href="http://www.example.com/abc.html" />

Canonicalized

This is a list of pages that have a canonical tag, but the canonical link is different than the page URL. These are “canonicalized” pages.

Example: If the URL is http://www.example.com/abc.html, the canonical element might be:

<link rel="canonical" href="http://www.example.com/xyz.html" />

No Canonical

This is a list of pages that are missing a canonical tag.

Next/Prev

This is a list of pages that have a next/prev element.

<link rel="next" href="http://www.example.com/article?story=abc&page=3" />
<link rel="prev" href="http://www.example.com/article?story=abc&page=1" />

Helpful resource about using next/prev markup:

Index

This is a count of the pages that have an index tag. The index tag instructs search engines to index a page. This is an unnecessary tag as search engines will index your pages by default.

<meta name="robots" content="index">
<meta name="robots" content="index,follow">

Noindex

This is a list of pages that have a noindex tag on them. The noindex tag tells search engines not to include the page in search results.

Example:

<meta name="robots" content="noindex">

Follow

This is a list of pages that have a follow tag on them. The follow directive tells search engines they may follow the links on the page; like index, this is the default behavior.

Examples:

<meta name="robots" content="follow">
<meta name="robots" content="noindex,follow">

Nofollow

This is a list of pages that have a nofollow tag on them. A nofollow meta tag tells search engines not to follow any of the links on the page.

Examples:

<meta name="robots" content="nofollow">
<meta name="robots" content="noindex,nofollow">

NoArchive

This is a list of pages that have a no archive tag. The noarchive tag tells search engines not to store a cached copy of the page.

Example:

<meta name="robots" content="noarchive">

NoSnippet

This is a list of pages that have a nosnippet meta tag. This tag instructs search engines not to show a snippet (description) in search results; it also instructs them not to show a cached link in search results.

Example:

<meta name="robots" content="nosnippet">

NoODP

This is a list of pages that have a noodp meta tag. ODP stands for Open Directory Project. The noodp tag instructs search engines to not use the description archived in the Open Directory Project.

Example:

<meta name="robots" content="noodp">

NoYDIR

This is a list of pages that have a noydir tag. YDIR stands for Yahoo Directory. The noydir tag instructs search engines to not use the description archived in the Yahoo Directory.

Example:

<meta name="robots" content="noydir">

NoImageIndex

This is a count of pages that contain a noimageindex meta tag. A noimageindex tag tells search engines not to index images on this page.

Example:

<meta name="robots" content="noimageindex">

NoTranslate

This is a count of pages that contain a notranslate meta tag. A notranslate meta tag tells search engines not to offer a translation for the page.

Example:

<meta name="robots" content="notranslate">

Unavailable_After

This is a count of pages that contain an unavailable_after meta tag. An unavailable_after tag tells search engines to not show a page in search results after a specific date/time.

Example:

<meta name="robots" content="unavailable_after: 21-Jul-2016 18:00:00 EST">

Refresh

This is a count of pages that contain a meta refresh tag. This defines how long (in seconds) until the page is refreshed or the user is redirected.

Example: 

<meta http-equiv="refresh" content="30">
<meta http-equiv="refresh" content="30; ,URL=http://www.example.com/login">

 


AJAX – With and Without Hash Fragment

These filters relate to Google’s old AJAX crawling scheme, in which AJAX pages expose a hash fragment (#!) in the URL so crawlers can request an HTML snapshot of the page. The counts show which crawled AJAX URLs use a hash fragment and which do not.

 


Depth

This is the depth of a page from the start page (number of ‘clicks’ away from the start page).

 


Response Time (sec)

This is the time in seconds it takes to download a page.