Five HTTP Header Fields Every SEO Should Know
When we conduct technical analyses of our clients’ websites, we can gain a great deal of insight by reviewing the HTTP response headers returned when we issue requests to their servers. Whether we’re checking for chained redirects or working to identify inconsistent canonical URLs, we benefit from understanding the many HTTP header fields returned in each server response. Let’s look at five HTTP response header fields that are particularly pertinent to our efforts as SEOs.
While the response status isn’t so much a header field as it is part of the Status-Line in an HTTP response, we can think of it as a specific piece of pertinent information (like a field) that we ought to understand. The status comes in the form of an HTTP status code, which gives us immediate feedback on the status of the requested resource.
HTTP/1.1 301 Moved Permanently
HTTP/1.1 200 OK
HTTP/1.1 404 Not Found
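To make the structure of the status line concrete, here is a minimal Python sketch that splits a line like the ones above into its protocol version, status code, and reason phrase, then names the general class of the code from its first digit. The names `StatusLine`, `parse_status_line`, and `status_class` are hypothetical helpers invented for this illustration, not part of any library.

```python
from typing import NamedTuple


class StatusLine(NamedTuple):
    version: str
    code: int
    reason: str


def parse_status_line(line: str) -> StatusLine:
    """Split a line such as 'HTTP/1.1 301 Moved Permanently' into its parts."""
    # The reason phrase may contain spaces (or be missing), so split at most twice.
    parts = line.strip().split(" ", 2)
    if len(parts) < 2:
        raise ValueError(f"not a status line: {line!r}")
    reason = parts[2] if len(parts) == 3 else ""
    return StatusLine(parts[0], int(parts[1]), reason)


def status_class(code: int) -> str:
    """Map a status code to its general class using its first digit."""
    classes = {
        1: "informational",
        2: "success",
        3: "redirection",
        4: "client error",
        5: "server error",
    }
    return classes.get(code // 100, "unknown")
```

For an audit, the class is often enough for a first pass: anything in the redirection class deserves a look at where it points, and anything in the client or server error classes deserves a look at what links to it.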
The Server header field gives us the name of the server from which the HTTP response has been sent. This is especially useful in the early stages of an audit or when it comes time to implement some server-side redirects. Depending on the server type, we might deliver a completely different set of instructions for implementing the redirects we’ve mapped. An Apache server might call for some .htaccess edits/additions, while an IIS server might call for some work with the URL Rewrite Module. In any case, it’s important that we understand what types of servers our clients are using.
Server: Apache/2.2.23 (Unix)
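As a rough illustration of how the Server value can steer our redirect recommendations, the Python sketch below pulls the product name out of the field and looks up a hint. The helper names and the wording of the hints are assumptions made for this example, and the lookup only covers the two servers discussed above. Keep in mind that a server can be configured to hide or alter this field, so treat the result as a clue rather than proof.

```python
def server_product(server_value: str) -> str:
    """Return the lowercased product name from a Server field value."""
    # 'Apache/2.2.23 (Unix)' -> first token 'Apache/2.2.23' -> 'apache'
    tokens = server_value.strip().split()
    if not tokens:
        return ""
    return tokens[0].split("/", 1)[0].lower()


# Hypothetical lookup table; extend it for the servers you actually meet.
REDIRECT_HINTS = {
    "apache": ".htaccess edits or additions",
    "microsoft-iis": "rules in the URL Rewrite Module",
}


def redirect_hint(server_value: str) -> str:
    """Suggest where redirects are likely to be implemented for this server."""
    product = server_product(server_value)
    return REDIRECT_HINTS.get(product, "unrecognized server; confirm with the client")
```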
When the requested resource has been redirected to a new URL (or a new resource has been created), we can check the Location field to see the URL that the request is redirected to (or that is home to the new resource). The value is typically an absolute URL, although relative references are also permitted. This is great for identifying patterns in chained redirects and unveiling generations of legacy URLs, as we can see the location of each redirect along the path to the eventual destination URL. It is important that, when checking response headers, you use a tool that shows every consecutive response in a redirect chain (rather than simply the final response).
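The idea of walking a redirect chain one response at a time can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a finished tool: `trace_redirects` and the `Fetcher` type are hypothetical names, and the function expects you to supply a `fetch` callable that requests a URL without following redirects and returns the status code plus a dictionary of headers keyed as `"Location"`. Here the fetcher is a table of canned responses so the example runs without touching the network.

```python
from typing import Callable, Dict, List, Tuple
from urllib.parse import urljoin

# A fetcher takes a URL and returns (status_code, headers) WITHOUT following redirects.
Fetcher = Callable[[str], Tuple[int, Dict[str, str]]]

REDIRECT_CODES = {301, 302, 303, 307, 308}


def trace_redirects(url: str, fetch: Fetcher, max_hops: int = 10) -> List[Tuple[str, int]]:
    """Return every (url, status) pair on the way to the final response."""
    chain = []
    for _ in range(max_hops + 1):
        status, headers = fetch(url)
        chain.append((url, status))
        location = headers.get("Location")
        if status not in REDIRECT_CODES or not location:
            return chain
        # Location may be a relative reference, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError(f"more than {max_hops} redirects (possible loop)")


# Canned responses standing in for a real server (made-up URLs).
canned = {
    "http://example.com/old": (301, {"Location": "http://example.com/older"}),
    "http://example.com/older": (301, {"Location": "/new"}),
    "http://example.com/new": (200, {}),
}

for hop_url, hop_status in trace_redirects("http://example.com/old", canned.__getitem__):
    print(hop_status, hop_url)
```

In practice the fetcher could wrap any HTTP client that is configured not to follow redirects automatically. The hop limit is there so that a redirect loop raises an error instead of running forever.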
Link is the HTTP header field that indicates that the requested resource has some type of relationship with another resource, whose URL is included in the field value. One such relationship we might encounter is the specification of a canonical URL in the Link header field. It’s not something we come across frequently, but it is especially handy for resources that have no HTML <head> in which to place a canonical tag (PDFs, for instance). Google announced support for this in June 2011.
Link: <http://www.example.com/white-paper.html>; rel="canonical"
Link: </feed>; rel="alternate"
Here’s a great resource on implementing rel="canonical" in HTTP headers.
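When auditing, it helps to pull the canonical target out of a Link field programmatically. The Python sketch below does that with two regular expressions. `find_link` is a hypothetical helper written for this example, and it is deliberately simplified: it assumes no commas appear inside quoted parameter values, so it is not a complete parser for the Link field syntax.

```python
import re
from typing import Optional

# One link-value: <target> followed by its parameters, up to the next comma.
_LINK_RE = re.compile(r'<([^>]*)>([^,]*)')
# The rel parameter, either quoted (possibly several space-separated values) or bare.
_REL_RE = re.compile(r';\s*rel\s*=\s*(?:"([^"]*)"|([^\s;,"]+))', re.IGNORECASE)


def find_link(header_value: str, rel: str) -> Optional[str]:
    """Return the target of the first link whose rel list contains `rel`."""
    for target, params in _LINK_RE.findall(header_value):
        match = _REL_RE.search(params)
        if not match:
            continue
        rels = (match.group(1) or match.group(2) or "").lower().split()
        if rel.lower() in rels:
            return target
    return None


value = '<http://www.example.com/white-paper.html>; rel="canonical"'
print(find_link(value, "canonical"))
```

Comparing the returned target with the URL that was requested, and with any canonical tag in the HTML, is a quick way to spot the inconsistent canonical URLs mentioned at the top of this post.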
Just as the Link HTTP header field allows us to stipulate a canonical resource in the HTTP response, the X-Robots-Tag HTTP header field allows us to specify robots directives (not unlike those specified in the HTML robots <meta> tag). Google has provided a nice resource on using this header field, complete with examples of how to target specific user-agents with your directives. If a page isn’t indexed, and you can’t seem to figure out why, remember to check for the X-Robots-Tag header field!
X-Robots-Tag: googlebot: nofollow
X-Robots-Tag: unavailable_after: 25 Jun 2010 15:00:00 PST
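The two examples above show why this field is slightly awkward to read by machine: a leading `name:` can be either a user-agent (`googlebot:`) or a directive that carries its own value (`unavailable_after:`). The Python sketch below handles both cases. The function names are hypothetical, and the set of value-carrying directives reflects my understanding of Google's documentation, so treat that list as an assumption and adjust it as needed. It also assumes the directive values contain no commas.

```python
from typing import List, Optional, Tuple

# Directives written as "name: value". A leading "name:" from this set is
# NOT a user-agent. Assumed list; extend it if you meet others.
VALUED_DIRECTIVES = {
    "unavailable_after",
    "max-snippet",
    "max-image-preview",
    "max-video-preview",
}


def parse_x_robots_tag(value: str) -> Tuple[Optional[str], List[str]]:
    """Return (user_agent or None, [directives]) for one X-Robots-Tag value."""
    agent = None
    prefix, sep, rest = value.partition(":")
    name = prefix.strip().lower()
    # A user-agent prefix is a single token that is not a known valued directive.
    single_token = bool(name) and not any(ch in name for ch in ", ")
    if sep and single_token and name not in VALUED_DIRECTIVES:
        agent = name
        value = rest
    directives = [d.strip() for d in value.split(",") if d.strip()]
    return agent, directives


def blocks_indexing(value: str, crawler: str = "googlebot") -> bool:
    """True if this value tells `crawler` (or every crawler) not to index."""
    agent, directives = parse_x_robots_tag(value)
    if agent is not None and agent != crawler.lower():
        return False
    return any(d.lower() in {"noindex", "none"} for d in directives)
```

Running `blocks_indexing` over every X-Robots-Tag value in a response is a quick first check when a page is unexpectedly missing from the index.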
These are just five of the many HTTP header fields that we might encounter on the technical grind. Many others deserve a place in this conversation (Vary for mobile and Content-Length, to name two), but alas, the conversation must go on. Share the particulars of your favorite HTTP response header fields in the comments below. What other types of valuable information do you garner from these fields?