How does a redirect work
Redirecting is a simple procedure that belongs to HTTP, not HTML. A redirect (status codes such as 301 or 303) makes the browser send a new request for a different location. This happens in the background and cannot be seen under normal circumstances. So much so that your final redirected page should show a regular 200 OK status code.
If you would like to know what is going on in the background, you are in the right place. This page explains how a redirect works. If you have any further questions, please leave a comment at the end!
This page explains some of the fundamentals of redirecting but not any implementation details. However, at the end of this document you will find some links to resources on the web that show how to implement a redirect.
Fundamentals for request and response
Before we begin with redirects, I would like to explain how a plain, regular document transfer works in HTTP. It is the foundation for redirects because the end result will ultimately be a regular request and response with a 200 OK status code.
A normal response
Well then, here is how a "normal" transaction works in HTTP. Note that HTML is about the document's content only. For any webpage, the document transfer is the responsibility of HTTP, not HTML.
A user agent (browser) sends a request for a document with a URI that looks something like this: http://www.domainname.com/path/documentname.html. The leading http names the protocol to use for the transfer. In this regular case the browser sends the request to TCP port 80 of the server named www.domainname.com. The server in return sends a response. Keep these two things in mind, because they always work together: request→response.
The response always contains a status code, which is 200 OK if everything is normal. If there are problems or special situations like our redirects, the response carries a different status code. Watch out: the status codes are part of the HTTP specification, not HTML!
When we redirect we send a response telling the requester to send a different request! In a nutshell, that is the whole secret of redirecting; but we are not there yet.
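The request→response pair described above can be tried out locally. The following is a minimal sketch using only Python's standard library; the handler class, the helper name fetch_once and the page content are made up for illustration, and the server is a throwaway one on the local machine rather than a real web server.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PlainHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with a small HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello</body></html>"
        self.send_response(200)  # status line: 200 OK
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()       # the HTTP header part ends here ...
        self.wfile.write(body)   # ... and the HTML document follows

    def log_message(self, format, *args):
        pass  # suppress the default request log


def fetch_once(path="/path/documentname.html"):
    """Start a throwaway local server, send one request, return the response."""
    server = HTTPServer(("127.0.0.1", 0), PlainHandler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", path)        # the request
        response = conn.getresponse()    # the response
        result = (response.status, response.reason, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch_once())
```

The status code and reason come from HTTP, while the bytes returned by read() are the HTML document; the two travel together but are separate things.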
Other server responses
Before we dive into redirects, let's look at two other common types of response from a web server. These are the status codes 404 and 500, which are not redirects but nevertheless do not return the expected web page.
Although we sometimes see the status code in the browser window, the user agent itself does not show it to you. The number you may see comes from the error document sent by the server, which is nothing more than a regular HTML document. A user agent like your browser simply displays this error page. In fact, the server can send any regular HTML document together with an error code, and a user will never know what happened just by looking at the page.
Page not found: If the server cannot locate the requested path or document, its response is 404. Together with the response a web page is returned, which is either the server's default page for this type of error or a specially designed page configured in the server.
Internal Server Error: If the web server has to pass the request to a script engine like PHP (CGI) and does not get back a proper response (or does not get one in time), it will return a 500 error. Like before, a web page is returned as well.
There are other types of errors, and we can also configure real redirects in the web server.
Why or when do we need a redirect? As we have seen before, if a document is not found our web server should respond with a 404 status code.
If the user just typed some silly request this is okay, but it is not very smart (or polite) when we know that we have moved or renamed the document. In all these cases we should catch the request and redirect to the new location or new document name.
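To illustrate the point about error pages above: the status code and the page shown to the user are independent of each other. This is a small sketch with a local test server from Python's standard library; the paths, page texts and the helper name fetch are hypothetical.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical site content: only this one path exists.
PAGES = {"/index.html": b"<html><body>Welcome</body></html>"}
NOT_FOUND_PAGE = b"<html><body><h1>Sorry, page not found</h1></body></html>"


class SiteHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path in PAGES:
            status, body = 200, PAGES[self.path]
        else:
            # The error page is an ordinary HTML document;
            # only the status code marks it as an error.
            status, body = 404, NOT_FOUND_PAGE
        self.send_response(status)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request log


def fetch(path):
    """Request one path from a throwaway local server; return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), SiteHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", path)
        response = conn.getresponse()
        result = (response.status, response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(fetch("/index.html"))
    print(fetch("/missing.html"))
```

A browser would render both bodies the same way; the 404 is only visible to whoever inspects the response itself.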
A very simple and common method is to leave a page with a note about the move and a link to the new location, so the user can update any bookmarks or links on a web page. We can do a little more with this simple method. Within the HTML document we can place a "refresh" meta tag. The refresh tag causes the browser to load a specified new location after a defined period.
Note: This is not considered a real redirect. It is handled by the browser once the whole page has been loaded, it is not part of HTTP, and it was not defined as a standard in HTML 4 either. In terms of informing the user this might be a good and valid option, but it has some limitations, as explained in this document at W3C.
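As an illustration of the "moved" page with a refresh meta tag described above, here is a small sketch that builds such a page as a string. The function name, the default delay and the wording of the note are assumptions chosen for this example.

```python
from html import escape


def moved_notice_page(new_url, delay_seconds=5):
    """Return an HTML page that announces a move and reloads to new_url.

    Hypothetical helper: the browser acts on the meta tag only after it
    has loaded this page, so the page itself is still served with 200 OK.
    """
    safe_url = escape(new_url, quote=True)  # keep the URL safe inside attributes
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={safe_url}">\n'
        "<title>Page moved</title>\n"
        "</head>\n<body>\n"
        f'<p>This page has moved to <a href="{safe_url}">{safe_url}</a>.</p>\n'
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    print(moved_notice_page("http://www.domainname.com/path/newname.html", 3))
```

The visible link matters as much as the meta tag: it still works if a user agent ignores the refresh.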
Redirect with HTTP
Again, the refresh meta tag lives in the HTML document and is not the preferred method. The preferred method is a redirect in HTTP. The response tells the user agent to send a new request based on the information it contains. Because this exchange happens at the HTTP level, before any page is displayed, the user agent (like your browser) follows it automatically and the user does not normally notice the redirect.
The status code goes into what is known as the header. The header is part of HTTP and not HTML. This is important for developers to understand, because it means the solution lies outside of anything related to creating an HTML document.
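What such a redirect response looks like from the outside can be sketched with a local test server. The example below assumes a hypothetical document /old.html that has moved to /new.html; the client used here (http.client) does not follow redirects on its own, so the 301 status and the Location header naming the new address stay visible.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class MovedHandler(BaseHTTPRequestHandler):
    """Hypothetical server: /old.html has moved permanently to /new.html."""

    def do_GET(self):
        if self.path == "/old.html":
            self.send_response(301)                    # the redirect status code
            self.send_header("Location", "/new.html")  # where to ask next
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<html><body>New page</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request log


def inspect_redirect(path="/old.html"):
    """Send one request without following redirects; return (status, Location)."""
    server = HTTPServer(("127.0.0.1", 0), MovedHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1])
        conn.request("GET", path)
        response = conn.getresponse()
        response.read()
        result = (response.status, response.getheader("Location"))
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(inspect_redirect("/old.html"))
    print(inspect_redirect("/new.html"))
```

A browser receiving the first response would immediately request /new.html and display only that page, which arrives with a regular 200 OK.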
Web Server redirects
In general, and as a first approach, redirects should be set up inside the web server, like we have seen above with regular transfers and error codes. You have to consult the documentation of your web server software to find out how to do that.
Apache httpd redirects
In Apache there are basically two methods, or modules to be precise. One is Alias: module mod_alias with a few Redirect directives. The other is Rewrite: module mod_rewrite (see URL Rewriting Guide) with the [R] (redirect) flag. With the rewrite module the redirect flag is usually used in combination with a condition set by the RewriteCond directive.
If you have a web application (like a CMS, WordPress, Joomla or Drupal) you have dynamic content, and links within that content probably change. Maybe you reorganize your document structure and move some documents around. Again, you could set those redirects in your web server, but you may prefer to set your HTTP redirects within your application or scripts.
Hence, you have to find out where or how you can set the header information. As a rule of thumb for basically all these scenarios, you will have to set your header information before any form of HTML code (this includes a simple space) has been sent to the web server. Once the web server receives HTML code, the header process is closed. Header information always has to precede document information.
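The "header first, document second" rule can be seen in most application interfaces. As one example, not taken from this page, here is a sketch using Python's WSGI interface, where the status and headers must be handed over through start_response before any part of the body is returned. The application name, the target URL and the call_app helper are assumptions for illustration.

```python
from wsgiref.util import setup_testing_defaults


def redirect_app(environ, start_response):
    """Hypothetical application that redirects every request."""
    # The status and the headers are handed over first ...
    start_response("301 Moved Permanently", [
        ("Location", "http://www.domainname.com/path/newname.html"),
        ("Content-Type", "text/html"),
    ])
    # ... and only afterwards may any part of the document follow.
    return [b"<html><body>This page has moved.</body></html>"]


def call_app(app, path="/path/documentname.html"):
    """Call a WSGI app directly, without a server, and capture its output."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal fake request
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


if __name__ == "__main__":
    print(call_app(redirect_app))
```

Other environments express the same ordering differently, so check how your own scripting language or framework exposes the header before relying on this pattern.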
Can I see the header information
It depends on what you are looking for. In the case of the HTTP redirect you will have a hard time seeing it without a sniffer (see the last option). The nature of this redirect is to work in the background and force a regular valid request without you knowing about it. Therefore you cannot see the redirect under normal circumstances. For all other cases we have the following options.
On the server-side: If you are looking for the header information before you send your document you may have functions to see what has been sent. Check and know your scripting language.
You can, however, query your own page programmatically and act as a client, if you must. For this you need functions that return header information, which gives you a client's point of view, sort of. If those functions follow redirects automatically, as many do, you will not see the redirect itself, only the final result. You can use this in a test application to see the final result or error codes, but you should not use it in the same procedure that sends your page.
On the client-side: In theory, a browser would be able to show the HTTP header information, but because you end up with the final redirected webpage, which is a regular response with the 200 OK status code, you simply cannot see the redirect in the browser itself. I have looked at all major browsers but have found none that gives you header information by default. Only with some developer extensions can you get and see the response header.
Google's Chrome browser has a Web Developer extension by chrispederick.com. Firefox has this web developer add-on and, guess what, it is also by chrispederick.com. In both tools you go to Information, and all the way at the bottom you will see View Response Headers. As far as I know these are about the only two browsers and locations where you can find out what the response status code is.
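Acting as a client from a test script could look like the following sketch. It follows each Location header by hand instead of letting a library do it, so every intermediate status code is recorded. The function trace_redirects, the demo server and its paths are hypothetical, and the sketch handles plain http URLs only.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, max_hops=5):
    """Request url, follow Location headers by hand, return (status, url) hops."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        target = parts.path or "/"
        if parts.query:
            target += "?" + parts.query
        conn = http.client.HTTPConnection(parts.hostname, parts.port or 80)
        conn.request("GET", target)
        response = conn.getresponse()
        response.read()
        location = response.getheader("Location")
        conn.close()
        hops.append((response.status, url))
        if response.status not in REDIRECT_CODES or not location:
            break
        url = urljoin(url, location)  # Location may be relative
    return hops


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical test server: /old.html redirects to /new.html."""

    def do_GET(self):
        if self.path == "/old.html":
            self.send_response(301)
            self.send_header("Location", "/new.html")
            self.send_header("Content-Length", "0")
            self.end_headers()
        else:
            body = b"<html><body>New page</body></html>"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default request log


def demo():
    """Run trace_redirects against a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        return trace_redirects(f"http://127.0.0.1:{port}/old.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for status, url in demo():
        print(status, url)
```

As the text says, this belongs in a separate test application, not in the code path that produces the page.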
If you have your web site in Google's Webmaster Tools you can use "Fetch as Googlebot" in Diagnostics. Once you get the success link you can click that and you will see the whole page including header information.
The last, extreme option for finding out the status code is to use a network sniffer or to fetch your own web page with telnet.
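What you would type into a telnet session can also be sent from a short script. The sketch below opens a plain TCP connection, sends a hand-written request and returns the first line of the raw response, which carries the status code. The helper names and the local test server are assumptions for illustration; against a real site you would pass its host name and port 80.

```python
import socket
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def raw_status_line(host, port, path):
    """Send a hand-written GET request and return the raw status line."""
    request = f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode("ascii")
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(request)
        data = b""
        while b"\r\n" not in data:  # read until the first line is complete
            chunk = sock.recv(1024)
            if not chunk:
                break
            data += chunk
    return data.split(b"\r\n", 1)[0].decode("ascii", "replace")


class RedirectHandler(BaseHTTPRequestHandler):
    """Hypothetical test server that redirects every request to /new.html."""

    def do_GET(self):
        self.send_response(301)
        self.send_header("Location", "/new.html")
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, format, *args):
        pass  # suppress the default request log


def demo():
    """Fetch the status line for /old.html from a throwaway local server."""
    server = HTTPServer(("127.0.0.1", 0), RedirectHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return raw_status_line("127.0.0.1", server.server_address[1], "/old.html")
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Nothing follows the redirect here, so the line shows the redirect status itself rather than the final 200 OK.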
Redirects and error codes
Do not use redirects for error codes, for two reasons. First, you want to return a document telling what went wrong. Second, the Redirector is supposed to terminate with exit and therefore does not return a document. Whatever you would write is thrown away, lost.
If you can set an error status code you should always create and submit a document explaining what went wrong.
Links for Implementation
Last but not least, here are some links to resources that help you implement a redirect.
For an Apache web server this is usually within the .htaccess file as a redirect or a rewrite [R]. Here is the link to the mod_rewrite documentation for Apache 2.2.