Browsers refresh, reload your pages?
Special Note: If you are looking for how to refresh a webpage with PHP or in Zend Framework, please read one of the following documents:
- Basic stuff about refresh: How does a redirect work
- Refresh in PHP and Zend: How to use redirect
Refresh is a term we should use client-side, i.e. what a user or a browser is doing. Forcing a page to refresh from the server-side is known as a redirect.
This document is about browsers and why they refresh so if that is causing you a headache with your web application, please read on. For server-side refresh, called redirecting, check out the links above.
Your application problem
Is your web application all of a sudden not working as expected, as if somebody were messing with it in the background? Do you see a sharp increase in the log files because of duplicate entries, with no clue why a page shows up twice within seconds? Well, it could be that the problem is coming from the browser and not necessarily from your web application.
If you have verified and are sure that you do not send a redirect from your web application to the browser, you will have to look at the client, i.e. the browser.
There are quite a few reasons why a browser requests a page a second time or even more often. Some will be rather obvious once you know about them, but others are still a wee bit of a riddle. Let's take a look at the obvious ones first.
Link prefetching
One obvious problem caused by browsers is what they call link prefetching. In most cases this is business as usual but it can cause a serious problem when you work with session state.
Let me say this first, though: as it looks right now, only Firefox is actively using link prefetching, but I think others might follow soon.
Why browsers prefetch
Let's say you want users to experience a faster Internet. One simple way is downloading the next page while the user is still reading the current page. Once the user clicks the link, the page is already downloaded and ready to render. However, browsers need some assistance to know which page to fetch.
The Mozilla Developer Network has an excellent FAQ about how they do it. The browser looks for <meta> or <link> tags in the head section; in the <link> tag it is the rel attribute with a value of next or prefetch.
If you place these tags into the head section, the browser will go ahead and download the page in the background. When the user finally clicks the link, the browser can load the page from its cache.
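For illustration only, a head section with such a hint could look like this; the file name step2 is made up for this sketch:
    <?php // step1.phtml - hypothetical view script for the current page ?>
    <head>
        <title>Step 1</title>
        <!-- either hint allows Firefox to download /form/step2 in the background -->
        <link rel="next" href="/form/step2">
        <link rel="prefetch" href="/form/step2">
    </head>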
A problem for session state
At first prefetching seems to be a wonderful thing but not if you work with session state.
When you work with forms and need to know whether the user is following a certain path, you usually track that with session state. A prefetch can easily advance your state even though there has been no real action by the user yet.
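As a minimal sketch (all names made up), imagine a form wizard that tracks its position in the session; every GET request for the page advances the state, whether a user clicked or the browser prefetched:
    <?php
    // wizard.php - minimal sketch of session-based form state (hypothetical)
    session_start();
    if (!isset($_SESSION['step'])) {
        $_SESSION['step'] = 0;
    }
    // Every request for this page moves the wizard forward. A prefetch request
    // does exactly the same, even though the user never clicked anything.
    $_SESSION['step']++;
    echo 'You are now on step ' . $_SESSION['step'];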
The background conundrum
Prefetching works in the background; you do not see it. The browser displays the initially requested page as if nothing had happened. You can print some debug messages into your pages, but you probably will not get what you expected.
If you add a counter to your session and echo its value to your page, it will probably skip a beat every time. If you suspect your application is doing this, it will drive you absolutely mad.
You will need log files that record requests in the background. The log files should finally convince you that the browser sends additional requests within seconds, which causes your application to set a different state as well as advance the counter. Consequently, the next button in your form may lead to an unexpected result because your session state is already past that event.
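A tiny logging sketch like the following is usually enough to reveal two requests arriving within seconds of each other (the log path is made up, pick your own):
    <?php
    // requestlog.php - append one line per request (hypothetical log path)
    $line = sprintf(
        "%s %s referer=%s agent=%s\n",
        date('c'),
        $_SERVER['REQUEST_URI'],
        isset($_SERVER['HTTP_REFERER']) ? $_SERVER['HTTP_REFERER'] : '-',
        isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '-'
    );
    error_log($line, 3, '/tmp/request.log');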
There are ways you can handle this behavior in your application but overall I believe it is easier to simply avoid the <meta> or <link> tags if you work with forms and session state.
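One such way, sketched below, is to check for the X-Moz: prefetch request header that Firefox sends along with prefetch requests and skip anything that changes session state:
    <?php
    // prefetch-guard.php - sketch: ignore Firefox prefetch requests
    session_start();
    $isPrefetch = isset($_SERVER['HTTP_X_MOZ'])
        && strtolower($_SERVER['HTTP_X_MOZ']) === 'prefetch';
    if (!$isPrefetch) {
        // only a real user action should advance the state
        $_SESSION['step'] = isset($_SESSION['step']) ? $_SESSION['step'] + 1 : 1;
    }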
Browser cache handling
Now the next problem is quite a riddle. You can look up all sorts of word combinations in Google, but nothing will lead to anything explaining why. There have been bug reports for Firefox and some funny stories about encoding and such, but none of them really matches. Finally, here is something that may lead to the cause of the problem and the solutions.
Duplicate, double log entries
This also works in the background and rears its ugly head in the log files. What are the odds, eh?
This case has an interesting pattern. If you look at the HTTP_REFERER in the first entry, it is the previous page as expected, but the referer in the second entry is the same page. With this pattern it looks like the browser for some reason reloaded the page, referring to itself.
In such cases you should also test with different browsers. Here, Firefox 3.x, Internet Explorer and Opera did not cause a second entry; only Firefox 4.x, Google's Chrome and Apple's Safari did. Why are they requesting the page twice?
Looking at the Apache server log you will not see anything new, only what you already know. The requests are clean GET requests followed by a 200 OK response. You can use a network monitor (sniffer) to analyze the IP packets; nothing new there either. As for Firefox, if it were something like a prefetch you should see a header like X-Moz: prefetch in the request.
I did some additional tests with the help of sessions. I was curious whether the page I see is the page from the first or from the second request. I added timestamp values to the session and then printed the session values to the page. Surprise, surprise: I could not see the second timestamp on the same page. I only saw the second timestamp when I called the next page, a clear sign that the value was added later.
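The test itself is simple; something along these lines (a sketch, not the exact code I used):
    <?php
    // timestamp-test.php - sketch: record one timestamp per request in the session
    session_start();
    $_SESSION['timestamps'][] = microtime(true);
    // Print everything recorded so far. With a silent second request, its
    // timestamp only shows up on the next page you open.
    echo '<pre>' . print_r($_SESSION['timestamps'], true) . '</pre>';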
XHTML 1.1 versus XHTML 1.0
Browsers act differently depending on the doctype, and you may have changed the doctype from XHTML 1.0 Transitional to XHTML 1.1. If you did, see what happens if you revert or simply use a different doctype, especially if you are currently serving XHTML 1.1!
I have looked at quite a few documents, but all the clues, reasonable or not, did not indicate why a browser should have to reload a page with the XHTML 1.1 doctype. There are some hints about differences between the configuration of the server and the document, like charset or cache and expiration settings. These differences apparently cause some browsers to think something is wrong and to request the page anew with different settings.
The problem is probably not the doctype you are using but the cache settings.
Pragma: No-cache
Take a look at the caching for your web page and confirm that whatever expiration setting or date you set is really included in the response headers.
If you see a Pragma: No-cache in your server's response, something is not set up properly; it should not be there. This directive is intended for browsers (in the request) and not for the server (in the response).
Some sources suggest it is coming from the web server, in many cases Apache. Looking through all the settings and doing more googling, you will not find it there, though. It is coming from PHP.
If you are working with sessions you will have to take a closer look at this function: session_cache_limiter. This function applies some cache settings by default! As soon as you start your session with session_start, the cache limiter's headers are placed into the response.
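You can see this for yourself; a small sketch with headers_list() shows the headers PHP has queued right after session_start():
    <?php
    // headers-test.php - sketch: inspect the headers queued by the session module
    session_start();
    // With the default session.cache_limiter ("nocache") you will typically see
    // Expires, Cache-Control and Pragma: no-cache entries in this list.
    var_dump(headers_list());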
You can still overwrite some of the settings, but removing one is not easy. There is a header_remove() function, but only in recent PHP versions (5.3 and later). I think you should take control of caching yourself, i.e. eliminate this Pragma directive and the cache limiter defaults altogether.
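If you are on PHP 5.3 or later you could drop the directive after the fact, although that remains a workaround rather than a fix:
    <?php
    session_start();
    // Workaround on PHP 5.3+: remove the Pragma header the cache limiter added.
    header_remove('Pragma');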
Once I had completely disabled the cache limiter with session_cache_limiter(false) in my index.php, two things happened: a) the double entries were gone and b) surprise, the browsers suddenly respected the expiration date and did not even bother to contact the server. Silence in my sniffer!
Now I control all cache settings myself within the application, no more defaults messing up the browsers. Still, the question remains why the browsers react so badly.
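My setup now looks roughly like this (the values are examples; adjust them to your own caching policy):
    <?php
    // index.php - sketch: disable the session cache limiter and set headers yourself
    session_cache_limiter(false);   // or '' - no automatic cache headers
    session_start();
    header('Cache-Control: private, max-age=3600');
    header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 3600) . ' GMT');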
If you work with Zend Framework and use Zend_Session, you don't have to call the function yourself. You can pass the setting in the configuration for Zend_Session::start(). The config has an option for cache_limiter which you can set to false or whatever value you like. I believe the other values behave just like those of the PHP function.
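In a Zend Framework bootstrap that could look roughly like this (a sketch; check the Zend_Session options in the manual for the exact keys):
    <?php
    // bootstrap sketch: let Zend_Session set the cache limiter
    require_once 'Zend/Session.php';
    Zend_Session::start(array(
        'cache_limiter' => false,   // same effect as session_cache_limiter(false)
    ));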
Page Refresh F5
Now that I had my cache problems under control I thought everything would be fine, but there is still one more funny thing causing double entries.
Firefox for some reason is requesting the page twice when you hit F5, i.e. reload the page. Reloading the very same page also takes considerably longer in Firefox than in other browsers.
Broken image links
Looking at the log files I saw that Firefox for some reason was sending the second request with an image Accept header: Accept: image/png,image/*;q=0.8,*/*;q=0.5. At first I had no clue what this was all about. It took a while, but then it hit me: I had a broken link to a tiny little image I had somehow forgotten and overlooked. Firefox 4.x, probably the only browser that does, cares enough to send another request, triggering my application to process everything twice and the browser to take twice as long. I am not sure whether this is a bug in Mozilla, but once I fixed the broken link the log finally showed what I was looking for: just one entry for one request, not two, double, twice or even more. Almost!
Browser Add-ons
Last but not least we have the add-ons in our modern browsers. It turns out that some of them also trigger a reload or refresh. Since I noticed fewer problems in vanilla Firefox installations, like those of my visitors, I disabled the most obvious culprits early on. Once I added some back, they did indeed cause the same nightmare. I am not sure I really want to dig into this mess; for reliable testing I think I will just revert to browsers without any add-ons.
Conclusion
Sometimes I wish we only had one browser, like back in the days with Internet Explorer. Bad thought, bad thought! But still, the new wave of browsers, HTML5/CSS3, add-ons and now the latest new releases cause new problems in more than just one corner. That is probably the main problem here.
It is almost impossible to stay up to date on all changes and when or how they are relevant, i.e. whether you have to make changes to your site and pages.
One lesson is certainly to do your testing with plain vanilla browser installations and to handle the caching correctly on the server side. Once all of that is verified and working properly, we can go on and do some funky stuff like add-ons.
I also wonder what monitoring tools like Google Analytics are doing with such duplicate reloads. Seriously, are they recording everything twice?