Is it really a Firefox browser?
When I was first shown the user agent string below and asked ‘what version of Firefox is this?’, I took the question at face value: it’s obviously a Firefox browser, it says “Mozilla” right there, right? Digging a little deeper, it turned out that not only was my assumption wrong, but the question was wrong as well!
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
If you’re already familiar with user agent strings, it’s probably clear straight away that this string wasn’t created by Firefox. A Firefox user agent string will have a token such as “Firefox/16.0” somewhere in it. So what we have here is not a Firefox user agent, but what is it?
How to read the user agent string?
The user agent string should follow some very simple rules; unfortunately, not all browsers stick to them. A user agent string is a list of so-called “product tokens”, ordered by their importance. In addition to these tokens, the string may contain “comments” surrounded by parentheses. The tokens always have the same format: “product-name/product-version”, where the “/product-version” part is optional.
So this means that in our example the only product token is the “Mozilla/4.0” at the beginning. Everything else is just a comment.
(compatible; MSIE 8.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
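The rules above can be sketched in a few lines of Python. This is only a rough illustration, not a full parser: the function name parse_user_agent is my own, and the regular expression assumes that comments never contain nested parentheses.

```python
import re

# Matches either one parenthesised comment or one product token
# ("name/version", where the "/version" part is optional).
# Simplification: nested parentheses inside a comment are not handled.
_PART = re.compile(r"\(([^()]*)\)|([^\s()/]+)(?:/([^\s()]+))?")

def parse_user_agent(ua):
    """Split a user agent string into product tokens and comments."""
    products = []  # (name, version or None), in order of appearance
    comments = []  # comment text without the surrounding parentheses
    for match in _PART.finditer(ua):
        comment, name, version = match.groups()
        if name is not None:
            products.append((name, version))
        else:
            comments.append(comment)
    return products, comments

ua = ("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; .NET CLR 2.0.50727; "
      ".NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)")
products, comments = parse_user_agent(ua)
print(products)  # [('Mozilla', '4.0')]
print(comments)  # one comment, starting with 'compatible; MSIE 8.0; ...'
```

Run against our example, the sketch finds exactly one product token and one comment, which matches what we worked out by hand.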
If we take a real Firefox user agent string to compare it against, we’ll see the content is completely different. As we can see below, the Firefox user agent string contains three tokens, as well as a comment showing details about the operating system:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) Gecko/20100101 Firefox/16.0
The Firefox user agent string gives us the basic information we’re looking for right there in the three tokens “Mozilla”, “Gecko” and “Firefox”. All of the content inside the comment is additional detail that is not important for identifying the browser. (Though of course these details might help us find any problems on our website – sometimes the same browser might behave differently on another OS!) But going back to our first example user agent, there is no such information about the browser, just the single token and the comment.
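To make the comparison concrete, here is a small Python sketch that looks for a “Firefox” product token while ignoring everything inside comments. The helper name firefox_version is made up for this post, and the Firefox string in the example is what I would expect Firefox 16 on Mac OS X to send, so treat it as an assumption rather than a captured value. The Internet Explorer string is shortened for readability.

```python
import re

def firefox_version(ua):
    """Return the version of the Firefox product token, or None if absent."""
    # Drop parenthesised comments first so that only product tokens remain.
    # Simplification: nested parentheses are not handled.
    tokens_only = re.sub(r"\([^()]*\)", " ", ua)
    for token in tokens_only.split():
        name, _, version = token.partition("/")
        if name == "Firefox":
            return version or None
    return None

firefox_ua = ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:16.0) "
              "Gecko/20100101 Firefox/16.0")
msie_ua = "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; .NET CLR 2.0.50727)"

print(firefox_version(firefox_ua))  # 16.0
print(firefox_version(msie_ua))     # None
```

The second call returns None because the only product token in that string is “Mozilla/4.0”.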
As we might expect, there’s something “special” about the way Microsoft’s Internet Explorer structures its user agent string! If we just focus on the comment part of our first user agent string, we will quickly find what we’re looking for. Right there among the other parts is “MSIE 8.0”, which stands for “Microsoft Internet Explorer Version 8.0”, followed by the internal version number of Windows. In this particular case, “Windows NT 5.1” represents a client running Windows XP. (You can find a list of these at Microsoft MSDN.)
Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)
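Reading the comment by hand works, but the same steps can be sketched in Python. This is a minimal illustration under a few assumptions: describe_msie and WINDOWS_NAMES are names I made up, the comment is assumed to be a flat, semicolon-separated list, and the lookup table only covers a handful of Windows versions.

```python
import re

# A few internal Windows NT version numbers and their marketing names.
# Deliberately incomplete; unknown versions fall back to the raw value.
WINDOWS_NAMES = {
    "5.0": "Windows 2000",
    "5.1": "Windows XP",
    "6.0": "Windows Vista",
    "6.1": "Windows 7",
}

def describe_msie(ua):
    """Return (ie_version, windows_name) or None if no MSIE field is found."""
    comment = re.search(r"\(([^()]*)\)", ua)
    if comment is None:
        return None
    ie_version = None
    windows = None
    # The comment is a semicolon-separated list of fields.
    for field in comment.group(1).split(";"):
        field = field.strip()
        if field.startswith("MSIE "):
            ie_version = field[len("MSIE "):]
        elif field.startswith("Windows NT "):
            nt_version = field[len("Windows NT "):]
            windows = WINDOWS_NAMES.get(nt_version, "Windows NT " + nt_version)
    if ie_version is None:
        return None
    return ie_version, windows

ua = ("Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; .NET CLR 2.0.50727; "
      ".NET CLR 3.0.4506.2152; .NET CLR 3.5.30729)")
print(describe_msie(ua))  # ('8.0', 'Windows XP')
```

For a string without an “MSIE” field, such as a Firefox user agent, the function simply returns None.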
What does the “Mozilla/4.0” stand for?
As far as I could tell, there was a time when web servers commonly provided different content to different browsers. At that time, Mozilla-based browsers were seen as the most full-featured and received the “best” version of the site content, while other less-capable browsers were served a more bare-bones version. At some point after this, all browsers started using the Mozilla token as the first part of their user agent string to make sure they were given the all-singing, all-dancing version of the content and not the dumbed-down version, and eventually this became the de facto standard way to begin a user agent string. In most cases now, it just means that the browser is Mozilla-compatible, not that it is really a Mozilla-based browser. More details about the history of user agent strings can be found on the Wikipedia page about user agents, as well as in RFC 2616.
So what have we learned?
Face it, our first user agent example is not a Firefox browser, however much it might look like one because it starts with “Mozilla”. Now that we’ve gained a basic understanding of the user agent format (and Microsoft’s interpretation of it!), it’s quite easy to identify the real browser as MS Internet Explorer 8. Investing a second to read a little further usually reveals the real browser name and version.
Read more of my posts on my blog at http://blog.tinned-software.net/.