Wednesday, February 1, 2012

Do You Know Who's Tracking Your Web Usage?

My friend Cheryl posted a link on Facebook to an article that describes the way in which Google Analytics tracks the websites you visit and even which searches you perform on those sites (a) while knowing your identity and (b) without your having used any Google product to get to those sites.

Here's what's going on behind the scenes:
First, we see a transaction to Google Analytics to retrieve some JavaScript. Part of this JavaScript's purpose is to enable tracking even when users have cookies disabled.

Once this JavaScript is retrieved and executed by the browser, the Google JavaScript prepares a second request to Google for a special image file. In the request for that image file Google embeds tracking information which will be used during the rest of the session on the IRS website...
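To make the "tracking information embedded in an image request" idea concrete, here is a minimal sketch of how such a beacon URL could be built and read back. This is not Google's actual code or parameter set: the endpoint `tracker.example`, the function names, and the `page` and `vid` parameters are all hypothetical, chosen only to illustrate the mechanism.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint; a real tracker serves a tiny (often 1x1) image here.
BEACON_ENDPOINT = "https://tracker.example/pixel.gif"


def build_beacon_url(page_url, visitor_id, endpoint=BEACON_ENDPOINT):
    """Pack tracking data into the query string of an image request."""
    # 'page' and 'vid' are made-up parameter names for illustration.
    params = {"page": page_url, "vid": visitor_id}
    return endpoint + "?" + urlencode(params)


def read_beacon(url):
    """What the tracker's server can recover from the request URL alone."""
    query = parse_qs(urlsplit(url).query)
    return {key: values[0] for key, values in query.items()}


beacon = build_beacon_url("https://www.example.gov/forms", "abc123")
print(beacon)
print(read_beacon(beacon))
```

The image itself is irrelevant; the point is that simply requesting it delivers whatever the script chose to put in the URL.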

Now, I enter some search terms into the IRS website search box. For the sake of emphasizing the point, I enter the terms "Offer in Compromise", and hit return. While the search itself is presumably handled privately by the IRS site, surprisingly, the search terms are also sent to Google. This happens when the search results page makes another request for Google's hidden tracking image we previously observed. Web requests typically contain not just the URL of the request, but also the URL for the page from which the request is being made...
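The "URL for the page from which the request is being made" is the HTTP `Referer` header. As a rough sketch of why that leaks search terms, assume the search results page has a URL like the made-up one below, with the terms in a `q` parameter (the real site's URL layout may differ). Anyone receiving a request that carries that header can pull the terms back out:

```python
from urllib.parse import urlsplit, parse_qs


def search_terms_from_referer(referer, param="q"):
    """Extract search terms from a Referer URL, if the named parameter is present."""
    query = parse_qs(urlsplit(referer).query)
    values = query.get(param)
    return values[0] if values else None


# Hypothetical headers a browser might send along with the image request.
headers = {"Referer": "https://www.example.gov/search?q=Offer+in+Compromise"}

print(search_terms_from_referer(headers["Referer"]))  # prints: Offer in Compromise
```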

Finally, I now click on one of the search results to view a specific document. The Google tracking image is again requested, and in that request, the URL for the document I chose to click on is sent to Google. In this way, Google is informed of which documents I view on the [website]. This happens whether or not I arrived at the document by searching or navigating on the [site]...

We've now seen not only that Google is tracking your use of the [website], but also that it may be receiving information that specifically identifies you. I believe it is, in fact, likely that they are receiving specifically identifying information from many users of the site because many users are logged in to some Google web property while browsing. This includes anyone who's logged in to Gmail before browsing the [site].

This procedure can happen on any site that leverages Google intellectual property. How many sites' internal searches are "powered by Google" alone? Now add in sites that embed YouTube videos, pictures via Picasa... you're starting to get the picture.

The purpose of this post is not to bash Google, though there is plenty here to bash. Instead, I want to take the opportunity to point out that my employer has tools in its latest browser that prevent this sort of thing from happening. How? Internet Explorer (IE) 9 uses something called Tracking Protection Lists. These lists are maintained by third parties (indeed, anyone can create their own, and it will still work with the browser). They block ads, pop-ups, and, most importantly, the requests your browser would otherwise make that hand over just the sort of information that tracking cookies (and the Google Analytics shenanigans above) attempt to get from your browsing session.
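The basic idea behind list-based blocking can be sketched in a few lines. This is a simplified illustration, not IE's implementation and not the actual list file format: the domain set, the function name, and the assumption that only third-party requests get checked are mine.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; a real list is a file published by a third party.
BLOCKED_DOMAINS = {"tracker.example"}


def should_block(request_url, page_url, blocked=BLOCKED_DOMAINS):
    """Return True if a request made by page_url should be stopped."""
    request_host = urlsplit(request_url).hostname or ""
    page_host = urlsplit(page_url).hostname or ""
    # Assumption: content from the site you are visiting is never blocked.
    if request_host == page_host:
        return False
    # Block the listed domain itself and any of its subdomains.
    return any(
        request_host == domain or request_host.endswith("." + domain)
        for domain in blocked
    )


page = "https://www.example.gov/search?q=Offer+in+Compromise"
print(should_block("https://www.tracker.example/pixel.gif", page))
print(should_block("https://www.example.gov/style.css", page))
```

Because the request to the tracker is never sent, neither the beacon's query string nor the page URL that would ride along with it leaves your machine.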

How does it work? Well, this article explains it in detail.

More importantly, how do you set this up in your browser? That's easy.
