By Jonathan Mayer on July 19, 2011 at 4:20 am
Last week we reported some early results from the Stanford Security Lab's new web measurement platform on how advertising networks respond to opt outs and Do Not Track. This week we're back with a new discovery in the online advertising ecosystem: Epic Marketplace,[1] a member of the self-regulatory Network Advertising Initiative (NAI), is history stealing.
Many thanks once again to research assistants Akshay Jagadeesh and Jovanni Hernandez.
A link can be styled differently based on whether you've been to the page it points to. You may recall, for example, that in the early days of the web links you hadn't visited were blue and links you had visited were purple. History stealing is a practice that exploits link styling to learn a user's web browsing history. The approach is simple: to test whether the user has visited a link, add it to a page and check how it's styled.[2]
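The check described above can be sketched in a few lines of TypeScript. This is only an illustration of the general technique, not the code of any particular script: the names `ColorProbe` and `findVisited` are hypothetical, and the style lookup is injected as a function so the sketch runs outside a browser. In a browser without the fix, such a probe could plausibly wrap `getComputedStyle` on a link added to the page.

```typescript
// Hypothetical probe: returns the rendered color of a link pointing at `url`.
// In a vulnerable browser this could wrap getComputedStyle(anchor).color
// for an anchor element added to the page (an assumption, not shown here).
type ColorProbe = (url: string) => string;

// Return the candidate URLs whose links render in the "visited" color.
function findVisited(
  candidates: string[],
  probe: ColorProbe,
  visitedColor: string
): string[] {
  return candidates.filter((url) => probe(url) === visitedColor);
}

// Stand-in probe for demonstration: pretends one URL is in the history
// and colors it purple, while unvisited links stay blue.
const fakeHistory = ["http://example.com/a"];
const fakeProbe: ColorProbe = (url) =>
  fakeHistory.indexOf(url) !== -1 ? "rgb(128, 0, 128)" : "rgb(0, 0, 255)";

const visited = findVisited(
  ["http://example.com/a", "http://example.com/b"],
  fakeProbe,
  "rgb(128, 0, 128)"
);
console.log(visited);
```

The point of the sketch is that the page never reads the history directly; it only observes a styling side effect, one candidate URL at a time.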
Members of the computer security community have long considered history stealing a serious privacy vulnerability. The risk goes beyond leaking individual tidbits about past browsing; history stealing can be used to track or even identify a user. Mozilla finally implemented a fix in Firefox 4, and the other major browser vendors quickly followed. According to browser usage statistics, roughly half of users remain vulnerable to history stealing.
About a year ago researchers at UCSD conducted the first comprehensive study of history stealing in practice. They found that a few popular adult sites were history stealing to learn whether users had visited their competitors. The UCSD team also discovered history stealing by several advertising networks, including Interclick (another NAI member). Class action litigation is ongoing.
Technical Findings - History Stealing
- The script is fast. Thousands of links are tested per second.
- Links are added in an invisible iframe; there is no apparent effect on the page layout.
- The script dynamically loads lists of URLs and associated interest segments using JSONP.
- Progress is stored in a cookie so the script can resume where it left off.
- The script sets a cookie indicating when it was last run; it will not history steal more than once every twenty-four hours.
- If history stealing is still in progress when the window is closed (e.g. the user navigates to another page) the script sends its findings before ending execution.
- The script slows down if a URL list takes over two seconds to process.
- To prevent multiple history stealing attempts in parallel, the script uses a mutex cookie.
- The script does not directly report the URLs that it detects the user has visited; it sends a deduplicated list of the interest segments associated with the visited URLs.
(For the technically inclined reader, here are an example iframe, script, and URL list.)
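Two of the behaviors listed above, the twenty-four hour limit and the deduplicated segment report, can be illustrated with a short TypeScript sketch. It is a reconstruction under stated assumptions rather than the script itself: the `UrlEntry` shape, the function names `mayRun` and `segmentsToReport`, and the example URLs are all hypothetical, and cookie handling is reduced to a plain timestamp argument.

```typescript
// Hypothetical shape of one URL-list entry: a URL plus its interest segment.
interface UrlEntry {
  url: string;
  segment: number;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// True if at least twenty-four hours have passed since the last run,
// or if there is no record of a previous run.
function mayRun(lastRunMs: number | null, nowMs: number): boolean {
  return lastRunMs === null || nowMs - lastRunMs >= DAY_MS;
}

// Collect the segments of visited URLs, each reported once,
// in the order they are first seen. The URLs themselves are not returned.
function segmentsToReport(
  entries: UrlEntry[],
  isVisited: (url: string) => boolean
): number[] {
  const segments: number[] = [];
  for (const entry of entries) {
    if (isVisited(entry.url) && segments.indexOf(entry.segment) === -1) {
      segments.push(entry.segment);
    }
  }
  return segments;
}

// Made-up entries; the segment numbers are borrowed from examples in this post.
const entries: UrlEntry[] = [
  { url: "http://example.com/coffee-1", segment: 876 },
  { url: "http://example.com/coffee-2", segment: 876 },
  { url: "http://example.com/deals", segment: 758 },
  { url: "http://example.com/fiesta", segment: 2701 },
];
const visitedUrls = [
  "http://example.com/coffee-1",
  "http://example.com/coffee-2",
  "http://example.com/deals",
];
const report = segmentsToReport(
  entries,
  (url) => visitedUrls.indexOf(url) !== -1
);
console.log(report);
```

Reporting segments rather than URLs keeps the transmitted payload small, but the segment identifiers still reveal which categories of pages were found in the history.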
We also examined a series of URL lists (spreadsheet) that contain 15,511 entries. The URLs and interest segments vary greatly. Some URLs are for a landing page; others are for a specific page. Some interest segments are broad; others are fine-grained. A few example segments:
- Segment 758: discount sites including Groupon and eBay Daily Deals
- Segment 876: sites about coffee, including Dunkin' Donuts, Folgers, and Starbucks
- Segments 984-989: home improvement sites including Home Depot and Grainger
- Segment 2701: pages about the Ford Fiesta
Several interest segments are highly sensitive:
- Segment 760: pages about getting pregnant and fertility, including at the Mayo Clinic
- Segment 2640: pages about menopause, including at the NIH and the University of Maryland
- Segment 2014: pages about repairing bad credit, including at the FTC
- Segment 2265: pages about debt relief, including at the FTC and the IRS
Technical Findings - Opt Out
We applied the methodology from last week's study to examine Epic Marketplace's opt-out practices. (Epic Marketplace was one of the eleven NAI members not included in that study.) We found that Epic Marketplace leaves its tracking cookies in place after both opting out with the NAI mechanism and enabling Do Not Track. We also found that history stealing continues after using either choice mechanism.
The 2008 NAI Code of Conduct requires member companies to receive express consent from a user before collecting "Sensitive Consumer Information," defined as:
- Social Security Numbers or other Government-issued identifiers
- Insurance plan numbers
- Financial account numbers
- Information that describes the precise real-time geographic location of an individual derived through location-based services such as through GPS-enabled devices
- Precise information about past, present, or potential future health or medical conditions or treatments, including genetic, genomic, and family medical history
(The Code of Conduct includes the unhelpful footnote, "[t]his provision is to be further developed in a distinct implementation guideline.")
Epic Marketplace also automatically receives and records anonymous information that your browser sends whenever you visit a website which is part of the Epic Marketplace Network. We use log files to collect Internet protocol (IP) addresses, browser type, Internet service provider (ISP), referring/exit pages, platform type, date/time stamp, one or more cookies that may uniquely identify your browser, and responses by a web surfer to an advertisement delivered by us. This information may be stored on our systems for about one year.
Web surfers may elect not to provide non-personally identifiable information by following the cookie opt-out procedures set forth below.
As with our prior work, we leave it to the reader to assess whether Epic Marketplace is complying with its privacy representations.
Thanks to Gordon Franken for reviewing this post.
1. Epic Marketplace was, until recently, named Traffic Marketplace. It hosts its third-party content on trafficmp.com.
2. Other forms of history stealing, beyond the scope of this post, rely on page layout, background images, and user interaction.
JustChecking December 5, 2012 at 1:25 pm
The links for the iframe, script and URL list are no longer accessible because cdn1.trafficmp.com is no longer listed in the DNS servers. Neither is trafficmp.com. The domain is still owned by Epic Media Group, according to Network Solutions' WhoIs service. Neither is their main domain, theepicmediagroup.com. Have they gone out of business?
Anonymous 2 July 23, 2011 at 6:29 am
Can people comment on spike's post:
Comment by Spike (not verified), posted July 22, 2011 - 10:23am
"Did you determine how much bandwidth this process chews up? In this day and age of usage caps and bandwidth limitations anything like this is seriously unwelcome."
How much bandwidth is used for hx sniffing? Based upon the study the url's
Well, take a look at http://cdn1.trafficmp.com/prod/ig/110701-130258_adv_0.html
This is by far the longest URL list used by any history sniffing site, and it is something like 80 kilobytes. It's hard though to add up to meaningful amounts of data using text alone ...
Are there any studies about bandwidth use for hx sniffing and how does that equate to the user's limited data plan?
Anonymous Also July 22, 2011 at 4:57 pm
It would be too bad if they accidentally stole little Bobby tables' history.
Harlan Sanders July 22, 2011 at 11:35 am
q July 22, 2011 at 11:07 am
Two follow up questions that it would be great to know the answer to after reading this post:
1) You mention that Mozilla published a fix for history stealing - what about other browsers? Safari? Chrome?
2) Does Ghostery block this from happening?
Anonymouth July 22, 2011 at 10:33 am
So, does anyone have a list of domains/IP addresses operated by Epic to add to my hosts and/or adblock list?
The only way to punish these people is to make sure no information about your computer ever reaches theirs.
Spike July 22, 2011 at 10:23 am
Did you determine how much bandwidth this process chews up? In this day and age of usage caps and bandwidth limitations anything like this is seriously unwelcome.
Thank goodness for NoScript.
Harlan Sanders July 22, 2011 at 9:38 am
How about we put together a list of domain names that they're using for tracking so that we can all add them to /etc/hosts as:
Anonymous Coward July 22, 2011 at 9:21 am
Oh the horror... They might display ads to me that are relevant to my interests - how dare they!
Be-In-phorm-ed July 23, 2011 at 9:50 am
Damn right I'm whining lol
When I go shopping (in the "real world" of brick buildings) I don't expect to be followed around.
Why should I accept it in the internet? Followed by someone who wasn't even with me at the time I went to certain shops. It's wrong, plain and simple. If those with the power to do this want to do it, then they can ask me first if I mind them doing it. My choice, informed and clear, not sneaky and all about them and their monetising of me.
When I want something I'll go looking for it. They can stick their targeted adverts where the sun doesn't shine.
Yes, I'm a whiner in your terms. Do I care? No. I'll stuff them with whatever means I can and slow their systems down to a crawl. My little war, which I will win, for the benefit of those who have no idea what's happening.
beinphormed... It was Phorm which made me look at what was happening.
Anonymous supporter August 11, 2011 at 3:11 am
If you use a "club" card at your supermarket they track you. In fact every rewards card tracks your spending/shopping habits. Or there's Wal-Mart, who reverse engineers how you walk through the store by the way you unload your cart at checkout so they can be dastardly and put products their consumers want in easy to get to places.
There's a lot of tracking going on associated with giving consumers better offers and most people see the value when they get the discounted items they're looking for. What I have actually never seen is anyone prove real harm other than to play up this drama about big bad corporate America spying on me.
Anonymous123455544 July 22, 2011 at 8:49 am
So what's the adblock pattern for this script? trafficmp/* ?
Keith Pieper | Pretarget July 21, 2011 at 9:32 am
Technology evolves much, much faster than regulation (industry or government). This is a great example of a new (albeit shady) technique of exploiting new technology which has not been broadly used, hence is not regulated. Furthermore, self regulation bodies simply lack the teeth to prevent this sort of behavior - membership in these groups is eye candy to make government regulators and noisy consumer interest groups "feel ok". Unfortunately, the only ways to prevent this are to over-regulate (which no one wants) or be 100% honest and forthcoming (there will always be a bad apple in the bunch to ruin it for all).
Tim July 20, 2011 at 2:47 pm
I don't think I'm going to visit their link to get my history stolen... Shame, shame, shame.
Epic Marketplace July 20, 2011 at 12:17 pm
We invite your readers to view our response to your blog post here: http://bit.ly/qZVAKh.
Chris August 23, 2011 at 7:54 am
To comment on a violation of privacy, you're sending us to some web site hosted in Libya? Really? I guess the laws of the USA don't suit your purpose.
Anonymous12 July 22, 2011 at 11:22 am
I predict we will soon see a response something like the following.
Hello Epic Marketplace. We are Anonymous...
Then I imagine their corp will suffer greatly.