AOL Search Data Reveals a Great Deal
As I’m sure you’ve already heard, a research team over at AOL made a little mistake when it decided to release a three-month sample of its search log data to the academic community. Of course, the dataset was pulled from AOL’s servers within a matter of days, but by that point mirrors of the data were everywhere and it was too late.
During the week of August 6, some people in AOL’s research division decided to release to the public a little database they had. It contained a list of about 658,000 users and the Web searches each made from March to May. If you were one of those lucky, randomly selected souls, every search term you entered was opened to the world.
AOL didn’t tell its users it could do this, nor that it was going to, and it didn’t offer anyone the opportunity to opt out. It did take a small step back from the abyss by substituting a number for the users’ screen names.
“So what?” you might say. “As long as no one knows it was me searching for ‘dwarf prostitutes in south dakota,’ what difference does it make?”
The problem is that searches aren’t anonymous, even if the screen names were withheld to protect the innocent. The New York Times proved this when it tracked down user 4417749, one Thelma Arnold of Lilburn, Ga., from her searches.
And you don’t need the resources of the Times. Even a part-time technology columnist of average intelligence can glean plenty from the database.
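To see why swapping a screen name for a number doesn't help much, here is a rough Python sketch of the first thing anyone would do with such a log: group every query under its numeric ID. The tab-separated layout, the column names `AnonID` and `Query`, and all of the sample rows are assumptions made up for illustration, not records from the actual release.

```python
import csv
import io
from collections import defaultdict

# Made-up rows in an assumed tab-separated layout: numeric ID, then query text.
SAMPLE = (
    "AnonID\tQuery\n"
    "1001\tlandscapers in springfield\n"
    "2002\tcheap flights\n"
    "1001\thomes for sale maple street\n"
    "1001\tsmith family reunion\n"
)

def queries_by_user(lines):
    """Collect every query under the numeric ID that issued it."""
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    profiles = defaultdict(list)
    for row in reader:
        profiles[row["AnonID"]].append(row["Query"])
    return dict(profiles)

profiles = queries_by_user(io.StringIO(SAMPLE))
for user_id, queries in profiles.items():
    print(user_id, queries)
```

Because the same number follows a user through every search, the grouped list reads like a profile: a town, a street, a family name. No screen name required.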
Feel free to check out a few of the websites that have been built around this data set in the past few weeks:









