
The closest thing to reading people’s minds

What would it be like if we could read people’s minds and know their private thoughts, desires, worries, and deepest secrets?

AOL’s data offers us a glimpse of just that. If you look at the 400+ search terms of user 1515830, a sketch of what was on the mind of a woman in Ohio starts to emerge (a cancer operation, incest, spying, marriage problems, depression, antipsychotic drugs, looking for a job or starting a home business, plus cheap curtains, a Disneyland vacation, etc.).

The following are some selected search terms from user 1515830. (Warning: viewer discretion is advised).

  • precancer cells found during cyst biopsy
  • wide local excision for vulvar cancer
  • photographs of surgery for vulvar cancer
  • how can i hide a wireless connection
  • free keylogging programs
  • password stealers
  • aftermath of incest
  • how to tell your family you’re a victim of incest
  • vgn for depression
  • surgical help for depression
  • anti psychotic drugs
  • online auction managing tools
  • how to become an insurance underwriter in ohio
  • jobs in denver colorado
  • teaching jobs with the denver school system
  • hannah mullins nursing school
  • marriage counseling tips
  • are divorce laws affected by adultery in ohio
  • divorce laws in ohio
  • diet pills for sale
  • … (the full list can be found here)
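
For anyone who has a copy of the data, a list like the one above can be pulled out with a few lines of code. This is only a rough sketch: it assumes the log is a tab-separated text file with a header row, and the column names `AnonID` and `Query` are assumptions about the file layout rather than something stated in this post. The function name `queries_for_user` and the sample rows are made up for illustration.

```python
import csv
import io


def queries_for_user(lines, user_id, id_col="AnonID", query_col="Query"):
    """Return the distinct queries of one user, in first-seen order.

    `lines` is any iterable of text lines (an open file, a StringIO, ...).
    The column names are assumptions about the log layout.
    """
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for row in reader:
        if row[id_col] == str(user_id):
            seen.setdefault(row[query_col], None)
    return list(seen)


# Tiny made-up sample in the assumed layout (not real data).
sample = io.StringIO(
    "AnonID\tQuery\tQueryTime\n"
    "1\tcheap curtains\t2006-03-01 10:00:00\n"
    "2\tweather\t2006-03-01 10:05:00\n"
    "1\tdivorce laws in ohio\t2006-03-02 09:00:00\n"
    "1\tcheap curtains\t2006-03-03 11:00:00\n"
)
print(queries_for_user(sample, 1))
```

To run it against a real file, open the file in text mode and pass the file object in place of `sample`, with the user ID you are interested in. How little code this takes is part of the point.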

Do we really want all of this to be recorded and kept forever?

The email that started the AOL search data firestorm

It all started when Abdur Chowdhury (AOL Chief Architect for Research) posted the following message to the Corpora mailing list:

(Update: three people were fired for their roles in the AOL data debacle; see here and here.)

AOL is embarking on a new direction for its business – making its
content and products freely available to all consumers. To support
those goals, AOL is also embracing the vision of an open research
community, which is creating opportunities for researchers in academia
and industry alike.

We are introducing AOL Research to everyone, with the goal of
facilitating closer collaboration between AOL and anyone with a desire
to work on interesting problems. To get started, we invite you to
visit us at, where you will find:

– 20,000 hand labeled, classified queries
– 3.5 million web question/answer queries (who, what, where, when, etc.)
– Query streams for 500,000 users over 3 months (20 million queries)
– Query arrival rates for queuing analysis
– 2 million queries against US Government domains

Also, please feel free to provide feedback on the site, datasets you’d
like to see in the future, and any other comments about our vision.


Abdur Chowdhury

It was simply an introduction of AOL’s research lab to the academic community. Soon the blogosphere took notice, and the rest is history :-)

The AOL research site is offline now, but mirrored copies of the database are all over the net. You can also search it online at