The attempt by Justice Department lawyers to gain access to Google search data may have opened an important new front in the war over the security of personal data online. The search powers have generated amazing amounts of revenue for themselves and their shareholders simply by linking ads to the terms users enter for a general search. But sustaining that revenue growth into the future, the industry believes, requires offering users more personalization and more interrelated services, and giving advertisers increasingly better ways to target their ads both demographically and eventually by searcher behavior.
That makes it a business necessity to build ever bigger databases of where users go, what they click on, and how they act on the Internet even when they’re not performing searches. The search engines do this by depositing cookies on the browsers that point to them; Yahoo! uses cookies that expire in June 2006, while Google’s cookies can last until 2036. And users who open accounts for other services from the engines, from e-mail and instant messaging to shopping and online photo albums, contribute much more of their personal data. The engines maintain that the information they collect is aggregated and scrubbed of all personally identifiable details. Still, gathering so much knowledge about their users raises the question: Who gets to mine all that data?
Federal prosecutors believe they have a claim on that mine, and last summer they subpoenaed millions of search records from the top engines in an effort to reinstate an enjoined law to keep kids from Web pornography. They say they want to demonstrate that software filters won’t do enough to shield youngsters from all the adult content online. To make that case, they want to find out how big the adult-oriented Web is and how it is commonly accessed. To them, it made sense to go to the folks who know most about the size of and entry points to the Internet: the big search engines.
It’s not known exactly what the DOJ asked for from Yahoo!, MSN and AOL, although at least in the case of MSN, it is known that the initial request was reduced. The department’s first request to Google, however, was quite simply for all the URLs in its index and all search queries entered for June and July 2005. That was later reduced to one million random Web pages and one week’s worth of search traffic.
Yahoo!, MSN and AOL complied with their subpoenas. But Google refused, a fact that came out last week when the DOJ asked a U.S. district court in San Jose, CA, to compel Google to turn over a “random sampling” of one million search queries entered into Google during a single week, along with a sampling of a million Web sites from the Google index.
“Google is not a party to this lawsuit, and the demand for the information is overreaching,” Google associate general counsel Nicole Wong said in an e-mail statement explaining her company’s position. Google maintained that turning over the requested data would both jeopardize users’ privacy and reveal trade secrets about how it views the Web.
For their part, MSN and Yahoo! said they acceded to the DOJ request in part because the data provided could not compromise the privacy of their searchers.
“This is a big deal,” says Eric Goldman, who teaches cyberlaw and intellectual property law at Marquette University. “This is not one of those little Web brouhahas that evanescently pass into the night. This is going to have multi-year consequences.”
Part of what makes it a big deal is that privacy watchdogs have been after the search engines for some time to cut back on the information they collect about their users in order to present less attractive targets of opportunity. Most of those advocacy groups lost no time in reviving their arguments, mostly interwoven with kudos for Google’s stance.
For example, while the Electronic Frontier Foundation applauded Google’s refusal to comply with the request, EFF attorney Kevin Bankston warned that “the only way Google can reasonably protect the privacy of its users from such legal demands…is to stop collecting so much information…delete information that it does collect as soon as possible, and take real steps to minimize how much of the information it collects is traceable back to individual Google users.”
Goldman says this is an argument that dates back at least to 2004, when Google revealed that it was planning to scan the contents of users’ Gmail e-mail messages to serve relevant ads. Google also riled privacy groups with flaws in its desktop search function and with an application that accelerated Web surfing but occasionally served up password-protected content.
It seems notable too that Google is resisting the DOJ request on the grounds of both user and corporate confidentiality. “In regard to the initial request, there was an unambiguous trade secrets issue,” Goldman says. “The government asked for information that, if it were widely known, would have significant value both to competitors and to the search engine optimizers. Google had to resist that first request to preserve its competitive posture.”
Even complying with the amended request could provide ammunition for Google’s competitors and content providers. “Google has the biggest database and 42% of all searches. Everyone wants to know what they’re doing, and what their users are searching for. If this information were publicly known, there would be a lot of rejoicing.”
Goldman, former general counsel for Web review site ePinions, thinks another outcome might be a call for federal legislation to keep search data out of the hands of the government. “Here’s the mentality of many government agencies, whether the DOJ or the FBI: ‘I need data; I see some party who has data; I’ve got the big guns; they’re going to do whatever I ask,’” he says. “And there’s always a subtext: ‘If you don’t do what we want, you’re our next target.’”
Some barrier should be established to keep government agencies from simply demanding data because it exists, he says. “I think this may backfire on the DOJ,” Goldman says. “They picked the wrong company. If there’s any company we’re prepared to rally around, it’s Google. We need some leavening at the DOJ to make them feel the cost of what they’re imposing on others, not just Google but all the consumers whose data they might have access to.”
One possible fallout from this dust-up might be a heightened awareness among users that they’re being watched when they search. A weekend phone poll by The Ponemon Institute revealed that 77% of users don’t know Google keeps records on their searches. If the search security issue gets the same public exposure that’s been given to identity theft and spyware, consumers may attack the problem in the same way: by deleting cookies, using software that lets them search anonymously, and generally giving back some of the benefits of personalized search for increased peace of mind.
But whether or not that happens remains to be seen. In the meantime, 56% of respondents to the Ponemon survey made it clear that they don’t think Google should turn over the information it collects to the government. And in fact 14% said the search engine should refuse to reveal what it knows even in cases of possible criminal activity.
In light of sentiments like that, the issue of search data security may hinge on where the greatest potential harm lies.
“I’m not so concerned about what Google does with the data it knows about me,” Goldman says. “It’s not that I trust them; it’s just that I think there are far fewer adverse things they could do with it. But I have zero trust in what the government could do with that data.”

