Back in 2003 I was trying to drum up interest in Peter Wayner's book, Translucent Databases, which shows how to build and operate databases whose contents are opaque to their operators. Three years later, there's still no serious discussion of why translucency should be a key architectural principle, or how it might be applied.
A couple of recent examples show why it's an issue that belongs on IT's agenda. The first involves Prosper, a service whose tagline is "people-to-people lending." Using a social network to broker connections between groups of borrowers and groups of lenders, Prosper aims to do for loans what eBay has done for auctionable goods. I wanted to invest a small amount as a lender in order to find out more about how the system works, so I began the sign-up process. To enable a credit check, Prosper asked for my Social Security number. That seems like an obvious requirement but, when you stop and think, why should it be? Prosper doesn't actually need to receive and store that number. It only needs to relay it to Equifax, Experian, and TransUnion.
If Prosper ran its database translucently, I would be able to encrypt the number so that nobody inside Prosper, legitimate or otherwise, could read it. Equifax and the other bureaus would have to ask me to unlock it. Ideally they'd promise to use it once and then discard it.
At this point, of course, it becomes clear that Prosper shouldn't need to store my encrypted number in its database. It should only need to sign a request to the bureaus for a credit check. The request should then bounce to me, acquire my encrypted Social Security number along with permission for one-time use, and hop along to the bureaus. This protocol won't work synchronously, but it doesn't have to. If asynchronous message flow gives me the control I want, that'll be just fine.
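Here is a minimal sketch of that message flow in Python. It is an illustration under loud assumptions, not anything Prosper or the bureaus actually offer. Every name in it (`sign_request`, `attach_ssn`, `Bureau`, the `"alice"` applicant, the score of 720) is hypothetical. An HMAC over a key shared by lender and bureau stands in for a real digital signature, and a one-time pad that the applicant has previously shared with the bureau stands in for encrypting to the bureau's public key. The point is only to show that the lender signs a request that never contains the number, and that the bureau can enforce one-time use.

```python
import hashlib
import hmac
import json
import secrets


def xor_bytes(data: bytes, pad: bytes) -> bytes:
    """One-time-pad XOR; the pad must match the data length and be used once."""
    if len(data) != len(pad):
        raise ValueError("pad must be exactly as long as the data")
    return bytes(a ^ b for a, b in zip(data, pad))


def sign_request(lender_key: bytes, applicant: str) -> dict:
    """Lender side: sign a credit-check request that contains no SSN."""
    body = json.dumps(
        {"applicant": applicant, "purpose": "credit-check",
         "nonce": secrets.token_hex(16)},
        sort_keys=True,
    )
    # Assumption: HMAC stands in for a real public-key signature.
    tag = hmac.new(lender_key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def attach_ssn(request: dict, ssn: str, pad: bytes) -> dict:
    """Applicant side: add the sealed SSN and grant one-time use."""
    sealed = xor_bytes(ssn.encode(), pad)
    return {**request, "sealed_ssn": sealed.hex(), "use": "once"}


class Bureau:
    """Hypothetical credit bureau that honors one-time use."""

    def __init__(self, lender_key: bytes, records: dict):
        self.lender_key = lender_key
        self.records = records      # hypothetical data: SSN -> credit score
        self.pads = {}              # applicant -> unused pad
        self.used_nonces = set()

    def register_pad(self, applicant: str, pad: bytes) -> None:
        self.pads[applicant] = pad

    def run_check(self, message: dict) -> int:
        expected = hmac.new(
            self.lender_key, message["body"].encode(), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(expected, message["tag"]):
            raise PermissionError("request was not signed by the lender")
        body = json.loads(message["body"])
        if body["nonce"] in self.used_nonces:
            raise PermissionError("request has already been used")
        self.used_nonces.add(body["nonce"])
        pad = self.pads.pop(body["applicant"])  # discard the pad: one use only
        ssn = xor_bytes(bytes.fromhex(message["sealed_ssn"]), pad).decode()
        return self.records[ssn]    # the SSN is used for the lookup, not kept


if __name__ == "__main__":
    lender_key = secrets.token_bytes(32)
    ssn = "000-12-3456"             # deliberately invalid number
    pad = secrets.token_bytes(len(ssn.encode()))
    bureau = Bureau(lender_key, {ssn: 720})
    bureau.register_pad("alice", pad)
    request = sign_request(lender_key, "alice")   # lender never sees the SSN
    message = attach_ssn(request, ssn, pad)       # applicant seals and forwards
    print(bureau.run_check(message))
```

Nothing here has to happen in one round trip: the signed request can sit in a queue until the applicant gets around to attaching the sealed number. A real design would also have the applicant sign the added fields, which this sketch leaves out.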
Translucency shouldn't apply only to databases; it should govern service networks too. Unfortunately, with the lone exception of SSL, every effort to make cryptographic protocols useful to ordinary folks has gone down in flames. How will that ever change?
Quixotic jousts with the likes of Prosper over individual Social Security numbers won't move the needle. But AOL's recent data spill, or another such Exxon Valdez-like disaster, just might. "My goodness," said Thelma Arnold, AOL's user #4417749, when her search history was linked to her identity and revealed to her. "It's my whole personal life."
It's time for a public conversation about the uses and limits of translucency. Is it really necessary to retain my Social Security number, or my search history, in order to provide a service? If not, what does it cost the provider of a service -- and cost the user, for that matter -- to achieve the benefit of translucency? Is this kind of opt-out a right that users of services should expect to enjoy for free, or is it a new kind of value-added service that providers can sell?
Realistically, given the very real technical challenges, I think it would have to be a service. Until recently, it wasn't a service that many folks would have considered paying for. But Thelma Arnold and 658,000 other AOL customers probably see things differently now. And if you're a provider who would rather not be liable for storing more of your customers' data than is strictly necessary, offering it would be a step in the right direction.