Jeff Jonas, the chief scientist and distinguished engineer at IBM's entity analytic solutions group, has developed a means of sharing corporate data without revealing what that data contains.
This technology, called anonymization, effectively "shreds" information, making it possible for companies to share information about their customers with governments or other companies without giving away any personal data. Over time, Jonas believes, companies will increasingly use anonymization to defend their data, and their corporate well-being, against competitors and identity thieves.
Jonas recently sat down with IDG News Service to discuss anonymization and how protecting customer privacy will make companies more competitive.
How does anonymization work?
Normally, somebody with data encrypts it, and then they transfer it. Then, the recipient decrypts the data to use it. But while it's in transit -- in flight -- it's encrypted. Cryptographers have invented math that allows you to shred something, and then unshred it: encrypt it, and then decrypt it.
Another part of cryptography is used to create digital signatures. Smart math people have invented algorithms called one-way hashes. A one-way hash looks like encryption because you put in data and what comes out is not readable to humans. But there's no way to take the output, run the math backwards and get the input value. That's why I use the example of a pig and a sausage. If I give you the sausage and the grinder, you can't go backwards and make a pig.
I just took advantage of something that someone else has made, and I just used it in a slightly different way to get a new result.
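The one-way property Jonas describes can be illustrated with a short sketch. It assumes SHA-256 from Python's standard library as the hash; the interview does not say which algorithm IBM's product uses, and the `fingerprint` helper and the sample names are hypothetical.

```python
import hashlib


def fingerprint(value: str) -> str:
    """Return a one-way SHA-256 digest of a text value (hypothetical helper)."""
    # Lowercase and collapse whitespace so trivially different spellings
    # of the same value produce the same digest.
    normalized = " ".join(value.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


a = fingerprint("Jane Q. Public")
b = fingerprint("jane q.  public")   # same person, different formatting
c = fingerprint("John Q. Public")    # different person

print(a == b)   # identical input after normalization gives an identical digest
print(a == c)   # different input gives a different digest
print(len(a))   # the digest has a fixed length regardless of input size
```

The digest is the "sausage": the same input always produces the same output, but nothing in the output lets you run the function in reverse to recover the name.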
In effect, the process of anonymization creates digital signatures of information that can be compared against other signatures for possible matches. At the same time, the signatures cannot be used to recreate the original data.
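A minimal sketch of that comparison step might look like the following. It is an illustration under stated assumptions, not Jonas's implementation: it assumes both parties agree on a secret key and use a keyed hash (HMAC-SHA256), because plain hashes of guessable values such as names can be matched by anyone who hashes a list of candidate names. The `anonymize` and `find_matches` functions, the key and the sample records are all hypothetical.

```python
import hashlib
import hmac


def anonymize(record: str, shared_key: bytes) -> str:
    """Turn a record into a keyed one-way token (hypothetical helper)."""
    normalized = " ".join(record.lower().split())
    return hmac.new(shared_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def find_matches(tokens_a, tokens_b):
    """Return the tokens that appear on both sides."""
    return set(tokens_a) & set(tokens_b)


# Assumption: the two parties agree on this key in advance.
shared_key = b"example-key-agreed-out-of-band"

# Each party anonymizes its own data locally and shares only the tokens.
watch_list = ["Alice Example", "Bob Sample"]
reservations = ["bob  sample", "Carol Placeholder"]

watch_tokens = {anonymize(name, shared_key) for name in watch_list}
# The reservation holder keeps a private map from token back to its own record.
reservation_tokens = {anonymize(name, shared_key): name for name in reservations}

matches = find_matches(watch_tokens, reservation_tokens)
matched = [reservation_tokens[token] for token in matches]
print(matched)
```

Only the party that already holds a record can map a matching token back to it, so records with no match on the other side are never disclosed in readable form.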
Normally, I have data and you have data and we want to figure out what our data means together. But I don't want to give you mine and you don't want to give me yours. This is why information sharing will fail: everyone wants to be the recipient.
Sometimes a government may pass a law that says I, as a company, have to give you my data. Maybe you have a watch list, and you don't want me to see it. That's how I ended up creating this. I was getting ready to take my kids on a cruise. I made the reservations and then saw in a newspaper that there was a threat against Port Canaveral, Florida, from terrorist scuba divers. I was thinking, "Oh no, I'm taking my kids on a cruise."
The U.S. government has this really cool, big list of bad guys. The government doesn't send that list to the cruise line, and the cruise line, which has all these reservations, doesn't send them all to the government. Ten bad guys could just sneak across the border, use their real names and get on the cruise ship. That was the tension point. All of the work I had done prior allows an organization to share data with itself. What happens if you want to share data across two organizations and only find things in common? How would you do that?