Let's hope the NSA hasn't actually used this machine-learning model to target drone strikes

The data set used to train it was 'totally inadequate,' one expert says

The U.S. National Security Agency could be relying on a seriously flawed machine-learning model to target drone strikes in Pakistan, according to a new analysis of slides uncovered last year by whistleblower Edward Snowden.

Published last May by The Intercept, the slides detail the NSA's so-called Skynet program, in which machine learning is apparently used to identify likely terrorists in Pakistan. While it's unclear whether the machine-learning model has been used in the NSA's real-world efforts, it has serious problems that could put lives at risk if it were used, according to Patrick Ball, director of research at the Human Rights Data Analysis Group.

"I have no idea if any of this was ever used in actual strikes or even made it into a meeting," Ball said Monday. But "nobody rational would use an analysis this crappy for any kind of decision making."

Dating back to 2012, the slides describe the use of GSM metadata for behavioral profiling of 55 million cellphone users, including factors such as travel behavior and social networks. Equipped with that data, the model aims to predict which people are likely to be terrorists.

It's no secret that the United States has been using unmanned drones to attack militants in Pakistan over the past decade. Between 2,500 and 4,000 Pakistanis have been killed by drones since 2004, according to the Bureau of Investigative Journalism, a nonprofit news organization. Many of those killed were members of groups such as al Qaeda, the organization said.

General Michael Hayden, former director of the NSA and the CIA, has stated the connection explicitly: “We kill people based on metadata.”

Particularly troubling, however, is that drones have reportedly killed more than 400 civilians -- possibly more than 900 -- along the way.

That's where the model's specific failings become relevant. First and foremost is that the NSA didn't use nearly enough data about known terrorists to be able to train the model to distinguish terrorists from other people with any reasonable level of accuracy, Ball explained.

In fact, the model was trained using data about just seven known terrorists, according to the slides. "That's totally inadequate," Ball said.

The algorithm itself is fine, he said, but the paucity of data used to train it leads to an unacceptably high chance of "false positives," or innocent people classified as terrorists. If it were actually used to direct drone attacks, that would mean the loss of innocent lives.
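
To see why even a small false-positive rate matters at this scale, consider a rough back-of-the-envelope sketch in Python. The 55 million figure comes from the slides; the false-positive rates are purely hypothetical values chosen for illustration, not figures reported for the NSA's model, and the function name is an assumption of this sketch.

```python
def expected_false_positives(innocent_users: int, false_positives_per_100k: int) -> int:
    """Return how many innocent users would be wrongly flagged.

    The rate is given as wrongly flagged people per 100,000 innocent users,
    so the arithmetic stays in whole numbers.
    """
    return innocent_users * false_positives_per_100k // 100_000


# Cellphone users profiled, according to the slides. For simplicity this
# sketch treats essentially all of them as innocent.
PROFILED_USERS = 55_000_000

# Hypothetical rates, for illustration only (not taken from the slides).
HYPOTHETICAL_RATES_PER_100K = [100, 8]

for rate in HYPOTHETICAL_RATES_PER_100K:
    flagged = expected_false_positives(PROFILED_USERS, rate)
    print(f"{rate} per 100,000 -> {flagged:,} innocent people flagged")
```

At a hypothetical rate of 100 per 100,000 (0.1 percent), that works out to 55,000 people wrongly flagged; the point is only that a rate that sounds tiny still produces a large absolute number across 55 million users.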

The NSA is "not stupid, and this is a stupid piece of analysis," Ball said. "My guess is that this was someone in technical management at NSA selling it up the chain, but it didn't really work -- it's a failed experiment."

None of this means that drone strikes aren't happening, or that the prospect of a model like this being used to direct them isn't concerning.

"Yes, there are drone strikes in Pakistan, and yes, they kill innocent people -- these things are not in dispute," Ball said. But in the case of this model, "all we know is what's on a few slides, and that's worrisome."

The NSA did not respond to a request for comment.

Katherine Noyes

IDG News Service