First came the harmless “face-swap” apps: a fun, interactive application of machine learning technology that younger mobile users embraced in droves. Then came the less harmless applications of machine learning, like the recent and controversial rise and fall of “DeepFakes” videos. From here, security vendors warn, the practice of face-swapping could morph into a monster with chilling implications for the cybersecurity world.
Summarizing this trend, ESET’s Senior Research Fellow Nick FitzGerald says that there are several recent developments in machine learning that are “likely to have major implications for computer security, and even digital media in general.”
“The first is “adversarial machine learning” whereby adversarial examples can be generated to fool a machine-learning-based classifier, such that the sample is (to humans) apparently one thing, but it is reliably misclassified as something else by the machine learning model used by the software. Adversarial examples of images and audio files – and potentially software examples such as PDF and executable files – have already been demonstrated.”
“Such adversarial examples are generated by the attacker being able to observe the operation of the classifier, extracting features that the learning model apparently depends on and then “perturbing” natural examples with enough “noise” that the software classifier is fooled, but the resulting audio or image is still clearly recognisable as something it is not, according to the machine classification.”
“The obvious implication for endpoint security products is that those that mainly or only depend on machine-learning-based approaches are just as susceptible to such adversarial example attacks, whereby an attacker with access to the product can reverse engineer enough features used by the model such that they could then produce malware files that would not be classified as such by that security product.”
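The perturbation idea FitzGerald describes can be illustrated with a deliberately tiny sketch. It assumes the attacker already knows the model's weights (a "white-box" setting) and uses a toy linear classifier; the weights, feature values and the `epsilon` step size are invented for illustration and do not come from any real security product.

```python
# Toy linear "classifier": a score above 0 means "malicious", otherwise "benign".
# WEIGHTS, BIAS and the sample are hypothetical values chosen for illustration.
WEIGHTS = [0.9, -0.4, 0.7, -0.2]
BIAS = -0.1


def score(x):
    """Weighted sum of the input features plus a bias term."""
    return sum(w * xi for w, xi in zip(WEIGHTS, x)) + BIAS


def classify(x):
    return "malicious" if score(x) > 0 else "benign"


def sign(v):
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def perturb(x, epsilon):
    """Nudge every feature by at most epsilon in the direction that lowers the score.

    For a linear model the gradient of the score with respect to the input is
    simply WEIGHTS, so stepping against sign(WEIGHTS) is the fast-gradient-sign idea.
    """
    return [xi - epsilon * sign(w) for xi, w in zip(x, WEIGHTS)]


sample = [0.5, 0.2, 0.4, 0.1]
adversarial = perturb(sample, epsilon=0.3)
max_change = max(abs(a - b) for a, b in zip(sample, adversarial))

print("original:   ", classify(sample))
print("adversarial:", classify(adversarial))
print("largest per-feature change:", round(max_change, 3))
```

Each feature moves by a bounded amount, yet the label flips. Real attacks target far larger models and often work without direct access to the weights, but the principle of adding small, targeted "noise" is the same.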
“The second is recent advances in generative machine learning, whereby very believable but entirely fake images, video and audio can be produced to make any person look and/or sound as if they have been caught on film doing anything an adversary might wish that they had done. You can probably imagine several troublesome scenarios from the YouTube demo videos.”
“Several online services have recently banned so-called “deepfakes” – porn videos apparently featuring celebrities but that have been generated using these kinds of techniques and rendering celebrities’ faces onto those in actual porn videos.”
FitzGerald describes the implications for face and voice-recognition security systems as “obvious”.
Speaking to PC World, Trend Micro’s Dr. Jonathan Oliver issues a similar warning. He says that “once you’ve got fake video, there’s all sorts of opportunities for cybercriminals.”
“Deep learning allows the capabilities to modify video in a realistic way [and] in a way that’s harder to detect than existing technology,” he says.
“At the moment, it’s amateurish - it’s swapping in celebrities on porn - but think about the cyber-criminal effects. We can’t necessarily trust CCTV. One thing we’ve considered is fake videos doing an extension of the business compromise.”
“They’re going to pick the elements that suit their business model,” he says. The ability of cybercriminals to convincingly fake their way past both human and machine-based security measures could give even obsolete malware or well-known scams a new lease on life.
“It’s going to be all sorts of content,” he says.
When asked what steps can be taken to defend against such uses of machine learning, FitzGerald says there aren’t many easy answers. At least, not yet.
“The academic research into adversarial examples is not currently very encouraging, as some very recent research suggests that previous, apparently successful fixes for adversarial example attacks are not actually that successful at all.”
He also says “there is a fair deal of concern about the ongoing feasibility of deploying real-world software implementations of machine-learning-based systems, especially given the potential damage of such systems failing due to deliberate attack. For example, self-driving vehicles are heavily dependent on machine-learning and a major selling point of such vehicles thus far has been the promise that they are (or will soon be) much less accident-prone.”
To try and counteract some of these concerns, he says that “ESET’s endpoint protection technology utilises machine learning in-house, rather than on the client machine. It is used for much of our automated sample processing and analysis, and for creating detection updates through both our ESET LiveGrid technology and our detection module updates.”
He says that “our machine learning output is also under constant observation and oversight by expert human malware researchers and analysts. Because of the broad use of Gradient Tree Boosting, and because humans are involved in the final decision-making processes, our machine learning systems are not as prone to the kinds of attacks I’ve described above for on-client and/or Deep Learning machine learning products.”
To put it in more accessible terms, machine learning is making it easier to fool both humans and machines, but it can’t do both at the same time. By keeping the human element in the mix, there may be hope yet in the fight to stay secure against deepfakes, adversarial machine learning and whatever other malicious applications of the technology are yet to be discovered.
Let’s hope it’s not a false one.