The buzz around OpenAI’s chatbot, ChatGPT, along with advances in artificial intelligence (AI) and machine learning (ML), has prompted interest in how these technologies could be maliciously used. Intel 471 has seen increasing interest in “deepfake” production services advertised on underground forums. Deepfakes are images, audio and video clips that have been synthetically produced. Deepfakes can pose serious harm, from misinformation to fraud to harassment. Threat actors see potential in this. To understand how they can weaponize deepfakes to exploit new attack vectors, security teams first must be aware of the underlying technology’s limitations. Intel 471 analyzed deepfake services to see what’s on offer, what threats the services may pose and what lies ahead.
What Is a Deepfake?
The term “deepfake” is an amalgamation of “deep learning” and “fake.” Deepfakes are defined as realistic synthetic imagery or audio created using ML. This technology is leveraged to augment or substitute the likeness of a human with realistic computer-generated content. Deepfakes are more convincing than traditional photo or video editing because they leverage sophisticated ML techniques such as generative adversarial networks (GANs). GANs work by pitting one AI application against another – the first is the generative network that creates a deepfake image based on a set of parameters, and the second is the discriminative network that compares that image to a real-life image and tries to identify which is the fake. The first AI then tries to improve the fake such that the discriminative network accepts it as real. The quality of video deepfakes widely varies, but some of the best – take the Tom Cruise ones – show that there’s a future ahead where it becomes exceedingly difficult for the human eye to tell the truth from AI-generated fiction.
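The adversarial loop described above can be sketched in a few lines. The following toy example (a sketch in plain NumPy, not any real deepfake tool) trains a one-parameter generator to shift random noise toward a “real” data distribution while a logistic-regression discriminator tries to tell the two apart; this is the same pattern GAN-based deepfake generators follow at vastly larger scale:

```python
import numpy as np

rng = np.random.default_rng(0)

REAL_MEAN = 4.0  # the "real" data: samples from N(4, 1) the generator must mimic

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Generator: G(z) = z + theta, a single learnable shift applied to noise.
theta = 0.0
# Discriminator: D(x) = sigmoid(a * x + b), a logistic real-vs.-fake classifier.
a, b = 0.0, 0.0

lr, batch, steps = 0.05, 64, 2000
for _ in range(steps):
    real = rng.normal(REAL_MEAN, 1.0, batch)
    z = rng.normal(0.0, 1.0, batch)
    fake = z + theta

    # --- Discriminator step: ascend log D(real) + log(1 - D(fake)) ---
    d_real, d_fake = sigmoid(a * real + b), sigmoid(a * fake + b)
    a += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    b += lr * (np.mean(1 - d_real) - np.mean(d_fake))

    # --- Generator step: ascend log D(fake) (non-saturating GAN loss) ---
    d_fake = sigmoid(a * (z + theta) + b)
    theta += lr * np.mean((1 - d_fake) * a)

print(round(theta, 2))  # the learned shift; it should drift toward REAL_MEAN
```

As the generator’s output distribution approaches the real one, the discriminator’s accuracy degrades toward chance, which is exactly the failure mode that makes mature deepfakes hard to spot.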
We first took a look at underground markets for deepfake products in 2021. At that time, several threat actors were offering services that purportedly could bypass video verification and allow for account takeovers at financial institutions and cryptocurrency exchanges. One threat actor claimed that their service allowed someone to bypass two-factor authentication (2FA) and video verification at five cryptocurrency exchanges. But broadly, there appeared to be little meaningful use of deepfake technology for successfully perpetrating cybercrime, and the discussion of its use was limited and superficial.
Two years on, we have seen renewed interest in deepfakes due to the proliferation of ever-improving video deepfake examples and the release of ChatGPT. Threat actors are offering various services, such as taking an existing video, swapping out a person’s face and adding a new voice. Another service purports to use real actors to imitate someone and then apply post-production techniques to refine the spoof. We observed an overall increase in threat actors offering deepfake services in the underground since 2021. This trend will likely continue at a steady rate rather than explode in use or application.
A Barrier: Quality
We found video deepfake products on underground forums to be immature at this point.
After reviewing samples provided by each of these services, we discovered there still is progress to be made before a truly convincing deepfake is financially viable for the underground market. The videos were of varying quality, and we spotted details in each that indicated a synthetic product. Pricing usually reflects the amount of resources a threat actor has invested in making the final product. The quality of deepfake offers in the underground, as well as those portrayed in open sources, can be tied to the sophistication of the service or individual making the synthetic media. There are several open-source software tools, including DeepFaceLab, that can be leveraged. Higher-quality deepfakes require greater effort, skill, money, time and computing power to create. Convincing deepfakes also require access to sufficient imagery source material for the person being spoofed, which is why celebrity deepfakes are often the more polished examples. It would be difficult to create a deepfake of someone who has only a small imagery footprint on the internet.
The full criminal potential of deepfakes is unknown at this time and is limited only by the ingenuity of the threat actors creating them. While we postulated that a theoretical application of deepfake technology against traditional security practices would be highly successful, we maintain our stance that obtaining the number and quality of images and videos required to create a convincing product is difficult.
Research projects and other efforts are underway to develop systems that can detect deepfake content (such as the Deepfake Detection Challenge run by Meta in 2020). While the Tom Cruise deepfake is a standout, low-budget and less sophisticated deepfakes can be detected by closely observing images to identify slight imperfections. Those can include face discolorations and asymmetry, variances in lighting, mismatched audio and video, and blurriness where the face meets the neck and hair. Additionally, algorithms can detect deepfakes by analyzing images and revealing small inconsistencies in pixels, coloring or distortion. It also is possible to use AI to detect deepfakes by training neural networks to spot changes in facial images that have been artificially altered by software.
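As an illustration of pixel-level detection, the sketch below (a hypothetical heuristic, not a production detector) flags image blocks whose high-frequency noise statistics are outliers relative to the rest of the frame; spliced or synthesized regions often carry noise that does not match the surrounding camera sensor noise:

```python
import numpy as np

def noise_residual(img):
    """High-frequency residual: the image minus a 3x3 box-blurred copy.
    Synthesized or pasted regions often have different residual statistics."""
    pad = np.pad(img, 1, mode="edge")
    h, w = img.shape
    blur = sum(pad[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0
    return img - blur

def flag_inconsistent_blocks(img, block=8, z_thresh=3.0):
    """Split the residual into blocks and flag those whose noise variance
    is a statistical outlier relative to the rest of the frame."""
    res = noise_residual(img.astype(float))
    h = (img.shape[0] // block) * block
    w = (img.shape[1] // block) * block
    tiles = res[:h, :w].reshape(h // block, block, w // block, block)
    variances = tiles.var(axis=(1, 3))
    z = (variances - variances.mean()) / (variances.std() + 1e-9)
    return np.abs(z) > z_thresh  # boolean map of suspicious blocks

# Demo: a uniformly noisy frame with one "pasted" patch of much noisier content.
rng = np.random.default_rng(1)
frame = rng.normal(0.0, 1.0, (64, 64))
frame[16:32, 16:32] = rng.normal(0.0, 5.0, (16, 16))
flags = flag_inconsistent_blocks(frame)
```

Real detectors replace this hand-rolled statistic with learned features, but the principle is the same: manipulated regions rarely match the rest of the frame perfectly at the pixel level.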
There are methods under development that are aimed at proving a video clip hasn’t been manipulated. One uses cryptographic algorithms to insert hashes at set intervals that would change if the video is altered. Applications also already exist that can slow deepfake creation by inserting digital artifacts to conceal the patterns of pixels that facial detection software uses. There are several projects dedicated to this space, including the Content Authenticity Initiative (CAI), which has created open-source tools that can be used to insert “provenance signals” into content. Another is the Coalition for Content Provenance and Authenticity (C2PA), which has developed a standard to mark content with provenance information.
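The interval-hashing idea is straightforward to sketch. Below is a minimal, hypothetical illustration using SHA-256 from Python’s standard library: each fixed-size segment’s hash also commits to the previous hash, so altering any portion of the clip invalidates every hash that follows. (Real schemes such as C2PA embed signed provenance metadata rather than this bare chain.)

```python
import hashlib

SEGMENT = 4096  # bytes per segment; a real scheme would segment by frame or time

def segment_hashes(video_bytes, segment=SEGMENT):
    """Compute a chained SHA-256 hash for each segment of the clip."""
    prev = b"\x00" * 32  # fixed starting value for the chain
    chain = []
    for i in range(0, len(video_bytes), segment):
        prev = hashlib.sha256(prev + video_bytes[i:i + segment]).digest()
        chain.append(prev)
    return chain

def verify(video_bytes, recorded_chain, segment=SEGMENT):
    """Recompute the chain and compare it to the hashes recorded at creation."""
    return segment_hashes(video_bytes, segment) == recorded_chain

# Demo: flipping a single byte anywhere breaks verification from then on.
original = bytes(range(256)) * 100          # stand-in for raw video data
chain = segment_hashes(original)
tampered = original[:5000] + b"\x00" + original[5001:]
```

Because each hash depends on its predecessor, a verifier can also pinpoint roughly where in the clip the first alteration occurred: the first segment whose hash mismatches.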
There is one category of AI content that would appear to pose a more immediate risk: audio. Joseph Cox, a journalist with Vice’s Motherboard technology section, recently used an AI service to synthetically create his voice. The synthetic voice passed a voice authentication prompt, allowing him to access the balance of his bank account. He started by recording five minutes of himself speaking and uploading it to ElevenLabs, which created passages such as “My voice is my password.” The bank rejected those initial recordings and didn’t let him in. However, Cox recorded a longer passage of himself speaking, passed it to ElevenLabs and achieved a more refined result. After Cox supplied his birthdate and the new voice recording, the bank granted him access to his balance and transaction record.
Service providers want authentication to be in line with risk, but even accessing an account balance with just a voiceprint and a birthdate seems to cross a red line for risk in today’s threat environment. Audio deepfakes could aid business email compromise (BEC) schemes, particularly ones targeting organizations that have taken measures to guard against it. Those measures could include out-of-band communications, such as phone calls to verify, say, wire transfers. Mimicking a CEO’s voice could persuade an employee to undertake a risky action.
Assessment and Outlook
Even with the barriers threat actors face in leveraging deepfake technology at scale, we continue to see underground services claim to be able to create passable deepfake videos, images and audio clips. These products could be useful to attackers attempting to bypass facial verification applications, as well as for document forgery, social engineering via live video calls and voice phishing (vishing) scams. Quality remains a problem, however: it is still possible to spot obvious flaws in deepfake video products offered on underground markets, which prompts questions about their veracity. We expect this to be an ever-evolving area, with the ability to create more convincing deepfakes at a lower cost over time.