Mona Sedky, an attorney in the Justice Department’s Computer Crime and Intellectual Property Section, billed herself today as the “voice of doom,” and lived up to that title in describing the potential security downsides of voice cloning technologies at a Federal Trade Commission workshop.
In its announcement for the workshop, the FTC explained that “voice cloning technologies … enable users to make near-perfect reproductions of a real person’s voice.” Cybercriminals are able to harness rapid advancements in artificial intelligence and text-to-speech synthesis to create “a near-perfect voice clone with less than a five-second recording of a person’s voice,” the agency said.
Sedky, who specializes in online fraud that uses social engineering, zeroed in on how cybercriminals can use deep fake audio and voice cloning to communicate while committing criminal acts.
She said cybercriminals really don’t want to communicate with their victims because such communications “leave fingerprints” that increase their risk of being caught. On top of that, communicating is “costly” and time-consuming for criminals, and many of them “are bad at it,” Sedky said.
The DoJ official emphasized that without voice cloning technology, “it’s difficult to convincingly pose as someone else.”
Because of the “advent of things like deep fake video, deep fake audio,” and other anonymizing communication tools, there has been “an enormous uptick in communication-focused crime,” she explained. With all of that in mind, she said there are “almost guaranteed criminal uses of this technology.”
Sedky said cybercriminals looking to commit fraud will “love this technology,” and predicted a resulting rise in “grandparent scams” and business email compromises – which Sedky said represent the “single largest financial injury of all computer crimes going on right now.”
Grandparent scams, as the name suggests, target the elderly and seek to extract money through threats or fabricated problems involving their family or loved ones.
Sedky said a common fact pattern for business email compromises involves cybercriminals using spear phishing emails to impersonate an individual who does legitimate business with the victim. While businesses are becoming smarter about phishing attacks, Sedky said that voice cloning technology allows criminals to follow up on their emails with convincing phone calls.
Returning to her self-proclaimed role as the “voice of doom,” she compared voice cloning to the internet, both on the upside and downside. “Just because the internet can be weaponized against people, doesn’t mean we shouldn’t have the internet. It just means that these are things we need to be thinking about,” she said.