When the first voice was cloned and the first deepfake video rendered, a creature was set loose that is still chasing its own tail, a self-perpetuating cycle of innovation and condemnation.
Imagine being a fly on the wall when the first deepfake was rendered. What were the reactions of those in the room who saw it?
They must have known that they were creating a monster, a really cool-looking monster with enormous potential, but like the fictitious founder of “Jurassic Park,” did they really believe they could control it?
Nestled away in the labs of startups, governments, and big tech companies alike, deepfake technology is being incubated in one space while its antidote is being developed in another: a Bellerophon to combat every shape-shifting Chimera.
No one is really asking if we actually need the ability to clone other people’s voices or to make fake videos indistinguishable from the real ones; we just know that it can be done well enough to make a profit in the short term.
Hollywood and the entertainment industry have been at work on this for decades; thank you, Auto-Tune!
Those who develop deepfake technology are acutely aware of its tremendous power for abuse, yet they are like moths to a flame they believe they won’t get burnt by.
And so they slap an “Ethics” page on their websites, acknowledging the potential for misuse while setting themselves up as moral guardians of society, giving us the technology in tiny increments lest we choke on our own temptations.
As Lyrebird AI, the company creating “the most realistic artificial voices in the world,” states on its ethics page:
“Imagine that we had decided not to release this technology at all. Others would develop it and who knows if their intentions would be as sincere as ours…”
I understand where they’re coming from. After all, Albert Einstein wrote to President Roosevelt urging the US to harness the power of the atom before the Germans got their hands on the bomb. Its creation was inevitable.
The same could be said about a number of emerging technologies more perilous than deepfakes, not least of which is an artificial superintelligence that could bring us to the brink of the singularity.
At least with deepfakes, viable countermeasures are already available; the same can’t be said of a superintelligent AI.
Lyrebird’s “artificial voice” offering lets users create their own avatars. This is helpful to businesses in personalizing interactions with voices that sound more human.
Modulate’s “real time voice skins” allow gamers to “speak like a celebrity.”
Synthesia’s “AI-driven video production” lets you “synchronize the lip movements of an actor to a new dialogue track” that will look and sound realistic across at least nine languages.
Each company has taken precautions not to open Pandora’s box, and each has offered a viable and profitable business solution.
These are by no means the only companies doing voice cloning. Others, such as Dessa and Facebook’s MelNet, are creating state-of-the-art audio deepfakes.
However, the deepfake tech goes both ways, and it is here where we see the ouroboros of innovation.
Despite all the precautions being taken by “ethical” startups who dare not call their products deepfake technology, other companies are working to keep these technologies (if not the startups themselves) in their cages.
Deepfake creators are keeping one step ahead of the deepfake destroyers with their innovative and “ethical” solutions. Deepfake destroyers, on the other hand, cannot exist without deepfake creators.
Deeptrace “utilizes computer vision technologies to detect deepfakes hidden in plain sight, and authenticate audiovisual media that has been manipulated.”
Truepic, whose patented “Controlled Capture” camera technology “bolsters the value of authentic images by establishing provenance and verifying data integrity at the point of creation,” recently took first place at the Identity & Truth Tech Discovery Event put on by Microsoft’s M12, intelligence community funding arm In-Q-Tel, and Silicon Valley Bank.
And last week the Defense Advanced Research Projects Agency (DARPA) announced the Semantic Forensics (SemaFor) program to take a crack at deepfakes in the media, including voice cloning, video manipulation, and altered text.
The efforts to combat deepfakes are not efforts to take down startups working with this technology, and it is not my intention to vilify any of the companies mentioned.
However, there is a need to stop those who would use deepfakes for nefarious purposes, and those who would use them for their own benefit may be closer to us than we think.
The cat’s out of the bag, been shoved back into a box, and no one knows what type of creature will emerge.