At its re:MARS 2022 event, Amazon showcased an experimental Alexa artificial intelligence (AI) feature that lets the assistant talk to you in the voice of a dead relative. This deepfake voice technology is said to need only one minute of recorded audio to create a passable imitation.
The video demo at Amazon’s re:MARS (machine learning, automation, robotics, space) conference showed a child asking Alexa to read him The Wizard of Oz. But instead of Alexa’s usual voice, the reading was in the voice of his dead grandmother.
Amazon’s head scientist for Alexa AI, Rohit Prasad, introduced the clip by saying that adding human-like empathy to AI systems was increasingly important “in times of the ongoing pandemic when so many of us have lost someone we love. While AI can’t eliminate that pain of loss, it can definitely make their memories last.”
Amazon says that its AI can learn to imitate someone’s voice from just a single minute of recorded audio. That means anyone’s voice can be immortalised and brought back to life, even if they passed away long ago, as long as you have a recording of it. And that is creepy.
Audio deepfakes are not new, and the more prominent use cases have come from Hollywood. Val Kilmer’s voice was artificially recreated with the help of Sonantic for Top Gun: Maverick after his tracheotomy. Meanwhile, Respeecher was said to have augmented Mark Hamill’s voice in The Book of Boba Fett.
There are interesting uses for the technology. Imagine hearing your own voice coming from the character you are playing in a video game, instead of a silent protagonist or an extremely restricted (or extremely expensive) set of voiced lines. Or bringing back beloved characters from old movies.
But there are many potential dangers as well. As with visual deepfakes, bad actors have used audio mimicry to do harm. Back in 2019, thieves used fake voices to trick companies into transferring money into their accounts.
Then we have the issue of permission. The documentary Roadrunner drew criticism when it was released in 2021 for using an AI-synthesised voice of the late Anthony Bourdain without his permission. And should the technology become more commonly available, we’ll need to address what constitutes stealing and what is fair use.
This brings us back to Amazon’s Alexa AI and the dead grandma reading the bedtime story. Amazon did not address potential issues about privacy and consent.
Did grandma agree to allow her voice to be used? What if instead of reading a story, the AI is asked to do something out of character? Imagine someone in the future using your voice as part of a comedic punchline. And shouldn’t someone who died be allowed to live on in memories and not be clung to forever in the present?
Granted, this is still a rather new frontier. But if it hopes to move forward in voice AI, Amazon should start considering the social and moral ramifications of what Alexa can do, and the safeguards it can put in place, lest the dystopian futures portrayed in Black Mirror come to pass. Pretty sure Google and Microsoft (sorry, Cortana) will be taking notes.