A British startup today unveiled new AI humans that blur the line between the virtual and the real. Synthesia calls the digital beings “Expressive Avatars” and promises they deliver the most realistic emotional expressions on the market.
Generated by an AI model trained on footage of real actors, the avatars are built for video creation. Users simply enter a text prompt and the synthetic humans read it out on screen.
The photorealistic renders are certainly impressive. But what makes them unique is their capacity to convey human feelings.
Using a technique called “automatic sentiment prediction,” Synthesia’s AI models infer the emotions within text. This determines the avatar’s tone of voice, body language, and facial expression.
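For readers curious what that looks like under the hood, here is a minimal, hypothetical sketch of the general idea: a generic sentiment classifier infers the mood of a script, and that prediction is mapped to rendering cues for an avatar. It uses the open-source transformers library and an invented mapping purely for illustration; it is not Synthesia’s actual system, whose models are not public.

```python
# Illustrative sketch only: a generic sentiment model mapped to invented
# expression parameters. This is NOT Synthesia's implementation.
from transformers import pipeline

# Off-the-shelf sentiment classifier (assumption: any text-classification
# model would do here).
classifier = pipeline("sentiment-analysis")

# Hypothetical mapping from predicted sentiment to avatar rendering cues.
EXPRESSION_PRESETS = {
    "POSITIVE": {"voice_tone": "upbeat", "facial_expression": "smile", "speech_rate": 1.1},
    "NEGATIVE": {"voice_tone": "sombre", "facial_expression": "concerned", "speech_rate": 0.9},
}

def expression_for(script: str) -> dict:
    """Infer the sentiment of a script and return rendering cues for the avatar."""
    result = classifier(script)[0]  # e.g. {"label": "POSITIVE", "score": 0.99}
    return EXPRESSION_PRESETS.get(result["label"], {"voice_tone": "neutral"})

print(expression_for("We smashed our targets this quarter!"))
```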
“This is definitely the first iteration of avatars that can express emotions and understand the sentiment of the content,” said Victor Riparbelli, Synthesia’s CEO and co-founder.
The avatars will also always generate entirely new and unique outputs. Feed them the same script twice and they will respond with two different performances. In Synthesia’s words, they’ve evolved from “digital renders” to “digital actors.”
At a product demo on Monday, TNW got to review their acting chops.
AI avatars take the stage
To this humble critic’s eyes, Synthesia’s avatars are the best GenAI actors to ever perform on the screen.
The combination of photorealistic faces, emotional gestures, expressive voices, and synchronised movements brings a new level of realism to the market.
Given an upbeat script, the avatars delivered a smile and energetic tone. When fed sadder lines, they offered a sombre inflection and slower speech.
However, their performances still haven’t escaped the uncanny valley. Their main shortcoming is a tendency to slightly exaggerate their emotions. As actors, they’re closer to D-list soap stars than Academy Award winners.
Another drawback is that their movements are confined to the head, face, and shoulders. As long as that remains the case, they will probably only pose a threat to newsreaders.
Despite these limitations, the avatars have the potential to unlock new applications.
Ready for new roles
Synthesia, which reached unicorn status last year after raising $90mn (€84mn) at a $1bn (€932mn) valuation, claims that 55,000 businesses — including half of the Fortune 100 — are already customers. They typically use the platform to create videos for training, presentations, marketing, and customer service.
With the new avatars on board, Synthesia plans to expand the use cases. Promotional videos could get an energy boost. Customer support avatars could add a friendly touch. Healthcare providers could bring empathy to presentations on sensitive topics. All these emotions are, of course, artificial, but they’ve become increasingly realistic.
The added realism, however, creates risks. As avatars become increasingly indistinguishable from real humans, their capacity to spread disinformation grows.
Synthesia has already been exploited for that purpose. The startup’s tech has previously been used to produce fake news in China, Venezuela, and Mali.
In response, Synthesia has banned accounts, introduced new rules, and upgraded digital defences.
Over 60 forms of content are now prohibited or restricted. A combination of tech filters and human oversight provides moderation, new customers are vetted, and the platform is regularly audited. Over 10% of the company’s staff is dedicated to trust and safety work, Synthesia said.
As the world prepares for the biggest elections in human history, concerns about deepfakes are growing. Thus far, however, digital humans haven’t lived up to their supposed threat.
Preparing for bigger parts
At Synthesia’s London HQ, the avatars are eyeing more legitimate roles. Their latest performances are just a “first glimpse of what this model is capable of,” Riparbelli said. He expects “a 10x improvement” in their capabilities this year.
Eventually, Synthesia plans to digitise the entire workplace experience. The startup envisions AI avatars roaming 3D offices, communicating with virtual humans, introducing us to new colleagues, finding meeting rooms, and teaching staff within the workspace.
But in the nearer future, Riparbelli has a specific target in sight.
“The internal goal for the end of the year is that we can produce an Apple Keynote of someone walking and talking and that it feels and looks entirely real,” he said.
That might alarm one budding actor in particular. Tim Cook, your days on the stage could be numbered.