The American company Veritone claims to have introduced, with MARVEL.ai, one of the first commercial AI voice solutions for professional use. The end-to-end Voice-as-a-Service (VaaS) platform is meant to let media companies, brands, branders, celebrities and influencers create, manage, license and monetise hyper-realistic synthetic voices. Veritone MARVEL.ai supports both text-to-speech and speech-to-speech rendering.
If you fight your way through the buzzword bingo of the current press release, you will also learn that MARVEL.ai tries to take care of the complete rights exploitation of the licensed voice. In other words, it aims to enforce a kind of digital rights management on the characteristic sound of a voice licensed through the platform.
Well, that sounds interesting, because a lot will need to be clarified in this area in the future. For example, how should one deal with a "natural" voice imitator who can deceptively imitate a celebrity? Interestingly enough, there have been legal limits on this for decades.
And conversely, does a celebrity whose voice is simulated by AI have to consent to the content of each text snippet rendered via MARVEL.ai? If so, the speed advantages of synthetic production would quickly disappear as soon as someone wants to tweak a text at short notice.
Such corrections are likely to occur even more frequently in practice than Veritone suggests. If you listen to the German (and Austrian) audio demonstrations of the text-to-speech engine on the website, most sentences still sound somewhat stiff and not at all like emotional, high-end audio advertising. If you wanted to advertise a "Melange" in Vienna, for example, you would probably not get any useful results from such an AI engine. Whether things sound better in the English-speaking market is difficult for us to judge, but considering that radio advertising is often characterised by exaggerated vocal emotion, we believe that a few more generations of AI are needed before the results sound convincing.
In this light, one thing stands out above all: AI is advancing steadily, in speech synthesis as elsewhere. And yet we are still far from making professional speakers obsolete. The lawyers, on the other hand, are already lining up to divide the digital skins...