Synthetic voices
Generating automated voices
This operation generates all the sounds for the synthetic voices, along with their lip movements, if they have not been generated yet. Sounds that were previously generated are not created again, and obsolete ones are deleted to clean up the project. Therefore, if you change a text, the new sound is generated and the old one is deleted when you launch the generation process. If you wish to keep your current sounds, duplicate your project before the generation.
If you have set a default VTS Account from the main menu, it will be used automatically. Otherwise, you will have to log in to start the generation.
An internet connection is needed to run the generation.
You can save your project before generating all sounds (the editor will remind you to do so).
This operation can take several minutes but it can be stopped. If you cancel it, the voices that were generated will be kept, and the corresponding amount of credits will be debited.
Synthetic voice partners
When you choose a character's voice, you can see a 2-letter prefix (for example AC, RS, GS or GW) at the beginning of each voice's name. This prefix identifies the synthetic voice provider the voice comes from.
Here is the list of voice synthesis partners available in VTS Editor:
- Google (GS and GW) : https://cloud.google.com/text-to-speech/
- GS stands for "Google Standard" and GW for "Google WaveNet".
WaveNet is generally of better quality (more information here).
- Microsoft Azure (MS) : https://azure.microsoft.com
- IBM : https://www.ibm.com/
- ElevenLabs (EL) : https://elevenlabs.io/
Former suppliers Acapela and ReadSpeaker are no longer available for generation as of January 1, 2024.
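As an illustration of how the prefixes above can be read, here is a minimal Python sketch that maps a voice name to its provider. It is not part of VTS Editor. The function name `provider_of`, the sample voice names, and the assumption that the prefix is exactly the first two characters of the name are all hypothetical. Reading AC as Acapela and RS as ReadSpeaker is also an assumption based on the former suppliers named above.

```python
# Prefix table built from the partner list above.
# AC and RS expansions are assumptions (former suppliers).
PROVIDERS = {
    "GS": "Google Standard",
    "GW": "Google WaveNet",
    "MS": "Microsoft Azure",
    "EL": "ElevenLabs",
    "AC": "Acapela (former supplier)",
    "RS": "ReadSpeaker (former supplier)",
}


def provider_of(voice_name: str) -> str:
    """Return the provider for a voice name, based on its 2-letter prefix.

    Assumes the prefix is the first two characters of the name.
    """
    prefix = voice_name.strip()[:2].upper()
    return PROVIDERS.get(prefix, "unknown provider")


# Hypothetical voice names, for illustration only.
print(provider_of("GW Example Voice"))
print(provider_of("EL Example Voice"))
```

A voice whose name does not start with one of the listed prefixes falls back to "unknown provider" rather than raising an error.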
Customize Google voices pronunciation
For voices from the Google provider, you can use SSML tags to add various effects to the synthesized voice: pause, spell out, speed up, slow down, etc.
For example, the text "<speak>I hesitate... <break time="3s"/>No, I really don't know.</speak>" will be spoken with a three-second pause after saying "I hesitate..."
The available tags can be found in the dedicated documentation: Google SSML documentation
These tags must only be used in the text pronunciation field (Reformulation button located to the right of each text field pronounced by a character), so that the text displayed in the character's subtitles is not modified.
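If you prepare pronunciation texts outside the editor, a small script can help you avoid malformed tags. The following Python sketch builds the pause example above with the standard library and checks that the result is well-formed XML. The helper name `with_pause` is hypothetical, and the sketch only produces a string to paste into the pronunciation field; it does not call VTS Editor or Google.

```python
import xml.etree.ElementTree as ET


def with_pause(before: str, after: str, seconds: int) -> str:
    """Return SSML that speaks `before`, pauses, then speaks `after`."""
    speak = ET.Element("speak")
    speak.text = before + " "
    # <break time="Ns"/> is the SSML tag for a pause of N seconds.
    pause = ET.SubElement(speak, "break", {"time": f"{seconds}s"})
    pause.tail = after
    return ET.tostring(speak, encoding="unicode")


ssml = with_pause("I hesitate...", "No, I really don't know.", 3)

# Parsing the string back raises ParseError if a tag is unbalanced.
ET.fromstring(ssml)
print(ssml)
```

Building the markup through an XML library rather than by string concatenation also escapes characters such as "&" and "<" in the spoken text.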