Torque Talk
by James Holmes · 09/18/2002 (9:15 am) · 15 comments
Torque Talk TTS
This is a basic English TTS (text-to-speech) extension for TGE. It is based on the Flite engine written by Alan W Black and Kevin A. Lenzo of Carnegie Mellon University. As far as I can tell, all the code is free and can be used in commercial products.
Currently the voice quality is not very good, not because the engine itself is lacking, but because the voice database is experimental. See the comments in the 'Future improvements' section.
Integrating the TTS code should be straightforward. There are no changes to the TGE code itself. You do of course need to have sound working with Torque; get the OpenAL library if necessary.
A special thanks to Rob and John at Online Systems cc for hosting the download files.
Windows installation
To make it easy, I've precompiled the Flite engine into a DLL.
1. Download the zip.
8KHz version, 2.6MB
16KHz version, 3.9MB.
2. Unzip flite_headers.zip to a suitable directory, say \torque\tts. It does not really matter where you put it because you will specify the location in the next steps.
3. Place tts.cc and flite.lib in a suitable place, say in the \torque\tts directory mentioned above.
4. Add tts.cc to the Torque project.
5. Add the directory containing flite.h to the preprocessor's include path.
6. Add flite.lib to the libraries used in the link.
7. Compile and link Torque.
8. Place flite.dll in the same directory as the Torque executable.
That's it. Have a look at the script examples below.
If you need detailed help with steps 2-7, just ask.
Compiling Flite under VC6
Because I've precompiled Flite, this should not be necessary unless you want to make changes to Flite itself. If you really want to compile Flite under VC6 then you will probably have to make these changes.
1. Add the definition 'NO_UNION_INITIALIZATION' to the preprocessor.
2. Increase the compiler heap size with /Zm300. This is needed to compile the big dictionary file.
3. Switch off optimizing. One of the .obj files is invalid if optimized :(
4. Use the included .DEF file to specify the DLL exports.
I did not add the following files to the project; they are not required for Torque and give compile errors.
cst_file_wince.c
cst_mmap_posix.c
cst_socket.c
cst_mmap_win32.c
Linux and Mac
I am not familiar with developing under these so I cannot be of much help. It should not be too difficult to get TTS to work however.
The first step is to get Flite and compile it. It is developed under Linux, so for you Linux people that should simply be a matter of running the makefile.
Download Flite source
Then download the tts_cc.zip file.
The next step is to add tts.cc to the Torque project and to compile it. Now you're all set to check out the scripting interface below.
Scripting interface
TTS is controlled with client-side scripts. (There are equivalent C/C++ methods as well.) The server can of course send text to the client, where it is then spoken. Besides the commandToClient() interface, the Torque SDK document has an example of a sim object oriented transfer in the Networking chapter, Network Ghosts and Scoping section.
These are the handful of console functions you can call.
getVoice
This is the first call you have to make. It gets a handle to a voice. The voice defines the sex, language etc (in future), as well as the volume, 3D/2D, looping etc. The handle is used in later calls.
Currently there can be up to 5 active voices. If more are needed you need to reuse them with getVoice(), speak(), and then releaseVoice().
voice = getVoice(speaker, audio_description)
speaker - The name of the voice, unused for now but recommended to be "kevin"
audio_description - Name of an AudioDescription data block. (Not an AudioProfile block).
returns - a handle to a voice
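The five-voice limit and the getVoice()/releaseVoice() pairing can be pictured as a small fixed-size handle pool. The following C++ is a minimal sketch of that idea, not the code in tts.cc: the class name VoicePool, its methods, and the convention that 0 means "no voice" are all assumptions made for illustration (the 0 convention is only suggested by the chat HUD example, which tests $chatVoice == 0).

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-size pool of voice slots. All names here are
// illustrative assumptions and do not come from tts.cc.
class VoicePool {
public:
    static constexpr std::size_t kMaxVoices = 5;  // limit stated in the article

    // Returns a handle in 1..kMaxVoices, or 0 when every slot is taken.
    int acquire() {
        for (std::size_t i = 0; i < kMaxVoices; ++i) {
            if (!inUse_[i]) {
                inUse_[i] = true;
                return static_cast<int>(i) + 1;
            }
        }
        return 0;
    }

    // Frees a slot so it can be handed out again.
    // Out-of-range handles are ignored.
    void release(int handle) {
        if (handle >= 1 && handle <= static_cast<int>(kMaxVoices)) {
            inUse_[handle - 1] = false;
        }
    }

    // Number of slots currently handed out.
    std::size_t activeCount() const {
        std::size_t n = 0;
        for (bool used : inUse_) {
            n += used ? 1 : 0;
        }
        return n;
    }

private:
    std::array<bool, kMaxVoices> inUse_{};  // value-initialized: all false
};
```

With a pool like this, a sixth acquire() fails until some handle is released, which is why a short-lived voice should be released as soon as it has finished speaking.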
releaseVoice
Releases a voice and any resources held by it.
releaseVoice(voice)
voice - handle to a previously obtained voice.
speak
Speaks text. It is generally a good idea to keep these messages relatively short; don't try to pass an entire book in one call :)
speak(voice, text [,x,y,z])
voice - handle to the voice to use.
text - The message that will be spoken.
x,y,z - Optional, the coordinates where the sound will originate. These are required if the voice is defined as 3D in the AudioDescription block.
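One way to keep each call short is to split longer text into sentences on the calling side and pass them to speak one at a time. The following C++ is a minimal sketch of such a split, assuming a naive rule that a sentence ends at '.', '!' or '?'. The helper name splitIntoSentences is hypothetical and is not part of the TTS extension or of Flite.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: cuts text after each '.', '!' or '?' and drops the
// spaces that follow a cut. Abbreviations and ellipses are not handled.
std::vector<std::string> splitIntoSentences(const std::string& text) {
    std::vector<std::string> chunks;
    std::string current;
    for (char c : text) {
        if (current.empty() && c == ' ') {
            continue;  // skip spaces at the start of a chunk
        }
        current += c;
        if (c == '.' || c == '!' || c == '?') {
            chunks.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        chunks.push_back(current);  // trailing text without a terminator
    }
    return chunks;
}
```

Each returned chunk could then be handed to the speech call separately, so no single call has to synthesize a long passage.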
shutup
Stops a voice, useful for those irritating nagging messages. The handle is still valid and can be used to output further text.
shutup(voice)
voice - handle to the voice to stop.
shutdownVoices
Stops all voices and recovers resources used by them. Call this at the end of your mission.
shutdownVoices()
Script examples
Here are two examples, one to speak the messages displayed on the chat HUD and another where you can talk to a tree.
Define AudioDescription blocks
Here we define two voice descriptions. The first is a 2D voice which we'll use for speaking the chat messages. The second is a 3D voice we will attach to a tree.
Add the following to the end of fps/client/scripts/audioProfiles.cs.
new AudioDescription(ChatSpeech)
{
    volume = 1.0;
    isLooping = false;
    is3D = false;
    type = $GuiAudioType;
};
new AudioDescription(TreeTalk)
{
    volume = 1.0;
    isLooping = true;
    is3D = true;
    type = $SimAudioType;
    referenceDistance = 5.0;
    maxDistance = 10.0;
    coneInsideAngle = 360.0;
    coneOutsideAngle = 360.0;
    coneOutsideVolume = 1.0;
    coneVector = "0 0 1";
    environmentLevel = 0;
};
Chat HUD
This mod simply takes the text that is displayed in the HUD and speaks it.
Edit the file fps/client/scripts/chatHud.cs.
Find the function 'function ChatHud::addLine(%this,%text)'.
Add these lines to the end of the function.
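// Speak the message
// Get the chat voice once and reuse it for every later line
if ($chatVoice == 0) {
    $chatVoice = getVoice("kevin", ChatSpeech);
}
// Replace each ":" so that "name: message" is read as "name says message"
%text = strreplace(%text, ":", " says");
speak($chatVoice, %text);
Start the tree talking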
Edit the file fps/client/scripts/playGui.cs
Find the function 'function PlayGui::onWake(%this)'
Add these lines to the end of the function.
%treeVoice = getVoice("kevin", TreeTalk);
%say = "Hello, I am a tree,";
%say = %say @ " I can't really talk,";
%say = %say @ " and I sound like there is a frog in my throat,";
speak(%treeVoice, %say, 128, -217.44, 154);
Stop the voices
Leave this out and you'll be sorry. The talking tree is the most annoying garden feature you've ever been subjected to.
In the same file as above, playGui.cs, look for the function onSleep(). (It should be right after the onWake() function.)
Add this line to the start of the function.
shutdownVoices();
Try it out
Start up the Scorched Earth mission. You should be greeted almost immediately with the welcome message. Type a message into the chat HUD and hear it spoken. Walk around the edge of the pool; one of the trees should be speaking the message specified earlier.
Future improvements
There is a need for more and better quality voices. A callback for lip sync will also be useful.
Alternative voices
A quote from the Flite readme.
"So you've eagerly downloaded flite, compiled it and run it, now you are disappointed that is doesn't sound wonderful, sure its fast and small but what you really hoped for was the dulcit tones of a deep baritone voice that would make you desperately hang on every phrase it sang. But instead you get an 8Khz diphone voice that sounds like it came from the last millenium.
Well, first, you are right, it is an 8KHz diphone voice from the last millenium, and that was actually deliberate. As we developed flite we wanted a voice that was stable and that we could directly compare with that very same voice in Festival. Flite is an *engine*. We want to be able take voices built with the FestVox process and compile them for flite, the result should be exactly the same quality (though of course trading the size for quality in flite is also an option). The included voice is just an sample voice that was used in the testing process. We have better voices in Festival and are working on the coversion process to make it both more automatic and more robust and tunable, but we haven't done that yet, so in this first beta release. This old poor sounding voice is all we have, sorry, we'll provide you with free, high-quality, scalable, configurable, natural sounding voices for flite, in all languages and dialects, with the tools to built new voices efficiently and robustly as soon as we can. Though in the mean time, a few higher quality voices will be released with the next version."
(I have just discovered that Alan Black et al. have formed a company that is in the business of supplying voices, so don't expect a free one from them any time soon. They do have excellent voices and I take it these can be used with Flite. www.cepstral.com/offerings.php?in=consulting)
The FreeTTS project is a Java port of the Flite engine and uses compatible voices. It is worth keeping an eye on developments there.
Lip sync
A lip sync callback should not be too difficult. This is basically a callback method that is invoked just before each phoneme is spoken. For the purposes of TGE, only about eight distinct lip and mouth positions are needed, each corresponding to a group of phonemes. I need to become more familiar with the innards of TGE and its animation methods before I can implement this feature effectively.
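To make the idea concrete, here is a minimal C++ sketch of a callback that receives one of eight mouth positions per phoneme. Everything in it is an assumption for illustration: the MouthShape names, the grouping of phonemes, the ARPAbet-style phoneme labels, and the announcePhonemes driver are hypothetical, and none of them come from Flite or TGE.

```cpp
#include <functional>
#include <string>
#include <unordered_map>
#include <vector>

// Eight illustrative mouth positions (an assumed set, not taken from TGE).
enum class MouthShape {
    Closed, Open, Wide, Round, LipTeeth, TongueTeeth, Narrow, Rest
};

// Maps a phoneme label to a mouth position. The labels and their grouping
// are assumptions; anything not listed falls back to Rest.
MouthShape mouthShapeFor(const std::string& phoneme) {
    static const std::unordered_map<std::string, MouthShape> table = {
        {"m", MouthShape::Closed},       {"b", MouthShape::Closed},
        {"p", MouthShape::Closed},       {"aa", MouthShape::Open},
        {"ae", MouthShape::Open},        {"ah", MouthShape::Open},
        {"iy", MouthShape::Wide},        {"ih", MouthShape::Wide},
        {"eh", MouthShape::Wide},        {"uw", MouthShape::Round},
        {"ow", MouthShape::Round},       {"w", MouthShape::Round},
        {"f", MouthShape::LipTeeth},     {"v", MouthShape::LipTeeth},
        {"th", MouthShape::TongueTeeth}, {"dh", MouthShape::TongueTeeth},
        {"s", MouthShape::Narrow},       {"z", MouthShape::Narrow},
        {"t", MouthShape::Narrow},       {"d", MouthShape::Narrow},
    };
    auto it = table.find(phoneme);
    return it == table.end() ? MouthShape::Rest : it->second;
}

using LipSyncCallback = std::function<void(MouthShape)>;

// Hypothetical driver: invokes the callback once per phoneme, in order.
// A real engine would do this just before each phoneme is played.
void announcePhonemes(const std::vector<std::string>& phonemes,
                      const LipSyncCallback& onPhoneme) {
    for (const std::string& p : phonemes) {
        onPhoneme(mouthShapeFor(p));
    }
}
```

The animation side would then only need to handle eight cases, whatever the size of the phoneme set the voice uses.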
Better TGE integration
At the moment the TTS bypasses all of TGE's audio routines and goes directly to the OpenAL library. The main reason for this is that AFAIK it is not possible to load a standard TGE audio buffer by any means other than from a resource file. TTS generates the audio samples on the fly, into memory. I did not want to change the existing TGE code and it seemed a terrible hack to first write the sample to a file.
#2
09/18/2002 (2:24 pm)
Definitely not there. I suspect the file was too big and somehow it got silently ditched.
Anyway, I've uploaded the files to another site and added links to the page above.
James
#3
09/19/2002 (6:48 am)
I'll have to try this out over the weekend. Not sure why the file wasn't posted. It isn't really that large. Maybe there's a size limit, I'm not sure. Anyway, your link works fine :)
#4
09/19/2002 (7:00 pm)
Yes it is free. The Ogg-Vorbis resource also plays data from an in-memory buffer; I don't remember if it went straight to AL, or routed through the Torque stuff first. That code may be going in standard into CVS, and I would assume if this code needs similar capabilities, a case could be made for providing it at the engine level.
#5
09/19/2002 (8:30 pm)
MCM2 POWERCURVE BIKE IF YOU HAVE THE GARAGE FILE PLEASE POST IT
#6
09/21/2002 (12:17 pm)
Kevin, this is NOT a games cheat site, please stop posting asking for that cheat, you should read what a site is about before you post. Unless of course you are just a guy with no life who is posting in here just to annoy the community...
#7
09/24/2002 (4:05 am)
where can we get different voices? or are there more than kevin in there?
#8
09/24/2002 (11:52 am)
Only the 8Khz version of kevin is in the dll.
AFAIK the only other voices that are free are those in the Flite download. One is a 16Khz variation of kevin, and the other is a domain restricted voice which is probably not very good for general purpose use. I could not compile the bigger version of Kevin with VC6 so that is why the 8Khz version is in there.
Cepstral (link above) will be able to sell you high quality voices if you are at that stage of your development.
If you want to get down and dirty, the FestVox tools from CMU can be used to make your own voices or to port voices from Festival. I believe that the kevin voice is Kevin Lenzo's, one of the developers of flite and that he used the FestVox tools to make it.
The freeTTS site has a forum dedicated to making new voices, you could try there.
#9
09/26/2002 (2:23 am)
Thanks I'll take a look... btw why couldn't it compile in VC6?
#10
09/26/2002 (11:01 am)
No reason really :) I had problems originally because one source file is 20MB which caused VC6 to choke. But I've compiled it now using the /Zm (?) switch.
I've uploaded the 16KHz voice, if you're interested. I can't really hear much difference, maybe the distortion is a bit clearer now! The link to the file is above.
#11
08/07/2003 (8:42 am)
Anyone have better voice files that they want to share? Kevin is OK for demo, but compared to other commercial voices it sucks... :-)
#12
08/13/2003 (7:28 am)
The FestVox site has some alternative voices (several US and British voices). Whether or not they work directly with flite is another story, haven't tested it.
#13
12/30/2005 (1:40 am)
Any changes on this technology? Maybe in Spanish?
#14
03/24/2006 (12:30 pm)
This resource has broken download links. Anyone have a link they can share or a copy of the zips?
#15
05/25/2006 (1:33 pm)
I put them up on my server here: www.codejar.com/torque/torque_textToSpeech_win16.zip
Torque Owner Sabrecyd