SALSA LipSync with RT-Voice (Realtime TTS)

Overview

Our RT-Voice add-on has been updated to work with SALSA LipSync v2 for realtime text-to-speech!

Support

We (Crazy Minnow Studio) are happy to provide SALSA LipSync Suite-related support. However, we do not provide support for third-party assets. If you have problems getting the third-party asset working, please contact the asset's publisher for assistance.

NOTE: Please remember, the source is included for this free add-on and this should be considered example code that you can use to jump-start your project. It is not intended to fit all scenarios or requirements -- you are free to update it as necessary for your needs. We do provide limited support for all of our products; however, we do not make development changes to support specific project needs.

For SALSA LipSync Suite-related support, please email the following information to assetsupport@crazyminnow.com:

Invoice number (support will not be provided without a Unity Invoice Number).
Operating System and version.
SALSA Suite version.
Add-on version (generally located in an associated readme or the script header comment).
Full details of your issue (steps to recreate the problem), including any error messages.
Full, expanded component screenshots (or video).
Full, expanded, associated hierarchy screenshots (or video).
Super helpful: video capture of issue in action if appropriate.

Requirements for This Example Implementation Tutorial

SALSA LipSync v2
RT-Voice (realtime text-to-speech)
SalsaRtVoice
Required only for the iOS implementation: SalsaTextSync (download)

ATTENTION: These instructions require you to download and install the appropriate add-on scripts into your Unity project. If you skip this step, you will not find the applicable option in the menu and/or component library.

NOTE: While every attempt has been made to ensure the safe content and operation of these files, they are provided as-is, without warranty or guarantee of any kind. By downloading and using these files you are accepting any and all risks associated and release Crazy Minnow Studio, LLC of any and all liability.

Installation Instructions

NOTE: For information on how to import/install Unity AssetStore packages or unitypackage files, please read the Unity documentation.

Import SALSA LipSync into your project and please familiarize yourself with SALSA using the online documentation for SALSA LipSync.
Import/install RT-Voice and familiarize yourself with its use and requirements according to the third-party asset's instructions and operational guides.
Import this SALSA LipSync v2 integration add-on and familiarize yourself with these instructions.

Usage Instructions

NOTE: These instructions and scripts are intended to be an example of how to set up SALSA to work with RT-Voice. You will likely replace the SALSA example scripts with your own implementation. Please use these scripts only to get started and to learn how a SALSA implementation with RT-Voice can work.

1. Set up a SALSA-enabled Character

This step assumes you have already read the SALSA documentation and familiarized yourself with SALSA operations.

2. Set up the RT-Voice components using one of the methods below

This step assumes you have already familiarized yourself with RT-Voice operations using Crosstales' documentation.

Method 1: Standard TTS Audio Return (does not work on iOS)

Using Speaker.Speak

This implementation returns audio data for SALSA to analyze and is the most responsive for lipsync purposes.

Add the RT-Voice [Speaker] component.
Add the Salsa_RTVoice component:
Menu: [Component] > [Crazy Minnow Studio] > [SALSA] > [Addons] > [RT-Voice] > [Salsa_RTVoice]

Method 2: Non-Standard Phoneme Return (does not work on iOS)

Using Speaker.SpeakNative with the Speaker.Instance.OnSpeakCurrentViseme event.

This implementation does not return audio data to SALSA; instead, it returns a limited phoneme interpretation of the spoken text. There are options to funnel the phonemes into buckets that correspond to a typical 3- or 7-viseme setup. We recommend altering this bucketing in the script to match the viseme configuration of your own character.

Add the RT-Voice [Speaker] component.
Add the RT-Voice [Live Speaker] component.
Add the Salsa_RTVoice_Native component:
Menu: [Component] > [Crazy Minnow Studio] > [SALSA] > [Addons] > [RT-Voice] > [Salsa_RTVoice_Native]

Method 3: Non-Standard Word Return -- Required & Recommended for iOS Deployment ONLY

Using Speaker.SpeakNative with the Speaker.Instance.OnSpeakCurrentWord event for use on iOS.

iOS deployment of RT-Voice requires a 'spoken word' call-back to deliver data to SALSA. The actual audio data is not available to SALSA. This example setup uses SalsaTextSync to provide a loose lipsync interpretation of the words as they are 'spoken'.
NOTE: Per information provided by Crosstales (RT-Voice), as of this writing, this implementation does not work on macOS, so you will not be able to test in the Editor on a Mac prior to deployment. This may change in future versions of RT-Voice. Oddly enough, this methodology does work on a Windows platform.

Add the RT-Voice [Speaker] component.
Add the RT-Voice [Live Speaker] component.
Add the Salsa_RTVoice_Native_iOS component:
Menu: [Component] > [Crazy Minnow Studio] > [SALSA] > [Addons] > [RT-Voice] > [Salsa_RTVoice_Native_iOS]

NOTE: The Salsa_RTVoice_Native_iOS component relies on the Salsa_RTVoice_Native_iOS.uid field to filter for the current/required speaker. Simply adding this script to your GameObject and then calling SpeakNative from another script will not work, because the Salsa_RTVoice_Native_iOS.uid field is never set -- uid will be null. If you are relying on this script to work with SalsaTextSync, you must capture the Uid value returned by the SpeakNative call and store it in the Salsa_RTVoice_Native_iOS.uid field. Since this field is private, doing so requires modifying the script, either by making the field public or by adding a property setter that accepts the uid value.

While it is possible to use these example scripts in your own project, in all likelihood you will eventually replace them with your own implementation. Study the example script to ensure your implementation works similarly with respect to the RT-Voice calls and call-backs. To recap: in this script, the uid field is assigned in LateUpdate() when the speak bool has been set to true. The call is made to SpeakNative, which returns a UID of the speaker. This UID is used to bypass the gatekeeper and filter speakers, so that lipsync is applied only to specific models by leveraging SalsaTextSync. The supporting methodology in this script is not intended to be the best or most efficient way to accomplish this. The bool and the LateUpdate() callback are used only to provide a simple implementation and an understanding of the underlying functionality.
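The uid filtering described above can be sketched in plain C#, without Unity or RT-Voice. This is only an illustration under stated assumptions: FakeSpeaker, LipSyncListener, WordSpoken, Pump, and Accepted are hypothetical stand-ins invented for this sketch and are not part of RT-Voice or the add-on. It shows why the uid must be captured from the speak call before any word callbacks arrive, and how a property setter lets another script supply it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the RT-Voice speaker. This is NOT the real RT-Voice API.
public sealed class FakeSpeaker
{
    private readonly Queue<(string Uid, string Text)> pending =
        new Queue<(string Uid, string Text)>();

    // Raised once per word: (uid of the speak call, the word).
    public event Action<string, string> WordSpoken;

    // Returns a uid string for the call, as the text above describes for SpeakNative.
    public string SpeakNative(string text)
    {
        string uid = Guid.NewGuid().ToString();
        pending.Enqueue((uid, text));
        return uid;
    }

    // Delivers the queued words later, modelling callbacks that arrive after the call returns.
    public void Pump()
    {
        while (pending.Count > 0)
        {
            var (uid, text) = pending.Dequeue();
            foreach (string word in text.Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries))
                WordSpoken?.Invoke(uid, word);
        }
    }
}

// Hypothetical listener showing the uid gatekeeper pattern.
public sealed class LipSyncListener
{
    private string uid; // stays null until a speak call's uid is captured

    // Property accessor: one way to let another script supply the uid.
    public string Uid { get { return uid; } set { uid = value; } }

    public List<string> Accepted { get; } = new List<string>();

    public LipSyncListener(FakeSpeaker speaker) { speaker.WordSpoken += OnWord; }

    private void OnWord(string speechUid, string word)
    {
        // Gatekeeper: ignore words from speak calls this listener does not own.
        if (uid == null || speechUid != uid) return;
        Accepted.Add(word); // a real script would pass the word to SalsaTextSync here
    }
}

public static class Demo
{
    public static void Main()
    {
        var speaker = new FakeSpeaker();
        var alice = new LipSyncListener(speaker);
        var bob = new LipSyncListener(speaker);

        alice.Uid = speaker.SpeakNative("hello there");
        speaker.SpeakNative("heard by nobody"); // uid discarded, so no listener accepts it
        bob.Uid = speaker.SpeakNative("good morning");
        speaker.Pump();

        Console.WriteLine("alice: " + string.Join(" ", alice.Accepted)); // alice: hello there
        Console.WriteLine("bob: " + string.Join(" ", bob.Accepted));     // bob: good morning
    }
}
```

The second speak call shows the failure mode from the NOTE above: when the returned uid is not stored anywhere, every word from that call is filtered out.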

Finally:

Add the SalsaTextSync component (requires the separate SalsaTextSync text-to-lipsync download):
Menu: [Component] > [Crazy Minnow Studio] > [SALSA] > [Addons] > [TextSync] > [CM_TextSync]
Set the TextSync [Words Per Minute] to 300.

3. Play the Scene

Click the [Speak] checkbox in the custom Inspector of the Salsa_RTVoice, Salsa_RTVoice_Native, or Salsa_RTVoice_Native_iOS component.

NOTE: If you wish to use these simple example scripts in your project, you can change the text and trigger the speak function programmatically: get a reference to the component, assign a new value to speakText, and then set the speak bool to true.

For example, using Salsa_RTVoice_Native_iOS:

Salsa_RTVoice_Native_iOS salsaRtv; // choose your own way of assigning the integration script...
salsaRtv.speakText = "some string";
salsaRtv.speak = true;
// the LateUpdate() callback in the salsaRtv component will then process the new text.
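As a slightly fuller sketch of the snippet above, the same two assignments can be wrapped in a small helper component. SpeakTrigger and Say are hypothetical names chosen for illustration. The sketch assumes the add-on is imported, that speakText and speak are publicly accessible as in the snippet above, and that Salsa_RTVoice_Native_iOS sits on the same GameObject unless assigned in the Inspector. If the add-on declares a namespace for its scripts, add the matching using directive.

```csharp
using UnityEngine;

// Hypothetical helper; not part of the add-on.
public class SpeakTrigger : MonoBehaviour
{
    // Assign in the Inspector, or leave empty to search this GameObject.
    [SerializeField] private Salsa_RTVoice_Native_iOS salsaRtv;

    private void Awake()
    {
        if (salsaRtv == null)
            salsaRtv = GetComponent<Salsa_RTVoice_Native_iOS>();
    }

    // Call from your own code, for example a UI button handler or a dialogue system.
    public void Say(string text)
    {
        if (salsaRtv == null)
        {
            Debug.LogWarning("SpeakTrigger: no Salsa_RTVoice_Native_iOS component found.");
            return;
        }

        salsaRtv.speakText = text; // field name taken from the snippet above
        salsaRtv.speak = true;     // processed by the component's LateUpdate()
    }
}
```

The same pattern should apply to Salsa_RTVoice or Salsa_RTVoice_Native by swapping the component type, provided those scripts expose the same speakText and speak fields.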

Keep in mind, these scripts are examples and are not intended to work for every scenario. Nor are they expected to be the best or most efficient way to implement the functionality. They are purely for the simplest demonstration purposes.

Troubleshooting and Operational Notes:

When using usage Method 3, per information provided by Crosstales (RT-Voice), as of this writing, this implementation does not work on macOS, so you will not be able to test in the Editor on a Mac prior to deployment. This may change in future versions of RT-Voice. Oddly enough, this methodology does work on a Windows platform.

Release Notes:

v2.0.0 - (2019-06-21):

+ Initial release for SALSA LipSync v2