Using External Analysis to Process a Custom AudioSource Filter Chain

NOTE: Please read and understand Delegate Processing if you have not already.

There are many ways to work with audio data in Unity and SALSA. By default, SALSA uses the AudioClip buffer of an AudioSource as its standardized method of working with audio data. This is not always possible and sometimes requires additional steps to implement. How a developer deals with audio data is outside the scope of SALSA support, but SALSA's delegate processing opens the door to handling nearly any audio data situation.

A common situation can result when audio data is inserted into the AudioSource pipeline, bypassing the AudioClip buffer. When this occurs, there will be an active AudioSource that does not have an AudioClip. As such, SALSA cannot "see" the data and therefore cannot process it. In this instance, it may be possible to expose the data by using Unity's OnAudioFilterRead() callback functionality. Check the Unity documentation for more information on OnAudioFilterRead().

While the audio data could be processed (analyzed) each time OnAudioFilterRead() is called, SALSA has its own timing, governed by Salsa.audioUpdateDelay. The OnAudioFilterRead() callback will likely be called much more frequently than SALSA requires. For this reason, it is recommended to simply store the data (a lightweight action) and only process it when necessary. The general process is to collect the data from the filter (consumed by the audio chain) and then allow SALSA to process it as needed (according to its audio delay pulse cycle) -- leveraging external analysis. This approach also allows the data analysis viewport size to be controlled -- the size of the OnAudioFilterRead() data chunk varies with platform and audio parameters.

Scenarios Where a Filter Chain Solution May Be Beneficial

There are several situations where audio data may not be available for SALSA to analyze from the AudioSource.AudioClip buffer and an audio filter chain insert may provide a solution. Here are a few examples; the list is not all-inclusive:

  • When using AudioSource.PlayOneShot() clips.
  • When using Unity's built-in Timeline Audio track/clips.
  • Custom-created audio data (via filter read callbacks).
  • AudioSources playing streamed audio data. This is very common with TTS systems that do not buffer the audio data into an AudioClip prior to playing the result.

NOTE: As noted elsewhere in this document, custom solutions may not use Unity's AudioSource implementation, and the example filter insert script will not work (as-is). It may be possible to modify the script to access the custom implementation if the data is available via similar callbacks or is directly accessible via a data buffer. This would require the developer to implement their own solution; supporting such customizations is out-of-scope for Crazy Minnow Studio.

Caveats

It is important to know that accessing data via the filter chain has the advantage of being able to see data in Unity's Audio implementation where it otherwise would not be available, but it may come with some gotchas.

  • Data read in this manner cannot be read in advance; it can only be processed when it is available. This may result in more perceived lag in animations than if using pre-recorded data. When SALSA has access to pre-recorded data, it can look ahead and pre-process data before it is used, thereby starting the animations before the data has been played, resulting in more natural-looking animations.
  • If using spatial (2D/3D) processing, audio amplitude is diminished as the source moves away from the listener. Diminished amplitude dynamics will greatly affect SALSA processing, resulting in diminished animation movements. This is primarily a concern with 3D processing, but could also be problematic with simple stereo processing where the audio may be coming from one side or the other. By default, the example script processes only the left (first) channel in a multi-channel interleave, assuming the audio is monaural (mono). If this is not the case, changes to the script will be necessary to accommodate both channels.
  • As with spatial audio, any other applied effects can also impact animation dynamics.
  • It may be necessary to dynamically add the script to the GameObject with the AudioSource at runtime in order to properly insert the filter into the chain. For example, when playing OneShot audio, the filter should be added after the PlayOneShot() method is called.
  • The example script is designed to work only with AudioSource callbacks, but could be easily modified to work with other custom audio data implementations where audio data is available via some custom API or buffer.
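As a sketch of the dynamic-add caveat above, a one-shot clip might be played and the filter component attached immediately afterward. This is an illustrative, hypothetical helper (the class OneShotFilterAttacher and its fields are not part of SALSA); it assumes the Salsa component lives on the same GameObject as the AudioSource, which is how the example filter script's Awake() locates it.

```csharp
using UnityEngine;

namespace DemoCode
{
    // Hypothetical helper illustrating the dynamic-add caveat: the filter
    // component is attached only AFTER PlayOneShot() is called, so it is
    // inserted into the now-active AudioSource chain.
    public class OneShotFilterAttacher : MonoBehaviour
    {
        public AudioSource source; // AudioSource playing the one-shot
        public AudioClip oneShot;  // the clip to play

        public void PlayWithFilter()
        {
            source.PlayOneShot(oneShot);

            // Add the example filter script (UpstreamAudioFilterProcessing,
            // shown in the Script Example section) if it is not already present.
            // Its Awake() finds the Salsa component via GetComponent<Salsa>(),
            // so Salsa is assumed to be on the same GameObject as the source.
            if (source.GetComponent<UpstreamAudioFilterProcessing>() == null)
                source.gameObject.AddComponent<UpstreamAudioFilterProcessing>();
        }
    }
}
```

Note that AddComponent() triggers the new component's Awake() immediately, which is why this sketch relies on the Salsa lookup inside the filter script rather than assigning the reference afterward.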

Unity AudioSource Filter Chain Example

The script example below leverages Unity's OnAudioFilterRead() callback functionality to collect data fed to the AudioSource chain where an AudioClip buffer is not used. Refer to Unity's OnAudioFilterRead() documentation for information and details.

NOTE: The example script component should be placed on the GameObject with the AudioSource and essentially becomes a filter for the audio data.

Example Script Operation Explanation

FYI: SALSA must be configured to use External Analysis!

In the example below, a simple circular buffer called analysisBuffer is used to collect data, since the size of the data chunk presented via Unity's OnAudioFilterRead() varies with platform and audio data characteristics. The idea is to simply have the latest data available in the analysisBuffer, and the buffer should always be full (depending on the analysis used). Since we are using a simple 'amplitude peak analysis', the buffer does not need to be full, but it does need to be of adequate size and filled with enough data to represent the amplitude characteristics of the current data. The size of the buffer should be tweaked to handle the audio parameters of your project; higher-frequency data generally requires a larger buffer. Also, consider using a larger buffer for more channels, or cull additional channel data and process a single channel. The example below only utilizes the first channel's data (typically the left channel in a stereo file).

Each callback to OnAudioFilterRead() adds data to the analysisBuffer. Only data from the first channel is added to the analysisBuffer.

SALSA then 'polls' GetAnalysisValueLeveragingSalsaAnalyzer() as a delegate to Salsa.getExternalAnalysis, which is set in Awake(). In this example, SALSA's analysis engine delegate is used to analyze the data stored in the analysisBuffer. Unless the analysis engine has been delegated to custom code as well, the default internal analysis engine will be used.

NOTE: This process can also be used with Unity's Timeline AudioTrack and with AudioSource.PlayOneShot() calls. OneShot requires the filter callback script be added (dynamically) after the OneShot is called.

Script Example

FYI: SALSA must be configured to use External Analysis!

NOTE: This is only an example and may or may not meet your requirements. As such, it is your responsibility to modify this example for your own project's needs.

using UnityEngine;
using CrazyMinnow.SALSA;

namespace DemoCode
{
    public class UpstreamAudioFilterProcessing : MonoBehaviour
    {
        public Salsa salsaInstance;
        private float[] analysisBuffer = new float[1024];
        private int bufferPointer = 0;
        private int interleave = 1;

        private void Awake()
        {
            if (!salsaInstance)
                salsaInstance = GetComponent<Salsa>();
            if (salsaInstance)
                salsaInstance.getExternalAnalysis = GetAnalysisValueLeveragingSalsaAnalyzer;
            else
                Debug.LogWarning("SALSA not found..."); // warn so the missing reference is visible in the Console
        }

        private void OnAudioFilterRead(float[] data, int channels)
        {
            // Simply fill our buffer and keep it updated for ad-hoc analysis processing.

            // Only fill 'analysisBuffer' with channel 1 data. If you want
            // to store and keep track of additional channels, uncomment and
            // set 'interleave' to the number of channels of data passed into
            // this callback. Additionally adjust the for-loop as necessary.

            //interleave = channels;
            for (int i = 0; i < data.Length; i += channels)
            {
                analysisBuffer[bufferPointer] = data[i];
                bufferPointer++;
                bufferPointer %= analysisBuffer.Length; // wrap the pointer if necessary
            }
        }

        // Utilize the built-in SALSA analyzer on your custom data.
        float GetAnalysisValueLeveragingSalsaAnalyzer()
        {
            // If you need more control over the analysis, process the buffer
            // here and then return the analysis. Since only the first channel of 
            // audio data is stored in the 'analysisBuffer' (in this example), the 
            // 'interleave' value is initialized as '1' -- we've already 
            // separated the data in the callback, so we want to analyze all of it.
            return salsaInstance.audioAnalyzer(interleave, analysisBuffer);
        }
    }
}
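
If the audio is not effectively mono (see the spatial-audio caveat above), one possible modification, shown here as a sketch rather than a tested implementation, is to down-mix all channels into the analysisBuffer instead of storing only the first channel's samples:

```csharp
// Possible replacement for the OnAudioFilterRead() body in the example above:
// average all channels of each interleaved frame into a single mono sample.
private void OnAudioFilterRead(float[] data, int channels)
{
    for (int i = 0; i < data.Length; i += channels)
    {
        float sum = 0f;
        for (int c = 0; c < channels; c++)
            sum += data[i + c];

        analysisBuffer[bufferPointer] = sum / channels; // mono down-mix
        bufferPointer = (bufferPointer + 1) % analysisBuffer.Length;
    }
    // 'interleave' remains 1: the buffer already holds mono data.
}
```

Since the down-mixed buffer is mono, the 'interleave' value passed to the analyzer stays at 1, exactly as in the original example.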