TextToSpeechModule
TypeScript API implementation of the useTextToSpeech hook.
API Reference
- For detailed API Reference for TextToSpeechModule see: TextToSpeechModule API Reference.
- For all text to speech models available out-of-the-box in React Native ExecuTorch see: TTS Models.
- For all supported voices in TextToSpeechModule please refer to: Supported Voices.
High Level Overview
import { models, TextToSpeechModule } from 'react-native-executorch';

const model = await TextToSpeechModule.fromModelName(
  models.text_to_speech.kokoro.en_us.heart(),
  (progress) => console.log(progress)
);

await model.forward(text, 1.0);
Methods
All methods of TextToSpeechModule are explained in detail here: TextToSpeechModule API Reference
Loading the model
Use the static fromModelName factory method with the following parameters:
- config - Object containing:
  - model - Model configuration.
  - voiceSource - Voice resource source.
  - phonemizerConfig - Phonemizer configuration.
- onDownloadProgress - Optional callback to track download progress (value between 0 and 1).
This method returns a promise that resolves to a TextToSpeechModule instance once the assets are downloaded and loaded into memory.
For more information on resource sources, see loading models.
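The onDownloadProgress callback receives a value between 0 and 1, so it usually needs converting before it is shown to a user. A minimal sketch, assuming a hypothetical helper named formatDownloadProgress (it is not part of react-native-executorch) that clamps the value and renders a whole-number percentage:

```typescript
// Hypothetical helper, not part of react-native-executorch.
// Converts a download progress value in [0, 1] to a label such as "42%".
function formatDownloadProgress(progress: number): string {
  // Clamp defensively in case a value falls slightly outside [0, 1].
  const clamped = Math.min(1, Math.max(0, progress));
  return `${Math.round(clamped * 100)}%`;
}
```

It could be used inside the progress callback in place of logging the raw value, for example (progress) => console.log(formatDownloadProgress(progress)).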
Running the model
The module provides a way to generate speech using either raw text or pre-generated phonemes.
Methods
- forward(text, speed, phonemize): Generates the complete audio waveform at once. Returns a promise resolving to a Float32Array. phonemize defaults to true. When set to false, the input is expected to be a string of IPA phonemes.
- stream({ speed, phonemize, stopAutomatically, ... }): An async generator that yields chunks of audio as they are computed. This is ideal for reducing the "time to first audio" for long sentences. In contrast to forward, it enables inserting text chunks dynamically into the processing buffer with streamInsert(text) and allows stopping generation early with streamStop(instant).
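Chunks yielded by stream can be played as they arrive, but it is sometimes useful to keep the full waveform as well, for instance to replay it later. The sketch below assumes each chunk is a Float32Array (the same type forward resolves to) and uses a hypothetical helper, concatChunks, which is not part of the library:

```typescript
// Hypothetical helper, not part of react-native-executorch.
// Joins audio chunks, in order, into one contiguous Float32Array.
function concatChunks(chunks: readonly Float32Array[]): Float32Array {
  // First pass: compute the total number of samples.
  let total = 0;
  for (const chunk of chunks) {
    total += chunk.length;
  }
  // Second pass: copy each chunk at its running offset.
  const out = new Float32Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

Each chunk received in the for await loop can be pushed into an array, and concatChunks called once the loop finishes.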
Using Phonemes
If you have pre-computed phonemes (e.g., from an external dictionary or a custom G2P model), you can skip the internal phoneme generation step by setting phonemize: false in the forward or stream methods.
Because forward processes the entire input at once, it can take a significant amount of time to produce audio for long inputs. For those, stream is usually the better choice, since it yields audio chunks as they are computed.
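One way to bound the latency of long inputs is to synthesize them piece by piece. A minimal sketch, assuming a hypothetical splitIntoSentences helper (not part of the library) and a deliberately naive punctuation rule; each piece could then be passed to forward in turn or inserted with streamInsert:

```typescript
// Hypothetical helper, not part of react-native-executorch.
// Splits text into sentence-like pieces ending at '.', '!' or '?'.
// The rule is naive: abbreviations such as "e.g." are split as well.
function splitIntoSentences(text: string): string[] {
  // Each match is a run of non-terminators followed by any terminators.
  const pieces = text.match(/[^.!?]+[.!?]*/g) ?? [];
  return pieces
    .map((piece) => piece.trim())
    .filter((piece) => piece.length > 0);
}
```

A production splitter would need language-aware rules; this version only illustrates the idea of chunking the input before synthesis.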
Example
Speech Synthesis
import { models, TextToSpeechModule } from 'react-native-executorch';
import { AudioContext } from 'react-native-audio-api';

const tts = await TextToSpeechModule.fromModelName(
  models.text_to_speech.kokoro.en_us.heart()
);
const audioContext = new AudioContext({ sampleRate: 24000 });

try {
  const waveform = await tts.forward('Hello from ExecuTorch!', 1.0);

  // Create audio buffer and play
  const audioBuffer = audioContext.createBuffer(1, waveform.length, 24000);
  audioBuffer.getChannelData(0).set(waveform);

  const source = audioContext.createBufferSource();
  source.buffer = audioBuffer;
  source.connect(audioContext.destination);
  source.start();
} catch (error) {
  console.error('Text-to-speech failed:', error);
}
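Since forward returns the whole waveform, its playback length can be computed before any audio is scheduled. A small sketch, assuming the 24000 Hz sample rate used in the example above matches the model output; waveformDurationSeconds is a hypothetical helper, not a library function:

```typescript
// Sample rate taken from the example above; assumed to match the model output.
const SAMPLE_RATE_HZ = 24000;

// Hypothetical helper, not part of react-native-executorch.
// Returns the playback length of a mono waveform in seconds.
function waveformDurationSeconds(
  waveform: Float32Array,
  sampleRate: number = SAMPLE_RATE_HZ
): number {
  return waveform.length / sampleRate;
}
```

This can be handy for showing a progress bar or for deciding when to enqueue the next utterance.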
Streaming Synthesis
import { models, TextToSpeechModule } from 'react-native-executorch';
import { AudioContext } from 'react-native-audio-api';

const tts = await TextToSpeechModule.fromModelName(
  models.text_to_speech.kokoro.en_us.heart()
);
const audioContext = new AudioContext({ sampleRate: 24000 });

try {
  for await (const chunk of tts.stream({
    text: 'This is a streaming test, with a sample input.',
    speed: 1.0,
  })) {
    // Play each chunk sequentially
    await new Promise<void>((resolve) => {
      const audioBuffer = audioContext.createBuffer(1, chunk.length, 24000);
      audioBuffer.getChannelData(0).set(chunk);

      const source = audioContext.createBufferSource();
      source.buffer = audioBuffer;
      source.connect(audioContext.destination);
      source.onEnded = () => resolve();
      source.start();
    });
  }
} catch (error) {
  console.error('Streaming failed:', error);
}
Synthesis from Phonemes
If you already have a phoneme string (e.g., from an external library), you can use forward or stream with the phonemize: false flag to synthesize audio directly, skipping the internal phonemizer stage.
import { models, TextToSpeechModule } from 'react-native-executorch';

const tts = await TextToSpeechModule.fromModelName(
  models.text_to_speech.kokoro.en_us.heart()
);

// Example phonemes for "Hello world!"
const waveform = await tts.forward('həlˈO wˈɜɹld!', 1.0, false);

// Or stream from phonemes
for await (const chunk of tts.stream({
  text: 'ɐ mˈæn hˌu dˈʌzᵊnt tɹˈʌst hɪmsˈɛlf, kæn nˈɛvəɹ ɹˈiᵊli tɹˈʌst ˈɛniwˌʌn ˈɛls.',
  speed: 1.0,
  phonemize: false,
})) {
  // ... process chunk ...
}