Class: SpeechToTextModule
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:17
Module for Speech to Text (STT) functionalities.
Methods
decode()
decode(tokens, encoderOutput): Promise<Float32Array<ArrayBufferLike>>
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:140
Runs the decoder of the model.
Parameters
tokens
Int32Array
The input tokens.
encoderOutput
Float32Array
The encoder output.
Returns
Promise<Float32Array<ArrayBufferLike>>
Decoded output.
delete()
delete(): void
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:119
Unloads the model from memory.
Returns
void
encode()
encode(waveform): Promise<Float32Array<ArrayBufferLike>>
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:129
Runs the encoding part of the model on the provided waveform. Returns the encoded waveform as a Float32Array.
Parameters
waveform
Float32Array
The input audio waveform.
Returns
Promise<Float32Array<ArrayBufferLike>>
The encoded output.
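Since decode() takes the encoder output as its second argument, one natural pattern is to call encode() once and reuse the result for several decode() calls. The sketch below assumes only the two signatures documented here; `EncoderDecoder` and `decodeSteps` are hypothetical names introduced for illustration, and how the token arrays are built (start tokens, language tokens, and so on) is left to the caller because this page does not specify it.

```typescript
// Structural stand-in for the encode()/decode() signatures documented above.
// A SpeechToTextModule instance is assumed to satisfy this shape.
interface EncoderDecoder {
  encode(waveform: Float32Array): Promise<Float32Array>;
  decode(tokens: Int32Array, encoderOutput: Float32Array): Promise<Float32Array>;
}

// Hypothetical helper: encode the audio once, then run the decoder
// once per token sequence, reusing the same encoder output each time.
async function decodeSteps(
  model: EncoderDecoder,
  waveform: Float32Array,
  tokenSequences: Int32Array[],
): Promise<Float32Array[]> {
  const encoderOutput = await model.encode(waveform);
  const outputs: Float32Array[] = [];
  for (const tokens of tokenSequences) {
    outputs.push(await model.decode(tokens, encoderOutput));
  }
  return outputs;
}
```

The decoder calls run sequentially rather than in parallel, on the assumption that a single loaded model should not be invoked concurrently.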
stream()
stream(options?): AsyncGenerator<{ committed: TranscriptionResult; nonCommitted: TranscriptionResult; }>
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:180
Starts a streaming transcription session.
Yields objects with committed and nonCommitted transcriptions.
Committed transcription contains the part of the transcription that is finalized and will not change.
Useful for displaying stable results during streaming.
Non-committed transcription contains the part of the transcription that is still being processed and may change.
Useful for displaying live, partial results during streaming.
Use with streamInsert and streamStop to control the stream.
Parameters
options?
DecodingOptions = {}
Decoding options including language.
Returns
AsyncGenerator<{ committed: TranscriptionResult; nonCommitted: TranscriptionResult; }>
An async generator yielding transcription updates.
Yields
An object containing committed and nonCommitted transcription results.
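Because stream() returns an async generator, updates can be consumed with `for await`. The following is a minimal sketch, not the library's own code: `StreamUpdate` and `watchStream` are hypothetical names, and the type parameter `T` stands in for TranscriptionResult so that no assumption is made about its fields.

```typescript
// Shape of each yielded update, taken from the stream() signature above.
type StreamUpdate<T> = { committed: T; nonCommitted: T };

// Hypothetical helper: pass every update to a render callback and
// resolve with the number of updates once the generator finishes.
async function watchStream<T>(
  updates: AsyncIterable<StreamUpdate<T>>,
  render: (update: StreamUpdate<T>) => void,
): Promise<number> {
  let count = 0;
  for await (const update of updates) {
    render(update);
    count += 1;
  }
  return count;
}
```

In an app this might be called as `watchStream(stt.stream(), render)` while audio is fed in through streamInsert(). The loop is expected to end after streamStop() is called, although that behavior is assumed here rather than stated on this page.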
streamInsert()
streamInsert(waveform): void
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:252
Inserts a new audio chunk into the streaming transcription session.
Parameters
waveform
Float32Array
The audio chunk to insert.
Returns
void
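When testing streaming with a pre-recorded buffer, the audio has to be cut into chunks before each one is passed to streamInsert(). A minimal sketch of such a splitter follows; `splitIntoChunks` is a hypothetical helper, and the 16 kHz default is an assumption carried over from the transcribe() description.

```typescript
// Hypothetical helper: split a waveform into fixed-duration chunks.
// The 16000 Hz default is an assumption based on the transcribe() docs.
function splitIntoChunks(
  waveform: Float32Array,
  chunkSeconds: number,
  sampleRate = 16000,
): Float32Array[] {
  const chunkSize = Math.floor(chunkSeconds * sampleRate);
  if (!(chunkSize >= 1)) {
    throw new RangeError('chunkSeconds * sampleRate must be at least 1');
  }
  const chunks: Float32Array[] = [];
  for (let start = 0; start < waveform.length; start += chunkSize) {
    // subarray() returns a view on the same buffer, so no samples are copied;
    // the end index is clamped, so the last chunk may be shorter.
    chunks.push(waveform.subarray(start, start + chunkSize));
  }
  return chunks;
}
```

Each returned chunk could then be handed to `stt.streamInsert(chunk)` in order.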
streamStop()
streamStop(): void
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:259
Stops the current streaming transcription session.
Returns
void
transcribe()
transcribe(waveform, options?): Promise<TranscriptionResult>
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:156
Transcribes the given input waveform, which must be sampled at 16 kHz.
For multilingual models, specify the language in options.
Resolves to the transcription as a TranscriptionResult. Passing the waveform as number[] is deprecated; pass a Float32Array.
Parameters
waveform
Float32Array
The Float32Array audio data.
options?
DecodingOptions = {}
Decoding options including language.
Returns
Promise<TranscriptionResult>
The transcription result.
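Audio sources often deliver 16-bit PCM rather than floats, so a conversion step may be needed before calling transcribe(). The sketch below shows one common conversion; `pcm16ToFloat32` is a hypothetical helper, and scaling samples into the range [-1, 1) is an assumption about what the model expects, not something this page states. Resampling to 16 kHz is not covered.

```typescript
// Hypothetical helper: convert signed 16-bit PCM samples to a Float32Array.
// Dividing by 32768 maps the Int16 range [-32768, 32767] into [-1, 1).
// That this scaling matches the model's expected input is an assumption.
function pcm16ToFloat32(pcm: Int16Array): Float32Array {
  return Float32Array.from(pcm, (sample) => sample / 32768);
}
```

The result could then be passed as the waveform argument, for example `stt.transcribe(pcm16ToFloat32(samples))`.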
fromCustomModel()
static fromCustomModel(modelSource, tokenizerSource, isMultilingual, onDownloadProgress?): Promise<SpeechToTextModule>
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:69
Creates a Speech to Text instance with user-provided model binaries.
Use this when working with a custom-exported STT model.
Internally uses 'custom' as the model name for telemetry.
Parameters
modelSource
A fetchable resource pointing to the model binary.
tokenizerSource
A fetchable resource pointing to the tokenizer file.
isMultilingual
boolean
Whether the model supports multiple languages.
onDownloadProgress?
(progress) => void
Optional callback to monitor download progress, receiving a value between 0 and 1.
Returns
Promise<SpeechToTextModule>
A Promise resolving to a SpeechToTextModule instance.
Remarks
The native model contract for this method is not formally defined and may change between releases. Currently only the Whisper architecture is supported by the native runner. Refer to the native source code for the current expected interface.
fromModelName()
static fromModelName(namedSources, onDownloadProgress?): Promise<SpeechToTextModule>
Defined in: modules/natural_language_processing/SpeechToTextModule.ts:40
Creates a Speech to Text instance for a built-in model.
Parameters
namedSources
Configuration object containing model name, sources, and multilingual flag.
onDownloadProgress?
(progress) => void
Optional callback to monitor download progress, receiving a value between 0 and 1.
Returns
Promise<SpeechToTextModule>
A Promise resolving to a SpeechToTextModule instance.
Example
import { SpeechToTextModule, WHISPER_TINY_EN } from 'react-native-executorch';
const stt = await SpeechToTextModule.fromModelName(WHISPER_TINY_EN);
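Both factory methods accept an optional onDownloadProgress callback that receives a value between 0 and 1. A minimal sketch of such a callback is given here; `makeProgressReporter` is a hypothetical helper, and the message format is illustrative only.

```typescript
// Hypothetical helper: build an onDownloadProgress callback that turns the
// 0..1 progress value into whole-percent messages, skipping repeated values.
function makeProgressReporter(
  log: (line: string) => void,
): (progress: number) => void {
  let lastPercent = -1;
  return (progress) => {
    // Clamp defensively in case a value falls slightly outside 0..1.
    const clamped = Math.min(1, Math.max(0, progress));
    const percent = Math.round(clamped * 100);
    if (percent !== lastPercent) {
      lastPercent = percent;
      log(`Downloading model: ${percent}%`);
    }
  };
}
```

It could be passed as the last argument to either factory, for example `SpeechToTextModule.fromModelName(WHISPER_TINY_EN, makeProgressReporter(console.log))`.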