Version: Next

Class: SpeechToTextModule

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:17

Module for Speech to Text (STT) functionalities.

Methods

decode()

decode(tokens, encoderOutput): Promise<Float32Array<ArrayBufferLike>>

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:140

Runs the decoder of the model.

Parameters

tokens

Int32Array

The input tokens.

encoderOutput

Float32Array

The encoder output.

Returns

Promise<Float32Array<ArrayBufferLike>>

Decoded output.


delete()

delete(): void

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:119

Unloads the model from memory.

Returns

void


encode()

encode(waveform): Promise<Float32Array<ArrayBufferLike>>

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:129

Runs the encoding part of the model on the provided waveform. Returns the encoded waveform as a Float32Array.

Parameters

waveform

Float32Array

The input audio waveform.

Returns

Promise<Float32Array<ArrayBufferLike>>

The encoded output.
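Taken together, encode() and decode() expose the model's low-level encoder-decoder loop. A minimal sketch of driving them directly (the model choice, the waveform source, and the start-token value are illustrative assumptions, not part of this API's contract):

```typescript
import { SpeechToTextModule, WHISPER_TINY_EN } from 'react-native-executorch';

// Assumes a 16 kHz mono waveform obtained elsewhere (e.g. from a recorder).
declare const waveform: Float32Array;

const stt = await SpeechToTextModule.fromModelName(WHISPER_TINY_EN);

// Run the encoder once per utterance...
const encoderOutput = await stt.encode(waveform);

// ...then feed the token sequence generated so far to the decoder.
// The token value below is a placeholder, not an actual Whisper token id.
const tokens = new Int32Array([0]);
const output = await stt.decode(tokens, encoderOutput);

stt.delete(); // unload the model when done
```

For ordinary use, transcribe() wraps this loop; the encode/decode pair is only needed for custom decoding strategies.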


stream()

stream(options?): AsyncGenerator<{ committed: TranscriptionResult; nonCommitted: TranscriptionResult; }>

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:180

Starts a streaming transcription session, yielding objects with committed and nonCommitted transcriptions.

The committed transcription contains the part of the transcription that is finalized and will not change; it is useful for displaying stable results during streaming. The nonCommitted transcription contains the part that is still being processed and may change; it is useful for displaying live, partial results.

Use with streamInsert and streamStop to control the stream.

Parameters

options?

DecodingOptions = {}

Decoding options including language.

Returns

AsyncGenerator<{ committed: TranscriptionResult; nonCommitted: TranscriptionResult; }>

An async generator yielding transcription updates.

Yields

An object containing committed and nonCommitted transcription results.
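A sketch of a full streaming session, combining stream(), streamInsert(), and streamStop(). The `onAudioChunk` callback is a placeholder for whatever your audio source provides; only the three module methods come from this API:

```typescript
import { SpeechToTextModule, WHISPER_TINY_EN } from 'react-native-executorch';

// Placeholder for an audio source delivering 16 kHz Float32Array chunks.
declare function onAudioChunk(cb: (chunk: Float32Array) => void): void;

const stt = await SpeechToTextModule.fromModelName(WHISPER_TINY_EN);

// Feed incoming audio chunks into the session as they arrive.
onAudioChunk((chunk) => stt.streamInsert(chunk));

for await (const { committed, nonCommitted } of stt.stream()) {
  // Render the stable prefix plus the live, still-changing tail.
  console.log(committed, nonCommitted);
}

// Call from e.g. a stop button; ends the session and finishes the generator.
stt.streamStop();
```

Because the for await loop blocks until the generator finishes, streamStop() is typically invoked from a separate event handler rather than after the loop.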


streamInsert()

streamInsert(waveform): void

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:252

Inserts a new audio chunk into the streaming transcription session.

Parameters

waveform

Float32Array

The audio chunk to insert.

Returns

void


streamStop()

streamStop(): void

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:259

Stops the current streaming transcription session.

Returns

void


transcribe()

transcribe(waveform, options?): Promise<TranscriptionResult>

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:156

Transcribes the given 16 kHz audio waveform. For multilingual models, specify the language in options. Resolves with a TranscriptionResult. Passing number[] is deprecated.

Parameters

waveform

Float32Array

The Float32Array audio data.

options?

DecodingOptions = {}

Decoding options including language.

Returns

Promise<TranscriptionResult>

A Promise resolving to the transcription result.
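A minimal one-shot transcription sketch (the model choice and waveform source are illustrative assumptions):

```typescript
import { SpeechToTextModule, WHISPER_TINY_EN } from 'react-native-executorch';

// Assumes a 16 kHz mono waveform obtained elsewhere.
declare const waveform: Float32Array;

const stt = await SpeechToTextModule.fromModelName(WHISPER_TINY_EN);

// One-shot transcription; for a multilingual model you could pass
// decoding options, e.g. stt.transcribe(waveform, { language: 'es' }).
const result = await stt.transcribe(waveform);
console.log(result);
```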


fromCustomModel()

static fromCustomModel(modelSource, tokenizerSource, isMultilingual, onDownloadProgress?): Promise<SpeechToTextModule>

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:69

Creates a Speech to Text instance with user-provided model binaries. Use this when working with a custom-exported STT model. Internally uses 'custom' as the model name for telemetry.

Parameters

modelSource

ResourceSource

A fetchable resource pointing to the model binary.

tokenizerSource

ResourceSource

A fetchable resource pointing to the tokenizer file.

isMultilingual

boolean

Whether the model supports multiple languages.

onDownloadProgress?

(progress) => void

Optional callback to monitor download progress, receiving a value between 0 and 1.

Returns

Promise<SpeechToTextModule>

A Promise resolving to a SpeechToTextModule instance.

Remarks

The native model contract for this method is not formally defined and may change between releases. Currently only the Whisper architecture is supported by the native runner. Refer to the native source code for the current expected interface.
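A sketch of loading a custom-exported model. The URLs and file extensions below are placeholders for your own artifacts; only the method signature comes from this API:

```typescript
import { SpeechToTextModule } from 'react-native-executorch';

// Placeholder URLs: point these at your own exported model binary
// and tokenizer file.
const stt = await SpeechToTextModule.fromCustomModel(
  'https://example.com/whisper_custom.pte', // model binary
  'https://example.com/tokenizer.json',     // tokenizer file
  false,                                    // English-only model
  (progress) => console.log(`Downloaded: ${Math.round(progress * 100)}%`)
);
```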


fromModelName()

static fromModelName(namedSources, onDownloadProgress?): Promise<SpeechToTextModule>

Defined in: modules/natural_language_processing/SpeechToTextModule.ts:40

Creates a Speech to Text instance for a built-in model.

Parameters

namedSources

SpeechToTextModelConfig

Configuration object containing model name, sources, and multilingual flag.

onDownloadProgress?

(progress) => void

Optional callback to monitor download progress, receiving a value between 0 and 1.

Returns

Promise<SpeechToTextModule>

A Promise resolving to a SpeechToTextModule instance.

Example

import { SpeechToTextModule, WHISPER_TINY_EN } from 'react-native-executorch';
const stt = await SpeechToTextModule.fromModelName(WHISPER_TINY_EN);