Version: 0.7.x

useTextToImage

Text-to-image is a process of generating images directly from a description in natural language by conditioning a model on the provided text input. Our implementation follows the Stable Diffusion pipeline, which applies the diffusion process in a lower-dimensional latent space to reduce memory requirements. The pipeline combines a text encoder to preprocess the prompt, a U-Net that iteratively denoises latent representations, and a VAE decoder to reconstruct the final image. React Native ExecuTorch offers a dedicated hook, useTextToImage, for this task.
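The three stages described above can be sketched as plain control flow. This is only an illustration of the order in which the stages run: the Stages interface, the runPipeline function, and the toy latent update are hypothetical and are not part of React Native ExecuTorch, where the real models run natively and a scheduler decides how each denoising step is applied.

```typescript
// Hypothetical stage signatures, used only to show the order of operations.
type Embedding = number[];
type Latents = number[];

interface Stages {
  encodeText: (prompt: string) => Embedding; // text encoder
  predictNoise: (latents: Latents, cond: Embedding, step: number) => Latents; // U-Net
  decode: (latents: Latents) => number[]; // VAE decoder, latents -> pixels
}

function runPipeline(
  stages: Stages,
  prompt: string,
  initialLatents: Latents,
  numSteps: number
): number[] {
  // 1. The prompt is encoded once, before the denoising loop.
  const cond = stages.encodeText(prompt);

  // 2. The U-Net is called once per step on the latent representation.
  let latents = initialLatents.slice();
  for (let step = 0; step < numSteps; step++) {
    const noise = stages.predictNoise(latents, cond, step);
    // Toy update (assumption): a real scheduler computes this differently.
    latents = latents.map((value, i) => value - noise[i] / numSteps);
  }

  // 3. The VAE decoder runs once, on the final latents.
  return stages.decode(latents);
}
```

The point of the sketch is that only the middle stage repeats, which is why the number of denoising steps is the main factor in generation time.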

warning

We recommend using the models we provide, which are available in our Hugging Face repository. You can also use the constants shipped with the library.

API Reference

For detailed API Reference for useTextToImage see: useTextToImage API Reference.
For all text to image models available out-of-the-box in React Native ExecuTorch see: Text to Image Models.

High Level Overview

import { useTextToImage, BK_SDM_TINY_VPRED_256 } from 'react-native-executorch';

const model = useTextToImage({ model: BK_SDM_TINY_VPRED_256 });

const input = 'a castle';

try {
  const image = await model.generate(input);
} catch (error) {
  console.error(error);
}

Arguments

useTextToImage takes a TextToImageProps object that consists of:

model containing schedulerSource, tokenizerSource, encoderSource, unetSource, and decoderSource.
An inference callback inferenceCallback.
An optional flag preventLoad which prevents auto-loading of the model.
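As an illustration of the model argument listed above, here is a minimal sketch of a custom model object with the five source fields. The field names come from the list above. Everything else is an assumption: the CustomTextToImageModel type, the missingSources helper, the example.com base URL, and the file names are placeholders, and the library's real source type may accept more than plain strings.

```typescript
// The five source fields named in the arguments list.
const REQUIRED_SOURCES = [
  'schedulerSource',
  'tokenizerSource',
  'encoderSource',
  'unetSource',
  'decoderSource',
] as const;

type SourceKey = (typeof REQUIRED_SOURCES)[number];

// Hypothetical local type: each source is modeled as a plain string here.
type CustomTextToImageModel = Record<SourceKey, string>;

// Placeholder URL and file names, not real model locations.
const BASE = 'https://example.com/models/my-model';

const customModel: CustomTextToImageModel = {
  schedulerSource: `${BASE}/scheduler.json`,
  tokenizerSource: `${BASE}/tokenizer.json`,
  encoderSource: `${BASE}/text_encoder.pte`,
  unetSource: `${BASE}/unet.pte`,
  decoderSource: `${BASE}/vae_decoder.pte`,
};

// Hypothetical helper: lists required sources that are absent or empty.
function missingSources(model: Partial<CustomTextToImageModel>): SourceKey[] {
  return REQUIRED_SOURCES.filter((key) => !model[key]);
}
```

An object of this shape would be passed as the model field of the hook's props, in place of a shipped constant such as BK_SDM_TINY_VPRED_256.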

Need more details? Check the following resources:

For detailed information about useTextToImage arguments, see: useTextToImage arguments.
For all text-to-image models available out of the box in React Native ExecuTorch, see: Text to Image Models.
For more information on loading resources, see the loading models page.

Returns

useTextToImage returns an object of type TextToImageType, which contains a set of functions for interacting with text-to-image models. For more details, see: TextToImageType API Reference.

Running the model

To run the model, use the generate method. It accepts four arguments: a text prompt describing the requested image, the image size in pixels, the number of denoising steps, and an optional seed value that makes the results reproducible.
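To see why a seed makes results reproducible, consider that diffusion starts from random noise: if the noise is drawn from a seeded generator, the same seed yields the same starting point. The sketch below uses mulberry32, a small public-domain PRNG, purely for illustration. It is not the generator used by React Native ExecuTorch, initialNoise is a hypothetical helper, and real pipelines typically draw Gaussian rather than uniform noise.

```typescript
// mulberry32: a small seeded PRNG, used here only for illustration.
// It is NOT the random number generator used by the library.
function mulberry32(seed: number): () => number {
  let state = seed >>> 0;
  return () => {
    state = (state + 0x6d2b79f5) >>> 0;
    let t = state;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    // Map the 32-bit result to a float in [0, 1).
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: draws `length` uniform values from a seeded generator.
function initialNoise(seed: number, length: number): number[] {
  const next = mulberry32(seed);
  return Array.from({ length }, () => next());
}
```

Calling initialNoise twice with the same seed returns identical arrays, while omitting a fixed seed (for example, deriving one from the current time) gives a different starting point on each run.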

The image size must be a multiple of 32 due to the architecture of the U-Net and VAE models. The seed should be a positive integer.
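These constraints can be checked before calling generate. The following is a minimal sketch: validateGenerateArgs and nearestValidSize are hypothetical helpers, not library functions, and the requirement that the number of steps be a positive integer is an assumption added here for completeness.

```typescript
// Hypothetical helper: returns a list of problems, empty when the arguments look valid.
function validateGenerateArgs(
  imageSize: number,
  numSteps: number,
  seed?: number
): string[] {
  const problems: string[] = [];
  if (!Number.isInteger(imageSize) || imageSize <= 0 || imageSize % 32 !== 0) {
    problems.push('imageSize must be a positive multiple of 32');
  }
  // Assumption: the step count is expected to be a positive integer.
  if (!Number.isInteger(numSteps) || numSteps <= 0) {
    problems.push('numSteps must be a positive integer');
  }
  if (seed !== undefined && (!Number.isInteger(seed) || seed <= 0)) {
    problems.push('seed must be a positive integer');
  }
  return problems;
}

// Hypothetical helper: snaps a requested size to the closest multiple of 32 (at least 32).
function nearestValidSize(requested: number): number {
  return Math.max(32, Math.round(requested / 32) * 32);
}
```

For example, validateGenerateArgs(250, 25) reports the size problem, and nearestValidSize(250) suggests 256 as a replacement.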

warning

Larger imageSize values require significantly more memory to run the model.
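A rough way to reason about this growth is to compare tensor sizes. The sketch below assumes the VAE downsamples each side by a factor of 8 and that the latent has 4 channels, as in Stable Diffusion v1.x; the exported models may differ, and the helper names are hypothetical. It only counts elements, so it understates the cost of layers such as attention, which can grow faster than the element count.

```typescript
// Assumptions (Stable Diffusion v1.x conventions, not confirmed for the shipped models):
const VAE_DOWNSAMPLE = 8; // each image side is 8x the latent side
const LATENT_CHANNELS = 4;

// Hypothetical helper: latent tensor shape as [channels, height, width].
function latentShape(imageSize: number): [number, number, number] {
  const side = imageSize / VAE_DOWNSAMPLE;
  return [LATENT_CHANNELS, side, side];
}

// Hypothetical helper: how many times more pixels (and latent elements) toSize has.
function relativeElementCount(fromSize: number, toSize: number): number {
  return (toSize * toSize) / (fromSize * fromSize);
}
```

Under these assumptions, going from 256 to 512 doubles each side but quadruples the number of elements every stage has to hold.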

Example

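import { useState } from 'react';
import { Button, Image, View } from 'react-native';
import { useTextToImage, BK_SDM_TINY_VPRED_256 } from 'react-native-executorch';

function App() {
  const model = useTextToImage({ model: BK_SDM_TINY_VPRED_256 });
  // Base64-encoded PNG returned by generate, or null before the first run.
  const [image, setImage] = useState<string | null>(null);

  // await is only valid inside an async function, so generation runs in a handler.
  const handleGenerate = async () => {
    const input = 'a medieval castle by the sea shore';
    const imageSize = 256;
    const numSteps = 25;

    try {
      const result = await model.generate(input, imageSize, numSteps);
      if (result) setImage(result);
    } catch (error) {
      console.error(error);
    }
  };

  return (
    <View>
      <Button title="Generate" onPress={handleGenerate} />
      {image && (
        <Image
          style={{ width: 256, height: 256 }}
          source={{ uri: `data:image/png;base64,${image}` }}
        />
      )}
    </View>
  );
}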


Figure: sample outputs at 256×256 (left) and 512×512 (right).

Supported models

Model	Parameters [B]	Description
bk-sdm-tiny-vpred	0.5	BK-SDM (Block-removed Knowledge-distilled Stable Diffusion Model) is a compressed version of Stable Diffusion v1.4 with several residual and attention blocks removed. The BK-SDM-Tiny is a v-prediction variant of the model, obtained through further block removal, built around a 0.33B-parameter U-Net.
