useImageSegmentation

Semantic image segmentation, like image classification, assigns image content to one of a set of predefined classes. In segmentation, however, the classification is done per pixel, so the model outputs an image-sized array of scores for each class. You can use this output to locate objects at pixel precision. React Native ExecuTorch offers a dedicated hook, useImageSegmentation, for this task.

caution

We recommend using the models we provide, which are available in our Hugging Face repository. You can also use the constants shipped with the library.

Reference

import {
  useImageSegmentation,
  DEEPLAB_V3_RESNET50,
} from 'react-native-executorch';

const model = useImageSegmentation({
  modelSource: DEEPLAB_V3_RESNET50,
});

const imageUri = 'file:///Users/.../cute_cat.png';

try {
  // Resolves to a dictionary of per-pixel arrays keyed by DeeplabLabel
  const outputDict = await model.forward(imageUri);
} catch (error) {
  console.error(error);
}

Arguments

modelSource - A string that specifies the location of the model binary. For more information, see the loading models page.

preventLoad? - Boolean that can prevent automatic model loading (and downloading the data if you load it for the first time) after running the hook.

Returns

| Field | Type | Description |
| --- | --- | --- |
| forward | (input: string, classesOfInterest?: DeeplabLabel[], resize?: boolean) => Promise<{[key in DeeplabLabel]?: number[]}> | Executes the model's forward pass. Arguments and return value are described below the table. |
| error | string \| null | Contains the error message if the model failed to load. |
| isGenerating | boolean | Indicates whether the model is currently processing an inference. |
| isReady | boolean | Indicates whether the model has successfully loaded and is ready for inference. |
| downloadProgress | number | Represents the download progress as a value between 0 and 1. |

Arguments of forward:
* input can be a fetchable resource or a Base64-encoded string.
* classesOfInterest is an optional list of DeeplabLabel used to indicate additional arrays of probabilities to output (see section "Running the model"). The default is an empty list.
* resize is an optional boolean to indicate whether the output should be resized to the original image dimensions, or left in the size of the model (see section "Running the model"). The default is false.

forward returns a dictionary containing:
* for the key DeeplabLabel.ARGMAX, an array of integers corresponding to the most probable class for each pixel
* for each class from classesOfInterest, an array of floats corresponding to the probabilities for this class.
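
The status fields above (error, isReady, isGenerating, downloadProgress) are typically combined to decide what the UI should show. The following is a minimal sketch of that idea, not part of the library: the SegmentationStatus interface and the describeStatus helper are hypothetical names, declared locally so the snippet runs on its own.

```typescript
// Local copy of the status fields listed in the Returns table.
// Declared here (not imported) so the sketch is self-contained.
interface SegmentationStatus {
  error: string | null;
  isReady: boolean;
  isGenerating: boolean;
  downloadProgress: number; // value between 0 and 1
}

// Hypothetical helper: maps the hook's status fields to a label for the UI.
function describeStatus(status: SegmentationStatus): string {
  if (status.error !== null) {
    return `Error: ${status.error}`;
  }
  if (!status.isReady) {
    // Model is still downloading or loading.
    return `Loading model: ${Math.round(status.downloadProgress * 100)}%`;
  }
  if (status.isGenerating) {
    return 'Segmenting...';
  }
  return 'Ready';
}
```

In a component, you could pass the object returned by the hook to such a helper and render the resulting string, enabling the "run" button only when the label is 'Ready'.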

Running the model

To run the model, use the forward method. It accepts three arguments: a required image, an optional list of classes, and an optional flag indicating whether to resize the output to the original image dimensions.

  • The image can be a remote URL, a local file URI, or a base64-encoded image.
  • The classesOfInterest list contains classes for which to output the full results. By default the list is empty, and only the most probable classes are returned (essentially an arg max for each pixel). Look at DeeplabLabel enum for possible classes.
  • The resize flag controls whether the output is rescaled back to the size of the input image. The default is false. The model runs inference on a scaled (usually smaller) version of your image (224x224 for DEEPLAB_V3_RESNET50). If you choose to resize, each output array will be a number[] of length width * height of your original image.
caution

Setting resize to true will make forward slower.
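
To make the effect of resize on output size concrete, here is a small sketch of the arithmetic. It assumes that without resize each array has 224 * 224 entries for DEEPLAB_V3_RESNET50, which follows from the model size mentioned above, and that pixels are stored row by row (row-major order), which this page does not confirm. Both helper names are hypothetical.

```typescript
// Model input side length stated above for DEEPLAB_V3_RESNET50.
const MODEL_SIDE = 224;

// Expected length of each array returned by forward.
// Assumption: without resize the output keeps the model's 224x224 size.
function expectedOutputLength(
  resize: boolean,
  originalWidth: number,
  originalHeight: number,
  modelSide: number = MODEL_SIDE
): number {
  return resize ? originalWidth * originalHeight : modelSide * modelSide;
}

// Converts a flat array index into pixel coordinates.
// Assumption: row-major layout (not confirmed by the documentation).
function indexToPixel(index: number, width: number): { x: number; y: number } {
  return { x: index % width, y: Math.floor(index / width) };
}
```

For a 1920x1080 photo this gives 50,176 entries per array without resize and 2,073,600 with it, which illustrates why resizing costs extra time and memory.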

forward returns a promise that either rejects with an error or resolves to a dictionary of number arrays, whose size depends on resize:

  • For the key DeeplabLabel.ARGMAX the array contains for each pixel an integer corresponding to the class with the highest probability.
  • For every other key from DeeplabLabel, if the label was included in classesOfInterest the dictionary will contain an array of floats corresponding to the probability of this class for every pixel.
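
Once the promise resolves, the arrays are plain number[] values and can be processed with ordinary array code. The sketch below shows two possible summaries: how many pixels were assigned to each class in the ARGMAX array, and what fraction of pixels exceed a probability threshold for one class of interest. The function names and the 0.5 default threshold are illustrative choices, and the mapping from integer values to labels is not assumed here.

```typescript
// Counts how many pixels were assigned to each class index in the ARGMAX array.
function countPixelsPerClass(argmax: number[]): Map<number, number> {
  const counts = new Map<number, number>();
  for (const classIndex of argmax) {
    counts.set(classIndex, (counts.get(classIndex) ?? 0) + 1);
  }
  return counts;
}

// Fraction of pixels whose probability for a class is at least `threshold`.
// Returns 0 for an empty array to avoid dividing by zero.
function coverage(probabilities: number[], threshold: number = 0.5): number {
  if (probabilities.length === 0) {
    return 0;
  }
  const hits = probabilities.filter((p) => p >= threshold).length;
  return hits / probabilities.length;
}
```

For example, you could pass outputDict[DeeplabLabel.ARGMAX] to countPixelsPerClass and the array for a class from classesOfInterest to coverage, after checking that each key is present.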

Example

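import {
  useImageSegmentation,
  DEEPLAB_V3_RESNET50,
  DeeplabLabel,
} from 'react-native-executorch';

function App() {
  const model = useImageSegmentation({
    modelSource: DEEPLAB_V3_RESNET50,
  });

  ...
  const imageUri = 'file:///Users/.../cute_cat.png';

  // await is only valid inside an async function, so wrap the call
  const runSegmentation = async () => {
    try {
      // Request per-pixel cat probabilities, resized to the original image
      const outputDict = await model.forward(imageUri, [DeeplabLabel.CAT], true);
    } catch (error) {
      console.error(error);
    }
  };
  ...
}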
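
The example requests per-pixel probabilities for one class. One way to visualize such an array is to turn it into a semi-transparent RGBA overlay, where the alpha channel follows the probability. The following is a sketch under assumptions: probabilitiesToOverlay is a hypothetical helper, the tint color and maximum alpha are arbitrary, and how you draw the resulting buffer depends on your rendering library.

```typescript
// Builds an RGBA buffer (4 bytes per pixel) that tints each pixel
// in proportion to its class probability.
function probabilitiesToOverlay(
  probabilities: number[],
  color: [number, number, number] = [255, 0, 0], // arbitrary tint (red)
  maxAlpha: number = 160 // alpha used for probability 1.0
): Uint8ClampedArray {
  const rgba = new Uint8ClampedArray(probabilities.length * 4);
  probabilities.forEach((p, i) => {
    // Clamp defensively in case a value falls slightly outside [0, 1].
    const clamped = Math.min(1, Math.max(0, p));
    rgba[i * 4] = color[0];
    rgba[i * 4 + 1] = color[1];
    rgba[i * 4 + 2] = color[2];
    rgba[i * 4 + 3] = Math.round(clamped * maxAlpha);
  });
  return rgba;
}
```

With resize set to true, the buffer has one RGBA pixel per pixel of the original image, so it can be layered over the source picture without further scaling.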

Supported models

| Model | Number of classes | Class list |
| --- | --- | --- |
| deeplabv3_resnet50 | 21 | DeeplabLabel |

Benchmarks

Model size

| Model | XNNPACK [MB] |
| --- | --- |
| DEEPLAB_V3_RESNET50 | 168 |

Memory usage

warning

Data presented in the following sections is based on inference with non-resized output. When resize is enabled, expect higher memory usage and inference time with higher resolutions.

| Model | Android (XNNPACK) [MB] | iOS (XNNPACK) [MB] |
| --- | --- | --- |
| DEEPLAB_V3_RESNET50 | 930 | 660 |

Inference time

warning

Times presented in the tables are measured as consecutive runs of the model. Initial run times may be up to 2x longer due to model loading and initialization.

| Model | iPhone 16 Pro (Core ML) [ms] | iPhone 14 Pro Max (Core ML) [ms] | Samsung Galaxy S24 (XNNPACK) [ms] |
| --- | --- | --- | --- |
| DEEPLAB_V3_RESNET50 | 1000 | 670 | 700 |