PoseEstimationModule
TypeScript API implementation of the usePoseEstimation hook.
API Reference
- For a detailed API Reference for PoseEstimationModule see: PoseEstimationModule API Reference.
- For all pose estimation models available out-of-the-box in React Native ExecuTorch see: Pose Estimation Models.
High Level Overview
import { PoseEstimationModule, YOLO26N_POSE } from 'react-native-executorch';
const imageUri = 'path/to/image.png';
// Creating an instance and loading the model
const poseEstimationModule =
await PoseEstimationModule.fromModelName(YOLO26N_POSE);
// Running the model
const detections = await poseEstimationModule.forward(imageUri);
detections[0].NOSE; // { x, y }
Methods
All methods of PoseEstimationModule are explained in detail here: PoseEstimationModule API Reference
Loading the model
Use the static fromModelName factory method. It accepts a model config object (with modelName and modelSource) and an optional onDownloadProgress callback. It returns a promise resolving to a PoseEstimationModule instance whose forward return type is statically tied to the model's keypoint map.
For more information on loading resources, take a look at the loading models page.
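For example, loading one of the built-in models with a download progress callback could look like the sketch below. It assumes the callback is passed as the second positional argument, mirroring the fromCustomModel example later on this page:
import { PoseEstimationModule, YOLO26N_POSE } from 'react-native-executorch';
// Load a built-in model; the optional callback is invoked as the model file downloads.
const poseEstimationModule = await PoseEstimationModule.fromModelName(
  YOLO26N_POSE,
  (progress) => console.log('Download progress:', progress)
);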
Running the model
To run the model, use the forward method. It accepts two arguments:
- input (required) - The image to process. Can be a remote URL, a local file URI, a base64-encoded image (whole URI or only raw base64), or a PixelData object (raw RGB pixel buffer).
- options (optional) - A PoseEstimationOptions object with:
  - detectionThreshold (optional) - Minimum confidence score for a detected person (0-1). Defaults to a model-specific value.
  - keypointThreshold (optional) - Per-keypoint visibility threshold (0-1). Keypoints whose model-reported visibility falls below this are reported as (-1, -1) so consumers can skip them. Defaults to a model-specific value.
  - inputSize (optional) - For YOLO models: 384, 512, or 640. Defaults to 384.
The method returns a promise resolving to an array of PersonKeypoints. Each entry is an object keyed by the model's keypoint names (e.g. NOSE, LEFT_SHOULDER), where each value is a Keypoint with x and y coordinates in the original image's pixel space.
Keypoints whose visibility falls below keypointThreshold (or that the model considers off-image) are returned as { x: -1, y: -1 }. Filter them out before drawing — e.g. if (kp.x < 0 || kp.y < 0) skip;.
For real-time frame processing, use runOnFrame instead.
Example with Options
const detections = await model.forward(imageUri, {
detectionThreshold: 0.5,
inputSize: 640, // YOLO models only
});
for (const person of detections) {
console.log('Nose at', person.NOSE.x, person.NOSE.y);
}
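Because keypoints below keypointThreshold come back as { x: -1, y: -1 }, it is usually worth filtering them before rendering. The sketch below shows one way to do it; isVisible and drawKeypoint are hypothetical helpers introduced only for illustration:
// A keypoint is usable only if both coordinates are non-negative.
const isVisible = (kp: { x: number; y: number }) => kp.x >= 0 && kp.y >= 0;
for (const person of detections) {
  for (const [name, kp] of Object.entries(person)) {
    if (!isVisible(kp)) continue; // skip keypoints reported as (-1, -1)
    drawKeypoint(name, kp.x, kp.y); // placeholder for your own rendering code
  }
}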
Using a custom model
Use fromCustomModel to load your own exported model binary instead of a built-in preset. You provide the keypoint map; forward's return type is automatically derived from it, so each detected person is typed as a record keyed by the names you defined.
import { PoseEstimationModule } from 'react-native-executorch';
const HandKeypoints = {
WRIST: 0,
THUMB_TIP: 1,
INDEX_TIP: 2,
MIDDLE_TIP: 3,
RING_TIP: 4,
PINKY_TIP: 5,
} as const;
const detector = await PoseEstimationModule.fromCustomModel(
'https://example.com/custom_pose.pte',
{ keypointMap: HandKeypoints },
(progress) => console.log(progress)
);
const detections = await detector.forward(imageUri);
detections[0].THUMB_TIP; // { x, y }
Required model contract
The .pte binary must expose a forward method (or per-input-size methods such as forward_384, forward_512, forward_640 for multi-resolution models) with the following interface:
Input: one float32 tensor of shape [1, 3, H, W] — a single RGB image, values in [0, 1] after optional per-channel normalization (pixel − mean) / std. H and W are read from the model's declared input shape at load time. The mean and std vectors are supplied via preprocessorConfig.normMean and preprocessorConfig.normStd on the PoseEstimationConfig you pass to fromCustomModel; if omitted, the runtime feeds the resized image without normalization.
Outputs: exactly three float32 tensors, in this order:
- Bounding boxes — shape [Q, 4], (x1, y1, x2, y2) per detection in model-input pixel space, where Q is the number of candidate detections.
- Confidence scores — shape [Q], person confidence in [0, 1].
- Keypoints — shape [Q, K, 3], where K is the number of keypoints (must match the size of your keypointMap) and the last dimension is (x, y, visibility) per keypoint, in model-input pixel space.
Preprocessing (resize → normalize) and postprocessing (coordinate rescaling, threshold filtering, mapping keypoints to your named keypoint map) are handled by the native runtime — your model only needs to produce the raw detections above.
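As an illustration of this contract, the custom hand-pose model from the earlier example could be loaded with normalization vectors as sketched below. The normMean/normStd values shown are standard ImageNet statistics used purely as placeholders; the config shape beyond keypointMap and preprocessorConfig is as described above:
const detector = await PoseEstimationModule.fromCustomModel(
  'https://example.com/custom_pose.pte',
  {
    keypointMap: HandKeypoints,
    // Per-channel normalization applied as (pixel − mean) / std before inference.
    // Illustrative ImageNet statistics, not values required by the runtime.
    preprocessorConfig: {
      normMean: [0.485, 0.456, 0.406],
      normStd: [0.229, 0.224, 0.225],
    },
  }
);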
Managing memory
The module is a regular JavaScript object, and as such its lifespan will be managed by the garbage collector. In most cases this should be enough, and you should not worry about freeing the memory of the module yourself, but in some cases you may want to release the memory occupied by the module before the garbage collector steps in. In this case use the method delete on the module object you will no longer use, and want to remove from the memory. Note that you cannot use forward after delete unless you load the module again.