usePoseEstimation
Pose estimation is a computer vision technique that detects human bodies in an image and locates a fixed set of keypoints (e.g. nose, shoulders, knees) for each detected person. Unlike object detection, which produces a class label and a bounding box, pose estimation produces a structured set of named keypoints per person. React Native ExecuTorch offers a dedicated hook usePoseEstimation for this task.
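Concretely, the structured result can be pictured as a map from keypoint names to image coordinates. A simplified sketch of the shape (the real types, described later on this page, are generic over the model's keypoint map):

```tsx
type Keypoint = { x: number; y: number };

// Simplified: the actual keypoint names depend on the chosen model.
type PersonKeypoints = {
  NOSE: Keypoint;
  LEFT_SHOULDER: Keypoint;
  // ...one entry per keypoint the model defines
};
```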
It is recommended to use models provided by us, which are available at our Hugging Face repository. You can also use constants shipped with our library.
API Reference
- For the detailed API reference for `usePoseEstimation`, see: `usePoseEstimation` API Reference.
- For all pose estimation models available out-of-the-box in React Native ExecuTorch, see: Pose Estimation Models.
High Level Overview
```tsx
import { usePoseEstimation, YOLO26N_POSE } from 'react-native-executorch';

const model = usePoseEstimation({
  model: YOLO26N_POSE,
});

// Inside a component's async handler:
const imageUri = 'file:///Users/.../photo.jpg';
try {
  const detections = await model.forward(imageUri);
  // detections is an array of PersonKeypoints, keyed by name (e.g. detections[0].NOSE)
} catch (error) {
  console.error(error);
}
```
Arguments
`usePoseEstimation` takes `PoseEstimationProps`, which consists of:
- `model` - An object containing:
  - `modelName` - The name of a built-in model. See `PoseEstimationModelSources` for the list of supported models.
  - `modelSource` - The location of the model binary (a URL or a bundled resource).
- `preventLoad` - An optional flag which prevents auto-loading of the model.
The hook is generic over the model config — TypeScript automatically infers the correct keypoint type based on the modelName you provide. No explicit generic parameter is needed.
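For illustration, both configuration styles might look like this (a minimal sketch; the custom URL is a placeholder, and whether `modelSource` accepts exactly this shape is an assumption based on the list above):

```tsx
// Built-in model constant: keypoint names are inferred automatically,
// so detections are typed (e.g. person.NOSE) with no generic parameter.
const builtIn = usePoseEstimation({
  model: YOLO26N_POSE,
});

// Hypothetical custom binary location, with loading deferred via preventLoad.
const custom = usePoseEstimation({
  model: { modelSource: 'https://example.com/yolo26n-pose.pte' },
  preventLoad: true,
});
```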
Need more details? Check the following resources:
- For detailed information about `usePoseEstimation` arguments, check this section: `usePoseEstimation` arguments.
- For all pose estimation models available out-of-the-box in React Native ExecuTorch, see: Pose Estimation Models.
- For more information on loading resources, take a look at the loading models page.
Returns
`usePoseEstimation` returns a `PoseEstimationType` object containing:
- `isReady` - Whether the model is loaded and ready to process images.
- `isGenerating` - Whether the model is currently processing an image.
- `error` - An error object if the model failed to load or encountered a runtime error.
- `downloadProgress` - A value between 0 and 1 representing the download progress of the model binary.
- `forward` - A function to run inference on an image.
- `getAvailableInputSizes` - A function that returns available input sizes for multi-method models (YOLO). Returns `undefined` for single-method models.
- `runOnFrame` - A synchronous worklet function for real-time VisionCamera frame processing. See VisionCamera Integration for usage.
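For instance, these fields can drive a simple loading gate before any inference runs (a minimal sketch; `PoseScreen` is an illustrative name, not part of the library):

```tsx
import { Text } from 'react-native';
import { usePoseEstimation, YOLO26N_POSE } from 'react-native-executorch';

function PoseScreen() {
  const model = usePoseEstimation({ model: YOLO26N_POSE });

  if (model.error) {
    return <Text>Model failed to load: {String(model.error)}</Text>;
  }
  if (!model.isReady) {
    // downloadProgress is between 0 and 1, so render it as a percentage.
    return <Text>Downloading model: {Math.round(model.downloadProgress * 100)}%</Text>;
  }
  return <Text>Model ready.</Text>;
}
```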
Running the model
To run the model, use the `forward` method. It accepts two arguments:
- `input` (required) - The image to process. Can be a remote URL, a local file URI, a base64-encoded image (whole URI or only raw base64), or a `PixelData` object (raw RGB pixel buffer).
- `options` (optional) - A `PoseEstimationOptions` object with the following properties:
  - `detectionThreshold` (optional) - A number between 0 and 1 representing the minimum confidence score for a detected person. Defaults to a model-specific value (typically `0.5`).
  - `keypointThreshold` (optional) - Per-keypoint visibility threshold (0-1). Keypoints whose model-reported visibility falls below this are emitted as `(-1, -1)` so consumers can skip them. Defaults to a model-specific value.
  - `inputSize` (optional) - For multi-method models like YOLO, specify the input resolution (`384`, `512`, or `640`). Defaults to `384` for YOLO models.
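Passing all options together might look like this (a sketch; the threshold values are illustrative, not recommendations):

```tsx
const detections = await model.forward(imageUri, {
  detectionThreshold: 0.6, // drop low-confidence person detections
  keypointThreshold: 0.3, // mark barely-visible keypoints as (-1, -1)
  inputSize: 512, // mid-range speed/accuracy trade-off for YOLO
});
```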
`forward` returns a promise resolving to an array of `PersonKeypoints`, one entry per detected person. Each entry is an object keyed by the model's keypoint names (typed against the model's keypoint map), where each value is a `Keypoint` with:
- `x` - The x coordinate in the original image's pixel space.
- `y` - The y coordinate in the original image's pixel space.
Keypoints whose visibility falls below `keypointThreshold` (or that the model considers off-image) are returned as `{ x: -1, y: -1 }`. Filter them out before drawing, e.g. skip any keypoint where `kp.x < 0 || kp.y < 0`, as in the sketch below.
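A minimal filtering sketch (the `isVisible` helper name is ours, not part of the library):

```tsx
type Keypoint = { x: number; y: number };

const isVisible = (kp: Keypoint) => kp.x >= 0 && kp.y >= 0;

const detections = await model.forward(imageUri);
for (const person of detections) {
  for (const [name, kp] of Object.entries(person) as [string, Keypoint][]) {
    if (!isVisible(kp)) continue; // below keypointThreshold or off-image
    console.log(`${name} at (${kp.x}, ${kp.y})`); // stand-in for actual drawing
  }
}
```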
For example, with a COCO-keypoint model:
```tsx
const detections = await model.forward(imageUri);
const firstPerson = detections[0];
firstPerson.NOSE; // { x, y }
firstPerson.LEFT_SHOULDER; // { x, y }
```
The keypoint names available on each person are determined by the model's keypoint map and are checked at compile time.
Example
```tsx
import { usePoseEstimation, YOLO26N_POSE } from 'react-native-executorch';

function App() {
  const model = usePoseEstimation({
    model: YOLO26N_POSE,
  });

  const handleDetect = async () => {
    if (!model.isReady) return;

    const imageUri = 'file:///Users/.../photo.jpg';
    try {
      const detections = await model.forward(imageUri, {
        detectionThreshold: 0.5,
        inputSize: 640,
      });
      console.log('Detected:', detections.length, 'people');
      for (const person of detections) {
        console.log('Nose at', person.NOSE.x, person.NOSE.y);
      }
    } catch (error) {
      console.error(error);
    }
  };

  // ...
}
```
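If you draw the detected keypoints over a resized preview, remember that coordinates are reported in the original image's pixel space. A rough scaling sketch (all sizes and names below are illustrative, assumed to come from your image metadata and layout measurement):

```tsx
const originalWidth = 1920; // assumed image metadata
const originalHeight = 1080;
const viewWidth = 360; // assumed measured layout of the preview
const viewHeight = 202;

// Map a keypoint from original image pixels to the rendered view size.
const toViewSpace = (kp: { x: number; y: number }) => ({
  x: (kp.x / originalWidth) * viewWidth,
  y: (kp.y / originalHeight) * viewHeight,
});
```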
VisionCamera integration
See the full guide: VisionCamera Integration.
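As a rough orientation before reading the guide, wiring `runOnFrame` into a VisionCamera frame processor might look like the sketch below. The exact `runOnFrame` signature and return shape are assumptions here; treat the guide as authoritative.

```tsx
import { useFrameProcessor } from 'react-native-vision-camera';

const frameProcessor = useFrameProcessor(
  (frame) => {
    'worklet';
    // Assumed usage: runOnFrame is documented as a synchronous worklet function.
    const detections = model.runOnFrame(frame);
    // ...report or draw detections
  },
  [model]
);
```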
Supported models
| Model | Number of keypoints | Keypoint list | Multi-size Support |
|---|---|---|---|
| YOLO26N-Pose | 17 | COCO | Yes (384/512/640) |
YOLO models support multiple input sizes (384px, 512px, 640px). Smaller sizes are faster but less accurate, while larger sizes are more accurate but slower. Choose based on your speed/accuracy requirements.
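To pick a size at runtime, you can query the model first (a sketch; the exact return type of `getAvailableInputSizes` beyond what is described above is an assumption):

```tsx
const sizes = model.getAvailableInputSizes(); // e.g. [384, 512, 640] for YOLO
if (sizes) {
  // Multi-size model: trade speed for accuracy by picking a larger input.
  const inputSize = Math.max(...sizes) as 384 | 512 | 640; // assumed union type
  await model.forward(imageUri, { inputSize });
} else {
  // Single-method model: no size selection is available.
  await model.forward(imageUri);
}
```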