Version: Next

Model Size

Classification

| Model | XNNPACK FP32 [MB] | XNNPACK INT8 [MB] | Core ML FP32 [MB] | Core ML FP16 [MB] |
| --- | --- | --- | --- | --- |
| EFFICIENTNET_V2_S | 85.7 | 22.9 | 86.5 | 43.9 |

Object Detection

| Model | XNNPACK FP32 [MB] | Core ML FP32 [MB] | Core ML FP16 [MB] |
| --- | --- | --- | --- |
| SSDLITE_320_MOBILENET_V3_LARGE | 13.9 | 15.6 | 8.46 |

Instance Segmentation

| Model | XNNPACK [MB] |
| --- | --- |
| YOLO26N_SEG | 11.6 |
| YOLO26S_SEG | 42.3 |
| YOLO26M_SEG | 95.4 |
| YOLO26L_SEG | 113 |
| YOLO26X_SEG | 252 |
| RF_DETR_NANO_SEG | 124 |

Style Transfer

| Model | XNNPACK FP32 [MB] | XNNPACK INT8 [MB] | Core ML FP32 [MB] | Core ML FP16 [MB] |
| --- | --- | --- | --- | --- |
| STYLE_TRANSFER_CANDY | 6.82 | 1.84 | 7.12 | 3.79 |
| STYLE_TRANSFER_MOSAIC | 6.82 | 1.84 | 7.12 | 3.79 |
| STYLE_TRANSFER_UDNIE | 6.82 | 1.84 | 7.12 | 3.79 |
| STYLE_TRANSFER_RAIN_PRINCESS | 6.82 | 1.84 | 7.12 | 3.79 |

OCR

| Model | XNNPACK [MB] |
| --- | --- |
| Detector (CRAFT_QUANTIZED) | 20.9 |
| Recognizer (CRNN) | 18.5 - 25.2* |

* The size of the recognizer weights depends on the language.

Vertical OCR

| Model | XNNPACK [MB] |
| --- | --- |
| Detector (CRAFT_QUANTIZED) | 20.9 |
| Recognizer (CRNN) | 18.5 - 25.2* |

* The size of the recognizer weights depends on the language.

LLMs

| Model | XNNPACK [GB] |
| --- | --- |
| LLAMA3_2_1B | 2.47 |
| LLAMA3_2_1B_SPINQUANT | 1.14 |
| LLAMA3_2_1B_QLORA | 1.18 |
| LLAMA3_2_3B | 6.43 |
| LLAMA3_2_3B_SPINQUANT | 2.55 |
| LLAMA3_2_3B_QLORA | 2.65 |
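
The LLM sizes above can be used to shortlist variants for a given storage budget. The following is a minimal sketch, not part of any library API: the `LlmSize` type, the `modelsWithinBudget` helper, and the 2 GB budget are hypothetical names and values chosen for illustration, and the sizes are copied from the table above.

```typescript
// Sizes in GB, copied from the LLMs table above.
type LlmSize = { model: string; sizeGb: number };

const llmSizes: LlmSize[] = [
  { model: "LLAMA3_2_1B", sizeGb: 2.47 },
  { model: "LLAMA3_2_1B_SPINQUANT", sizeGb: 1.14 },
  { model: "LLAMA3_2_1B_QLORA", sizeGb: 1.18 },
  { model: "LLAMA3_2_3B", sizeGb: 6.43 },
  { model: "LLAMA3_2_3B_SPINQUANT", sizeGb: 2.55 },
  { model: "LLAMA3_2_3B_QLORA", sizeGb: 2.65 },
];

// Hypothetical helper: names of the models whose file size fits the budget.
function modelsWithinBudget(sizes: LlmSize[], budgetGb: number): string[] {
  return sizes
    .filter((entry) => entry.sizeGb <= budgetGb)
    .map((entry) => entry.model);
}

// Assumed budget of 2 GB, chosen only for illustration.
const shortlist = modelsWithinBudget(llmSizes, 2.0);
console.log(shortlist);
```

With the assumed 2 GB budget, only the two quantized 1B variants remain on the shortlist. The check covers file size on disk only; runtime memory use is a separate concern that this table does not describe.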

Speech to text

| Model | XNNPACK [MB] |
| --- | --- |
| WHISPER_TINY_EN | 151 |
| WHISPER_TINY | 151 |
| WHISPER_BASE_EN | 290.6 |
| WHISPER_BASE | 290.6 |
| WHISPER_SMALL_EN | 968 |
| WHISPER_SMALL | 968 |

Text to speech

| Model | XNNPACK [MB] |
| --- | --- |
| KOKORO_SMALL | 329.6 |
| KOKORO_MEDIUM | 334.4 |

Text Embeddings

| Model | XNNPACK [MB] |
| --- | --- |
| ALL_MINILM_L6_V2 | 91 |
| ALL_MPNET_BASE_V2 | 438 |
| MULTI_QA_MINILM_L6_COS_V1 | 91 |
| MULTI_QA_MPNET_BASE_DOT_V1 | 438 |
| CLIP_VIT_BASE_PATCH32_TEXT | 254 |

Image Embeddings

| Model | XNNPACK FP32 [MB] | XNNPACK INT8 [MB] |
| --- | --- | --- |
| CLIP_VIT_BASE_PATCH32_IMAGE | 352 | 96.4 |

Semantic Segmentation

| Model | XNNPACK FP32 [MB] | XNNPACK INT8 [MB] |
| --- | --- | --- |
| DEEPLAB_V3_RESNET50 | 168 | 42.4 |
| DEEPLAB_V3_RESNET101 | 244 | 61.7 |
| DEEPLAB_V3_MOBILENET_V3_LARGE | 44.1 | 11.4 |
| LRASPP_MOBILENET_V3_LARGE | 12.9 | 3.53 |
| FCN_RESNET50 | 141 | 35.7 |
| FCN_RESNET101 | 217 | 55 |
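
To see how much the INT8 variants save relative to FP32, the ratio of the two columns can be computed per model. The sketch below uses only the figures from the table above; `SizeRow` and `compressionRatio` are hypothetical names introduced for illustration.

```typescript
// Sizes in MB, copied from the Semantic Segmentation table above.
type SizeRow = { model: string; fp32Mb: number; int8Mb: number };

const segmentationRows: SizeRow[] = [
  { model: "DEEPLAB_V3_RESNET50", fp32Mb: 168, int8Mb: 42.4 },
  { model: "DEEPLAB_V3_RESNET101", fp32Mb: 244, int8Mb: 61.7 },
  { model: "DEEPLAB_V3_MOBILENET_V3_LARGE", fp32Mb: 44.1, int8Mb: 11.4 },
  { model: "LRASPP_MOBILENET_V3_LARGE", fp32Mb: 12.9, int8Mb: 3.53 },
  { model: "FCN_RESNET50", fp32Mb: 141, int8Mb: 35.7 },
  { model: "FCN_RESNET101", fp32Mb: 217, int8Mb: 55 },
];

// Hypothetical helper: how many times smaller the INT8 file is than the FP32 file.
function compressionRatio(row: SizeRow): number {
  return row.fp32Mb / row.int8Mb;
}

for (const row of segmentationRows) {
  console.log(`${row.model}: ${compressionRatio(row).toFixed(2)}x smaller`);
}
```

For the listed models the ratio falls between roughly 3.6x and 4x, which is close to the 4x expected from storing 8-bit instead of 32-bit values.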

Text to image

| Model | Text encoder (XNNPACK) [MB] | UNet (XNNPACK) [MB] | VAE decoder (XNNPACK) [MB] |
| --- | --- | --- | --- |
| BK_SDM_TINY_VPRED | 492 | 1290 | 198 |
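
Because the text-to-image model is listed as three separate components, the total footprint is the sum of the three columns. The sketch below assumes all three components are stored together; `PipelineSizes` and `totalSizeMb` are hypothetical names introduced for illustration.

```typescript
// Component sizes in MB, copied from the Text to image table above.
type PipelineSizes = { textEncoder: number; unet: number; vaeDecoder: number };

const bkSdmTinyVpred: PipelineSizes = {
  textEncoder: 492,
  unet: 1290,
  vaeDecoder: 198,
};

// Hypothetical helper: total size, assuming every component is stored together.
function totalSizeMb(sizes: PipelineSizes): number {
  return sizes.textEncoder + sizes.unet + sizes.vaeDecoder;
}

const totalMb = totalSizeMb(bkSdmTinyVpred);
console.log(`BK_SDM_TINY_VPRED total: ${totalMb} MB`); // prints "BK_SDM_TINY_VPRED total: 1980 MB"
```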

Voice Activity Detection (VAD)

| Model | XNNPACK [MB] |
| --- | --- |
| FSMN_VAD | 1.83 |