Models
Default models in the Human library are:
- Face Detection: MediaPipe BlazeFace Back variation
- Face Mesh: MediaPipe FaceMesh
- Face Iris Analysis: MediaPipe Iris
- Face Description: HSE FaceRes
- Emotion Detection: Oarriaga Emotion
- Body Analysis: MoveNet Lightning variation
- Hand Analysis: HandTrack combined with MediaPipe Hands
- Object Detection: MB3 CenterNet (not enabled by default)
- Body Segmentation: Google Selfie (not enabled by default)
- Face Anti-Spoofing: Real-or-Fake (not enabled by default)
- Face Live Detection: Liveness (not enabled by default)
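The modules marked as not enabled by default can be turned on through the standard Human configuration object. Below is a minimal sketch, assuming a recent `@vladmandic/human` build; the exact nesting of the optional flags should be verified against the `Config` type of the installed version:

```ts
import { Human } from '@vladmandic/human';

// enable the optional modules on top of the defaults
// (config property names assumed; verify against the Config type of your Human version)
const human = new Human({
  object: { enabled: true },        // MB3 CenterNet object detection
  segmentation: { enabled: true },  // Google Selfie body segmentation
  face: {
    antispoof: { enabled: true },   // Real-or-Fake anti-spoofing
    liveness: { enabled: true },    // liveness detection
  },
});

async function run(input: HTMLImageElement | HTMLVideoElement) {
  const result = await human.detect(input); // runs all enabled models
  console.log(result.object, result.face);  // object detections and per-face results
}
```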
Human includes default models, but also supports a number of additional models and variations of existing models
Additional models can be accessed as follows:

To use alternative models from a local host:
- download them from either GitHub or npmjs, and either
  - set the Human configuration value `modelPath` for each model, or
  - set the global configuration value `baseModelPath` to the location of the downloaded models

To use alternative models from a CDN, use the location prefix https://www.jsdelivr.com/package/npm/@vladmandic/human-models/models/ as the value of either `modelPath` or `baseModelPath`
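For example, a minimal sketch of both approaches, written here with per-model `modelPath` values (the `/models` route and the chosen model files are illustrative placeholders):

```ts
import { Human } from '@vladmandic/human';

// CDN location prefix for alternative models, as noted above
const cdnPrefix = 'https://www.jsdelivr.com/package/npm/@vladmandic/human-models/models/';

// Option A: models downloaded from GitHub or npmjs and served by the local host,
// referenced per-model via modelPath (keep the original filenames)
const humanLocal = new Human({
  face: { detector: { modelPath: '/models/blazeface-back.json' } },
  object: { enabled: true, modelPath: '/models/nanodet.json' },
});

// Option B: the same per-model modelPath values, but loaded directly from the CDN prefix
const humanCdn = new Human({
  face: { detector: { modelPath: cdnPrefix + 'blazeface-back.json' } },
  object: { enabled: true, modelPath: cdnPrefix + 'nanodet.json' },
});
```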
All models are modified from the original implementations in the following manner:
- Input pre-processing: image enhancements, normalization, etc.
- Caching: custom caching operations to bypass specific model runs when no changes are detected
- Output parsing: custom analysis of HeatMaps to regions, output values normalization, etc.
- Output interpolation: custom smoothing operations
- Model modifications:
  - Model definition: reformatted for readability, added conversion notes and correct signatures
  - Model weights: quantized to 16-bit float values for size reduction
Models are not re-trained, so any bias present in the original models is also present in Human
For possible bias notes, see the specific model cards
Human includes implementations for several alternative models, which can be switched on-the-fly while keeping the standardized input and results object structure
Switching models also automatically switches the implementation used inside Human, so it is critical to keep model filenames in their original form
Human includes all default models, while alternative models are kept in a separate repository due to size considerations and must be downloaded manually from https://github.com/vladmandic/human-models
Body detection can be switched from PoseNet to BlazePose, EfficientPose or MoveNet depending on the use case (a configuration sketch follows the list):
- PoseNet: works with multiple people in frame and with only partially visible people
  Best described as works-anywhere, but not with great precision
- MoveNet-Lightning: works with a single person in frame and with only partially visible people
  Modernized and optimized version of PoseNet with a different model architecture
- MoveNet-Thunder: variation of MoveNet with higher precision but slower processing
- EfficientPose: works with a single person in frame and with only partially visible people
  Experimental model that shows future promise, but is not ready for widespread usage due to performance
- BlazePose: works with a single person in frame, and that person should be fully visible
  But if those conditions are met, it returns far more detail (39 vs 17 keypoints) and is far more accurate
  Furthermore, it returns a 3D approximation of each point instead of 2D
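As an illustration, a hedged sketch of switching body analysis to BlazePose by model filename and reading the richer keypoint set from the result (the filename assumes the alternative models are served from the configured model path; the matching BlazePose detector model listed in the table below may also be needed):

```ts
import { Human } from '@vladmandic/human';

// selecting blazepose-full.json switches the body implementation to BlazePose;
// keeping the original filename is what lets Human pick the matching implementation
const human = new Human({
  body: { enabled: true, modelPath: 'blazepose-full.json' },
});

async function detectBody(input: HTMLVideoElement) {
  const result = await human.detect(input);
  for (const body of result.body) {
    // BlazePose reports 39 keypoints (vs 17 for PoseNet/MoveNet),
    // each with an approximate 3D position
    console.log(body.keypoints.length, body.keypoints[0]?.position);
  }
}
```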
Face description can be switched from the default combined model FaceRes to individual models:
- Gender Detection: Oarriaga Gender
- Age Detection: SSR-Net Age IMDB
- Face Embedding: BecauseofAI MobileFace Embedding
Object detection can be switched from `centernet` to `nanodet`
Hand detection can be switched from `handdetect` to `handtrack`
Body segmentation can be switched from `rvm` to `selfie` or `meet`
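The same pattern applies to these modules as well; a hedged sketch, assuming the alternative model files are reachable at the configured model path:

```ts
import { Human } from '@vladmandic/human';

// switch implementations purely by model filename (kept in original form)
const human = new Human({
  object: { enabled: true, modelPath: 'nanodet.json' },               // centernet -> nanodet
  hand: { enabled: true, detector: { modelPath: 'handtrack.json' } }, // handdetect -> handtrack
  segmentation: { enabled: true, modelPath: 'selfie.json' },          // rvm -> selfie (or meet.json)
});
```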
Model Name | Definition Size | Definition File | Weights Size | Weights File | Num Tensors | Input Resolution |
---|---|---|---|---|---|---|
Anti-Spoofing | 8K | antispoof.json | 834K | antispoof.bin | 11 | |
BecauseofAI MobileFace | 33K | mobileface.json | 2.1M | mobileface.bin | 75 | 112x112 |
EfficientPose | 134K | efficientpose.json | 5.6M | efficientpose.bin | 217 | 368x368 |
FaceBoxes | 212K | faceboxes.json | 2.0M | faceboxes.bin | 350 | 0x0 |
FaceRes | 70K | faceres.json | 6.7M | faceres.bin | 128 | 224x224 |
FaceRes (Deep) | 62K | faceres-deep.json | 13.9M | faceres-deep.bin | 128 | 224x224 |
GEAR Predictor (Gender/Emotion/Age/Race) | 28K | gear.json | 1.5M | gear.bin | 25 | 198x198 |
Google Selfie | 82K | selfie.json | 208K | selfie.bin | 136 | 256x256 |
Hand Tracking | 605K | handtrack.json | 2.9M | handtrack.bin | 619 | 320x320 |
Liveness | 17K | liveness.json | 580K | liveness.bin | 23 | 32x32 |
MB3-CenterNet | 197K | mb3-centernet.json | 1.9M | mb3-centernet.bin | 267 | 128x128 |
MediaPipe BlazeFace (Front) | 51K | blazeface-front.json | 323K | blazeface-front.bin | 73 | 128x128 |
MediaPipe BlazeFace (Back) | 78K | blazeface-back.json | 527K | blazeface-back.bin | 112 | 256x256 |
MediaPipe BlazePose (Lite) | 132K | blazepose-lite.json | 2.6M | blazepose-lite.bin | 177 | 256x256 |
MediaPipe BlazePose (Full) | 145K | blazepose-full.json | 6.6M | blazepose-full.bin | 193 | 256x256 |
MediaPipe BlazePose (Heavy) | 305K | blazepose-heavy.json | 27.0M | blazepose-heavy.bin | 400 | 256x256 |
MediaPipe BlazePose Detector (2D) | 129K | blazepose-detector2d.json | 7.2M | blazepose-detector2d.bin | 180 | 224x224 |
MediaPipe BlazePose Detector (3D) | 132K | blazepose-detector3d.json | 5.7M | blazepose-detector3d.bin | 181 | 224x224 |
MediaPipe FaceMesh | 94K | facemesh.json | 1.5M | facemesh.bin | 120 | 192x192 |
MediaPipe FaceMesh with Attention | 889K | facemesh-attention.json | 2.3M | facemesh-attention.bin | 1061 | 192x192 |
MediaPipe Hand Landmark (Full) | 81K | handlandmark-full.json | 5.4M | handlandmark-full.bin | 112 | 224x224 |
MediaPipe Hand Landmark (Lite) | 82K | handlandmark-lite.json | 2.0M | handlandmark-lite.bin | 112 | 224x224 |
MediaPipe Hand Landmark (Sparse) | 88K | handlandmark-sparse.json | 5.3M | handlandmark-sparse.bin | 112 | 224x224 |
MediaPipe HandPose (HandDetect) | 126K | handdetect.json | 6.8M | handdetect.bin | 152 | 256x256 |
MediaPipe HandPose (HandSkeleton) | 127K | handskeleton.json | 5.3M | handskeleton.bin | 145 | 256x256 |
MediaPipe Iris | 120K | iris.json | 2.5M | iris.bin | 191 | 64x64 |
MediaPipe Meet | 94K | meet.json | 364K | meet.bin | 163 | 144x256 |
MediaPipe Selfie | 82K | selfie.json | 208K | selfie.bin | 136 | 256x256 |
MoveNet-Lightning | 158K | movenet-lightning.json | 4.5M | movenet-lightning.bin | 180 | 192x192 |
MoveNet-MultiPose | 235K | movenet-multipose.json | 9.1M | movenet-multipose.bin | 303 | 256x256 |
MoveNet-Thunder | 158K | movenet-thunder.json | 12M | movenet-thunder.bin | 178 | 256x256 |
NanoDet | 255K | nanodet.json | 7.3M | nanodet.bin | 229 | 416x416 |
Oarriaga Emotion | 18K | emotion.json | 802K | emotion.bin | 23 | 64x64 |
Oarriaga Gender | 30K | gender.json | 198K | gender.bin | 39 | 64x64 |
HSE-AffectNet | 47K | affectnet-mobilenet.json | 6.7M | affectnet-mobilenet.bin | 64 | 224x224 |
PoseNet | 47K | posenet.json | 4.8M | posenet.bin | 62 | 385x385 |
Sirius-AI MobileFaceNet | 125K | mobilefacenet.json | 5.0M | mobilefacenet.bin | 139 | 112x112 |
SSR-Net Age (IMDB) | 93K | age.json | 158K | age.bin | 158 | 64x64 |
SSR-Net Gender (IMDB) | 92K | gender-ssrnet-imdb.json | 158K | gender-ssrnet-imdb.bin | 157 | 64x64 |
Robust Video Matting | 600K | rvm.json | 3.6M | rvm.bin | 425 | 512x512 |
Note: All model definition JSON files are reformatted for human readability
Credits for the original model implementations:
- Age & Gender Prediction: SSR-Net
- Anti-Spoofing: Real-or-Fake
- Body Pose Detection: BlazePose
- Body Pose Detection: EfficientPose
- Body Pose Detection: MoveNet
- Body Pose Detection: PoseNet
- Body Segmentation: MediaPipe Meet
- Body Segmentation: MediaPipe Selfie
- Body Segmentation: Robust Video Matting
- Emotion Prediction: Oarriaga
- Emotion Prediction: HSE-AffectNet
- Eye Iris Details: MediaPipe Iris
- Face Description: HSE-FaceRes
- Face Detection: MediaPipe BlazeFace
- Face Embedding: BecauseofAI MobileFace
- Face Embedding: DeepInsight InsightFace
- Facial Spatial Geometry: MediaPipe FaceMesh
- Facial Spatial Geometry with Attention: MediaPipe FaceMesh Attention Variation
- Gender, Emotion, Age, Race Prediction: GEAR Predictor
- Hand Detection & Skeleton: MediaPipe HandPose
- Hand Tracking: HandTracking
- Image Filters: WebGLImageFilter
- Object Detection: MB3-CenterNet
- Object Detection: NanoDet
- Pinto Model Zoo: Pinto
Models are included under the license inherited from their original source
Model code has changed substantially from the source, so it is considered a derivative work and not simple re-publishing