A human pose estimation model is a computer vision system that detects and localizes key human body joints (keypoints) from images or video, reconstructing a person’s pose in 2D or 3D. It underpins applications like motion analysis, AR/VR, sports analytics, and human–computer interaction by turning raw pixels into structured representations of human movement.
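For concreteness, the sketch below shows one common way that structured output looks in code: a list of named 2D keypoints with confidence scores, following the widely used 17-keypoint COCO convention. The `Keypoint` and `Pose2D` classes and the sample values are purely illustrative, not any particular library's API.

```python
from dataclasses import dataclass
from typing import List

# The 17 keypoints of the COCO convention; other models use different,
# larger sets (e.g. BlazePose predicts 33 landmarks).
COCO_KEYPOINTS = [
    "nose", "left_eye", "right_eye", "left_ear", "right_ear",
    "left_shoulder", "right_shoulder", "left_elbow", "right_elbow",
    "left_wrist", "right_wrist", "left_hip", "right_hip",
    "left_knee", "right_knee", "left_ankle", "right_ankle",
]

@dataclass
class Keypoint:
    name: str
    x: float        # horizontal coordinate (pixels or normalized)
    y: float        # vertical coordinate (pixels or normalized)
    score: float    # detection confidence in [0, 1]

@dataclass
class Pose2D:
    keypoints: List[Keypoint]

    def visible(self, threshold: float = 0.5) -> List[Keypoint]:
        """Keypoints the model is reasonably confident about."""
        return [k for k in self.keypoints if k.score >= threshold]

# Illustrative detection for a single person (values are placeholders).
pose = Pose2D([Keypoint(n, x=0.0, y=0.0, score=0.9) for n in COCO_KEYPOINTS])
print(len(pose.visible()))  # -> 17
```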
OpenPose: One of the earliest widely adopted open-source frameworks for multi-person 2D pose estimation; it remains a common baseline in research.
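A minimal usage sketch with OpenPose's Python wrapper is shown below. It assumes pyopenpose was built with Python bindings enabled and that the model folder path is adjusted to your installation; the exact wrapper calls vary somewhat between OpenPose releases, so treat this as an outline rather than a drop-in snippet.

```python
# Sketch of single-image inference with the pyopenpose wrapper
# (assumes OpenPose >= 1.7 built with Python bindings).
import cv2
import pyopenpose as op

params = {"model_folder": "/path/to/openpose/models"}  # assumption: adjust to your install
wrapper = op.WrapperPython()
wrapper.configure(params)
wrapper.start()

datum = op.Datum()
datum.cvInputData = cv2.imread("people.jpg")           # placeholder input image
wrapper.emplaceAndPop(op.VectorDatum([datum]))

# With the default BODY_25 model, poseKeypoints has shape
# (num_people, 25, 3): x, y, confidence per keypoint.
print(datum.poseKeypoints.shape)
```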
BlazePose: Google’s lightweight pose estimation model, optimized for mobile and real-time applications and distributed through MediaPipe.
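The sketch below runs BlazePose on a single image through MediaPipe's Pose solution (the older Solutions API; newer MediaPipe Tasks APIs also expose the model). The image path is a placeholder.

```python
# Minimal single-image sketch with MediaPipe Pose (BlazePose under the hood).
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

image = cv2.imread("person.jpg")  # placeholder path
with mp_pose.Pose(static_image_mode=True, model_complexity=1) as pose:
    # MediaPipe expects RGB input; OpenCV loads images as BGR.
    results = pose.process(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))

if results.pose_landmarks:
    # 33 landmarks, each with normalized x, y, depth z and a visibility score.
    for lm in results.pose_landmarks.landmark:
        print(lm.x, lm.y, lm.z, lm.visibility)
```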
MoveNet: Google’s family of ultra-fast pose estimation models, optimized for real-time use and deployable via TensorFlow.js and TFLite.
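Below is a sketch of single-person inference with the MoveNet Lightning variant loaded from TensorFlow Hub. The hub URL, input size, and output layout follow the published model card, but should be verified against the current version.

```python
# Sketch: MoveNet SinglePose Lightning from TensorFlow Hub.
import tensorflow as tf
import tensorflow_hub as hub

model = hub.load("https://tfhub.dev/google/movenet/singlepose/lightning/4")
movenet = model.signatures["serving_default"]

image = tf.io.read_file("person.jpg")                 # placeholder path
image = tf.image.decode_jpeg(image)
# Lightning expects a 192x192 int32 batch; the Thunder variant uses 256x256.
inp = tf.image.resize_with_pad(tf.expand_dims(image, 0), 192, 192)
inp = tf.cast(inp, tf.int32)

outputs = movenet(inp)
# Shape (1, 1, 17, 3): 17 COCO keypoints as (y, x, confidence), normalized to [0, 1].
keypoints = outputs["output_0"].numpy()
print(keypoints.shape)
```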
HRNet: A research-grade architecture that maintains high-resolution representations throughout the network, yielding highly accurate keypoint localization.
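The toy PyTorch module below illustrates the core HRNet idea: keep a high-resolution branch alive end to end and repeatedly exchange features with lower-resolution branches. It is a conceptual sketch of the fusion pattern, not the actual HRNet implementation.

```python
# Toy sketch of HRNet-style multi-resolution fusion (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoBranchFusion(nn.Module):
    def __init__(self, ch_hi: int = 32, ch_lo: int = 64):
        super().__init__()
        self.hi = nn.Conv2d(ch_hi, ch_hi, 3, padding=1)       # high-res branch
        self.lo = nn.Conv2d(ch_lo, ch_lo, 3, padding=1)       # low-res branch
        self.lo_to_hi = nn.Conv2d(ch_lo, ch_hi, 1)            # 1x1 conv, then upsample
        self.hi_to_lo = nn.Conv2d(ch_hi, ch_lo, 3, stride=2, padding=1)  # strided downsample

    def forward(self, x_hi, x_lo):
        h, l = F.relu(self.hi(x_hi)), F.relu(self.lo(x_lo))
        # Fusion: each branch receives information from the other resolution.
        h_out = h + F.interpolate(self.lo_to_hi(l), size=h.shape[-2:],
                                  mode="bilinear", align_corners=False)
        l_out = l + self.hi_to_lo(h)
        return h_out, l_out

x_hi = torch.randn(1, 32, 64, 48)   # high-resolution feature map
x_lo = torch.randn(1, 64, 32, 24)   # half-resolution feature map
h, l = TwoBranchFusion()(x_hi, x_lo)
print(h.shape, l.shape)  # torch.Size([1, 32, 64, 48]) torch.Size([1, 64, 32, 24])
```

In the real network, several such exchange stages are stacked and the final keypoint heatmaps are predicted from the high-resolution branch, which is what gives the architecture its localization accuracy.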
AlphaPose: A high-accuracy open-source multi-person pose estimator, often used in research and in applications that need precise keypoint localization.