Pretrained PoseNet model in TensorFlow.js for real-time human pose estimation from images and video streams
```bash
npx @tessl/cli install tessl/npm-tensorflow-models--posenet@2.2.0
```

PoseNet is a TensorFlow.js library for real-time human pose estimation in web browsers. It detects human poses in images or video streams, providing keypoint locations for body parts along with confidence scores. The library supports both single-person and multi-person pose detection using pretrained MobileNetV1 and ResNet50 neural network architectures.
Install with npm:

```bash
npm install @tensorflow-models/posenet
```

Then import the library:

```javascript
import * as posenet from '@tensorflow-models/posenet';
```

For ES6 destructuring:
```javascript
import { load, version, partNames, partIds, poseChain, partChannels } from '@tensorflow-models/posenet';
```

For CommonJS:
```javascript
const posenet = require('@tensorflow-models/posenet');
```

Note: PoseNet also requires TensorFlow.js core:
```javascript
import * as tf from '@tensorflow/tfjs-core';
```

Basic usage:

```javascript
import * as posenet from '@tensorflow-models/posenet';

// Load the model
const net = await posenet.load();

// Single person pose estimation
const pose = await net.estimateSinglePose(imageElement, {
  flipHorizontal: false
});
console.log('Pose score:', pose.score);
console.log('Keypoints:', pose.keypoints);

// Multiple person pose estimation
const poses = await net.estimateMultiplePoses(imageElement, {
  flipHorizontal: false,
  maxDetections: 5,
  scoreThreshold: 0.5,
  nmsRadius: 20
});
poses.forEach((pose, i) => {
  console.log(`Pose ${i} score:`, pose.score);
});
```

PoseNet is built around several key components:
## Model Loading and Configuration

Load and configure PoseNet models with various architectures and performance trade-offs. Choose MobileNetV1 for speed or ResNet50 for accuracy.

```typescript
function load(config?: ModelConfig): Promise<PoseNet>;
interface ModelConfig {
  architecture: PoseNetArchitecture;
  outputStride: PoseNetOutputStride;
  inputResolution: InputResolution;
  multiplier?: MobileNetMultiplier;
  modelUrl?: string;
  quantBytes?: PoseNetQuantBytes;
}
```
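These configuration fields are passed directly to `posenet.load()`. A minimal sketch of loading each architecture with commonly used settings (the exact values are a starting point, not a requirement, and should be tuned for your latency and accuracy needs):

```javascript
import * as posenet from '@tensorflow-models/posenet';

// Smaller, faster MobileNetV1 model: a good default for browsers on modest hardware
const mobileNet = await posenet.load({
  architecture: 'MobileNetV1',
  outputStride: 16,
  inputResolution: { width: 513, height: 513 },
  multiplier: 0.75
});

// Larger, more accurate ResNet50 model: bigger download and slower inference
const resNet = await posenet.load({
  architecture: 'ResNet50',
  outputStride: 32,
  inputResolution: { width: 257, height: 257 },
  quantBytes: 2
});
```

A lower `outputStride` and a higher `multiplier` increase accuracy at the cost of speed; `quantBytes` trades model size against weight precision.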
## Single Person Pose Estimation

Fast pose detection optimized for single-person scenarios. Ideal when only one person is expected in the image.

```typescript
class PoseNet {
  estimateSinglePose(
    input: PosenetInput,
    config?: SinglePersonInterfaceConfig
  ): Promise<Pose>;
}

interface SinglePersonInterfaceConfig {
  flipHorizontal: boolean;
}
```
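A minimal sketch of single-pose estimation on live video in a render loop; `video` is assumed to be a playing `HTMLVideoElement` (for example, one wired up to `getUserMedia`), and `flipHorizontal: true` mirrors the result for selfie-style cameras:

```javascript
import * as posenet from '@tensorflow-models/posenet';

const net = await posenet.load();
const video = document.getElementById('video'); // assumed: a playing <video> element

async function detectFrame() {
  const pose = await net.estimateSinglePose(video, { flipHorizontal: true });

  // Keep only keypoints the model is reasonably confident about
  pose.keypoints
    .filter((keypoint) => keypoint.score > 0.5)
    .forEach(({ part, position }) => console.log(part, position.x, position.y));

  requestAnimationFrame(detectFrame);
}

detectFrame();
```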
## Multiple Person Pose Estimation

Robust pose detection for images containing multiple people. Uses non-maximum suppression to avoid duplicate detections.

```typescript
class PoseNet {
  estimateMultiplePoses(
    input: PosenetInput,
    config?: MultiPersonInferenceConfig
  ): Promise<Pose[]>;
}

interface MultiPersonInferenceConfig {
  flipHorizontal: boolean;
  maxDetections?: number;
  scoreThreshold?: number;
  nmsRadius?: number;
}
```
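A minimal sketch showing how the optional fields interact when tuning multi-person detection; `imageElement` is assumed to be an already loaded `<img>`:

```javascript
import * as posenet from '@tensorflow-models/posenet';

const net = await posenet.load();
const imageElement = document.getElementById('people'); // assumed: a loaded <img> element

const poses = await net.estimateMultiplePoses(imageElement, {
  flipHorizontal: false,
  maxDetections: 10,   // upper bound on the number of returned poses
  scoreThreshold: 0.6, // discard low-confidence candidate keypoints
  nmsRadius: 30        // suppress duplicate detections within this pixel radius
});

// Keep only poses with a reasonable overall score
const confident = poses.filter((pose) => pose.score > 0.3);
console.log(`Detected ${confident.length} people`);
```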
## Pose Utilities

Utility functions for manipulating, scaling, and analyzing detected poses. Includes keypoint relationships and geometric calculations.

```typescript
function getAdjacentKeyPoints(keypoints: Keypoint[], minConfidence: number): Keypoint[][];
function getBoundingBox(keypoints: Keypoint[]): {maxX: number, maxY: number, minX: number, minY: number};
function getBoundingBoxPoints(keypoints: Keypoint[]): Vector2D[];
function scalePose(pose: Pose, scaleY: number, scaleX: number, offsetY?: number, offsetX?: number): Pose;
function scaleAndFlipPoses(poses: Pose[], imageSize: [number, number], inputResolution: [number, number], padding: Padding, flipHorizontal?: boolean): Pose[];
```
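A minimal sketch that uses `getAdjacentKeyPoints` and `getBoundingBox` to draw a detected pose onto a 2D canvas; `pose` and `ctx` are assumed to come from a prior estimate call and your own canvas setup:

```javascript
import * as posenet from '@tensorflow-models/posenet';

function drawPose(pose, ctx, minConfidence = 0.5) {
  // One line segment per pair of connected, sufficiently confident keypoints
  const adjacentPairs = posenet.getAdjacentKeyPoints(pose.keypoints, minConfidence);
  adjacentPairs.forEach(([from, to]) => {
    ctx.beginPath();
    ctx.moveTo(from.position.x, from.position.y);
    ctx.lineTo(to.position.x, to.position.y);
    ctx.stroke();
  });

  // Axis-aligned bounding box around all keypoints
  const { minX, minY, maxX, maxY } = posenet.getBoundingBox(pose.keypoints);
  ctx.strokeRect(minX, minY, maxX - minX, maxY - minY);
}
```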
## Keypoint System and Body Parts

Constants and data structures defining the 17-point human skeleton model used by PoseNet.

```typescript
const partNames: string[];
const partIds: {[jointName: string]: number};
const poseChain: [string, string][];
const partChannels: string[];
```
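A minimal sketch of using these constants to translate between part names and keypoint indices and to walk the skeleton connections in `poseChain`:

```javascript
import * as posenet from '@tensorflow-models/posenet';

// 17 part names in keypoint order ('nose', 'leftEye', ..., 'rightAnkle')
console.log(posenet.partNames.length); // 17

// Look up the keypoint index for a named part
console.log('nose index:', posenet.partIds['nose']);

// poseChain lists connected part-name pairs that form the skeleton
posenet.poseChain.forEach(([parent, child]) => {
  console.log(`${parent} -> ${child}`);
});
```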
## Advanced APIs and Low-Level Functions

Low-level decoding functions and neural network classes for custom pose estimation implementations and advanced use cases.

```typescript
function decodeSinglePose(heatmapScores: tf.Tensor3D, offsets: tf.Tensor3D, outputStride: PoseNetOutputStride): Promise<Pose>;
function decodeMultiplePoses(scoresBuffer: TensorBuffer3D, offsetsBuffer: TensorBuffer3D, displacementsFwdBuffer: TensorBuffer3D, displacementsBwdBuffer: TensorBuffer3D, outputStride: number, maxPoseDetections: number, scoreThreshold?: number, nmsRadius?: number): Pose[];
class MobileNet extends BaseModel;
```
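These functions operate on raw network output tensors rather than on images. A heavily hedged sketch, assuming you already have heatmap and offset tensors from your own forward pass; here they are stubbed with zero tensors of a plausible shape purely to illustrate the call:

```javascript
import * as tf from '@tensorflow/tfjs-core';
import * as posenet from '@tensorflow-models/posenet';

const outputStride = 16;
// For a 513x513 input with stride 16 the output grid is 33x33.
// In real use these tensors come from a model; zeros are placeholders only.
const heatmapScores = tf.zeros([33, 33, 17]); // one channel per keypoint
const offsets = tf.zeros([33, 33, 34]);       // y/x offsets per keypoint

const pose = await posenet.decodeSinglePose(heatmapScores, offsets, outputStride);
console.log(pose.keypoints.length); // 17
```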
## Version

Version information and package metadata.

```typescript
const version: string;
```

## Types

```typescript
interface Pose {
  keypoints: Keypoint[];
  score: number;
}

interface Keypoint {
  score: number;
  position: Vector2D;
  part: string;
}

interface Vector2D {
  x: number;
  y: number;
}

interface Padding {
  top: number;
  bottom: number;
  left: number;
  right: number;
}

type PosenetInput = ImageData | HTMLImageElement | HTMLCanvasElement | HTMLVideoElement | tf.Tensor3D;
type PoseNetArchitecture = 'ResNet50' | 'MobileNetV1';
type PoseNetOutputStride = 32 | 16 | 8;
type PoseNetQuantBytes = 1 | 2 | 4;
type MobileNetMultiplier = 0.50 | 0.75 | 1.0;
type InputResolution = number | {width: number, height: number};
type TensorBuffer3D = tf.TensorBuffer<tf.Rank.R3>;
```
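A small sketch of consuming these structures in application code; the `midpointBetween` helper is hypothetical, not part of the library, and `pose` is assumed to come from a prior estimate call:

```javascript
// Hypothetical helper: midpoint between two named keypoints, or null if either
// is missing or below the confidence threshold.
function midpointBetween(pose, partA, partB, minConfidence = 0.5) {
  const a = pose.keypoints.find((k) => k.part === partA && k.score >= minConfidence);
  const b = pose.keypoints.find((k) => k.part === partB && k.score >= minConfidence);
  if (!a || !b) return null;
  return {
    x: (a.position.x + b.position.x) / 2,
    y: (a.position.y + b.position.y) / 2
  };
}

// For example, approximate the chest position from the shoulders
const chest = midpointBetween(pose, 'leftShoulder', 'rightShoulder');
```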