PoseNet

PoseNet is a TensorFlow.js library for real-time human pose estimation in web browsers. It detects human poses from images or video streams, providing keypoint locations for body parts with confidence scores. The library supports both single and multiple person pose detection using pre-trained MobileNetV1 and ResNet50 neural network architectures.

Package Information

  • Package Name: @tensorflow-models/posenet
  • Package Type: npm
  • Language: TypeScript
  • Installation: npm install @tensorflow-models/posenet

Core Imports

import * as posenet from '@tensorflow-models/posenet';

For named ES module imports:

import { load, version, partNames, partIds, poseChain, partChannels } from '@tensorflow-models/posenet';

For CommonJS:

const posenet = require('@tensorflow-models/posenet');

Note: PoseNet requires TensorFlow.js core as a peer dependency; install and import it alongside the model:

import * as tf from '@tensorflow/tfjs-core';

Basic Usage

import * as posenet from '@tensorflow-models/posenet';

// Load the model
const net = await posenet.load();

// Single person pose estimation
const pose = await net.estimateSinglePose(imageElement, {
  flipHorizontal: false
});

console.log('Pose score:', pose.score);
console.log('Keypoints:', pose.keypoints);

// Multiple person pose estimation
const poses = await net.estimateMultiplePoses(imageElement, {
  flipHorizontal: false,
  maxDetections: 5,
  scoreThreshold: 0.5,
  nmsRadius: 20
});

poses.forEach((pose, i) => {
  console.log(`Pose ${i} score:`, pose.score);
});
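
The same calls work frame by frame on video. A minimal sketch of a webcam loop, assuming a playing video element with id "webcam" (the element id, renderLoop name, and loop structure are illustrative, not part of the library):

import * as posenet from '@tensorflow-models/posenet';

const video = document.getElementById('webcam') as HTMLVideoElement;
const net = await posenet.load();

// Illustrative loop: re-estimate the pose on every animation frame
async function renderLoop() {
  const pose = await net.estimateSinglePose(video, { flipHorizontal: true });
  console.log('Score:', pose.score);
  requestAnimationFrame(renderLoop);
}
requestAnimationFrame(renderLoop);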

Architecture

PoseNet is built around several key components:

  • Model Loading: Configurable neural network architectures (MobileNetV1, ResNet50) with various trade-offs between speed and accuracy
  • Pose Estimation Engine: Core algorithms for single and multi-person pose detection with different decoding strategies
  • Keypoint System: 17-point human skeleton model with confidence scoring for each body part
  • Utility Functions: Helper functions for pose manipulation, scaling, and geometric calculations
  • WebGL Acceleration: TensorFlow.js backend integration for GPU-accelerated inference

Capabilities

Model Loading and Configuration

Load and configure PoseNet models with various architectures and performance trade-offs. Choose between MobileNetV1 for speed or ResNet50 for accuracy.

function load(config?: ModelConfig): Promise<PoseNet>;

interface ModelConfig {
  architecture: PoseNetArchitecture;
  outputStride: PoseNetOutputStride;
  inputResolution: InputResolution;
  multiplier?: MobileNetMultiplier;
  modelUrl?: string;
  quantBytes?: PoseNetQuantBytes;
}
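
As a sketch of the speed/accuracy trade-off, the two configurations below mirror commonly used settings for each architecture; the specific values are illustrative, and any combination allowed by the types above is valid:

import * as posenet from '@tensorflow-models/posenet';

// Faster, smaller model; well suited to mobile and low-power GPUs
const mobileNet = await posenet.load({
  architecture: 'MobileNetV1',
  outputStride: 16,
  inputResolution: { width: 640, height: 480 },
  multiplier: 0.75
});

// Slower but more accurate; quantBytes: 2 halves the weight download
const resNet = await posenet.load({
  architecture: 'ResNet50',
  outputStride: 32,
  inputResolution: { width: 257, height: 200 },
  quantBytes: 2
});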

Single Person Pose Estimation

Fast pose detection optimized for single person scenarios. Ideal when only one person is expected in the image.

class PoseNet {
  estimateSinglePose(
    input: PosenetInput,
    config?: SinglePersonInterfaceConfig
  ): Promise<Pose>;
}

interface SinglePersonInterfaceConfig {
  flipHorizontal: boolean;
}
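
A short usage sketch; the video element lookup and the flipHorizontal choice (useful for mirrored, user-facing webcam input) are illustrative:

import * as posenet from '@tensorflow-models/posenet';

const net = await posenet.load();
const video = document.querySelector('video') as HTMLVideoElement;

// flipHorizontal: true mirrors keypoints to match a selfie-style camera preview
const pose = await net.estimateSinglePose(video, { flipHorizontal: true });

for (const { part, score, position } of pose.keypoints) {
  console.log(`${part}: (${position.x.toFixed(0)}, ${position.y.toFixed(0)}) score ${score.toFixed(2)}`);
}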

Multiple Person Pose Estimation

Robust pose detection for images containing multiple people. Uses non-maximum suppression to avoid duplicate detections.

class PoseNet {
  estimateMultiplePoses(
    input: PosenetInput, 
    config?: MultiPersonInferenceConfig
  ): Promise<Pose[]>;
}

interface MultiPersonInferenceConfig {
  flipHorizontal: boolean;
  maxDetections?: number;
  scoreThreshold?: number;
  nmsRadius?: number;
}
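
A sketch that tunes the detector and then drops weak detections afterwards; the threshold values and the img element lookup are illustrative:

import * as posenet from '@tensorflow-models/posenet';

const net = await posenet.load();
const imageElement = document.querySelector('img') as HTMLImageElement;

const poses = await net.estimateMultiplePoses(imageElement, {
  flipHorizontal: false,
  maxDetections: 10,    // upper bound on returned poses
  scoreThreshold: 0.6,  // candidate keypoints below this score are ignored
  nmsRadius: 30         // pixels; suppresses near-duplicate detections
});

// Keep only confidently detected people (0.5 is an illustrative cutoff)
const confident = poses.filter((pose) => pose.score > 0.5);
console.log(`${confident.length} of ${poses.length} poses kept`);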

Pose Processing and Utilities

Utility functions for manipulating, scaling, and analyzing detected poses. Includes keypoint relationships and geometric calculations.

function getAdjacentKeyPoints(keypoints: Keypoint[], minConfidence: number): Keypoint[][];
function getBoundingBox(keypoints: Keypoint[]): {maxX: number, maxY: number, minX: number, minY: number};
function getBoundingBoxPoints(keypoints: Keypoint[]): Vector2D[];
function scalePose(pose: Pose, scaleY: number, scaleX: number, offsetY?: number, offsetX?: number): Pose;
function scaleAndFlipPoses(poses: Pose[], imageSize: [number, number], inputResolution: [number, number], padding: Padding, flipHorizontal?: boolean): Pose[];
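
For instance, a minimal sketch that renders a detected pose onto a 2D canvas using getAdjacentKeyPoints and getBoundingBox (the drawSkeleton helper and the 0.5 confidence threshold are illustrative):

import * as posenet from '@tensorflow-models/posenet';

// Illustrative helper, not part of the library
function drawSkeleton(pose: posenet.Pose, ctx: CanvasRenderingContext2D) {
  const minConfidence = 0.5;

  // Each entry is a pair of connected keypoints that both pass minConfidence
  for (const [from, to] of posenet.getAdjacentKeyPoints(pose.keypoints, minConfidence)) {
    ctx.beginPath();
    ctx.moveTo(from.position.x, from.position.y);
    ctx.lineTo(to.position.x, to.position.y);
    ctx.stroke();
  }

  // Outline the whole detection
  const box = posenet.getBoundingBox(pose.keypoints);
  ctx.strokeRect(box.minX, box.minY, box.maxX - box.minX, box.maxY - box.minY);
}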

Keypoint System and Body Parts

Constants and data structures defining the 17-point human skeleton model used by PoseNet.

const partNames: string[];                      // the 17 keypoint names, e.g. 'nose', 'leftEye'
const partIds: {[jointName: string]: number};   // maps a part name to its keypoint index
const poseChain: [string, string][];            // parent/child part pairs forming the skeleton
const partChannels: string[];                   // channel names in the model's output order
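
A small sketch of how these constants relate to an estimated pose; the pose variable stands in for a result of the estimate calls above:

import { partNames, partIds, poseChain, Pose } from '@tensorflow-models/posenet';

declare const pose: Pose; // stand-in for a result of an estimate call

console.log(partNames.length);  // 17 keypoint names, e.g. 'nose'
console.log(poseChain[0]);      // a connected pair, e.g. ['nose', 'leftEye']

// Keypoints come back in partNames order, so partIds can index them directly
const nose = pose.keypoints[partIds['nose']];
console.log(nose.part, nose.position);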

Advanced APIs and Low-Level Functions

Low-level decoding functions and neural network classes for custom pose estimation implementations and advanced use cases.

function decodeSinglePose(heatmapScores: tf.Tensor3D, offsets: tf.Tensor3D, outputStride: PoseNetOutputStride): Promise<Pose>;
function decodeMultiplePoses(scoresBuffer: TensorBuffer3D, offsetsBuffer: TensorBuffer3D, displacementsFwdBuffer: TensorBuffer3D, displacementsBwdBuffer: TensorBuffer3D, outputStride: number, maxPoseDetections: number, scoreThreshold?: number, nmsRadius?: number): Pose[];
class MobileNet extends BaseModel;
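
A heavily hedged sketch of calling decodeSinglePose directly; real inputs are the raw heatmap and offset tensors produced by the underlying network, and the zero-filled tensors below stand in purely to illustrate the expected shapes:

import * as tf from '@tensorflow/tfjs-core';
import { decodeSinglePose } from '@tensorflow-models/posenet';

// Zero-filled stand-ins for the network's raw outputs: with a 257px input and
// outputStride 16 the spatial grid is 17x17; scores carry one channel per
// keypoint (17) and offsets two per keypoint (34)
const heatmapScores = tf.zeros([17, 17, 17]) as tf.Tensor3D;
const offsets = tf.zeros([17, 17, 34]) as tf.Tensor3D;

const pose = await decodeSinglePose(heatmapScores, offsets, 16);
console.log(pose.keypoints.length); // 17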

Package Information

Version information and package metadata.

const version: string;

Core Types

interface Pose {
  keypoints: Keypoint[];
  score: number;
}

interface Keypoint {
  score: number;
  position: Vector2D;
  part: string;
}

interface Vector2D {
  x: number;
  y: number;
}

interface Padding {
  top: number;
  bottom: number;
  left: number;
  right: number;
}

type PosenetInput = ImageData | HTMLImageElement | HTMLCanvasElement | HTMLVideoElement | tf.Tensor3D;

type PoseNetArchitecture = 'ResNet50' | 'MobileNetV1';
type PoseNetOutputStride = 32 | 16 | 8;
type PoseNetQuantBytes = 1 | 2 | 4;
type MobileNetMultiplier = 0.50 | 0.75 | 1.0;
type InputResolution = number | {width: number, height: number};
type TensorBuffer3D = tf.TensorBuffer<tf.Rank.R3>;
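
As an illustration of how these types compose, a small hypothetical helper (confidentKeypoints is not part of the library):

import { Pose, Keypoint } from '@tensorflow-models/posenet';

// Hypothetical helper: the keypoints of a pose at or above a confidence threshold
function confidentKeypoints(pose: Pose, minScore = 0.5): Keypoint[] {
  return pose.keypoints.filter((kp) => kp.score >= minScore);
}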