tessl/npm-xenova--transformers

State-of-the-art Machine Learning for the web that runs Transformers directly in browsers with no server needed.

Pipelines

Name: tessl/npm-xenova--transformers
Author: tessl

Pipelines provide a high-level, task-specific API for running machine learning models. The pipeline interface is the easiest way to use transformers.js for most ML tasks, automatically handling model loading, preprocessing, and postprocessing.

Capabilities

Main Pipeline Function

Creates a pipeline instance for a specific machine learning task with automatic model selection and preprocessing.

/**
 * Create a pipeline for a specific ML task
 * @param task - The task identifier (see supported tasks below)
 * @param model - Optional model name/path (uses default if not specified)
 * @param options - Configuration options for the pipeline
 * @returns Promise that resolves to a Pipeline instance
 */
async function pipeline(
  task: string,
  model?: string,
  options?: PipelineOptions
): Promise<Pipeline>;

interface PipelineOptions {
  /** Whether to use quantized version of the model (default: true) */
  quantized?: boolean;
  /** Callback function to track model download progress */
  progress_callback?: (progress: any) => void;
  /** Custom model configuration */
  config?: any;
  /** Directory to cache downloaded models */
  cache_dir?: string;
  /** Only use local files, don't download from remote */
  local_files_only?: boolean;
  /** Model revision/branch to use (default: 'main') */
  revision?: string;
  /** Specific model file name to use */
  model_file_name?: string;
}

Usage Examples:

import { pipeline } from "@xenova/transformers";

// Basic usage with default model
const classifier = await pipeline("sentiment-analysis");
const result = await classifier("I love this library!");
// Output: [{ label: 'POSITIVE', score: 0.999 }]

// Custom model specification
const translator = await pipeline("translation", "Xenova/opus-mt-en-de");
const translation = await translator("Hello world");

// With custom options (Xenova/gpt2 is an ONNX export of GPT-2)
const generator = await pipeline("text-generation", "Xenova/gpt2", {
  quantized: false,
  progress_callback: (progress) => console.log(progress),
});
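The `progress_callback` option is typed as `any` above, so what it receives is not specified here. The following is a minimal sketch of a callback that turns per-file download events into one overall percentage. The event fields (`status`, `file`, `loaded`, `total`) and the helper name `createDownloadTracker` are assumptions for illustration, not part of the documented API.

```typescript
// Assumed shape of the objects passed to progress_callback. The source types
// the argument as `any`, so every field is treated as optional.
interface DownloadProgress {
  status?: string;
  file?: string;
  loaded?: number;
  total?: number;
}

// Hypothetical helper: remembers the latest byte counts per file and reports
// the combined percentage across all files seen so far.
function createDownloadTracker(onPercent: (percent: number) => void) {
  const files: Record<string, { loaded: number; total: number }> = {};

  return (event: DownloadProgress): void => {
    // Ignore events that do not carry byte counts for a named file.
    if (
      event.status !== "progress" ||
      event.file === undefined ||
      event.loaded === undefined ||
      event.total === undefined ||
      event.total <= 0
    ) {
      return;
    }
    files[event.file] = { loaded: event.loaded, total: event.total };

    let loaded = 0;
    let total = 0;
    for (const key of Object.keys(files)) {
      loaded += files[key].loaded;
      total += files[key].total;
    }
    onPercent(Math.round((loaded / total) * 100));
  };
}
```

Under these assumptions, the tracker could be passed as `progress_callback: createDownloadTracker((p) => console.log(p + "%"))` in the options object. The percentage can move backwards when a new file starts downloading, because the total grows as files are discovered.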

Text Processing Tasks

Text Classification

Classify text into predefined categories (sentiment analysis, topic classification, etc.).

interface TextClassificationPipeline {
  (
    texts: string | string[],
    options?: {
      top_k?: number;
      function_to_apply?: string;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}

Supported Task Names: "text-classification", "sentiment-analysis"

Usage Example:

const classifier = await pipeline("sentiment-analysis");
const results = await classifier(["I love this!", "This is terrible"]);
// Results (one prediction per input when only the top label is returned):
// [
//   { label: 'POSITIVE', score: 0.999 },
//   { label: 'NEGATIVE', score: 0.998 }
// ]

Token Classification

Classify individual tokens (Named Entity Recognition, Part-of-Speech tagging).

interface TokenClassificationPipeline {
  (
    texts: string | string[],
    options?: {
      aggregation_strategy?: string;
      ignore_labels?: string[];
    }
  ): Promise<Array<{
    entity: string;
    score: number;
    index: number;
    word: string;
    start: number;
    end: number;
  }>>;
}

Supported Task Names: "token-classification", "ner"
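Token-level predictions are often easier to use once adjacent tokens are merged into whole entities. The sketch below shows one way to do that on the result shape above. It assumes IOB-style labels (`B-PER`, `I-PER`) and WordPiece continuation tokens prefixed with `##`; both depend on the model and tokenizer, and `mergeEntities` is a hypothetical helper, not a library function.

```typescript
// Subset of the fields in the TokenClassificationPipeline result above.
interface TokenPrediction {
  entity: string;
  score: number;
  index: number;
  word: string;
}

interface MergedEntity {
  type: string;
  text: string;
  score: number; // mean score of the merged tokens
}

// Hypothetical helper. Assumes IOB labels and "##" subword prefixes;
// other labeling schemes or tokenizers need different rules.
function mergeEntities(tokens: TokenPrediction[]): MergedEntity[] {
  const merged: MergedEntity[] = [];
  let count = 0; // tokens merged into the entity currently being built
  let lastIndex = -2;

  for (const token of tokens) {
    if (token.entity === "O") continue;
    const type = token.entity.replace(/^[BI]-/, "");
    const current = merged.length > 0 ? merged[merged.length - 1] : undefined;

    if (
      current !== undefined &&
      current.type === type &&
      token.index === lastIndex + 1 &&
      token.entity.indexOf("B-") !== 0
    ) {
      // Continue the current entity: glue subwords, space-separate words.
      const isSubword = token.word.indexOf("##") === 0;
      current.text += isSubword ? token.word.slice(2) : " " + token.word;
      current.score = (current.score * count + token.score) / (count + 1);
      count += 1;
    } else {
      merged.push({ type, text: token.word, score: token.score });
      count = 1;
    }
    lastIndex = token.index;
  }
  return merged;
}
```

The helper only reads `entity`, `score`, `index`, and `word`, so the output of a `"ner"` pipeline call can be passed in directly if it matches the interface above.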

Question Answering

Extract answers from context text based on questions.

interface QuestionAnsweringPipeline {
  (
    question: string,
    context: string,
    options?: {
      top_k?: number;
    }
  ): Promise<{
    answer: string;
    score: number;
    start: number;
    end: number;
  }>;
}

Supported Task Names: "question-answering"
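When the answer may be in any of several passages, one simple approach is to ask the same question of each context and keep the highest-scoring result. The sketch below takes the pipeline as a parameter so it stays independent of how the pipeline was created. `bestAnswer` is a hypothetical helper, and comparing scores across contexts is a heuristic rather than a calibrated ranking.

```typescript
interface QaResult {
  answer: string;
  score: number;
}

// Structural type for the callable returned by pipeline("question-answering").
type QaFunction = (question: string, context: string) => Promise<QaResult>;

type RankedAnswer = QaResult & { contextIndex: number };

// Hypothetical helper: queries each context in turn and returns the answer
// with the highest score, or null when no contexts are given.
async function bestAnswer(
  qa: QaFunction,
  question: string,
  contexts: string[]
): Promise<RankedAnswer | null> {
  let best: RankedAnswer | null = null;
  for (let i = 0; i < contexts.length; i++) {
    const result = await qa(question, contexts[i]);
    if (best === null || result.score > best.score) {
      best = { answer: result.answer, score: result.score, contextIndex: i };
    }
  }
  return best;
}
```

Contexts are processed sequentially to keep memory use predictable; running them in parallel is possible but may not be faster when a single model instance is shared.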

Fill Mask

Fill masked tokens in text.

interface FillMaskPipeline {
  (
    texts: string | string[],
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    score: number;
    token: number;
    token_str: string;
    sequence: string;
  }>>;
}

Supported Task Names: "fill-mask"

Text Generation

Generate text continuations from input prompts.

interface TextGenerationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
      do_sample?: boolean;
      temperature?: number;
      top_k?: number;
      top_p?: number;
    }
  ): Promise<Array<{
    generated_text: string;
  }>>;
}

Supported Task Names: "text-generation"
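Text generation models commonly return the prompt followed by the continuation in `generated_text`. A minimal sketch of separating the two is shown below. It assumes the output begins with the exact prompt string; `continuationsOnly` is a hypothetical helper, and outputs that do not start with the prompt are returned unchanged.

```typescript
interface GeneratedText {
  generated_text: string;
}

// Hypothetical helper: removes the leading prompt from each generated text
// when it is present, and leaves other outputs untouched.
function continuationsOnly(prompt: string, outputs: GeneratedText[]): string[] {
  return outputs.map((output) =>
    output.generated_text.indexOf(prompt) === 0
      ? output.generated_text.slice(prompt.length)
      : output.generated_text
  );
}
```

For example, a prompt of `"Once upon"` with the output `"Once upon a time"` yields `" a time"`, including the leading space.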

Text-to-Text Generation

Generate text from text input (includes summarization, translation).

interface Text2TextGenerationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
      do_sample?: boolean;
      temperature?: number;
    }
  ): Promise<Array<{
    generated_text: string;
  }>>;
}

interface SummarizationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
      min_new_tokens?: number;
    }
  ): Promise<Array<{
    summary_text: string;
  }>>;
}

interface TranslationPipeline {
  (
    texts: string | string[],
    options?: {
      max_new_tokens?: number;
    }
  ): Promise<Array<{
    translation_text: string;
  }>>;
}

Supported Task Names: "text2text-generation", "summarization", "translation"

Zero-Shot Classification

Classify text against candidate labels supplied at call time, without training the model on those labels.

interface ZeroShotClassificationPipeline {
  (
    texts: string | string[],
    candidate_labels: string[],
    options?: {
      hypothesis_template?: string;
      multi_label?: boolean;
    }
  ): Promise<{
    sequence: string;
    labels: string[];
    scores: number[];
  }>;
}

Supported Task Names: "zero-shot-classification"
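The result keeps labels and scores in two parallel arrays. The sketch below pairs them up and keeps only labels above a score threshold, which is mostly useful with `multi_label: true`, where each label is scored independently. `labelsAbove` is a hypothetical helper operating on the result shape above.

```typescript
interface ZeroShotResult {
  sequence: string;
  labels: string[];
  scores: number[];
}

// Hypothetical helper: zips labels with scores, drops those below minScore,
// and returns the rest sorted from highest to lowest score.
function labelsAbove(
  result: ZeroShotResult,
  minScore: number
): Array<{ label: string; score: number }> {
  return result.labels
    .map((label, i) => ({ label, score: result.scores[i] }))
    .filter((pair) => pair.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```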

Feature Extraction

Extract embeddings from text for similarity tasks.

interface FeatureExtractionPipeline {
  (
    texts: string | string[],
    options?: {
      pooling?: string;
      normalize?: boolean;
      quantize?: boolean;
      precision?: string;
    }
  ): Promise<Tensor>;
}

Supported Task Names: "feature-extraction", "embeddings"
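For similarity tasks, the returned tensor usually needs to be split into one vector per input and compared. The sketch below assumes the tensor exposes a flat `data` array and a `dims` shape of `[batch, hidden]`, which is what a pooled output (for example with `pooling: "mean"`) would look like. `TensorLike`, `rowsOf`, and `cosineSimilarity` are illustrative names, not library exports.

```typescript
// Minimal structural view of a tensor: flat values plus a shape.
interface TensorLike {
  data: ArrayLike<number>;
  dims: number[];
}

// Hypothetical helper: splits a [batch, hidden] tensor into one row per input.
function rowsOf(tensor: TensorLike): number[][] {
  if (tensor.dims.length !== 2) {
    throw new Error("expected a [batch, hidden] tensor");
  }
  const batch = tensor.dims[0];
  const hidden = tensor.dims[1];
  const rows: number[][] = [];
  for (let r = 0; r < batch; r++) {
    const row: number[] = [];
    for (let c = 0; c < hidden; c++) {
      row.push(tensor.data[r * hidden + c]);
    }
    rows.push(row);
  }
  return rows;
}

// Cosine similarity of two equal-length vectors (NaN if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

If the embeddings were produced with `normalize: true`, the cosine similarity reduces to the dot product, so the two norms can be skipped.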

Vision Processing Tasks

Image Classification

Classify images into predefined categories.

interface ImageClassificationPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}

Supported Task Names: "image-classification"

Object Detection

Detect and locate objects in images.

interface ObjectDetectionPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      threshold?: number;
      percentage?: boolean;
    }
  ): Promise<Array<{
    score: number;
    label: string;
    box: {
      xmin: number;
      ymin: number;
      xmax: number;
      ymax: number;
    };
  }>>;
}

Supported Task Names: "object-detection"
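Drawing detections on a canvas requires pixel coordinates. The sketch below converts boxes to pixels under the assumption that `percentage: true` yields coordinates as fractions between 0 and 1 of the image width and height; that interpretation is not stated in the interface above, so verify it against real output. `toPixelBox` and `confidentDetections` are hypothetical helpers.

```typescript
interface BoundingBox {
  xmin: number;
  ymin: number;
  xmax: number;
  ymax: number;
}

interface Detection {
  score: number;
  label: string;
  box: BoundingBox;
}

// Hypothetical helper: scales a fractional box (0..1) to integer pixels.
function toPixelBox(box: BoundingBox, imageWidth: number, imageHeight: number): BoundingBox {
  return {
    xmin: Math.round(box.xmin * imageWidth),
    ymin: Math.round(box.ymin * imageHeight),
    xmax: Math.round(box.xmax * imageWidth),
    ymax: Math.round(box.ymax * imageHeight),
  };
}

// Hypothetical helper: keeps detections at or above minScore and converts
// their boxes to pixel coordinates.
function confidentDetections(
  detections: Detection[],
  minScore: number,
  imageWidth: number,
  imageHeight: number
): Detection[] {
  return detections
    .filter((d) => d.score >= minScore)
    .map((d) => ({
      score: d.score,
      label: d.label,
      box: toPixelBox(d.box, imageWidth, imageHeight),
    }));
}
```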

Zero-Shot Object Detection

Detect objects in images using text descriptions.

interface ZeroShotObjectDetectionPipeline {
  (
    images: ImageInput | ImageInput[],
    candidate_labels: string[],
    options?: {
      threshold?: number;
      percentage?: boolean;
    }
  ): Promise<Array<{
    score: number;
    label: string;
    box: {
      xmin: number;
      ymin: number;
      xmax: number;
      ymax: number;
    };
  }>>;
}

Supported Task Names: "zero-shot-object-detection"

Image Segmentation

Segment objects and regions in images.

interface ImageSegmentationPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      threshold?: number;
      mask_threshold?: number;
      overlap_mask_area_threshold?: number;
    }
  ): Promise<Array<{
    score: number;
    label: string;
    mask: RawImage;
  }>>;
}

Supported Task Names: "image-segmentation"

Zero-Shot Image Classification

Classify images using text descriptions.

interface ZeroShotImageClassificationPipeline {
  (
    images: ImageInput | ImageInput[],
    candidate_labels: string[],
    options?: {
      hypothesis_template?: string;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}

Supported Task Names: "zero-shot-image-classification"

Image-to-Text

Generate text descriptions from images.

interface ImageToTextPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      max_new_tokens?: number;
      do_sample?: boolean;
      temperature?: number;
    }
  ): Promise<Array<{
    generated_text: string;
  }>>;
}

Supported Task Names: "image-to-text"

Image-to-Image

Transform images (super-resolution, style transfer).

interface ImageToImagePipeline {
  (
    images: ImageInput | ImageInput[]
  ): Promise<RawImage[]>;
}

Supported Task Names: "image-to-image"

Depth Estimation

Estimate depth maps from images.

interface DepthEstimationPipeline {
  (
    images: ImageInput | ImageInput[]
  ): Promise<Array<{
    predicted_depth: Tensor;
    depth: RawImage;
  }>>;
}

Supported Task Names: "depth-estimation"

Image Feature Extraction

Extract embeddings from images.

interface ImageFeatureExtractionPipeline {
  (
    images: ImageInput | ImageInput[],
    options?: {
      pool?: boolean;
      normalize?: boolean;
      quantize?: boolean;
      precision?: string;
    }
  ): Promise<Tensor>;
}

Supported Task Names: "image-feature-extraction"

Audio Processing Tasks

Audio Classification

Classify audio content into categories.

interface AudioClassificationPipeline {
  (
    audio: AudioInput | AudioInput[],
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}

Supported Task Names: "audio-classification"

Zero-Shot Audio Classification

Classify audio using text descriptions.

interface ZeroShotAudioClassificationPipeline {
  (
    audio: AudioInput | AudioInput[],
    candidate_labels: string[],
    options?: {
      hypothesis_template?: string;
    }
  ): Promise<Array<{
    label: string;
    score: number;
  }>>;
}

Supported Task Names: "zero-shot-audio-classification"

Automatic Speech Recognition

Convert speech to text.

interface AutomaticSpeechRecognitionPipeline {
  (
    audio: AudioInput | AudioInput[],
    options?: {
      top_k?: number;
      hotwords?: string;
      language?: string;
      task?: string;
      return_timestamps?: boolean | string;
      chunk_length_s?: number;
      stride_length_s?: number;
    }
  ): Promise<{
    text: string;
    chunks?: Array<{
      text: string;
      timestamp: [number, number];
    }>;
  }>;
}

Supported Task Names: "automatic-speech-recognition", "asr"
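When `return_timestamps` is enabled, the `chunks` array carries start and end times in seconds. As an illustration of consuming that shape, the sketch below formats chunks as SubRip (SRT) subtitle text. `toSrtTime` and `chunksToSrt` are hypothetical helpers, and the code assumes both timestamps of every chunk are numbers, as the interface above declares.

```typescript
interface TranscriptChunk {
  text: string;
  timestamp: [number, number]; // start and end, in seconds
}

// Formats seconds as HH:MM:SS,mmm, the time format used by SRT files.
function toSrtTime(seconds: number): string {
  const totalMs = Math.round(seconds * 1000);
  const ms = totalMs % 1000;
  const s = Math.floor(totalMs / 1000) % 60;
  const m = Math.floor(totalMs / 60000) % 60;
  const h = Math.floor(totalMs / 3600000);
  const pad = (value: number, width: number): string => {
    let text = String(value);
    while (text.length < width) text = "0" + text;
    return text;
  };
  return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
}

// Hypothetical helper: numbers each chunk from 1 and joins cues with a
// blank line between them.
function chunksToSrt(chunks: TranscriptChunk[]): string {
  return chunks
    .map((chunk, i) => {
      const start = toSrtTime(chunk.timestamp[0]);
      const end = toSrtTime(chunk.timestamp[1]);
      return `${i + 1}\n${start} --> ${end}\n${chunk.text.trim()}`;
    })
    .join("\n\n");
}
```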

Text-to-Audio

Generate audio from text.

interface TextToAudioPipeline {
  (
    texts: string | string[],
    options?: {
      speaker_embeddings?: Tensor;
    }
  ): Promise<{
    audio: Float32Array;
    sampling_rate: number;
  }>;
}

Supported Task Names: "text-to-audio", "text-to-speech"
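The result is a raw waveform, so playing or saving it typically means wrapping it in a container format first. The sketch below encodes the `audio` samples and `sampling_rate` as a 16-bit PCM WAV file. It assumes mono audio with samples in the range -1 to 1; `encodeWav` is a hypothetical helper, not a library function.

```typescript
// Hypothetical helper: builds a mono, 16-bit PCM WAV file in memory.
function encodeWav(samples: Float32Array, samplingRate: number): Uint8Array {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset: number, text: string): void => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // file size minus the first 8 bytes
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // size of the fmt chunk
  view.setUint16(20, 1, true); // audio format 1 = PCM
  view.setUint16(22, 1, true); // channel count (mono assumed)
  view.setUint32(24, samplingRate, true);
  view.setUint32(28, samplingRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  for (let i = 0; i < samples.length; i++) {
    // Clamp to [-1, 1] before scaling to the signed 16-bit range.
    const clamped = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * bytesPerSample, Math.round(clamped * 32767), true);
  }
  return new Uint8Array(buffer);
}
```

In a browser, the returned bytes could be wrapped in a `Blob` with type `audio/wav` for playback; in Node.js they could be written to disk with `fs.writeFileSync`.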

Multimodal Tasks

Document Question Answering

Answer questions about document images.

interface DocumentQuestionAnsweringPipeline {
  (
    image: ImageInput,
    question: string,
    options?: {
      top_k?: number;
    }
  ): Promise<Array<{
    answer: string;
    score: number;
  }>>;
}

Supported Task Names: "document-question-answering"

Types

type ImageInput = string | RawImage | URL;
type AudioInput = string | URL | Float32Array | Float64Array;

interface Pipeline {
  (input: any, options?: any): Promise<any>;
  dispose(): Promise<void>;
}
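Because the `Pipeline` interface exposes `dispose()`, short-lived pipelines can be wrapped so that resources are released even when inference throws. The following is a minimal sketch of that pattern; `withPipeline` is a hypothetical helper, and the factory argument stands in for a call such as `() => pipeline("sentiment-analysis")`.

```typescript
// Mirrors the Pipeline interface above.
interface DisposablePipeline {
  (input: any, options?: any): Promise<any>;
  dispose(): Promise<void>;
}

// Hypothetical helper: creates a pipeline, runs `use` with it, and always
// calls dispose() afterwards, whether `use` resolves or rejects.
async function withPipeline<T>(
  create: () => Promise<DisposablePipeline>,
  use: (pipe: DisposablePipeline) => Promise<T>
): Promise<T> {
  const pipe = await create();
  try {
    return await use(pipe);
  } finally {
    await pipe.dispose();
  }
}
```

This suits one-off jobs. For repeated calls, keeping a single pipeline instance alive avoids reloading the model each time.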

Supported Tasks Summary

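| Task | Task Names | Input Type | Output Type |
| --- | --- | --- | --- |
| Text Classification | `text-classification`, `sentiment-analysis` | Text | Labels + Scores |
| Token Classification | `token-classification`, `ner` | Text | Token Labels |
| Question Answering | `question-answering` | Question + Context | Answer + Score |
| Fill Mask | `fill-mask` | Masked Text | Token Predictions |
| Text Generation | `text-generation` | Text Prompt | Generated Text |
| Text-to-Text Generation | `text2text-generation` | Text | Generated Text |
| Summarization | `summarization` | Text | Summary |
| Translation | `translation` | Text | Translated Text |
| Zero-Shot Classification | `zero-shot-classification` | Text + Labels | Classification |
| Feature Extraction | `feature-extraction`, `embeddings` | Text | Embeddings |
| Image Classification | `image-classification` | Image | Labels + Scores |
| Object Detection | `object-detection` | Image | Objects + Boxes |
| Zero-Shot Object Detection | `zero-shot-object-detection` | Image + Labels | Objects + Boxes |
| Image Segmentation | `image-segmentation` | Image | Segments + Masks |
| Zero-Shot Image Classification | `zero-shot-image-classification` | Image + Labels | Classification |
| Image-to-Text | `image-to-text` | Image | Generated Text |
| Image-to-Image | `image-to-image` | Image | Image |
| Depth Estimation | `depth-estimation` | Image | Depth Map |
| Image Feature Extraction | `image-feature-extraction` | Image | Embeddings |
| Audio Classification | `audio-classification` | Audio | Labels + Scores |
| Zero-Shot Audio Classification | `zero-shot-audio-classification` | Audio + Labels | Classification |
| Speech Recognition | `automatic-speech-recognition`, `asr` | Audio | Transcribed Text |
| Text-to-Audio | `text-to-audio`, `text-to-speech` | Text | Audio Waveform |
| Document QA | `document-question-answering` | Document + Question | Answer |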
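Several tasks accept two names. When task names come from user input or configuration, it can help to normalize them before branching on the task. The sketch below maps each short name to its longer counterpart; treating the longer name as canonical is an assumption, and `TASK_ALIASES` and `canonicalTask` are illustrative names defined here.

```typescript
// Alias pairs taken from the task names listed in this document.
const TASK_ALIASES: Record<string, string> = {
  "sentiment-analysis": "text-classification",
  "ner": "token-classification",
  "embeddings": "feature-extraction",
  "asr": "automatic-speech-recognition",
  "text-to-speech": "text-to-audio",
};

// Hypothetical helper: returns the canonical name for an alias and leaves
// every other task name unchanged.
function canonicalTask(task: string): string {
  return Object.prototype.hasOwnProperty.call(TASK_ALIASES, task)
    ? TASK_ALIASES[task]
    : task;
}
```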

Install with Tessl CLI

npx tessl i tessl/npm-xenova--transformers
