Transformers

State-of-the-art Machine Learning library for JAX, PyTorch and TensorFlow. Transformers provides a unified API for working with over 350 pre-trained model architectures across natural language processing, computer vision, audio, and multimodal tasks. The library democratizes access to cutting-edge AI models with simple, efficient interfaces for both inference and training.

Package Information

  • Package Name: transformers
  • Language: Python
  • Installation: pip install transformers

Core Imports

import transformers

Common patterns for specific functionality:

# High-level Pipeline API (recommended for most use cases)
from transformers import pipeline

# Auto classes for automatic model/tokenizer selection
from transformers import AutoModel, AutoTokenizer, AutoConfig

# Specific model classes
from transformers import BertModel, BertTokenizer
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Training utilities
from transformers import Trainer, TrainingArguments

# Feature extraction for audio/vision
from transformers import AutoFeatureExtractor, AutoImageProcessor

Basic Usage

Quick Start with Pipelines

from transformers import pipeline

# Text classification
classifier = pipeline("text-classification")
results = classifier("I love using transformers!")

# Question answering
qa_pipeline = pipeline("question-answering")
answer = qa_pipeline(
    question="What is transformers?",
    context="Transformers is a library for natural language processing."
)

# Text generation
generator = pipeline("text-generation", model="gpt2")
output = generator("The future of AI is", max_length=50, num_return_sequences=1)

# Image classification
image_classifier = pipeline("image-classification")
results = image_classifier("path/to/image.jpg")

Working with Models Directly

from transformers import AutoModel, AutoTokenizer

# Load model and tokenizer
model_name = "bert-base-uncased"
model = AutoModel.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Encode text
text = "Hello, world!"
inputs = tokenizer(text, return_tensors="pt")

# Forward pass
outputs = model(**inputs)
last_hidden_states = outputs.last_hidden_state

Training a Model

from transformers import Trainer, TrainingArguments
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Load model for fine-tuning
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", 
    num_labels=2
)
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Configure training
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    logging_dir="./logs",
)

# Initialize trainer (train_dataset and eval_dataset are assumed to be
# pre-tokenized datasets, e.g. built with the datasets library)
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

# Start training
trainer.train()

Architecture

The transformers library is built around several key architectural components:

  • Auto Classes: Automatically select the correct model, tokenizer, or configuration based on a model name or path
  • Model Classes: Implement specific architectures (BERT, GPT, T5, etc.) with consistent APIs across frameworks
  • Tokenizers: Convert text to tokens and back, handling different tokenization strategies and vocabularies
  • Pipelines: High-level abstraction providing simple interfaces for common ML tasks
  • Trainer: Comprehensive training framework with built-in optimization, logging, and evaluation
  • Hub Integration: Seamless downloading, caching, and sharing of models via Hugging Face Hub

This design enables transformers to serve as the foundational layer for the AI/ML ecosystem, providing consistent interfaces across 350+ model architectures while maintaining compatibility with PyTorch, TensorFlow, and JAX.

Capabilities

High-Level Pipeline API

Simple, task-oriented interface for common ML operations. Pipelines abstract away model selection, preprocessing, and postprocessing, providing immediate access to state-of-the-art capabilities.

def pipeline(
    task: str = None,
    model: str = None,
    tokenizer: str = None,
    **kwargs
) -> Pipeline
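
A minimal sketch of combining the task, model, and device arguments; the model name is just one commonly used example checkpoint, not a requirement:

from transformers import pipeline

# Explicit model and device selection; device=-1 runs on CPU, 0 on the first GPU
sentiment = pipeline(
    "sentiment-analysis",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    device=-1,
)
print(sentiment(["Great library!", "This is confusing."]))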

Pipelines

Model Management

Automatic model selection and loading with support for 350+ architectures. Auto classes intelligently choose the correct implementation based on model names or configurations.

class AutoModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs) -> PreTrainedModel

class AutoTokenizer:
    @classmethod 
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs) -> PreTrainedTokenizer

class AutoConfig:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs) -> PretrainedConfig
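
A short sketch of using AutoConfig to adjust a configuration before building the model, assuming bert-base-uncased as the example checkpoint:

from transformers import AutoConfig, AutoModelForSequenceClassification

# Load the configuration, override a field, then build the model from it
config = AutoConfig.from_pretrained("bert-base-uncased", num_labels=3)
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", config=config
)
print(config.model_type, model.config.num_labels)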

Models

Training and Fine-tuning

Comprehensive training framework with built-in optimization, distributed training support, and extensive customization options.

class Trainer:
    def __init__(
        self,
        model: PreTrainedModel,
        args: TrainingArguments,
        train_dataset = None,
        eval_dataset = None,
        **kwargs
    )
    
    def train(self) -> None
    def evaluate(self) -> Dict[str, float]
    def predict(self, test_dataset) -> PredictionOutput

class TrainingArguments:
    def __init__(
        self,
        output_dir: str,
        num_train_epochs: float = 3.0,
        per_device_train_batch_size: int = 8,
        learning_rate: float = 5e-5,
        **kwargs
    )
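
Beyond train(), evaluate and predict follow the pattern below; this sketch assumes the trainer from the training example above and a hypothetical pre-tokenized test_dataset:

# Evaluate on the eval_dataset passed to the Trainer
metrics = trainer.evaluate()
print(metrics)  # e.g. {"eval_loss": ..., "eval_runtime": ...}

# Run inference over a separate tokenized dataset
predictions = trainer.predict(test_dataset)
print(predictions.predictions.shape, predictions.metrics)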

Training

Text Generation

Advanced text generation capabilities with multiple decoding strategies, fine-grained control over output, and support for conversational AI.

class GenerationMixin:
    def generate(
        self,
        inputs = None,
        max_length: int = None,
        num_beams: int = 1,
        temperature: float = 1.0,
        do_sample: bool = False,
        **kwargs
    ) -> torch.Tensor

class GenerationConfig:
    def __init__(
        self,
        max_length: int = 20,
        num_beams: int = 1,
        temperature: float = 1.0,
        **kwargs
    )
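
A minimal generation sketch using the gpt2 checkpoint; sampling parameters here are illustrative defaults, not recommendations:

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

inputs = tokenizer("The future of AI is", return_tensors="pt")

# Sampling-based decoding; leave do_sample=False for greedy or beam search
output_ids = model.generate(
    **inputs,
    max_length=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.95,
)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))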

Generation

Tokenization

Comprehensive tokenization with support for 100+ different tokenizers, handling subword tokenization, special tokens, and efficient batch processing.

class PreTrainedTokenizer:
    def encode(
        self, 
        text: str,
        add_special_tokens: bool = True,
        **kwargs
    ) -> List[int]
    
    def decode(
        self,
        token_ids: List[int],
        skip_special_tokens: bool = False
    ) -> str
    
    def __call__(
        self,
        text,
        return_tensors: str = None,
        padding: bool = False,
        truncation: bool = False,
        **kwargs
    ) -> BatchEncoding
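
A short sketch of the three entry points above, using the bert-base-uncased tokenizer as an example:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# encode/decode round trip
ids = tokenizer.encode("Hello, world!", add_special_tokens=True)
print(ids)
print(tokenizer.decode(ids, skip_special_tokens=True))

# Batched __call__ with padding, truncation, and tensor output
batch = tokenizer(
    ["Hello, world!", "A slightly longer second sentence."],
    padding=True,
    truncation=True,
    return_tensors="pt",
)
print(batch["input_ids"].shape, batch["attention_mask"].shape)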

Tokenization

Feature Extraction

Audio and image preprocessing capabilities for multimodal models, providing consistent interfaces for different modalities.

class AutoFeatureExtractor:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs)

class AutoImageProcessor:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs)
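
A sketch of image preprocessing with an image processor, assuming Pillow is installed and using google/vit-base-patch16-224 as an example checkpoint; the image path is a placeholder:

from PIL import Image
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained("google/vit-base-patch16-224")

image = Image.open("path/to/image.jpg")
# Resizes, rescales, and normalizes the image into model-ready tensors
inputs = processor(images=image, return_tensors="pt")
print(inputs["pixel_values"].shape)  # e.g. (1, 3, 224, 224)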

Feature Extraction

Model Optimization

Advanced optimization techniques including quantization, mixed precision training, and hardware acceleration for efficient inference and training.

class BitsAndBytesConfig:
    def __init__(
        self,
        load_in_8bit: bool = False,
        load_in_4bit: bool = False,
        bnb_4bit_compute_dtype = None,
        **kwargs
    )
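
A hedged sketch of 4-bit loading via BitsAndBytesConfig; this requires the bitsandbytes and accelerate packages plus a CUDA-capable GPU, and facebook/opt-350m is used only as an example model:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

# device_map="auto" places layers across available GPUs/CPU
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quant_config,
    device_map="auto",
)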

Optimization

Types

Core type definitions used throughout the library:

class PreTrainedModel:
    """Base class for all model implementations."""
    def forward(self, **kwargs)
    def save_pretrained(self, save_directory: str, **kwargs)
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs)

class PretrainedConfig:
    """Base configuration class for all models."""
    def save_pretrained(self, save_directory: str, **kwargs)
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: str, **kwargs)

class BatchEncoding:
    """Container for tokenizer outputs with tensor conversion capabilities."""
    input_ids: List[List[int]]
    attention_mask: List[List[int]]
    def to(self, device: str) -> 'BatchEncoding'

class Pipeline:
    """Base class for all pipeline implementations."""
    def __call__(self, inputs, **kwargs)
    def save_pretrained(self, save_directory: str, **kwargs)

class ModelOutput:
    """Base class for all model outputs."""
    last_hidden_state: torch.Tensor
    hidden_states: Tuple[torch.Tensor]
    attentions: Tuple[torch.Tensor]
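
A brief sketch of how these types interact in practice, assuming a CUDA device is available (falls back to CPU otherwise):

import torch
from transformers import AutoModel, AutoTokenizer

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").to(device)

# BatchEncoding supports .to(device) to move all tensors at once
batch = tokenizer(
    ["Hello, world!", "Transformers types demo"],
    padding=True,
    return_tensors="pt",
).to(device)

with torch.no_grad():
    outputs = model(**batch)  # returns a ModelOutput subclass
print(outputs.last_hidden_state.shape)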

docs/

  • feature-extraction.md
  • generation.md
  • index.md
  • models.md
  • optimization.md
  • pipelines.md
  • tokenization.md
  • training.md

tile.json