
tessl/pypi-pytorch-transformers

Repository of pre-trained NLP Transformer models: BERT & RoBERTa, GPT & GPT-2, Transformer-XL, XLNet and XLM


Auto Classes

Factory classes that provide automatic model, tokenizer, and configuration selection based on model name patterns. These classes eliminate the need to manually specify which architecture-specific class to use, making it easy to switch between different transformer models.
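Conceptually, the dispatch is a substring match on the model name or path, checked in a fixed order so that longer architecture names win over names they contain (e.g. "distilbert" and "roberta" must be tried before "bert"). The sketch below is a hypothetical illustration of that rule, not the library's actual code; the class names are the real tokenizer classes it selects between.

```python
# Hypothetical sketch of the Auto-class dispatch rule: match known
# architecture substrings against the model name, most specific first.
_PATTERNS = [
    ("distilbert", "DistilBertTokenizer"),
    ("roberta", "RobertaTokenizer"),
    ("bert", "BertTokenizer"),
    ("openai-gpt", "OpenAIGPTTokenizer"),
    ("gpt2", "GPT2Tokenizer"),
    ("transfo-xl", "TransfoXLTokenizer"),
    ("xlnet", "XLNetTokenizer"),
    ("xlm", "XLMTokenizer"),
]

def resolve_tokenizer_class(name_or_path):
    """Return the tokenizer class name matching the model name, or raise."""
    for pattern, cls_name in _PATTERNS:
        if pattern in name_or_path:
            return cls_name
    raise ValueError("Unrecognized model name: %r" % name_or_path)

print(resolve_tokenizer_class("bert-base-uncased"))  # BertTokenizer
print(resolve_tokenizer_class("roberta-large"))      # RobertaTokenizer
```

Because matching is ordered, "roberta-large" resolves to RobertaTokenizer even though "bert" is also a substring of its name.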

Capabilities

AutoTokenizer

Automatically selects and instantiates the appropriate tokenizer class based on the model name or path. Supports BERT, GPT-2, OpenAI GPT, Transformer-XL, XLNet, XLM, RoBERTa, and DistilBERT tokenizers.

class AutoTokenizer:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs):
        """
        Instantiate the appropriate tokenizer class from a pre-trained model.
        
        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - cache_dir (str, optional): Directory to cache downloaded files
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        
        Returns:
        PreTrainedTokenizer: Instance of the appropriate tokenizer class
        """

Usage Examples:

from pytorch_transformers import AutoTokenizer

# Load BERT tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Load GPT-2 tokenizer  
tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Load from local directory
tokenizer = AutoTokenizer.from_pretrained("./my-model")

# With custom cache directory
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", cache_dir="./cache")

AutoConfig

Automatically selects and loads the appropriate configuration class based on the model name or path. Configurations contain model hyperparameters and architecture specifications.

class AutoConfig:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs):
        """
        Instantiate the appropriate configuration class from a pre-trained model.
        
        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - cache_dir (str, optional): Directory to cache downloaded files
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        
        Returns:
        PretrainedConfig: Instance of the appropriate configuration class
        """

Usage Examples:

from pytorch_transformers import AutoConfig

# Load configuration
config = AutoConfig.from_pretrained("bert-base-uncased")

# Access configuration attributes
print(config.hidden_size)
print(config.num_attention_heads)
print(config.num_hidden_layers)

AutoModel

Automatically loads the base model class (without task-specific heads) for the specified architecture. Returns models suitable for feature extraction and embedding generation.

class AutoModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate base model class from a pre-trained model.
        
        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        
        Returns:
        PreTrainedModel: Instance of the appropriate base model class
        """

AutoModelWithLMHead

Automatically loads models with language modeling heads for text generation and language modeling tasks.

class AutoModelWithLMHead:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate language modeling model from a pre-trained model.
        
        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        
        Returns:
        PreTrainedModel: Instance of the appropriate LM model class
        """

AutoModelForSequenceClassification

Automatically loads models with sequence classification heads for tasks like sentiment analysis, text classification, and natural language inference.

class AutoModelForSequenceClassification:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate sequence classification model from a pre-trained model.
        
        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - num_labels (int, optional): Number of classification labels
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        
        Returns:
        PreTrainedModel: Instance of the appropriate sequence classification model
        """

AutoModelForQuestionAnswering

Automatically loads models with question answering heads for extractive question answering tasks like SQuAD.

class AutoModelForQuestionAnswering:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate question answering model from a pre-trained model.
        
        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        
        Returns:
        PreTrainedModel: Instance of the appropriate QA model class
        """

Usage Examples:

from pytorch_transformers import (
    AutoModel, 
    AutoModelWithLMHead, 
    AutoModelForSequenceClassification,
    AutoModelForQuestionAnswering
)

# Load base model for feature extraction
model = AutoModel.from_pretrained("bert-base-uncased")

# Load language model for text generation
lm_model = AutoModelWithLMHead.from_pretrained("gpt2")

# Load sequence classifier
classifier = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", 
    num_labels=2
)

# Load question answering model
qa_model = AutoModelForQuestionAnswering.from_pretrained("bert-base-uncased")

# Use with a tokenizer for a complete pipeline
# (pytorch_transformers tokenizers are not callable; use encode + torch.tensor)
import torch
from pytorch_transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
input_ids = torch.tensor([tokenizer.encode("Hello, how are you?")])
outputs = model(input_ids)

Supported Model Names

The Auto classes support the following pre-trained model names:

BERT Models:

  • bert-base-uncased, bert-large-uncased
  • bert-base-cased, bert-large-cased
  • bert-base-multilingual-uncased, bert-base-multilingual-cased
  • bert-base-chinese

GPT-2 Models:

  • gpt2, gpt2-medium, gpt2-large, gpt2-xl

Other Models:

  • openai-gpt
  • transfo-xl-wt103
  • xlnet-base-cased, xlnet-large-cased
  • xlm-mlm-en-2048, xlm-mlm-100-1280
  • roberta-base, roberta-large
  • distilbert-base-uncased, distilbert-base-cased

Install with Tessl CLI

npx tessl i tessl/pypi-pytorch-transformers
