Repository of pre-trained NLP Transformer models: BERT & RoBERTa, GPT & GPT-2, Transformer-XL, XLNet and XLM
Factory classes that provide automatic model, tokenizer, and configuration selection based on model name patterns. These classes eliminate the need to manually specify which architecture-specific class to use, making it easy to switch between different transformer models.
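The name-pattern selection can be pictured as a first-match substring lookup. The sketch below is illustrative plain Python, not the library's actual implementation; note that more specific patterns (e.g. "distilbert", "roberta") must be checked before "bert", since those names contain "bert" as a substring.

```python
# Illustrative sketch (not the library's code) of how an Auto class could
# resolve an architecture family from a model name or path: the first
# matching substring wins, so specific patterns come before general ones.
PATTERNS = [
    ("distilbert", "DistilBERT"),
    ("roberta", "RoBERTa"),
    ("bert", "BERT"),
    ("openai-gpt", "OpenAI GPT"),
    ("gpt2", "GPT-2"),
    ("transfo-xl", "Transformer-XL"),
    ("xlnet", "XLNet"),
    ("xlm", "XLM"),
]

def resolve_architecture(name_or_path: str) -> str:
    """Return the architecture family implied by a model name or path."""
    for pattern, family in PATTERNS:
        if pattern in name_or_path:
            return family
    raise ValueError(f"Unrecognized model name: {name_or_path!r}")

print(resolve_architecture("bert-base-uncased"))        # BERT
print(resolve_architecture("roberta-base"))             # RoBERTa
print(resolve_architecture("distilbert-base-uncased"))  # DistilBERT
```

This ordering is why, for example, "roberta-base" loads a RoBERTa tokenizer rather than a BERT one.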
Automatically selects and instantiates the appropriate tokenizer class based on the model name or path. Supports BERT, GPT-2, OpenAI GPT, Transformer-XL, XLNet, XLM, RoBERTa, and DistilBERT tokenizers.
class AutoTokenizer:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs):
        """
        Instantiate the appropriate tokenizer class from a pre-trained model.

        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - cache_dir (str, optional): Directory to cache downloaded files
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - use_auth_token (str/bool, optional): Authentication token for private models

        Returns:
        PreTrainedTokenizer: Instance of the appropriate tokenizer class
        """

Usage Examples:
from pytorch_transformers import AutoTokenizer
# Load BERT tokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
# Load GPT-2 tokenizer
tokenizer = AutoTokenizer.from_pretrained("gpt2")
# Load from local directory
tokenizer = AutoTokenizer.from_pretrained("./my-model")
# With custom cache directory
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased", cache_dir="./cache")

Automatically selects and loads the appropriate configuration class based on the model name or path. Configurations contain model hyperparameters and architecture specifications.
class AutoConfig:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs):
        """
        Instantiate the appropriate configuration class from a pre-trained model.

        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - cache_dir (str, optional): Directory to cache downloaded files
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - use_auth_token (str/bool, optional): Authentication token for private models

        Returns:
        PretrainedConfig: Instance of the appropriate configuration class
        """

Usage Examples:
from pytorch_transformers import AutoConfig
# Load configuration
config = AutoConfig.from_pretrained("bert-base-uncased")
# Access configuration attributes
print(config.hidden_size)
print(config.num_attention_heads)
print(config.num_hidden_layers)

Automatically loads the base model class (without task-specific heads) for the specified architecture. Returns models suitable for feature extraction and embedding generation.
class AutoModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate base model class from a pre-trained model.

        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        - use_auth_token (str/bool, optional): Authentication token for private models

        Returns:
        PreTrainedModel: Instance of the appropriate base model class
        """

Automatically loads models with language modeling heads for text generation and language modeling tasks.
class AutoModelWithLMHead:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate language modeling model from a pre-trained model.

        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        - use_auth_token (str/bool, optional): Authentication token for private models

        Returns:
        PreTrainedModel: Instance of the appropriate LM model class
        """

Automatically loads models with sequence classification heads for tasks like sentiment analysis, text classification, and natural language inference.
class AutoModelForSequenceClassification:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate sequence classification model from a pre-trained model.

        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - num_labels (int, optional): Number of classification labels
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        - use_auth_token (str/bool, optional): Authentication token for private models

        Returns:
        PreTrainedModel: Instance of the appropriate sequence classification model
        """

Automatically loads models with question answering heads for extractive question answering tasks like SQuAD.
class AutoModelForQuestionAnswering:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs):
        """
        Instantiate the appropriate question answering model from a pre-trained model.

        Parameters:
        - pretrained_model_name_or_path (str): Model name or local path
        - config (PretrainedConfig, optional): Model configuration
        - cache_dir (str, optional): Directory to cache downloaded files
        - from_tf (bool, optional): Load from TensorFlow checkpoint
        - force_download (bool, optional): Force re-download even if cached
        - resume_download (bool, optional): Resume incomplete downloads
        - proxies (dict, optional): HTTP proxy configuration
        - output_loading_info (bool, optional): Return loading info dict
        - use_auth_token (str/bool, optional): Authentication token for private models

        Returns:
        PreTrainedModel: Instance of the appropriate QA model class
        """

Usage Examples:
from pytorch_transformers import (
    AutoModel,
    AutoModelWithLMHead,
    AutoModelForSequenceClassification,
    AutoModelForQuestionAnswering,
)
# Load base model for feature extraction
model = AutoModel.from_pretrained("bert-base-uncased")
# Load language model for text generation
lm_model = AutoModelWithLMHead.from_pretrained("gpt2")
# Load sequence classifier
classifier = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=2,
)
# Load question answering model
qa_model = AutoModelForQuestionAnswering.from_pretrained("bert-base-uncased")
# Use with tokenizer for complete pipeline
from pytorch_transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
import torch
# pytorch_transformers tokenizers are not callable and do not accept
# return_tensors; use encode() and build the input tensor explicitly
input_ids = torch.tensor([tokenizer.encode("Hello, how are you?")])
outputs = model(input_ids)

The Auto classes support the following pre-trained model names:
BERT Models:
bert-base-uncased, bert-large-uncased
bert-base-cased, bert-large-cased
bert-base-multilingual-uncased, bert-base-multilingual-cased
bert-base-chinese

GPT-2 Models:
gpt2, gpt2-medium, gpt2-large, gpt2-xl

Other Models:
openai-gpt
transfo-xl-wt103
xlnet-base-cased, xlnet-large-cased
xlm-mlm-en-2048, xlm-mlm-100-1280
roberta-base, roberta-large
distilbert-base-uncased, distilbert-base-cased