Tessl Tile for pypi/pytorch-transformers@1.2.0

# PyTorch Transformers

A Python library of pre-trained transformer models for Natural Language Processing (NLP) tasks. PyTorch Transformers provides PyTorch implementations, pre-trained model weights, usage scripts, and conversion utilities for BERT, GPT/GPT-2, Transformer-XL, XLNet, XLM, RoBERTa, and DistilBERT.

## Package Information

- **Package Name**: pytorch-transformers
- **Package Type**: Library
- **Language**: Python
- **Installation**: `pip install pytorch-transformers`

## Core Imports

```python
import pytorch_transformers
```

Common patterns for working with models and tokenizers:

```python
from pytorch_transformers import AutoModel, AutoTokenizer
from pytorch_transformers import BertModel, BertTokenizer
from pytorch_transformers import GPT2Model, GPT2Tokenizer
```

## Basic Usage

```python
import torch
from pytorch_transformers import AutoModel, AutoTokenizer

# Load a pre-trained model and tokenizer
model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)
model.eval()  # disable dropout for inference

# Tokenize input text (tokenizers in 1.2.0 are not callable; use encode())
text = "Hello, how are you?"
input_ids = torch.tensor([tokenizer.encode(text, add_special_tokens=True)])

# Get model outputs (models return tuples; the first element is the last hidden state)
with torch.no_grad():
    outputs = model(input_ids)
last_hidden_states = outputs[0]

# For specific tasks like sequence classification
from pytorch_transformers import AutoModelForSequenceClassification
classifier = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
```

## Architecture

The library follows a consistent design pattern across all transformer architectures:

- **Auto Classes**: Factory classes that automatically select the appropriate model/tokenizer based on model name
- **Base Classes**: Abstract base classes (PreTrainedModel, PreTrainedTokenizer, PretrainedConfig) providing common interfaces
- **Model-Specific Classes**: Dedicated implementations for each transformer architecture with specialized task-specific variants
- **Configuration Classes**: Parameter containers for model initialization and customization
- **Tokenizers**: Architecture-specific text preprocessing with consistent encode/decode interfaces

This unified design enables seamless switching between different transformer architectures while maintaining consistent APIs for various NLP tasks including language modeling, sequence classification, question answering, and token classification.

## Capabilities

### Auto Classes

Factory classes that automatically select and instantiate the appropriate model, tokenizer, or configuration based on model name patterns. These provide the most convenient way to work with pre-trained models without needing to know the specific architecture.

```python { .api }
class AutoTokenizer:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class AutoModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs): ...

class AutoConfig:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs): ...
```

[Auto Classes](./auto-classes.md)

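The name-based selection described for the Auto classes can be pictured as a substring lookup over the model identifier. The sketch below is a stdlib-only illustration, not the library's code: `ARCHITECTURE_PATTERNS` and `resolve_architecture` are hypothetical names, and the ordering (more specific patterns such as `distilbert` and `roberta` before `bert`) is an assumption about how such a lookup has to be arranged to avoid false matches.

```python
# Hypothetical pattern table (illustration only). More specific substrings
# come first, because "distilbert-..." and "roberta-..." also contain "bert".
ARCHITECTURE_PATTERNS = [
    ("distilbert", "DistilBertModel"),
    ("roberta", "RobertaModel"),
    ("bert", "BertModel"),
    ("openai-gpt", "OpenAIGPTModel"),
    ("gpt2", "GPT2Model"),
    ("transfo-xl", "TransfoXLModel"),
    ("xlnet", "XLNetModel"),
    ("xlm", "XLMModel"),
]


def resolve_architecture(model_name):
    """Return the class name whose pattern first appears in model_name."""
    for pattern, class_name in ARCHITECTURE_PATTERNS:
        if pattern in model_name:
            return class_name
    raise ValueError("Unrecognized model identifier: {!r}".format(model_name))


for name in ("bert-base-uncased", "distilbert-base-uncased", "roberta-base"):
    print(name, "->", resolve_architecture(name))
```

A lookup like this explains why the model name or path passed to `from_pretrained()` needs to contain a recognizable architecture string when an Auto class is used.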
### Base Classes

Core abstract base classes that define the common interface shared by all models, tokenizers, and configurations. These classes provide essential methods like `from_pretrained()` and `save_pretrained()` that enable consistent model and tokenizer loading/saving across all architectures.

```python { .api }
class PreTrainedModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *model_args, **kwargs): ...

    def save_pretrained(self, save_directory): ...
    def resize_token_embeddings(self, new_num_tokens): ...

class PreTrainedTokenizer:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs): ...

    def save_pretrained(self, save_directory): ...
    def tokenize(self, text): ...
    def encode(self, text): ...
    def decode(self, token_ids): ...
```

[Base Classes](./base-classes.md)

106
### BERT Models

BERT (Bidirectional Encoder Representations from Transformers) models for various NLP tasks including masked language modeling, next sentence prediction, sequence classification, token classification, and question answering.

```python { .api }
class BertModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class BertForSequenceClassification:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class BertTokenizer:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs): ...
```

[BERT Models](./bert-models.md)

### GPT-2 Models

GPT-2 (Generative Pre-trained Transformer 2) models for language generation tasks, including standard language modeling and multi-task models with both language modeling and classification heads.

```python { .api }
class GPT2Model:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class GPT2LMHeadModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class GPT2Tokenizer:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, **kwargs): ...
```

[GPT-2 Models](./gpt2-models.md)

### Other Transformer Models

Additional transformer architectures including OpenAI GPT, Transformer-XL, XLNet, XLM, RoBERTa, and DistilBERT, each with their specific model variants and tokenizers optimized for different NLP tasks and languages.

```python { .api }
class XLNetModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class RobertaModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...

class DistilBertModel:
    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path, *inputs, **kwargs): ...
```

[Other Models](./other-models.md)

### Optimization

Optimizers and learning rate schedulers for transformer training: an AdamW optimizer with the weight decay fix, and warmup schedules commonly used in transformer fine-tuning. The schedules are scheduler classes that wrap an existing optimizer.

```python { .api }
class AdamW:
    def __init__(self, params, lr=1e-3, betas=(0.9, 0.999), eps=1e-6, weight_decay=0.0, correct_bias=True): ...

class WarmupLinearSchedule:
    def __init__(self, optimizer, warmup_steps, t_total, last_epoch=-1): ...

class WarmupCosineSchedule:
    def __init__(self, optimizer, warmup_steps, t_total, cycles=0.5, last_epoch=-1): ...
```

[Optimization](./optimization.md)

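A warmup schedule scales the base learning rate by a step-dependent multiplier. The stdlib-only sketch below shows one common formulation of linear and cosine warmup, reusing the parameter names from the schedule signatures (`warmup_steps`, `t_total`, `cycles`). The helpers `linear_multiplier` and `cosine_multiplier` are hypothetical and are meant as an illustration of the idea, not as the library's exact implementation.

```python
import math


def linear_multiplier(step, warmup_steps, t_total):
    """Rise linearly from 0 to 1 over warmup_steps, then decay linearly to 0 at t_total."""
    if step < warmup_steps:
        return float(step) / float(max(1, warmup_steps))
    return max(0.0, float(t_total - step) / float(max(1, t_total - warmup_steps)))


def cosine_multiplier(step, warmup_steps, t_total, cycles=0.5):
    """Rise linearly during warmup, then follow a cosine curve
    (with the default cycles=0.5 it reaches 0 at t_total)."""
    if step < warmup_steps:
        return float(step) / float(max(1, warmup_steps))
    progress = float(step - warmup_steps) / float(max(1, t_total - warmup_steps))
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * float(cycles) * 2.0 * progress)))


base_lr = 5e-5  # assumed base learning rate, for illustration only
for step in (0, 50, 100, 550, 1000):
    print(step, base_lr * linear_multiplier(step, 100, 1000))
```

In a training loop, the scheduler would typically be stepped once per optimizer update so that `step` tracks the number of updates rather than the number of epochs.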
### File Utilities

File handling utilities for downloading, caching, and managing pre-trained model files. These utilities handle automatic download of model weights and configurations from remote repositories with local caching support.

```python { .api }
def cached_path(url_or_filename, cache_dir=None): ...

PYTORCH_TRANSFORMERS_CACHE: str
PYTORCH_PRETRAINED_BERT_CACHE: str
```

[File Utilities](./file-utilities.md)

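Download caches of this kind typically derive a stable local file name from the remote URL, so that repeated requests resolve to the same file. The stdlib-only sketch below illustrates that idea with a SHA-256 hash. `cache_filename` and `cached_fetch` are hypothetical helpers, the URL is a placeholder, and the `fetch` callable stands in for a real download; none of this is a description of what `cached_path` does internally.

```python
import hashlib
import os
import tempfile


def cache_filename(url, etag=None):
    """Map a URL (and optional ETag) to a deterministic, filesystem-safe name."""
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    if etag:
        # A changed ETag yields a new name, so stale files are not reused.
        name += "." + hashlib.sha256(etag.encode("utf-8")).hexdigest()
    return name


def cached_fetch(url, cache_dir, fetch):
    """Return a local path for url, calling fetch(url) only on a cache miss."""
    path = os.path.join(cache_dir, cache_filename(url))
    if not os.path.exists(path):
        with open(path, "wb") as handle:
            handle.write(fetch(url))
    return path


calls = []


def fake_fetch(url):
    # Stand-in for a network download; records each call.
    calls.append(url)
    return b"weights"


cache_dir = tempfile.mkdtemp()
url = "https://example.com/pytorch_model.bin"  # placeholder URL
first = cached_fetch(url, cache_dir, fake_fetch)
second = cached_fetch(url, cache_dir, fake_fetch)
print(first == second, len(calls))
```

The second call returns the same path without invoking `fake_fetch` again, which is the behavior a local cache is meant to provide.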
## Constants

```python { .api }
__version__: str = "1.2.0"

# Model file names
WEIGHTS_NAME: str = "pytorch_model.bin"
CONFIG_NAME: str = "config.json"
TF_WEIGHTS_NAME: str = "model.ckpt"

# Archive maps (model name to URL mappings for pre-trained models)
BERT_PRETRAINED_MODEL_ARCHIVE_MAP: Dict[str, str]
GPT2_PRETRAINED_MODEL_ARCHIVE_MAP: Dict[str, str]
XLNET_PRETRAINED_MODEL_ARCHIVE_MAP: Dict[str, str]
# ... and similar maps for all other architectures
```

## Special Token Properties

All tokenizers expose the standard special-token properties below. A given architecture sets only the tokens it actually uses (for example, GPT-2 defines no classification or mask token), so the others remain unset:

```python { .api }
# Special-token properties defined on the tokenizer base class
bos_token: str  # Beginning of sequence
eos_token: str  # End of sequence
unk_token: str  # Unknown token
sep_token: str  # Separator token
pad_token: str  # Padding token
cls_token: str  # Classification token
mask_token: str # Mask token for MLM
```
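Special tokens matter most when assembling model inputs. For BERT-style models, a single sequence is wrapped as `[CLS] tokens [SEP]` and a pair as `[CLS] a [SEP] b [SEP]`. The sketch below builds that layout on plain token lists. `build_bert_style_input` is a hypothetical helper, and the token strings are the conventional BERT defaults rather than values read from a tokenizer.

```python
def build_bert_style_input(tokens_a, tokens_b=None, cls_token="[CLS]", sep_token="[SEP]"):
    """Wrap one or two token lists with classification and separator tokens."""
    tokens = [cls_token] + list(tokens_a) + [sep_token]
    segment_ids = [0] * len(tokens)
    if tokens_b is not None:
        # The second segment and its closing separator get segment id 1.
        tokens += list(tokens_b) + [sep_token]
        segment_ids += [1] * (len(tokens_b) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_bert_style_input(["hello", "world"], ["how", "are", "you"])
print(tokens)
print(segment_ids)
```

With a real tokenizer, the `cls_token` and `sep_token` properties would supply these strings instead of hard-coded defaults.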
