Tessl Tile for pypi/transformers@4.56.0

# Pipelines

High-level, task-oriented interface for common machine learning operations. Pipelines abstract away model selection, preprocessing, and postprocessing, so that a task across the text, vision, audio, or multimodal domains can be run in a few lines of code.

## Capabilities

### Pipeline Factory

Central factory function that creates task-specific pipeline instances with automatic model and tokenizer selection.

```python { .api }
def pipeline(
    task: str = None,
    model: str = None,
    config: str = None,
    tokenizer: str = None,
    feature_extractor: str = None,
    image_processor: str = None,
    processor: str = None,
    framework: str = None,
    revision: str = None,
    use_fast: bool = True,
    token: Union[str, bool] = None,
    device: Union[int, str, torch.device] = None,
    device_map: Union[str, Dict] = None,
    dtype: Union[str, torch.dtype] = "auto",
    trust_remote_code: bool = False,
    model_kwargs: Dict[str, Any] = None,
    pipeline_class = None,
    **kwargs
) -> Pipeline:
    """
    Create a pipeline for a specific ML task.

    Args:
        task: The task name (e.g., "text-classification", "question-answering")
        model: Model name or path (defaults to task-specific default)
        config: Configuration name or path (auto-detected if None)
        tokenizer: Tokenizer name or path (defaults to model's tokenizer)
        feature_extractor: Feature extractor for audio/vision tasks
        image_processor: Image processor for vision tasks
        processor: Processor combining tokenizer and feature extraction
        framework: "pt" (PyTorch), "tf" (TensorFlow), or auto-detect
        revision: Model revision/branch to use
        use_fast: Use fast tokenizer implementation when available
        token: Hugging Face authentication token
        device: Device for inference (int, str, or torch.device)
        device_map: Advanced device mapping for multi-GPU
        dtype: Data type for model weights ("auto", torch.float16, etc.)
        trust_remote_code: Allow custom code from model repos
        model_kwargs: Additional arguments for model initialization
        pipeline_class: Custom pipeline class to use

    Returns:
        Task-specific pipeline instance
    """
```
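
The factory can be called with only the arguments that matter for a given run. The following is a minimal sketch under that assumption; `pipeline_kwargs` and `make_pipeline` are hypothetical helper names, not part of the library, and the lazy import keeps the first helper usable without `transformers` installed.

```python
from typing import Any, Dict, Optional


def pipeline_kwargs(
    task: str,
    model: Optional[str] = None,
    device: Optional[int] = None,
) -> Dict[str, Any]:
    """Collect only the arguments that were explicitly set (hypothetical helper)."""
    kwargs: Dict[str, Any] = {"task": task}
    if model is not None:
        kwargs["model"] = model
    if device is not None:
        kwargs["device"] = device
    return kwargs


def make_pipeline(task: str, model: Optional[str] = None, device: Optional[int] = None):
    """Build a pipeline; arguments left unset fall back to the factory's defaults."""
    from transformers import pipeline  # lazy import: nothing is downloaded until this runs

    return pipeline(**pipeline_kwargs(task, model, device))
```

Calling `make_pipeline("text-classification")` would fall back to the task's default model, while `make_pipeline("text-generation", model="gpt2", device=0)` pins both the checkpoint and the first GPU.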

### Text Classification

Classify text into predefined categories with confidence scores and label mapping.

```python { .api }
class TextClassificationPipeline(Pipeline):
    def __call__(
        self,
        inputs: Union[str, List[str]],
        top_k: int = None,
        function_to_apply: str = "default"
    ) -> Union[Dict, List[Dict]]:
        """
        Classify input text(s).

        Args:
            inputs: Text string or list of strings to classify
            top_k: Return top-k predictions (None for all)
            function_to_apply: "default", "softmax", "sigmoid", or "none"
                ("default" applies sigmoid for single-label models and softmax otherwise)

        Returns:
            Dictionary with 'label' and 'score' keys, or list of such dicts
        """
```

Usage example:
```python
from transformers import pipeline

classifier = pipeline("text-classification")
result = classifier("I love this movie!")
# Illustrative output (a list, even for a single input):
# [{'label': 'POSITIVE', 'score': 0.9998}]

# Multi-class with top-k
classifier = pipeline("text-classification", model="cardiffnlp/twitter-roberta-base-emotion")
results = classifier("I'm so excited about this!", top_k=3)
# Illustrative output:
# [{'label': 'joy', 'score': 0.8}, {'label': 'optimism', 'score': 0.15}, ...]
```
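
The effect of `function_to_apply` and `top_k` can be illustrated with plain Python. This is a sketch of the arithmetic only, not the library's implementation; the logits and the helper names (`softmax`, `sigmoid`, `top_k_labels`) are made up for illustration.

```python
import math
from typing import Dict, List, Union


def softmax(logits: List[float]) -> List[float]:
    """Normalize logits so the scores sum to 1 (mutually exclusive classes)."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(logits: List[float]) -> List[float]:
    """Score each logit independently (multi-label style)."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]


def top_k_labels(
    labels: List[str], scores: List[float], k: int
) -> List[Dict[str, Union[str, float]]]:
    """Return the k highest-scoring labels in the pipeline's list-of-dicts shape."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "score": score} for label, score in ranked[:k]]


logits = [2.0, 1.0, 0.1]  # invented values for three classes
best_two = top_k_labels(["joy", "optimism", "anger"], softmax(logits), k=2)
```

With softmax the scores compete with each other, whereas sigmoid scores can all be high at once, which is why the latter suits multi-label models.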

### Token Classification

Identify and classify individual tokens in text for named entity recognition, part-of-speech tagging, and other token-level tasks.

```python { .api }
class TokenClassificationPipeline(Pipeline):
    def __call__(
        self,
        inputs: Union[str, List[str]],
        aggregation_strategy: str = "simple",
        ignore_labels: List[str] = None
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Classify tokens in input text(s).

        Args:
            inputs: Text string or list of strings
            aggregation_strategy: "simple", "first", "average", "max", or "none"
            ignore_labels: Labels to filter out from results

        Returns:
            List of entity dictionaries with 'entity', 'score', 'index', 'word', 'start', 'end'
            (aggregated results use 'entity_group' instead of 'entity' and omit 'index')
        """
```

Usage example:
```python
from transformers import pipeline

ner = pipeline("ner", aggregation_strategy="simple")
entities = ner("Apple Inc. was founded by Steve Jobs in Cupertino.")
# Illustrative output (scores are approximate; 'start'/'end' are character offsets):
# [
#     {'entity_group': 'ORG', 'score': 0.999, 'word': 'Apple Inc.', 'start': 0, 'end': 10},
#     {'entity_group': 'PER', 'score': 0.998, 'word': 'Steve Jobs', 'start': 26, 'end': 36},
#     {'entity_group': 'LOC', 'score': 0.992, 'word': 'Cupertino', 'start': 40, 'end': 49}
# ]
```
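
The `start` and `end` fields are character offsets into the input, so the matched text can be recovered by slicing. The sketch below assumes aggregated results that carry an `entity_group` key; the entity list is a hand-written stand-in for pipeline output, and `group_spans` is a hypothetical helper.

```python
from collections import defaultdict
from typing import Dict, List

text = "Apple Inc. was founded by Steve Jobs in Cupertino."

# Hand-written stand-in for pipeline output (assumed shape, invented scores).
entities = [
    {"entity_group": "ORG", "score": 0.999, "start": 0, "end": 10},
    {"entity_group": "PER", "score": 0.998, "start": 26, "end": 36},
    {"entity_group": "LOC", "score": 0.992, "start": 40, "end": 49},
]


def group_spans(source: str, found: List[dict]) -> Dict[str, List[str]]:
    """Map each entity type to the substrings it covers in the source text."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for ent in found:
        # 'end' is exclusive, so a plain slice returns exactly the entity text.
        grouped[ent["entity_group"]].append(source[ent["start"]:ent["end"]])
    return dict(grouped)


spans = group_spans(text, entities)
```

Slicing the original string avoids relying on the `word` field, which may differ from the source text after tokenization.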

### Question Answering

Extract answers from context text given a question, with confidence scores and answer span positions.

```python { .api }
class QuestionAnsweringPipeline(Pipeline):
    def __call__(
        self,
        question: str,
        context: str,
        top_k: int = 1,
        doc_stride: int = 128,
        max_answer_len: int = 15,
        max_seq_len: int = 384,
        max_question_len: int = 64,
        handle_impossible_answer: bool = False
    ) -> Union[Dict, List[Dict]]:
        """
        Extract answers from context given a question.

        Args:
            question: Question to answer
            context: Context text containing the answer
            top_k: Number of answers to return
            doc_stride: Overlap between context chunks
            max_answer_len: Maximum answer length in tokens
            max_seq_len: Maximum sequence length
            max_question_len: Maximum question length
            handle_impossible_answer: Allow "impossible to answer" responses

        Returns:
            Dictionary with 'answer', 'score', 'start', 'end' keys
        """
```

Usage example:
```python
from transformers import pipeline

qa = pipeline("question-answering")
result = qa(
    question="Where was Apple founded?",
    context="Apple Inc. was founded by Steve Jobs in Cupertino, California."
)
# Illustrative output ('start'/'end' are character offsets into the context):
# {'answer': 'Cupertino, California', 'score': 0.95, 'start': 40, 'end': 61}
```
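
When a context is too long for one pass, it is split into chunks that overlap by `doc_stride`. The following sketch shows that windowing idea on a plain token list; it is not the library's implementation, and `overlapping_chunks` with its parameter names is hypothetical.

```python
from typing import List, Sequence


def overlapping_chunks(tokens: Sequence, window: int, overlap: int) -> List[Sequence]:
    """Split tokens into windows of at most `window` items sharing `overlap` items."""
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    step = window - overlap  # how far each new window advances
    chunks = []
    start = 0
    while True:
        chunks.append(tokens[start:start + window])
        if start + window >= len(tokens):
            break  # this window already reaches the end of the input
        start += step
    return chunks


chunks = overlapping_chunks(list(range(10)), window=4, overlap=2)
```

The overlap means an answer that straddles a chunk boundary still appears whole in at least one window, at the cost of processing some tokens twice.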

### Text Generation

Generate text continuations using autoregressive language models with extensive control over generation parameters.

```python { .api }
class TextGenerationPipeline(Pipeline):
    def __call__(
        self,
        text_inputs: Union[str, List[str]],
        return_full_text: bool = True,
        clean_up_tokenization_spaces: bool = False,
        **generate_kwargs
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Generate text continuations.

        Args:
            text_inputs: Input text(s) to continue
            return_full_text: Include input in output
            clean_up_tokenization_spaces: Clean tokenization artifacts
            **generate_kwargs: Additional generation parameters (max_length, temperature, etc.)

        Returns:
            List of dictionaries with 'generated_text' key
        """
```

Usage example:
```python
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
outputs = generator(
    "The future of artificial intelligence is",
    max_length=50,
    num_return_sequences=2,
    do_sample=True,  # sampling is needed for temperature and multiple distinct sequences
    temperature=0.8
)
# Illustrative output (sampling makes results vary between runs):
# [
#     {'generated_text': 'The future of artificial intelligence is bright and full of possibilities...'},
#     {'generated_text': 'The future of artificial intelligence is uncertain but promising...'}
# ]
```

### Text Summarization

Generate concise summaries of longer texts using sequence-to-sequence models.

```python { .api }
class SummarizationPipeline(Pipeline):
    def __call__(
        self,
        documents: Union[str, List[str]],
        return_text: bool = True,
        return_tensors: bool = False,
        clean_up_tokenization_spaces: bool = False,
        **generate_kwargs
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Summarize input documents.

        Args:
            documents: Text(s) to summarize
            return_text: Return text summaries
            return_tensors: Return tensor outputs
            clean_up_tokenization_spaces: Clean tokenization artifacts
            **generate_kwargs: Generation parameters (max_length, min_length, etc.)

        Returns:
            List of dictionaries with 'summary_text' key
        """
```

### Translation

Translate text between languages using sequence-to-sequence models.

```python { .api }
class TranslationPipeline(Pipeline):
    def __call__(
        self,
        text: Union[str, List[str]],
        return_text: bool = True,
        clean_up_tokenization_spaces: bool = False,
        **generate_kwargs
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Translate input text.

        Args:
            text: Text(s) to translate
            return_text: Return translated text
            clean_up_tokenization_spaces: Clean tokenization artifacts
            **generate_kwargs: Generation parameters

        Returns:
            List of dictionaries with 'translation_text' key
        """
```
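
Summarization and translation return the same list-of-dicts shape under different keys (`summary_text` and `translation_text`). The sketch below shows one way to unpack either; `extract_text` and `summarize` are hypothetical helpers, and the lazy import means nothing is downloaded until `summarize` is actually called.

```python
from typing import Dict, List


def extract_text(results: List[Dict[str, str]], key: str) -> List[str]:
    """Pull the generated strings out of a list of result dictionaries."""
    return [item[key] for item in results]


def summarize(texts: List[str], **generate_kwargs) -> List[str]:
    """Summarize a batch of texts with the task's default model (assumed available)."""
    from transformers import pipeline  # lazy import: the model loads only when called

    summarizer = pipeline("summarization")
    return extract_text(summarizer(texts, **generate_kwargs), "summary_text")
```

For example, `summarize(articles, max_length=60, min_length=20)` would forward the two length limits as generation parameters, where `articles` is a list of strings supplied by the caller.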

### Image Classification

Classify images into predefined categories with confidence scores.

```python { .api }
class ImageClassificationPipeline(Pipeline):
    def __call__(
        self,
        images: Union[str, "PIL.Image", List],
        top_k: int = 5
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Classify input image(s).

        Args:
            images: Image path, PIL Image, or list of images
            top_k: Number of top predictions to return

        Returns:
            List of dictionaries with 'label' and 'score' keys
        """
```

### Object Detection

Detect and locate objects in images with bounding boxes and confidence scores.

```python { .api }
class ObjectDetectionPipeline(Pipeline):
    def __call__(
        self,
        images: Union[str, "PIL.Image", List],
        threshold: float = 0.9
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Detect objects in image(s).

        Args:
            images: Image path, PIL Image, or list of images
            threshold: Confidence threshold for detections

        Returns:
            List of dictionaries with 'score', 'label', 'box' keys
        """
```
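
Detection results can be post-filtered in plain Python, for instance with a stricter cut-off than the one used at call time. This is a sketch on hand-written data; it assumes each `box` is a dict with `xmin`, `ymin`, `xmax`, and `ymax` pixel coordinates, and `filter_detections` and `box_area` are hypothetical helpers.

```python
from typing import Dict, List

# Hand-written stand-in for pipeline output (assumed shape, invented values).
detections = [
    {"score": 0.97, "label": "cat", "box": {"xmin": 10, "ymin": 20, "xmax": 110, "ymax": 220}},
    {"score": 0.42, "label": "dog", "box": {"xmin": 5, "ymin": 5, "xmax": 50, "ymax": 60}},
]


def filter_detections(found: List[dict], threshold: float) -> List[dict]:
    """Keep only detections whose score is strictly above the threshold."""
    return [det for det in found if det["score"] > threshold]


def box_area(box: Dict[str, int]) -> int:
    """Area of a box in square pixels; degenerate boxes count as zero."""
    width = max(0, box["xmax"] - box["xmin"])
    height = max(0, box["ymax"] - box["ymin"])
    return width * height


kept = filter_detections(detections, threshold=0.9)
```

Filtering after the call is useful when one run should feed several consumers with different confidence requirements.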

### Automatic Speech Recognition

Convert speech audio to text with support for various audio formats and languages.

```python { .api }
class AutomaticSpeechRecognitionPipeline(Pipeline):
    def __call__(
        self,
        inputs: Union[np.ndarray, bytes, str],
        return_timestamps: Union[bool, str] = False,
        generate_kwargs: Dict = None
    ) -> Union[Dict, List[Dict]]:
        """
        Transcribe speech to text.

        Args:
            inputs: Audio array, bytes, or file path
            return_timestamps: Include timestamps (True, or a granularity such as "word")
            generate_kwargs: Additional generation parameters

        Returns:
            Dictionary with 'text' key and optional timestamps
        """
```
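
When timestamps are requested, the result can be turned into readable lines. The sketch below assumes the timestamps arrive as a `chunks` list whose items have a `text` string and a `timestamp` pair of `(start, end)` seconds, with `end` possibly missing for the last chunk; the sample data is invented and `format_chunks` is a hypothetical helper.

```python
from typing import List, Optional


def _seconds(value: Optional[float]) -> str:
    """Render a timestamp, tolerating a missing end time."""
    return "?" if value is None else f"{value:.2f}s"


def format_chunks(result: dict) -> List[str]:
    """Produce one '[start - end] text' line per timestamped chunk."""
    lines = []
    for chunk in result.get("chunks", []):
        start, end = chunk["timestamp"]
        lines.append(f"[{_seconds(start)} - {_seconds(end)}] {chunk['text'].strip()}")
    return lines


# Hand-written stand-in for a transcription result (assumed shape).
sample = {
    "text": " hello world",
    "chunks": [
        {"text": " hello", "timestamp": (0.0, 0.5)},
        {"text": " world", "timestamp": (0.5, None)},
    ],
}
lines = format_chunks(sample)
```

A result without timestamps simply yields an empty list, so the helper is safe to call regardless of how the pipeline was invoked.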

### Zero-Shot Classification

Classify text into arbitrary categories without task-specific training.

```python { .api }
class ZeroShotClassificationPipeline(Pipeline):
    def __call__(
        self,
        sequences: Union[str, List[str]],
        candidate_labels: List[str],
        hypothesis_template: str = "This example is {}.",
        multi_label: bool = False
    ) -> Union[Dict, List[Dict]]:
        """
        Classify text into arbitrary categories.

        Args:
            sequences: Text(s) to classify
            candidate_labels: Possible classification labels
            hypothesis_template: Template for label hypotheses
            multi_label: Allow multiple labels per input

        Returns:
            Dictionary with 'sequence', 'labels', 'scores' keys
        """
```

Usage example:
```python
from transformers import pipeline

classifier = pipeline("zero-shot-classification")
result = classifier(
    "This is a movie review about a great film.",
    candidate_labels=["movie", "sports", "technology", "politics"]
)
# Illustrative output (labels are sorted by descending score):
# {
#     'sequence': 'This is a movie review about a great film.',
#     'labels': ['movie', 'technology', 'politics', 'sports'],
#     'scores': [0.85, 0.08, 0.04, 0.03]
# }
```
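
The `hypothesis_template` is filled in once per candidate label, giving one premise and hypothesis pair per label for the underlying entailment model to score. The following sketches only that pairing step, not the library's code; `build_nli_pairs` is a hypothetical helper.

```python
from typing import List, Tuple


def build_nli_pairs(
    sequence: str,
    candidate_labels: List[str],
    hypothesis_template: str = "This example is {}.",
) -> List[Tuple[str, str]]:
    """Pair the input text with one filled-in hypothesis per candidate label."""
    return [(sequence, hypothesis_template.format(label)) for label in candidate_labels]


pairs = build_nli_pairs("A great film.", ["movie", "sports"])
custom = build_nli_pairs("A great film.", ["movie"], "The topic of this text is {}.")
```

A template that reads naturally with the chosen labels tends to matter, since the model judges each hypothesis as an ordinary sentence.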

### Fill Mask

Predict masked tokens in text using masked language models.

```python { .api }
class FillMaskPipeline(Pipeline):
    def __call__(
        self,
        inputs: Union[str, List[str]],
        top_k: int = 5
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Fill masked tokens in text.

        Args:
            inputs: Text containing the model's mask token (e.g. [MASK] or <mask>,
                depending on the tokenizer), or list of such texts
            top_k: Number of predictions per mask

        Returns:
            List of dictionaries with 'score', 'token', 'token_str', 'sequence' keys
        """
```
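
Because the mask token differs between tokenizers, one option is to write inputs with a neutral placeholder and substitute the real token at call time. This is a minimal sketch of that idea; the `{mask}` placeholder, `with_mask`, and `fill` are hypothetical, and the lazy import means the default model is loaded only when `fill` is called.

```python
from typing import Dict, List


def with_mask(template: str, mask_token: str) -> str:
    """Swap the '{mask}' placeholder for a tokenizer-specific mask token."""
    # str.replace is used instead of str.format so other braces in the text are left alone.
    return template.replace("{mask}", mask_token)


def fill(template: str, top_k: int = 5) -> List[Dict]:
    """Run fill-mask on a template, using whatever mask token the loaded tokenizer defines."""
    from transformers import pipeline  # lazy import: the model loads only when called

    unmasker = pipeline("fill-mask")
    return unmasker(with_mask(template, unmasker.tokenizer.mask_token), top_k=top_k)
```

For example, `fill("Paris is the {mask} of France.", top_k=3)` would ask for the three most likely completions without hard-coding `[MASK]` or `<mask>`.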

### Image Text To Text

Generate text descriptions from images with optional text prompts, supporting multimodal understanding tasks.

```python { .api }
class ImageTextToTextPipeline(Pipeline):
    def __call__(
        self,
        images,
        prompt: str = None,
        **kwargs
    ) -> Union[str, List[str]]:
        """
        Generate text from images with optional prompts.

        Args:
            images: Single image or list of images (PIL, numpy array, or paths)
            prompt: Optional text prompt to guide generation

        Returns:
            Generated text string or list of strings
        """
```

### Video Classification

Classify video content into predefined categories with temporal understanding.

```python { .api }
class VideoClassificationPipeline(Pipeline):
    def __call__(
        self,
        videos,
        top_k: int = 5
    ) -> Union[List[Dict], List[List[Dict]]]:
        """
        Classify video content.

        Args:
            videos: Video file path(s) or video tensor(s)
            top_k: Number of top predictions to return

        Returns:
            List of classification results with 'label' and 'score'
        """
```

### Depth Estimation

Estimate depth information from single images for 3D scene understanding.

```python { .api }
class DepthEstimationPipeline(Pipeline):
    def __call__(
        self,
        images
    ) -> Union[Dict, List[Dict]]:
        """
        Estimate depth from images.

        Args:
            images: Single image or list of images

        Returns:
            Dictionary with 'predicted_depth' and 'depth' keys
        """
```

### Conversational

Engage in multi-turn conversations with context-aware response generation.

```python { .api }
class ConversationalPipeline(Pipeline):
    def __call__(
        self,
        conversations,
        clean_up_tokenization_spaces: bool = False,
        **generate_kwargs
    ) -> Union[Conversation, List[Conversation]]:
        """
        Generate conversational responses.

        Args:
            conversations: Conversation object(s) with history
            clean_up_tokenization_spaces: Remove extra spaces in output
            **generate_kwargs: Additional generation parameters

        Returns:
            Updated Conversation object(s) with new responses
        """
```

## Pipeline Base Class

All pipelines inherit from the base Pipeline class:

```python { .api }
class Pipeline:
    def __init__(
        self,
        model: PreTrainedModel,
        tokenizer: PreTrainedTokenizer = None,
        feature_extractor = None,
        modelcard: ModelCard = None,
        framework: str = None,
        task: str = "",
        args_parser = None,
        device: int = -1,
        torch_dtype = None,
        binary_output: bool = False
    )

    def save_pretrained(
        self,
        save_directory: str,
        safe_serialization: bool = True,
        **kwargs
    ) -> None:
        """Save pipeline components to directory."""

    def __call__(self, inputs, **kwargs):
        """Process inputs through the pipeline."""

    def predict(self, inputs, **kwargs):
        """Alias for __call__."""

    def transform(self, inputs, **kwargs):
        """Alias for __call__."""
```
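
The call flow shared by all pipelines (prepare the input, run the model, shape the output) can be pictured with a toy class. This is a pure-Python illustration under that assumption, not a `transformers` class: `ToyPipeline`, its stage methods, and its length-based "model" are all invented for the example.

```python
from typing import Dict, List, Union


class ToyPipeline:
    """Stand-in that mirrors the call flow of a pipeline without any model."""

    def preprocess(self, text: str) -> List[str]:
        # Turn raw input into model-ready features (here: lowercase tokens).
        return text.lower().split()

    def forward(self, tokens: List[str]) -> Dict[str, int]:
        # A real pipeline would run the model here; this just counts tokens.
        return {"n_tokens": len(tokens)}

    def postprocess(self, outputs: Dict[str, int]) -> Dict[str, Union[str, float]]:
        # Convert raw outputs into the familiar label/score dictionary.
        label = "LONG" if outputs["n_tokens"] > 3 else "SHORT"
        return {"label": label, "score": 1.0}

    def __call__(self, inputs: Union[str, List[str]]):
        if isinstance(inputs, list):
            return [self(item) for item in inputs]  # a list in gives a list out
        return self.postprocess(self.forward(self.preprocess(inputs)))

    # Aliases, matching the base class's predict/transform convention.
    predict = __call__
    transform = __call__


toy = ToyPipeline()
single = toy("a b")
batch = toy.predict(["a b c d e", "x"])
```

Keeping the three stages separate is what lets one calling convention serve very different tasks: only the stage bodies change.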

## Available Pipeline Tasks

Supported pipeline tasks, grouped by modality:

- **Text**: "text-classification", "token-classification", "question-answering", "fill-mask", "summarization", "translation", "text2text-generation", "text-generation", "zero-shot-classification", "conversational"
- **Vision**: "image-classification", "image-segmentation", "image-to-text", "image-to-image", "object-detection", "depth-estimation", "zero-shot-image-classification", "zero-shot-object-detection", "keypoint-matching", "mask-generation"
- **Audio**: "automatic-speech-recognition", "audio-classification", "text-to-audio", "zero-shot-audio-classification"
- **Video**: "video-classification"
- **Multimodal**: "visual-question-answering", "document-question-answering", "image-text-to-text", "feature-extraction"

When no model is specified, each task falls back to a task-specific default model.