Efficient few-shot learning with Sentence Transformers
npx @tessl/cli install tessl/pypi-setfit@1.1.00
# SetFit

SetFit is an efficient, prompt-free framework for few-shot fine-tuning of Sentence Transformers that achieves high accuracy with minimal labeled data. It eliminates the need for handcrafted prompts by generating rich embeddings directly from text examples, trains significantly faster than large-scale models like T0 or GPT-3, and supports multilingual classification when paired with a multilingual Sentence Transformer model.

## Package Information

- **Package Name**: setfit
- **Language**: Python
- **Installation**: `pip install setfit`

## Core Imports

```python
import setfit
```

Common imports for working with SetFit models:

```python
from setfit import SetFitModel, Trainer, TrainingArguments
```

## Basic Usage

```python
from setfit import SetFitModel, Trainer, TrainingArguments
from datasets import Dataset

# Prepare your few-shot dataset
train_texts = [
    "I love this movie!",
    "This film is terrible.",
    "Amazing cinematography!",
    "Waste of time.",
]
train_labels = [1, 0, 1, 0]  # 1 = positive, 0 = negative

train_dataset = Dataset.from_dict({
    "text": train_texts,
    "label": train_labels,
})

# Initialize a SetFit model from a pre-trained Sentence Transformer
model = SetFitModel.from_pretrained("sentence-transformers/paraphrase-mpnet-base-v2")

# Create trainer with training arguments
args = TrainingArguments(
    batch_size=16,
    num_epochs=4,
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    column_mapping={"text": "text", "label": "label"},
)

# Train the model
trainer.train()

# Make predictions
predictions = model.predict([
    "This movie is fantastic!",
    "I didn't enjoy this film.",
])
print(predictions)  # e.g. tensor([1, 0])

# Get prediction probabilities
probs = model.predict_proba([
    "This movie is fantastic!",
    "I didn't enjoy this film.",
])
print(probs)  # e.g. tensor([[0.1, 0.9], [0.8, 0.2]]); values are illustrative
```

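The relationship between the two outputs can be sketched without loading a model. For a single-label classifier whose classes are numbered from 0, the predicted label is the index of the largest probability in each row. The snippet below is a minimal illustration using the example probabilities from the usage code; `probs_to_labels` is a hypothetical helper, not part of SetFit.

```python
from typing import List


def probs_to_labels(probs: List[List[float]]) -> List[int]:
    """Return the index of the highest probability in each row (hypothetical helper)."""
    return [max(range(len(row)), key=row.__getitem__) for row in probs]


# Illustrative values taken from the usage example, not real model output
example_probs = [[0.1, 0.9], [0.8, 0.2]]
print(probs_to_labels(example_probs))  # [1, 0]
```
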
## Architecture

SetFit combines two model components with a two-phase training procedure:

- **Sentence Transformer Body**: Generates semantic embeddings from text inputs using a pre-trained model
- **Classification Head**: Either a scikit-learn `LogisticRegression` (default) or a differentiable PyTorch head for end-to-end training
- **Contrastive Learning**: Fine-tunes the Sentence Transformer body on positive pairs (same label) and negative pairs (different labels) built from the labeled examples
- **Training Pipeline**: Two phases: first fine-tune the embeddings, then train the classification head on them

This design lets SetFit reach strong performance with minimal training data (as few as 8 examples per class) while training much faster than large generative models.

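To make the contrastive step concrete, here is a minimal sketch of how positive and negative sentence pairs could be built from a handful of labeled examples. It enumerates every pair exhaustively, which is an assumption made for clarity; SetFit's own pair sampler differs in detail, and `make_contrastive_pairs` is a hypothetical helper.

```python
from itertools import combinations
from typing import List, Sequence, Tuple


def make_contrastive_pairs(
    texts: Sequence[str], labels: Sequence[int]
) -> List[Tuple[str, str, float]]:
    """Pair every two examples; the target is 1.0 if their labels match, else 0.0."""
    pairs = []
    for (text_a, label_a), (text_b, label_b) in combinations(zip(texts, labels), 2):
        pairs.append((text_a, text_b, 1.0 if label_a == label_b else 0.0))
    return pairs


texts = [
    "I love this movie!",
    "This film is terrible.",
    "Amazing cinematography!",
    "Waste of time.",
]
labels = [1, 0, 1, 0]

pairs = make_contrastive_pairs(texts, labels)
print(len(pairs))  # 6
for text_a, text_b, target in pairs:
    print(f"{target:.0f}  {text_a!r} / {text_b!r}")
```

Four labeled examples already yield six training pairs, and the number of pairs grows quadratically with the number of examples. This is one reason a few examples per class can provide enough signal to fine-tune the embedding model.
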
## Capabilities

### Core Model and Training

Main model classes and training functionality for few-shot text classification with sentence transformers.

```python { .api }
class SetFitModel:
    def __init__(
        self,
        model_body: Optional[SentenceTransformer] = None,
        model_head: Optional[Union[SetFitHead, LogisticRegression]] = None,
        multi_target_strategy: Optional[str] = None,
        normalize_embeddings: bool = False,
        labels: Optional[List[str]] = None,
        model_card_data: Optional[SetFitModelCardData] = None,
        sentence_transformers_kwargs: Optional[Dict] = None
    ): ...
    def fit(
        self,
        x_train: List[str],
        y_train: Union[List[int], List[List[int]]],
        num_epochs: int,
        batch_size: Optional[int] = None,
        body_learning_rate: Optional[float] = None,
        head_learning_rate: Optional[float] = None,
        end_to_end: bool = False,
        l2_weight: Optional[float] = None,
        max_length: Optional[int] = None,
        show_progress_bar: bool = True
    ): ...
    def predict(
        self,
        inputs: Union[str, List[str]],
        batch_size: int = 32,
        as_numpy: bool = False,
        use_labels: bool = True,
        show_progress_bar: Optional[bool] = None
    ): ...
    def predict_proba(
        self,
        inputs: Union[str, List[str]],
        batch_size: int = 32,
        as_numpy: bool = False,
        show_progress_bar: Optional[bool] = None
    ): ...
    def encode(
        self,
        inputs: List[str],
        batch_size: int = 32,
        show_progress_bar: Optional[bool] = None
    ): ...

class Trainer:  # SetFitTrainer is the deprecated pre-1.0 name
    def __init__(
        self,
        model: Optional[SetFitModel] = None,
        args: Optional[TrainingArguments] = None,
        train_dataset: Optional[Dataset] = None,
        eval_dataset: Optional[Dataset] = None,
        model_init: Optional[Callable[[], SetFitModel]] = None,
        metric: Union[str, Callable] = "accuracy",
        metric_kwargs: Optional[Dict[str, Any]] = None,
        callbacks: Optional[List] = None,
        column_mapping: Optional[Dict[str, str]] = None
    ): ...
    def train(self): ...
    def evaluate(self, eval_dataset: Optional[Dataset] = None): ...
    def predict(self, test_dataset: Dataset): ...

class TrainingArguments:
    def __init__(
        self,
        output_dir: str = "./results",
        batch_size: int = 16,
        num_epochs: Union[int, Tuple[int, int]] = 1,
        max_steps: Union[int, Tuple[int, int]] = -1,
        sampling_strategy: str = "oversampling",
        learning_rate: Union[float, Tuple[float, float]] = 2e-5,
        eval_strategy: str = "no",
        save_strategy: str = "steps"
        # ... more parameters available
    ): ...
```

[Core Model and Training](./core-model-training.md)

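Several `TrainingArguments` fields listed here accept either a single value or a two-element tuple. In SetFit the tuple form gives separate values for the embedding fine-tuning phase and the classifier training phase. The snippet below is a minimal sketch of that convention, assuming a scalar applies to both phases; `per_phase` is a hypothetical helper, not a SetFit function.

```python
from typing import Tuple, TypeVar, Union

Number = TypeVar("Number", int, float)


def per_phase(value: Union[Number, Tuple[Number, Number]]) -> Tuple[Number, Number]:
    """Expand a scalar to (embedding_phase, classifier_phase); pass tuples through."""
    if isinstance(value, tuple):
        return value
    return (value, value)


print(per_phase(1))        # (1, 1)
print(per_phase((1, 16)))  # (1, 16)
print(per_phase(2e-5))     # (2e-05, 2e-05)
```
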
### Data Utilities

Dataset preparation, sampling, and templating utilities for few-shot learning scenarios.

```python { .api }
def sample_dataset(
    dataset: Dataset,
    label_column: str = "label",
    num_samples: int = 8,
    seed: int = 42
) -> Dataset: ...

def get_templated_dataset(
    dataset: Optional[Dataset] = None,
    candidate_labels: Optional[List[str]] = None,
    reference_dataset: Optional[str] = None,
    template: str = "This example is {}",
    sample_size: int = 2,
    text_column: str = "text",
    label_column: str = "label",
    multi_label: bool = False,
    label_names_column: str = "label_text"
) -> Dataset: ...

def create_fewshot_splits(
    dataset: Dataset,
    sample_sizes: List[int] = [2, 4, 8, 16, 32, 64],
    add_data_augmentation: bool = False,
    dataset_name: Optional[str] = None
) -> DatasetDict: ...
```

[Data Utilities](./data-utilities.md)

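The idea behind `sample_dataset` (draw a fixed number of examples per class) can be sketched in plain Python. This illustrates the concept under simplifying assumptions and is not the library's implementation; `sample_per_class` is a hypothetical helper that works on a list of dicts rather than a `Dataset`.

```python
import random
from collections import defaultdict
from typing import Any, Dict, List


def sample_per_class(
    rows: List[Dict[str, Any]],
    label_column: str = "label",
    num_samples: int = 8,
    seed: int = 42,
) -> List[Dict[str, Any]]:
    """Draw up to `num_samples` rows per label, reproducibly for a given seed."""
    rng = random.Random(seed)

    # Group rows by their label value
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_column]].append(row)

    sampled = []
    for label in sorted(by_label):
        group = by_label[label]
        # Classes with fewer rows than requested contribute all of their rows
        sampled.extend(rng.sample(group, min(num_samples, len(group))))
    return sampled


rows = [{"text": f"example {i}", "label": i % 2} for i in range(20)]
few_shot = sample_per_class(rows, num_samples=4)
print(len(few_shot))  # 8
```
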
### Knowledge Distillation

Teacher-student training framework for model compression and efficiency improvements.

```python { .api }
class DistillationTrainer:
    def __init__(
        self,
        teacher_model: SetFitModel,
        student_model: Optional[SetFitModel] = None,
        args: Optional[TrainingArguments] = None,
        train_dataset: Optional[Dataset] = None,
        eval_dataset: Optional[Dataset] = None,
        model_init: Optional[Callable[[], SetFitModel]] = None,
        metric: Union[str, Callable] = "accuracy",
        column_mapping: Optional[Dict[str, str]] = None
    ): ...
    def train(self): ...
    def evaluate(self, eval_dataset: Optional[Dataset] = None): ...
```

[Knowledge Distillation](./knowledge-distillation.md)

### Aspect-Based Sentiment Analysis (ABSA)

Specialized models and trainers for aspect-based sentiment analysis tasks with span-level predictions.

```python { .api }
class AbsaModel:
    def __init__(self, aspect_model=None, polarity_model=None): ...
    def predict(self, inputs): ...

class AspectModel:
    def __init__(self, spacy_model="en_core_web_sm", span_context=0): ...

class PolarityModel:
    def __init__(self, spacy_model="en_core_web_sm", span_context=0): ...
```

[Aspect-Based Sentiment Analysis](./absa.md)

### Model Export and Deployment

Export functionality for ONNX and OpenVINO formats to enable efficient deployment and inference.

```python { .api }
# Note: These functions require explicit imports from submodules:
# from setfit.exporters.onnx import export_onnx
# from setfit.exporters.openvino import export_to_openvino

def export_onnx(
    model_body: SentenceTransformer,
    model_head: Union[torch.nn.Module, LogisticRegression],
    opset: int,
    output_path: str = "model.onnx",
    ignore_ir_version: bool = True,
    use_hummingbird: bool = False
) -> None: ...

def export_to_openvino(
    model: SetFitModel,
    output_path: str = "model.xml"
) -> None: ...
```

[Model Export](./model-export.md)

### Model Cards and Documentation

Automatic model card generation and metadata management for reproducibility and documentation.

```python { .api }
class SetFitModelCardData:
    def __init__(self, language=None, license=None, tags=None, model_name=None): ...
    def set_train_set_metrics(self, metrics): ...
    def generate_model_card(self): ...
```

[Model Cards](./model-cards.md)

## Constants

```python { .api }
__version__: str  # Library version ("1.1.3")
```

## Types

```python { .api }
# Core types used across the API
from typing import List, Dict, Optional, Union, Tuple, Any, Callable
import numpy as np
import torch
from datasets import Dataset, DatasetDict
from sentence_transformers import SentenceTransformer
from sklearn.linear_model import LogisticRegression

# SetFit-specific type aliases
ModelBody = SentenceTransformer
ModelHead = Union[LogisticRegression, "SetFitHead"]
Labels = Union[List[int], List[List[int]]]  # Single or multi-label
PredictionOutput = Union[np.ndarray, List[int]]
ProbabilityOutput = Union[np.ndarray, List[List[float]]]
```
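
The two shapes covered by the `Labels` alias can be told apart by inspecting the first element. The snippet below is a small sketch of that check using only the standard library; `is_multi_label` is a hypothetical helper, not part of SetFit.

```python
from typing import List, Union

# Same shape as the Labels alias in the Types section
Labels = Union[List[int], List[List[int]]]


def is_multi_label(labels: Labels) -> bool:
    """True when each example carries a list of per-class indicators."""
    return len(labels) > 0 and isinstance(labels[0], list)


print(is_multi_label([1, 0, 1, 0]))            # False
print(is_multi_label([[1, 0, 1], [0, 1, 0]]))  # True
```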