# Models

Model abstraction layer supporting 15+ LLM providers, multimodal models, and embedding models with a unified interface. Use custom models for metric evaluation or integrate with existing LLM applications.

## Imports

```python
from deepeval.models import (
    # Base classes
    DeepEvalBaseLLM,
    DeepEvalBaseMLLM,
    DeepEvalBaseEmbeddingModel,
    # LLM implementations
    GPTModel,
    AnthropicModel,
    GeminiModel,
    OllamaModel,
    LocalModel,
    AzureOpenAIModel,
    LiteLLMModel,
    AmazonBedrockModel,
    KimiModel,
    GrokModel,
    DeepSeekModel,
    # Multimodal models
    MultimodalOpenAIModel,
    MultimodalGeminiModel,
    MultimodalOllamaModel,
    # Embedding models
    OpenAIEmbeddingModel,
    AzureOpenAIEmbeddingModel,
    LocalEmbeddingModel,
    OllamaEmbeddingModel,
)
```

## Capabilities

### Base LLM Class

Abstract base class for LLM integrations.

```python { .api }
class DeepEvalBaseLLM:
    """
    Base class for LLM integrations.

    Attributes:
    - model_name (str, optional): Name of the model
    - model (Any): The underlying model instance

    Abstract Methods:
    - load_model(*args, **kwargs): Load the model
    - generate(prompt: str, **kwargs) -> str: Generate text
    - a_generate(prompt: str, **kwargs) -> str: Asynchronously generate text
    - get_model_name() -> str: Get the model name

    Optional Methods:
    - batch_generate(prompts: List[str], **kwargs) -> List[str]: Batch generation
    """
```
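
An implementation that only has a synchronous `generate` can still satisfy the async contract by delegating to a worker thread. A minimal, stdlib-only sketch of that pattern (the `EchoLLM` class and its canned reply are hypothetical stand-ins, not part of deepeval):

```python
import asyncio

class EchoLLM:
    """Toy class following the DeepEvalBaseLLM method shape."""

    def get_model_name(self) -> str:
        return "echo-model"

    def generate(self, prompt: str) -> str:
        # Blocking call in a real integration (HTTP request, local inference).
        return f"echo: {prompt}"

    async def a_generate(self, prompt: str) -> str:
        # Run the blocking generate in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.generate, prompt)

result = asyncio.run(EchoLLM().a_generate("hello"))
print(result)  # echo: hello
```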

### LLM Implementations

#### OpenAI GPT Models

```python { .api }
class GPTModel:
    """
    OpenAI GPT model integration.

    Parameters:
    - model (str, optional): Model name (default: "gpt-4o")
    - api_key (str, optional): OpenAI API key
    - *args, **kwargs: Additional arguments passed to the OpenAI client

    Methods:
    - generate(prompt: str) -> str
    - a_generate(prompt: str) -> str
    - get_model_name() -> str
    """
```

Usage:

```python
from deepeval.models import GPTModel
from deepeval.metrics import AnswerRelevancyMetric

# Use GPT-4 for evaluation
model = GPTModel(model="gpt-4")

metric = AnswerRelevancyMetric(
    threshold=0.7,
    model=model
)
```

#### Anthropic Claude

```python { .api }
class AnthropicModel:
    """
    Anthropic Claude integration.

    Parameters:
    - model (str, optional): Model name (default: "claude-3-5-sonnet-20241022")
    - api_key (str, optional): Anthropic API key
    """
```

#### Google Gemini

```python { .api }
class GeminiModel:
    """
    Google Gemini integration.

    Parameters:
    - model (str, optional): Model name (default: "gemini-2.0-flash-exp")
    - api_key (str, optional): Google API key
    """
```

#### Local/Ollama Models

```python { .api }
class OllamaModel:
    """
    Ollama integration for locally served models.

    Parameters:
    - model (str, optional): Model name (default: "llama3.2")
    - base_url (str, optional): Ollama server URL
    """

class LocalModel:
    """
    Local model integration (e.g., HuggingFace).

    Parameters:
    - model (Any): HuggingFace model or pipeline
    - tokenizer (Any, optional): Tokenizer
    """
```

#### Azure OpenAI

```python { .api }
class AzureOpenAIModel:
    """
    Azure OpenAI integration.

    Parameters:
    - deployment_name (str): Azure deployment name
    - api_key (str, optional): Azure API key
    - azure_endpoint (str, optional): Azure endpoint URL
    - api_version (str, optional): API version
    """
```
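
Azure credentials are usually read from the environment rather than hard-coded. A sketch of assembling the constructor arguments that way (the variable names follow common Azure OpenAI conventions, and the deployment name and endpoint are placeholder assumptions, not values deepeval requires):

```python
import os

# Placeholder values for illustration; in practice these come from your environment.
os.environ.setdefault("AZURE_OPENAI_API_KEY", "placeholder-key")
os.environ.setdefault("AZURE_OPENAI_ENDPOINT", "https://my-resource.openai.azure.com/")

config = {
    "deployment_name": "gpt-4o-eval",  # hypothetical deployment name
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_version": "2024-02-01",
}
# model = AzureOpenAIModel(**config)  # uncomment with real credentials
print(config["deployment_name"])  # gpt-4o-eval
```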

#### Other Providers

```python { .api }
class LiteLLMModel:
    """
    LiteLLM integration providing a unified API across providers.

    Parameters:
    - model (str): Model name (e.g., "anthropic/claude-3-opus")
    """

class AmazonBedrockModel:
    """Amazon Bedrock integration."""

class KimiModel:
    """Kimi model integration."""

class GrokModel:
    """Grok model integration."""

class DeepSeekModel:
    """DeepSeek model integration."""
```
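
LiteLLM routes requests based on a `provider/model` identifier like the example above. A small stdlib sketch of how such a string splits into its parts (illustrative only; LiteLLM does its own parsing, and bare names without a prefix are also valid there):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a LiteLLM-style "provider/model" identifier.

    Bare names (no slash) are returned with an empty provider.
    """
    provider, sep, model = model_id.partition("/")
    if not sep:
        return "", model_id
    return provider, model

print(split_model_id("anthropic/claude-3-opus"))  # ('anthropic', 'claude-3-opus')
```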

### Multimodal LLM Class

```python { .api }
class DeepEvalBaseMLLM:
    """
    Base class for multimodal LLM integrations.

    Abstract Methods:
    - generate(messages: List, **kwargs) -> str: Generate from multimodal input
    - a_generate(messages: List, **kwargs) -> str: Asynchronously generate
    - get_model_name() -> str: Get the model name
    """

class MultimodalOpenAIModel:
    """
    OpenAI multimodal integration (GPT-4o, GPT-4V, etc.).

    Parameters:
    - model (str, optional): Model name (default: "gpt-4o")
    """

class MultimodalGeminiModel:
    """Gemini multimodal integration."""

class MultimodalOllamaModel:
    """Ollama multimodal integration."""
```
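
Multimodal input interleaves text and image parts in a single message. A stdlib sketch of assembling one such message in the OpenAI-style content-parts format (the wire format shown is the provider's convention, included here as an assumption; deepeval's `MLLMTestCase` handles this for you):

```python
def build_multimodal_message(text: str, image_url: str) -> dict:
    """Assemble one user message mixing text and an image reference
    (OpenAI-style content-parts format, shown as an assumption)."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = build_multimodal_message("Describe this image:", "https://example.com/photo.jpg")
print(msg["content"][0]["text"])  # Describe this image:
```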

### Embedding Models

```python { .api }
class DeepEvalBaseEmbeddingModel:
    """
    Base class for embedding model integrations.

    Abstract Methods:
    - embed_text(text: str) -> List[float]: Embed a single text
    - a_embed_text(text: str) -> List[float]: Asynchronously embed a single text
    - embed_texts(texts: List[str]) -> List[List[float]]: Embed multiple texts
    - a_embed_texts(texts: List[str]) -> List[List[float]]: Asynchronously embed multiple texts
    - get_model_name() -> str: Get the model name
    """

class OpenAIEmbeddingModel:
    """
    OpenAI embeddings integration.

    Parameters:
    - model (str, optional): Model name (default: "text-embedding-3-small")
    """

class AzureOpenAIEmbeddingModel:
    """Azure OpenAI embeddings integration."""

class LocalEmbeddingModel:
    """Local embedding model integration."""

class OllamaEmbeddingModel:
    """Ollama embeddings integration."""
```
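
When a provider has no batch endpoint, the multi-text methods can simply loop over the single-text method. A toy, stdlib-only sketch of that fallback pattern (the deterministic pseudo-embedding in `embed_text` is a stand-in for a real API call, and `ToyEmbedder` is hypothetical, not part of deepeval):

```python
from typing import List

class ToyEmbedder:
    """Follows the DeepEvalBaseEmbeddingModel method shape with a fake embedding."""

    def get_model_name(self) -> str:
        return "toy-embedder"

    def embed_text(self, text: str) -> List[float]:
        # Stand-in for a real embedding call: four deterministic pseudo-features.
        return [float(len(text)), float(text.count(" ")),
                float(sum(map(ord, text)) % 100), 1.0]

    def embed_texts(self, texts: List[str]) -> List[List[float]]:
        # Fallback batching: one single-text call per input.
        return [self.embed_text(t) for t in texts]

vectors = ToyEmbedder().embed_texts(["hello world", "hi"])
print(len(vectors), len(vectors[0]))  # 2 4
```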

## Usage Examples

### Using Custom Models for Metrics

```python
from deepeval.models import GPTModel, AnthropicModel
from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric

# Use GPT-4 for one metric
gpt4_metric = AnswerRelevancyMetric(
    model=GPTModel(model="gpt-4"),
    threshold=0.7
)

# Use Claude for another
claude_metric = FaithfulnessMetric(
    model=AnthropicModel(model="claude-3-5-sonnet-20241022"),
    threshold=0.8
)
```

### Using Local Models

```python
from deepeval.models import OllamaModel
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams

# Use a local Llama model for evaluation
local_model = OllamaModel(
    model="llama3.2",
    base_url="http://localhost:11434"
)

metric = GEval(
    name="Quality",
    criteria="Evaluate response quality",
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],
    model=local_model
)
```

### Creating Custom Model Integration

```python
import requests

from deepeval.models import DeepEvalBaseLLM
from deepeval.metrics import AnswerRelevancyMetric

class CustomModel(DeepEvalBaseLLM):
    def __init__(self, api_endpoint: str):
        self.api_endpoint = api_endpoint
        self.model_name = "custom-model-v1"

    def load_model(self):
        # Initialize your model
        pass

    def generate(self, prompt: str) -> str:
        # Call your model API
        response = requests.post(
            self.api_endpoint,
            json={"prompt": prompt}
        )
        return response.json()["output"]

    async def a_generate(self, prompt: str) -> str:
        # Falls back to the synchronous call; use an async HTTP
        # client (e.g., aiohttp) for true concurrency.
        return self.generate(prompt)

    def get_model_name(self) -> str:
        return self.model_name

# Use the custom model
custom_model = CustomModel(api_endpoint="https://api.example.com/generate")
metric = AnswerRelevancyMetric(model=custom_model)
```

### Multimodal Models

```python
from deepeval.models import MultimodalOpenAIModel
from deepeval.metrics import MultimodalGEval
from deepeval.test_case import MLLMTestCase, MLLMImage, MLLMTestCaseParams

# Use GPT-4o for multimodal evaluation
mllm = MultimodalOpenAIModel(model="gpt-4o")

metric = MultimodalGEval(
    name="Image Description Quality",
    criteria="Evaluate if the description accurately represents the image",
    evaluation_params=[MLLMTestCaseParams.INPUT, MLLMTestCaseParams.ACTUAL_OUTPUT],
    model=mllm
)

test_case = MLLMTestCase(
    input=["Describe this image:", MLLMImage(url="photo.jpg", local=True)],
    actual_output=["A golden retriever playing in a park"]
)

metric.measure(test_case)
```