# Models

Model abstraction layer supporting 15+ LLM providers, multimodal models, and embedding models with a unified interface. Use custom models for metric evaluation or integrate with existing LLM applications.

## Imports

```python
from deepeval.models import (
    # Base classes
    DeepEvalBaseLLM,
    DeepEvalBaseMLLM,
    DeepEvalBaseEmbeddingModel,
    # LLM implementations
    GPTModel,
    AnthropicModel,
    GeminiModel,
    OllamaModel,
    LocalModel,
    AzureOpenAIModel,
    LiteLLMModel,
    AmazonBedrockModel,
    KimiModel,
    GrokModel,
    DeepSeekModel,
    # Multimodal models
    MultimodalOpenAIModel,
    MultimodalGeminiModel,
    MultimodalOllamaModel,
    # Embedding models
    OpenAIEmbeddingModel,
    AzureOpenAIEmbeddingModel,
    LocalEmbeddingModel,
    OllamaEmbeddingModel,
)
```

## Capabilities

### Base LLM Class

Abstract base class for LLM integrations.

```python { .api }
class DeepEvalBaseLLM:
    """
    Base class for LLM integrations.

    Attributes:
    - model_name (str, optional): Name of the model
    - model (Any): The underlying model instance

    Abstract Methods:
    - load_model(*args, **kwargs): Load the model
    - generate(prompt: str, **kwargs) -> str: Generate text
    - a_generate(prompt: str, **kwargs) -> str: Asynchronously generate text
    - get_model_name() -> str: Get the model name

    Optional Methods:
    - batch_generate(prompts: List[str], **kwargs) -> List[str]: Batch generation
    """
```
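
An implementation that only has a synchronous `generate` can still satisfy the async contract by delegating to a worker thread. A minimal, stdlib-only sketch of that pattern (the `EchoLLM` class and its canned reply are hypothetical stand-ins, not part of deepeval):

```python
import asyncio

class EchoLLM:
    """Toy class following the DeepEvalBaseLLM method shape."""

    def get_model_name(self) -> str:
        return "echo-model"

    def generate(self, prompt: str) -> str:
        # Blocking call in a real integration (HTTP request, local inference).
        return f"echo: {prompt}"

    async def a_generate(self, prompt: str) -> str:
        # Run the blocking generate in a worker thread so the event loop stays free.
        return await asyncio.to_thread(self.generate, prompt)

result = asyncio.run(EchoLLM().a_generate("hello"))
print(result)  # echo: hello
```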

### LLM Implementations

#### OpenAI GPT Models

```python { .api }
class GPTModel:
    """
    OpenAI GPT model integration.

    Parameters:
    - model (str, optional): Model name (default: "gpt-4o")
    - api_key (str, optional): OpenAI API key
    - *args, **kwargs: Additional arguments passed to the OpenAI client

    Methods:
    - generate(prompt: str) -> str
    - a_generate(prompt: str) -> str
    - get_model_name() -> str
    """
```

Usage:

```python
from deepeval.models import GPTModel
from deepeval.metrics import AnswerRelevancyMetric

# Use GPT-4 for evaluation
model = GPTModel(model="gpt-4")

metric = AnswerRelevancyMetric(
    threshold=0.7,
    model=model
)
```

#### Anthropic Claude

```python { .api }
class AnthropicModel:
    """
    Anthropic Claude integration.

    Parameters:
    - model (str, optional): Model name (default: "claude-3-5-sonnet-20241022")
    - api_key (str, optional): Anthropic API key
    """
```

#### Google Gemini

```python { .api }
class GeminiModel:
    """
    Google Gemini integration.

    Parameters:
    - model (str, optional): Model name (default: "gemini-2.0-flash-exp")
    - api_key (str, optional): Google API key
    """
```

#### Local/Ollama Models

```python { .api }
class OllamaModel:
    """
    Ollama integration for locally served models.

    Parameters:
    - model (str, optional): Model name (default: "llama3.2")
    - base_url (str, optional): Ollama server URL
    """

class LocalModel:
    """
    Local model integration (e.g., HuggingFace).

    Parameters:
    - model (Any): HuggingFace model or pipeline
    - tokenizer (Any, optional): Tokenizer
    """
```

#### Azure OpenAI

```python { .api }
class AzureOpenAIModel:
    """
    Azure OpenAI integration.

    Parameters:
    - deployment_name (str): Azure deployment name
    - api_key (str, optional): Azure API key
    - azure_endpoint (str, optional): Azure endpoint URL
    - api_version (str, optional): API version
    """
```
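
Azure credentials are usually read from the environment rather than hard-coded. A sketch of assembling the constructor arguments that way (the variable names follow common Azure OpenAI conventions, and the deployment name and endpoint are placeholder assumptions, not values deepeval requires):

```python
import os

# Placeholder values for illustration; in practice these come from your environment.
os.environ.setdefault("AZURE_OPENAI_API_KEY", "placeholder-key")
os.environ.setdefault("AZURE_OPENAI_ENDPOINT", "https://my-resource.openai.azure.com/")

config = {
    "deployment_name": "gpt-4o-eval",  # hypothetical deployment name
    "api_key": os.environ["AZURE_OPENAI_API_KEY"],
    "azure_endpoint": os.environ["AZURE_OPENAI_ENDPOINT"],
    "api_version": "2024-02-01",
}
# model = AzureOpenAIModel(**config)  # uncomment with real credentials
print(config["deployment_name"])  # gpt-4o-eval
```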

#### Other Providers

```python { .api }
class LiteLLMModel:
    """
    LiteLLM integration providing a unified API across providers.

    Parameters:
    - model (str): Model name (e.g., "anthropic/claude-3-opus")
    """

class AmazonBedrockModel:
    """Amazon Bedrock integration."""

class KimiModel:
    """Kimi model integration."""

class GrokModel:
    """Grok model integration."""

class DeepSeekModel:
    """DeepSeek model integration."""
```
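
LiteLLM routes requests based on a `provider/model` identifier like the example above. A small stdlib sketch of how such a string splits into its parts (illustrative only; LiteLLM does its own parsing, and bare names without a prefix are also valid there):

```python
def split_model_id(model_id: str) -> tuple[str, str]:
    """Split a LiteLLM-style "provider/model" identifier.

    Bare names (no slash) are returned with an empty provider.
    """
    provider, sep, model = model_id.partition("/")
    if not sep:
        return "", model_id
    return provider, model

print(split_model_id("anthropic/claude-3-opus"))  # ('anthropic', 'claude-3-opus')
```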

### Multimodal LLM Class

```python { .api }
class DeepEvalBaseMLLM:
    """
    Base class for multimodal LLM integrations.

    Abstract Methods:
    - generate(messages: List, **kwargs) -> str: Generate from multimodal input
    - a_generate(messages: List, **kwargs) -> str: Asynchronously generate
    - get_model_name() -> str: Get the model name
    """

class MultimodalOpenAIModel:
    """
    OpenAI multimodal integration (GPT-4o, GPT-4V, etc.).

    Parameters:
    - model (str, optional): Model name (default: "gpt-4o")
    """

class MultimodalGeminiModel:
    """Gemini multimodal integration."""

class MultimodalOllamaModel:
    """Ollama multimodal integration."""
```
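
Multimodal input interleaves text and image parts in a single message. A stdlib sketch of assembling one such message in the OpenAI-style content-parts format (the wire format shown is the provider's convention, included here as an assumption; deepeval's `MLLMTestCase` handles this for you):

```python
def build_multimodal_message(text: str, image_url: str) -> dict:
    """Assemble one user message mixing text and an image reference
    (OpenAI-style content-parts format, shown as an assumption)."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

msg = build_multimodal_message("Describe this image:", "https://example.com/photo.jpg")
print(msg["content"][0]["text"])  # Describe this image:
```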

### Embedding Models

```python { .api }
class DeepEvalBaseEmbeddingModel:
    """
    Base class for embedding model integrations.

    Abstract Methods:
    - embed_text(text: str) -> List[float]: Embed a single text
    - a_embed_text(text: str) -> List[float]: Asynchronously embed a single text
    - embed_texts(texts: List[str]) -> List[List[float]]: Embed multiple texts
    - a_embed_texts(texts: List[str]) -> List[List[float]]: Asynchronously embed multiple texts
    - get_model_name() -> str: Get the model name
    """

class OpenAIEmbeddingModel:
    """
    OpenAI embeddings integration.

    Parameters:
    - model (str, optional): Model name (default: "text-embedding-3-small")
    """

class AzureOpenAIEmbeddingModel:
    """Azure OpenAI embeddings integration."""

class LocalEmbeddingModel:
    """Local embedding model integration."""

class OllamaEmbeddingModel:
    """Ollama embeddings integration."""
```
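
When a provider has no batch endpoint, the multi-text methods can simply loop over the single-text method. A toy, stdlib-only sketch of that fallback pattern (the deterministic pseudo-embedding in `embed_text` is a stand-in for a real API call, and `ToyEmbedder` is hypothetical, not part of deepeval):

```python
from typing import List

class ToyEmbedder:
    """Follows the DeepEvalBaseEmbeddingModel method shape with a fake embedding."""

    def get_model_name(self) -> str:
        return "toy-embedder"

    def embed_text(self, text: str) -> List[float]:
        # Stand-in for a real embedding call: four deterministic pseudo-features.
        return [float(len(text)), float(text.count(" ")),
                float(sum(map(ord, text)) % 100), 1.0]

    def embed_texts(self, texts: List[str]) -> List[List[float]]:
        # Fallback batching: one single-text call per input.
        return [self.embed_text(t) for t in texts]

vectors = ToyEmbedder().embed_texts(["hello world", "hi"])
print(len(vectors), len(vectors[0]))  # 2 4
```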

## Usage Examples

### Using Custom Models for Metrics

```python
from deepeval.models import GPTModel, AnthropicModel
from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric

# Use GPT-4 for one metric
gpt4_metric = AnswerRelevancyMetric(
    model=GPTModel(model="gpt-4"),
    threshold=0.7
)

# Use Claude for another
claude_metric = FaithfulnessMetric(
    model=AnthropicModel(model="claude-3-5-sonnet-20241022"),
    threshold=0.8
)
```

### Using Local Models

```python
from deepeval.models import OllamaModel
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams

# Use a local Llama model for evaluation
local_model = OllamaModel(
    model="llama3.2",
    base_url="http://localhost:11434"
)

metric = GEval(
    name="Quality",
    criteria="Evaluate response quality",
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],
    model=local_model
)
```

### Creating Custom Model Integration

```python
import requests

from deepeval.models import DeepEvalBaseLLM
from deepeval.metrics import AnswerRelevancyMetric

class CustomModel(DeepEvalBaseLLM):
    def __init__(self, api_endpoint: str):
        self.api_endpoint = api_endpoint
        self.model_name = "custom-model-v1"

    def load_model(self):
        # Initialize your model
        pass

    def generate(self, prompt: str) -> str:
        # Call your model API
        response = requests.post(
            self.api_endpoint,
            json={"prompt": prompt}
        )
        return response.json()["output"]

    async def a_generate(self, prompt: str) -> str:
        # Falls back to the synchronous call; use an async HTTP
        # client (e.g., aiohttp) for true concurrency.
        return self.generate(prompt)

    def get_model_name(self) -> str:
        return self.model_name

# Use the custom model
custom_model = CustomModel(api_endpoint="https://api.example.com/generate")
metric = AnswerRelevancyMetric(model=custom_model)
```

### Multimodal Models

```python
from deepeval.models import MultimodalOpenAIModel
from deepeval.metrics import MultimodalGEval
from deepeval.test_case import MLLMTestCase, MLLMImage, MLLMTestCaseParams

# Use GPT-4o for multimodal evaluation
mllm = MultimodalOpenAIModel(model="gpt-4o")

metric = MultimodalGEval(
    name="Image Description Quality",
    criteria="Evaluate if the description accurately represents the image",
    evaluation_params=[MLLMTestCaseParams.INPUT, MLLMTestCaseParams.ACTUAL_OUTPUT],
    model=mllm
)

test_case = MLLMTestCase(
    input=["Describe this image:", MLLMImage(url="photo.jpg", local=True)],
    actual_output=["A golden retriever playing in a park"]
)

metric.measure(test_case)
```