# Models

Model abstraction layer supporting 15+ LLM providers, multimodal models, and embedding models with a unified interface. Use custom models for metric evaluation or integrate with existing LLM applications.

## Imports

```python
from deepeval.models import (
    # Base classes
    DeepEvalBaseLLM,
    DeepEvalBaseMLLM,
    DeepEvalBaseEmbeddingModel,
    # LLM implementations
    GPTModel,
    AnthropicModel,
    GeminiModel,
    OllamaModel,
    LocalModel,
    AzureOpenAIModel,
    LiteLLMModel,
    AmazonBedrockModel,
    KimiModel,
    GrokModel,
    DeepSeekModel,
    # Multimodal models
    MultimodalOpenAIModel,
    MultimodalGeminiModel,
    MultimodalOllamaModel,
    # Embedding models
    OpenAIEmbeddingModel,
    AzureOpenAIEmbeddingModel,
    LocalEmbeddingModel,
    OllamaEmbeddingModel,
)
```

## Capabilities

### Base LLM Class

Abstract base class for LLM integrations.

```python { .api }
class DeepEvalBaseLLM:
    """
    Base class for LLM integrations.

    Attributes:
    - model_name (str, optional): Name of the model
    - model (Any): The underlying model instance

    Abstract Methods:
    - load_model(*args, **kwargs): Load the model
    - generate(prompt: str, **kwargs) -> str: Generate text
    - a_generate(prompt: str, **kwargs) -> str: Async generate
    - get_model_name() -> str: Get model name

    Optional Methods:
    - batch_generate(prompts: List[str], **kwargs) -> List[str]: Batch generation
    """
```
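
To make the required surface concrete, here is a minimal stand-in that mirrors these methods. It is illustrative only: a real integration inherits from `DeepEvalBaseLLM`, and `EchoModel` is a hypothetical stub that echoes its prompt instead of calling a provider.

```python
import asyncio

# Stand-in mirroring the DeepEvalBaseLLM surface. A real integration
# inherits from deepeval.models.DeepEvalBaseLLM and calls a provider
# API; the echo behavior here only illustrates the required methods.
class EchoModel:
    def __init__(self, model_name: str = "echo-model"):
        self.model_name = model_name
        self.model = None  # underlying model instance, once loaded

    def load_model(self):
        # Real models initialize and return a client/model object here.
        return self.model

    def generate(self, prompt: str) -> str:
        # Real models call the provider API and return its completion.
        return f"echo: {prompt}"

    async def a_generate(self, prompt: str) -> str:
        # Async path; here it simply delegates to the sync implementation.
        return self.generate(prompt)

    def get_model_name(self) -> str:
        return self.model_name


model = EchoModel()
print(model.generate("hello"))              # echo: hello
print(asyncio.run(model.a_generate("hi")))  # echo: hi
```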

### LLM Implementations

#### OpenAI GPT Models

```python { .api }
class GPTModel:
    """
    OpenAI GPT model integration.

    Parameters:
    - model (str, optional): Model name (default: "gpt-4o")
    - api_key (str, optional): OpenAI API key
    - *args, **kwargs: Additional arguments for the OpenAI client

    Methods:
    - generate(prompt: str) -> str
    - a_generate(prompt: str) -> str
    - get_model_name() -> str
    """
```

Usage:

```python
from deepeval.models import GPTModel
from deepeval.metrics import AnswerRelevancyMetric

# Use GPT-4 as the evaluation model
model = GPTModel(model="gpt-4")

metric = AnswerRelevancyMetric(
    threshold=0.7,
    model=model
)
```

#### Anthropic Claude

```python { .api }
class AnthropicModel:
    """
    Anthropic Claude integration.

    Parameters:
    - model (str, optional): Model name (default: "claude-3-5-sonnet-20241022")
    - api_key (str, optional): Anthropic API key
    """
```

#### Google Gemini

```python { .api }
class GeminiModel:
    """
    Google Gemini integration.

    Parameters:
    - model (str, optional): Model name (default: "gemini-2.0-flash-exp")
    - api_key (str, optional): Google API key
    """
```

#### Local/Ollama Models

```python { .api }
class OllamaModel:
    """
    Ollama model integration for local models.

    Parameters:
    - model (str, optional): Model name (default: "llama3.2")
    - base_url (str, optional): Ollama server URL
    """

class LocalModel:
    """
    Local model integration (e.g., HuggingFace).

    Parameters:
    - model (Any): HuggingFace model or pipeline
    - tokenizer (Any, optional): Tokenizer
    """
```

#### Azure OpenAI

```python { .api }
class AzureOpenAIModel:
    """
    Azure OpenAI integration.

    Parameters:
    - deployment_name (str): Azure deployment name
    - api_key (str, optional): Azure API key
    - azure_endpoint (str, optional): Azure endpoint URL
    - api_version (str, optional): API version
    """
```

#### Other Providers

```python { .api }
class LiteLLMModel:
    """
    LiteLLM integration for a unified API across providers.

    Parameters:
    - model (str): Model name (e.g., "anthropic/claude-3-opus")
    """

class AmazonBedrockModel:
    """Amazon Bedrock integration."""

class KimiModel:
    """Kimi model integration."""

class GrokModel:
    """Grok model integration."""

class DeepSeekModel:
    """DeepSeek model integration."""
```

### Multimodal LLM Class

```python { .api }
class DeepEvalBaseMLLM:
    """
    Base class for multimodal LLM integrations.

    Abstract Methods:
    - generate(messages: List, **kwargs) -> str: Generate from multimodal input
    - a_generate(messages: List, **kwargs) -> str: Async generate
    - get_model_name() -> str: Get model name
    """

class MultimodalOpenAIModel:
    """
    OpenAI multimodal integration (GPT-4V, etc.).

    Parameters:
    - model (str, optional): Model name (default: "gpt-4o")
    """

class MultimodalGeminiModel:
    """Gemini multimodal integration."""

class MultimodalOllamaModel:
    """Ollama multimodal integration."""
```

### Embedding Models

```python { .api }
class DeepEvalBaseEmbeddingModel:
    """
    Base class for embedding model integrations.

    Abstract Methods:
    - embed_text(text: str) -> List[float]: Embed a single text
    - a_embed_text(text: str) -> List[float]: Async embed a single text
    - embed_texts(texts: List[str]) -> List[List[float]]: Embed multiple texts
    - a_embed_texts(texts: List[str]) -> List[List[float]]: Async embed multiple texts
    - get_model_name() -> str: Get model name
    """

class OpenAIEmbeddingModel:
    """
    OpenAI embeddings integration.

    Parameters:
    - model (str, optional): Model name (default: "text-embedding-3-small")
    """

class AzureOpenAIEmbeddingModel:
    """Azure OpenAI embeddings integration."""

class LocalEmbeddingModel:
    """Local embedding model integration."""

class OllamaEmbeddingModel:
    """Ollama embeddings integration."""
```
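
The embedding interface can likewise be sketched with a stand-in. This is illustrative only: a real integration inherits from `DeepEvalBaseEmbeddingModel` and calls a provider API, while the hypothetical `ToyEmbeddingModel` below derives deterministic vectors from a hash just to show the expected shapes.

```python
import hashlib

# Stand-in mirroring the DeepEvalBaseEmbeddingModel surface. The
# hash-derived vectors exist only to illustrate return shapes; a real
# integration would call an embeddings API.
class ToyEmbeddingModel:
    def __init__(self, model_name: str = "toy-embedding", dim: int = 8):
        self.model_name = model_name
        self.dim = dim

    def embed_text(self, text: str) -> list[float]:
        # Derive `dim` floats in [0, 1] from a SHA-256 digest of the text.
        digest = hashlib.sha256(text.encode()).digest()
        return [b / 255 for b in digest[: self.dim]]

    async def a_embed_text(self, text: str) -> list[float]:
        return self.embed_text(text)

    def embed_texts(self, texts: list[str]) -> list[list[float]]:
        return [self.embed_text(t) for t in texts]

    async def a_embed_texts(self, texts: list[str]) -> list[list[float]]:
        return self.embed_texts(texts)

    def get_model_name(self) -> str:
        return self.model_name


model = ToyEmbeddingModel()
vectors = model.embed_texts(["hello", "world"])
print(len(vectors), len(vectors[0]))  # 2 8
```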

## Usage Examples

### Using Custom Models for Metrics

```python
from deepeval.models import GPTModel, AnthropicModel
from deepeval.metrics import AnswerRelevancyMetric, FaithfulnessMetric

# Use GPT-4 for one metric
gpt4_metric = AnswerRelevancyMetric(
    model=GPTModel(model="gpt-4"),
    threshold=0.7
)

# Use Claude for another
claude_metric = FaithfulnessMetric(
    model=AnthropicModel(model="claude-3-5-sonnet-20241022"),
    threshold=0.8
)
```

### Using Local Models

```python
from deepeval.models import OllamaModel
from deepeval.metrics import GEval
from deepeval.test_case import LLMTestCaseParams

# Use a local Llama model for evaluation
local_model = OllamaModel(
    model="llama3.2",
    base_url="http://localhost:11434"
)

metric = GEval(
    name="Quality",
    criteria="Evaluate response quality",
    evaluation_params=[LLMTestCaseParams.ACTUAL_OUTPUT],
    model=local_model
)
```

### Creating Custom Model Integration

```python
import requests

from deepeval.models import DeepEvalBaseLLM
from deepeval.metrics import AnswerRelevancyMetric

class CustomModel(DeepEvalBaseLLM):
    def __init__(self, api_endpoint: str):
        self.api_endpoint = api_endpoint
        self.model_name = "custom-model-v1"

    def load_model(self):
        # Initialize your model (no-op for a remote HTTP API)
        pass

    def generate(self, prompt: str) -> str:
        # Call your model API
        response = requests.post(
            self.api_endpoint,
            json={"prompt": prompt}
        )
        response.raise_for_status()
        return response.json()["output"]

    async def a_generate(self, prompt: str) -> str:
        # Async version; delegates to the sync implementation here
        return self.generate(prompt)

    def get_model_name(self) -> str:
        return self.model_name

# Use custom model
custom_model = CustomModel(api_endpoint="https://api.example.com/generate")
metric = AnswerRelevancyMetric(model=custom_model)
```

### Multimodal Models

```python
from deepeval.models import MultimodalOpenAIModel
from deepeval.metrics import MultimodalGEval
from deepeval.test_case import MLLMTestCase, MLLMImage, MLLMTestCaseParams

# Use GPT-4o for multimodal evaluation
mllm = MultimodalOpenAIModel(model="gpt-4o")

metric = MultimodalGEval(
    name="Image Description Quality",
    criteria="Evaluate if the description accurately represents the image",
    evaluation_params=[MLLMTestCaseParams.INPUT, MLLMTestCaseParams.ACTUAL_OUTPUT],
    model=mllm
)

test_case = MLLMTestCase(
    input=["Describe this image:", MLLMImage(url="photo.jpg", local=True)],
    actual_output=["A golden retriever playing in a park"]
)

metric.measure(test_case)
```
