
# LLM Integration

Multi-provider language model support with consistent interfaces for OpenAI, Anthropic, Google, Groq, Azure OpenAI, and Ollama models. All chat models implement the BaseChatModel protocol for seamless integration with browser-use agents.

## Capabilities

### OpenAI Integration

OpenAI GPT model integration with support for GPT-4, GPT-3.5, and other OpenAI models.

```python { .api }
class ChatOpenAI:
    def __init__(
        self,
        model: str = "gpt-4o-mini",
        temperature: float = 0.2,
        frequency_penalty: float = 0.3,
        presence_penalty: float = 0.0,
        max_tokens: int = None,
        api_key: str = None,
        base_url: str = None,
        timeout: float = 60.0
    ):
        """
        Initialize OpenAI chat model.

        Parameters:
        - model: OpenAI model name (e.g., "gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo")
        - temperature: Randomness in generation (0.0-2.0)
        - frequency_penalty: Penalty for frequent tokens (-2.0 to 2.0)
        - presence_penalty: Penalty for token presence (-2.0 to 2.0)
        - max_tokens: Maximum tokens in response
        - api_key: OpenAI API key (uses OPENAI_API_KEY env var if not provided)
        - base_url: Custom API base URL
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "openai"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """
        Invoke OpenAI model with messages.

        Parameters:
        - messages: List of conversation messages
        - output_format: Optional Pydantic model for structured output

        Returns:
        ChatInvokeCompletion: Model response with content and metadata
        """
```

### Anthropic Integration

Anthropic Claude model integration with support for Claude 3 family models.

```python { .api }
class ChatAnthropic:
    def __init__(
        self,
        model: str = "claude-3-sonnet-20240229",
        temperature: float = 0.2,
        max_tokens: int = 4096,
        api_key: str = None,
        timeout: float = 60.0
    ):
        """
        Initialize Anthropic Claude model.

        Parameters:
        - model: Claude model name (e.g., "claude-3-sonnet-20240229", "claude-3-haiku-20240307")
        - temperature: Randomness in generation (0.0-1.0)
        - max_tokens: Maximum tokens in response
        - api_key: Anthropic API key (uses ANTHROPIC_API_KEY env var if not provided)
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "anthropic"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """Invoke Claude model with messages."""
```

### Google Integration

Google Gemini model integration with support for Gemini Pro and other Google models.

```python { .api }
class ChatGoogle:
    def __init__(
        self,
        model: str = "gemini-pro",
        temperature: float = 0.2,
        max_tokens: int = None,
        api_key: str = None,
        timeout: float = 60.0
    ):
        """
        Initialize Google Gemini model.

        Parameters:
        - model: Gemini model name (e.g., "gemini-pro", "gemini-pro-vision")
        - temperature: Randomness in generation (0.0-1.0)
        - max_tokens: Maximum tokens in response
        - api_key: Google API key (uses GOOGLE_API_KEY env var if not provided)
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "google"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """Invoke Gemini model with messages."""
```

### Groq Integration

Groq model integration for fast inference with Llama, Mixtral, and other supported models.

```python { .api }
class ChatGroq:
    def __init__(
        self,
        model: str = "llama3-70b-8192",
        temperature: float = 0.2,
        max_tokens: int = None,
        api_key: str = None,
        timeout: float = 60.0
    ):
        """
        Initialize Groq model.

        Parameters:
        - model: Groq model name (e.g., "llama3-70b-8192", "mixtral-8x7b-32768")
        - temperature: Randomness in generation (0.0-2.0)
        - max_tokens: Maximum tokens in response
        - api_key: Groq API key (uses GROQ_API_KEY env var if not provided)
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "groq"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """Invoke Groq model with messages."""
```

### Azure OpenAI Integration

Azure OpenAI service integration for enterprise OpenAI model deployment.

```python { .api }
class ChatAzureOpenAI:
    def __init__(
        self,
        model: str,
        azure_endpoint: str,
        api_version: str = "2024-02-15-preview",
        temperature: float = 0.2,
        frequency_penalty: float = 0.3,
        max_tokens: int = None,
        api_key: str = None,
        timeout: float = 60.0
    ):
        """
        Initialize Azure OpenAI model.

        Parameters:
        - model: Azure deployment name
        - azure_endpoint: Azure OpenAI endpoint URL
        - api_version: Azure OpenAI API version
        - temperature: Randomness in generation (0.0-2.0)
        - frequency_penalty: Penalty for frequent tokens (-2.0 to 2.0)
        - max_tokens: Maximum tokens in response
        - api_key: Azure OpenAI API key
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "azure_openai"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """Invoke Azure OpenAI model with messages."""
```

### Ollama Integration

Integration with Ollama for running language models locally.

```python { .api }
class ChatOllama:
    def __init__(
        self,
        model: str = "llama2",
        temperature: float = 0.2,
        base_url: str = "http://localhost:11434",
        timeout: float = 120.0
    ):
        """
        Initialize Ollama local model.

        Parameters:
        - model: Ollama model name (e.g., "llama2", "codellama", "mistral")
        - temperature: Randomness in generation (0.0-1.0)
        - base_url: Ollama server URL
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "ollama"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """Invoke local Ollama model with messages."""
```

### Base Chat Model Protocol

Protocol defining the interface that all chat models must implement.

```python { .api }
from typing import Protocol, TypeVar
from abc import abstractmethod

T = TypeVar('T')

class BaseChatModel(Protocol):
    """Protocol for chat model implementations."""

    model: str
    provider: str

    @abstractmethod
    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] = None
    ) -> ChatInvokeCompletion:
        """
        Invoke the chat model with messages.

        Parameters:
        - messages: Conversation messages
        - output_format: Optional structured output format

        Returns:
        ChatInvokeCompletion: Model response
        """
```
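
Because BaseChatModel is a structural Protocol, any object that exposes `model` and `provider` attributes plus a matching `ainvoke` coroutine can be passed wherever a chat model is expected. The sketch below is a hypothetical stand-in model for offline testing of agent plumbing; the class name, the import paths, and the keyword construction of ChatInvokeCompletion are assumptions, not part of the documented API.

```python
# Hypothetical stand-in chat model; assumes BaseMessage and
# ChatInvokeCompletion are importable from browser_use and that
# ChatInvokeCompletion accepts these fields as keyword arguments.
from browser_use import BaseMessage, ChatInvokeCompletion

class CannedChatModel:
    """Structurally satisfies BaseChatModel: model, provider, ainvoke."""

    model: str = "canned-v1"
    provider: str = "custom"

    def __init__(self, reply: str = "ok"):
        self.reply = reply

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format=None
    ) -> ChatInvokeCompletion:
        # A real implementation would call a provider API here.
        return ChatInvokeCompletion(
            content=self.reply,
            model=self.model,
            usage={"prompt_tokens": 0, "completion_tokens": 0},
            finish_reason="stop",
        )
```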

### Message Types

Message types for structured conversation handling.

```python { .api }
class BaseMessage:
    """Base class for conversation messages."""
    content: str
    role: str

class SystemMessage(BaseMessage):
    """System message for model prompting."""
    role: str = "system"

class HumanMessage(BaseMessage):
    """Human/user message."""
    role: str = "user"

class AIMessage(BaseMessage):
    """AI assistant message."""
    role: str = "assistant"

class ChatInvokeCompletion:
    """Chat model response."""
    content: str
    model: str
    usage: dict[str, int]
    finish_reason: str
```
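
These types can also be used to invoke a chat model directly, outside of an Agent. This is a minimal sketch assuming the message classes are importable from browser_use and accept `content` as a constructor argument; verify both against the actual package before relying on it.

```python
import asyncio

from browser_use import ChatOpenAI
from browser_use import SystemMessage, HumanMessage  # assumed import path

async def main():
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.0)
    messages = [
        SystemMessage(content="You summarize web pages in one sentence."),
        HumanMessage(content="Summarize: https://example.com"),
    ]
    completion = await llm.ainvoke(messages)
    # ChatInvokeCompletion exposes content, model, usage, and finish_reason
    print(completion.content)
    print(completion.usage)

asyncio.run(main())
```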

## Usage Examples

### Basic Model Usage

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, ChatGoogle

# OpenAI GPT-4o
agent = Agent(
    task="Search for Python tutorials",
    llm=ChatOpenAI(model="gpt-4o", temperature=0.1)
)

# Anthropic Claude
agent = Agent(
    task="Analyze web page content",
    llm=ChatAnthropic(model="claude-3-sonnet-20240229")
)

# Google Gemini
agent = Agent(
    task="Extract structured data",
    llm=ChatGoogle(model="gemini-pro")
)
```

### Custom Model Configuration

```python
from browser_use import ChatOpenAI, ChatGroq, ChatOllama

# Custom OpenAI configuration
openai_model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.0,        # Deterministic output
    frequency_penalty=0.5,  # Reduce repetition
    max_tokens=2000,
    timeout=30.0
)

# Fast inference with Groq
groq_model = ChatGroq(
    model="llama3-70b-8192",
    temperature=0.3,
    max_tokens=4000
)

# Local model with Ollama
local_model = ChatOllama(
    model="codellama:13b",
    temperature=0.1,
    base_url="http://localhost:11434"
)
```

### Azure OpenAI Enterprise Setup

```python
from browser_use import ChatAzureOpenAI, Agent

# Azure OpenAI configuration
azure_model = ChatAzureOpenAI(
    model="gpt-4-deployment",  # Your Azure deployment name
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_version="2024-02-15-preview",
    api_key="your-azure-api-key",
    temperature=0.2
)

agent = Agent(
    task="Enterprise browser automation task",
    llm=azure_model
)
```

### Model Comparison Workflow

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, ChatGoogle

task = "Analyze this webpage and extract key information"

# Test with different models
models = [
    ChatOpenAI(model="gpt-4o"),
    ChatAnthropic(model="claude-3-sonnet-20240229"),
    ChatGoogle(model="gemini-pro")
]

results = []
for model in models:
    agent = Agent(task=task, llm=model)
    result = agent.run_sync()
    results.append({
        'provider': model.provider,
        'model': model.model,
        'result': result.final_result(),
        'success': result.is_successful()
    })

# Compare results
for result in results:
    print(f"{result['provider']}: {result['success']}")
```

### Structured Output with Models

```python
from browser_use import Agent, ChatOpenAI
from pydantic import BaseModel

class WebPageInfo(BaseModel):
    title: str
    main_content: str
    links: list[str]
    images: list[str]

# Model with structured output
agent = Agent(
    task="Extract structured information from webpage",
    llm=ChatOpenAI(model="gpt-4o"),
    output_model_schema=WebPageInfo
)

result = agent.run_sync()
webpage_info = result.final_result()  # Returns WebPageInfo instance
print(f"Title: {webpage_info.title}")
print(f"Links found: {len(webpage_info.links)}")
```

### Error Handling and Fallbacks

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, LLMException

primary_model = ChatOpenAI(model="gpt-4o")
fallback_model = ChatAnthropic(model="claude-3-haiku-20240307")

try:
    agent = Agent(task="Complex task", llm=primary_model)
    result = agent.run_sync()
except LLMException as e:
    print(f"Primary model failed: {e}")
    # Fall back to the alternative model
    agent = Agent(task="Complex task", llm=fallback_model)
    result = agent.run_sync()
```
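
The same pattern generalizes to an ordered list of candidate models. The helper below is a sketch rather than a library API: the function name is ours, and it assumes LLMException is the only error worth catching per attempt.

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, ChatGroq, LLMException

def run_with_fallbacks(task: str, models: list):
    """Try each model in order until one completes the task."""
    errors = []
    for model in models:
        try:
            agent = Agent(task=task, llm=model)
            return agent.run_sync()
        except LLMException as e:
            print(f"{model.provider}/{model.model} failed: {e}")
            errors.append(e)
    raise RuntimeError(f"All models failed: {errors}")

result = run_with_fallbacks(
    "Complex task",
    [
        ChatOpenAI(model="gpt-4o"),
        ChatAnthropic(model="claude-3-haiku-20240307"),
        ChatGroq(model="llama3-70b-8192"),
    ],
)
```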

### Local Model Setup

```python
from browser_use import ChatOllama, Agent

# Ensure Ollama is running: ollama serve
# Pull the model first: ollama pull llama2:13b

local_model = ChatOllama(
    model="llama2:13b",
    temperature=0.1,
    base_url="http://localhost:11434"
)

agent = Agent(
    task="Local browser automation task",
    llm=local_model
)

# Works offline with local inference
result = agent.run_sync()
```

## Model Selection Guidelines

### Performance Characteristics

- **GPT-4o**: Excellent reasoning, vision capabilities, reliable
- **Claude-3**: Strong analysis, long context, good at following instructions
- **Gemini Pro**: Good vision, fast inference, cost-effective
- **Groq**: Very fast inference, good for simple tasks
- **Local (Ollama)**: Privacy, offline operation, no API costs

### Use Case Recommendations

- **Complex reasoning**: GPT-4o, Claude-3 Sonnet
- **Fast simple tasks**: Groq, Gemini Pro
- **Privacy/offline**: Ollama local models
- **Enterprise**: Azure OpenAI
- **Cost optimization**: GPT-4o-mini, Claude-3 Haiku

### Configuration Best Practices

- Use a low temperature (0.0-0.3) for deterministic browser automation
- Set appropriate timeouts for model response times
- Configure max_tokens based on expected response length
- Use frequency_penalty to reduce repetitive actions (see the sketch below)
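
Putting those guidelines together, a plausible configuration for automation-heavy runs might look like the sketch below; the specific values are illustrative, not recommendations from the library.

```python
from browser_use import Agent, ChatOpenAI

# Illustrative values: low temperature for repeatable actions, a bounded
# response size, a modest timeout, and a repetition penalty.
automation_llm = ChatOpenAI(
    model="gpt-4o-mini",
    temperature=0.1,        # near-deterministic action selection
    frequency_penalty=0.3,  # discourage repeating the same action
    max_tokens=1500,        # enough for a reasoning step plus an action
    timeout=45.0            # fail fast instead of hanging the browser session
)

agent = Agent(task="Fill out the signup form", llm=automation_llm)
result = agent.run_sync()
```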