# LLM Integration

Multi-provider language model support with consistent interfaces for OpenAI, Anthropic, Google, Groq, Azure OpenAI, and Ollama models. All chat models implement the `BaseChatModel` protocol for seamless integration with browser-use agents.

## Capabilities

### OpenAI Integration

OpenAI GPT model integration with support for GPT-4, GPT-3.5, and other OpenAI models.

```python { .api }
class ChatOpenAI:
    def __init__(
        self,
        model: str = "gpt-4o-mini",
        temperature: float = 0.2,
        frequency_penalty: float = 0.3,
        presence_penalty: float = 0.0,
        max_tokens: int | None = None,
        api_key: str | None = None,
        base_url: str | None = None,
        timeout: float = 60.0
    ):
        """
        Initialize OpenAI chat model.

        Parameters:
        - model: OpenAI model name (e.g., "gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo")
        - temperature: Randomness in generation (0.0-2.0)
        - frequency_penalty: Penalty for frequent tokens (-2.0 to 2.0)
        - presence_penalty: Penalty for token presence (-2.0 to 2.0)
        - max_tokens: Maximum tokens in response
        - api_key: OpenAI API key (uses OPENAI_API_KEY env var if not provided)
        - base_url: Custom API base URL
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "openai"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """
        Invoke OpenAI model with messages.

        Parameters:
        - messages: List of conversation messages
        - output_format: Optional Pydantic model for structured output

        Returns:
        ChatInvokeCompletion: Model response with content and metadata
        """
```

### Anthropic Integration

Anthropic Claude model integration with support for Claude 3 family models.

```python { .api }
class ChatAnthropic:
    def __init__(
        self,
        model: str = "claude-3-sonnet-20240229",
        temperature: float = 0.2,
        max_tokens: int = 4096,
        api_key: str | None = None,
        timeout: float = 60.0
    ):
        """
        Initialize Anthropic Claude model.

        Parameters:
        - model: Claude model name (e.g., "claude-3-sonnet-20240229", "claude-3-haiku-20240307")
        - temperature: Randomness in generation (0.0-1.0)
        - max_tokens: Maximum tokens in response
        - api_key: Anthropic API key (uses ANTHROPIC_API_KEY env var if not provided)
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "anthropic"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """Invoke Claude model with messages."""
```

### Google Integration

Google Gemini model integration with support for Gemini Pro and other Google models.

```python { .api }
class ChatGoogle:
    def __init__(
        self,
        model: str = "gemini-pro",
        temperature: float = 0.2,
        max_tokens: int | None = None,
        api_key: str | None = None,
        timeout: float = 60.0
    ):
        """
        Initialize Google Gemini model.

        Parameters:
        - model: Gemini model name (e.g., "gemini-pro", "gemini-pro-vision")
        - temperature: Randomness in generation (0.0-1.0)
        - max_tokens: Maximum tokens in response
        - api_key: Google API key (uses GOOGLE_API_KEY env var if not provided)
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "google"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """Invoke Gemini model with messages."""
```

### Groq Integration

Groq model integration for fast inference with Llama, Mixtral, and other supported models.

```python { .api }
class ChatGroq:
    def __init__(
        self,
        model: str = "llama3-70b-8192",
        temperature: float = 0.2,
        max_tokens: int | None = None,
        api_key: str | None = None,
        timeout: float = 60.0
    ):
        """
        Initialize Groq model.

        Parameters:
        - model: Groq model name (e.g., "llama3-70b-8192", "mixtral-8x7b-32768")
        - temperature: Randomness in generation (0.0-2.0)
        - max_tokens: Maximum tokens in response
        - api_key: Groq API key (uses GROQ_API_KEY env var if not provided)
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "groq"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """Invoke Groq model with messages."""
```

### Azure OpenAI Integration

Azure OpenAI service integration for enterprise deployments of OpenAI models.

```python { .api }
class ChatAzureOpenAI:
    def __init__(
        self,
        model: str,
        azure_endpoint: str,
        api_version: str = "2024-02-15-preview",
        temperature: float = 0.2,
        frequency_penalty: float = 0.3,
        max_tokens: int | None = None,
        api_key: str | None = None,
        timeout: float = 60.0
    ):
        """
        Initialize Azure OpenAI model.

        Parameters:
        - model: Azure deployment name
        - azure_endpoint: Azure OpenAI endpoint URL
        - api_version: Azure OpenAI API version
        - temperature: Randomness in generation (0.0-2.0)
        - frequency_penalty: Penalty for frequent tokens (-2.0 to 2.0)
        - max_tokens: Maximum tokens in response
        - api_key: Azure OpenAI API key
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "azure_openai"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """Invoke Azure OpenAI model with messages."""
```

### Ollama Integration

Integration with Ollama for running open models locally, with no external API calls.

```python { .api }
class ChatOllama:
    def __init__(
        self,
        model: str = "llama2",
        temperature: float = 0.2,
        base_url: str = "http://localhost:11434",
        timeout: float = 120.0
    ):
        """
        Initialize Ollama local model.

        Parameters:
        - model: Ollama model name (e.g., "llama2", "codellama", "mistral")
        - temperature: Randomness in generation (0.0-1.0)
        - base_url: Ollama server URL
        - timeout: Request timeout in seconds
        """

    model: str
    provider: str = "ollama"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """Invoke local Ollama model with messages."""
```

### Base Chat Model Protocol

Protocol defining the interface that all chat models must implement.

```python { .api }
from typing import Protocol, TypeVar
from abc import abstractmethod

T = TypeVar('T')

class BaseChatModel(Protocol):
    """Protocol for chat model implementations."""

    model: str
    provider: str

    @abstractmethod
    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        """
        Invoke the chat model with messages.

        Parameters:
        - messages: Conversation messages
        - output_format: Optional structured output format

        Returns:
        ChatInvokeCompletion: Model response
        """
```

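Because `BaseChatModel` is a structural protocol, any object exposing `model`, `provider`, and an async `ainvoke` method can be passed to an agent as its LLM. Below is a minimal sketch of a custom implementation; the class name `CannedChatModel` is hypothetical, the snippet reuses `BaseMessage`, `ChatInvokeCompletion`, and `T` from the protocol definition above, and constructing `ChatInvokeCompletion` with keyword arguments is an assumption based on the fields listed in the Message Types section below.

```python
class CannedChatModel:
    """Hypothetical chat model that returns a fixed reply (useful for testing)."""

    model: str = "canned-v1"
    provider: str = "custom"

    async def ainvoke(
        self,
        messages: list[BaseMessage],
        output_format: type[T] | None = None
    ) -> ChatInvokeCompletion:
        # A real implementation would call a provider API here.
        last = messages[-1].content if messages else ""
        return ChatInvokeCompletion(
            content=f"Echo: {last}",
            model=self.model,
            usage={"prompt_tokens": 0, "completion_tokens": 0},
            finish_reason="stop",
        )
```
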
### Message Types

Message types for structured conversation handling.

```python { .api }
class BaseMessage:
    """Base class for conversation messages."""
    content: str
    role: str

class SystemMessage(BaseMessage):
    """System message for model prompting."""
    role: str = "system"

class HumanMessage(BaseMessage):
    """Human/user message."""
    role: str = "user"

class AIMessage(BaseMessage):
    """AI assistant message."""
    role: str = "assistant"

class ChatInvokeCompletion:
    """Chat model response."""
    content: str
    model: str
    usage: dict[str, int]
    finish_reason: str
```

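Chat models can also be invoked directly, outside of an agent. The sketch below builds a short conversation and reads fields from the returned `ChatInvokeCompletion`; importing the message classes from the top-level `browser_use` package and constructing them with a `content` keyword are assumptions, so adjust the import path and constructors to match the installed version.

```python
import asyncio

# Assumed import path for the message classes; they may live in a submodule instead.
from browser_use import ChatOpenAI, SystemMessage, HumanMessage

async def main():
    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.0)
    messages = [
        SystemMessage(content="You are a concise assistant."),
        HumanMessage(content="What is a headless browser? Answer in one sentence."),
    ]
    completion = await llm.ainvoke(messages)
    print(completion.content)  # generated text
    print(completion.usage)    # token counts, e.g. {"prompt_tokens": ..., "completion_tokens": ...}

asyncio.run(main())
```
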
## Usage Examples

### Basic Model Usage

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, ChatGoogle

# OpenAI GPT-4
agent = Agent(
    task="Search for Python tutorials",
    llm=ChatOpenAI(model="gpt-4o", temperature=0.1)
)

# Anthropic Claude
agent = Agent(
    task="Analyze web page content",
    llm=ChatAnthropic(model="claude-3-sonnet-20240229")
)

# Google Gemini
agent = Agent(
    task="Extract structured data",
    llm=ChatGoogle(model="gemini-pro")
)
```

### Custom Model Configuration

```python
from browser_use import ChatOpenAI, ChatGroq, ChatOllama

# Custom OpenAI configuration
openai_model = ChatOpenAI(
    model="gpt-4o",
    temperature=0.0,        # deterministic output
    frequency_penalty=0.5,  # reduce repetition
    max_tokens=2000,
    timeout=30.0
)

# Fast inference with Groq
groq_model = ChatGroq(
    model="llama3-70b-8192",
    temperature=0.3,
    max_tokens=4000
)

# Local model with Ollama
local_model = ChatOllama(
    model="codellama:13b",
    temperature=0.1,
    base_url="http://localhost:11434"
)
```

### Azure OpenAI Enterprise Setup

```python
from browser_use import ChatAzureOpenAI, Agent

# Azure OpenAI configuration
azure_model = ChatAzureOpenAI(
    model="gpt-4-deployment",  # your Azure deployment name
    azure_endpoint="https://your-resource.openai.azure.com/",
    api_version="2024-02-15-preview",
    api_key="your-azure-api-key",
    temperature=0.2
)

agent = Agent(
    task="Enterprise browser automation task",
    llm=azure_model
)
```

### Model Comparison Workflow

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, ChatGoogle

task = "Analyze this webpage and extract key information"

# Test with different models
models = [
    ChatOpenAI(model="gpt-4o"),
    ChatAnthropic(model="claude-3-sonnet-20240229"),
    ChatGoogle(model="gemini-pro")
]

results = []
for model in models:
    agent = Agent(task=task, llm=model)
    result = agent.run_sync()
    results.append({
        'provider': model.provider,
        'model': model.model,
        'result': result.final_result(),
        'success': result.is_successful()
    })

# Compare results
for result in results:
    print(f"{result['provider']}: {result['success']}")
```

### Structured Output with Models

```python
from browser_use import Agent, ChatOpenAI
from pydantic import BaseModel

class WebPageInfo(BaseModel):
    title: str
    main_content: str
    links: list[str]
    images: list[str]

# Model with structured output
agent = Agent(
    task="Extract structured information from webpage",
    llm=ChatOpenAI(model="gpt-4o"),
    output_model_schema=WebPageInfo
)

result = agent.run_sync()
webpage_info = result.final_result()  # Returns WebPageInfo instance
print(f"Title: {webpage_info.title}")
print(f"Links found: {len(webpage_info.links)}")
```

### Error Handling and Fallbacks

```python
from browser_use import Agent, ChatOpenAI, ChatAnthropic, LLMException

primary_model = ChatOpenAI(model="gpt-4o")
fallback_model = ChatAnthropic(model="claude-3-haiku-20240307")

try:
    agent = Agent(task="Complex task", llm=primary_model)
    result = agent.run_sync()
except LLMException as e:
    print(f"Primary model failed: {e}")
    # Fall back to the alternative model
    agent = Agent(task="Complex task", llm=fallback_model)
    result = agent.run_sync()
```

### Local Model Setup

```python
from browser_use import ChatOllama, Agent

# Ensure Ollama is running:  ollama serve
# Pull the model first:      ollama pull llama2:13b

local_model = ChatOllama(
    model="llama2:13b",
    temperature=0.1,
    base_url="http://localhost:11434"
)

agent = Agent(
    task="Local browser automation task",
    llm=local_model
)

# Works offline with local inference
result = agent.run_sync()
```

## Model Selection Guidelines

### Performance Characteristics

- **GPT-4o**: Excellent reasoning, vision capabilities, reliable
- **Claude 3**: Strong analysis, long context, good at following instructions
- **Gemini Pro**: Good vision, fast inference, cost-effective
- **Groq**: Very fast inference, good for simple tasks
- **Local (Ollama)**: Privacy, offline operation, no API costs

### Use Case Recommendations

- **Complex reasoning**: GPT-4o, Claude 3 Sonnet
- **Fast, simple tasks**: Groq, Gemini Pro
- **Privacy/offline**: Ollama local models
- **Enterprise**: Azure OpenAI
- **Cost optimization**: GPT-4o-mini, Claude 3 Haiku

### Configuration Best Practices

- Use a low temperature (0.0-0.3) for deterministic browser automation
- Set timeouts appropriate to each model's typical response time
- Configure `max_tokens` based on the expected response length
- Use `frequency_penalty` to reduce repetitive actions

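As a reference point, the sketch below applies these guidelines to a `ChatOpenAI` configuration; the specific values are illustrative, not recommended defaults.

```python
from browser_use import Agent, ChatOpenAI

# Automation-oriented configuration following the guidelines above.
llm = ChatOpenAI(
    model="gpt-4o",
    temperature=0.1,        # low temperature for repeatable action selection
    frequency_penalty=0.3,  # discourage repeating the same action
    max_tokens=2000,        # bound responses to the expected length
    timeout=45.0,           # fail fast instead of hanging on slow responses
)

agent = Agent(task="Fill out the signup form on example.com", llm=llm)
```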