Tessl Tile for pypi/langchain-tests@0.3.0

# Integration Testing

Test classes that exercise an integration end to end: real API calls, streaming, tool calling, structured output, and multimodal inputs. Integration tests verify complete feature sets and real-world usage patterns against external services.

## Capabilities

### Chat Model Integration Tests

Integration tests for chat models, with 40+ test methods covering invocation, streaming, batching, tool calling, structured output, and multimodal inputs.

```python { .api }
from langchain_tests.integration_tests import ChatModelIntegrationTests

class ChatModelIntegrationTests(ChatModelTests):
    """Integration tests for chat models with comprehensive functionality testing."""

    # Inherits all configuration from ChatModelTests

    # Basic invocation tests
    def test_invoke(self) -> None:
        """Test basic model invocation with simple prompts."""

    def test_ainvoke(self) -> None:
        """Test asynchronous model invocation."""

    # Streaming tests
    def test_stream(self) -> None:
        """Test streaming responses from the model."""

    def test_astream(self) -> None:
        """Test asynchronous streaming responses."""

    # Batch processing tests
    def test_batch(self) -> None:
        """Test batch processing of multiple prompts."""

    def test_abatch(self) -> None:
        """Test asynchronous batch processing."""

    # Conversation tests
    def test_conversation(self) -> None:
        """Test multi-turn conversation handling."""

    def test_double_messages_conversation(self) -> None:
        """Test sequential message handling in conversations."""

    # Usage metadata tests
    def test_usage_metadata(self) -> None:
        """Test usage metadata tracking and validation."""

    def test_usage_metadata_streaming(self) -> None:
        """Test usage metadata in streaming responses."""

    # Stop sequence tests
    def test_stop_sequence(self) -> None:
        """Test stop sequence functionality."""

    # Tool calling tests (if has_tool_calling=True)
    def test_tool_calling(self) -> None:
        """Test tool calling functionality."""

    def test_tool_calling_async(self) -> None:
        """Test asynchronous tool calling."""

    def test_bind_runnables_as_tools(self) -> None:
        """Test binding runnable objects as tools."""

    def test_tool_message_histories_string_content(self) -> None:
        """Test tool message histories with string content."""

    def test_tool_message_histories_list_content(self) -> None:
        """Test tool message histories with complex list content."""

    def test_tool_choice(self) -> None:
        """Test tool choice functionality."""

    def test_tool_calling_with_no_arguments(self) -> None:
        """Test tool calling with tools that take no arguments."""

    def test_tool_message_error_status(self) -> None:
        """Test error handling in tool messages."""

    # Structured output tests (if has_structured_output=True)
    def test_structured_few_shot_examples(self) -> None:
        """Test structured output with few-shot examples."""

    def test_structured_output(self) -> None:
        """Test structured output generation."""

    def test_structured_output_async(self) -> None:
        """Test asynchronous structured output generation."""

    def test_structured_output_pydantic_2_v1(self) -> None:
        """Test Pydantic V1 compatibility in structured output."""

    def test_structured_output_optional_param(self) -> None:
        """Test structured output with optional parameters."""

    # JSON mode tests (if supports_json_mode=True)
    def test_json_mode(self) -> None:
        """Test JSON mode functionality."""

    # Multimodal input tests (if corresponding support flags=True)
    def test_pdf_inputs(self) -> None:
        """Test PDF input handling."""

    def test_audio_inputs(self) -> None:
        """Test audio input handling."""

    def test_image_inputs(self) -> None:
        """Test image input handling."""

    def test_image_tool_message(self) -> None:
        """Test image content in tool messages."""

    def test_anthropic_inputs(self) -> None:
        """Test Anthropic-style input format handling."""

    # Message handling tests
    def test_message_with_name(self) -> None:
        """Test messages with name attributes."""

    # Advanced functionality tests
    def test_agent_loop(self) -> None:
        """Test agent loop functionality with tool calling."""

    def test_unicode_tool_call_integration(self) -> None:
        """Test Unicode handling in tool calls."""

    # Performance tests
    def test_stream_time(self) -> None:
        """Benchmark streaming performance."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import ChatModelIntegrationTests
from my_integration import MyChatModel

class TestMyChatModelIntegration(ChatModelIntegrationTests):
    @property
    def chat_model_class(self):
        return MyChatModel

    @property
    def chat_model_params(self):
        return {
            "api_key": "real-api-key",  # Use real credentials for integration tests
            "model": "gpt-4",
            "temperature": 0.1
        }

    # Configure model capabilities
    @property
    def has_tool_calling(self):
        return True

    @property
    def has_structured_output(self):
        return True

    @property
    def supports_image_inputs(self):
        return True

    @property
    def returns_usage_metadata(self):
        return True
```
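Hardcoding real credentials in a test file makes them easy to leak. Below is a minimal sketch of building the parameter dictionary from environment variables instead. The variable names `MY_PROVIDER_API_KEY` and `MY_PROVIDER_MODEL` and the helper `chat_model_params_from_env` are hypothetical and not part of `langchain_tests`.

```python
import os
from typing import Mapping, Optional


def chat_model_params_from_env(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build chat model constructor params from environment variables.

    The variable names are hypothetical; use whatever your provider expects.
    """
    env = os.environ if env is None else env
    api_key = env.get("MY_PROVIDER_API_KEY")
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError("Set MY_PROVIDER_API_KEY before running integration tests")
    return {
        "api_key": api_key,
        "model": env.get("MY_PROVIDER_MODEL", "gpt-4"),
        "temperature": 0.1,
    }
```

With a helper like this, the `chat_model_params` property could simply return `chat_model_params_from_env()`.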

### Embeddings Integration Tests

Integration tests for embeddings models, covering synchronous and asynchronous operations.

```python { .api }
from langchain_tests.integration_tests import EmbeddingsIntegrationTests

class EmbeddingsIntegrationTests(EmbeddingsTests):
    """Integration tests for embeddings models."""

    def test_embed_query(self) -> None:
        """Test embedding a single query string."""

    def test_embed_documents(self) -> None:
        """Test embedding a list of documents."""

    def test_aembed_query(self) -> None:
        """Test asynchronous embedding of a single query."""

    def test_aembed_documents(self) -> None:
        """Test asynchronous embedding of document lists."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import EmbeddingsIntegrationTests
from my_integration import MyEmbeddings

class TestMyEmbeddingsIntegration(EmbeddingsIntegrationTests):
    @property
    def embeddings_class(self):
        return MyEmbeddings

    @property
    def embedding_model_params(self):
        return {
            "api_key": "real-api-key",
            "model": "text-embedding-3-large"
        }
```
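The embeddings tests broadly check the shape of what a model returns. Here is a rough sketch of that kind of check, run against a toy stand-in. `ToyEmbeddings` and `check_embeddings` are hypothetical and not part of `langchain_tests`, and the assertions in the real suite may differ.

```python
from typing import List


class ToyEmbeddings:
    """Deterministic stand-in for a real embeddings model (illustration only)."""

    def embed_query(self, text: str) -> List[float]:
        # Three crude features: character count, word count, vowel count.
        vowels = sum(ch in "aeiou" for ch in text.lower())
        return [float(len(text)), float(len(text.split())), float(vowels)]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(t) for t in texts]


def check_embeddings(model: ToyEmbeddings) -> None:
    """Shape checks in the spirit of test_embed_query and test_embed_documents."""
    query_vec = model.embed_query("foo")
    assert isinstance(query_vec, list) and len(query_vec) > 0
    assert all(isinstance(x, float) for x in query_vec)

    docs = ["foo", "bar baz"]
    doc_vecs = model.embed_documents(docs)
    # One vector per document, all with the query vector's dimensionality.
    assert len(doc_vecs) == len(docs)
    assert all(len(v) == len(query_vec) for v in doc_vecs)


check_embeddings(ToyEmbeddings())
```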

### Tools Integration Tests

Integration tests for tools, covering schema validation and invocation verification.

```python { .api }
from langchain_tests.integration_tests import ToolsIntegrationTests

class ToolsIntegrationTests(ToolsTests):
    """Integration tests for tools."""

    def test_invoke_matches_output_schema(self) -> None:
        """Test that tool output matches its declared schema."""

    def test_async_invoke_matches_output_schema(self) -> None:
        """Test that async tool output matches its declared schema."""

    def test_invoke_no_tool_call(self) -> None:
        """Test direct tool invocation without tool call wrapper."""

    def test_async_invoke_no_tool_call(self) -> None:
        """Test direct async tool invocation."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import ToolsIntegrationTests
from my_integration import MySearchTool

class TestMySearchToolIntegration(ToolsIntegrationTests):
    @property
    def tool_constructor(self):
        return MySearchTool

    @property
    def tool_constructor_params(self):
        return {
            "api_key": "real-search-api-key",
            "base_url": "https://api.search-service.com"
        }

    @property
    def tool_invoke_params_example(self):
        return {
            "query": "LangChain framework",
            "num_results": 5
        }
```

### Retrievers Integration Tests

Integration tests for retriever implementations, covering document retrieval and parameter validation.

```python { .api }
import pytest

from langchain_tests.integration_tests import RetrieversIntegrationTests

class RetrieversIntegrationTests(BaseStandardTests):
    """Integration tests for retrievers."""

    # Required abstract properties
    @property
    def retriever_constructor(self):
        """Retriever class to test."""

    @property
    def retriever_constructor_params(self) -> dict:
        """Constructor parameters for the retriever."""

    @property
    def retriever_query_example(self) -> str:
        """Example query string for testing."""

    @property
    def num_results_arg_name(self) -> str:
        """Name of the parameter that controls number of results. Default: 'k'."""

    # Fixtures
    @pytest.fixture
    def retriever(self):
        """Retriever fixture for testing."""

    # Tests
    def test_k_constructor_param(self) -> None:
        """Test the number of results constructor parameter."""

    def test_invoke_with_k_kwarg(self) -> None:
        """Test runtime parameter for number of results."""

    def test_invoke_returns_documents(self) -> None:
        """Test that retriever returns Document objects."""

    def test_ainvoke_returns_documents(self) -> None:
        """Test that async retriever returns Document objects."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import RetrieversIntegrationTests
from my_integration import MyRetriever

class TestMyRetrieverIntegration(RetrieversIntegrationTests):
    @property
    def retriever_constructor(self):
        return MyRetriever

    @property
    def retriever_constructor_params(self):
        return {
            "index_name": "test-index",
            "api_key": "real-api-key"
        }

    @property
    def retriever_query_example(self):
        return "machine learning algorithms"

    @property
    def num_results_arg_name(self):
        return "top_k"  # If your retriever uses 'top_k' instead of 'k'
```
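`test_k_constructor_param` and `test_invoke_with_k_kwarg` check that the number of results can be set both at construction and at call time. The toy sketch below shows a retriever that honors both. `ToyRetriever` is hypothetical: it returns plain strings rather than `Document` objects and does not subclass LangChain's retriever base class.

```python
from typing import List, Optional


class ToyRetriever:
    """Keyword-overlap retriever used only to illustrate the result-count parameter."""

    def __init__(self, corpus: List[str], top_k: int = 4) -> None:
        self.corpus = corpus
        self.top_k = top_k  # default number of results, set at construction

    def invoke(self, query: str, top_k: Optional[int] = None) -> List[str]:
        # A call-time value overrides the constructor default.
        limit = self.top_k if top_k is None else top_k
        words = query.lower().split()
        # Rank documents by how many query words they contain (stable sort).
        ranked = sorted(
            self.corpus,
            key=lambda doc: -sum(w in doc.lower() for w in words),
        )
        return ranked[:limit]


corpus = ["a cat", "machine learning", "learning to cook", "dog", "machine"]
retriever = ToyRetriever(corpus, top_k=3)
default_results = retriever.invoke("machine learning")
single_result = retriever.invoke("machine learning", top_k=1)
```

Because this toy names its parameter `top_k`, a test class for it would return `"top_k"` from `num_results_arg_name`, as in the usage example.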

## Pre-defined Test Tools

The integration test framework includes several pre-built tools for testing tool calling functionality:

```python { .api }
# Pre-defined tools for testing
def magic_function(input: int) -> int:
    """Magic function tool with input validation."""

def magic_function_no_args() -> str:
    """No-argument magic function tool."""

def unicode_customer(customer_name: str, description: str) -> str:
    """Unicode handling tool for internationalization testing."""

def current_weather_tool():
    """Weather tool fixture for testing tool calling."""
```
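Tool-calling tests typically bind one of these tools to the model and then inspect the tool call the model emits. Here is a minimal sketch of such a check on a plain dictionary shaped like a LangChain tool call (`name`, `args`, `id`). The helper `is_valid_magic_call` and the expected argument value are assumptions for illustration, not the suite's actual assertions.

```python
from typing import Any, Dict


def is_valid_magic_call(tool_call: Dict[str, Any], expected_input: int) -> bool:
    """Return True if the dict looks like a well-formed call to magic_function."""
    return (
        tool_call.get("name") == "magic_function"
        and tool_call.get("args") == {"input": expected_input}
        and tool_call.get("id") is not None  # every call should carry an id
    )


good_call = {"name": "magic_function", "args": {"input": 3}, "id": "call_1"}
bad_call = {"name": "magic_function", "args": {"input": "3"}, "id": None}
```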

## Test Callback Handlers

Integration tests include callback handlers for capturing and validating model behavior:

```python { .api }
class _TestCallbackHandler:
    """Callback handler for capturing chat model options and events."""

    def on_chat_model_start(self, serialized, messages, **kwargs):
        """Called when chat model starts processing."""

    def on_llm_end(self, response, **kwargs):
        """Called when chat model completes processing."""
```
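Below is a dependency-free sketch of what such a handler might record. `RecordingHandler` is hypothetical: it mirrors the two hook signatures listed above, but it does not subclass LangChain's callback base class, and the keyword arguments in the simulated call are assumed rather than taken from the library.

```python
from typing import Any, Dict, List


class RecordingHandler:
    """Collects the keyword options and the order of events it sees."""

    def __init__(self) -> None:
        self.options: List[Dict[str, Any]] = []
        self.events: List[str] = []

    def on_chat_model_start(self, serialized, messages, **kwargs) -> None:
        self.events.append("start")
        self.options.append(dict(kwargs))  # copy so later mutation is not recorded

    def on_llm_end(self, response, **kwargs) -> None:
        self.events.append("end")


# Simulated lifecycle; a real run would have the framework invoke these hooks.
handler = RecordingHandler()
handler.on_chat_model_start({}, [["hi"]], invocation_params={"stop": ["\n"]})
handler.on_llm_end(response=None)
```

A test can then assert on `handler.options` to confirm that settings such as stop sequences actually reached the model call.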

## Schema Generation Utilities

Utilities for generating test schemas for structured output testing:

```python { .api }
def _get_joke_class(schema_type: str):
    """Generate joke schema for different output formats."""
```
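Here is a sketch of what a schema factory of this kind could look like for two formats, using only the standard library. The `setup` and `punchline` fields, the format names, and the helpers `get_joke_schema` and `looks_like_joke` are assumptions. The real `_get_joke_class` may accept other values and return something different.

```python
from typing import Any, Dict, Type, TypedDict, Union


class JokeDict(TypedDict):
    """Joke expressed as a TypedDict (field names are assumed)."""

    setup: str
    punchline: str


JOKE_JSON_SCHEMA: Dict[str, Any] = {
    "title": "Joke",
    "type": "object",
    "properties": {
        "setup": {"type": "string"},
        "punchline": {"type": "string"},
    },
    "required": ["setup", "punchline"],
}


def get_joke_schema(schema_type: str) -> Union[Type[JokeDict], Dict[str, Any]]:
    """Return the joke schema in the requested format."""
    if schema_type == "typeddict":
        return JokeDict
    if schema_type == "json_schema":
        return JOKE_JSON_SCHEMA
    raise ValueError(f"Unsupported schema type: {schema_type}")


def looks_like_joke(value: Any) -> bool:
    """Check that a structured-output result has exactly the two string fields."""
    return (
        isinstance(value, dict)
        and set(value) == {"setup", "punchline"}
        and all(isinstance(v, str) for v in value.values())
    )
```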

## VCR Integration

Integration tests can record and replay HTTP calls with VCR (Video Cassette Recorder), which enables:

- **Consistent Testing**: Record real API responses once, replay for subsequent test runs
- **Offline Testing**: Run tests without network connectivity
- **Cost Reduction**: Avoid repeated API calls during test development
- **Deterministic Results**: Same responses every time for reliable testing

VCR integration is controlled by the `enable_vcr_tests` property in the base test class.
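The record-and-replay idea can be illustrated without any HTTP library. The `Cassette` class below is a hypothetical toy, not the API of any VCR package: it stores responses keyed by request and only calls the real function on a miss.

```python
import json
from typing import Callable, Dict


class Cassette:
    """Toy recorder: the first call goes live, repeats are served from memory."""

    def __init__(self) -> None:
        self.recordings: Dict[str, str] = {}
        self.live_calls = 0

    def play(self, request: dict, send: Callable[[dict], str]) -> str:
        # Serialize with sorted keys so equal requests map to the same entry.
        key = json.dumps(request, sort_keys=True)
        if key not in self.recordings:
            self.live_calls += 1
            self.recordings[key] = send(request)
        return self.recordings[key]


def fake_api(request: dict) -> str:
    """Stand-in for a real, billable API call."""
    return f"echo: {request['prompt']}"


cassette = Cassette()
first = cassette.play({"prompt": "hello"}, fake_api)
second = cassette.play({"prompt": "hello"}, fake_api)  # replayed, no live call
```

Persisting `recordings` to disk is what lets later test runs work offline and return the same responses every time.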

## Performance Benchmarking

Integration tests include performance benchmarking capabilities:

- **Stream Performance**: `test_stream_time()` benchmarks streaming response times
- **Batch Performance**: Timing analysis for batch operations
- **Tool Calling Performance**: Benchmarking for tool calling overhead

Performance tests use pytest-benchmark for detailed statistical analysis and regression detection.
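pytest-benchmark handles timing and statistics for you. To show only the underlying idea, here is a hand-rolled sketch using `time.perf_counter` and `statistics`. `fake_stream` and `time_stream` are hypothetical stand-ins, and this is not how the suite itself measures performance.

```python
import statistics
import time
from typing import Callable, Dict, Iterator, List


def fake_stream() -> Iterator[str]:
    """Stand-in for a model's streaming response."""
    for token in ["Hello", ",", " world"]:
        yield token


def time_stream(stream_fn: Callable[[], Iterator[str]], rounds: int = 5) -> Dict[str, float]:
    """Time how long it takes to fully consume the stream, over several rounds."""
    durations: List[float] = []
    for _ in range(rounds):
        start = time.perf_counter()
        for _chunk in stream_fn():
            pass  # consume every chunk; a real test might also validate them
        durations.append(time.perf_counter() - start)
    return {
        "rounds": rounds,
        "mean": statistics.mean(durations),
        "max": max(durations),
    }


result = time_stream(fake_stream, rounds=3)
```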

## Multimodal Input Testing

For models that support multimodal inputs, the framework provides comprehensive testing:

- **Image Inputs**: Base64 encoded images and image URLs
- **PDF Inputs**: Document processing capabilities
- **Audio Inputs**: Speech and audio file processing
- **Video Inputs**: Video content analysis

Each multimodal capability is controlled by feature flags in the test class configuration.
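For the base64 case, the image bytes are encoded and wrapped in a message content block. The sketch below uses the OpenAI-style `image_url` block with a data URL, which is one of several formats in use. Whether a given model accepts this exact shape is an assumption, and `image_block` is a hypothetical helper.

```python
import base64


def image_block(image_bytes: bytes, mime_type: str = "image/png") -> dict:
    """Wrap raw image bytes in an image_url content block using a data URL."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


# Message content that mixes a text prompt with an inline image.
content = [
    {"type": "text", "text": "Describe this image."},
    image_block(b"abc"),  # placeholder bytes, not a real image
]
```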

## Error Handling Validation

Integration tests verify proper error handling for common failure scenarios:

- **API Key Errors**: Invalid or missing authentication
- **Rate Limiting**: Handling of rate limit responses
- **Network Errors**: Connection timeouts and failures
- **Invalid Parameters**: Malformed requests and responses
- **Tool Errors**: Tool execution failures and error propagation

The framework ensures that implementations handle these errors gracefully and provide meaningful error messages to developers.
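As one illustration of tool error propagation, a tool failure can be caught and reported back as a message marked with an error status rather than crashing the run. The sketch below uses plain dictionaries. `run_tool` and `divide` are hypothetical, and the `status` values follow the `"success"` and `"error"` convention used by LangChain tool messages.

```python
from typing import Any, Callable, Dict


def run_tool(tool: Callable[..., Any], args: Dict[str, Any], tool_call_id: str) -> Dict[str, Any]:
    """Run a tool and return a tool-message-like dict, capturing any failure."""
    try:
        result = tool(**args)
    except Exception as exc:
        # Surface the failure to the caller instead of raising.
        return {
            "tool_call_id": tool_call_id,
            "status": "error",
            "content": f"{type(exc).__name__}: {exc}",
        }
    return {"tool_call_id": tool_call_id, "status": "success", "content": str(result)}


def divide(a: float, b: float) -> float:
    return a / b


ok = run_tool(divide, {"a": 6, "b": 3}, "call_1")
failed = run_tool(divide, {"a": 1, "b": 0}, "call_2")
```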
