# Integration Testing

Comprehensive test classes for full functionality verification including real API calls, streaming, tool calling, structured output, and multimodal inputs. Integration tests verify complete feature sets and real-world usage patterns with external services.

## Capabilities

### Chat Model Integration Tests

Comprehensive integration testing for chat models with 40+ test methods covering all aspects of chat model functionality.

```python { .api }
from langchain_tests.integration_tests import ChatModelIntegrationTests

class ChatModelIntegrationTests(ChatModelTests):
    """Integration tests for chat models with comprehensive functionality testing."""

    # Inherits all configuration from ChatModelTests

    # Basic invocation tests
    def test_invoke(self) -> None:
        """Test basic model invocation with simple prompts."""

    def test_ainvoke(self) -> None:
        """Test asynchronous model invocation."""

    # Streaming tests
    def test_stream(self) -> None:
        """Test streaming responses from the model."""

    def test_astream(self) -> None:
        """Test asynchronous streaming responses."""

    # Batch processing tests
    def test_batch(self) -> None:
        """Test batch processing of multiple prompts."""

    def test_abatch(self) -> None:
        """Test asynchronous batch processing."""

    # Conversation tests
    def test_conversation(self) -> None:
        """Test multi-turn conversation handling."""

    def test_double_messages_conversation(self) -> None:
        """Test sequential message handling in conversations."""

    # Usage metadata tests
    def test_usage_metadata(self) -> None:
        """Test usage metadata tracking and validation."""

    def test_usage_metadata_streaming(self) -> None:
        """Test usage metadata in streaming responses."""

    # Stop sequence tests
    def test_stop_sequence(self) -> None:
        """Test stop sequence functionality."""

    # Tool calling tests (if has_tool_calling=True)
    def test_tool_calling(self) -> None:
        """Test tool calling functionality."""

    def test_tool_calling_async(self) -> None:
        """Test asynchronous tool calling."""

    def test_bind_runnables_as_tools(self) -> None:
        """Test binding runnable objects as tools."""

    def test_tool_message_histories_string_content(self) -> None:
        """Test tool message histories with string content."""

    def test_tool_message_histories_list_content(self) -> None:
        """Test tool message histories with complex list content."""

    def test_tool_choice(self) -> None:
        """Test tool choice functionality."""

    def test_tool_calling_with_no_arguments(self) -> None:
        """Test tool calling with tools that take no arguments."""

    def test_tool_message_error_status(self) -> None:
        """Test error handling in tool messages."""

    # Structured output tests (if has_structured_output=True)
    def test_structured_few_shot_examples(self) -> None:
        """Test structured output with few-shot examples."""

    def test_structured_output(self) -> None:
        """Test structured output generation."""

    def test_structured_output_async(self) -> None:
        """Test asynchronous structured output generation."""

    def test_structured_output_pydantic_2_v1(self) -> None:
        """Test Pydantic V1 compatibility in structured output."""

    def test_structured_output_optional_param(self) -> None:
        """Test structured output with optional parameters."""

    # JSON mode tests (if supports_json_mode=True)
    def test_json_mode(self) -> None:
        """Test JSON mode functionality."""

    # Multimodal input tests (if corresponding support flags=True)
    def test_pdf_inputs(self) -> None:
        """Test PDF input handling."""

    def test_audio_inputs(self) -> None:
        """Test audio input handling."""

    def test_image_inputs(self) -> None:
        """Test image input handling."""

    def test_image_tool_message(self) -> None:
        """Test image content in tool messages."""

    def test_anthropic_inputs(self) -> None:
        """Test Anthropic-style input format handling."""

    # Message handling tests
    def test_message_with_name(self) -> None:
        """Test messages with name attributes."""

    # Advanced functionality tests
    def test_agent_loop(self) -> None:
        """Test agent loop functionality with tool calling."""

    def test_unicode_tool_call_integration(self) -> None:
        """Test Unicode handling in tool calls."""

    # Performance tests
    def test_stream_time(self) -> None:
        """Benchmark streaming performance."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import ChatModelIntegrationTests
from my_integration import MyChatModel

class TestMyChatModelIntegration(ChatModelIntegrationTests):
    @property
    def chat_model_class(self):
        return MyChatModel

    @property
    def chat_model_params(self):
        return {
            "api_key": "real-api-key",  # Use real credentials for integration tests
            "model": "gpt-4",
            "temperature": 0.1
        }

    # Configure model capabilities
    @property
    def has_tool_calling(self):
        return True

    @property
    def has_structured_output(self):
        return True

    @property
    def supports_image_inputs(self):
        return True

    @property
    def returns_usage_metadata(self):
        return True
```
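
Because integration tests need real credentials, it can help to read them from the environment rather than hard-coding them in `chat_model_params`. The following is a minimal sketch under stated assumptions: the variable name `MY_PROVIDER_API_KEY` and the helper `load_chat_model_params` are hypothetical and not part of `langchain_tests`.

```python
import os
from typing import Mapping, Optional

# Hypothetical environment variable name, chosen for illustration only.
API_KEY_ENV_VAR = "MY_PROVIDER_API_KEY"


def load_chat_model_params(env: Optional[Mapping[str, str]] = None) -> dict:
    """Build constructor params, reading the API key from the environment."""
    env = os.environ if env is None else env
    api_key = env.get(API_KEY_ENV_VAR)
    if not api_key:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(
            f"Set {API_KEY_ENV_VAR} before running integration tests"
        )
    return {"api_key": api_key, "model": "gpt-4", "temperature": 0.1}
```

A `chat_model_params` property could then simply return `load_chat_model_params()`.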

### Embeddings Integration Tests

Integration testing for embeddings models with synchronous and asynchronous operations.

```python { .api }
from langchain_tests.integration_tests import EmbeddingsIntegrationTests

class EmbeddingsIntegrationTests(EmbeddingsTests):
    """Integration tests for embeddings models."""

    def test_embed_query(self) -> None:
        """Test embedding a single query string."""

    def test_embed_documents(self) -> None:
        """Test embedding a list of documents."""

    def test_aembed_query(self) -> None:
        """Test asynchronous embedding of a single query."""

    def test_aembed_documents(self) -> None:
        """Test asynchronous embedding of document lists."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import EmbeddingsIntegrationTests
from my_integration import MyEmbeddings

class TestMyEmbeddingsIntegration(EmbeddingsIntegrationTests):
    @property
    def embeddings_class(self):
        return MyEmbeddings

    @property
    def embedding_model_params(self):
        return {
            "api_key": "real-api-key",
            "model": "text-embedding-3-large"
        }
```
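
To give a feel for what these tests look for, here is a rough sketch of the kind of shape checks an embeddings test might make. It is an illustration only: `ToyEmbeddings` is a hypothetical, network-free stand-in, and `check_embeddings` is not the suite's actual assertion logic.

```python
import hashlib
from typing import List


class ToyEmbeddings:
    """Hypothetical stand-in for a real embeddings model (no network calls)."""

    def __init__(self, dim: int = 8) -> None:
        self.dim = dim

    def embed_query(self, text: str) -> List[float]:
        # Derive a deterministic pseudo-embedding from a hash of the text.
        digest = hashlib.sha256(text.encode("utf-8")).digest()
        return [byte / 255.0 for byte in digest[: self.dim]]

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [self.embed_query(text) for text in texts]


def check_embeddings(model) -> int:
    """Assumed checks: vectors are non-empty float lists of one dimension."""
    query_vec = model.embed_query("foo")
    assert isinstance(query_vec, list) and query_vec
    assert all(isinstance(x, float) for x in query_vec)

    doc_vecs = model.embed_documents(["foo", "bar", "baz"])
    assert len(doc_vecs) == 3
    assert all(len(vec) == len(query_vec) for vec in doc_vecs)
    return len(query_vec)
```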

### Tools Integration Tests

Integration testing for tools with schema validation and invocation verification.

```python { .api }
from langchain_tests.integration_tests import ToolsIntegrationTests

class ToolsIntegrationTests(ToolsTests):
    """Integration tests for tools."""

    def test_invoke_matches_output_schema(self) -> None:
        """Test that tool output matches its declared schema."""

    def test_async_invoke_matches_output_schema(self) -> None:
        """Test that async tool output matches its declared schema."""

    def test_invoke_no_tool_call(self) -> None:
        """Test direct tool invocation without tool call wrapper."""

    def test_async_invoke_no_tool_call(self) -> None:
        """Test direct async tool invocation."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import ToolsIntegrationTests
from my_integration import MySearchTool

class TestMySearchToolIntegration(ToolsIntegrationTests):
    @property
    def tool_constructor(self):
        return MySearchTool

    @property
    def tool_constructor_params(self):
        return {
            "api_key": "real-search-api-key",
            "base_url": "https://api.search-service.com"
        }

    @property
    def tool_invoke_params_example(self):
        return {
            "query": "LangChain framework",
            "num_results": 5
        }
```

### Retrievers Integration Tests

Integration testing for retriever implementations with document retrieval and parameter validation.

```python { .api }
from langchain_tests.integration_tests import RetrieversIntegrationTests

class RetrieversIntegrationTests(BaseStandardTests):
    """Integration tests for retrievers."""

    # Required abstract properties
    @property
    def retriever_constructor(self):
        """Retriever class to test."""

    @property
    def retriever_constructor_params(self) -> dict:
        """Constructor parameters for the retriever."""

    @property
    def retriever_query_example(self) -> str:
        """Example query string for testing."""

    @property
    def num_results_arg_name(self) -> str:
        """Name of the parameter that controls number of results. Default: 'k'."""

    # Fixtures
    @pytest.fixture
    def retriever(self):
        """Retriever fixture for testing."""

    def test_k_constructor_param(self) -> None:
        """Test the number of results constructor parameter."""

    def test_invoke_with_k_kwarg(self) -> None:
        """Test runtime parameter for number of results."""

    def test_invoke_returns_documents(self) -> None:
        """Test that retriever returns Document objects."""

    def test_ainvoke_returns_documents(self) -> None:
        """Test that async retriever returns Document objects."""
```

#### Usage Example

```python
from langchain_tests.integration_tests import RetrieversIntegrationTests
from my_integration import MyRetriever

class TestMyRetrieverIntegration(RetrieversIntegrationTests):
    @property
    def retriever_constructor(self):
        return MyRetriever

    @property
    def retriever_constructor_params(self):
        return {
            "index_name": "test-index",
            "api_key": "real-api-key"
        }

    @property
    def retriever_query_example(self):
        return "machine learning algorithms"

    @property
    def num_results_arg_name(self):
        return "top_k"  # If your retriever uses 'top_k' instead of 'k'
```
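
The role of `num_results_arg_name` can be sketched as follows: a test that only knows the argument's name can pass it both at construction time and at call time. `ToyRetriever` is a hypothetical in-memory class used for illustration; it returns plain strings rather than `Document` objects and is not how the suite itself is implemented.

```python
from typing import List


class ToyRetriever:
    """Hypothetical in-memory retriever whose result limit is called `top_k`."""

    def __init__(self, corpus: List[str], top_k: int = 2) -> None:
        self.corpus = corpus
        self.top_k = top_k

    def invoke(self, query: str, **kwargs) -> List[str]:
        # A runtime `top_k` overrides the value given to the constructor.
        limit = kwargs.get("top_k", self.top_k)
        matches = [doc for doc in self.corpus if query.lower() in doc.lower()]
        return matches[:limit]


NUM_RESULTS_ARG_NAME = "top_k"  # what `num_results_arg_name` would return
corpus = [f"machine learning note {i}" for i in range(5)]

# Constructor-level limit, passed by name.
retriever = ToyRetriever(corpus, **{NUM_RESULTS_ARG_NAME: 3})
constructor_results = retriever.invoke("machine learning")

# Call-level limit, passed by the same name.
runtime_results = retriever.invoke("machine learning", **{NUM_RESULTS_ARG_NAME: 1})
```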

## Pre-defined Test Tools

The integration test framework includes several pre-built tools for testing tool calling functionality:

```python { .api }
# Pre-defined tools for testing
def magic_function(input: int) -> int:
    """Magic function tool with input validation."""

def magic_function_no_args() -> str:
    """No-argument magic function tool."""

def unicode_customer(customer_name: str, description: str) -> str:
    """Unicode handling tool for internationalization testing."""

def current_weather_tool():
    """Weather tool fixture for testing tool calling."""
```
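
As a rough illustration of how such tools take part in a tool-calling round trip, the sketch below dispatches a tool call (a dict with `name`, `args`, and `id`) to a plain Python function. The function bodies and the `run_tool_call` helper are assumptions made for this example; the real tools' behavior may differ.

```python
from typing import Any, Callable, Dict


def magic_function(input: int) -> int:
    """Illustrative body only; the real tool's behavior may differ."""
    return input + 3


def magic_function_no_args() -> str:
    """Illustrative body only."""
    return "magic"


# Registry mapping tool names to callables (hypothetical helper structure).
TOOLS: Dict[str, Callable[..., Any]] = {
    "magic_function": magic_function,
    "magic_function_no_args": magic_function_no_args,
}


def run_tool_call(tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Execute one tool call and return a tool-message-like dict."""
    func = TOOLS[tool_call["name"]]
    result = func(**tool_call.get("args", {}))
    return {"tool_call_id": tool_call["id"], "content": str(result)}
```

For example, `run_tool_call({"name": "magic_function", "args": {"input": 3}, "id": "call_1"})` yields a message whose `content` is `"6"` under the assumed body above.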

## Test Callback Handlers

Integration tests include callback handlers for capturing and validating model behavior:

```python { .api }
class _TestCallbackHandler:
    """Callback handler for capturing chat model options and events."""

    def on_chat_model_start(self, serialized, messages, **kwargs):
        """Called when chat model starts processing."""

    def on_llm_end(self, response, **kwargs):
        """Called when chat model completes processing."""
```
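
The capture pattern can be sketched with a small stand-alone class. This is a simplified illustration, not the library's handler: `RecordingHandler` and `fake_model_call` are hypothetical names, and passing the call options through an `options` keyword is an assumption of this sketch.

```python
from typing import Any, Dict, List


class RecordingHandler:
    """Hypothetical handler that records options seen at model start."""

    def __init__(self) -> None:
        self.options: List[Dict[str, Any]] = []
        self.finished = 0

    def on_chat_model_start(self, serialized, messages, **kwargs) -> None:
        # Store whatever invocation options were forwarded to the callback.
        self.options.append(kwargs.get("options", {}))

    def on_llm_end(self, response, **kwargs) -> None:
        self.finished += 1


def fake_model_call(handler: RecordingHandler, stop: List[str]) -> None:
    """Simulate a model run that fires the start and end callbacks."""
    handler.on_chat_model_start({}, [["hi"]], options={"stop": stop})
    handler.on_llm_end({"text": "hello"})
```

A test can then assert on `handler.options` to confirm that parameters such as stop sequences actually reached the model call.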

## Schema Generation Utilities

Utilities for generating test schemas for structured output testing:

```python { .api }
def _get_joke_class(schema_type: str):
    """Generate joke schema for different output formats."""
```
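
One way to picture a schema selector like this is a function that returns the same "joke" structure in different forms. The sketch below is standard-library only and makes several assumptions: it uses a dataclass where a Pydantic model might be used, the accepted `schema_type` strings are invented for illustration, and `get_joke_schema` and `is_valid_joke` are hypothetical helpers.

```python
from dataclasses import dataclass
from typing import Any, Dict, TypedDict


@dataclass
class JokeDataclass:
    setup: str
    punchline: str


class JokeTypedDict(TypedDict):
    setup: str
    punchline: str


JOKE_JSON_SCHEMA: Dict[str, Any] = {
    "title": "Joke",
    "type": "object",
    "properties": {
        "setup": {"type": "string"},
        "punchline": {"type": "string"},
    },
    "required": ["setup", "punchline"],
}


def get_joke_schema(schema_type: str) -> Any:
    """Return the joke schema in the requested form (illustrative names)."""
    schemas = {
        "dataclass": JokeDataclass,
        "typeddict": JokeTypedDict,
        "json_schema": JOKE_JSON_SCHEMA,
    }
    try:
        return schemas[schema_type]
    except KeyError:
        raise ValueError(f"Unsupported schema_type: {schema_type!r}") from None


def is_valid_joke(result: Dict[str, Any]) -> bool:
    """Check that a dict result has exactly the two expected string fields."""
    return set(result) == {"setup", "punchline"} and all(
        isinstance(value, str) for value in result.values()
    )
```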

## VCR Integration

When enabled, integration tests use VCR (Video Cassette Recorder) to record HTTP calls and replay them on later runs, which provides:

- **Consistent Testing**: Record real API responses once, replay for subsequent test runs
- **Offline Testing**: Run tests without network connectivity
- **Cost Reduction**: Avoid repeated API calls during test development
- **Deterministic Results**: Same responses every time for reliable testing

VCR integration is controlled by the `enable_vcr_tests` property in the base test class.
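
The record-and-replay idea behind the points above can be sketched with the standard library alone. This is a conceptual illustration, not the API of any VCR library: the `Cassette` class, its JSON file format, and `fake_api` are all assumptions made for the example.

```python
import json
import os
import tempfile
from typing import Callable, Dict, List


class Cassette:
    """Hypothetical cassette: stores responses on disk, keyed by request."""

    def __init__(self, path: str) -> None:
        self.path = path
        self.recordings: Dict[str, str] = {}
        if os.path.exists(path):
            with open(path, "r", encoding="utf-8") as f:
                self.recordings = json.load(f)

    def request(self, key: str, live_call: Callable[[], str]) -> str:
        # Only hit the live service when no recording exists for this key.
        if key not in self.recordings:
            self.recordings[key] = live_call()
            with open(self.path, "w", encoding="utf-8") as f:
                json.dump(self.recordings, f)
        return self.recordings[key]


calls: List[int] = []


def fake_api() -> str:
    """Stands in for a real, billable HTTP call."""
    calls.append(1)
    return "hello from the API"


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "cassette.json")
    first = Cassette(path).request("GET /hello", fake_api)   # records
    second = Cassette(path).request("GET /hello", fake_api)  # replays
```

In this sketch the second request is served from the file, so `fake_api` runs only once.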

## Performance Benchmarking

Integration tests include performance benchmarking capabilities:

- **Stream Performance**: `test_stream_time()` benchmarks streaming response times
- **Batch Performance**: Timing analysis for batch operations
- **Tool Calling Performance**: Benchmarking for tool calling overhead

Performance tests use pytest-benchmark for detailed statistical analysis and regression detection.
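
At its simplest, timing a stream means consuming every chunk and measuring the elapsed wall-clock time. The sketch below uses `time.perf_counter` instead of pytest-benchmark, and `fake_stream` is a hypothetical generator standing in for a model's streaming output.

```python
import time
from typing import Iterable, Iterator, Tuple


def fake_stream(n_chunks: int) -> Iterator[str]:
    """Hypothetical stand-in for a model's streamed chunks."""
    for i in range(n_chunks):
        yield f"chunk-{i}"


def time_stream(chunks: Iterable[str]) -> Tuple[int, float]:
    """Consume the whole stream; return (chunk count, elapsed seconds)."""
    start = time.perf_counter()
    count = sum(1 for _ in chunks)
    return count, time.perf_counter() - start


count, elapsed = time_stream(fake_stream(5))
```

A benchmarking plugin would repeat this measurement many times and report statistics rather than a single sample.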

## Multimodal Input Testing

For models that support multimodal inputs, the framework provides comprehensive testing:

- **Image Inputs**: Base64 encoded images and image URLs
- **PDF Inputs**: Document processing capabilities
- **Audio Inputs**: Speech and audio file processing
- **Video Inputs**: Video content analysis

Each multimodal capability is controlled by feature flags in the test class configuration.
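
For the base64 image case, a message's content is typically a list mixing text and image blocks. The sketch below builds one such list using a data-URL `image_url` block; that block shape is an assumption for illustration (providers and test suites may expect other formats), and both helper functions are hypothetical.

```python
import base64
from typing import Any, Dict, List


def image_block(image_bytes: bytes, mime_type: str = "image/png") -> Dict[str, Any]:
    """Encode raw bytes as a data-URL image block (assumed format)."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {
        "type": "image_url",
        "image_url": {"url": f"data:{mime_type};base64,{encoded}"},
    }


def multimodal_content(text: str, image_bytes: bytes) -> List[Dict[str, Any]]:
    """Combine a text prompt and an image into one content list."""
    return [{"type": "text", "text": text}, image_block(image_bytes)]


# Placeholder bytes; a real test would load an actual image file.
content = multimodal_content("Describe this image.", b"\x89PNG fake bytes")
```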

## Error Handling Validation

Integration tests verify proper error handling for common failure scenarios:

- **API Key Errors**: Invalid or missing authentication
- **Rate Limiting**: Handling of rate limit responses
- **Network Errors**: Connection timeouts and failures
- **Invalid Parameters**: Malformed requests and responses
- **Tool Errors**: Tool execution failures and error propagation

The framework ensures that implementations handle these errors gracefully and provide meaningful error messages to developers.
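
Tool error propagation, the last item in the list above, can be sketched as catching the exception and returning a tool message marked with an error status, so the model can see what went wrong. The dict shape, `failing_tool`, and `execute_tool` are assumptions for this example rather than the framework's implementation.

```python
from typing import Any, Callable, Dict


def failing_tool(query: str) -> str:
    """Hypothetical tool that always fails."""
    raise ValueError(f"cannot search for {query!r}")


def execute_tool(tool: Callable[..., Any], tool_call: Dict[str, Any]) -> Dict[str, Any]:
    """Run a tool call; convert exceptions into an error-status message."""
    try:
        content = str(tool(**tool_call["args"]))
        status = "success"
    except Exception as exc:
        # Keep the exception type and message so the failure is diagnosable.
        content = f"{type(exc).__name__}: {exc}"
        status = "error"
    return {"tool_call_id": tool_call["id"], "content": content, "status": status}


message = execute_tool(
    failing_tool,
    {"name": "search", "args": {"query": "x"}, "id": "call_1"},
)
```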