# LangChain Groq

An integration package connecting Groq's Language Processing Unit (LPU) with LangChain for high-performance AI inference. This package provides seamless access to Groq's deterministic, single-core streaming architecture that delivers predictable and repeatable performance for GenAI inference workloads.

## Package Information

- **Package Name**: langchain-groq
- **Language**: Python
- **Installation**: `pip install langchain-groq`
- **Dependencies**: langchain-core, groq
- **Python Version**: >=3.9

## Core Imports

```python
from langchain_groq import ChatGroq
```

Import version information:

```python
from langchain_groq import __version__
```

## Basic Usage

```python
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage, SystemMessage

# Basic initialization
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.0,
    api_key="your-groq-api-key"  # or set GROQ_API_KEY env var
)

# Simple conversation
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?")
]

response = llm.invoke(messages)
print(response.content)

# Streaming response
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```

## Architecture

LangChain Groq integrates with the LangChain ecosystem through the standard `BaseChatModel` interface, providing:

- **LangChain Compatibility**: Full integration with LangChain's Runnable interface, supporting chaining, composition, and streaming
- **Groq LPU Integration**: Direct connection to Groq's deterministic Language Processing Units for consistent, high-performance inference
- **Tool Calling Support**: Native function calling capabilities using OpenAI-compatible tool schemas
- **Structured Output**: Built-in support for generating responses conforming to specific schemas via function calling or JSON mode
- **Async Support**: Full asynchronous operation support for high-throughput applications
- **Streaming**: Real-time token streaming with predictable performance characteristics

The package follows LangChain's standard patterns while leveraging Groq's unique deterministic architecture for reproducible results across inference runs.

## Environment Variables

- **GROQ_API_KEY**: Required API key for the Groq service
- **GROQ_API_BASE**: Optional custom API base URL
- **GROQ_PROXY**: Optional proxy configuration
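For local development it is common to supply the key through the environment rather than the constructor. A minimal sketch (the key value is a placeholder, not a real credential):

```python
import os

# Placeholder credential for illustration; ChatGroq reads GROQ_API_KEY
# automatically whenever api_key is not passed to the constructor.
os.environ["GROQ_API_KEY"] = "gsk-placeholder"

# Optional overrides mirror the variables listed above, e.g.:
# os.environ["GROQ_API_BASE"] = "https://api.groq.com"

print(os.environ["GROQ_API_KEY"])
```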

## Capabilities

### Chat Model Initialization

Initialize the ChatGroq model with comprehensive configuration options for performance, behavior, and API settings.

```python { .api }
class ChatGroq:
    def __init__(
        self,
        model: str,
        temperature: float = 0.7,
        max_tokens: Optional[int] = None,
        stop: Optional[Union[List[str], str]] = None,
        reasoning_format: Optional[Literal["parsed", "raw", "hidden"]] = None,
        reasoning_effort: Optional[str] = None,
        service_tier: Literal["on_demand", "flex", "auto"] = "on_demand",
        api_key: Optional[str] = None,
        base_url: Optional[str] = None,
        timeout: Union[float, Tuple[float, float], Any, None] = None,
        max_retries: int = 2,
        streaming: bool = False,
        n: int = 1,
        model_kwargs: Optional[Dict[str, Any]] = None,
        default_headers: Union[Mapping[str, str], None] = None,
        default_query: Union[Mapping[str, object], None] = None,
        http_client: Union[Any, None] = None,
        http_async_client: Union[Any, None] = None,
        **kwargs: Any
    ) -> None:
        """
        Initialize ChatGroq model.

        Parameters:
        - model: Name of Groq model (e.g., "llama-3.1-8b-instant").
          Aliased to internal field 'model_name'.
        - temperature: Sampling temperature (0.0 to 1.0)
        - max_tokens: Maximum tokens to generate
        - stop: Stop sequences (string or list of strings).
          Aliased to internal field 'stop_sequences'.
        - reasoning_format: Format for reasoning output ("parsed", "raw", "hidden")
        - reasoning_effort: Level of reasoning effort
        - service_tier: Service tier ("on_demand", "flex", "auto")
        - api_key: Groq API key (defaults to GROQ_API_KEY env var).
          Aliased to internal field 'groq_api_key'.
        - base_url: Custom API base URL.
          Aliased to internal field 'groq_api_base'.
        - timeout: Request timeout in seconds.
          Aliased to internal field 'request_timeout'.
        - max_retries: Maximum retry attempts
        - streaming: Enable streaming responses
        - n: Number of completions to generate
        - model_kwargs: Additional model parameters
        - default_headers: Default HTTP headers
        - default_query: Default query parameters
        - http_client: Custom httpx client for sync requests
        - http_async_client: Custom httpx client for async requests
        """
```

### Synchronous Chat Operations

Generate responses using synchronous methods for immediate results and batch processing.

```python { .api }
def invoke(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> BaseMessage:
    """
    Generate a single response from input messages.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Returns:
    BaseMessage: Generated response message
    """

def batch(
    self,
    inputs: List[LanguageModelInput],
    config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None,
    **kwargs: Any
) -> List[BaseMessage]:
    """
    Process multiple inputs in batch.

    Parameters:
    - inputs: List of message sequences or strings
    - config: Runtime configuration(s)
    - **kwargs: Additional parameters

    Returns:
    List[BaseMessage]: List of generated responses
    """

def stream(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> Iterator[BaseMessageChunk]:
    """
    Stream response tokens as they're generated.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Yields:
    BaseMessageChunk: Individual response chunks
    """

def generate(
    self,
    messages: List[List[BaseMessage]],
    stop: Optional[List[str]] = None,
    callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None,
    **kwargs: Any
) -> LLMResult:
    """
    Legacy generate method returning detailed results.

    Parameters:
    - messages: List of message sequences
    - stop: Stop sequences
    - callbacks: Callback handlers
    - **kwargs: Additional parameters

    Returns:
    LLMResult: Detailed generation results with metadata
    """
```

### Asynchronous Chat Operations

Generate responses using asynchronous methods for concurrent processing and high-throughput applications.

```python { .api }
async def ainvoke(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> BaseMessage:
    """
    Asynchronously generate a single response.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Returns:
    BaseMessage: Generated response message
    """

async def abatch(
    self,
    inputs: List[LanguageModelInput],
    config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None,
    **kwargs: Any
) -> List[BaseMessage]:
    """
    Asynchronously process multiple inputs in batch.

    Parameters:
    - inputs: List of message sequences or strings
    - config: Runtime configuration(s)
    - **kwargs: Additional parameters

    Returns:
    List[BaseMessage]: List of generated responses
    """

async def astream(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> AsyncIterator[BaseMessageChunk]:
    """
    Asynchronously stream response tokens.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Yields:
    BaseMessageChunk: Individual response chunks
    """

async def agenerate(
    self,
    messages: List[List[BaseMessage]],
    stop: Optional[List[str]] = None,
    callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None,
    **kwargs: Any
) -> LLMResult:
    """
    Asynchronously generate with detailed results.

    Parameters:
    - messages: List of message sequences
    - stop: Stop sequences
    - callbacks: Callback handlers
    - **kwargs: Additional parameters

    Returns:
    LLMResult: Detailed generation results with metadata
    """
```
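The async methods combine naturally with `asyncio.gather` to fan out concurrent requests. The pattern can be sketched with a stub coroutine standing in for `ainvoke`, since a real call would require a GROQ_API_KEY and network access:

```python
import asyncio

# Stub that mimics ChatGroq.ainvoke's awaitable interface; swap in
# llm.ainvoke for real inference.
async def fake_ainvoke(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to: {prompt}"

async def run_concurrently(prompts: list) -> list:
    # gather issues all calls concurrently and preserves input order
    return await asyncio.gather(*(fake_ainvoke(p) for p in prompts))

prompts = ["What is 2+2?", "Name a prime number.", "Capital of France?"]
results = asyncio.run(run_concurrently(prompts))
print(results[0])  # response to: What is 2+2?
```

The same fan-out works with the real model by replacing `fake_ainvoke` with `llm.ainvoke`.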

### Tool Integration

Bind tools and functions to enable function calling capabilities with the Groq model.

```python { .api }
def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    *,
    tool_choice: Optional[Union[Dict, str, Literal["auto", "any", "none"], bool]] = None,
    **kwargs: Any
) -> Runnable[LanguageModelInput, BaseMessage]:
    """
    Bind tools for function calling.

    Parameters:
    - tools: List of tool definitions (Pydantic models, functions, or dicts)
    - tool_choice: Tool selection strategy
      - "auto": Model chooses whether to call tools
      - "any"/"required": Model must call a tool
      - "none": Disable tool calling
      - str: Specific tool name to call
      - bool: True requires a single tool call
      - dict: {"type": "function", "function": {"name": "tool_name"}}
    - **kwargs: Additional binding parameters

    Returns:
    Runnable: Model with bound tools
    """

def bind_functions(
    self,
    functions: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    function_call: Optional[Union[Dict, str, Literal["auto", "none"]]] = None,
    **kwargs: Any
) -> Runnable[LanguageModelInput, BaseMessage]:
    """
    [DEPRECATED] Bind functions for function calling. Use bind_tools instead.

    This method is deprecated since version 0.2.1 and will be removed in 1.0.0.
    Use bind_tools() for new development.

    Parameters:
    - functions: List of function definitions (dicts, Pydantic models, callables, or tools)
    - function_call: Function call strategy
      - "auto": Model chooses whether to call a function
      - "none": Disable function calling
      - str: Specific function name to call
      - dict: {"name": "function_name"}
    - **kwargs: Additional binding parameters

    Returns:
    Runnable: Model with bound functions
    """
```
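Besides Pydantic models and callables, `bind_tools` accepts raw OpenAI-compatible schema dicts. An illustrative schema (the tool name and fields are invented for the example):

```python
# Dict equivalent of the Pydantic-model form; the name "get_weather"
# and its parameters are illustrative only.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. 'San Francisco, CA'",
                },
            },
            "required": ["location"],
        },
    },
}

# Passed the same way as a Pydantic tool:
# llm_with_tools = llm.bind_tools([weather_tool], tool_choice="auto")
print(weather_tool["function"]["name"])  # get_weather
```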

### Structured Output

Generate responses conforming to specific schemas using function calling or JSON mode.

```python { .api }
def with_structured_output(
    self,
    schema: Optional[Union[Dict, Type[BaseModel]]] = None,
    *,
    method: Literal["function_calling", "json_mode"] = "function_calling",
    include_raw: bool = False,
    **kwargs: Any
) -> Runnable[LanguageModelInput, Union[Dict, BaseModel]]:
    """
    Create model that outputs structured data.

    Parameters:
    - schema: Output schema (Pydantic model, TypedDict, or OpenAI function schema)
    - method: Generation method
      - "function_calling": Use function calling API
      - "json_mode": Use JSON mode (requires schema instructions in prompt)
    - include_raw: Include raw response alongside parsed output
    - **kwargs: Additional parameters

    Returns:
    Runnable: Model that returns structured output

    If include_raw=False:
    - Returns: Instance of schema type (if Pydantic) or dict
    If include_raw=True:
    - Returns: Dict with keys 'raw', 'parsed', 'parsing_error'
    """
```
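The `include_raw=True` shape lends itself to a parse-or-fallback pattern. A sketch with placeholder values (in practice `raw` is an `AIMessage` and `parsed` is an instance of your schema, not hand-written dicts):

```python
# Illustrative stand-in for a with_structured_output(..., include_raw=True)
# result; real values come from the model.
result = {
    "raw": "<AIMessage containing the tool call>",
    "parsed": {"name": "John Smith", "age": 35},
    "parsing_error": None,
}

if result["parsing_error"] is None:
    print(result["parsed"]["name"])  # John Smith
else:
    # Fall back to the unparsed message when schema parsing failed
    print(result["raw"])
```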

### Model Properties

Access model configuration and type information.

```python { .api }
@property
def _llm_type(self) -> str:
    """
    Return model type identifier for LangChain integration.

    Returns:
    str: Always returns "groq-chat"
    """

@property
def lc_secrets(self) -> Dict[str, str]:
    """
    Return secret field mappings for serialization.

    Returns:
    Dict[str, str]: Mapping of secret fields to environment variables
    {"groq_api_key": "GROQ_API_KEY"}
    """

@classmethod
def is_lc_serializable(cls) -> bool:
    """
    Check if model supports LangChain serialization.

    Returns:
    bool: Always returns True
    """
```

## Usage Examples

### Tool Calling Example

```python
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field

class WeatherTool(BaseModel):
    """Get weather information for a location."""
    location: str = Field(description="City and state, e.g. 'San Francisco, CA'")

llm = ChatGroq(model="llama-3.1-8b-instant")
llm_with_tools = llm.bind_tools([WeatherTool], tool_choice="auto")

response = llm_with_tools.invoke("What's the weather in New York?")
print(response.tool_calls)
```

### Structured Output Example

```python
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field
from typing import Optional

class PersonInfo(BaseModel):
    """Extract person information from text."""
    name: str = Field(description="Person's full name")
    age: Optional[int] = Field(default=None, description="Person's age if mentioned")
    occupation: Optional[str] = Field(default=None, description="Person's job or profession")

llm = ChatGroq(model="llama-3.1-8b-instant")
structured_llm = llm.with_structured_output(PersonInfo)

result = structured_llm.invoke("John Smith is a 35-year-old software engineer.")
print(f"Name: {result.name}, Age: {result.age}, Job: {result.occupation}")
```

### Reasoning Model Example

```python
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage, SystemMessage

# Use a reasoning-capable model with parsed reasoning format
llm = ChatGroq(
    model="deepseek-r1-distill-llama-70b",
    reasoning_format="parsed"
)

messages = [
    SystemMessage(content="You are a math tutor. Show your reasoning."),
    HumanMessage(content="If a train travels 120 miles in 2 hours, what's its average speed?")
]

response = llm.invoke(messages)
print("Answer:", response.content)
print("Reasoning:", response.additional_kwargs.get("reasoning_content", "No reasoning available"))
```

### Streaming with Token Usage

```python
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.1-8b-instant")
messages = [{"role": "user", "content": "Write a short poem about coding."}]

full_response = None
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
    if full_response is None:
        full_response = chunk
    else:
        full_response += chunk

print("\n\nToken usage:", full_response.usage_metadata)
print("Response metadata:", full_response.response_metadata)
```

## Response Metadata

ChatGroq responses include comprehensive metadata for monitoring and optimization:

```python { .api }
# Response metadata structure
{
    "token_usage": {
        "completion_tokens": int,       # Output tokens used
        "prompt_tokens": int,           # Input tokens used
        "total_tokens": int,            # Total tokens used
        "completion_time": float,       # Time for completion
        "prompt_time": float,           # Time for prompt processing
        "queue_time": Optional[float],  # Time spent in queue
        "total_time": float             # Total processing time
    },
    "model_name": str,                  # Model used for generation
    "system_fingerprint": str,          # System configuration fingerprint
    "finish_reason": str,               # Completion reason ("stop", "length", etc.)
    "service_tier": str,                # Service tier used
    "reasoning_effort": Optional[str]   # Reasoning effort level (if applicable)
}
```
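The timing fields allow simple derived metrics such as output throughput. A small helper, shown here with invented usage numbers shaped like the structure above:

```python
def tokens_per_second(token_usage: dict) -> float:
    """Output throughput derived from Groq's token_usage metadata."""
    return token_usage["completion_tokens"] / token_usage["completion_time"]

# Invented numbers for illustration; real values come from
# response.response_metadata["token_usage"]
usage = {
    "completion_tokens": 600,
    "prompt_tokens": 30,
    "total_tokens": 630,
    "completion_time": 0.5,
    "prompt_time": 0.004,
    "queue_time": None,
    "total_time": 0.504,
}
print(tokens_per_second(usage))  # 1200.0
```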

## Error Handling

The package handles various error conditions and provides clear error messages:

```python
from langchain_groq import ChatGroq
from groq import BadRequestError

try:
    llm = ChatGroq(model="invalid-model")
    response = llm.invoke("Hello")
except BadRequestError as e:
    print(f"API Error: {e}")
except ValueError as e:
    print(f"Configuration Error: {e}")
```

Common validation errors:

- `n` must be >= 1
- `n` must be 1 when streaming is enabled
- Missing API key when the GROQ_API_KEY environment variable is not set
- Invalid model name or unavailable model
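Transient failures are retried internally according to `max_retries`; the behavior can be pictured with a small retry-with-backoff sketch (the flaky function simply simulates intermittent `ConnectionError`s):

```python
import time

def with_retries(fn, max_retries: int = 2, base_delay: float = 0.05):
    """Call fn, retrying with exponential backoff on ConnectionError."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}

def flaky():
    # Fails twice, then succeeds, simulating a transient outage
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky, max_retries=3)
print(result)  # ok
```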

## Types

```python { .api }
# Core types used throughout the API
from typing import Any, Callable, Dict, List, Literal, Optional, Sequence, Tuple, Union
from typing_extensions import TypedDict
from langchain_core.messages import BaseMessage, BaseMessageChunk
from langchain_core.outputs import ChatResult, LLMResult
from langchain_core.language_models import LanguageModelInput
from langchain_core.runnables import Runnable, RunnableConfig
from langchain_core.callbacks import BaseCallbackHandler, BaseCallbackManager
from langchain_core.tools import BaseTool
from pydantic import BaseModel, SecretStr
from collections.abc import AsyncIterator, Iterator, Mapping

# Message types for input
LanguageModelInput = Union[
    str,                 # Simple string input
    List[BaseMessage],   # List of messages
    # ... other LangChain input types
]

# Service tier options
ServiceTier = Literal["on_demand", "flex", "auto"]

# Reasoning format options
ReasoningFormat = Literal["parsed", "raw", "hidden"]

# Tool choice options
ToolChoice = Union[
    Dict,   # {"type": "function", "function": {"name": "tool_name"}}
    str,    # Tool name or "auto"/"any"/"none"
    Literal["auto", "any", "none"],
    bool    # True for single tool requirement
]
```