# LangChain Groq

An integration package connecting Groq's Language Processing Unit (LPU) with LangChain for high-performance AI inference. This package provides seamless access to Groq's deterministic, single-core streaming architecture that delivers predictable and repeatable performance for GenAI inference workloads.

## Package Information

- **Package Name**: langchain-groq
- **Language**: Python
- **Installation**: `pip install langchain-groq`
- **Dependencies**: langchain-core, groq
- **Python Version**: >=3.9

## Core Imports

```python
from langchain_groq import ChatGroq
```

Import version information:

```python
from langchain_groq import __version__
```

## Basic Usage

```python
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage, SystemMessage

# Basic initialization
llm = ChatGroq(
    model="llama-3.1-8b-instant",
    temperature=0.0,
    api_key="your-groq-api-key"  # or set GROQ_API_KEY env var
)

# Simple conversation
messages = [
    SystemMessage(content="You are a helpful assistant."),
    HumanMessage(content="What is the capital of France?")
]

response = llm.invoke(messages)
print(response.content)

# Streaming response
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```

## Architecture

LangChain Groq integrates with the LangChain ecosystem through the standard `BaseChatModel` interface, providing:

- **LangChain Compatibility**: Full integration with LangChain's Runnable interface, supporting chaining, composition, and streaming
- **Groq LPU Integration**: Direct connection to Groq's deterministic Language Processing Units for consistent, high-performance inference
- **Tool Calling Support**: Native function calling capabilities using OpenAI-compatible tool schemas
- **Structured Output**: Built-in support for generating responses conforming to specific schemas via function calling or JSON mode
- **Async Support**: Full asynchronous operation support for high-throughput applications
- **Streaming**: Real-time token streaming with predictable performance characteristics

The package follows LangChain's standard patterns while leveraging Groq's unique deterministic architecture for reproducible results across inference runs.

## Environment Variables

- **GROQ_API_KEY**: Required API key for the Groq service
- **GROQ_API_BASE**: Optional custom API base URL
- **GROQ_PROXY**: Optional proxy configuration
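For local development it is common to supply the key through the environment rather than the constructor. A minimal sketch (the key value is a placeholder, not a real credential):

```python
import os

# Placeholder credential for illustration; ChatGroq reads GROQ_API_KEY
# automatically whenever api_key is not passed to the constructor.
os.environ["GROQ_API_KEY"] = "gsk-placeholder"

# Optional overrides mirror the variables listed above, e.g.:
# os.environ["GROQ_API_BASE"] = "https://api.groq.com"

print(os.environ["GROQ_API_KEY"])
```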

## Capabilities

### Chat Model Initialization

Initialize the ChatGroq model with comprehensive configuration options for performance, behavior, and API settings.

```python { .api }
class ChatGroq:
    def __init__(
        self,
        model: str,
        temperature: float = 0.7,
        max_tokens: Optional[int] = None,
        stop: Optional[Union[List[str], str]] = None,
        reasoning_format: Optional[Literal["parsed", "raw", "hidden"]] = None,
        reasoning_effort: Optional[str] = None,
        service_tier: Literal["on_demand", "flex", "auto"] = "on_demand",
        api_key: Optional[str] = None,
        base_url: Optional[str] = None,
        timeout: Union[float, Tuple[float, float], Any, None] = None,
        max_retries: int = 2,
        streaming: bool = False,
        n: int = 1,
        model_kwargs: Optional[Dict[str, Any]] = None,
        default_headers: Union[Mapping[str, str], None] = None,
        default_query: Union[Mapping[str, object], None] = None,
        http_client: Union[Any, None] = None,
        http_async_client: Union[Any, None] = None,
        **kwargs: Any
    ) -> None:
        """
        Initialize ChatGroq model.

        Parameters:
        - model: Name of Groq model (e.g., "llama-3.1-8b-instant").
          Aliased to internal field 'model_name'.
        - temperature: Sampling temperature (0.0 to 1.0)
        - max_tokens: Maximum tokens to generate
        - stop: Stop sequences (string or list of strings).
          Aliased to internal field 'stop_sequences'.
        - reasoning_format: Format for reasoning output ("parsed", "raw", "hidden")
        - reasoning_effort: Level of reasoning effort
        - service_tier: Service tier ("on_demand", "flex", "auto")
        - api_key: Groq API key (defaults to GROQ_API_KEY env var).
          Aliased to internal field 'groq_api_key'.
        - base_url: Custom API base URL.
          Aliased to internal field 'groq_api_base'.
        - timeout: Request timeout in seconds.
          Aliased to internal field 'request_timeout'.
        - max_retries: Maximum retry attempts
        - streaming: Enable streaming responses
        - n: Number of completions to generate
        - model_kwargs: Additional model parameters
        - default_headers: Default HTTP headers
        - default_query: Default query parameters
        - http_client: Custom httpx client for sync requests
        - http_async_client: Custom httpx client for async requests
        """
```

### Synchronous Chat Operations

Generate responses using synchronous methods for immediate results and batch processing.

```python { .api }
def invoke(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> BaseMessage:
    """
    Generate a single response from input messages.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Returns:
    BaseMessage: Generated response message
    """

def batch(
    self,
    inputs: List[LanguageModelInput],
    config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None,
    **kwargs: Any
) -> List[BaseMessage]:
    """
    Process multiple inputs in batch.

    Parameters:
    - inputs: List of message sequences or strings
    - config: Runtime configuration(s)
    - **kwargs: Additional parameters

    Returns:
    List[BaseMessage]: List of generated responses
    """

def stream(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> Iterator[BaseMessageChunk]:
    """
    Stream response tokens as they're generated.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Yields:
    BaseMessageChunk: Individual response chunks
    """

def generate(
    self,
    messages: List[List[BaseMessage]],
    stop: Optional[List[str]] = None,
    callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None,
    **kwargs: Any
) -> LLMResult:
    """
    Legacy generate method returning detailed results.

    Parameters:
    - messages: List of message sequences
    - stop: Stop sequences
    - callbacks: Callback handlers
    - **kwargs: Additional parameters

    Returns:
    LLMResult: Detailed generation results with metadata
    """
```

### Asynchronous Chat Operations

Generate responses using asynchronous methods for concurrent processing and high-throughput applications.

```python { .api }
async def ainvoke(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> BaseMessage:
    """
    Asynchronously generate a single response.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Returns:
    BaseMessage: Generated response message
    """

async def abatch(
    self,
    inputs: List[LanguageModelInput],
    config: Optional[Union[RunnableConfig, List[RunnableConfig]]] = None,
    **kwargs: Any
) -> List[BaseMessage]:
    """
    Asynchronously process multiple inputs in batch.

    Parameters:
    - inputs: List of message sequences or strings
    - config: Runtime configuration(s)
    - **kwargs: Additional parameters

    Returns:
    List[BaseMessage]: List of generated responses
    """

async def astream(
    self,
    input: LanguageModelInput,
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> AsyncIterator[BaseMessageChunk]:
    """
    Asynchronously stream response tokens.

    Parameters:
    - input: Messages (list of BaseMessage) or string
    - config: Runtime configuration
    - **kwargs: Additional parameters

    Yields:
    BaseMessageChunk: Individual response chunks
    """

async def agenerate(
    self,
    messages: List[List[BaseMessage]],
    stop: Optional[List[str]] = None,
    callbacks: Optional[Union[List[BaseCallbackHandler], BaseCallbackManager]] = None,
    **kwargs: Any
) -> LLMResult:
    """
    Asynchronously generate with detailed results.

    Parameters:
    - messages: List of message sequences
    - stop: Stop sequences
    - callbacks: Callback handlers
    - **kwargs: Additional parameters

    Returns:
    LLMResult: Detailed generation results with metadata
    """
```
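The async methods combine naturally with `asyncio.gather` to fan out concurrent requests. The pattern can be sketched with a stub coroutine standing in for `ainvoke`, since a real call would require a GROQ_API_KEY and network access:

```python
import asyncio

# Stub that mimics ChatGroq.ainvoke's awaitable interface; swap in
# llm.ainvoke for real inference.
async def fake_ainvoke(prompt: str) -> str:
    await asyncio.sleep(0.01)  # simulate network latency
    return f"response to: {prompt}"

async def run_concurrently(prompts: list) -> list:
    # gather issues all calls concurrently and preserves input order
    return await asyncio.gather(*(fake_ainvoke(p) for p in prompts))

prompts = ["What is 2+2?", "Name a prime number.", "Capital of France?"]
results = asyncio.run(run_concurrently(prompts))
print(results[0])  # response to: What is 2+2?
```

The same fan-out works with the real model by replacing `fake_ainvoke` with `llm.ainvoke`.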

### Tool Integration

Bind tools and functions to enable function calling capabilities with the Groq model.

```python { .api }
def bind_tools(
    self,
    tools: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    *,
    tool_choice: Optional[Union[Dict, str, Literal["auto", "any", "none"], bool]] = None,
    **kwargs: Any
) -> Runnable[LanguageModelInput, BaseMessage]:
    """
    Bind tools for function calling.

    Parameters:
    - tools: List of tool definitions (Pydantic models, functions, or dicts)
    - tool_choice: Tool selection strategy
      - "auto": Model chooses whether to call tools
      - "any"/"required": Model must call a tool
      - "none": Disable tool calling
      - str: Specific tool name to call
      - bool: True requires a single tool call
      - dict: {"type": "function", "function": {"name": "tool_name"}}
    - **kwargs: Additional binding parameters

    Returns:
    Runnable: Model with bound tools
    """

def bind_functions(
    self,
    functions: Sequence[Union[Dict[str, Any], Type[BaseModel], Callable, BaseTool]],
    function_call: Optional[Union[Dict, str, Literal["auto", "none"]]] = None,
    **kwargs: Any
) -> Runnable[LanguageModelInput, BaseMessage]:
    """
    [DEPRECATED] Bind functions for function calling. Use bind_tools instead.

    This method is deprecated since version 0.2.1 and will be removed in 1.0.0.
    Use bind_tools() for new development.

    Parameters:
    - functions: List of function definitions (dicts, Pydantic models, callables, or tools)
    - function_call: Function call strategy
      - "auto": Model chooses whether to call a function
      - "none": Disable function calling
      - str: Specific function name to call
      - dict: {"name": "function_name"}
    - **kwargs: Additional binding parameters

    Returns:
    Runnable: Model with bound functions
    """
```
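Besides Pydantic models and callables, `bind_tools` accepts raw OpenAI-compatible schema dicts. An illustrative schema (the tool name and fields are invented for the example):

```python
# Dict equivalent of the Pydantic-model form; the name "get_weather"
# and its parameters are illustrative only.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get weather information for a location.",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g. 'San Francisco, CA'",
                },
            },
            "required": ["location"],
        },
    },
}

# Passed the same way as a Pydantic tool:
# llm_with_tools = llm.bind_tools([weather_tool], tool_choice="auto")
print(weather_tool["function"]["name"])  # get_weather
```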

### Structured Output

Generate responses conforming to specific schemas using function calling or JSON mode.

```python { .api }
def with_structured_output(
    self,
    schema: Optional[Union[Dict, Type[BaseModel]]] = None,
    *,
    method: Literal["function_calling", "json_mode"] = "function_calling",
    include_raw: bool = False,
    **kwargs: Any
) -> Runnable[LanguageModelInput, Union[Dict, BaseModel]]:
    """
    Create model that outputs structured data.

    Parameters:
    - schema: Output schema (Pydantic model, TypedDict, or OpenAI function schema)
    - method: Generation method
      - "function_calling": Use function calling API
      - "json_mode": Use JSON mode (requires schema instructions in prompt)
    - include_raw: Include raw response alongside parsed output
    - **kwargs: Additional parameters

    Returns:
    Runnable: Model that returns structured output

    If include_raw=False:
    - Returns: Instance of schema type (if Pydantic) or dict
    If include_raw=True:
    - Returns: Dict with keys 'raw', 'parsed', 'parsing_error'
    """
```
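The `include_raw=True` shape lends itself to a parse-or-fallback pattern. A sketch with placeholder values (in practice `raw` is an `AIMessage` and `parsed` is an instance of your schema, not hand-written dicts):

```python
# Illustrative stand-in for a with_structured_output(..., include_raw=True)
# result; real values come from the model.
result = {
    "raw": "<AIMessage containing the tool call>",
    "parsed": {"name": "John Smith", "age": 35},
    "parsing_error": None,
}

if result["parsing_error"] is None:
    print(result["parsed"]["name"])  # John Smith
else:
    # Fall back to the unparsed message when schema parsing failed
    print(result["raw"])
```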

### Model Properties

Access model configuration and type information.

```python { .api }
@property
def _llm_type(self) -> str:
    """
    Return model type identifier for LangChain integration.

    Returns:
    str: Always returns "groq-chat"
    """

@property
def lc_secrets(self) -> Dict[str, str]:
    """
    Return secret field mappings for serialization.

    Returns:
    Dict[str, str]: Mapping of secret fields to environment variables
    {"groq_api_key": "GROQ_API_KEY"}
    """

@classmethod
def is_lc_serializable(cls) -> bool:
    """
    Check if model supports LangChain serialization.

    Returns:
    bool: Always returns True
    """
```

## Usage Examples

### Tool Calling Example

```python
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field

class WeatherTool(BaseModel):
    """Get weather information for a location."""
    location: str = Field(description="City and state, e.g. 'San Francisco, CA'")

llm = ChatGroq(model="llama-3.1-8b-instant")
llm_with_tools = llm.bind_tools([WeatherTool], tool_choice="auto")

response = llm_with_tools.invoke("What's the weather in New York?")
print(response.tool_calls)
```

### Structured Output Example

```python
from langchain_groq import ChatGroq
from pydantic import BaseModel, Field
from typing import Optional

class PersonInfo(BaseModel):
    """Extract person information from text."""
    name: str = Field(description="Person's full name")
    age: Optional[int] = Field(default=None, description="Person's age if mentioned")
    occupation: Optional[str] = Field(default=None, description="Person's job or profession")

llm = ChatGroq(model="llama-3.1-8b-instant")
structured_llm = llm.with_structured_output(PersonInfo)

result = structured_llm.invoke("John Smith is a 35-year-old software engineer.")
print(f"Name: {result.name}, Age: {result.age}, Job: {result.occupation}")
```

### Reasoning Model Example

```python
from langchain_groq import ChatGroq
from langchain_core.messages import HumanMessage, SystemMessage

# Use a reasoning-capable model with parsed reasoning format
llm = ChatGroq(
    model="deepseek-r1-distill-llama-70b",
    reasoning_format="parsed"
)

messages = [
    SystemMessage(content="You are a math tutor. Show your reasoning."),
    HumanMessage(content="If a train travels 120 miles in 2 hours, what's its average speed?")
]

response = llm.invoke(messages)
print("Answer:", response.content)
print("Reasoning:", response.additional_kwargs.get("reasoning_content", "No reasoning available"))
```

### Streaming with Token Usage

```python
from langchain_groq import ChatGroq

llm = ChatGroq(model="llama-3.1-8b-instant")
messages = [{"role": "user", "content": "Write a short poem about coding."}]

full_response = None
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
    if full_response is None:
        full_response = chunk
    else:
        full_response += chunk

print("\n\nToken usage:", full_response.usage_metadata)
print("Response metadata:", full_response.response_metadata)
```

## Response Metadata

ChatGroq responses include comprehensive metadata for monitoring and optimization:

```python { .api }
# Response metadata structure
{
    "token_usage": {
        "completion_tokens": int,       # Output tokens used
        "prompt_tokens": int,           # Input tokens used
        "total_tokens": int,            # Total tokens used
        "completion_time": float,       # Time for completion
        "prompt_time": float,           # Time for prompt processing
        "queue_time": Optional[float],  # Time spent in queue
        "total_time": float             # Total processing time
    },
    "model_name": str,                  # Model used for generation
    "system_fingerprint": str,          # System configuration fingerprint
    "finish_reason": str,               # Completion reason ("stop", "length", etc.)
    "service_tier": str,                # Service tier used
    "reasoning_effort": Optional[str]   # Reasoning effort level (if applicable)
}
```
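The timing fields allow simple derived metrics such as output throughput. A small helper, shown here with invented usage numbers shaped like the structure above:

```python
def tokens_per_second(token_usage: dict) -> float:
    """Output throughput derived from Groq's token_usage metadata."""
    return token_usage["completion_tokens"] / token_usage["completion_time"]

# Invented numbers for illustration; real values come from
# response.response_metadata["token_usage"]
usage = {
    "completion_tokens": 600,
    "prompt_tokens": 30,
    "total_tokens": 630,
    "completion_time": 0.5,
    "prompt_time": 0.004,
    "queue_time": None,
    "total_time": 0.504,
}
print(tokens_per_second(usage))  # 1200.0
```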

## Error Handling

The package handles various error conditions and provides clear error messages:

```python
from langchain_groq import ChatGroq
from groq import BadRequestError

try:
    llm = ChatGroq(model="invalid-model")
    response = llm.invoke("Hello")
except BadRequestError as e:
    print(f"API Error: {e}")
except ValueError as e:
    print(f"Configuration Error: {e}")
```

Common validation errors:

- `n` must be >= 1
- `n` must be 1 when streaming is enabled
- Missing API key when the GROQ_API_KEY environment variable is not set
- Invalid model name or unavailable model
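Transient failures are retried internally according to `max_retries`; the behavior can be pictured with a small retry-with-backoff sketch (the flaky function simply simulates intermittent `ConnectionError`s):

```python
import time

def with_retries(fn, max_retries: int = 2, base_delay: float = 0.05):
    """Call fn, retrying with exponential backoff on ConnectionError."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of retries; surface the error
            time.sleep(base_delay * (2 ** attempt))

calls = {"n": 0}

def flaky():
    # Fails twice, then succeeds, simulating a transient outage
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("transient failure")
    return "ok"

result = with_retries(flaky, max_retries=3)
print(result)  # ok
```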

## Types

```python { .api }
# Core types used throughout the API
from typing import Any, Callable, Dict, List, Literal, Optional, Sequence, Tuple, Union
from typing_extensions import TypedDict
from langchain_core.messages import BaseMessage, BaseMessageChunk
from langchain_core.outputs import ChatResult, LLMResult
from langchain_core.language_models import LanguageModelInput
from langchain_core.runnables import Runnable, RunnableConfig
from langchain_core.callbacks import BaseCallbackHandler, BaseCallbackManager
from langchain_core.tools import BaseTool
from pydantic import BaseModel, SecretStr
from collections.abc import AsyncIterator, Iterator, Mapping

# Message types for input
LanguageModelInput = Union[
    str,                 # Simple string input
    List[BaseMessage],   # List of messages
    # ... other LangChain input types
]

# Service tier options
ServiceTier = Literal["on_demand", "flex", "auto"]

# Reasoning format options
ReasoningFormat = Literal["parsed", "raw", "hidden"]

# Tool choice options
ToolChoice = Union[
    Dict,   # {"type": "function", "function": {"name": "tool_name"}}
    str,    # Tool name or "auto"/"any"/"none"
    Literal["auto", "any", "none"],
    bool    # True for single tool requirement
]
```