# LLM Models

Simple text generation interface providing direct access to Google's Gemini models for completion-style tasks. This interface extends LangChain's `BaseLLM` and is designed for straightforward text generation without the complexity of conversational context management.

## Capabilities

### GoogleGenerativeAI

Text generation LLM that wraps Google's Gemini models in a simple completion interface.

```python { .api }
class GoogleGenerativeAI:
    def __init__(
        self,
        *,
        model: str,
        google_api_key: Optional[SecretStr] = None,
        credentials: Any = None,
        temperature: float = 0.7,
        top_p: Optional[float] = None,
        top_k: Optional[int] = None,
        max_output_tokens: Optional[int] = None,
        n: int = 1,
        max_retries: int = 6,
        timeout: Optional[float] = None,
        client_options: Optional[Dict] = None,
        transport: Optional[str] = None,
        additional_headers: Optional[Dict[str, str]] = None,
        response_modalities: Optional[List[Modality]] = None,
        thinking_budget: Optional[int] = None,
        include_thoughts: Optional[bool] = None,
        safety_settings: Optional[Dict[HarmCategory, HarmBlockThreshold]] = None
    )
```

**Parameters:**
- `model` (str): Model name (e.g., "gemini-2.5-pro", "gemini-2.0-flash")
- `google_api_key` (Optional[SecretStr]): Google API key (defaults to the `GOOGLE_API_KEY` environment variable)
- `credentials` (Any): Google authentication credentials object
- `temperature` (float): Generation temperature in [0.0, 2.0]; controls randomness
- `top_p` (Optional[float]): Nucleus sampling parameter in [0.0, 1.0]
- `top_k` (Optional[int]): Top-k sampling parameter for vocabulary selection
- `max_output_tokens` (Optional[int]): Maximum number of tokens in the response
- `n` (int): Number of completions to generate (default: 1)
- `max_retries` (int): Maximum retry attempts for failed requests (default: 6)
- `timeout` (Optional[float]): Request timeout in seconds
- `client_options` (Optional[Dict]): API client configuration options
- `transport` (Optional[str]): Transport method ("rest", "grpc", or "grpc_asyncio")
- `additional_headers` (Optional[Dict[str, str]]): Additional HTTP headers
- `response_modalities` (Optional[List[Modality]]): Response output modalities
- `thinking_budget` (Optional[int]): Thinking budget in tokens for reasoning
- `include_thoughts` (Optional[bool]): Include reasoning thoughts in the response
- `safety_settings` (Optional[Dict[HarmCategory, HarmBlockThreshold]]): Content safety configuration
### Core Methods

#### Text Generation

```python { .api }
def invoke(
    self,
    input: Union[str, List[BaseMessage]],
    config: Optional[RunnableConfig] = None,
    *,
    stop: Optional[List[str]] = None,
    **kwargs: Any
) -> str
```

Generate a text completion for the given input.

**Parameters:**
- `input`: Input text prompt or list of messages
- `config`: Optional run configuration
- `stop`: List of stop sequences that end generation
- `**kwargs`: Additional generation parameters

**Returns:** Generated text as a string

```python { .api }
async def ainvoke(
    self,
    input: Union[str, List[BaseMessage]],
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> str
```

Async version of `invoke()`.

#### Streaming

```python { .api }
def stream(
    self,
    input: Union[str, List[BaseMessage]],
    config: Optional[RunnableConfig] = None,
    *,
    stop: Optional[List[str]] = None,
    **kwargs: Any
) -> Iterator[str]
```

Stream the generated text as chunks.

**Parameters:**
- `input`: Input text prompt or list of messages
- `config`: Optional run configuration
- `stop`: List of stop sequences
- `**kwargs`: Additional generation parameters

**Returns:** Iterator of text chunks

```python { .api }
async def astream(
    self,
    input: Union[str, List[BaseMessage]],
    config: Optional[RunnableConfig] = None,
    **kwargs: Any
) -> AsyncIterator[str]
```

Async version of `stream()`.
### Utility Methods

```python { .api }
def get_num_tokens(self, text: str) -> int
```

Estimate the token count for input text.

**Parameters:**
- `text` (str): Input text to count tokens for

**Returns:** Estimated token count

## Usage Examples

### Basic Text Generation

```python
from langchain_google_genai import GoogleGenerativeAI

# Initialize the LLM
llm = GoogleGenerativeAI(model="gemini-2.5-pro")

# Generate a text completion
result = llm.invoke("Once upon a time in a land of artificial intelligence")
print(result)
```

### Streaming Generation

```python
# Stream text as it's generated
for chunk in llm.stream("Write a short story about robots learning to paint"):
    print(chunk, end="", flush=True)
print()  # New line after streaming
```

### Temperature Control

```python
# Creative writing with a higher temperature
creative_llm = GoogleGenerativeAI(
    model="gemini-2.5-pro",
    temperature=1.2  # More creative/random
)

creative_text = creative_llm.invoke("Describe a futuristic city")

# Factual content with a lower temperature
factual_llm = GoogleGenerativeAI(
    model="gemini-2.5-pro",
    temperature=0.1  # More focused/deterministic
)

factual_text = factual_llm.invoke("Explain photosynthesis")
```

### Token Limits and Sampling

```python
# Configure generation parameters
llm = GoogleGenerativeAI(
    model="gemini-2.5-pro",
    max_output_tokens=500,  # Limit response length
    top_p=0.8,              # Nucleus sampling
    top_k=40                # Top-k sampling
)

result = llm.invoke("Write a summary of machine learning")
```

### Safety Settings

```python
from langchain_google_genai import HarmCategory, HarmBlockThreshold

# Configure content safety
safe_llm = GoogleGenerativeAI(
    model="gemini-2.5-pro",
    safety_settings={
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
    }
)

result = safe_llm.invoke("Generate helpful and safe content")
```

### Async Usage

```python
import asyncio

async def generate_multiple():
    llm = GoogleGenerativeAI(model="gemini-2.5-pro")

    # Generate multiple completions concurrently
    tasks = [
        llm.ainvoke("Write about space exploration"),
        llm.ainvoke("Write about ocean conservation"),
        llm.ainvoke("Write about renewable energy")
    ]

    results = await asyncio.gather(*tasks)

    for i, result in enumerate(results, 1):
        print(f"Result {i}: {result[:100]}...")

# Run the async example
asyncio.run(generate_multiple())
```
### Stop Sequences

```python
# Use stop sequences to control generation
llm = GoogleGenerativeAI(model="gemini-2.5-pro")

result = llm.invoke(
    "List the planets in our solar system:\n1.",
    stop=["\n\n", "10."]  # Stop at a double newline or at item 10
)
print(result)
```

### Custom Client Configuration

```python
# Configure API client options
llm = GoogleGenerativeAI(
    model="gemini-2.5-pro",
    client_options={
        "api_endpoint": "https://generativelanguage.googleapis.com"
    },
    transport="rest",  # Use REST instead of gRPC
    additional_headers={
        "User-Agent": "MyApp/1.0"
    },
    timeout=30.0  # 30-second timeout
)

result = llm.invoke("Generate content with custom configuration")
```

### Integration with LangChain

```python
from langchain_core.prompts import PromptTemplate
from langchain_core.output_parsers import StrOutputParser

# Create a simple chain
llm = GoogleGenerativeAI(model="gemini-2.5-pro")

prompt = PromptTemplate.from_template(
    "Write a {style} poem about {topic}"
)

output_parser = StrOutputParser()

# Build the chain
chain = prompt | llm | output_parser

# Use the chain
result = chain.invoke({
    "style": "haiku",
    "topic": "artificial intelligence"
})
print(result)
```

### Token Counting

```python
# Estimate tokens before generation
llm = GoogleGenerativeAI(model="gemini-2.5-pro")

prompt = "Explain quantum computing in detail"
token_count = llm.get_num_tokens(prompt)

print(f"Input tokens: {token_count}")

# Generate with awareness of token usage
if token_count < 1000:  # Stay within limits
    result = llm.invoke(prompt)
    print(f"Generated: {result[:100]}...")
else:
    print("Prompt too long, consider shortening")
```

## Error Handling

Handle errors appropriately for LLM operations:

```python
from langchain_google_genai import GoogleGenerativeAI

try:
    llm = GoogleGenerativeAI(model="gemini-2.5-pro")
    result = llm.invoke("Your prompt here")
    print(result)
except Exception as e:
    # Crude routing on the error message; inspect the exception
    # type instead in production code.
    if "safety" in str(e).lower():
        print(f"Content blocked by safety filters: {e}")
    elif "model" in str(e).lower():
        print(f"Model error: {e}")
    else:
        print(f"Generation error: {e}")
```

## Model Recommendations

- **gemini-2.5-pro**: Best for complex reasoning, creative writing, and detailed analysis
- **gemini-2.0-flash**: Faster inference for simpler tasks and real-time applications
- **gemini-pro**: Earlier general-purpose model for balanced performance and cost