# Legacy Completions

Legacy text completion API for prompt-continuation interactions. Supports text generation with parameters including temperature, top-p sampling, frequency penalties, and custom stop sequences. This API follows the traditional completion format, where the model continues from a given prompt.

## Capabilities

### Text Completion Creation

Creates text completions using the traditional prompt-based format with extensive configuration options for controlling generation behavior.

```python { .api }
def create(
    self,
    *,
    model: str,
    best_of: Optional[int] = NOT_GIVEN,
    echo: Optional[bool] = NOT_GIVEN,
    frequency_penalty: Optional[float] = NOT_GIVEN,
    logit_bias: Optional[Dict[str, int]] = NOT_GIVEN,
    logprobs: Optional[int] = NOT_GIVEN,
    max_tokens: Optional[int] = NOT_GIVEN,
    n: Optional[int] = NOT_GIVEN,
    presence_penalty: Optional[float] = NOT_GIVEN,
    prompt: Union[str, List[str], List[int], List[List[int]], None] = NOT_GIVEN,
    seed: Optional[int] = NOT_GIVEN,
    stop: Union[Optional[str], List[str], None] = NOT_GIVEN,
    stream: Optional[Literal[False]] | NotGiven = NOT_GIVEN,
    stream_options: Optional[completion_create_params.StreamOptions] | NotGiven = NOT_GIVEN,
    suffix: Optional[str] = NOT_GIVEN,
    temperature: Optional[float] = NOT_GIVEN,
    top_p: Optional[float] = NOT_GIVEN,
    user: str | NotGiven = NOT_GIVEN,
    grammar_root: Optional[str] = NOT_GIVEN,
    return_raw_tokens: Optional[bool] = NOT_GIVEN,
    **kwargs
) -> Completion:
    """
    Create a text completion.

    Parameters:
    - model: ID of the model to use (e.g., "llama3.1-70b")
    - best_of: Generate N completions server-side and return the best one
    - echo: Echo back the prompt in addition to the completion
    - frequency_penalty: Penalty for frequent token usage (-2.0 to 2.0)
    - logit_bias: Modify likelihood of specific tokens appearing
    - logprobs: Include log probabilities on most likely tokens (0-5)
    - max_tokens: Maximum number of tokens to generate
    - n: Number of completion choices to generate
    - presence_penalty: Penalty for token presence (-2.0 to 2.0)
    - prompt: Text prompt(s) to complete (string, list of strings, or token arrays)
    - seed: Random seed for deterministic generation
    - stop: Sequences where generation should stop
    - stream: Enable streaming response (use stream=True for streaming)
    - stream_options: Additional streaming options
    - suffix: Text that comes after the completion (for insertion tasks)
    - temperature: Sampling temperature (0.0 to 2.0)
    - top_p: Nucleus sampling parameter (0.0 to 1.0)
    - user: Unique identifier for the end-user
    - grammar_root: Grammar rule for structured output generation
    - return_raw_tokens: Return raw tokens instead of decoded text

    Returns:
    Completion object with generated text
    """
```

### Streaming Text Completion

Creates streaming text completions for real-time token generation.

```python { .api }
def create(
    self,
    *,
    model: str,
    prompt: Union[str, List[str], List[int], List[List[int]], None],
    stream: Literal[True],
    **kwargs
) -> Stream[CompletionChunk]:
    """
    Create a streaming text completion.

    Parameters:
    - stream: Must be True for streaming responses
    - All other parameters are the same as the non-streaming create()

    Returns:
    Stream object yielding CompletionChunk objects
    """
```

### Resource Classes

Synchronous and asynchronous resource classes that provide the completions API methods.

```python { .api }
class CompletionsResource(SyncAPIResource):
    """Synchronous completions resource."""

    @cached_property
    def with_raw_response(self) -> CompletionsResourceWithRawResponse: ...

    @cached_property
    def with_streaming_response(self) -> CompletionsResourceWithStreamingResponse: ...

class AsyncCompletionsResource(AsyncAPIResource):
    """Asynchronous completions resource."""

    @cached_property
    def with_raw_response(self) -> AsyncCompletionsResourceWithRawResponse: ...

    @cached_property
    def with_streaming_response(self) -> AsyncCompletionsResourceWithStreamingResponse: ...
```
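
The `with_raw_response` wrapper gives access to the HTTP layer in addition to the parsed result. A minimal sketch, assuming the SDK follows the usual generated-client pattern where the wrapped response exposes `headers` and a `parse()` method (verify against your installed SDK version):

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

# Call through with_raw_response to inspect the HTTP response;
# .headers and .parse() follow the convention of similar generated
# SDKs and are an assumption here, not confirmed by this document.
raw = client.completions.with_raw_response.create(
    model="llama3.1-70b",
    prompt="Hello,",
    max_tokens=10,
)
print(raw.headers.get("content-type"))

completion = raw.parse()  # recover the parsed Completion object
print(completion.choices[0].text)
```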

## Parameter Types

### Completion Parameters

```python { .api }
class CompletionCreateParams(TypedDict, total=False):
    """Parameters for creating text completions."""
    model: Required[str]

    best_of: Optional[int]
    echo: Optional[bool]
    frequency_penalty: Optional[float]
    logit_bias: Optional[Dict[str, int]]
    logprobs: Optional[int]
    max_tokens: Optional[int]
    n: Optional[int]
    presence_penalty: Optional[float]
    prompt: Union[str, List[str], List[int], List[List[int]], None]
    seed: Optional[int]
    stop: Union[Optional[str], List[str], None]
    stream: Optional[bool]
    stream_options: Optional[StreamOptions]
    suffix: Optional[str]
    temperature: Optional[float]
    top_p: Optional[float]
    user: Optional[str]

class StreamOptions(TypedDict, total=False):
    """Options for streaming completions."""
    include_usage: Optional[bool]
```
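
Setting `include_usage` asks the server to report token accounting on a streamed response. A short sketch, assuming (as in comparable streaming APIs) that `usage` is populated only on the final chunk, which may carry no choices:

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

stream = client.completions.create(
    model="llama3.1-70b",
    prompt="Streaming with usage:",
    max_tokens=50,
    stream=True,
    stream_options={"include_usage": True},
)

for chunk in stream:
    # Guard against a final usage-only chunk with no choices
    if chunk.choices:
        print(chunk.choices[0].text, end="", flush=True)
    # usage is assumed to arrive only on the last chunk
    if chunk.usage is not None:
        print(f"\nTotal tokens: {chunk.usage.total_tokens}")
```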

## Response Types

### Completion Response

```python { .api }
class Completion(BaseModel):
    """Complete text completion response."""
    id: str
    choices: List[CompletionChoice]
    created: int
    model: str
    object: Literal["text_completion"]
    system_fingerprint: Optional[str]
    usage: Optional[CompletionUsage]

class CompletionChoice(BaseModel):
    """Individual completion choice."""
    finish_reason: Optional[Literal["stop", "length", "content_filter"]]
    index: int
    logprobs: Optional[CompletionLogprobs]
    text: str

class CompletionUsage(BaseModel):
    """Token usage information."""
    completion_tokens: int
    prompt_tokens: int
    total_tokens: int

class CompletionLogprobs(BaseModel):
    """Log probability information."""
    text_offset: List[int]
    token_logprobs: List[Optional[float]]
    tokens: List[str]
    top_logprobs: Optional[List[Dict[str, float]]]
```
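
The `finish_reason` field distinguishes a natural stop from truncation: `"stop"` indicates the model hit a stop sequence or finished on its own, while `"length"` means generation exhausted `max_tokens`. A small example of detecting truncated output:

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="Summarize the water cycle:",
    max_tokens=20,  # deliberately small to provoke truncation
)

choice = response.choices[0]
if choice.finish_reason == "length":
    # Output was cut off; retry with a larger budget if completeness matters
    print("Truncated:", choice.text)
else:
    print("Complete:", choice.text)
```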

### Streaming Response Types

```python { .api }
class CompletionChunk(BaseModel):
    """Streaming chunk in text completion."""
    id: str
    choices: List[CompletionChunkChoice]
    created: int
    model: str
    object: Literal["text_completion"]
    system_fingerprint: Optional[str]
    usage: Optional[CompletionUsage]

class CompletionChunkChoice(BaseModel):
    """Choice in streaming chunk."""
    finish_reason: Optional[Literal["stop", "length", "content_filter"]]
    index: int
    logprobs: Optional[CompletionLogprobs]
    text: str
```

## Usage Examples

### Basic Text Completion

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="The future of artificial intelligence is",
    max_tokens=100,
    temperature=0.7,
    stop=["\n", "."]
)

print(response.choices[0].text)
print(f"Used {response.usage.total_tokens} tokens")
```

### Text Completion with Multiple Choices

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="Complete this sentence: The most important skill in programming is",
    max_tokens=50,
    n=3,  # Generate 3 different completions
    temperature=0.8
)

for i, choice in enumerate(response.choices):
    print(f"Option {i+1}: {choice.text.strip()}")
```

### Streaming Text Completion

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

stream = client.completions.create(
    model="llama3.1-70b",
    prompt="Write a short poem about machine learning:",
    max_tokens=200,
    stream=True,
    temperature=0.8
)

print("Poem:", end="")
for chunk in stream:
    if chunk.choices[0].text:
        print(chunk.choices[0].text, end="", flush=True)
print()
```

### Text Completion with Log Probabilities

```python
import math

from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="The capital of France is",
    max_tokens=10,
    logprobs=5,  # Return top 5 log probabilities
    temperature=0.1
)

choice = response.choices[0]
print(f"Generated text: {choice.text}")

if choice.logprobs:
    print("\nToken probabilities:")
    for token, logprob in zip(choice.logprobs.tokens, choice.logprobs.token_logprobs):
        if logprob is not None:
            # Convert a log probability to a percentage: p = e^logprob
            probability = round(100 * math.exp(logprob), 2)
            print(f"  '{token}': {probability}%")
```

### Text Insertion (with Suffix)

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

# Fill in the function body between the prompt and the suffix
response = client.completions.create(
    model="llama3.1-70b",
    prompt="def fibonacci(n):\n    ",
    suffix="\n    return result",
    max_tokens=100,
    temperature=0.3
)

print("Generated code:")
print(response.choices[0].text)
```

### Best-of Sampling

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="Explain quantum computing in simple terms:",
    max_tokens=150,
    best_of=5,  # Generate 5 completions server-side, return the best one
    n=1,        # Return only the best completion
    temperature=0.8
)

print("Best completion:")
print(response.choices[0].text)
```

### Async Text Completion

```python
import asyncio
from cerebras.cloud.sdk import AsyncCerebras

async def complete_text():
    client = AsyncCerebras()

    response = await client.completions.create(
        model="llama3.1-70b",
        prompt="The benefits of renewable energy include",
        max_tokens=100,
        temperature=0.6
    )

    print(response.choices[0].text)
    await client.close()  # release the underlying HTTP connection

asyncio.run(complete_text())
```

### Batch Completions

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

prompts = [
    "The advantages of solar power are",
    "Wind energy is beneficial because",
    "Hydroelectric power works by"
]

response = client.completions.create(
    model="llama3.1-70b",
    prompt=prompts,  # Multiple prompts in one request
    max_tokens=50,
    temperature=0.5
)

for i, choice in enumerate(response.choices):
    print(f"Prompt {i+1} completion: {choice.text.strip()}")
```

### Frequency and Presence Penalties

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="List the planets in our solar system:",
    max_tokens=100,
    frequency_penalty=0.5,  # Reduce repetition
    presence_penalty=0.3,   # Encourage new topics
    temperature=0.7
)

print(response.choices[0].text)
```

### Stop Sequences

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

response = client.completions.create(
    model="llama3.1-70b",
    prompt="Q: What is photosynthesis?\nA:",
    max_tokens=200,
    stop=["Q:", "\n\n"],  # Stop at next question or double newline
    temperature=0.5
)

print(f"Answer: {response.choices[0].text.strip()}")
```
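
### Deterministic Generation with Seed

A sketch of seed-based reproducibility. Determinism is typically best-effort: the same `seed` with identical parameters should reproduce the same text only while the serving configuration is unchanged, which is what comparing `system_fingerprint` values lets you check:

```python
from cerebras.cloud.sdk import Cerebras

client = Cerebras()

# Two requests with the same seed and parameters should yield the
# same text, assuming the backend configuration has not changed.
params = dict(
    model="llama3.1-70b",
    prompt="A haiku about the ocean:",
    max_tokens=30,
    temperature=0.7,
    seed=1234,
)

first = client.completions.create(**params)
second = client.completions.create(**params)

print(first.choices[0].text)
print("Reproduced:", first.choices[0].text == second.choices[0].text)
print("Same fingerprint:", first.system_fingerprint == second.system_fingerprint)
```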