# Command Line Interface

Interactive command-line interface for model testing and development. The CLI provides a configurable chat interface with extensive parameter control, debugging features, and direct access to model capabilities for development and experimentation.

## Capabilities

### Basic CLI Usage

Launch the interactive chat interface with a model file:

```bash
pyllamacpp /path/to/model.ggml
```

This starts an interactive session where you can chat with the model:

```
██████╗ ██╗ ██╗██╗ ██╗ █████╗ ███╗ ███╗ █████╗ ██████╗██████╗ ██████╗
██╔══██╗╚██╗ ██╔╝██║ ██║ ██╔══██╗████╗ ████║██╔══██╗██╔════╝██╔══██╗██╔══██╗
██████╔╝ ╚████╔╝ ██║ ██║ ███████║██╔████╔██║███████║██║ ██████╔╝██████╔╝
██╔═══╝ ╚██╔╝ ██║ ██║ ██╔══██║██║╚██╔╝██║██╔══██║██║ ██╔═══╝ ██╔═══╝
██║ ██║ ███████╗███████╗██║ ██║██║ ╚═╝ ██║██║ ██║╚██████╗██║ ██║
╚═╝ ╚═╝ ╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝

PyLLaMACpp
A simple Command Line Interface to test the package
Version: 2.4.3

You: Hello, how are you?
AI: I'm doing well, thank you for asking! How can I help you today?

You:
```

### Command Line Arguments

The CLI supports extensive parameter customization:

```bash
pyllamacpp --help

usage: pyllamacpp [-h] [--n_ctx N_CTX] [--seed SEED] [--f16_kv F16_KV]
                  [--logits_all LOGITS_ALL] [--vocab_only VOCAB_ONLY]
                  [--use_mlock USE_MLOCK] [--embedding EMBEDDING]
                  [--n_predict N_PREDICT] [--n_threads N_THREADS]
                  [--repeat_last_n REPEAT_LAST_N] [--top_k TOP_K]
                  [--top_p TOP_P] [--temp TEMP] [--repeat_penalty REPEAT_PENALTY]
                  [--n_batch N_BATCH]
                  model

positional arguments:
  model                 The path of the model file

options:
  -h, --help            show this help message and exit

# Context Parameters
  --n_ctx N_CTX         text context (default: 512)
  --seed SEED           RNG seed (default: -1 for random)
  --f16_kv F16_KV       use fp16 for KV cache (default: False)
  --logits_all LOGITS_ALL
                        compute all logits, not just the last one (default: False)
  --vocab_only VOCAB_ONLY
                        only load vocabulary, no weights (default: False)
  --use_mlock USE_MLOCK
                        force system to keep model in RAM (default: False)
  --embedding EMBEDDING
                        embedding mode only (default: False)

# Generation Parameters
  --n_predict N_PREDICT
                        Number of tokens to predict (default: 256)
  --n_threads N_THREADS
                        Number of threads (default: 4)
  --repeat_last_n REPEAT_LAST_N
                        Last n tokens to penalize (default: 64)
  --top_k TOP_K         top_k sampling (default: 40)
  --top_p TOP_P         top_p sampling (default: 0.95)
  --temp TEMP           temperature (default: 0.8)
  --repeat_penalty REPEAT_PENALTY
                        repeat_penalty (default: 1.1)
  --n_batch N_BATCH     batch size for prompt processing (default: 512)
```

### CLI Parameter Examples

Configure the model for different use cases:

```bash
# High creativity configuration
pyllamacpp /path/to/model.ggml \
  --temp 1.2 \
  --top_p 0.9 \
  --top_k 50 \
  --n_predict 200

# Focused, deterministic responses
pyllamacpp /path/to/model.ggml \
  --temp 0.1 \
  --top_p 0.9 \
  --top_k 20 \
  --repeat_penalty 1.15

# Large context configuration
pyllamacpp /path/to/model.ggml \
  --n_ctx 2048 \
  --n_batch 1024 \
  --n_threads 8

# GPU acceleration (build-dependent; --n_gpu_layers is not listed in --help above)
pyllamacpp /path/to/model.ggml \
  --n_gpu_layers 32 \
  --f16_kv True

# Memory-optimized configuration
pyllamacpp /path/to/model.ggml \
  --use_mlock True \
  --n_batch 256
```
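Presets like these can also be kept as data and expanded into command lines, which is convenient for scripting. The preset names and helper below are illustrative, not part of the package:

```python
import shlex

# Illustrative presets mirroring the flag combinations above
PRESETS = {
    "creative": {"temp": 1.2, "top_p": 0.9, "top_k": 50, "n_predict": 200},
    "focused": {"temp": 0.1, "top_p": 0.9, "top_k": 20, "repeat_penalty": 1.15},
}

def build_command(model_path, preset):
    """Expand a named preset into a pyllamacpp command line string."""
    parts = ["pyllamacpp", model_path]
    for flag, value in PRESETS[preset].items():
        parts += [f"--{flag}", str(value)]
    return shlex.join(parts)

cmd = build_command("/path/to/model.ggml", "focused")
```

`shlex.join` (Python 3.8+) quotes any argument that needs it, so the resulting string is safe to paste into a shell.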

### Interactive Features

The CLI provides several interactive features:

1. **Multi-line Input**: Press Enter twice to send multi-line messages
2. **Exit Commands**: Type 'exit', 'quit', or press Ctrl+C to quit
3. **Context Persistence**: Conversation context is maintained across exchanges
4. **Real-time Generation**: See tokens generated in real time
5. **Color Output**: Colored output for better readability
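The multi-line input behavior (item 1) amounts to a small read loop that stops at an empty line. The helper below is an illustrative sketch, not the package's actual implementation:

```python
def read_multiline(input_fn=input):
    """Collect lines until an empty line (a second Enter) ends the message."""
    lines = []
    while True:
        line = input_fn()
        if line == "":  # second Enter: message is complete
            break
        lines.append(line)
    return "\n".join(lines)
```

Passing `input_fn` explicitly makes the loop testable with a canned sequence instead of stdin.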

### Instruction-Following Mode

The CLI includes built-in instruction-following templates:

```python { .api }
# Default prompt templates in CLI
PROMPT_CONTEXT = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
PROMPT_PREFIX = "\n\n##Instruction:\n"
PROMPT_SUFFIX = "\n\n##Response:\n"
```

Example interaction with the instruction format:

```
You: Explain how photosynthesis works

AI: ##Response:
Photosynthesis is the process by which plants convert light energy into chemical energy...
```
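Under these templates, each user message is wrapped before it reaches the model. A minimal sketch of that composition (the exact order in which the CLI assembles the final prompt is an assumption here):

```python
# Templates as documented above
PROMPT_CONTEXT = ("Below is an instruction that describes a task. "
                  "Write a response that appropriately completes the request.")
PROMPT_PREFIX = "\n\n##Instruction:\n"
PROMPT_SUFFIX = "\n\n##Response:\n"

def build_prompt(instruction):
    # Assumed order: context, prefix, user text, suffix
    return PROMPT_CONTEXT + PROMPT_PREFIX + instruction + PROMPT_SUFFIX

prompt = build_prompt("Explain how photosynthesis works")
```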

### Performance Monitoring

The CLI includes performance monitoring capabilities:

```
# Example CLI session with timing info
You: Tell me about machine learning
AI: Machine learning is a subset of artificial intelligence... (Generated in 2.3s, 45 tokens/s)

# System information display
Model: /path/to/llama-7b.ggml
Context size: 512 tokens
Threads: 4
Memory usage: 4.2 GB
```
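A tokens-per-second figure like the one above can be measured around any generation loop. The wrapper below is illustrative and uses a stub in place of `model.generate`:

```python
import time

def timed_generate(generate_fn, prompt):
    """Collect tokens from a generator and report elapsed time and tokens/s."""
    start = time.perf_counter()
    tokens = list(generate_fn(prompt))
    elapsed = time.perf_counter() - start
    rate = len(tokens) / elapsed if elapsed > 0 else float("inf")
    return "".join(tokens), elapsed, rate

# Stub generator standing in for model.generate
text, elapsed, rate = timed_generate(lambda p: iter(["Hello", ", ", "world"]), "Hi")
print(f"Generated in {elapsed:.2f}s, {rate:.0f} tokens/s")
```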

### Configuration Schema

The CLI uses structured parameter schemas for validation:

```python { .api }
# Context parameters schema
LLAMA_CONTEXT_PARAMS_SCHEMA = {
    'n_ctx': {
        'type': int,
        'description': "text context",
        'default': 512
    },
    'seed': {
        'type': int,
        'description': "RNG seed",
        'default': -1
    },
    'f16_kv': {
        'type': bool,
        'description': "use fp16 for KV cache",
        'default': False
    },
    # ... more parameters
}

# Generation parameters schema
GPT_PARAMS_SCHEMA = {
    'n_predict': {
        'type': int,
        'description': "Number of tokens to predict",
        'default': 256
    },
    'n_threads': {
        'type': int,
        'description': "Number of threads",
        'default': 4
    },
    # ... more parameters
}
```
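A schema in this shape can drive flag registration directly. The builder below is a sketch of that idea with an abbreviated schema, not the package's actual code (it sticks to numeric parameters, since `type=bool` interacts poorly with argparse — `bool("False")` is `True`):

```python
import argparse

# Abbreviated schema in the documented shape (entries are illustrative)
SCHEMA = {
    'n_ctx': {'type': int, 'description': "text context", 'default': 512},
    'temp': {'type': float, 'description': "temperature", 'default': 0.8},
}

def build_parser(schema):
    """Register one optional flag per schema entry, with type/default/help."""
    parser = argparse.ArgumentParser()
    for name, spec in schema.items():
        parser.add_argument(f"--{name}", type=spec['type'],
                            default=spec['default'], help=spec['description'])
    return parser

args = build_parser(SCHEMA).parse_args([])
```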

### Programmatic CLI Access

Access CLI functionality programmatically:

```python { .api }
def main():
    """Main entry point for the command line interface."""

def run(args):
    """
    Run an interactive chat session with parsed arguments.

    Parameters:
    - args: Parsed command line arguments
    """
```

Example programmatic usage:

```python
import argparse
from pyllamacpp.cli import run

# Create an argument parser mirroring the CLI's flags
parser = argparse.ArgumentParser()
parser.add_argument('model', help='Path to model file')
parser.add_argument('--temp', type=float, default=0.8)
parser.add_argument('--n_predict', type=int, default=128)

# Parse arguments and run
args = parser.parse_args(['/path/to/model.ggml', '--temp', '0.7'])
run(args)
```

### Custom CLI Applications

Build custom CLI applications using the CLI components:

```python
import argparse
from pyllamacpp.model import Model
from pyllamacpp.cli import bcolors

def custom_cli():
    parser = argparse.ArgumentParser(description="Custom PyLLaMACpp CLI")
    parser.add_argument('model', help='Model path')
    parser.add_argument('--system-prompt', default="You are a helpful assistant.")
    args = parser.parse_args()

    # Initialize model with custom configuration
    model = Model(
        model_path=args.model,
        prompt_context=args.system_prompt,
        prompt_prefix="\n\nUser: ",
        prompt_suffix="\n\nAssistant: "
    )

    print(f"{bcolors.HEADER}Custom PyLLaMACpp Chat{bcolors.ENDC}")
    print(f"Model: {args.model}")
    print(f"System: {args.system_prompt}")
    print("-" * 50)

    while True:
        try:
            user_input = input(f"{bcolors.OKBLUE}You: {bcolors.ENDC}")
            if user_input.lower() in ['exit', 'quit']:
                break

            print(f"{bcolors.OKGREEN}AI: {bcolors.ENDC}", end="")
            for token in model.generate(user_input, n_predict=150):
                print(token, end="", flush=True)
            print()

        except KeyboardInterrupt:
            print(f"\n{bcolors.WARNING}Goodbye!{bcolors.ENDC}")
            break

if __name__ == "__main__":
    custom_cli()
```

### Debugging and Development

The CLI includes debugging features for development:

```python
# Color codes for terminal output
class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'

# Usage in CLI output
print(f"{bcolors.OKGREEN}Model loaded successfully{bcolors.ENDC}")
print(f"{bcolors.WARNING}Warning: Large context size{bcolors.ENDC}")
print(f"{bcolors.FAIL}Error: Model file not found{bcolors.ENDC}")
```

### Batch Processing Mode

Run the CLI in batch mode for automated testing:

```bash
# Process a single prompt from stdin
echo "Tell me a joke" | pyllamacpp /path/to/model.ggml --n_predict 50

# Multiple prompts from a file
cat prompts.txt | pyllamacpp /path/to/model.ggml --temp 0.5
```
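The same batch pattern can be driven from Python as a loop over prompts feeding a generate callable. The helper below is illustrative and uses a stub in place of `Model.generate`:

```python
def run_batch(generate_fn, prompts):
    """Run each prompt through a generate callable and collect the outputs."""
    results = []
    for prompt in prompts:
        results.append("".join(generate_fn(prompt)))
    return results

# Stub generator standing in for Model.generate
echo = lambda prompt: iter(["echo: ", prompt])
outputs = run_batch(echo, ["Tell me a joke", "What is AI?"])
```

Swapping the stub for a bound `model.generate` (with whatever keyword arguments you need) turns this into a real batch driver.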

### Integration with Development Workflow

Use the CLI for rapid prototyping and testing:

```bash
# Test different temperatures
for temp in 0.3 0.7 1.0; do
  echo "Temperature: $temp"
  echo "What is AI?" | pyllamacpp model.ggml --temp $temp --n_predict 50
  echo "---"
done

# Performance testing
time pyllamacpp model.ggml --n_predict 1000 < test_prompt.txt

# Memory usage monitoring (GNU time; the -v flag is Linux-specific)
/usr/bin/time -v pyllamacpp model.ggml --use_mlock True < test_prompt.txt
```