# Command Line Interface

Interactive command-line interface for model testing and development. The CLI provides a configurable chat interface with extensive parameter control, debugging features, and direct access to model capabilities for experimentation.
## Capabilities

### Basic CLI Usage

Launch the interactive chat interface with a model file:

```bash
pyllamacpp /path/to/model.ggml
```

This starts an interactive session where you can chat with the model:

```
██████╗ ██╗ ██╗██╗ ██╗ █████╗ ███╗ ███╗ █████╗ ██████╗██████╗ ██████╗
██╔══██╗╚██╗ ██╔╝██║ ██║ ██╔══██╗████╗ ████║██╔══██╗██╔════╝██╔══██╗██╔══██╗
██████╔╝ ╚████╔╝ ██║ ██║ ███████║██╔████╔██║███████║██║ ██████╔╝██████╔╝
██╔═══╝ ╚██╔╝ ██║ ██║ ██╔══██║██║╚██╔╝██║██╔══██║██║ ██╔═══╝ ██╔═══╝
██║ ██║ ███████╗███████╗██║ ██║██║ ╚═╝ ██║██║ ██║╚██████╗██║ ██║
╚═╝ ╚═╝ ╚══════╝╚══════╝╚═╝ ╚═╝╚═╝ ╚═╝╚═╝ ╚═╝ ╚═════╝╚═╝ ╚═╝

PyLLaMACpp
A simple Command Line Interface to test the package
Version: 2.4.3

You: Hello, how are you?
AI: I'm doing well, thank you for asking! How can I help you today?

You:
```
### Command Line Arguments

The CLI supports extensive parameter customization:

```bash
pyllamacpp --help

usage: pyllamacpp [-h] [--n_ctx N_CTX] [--seed SEED] [--f16_kv F16_KV]
                  [--logits_all LOGITS_ALL] [--vocab_only VOCAB_ONLY]
                  [--use_mlock USE_MLOCK] [--embedding EMBEDDING]
                  [--n_predict N_PREDICT] [--n_threads N_THREADS]
                  [--repeat_last_n REPEAT_LAST_N] [--top_k TOP_K]
                  [--top_p TOP_P] [--temp TEMP] [--repeat_penalty REPEAT_PENALTY]
                  [--n_batch N_BATCH]
                  model

positional arguments:
  model                 The path of the model file

options:
  -h, --help            show this help message and exit

  # Context Parameters
  --n_ctx N_CTX         text context (default: 512)
  --seed SEED           RNG seed (default: -1 for random)
  --f16_kv F16_KV       use fp16 for KV cache (default: False)
  --logits_all LOGITS_ALL
                        compute all logits, not just the last one (default: False)
  --vocab_only VOCAB_ONLY
                        only load vocabulary, no weights (default: False)
  --use_mlock USE_MLOCK
                        force system to keep model in RAM (default: False)
  --embedding EMBEDDING
                        embedding mode only (default: False)

  # Generation Parameters
  --n_predict N_PREDICT
                        Number of tokens to predict (default: 256)
  --n_threads N_THREADS
                        Number of threads (default: 4)
  --repeat_last_n REPEAT_LAST_N
                        Last n tokens to penalize (default: 64)
  --top_k TOP_K         top_k sampling (default: 40)
  --top_p TOP_P         top_p sampling (default: 0.95)
  --temp TEMP           temperature (default: 0.8)
  --repeat_penalty REPEAT_PENALTY
                        repeat_penalty (default: 1.1)
  --n_batch N_BATCH     batch size for prompt processing (default: 512)
```
### CLI Parameter Examples

Configure the model for different use cases:

```bash
# High creativity configuration
pyllamacpp /path/to/model.ggml \
    --temp 1.2 \
    --top_p 0.9 \
    --top_k 50 \
    --n_predict 200

# Focused, deterministic responses
pyllamacpp /path/to/model.ggml \
    --temp 0.1 \
    --top_p 0.9 \
    --top_k 20 \
    --repeat_penalty 1.15

# Large context configuration
pyllamacpp /path/to/model.ggml \
    --n_ctx 2048 \
    --n_batch 1024 \
    --n_threads 8

# Memory-optimized configuration
pyllamacpp /path/to/model.ggml \
    --use_mlock True \
    --n_batch 256
```
### Interactive Features

The CLI provides several interactive features:

1. **Multi-line Input**: Press Enter twice to send multi-line messages
2. **Exit Commands**: Type 'exit', 'quit', or press Ctrl+C to quit
3. **Context Persistence**: Conversation context is maintained across exchanges
4. **Real-time Generation**: See tokens generated in real-time
5. **Color Output**: Color-coded prompts and responses for better readability
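
The multi-line input behavior can be sketched as a small helper that accumulates lines until a blank line is entered (an illustrative sketch, not the package's actual implementation; `read_message` is a hypothetical name):

```python
def read_message(input_lines):
    """Collect lines from an iterable until an empty line (Enter pressed twice)."""
    buffer = []
    for line in input_lines:
        if line == "":          # blank line terminates the message
            break
        buffer.append(line)
    return "\n".join(buffer)

# In a real session the iterable would be `iter(input, "")`;
# here a list stands in for keyboard input.
message = read_message(iter(["Hello,", "world", "", "ignored"]))
print(message)  # → Hello,\nworld
```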
### Instruction-Following Mode

The CLI includes built-in instruction-following templates:

```python { .api }
# Default prompt templates in CLI
PROMPT_CONTEXT = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
PROMPT_PREFIX = "\n\n##Instruction:\n"
PROMPT_SUFFIX = "\n\n##Response:\n"
```
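
These templates are concatenated around the user's text to form the full prompt sent to the model. A minimal sketch of that composition (the `build_prompt` helper is hypothetical; the three constants match the templates above):

```python
PROMPT_CONTEXT = "Below is an instruction that describes a task. Write a response that appropriately completes the request."
PROMPT_PREFIX = "\n\n##Instruction:\n"
PROMPT_SUFFIX = "\n\n##Response:\n"

def build_prompt(instruction):
    """Wrap a user instruction in the CLI's default instruction template."""
    return PROMPT_CONTEXT + PROMPT_PREFIX + instruction + PROMPT_SUFFIX

print(build_prompt("Explain how photosynthesis works"))
```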
Example interaction with instruction format:

```
You: Explain how photosynthesis works

AI: ##Response:
Photosynthesis is the process by which plants convert light energy into chemical energy...
```
### Performance Monitoring

The CLI includes performance monitoring capabilities:

```
# Example CLI session with timing info
You: Tell me about machine learning
AI: Machine learning is a subset of artificial intelligence... (Generated in 2.3s, 45 tokens/s)

# System information display
Model: /path/to/llama-7b.ggml
Context size: 512 tokens
Threads: 4
Memory usage: 4.2 GB
```
### Configuration Schema

The CLI uses structured parameter schemas for validation:

```python { .api }
# Context parameters schema
LLAMA_CONTEXT_PARAMS_SCHEMA = {
    'n_ctx': {
        'type': int,
        'description': "text context",
        'default': 512
    },
    'seed': {
        'type': int,
        'description': "RNG seed",
        'default': -1
    },
    'f16_kv': {
        'type': bool,
        'description': "use fp16 for KV cache",
        'default': False
    },
    # ... more parameters
}

# Generation parameters schema
GPT_PARAMS_SCHEMA = {
    'n_predict': {
        'type': int,
        'description': "Number of tokens to predict",
        'default': 256
    },
    'n_threads': {
        'type': int,
        'description': "Number of threads",
        'default': 4
    },
    # ... more parameters
}
```
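
A schema in this shape supports simple defaulting and type checking of user-supplied values. A minimal sketch of how such validation could work (the `apply_schema` helper is illustrative, not part of the package; the schema is abbreviated to three entries):

```python
# Abbreviated copy of the context-parameters schema shown above
LLAMA_CONTEXT_PARAMS_SCHEMA = {
    'n_ctx': {'type': int, 'description': "text context", 'default': 512},
    'seed': {'type': int, 'description': "RNG seed", 'default': -1},
    'f16_kv': {'type': bool, 'description': "use fp16 for KV cache", 'default': False},
}

def apply_schema(schema, overrides):
    """Merge user overrides onto schema defaults, type-checking each value."""
    params = {name: spec['default'] for name, spec in schema.items()}
    for name, value in overrides.items():
        if name not in schema:
            raise KeyError(f"unknown parameter: {name}")
        if not isinstance(value, schema[name]['type']):
            raise TypeError(f"{name} expects {schema[name]['type'].__name__}")
        params[name] = value
    return params

print(apply_schema(LLAMA_CONTEXT_PARAMS_SCHEMA, {'n_ctx': 2048}))
```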
### Programmatic CLI Access

Access CLI functionality programmatically:

```python { .api }
def main():
    """Main entry point for command line interface."""

def run(args):
    """
    Run interactive chat session with parsed arguments.

    Parameters:
    - args: Parsed command line arguments
    """
```

Example programmatic usage:

```python
import argparse
from pyllamacpp.cli import run

# Create argument parser
parser = argparse.ArgumentParser()
parser.add_argument('model', help='Path to model file')
parser.add_argument('--temp', type=float, default=0.8)
parser.add_argument('--n_predict', type=int, default=128)

# Parse arguments and run
args = parser.parse_args(['/path/to/model.ggml', '--temp', '0.7'])
run(args)
```
### Custom CLI Applications

Build custom CLI applications using the CLI components:

```python
import argparse

from pyllamacpp.model import Model
from pyllamacpp.cli import bcolors

def custom_cli():
    parser = argparse.ArgumentParser(description="Custom PyLLaMACpp CLI")
    parser.add_argument('model', help='Model path')
    parser.add_argument('--system-prompt', default="You are a helpful assistant.")
    args = parser.parse_args()

    # Initialize model with custom configuration
    model = Model(
        model_path=args.model,
        prompt_context=args.system_prompt,
        prompt_prefix="\n\nUser: ",
        prompt_suffix="\n\nAssistant: "
    )

    print(f"{bcolors.HEADER}Custom PyLLaMACpp Chat{bcolors.ENDC}")
    print(f"Model: {args.model}")
    print(f"System: {args.system_prompt}")
    print("-" * 50)

    while True:
        try:
            user_input = input(f"{bcolors.OKBLUE}You: {bcolors.ENDC}")
            if user_input.lower() in ['exit', 'quit']:
                break

            print(f"{bcolors.OKGREEN}AI: {bcolors.ENDC}", end="")
            for token in model.generate(user_input, n_predict=150):
                print(token, end="", flush=True)
            print()

        except KeyboardInterrupt:
            print(f"\n{bcolors.WARNING}Goodbye!{bcolors.ENDC}")
            break

if __name__ == "__main__":
    custom_cli()
```
### Debugging and Development

The CLI includes debugging features for development:

```python
# Color codes for terminal output
class bcolors:
    HEADER = '\033[95m'
    OKBLUE = '\033[94m'
    OKGREEN = '\033[92m'
    WARNING = '\033[93m'
    FAIL = '\033[91m'
    ENDC = '\033[0m'
    BOLD = '\033[1m'
    UNDERLINE = '\033[4m'

# Usage in CLI output
print(f"{bcolors.OKGREEN}Model loaded successfully{bcolors.ENDC}")
print(f"{bcolors.WARNING}Warning: Large context size{bcolors.ENDC}")
print(f"{bcolors.FAIL}Error: Model file not found{bcolors.ENDC}")
```
### Batch Processing Mode

Run the CLI in batch mode for automated testing:

```bash
# Process a single prompt from stdin
echo "Tell me a joke" | pyllamacpp /path/to/model.ggml --n_predict 50

# Multiple prompts from a file
cat prompts.txt | pyllamacpp /path/to/model.ggml --temp 0.5
```
321
322
### Integration with Development Workflow
323
324
Use the CLI for rapid prototyping and testing:
325
326
```bash
327
# Test different temperatures
328
for temp in 0.3 0.7 1.0; do
329
echo "Temperature: $temp"
330
echo "What is AI?" | pyllamacpp model.ggml --temp $temp --n_predict 50
331
echo "---"
332
done
333
334
# Performance testing
335
time pyllamacpp model.ggml --n_predict 1000 < test_prompt.txt
336
337
# Memory usage monitoring
338
/usr/bin/time -v pyllamacpp model.ggml --use_mlock True < test_prompt.txt
339
```