# PyLLaMACpp

Python bindings for llama.cpp that let developers run Facebook's LLaMA language models, and other compatible large language models, directly in Python applications. PyLLaMACpp provides both a high-level Python API through the `Model` class for easy integration, and low-level access to llama.cpp C-API functions for advanced users who need custom implementations.

## Package Information

- **Package Name**: pyllamacpp
- **Language**: Python
- **Installation**: `pip install pyllamacpp`
- **Dependencies**: CMake and pybind11 (for building from source); optional: numpy, torch, and sentencepiece (for the model conversion utilities)

## Core Imports

```python
from pyllamacpp.model import Model
```

For utility functions:

```python
from pyllamacpp import utils
```

For LangChain integration:

```python
from pyllamacpp.langchain_llm import PyllamacppLLM
```

For logging configuration:

```python
from pyllamacpp._logger import get_logger, set_log_level
```

For package constants:

```python
from pyllamacpp.constants import PACKAGE_NAME, LOGGING_LEVEL
```

For web interface:

```python
from pyllamacpp.webui import webui, run
```

## Basic Usage

```python
from pyllamacpp.model import Model

# Load a GGML model
model = Model(model_path='/path/to/model.ggml')

# Stream generated tokens one at a time
for token in model.generate("Tell me a joke"):
    print(token, end='', flush=True)

# Or generate the full response at once using cpp_generate
response = model.cpp_generate("What is artificial intelligence?", n_predict=100)
print(response)
```

Interactive dialogue example:

```python
from pyllamacpp.model import Model

model = Model(model_path='/path/to/model.ggml')

while True:
    try:
        prompt = input("You: ")
        if prompt == '':
            continue
        print("AI: ", end='')
        for token in model.generate(prompt):
            print(token, end='', flush=True)
        print()
    except KeyboardInterrupt:
        break
```

## Architecture

PyLLaMACpp operates as a bridge between Python and the high-performance llama.cpp C++ library:

- **Model Class**: High-level Python interface providing text generation, tokenization, and embedding capabilities
- **C++ Extension (_pyllamacpp)**: Direct bindings to llama.cpp functions, built with pybind11
- **Utility Functions**: Model format conversion and quantization tools
- **Integration Wrappers**: LangChain compatibility and web UI interfaces
- **CLI Interface**: Command-line tool for interactive model testing

This design combines the performance of llama.cpp's optimized C++ implementation with the ease of use of a Python interface. It is suitable for chatbots, text generation, interactive AI applications, and any project that needs efficient local language-model inference without external API dependencies.

## Capabilities

### Model Operations

Core functionality for loading models, generating text, and managing model state. Includes both streaming token generation and batch text generation, with extensive parameter control.

```python { .api }
class Model:
    def __init__(self, model_path: str, prompt_context: str = '', prompt_prefix: str = '', prompt_suffix: str = '', log_level: int = logging.ERROR, n_ctx: int = 512, seed: int = 0, n_gpu_layers: int = 0, f16_kv: bool = False, logits_all: bool = False, vocab_only: bool = False, use_mlock: bool = False, embedding: bool = False): ...
    def generate(self, prompt: str, n_predict: Union[None, int] = None, n_threads: int = 4, **kwargs) -> Generator: ...
    def cpp_generate(self, prompt: str, n_predict: int = 128, **kwargs) -> str: ...
    def tokenize(self, text: str): ...
    def detokenize(self, tokens: list): ...
    def reset(self) -> None: ...
```

[Model Operations](./model-operations.md)
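
As a usage sketch, the streaming `generate()` method can be consumed with a small accumulator. `collect_stream` below is our own convenience helper, not part of pyllamacpp, and the model path in the comment is a placeholder:

```python
from typing import Iterable

def collect_stream(token_stream: Iterable[str], max_chars: int = 2000) -> str:
    """Accumulate streamed tokens into one string, stopping at a length cap."""
    parts = []
    total = 0
    for token in token_stream:
        parts.append(token)
        total += len(token)
        if total >= max_chars:
            break
    return ''.join(parts)

# With pyllamacpp installed, this would be used as:
#   from pyllamacpp.model import Model
#   model = Model(model_path='/path/to/model.ggml')
#   answer = collect_stream(model.generate("Explain GGML briefly.", n_predict=64))
```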

### Utility Functions

Helper functions for model format conversion and quantization: conversion of LLaMA PyTorch checkpoints to GGML format, and quantization to reduce model size.

```python { .api }
def llama_to_ggml(dir_model: str, ftype: int = 1) -> str: ...
def quantize(ggml_model_path: str, output_model_path: str = None, itype: int = 2) -> str: ...
```

[Utility Functions](./utilities.md)
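
A hedged sketch of a convert-then-quantize pipeline. `quantized_path` is our own naming convention (pyllamacpp derives its own default when `output_model_path` is `None`), and the checkpoint paths are placeholders:

```python
import os

def quantized_path(ggml_model_path: str, suffix: str = '-q4') -> str:
    """Derive an output filename for a quantized model (illustrative convention)."""
    root, ext = os.path.splitext(ggml_model_path)
    return f"{root}{suffix}{ext}"

# With pyllamacpp installed, the pipeline would look roughly like:
#   from pyllamacpp import utils
#   ggml_file = utils.llama_to_ggml('/path/to/llama-checkpoint', ftype=1)
#   utils.quantize(ggml_file, quantized_path(ggml_file), itype=2)
```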

### LangChain Integration

LangChain-compatible wrapper class enabling seamless integration with LangChain workflows and chains. It exposes the same interface as other LangChain LLM implementations.

```python { .api }
class PyllamacppLLM(LLM):
    model: str
    n_ctx: int = 512
    seed: int = 0
    n_threads: int = 4
    n_predict: int = 50
    temp: float = 0.8
    top_p: float = 0.95
    top_k: int = 40
```

[LangChain Integration](./langchain-integration.md)
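
A sketch of wiring the wrapper into a simple prompt pipeline. `fill_template` is a self-contained stand-in for LangChain's `PromptTemplate.format()`; the real calls, which require langchain and a model file (paths and parameter values here are placeholders), are shown in comments:

```python
template = "Question: {question}\n\nAnswer: Let's think step by step."

def fill_template(template: str, **values) -> str:
    """Minimal stand-in for LangChain's PromptTemplate.format()."""
    return template.format(**values)

prompt = fill_template(template, question="What is GGML?")

# With langchain and pyllamacpp installed, the equivalent wiring would be roughly:
#   from langchain import PromptTemplate, LLMChain
#   from pyllamacpp.langchain_llm import PyllamacppLLM
#   llm = PyllamacppLLM(model='/path/to/model.ggml', n_predict=100, temp=0.7)
#   chain = LLMChain(prompt=PromptTemplate(template=template,
#                                          input_variables=['question']), llm=llm)
#   print(chain.run("What is GGML?"))
```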

### Embeddings

Vector-embedding functionality for semantic similarity and RAG applications. Supports generating embeddings for an individual prompt or extracting embeddings from the current model context.

```python { .api }
def get_embeddings(self) -> List[float]: ...
def get_prompt_embeddings(self, prompt: str, n_threads: int = 4, n_batch: int = 512) -> List[float]: ...
```

[Embeddings](./embeddings.md)
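
Embeddings are typically compared with cosine similarity. The helper below is self-contained; the `Model` calls in the comment assume a model loaded with `embedding=True` and use a placeholder path:

```python
import math
from typing import List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# With pyllamacpp installed:
#   from pyllamacpp.model import Model
#   model = Model(model_path='/path/to/model.ggml', embedding=True)
#   e1 = model.get_prompt_embeddings("a cat sat on the mat")
#   e2 = model.get_prompt_embeddings("a kitten rested on the rug")
#   print(cosine_similarity(e1, e2))
```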

### Web User Interface

Streamlit-based web interface for interactive model testing and development: a browser-based chat interface with configurable parameters and real-time model interaction.

```python { .api }
def webui() -> None: ...
def run(): ...
```

[Web User Interface](./web-ui.md)

### Command Line Interface

Interactive command-line interface for model testing and development, with a configurable chat loop, extensive parameter control, and debugging features.

```bash
pyllamacpp path/to/model.ggml
```

[Command Line Interface](./cli.md)