Tessl Tile for pypi/ngt@2.3.0

# Quantized Indexes

Quantized indexes reduce memory use by storing vectors in a compressed representation while aiming to preserve search accuracy. NGT offers two quantization approaches: `QuantizedIndex` for standard quantization and `QuantizedBlobIndex` for blob-based quantization with the highest compression.
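
To make "compressed representation" concrete, here is a minimal sketch of scalar quantization in plain Python. It is not NGT's quantization scheme; the helpers `quantize` and `dequantize` are hypothetical and assume every component lies in `[lo, hi]`.

```python
from array import array

def quantize(vector, lo=0.0, hi=1.0):
    """Map each float in [lo, hi] to a one-byte code (0..255)."""
    scale = 255.0 / (hi - lo)
    return bytes(min(255, max(0, round((x - lo) * scale))) for x in vector)

def dequantize(codes, lo=0.0, hi=1.0):
    """Recover approximate floats from the byte codes."""
    step = (hi - lo) / 255.0
    return [lo + c * step for c in codes]

vector = [0.0, 0.25, 0.5, 0.75, 1.0]
codes = quantize(vector)
restored = dequantize(codes)

# float32 storage needs 4 bytes per component, the codes need 1.
float32_bytes = len(array("f", vector).tobytes())
max_error = max(abs(a - b) for a, b in zip(vector, restored))
print(float32_bytes, len(codes), max_error)
```

The codes take a quarter of the space of float32 storage, and each restored component is off by at most half a quantization step. That reconstruction error is why quantized search is paired with accuracy controls such as `result_expansion`.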

## Capabilities

### Quantized Index

Standard quantized indexing that reduces memory usage through vector quantization while preserving search performance.

```python { .api }
class QuantizedIndex:
    def __init__(self, path, max_no_of_edges=128, zero_based_numbering=True,
                 read_only=False, log_disabled=False):
        """
        Open quantized index for memory-efficient search.

        Args:
            path (str): Path to quantized index directory
            max_no_of_edges (int): Maximum edges per node (default: 128)
            zero_based_numbering (bool): Use zero-based object IDs (default: True)
            read_only (bool): Open in read-only mode (default: False)
            log_disabled (bool): Disable progress logging (default: False)
        """
```

### Quantized Search Operations

Search operations optimized for quantized vector representations with result expansion control.

```python { .api }
class QuantizedIndex:
    def search(self, query, size=0, epsilon=-1.0, result_expansion=-1.0, edge_size=-1):
        """
        Search nearest neighbors in quantized index.

        Args:
            query (array-like): Query vector
            size (int): Number of results to return, 0 uses default (default: 0)
            epsilon (float): Search range expansion, -1.0 uses default (default: -1.0)
            result_expansion (float): Result expansion ratio, -1.0 uses default (default: -1.0)
            edge_size (int): Number of edges to explore, -1 uses default (default: -1)

        Returns:
            list: List of (object_id, distance) tuples
        """
```
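
The role of `result_expansion` can be illustrated with a two-stage search: collect `size * result_expansion` candidates using a cheap approximate distance, then re-rank them by exact distance. The sketch below is a plain-Python illustration under that reading, not NGT's internal procedure. `coarse` and `search_with_expansion` are hypothetical names, and Euclidean distance is assumed.

```python
import math

def coarse(vector, step=0.5):
    """Crude stand-in for quantization: snap each component to a grid."""
    return [round(x / step) * step for x in vector]

def search_with_expansion(query, vectors, size, result_expansion=1.0):
    # Stage 1: rank by approximate distance, keep size * result_expansion.
    n_candidates = max(size, math.ceil(size * result_expansion))
    by_approx = sorted(
        range(len(vectors)),
        key=lambda i: math.dist(query, coarse(vectors[i])),
    )
    candidates = by_approx[:n_candidates]
    # Stage 2: re-rank the candidates by exact distance, return top `size`.
    exact = sorted((math.dist(query, vectors[i]), i) for i in candidates)
    return [(i, d) for d, i in exact[:size]]

query = [0.0, 0.0]
vectors = [[0.24, 0.24], [0.26, 0.0], [1.0, 1.0]]
no_expansion = search_with_expansion(query, vectors, size=1, result_expansion=1.0)
with_expansion = search_with_expansion(query, vectors, size=1, result_expansion=2.0)
print(no_expansion[0][0], with_expansion[0][0])
```

In this toy data object 1 is the true nearest neighbor, but the coarse distance ranks object 0 first. Only the run with a larger expansion ratio keeps object 1 among the candidates and returns it after re-ranking, at the cost of computing more exact distances.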

### Quantized Configuration

Configure search parameters specific to quantized index operations.

```python { .api }
class QuantizedIndex:
    def set(self, num_of_search_objects=0, search_radius=float('-inf'),
            epsilon=-1.0, result_expansion=-1.0):
        """
        Set default search parameters for quantized index.

        Args:
            num_of_search_objects (int): Default number of search results (default: 0)
            search_radius (float): Maximum search radius (default: float('-inf'))
            epsilon (float): Default search epsilon (default: -1.0)
            result_expansion (float): Default result expansion ratio (default: -1.0)

        Returns:
            None
        """

    def set_with_distance(self, boolean=True):
        """
        Configure whether to return distances with search results.

        Args:
            boolean (bool): Include distances in results (default: True)

        Returns:
            None
        """

    def set_defaults(self, size=0, search_radius=float('-inf'),
                     epsilon=-1.0, result_expansion=-1.0):
        """
        Set default parameters (deprecated, use set() instead).

        Args:
            size (int): Default number of search results (default: 0)
            search_radius (float): Maximum search radius (default: float('-inf'))
            epsilon (float): Default search epsilon (default: -1.0)
            result_expansion (float): Default result expansion ratio (default: -1.0)

        Returns:
            None
        """
```
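
The docstrings describe sentinel values (`0`, `-1.0`) that mean "use the default". The sketch below shows one way such sentinels could resolve against defaults stored by `set()`. This is an assumption drawn from the docstrings, not ngtpy's actual code; `SearchDefaults` and `resolve` are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class SearchDefaults:
    # Hypothetical stand-in for the values a call to set() would store.
    num_of_search_objects: int = 20
    epsilon: float = 0.02
    result_expansion: float = 3.0

def resolve(defaults, size=0, epsilon=-1.0, result_expansion=-1.0):
    """Return effective parameters, replacing sentinels with stored defaults."""
    return (
        defaults.num_of_search_objects if size == 0 else size,
        defaults.epsilon if epsilon == -1.0 else epsilon,
        defaults.result_expansion if result_expansion == -1.0 else result_expansion,
    )

defaults = SearchDefaults()
all_defaults = resolve(defaults)
overridden = resolve(defaults, size=10, epsilon=0.05)
print(all_defaults, overridden)
```

Under this reading, a per-call argument overrides the stored default only when it differs from its sentinel, so `search(query)` with no extra arguments uses whatever `set()` last configured.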

### Quantized Blob Index

Advanced quantized indexing with blob storage for maximum compression and specialized search operations.

```python { .api }
class QuantizedBlobIndex:
    def __init__(self, path, max_no_of_edges=128, zero_based_numbering=True,
                 read_only=False, log_disabled=False, refinement=False,
                 refinement_object_type="Any"):
        """
        Open quantized blob index for maximum compression.

        Args:
            path (str): Path to quantized blob index directory
            max_no_of_edges (int): Maximum edges per node (default: 128)
            zero_based_numbering (bool): Use zero-based object IDs (default: True)
            read_only (bool): Open in read-only mode (default: False)
            log_disabled (bool): Disable progress logging (default: False)
            refinement (bool): Enable search refinement (default: False)
            refinement_object_type (str): Object type for refinement (default: "Any")
        """
```

### Blob Search Operations

Search operations designed for blob-quantized indexes with batch processing capabilities.

```python { .api }
class QuantizedBlobIndex:
    def search(self, query, size=0, epsilon=float('-inf')):
        """
        Search nearest neighbors in blob quantized index.

        Args:
            query (array-like): Query vector
            size (int): Number of results to return, 0 uses default (default: 0)
            epsilon (float): Search range parameter (default: float('-inf'))

        Returns:
            list: List of (object_id, distance) tuples
        """

    def batch_search(self, query, results, size=0):
        """
        Batch search multiple queries in blob index.

        Args:
            query (array-like): Array of query vectors
            results (BatchResults): Container for batch results
            size (int): Number of results per query, 0 uses default (default: 0)

        Returns:
            None (results stored in results parameter)
        """

    def batch_search_tmp(self, query, size=0):
        """
        Temporary batch search implementation.

        Args:
            query (array-like): Array of query vectors
            size (int): Number of results per query, 0 uses default (default: 0)

        Returns:
            list: Batch search results
        """

    def batch_range_search(self, query, results, radius=float('-inf')):
        """
        Range-based batch search within specified radius.

        Args:
            query (array-like): Array of query vectors
            results (BatchResults): Container for batch results
            radius (float): Search radius, float('-inf') for no limit (default: float('-inf'))

        Returns:
            None (results stored in results parameter)
        """
```
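
To clarify what a batch range search computes, the sketch below is an exact brute-force reference in plain Python: for each query it returns every object whose distance is within `radius`. It does not use ngtpy and says nothing about how `batch_range_search` works internally. The function name is hypothetical, and Euclidean distance is assumed, which may differ from the distance type of a given index.

```python
import math

def brute_force_range_search(queries, vectors, radius):
    """For each query, list (object_id, distance) pairs with distance <= radius."""
    results = []
    for query in queries:
        hits = [(i, math.dist(query, v)) for i, v in enumerate(vectors)]
        within = [(i, d) for i, d in hits if d <= radius]
        # Nearest first, matching the usual ordering of search results.
        results.append(sorted(within, key=lambda hit: hit[1]))
    return results

vectors = [[0.0, 0.0], [0.3, 0.0], [1.0, 1.0]]
queries = [[0.0, 0.0], [1.0, 0.9]]
range_hits = brute_force_range_search(queries, vectors, radius=0.5)
for q, hits in enumerate(range_hits):
    print(q, [obj_id for obj_id, _ in hits])
```

Unlike a fixed-size search, the number of hits varies per query, which is one reason batch results are collected in a dedicated container. A reference like this is also handy as ground truth on small samples when checking the accuracy of an approximate index.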

### Blob Index Management

Management operations for blob quantized indexes including insertion and persistence.

```python { .api }
class QuantizedBlobIndex:
    def batch_insert(self, objects, debug=False):
        """
        Insert multiple objects into blob index.

        Args:
            objects (array-like): Array of vectors to insert
            debug (bool): Enable debug output (default: False)

        Returns:
            None
        """

    def save(self):
        """
        Save blob index to disk.

        Returns:
            None
        """

    def set(self, num_of_search_objects=0, epsilon=float('-inf'), blob_epsilon=-1.0,
            result_expansion=-1.0, radius=-1.0, edge_size=-1, exploration_size=0,
            exact_result_expansion=0.0, num_of_probes=-1):
        """
        Set default search parameters for blob index.

        Args:
            num_of_search_objects (int): Default number of search results (default: 0)
            epsilon (float): Default search epsilon (default: float('-inf'))
            blob_epsilon (float): Blob-specific epsilon, -1.0 uses default (default: -1.0)
            result_expansion (float): Default result expansion ratio, -1.0 uses default (default: -1.0)
            radius (float): Search radius, -1.0 uses default (default: -1.0)
            edge_size (int): Edge size, -1 uses default (default: -1)
            exploration_size (int): Graph exploration size (default: 0)
            exact_result_expansion (float): Exact result expansion ratio (default: 0.0)
            num_of_probes (int): Number of probes, -1 uses default (default: -1)

        Returns:
            None
        """

    def set_with_distance(self, boolean=True):
        """
        Configure whether to return distances with search results.

        Args:
            boolean (bool): Include distances in results (default: True)

        Returns:
            None
        """
```
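
Because `batch_insert` takes an array of vectors, it can help to validate the batch before handing it to the index. The helper below is a hypothetical pre-check in plain Python, not part of ngtpy. It assumes the caller knows the index dimension and that non-finite values are unwanted.

```python
import math

def check_batch(vectors, dimension):
    """Return the batch as lists of floats, or raise ValueError if malformed."""
    rows = [[float(x) for x in row] for row in vectors]
    if not rows:
        raise ValueError("batch is empty")
    for n, row in enumerate(rows):
        if len(row) != dimension:
            raise ValueError(
                f"row {n} has {len(row)} components, expected {dimension}"
            )
        if not all(math.isfinite(x) for x in row):
            raise ValueError(f"row {n} contains NaN or infinity")
    return rows

clean = check_batch([[1, 2], [3, 4]], dimension=2)
print(clean)

try:
    check_batch([[1, 2], [3]], dimension=2)
except ValueError as err:
    print("rejected:", err)
```

Failing early with a clear message is usually easier to debug than an error raised from inside the native extension.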

### Batch Results Container

Container class for managing batch search results from quantized indexes.

```python { .api }
class BatchResults:
    def __init__(self):
        """
        Create empty batch results container.
        """

    def get(self, position):
        """
        Get result at specific position.

        Args:
            position (int): Position index

        Returns:
            Result at specified position
        """

    def get_ids(self):
        """
        Get all result IDs as array.

        Returns:
            array: Object IDs from search results
        """

    def get_indexed_ids(self):
        """
        Get indexed result IDs.

        Returns:
            array: Indexed object IDs
        """

    def get_indexed_distances(self):
        """
        Get indexed distances.

        Returns:
            array: Distance values for indexed results
        """

    def get_index(self):
        """
        Get result index information.

        Returns:
            Index information for results
        """

    def get_size(self):
        """
        Get number of results in container.

        Returns:
            int: Number of results
        """
```
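
The source does not document the layout behind the "indexed" accessors. A common way to store variable-length per-query results is a flat array of IDs, a parallel flat array of distances, and an offsets array marking where each query's results begin and end. The sketch below illustrates that layout in plain Python. It is an assumption for illustration only, and the offset convention (a leading `0`) may not match what `get_index()` returns.

```python
def flatten(per_query):
    """Pack per-query (id, distance) lists into flat arrays plus offsets."""
    ids, distances, index = [], [], [0]
    for hits in per_query:
        for obj_id, dist in hits:
            ids.append(obj_id)
            distances.append(dist)
        index.append(len(ids))  # end offset of this query's results
    return ids, distances, index

def query_slice(ids, distances, index, q):
    """Recover the (id, distance) list for query number q."""
    start, end = index[q], index[q + 1]
    return list(zip(ids[start:end], distances[start:end]))

per_query = [[(7, 0.1), (2, 0.4)], [], [(5, 0.2)]]
ids, distances, index = flatten(per_query)
print(ids, index)
print(query_slice(ids, distances, index, 2))
```

A flat layout like this converts cheaply to NumPy arrays and handles queries with differing result counts, such as range searches, without padding.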

## Usage Examples

### Quantized Index Search

```python
import ngtpy
import numpy as np

# Open existing quantized index
# Note: Quantized index must be created using command-line tools (ngtqg quantize)
qindex = ngtpy.QuantizedIndex("quantized_index_path", max_no_of_edges=64)

# Configure search parameters
qindex.set(num_of_search_objects=20, epsilon=0.02, result_expansion=3.0)

# Search with result expansion for higher accuracy
# The query length must match the index dimension (128 is assumed here)
query = np.random.random(128).astype(np.float32)
results = qindex.search(query, size=10, epsilon=0.05, result_expansion=5.0)

for rank, (obj_id, distance) in enumerate(results):
    print(f"Rank {rank+1}: Object {obj_id}, Distance {distance:.4f}")
```
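
When tuning `epsilon` and `result_expansion`, it helps to measure how many true nearest neighbors the quantized search actually returns. A minimal sketch of recall@k in plain Python follows. `recall_at_k` is a hypothetical helper, and the ground-truth IDs are assumed to come from an exact search over the same data.

```python
def recall_at_k(found_ids, true_ids, k):
    """Fraction of the true top-k neighbors present in the returned top-k."""
    return len(set(found_ids[:k]) & set(true_ids[:k])) / k

found = [3, 1, 7, 9]   # IDs returned by the approximate search
truth = [1, 3, 5, 7]   # IDs from an exact search over the same data
recall = recall_at_k(found, truth, k=4)
print(recall)
```

With search results shaped as `(object_id, distance)` tuples, `found` can be built as `[obj_id for obj_id, _ in results]`. Raising `result_expansion` typically increases recall at the cost of query time.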

### Quantized Blob Index with Batch Search

```python
import ngtpy
import numpy as np

# Open quantized blob index with refinement
qb_index = ngtpy.QuantizedBlobIndex(
    "blob_index_path",
    max_no_of_edges=128,
    refinement=True
)

# Prepare multiple queries
queries = np.random.random((5, 128)).astype(np.float32)
batch_results = ngtpy.BatchResults()

# Perform batch search
qb_index.batch_search(queries, batch_results, size=5)

# Process batch results
for i in range(batch_results.get_size()):
    result = batch_results.get(i)
    print(f"Query {i}: {result}")

# Range-based search within specified radius
range_results = ngtpy.BatchResults()
qb_index.batch_range_search(queries, range_results, radius=0.5)

print(f"Range search found {range_results.get_size()} total results")
```

### Adding Data to Blob Index

```python
import ngtpy
import numpy as np

# Open blob index for writing
qb_index = ngtpy.QuantizedBlobIndex("writable_blob_index", read_only=False)

# Insert new vectors
new_vectors = np.random.random((100, 128)).astype(np.float32)
qb_index.batch_insert(new_vectors, debug=True)

# Save the updated index
qb_index.save()

print("Successfully added 100 new vectors to blob index")
```
