# Utility Functions

Extended functionality from the `bitarray.util` module, including analysis functions, pretty printing, set operations, serialization, and specialized bit manipulation operations.

## Capabilities

### Analysis Functions

Functions for analyzing bitarray contents and computing statistical properties.

```python { .api }
def count_n(a: bitarray, n: int, value: int = 1) -> int:
    """
    Return the lowest index i for which a[:i].count(value) == n.

    Args:
        a: Input bitarray
        n: Number of bits with the given value to count off
        value: Bit value to count (0 or 1)

    Returns:
        Lowest index i such that a[:i] contains exactly n bits equal to value

    Raises:
        ValueError: If n exceeds a.count(value)
    """

def parity(a: bitarray) -> int:
    """
    Calculate parity (XOR of all bits).

    Args:
        a: Input bitarray

    Returns:
        0 if even number of 1-bits, 1 if odd
    """

def sum_indices(a: bitarray, mode: int = 1) -> int:
    """
    Sum the indices of all set bits.

    Args:
        a: Input bitarray
        mode: Calculation mode (1 sums the indices; 2 sums the squared indices)

    Returns:
        Sum of all indices where bit is set
    """

def xor_indices(a: bitarray) -> int:
    """
    Calculate XOR of all indices where bit is set.

    Args:
        a: Input bitarray

    Returns:
        XOR of all set bit indices
    """
```

**Usage Examples:**

```python
from bitarray import bitarray
from bitarray.util import count_n, parity, sum_indices, xor_indices

a = bitarray('11010110')

# Analysis functions
idx_two_1s = count_n(a, 2, 1)  # Lowest i with a[:i].count(1) == 2: 2
idx_one_0 = count_n(a, 1, 0)   # Lowest i with a[:i].count(0) == 1: 3
parity_val = parity(a)         # XOR of all bits (five 1-bits): 1
index_sum = sum_indices(a)     # 0 + 1 + 3 + 5 + 6 = 15
index_xor = xor_indices(a)     # 0 ^ 1 ^ 3 ^ 5 ^ 6 = 1
```
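
To make the semantics of the analysis functions concrete, here is a minimal pure-Python sketch that models them on a plain list of 0/1 integers. The `*_model` names are hypothetical and are not part of `bitarray.util`; the `count_n` model assumes the upstream definition (lowest index `i` with `a[:i].count(value) == n`), and none of this reflects how the C implementation actually works.

```python
from functools import reduce
from operator import xor


def count_n_model(bits, n, value=1):
    """Lowest index i such that bits[:i].count(value) == n (assumed semantics)."""
    if n == 0:
        return 0
    seen = 0
    for i, bit in enumerate(bits):
        if bit == value:
            seen += 1
            if seen == n:
                return i + 1
    raise ValueError("n exceeds the number of matching bits")


def parity_model(bits):
    """1 if the number of 1-bits is odd, else 0."""
    return sum(bits) % 2


def sum_indices_model(bits):
    """Sum of the indices of all 1-bits."""
    return sum(i for i, bit in enumerate(bits) if bit)


def xor_indices_model(bits):
    """XOR of the indices of all 1-bits (0 when no bit is set)."""
    return reduce(xor, (i for i, bit in enumerate(bits) if bit), 0)


bits = [int(c) for c in "11010110"]
print(count_n_model(bits, 2))      # 2
print(count_n_model(bits, 1, 0))   # 3
print(parity_model(bits))          # 1
print(sum_indices_model(bits))     # 15
print(xor_indices_model(bits))     # 1
```

The models are linear scans and only useful for checking expectations on small inputs.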

### Set Operations

Functions for performing set-like operations between bitarrays.

```python { .api }
def count_and(a: bitarray, b: bitarray) -> int:
    """
    Count bits set in both bitarrays (intersection).

    Args:
        a, b: Input bitarrays of equal length

    Returns:
        Number of bits set in both a and b
    """

def count_or(a: bitarray, b: bitarray) -> int:
    """
    Count bits set in either bitarray (union).

    Args:
        a, b: Input bitarrays of equal length

    Returns:
        Number of bits set in a or b (or both)
    """

def count_xor(a: bitarray, b: bitarray) -> int:
    """
    Count bits different between bitarrays (symmetric difference).

    Args:
        a, b: Input bitarrays of equal length

    Returns:
        Number of bits where a and b differ
    """

def any_and(a: bitarray, b: bitarray) -> bool:
    """
    Check if any bits are set in both bitarrays.

    Args:
        a, b: Input bitarrays of equal length

    Returns:
        True if intersection is non-empty
    """

def subset(a: bitarray, b: bitarray) -> bool:
    """
    Check if a is a subset of b (every bit set in a is also set in b).

    Args:
        a, b: Input bitarrays of equal length

    Returns:
        True if a is a subset of b
    """

def correspond_all(a: bitarray, b: bitarray) -> tuple:
    """
    Count the bit positions for each combination of values in a and b.

    Args:
        a, b: Input bitarrays of equal length

    Returns:
        Tuple of four counts, for ~a & ~b, ~a & b, a & ~b, and a & b
    """
```

**Usage Examples:**

```python
from bitarray import bitarray
from bitarray.util import count_and, count_or, count_xor, any_and, subset

a = bitarray('11010')
b = bitarray('10110')

# Set operations
and_count = count_and(a, b)  # 2 (bits set in both)
or_count = count_or(a, b)    # 4 (bits set in either)
xor_count = count_xor(a, b)  # 2 (bits different)

# Boolean tests
has_overlap = any_and(a, b)  # True (at least one position is set in both)
is_subset = subset(a, b)     # False (a[1] is set but b[1] is not)

# Check subset relationship
c = bitarray('10010')
d = bitarray('11011')
is_c_subset_d = subset(c, d)  # True (all 1s in c are also in d)
```
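
The set analogy can be checked against Python's built-in `set` type by treating each bitarray as the set of its 1-bit indices. The sketch below works on plain `'0'`/`'1'` strings so it runs without the library; `set_indices` and `correspond_all_model` are hypothetical helper names, and the tuple order in the latter is an assumption based on the upstream description of `correspond_all`.

```python
def set_indices(bits):
    """Set of positions at which the 0/1 string has a '1'."""
    return {i for i, c in enumerate(bits) if c == "1"}


def correspond_all_model(x, y):
    """Counts of positions for (~x & ~y, ~x & y, x & ~y, x & y), in that order."""
    counts = [0, 0, 0, 0]
    for p, q in zip(x, y):
        counts[2 * int(p) + int(q)] += 1
    return tuple(counts)


a, b = "11010", "10110"
sa, sb = set_indices(a), set_indices(b)

print(len(sa & sb))                # 2      like count_and
print(len(sa | sb))                # 4      like count_or
print(len(sa ^ sb))                # 2      like count_xor
print(bool(sa & sb))               # True   like any_and
print(sa <= sb)                    # False  like subset
print(correspond_all_model(a, b))  # (1, 1, 1, 2)
```

The four counts returned by `correspond_all_model` always add up to the common length of the two inputs.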

### Pretty Printing

Functions for displaying bitarrays in formatted, readable ways.

```python { .api }
def pprint(a: Any,
           stream: Optional[TextIO] = None,
           group: int = 8,
           indent: int = 4,
           width: int = 80) -> None:
    """
    Pretty-print bitarray with formatting options.

    Args:
        a: Bitarray or other object to print
        stream: Text output stream (sys.stdout if None)
        group: Number of bits per group
        indent: Number of spaces to indent each line
        width: Maximum line width
    """

def strip(a: bitarray, mode: str = 'right') -> bitarray:
    """
    Return a new bitarray with zeros stripped from the left, right, or both ends.

    Args:
        a: Input bitarray
        mode: Strip mode - 'left', 'right', or 'both'

    Returns:
        Stripped bitarray
    """
```

**Usage Examples:**

```python
from bitarray import bitarray
from bitarray.util import pprint, strip

# Long bitarray for demonstration
a = bitarray('0000110101100110010110110001100011010110100000')

# Pretty printing with grouping
pprint(a, group=4, width=40)

# Strip zeros
b = bitarray('0001101000')
stripped_right = strip(b)         # '0001101' (default right)
stripped_left = strip(b, 'left')  # '1101000'
stripped_both = strip(b, 'both')  # '1101'
```
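
As a rough illustration of what stripping and grouping do, the following sketch models both on plain `'0'`/`'1'` strings. `strip_model` and `group_bits` are hypothetical names; `group_bits` only shows the idea of splitting bits into fixed-size groups and does not reproduce the exact output layout of `pprint`.

```python
def strip_model(bits, mode="right"):
    """Remove zeros from the chosen end(s) of a 0/1 string."""
    if mode == "right":
        return bits.rstrip("0")
    if mode == "left":
        return bits.lstrip("0")
    if mode == "both":
        return bits.strip("0")
    raise ValueError("mode must be 'left', 'right', or 'both'")


def group_bits(bits, group=8, sep=" "):
    """Join consecutive chunks of `group` bits with a separator."""
    return sep.join(bits[i:i + group] for i in range(0, len(bits), group))


b = "0001101000"
print(strip_model(b))           # 0001101
print(strip_model(b, "left"))   # 1101000
print(strip_model(b, "both"))   # 1101
print(group_bits(b, group=4))   # 0001 1010 00
```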

### Data Processing

Functions for processing and manipulating bit sequences and intervals.

```python { .api }
def intervals(a: bitarray) -> Iterator:
    """
    Return iterator over all uninterrupted intervals of 1s and 0s.

    Args:
        a: Input bitarray

    Yields:
        Tuples (value, start, stop), in order, one per run of equal bits
    """

def byteswap(a: bytearray, n: int) -> None:
    """
    Reverse every n consecutive bytes of a in-place.

    Args:
        a: Writable byte buffer to modify (e.g. a bytearray)
        n: Number of bytes reversed as a unit (any positive integer, e.g. 2, 4, or 8)
    """
```

**Usage Examples:**

```python
from bitarray import bitarray
from bitarray.util import intervals, byteswap

# Find runs of consecutive equal bits
a = bitarray('0011101001110000111')
interval_list = list(intervals(a))
# [(0, 0, 2), (1, 2, 5), (0, 5, 6), (1, 6, 7),
#  (0, 7, 9), (1, 9, 12), (0, 12, 16), (1, 16, 19)]

# Keep only the runs of 1s as (start, stop) pairs
ones = [(start, stop) for value, start, stop in interval_list if value]
# [(2, 5), (6, 7), (9, 12), (16, 19)]

# Byte swapping for endianness conversion
data = bytearray(b'\x12\x34\x56\x78')
byteswap(data, 2)  # Swap 2-byte integers: b'\x34\x12\x78\x56'
```
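
The run-finding and byte-reversal logic can be sketched in pure Python. This is a model under stated assumptions, not the library's implementation: `intervals_model` assumes the upstream behavior of yielding `(value, start, stop)` for runs of both 0s and 1s, and `byteswap_model` assumes the buffer length must be a multiple of `n`. Both names are hypothetical.

```python
from itertools import groupby


def intervals_model(bits):
    """Yield (value, start, stop) for each run of equal characters in a 0/1 string."""
    start = 0
    for value, run in groupby(bits):
        stop = start + len(list(run))
        yield (int(value), start, stop)
        start = stop


def byteswap_model(buf, n):
    """Reverse every n consecutive bytes of a bytearray in-place."""
    if n <= 0 or len(buf) % n:
        raise ValueError("buffer length must be a positive multiple of n")
    for i in range(0, len(buf), n):
        buf[i:i + n] = buf[i:i + n][::-1]


runs = list(intervals_model("0011101001110000111"))
print(runs[:3])  # [(0, 0, 2), (1, 2, 5), (0, 5, 6)]

data = bytearray(b"\x12\x34\x56\x78")
byteswap_model(data, 2)
print(data.hex())  # 34127856
```

Because each run starts where the previous one stops, the `stop` of the last tuple equals the length of the input.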

### Serialization

Functions for serializing bitarrays to portable byte formats and deserializing them back.

```python { .api }
def serialize(a: bitarray) -> bytes:
    """
    Serialize bitarray to bytes with metadata.
    Preserves endianness and exact bit length.

    Args:
        a: Input bitarray

    Returns:
        Serialized bytes including metadata
    """

def deserialize(b: Union[bytes, bytearray]) -> bitarray:
    """
    Deserialize bytes back to bitarray.
    Restores original endianness and bit length.

    Args:
        b: Serialized bytes

    Returns:
        Restored bitarray
    """
```

**Usage Examples:**

```python
from bitarray import bitarray
from bitarray.util import serialize, deserialize

# Serialization preserves all properties
original = bitarray('1011010', 'little')
serialized = serialize(original)

# Deserialize restores exact bitarray
restored = deserialize(serialized)
print(restored.endian)       # 'little'
print(restored.to01())       # '1011010'
print(original == restored)  # True

# Useful for storage and transmission
with open('bitarray.dat', 'wb') as f:
    f.write(serialize(original))

with open('bitarray.dat', 'rb') as f:
    loaded = deserialize(f.read())
```
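
Raw bytes alone cannot say how many trailing bits are padding or which bit order was used, which is why a serialized form needs metadata. The sketch below is a toy format that stores both in a single header byte; it is an illustration of the idea only, and its byte layout should not be assumed to match what `serialize` produces. The `toy_*` names are hypothetical.

```python
def toy_serialize(bits, endian="big"):
    """Pack a 0/1 string into bytes, prefixed by one header byte."""
    pad = -len(bits) % 8                      # zero bits added to fill the last byte
    padded = bits + "0" * pad
    body = bytearray()
    for i in range(0, len(padded), 8):
        chunk = padded[i:i + 8]
        if endian == "little":
            chunk = chunk[::-1]               # first bit goes to the least significant position
        body.append(int(chunk, 2))
    header = (0x10 if endian == "big" else 0x00) | pad
    return bytes([header]) + bytes(body)


def toy_deserialize(data):
    """Inverse of toy_serialize: return (bits, endian)."""
    header = data[0]
    endian = "big" if header & 0x10 else "little"
    pad = header & 0x07
    chunks = []
    for byte in data[1:]:
        chunk = format(byte, "08b")
        chunks.append(chunk[::-1] if endian == "little" else chunk)
    joined = "".join(chunks)
    return joined[:len(joined) - pad], endian


blob = toy_serialize("1011010", "little")
print(len(blob))              # 2
print(toy_deserialize(blob))  # ('1011010', 'little')
```

Without the pad count in the header, the seven-bit input would come back as eight bits.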

### Compression

Functions for compressing sparse bitarrays and variable-length encoding.

```python { .api }
def sc_encode(a: bitarray) -> bytes:
    """
    Sparse compression encoding for bitarrays with few set bits.

    Args:
        a: Input bitarray

    Returns:
        Compressed bytes
    """

def sc_decode(stream: Iterable[int]) -> bitarray:
    """
    Sparse compression decoding.

    Args:
        stream: Iterable of integers (compressed data)

    Returns:
        Decompressed bitarray
    """

def vl_encode(a: bitarray) -> bytes:
    """
    Variable-length encoding.

    Args:
        a: Input bitarray

    Returns:
        Variable-length encoded bytes
    """

def vl_decode(stream: Iterable[int], endian: Optional[str] = None) -> bitarray:
    """
    Variable-length decoding.

    Args:
        stream: Iterable of integers (encoded data)
        endian: Bit-endianness for result

    Returns:
        Decoded bitarray
    """
```

**Usage Examples:**

```python
from bitarray import bitarray
from bitarray.util import sc_encode, sc_decode, vl_encode, vl_decode

# Sparse compression works well for arrays with few 1s
sparse = bitarray('000000100000001000000000100000')
compressed = sc_encode(sparse)
decompressed = sc_decode(compressed)

print(len(sparse.tobytes()))   # Original size in bytes
print(len(compressed))         # Encoded size (savings show up on large sparse arrays, not tiny ones)
print(sparse == decompressed)  # True

# Variable-length encoding round trip
dense = bitarray('11010110' * 100)  # Dense bitarray
vl_encoded = vl_encode(dense)
vl_decoded = vl_decode(vl_encoded)
print(dense == vl_decoded)  # True
```
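
The intuition behind sparse encoding is that a bitarray with few set bits is fully described by its length and the positions of its 1s. The sketch below shows that idea on plain `'0'`/`'1'` strings; it is a toy representation with hypothetical names and is not the byte format used by `sc_encode`.

```python
def toy_sparse_encode(bits):
    """Describe a 0/1 string by its length and the positions of its 1s."""
    return len(bits), [i for i, c in enumerate(bits) if c == "1"]


def toy_sparse_decode(length, positions):
    """Rebuild the 0/1 string from its length and 1-bit positions."""
    bits = ["0"] * length
    for i in positions:
        bits[i] = "1"
    return "".join(bits)


sparse = "000000100000001000000000100000"
length, positions = toy_sparse_encode(sparse)
print(length, positions)                               # 30 [6, 14, 24]
print(toy_sparse_decode(length, positions) == sparse)  # True
```

The position list grows with the number of set bits rather than with the total length, so a representation like this only pays off when 1s are rare.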

### Advanced Analysis

Additional functions for complex bitarray analysis and manipulation.

```python { .api }
def _ssqi(a: bitarray, i: int) -> int:
    """Internal helper; not part of the public API (advanced use only)."""
```

**Note**: Functions with a leading underscore, such as `_ssqi`, are internal implementation details and should generally not be used directly in application code.

## Error Handling

Utility functions can raise various exceptions:

- `TypeError`: Invalid argument types
- `ValueError`: Invalid argument values (e.g., negative lengths, invalid bit values)
- `IndexError`: Invalid indices or ranges
- `NotImplementedError`: Functions requiring newer Python versions
- `MemoryError`: Insufficient memory for large operations

## Performance Considerations

Most utility functions are implemented in C for performance:

- Set operations (`count_and`, `count_or`, etc.) count bits without building an intermediate bitarray
- Conversion functions use efficient algorithms
- Serialization preserves the exact bit length and endianness with little space overhead
- Compression functions are designed for specific use cases (sparse vs dense data)

The utility module extends bitarray functionality while keeping performance in line with the core library.