# HEIC Format Support

Support for the HEIC/HEIF image format, with specialized parsing of the ISO base media file format. The HEIC processing system walks the box structure of a HEIC file to locate and extract EXIF metadata.

## Capabilities

### HEIC EXIF Finder

Primary class for locating EXIF data within HEIC files using ISO base media file format parsing.

```python { .api }
class HEICExifFinder:
    """
    Find and extract EXIF data from HEIC files.

    Handles the ISO base media file format used in HEIC/HEIF images,
    parsing the box structure to locate embedded EXIF metadata.
    """

    def __init__(self, file_handle):
        """
        Initialize HEIC EXIF finder with file handle.

        Parameters:
        - file_handle: file object, open binary file handle to HEIC file
        """

    def find_exif(self):
        """
        Locate EXIF data within HEIC file structure.

        Parses the HEIC file structure to find the EXIF metadata location
        by analyzing 'ftyp', 'meta', 'iinf', and 'iloc' boxes.

        Returns:
            tuple: (offset, endian) where offset is the position of EXIF data
                   and endian is the byte order indicator (bytes: b'I' or b'M')

        Raises:
            WrongBox: If HEIC box structure is unexpected
            NoParser: If no parser is available for a box type
            BoxVersion: If box version is unsupported
            BadSize: If box size information is invalid
        """
```

### HEIC Data Reading Methods

Methods for reading binary data from HEIC files with proper byte order handling.

```python { .api }
def get(self, nbytes):
    """
    Read specified number of bytes from file.

    Parameters:
    - nbytes: int, number of bytes to read

    Returns:
        bytes: Raw binary data

    Raises:
        EOFError: If end of file reached
        BadSize: If requested bytes not available
    """

def get16(self):
    """Read 16-bit big-endian integer."""

def get32(self):
    """Read 32-bit big-endian integer."""

def get64(self):
    """Read 64-bit big-endian integer."""

def get_int(self, size):
    """
    Read integer of variable size.

    Parameters:
    - size: int, size in bytes (0, 2, 4, or 8)

    Returns:
        int: Integer value
    """

def get_string(self):
    """Read null-terminated string."""

def get_int4x2(self):
    """
    Extract two 4-bit integers from a single byte.

    Returns:
        tuple: (high_4_bits, low_4_bits)
    """
```
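
As a rough illustration of what these readers do, the sketch below rebuilds the same ideas on top of `struct` and an in-memory stream. `ByteReader` is a hypothetical stand-in written for this example, not part of exifread; the real methods live on `HEICExifFinder` and may differ in detail (for instance in which exception they raise on a short read).

```python
import io
import struct


class ByteReader:
    """Hypothetical stand-in for the reading helpers; not exifread's code."""

    def __init__(self, stream):
        self.stream = stream

    def get(self, nbytes):
        data = self.stream.read(nbytes)
        if len(data) != nbytes:
            # Assumption: a short read is reported as EOFError in this sketch.
            raise EOFError(f"wanted {nbytes} bytes, got {len(data)}")
        return data

    def get16(self):
        # ">" selects big-endian, "H" an unsigned 16-bit integer.
        return struct.unpack(">H", self.get(2))[0]

    def get32(self):
        return struct.unpack(">L", self.get(4))[0]

    def get64(self):
        return struct.unpack(">Q", self.get(8))[0]

    def get_int4x2(self):
        # Split one byte into its high and low 4-bit halves.
        byte = self.get(1)[0]
        return byte >> 4, byte & 0x0F


reader = ByteReader(io.BytesIO(bytes([0x00, 0x18, 0x00, 0x00, 0x01, 0x00, 0x44])))
first = reader.get16()         # bytes 00 18
second = reader.get32()        # bytes 00 00 01 00
nibbles = reader.get_int4x2()  # byte 0x44
```

Here `first` is 24, `second` is 256, and `nibbles` is `(4, 4)`.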

### HEIC Box Parsing Methods

Methods for parsing specific HEIC box types and navigating the box structure.

```python { .api }
def get_full(self, box):
    """
    Handle 'full' variant ISO boxes with version and flags.

    Parameters:
    - box: Box, box object to process

    Adds attributes:
    - box.version: int, box version
    - box.flags: int, box flags
    """

def skip(self, box):
    """
    Skip to the end of a box.

    Parameters:
    - box: Box, box to skip
    """

def expect_parse(self, name):
    """
    Search for and parse a specific box type.

    Parameters:
    - name: str, four-character box type to find

    Returns:
        Box: Found and parsed box

    Raises:
        WrongBox: If expected box not found
    """

def get_parser(self, box):
    """
    Get appropriate parser method for a box type.

    Parameters:
    - box: Box, box to get parser for

    Returns:
        callable: Parser method for the box type

    Raises:
        NoParser: If no parser available for box type
    """

def parse_box(self, box):
    """
    Generic box parsing with error handling.

    Parameters:
    - box: Box, box to parse

    Returns:
        Box: Parsed box with added attributes
    """

def next_box(self):
    """
    Read the next box from the file stream.

    Returns:
        Box: Next box with size, name, position info

    Raises:
        BadSize: If box size is invalid
    """
```
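
One way to picture `get_parser()` and `parse_box()` is as a name-based lookup: the box type selects a parser method, and a missing parser becomes `NoParser`. The sketch below shows that pattern with hypothetical `MiniFinder` and `MiniBox` classes and a local `NoParser`; it is an assumption about the general approach, not a copy of how exifread implements the lookup.

```python
class NoParser(Exception):
    """Local stand-in for exifread.heic.NoParser."""


class MiniBox:
    """Hypothetical box holding only its four-character type."""

    def __init__(self, name):
        self.name = name


class MiniFinder:
    """Hypothetical finder that dispatches on the box type."""

    def parse_ftyp(self, box):
        # A real parser would read fields from the file; this one only tags the box.
        box.parsed_by = "parse_ftyp"

    def get_parser(self, box):
        # Look up a method named parse_<type>; fail loudly when none exists.
        method = getattr(self, "parse_" + box.name, None)
        if method is None:
            raise NoParser(box.name)
        return method

    def parse_box(self, box):
        self.get_parser(box)(box)
        return box


finder = MiniFinder()
parsed = finder.parse_box(MiniBox("ftyp"))

try:
    finder.parse_box(MiniBox("mdat"))
    unsupported = None
except NoParser as err:
    unsupported = str(err)
```

After this runs, `parsed.parsed_by` is `"parse_ftyp"` and `unsupported` is `"mdat"`, since the sketch defines no `parse_mdat`.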

### HEIC Box Type Parsers

Specialized parsers for different HEIC box types in the ISO base media file format.

```python { .api }
def parse_ftyp(self, box):
    """
    Parse file type box.

    Parameters:
    - box: Box, ftyp box to parse

    Adds attributes:
    - box.major_brand: bytes, major brand identifier
    - box.minor_version: int, minor version
    - box.compat: list, compatible brands
    """

def parse_meta(self, box):
    """
    Parse metadata container box.

    Parameters:
    - box: Box, meta box to parse

    Adds attributes:
    - box.subs: dict, sub-boxes by type
    """

def parse_infe(self, box):
    """
    Parse item information entry box.

    Parameters:
    - box: Box, infe box to parse

    Adds attributes:
    - box.item_protection_index: int, protection index
    - box.item_type: bytes, item type (e.g. b'Exif')
    - box.item_name: str, item name (version >= 2)
    """

def parse_iinf(self, box):
    """
    Parse item information box.

    Parameters:
    - box: Box, iinf box to parse

    Adds attributes:
    - box.exif_infe: Box, reference to EXIF infe box if found
    """

def parse_iloc(self, box):
    """
    Parse item location box.

    Parameters:
    - box: Box, iloc box to parse

    Adds attributes:
    - box.offset_size: int, size of offset fields
    - box.length_size: int, size of length fields
    - box.base_offset_size: int, size of base offset field
    - box.index_size: int, size of index field
    - box.item_count: int, number of items
    - box.locs: dict, location mappings by item_id
    - box.base_offset: int, base offset value
    """
```

### Box Structure Class

Internal class representing HEIC box structures.

```python { .api }
class Box:
    """
    Represents an ISO base media file format box.

    Used internally by HEICExifFinder to track box metadata
    during HEIC file parsing.
    """

    def __init__(self, name):
        """
        Initialize box with name.

        Parameters:
        - name: str, four-character box type identifier
        """

    name: str
    """Four-character box type identifier."""

    size: int
    """Size of box data in bytes."""

    after: int
    """File position after this box."""

    pos: int
    """File position of box data start."""

    item_id: int
    """Item ID for boxes that reference items."""

    # Dynamically added attributes by parsing methods:
    version: int
    """Box version (added by get_full())."""

    flags: int
    """Box flags (added by get_full())."""

    major_brand: bytes
    """File type major brand (added by parse_ftyp())."""

    minor_version: int
    """File type minor version (added by parse_ftyp())."""

    compat: list
    """Compatible brands list (added by parse_ftyp())."""

    subs: dict
    """Sub-boxes dictionary (added by parse_meta())."""

    item_protection_index: int
    """Item protection index (added by parse_infe())."""

    item_type: bytes
    """Item type (added by parse_infe())."""

    item_name: str
    """Item name (added by parse_infe())."""

    exif_infe: 'Box'
    """Reference to EXIF infe box (added by parse_iinf())."""

    offset_size: int
    """Size of offset fields (added by parse_iloc())."""

    length_size: int
    """Size of length fields (added by parse_iloc())."""

    base_offset_size: int
    """Size of base offset field (added by parse_iloc())."""

    index_size: int
    """Size of index field (added by parse_iloc())."""

    item_count: int
    """Number of items (added by parse_iloc())."""

    locs: dict
    """Location mappings by item_id (added by parse_iloc())."""

    base_offset: int
    """Base offset value (added by parse_iloc())."""
```

### HEIC Exception Classes

Specialized exceptions for HEIC format processing errors.

```python { .api }
class WrongBox(Exception):
    """
    Raised when HEIC box structure is unexpected.

    This exception occurs during HEIC file parsing when the ISO base
    media file format box structure does not match expectations for
    standard HEIC files.
    """

class NoParser(Exception):
    """
    Raised when no parser is available for HEIC box type.

    This exception occurs when encountering unsupported or unknown
    box types in HEIC files that cannot be processed by the current
    implementation.
    """

class BoxVersion(Exception):
    """
    Raised when HEIC box version is unsupported.

    This exception occurs when the version field in HEIC box headers
    indicates a format version that is not supported by the parser.
    """

class BadSize(Exception):
    """
    Raised when HEIC box size information is invalid.

    This exception occurs when box size fields contain invalid or
    inconsistent length information that would cause parsing errors.
    """
```

## Usage Examples

### Basic HEIC Processing

```python
import exifread
from exifread.heic import WrongBox, NoParser, BoxVersion, BadSize

def process_heic_file(filename):
    """Process HEIC file with comprehensive error handling."""
    try:
        with open(filename, 'rb') as f:
            # Process using standard interface
            tags = exifread.process_file(f)
            return tags, "success"

    except WrongBox as e:
        return None, f"HEIC box structure error: {e}"
    except NoParser as e:
        return None, f"Unsupported HEIC box type: {e}"
    except BoxVersion as e:
        return None, f"Unsupported HEIC version: {e}"
    except BadSize as e:
        return None, f"Invalid HEIC box size: {e}"
    except exifread.ExifNotFound as e:
        return None, f"No EXIF in HEIC file: {e}"
    except Exception as e:
        return None, f"General HEIC processing error: {e}"

# Usage
tags, status = process_heic_file('photo.heic')
# Compare against None: an empty tag dict still means the file was processed
if tags is not None:
    print(f"Successfully extracted {len(tags)} tags from HEIC file")

    # Access standard EXIF data
    if 'EXIF DateTimeOriginal' in tags:
        print(f"Photo taken: {tags['EXIF DateTimeOriginal']}")
    if 'Image Make' in tags:
        print(f"Device: {tags['Image Make']}")
else:
    print(f"HEIC processing failed: {status}")
```

### Advanced HEIC Analysis

```python
import exifread
from exifread.heic import HEICExifFinder

def analyze_heic_structure(filename):
    """Analyze HEIC file structure for debugging."""
    try:
        with open(filename, 'rb') as f:
            finder = HEICExifFinder(f)

            # This will parse the HEIC structure
            offset, endian = finder.find_exif()

            print("HEIC analysis successful:")
            print(f"  EXIF data offset: {offset}")
            print(f"  Byte order: {endian}")

            # Now process normally
            f.seek(0)  # Reset file position
            tags = exifread.process_file(f)

            return tags

    except Exception as e:
        print(f"HEIC structure analysis failed: {e}")
        return None

# Usage
tags = analyze_heic_structure('photo.heic')
```

### Batch HEIC Processing

```python
import os

import exifread
from exifread.heic import WrongBox, NoParser, BoxVersion, BadSize

def batch_process_heic_files(directory):
    """Process all HEIC files in a directory."""
    results = {
        'success': [],
        'heic_errors': [],
        'no_exif': [],
        'other_errors': []
    }

    for filename in os.listdir(directory):
        if not filename.lower().endswith('.heic'):
            continue

        filepath = os.path.join(directory, filename)

        try:
            with open(filepath, 'rb') as f:
                tags = exifread.process_file(f)
                results['success'].append((filename, len(tags)))

        except (WrongBox, NoParser, BoxVersion, BadSize) as e:
            results['heic_errors'].append((filename, str(e)))

        except exifread.ExifNotFound:
            results['no_exif'].append(filename)

        except Exception as e:
            results['other_errors'].append((filename, str(e)))

    # Print summary
    print("HEIC Processing Summary:")
    print(f"  Successfully processed: {len(results['success'])}")
    print(f"  HEIC format errors: {len(results['heic_errors'])}")
    print(f"  No EXIF data: {len(results['no_exif'])}")
    print(f"  Other errors: {len(results['other_errors'])}")

    return results

# Usage
results = batch_process_heic_files('/path/to/heic/photos')
```

### HEIC Format Validation

```python
from exifread.heic import HEICExifFinder, WrongBox, NoParser, BoxVersion, BadSize

def validate_heic_format(filename):
    """Validate that a file is a proper HEIC format."""
    try:
        with open(filename, 'rb') as f:
            finder = HEICExifFinder(f)

            # Try to parse the basic structure
            box = finder.next_box()
            if box.name != 'ftyp':
                return False, "Missing ftyp box"

            finder.parse_ftyp(box)
            if box.major_brand != b'heic':
                return False, f"Not HEIC format: {box.major_brand}"

            return True, "Valid HEIC format"

    except (WrongBox, NoParser, BoxVersion, BadSize) as e:
        return False, f"HEIC structure error: {e}"
    except Exception as e:
        return False, f"Invalid HEIC: {e}"

# Usage
is_valid, message = validate_heic_format('photo.heic')
print(f"HEIC validation: {message}")
```

### Technical Implementation Details

The HEIC processing implementation handles several complex aspects of the ISO base media file format:

#### Box Size Handling
- Supports both 32-bit and 64-bit box sizes (when size == 1, reads 64-bit extended size)
- Validates box sizes and raises `BadSize` when they are invalid
- Handles variable-length fields based on version
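
The size rule in the first bullet can be sketched as follows. This is a minimal illustration that assumes the standard ISO base media header layout (a 32-bit size that counts the header itself, then a four-character type); `read_box_header` is a hypothetical helper, not exifread's `next_box()`, and it raises `ValueError` where the library is documented to raise `BadSize`.

```python
import io
import struct


def read_box_header(stream):
    """Return (name, payload_size, payload_start) for the box at the current position."""
    start = stream.tell()
    header = stream.read(8)
    if len(header) != 8:
        raise EOFError("truncated box header")
    size, name = struct.unpack(">L4s", header)
    header_len = 8
    if size == 1:
        # A 32-bit size of 1 means a 64-bit extended size follows the type.
        extended = stream.read(8)
        if len(extended) != 8:
            raise EOFError("truncated extended size")
        (size,) = struct.unpack(">Q", extended)
        header_len = 16
    if size < header_len:
        # Also rejects size == 0 ("box runs to end of file"), which this sketch skips.
        raise ValueError(f"bad size {size} for box {name!r}")
    return name.decode("ascii"), size - header_len, start + header_len


# A 12-byte 'ftyp' box with a 32-bit size, and a 20-byte 'mdat' box with a 64-bit size.
small = io.BytesIO(struct.pack(">L4s", 12, b"ftyp") + b"heic")
large = io.BytesIO(struct.pack(">L4sQ", 1, b"mdat", 20) + b"\x00" * 4)
small_header = read_box_header(small)
large_header = read_box_header(large)
```

Both boxes carry a 4-byte payload; the only difference is that the payload of the second starts at offset 16 instead of 8 because of the extra size field.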

#### Version-Dependent Parsing
Different box versions require different parsing approaches:
- **infe box versions 0-3**: Different field layouts and required fields
- **iloc box versions 0-2**: Variable field sizes and extent handling
- **meta box**: Uses 'full' box format with version and flags
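
To make the 'full' box format concrete: the first four payload bytes hold an 8-bit version followed by 24 bits of flags. The sketch below splits that word and applies a version gate; `split_full_box_header` and `require_version` are hypothetical helpers, and `ValueError` stands in for the library's `BoxVersion`.

```python
import struct


def split_full_box_header(raw4):
    """Split the 4-byte full-box prefix into (version, flags)."""
    (word,) = struct.unpack(">L", raw4)
    return word >> 24, word & 0xFFFFFF


def require_version(box_name, version, supported):
    """Reject versions the caller does not know how to parse."""
    if version not in supported:
        raise ValueError(f"{box_name}: unsupported version {version}")
    return version


# Version 2 with flags 0x000001.
version, flags = split_full_box_header(bytes([0x02, 0x00, 0x00, 0x01]))
accepted = require_version("infe", version, supported=range(0, 4))
```

With this input `version` is 2 and `flags` is 1, and the gate accepts it because 2 falls inside the 0-3 range listed above for `infe`.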

#### Extent-Based Data Location
The `iloc` box locates item data through extents:
- An item may be split across multiple extents
- Base offsets are combined with extent offsets, using variable field sizes
- Item IDs link entries in the `iinf` box to entries in the `iloc` box
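
The extent lookup can be pictured with a small sketch. It assumes `locs` maps each item ID to a list of `(extent_offset, extent_length)` pairs and that each extent offset is relative to `base_offset`; the exact shape of `box.locs` is not specified here, so treat the data layout and both helper functions as hypothetical.

```python
import io


def resolve_extents(locs, item_id, base_offset=0):
    """Return absolute (offset, length) pairs for one item."""
    if item_id not in locs:
        raise KeyError(f"item {item_id} has no location entry")
    return [(base_offset + offset, length) for offset, length in locs[item_id]]


def read_item(stream, extents):
    """Concatenate the bytes of every extent, in order."""
    chunks = []
    for offset, length in extents:
        stream.seek(offset)
        chunks.append(stream.read(length))
    return b"".join(chunks)


# Item 7 is split in two: 4 bytes at absolute offset 4, then 2 bytes at offset 10.
data = io.BytesIO(b"xxxxABCDyyEF")
locs = {7: [(2, 4), (8, 2)]}
extents = resolve_extents(locs, item_id=7, base_offset=2)
payload = read_item(data, extents)
```

The item ID used for the lookup would come from the `infe` entry whose `item_type` is `b'Exif'`, which is how the `iinf` and `iloc` boxes are tied together.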

#### Endian Detection
The EXIF byte order is detected from the first byte of the EXIF payload:
- Returns a `bytes` object (`b'I'` or `b'M'`), not a `str`
- Handles both APP1 marker format and direct EXIF format
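
As a sketch of the idea, a TIFF header starts with `II` (little-endian) or `MM` (big-endian), followed by the number 42 stored in that byte order. The helper below is hypothetical and operates on a bytes object rather than on a file handle as `find_exif()` does.

```python
import struct

# Map the indicator byte to the matching struct byte-order prefix.
STRUCT_PREFIX = {b"I": "<", b"M": ">"}


def detect_endian(tiff_header):
    """Return b'I' or b'M' for a TIFF header, checking the magic number 42."""
    marker = tiff_header[:2]
    if marker not in (b"II", b"MM"):
        raise ValueError(f"not a TIFF header: {marker!r}")
    endian = marker[:1]  # slicing keeps this a bytes object, not an int
    (magic,) = struct.unpack(STRUCT_PREFIX[endian] + "H", tiff_header[2:4])
    if magic != 42:
        raise ValueError(f"unexpected TIFF magic number: {magic}")
    return endian


little = detect_endian(b"II*\x00\x08\x00\x00\x00")
big = detect_endian(b"MM\x00*\x00\x00\x00\x08")
```

Note that `marker[:1]` is used instead of `marker[0]`: indexing a bytes object yields an integer, while slicing yields the one-byte `bytes` value described above.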

#### Error Handling Strategy
- `WrongBox`: Structural format violations
- `NoParser`: Unsupported box types encountered
- `BoxVersion`: Version fields indicate unsupported format variants
- `BadSize`: Size fields that are invalid or inconsistent