# Image Processing

Image filtering, geometric transformations, morphological operations, and computer vision preprocessing functions for image analysis and manipulation.

## Capabilities

### Image I/O Functions

Core functions for loading and saving images in various formats with automatic format detection.

```python { .api }
def load_rgb_image(filename: str):
    """
    Load RGB image from file as numpy array.

    Args:
        filename: Path to image file

    Returns:
        RGB image as numpy array with shape (height, width, 3)
    """

def load_rgb_alpha_image(filename: str):
    """
    Load RGBA image with alpha channel.

    Args:
        filename: Path to image file with alpha channel

    Returns:
        RGBA image as numpy array with shape (height, width, 4)
    """

def load_grayscale_image(filename: str):
    """
    Load image as 8-bit grayscale.

    Args:
        filename: Path to image file

    Returns:
        Grayscale image as numpy array
    """

def save_image(img, filename: str, quality: int = 75):
    """
    Save image with automatic format detection from extension.

    Args:
        img: Input image array
        filename: Output filename (supports .bmp, .png, .jpg, .jpeg, .webp, .jxl, .dng)
        quality: JPEG quality (1-100, only for JPEG files)
    """

def save_png(img, filename: str):
    """Save image as PNG format."""

def save_jpeg(img, filename: str):
    """Save image as JPEG format."""

def save_bmp(img, filename: str):
    """Save image as BMP format."""

def save_dng(img, filename: str):
    """Save image as DNG format."""

def save_webp(img, filename: str):
    """Save image as WebP format."""

def save_jxl(img, filename: str):
    """Save image as JPEG XL format."""
```

**Usage Example:**
```python
import dlib

# Load images in different formats
rgb_img = dlib.load_rgb_image("photo.jpg")
rgba_img = dlib.load_rgb_alpha_image("logo.png")
gray_img = dlib.load_grayscale_image("document.tiff")

print(f"RGB image shape: {rgb_img.shape}")
print(f"RGBA image shape: {rgba_img.shape}")
print(f"Grayscale image shape: {gray_img.shape}")

# Save in different formats
dlib.save_image(rgb_img, "output.png")  # Auto-detect PNG
dlib.save_image(rgb_img, "output.jpg", quality=95)  # High quality JPEG
dlib.save_png(gray_img, "grayscale.png")
dlib.save_jpeg(rgb_img, "compressed.jpg")
```
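
The extension-based dispatch that `save_image` is described as performing can be pictured roughly as follows. This is a dependency-free sketch rather than dlib's implementation: the `SAVERS_BY_EXTENSION` table and the `pick_saver` helper are hypothetical names introduced for illustration, and only the extensions and `save_*` function names come from the listing above.

```python
import os

# Hypothetical lookup table: extension -> name of the save_* function listed above.
SAVERS_BY_EXTENSION = {
    ".bmp": "save_bmp",
    ".png": "save_png",
    ".jpg": "save_jpeg",
    ".jpeg": "save_jpeg",
    ".webp": "save_webp",
    ".jxl": "save_jxl",
    ".dng": "save_dng",
}


def pick_saver(filename: str) -> str:
    """Return the name of the save_* function matching the file extension."""
    ext = os.path.splitext(filename)[1].lower()  # case-insensitive match
    try:
        return SAVERS_BY_EXTENSION[ext]
    except KeyError:
        raise ValueError(f"Unsupported image extension: {ext!r}") from None


print(pick_saver("output.PNG"))
print(pick_saver("photo.jpeg"))
```

Failing loudly on an unknown extension, as this sketch does, is one reasonable design; how the library itself reacts to unsupported extensions is not covered here.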

### Basic Image Types

Core image representation and pixel types for image processing operations.

```python { .api }
class rgb_pixel:
    """RGB color pixel representation."""

    def __init__(self, red: int, green: int, blue: int):
        """
        Create RGB pixel.

        Args:
            red: Red channel value (0-255)
            green: Green channel value (0-255)
            blue: Blue channel value (0-255)
        """

    red: int    # Red channel (0-255)
    green: int  # Green channel (0-255)
    blue: int   # Blue channel (0-255)
```

### Image Gradients and Derivatives

Gradient computation tools for edge detection and image analysis.

```python { .api }
class image_gradients:
    """Gradient computation tool for images."""

    def __init__(self, scale: int = 1):
        """
        Initialize gradient computer.

        Args:
            scale: Scale parameter controlling filter window size
        """

    def get_scale(self) -> int:
        """Get current scale parameter."""

    def gradient_x(self, img):
        """
        Compute X gradient (first derivative in X direction).

        Args:
            img: Input image

        Returns:
            X gradient image
        """

    def gradient_y(self, img):
        """
        Compute Y gradient (first derivative in Y direction).

        Args:
            img: Input image

        Returns:
            Y gradient image
        """

    def gradient_xx(self, img):
        """
        Compute XX gradient (second derivative in X direction).

        Args:
            img: Input image

        Returns:
            XX gradient image
        """

    def gradient_xy(self, img):
        """
        Compute XY gradient (mixed second derivative).

        Args:
            img: Input image

        Returns:
            XY gradient image
        """

    def gradient_yy(self, img):
        """
        Compute YY gradient (second derivative in Y direction).

        Args:
            img: Input image

        Returns:
            YY gradient image
        """

    def get_x_filter(self):
        """Get X gradient filter kernel."""

    def get_y_filter(self):
        """Get Y gradient filter kernel."""
```

**Usage Example:**
```python
import dlib
import cv2
import numpy as np

# Load image as 8-bit grayscale
img = cv2.imread("image.jpg", cv2.IMREAD_GRAYSCALE)

# Create gradient computer
gradients = dlib.image_gradients(scale=2)

# Compute gradients
# (some dlib versions may return a (gradient_image, valid_rect) tuple here;
# if so, unpack the gradient image before the arithmetic below)
grad_x = gradients.gradient_x(img)
grad_y = gradients.gradient_y(img)

# Compute magnitude
magnitude = np.sqrt(grad_x**2 + grad_y**2)
```
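
To make the magnitude computation above concrete without needing dlib or OpenCV installed, here is a small sketch that swaps in plain central differences for the library's gradient filters. The functions `central_gradients` and `magnitude` are hypothetical helpers, nested lists stand in for numpy arrays, and the filters returned by `get_x_filter` / `get_y_filter` will generally differ from this simple stencil.

```python
import math


def central_gradients(img):
    """Return (gx, gy) from central differences; border pixels stay at 0."""
    rows, cols = len(img), len(img[0])
    gx = [[0.0] * cols for _ in range(rows)]
    gy = [[0.0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx[r][c] = (img[r][c + 1] - img[r][c - 1]) / 2.0
            gy[r][c] = (img[r + 1][c] - img[r - 1][c]) / 2.0
    return gx, gy


def magnitude(gx, gy):
    """Per-pixel Euclidean norm of the gradient vector."""
    return [
        [math.hypot(x, y) for x, y in zip(row_x, row_y)]
        for row_x, row_y in zip(gx, gy)
    ]


# Vertical step edge: dark on the left, bright on the right.
img = [[0, 0, 10, 10] for _ in range(4)]
gx, gy = central_gradients(img)
mag = magnitude(gx, gy)
print(mag[1])  # interior pixels next to the step respond, borders stay 0
```

Because the edge is vertical, only the X gradient is non-zero, so the magnitude equals the absolute X response.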

### Image Thresholding and Segmentation

Functions for image binarization and automatic threshold selection.

```python { .api }
def threshold_image(img):
    """
    Threshold image using automatic threshold selection.

    Args:
        img: Input grayscale image

    Returns:
        Binary image
    """

def threshold_image(img, threshold: float):
    """
    Threshold image using specified threshold.

    Args:
        img: Input grayscale image
        threshold: Threshold value

    Returns:
        Binary image
    """

def partition_pixels(img):
    """
    Find optimal thresholds for image segmentation.

    Args:
        img: Input grayscale image

    Returns:
        List of optimal threshold values
    """

def partition_pixels(img, num_thresholds: int):
    """
    Find specified number of optimal thresholds.

    Args:
        img: Input grayscale image
        num_thresholds: Number of thresholds to find

    Returns:
        List of optimal threshold values
    """
```
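
One common way to pick a threshold automatically is Otsu's method, which chooses the split that maximizes the between-class variance of the two pixel groups. The sketch below illustrates that idea in plain Python; it is an assumption that this resembles what `partition_pixels` does internally, and `otsu_threshold` is a hypothetical helper, not a dlib function.

```python
def otsu_threshold(pixels, levels=256):
    """Return threshold t maximizing between-class variance.

    Pixels with value >= t are treated as foreground.
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w_bg = 0    # number of pixels strictly below t
    sum_bg = 0  # intensity sum of those pixels
    for t in range(1, levels):
        w_bg += hist[t - 1]
        sum_bg += (t - 1) * hist[t - 1]
        w_fg = total - w_bg
        if w_bg == 0 or w_fg == 0:
            continue  # one class is empty, no valid split here
        mean_bg = sum_bg / w_bg
        mean_fg = (sum_all - sum_bg) / w_fg
        # Proportional to the between-class variance
        score = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


# Flattened toy image with a dark and a bright cluster.
pixels = [10, 12, 11, 200, 205, 198]
t = otsu_threshold(pixels)
binary = [255 if p >= t else 0 for p in pixels]
print(t, binary)
```

Any threshold between the two clusters separates them equally well; this sketch keeps the first one it finds.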

### Image Filtering and Enhancement

Filtering operations for noise reduction and enhancement.

```python { .api }
def gaussian_blur(img, sigma: float, max_size: int = 1000):
    """
    Apply Gaussian filtering to image.

    Args:
        img: Input image
        sigma: Standard deviation of Gaussian kernel
        max_size: Maximum kernel size limit

    Returns:
        Filtered image
    """
```

**Usage Example:**
```python
import dlib
import cv2

# Load image
img = cv2.imread("noisy_image.jpg", cv2.IMREAD_GRAYSCALE)

# Apply Gaussian blur
blurred = dlib.gaussian_blur(img, sigma=2.0)

# Automatic thresholding
binary = dlib.threshold_image(blurred)

# Find optimal thresholds
thresholds = dlib.partition_pixels(img, num_thresholds=3)
print(f"Optimal thresholds: {thresholds}")
```
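
To show how `sigma` and a size cap such as `max_size` interact, the sketch below builds a normalized 1-D Gaussian kernel in plain Python. The three-sigma radius and the odd-size adjustment are assumptions made for this illustration, and `gaussian_kernel_1d` is a hypothetical helper; dlib's own kernel sizing rule may differ.

```python
import math


def gaussian_kernel_1d(sigma: float, max_size: int = 1000):
    """Build a normalized 1-D Gaussian kernel, capped at max_size taps."""
    radius = max(1, math.ceil(3 * sigma))  # assumed 3-sigma support
    size = min(2 * radius + 1, max_size)
    if size % 2 == 0:
        size -= 1  # keep the kernel centered on a pixel
    half = size // 2
    weights = [math.exp(-(x * x) / (2 * sigma * sigma)) for x in range(-half, half + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1


kernel = gaussian_kernel_1d(2.0)
capped = gaussian_kernel_1d(2.0, max_size=5)
print(len(kernel), len(capped))
```

A 2-D Gaussian blur is separable, so applying such a kernel along rows and then along columns gives the same result as a full 2-D convolution at lower cost.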

### Morphological Operations

Binary image morphological processing and shape analysis.

```python { .api }
def skeleton(img):
    """
    Skeletonization of binary image.

    Args:
        img: Input binary image

    Returns:
        Skeletonized image
    """

def find_line_endpoints(img):
    """
    Find endpoints in binary line image.

    Args:
        img: Input binary image containing lines

    Returns:
        Image with endpoint locations marked
    """
```

### Connected Component Analysis

Connected component labeling and analysis functions.

```python { .api }
def label_connected_blobs(
    img,
    zero_pixels_are_background: bool = True,
    neighborhood_connectivity: int = 8,
    connected_if_both_not_zero: bool = False
):
    """
    Label connected components in binary image.

    Args:
        img: Input binary image
        zero_pixels_are_background: Whether zero pixels are background
        neighborhood_connectivity: 4 or 8 connectivity
        connected_if_both_not_zero: Alternative connectivity rule

    Returns:
        Labeled image with component IDs
    """

def label_connected_blobs_watershed(
    img,
    background_thresh: float,
    smoothing: float = 0.0
):
    """
    Watershed-based connected component segmentation.

    Args:
        img: Input grayscale image
        background_thresh: Background threshold value
        smoothing: Smoothing parameter for preprocessing

    Returns:
        Labeled image with component IDs
    """
```

**Usage Example:**
```python
import dlib
import cv2

# Load binary image
img = cv2.imread("binary_shapes.jpg", cv2.IMREAD_GRAYSCALE)

# Label connected components
labeled = dlib.label_connected_blobs(img, neighborhood_connectivity=8)

# Watershed segmentation
grayscale = cv2.imread("cells.jpg", cv2.IMREAD_GRAYSCALE)
watershed_labels = dlib.label_connected_blobs_watershed(
    grayscale,
    background_thresh=100,
    smoothing=1.0
)

# Skeletonization
skeleton_img = dlib.skeleton(img)
endpoints = dlib.find_line_endpoints(skeleton_img)
```
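
The effect of the `neighborhood_connectivity` setting is easiest to see on a tiny image. The following plain-Python flood-fill sketch labels non-zero pixels with 4- or 8-connectivity; `label_blobs` is a hypothetical stand-in written for illustration (returning the blob count alongside the labels is a choice of this sketch), not the algorithm dlib uses.

```python
from collections import deque


def label_blobs(img, connectivity=8):
    """Label non-zero pixels; zero pixels are background (label 0).

    Returns (labels, number_of_blobs).
    """
    if connectivity == 4:
        offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    elif connectivity == 8:
        offsets = [(dr, dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1) if (dr, dc) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")

    rows, cols = len(img), len(img[0])
    labels = [[0] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if img[r][c] == 0 or labels[r][c] != 0:
                continue  # background or already labeled
            count += 1
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one blob
                cr, cc = queue.popleft()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and img[nr][nc] != 0 and labels[nr][nc] == 0):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
    return labels, count


# Three pixels touching only at their corners.
diagonal = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
]
labels8, n8 = label_blobs(diagonal, connectivity=8)
labels4, n4 = label_blobs(diagonal, connectivity=4)
print(n8, n4)
```

With 8-connectivity the diagonal pixels merge into one blob, while 4-connectivity treats each as a separate component.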

### Image Conversion and Color Mapping

Functions for image format conversion and visualization.

```python { .api }
def convert_image(img, dtype):
    """
    Convert image pixel type.

    Args:
        img: Input image
        dtype: Target data type

    Returns:
        Converted image
    """

def as_grayscale(img):
    """
    Convert image to grayscale.

    Args:
        img: Input color image

    Returns:
        Grayscale image
    """

def jet(img):
    """
    Convert grayscale image to jet colormap.

    Args:
        img: Input grayscale image

    Returns:
        Color-mapped image using jet colormap
    """

def randomly_color_image(img):
    """
    Apply random color mapping to labeled image.

    Args:
        img: Input labeled image

    Returns:
        Randomly colored image for visualization
    """

def get_rect(img) -> rectangle:
    """
    Get bounding rectangle of image.

    Args:
        img: Input image

    Returns:
        Rectangle representing image bounds
    """
```

**Usage Example:**
```python
import dlib
import cv2

# Load color image
color_img = cv2.imread("color_image.jpg")

# Convert to grayscale
gray = dlib.as_grayscale(color_img)

# Apply jet colormap for visualization
jet_colored = dlib.jet(gray)

# Get image bounds
bounds = dlib.get_rect(gray)
print(f"Image size: {bounds.width()} x {bounds.height()}")

# Random coloring for labeled images
# (threshold to a binary image first; a boolean mask such as `gray > 128` may not be accepted)
binary = dlib.threshold_image(gray, 128)
labeled_img = dlib.label_connected_blobs(binary)
colored_labels = dlib.randomly_color_image(labeled_img)
```

### Edge Detection and Feature Processing

Edge detection and feature extraction tools based on gradient analysis.

```python { .api }
def sobel_edge_detector(img):
    """
    Sobel edge detection returning horizontal and vertical gradients.

    Args:
        img: Input grayscale image

    Returns:
        Tuple of (horizontal_gradient, vertical_gradient)
    """

def find_bright_lines(gradient_xx, gradient_xy, gradient_yy):
    """
    Detect bright lines using Hessian matrix analysis.

    Args:
        gradient_xx: Second derivative in X direction
        gradient_xy: Mixed second derivative
        gradient_yy: Second derivative in Y direction

    Returns:
        Image with detected bright lines
    """

def find_dark_lines(gradient_xx, gradient_xy, gradient_yy):
    """
    Detect dark lines using Hessian matrix analysis.

    Args:
        gradient_xx: Second derivative in X direction
        gradient_xy: Mixed second derivative
        gradient_yy: Second derivative in Y direction

    Returns:
        Image with detected dark lines
    """

def find_bright_keypoints(gradient_xx, gradient_xy, gradient_yy):
    """
    Detect bright blob-like features using Hessian analysis.

    Args:
        gradient_xx: Second derivative in X direction
        gradient_xy: Mixed second derivative
        gradient_yy: Second derivative in Y direction

    Returns:
        Image with detected bright keypoints
    """

def find_dark_keypoints(gradient_xx, gradient_xy, gradient_yy):
    """
    Detect dark blob-like features using Hessian analysis.

    Args:
        gradient_xx: Second derivative in X direction
        gradient_xy: Mixed second derivative
        gradient_yy: Second derivative in Y direction

    Returns:
        Image with detected dark keypoints
    """

def suppress_non_maximum_edges(horizontal_gradient, vertical_gradient):
    """
    Apply non-maximum suppression to edge gradients.

    Args:
        horizontal_gradient: Horizontal gradient image
        vertical_gradient: Vertical gradient image

    Returns:
        Suppressed edge image
    """

def find_peaks(img, non_max_suppression_radius: int, threshold: float = None):
    """
    Find local peaks with non-maximum suppression.

    Args:
        img: Input image
        non_max_suppression_radius: Radius for peak suppression
        threshold: Minimum peak value (auto-detected if None)

    Returns:
        List of peak locations as points
    """

def hysteresis_threshold(img, lower_threshold: float = None, upper_threshold: float = None):
    """
    Apply hysteresis thresholding for edge linking.

    Args:
        img: Input edge magnitude image
        lower_threshold: Lower threshold (auto-detected if None)
        upper_threshold: Upper threshold (auto-detected if None)

    Returns:
        Hysteresis thresholded binary image
    """
```
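
Hysteresis thresholding keeps every pixel at or above the upper threshold and then grows those seeds through neighboring pixels that are at or above the lower threshold. The plain-Python sketch below illustrates that edge-linking idea under the assumption of 8-connected growth; `hysteresis` is a hypothetical helper, and dlib's automatic threshold selection is not modeled.

```python
def hysteresis(img, lower, upper):
    """Return a 0/255 image: strong pixels plus weak pixels connected to them."""
    rows, cols = len(img), len(img[0])
    out = [[0] * cols for _ in range(rows)]

    # Seed with strong pixels (>= upper).
    stack = [(r, c) for r in range(rows) for c in range(cols) if img[r][c] >= upper]
    for r, c in stack:
        out[r][c] = 255

    # Grow seeds through 8-connected weak pixels (>= lower).
    while stack:
        r, c = stack.pop()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and out[nr][nc] == 0 and img[nr][nc] >= lower):
                    out[nr][nc] = 255
                    stack.append((nr, nc))
    return out


# One strong pixel, two weak pixels linked to it, a gap, then an isolated weak pixel.
edge_strength = [[9, 5, 5, 0, 5]]
result = hysteresis(edge_strength, lower=4, upper=8)
print(result)
```

The isolated weak pixel on the right is discarded because no chain of weak pixels connects it to a strong one.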

### Image Resizing and Transformation

Functions for geometric image transformations and resizing operations.

```python { .api }
def resize_image(img, rows: int, cols: int):
    """
    Resize image to specific dimensions using bilinear interpolation.

    Args:
        img: Input image
        rows: Target height
        cols: Target width

    Returns:
        Resized image
    """

def resize_image(img, scale: float):
    """
    Resize image by scale factor.

    Args:
        img: Input image
        scale: Scale factor (>1 for upsampling, <1 for downsampling)

    Returns:
        Resized image
    """

def transform_image(img, transform_function, rows: int, cols: int):
    """
    Apply arbitrary geometric transformation to image.

    Args:
        img: Input image
        transform_function: Function mapping output coordinates to input coordinates
        rows: Output image height
        cols: Output image width

    Returns:
        Transformed image
    """

def extract_image_4points(img, corners: list, rows: int, cols: int):
    """
    Extract rectified image patch from four corner points.

    Args:
        img: Input image
        corners: List of 4 corner points defining source quadrilateral
        rows: Output patch height
        cols: Output patch width

    Returns:
        Rectified image patch
    """

class chip_details:
    """Image chip extraction parameters."""

    def __init__(self, rect: rectangle, dims: tuple, angle: float = 0):
        """
        Initialize chip details.

        Args:
            rect: Source rectangle in image
            dims: Output dimensions (rows, cols)
            angle: Rotation angle in radians
        """

    @property
    def rect(self) -> rectangle:
        """Source rectangle."""

    @property
    def rows(self) -> int:
        """Output height."""

    @property
    def cols(self) -> int:
        """Output width."""

    @property
    def angle(self) -> float:
        """Rotation angle."""

def extract_image_chip(img, chip_location: chip_details):
    """
    Extract single image chip with geometric transformation.

    Args:
        img: Input image
        chip_location: Chip extraction parameters

    Returns:
        Extracted and transformed image chip
    """

def extract_image_chips(img, chip_locations: list):
    """
    Extract multiple image chips with transformations.

    Args:
        img: Input image
        chip_locations: List of chip_details objects

    Returns:
        List of extracted image chips
    """

def insert_image_chip(img, chip, chip_location: chip_details):
    """
    Insert transformed chip back into image.

    Args:
        img: Target image to modify
        chip: Chip image to insert
        chip_location: Insertion location and transformation
    """
```
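
Since `resize_image` is described as using bilinear interpolation, a small worked example may help. The sketch below resizes a nested-list image in plain Python; `resize_bilinear` is a hypothetical helper, and the corner-aligned coordinate mapping it uses is an assumption of this illustration, so pixel values from the library may differ slightly near the borders.

```python
def resize_bilinear(img, rows, cols):
    """Resize a 2-D list image to rows x cols with bilinear interpolation."""
    src_rows, src_cols = len(img), len(img[0])

    def src_coord(i, n_out, n_in):
        # Corner-aligned mapping: first and last output pixels hit the source corners.
        return 0.0 if n_out == 1 else i * (n_in - 1) / (n_out - 1)

    out = []
    for r in range(rows):
        y = src_coord(r, rows, src_rows)
        y0 = int(y)
        y1 = min(y0 + 1, src_rows - 1)
        fy = y - y0
        row = []
        for c in range(cols):
            x = src_coord(c, cols, src_cols)
            x0 = int(x)
            x1 = min(x0 + 1, src_cols - 1)
            fx = x - x0
            # Interpolate along X on the two bracketing rows, then along Y.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out


small = [
    [0, 10],
    [20, 30],
]
resized = resize_bilinear(small, 3, 3)
for row in resized:
    print(row)
```

The new center pixel is the average of the four source pixels, and the new edge midpoints are averages of the two nearest corners.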

### Histogram and Intensity Processing

Functions for histogram analysis and intensity transformations.

```python { .api }
def equalize_histogram(img):
    """
    Apply histogram equalization for contrast enhancement.

    Args:
        img: Input grayscale image

    Returns:
        Histogram-equalized image
    """

def get_histogram(img, hist_size: int):
    """
    Compute image histogram.

    Args:
        img: Input grayscale image
        hist_size: Number of histogram bins

    Returns:
        Histogram array
    """

def convert_image_scaled(img, dtype, threshold: float = 4):
    """
    Convert image type with automatic scaling.

    Args:
        img: Input image
        dtype: Target data type
        threshold: Scaling threshold parameter

    Returns:
        Converted and scaled image
    """
```
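
Histogram equalization remaps intensities through the normalized cumulative histogram so that the output uses the full intensity range. The plain-Python sketch below shows the textbook form of that mapping on a flattened list of pixels; `equalize` is a hypothetical helper, and the exact normalization used by `equalize_histogram` is not specified in this document.

```python
def equalize(pixels, levels=256):
    """Equalize a flat list of integer pixels in the range [0, levels)."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of pixel counts.
    cdf = []
    running = 0
    for h in hist:
        running += h
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(v for v in cdf if v > 0)  # CDF at the darkest occupied level
    if total == cdf_min:
        return list(pixels)  # constant image, nothing to stretch

    # Lookup table mapping each input level to its equalized output level.
    lut = [
        round((levels - 1) * max(0, v - cdf_min) / (total - cdf_min))
        for v in cdf
    ]
    return [lut[p] for p in pixels]


low_contrast = [50, 50, 100, 150, 150]
stretched = equalize(low_contrast)
print(stretched)
```

The darkest occupied level maps to 0 and the brightest to 255, with intermediate levels placed according to how many pixels lie at or below them.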

### Advanced Image Analysis

Higher-level image analysis and feature extraction functions.

```python { .api }
def max_point(img):
    """
    Find pixel location with maximum value.

    Args:
        img: Input image

    Returns:
        Point with maximum value
    """

def max_point_interpolated(img):
    """
    Find sub-pixel maximum location using interpolation.

    Args:
        img: Input image

    Returns:
        Sub-pixel point with interpolated maximum
    """

def zero_border_pixels(img, x_border_size: int, y_border_size: int):
    """
    Set image border pixels to zero.

    Args:
        img: Input image to modify in-place
        x_border_size: Horizontal border size
        y_border_size: Vertical border size
    """

def sub_image(img, rect: rectangle):
    """
    Extract rectangular sub-region from image.

    Args:
        img: Input image
        rect: Rectangle defining sub-region

    Returns:
        Sub-image as view or copy
    """

def tile_images(images: list):
    """
    Tile multiple images into a single grid image.

    Args:
        images: List of input images to tile

    Returns:
        Tiled image containing all input images
    """

def min_barrier_distance(img, iterations: int = 10, do_left_right_scans: bool = True):
    """
    Compute minimum barrier distance for salient object detection.

    Args:
        img: Input grayscale image
        iterations: Number of diffusion iterations
        do_left_right_scans: Enable bidirectional scanning

    Returns:
        Distance map highlighting salient regions
    """

def normalize_image_gradients(img1, img2):
    """
    Normalize gradient vectors between two images.

    Args:
        img1: First gradient component image
        img2: Second gradient component image

    Returns:
        Tuple of normalized gradient images
    """

def remove_incoherent_edge_pixels(line, horizontal_gradient, vertical_gradient, angle_threshold: float):
    """
    Remove edge pixels inconsistent with gradient direction.

    Args:
        line: Input line/edge image
        horizontal_gradient: X gradient image
        vertical_gradient: Y gradient image
        angle_threshold: Maximum angle deviation in radians

    Returns:
        Filtered edge image
    """

class pyramid_down:
    """Image pyramid downsampling class."""

    def __init__(self, N: int = 2):
        """
        Initialize pyramid with downsampling factor.

        Args:
            N: Downsampling factor
        """

    def __call__(self, img):
        """
        Apply pyramid downsampling to image.

        Args:
            img: Input image

        Returns:
            Downsampled image
        """

class hough_transform:
    """Hough transform for line detection."""

    def __init__(self, size: int):
        """
        Initialize Hough transform.

        Args:
            size: Size of Hough space
        """

    def __call__(self, img):
        """
        Apply Hough transform to detect lines.

        Args:
            img: Input binary edge image

        Returns:
            Hough space accumulator
        """

    def get_line(self, hough_point):
        """Get line from Hough space point."""

    def get_line_angle_in_degrees(self, hough_point) -> float:
        """Get line angle in degrees."""

    def get_line_properties(self, hough_point):
        """Get complete line properties."""

    def find_pixels_voting_for_lines(self, img, lines: list):
        """Find pixels contributing to detected lines."""

    def find_strong_hough_points(self, hough_img, max_lines: int):
        """Find strongest peaks in Hough space."""

def jitter_image(img, num_jitters: int = 1, disturb_colors: bool = False):
    """
    Generate jittered versions of image for data augmentation.

    Args:
        img: Input image
        num_jitters: Number of jittered versions to generate
        disturb_colors: Whether to apply color distortions

    Returns:
        List of jittered images
    """

def spatially_filter_image(img, filter_kernel):
    """
    Apply spatial filtering with custom kernel.

    Args:
        img: Input image
        filter_kernel: Convolution kernel

    Returns:
        Filtered image
    """

def spatially_filter_image_separable(img, row_filter, col_filter):
    """
    Apply separable spatial filtering.

    Args:
        img: Input image
        row_filter: Row-wise filter kernel
        col_filter: Column-wise filter kernel

    Returns:
        Filtered image
    """
```

**Complete Image Processing Pipeline Example:**
```python
import dlib
import cv2
import numpy as np

# Load and preprocess image
img = cv2.imread("document.jpg")
gray = dlib.as_grayscale(img)

# Noise reduction
denoised = dlib.gaussian_blur(gray, sigma=1.0)

# Edge detection using gradients
gradients = dlib.image_gradients(scale=1)
grad_x = gradients.gradient_x(denoised)
grad_y = gradients.gradient_y(denoised)
edges = np.sqrt(grad_x**2 + grad_y**2)

# Thresholding
binary = dlib.threshold_image(edges)

# Morphological processing
skeleton_img = dlib.skeleton(binary)
endpoints = dlib.find_line_endpoints(skeleton_img)

# Connected component analysis
labeled = dlib.label_connected_blobs(binary)
colored_result = dlib.randomly_color_image(labeled)

# Final visualization
jet_result = dlib.jet(gray)
```

Together, these functions provide a foundation for computer vision applications, preprocessing for machine learning, and more advanced image analysis tasks.