Tessl Tile for pypi/opencv-python@4.12.0

# Object Detection

OpenCV provides tools for detecting and recognizing objects in images and videos. The object detection module includes traditional computer vision methods such as Haar cascades and HOG descriptors, as well as deep learning-based detectors. These capabilities support applications such as face detection, pedestrian detection, QR code scanning, and custom object recognition.

## Capabilities

### Cascade Classifiers

Cascade classifiers use a machine learning approach based on Haar or LBP features to detect objects. They are fast enough for real-time detection tasks.

**CascadeClassifier Class**

```python { .api }
cv2.CascadeClassifier(filename=None)
```

Creates a cascade classifier object for object detection. If `filename` is provided, loads the cascade from the specified XML file.

**Loading a Cascade**

```python { .api }
classifier.load(filename)
```

Loads a cascade classifier from an XML file. Returns `True` if successful, `False` otherwise.

**Detecting Objects**

```python { .api }
objects = classifier.detectMultiScale(
    image,
    scaleFactor=1.1,
    minNeighbors=3,
    flags=0,
    minSize=(0, 0),
    maxSize=(0, 0)
)
```

Detects objects of different sizes in the input image. Returns the rectangles where objects were found, each as `(x, y, width, height)`.

- `image`: Input image (grayscale recommended for better performance)
- `scaleFactor`: How much the image size is reduced at each scale (e.g., 1.1 means a 10% reduction per step)
- `minNeighbors`: How many neighbors each candidate rectangle needs in order to be kept (higher values give fewer but more reliable detections)
- `flags`: Legacy parameter from the old API, typically set to 0
- `minSize`: Minimum object size in pixels
- `maxSize`: Maximum object size in pixels (0 means no limit)
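
To make the cost of `scaleFactor` concrete, the sketch below counts how many scales a sliding-window detector would visit. It is a simplified model, not OpenCV's internal implementation: `pyramid_scales` is a hypothetical helper, and the 24x24 base window is an assumption chosen for illustration.

```python
def pyramid_scales(image_size, window_size=(24, 24), scale_factor=1.1):
    """Approximate the window scales that fit inside an image.

    The detection window grows by scale_factor at each step until it
    no longer fits within the image.
    """
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1.0")
    img_w, img_h = image_size
    win_w, win_h = window_size
    scales = []
    scale = 1.0
    while win_w * scale <= img_w and win_h * scale <= img_h:
        scales.append(scale)
        scale *= scale_factor
    return scales


# A smaller factor visits more scales (more thorough, but slower).
fine = len(pyramid_scales((640, 480), scale_factor=1.05))
coarse = len(pyramid_scales((640, 480), scale_factor=1.3))
print(fine, coarse)
```

Under this model, halving the step between scales roughly doubles the number of passes over the image, which is why larger `scaleFactor` values run faster but can skip the size at which an object would have matched.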

**Detecting with Level Information**

```python { .api }
objects, numDetections = classifier.detectMultiScale2(
    image,
    scaleFactor=1.1,
    minNeighbors=3,
    flags=0,
    minSize=(0, 0),
    maxSize=(0, 0)
)
```

Similar to `detectMultiScale()`, but also returns the number of neighbor rectangles for each detection, which can be used as a confidence measure.
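
One way to use the neighbor counts is to keep only detections supported by many overlapping candidates. The sketch below uses plain Python sequences with illustrative values standing in for the outputs of `detectMultiScale2()`; `filter_by_neighbors` is a hypothetical helper, and NumPy outputs may need to be flattened (for example with `.flatten()`) before being passed in.

```python
def filter_by_neighbors(objects, num_detections, min_count):
    """Keep boxes whose neighbor count is at least min_count."""
    kept = []
    for box, count in zip(objects, num_detections):
        if int(count) >= min_count:
            kept.append(tuple(int(v) for v in box))
    return kept


# Illustrative values, not real detector output.
objects = [(10, 20, 50, 50), (200, 40, 48, 48), (5, 5, 30, 30)]
num_detections = [12, 3, 7]

strong = filter_by_neighbors(objects, num_detections, min_count=6)
print(strong)
```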

**Detecting with Weights**

```python { .api }
objects, rejectLevels, levelWeights = classifier.detectMultiScale3(
    image,
    scaleFactor=1.1,
    minNeighbors=3,
    flags=0,
    minSize=(0, 0),
    maxSize=(0, 0),
    outputRejectLevels=True
)
```

Extended detection that returns reject levels and level weights for each detection, providing more detailed information about detection confidence.
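
The level weights can be used to rank detections, for instance to pick the single most confident box. The sketch below assumes the detector was called with `outputRejectLevels=True`, as in the signature above, and that the outputs have been converted to flat sequences; `best_detection` is a hypothetical helper and the values are illustrative.

```python
def best_detection(objects, level_weights):
    """Return (box, weight) for the highest-weighted detection, or None."""
    if len(objects) == 0:
        return None
    index = max(range(len(objects)), key=lambda i: float(level_weights[i]))
    box = tuple(int(v) for v in objects[index])
    return box, float(level_weights[index])


# Illustrative values, not real detector output.
objects = [(0, 0, 10, 10), (5, 5, 20, 20)]
level_weights = [1.5, 4.25]

print(best_detection(objects, level_weights))
```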

**Example: Face Detection**

```python { .api }
import cv2

# Load the cascade classifier
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)

# Load image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('image.jpg')
if img is None:
    raise FileNotFoundError('image.jpg could not be read')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,
    minNeighbors=5,
    minSize=(30, 30)
)

# Draw rectangles around faces
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
```

### HOG Descriptor

Histogram of Oriented Gradients (HOG) is a feature descriptor used for object detection and is particularly effective for pedestrian detection.

**HOGDescriptor Class**

```python
hog = cv2.HOGDescriptor()
```

Creates a HOG descriptor and detector object with default parameters.

**Computing HOG Descriptors**

```python { .api }
descriptors = hog.compute(
    img,
    winStride=(8, 8),
    padding=(0, 0),
    locations=None
)
```

Computes HOG descriptors for the image.

- `img`: Input image
- `winStride`: Step size for sliding window (in pixels)
- `padding`: Padding around the image
- `locations`: Optional list of detection locations

Returns a numpy array of HOG descriptors.
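
The length of the descriptor for a single detection window follows from the window, block, and cell geometry. The sketch below works through that arithmetic, assuming the default parameters of `cv2.HOGDescriptor()` (64x128 window, 16x16 blocks, 8x8 block stride, 8x8 cells, 9 bins); `hog_descriptor_length` is a hypothetical helper, not part of OpenCV.

```python
def hog_descriptor_length(win_size=(64, 128), block_size=(16, 16),
                          block_stride=(8, 8), cell_size=(8, 8), nbins=9):
    """Number of values in the HOG descriptor of one detection window."""
    # Number of block positions along each axis of the window.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block holds a grid of cells, and each cell a histogram of nbins.
    cells_per_block = (
        (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    )
    return blocks_x * blocks_y * cells_per_block * nbins


length = hog_descriptor_length()
print(length)  # 7 * 15 * 4 * 9 = 3780 for the default geometry
```

The result can be cross-checked against `hog.getDescriptorSize()` on a descriptor object created with the same parameters.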

**Setting the SVM Detector**

```python { .api }
hog.setSVMDetector(detector)
```

Sets the SVM (Support Vector Machine) detector coefficients for object detection. OpenCV provides pre-trained detectors.

**Getting Default People Detector**

```python { .api }
detector = cv2.HOGDescriptor_getDefaultPeopleDetector()
```

Returns the default people/pedestrian detector coefficients trained on the INRIA person dataset.

**Detecting Objects with HOG**

```python { .api }
found, weights = hog.detectMultiScale(
    img,
    hitThreshold=0,
    winStride=(8, 8),
    padding=(0, 0),
    scale=1.05,
    groupThreshold=2.0,
    useMeanshiftGrouping=False
)
```

Detects objects (e.g., people) in the image using the HOG descriptor and SVM classifier.

- `img`: Input image
- `hitThreshold`: Threshold for the detection decision (lower values increase detections but also false positives)
- `winStride`: Step size for sliding window
- `padding`: Padding around the image
- `scale`: Scale factor for image pyramid
- `groupThreshold`: Threshold for the final detection grouping (named `finalThreshold` in older OpenCV releases)
- `useMeanshiftGrouping`: Use mean-shift grouping instead of the default rectangle grouping

Returns detected object rectangles and their weights.

**Example: Pedestrian Detection**

```python { .api }
import cv2

# Initialize HOG descriptor with default people detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Load image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('street.jpg')
if img is None:
    raise FileNotFoundError('street.jpg could not be read')

# Detect people
found, weights = hog.detectMultiScale(
    img,
    winStride=(8, 8),
    padding=(4, 4),
    scale=1.05
)

# Draw rectangles around detected people
for (x, y, w, h) in found:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
```

### QR Code Detection

OpenCV provides dedicated tools for detecting and decoding QR codes in images.

**QRCodeDetector Class**

```python
detector = cv2.QRCodeDetector()
```

Creates a QR code detector object.

**Detecting QR Codes**

```python { .api }
retval, points = detector.detect(img)
```

Detects a QR code in the image. Returns `True` if one is found, along with the corner points of the QR code.

**Decoding QR Codes**

```python { .api }
data, straight_qrcode = detector.decode(img, points)
```

Decodes a QR code given its corner points. Returns the decoded string and the rectified QR code image.

**Detecting and Decoding in One Call**

```python { .api }
data, points, straight_qrcode = detector.detectAndDecode(img)
```

Detects and decodes a QR code in a single operation. Returns:
- `data`: Decoded string from QR code (empty if no QR code found)
- `points`: Corner points of the QR code
- `straight_qrcode`: Rectified QR code image

**Detecting Multiple QR Codes**

```python { .api }
retval, points = detector.detectMulti(img)
```

Detects multiple QR codes in the image. Returns `True` if any QR codes are found, and a list of corner points for each detected QR code.

**Decoding Multiple QR Codes**

```python { .api }
retval, decoded_info, straight_qrcodes = detector.decodeMulti(img, points)
```

Decodes multiple QR codes given their corner points, as returned by `detectMulti()`. Returns:
- `retval`: `True` if successful
- `decoded_info`: List of decoded strings
- `straight_qrcodes`: List of rectified QR code images

**Example: QR Code Detection and Decoding**

```python { .api }
import cv2

# Create QR code detector
detector = cv2.QRCodeDetector()

# Load image (cv2.imread returns None if the file cannot be read)
img = cv2.imread('qrcode.jpg')
if img is None:
    raise FileNotFoundError('qrcode.jpg could not be read')

# Detect and decode
data, points, straight_qrcode = detector.detectAndDecode(img)

if data:
    print(f"QR Code detected: {data}")

    # Draw boundary around QR code
    if points is not None:
        # Convert the corners to plain Python ints before drawing
        corners = [tuple(map(int, p)) for p in points.reshape(-1, 2)]
        for i in range(4):
            cv2.line(img, corners[i], corners[(i+1) % 4], (0, 255, 0), 3)
else:
    print("No QR Code detected")
```

### QRCodeDetectorAruco

```python
detector = cv2.QRCodeDetectorAruco()
```

A QR code detector that locates codes using an ArUco-based detection approach. It provides the same interface as `QRCodeDetector` and is intended to be more robust in challenging conditions.

### Face Detection (DNN-based)

Face detection using deep neural networks is typically more accurate than traditional cascade classifiers.

**FaceDetectorYN Class**

```python { .api }
detector = cv2.FaceDetectorYN.create(
    model,
    config,
    input_size,
    score_threshold=0.9,
    nms_threshold=0.3,
    top_k=5000,
    backend_id=0,
    target_id=0
)
```

Creates a YuNet face detector. YuNet is a lightweight and accurate face detection model.

- `model`: Path to the ONNX model file
- `config`: Path to the config file (can be an empty string)
- `input_size`: Input size for the neural network as (width, height)
- `score_threshold`: Confidence threshold for face detection
- `nms_threshold`: Non-maximum suppression threshold
- `top_k`: Keep top K detections before NMS
- `backend_id`: Backend identifier (e.g., default, OpenCV, CUDA)
- `target_id`: Target device identifier (e.g., CPU, GPU)

**Detecting Faces**

```python { .api }
retval, faces = detector.detect(img)
```

Detects faces in the input image. Returns a tuple containing:
- `retval`: Integer return value of the call
- `faces`: Face detections as a numpy array (`None` when no face is found), where each row contains: [x, y, w, h, x_re, y_re, x_le, y_le, x_nt, y_nt, x_rcm, y_rcm, x_lcm, y_lcm, confidence]
  - First 4 values: Bounding box (x, y, width, height)
  - Next values: Facial landmarks (right eye, left eye, nose tip, right corner of mouth, left corner of mouth)
  - Last value: Detection confidence score
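
Each 15-value row can be unpacked into named fields for easier downstream use. The sketch below is plain Python and follows the layout described above; `parse_face_row` and the landmark key names are hypothetical, and the sample row is made up rather than real detector output. When using a real detector, the detections array should be checked for `None` before iterating, and `detector.setInputSize((width, height))` is normally called so the network input matches the image size.

```python
LANDMARK_NAMES = (
    "right_eye",
    "left_eye",
    "nose_tip",
    "right_mouth_corner",
    "left_mouth_corner",
)


def parse_face_row(row):
    """Split one 15-value detection row into box, landmarks, and score."""
    values = [float(v) for v in row]
    if len(values) != 15:
        raise ValueError(f"expected 15 values, got {len(values)}")
    box = tuple(values[:4])
    # Landmarks are stored as consecutive (x, y) pairs after the box.
    landmarks = {
        name: (values[4 + 2 * i], values[5 + 2 * i])
        for i, name in enumerate(LANDMARK_NAMES)
    }
    return {"box": box, "landmarks": landmarks, "score": values[14]}


# Made-up row: the values 0..14 make the slicing easy to follow.
face = parse_face_row(range(15))
print(face["box"], face["landmarks"]["nose_tip"], face["score"])
```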

### Face Recognition

**FaceRecognizerSF Class**

```python { .api }
recognizer = cv2.FaceRecognizerSF.create(
    model,
    config,
    backend_id=0,
    target_id=0
)
```

Creates a face recognition model based on SFace. Used to extract face features and compare faces for recognition tasks.

**Extracting Face Features**

```python { .api }
feature = recognizer.feature(aligned_face)
```

Extracts a feature vector from an aligned face image. The feature vector can be used for face comparison and recognition.

**Comparing Faces**

```python { .api }
score = recognizer.match(
    face_feature1,
    face_feature2,
    dis_type=cv2.FaceRecognizerSF_FR_COSINE
)
```

Computes a score comparing two face features. How to read the score depends on `dis_type`: with the cosine metric, higher scores indicate more similar faces; with the L2 metric, lower scores do.

Distance types:
- `cv2.FaceRecognizerSF_FR_COSINE`: Cosine similarity
- `cv2.FaceRecognizerSF_FR_NORM_L2`: L2 norm distance
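
The two metrics point in opposite directions, which matters when choosing a threshold. The sketch below computes both on small made-up vectors in plain Python; it illustrates the underlying math under the assumption that features are L2-normalized before comparison, and is not OpenCV's implementation. The function names are hypothetical.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


def cosine_score(a, b):
    """Cosine similarity: 1.0 for identical directions, lower otherwise."""
    a, b = _normalize(a), _normalize(b)
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """L2 distance between unit vectors: 0.0 for identical directions."""
    a, b = _normalize(a), _normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


same = ([1.0, 0.0], [2.0, 0.0])       # same direction
different = ([1.0, 0.0], [0.0, 1.0])  # orthogonal

print(cosine_score(*same), cosine_score(*different))
print(l2_distance(*same), l2_distance(*different))
```

A match test therefore reads `score >= threshold` for the cosine metric and `score <= threshold` for the L2 metric.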

## Haar Cascade Data Files

OpenCV includes pre-trained Haar cascade classifiers for various object detection tasks. These XML files are distributed with the opencv-python package and can be accessed via the `cv2.data.haarcascades` path.

**Accessing Haar Cascade Files**

```python { .api }
import cv2

# Get the path to the haarcascades directory
cascade_path = cv2.data.haarcascades

# Load a specific cascade
face_cascade = cv2.CascadeClassifier(cascade_path + 'haarcascade_frontalface_default.xml')
```

**Available Cascade Files**

OpenCV provides the following pre-trained Haar cascade classifiers:

**Face Detection:**
- `haarcascade_frontalface_default.xml` - Default frontal face detector (most commonly used)
- `haarcascade_frontalface_alt.xml` - Alternative frontal face detector
- `haarcascade_frontalface_alt2.xml` - Another alternative frontal face detector
- `haarcascade_frontalface_alt_tree.xml` - Tree-based frontal face detector
- `haarcascade_profileface.xml` - Profile (side view) face detector

**Eye Detection:**
- `haarcascade_eye.xml` - Generic eye detector
- `haarcascade_eye_tree_eyeglasses.xml` - Eye detector that works with eyeglasses
- `haarcascade_lefteye_2splits.xml` - Left eye detector
- `haarcascade_righteye_2splits.xml` - Right eye detector

**Facial Features:**
- `haarcascade_smile.xml` - Smile detector

**Body Detection:**
- `haarcascade_fullbody.xml` - Full body detector
- `haarcascade_upperbody.xml` - Upper body detector
- `haarcascade_lowerbody.xml` - Lower body detector

**Animal Detection:**
- `haarcascade_frontalcatface.xml` - Cat face detector
- `haarcascade_frontalcatface_extended.xml` - Extended cat face detector

**Other Objects:**
- `haarcascade_licence_plate_rus_16stages.xml` - Russian license plate detector

**Example: Loading Multiple Cascades**

```python
import cv2

cascade_path = cv2.data.haarcascades

# Load face, eye, and smile cascades
face_cascade = cv2.CascadeClassifier(
    cascade_path + 'haarcascade_frontalface_default.xml'
)
eye_cascade = cv2.CascadeClassifier(
    cascade_path + 'haarcascade_eye.xml'
)
smile_cascade = cv2.CascadeClassifier(
    cascade_path + 'haarcascade_smile.xml'
)

# Load image and convert to grayscale
img = cv2.imread('people.jpg')
if img is None:
    raise FileNotFoundError('people.jpg could not be read')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces (scaleFactor=1.3, minNeighbors=5)
faces = face_cascade.detectMultiScale(gray, 1.3, 5)

# For each face, detect eyes and smile
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    # Slices are views, so drawing on roi_color also draws on img
    roi_gray = gray[y:y+h, x:x+w]
    roi_color = img[y:y+h, x:x+w]

    # Detect eyes in face region
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex, ey, ew, eh) in eyes:
        cv2.rectangle(roi_color, (ex, ey), (ex+ew, ey+eh), (0, 255, 0), 2)

    # Detect smile in face region (scaleFactor=1.8, minNeighbors=20)
    smiles = smile_cascade.detectMultiScale(roi_gray, 1.8, 20)
    for (sx, sy, sw, sh) in smiles:
        cv2.rectangle(roi_color, (sx, sy), (sx+sw, sy+sh), (0, 0, 255), 2)
```

**Performance Tips for Cascade Classifiers:**

1. **Convert to Grayscale**: Cascade classifiers work faster on grayscale images
2. **Adjust scaleFactor**: Smaller values (e.g., 1.05) are more thorough but slower; larger values (e.g., 1.3) are faster but may miss objects
3. **Tune minNeighbors**: Higher values reduce false positives but may miss some objects
4. **Set Size Limits**: Use `minSize` and `maxSize` to restrict detection to expected object sizes
5. **Process at Lower Resolution**: Resize large images before detection for better performance
6. **Region of Interest**: If possible, detect only in specific regions of the image
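
When detection runs on a downscaled copy of the image, the resulting boxes need to be mapped back to the original resolution before drawing. The sketch below shows that bookkeeping in plain Python, assuming the image was resized uniformly by `factor` (for example with `cv2.resize(img, None, fx=factor, fy=factor)`); `scale_boxes` is a hypothetical helper and the box values are illustrative.

```python
def scale_boxes(boxes, factor):
    """Map (x, y, w, h) boxes found on a resized image back to the original.

    factor is the resize ratio that was applied before detection,
    e.g. 0.5 if the image was shrunk to half its size.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    inverse = 1.0 / factor
    return [tuple(int(round(v * inverse)) for v in box) for box in boxes]


# Illustrative box detected on an image that was resized by 0.5.
small_boxes = [(10, 20, 30, 40)]
full_boxes = scale_boxes(small_boxes, factor=0.5)
print(full_boxes)
```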

**Cascade Classifier Limitations:**

- Haar cascades are sensitive to object orientation and scale
- Performance decreases with variations in lighting, pose, and occlusion
- For more robust detection, consider using DNN-based detectors (see the `cv2.dnn` module)
- Profile face detection is generally less accurate than frontal face detection
- Eye detection works best on frontal faces with open eyes