
# Object Detection

OpenCV provides robust tools for detecting and recognizing objects in images and videos. The object detection module includes traditional computer vision methods like Haar cascades and HOG descriptors, as well as modern deep learning-based detectors. These capabilities enable applications such as face detection, pedestrian detection, QR code scanning, and custom object recognition.

## Capabilities

### Cascade Classifiers

Cascade classifiers use a machine learning approach based on Haar features or LBP features to detect objects. They are fast and efficient for real-time detection tasks.

**CascadeClassifier Class**

```python { .api }
cv2.CascadeClassifier(filename=None)
```

Creates a cascade classifier object for object detection. If `filename` is provided, loads the cascade from the specified XML file.

**Loading a Cascade**

```python { .api }
classifier.load(filename)
```

Loads a cascade classifier from an XML file. Returns `True` if successful, `False` otherwise.

**Detecting Objects**

```python { .api }
objects = classifier.detectMultiScale(
    image,
    scaleFactor=1.1,
    minNeighbors=3,
    flags=0,
    minSize=(0, 0),
    maxSize=(0, 0)
)
```

Detects objects of different sizes in the input image. Returns the rectangles where objects were found, one `(x, y, width, height)` entry per detection.

- `image`: Input image (grayscale recommended for better performance)
- `scaleFactor`: How much the image is shrunk at each pyramid level (e.g., 1.1 shrinks it by roughly 10% per level)
- `minNeighbors`: How many neighboring candidate rectangles a detection needs in order to be kept (higher values give fewer but more reliable detections)
- `flags`: Legacy parameter from the old API, typically set to 0
- `minSize`: Minimum object size in pixels, as `(width, height)`
- `maxSize`: Maximum object size in pixels, as `(width, height)`; `(0, 0)` means no limit
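To make the effect of `scaleFactor` concrete, the sketch below counts the pyramid levels a multi-scale detector would visit for a given image and detection window. It is a simplified model rather than OpenCV's implementation: the helper name `pyramid_scales` and the 24x24 window in the usage lines are assumptions for illustration, and `minSize`/`maxSize` are ignored.

```python
def pyramid_scales(image_size, window_size, scale_factor=1.1):
    """Return the scale factors visited before the shrunken image
    becomes smaller than the detection window (simplified model)."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1.0")
    img_w, img_h = image_size
    win_w, win_h = window_size
    scales = []
    factor = 1.0
    # Keep shrinking the image until the window no longer fits inside it.
    while img_w / factor >= win_w and img_h / factor >= win_h:
        scales.append(factor)
        factor *= scale_factor
    return scales


# Hypothetical 640x480 image scanned with a 24x24 window.
fine = pyramid_scales((640, 480), (24, 24), scale_factor=1.05)
coarse = pyramid_scales((640, 480), (24, 24), scale_factor=1.3)
print(len(fine), "levels at 1.05 vs", len(coarse), "levels at 1.3")
```

A smaller `scaleFactor` visits more levels, which is why it tends to be more thorough but slower.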

**Detecting with Level Information**

```python { .api }
objects, numDetections = classifier.detectMultiScale2(
    image,
    scaleFactor=1.1,
    minNeighbors=3,
    flags=0,
    minSize=(0, 0),
    maxSize=(0, 0)
)
```

Similar to `detectMultiScale()`, but also returns the number of neighbor rectangles for each detection, which can be used as a confidence measure.
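One way to use the neighbor counts as a rough confidence measure is to drop detections below a chosen count. The sketch below works on plain sequences shaped like the two return values; the helper name `filter_by_neighbors` and the threshold value are assumptions for illustration, not part of OpenCV.

```python
def filter_by_neighbors(objects, num_detections, min_count):
    """Keep only rectangles whose neighbor count reaches min_count."""
    kept = []
    for rect, count in zip(objects, num_detections):
        if int(count) >= min_count:
            # Convert to plain ints so the result is easy to log or serialize.
            kept.append(tuple(int(v) for v in rect))
    return kept


# Hypothetical values shaped like the output of detectMultiScale2().
objects = [(10, 10, 50, 50), (200, 40, 30, 30)]
num_detections = [12, 3]
print(filter_by_neighbors(objects, num_detections, min_count=5))
```

A suitable `min_count` depends on the cascade and the images, so it is best tuned on representative data.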

**Detecting with Weights**

```python { .api }
objects, rejectLevels, levelWeights = classifier.detectMultiScale3(
    image,
    scaleFactor=1.1,
    minNeighbors=3,
    flags=0,
    minSize=(0, 0),
    maxSize=(0, 0),
    outputRejectLevels=True
)
```

Extended detection that returns reject levels and level weights for each detection, providing more detailed information about detection confidence.

**Example: Face Detection**

```python { .api }
import cv2

# Load the cascade classifier
face_cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml'
)

# Load image
img = cv2.imread('image.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(
    gray,
    scaleFactor=1.1,
    minNeighbors=5,
    minSize=(30, 30)
)

# Draw rectangles around faces
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
```

### HOG Descriptor

Histogram of Oriented Gradients (HOG) is a feature descriptor used for object detection, particularly effective for pedestrian detection.

**HOGDescriptor Class**

```python
hog = cv2.HOGDescriptor()
```

Creates a HOG descriptor and detector object with default parameters.

**Computing HOG Descriptors**

```python { .api }
descriptors = hog.compute(
    img,
    winStride=(8, 8),
    padding=(0, 0),
    locations=None
)
```

Computes HOG descriptors for the image.

- `img`: Input image
- `winStride`: Step size for sliding window (in pixels)
- `padding`: Padding around the image
- `locations`: Optional list of detection locations

Returns a numpy array of HOG descriptors.
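The number of values describing a single detection window follows from the HOG layout. The sketch below computes it, assuming the layout used by a default `cv2.HOGDescriptor()` (64x128 window, 16x16 blocks, 8x8 block stride, 8x8 cells, 9 orientation bins); the helper name `hog_descriptor_length` is hypothetical.

```python
def hog_descriptor_length(win_size=(64, 128), block_size=(16, 16),
                          block_stride=(8, 8), cell_size=(8, 8), nbins=9):
    """Number of values in the HOG descriptor of one detection window."""
    # Block positions that fit across and down the window.
    blocks_x = (win_size[0] - block_size[0]) // block_stride[0] + 1
    blocks_y = (win_size[1] - block_size[1]) // block_stride[1] + 1
    # Each block holds a grid of cells, and each cell one histogram.
    cells_per_block = (block_size[0] // cell_size[0]) * (block_size[1] // cell_size[1])
    return blocks_x * blocks_y * cells_per_block * nbins


print(hog_descriptor_length())  # 7 * 15 blocks * 4 cells * 9 bins = 3780
```

When `hog.compute()` slides the window over a larger image, the returned array holds this many values per window position.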

**Setting the SVM Detector**

```python { .api }
hog.setSVMDetector(detector)
```

Sets the SVM (Support Vector Machine) detector coefficients for object detection. OpenCV provides pre-trained detectors.

**Getting Default People Detector**

```python { .api }
detector = cv2.HOGDescriptor_getDefaultPeopleDetector()
```

Returns the default people/pedestrian detector coefficients trained on the INRIA person dataset.

**Detecting Objects with HOG**

```python { .api }
found, weights = hog.detectMultiScale(
    img,
    hitThreshold=0,
    winStride=(8, 8),
    padding=(0, 0),
    scale=1.05,
    finalThreshold=2.0,
    useMeanshiftGrouping=False
)
```

Detects objects (e.g., people) in the image using the HOG descriptor and SVM classifier.

- `img`: Input image
- `hitThreshold`: Threshold for detection decision (lower values increase detections but also false positives)
- `winStride`: Step size for sliding window
- `padding`: Padding around the image
- `scale`: Scale factor for image pyramid
- `finalThreshold`: Threshold for the final detection grouping
- `useMeanshiftGrouping`: Use mean-shift grouping instead of the default rectangle grouping

Returns the detected object rectangles as `(x, y, width, height)` and one confidence weight per rectangle.
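HOG detection often reports a smaller rectangle nested inside a larger one for the same person. A common post-processing step is to drop rectangles that lie entirely inside another. The sketch below does this on plain `(x, y, w, h)` tuples; the helper names `inside` and `drop_nested` are assumptions for illustration, not OpenCV functions.

```python
def inside(inner, outer):
    """True if rectangle `inner` lies entirely within rectangle `outer`."""
    ix, iy, iw, ih = inner
    ox, oy, ow, oh = outer
    return ix >= ox and iy >= oy and ix + iw <= ox + ow and iy + ih <= oy + oh


def drop_nested(rects):
    """Remove rectangles fully contained in a different rectangle."""
    rects = [tuple(int(v) for v in r) for r in rects]
    kept = []
    for i, r in enumerate(rects):
        # Identical rectangles are not treated as nested in each other.
        nested = any(i != j and r != q and inside(r, q) for j, q in enumerate(rects))
        if not nested:
            kept.append(r)
    return kept


# Hypothetical detections shaped like the `found` value above.
found = [(0, 0, 100, 200), (10, 10, 50, 100), (300, 0, 60, 120)]
print(drop_nested(found))
```

This is deliberately simpler than overlap-based suppression: partially overlapping rectangles are all kept.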

**Example: Pedestrian Detection**

```python { .api }
import cv2

# Initialize HOG descriptor with default people detector
hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

# Load image
img = cv2.imread('street.jpg')

# Detect people
found, weights = hog.detectMultiScale(
    img,
    winStride=(8, 8),
    padding=(4, 4),
    scale=1.05
)

# Draw rectangles around detected people
for (x, y, w, h) in found:
    cv2.rectangle(img, (x, y), (x+w, y+h), (0, 255, 0), 2)
```

### QR Code Detection

OpenCV provides dedicated tools for detecting and decoding QR codes in images.

**QRCodeDetector Class**

```python
detector = cv2.QRCodeDetector()
```

Creates a QR code detector object.

**Detecting QR Codes**

```python { .api }
retval, points = detector.detect(img)
```

Detects a QR code in the image. Returns `True` if found, and the corner points of the QR code.

**Decoding QR Codes**

```python { .api }
data, straight_qrcode = detector.decode(img, points)
```

Decodes a QR code given its corner points (as returned by `detect()`). Returns the decoded string (empty if decoding fails) and the rectified QR code image.

**Detecting and Decoding in One Call**

```python { .api }
data, points, straight_qrcode = detector.detectAndDecode(img)
```

Detects and decodes a QR code in a single operation. Returns:

- `data`: Decoded string from QR code (empty if no QR code found)
- `points`: Corner points of the QR code
- `straight_qrcode`: Rectified QR code image

**Detecting Multiple QR Codes**

```python { .api }
retval, points = detector.detectMulti(img)
```

Detects multiple QR codes in the image. Returns `True` if any QR codes are found, and a list of corner points for each detected QR code.

**Decoding Multiple QR Codes**

```python { .api }
retval, decoded_info, straight_qrcodes = detector.decodeMulti(img, points)
```

Decodes multiple QR codes given their corner points (as returned by `detectMulti()`). Returns:

- `retval`: `True` if successful
- `decoded_info`: List of decoded strings, in the same order as `points`
- `straight_qrcodes`: List of rectified QR code images
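When handling several codes, it helps to pair each decoded string with its corner points. The sketch below does this on plain sequences, assuming that the strings and corner sets line up index by index and that a code which was located but not decoded yields an empty string; the helper name `pair_qr_results` is hypothetical.

```python
def pair_qr_results(decoded_info, points):
    """Pair each non-empty decoded string with its four corner points."""
    results = []
    for text, corners in zip(decoded_info, points):
        if not text:
            # Skip codes that were located but could not be decoded.
            continue
        results.append((text, [(float(x), float(y)) for x, y in corners]))
    return results


# Hypothetical values shaped like the multi-code results described above.
decoded_info = ["hello", ""]
points = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],
    [(20, 20), (30, 20), (30, 30), (20, 30)],
]
for text, corners in pair_qr_results(decoded_info, points):
    print(text, corners)
```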

**Example: QR Code Detection and Decoding**

```python { .api }
import cv2

# Create QR code detector
detector = cv2.QRCodeDetector()

# Load image
img = cv2.imread('qrcode.jpg')

# Detect and decode
data, points, straight_qrcode = detector.detectAndDecode(img)

if data:
    print(f"QR Code detected: {data}")

    # Draw boundary around QR code
    if points is not None:
        points = points.reshape(-1, 2).astype(int)
        for i in range(4):
            cv2.line(img, tuple(points[i]), tuple(points[(i+1)%4]), (0, 255, 0), 3)
else:
    print("No QR Code detected")
```

### QRCodeDetectorAruco

```python
detector = cv2.QRCodeDetectorAruco()
```

A QR code detector that reuses OpenCV's ArUco marker detection techniques to locate QR codes. It provides the same interface as `QRCodeDetector` and is intended to be more robust in challenging conditions.

### Face Detection (DNN-based)

Modern face detection using deep neural networks provides more accurate results than traditional cascade classifiers.

**FaceDetectorYN Class**

```python { .api }
detector = cv2.FaceDetectorYN.create(
    model,
    config,
    input_size,
    score_threshold=0.9,
    nms_threshold=0.3,
    top_k=5000,
    backend_id=0,
    target_id=0
)
```

Creates a YuNet face detector. YuNet is a lightweight and accurate face detection model.

- `model`: Path to the ONNX model file
- `config`: Path to the config file (can be empty string)
- `input_size`: Input size for the neural network as (width, height)
- `score_threshold`: Confidence threshold for face detection
- `nms_threshold`: Non-maximum suppression threshold
- `top_k`: Keep top K detections before NMS
- `backend_id`: Backend identifier (e.g., default, OpenCV, CUDA)
- `target_id`: Target device identifier (e.g., CPU, GPU)

**Detecting Faces**

```python { .api }
retval, faces = detector.detect(img)
```

Detects faces in the input image. The image size must match the detector's `input_size`, which can be updated with `detector.setInputSize((width, height))`. Returns a tuple containing:

- `retval`: Return value (status of the call; check `faces` to see whether any faces were found)
- `faces`: Face detections as a numpy array (`None` when no faces are found) where each row contains: [x, y, w, h, x_re, y_re, x_le, y_le, x_nt, y_nt, x_rcm, y_rcm, x_lcm, y_lcm, confidence]
  - First 4 values: Bounding box (x, y, width, height)
  - Next 10 values: Facial landmarks (right eye, left eye, nose tip, right corner of mouth, left corner of mouth)
  - Last value: Detection confidence score
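The 15-value row layout described above can be unpacked into named fields. The sketch below does this on any sequence of 15 numbers; the helper name `parse_face_row` and the dictionary keys are assumptions for illustration, not part of OpenCV.

```python
LANDMARK_NAMES = (
    "right_eye",
    "left_eye",
    "nose_tip",
    "right_mouth_corner",
    "left_mouth_corner",
)


def parse_face_row(row):
    """Split one detection row into bounding box, landmarks and score."""
    values = [float(v) for v in row]
    if len(values) != 15:
        raise ValueError("expected 15 values per detection row")
    # First 4 values: bounding box, rounded to whole pixels for drawing.
    bbox = tuple(int(round(v)) for v in values[:4])
    # Next 10 values: five (x, y) landmark pairs in the documented order.
    landmarks = {
        name: (values[4 + 2 * i], values[5 + 2 * i])
        for i, name in enumerate(LANDMARK_NAMES)
    }
    return {"bbox": bbox, "landmarks": landmarks, "score": values[14]}


# Hypothetical row shaped like one detection from the faces array.
row = [10.0, 20.0, 100.0, 120.0,
       40.0, 60.0, 80.0, 60.0, 60.0, 80.0, 45.0, 110.0, 75.0, 110.0,
       0.95]
face = parse_face_row(row)
print(face["bbox"], face["landmarks"]["nose_tip"], face["score"])
```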

### Face Recognition

**FaceRecognizerSF Class**

```python { .api }
recognizer = cv2.FaceRecognizerSF.create(
    model,
    config,
    backend_id=0,
    target_id=0
)
```

Creates a face recognition model based on SFace. Used to extract face features and compare faces for recognition tasks.

**Extracting Face Features**

```python { .api }
feature = recognizer.feature(aligned_face)
```

Extracts a feature vector from an aligned face image. The feature vector can be used for face comparison and recognition.

**Comparing Faces**

```python { .api }
score = recognizer.match(
    face_feature1,
    face_feature2,
    dis_type=cv2.FaceRecognizerSF_FR_COSINE
)
```

Computes a score comparing two face features. How to read the score depends on `dis_type`: with the cosine type, higher scores indicate more similar faces; with the L2 norm type, lower scores indicate more similar faces.

Distance types:

- `cv2.FaceRecognizerSF_FR_COSINE`: Cosine similarity (higher is more similar)
- `cv2.FaceRecognizerSF_FR_NORM_L2`: L2 norm distance (lower is more similar)
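The two distance types move in opposite directions as faces become more alike. The pure-Python sketch below illustrates this on small vectors, assuming both measures are computed on L2-normalized features; it shows the underlying math only and does not call `recognizer.match()`. The helper names are hypothetical.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def cosine_score(a, b):
    """Cosine similarity: larger means more similar."""
    a, b = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a, b))


def l2_distance(a, b):
    """L2 distance between unit vectors: smaller means more similar."""
    a, b = l2_normalize(a), l2_normalize(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy feature vectors standing in for real face features.
same = ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
different = ([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
for a, b in (same, different):
    print(round(cosine_score(a, b), 3), round(l2_distance(a, b), 3))
```

Any decision threshold therefore has to be applied in the matching direction: accept above it for cosine, below it for L2.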

## Haar Cascade Data Files

OpenCV includes pre-trained Haar cascade classifiers for various object detection tasks. These XML files are distributed with the opencv-python package and can be accessed via the `cv2.data.haarcascades` path.

**Accessing Haar Cascade Files**

```python { .api }
import cv2

# Get the path to the haarcascades directory
cascade_path = cv2.data.haarcascades

# Load a specific cascade
face_cascade = cv2.CascadeClassifier(cascade_path + 'haarcascade_frontalface_default.xml')
```

**Available Cascade Files**

OpenCV provides the following pre-trained Haar cascade classifiers:

**Face Detection:**
- `haarcascade_frontalface_default.xml` - Default frontal face detector (most commonly used)
- `haarcascade_frontalface_alt.xml` - Alternative frontal face detector
- `haarcascade_frontalface_alt2.xml` - Another alternative frontal face detector
- `haarcascade_frontalface_alt_tree.xml` - Tree-based frontal face detector
- `haarcascade_profileface.xml` - Profile (side view) face detector

**Eye Detection:**
- `haarcascade_eye.xml` - Generic eye detector
- `haarcascade_eye_tree_eyeglasses.xml` - Eye detector that works with eyeglasses
- `haarcascade_lefteye_2splits.xml` - Left eye detector
- `haarcascade_righteye_2splits.xml` - Right eye detector

**Facial Features:**
- `haarcascade_smile.xml` - Smile detector

**Body Detection:**
- `haarcascade_fullbody.xml` - Full body detector
- `haarcascade_upperbody.xml` - Upper body detector
- `haarcascade_lowerbody.xml` - Lower body detector

**Animal Detection:**
- `haarcascade_frontalcatface.xml` - Cat face detector
- `haarcascade_frontalcatface_extended.xml` - Extended cat face detector

**Other Objects:**
- `haarcascade_licence_plate_rus_16stages.xml` - Russian license plate detector

**Example: Loading Multiple Cascades**

```python
import cv2

cascade_path = cv2.data.haarcascades

# Load face and eye cascades
face_cascade = cv2.CascadeClassifier(
    cascade_path + 'haarcascade_frontalface_default.xml'
)
eye_cascade = cv2.CascadeClassifier(
    cascade_path + 'haarcascade_eye.xml'
)
smile_cascade = cv2.CascadeClassifier(
    cascade_path + 'haarcascade_smile.xml'
)

# Load image and convert to grayscale
img = cv2.imread('people.jpg')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# Detect faces
faces = face_cascade.detectMultiScale(gray, 1.3, 5)

# For each face, detect eyes and smile
for (x, y, w, h) in faces:
    cv2.rectangle(img, (x, y), (x+w, y+h), (255, 0, 0), 2)
    roi_gray = gray[y:y+h, x:x+w]
    roi_color = img[y:y+h, x:x+w]

    # Detect eyes in face region
    eyes = eye_cascade.detectMultiScale(roi_gray)
    for (ex, ey, ew, eh) in eyes:
        cv2.rectangle(roi_color, (ex, ey), (ex+ew, ey+eh), (0, 255, 0), 2)

    # Detect smile in face region
    smiles = smile_cascade.detectMultiScale(roi_gray, 1.8, 20)
    for (sx, sy, sw, sh) in smiles:
        cv2.rectangle(roi_color, (sx, sy), (sx+sw, sy+sh), (0, 0, 255), 2)
```

**Performance Tips for Cascade Classifiers:**

1. **Convert to Grayscale**: Cascade classifiers work faster on grayscale images
2. **Adjust scaleFactor**: Smaller values (e.g., 1.05) are more thorough but slower; larger values (e.g., 1.3) are faster but may miss objects
3. **Tune minNeighbors**: Higher values reduce false positives but may miss some objects
4. **Set Size Limits**: Use `minSize` and `maxSize` to restrict detection to expected object sizes
5. **Process at Lower Resolution**: Resize large images before detection for better performance
6. **Region of Interest**: If possible, detect only in specific regions of the image
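When detection runs on a downscaled copy of the image (tip 5), the resulting rectangles are in the small image's coordinates and need to be mapped back before drawing on the original. The sketch below handles that mapping, assuming the image was resized uniformly by `resize_factor` (e.g., 0.5 for half size); the helper name `map_rects_to_original` is hypothetical.

```python
def map_rects_to_original(rects, resize_factor):
    """Map (x, y, w, h) rectangles from a resized image back to the original."""
    if resize_factor <= 0:
        raise ValueError("resize_factor must be positive")
    return [
        tuple(int(round(v / resize_factor)) for v in rect)
        for rect in rects
    ]


# Hypothetical detections found on an image resized to half its size.
small_rects = [(10, 20, 30, 40)]
print(map_rects_to_original(small_rects, resize_factor=0.5))
```

`minSize` and `maxSize` values chosen for the original image should be scaled by the same factor before being passed to the detector.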

**Cascade Classifier Limitations:**

- Haar cascades are sensitive to object orientation and scale
- Performance decreases with variations in lighting, pose, and occlusion
- For more robust detection, consider using DNN-based detectors (see the `cv2.dnn` module)
- Profile face detection is generally less accurate than frontal face detection
- Eye detection works best on frontal faces with open eyes