# Object Detection

Detect and locate objects within images. Each detection includes a bounding box, a confidence score, and, where available, a parent object that places it in an object hierarchy. The service recognizes a wide range of common objects and reports the spatial location of each one.

## Capabilities

### Object Detection

Identify objects within images and provide their locations using bounding rectangles.

```python { .api }
def detect_objects(url, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Detect objects within an image.

    Args:
        url (str): Publicly reachable URL of an image
        model_version (str, optional): AI model version to use. Default: "latest"
        custom_headers (dict, optional): Custom HTTP headers
        raw (bool, optional): Return raw response. Default: False

    Returns:
        DetectResult: Object detection results with bounding boxes and confidence scores

    Raises:
        ComputerVisionErrorResponseException: API error occurred
    """

def detect_objects_in_stream(image, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Detect objects from binary image stream.

    Args:
        image (Generator): Binary image data stream
        model_version (str, optional): AI model version to use

    Returns:
        DetectResult: Object detection results
    """
```

## Usage Examples

### Basic Object Detection

```python
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

# Initialize client
credentials = CognitiveServicesCredentials("your-api-key")
client = ComputerVisionClient("https://your-endpoint.cognitiveservices.azure.com/", credentials)

# Detect objects in image
image_url = "https://example.com/street-scene.jpg"
detection_result = client.detect_objects(image_url)

print(f"Detected {len(detection_result.objects)} objects:")

for obj in detection_result.objects:
    print(f"\nObject: {obj.object_property}")
    print(f"Confidence: {obj.confidence:.3f}")

    # Bounding rectangle
    rect = obj.rectangle
    print(f"Location: x={rect.x}, y={rect.y}, width={rect.w}, height={rect.h}")

    # Parent object (if part of hierarchy). The parent entry carries a label
    # and a confidence score; it has no bounding rectangle of its own.
    if obj.parent:
        print(f"Parent object: {obj.parent.object_property} "
              f"(confidence {obj.parent.confidence:.3f})")
```

### Object Detection from Local File

```python
# Detect objects from local image file
with open("local_image.jpg", "rb") as image_stream:
    detection_result = client.detect_objects_in_stream(image_stream)

# Group objects by type
object_counts = {}
for obj in detection_result.objects:
    obj_type = obj.object_property
    object_counts[obj_type] = object_counts.get(obj_type, 0) + 1

print("Object summary:")
for obj_type, count in object_counts.items():
    print(f"  {obj_type}: {count}")
```

### Filtering Objects by Confidence

```python
# Filter objects by confidence threshold
image_url = "https://example.com/busy-scene.jpg"
detection_result = client.detect_objects(image_url)

confidence_threshold = 0.7
high_confidence_objects = [
    obj for obj in detection_result.objects
    if obj.confidence >= confidence_threshold
]

print(f"High confidence objects (≥{confidence_threshold}):")
for obj in high_confidence_objects:
    print(f"  {obj.object_property}: {obj.confidence:.3f}")
```

### Spatial Analysis

```python
# Analyze object spatial relationships
# (reuses `client` and `image_url` from the previous examples)
detection_result = client.detect_objects(image_url)
objects = detection_result.objects or []

# Find largest object by area; max() raises ValueError on an empty list,
# so only run it when something was detected
if objects:
    largest_object = max(
        objects,
        key=lambda obj: obj.rectangle.w * obj.rectangle.h
    )

    print(f"Largest object: {largest_object.object_property}")
    print(f"Area: {largest_object.rectangle.w * largest_object.rectangle.h} pixels")

# Find objects whose center lies in the left half of the image
# (1000 is an arbitrary fallback width used only if metadata is missing)
image_width = detection_result.metadata.width if detection_result.metadata else 1000
left_half_objects = [
    obj for obj in objects
    if obj.rectangle.x + obj.rectangle.w / 2 < image_width / 2
]

print(f"\nObjects in left half: {len(left_half_objects)}")
for obj in left_half_objects:
    print(f"  {obj.object_property}")
```
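
How two detections relate spatially often comes down to how much their bounding rectangles overlap. The following is a minimal sketch of intersection-over-union (IoU) on `x`, `y`, `w`, `h` rectangles. `Rect` is a hypothetical stand-in for the bounding rectangle so the snippet runs without a service call, and `intersection_area` and `iou` are illustrative helper names, not part of the SDK.

```python
from collections import namedtuple

# Hypothetical stand-in with the same x, y, w, h fields as a bounding rectangle
Rect = namedtuple("Rect", ["x", "y", "w", "h"])


def intersection_area(a, b):
    """Area in pixels shared by two rectangles; 0 if they do not overlap."""
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.w, b.x + b.w)
    bottom = min(a.y + a.h, b.y + b.h)
    return max(0, right - left) * max(0, bottom - top)


def iou(a, b):
    """Intersection over union: 0.0 for disjoint boxes, 1.0 for identical ones."""
    inter = intersection_area(a, b)
    union = a.w * a.h + b.w * b.h - inter
    return inter / union if union > 0 else 0.0


# Made-up rectangles for illustration
person = Rect(x=100, y=50, w=200, h=400)
bicycle = Rect(x=200, y=250, w=300, h=200)

print(f"Shared area: {intersection_area(person, bicycle)} pixels")
print(f"IoU: {iou(person, bicycle):.3f}")
```

With real results, the same helpers could be called with `obj.rectangle` for any pair of detected objects.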

## Response Data Types

### DetectResult

```python { .api }
class DetectResult:
    """
    Object detection operation result.

    Attributes:
        objects (list[DetectedObject]): List of detected objects with locations
        request_id (str): Request identifier
        metadata (ImageMetadata): Image metadata (dimensions, format)
        model_version (str): Model version used for detection
    """
```

### DetectedObject

```python { .api }
class DetectedObject:
    """
    Individual detected object with location and hierarchy information.

    Attributes:
        rectangle (BoundingRect): Object bounding rectangle
        object_property (str): Object name/type (e.g., "person", "car", "bicycle")
        confidence (float): Detection confidence score (0.0 to 1.0)
        parent (ObjectHierarchy, optional): Parent object in hierarchy
    """
```

### BoundingRect

```python { .api }
class BoundingRect:
    """
    Rectangular bounding box for detected objects.

    Attributes:
        x (int): Left coordinate (pixels from left edge)
        y (int): Top coordinate (pixels from top edge)
        w (int): Rectangle width in pixels
        h (int): Rectangle height in pixels
    """
```
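
Drawing and cropping tools often expect corner coordinates rather than an origin plus a size. As a rough sketch, the conversion below turns `x`, `y`, `w`, `h` into `(left, top, right, bottom)` and a center point. `Rect`, `to_corners`, and `center` are hypothetical names used only for this illustration.

```python
from collections import namedtuple

# Hypothetical stand-in with the same fields as the bounding rectangle
Rect = namedtuple("Rect", ["x", "y", "w", "h"])


def to_corners(rect):
    """Return (left, top, right, bottom) in pixels."""
    return (rect.x, rect.y, rect.x + rect.w, rect.y + rect.h)


def center(rect):
    """Return the (x, y) center point of the rectangle."""
    return (rect.x + rect.w / 2, rect.y + rect.h / 2)


rect = Rect(x=30, y=40, w=100, h=60)
print(to_corners(rect))  # (30, 40, 130, 100)
print(center(rect))      # (80.0, 70.0)
```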

### ObjectHierarchy

```python { .api }
class ObjectHierarchy:
    """
    Parent object information in object hierarchy.

    The hierarchy is taxonomic: a parent is a more generic form of the
    object (e.g., "dog" for "bulldog"), so it has no bounding rectangle.

    Attributes:
        object_property (str): Parent object name/type
        confidence (float): Parent object confidence score
        parent (ObjectHierarchy, optional): Next, more generic ancestor in the hierarchy
    """
```
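
To show how hierarchy information might be consumed, the sketch below walks from a detected object up through its parents and collects the labels. It assumes a parent entry may itself point to a further, more generic parent, which is an assumption about the response shape made for this example. `hierarchy_path` is a hypothetical helper, and the `SimpleNamespace` objects with their labels and scores are made-up stand-ins for a real response.

```python
from types import SimpleNamespace


def hierarchy_path(obj):
    """Return labels from the detected object up through its ancestors."""
    labels = []
    node = obj
    while node is not None:
        labels.append(node.object_property)
        # getattr keeps this safe if a node has no `parent` attribute at all
        node = getattr(node, "parent", None)
    return labels


# Hypothetical detection: labels and scores are invented for illustration
animal = SimpleNamespace(object_property="animal", confidence=0.95, parent=None)
dog = SimpleNamespace(object_property="dog", confidence=0.90, parent=animal)
detected = SimpleNamespace(object_property="bulldog", confidence=0.85, parent=dog)

print(" > ".join(hierarchy_path(detected)))  # bulldog > dog > animal
```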

### ImageMetadata

```python { .api }
class ImageMetadata:
    """
    Image metadata information.

    Attributes:
        height (int): Image height in pixels
        width (int): Image width in pixels
        format (str): Image format (e.g., "Jpeg", "Png")
    """
```
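
Because bounding rectangles are expressed in pixels, the image dimensions are what make them comparable across images of different sizes. The following sketch converts a rectangle to fractions of the image width and height. `normalize_rect` is a hypothetical helper, and the `SimpleNamespace` values stand in for a real rectangle and metadata object.

```python
from types import SimpleNamespace


def normalize_rect(rect, metadata):
    """Return (x, y, w, h) as fractions of the image width and height."""
    if not metadata or not metadata.width or not metadata.height:
        raise ValueError("image dimensions are required to normalize a rectangle")
    return (
        rect.x / metadata.width,
        rect.y / metadata.height,
        rect.w / metadata.width,
        rect.h / metadata.height,
    )


# Hypothetical values standing in for a real response
metadata = SimpleNamespace(width=1000, height=500, format="Jpeg")
rect = SimpleNamespace(x=250, y=100, w=500, h=200)

print(normalize_rect(rect, metadata))  # (0.25, 0.2, 0.5, 0.4)
```

Raising an error when dimensions are missing is one possible design choice; it avoids silently guessing an image size.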

## Common Object Types

The object detection service can identify many common objects, including:

- **People and Body Parts**: person, face, hand
- **Vehicles**: car, truck, bus, motorcycle, bicycle, airplane, train
- **Animals**: dog, cat, horse, bird, cow, sheep
- **Furniture**: chair, table, couch, bed, desk
- **Electronics**: computer, laptop, cell phone, keyboard, mouse, tv, remote
- **Kitchen Items**: refrigerator, oven, microwave, sink, cup, bowl, plate
- **Sports**: ball, racket, skateboard, skis, snowboard
- **Food**: pizza, sandwich, apple, banana, orange, carrot
- **Clothing**: hat, shirt, pants, shoes, tie, handbag, suitcase
- **Nature**: tree, flower, grass, rock, mountain, ocean
- **Buildings and Infrastructure**: building, house, bridge, road, street sign
- **Transportation**: traffic light, stop sign, parking meter, bench

This list is illustrative rather than exhaustive, and the set of recognized objects may grow as the service evolves. Use the confidence score to judge how reliable each detection is.