
# Object Detection


Detect and locate objects within images, providing bounding boxes, confidence scores, and hierarchical object relationships. The service can identify a wide range of common objects and provides spatial location information for each detection.


## Capabilities


### Object Detection


Identify objects within images and provide their locations using bounding rectangles.

```python { .api }
def detect_objects(url, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Detect objects within an image.

    Args:
        url (str): Publicly reachable URL of an image
        model_version (str, optional): AI model version to use. Default: "latest"
        custom_headers (dict, optional): Custom HTTP headers
        raw (bool, optional): Return raw response. Default: False

    Returns:
        DetectResult: Object detection results with bounding boxes and confidence scores

    Raises:
        ComputerVisionErrorResponseException: API error occurred
    """

def detect_objects_in_stream(image, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Detect objects from binary image stream.

    Args:
        image (Generator): Binary image data stream
        model_version (str, optional): AI model version to use

    Returns:
        DetectResult: Object detection results
    """
```

## Usage Examples


### Basic Object Detection

```python
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

# Initialize client
credentials = CognitiveServicesCredentials("your-api-key")
client = ComputerVisionClient("https://your-endpoint.cognitiveservices.azure.com/", credentials)

# Detect objects in image
image_url = "https://example.com/street-scene.jpg"
detection_result = client.detect_objects(image_url)

print(f"Detected {len(detection_result.objects)} objects:")

for obj in detection_result.objects:
    print(f"\nObject: {obj.object_property}")
    print(f"Confidence: {obj.confidence:.3f}")

    # Bounding rectangle
    rect = obj.rectangle
    print(f"Location: x={rect.x}, y={rect.y}, width={rect.w}, height={rect.h}")

    # Parent object (if part of hierarchy). The service response gives a parent
    # a label and a confidence score only, not a rectangle of its own.
    if obj.parent:
        print(f"Parent object: {obj.parent.object_property}")
        print(f"Parent confidence: {obj.parent.confidence:.3f}")
```

### Object Detection from Local File

```python
# Detect objects from local image file
with open("local_image.jpg", "rb") as image_stream:
    detection_result = client.detect_objects_in_stream(image_stream)

# Group objects by type
object_counts = {}
for obj in detection_result.objects:
    obj_type = obj.object_property
    object_counts[obj_type] = object_counts.get(obj_type, 0) + 1

print("Object summary:")
for obj_type, count in object_counts.items():
    print(f"  {obj_type}: {count}")
```
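
When images may arrive either as URLs or as local paths, a small dispatcher can pick between the two methods. The sketch below is illustrative only: `detect_from_source` is a hypothetical helper, not part of the SDK, and it assumes `client` exposes `detect_objects` and `detect_objects_in_stream` as documented above.

```python
from urllib.parse import urlparse


def detect_from_source(client, source, **kwargs):
    """Route `source` to the URL or the stream variant of object detection.

    Hypothetical helper: `client` only needs `detect_objects` and
    `detect_objects_in_stream` methods with the documented signatures.
    """
    # Treat anything with an http(s) scheme as a remote image URL.
    if urlparse(source).scheme in ("http", "https"):
        return client.detect_objects(source, **kwargs)

    # Otherwise assume a local file path and send its bytes as a stream.
    with open(source, "rb") as image_stream:
        return client.detect_objects_in_stream(image_stream, **kwargs)
```

Keyword arguments such as `model_version` are passed through unchanged, so the helper does not hide any of the documented options.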

### Filtering Objects by Confidence

```python
# Filter objects by confidence threshold
image_url = "https://example.com/busy-scene.jpg"
detection_result = client.detect_objects(image_url)

confidence_threshold = 0.7
high_confidence_objects = [
    obj for obj in detection_result.objects
    if obj.confidence >= confidence_threshold
]

print(f"High confidence objects (≥{confidence_threshold}):")
for obj in high_confidence_objects:
    print(f"  {obj.object_property}: {obj.confidence:.3f}")
```
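
A single global threshold may be too strict for some labels and too loose for others. The following sketch applies a per-label threshold instead. The threshold values and the helper name `filter_by_label_threshold` are assumptions made for illustration, and the sample objects are stand-ins that only mimic the `object_property` and `confidence` attributes of `DetectedObject`.

```python
from types import SimpleNamespace

# Hypothetical per-label thresholds; tune them against your own images.
LABEL_THRESHOLDS = {"person": 0.8, "car": 0.6}
DEFAULT_THRESHOLD = 0.7


def filter_by_label_threshold(objects, thresholds=LABEL_THRESHOLDS, default=DEFAULT_THRESHOLD):
    """Keep objects whose confidence meets the threshold for their label."""
    return [
        obj for obj in objects
        if obj.confidence >= thresholds.get(obj.object_property, default)
    ]


# Stand-in objects shaped like DetectedObject (object_property, confidence).
sample = [
    SimpleNamespace(object_property="person", confidence=0.75),
    SimpleNamespace(object_property="car", confidence=0.65),
    SimpleNamespace(object_property="dog", confidence=0.72),
]
kept = filter_by_label_threshold(sample)
print([obj.object_property for obj in kept])  # ['car', 'dog']
```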

### Spatial Analysis

```python
# Analyze object spatial relationships
detection_result = client.detect_objects(image_url)

# Find largest object by area (default=None avoids a ValueError when nothing is detected)
largest_object = max(
    detection_result.objects,
    key=lambda obj: obj.rectangle.w * obj.rectangle.h,
    default=None,
)

if largest_object is not None:
    print(f"Largest object: {largest_object.object_property}")
    print(f"Area: {largest_object.rectangle.w * largest_object.rectangle.h} pixels")

# Find objects whose center lies in the left half of the image
image_width = detection_result.metadata.width if detection_result.metadata else 1000  # fallback
left_half_objects = [
    obj for obj in detection_result.objects
    if obj.rectangle.x + obj.rectangle.w / 2 < image_width / 2
]

print(f"\nObjects in left half: {len(left_half_objects)}")
for obj in left_half_objects:
    print(f"  {obj.object_property}")
```
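
Another common spatial measure is how much two detections overlap. A minimal sketch of intersection over union (IoU) for rectangles in the `x`, `y`, `w`, `h` form used here follows. The function name is an assumption, and the rectangles are plain stand-ins rather than SDK objects.

```python
from types import SimpleNamespace


def intersection_over_union(a, b):
    """Return the IoU of two rectangles that expose x, y, w and h."""
    # Corners of the overlapping region (empty if the boxes do not overlap).
    left = max(a.x, b.x)
    top = max(a.y, b.y)
    right = min(a.x + a.w, b.x + b.w)
    bottom = min(a.y + a.h, b.y + b.h)

    overlap = max(0, right - left) * max(0, bottom - top)
    union = a.w * a.h + b.w * b.h - overlap
    return overlap / union if union else 0.0


rect_a = SimpleNamespace(x=0, y=0, w=10, h=10)
rect_b = SimpleNamespace(x=5, y=0, w=10, h=10)
print(f"{intersection_over_union(rect_a, rect_b):.3f}")  # 0.333
```

An IoU of 0 means the rectangles are disjoint and 1 means they coincide, so a cutoff on this value can flag detections that likely cover the same region.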

## Response Data Types


### DetectResult

```python { .api }
class DetectResult:
    """
    Object detection operation result.

    Attributes:
        objects (list[DetectedObject]): List of detected objects with locations
        request_id (str): Request identifier
        metadata (ImageMetadata): Image metadata (dimensions, format)
        model_version (str): Model version used for detection
    """
```

### DetectedObject

```python { .api }
class DetectedObject:
    """
    Individual detected object with location and hierarchy information.

    Attributes:
        rectangle (BoundingRect): Object bounding rectangle
        object_property (str): Object name/type (e.g., "person", "car", "bicycle")
        confidence (float): Detection confidence score (0.0 to 1.0)
        parent (ObjectHierarchy, optional): Parent object in hierarchy
    """
```

### BoundingRect

```python { .api }
class BoundingRect:
    """
    Rectangular bounding box for detected objects.

    Attributes:
        x (int): Left coordinate (pixels from left edge)
        y (int): Top coordinate (pixels from top edge)
        w (int): Rectangle width in pixels
        h (int): Rectangle height in pixels
    """
```
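
Many imaging tools expect corner coordinates rather than an origin plus a size. The sketch below converts the `x`, `y`, `w`, `h` form into `(left, top, right, bottom)` and computes the center point. `to_corners` and `center` are hypothetical helper names, and the rectangle is a stand-in object, not an SDK instance.

```python
from types import SimpleNamespace


def to_corners(rect):
    """Convert x, y, w, h into (left, top, right, bottom) pixel coordinates."""
    return (rect.x, rect.y, rect.x + rect.w, rect.y + rect.h)


def center(rect):
    """Return the (x, y) center of the rectangle in pixels."""
    return (rect.x + rect.w / 2, rect.y + rect.h / 2)


rect = SimpleNamespace(x=20, y=30, w=100, h=50)
print(to_corners(rect))  # (20, 30, 120, 80)
print(center(rect))      # (70.0, 55.0)
```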

### ObjectHierarchy

```python { .api }
class ObjectHierarchy:
    """
    Parent object information in object hierarchy.

    Attributes:
        object_property (str): Parent object name/type
        confidence (float): Parent object confidence score
        parent (ObjectHierarchy, optional): Next, more generic ancestor in the hierarchy
    """
```
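
To see the full chain of increasingly generic labels for a detection, the parent links can be followed until none remain. This is a sketch under the assumption that a parent may itself carry a `parent` attribute. `taxonomy_path` is a hypothetical helper, and the sample labels and scores are invented stand-ins.

```python
from types import SimpleNamespace


def taxonomy_path(obj):
    """Return labels from the detected object up to its most generic ancestor."""
    labels = []
    node = obj
    while node is not None:
        labels.append(node.object_property)
        # getattr guards against nodes that carry no `parent` attribute.
        node = getattr(node, "parent", None)
    return labels


# Stand-in hierarchy: bulldog -> dog -> animal (illustrative values only).
animal = SimpleNamespace(object_property="animal", confidence=0.95, parent=None)
dog = SimpleNamespace(object_property="dog", confidence=0.90, parent=animal)
bulldog = SimpleNamespace(object_property="bulldog", confidence=0.85, parent=dog)
print(" > ".join(taxonomy_path(bulldog)))  # bulldog > dog > animal
```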

### ImageMetadata

```python { .api }
class ImageMetadata:
    """
    Image metadata information.

    Attributes:
        height (int): Image height in pixels
        width (int): Image width in pixels
        format (str): Image format (e.g., "Jpeg", "Png")
    """
```
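
The image dimensions make it possible to express a bounding rectangle as fractions of the image, which stays valid if the image is later resized. The sketch below shows one way to do that. `normalize_rect` is a hypothetical helper, and both arguments are stand-ins with the same attribute names as `BoundingRect` and `ImageMetadata`.

```python
from types import SimpleNamespace


def normalize_rect(rect, metadata):
    """Scale a pixel rectangle to fractions of the image width and height."""
    return (
        rect.x / metadata.width,
        rect.y / metadata.height,
        rect.w / metadata.width,
        rect.h / metadata.height,
    )


metadata = SimpleNamespace(width=800, height=600, format="Jpeg")
rect = SimpleNamespace(x=200, y=150, w=400, h=300)
print(normalize_rect(rect, metadata))  # (0.25, 0.25, 0.5, 0.5)
```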

## Common Object Types


The object detection service can identify many common objects including:

- **People and Body Parts**: person, face, hand
- **Vehicles**: car, truck, bus, motorcycle, bicycle, airplane, train
- **Animals**: dog, cat, horse, bird, cow, sheep
- **Furniture**: chair, table, couch, bed, desk
- **Electronics**: computer, laptop, cell phone, keyboard, mouse, tv, remote
- **Kitchen Items**: refrigerator, oven, microwave, sink, cup, bowl, plate
- **Sports**: ball, racket, skateboard, skis, snowboard
- **Food**: pizza, sandwich, apple, banana, orange, carrot
- **Clothing**: hat, shirt, pants, shoes, tie, handbag, suitcase
- **Nature**: tree, flower, grass, rock, mountain, ocean
- **Buildings and Infrastructure**: building, house, bridge, road, street sign
- **Transportation**: traffic light, stop sign, parking meter, bench

The set of recognized objects grows as the service is updated, so treat this list as illustrative rather than exhaustive. Use the confidence score of each detection to judge how reliable it is.