# Azure Cognitive Services Computer Vision

The Microsoft Azure Cognitive Services Computer Vision client library for Python sends images to the Computer Vision service and returns structured information about them. With it, developers can analyze images for mature content detection, face detection, color analysis, image categorization, description generation, and smart-cropped thumbnail creation.

**Note**: This package has been deprecated as of November 1, 2024, and is replaced by azure-ai-vision-imageanalysis.

## Package Information

- **Package Name**: azure-cognitiveservices-vision-computervision
- **Language**: Python
- **Installation**: `pip install azure-cognitiveservices-vision-computervision`
- **Dependencies**: `msrest>=0.6.21`, `azure-common~=1.1`

## Core Imports

```python
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision import ComputerVisionClientConfiguration
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
```

## Basic Usage

```python
from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from azure.cognitiveservices.vision.computervision.models import VisualFeatureTypes
from msrest.authentication import CognitiveServicesCredentials

# Initialize client
credentials = CognitiveServicesCredentials("your-api-key")
client = ComputerVisionClient("https://your-endpoint.cognitiveservices.azure.com/", credentials)

# Analyze an image
image_url = "https://example.com/image.jpg"
visual_features = [VisualFeatureTypes.categories, VisualFeatureTypes.description, VisualFeatureTypes.faces]

analysis = client.analyze_image(image_url, visual_features=visual_features)

# Access results (the caption list can be empty, so check before indexing)
if analysis.description.captions:
    print(f"Description: {analysis.description.captions[0].text}")
print(f"Categories: {[cat.name for cat in analysis.categories]}")
print(f"Faces detected: {len(analysis.faces)}")
```

## Architecture

The Computer Vision API provides a comprehensive set of image analysis capabilities through a single client interface:

- **ComputerVisionClient**: Main client class providing all image analysis operations
- **Analysis Pipeline**: Supports both URL-based and binary stream image input
- **Feature Selection**: Configurable visual features and analysis details
- **Asynchronous Operations**: Long-running text recognition with operation polling
- **Domain Models**: Specialized analysis for celebrities and landmarks

## Capabilities

### Image Analysis

Comprehensive image analysis including content categorization, description generation, face detection, color analysis, and object recognition. Supports multiple visual features in a single API call.

```python { .api }
def analyze_image(url, visual_features=None, details=None, language="en", description_exclude=None, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Extract rich visual features from image content.

    Args:
        url (str): Publicly reachable URL of an image
        visual_features (list[VisualFeatureTypes], optional): Visual feature types to return
        details (list[Details], optional): Domain-specific details (Celebrities, Landmarks)
        language (str, optional): Output language ("en", "es", "ja", "pt", "zh")
        description_exclude (list[DescriptionExclude], optional): Domain models to exclude
        model_version (str, optional): AI model version ("latest", "2021-04-01")

    Returns:
        ImageAnalysis: Complete analysis results
    """

def analyze_image_in_stream(image, visual_features=None, details=None, language="en", description_exclude=None, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Analyze image from binary stream.

    Args:
        image (Generator): Binary image data stream

    Returns:
        ImageAnalysis: Complete analysis results
    """
```

[Image Analysis](./image-analysis.md)

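Because the URL and stream variants take the same analysis options, a thin wrapper can pick between them. The sketch below is one possible way to do that; `analyze_any` is a hypothetical helper, not part of the SDK, and `client` is assumed to be a `ComputerVisionClient`.

```python
from pathlib import Path


def analyze_any(client, source, visual_features=None):
    """Analyze an image given either an http(s) URL or a local file path.

    Hypothetical helper: `client` is assumed to expose analyze_image and
    analyze_image_in_stream as documented above.
    """
    if str(source).lower().startswith(("http://", "https://")):
        # Remote image: the service fetches it from the URL.
        return client.analyze_image(source, visual_features=visual_features)

    # Local image: upload the bytes through the *_in_stream variant.
    with Path(source).open("rb") as image_stream:
        return client.analyze_image_in_stream(
            image_stream, visual_features=visual_features
        )
```

The file is opened in binary mode and closed as soon as the call returns, so the caller only ever handles the `ImageAnalysis` result.
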
### Object Detection

Detect and locate objects within an image. Each detected object comes with a bounding box and a confidence score.

```python { .api }
def detect_objects(url, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Detect objects within an image.

    Args:
        url (str): Publicly reachable URL of an image
        model_version (str, optional): AI model version

    Returns:
        DetectResult: Object detection results with bounding boxes
    """

def detect_objects_in_stream(image, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Detect objects from binary stream.

    Returns:
        DetectResult: Object detection results
    """
```

[Object Detection](./object-detection.md)

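A detection result is usually filtered and flattened before use. The following sketch shows one way to do that. It assumes each detected object exposes `object_property`, `confidence`, and a `rectangle` with `x`, `y`, `w`, `h` (the `BoundingRect` fields listed under Supporting Data Types); the attribute names on the object itself are not stated in this document, so treat them as assumptions. `summarize_objects` is a hypothetical helper.

```python
def summarize_objects(detect_result, min_confidence=0.5):
    """Return (label, confidence, (left, top, right, bottom)) tuples.

    Assumes detect_result.objects is a list whose items carry
    object_property, confidence, and rectangle (x, y, w, h).
    """
    summary = []
    for obj in detect_result.objects or []:
        if obj.confidence < min_confidence:
            continue  # drop low-confidence detections
        rect = obj.rectangle
        # Convert origin + size into corner coordinates.
        box = (rect.x, rect.y, rect.x + rect.w, rect.y + rect.h)
        summary.append((obj.object_property, obj.confidence, box))
    return summary
```

With a real client this would be called as `summarize_objects(client.detect_objects(image_url))`.
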
### OCR and Text Recognition

Extract text from images using either the synchronous OCR operation (printed text only) or the asynchronous Read API (printed and handwritten text).

```python { .api }
def recognize_printed_text(detect_orientation, url, language=None, custom_headers=None, raw=False, **operation_config):
    """
    Perform OCR on printed text in images.

    Args:
        detect_orientation (bool): Whether to detect text orientation
        url (str): Publicly reachable URL of an image
        language (str, optional): OCR language code

    Returns:
        OcrResult: OCR results with text regions and words
    """

def read(url, language=None, pages=None, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Read text from image (asynchronous operation).

    Args:
        url (str): Publicly reachable URL of an image
        language (str, optional): Text language for recognition
        pages (list[int], optional): Page numbers to process

    Returns:
        None by default. Call with raw=True to get the raw response, whose
        "Operation-Location" header holds the URL to poll for status.
    """

def get_read_result(operation_id, custom_headers=None, raw=False, **operation_config):
    """
    Get result of read operation.

    Args:
        operation_id (str): Operation ID from read operation

    Returns:
        ReadOperationResult: Text recognition results
    """
```

[OCR and Text Recognition](./ocr-text-recognition.md)

### Image Description

Generate human-readable descriptions of image content as complete sentences. Output is in English by default; the `language` parameter selects another supported language.

```python { .api }
def describe_image(url, max_candidates=None, language="en", description_exclude=None, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate description of image content.

    Args:
        url (str): Publicly reachable URL of an image
        max_candidates (int, optional): Maximum description candidates to return
        language (str, optional): Output language

    Returns:
        ImageDescription: Generated descriptions with confidence scores
    """
```

[Image Description](./image-description.md)

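When `max_candidates` is greater than one, the caller has to choose among several captions. A minimal sketch of that choice follows, assuming each caption has `text` and `confidence` attributes (the `captions[0].text` access appears in Basic Usage; `confidence` is an assumption here). `best_caption` is a hypothetical helper.

```python
def best_caption(description, min_confidence=0.0):
    """Return the highest-confidence caption text, or None if none qualifies."""
    candidates = [
        caption
        for caption in (description.captions or [])
        if caption.confidence >= min_confidence
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda caption: caption.confidence).text
```

Returning `None` rather than raising keeps the empty-caption case explicit for the caller, for example `best_caption(client.describe_image(image_url, max_candidates=3)) or "no description"`.
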
### Image Tagging

Generate detailed tags for image content with confidence scores.

```python { .api }
def tag_image(url, language="en", model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate tags for image content.

    Args:
        url (str): Publicly reachable URL of an image
        language (str, optional): Output language

    Returns:
        TagResult: Generated tags with confidence scores
    """
```

[Image Tagging](./image-tagging.md)

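Tag lists tend to include many low-confidence entries, so a threshold is a common post-processing step. The sketch below assumes the result has a `tags` list whose items carry `name` and `confidence`; `confident_tags` and the 0.8 default are illustrative choices, not SDK behavior.

```python
def confident_tags(tag_result, threshold=0.8):
    """Return tag names at or above `threshold`, highest confidence first."""
    kept = [tag for tag in (tag_result.tags or []) if tag.confidence >= threshold]
    kept.sort(key=lambda tag: tag.confidence, reverse=True)
    return [tag.name for tag in kept]
```

With a real client: `confident_tags(client.tag_image(image_url), threshold=0.9)`.
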
### Thumbnail Generation

Generate a thumbnail at a requested size, optionally using smart cropping to keep the most important region of the image in frame.

```python { .api }
def generate_thumbnail(width, height, url, smart_cropping=None, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate thumbnail image with smart cropping.

    Args:
        width (int): Thumbnail width in pixels
        height (int): Thumbnail height in pixels
        url (str): Source image URL
        smart_cropping (bool, optional): Enable smart cropping algorithm

    Returns:
        Generator: Binary image data stream
    """
```

[Thumbnail Generation](./thumbnail-generation.md)

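Since the thumbnail comes back as a stream of byte chunks rather than a single `bytes` object, it has to be consumed chunk by chunk. A minimal sketch of writing it to disk follows; `save_stream` is a hypothetical helper that works with any iterable of `bytes`.

```python
from pathlib import Path


def save_stream(chunks, destination):
    """Write an iterable of byte chunks to `destination`; return bytes written."""
    written = 0
    with Path(destination).open("wb") as out:
        for chunk in chunks:
            out.write(chunk)
            written += len(chunk)
    return written
```

With a real client: `save_stream(client.generate_thumbnail(100, 100, image_url, smart_cropping=True), "thumb.jpg")`. The output file name and extension here are placeholders.
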
### Domain-Specific Analysis

Specialized analysis using domain-specific models for celebrity and landmark recognition.

```python { .api }
def analyze_image_by_domain(model, url, language="en", custom_headers=None, raw=False, **operation_config):
    """
    Analyze image using domain-specific model.

    Args:
        model (str): Domain model name ("celebrities" or "landmarks")
        url (str): Publicly reachable URL of an image
        language (str, optional): Output language

    Returns:
        DomainModelResults: Domain-specific analysis results
    """

def list_models(custom_headers=None, raw=False, **operation_config):
    """
    List available domain models.

    Returns:
        ListModelsResult: Available domain models
    """
```

[Domain-Specific Analysis](./domain-analysis.md)

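Domain results are less strongly typed than the other responses. The sketch below assumes `DomainModelResults` has a `result` attribute holding a dictionary keyed by the model name (`"celebrities"` or `"landmarks"`), with each entry a dictionary containing `"name"` and `"confidence"`. That payload shape is an assumption based on the service's JSON responses and is not stated in this document. `names_from_domain_result` is a hypothetical helper.

```python
def names_from_domain_result(domain_result, key, min_confidence=0.0):
    """Extract (name, confidence) pairs from a domain-model result payload.

    Assumes domain_result.result looks like
    {"landmarks": [{"name": ..., "confidence": ...}, ...]}.
    """
    payload = domain_result.result or {}
    pairs = []
    for entry in payload.get(key, []):
        confidence = entry.get("confidence", 0.0)
        if confidence >= min_confidence:
            pairs.append((entry.get("name"), confidence))
    return pairs
```

With a real client: `names_from_domain_result(client.analyze_image_by_domain("landmarks", image_url), "landmarks")`.
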
### Area of Interest

Identify the most important rectangular area within an image for cropping or focus.

```python { .api }
def get_area_of_interest(url, model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Get area of interest in image for optimal cropping.

    Args:
        url (str): Publicly reachable URL of an image

    Returns:
        AreaOfInterestResult: Bounding rectangle of interest area
    """
```

[Area of Interest](./area-of-interest.md)

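To crop with the returned rectangle, the origin-plus-size form (`x`, `y`, `w`, `h`, as in `BoundingRect`) usually needs converting to corner coordinates. The sketch below does that and clamps the box to the image bounds. `crop_box` is a hypothetical helper, and reading the rectangle from `result.area_of_interest` is an assumption about the result object.

```python
def crop_box(rect, image_width, image_height):
    """Convert an x/y/w/h rectangle to (left, top, right, bottom), clamped to the image."""
    left = max(0, rect.x)
    top = max(0, rect.y)
    right = min(image_width, rect.x + rect.w)
    bottom = min(image_height, rect.y + rect.h)
    return (left, top, right, bottom)
```

The returned tuple uses the same (left, upper, right, lower) ordering that Pillow's `Image.crop` expects, so it can be passed straight through if Pillow is the imaging library in use.
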
## Core Data Types

### ComputerVisionClient

```python { .api }
class ComputerVisionClient:
    """
    Main client for Computer Vision API operations.

    Attributes:
        config (ComputerVisionClientConfiguration): Client configuration
        api_version (str): API version ("3.2")
    """

    def __init__(self, endpoint, credentials):
        """
        Initialize Computer Vision client.

        Args:
            endpoint (str): Cognitive Services endpoint URL
            credentials: Subscription credentials
        """
```

### ComputerVisionClientConfiguration

```python { .api }
class ComputerVisionClientConfiguration:
    """
    Configuration for ComputerVisionClient.

    Attributes:
        endpoint (str): Service endpoint URL
        credentials: Authentication credentials
        keep_alive (bool): Connection pool setting
    """

    def __init__(self, endpoint, credentials):
        """
        Initialize client configuration.

        Args:
            endpoint (str): Cognitive Services endpoint URL
            credentials: Subscription credentials
        """
```

### ImageAnalysis

```python { .api }
class ImageAnalysis:
    """
    Complete image analysis results.

    Attributes:
        categories (list[Category]): Image categories with confidence scores
        adult (AdultInfo): Adult content detection results
        tags (list[ImageTag]): Generated tags with confidence
        description (ImageDescription): Generated descriptions
        faces (list[FaceDescription]): Detected faces with demographics
        color (ColorInfo): Color analysis results
        image_type (ImageType): Image type classification
        objects (list[DetectedObject]): Detected objects with locations
        brands (list[DetectedBrand]): Detected brands with locations
        request_id (str): Request identifier
        metadata (ImageMetadata): Image metadata
        model_version (str): Model version used
    """
```

## Enumerations

### VisualFeatureTypes

```python { .api }
class VisualFeatureTypes(str, Enum):
    """Visual features available for image analysis."""

    image_type = "ImageType"
    faces = "Faces"
    adult = "Adult"
    categories = "Categories"
    color = "Color"
    tags = "Tags"
    description = "Description"
    objects = "Objects"
    brands = "Brands"
```

### Details

```python { .api }
class Details(str, Enum):
    """Domain-specific detail types."""

    celebrities = "Celebrities"
    landmarks = "Landmarks"
```

### DescriptionExclude

```python { .api }
class DescriptionExclude(str, Enum):
    """Domain models to exclude from descriptions."""

    celebrities = "Celebrities"
    landmarks = "Landmarks"
```

### OCR and Text Recognition Enums

```python { .api }
class OcrDetectionLanguage(str, Enum):
    """Languages supported for OCR detection."""

    zh_hans = "zh-Hans"
    zh_hant = "zh-Hant"
    cs = "cs"
    da = "da"
    nl = "nl"
    en = "en"
    fi = "fi"
    fr = "fr"
    de = "de"
    el = "el"
    hu = "hu"
    it = "it"
    ja = "ja"
    ko = "ko"
    nb = "nb"
    pl = "pl"
    pt = "pt"
    ru = "ru"
    es = "es"
    sv = "sv"
    tr = "tr"
    ar = "ar"
    ro = "ro"
    sr_cyrl = "sr-Cyrl"
    sr_latn = "sr-Latn"
    sk = "sk"

class OperationStatusCodes(str, Enum):
    """Status codes for asynchronous operations."""

    not_started = "notStarted"
    running = "running"
    succeeded = "succeeded"
    failed = "failed"

class TextStyle(str, Enum):
    """Text style types for text recognition."""

    handwriting = "handwriting"
    print = "print"

class TextRecognitionResultDimensionUnit(str, Enum):
    """Dimension units for text recognition results."""

    pixel = "pixel"
    inch = "inch"
```

### Supporting Data Types

```python { .api }
class ImageUrl:
    """
    Image URL wrapper for API requests.

    Attributes:
        url (str): Publicly reachable URL of an image
    """

    def __init__(self, url):
        """
        Initialize with image URL.

        Args:
            url (str): Image URL
        """

class ImageMetadata:
    """
    Image metadata information.

    Attributes:
        height (int): Image height in pixels
        width (int): Image width in pixels
        format (str): Image format (e.g., 'Jpeg', 'Png')
    """

class BoundingRect:
    """
    Rectangular bounding box coordinates.

    Attributes:
        x (int): Left coordinate (pixels from left edge)
        y (int): Top coordinate (pixels from top edge)
        w (int): Rectangle width in pixels
        h (int): Rectangle height in pixels
    """
```

### Additional Model Types

```python { .api }
class CategoryDetail:
    """Additional details for image categories."""
    pass

class CelebritiesModel:
    """Celebrity recognition model information."""
    pass

class LandmarksModel:
    """Landmark recognition model information."""
    pass

class ComputerVisionError:
    """
    Computer Vision API error information.

    Attributes:
        code (str): Error code
        message (str): Error message
        inner_error (ComputerVisionInnerError): Detailed error information
    """

class ComputerVisionInnerError:
    """
    Detailed error information.

    Attributes:
        code (str): Specific error code
        message (str): Detailed error message
    """
```

## Error Handling

```python { .api }
class ComputerVisionErrorResponseException(Exception):
    """Exception for Computer Vision API errors."""
    pass

class ComputerVisionOcrErrorException(Exception):
    """Exception for OCR operation errors."""
    pass
```