tessl/pypi-azure-cognitiveservices-vision-computervision

Microsoft Azure Cognitive Services Computer Vision Client Library for Python providing state-of-the-art algorithms to process images and return information including mature content detection, face detection, color analysis, image categorization, description generation, and thumbnail creation.

Image Tagging

Name: tessl/pypi-azure-cognitiveservices-vision-computervision
Author: tessl

Generate detailed tags for image content with confidence scores. The tagging service identifies objects, concepts, activities, and attributes present in images, providing a comprehensive set of keywords that describe the visual content.

Capabilities

Image Tag Generation

Extract comprehensive tags that describe various aspects of image content including objects, activities, settings, and visual attributes.

def tag_image(url, language="en", model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate tags for image content.
    
    Args:
        url (str): Publicly reachable URL of an image
        language (str, optional): Output language for tags.
            Supported: "en", "es", "ja", "pt", "zh". Default: "en"
        model_version (str, optional): AI model version. Default: "latest"
        custom_headers (dict, optional): Custom HTTP headers
        raw (bool, optional): Return raw response. Default: False
        
    Returns:
        TagResult: Generated tags with confidence scores
        
    Raises:
        ComputerVisionErrorResponseException: API error occurred
    """

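def tag_image_in_stream(image, language="en", model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate tags from a binary image stream.
    
    Args:
        image (Generator): Binary image data stream, e.g. a file opened in "rb" mode
        language (str, optional): Output language for tags.
            Supported: "en", "es", "ja", "pt", "zh". Default: "en"
        model_version (str, optional): AI model version. Default: "latest"
        custom_headers (dict, optional): Custom HTTP headers
        raw (bool, optional): Return raw response. Default: False
        
    Returns:
        TagResult: Generated tags with confidence scores and metadata
        
    Raises:
        ComputerVisionErrorResponseException: API error occurred
    """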

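Both methods accept only the language codes listed in the `tag_image` docstring, so it can be convenient to reject an unsupported code before making a network call. The following is a minimal sketch; `SUPPORTED_TAG_LANGUAGES` and `normalize_tag_language` are hypothetical names introduced here for illustration and are not part of the SDK.

```python
# Language codes listed in the tag_image docstring above.
SUPPORTED_TAG_LANGUAGES = ("en", "es", "ja", "pt", "zh")


def normalize_tag_language(language="en"):
    """Return a lower-cased language code, or raise ValueError if unsupported.

    Hypothetical helper: not part of the Azure SDK.
    """
    code = language.strip().lower()
    if code not in SUPPORTED_TAG_LANGUAGES:
        raise ValueError(
            f"Unsupported tag language {language!r}; "
            f"expected one of {', '.join(SUPPORTED_TAG_LANGUAGES)}"
        )
    return code
```

The returned code could then be passed as the `language` argument, for example `client.tag_image(image_url, language=normalize_tag_language("ES"))`.
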
Usage Examples

Basic Image Tagging

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

# Initialize client
credentials = CognitiveServicesCredentials("your-api-key")
client = ComputerVisionClient("https://your-endpoint.cognitiveservices.azure.com/", credentials)

# Generate tags for image
image_url = "https://example.com/beach-scene.jpg"
tag_result = client.tag_image(image_url)

print(f"Generated {len(tag_result.tags)} tags:")
for tag in tag_result.tags:
    print(f"  {tag.name}: {tag.confidence:.3f}")

Filtering Tags by Confidence

# Filter tags by confidence threshold
image_url = "https://example.com/complex-scene.jpg"
tag_result = client.tag_image(image_url)

# Define confidence thresholds
high_confidence = 0.8
medium_confidence = 0.5

high_conf_tags = [tag for tag in tag_result.tags if tag.confidence >= high_confidence]
medium_conf_tags = [tag for tag in tag_result.tags if medium_confidence <= tag.confidence < high_confidence]
low_conf_tags = [tag for tag in tag_result.tags if tag.confidence < medium_confidence]

print("High confidence tags:")
for tag in high_conf_tags:
    print(f"  {tag.name} ({tag.confidence:.3f})")

print(f"\nMedium confidence tags: {len(medium_conf_tags)}")
print(f"Low confidence tags: {len(low_conf_tags)}")

Multilingual Tagging

# Generate tags in different languages
image_url = "https://example.com/street-food.jpg"

languages = ["en", "es", "ja"]
all_tags = {}

for lang in languages:
    try:
        tag_result = client.tag_image(image_url, language=lang)
        all_tags[lang] = [(tag.name, tag.confidence) for tag in tag_result.tags[:5]]  # Top 5 tags
    except Exception as e:
        print(f"Failed to get tags in {lang}: {e}")

# Display results
for lang, tags in all_tags.items():
    print(f"\n{lang.upper()} tags:")
    for name, confidence in tags:
        print(f"  {name} ({confidence:.3f})")

Tag Analysis and Categorization

# Analyze tags by category/type
tag_result = client.tag_image(image_url)

# Categorize tags (example categorization)
categories = {
    'objects': [],
    'people': [],
    'activities': [],
    'settings': [],
    'colors': []
}

# Simple keyword-based categorization (you can expand this).
# Whole words are compared so that, for example, 'red' does not match 'colored'.
for tag in tag_result.tags:
    words = set(tag.name.lower().split())
    if words & {'person', 'man', 'woman', 'child', 'people'}:
        categories['people'].append(tag)
    elif words & {'red', 'blue', 'green', 'yellow', 'black', 'white'}:
        categories['colors'].append(tag)
    elif words & {'indoor', 'outdoor', 'room', 'kitchen', 'street'}:
        categories['settings'].append(tag)
    elif words & {'sitting', 'standing', 'walking', 'running', 'playing'}:
        categories['activities'].append(tag)
    else:
        # Anything that matches no keyword is treated as an object by default
        categories['objects'].append(tag)

# Display categorized results
for category, tags in categories.items():
    if tags:
        print(f"\n{category.upper()}:")
        for tag in tags:
            print(f"  {tag.name} ({tag.confidence:.3f})")

Local File Tagging

# Generate tags from local image file
with open("local_image.jpg", "rb") as image_stream:
    tag_result = client.tag_image_in_stream(image_stream)
    
    # Get top tags above threshold
    threshold = 0.7
    top_tags = [tag for tag in tag_result.tags if tag.confidence >= threshold]
    
    if top_tags:
        print(f"Top tags (confidence ≥ {threshold}):")
        for tag in top_tags:
            print(f"  {tag.name}: {tag.confidence:.3f}")
    else:
        print(f"No tags found above confidence threshold {threshold}")
        print("First 10 tags returned:")
        for tag in tag_result.tags[:10]:  # Show at most 10
            print(f"  {tag.name}: {tag.confidence:.3f}")
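
When the image is already in memory (for example, downloaded or generated bytes) rather than on disk, one option is to wrap it in `io.BytesIO` so it offers the same file-like interface as `open(..., "rb")`. This is a sketch under that assumption; `tag_image_bytes` is a hypothetical wrapper, not an SDK function, and it only relies on the `tag_image_in_stream` method documented above.

```python
import io


def tag_image_bytes(client, image_bytes, **kwargs):
    """Tag an image held in memory as bytes.

    `client` is assumed to expose tag_image_in_stream as documented above;
    this wrapper itself is hypothetical and not part of the SDK.
    """
    if not image_bytes:
        raise ValueError("image_bytes is empty")
    # BytesIO gives the bytes the same file-like interface as open(..., "rb")
    stream = io.BytesIO(image_bytes)
    return client.tag_image_in_stream(stream, **kwargs)
```

Extra keyword arguments such as `language="es"` are forwarded unchanged to `tag_image_in_stream`.
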

Batch Tagging with Aggregation

# Process multiple images and aggregate tags
image_urls = [
    "https://example.com/image1.jpg",
    "https://example.com/image2.jpg",
    "https://example.com/image3.jpg"
]

all_tags = {}  # Dictionary to aggregate tags across images

for i, url in enumerate(image_urls):
    try:
        tag_result = client.tag_image(url)
        print(f"Processed image {i+1}/{len(image_urls)}")
        
        # Aggregate tags
        for tag in tag_result.tags:
            if tag.name in all_tags:
                all_tags[tag.name].append(tag.confidence)
            else:
                all_tags[tag.name] = [tag.confidence]
                
    except Exception as e:
        print(f"Error processing {url}: {e}")

# Calculate average confidence for each tag
tag_averages = {
    tag_name: sum(confidences) / len(confidences)
    for tag_name, confidences in all_tags.items()
}

# Sort by frequency first, then by average confidence (both descending)
popular_tags = sorted(
    [(name, avg_conf, len(all_tags[name])) for name, avg_conf in tag_averages.items()],
    key=lambda x: (x[2], x[1]),  # Sort by frequency, then confidence
    reverse=True
)

print("\nMost common tags across all images:")
for tag_name, avg_confidence, frequency in popular_tags[:10]:
    print(f"  {tag_name}: appears in {frequency} images, avg confidence {avg_confidence:.3f}")
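
The aggregation step can also be factored into a pure function, which makes it testable without calling the service. The sketch below assumes each input is a list of objects with `name` and `confidence` attributes, as `ImageTag` has; the `Tag` namedtuple and `aggregate_tags` are stand-ins introduced here so the example runs offline, and the sample values are made up.

```python
from collections import defaultdict, namedtuple

# Stand-in for the SDK's ImageTag, used only to make this sketch runnable offline.
Tag = namedtuple("Tag", ["name", "confidence"])


def aggregate_tags(tag_lists):
    """Return (name, average confidence, image count) tuples, most frequent first."""
    confidences = defaultdict(list)
    for tags in tag_lists:
        for tag in tags:
            confidences[tag.name].append(tag.confidence)
    summary = [
        (name, sum(values) / len(values), len(values))
        for name, values in confidences.items()
    ]
    # Frequency first, then average confidence, both descending
    summary.sort(key=lambda item: (item[2], item[1]), reverse=True)
    return summary


# Made-up sample data: one inner list per image
sample = [
    [Tag("beach", 0.9), Tag("sky", 0.8)],
    [Tag("beach", 0.7), Tag("person", 0.95)],
]
for name, avg, count in aggregate_tags(sample):
    print(f"{name}: {count} image(s), avg confidence {avg:.3f}")
```

With real results, the input would be something like `[result.tags for result in tag_results]`. The count equals the number of images only if each tag name appears at most once per image.
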

Tag Hints Analysis

# Analyze tag hints for additional context
tag_result = client.tag_image(image_url)

print("Tags with hints:")
for tag in tag_result.tags:
    if hasattr(tag, 'hint') and tag.hint:
        print(f"  {tag.name} ({tag.confidence:.3f}) - Hint: {tag.hint}")
    else:
        print(f"  {tag.name} ({tag.confidence:.3f})")

Response Data Types

TagResult

class TagResult:
    """
    Image tagging operation result.
    
    Attributes:
        tags (list[ImageTag]): Generated tags with confidence scores
        request_id (str): Request identifier
        metadata (ImageMetadata): Image metadata (dimensions, format)
        model_version (str): AI model version used for tagging
    """

ImageTag

class ImageTag:
    """
    Individual image tag with confidence score.
    
    Attributes:
        name (str): Tag name/label describing image content
        confidence (float): Confidence score for the tag (0.0 to 1.0)
        hint (str, optional): Additional context or hint about the tag
    """

ImageMetadata

class ImageMetadata:
    """
    Image metadata information.
    
    Attributes:
        height (int): Image height in pixels
        width (int): Image width in pixels
        format (str): Image format (e.g., "Jpeg", "Png")
    """

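To store or log a result, it may help to flatten these types into plain dictionaries. The following is a minimal sketch that reads only the attributes documented above; `tag_result_to_dict` is a hypothetical helper, and the `SimpleNamespace` object is an offline stand-in with made-up values rather than a real service response.

```python
import json
from types import SimpleNamespace


def tag_result_to_dict(result):
    """Convert a TagResult-like object into plain, JSON-serializable data."""
    metadata = getattr(result, "metadata", None)
    return {
        "request_id": getattr(result, "request_id", None),
        "model_version": getattr(result, "model_version", None),
        "metadata": None if metadata is None else {
            "width": metadata.width,
            "height": metadata.height,
            "format": metadata.format,
        },
        "tags": [
            {
                "name": tag.name,
                "confidence": tag.confidence,
                "hint": getattr(tag, "hint", None),
            }
            for tag in (result.tags or [])
        ],
    }


# Offline stand-in shaped like the documented TagResult (values are made up)
fake_result = SimpleNamespace(
    request_id="example-request-id",
    model_version="latest",
    metadata=SimpleNamespace(width=640, height=480, format="Jpeg"),
    tags=[SimpleNamespace(name="beach", confidence=0.98, hint=None)],
)
print(json.dumps(tag_result_to_dict(fake_result), indent=2))
```
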
Tag Categories

The tagging service identifies various types of content:

Objects and Items

Physical objects (car, building, tree, computer, phone)
Food items (pizza, apple, coffee, sandwich)
Clothing (shirt, hat, shoes, jacket)
Furniture (chair, table, bed, couch)

People and Body Parts

People descriptors (person, man, woman, child)
Body parts (face, hand, eye, hair)
Demographics (adult, young, elderly)

Activities and Actions

Actions (sitting, standing, walking, running, eating)
Activities (playing, working, cooking, reading)
Sports (tennis, soccer, swimming, cycling)

Settings and Locations

Indoor/outdoor classification
Specific locations (kitchen, office, park, street)
Environments (urban, rural, natural, architectural)

Visual Attributes

Colors (red, blue, colorful, black and white)
Lighting (bright, dark, sunny, shadowy)
Composition (close-up, wide shot, portrait)
Style (modern, vintage, artistic)

Abstract Concepts

Emotions and mood (happy, peaceful, busy)
Qualities (beautiful, interesting, professional)
Concepts (transportation, technology, nature)

Tag Quality and Confidence

Confidence Score Interpretation

0.9-1.0: Extremely confident - tag is almost certainly present
0.8-0.9: High confidence - tag is very likely accurate
0.7-0.8: Good confidence - tag is probably correct
0.5-0.7: Moderate confidence - tag may be present but uncertain
Below 0.5: Low confidence - tag presence is questionable
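
The bands above can be expressed as a small lookup function. Because the listed ranges share their endpoints, this sketch assumes a boundary value belongs to the higher band (0.9 counts as "extremely confident"); that tie-break and the name `confidence_band` are choices made here for illustration.

```python
def confidence_band(score):
    """Map a confidence score in [0.0, 1.0] to the bands listed above.

    Boundary values are assigned to the higher band (0.9 counts as
    "extremely confident"); that tie-break is an assumption of this sketch.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be between 0.0 and 1.0, got {score}")
    if score >= 0.9:
        return "extremely confident"
    if score >= 0.8:
        return "high"
    if score >= 0.7:
        return "good"
    if score >= 0.5:
        return "moderate"
    return "low"
```

A typical use would be `confidence_band(tag.confidence)` when printing or grouping tags.
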

Best Practices

Use confidence thresholds appropriate for your application (typically 0.5-0.8)
Consider the number of tags returned (usually 10-50 per image)
Combine with other analysis features for comprehensive understanding
Use multilingual support when working with international content
Aggregate results across similar images for better accuracy
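
The first two practices, applying a confidence threshold and limiting how many tags are kept, can be combined in one helper. This is a sketch with illustrative defaults (0.5 and 10); `select_tags` and the `Tag` namedtuple are hypothetical names used so the example runs without the service, and the sample values are made up.

```python
from collections import namedtuple

Tag = namedtuple("Tag", ["name", "confidence"])  # offline stand-in for ImageTag


def select_tags(tags, min_confidence=0.5, max_tags=10):
    """Keep tags at or above min_confidence, highest first, capped at max_tags."""
    kept = [tag for tag in tags if tag.confidence >= min_confidence]
    kept.sort(key=lambda tag: tag.confidence, reverse=True)
    return kept[:max_tags]


# Made-up sample tags
tags = [Tag("sky", 0.99), Tag("kite", 0.42), Tag("beach", 0.87), Tag("dog", 0.61)]
print([tag.name for tag in select_tags(tags, min_confidence=0.6, max_tags=2)])
```

With a real response, the call would be `select_tags(tag_result.tags, ...)`, since `ImageTag` exposes the same `name` and `confidence` attributes.
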

Install with Tessl CLI

npx tessl i tessl/pypi-azure-cognitiveservices-vision-computervision
