
tessl/pypi-azure-cognitiveservices-vision-computervision

Microsoft Azure Cognitive Services Computer Vision Client Library for Python providing state-of-the-art algorithms to process images and return information including mature content detection, face detection, color analysis, image categorization, description generation, and thumbnail creation.

Image Tagging

Generate detailed tags for image content with confidence scores. The tagging service identifies objects, concepts, activities, and attributes present in images, providing a comprehensive set of keywords that describe the visual content.

Capabilities

Image Tag Generation

Extract comprehensive tags that describe various aspects of image content including objects, activities, settings, and visual attributes.

def tag_image(url, language="en", model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate tags for image content.
    
    Args:
        url (str): Publicly reachable URL of an image
        language (str, optional): Output language for tags.
            Supported: "en", "es", "ja", "pt", "zh". Default: "en"
        model_version (str, optional): AI model version. Default: "latest"
        custom_headers (dict, optional): Custom HTTP headers
        raw (bool, optional): Return raw response. Default: False
        
    Returns:
        TagResult: Generated tags with confidence scores
        
    Raises:
        ComputerVisionErrorResponseException: API error occurred
    """

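def tag_image_in_stream(image, language="en", model_version="latest", custom_headers=None, raw=False, **operation_config):
    """
    Generate tags from a binary image stream.
    
    Args:
        image (Generator): Binary image data stream (for example, a file opened in "rb" mode)
        language (str, optional): Output language for tags.
            Supported: "en", "es", "ja", "pt", "zh". Default: "en"
        model_version (str, optional): AI model version. Default: "latest"
        custom_headers (dict, optional): Custom HTTP headers
        raw (bool, optional): Return raw response. Default: False
        
    Returns:
        TagResult: Generated tags with confidence scores and metadata
        
    Raises:
        ComputerVisionErrorResponseException: API error occurred
    """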

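The two operations differ only in how the image is supplied. As a minimal sketch, a small dispatcher can choose between them based on whether the input looks like an http(s) URL or a local path. The helper name `tag_source` and the constant `SUPPORTED_TAG_LANGUAGES` are hypothetical and not part of the SDK; the sketch assumes only that `client` exposes `tag_image` and `tag_image_in_stream` with the signatures documented above.

```python
from urllib.parse import urlparse

# Language codes listed in the tag_image docstring above
# (hypothetical constant name, not an SDK symbol).
SUPPORTED_TAG_LANGUAGES = frozenset({"en", "es", "ja", "pt", "zh"})


def tag_source(client, source, language="en"):
    """Tag an image given either an http(s) URL or a local file path.

    Hypothetical helper: `client` is any object exposing tag_image and
    tag_image_in_stream as documented above.
    """
    if language not in SUPPORTED_TAG_LANGUAGES:
        raise ValueError(f"Unsupported tag language: {language!r}")

    if urlparse(source).scheme in ("http", "https"):
        # Remote image: pass the URL through to the service.
        return client.tag_image(source, language=language)

    # Local image: send the file contents as a binary stream.
    with open(source, "rb") as image_stream:
        return client.tag_image_in_stream(image_stream, language=language)
```

Validating the language code locally is a design choice in this sketch: it fails fast with a `ValueError` instead of spending a request on a value the docstring does not list.
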
Usage Examples

Basic Image Tagging

from azure.cognitiveservices.vision.computervision import ComputerVisionClient
from msrest.authentication import CognitiveServicesCredentials

# Initialize client
credentials = CognitiveServicesCredentials("your-api-key")
client = ComputerVisionClient("https://your-endpoint.cognitiveservices.azure.com/", credentials)

# Generate tags for image
image_url = "https://example.com/beach-scene.jpg"
tag_result = client.tag_image(image_url)

print(f"Generated {len(tag_result.tags)} tags:")
for tag in tag_result.tags:
    print(f"  {tag.name}: {tag.confidence:.3f}")

Filtering Tags by Confidence

# Filter tags by confidence threshold
image_url = "https://example.com/complex-scene.jpg"
tag_result = client.tag_image(image_url)

# Define confidence thresholds
high_confidence = 0.8
medium_confidence = 0.5

high_conf_tags = [tag for tag in tag_result.tags if tag.confidence >= high_confidence]
medium_conf_tags = [tag for tag in tag_result.tags if medium_confidence <= tag.confidence < high_confidence]
low_conf_tags = [tag for tag in tag_result.tags if tag.confidence < medium_confidence]

print("High confidence tags:")
for tag in high_conf_tags:
    print(f"  {tag.name} ({tag.confidence:.3f})")

print(f"\nMedium confidence tags: {len(medium_conf_tags)}")
print(f"Low confidence tags: {len(low_conf_tags)}")

Multilingual Tagging

# Generate tags in different languages
image_url = "https://example.com/street-food.jpg"

languages = ["en", "es", "ja"]
all_tags = {}

for lang in languages:
    try:
        tag_result = client.tag_image(image_url, language=lang)
        all_tags[lang] = [(tag.name, tag.confidence) for tag in tag_result.tags[:5]]  # Top 5 tags
    except Exception as e:
        print(f"Failed to get tags in {lang}: {e}")

# Display results
for lang, tags in all_tags.items():
    print(f"\n{lang.upper()} tags:")
    for name, confidence in tags:
        print(f"  {name} ({confidence:.3f})")

Tag Analysis and Categorization

# Analyze tags by category/type
tag_result = client.tag_image(image_url)

# Categorize tags (example categorization)
categories = {
    'objects': [],
    'people': [],
    'activities': [],
    'settings': [],
    'colors': [],
}

# Keyword sets for a simple categorization (you can expand these)
people_words = {'person', 'man', 'woman', 'child', 'people'}
color_words = {'red', 'blue', 'green', 'yellow', 'black', 'white'}
setting_words = {'indoor', 'outdoor', 'room', 'kitchen', 'street'}
activity_words = {'sitting', 'standing', 'walking', 'running', 'playing'}

# Match whole words so that, for example, 'red' does not match 'covered'
for tag in tag_result.tags:
    words = set(tag.name.lower().split())
    if words & people_words:
        categories['people'].append(tag)
    elif words & color_words:
        categories['colors'].append(tag)
    elif words & setting_words:
        categories['settings'].append(tag)
    elif words & activity_words:
        categories['activities'].append(tag)
    else:
        # Anything that matches no keyword set is treated as an object
        categories['objects'].append(tag)

# Display categorized results
for category, tags in categories.items():
    if tags:
        print(f"\n{category.upper()}:")
        for tag in tags:
            print(f"  {tag.name} ({tag.confidence:.3f})")

Local File Tagging

# Generate tags from local image file
with open("local_image.jpg", "rb") as image_stream:
    tag_result = client.tag_image_in_stream(image_stream)
    
    # Get top tags above threshold
    threshold = 0.7
    top_tags = [tag for tag in tag_result.tags if tag.confidence >= threshold]
    
    if top_tags:
        print(f"Top tags (confidence ≥ {threshold}):")
        for tag in top_tags:
            print(f"  {tag.name}: {tag.confidence:.3f}")
    else:
        print(f"No tags found above confidence threshold {threshold}")
        print("All tags:")
        for tag in tag_result.tags[:10]:  # Show top 10
            print(f"  {tag.name}: {tag.confidence:.3f}")

Batch Tagging with Aggregation

# Process multiple images and aggregate tags
image_urls = [
    "https://example.com/image1.jpg",
    "https://example.com/image2.jpg",
    "https://example.com/image3.jpg"
]

all_tags = {}  # Dictionary to aggregate tags across images

for i, url in enumerate(image_urls):
    try:
        tag_result = client.tag_image(url)
        print(f"Processed image {i+1}/{len(image_urls)}")
        
        # Aggregate tags
        for tag in tag_result.tags:
            if tag.name in all_tags:
                all_tags[tag.name].append(tag.confidence)
            else:
                all_tags[tag.name] = [tag.confidence]
                
    except Exception as e:
        print(f"Error processing {url}: {e}")

# Calculate average confidence for each tag
tag_averages = {
    tag_name: sum(confidences) / len(confidences)
    for tag_name, confidences in all_tags.items()
}

# Sort by frequency first, then by average confidence (both descending)
popular_tags = sorted(
    [(name, avg_conf, len(all_tags[name])) for name, avg_conf in tag_averages.items()],
    key=lambda x: (x[2], x[1]),  # Sort by frequency, then confidence
    reverse=True
)

print("\nMost common tags across all images:")
for tag_name, avg_confidence, frequency in popular_tags[:10]:
    print(f"  {tag_name}: appears in {frequency} images, avg confidence {avg_confidence:.3f}")

Tag Hints Analysis

# Analyze tag hints for additional context
tag_result = client.tag_image(image_url)

print("Tags with hints:")
for tag in tag_result.tags:
    if hasattr(tag, 'hint') and tag.hint:
        print(f"  {tag.name} ({tag.confidence:.3f}) - Hint: {tag.hint}")
    else:
        print(f"  {tag.name} ({tag.confidence:.3f})")

Response Data Types

TagResult

class TagResult:
    """
    Image tagging operation result.
    
    Attributes:
        tags (list[ImageTag]): Generated tags with confidence scores
        request_id (str): Request identifier
        metadata (ImageMetadata): Image metadata (dimensions, format)
        model_version (str): AI model version used for tagging
    """

ImageTag

class ImageTag:
    """
    Individual image tag with confidence score.
    
    Attributes:
        name (str): Tag name/label describing image content
        confidence (float): Confidence score for the tag (0.0 to 1.0)
        hint (str, optional): Additional context or hint about the tag
    """

ImageMetadata

class ImageMetadata:
    """
    Image metadata information.
    
    Attributes:
        height (int): Image height in pixels
        width (int): Image width in pixels
        format (str): Image format (e.g., "Jpeg", "Png")
    """

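The three types nest: a `TagResult` carries a list of `ImageTag` objects and one `ImageMetadata`. As a minimal sketch, a result can be flattened into plain Python primitives, for example for logging or JSON serialization. The function name `tag_result_to_dict` is hypothetical; it relies only on the attributes listed above and falls back to `None` where an attribute is missing.

```python
def tag_result_to_dict(tag_result):
    """Convert a TagResult-like object into JSON-serializable primitives.

    Hypothetical helper: reads only the attributes documented above.
    """
    metadata = getattr(tag_result, "metadata", None)
    return {
        "request_id": getattr(tag_result, "request_id", None),
        "model_version": getattr(tag_result, "model_version", None),
        # Metadata may be absent, so guard before reading its fields.
        "image": None if metadata is None else {
            "width": metadata.width,
            "height": metadata.height,
            "format": metadata.format,
        },
        "tags": [
            {
                "name": tag.name,
                "confidence": tag.confidence,
                "hint": getattr(tag, "hint", None),
            }
            for tag in (tag_result.tags or [])
        ],
    }
```

The output can be passed straight to `json.dumps`, since it contains only dictionaries, lists, strings, numbers, and `None`.
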
Tag Categories

The tagging service identifies various types of content:

Objects and Items

  • Physical objects (car, building, tree, computer, phone)
  • Food items (pizza, apple, coffee, sandwich)
  • Clothing (shirt, hat, shoes, jacket)
  • Furniture (chair, table, bed, couch)

People and Body Parts

  • People descriptors (person, man, woman, child)
  • Body parts (face, hand, eye, hair)
  • Demographics (adult, young, elderly)

Activities and Actions

  • Actions (sitting, standing, walking, running, eating)
  • Activities (playing, working, cooking, reading)
  • Sports (tennis, soccer, swimming, cycling)

Settings and Locations

  • Indoor/outdoor classification
  • Specific locations (kitchen, office, park, street)
  • Environments (urban, rural, natural, architectural)

Visual Attributes

  • Colors (red, blue, colorful, black and white)
  • Lighting (bright, dark, sunny, shadowy)
  • Composition (close-up, wide shot, portrait)
  • Style (modern, vintage, artistic)

Abstract Concepts

  • Emotions and mood (happy, peaceful, busy)
  • Qualities (beautiful, interesting, professional)
  • Concepts (transportation, technology, nature)

Tag Quality and Confidence

Confidence Score Interpretation

  • 0.9-1.0: Extremely confident - tag is almost certainly present
  • 0.8-0.9: High confidence - tag is very likely accurate
  • 0.7-0.8: Good confidence - tag is probably correct
  • 0.5-0.7: Moderate confidence - tag may be present but uncertain
  • Below 0.5: Low confidence - tag presence is questionable
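
The bands above can be turned into a small lookup. This is a sketch rather than anything the SDK provides: the function name `confidence_band` and the label strings are hypothetical, and because the listed ranges share their endpoints, the sketch assumes each lower bound is inclusive (so 0.8 counts as "high").

```python
from bisect import bisect_right

# Lower bounds of each band above "low", taken from the list above.
_BAND_EDGES = (0.5, 0.7, 0.8, 0.9)
# One more label than edges: index 0 covers everything below 0.5.
_BAND_LABELS = ("low", "moderate", "good", "high", "extremely confident")


def confidence_band(confidence):
    """Return the band label for a confidence score between 0.0 and 1.0.

    Hypothetical helper; treats each band's lower bound as inclusive.
    """
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be within [0.0, 1.0], got {confidence!r}")
    # bisect_right counts how many edges are <= confidence.
    return _BAND_LABELS[bisect_right(_BAND_EDGES, confidence)]
```

With a tag result in hand, `confidence_band(tag.confidence)` gives a label that can be used for grouping or display.
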

Best Practices

  • Use confidence thresholds appropriate for your application (typically 0.5-0.8)
  • Expect a variable number of tags per image (usually 10-50) and cap the list if your application needs a fixed size
  • Combine with other analysis features for comprehensive understanding
  • Use multilingual support when working with international content
  • Aggregate results across similar images for better accuracy

Install with Tessl CLI

npx tessl i tessl/pypi-azure-cognitiveservices-vision-computervision
