# Google Cloud Video Intelligence

## Overview

A Python client library for the Google Cloud Video Intelligence API, which makes videos searchable and discoverable by extracting metadata with machine learning. The library supports label detection, shot change detection, face detection, explicit content detection, speech transcription, text detection, object tracking, logo recognition, person detection, and celebrity recognition (beta).

## Package Information

- **Package Name**: google-cloud-videointelligence
- **Language**: Python
- **Installation**: `pip install google-cloud-videointelligence`

## Core Imports

```python
from google.cloud import videointelligence
```

For specific API versions:

```python
from google.cloud import videointelligence_v1
from google.cloud import videointelligence_v1p3beta1  # For streaming features
```

## Basic Usage

```python
from google.cloud import videointelligence

# Create a client
client = videointelligence.VideoIntelligenceServiceClient()

# Annotate a video with label detection
features = [videointelligence.Feature.LABEL_DETECTION]
operation = client.annotate_video(
    request={
        "features": features,
        "input_uri": "gs://your-bucket/your-video.mp4",
    }
)

# Wait for the operation to complete
print("Processing video for label detection...")
result = operation.result(timeout=300)

# Process results
for annotation_result in result.annotation_results:
    for label in annotation_result.segment_label_annotations:
        print(f"Label: {label.entity.description}")
        for segment in label.segments:
            start_time = segment.segment.start_time_offset.total_seconds()
            end_time = segment.segment.end_time_offset.total_seconds()
            print(f" Segment: {start_time}s to {end_time}s (confidence: {segment.confidence})")
```

## Architecture

The Google Cloud Video Intelligence client library follows Google's client library design patterns:

- **Client Classes**: Synchronous and asynchronous clients for different API versions
- **Request/Response Objects**: Structured data types for API communication
- **Long-running Operations**: Video analysis operations return Operation objects that can be polled for completion
- **Feature-based Analysis**: Different AI capabilities are enabled through feature flags
- **Configuration Objects**: Specialized configuration classes for fine-tuning each analysis feature
- **Multiple API Versions**: Support for stable (v1) and beta versions with additional features

## Capabilities

### Video Analysis Client

Core client functionality for analyzing videos with Google's AI capabilities. Supports both synchronous and asynchronous operations, multiple transport protocols, and comprehensive error handling.

```python { .api }
class VideoIntelligenceServiceClient:
    def __init__(self, *, credentials=None, transport=None, client_options=None, client_info=None): ...
    def annotate_video(self, request=None, *, input_uri=None, features=None, retry=None, timeout=None, metadata=()) -> operation.Operation: ...
    @classmethod
    def from_service_account_file(cls, filename: str, *args, **kwargs) -> VideoIntelligenceServiceClient: ...
    @classmethod
    def from_service_account_info(cls, info: dict, *args, **kwargs) -> VideoIntelligenceServiceClient: ...

class VideoIntelligenceServiceAsyncClient:
    def __init__(self, *, credentials=None, transport=None, client_options=None, client_info=None): ...
    async def annotate_video(self, request=None, *, input_uri=None, features=None, retry=None, timeout=None, metadata=()) -> operation_async.AsyncOperation: ...
    @classmethod
    def from_service_account_file(cls, filename: str, *args, **kwargs) -> VideoIntelligenceServiceAsyncClient: ...
    @classmethod
    def from_service_account_info(cls, info: dict, *args, **kwargs) -> VideoIntelligenceServiceAsyncClient: ...
```

[Video Analysis](./video-analysis.md)

### Streaming Video Analysis (Beta)

Real-time video analysis capabilities for processing video streams. Available in the v1p3beta1 API version for applications requiring immediate feedback on video content.

```python { .api }
class StreamingVideoIntelligenceServiceClient:
    def __init__(self, *, credentials=None, transport=None, client_options=None, client_info=None): ...
    def streaming_annotate_video(self, requests, retry=None, timeout=None, metadata=()) -> Iterable[StreamingAnnotateVideoResponse]: ...

class StreamingVideoIntelligenceServiceAsyncClient:
    def __init__(self, *, credentials=None, transport=None, client_options=None, client_info=None): ...
    async def streaming_annotate_video(self, requests, retry=None, timeout=None, metadata=()) -> AsyncIterable[StreamingAnnotateVideoResponse]: ...
```

[Streaming Analysis](./streaming-analysis.md)

### Detection Features and Configuration

Comprehensive configuration options for different AI detection capabilities. Each feature can be fine-tuned with specific parameters and thresholds to optimize results for different use cases.

```python { .api }
class Feature(Enum):
    FEATURE_UNSPECIFIED = 0
    LABEL_DETECTION = 1
    SHOT_CHANGE_DETECTION = 2
    EXPLICIT_CONTENT_DETECTION = 3
    FACE_DETECTION = 4
    SPEECH_TRANSCRIPTION = 6
    TEXT_DETECTION = 7
    OBJECT_TRACKING = 9
    LOGO_RECOGNITION = 12
    PERSON_DETECTION = 14

class VideoContext:
    segments: MutableSequence[VideoSegment]
    label_detection_config: LabelDetectionConfig
    shot_change_detection_config: ShotChangeDetectionConfig
    explicit_content_detection_config: ExplicitContentDetectionConfig
    face_detection_config: FaceDetectionConfig
    speech_transcription_config: SpeechTranscriptionConfig
    text_detection_config: TextDetectionConfig
    object_tracking_config: ObjectTrackingConfig
    person_detection_config: PersonDetectionConfig
```

[Features and Configuration](./features-config.md)

### Annotation Results and Data Types

Structured data types for representing video analysis results. Includes annotations for detected objects, faces, text, speech, and other content with timestamps and confidence scores.

```python { .api }
class AnnotateVideoResponse:
    annotation_results: MutableSequence[VideoAnnotationResults]

class VideoAnnotationResults:
    segment_label_annotations: MutableSequence[LabelAnnotation]
    shot_label_annotations: MutableSequence[LabelAnnotation]
    frame_label_annotations: MutableSequence[LabelAnnotation]
    face_annotations: MutableSequence[FaceAnnotation]
    shot_annotations: MutableSequence[VideoSegment]
    explicit_annotation: ExplicitContentAnnotation
    speech_transcriptions: MutableSequence[SpeechTranscription]
    text_annotations: MutableSequence[TextAnnotation]
    object_annotations: MutableSequence[ObjectTrackingAnnotation]
    logo_recognition_annotations: MutableSequence[LogoRecognitionAnnotation]
    person_detection_annotations: MutableSequence[PersonDetectionAnnotation]
```

[Results and Data Types](./results-data-types.md)

## Common Data Types

```python { .api }
class AnnotateVideoRequest:
    input_uri: str
    input_content: bytes
    features: MutableSequence[Feature]
    video_context: VideoContext
    output_uri: str
    location_id: str

class VideoSegment:
    start_time_offset: duration_pb2.Duration
    end_time_offset: duration_pb2.Duration

class Entity:
    entity_id: str
    description: str
    language_code: str

class NormalizedBoundingBox:
    left: float
    top: float
    right: float
    bottom: float

class Likelihood(Enum):
    LIKELIHOOD_UNSPECIFIED = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5
```