# Features and Configuration

Comprehensive configuration options for different AI detection capabilities. Each feature can be fine-tuned with specific parameters and thresholds to optimize results for different use cases.

## Capabilities

### Video Analysis Features

Core features available for video analysis, each providing different types of AI-powered insights.

```python { .api }
class Feature(Enum):
    """
    Video annotation feature.

    Values:
        FEATURE_UNSPECIFIED: Unspecified feature
        LABEL_DETECTION: Label detection - detect objects, such as dog or flower
        SHOT_CHANGE_DETECTION: Shot change detection
        EXPLICIT_CONTENT_DETECTION: Explicit content detection
        FACE_DETECTION: Human face detection
        SPEECH_TRANSCRIPTION: Speech transcription
        TEXT_DETECTION: OCR text detection and tracking
        OBJECT_TRACKING: Object detection and tracking
        LOGO_RECOGNITION: Logo detection, tracking, and recognition
        PERSON_DETECTION: Person detection
    """

    FEATURE_UNSPECIFIED = 0
    LABEL_DETECTION = 1
    SHOT_CHANGE_DETECTION = 2
    EXPLICIT_CONTENT_DETECTION = 3
    FACE_DETECTION = 4
    SPEECH_TRANSCRIPTION = 6
    TEXT_DETECTION = 7
    OBJECT_TRACKING = 9
    LOGO_RECOGNITION = 12
    PERSON_DETECTION = 14
```
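
For orientation, here is a minimal sketch of requesting two of these features in a single `annotate_video` call; the bucket and file names are placeholders:

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

# Request several features in one pass; the results for each feature
# come back together in the same annotation results.
operation = client.annotate_video(
    request={
        "features": [
            videointelligence.Feature.LABEL_DETECTION,
            videointelligence.Feature.SHOT_CHANGE_DETECTION,
        ],
        "input_uri": "gs://your-bucket/your-video.mp4",
    }
)
result = operation.result(timeout=300)
```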

### Video Context Configuration

Main configuration object that allows fine-tuning of different analysis features.

```python { .api }
class VideoContext:
    """
    Video context and/or feature-specific parameters.

    Attributes:
        segments: Video segments to annotate. If unspecified, each video is treated as a single segment
        label_detection_config: Config for LABEL_DETECTION
        shot_change_detection_config: Config for SHOT_CHANGE_DETECTION
        explicit_content_detection_config: Config for EXPLICIT_CONTENT_DETECTION
        face_detection_config: Config for FACE_DETECTION
        speech_transcription_config: Config for SPEECH_TRANSCRIPTION
        text_detection_config: Config for TEXT_DETECTION
        object_tracking_config: Config for OBJECT_TRACKING
        person_detection_config: Config for PERSON_DETECTION
    """

    segments: MutableSequence[VideoSegment]
    label_detection_config: LabelDetectionConfig
    shot_change_detection_config: ShotChangeDetectionConfig
    explicit_content_detection_config: ExplicitContentDetectionConfig
    face_detection_config: FaceDetectionConfig
    speech_transcription_config: SpeechTranscriptionConfig
    text_detection_config: TextDetectionConfig
    object_tracking_config: ObjectTrackingConfig
    person_detection_config: PersonDetectionConfig
```
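
A minimal sketch of a `VideoContext` that limits analysis to a single segment; the time offsets accept plain dicts, as in the usage examples below:

```python
from google.cloud import videointelligence

# Analyze only the portion between 0:10 and 0:50; any feature configs
# attached to the same VideoContext apply within these segments.
context = videointelligence.VideoContext(
    segments=[
        videointelligence.VideoSegment(
            start_time_offset={"seconds": 10},
            end_time_offset={"seconds": 50},
        )
    ]
)
```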

### Label Detection Configuration

Configure how labels (objects, activities, concepts) are detected in videos.

```python { .api }
class LabelDetectionConfig:
    """
    Config for LABEL_DETECTION.

    Attributes:
        label_detection_mode: What labels should be detected with LABEL_DETECTION, in addition to video-level labels or segment-level labels
        stationary_camera: Whether the video has been shot from a stationary (non-moving) camera
        model: Model to use for label detection. Supported values: "builtin/stable", "builtin/latest"
        frame_confidence_threshold: The confidence threshold for frame-level label detection (0.0-1.0)
        video_confidence_threshold: The confidence threshold for video-level label detection (0.0-1.0)
    """

    label_detection_mode: LabelDetectionMode
    stationary_camera: bool
    model: str
    frame_confidence_threshold: float
    video_confidence_threshold: float

class LabelDetectionMode(Enum):
    """
    Label detection mode.

    Values:
        LABEL_DETECTION_MODE_UNSPECIFIED: Unspecified
        SHOT_MODE: Detect shot-level labels
        FRAME_MODE: Detect frame-level labels
        SHOT_AND_FRAME_MODE: Detect both shot-level and frame-level labels
    """

    LABEL_DETECTION_MODE_UNSPECIFIED = 0
    SHOT_MODE = 1
    FRAME_MODE = 2
    SHOT_AND_FRAME_MODE = 3
```
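
A sketch of a label detection request using this config, and of reading back the shot-level labels (the threshold value is illustrative; field names follow the v1 responses):

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

config = videointelligence.LabelDetectionConfig(
    label_detection_mode=videointelligence.LabelDetectionMode.SHOT_AND_FRAME_MODE,
    frame_confidence_threshold=0.6,
)
context = videointelligence.VideoContext(label_detection_config=config)

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.LABEL_DETECTION],
        "input_uri": "gs://your-bucket/your-video.mp4",
        "video_context": context,
    }
)
result = operation.result(timeout=300)

# Shot-level labels; frame_label_annotations holds the frame-level ones.
for label in result.annotation_results[0].shot_label_annotations:
    for segment in label.segments:
        print(f"{label.entity.description}: {segment.confidence:.2f}")
```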

### Face Detection Configuration

Configure detection and tracking of human faces in videos.

```python { .api }
class FaceDetectionConfig:
    """
    Config for FACE_DETECTION.

    Attributes:
        model: Model to use for face detection. Supported values: "builtin/stable", "builtin/latest"
        include_bounding_boxes: Whether bounding boxes are included in the face annotation output
        include_attributes: Whether to enable face attributes detection, such as glasses, dark_glasses, mouth_open, etc.
    """

    model: str
    include_bounding_boxes: bool
    include_attributes: bool
```
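
A sketch of reading face tracks out of the results; it assumes `result` came from a FACE_DETECTION request made with `include_bounding_boxes=True` and `include_attributes=True`:

```python
# Each face annotation holds one or more tracks of timestamped objects.
for face in result.annotation_results[0].face_detection_annotations:
    for track in face.tracks:
        start = track.segment.start_time_offset.total_seconds()
        print(f"Face track at {start:.1f}s, confidence {track.confidence:.2f}")
        for obj in track.timestamped_objects:
            box = obj.normalized_bounding_box
            # Attributes such as "glasses" appear when include_attributes=True.
            names = [a.name for a in obj.attributes]
            print(f"  box=({box.left:.2f}, {box.top:.2f}) attributes={names}")
```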

### Object Tracking Configuration

Configure detection and tracking of objects throughout the video.

```python { .api }
class ObjectTrackingConfig:
    """
    Config for OBJECT_TRACKING.

    Attributes:
        model: Model to use for object tracking. Supported values: "builtin/stable", "builtin/latest"
    """

    model: str
```
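
A sketch of an object tracking request with this config, reading back each tracked entity with its confidence and time span:

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.OBJECT_TRACKING],
        "input_uri": "gs://your-bucket/your-video.mp4",
        "video_context": videointelligence.VideoContext(
            object_tracking_config=videointelligence.ObjectTrackingConfig(
                model="builtin/latest"
            )
        ),
    }
)
result = operation.result(timeout=300)

# Each annotation is one tracked object with per-frame bounding boxes.
for obj in result.annotation_results[0].object_annotations:
    seg = obj.segment
    print(
        f"{obj.entity.description} ({obj.confidence:.2f}): "
        f"{seg.start_time_offset.total_seconds():.1f}s-"
        f"{seg.end_time_offset.total_seconds():.1f}s, "
        f"{len(obj.frames)} frames"
    )
```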

### Explicit Content Detection Configuration

Configure detection of explicit or inappropriate content.

```python { .api }
class ExplicitContentDetectionConfig:
    """
    Config for EXPLICIT_CONTENT_DETECTION.

    Attributes:
        model: Model to use for explicit content detection. Supported values: "builtin/stable", "builtin/latest"
    """

    model: str
```

### Speech Transcription Configuration

Configure speech-to-text transcription with language and context options.

```python { .api }
class SpeechTranscriptionConfig:
    """
    Config for SPEECH_TRANSCRIPTION.

    Attributes:
        language_code: Required. BCP-47 language tag of the language spoken in the audio (e.g., "en-US")
        max_alternatives: Maximum number of recognition hypotheses to be returned
        filter_profanity: If set to true, the server will attempt to filter out profanities
        speech_contexts: A means to provide context to assist the speech recognition
        enable_automatic_punctuation: If set to true, adds punctuation to recognition result hypotheses
        audio_tracks: For file formats that contain multiple audio tracks, this field controls which track should be transcribed
        enable_speaker_diarization: If true, enable speaker detection for each recognized word
        diarization_speaker_count: If speaker_diarization is enabled, set this field to specify the number of speakers
        enable_word_confidence: If true, the top result includes a list of words and the confidence for those words
    """

    language_code: str
    max_alternatives: int
    filter_profanity: bool
    speech_contexts: MutableSequence[SpeechContext]
    enable_automatic_punctuation: bool
    audio_tracks: MutableSequence[int]
    enable_speaker_diarization: bool
    diarization_speaker_count: int
    enable_word_confidence: bool

class SpeechContext:
    """
    Provides "hints" to the speech recognizer to favor specific words and phrases in the results.

    Attributes:
        phrases: A list of strings containing words and phrases "hints" so that the speech recognition is more likely to recognize them
    """

    phrases: MutableSequence[str]
```
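
A sketch of a transcription config that uses `SpeechContext` phrase hints to bias recognition toward domain vocabulary (the phrases are illustrative):

```python
from google.cloud import videointelligence

speech_config = videointelligence.SpeechTranscriptionConfig(
    language_code="en-US",
    max_alternatives=2,
    enable_automatic_punctuation=True,
    # Bias recognition toward terms that might otherwise be misheard.
    speech_contexts=[
        videointelligence.SpeechContext(
            phrases=["Kubernetes", "Bigtable", "Cloud Spanner"]
        )
    ],
)
context = videointelligence.VideoContext(
    speech_transcription_config=speech_config
)
```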

### Text Detection Configuration

Configure optical character recognition (OCR) for detecting text in videos.

```python { .api }
class TextDetectionConfig:
    """
    Config for TEXT_DETECTION.

    Attributes:
        language_hints: Language hints can be specified if the language of the text in the video is known a priori
        model: Model to use for text detection. Supported values: "builtin/stable", "builtin/latest"
    """

    language_hints: MutableSequence[str]
    model: str
```

### Person Detection Configuration

Configure detection and tracking of people in videos.

```python { .api }
class PersonDetectionConfig:
    """
    Config for PERSON_DETECTION.

    Attributes:
        include_bounding_boxes: Whether bounding boxes are included in the person detection annotation output
        include_pose_landmarks: Whether to enable pose landmarks detection
        include_attributes: Whether to enable person attributes detection, such as cloth color
    """

    include_bounding_boxes: bool
    include_pose_landmarks: bool
    include_attributes: bool
```

### Shot Change Detection Configuration

Configure detection of shot boundaries and scene changes.

```python { .api }
class ShotChangeDetectionConfig:
    """
    Config for SHOT_CHANGE_DETECTION.

    Attributes:
        model: Model to use for shot change detection. Supported values: "builtin/stable", "builtin/latest"
    """

    model: str
```
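
A sketch of a shot change request and of listing the detected shot boundaries, which come back as `VideoSegment` entries in `shot_annotations`:

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.SHOT_CHANGE_DETECTION],
        "input_uri": "gs://your-bucket/your-video.mp4",
    }
)
result = operation.result(timeout=300)

# Each entry is a VideoSegment covering one shot.
for i, shot in enumerate(result.annotation_results[0].shot_annotations):
    start = shot.start_time_offset.total_seconds()
    end = shot.end_time_offset.total_seconds()
    print(f"Shot {i}: {start:.1f}s to {end:.1f}s")
```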

### Common Enums and Utilities

```python { .api }
class Likelihood(Enum):
    """
    Bucketized representation of likelihood.

    Values:
        LIKELIHOOD_UNSPECIFIED: Unspecified likelihood
        VERY_UNLIKELY: Very unlikely
        UNLIKELY: Unlikely
        POSSIBLE: Possible
        LIKELY: Likely
        VERY_LIKELY: Very likely
    """

    LIKELIHOOD_UNSPECIFIED = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5

class VideoSegment:
    """
    Video segment.

    Attributes:
        start_time_offset: Time-offset, relative to the beginning of the video, corresponding to the start of the segment
        end_time_offset: Time-offset, relative to the beginning of the video, corresponding to the end of the segment
    """

    start_time_offset: duration_pb2.Duration
    end_time_offset: duration_pb2.Duration
```
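
As a sketch of how these types are used together: segment offsets accept plain dicts, and the `Likelihood` values are ordered integers, so threshold checks can use comparison (this assumes the generated enum behaves as an IntEnum, which proto-plus enums do):

```python
from google.cloud import videointelligence

# Segments accept {"seconds": ..., "nanos": ...} dicts for the offsets.
segment = videointelligence.VideoSegment(
    start_time_offset={"seconds": 0},
    end_time_offset={"seconds": 30},
)

# Likelihood is an ordered enum, so thresholding reads naturally.
def is_flagged(likelihood: videointelligence.Likelihood) -> bool:
    return likelihood >= videointelligence.Likelihood.LIKELY
```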

## Usage Examples

### Multi-Feature Analysis with Custom Configuration

```python
from google.cloud import videointelligence

# Create client
client = videointelligence.VideoIntelligenceServiceClient()

# Configure multiple features with custom settings
video_context = videointelligence.VideoContext(
    segments=[
        videointelligence.VideoSegment(
            start_time_offset={"seconds": 10},
            end_time_offset={"seconds": 50},
        )
    ],
    label_detection_config=videointelligence.LabelDetectionConfig(
        label_detection_mode=videointelligence.LabelDetectionMode.SHOT_AND_FRAME_MODE,
        stationary_camera=True,
        model="builtin/latest",
        frame_confidence_threshold=0.7,
        video_confidence_threshold=0.8,
    ),
    face_detection_config=videointelligence.FaceDetectionConfig(
        model="builtin/latest",
        include_bounding_boxes=True,
        include_attributes=True,
    ),
    speech_transcription_config=videointelligence.SpeechTranscriptionConfig(
        language_code="en-US",
        enable_automatic_punctuation=True,
        enable_speaker_diarization=True,
        diarization_speaker_count=2,
        enable_word_confidence=True,
    ),
)

# Annotate video with custom configuration
operation = client.annotate_video(
    request={
        "features": [
            videointelligence.Feature.LABEL_DETECTION,
            videointelligence.Feature.FACE_DETECTION,
            videointelligence.Feature.SPEECH_TRANSCRIPTION,
        ],
        "input_uri": "gs://your-bucket/your-video.mp4",
        "video_context": video_context,
    }
)

result = operation.result(timeout=600)
```
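
A sketch of walking the combined results from the request above; there is one entry in `annotation_results` per input video:

```python
for annotation_result in result.annotation_results:
    # Labels detected at the segment level.
    for label in annotation_result.segment_label_annotations:
        print(f"Label: {label.entity.description}")

    # Best transcription hypothesis for each utterance.
    for transcription in annotation_result.speech_transcriptions:
        for alternative in transcription.alternatives:
            print(f"Transcript: {alternative.transcript} "
                  f"(confidence: {alternative.confidence:.2f})")
```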

### Text Detection with Language Hints

```python
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

# Configure text detection for multiple languages
text_config = videointelligence.TextDetectionConfig(
    language_hints=["en", "fr", "es"],  # English, French, Spanish
    model="builtin/latest",
)

video_context = videointelligence.VideoContext(
    text_detection_config=text_config
)

operation = client.annotate_video(
    request={
        "features": [videointelligence.Feature.TEXT_DETECTION],
        "input_uri": "gs://your-bucket/multilingual-video.mp4",
        "video_context": video_context,
    }
)

result = operation.result(timeout=300)
```
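
The detected text can then be read from `text_annotations`; a sketch:

```python
for annotation_result in result.annotation_results:
    for text in annotation_result.text_annotations:
        # Each segment is one appearance of the text with its own confidence.
        for segment in text.segments:
            start = segment.segment.start_time_offset.total_seconds()
            print(f"'{text.text}' at {start:.1f}s "
                  f"(confidence: {segment.confidence:.2f})")
```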
378
379
### Person Detection with Pose Landmarks
380
381
```python
382
from google.cloud import videointelligence
383
384
client = videointelligence.VideoIntelligenceServiceClient()
385
386
# Configure person detection with all features enabled
387
person_config = videointelligence.PersonDetectionConfig(
388
include_bounding_boxes=True,
389
include_pose_landmarks=True,
390
include_attributes=True
391
)
392
393
video_context = videointelligence.VideoContext(
394
person_detection_config=person_config
395
)
396
397
operation = client.annotate_video(
398
request={
399
"features": [videointelligence.Feature.PERSON_DETECTION],
400
"input_uri": "gs://your-bucket/sports-video.mp4",
401
"video_context": video_context
402
}
403
)
404
405
result = operation.result(timeout=400)
406
```
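
A sketch of reading the person tracks back, including the pose landmarks requested above (landmark naming follows the v1 `DetectedLandmark` messages):

```python
for annotation_result in result.annotation_results:
    for person in annotation_result.person_detection_annotations:
        for track in person.tracks:
            for obj in track.timestamped_objects:
                t = obj.time_offset.total_seconds()
                # Pose landmarks are present because
                # include_pose_landmarks=True was set in the config.
                names = [lm.name for lm in obj.landmarks]
                print(f"{t:.1f}s: {len(names)} landmarks, e.g. {names[:3]}")
```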
407
408
### Explicit Content Detection
409
410
```python
411
from google.cloud import videointelligence
412
413
client = videointelligence.VideoIntelligenceServiceClient()
414
415
# Configure explicit content detection
416
explicit_config = videointelligence.ExplicitContentDetectionConfig(
417
model="builtin/latest"
418
)
419
420
video_context = videointelligence.VideoContext(
421
explicit_content_detection_config=explicit_config
422
)
423
424
operation = client.annotate_video(
425
request={
426
"features": [videointelligence.Feature.EXPLICIT_CONTENT_DETECTION],
427
"input_uri": "gs://your-bucket/content-to-moderate.mp4",
428
"video_context": video_context
429
}
430
)
431
432
result = operation.result(timeout=300)
433
434
# Check explicit content results
435
for annotation_result in result.annotation_results:
436
explicit_annotation = annotation_result.explicit_annotation
437
for frame in explicit_annotation.frames:
438
likelihood = frame.pornography_likelihood
439
time_offset = frame.time_offset.total_seconds()
440
print(f"Frame at {time_offset}s: {likelihood.name}")
441
```