# Content Moderation

Detects and flags potentially harmful, inappropriate, or unsafe content in text, providing moderation categories and confidence scores for content filtering applications. Essential for maintaining safe online environments, protecting users from harmful content, and ensuring compliance with content policies.

## Capabilities

### Moderate Text

Analyzes the provided text to detect potentially harmful or inappropriate content across multiple safety categories.

```python { .api }
def moderate_text(
    self,
    request: Optional[Union[ModerateTextRequest, dict]] = None,
    *,
    document: Optional[Document] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = ()
) -> ModerateTextResponse:
    """
    Moderates text to detect potentially harmful or inappropriate content.

    Args:
        request: The request object containing document
        document: Input document for moderation
        retry: Retry configuration for the request
        timeout: Request timeout in seconds
        metadata: Additional metadata to send with the request

    Returns:
        ModerateTextResponse containing moderation results
    """
```

#### Usage Example

```python
from google.cloud import language

# Initialize client
client = language.LanguageServiceClient()

# Create document
document = language.Document(
    content="This content contains inappropriate language and harmful statements.",
    type_=language.Document.Type.PLAIN_TEXT
)

# Moderate content
response = client.moderate_text(
    request={"document": document}
)

# Process moderation results
print("Content Moderation Results:")
for category in response.moderation_categories:
    print(f"Category: {category.name}")
    print(f"Confidence: {category.confidence:.3f}")

    # Check if content should be flagged
    if category.confidence > 0.5:  # Threshold can be adjusted
        print(f"⚠️ Content flagged for: {category.name}")
    print()

# Overall safety assessment
flagged_categories = [
    cat for cat in response.moderation_categories
    if cat.confidence > 0.5
]

if flagged_categories:
    print(f"Content FLAGGED - {len(flagged_categories)} safety issues detected")
else:
    print("Content appears safe")
```

## Request and Response Types

### ModerateTextRequest

```python { .api }
class ModerateTextRequest:
    document: Document
```

### ModerateTextResponse

```python { .api }
class ModerateTextResponse:
    moderation_categories: MutableSequence[ClassificationCategory]
```
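
Each entry in `moderation_categories` is a `ClassificationCategory`. A minimal sketch of the fields used throughout this page (see the library's generated types for the authoritative definition):

```python { .api }
class ClassificationCategory:
    name: str          # Category name, e.g. "Toxic"
    confidence: float  # Confidence score between 0.0 and 1.0
```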

## Moderation Categories

The system detects various types of harmful content. The exact category names returned depend on the API version, so treat the names below as representative and read the actual values from the response.

### Common Moderation Categories

- **Toxic**: Generally harmful, offensive, or inappropriate content
- **Severe Toxicity**: Extremely hateful, aggressive, or disrespectful content
- **Identity Attack**: Content attacking individuals based on identity
- **Insult**: Content intended to insult or demean
- **Profanity**: Content containing profane or vulgar language
- **Threat**: Content containing threats of violence or harm
- **Sexually Explicit**: Content containing explicit sexual material
- **Flirtation**: Content with flirtatious or suggestive language

### Confidence Scores

Each category includes a confidence score from 0.0 to 1.0; a simple triage helper based on these tiers is sketched after the list:

- **0.0 - 0.3**: Low likelihood of harmful content
- **0.3 - 0.7**: Moderate likelihood - may require review
- **0.7 - 1.0**: High likelihood - likely harmful content
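
A minimal triage helper built on these tiers (the 0.3 and 0.7 cut-offs are the illustrative values above, not API constants):

```python
def confidence_tier(confidence: float) -> str:
    """Map a moderation confidence score to a review tier."""
    if confidence >= 0.7:
        return "high"      # likely harmful - block or escalate
    if confidence >= 0.3:
        return "moderate"  # may warrant human review
    return "low"           # likely safe
```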

## Advanced Usage

### Configurable Content Filtering

```python
class ContentModerator:
    def __init__(self, client, thresholds=None):
        self.client = client
        self.thresholds = thresholds or {
            'Toxic': 0.7,
            'Severe Toxicity': 0.5,
            'Identity Attack': 0.6,
            'Insult': 0.8,
            'Profanity': 0.9,
            'Threat': 0.3,
            'Sexually Explicit': 0.8,
            'Flirtation': 0.9
        }

    def moderate_content(self, text):
        """Moderate content with configurable thresholds."""
        document = language.Document(
            content=text,
            type_=language.Document.Type.PLAIN_TEXT
        )

        response = self.client.moderate_text(
            request={"document": document}
        )

        violations = []
        warnings = []

        for category in response.moderation_categories:
            category_name = category.name
            confidence = category.confidence

            # Check against custom thresholds
            threshold = self.thresholds.get(category_name, 0.5)

            if confidence >= threshold:
                severity = 'high' if confidence >= 0.7 else 'medium'
                violations.append({
                    'category': category_name,
                    'confidence': confidence,
                    'severity': severity,
                    'threshold': threshold
                })
            elif confidence >= 0.3:  # Warning threshold
                warnings.append({
                    'category': category_name,
                    'confidence': confidence
                })

        return {
            'violations': violations,
            'warnings': warnings,
            'safe': len(violations) == 0,
            'all_categories': response.moderation_categories
        }

    def get_action_recommendation(self, moderation_result):
        """Get recommended action based on moderation results."""
        violations = moderation_result['violations']

        if not violations:
            return 'approve'

        # Check for severe violations
        severe_violations = [v for v in violations if v['severity'] == 'high']
        threat_violations = [v for v in violations if v['category'] == 'Threat']

        if severe_violations or threat_violations:
            return 'block'
        elif len(violations) >= 3:
            return 'review'
        elif any(v['confidence'] >= 0.8 for v in violations):
            return 'review'
        else:
            return 'flag'

# Usage
moderator = ContentModerator(client)

test_texts = [
    "This is a normal, friendly message.",
    "You're such an idiot and I hate you!",
    "I'm going to hurt you if you don't stop.",
    "That's a really inappropriate and offensive comment."
]

for text in test_texts:
    result = moderator.moderate_content(text)
    action = moderator.get_action_recommendation(result)

    print(f"Text: {text[:50]}...")
    print(f"Action: {action}")
    print(f"Safe: {result['safe']}")

    if result['violations']:
        print("Violations:")
        for violation in result['violations']:
            print(f"  - {violation['category']}: {violation['confidence']:.3f} ({violation['severity']})")

    if result['warnings']:
        print("Warnings:")
        for warning in result['warnings']:
            print(f"  - {warning['category']}: {warning['confidence']:.3f}")
    print()
```

### Batch Content Moderation

```python
def moderate_content_batch(client, texts, batch_size=10):
    """Moderate multiple texts efficiently."""
    results = []

    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        batch_results = []

        for text in batch:
            try:
                document = language.Document(
                    content=text,
                    type_=language.Document.Type.PLAIN_TEXT
                )

                response = client.moderate_text(
                    request={"document": document}
                )

                # Categorize results
                violations = []
                max_confidence = 0

                for category in response.moderation_categories:
                    if category.confidence > 0.5:
                        violations.append({
                            'category': category.name,
                            'confidence': category.confidence
                        })
                    max_confidence = max(max_confidence, category.confidence)

                batch_results.append({
                    'text': text,
                    'violations': violations,
                    'max_confidence': max_confidence,
                    'safe': len(violations) == 0,
                    'all_categories': response.moderation_categories
                })

            except Exception as e:
                batch_results.append({
                    'text': text,
                    'error': str(e),
                    'safe': None
                })

        results.extend(batch_results)

    return results

def generate_moderation_report(results):
    """Generate a summary report from batch moderation results."""
    total_texts = len(results)
    safe_count = sum(1 for r in results if r.get('safe') is True)
    flagged_count = sum(1 for r in results if r.get('safe') is False)
    error_count = sum(1 for r in results if 'error' in r)

    # Category statistics
    category_counts = {}
    for result in results:
        if 'violations' in result:
            for violation in result['violations']:
                category = violation['category']
                category_counts[category] = category_counts.get(category, 0) + 1

    print("Moderation Report")
    print("================")
    print(f"Total texts processed: {total_texts}")
    print(f"Safe content: {safe_count} ({safe_count/total_texts*100:.1f}%)")
    print(f"Flagged content: {flagged_count} ({flagged_count/total_texts*100:.1f}%)")
    print(f"Processing errors: {error_count}")
    print()

    if category_counts:
        print("Most common violations:")
        sorted_categories = sorted(category_counts.items(), key=lambda x: x[1], reverse=True)
        for category, count in sorted_categories[:5]:
            print(f"  {category}: {count} ({count/total_texts*100:.1f}%)")

    return {
        'total': total_texts,
        'safe': safe_count,
        'flagged': flagged_count,
        'errors': error_count,
        'category_counts': category_counts
    }

# Usage
sample_texts = [
    "Welcome to our community! Please be respectful.",
    "This is completely inappropriate and offensive.",
    "Great post! Thanks for sharing this information.",
    "You're an absolute moron and should be banned.",
    "I love this product and would recommend it to others."
]

batch_results = moderate_content_batch(client, sample_texts)
report = generate_moderation_report(batch_results)
```

### Real-time Content Filtering

```python
class RealTimeContentFilter:
    def __init__(self, client, auto_block_threshold=0.8):
        self.client = client
        self.auto_block_threshold = auto_block_threshold
        self.cache = {}  # Simple cache for repeated content

    def filter_message(self, message, user_id=None):
        """Filter a message in real-time with caching."""
        # Check cache first
        cache_key = hash(message.strip().lower())
        if cache_key in self.cache:
            return self.cache[cache_key]

        document = language.Document(
            content=message,
            type_=language.Document.Type.PLAIN_TEXT
        )

        try:
            response = self.client.moderate_text(
                request={"document": document}
            )

            # Analyze results
            max_confidence = 0
            violations = []

            for category in response.moderation_categories:
                if category.confidence > 0.3:  # Low threshold for tracking
                    violations.append({
                        'category': category.name,
                        'confidence': category.confidence
                    })
                max_confidence = max(max_confidence, category.confidence)

            # Determine action
            if max_confidence >= self.auto_block_threshold:
                action = 'block'
                reason = f"High confidence violation ({max_confidence:.3f})"
            elif max_confidence >= 0.5:
                action = 'review'
                reason = f"Moderate confidence violation ({max_confidence:.3f})"
            else:
                action = 'allow'
                reason = "Content appears safe"

            result = {
                'action': action,
                'reason': reason,
                'confidence': max_confidence,
                'violations': violations,
                'user_id': user_id,
                'timestamp': None  # Would be set in a real implementation
            }

            # Cache result
            self.cache[cache_key] = result

            return result

        except Exception as e:
            # Fail safe - allow content but log the error
            return {
                'action': 'allow',
                'reason': f"Moderation error: {str(e)}",
                'confidence': 0,
                'violations': [],
                'user_id': user_id,
                'error': True
            }

    def get_filter_stats(self):
        """Get statistics about filtering actions."""
        if not self.cache:
            return {}

        actions = [result['action'] for result in self.cache.values()]
        stats = {
            'total_processed': len(actions),
            'blocked': actions.count('block'),
            'reviewed': actions.count('review'),
            'allowed': actions.count('allow')
        }

        stats['block_rate'] = stats['blocked'] / stats['total_processed'] * 100
        stats['review_rate'] = stats['reviewed'] / stats['total_processed'] * 100

        return stats

# Usage
filter_system = RealTimeContentFilter(client, auto_block_threshold=0.7)

messages = [
    ("Hello everyone!", "user1"),
    ("This is absolutely disgusting content.", "user2"),
    ("Thanks for the helpful information.", "user3"),
    ("You're all idiots and I hate this place.", "user4"),
    ("Looking forward to the next update!", "user5")
]

print("Real-time Content Filtering:")
for message, user in messages:
    result = filter_system.filter_message(message, user)

    print(f"User {user}: {message[:30]}...")
    print(f"  Action: {result['action']} - {result['reason']}")

    if result['violations']:
        print(f"  Violations: {len(result['violations'])}")
        for violation in result['violations'][:2]:  # Show top 2
            print(f"    - {violation['category']}: {violation['confidence']:.3f}")
    print()

# Show filtering statistics
stats = filter_system.get_filter_stats()
print("Filtering Statistics:")
for key, value in stats.items():
    print(f"  {key}: {value}")
```

### Content Moderation Pipeline

```python
class ModerationPipeline:
    def __init__(self, client):
        self.client = client
        self.processing_queue = []
        self.processed_results = []

    def add_content(self, content_id, text, metadata=None):
        """Add content to the moderation queue."""
        self.processing_queue.append({
            'id': content_id,
            'text': text,
            'metadata': metadata or {},
            'status': 'queued'
        })

    def process_queue(self):
        """Process all queued content."""
        processed_count = 0

        for item in self.processing_queue:
            if item['status'] == 'queued':
                try:
                    # Moderate content
                    document = language.Document(
                        content=item['text'],
                        type_=language.Document.Type.PLAIN_TEXT
                    )

                    response = self.client.moderate_text(
                        request={"document": document}
                    )

                    # Process results
                    violations = []
                    for category in response.moderation_categories:
                        violations.append({
                            'category': category.name,
                            'confidence': category.confidence
                        })

                    # Determine final action
                    high_confidence_violations = [
                        v for v in violations if v['confidence'] >= 0.7
                    ]

                    if high_confidence_violations:
                        final_action = 'reject'
                    elif any(v['confidence'] >= 0.5 for v in violations):
                        final_action = 'review'
                    else:
                        final_action = 'approve'

                    result = {
                        'id': item['id'],
                        'text': item['text'],
                        'metadata': item['metadata'],
                        'action': final_action,
                        'violations': violations,
                        'processed': True,
                        'error': None
                    }

                    item['status'] = 'processed'
                    self.processed_results.append(result)
                    processed_count += 1

                except Exception as e:
                    result = {
                        'id': item['id'],
                        'text': item['text'],
                        'metadata': item['metadata'],
                        'action': 'error',
                        'violations': [],
                        'processed': False,
                        'error': str(e)
                    }

                    item['status'] = 'error'
                    self.processed_results.append(result)

        return processed_count

    def get_results_by_action(self, action):
        """Get all results with a specific action."""
        return [r for r in self.processed_results if r['action'] == action]

    def export_review_queue(self):
        """Export items that need human review."""
        review_items = self.get_results_by_action('review')

        export_data = []
        for item in review_items:
            export_data.append({
                'content_id': item['id'],
                'text_preview': item['text'][:100] + "..." if len(item['text']) > 100 else item['text'],
                'violations': item['violations'],
                'metadata': item['metadata']
            })

        return export_data

# Usage
pipeline = ModerationPipeline(client)

# Add content to queue
content_samples = [
    ("post_1", "This is a great article about technology trends."),
    ("comment_2", "Your opinion is completely wrong and stupid."),
    ("review_3", "The product works well and I'm satisfied."),
    ("message_4", "I'm going to report you for this behavior."),
    ("post_5", "Looking forward to the conference next week!")
]

for content_id, text in content_samples:
    pipeline.add_content(content_id, text, {'source': 'user_generated'})

# Process the queue
processed = pipeline.process_queue()
print(f"Processed {processed} items")

# Get results by action
approved = pipeline.get_results_by_action('approve')
rejected = pipeline.get_results_by_action('reject')
review_needed = pipeline.get_results_by_action('review')

print(f"Approved: {len(approved)}")
print(f"Rejected: {len(rejected)}")
print(f"Needs Review: {len(review_needed)}")

# Export review queue
if review_needed:
    review_queue = pipeline.export_review_queue()
    print("\nItems needing human review:")
    for item in review_queue:
        print(f"ID: {item['content_id']}")
        print(f"Text: {item['text_preview']}")
        print(f"Violations: {len(item['violations'])}")
        print()
```

## Error Handling

```python
from google.api_core import exceptions

try:
    response = client.moderate_text(
        request={"document": document},
        timeout=10.0
    )
except exceptions.InvalidArgument as e:
    print(f"Invalid document: {e}")
    # Common causes: empty document, unsupported content type
except exceptions.ResourceExhausted:
    print("API quota exceeded")
except exceptions.DeadlineExceeded:
    print("Request timed out")
except exceptions.GoogleAPIError as e:
    print(f"API error: {e}")
else:
    # Handle the case of no moderation results (only when the call succeeded)
    if not response.moderation_categories:
        print("No moderation categories returned - content may be too short")
```

## Performance Considerations

- **Text Length**: Works with texts of varying length, but very short texts may return limited results
- **Batch Processing**: Use the async client for high-volume moderation (see the sketch below)
- **Caching**: Cache results for repeated content to reduce API calls
- **Fallback Strategy**: Have a fallback moderation path in case of API failures
- **Rate Limiting**: Apply rate limiting in high-traffic applications
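
For high-volume moderation, a minimal sketch of concurrent requests with the asynchronous client; this assumes `LanguageServiceAsyncClient` exposes the same `moderate_text` method as the synchronous client used elsewhere on this page:

```python
import asyncio

from google.cloud import language


async def moderate_many(texts):
    # Assumed async counterpart of the LanguageServiceClient used above
    client = language.LanguageServiceAsyncClient()

    async def moderate_one(text):
        document = language.Document(
            content=text,
            type_=language.Document.Type.PLAIN_TEXT
        )
        return await client.moderate_text(request={"document": document})

    # Fan requests out concurrently; add a semaphore or rate limiter for large volumes
    return await asyncio.gather(*(moderate_one(t) for t in texts))


# Example: responses = asyncio.run(moderate_many(["First text", "Second text"]))
```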

## Use Cases

- **Social Media Platforms**: Moderate user posts, comments, and messages
- **Content Publishing**: Screen articles, blog posts, and user-generated content
- **Chat Applications**: Filter inappropriate messages in real time
- **Review Systems**: Moderate product and service reviews
- **Community Forums**: Maintain safe discussion environments
- **Educational Platforms**: Ensure appropriate content for learning environments
- **Gaming Platforms**: Moderate in-game chat and user communications
- **Customer Support**: Screen support tickets and feedback for inappropriate content