
# Content Moderation

Detects and flags potentially harmful, inappropriate, or unsafe content in text, providing moderation categories and confidence scores for content-filtering applications. Essential for maintaining safe online environments, protecting users from harmful content, and ensuring compliance with content policies.

## Capabilities

### Moderate Text

Analyzes the provided text to detect potentially harmful or inappropriate content across multiple safety categories.

```python { .api }
def moderate_text(
    self,
    request: Optional[Union[ModerateTextRequest, dict]] = None,
    *,
    document: Optional[Document] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = ()
) -> ModerateTextResponse:
    """
    Moderates text to detect potentially harmful or inappropriate content.

    Args:
        request: The request object containing the document
        document: Input document for moderation
        retry: Retry configuration for the request
        timeout: Request timeout in seconds
        metadata: Additional metadata to send with the request

    Returns:
        ModerateTextResponse containing moderation results
    """
```

#### Usage Example

```python
from google.cloud import language

# Initialize client
client = language.LanguageServiceClient()

# Create document
document = language.Document(
    content="This content contains inappropriate language and harmful statements.",
    type_=language.Document.Type.PLAIN_TEXT
)

# Moderate content
response = client.moderate_text(
    request={"document": document}
)

# Process moderation results
print("Content Moderation Results:")
for category in response.moderation_categories:
    print(f"Category: {category.name}")
    print(f"Confidence: {category.confidence:.3f}")

    # Check if content should be flagged
    if category.confidence > 0.5:  # Threshold can be adjusted
        print(f"⚠️ Content flagged for: {category.name}")
    print()

# Overall safety assessment
flagged_categories = [
    cat for cat in response.moderation_categories
    if cat.confidence > 0.5
]

if flagged_categories:
    print(f"Content FLAGGED - {len(flagged_categories)} safety issues detected")
else:
    print("Content appears safe")
```

## Request and Response Types

### ModerateTextRequest

```python { .api }
class ModerateTextRequest:
    document: Document
```

### ModerateTextResponse

```python { .api }
class ModerateTextResponse:
    moderation_categories: MutableSequence[ClassificationCategory]
```
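The examples above pass the request as a plain dict; the client also accepts a typed request object. A minimal sketch, assuming your installed `google-cloud-language` version re-exports `ModerateTextRequest` at the package top level (otherwise import it from the versioned `language_v1` module):

```python
from google.cloud import language

client = language.LanguageServiceClient()

document = language.Document(
    content="Example text to moderate.",
    type_=language.Document.Type.PLAIN_TEXT,
)

# A typed request object is equivalent to passing {"document": document}.
typed_request = language.ModerateTextRequest(document=document)
response = client.moderate_text(request=typed_request)

# Either form returns a ModerateTextResponse with moderation_categories.
for category in response.moderation_categories:
    print(category.name, round(category.confidence, 3))
```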

## Moderation Categories

The system detects various types of harmful content; a name-based lookup sketch follows the category list below.

### Common Moderation Categories

- **Toxic**: Generally harmful, offensive, or inappropriate content
- **Severe Toxicity**: Extremely harmful content with high confidence
- **Identity Attack**: Content attacking individuals based on identity
- **Insult**: Content intended to insult or demean
- **Profanity**: Content containing profane or vulgar language
- **Threat**: Content containing threats of violence or harm
- **Sexually Explicit**: Content containing explicit sexual material
- **Flirtation**: Content with flirtatious or suggestive language
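Each entry in `moderation_categories` carries a string `name` and a float `confidence`, so results can be indexed by name for quick checks. A minimal sketch, reusing the `response` from the usage example; the category names looked up here are illustrative and follow the list above:

```python
# Build a name -> confidence map from a ModerateTextResponse for easy lookups.
scores = {category.name: category.confidence for category in response.moderation_categories}

# Check a few categories of interest; names not present in the response default to 0.0.
for name in ("Toxic", "Threat", "Insult"):
    print(f"{name}: {scores.get(name, 0.0):.3f}")
```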

### Confidence Scores

Each category includes a confidence score from 0.0 to 1.0; a helper that maps scores to these bands follows the list:

- **0.0 - 0.3**: Low likelihood of harmful content
- **0.3 - 0.7**: Moderate likelihood - may require review
- **0.7 - 1.0**: High likelihood - likely harmful content
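A small helper that maps a raw score to these bands keeps threshold logic in one place. A minimal sketch; the band boundaries mirror the list above and are a documentation convention, not an API feature:

```python
def confidence_band(confidence: float) -> str:
    """Map a moderation confidence score to a review band."""
    if confidence >= 0.7:
        return "high"      # likely harmful content
    if confidence >= 0.3:
        return "moderate"  # may require human review
    return "low"           # low likelihood of harmful content

# Example: summarize a response by band.
for category in response.moderation_categories:
    print(f"{category.name}: {category.confidence:.2f} ({confidence_band(category.confidence)})")
```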

## Advanced Usage

### Configurable Content Filtering

```python
class ContentModerator:
    def __init__(self, client, thresholds=None):
        self.client = client
        self.thresholds = thresholds or {
            'Toxic': 0.7,
            'Severe Toxicity': 0.5,
            'Identity Attack': 0.6,
            'Insult': 0.8,
            'Profanity': 0.9,
            'Threat': 0.3,
            'Sexually Explicit': 0.8,
            'Flirtation': 0.9
        }

    def moderate_content(self, text):
        """Moderate content with configurable thresholds."""
        document = language.Document(
            content=text,
            type_=language.Document.Type.PLAIN_TEXT
        )

        response = self.client.moderate_text(
            request={"document": document}
        )

        violations = []
        warnings = []

        for category in response.moderation_categories:
            category_name = category.name
            confidence = category.confidence

            # Check against custom thresholds
            threshold = self.thresholds.get(category_name, 0.5)

            if confidence >= threshold:
                severity = 'high' if confidence >= 0.7 else 'medium'
                violations.append({
                    'category': category_name,
                    'confidence': confidence,
                    'severity': severity,
                    'threshold': threshold
                })
            elif confidence >= 0.3:  # Warning threshold
                warnings.append({
                    'category': category_name,
                    'confidence': confidence
                })

        return {
            'violations': violations,
            'warnings': warnings,
            'safe': len(violations) == 0,
            'all_categories': response.moderation_categories
        }

    def get_action_recommendation(self, moderation_result):
        """Get recommended action based on moderation results."""
        violations = moderation_result['violations']

        if not violations:
            return 'approve'

        # Check for severe violations
        severe_violations = [v for v in violations if v['severity'] == 'high']
        threat_violations = [v for v in violations if v['category'] == 'Threat']

        if severe_violations or threat_violations:
            return 'block'
        elif len(violations) >= 3:
            return 'review'
        elif any(v['confidence'] >= 0.8 for v in violations):
            return 'review'
        else:
            return 'flag'

# Usage
moderator = ContentModerator(client)

test_texts = [
    "This is a normal, friendly message.",
    "You're such an idiot and I hate you!",
    "I'm going to hurt you if you don't stop.",
    "That's a really inappropriate and offensive comment."
]

for text in test_texts:
    result = moderator.moderate_content(text)
    action = moderator.get_action_recommendation(result)

    print(f"Text: {text[:50]}...")
    print(f"Action: {action}")
    print(f"Safe: {result['safe']}")

    if result['violations']:
        print("Violations:")
        for violation in result['violations']:
            print(f"  - {violation['category']}: {violation['confidence']:.3f} ({violation['severity']})")

    if result['warnings']:
        print("Warnings:")
        for warning in result['warnings']:
            print(f"  - {warning['category']}: {warning['confidence']:.3f}")
    print()
```

### Batch Content Moderation

```python
def moderate_content_batch(client, texts, batch_size=10):
    """Moderate multiple texts efficiently."""
    results = []

    for i in range(0, len(texts), batch_size):
        batch = texts[i:i + batch_size]
        batch_results = []

        for text in batch:
            try:
                document = language.Document(
                    content=text,
                    type_=language.Document.Type.PLAIN_TEXT
                )

                response = client.moderate_text(
                    request={"document": document}
                )

                # Categorize results
                violations = []
                max_confidence = 0

                for category in response.moderation_categories:
                    if category.confidence > 0.5:
                        violations.append({
                            'category': category.name,
                            'confidence': category.confidence
                        })
                    max_confidence = max(max_confidence, category.confidence)

                batch_results.append({
                    'text': text,
                    'violations': violations,
                    'max_confidence': max_confidence,
                    'safe': len(violations) == 0,
                    'all_categories': response.moderation_categories
                })

            except Exception as e:
                batch_results.append({
                    'text': text,
                    'error': str(e),
                    'safe': None
                })

        results.extend(batch_results)

    return results


def generate_moderation_report(results):
    """Generate a summary report from batch moderation results."""
    total_texts = len(results)
    safe_count = sum(1 for r in results if r.get('safe') is True)
    flagged_count = sum(1 for r in results if r.get('safe') is False)
    error_count = sum(1 for r in results if 'error' in r)

    # Category statistics
    category_counts = {}
    for result in results:
        if 'violations' in result:
            for violation in result['violations']:
                category = violation['category']
                category_counts[category] = category_counts.get(category, 0) + 1

    print("Moderation Report")
    print("================")
    print(f"Total texts processed: {total_texts}")
    print(f"Safe content: {safe_count} ({safe_count/total_texts*100:.1f}%)")
    print(f"Flagged content: {flagged_count} ({flagged_count/total_texts*100:.1f}%)")
    print(f"Processing errors: {error_count}")
    print()

    if category_counts:
        print("Most common violations:")
        sorted_categories = sorted(category_counts.items(), key=lambda x: x[1], reverse=True)
        for category, count in sorted_categories[:5]:
            print(f"  {category}: {count} ({count/total_texts*100:.1f}%)")

    return {
        'total': total_texts,
        'safe': safe_count,
        'flagged': flagged_count,
        'errors': error_count,
        'category_counts': category_counts
    }

# Usage
sample_texts = [
    "Welcome to our community! Please be respectful.",
    "This is completely inappropriate and offensive.",
    "Great post! Thanks for sharing this information.",
    "You're an absolute moron and should be banned.",
    "I love this product and would recommend it to others."
]

batch_results = moderate_content_batch(client, sample_texts)
report = generate_moderation_report(batch_results)
```

### Real-time Content Filtering

```python
class RealTimeContentFilter:
    def __init__(self, client, auto_block_threshold=0.8):
        self.client = client
        self.auto_block_threshold = auto_block_threshold
        self.cache = {}  # Simple cache for repeated content

    def filter_message(self, message, user_id=None):
        """Filter a message in real-time with caching."""
        # Check cache first
        cache_key = hash(message.strip().lower())
        if cache_key in self.cache:
            return self.cache[cache_key]

        document = language.Document(
            content=message,
            type_=language.Document.Type.PLAIN_TEXT
        )

        try:
            response = self.client.moderate_text(
                request={"document": document}
            )

            # Analyze results
            max_confidence = 0
            violations = []

            for category in response.moderation_categories:
                if category.confidence > 0.3:  # Low threshold for tracking
                    violations.append({
                        'category': category.name,
                        'confidence': category.confidence
                    })
                max_confidence = max(max_confidence, category.confidence)

            # Determine action
            if max_confidence >= self.auto_block_threshold:
                action = 'block'
                reason = f"High confidence violation ({max_confidence:.3f})"
            elif max_confidence >= 0.5:
                action = 'review'
                reason = f"Moderate confidence violation ({max_confidence:.3f})"
            else:
                action = 'allow'
                reason = "Content appears safe"

            result = {
                'action': action,
                'reason': reason,
                'confidence': max_confidence,
                'violations': violations,
                'user_id': user_id,
                'timestamp': None  # Would be set in a real implementation
            }

            # Cache result
            self.cache[cache_key] = result

            return result

        except Exception as e:
            # Fail safe - allow content but log the error
            return {
                'action': 'allow',
                'reason': f"Moderation error: {str(e)}",
                'confidence': 0,
                'violations': [],
                'user_id': user_id,
                'error': True
            }

    def get_filter_stats(self):
        """Get statistics about filtering actions."""
        if not self.cache:
            return {}

        actions = [result['action'] for result in self.cache.values()]
        stats = {
            'total_processed': len(actions),
            'blocked': actions.count('block'),
            'reviewed': actions.count('review'),
            'allowed': actions.count('allow')
        }

        stats['block_rate'] = stats['blocked'] / stats['total_processed'] * 100
        stats['review_rate'] = stats['reviewed'] / stats['total_processed'] * 100

        return stats

# Usage
filter_system = RealTimeContentFilter(client, auto_block_threshold=0.7)

messages = [
    ("Hello everyone!", "user1"),
    ("This is absolutely disgusting content.", "user2"),
    ("Thanks for the helpful information.", "user3"),
    ("You're all idiots and I hate this place.", "user4"),
    ("Looking forward to the next update!", "user5")
]

print("Real-time Content Filtering:")
for message, user in messages:
    result = filter_system.filter_message(message, user)

    print(f"User {user}: {message[:30]}...")
    print(f"  Action: {result['action']} - {result['reason']}")

    if result['violations']:
        print(f"  Violations: {len(result['violations'])}")
        for violation in result['violations'][:2]:  # Show top 2
            print(f"    - {violation['category']}: {violation['confidence']:.3f}")
    print()

# Show filtering statistics
stats = filter_system.get_filter_stats()
print("Filtering Statistics:")
for key, value in stats.items():
    print(f"  {key}: {value}")
```

### Content Moderation Pipeline

```python
class ModerationPipeline:
    def __init__(self, client):
        self.client = client
        self.processing_queue = []
        self.processed_results = []

    def add_content(self, content_id, text, metadata=None):
        """Add content to the moderation queue."""
        self.processing_queue.append({
            'id': content_id,
            'text': text,
            'metadata': metadata or {},
            'status': 'queued'
        })

    def process_queue(self):
        """Process all queued content."""
        processed_count = 0

        for item in self.processing_queue:
            if item['status'] == 'queued':
                try:
                    # Moderate content
                    document = language.Document(
                        content=item['text'],
                        type_=language.Document.Type.PLAIN_TEXT
                    )

                    response = self.client.moderate_text(
                        request={"document": document}
                    )

                    # Process results
                    violations = []
                    for category in response.moderation_categories:
                        violations.append({
                            'category': category.name,
                            'confidence': category.confidence
                        })

                    # Determine final action
                    high_confidence_violations = [
                        v for v in violations if v['confidence'] >= 0.7
                    ]

                    if high_confidence_violations:
                        final_action = 'reject'
                    elif any(v['confidence'] >= 0.5 for v in violations):
                        final_action = 'review'
                    else:
                        final_action = 'approve'

                    result = {
                        'id': item['id'],
                        'text': item['text'],
                        'metadata': item['metadata'],
                        'action': final_action,
                        'violations': violations,
                        'processed': True,
                        'error': None
                    }

                    item['status'] = 'processed'
                    self.processed_results.append(result)
                    processed_count += 1

                except Exception as e:
                    result = {
                        'id': item['id'],
                        'text': item['text'],
                        'metadata': item['metadata'],
                        'action': 'error',
                        'violations': [],
                        'processed': False,
                        'error': str(e)
                    }

                    item['status'] = 'error'
                    self.processed_results.append(result)

        return processed_count

    def get_results_by_action(self, action):
        """Get all results with a specific action."""
        return [r for r in self.processed_results if r['action'] == action]

    def export_review_queue(self):
        """Export items that need human review."""
        review_items = self.get_results_by_action('review')

        export_data = []
        for item in review_items:
            export_data.append({
                'content_id': item['id'],
                'text_preview': (item['text'][:100] + "...") if len(item['text']) > 100 else item['text'],
                'violations': item['violations'],
                'metadata': item['metadata']
            })

        return export_data

# Usage
pipeline = ModerationPipeline(client)

# Add content to queue
content_samples = [
    ("post_1", "This is a great article about technology trends."),
    ("comment_2", "Your opinion is completely wrong and stupid."),
    ("review_3", "The product works well and I'm satisfied."),
    ("message_4", "I'm going to report you for this behavior."),
    ("post_5", "Looking forward to the conference next week!")
]

for content_id, text in content_samples:
    pipeline.add_content(content_id, text, {'source': 'user_generated'})

# Process the queue
processed = pipeline.process_queue()
print(f"Processed {processed} items")

# Get results by action
approved = pipeline.get_results_by_action('approve')
rejected = pipeline.get_results_by_action('reject')
review_needed = pipeline.get_results_by_action('review')

print(f"Approved: {len(approved)}")
print(f"Rejected: {len(rejected)}")
print(f"Needs Review: {len(review_needed)}")

# Export review queue
if review_needed:
    review_queue = pipeline.export_review_queue()
    print("\nItems needing human review:")
    for item in review_queue:
        print(f"ID: {item['content_id']}")
        print(f"Text: {item['text_preview']}")
        print(f"Violations: {len(item['violations'])}")
        print()
```

## Error Handling

```python
from google.api_core import exceptions

try:
    response = client.moderate_text(
        request={"document": document},
        timeout=10.0
    )
except exceptions.InvalidArgument as e:
    print(f"Invalid document: {e}")
    # Common causes: empty document, unsupported content type
except exceptions.ResourceExhausted:
    print("API quota exceeded")
except exceptions.DeadlineExceeded:
    print("Request timed out")
except exceptions.GoogleAPIError as e:
    print(f"API error: {e}")
else:
    # Handle an empty result (only reached when the call succeeded)
    if not response.moderation_categories:
        print("No moderation categories returned - content may be too short")
```
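The `retry` parameter of `moderate_text` accepts an explicit policy for transient failures. A minimal sketch using the standard `google.api_core.retry.Retry` class; the backoff values are illustrative, not recommendations:

```python
from google.api_core import exceptions, retry as retries

# Retry only transient errors, with exponential backoff and an overall time budget.
transient_retry = retries.Retry(
    initial=0.25,    # seconds before the first retry
    maximum=8.0,     # cap on per-attempt backoff
    multiplier=2.0,
    deadline=30.0,   # overall budget (newer google-api-core versions call this `timeout`)
    predicate=retries.if_exception_type(
        exceptions.ServiceUnavailable,
        exceptions.DeadlineExceeded,
    ),
)

response = client.moderate_text(
    request={"document": document},
    retry=transient_retry,
    timeout=10.0,
)
```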

## Performance Considerations

- **Text Length**: Works with various text lengths, but very short texts may have limited results
- **Batch Processing**: Use the async client for high-volume moderation (see the async sketch after this list)
- **Caching**: Implement caching for repeated content to reduce API calls
- **Fallback Strategy**: Have fallback moderation in case of API failures
- **Rate Limiting**: Implement rate limiting for high-traffic applications
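A hedged sketch of high-volume moderation with the async client; it assumes `LanguageServiceAsyncClient` is available in your installed client library and uses a semaphore as a crude client-side concurrency limit:

```python
import asyncio
from google.cloud import language


async def moderate_many(texts, max_concurrency=5):
    """Moderate many texts concurrently with a bounded number of in-flight requests."""
    client = language.LanguageServiceAsyncClient()
    semaphore = asyncio.Semaphore(max_concurrency)

    async def moderate_one(text):
        document = language.Document(
            content=text,
            type_=language.Document.Type.PLAIN_TEXT,
        )
        async with semaphore:
            response = await client.moderate_text(request={"document": document})
        return text, response.moderation_categories

    return await asyncio.gather(*(moderate_one(t) for t in texts))


# Usage (sample texts are illustrative)
results = asyncio.run(moderate_many([
    "Welcome to the forum!",
    "This comment is rude and offensive.",
]))
for text, categories in results:
    flagged = [c.name for c in categories if c.confidence > 0.5]
    print(text[:40], "->", flagged or "safe")
```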

## Use Cases

- **Social Media Platforms**: Moderate user posts, comments, and messages
- **Content Publishing**: Screen articles, blog posts, and user-generated content
- **Chat Applications**: Filter inappropriate messages in real time
- **Review Systems**: Moderate product and service reviews
- **Community Forums**: Maintain safe discussion environments
- **Educational Platforms**: Ensure appropriate content for learning environments
- **Gaming Platforms**: Moderate in-game chat and user communications
- **Customer Support**: Screen support tickets and feedback for inappropriate content