# Combined Analysis

Performs multiple types of analysis in a single API call, returning sentiment, entity, syntax, classification, and moderation results simultaneously. Combining analyses into one request reduces the number of API calls, cuts round-trip latency, and yields a complete picture of the text in one operation.

## Capabilities

### Annotate Text

Performs comprehensive text analysis including sentiment, entities, syntax (v1/v1beta2 only), classification, and moderation, depending on the specified features.

```python { .api }
def annotate_text(
    self,
    request: Optional[Union[AnnotateTextRequest, dict]] = None,
    *,
    document: Optional[Document] = None,
    features: Optional[AnnotateTextRequest.Features] = None,
    encoding_type: Optional[EncodingType] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = ()
) -> AnnotateTextResponse:
    """
    Provides comprehensive text analysis in a single request.

    Args:
        request: The request object containing document, features, and options
        document: Input document for analysis
        features: Features to extract (sentiment, entities, syntax, etc.)
        encoding_type: Text encoding type for offset calculations
        retry: Retry configuration for the request
        timeout: Request timeout in seconds
        metadata: Additional metadata to send with the request

    Returns:
        AnnotateTextResponse containing all requested analysis results
    """
```

#### Usage Example

```python
from google.cloud import language

# Initialize client
client = language.LanguageServiceClient()

# Create document
document = language.Document(
    content="""
Google is an amazing technology company founded by Larry Page and Sergey Brin.
They have revolutionized internet search and continue to innovate in artificial
intelligence and cloud computing. I'm really impressed with their latest products!
""",
    type_=language.Document.Type.PLAIN_TEXT
)

# Configure features to extract
features = language.AnnotateTextRequest.Features(
    extract_entities=True,
    extract_document_sentiment=True,  # also yields sentence-level sentiment
    extract_entity_sentiment=True,    # v1/v1beta2 only
    extract_syntax=True,              # v1/v1beta2 only
    classify_text=True,
    moderate_text=True
)

# Perform comprehensive analysis
response = client.annotate_text(
    request={
        "document": document,
        "features": features,
        "encoding_type": language.EncodingType.UTF8
    }
)

# Process all results
print("=== COMPREHENSIVE TEXT ANALYSIS ===\n")

# Document sentiment
if response.document_sentiment:
    print(f"Document Sentiment: {response.document_sentiment.score:.2f} (magnitude: {response.document_sentiment.magnitude:.2f})")

# Entities
if response.entities:
    print(f"\nEntities Found: {len(response.entities)}")
    for entity in response.entities[:3]:  # Show top 3
        print(f"  - {entity.name} ({entity.type_.name}): salience {entity.salience:.2f}")

# Sentences with sentiment
if response.sentences:
    print(f"\nSentences: {len(response.sentences)}")
    for i, sentence in enumerate(response.sentences):
        print(f"  {i+1}. {sentence.text.content}")
        if sentence.sentiment:
            print(f"     Sentiment: {sentence.sentiment.score:.2f}")

# Classification
if response.categories:
    print("\nClassification Categories:")
    for category in response.categories:
        print(f"  - {category.name}: {category.confidence:.2f}")

# Moderation
if response.moderation_categories:
    print("\nModeration Results:")
    flagged = [cat for cat in response.moderation_categories if cat.confidence > 0.5]
    if flagged:
        for cat in flagged:
            print(f"  ⚠️ {cat.name}: {cat.confidence:.2f}")
    else:
        print("  ✅ Content appears safe")

print(f"\nLanguage detected: {response.language}")
```

## Request and Response Types

### AnnotateTextRequest

```python { .api }
class AnnotateTextRequest:
    document: Document
    features: Features
    encoding_type: EncodingType
```

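If you prefer typed objects over plain dicts, the request can be built directly. A minimal sketch, assuming the `language` module alias and `client` from the usage example above:

```python
# Build the request as a typed AnnotateTextRequest instead of a dict.
request = language.AnnotateTextRequest(
    document=language.Document(
        content="Example text.",
        type_=language.Document.Type.PLAIN_TEXT
    ),
    features=language.AnnotateTextRequest.Features(
        extract_document_sentiment=True,
        extract_entities=True
    ),
    encoding_type=language.EncodingType.UTF8
)
response = client.annotate_text(request=request)
```
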
### AnnotateTextRequest.Features

Configuration for which analysis features to include. Sentence-level sentiment is returned whenever `extract_document_sentiment` is enabled.

```python { .api }
class Features:
    extract_syntax: bool                # Extract syntax info (v1/v1beta2 only)
    extract_entities: bool              # Extract entities
    extract_document_sentiment: bool    # Extract document and sentence sentiment
    extract_entity_sentiment: bool      # Extract entity sentiment (v1/v1beta2 only)
    classify_text: bool                 # Classify text into categories
    moderate_text: bool                 # Moderate content for safety
    classification_model_options: ClassificationModelOptions  # v1/v1beta2 only
```

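All flags default to `False`, so an empty `Features()` requests nothing; enable only the analyses you need. A small sketch:

```python
# Only document (and sentence) sentiment is computed; every other flag stays False.
features = language.AnnotateTextRequest.Features(extract_document_sentiment=True)
```
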
### AnnotateTextResponse

Comprehensive response containing all requested analysis results.

```python { .api }
class AnnotateTextResponse:
    sentences: MutableSequence[Sentence]    # Sentences with sentiment
    tokens: MutableSequence[Token]          # Tokens with syntax info (v1/v1beta2)
    entities: MutableSequence[Entity]       # Entities found
    document_sentiment: Sentiment           # Overall document sentiment
    language: str                           # Detected language
    categories: MutableSequence[ClassificationCategory]            # Classification results
    moderation_categories: MutableSequence[ClassificationCategory] # Moderation results
```

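Response messages are proto-plus objects, so they can be converted to plain Python structures for logging or storage. A short sketch; `to_dict` comes from the proto-plus base class rather than from this library specifically:

```python
import json

# Convert the proto-plus response message into a plain dict, then JSON.
as_dict = language.AnnotateTextResponse.to_dict(response)
print(json.dumps(as_dict, indent=2, default=str)[:500])
```
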
## Advanced Usage

### Comprehensive Content Analysis

```python
from google.cloud import language_v1, language_v2

def comprehensive_content_analysis(client, text, api_version='v1'):
    """Perform complete analysis of text content.

    The client must belong to the same API version as ``api_version``.
    """
    if api_version == 'v2':
        # v2 supports a reduced feature set
        lang = language_v2
        features = lang.AnnotateTextRequest.Features(
            extract_entities=True,
            extract_document_sentiment=True,
            classify_text=True,
            moderate_text=True
        )
    else:
        # v1/v1beta2 support the full feature set
        lang = language_v1
        features = lang.AnnotateTextRequest.Features(
            extract_syntax=True,
            extract_entities=True,
            extract_document_sentiment=True,
            extract_entity_sentiment=True,
            classify_text=True,
            moderate_text=True
        )

    document = lang.Document(
        content=text,
        type_=lang.Document.Type.PLAIN_TEXT
    )

    response = client.annotate_text(
        request={
            "document": document,
            "features": features,
            "encoding_type": lang.EncodingType.UTF8
        }
    )

    # Process and structure results
    analysis = {
        'text': text,
        'language': response.language,
        'word_count': len(text.split()),
        'character_count': len(text)
    }

    # Document sentiment
    if response.document_sentiment:
        analysis['document_sentiment'] = {
            'score': response.document_sentiment.score,
            'magnitude': response.document_sentiment.magnitude,
            'label': get_sentiment_label(response.document_sentiment.score)
        }

    # Entities
    if response.entities:
        analysis['entities'] = []
        for entity in response.entities:
            entity_data = {
                'name': entity.name,
                'type': entity.type_.name,
                'salience': entity.salience,
                'mentions': len(entity.mentions)
            }

            # Add entity sentiment if it was requested and populated
            sentiment = getattr(entity, 'sentiment', None)
            if sentiment and (sentiment.score or sentiment.magnitude):
                entity_data['sentiment'] = {
                    'score': sentiment.score,
                    'magnitude': sentiment.magnitude
                }

            analysis['entities'].append(entity_data)

    # Classification
    if response.categories:
        analysis['categories'] = [
            {'name': cat.name, 'confidence': cat.confidence}
            for cat in response.categories
        ]

    # Moderation
    if response.moderation_categories:
        analysis['moderation'] = {
            'safe': all(cat.confidence < 0.5 for cat in response.moderation_categories),
            'categories': [
                {'name': cat.name, 'confidence': cat.confidence}
                for cat in response.moderation_categories if cat.confidence > 0.1
            ]
        }

    # Sentence analysis
    if response.sentences:
        analysis['sentences'] = []
        for sentence in response.sentences:
            sentence_data = {
                'text': sentence.text.content,
                'word_count': len(sentence.text.content.split())
            }

            if sentence.sentiment:
                sentence_data['sentiment'] = {
                    'score': sentence.sentiment.score,
                    'magnitude': sentence.sentiment.magnitude
                }

            analysis['sentences'].append(sentence_data)

    # Syntax analysis (v1/v1beta2 only)
    if response.tokens:
        pos_counts = {}
        for token in response.tokens:
            pos = token.part_of_speech.tag.name
            pos_counts[pos] = pos_counts.get(pos, 0) + 1

        analysis['syntax'] = {
            'token_count': len(response.tokens),
            'pos_distribution': pos_counts,
            'complexity_score': calculate_complexity_score(response.tokens)
        }

    return analysis

def get_sentiment_label(score):
    """Convert sentiment score to human-readable label."""
    if score >= 0.25:
        return 'Positive'
    elif score <= -0.25:
        return 'Negative'
    else:
        return 'Neutral'

def calculate_complexity_score(tokens):
    """Calculate text complexity based on syntax."""
    if not tokens:
        return 0

    complex_pos = ['ADJ', 'ADV', 'VERB']
    complex_tokens = sum(1 for token in tokens if token.part_of_speech.tag.name in complex_pos)

    return complex_tokens / len(tokens)

# Usage
text = """
Apple Inc. is a fantastic American technology company that designs and develops
consumer electronics, software, and online services. Founded by Steve Jobs,
Steve Wozniak, and Ronald Wayne in 1976, the company has become one of the
world's most valuable companies. I absolutely love their innovative products
and exceptional design philosophy!
"""

analysis = comprehensive_content_analysis(client, text)

print("Comprehensive Content Analysis:")
print(f"Language: {analysis['language']}")
print(f"Words: {analysis['word_count']}, Characters: {analysis['character_count']}")

if 'document_sentiment' in analysis:
    sent = analysis['document_sentiment']
    print(f"Overall Sentiment: {sent['label']} (score: {sent['score']:.2f})")

if 'entities' in analysis:
    print(f"Entities: {len(analysis['entities'])}")
    for entity in analysis['entities'][:3]:
        print(f"  - {entity['name']} ({entity['type']})")

if 'categories' in analysis:
    print("Top Categories:")
    for cat in analysis['categories'][:2]:
        print(f"  - {cat['name']}: {cat['confidence']:.2f}")

if 'moderation' in analysis:
    print(f"Content Safety: {'Safe' if analysis['moderation']['safe'] else 'Flagged'}")
```

### Feature-Specific Analysis

```python
def analyze_with_specific_features(client, text, feature_set='basic'):
    """Perform analysis with predefined feature sets."""

    feature_sets = {
        'basic': language.AnnotateTextRequest.Features(
            extract_document_sentiment=True,
            extract_entities=True
        ),
        'sentiment_focus': language.AnnotateTextRequest.Features(
            extract_document_sentiment=True,  # also yields sentence sentiment
            extract_entity_sentiment=True     # v1/v1beta2 only
        ),
        'safety_focus': language.AnnotateTextRequest.Features(
            moderate_text=True,
            classify_text=True,
            extract_document_sentiment=True
        ),
        'linguistic': language.AnnotateTextRequest.Features(
            extract_syntax=True,              # v1/v1beta2 only
            extract_entities=True,
            extract_document_sentiment=True
        ),
        'complete': language.AnnotateTextRequest.Features(
            extract_syntax=True,              # v1/v1beta2 only
            extract_entities=True,
            extract_document_sentiment=True,
            extract_entity_sentiment=True,    # v1/v1beta2 only
            classify_text=True,
            moderate_text=True
        )
    }

    features = feature_sets.get(feature_set, feature_sets['basic'])

    document = language.Document(
        content=text,
        type_=language.Document.Type.PLAIN_TEXT
    )

    response = client.annotate_text(
        request={
            "document": document,
            "features": features
        }
    )

    return response

# Usage examples
text = "Google's new AI technology is impressive but raises some privacy concerns."

# Basic analysis
basic_response = analyze_with_specific_features(client, text, 'basic')
print("Basic Analysis:")
print(f"  Sentiment: {basic_response.document_sentiment.score:.2f}")
print(f"  Entities: {len(basic_response.entities)}")

# Safety-focused analysis
safety_response = analyze_with_specific_features(client, text, 'safety_focus')
print("\nSafety Analysis:")
if safety_response.moderation_categories:
    flagged = [cat for cat in safety_response.moderation_categories if cat.confidence > 0.3]
    print(f"  Safety issues: {len(flagged)}")
if safety_response.categories:
    print(f"  Categories: {len(safety_response.categories)}")
```

### Batch Comprehensive Analysis

```python
import concurrent.futures

def batch_comprehensive_analysis(client, texts, max_workers=3):
    """Perform comprehensive analysis on multiple texts concurrently."""

    def analyze_single_text(text):
        features = language.AnnotateTextRequest.Features(
            extract_document_sentiment=True,
            extract_entities=True,
            classify_text=True,
            moderate_text=True
        )

        document = language.Document(
            content=text,
            type_=language.Document.Type.PLAIN_TEXT
        )

        try:
            response = client.annotate_text(
                request={
                    "document": document,
                    "features": features
                }
            )

            return {
                'text': text[:100] + "..." if len(text) > 100 else text,
                'success': True,
                'sentiment_score': response.document_sentiment.score if response.document_sentiment else None,
                'entity_count': len(response.entities),
                'category_count': len(response.categories),
                'safe': all(cat.confidence < 0.5 for cat in response.moderation_categories) if response.moderation_categories else True,
                'language': response.language
            }
        except Exception as e:
            return {
                'text': text[:100] + "..." if len(text) > 100 else text,
                'success': False,
                'error': str(e)
            }

    results = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        future_to_text = {executor.submit(analyze_single_text, text): text for text in texts}

        # Note: results arrive in completion order, not input order
        for future in concurrent.futures.as_completed(future_to_text):
            results.append(future.result())

    return results

# Usage
texts = [
    "This product is absolutely amazing and I love using it every day!",
    "The customer service was terrible and the staff was very rude.",
    "Apple announced new features for their latest iPhone model.",
    "The weather is nice today and perfect for outdoor activities.",
    "I'm concerned about privacy issues with social media platforms."
]

batch_results = batch_comprehensive_analysis(client, texts)

print("Batch Analysis Results:")
for i, result in enumerate(batch_results, 1):
    print(f"{i}. {result['text']}")
    if result['success']:
        if result['sentiment_score'] is not None:
            print(f"   Sentiment: {result['sentiment_score']:.2f}")
        print(f"   Entities: {result['entity_count']}")
        print(f"   Categories: {result['category_count']}")
        print(f"   Safe: {result['safe']}")
        print(f"   Language: {result['language']}")
    else:
        print(f"   Error: {result['error']}")
    print()
```

### Performance Optimization

```python
import time

def optimized_analysis_pipeline(client, texts, chunk_size=10):
    """Optimized pipeline for processing large volumes of text."""

    results = []
    total_texts = len(texts)

    # Process in chunks to manage memory and API limits
    for i in range(0, total_texts, chunk_size):
        chunk = texts[i:i + chunk_size]
        chunk_start_time = time.time()

        print(f"Processing chunk {i//chunk_size + 1}/{(total_texts-1)//chunk_size + 1}")

        for text in chunk:
            # Use minimal features for faster processing
            features = language.AnnotateTextRequest.Features(
                extract_document_sentiment=True,
                extract_entities=True,
                classify_text=True
            )

            document = language.Document(
                content=text,
                type_=language.Document.Type.PLAIN_TEXT
            )

            try:
                response = client.annotate_text(
                    request={
                        "document": document,
                        "features": features
                    },
                    timeout=10.0  # Shorter timeout for faster processing
                )

                # Extract key metrics only
                result = {
                    'sentiment': response.document_sentiment.score if response.document_sentiment else 0,
                    'entity_count': len(response.entities),
                    'top_category': response.categories[0].name if response.categories else None,
                    'language': response.language
                }

                results.append(result)

            except Exception as e:
                results.append({
                    'error': str(e),
                    'sentiment': 0,
                    'entity_count': 0,
                    'top_category': None,
                    'language': 'unknown'
                })

        chunk_time = time.time() - chunk_start_time
        print(f"  Processed {len(chunk)} texts in {chunk_time:.2f} seconds")

        # Brief pause between chunks
        time.sleep(0.1)

    return results

# Usage for high-volume processing
# large_text_collection = [...]  # Your large collection of texts
# optimized_results = optimized_analysis_pipeline(client, large_text_collection)
```

## Error Handling

```python
from google.api_core import exceptions

try:
    response = client.annotate_text(
        request={
            "document": document,
            "features": features
        },
        timeout=30.0
    )
except exceptions.InvalidArgument as e:
    print(f"Invalid request: {e}")
    # Check document content and feature configuration
except exceptions.DeadlineExceeded:
    print("Request timed out - consider reducing features or text length")
except exceptions.ResourceExhausted:
    print("API quota exceeded")
except exceptions.FailedPrecondition as e:
    print(f"Feature not available in this API version: {e}")
except exceptions.GoogleAPIError as e:
    print(f"API error: {e}")
```

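Transient failures can also be handled declaratively through the `retry` parameter shown in the method signature. A sketch using `google.api_core.retry`; the backoff values are illustrative, not recommended defaults:

```python
from google.api_core import retry

# Retry only on transient errors, with exponential backoff.
custom_retry = retry.Retry(
    initial=1.0,      # first backoff delay, in seconds
    maximum=30.0,     # cap on any single backoff delay
    multiplier=2.0,   # exponential growth factor
    timeout=120.0,    # give up after two minutes overall
    predicate=retry.if_exception_type(
        exceptions.ServiceUnavailable,
        exceptions.DeadlineExceeded,
    ),
)

response = client.annotate_text(
    request={"document": document, "features": features},
    retry=custom_retry,
)
```
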
## Performance Considerations

- **Feature Selection**: Request only the features you need; each enabled feature adds processing time
- **Text Length**: Larger texts take longer to analyze; consider chunking very long documents
- **Timeout Configuration**: Set timeouts appropriate to the text length and the features requested
- **API Version**: v2 offers fewer features but may be faster for basic analysis
- **Batch Processing**: Use the async client for high-volume processing (see the sketch below)
- **Caching**: Cache results for static content to reduce API calls
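
A minimal async sketch of the batch-processing point above, assuming the v1 async client; the semaphore limit of 5 is an illustrative choice, not a documented quota:

```python
import asyncio
from google.cloud import language_v1

async def annotate_many(texts):
    client = language_v1.LanguageServiceAsyncClient()
    features = language_v1.AnnotateTextRequest.Features(
        extract_document_sentiment=True,
        extract_entities=True
    )
    semaphore = asyncio.Semaphore(5)  # cap concurrent in-flight requests

    async def annotate(text):
        document = language_v1.Document(
            content=text, type_=language_v1.Document.Type.PLAIN_TEXT
        )
        async with semaphore:
            return await client.annotate_text(
                request={"document": document, "features": features}
            )

    # gather preserves input order in the returned list
    return await asyncio.gather(*(annotate(t) for t in texts))

# responses = asyncio.run(annotate_many(["Text one.", "Text two."]))
```
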
## Use Cases

- **Content Management Systems**: Comprehensive analysis for content approval workflows
- **Social Media Monitoring**: Complete understanding of user-generated content
- **Customer Feedback Analysis**: Multi-dimensional analysis of reviews and feedback
- **Document Processing**: Full analysis of business documents and reports
- **Research Applications**: Comprehensive text analysis for academic research
- **Quality Assurance**: Complete content verification before publication
- **AI Training Data**: Generate comprehensive labels for machine learning datasets
- **Content Optimization**: Understand all aspects of content for optimization strategies