# Syntax Analysis (v1/v1beta2 only)

Provides comprehensive linguistic analysis, including part-of-speech tagging, dependency parsing, morphological analysis, and token-level information, to reveal the grammatical structure and linguistic properties of text. Essential for applications requiring deep language understanding, grammar checking, and linguistic research.

**Note**: This feature is only available in API versions v1 and v1beta2. It is not included in the simplified v2 API.

## Capabilities

### Analyze Syntax

Performs detailed syntactic analysis of the provided text, returning information about sentences, tokens, part-of-speech tags, and dependency relationships.
```python { .api }
def analyze_syntax(
    self,
    request: Optional[Union[AnalyzeSyntaxRequest, dict]] = None,
    *,
    document: Optional[Document] = None,
    encoding_type: Optional[EncodingType] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = ()
) -> AnalyzeSyntaxResponse:
    """
    Analyzes the syntax of the text and provides part-of-speech tagging,
    dependency parsing, and other linguistic information.

    Args:
        request: The request object containing document and options
        document: Input document for analysis
        encoding_type: Text encoding type for offset calculations
        retry: Retry configuration for the request
        timeout: Request timeout in seconds
        metadata: Additional metadata to send with the request

    Returns:
        AnalyzeSyntaxResponse containing linguistic analysis results
    """
```
#### Usage Example

```python
from google.cloud import language_v1  # Use v1 or v1beta2

# Initialize client (must use v1 or v1beta2)
client = language_v1.LanguageServiceClient()

# Create document
document = language_v1.Document(
    content="The quick brown fox jumps over the lazy dog.",
    type_=language_v1.Document.Type.PLAIN_TEXT
)

# Analyze syntax
response = client.analyze_syntax(
    request={"document": document}
)

# Process sentences
print("Sentences:")
for i, sentence in enumerate(response.sentences):
    print(f"{i+1}. {sentence.text.content}")

print("\nTokens with POS tags:")
for token in response.tokens:
    print(f"'{token.text.content}' - {token.part_of_speech.tag.name}")

print("\nDependency relationships:")
for i, token in enumerate(response.tokens):
    if token.dependency_edge.head_token_index != i:  # Not the root
        head_token = response.tokens[token.dependency_edge.head_token_index]
        print(f"'{token.text.content}' --{token.dependency_edge.label.name}--> '{head_token.text.content}'")
```
## Request and Response Types

### AnalyzeSyntaxRequest

```python { .api }
class AnalyzeSyntaxRequest:
    document: Document
    encoding_type: EncodingType
```

### AnalyzeSyntaxResponse

```python { .api }
class AnalyzeSyntaxResponse:
    sentences: MutableSequence[Sentence]
    tokens: MutableSequence[Token]
    language: str
```
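The request can also be built explicitly. Setting `encoding_type` is recommended whenever you plan to use token or sentence offsets, since the API calculates `begin_offset` values according to the requested encoding. A minimal sketch (the sample text and variable names are illustrative):

```python
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

request = language_v1.AnalyzeSyntaxRequest(
    document=language_v1.Document(
        content="The quick brown fox jumps over the lazy dog.",
        type_=language_v1.Document.Type.PLAIN_TEXT,
    ),
    # UTF8 / UTF16 / UTF32 controls how begin_offset values are computed
    encoding_type=language_v1.EncodingType.UTF8,
)

response = client.analyze_syntax(request=request)

# The response reports the language that was detected (or supplied)
print(f"Language: {response.language}")
print(f"Sentences: {len(response.sentences)}, tokens: {len(response.tokens)}")
```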
## Supporting Types

### Token

Represents a linguistic token with comprehensive morphological and syntactic information.

```python { .api }
class Token:
    text: TextSpan                    # Token text and position
    part_of_speech: PartOfSpeech      # Part-of-speech information
    dependency_edge: DependencyEdge   # Dependency relationship
    lemma: str                        # Canonical form of the token
```
### PartOfSpeech

Comprehensive part-of-speech and morphological information.

```python { .api }
class PartOfSpeech:
    class Tag(proto.Enum):
        UNKNOWN = 0
        ADJ = 1      # Adjective
        ADP = 2      # Adposition (preposition/postposition)
        ADV = 3      # Adverb
        CONJ = 4     # Conjunction
        DET = 5      # Determiner
        NOUN = 6     # Noun
        NUM = 7      # Numeral
        PRON = 8     # Pronoun
        PRT = 9      # Particle
        PUNCT = 10   # Punctuation
        VERB = 11    # Verb
        X = 12       # Other/Unknown
        AFFIX = 13   # Affix

    class Aspect(proto.Enum):
        ASPECT_UNKNOWN = 0
        PERFECTIVE = 1
        IMPERFECTIVE = 2
        PROGRESSIVE = 3

    class Case(proto.Enum):
        CASE_UNKNOWN = 0
        ACCUSATIVE = 1
        ADVERBIAL = 2
        COMPLEMENTIVE = 3
        DATIVE = 4
        GENITIVE = 5
        INSTRUMENTAL = 6
        LOCATIVE = 7
        NOMINATIVE = 8
        OBLIQUE = 9
        PARTITIVE = 10
        PREPOSITIONAL = 11
        REFLEXIVE_CASE = 12
        RELATIVE_CASE = 13
        VOCATIVE = 14

    # Additional enums for Form, Gender, Mood, Number, Person, Proper, Reciprocity, Tense, Voice

    tag: Tag                   # Main part-of-speech tag
    aspect: Aspect             # Verbal aspect
    case: Case                 # Grammatical case
    form: Form                 # Word form
    gender: Gender             # Grammatical gender
    mood: Mood                 # Grammatical mood
    number: Number             # Grammatical number
    person: Person             # Grammatical person
    proper: Proper             # Proper noun indicator
    reciprocity: Reciprocity   # Reciprocity
    tense: Tense               # Grammatical tense
    voice: Voice               # Grammatical voice
```
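The per-feature enums make it straightforward to filter tokens by a specific grammatical property. A small sketch that lists plural nouns, assuming the `client` and `response` objects from the usage example above:

```python
# Plural nouns: combine the main Tag with the Number morphology enum
plural_nouns = [
    token.text.content
    for token in response.tokens
    if token.part_of_speech.tag == language_v1.PartOfSpeech.Tag.NOUN
    and token.part_of_speech.number == language_v1.PartOfSpeech.Number.PLURAL
]
print("Plural nouns:", plural_nouns)
```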
### DependencyEdge

Represents a dependency relationship between tokens in the parse tree.

```python { .api }
class DependencyEdge:
    class Label(proto.Enum):
        UNKNOWN = 0
        ABBREV = 1         # Abbreviation modifier
        ACOMP = 2          # Adjectival complement
        ADVCL = 3          # Adverbial clause modifier
        ADVMOD = 4         # Adverbial modifier
        AMOD = 5           # Adjectival modifier
        APPOS = 6          # Appositional modifier
        ATTR = 7           # Attribute
        AUX = 8            # Auxiliary
        AUXPASS = 9        # Passive auxiliary
        CC = 10            # Coordinating conjunction
        CCOMP = 11         # Clausal complement
        CONJ = 12          # Conjunct
        CSUBJ = 13         # Clausal subject
        CSUBJPASS = 14     # Clausal passive subject
        DEP = 15           # Dependent
        DET = 16           # Determiner
        DISCOURSE = 17     # Discourse element
        DOBJ = 18          # Direct object
        EXPL = 19          # Expletive
        GOESWITH = 20      # Goes with
        IOBJ = 21          # Indirect object
        MARK = 22          # Marker
        MWE = 23           # Multi-word expression
        MWV = 24           # Multi-word verbal expression
        NEG = 25           # Negation modifier
        NN = 26            # Noun compound modifier
        NPADVMOD = 27      # Noun phrase adverbial modifier
        NSUBJ = 28         # Nominal subject
        NSUBJPASS = 29     # Passive nominal subject
        NUM = 30           # Numeric modifier
        NUMBER = 31        # Element of compound number
        P = 32             # Punctuation mark
        PARATAXIS = 33     # Parataxis
        PARTMOD = 34       # Participial modifier
        PCOMP = 35         # Prepositional complement
        POBJ = 36          # Object of preposition
        POSS = 37          # Possession modifier
        POSTNEG = 38       # Postverbal negative particle
        PRECOMP = 39       # Predicate complement
        PRECONJ = 40       # Preconjunct
        PREDET = 41        # Predeterminer
        PREF = 42          # Prefix
        PREP = 43          # Prepositional modifier
        PRONL = 44         # Pronominal locative
        PRT = 45           # Particle
        PS = 46            # Possessive ending
        QUANTMOD = 47      # Quantifier phrase modifier
        RCMOD = 48         # Relative clause modifier
        RCMODREL = 49      # Complementizer in relative clause
        RDROP = 50         # Ellipsis without a preceding predicate
        REF = 51           # Referent
        REMNANT = 52       # Remnant
        REPARANDUM = 53    # Reparandum
        ROOT = 54          # Root
        SNUM = 55          # Suffix specifying a unit of number
        SUFF = 56          # Suffix
        TMOD = 57          # Temporal modifier
        TOPIC = 58         # Topic marker
        VMOD = 59          # Verbal modifier
        VOCATIVE = 60      # Vocative
        XCOMP = 61         # Open clausal complement
        SUFFIX = 62        # Suffix
        TITLE = 63         # Title
        ADVPHMOD = 64      # Adverbial phrase modifier
        AUXCAUS = 65       # Causative auxiliary
        AUXVV = 66         # Helper auxiliary
        DTMOD = 67         # Rentaishi (prenominal modifier)
        FOREIGN = 68       # Foreign words
        KW = 69            # Keyword
        LIST = 70          # List for chains of comparable items
        NOMC = 71          # Nominalized clause
        NOMCSUBJ = 72      # Nominalized clausal subject
        NOMCSUBJPASS = 73  # Nominalized clausal passive
        NUMC = 74          # Compound of numeric modifier
        COP = 75           # Copula
        DISLOCATED = 76    # Dislocated relation
        ASP = 77           # Aspect marker
        GMOD = 78          # Genitive modifier
        GOBJ = 79          # Genitive object
        INFMOD = 80        # Infinitival modifier
        MES = 81           # Measure
        NCOMP = 82         # Nominal complement of a noun

    head_token_index: int   # Index of the head token
    label: Label            # Dependency relationship label
```
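Because `head_token_index` points back into the same `tokens` list, and the root token is its own head, you can trace any token's chain of heads up to the sentence root. A minimal sketch (the `path_to_root` helper and the token index used are illustrative, assuming the `client` from the examples above):

```python
def path_to_root(tokens, index):
    """Follow head_token_index links from one token up to the ROOT token."""
    path = [tokens[index].text.content]
    # The root token is its own head, so stop when the index no longer changes
    while tokens[index].dependency_edge.head_token_index != index:
        index = tokens[index].dependency_edge.head_token_index
        path.append(tokens[index].text.content)
    return path

document = language_v1.Document(
    content="The quick brown fox jumps over the lazy dog.",
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
response = client.analyze_syntax(request={"document": document})

# e.g. might print: lazy -> dog -> over -> jumps
print(" -> ".join(path_to_root(response.tokens, 7)))
```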
## Advanced Usage

### Part-of-Speech Analysis

```python
def analyze_pos_distribution(client, text):
    """Analyze the distribution of parts of speech in text."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    pos_counts = {}
    total_tokens = len(response.tokens)

    for token in response.tokens:
        pos_tag = token.part_of_speech.tag.name
        pos_counts[pos_tag] = pos_counts.get(pos_tag, 0) + 1

    print("Part-of-Speech Distribution:")
    for pos, count in sorted(pos_counts.items(), key=lambda x: x[1], reverse=True):
        percentage = (count / total_tokens) * 100
        print(f"{pos}: {count} ({percentage:.1f}%)")

    return pos_counts

# Usage
text = "The quick brown fox jumps gracefully over the very lazy dog near the old oak tree."
pos_distribution = analyze_pos_distribution(client, text)
```
### Dependency Tree Visualization

```python
def visualize_dependency_tree(client, text):
    """Create a simple text representation of the dependency tree."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={
            "document": document,
            # Request an encoding type so text.begin_offset values are populated
            "encoding_type": language_v1.EncodingType.UTF8,
        }
    )

    # Find the root token
    root_index = None
    for i, token in enumerate(response.tokens):
        if token.dependency_edge.label == language_v1.DependencyEdge.Label.ROOT:
            root_index = i
            break

    if root_index is not None:
        print(f"Dependency Tree (root: '{response.tokens[root_index].text.content}'):")
        print_dependency_subtree(response.tokens, root_index, 0)

    return response.tokens

def print_dependency_subtree(tokens, head_index, depth):
    """Recursively print a dependency subtree."""
    head_token = tokens[head_index]
    indent = " " * depth
    pos_tag = head_token.part_of_speech.tag.name
    print(f"{indent}{head_token.text.content} ({pos_tag})")

    # Find children
    children = []
    for i, token in enumerate(tokens):
        if token.dependency_edge.head_token_index == head_index and i != head_index:
            children.append((i, token.dependency_edge.label.name))

    # Sort children by position in sentence
    children.sort(key=lambda x: tokens[x[0]].text.begin_offset)

    for child_index, relation in children:
        child_indent = " " * (depth + 1)
        print(f"{child_indent}--{relation}-->")
        print_dependency_subtree(tokens, child_index, depth + 2)

# Usage
text = "The cat sat on the mat."
visualize_dependency_tree(client, text)
```
### Lemmatization

```python
def extract_lemmas(client, text):
    """Extract lemmatized forms of words."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    lemmas = []
    print("Word -> Lemma:")
    for token in response.tokens:
        word = token.text.content
        lemma = token.lemma
        pos = token.part_of_speech.tag.name

        if word != lemma:
            print(f"{word} -> {lemma} ({pos})")

        lemmas.append(lemma)

    return lemmas

# Usage
text = "The children were running quickly through the trees and jumped over the fallen logs."
lemmas = extract_lemmas(client, text)
print(f"\nLemmatized text: {' '.join(lemmas)}")
```
### Subject-Verb-Object Extraction

```python
def extract_svo_triples(client, text):
    """Extract Subject-Verb-Object triples from text."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    triples = []

    # Find verbs
    for i, token in enumerate(response.tokens):
        if token.part_of_speech.tag == language_v1.PartOfSpeech.Tag.VERB:
            verb = token.text.content
            subject = None
            obj = None

            # Find the subject and object attached to this verb
            for dependent in response.tokens:
                if dependent.dependency_edge.head_token_index == i:
                    if dependent.dependency_edge.label == language_v1.DependencyEdge.Label.NSUBJ:
                        subject = dependent.text.content
                    elif dependent.dependency_edge.label == language_v1.DependencyEdge.Label.DOBJ:
                        obj = dependent.text.content

            if subject and obj:
                triples.append((subject, verb, obj))

    return triples

# Usage
text = "The dog chased the cat. Mary loves books. John ate an apple."
svo_triples = extract_svo_triples(client, text)

print("Subject-Verb-Object triples:")
for subject, verb, obj in svo_triples:
    print(f"{subject} -> {verb} -> {obj}")
```
### Morphological Analysis

```python
def analyze_morphology(client, text):
    """Analyze morphological features of words."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    print("Morphological Analysis:")
    for token in response.tokens:
        word = token.text.content
        pos_info = token.part_of_speech

        features = []

        # Collect non-unknown morphological features
        if pos_info.aspect != language_v1.PartOfSpeech.Aspect.ASPECT_UNKNOWN:
            features.append(f"Aspect: {pos_info.aspect.name}")
        if pos_info.case != language_v1.PartOfSpeech.Case.CASE_UNKNOWN:
            features.append(f"Case: {pos_info.case.name}")
        if pos_info.gender != language_v1.PartOfSpeech.Gender.GENDER_UNKNOWN:
            features.append(f"Gender: {pos_info.gender.name}")
        if pos_info.mood != language_v1.PartOfSpeech.Mood.MOOD_UNKNOWN:
            features.append(f"Mood: {pos_info.mood.name}")
        if pos_info.number != language_v1.PartOfSpeech.Number.NUMBER_UNKNOWN:
            features.append(f"Number: {pos_info.number.name}")
        if pos_info.person != language_v1.PartOfSpeech.Person.PERSON_UNKNOWN:
            features.append(f"Person: {pos_info.person.name}")
        if pos_info.tense != language_v1.PartOfSpeech.Tense.TENSE_UNKNOWN:
            features.append(f"Tense: {pos_info.tense.name}")
        if pos_info.voice != language_v1.PartOfSpeech.Voice.VOICE_UNKNOWN:
            features.append(f"Voice: {pos_info.voice.name}")

        if features:
            print(f"{word} ({pos_info.tag.name}): {', '.join(features)}")
        else:
            print(f"{word} ({pos_info.tag.name})")

# Usage
text = "The cats were sleeping peacefully in their beds."
analyze_morphology(client, text)
```
### Sentence Complexity Analysis

```python
def analyze_sentence_complexity(client, text):
    """Analyze grammatical complexity of sentences."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={
            "document": document,
            # An encoding type is needed so sentence and token begin_offset values are populated
            "encoding_type": language_v1.EncodingType.UTF8,
        }
    )

    sentence_stats = []

    for sentence in response.sentences:
        # Find tokens in this sentence
        sentence_tokens = [
            token for token in response.tokens
            if (token.text.begin_offset >= sentence.text.begin_offset and
                token.text.begin_offset < sentence.text.begin_offset + len(sentence.text.content))
        ]

        # Count different types of dependencies
        clause_count = 0
        modifier_count = 0

        for token in sentence_tokens:
            label = token.dependency_edge.label
            if label in [language_v1.DependencyEdge.Label.CCOMP,
                         language_v1.DependencyEdge.Label.ADVCL,
                         language_v1.DependencyEdge.Label.RCMOD]:
                clause_count += 1
            elif label in [language_v1.DependencyEdge.Label.AMOD,
                           language_v1.DependencyEdge.Label.ADVMOD,
                           language_v1.DependencyEdge.Label.PREP]:
                modifier_count += 1

        stats = {
            'sentence': sentence.text.content,
            'token_count': len(sentence_tokens),
            'clause_count': clause_count,
            'modifier_count': modifier_count,
            'complexity_score': len(sentence_tokens) + clause_count * 2 + modifier_count
        }

        sentence_stats.append(stats)

    return sentence_stats

# Usage
text = """
The cat sat.
The big fluffy cat that we adopted last year sat quietly on the comfortable wooden chair
that my grandmother gave me when I moved into my first apartment.
"""

complexity_stats = analyze_sentence_complexity(client, text)

print("Sentence Complexity Analysis:")
for i, stats in enumerate(complexity_stats, 1):
    print(f"Sentence {i}: {stats['sentence'][:50]}...")
    print(f" Tokens: {stats['token_count']}")
    print(f" Clauses: {stats['clause_count']}")
    print(f" Modifiers: {stats['modifier_count']}")
    print(f" Complexity Score: {stats['complexity_score']}")
    print()
```
## Error Handling

```python
from google.api_core import exceptions

try:
    response = client.analyze_syntax(
        request={"document": document},
        timeout=25.0
    )
except exceptions.InvalidArgument as e:
    print(f"Invalid request: {e}")
except exceptions.DeadlineExceeded:
    print("Request timed out")
except exceptions.FailedPrecondition as e:
    print(f"API version error: {e}")
    print("Note: Syntax analysis requires v1 or v1beta2")
except exceptions.GoogleAPIError as e:
    print(f"API error: {e}")
```
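The `retry` parameter of `analyze_syntax` accepts a `google.api_core.retry.Retry` object when the default policy is not appropriate. A minimal sketch; the backoff values are illustrative, and older `google-api-core` releases use `deadline` rather than `timeout` for the overall retry budget:

```python
from google.api_core import exceptions, retry

# Retry transient failures with exponential backoff
custom_retry = retry.Retry(
    predicate=retry.if_exception_type(
        exceptions.ServiceUnavailable,
        exceptions.DeadlineExceeded,
    ),
    initial=0.5,      # first delay, in seconds
    maximum=8.0,      # cap on any single delay
    multiplier=2.0,   # exponential growth factor
    timeout=30.0,     # overall time budget for retries
)

response = client.analyze_syntax(
    request={"document": document},
    retry=custom_retry,
    timeout=10.0,  # per-attempt timeout in seconds
)
```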
## Performance Considerations

- **Text Length**: Optimal for documents under 1 MB
- **Computation**: The most computationally intensive of the analysis methods
- **Language Support**: Best results with well-supported languages
- **Caching**: Results for static text can be cached and reused (see the sketch below)
- **API Version**: Only available in v1 and v1beta2
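Because the analysis of a fixed document does not change between calls, static text only needs to be sent to the API once. A minimal in-memory caching sketch; the `cached_analyze_syntax` helper and hashing scheme are illustrative, not part of the client library:

```python
import hashlib

_syntax_cache = {}

def cached_analyze_syntax(client, text):
    """Return a cached AnalyzeSyntaxResponse, calling the API only on a cache miss."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _syntax_cache:
        document = language_v1.Document(
            content=text,
            type_=language_v1.Document.Type.PLAIN_TEXT
        )
        _syntax_cache[key] = client.analyze_syntax(request={"document": document})
    return _syntax_cache[key]

# The second call with identical text hits the cache instead of the API
response = cached_analyze_syntax(client, "The quick brown fox jumps over the lazy dog.")
response = cached_analyze_syntax(client, "The quick brown fox jumps over the lazy dog.")
```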
## Use Cases

- **Grammar Checking**: Identify grammatical errors and suggest corrections
- **Text Simplification**: Analyze and simplify complex sentence structures
- **Information Extraction**: Extract structured information using syntactic patterns
- **Language Learning**: Provide detailed grammatical analysis for educational purposes
- **Machine Translation**: Use syntactic information to improve translation quality
- **Content Analysis**: Analyze writing style and complexity
- **Search Enhancement**: Use syntactic features for better search understanding
- **Question Answering**: Use dependency parsing to understand question structure