
# Syntax Analysis (v1/v1beta2 only)

Provides comprehensive linguistic analysis including part-of-speech tagging, dependency parsing, morphological analysis, and token-level information to understand the grammatical structure and linguistic properties of text. Essential for applications requiring deep language understanding, grammar checking, and linguistic research.

**Note**: This feature is only available in API versions v1 and v1beta2. It is not included in the simplified v2 API.
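
If a project mixes API versions, a quick guard can make this requirement explicit. This is a minimal sketch, assuming only the v1 import used throughout this document; it simply checks that the chosen client exposes `analyze_syntax` before use:

```python
from google.cloud import language_v1

# Syntax analysis is served by the v1 (or v1beta2) client.
client = language_v1.LanguageServiceClient()

# Fail fast if the selected client has no syntax endpoint.
if not hasattr(client, "analyze_syntax"):
    raise RuntimeError("Syntax analysis requires the v1 or v1beta2 client")
```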

## Capabilities

### Analyze Syntax

Performs detailed syntactic analysis of the provided text, returning information about sentences, tokens, part-of-speech tags, and dependency relationships.

```python { .api }
def analyze_syntax(
    self,
    request: Optional[Union[AnalyzeSyntaxRequest, dict]] = None,
    *,
    document: Optional[Document] = None,
    encoding_type: Optional[EncodingType] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = ()
) -> AnalyzeSyntaxResponse:
    """
    Analyzes the syntax of the text and provides part-of-speech tagging,
    dependency parsing, and other linguistic information.

    Args:
        request: The request object containing document and options
        document: Input document for analysis
        encoding_type: Text encoding type for offset calculations
        retry: Retry configuration for the request
        timeout: Request timeout in seconds
        metadata: Additional metadata to send with the request

    Returns:
        AnalyzeSyntaxResponse containing linguistic analysis results
    """
```

#### Usage Example

```python
from google.cloud import language_v1  # Use v1 or v1beta2

# Initialize client (must use v1 or v1beta2)
client = language_v1.LanguageServiceClient()

# Create document
document = language_v1.Document(
    content="The quick brown fox jumps over the lazy dog.",
    type_=language_v1.Document.Type.PLAIN_TEXT
)

# Analyze syntax
response = client.analyze_syntax(
    request={"document": document}
)

# Process sentences
print("Sentences:")
for i, sentence in enumerate(response.sentences):
    print(f"{i+1}. {sentence.text.content}")

print("\nTokens with POS tags:")
for token in response.tokens:
    print(f"'{token.text.content}' - {token.part_of_speech.tag.name}")

print("\nDependency relationships:")
for i, token in enumerate(response.tokens):
    if token.dependency_edge.head_token_index != i:  # Not the root
        head_token = response.tokens[token.dependency_edge.head_token_index]
        print(f"'{token.text.content}' --{token.dependency_edge.label.name}--> '{head_token.text.content}'")
```

## Request and Response Types

### AnalyzeSyntaxRequest

```python { .api }
class AnalyzeSyntaxRequest:
    document: Document
    encoding_type: EncodingType
```

### AnalyzeSyntaxResponse

```python { .api }
class AnalyzeSyntaxResponse:
    sentences: MutableSequence[Sentence]
    tokens: MutableSequence[Token]
    language: str
```
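
The request can also be built as a typed object rather than a dict. A minimal sketch, assuming a v1 `LanguageServiceClient`; the sample text is illustrative. Passing `encoding_type` makes the offset fields in the response meaningful:

```python
from google.cloud import language_v1

client = language_v1.LanguageServiceClient()

request = language_v1.AnalyzeSyntaxRequest(
    document=language_v1.Document(
        content="Syntax analysis needs an encoding type for offsets.",
        type_=language_v1.Document.Type.PLAIN_TEXT,
    ),
    # Without an encoding type, begin_offset values are not usable.
    encoding_type=language_v1.EncodingType.UTF8,
)

response = client.analyze_syntax(request=request)
print(response.language)
```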

## Supporting Types

### Token

Represents a linguistic token with comprehensive morphological and syntactic information.

```python { .api }
class Token:
    text: TextSpan                   # Token text and position
    part_of_speech: PartOfSpeech     # Part-of-speech information
    dependency_edge: DependencyEdge  # Dependency relationship
    lemma: str                       # Canonical form of the token
```
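
A short sketch of reading each `Token` field, assuming `response` is an `AnalyzeSyntaxResponse` obtained with an encoding type set so offsets are populated:

```python
for token in response.tokens:
    print(
        f"{token.text.content!r} at offset {token.text.begin_offset}: "
        f"tag={token.part_of_speech.tag.name}, "
        f"head={token.dependency_edge.head_token_index}, "
        f"lemma={token.lemma}"
    )
```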

### PartOfSpeech

Comprehensive part-of-speech and morphological information.

```python { .api }
class PartOfSpeech:
    class Tag(proto.Enum):
        UNKNOWN = 0
        ADJ = 1     # Adjective
        ADP = 2     # Adposition (preposition/postposition)
        ADV = 3     # Adverb
        CONJ = 4    # Conjunction
        DET = 5     # Determiner
        NOUN = 6    # Noun
        NUM = 7     # Numeral
        PRON = 8    # Pronoun
        PRT = 9     # Particle
        PUNCT = 10  # Punctuation
        VERB = 11   # Verb
        X = 12      # Other/Unknown
        AFFIX = 13  # Affix

    class Aspect(proto.Enum):
        ASPECT_UNKNOWN = 0
        PERFECTIVE = 1
        IMPERFECTIVE = 2
        PROGRESSIVE = 3

    class Case(proto.Enum):
        CASE_UNKNOWN = 0
        ACCUSATIVE = 1
        ADVERBIAL = 2
        COMPLEMENTIVE = 3
        DATIVE = 4
        GENITIVE = 5
        INSTRUMENTAL = 6
        LOCATIVE = 7
        NOMINATIVE = 8
        OBLIQUE = 9
        PARTITIVE = 10
        PREPOSITIONAL = 11
        REFLEXIVE_CASE = 12
        RELATIVE_CASE = 13
        VOCATIVE = 14

    # Additional enums for Form, Gender, Mood, Number, Person, Proper, Reciprocity, Tense, Voice

    tag: Tag                  # Main part-of-speech tag
    aspect: Aspect            # Verbal aspect
    case: Case                # Grammatical case
    form: Form                # Word form
    gender: Gender            # Grammatical gender
    mood: Mood                # Grammatical mood
    number: Number            # Grammatical number
    person: Person            # Grammatical person
    proper: Proper            # Proper noun indicator
    reciprocity: Reciprocity  # Reciprocity
    tense: Tense              # Grammatical tense
    voice: Voice              # Grammatical voice
```
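
The `Tag` enum makes simple filters easy to write. A small sketch that collects noun lemmas, assuming `response` is an `AnalyzeSyntaxResponse` and the `language_v1` import from the earlier examples:

```python
# Collect the canonical form of every noun in the document.
nouns = [
    token.lemma
    for token in response.tokens
    if token.part_of_speech.tag == language_v1.PartOfSpeech.Tag.NOUN
]
print(nouns)
```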

### DependencyEdge

Represents a dependency relationship between tokens in the parse tree.

```python { .api }
class DependencyEdge:
    class Label(proto.Enum):
        UNKNOWN = 0
        ABBREV = 1        # Abbreviation modifier
        ACOMP = 2         # Adjectival complement
        ADVCL = 3         # Adverbial clause modifier
        ADVMOD = 4        # Adverbial modifier
        AMOD = 5          # Adjectival modifier
        APPOS = 6         # Appositional modifier
        ATTR = 7          # Attribute
        AUX = 8           # Auxiliary
        AUXPASS = 9       # Passive auxiliary
        CC = 10           # Coordinating conjunction
        CCOMP = 11        # Clausal complement
        CONJ = 12         # Conjunct
        CSUBJ = 13        # Clausal subject
        CSUBJPASS = 14    # Clausal passive subject
        DEP = 15          # Dependent
        DET = 16          # Determiner
        DISCOURSE = 17    # Discourse element
        DOBJ = 18         # Direct object
        EXPL = 19         # Expletive
        GOESWITH = 20     # Goes with
        IOBJ = 21         # Indirect object
        MARK = 22         # Marker
        MWE = 23          # Multi-word expression
        MWV = 24          # Multi-word verbal expression
        NEG = 25          # Negation modifier
        NN = 26           # Noun compound modifier
        NPADVMOD = 27     # Noun phrase adverbial modifier
        NSUBJ = 28        # Nominal subject
        NSUBJPASS = 29    # Passive nominal subject
        NUM = 30          # Numeric modifier
        NUMBER = 31       # Element of compound number
        P = 32            # Punctuation mark
        PARATAXIS = 33    # Parataxis
        PARTMOD = 34      # Participial modifier
        PCOMP = 35        # Prepositional complement
        POBJ = 36         # Object of preposition
        POSS = 37         # Possession modifier
        POSTNEG = 38      # Postverbal negative particle
        PRECOMP = 39      # Predicate complement
        PRECONJ = 40      # Preconjunct
        PREDET = 41       # Predeterminer
        PREF = 42         # Prefix
        PREP = 43         # Prepositional modifier
        PRONL = 44        # Pronominal locative
        PRT = 45          # Particle
        PS = 46           # Possessive ending
        QUANTMOD = 47     # Quantifier phrase modifier
        RCMOD = 48        # Relative clause modifier
        RCMODREL = 49     # Complementizer in relative clause
        RDROP = 50        # Ellipsis without a preceding predicate
        REF = 51          # Referent
        REMNANT = 52      # Remnant
        REPARANDUM = 53   # Reparandum
        ROOT = 54         # Root
        SNUM = 55         # Suffix specifying a unit of number
        SUFF = 56         # Suffix
        TMOD = 57         # Temporal modifier
        TOPIC = 58        # Topic marker
        VMOD = 59         # Verbal modifier
        VOCATIVE = 60     # Vocative
        XCOMP = 61        # Open clausal complement
        SUFFIX = 62       # Suffix
        TITLE = 63        # Title
        ADVPHMOD = 64     # Adverbial phrase modifier
        AUXCAUS = 65      # Causative auxiliary
        AUXVV = 66        # Helper auxiliary
        DTMOD = 67        # Rentaishi (Prenominal modifier)
        FOREIGN = 68      # Foreign words
        KW = 69           # Keyword
        LIST = 70         # List for chains of comparable items
        NOMC = 71         # Nominalized clause
        NOMCSUBJ = 72     # Nominalized clausal subject
        NOMCSUBJPASS = 73 # Nominalized clausal passive
        NUMC = 74         # Compound of numeric modifier
        COP = 75          # Copula
        DISLOCATED = 76   # Dislocated relation
        ASP = 77          # Aspect marker
        GMOD = 78         # Genitive modifier
        GOBJ = 79         # Genitive object
        INFMOD = 80       # Infinitival modifier
        MES = 81          # Measure
        NCOMP = 82        # Nominal complement of a noun

    head_token_index: int  # Index of the head token
    label: Label           # Dependency relationship label
```
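
Because each token stores only its head index, it can be convenient to invert the edges into a children map before walking the tree. A minimal sketch, assuming `response` is an `AnalyzeSyntaxResponse`:

```python
from collections import defaultdict

# Build head index -> list of dependent token indexes.
children = defaultdict(list)
for i, token in enumerate(response.tokens):
    head = token.dependency_edge.head_token_index
    if head != i:  # the root token points at itself
        children[head].append(i)

for head, deps in children.items():
    head_text = response.tokens[head].text.content
    dep_texts = [response.tokens[d].text.content for d in deps]
    print(f"{head_text}: {dep_texts}")
```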

## Advanced Usage

### Part-of-Speech Analysis

```python
def analyze_pos_distribution(client, text):
    """Analyze the distribution of parts of speech in text."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    pos_counts = {}
    total_tokens = len(response.tokens)

    for token in response.tokens:
        pos_tag = token.part_of_speech.tag.name
        pos_counts[pos_tag] = pos_counts.get(pos_tag, 0) + 1

    print("Part-of-Speech Distribution:")
    for pos, count in sorted(pos_counts.items(), key=lambda x: x[1], reverse=True):
        percentage = (count / total_tokens) * 100
        print(f"{pos}: {count} ({percentage:.1f}%)")

    return pos_counts

# Usage
text = "The quick brown fox jumps gracefully over the very lazy dog near the old oak tree."
pos_distribution = analyze_pos_distribution(client, text)
```

### Dependency Tree Visualization

```python
def visualize_dependency_tree(client, text):
    """Create a simple text representation of the dependency tree."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    # Request an encoding type so token begin_offset values are populated.
    response = client.analyze_syntax(
        request={"document": document,
                 "encoding_type": language_v1.EncodingType.UTF8}
    )

    # Find the root token
    root_index = None
    for i, token in enumerate(response.tokens):
        if token.dependency_edge.label == language_v1.DependencyEdge.Label.ROOT:
            root_index = i
            break

    if root_index is not None:
        print(f"Dependency Tree (root: '{response.tokens[root_index].text.content}'):")
        print_dependency_subtree(response.tokens, root_index, 0)

    return response.tokens

def print_dependency_subtree(tokens, head_index, depth):
    """Recursively print dependency subtree."""
    head_token = tokens[head_index]
    indent = "  " * depth
    pos_tag = head_token.part_of_speech.tag.name
    print(f"{indent}{head_token.text.content} ({pos_tag})")

    # Find children
    children = []
    for i, token in enumerate(tokens):
        if token.dependency_edge.head_token_index == head_index and i != head_index:
            children.append((i, token.dependency_edge.label.name))

    # Sort children by position in sentence
    children.sort(key=lambda x: tokens[x[0]].text.begin_offset)

    for child_index, relation in children:
        child_indent = "  " * (depth + 1)
        print(f"{child_indent}--{relation}-->")
        print_dependency_subtree(tokens, child_index, depth + 2)

# Usage
text = "The cat sat on the mat."
visualize_dependency_tree(client, text)
```

### Lemmatization

```python
def extract_lemmas(client, text):
    """Extract lemmatized forms of words."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    lemmas = []
    print("Word -> Lemma:")
    for token in response.tokens:
        word = token.text.content
        lemma = token.lemma
        pos = token.part_of_speech.tag.name

        # Only print words whose surface form differs from their lemma
        if word != lemma:
            print(f"{word} -> {lemma} ({pos})")

        # Keep every lemma so the full text can be reconstructed below
        lemmas.append(lemma)

    return lemmas

# Usage
text = "The children were running quickly through the trees and jumped over the fallen logs."
lemmas = extract_lemmas(client, text)
print(f"\nLemmatized text: {' '.join(lemmas)}")
```

### Subject-Verb-Object Extraction

```python
def extract_svo_triples(client, text):
    """Extract Subject-Verb-Object triples from text."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    triples = []

    # Find verbs
    for i, token in enumerate(response.tokens):
        if token.part_of_speech.tag == language_v1.PartOfSpeech.Tag.VERB:
            verb = token.text.content
            subject = None
            obj = None

            # Find subject and object
            for dependent in response.tokens:
                if dependent.dependency_edge.head_token_index == i:
                    if dependent.dependency_edge.label == language_v1.DependencyEdge.Label.NSUBJ:
                        subject = dependent.text.content
                    elif dependent.dependency_edge.label == language_v1.DependencyEdge.Label.DOBJ:
                        obj = dependent.text.content

            if subject and obj:
                triples.append((subject, verb, obj))

    return triples

# Usage
text = "The dog chased the cat. Mary loves books. John ate an apple."
svo_triples = extract_svo_triples(client, text)

print("Subject-Verb-Object triples:")
for subject, verb, obj in svo_triples:
    print(f"{subject} -> {verb} -> {obj}")
```

### Morphological Analysis

```python
def analyze_morphology(client, text):
    """Analyze morphological features of words."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    response = client.analyze_syntax(
        request={"document": document}
    )

    print("Morphological Analysis:")
    for token in response.tokens:
        word = token.text.content
        pos_info = token.part_of_speech

        features = []

        # Collect non-unknown morphological features
        if pos_info.aspect != language_v1.PartOfSpeech.Aspect.ASPECT_UNKNOWN:
            features.append(f"Aspect: {pos_info.aspect.name}")
        if pos_info.case != language_v1.PartOfSpeech.Case.CASE_UNKNOWN:
            features.append(f"Case: {pos_info.case.name}")
        if pos_info.gender != language_v1.PartOfSpeech.Gender.GENDER_UNKNOWN:
            features.append(f"Gender: {pos_info.gender.name}")
        if pos_info.mood != language_v1.PartOfSpeech.Mood.MOOD_UNKNOWN:
            features.append(f"Mood: {pos_info.mood.name}")
        if pos_info.number != language_v1.PartOfSpeech.Number.NUMBER_UNKNOWN:
            features.append(f"Number: {pos_info.number.name}")
        if pos_info.person != language_v1.PartOfSpeech.Person.PERSON_UNKNOWN:
            features.append(f"Person: {pos_info.person.name}")
        if pos_info.tense != language_v1.PartOfSpeech.Tense.TENSE_UNKNOWN:
            features.append(f"Tense: {pos_info.tense.name}")
        if pos_info.voice != language_v1.PartOfSpeech.Voice.VOICE_UNKNOWN:
            features.append(f"Voice: {pos_info.voice.name}")

        if features:
            print(f"{word} ({pos_info.tag.name}): {', '.join(features)}")
        else:
            print(f"{word} ({pos_info.tag.name})")

# Usage
text = "The cats were sleeping peacefully in their beds."
analyze_morphology(client, text)
```

### Sentence Complexity Analysis

```python
def analyze_sentence_complexity(client, text):
    """Analyze grammatical complexity of sentences."""
    document = language_v1.Document(
        content=text,
        type_=language_v1.Document.Type.PLAIN_TEXT
    )

    # Request an encoding type so sentence and token offsets are populated.
    response = client.analyze_syntax(
        request={"document": document,
                 "encoding_type": language_v1.EncodingType.UTF8}
    )

    sentence_stats = []

    for sentence in response.sentences:
        # Find tokens in this sentence
        sentence_tokens = [
            token for token in response.tokens
            if (token.text.begin_offset >= sentence.text.begin_offset and
                token.text.begin_offset < sentence.text.begin_offset + len(sentence.text.content))
        ]

        # Count different types of dependencies
        clause_count = 0
        modifier_count = 0

        for token in sentence_tokens:
            label = token.dependency_edge.label
            if label in [language_v1.DependencyEdge.Label.CCOMP,
                         language_v1.DependencyEdge.Label.ADVCL,
                         language_v1.DependencyEdge.Label.RCMOD]:
                clause_count += 1
            elif label in [language_v1.DependencyEdge.Label.AMOD,
                           language_v1.DependencyEdge.Label.ADVMOD,
                           language_v1.DependencyEdge.Label.PREP]:
                modifier_count += 1

        stats = {
            'sentence': sentence.text.content,
            'token_count': len(sentence_tokens),
            'clause_count': clause_count,
            'modifier_count': modifier_count,
            'complexity_score': len(sentence_tokens) + clause_count * 2 + modifier_count
        }

        sentence_stats.append(stats)

    return sentence_stats

# Usage
text = """
The cat sat.
The big fluffy cat that we adopted last year sat quietly on the comfortable wooden chair
that my grandmother gave me when I moved into my first apartment.
"""

complexity_stats = analyze_sentence_complexity(client, text)

print("Sentence Complexity Analysis:")
for i, stats in enumerate(complexity_stats, 1):
    print(f"Sentence {i}: {stats['sentence'][:50]}...")
    print(f"  Tokens: {stats['token_count']}")
    print(f"  Clauses: {stats['clause_count']}")
    print(f"  Modifiers: {stats['modifier_count']}")
    print(f"  Complexity Score: {stats['complexity_score']}")
    print()
```

## Error Handling

```python
from google.api_core import exceptions

try:
    response = client.analyze_syntax(
        request={"document": document},
        timeout=25.0
    )
except exceptions.InvalidArgument as e:
    print(f"Invalid request: {e}")
except exceptions.DeadlineExceeded:
    print("Request timed out")
except exceptions.FailedPrecondition as e:
    print(f"API version error: {e}")
    print("Note: Syntax analysis requires v1 or v1beta2")
except exceptions.GoogleAPIError as e:
    print(f"API error: {e}")
```
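
The `retry` parameter from the method signature can also be tuned explicitly. A sketch using `google.api_core.retry.Retry`; the backoff values and retried exception types are only illustrative:

```python
from google.api_core import exceptions
from google.api_core.retry import Retry, if_exception_type

# Retry transient failures with exponential backoff; values are illustrative.
custom_retry = Retry(
    initial=1.0,      # first backoff delay in seconds
    maximum=10.0,     # cap for a single backoff delay
    multiplier=2.0,   # backoff growth factor
    timeout=60.0,     # overall time budget for retries
    predicate=if_exception_type(
        exceptions.ServiceUnavailable,
        exceptions.DeadlineExceeded,
    ),
)

response = client.analyze_syntax(
    request={"document": document},
    retry=custom_retry,
)
```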

## Performance Considerations

- **Text Length**: Optimal for documents under 1 MB
- **Computation**: The most computationally intensive analysis type
- **Language Support**: Best results with well-supported languages
- **Caching**: Results can be cached for static text (see the sketch after this list)
- **API Version**: Only available in v1 and v1beta2
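
A minimal caching sketch for the point above: memoize responses keyed by a hash of the input text so repeated analysis of static documents avoids extra API calls. It assumes the `client` and `language_v1` import from the earlier examples; the cache strategy itself is illustrative, not part of the client library:

```python
import hashlib

_syntax_cache = {}

def analyze_syntax_cached(client, text):
    """Return a cached AnalyzeSyntaxResponse for previously seen text."""
    key = hashlib.sha256(text.encode("utf-8")).hexdigest()
    if key not in _syntax_cache:
        document = language_v1.Document(
            content=text,
            type_=language_v1.Document.Type.PLAIN_TEXT
        )
        _syntax_cache[key] = client.analyze_syntax(
            request={"document": document}
        )
    return _syntax_cache[key]
```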

## Use Cases

- **Grammar Checking**: Identify grammatical errors and suggest corrections
- **Text Simplification**: Analyze and simplify complex sentence structures
- **Information Extraction**: Extract structured information using syntactic patterns
- **Language Learning**: Provide detailed grammatical analysis for educational purposes
- **Machine Translation**: Use syntactic information to improve translation quality
- **Content Analysis**: Analyze writing style and complexity
- **Search Enhancement**: Use syntactic features for better search understanding
- **Question Answering**: Use dependency parsing to understand question structure