
# Google Cloud Document AI

Google Cloud Document AI is a machine learning service that extracts structured data from documents using pre-trained and custom document processors. The service can process various document types including invoices, receipts, forms, contracts, and other business documents.

## Package Information

**Package Name:** `google-cloud-documentai`

**Version:** 3.6.0

**Documentation:** [Google Cloud Document AI Documentation](https://cloud.google.com/document-ai)

### Installation

```bash
pip install google-cloud-documentai
```

### Authentication

This package requires Google Cloud authentication. Provide credentials with one of the first two methods below; the third sets a default project and does not supply credentials by itself:

1. **Application Default Credentials (Recommended)**:
   ```bash
   gcloud auth application-default login
   ```

2. **Service Account Key**:
   ```bash
   export GOOGLE_APPLICATION_CREDENTIALS="path/to/service-account-key.json"
   ```

3. **Environment Variables** (default project):
   ```bash
   export GOOGLE_CLOUD_PROJECT="your-project-id"
   ```

## Core Imports

```python { .api }
# Main module - exports v1 (stable) API
from google.cloud.documentai import DocumentProcessorServiceClient
from google.cloud.documentai import Document, ProcessRequest, ProcessResponse

# Alternative import pattern
from google.cloud import documentai

# For async operations
from google.cloud.documentai import DocumentProcessorServiceAsyncClient

# Core types for document processing
# (the `types` subpackage lives in the versioned package)
from google.cloud.documentai_v1.types import (
    RawDocument,
    GcsDocument,
    Processor,
    ProcessorType,
    BoundingPoly,
    Vertex,
)
```

## Basic Usage Example

```python { .api }
from google.api_core.client_options import ClientOptions
from google.cloud.documentai import DocumentProcessorServiceClient
from google.cloud.documentai_v1.types import RawDocument, ProcessRequest

def process_document(project_id: str, location: str, processor_id: str, file_path: str, mime_type: str):
    """
    Process a document using Google Cloud Document AI.

    Args:
        project_id: Google Cloud project ID
        location: Processor location (e.g., 'us' or 'eu')
        processor_id: ID of the document processor to use
        file_path: Path to the document file
        mime_type: MIME type of the document (e.g., 'application/pdf')

    Returns:
        Document: Processed document with extracted data
    """
    # Initialize the client against the regional endpoint for the processor's
    # location; locations other than 'us' (such as 'eu') require it
    client = DocumentProcessorServiceClient(
        client_options=ClientOptions(api_endpoint=f"{location}-documentai.googleapis.com")
    )

    # The full resource name of the processor
    name = client.processor_path(project_id, location, processor_id)

    # Read the document file
    with open(file_path, "rb") as document:
        document_content = document.read()

    # Create raw document
    raw_document = RawDocument(content=document_content, mime_type=mime_type)

    # Configure the process request
    request = ProcessRequest(name=name, raw_document=raw_document)

    # Process the document
    result = client.process_document(request=request)

    # Access processed document
    document = result.document

    print(f"Document text: {document.text}")
    print(f"Number of pages: {len(document.pages)}")

    # Extract entities
    for entity in document.entities:
        print(f"Entity: {entity.type_} = {entity.mention_text}")

    return document

# Example usage
document = process_document(
    project_id="my-project",
    location="us",
    processor_id="abc123def456",
    file_path="invoice.pdf",
    mime_type="application/pdf"
)
```
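
The `mime_type` argument has to match the file being sent. The following is a minimal sketch of deriving it from the file extension with Python's standard `mimetypes` module. The helper name `guess_mime_type` is hypothetical and not part of the client library, and the guessed type should still be checked against the formats your processor accepts.

```python
import mimetypes


def guess_mime_type(file_path: str) -> str:
    """Guess a MIME type from the file extension (hypothetical helper)."""
    mime_type, _encoding = mimetypes.guess_type(file_path)
    if mime_type is None:
        # Fail early instead of sending a request with an unknown type
        raise ValueError(f"Cannot determine MIME type for {file_path!r}")
    return mime_type


print(guess_mime_type("invoice.pdf"))
```

The result could then be passed as the `mime_type` argument of `process_document`.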

## Architecture

### Document Processing Workflow

Google Cloud Document AI follows this processing workflow:

1. **Document Input**: Raw documents (PDF, images) or Cloud Storage references
2. **Processor Selection**: Choose appropriate pre-trained or custom processor
3. **Processing**: AI models extract text, layout, and structured data
4. **Output**: Structured document with text, entities, tables, and metadata

### Key Concepts

#### Processors
Processors are AI models that extract data from specific document types:
- **Pre-trained processors**: Ready-to-use for common documents (invoices, receipts, forms)
- **Custom processors**: Trained on your specific document types
- **Processor versions**: Different iterations of a processor with varying capabilities

#### Documents
The `Document` type represents processed documents with:
- **Text**: Extracted text content with character-level positioning
- **Pages**: Individual pages with layout elements (blocks, paragraphs, lines, tokens)
- **Entities**: Extracted structured data (names, dates, amounts, addresses)
- **Tables**: Detected tables with cell-level data
- **Form fields**: Key-value pairs from forms
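
Layout elements do not carry their own copy of the text. Each one points into the document-level text through `layout.text_anchor.text_segments`, where every segment has a `start_index` and an `end_index`. The sketch below resolves such an anchor; the helper name `layout_text` is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the attribute shape so the example runs without calling the service.

```python
from types import SimpleNamespace


def layout_text(document_text: str, layout) -> str:
    """Join the slices of the full text that a layout's text anchor points to."""
    return "".join(
        document_text[int(segment.start_index):int(segment.end_index)]
        for segment in layout.text_anchor.text_segments
    )


# Stand-in objects that mimic the attribute shape of a layout element
text = "Invoice 42\nTotal: 10.00\n"
layout = SimpleNamespace(
    text_anchor=SimpleNamespace(
        text_segments=[SimpleNamespace(start_index=11, end_index=23)]
    )
)
print(layout_text(text, layout))  # Total: 10.00
```

With a real response, the same helper could be called as `layout_text(document.text, paragraph.layout)` for each paragraph in `page.paragraphs`.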

#### Locations
Processors are deployed in specific regions:
- `us`: United States (Iowa)
- `eu`: Europe (Belgium)
- Custom locations for enterprise customers
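
A processor is reached through the API endpoint of its own location, which follows the pattern `{location}-documentai.googleapis.com`. As a minimal sketch, assuming that pattern holds for the location you use, the hypothetical helper `api_endpoint_for` below builds the host name:

```python
def api_endpoint_for(location: str) -> str:
    """Build the regional Document AI endpoint host for a processor location."""
    # Accept only simple identifiers such as "us", "eu", or "us-central1"
    if not location or not location.replace("-", "").isalnum():
        raise ValueError(f"Unexpected location: {location!r}")
    return f"{location}-documentai.googleapis.com"


print(api_endpoint_for("eu"))  # eu-documentai.googleapis.com
```

The returned host can be handed to the client through `ClientOptions(api_endpoint=...)` from `google.api_core.client_options`.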

## Capabilities

### Document Processing Operations
Core functionality for processing individual and batch documents.

```python { .api }
# Process single document
from google.cloud.documentai import DocumentProcessorServiceClient
from google.cloud.documentai_v1.types import ProcessRequest, RawDocument

client = DocumentProcessorServiceClient()
with open("invoice.pdf", "rb") as f:  # a request needs a document source
    raw_document = RawDocument(content=f.read(), mime_type="application/pdf")
request = ProcessRequest(
    name="projects/my-project/locations/us/processors/abc123",
    raw_document=raw_document,
)
result = client.process_document(request=request)
```

**[→ Document Processing Operations](./document-processing.md)**

### Processor Management
Manage processor lifecycle including creation, deployment, and training.

```python { .api }
# List available processors
from google.cloud.documentai import DocumentProcessorServiceClient
from google.cloud.documentai_v1.types import ListProcessorsRequest

client = DocumentProcessorServiceClient()
request = ListProcessorsRequest(parent="projects/my-project/locations/us")

# list_processors returns a pager; iterating it directly walks every page
for processor in client.list_processors(request=request):
    print(f"Processor: {processor.display_name} ({processor.name})")
```

**[→ Processor Management](./processor-management.md)**

### Document Types and Schemas
Work with document structures, entities, and type definitions.

```python { .api }
# Access document structure
from google.cloud.documentai_v1.types import Document

def analyze_document_structure(document: Document):
    """Analyze the structure of a processed document."""
    print(f"Total text length: {len(document.text)}")

    # Analyze pages
    for i, page in enumerate(document.pages):
        print(f"Page {i+1}: {len(page.blocks)} blocks, {len(page.paragraphs)} paragraphs")

    # Analyze entities by type
    entity_types = {}
    for entity in document.entities:
        entity_type = entity.type_
        if entity_type not in entity_types:
            entity_types[entity_type] = []
        entity_types[entity_type].append(entity.mention_text)

    for entity_type, mentions in entity_types.items():
        print(f"{entity_type}: {len(mentions)} instances")
```

**[→ Document Types and Schemas](./document-types.md)**
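
Each extracted entity also carries a `confidence` score, so a common follow-up to grouping by type is to drop uncertain extractions first. The sketch below assumes a threshold of 0.8, which is an arbitrary illustration value and not a recommendation from the library; `group_confident_entities` is a hypothetical helper, and the `SimpleNamespace` objects stand in for real entities.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_confident_entities(entities, min_confidence: float = 0.8):
    """Group mention texts by entity type, skipping low-confidence entities."""
    grouped = defaultdict(list)
    for entity in entities:
        if entity.confidence >= min_confidence:
            grouped[entity.type_].append(entity.mention_text)
    return dict(grouped)


# Stand-in entities with the same attribute names as Document.entities items
entities = [
    SimpleNamespace(type_="total_amount", mention_text="10.00", confidence=0.97),
    SimpleNamespace(type_="invoice_date", mention_text="2024-01-05", confidence=0.55),
    SimpleNamespace(type_="total_amount", mention_text="12.00", confidence=0.91),
]
print(group_confident_entities(entities))
```

With a processed document, the call would be `group_confident_entities(document.entities)`.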

### Batch Operations
Process multiple documents asynchronously for high-volume workflows.

```python { .api }
# Batch process documents
from google.cloud.documentai import DocumentProcessorServiceClient
from google.cloud.documentai_v1.types import (
    BatchDocumentsInputConfig,
    BatchProcessRequest,
    GcsDocuments,
)

client = DocumentProcessorServiceClient()

# Configure batch request
gcs_documents = GcsDocuments(documents=[
    {"gcs_uri": "gs://my-bucket/doc1.pdf", "mime_type": "application/pdf"},
    {"gcs_uri": "gs://my-bucket/doc2.pdf", "mime_type": "application/pdf"}
])

request = BatchProcessRequest(
    name="projects/my-project/locations/us/processors/abc123",
    # input_documents expects a BatchDocumentsInputConfig wrapper
    input_documents=BatchDocumentsInputConfig(gcs_documents=gcs_documents),
    document_output_config={
        "gcs_output_config": {"gcs_uri": "gs://my-bucket/output/"}
    }
)

# Returns a long-running operation; call operation.result() to wait for it
operation = client.batch_process_documents(request=request)
```

**[→ Batch Operations](./batch-operations.md)**

### Beta Features (v1beta3)
Access experimental features including dataset management and enhanced document processing.

```python { .api }
# Beta features - DocumentService for dataset management
from google.cloud.documentai_v1beta3 import DocumentServiceClient
from google.cloud.documentai_v1beta3.types import Dataset

client = DocumentServiceClient()

# List documents in a dataset (the request field is named `dataset`)
request = {"dataset": "projects/my-project/locations/us/processors/abc123/dataset"}
response = client.list_documents(request=request)
```

**[→ Beta Features](./beta-features.md)**

## API Versions

### V1 (Stable)
The main `google.cloud.documentai` module exports the stable v1 API:
- **Module**: `google.cloud.documentai`
- **Direct access**: `google.cloud.documentai_v1`
- **Status**: Production ready
- **Features**: Core document processing and processor management

### V1beta3 (Beta)
Extended API with additional features:
- **Module**: `google.cloud.documentai_v1beta3`
- **Status**: Beta (subject to breaking changes)
- **Additional features**: Dataset management, enhanced document operations, custom training

277

278

## Error Handling

279

280

```python { .api }

281

from google.cloud.documentai import DocumentProcessorServiceClient

282

from google.cloud.exceptions import GoogleCloudError

283

from google.api_core.exceptions import NotFound, InvalidArgument

284

285

client = DocumentProcessorServiceClient()

286

287

try:

288

# Process document

289

result = client.process_document(request=request)

290

except NotFound as e:

291

print(f"Processor not found: {e}")

292

except InvalidArgument as e:

293

print(f"Invalid request: {e}")

294

except GoogleCloudError as e:

295

print(f"Google Cloud error: {e}")

296

except Exception as e:

297

print(f"Unexpected error: {e}")

298

```
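
Some failures are transient and worth retrying instead of only printing. The following is a generic sketch of retrying with exponential backoff in plain Python. `call_with_retries` and `TransientError` are hypothetical names used for illustration; which exception types count as transient for your workload (for example `ServiceUnavailable` from `google.api_core.exceptions`) is an assumption you would need to decide yourself.

```python
import time


def call_with_retries(func, retryable, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(), retrying on the given exception types with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func()
        except retryable:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * (2 ** attempt))


class TransientError(Exception):
    """Stand-in for a transient API error."""


calls = {"count": 0}


def flaky():
    """Fail twice, then succeed, to exercise the retry loop."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientError("try again")
    return "ok"


delays = []  # record the delays instead of actually sleeping
print(call_with_retries(flaky, (TransientError,), sleep=delays.append))
```

In real use, `func` could be `lambda: client.process_document(request=request)` and `sleep` would be left at its default.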

## Resource Names

Google Cloud Document AI uses hierarchical resource names:

```python { .api }
from google.cloud.documentai import DocumentProcessorServiceClient

client = DocumentProcessorServiceClient()

# Build resource names using helper methods
processor_path = client.processor_path("my-project", "us", "processor-id")
# Result: "projects/my-project/locations/us/processors/processor-id"

processor_version_path = client.processor_version_path(
    "my-project", "us", "processor-id", "version-id"
)
# Result: "projects/my-project/locations/us/processors/processor-id/processorVersions/version-id"

location_path = client.common_location_path("my-project", "us")
# Result: "projects/my-project/locations/us"
```
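
Going the other way, a processor name can be split back into its parts. This is a standalone sketch using a regular expression based on the format shown in the results above; `parse_processor_name` is a hypothetical helper written for illustration, not a function of the client library.

```python
import re

# Pattern for "projects/{project}/locations/{location}/processors/{processor}"
_PROCESSOR_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/processors/(?P<processor>[^/]+)$"
)


def parse_processor_name(name: str) -> dict:
    """Split a processor resource name into project, location, and processor."""
    match = _PROCESSOR_NAME.match(name)
    if match is None:
        raise ValueError(f"Not a processor resource name: {name!r}")
    return match.groupdict()


print(parse_processor_name("projects/my-project/locations/us/processors/processor-id"))
```

The `location` part is useful when the same code has to pick the matching regional endpoint for a processor name read from configuration.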

## Performance Considerations

- **Document Size**: Individual documents up to 20 MB, batch operations up to 1,000 documents
- **Rate Limits**: Vary by processor type and region
- **Async Processing**: Use batch operations for high-volume processing
- **Caching**: Consider caching processed results for frequently accessed documents
- **Regional Processing**: Use the same region as your data for better performance
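
When a workload exceeds the per-batch document count, the input list has to be split across several batch requests. A minimal sketch, taking the 1,000-document figure from the list above as the default (limits can change, so treat it as an assumption and check your project's quotas); `chunk_uris` is a hypothetical helper:

```python
from typing import Iterator, List, Sequence


def chunk_uris(uris: Sequence[str], max_per_batch: int = 1000) -> Iterator[List[str]]:
    """Yield consecutive slices of uris, each no longer than max_per_batch."""
    if max_per_batch < 1:
        raise ValueError("max_per_batch must be at least 1")
    for start in range(0, len(uris), max_per_batch):
        yield list(uris[start:start + max_per_batch])


uris = [f"gs://my-bucket/doc{i}.pdf" for i in range(2500)]
print([len(batch) for batch in chunk_uris(uris)])  # [1000, 1000, 500]
```

Each yielded chunk could then become the document list of one batch request.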

## Next Steps

- **[Document Processing Operations](./document-processing.md)**: Learn core document processing workflows
- **[Processor Management](./processor-management.md)**: Manage and configure processors
- **[Document Types and Schemas](./document-types.md)**: Understand document structure and types
- **[Batch Operations](./batch-operations.md)**: Process documents at scale
- **[Beta Features](./beta-features.md)**: Explore cutting-edge capabilities