tessl/pypi-google-cloud-dlp

Google Cloud Data Loss Prevention (DLP) API client library for discovering, classifying, and protecting sensitive data

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/google-cloud-dlp@3.31.x

To install, run

npx @tessl/cli install tessl/pypi-google-cloud-dlp@3.31.0

# Google Cloud DLP

Google Cloud Data Loss Prevention (DLP) API enables organizations to discover, classify, and protect sensitive data across their cloud and hybrid environments. It provides comprehensive content inspection, data transformation, risk analysis, and automated data discovery capabilities with extensive configuration options for compliance and privacy requirements.

## Package Information

- **Package Name**: google-cloud-dlp
- **Language**: Python
- **Installation**: `pip install google-cloud-dlp`

## Core Imports

```python
from google.cloud import dlp
```

For direct access to the v2 API:

```python
from google.cloud import dlp_v2
```

## Basic Usage

```python
from google.cloud import dlp

# Initialize the DLP client
client = dlp.DlpServiceClient()

# Replace with your Google Cloud project ID
project_id = "your-project-id"

# Basic content inspection
parent = f"projects/{project_id}/locations/global"
content_item = dlp.ContentItem(value="My SSN is 123-45-6789")

# Configure inspection; include_quote is required for finding.quote to be populated
inspect_config = dlp.InspectConfig(
    info_types=[dlp.InfoType(name="US_SOCIAL_SECURITY_NUMBER")],
    include_quote=True,
)

# Create request
request = dlp.InspectContentRequest(
    parent=parent,
    inspect_config=inspect_config,
    item=content_item,
)

# Inspect content
response = client.inspect_content(request=request)

# Process findings
for finding in response.result.findings:
    print(f"Found {finding.info_type.name}: {finding.quote}")
```
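
The same request can also be passed as a plain dict, which the generated client converts into an `InspectContentRequest`. The sketch below builds such a dict with a likelihood threshold and a cap on findings. `build_inspect_request` and `run_inspection` are hypothetical helper names, and the numeric `min_likelihood` mirrors the `Likelihood` values listed under Enumeration Types.

```python
def build_inspect_request(project_id, text, info_types, min_likelihood=3, max_findings=10):
    """Build a dict-form InspectContentRequest (hypothetical helper).

    min_likelihood uses the numeric Likelihood values (3 = POSSIBLE).
    """
    return {
        "parent": f"projects/{project_id}/locations/global",
        "inspect_config": {
            "info_types": [{"name": name} for name in info_types],
            "min_likelihood": min_likelihood,
            "include_quote": True,
            "limits": {"max_findings_per_request": max_findings},
        },
        "item": {"value": text},
    }


def run_inspection(project_id, text, info_types):
    """Send the request; needs google-cloud-dlp and application credentials."""
    from google.cloud import dlp_v2  # imported lazily so the builder works offline

    client = dlp_v2.DlpServiceClient()
    response = client.inspect_content(
        request=build_inspect_request(project_id, text, info_types)
    )
    return [(f.info_type.name, f.quote) for f in response.result.findings]
```

Keeping the request builder separate from the network call makes the configuration easy to unit test without credentials.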

## Architecture

The Google Cloud DLP API follows a service-oriented architecture with distinct functional areas:

- **Client Libraries**: Synchronous and asynchronous clients for API interaction
- **Content Analysis**: Real-time inspection, de-identification, and image redaction
- **Job Management**: Long-running batch operations with triggers and scheduling
- **Data Discovery**: Automated scanning and profiling of cloud data sources
- **Template System**: Reusable configurations for inspection and transformation
- **Type System**: Extensive type definitions for configuration and results

The API supports both immediate operations for small datasets and batch processing for enterprise-scale data protection workflows.
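
As a rough illustration of the asynchronous client, the sketch below awaits `inspect_content` on `DlpServiceAsyncClient`. The helper name `inspect_async` and the choice of `EMAIL_ADDRESS` are assumptions made for the example, and the call needs the package installed plus application credentials.

```python
import asyncio


async def inspect_async(project_id, text):
    """Inspect text with the async client (hypothetical helper)."""
    from google.cloud import dlp_v2  # lazy import: needs google-cloud-dlp installed

    client = dlp_v2.DlpServiceAsyncClient()
    response = await client.inspect_content(
        request={
            "parent": f"projects/{project_id}/locations/global",
            "inspect_config": {
                "info_types": [{"name": "EMAIL_ADDRESS"}],
                "include_quote": True,
            },
            "item": {"value": text},
        }
    )
    return [finding.quote for finding in response.result.findings]


# Usage (requires credentials):
# quotes = asyncio.run(inspect_async("your-project-id", "Contact: jane@example.com"))
```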

## Capabilities

### Content Analysis

Real-time analysis of text and images to detect, redact, and transform sensitive information. Supports immediate inspection with customizable info types and confidence levels.

```python { .api }
def inspect_content(
    request: dlp.InspectContentRequest,
    *,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.InspectContentResponse: ...

def deidentify_content(
    request: dlp.DeidentifyContentRequest,
    *,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.DeidentifyContentResponse: ...

def redact_image(
    request: dlp.RedactImageRequest,
    *,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.RedactImageResponse: ...
```

[Content Analysis](./content-analysis.md)
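
To make `deidentify_content` more concrete, here is a minimal sketch that masks every detected value with a fixed character. The helper names `build_mask_request` and `deidentify` are hypothetical; the dict keys follow the v2 `DeidentifyContentRequest` message, passed in dict form.

```python
def build_mask_request(project_id, text, info_types, masking_character="#", number_to_mask=0):
    """Build a dict-form DeidentifyContentRequest that masks matched values.

    number_to_mask=0 leaves the field unset, which masks every character of a match.
    """
    return {
        "parent": f"projects/{project_id}/locations/global",
        "inspect_config": {"info_types": [{"name": name} for name in info_types]},
        "deidentify_config": {
            "info_type_transformations": {
                "transformations": [
                    {
                        "primitive_transformation": {
                            "character_mask_config": {
                                "masking_character": masking_character,
                                "number_to_mask": number_to_mask,
                            }
                        }
                    }
                ]
            }
        },
        "item": {"value": text},
    }


def deidentify(project_id, text, info_types):
    """Send the request and return the transformed text; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the builder works offline

    client = dlp_v2.DlpServiceClient()
    response = client.deidentify_content(
        request=build_mask_request(project_id, text, info_types)
    )
    return response.item.value
```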

### Template Management

Reusable configurations for inspection and de-identification operations. Templates standardize DLP policies across an organization and simplify repeated operations.

```python { .api }
def create_inspect_template(
    request: dlp.CreateInspectTemplateRequest,
    *,
    parent: Optional[str] = None,
    inspect_template: Optional[dlp.InspectTemplate] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.InspectTemplate: ...

def create_deidentify_template(
    request: dlp.CreateDeidentifyTemplateRequest,
    *,
    parent: Optional[str] = None,
    deidentify_template: Optional[dlp.DeidentifyTemplate] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.DeidentifyTemplate: ...
```

[Template Management](./template-management.md)
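
A possible workflow, sketched under the assumption that requests are passed as dicts: create an inspect template once, then reference it by resource name in later inspections instead of repeating the configuration. `build_template_request` and `create_and_use_template` are hypothetical helpers.

```python
def build_template_request(project_id, template_id, info_types, display_name=""):
    """Build a dict-form CreateInspectTemplateRequest (hypothetical helper)."""
    return {
        "parent": f"projects/{project_id}/locations/global",
        "template_id": template_id,
        "inspect_template": {
            "display_name": display_name,
            "inspect_config": {
                "info_types": [{"name": name} for name in info_types],
                "include_quote": True,
            },
        },
    }


def create_and_use_template(project_id, template_id, text):
    """Create a template, then inspect text with it; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the builder works offline

    client = dlp_v2.DlpServiceClient()
    template = client.create_inspect_template(
        request=build_template_request(project_id, template_id, ["EMAIL_ADDRESS"])
    )
    # Reference the stored template by name instead of sending an inline config
    response = client.inspect_content(
        request={
            "parent": f"projects/{project_id}/locations/global",
            "inspect_template_name": template.name,
            "item": {"value": text},
        }
    )
    return list(response.result.findings)
```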

### Job Management

Long-running batch operations for processing large datasets, including scheduled triggers, hybrid content inspection, and job lifecycle management.

```python { .api }
def create_dlp_job(
    request: dlp.CreateDlpJobRequest,
    *,
    parent: Optional[str] = None,
    inspect_job: Optional[dlp.InspectJobConfig] = None,
    risk_job: Optional[dlp.RiskAnalysisJobConfig] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.DlpJob: ...

def create_job_trigger(
    request: dlp.CreateJobTriggerRequest,
    *,
    parent: Optional[str] = None,
    job_trigger: Optional[dlp.JobTrigger] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.JobTrigger: ...
```

[Job Management](./job-management.md)
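
The sketch below shows one way a Cloud Storage inspection could be submitted both as a one-off job and as a recurring trigger. The helper names and the `gs://` URL passed by the caller are illustrative assumptions; the dict keys follow the v2 `InspectJobConfig` and `JobTrigger` messages.

```python
def build_gcs_inspect_job(bucket_url, info_types):
    """InspectJobConfig as a dict: what to scan and what to look for."""
    return {
        "storage_config": {"cloud_storage_options": {"file_set": {"url": bucket_url}}},
        "inspect_config": {"info_types": [{"name": name} for name in info_types]},
    }


def build_trigger(inspect_job, scan_period_days):
    """JobTrigger as a dict that reruns inspect_job on a fixed schedule."""
    seconds = scan_period_days * 24 * 60 * 60
    return {
        "inspect_job": inspect_job,
        "triggers": [{"schedule": {"recurrence_period_duration": {"seconds": seconds}}}],
    }


def submit(project_id, bucket_url, info_types, scan_period_days=1):
    """Create a one-off job and a recurring trigger; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the builders work offline

    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/global"
    inspect_job = build_gcs_inspect_job(bucket_url, info_types)

    # One-off batch job
    job = client.create_dlp_job(request={"parent": parent, "inspect_job": inspect_job})

    # Recurring job on a schedule
    trigger = build_trigger(inspect_job, scan_period_days)
    trigger["status"] = dlp_v2.JobTrigger.Status.HEALTHY
    created = client.create_job_trigger(request={"parent": parent, "job_trigger": trigger})
    return job.name, created.name
```

The returned job name can later be passed to `get_dlp_job` to check whether the job has finished.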

### Data Discovery

Automated scanning and profiling of cloud data sources to understand data distribution, sensitivity, and compliance posture across BigQuery, Cloud Storage, Cloud SQL, and more.

```python { .api }
def create_discovery_config(
    request: dlp.CreateDiscoveryConfigRequest,
    *,
    parent: Optional[str] = None,
    discovery_config: Optional[dlp.DiscoveryConfig] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.DiscoveryConfig: ...
```

[Data Discovery](./data-discovery.md)
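
A minimal sketch of a project-level discovery configuration that profiles BigQuery tables, assuming an inspect template already exists. The field names (`targets`, `big_query_target`, `filter`, `other_tables`, `inspect_templates`) reflect my reading of the v2 `DiscoveryConfig` message and should be checked against the API reference; the helper names are hypothetical.

```python
def build_discovery_request(project_id, location, inspect_template_name, display_name):
    """Build a dict-form CreateDiscoveryConfigRequest (hypothetical helper)."""
    return {
        "parent": f"projects/{project_id}/locations/{location}",
        "discovery_config": {
            "display_name": display_name,
            "inspect_templates": [inspect_template_name],
            "targets": [
                # Assumed shape: profile all BigQuery tables not matched by another target
                {"big_query_target": {"filter": {"other_tables": {}}}}
            ],
        },
    }


def create_discovery(project_id, location, inspect_template_name):
    """Create the discovery config in a running state; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the builder works offline

    client = dlp_v2.DlpServiceClient()
    request = build_discovery_request(
        project_id, location, inspect_template_name, "BigQuery discovery"
    )
    request["discovery_config"]["status"] = dlp_v2.DiscoveryConfig.Status.RUNNING
    return client.create_discovery_config(request=request).name
```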

### Data Profiling

Access to data profiles and insights generated by discovery scans, providing visibility into data sensitivity, distribution, and risk levels across projects, tables, columns, and file stores.

```python { .api }
def get_project_data_profile(
    request: dlp.GetProjectDataProfileRequest,
    *,
    name: Optional[str] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.ProjectDataProfile: ...

def get_table_data_profile(
    request: dlp.GetTableDataProfileRequest,
    *,
    name: Optional[str] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.TableDataProfile: ...

def get_column_data_profile(
    request: dlp.GetColumnDataProfileRequest,
    *,
    name: Optional[str] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.ColumnDataProfile: ...

def get_file_store_data_profile(
    request: dlp.GetFileStoreDataProfileRequest,
    *,
    name: Optional[str] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.FileStoreDataProfile: ...
```

[Data Profiling](./data-profiling.md)
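
Profiles are addressed by resource name. The sketch below assumes the project-scoped name format `projects/{project}/locations/{location}/tableDataProfiles/{id}` and uses `list_table_data_profiles` to summarize sensitivity and risk; the helper names are hypothetical.

```python
def table_profile_name(project_id, location, profile_id):
    """Resource name of a table data profile (assumed project-scoped format)."""
    return f"projects/{project_id}/locations/{location}/tableDataProfiles/{profile_id}"


def get_table_profile(project_id, location, profile_id):
    """Fetch one table profile by name; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the name helper works offline

    client = dlp_v2.DlpServiceClient()
    name = table_profile_name(project_id, location, profile_id)
    return client.get_table_data_profile(request={"name": name})


def summarize_table_profiles(project_id, location):
    """Return (resource, sensitivity, risk) for each table profile; needs credentials."""
    from google.cloud import dlp_v2

    client = dlp_v2.DlpServiceClient()
    parent = f"projects/{project_id}/locations/{location}"
    # The list call returns a pager that fetches further pages on demand
    return [
        (p.full_resource, p.sensitivity_score.score.name, p.data_risk_level.score.name)
        for p in client.list_table_data_profiles(request={"parent": parent})
    ]
```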

### Stored Info Types

Custom detectors for organization-specific sensitive data. Stored info types extend the built-in detectors with custom dictionaries, large custom dictionaries, and regular expressions.

```python { .api }
def create_stored_info_type(
    request: dlp.CreateStoredInfoTypeRequest,
    *,
    parent: Optional[str] = None,
    config: Optional[dlp.StoredInfoTypeConfig] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.StoredInfoType: ...
```

[Stored Info Types](./stored-info-types.md)
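
As an illustration, the sketch below defines a dictionary-based stored info type and shows how an inspection could refer to it through `custom_info_types`. The helper names and the example info type name are hypothetical.

```python
def build_stored_info_type_request(project_id, stored_info_type_id, words, display_name=""):
    """Build a dict-form CreateStoredInfoTypeRequest backed by a word list."""
    return {
        "parent": f"projects/{project_id}/locations/global",
        "stored_info_type_id": stored_info_type_id,
        "config": {
            "display_name": display_name,
            "dictionary": {"word_list": {"words": list(words)}},
        },
    }


def custom_info_type_ref(info_type_name, stored_info_type_resource):
    """CustomInfoType entry that points an inspection at a stored info type."""
    return {
        "info_type": {"name": info_type_name},
        "stored_type": {"name": stored_info_type_resource},
    }


def create_stored_type(project_id, stored_info_type_id, words):
    """Create the stored info type and return its resource name; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the builders work offline

    client = dlp_v2.DlpServiceClient()
    stored = client.create_stored_info_type(
        request=build_stored_info_type_request(project_id, stored_info_type_id, words)
    )
    return stored.name
```

The entry returned by `custom_info_type_ref` belongs in the `custom_info_types` list of an `InspectConfig`.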

### Connection Management

External data source connections for accessing data outside Google Cloud, including database connections, cloud storage from other providers, and hybrid environments.

```python { .api }
def create_connection(
    request: dlp.CreateConnectionRequest,
    *,
    parent: Optional[str] = None,
    connection: Optional[dlp.Connection] = None,
    retry: OptionalRetry = gapic_v1.method.DEFAULT,
    timeout: Union[float, object] = gapic_v1.method.DEFAULT,
    metadata: Sequence[Tuple[str, Union[str, bytes]]] = (),
) -> dlp.Connection: ...
```

[Connection Management](./connection-management.md)
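
Once connections exist, their health can be checked by listing them. This is a small sketch that assumes connections live under a regional parent of the form `projects/{project}/locations/{location}`; the helper names are hypothetical.

```python
def connections_parent(project_id, location):
    """Parent resource under which connections are listed (assumed format)."""
    return f"projects/{project_id}/locations/{location}"


def list_connection_states(project_id, location):
    """Map each connection name to its state name; needs credentials."""
    from google.cloud import dlp_v2  # lazy import so the parent helper works offline

    client = dlp_v2.DlpServiceClient()
    request = {"parent": connections_parent(project_id, location)}
    return {conn.name: conn.state.name for conn in client.list_connections(request=request)}
```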

## Core Types

### Client Classes

```python { .api }
class DlpServiceClient:
    """Synchronous client for Google Cloud DLP service operations."""

    def __init__(
        self,
        *,
        credentials: Optional[ga_credentials.Credentials] = None,
        transport: Optional[DlpServiceTransport] = None,
        client_options: Optional[ClientOptions] = None,
        client_info: gapic_v1.client_info.ClientInfo = DEFAULT_CLIENT_INFO,
    ) -> None: ...

class DlpServiceAsyncClient:
    """Asynchronous client for Google Cloud DLP service operations."""

    def __init__(
        self,
        *,
        credentials: Optional[ga_credentials.Credentials] = None,
        transport: Optional[DlpServiceAsyncTransport] = None,
        client_options: Optional[ClientOptions] = None,
        client_info: gapic_v1.client_info.ClientInfo = DEFAULT_CLIENT_INFO,
    ) -> None: ...
```

### Core Data Types

```python { .api }
class ContentItem:
    """Container for content to be inspected."""

    value: str
    table: Table
    byte_item: ByteContentItem

class InfoType:
    """Type of information detector."""

    name: str
    version: str
    sensitivity_score: SensitivityScore

class Finding:
    """Detected sensitive information."""

    info_type: InfoType
    likelihood: Likelihood
    location: Location
    quote: str
    quote_info: QuoteInfo

class InspectConfig:
    """Configuration for content inspection."""

    info_types: Sequence[InfoType]
    min_likelihood: Likelihood
    limits: InspectConfig.FindingLimits
    include_quote: bool
    exclude_info_types: bool
```
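
Besides plain text in `value`, a `ContentItem` can carry structured data in `table`. The sketch below converts a list of row dicts into the dict form of a table item; `rows_to_table_item` is a hypothetical helper, and every cell is sent as a string for simplicity.

```python
def rows_to_table_item(rows):
    """Convert a list of dicts with identical keys into a dict-form ContentItem."""
    if not rows:
        return {"table": {"headers": [], "rows": []}}
    headers = list(rows[0])  # column order follows the first row
    return {
        "table": {
            "headers": [{"name": header} for header in headers],
            "rows": [
                {"values": [{"string_value": str(row[header])} for header in headers]}
                for row in rows
            ],
        }
    }


item = rows_to_table_item([{"name": "Ann", "ssn": "123-45-6789"}])
```

The resulting `item` can be passed as the `item` field of an inspect or de-identify request.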

### Transformation Types

```python { .api }
class DeidentifyConfig:
    """Configuration for content de-identification."""

    info_type_transformations: InfoTypeTransformations
    record_transformations: RecordTransformations
    transformation_error_handling: TransformationErrorHandling

class PrimitiveTransformation:
    """Basic data transformation operations."""

    replace_config: ReplaceValueConfig
    redact_config: RedactConfig
    character_mask_config: CharacterMaskConfig
    crypto_replace_ffx_fpe_config: CryptoReplaceFfxFpeConfig
    fixed_size_bucketing_config: FixedSizeBucketingConfig
    bucketing_config: BucketingConfig
    replace_dictionary_config: ReplaceDictionaryConfig
    time_part_config: TimePartConfig
    crypto_hash_config: CryptoHashConfig
    date_shift_config: DateShiftConfig
    crypto_deterministic_config: CryptoDeterministicConfig
```
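
Each `PrimitiveTransformation` sets exactly one of these fields. As a hedged sketch of how they combine, the hypothetical helper below builds a dict-form `DeidentifyConfig` that replaces some info types with fixed strings and removes others entirely.

```python
def build_deidentify_config(replacements, redact=()):
    """Build a dict-form DeidentifyConfig.

    replacements maps an info type name to its replacement string;
    redact lists info types whose matches are removed from the output.
    """
    transformations = [
        {
            "info_types": [{"name": name}],
            "primitive_transformation": {
                "replace_config": {"new_value": {"string_value": new_value}}
            },
        }
        for name, new_value in replacements.items()
    ]
    if redact:
        transformations.append(
            {
                "info_types": [{"name": name} for name in redact],
                # An empty RedactConfig selects redaction
                "primitive_transformation": {"redact_config": {}},
            }
        )
    return {"info_type_transformations": {"transformations": transformations}}


config = build_deidentify_config({"EMAIL_ADDRESS": "[EMAIL]"}, redact=["PHONE_NUMBER"])
```

The result is meant to be used as the `deidentify_config` field of a `deidentify_content` request.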

### Enumeration Types

```python { .api }
class Likelihood(proto.Enum):
    """Likelihood levels for detection confidence."""

    LIKELIHOOD_UNSPECIFIED = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5

class FileType(proto.Enum):
    """Supported file types for processing."""

    FILE_TYPE_UNSPECIFIED = 0
    BINARY_FILE = 1
    TEXT_FILE = 2
    IMAGE = 3
    WORD = 5
    PDF = 6
    AVRO = 7
    CSV = 8
    TSV = 9
    POWERPOINT = 11
    EXCEL = 12
```
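
Because the `Likelihood` values increase with confidence, findings can be filtered with ordinary comparisons. The sketch below uses a local `IntEnum` stand-in that mirrors the values above so it runs without the library; it assumes the real `proto.Enum` members compare as integers in the same way. `at_least` is a hypothetical helper.

```python
from enum import IntEnum


class Likelihood(IntEnum):
    """Local stand-in mirroring the library's Likelihood values."""

    LIKELIHOOD_UNSPECIFIED = 0
    VERY_UNLIKELY = 1
    UNLIKELY = 2
    POSSIBLE = 3
    LIKELY = 4
    VERY_LIKELY = 5


def at_least(findings, threshold):
    """Keep (info_type_name, likelihood) pairs at or above the threshold."""
    return [pair for pair in findings if pair[1] >= threshold]


sample = [
    ("EMAIL_ADDRESS", Likelihood.VERY_LIKELY),
    ("PERSON_NAME", Likelihood.POSSIBLE),
    ("PHONE_NUMBER", Likelihood.LIKELY),
]
confident = at_least(sample, Likelihood.LIKELY)
```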