# Connection Types

The BigQuery Connection API supports multiple external data source connection types, each with specific configuration properties. The `Connection` object uses a oneof field structure, meaning exactly one connection type can be set per connection.

## Capabilities

### Core Connection Structure

The base `Connection` object contains common metadata and exactly one connection type configuration.

```python { .api }
class Connection:
    """Configuration parameters for an external data source connection."""
    name: str                # Output only. Resource name of the connection
    friendly_name: str       # User-provided display name for the connection
    description: str         # User-provided description
    creation_time: int       # Output only. Creation timestamp in milliseconds since epoch
    last_modified_time: int  # Output only. Last update timestamp in milliseconds since epoch
    has_credential: bool     # Output only. True if a credential is configured for this connection

    # Oneof connection type (exactly one must be set)
    cloud_sql: CloudSqlProperties
    aws: AwsProperties
    azure: AzureProperties
    cloud_spanner: CloudSpannerProperties
    cloud_resource: CloudResourceProperties
    spark: SparkProperties
    salesforce_data_cloud: SalesforceDataCloudProperties
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import Connection

# Create base connection
connection = Connection()
connection.friendly_name = "My External Database"
connection.description = "Connection to external data source"

# Set exactly one connection type (examples below)
```
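
The oneof constraint can be sketched in plain Python. The helper below is hypothetical (it is not part of the client library, which enforces the oneof at the proto level); it only illustrates the invariant that exactly one of the connection-type fields may be set.

```python
# Hypothetical sketch of the oneof invariant; not part of the
# google.cloud.bigquery_connection library.
CONNECTION_TYPE_FIELDS = (
    "cloud_sql", "aws", "azure", "cloud_spanner",
    "cloud_resource", "spark", "salesforce_data_cloud",
)

def active_connection_type(connection_fields: dict) -> str:
    """Return the single connection-type field that is set, or raise."""
    set_fields = [
        f for f in CONNECTION_TYPE_FIELDS
        if connection_fields.get(f) is not None
    ]
    if len(set_fields) != 1:
        raise ValueError(
            f"exactly one connection type must be set, got {set_fields}"
        )
    return set_fields[0]

# A connection with only Cloud SQL configured
print(active_connection_type({"cloud_sql": {"database": "analytics"}}))  # cloud_sql
```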

### Cloud SQL Connections

Connects to Google Cloud SQL instances (PostgreSQL or MySQL) for querying relational database data.

```python { .api }
class CloudSqlProperties:
    """Properties for a Cloud SQL connection."""
    instance_id: str                # Cloud SQL instance ID in format 'project:location:instance'
    database: str                   # Database name
    type_: DatabaseType             # Type of Cloud SQL database
    credential: CloudSqlCredential  # Input only. Database credentials
    service_account_id: str         # Output only. Service account ID for the connection

    class DatabaseType:
        """Database type enumeration."""
        DATABASE_TYPE_UNSPECIFIED = 0
        POSTGRES = 1
        MYSQL = 2

class CloudSqlCredential:
    """Credential for Cloud SQL connections."""
    username: str  # Database username
    password: str  # Database password
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import (
    Connection,
    CloudSqlProperties,
    CloudSqlCredential,
)

connection = Connection()
connection.friendly_name = "PostgreSQL Analytics DB"
connection.description = "Production analytics database"

# Configure the Cloud SQL connection
connection.cloud_sql = CloudSqlProperties()
connection.cloud_sql.instance_id = "my-project:us-central1:analytics-db"
connection.cloud_sql.database = "analytics"
connection.cloud_sql.type_ = CloudSqlProperties.DatabaseType.POSTGRES
connection.cloud_sql.credential = CloudSqlCredential(
    username="bigquery_service",
    password="secure_password_123",  # never hard-code real credentials
)

# After creation, service_account_id will be populated:
# print(f"Service Account: {connection.cloud_sql.service_account_id}")
```
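
The `instance_id` must follow the `project:location:instance` format. A minimal client-side sanity check might look like the following; `parse_instance_id` is a hypothetical helper for illustration (the API performs its own validation), not part of the library.

```python
def parse_instance_id(instance_id: str) -> tuple:
    """Split a Cloud SQL instance ID of the form 'project:location:instance'.

    Hypothetical validation helper; the Connection API remains the
    source of truth for what it accepts.
    """
    parts = instance_id.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected 'project:location:instance', got {instance_id!r}"
        )
    return tuple(parts)

project, location, instance = parse_instance_id(
    "my-project:us-central1:analytics-db"
)
# project == "my-project", location == "us-central1", instance == "analytics-db"
```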

### AWS Connections

Connects to Amazon Web Services data sources using IAM role-based authentication.

```python { .api }
class AwsProperties:
    """Properties for AWS connections."""
    # Oneof authentication method (exactly one must be set)
    cross_account_role: AwsCrossAccountRole  # Deprecated. Google-owned AWS IAM user access key
    access_role: AwsAccessRole               # Recommended. Google-owned service account authentication

class AwsCrossAccountRole:
    """AWS cross-account role authentication (deprecated)."""
    iam_role_id: str  # User's AWS IAM Role that trusts the Google-owned AWS IAM user
    iam_user_id: str  # Output only. Google-owned AWS IAM User for the connection
    external_id: str  # Output only. Google-generated ID representing the connection's identity in AWS

class AwsAccessRole:
    """AWS access role authentication (recommended)."""
    iam_role_id: str  # User's AWS IAM Role that trusts the Google-owned AWS IAM user
    identity: str     # Unique Google-owned and generated identity for the connection
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import (
    Connection,
    AwsProperties,
    AwsAccessRole,
)

connection = Connection()
connection.friendly_name = "AWS S3 Data Lake"
connection.description = "Connection to S3 data lake for analytics"

# Configure the AWS connection (using the recommended access role method)
connection.aws = AwsProperties()
connection.aws.access_role = AwsAccessRole()
connection.aws.access_role.iam_role_id = "arn:aws:iam::123456789012:role/BigQueryAccessRole"

# After creation, identity will be populated:
# print(f"Google Identity: {connection.aws.access_role.identity}")
```
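
The `iam_role_id` is a full IAM role ARN of the form `arn:aws:iam::<account-id>:role/<role-name>`. A rough pre-flight format check, sketched as a hypothetical helper (not part of the library; AWS and the Connection API remain authoritative for what is valid):

```python
import re

# Hypothetical pre-flight check for the iam_role_id format. The character
# class approximates AWS's allowed role-name characters.
ROLE_ARN_RE = re.compile(r"arn:aws:iam::\d{12}:role/[\w+=,.@/-]+")

def looks_like_role_arn(iam_role_id: str) -> bool:
    """Return True if the string resembles an AWS IAM role ARN."""
    return ROLE_ARN_RE.fullmatch(iam_role_id) is not None

print(looks_like_role_arn("arn:aws:iam::123456789012:role/BigQueryAccessRole"))  # True
print(looks_like_role_arn("BigQueryAccessRole"))                                 # False
```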

### Azure Connections

Connects to Microsoft Azure data sources using Azure Active Directory authentication.

```python { .api }
class AzureProperties:
    """Properties for Azure connections."""
    application: str                      # Output only. Name of the Azure Active Directory Application
    client_id: str                        # Output only. Client ID of the Azure AD Application
    object_id: str                        # Output only. Object ID of the Azure AD Application
    customer_tenant_id: str               # ID of the customer's directory that hosts the data
    redirect_uri: str                     # URL the user is redirected to after granting consent during setup
    federated_application_client_id: str  # Client ID of the user's Azure AD Application for a federated connection
    identity: str                         # Output only. Unique Google identity for the connection
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import Connection, AzureProperties

connection = Connection()
connection.friendly_name = "Azure Data Lake Gen2"
connection.description = "Connection to Azure Data Lake for analytics"

# Configure the Azure connection
connection.azure = AzureProperties()
connection.azure.customer_tenant_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
connection.azure.redirect_uri = "https://console.cloud.google.com/bigquery"
connection.azure.federated_application_client_id = "12345678-90ab-cdef-1234-567890abcdef"

# After creation, output-only fields will be populated:
# print(f"Application: {connection.azure.application}")
# print(f"Client ID: {connection.azure.client_id}")
# print(f"Google Identity: {connection.azure.identity}")
```

### Cloud Spanner Connections

Connects to Google Cloud Spanner databases for analytical queries.

```python { .api }
class CloudSpannerProperties:
    """Properties for Cloud Spanner connections."""
    database: str                   # Cloud Spanner database in format 'projects/{project}/instances/{instance}/databases/{database}'
    use_parallelism: bool           # If parallelism should be used when reading from the Spanner database
    max_parallelism: int            # Allows setting max parallelism per query when executing on Spanner compute resources
    use_serverless_analytics: bool  # If the serverless analytics service should be used to read data from Spanner
    use_data_boost: bool            # If the request should be executed via Spanner independent compute resources
    database_role: str              # Optional. Cloud Spanner database role for fine-grained access control
```
191
192
**Usage Example:**
193
194
```python
195
from google.cloud.bigquery_connection import Connection, CloudSpannerProperties
196
197
connection = Connection()
198
connection.friendly_name = "Spanner OLTP Database"
199
connection.description = "Connection to Spanner for analytical queries"
200
201
# Configure Cloud Spanner connection
202
connection.cloud_spanner = CloudSpannerProperties()
203
connection.cloud_spanner.database = "projects/my-project/instances/my-instance/databases/my-database"
204
connection.cloud_spanner.use_parallelism = True
205
connection.cloud_spanner.max_parallelism = 4
206
connection.cloud_spanner.use_serverless_analytics = True
207
connection.cloud_spanner.use_data_boost = False
208
connection.cloud_spanner.database_role = "analytics_reader"
209
```
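
The `database` field expects the full `projects/{project}/instances/{instance}/databases/{database}` resource name. A small formatter, sketched as a hypothetical convenience helper (the library simply accepts the plain string), can keep these names consistent:

```python
def spanner_database_name(project: str, instance: str, database: str) -> str:
    """Build a Cloud Spanner database resource name.

    Hypothetical helper for illustration; not part of the
    google.cloud.bigquery_connection library.
    """
    return f"projects/{project}/instances/{instance}/databases/{database}"

name = spanner_database_name("my-project", "my-instance", "my-database")
# name == "projects/my-project/instances/my-instance/databases/my-database"
```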

### Cloud Resource Connections

Connects to other Google Cloud resources with automatic service account management.

```python { .api }
class CloudResourceProperties:
    """Properties for Cloud Resource connections."""
    service_account_id: str  # Output only. The account ID of the service created for the connection
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import Connection, CloudResourceProperties

connection = Connection()
connection.friendly_name = "Cloud Storage Data"
connection.description = "Connection to Google Cloud Storage buckets"

# Configure the Cloud Resource connection (no input fields are required)
connection.cloud_resource = CloudResourceProperties()

# After creation, service_account_id will be populated:
# print(f"Service Account: {connection.cloud_resource.service_account_id}")
```

### Spark Connections

Runs Apache Spark workloads (for example, BigQuery stored procedures for Apache Spark), with optional Dataproc Metastore and Spark History Server integration.

```python { .api }
class SparkProperties:
    """Properties for Spark connections."""
    service_account_id: str                                # Output only. The account ID of the service created for the connection
    metastore_service_config: MetastoreServiceConfig       # Optional. Dataproc Metastore Service configuration
    spark_history_server_config: SparkHistoryServerConfig  # Optional. Spark History Server configuration

class MetastoreServiceConfig:
    """Configuration for Dataproc Metastore Service."""
    metastore_service: str  # Optional. Resource name of an existing Dataproc Metastore service

class SparkHistoryServerConfig:
    """Configuration for Spark History Server."""
    dataproc_cluster: str  # Optional. Resource name of an existing Dataproc Cluster to act as a Spark History Server
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import (
    Connection,
    SparkProperties,
    MetastoreServiceConfig,
    SparkHistoryServerConfig,
)

connection = Connection()
connection.friendly_name = "Spark Analytics Cluster"
connection.description = "Connection for Spark-based big data processing"

# Configure the Spark connection
connection.spark = SparkProperties()

# Optional: configure the metastore service
connection.spark.metastore_service_config = MetastoreServiceConfig()
connection.spark.metastore_service_config.metastore_service = (
    "projects/my-project/locations/us-central1/services/my-metastore"
)

# Optional: configure the history server
connection.spark.spark_history_server_config = SparkHistoryServerConfig()
connection.spark.spark_history_server_config.dataproc_cluster = (
    "projects/my-project/regions/us-central1/clusters/spark-history-cluster"
)

# After creation, service_account_id will be populated:
# print(f"Service Account: {connection.spark.service_account_id}")
```

### Salesforce Data Cloud Connections

Connects to Salesforce Data Cloud for CRM and customer data analytics.

```python { .api }
class SalesforceDataCloudProperties:
    """Properties for Salesforce Data Cloud connections."""
    instance_uri: str  # The URL to the user's Salesforce Data Cloud instance
    identity: str      # Output only. Unique Google service account identity for the connection
    tenant_id: str     # The ID of the user's Salesforce tenant
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import Connection, SalesforceDataCloudProperties

connection = Connection()
connection.friendly_name = "Salesforce CRM Data"
connection.description = "Connection to Salesforce Data Cloud for customer analytics"

# Configure the Salesforce Data Cloud connection
connection.salesforce_data_cloud = SalesforceDataCloudProperties()
connection.salesforce_data_cloud.instance_uri = "https://mycompany.my.salesforce-datacloud.com"
connection.salesforce_data_cloud.tenant_id = "00D123456789012345"

# After creation, identity will be populated:
# print(f"Google Identity: {connection.salesforce_data_cloud.identity}")
```

## Connection Type Selection

When creating a connection, you must set exactly one connection type. The choice depends on your external data source:

```python
# Cloud SQL for relational databases (PostgreSQL, MySQL)
connection.cloud_sql = CloudSqlProperties()

# AWS for Amazon S3, Redshift, RDS, etc.
connection.aws = AwsProperties()

# Azure for Azure Data Lake, SQL Database, etc.
connection.azure = AzureProperties()

# Cloud Spanner for Google's globally distributed database
connection.cloud_spanner = CloudSpannerProperties()

# Cloud Resource for other Google Cloud services
connection.cloud_resource = CloudResourceProperties()

# Spark for distributed data processing
connection.spark = SparkProperties()

# Salesforce Data Cloud for CRM data
connection.salesforce_data_cloud = SalesforceDataCloudProperties()
```

## Common Patterns

### Output-Only Fields

Many connection types have output-only fields that are populated by the service after connection creation:

```python
# These fields are set by the service and cannot be modified
connection.name                # Resource name assigned by the service
connection.creation_time       # Timestamp when the connection was created
connection.last_modified_time  # Timestamp when the connection was last updated
connection.has_credential      # Whether credential information is configured

# Connection-type-specific output fields
connection.cloud_sql.service_account_id       # For Cloud SQL
connection.aws.access_role.identity           # For AWS access role
connection.azure.identity                     # For Azure
connection.cloud_resource.service_account_id  # For Cloud Resource
```
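
Since output-only fields are controlled by the service, it can be useful to drop them before assembling an update payload. The dictionary-based sketch below is hypothetical (field names are taken from the API reference above; real clients typically scope writes with an update mask instead):

```python
# Hypothetical sketch: strip output-only Connection fields before an update.
# Field names mirror the API reference above.
OUTPUT_ONLY_FIELDS = {"name", "creation_time", "last_modified_time", "has_credential"}

def writable_fields(connection_fields: dict) -> dict:
    """Return only the fields a client is allowed to set."""
    return {k: v for k, v in connection_fields.items() if k not in OUTPUT_ONLY_FIELDS}

payload = writable_fields({
    "name": "projects/p/locations/us/connections/c",  # output only, dropped
    "friendly_name": "My External Database",
    "creation_time": 1700000000000,                   # output only, dropped
})
# payload == {"friendly_name": "My External Database"}
```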

### Credential Management

Credential information is handled differently by connection type:

- **Cloud SQL**: Direct username/password, stored securely
- **AWS**: IAM role trust relationship with a Google-managed identity
- **Azure**: OAuth-based federated authentication
- **Cloud Spanner**: Uses Google Cloud IAM (no explicit credentials)
- **Cloud Resource**: Automatic service account creation
- **Spark**: Automatic service account creation
- **Salesforce Data Cloud**: OAuth-based authentication with tenant-specific configuration

### Security Considerations

- Credentials are encrypted and stored securely by Google Cloud
- Output-only identity fields provide secure authentication to external services
- IAM policies control access to connection resources
- Service accounts created for connections follow least-privilege principles