
# Connection Types

BigQuery Connection API supports multiple external data source connection types, each with specific configuration properties. The `Connection` object uses a OneOf field structure, meaning only one connection type can be active per connection.

## Capabilities

### Core Connection Structure

The base `Connection` object contains common metadata and exactly one connection type configuration.

```python { .api }
class Connection:
    """Configuration parameters for an external data source connection."""
    name: str  # Output only. Resource name of the connection
    friendly_name: str  # User-provided display name for the connection
    description: str  # User-provided description
    creation_time: int  # Output only. Creation timestamp in milliseconds since epoch
    last_modified_time: int  # Output only. Last update timestamp in milliseconds since epoch
    has_credential: bool  # Output only. True if a credential is configured for this connection

    # OneOf connection type (exactly one must be set)
    cloud_sql: CloudSqlProperties
    aws: AwsProperties
    azure: AzureProperties
    cloud_spanner: CloudSpannerProperties
    cloud_resource: CloudResourceProperties
    spark: SparkProperties
    salesforce_data_cloud: SalesforceDataCloudProperties
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import Connection

# Create base connection
connection = Connection()
connection.friendly_name = "My External Database"
connection.description = "Connection to external data source"

# Set exactly one connection type (examples below)
```
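The OneOf structure means that assigning one connection type implicitly discards any previously assigned type. A minimal pure-Python sketch of that semantics (a hypothetical stand-in for illustration, not the real proto-plus `Connection` class; only the field names mirror the structure above):

```python
# Hypothetical stand-in that mimics OneOf semantics: assigning a new
# connection type discards any previously set type.
class OneOfConnectionSketch:
    ONEOF_FIELDS = (
        "cloud_sql", "aws", "azure", "cloud_spanner",
        "cloud_resource", "spark", "salesforce_data_cloud",
    )

    def __init__(self):
        self._active_field = None  # name of the currently set oneof field
        self._active_value = None

    def set_type(self, field, value):
        if field not in self.ONEOF_FIELDS:
            raise ValueError(f"unknown connection type: {field}")
        # Setting a new type replaces the previous one (oneof behavior).
        self._active_field = field
        self._active_value = value

    def which_oneof(self):
        return self._active_field


conn = OneOfConnectionSketch()
conn.set_type("cloud_sql", {"instance_id": "p:l:i"})
conn.set_type("aws", {"iam_role_id": "arn:aws:iam::123456789012:role/r"})
print(conn.which_oneof())  # prints "aws": the Cloud SQL choice was discarded
```

In the real library this replacement happens inside the protobuf runtime; the sketch only illustrates the observable behavior.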

### Cloud SQL Connections

Connects to Google Cloud SQL instances (PostgreSQL or MySQL) for querying relational database data.

```python { .api }
class CloudSqlProperties:
    """Properties for a Cloud SQL connection."""
    instance_id: str  # Cloud SQL instance ID in the format 'project:location:instance'
    database: str  # Database name
    type_: DatabaseType  # Type of Cloud SQL database
    credential: CloudSqlCredential  # Input only. Database credentials
    service_account_id: str  # Output only. Service account ID for the connection

    class DatabaseType:
        """Database type enumeration."""
        DATABASE_TYPE_UNSPECIFIED = 0
        POSTGRES = 1
        MYSQL = 2

class CloudSqlCredential:
    """Credential for Cloud SQL connections."""
    username: str  # Database username
    password: str  # Database password
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import (
    Connection,
    CloudSqlProperties,
    CloudSqlCredential,
)

connection = Connection()
connection.friendly_name = "PostgreSQL Analytics DB"
connection.description = "Production analytics database"

# Configure Cloud SQL connection
connection.cloud_sql = CloudSqlProperties()
connection.cloud_sql.instance_id = "my-project:us-central1:analytics-db"
connection.cloud_sql.database = "analytics"
connection.cloud_sql.type_ = CloudSqlProperties.DatabaseType.POSTGRES
connection.cloud_sql.credential = CloudSqlCredential(
    username="bigquery_service",
    password="secure_password_123",
)

# After creation, service_account_id will be populated:
# print(f"Service Account: {connection.cloud_sql.service_account_id}")
```
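The `instance_id` must follow the `'project:location:instance'` format documented above. As an illustration, a small hypothetical helper (not part of the google-cloud-bigquery-connection library) that validates and splits such an ID:

```python
# Hypothetical helper (not part of the library): split a Cloud SQL
# instance ID of the form 'project:location:instance' into its parts.
def parse_instance_id(instance_id: str) -> dict:
    parts = instance_id.split(":")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected 'project:location:instance', got {instance_id!r}"
        )
    project, location, instance = parts
    return {"project": project, "location": location, "instance": instance}


print(parse_instance_id("my-project:us-central1:analytics-db"))
# {'project': 'my-project', 'location': 'us-central1', 'instance': 'analytics-db'}
```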

### AWS Connections

Connects to Amazon Web Services data sources using IAM role-based authentication.

```python { .api }
class AwsProperties:
    """Properties for AWS connections."""
    # OneOf authentication method (exactly one must be set)
    cross_account_role: AwsCrossAccountRole  # Deprecated. Google-owned AWS IAM user access key
    access_role: AwsAccessRole  # Recommended. Google-owned service account authentication

class AwsCrossAccountRole:
    """AWS cross-account role authentication (deprecated)."""
    iam_role_id: str  # User's AWS IAM Role that trusts the Google-owned AWS IAM user
    iam_user_id: str  # Output only. Google-owned AWS IAM User for the connection
    external_id: str  # Output only. Google-generated ID representing the connection's identity in AWS

class AwsAccessRole:
    """AWS access role authentication (recommended)."""
    iam_role_id: str  # User's AWS IAM Role that trusts the Google-owned AWS IAM user
    identity: str  # Unique Google-owned and generated identity for the connection
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import (
    Connection,
    AwsProperties,
    AwsAccessRole,
)

connection = Connection()
connection.friendly_name = "AWS S3 Data Lake"
connection.description = "Connection to S3 data lake for analytics"

# Configure AWS connection (using the recommended access role method)
connection.aws = AwsProperties()
connection.aws.access_role = AwsAccessRole()
connection.aws.access_role.iam_role_id = "arn:aws:iam::123456789012:role/BigQueryAccessRole"

# After creation, identity will be populated:
# print(f"Google Identity: {connection.aws.access_role.identity}")
```
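The `iam_role_id` is an AWS IAM role ARN. As a sanity check during configuration, a hypothetical helper (not part of the library; the exact character set allowed in role names is an assumption here) can verify the expected `arn:aws:iam::<12-digit account>:role/<name>` shape:

```python
import re

# Hypothetical sanity check (not part of the library) for the expected
# shape of an AWS IAM role ARN: arn:aws:iam::<12-digit account>:role/<name>.
# The role-name character class is an assumption for illustration.
ROLE_ARN_RE = re.compile(r"^arn:aws:iam::\d{12}:role/[\w+=,.@/-]+$")

def looks_like_role_arn(arn: str) -> bool:
    return ROLE_ARN_RE.fullmatch(arn) is not None


print(looks_like_role_arn("arn:aws:iam::123456789012:role/BigQueryAccessRole"))  # True
print(looks_like_role_arn("arn:aws:iam::123:role/Short"))  # False (account ID too short)
```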

### Azure Connections

Connects to Microsoft Azure data sources using Azure Active Directory authentication.

```python { .api }
class AzureProperties:
    """Properties for Azure connections."""
    application: str  # Output only. Name of the Azure Active Directory Application
    client_id: str  # Output only. Client ID of the Azure AD Application
    object_id: str  # Output only. Object ID of the Azure AD Application
    customer_tenant_id: str  # ID of the customer's directory that hosts the data
    redirect_uri: str  # URL the user will be redirected to after granting consent during connection setup
    federated_application_client_id: str  # Client ID of the user's Azure AD Application for a federated connection
    identity: str  # Output only. Unique Google identity for the connection
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import Connection, AzureProperties

connection = Connection()
connection.friendly_name = "Azure Data Lake Gen2"
connection.description = "Connection to Azure Data Lake for analytics"

# Configure Azure connection
connection.azure = AzureProperties()
connection.azure.customer_tenant_id = "a1b2c3d4-e5f6-7890-abcd-ef1234567890"
connection.azure.redirect_uri = "https://console.cloud.google.com/bigquery"
connection.azure.federated_application_client_id = "12345678-90ab-cdef-1234-567890abcdef"

# After creation, output-only fields will be populated:
# print(f"Application: {connection.azure.application}")
# print(f"Client ID: {connection.azure.client_id}")
# print(f"Google Identity: {connection.azure.identity}")
```

### Cloud Spanner Connections

Connects to Google Cloud Spanner databases for analytical queries.

```python { .api }
class CloudSpannerProperties:
    """Properties for Cloud Spanner connections."""
    database: str  # Cloud Spanner database resource name in the format 'projects/{project}/instances/{instance}/databases/{database}'
    use_parallelism: bool  # Whether parallelism should be used when reading from the Spanner database
    max_parallelism: int  # Maximum parallelism per query when executing on Spanner compute resources
    use_serverless_analytics: bool  # Whether the serverless analytics service should be used to read data from Spanner
    use_data_boost: bool  # Whether the request should be executed via Spanner independent compute resources
    database_role: str  # Optional. Cloud Spanner database role for fine-grained access control
```

191

192

**Usage Example:**

193

194

```python

195

from google.cloud.bigquery_connection import Connection, CloudSpannerProperties

196

197

connection = Connection()

198

connection.friendly_name = "Spanner OLTP Database"

199

connection.description = "Connection to Spanner for analytical queries"

200

201

# Configure Cloud Spanner connection

202

connection.cloud_spanner = CloudSpannerProperties()

203

connection.cloud_spanner.database = "projects/my-project/instances/my-instance/databases/my-database"

204

connection.cloud_spanner.use_parallelism = True

205

connection.cloud_spanner.max_parallelism = 4

206

connection.cloud_spanner.use_serverless_analytics = True

207

connection.cloud_spanner.use_data_boost = False

208

connection.cloud_spanner.database_role = "analytics_reader"

209

```
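The `database` field follows the `projects/{project}/instances/{instance}/databases/{database}` pattern shown above. A hypothetical convenience helper for assembling it from components (the library ships its own resource path helpers; this standalone sketch is only for illustration):

```python
# Hypothetical helper (not part of the library): assemble a Cloud Spanner
# database resource name from its components, rejecting empty segments
# and embedded slashes that would corrupt the path.
def spanner_database_path(project: str, instance: str, database: str) -> str:
    for name, value in (("project", project),
                        ("instance", instance),
                        ("database", database)):
        if not value or "/" in value:
            raise ValueError(f"invalid {name}: {value!r}")
    return f"projects/{project}/instances/{instance}/databases/{database}"


print(spanner_database_path("my-project", "my-instance", "my-database"))
# projects/my-project/instances/my-instance/databases/my-database
```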

### Cloud Resource Connections

Connects to other Google Cloud resources with automatic service account management.

```python { .api }
class CloudResourceProperties:
    """Properties for Cloud Resource connections."""
    service_account_id: str  # Output only. The account ID of the service created for the connection
```

220

221

**Usage Example:**

222

223

```python

224

from google.cloud.bigquery_connection import Connection, CloudResourceProperties

225

226

connection = Connection()

227

connection.friendly_name = "Cloud Storage Data"

228

connection.description = "Connection to Google Cloud Storage buckets"

229

230

# Configure Cloud Resource connection

231

connection.cloud_resource = CloudResourceProperties()

232

233

# After creation, service_account_id will be populated:

234

# print(f"Service Account: {connection.cloud_resource.service_account_id}")

235

```

### Spark Connections

Connects to Apache Spark clusters for distributed data processing.

```python { .api }
class SparkProperties:
    """Properties for Spark connections."""
    service_account_id: str  # Output only. The account ID of the service created for the connection
    metastore_service_config: MetastoreServiceConfig  # Optional. Dataproc Metastore Service configuration
    spark_history_server_config: SparkHistoryServerConfig  # Optional. Spark History Server configuration

class MetastoreServiceConfig:
    """Configuration for Dataproc Metastore Service."""
    metastore_service: str  # Optional. Resource name of an existing Dataproc Metastore service

class SparkHistoryServerConfig:
    """Configuration for Spark History Server."""
    dataproc_cluster: str  # Optional. Resource name of an existing Dataproc Cluster to act as a Spark History Server
```

**Usage Example:**

```python
from google.cloud.bigquery_connection import (
    Connection,
    SparkProperties,
    MetastoreServiceConfig,
    SparkHistoryServerConfig,
)

connection = Connection()
connection.friendly_name = "Spark Analytics Cluster"
connection.description = "Connection to Spark cluster for big data processing"

# Configure Spark connection
connection.spark = SparkProperties()

# Optional: configure the metastore service
connection.spark.metastore_service_config = MetastoreServiceConfig()
connection.spark.metastore_service_config.metastore_service = (
    "projects/my-project/locations/us-central1/services/my-metastore"
)

# Optional: configure the history server
connection.spark.spark_history_server_config = SparkHistoryServerConfig()
connection.spark.spark_history_server_config.dataproc_cluster = (
    "projects/my-project/regions/us-central1/clusters/spark-history-cluster"
)

# After creation, service_account_id will be populated:
# print(f"Service Account: {connection.spark.service_account_id}")
```

### Salesforce Data Cloud Connections

Connects to Salesforce Data Cloud for CRM and customer data analytics.

```python { .api }
class SalesforceDataCloudProperties:
    """Properties for Salesforce Data Cloud connections."""
    instance_uri: str  # The URL of the user's Salesforce Data Cloud instance
    identity: str  # Output only. Unique Google service account identity for the connection
    tenant_id: str  # The ID of the user's Salesforce tenant
```

301

302

**Usage Example:**

303

304

```python

305

from google.cloud.bigquery_connection import Connection, SalesforceDataCloudProperties

306

307

connection = Connection()

308

connection.friendly_name = "Salesforce CRM Data"

309

connection.description = "Connection to Salesforce Data Cloud for customer analytics"

310

311

# Configure Salesforce Data Cloud connection

312

connection.salesforce_data_cloud = SalesforceDataCloudProperties()

313

connection.salesforce_data_cloud.instance_uri = "https://mycompany.my.salesforce-datacloud.com"

314

connection.salesforce_data_cloud.tenant_id = "00D123456789012345"

315

316

# After creation, identity will be populated:

317

# print(f"Google Identity: {connection.salesforce_data_cloud.identity}")

318

```

## Connection Type Selection

When creating a connection, you must choose exactly one connection type. The choice depends on your external data source:

```python
# Cloud SQL for relational databases (PostgreSQL, MySQL)
connection.cloud_sql = CloudSqlProperties()

# AWS for Amazon S3, Redshift, RDS, etc.
connection.aws = AwsProperties()

# Azure for Azure Data Lake, SQL Database, etc.
connection.azure = AzureProperties()

# Cloud Spanner for Google's globally distributed database
connection.cloud_spanner = CloudSpannerProperties()

# Cloud Resource for other Google Cloud services
connection.cloud_resource = CloudResourceProperties()

# Spark for distributed data processing
connection.spark = SparkProperties()

# Salesforce Data Cloud for CRM data
connection.salesforce_data_cloud = SalesforceDataCloudProperties()
```

## Common Patterns

### Output-Only Fields

Many connection types have output-only fields that are populated by the service after connection creation:

```python
# These fields are set by the service and cannot be modified
connection.name                # Resource name assigned by the service
connection.creation_time       # Timestamp when the connection was created
connection.last_modified_time  # Timestamp when the connection was last updated
connection.has_credential      # Whether credential information is configured

# Connection-type-specific output fields
connection.cloud_sql.service_account_id       # For Cloud SQL
connection.aws.access_role.identity           # For AWS access role
connection.azure.identity                     # For Azure
connection.cloud_resource.service_account_id  # For Cloud Resource
```

### Credential Management

Credential information is handled differently by connection type:

- **Cloud SQL**: Direct username/password, stored securely
- **AWS**: IAM role trust relationship with a Google-managed identity
- **Azure**: OAuth-based federated authentication
- **Cloud Spanner**: Uses Google Cloud IAM (no explicit credentials)
- **Cloud Resource**: Automatic service account creation
- **Spark**: Automatic service account creation
- **Salesforce**: OAuth-based authentication with tenant-specific configuration

### Security Considerations

- Credentials are encrypted and stored securely by Google Cloud
- Output-only identity fields provide secure authentication to external services
- IAM policies control access to connection resources
- Service accounts created for connections follow least-privilege principles