# Client Operations

Core client functionality for authentication, project management, and resource operations. The `Client` class is the main entry point for all BigQuery interactions, providing authenticated access to Google Cloud BigQuery services.

## Capabilities

### Client Initialization

Creates the primary interface to BigQuery with authentication and configuration options.

```python { .api }
class Client:
    def __init__(
        self,
        project: str = None,
        credentials: google.auth.credentials.Credentials = None,
        _http: requests.Session = None,
        location: str = None,
        default_query_job_config: QueryJobConfig = None,
        default_load_job_config: LoadJobConfig = None,
        client_info: google.api_core.client_info.ClientInfo = None,
        client_options: google.api_core.client_options.ClientOptions = None,
    ):
        """
        Initialize BigQuery client.

        Args:
            project: Google Cloud project ID. If None, inferred from the environment.
            credentials: OAuth2 credentials. If None, uses application default credentials.
            _http: HTTP session used for API requests (intended for testing only).
            location: Default location for BigQuery operations.
            default_query_job_config: Default configuration for query jobs.
            default_load_job_config: Default configuration for load jobs.
            client_info: Client library information.
            client_options: Client configuration options.
        """

    @property
    def project(self) -> str:
        """Project ID associated with this client."""

    @property
    def location(self) -> str:
        """Default location for BigQuery operations."""
```
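
For example, a client can carry a default location and a default query job configuration that later queries inherit unless overridden per call. A minimal sketch; the project ID and location are placeholders:

```python
from google.cloud import bigquery

# Default configuration inherited by every query this client issues;
# an individual call can still override it via its job_config argument.
default_config = bigquery.QueryJobConfig(use_query_cache=True)

client = bigquery.Client(
    project="my-project-id",  # placeholder project ID
    location="US",            # default location for jobs and datasets
    default_query_job_config=default_config,
)

print(client.project, client.location)
```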

### Query Execution

Execute SQL queries and manage query jobs with comprehensive configuration options.

```python { .api }
def query(
    self,
    query: str,
    job_config: QueryJobConfig = None,
    job_id: str = None,
    job_retry: google.api_core.retry.Retry = DEFAULT_JOB_RETRY,
    timeout: float = None,
    location: str = None,
    project: str = None,
) -> QueryJob:
    """
    Execute a SQL query and return a job.

    Args:
        query: SQL query string to execute.
        job_config: Configuration for the query job.
        job_id: Unique identifier for the job.
        job_retry: How to retry failed jobs, distinct from any retry of the
            API request that creates them.
        timeout: Timeout in seconds for job creation.
        location: Location where the job should run.
        project: Project ID for the job.

    Returns:
        QueryJob: Job instance for the query operation.
    """

def query_and_wait(
    self,
    query: str,
    **kwargs
) -> google.cloud.bigquery.table.RowIterator:
    """
    Execute a query, wait for completion, and return the results directly.

    Args:
        query: SQL query string to execute.
        **kwargs: Additional arguments passed to query().

    Returns:
        RowIterator: Query results.
    """
```
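
The two entry points differ mainly in what they return: query() hands back a QueryJob immediately, while query_and_wait() blocks until the job finishes and returns the rows. A brief sketch against a public dataset (query_and_wait() requires a recent library release):

```python
from google.cloud import bigquery

client = bigquery.Client()

sql = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 5
"""

# query() returns a QueryJob right away; result() blocks for completion.
job = client.query(sql)
for row in job.result():
    print(row["name"], row["total"])

# query_and_wait() blocks and returns a RowIterator directly.
for row in client.query_and_wait(sql):
    print(row["name"], row["total"])
```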

### Dataset Operations

Manage BigQuery datasets including creation, deletion, and listing operations.

```python { .api }
def create_dataset(
    self,
    dataset: Union[Dataset, DatasetReference, str],
    exists_ok: bool = False,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Dataset:
    """
    Create a new dataset.

    Args:
        dataset: Dataset to create.
        exists_ok: If True, do not raise an error if the dataset already exists.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Dataset: The created dataset.
    """

def delete_dataset(
    self,
    dataset: Union[Dataset, DatasetReference, str],
    delete_contents: bool = False,
    not_found_ok: bool = False,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> None:
    """
    Delete a dataset.

    Args:
        dataset: Dataset to delete.
        delete_contents: If True, delete all tables in the dataset.
        not_found_ok: If True, do not raise an error if the dataset is not found.
        retry: Retry configuration.
        timeout: Timeout in seconds.
    """

def get_dataset(
    self,
    dataset_ref: Union[DatasetReference, str],
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Dataset:
    """
    Fetch dataset metadata.

    Args:
        dataset_ref: Reference to the dataset to fetch.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Dataset: The requested dataset.
    """

def list_datasets(
    self,
    project: str = None,
    include_all: bool = False,
    filter: str = None,
    max_results: int = None,
    page_token: str = None,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> google.api_core.page_iterator.Iterator[DatasetListItem]:
    """
    List datasets in a project.

    Args:
        project: Project ID to list datasets from.
        include_all: Include hidden datasets.
        filter: Label filter expression.
        max_results: Maximum number of datasets to return.
        page_token: Token for pagination.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Iterator[DatasetListItem]: Iterator of lightweight dataset list items.
    """

def update_dataset(
    self,
    dataset: Dataset,
    fields: List[str],
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Dataset:
    """
    Update dataset metadata.

    Args:
        dataset: Dataset with updated metadata.
        fields: Names of the fields to update.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Dataset: The updated dataset.
    """
```
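
Metadata updates follow a read-modify-write pattern: fetch the resource, change it locally, then pass the names of the changed fields to update_dataset(). A short sketch; the dataset ID and label filter are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()

# Fetch, modify locally, then send back only the listed fields.
dataset = client.get_dataset("my_dataset")  # placeholder dataset ID
dataset.description = "Curated reporting tables"
dataset = client.update_dataset(dataset, ["description"])

# The filter argument uses the REST API's label filter syntax.
for item in client.list_datasets(filter="labels.env:prod"):
    print(item.dataset_id)
```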

### Table Operations

Manage BigQuery tables including creation, deletion, and metadata operations.

```python { .api }
def create_table(
    self,
    table: Union[Table, TableReference, str],
    exists_ok: bool = False,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Table:
    """
    Create a new table.

    Args:
        table: Table to create.
        exists_ok: If True, do not raise an error if the table already exists.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Table: The created table.
    """

def delete_table(
    self,
    table: Union[Table, TableReference, str],
    not_found_ok: bool = False,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> None:
    """
    Delete a table.

    Args:
        table: Table to delete.
        not_found_ok: If True, do not raise an error if the table is not found.
        retry: Retry configuration.
        timeout: Timeout in seconds.
    """

def get_table(
    self,
    table: Union[Table, TableReference, str],
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Table:
    """
    Fetch table metadata.

    Args:
        table: Reference to the table to fetch.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Table: The requested table.
    """

def list_tables(
    self,
    dataset: Union[Dataset, DatasetReference, str],
    max_results: int = None,
    page_token: str = None,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> google.api_core.page_iterator.Iterator[TableListItem]:
    """
    List tables in a dataset.

    Args:
        dataset: Dataset to list tables from.
        max_results: Maximum number of tables to return.
        page_token: Token for pagination.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Iterator[TableListItem]: Iterator of lightweight table list items.
    """

def update_table(
    self,
    table: Table,
    fields: List[str],
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Table:
    """
    Update table metadata.

    Args:
        table: Table with updated metadata.
        fields: Names of the fields to update.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Table: The updated table.
    """
```
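
The same read-modify-write pattern applies to tables; note that list_tables() yields lightweight TableListItem objects rather than full Table resources. A sketch with placeholder dataset and table IDs:

```python
from google.cloud import bigquery

client = bigquery.Client()

table = client.get_table("my_dataset.my_table")  # placeholder table ID
print(f"{table.num_rows} rows; columns: {[field.name for field in table.schema]}")

# Send back only the changed field.
table.description = "Daily snapshot of user events"
table = client.update_table(table, ["description"])

for item in client.list_tables("my_dataset"):  # placeholder dataset ID
    print(item.table_id)
```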

### Job Management

Monitor and control BigQuery jobs including queries, loads, extracts, and copies.

```python { .api }
def get_job(
    self,
    job_id: str,
    project: str = None,
    location: str = None,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Union[QueryJob, LoadJob, ExtractJob, CopyJob, UnknownJob]:
    """
    Fetch job metadata.

    Args:
        job_id: Unique identifier for the job.
        project: Project ID where the job was created.
        location: Location where the job was created.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Job: The requested job instance.
    """

def list_jobs(
    self,
    project: str = None,
    parent_job: str = None,
    state_filter: str = None,
    min_creation_time: datetime.datetime = None,
    max_creation_time: datetime.datetime = None,
    max_results: int = None,
    page_token: str = None,
    all_users: bool = None,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> google.api_core.page_iterator.Iterator:
    """
    List jobs in a project.

    Args:
        project: Project ID to list jobs from.
        parent_job: Parent job ID for script jobs.
        state_filter: Filter by job state ('done', 'pending', 'running').
        min_creation_time: Minimum job creation time.
        max_creation_time: Maximum job creation time.
        max_results: Maximum number of jobs to return.
        page_token: Token for pagination.
        all_users: Include jobs from all users.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Iterator: Iterator of job instances.
    """

def cancel_job(
    self,
    job_id: str,
    project: str = None,
    location: str = None,
    retry: google.api_core.retry.Retry = DEFAULT_RETRY,
    timeout: float = None,
) -> Union[QueryJob, LoadJob, ExtractJob, CopyJob]:
    """
    Request cancellation of a job.

    Cancellation is asynchronous and best effort; the job may still
    complete before the cancellation takes effect.

    Args:
        job_id: Unique identifier for the job.
        project: Project ID where the job was created.
        location: Location where the job was created.
        retry: Retry configuration.
        timeout: Timeout in seconds.

    Returns:
        Job: Job instance based on the resource returned by the API.
    """
```
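
For instance, jobs can be filtered by state and creation time, and a running job can be cancelled by ID. A sketch; the job ID is a placeholder, and cancellation is best effort:

```python
import datetime

from google.cloud import bigquery

client = bigquery.Client()

# Jobs created in the last hour that are still running.
since = datetime.datetime.now(datetime.timezone.utc) - datetime.timedelta(hours=1)
for job in client.list_jobs(min_creation_time=since, state_filter="running"):
    print(job.job_id, job.job_type, job.state)

# The job may still finish before the cancellation takes effect.
job = client.cancel_job("my-job-id")  # placeholder job ID
print(job.state)
```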

### Data Transfer Operations

Load and extract data to/from BigQuery tables with various configuration options.

```python { .api }
def load_table_from_uri(
    self,
    source_uris: Union[str, List[str]],
    destination: Union[Table, TableReference, str],
    job_config: LoadJobConfig = None,
    **kwargs
) -> LoadJob:
    """
    Load data from Cloud Storage URIs.

    Args:
        source_uris: Cloud Storage URIs to load from.
        destination: Destination table.
        job_config: Configuration for the load job.

    Returns:
        LoadJob: Job instance for the load operation.
    """

def load_table_from_file(
    self,
    file_obj: typing.BinaryIO,
    destination: Union[Table, TableReference, str],
    rewind: bool = False,
    size: int = None,
    num_retries: int = 6,
    job_config: LoadJobConfig = None,
    **kwargs
) -> LoadJob:
    """
    Load data from a file object.

    Args:
        file_obj: File-like object to load from.
        destination: Destination table.
        rewind: Whether to rewind the file before loading.
        size: Number of bytes to load.
        num_retries: Number of upload retries.
        job_config: Configuration for the load job.

    Returns:
        LoadJob: Job instance for the load operation.
    """

def extract_table(
    self,
    source: Union[Table, TableReference, str],
    destination_uris: Union[str, List[str]],
    job_config: ExtractJobConfig = None,
    **kwargs
) -> ExtractJob:
    """
    Extract data from a table to Cloud Storage.

    Args:
        source: Source table to extract from.
        destination_uris: Cloud Storage URIs to extract to.
        job_config: Configuration for the extract job.

    Returns:
        ExtractJob: Job instance for the extract operation.
    """
```
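
A sketch of a round trip through Cloud Storage; the bucket, table ID, and URIs are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client()
table_id = "my-project.my_dataset.my_table"  # placeholder table ID

# Load CSV files, skipping the header row and inferring the schema.
load_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.CSV,
    skip_leading_rows=1,
    autodetect=True,
)
load_job = client.load_table_from_uri(
    "gs://my-bucket/data/*.csv", table_id, job_config=load_config
)
load_job.result()  # block until the load completes

# Export the table as sharded, gzip-compressed CSV.
extract_config = bigquery.ExtractJobConfig(compression="GZIP")
extract_job = client.extract_table(
    table_id, "gs://my-bucket/export/part-*.csv.gz", job_config=extract_config
)
extract_job.result()
```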

## Usage Examples

### Basic Client Setup

```python
from google.cloud import bigquery

# Use default credentials and project
client = bigquery.Client()

# Specify project explicitly
client = bigquery.Client(project="my-project-id")

# Use service account credentials
from google.oauth2 import service_account

credentials = service_account.Credentials.from_service_account_file(
    "path/to/service-account-key.json"
)
client = bigquery.Client(credentials=credentials, project="my-project-id")
```

### Resource Management

```python
# Create a dataset
dataset_id = "my_new_dataset"
dataset = bigquery.Dataset(f"{client.project}.{dataset_id}")
dataset.location = "US"
dataset = client.create_dataset(dataset, exists_ok=True)

# List all datasets (yields lightweight DatasetListItem objects)
for item in client.list_datasets():
    print(item.dataset_id)

# Create a table with schema
table_id = "my_new_table"
schema = [
    bigquery.SchemaField("name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="NULLABLE"),
]

table = bigquery.Table(f"{client.project}.{dataset_id}.{table_id}", schema=schema)
table = client.create_table(table, exists_ok=True)
```