tessl/pypi-google-cloud-bigquery

Google BigQuery API client library for Python providing comprehensive data warehouse and analytics capabilities

Workspace: tessl · Visibility: Public
Describes: pypipkg:pypi/google-cloud-bigquery@3.36.x

To install, run:

    npx @tessl/cli install tessl/pypi-google-cloud-bigquery@3.36.0

# Google Cloud BigQuery

Google BigQuery API client library for Python providing comprehensive data warehouse and analytics capabilities. This library enables developers to interact with Google's cloud-based data warehouse, perform SQL queries on massive datasets, manage BigQuery resources, and integrate with the broader Google Cloud ecosystem.

## Package Information

- **Package Name**: google-cloud-bigquery
- **Package Type**: library
- **Language**: Python
- **Installation**: `pip install google-cloud-bigquery`

## Core Imports

```python
from google.cloud import bigquery
```

Main client and commonly used classes:

```python
from google.cloud.bigquery import Client, Dataset, Table, QueryJob
```

Import specific components as needed:

```python
from google.cloud.bigquery import (
    SchemaField, LoadJob, ExtractJob,
    QueryJobConfig, LoadJobConfig
)
```

## Basic Usage

```python
from google.cloud import bigquery

# Initialize the client
client = bigquery.Client()

# Simple query example
query = """
    SELECT name, COUNT(*) as count
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY count DESC
    LIMIT 10
"""

# Execute query and get results
query_job = client.query(query)
results = query_job.result()

# Process results
for row in results:
    print(f"{row.name}: {row.count}")

# Working with datasets and tables
dataset_id = "my_dataset"
table_id = "my_table"

# Create dataset
dataset = bigquery.Dataset(f"{client.project}.{dataset_id}")
dataset = client.create_dataset(dataset, exists_ok=True)

# Define table schema
schema = [
    bigquery.SchemaField("name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="NULLABLE"),
    bigquery.SchemaField("city", "STRING", mode="NULLABLE"),
]

# Create table
table = bigquery.Table(f"{client.project}.{dataset_id}.{table_id}", schema=schema)
table = client.create_table(table, exists_ok=True)
```

## Architecture

The BigQuery client library follows a hierarchical resource model:

- **Client**: Central connection manager for all BigQuery operations
- **Dataset**: Container for tables, models, and routines within a project
- **Table**: Data storage with schema, containing rows and columns
- **Job**: Asynchronous operation (query, load, extract, copy) with progress tracking
- **Schema**: Structure definition using SchemaField objects for type safety
- **Query Parameters**: Type-safe parameter binding for SQL queries

The library integrates seamlessly with pandas, PyArrow, and other data science tools, supports both synchronous and asynchronous operations, and provides comprehensive error handling and retry mechanisms.

## Capabilities

### Client Operations

Core client functionality for authentication, project management, and resource operations. Provides the main entry point for all BigQuery interactions.

```python { .api }
class Client:
    def __init__(self, project: str = None, credentials: Any = None, **kwargs): ...
    def query(self, query: str, **kwargs) -> QueryJob: ...
    def get_dataset(self, dataset_ref: str) -> Dataset: ...
    def create_dataset(self, dataset: Dataset, **kwargs) -> Dataset: ...
    def delete_dataset(self, dataset_ref: str, **kwargs) -> None: ...
    def list_datasets(self, **kwargs) -> Iterator[Dataset]: ...
```

[Client Operations](./client-operations.md)

### Query Operations

SQL query execution with parameters, job configuration, and result processing. Supports both simple queries and complex analytical workloads with pagination and streaming.

```python { .api }
class QueryJob:
    def result(self, **kwargs) -> RowIterator: ...
    def to_dataframe(self, **kwargs) -> pandas.DataFrame: ...
    def to_arrow(self, **kwargs) -> pyarrow.Table: ...

class QueryJobConfig:
    def __init__(self, **kwargs): ...

# Client method for submitting a query with an optional job configuration:
def query(self, query: str, job_config: QueryJobConfig = None, **kwargs) -> QueryJob: ...
```

[Query Operations](./query-operations.md)

### Dataset Management

Dataset creation, configuration, access control, and metadata management. Datasets serve as containers for tables and other BigQuery resources.

```python { .api }
class Dataset:
    def __init__(self, dataset_ref: str): ...

class DatasetReference:
    def __init__(self, project: str, dataset_id: str): ...

class AccessEntry:
    def __init__(self, role: str, entity_type: str, entity_id: str): ...
```

[Dataset Management](./dataset-management.md)

143

144

### Table Operations

145

146

Table creation, schema management, data loading, and metadata operations. Includes support for partitioning, clustering, and various table types.

147

148

```python { .api }

149

class Table:

150

def __init__(self, table_ref: str, schema: List[SchemaField] = None): ...

151

152

class TableReference:

153

def __init__(self, dataset_ref: DatasetReference, table_id: str): ...

154

155

class Row:

156

def values(self) -> List[Any]: ...

157

def keys(self) -> List[str]: ...

158

```

159

160

[Table Operations](./table-operations.md)

161

162

### Data Loading

163

164

Loading data from various sources including local files, Cloud Storage, streaming inserts, and data export. Supports multiple formats and transformation options.

165

166

```python { .api }

167

class LoadJob:

168

def result(self, **kwargs) -> LoadJob: ...

169

170

class LoadJobConfig:

171

def __init__(self, **kwargs): ...

172

source_format: SourceFormat

173

schema: List[SchemaField]

174

write_disposition: WriteDisposition

175

176

class ExtractJob:

177

def result(self, **kwargs) -> ExtractJob: ...

178

179

class ExtractJobConfig:

180

def __init__(self, **kwargs): ...

181

destination_format: DestinationFormat

182

```

183

184

[Data Loading](./data-loading.md)

185

186

### Schema Definition

187

188

Type-safe schema definition with field specifications, modes, and descriptions. Essential for table creation and data validation.

189

190

```python { .api }

191

class SchemaField:

192

def __init__(self, name: str, field_type: str, mode: str = "NULLABLE", **kwargs): ...

193

194

class FieldElementType:

195

def __init__(self, element_type: str): ...

196

197

class PolicyTagList:

198

def __init__(self, names: List[str]): ...

199

```

200

201

[Schema Definition](./schema-definition.md)

### Query Parameters

Type-safe parameter binding for SQL queries supporting scalar, array, struct, and range parameter types with proper type validation.

```python { .api }
class ScalarQueryParameter:
    def __init__(self, name: str, type_: str, value: Any): ...

class ArrayQueryParameter:
    def __init__(self, name: str, array_type: str, values: List[Any]): ...

class StructQueryParameter:
    def __init__(self, name: str, *sub_params): ...
```

[Query Parameters](./query-parameters.md)

219

220

### Database API (DB-API 2.0)

221

222

Python Database API specification compliance for SQL database compatibility. Enables use with database tools and ORMs.

223

224

```python { .api }

225

def connect(client: Client = None, **kwargs) -> Connection: ...

226

227

class Connection:

228

def cursor(self) -> Cursor: ...

229

def commit(self) -> None: ...

230

def close(self) -> None: ...

231

232

class Cursor:

233

def execute(self, query: str, parameters: Any = None) -> None: ...

234

def fetchall(self) -> List[Any]: ...

235

```

236

237

[Database API](./database-api.md)

### Models and Routines

BigQuery ML model management and user-defined functions (UDFs). Supports model creation, training, evaluation, and stored procedures.

```python { .api }
class Model:
    def __init__(self, model_ref: Union[str, ModelReference]): ...

class ModelReference:
    def __init__(self, project: str, dataset_id: str, model_id: str): ...

class Routine:
    def __init__(self, routine_ref: Union[str, RoutineReference], routine_type: str = None): ...

class RoutineReference:
    def __init__(self, project: str, dataset_id: str, routine_id: str): ...

class RoutineArgument:
    def __init__(self, name: str = None, argument_kind: str = None, mode: str = None, data_type: StandardSqlDataType = None): ...
```

[Models and Routines](./models-routines.md)

## Common Types and Constants

```python { .api }
# Enums for job and table configuration
class SourceFormat:
    CSV: str
    NEWLINE_DELIMITED_JSON: str
    AVRO: str
    PARQUET: str
    ORC: str

class WriteDisposition:
    WRITE_EMPTY: str
    WRITE_TRUNCATE: str
    WRITE_APPEND: str

class CreateDisposition:
    CREATE_IF_NEEDED: str
    CREATE_NEVER: str

class QueryPriority:
    BATCH: str
    INTERACTIVE: str

# Exception classes
class LegacyBigQueryStorageError(Exception): ...
class LegacyPandasError(Exception): ...
class LegacyPyarrowError(Exception): ...

# Retry configuration
DEFAULT_RETRY: Retry
```