tessl/pypi-google-cloud-bigquery

Google BigQuery API client library for Python providing comprehensive data warehouse and analytics capabilities

Workspace: tessl · Visibility: Public
Describes: pypipkg:pypi/google-cloud-bigquery@3.36.x

To install, run:

    npx @tessl/cli install tessl/pypi-google-cloud-bigquery@3.36.0

# Google Cloud BigQuery

Google BigQuery API client library for Python providing comprehensive data warehouse and analytics capabilities. This library enables developers to interact with Google's cloud-based data warehouse, perform SQL queries on massive datasets, manage BigQuery resources, and integrate with the broader Google Cloud ecosystem.

## Package Information

- **Package Name**: google-cloud-bigquery
- **Package Type**: library
- **Language**: Python
- **Installation**: `pip install google-cloud-bigquery`

## Core Imports

```python
from google.cloud import bigquery
```

Main client and commonly used classes:

```python
from google.cloud.bigquery import Client, Dataset, Table, QueryJob
```

Import specific components as needed:

```python
from google.cloud.bigquery import (
    SchemaField, LoadJob, ExtractJob,
    QueryJobConfig, LoadJobConfig
)
```

## Basic Usage

```python
from google.cloud import bigquery

# Initialize the client
client = bigquery.Client()

# Simple query example
query = """
    SELECT name, COUNT(*) as count
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    WHERE state = 'TX'
    GROUP BY name
    ORDER BY count DESC
    LIMIT 10
"""

# Execute query and get results
query_job = client.query(query)
results = query_job.result()

# Process results
for row in results:
    print(f"{row.name}: {row.count}")

# Working with datasets and tables
dataset_id = "my_dataset"
table_id = "my_table"

# Create dataset
dataset = bigquery.Dataset(f"{client.project}.{dataset_id}")
dataset = client.create_dataset(dataset, exists_ok=True)

# Define table schema
schema = [
    bigquery.SchemaField("name", "STRING", mode="REQUIRED"),
    bigquery.SchemaField("age", "INTEGER", mode="NULLABLE"),
    bigquery.SchemaField("city", "STRING", mode="NULLABLE"),
]

# Create table
table = bigquery.Table(f"{client.project}.{dataset_id}.{table_id}", schema=schema)
table = client.create_table(table, exists_ok=True)
```

## Architecture

The BigQuery client library follows a hierarchical resource model:

- **Client**: Central connection manager for all BigQuery operations
- **Dataset**: Container for tables, models, and routines within a project
- **Table**: Data storage with schema, containing rows and columns
- **Job**: Asynchronous operation (query, load, extract, copy) with progress tracking
- **Schema**: Structure definition using SchemaField objects for type safety
- **Query Parameters**: Type-safe parameter binding for SQL queries

The library integrates seamlessly with pandas, PyArrow, and other data science tools, supports both synchronous and asynchronous operations, and provides comprehensive error handling and retry mechanisms.

## Capabilities

### Client Operations

Core client functionality for authentication, project management, and resource operations. Provides the main entry point for all BigQuery interactions.

```python { .api }
class Client:
    def __init__(self, project: str = None, credentials: Any = None, **kwargs): ...
    def query(self, query: str, **kwargs) -> QueryJob: ...
    def get_dataset(self, dataset_ref: str) -> Dataset: ...
    def create_dataset(self, dataset: Dataset, **kwargs) -> Dataset: ...
    def delete_dataset(self, dataset_ref: str, **kwargs) -> None: ...
    def list_datasets(self, **kwargs) -> Iterator[Dataset]: ...
```

[Client Operations](./client-operations.md)

### Query Operations

SQL query execution with parameters, job configuration, and result processing. Supports both simple queries and complex analytical workloads with pagination and streaming.

```python { .api }
class QueryJob:
    def result(self, **kwargs) -> RowIterator: ...
    def to_dataframe(self, **kwargs) -> pandas.DataFrame: ...
    def to_arrow(self, **kwargs) -> pyarrow.Table: ...

class QueryJobConfig:
    def __init__(self, **kwargs): ...

# Client method for submitting a query with an optional job configuration:
def query(self, query: str, job_config: QueryJobConfig = None, **kwargs) -> QueryJob: ...
```

[Query Operations](./query-operations.md)

### Dataset Management

Dataset creation, configuration, access control, and metadata management. Datasets serve as containers for tables and other BigQuery resources.

```python { .api }
class Dataset:
    def __init__(self, dataset_ref: str): ...

class DatasetReference:
    def __init__(self, project: str, dataset_id: str): ...

class AccessEntry:
    def __init__(self, role: str, entity_type: str, entity_id: str): ...
```

[Dataset Management](./dataset-management.md)

143

144

### Table Operations

145

146

Table creation, schema management, data loading, and metadata operations. Includes support for partitioning, clustering, and various table types.

147

148

```python { .api }

149

class Table:

150

def __init__(self, table_ref: str, schema: List[SchemaField] = None): ...

151

152

class TableReference:

153

def __init__(self, dataset_ref: DatasetReference, table_id: str): ...

154

155

class Row:

156

def values(self) -> List[Any]: ...

157

def keys(self) -> List[str]: ...

158

```

159

160

[Table Operations](./table-operations.md)

161

162

### Data Loading

163

164

Loading data from various sources including local files, Cloud Storage, streaming inserts, and data export. Supports multiple formats and transformation options.

165

166

```python { .api }

167

class LoadJob:

168

def result(self, **kwargs) -> LoadJob: ...

169

170

class LoadJobConfig:

171

def __init__(self, **kwargs): ...

172

source_format: SourceFormat

173

schema: List[SchemaField]

174

write_disposition: WriteDisposition

175

176

class ExtractJob:

177

def result(self, **kwargs) -> ExtractJob: ...

178

179

class ExtractJobConfig:

180

def __init__(self, **kwargs): ...

181

destination_format: DestinationFormat

182

```

183

184

[Data Loading](./data-loading.md)

185

186

### Schema Definition

187

188

Type-safe schema definition with field specifications, modes, and descriptions. Essential for table creation and data validation.

189

190

```python { .api }

191

class SchemaField:

192

def __init__(self, name: str, field_type: str, mode: str = "NULLABLE", **kwargs): ...

193

194

class FieldElementType:

195

def __init__(self, element_type: str): ...

196

197

class PolicyTagList:

198

def __init__(self, names: List[str]): ...

199

```

200

201

[Schema Definition](./schema-definition.md)

### Query Parameters

Type-safe parameter binding for SQL queries supporting scalar, array, struct, and range parameter types with proper type validation.

```python { .api }
class ScalarQueryParameter:
    def __init__(self, name: str, type_: str, value: Any): ...

class ArrayQueryParameter:
    def __init__(self, name: str, array_type: str, values: List[Any]): ...

class StructQueryParameter:
    def __init__(self, name: str, *sub_params): ...
```

[Query Parameters](./query-parameters.md)

219

220

### Database API (DB-API 2.0)

221

222

Python Database API specification compliance for SQL database compatibility. Enables use with database tools and ORMs.

223

224

```python { .api }

225

def connect(client: Client = None, **kwargs) -> Connection: ...

226

227

class Connection:

228

def cursor(self) -> Cursor: ...

229

def commit(self) -> None: ...

230

def close(self) -> None: ...

231

232

class Cursor:

233

def execute(self, query: str, parameters: Any = None) -> None: ...

234

def fetchall(self) -> List[Any]: ...

235

```

236

237

[Database API](./database-api.md)

### Models and Routines

BigQuery ML model management and user-defined functions (UDFs). Supports model creation, training, evaluation, and stored procedures.

```python { .api }
class Model:
    def __init__(self, model_ref: Union[str, ModelReference]): ...

class ModelReference:
    def __init__(self, project: str, dataset_id: str, model_id: str): ...

class Routine:
    def __init__(self, routine_ref: Union[str, RoutineReference], routine_type: str = None): ...

class RoutineReference:
    def __init__(self, project: str, dataset_id: str, routine_id: str): ...

class RoutineArgument:
    def __init__(self, name: str = None, argument_kind: str = None, mode: str = None, data_type: StandardSqlDataType = None): ...
```

[Models and Routines](./models-routines.md)

## Common Types and Constants

```python { .api }
# Enums for job and table configuration
class SourceFormat:
    CSV: str
    NEWLINE_DELIMITED_JSON: str
    AVRO: str
    PARQUET: str
    ORC: str

class WriteDisposition:
    WRITE_EMPTY: str
    WRITE_TRUNCATE: str
    WRITE_APPEND: str

class CreateDisposition:
    CREATE_IF_NEEDED: str
    CREATE_NEVER: str

class QueryPriority:
    BATCH: str
    INTERACTIVE: str

# Exception classes
class LegacyBigQueryStorageError(Exception): ...
class LegacyPandasError(Exception): ...
class LegacyPyarrowError(Exception): ...

# Retry configuration
DEFAULT_RETRY: Retry
```