Python client library for managing BigQuery Data Transfer Service operations and scheduling data transfers from partner SaaS applications.
Essential data structures representing transfer configurations, runs, data sources, and related metadata used throughout the API for defining and managing data transfer operations.
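All of these resources are addressed by hierarchical resource names of the form `projects/*/locations/*/transferConfigs/*` (and `/runs/*` beneath that). As a point of reference, a minimal illustrative sketch of how those names are composed — the helper functions here are not part of the library (the generated client exposes its own `*_path` class methods):

```python
# Illustrative helpers (not part of the library): how Data Transfer
# resource names are composed from their components.
def transfer_config_path(project: str, location: str, config_id: str) -> str:
    return f"projects/{project}/locations/{location}/transferConfigs/{config_id}"

def transfer_run_path(project: str, location: str, config_id: str, run_id: str) -> str:
    return transfer_config_path(project, location, config_id) + f"/runs/{run_id}"

print(transfer_run_path("my-project", "us", "my-config", "123"))
# projects/my-project/locations/us/transferConfigs/my-config/runs/123
```

The examples later in this document build names like these inline with f-strings.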
Represents a data transfer configuration including destination, schedule, parameters, and authorization details.
class TransferConfig:
    """
    Represents a data transfer configuration.

    Attributes:
        name (str): Output only. Identifier for the transfer configuration resource.
        display_name (str): The user-specified display name for the transfer config.
        destination_dataset_id (str): The BigQuery target dataset id.
        data_source_id (str): Data source id.
        params (Struct): Data transfer specific parameters.
        schedule (str): Data transfer schedule in unix-cron format.
        schedule_options (ScheduleOptions): Options customizing the data transfer schedule.
        data_refresh_window_days (int): The number of days to look back to automatically refresh data.
        disabled (bool): Whether this config is disabled.
        update_time (Timestamp): Output only. Data transfer modification time.
        next_run_time (Timestamp): Output only. Next time when data transfer will run.
        state (TransferState): Output only. State of the most recently updated transfer run.
        user_id (int): Deprecated. Unique ID of the user on whose behalf transfer is done.
        dataset_region (str): Output only. Region in which BigQuery dataset is located.
        notification_pubsub_topic (str): Pub/Sub topic where notifications will be sent.
        email_preferences (EmailPreferences): Email notification preferences.
        owner_info (UserInfo): Output only. Information about the user for transfer run.
        encryption_configuration (EncryptionConfiguration): Encryption configuration, currently used for the optional KMS key name.
    """
    name: str
    display_name: str
    destination_dataset_id: str
    data_source_id: str
    params: struct_pb2.Struct
    schedule: str
    schedule_options: ScheduleOptions
    data_refresh_window_days: int
    disabled: bool
    update_time: timestamp_pb2.Timestamp
    next_run_time: timestamp_pb2.Timestamp
    state: TransferState
    user_id: int
    dataset_region: str
    notification_pubsub_topic: str
    email_preferences: EmailPreferences
    owner_info: UserInfo
    encryption_configuration: EncryptionConfiguration

Represents a data transfer run with execution status, timing, and error information.
class TransferRun:
    """
    Represents a data transfer run.

    Attributes:
        name (str): Identifier for the transfer run resource.
        schedule_time (Timestamp): Minimum time after which a transfer run can be started.
        run_time (Timestamp): For batch transfer runs, specifies the date and time of the data.
        error_status (Status): Status of the transfer run.
        start_time (Timestamp): Output only. Time when transfer run was started.
        end_time (Timestamp): Output only. Time when transfer run ended.
        update_time (Timestamp): Output only. Last time the data transfer run state was updated.
        params (Struct): Output only. Data transfer specific parameters.
        destination_dataset_id (str): Output only. The BigQuery target dataset id.
        data_source_id (str): Output only. Data source id.
        state (TransferState): Data transfer run state.
        user_id (int): Deprecated. Unique ID of the user on whose behalf transfer is done.
        schedule (str): Output only. Describes the schedule of this transfer run.
        notification_pubsub_topic (str): Output only. Pub/Sub topic where notifications will be sent.
        email_preferences (EmailPreferences): Output only. Email notification preferences.
    """
    name: str
    schedule_time: timestamp_pb2.Timestamp
    run_time: timestamp_pb2.Timestamp
    error_status: status_pb2.Status
    start_time: timestamp_pb2.Timestamp
    end_time: timestamp_pb2.Timestamp
    update_time: timestamp_pb2.Timestamp
    params: struct_pb2.Struct
    destination_dataset_id: str
    data_source_id: str
    state: TransferState
    user_id: int
    schedule: str
    notification_pubsub_topic: str
    email_preferences: EmailPreferences

Represents a data source that can be used in BigQuery Data Transfer.
class DataSource:
    """
    Represents a data source.

    Attributes:
        name (str): Output only. Data source resource name.
        data_source_id (str): Data source id.
        display_name (str): User-friendly display name of the data source.
        description (str): User-friendly data source description string.
        client_id (str): Data source client id.
        scopes (Sequence[str]): API auth scopes for which refresh token needs to be obtained.
        transfer_type (TransferType): Deprecated. This field has no effect.
        supports_multiple_transfers (bool): Indicates whether the data source supports multiple transfers.
        update_deadline_seconds (int): The number of seconds to wait for a status update.
        default_schedule (str): Default data transfer schedule.
        supports_custom_schedule (bool): Specifies whether the data source supports a user-defined schedule.
        parameters (Sequence[DataSourceParameter]): Data source parameters.
        help_url (str): URL of the help document for this data source.
        authorization_type (AuthorizationType): Indicates the type of authorization.
        data_refresh_type (DataRefreshType): Specifies whether the data source supports automatic data refresh.
        default_data_refresh_window_days (int): Default data refresh window in days.
        manual_runs_disabled (bool): Disables backfill and manual run functionality.
        minimum_schedule_interval (Duration): The minimum interval for scheduler to schedule runs.
    """
    name: str
    data_source_id: str
    display_name: str
    description: str
    client_id: str
    scopes: Sequence[str]
    transfer_type: TransferType
    supports_multiple_transfers: bool
    update_deadline_seconds: int
    default_schedule: str
    supports_custom_schedule: bool
    parameters: Sequence[DataSourceParameter]
    help_url: str
    authorization_type: AuthorizationType
    data_refresh_type: DataRefreshType
    default_data_refresh_window_days: int
    manual_runs_disabled: bool
    minimum_schedule_interval: duration_pb2.Duration

    class AuthorizationType(proto.Enum):
        """The type of authorization needed for this data source."""
        AUTHORIZATION_TYPE_UNSPECIFIED = 0
        AUTHORIZATION_CODE = 1
        GOOGLE_PLUS_AUTHORIZATION_CODE = 2
        FIRST_PARTY_OAUTH = 3

    class DataRefreshType(proto.Enum):
        """Represents how the data source supports data auto refresh."""
        DATA_REFRESH_TYPE_UNSPECIFIED = 0
        SLIDING_WINDOW = 1
        CUSTOM_SLIDING_WINDOW = 2

Represents a parameter used to define custom fields in a data source definition.
class DataSourceParameter:
    """
    A parameter used to define custom fields in a data source definition.

    Attributes:
        param_id (str): Parameter identifier.
        display_name (str): Parameter display name in the user interface.
        description (str): Parameter description.
        type_ (Type): Parameter type.
        required (bool): Whether the parameter is required.
        repeated (bool): Deprecated. This field has no effect.
        validation_regex (str): Regular expression which can be used for parameter validation.
        allowed_values (Sequence[str]): All possible values for the parameter.
        min_value (DoubleValue): For integer and double values, specifies the minimum allowed value.
        max_value (DoubleValue): For integer and double values, specifies the maximum allowed value.
        fields (Sequence[DataSourceParameter]): Deprecated. This field has no effect.
        validation_description (str): Description of the requirements for this field.
        validation_help_url (str): URL to a help document to further explain the naming requirements.
        immutable (bool): Cannot be changed after initial creation.
        recurse (bool): Deprecated. This field has no effect.
    """
    param_id: str
    display_name: str
    description: str
    type_: Type
    required: bool
    repeated: bool
    validation_regex: str
    allowed_values: Sequence[str]
    min_value: wrappers_pb2.DoubleValue
    max_value: wrappers_pb2.DoubleValue
    fields: Sequence[DataSourceParameter]
    validation_description: str
    validation_help_url: str
    immutable: bool
    recurse: bool

    class Type(proto.Enum):
        """Parameter types."""
        TYPE_UNSPECIFIED = 0
        STRING = 1
        INTEGER = 2
        DOUBLE = 3
        BOOLEAN = 4
        RECORD = 5
        PLUS_PAGE = 6
        LIST = 7

Represents a user-facing message for a particular data transfer run.
class TransferMessage:
    """
    Represents a user-facing message for a particular data transfer run.

    Attributes:
        message_time (Timestamp): Time when message was logged.
        severity (MessageSeverity): Message severity.
        message_text (str): Message text.
    """
    message_time: timestamp_pb2.Timestamp
    severity: MessageSeverity
    message_text: str

    class MessageSeverity(proto.Enum):
        """Message severity levels."""
        MESSAGE_SEVERITY_UNSPECIFIED = 0
        INFO = 1
        WARNING = 2
        ERROR = 3

Information about the user for whom the transfer config was created.
class UserInfo:
    """
    Information about the user for whom the transfer config was created.

    Attributes:
        email (str): E-mail address of the user.
    """
    email: str

from google.cloud import bigquery_datatransfer
from google.protobuf import struct_pb2

client = bigquery_datatransfer.DataTransferServiceClient()

# Create parameters for the scheduled query
params = struct_pb2.Struct()
params.update({
    "query": "SELECT * FROM `project.dataset.table` WHERE date = @run_date",
    "destination_table_name_template": "results_{run_date}",
    "use_legacy_sql": False,
})

# Create the transfer config
transfer_config = bigquery_datatransfer.TransferConfig(
    display_name="My Scheduled Query",
    data_source_id="scheduled_query",
    destination_dataset_id="my_dataset",
    schedule="every day 08:00",
    params=params,
    disabled=False,
    email_preferences=bigquery_datatransfer.EmailPreferences(
        enable_failure_email=True
    ),
)

# Print config details
print(f"Config Name: {transfer_config.display_name}")
print(f"Data Source: {transfer_config.data_source_id}")
print(f"Schedule: {transfer_config.schedule}")
print(f"Disabled: {transfer_config.disabled}")

# To actually schedule it, submit the config under a parent resource
# ("my-project" and "us" are placeholder values):
parent = "projects/my-project/locations/us"
created = client.create_transfer_config(parent=parent, transfer_config=transfer_config)

from google.cloud import bigquery_datatransfer
client = bigquery_datatransfer.DataTransferServiceClient()

# Get a transfer run (project_id, location, config_id and run_id
# are assumed to be defined)
run_name = f"projects/{project_id}/locations/{location}/transferConfigs/{config_id}/runs/{run_id}"
run = client.get_transfer_run(name=run_name)

# Access run properties
print(f"Run Name: {run.name}")
print(f"State: {run.state}")
print(f"Data Source: {run.data_source_id}")
print(f"Schedule Time: {run.schedule_time}")
print(f"Start Time: {run.start_time}")
print(f"End Time: {run.end_time}")

# Check for errors
if run.error_status and run.error_status.code != 0:
    print(f"Error Code: {run.error_status.code}")
    print(f"Error Message: {run.error_status.message}")

# Access run parameters
print("Parameters:")
for key, value in run.params.items():
    print(f"  {key}: {value}")

from google.cloud import bigquery_datatransfer
client = bigquery_datatransfer.DataTransferServiceClient()

# Get data source details (project_id and location are assumed to be defined)
data_source_name = f"projects/{project_id}/locations/{location}/dataSources/scheduled_query"
data_source = client.get_data_source(name=data_source_name)

print(f"Data Source: {data_source.display_name}")
print(f"ID: {data_source.data_source_id}")
print(f"Description: {data_source.description}")
print(f"Supports Custom Schedule: {data_source.supports_custom_schedule}")
print(f"Default Schedule: {data_source.default_schedule}")

# List parameters
print("Parameters:")
for param in data_source.parameters:
    print(f"  {param.param_id}: {param.display_name}")
    print(f"    Type: {param.type_}")
    print(f"    Required: {param.required}")
    if param.description:
        print(f"    Description: {param.description}")
    if param.allowed_values:
        print(f"    Allowed Values: {list(param.allowed_values)}")

from google.cloud import bigquery_datatransfer
# Examine parameter details
def describe_parameter(param):
    print(f"Parameter: {param.param_id}")
    print(f"  Display Name: {param.display_name}")
    print(f"  Type: {param.type_}")
    print(f"  Required: {param.required}")
    if param.description:
        print(f"  Description: {param.description}")
    if param.validation_regex:
        print(f"  Validation Regex: {param.validation_regex}")
    if param.allowed_values:
        print(f"  Allowed Values: {list(param.allowed_values)}")
    if param.min_value:
        print(f"  Min Value: {param.min_value.value}")
    if param.max_value:
        print(f"  Max Value: {param.max_value.value}")
    if param.validation_description:
        print(f"  Validation Help: {param.validation_description}")

# Example usage with a data source (project_id and location
# are assumed to be defined)
client = bigquery_datatransfer.DataTransferServiceClient()
data_source_name = f"projects/{project_id}/locations/{location}/dataSources/google_ads"
data_source = client.get_data_source(name=data_source_name)
for param in data_source.parameters:
    describe_parameter(param)
    print()

Pager classes used for iterating through paginated list responses.
class ListDataSourcesPager:
    """
    A pager for iterating through list_data_sources requests.
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[DataSource]:
        """Iterate over data sources in the response."""

class ListTransferConfigsPager:
    """
    A pager for iterating through list_transfer_configs requests.
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[TransferConfig]:
        """Iterate over transfer configurations in the response."""

class ListTransferRunsPager:
    """
    A pager for iterating through list_transfer_runs requests.
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[TransferRun]:
        """Iterate over transfer runs in the response."""

class ListTransferLogsPager:
    """
    A pager for iterating through list_transfer_logs requests.
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[TransferMessage]:
        """Iterate over transfer log messages in the response."""

Install with Tessl CLI
npx tessl i tessl/pypi-google-cloud-bigquery-datatransfer
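The pagers above all follow the same pattern: iterating the pager yields individual items across page boundaries, so callers can treat a paginated list response as a flat sequence. A minimal stand-in (not the real classes, which fetch pages from the API) illustrating that behavior:

```python
from typing import Iterator, List, Sequence

# Stand-in pager (not the real class): yields items across page
# boundaries, the way ListTransferConfigsPager and friends do.
class FakePager:
    def __init__(self, pages: Sequence[List[str]]):
        self._pages = pages  # each inner list mimics one list_* response page

    def __iter__(self) -> Iterator[str]:
        for page in self._pages:
            yield from page

pager = FakePager([["config-a", "config-b"], ["config-c"]])
print(list(pager))  # ['config-a', 'config-b', 'config-c']
```

With the real pagers, a plain `for config in client.list_transfer_configs(parent=parent):` loop works the same way; page fetches happen transparently during iteration.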