
tessl/pypi-google-cloud-bigquery-datatransfer

Python client library for managing BigQuery Data Transfer Service operations and scheduling data transfers from partner SaaS applications.


Core Data Types

Essential data structures representing transfer configurations, runs, data sources, and related metadata used throughout the API for defining and managing data transfer operations.

Capabilities

TransferConfig

Represents a data transfer configuration including destination, schedule, parameters, and authorization details.

class TransferConfig:
    """
    Represents a data transfer configuration.
    
    Attributes:
        name (str): Output only. Identifier for the transfer configuration resource.
        display_name (str): The user specified display name for the transfer config.
        destination_dataset_id (str): The BigQuery target dataset id.
        data_source_id (str): Data source id.
        params (Struct): Data transfer specific parameters.
        schedule (str): Data transfer schedule in unix-cron format.
        schedule_options (ScheduleOptions): Options customizing the data transfer schedule.
        data_refresh_window_days (int): The number of days to look back to automatically refresh data.
        disabled (bool): Whether this config is disabled.
        update_time (Timestamp): Output only. Data transfer modification time.
        next_run_time (Timestamp): Output only. Next time when data transfer will run.
        state (TransferState): Output only. State of the most recently updated transfer run.
        user_id (int): Deprecated. Unique ID of the user on whose behalf transfer is done.
        dataset_region (str): Output only. Region in which BigQuery dataset is located.
        notification_pubsub_topic (str): Pub/Sub topic where notifications will be sent.
        email_preferences (EmailPreferences): Email notification preferences.
        owner_info (UserInfo): Output only. Information about the user for transfer run.
        encryption_configuration (EncryptionConfiguration): The encryption configuration part.
    """
    name: str
    display_name: str
    destination_dataset_id: str
    data_source_id: str
    params: struct_pb2.Struct
    schedule: str
    schedule_options: ScheduleOptions
    data_refresh_window_days: int
    disabled: bool
    update_time: timestamp_pb2.Timestamp
    next_run_time: timestamp_pb2.Timestamp
    state: TransferState
    user_id: int
    dataset_region: str
    notification_pubsub_topic: str
    email_preferences: EmailPreferences
    owner_info: UserInfo
    encryption_configuration: EncryptionConfiguration

TransferRun

Represents a data transfer run with execution status, timing, and error information.

class TransferRun:
    """
    Represents a data transfer run.
    
    Attributes:
        name (str): Identifier for the transfer run resource.
        schedule_time (Timestamp): Minimum time after which a transfer run can be started.
        run_time (Timestamp): For batch transfer runs, specifies the date and time of the data.
        error_status (Status): Status of the transfer run.
        start_time (Timestamp): Output only. Time when transfer run was started.
        end_time (Timestamp): Output only. Time when transfer run ended.
        update_time (Timestamp): Output only. Last time the data transfer run state was updated.
        params (Struct): Output only. Data transfer specific parameters.
        destination_dataset_id (str): Output only. The BigQuery target dataset id.
        data_source_id (str): Output only. Data source id.
        state (TransferState): Data transfer run state.
        user_id (int): Deprecated. Unique ID of the user on whose behalf transfer is done.
        schedule (str): Output only. Describes the schedule of this transfer run.
        notification_pubsub_topic (str): Output only. Pub/Sub topic where notifications will be sent.
        email_preferences (EmailPreferences): Output only. Email notification preferences.
    """
    name: str
    schedule_time: timestamp_pb2.Timestamp
    run_time: timestamp_pb2.Timestamp
    error_status: status_pb2.Status
    start_time: timestamp_pb2.Timestamp
    end_time: timestamp_pb2.Timestamp
    update_time: timestamp_pb2.Timestamp
    params: struct_pb2.Struct
    destination_dataset_id: str
    data_source_id: str
    state: TransferState
    user_id: int
    schedule: str
    notification_pubsub_topic: str
    email_preferences: EmailPreferences

DataSource

Represents a data source that can be used in BigQuery Data Transfer.

class DataSource:
    """
    Represents a data source.
    
    Attributes:
        name (str): Output only. Data source resource name.
        data_source_id (str): Data source id.
        display_name (str): User friendly display name of the data source.
        description (str): User friendly data source description string.
        client_id (str): Data source client id.
        scopes (Sequence[str]): API auth scopes for which a refresh token needs to be obtained.
        transfer_type (TransferType): Deprecated. This field has no effect.
        supports_multiple_transfers (bool): Indicates whether the data source supports multiple transfers.
        update_deadline_seconds (int): The number of seconds to wait for a status update.
        default_schedule (str): Default data transfer schedule.
        supports_custom_schedule (bool): Specifies whether the data source supports a user defined schedule.
        parameters (Sequence[DataSourceParameter]): Data source parameters.
        help_url (str): Url for the help document for this data source.
        authorization_type (AuthorizationType): Indicates the type of authorization.
        data_refresh_type (DataRefreshType): Specifies whether the data source supports automatic data refresh.
        default_data_refresh_window_days (int): Default data refresh window in days.
        manual_runs_disabled (bool): Disables backfill and manual run functionality.
        minimum_schedule_interval (Duration): The minimum interval for scheduler to schedule runs.
    """
    name: str
    data_source_id: str
    display_name: str
    description: str
    client_id: str
    scopes: Sequence[str]
    transfer_type: TransferType
    supports_multiple_transfers: bool
    update_deadline_seconds: int
    default_schedule: str
    supports_custom_schedule: bool
    parameters: Sequence[DataSourceParameter]
    help_url: str
    authorization_type: AuthorizationType
    data_refresh_type: DataRefreshType
    default_data_refresh_window_days: int
    manual_runs_disabled: bool
    minimum_schedule_interval: duration_pb2.Duration

    class AuthorizationType(proto.Enum):
        """The type of authorization needed for this data source."""
        AUTHORIZATION_TYPE_UNSPECIFIED = 0
        AUTHORIZATION_CODE = 1
        GOOGLE_PLUS_AUTHORIZATION_CODE = 2
        FIRST_PARTY_OAUTH = 3

    class DataRefreshType(proto.Enum):
        """Represents how the data source supports data auto refresh."""
        DATA_REFRESH_TYPE_UNSPECIFIED = 0
        SLIDING_WINDOW = 1
        CUSTOM_SLIDING_WINDOW = 2

DataSourceParameter

Represents a parameter used to define custom fields in a data source definition.

class DataSourceParameter:
    """
    A parameter used to define custom fields in a data source definition.
    
    Attributes:
        param_id (str): Parameter identifier.
        display_name (str): Parameter display name in the user interface.
        description (str): Parameter description.
        type_ (Type): Parameter type.
        required (bool): Whether the parameter is required.
        repeated (bool): Deprecated. This field has no effect.
        validation_regex (str): Regular expression which can be used for parameter validation.
        allowed_values (Sequence[str]): All possible values for the parameter.
        min_value (DoubleValue): Minimum allowed value for integer and double parameters.
        max_value (DoubleValue): Maximum allowed value for integer and double parameters.
        fields (Sequence[DataSourceParameter]): Deprecated. This field has no effect.
        validation_description (str): Description of the requirements for this field.
        validation_help_url (str): URL to a help document to further explain the naming requirements.
        immutable (bool): Cannot be changed after initial creation.
        recurse (bool): Deprecated. This field has no effect.
    """
    param_id: str
    display_name: str
    description: str
    type_: Type
    required: bool
    repeated: bool
    validation_regex: str
    allowed_values: Sequence[str]
    min_value: wrappers_pb2.DoubleValue
    max_value: wrappers_pb2.DoubleValue
    fields: Sequence[DataSourceParameter]
    validation_description: str
    validation_help_url: str
    immutable: bool
    recurse: bool

    class Type(proto.Enum):
        """Parameter types."""
        TYPE_UNSPECIFIED = 0
        STRING = 1
        INTEGER = 2
        DOUBLE = 3
        BOOLEAN = 4
        RECORD = 5
        PLUS_PAGE = 6
        LIST = 7

TransferMessage

Represents a user-facing message for a particular data transfer run.

class TransferMessage:
    """
    Represents a user-facing message for a particular data transfer run.
    
    Attributes:
        message_time (Timestamp): Time when message was logged.
        severity (MessageSeverity): Message severity.
        message_text (str): Message text.
    """
    message_time: timestamp_pb2.Timestamp
    severity: MessageSeverity
    message_text: str

    class MessageSeverity(proto.Enum):
        """Message severity levels."""
        MESSAGE_SEVERITY_UNSPECIFIED = 0
        INFO = 1
        WARNING = 2
        ERROR = 3
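
Because the MessageSeverity values are ordered (INFO < WARNING < ERROR), log messages can be filtered by a severity threshold. A minimal stdlib sketch, using a stand-in IntEnum that mirrors the numeric values above (the tuples are illustrative sample data, not library output):

```python
from enum import IntEnum

# Stand-in mirroring MessageSeverity's numeric values from the enum above.
class Severity(IntEnum):
    UNSPECIFIED = 0
    INFO = 1
    WARNING = 2
    ERROR = 3

# Illustrative sample messages (severity, text):
messages = [
    (Severity.INFO, "Transfer started"),
    (Severity.WARNING, "Quota nearly exhausted"),
    (Severity.ERROR, "Table not found"),
]

# Keep only warnings and errors:
problems = [(s, t) for s, t in messages if s >= Severity.WARNING]
print(problems)
```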

UserInfo

Information about the user for whom the transfer config was created.

class UserInfo:
    """
    Information about the user for whom the transfer config was created.
    
    Attributes:
        email (str): E-mail address of the user.
    """
    email: str

Usage Examples

Working with TransferConfig

from google.cloud import bigquery_datatransfer
from google.protobuf import struct_pb2

client = bigquery_datatransfer.DataTransferServiceClient()

# Create parameters for scheduled query
params = struct_pb2.Struct()
params.update({
    "query": "SELECT * FROM `project.dataset.table` WHERE date = @run_date",
    "destination_table_name_template": "results_{run_date}",
    "use_legacy_sql": False
})

# Create transfer config
transfer_config = bigquery_datatransfer.TransferConfig(
    display_name="My Scheduled Query",
    data_source_id="scheduled_query",
    destination_dataset_id="my_dataset",
    schedule="every day 08:00",
    params=params,
    disabled=False,
    email_preferences=bigquery_datatransfer.EmailPreferences(
        enable_failure_email=True
    )
)

# Print config details
print(f"Config Name: {transfer_config.display_name}")
print(f"Data Source: {transfer_config.data_source_id}")
print(f"Schedule: {transfer_config.schedule}")
print(f"Disabled: {transfer_config.disabled}")
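
For scheduled queries, the service substitutes the run date (formatted YYYYMMDD) into the {run_date} placeholder of destination_table_name_template. The substitution happens server-side; this stdlib sketch only illustrates the formatting:

```python
from datetime import date

# The service replaces {run_date} with the run's date as YYYYMMDD.
# Local illustration of that substitution:
template = "results_{run_date}"
run_date = date(2024, 1, 15)  # example run date
table_name = template.format(run_date=run_date.strftime("%Y%m%d"))
print(table_name)  # results_20240115
```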

Working with TransferRun

from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# Get a transfer run (project_id, location, config_id, and run_id assumed defined)
run_name = f"projects/{project_id}/locations/{location}/transferConfigs/{config_id}/runs/{run_id}"
run = client.get_transfer_run(name=run_name)

# Access run properties
print(f"Run Name: {run.name}")
print(f"State: {run.state}")
print(f"Data Source: {run.data_source_id}")
print(f"Schedule Time: {run.schedule_time}")
print(f"Start Time: {run.start_time}")
print(f"End Time: {run.end_time}")

# Check for errors (an error_status.code of 0 means OK)
if run.error_status.code != 0:
    print(f"Error Code: {run.error_status.code}")
    print(f"Error Message: {run.error_status.message}")

# Access run parameters
print("Parameters:")
for key, value in run.params.items():
    print(f"  {key}: {value}")
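
The error_status field carries a google.rpc.Status, whose code follows the canonical gRPC status codes. A small stdlib lookup for a few common codes (illustrative subset; describe_error is a hypothetical helper, not part of the library):

```python
# A few canonical gRPC/google.rpc status codes (illustrative subset):
CANONICAL_CODES = {
    0: "OK",
    3: "INVALID_ARGUMENT",
    5: "NOT_FOUND",
    7: "PERMISSION_DENIED",
    13: "INTERNAL",
}

def describe_error(code: int, message: str = "") -> str:
    """Render a status code and optional message as a readable string."""
    name = CANONICAL_CODES.get(code, f"CODE_{code}")
    return f"{name}: {message}" if message else name

print(describe_error(0))                    # OK
print(describe_error(5, "run not found"))   # NOT_FOUND: run not found
```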

Working with DataSource

from google.cloud import bigquery_datatransfer

client = bigquery_datatransfer.DataTransferServiceClient()

# Get data source details (project_id and location assumed defined)
data_source_name = f"projects/{project_id}/locations/{location}/dataSources/scheduled_query"
data_source = client.get_data_source(name=data_source_name)

print(f"Data Source: {data_source.display_name}")
print(f"ID: {data_source.data_source_id}")
print(f"Description: {data_source.description}")
print(f"Supports Custom Schedule: {data_source.supports_custom_schedule}")
print(f"Default Schedule: {data_source.default_schedule}")

# List parameters
print("Parameters:")
for param in data_source.parameters:
    print(f"  {param.param_id}: {param.display_name}")
    print(f"    Type: {param.type_}")
    print(f"    Required: {param.required}")
    if param.description:
        print(f"    Description: {param.description}")
    if param.allowed_values:
        print(f"    Allowed Values: {list(param.allowed_values)}")

Working with DataSourceParameter

from google.cloud import bigquery_datatransfer

# Examine parameter details
def describe_parameter(param):
    print(f"Parameter: {param.param_id}")
    print(f"  Display Name: {param.display_name}")
    print(f"  Type: {param.type_}")
    print(f"  Required: {param.required}")
    
    if param.description:
        print(f"  Description: {param.description}")
    
    if param.validation_regex:
        print(f"  Validation Regex: {param.validation_regex}")
    
    if param.allowed_values:
        print(f"  Allowed Values: {list(param.allowed_values)}")
    
    if param.min_value:
        print(f"  Min Value: {param.min_value.value}")
    
    if param.max_value:
        print(f"  Max Value: {param.max_value.value}")
    
    if param.validation_description:
        print(f"  Validation Help: {param.validation_description}")

# Example usage with a data source (project_id and location assumed defined)
client = bigquery_datatransfer.DataTransferServiceClient()
data_source_name = f"projects/{project_id}/locations/{location}/dataSources/google_ads"
data_source = client.get_data_source(name=data_source_name)

for param in data_source.parameters:
    describe_parameter(param)
    print()
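
The parameter metadata (required, validation_regex, allowed_values) can drive client-side validation of user-supplied params before creating a transfer config. A minimal sketch using a hypothetical ParamSpec stand-in for DataSourceParameter; the field names mirror the class above, but the validation logic is illustrative, not the library's:

```python
import re
from dataclasses import dataclass, field
from typing import Sequence

# Hypothetical stand-in for DataSourceParameter, carrying only the
# fields this sketch uses.
@dataclass
class ParamSpec:
    param_id: str
    required: bool = False
    validation_regex: str = ""
    allowed_values: Sequence[str] = field(default_factory=list)

def validate_params(specs, values):
    """Collect validation errors for user-supplied parameter values."""
    errors = []
    for spec in specs:
        if spec.param_id not in values:
            if spec.required:
                errors.append(f"{spec.param_id}: required")
            continue
        v = str(values[spec.param_id])
        if spec.validation_regex and not re.fullmatch(spec.validation_regex, v):
            errors.append(f"{spec.param_id}: does not match {spec.validation_regex}")
        if spec.allowed_values and v not in spec.allowed_values:
            errors.append(f"{spec.param_id}: not in allowed values")
    return errors

specs = [
    ParamSpec("query", required=True),
    ParamSpec("write_disposition", allowed_values=["WRITE_TRUNCATE", "WRITE_APPEND"]),
]
print(validate_params(specs, {"write_disposition": "WRITE_MERGE"}))
```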

Pager Types

Pager classes used for iterating through paginated list responses.

ListDataSourcesPager

class ListDataSourcesPager:
    """
    A pager for iterating through list_data_sources requests.
    
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[DataSource]:
        """Iterate over data sources in the response."""

ListTransferConfigsPager

class ListTransferConfigsPager:
    """
    A pager for iterating through list_transfer_configs requests.
    
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[TransferConfig]:
        """Iterate over transfer configurations in the response."""

ListTransferRunsPager

class ListTransferRunsPager:
    """
    A pager for iterating through list_transfer_runs requests.
    
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[TransferRun]:
        """Iterate over transfer runs in the response."""

ListTransferLogsPager

class ListTransferLogsPager:
    """
    A pager for iterating through list_transfer_logs requests.
    
    This class provides an iterator interface for paginated results.
    """
    def __iter__(self) -> Iterator[TransferMessage]:
        """Iterate over transfer log messages in the response."""
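
All four pagers follow the same token-based pagination pattern: each page carries a batch of items plus a next_page_token, and iteration fetches pages lazily until the token is empty. A stdlib sketch of that pattern (not the library's actual implementation; the fetch callable and fake pages are illustrative):

```python
from typing import Callable, Iterator, List, Tuple

# (items, next_page_token) — an empty token means no more pages.
Page = Tuple[List[str], str]

class SimplePager:
    """Illustrative token-based pager: fetches pages lazily on iteration."""

    def __init__(self, fetch: Callable[[str], Page]):
        self._fetch = fetch

    def __iter__(self) -> Iterator[str]:
        token = ""
        while True:
            items, token = self._fetch(token)
            yield from items
            if not token:
                return

# Fake two-page response keyed by page token, for demonstration:
PAGES = {"": (["a", "b"], "t1"), "t1": (["c"], "")}
print(list(SimplePager(lambda tok: PAGES[tok])))  # ['a', 'b', 'c']
```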

Install with Tessl CLI

npx tessl i tessl/pypi-google-cloud-bigquery-datatransfer
