tessl/pypi-google-cloud-datacatalog

Google Cloud Datacatalog API client library for data discovery and metadata management

Workspace: tessl
Visibility: Public
Describes: pkg:pypi/google-cloud-datacatalog@3.27.x

To install, run

npx @tessl/cli install tessl/pypi-google-cloud-datacatalog@3.27.0

Google Cloud Data Catalog

Google Cloud Data Catalog is a fully managed and highly scalable data discovery and metadata management service. It provides comprehensive APIs for cataloging, organizing, and managing metadata for data assets across Google Cloud services and beyond.

Package Information

  • Package Name: google-cloud-datacatalog
  • Language: Python
  • Installation: pip install google-cloud-datacatalog

Core Imports

from google.cloud import datacatalog_v1

Individual client imports:

from google.cloud.datacatalog_v1 import DataCatalogClient
from google.cloud.datacatalog_v1 import PolicyTagManagerClient
from google.cloud.datacatalog_v1 import PolicyTagManagerSerializationClient

Async client imports:

from google.cloud.datacatalog_v1 import DataCatalogAsyncClient
from google.cloud.datacatalog_v1 import PolicyTagManagerAsyncClient
from google.cloud.datacatalog_v1 import PolicyTagManagerSerializationAsyncClient

Import types and request/response objects:

from google.cloud.datacatalog_v1.types import (
    Entry, EntryGroup, Tag, TagTemplate, PolicyTag, Taxonomy,
    CreateEntryRequest, SearchCatalogRequest, # ... other types
)

Basic Usage

from google.cloud import datacatalog_v1

# Create a client
client = datacatalog_v1.DataCatalogClient()

# Search catalog
search_request = datacatalog_v1.SearchCatalogRequest(
    scope=datacatalog_v1.SearchCatalogRequest.Scope(
        include_org_ids=["my-org-id"]
    ),
    query="type=table"
)
search_results = client.search_catalog(request=search_request)

# Create entry group
entry_group = datacatalog_v1.EntryGroup(
    display_name="My Entry Group",
    description="A sample entry group"
)
create_entry_group_request = datacatalog_v1.CreateEntryGroupRequest(
    parent="projects/my-project/locations/us-central1",
    entry_group_id="my-entry-group",
    entry_group=entry_group
)
created_group = client.create_entry_group(request=create_entry_group_request)

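# Create entry
# Note: create_entry accepts only FILESET, CLUSTER, DATA_STREAM, or custom
# (user-specified) types. Other types such as TABLE are created by Data Catalog
# itself during ingestion from integrated systems, so a custom type is used here.
# The type and system names below are placeholders.
entry = datacatalog_v1.Entry(
    display_name="My Table",
    description="A sample table entry",
    user_specified_type="my_table_type",
    user_specified_system="my_system"
)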
create_entry_request = datacatalog_v1.CreateEntryRequest(
    parent=created_group.name,
    entry_id="my-table",
    entry=entry
)
created_entry = client.create_entry(request=create_entry_request)
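
The pager returned by `search_catalog` can be iterated directly. The following is a minimal sketch of summarizing results, assuming only the `SearchCatalogResult` fields listed under Core Types. `summarize_results` is a hypothetical helper, not part of the library, and a list of stand-in objects replaces the real pager so the snippet runs without credentials.

```python
from types import SimpleNamespace
from typing import Iterable, List


def summarize_results(results: Iterable, limit: int = 10) -> List[str]:
    """Return up to `limit` lines of the form 'subtype: relative resource name'."""
    lines = []
    for result in results:
        if len(lines) >= limit:
            break
        lines.append(
            f"{result.search_result_subtype}: {result.relative_resource_name}"
        )
    return lines


# Stand-in for `client.search_catalog(request=search_request)`; the values are made up.
fake_results = [
    SimpleNamespace(
        search_result_subtype="entry.table",
        relative_resource_name=(
            "projects/my-project/locations/us/entryGroups/@bigquery/entries/abc"
        ),
    )
]
print(summarize_results(fake_results))
```

With a real client, pass `search_results` from the example above in place of `fake_results`.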

Architecture

The Data Catalog API is organized around three main services:

  • DataCatalogClient: Core catalog operations including entry management, search, tagging, and templates
  • PolicyTagManagerClient: Data governance through hierarchical policy tags and taxonomies for access control
  • PolicyTagManagerSerializationClient: Import/export capabilities for policy taxonomies across regions

Each client has a synchronous and an asynchronous variant. List and search methods return pagers that fetch additional pages as you iterate, and bulk imports and tag reconciliation run as long-running operations.
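
As a rough illustration of the asynchronous variant, the sketch below collects entry group names with `async for`. It assumes a client that behaves like `DataCatalogAsyncClient`, where awaiting `list_entry_groups` yields an async pager. `FakeAsyncClient` and `list_entry_group_names` are hypothetical names used so the snippet runs offline.

```python
import asyncio
from types import SimpleNamespace


async def list_entry_group_names(client, parent: str) -> list:
    """Collect entry group names from an async pager."""
    pager = await client.list_entry_groups(request={"parent": parent})
    return [group.name async for group in pager]


class FakeAsyncClient:
    """Hypothetical stand-in for DataCatalogAsyncClient (no network calls)."""

    async def list_entry_groups(self, request):
        async def groups():
            for suffix in ("a", "b"):
                yield SimpleNamespace(
                    name=f"{request['parent']}/entryGroups/{suffix}"
                )

        return groups()


names = asyncio.run(
    list_entry_group_names(
        FakeAsyncClient(), "projects/my-project/locations/us-central1"
    )
)
print(names)
```

In real use, replace `FakeAsyncClient()` with `datacatalog_v1.DataCatalogAsyncClient()`.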

Capabilities

Data Catalog Management

Core catalog operations including search, entry groups, entries, tagging, and tag templates. This is the primary interface for discovering and managing metadata about data assets.

class DataCatalogClient:
    def search_catalog(self, request: SearchCatalogRequest = None, **kwargs) -> SearchCatalogPager: ...
    def create_entry_group(self, request: CreateEntryGroupRequest = None, **kwargs) -> EntryGroup: ...
    def get_entry_group(self, request: GetEntryGroupRequest = None, **kwargs) -> EntryGroup: ...
    def update_entry_group(self, request: UpdateEntryGroupRequest = None, **kwargs) -> EntryGroup: ...
    def delete_entry_group(self, request: DeleteEntryGroupRequest = None, **kwargs) -> None: ...
    def list_entry_groups(self, request: ListEntryGroupsRequest = None, **kwargs) -> ListEntryGroupsPager: ...
    def create_entry(self, request: CreateEntryRequest = None, **kwargs) -> Entry: ...
    def get_entry(self, request: GetEntryRequest = None, **kwargs) -> Entry: ...
    def update_entry(self, request: UpdateEntryRequest = None, **kwargs) -> Entry: ...
    def delete_entry(self, request: DeleteEntryRequest = None, **kwargs) -> None: ...
    def list_entries(self, request: ListEntriesRequest = None, **kwargs) -> ListEntriesPager: ...
    def lookup_entry(self, request: LookupEntryRequest = None, **kwargs) -> Entry: ...
    def set_iam_policy(self, request: SetIamPolicyRequest = None, **kwargs) -> Policy: ...
    def get_iam_policy(self, request: GetIamPolicyRequest = None, **kwargs) -> Policy: ...
    def test_iam_permissions(self, request: TestIamPermissionsRequest = None, **kwargs) -> TestIamPermissionsResponse: ...
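
To illustrate `lookup_entry`, here is a small sketch that resolves a BigQuery table to its catalog entry through the table's linked resource name. Both helper functions are hypothetical. The request is passed as a plain dict, which the generated clients accept in place of a request object.

```python
def bigquery_table_resource(project: str, dataset: str, table: str) -> str:
    """Build the linked_resource name of a BigQuery table."""
    return (
        f"//bigquery.googleapis.com/projects/{project}"
        f"/datasets/{dataset}/tables/{table}"
    )


def lookup_table_entry(client, project: str, dataset: str, table: str):
    """Look up the entry for a BigQuery table using a DataCatalogClient."""
    resource = bigquery_table_resource(project, dataset, table)
    return client.lookup_entry(request={"linked_resource": resource})


print(bigquery_table_resource("my-project", "my_dataset", "my_table"))
```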

Policy Tag Management

Data governance through hierarchical taxonomies and policy tags for fine-grained access control. Enables creation and management of data classification policies.

class PolicyTagManagerClient:
    def create_taxonomy(self, request: CreateTaxonomyRequest = None, **kwargs) -> Taxonomy: ...
    def get_taxonomy(self, request: GetTaxonomyRequest = None, **kwargs) -> Taxonomy: ...
    def update_taxonomy(self, request: UpdateTaxonomyRequest = None, **kwargs) -> Taxonomy: ...
    def delete_taxonomy(self, request: DeleteTaxonomyRequest = None, **kwargs) -> None: ...
    def list_taxonomies(self, request: ListTaxonomiesRequest = None, **kwargs) -> ListTaxonomiesPager: ...
    def create_policy_tag(self, request: CreatePolicyTagRequest = None, **kwargs) -> PolicyTag: ...
    def get_policy_tag(self, request: GetPolicyTagRequest = None, **kwargs) -> PolicyTag: ...
    def update_policy_tag(self, request: UpdatePolicyTagRequest = None, **kwargs) -> PolicyTag: ...
    def delete_policy_tag(self, request: DeletePolicyTagRequest = None, **kwargs) -> None: ...
    def list_policy_tags(self, request: ListPolicyTagsRequest = None, **kwargs) -> ListPolicyTagsPager: ...
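
The hierarchy comes from `parent_policy_tag`: each child policy tag names its parent. The sketch below creates a taxonomy and then walks a nested dict to create the tags top-down. `create_taxonomy_with_tags` and the shape of `tag_tree` are assumptions made for this example. `client` is expected to be a `PolicyTagManagerClient`.

```python
def create_taxonomy_with_tags(client, parent: str, display_name: str, tag_tree: dict):
    """Create a taxonomy, then its policy tags from a nested dict.

    `tag_tree` maps each policy tag display name to a dict of its children.
    Returns the taxonomy name and the created policy tag names in creation order.
    """
    taxonomy = client.create_taxonomy(
        request={"parent": parent, "taxonomy": {"display_name": display_name}}
    )
    created = []

    def create_level(children: dict, parent_tag: str) -> None:
        for name, grandchildren in children.items():
            policy_tag = {"display_name": name}
            if parent_tag:
                # Link the child to the tag one level up.
                policy_tag["parent_policy_tag"] = parent_tag
            tag = client.create_policy_tag(
                request={"parent": taxonomy.name, "policy_tag": policy_tag}
            )
            created.append(tag.name)
            create_level(grandchildren, tag.name)

    create_level(tag_tree, "")
    return taxonomy.name, created
```

A call might look like `create_taxonomy_with_tags(client, "projects/my-project/locations/us", "Sensitivity", {"PII": {"Email": {}, "Phone": {}}})`, where the project and names are placeholders.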

Taxonomy Serialization

Import and export capabilities for taxonomies, enabling cross-regional taxonomy management and backup/restore operations.

class PolicyTagManagerSerializationClient:
    def replace_taxonomy(self, request: ReplaceTaxonomyRequest = None, **kwargs) -> Taxonomy: ...
    def import_taxonomies(self, request: ImportTaxonomiesRequest = None, **kwargs) -> ImportTaxonomiesResponse: ...
    def export_taxonomies(self, request: ExportTaxonomiesRequest = None, **kwargs) -> ExportTaxonomiesResponse: ...
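
As a sketch of how export and cross-region import requests might be assembled, the helpers below build plain request dicts. The helper names are hypothetical. The field names (`serialized_taxonomies`, `cross_regional_source`) reflect the v1 request messages as best understood here and should be checked against the library reference.

```python
def export_request(parent: str, taxonomy_names: list) -> dict:
    """Request dict for export_taxonomies that returns serialized taxonomies inline."""
    return {
        "parent": parent,
        "taxonomies": list(taxonomy_names),
        "serialized_taxonomies": True,
    }


def cross_region_import_request(dest_parent: str, source_taxonomy: str) -> dict:
    """Request dict for import_taxonomies that copies a taxonomy from another region."""
    return {
        "parent": dest_parent,
        "cross_regional_source": {"taxonomy": source_taxonomy},
    }


# Placeholder resource names for illustration only.
print(
    cross_region_import_request(
        "projects/my-project/locations/eu",
        "projects/my-project/locations/us/taxonomies/123",
    )
)
```

Either dict can be passed as `request=` to the matching `PolicyTagManagerSerializationClient` method.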

Tag Templates and Tags

Custom metadata schema definition and attachment to catalog resources. Tag templates define the structure of custom metadata that can be attached to entries.

class DataCatalogClient:
    def create_tag_template(self, request: CreateTagTemplateRequest = None, **kwargs) -> TagTemplate: ...
    def get_tag_template(self, request: GetTagTemplateRequest = None, **kwargs) -> TagTemplate: ...
    def update_tag_template(self, request: UpdateTagTemplateRequest = None, **kwargs) -> TagTemplate: ...
    def delete_tag_template(self, request: DeleteTagTemplateRequest = None, **kwargs) -> None: ...
    def create_tag(self, request: CreateTagRequest = None, **kwargs) -> Tag: ...
    def update_tag(self, request: UpdateTagRequest = None, **kwargs) -> Tag: ...
    def delete_tag(self, request: DeleteTagRequest = None, **kwargs) -> None: ...
    def list_tags(self, request: ListTagsRequest = None, **kwargs) -> ListTagsPager: ...
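
A tag carries one value per template field, and each value sits in a typed slot such as `string_value`, `double_value`, or `bool_value`. The sketch below maps plain Python values onto those slots. `build_tag` is a hypothetical helper, and it deliberately skips enum, timestamp, and rich text fields.

```python
def build_tag(template_name: str, values: dict) -> dict:
    """Build a Tag dict, choosing the TagField slot from each value's Python type."""
    fields = {}
    for field_id, value in values.items():
        if isinstance(value, bool):
            # Check bool before numbers: bool is a subclass of int.
            fields[field_id] = {"bool_value": value}
        elif isinstance(value, (int, float)):
            fields[field_id] = {"double_value": float(value)}
        elif isinstance(value, str):
            fields[field_id] = {"string_value": value}
        else:
            raise TypeError(f"unsupported value for field {field_id!r}: {value!r}")
    return {"template": template_name, "fields": fields}


# Placeholder template name and field ids.
tag = build_tag(
    "projects/my-project/locations/us-central1/tagTemplates/governance",
    {"owner": "data-team", "row_count": 1200, "has_pii": True},
)
print(tag)
```

The result can be attached to an entry with `client.create_tag(request={"parent": entry_name, "tag": tag})`, where `entry_name` is the entry's resource name.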

Entry Metadata Management

Edit an entry's overview text and contacts, and star or unstar entries for the current user.

class DataCatalogClient:
    def modify_entry_overview(self, request: ModifyEntryOverviewRequest = None, **kwargs) -> EntryOverview: ...
    def modify_entry_contacts(self, request: ModifyEntryContactsRequest = None, **kwargs) -> Contacts: ...
    def star_entry(self, request: StarEntryRequest = None, **kwargs) -> StarEntryResponse: ...
    def unstar_entry(self, request: UnstarEntryRequest = None, **kwargs) -> UnstarEntryResponse: ...
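
The three calls can be combined when onboarding an entry. This is a sketch under the assumption that `client` is a `DataCatalogClient` and that the entry already exists. `annotate_entry` is a hypothetical helper, and the "Data Steward" designation is just an example value.

```python
def annotate_entry(client, entry_name: str, overview: str, steward_email: str) -> None:
    """Set an entry's overview and contact, then star it for the current user."""
    client.modify_entry_overview(
        request={"name": entry_name, "entry_overview": {"overview": overview}}
    )
    client.modify_entry_contacts(
        request={
            "name": entry_name,
            "contacts": {
                "people": [{"designation": "Data Steward", "email": steward_email}]
            },
        }
    )
    client.star_entry(request={"name": entry_name})
```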

Bulk Operations

Long-running operations for bulk entry import and tag reconciliation, designed for large-scale metadata management tasks.

class DataCatalogClient:
    def import_entries(self, request: ImportEntriesRequest = None, **kwargs) -> Operation: ...
    def reconcile_tags(self, request: ReconcileTagsRequest = None, **kwargs) -> Operation: ...
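
Both methods return an operation handle rather than a final result. A minimal sketch of starting an import and blocking until it finishes follows. `import_entries_and_wait` is a hypothetical helper, the default timeout is an arbitrary choice, and `client` is assumed to be a `DataCatalogClient`.

```python
def import_entries_and_wait(
    client, entry_group: str, gcs_bucket_path: str, timeout: float = 600.0
):
    """Start a bulk entry import and wait for the long-running operation."""
    operation = client.import_entries(
        request={"parent": entry_group, "gcs_bucket_path": gcs_bucket_path}
    )
    # result() blocks until the operation completes, raising if the
    # operation fails or the timeout elapses.
    return operation.result(timeout=timeout)
```

For work that should not block, keep the returned operation and poll it with `operation.done()` instead of calling `result()` immediately.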

Core Types

# Primary Resources
class Entry:
    name: str
    linked_resource: str
    fully_qualified_name: str
    display_name: str
    description: str
    business_context: BusinessContext
    schema: Schema
    source_system_timestamps: SystemTimestamps
    usage_signal: UsageSignal
    integrated_system: IntegratedSystem
    user_specified_type: str
    user_specified_system: str
    personal_details: PersonalDetails
    contacts: Contacts
    labels: MutableMapping[str, str]
    type_: EntryType

class EntryGroup:
    name: str
    display_name: str
    description: str
    data_catalog_timestamps: SystemTimestamps

class Tag:
    name: str
    template: str
    template_display_name: str
    column: str
    fields: MutableMapping[str, TagField]

class TagTemplate:
    name: str
    display_name: str
    is_publicly_readable: bool
    fields: MutableMapping[str, TagTemplateField]
    dataplex_transfer_status: DataplexTransferStatus

class Taxonomy:
    name: str
    display_name: str
    description: str
    policy_tag_count: int
    taxonomy_timestamps: SystemTimestamps
    activated_policy_types: Sequence[PolicyType]
    service: ManagingSystem

class PolicyTag:
    name: str
    display_name: str
    description: str
    parent_policy_tag: str
    child_policy_tags: Sequence[str]

# Search and Response Types
class SearchCatalogResult:
    search_result_type: SearchResultType
    search_result_subtype: str
    relative_resource_name: str
    linked_resource: str
    modify_time: timestamp_pb2.Timestamp
    integrated_system: IntegratedSystem
    user_specified_system: str
    fully_qualified_name: str
    display_name: str
    description: str

class Schema:
    columns: Sequence[ColumnSchema]

class ColumnSchema:
    column: str
    type_: str
    description: str
    mode: str
    default_value: str
    ordinal_position: int
    highest_indexing_type: IndexingType
    subcolumns: Sequence['ColumnSchema']
    looker_column_spec: LookerColumnSpec
    range_element_type: RangeElementType
    gc_rule: str

# IAM Types
class Policy:
    version: int
    bindings: Sequence[Binding]
    etag: bytes

class Binding:
    role: str
    members: Sequence[str]
    condition: Expr

class SetIamPolicyRequest:
    resource: str
    policy: Policy

class GetIamPolicyRequest:
    resource: str
    options: GetPolicyOptions

class TestIamPermissionsRequest:
    resource: str
    permissions: Sequence[str]

class TestIamPermissionsResponse:
    permissions: Sequence[str]
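
The IAM types are typically used in a read-modify-write cycle: fetch the policy, change its bindings, and write it back. The sketch below grants one role to one member. `grant_role` is a hypothetical helper, and it assumes the returned policy behaves like the protobuf `Policy` message, whose `bindings` support iteration and `add(...)`.

```python
def grant_role(client, resource: str, role: str, member: str):
    """Add `member` to `role` on `resource`, creating the binding if needed."""
    policy = client.get_iam_policy(request={"resource": resource})
    for binding in policy.bindings:
        if binding.role == role:
            if member not in binding.members:
                binding.members.append(member)
            break
    else:
        # No existing binding for this role, so add a new one.
        policy.bindings.add(role=role, members=[member])
    return client.set_iam_policy(request={"resource": resource, "policy": policy})
```

A call could look like `grant_role(client, template_name, "roles/datacatalog.tagTemplateUser", "user:alice@example.com")`, with `template_name` and the member as placeholders.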

Enums

class EntryType(proto.Enum):
    ENTRY_TYPE_UNSPECIFIED = 0
    TABLE = 2
    MODEL = 5
    DATA_STREAM = 3
    FILESET = 4
    CLUSTER = 6
    DATABASE = 7
    DATA_SOURCE_CONNECTION = 8
    ROUTINE = 9
    LAKE = 10
    ZONE = 11
    SERVICE = 14
    DATABASE_SCHEMA = 15
    DASHBOARD = 16
    EXPLORE = 17
    LOOK = 18

class SearchResultType(proto.Enum):
    SEARCH_RESULT_TYPE_UNSPECIFIED = 0
    ENTRY = 1
    TAG_TEMPLATE = 2
    ENTRY_GROUP = 3

class IntegratedSystem(proto.Enum):
    INTEGRATED_SYSTEM_UNSPECIFIED = 0
    BIGQUERY = 1
    CLOUD_PUBSUB = 2
    DATAPROC_METASTORE = 3
    DATAPLEX = 4
    CLOUD_SPANNER = 6
    CLOUD_BIGTABLE = 7
    CLOUD_SQL = 8
    LOOKER = 9
    VERTEX_AI = 10

Pager Types

class SearchCatalogPager:
    """Pager for search_catalog method results"""
    def __iter__(self) -> Iterator[SearchCatalogResult]: ...

class ListEntryGroupsPager:
    """Pager for list_entry_groups method results"""
    def __iter__(self) -> Iterator[EntryGroup]: ...

class ListEntriesPager:
    """Pager for list_entries method results"""
    def __iter__(self) -> Iterator[Entry]: ...

class ListTagsPager:
    """Pager for list_tags method results"""
    def __iter__(self) -> Iterator[Tag]: ...

class ListTaxonomiesPager:
    """Pager for list_taxonomies method results"""
    def __iter__(self) -> Iterator[Taxonomy]: ...

class ListPolicyTagsPager:
    """Pager for list_policy_tags method results"""
    def __iter__(self) -> Iterator[PolicyTag]: ...
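
Because pagers fetch pages lazily as they are iterated, stopping early avoids requesting pages that are never used. The sketch below shows this with `itertools.islice`. `first_n` is a hypothetical helper, and `fake_pager` is a generator that stands in for a real pager and records each simulated page fetch.

```python
from itertools import islice


def first_n(pager, n: int) -> list:
    """Return at most `n` items, consuming the pager only as far as needed."""
    return list(islice(pager, n))


fetched_pages = []


def fake_pager(page_size: int = 2, pages: int = 3):
    """Stand-in for a pager: each 'page fetch' is recorded in fetched_pages."""
    for page in range(pages):
        fetched_pages.append(page)
        for i in range(page_size):
            yield f"entry-{page}-{i}"


first_three = first_n(fake_pager(), 3)
print(first_three, fetched_pages)
```

Here the third item comes from the second page, so two of the three simulated pages are fetched and the last one is never requested.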