tessl/pypi-google-cloud-dataproc-metastore

Google Cloud Dataproc Metastore API client library for managing fully managed, highly available metastore services

- **Workspace**: tessl
- **Visibility**: Public
- **Describes**: pypipkg:pypi/google-cloud-dataproc-metastore@1.19.x

To install, run

npx @tessl/cli install tessl/pypi-google-cloud-dataproc-metastore@1.19.0

# Google Cloud Dataproc Metastore

A Python client library for Google Cloud Dataproc Metastore, a fully managed, highly available, autoscaled, autohealing, OSS-native metastore service that greatly simplifies technical metadata management. Built on Apache Hive metastore, it serves as a critical component for enterprise data lakes.

## Package Information

- **Package Name**: google-cloud-dataproc-metastore
- **Language**: Python
- **Installation**: `pip install google-cloud-dataproc-metastore`

## Core Imports

```python
from google.cloud import metastore
```

Version-specific imports:

```python
from google.cloud import metastore_v1
from google.cloud import metastore_v1alpha
from google.cloud import metastore_v1beta
```

## Basic Usage

```python
from google.cloud import metastore

# Initialize the client
client = metastore.DataprocMetastoreClient()

# List all metastore services in a location
parent = "projects/my-project/locations/us-central1"
services = client.list_services(parent=parent)

for service in services:
    print(f"Service: {service.name}")
    print(f"State: {service.state}")
    print(f"Endpoint URI: {service.endpoint_uri}")

# Get a specific service
service_name = "projects/my-project/locations/us-central1/services/my-metastore"
service = client.get_service(name=service_name)
print(f"Service tier: {service.tier}")
print(f"Hive version: {service.hive_metastore_config.version}")

# Create a new backup
backup_request = metastore.CreateBackupRequest(
    parent="projects/my-project/locations/us-central1/services/my-metastore",
    backup_id="my-backup",
    backup=metastore.Backup(
        description="Automated backup for disaster recovery"
    )
)
operation = client.create_backup(request=backup_request)
backup = operation.result()  # Wait for completion
print(f"Backup created: {backup.name}")
```

## Architecture

The Google Cloud Dataproc Metastore client library follows Google's standard client library patterns:

- **Client Classes**: Synchronous and asynchronous clients for different API versions
- **Resource Management**: Standardized CRUD operations for services, backups, and federations
- **Long-Running Operations**: Built-in support for async operations with progress tracking
- **Authentication**: Integrated with Google Cloud authentication (ADC, service accounts)
- **Error Handling**: Comprehensive error handling with retry logic and timeout configuration
- **Paging**: Automatic handling of paginated API responses
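
The paging item above can be illustrated without the library. The sketch below is a stand-in, not the client's implementation: `fetch_page`, `make_fake_backend`, and the dictionary keys `items` and `next_page_token` are hypothetical names chosen for this example. It shows the general page-token loop that an iterable result such as the one returned by `list_services` is assumed to hide behind plain iteration.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical page shape: {"items": [...], "next_page_token": "..."}.
Page = Dict[str, object]


def iterate_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every item across pages by following next_page_token."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        for item in page["items"]:
            yield item
        # An empty or missing token marks the last page.
        token = page.get("next_page_token") or None
        if token is None:
            break


def make_fake_backend(names: List[str], page_size: int) -> Callable[[Optional[str]], Page]:
    """Build an in-memory stand-in for a paginated list endpoint."""

    def fetch_page(token: Optional[str]) -> Page:
        start = int(token) if token else 0
        end = start + page_size
        next_token = str(end) if end < len(names) else ""
        return {"items": names[start:end], "next_page_token": next_token}

    return fetch_page


fake = make_fake_backend(["svc-a", "svc-b", "svc-c"], page_size=2)
all_names = list(iterate_all(fake))
```

With the real client, the `for service in services:` loop in Basic Usage plays the role of `iterate_all`; callers normally do not handle tokens themselves.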

## Capabilities

### Service Management

Comprehensive lifecycle management for Dataproc Metastore services including creation, configuration, updates, and deletion. Supports multiple service tiers and Hive metastore versions with advanced networking and security options.

```python { .api }
class DataprocMetastoreClient:
    def list_services(self, request=None, *, parent=None, **kwargs): ...
    def get_service(self, request=None, *, name=None, **kwargs): ...
    def create_service(self, request=None, *, parent=None, service=None, service_id=None, **kwargs): ...
    def update_service(self, request=None, *, service=None, update_mask=None, **kwargs): ...
    def delete_service(self, request=None, *, name=None, **kwargs): ...
```

[Service Management](./service-management.md)
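
As a rough illustration of the creation path, the sketch below calls `create_service` with the `parent`, `service`, and `service_id` arguments listed above. The helper names `location_path` and `create_developer_service` are hypothetical, the Hive version `"3.1.2"` is an assumed value, and the call needs the library installed plus working credentials, so treat it as a starting point rather than a reference implementation.

```python
def location_path(project_id: str, location: str) -> str:
    """Build the parent resource name passed to create_service."""
    return f"projects/{project_id}/locations/{location}"


def create_developer_service(project_id: str, location: str, service_id: str):
    """Create a DEVELOPER-tier service and block until the operation finishes.

    Requires google-cloud-dataproc-metastore and valid credentials.
    """
    # Imported lazily so location_path stays usable without the library.
    from google.cloud import metastore

    client = metastore.DataprocMetastoreClient()
    service = metastore.Service(
        tier=metastore.Service.Tier.DEVELOPER,
        # Assumed version string; pick one your project supports.
        hive_metastore_config=metastore.HiveMetastoreConfig(version="3.1.2"),
    )
    operation = client.create_service(
        parent=location_path(project_id, location),
        service=service,
        service_id=service_id,
    )
    # result() waits for the long-running operation and returns the Service.
    return operation.result()
```

Service creation is a long-running operation, so `operation.result()` may block for a while; passing a `timeout` to `result()` is one way to bound the wait.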

### Backup and Restore Operations

Complete backup and restore functionality for metastore services including scheduled backups, point-in-time recovery, and cross-region backup management for disaster recovery scenarios.

```python { .api }
class DataprocMetastoreClient:
    def list_backups(self, request=None, *, parent=None, **kwargs): ...
    def get_backup(self, request=None, *, name=None, **kwargs): ...
    def create_backup(self, request=None, *, parent=None, backup=None, backup_id=None, **kwargs): ...
    def delete_backup(self, request=None, *, name=None, **kwargs): ...
    def restore_service(self, request=None, *, service=None, backup=None, **kwargs): ...
```

[Backup and Restore](./backup-restore.md)
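
One way a scheduled job might name its backups is to derive the `backup_id` from a UTC timestamp. The sketch below assumes that approach: `make_backup_id` and `create_timestamped_backup` are hypothetical helpers, not part of the library, and the second function needs the library and credentials to run.

```python
from datetime import datetime, timezone


def make_backup_id(prefix: str, when: datetime) -> str:
    """Return an ID such as 'nightly-20240131-020000' from a UTC timestamp."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d-%H%M%S")
    return f"{prefix}-{stamp}"


def create_timestamped_backup(service_name: str, prefix: str = "scheduled"):
    """Create a backup of service_name with a time-based ID and wait for it.

    Requires google-cloud-dataproc-metastore and valid credentials.
    """
    # Imported lazily so make_backup_id stays usable without the library.
    from google.cloud import metastore

    client = metastore.DataprocMetastoreClient()
    operation = client.create_backup(
        parent=service_name,
        backup=metastore.Backup(description="Created by a scheduled job"),
        backup_id=make_backup_id(prefix, datetime.now(timezone.utc)),
    )
    return operation.result()


example_id = make_backup_id(
    "nightly", datetime(2024, 1, 31, 2, 0, 0, tzinfo=timezone.utc)
)
```

Time-based IDs sort chronologically and avoid collisions between runs, which makes pruning old backups with `list_backups` and `delete_backup` simpler.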

### Metadata Import and Export

Import metadata from external sources and export metastore data to Google Cloud Storage. Supports various database formats including MySQL and PostgreSQL dumps with comprehensive validation and error handling.

```python { .api }
class DataprocMetastoreClient:
    def list_metadata_imports(self, request=None, *, parent=None, **kwargs): ...
    def get_metadata_import(self, request=None, *, name=None, **kwargs): ...
    def create_metadata_import(self, request=None, *, parent=None, metadata_import=None, metadata_import_id=None, **kwargs): ...
    def update_metadata_import(self, request=None, *, metadata_import=None, update_mask=None, **kwargs): ...
    def export_metadata(self, request=None, *, service=None, **kwargs): ...
```

[Metadata Import and Export](./metadata-import-export.md)
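
A minimal sketch of an export to Cloud Storage follows, assuming the request message `ExportMetadataRequest` with its `service` and `destination_gcs_folder` fields. The helpers `gcs_folder_uri` and `export_to_gcs` are hypothetical names for this example, and the bucket check is deliberately simplistic rather than a full validation of bucket naming rules.

```python
def gcs_folder_uri(bucket: str, folder: str) -> str:
    """Return a gs:// URI for a folder, rejecting obviously bad bucket names."""
    if not bucket or "/" in bucket:
        raise ValueError(f"invalid bucket name: {bucket!r}")
    return f"gs://{bucket}/{folder.strip('/')}"


def export_to_gcs(service_name: str, bucket: str, folder: str):
    """Export a service's metadata to a Cloud Storage folder and wait.

    Requires google-cloud-dataproc-metastore and valid credentials.
    """
    # Imported lazily so gcs_folder_uri stays usable without the library.
    from google.cloud import metastore

    client = metastore.DataprocMetastoreClient()
    request = metastore.ExportMetadataRequest(
        service=service_name,
        destination_gcs_folder=gcs_folder_uri(bucket, folder),
    )
    operation = client.export_metadata(request=request)
    return operation.result()
```

Building the request object explicitly keeps the destination visible at the call site, which helps when the same job exports several services to different folders.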

### Federation Management

Manage metastore federation services that provide unified access to multiple backend metastores. Supports cross-cloud and multi-region federation scenarios for enterprise data lake architectures.

```python { .api }
class DataprocMetastoreFederationClient:
    def list_federations(self, request=None, *, parent=None, **kwargs): ...
    def get_federation(self, request=None, *, name=None, **kwargs): ...
    def create_federation(self, request=None, *, parent=None, federation=None, federation_id=None, **kwargs): ...
    def update_federation(self, request=None, *, federation=None, update_mask=None, **kwargs): ...
    def delete_federation(self, request=None, *, name=None, **kwargs): ...
```

[Federation Management](./federation-management.md)

### Metadata Query Operations

Execute Hive and Spark SQL queries directly against metastore metadata for advanced analytics and metadata management operations including table movement and resource location management.

```python { .api }
class DataprocMetastoreClient:
    def query_metadata(self, request=None, *, service=None, query=None, **kwargs): ...
    def move_table_to_database(self, request=None, *, service=None, table_name=None, db_name=None, destination_db_name=None, **kwargs): ...
    def alter_metadata_resource_location(self, request=None, *, service=None, resource_name=None, location_uri=None, **kwargs): ...
```

[Metadata Query Operations](./metadata-query.md)

### Asynchronous Operations

Asynchronous client implementations for all operations with full async/await support, enabling high-performance concurrent operations and integration with async Python frameworks.

```python { .api }
class DataprocMetastoreAsyncClient:
    async def list_services(self, request=None, *, parent=None, **kwargs): ...
    async def get_service(self, request=None, *, name=None, **kwargs): ...
    async def create_service(self, request=None, *, parent=None, service=None, service_id=None, **kwargs): ...
    # ... all methods have async equivalents
```

[Asynchronous Operations](./async-operations.md)
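
The concurrency benefit can be sketched with `asyncio.gather`. The example below uses a hypothetical stand-in coroutine, `fake_get_service`, so it runs without the library or network access; `fetch_all` is likewise an illustrative helper, written to accept any coroutine function that takes a `name` keyword, as `get_service` does in the listing above.

```python
import asyncio
from typing import Awaitable, Callable, List


async def fetch_all(
    get_one: Callable[..., Awaitable[object]], names: List[str]
) -> List[object]:
    """Issue one request per name concurrently; results keep input order."""
    return await asyncio.gather(*(get_one(name=name) for name in names))


async def fake_get_service(*, name: str) -> str:
    """Hypothetical stand-in for an async get_service call."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"fetched:{name}"


results = asyncio.run(fetch_all(fake_get_service, ["svc-a", "svc-b"]))
```

With the real library, the bound method `DataprocMetastoreAsyncClient().get_service` could be passed as `get_one`, provided the client is created and used inside the same running event loop.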

## Common Types

```python { .api }
# Service states
class Service:
    class State(enum.Enum):
        CREATING = 1
        ACTIVE = 2
        SUSPENDING = 3
        SUSPENDED = 4
        UPDATING = 5
        DELETING = 6
        ERROR = 7

    class Tier(enum.Enum):
        DEVELOPER = 1
        ENTERPRISE = 3

    class ReleaseChannel(enum.Enum):
        CANARY = 1
        STABLE = 2

# Configuration classes
class HiveMetastoreConfig:
    version: str
    config_overrides: Dict[str, str]
    kerberos_config: Optional[KerberosConfig]
    auxiliary_versions: List[AuxiliaryVersionConfig]

class NetworkConfig:
    consumers: List[NetworkConsumer]
    enable_private_ip: bool

class EncryptionConfig:
    kms_key: str

# Resource path helpers
def service_path(project: str, location: str, service: str) -> str: ...
def backup_path(project: str, location: str, service: str, backup: str) -> str: ...
def federation_path(project: str, location: str, federation: str) -> str: ...
```
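
To make the resource path helpers concrete, the sketch below re-implements the formatting in plain Python, following the resource-name layout used in Basic Usage (`projects/.../locations/.../services/...`). These are local stand-ins written for illustration, not the library's own functions, and `parse_service_path` is an assumed inverse helper added for the example.

```python
import re
from typing import Dict

_SERVICE_RE = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/services/(?P<service>[^/]+)$"
)


def service_path(project: str, location: str, service: str) -> str:
    """Format a fully qualified service resource name."""
    return f"projects/{project}/locations/{location}/services/{service}"


def backup_path(project: str, location: str, service: str, backup: str) -> str:
    """Format a backup resource name nested under its service."""
    return f"{service_path(project, location, service)}/backups/{backup}"


def parse_service_path(path: str) -> Dict[str, str]:
    """Split a service resource name into its parts; empty dict if no match."""
    match = _SERVICE_RE.match(path)
    return match.groupdict() if match else {}


name = service_path("my-project", "us-central1", "my-metastore")
parts = parse_service_path(name)
```

Using helpers like these instead of hand-written f-strings keeps the segment order in one place and makes malformed names easier to spot.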