
# Operations Management


Monitor and manage long-running Google Kubernetes Engine (GKE) operations. This module covers tracking cluster and node pool changes, checking operation status, and cancelling operations that are still in progress.


## Capabilities


### Listing Operations


Retrieve all operations in a project within a specified zone or across all zones.


```python { .api }
def list_operations(
    self,
    request=None, *,
    project_id=None,
    zone=None,
    parent=None,
    retry=gapic_v1.method.DEFAULT,
    timeout=None,
    metadata=()
) -> ListOperationsResponse:
    """
    Lists all operations in a project in a specific zone or all zones.

    Args:
        project_id (str): Deprecated. The Google Developers Console project ID or project number.
        zone (str): Deprecated. The name of the Google Compute Engine zone.
        parent (str): The parent (project and location) where the operations will be listed.
            Format: projects/{project_id}/locations/{location}
        retry: Retry configuration.
        timeout (float): Request timeout in seconds.
        metadata: Additional gRPC metadata.

    Returns:
        ListOperationsResponse: Response containing the list of operations.
    """
```


Usage example:


```python
from google.cloud import container

client = container.ClusterManagerClient()

# List all operations in a zone
operations = client.list_operations(
    project_id="my-project",
    zone="us-central1-a"
)

# Or use the new parent format
operations = client.list_operations(
    parent="projects/my-project/locations/us-central1-a"
)

for operation in operations.operations:
    print(f"Operation: {operation.name}")
    print(f"Type: {operation.operation_type}")
    print(f"Status: {operation.status}")
    print(f"Target: {operation.target_link}")
    if operation.start_time:
        print(f"Started: {operation.start_time}")
    if operation.end_time:
        print(f"Ended: {operation.end_time}")
```
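Both calls above target a single zone, while the description also mentions listing across all zones. As a minimal sketch, and assuming the GKE convention that a location of `-` matches all zones and regions, the parent string can be built with a small helper. `operations_parent` and `count_by_status` are hypothetical names for illustration, not part of the client library.

```python
from collections import Counter


def operations_parent(project_id, location="-"):
    """Build a parent string; "-" is assumed to match every location."""
    if not project_id:
        raise ValueError("project_id must be non-empty")
    return f"projects/{project_id}/locations/{location}"


def count_by_status(operations):
    """Tally operations by status name.

    Accepts enum-like statuses (anything with a .name) and plain strings.
    """
    return Counter(getattr(op.status, "name", op.status) for op in operations)
```

A call such as `client.list_operations(parent=operations_parent("my-project"))` would then cover every location, and `count_by_status(response.operations)` gives a quick overview. When listing across locations, it may also be worth checking `response.missing_zones`, since the result can be incomplete for any zone listed there.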


### Getting Operation Details


Retrieve detailed information about a specific operation.


```python { .api }
def get_operation(
    self,
    request=None, *,
    project_id=None,
    zone=None,
    operation_id=None,
    name=None,
    retry=gapic_v1.method.DEFAULT,
    timeout=None,
    metadata=()
) -> Operation:
    """
    Gets the specified operation.

    Args:
        project_id (str): Deprecated. The Google Developers Console project ID or project number.
        zone (str): Deprecated. The name of the Google Compute Engine zone.
        operation_id (str): Deprecated. The server-assigned name of the operation.
        name (str): The name (project, location, operation) of the operation to get.
            Format: projects/{project_id}/locations/{location}/operations/{operation_id}
        retry: Retry configuration.
        timeout (float): Request timeout in seconds.
        metadata: Additional gRPC metadata.

    Returns:
        Operation: The operation information.
    """
```


Usage example:


```python
import time

from google.cloud import container

client = container.ClusterManagerClient()

operation = client.get_operation(
    project_id="my-project",
    zone="us-central1-a",
    operation_id="operation-1234567890123-5f2a7b4d-a1b2c3d4"
)

print(f"Operation name: {operation.name}")
print(f"Operation type: {operation.operation_type}")
print(f"Status: {operation.status}")
print(f"Detail: {operation.detail}")

if operation.progress:
    print(f"Progress: {operation.progress}")

# error is a google.rpc Status; a non-zero code indicates a failure
if operation.error.code:
    print(f"Error: {operation.error}")

# Monitor operation status
def wait_for_operation(client, operation_name):
    """Wait for an operation to complete."""
    while True:
        op = client.get_operation(name=operation_name)

        # status is an Operation.Status enum member, not a string
        if op.status == container.Operation.Status.DONE:
            if op.error.code:
                print(f"Operation failed: {op.error}")
                return False
            else:
                print("Operation completed successfully")
                return True
        elif op.status == container.Operation.Status.ABORTING:
            print("Operation is aborting")
            return False
        else:
            print(f"Operation status: {op.status}")
            time.sleep(10)  # Wait 10 seconds before checking again

# Example usage
operation = client.create_cluster(...)
# operation.name is the server-assigned ID; get_operation(name=...) expects
# the full resource name, so build it from the project and location.
operation_name = (
    f"projects/my-project/locations/us-central1-a/operations/{operation.name}"
)
success = wait_for_operation(client, operation_name)
```
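The loop above polls at a fixed interval and never gives up. One possible refinement, sketched here under the assumption that a deadline and exponential backoff are acceptable for your workload, is a generic poller. `poll_until` and its parameters are hypothetical and not part of the client library; the `sleep` and `clock` arguments exist only so the helper can be exercised without real waiting.

```python
import time


def poll_until(fetch, is_finished, timeout=600.0, initial_delay=2.0,
               max_delay=30.0, sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until is_finished(result) is true or the timeout elapses.

    The delay doubles after each unfinished attempt, capped at max_delay.
    Returns the last fetched result; raises TimeoutError on expiry.
    """
    deadline = clock() + timeout
    delay = initial_delay
    while True:
        result = fetch()
        if is_finished(result):
            return result
        remaining = deadline - clock()
        if remaining <= 0:
            raise TimeoutError(f"operation not finished after {timeout} seconds")
        # Never sleep past the deadline.
        sleep(min(delay, remaining))
        delay = min(delay * 2, max_delay)
```

With the client from the example, this could be used as `poll_until(lambda: client.get_operation(name=operation_name), lambda op: op.status == container.Operation.Status.DONE)`. The caller still has to inspect the returned operation to see whether it succeeded.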


### Cancelling Operations


Cancel a running operation.


```python { .api }
def cancel_operation(
    self,
    request=None, *,
    project_id=None,
    zone=None,
    operation_id=None,
    name=None,
    retry=gapic_v1.method.DEFAULT,
    timeout=None,
    metadata=()
) -> None:
    """
    Cancels the specified operation.

    Args:
        project_id (str): Deprecated. The Google Developers Console project ID or project number.
        zone (str): Deprecated. The name of the Google Compute Engine zone.
        operation_id (str): Deprecated. The server-assigned name of the operation.
        name (str): The name (project, location, operation) of the operation to cancel.
            Format: projects/{project_id}/locations/{location}/operations/{operation_id}
        retry: Retry configuration.
        timeout (float): Request timeout in seconds.
        metadata: Additional gRPC metadata.
    """
```


Usage example:


```python
# Cancel a running operation
client.cancel_operation(
    project_id="my-project",
    zone="us-central1-a",
    operation_id="operation-1234567890123-5f2a7b4d-a1b2c3d4"
)

print("Operation cancellation requested")

# Verify cancellation
operation = client.get_operation(
    project_id="my-project",
    zone="us-central1-a",
    operation_id="operation-1234567890123-5f2a7b4d-a1b2c3d4"
)

print(f"Operation status after cancellation: {operation.status}")
```
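The example above uses the deprecated `project_id`, `zone`, and `operation_id` arguments, while the signature documents `name` as the current form. As a small sketch, the documented format `projects/{project_id}/locations/{location}/operations/{operation_id}` can be assembled and taken apart with two helpers. `operation_path` and `parse_operation_path` are hypothetical names introduced here, not functions provided by the client.

```python
import re

# Matches the documented full operation name format.
_OPERATION_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/operations/(?P<operation>[^/]+)$"
)


def operation_path(project_id, location, operation_id):
    """Return the full resource name for an operation."""
    return f"projects/{project_id}/locations/{location}/operations/{operation_id}"


def parse_operation_path(name):
    """Split a full operation name into project, location, and operation."""
    match = _OPERATION_NAME.match(name)
    if match is None:
        raise ValueError(f"not a full operation name: {name!r}")
    return match.groupdict()
```

The result can be passed as `client.cancel_operation(name=operation_path("my-project", "us-central1-a", operation_id))` and reused unchanged for the follow-up `get_operation` call.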


### Monitoring Operation Progress


GKE operations are long-running and can take several minutes to complete. To monitor progress, poll `get_operation` until the operation finishes:


```python
import time

from google.cloud import container

def monitor_operation_with_callback(client, operation_name, callback=None):
    """
    Monitor operation with optional progress callback.

    Args:
        client: ClusterManagerClient instance
        operation_name: Full operation name
        callback: Optional function called with operation progress

    Returns:
        Boolean indicating success/failure
    """
    while True:
        operation = client.get_operation(name=operation_name)

        # Call progress callback if provided
        if callback:
            callback(operation)

        # status is an Operation.Status enum member, not a string
        if operation.status == container.Operation.Status.DONE:
            # error is a google.rpc Status; a non-zero code indicates a failure
            if operation.error.code:
                print(f"Operation failed: {operation.error.message}")
                return False
            else:
                print("Operation completed successfully")
                return True

        elif operation.status == container.Operation.Status.ABORTING:
            print(f"Operation is aborting: {operation.status_message}")
            return False

        else:
            # Operation is still running
            progress_info = []
            if operation.progress:
                for stage in operation.progress.stages:
                    progress_info.append(f"{stage.name}: {stage.status}")

            status_msg = f"Status: {operation.status}"
            if progress_info:
                status_msg += f" - {', '.join(progress_info)}"
            if operation.status_message:
                status_msg += f" - {operation.status_message}"

            print(status_msg)
            time.sleep(15)  # Check every 15 seconds

# Example with progress callback
def progress_callback(operation):
    print(f"Operation {operation.name}: {operation.status}")
    if operation.progress and operation.progress.stages:
        for stage in operation.progress.stages:
            print(f"  Stage {stage.name}: {stage.status}")

# Use the monitor
operation = client.create_cluster(...)
# operation.name is the server-assigned ID; build the full resource name from it
operation_name = (
    f"projects/my-project/locations/us-central1-a/operations/{operation.name}"
)
success = monitor_operation_with_callback(
    client,
    operation_name,
    callback=progress_callback
)
```
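The monitor above only reports the first level of stages, but `OperationProgress.stages` can itself contain substages. A minimal sketch of a recursive walk follows. It is duck-typed, assuming only that each node exposes `name`, `status`, and `stages`; `iter_stages` and `format_progress` are hypothetical helper names.

```python
def iter_stages(progress, depth=0):
    """Yield (depth, name, status) for a progress node and all its substages."""
    yield depth, progress.name, progress.status
    for stage in progress.stages:
        yield from iter_stages(stage, depth + 1)


def format_progress(progress):
    """Render the stage tree as indented text, one stage per line."""
    lines = []
    for depth, name, status in iter_stages(progress):
        # Enum-like statuses expose .name; plain strings are used as-is.
        label = getattr(status, "name", status)
        lines.append(f"{'  ' * depth}{name}: {label}")
    return "\n".join(lines)
```

Inside a progress callback, `print(format_progress(operation.progress))` would show the whole tree instead of a single level.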


## Types


```python { .api }
class ListOperationsRequest:
    """ListOperationsRequest lists operations."""
    project_id: str  # Deprecated
    zone: str  # Deprecated
    parent: str  # Required. Format: projects/{project}/locations/{location}

class ListOperationsResponse:
    """ListOperationsResponse is the result of ListOperationsRequest."""
    operations: MutableSequence[Operation]
    missing_zones: MutableSequence[str]

class GetOperationRequest:
    """GetOperationRequest gets a single operation."""
    project_id: str  # Deprecated
    zone: str  # Deprecated
    operation_id: str  # Deprecated
    name: str  # Required. Format: projects/{project}/locations/{location}/operations/{operation}

class CancelOperationRequest:
    """CancelOperationRequest cancels a single operation."""
    project_id: str  # Deprecated
    zone: str  # Deprecated
    operation_id: str  # Deprecated
    name: str  # Required. Format: projects/{project}/locations/{location}/operations/{operation}

class Operation:
    """This operation resource represents operations that may have happened or are happening on the cluster."""
    name: str  # The server-assigned ID for the operation
    zone: str  # The name of the Google Compute Engine zone (deprecated)
    operation_type: Operation.Type  # The operation type (enum, see Operation Types)
    status: Operation.Status  # The current status of the operation (enum, see Operation Status Values)
    detail: str  # Detailed operation progress, if available
    status_message: str  # Output only. If an error has occurred, a textual description of the error
    self_link: str  # Server-defined URL for this resource
    target_link: str  # Server-defined URL for the target of the operation
    location: str  # The name of the Google Compute Engine location
    start_time: str  # The time the operation started
    end_time: str  # The time the operation completed
    progress: OperationProgress  # Output only. Progress information for an operation
    cluster_conditions: MutableSequence[StatusCondition]  # Which conditions caused the current cluster state
    nodepool_conditions: MutableSequence[StatusCondition]  # Which conditions caused the current node pool state
    error: Status  # The error result of the operation in case of failure (google.rpc Status)

class OperationProgress:
    """Information about operation (or operation stage) progress."""
    name: str  # A non-parameterized string describing an operation stage
    status: Operation.Status  # Status of an operation stage
    metrics: MutableSequence[OperationProgress.Metric]  # Progress metric bundle
    stages: MutableSequence[OperationProgress]  # Substages of an operation or a stage

    class Metric:
        """Progress metric is (string, int|float|string) pair."""
        name: str  # Required. Metric name
        int_value: int  # For metrics with integer value
        double_value: float  # For metrics with floating point value
        string_value: str  # For metrics with string value

class StatusCondition:
    """StatusCondition describes why a cluster or a node pool has a certain status."""
    code: StatusCondition.Code  # Machine-friendly representation of the condition
    message: str  # Human-friendly representation of the condition
    canonical_code: Code  # Canonical code of the condition (google.rpc Code)
```
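Because `start_time` and `end_time` are strings, computing how long an operation took needs a parsing step. The sketch below assumes they are RFC 3339 timestamps such as `2024-01-01T00:00:00Z`, possibly with more than six fractional digits. `parse_rfc3339` and `operation_duration` are hypothetical helpers, and the extra digits are truncated to microseconds.

```python
import re
from datetime import datetime

# Date-time, optional fractional seconds, then "Z" or a numeric UTC offset.
_RFC3339 = re.compile(
    r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?(Z|[+-]\d{2}:\d{2})$"
)


def parse_rfc3339(value):
    """Parse an RFC 3339 timestamp into a timezone-aware datetime."""
    match = _RFC3339.match(value)
    if match is None:
        raise ValueError(f"not an RFC 3339 timestamp: {value!r}")
    base, fraction, tz = match.groups()
    # Keep at most microsecond precision and pad shorter fractions.
    micros = (fraction or "")[:6].ljust(6, "0")
    offset = "+00:00" if tz == "Z" else tz
    return datetime.fromisoformat(f"{base}.{micros}{offset}")


def operation_duration(operation):
    """Return end_time - start_time as a timedelta, or None if either is unset."""
    if not operation.start_time or not operation.end_time:
        return None
    return parse_rfc3339(operation.end_time) - parse_rfc3339(operation.start_time)
```

For an operation that is still running, `end_time` is empty and the helper returns `None` rather than raising.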


## Operation Types


Common operation types you'll encounter:


- `CREATE_CLUSTER` - Cluster creation
- `DELETE_CLUSTER` - Cluster deletion
- `UPGRADE_MASTER` - Master version upgrade
- `UPGRADE_NODES` - Node version upgrade
- `REPAIR_CLUSTER` - Cluster repair
- `UPDATE_CLUSTER` - Cluster configuration update
- `CREATE_NODE_POOL` - Node pool creation
- `DELETE_NODE_POOL` - Node pool deletion
- `SET_NODE_POOL_MANAGEMENT` - Node pool management update
- `AUTO_REPAIR_NODES` - Automatic node repair
- `AUTO_UPGRADE_NODES` - Automatic node upgrade
- `SET_LABELS` - Label updates
- `SET_MASTER_AUTH` - Master authentication updates
- `SET_NODE_POOL_SIZE` - Node pool size changes
- `SET_NETWORK_POLICY` - Network policy updates
- `SET_MAINTENANCE_POLICY` - Maintenance policy updates


## Operation Status Values


- `STATUS_UNSPECIFIED` - Status is not set
- `PENDING` - Operation is queued
- `RUNNING` - Operation is in progress
- `DONE` - Operation has finished, either completed or cancelled; check `error` to tell success from failure
- `ABORTING` - Operation is being cancelled
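Since a finished operation can still carry an error, the status alone does not say whether it succeeded. The following sketch folds status and error into a single outcome label. It assumes that a non-zero `error.code` signals failure (the google.rpc convention) and works on any object with `status` and `error` attributes; `operation_outcome` is a hypothetical helper.

```python
def operation_outcome(operation):
    """Return "succeeded", "failed", "aborting", or "in_progress"."""
    # Enum-like statuses expose .name; plain strings are used as-is.
    status = getattr(operation.status, "name", operation.status)
    if status == "DONE":
        # Assumption: code 0 means OK, anything else is a failure.
        error_code = getattr(getattr(operation, "error", None), "code", 0)
        return "failed" if error_code else "succeeded"
    if status == "ABORTING":
        return "aborting"
    return "in_progress"
```

Keeping this decision in one place avoids repeating the status and error checks in every polling loop.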