# Toil Python Package

## Overview

Toil is a comprehensive Python pipeline management and workflow execution system designed for distributed computing environments. It provides robust job scheduling, cloud provisioning, and container execution capabilities across various batch systems including Slurm, LSF, Kubernetes, and local execution. Toil supports multiple workflow languages including CWL (Common Workflow Language) and WDL (Workflow Description Language), making it a versatile solution for scientific computing and data processing pipelines.

## Package Information

- **Package Name**: `toil`
- **Version**: 9.0.0 (managed dynamically via the `toil.version` module)
- **Description**: Pipeline management software for clusters supporting distributed computing, cloud provisioning, and container execution
- **Main Module**: `toil`
- **Installation**: `pip install toil`

## Core Imports

```python
# Essential workflow components
from toil.common import Toil, Config
from toil.job import Job, JobDescription, Promise, AcceleratorRequirement
from toil.fileStores import AbstractFileStore, FileID

# Exception handling
from toil.exceptions import FailedJobsException

# Utility functions
from toil.lib.conversions import human2bytes, bytes2human
from toil.lib.retry import retry
from toil import physicalMemory, physicalDisk, toilPackageDirPath
```
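As an illustration of the conversion helpers, here is a simplified sketch of the `human2bytes` idea. This is not Toil's actual implementation (`toil.lib.conversions` handles more suffixes and unit conventions); it only shows the kind of parsing such a helper performs:

```python
# Simplified sketch of a human-readable-size parser; Toil's real
# human2bytes in toil.lib.conversions handles more suffixes and edge cases.
def human2bytes_sketch(size: str) -> int:
    units = {"K": 1024, "M": 1024 ** 2, "G": 1024 ** 3, "T": 1024 ** 4}
    suffix = size[-1].upper()
    if suffix in units:
        return int(float(size[:-1]) * units[suffix])
    return int(size)  # plain byte count, e.g. "1048576"

print(human2bytes_sketch("100M"))  # 104857600 bytes
```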

## Basic Usage

### Simple Workflow Creation

```python { .api }
from toil.common import Toil, Config
from toil.job import Job


class HelloWorldJob(Job):
    def __init__(self, message):
        # 100 MiB memory, 1 core, 100 MiB disk
        super().__init__(memory=100 * 1024 * 1024, cores=1, disk=100 * 1024 * 1024)
        self.message = message

    def run(self, fileStore):
        fileStore.logToMaster(f"Hello {self.message}")
        return f"Processed: {self.message}"


# Create and run the workflow
if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:my-job-store"
    config.logLevel = "INFO"

    with Toil(config) as toil:
        root_job = HelloWorldJob("World")
        result = toil.start(root_job)
        print(f"Result: {result}")
```

### Function-Based Jobs

```python { .api }
from toil.common import Toil, Config
from toil.job import Job


def process_data(job, input_data):
    # Wrapped jobs use the configured defaults (memory=2G, cores=1, disk=2G)
    # unless overridden via keyword arguments to Job.wrapJobFn
    job.fileStore.logToMaster(f"Processing: {input_data}")
    return input_data.upper()


def combine_results(job, *results):
    combined = " + ".join(results)
    job.fileStore.logToMaster(f"Combined: {combined}")
    return combined


if __name__ == "__main__":
    config = Config()
    config.jobStore = "file:my-job-store"

    with Toil(config) as toil:
        # Job.wrapJobFn wraps a plain function (whose first argument is
        # the job itself) into a Job instance
        job1 = Job.wrapJobFn(process_data, "hello")
        job2 = Job.wrapJobFn(process_data, "world")

        # Chain the jobs under a common root: both children complete
        # before the follow-on resolves their promised return values
        root = Job()
        root.addChild(job1)
        root.addChild(job2)
        final_job = Job.wrapJobFn(combine_results, job1.rv(), job2.rv())
        root.addFollowOn(final_job)

        toil.start(root)  # combine_results logs "HELLO + WORLD" to the leader
```

## Architecture

Toil's architecture consists of several key components that work together to provide scalable workflow execution:

### Core Components

1. **Job Management Layer**: `Job`, `JobDescription`, and `Promise` classes handle job definition, scheduling, and result handling
2. **Batch System Layer**: Abstracts different compute environments (local, Slurm, Kubernetes, cloud services)
3. **Job Store Layer**: Persistent storage for job metadata and workflow state (file system, AWS S3, Google Cloud Storage)
4. **File Store Layer**: Manages file I/O and temporary files during job execution
5. **Leader-Worker Architecture**: A central leader coordinates job scheduling while distributed workers execute tasks
6. **Provisioning Layer**: Automatic cloud resource provisioning and scaling

### Workflow Execution Flow

1. **Configuration**: Define workflow parameters with the `Config` class
2. **Job Definition**: Build the job hierarchy from `Job` subclasses or wrapped job functions
3. **Workflow Execution**: Run the workflow inside the `Toil` context manager
4. **Resource Management**: Compute and storage resources are allocated and cleaned up automatically
5. **Result Handling**: Collect results through `Promise` objects and return values
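The ordering that this flow relies on can be illustrated with a toy in-process scheduler (purely illustrative, not Toil's leader, which distributes jobs to workers): a job's children run after its body, and its follow-ons run only after all children have finished, so a follow-on can safely combine child results:

```python
# Toy scheduler illustrating Toil's child/follow-on ordering
# (illustrative only; Toil's leader schedules jobs onto remote workers).
class ToyJob:
    def __init__(self, name, fn):
        self.name, self.fn = name, fn
        self.children, self.follow_ons = [], []

    def add_child(self, job):
        self.children.append(job)
        return job

    def add_follow_on(self, job):
        self.follow_ons.append(job)
        return job

def run(job, results):
    results[job.name] = job.fn()      # run the job body first
    for child in job.children:        # then its children
        run(child, results)
    for follow_on in job.follow_ons:  # follow-ons see all child results
        run(follow_on, results)

results = {}
root = ToyJob("root", lambda: None)
root.add_child(ToyJob("a", lambda: "A"))
root.add_child(ToyJob("b", lambda: "B"))
root.add_follow_on(ToyJob("combine", lambda: results["a"] + results["b"]))
run(root, results)
print(results["combine"])  # AB
```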

## Capabilities

### Core Workflow Management { .api }

Basic job creation, execution, and chaining capabilities with resource management and promise-based result handling.

**Key APIs:**

- `Job(memory, cores, disk, accelerators, preemptible, checkpoint)` - Job definition with resource requirements
- `Job.addChild(childJob)` - Add a dependent child job
- `Job.addFollowOn(followOnJob)` - Add a sequential follow-on job
- `Job.rv(*path)` - Create a promise for the job's return value
- `Toil(config).start(rootJob)` - Execute a workflow from its root job
- `Config()` - Workflow configuration and batch system settings

[Core Workflow Management](./core-workflow.md)
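`Job.rv()` returns a promise: a placeholder for a value that exists only after the producing job has run. A minimal sketch of that idea (illustrative; Toil's actual `Promise` class serializes fulfilled values through the job store so other workers can read them):

```python
# Minimal promise sketch (illustrative; not Toil's Promise class).
class ToyPromise:
    _UNFULFILLED = object()

    def __init__(self):
        self._value = ToyPromise._UNFULFILLED

    def fulfill(self, value):   # called when the producing job returns
        self._value = value

    def resolve(self):          # called when the consuming job runs
        if self._value is ToyPromise._UNFULFILLED:
            raise RuntimeError("resolved before the producing job ran")
        return self._value

p = ToyPromise()
p.fulfill("hello".upper())  # producer finishes
print(p.resolve())          # consumer reads HELLO
```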

### Batch System Integration { .api }

Support for multiple compute environments including local execution, HPC schedulers, and cloud services.

**Key APIs:**

- `AbstractBatchSystem.issueBatchJob(jobNode)` - Submit a job to the batch system
- `AbstractBatchSystem.getUpdatedBatchJob(maxWait)` - Monitor job status
- `AbstractScalableBatchSystem.nodeTypes()` - Query available node types
- `KubernetesBatchSystem`, `SlurmBatchSystem`, `LSFBatchSystem` - Concrete implementations
- `BatchJobExitReason` - Job completion status enumeration

[Batch System Integration](./batch-systems.md)

### Job Store Management { .api }

Persistent storage backends for workflow metadata and state management across different storage systems.

**Key APIs:**

- `AbstractJobStore.create(jobDescription)` - Store job metadata
- `AbstractJobStore.load(jobStoreID)` - Retrieve a job by ID
- `AbstractJobStore.writeFile(localFilePath)` - Store a file in the job store
- `AbstractJobStore.importFile(srcUrl, sharedFileName)` - Import external files
- `FileJobStore`, `AWSJobStore`, `GoogleJobStore` - Storage backend implementations

[Job Store Management](./job-stores.md)
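Job store locations are selected with locator strings of the form `<backend>:<location>`. Commonly documented forms look like the following (check the Toil documentation for your version, since the region and project fields vary by backend):

```
file:/path/to/job-store           # FileJobStore on a shared filesystem
aws:us-west-2:my-jobstore         # AWSJobStore (region, then store name)
google:my-project-id:my-jobstore  # GoogleJobStore (project ID, then store name)
```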

### File Management { .api }

Comprehensive file handling for temporary files, shared data, and persistent storage during workflow execution.

**Key APIs:**

- `AbstractFileStore.writeGlobalFile(localFileName)` - Store a globally accessible file
- `AbstractFileStore.readGlobalFile(fileStoreID, userPath, cache)` - Read a shared file
- `AbstractFileStore.getLocalTempDir()` - Get a job-local temporary directory
- `AbstractFileStore.logToMaster(text, level)` - Send logs to the workflow leader
- `FileID` - File identifier type for referencing stored files

[File Management](./file-management.md)
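The write-then-read contract behind the global file APIs can be sketched with an in-memory stand-in (illustrative only; Toil's file stores move data through the job store and cache it on workers): writing returns an ID, and any later job can redeem that ID for the file contents.

```python
import os
import tempfile
import uuid

# In-memory stand-in for the global file contract (illustrative only).
class ToyFileStore:
    def __init__(self):
        self._blobs = {}

    def writeGlobalFile(self, local_path):
        file_id = str(uuid.uuid4())
        with open(local_path, "rb") as f:
            self._blobs[file_id] = f.read()
        return file_id

    def readGlobalFile(self, file_id, user_path):
        with open(user_path, "wb") as f:
            f.write(self._blobs[file_id])
        return user_path

fs = ToyFileStore()
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "in.txt")
    with open(src, "w") as f:
        f.write("data")
    fid = fs.writeGlobalFile(src)                               # in one job...
    dst = fs.readGlobalFile(fid, os.path.join(tmp, "out.txt"))  # ...in another
    with open(dst) as f:
        print(f.read())  # data
```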

### Workflow Language Integration { .api }

Native support for CWL and WDL workflow specifications with seamless translation to Toil execution.

**Key APIs:**

- `toil-cwl-runner` - Command-line CWL workflow execution
- `toil-wdl-runner` - Command-line WDL workflow execution
- `toil.cwl.cwltoil.main()` - Programmatic CWL execution
- `toil.wdl.wdltoil.main()` - Programmatic WDL execution
- CWL and WDL utility functions for workflow processing

[Workflow Language Integration](./workflow-languages.md)

### Cloud Provisioning { .api }

Automatic cloud resource provisioning and cluster management for scalable workflow execution.

**Key APIs:**

- `AbstractProvisioner` - Base provisioner interface
- AWS, Google Cloud, and Azure provisioners - Cloud-specific implementations
- `toil-launch-cluster` - Cluster creation utility
- `toil-destroy-cluster` - Cluster cleanup utility
- Dynamic node scaling and resource management

[Cloud Provisioning](./provisioning.md)

### Utilities and CLI Tools { .api }

Command-line tools and utilities for workflow management, debugging, and monitoring.

**Key APIs:**

- `toil` - Main CLI interface for workflow execution
- `toil-stats` - Statistics collection and analysis
- `toil-status` - Workflow monitoring and status
- `toil-clean` - Cleanup utilities and job store management
- `toil-kill` - Workflow termination utilities
- Various debugging and cluster management tools

[Utilities and CLI Tools](./utilities.md)