# Environment Suite

The suite is a pre-built collection of continuous-control reinforcement learning environments spanning locomotion, manipulation, and classic control problems. It provides a standardized interface, consistent action and observation specs, and benchmark task definitions for RL research.

## Capabilities

### Environment Loading

Load environments by domain and task name with optional configuration parameters.

```python { .api }
def load(domain_name: str, task_name: str, task_kwargs=None, environment_kwargs=None, visualize_reward=False):
    """
    Returns an environment from a domain name, task name and optional settings.

    Parameters:
    - domain_name: String name of the domain (e.g., 'cartpole', 'walker')
    - task_name: String name of the task (e.g., 'balance', 'walk')
    - task_kwargs: Optional dict of keyword arguments for the task
    - environment_kwargs: Optional dict of keyword arguments for the environment
    - visualize_reward: Optional bool to enable reward visualization in rendering

    Returns:
    Environment instance ready for interaction

    Example:
    >>> env = suite.load('cartpole', 'balance')
    >>> env = suite.load('walker', 'walk', task_kwargs={'random': 42})
    """

def build_environment(domain_name: str, task_name: str, task_kwargs=None, environment_kwargs=None, visualize_reward=False):
    """
    Returns an environment from the suite, validating the domain and task names.

    Parameters: Same as load()

    Raises:
    - ValueError: If domain or task doesn't exist

    Returns:
    Environment instance

    Note: Same behavior as load(); an unknown domain or task raises ValueError through either entry point
    """
```

### Environment Collections

Pre-defined collections of environments organized by difficulty and purpose.

```python { .api }
# Complete environment catalog
ALL_TASKS: tuple
"""Tuple containing all available (domain_name, task_name) pairs"""

# Difficulty-based collections
BENCHMARKING: tuple
"""Tuple of (domain, task) pairs used for benchmarking"""

EASY: tuple
"""Tuple of easier difficulty tasks suitable for initial testing"""

HARD: tuple
"""Tuple of challenging tasks for advanced evaluation"""

EXTRA: tuple
"""Tuple of additional tasks not included in benchmarking set"""

# Visualization-based collections
REWARD_VIZ: tuple
"""Tuple of tasks that support reward visualization"""

NO_REWARD_VIZ: tuple
"""Tuple of tasks without reward visualization support"""

# Domain organization
TASKS_BY_DOMAIN: dict
"""Dict mapping domain names to tuples of their task names"""
```
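
The relationship between these collections can be sketched in plain Python. The snippet below is a minimal illustration, assuming `TASKS_BY_DOMAIN` is a per-domain grouping of the `ALL_TASKS` pairs and `EXTRA` holds the pairs absent from `BENCHMARKING`. The helper names `group_by_domain` and `tasks_outside` are hypothetical and not part of the suite API, and the sample data is a stand-in rather than the real catalog.

```python
from collections import defaultdict


def group_by_domain(pairs):
    """Group (domain, task) pairs into {domain: (task, ...)}, preserving order."""
    grouped = defaultdict(list)
    for domain, task in pairs:
        grouped[domain].append(task)
    return {domain: tuple(tasks) for domain, tasks in grouped.items()}


def tasks_outside(all_pairs, subset):
    """Return the pairs in all_pairs that do not appear in subset."""
    excluded = set(subset)
    return tuple(pair for pair in all_pairs if pair not in excluded)


# Stand-in data shaped like suite.ALL_TASKS (hypothetical excerpt, not the real catalog).
sample_tasks = (
    ("cartpole", "balance"),
    ("cartpole", "swingup"),
    ("walker", "walk"),
)
sample_by_domain = group_by_domain(sample_tasks)
sample_extra = tasks_outside(sample_tasks, [("cartpole", "balance")])
```

With dm_control installed, the same helpers could be applied to `suite.ALL_TASKS` and `suite.BENCHMARKING` to compare against the published collections.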

### Available Domains

The suite includes environments across these domains:

```python { .api }
# Locomotion domains
acrobot       # Acrobot (two-link pendulum) swing-up tasks
cheetah       # Cheetah running tasks
hopper        # Single-leg hopping tasks
humanoid      # Humanoid locomotion tasks
humanoid_CMU  # CMU humanoid with mocap data
quadruped     # Four-legged locomotion
swimmer       # Swimming locomotion
walker        # Bipedal walking tasks
dog           # Dog locomotion tasks

# Manipulation domains
finger        # Finger manipulation tasks
manipulator   # Robotic arm manipulation
reacher       # Point reaching tasks
stacker       # Block stacking tasks

# Classic control domains
ball_in_cup   # Ball-in-cup catching
cartpole      # Cartpole balancing
pendulum      # Pendulum swing-up
point_mass    # Point mass navigation

# Control theory domains
lqr           # Linear quadratic regulator

# Aquatic domains
fish          # Fish swimming tasks
```

## Usage Examples

### Basic Environment Usage

```python
import numpy as np
from dm_control import suite

# Load environment
env = suite.load('cartpole', 'balance')
action_spec = env.action_spec()

# Environment interaction loop
time_step = env.reset()
while not time_step.last():
    # Sample a uniformly random action within the bounds of the action spec
    action = np.random.uniform(
        action_spec.minimum, action_spec.maximum, size=action_spec.shape)
    time_step = env.step(action)

    print(f"Reward: {time_step.reward}")
    print(f"Observation: {time_step.observation}")
```
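
A common next step is to total the reward over an episode. The following is a sketch, not part of the suite: `run_episode`, `policy`, and `max_steps` are hypothetical names, and the function assumes only the `reset()`, `step()`, `last()`, `reward`, and `observation` members used in the loop above.

```python
def run_episode(env, policy, max_steps=None):
    """Run one episode and return (total_reward, num_steps).

    `policy` is any callable mapping an observation to an action.
    `max_steps` optionally cuts the episode short.
    """
    time_step = env.reset()
    total_reward = 0.0
    num_steps = 0
    while not time_step.last():
        if max_steps is not None and num_steps >= max_steps:
            break
        time_step = env.step(policy(time_step.observation))
        # Defensive guard: a time step produced by a reset carries reward None.
        if time_step.reward is not None:
            total_reward += time_step.reward
        num_steps += 1
    return total_reward, num_steps
```

With dm_control installed, `run_episode(suite.load('cartpole', 'balance'), policy)` would run a full episode, where `policy` might sample uniformly within the bounds of `env.action_spec()`.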

### Environment Exploration

```python
from dm_control import suite

# Explore available environments
print("All available tasks:")
for domain, task in suite.ALL_TASKS:
    print(f"  {domain}/{task}")

print(f"\nBenchmarking tasks: {len(suite.BENCHMARKING)}")
print(f"Easy tasks: {len(suite.EASY)}")
print(f"Hard tasks: {len(suite.HARD)}")

# Explore domain-specific tasks
print("\nTasks by domain:")
for domain, tasks in suite.TASKS_BY_DOMAIN.items():
    print(f"  {domain}: {tasks}")
```

### Custom Configuration

```python
from dm_control import suite

# Load with custom task parameters
env = suite.load(
    'walker', 'walk',
    task_kwargs={'random': 42},  # Set random seed
    environment_kwargs={'flat_observation': True}  # Flatten observations
)

# Enable reward visualization
env = suite.load('reacher', 'easy', visualize_reward=True)
```
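
To make the effect of `flat_observation` concrete, the sketch below concatenates a dict of observation arrays into one vector. It is an approximation for illustration, assuming the flattened result is a single 1-D array built from the raveled entries in key order. `flatten` is a hypothetical helper, not the library's implementation, and the example values are made up.

```python
import numpy as np


def flatten(observation):
    """Concatenate every entry of an observation dict into one 1-D float array."""
    parts = [np.ravel(np.asarray(value, dtype=float)) for value in observation.values()]
    return np.concatenate(parts)


# Stand-in observation with made-up values (hypothetical, not real environment output).
example_obs = {
    "orientations": np.array([[1.0, 0.0], [0.0, 1.0]]),
    "height": 1.3,
    "velocity": np.array([0.1, -0.2, 0.3]),
}
flat = flatten(example_obs)  # 4 + 1 + 3 entries in a single vector
```

A flat vector like this is convenient for agents that expect a fixed-size input, at the cost of losing the per-key names.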

### Environment Properties

```python
from dm_control import suite

env = suite.load('humanoid', 'stand')

# Inspect environment specifications
print(f"Action spec: {env.action_spec()}")
print(f"Observation spec: {env.observation_spec()}")
print(f"Reward spec: {env.reward_spec()}")

# Access physics simulation
physics = env.physics
print(f"Timestep: {physics.timestep()}")
print(f"Control: {physics.control()}")  # control() is a method returning the actuator signals
```
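
When sizing a policy or value network, it helps to reduce the specs to plain integers. The following is a minimal sketch, assuming each spec exposes a `shape` tuple as the specs printed above do. `spec_size` and `observation_size` are hypothetical helper names, and the example shapes are made up.

```python
import math
from types import SimpleNamespace


def spec_size(spec):
    """Number of scalar entries described by one spec (1 for a scalar spec)."""
    return math.prod(spec.shape)


def observation_size(observation_spec):
    """Total number of scalars across a dict of named observation specs."""
    return sum(spec_size(spec) for spec in observation_spec.values())


# Stand-in specs with made-up shapes (hypothetical; real shapes come from env.observation_spec()).
example_spec = {
    "joint_angles": SimpleNamespace(shape=(21,)),
    "head_height": SimpleNamespace(shape=()),
    "velocity": SimpleNamespace(shape=(27,)),
}
total = observation_size(example_spec)
```

With a real environment, `observation_size(env.observation_spec())` and `spec_size(env.action_spec())` would give the input and output widths for a simple feed-forward agent.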

## Error Handling

```python
from dm_control import suite

# An unknown domain name raises ValueError
try:
    env = suite.load('nonexistent_domain', 'task')
except ValueError as e:
    print(f"Domain error: {e}")

# An unknown task name within a valid domain also raises ValueError
try:
    env = suite.load('cartpole', 'nonexistent_task')
except ValueError as e:
    print(f"Task error: {e}")
```

```