# Process Management

TensorBoard's `manager` module manages the lifecycle of TensorBoard server processes. It launches instances programmatically, reuses a compatible instance when one is already running, and tracks running instances through metadata files.

## Capabilities

### TensorBoard Information

Data structure containing metadata about running TensorBoard instances.

```python { .api }
import dataclasses

@dataclasses.dataclass(frozen=True)
class TensorBoardInfo:
    """
    Information about a running TensorBoard instance.

    Dataclass containing process metadata for TensorBoard instances.

    Attributes:
        version (str): Version of the running TensorBoard
        start_time (int): Start time as seconds since epoch
        pid (int): Process ID
        port (int): Server port number
        path_prefix (str): Relative URL path prefix (may be empty)
        logdir (str): Data location used by server (may be empty)
        db (str): Database connection used by server (may be empty)
        cache_key (str): Opaque cache key for instance identification
    """

    version: str
    start_time: int
    pid: int
    port: int
    path_prefix: str
    logdir: str
    db: str
    cache_key: str
```
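
As an illustration of how these fields might be consumed, the sketch below builds a local URL from `port` and `path_prefix` and shows the effect of `frozen=True`. `InfoStandIn` and `local_url` are hypothetical names introduced here so the snippet runs without TensorBoard installed, and the field values are made-up examples; real code would receive a `manager.TensorBoardInfo` instead.

```python
import dataclasses


# Hypothetical stand-in mirroring the documented fields of TensorBoardInfo.
@dataclasses.dataclass(frozen=True)
class InfoStandIn:
    version: str
    start_time: int
    pid: int
    port: int
    path_prefix: str
    logdir: str
    db: str
    cache_key: str


def local_url(info, host="localhost"):
    """Build a browser URL from the port and the (possibly empty) path prefix."""
    prefix = info.path_prefix.rstrip("/")
    return f"http://{host}:{info.port}{prefix}/"


# Example values only; they do not describe a real process.
info = InfoStandIn(
    version="2.0.0", start_time=1700000000, pid=4242, port=6006,
    path_prefix="", logdir="./logs", db="", cache_key="example-key",
)
url = local_url(info)
print(url)

# frozen=True makes instances immutable: assignment raises FrozenInstanceError.
try:
    info.port = 6007
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False
print("mutated:", mutated)
```

Because the dataclass is frozen, a modified copy has to be created with `dataclasses.replace(info, port=...)` rather than by assignment.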

### Launch Result Classes

Result objects returned by `start()` that indicate the launch outcome.

```python { .api }
import dataclasses
from typing import Optional

@dataclasses.dataclass(frozen=True)
class StartReused:
    """
    Indicates an existing TensorBoard instance was reused.

    Attributes:
        info (TensorBoardInfo): Information about the reused instance
    """

    info: TensorBoardInfo

@dataclasses.dataclass(frozen=True)
class StartLaunched:
    """
    Indicates a new TensorBoard instance was launched successfully.

    Attributes:
        info (TensorBoardInfo): Information about the new instance
    """

    info: TensorBoardInfo

@dataclasses.dataclass(frozen=True)
class StartFailed:
    """
    Indicates TensorBoard launch failed.

    Attributes:
        exit_code (int): Process exit code (negative for signal)
        stdout (str, optional): Standard output if readable
        stderr (str, optional): Standard error if readable
    """

    exit_code: int
    stdout: Optional[str]
    stderr: Optional[str]

@dataclasses.dataclass(frozen=True)
class StartExecFailed:
    """
    Indicates TensorBoard executable could not be found or executed.

    Attributes:
        os_error (OSError): Exception from subprocess execution
        explicit_binary (str, optional): Custom binary path from TENSORBOARD_BINARY env var
    """

    os_error: OSError
    explicit_binary: Optional[str]

@dataclasses.dataclass(frozen=True)
class StartTimedOut:
    """
    Indicates TensorBoard launch timed out.

    Attributes:
        pid (int): Process ID of the timed-out TensorBoard instance
    """

    pid: int
```
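
Since `start()` can return any of five result types, callers usually need one branch per type. The following is a minimal sketch of such a dispatcher. The classes here are hypothetical stand-ins that copy only the documented field names so the snippet runs on its own, and `describe` is an illustrative helper, not part of the module.

```python
import dataclasses
from typing import Optional


# Hypothetical stand-ins using the documented field names.
@dataclasses.dataclass(frozen=True)
class Info:
    port: int


@dataclasses.dataclass(frozen=True)
class StartReused:
    info: Info


@dataclasses.dataclass(frozen=True)
class StartLaunched:
    info: Info


@dataclasses.dataclass(frozen=True)
class StartFailed:
    exit_code: int
    stdout: Optional[str]
    stderr: Optional[str]


@dataclasses.dataclass(frozen=True)
class StartExecFailed:
    os_error: OSError
    explicit_binary: Optional[str]


@dataclasses.dataclass(frozen=True)
class StartTimedOut:
    pid: int


def describe(result):
    """Return a one-line summary for each documented launch outcome."""
    if isinstance(result, StartLaunched):
        return f"launched on port {result.info.port}"
    if isinstance(result, StartReused):
        return f"reused instance on port {result.info.port}"
    if isinstance(result, StartFailed):
        # stderr is optional, so fall back to a placeholder when it is missing.
        detail = (result.stderr or "").strip() or "no stderr captured"
        return f"failed with exit code {result.exit_code}: {detail}"
    if isinstance(result, StartExecFailed):
        binary = result.explicit_binary or "tensorboard"
        return f"could not execute {binary!r}: {result.os_error}"
    if isinstance(result, StartTimedOut):
        return f"timed out; process {result.pid} may still be starting"
    raise TypeError(f"unexpected result type: {type(result).__name__}")


print(describe(StartLaunched(Info(port=6006))))
print(describe(StartFailed(exit_code=1, stdout=None, stderr=None)))
print(describe(StartTimedOut(pid=7)))
```

With the real module, the same structure applies with `manager.StartLaunched`, `manager.StartReused`, and so on in the `isinstance` checks.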

### Process Management Functions

Core functions for managing TensorBoard instances.

```python { .api }
import datetime

def start(arguments, timeout=datetime.timedelta(seconds=60)):
    """
    Start a TensorBoard instance with specified arguments.

    Args:
        arguments (list): Command-line arguments for TensorBoard
        timeout (datetime.timedelta): Maximum time to wait for startup
            (default 60 seconds)

    Returns:
        Union[StartReused, StartLaunched, StartFailed, StartExecFailed, StartTimedOut]:
            Result object indicating launch outcome

    Automatically detects if a compatible instance is already running
    and reuses it when possible, or launches a new instance otherwise.
    """

def get_all():
    """
    Get information about all known running TensorBoard instances.

    Returns:
        list[TensorBoardInfo]: List of TensorBoardInfo objects for
            all discovered running instances

    Scans info files in temp directory for TensorBoard metadata.
    May contain stale entries if processes exited uncleanly.
    """

def data_source_from_info(info):
    """
    Format data source information from TensorBoardInfo.

    Args:
        info (TensorBoardInfo): TensorBoard instance information

    Returns:
        str: Human-readable data source description (e.g., "logdir /path" or "db connection")
    """
```
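
The reuse behavior described for `start()` amounts to looking for a known instance whose cache key matches the requested configuration. The sketch below shows that idea on plain objects. `find_reusable` is a hypothetical helper, the tie-break on `start_time` is an assumption made for the example, and the selection logic inside `start()` may differ.

```python
from types import SimpleNamespace


def find_reusable(infos, wanted_key):
    """Return the most recently started info whose cache_key matches, else None."""
    matches = [info for info in infos if info.cache_key == wanted_key]
    return max(matches, key=lambda info: info.start_time, default=None)


# Made-up entries standing in for the list returned by get_all().
known = [
    SimpleNamespace(pid=100, port=6006, start_time=10, cache_key="key-a"),
    SimpleNamespace(pid=200, port=6007, start_time=20, cache_key="key-b"),
    SimpleNamespace(pid=300, port=6008, start_time=30, cache_key="key-a"),
]

hit = find_reusable(known, "key-a")
miss = find_reusable(known, "key-c")
print(hit.pid if hit else None, miss)
```

A `None` result corresponds to the case where a new instance would have to be launched.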

### File Management Functions

Utilities for managing TensorBoard instance metadata files.

```python { .api }
def cache_key(working_directory, arguments, configure_kwargs):
    """
    Generate cache key for TensorBoard instance identification.

    Args:
        working_directory (str): Working directory path
        arguments (list): Command-line arguments
        configure_kwargs (dict): Configuration keyword arguments

    Returns:
        str: Unique cache key for the configuration

    Used to determine if an existing instance matches the requested configuration.
    """

def write_info_file(tensorboard_info):
    """
    Write TensorBoard instance information to the current process's info file.

    Args:
        tensorboard_info (TensorBoardInfo): Instance information to write

    Creates a metadata file containing instance information for later retrieval.
    """

def remove_info_file():
    """
    Remove the current process's TensorBoard instance information file.

    Cleans up the metadata file when the instance is stopped or no longer needed.
    """
```
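
To make the idea of a configuration cache key concrete, here is one way such a key could be derived: serialize the three inputs as canonical JSON and encode the result. This is a sketch under that assumption, not a statement of how `manager.cache_key` is implemented, and `sketch_cache_key` is a hypothetical name.

```python
import base64
import json


def sketch_cache_key(working_directory, arguments, configure_kwargs):
    """Derive a deterministic, opaque key from a launch configuration."""
    if isinstance(arguments, str):
        # A bare string would be ambiguous; require an explicit argument list.
        raise TypeError("arguments should be a list of strings, not a single string")
    payload = {
        "working_directory": working_directory,
        "arguments": arguments,
        "configure_kwargs": configure_kwargs,
    }
    # sort_keys makes the encoding independent of dict insertion order.
    encoded = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return base64.b64encode(encoded.encode("utf-8")).decode("ascii")


key_1 = sketch_cache_key("/work", ["--logdir", "./logs"], {"a": 1, "b": 2})
key_2 = sketch_cache_key("/work", ["--logdir", "./logs"], {"b": 2, "a": 1})
key_3 = sketch_cache_key("/work", ["--logdir", "./other"], {"a": 1, "b": 2})
print(key_1 == key_2, key_1 == key_3)
```

The design point is that equal configurations always map to the same key, while any change to the working directory, arguments, or keyword arguments yields a different one.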

## Usage Examples

### Basic Instance Management

```python
import datetime

from tensorboard import manager

# Start TensorBoard with specific arguments
arguments = ['--logdir', './logs', '--port', '6006']
# timeout is a datetime.timedelta, not a bare number of seconds
result = manager.start(arguments, timeout=datetime.timedelta(seconds=30))

# Check result type and handle accordingly
if isinstance(result, manager.StartLaunched):
    print(f"Started new TensorBoard on port {result.info.port}")
elif isinstance(result, manager.StartReused):
    print(f"Reusing existing TensorBoard on port {result.info.port}")
elif isinstance(result, manager.StartFailed):
    # StartFailed exposes exit_code, stdout, and stderr
    print(f"Failed to start TensorBoard (exit code {result.exit_code}): {result.stderr}")
```

### Instance Discovery

```python
from tensorboard import manager

# Get all running TensorBoard instances
instances = manager.get_all()

print(f"Found {len(instances)} TensorBoard instances:")
for info in instances:
    print(f"  PID {info.pid}: port {info.port}, logdir {info.logdir}")

    # Extract primary data source
    data_source = manager.data_source_from_info(info)
    print(f"    Data source: {data_source}")
```
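
Because `get_all()` may return stale entries, a caller might want to filter the list by process liveness. The sketch below does this with a POSIX-only probe (`os.kill(pid, 0)`); both `pid_is_alive` and `drop_stale` are hypothetical helpers, not part of the module, and the probe should not be used on Windows, where signal `0` has a different meaning.

```python
import os
from types import SimpleNamespace


def pid_is_alive(pid):
    """POSIX-only probe: signal 0 checks for existence without signalling."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True


def drop_stale(infos, is_alive=pid_is_alive):
    """Keep only the entries whose recorded PID still passes the liveness check."""
    return [info for info in infos if is_alive(info.pid)]


# Made-up entries; the predicate is injected so the demo is deterministic.
entries = [SimpleNamespace(pid=111, port=6006), SimpleNamespace(pid=222, port=6007)]
live = drop_stale(entries, is_alive=lambda pid: pid == 111)
print([info.port for info in live])
```

A live PID is only a heuristic: the operating system can reuse the PID of an exited TensorBoard for an unrelated process, so comparing `start_time` or probing the port gives a stronger check.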

### Advanced Configuration

```python
import datetime
import time

from tensorboard import manager

# Start TensorBoard with custom timeout
arguments = ['--logdir', './experiments', '--port', '6007', '--reload_interval', '30']
start_time = time.time()

# timeout is a datetime.timedelta; allow up to 2 minutes for startup
result = manager.start(arguments, timeout=datetime.timedelta(minutes=2))

if isinstance(result, manager.StartLaunched):
    elapsed = time.time() - start_time
    print(f"TensorBoard started in {elapsed:.2f} seconds")

    # The launched server registers its own info file, so it is already
    # discoverable through manager.get_all(); no manual write is needed here.
    print(f"Tracking PID {result.info.pid} on port {result.info.port}")

elif isinstance(result, manager.StartTimedOut):
    print(f"TensorBoard startup timed out, process may still be running as PID {result.pid}")
```
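
The timeout semantics can be pictured as a poll-until-deadline loop: keep checking for a ready signal and give up once the `timedelta` has elapsed. The following is a generic sketch of that pattern, not TensorBoard's actual code; `wait_for`, `fake_check`, and the `poll_interval` parameter are assumptions made for illustration.

```python
import datetime
import time


def wait_for(check, timeout, poll_interval=0.05):
    """Call check() until it returns a non-None value or the timeout elapses.

    timeout is a datetime.timedelta; poll_interval is in seconds.
    Returns the value from check(), or None on timeout.
    """
    deadline = time.monotonic() + timeout.total_seconds()
    while True:
        value = check()
        if value is not None:
            return value
        if time.monotonic() >= deadline:
            return None
        time.sleep(poll_interval)


# Fake readiness check that succeeds on the third attempt.
attempts = {"count": 0}


def fake_check():
    attempts["count"] += 1
    return "ready" if attempts["count"] >= 3 else None


outcome = wait_for(fake_check, datetime.timedelta(seconds=2))
print(outcome, attempts["count"])
```

A `None` return plays the role of `StartTimedOut`: the wait ended, but nothing is implied about whether the underlying process is still running.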

### Cache Key Generation

```python
from tensorboard import manager
import os

# Generate cache key for instance identification
working_dir = os.getcwd()
arguments = ['--logdir', './logs', '--port', '6006']
config_kwargs = {'reload_interval': 30, 'max_reload_threads': 4}

cache_key = manager.cache_key(working_dir, arguments, config_kwargs)
print(f"Cache key: {cache_key}")

# This key is used internally to determine if an existing instance
# matches the requested configuration for reuse
```

### Cleanup Operations

```python
from tensorboard import manager

# Remove the info file registered for the current process, if any
try:
    manager.remove_info_file()
    print("Instance metadata cleaned up")
except Exception as e:
    print(f"Cleanup failed: {e}")
```

The manager module covers process detection, instance reuse, and metadata tracking for TensorBoard servers, so calling code mainly needs to inspect the result object returned by `start()`.