# Main Processing Pipeline

Complete end-to-end optical music recognition pipeline that handles the full workflow from image input to MusicXML output. The pipeline orchestrates neural network inference, feature extraction, musical analysis, and structured document generation.

## Capabilities

### Primary Pipeline Function

The main extraction function that coordinates the entire OMR pipeline.

```python { .api }
def extract(args: Namespace) -> str:
    """
    Main extraction pipeline function that processes a music sheet image and generates MusicXML.

    Parameters:
    - args.img_path (str): Path to the input image file
    - args.output_path (str): Directory path for output files
    - args.use_tf (bool): Use TensorFlow instead of ONNX runtime for inference
    - args.save_cache (bool): Save model predictions to disk for reuse
    - args.without_deskew (bool): Skip the deskewing step for aligned images

    Returns:
        str: Full path to the generated MusicXML file

    Raises:
        FileNotFoundError: If the input image file doesn't exist
        StafflineException: If staffline detection fails
    """
```

### Command Line Interface

The CLI entry point that provides the `oemer` command.

```python { .api }
def main() -> None:
    """
    CLI entry point for the oemer command.

    Parses command line arguments and executes the extraction pipeline.
    Downloads model checkpoints automatically if not present.
    """

def get_parser() -> ArgumentParser:
    """
    Get the command line argument parser.

    Returns:
        ArgumentParser: Configured parser for oemer CLI options
    """
```

### Neural Network Prediction

Generate predictions using the two-stage neural network architecture.

```python { .api }
def generate_pred(img_path: str, use_tf: bool = False) -> Tuple[ndarray, ndarray, ndarray, ndarray, ndarray]:
    """
    Generate neural network predictions for all musical elements.

    Runs two U-Net models:
    1. Staff vs. symbols segmentation
    2. Detailed symbol classification

    Parameters:
    - img_path (str): Path to the input image
    - use_tf (bool): Use TensorFlow instead of ONNX runtime

    Returns:
        Tuple containing:
        - staff (ndarray): Staff line predictions
        - symbols (ndarray): General symbol predictions
        - stems_rests (ndarray): Stems and rests predictions
        - notehead (ndarray): Note head predictions
        - clefs_keys (ndarray): Clefs and accidentals predictions
    """
```
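The returned maps can be consumed independently or merged downstream. A minimal sketch of merging, assuming the maps are per-pixel arrays that can be binarized (the `staff` and `notehead` data below are fabricated 4x4 toys, not real model output):

```python
import numpy as np

# Hypothetical 4x4 maps; real maps match the (resized) input image shape,
# and their exact encoding (scores vs. binary masks) depends on the model.
staff = np.zeros((4, 4), dtype=np.uint8)
staff[1, :] = 1          # one horizontal staff line
notehead = np.zeros((4, 4), dtype=np.uint8)
notehead[1:3, 2] = 1     # a small notehead blob crossing the line

# Binarize and merge into an "any symbol" mask, as a consumer might do
# before connected-component analysis.
any_symbol = (staff > 0) | (notehead > 0)
print(int(any_symbol.sum()))  # 5 active pixels (one pixel overlaps)
```

Keeping the maps separate until this point lets later stages reason about each symbol class on its own before combining them.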

### Data Management

Functions for managing intermediate processing data and model checkpoints.

```python { .api }
def clear_data() -> None:
    """
    Clear all registered processing layers.

    Removes all intermediate data from the layer management system
    to free memory and reset state between processing runs.
    """

def download_file(title: str, url: str, save_path: str) -> None:
    """
    Download model checkpoint files with progress tracking.

    Parameters:
    - title (str): Display name for download progress
    - url (str): URL of the file to download
    - save_path (str): Local path to save the downloaded file
    """

def polish_symbols(rgb_black_th: int = 300) -> ndarray:
    """
    Polish symbol predictions by filtering background pixels.

    Parameters:
    - rgb_black_th (int): RGB threshold for black pixel detection

    Returns:
        ndarray: Refined symbol predictions
    """
```

### Registration Functions

Functions for registering extracted elements in the layer system.

```python { .api }
def register_notehead_bbox(bboxes: List[BBox]) -> ndarray:
    """
    Register notehead bounding boxes in the layer system.

    Parameters:
    - bboxes (List[BBox]): List of notehead bounding boxes

    Returns:
        ndarray: Updated bounding box layer
    """

def register_note_id() -> None:
    """
    Register note IDs in the layer system.

    Creates a mapping layer where each pixel contains the ID
    of the note it belongs to, or -1 for background pixels.
    """
```
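The mapping layer that `register_note_id` creates can be pictured as follows. This is a hedged sketch using fabricated bounding boxes in `(x1, y1, x2, y2)` form; the real layer is built from detected noteheads, and the real `BBox` convention may differ:

```python
import numpy as np

# Illustrative note-id map: -1 marks background, non-negative values
# are note IDs, mirroring the layer register_note_id creates.
note_id_map = np.full((6, 8), -1, dtype=np.int32)

# Hypothetical noteheads as (x1, y1, x2, y2) bounding boxes.
bboxes = [(1, 1, 3, 3), (5, 2, 7, 4)]
for note_id, (x1, y1, x2, y2) in enumerate(bboxes):
    note_id_map[y1:y2, x1:x2] = note_id

print(note_id_map[2, 2])   # 0: pixel inside the first notehead
print(note_id_map[0, 0])   # -1: background
```

A per-pixel ID layer like this makes later lookups cheap: any stage can ask "which note does this pixel belong to?" with a single index.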

## Usage Examples

### Basic Programmatic Usage

```python
from oemer.ete import extract, clear_data
from argparse import Namespace
import os

# Clear any previous data
clear_data()

# Configure extraction parameters
args = Namespace(
    img_path='scores/beethoven_symphony.jpg',
    output_path='./output/',
    use_tf=False,          # Use ONNX runtime (faster)
    save_cache=True,       # Save predictions for reuse
    without_deskew=False   # Enable deskewing for phone photos
)

try:
    # Run the extraction pipeline
    musicxml_path = extract(args)
    print(f"Successfully generated: {musicxml_path}")

    # Check if a teaser image was also created
    teaser_path = musicxml_path.replace('.musicxml', '_teaser.png')
    if os.path.exists(teaser_path):
        print(f"Analysis visualization: {teaser_path}")

except FileNotFoundError:
    print(f"Image file not found: {args.img_path}")
except Exception as e:
    print(f"Processing failed: {e}")
```

### Batch Processing Multiple Images

```python
from pathlib import Path
from argparse import Namespace
from oemer.ete import extract, clear_data

def process_directory(input_dir: str, output_dir: str):
    """Process all images in a directory."""
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    output_path.mkdir(exist_ok=True)

    # Find all image files
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
    image_files = [f for f in input_path.iterdir()
                   if f.suffix.lower() in image_extensions]

    print(f"Found {len(image_files)} images to process")

    for i, img_file in enumerate(image_files, 1):
        print(f"Processing {i}/{len(image_files)}: {img_file.name}")

        # Clear data between files to free memory
        clear_data()

        args = Namespace(
            img_path=str(img_file),
            output_path=str(output_path),
            use_tf=False,
            save_cache=True,   # Reuse predictions if processing the same image again
            without_deskew=False
        )

        try:
            musicxml_path = extract(args)
            print(f"✓ Generated: {Path(musicxml_path).name}")
        except Exception as e:
            print(f"✗ Failed: {e}")

# Process a directory of sheet music images
process_directory('./sheet_music_images/', './musicxml_output/')
```

### Using with Different Model Backends

```python
from oemer.ete import extract
from argparse import Namespace

# Test with both ONNX and TensorFlow backends
test_image = 'test_score.jpg'

# ONNX runtime (default - faster inference)
args_onnx = Namespace(
    img_path=test_image,
    output_path='./onnx_output/',
    use_tf=False,
    save_cache=False,
    without_deskew=False
)

# TensorFlow backend (may have different precision)
args_tf = Namespace(
    img_path=test_image,
    output_path='./tf_output/',
    use_tf=True,
    save_cache=False,
    without_deskew=False
)

print("Processing with ONNX runtime...")
onnx_result = extract(args_onnx)

print("Processing with TensorFlow...")
tf_result = extract(args_tf)

print(f"ONNX result: {onnx_result}")
print(f"TensorFlow result: {tf_result}")
```

## Pipeline Architecture

The extraction pipeline follows these stages:

1. **Input Validation**: Verify the image file exists and is readable
2. **Model Loading**: Load or download neural network checkpoints
3. **Image Preprocessing**: Resize and normalize the input image
4. **Neural Network Inference**: Run two-stage semantic segmentation
5. **Image Dewarping**: Correct perspective and skew (optional)
6. **Layer Registration**: Store all predictions in the layer system
7. **Staff Extraction**: Detect and analyze staff lines
8. **Note Extraction**: Identify and classify noteheads
9. **Note Grouping**: Group notes by stems and beams
10. **Symbol Extraction**: Extract clefs, accidentals, rests, and barlines
11. **Rhythm Analysis**: Analyze beams, flags, and dots for timing
12. **MusicXML Building**: Generate the structured musical document
13. **Output Generation**: Write the MusicXML file and analysis visualization

Each stage can access intermediate results through the layer management system, enabling modular processing and debugging capabilities.
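The layer-mediated staging can be sketched as follows, with hypothetical stage functions and toy data standing in for the real ones:

```python
# Each stage reads earlier layers and registers its own result, so any
# intermediate can be inspected between stages (hypothetical stage names).
layers: dict[str, object] = {}

def preprocess(img: list) -> None:
    layers["image"] = img

def detect_staff() -> None:
    # Stand-in for staffline detection: count "ink" pixels per row.
    layers["staff"] = [row.count(1) for row in layers["image"]]

def build_output() -> None:
    # Stand-in for MusicXML building over the accumulated layers.
    layers["musicxml"] = f"<score rows='{len(layers['staff'])}'/>"

preprocess([[1, 0], [1, 1]])  # toy 2x2 binary image
detect_staff()
build_output()
print(layers["musicxml"])  # <score rows='2'/>
```

The benefit of this shape is debuggability: because every stage's output survives in the registry, a failure in, say, rhythm analysis can be diagnosed by dumping the layers the earlier stages produced.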