Tessl Tile for pypi/oemer@0.1.0

# Main Processing Pipeline

The main pipeline runs optical music recognition (OMR) end to end, from image input to MusicXML output. It orchestrates neural network inference, feature extraction, musical analysis, and structured document generation.

## Capabilities

### Primary Pipeline Function

`extract` is the main extraction function; it coordinates the entire OMR pipeline.

```python { .api }
def extract(args: Namespace) -> str:
    """
    Main extraction pipeline function that processes a music sheet image and generates MusicXML.

    Parameters:
    - args.img_path (str): Path to the input image file
    - args.output_path (str): Directory path for output files
    - args.use_tf (bool): Use TensorFlow instead of ONNX runtime for inference
    - args.save_cache (bool): Save model predictions to disk for reuse
    - args.without_deskew (bool): Skip the deskewing step for aligned images

    Returns:
    str: Full path to the generated MusicXML file

    Raises:
    FileNotFoundError: If the input image file doesn't exist
    StafflineException: If staffline detection fails
    """
```
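
Because `extract` reads its options as attributes of a `Namespace`, a caller has to supply all five fields on every call. The following is a minimal sketch of a helper that fills them in. `make_args` is a hypothetical name, not part of oemer, and the default values are illustrative assumptions rather than the library's own.

```python
from argparse import Namespace


def make_args(img_path: str, output_path: str = "./", *, use_tf: bool = False,
              save_cache: bool = False, without_deskew: bool = False) -> Namespace:
    """Build a Namespace carrying the five attributes documented for extract().

    Hypothetical helper: the default values are illustrative assumptions.
    """
    return Namespace(
        img_path=img_path,
        output_path=output_path,
        use_tf=use_tf,
        save_cache=save_cache,
        without_deskew=without_deskew,
    )


args = make_args("scores/example.jpg", "./output/", save_cache=True)
print(sorted(vars(args)))
```

The keyword-only flags keep call sites readable, since a bare `True` or `False` in a positional slot says little about which option it toggles.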

### Command Line Interface

`main` is the CLI entry point behind the `oemer` command, and `get_parser` builds the argument parser it uses.

```python { .api }
def main() -> None:
    """
    CLI entry point for the oemer command.

    Parses command line arguments and executes the extraction pipeline.
    Downloads model checkpoints automatically if not present.
    """

def get_parser() -> ArgumentParser:
    """
    Get the command line argument parser.

    Returns:
    ArgumentParser: Configured parser for oemer CLI options
    """
```
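
To show how parsed command line options could line up with the attributes that `extract` reads, here is a stand-in parser built with the standard `argparse` module. It is a sketch, not oemer's parser: `build_parser` is a hypothetical name, and the flag spellings are assumptions inferred from the attribute names, so check `oemer --help` for the real options.

```python
from argparse import ArgumentParser


def build_parser() -> ArgumentParser:
    """Stand-in parser mirroring the attributes extract() reads (flag names assumed)."""
    parser = ArgumentParser(prog="oemer-sketch")
    parser.add_argument("img_path", help="Path to the input image file")
    parser.add_argument("-o", "--output-path", default="./",
                        help="Directory for output files")
    parser.add_argument("--use-tf", action="store_true",
                        help="Use TensorFlow instead of ONNX runtime")
    parser.add_argument("--save-cache", action="store_true",
                        help="Save model predictions to disk")
    parser.add_argument("--without-deskew", action="store_true",
                        help="Skip the deskewing step")
    return parser


# argparse converts dashes in option names to underscores, so the parsed
# Namespace exposes output_path, use_tf, save_cache and without_deskew.
args = build_parser().parse_args(["score.jpg", "-o", "out", "--save-cache"])
print(args)
```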

### Neural Network Prediction

`generate_pred` runs the two-stage neural network architecture and returns its predictions.

```python { .api }
def generate_pred(img_path: str, use_tf: bool = False) -> Tuple[ndarray, ndarray, ndarray, ndarray, ndarray]:
    """
    Generate neural network predictions for all musical elements.

    Runs two U-Net models:
    1. Staff vs. symbols segmentation
    2. Detailed symbol classification

    Parameters:
    - img_path (str): Path to the input image
    - use_tf (bool): Use TensorFlow instead of ONNX runtime

    Returns:
    Tuple containing:
    - staff (ndarray): Staff line predictions
    - symbols (ndarray): General symbol predictions
    - stems_rests (ndarray): Stems and rests predictions
    - notehead (ndarray): Note head predictions
    - clefs_keys (ndarray): Clefs and accidentals predictions
    """
```
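
A five-element tuple is easy to unpack in the wrong order. One way to guard against that is to wrap the result in a `NamedTuple` whose fields follow the documented order. In this sketch, `Predictions` and `fake_generate_pred` are hypothetical names; the stand-in function returns placeholder strings so the example runs without model checkpoints.

```python
from typing import Any, NamedTuple


class Predictions(NamedTuple):
    """Named view over the five outputs, in the documented order."""
    staff: Any
    symbols: Any
    stems_rests: Any
    notehead: Any
    clefs_keys: Any


def fake_generate_pred(img_path: str) -> tuple:
    """Stand-in for generate_pred so the sketch runs without models (hypothetical)."""
    return tuple(f"{name}-map" for name in Predictions._fields)


preds = Predictions(*fake_generate_pred("score.jpg"))
print(preds.notehead)  # notehead-map
```

With the real function, the same wrapping would read `Predictions(*generate_pred(img_path))`, assuming the return order matches the docstring above.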

### Data Management

Functions for managing intermediate processing data and model checkpoints.

```python { .api }
def clear_data() -> None:
    """
    Clear all registered processing layers.

    Removes all intermediate data from the layer management system
    to free memory and reset state between processing runs.
    """

def download_file(title: str, url: str, save_path: str) -> None:
    """
    Download model checkpoint files with progress tracking.

    Parameters:
    - title (str): Display name for download progress
    - url (str): URL of the file to download
    - save_path (str): Local path to save the downloaded file
    """

def polish_symbols(rgb_black_th: int = 300) -> ndarray:
    """
    Polish symbol predictions by filtering background pixels.

    Parameters:
    - rgb_black_th (int): RGB threshold for black pixel detection
    """
```
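
The default `rgb_black_th` of 300 is larger than any single 8-bit channel value, which suggests the threshold is compared against the sum of the three channels. That reading is an assumption, and the sketch below illustrates only the idea of such a filter on nested lists; it is not the library's implementation, and `black_mask` and `keep_dark_symbols` are hypothetical names.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]


def black_mask(image: List[List[Pixel]], rgb_black_th: int = 300) -> List[List[int]]:
    """Mark a pixel 1 when its R+G+B sum is below the threshold, else 0."""
    return [[1 if sum(px) < rgb_black_th else 0 for px in row] for row in image]


def keep_dark_symbols(symbols: List[List[int]], mask: List[List[int]]) -> List[List[int]]:
    """Keep a predicted symbol pixel only where the source image is dark."""
    return [[s & m for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(symbols, mask)]


image = [[(0, 0, 0), (255, 255, 255)],
         [(90, 90, 90), (120, 120, 120)]]
symbols = [[1, 1],
           [1, 1]]
print(keep_dark_symbols(symbols, black_mask(image)))  # [[1, 0], [1, 0]]
```

Under this assumption a mid-gray pixel such as `(120, 120, 120)` sums to 360 and counts as background, so a symbol prediction there is dropped.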

### Registration Functions

Functions for registering extracted elements in the layer system.

```python { .api }
def register_notehead_bbox(bboxes: List[BBox]) -> ndarray:
    """
    Register notehead bounding boxes in the layer system.

    Parameters:
    - bboxes (List[BBox]): List of notehead bounding boxes

    Returns:
    ndarray: Updated bounding box layer
    """

def register_note_id() -> None:
    """
    Register note IDs in the layer system.

    Creates a mapping layer where each pixel contains the ID
    of the note it belongs to, or -1 for background pixels.
    """
```
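
The note ID layer described above can be pictured as a grid the size of the image, filled with -1 and overwritten with a note's ID wherever that note lies. The sketch below builds such a grid from bounding boxes using plain lists. It makes several assumptions that the source does not confirm: boxes are `(x1, y1, x2, y2)` with exclusive upper bounds, IDs are list indices, and the whole box (rather than the notehead's exact pixels) is filled. `build_note_id_map` is a hypothetical name.

```python
from typing import List, Tuple

BBox = Tuple[int, int, int, int]  # (x1, y1, x2, y2), assumed layout


def build_note_id_map(height: int, width: int, bboxes: List[BBox]) -> List[List[int]]:
    """Return a height x width grid holding a note ID per pixel, -1 for background."""
    id_map = [[-1] * width for _ in range(height)]
    for note_id, (x1, y1, x2, y2) in enumerate(bboxes):
        # Clamp each box to the grid so out-of-range coordinates are ignored.
        for y in range(max(y1, 0), min(y2, height)):
            for x in range(max(x1, 0), min(x2, width)):
                id_map[y][x] = note_id
    return id_map


id_map = build_note_id_map(3, 5, [(0, 0, 2, 2), (3, 1, 5, 3)])
for row in id_map:
    print(row)
```

A map like this lets later stages answer "which note does this pixel belong to?" with a single lookup instead of searching the box list.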

## Usage Examples

### Basic Programmatic Usage

```python
import os
from argparse import Namespace

from oemer.ete import extract, clear_data

# Clear any previous data
clear_data()

# Configure extraction parameters
args = Namespace(
    img_path='scores/beethoven_symphony.jpg',
    output_path='./output/',
    use_tf=False,           # Use ONNX runtime (faster)
    save_cache=True,        # Save predictions for reuse
    without_deskew=False    # Enable deskewing for phone photos
)

# Create the output directory up front (assumes extract() does not create it)
os.makedirs(args.output_path, exist_ok=True)

try:
    # Run the extraction pipeline
    musicxml_path = extract(args)
    print(f"Successfully generated: {musicxml_path}")

    # Check if teaser image was also created
    teaser_path = musicxml_path.replace('.musicxml', '_teaser.png')
    if os.path.exists(teaser_path):
        print(f"Analysis visualization: {teaser_path}")

except FileNotFoundError:
    print(f"Image file not found: {args.img_path}")
except Exception as e:
    print(f"Processing failed: {e}")
```

### Batch Processing Multiple Images

```python
from argparse import Namespace
from pathlib import Path

from oemer.ete import extract, clear_data

def process_directory(input_dir: str, output_dir: str):
    """Process all images in a directory."""
    input_path = Path(input_dir)
    output_path = Path(output_dir)
    # parents=True also creates missing intermediate directories
    output_path.mkdir(parents=True, exist_ok=True)

    # Find all image files, sorted so the processing order is reproducible
    image_extensions = {'.jpg', '.jpeg', '.png', '.bmp', '.tiff'}
    image_files = sorted(f for f in input_path.iterdir()
                         if f.is_file() and f.suffix.lower() in image_extensions)

    print(f"Found {len(image_files)} images to process")

    for i, img_file in enumerate(image_files, 1):
        print(f"Processing {i}/{len(image_files)}: {img_file.name}")

        # Clear data between files to free memory
        clear_data()

        args = Namespace(
            img_path=str(img_file),
            output_path=str(output_path),
            use_tf=False,
            save_cache=True,  # Reuse predictions if processing same image again
            without_deskew=False
        )

        try:
            musicxml_path = extract(args)
            print(f"✓ Generated: {Path(musicxml_path).name}")
        except Exception as e:
            # Report the failure and continue with the next image
            print(f"✗ Failed: {e}")

# Process a directory of sheet music images
process_directory('./sheet_music_images/', './musicxml_output/')
```

### Using with Different Model Backends

```python
import os
from argparse import Namespace

from oemer.ete import extract, clear_data

# Test with both ONNX and TensorFlow backends
test_image = 'test_score.jpg'

# ONNX runtime (default - faster inference)
args_onnx = Namespace(
    img_path=test_image,
    output_path='./onnx_output/',
    use_tf=False,
    save_cache=False,
    without_deskew=False
)

# TensorFlow backend (may have different precision)
args_tf = Namespace(
    img_path=test_image,
    output_path='./tf_output/',
    use_tf=True,
    save_cache=False,
    without_deskew=False
)

# Create both output directories (assumes extract() does not create them)
for args in (args_onnx, args_tf):
    os.makedirs(args.output_path, exist_ok=True)

print("Processing with ONNX runtime...")
clear_data()
onnx_result = extract(args_onnx)

print("Processing with TensorFlow...")
clear_data()  # Reset layer state so the second run starts clean
tf_result = extract(args_tf)

print(f"ONNX result: {onnx_result}")
print(f"TensorFlow result: {tf_result}")
```

## Pipeline Architecture

The extraction pipeline follows these stages:

1. **Input Validation**: Verify image file exists and is readable
2. **Model Loading**: Load or download neural network checkpoints
3. **Image Preprocessing**: Resize and normalize input image
4. **Neural Network Inference**: Run two-stage semantic segmentation
5. **Image Dewarping**: Correct perspective and skew (optional; skipped when `without_deskew` is set)
6. **Layer Registration**: Store all predictions in layer system
7. **Staff Extraction**: Detect and analyze staff lines
8. **Note Extraction**: Identify and classify noteheads
9. **Note Grouping**: Group notes by stems and beams
10. **Symbol Extraction**: Extract clefs, accidentals, rests, barlines
11. **Rhythm Analysis**: Analyze beams, flags, and dots for timing
12. **MusicXML Building**: Generate structured musical document
13. **Output Generation**: Write MusicXML file and analysis visualization

Each stage can access intermediate results through the layer management system, which keeps the stages modular and makes intermediate output easy to inspect when debugging.
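
The idea of stages communicating through a shared layer store can be sketched with a dictionary and a list of stage functions. Everything here is a simplified stand-in under stated assumptions: the `layers` dictionary, the two toy stages, and `run_pipeline` are hypothetical and do not reproduce oemer's layer management module or its real stages.

```python
from typing import Any, Callable, Dict, List

layers: Dict[str, Any] = {}  # hypothetical stand-in for the layer registry


def register_layer(name: str, value: Any) -> None:
    """Store an intermediate result under a name."""
    layers[name] = value


def get_layer(name: str) -> Any:
    """Fetch a previously registered intermediate result."""
    return layers[name]


def run_inference() -> None:
    register_layer("staff_pred", [0, 1, 1, 0])  # placeholder prediction


def extract_staff() -> None:
    # A later stage reads what an earlier stage registered.
    pred = get_layer("staff_pred")
    register_layer("staff_rows", [i for i, v in enumerate(pred) if v])


def run_pipeline(stages: List[Callable[[], None]]) -> List[str]:
    """Run stages in order and return the names of the completed stages."""
    done = []
    for stage in stages:
        stage()
        done.append(stage.__name__)
    return done


print(run_pipeline([run_inference, extract_staff]))  # ['run_inference', 'extract_staff']
print(get_layer("staff_rows"))  # [1, 2]
```

Because every stage writes its result to the store, any intermediate layer can be inspected after a run, and a failing stage can be rerun on its own once its inputs are registered.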
