Python wrapper for C++ LC-MS library OpenMS for comprehensive mass spectrometry data analysis
npx @tessl/cli install tessl/pypi-pyopenms@3.4.00
# pyOpenMS

Python bindings for the OpenMS C++ library providing comprehensive mass spectrometry data analysis capabilities. pyOpenMS enables rapid development of LC-MS data processing workflows through high-performance algorithms for file I/O, signal processing, feature detection, peptide/protein identification, and quantification.

## Package Information

- **Package Name**: pyopenms
- **Language**: Python
- **Installation**: `pip install pyopenms`
- **Documentation**: https://pyopenms.readthedocs.io
- **Repository**: https://github.com/OpenMS/OpenMS/tree/develop/src/pyOpenMS

## Core Imports

```python
import pyopenms
```

Common usage patterns:

```python
# File I/O
from pyopenms import MzMLFile, FeatureXMLFile, IdXMLFile

# Core data structures
from pyopenms import MSExperiment, MSSpectrum, FeatureMap, ConsensusMap

# Sequence analysis
from pyopenms import AASequence, PeptideIdentification, ProteinIdentification

# Algorithms
from pyopenms import PeakPickerHiRes, FeatureFinderAlgorithmPicked
```

## Basic Usage

### Loading and Processing MS Data

```python
import pyopenms

# Load an MS experiment from an mzML file
exp = pyopenms.MSExperiment()
pyopenms.MzMLFile().load("data.mzML", exp)

# Iterate over the spectra and report basic properties
for spectrum in exp:
    rt = spectrum.getRT()
    ms_level = spectrum.getMSLevel()
    mz_array, intensity_array = spectrum.get_peaks()
    print(f"RT: {rt}, MS Level: {ms_level}, Peaks: {len(mz_array)}")

# Save the experiment to a new file
pyopenms.MzMLFile().store("processed.mzML", exp)
```

### Feature Detection Workflow

```python
import pyopenms

# Load raw (profile) data
exp = pyopenms.MSExperiment()
pyopenms.MzMLFile().load("data.mzML", exp)

# Peak picking into a separate output experiment
# (do not reuse `exp` as the output)
picked = pyopenms.MSExperiment()
picker = pyopenms.PeakPickerHiRes()
picker.pickExperiment(exp, picked, True)  # True = verify the input is profile data
picked.updateRanges()

# Feature detection, starting from the algorithm's default parameters
features = pyopenms.FeatureMap()
seeds = pyopenms.FeatureMap()  # empty map: no user-supplied seeds
finder = pyopenms.FeatureFinderAlgorithmPicked()
finder.run(picked, features, finder.getParameters(), seeds)

# Access feature results
for feature in features:
    rt = feature.getRT()
    mz = feature.getMZ()
    intensity = feature.getIntensity()
    print(f"Feature: RT={rt:.2f}, m/z={mz:.4f}, Intensity={intensity:.0f}")

# Save features
pyopenms.FeatureXMLFile().store("features.featureXML", features)
```

### Peptide Sequence Analysis

```python
import pyopenms

# Create peptide sequences, with and without a modification
seq = pyopenms.AASequence.fromString("PEPTIDE")
modified_seq = pyopenms.AASequence.fromString("PEPTIDEM(Oxidation)")

# Get sequence properties
mono_weight = seq.getMonoWeight()
average_weight = seq.getAverageWeight()
formula = seq.getFormula()

print(f"Sequence: {seq.toString()}")
print(f"Monoisotopic weight: {mono_weight:.4f}")
print(f"Formula: {formula.toString()}")

# Generate a theoretical fragment spectrum
generator = pyopenms.TheoreticalSpectrumGenerator()
spectrum = pyopenms.MSSpectrum()
generator.getSpectrum(spectrum, seq, 1, 1)  # min_charge=1, max_charge=1
```

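
To make the monoisotopic weight reported above less of a black box, here is a small sketch that recomputes it for `PEPTIDE` with plain Python and derives the m/z at a given charge. It does not use pyOpenMS; the helper names (`peptide_mono_mass`, `mz_for_charge`) are hypothetical, and the residue table covers only the residues in this one sequence.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 5 decimals).
RESIDUE_MONO_MASS = {
    "P": 97.05276,
    "E": 129.04259,
    "T": 101.04768,
    "I": 113.08406,
    "D": 115.02694,
}
WATER_MONO_MASS = 18.01056  # H2O for the free N- and C-termini
PROTON_MASS = 1.007276      # mass of one proton


def peptide_mono_mass(sequence: str) -> float:
    """Sum the residue masses and add one water for the termini."""
    return sum(RESIDUE_MONO_MASS[aa] for aa in sequence) + WATER_MONO_MASS


def mz_for_charge(mono_mass: float, charge: int) -> float:
    """m/z of the [M + zH]z+ ion."""
    return (mono_mass + charge * PROTON_MASS) / charge


mass = peptide_mono_mass("PEPTIDE")
print(f"Monoisotopic mass: {mass:.4f}")
for z in (1, 2):
    print(f"m/z at charge {z}: {mz_for_charge(mass, z):.4f}")
```

The mass should agree with `seq.getMonoWeight()` to within the rounding of the table.
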
## Architecture

pyOpenMS provides a comprehensive API organized around several key architectural components:

### Data Structures
- **MSExperiment**: Central container for LC-MS experiments with spectra and chromatograms
- **Feature/ConsensusFeature**: Detected and aligned features with quantitative information
- **AASequence**: Peptide sequences with modification support
- **Identification structures**: Search results linking spectra to peptide/protein sequences

### File I/O Layer
- **Format handlers**: Support for mzML, mzXML, featureXML, idXML, and 15+ other formats
- **Cached access**: Memory-efficient processing of large datasets
- **Metadata preservation**: Complete experimental metadata handling

### Algorithm Framework
- **Configurable parameters**: All algorithms use Param objects for configuration
- **Processing pipelines**: Modular components for complete analysis workflows
- **Quality control**: Built-in validation and QC metrics

### Integration Layer
- **NumPy arrays**: Direct array access for efficient data processing
- **Pandas DataFrames**: Export capabilities for statistical analysis
- **Visualization**: Basic plotting support for MS data

## Capabilities

### File I/O and Data Formats

Comprehensive support for mass spectrometry file formats including mzML, mzXML, feature detection results, identification files, and spectral libraries. Handles both reading and writing with full metadata preservation.

```python { .api }
class MzMLFile:
    def load(self, filename: str, exp: MSExperiment) -> None: ...
    def store(self, filename: str, exp: MSExperiment) -> None: ...

class FeatureXMLFile:
    def load(self, filename: str, features: FeatureMap) -> None: ...
    def store(self, filename: str, features: FeatureMap) -> None: ...

class IdXMLFile:
    def load(self, filename: str, prot_ids: list, pep_ids: list) -> None: ...
    def store(self, filename: str, prot_ids: list, pep_ids: list) -> None: ...
```

[File I/O and Data Formats](./file-io.md)

### MS Data Structures and Processing

Core data structures for representing mass spectrometry experiments, spectra, and chromatograms with efficient data access patterns and NumPy integration.

```python { .api }
class MSExperiment:
    def size(self) -> int: ...
    def getSpectrum(self, index: int) -> MSSpectrum: ...
    def addSpectrum(self, spectrum: MSSpectrum) -> None: ...
    def updateRanges(self) -> None: ...
    def get_df(self, long: bool = False) -> DataFrame: ...

class MSSpectrum:
    def getRT(self) -> float: ...
    def getMSLevel(self) -> int: ...
    def get_peaks(self) -> tuple[np.ndarray, np.ndarray]: ...
    # Takes one (mz, intensity) pair of equal-length arrays
    def set_peaks(self, peaks: tuple[np.ndarray, np.ndarray]) -> None: ...
```

[MS Data Structures](./ms-data.md)

### Feature Detection and Quantification

Advanced algorithms for detecting LC-MS features, including peak picking, feature finding, and quantitative analysis across multiple experiments.

```python { .api }
class PeakPickerHiRes:
    def pickExperiment(self, input: MSExperiment, output: MSExperiment,
                       check_spectrum_type: bool) -> None: ...
    def getParameters(self) -> Param: ...

class FeatureFinderAlgorithmPicked:
    def run(self, input: MSExperiment, features: FeatureMap,
            params: Param, seeds: FeatureMap) -> None: ...

class Feature:
    def getRT(self) -> float: ...
    def getMZ(self) -> float: ...
    def getIntensity(self) -> float: ...
    def getOverallQuality(self) -> float: ...
```

[Feature Detection](./feature-detection.md)

### Peptide and Protein Identification

Comprehensive support for peptide sequence analysis, database search results, and protein identification workflows with modification handling.

```python { .api }
class AASequence:
    @staticmethod
    def fromString(seq: str) -> AASequence: ...
    def toString(self) -> str: ...
    def getMonoWeight(self) -> float: ...
    def getFormula(self) -> EmpiricalFormula: ...

class PeptideIdentification:
    def getHits(self) -> list[PeptideHit]: ...
    def getRT(self) -> float: ...
    def getMZ(self) -> float: ...

class PeptideHit:
    def getSequence(self) -> AASequence: ...
    def getScore(self) -> float: ...
    def getCharge(self) -> int: ...
```

[Peptide and Protein Analysis](./peptide-protein.md)

### Map Alignment and Consensus Features

Retention time alignment algorithms and consensus feature generation for comparative analysis across multiple LC-MS experiments.

```python { .api }
class MapAlignmentAlgorithmPoseClustering:
    def align(self, maps: list[FeatureMap], trafos: list[TransformationDescription]) -> None: ...

class ConsensusMap:
    def getColumnHeaders(self) -> dict[int, ColumnHeader]: ...
    def get_intensity_df(self) -> DataFrame: ...
    def get_metadata_df(self) -> DataFrame: ...

class ConsensusFeature:
    def getFeatureList(self) -> list[FeatureHandle]: ...
    def getRT(self) -> float: ...
    def getMZ(self) -> float: ...
```

[Alignment and Consensus](./alignment.md)

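
Retention time alignment ultimately yields a transformation that maps each run's retention times onto a reference run. As a purely conceptual sketch, the code below fits a linear mapping to hand-made pairs of matched features using ordinary least squares. This is not the pose clustering algorithm and not pyOpenMS API; the function name `fit_linear_rt_map` and the data are hypothetical.

```python
def fit_linear_rt_map(pairs):
    """Least-squares fit of rt_reference = slope * rt_run + intercept."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical matched features: (RT in the run to align, RT in the reference), seconds.
matched = [(100.0, 112.0), (200.0, 217.0), (300.0, 322.0), (400.0, 427.0)]

slope, intercept = fit_linear_rt_map(matched)

# Apply the fitted mapping to move the run's retention times onto the reference scale.
aligned = [slope * rt + intercept for rt, _ in matched]

print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```
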
### Targeted Analysis and MRM

Specialized functionality for targeted mass spectrometry including MRM/SRM analysis, transition lists, and quantitative workflows.

```python { .api }
class TargetedExperiment:
    def getTransitions(self) -> list[ReactionMonitoringTransition]: ...
    def addTransition(self, transition: ReactionMonitoringTransition) -> None: ...

class ReactionMonitoringTransition:
    def getPrecursor(self) -> Precursor: ...
    def getProduct(self) -> Product: ...
    def getDecoyTransitionType(self) -> DecoyTransitionType: ...

class MRMFeature:
    def getRT(self) -> float: ...
    def getIntensity(self) -> float: ...
    def getOverallQuality(self) -> float: ...
```

[Targeted Analysis](./targeted-analysis.md)

### Chemistry and Molecular Properties

Chemical calculations including empirical formulas, isotope distributions, elemental compositions, and theoretical spectrum generation.

```python { .api }
class EmpiricalFormula:
    def __init__(self, formula: str = "") -> None: ...
    def getMonoWeight(self) -> float: ...
    def getAverageWeight(self) -> float: ...
    def toString(self) -> str: ...

class IsotopeDistribution:
    def set(self, formula: EmpiricalFormula) -> None: ...
    def getContainer(self) -> list[Peak1D]: ...

class TheoreticalSpectrumGenerator:
    # Fragment ions are generated for every charge from min_charge to max_charge
    def getSpectrum(self, spectrum: MSSpectrum, peptide: AASequence,
                    min_charge: int, max_charge: int) -> None: ...
```

[Chemistry and Molecular Properties](./chemistry.md)

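
For intuition about what a monoisotopic weight of an empirical formula is, the sketch below sums the masses of the most abundant isotopes for a simple formula string. It is independent of pyOpenMS; the helper `formula_mono_mass` is hypothetical and handles only flat formulas such as `C6H12O6` over the five elements listed.

```python
import re

# Monoisotopic masses in Da of the most abundant isotope of each element.
MONO_MASS = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.00307400,
    "O": 15.99491462,
    "S": 31.97207100,
}


def formula_mono_mass(formula: str) -> float:
    """Sum element masses for a flat formula like 'C6H12O6' (no brackets or charges)."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONO_MASS[element] * (int(count) if count else 1)
    return total


glucose = formula_mono_mass("C6H12O6")
print(f"C6H12O6 monoisotopic mass: {glucose:.4f}")
```

The result can be compared against `EmpiricalFormula("C6H12O6").getMonoWeight()`.
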
## Types

### Core Data Types

```python { .api }
class Peak1D:
    def __init__(self, mz: float = 0.0, intensity: float = 0.0) -> None: ...
    def getMZ(self) -> float: ...
    def getIntensity(self) -> float: ...

class Peak2D:
    def __init__(self, rt: float = 0.0, mz: float = 0.0, intensity: float = 0.0) -> None: ...
    def getRT(self) -> float: ...
    def getMZ(self) -> float: ...
    def getIntensity(self) -> float: ...

class Param:
    def __init__(self) -> None: ...
    def setValue(self, key: str, value: Any) -> None: ...
    def getValue(self, key: str) -> Any: ...
    def exists(self, key: str) -> bool: ...
```

### Range and Index Types

```python { .api }
class DRange:
    def __init__(self, dim: int = 1) -> None: ...
    def getMin(self) -> float: ...
    def getMax(self) -> float: ...
    def setMin(self, min_val: float) -> None: ...
    def setMax(self, max_val: float) -> None: ...

class RangeManager:
    def __init__(self, dim: int = 1) -> None: ...
    def getMin(self, dim: int) -> float: ...
    def getMax(self, dim: int) -> float: ...
    def updateRanges(self) -> None: ...
```

### Instrument and Method Types

```python { .api }
class Instrument:
    def __init__(self) -> None: ...
    def getName(self) -> str: ...
    def getVendor(self) -> str: ...
    def getModel(self) -> str: ...

class IonSource:
    def __init__(self) -> None: ...
    def getPolarity(self) -> Polarity: ...
    def setPolarity(self, polarity: Polarity) -> None: ...

class Sample:
    def __init__(self) -> None: ...
    def getName(self) -> str: ...
    def getOrganism(self) -> str: ...
    def getTissue(self) -> str: ...

class Matrix:
    def __init__(self, rows: int = 0, cols: int = 0) -> None: ...
    def getValue(self, row: int, col: int) -> float: ...
    def setValue(self, row: int, col: int, value: float) -> None: ...
    def rows(self) -> int: ...
    def cols(self) -> int: ...

class PeakFileOptions:
    def __init__(self) -> None: ...
    def setMSLevels(self, levels: list[int]) -> None: ...
    def getMSLevels(self) -> list[int]: ...
    def setRTRange(self, min_rt: float, max_rt: float) -> None: ...
    def setMZRange(self, min_mz: float, max_mz: float) -> None: ...

class FeatureFileOptions:
    def __init__(self) -> None: ...
    def setLoadConvexHull(self, load: bool) -> None: ...
    def getLoadConvexHull(self) -> bool: ...

368
class DriftTimeUnit:
369
NONE = 0
370
MILLISECOND = 1
371
VOLT_SECOND_PER_SQUARE_CENTIMETER = 2
372
373
class SpectrumType:
374
UNKNOWN = 0
375
PROFILE = 1
376
CENTROIDED = 2
377
```