# Chemical Constants and Calculations

Comprehensive databases and calculation functions for amino acids, chemical elements, modifications, and isotopes. These components form the foundation of all mass spectrometry calculations in AlphaBase, providing pre-computed lookup tables and vectorized operations for high-performance proteomics workflows.

## Capabilities

### Amino Acid Constants and Calculations

Core amino acid database with masses, formulas, and properties, plus vectorized calculation functions for peptide sequences.

```python { .api }
# Global constants
AA_ASCII_MASS: np.ndarray  # 128-length array indexed by ASCII code
AA_DF: pd.DataFrame  # Complete amino acid properties dataframe
AA_Composition: dict  # Amino acid formula compositions
aa_formula: pd.DataFrame  # Amino acid formulas and properties

# Mass calculation functions
def calc_AA_masses(sequences: List[str]) -> np.ndarray:
    """
    Calculate amino acid masses for peptide sequences.

    Parameters:
    - sequences: List of peptide sequences

    Returns:
    2D numpy array with masses for each AA position
    """

def calc_AA_masses_for_same_len_seqs(sequences: List[str]) -> np.ndarray:
    """
    Fast batch calculation for equal-length sequences.

    Parameters:
    - sequences: List of equal-length peptide sequences

    Returns:
    2D numpy array with optimized memory layout
    """

def calc_sequence_masses_for_same_len_seqs(sequences: List[str]) -> np.ndarray:
    """
    Calculate full sequence masses for equal-length sequences.

    Parameters:
    - sequences: List of equal-length peptide sequences

    Returns:
    1D numpy array with total masses
    """

# Database modification functions
def update_an_AA(aa_code: str, formula: dict, mass: float = None) -> None:
    """
    Update a single amino acid definition.

    Parameters:
    - aa_code: Single letter amino acid code
    - formula: Chemical formula as dict {'C': 6, 'H': 12, ...}
    - mass: Optional mass override
    """

def reset_AA_mass() -> None:
    """Recalculate amino acid masses after modifications."""

def reset_AA_df() -> None:
    """Reset amino acid DataFrame from formulas."""
```
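
The idea of a 128-slot table indexed by ASCII code can be illustrated without AlphaBase. The following is a minimal sketch, not the library's implementation: the residue masses are standard monoisotopic values for a handful of amino acids, and the names `RESIDUE_MASS`, `ASCII_MASS`, `residue_masses`, and `padded_masses` are hypothetical. Zero-padding shorter sequences is an assumption about how a 2D result could be laid out.

```python
# Standard monoisotopic residue masses (Da) for a few amino acids.
RESIDUE_MASS = {
    "G": 57.02146372, "A": 71.03711379, "P": 97.05276385,
    "T": 101.04767847, "I": 113.08406398, "D": 115.02694302,
    "E": 129.04259309,
}
MASS_H2O = 18.0105647  # same value as the MASS_H2O constant in this document

# 128-slot table indexed by ASCII code, so a lookup is a plain index.
ASCII_MASS = [0.0] * 128
for aa, mass in RESIDUE_MASS.items():
    ASCII_MASS[ord(aa)] = mass


def residue_masses(sequence):
    """Return the residue mass at each position of one sequence."""
    return [ASCII_MASS[ord(aa)] for aa in sequence]


def padded_masses(sequences):
    """Return one row per sequence, zero-padded to the longest length."""
    width = max(len(seq) for seq in sequences)
    return [
        residue_masses(seq) + [0.0] * (width - len(seq))
        for seq in sequences
    ]


rows = padded_masses(["PEPTIDE", "GAP"])
# Residue masses exclude water, so add it back for the neutral peptide.
peptide_mass = sum(rows[0]) + MASS_H2O
print(len(rows), len(rows[0]))
print(f"{peptide_mass:.4f}")
```

Indexing by `ord(aa)` avoids a dictionary lookup per residue, which is what makes the same layout convenient for vectorized array indexing.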

### Chemical Elements and Atoms

Fundamental chemical constants and formula parsing capabilities with isotope information.

```python { .api }
# Physical constants
MASS_PROTON: float = 1.00727646688
MASS_ISOTOPE: float = 1.00235
MAX_ISOTOPE_LEN: int = 8

# Element masses
MASS_H: float = 1.007825032
MASS_C: float = 12.0
MASS_O: float = 15.994914620
MASS_N: float = 14.003074004
MASS_H2O: float = 18.0105647
MASS_NH3: float = 17.026549101

# Chemical databases
CHEM_INFO_DICT: dict  # Element information dictionary
CHEM_MONO_MASS: dict  # Monoisotopic masses dictionary
CHEM_ISOTOPE_DIST: dict  # Isotope distributions dictionary
CHEM_MONO_IDX: dict  # Monoisotopic index mappings
EMPTY_DIST: np.ndarray  # Default isotope distribution

# Formula parsing and mass calculation
def parse_formula(formula: str) -> dict:
    """
    Parse chemical formula string into composition dictionary.

    Parameters:
    - formula: Chemical formula like 'C6H12N2O'

    Returns:
    Dictionary with element counts {'C': 6, 'H': 12, 'N': 2, 'O': 1}
    """

def calc_mass_from_formula(formula: str) -> float:
    """
    Calculate monoisotopic mass from chemical formula.

    Parameters:
    - formula: Chemical formula string

    Returns:
    Monoisotopic mass as float
    """

class ChemicalCompositonFormula:
    """Handle chemical compositions and parse SMILES notation."""

    def __init__(self, formula: str = None):
        """
        Initialize with optional formula.

        Parameters:
        - formula: Chemical formula string or SMILES notation
        """

    def calc_mass(self) -> float:
        """Calculate monoisotopic mass of composition."""

# Database management
def update_atom_infos(atom_dict: dict) -> None:
    """Update atomic information from external data."""

def reset_elements() -> None:
    """Reset element data from default sources."""

def load_elem_yaml(yaml_path: str) -> None:
    """Load element definitions from YAML file."""
```
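
To make the relationship between formula parsing and monoisotopic mass concrete, here is a minimal standalone sketch. It assumes the flat notation shown in the docstring above (`'C6H12N2O'`, no parentheses or nested groups) and uses only the four element masses listed in this section. The names `MONO_MASS`, `parse_formula_sketch`, and `mono_mass_sketch` are hypothetical and are not part of AlphaBase.

```python
import re

# Monoisotopic masses (Da), copied from the element constants above.
MONO_MASS = {
    "H": 1.007825032,
    "C": 12.0,
    "N": 14.003074004,
    "O": 15.994914620,
}

# An element symbol is one uppercase letter plus an optional lowercase
# letter, followed by an optional count (a missing count means 1).
TOKEN = re.compile(r"([A-Z][a-z]?)(\d*)")


def parse_formula_sketch(formula):
    """Parse a flat formula such as 'C6H12N2O2' into element counts."""
    counts = {}
    for element, number in TOKEN.findall(formula):
        counts[element] = counts.get(element, 0) + int(number or 1)
    return counts


def mono_mass_sketch(formula):
    """Sum the element masses weighted by their counts."""
    composition = parse_formula_sketch(formula)
    return sum(MONO_MASS[el] * n for el, n in composition.items())


print(parse_formula_sketch("C6H12N2O2"))
print(f"{mono_mass_sketch('H2O'):.7f}")  # should agree with MASS_H2O
```

Computing `H2O` and `NH3` this way reproduces the `MASS_H2O` and `MASS_NH3` constants to within rounding, which is a quick consistency check on the element masses.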

### Modifications Database and Calculations

Complete modification database with masses, formulas, and loss patterns, plus calculation functions for modified peptide sequences.

```python { .api }
# Global modification constants
MOD_DF: pd.DataFrame  # Main modification database
MOD_INFO_DICT: dict  # Modification information
MOD_CHEM: dict  # Modification chemistry
MOD_MASS: dict  # Modification masses
MOD_LOSS_MASS: dict  # Modification loss masses
MOD_Composition: dict  # Modification compositions
MOD_LOSS_IMPORTANCE: dict  # Loss importance rankings

# Modification mass calculations
def calc_modification_mass(mod_sequences: List[str]) -> np.ndarray:
    """
    Calculate modification masses for peptide sequences.

    Parameters:
    - mod_sequences: List of modified sequences like 'PEPTIDE[Oxidation (M)]'

    Returns:
    2D numpy array with modification masses per position
    """

def calc_mod_masses_for_same_len_seqs(mod_sequences: List[str]) -> np.ndarray:
    """
    Batch modification mass calculation for equal-length sequences.

    Parameters:
    - mod_sequences: List of equal-length modified sequences

    Returns:
    2D numpy array with optimized layout
    """

def calc_modification_mass_sum(mod_sequences: List[str]) -> np.ndarray:
    """
    Sum modification masses across peptide sequences.

    Parameters:
    - mod_sequences: List of modified sequences

    Returns:
    1D numpy array with total modification masses
    """

def calc_modloss_mass(mod_sequences: List[str]) -> np.ndarray:
    """
    Calculate modification loss masses.

    Parameters:
    - mod_sequences: List of modified sequences

    Returns:
    2D numpy array with loss masses
    """

def calc_modloss_mass_with_importance(mod_sequences: List[str],
                                      importance_level: int = 1) -> np.ndarray:
    """
    Calculate modification losses filtered by importance.

    Parameters:
    - mod_sequences: List of modified sequences
    - importance_level: Minimum importance level (1-3)

    Returns:
    2D numpy array with filtered loss masses
    """

# Database management
def add_new_modifications(mod_df: pd.DataFrame) -> None:
    """
    Add custom modifications to global database.

    Parameters:
    - mod_df: DataFrame with new modifications
    """

def has_custom_mods() -> bool:
    """Check for presence of user-defined modifications."""

def load_mod_df(tsv_path: str) -> pd.DataFrame:
    """Load modifications from TSV file."""

def update_all_by_MOD_DF() -> None:
    """Update all modification globals from main DataFrame."""

def keep_modloss_by_importance(importance_level: int = 1) -> None:
    """Filter modification losses by importance ranking."""
```
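
The difference between per-position modification masses, their sum, and per-position loss masses can be shown with a small standalone sketch. It does not call AlphaBase: the input representation (a list of `(name, zero_based_site)` pairs) and the names `MOD_MASS_SKETCH`, `MOD_LOSS_SKETCH`, and `per_position` are assumptions made for illustration. The mass values are the standard monoisotopic deltas for oxidation and phosphorylation, and the standard H3PO4 neutral loss.

```python
# Standard monoisotopic mass deltas (Da); dictionary names are hypothetical.
MOD_MASS_SKETCH = {"Oxidation": 15.994915, "Phospho": 79.966331}
# Neutral loss commonly observed for phosphorylation (loss of H3PO4).
MOD_LOSS_SKETCH = {"Phospho": 97.976896}


def per_position(n_aa, mods, table):
    """Place each modification's table value at its zero-based site."""
    values = [0.0] * n_aa
    for name, site in mods:
        # Modifications missing from the table contribute nothing.
        values[site] += table.get(name, 0.0)
    return values


sequence = "AMSK"
mods = [("Oxidation", 1), ("Phospho", 2)]  # oxidized M, phosphorylated S

mod_masses = per_position(len(sequence), mods, MOD_MASS_SKETCH)
loss_masses = per_position(len(sequence), mods, MOD_LOSS_SKETCH)
total_mod_mass = sum(mod_masses)

print(mod_masses)
print(loss_masses)
print(f"{total_mod_mass:.6f}")
```

Keeping modification masses in an array of the same length as the sequence means they can be added position by position to the residue masses before fragment masses are accumulated.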

### Isotope Calculations

Fast isotope pattern calculation with pre-built lookup tables and mathematical convolution functions.

```python { .api }
class IsotopeDistribution:
    """Fast isotope distribution calculator with pre-built tables."""

    def __init__(self, max_mass: int = 2000, max_isotope_len: int = 8):
        """
        Initialize isotope calculator.

        Parameters:
        - max_mass: Maximum mass for pre-calculated tables
        - max_isotope_len: Maximum isotope pattern length
        """

    def calc_isotope_distribution(self, formula: str) -> np.ndarray:
        """
        Calculate isotope distribution for chemical formula.

        Parameters:
        - formula: Chemical formula string

        Returns:
        Numpy array with isotope intensities
        """

# Direct calculation functions
def formula_dist(formula: str) -> np.ndarray:
    """
    Generate isotope distribution for chemical formula.

    Parameters:
    - formula: Chemical formula string

    Returns:
    Numpy array with isotope pattern
    """

def one_element_dist(element: str, count: int) -> np.ndarray:
    """
    Calculate single element isotope distribution.

    Parameters:
    - element: Element symbol ('C', 'H', etc.)
    - count: Number of atoms

    Returns:
    Numpy array with isotope intensities
    """

def abundance_convolution(dist1: np.ndarray, dist2: np.ndarray) -> np.ndarray:
    """
    Convolute two isotope distributions.

    Parameters:
    - dist1: First isotope distribution
    - dist2: Second isotope distribution

    Returns:
    Convolved isotope distribution
    """

def truncate_isotope(distribution: np.ndarray, max_len: int = 8) -> np.ndarray:
    """
    Truncate isotope distribution to specified length.

    Parameters:
    - distribution: Input isotope distribution
    - max_len: Maximum length to keep
    
    Returns:
    Truncated distribution
    """
```
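
The convolution approach named above can be sketched in plain Python. This is an illustration of the general technique, not AlphaBase's implementation: peaks are binned by nominal mass number (mass defects are ignored), intensities are left as probabilities rather than normalized to the tallest peak, and the names `CARBON`, `convolve`, `truncate`, and `element_dist` are hypothetical. The carbon abundances are the commonly quoted natural values for 12C and 13C.

```python
# Natural abundances of the two stable carbon isotopes (12C, 13C).
CARBON = [0.9893, 0.0107]


def convolve(dist1, dist2):
    """Discrete convolution of two abundance lists."""
    out = [0.0] * (len(dist1) + len(dist2) - 1)
    for i, a in enumerate(dist1):
        for j, b in enumerate(dist2):
            out[i + j] += a * b
    return out


def truncate(dist, max_len=8):
    """Keep only the first max_len isotope peaks."""
    return dist[:max_len]


def element_dist(single_atom, count, max_len=8):
    """Distribution for `count` atoms of one element by repeated convolution."""
    dist = [1.0]  # zero atoms: all intensity in the monoisotopic peak
    for _ in range(count):
        # Truncating after each step bounds the work per convolution and
        # does not change the peaks that are kept.
        dist = truncate(convolve(dist, single_atom), max_len)
    return dist


c2 = element_dist(CARBON, 2)
c50 = element_dist(CARBON, 50)
print([round(p, 8) for p in c2])
print(len(c50))
```

A full formula distribution follows the same pattern: compute one distribution per element, then convolve the per-element results together and truncate once more.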

## Usage Examples

### Basic Mass Calculations

```python
from alphabase.constants.aa import calc_AA_masses
from alphabase.constants.modification import calc_modification_mass

# Calculate amino acid masses
sequences = ['PEPTIDE', 'SEQUENCE', 'EXAMPLE']
aa_masses = calc_AA_masses(sequences)
# (3, 8): one row per sequence, one column per position of the longest sequence
print(f"AA masses shape: {aa_masses.shape}")

# Calculate modification masses
mod_sequences = ['PEPTIDE[Oxidation (M)]', 'SEQUENCE[Phospho (STY)]']
mod_masses = calc_modification_mass(mod_sequences)
print(f"Modification masses: {mod_masses}")
```
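
Residue and modification masses are typically combined into a precursor m/z. The sketch below shows that arithmetic under stated assumptions: the residue mass sum for `PEPTIDE` is hard-coded from standard monoisotopic residue masses rather than taken from AlphaBase output, and `precursor_mz` is a hypothetical helper, not a library function. `MASS_PROTON` and `MASS_H2O` use the values listed in this document.

```python
MASS_PROTON = 1.00727646688  # value of the MASS_PROTON constant above
MASS_H2O = 18.0105647        # value of the MASS_H2O constant above


def precursor_mz(residue_mass_sum, mod_mass_sum, charge):
    """Return m/z for a peptide given summed residue and modification masses."""
    # Residue masses exclude the terminal H and OH, so add one water.
    neutral_mass = residue_mass_sum + mod_mass_sum + MASS_H2O
    # Each charge adds one proton; m/z divides by the charge state.
    return (neutral_mass + charge * MASS_PROTON) / charge


# Sum of standard monoisotopic residue masses for 'PEPTIDE' (unmodified).
peptide_residue_sum = 781.34939935
mz_2plus = precursor_mz(peptide_residue_sum, 0.0, 2)
print(f"{mz_2plus:.5f}")
```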

### Chemical Formula Processing

```python
from alphabase.constants.atom import parse_formula, calc_mass_from_formula

# Parse and calculate mass
formula = "C6H12N2O2"
composition = parse_formula(formula)
mass = calc_mass_from_formula(formula)
print(f"Formula {formula}: {composition}, Mass: {mass:.6f}")
```

### Custom Modifications

```python
import pandas as pd
from alphabase.constants.modification import add_new_modifications

# Add custom modification
custom_mods = pd.DataFrame({
    'mod_name': ['Custom_Mod'],
    'mass': [42.0106],
    'composition': ['C2H2O'],
    'aa': ['K'],
    'position': ['any']
})

add_new_modifications(custom_mods)
```

### Isotope Pattern Calculation

```python
from alphabase.constants.isotope import IsotopeDistribution

# Calculate isotope pattern
iso_calc = IsotopeDistribution()
pattern = iso_calc.calc_isotope_distribution("C50H80N14O10")
print(f"Isotope pattern: {pattern}")
```