0
# SMILES and Chemical Representations
1
2
Comprehensive cheminformatics capabilities for peptide and amino acid SMILES (Simplified Molecular-Input Line-Entry System) representations. Provides tools for chemical structure encoding, modification representation, and integration with computational chemistry workflows in proteomics.
3
4
## Capabilities
5
6
### Amino Acid SMILES Modification
7
8
Advanced system for modifying amino acid chemical structures using SMILES notation and applying chemical transformations.
9
10
```python { .api }
11
class AminoAcidModifier:
12
"""Chemical modification system for amino acids using SMILES representations."""
13
14
def __init__(self):
15
"""Initialize amino acid modifier with default chemical data."""
16
17
def get_amino_acid_smiles(self, aa_code: str) -> str:
18
"""
19
Get SMILES representation for amino acid.
20
21
Parameters:
22
- aa_code: Single letter amino acid code
23
24
Returns:
25
SMILES string representing the amino acid structure
26
"""
27
28
def apply_modification(self, aa_smiles: str,
29
modification_smiles: str,
30
reaction_site: str = 'auto') -> str:
31
"""
32
Apply chemical modification to amino acid SMILES.
33
34
Parameters:
35
- aa_smiles: Original amino acid SMILES
36
- modification_smiles: SMILES of modification to apply
37
- reaction_site: Specific reaction site or 'auto' for automatic detection
38
39
Returns:
40
Modified amino acid SMILES string
41
"""
42
43
def create_modified_aa_library(self, modifications: List[dict]) -> dict:
44
"""
45
Create library of modified amino acids.
46
47
Parameters:
48
- modifications: List of modification definitions with SMILES
49
50
Returns:
51
Dictionary mapping modified AA codes to SMILES representations
52
"""
53
54
def validate_modification_chemistry(self, aa_code: str,
55
modification: str) -> dict:
56
"""
57
Validate chemical feasibility of amino acid modification.
58
59
Parameters:
60
- aa_code: Amino acid code
61
- modification: Modification name or SMILES
62
63
Returns:
64
Dictionary with validation results and chemical properties
65
"""
66
67
def calculate_modification_properties(self, modified_smiles: str) -> dict:
68
"""
69
Calculate chemical properties of modified amino acid.
70
71
Parameters:
72
- modified_smiles: SMILES of modified amino acid
73
74
Returns:
75
Dictionary with molecular weight, formula, logP, etc.
76
"""
77
78
def get_modification_reaction_mechanism(self, modification: str) -> dict:
79
"""
80
Get reaction mechanism for modification.
81
82
Parameters:
83
- modification: Modification name
84
85
Returns:
86
Dictionary with reaction details and mechanism
87
"""
88
89
def load_amino_acid_smiles_database() -> dict:
90
"""
91
Load comprehensive database of amino acid SMILES representations.
92
93
Returns:
94
Dictionary mapping amino acid codes to SMILES strings
95
"""
96
97
def create_custom_amino_acid(smiles: str,
98
aa_code: str,
99
name: str = None) -> dict:
100
"""
101
Create custom amino acid definition from SMILES.
102
103
Parameters:
104
- smiles: SMILES representation of the amino acid
105
- aa_code: Single letter code for the amino acid
106
- name: Optional full name for the amino acid
107
108
Returns:
109
Dictionary with amino acid definition
110
"""
111
112
def validate_amino_acid_smiles(smiles: str) -> dict:
113
"""
114
Validate SMILES string for amino acid structure.
115
116
Parameters:
117
- smiles: SMILES string to validate
118
119
Returns:
120
Dictionary with validation results and structural analysis
121
"""
122
```
123
124
### Peptide SMILES Encoding
125
126
System for converting peptide sequences to complete chemical structure representations using SMILES notation.
127
128
```python { .api }
129
class PeptideSmilesEncoder:
130
"""Encoder for converting peptide sequences to SMILES representations."""
131
132
def __init__(self, amino_acid_library: dict = None):
133
"""
134
Initialize peptide SMILES encoder.
135
136
Parameters:
137
- amino_acid_library: Custom amino acid SMILES library
138
"""
139
140
def encode_peptide_sequence(self, sequence: str,
141
modifications: List[dict] = None) -> str:
142
"""
143
Convert peptide sequence to SMILES representation.
144
145
Parameters:
146
- sequence: Peptide sequence string
147
- modifications: List of modifications with positions and SMILES
148
149
Returns:
150
Complete SMILES string representing the peptide structure
151
"""
152
153
def encode_modified_peptide(self, sequence: str,
154
mod_names: List[str],
155
mod_sites: List[int]) -> str:
156
"""
157
Encode peptide with standard modifications to SMILES.
158
159
Parameters:
160
- sequence: Peptide sequence
161
- mod_names: List of modification names
162
- mod_sites: List of modification sites (1-indexed)
163
164
Returns:
165
SMILES string with modifications incorporated
166
"""
167
168
def create_peptide_graph(self, sequence: str,
169
modifications: List[dict] = None) -> object:
170
"""
171
Create molecular graph representation of peptide.
172
173
Parameters:
174
- sequence: Peptide sequence
175
- modifications: Optional modifications
176
177
Returns:
178
Molecular graph object (NetworkX or RDKit Mol)
179
"""
180
181
def calculate_peptide_properties(self, peptide_smiles: str) -> dict:
182
"""
183
Calculate chemical properties from peptide SMILES.
184
185
Parameters:
186
- peptide_smiles: SMILES representation of peptide
187
188
Returns:
189
Dictionary with molecular properties (MW, logP, PSA, etc.)
190
"""
191
192
def generate_conformers(self, peptide_smiles: str,
193
num_conformers: int = 10) -> List[object]:
194
"""
195
Generate 3D conformers from peptide SMILES.
196
197
Parameters:
198
- peptide_smiles: SMILES string of peptide
199
- num_conformers: Number of conformers to generate
200
201
Returns:
202
List of 3D molecular conformer objects
203
"""
204
205
def validate_peptide_structure(self, peptide_smiles: str) -> dict:
206
"""
207
Validate peptide SMILES for structural correctness.
208
209
Parameters:
210
- peptide_smiles: Peptide SMILES to validate
211
212
Returns:
213
Dictionary with validation results and structural issues
214
"""
215
216
def fragment_peptide_smiles(self, peptide_smiles: str,
217
fragmentation_type: str = 'b_y') -> dict:
218
"""
219
Generate fragment ion SMILES from peptide SMILES.
220
221
Parameters:
222
- peptide_smiles: Parent peptide SMILES
223
- fragmentation_type: Type of fragmentation ('b_y', 'cid', 'etd')
224
225
Returns:
226
Dictionary with fragment SMILES and masses
227
"""
228
229
def convert_sequence_to_smiles(sequence: str,
230
modification_map: dict = None) -> str:
231
"""
232
Convert peptide sequence to SMILES using default encoder.
233
234
Parameters:
235
- sequence: Peptide sequence string
236
- modification_map: Optional modification mappings
237
238
Returns:
239
SMILES representation of the peptide
240
"""
241
242
def batch_encode_peptides(sequences: List[str],
243
modifications: List[List[dict]] = None,
244
n_jobs: int = 1) -> List[str]:
245
"""
246
Batch encode multiple peptides to SMILES.
247
248
Parameters:
249
- sequences: List of peptide sequences
250
- modifications: List of modification lists for each peptide
251
- n_jobs: Number of parallel jobs
252
253
Returns:
254
List of SMILES strings for each peptide
255
"""
256
```
257
258
### Chemical Property Calculations
259
260
Functions for calculating molecular properties and descriptors from SMILES representations.
261
262
```python { .api }
263
def calculate_molecular_descriptors(smiles: str) -> dict:
264
"""
265
Calculate comprehensive molecular descriptors from SMILES.
266
267
Parameters:
268
- smiles: SMILES string
269
270
Returns:
271
Dictionary with molecular descriptors (MW, logP, TPSA, etc.)
272
"""
273
274
def calculate_lipinski_properties(smiles: str) -> dict:
275
"""
276
Calculate Lipinski Rule of Five properties.
277
278
Parameters:
279
- smiles: SMILES string
280
281
Returns:
282
Dictionary with Lipinski properties and compliance
283
"""
284
285
def calculate_peptide_hydrophobicity(smiles: str) -> float:
286
"""
287
Calculate peptide hydrophobicity from SMILES.
288
289
Parameters:
290
- smiles: Peptide SMILES string
291
292
Returns:
293
Hydrophobicity score
294
"""
295
296
def predict_retention_time_from_smiles(smiles: str,
297
model_type: str = 'krokhin') -> float:
298
"""
299
Predict chromatographic retention time from SMILES.
300
301
Parameters:
302
- smiles: Peptide SMILES string
303
- model_type: Prediction model ('krokhin', 'ssp', 'ml')
304
305
Returns:
306
Predicted retention time in minutes
307
"""
308
309
def calculate_collision_cross_section_smiles(smiles: str,
310
charge: int = 1) -> float:
311
"""
312
Predict collision cross section from SMILES structure.
313
314
Parameters:
315
- smiles: Peptide SMILES string
316
- charge: Ion charge state
317
318
Returns:
319
Predicted CCS value in Ųa
320
"""
321
322
def assess_fragmentation_propensity(smiles: str) -> dict:
323
"""
324
Assess peptide fragmentation propensity from structure.
325
326
Parameters:
327
- smiles: Peptide SMILES string
328
329
Returns:
330
Dictionary with fragmentation predictions
331
"""
332
```
333
334
### Structural Analysis and Comparison
335
336
Tools for analyzing and comparing chemical structures using SMILES representations.
337
338
```python { .api }
339
def calculate_tanimoto_similarity(smiles1: str, smiles2: str) -> float:
340
"""
341
Calculate Tanimoto similarity between two SMILES structures.
342
343
Parameters:
344
- smiles1: First SMILES string
345
- smiles2: Second SMILES string
346
347
Returns:
348
Tanimoto similarity coefficient (0-1)
349
"""
350
351
def find_structural_motifs(smiles: str, motif_patterns: List[str]) -> dict:
352
"""
353
Find structural motifs in SMILES representation.
354
355
Parameters:
356
- smiles: SMILES string to analyze
357
- motif_patterns: List of SMARTS patterns to search for
358
359
Returns:
360
Dictionary with motif matches and positions
361
"""
362
363
def cluster_peptides_by_structure(smiles_list: List[str],
364
similarity_threshold: float = 0.7) -> List[List[int]]:
365
"""
366
Cluster peptides based on structural similarity.
367
368
Parameters:
369
- smiles_list: List of SMILES strings
370
- similarity_threshold: Minimum similarity for clustering
371
372
Returns:
373
List of clusters (lists of indices)
374
"""
375
376
def generate_structural_fingerprints(smiles: str,
377
fingerprint_type: str = 'morgan') -> object:
378
"""
379
Generate molecular fingerprints from SMILES.
380
381
Parameters:
382
- smiles: SMILES string
383
- fingerprint_type: Type of fingerprint ('morgan', 'maccs', 'topological')
384
385
Returns:
386
Molecular fingerprint object
387
"""
388
389
def identify_functional_groups(smiles: str) -> List[dict]:
390
"""
391
Identify functional groups in SMILES structure.
392
393
Parameters:
394
- smiles: SMILES string to analyze
395
396
Returns:
397
List of functional group identifications
398
"""
399
```
400
401
### Modification Database Integration
402
403
Integration with modification databases and chemical reaction systems.
404
405
```python { .api }
406
class ModificationSmilesDatabase:
407
"""Database for modification SMILES and reaction patterns."""
408
409
def __init__(self, database_path: str = None):
410
"""
411
Initialize modification database.
412
413
Parameters:
414
- database_path: Path to custom modification database
415
"""
416
417
def get_modification_smiles(self, modification_name: str) -> str:
418
"""
419
Get SMILES representation for named modification.
420
421
Parameters:
422
- modification_name: Standard modification name
423
424
Returns:
425
SMILES string representing the modification
426
"""
427
428
def add_custom_modification(self, name: str,
429
smiles: str,
430
reaction_pattern: str = None) -> None:
431
"""
432
Add custom modification to database.
433
434
Parameters:
435
- name: Modification name
436
- smiles: SMILES representation
437
- reaction_pattern: SMARTS reaction pattern
438
"""
439
440
def search_modifications_by_structure(self, query_smiles: str,
441
similarity_threshold: float = 0.8) -> List[dict]:
442
"""
443
Search modifications by structural similarity.
444
445
Parameters:
446
- query_smiles: Query SMILES structure
447
- similarity_threshold: Minimum similarity threshold
448
449
Returns:
450
List of matching modifications with similarity scores
451
"""
452
453
def get_reaction_products(self, reactant_smiles: str,
454
modification_name: str) -> List[str]:
455
"""
456
Get reaction products for modification.
457
458
Parameters:
459
- reactant_smiles: Starting amino acid SMILES
460
- modification_name: Modification to apply
461
462
Returns:
463
List of possible product SMILES
464
"""
465
466
def load_unimod_smiles_mappings() -> dict:
467
"""
468
Load SMILES mappings for UniMod modifications.
469
470
Returns:
471
Dictionary mapping UniMod IDs to SMILES representations
472
"""
473
474
def create_reaction_template(reactant_pattern: str,
475
product_pattern: str) -> object:
476
"""
477
Create reaction template from SMARTS patterns.
478
479
Parameters:
480
- reactant_pattern: SMARTS pattern for reactants
481
- product_pattern: SMARTS pattern for products
482
483
Returns:
484
Reaction template object
485
"""
486
```
487
488
### Visualization and Export
489
490
Tools for visualizing chemical structures and exporting to various formats.
491
492
```python { .api }
493
def visualize_peptide_structure(smiles: str,
494
output_path: str = None,
495
format: str = 'png') -> object:
496
"""
497
Visualize peptide structure from SMILES.
498
499
Parameters:
500
- smiles: SMILES string to visualize
501
- output_path: Optional path to save image
502
- format: Output format ('png', 'svg', 'pdf')
503
504
Returns:
505
Image object or None if saved to file
506
"""
507
508
def export_to_sdf(smiles_list: List[str],
509
output_path: str,
510
include_properties: bool = True) -> None:
511
"""
512
Export SMILES list to SDF format.
513
514
Parameters:
515
- smiles_list: List of SMILES strings
516
- output_path: Output SDF file path
517
- include_properties: Include calculated properties
518
"""
519
520
def export_to_mol2(smiles: str, output_path: str) -> None:
521
"""
522
Export SMILES to MOL2 format with 3D coordinates.
523
524
Parameters:
525
- smiles: SMILES string
526
- output_path: Output MOL2 file path
527
"""
528
529
def create_structure_grid(smiles_list: List[str],
530
labels: List[str] = None,
531
grid_size: tuple = (4, 4)) -> object:
532
"""
533
Create grid visualization of multiple structures.
534
535
Parameters:
536
- smiles_list: List of SMILES to visualize
537
- labels: Optional labels for each structure
538
- grid_size: Grid dimensions (rows, columns)
539
540
Returns:
541
Grid image object
542
"""
543
```
544
545
## Usage Examples
546
547
### Basic Amino Acid Modification
548
549
```python
550
from alphabase.smiles.smiles import AminoAcidModifier
551
from alphabase.smiles.peptide import PeptideSmilesEncoder
552
553
# Initialize amino acid modifier
554
modifier = AminoAcidModifier()
555
556
# Get SMILES for standard amino acid
557
cys_smiles = modifier.get_amino_acid_smiles('C')
558
print(f"Cysteine SMILES: {cys_smiles}")
559
560
# Apply carbamidomethyl modification
561
carbamidomethyl_smiles = "CC(=O)N" # Simplified modification SMILES
562
modified_cys = modifier.apply_modification(
563
aa_smiles=cys_smiles,
564
modification_smiles=carbamidomethyl_smiles,
565
reaction_site='thiol'
566
)
567
print(f"Carbamidomethyl cysteine SMILES: {modified_cys}")
568
569
# Calculate properties of modified amino acid
570
properties = modifier.calculate_modification_properties(modified_cys)
571
print(f"Modified cysteine properties: {properties}")
572
```
573
574
### Peptide SMILES Encoding
575
576
```python
577
from alphabase.smiles.peptide import PeptideSmilesEncoder, convert_sequence_to_smiles
578
579
# Initialize peptide encoder
580
encoder = PeptideSmilesEncoder()
581
582
# Simple peptide encoding
583
peptide_seq = "PEPTIDE"
584
peptide_smiles = encoder.encode_peptide_sequence(peptide_seq)
585
print(f"Peptide SMILES: {peptide_smiles}")
586
587
# Encode peptide with modifications
588
modified_seq = "PEPTIDE"
589
mod_names = ["Oxidation (M)"]
590
mod_sites = [1] # First position (1-indexed)
591
592
modified_smiles = encoder.encode_modified_peptide(
593
sequence=modified_seq,
594
mod_names=mod_names,
595
mod_sites=mod_sites
596
)
597
print(f"Modified peptide SMILES: {modified_smiles}")
598
599
# Calculate peptide properties
600
properties = encoder.calculate_peptide_properties(peptide_smiles)
601
print(f"Peptide properties:")
602
print(f" Molecular weight: {properties['molecular_weight']:.2f}")
603
print(f" LogP: {properties['logp']:.2f}")
604
print(f" Polar surface area: {properties['tpsa']:.2f}")
605
```
606
607
### Batch Processing and Analysis
608
609
```python
610
from alphabase.smiles.peptide import batch_encode_peptides
611
from alphabase.smiles.smiles import calculate_molecular_descriptors
612
613
# Batch encode multiple peptides
614
sequences = ["PEPTIDE", "SEQUENCE", "EXAMPLE", "TESTING"]
615
batch_smiles = batch_encode_peptides(sequences, n_jobs=2)
616
617
print(f"Encoded {len(batch_smiles)} peptides to SMILES")
618
619
# Calculate descriptors for all peptides
620
descriptors_list = []
621
for i, smiles in enumerate(batch_smiles):
622
descriptors = calculate_molecular_descriptors(smiles)
623
descriptors['sequence'] = sequences[i]
624
descriptors_list.append(descriptors)
625
626
# Analyze molecular properties
627
import pandas as pd
628
descriptors_df = pd.DataFrame(descriptors_list)
629
print(f"Molecular descriptor statistics:")
630
print(descriptors_df[['molecular_weight', 'logp', 'tpsa']].describe())
631
```
632
633
### Chemical Property Predictions
634
635
```python
636
from alphabase.smiles.smiles import (
637
calculate_lipinski_properties, predict_retention_time_from_smiles,
638
calculate_collision_cross_section_smiles
639
)
640
641
# Example peptide SMILES
642
peptide_smiles = batch_smiles[0] # From previous example
643
644
# Lipinski properties
645
lipinski = calculate_lipinski_properties(peptide_smiles)
646
print(f"Lipinski properties: {lipinski}")
647
print(f"Rule of 5 compliant: {lipinski['compliant']}")
648
649
# Predict retention time
650
predicted_rt = predict_retention_time_from_smiles(peptide_smiles, model_type='krokhin')
651
print(f"Predicted retention time: {predicted_rt:.2f} minutes")
652
653
# Predict collision cross section
654
predicted_ccs = calculate_collision_cross_section_smiles(peptide_smiles, charge=2)
655
print(f"Predicted CCS: {predicted_ccs:.1f} Ų")
656
```
657
658
### Structural Analysis and Similarity
659
660
```python
661
from alphabase.smiles.smiles import (
662
calculate_tanimoto_similarity, cluster_peptides_by_structure,
663
find_structural_motifs
664
)
665
666
# Calculate similarity between peptides
667
smiles1 = batch_smiles[0]
668
smiles2 = batch_smiles[1]
669
similarity = calculate_tanimoto_similarity(smiles1, smiles2)
670
print(f"Tanimoto similarity: {similarity:.3f}")
671
672
# Cluster peptides by structural similarity
673
clusters = cluster_peptides_by_structure(batch_smiles, similarity_threshold=0.6)
674
print(f"Found {len(clusters)} structural clusters")
675
for i, cluster in enumerate(clusters):
676
cluster_sequences = [sequences[idx] for idx in cluster]
677
print(f" Cluster {i+1}: {cluster_sequences}")
678
679
# Find structural motifs
680
motif_patterns = [
681
"CC(C)C", # Leucine/Isoleucine side chain
682
"c1ccccc1", # Benzene ring (Phe, Tyr, Trp)
683
"CCC(=O)O" # Carboxylic acid
684
]
685
686
for smiles in batch_smiles[:2]:
687
motifs = find_structural_motifs(smiles, motif_patterns)
688
print(f"Motifs in {smiles[:20]}...")
689
for pattern, matches in motifs.items():
690
if matches:
691
print(f" Found pattern {pattern}: {len(matches)} matches")
692
```
693
694
### Modification Database Operations
695
696
```python
697
from alphabase.smiles.smiles import ModificationSmilesDatabase, load_unimod_smiles_mappings
698
699
# Load modification database
700
mod_db = ModificationSmilesDatabase()
701
702
# Get SMILES for common modifications
703
oxidation_smiles = mod_db.get_modification_smiles("Oxidation (M)")
704
phospho_smiles = mod_db.get_modification_smiles("Phospho (STY)")
705
print(f"Oxidation SMILES: {oxidation_smiles}")
706
print(f"Phosphorylation SMILES: {phospho_smiles}")
707
708
# Add custom modification
709
mod_db.add_custom_modification(
710
name="Custom_Alkylation",
711
smiles="CCCC",
712
reaction_pattern="[#6:1]>>[#6:1]CCCC"
713
)
714
715
# Search for structurally similar modifications
716
query_smiles = "CC(=O)N"
717
similar_mods = mod_db.search_modifications_by_structure(
718
query_smiles,
719
similarity_threshold=0.7
720
)
721
print(f"Similar modifications: {[mod['name'] for mod in similar_mods]}")
722
723
# Load UniMod mappings
724
unimod_smiles = load_unimod_smiles_mappings()
725
print(f"Loaded {len(unimod_smiles)} UniMod SMILES mappings")
726
```
727
728
### Fragment Ion Analysis
729
730
```python
731
# Generate fragment ion SMILES
732
fragment_smiles = encoder.fragment_peptide_smiles(
733
peptide_smiles=batch_smiles[0],
734
fragmentation_type='b_y'
735
)
736
737
print(f"Generated fragments:")
738
for frag_type, frag_list in fragment_smiles.items():
739
print(f" {frag_type} ions: {len(frag_list)}")
740
for frag in frag_list[:3]: # Show first 3
741
print(f" SMILES: {frag['smiles'][:30]}...")
742
print(f" Mass: {frag['mass']:.4f}")
743
```
744
745
### Visualization and Export
746
747
```python
748
from alphabase.smiles.smiles import (
749
visualize_peptide_structure, export_to_sdf, create_structure_grid
750
)
751
752
# Visualize single peptide structure
753
structure_img = visualize_peptide_structure(
754
smiles=batch_smiles[0],
755
output_path='peptide_structure.png',
756
format='png'
757
)
758
759
# Create grid visualization of multiple structures
760
grid_img = create_structure_grid(
761
smiles_list=batch_smiles,
762
labels=sequences,
763
grid_size=(2, 2)
764
)
765
766
# Export to SDF format for external analysis
767
export_to_sdf(
768
smiles_list=batch_smiles,
769
output_path='peptides.sdf',
770
include_properties=True
771
)
772
773
print("Visualization and export completed")
774
```
775
776
### Advanced Chemical Analysis
777
778
```python
779
from alphabase.smiles.smiles import (
780
generate_structural_fingerprints, identify_functional_groups,
781
assess_fragmentation_propensity
782
)
783
784
# Generate molecular fingerprints
785
fingerprint = generate_structural_fingerprints(
786
smiles=batch_smiles[0],
787
fingerprint_type='morgan'
788
)
789
print(f"Generated molecular fingerprint with {len(fingerprint)} bits")
790
791
# Identify functional groups
792
functional_groups = identify_functional_groups(batch_smiles[0])
793
print(f"Functional groups identified:")
794
for group in functional_groups:
795
print(f" {group['name']}: {group['count']} occurrences")
796
797
# Assess fragmentation propensity
798
fragmentation = assess_fragmentation_propensity(batch_smiles[0])
799
print(f"Fragmentation analysis:")
800
print(f" Predicted dominant fragments: {fragmentation['dominant_fragments']}")
801
print(f" Fragmentation score: {fragmentation['fragmentation_score']:.2f}")
802
```
803
804
### Integration with Proteomics Workflows
805
806
```python
807
# Example: Integrate SMILES with spectral library
808
from alphabase.spectral_library.base import SpecLibBase
809
import pandas as pd
810
811
# Create spectral library with SMILES information
812
spec_lib = SpecLibBase()
813
spec_lib.precursor_df = pd.DataFrame({
814
'sequence': sequences,
815
'mods': ['', '', '', ''],
816
'charge': [2, 2, 2, 2],
817
'proteins': ['P1', 'P2', 'P3', 'P4'],
818
'smiles': batch_smiles
819
})
820
821
# Add chemical properties to spectral library
822
for idx, row in spec_lib.precursor_df.iterrows():
823
props = calculate_molecular_descriptors(row['smiles'])
824
spec_lib.precursor_df.loc[idx, 'molecular_weight'] = props['molecular_weight']
825
spec_lib.precursor_df.loc[idx, 'logp'] = props['logp']
826
spec_lib.precursor_df.loc[idx, 'predicted_rt'] = predict_retention_time_from_smiles(row['smiles'])
827
828
print(f"Enhanced spectral library with chemical properties:")
829
print(spec_lib.precursor_df[['sequence', 'molecular_weight', 'logp', 'predicted_rt']].head())
830
```