# Chemical Features

Pharmacophore and chemical feature detection for identifying functional groups, binding sites, and molecular patterns. RDKit's feature system matches molecules against customizable feature definitions (SMARTS patterns grouped into families), which supports pharmacophore modeling and other chemical analyses in drug discovery.

## Capabilities

### Feature Factory Creation

Create feature factories from feature definition (`.fdef`) files or strings. A factory holds the parsed definitions and can be reused across molecules.

```python { .api }
def BuildFeatureFactory(fdefName: str):
    """
    Create a feature factory from a feature definition file.

    Parameters:
    - fdefName: Path to feature definition file (.fdef format)

    Returns:
    Feature factory (MolChemicalFeatureFactory) for pattern detection
    """

def BuildFeatureFactoryFromString(fdefString: str):
    """
    Create a feature factory from a feature definition string.

    Parameters:
    - fdefString: Feature definition string in .fdef format

    Returns:
    Feature factory (MolChemicalFeatureFactory) for pattern detection
    """
```

### Feature Detection

Identify chemical features and functional groups in molecular structures.

```python { .api }
class MolChemicalFeatureFactory:
    """
    Factory for detecting chemical features in molecules.
    """

    def GetFeaturesForMol(self, mol: Mol, includeOnly: str = "", confId: int = -1) -> tuple:
        """
        Find all features in a molecule.

        Parameters:
        - mol: Input molecule
        - includeOnly: Restrict results to one feature family (default "" returns all families)
        - confId: Conformer ID used for feature positions (default -1)

        Returns:
        Tuple of feature objects (MolChemicalFeature)
        """

    def GetFeatureDefs(self) -> dict:
        """
        Get all feature definitions in this factory.

        Returns:
        Dict mapping 'Family.Type' names to SMARTS pattern strings
        """

    def GetFeatureFamilies(self) -> tuple:
        """
        Get all feature family names.

        Returns:
        Tuple of feature family name strings
        """
```

### Chemical Feature Objects

Represent detected chemical features with spatial and chemical information.

```python { .api }
class MolChemicalFeature:
    """
    Represents a detected chemical feature in a molecule.
    """

    def GetFamily(self) -> str:
        """
        Get the feature family name.

        Returns:
        Feature family string (e.g., 'Donor', 'Acceptor', 'Aromatic')
        """

    def GetType(self) -> str:
        """
        Get the specific feature type.

        Returns:
        Feature type string
        """

    def GetAtomIds(self) -> tuple:
        """
        Get atom indices involved in this feature.

        Returns:
        Tuple of atom indices
        """

    def GetPos(self) -> Point3D:
        """
        Get the 3D position of the feature.

        Returns:
        Point3D with x, y, z attributes (also indexable as pos[0], pos[1], pos[2])
        """

    def GetId(self) -> int:
        """
        Get the feature ID.

        Returns:
        Feature ID integer
        """
```
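
Feature objects are read through accessor methods, so it can be convenient to copy the values into a plain record before storing or comparing them. The following is a minimal sketch under stated assumptions: `FeatureRecord` and `from_feature` are hypothetical helper names, not part of RDKit, and the stand-in object only mimics the accessors listed above so that the snippet runs without RDKit installed.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Tuple


@dataclass(frozen=True)
class FeatureRecord:
    """Plain-Python snapshot of one detected feature (hypothetical helper)."""
    family: str
    feature_type: str
    atom_ids: Tuple[int, ...]
    position: Tuple[float, float, float]

    @classmethod
    def from_feature(cls, feat) -> "FeatureRecord":
        pos = feat.GetPos()
        return cls(
            family=feat.GetFamily(),
            feature_type=feat.GetType(),
            atom_ids=tuple(feat.GetAtomIds()),
            # Index access works for plain tuples and for indexable point objects.
            position=(float(pos[0]), float(pos[1]), float(pos[2])),
        )


# Stand-in object (an assumption for illustration); a real feature would come
# from the factory's GetFeaturesForMol() call.
stub = SimpleNamespace(
    GetFamily=lambda: "Donor",
    GetType=lambda: "SingleAtomDonor",
    GetAtomIds=lambda: (0,),
    GetPos=lambda: (1.0, 2.0, 3.0),
)

record = FeatureRecord.from_feature(stub)
print(record)
```

Because the record is frozen and holds only built-in types, it can be hashed, compared, or serialized independently of the molecule it came from.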

### Feature Definition Objects

Define patterns and rules for chemical feature detection.

```python { .api }
class FeatureDef:
    """
    Definition of a chemical feature pattern.
    """

    def GetFamily(self) -> str:
        """
        Get the feature family name.

        Returns:
        Feature family string
        """

    def GetType(self) -> str:
        """
        Get the specific feature type name.

        Returns:
        Feature type string
        """

    def GetSmarts(self) -> str:
        """
        Get the SMARTS pattern for this feature.

        Returns:
        SMARTS pattern string
        """

    def GetWeight(self) -> float:
        """
        Get the feature weight/importance.

        Returns:
        Weight value
        """
```

### Pharmacophore Analysis

Analyze spatial relationships between chemical features for pharmacophore modeling.

```python { .api }
def GetDistanceMatrix(features: list, confId: int = -1) -> list:
    """
    Calculate distance matrix between features.

    Parameters:
    - features: List of ChemicalFeature objects
    - confId: Conformer ID to use (default -1)

    Returns:
    2D list representing distance matrix
    """

def Get3DDistanceMatrix(mol: Mol, confId: int = -1, useAtomWts: bool = False) -> numpy.ndarray:
    """
    Calculate the 3D distance matrix for all atoms in a molecule.

    Parameters:
    - mol: Input molecule with 3D coordinates
    - confId: Conformer ID to use (default -1)
    - useAtomWts: Use atomic weights for the diagonal elements of the matrix (default False)

    Returns:
    N x N NumPy array of interatomic distances
    """
```
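
As an illustration of what a feature-to-feature distance matrix contains, the sketch below builds one from plain coordinates using only the standard library. It assumes the positions were already extracted from the features (for example as `(p.x, p.y, p.z)` for each `p = feat.GetPos()`); `pairwise_distances` is a hypothetical helper name and the coordinates are made up for the example.

```python
import math
from typing import List, Sequence


def pairwise_distances(positions: Sequence[Sequence[float]]) -> List[List[float]]:
    """Return a symmetric matrix of Euclidean distances between 3D points."""
    n = len(positions)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(positions[i], positions[j])
            # The matrix is symmetric, so fill both halves at once.
            matrix[i][j] = matrix[j][i] = d
    return matrix


# Made-up feature positions (same length unit as the conformer coordinates)
positions = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0), (0.0, 0.0, 5.0)]
matrix = pairwise_distances(positions)

for row in matrix:
    print("  ".join(f"{value:6.2f}" for value in row))
```

The diagonal stays at zero and each off-diagonal entry is computed once, which keeps the cost at n(n-1)/2 distance evaluations.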

### Built-in Feature Definitions

Access to RDKit's standard feature definition resources.

```python { .api }
def GetFeatureDefFile() -> str:
    """
    Get path to the default BaseFeatures.fdef file.

    Returns:
    Path to BaseFeatures.fdef in RDKit data directory
    """
```

## Standard Feature Families

RDKit's BaseFeatures.fdef defines the following standard feature families:

### Hydrogen Bond Features
- **Donor**: Hydrogen bond donor groups (NH, OH, SH)
- **Acceptor**: Hydrogen bond acceptor atoms (N, O with lone pairs)

### Hydrophobic Features
- **Hydrophobe**: Hydrophobic atoms (alkyl chains, aromatic carbons)
- **LumpedHydrophobe**: Multi-atom hydrophobic groups reported as a single feature (hydrophobic rings, tert-butyl, isopropyl)

### Aromatic Features
- **Aromatic**: Aromatic ring systems; feature types such as `Arom5` and `Arom6` distinguish ring sizes within this family

### Charged Features
- **PosIonizable**: Positively ionizable groups (amines, guanidines)
- **NegIonizable**: Negatively ionizable groups (carboxylates, phosphates)

### Metal Binding
- **ZnBinder**: Zinc-binding groups (histidine, cysteine)
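
The grouping of families into broader categories used by the headings above can be captured in a small lookup table, which is handy when summarizing detected features. This is only a sketch: `FAMILY_GROUPS` and `count_by_group` are hypothetical names, the category labels mirror the headings of this section, and the input list stands in for the family names that would be read from detected features with `GetFamily()`. `LumpedHydrophobe` is included because BaseFeatures.fdef defines it alongside `Hydrophobe`.

```python
from collections import Counter
from typing import Iterable

# Family name -> category label (labels follow the section headings above)
FAMILY_GROUPS = {
    "Donor": "Hydrogen Bond",
    "Acceptor": "Hydrogen Bond",
    "Hydrophobe": "Hydrophobic",
    "LumpedHydrophobe": "Hydrophobic",
    "Aromatic": "Aromatic",
    "PosIonizable": "Charged",
    "NegIonizable": "Charged",
    "ZnBinder": "Metal Binding",
}


def count_by_group(families: Iterable[str]) -> Counter:
    """Count family names per category; unknown families fall under 'Other'."""
    return Counter(FAMILY_GROUPS.get(name, "Other") for name in families)


# Made-up family names standing in for [f.GetFamily() for f in features]
families = ["Donor", "Donor", "Acceptor", "Aromatic", "Hydrophobe", "PosIonizable"]
counts = count_by_group(families)

for group, count in sorted(counts.items()):
    print(f"{group}: {count}")
```

Families from custom definition files are not in the table, so they are counted under `Other` rather than raising an error.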

## Usage Examples

### Basic Feature Detection

```python
import os
from rdkit import Chem, RDConfig
from rdkit.Chem import ChemicalFeatures

# Load the default feature factory
fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdefName)

# Create a molecule and detect features
mol = Chem.MolFromSmiles('OCc1ccccc1CN')  # [2-(aminomethyl)phenyl]methanol
features = factory.GetFeaturesForMol(mol)

print(f"Found {len(features)} features:")
for feat in features:
    print(f"- {feat.GetFamily()}: atoms {feat.GetAtomIds()}")
```

### Pharmacophore Analysis

```python
import os
from rdkit import Chem, RDConfig
from rdkit.Chem import ChemicalFeatures, AllChem

# Prepare molecule with 3D coordinates (add explicit hydrogens before embedding)
mol = Chem.AddHs(Chem.MolFromSmiles('OCc1ccccc1CN'))
AllChem.EmbedMolecule(mol, randomSeed=42)  # fixed seed for reproducible coordinates
AllChem.MMFFOptimizeMolecule(mol)

# Detect features
fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdefName)
features = factory.GetFeaturesForMol(mol)

# Analyze feature positions and distances
donor_features = [f for f in features if f.GetFamily() == 'Donor']
acceptor_features = [f for f in features if f.GetFamily() == 'Acceptor']

print(f"Donors: {len(donor_features)}")
print(f"Acceptors: {len(acceptor_features)}")

# Calculate distances between donors and acceptors
for i, donor in enumerate(donor_features):
    for j, acceptor in enumerate(acceptor_features):
        pos1 = donor.GetPos()
        pos2 = acceptor.GetPos()
        distance = ((pos1[0]-pos2[0])**2 + (pos1[1]-pos2[1])**2 + (pos1[2]-pos2[2])**2)**0.5
        print(f"Donor {i} to Acceptor {j}: {distance:.2f} Å")
```

### Custom Feature Definitions

```python
from rdkit import Chem
from rdkit.Chem import ChemicalFeatures

# Define custom feature patterns (one weight per atom in each SMARTS pattern)
custom_fdef = """
DefineFeature HalogenBond [#9,#17,#35,#53;X1]
  Family HalogenBond
  Weights 1.0
EndFeature

DefineFeature Nitrile [C]#[N]
  Family Nitrile
  Weights 1.0,1.0
EndFeature
"""

# Create factory from custom definitions
factory = ChemicalFeatures.BuildFeatureFactoryFromString(custom_fdef)

# Test on molecules containing these features
mol1 = Chem.MolFromSmiles('CCF')   # Contains fluorine
mol2 = Chem.MolFromSmiles('CC#N')  # Contains nitrile

features1 = factory.GetFeaturesForMol(mol1)
features2 = factory.GetFeaturesForMol(mol2)

print(f"Molecule 1 features: {[f.GetFamily() for f in features1]}")
print(f"Molecule 2 features: {[f.GetFamily() for f in features2]}")
```
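
When many custom features are needed, the definition text can be generated rather than written by hand. The helper below is a minimal sketch, assuming the same `DefineFeature` / `Family` / `Weights` / `EndFeature` layout used in the example above; `make_feature_def` is a hypothetical name, and the caller is responsible for supplying one weight per atom in the SMARTS pattern. The resulting string is what would be passed to `BuildFeatureFactoryFromString`.

```python
from typing import Sequence


def make_feature_def(name: str, smarts: str, weights: Sequence[float], family: str = "") -> str:
    """Format one feature definition block; the family defaults to the feature name."""
    weight_text = ",".join(str(float(w)) for w in weights)
    return "\n".join([
        f"DefineFeature {name} {smarts}",
        f"  Family {family or name}",
        f"  Weights {weight_text}",
        "EndFeature",
    ])


# Rebuild the two definitions from the example above
blocks = [
    make_feature_def("HalogenBond", "[#9,#17,#35,#53;X1]", [1.0]),
    make_feature_def("Nitrile", "[C]#[N]", [1.0, 1.0]),
]
custom_fdef = "\n\n".join(blocks) + "\n"
print(custom_fdef)
```

Keeping the weights as an explicit argument avoids guessing the atom count from the SMARTS string, which is easy to get wrong for patterns with recursive or bracketed atoms.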

### Feature Factory Inspection

```python
import os
from rdkit import RDConfig
from rdkit.Chem import ChemicalFeatures

# Load factory and inspect available features
fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdefName)

# Get all feature families
families = factory.GetFeatureFamilies()
print(f"Available feature families: {families}")

# Get feature definitions (a dict mapping 'Family.Type' names to SMARTS patterns)
feature_defs = factory.GetFeatureDefs()
print(f"\nTotal feature definitions: {len(feature_defs)}")

for name, smarts in list(feature_defs.items())[:5]:  # Show first 5
    print(f"- {name}: {smarts}")
```

### Integration with Molecular Descriptors

```python
import os
from rdkit import Chem, RDConfig
from rdkit.Chem import ChemicalFeatures, Descriptors

# Combine feature detection with descriptor calculation
mol = Chem.MolFromSmiles('OCc1ccccc1CN')

# Calculate basic descriptors
mw = Descriptors.MolWt(mol)
logp = Descriptors.MolLogP(mol)
hbd = Descriptors.NumHDonors(mol)
hba = Descriptors.NumHAcceptors(mol)

# Detect detailed features
fdefName = os.path.join(RDConfig.RDDataDir, 'BaseFeatures.fdef')
factory = ChemicalFeatures.BuildFeatureFactory(fdefName)
features = factory.GetFeaturesForMol(mol)

# Summarize results
print(f"Molecular Weight: {mw:.2f}")
print(f"LogP: {logp:.2f}")
print(f"HB Donors (Descriptors): {hbd}")
print(f"HB Acceptors (Descriptors): {hba}")
print(f"Total Features Detected: {len(features)}")

# Count features by family
feature_counts = {}
for feat in features:
    family = feat.GetFamily()
    feature_counts[family] = feature_counts.get(family, 0) + 1

for family, count in feature_counts.items():
    print(f"{family}: {count}")
```