# Global Descriptors

Global descriptors compute features for an entire atomic structure, producing a single feature vector per structure that captures its overall structural properties. They are well suited to comparing and classifying crystal structures or molecular conformations.

## Capabilities

### MBTR (Many-Body Tensor Representation)

MBTR represents atomic structures through many-body interaction terms, capturing both local and global structural information. It uses geometry functions to describe k-body interactions (k=1: atomic properties, k=2: pair interactions, k=3: three-body angles) and discretizes them into Gaussian-broadened histograms.

```python { .api }
class MBTR:
    def __init__(self, geometry=None, grid=None, weighting=None, normalize_gaussians=True,
                 normalization="none", species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize MBTR descriptor.

        Parameters:
        - geometry (dict): Geometry function configuration, e.g. {"function": "inverse_distance"}.
          The chosen function determines the order k of the term:
            - k=1: atomic properties ("atomic_number")
            - k=2: pair interactions ("distance", "inverse_distance")
            - k=3: three-body terms ("angle", "cosine")
        - grid (dict): Discretization grid for the geometry function:
            - min/max: range bounds for the grid
            - n: number of grid points
            - sigma: Gaussian broadening width
        - weighting (dict): Weighting function for contributions:
            - function: weighting scheme (e.g., "unity", "exp", "inverse_square", "smooth_cutoff")
            - scale, threshold, r_cut, sharpness: parameters for distance-based weighting
        - normalize_gaussians (bool): Whether to normalize the Gaussians to an area of 1
        - normalization (str): Normalization scheme ("none", "l2", "n_atoms", "valle_oganov")
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create MBTR descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: MBTR descriptors with shape (n_systems, n_features)
        """

    def derivatives(self, system, include=None, exclude=None, method="auto",
                    return_descriptor=True, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Calculate derivatives of MBTR descriptor with respect to atomic positions.

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values (default True)
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in MBTR descriptor."""
```

**Usage Example:**

```python
from dscribe.descriptors import MBTR
from ase.build import molecule

# species must cover every element in the systems processed below
species = ["H", "C", "N", "O"]

# Each MBTR instance takes a single geometry function, so the k=2 (pair)
# and k=3 (angle) terms are set up as two separate descriptors.
mbtr_k2 = MBTR(
    species=species,
    geometry={"function": "inverse_distance"},
    grid={"min": 0.5, "max": 2.0, "n": 50, "sigma": 0.05},
    weighting={"function": "exp", "scale": 0.5, "threshold": 1e-3},
)
mbtr_k3 = MBTR(
    species=species,
    geometry={"function": "angle"},
    grid={"min": 0, "max": 180, "n": 50, "sigma": 5},
    weighting={"function": "exp", "scale": 0.5, "threshold": 1e-3},
)

# Create descriptors for a water molecule
water = molecule("H2O")
k2_desc = mbtr_k2.create(water)  # Shape: (1, n_features)
k3_desc = mbtr_k3.create(water)

# Process multiple systems
molecules = [molecule("H2O"), molecule("NH3"), molecule("CH4")]
k2_descriptors = mbtr_k2.create(molecules)  # Shape: (3, n_features)
```

### ValleOganov

The ValleOganov descriptor is a shortcut for building the Valle-Oganov fingerprint with MBTR, using preset weighting and normalization settings. It provides a standardized way to create descriptors that follow the Valle-Oganov methodology.

```python { .api }
class ValleOganov:
    def __init__(self, species, function, n, sigma, r_cut, sparse=False, dtype="float64"):
        """
        Initialize Valle-Oganov descriptor.

        Parameters:
        - species (list): List of atomic species to include
        - function (str): Geometry function to use ("distance" for k=2, "angle" for k=3)
        - n (int): Number of grid points for discretization
        - sigma (float): Gaussian broadening width
        - r_cut (float): Cutoff radius for interactions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create Valle-Oganov descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: Valle-Oganov descriptors
        """

    def get_number_of_features(self):
        """Get total number of features in Valle-Oganov descriptor."""
```

**Usage Example:**

```python
from dscribe.descriptors import ValleOganov
from ase.build import molecule

# Setup Valle-Oganov descriptor
vo = ValleOganov(
    species=["H", "O"],
    function="distance",
    n=100,
    sigma=0.05,
    r_cut=6.0
)

# Create descriptor for a water molecule. The Valle-Oganov setup is assumed
# here to treat systems as periodic, so the molecule is placed in a cell.
water = molecule("H2O")
water.set_cell([10.0, 10.0, 10.0])
water.set_pbc(True)
vo_desc = vo.create(water)  # Shape: (1, n_features)
```

## MBTR Configuration Details

### Geometry Functions

MBTR supports different k-body terms:

- **k=1 terms** (atomic): `"atomic_number"`
- **k=2 terms** (pairs): `"distance"`, `"inverse_distance"`
- **k=3 terms** (triplets): `"angle"`, `"cosine"`

### Grid Configuration

Each geometry function requires a grid specification:

```python
grid = {
    "min": 0.5,    # Minimum value
    "max": 5.0,    # Maximum value
    "n": 50,       # Number of grid points
    "sigma": 0.1   # Gaussian broadening width
}
```

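To make the roles of `min`, `max`, `n`, and `sigma` concrete, here is a minimal pure-Python sketch of Gaussian broadening on a grid. It is a simplified illustration of the idea, not DScribe's actual implementation; the helper names `make_grid` and `broaden` and the sample distances are hypothetical.

```python
import math


def make_grid(g_min, g_max, n):
    """Return n evenly spaced grid points from g_min to g_max (inclusive)."""
    step = (g_max - g_min) / (n - 1)
    return [g_min + i * step for i in range(n)]


def broaden(values, weights, grid, sigma):
    """Sum one weighted, area-normalized Gaussian per value on the grid."""
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    spectrum = [0.0] * len(grid)
    for value, weight in zip(values, weights):
        for i, x in enumerate(grid):
            spectrum[i] += weight * norm * math.exp(-0.5 * ((x - value) / sigma) ** 2)
    return spectrum


grid = make_grid(0.5, 5.0, 50)

# Hypothetical pair distances (in angstroms), all with unit weight
distances = [0.96, 0.96, 1.51]
spectrum = broaden(distances, [1.0] * len(distances), grid, sigma=0.1)

# Index of the grid point where the broadened spectrum is largest
peak_index = max(range(len(grid)), key=lambda i: spectrum[i])
```

A larger `sigma` smears each value over more grid points, while a larger `n` resolves narrower peaks at the cost of a longer feature vector.
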
### Weighting Functions

Weighting functions control how different contributions are weighted:

- `"unity"`: All contributions weighted equally
- `"exp"`: Exponential decay with distance
- `"inverse_square"`: Inverse squared distance weighting

```python
weighting = {
    "function": "exp",
    "scale": 0.5,        # Decay rate s in exp(-s * x)
    "threshold": 1e-3    # Contributions weighted below this value are ignored
}
```

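The exponential scheme can be sketched in a few lines of plain Python. The sketch assumes a weight of the form exp(-scale · x) together with a `threshold` below which contributions are dropped, so the two values together imply an effective cutoff distance. The parameter and helper names are used here for illustration only, and the code shows the formula rather than the library's implementation.

```python
import math


def exp_weight(x, scale):
    """Exponentially decaying weight for a distance-like quantity x."""
    return math.exp(-scale * x)


def cutoff_from_threshold(scale, threshold):
    """Distance at which exp(-scale * x) has decayed to the threshold."""
    return -math.log(threshold) / scale


scale, threshold = 0.5, 1e-3
r_cut = cutoff_from_threshold(scale, threshold)

# Contributions beyond r_cut would carry a weight smaller than the threshold
weight_at_cutoff = exp_weight(r_cut, scale)
```

Raising `scale` or `threshold` shortens the effective cutoff, which reduces the number of pairs or triplets that contribute.
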
## Common Global Descriptor Features

Global descriptors share these characteristics:

- **Per-structure output**: Each descriptor returns one feature vector per atomic structure
- **Structure-level properties**: Capture overall structural characteristics and symmetries
- **Comparison capability**: Enable direct comparison between different structures
- **Normalization options**: Support different normalization schemes for consistent scaling
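
The comparison and normalization points in the list above can be illustrated with a short, library-independent sketch: two feature vectors are L2-normalized and compared by cosine similarity. The vectors and helper names are hypothetical stand-ins for real descriptor output.

```python
import math


def l2_normalize(vec):
    """Scale a feature vector to unit Euclidean length (zero vectors unchanged)."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else list(vec)


def cosine_similarity(a, b):
    """Dot product of the two L2-normalized vectors."""
    return sum(x * y for x, y in zip(l2_normalize(a), l2_normalize(b)))


# Hypothetical feature vectors for three structures
desc_a = [0.0, 1.0, 2.0, 0.5]
desc_b = [0.0, 2.0, 4.0, 1.0]  # desc_a scaled by 2, e.g. twice as many atoms
desc_c = [3.0, 0.0, 0.0, 1.0]

sim_ab = cosine_similarity(desc_a, desc_b)
sim_ac = cosine_similarity(desc_a, desc_c)
```

Because normalization removes the overall scale, `desc_a` and `desc_b` compare as essentially identical, while `desc_c` scores much lower.
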

## Output Shapes

Global descriptors return arrays with shape:

- Single system: `(1, n_features)`
- Multiple systems: `(n_systems, n_features)`

This consistent output format makes global descriptors ideal for machine learning tasks that classify or compare entire structures, such as crystal structure prediction or molecular property prediction.