# Global Descriptors

Global descriptors compute features for entire atomic structures, producing a single feature vector per structure that captures overall structural properties. These descriptors are ideal for comparing and classifying different crystal structures or molecular conformations.

## Capabilities

### MBTR (Many-Body Tensor Representation)

MBTR represents atomic structures through many-body interaction terms, capturing both local and global structural information. It uses geometry functions to describe k-body interactions (k1: atomic properties, k2: pair interactions, k3: three-body angles) and discretizes them into histograms.

```python { .api }
class MBTR:
    def __init__(self, geometry=None, grid=None, weighting=None, normalize_gaussians=True,
                 normalization="none", species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize MBTR descriptor.

        Parameters:
        - geometry (dict): Geometry functions configuration for k1, k2, k3 terms:
          - k1: atomic properties (e.g., "atomic_number", "coulomb_matrix")
          - k2: pair interactions (e.g., "distance", "inverse_distance")
          - k3: three-body terms (e.g., "angle", "cosine")
        - grid (dict): Discretization grids for each geometry function:
          - min/max: range bounds for the grid
          - n: number of grid points
          - sigma: Gaussian broadening width
        - weighting (dict): Weighting functions for contributions:
          - function: weighting scheme (e.g., "unity", "exp", "inverse_r0")
          - r0, c: parameters for distance-based weighting
        - normalize_gaussians (bool): Whether to normalize Gaussian broadening
        - normalization (str): Normalization scheme ("none", "l2", "n_atoms")
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create MBTR descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: MBTR descriptors with shape (n_systems, n_features)
        """

    def derivatives(self, system, include=None, exclude=None, method="auto",
                    return_descriptor=True, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Calculate derivatives of MBTR descriptor with respect to atomic positions.

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values (default True)
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in MBTR descriptor."""
```

**Usage Example:**

```python
from dscribe.descriptors import MBTR
from ase.build import molecule

# Setup MBTR descriptor with k2 and k3 terms.
# species must list every element that appears in the systems passed to
# create(); C and N are included because NH3 and CH4 are processed below.
mbtr = MBTR(
    species=["H", "C", "N", "O"],
    geometry={
        "k2": {
            "function": "inverse_distance",
        },
        "k3": {
            "function": "angle",
        }
    },
    grid={
        "k2": {
            "min": 0.5,
            "max": 2.0,
            "n": 50,
            "sigma": 0.05
        },
        "k3": {
            "min": 0,
            "max": 180,
            "n": 50,
            "sigma": 5
        }
    },
    weighting={
        "k2": {
            "function": "exp",
            "r0": 3.5,
            "c": 0.5
        },
        "k3": {
            "function": "exp",
            "r0": 3.5,
            "c": 0.5
        }
    }
)

# Create descriptor for water molecule
water = molecule("H2O")
mbtr_desc = mbtr.create(water)  # Shape: (1, n_features)

# Process multiple systems
molecules = [molecule("H2O"), molecule("NH3"), molecule("CH4")]
mbtr_descriptors = mbtr.create(molecules)  # Shape: (3, n_features)
```

### ValleOganov

The ValleOganov descriptor is a shortcut for building the Valle-Oganov fingerprint with MBTR, using specific weighting and normalization settings. It provides a standardized way to create descriptors that follow the Valle-Oganov methodology.

```python { .api }
class ValleOganov:
    def __init__(self, species, function, n, sigma, r_cut, sparse=False, dtype="float64"):
        """
        Initialize Valle-Oganov descriptor.

        Parameters:
        - species (list): List of atomic species to include
        - function (str): Geometry function to use ("inverse_distance", "distance", etc.)
        - n (int): Number of grid points for discretization
        - sigma (float): Gaussian broadening width
        - r_cut (float): Cutoff radius for interactions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create Valle-Oganov descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: Valle-Oganov descriptors
        """

    def get_number_of_features(self):
        """Get total number of features in Valle-Oganov descriptor."""
```

**Usage Example:**

```python
from dscribe.descriptors import ValleOganov
from ase.build import molecule

# Setup Valle-Oganov descriptor
vo = ValleOganov(
    species=["H", "O"],
    function="inverse_distance",
    n=100,
    sigma=0.05,
    r_cut=6.0
)

# Create descriptor for water molecule
water = molecule("H2O")
vo_desc = vo.create(water)  # Shape: (1, n_features)
```

## MBTR Configuration Details

### Geometry Functions

MBTR supports different k-body terms:

- **k1 terms** (atomic): `"atomic_number"`, `"coulomb_matrix"`
- **k2 terms** (pairs): `"distance"`, `"inverse_distance"`
- **k3 terms** (triplets): `"angle"`, `"cosine"`

### Grid Configuration

Each geometry function requires a grid specification:

```python
grid = {
    "k2": {
        "min": 0.5,    # Minimum value
        "max": 5.0,    # Maximum value
        "n": 50,       # Number of grid points
        "sigma": 0.1   # Gaussian broadening width
    }
}
```
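
To make the roles of `min`, `max`, `n`, and `sigma` concrete, the following is a minimal pure-Python sketch of Gaussian broadening on an evenly spaced grid. It is an illustration only, not the library's implementation: the helper `broaden_on_grid` and the sample inverse distances are hypothetical, and weighting and normalization are left out.

```python
import math


def broaden_on_grid(values, grid_min, grid_max, n, sigma):
    """Sum one unit-area Gaussian per value, sampled on n evenly spaced points.

    Hypothetical helper for illustration; not part of DScribe.
    """
    step = (grid_max - grid_min) / (n - 1)
    grid = [grid_min + i * step for i in range(n)]
    # Prefactor that gives each Gaussian unit area before sampling.
    prefactor = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    histogram = [
        sum(prefactor * math.exp(-0.5 * ((x - v) / sigma) ** 2) for v in values)
        for x in grid
    ]
    return grid, histogram


# Made-up inverse distances (1/angstrom) for the three atom pairs of a small molecule.
inverse_distances = [1.03, 1.03, 0.66]

grid, histogram = broaden_on_grid(
    inverse_distances, grid_min=0.5, grid_max=5.0, n=50, sigma=0.1
)
peak_position = grid[histogram.index(max(histogram))]
```

A larger `sigma` smooths neighbouring peaks together, while a larger `n` resolves them more finely at the cost of a longer feature vector.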

### Weighting Functions

Weighting functions control how different contributions are weighted:

- `"unity"`: All contributions weighted equally
- `"exp"`: Exponential decay with distance
- `"inverse_r0"`: Inverse distance weighting

```python
weighting = {
    "k2": {
        "function": "exp",
        "r0": 3.5,   # Reference distance
        "c": 0.5     # Decay parameter
    }
}
```

## Common Global Descriptor Features

Global descriptors share these characteristics:

- **Per-structure output**: Each descriptor returns one feature vector per atomic structure
- **Structure-level properties**: Capture overall structural characteristics and symmetries
- **Comparison capability**: Enable direct comparison between different structures
- **Normalization options**: Support different normalization schemes for consistent scaling
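
As a rough illustration of what the normalization options mean, the sketch below applies the three scheme names listed for MBTR (`"none"`, `"l2"`, `"n_atoms"`) to a plain feature vector. It assumes `"l2"` divides by the Euclidean norm and `"n_atoms"` divides by the number of atoms; the helper `normalize` is hypothetical, and the library may apply normalization differently internally (for example, per k-term).

```python
import math


def normalize(vector, scheme="none", n_atoms=None):
    """Return a normalized copy of a feature vector (hypothetical helper)."""
    if scheme == "none":
        return list(vector)
    if scheme == "l2":
        # Scale to unit Euclidean length; leave an all-zero vector unchanged.
        norm = math.sqrt(sum(x * x for x in vector))
        return [x / norm for x in vector] if norm > 0 else list(vector)
    if scheme == "n_atoms":
        # Scale by system size so structures of different sizes are comparable.
        if not n_atoms:
            raise ValueError("n_atoms must be a positive integer")
        return [x / n_atoms for x in vector]
    raise ValueError(f"unknown normalization scheme: {scheme!r}")


features = [3.0, 4.0]
unit_length = normalize(features, "l2")
per_atom = normalize(features, "n_atoms", n_atoms=2)
```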

## Output Shapes

Global descriptors return arrays with shape:
- Single system: `(1, n_features)`
- Multiple systems: `(n_systems, n_features)`

This consistent output format makes global descriptors ideal for machine learning tasks that classify or compare entire structures, such as crystal structure prediction or molecular property prediction.
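
Because every structure maps to one row of equal length, rows can be compared directly. The sketch below uses made-up numbers in place of real descriptor output (nested lists standing in for a `(3, 4)` array) and a hypothetical `cosine_similarity` helper to build a pairwise similarity matrix.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Placeholder descriptor values for three structures: shape (n_systems, n_features).
descriptors = [
    [0.0, 1.0, 2.0, 0.0],
    [0.0, 2.0, 4.0, 0.0],  # same direction as the first row
    [3.0, 0.0, 0.0, 1.0],  # no overlap with the first two rows
]
n_systems, n_features = len(descriptors), len(descriptors[0])

# Pairwise similarity matrix with shape (n_systems, n_systems).
similarity = [[cosine_similarity(a, b) for b in descriptors] for a in descriptors]
```

The same matrix can be passed to clustering or kernel-based models, which is one reason a fixed `(n_systems, n_features)` layout is convenient.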