or run

npx @tessl/cli init
Log in

Version

Tile

Overview

Evals

Files

Files

docs

core-classes.mdglobal-descriptors.mdindex.mdkernels.mdlocal-descriptors.mdmatrix-descriptors.mdutilities.md

matrix-descriptors.mddocs/

0

# Matrix Descriptors

1

2

Matrix descriptors represent atomic structures as matrices based on pairwise interactions between atoms, then transform these matrices into fixed-size feature vectors. These descriptors are particularly useful for molecular systems and provide intuitive representations of atomic interactions.

3

4

## Capabilities

5

6

### CoulombMatrix

7

8

The Coulomb Matrix represents atomic structures through Coulomb interactions between atoms. Matrix elements are the Coulomb repulsion between atoms (off-diagonal) or a polynomial of the atomic charge (diagonal).

9

10

```python { .api }

11

class CoulombMatrix:

12

def __init__(self, n_atoms_max, permutation="sorted_l2", sigma=None, seed=None, sparse=False):

13

"""

14

Initialize Coulomb Matrix descriptor.

15

16

Parameters:

17

- n_atoms_max (int): Maximum number of atoms in structures to be processed

18

- permutation (str): Permutation strategy for handling atom ordering:

19

- "sorted_l2": Sort rows/columns by L2 norm

20

- "eigenspectrum": Use eigenvalue spectrum

21

- "random": Random permutation with noise

22

- sigma (float): Standard deviation for random noise (only for "random" permutation)

23

- seed (int): Random seed for reproducible random permutations

24

- sparse (bool): Whether to return sparse arrays

25

"""

26

27

def create(self, system, n_jobs=1, only_physical_cores=False, verbose=False):

28

"""

29

Create Coulomb Matrix descriptor for given system(s).

30

31

Parameters:

32

- system: ASE Atoms object(s) or DScribe System object(s)

33

- n_jobs (int): Number of parallel processes

34

- only_physical_cores (bool): Whether to use only physical CPU cores

35

- verbose (bool): Whether to print progress information

36

37

Returns:

38

numpy.ndarray: Coulomb Matrix descriptors with shape (n_systems, n_features)

39

"""

40

41

def get_matrix(self, system):

42

"""

43

Get the Coulomb matrix for a single atomic system.

44

45

Parameters:

46

- system: ASE Atoms object or DScribe System object

47

48

Returns:

49

numpy.ndarray: 2D Coulomb matrix with shape (n_atoms, n_atoms)

50

"""

51

52

def get_number_of_features(self):

53

"""Get total number of features in flattened Coulomb Matrix descriptor."""

54

55

def unflatten(self, features, n_systems=None):

56

"""

57

Unflatten descriptor back to 2D matrix form.

58

59

Parameters:

60

- features: Flattened descriptor array

61

- n_systems (int): Number of systems (for multiple systems)

62

63

Returns:

64

numpy.ndarray: Unflattened matrix descriptors

65

"""

66

```

67

68

**Usage Example:**

69

70

```python

71

from dscribe.descriptors import CoulombMatrix

72

from ase.build import molecule

73

74

# Setup Coulomb Matrix descriptor

75

cm = CoulombMatrix(

76

n_atoms_max=10,

77

permutation="sorted_l2"

78

)

79

80

# Create descriptor for water molecule

81

water = molecule("H2O")

82

cm_desc = cm.create(water) # Shape: (1, n_features)

83

84

# Get the actual matrix (before flattening)

85

cm_matrix = cm.get_matrix(water) # Shape: (3, 3) for H2O

86

87

# Process multiple molecules

88

molecules = [molecule("H2O"), molecule("NH3"), molecule("CH4")]

89

cm_descriptors = cm.create(molecules) # Shape: (3, n_features)

90

```

91

92

### SineMatrix

93

94

The Sine Matrix is designed for periodic systems, using sine functions instead of Coulomb interactions to handle periodic boundary conditions more effectively.

95

96

```python { .api }

97

class SineMatrix:

98

def __init__(self, n_atoms_max, permutation="sorted_l2", sigma=None, seed=None, sparse=False):

99

"""

100

Initialize Sine Matrix descriptor.

101

102

Parameters:

103

- n_atoms_max (int): Maximum number of atoms in structures to be processed

104

- permutation (str): Permutation strategy for handling atom ordering:

105

- "sorted_l2": Sort rows/columns by L2 norm

106

- "eigenspectrum": Use eigenvalue spectrum

107

- "random": Random permutation with noise

108

- sigma (float): Standard deviation for random noise (only for "random" permutation)

109

- seed (int): Random seed for reproducible random permutations

110

- sparse (bool): Whether to return sparse arrays

111

"""

112

113

def create(self, system, n_jobs=1, only_physical_cores=False, verbose=False):

114

"""

115

Create Sine Matrix descriptor for given system(s).

116

117

Parameters:

118

- system: ASE Atoms object(s) or DScribe System object(s)

119

- n_jobs (int): Number of parallel processes

120

- only_physical_cores (bool): Whether to use only physical CPU cores

121

- verbose (bool): Whether to print progress information

122

123

Returns:

124

numpy.ndarray: Sine Matrix descriptors with shape (n_systems, n_features)

125

"""

126

127

def get_matrix(self, system):

128

"""

129

Get the sine matrix for a single atomic system.

130

131

Parameters:

132

- system: ASE Atoms object or DScribe System object

133

134

Returns:

135

numpy.ndarray: 2D sine matrix with shape (n_atoms, n_atoms)

136

"""

137

138

def get_number_of_features(self):

139

"""Get total number of features in flattened Sine Matrix descriptor."""

140

```

141

142

**Usage Example:**

143

144

```python

145

from dscribe.descriptors import SineMatrix

146

from ase.build import bulk

147

148

# Setup Sine Matrix descriptor for periodic systems

149

sm = SineMatrix(

150

n_atoms_max=8,

151

permutation="sorted_l2"

152

)

153

154

# Create descriptor for periodic system

155

nacl = bulk("NaCl", "rocksalt", a=5.64)

156

sm_desc = sm.create(nacl) # Shape: (1, n_features)

157

```

158

159

### EwaldSumMatrix

160

161

The Ewald Sum Matrix uses Ewald summation to compute electrostatic interactions in periodic systems, providing a more accurate treatment of long-range Coulomb interactions than the basic Coulomb Matrix.

162

163

```python { .api }

164

class EwaldSumMatrix:

165

def __init__(self, n_atoms_max, permutation="sorted_l2", sigma=None, seed=None, sparse=False):

166

"""

167

Initialize Ewald Sum Matrix descriptor.

168

169

Parameters:

170

- n_atoms_max (int): Maximum number of atoms in structures to be processed

171

- permutation (str): Permutation strategy for handling atom ordering:

172

- "sorted_l2": Sort rows/columns by L2 norm

173

- "eigenspectrum": Use eigenvalue spectrum

174

- "random": Random permutation with noise

175

- sigma (float): Standard deviation for random noise (only for "random" permutation)

176

- seed (int): Random seed for reproducible random permutations

177

- sparse (bool): Whether to return sparse arrays

178

"""

179

180

def create(self, system, accuracy=1e-5, w=1, r_cut=None, g_cut=None, a=None,

181

n_jobs=1, only_physical_cores=False, verbose=False):

182

"""

183

Create Ewald Sum Matrix descriptor for given system(s).

184

185

Parameters:

186

- system: ASE Atoms object(s) or DScribe System object(s)

187

- accuracy (float): Accuracy of Ewald summation

188

- w (float): Scaling parameter for electrostatic interactions

189

- r_cut (float): Real-space cutoff radius (auto-determined if None)

190

- g_cut (float): Reciprocal-space cutoff (auto-determined if None)

191

- a (float): Ewald parameter (auto-determined if None)

192

- n_jobs (int): Number of parallel processes

193

- only_physical_cores (bool): Whether to use only physical CPU cores

194

- verbose (bool): Whether to print progress information

195

196

Returns:

197

numpy.ndarray: Ewald Sum Matrix descriptors with shape (n_systems, n_features)

198

"""

199

200

def get_matrix(self, system, accuracy=1e-5, w=1, r_cut=None, g_cut=None, a=None):

201

"""

202

Get the Ewald sum matrix for a single atomic system.

203

204

Parameters:

205

- system: ASE Atoms object or DScribe System object

206

- accuracy (float): Accuracy of Ewald summation

207

- w (float): Scaling parameter for electrostatic interactions

208

- r_cut (float): Real-space cutoff radius

209

- g_cut (float): Reciprocal-space cutoff

210

- a (float): Ewald parameter

211

212

Returns:

213

numpy.ndarray: 2D Ewald sum matrix with shape (n_atoms, n_atoms)

214

"""

215

216

def get_number_of_features(self):

217

"""Get total number of features in flattened Ewald Sum Matrix descriptor."""

218

```

219

220

**Usage Example:**

221

222

```python

223

from dscribe.descriptors import EwaldSumMatrix

224

from ase.build import bulk

225

226

# Setup Ewald Sum Matrix descriptor

227

esm = EwaldSumMatrix(

228

n_atoms_max=8,

229

permutation="sorted_l2"

230

)

231

232

# Create descriptor for periodic system with custom Ewald parameters

233

nacl = bulk("NaCl", "rocksalt", a=5.64)

234

esm_desc = esm.create(nacl, accuracy=1e-6, w=0.5) # Shape: (1, n_features)

235

236

# Get the actual Ewald matrix

237

esm_matrix = esm.get_matrix(nacl, accuracy=1e-6) # Shape: (n_atoms, n_atoms)

238

```

239

240

## Matrix Descriptor Base Methods

241

242

All matrix descriptors inherit from `DescriptorMatrix` and share these methods:

243

244

```python { .api }

245

def sort(self, matrix):

246

"""

247

Sort matrix rows and columns by L2 norm.

248

249

Parameters:

250

- matrix: 2D matrix to sort

251

252

Returns:

253

numpy.ndarray: Sorted matrix

254

"""

255

256

def get_eigenspectrum(self, matrix):

257

"""

258

Get eigenvalue spectrum of matrix sorted by absolute value.

259

260

Parameters:

261

- matrix: 2D matrix

262

263

Returns:

264

numpy.ndarray: Sorted eigenvalues

265

"""

266

267

def zero_pad(self, array):

268

"""

269

Zero-pad matrix to n_atoms_max size.

270

271

Parameters:

272

- array: Matrix to pad

273

274

Returns:

275

numpy.ndarray: Zero-padded matrix

276

"""

277

```

278

279

## Permutation Strategies

280

281

Matrix descriptors handle different atom orderings through permutation strategies:

282

283

- **"sorted_l2"**: Sort matrix rows/columns by their L2 norms (most common)

284

- **"eigenspectrum"**: Use eigenvalue spectrum instead of full matrix

285

- **"random"**: Add random noise for data augmentation (requires `sigma` parameter)

286

287

## Usage Considerations

288

289

### System Size Requirements

290

291

All matrix descriptors require specifying `n_atoms_max`, which should be:

292

- At least as large as the biggest system you'll process

293

- Not too large to avoid unnecessary memory usage and computation time

294

295

### Periodic vs Non-Periodic Systems

296

297

- **CoulombMatrix**: Best for molecular systems (non-periodic)

298

- **SineMatrix**: Designed for periodic systems

299

- **EwaldSumMatrix**: Most accurate for periodic systems with long-range interactions

300

301

### Output Format

302

303

Matrix descriptors flatten the 2D matrices into 1D feature vectors:

304

- Original matrix shape: `(n_atoms, n_atoms)`

305

- Flattened output shape: `(n_atoms_max * n_atoms_max,)`

306

- For multiple systems: `(n_systems, n_atoms_max * n_atoms_max)`

307

308

Use the `unflatten()` method to recover the original 2D matrix format when needed.