A Python package for creating feature transformations in applications of machine learning to materials science.
Local descriptors compute features for individual atoms or local atomic environments, producing per-atom feature vectors. These descriptors are ideal for machine learning tasks where atomic-level properties need to be predicted or where local chemical environments are the focus of analysis.
SOAP (Smooth Overlap of Atomic Positions) creates descriptors of the local atomic environment using a spherical harmonics expansion. It captures both radial and angular information about neighboring atoms within a cutoff radius.
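To get a feel for how the output size grows with the basis, the sketch below estimates the length of a SOAP feature vector from the number of species and the `n_max` and `l_max` constructor parameters. It assumes an uncompressed power spectrum in which each feature pairs two (species, radial index) channels at a given degree l and symmetric pairs are stored once. The helper `soap_feature_count` is hypothetical and not part of DScribe; on a real descriptor, `get_number_of_features()` is the authoritative value.

```python
def soap_feature_count(n_species: int, n_max: int, l_max: int) -> int:
    """Estimate the SOAP power spectrum length (assumed uncompressed layout)."""
    # Each species contributes n_max radial channels.
    n_channels = n_species * n_max
    # Unordered pairs of channels, including a channel paired with itself.
    n_pairs = n_channels * (n_channels + 1) // 2
    # One value per channel pair for every degree l = 0 .. l_max.
    return n_pairs * (l_max + 1)


if __name__ == "__main__":
    # Two species (H, O) with n_max=8 and l_max=6.
    print(soap_feature_count(2, 8, 6))
```

Under this assumption the count grows quadratically with both the number of species and `n_max`, which is one reason to keep the species list as small as the dataset allows.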
class SOAP:
    def __init__(self, r_cut, n_max, l_max, sigma=1.0, rbf="gto",
                 weighting=None, average="off", compression={"mode": "off", "species_weighting": None},
                 species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize SOAP descriptor.

        Parameters:
        - r_cut (float): Cutoff radius in angstroms for the local environment
        - n_max (int): Number of radial basis functions
        - l_max (int): Maximum degree of spherical harmonics
        - sigma (float): Width of atomic Gaussians for broadening
        - rbf (str): Radial basis functions ("gto" for Gaussian-type orbitals or "polynomial")
        - weighting (dict): Weighting function configuration for neighbor contributions
        - average (str): Averaging mode ("off", "inner", "outer")
        - compression (dict): Compression settings for reducing dimensionality
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, centers=None, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create SOAP descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute SOAP for. If None, compute for all atoms
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: SOAP descriptors
        """

    def derivatives(self, system, centers=None, include=None, exclude=None,
                    method="auto", return_descriptor=False, n_jobs=1, only_physical_cores=False):
        """
        Calculate derivatives of SOAP descriptor with respect to atomic positions.

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute derivatives for
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores

        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in SOAP descriptor."""

Usage Example:
from dscribe.descriptors import SOAP
from ase.build import molecule

# Setup SOAP descriptor
soap = SOAP(
    species=["H", "O"],
    r_cut=5.0,
    n_max=8,
    l_max=6,
    sigma=0.2
)

# Create descriptor for water molecule
water = molecule("H2O")
soap_desc = soap.create(water)  # Shape: (n_atoms, n_features)

# Calculate derivatives
derivatives = soap.derivatives(water, return_descriptor=False)

ACSF uses Behler-Parrinello symmetry functions to create rotationally and translationally invariant descriptors based on local atomic environments. Different symmetry function types (G2, G3, G4, G5) capture radial and angular information.
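As an illustration of what a radial symmetry function measures, here is a minimal pure-Python sketch of the textbook Behler-Parrinello G2 term: a sum over neighbors of a Gaussian in the distance, shifted by `r_s` and damped by a cosine cutoff. The function names `cutoff` and `g2` and the coordinates are made up for the example. The sketch ignores species, whereas a real ACSF implementation evaluates these sums separately per neighbor element, so DScribe's output will not match it value for value.

```python
import math


def cutoff(r: float, r_cut: float) -> float:
    """Cosine cutoff that decays smoothly to zero at r_cut."""
    if r > r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def g2(center, neighbors, eta: float, r_s: float, r_cut: float) -> float:
    """Radial G2 value for one center: shifted Gaussians weighted by the cutoff."""
    total = 0.0
    for pos in neighbors:
        r = math.dist(center, pos)  # depends only on the interatomic distance
        total += math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_cut)
    return total


if __name__ == "__main__":
    # Hypothetical coordinates in angstroms: one center with two neighbors.
    center = (0.0, 0.0, 0.0)
    neighbors = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)]
    print(g2(center, neighbors, eta=1.0, r_s=1.0, r_cut=6.0))
```

Because only interatomic distances enter the sum, translating or rotating the whole structure leaves the value unchanged, which is the invariance described above.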
class ACSF:
    def __init__(self, r_cut, g2_params=None, g3_params=None, g4_params=None, g5_params=None,
                 species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize ACSF descriptor.

        Parameters:
        - r_cut (float): Cutoff radius for all symmetry functions
        - g2_params (list): Parameters for G2 (radial) symmetry functions as [eta, Rs] pairs
        - g3_params (list): Parameters for G3 (angular) symmetry functions as kappa values
        - g4_params (list): Parameters for G4 (angular) symmetry functions as [eta, zeta, lambda] triplets
        - g5_params (list): Parameters for G5 (angular) symmetry functions as [eta, zeta, lambda] triplets
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, centers=None, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create ACSF descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute ACSF for. If None, compute for all atoms
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: ACSF descriptors
        """

    def derivatives(self, system, centers=None, include=None, exclude=None,
                    method="auto", return_descriptor=False, n_jobs=1, only_physical_cores=False):
        """
        Calculate derivatives of ACSF descriptor with respect to atomic positions.

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute derivatives for
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores

        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in ACSF descriptor."""

Usage Example:
from dscribe.descriptors import ACSF
from ase.build import molecule

# Setup ACSF descriptor with different symmetry function parameters
acsf = ACSF(
    species=["H", "O"],
    r_cut=6.0,
    g2_params=[[1, 1], [1, 2]],  # G2 parameters: [eta, Rs]
    g4_params=[[1, 1, 1], [1, -1, 4]]  # G4 parameters: [eta, zeta, lambda]
)

# Create descriptor for water molecule
water = molecule("H2O")
acsf_desc = acsf.create(water)  # Shape: (n_atoms, n_features)

LMBTR is the local version of MBTR, computing many-body interaction terms for individual atomic environments. It provides detailed information about local chemical environments through k-body terms.
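To make the idea of a k-body term concrete, the following pure-Python sketch builds a two-body (k2) spectrum for a single center: each neighbor contributes its inverse distance to the center, and that value is smeared onto a regular grid with a Gaussian of width `sigma`. The function `k2_inverse_distance`, its arguments, and the coordinates are hypothetical. The sketch also leaves out the neighbor weighting and the per-species resolution that a full LMBTR implementation would apply, so it shows only the general shape of the computation.

```python
import math


def k2_inverse_distance(center, neighbors, g_min, g_max, n, sigma):
    """Gaussian-broadened distribution of 1/r between a center and its neighbors."""
    step = (g_max - g_min) / (n - 1)
    grid = [g_min + i * step for i in range(n)]
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))  # unit-area Gaussians
    spectrum = [0.0] * n
    for pos in neighbors:
        value = 1.0 / math.dist(center, pos)  # the k2 geometry function
        for i, x in enumerate(grid):
            spectrum[i] += norm * math.exp(-((x - value) ** 2) / (2.0 * sigma**2))
    return grid, spectrum


if __name__ == "__main__":
    # Hypothetical geometry in angstroms: neighbors at distances 1.0 and 2.0.
    grid, spectrum = k2_inverse_distance(
        (0.0, 0.0, 0.0), [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
        g_min=0.0, g_max=2.0, n=50, sigma=0.05,
    )
    print(len(grid), len(spectrum))
```

The `min`, `max`, `n`, and `sigma` grid settings play the same roles as `g_min`, `g_max`, `n`, and `sigma` here: they fix the range, resolution, and smoothness of the discretized spectrum.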
class LMBTR:
    def __init__(self, geometry=None, grid=None, weighting=None, normalize_gaussians=True,
                 normalization="none", species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize LMBTR descriptor.

        Parameters:
        - geometry (dict): Geometry functions for k1, k2, k3 terms
        - grid (dict): Discretization grids for each geometry function
        - weighting (dict): Weighting functions for neighbor contributions
        - normalize_gaussians (bool): Whether to normalize Gaussian broadening
        - normalization (str): Normalization scheme ("none", "l2", "n_atoms")
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, centers=None, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create LMBTR descriptor for given system(s).

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute LMBTR for. If None, compute for all atoms
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information

        Returns:
        numpy.ndarray or scipy.sparse matrix: LMBTR descriptors
        """

    def derivatives(self, system, centers=None, include=None, exclude=None,
                    method="auto", return_descriptor=False, n_jobs=1, only_physical_cores=False):
        """
        Calculate derivatives of LMBTR descriptor with respect to atomic positions.

        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute derivatives for
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores

        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in LMBTR descriptor."""

Usage Example:
from dscribe.descriptors import LMBTR
from ase.build import molecule

# Setup LMBTR descriptor
lmbtr = LMBTR(
    species=["H", "O"],
    geometry={
        "k2": {
            "function": "inverse_distance",
        },
        "k3": {
            "function": "angle",
        }
    },
    grid={
        "k2": {
            "min": 0.5,
            "max": 2.0,
            "n": 50,
            "sigma": 0.05
        },
        "k3": {
            "min": 0,
            "max": 180,
            "n": 50,
            "sigma": 5
        }
    }
)

# Create descriptor for water molecule
water = molecule("H2O")
lmbtr_desc = lmbtr.create(water)  # Shape: (n_atoms, n_features)

All local descriptors share a centers parameter, which selects the atoms to compute features for.

Local descriptors return arrays with shape:
- (n_centers, n_features) for a single system, where n_centers is the number of atoms processed
- (total_centers, n_features) for multiple systems, where total_centers is the sum across all systems

When centers is specified, only those atomic indices are processed, reducing the output size.