tessl/pypi-dscribe

A Python package for creating feature transformations in applications of machine learning to materials science.

Local Descriptors

Local descriptors compute features for individual atoms or local atomic environments, producing per-atom feature vectors. These descriptors are ideal for machine learning tasks where atomic-level properties need to be predicted or where local chemical environments are the focus of analysis.

Capabilities

SOAP (Smooth Overlap of Atomic Positions)

SOAP describes the local atomic environment by expanding a Gaussian-smeared atomic density in radial basis functions and spherical harmonics. It captures both radial and angular information about neighboring atoms within a cutoff radius.

class SOAP:
    def __init__(self, r_cut, n_max, l_max, sigma=1.0, rbf="gto", 
                 weighting=None, average="off", compression={"mode": "off", "species_weighting": None}, 
                 species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize SOAP descriptor.
        
        Parameters:
        - r_cut (float): Cutoff radius in angstroms for the local environment
        - n_max (int): Number of radial basis functions
        - l_max (int): Maximum degree of spherical harmonics
        - sigma (float): Width of atomic Gaussians for broadening
        - rbf (str): Radial basis functions ("gto" for Gaussian-type orbitals or "polynomial")
        - weighting (dict): Weighting function configuration for neighbor contributions
        - average (str): Averaging mode ("off", "inner", "outer")
        - compression (dict): Compression settings for reducing dimensionality
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, centers=None, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create SOAP descriptor for given system(s).
        
        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Atom indices or Cartesian positions at which to compute SOAP. If None, compute for all atoms
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information
        
        Returns:
        numpy.ndarray or scipy.sparse matrix: SOAP descriptors
        """

    def derivatives(self, system, centers=None, include=None, exclude=None, 
                   method="auto", return_descriptor=False, n_jobs=1, only_physical_cores=False):
        """
        Calculate derivatives of SOAP descriptor with respect to atomic positions.
        
        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)  
        - centers (list): Indices of atoms to compute derivatives for
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        
        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in SOAP descriptor."""

Usage Example:

from dscribe.descriptors import SOAP
from ase.build import molecule

# Setup SOAP descriptor
soap = SOAP(
    species=["H", "O"],
    r_cut=5.0,
    n_max=8,
    l_max=6,
    sigma=0.2
)

# Create descriptor for water molecule
water = molecule("H2O")
soap_desc = soap.create(water)  # Shape: (n_atoms, n_features)

# Calculate derivatives
derivatives = soap.derivatives(water, return_descriptor=False)
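
The length of each per-atom SOAP vector can be estimated by hand. The sketch below assumes the default power-spectrum layout with compression mode "off", where every unordered pair of (species, radial function) channels contributes one value per angular degree l = 0 ... l_max. The helper soap_feature_count is hypothetical and not part of DScribe; treat its result as an estimate and compare it against soap.get_number_of_features().

```python
def soap_feature_count(n_species: int, n_max: int, l_max: int) -> int:
    """Estimate the SOAP vector length for compression mode "off".

    Assumption: the descriptor stores one value for every unordered pair of
    (species, radial basis function) channels and every l in 0..l_max.
    """
    n_channels = n_species * n_max                 # radial channels over all species
    n_pairs = n_channels * (n_channels + 1) // 2   # unordered pairs, incl. self-pairs
    return n_pairs * (l_max + 1)


# Settings from the usage example above: species H and O, n_max=8, l_max=6
print(soap_feature_count(n_species=2, n_max=8, l_max=6))  # 952
```

The quadratic growth in n_species * n_max is why the compression and averaging options matter for systems with many elements.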

ACSF (Atom-Centered Symmetry Functions)

ACSF uses Behler-Parrinello symmetry functions to create rotationally and translationally invariant descriptors based on local atomic environments. The G2 and G3 types are radial (two-body) functions, while G4 and G5 are angular (three-body) functions.

class ACSF:
    def __init__(self, r_cut, g2_params=None, g3_params=None, g4_params=None, g5_params=None,
                 species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize ACSF descriptor.
        
        Parameters:
        - r_cut (float): Cutoff radius for all symmetry functions
        - g2_params (list): Parameters for G2 (radial) symmetry functions as [eta, Rs] pairs
        - g3_params (list): Parameters for G3 (radial, cosine-modulated) symmetry functions as kappa values
        - g4_params (list): Parameters for G4 (angular) symmetry functions as [eta, zeta, lambda] triplets
        - g5_params (list): Parameters for G5 (angular) symmetry functions as [eta, zeta, lambda] triplets
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, centers=None, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create ACSF descriptor for given system(s).
        
        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute ACSF for. If None, compute for all atoms
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information
        
        Returns:
        numpy.ndarray or scipy.sparse matrix: ACSF descriptors
        """

    def derivatives(self, system, centers=None, include=None, exclude=None,
                   method="auto", return_descriptor=False, n_jobs=1, only_physical_cores=False):
        """
        Calculate derivatives of ACSF descriptor with respect to atomic positions.
        
        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute derivatives for
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        
        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in ACSF descriptor."""

Usage Example:

from dscribe.descriptors import ACSF
from ase.build import molecule

# Setup ACSF descriptor with different symmetry function parameters
acsf = ACSF(
    species=["H", "O"],
    r_cut=6.0,
    g2_params=[[1, 1], [1, 2]],  # G2 parameters: [eta, Rs]
    g4_params=[[1, 1, 1], [1, 2, -1]]  # G4 parameters: [eta, zeta, lambda], with lambda = +1 or -1
)

# Create descriptor for water molecule
water = molecule("H2O")
acsf_desc = acsf.create(water)  # Shape: (n_atoms, n_features)
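
To make the role of the [eta, Rs] pairs concrete, the sketch below evaluates a single G2 radial symmetry function in plain Python, using the standard Behler-Parrinello form G2 = sum_j exp(-eta * (R_ij - Rs)^2) * f_c(R_ij) with a cosine cutoff. The helper names and the example distances are illustrative assumptions; DScribe's compiled implementation may differ in details such as how contributions are split by neighbor species.

```python
import math


def cutoff(r: float, r_cut: float) -> float:
    """Cosine cutoff f_c(r): 1 at r = 0, decaying smoothly to 0 at r = r_cut."""
    if r >= r_cut:
        return 0.0
    return 0.5 * (math.cos(math.pi * r / r_cut) + 1.0)


def g2(distances, eta: float, r_s: float, r_cut: float) -> float:
    """Sum Gaussian-weighted, cutoff-damped contributions of all neighbors."""
    return sum(
        math.exp(-eta * (r - r_s) ** 2) * cutoff(r, r_cut)
        for r in distances
    )


# Illustrative center-neighbor distances in angstroms (assumed values)
neighbor_distances = [0.96, 0.96]
value = g2(neighbor_distances, eta=1.0, r_s=1.0, r_cut=6.0)
print(f"G2 = {value:.4f}")
```

Larger eta narrows the Gaussian around Rs, so a set of [eta, Rs] pairs acts as a bank of radial probes at different distances and resolutions.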

LMBTR (Local Many-Body Tensor Representation)

LMBTR is the local version of MBTR, computing many-body interaction terms for individual atomic environments. It provides detailed information about local chemical environments through k-body terms.

class LMBTR:
    def __init__(self, geometry=None, grid=None, weighting=None, normalize_gaussians=True,
                 normalization="none", species=None, periodic=False, sparse=False, dtype="float64"):
        """
        Initialize LMBTR descriptor.
        
        Parameters:
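        - geometry (dict): Geometry function configuration, e.g. {"function": "inverse_distance"}; distance-based functions give two-body (k=2) terms and angle-based functions give three-body (k=3) terms
        - grid (dict): Discretization grid for the geometry function ("min", "max", "n", "sigma")
        - weighting (dict): Weighting function for neighbor contributions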
        - normalize_gaussians (bool): Whether to normalize Gaussian broadening
        - normalization (str): Normalization scheme ("none" or "l2")
        - species (list): List of atomic species to include
        - periodic (bool): Whether to consider periodic boundary conditions
        - sparse (bool): Whether to return sparse arrays
        - dtype (str): Data type for arrays
        """

    def create(self, system, centers=None, n_jobs=1, only_physical_cores=False, verbose=False):
        """
        Create LMBTR descriptor for given system(s).
        
        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Atom indices or Cartesian positions at which to compute LMBTR. If None, compute for all atoms
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        - verbose (bool): Whether to print progress information
        
        Returns:
        numpy.ndarray or scipy.sparse matrix: LMBTR descriptors
        """

    def derivatives(self, system, centers=None, include=None, exclude=None,
                   method="auto", return_descriptor=False, n_jobs=1, only_physical_cores=False):
        """
        Calculate derivatives of LMBTR descriptor with respect to atomic positions.
        
        Parameters:
        - system: ASE Atoms object(s) or DScribe System object(s)
        - centers (list): Indices of atoms to compute derivatives for
        - include (list): Atomic indices to include in derivative calculation
        - exclude (list): Atomic indices to exclude from derivative calculation
        - method (str): Derivative calculation method ("auto", "analytical", "numerical")
        - return_descriptor (bool): Whether to also return the descriptor values
        - n_jobs (int): Number of parallel processes
        - only_physical_cores (bool): Whether to use only physical CPU cores
        
        Returns:
        numpy.ndarray or tuple: Derivatives array, optionally with descriptor values
        """

    def get_number_of_features(self):
        """Get total number of features in LMBTR descriptor."""

Usage Example:

from dscribe.descriptors import LMBTR
from ase.build import molecule

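# Setup LMBTR descriptor. Each instance takes a single geometry function;
# "inverse_distance" gives a two-body (k=2) term.
lmbtr = LMBTR(
    species=["H", "O"],
    geometry={"function": "inverse_distance"},
    grid={"min": 0.5, "max": 2.0, "n": 50, "sigma": 0.05},
    weighting={"function": "exp", "scale": 0.5, "threshold": 1e-3},
    periodic=False,
)

# A three-body (k=3) term needs its own instance, here with the "angle" function
lmbtr_k3 = LMBTR(
    species=["H", "O"],
    geometry={"function": "angle"},
    grid={"min": 0, "max": 180, "n": 50, "sigma": 5},
    weighting={"function": "exp", "scale": 0.5, "threshold": 1e-3},
    periodic=False,
)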

# Create descriptor for water molecule
water = molecule("H2O")
lmbtr_desc = lmbtr.create(water)  # Shape: (n_atoms, n_features)
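
Conceptually, a two-body term with the inverse_distance geometry function maps each center-neighbor distance r to 1/r and smears that value onto the discretization grid with a Gaussian of width sigma. The following plain-Python sketch illustrates the idea for one center. It ignores weighting, species resolution and normalization, the distances are assumed values, and broaden_on_grid is a hypothetical helper rather than DScribe code.

```python
import math


def broaden_on_grid(values, g_min, g_max, n, sigma):
    """Sum unit-area Gaussians centered at `values` on an n-point grid."""
    step = (g_max - g_min) / (n - 1)
    grid = [g_min + i * step for i in range(n)]
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    spectrum = [
        sum(norm * math.exp(-0.5 * ((x - v) / sigma) ** 2) for v in values)
        for x in grid
    ]
    return grid, spectrum


# Illustrative distances (in angstroms) from one center to its two neighbors
distances = [0.97, 1.54]
inverse_distances = [1.0 / r for r in distances]

# Same grid settings as the two-body grid in the usage example above
grid, spectrum = broaden_on_grid(
    inverse_distances, g_min=0.5, g_max=2.0, n=50, sigma=0.05
)
peak = max(range(len(grid)), key=lambda i: spectrum[i])
print(f"strongest response near 1/r = {grid[peak]:.2f}")
```

The grid range should bracket the values the geometry function can take for the system at hand; contributions that fall outside [min, max] are simply not represented in the output.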

Common Local Descriptor Features

All local descriptors share these characteristics:

  • Per-atom output: Each descriptor returns features for individual atoms
  • Center specification: Can compute features for specific atoms using the centers parameter
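  • Parallel processing: create() and derivatives() accept n_jobs to distribute multiple systems across processes
  • Derivative support: derivatives() returns derivatives with respect to atomic positions; method="auto" picks the most efficient method (analytical or numerical) available for the descriptor configuration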
  • Averaging options: Some descriptors (like SOAP) support different averaging schemes

Output Shapes

Local descriptors return arrays with shape:

  • Single system: (n_centers, n_features) where n_centers is the number of atoms processed
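  • Multiple systems: results follow the input order of the systems. In recent DScribe versions the per-system arrays are stacked into (n_systems, n_centers, n_features) when every system has the same number of centers, and returned as a list of 2D arrays otherwise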

When centers is specified, only those atomic indices are processed, reducing the output size.

Install with Tessl CLI

npx tessl i tessl/pypi-dscribe
