Workspace: tessl
Visibility: Public
Describes: pkg:pypi/rdkit@2024.9.x

tessl/pypi-rdkit

tessl install tessl/pypi-rdkit@2024.9.0

Platform wheels for RDKit - a comprehensive cheminformatics and machine-learning library with Python bindings

  • Agent Success: 89% (agent success rate when using this tile)
  • Improvement: 1.01x (agent success rate improvement when using this tile compared to baseline)
  • Baseline: 88% (agent success rate without this tile)

evals/scenario-5/task.md

Molecular Similarity Search Tool

Build a molecular similarity search tool that can find structurally similar molecules from a dataset based on their chemical fingerprints. The tool should accept a query molecule and return the most similar molecules ranked by similarity score.

Requirements

Your implementation should:

  1. Accept molecules in SMILES format as input
  2. Generate molecular fingerprints for similarity comparison
  3. Calculate similarity scores between molecules
  4. Return a ranked list of similar molecules with their similarity scores
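
Requirements 2 and 3 do not name a similarity metric. A common choice for bit-vector fingerprints is the Tanimoto coefficient: the number of bits set in both fingerprints divided by the number of bits set in either. The sketch below assumes that metric and models a fingerprint as a plain set of on-bit indices. The `tanimoto` helper and the example bit sets are hypothetical, used only for illustration, and are not part of the task or of RDKit.

```python
def tanimoto(bits_a: set[int], bits_b: set[int]) -> float:
    """Tanimoto coefficient |A ∩ B| / |A ∪ B| for two sets of on-bit indices."""
    union = bits_a | bits_b
    if not union:
        # Assumption: two empty fingerprints score 0.0, which avoids dividing by zero.
        return 0.0
    return len(bits_a & bits_b) / len(union)


# Made-up bit indices, purely for illustration.
fp_query = {1, 5, 9, 42}
fp_target = {1, 5, 77}
score = tanimoto(fp_query, fp_target)  # 2 shared bits out of 5 distinct bits
```

The score always falls between 0 and 1, which matches the range the specification gives for both the threshold and the reported similarity.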

Input/Output Specifications

Input

  • A query molecule represented as a SMILES string
  • A list of target molecules (SMILES strings) to search against
  • A similarity threshold (float between 0 and 1)
  • Optional: fingerprint radius parameter
  • Optional: fingerprint bit length parameter

Output

  • A list of molecules that meet the similarity threshold
  • Each result should include:
    • The molecule's SMILES string
    • The similarity score (float between 0 and 1)
  • Results should be sorted by similarity score in descending order
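
The output rules above can be sketched independently of any chemistry library. The example assumes that "meet the threshold" means greater than or equal to it. The helper name `filter_and_rank` and the scores are invented for illustration and are not real fingerprint similarities.

```python
def filter_and_rank(scored: list[tuple[str, float]], threshold: float) -> list[dict]:
    """Keep (smiles, similarity) pairs at or above threshold, highest score first."""
    hits = [
        {"smiles": smiles, "similarity": similarity}
        for smiles, similarity in scored
        if similarity >= threshold  # assumption: the threshold is inclusive
    ]
    hits.sort(key=lambda hit: hit["similarity"], reverse=True)
    return hits


# Invented scores, not computed from real fingerprints.
example_scores = [("CC", 0.35), ("c1ccccc1", 0.10), ("CCCO", 0.62)]
ranked = filter_and_rank(example_scores, threshold=0.3)
```

An empty input, or a threshold that nothing reaches, yields an empty list rather than an error.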

Test Cases

Test Case 1: Basic Similarity Search { .test }

Given a query molecule "CCO" (ethanol) and a dataset containing ["CCCO", "CC", "c1ccccc1"], when searching with threshold 0.3:

  • The results should include molecules structurally similar to ethanol (CCCO and CC)
  • The results should exclude benzene (c1ccccc1) as it has very low similarity
  • Results must be sorted by similarity score in descending order
  • CCCO should have higher similarity than CC

@test

Test Case 2: Custom Fingerprint Parameters { .test }

Given a query molecule "CC(=O)O" (acetic acid), a target list containing similar molecules, and custom parameters (radius=3, nbits=1024):

  • The function should accept and use the custom radius and nbits parameters
  • Results should be returned successfully

@test
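
One way to honor the custom parameters, assuming Morgan (circular) fingerprints are used, is to pass them to RDKit's fingerprint generator. The task does not mandate a fingerprint type, so treating `radius` and `nbits` as the Morgan radius and the generator's `fpSize` is an assumption of this sketch.

```python
from rdkit import Chem
from rdkit.Chem import rdFingerprintGenerator

mol = Chem.MolFromSmiles("CC(=O)O")  # acetic acid, the query from Test Case 2

# Assumption: nbits from the task maps onto the generator's fpSize argument.
generator = rdFingerprintGenerator.GetMorganGenerator(radius=3, fpSize=1024)
fingerprint = generator.GetFingerprint(mol)

num_bits = fingerprint.GetNumBits()  # total length of the bit vector
```

Checking `num_bits` against the requested length is a cheap way to confirm that the parameters were actually used rather than silently ignored.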

Test Case 3: No Matches Above Threshold { .test }

Given a query molecule "CCO", a target list ["c1ccccc1", "c1ccc2ccccc2c1"], and threshold=0.8:

  • When no molecules meet the threshold, return an empty list
  • The function should not raise an error

@test

Implementation

@generates

API

def find_similar_molecules(
    query_smiles: str,
    target_smiles_list: list[str],
    threshold: float = 0.5,
    radius: int = 2,
    nbits: int = 2048
) -> list[dict]:
    """
    Find molecules similar to the query molecule.

    Args:
        query_smiles: SMILES string of the query molecule
        target_smiles_list: List of SMILES strings to search
        threshold: Minimum similarity score (0-1) to include in results
        radius: Fingerprint radius parameter (default: 2)
        nbits: Fingerprint bit length (default: 2048)

    Returns:
        List of dicts with keys 'smiles' and 'similarity', sorted by similarity descending
    """
    pass
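
A minimal sketch of one possible implementation follows. It assumes Morgan fingerprints and Tanimoto similarity, neither of which the specification names, and it assumes that unparseable target SMILES are skipped while an unparseable query raises `ValueError`. Those error-handling choices are illustrative and not required by the task.

```python
from rdkit import Chem, DataStructs
from rdkit.Chem import rdFingerprintGenerator


def find_similar_molecules(
    query_smiles: str,
    target_smiles_list: list[str],
    threshold: float = 0.5,
    radius: int = 2,
    nbits: int = 2048,
) -> list[dict]:
    """Rank targets by Tanimoto similarity of Morgan fingerprints (assumed metric)."""
    generator = rdFingerprintGenerator.GetMorganGenerator(radius=radius, fpSize=nbits)

    query_mol = Chem.MolFromSmiles(query_smiles)
    if query_mol is None:
        # Assumption: an invalid query is a caller error.
        raise ValueError(f"Could not parse query SMILES: {query_smiles!r}")
    query_fp = generator.GetFingerprint(query_mol)

    results = []
    for smiles in target_smiles_list:
        mol = Chem.MolFromSmiles(smiles)
        if mol is None:
            continue  # Assumption: silently skip targets that fail to parse.
        target_fp = generator.GetFingerprint(mol)
        similarity = DataStructs.TanimotoSimilarity(query_fp, target_fp)
        if similarity >= threshold:
            results.append({"smiles": smiles, "similarity": similarity})

    # Highest similarity first, as the output specification requires.
    results.sort(key=lambda item: item["similarity"], reverse=True)
    return results
```

Absolute scores depend on the fingerprint type and parameters, so whether a small molecule such as CC clears the 0.3 threshold in Test Case 1 is worth checking against the actual values produced by whichever fingerprint is chosen.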

Dependencies { .dependencies }

rdkit { .dependency }

Provides cheminformatics functionality for molecular fingerprint generation and similarity calculations.