
tessl/pypi-obonet

Parse OBO formatted ontologies into NetworkX MultiDiGraph data structures

obonet

obonet parses OBO (Open Biomedical Ontologies) formatted files into NetworkX MultiDiGraph objects. It provides a small, Pythonic interface for loading ontologies used in bioinformatics and scientific research, after which the standard NetworkX API can be used to query and manipulate them.

Package Information

  • Package Name: obonet
  • Package Type: PyPI
  • Language: Python
  • Installation: pip install obonet

Core Imports

import obonet

Basic Usage

import obonet
import networkx

# Read from a local file
graph = obonet.read_obo('path/to/ontology.obo')

# Read from a URL
url = 'https://example.com/ontology.obo'
graph = obonet.read_obo(url)

# Read compressed file (automatic detection)
graph = obonet.read_obo('ontology.obo.gz')

# Basic graph operations
print(f"Number of terms: {len(graph)}")
print(f"Number of relationships: {graph.number_of_edges()}")

# Check if ontology is a directed acyclic graph
is_dag = networkx.is_directed_acyclic_graph(graph)

# Get term information
for term_id, data in graph.nodes(data=True):
    name = data.get('name', 'Unknown')
    print(f"{term_id}: {name}")
    break

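# obonet edges point from a term to its parent (child -> parent), so
# networkx.descendants returns every superterm, not only the direct parents
superterms = networkx.descendants(graph, 'TERM:0000001')

# Likewise, networkx.ancestors returns every subterm of a specific term
subterms = networkx.ancestors(graph, 'TERM:0000001')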
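
Because edges run from child to parent, the NetworkX function names can look inverted at first. The following is a minimal sketch of that direction convention. It uses a hand-built MultiDiGraph as a stand-in for a parsed ontology, and the DEMO term IDs are hypothetical.

```python
import networkx

# Hand-built stand-in for a parsed ontology (hypothetical term IDs).
# Edges follow the child -> parent direction described above.
graph = networkx.MultiDiGraph()
graph.add_edge("DEMO:0000003", "DEMO:0000002", key="is_a")  # leaf -> middle
graph.add_edge("DEMO:0000002", "DEMO:0000001", key="is_a")  # middle -> root

# All superterms of the leaf, found by following edges forward
superterms = networkx.descendants(graph, "DEMO:0000003")

# All subterms of the root, found by following edges backward
subterms = networkx.ancestors(graph, "DEMO:0000001")

# Direct parents only (one step along outgoing edges)
direct_parents = set(graph.successors("DEMO:0000003"))

print(sorted(superterms))
print(sorted(subterms))
print(sorted(direct_parents))
```

Use successors or predecessors when only the immediate neighbors are wanted, and descendants or ancestors for the full transitive set.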

Capabilities

OBO File Parsing

Parse OBO formatted ontology files into NetworkX MultiDiGraph structures.

def read_obo(
    path_or_file: PathType, 
    ignore_obsolete: bool = True, 
    encoding: str | None = "utf-8"
) -> networkx.MultiDiGraph[str]:
    """
    Parse an OBO formatted ontology file into a NetworkX MultiDiGraph.
    
    Parameters:
    - path_or_file: Path, URL, or open file object for the OBO file
    - ignore_obsolete: When True (default), excludes obsolete terms from the graph
    - encoding: Character encoding for file reading (default: "utf-8")
    
    Returns:
    NetworkX MultiDiGraph representing the ontology with:
    - Nodes: ontology terms with metadata as attributes
    - Edges: relationships with relationship type as edge key
    - Graph attributes: header information from OBO file
    """

Package Version

Access the package version information.

__version__: str | None

Input Support

  • Local files: Standard file paths and pathlib.Path objects
  • URLs: HTTP, HTTPS, and FTP URLs
  • Open file objects: Pre-opened file handles
  • Compression: Automatic detection and support for .gz, .bz2, .xz formats
  • Encoding: Configurable text encoding (UTF-8 default)
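
To make the compression bullet concrete, here is a sketch of suffix-based detection using only the standard library. It is not obonet's implementation; the open_text helper and the OPENERS mapping are hypothetical names introduced for illustration.

```python
import bz2
import gzip
import lzma
import tempfile
from pathlib import Path

# Hypothetical mapping from file suffix to a stdlib module exposing open().
OPENERS = {".gz": gzip, ".bz2": bz2, ".xz": lzma}


def open_text(path, encoding="utf-8"):
    """Open a possibly compressed file in text mode, chosen by its suffix."""
    module = OPENERS.get(Path(path).suffix)
    if module is None:
        # No known compression suffix: treat as a plain text file.
        return open(path, "rt", encoding=encoding)
    return module.open(path, "rt", encoding=encoding)


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "demo.obo.gz"
    # Write a one-line gzip-compressed file to read back.
    with gzip.open(target, "wt", encoding="utf-8") as handle:
        handle.write("format-version: 1.2\n")
    with open_text(target) as handle:
        first_line = handle.readline().strip()

print(first_line)
```

With obonet itself, the equivalent step is simply passing the compressed path to read_obo, as in Basic Usage.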

Output Format

The returned NetworkX MultiDiGraph contains:

  • Nodes: Each ontology term as a node with term ID as the key
  • Node attributes: All OBO term metadata (name, definition, synonyms, etc.)
  • Edges: Relationships between terms (is_a, part_of, etc.)
  • Edge keys: Relationship type (allows multiple relationships between same terms)
  • Graph attributes: OBO file header information (ontology name, version, etc.)
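
The structure described in this list can be explored with plain NetworkX calls. The sketch below builds a small graph by hand to mimic that layout. The term IDs, names, and the "demo" graph name are placeholders, not output from a real ontology.

```python
import networkx

# Graph attributes stand in for OBO header information.
graph = networkx.MultiDiGraph(name="demo")

# Nodes are keyed by term ID and carry term metadata as attributes.
graph.add_node("DEMO:0000001", name="whole")
graph.add_node("DEMO:0000002", name="part")

# Two relationships between the same pair of terms, told apart by edge key.
graph.add_edge("DEMO:0000002", "DEMO:0000001", key="is_a")
graph.add_edge("DEMO:0000002", "DEMO:0000001", key="part_of")

# Collect the relationship types present in the graph.
edge_keys = sorted(key for _, _, key in graph.edges(keys=True))

# Map term IDs to their names.
names = {node: data["name"] for node, data in graph.nodes(data=True)}

# Keep only the is_a edges as (child, parent) pairs.
is_a_only = [
    (child, parent)
    for child, parent, key in graph.edges(keys=True)
    if key == "is_a"
]

print(edge_keys)
print(names)
print(graph.graph["name"])
```

Filtering on the edge key, as in is_a_only, is a common way to restrict traversal to a single relationship type.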

OBO Format Support

  • OBO versions: Compliant with OBO specification versions 1.2 and 1.4
  • Stanza types: [Term], [Typedef], and [Instance], plus the unbracketed header block at the top of the file
  • Standard tags: All standard OBO tag-value pairs
  • Comments: Processes comments and trailing modifiers
  • Relationships: Hierarchical relationships (is_a, part_of, etc.)

Types

from typing import Union, TextIO, Any
import os

PathType = Union[str, "os.PathLike[Any]", TextIO]
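
The three members of PathType correspond to the three kinds of input listed under Input Support. A small sketch of how calling code might tell them apart follows; describe_source is a hypothetical helper, not part of obonet.

```python
import io
import os
from pathlib import Path
from typing import Any, TextIO, Union

PathType = Union[str, "os.PathLike[Any]", TextIO]


def describe_source(source: PathType) -> str:
    """Classify an input by which member of PathType it matches."""
    if isinstance(source, str):
        return "string path or URL"
    if isinstance(source, os.PathLike):
        return "path-like object"
    return "open file object"


kinds = [
    describe_source(source)
    for source in ("ontology.obo", Path("ontology.obo"), io.StringIO(""))
]
print(kinds)
```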

Error Handling

  • Graceful handling of missing or malformed OBO data
  • Optional filtering of obsolete terms via ignore_obsolete parameter
  • Comprehensive logging for debugging parsing issues
  • ValueError raised for invalid tag-value pair syntax
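
To show what rejecting an invalid tag-value pair can look like, the sketch below parses lines of the form "tag: value ! comment" and raises ValueError otherwise. The regular expression and the parse_tag_value function are simplified assumptions for illustration. They do not reproduce obonet's parser and ignore trailing modifiers and escaped characters.

```python
import re

# Simplified, hypothetical pattern: "tag: value", optionally "! comment".
TAG_VALUE = re.compile(
    r"^(?P<tag>[^:\s]+):\s*(?P<value>.+?)\s*(?:!\s*(?P<comment>.*))?$"
)


def parse_tag_value(line):
    """Return (tag, value, comment) or raise ValueError for a malformed line."""
    match = TAG_VALUE.match(line.strip())
    if match is None:
        raise ValueError(f"Not a tag-value pair: {line!r}")
    return match.group("tag"), match.group("value"), match.group("comment")


print(parse_tag_value("is_a: DEMO:0000001 ! root"))
print(parse_tag_value("name: root"))

try:
    parse_tag_value("no separator here")
except ValueError as error:
    print(f"Rejected: {error}")
```

Callers that load untrusted or hand-edited OBO files can wrap read_obo in a similar try/except ValueError block.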
