tessl/pypi-obonet

Parse OBO formatted ontologies into NetworkX MultiDiGraph data structures

obonet

Name: tessl/pypi-obonet
Author: tessl

Parse OBO (Open Biomedical Ontologies) formatted files into NetworkX MultiDiGraph data structures. obonet provides a simple, pythonic interface for loading and manipulating ontological data used in bioinformatics and scientific research.

Package Information

Package Name: obonet
Package Type: PyPI
Language: Python
Installation: pip install obonet

Core Imports

import obonet

Basic Usage

import obonet
import networkx

# Read from a local file
graph = obonet.read_obo('path/to/ontology.obo')

# Read from a URL
url = 'https://example.com/ontology.obo'
graph = obonet.read_obo(url)

# Read compressed file (automatic detection)
graph = obonet.read_obo('ontology.obo.gz')

# Basic graph operations
print(f"Number of terms: {len(graph)}")
print(f"Number of relationships: {graph.number_of_edges()}")

# Check if ontology is a directed acyclic graph
is_dag = networkx.is_directed_acyclic_graph(graph)

# Get term information
for term_id, data in graph.nodes(data=True):
    name = data.get('name', 'Unknown')
    print(f"{term_id}: {name}")
    break

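# Edges point from subterm to superterm (child -> parent), so
# networkx.descendants returns every superterm of a term, direct or indirect.
# 'TERM:0000001' is a placeholder ID.
superterms = networkx.descendants(graph, 'TERM:0000001')

# Conversely, networkx.ancestors returns every subterm of a term
subterms = networkx.ancestors(graph, 'TERM:0000001')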

Capabilities

OBO File Parsing

Parse OBO formatted ontology files into NetworkX MultiDiGraph structures.

def read_obo(
    path_or_file: PathType, 
    ignore_obsolete: bool = True, 
    encoding: str | None = "utf-8"
) -> networkx.MultiDiGraph[str]:
    """
    Parse an OBO formatted ontology file into a NetworkX MultiDiGraph.
    
    Parameters:
    - path_or_file: Path, URL, or open file object for the OBO file
    - ignore_obsolete: When True (default), excludes obsolete terms from the graph
    - encoding: Character encoding for file reading (default: "utf-8")
    
    Returns:
    NetworkX MultiDiGraph representing the ontology with:
    - Nodes: ontology terms with metadata as attributes
    - Edges: relationships with relationship type as edge key
    - Graph attributes: header information from OBO file
    """

Package Version

Access the package version information.

__version__: str | None

Input Support

Local files: Standard file paths and pathlib.Path objects
URLs: HTTP, HTTPS, and FTP URLs
Open file objects: Pre-opened file handles
Compression: Automatic detection and support for .gz, .bz2, .xz formats
Encoding: Configurable text encoding (UTF-8 default)

Output Format

The returned NetworkX MultiDiGraph contains:

Nodes: Each ontology term as a node with term ID as the key
Node attributes: All OBO term metadata (name, definition, synonyms, etc.)
Edges: Relationships between terms (is_a, part_of, etc.)
Edge keys: Relationship type (allows multiple relationships between same terms)
Graph attributes: OBO file header information (ontology name, version, etc.)

OBO Format Support

OBO versions: Compliant with OBO specification versions 1.2 and 1.4
Stanza types: [Term], [Typedef], and [Instance], plus the header section at the top of the file (which has no bracketed marker)
Standard tags: All standard OBO tag-value pairs
Comments: Processes comments and trailing modifiers
Relationships: Hierarchical relationships (is_a, part_of, etc.)

Types

from typing import Union, TextIO, Any
import os

PathType = Union[str, "os.PathLike[Any]", TextIO]

Error Handling

Graceful handling of missing or malformed OBO data
Optional filtering of obsolete terms via ignore_obsolete parameter
Comprehensive logging for debugging parsing issues
ValueError raised for invalid tag-value pair syntax
