tessl/pypi-obonet

Parse OBO formatted ontologies into NetworkX MultiDiGraph data structures

Workspace: tessl
Visibility: Public
Describes: pypipkg:pypi/obonet@1.1.x

To install, run

npx @tessl/cli install tessl/pypi-obonet@1.1.0


obonet

Parse OBO (Open Biomedical Ontologies) formatted files into NetworkX MultiDiGraph data structures. obonet provides a simple, pythonic interface for loading and manipulating ontological data used in bioinformatics and scientific research.

Package Information

  • Package Name: obonet
  • Package Type: PyPI
  • Language: Python
  • Installation: pip install obonet

Core Imports

import obonet

Basic Usage

import obonet
import networkx

# Read from a local file
graph = obonet.read_obo('path/to/ontology.obo')

# Read from a URL
url = 'https://example.com/ontology.obo'
graph = obonet.read_obo(url)

# Read compressed file (automatic detection)
graph = obonet.read_obo('ontology.obo.gz')

# Basic graph operations
print(f"Number of terms: {len(graph)}")
print(f"Number of relationships: {graph.number_of_edges()}")

# Check if ontology is a directed acyclic graph
is_dag = networkx.is_directed_acyclic_graph(graph)

# Get term information
for term_id, data in graph.nodes(data=True):
    name = data.get('name', 'Unknown')
    print(f"{term_id}: {name}")
    break

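# Find all superterms (direct and indirect parents) of a specific term.
# This uses networkx.descendants because edges point from child to parent.
superterms = networkx.descendants(graph, 'TERM:0000001')

# Find all subterms (direct and indirect children) of a specific term.
# This uses networkx.ancestors for the same reason.
subterms = networkx.ancestors(graph, 'TERM:0000001')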
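Edge direction is easy to get backwards, so a small check may help. The sketch below does not read a real ontology: it builds a three-term MultiDiGraph by hand (the DEMO: identifiers are invented for illustration), assuming the child-to-parent edge convention used in the example above, and shows which NetworkX call returns which side of the hierarchy.

```python
import networkx

# Hypothetical three-term hierarchy. Edges point from child to parent,
# and the edge key holds the relationship type.
graph = networkx.MultiDiGraph()
graph.add_edge("DEMO:0000002", "DEMO:0000001", key="is_a")
graph.add_edge("DEMO:0000003", "DEMO:0000002", key="is_a")

# Following outgoing edges from the middle term reaches its superterms.
superterms = networkx.descendants(graph, "DEMO:0000002")

# Following incoming edges to the middle term reaches its subterms.
subterms = networkx.ancestors(graph, "DEMO:0000002")

print(sorted(superterms))
print(sorted(subterms))
```

Both calls return plain Python sets of node identifiers, so they can be combined with ordinary set operations.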

Capabilities

OBO File Parsing

Parse OBO formatted ontology files into NetworkX MultiDiGraph structures.

def read_obo(
    path_or_file: PathType, 
    ignore_obsolete: bool = True, 
    encoding: str | None = "utf-8"
) -> networkx.MultiDiGraph[str]:
    """
    Parse an OBO formatted ontology file into a NetworkX MultiDiGraph.
    
    Parameters:
    - path_or_file: Path, URL, or open file object for the OBO file
    - ignore_obsolete: When True (default), excludes obsolete terms from the graph
    - encoding: Character encoding for file reading (default: "utf-8")
    
    Returns:
    NetworkX MultiDiGraph representing the ontology with:
    - Nodes: ontology terms with metadata as attributes
    - Edges: relationships with relationship type as edge key
    - Graph attributes: header information from OBO file
    """

Package Version

Access the package version information.

__version__: str | None

Input Support

  • Local files: Standard file paths and pathlib.Path objects
  • URLs: HTTP, HTTPS, and FTP URLs
  • Open file objects: Pre-opened file handles
  • Compression: Automatic detection and support for .gz, .bz2, .xz formats
  • Encoding: Configurable text encoding (UTF-8 default)
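To make "automatic detection" concrete, here is one way suffix-based dispatch can be sketched with the standard library alone. This is an illustration of the idea, not obonet's actual implementation; the open_text helper and the OPENERS table are hypothetical names.

```python
import bz2
import gzip
import lzma
import os
import tempfile

# Hypothetical lookup table (not part of obonet): file suffix -> opener.
OPENERS = {".gz": gzip.open, ".bz2": bz2.open, ".xz": lzma.open}


def open_text(path, encoding="utf-8"):
    """Open a possibly compressed file for reading as text."""
    suffix = os.path.splitext(path)[1]
    opener = OPENERS.get(suffix, open)  # fall back to the built-in open
    return opener(path, "rt", encoding=encoding)


# Round trip through a temporary gzip file to show the dispatch working.
with tempfile.TemporaryDirectory() as tmp_dir:
    gz_path = os.path.join(tmp_dir, "ontology.obo.gz")
    with gzip.open(gz_path, "wt", encoding="utf-8") as handle:
        handle.write("format-version: 1.2\n")
    with open_text(gz_path) as handle:
        first_line = handle.readline().strip()

print(first_line)
```

With obonet itself none of this is needed: passing the compressed path straight to read_obo is enough, as in the Basic Usage example.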

Output Format

The returned NetworkX MultiDiGraph contains:

  • Nodes: Each ontology term as a node with term ID as the key
  • Node attributes: All OBO term metadata (name, definition, synonyms, etc.)
  • Edges: Relationships between terms (is_a, part_of, etc.), directed from the child term to the parent term
  • Edge keys: Relationship type (allows multiple relationships between same terms)
  • Graph attributes: OBO file header information (ontology name, version, etc.)

OBO Format Support

  • OBO versions: Compliant with OBO specification versions 1.2 and 1.4
  • Stanza types: [Term], [Typedef], and [Instance] stanzas, plus the unbracketed header section at the top of the file
  • Standard tags: All standard OBO tag-value pairs
  • Comments: Processes comments and trailing modifiers
  • Relationships: Hierarchical relationships (is_a, part_of, etc.)

Types

from typing import Union, TextIO, Any
import os

PathType = Union[str, "os.PathLike[Any]", TextIO]

Error Handling

  • Graceful handling of missing or malformed OBO data
  • Optional filtering of obsolete terms via ignore_obsolete parameter
  • Comprehensive logging for debugging parsing issues
  • ValueError raised for invalid tag-value pair syntax