# obonet

Parse OBO (Open Biomedical Ontologies) formatted files into NetworkX MultiDiGraph data structures. obonet provides a simple, pythonic interface for loading and manipulating ontological data used in bioinformatics and scientific research.

## Package Information

- **Package Name**: obonet
- **Package Type**: PyPI
- **Language**: Python
- **Installation**: `pip install obonet`

## Core Imports

```python
import obonet
```

## Basic Usage

```python
import networkx
import obonet

# Read from a local file
graph = obonet.read_obo('path/to/ontology.obo')

# Read from a URL
url = 'https://example.com/ontology.obo'
graph = obonet.read_obo(url)

# Read a compressed file (compression is detected automatically)
graph = obonet.read_obo('ontology.obo.gz')

# Basic graph operations
print(f"Number of terms: {len(graph)}")
print(f"Number of relationships: {graph.number_of_edges()}")

# Check whether the ontology is a directed acyclic graph
is_dag = networkx.is_directed_acyclic_graph(graph)

# Print the ID and name of the first term
for term_id, data in graph.nodes(data=True):
    name = data.get('name', 'Unknown')
    print(f"{term_id}: {name}")
    break

# Edges point from a term to its parent, so networkx.descendants
# returns all superterms (direct and indirect parents) of a term
superterms = networkx.descendants(graph, 'TERM:0000001')

# Conversely, networkx.ancestors returns all subterms of a term
subterms = networkx.ancestors(graph, 'TERM:0000001')
```

## Capabilities

### OBO File Parsing

Parse OBO formatted ontology files into NetworkX MultiDiGraph structures.

```python { .api }
def read_obo(
    path_or_file: PathType,
    ignore_obsolete: bool = True,
    encoding: str | None = "utf-8"
) -> networkx.MultiDiGraph[str]:
    """
    Parse an OBO formatted ontology file into a NetworkX MultiDiGraph.

    Parameters:
    - path_or_file: Path, URL, or open file object for the OBO file
    - ignore_obsolete: When True (default), excludes obsolete terms from the graph
    - encoding: Character encoding for file reading (default: "utf-8")

    Returns:
    NetworkX MultiDiGraph representing the ontology with:
    - Nodes: ontology terms with metadata as attributes
    - Edges: relationships with relationship type as edge key
    - Graph attributes: header information from OBO file
    """
```

### Package Version

Access the package version information.

```python { .api }
__version__: str | None
```

## Input Support

- **Local files**: Standard file paths and `pathlib.Path` objects
- **URLs**: HTTP, HTTPS, and FTP URLs
- **Open file objects**: Pre-opened file handles
- **Compression**: Automatic detection and support for `.gz`, `.bz2`, and `.xz` formats
- **Encoding**: Configurable text encoding (UTF-8 default)

## Output Format

The returned NetworkX MultiDiGraph contains:

- **Nodes**: Each ontology term as a node with term ID as the key
- **Node attributes**: All OBO term metadata (name, definition, synonyms, etc.)
- **Edges**: Relationships between terms (is_a, part_of, etc.)
- **Edge keys**: Relationship type (allows multiple relationships between same terms)
- **Graph attributes**: OBO file header information (ontology name, version, etc.)

## OBO Format Support

- **OBO versions**: Compliant with OBO specification versions 1.2 and 1.4
- **Stanza types**: The file header plus `[Term]`, `[Typedef]`, and `[Instance]` stanzas
- **Standard tags**: All standard OBO tag-value pairs
- **Comments**: Processes comments and trailing modifiers
- **Relationships**: Hierarchical relationships (is_a, part_of, etc.)

## Types

```python { .api }
from typing import Union, TextIO, Any
import os

PathType = Union[str, "os.PathLike[Any]", TextIO]
```

## Error Handling

- Graceful handling of missing or malformed OBO data
- Optional filtering of obsolete terms via the `ignore_obsolete` parameter
- Comprehensive logging for debugging parsing issues
- `ValueError` raised for invalid tag-value pair syntax