# Utilities and Helpers

Utility functions for RDF data manipulation, format detection, term conversion, and graph comparison that support common RDF processing tasks.

## Capabilities

### Term Conversion and Creation

Functions for creating RDF terms from their string representations.

```python { .api }
def to_term(s: str, default: Identifier = None) -> Identifier:
    """
    Create an RDF term from a string in a simple term syntax.

    Parameters:
    - s: String in term syntax: '<uri>' gives a URIRef, '"text"' a Literal, '_id' a BNode
    - default: Value returned when s is empty or None

    Returns:
    Identifier: URIRef, Literal, or BNode; raises an Exception for unrecognised syntax
    """

def from_n3(s: str, default: str = None, backend: str = None, nsm: NamespaceManager = None) -> Node:
    """
    Create an RDF term from its N3/Turtle representation.

    Parameters:
    - s: N3 string representation
    - default: Value returned when s is empty
    - backend: Store backend name, used when s denotes a quoted graph or formula
    - nsm: Namespace manager for resolving prefixed names

    Returns:
    Node: Parsed RDF term
    """
```

### Collection Utilities

General-purpose helpers for working with iterables.

```python { .api }
def first(seq):
    """
    Get the first item from an iterable.

    Parameters:
    - seq: Any iterable

    Returns:
    First item, or None if the iterable is empty
    """

def uniq(sequence, strip: int = 0) -> set:
    """
    Remove duplicate strings from a sequence. Order is not preserved.

    Parameters:
    - sequence: Input sequence
    - strip: If true, strip whitespace from each item before collecting it

    Returns:
    set: The distinct items
    """

def more_than(sequence, number: int) -> int:
    """
    Check whether an iterable has more than a given number of items.
    Iteration stops as soon as the threshold is exceeded.

    Parameters:
    - sequence: Iterable to check
    - number: Threshold count

    Returns:
    int: 1 if the iterable has more than number items, otherwise 0
    """
```
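
When input order matters, or when a strict `bool` is wanted, the same ideas can be sketched with the standard library alone. The helper names `unique_in_order` and `has_more_than` below are hypothetical and are not part of rdflib; this is an illustration, not the library's implementation.

```python
from itertools import islice

def unique_in_order(items):
    """Return the distinct hashable items, keeping first-seen order."""
    # dict keys keep insertion order (Python 3.7+), so they act as an ordered set.
    return list(dict.fromkeys(items))

def has_more_than(items, n):
    """Return True if the iterable yields more than n items.

    At most n + 1 items are consumed, so this is safe for long generators.
    """
    return len(list(islice(items, n + 1))) > n

ages = [30, 25, 35, 30]
print(unique_in_order(ages))         # [30, 25, 35]
print(has_more_than(iter(ages), 2))  # True
```

Note that `has_more_than` consumes items from a generator, so the generator should not be reused afterwards.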

### Format Detection

Function for guessing an RDF serialization format from a file name.

```python { .api }
def guess_format(fpath: str, fmap: Dict[str, str] = None) -> Optional[str]:
    """
    Guess the RDF serialization format from a file path or URL.

    Parameters:
    - fpath: File path or URL; only the extension is inspected, the content is not read
    - fmap: Optional mapping from extension to format name, used instead of the built-in map

    Returns:
    str: Format identifier ('turtle', 'xml', 'json-ld', etc.), or None if the extension is unknown
    """
```

### Graph Analysis

Functions for analyzing graph structure and finding patterns.

```python { .api }
def find_roots(graph: Graph, prop: URIRef, roots: Set[Node] = None) -> Set[Node]:
    """
    Find the root nodes of a transitive hierarchy.
    Assumes triples of the form (child, prop, parent).

    Parameters:
    - graph: Graph to analyze
    - prop: Property defining the hierarchy (e.g., rdfs:subClassOf, skos:broader)
    - roots: Existing root set to extend

    Returns:
    Set: Nodes that occur as a parent (object of prop) but never as a child (subject of prop)
    """
```
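
The idea behind root finding can be illustrated without rdflib. The sketch below works on plain `(child, parent)` pairs; the function name `find_hierarchy_roots` is hypothetical, and the set-difference approach is one simple way to get the result rather than a copy of the library's code.

```python
def find_hierarchy_roots(edges):
    """Return the nodes that appear as a parent but never as a child.

    `edges` is an iterable of (child, parent) pairs.
    """
    children = set()
    parents = set()
    for child, parent in edges:
        children.add(child)
        parents.add(parent)
    # A root is a parent that is nobody's child.
    return parents - children

edges = [("Mammal", "Animal"), ("Dog", "Mammal"), ("Cat", "Mammal")]
print(find_hierarchy_roots(edges))  # {'Animal'}
```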

### Date and Time Utilities

Functions for converting between Unix timestamps and W3C date-time strings.

```python { .api }
def date_time(t: float = None, local_time_zone: bool = False) -> str:
    """
    Format a timestamp as a W3C date-time string, e.g. '1997-07-16T19:20:30Z'.

    Parameters:
    - t: Unix timestamp in seconds (current time if None)
    - local_time_zone: Use the local time zone offset instead of UTC ('Z')

    Returns:
    str: Date-time string
    """

def parse_date_time(val: str) -> int:
    """
    Parse a W3C date-time string.

    Parameters:
    - val: Date-time string such as '1997-07-16T19:20:30Z'

    Returns:
    int: Seconds since the Unix epoch, in UTC
    """
```
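
The round trip between epoch seconds and the `Z`-suffixed string form can be illustrated with the standard library alone. This is a sketch of the format, not rdflib's implementation; the helper names `epoch_to_utc_string` and `utc_string_to_epoch` are hypothetical, and only the UTC (`Z`) form is handled.

```python
from datetime import datetime, timezone

UTC_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def epoch_to_utc_string(seconds):
    """Format epoch seconds as 'YYYY-MM-DDTHH:MM:SSZ'."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(UTC_FORMAT)

def utc_string_to_epoch(text):
    """Parse 'YYYY-MM-DDTHH:MM:SSZ' back into whole epoch seconds."""
    parsed = datetime.strptime(text, UTC_FORMAT).replace(tzinfo=timezone.utc)
    return int(parsed.timestamp())

stamp = epoch_to_utc_string(1609459200)
print(stamp)                       # 2021-01-01T00:00:00Z
print(utc_string_to_epoch(stamp))  # 1609459200
```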

### Graph Comparison

Functions for comparing graphs, provided by the `rdflib.compare` module.

```python { .api }
def isomorphic(graph1: Graph, graph2: Graph) -> bool:
    """
    Test whether two graphs are isomorphic, i.e. equal up to the renaming of blank nodes.

    Parameters:
    - graph1: First graph
    - graph2: Second graph

    Returns:
    bool: True if the graphs are isomorphic
    """

def graph_diff(g1: Graph, g2: Graph) -> Tuple[Graph, Graph, Graph]:
    """
    Compute the difference between two graphs.

    Parameters:
    - g1: First graph
    - g2: Second graph

    Returns:
    Tuple: (in_both, in_first_only, in_second_only) graphs
    """

def to_canonical_graph(graph: Graph) -> Graph:
    """
    Convert a graph to canonical form for comparison.

    Parameters:
    - graph: Input graph

    Returns:
    Graph: Read-only canonical graph with deterministic blank node identifiers
    """
```

### URI and Path Utilities

Helpers for working with URIs and file paths. Fragment removal is available as the `defrag()` method of `URIRef`.

```python { .api }
class URIRef:
    def defrag(self) -> URIRef:
        """
        Remove the fragment from this URI.

        Returns:
        URIRef: URI without fragment (the same URI if it has no fragment)
        """

def file_uri_to_path(uri: str) -> str:
    """
    Convert file:// URI to file path.

    Parameters:
    - uri: File URI string

    Returns:
    str: Local file path
    """

def path_to_file_uri(path: str) -> str:
    """
    Convert file path to file:// URI.

    Parameters:
    - path: Local file path

    Returns:
    str: File URI
    """
```
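
If the file URI helpers are not available in the installed rdflib version, equivalent conversions can be sketched with the standard library. The names `local_path_to_file_uri` and `file_uri_to_local_path` are hypothetical and chosen to avoid confusion with the functions listed above.

```python
from pathlib import Path
from urllib.parse import urlparse
from urllib.request import url2pathname

def local_path_to_file_uri(path):
    """Convert a local path to a file:// URI; relative paths are made absolute."""
    # Path.as_uri() percent-encodes special characters and requires an absolute path.
    return Path(path).absolute().as_uri()

def file_uri_to_local_path(uri):
    """Convert a file:// URI back to a local path string."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"Not a file URI: {uri}")
    # url2pathname decodes percent-escapes and applies platform path rules.
    return url2pathname(parsed.path)

uri = local_path_to_file_uri("data/my file.ttl")
print(uri)
print(file_uri_to_local_path(uri))
```

Building the URI with string concatenation (`"file://" + path`) skips percent-encoding, so paths containing spaces or non-ASCII characters produce invalid URIs.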

### Namespace Utilities

Namespace helpers from the `rdflib.namespace` module.

```python { .api }
def split_uri(uri: str) -> Tuple[str, str]:
    """
    Split a URI into namespace and local name.

    Parameters:
    - uri: URI to split

    Returns:
    Tuple: (namespace, local_name); raises ValueError if the URI cannot be split
    """
```

## Usage Examples

### Term Conversion

```python
from rdflib import Literal
from rdflib.util import to_term, from_n3
from datetime import datetime

# to_term parses a small string syntax; it does not convert arbitrary Python objects
uri_term = to_term("<http://example.org/person/1>")  # -> URIRef
literal_term = to_term('"Hello World"')              # -> Literal("Hello World")
bnode_term = to_term("_b1")                          # -> BNode

# Use the Literal constructor to turn Python values into typed literals
int_term = Literal(42)               # datatype XSD.integer
float_term = Literal(3.14)           # datatype XSD.double
bool_term = Literal(True)            # datatype XSD.boolean
date_term = Literal(datetime.now())  # datatype XSD.dateTime

print(f"URI: {uri_term}")
print(f"String: {literal_term}")
print(f"Integer: {int_term}")
print(f"Float: {float_term}")
print(f"Boolean: {bool_term}")
print(f"Date: {date_term}")

# Parse N3 notation
uri_from_n3 = from_n3("<http://example.org/person/1>")
literal_from_n3 = from_n3('"John Doe"')
typed_literal = from_n3('"42"^^<http://www.w3.org/2001/XMLSchema#integer>')

print(f"URI: {uri_from_n3}")
print(f"Literal: {literal_from_n3}")
print(f"Typed: {typed_literal}")
```

### Collection Utilities

```python
from rdflib.util import first, uniq, more_than

# Get first item
data = [1, 2, 3, 4, 5]
first_item = first(data)
print(f"First: {first_item}")

# Remove duplicates (uniq returns a set, so sort for a stable display order)
duplicated = [1, 2, 2, 3, 3, 3, 4, 4, 5]
unique_items = sorted(uniq(duplicated))
print(f"Unique: {unique_items}")

# Check if more than threshold (more_than returns 1 or 0)
has_many = bool(more_than(data, 3))
print(f"More than 3 items: {has_many}")

# Handle empty sequences
empty_first = first([])
print(f"First of empty: {empty_first}")  # None
```

### Format Detection

```python
from rdflib.util import guess_format

# Detect from file extension
format1 = guess_format("data.ttl")     # 'turtle'
format2 = guess_format("data.rdf")     # 'xml'
format3 = guess_format("data.jsonld")  # 'json-ld'

print(f"TTL format: {format1}")
print(f"RDF format: {format2}")
print(f"JSON-LD format: {format3}")

# guess_format inspects only the file name, never the content.
# Unknown extensions give None, so supply a fallback before parsing.
unknown = guess_format("data.unknown")
parse_format = unknown or "turtle"
print(f"Unknown extension: {unknown}")
print(f"Format used for parsing: {parse_format}")

# Supply a custom extension map for project-specific suffixes
custom = guess_format("data.custom", fmap={"custom": "turtle"})
print(f"Custom mapping: {custom}")
```

### Graph Comparison

```python
from rdflib import Graph, URIRef, Literal
from rdflib.compare import isomorphic, graph_diff
from rdflib.namespace import FOAF

# Create two similar graphs
g1 = Graph()
g1.add((URIRef("http://example.org/person/1"), FOAF.name, Literal("John")))
g1.add((URIRef("http://example.org/person/1"), FOAF.age, Literal(30)))

g2 = Graph()
g2.add((URIRef("http://example.org/person/1"), FOAF.name, Literal("John")))
g2.add((URIRef("http://example.org/person/1"), FOAF.age, Literal(30)))

# Test isomorphism
are_isomorphic = isomorphic(g1, g2)
print(f"Graphs are isomorphic: {are_isomorphic}")

# Add different data (FOAF defines mbox, not email)
g2.add((URIRef("http://example.org/person/1"), FOAF.mbox, URIRef("mailto:john@example.com")))
g1.add((URIRef("http://example.org/person/1"), FOAF.phone, Literal("555-1234")))

# Find differences; graph_diff returns three graphs
in_both, only_in_g1, only_in_g2 = graph_diff(g1, g2)

print(f"Shared triples: {len(in_both)}")

print("Only in g1:")
for triple in only_in_g1:
    print(f"  {triple}")

print("Only in g2:")
for triple in only_in_g2:
    print(f"  {triple}")
```

### Graph Analysis

```python
from rdflib import Graph, URIRef
from rdflib.namespace import RDFS
from rdflib.util import find_roots

g = Graph()

# Create class hierarchy
animal = URIRef("http://example.org/Animal")
mammal = URIRef("http://example.org/Mammal")
dog = URIRef("http://example.org/Dog")
cat = URIRef("http://example.org/Cat")

g.add((mammal, RDFS.subClassOf, animal))
g.add((dog, RDFS.subClassOf, mammal))
g.add((cat, RDFS.subClassOf, mammal))

# Find root classes
roots = find_roots(g, RDFS.subClassOf)
print("Root classes:")
for root in roots:
    print(f"  {root}")
```

### Date and Time Utilities

```python
from rdflib import Literal
from rdflib.namespace import XSD
from rdflib.util import date_time, parse_date_time

# Create date-time strings
current_time = date_time()
specific_time = date_time(1609459200)  # '2021-01-01T00:00:00Z'
local_time = date_time(local_time_zone=True)

print(f"Current: {current_time}")
print(f"Specific: {specific_time}")
print(f"Local: {local_time}")

# Wrap the string in a typed literal for use in a graph
time_literal = Literal(specific_time, datatype=XSD.dateTime)
print(f"Literal: {time_literal!r}")

# Parse date-time strings into seconds since the epoch (UTC)
dt_string = "2021-01-01T12:00:00Z"
epoch_seconds = parse_date_time(dt_string)
print(f"Parsed: {epoch_seconds}")  # 1609502400
```

### URI Utilities

```python
from pathlib import Path
from rdflib import URIRef
from rdflib.namespace import split_uri

# Split URI into namespace and local name
uri = URIRef("http://xmlns.com/foaf/0.1/name")
namespace, local_name = split_uri(uri)
print(f"Namespace: {namespace}")
print(f"Local name: {local_name}")

# Remove fragment from URI
fragmented_uri = URIRef("http://example.org/resource#section1")
defragged = fragmented_uri.defrag()
print(f"Original: {fragmented_uri}")
print(f"Defragged: {defragged}")

# Work with file URIs (as_uri percent-encodes and needs an absolute path)
file_path = "/path/to/data.ttl"
file_uri = Path(file_path).absolute().as_uri()
print(f"File URI: {file_uri}")
```

### Advanced Utilities

```python
from rdflib import Graph, URIRef, Literal
from rdflib.util import first, uniq, more_than
from rdflib.namespace import FOAF, RDF

g = Graph()

# Build graph with various data types
people_data = [
    ("John", 30, "john@example.com"),
    ("Jane", 25, "jane@example.com"),
    ("Bob", 35, "bob@example.com"),
    ("Alice", 30, "alice@example.com"),  # Duplicate age
]

for i, (name, age, email) in enumerate(people_data, 1):
    person = URIRef(f"http://example.org/person/{i}")
    g.add((person, RDF.type, FOAF.Person))
    g.add((person, FOAF.name, Literal(name)))
    g.add((person, FOAF.age, Literal(age)))
    g.add((person, FOAF.mbox, URIRef(f"mailto:{email}")))

# Find unique ages (uniq returns a set, so sort for display)
ages = [obj.toPython() for obj in g.objects(None, FOAF.age)]
unique_ages = sorted(uniq(ages))
print(f"Ages: {ages}")
print(f"Unique ages: {unique_ages}")

# Get first person (graph iteration order is not guaranteed)
first_person = first(g.subjects(RDF.type, FOAF.Person))
print(f"First person: {first_person}")

# Check if there are many people; more_than stops iterating early
many_people = bool(more_than(g.subjects(RDF.type, FOAF.Person), 2))
print(f"More than 2 people: {many_people}")
```

### Error Handling with Utilities

```python
from rdflib.util import from_n3, to_term, guess_format

# from_n3 raises when a prefixed name uses an undeclared prefix
try:
    bad_n3 = from_n3("undeclared:name")
except Exception as e:
    print(f"N3 parsing error: {e!r}")

# to_term raises for strings outside its term syntax
try:
    bad_term = to_term("plain text")
except Exception as e:
    print(f"Term conversion error: {e}")

# guess_format does not raise for unknown extensions; it returns None
unknown_format = guess_format("data.unknown")
if unknown_format is None:
    print("Unknown format; pass an explicit format when parsing")
else:
    print(f"Detected format: {unknown_format}")
```