# Parsers and Serializers

RDFLib provides pluggable parsers and serializers for RDF/XML, Turtle, N-Triples, N-Quads, TriG, JSON-LD, and other RDF formats, exposed through a unified interface built on its plugin system.

## Capabilities

### Plugin Registration and Management

Functions in `rdflib.plugin` for registering and looking up parser and serializer plugins.

```python { .api }
def register(name: str, kind: type, module_path: str, class_name: str):
    """
    Register a plugin.

    Parameters:
    - name: Plugin name (format identifier)
    - kind: Plugin base class (e.g. rdflib.parser.Parser, rdflib.serializer.Serializer, rdflib.store.Store), not a string
    - module_path: Python module path
    - class_name: Class name within module
    """

def get(name: str, kind: type):
    """
    Get a plugin class.

    Parameters:
    - name: Plugin name
    - kind: Plugin base class

    Returns:
    Plugin class (not an instance); raises PluginException if no plugin matches
    """

def plugins(name: str = None, kind: type = None) -> Iterator[Plugin]:
    """
    List available plugins.

    Parameters:
    - name: Plugin name to filter by
    - kind: Plugin base class to filter by

    Returns:
    Iterator of Plugin objects exposing .name and .kind
    """
```
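
Registering by module path and class name means a plugin can be recorded without importing it until it is requested. The sketch below models that idea with a plain dictionary. It is a simplified stand-in, not RDFLib's implementation: `_registry`, `Codec`, and the use of `LookupError` are assumptions made for illustration, and standard-library classes stand in for real plugins.

```python
import importlib
from typing import Dict, Tuple

# Hypothetical stand-in registry; RDFLib's own implementation may differ.
_registry: Dict[Tuple[str, type], Tuple[str, str]] = {}


def register(name: str, kind: type, module_path: str, class_name: str) -> None:
    """Record where the plugin class lives without importing it yet."""
    _registry[(name, kind)] = (module_path, class_name)


def get(name: str, kind: type) -> type:
    """Import the module on first use and return the plugin class."""
    try:
        module_path, class_name = _registry[(name, kind)]
    except KeyError:
        raise LookupError(
            f"No plugin registered for ({name!r}, {kind.__name__})"
        ) from None
    module = importlib.import_module(module_path)
    return getattr(module, class_name)


class Codec:
    """Hypothetical plugin kind, used only as a registry key."""


# A standard-library class stands in for a real plugin here.
register("ordered", Codec, "collections", "OrderedDict")
plugin_class = get("ordered", Codec)  # the import happens here
instance = plugin_class(a=1)
```

Keying the registry on the `(name, kind)` pair lets the same name, such as `"turtle"`, refer to both a parser and a serializer without a clash.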

### Parser - Base Parser Interface

Base interface for RDF format parsers.

```python { .api }
class Parser:
    def __init__(self):
        """Initialize parser."""

    def parse(self, source, graph: Graph, encoding: str = None, **kwargs):
        """
        Parse RDF data into graph.

        Parameters:
        - source: Input source (file-like object, string, or URL)
        - graph: Target graph to parse into
        - encoding: Character encoding
        """
```

### Serializer - Base Serializer Interface

Base interface for RDF format serializers.

```python { .api }
class Serializer:
    def __init__(self, graph: Graph):
        """
        Initialize serializer.

        Parameters:
        - graph: Graph to serialize
        """

    def serialize(self, stream, base: str = None, encoding: str = None, **kwargs):
        """
        Serialize graph to stream.

        Parameters:
        - stream: Output stream (file-like object)
        - base: Base URI for relative references
        - encoding: Character encoding
        """
```
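
To make the shape of this interface concrete, here is a minimal sketch of a class that follows the same constructor and `serialize()` signature. It is a toy, not an RDFLib class: `ToySerializer` is a hypothetical name, it takes plain `(subject, predicate, object)` string tuples instead of a `Graph`, it assumes the stream is binary, and it does no literal escaping.

```python
import io


class ToySerializer:
    """Stand-in mirroring the Serializer shape above; not an RDFLib class."""

    def __init__(self, triples):
        # Plain string tuples stand in for a Graph in this sketch.
        self.triples = triples

    def serialize(self, stream, base=None, encoding=None, **kwargs):
        # Fall back to UTF-8 and write encoded bytes to the stream.
        encoding = encoding or "utf-8"
        for s, p, o in sorted(self.triples):
            line = f'<{s}> <{p}> "{o}" .\n'
            stream.write(line.encode(encoding))


triples = {
    ("http://example.org/person1", "http://example.org/name", "John Doe"),
}
buf = io.BytesIO()
ToySerializer(triples).serialize(buf)
output = buf.getvalue().decode("utf-8")
```

The point of the sketch is the division of labour: the constructor holds the data to write, and `serialize()` owns encoding and output, so the caller decides where the bytes go.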

## Built-in Parsers

RDFLib includes parsers for the major RDF formats:

```python { .api }
# RDF/XML Parser
class RDFXMLParser(Parser):
    """Parse RDF/XML format."""

# Turtle Parser (N3 is handled by the related N3Parser)
class TurtleParser(Parser):
    """Parse Turtle format."""

# N-Triples Parser
class NTParser(Parser):
    """Parse N-Triples format."""

# N-Quads Parser
class NQuadsParser(Parser):
    """Parse N-Quads format."""

# TriX Parser
class TriXParser(Parser):
    """Parse TriX XML format."""

# TriG Parser
class TrigParser(Parser):
    """Parse TriG format."""

# JSON-LD Parser
class JsonLDParser(Parser):
    """Parse JSON-LD format."""

# HexTuples Parser
class HextuplesParser(Parser):
    """Parse HexTuples format."""
```

## Built-in Serializers

RDFLib includes serializers for the major RDF formats:

```python { .api }
# RDF/XML Serializer
class XMLSerializer(Serializer):
    """Serialize to RDF/XML format."""

# Turtle Serializer
class TurtleSerializer(Serializer):
    """Serialize to Turtle format."""

# N-Triples Serializer
class NTSerializer(Serializer):
    """Serialize to N-Triples format."""

# N-Quads Serializer
class NQuadsSerializer(Serializer):
    """Serialize to N-Quads format."""

# TriX Serializer
class TriXSerializer(Serializer):
    """Serialize to TriX XML format."""

# TriG Serializer
class TrigSerializer(Serializer):
    """Serialize to TriG format."""

# JSON-LD Serializer
class JsonLDSerializer(Serializer):
    """Serialize to JSON-LD format."""

# HexTuples Serializer
class HextuplesSerializer(Serializer):
    """Serialize to HexTuples format."""
```

## Format Identification

### Supported Format Names

Common format identifiers accepted by the `format` argument of `parse()` and `serialize()`:

```python { .api }
# Format name mappings (descriptive summary, not importable constants)
PARSER_FORMATS = {
    'xml': 'RDF/XML format',
    'rdf': 'RDF/XML format',
    'turtle': 'Turtle format',
    'ttl': 'Turtle format',
    'n3': 'Notation3 format',
    'nt': 'N-Triples format',
    'ntriples': 'N-Triples format',
    'nquads': 'N-Quads format',
    'nq': 'N-Quads format',
    'trix': 'TriX XML format',
    'trig': 'TriG format',
    'json-ld': 'JSON-LD format',
    'jsonld': 'JSON-LD format',
    'hext': 'HexTuples format'
}

SERIALIZER_FORMATS = {
    'xml': 'RDF/XML format',
    'pretty-xml': 'Pretty RDF/XML format',
    'turtle': 'Turtle format',
    'ttl': 'Turtle format',
    'longturtle': 'Long-form Turtle format',
    'n3': 'Notation3 format',
    'nt': 'N-Triples format',
    'ntriples': 'N-Triples format',
    'nquads': 'N-Quads format',
    'nq': 'N-Quads format',
    'trix': 'TriX XML format',
    'trig': 'TriG format',
    'json-ld': 'JSON-LD format',
    'jsonld': 'JSON-LD format',
    'hext': 'HexTuples format'
}
```
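
Several identifiers above are aliases for the same format. When format names come from user input or configuration, it can help to fold the aliases onto one spelling before comparing or logging them. The sketch below does this for the alias pairs listed in the tables. The helper `canonical_format` and the choice of which spelling counts as canonical are assumptions made for illustration, not part of RDFLib.

```python
# Alias -> canonical identifier, taken from the tables above.
# Which spelling is treated as canonical is an arbitrary choice in this sketch.
_CANONICAL = {
    "rdf": "xml",
    "ttl": "turtle",
    "ntriples": "nt",
    "nq": "nquads",
    "jsonld": "json-ld",
}


def canonical_format(name: str) -> str:
    """Return one spelling per format; unknown names pass through unchanged."""
    key = name.strip().lower()
    return _CANONICAL.get(key, key)


same_format = canonical_format("TTL") == canonical_format("turtle")
```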

## Usage Examples

### Basic Parsing

```python
from rdflib import Graph

g = Graph()

# Parse from file with format detection
g.parse("data.ttl")  # Format detected from extension

# Parse with explicit format
g.parse("data.rdf", format="xml")
g.parse("data.n3", format="n3")
g.parse("data.jsonld", format="json-ld")

# Parse from URL
g.parse("http://example.org/data.rdf")

# Parse from string
rdf_data = """
@prefix ex: <http://example.org/> .
ex:person1 ex:name "John Doe" .
"""
g.parse(data=rdf_data, format="turtle")

# Parse with base URI
g.parse(data=rdf_data, format="turtle", publicID="http://example.org/base")
```

### Basic Serialization

```python
from rdflib import Graph

g = Graph()
# ... populate graph ...

# Serialize to different formats
turtle_data = g.serialize(format="turtle")
xml_data = g.serialize(format="xml")
ntriples_data = g.serialize(format="nt")
jsonld_data = g.serialize(format="json-ld")

# Serialize to file
g.serialize("output.ttl", format="turtle")
g.serialize("output.rdf", format="xml")

# Serialize with base URI
turtle_with_base = g.serialize(format="turtle", base="http://example.org/")

# Pretty print XML
pretty_xml = g.serialize(format="pretty-xml")
```

### Working with Datasets

```python
from rdflib import Dataset

ds = Dataset()
# ... populate dataset ...

# Parse dataset formats
ds.parse("data.trig", format="trig")
ds.parse("data.nq", format="nquads")

# Serialize dataset formats
trig_data = ds.serialize(format="trig")
nquads_data = ds.serialize(format="nquads")
```

### Custom Parser Registration

```python
from rdflib import Graph
from rdflib.parser import Parser
from rdflib.plugin import register

class CustomParser(Parser):
    def parse(self, source, graph, encoding=None, **kwargs):
        """Custom parser implementation."""
        # Read source and parse into graph
        pass

# Register custom parser. `kind` is the Parser base class, not a string,
# and CustomParser must be importable from mymodule.parsers.
register(
    name="custom",
    kind=Parser,
    module_path="mymodule.parsers",
    class_name="CustomParser"
)

# Use custom parser
g = Graph()
g.parse("data.custom", format="custom")
```

### Custom Serializer Registration

```python
from rdflib import Graph
from rdflib.serializer import Serializer
from rdflib.plugin import register

class CustomSerializer(Serializer):
    def serialize(self, stream, base=None, encoding=None, **kwargs):
        """Custom serializer implementation."""
        # Serialize graph to stream
        pass

# Register custom serializer. `kind` is the Serializer base class, not a string,
# and CustomSerializer must be importable from mymodule.serializers.
register(
    name="custom",
    kind=Serializer,
    module_path="mymodule.serializers",
    class_name="CustomSerializer"
)

# Use custom serializer
g = Graph()
# ... populate graph ...
custom_data = g.serialize(format="custom")
```

### Format-Specific Options

```python
from rdflib import Graph

g = Graph()
# ... populate graph ...

# Turtle with custom indentation
turtle_data = g.serialize(
    format="turtle",
    indent=" ",  # Custom indent
    base="http://example.org/"
)

# JSON-LD with context
jsonld_data = g.serialize(
    format="json-ld",
    context={
        "name": "http://xmlns.com/foaf/0.1/name",
        "age": "http://xmlns.com/foaf/0.1/age"
    },
    indent=2
)

# RDF/XML with pretty printing
xml_data = g.serialize(
    format="pretty-xml",
    max_depth=3,
    untyped_literals=True
)
```

### Parsing with Error Handling

```python
from rdflib import Graph
from rdflib.exceptions import ParserError

g = Graph()

try:
    g.parse("invalid_data.ttl", format="turtle")
except ParserError as e:
    print(f"Parse error: {e}")
except SyntaxError as e:
    # The Turtle/N3 parser reports malformed input with a SyntaxError subclass
    print(f"Syntax error: {e}")
except FileNotFoundError:
    print("File not found")
except Exception as e:
    print(f"Unexpected error: {e}")
```

### Batch Processing

```python
import os
from rdflib import Graph

def process_rdf_files(directory):
    """Process all RDF files in directory."""
    g = Graph()

    for filename in os.listdir(directory):
        filepath = os.path.join(directory, filename)

        # Determine format from extension
        if filename.endswith('.ttl'):
            format_type = 'turtle'
        elif filename.endswith('.rdf'):
            format_type = 'xml'
        elif filename.endswith('.n3'):
            format_type = 'n3'
        elif filename.endswith('.nt'):
            format_type = 'nt'
        else:
            continue  # Skip unknown formats

        try:
            g.parse(filepath, format=format_type)
            print(f"Parsed {filename} ({format_type})")
        except Exception as e:
            print(f"Error parsing {filename}: {e}")

    return g

# Process all files and serialize result
combined_graph = process_rdf_files("rdf_data/")
combined_graph.serialize("combined.ttl", format="turtle")
```

### Streaming Large Files

```python
from rdflib import Graph
import gzip

# Parse compressed RDF. Binary mode lets the parser decode the bytes itself.
# Note that the resulting graph is still built in memory.
g = Graph()
with gzip.open('large_data.ttl.gz', 'rb') as f:
    g.parse(f, format="turtle")

# Serialize to compressed file. Serializers write encoded bytes,
# so the destination is opened in binary mode.
with gzip.open('output.ttl.gz', 'wb') as f:
    g.serialize(f, format="turtle", encoding="utf-8")
```

### Format Detection

```python
from rdflib import Graph
from rdflib.util import guess_format

# Automatic format detection from the file extension
filename = "data.ttl"
format_type = guess_format(filename)  # None for unknown extensions
print(f"Detected format: {format_type}")

g = Graph()
g.parse(filename, format=format_type)

# guess_format() only inspects the file name, so in-memory data
# needs an explicit format
data = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
</rdf:RDF>"""

g.parse(data=data, format="xml")
```
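
When only the content is at hand and no file name is available, a small heuristic can choose a format name before calling `parse()`. The sketch below is one way to do that, under stated assumptions: `sniff_format` is a hypothetical helper, not part of RDFLib, and it looks only at the first non-blank characters. It cannot tell RDF/XML from TriX (both start with an XML declaration) and it returns `None` for N-Triples, so treat the result as a hint.

```python
from typing import Optional


def sniff_format(data: str) -> Optional[str]:
    """Guess a format name from the first non-blank characters (heuristic)."""
    head = data.lstrip()
    if head.startswith("<?xml") or head.startswith("<rdf:RDF"):
        return "xml"
    if head.startswith("{") or head.startswith("["):
        return "json-ld"
    # Turtle directives, in both the @-style and SPARQL-style spellings.
    if head.lower().startswith(("@prefix", "@base", "prefix ", "base ")):
        return "turtle"
    return None  # Caller should fall back to an explicit format


xml_doc = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
</rdf:RDF>"""
detected = sniff_format(xml_doc)
```

A caller would then pass the result as `format=` to `Graph.parse(data=...)`, falling back to an explicitly configured format when the helper returns `None`.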

### Plugin Introspection

```python
from rdflib.parser import Parser
from rdflib.plugin import plugins
from rdflib.serializer import Serializer

# List all parsers (plugins() yields Plugin objects, filtered by base class)
print("Available parsers:")
for p in plugins(kind=Parser):
    print(f"  {p.name}")

# List all serializers
print("Available serializers:")
for p in plugins(kind=Serializer):
    print(f"  {p.name}")

# List all plugins
print("All plugins:")
for p in plugins():
    print(f"  {p.name} ({p.kind.__name__})")
```