# DOM4J - Flexible XML Framework for Java

## Overview

DOM4J is an XML processing framework for Java that provides a flexible, efficient API for reading, writing, and manipulating XML documents. It integrates with XPath for querying and works with the DOM, SAX, and JAXP standards as well as Java platform features such as the Collections framework.

The library offers both tree-based and event-driven processing models, so it suits tasks ranging from simple document parsing to complex XML transformations. It also supports namespace handling, XML Schema validation, and XSLT processing.

**Version**: 2.1.4
**License**: BSD-style (listed as "Plexus" in the Maven POM)
**Java Compatibility**: Java 8+
**Total API Classes**: 181

## Package Information

- **Package Name**: dom4j
- **Package Type**: maven
- **Language**: Java
- **Installation**: Add to Maven or Gradle dependencies (see below)

## Core Imports

```java { .api }
import org.dom4j.*;
import org.dom4j.io.SAXReader;
import org.dom4j.io.XMLWriter;
import org.dom4j.io.OutputFormat;
```

## Dependencies

### Maven
```xml { .api }
<dependency>
    <groupId>org.dom4j</groupId>
    <artifactId>dom4j</artifactId>
    <version>2.1.4</version>
</dependency>
```

### Gradle
```gradle { .api }
implementation 'org.dom4j:dom4j:2.1.4'
```

## Basic Usage

### Basic XML Parsing
```java { .api }
import java.io.File;
import java.util.List;

import org.dom4j.*;
import org.dom4j.io.SAXReader;

// Parse XML from file (read() throws the checked DocumentException)
SAXReader reader = new SAXReader();
Document document = reader.read(new File("example.xml"));

// Get root element
Element root = document.getRootElement();

// Navigate and access content
String name = root.getName();
String text = root.getText();
List<Element> children = root.elements();
```

### Creating XML Documents
```java { .api }
import org.dom4j.*;

// Create new document
Document document = DocumentHelper.createDocument();
Element root = document.addElement("root");

// Add elements and attributes
Element person = root.addElement("person")
    .addAttribute("id", "1")
    .addText("John Doe");

// Output XML
String xml = document.asXML();
```
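`asXML()` returns compact output. For indented output, the `XMLWriter` and `OutputFormat` classes listed under Core Imports can be combined roughly as follows. This is a minimal sketch: the class and method names (`PrettyPrintExample`, `toPrettyXml`, `sampleDocument`) are hypothetical and exist only for illustration.

```java
import org.dom4j.Document;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.io.OutputFormat;
import org.dom4j.io.XMLWriter;

import java.io.IOException;
import java.io.StringWriter;

// Hypothetical helper class, not part of dom4j.
final class PrettyPrintExample {

    /** Serializes the document with indentation and returns it as a string. */
    static String toPrettyXml(Document document) throws IOException {
        OutputFormat format = OutputFormat.createPrettyPrint();
        StringWriter out = new StringWriter();
        XMLWriter writer = new XMLWriter(out, format);
        try {
            writer.write(document);
        } finally {
            writer.close(); // flushes and closes the underlying writer
        }
        return out.toString();
    }

    /** Builds the same small document as the example above. */
    static Document sampleDocument() {
        Document document = DocumentHelper.createDocument();
        Element root = document.addElement("root");
        root.addElement("person").addAttribute("id", "1").addText("John Doe");
        return document;
    }
}
```

Writing to a `StringWriter` keeps the sketch self-contained; in practice the same `XMLWriter` constructor accepts any `java.io.Writer`, such as a `FileWriter`.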
81
82
### XPath Queries
83
```java { .api }
84
import org.dom4j.XPath;
85
86
// Create XPath expression
87
XPath xpath = DocumentHelper.createXPath("//person[@id='1']");
88
89
// Execute query
90
Node result = xpath.selectSingleNode(document);
91
List<Node> nodes = xpath.selectNodes(document);
92
```
93
## Package Architecture

DOM4J is organized into several focused packages, each covering one aspect of XML processing:

### Core Packages
- **org.dom4j**: Core interfaces and classes (Node, Document, Element, etc.)
- **org.dom4j.tree**: Default implementations of core interfaces
- **org.dom4j.io**: XML input/output operations (SAXReader, XMLWriter)
- **org.dom4j.xpath**: XPath processing and pattern matching

### Integration Packages
- **org.dom4j.bean**: JavaBean reflection-based XML binding
- **org.dom4j.datatype**: XML Schema datatype support
- **org.dom4j.dom**: W3C DOM interoperability
- **org.dom4j.jaxb**: JAXB integration support

### Utility Packages
- **org.dom4j.dtd**: DTD processing and validation
- **org.dom4j.rule**: XSLT-style pattern matching and rule processing
- **org.dom4j.swing**: Swing UI component integration
- **org.dom4j.util**: General utility classes

116
## Node Type Hierarchy
117
118
DOM4J uses a comprehensive node type system with constants defined in the Node interface:
119
120
```java { .api }
121
// Node type constants
122
public interface Node {
123
short ANY_NODE = 0;
124
short ELEMENT_NODE = 1;
125
short ATTRIBUTE_NODE = 2;
126
short TEXT_NODE = 3;
127
short CDATA_SECTION_NODE = 4;
128
short ENTITY_REFERENCE_NODE = 5;
129
short PROCESSING_INSTRUCTION_NODE = 7;
130
short COMMENT_NODE = 8;
131
short DOCUMENT_NODE = 9;
132
short DOCUMENT_TYPE_NODE = 10;
133
short NAMESPACE_NODE = 13;
134
short UNKNOWN_NODE = 14;
135
short MAX_NODE_TYPE = 14;
136
}
137
```
138
139
The node hierarchy follows this structure:
140
- **Node** (root interface)
141
- **Branch** (nodes that can contain children)
142
- **Document** (root document)
143
- **Element** (XML elements)
144
- **CharacterData** (text-based nodes)
145
- **Text** (text nodes)
146
- **CDATA** (CDATA sections)
147
- **Comment** (XML comments)
148
- **Attribute** (element attributes)
149
- **ProcessingInstruction** (processing instructions)
150
- **Entity** (entity references)
151
- **DocumentType** (DOCTYPE declarations)
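
The constants above can drive a simple dispatch over mixed content. The following is a minimal sketch, assuming the children are walked with `Branch.nodeCount()` and `Branch.node(int)`; the class and method names (`NodeTypeExample`, `describe`, `describeChildren`) are hypothetical.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Element;
import org.dom4j.Node;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of dom4j.
final class NodeTypeExample {

    /** Returns a short label for a node based on its type constant. */
    static String describe(Node node) {
        switch (node.getNodeType()) {
            case Node.ELEMENT_NODE:
                return "element:" + node.getName();
            case Node.TEXT_NODE:
                return "text:" + node.getText();
            case Node.COMMENT_NODE:
                return "comment:" + node.getText();
            default:
                // Fall back to the type name reported by the node itself
                return node.getNodeTypeName();
        }
    }

    /** Parses the XML and labels each direct child of the root element. */
    static List<String> describeChildren(String xml) throws DocumentException {
        Document document = DocumentHelper.parseText(xml);
        Element root = document.getRootElement();
        List<String> labels = new ArrayList<>();
        for (int i = 0; i < root.nodeCount(); i++) {
            labels.add(describe(root.node(i)));
        }
        return labels;
    }
}
```

A `switch` on the type constant avoids a chain of `instanceof` checks; the Visitor pattern described later is the alternative when each type needs substantial handling.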
152
153
## Key Design Patterns
154
155
### Factory Pattern
156
DOM4J provides centralized object creation through factories:
157
- **DocumentFactory**: Creates all DOM4J tree objects with customization support
158
- **DocumentHelper**: Static utility methods for common operations
159
160
### Visitor Pattern
161
The Visitor interface enables type-safe tree traversal and processing:
162
```java { .api }
163
document.accept(new VisitorSupport() {
164
public void visit(Element element) {
165
// Process elements
166
}
167
168
public void visit(Attribute attribute) {
169
// Process attributes
170
}
171
});
172
```
173
174
### Flyweight Pattern
175
QName and Namespace instances are cached and reused to minimize memory usage.
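
To illustrate the general idea (not dom4j's actual implementation), a flyweight can be sketched with an immutable value class and a static cache. Everything in this block, including the `SharedName` class and its key format, is a hypothetical stand-in that uses only the JDK.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical flyweight, loosely modeled on the idea behind Namespace.get(prefix, uri).
final class SharedName {
    private static final Map<String, SharedName> CACHE = new ConcurrentHashMap<>();

    private final String prefix;
    private final String uri;

    private SharedName(String prefix, String uri) {
        this.prefix = prefix;
        this.uri = uri;
    }

    /** Returns the shared instance for this prefix/URI pair, creating it on first use. */
    static SharedName get(String prefix, String uri) {
        // '|' cannot appear in an XML prefix, so the combined key is unambiguous
        String key = prefix + '|' + uri;
        return CACHE.computeIfAbsent(key, k -> new SharedName(prefix, uri));
    }

    String prefix() {
        return prefix;
    }

    String uri() {
        return uri;
    }
}
```

Because instances are immutable and obtained only through `get`, thousands of elements that share a name or namespace can point at one object instead of holding their own copies.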

### XPath Integration
Full XPath 1.0 support through Jaxen integration enables powerful document querying:
```java { .api }
// XPath with namespace support
// (Collections.singletonMap keeps the example compatible with Java 8)
XPath xpath = DocumentHelper.createXPath("//ns:element");
xpath.setNamespaceURIs(Collections.singletonMap("ns", "http://example.com/ns"));
List<Node> results = xpath.selectNodes(document);
```

## Thread Safety Considerations

- **Node instances**: NOT thread-safe for modification operations
- **Factory instances**: Thread-safe for read operations and object creation
- **Cached objects**: QName and Namespace are immutable and thread-safe
- **DocumentFactory singleton**: Thread-safe
- **XPath expressions**: Thread-safe for execution but not for configuration changes

## Exception Handling

DOM4J defines several specific exception types:

```java { .api }
import org.dom4j.DocumentException;      // checked: parsing and I/O failures
import org.dom4j.InvalidXPathException;  // unchecked: malformed XPath expression
import org.dom4j.XPathException;         // unchecked: XPath evaluation failure
import org.dom4j.IllegalAddException;    // unchecked: invalid node added to a branch

try {
    Document doc = reader.read(file);
    XPath xpath = DocumentHelper.createXPath("//element");
    List<Node> nodes = xpath.selectNodes(doc);
} catch (DocumentException e) {
    // Handle parsing errors
} catch (InvalidXPathException e) {
    // Handle invalid XPath syntax
} catch (XPathException e) {
    // Handle XPath evaluation errors
}
```
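
Since `DocumentException` is checked, callers often wrap parsing in a small helper. The sketch below shows one possible shape, assuming malformed input should become an empty `Optional` rather than propagate; `SafeParseExample` and `tryParse` are hypothetical names.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;

import java.util.Optional;

// Hypothetical helper class, not part of dom4j.
final class SafeParseExample {

    /** Parses XML text, returning an empty Optional when the text is not well-formed. */
    static Optional<Document> tryParse(String xml) {
        try {
            return Optional.of(DocumentHelper.parseText(xml));
        } catch (DocumentException e) {
            // The underlying parser error is available through e.getCause() if needed
            return Optional.empty();
        }
    }
}
```

Swallowing the exception is only reasonable when the caller truly does not need the failure details; otherwise rethrowing with context is usually the better choice.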

## Performance Considerations

### Memory Optimization
- Use DocumentFactory.getInstance() for standard operations
- QName and Namespace instances are automatically cached
- Consider using SAX-based processing for very large documents
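
One way to follow the last point without leaving dom4j is to register an `ElementHandler` on `SAXReader` and detach each element once it has been processed, so the full tree is never held in memory. This is a minimal sketch: the `/items/item` path, the `StreamingExample` class, and the `countItems` method are assumptions made for illustration.

```java
import org.dom4j.DocumentException;
import org.dom4j.ElementHandler;
import org.dom4j.ElementPath;
import org.dom4j.io.SAXReader;

import java.io.StringReader;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper class, not part of dom4j.
final class StreamingExample {

    /** Counts <item> children of <items>, pruning each one as soon as it is complete. */
    static int countItems(String xml) throws DocumentException {
        AtomicInteger count = new AtomicInteger();
        SAXReader reader = new SAXReader();
        reader.addHandler("/items/item", new ElementHandler() {
            @Override
            public void onStart(ElementPath path) {
                // Nothing to do when the element opens
            }

            @Override
            public void onEnd(ElementPath path) {
                count.incrementAndGet();
                // Remove the finished element from the tree so it can be garbage collected
                path.getCurrent().detach();
            }
        });
        reader.read(new StringReader(xml));
        return count.get();
    }
}
```

The sketch reads from a string to stay self-contained; for a genuinely large input, `reader.read` would be given a file or stream instead.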

### Processing Efficiency
- Compile an XPath expression once (for example with DocumentHelper.createXPath) and reuse the resulting XPath object
- Use Iterator-based navigation for memory efficiency
- Consider streaming approaches for large document processing
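
The first point can be sketched as follows: one `XPath` object is created and then evaluated against several documents. The class and method names are hypothetical, and the sketch assumes Jaxen is available on the classpath, since XPath evaluation depends on it and it may need to be declared as a separate dependency.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.Node;
import org.dom4j.XPath;

import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of dom4j.
final class XPathReuseExample {

    /** Collects the text of every //person/name node across all given XML strings. */
    static List<String> personNames(String... xmlDocuments) throws DocumentException {
        // Compiled once, outside the loop, then reused for each document
        XPath names = DocumentHelper.createXPath("//person/name");
        List<String> result = new ArrayList<>();
        for (String xml : xmlDocuments) {
            Document document = DocumentHelper.parseText(xml);
            for (Node node : names.selectNodes(document)) {
                result.add(node.getText());
            }
        }
        return result;
    }
}
```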
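
## Integration Capabilities

### SAX Integration
```java { .api }
// Custom SAX content handler that builds a DOM4J tree
SAXContentHandler contentHandler = new SAXContentHandler();
// Use with any SAX parser: xmlReader is an org.xml.sax.XMLReader and
// inputSource an org.xml.sax.InputSource supplied by the caller
xmlReader.setContentHandler(contentHandler);
xmlReader.parse(inputSource);
// The document is available once parsing has finished
Document document = contentHandler.getDocument();
```
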
### DOM Interoperability
```java { .api }
// Convert from W3C DOM to DOM4J
DOMReader domReader = new DOMReader();
Document dom4jDoc = domReader.read(w3cDocument);

// Convert from DOM4J to W3C DOM
DOMWriter domWriter = new DOMWriter();
org.w3c.dom.Document w3cDoc = domWriter.write(dom4jDoc);
```

### STAX Support
```java { .api }
// Read from STAX event stream
STAXEventReader staxReader = new STAXEventReader();
Document document = staxReader.readDocument(xmlEventReader);
```

## Namespace Support

DOM4J provides comprehensive XML namespace support:

```java { .api }
// Define namespaces (the single-argument form has an empty prefix)
Namespace ns1 = Namespace.get("prefix", "http://example.com/ns1");
Namespace ns2 = Namespace.get("http://example.com/ns2");

// Create namespaced elements
Element root = DocumentHelper.createElement(QName.get("root", ns1));
Element child = root.addElement(QName.get("child", ns2));

// Namespace-aware queries
// (Collections.singletonMap keeps the example compatible with Java 8)
XPath xpath = DocumentHelper.createXPath("//prefix:element");
xpath.setNamespaceURIs(Collections.singletonMap("prefix", "http://example.com/ns1"));
```
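
A common stumbling block is that XPath 1.0 treats unprefixed names as belonging to no namespace, so elements in a default namespace still need a prefix binding in the query. The following sketch shows this end to end; the class and method names are hypothetical, and Jaxen is assumed to be on the classpath.

```java
import org.dom4j.Document;
import org.dom4j.DocumentException;
import org.dom4j.DocumentHelper;
import org.dom4j.XPath;

import java.util.Collections;

// Hypothetical helper class, not part of dom4j.
final class NamespaceQueryExample {

    /** Counts the nodes matched by an expression after binding one prefix to a URI. */
    static int countMatches(String xml, String prefix, String uri, String expression)
            throws DocumentException {
        Document document = DocumentHelper.parseText(xml);
        XPath xpath = DocumentHelper.createXPath(expression);
        // The prefix used in the query is independent of any prefix used in the document
        xpath.setNamespaceURIs(Collections.singletonMap(prefix, uri));
        return xpath.selectNodes(document).size();
    }
}
```

For a document such as `<r xmlns="http://example.com/ns1"><item/><item/></r>`, the expression `//p:item` with `p` bound to that URI is the form that matches the two items.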

## Documentation Sections

This documentation is organized into focused sections covering different aspects of DOM4J:

- **[Core API](core-api.md)**: Node hierarchy, Document, Element, and Attribute interfaces
- **[Document Creation](document-creation.md)**: DocumentHelper, DocumentFactory, and XML parsing
- **[I/O Operations](io-operations.md)**: Reading/writing with SAXReader, XMLWriter, STAX integration, HTML output, and JAXP compatibility
- **[XPath](xpath.md)**: XPath interface, DefaultXPath, and advanced querying
- **[Advanced Features](advanced-features.md)**: JAXB integration, bean binding, validation, utility classes, and specialized packages

## Common Use Cases

### Configuration File Processing
```java { .api }
// Read configuration XML
SAXReader reader = new SAXReader();
Document config = reader.read("config.xml");

// Extract configuration values
String dbUrl = config.valueOf("//database/@url");
String poolSize = config.valueOf("//connection-pool/@size");
```

### XML Generation for Web Services
```java { .api }
// Create SOAP envelope; passing the namespace URI binds the soap prefix
// to the element itself rather than only declaring it afterwards
Document soap = DocumentHelper.createDocument();
Element envelope = soap.addElement("soap:Envelope",
    "http://schemas.xmlsoap.org/soap/envelope/");

// The soap prefix is resolved from the enclosing envelope
Element body = envelope.addElement("soap:Body");
// Unprefixed element placed in the service's own namespace
Element operation = body.addElement("myOperation",
    "http://myservice.example.com/");
```

### Data Transformation
```java { .api }
// Transform XML structure: copy each <item> under root to a <product> under newRoot
for (Element element : root.elements("item")) {
    String id = element.attributeValue("id");
    String value = element.getText();

    // addElement returns the new child, so end the chain at <product>
    // to keep the variable pointing at the product element
    Element product = newRoot.addElement("product")
        .addAttribute("productId", id);
    product.addElement("name").addText(value);
}
```

DOM4J's API and flexible architecture make it a practical choice for Java applications that need robust XML processing.