# Path Types and Normalization

A `PathType` defines how Jimfs parses and renders paths, and `PathNormalization` options control Unicode and case normalization of file names.

## Core Imports

```java
import com.google.common.jimfs.Configuration;
import com.google.common.jimfs.PathType;
import com.google.common.jimfs.PathNormalization;
import java.util.Arrays;
import java.util.regex.Pattern;
```

## Path Types

### Unix Path Type

Creates a Unix-style path type with `/` separators.

```java { .api }
public static PathType unix();
```

**Characteristics:**
- Path separator: `/`
- Root: `/`
- Absolute paths start with `/`
- Case-sensitive when used with `Configuration.unix()`, which applies no case folding
- Disallows the nul character (`\0`) in paths

**Usage Example:**
```java
PathType pathType = PathType.unix();
Configuration config = Configuration.builder(pathType)
    .setRoots("/")
    .setWorkingDirectory("/home/user")
    .build();
```

### Windows Path Type

Creates a Windows-style path type supporting both `\` and `/` separators.

```java { .api }
public static PathType windows();
```

**Characteristics:**
- Canonical separator: `\`
- Also recognizes `/` when parsing paths
- Supports drive-letter roots (e.g., `C:\`)
- Supports UNC roots (e.g., `\\host\share\`)
- Case-insensitive for ASCII names when used with `Configuration.windows()`, which applies `CASE_FOLD_ASCII`
- Does not support relative paths with drive letters (e.g., `C:foo`)
- Does not support absolute paths with no root (e.g., `\foo`)

**Usage Example:**
```java
PathType pathType = PathType.windows();
Configuration config = Configuration.builder(pathType)
    .setRoots("C:\\", "D:\\")
    .setWorkingDirectory("C:\\Users\\user")
    .build();
```
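To make the two-separator rule concrete, here is a JDK-only sketch that splits a Windows-style path on either `\` or `/`. It is not Jimfs's parser: `splitNames` is a hypothetical helper, and root handling (drive letters, UNC prefixes) is deliberately ignored.

```java
import java.util.ArrayList;
import java.util.List;

class WindowsSeparatorSketch {
    // Hypothetical helper: splits on '\' or '/' and drops empty segments.
    static List<String> splitNames(String path) {
        List<String> names = new ArrayList<>();
        for (String part : path.split("[\\\\/]+")) {
            if (!part.isEmpty()) {
                names.add(part);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // Mixed separators yield the same name sequence as uniform ones.
        System.out.println(splitNames("C:\\Users/user\\docs"));
        System.out.println(splitNames("C:\\Users\\user\\docs"));
    }
}
```

Both calls produce the same list of names, which is the behavior the "also recognizes `/`" bullet describes at the parsing level.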

## Path Normalization

The `PathNormalization` enum provides options for normalizing path names to handle Unicode and case sensitivity. It implements Guava's `Function<String, String>`, so each constant can be applied directly to a string.

### Normalization Options

```java { .api }
public enum PathNormalization implements Function<String, String> {
    NONE, NFC, NFD, CASE_FOLD_UNICODE, CASE_FOLD_ASCII
}
```

### No Normalization

```java { .api }
NONE
```

Applies no normalization to path names. Paths are used exactly as provided.

### Unicode Composed Normalization

```java { .api }
NFC
```

Applies Unicode Normalization Form C (NFC): canonical decomposition followed by canonical composition, so combining sequences collapse into precomposed characters where those exist.

**Usage Example:**
```java
Configuration config = Configuration.unix()
    .toBuilder()
    .setNameDisplayNormalization(PathNormalization.NFC)
    .build();
```

### Unicode Decomposed Normalization

```java { .api }
NFD
```

Applies Unicode Normalization Form D (NFD): canonical decomposition, which breaks precomposed characters into a base character followed by combining marks.

**Usage Example:**
```java
// Mac OS X typically uses NFD for canonical form
Configuration config = Configuration.unix()
    .toBuilder()
    .setNameCanonicalNormalization(PathNormalization.NFD)
    .build();
```
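The difference between the composed and decomposed forms is easiest to see on a concrete string. The sketch below uses the JDK's `java.text.Normalizer` rather than Jimfs itself, so it illustrates the Unicode forms only, not how Jimfs invokes them.

```java
import java.text.Normalizer;

class UnicodeFormsSketch {
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";     // 'é' as one code point (U+00E9)
        String decomposed = "cafe\u0301";  // 'e' followed by combining acute (U+0301)

        System.out.println(composed.equals(decomposed));       // false
        System.out.println(nfd(composed).equals(decomposed));  // true
        System.out.println(nfc(decomposed).equals(composed));  // true
    }
}
```

The two spellings render identically but compare unequal until both are brought to the same form, which is why a canonical normalization matters for file lookup.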

### Unicode Case Folding

```java { .api }
CASE_FOLD_UNICODE
```

Applies full Unicode case folding for case-insensitive paths. Requires the ICU4J library on the classpath.

**Usage Example:**
```java
Configuration config = Configuration.unix()
    .toBuilder()
    .setNameCanonicalNormalization(PathNormalization.CASE_FOLD_UNICODE)
    .build();
```

**Error Handling:**
Throws `NoClassDefFoundError` when the normalization is applied and ICU4J is not available, with the message:
```
PathNormalization.CASE_FOLD_UNICODE requires ICU4J.
Did you forget to include it on your classpath?
```
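If failing late with `NoClassDefFoundError` is undesirable, an application can probe for ICU4J up front. The sketch below is one possible approach using only the JDK; it assumes that the presence of ICU4J's `com.ibm.icu.lang.UCharacter` class is a reasonable proxy for "ICU4J is available", and `isIcu4jAvailable` is a hypothetical helper, not part of Jimfs.

```java
class Icu4jCheckSketch {
    // Fully qualified name of a core ICU4J class (assumed to be a good probe target).
    static final String ICU_CLASS = "com.ibm.icu.lang.UCharacter";

    // Returns true if the class can be located, without initializing it.
    static boolean isIcu4jAvailable() {
        try {
            Class.forName(ICU_CLASS, false, Icu4jCheckSketch.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("ICU4J available: " + isIcu4jAvailable());
    }
}
```

A caller could use the result to choose between `CASE_FOLD_UNICODE` and `CASE_FOLD_ASCII` when building a configuration.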

### ASCII Case Folding

```java { .api }
CASE_FOLD_ASCII
```

Applies ASCII-only case folding for simple case-insensitive paths. Converts ASCII uppercase letters to lowercase and leaves all other characters unchanged.

**Usage Example:**
```java
Configuration config = Configuration.windows()
    .toBuilder()
    .setNameCanonicalNormalization(PathNormalization.CASE_FOLD_ASCII)
    .build();
```
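The practical consequence of ASCII-only folding is that non-ASCII letters keep their case. The following JDK-only sketch shows the idea; `asciiFold` is a hypothetical stand-in written for illustration and is not Jimfs's implementation.

```java
import java.util.Locale;

class AsciiFoldSketch {
    // Lowercases only 'A'..'Z'; every other character is copied unchanged.
    static String asciiFold(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(c >= 'A' && c <= 'Z' ? (char) (c + ('a' - 'A')) : c);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String name = "R\u00c9SUM\u00c9.TXT"; // "RÉSUMÉ.TXT"
        // ASCII folding leaves 'É' (U+00C9) uppercase.
        System.out.println(asciiFold(name).equals("r\u00c9sum\u00c9.txt"));               // true
        // Locale-independent Unicode lowercasing also converts 'É' to 'é'.
        System.out.println(name.toLowerCase(Locale.ROOT).equals("r\u00e9sum\u00e9.txt")); // true
    }
}
```

Under ASCII folding, `RÉSUMÉ.TXT` and `résumé.txt` would therefore still be distinct names.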

## Normalization Methods

### Apply Normalization

Apply a single normalization to a string.

```java { .api }
public abstract String apply(String string);
```

Each normalization enum value implements this method to transform the input string.

**Usage Example:**
```java
String normalized = PathNormalization.CASE_FOLD_ASCII.apply("Hello World");
// Result: "hello world"
```

### Pattern Flags

Get regex pattern flags that approximate the normalization.

```java { .api }
public int patternFlags();
```

Returns flags suitable for passing to `Pattern.compile()` so that a regex approximates the normalization. The match is an approximation because regex flags cannot reproduce every normalization exactly.

**Usage Example:**
```java
int flags = PathNormalization.CASE_FOLD_ASCII.patternFlags();
// Returns: Pattern.CASE_INSENSITIVE
```
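To see what such flags do in practice, the JDK-only sketch below compares `Pattern.CASE_INSENSITIVE` alone with `Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE`. Which flags each Jimfs normalization other than `CASE_FOLD_ASCII` returns is not stated here, so the second combination is only an assumption about what a Unicode-aware folding would need.

```java
import java.util.regex.Pattern;

class PatternFlagsSketch {
    static boolean matches(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        int asciiOnly = Pattern.CASE_INSENSITIVE;
        int unicodeAware = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;

        // ASCII letters match case-insensitively with either flag set.
        System.out.println(matches("readme", asciiOnly, "README"));                 // true
        // 'é' vs 'É' only matches once UNICODE_CASE is added.
        System.out.println(matches("\u00e9t\u00e9", asciiOnly, "\u00c9T\u00c9"));    // false
        System.out.println(matches("\u00e9t\u00e9", unicodeAware, "\u00c9T\u00c9")); // true
    }
}
```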

### Static Utility Methods

Apply multiple normalizations in sequence.

```java { .api }
public static String normalize(String string, Iterable<PathNormalization> normalizations);
```

**Parameters:**
- `string` - Input string to normalize
- `normalizations` - Sequence of normalizations to apply

**Usage Example:**
```java
String result = PathNormalization.normalize("Héllo Wörld",
    Arrays.asList(PathNormalization.NFD, PathNormalization.CASE_FOLD_ASCII));
```
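Applying normalizations in sequence amounts to feeding each step's output into the next. The sketch below reproduces that idea with JDK classes only; `applyAll`, `nfd`, and `asciiLower` are hypothetical helpers standing in for the Jimfs constants, so the result shows the expected shape of the output rather than a verified Jimfs result.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

class NormalizeChainSketch {
    // Applies each step to the result of the previous one, in list order.
    static String applyAll(String input, List<UnaryOperator<String>> steps) {
        String result = input;
        for (UnaryOperator<String> step : steps) {
            result = step.apply(result);
        }
        return result;
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    // Lowercases only 'A'..'Z'; every other character is left untouched.
    static String asciiLower(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            out.append(c >= 'A' && c <= 'Z' ? (char) (c + 32) : c);
        }
        return out.toString();
    }

    static String nfdThenAsciiLower(String s) {
        List<UnaryOperator<String>> steps =
            Arrays.asList(NormalizeChainSketch::nfd, NormalizeChainSketch::asciiLower);
        return applyAll(s, steps);
    }

    public static void main(String[] args) {
        String result = nfdThenAsciiLower("H\u00e9llo W\u00f6rld");
        // Decomposed and lowercased: 'e' + U+0301 and 'o' + U+0308.
        System.out.println(result.equals("he\u0301llo wo\u0308rld")); // true
    }
}
```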

### Compile Pattern with Normalizations

Create a regex pattern using flags from multiple normalizations.

```java { .api }
public static Pattern compilePattern(String regex, Iterable<PathNormalization> normalizations);
```

**Parameters:**
- `regex` - Regular expression string
- `normalizations` - Normalizations to derive pattern flags from

**Usage Example:**
```java
Pattern pattern = PathNormalization.compilePattern(".*\\.txt",
    Arrays.asList(PathNormalization.CASE_FOLD_ASCII));
// Creates case-insensitive pattern for .txt files
```
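A plausible way to picture this method is as a bitwise OR of each normalization's flags followed by a single `Pattern.compile` call. The sketch below assumes exactly that and uses raw `int` flag values in place of `PathNormalization` constants; `compileWithFlags` is a hypothetical helper, not the Jimfs method.

```java
import java.util.regex.Pattern;

class CombinedFlagsSketch {
    // ORs all supplied flag sets together and compiles the regex once.
    static Pattern compileWithFlags(String regex, int... flagSets) {
        int combined = 0;
        for (int flags : flagSets) {
            combined |= flags;
        }
        return Pattern.compile(regex, combined);
    }

    public static void main(String[] args) {
        Pattern strict = compileWithFlags(".*\\.txt");
        Pattern relaxed = compileWithFlags(".*\\.txt", Pattern.CASE_INSENSITIVE);

        System.out.println(strict.matcher("NOTES.TXT").matches());  // false
        System.out.println(relaxed.matcher("NOTES.TXT").matches()); // true
    }
}
```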

## Display vs Canonical Forms

Jimfs distinguishes between two forms of path names:

- **Display Form**: Used in `Path.toString()` and path rendering
- **Canonical Form**: Used for file lookup and equality comparison

### Configuration

```java
Configuration config = Configuration.unix()
    .toBuilder()
    .setNameDisplayNormalization(PathNormalization.NFC) // For display
    .setNameCanonicalNormalization(PathNormalization.NFD, PathNormalization.CASE_FOLD_ASCII) // For lookup
    .setPathEqualityUsesCanonicalForm(true) // Use canonical form for Path.equals()
    .build();
```
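The split between the two forms can be modeled as a map keyed by the canonical name whose values remember the display name. The sketch below is a simplified, JDK-only model of that idea; the class, its methods, and the choice of lowercasing as the canonical function are all illustrative assumptions rather than Jimfs internals.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

class DisplayCanonicalSketch {
    // Canonical key -> display name as originally supplied.
    private final Map<String, String> entries = new HashMap<>();

    // Stand-in canonical form: locale-independent lowercasing.
    static String canonical(String name) {
        return name.toLowerCase(Locale.ROOT);
    }

    // Records the name unless an entry with the same canonical form exists.
    void create(String name) {
        entries.putIfAbsent(canonical(name), name);
    }

    // Looks up by canonical form; returns the stored display name or null.
    String lookup(String name) {
        return entries.get(canonical(name));
    }

    public static void main(String[] args) {
        DisplayCanonicalSketch dir = new DisplayCanonicalSketch();
        dir.create("ReadMe.TXT");
        // Lookup succeeds with different casing, but the display form is preserved.
        System.out.println(dir.lookup("README.txt")); // ReadMe.TXT
    }
}
```

Lookup ignores case while the name shown to the user keeps its original spelling, which is the separation the two configuration setters control.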

## Normalization Rules

When configuring normalizations:

- **Cannot combine conflicting normalizations**: e.g., both `NFC` and `NFD`
- **Cannot combine conflicting case folding**: e.g., both `CASE_FOLD_UNICODE` and `CASE_FOLD_ASCII`
- **NONE normalization excludes all others**: If `NONE` is specified, no other normalizations are applied
- **Order matters**: Normalizations are applied one after another, each taking the previous result as input; `PathNormalization.normalize` follows the iteration order of the collection it receives
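The rules above can be expressed as a small validation routine. The sketch below is one reading of them using a stand-in enum; `Norm` and `check` are hypothetical, and the exception type and messages are assumptions rather than documented Jimfs behavior.

```java
import java.util.Arrays;
import java.util.EnumSet;
import java.util.List;

class NormalizationRulesSketch {
    // Stand-in for PathNormalization, so the sketch needs no Jimfs dependency.
    enum Norm { NONE, NFC, NFD, CASE_FOLD_UNICODE, CASE_FOLD_ASCII }

    // Returns the effective set of normalizations, or throws on a conflict.
    static EnumSet<Norm> check(List<Norm> requested) {
        EnumSet<Norm> set = EnumSet.noneOf(Norm.class);
        set.addAll(requested);
        if (set.contains(Norm.NFC) && set.contains(Norm.NFD)) {
            throw new IllegalArgumentException("NFC and NFD cannot be combined");
        }
        if (set.contains(Norm.CASE_FOLD_UNICODE) && set.contains(Norm.CASE_FOLD_ASCII)) {
            throw new IllegalArgumentException("only one case folding may be used");
        }
        if (set.contains(Norm.NONE)) {
            return EnumSet.noneOf(Norm.class); // NONE means nothing is applied
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(check(Arrays.asList(Norm.NFD, Norm.CASE_FOLD_ASCII)));
        System.out.println(check(Arrays.asList(Norm.NONE)));
    }
}
```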

## Platform Behavior Matching

### Mac OS X
```java
Configuration.osX() // Uses NFC for display, NFD + CASE_FOLD_ASCII for canonical
```

### Windows
```java
Configuration.windows() // Uses CASE_FOLD_ASCII for canonical form
```

### Unix/Linux
```java
Configuration.unix() // No normalization by default (case-sensitive)
```