
tessl/maven-com-google-jimfs--jimfs

In-memory file system implementation for Java that provides complete java.nio.file API compatibility


Path Types and Normalization

Jimfs supports different path types that define how paths are parsed, rendered, and handled, along with Unicode and case normalization options.

Core Imports

import com.google.common.jimfs.Configuration;
import com.google.common.jimfs.PathType;
import com.google.common.jimfs.PathNormalization;
import java.util.Arrays;
import java.util.regex.Pattern;

Path Types

Unix Path Type

Returns a Unix-style path type in which / is both the root and the only separator.

public static PathType unix();

Characteristics:

  • Path separator: /
  • Root: /
  • Absolute paths start with /
  • Case sensitivity is set by the Configuration rather than the path type; Configuration.unix() is case-sensitive
  • Disallows nul character (\0) in paths

Usage Example:

PathType pathType = PathType.unix();
Configuration config = Configuration.builder(pathType)
    .setRoots("/")
    .setWorkingDirectory("/home/user")
    .build();

Windows Path Type

Creates a Windows-style path type supporting both \ and / separators.

public static PathType windows();

Characteristics:

  • Canonical separator: \
  • Also recognizes / when parsing paths
  • Supports drive-letter roots (e.g., C:\)
  • Supports UNC roots (e.g., \\host\share\)
  • Case-insensitive for ASCII letters when created through Configuration.windows(); the path type itself does not fold case
  • Does not support relative paths with drive letters (e.g., C:foo)
  • Does not support absolute paths with no root (e.g., \foo)
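
The root rules listed above can be illustrated with a small classifier. This is a rough sketch built on plain regular expressions, not Jimfs's actual parser; the class `WindowsRootSketch`, its `classify` method, and the returned labels are hypothetical names chosen for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; Jimfs's real Windows parsing is more thorough.
final class WindowsRootSketch {
    // UNC root such as \\host\share
    private static final Pattern UNC_ROOT = Pattern.compile("^\\\\\\\\[^\\\\]+\\\\[^\\\\]+");
    // Drive letter followed by a separator, such as C:\
    private static final Pattern DRIVE_ROOT = Pattern.compile("^[a-zA-Z]:\\\\");
    // Drive letter with no separator after the colon, such as C:foo
    private static final Pattern DRIVE_ONLY = Pattern.compile("^[a-zA-Z]:");

    static String classify(String path) {
        // Treat / as a separator too, as the characteristics above describe.
        String p = path.replace('/', '\\');
        if (UNC_ROOT.matcher(p).find()) return "unc-root";
        if (DRIVE_ROOT.matcher(p).find()) return "drive-root";
        if (DRIVE_ONLY.matcher(p).find()) return "unsupported: drive-relative";
        if (p.startsWith("\\")) return "unsupported: rootless absolute";
        return "relative";
    }
}
```

Under these assumptions, `classify("C:/Users")` and `classify("C:\\Users")` fall into the same category, while `C:foo` and `\foo` are flagged as the two unsupported forms.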

Usage Example:

PathType pathType = PathType.windows();
Configuration config = Configuration.builder(pathType)
    .setRoots("C:\\", "D:\\")
    .setWorkingDirectory("C:\\Users\\user")
    .build();

Path Normalization

The PathNormalization enum provides options for normalizing path names to handle Unicode and case sensitivity.

Normalization Options

public enum PathNormalization implements Function<String, String> {
    NONE, NFC, NFD, CASE_FOLD_UNICODE, CASE_FOLD_ASCII
}

No Normalization

NONE

Applies no normalization to path names. Paths are used exactly as provided.

Unicode Composed Normalization

NFC

Applies Unicode Normalization Form C (NFC): canonical decomposition followed by canonical composition, so a base letter plus combining mark becomes a single precomposed character where one exists.

Usage Example:

Configuration config = Configuration.unix()
    .toBuilder()
    .setNameDisplayNormalization(PathNormalization.NFC)
    .build();

Unicode Decomposed Normalization

NFD

Applies Unicode Normalization Form D (NFD): canonical decomposition, which splits precomposed characters into a base character followed by combining marks.

Usage Example:

// Mac OS X typically uses NFD for canonical form
Configuration config = Configuration.unix()
    .toBuilder()
    .setNameCanonicalNormalization(PathNormalization.NFD)
    .build();
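
To make the difference between the two Unicode forms concrete, the sketch below uses the JDK's `java.text.Normalizer` directly rather than Jimfs. The class name `UnicodeFormsSketch` and its helper methods are hypothetical; the example only shows what composed and decomposed forms look like for one name.

```java
import java.text.Normalizer;

// Hypothetical helper class; uses only the JDK, not Jimfs.
final class UnicodeFormsSketch {
    static String nfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    static String nfd(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        String composed = "caf\u00e9";    // accented e as one code point
        String decomposed = "cafe\u0301"; // plain e followed by a combining acute accent

        System.out.println(composed.equals(decomposed));      // false: different code points
        System.out.println(nfc(decomposed).equals(composed)); // true
        System.out.println(nfd(composed).equals(decomposed)); // true
        System.out.println(nfd(composed).length());           // 5
    }
}
```

Without normalization the two spellings are different strings, which is why a file system that wants them to name the same file needs a canonical form.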

Unicode Case Folding

CASE_FOLD_UNICODE

Applies full Unicode case folding for case-insensitive paths. Requires ICU4J library on the classpath.

Usage Example:

Configuration config = Configuration.unix()
    .toBuilder()
    .setNameCanonicalNormalization(PathNormalization.CASE_FOLD_UNICODE)
    .build();

Error Handling: If ICU4J is not on the classpath, applying this normalization throws a NoClassDefFoundError with the message:

PathNormalization.CASE_FOLD_UNICODE requires ICU4J. Did you forget to include it on your classpath?

ASCII Case Folding

CASE_FOLD_ASCII

Applies ASCII-only case folding for simple case-insensitive paths. Converts uppercase ASCII letters (A-Z) to lowercase and leaves all other characters unchanged.

Usage Example:

Configuration config = Configuration.windows()
    .toBuilder()
    .setNameCanonicalNormalization(PathNormalization.CASE_FOLD_ASCII)
    .build();
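
The practical effect of ASCII-only folding is easiest to see on a name that mixes ASCII and non-ASCII letters. The following is a minimal sketch of such a fold written against the JDK only; `AsciiFoldSketch` and `foldAscii` are hypothetical names, and the code is a stand-in for the behavior described above rather than Jimfs's implementation.

```java
// Hypothetical stand-in for ASCII-only case folding; not Jimfs code.
final class AsciiFoldSketch {
    static String foldAscii(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            // Only A-Z are lowered; every other character passes through untouched.
            out.append(c >= 'A' && c <= 'Z' ? (char) (c + ('a' - 'A')) : c);
        }
        return out.toString();
    }
}
```

With this fold, `README.TXT` and `readme.txt` map to the same string, but an uppercase accented letter keeps its case, so names that differ only in the case of a non-ASCII letter stay distinct.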

Normalization Methods

Apply Normalization

Apply a single normalization to a string.

public abstract String apply(String string);

Each normalization enum value implements this method to transform the input string.

Usage Example:

String normalized = PathNormalization.CASE_FOLD_ASCII.apply("Hello World");
// Result: "hello world"

Pattern Flags

Get regex pattern flags that approximate the normalization.

public int patternFlags();

Returns flags suitable for use with Pattern.compile() to create regex patterns that match the normalization behavior.

Usage Example:

int flags = PathNormalization.CASE_FOLD_ASCII.patternFlags();
// Returns: Pattern.CASE_INSENSITIVE
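
As a point of reference for what these flags mean in `java.util.regex`, the sketch below compares `Pattern.CASE_INSENSITIVE` alone with `Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE`. It uses only the JDK; the class and method names are hypothetical, and pairing `UNICODE_CASE` with Unicode case folding is an assumption made for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only java.util.regex is used, not Jimfs.
final class PatternFlagsSketch {
    // Case-insensitive for US-ASCII letters only.
    static boolean asciiInsensitive(String regex, String input) {
        return Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(input).matches();
    }

    // Case-insensitive for Unicode letters as well (assumed analogue of Unicode folding).
    static boolean unicodeInsensitive(String regex, String input) {
        int flags = Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE;
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(asciiInsensitive("readme", "README"));                 // true
        System.out.println(asciiInsensitive("\u00e9t\u00e9", "\u00c9T\u00c9"));   // false
        System.out.println(unicodeInsensitive("\u00e9t\u00e9", "\u00c9T\u00c9")); // true
    }
}
```

`CASE_INSENSITIVE` on its own ignores case only for ASCII letters, which lines up with the ASCII-only folding that `CASE_FOLD_ASCII` performs.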

Static Utility Methods

Apply multiple normalizations in sequence.

public static String normalize(String string, Iterable<PathNormalization> normalizations);

Parameters:

  • string - Input string to normalize
  • normalizations - Sequence of normalizations to apply

Usage Example:

String result = PathNormalization.normalize("Héllo Wörld",
    Arrays.asList(PathNormalization.NFD, PathNormalization.CASE_FOLD_ASCII));
// NFD splits é and ö into a base letter plus a combining mark; ASCII letters are then lowercased
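
Applying normalizations in sequence amounts to feeding each step's output into the next. The sketch below models that with plain `java.util.function.Function` values and the JDK's `Normalizer`; it does not call Jimfs, and `NormalizeChainSketch`, `applyAll`, and the two step constants are hypothetical names standing in for NFD and ASCII case folding.

```java
import java.text.Normalizer;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical model of sequential normalization; not Jimfs code.
final class NormalizeChainSketch {
    static final Function<String, String> NFD_STEP =
        s -> Normalizer.normalize(s, Normalizer.Form.NFD);
    static final Function<String, String> ASCII_LOWER_STEP =
        NormalizeChainSketch::asciiLower;

    // Lowercases A-Z only, leaving every other character as it is.
    static String asciiLower(String s) {
        char[] cs = s.toCharArray();
        for (int i = 0; i < cs.length; i++) {
            if (cs[i] >= 'A' && cs[i] <= 'Z') {
                cs[i] += 'a' - 'A';
            }
        }
        return new String(cs);
    }

    // Each step receives the result of the previous one.
    static String applyAll(String input, List<Function<String, String>> steps) {
        String result = input;
        for (Function<String, String> step : steps) {
            result = step.apply(result);
        }
        return result;
    }

    static String nfdThenAsciiFold(String input) {
        return applyAll(input, Arrays.asList(NFD_STEP, ASCII_LOWER_STEP));
    }
}
```

For the input `"H\u00e9llo W\u00f6rld"`, `nfdThenAsciiFold` returns a 13-character string: the eleven original letters and space in lowercase, plus two combining marks.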

Compile Pattern with Normalizations

Create a regex pattern using flags from multiple normalizations.

public static Pattern compilePattern(String regex, Iterable<PathNormalization> normalizations);

Parameters:

  • regex - Regular expression string
  • normalizations - Normalizations to derive pattern flags from

Usage Example:

Pattern pattern = PathNormalization.compilePattern(".*\\.txt", 
    Arrays.asList(PathNormalization.CASE_FOLD_ASCII));
// Creates case-insensitive pattern for .txt files
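
Deriving one pattern from several sets of flags can be pictured as OR-ing the flag values together before compiling. The following JDK-only sketch assumes that is the combination rule; `CombinedFlagsSketch`, `compileWithFlags`, and `matchesTxt` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical illustration of combining regex flags; not Jimfs code.
final class CombinedFlagsSketch {
    // ORs all flag values together and compiles the regex once.
    static Pattern compileWithFlags(String regex, int... flagSets) {
        int flags = 0;
        for (int f : flagSets) {
            flags |= f;
        }
        return Pattern.compile(regex, flags);
    }

    // Case-insensitive check for a .txt extension.
    static boolean matchesTxt(String fileName) {
        Pattern p = compileWithFlags(".*\\.txt", Pattern.CASE_INSENSITIVE);
        return p.matcher(fileName).matches();
    }
}
```

Here `matchesTxt("NOTES.TXT")` and `matchesTxt("notes.txt")` both succeed, while `matchesTxt("notes.md")` does not.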

Display vs Canonical Forms

Jimfs distinguishes between two forms of path names:

  • Display Form: Used in Path.toString() and path rendering
  • Canonical Form: Used for file lookup, and for Path.equals() when pathEqualityUsesCanonicalForm is enabled

Configuration

Configuration config = Configuration.unix()
    .toBuilder()
    .setNameDisplayNormalization(PathNormalization.NFC)     // For display
    .setNameCanonicalNormalization(PathNormalization.NFD, PathNormalization.CASE_FOLD_ASCII)  // For lookup
    .setPathEqualityUsesCanonicalForm(true)  // Use canonical form for Path.equals()
    .build();
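
One way to picture the two forms is a name object that carries both strings, prints the display form, and compares by the canonical form. The sketch below is a hypothetical model built on the JDK only; `NameFormsSketch` is not a Jimfs class, and `toLowerCase(Locale.ROOT)` is used as a simplified stand-in for case folding (it also lowercases non-ASCII letters, which ASCII-only folding would not).

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical model of a name with display and canonical forms; not Jimfs code.
final class NameFormsSketch {
    final String display;   // shown by toString()
    final String canonical; // used for equality and lookup

    NameFormsSketch(String raw) {
        this.display = Normalizer.normalize(raw, Normalizer.Form.NFC);
        // Simplified stand-in for NFD followed by case folding.
        this.canonical =
            Normalizer.normalize(raw, Normalizer.Form.NFD).toLowerCase(Locale.ROOT);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof NameFormsSketch
            && ((NameFormsSketch) o).canonical.equals(canonical);
    }

    @Override
    public int hashCode() {
        return canonical.hashCode();
    }

    @Override
    public String toString() {
        return display;
    }

    // Stores one entry and looks it up under a differently spelled name.
    static String lookup(String stored, String requested) {
        Map<NameFormsSketch, String> directory = new HashMap<>();
        directory.put(new NameFormsSketch(stored), "contents of " + stored);
        return directory.get(new NameFormsSketch(requested));
    }
}
```

In this model a file stored under a composed, mixed-case name is found again when requested with a decomposed, uppercase spelling, while `toString()` still renders the composed display form.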

Normalization Rules

When configuring normalizations:

  • Cannot combine conflicting normalizations: e.g., both NFC and NFD
  • Cannot combine conflicting case folding: e.g., both CASE_FOLD_UNICODE and CASE_FOLD_ASCII
  • NONE normalization excludes all others: If NONE is specified, no other normalizations are applied
  • Order matters: Multiple normalizations are applied in the specified order

Platform Behavior Matching

Mac OS X

Configuration.osX()  // Uses NFC for display, NFD + CASE_FOLD_ASCII for canonical

Windows

Configuration.windows()  // Uses CASE_FOLD_ASCII for canonical form

Unix/Linux

Configuration.unix()  // No normalization by default (case-sensitive)

