tessl/maven-it-unimi-dsi--fastutil

High-performance type-specific collections extending the Java Collections Framework with memory-efficient primitive containers and big data structures.

High-Performance I/O

Fast, repositionable stream implementations that can report their length and current position, for efficient binary and text file operations. These classes are unsynchronized alternatives to the standard java.io stream classes: they trade thread safety for speed and add repositioning and length queries.

Overview

The fastutil I/O package (it.unimi.dsi.fastutil.io) provides:

  • Unsynchronized Operations: Maximum performance by removing synchronization overhead
  • Repositioning Support: Efficient seeking within streams
  • Measurement Capabilities: Access to stream length and current position
  • Enhanced Functionality: Features missing from standard Java streams
  • Memory Mapping: Utilities for memory-mapped file operations

All stream classes are designed for single-threaded access to maximize performance.

Capabilities

Stream Interfaces

Foundation interfaces that extend standard Java I/O with measurement and repositioning capabilities.

/**
 * Interface for streams that provide access to length and current position
 */
public interface MeasurableStream {
    /**
     * Get the overall length of the stream (optional operation)
     * @return the stream length, or -1 if unknown
     * @throws IOException if an I/O error occurs
     */
    long length() throws IOException;
    
    /**
     * Get the current position in the stream (optional operation)
     * @return the current position, or -1 if unknown
     * @throws IOException if an I/O error occurs
     */
    long position() throws IOException;
}

/**
 * Interface for streams supporting repositioning operations
 */
public interface RepositionableStream {
    /**
     * Set the stream position
     * @param newPosition the new position in the stream
     * @throws IOException if an I/O error occurs or position is invalid
     */
    void position(long newPosition) throws IOException;
    
    /**
     * Get the current stream position
     * @return the current position in the stream
     * @throws IOException if an I/O error occurs
     */
    long position() throws IOException;
}
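
How the two interfaces combine can be sketched without the library on the classpath. The example below is a minimal sketch under stated assumptions: Measurable and Repositionable are hypothetical local stand-ins that mirror the two interfaces above, and ArraySource and remaining are illustrative names, not fastutil API.

```java
import java.io.IOException;

// Hypothetical stand-ins mirroring MeasurableStream and RepositionableStream,
// declared locally so the sketch compiles without fastutil.
interface Measurable {
    long length() throws IOException;
    long position() throws IOException;
}

interface Repositionable {
    void position(long newPosition) throws IOException;
    long position() throws IOException;
}

// Illustrative in-memory source that is both measurable and repositionable.
class ArraySource implements Measurable, Repositionable {
    private final byte[] data;
    private int pos;

    ArraySource(byte[] data) { this.data = data; }

    @Override public long length() { return data.length; }
    @Override public long position() { return pos; }

    @Override public void position(long newPosition) throws IOException {
        if (newPosition < 0 || newPosition > data.length) {
            throw new IOException("Position out of range: " + newPosition);
        }
        pos = (int) newPosition;
    }

    // Returns the next byte as 0-255, or -1 at end of data.
    int read() { return pos < data.length ? (data[pos++] & 0xFF) : -1; }
}

class StreamInterfacesSketch {
    // Bytes left to read, or -1 when either value is unknown.
    static long remaining(Measurable s) throws IOException {
        long length = s.length();
        long position = s.position();
        return (length < 0 || position < 0) ? -1 : length - position;
    }

    public static void main(String[] args) throws IOException {
        ArraySource source = new ArraySource(new byte[] {10, 20, 30, 40});
        source.position(3);                    // jump to the last byte
        System.out.println(remaining(source)); // bytes left after the jump
        System.out.println(source.read());     // value of the last byte
    }
}
```

Code that only needs to report progress can depend on the measuring interface alone, while code that seeks depends on the repositioning one.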

Abstract Stream Base Classes

Base classes providing common functionality for measurable streams.

/**
 * Abstract base class for measurable input streams
 */
public abstract class MeasurableInputStream extends InputStream implements MeasurableStream {
    /**
     * Default implementation returns -1 (unknown length)
     * @return the stream length, or -1 if unknown
     */
    public long length() throws IOException {
        return -1;
    }
    
    /**
     * Default implementation returns -1 (unknown position)
     * @return the current position, or -1 if unknown
     */
    public long position() throws IOException {
        return -1;
    }
}

/**
 * Abstract base class for measurable output streams
 */
public abstract class MeasurableOutputStream extends OutputStream implements MeasurableStream {
    /**
     * Default implementation returns -1 (unknown length)
     * @return the stream length, or -1 if unknown
     */
    public long length() throws IOException {
        return -1;
    }
    
    /**
     * Default implementation returns -1 (unknown position)
     * @return the current position, or -1 if unknown
     */
    public long position() throws IOException {
        return -1;
    }
}
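
A subclass would normally override these defaults when it can track its own state. The sketch below shows the idea with a plain java.io.FilterInputStream so that it runs without fastutil. CountingInputStream is a hypothetical name; in real use the class would extend MeasurableInputStream instead.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical wrapper that counts consumed bytes so position() is known.
class CountingInputStream extends FilterInputStream {
    private long position;

    CountingInputStream(InputStream in) { super(in); }

    @Override public int read() throws IOException {
        int b = in.read();
        if (b != -1) position++;
        return b;
    }

    @Override public int read(byte[] b, int off, int len) throws IOException {
        int n = in.read(b, off, len);
        if (n > 0) position += n;
        return n;
    }

    @Override public long skip(long n) throws IOException {
        long skipped = in.skip(n);
        position += skipped;
        return skipped;
    }

    // Mark/reset would make the count stale, so this sketch disables it.
    @Override public boolean markSupported() { return false; }
    @Override public void mark(int readlimit) { }
    @Override public void reset() throws IOException {
        throw new IOException("mark/reset not supported");
    }

    // Number of bytes consumed so far.
    public long position() { return position; }

    // Length stays unknown for an arbitrary wrapped stream.
    public long length() { return -1; }

    public static void main(String[] args) throws IOException {
        try (CountingInputStream in =
                 new CountingInputStream(new ByteArrayInputStream(new byte[100]))) {
            in.read();
            in.read(new byte[10], 0, 10);
            in.skip(5);
            System.out.println(in.position());
        }
    }
}
```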

Fast Buffered Streams

High-performance, unsynchronized buffered streams with repositioning support.

/**
 * High-performance, unsynchronized buffered input stream with repositioning
 */
public class FastBufferedInputStream extends MeasurableInputStream implements RepositionableStream {
    /** Default buffer size */
    public static final int DEFAULT_BUFFER_SIZE = 8192;
    
    /**
     * Create buffered input stream with default buffer size
     * @param is the underlying input stream
     */
    public FastBufferedInputStream(InputStream is);
    
    /**
     * Create buffered input stream with specified buffer size
     * @param is the underlying input stream
     * @param bufSize the buffer size
     */
    public FastBufferedInputStream(InputStream is, int bufSize);
    
    /**
     * Create buffered input stream from file
     * @param file the file to read
     * @throws FileNotFoundException if file doesn't exist
     */
    public FastBufferedInputStream(File file) throws FileNotFoundException;
    
    /**
     * Create buffered input stream from file with specified buffer size
     * @param file the file to read
     * @param bufSize the buffer size
     * @throws FileNotFoundException if file doesn't exist
     */
    public FastBufferedInputStream(File file, int bufSize) throws FileNotFoundException;
    
    /**
     * Read a single byte
     * @return the byte value (0-255), or -1 if end of stream
     */
    public int read() throws IOException;
    
    /**
     * Read bytes into array
     * @param b the byte array
     * @param offset starting offset in array
     * @param length maximum bytes to read
     * @return number of bytes read, or -1 if end of stream
     */
    public int read(byte[] b, int offset, int length) throws IOException;
    
    /**
     * Skip bytes efficiently (true skipping, not just reading)
     * @param n number of bytes to skip
     * @return number of bytes actually skipped
     */
    public long skip(long n) throws IOException;
    
    /**
     * Read a line of text (bytes until newline)
     * @param a byte array that receives the line (never reallocated; at most a.length bytes are stored per call)
     * @return number of bytes in line, or -1 if end of stream
     */
    public int readLine(byte[] a) throws IOException;
    
    /**
     * Read a line of text as byte array
     * @return byte array containing line, or null if end of stream
     */
    public byte[] readLine() throws IOException;
    
    /**
     * Set stream position (if underlying stream supports it)
     * @param newPosition the new position
     */
    public void position(long newPosition) throws IOException;
    
    /**
     * Get current stream position (if underlying stream supports it)
     * @return the current position
     */
    public long position() throws IOException;
    
    /**
     * Get stream length (if underlying stream supports it)
     * @return the stream length
     */
    public long length() throws IOException;
}

/**
 * High-performance, unsynchronized buffered output stream with repositioning
 */
public class FastBufferedOutputStream extends MeasurableOutputStream implements RepositionableStream {
    /** Default buffer size */
    public static final int DEFAULT_BUFFER_SIZE = 8192;
    
    /**
     * Create buffered output stream with default buffer size
     * @param os the underlying output stream
     */
    public FastBufferedOutputStream(OutputStream os);
    
    /**
     * Create buffered output stream with specified buffer size
     * @param os the underlying output stream
     * @param bufSize the buffer size
     */
    public FastBufferedOutputStream(OutputStream os, int bufSize);
    
    /**
     * Create buffered output stream to file
     * @param file the file to write
     * @throws FileNotFoundException if file cannot be created
     */
    public FastBufferedOutputStream(File file) throws FileNotFoundException;
    
    /**
     * Write a single byte
     * @param b the byte value to write
     */
    public void write(int b) throws IOException;
    
    /**
     * Write bytes from array
     * @param b the byte array
     * @param offset starting offset in array
     * @param length number of bytes to write
     */
    public void write(byte[] b, int offset, int length) throws IOException;
    
    /**
     * Flush buffered data to underlying stream
     */
    public void flush() throws IOException;
    
    /**
     * Set stream position (if underlying stream supports it)
     * @param newPosition the new position
     */
    public void position(long newPosition) throws IOException;
    
    /**
     * Get current stream position (if underlying stream supports it)
     * @return the current position
     */
    public long position() throws IOException;
    
    /**
     * Get stream length (if underlying stream supports it)
     * @return the stream length
     */
    public long length() throws IOException;
}

Usage Examples:

import it.unimi.dsi.fastutil.io.*;
import java.io.*;
import java.nio.charset.StandardCharsets;

// Fast file reading with repositioning
// Wrapping a FileInputStream lets length() and position() use its FileChannel
try (FastBufferedInputStream input = new FastBufferedInputStream(new FileInputStream("data.txt"))) {
    // Check file length
    long fileSize = input.length();
    System.out.println("File size: " + fileSize + " bytes");
    
    // Read some data
    byte[] buffer = new byte[1024];
    int bytesRead = input.read(buffer);
    
    // Jump to middle of file
    input.position(fileSize / 2);
    long currentPos = input.position();
    System.out.println("Current position: " + currentPos);
    
    // Read line by line
    byte[] line;
    while ((line = input.readLine()) != null) {
        String text = new String(line, StandardCharsets.UTF_8);
        System.out.println("Line: " + text);
    }
}

// Fast file writing with buffering
// Wrapping a FileOutputStream lets position() use its FileChannel
try (FastBufferedOutputStream output = new FastBufferedOutputStream(new FileOutputStream("output.txt"))) {
    // Write data efficiently
    String data = "Hello, FastUtil I/O!";
    output.write(data.getBytes(StandardCharsets.UTF_8));
    
    // Position and overwrite
    output.position(0);
    output.write("Hi!!!".getBytes(StandardCharsets.UTF_8));
    
    // Ensure data is written
    output.flush();
}

Fast Byte Array Streams

Memory-based streams with direct array access and correct repositioning semantics.

/**
 * Fast, repositionable byte array input stream with correct semantics
 * (fixes issues with java.io.ByteArrayInputStream)
 */
public class FastByteArrayInputStream extends MeasurableInputStream implements RepositionableStream {
    /** Direct access to the underlying byte array */
    public byte[] array;
    
    /** Starting offset in the array */
    public int offset;
    
    /** Number of valid bytes in the array */
    public int length;
    
    /**
     * Create from byte array
     * @param a the byte array
     */
    public FastByteArrayInputStream(byte[] a);
    
    /**
     * Create from portion of byte array
     * @param a the byte array
     * @param offset starting offset
     * @param length number of bytes
     */
    public FastByteArrayInputStream(byte[] a, int offset, int length);
    
    /**
     * Read a single byte
     * @return the byte value (0-255), or -1 if end of stream
     */
    public int read();
    
    /**
     * Read bytes into array
     * @param b destination array
     * @param offset starting offset in destination
     * @param length maximum bytes to read
     * @return number of bytes read, or -1 if end of stream
     */
    public int read(byte[] b, int offset, int length);
    
    /**
     * Skip bytes (efficiently, without reading)
     * @param n number of bytes to skip
     * @return number of bytes actually skipped
     */
    public long skip(long n);
    
    /**
     * Get number of available bytes
     * @return number of bytes available for reading
     */
    public int available();
    
    /**
     * Set stream position
     * @param newPosition the new position
     */
    public void position(long newPosition);
    
    /**
     * Get current stream position
     * @return the current position
     */
    public long position();
    
    /**
     * Get stream length
     * @return the stream length
     */
    public long length();
    
    /**
     * Reset stream to the most recently marked position (the beginning if mark() was never called)
     */
    public void reset();
    
    /**
     * Mark current position (supports repositioning)
     * @param readlimit ignored (always supports repositioning)
     */
    public void mark(int readlimit);
    
    /**
     * Test if mark/reset is supported
     * @return always true
     */
    public boolean markSupported();
}

/**
 * Fast byte array output stream with direct array access
 */
public class FastByteArrayOutputStream extends OutputStream {
    /** Direct access to the underlying byte array */
    public byte[] array;
    
    /** Current length of valid data */
    public int length;
    
    /**
     * Create with default initial capacity
     */
    public FastByteArrayOutputStream();
    
    /**
     * Create with specified initial capacity
     * @param initialCapacity the initial capacity
     */
    public FastByteArrayOutputStream(int initialCapacity);
    
    /**
     * Write a single byte
     * @param b the byte value to write
     */
    public void write(int b);
    
    /**
     * Write bytes from array
     * @param b source array
     * @param offset starting offset in source
     * @param len number of bytes to write
     */
    public void write(byte[] b, int offset, int len);
    
    /**
     * Reset the stream (length becomes 0, array is reused)
     */
    public void reset();
    
    /**
     * Get copy of written data as byte array
     * @return copy of the data
     */
    public byte[] toByteArray();
    
    /**
     * Get current length of data
     * @return number of bytes written
     */
    public int size();
    
    /**
     * Convert data to string using specified charset
     * @param charset the character set
     * @return string representation of data
     */
    public String toString(Charset charset);
    
    /**
     * Trim array to exact size (frees unused capacity)
     */
    public void trim();
}

Usage Examples:

import it.unimi.dsi.fastutil.io.*;
import java.nio.charset.StandardCharsets;

// Working with byte array input stream
// (an explicit charset keeps the byte offsets below platform-independent)
byte[] data = "Hello, World! This is test data.".getBytes(StandardCharsets.UTF_8);
FastByteArrayInputStream input = new FastByteArrayInputStream(data);

// Direct array access
System.out.println("Array length: " + input.array.length);
System.out.println("Data length: " + input.length);

// Positioning and reading
input.position(7); // Move to "World!"
byte[] buffer = new byte[6];
int read = input.read(buffer);
System.out.println("Read: " + new String(buffer, 0, read, StandardCharsets.UTF_8)); // "World!"

// Mark and reset functionality
input.mark(0);
input.read(); // Read one byte
input.reset(); // Back to marked position

// Working with byte array output stream
FastByteArrayOutputStream output = new FastByteArrayOutputStream();
output.write("Hello".getBytes(StandardCharsets.UTF_8));
output.write(", ".getBytes(StandardCharsets.UTF_8));
output.write("World!".getBytes(StandardCharsets.UTF_8));

// Direct access to written data (only the first output.length bytes are valid)
System.out.println("Written " + output.length + " bytes");
System.out.println("Data: " + new String(output.array, 0, output.length, StandardCharsets.UTF_8));

// Get copy of data
byte[] result = output.toByteArray();

// Reset and reuse
output.reset();
output.write("New data".getBytes(StandardCharsets.UTF_8));
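
The public array and length fields are what set these classes apart from java.io.ByteArrayOutputStream, whose toByteArray() always copies. The following is a minimal sketch of that design using only the JDK. GrowableByteSink is a hypothetical class written for illustration and is not the fastutil implementation.

```java
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical output stream that exposes its backing array directly.
class GrowableByteSink extends OutputStream {
    public byte[] array = new byte[16]; // backing array, may be longer than the data
    public int length;                  // number of valid bytes in array

    @Override public void write(int b) {
        ensureCapacity(length + 1);
        array[length++] = (byte) b;
    }

    @Override public void write(byte[] b, int off, int len) {
        ensureCapacity(length + len);
        System.arraycopy(b, off, array, length, len);
        length += len;
    }

    // Forget the data but keep the array so it can be reused without allocation.
    public void reset() { length = 0; }

    // Grow by at least doubling so repeated writes stay cheap on average.
    private void ensureCapacity(int needed) {
        if (needed > array.length) {
            array = Arrays.copyOf(array, Math.max(needed, array.length * 2));
        }
    }

    public static void main(String[] args) {
        GrowableByteSink sink = new GrowableByteSink();
        byte[] hello = "Hello, World! Hello again.".getBytes(StandardCharsets.UTF_8);
        sink.write(hello, 0, hello.length);
        // Read the valid prefix in place; no copy of the buffer is made.
        System.out.println(new String(sink.array, 0, sink.length, StandardCharsets.UTF_8));
        sink.reset();
        System.out.println(sink.length);
    }
}
```

The trade-off is that callers must always pair array with length, because the array is usually longer than the data it holds.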

Multi-Array and Cached Streams

Specialized streams for handling very large data and file caching scenarios.

/**
 * Input stream backed by multiple byte arrays for very large data
 */
public class FastMultiByteArrayInputStream extends MeasurableInputStream implements RepositionableStream {
    /**
     * Create from array of byte arrays
     * @param a array of byte arrays
     */
    public FastMultiByteArrayInputStream(byte[][] a);
    
    /**
     * Create from array of byte arrays with specified offset and length
     * @param a array of byte arrays
     * @param offset starting offset
     * @param length total number of bytes
     */
    public FastMultiByteArrayInputStream(byte[][] a, long offset, long length);
    
    /**
     * Read a single byte
     * @return the byte value (0-255), or -1 if end of stream
     */
    public int read() throws IOException;
    
    /**
     * Read bytes into array
     * @param b destination array
     * @param offset starting offset in destination
     * @param length maximum bytes to read
     * @return number of bytes read, or -1 if end of stream
     */
    public int read(byte[] b, int offset, int length) throws IOException;
    
    /**
     * Skip bytes efficiently
     * @param n number of bytes to skip
     * @return number of bytes actually skipped
     */
    public long skip(long n) throws IOException;
    
    /**
     * Set stream position
     * @param newPosition the new position
     */
    public void position(long newPosition) throws IOException;
    
    /**
     * Get current stream position
     * @return the current position
     */
    public long position() throws IOException;
    
    /**
     * Get stream length
     * @return the stream length
     */
    public long length() throws IOException;
}

/**
 * File-cached input stream with inspection capabilities
 */
public class InspectableFileCachedInputStream extends InputStream {
    /**
     * Create from input stream with specified cache size
     * @param is the source input stream
     * @param bufferSize size of memory buffer before file caching
     */
    public InspectableFileCachedInputStream(InputStream is, int bufferSize);
    
    /**
     * Create from input stream with default cache size
     * @param is the source input stream
     */
    public InspectableFileCachedInputStream(InputStream is);
    
    /**
     * Read a single byte
     * @return the byte value (0-255), or -1 if end of stream
     */
    public int read() throws IOException;
    
    /**
     * Read bytes into array
     * @param b destination array
     * @param offset starting offset in destination
     * @param length maximum bytes to read
     * @return number of bytes read, or -1 if end of stream
     */
    public int read(byte[] b, int offset, int length) throws IOException;
    
    /**
     * Get current position in cached data
     * @return the current position
     */
    public long position();
    
    /**
     * Get total amount of data cached so far
     * @return number of bytes cached
     */
    public long size();
    
    /**
     * Inspect cached data at specified position
     * @param position position to inspect
     * @return byte value at position, or -1 if not yet cached
     */
    public int inspect(long position);
    
    /**
     * Close the stream and clean up cache files
     */
    public void close() throws IOException;
}

Usage Examples:

import it.unimi.dsi.fastutil.io.*;
import java.net.URL;

// Working with multi-array input for big data
byte[][] bigData = new byte[1000][]; // Array of arrays
for (int i = 0; i < bigData.length; i++) {
    bigData[i] = ("Segment " + i + " data\n").getBytes();
}

FastMultiByteArrayInputStream multiInput = new FastMultiByteArrayInputStream(bigData);
long totalSize = multiInput.length();
System.out.println("Total data size: " + totalSize);

// Position anywhere in the big data
multiInput.position(totalSize / 2);
byte[] sample = new byte[20];
int sampleLength = multiInput.read(sample); // may be fewer than 20 bytes
System.out.println("Middle sample: " + new String(sample, 0, sampleLength));

// File-cached stream for large network downloads
try (InspectableFileCachedInputStream cached = 
     new InspectableFileCachedInputStream(
         new URL("http://example.com/largefile.dat").openStream())) {
    
    // Read data (automatically cached to file when memory buffer fills)
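    byte[] buffer = new byte[8192];
    long totalRead = 0;           // long, so large downloads cannot overflow
    long nextReport = 1_000_000;  // report roughly every million bytes
    int read;
    while ((read = cached.read(buffer)) != -1) {
        totalRead += read;
        // A running total almost never lands exactly on a multiple of
        // 1,000,000, so compare against a threshold instead of using %.
        if (totalRead >= nextReport) {
            System.out.println("Read " + totalRead + " bytes, cached " + cached.size());
            nextReport += 1_000_000;
        }
    }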
    
    // Inspect previously read data
    long pos = cached.position();
    if (pos > 1000) {
        int byte1000 = cached.inspect(1000);
        System.out.println("Byte at position 1000: " + byte1000);
    }
}
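
Java arrays are indexed by int, so a single byte[] cannot hold more than 2^31 - 1 elements. A multi-array stream gets past that limit by spreading data over several arrays and addressing it with a long. Here is a minimal sketch of that addressing, assuming fixed-size segments. SegmentedBytes and its tiny 8-byte segment size are illustrative choices, not how fastutil lays out its data.

```java
// Hypothetical fixed-size-segment store addressed by a long index.
class SegmentedBytes {
    private static final int SEGMENT_BITS = 3;                 // 8-byte segments, tiny for demonstration
    private static final int SEGMENT_SIZE = 1 << SEGMENT_BITS;
    private static final int SEGMENT_MASK = SEGMENT_SIZE - 1;

    private final byte[][] segments;
    private final long length;

    SegmentedBytes(long length) {
        this.length = length;
        // Round up so a partially filled last segment is still allocated.
        int count = (int) ((length + SEGMENT_MASK) >>> SEGMENT_BITS);
        segments = new byte[count][SEGMENT_SIZE];
    }

    long length() { return length; }

    void set(long index, byte value) {
        checkIndex(index);
        // High bits select the segment, low bits the offset inside it.
        segments[(int) (index >>> SEGMENT_BITS)][(int) (index & SEGMENT_MASK)] = value;
    }

    // Returns the byte at index as 0-255.
    int get(long index) {
        checkIndex(index);
        return segments[(int) (index >>> SEGMENT_BITS)][(int) (index & SEGMENT_MASK)] & 0xFF;
    }

    private void checkIndex(long index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("Index: " + index);
        }
    }

    public static void main(String[] args) {
        SegmentedBytes bytes = new SegmentedBytes(20); // three segments
        bytes.set(17, (byte) 200);
        System.out.println(bytes.get(17));
    }
}
```

With a power-of-two segment size, locating a byte costs one shift and one mask, which keeps random access cheap however many segments there are.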

Performance Features

Key Optimizations

  1. Unsynchronized Operations: All classes are designed for single-threaded access, eliminating synchronization overhead
  2. Direct Buffer Management: Manual buffer management for optimal performance
  3. True Skipping: skip() methods actually skip data instead of reading and discarding
  4. Efficient Repositioning: Direct position manipulation where supported by underlying streams
  5. FileChannel Repositioning: Automatic detection and use of the underlying stream's FileChannel for repositioning when available
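
The FileChannel mechanism behind repositioning can be seen with the JDK alone: a FileInputStream shares its file position with the channel returned by getChannel(), so moving the channel moves the stream. The following is a minimal sketch that uses a temporary file. ChannelSeekSketch and byteAt are hypothetical names, and this is not fastutil's internal code.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;

class ChannelSeekSketch {
    // Reads the single byte at the given offset by repositioning the channel
    // instead of reading and discarding everything before it.
    static int byteAt(Path file, long offset) throws IOException {
        try (FileInputStream in = new FileInputStream(file.toFile())) {
            FileChannel channel = in.getChannel();
            if (offset < 0 || offset >= channel.size()) {
                return -1; // outside the file
            }
            channel.position(offset); // also moves the stream's read position
            return in.read();
        }
    }

    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("channel-seek", ".bin");
        try {
            Files.write(file, new byte[] {1, 2, 3, 4, 5});
            System.out.println(byteAt(file, 3));
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```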

Usage Guidelines

  1. Thread Safety: These classes are NOT thread-safe. Use external synchronization if needed
  2. Resource Management: Always use try-with-resources for proper cleanup
  3. Buffer Sizing: Choose buffer sizes appropriate for your access patterns
  4. File Operations: FastBuffered streams automatically detect and use FileChannel optimizations
  5. Memory Constraints: Use multi-array streams for data larger than single array limits
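
For the thread-safety guideline, one option is to funnel all access through a single lock. The sketch below wraps any OutputStream this way. LockedOutputStream and its demo method are hypothetical helpers, and the target is a deliberately unsynchronized anonymous stream so that the example runs without fastutil.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical wrapper that serializes all access to an unsynchronized stream.
class LockedOutputStream extends OutputStream {
    private final OutputStream out;
    private final Object lock = new Object();

    LockedOutputStream(OutputStream out) { this.out = out; }

    @Override public void write(int b) throws IOException {
        synchronized (lock) { out.write(b); }
    }

    @Override public void write(byte[] b, int off, int len) throws IOException {
        synchronized (lock) { out.write(b, off, len); }
    }

    @Override public void flush() throws IOException {
        synchronized (lock) { out.flush(); }
    }

    @Override public void close() throws IOException {
        synchronized (lock) { out.close(); }
    }

    // Writes 4 * 10,000 bytes from four threads and returns how many arrived.
    static long demo() throws InterruptedException {
        long[] count = {0};
        OutputStream unsafe = new OutputStream() {
            @Override public void write(int b) { count[0]++; } // not atomic on its own
        };
        LockedOutputStream safe = new LockedOutputStream(unsafe);
        Thread[] workers = new Thread[4];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> {
                try {
                    for (int j = 0; j < 10_000; j++) safe.write(1);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
            workers[i].start();
        }
        for (Thread t : workers) t.join(); // join makes the final count visible here
        return count[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

Locking on every call gives back the overhead these classes were designed to avoid, so confining each stream to a single thread is usually the better choice when the design allows it.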

Install with Tessl CLI

npx tessl i tessl/maven-it-unimi-dsi--fastutil
