
tessl/pypi-smmap

A pure Python implementation of a sliding window memory map manager



Window Access

Interface classes for accessing memory-mapped file content: a low-level cursor for precise control over mapped memory regions, and a high-level buffer with automatic windowing and string-like access.

Capabilities

WindowCursor

Low-level interface providing precise control over memory region mapping and access. Cursors are created by memory managers and provide direct access to mapped memory buffers.

class WindowCursor:
    def use_region(self, offset=0, size=0, flags=0):
        """
        Map a file region into memory for access.
        
        Parameters:
        - offset (int): Absolute offset in bytes into the file (default: 0)
        - size (int): Amount of bytes to map. 0 = map as much as possible
        - flags (int): Additional flags for os.open (used when opening file handles)
        
        Returns:
        WindowCursor: Self (for method chaining)
        
        Note: The actual mapped size may be smaller than requested if the file
        ends or if the region was created between existing mapped regions.
        """
    
    def unuse_region(self):
        """
        Release the current memory region and free resources.
        
        Note: Cursor becomes invalid after calling this method.
        Resources are automatically freed on cursor destruction.
        """
    
    def buffer(self):
        """
        Get memory buffer for the current mapped region.
        
        Returns:
        memoryview: Buffer object providing access to mapped memory
        
        Raises:
        AssertionError: If cursor is not valid (no region mapped)
        
        Note: Buffer should not be cached beyond the cursor's lifetime
        as it prevents resource cleanup.
        """
    
    def map(self):
        """
        Get the underlying raw memory map.
        
        Returns:
        mmap.mmap: Raw memory map object
        
        Note: Offset and size may differ from use_region() parameters.
        For StaticWindowMapManager, this maps the entire file.
        """
    
    def is_valid(self):
        """
        Check if cursor has a valid mapped region.
        
        Returns:
        bool: True if cursor can be used for memory access
        """
    
    def is_associated(self):
        """
        Check if cursor is associated with a file.
        
        Returns:
        bool: True if cursor is bound to a specific file
        """
    
    def ofs_begin(self):
        """
        Get absolute offset to the first byte accessible through this cursor.
        
        Returns:
        int: Absolute file offset in bytes
        
        Note: Only valid when is_valid() returns True
        """
    
    def ofs_end(self):
        """
        Get absolute offset to one byte past the last accessible byte.
        
        Returns:
        int: Absolute file offset in bytes
        """
    
    def size(self):
        """
        Get number of bytes accessible through this cursor.
        
        Returns:
        int: Size of accessible region in bytes
        """
    
    def region(self):
        """
        Get the underlying MapRegion object.
        
        Returns:
        MapRegion | None: Current mapped region, or None if invalid
        """
    
    def includes_ofs(self, ofs):
        """
        Check if absolute offset is accessible through this cursor.
        
        Parameters:
        - ofs (int): Absolute file offset to check
        
        Returns:
        bool: True if offset is within cursor's current region
        
        Note: Cursor must be valid for this to work correctly.
        """
    
    def file_size(self):
        """
        Get size of the underlying file.
        
        Returns:
        int: Total file size in bytes
        """
    
    def path_or_fd(self):
        """
        Get file path or descriptor used to create this cursor.
        
        Returns:
        str | int: File path or file descriptor
        """
    
    def path(self):
        """
        Get file path for this cursor.
        
        Returns:
        str: File path
        
        Raises:
        ValueError: If cursor was created from a file descriptor
        """
    
    def fd(self):
        """
        Get file descriptor for this cursor.
        
        Returns:
        int: File descriptor
        
        Raises:
        ValueError: If cursor was created from a file path
        
        Note: File descriptor may no longer be valid.
        """
    
    def assign(self, rhs):
        """
        Copy data from another cursor into this instance.
        
        Parameters:
        - rhs (WindowCursor): Source cursor to copy from
        
        Note: This creates a real copy with independent resource management.
        Alternatively, use copy.copy() for the same functionality.
        """

Usage Example

import smmap

manager = smmap.SlidingWindowMapManager()
cursor = manager.make_cursor('/path/to/data.bin')

# Map first 1MB of file
cursor.use_region(offset=0, size=1024*1024)

if cursor.is_valid():
    print(f"Mapped region: {cursor.ofs_begin()} to {cursor.ofs_end()}")
    print(f"Region size: {cursor.size()} bytes")
    print(f"File size: {cursor.file_size()} bytes")
    
    # Access mapped memory
    buffer = cursor.buffer()
    first_byte = buffer[0]
    header = buffer[:128]
    
    # Check if specific offset is accessible
    if cursor.includes_ofs(512000):
        middle_data = buffer[512000 - cursor.ofs_begin()]
    
    # Map a different region
    cursor.use_region(offset=2*1024*1024, size=512*1024)  # Map 512KB at 2MB offset
    if cursor.is_valid():
        new_buffer = cursor.buffer()
        data = new_buffer[:1000]
    
    # Clean up
    cursor.unuse_region()

# Context manager support
with cursor:
    cursor.use_region(offset=1000, size=5000)
    if cursor.is_valid():
        data = bytes(cursor.buffer())  # copy the bytes so no view outlives the cursor
    # Automatic cleanup on exit
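
The note on map() says the raw map's offset and size may differ from the values passed to use_region(). One plausible reason is alignment: operating-system mappings must start at a multiple of mmap.ALLOCATIONGRANULARITY, so a map begins at an aligned offset and the requested bytes start some distance into it. The sketch below shows that arithmetic with the standard-library mmap module only. The helper name aligned_view is hypothetical, and this is an illustration of the idea, not smmap's implementation.

```python
import mmap
import os
import tempfile


def aligned_view(fd, offset, size):
    """Map `size` bytes at absolute `offset`; return (mmap, view of the requested bytes)."""
    gran = mmap.ALLOCATIONGRANULARITY
    map_start = (offset // gran) * gran      # round down to a legal mmap offset
    rel = offset - map_start                 # where the requested bytes begin inside the map
    file_size = os.fstat(fd).st_size
    size = min(size, file_size - offset)     # clamp the request to the end of the file
    m = mmap.mmap(fd, rel + size, access=mmap.ACCESS_READ, offset=map_start)
    return m, memoryview(m)[rel:rel + size]


def demo():
    gran = mmap.ALLOCATIONGRANULARITY
    payload = bytes(range(256)) * ((2 * gran) // 256 + 1)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = gran + 10                   # deliberately not aligned
        m, view = aligned_view(fd, offset, 16)
        data = view.tobytes()                # copy out before releasing the view
        view.release()
        m.close()
    finally:
        os.close(fd)
        os.remove(path)
    return data, payload[offset:offset + 16]
```

Calling demo() returns the bytes read through the aligned map together with the same range taken directly from the payload, so the two can be compared.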

SlidingWindowMapBuffer

High-level buffer interface providing string-like access to memory-mapped files with automatic windowing. Handles window management transparently, allowing direct indexing and slicing operations.

class SlidingWindowMapBuffer:
    def __init__(self, cursor=None, offset=0, size=sys.maxsize, flags=0):
        """
        Initialize buffer for memory-mapped file access.
        
        Parameters:
        - cursor (WindowCursor | None): Associated cursor for file access.
          If None, must call begin_access() before use.
        - offset (int): Starting offset in file (default: 0)
        - size (int): Maximum buffer size. If larger than file, file size is used.
        - flags (int): Additional flags for os.open operations
        
        Raises:
        ValueError: If buffer cannot achieve valid state with given parameters
        """
    
    def begin_access(self, cursor=None, offset=0, size=sys.maxsize, flags=0):
        """
        Initialize buffer for file access.
        
        Parameters:
        - cursor (WindowCursor | None): Cursor for file access. If None, uses existing cursor.
        - offset (int): Starting offset in file
        - size (int): Maximum buffer size
        - flags (int): Additional flags for file operations
        
        Returns:
        bool: True if buffer is ready for use
        """
    
    def end_access(self):
        """
        Release buffer resources and make buffer unusable.
        
        Note: Automatically called on destruction. Call manually for
        earlier resource cleanup in persistent buffer instances.
        """
    
    def cursor(self):
        """
        Get associated cursor providing file access.
        
        Returns:
        WindowCursor: Underlying cursor object
        """
    
    def __len__(self):
        """
        Get buffer length.
        
        Returns:
        int: Buffer size in bytes
        """
    
    def __getitem__(self, key):
        """
        Get byte(s) at index or slice.
        
        Parameters:
        - key (int | slice): Index or slice specification
        
        Returns:
        int | bytes: Single byte value or byte sequence
        
        Note: Automatically maps required windows as needed.
        Supports negative indexing for end-relative access.
        """

Usage Example

import smmap

manager = smmap.SlidingWindowMapManager()

# Direct initialization with cursor
cursor = manager.make_cursor('/path/to/large_file.dat')
buffer = smmap.SlidingWindowMapBuffer(cursor)

# String-like access with automatic windowing
print(f"File size: {len(buffer)} bytes")

# Direct byte access
first_byte = buffer[0]
last_byte = buffer[-1]

# Slice access
header = buffer[:1024]  # First 1KB
trailer = buffer[-1024:]  # Last 1KB
middle_chunk = buffer[1000000:1001000]  # 1000 bytes at offset 1,000,000

# Context manager for automatic cleanup
with smmap.SlidingWindowMapBuffer() as buf:
    # Initialize for specific region
    if buf.begin_access(cursor, offset=2048, size=1024*1024):
        # Access relative to buffer start (offset 2048 in file = index 0 in buffer)
        data = buf[0:100]  # First 100 bytes of the buffer region
        
        # Check buffer state
        print(f"Buffer covers {len(buf)} bytes")
        print(f"Cursor valid: {buf.cursor().is_valid()}")
    
    # Automatic cleanup on context exit

# Manual resource management
buf = smmap.SlidingWindowMapBuffer()
try:
    buf.begin_access(cursor, offset=0)
    
    # Process file in chunks
    chunk_size = 64 * 1024
    for i in range(0, len(buf), chunk_size):
        chunk = buf[i:i+chunk_size]
        # Process chunk
        pass
        
finally:
    buf.end_access()
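
To make "automatic windowing" concrete, here is a minimal sketch of the idea using only the standard library: an object that maps one fixed-size window on demand and stitches slices together when they cross a window boundary. ToyWindowBuffer and its window parameter are hypothetical, only integer indices and step-free slices are handled, and the class is not meant to mirror how SlidingWindowMapBuffer is actually implemented.

```python
import mmap
import os
import tempfile


class ToyWindowBuffer:
    """Read-only view indexed by file offset that maps one aligned window at a time."""

    def __init__(self, path, window=mmap.ALLOCATIONGRANULARITY):
        self._fd = os.open(path, os.O_RDONLY)
        self._size = os.fstat(self._fd).st_size
        self._window = window          # must be a multiple of ALLOCATIONGRANULARITY
        self._map = None
        self._start = 0                # absolute offset of the current window

    def __len__(self):
        return self._size

    def _ensure(self, ofs):
        """Remap if `ofs` lies outside the currently mapped window."""
        if self._map is not None and self._start <= ofs < self._start + len(self._map):
            return
        if self._map is not None:
            self._map.close()
        self._start = (ofs // self._window) * self._window
        length = min(self._window, self._size - self._start)
        self._map = mmap.mmap(self._fd, length, access=mmap.ACCESS_READ, offset=self._start)

    def __getitem__(self, key):
        if isinstance(key, slice):
            start, stop, step = key.indices(self._size)
            if step != 1:
                raise ValueError("step slices are not supported in this sketch")
            parts = []
            while start < stop:
                self._ensure(start)
                rel = start - self._start
                chunk = self._map[rel:rel + (stop - start)]  # bytes copy, clipped to the window
                parts.append(chunk)
                start += len(chunk)
            return b"".join(parts)
        if key < 0:
            key += self._size          # end-relative index
        if not 0 <= key < self._size:
            raise IndexError("index out of range")
        self._ensure(key)
        return self._map[key - self._start]

    def close(self):
        if self._map is not None:
            self._map.close()
            self._map = None
        os.close(self._fd)


def demo():
    gran = mmap.ALLOCATIONGRANULARITY
    payload = os.urandom(3 * gran + 123)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    buf = ToyWindowBuffer(path)
    try:
        ok = (
            len(buf) == len(payload)
            and buf[0] == payload[0]
            and buf[-1] == payload[-1]
            and buf[gran - 5:gran + 5] == payload[gran - 5:gran + 5]  # crosses a window boundary
            and buf[-64:] == payload[-64:]
        )
    finally:
        buf.close()
        os.remove(path)
    return ok
```

Slices are returned as bytes copies here, which keeps the sketch simple because no view of a window survives the next remap.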

Access Patterns

Sequential Access

# Efficient sequential reading
# (assumes manager, file_path and window_size are already defined)
cursor = manager.make_cursor(file_path)
cursor.use_region(offset=0, size=window_size)

offset = 0
while cursor.is_valid():
    buffer = cursor.buffer()
    # Process buffer content
    
    offset += cursor.size()
    cursor.use_region(offset=offset, size=window_size)
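
The loop advances by cursor.size() rather than by window_size because the mapped size may be smaller than requested, for example at the end of the file. The offset arithmetic can be sketched without smmap. iter_windows is a hypothetical helper that assumes every window except possibly the last is mapped at full size.

```python
def iter_windows(file_size, window_size):
    """Yield (offset, size) pairs that cover [0, file_size) without gaps or overlap."""
    if window_size <= 0:
        raise ValueError("window_size must be positive")
    offset = 0
    while offset < file_size:
        size = min(window_size, file_size - offset)  # the last window may be short
        yield offset, size
        offset += size                               # advance by what was actually covered


windows = list(iter_windows(file_size=10, window_size=4))
```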

Random Access

# Random access with buffer interface
with smmap.SlidingWindowMapBuffer(manager.make_cursor(file_path)) as buf:
    # Access any location efficiently
    header = buf[0:64]
    footer = buf[-64:]
    middle = buf[len(buf)//2:len(buf)//2+64]

Multi-cursor Access

# Multiple cursors for parallel access
cursor1 = manager.make_cursor(file_path)
cursor2 = manager.make_cursor(file_path)

# Different regions simultaneously
cursor1.use_region(offset=0, size=1024*1024)
cursor2.use_region(offset=10*1024*1024, size=1024*1024)

# Independent access to same file
data1 = cursor1.buffer()[:]
data2 = cursor2.buffer()[:]
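
Because buffer() returns a memoryview, slicing it yields another zero-copy view rather than a copy, and a live view can keep the underlying map from being closed. The sketch below demonstrates this with an anonymous standard-library mmap, so no file or smmap object is involved. It illustrates why the buffer() docstring advises against caching buffers; converting to bytes is one way to keep data without holding the map.

```python
import mmap

m = mmap.mmap(-1, 16)          # anonymous 16-byte map, no file needed
m[:4] = b"abcd"

view = memoryview(m)[:4]       # zero-copy view; keeps the map exported
copy = bytes(view)             # independent copy of the same four bytes

try:
    m.close()                  # refused while a view is still alive
    close_blocked = False
except BufferError:
    close_blocked = True

view.release()                 # drop the view...
m.close()                      # ...and now the map can be closed
```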

Resource Management

  • Automatic Cleanup: Context managers and destructors handle resource cleanup
  • Reference Counting: Regions stay mapped while cursors reference them
  • Window Management: Buffers automatically map/unmap windows as needed
  • Memory Limits: Respects manager-configured memory and handle limits
  • Copy Semantics: Cursors support copying for independent access patterns
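
The reference-counting bullet can be illustrated with a toy model: a region counts the cursors using it and is only eligible for release once that count drops to zero. ToyRegion and ToyCursor are hypothetical names. As a simplification, the sketch releases the region immediately at zero, whereas a real manager may keep unused regions mapped until its limits are reached.

```python
class ToyRegion:
    """Stands in for a mapped region; `released` flips once nobody uses it."""

    def __init__(self):
        self.clients = 0
        self.released = False

    def acquire(self):
        if self.released:
            raise RuntimeError("region already released")
        self.clients += 1

    def release(self):
        self.clients -= 1
        if self.clients == 0:
            self.released = True       # a real region would close its mmap here


class ToyCursor:
    """Context manager that holds one reference to a region while in use."""

    def __init__(self, region):
        self._region = region
        self._active = False

    def __enter__(self):
        self._region.acquire()
        self._active = True
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        if self._active:
            self._region.release()
            self._active = False
        return False                   # never swallow exceptions


region = ToyRegion()
with ToyCursor(region):
    with ToyCursor(region):
        peak = region.clients               # two cursors share the region
    still_mapped = not region.released      # one cursor left, region stays alive
released_after = region.released            # last cursor gone, region released
```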
