# Tar Extraction

Streaming tar archive parser that extracts entries from tar data without hitting the file system. The extract stream processes tar archives sequentially and emits events for each entry.

## Capabilities

### Extract Factory Function

Creates a new tar extraction stream.

```ts
/**
 * Creates a new tar extraction stream
 * @param {ExtractOptions} [opts] - Optional configuration
 * @returns {Extract} Extract stream instance
 */
function extract(opts?: ExtractOptions): Extract;

interface ExtractOptions {
  /** Character encoding for filenames, defaults to 'utf-8' */
  filenameEncoding?: string;
  /** Allow unknown tar formats, defaults to false */
  allowUnknownFormat?: boolean;
}
```

### Extract Stream

Writable stream that parses tar data and emits entry events.

```ts
class Extract extends Writable {
  /** Async iterator interface for processing entries */
  [Symbol.asyncIterator](): AsyncIterableIterator<EntryStream>;
}
```

### Entry Event

Emitted for each tar entry (file, directory, etc.) found in the archive.

```ts
/**
 * Entry event handler
 * @param {Header} header - Tar header with entry metadata
 * @param {EntryStream} stream - Readable stream of entry content
 * @param {Function} next - Callback to proceed to next entry
 */
extract.on('entry', function(header: Header, stream: EntryStream, next: () => void): void);
```

### Entry Stream

Readable stream representing the content of a tar entry.

```ts
class EntryStream extends Readable {
  /** Tar header for this entry */
  header: Header;
  /** Byte offset of this entry in the tar archive */
  offset: number;
}
```

### Finish Event

Emitted when all entries have been processed.

```ts
extract.on('finish', function(): void);
```

## Usage Examples

### Basic Extraction

```js
const tar = require('tar-stream');
const fs = require('fs');

const extract = tar.extract();

extract.on('entry', function(header, stream, next) {
  console.log('Entry:', header.name, 'Size:', header.size);

  stream.on('end', function() {
    next(); // ready for next entry
  });

  stream.resume(); // auto-drain the stream
});

extract.on('finish', function() {
  console.log('Extraction complete');
});

// Pipe tar data to extractor
fs.createReadStream('archive.tar').pipe(extract);
```

### Async Iterator Usage

```js
const tar = require('tar-stream');
const fs = require('fs');

async function extractArchive() {
  const extract = tar.extract();

  // Pipe tar data to extractor
  fs.createReadStream('archive.tar').pipe(extract);

  // Process entries using async iterator
  for await (const entry of extract) {
    console.log('Processing:', entry.header.name);

    // Entry stream is the same object as the iterator value
    entry.resume(); // drain the stream
  }
}
```

### Content Processing

```js
const tar = require('tar-stream');
const fs = require('fs');

const extract = tar.extract();

extract.on('entry', function(header, stream, next) {
  if (header.type === 'file' && header.name.endsWith('.txt')) {
    let content = '';

    stream.on('data', function(chunk) {
      content += chunk.toString();
    });

    stream.on('end', function() {
      console.log('File content:', content);
      next();
    });
  } else {
    stream.resume(); // skip non-text files
    stream.on('end', next);
  }
});

fs.createReadStream('archive.tar').pipe(extract);
```

### Error Handling

```js
const tar = require('tar-stream');

const extract = tar.extract({
  allowUnknownFormat: true, // allow non-standard tar formats
  filenameEncoding: 'latin1' // handle non-UTF8 filenames
});

extract.on('error', function(err) {
  console.error('Extraction error:', err.message);
});

extract.on('entry', function(header, stream, next) {
  stream.on('error', function(err) {
    console.error('Entry stream error:', err.message);
    next(err); // propagate error
  });

  stream.resume();
  stream.on('end', next);
});
```

## Important Notes

- **Sequential Processing**: the archive is streamed sequentially, so you must fully drain each entry's stream (and, in event mode, call `next`) before the next entry is emitted
- **Memory Efficiency**: streaming allows processing large archives without loading their entire contents into memory
- **Format Support**: supports the USTAR format with pax extended headers for long filenames and paths
- **Error Recovery**: invalid tar headers cause an `error` event unless `allowUnknownFormat` is enabled

## Install with Tessl CLI

```shell
npx tessl i tessl/npm-tar-stream
```
