Advanced email parser for Node.js that handles email parsing as a stream for memory-efficient processing of large messages
Low-level streaming parser class for advanced email processing scenarios where you need fine-grained control over the parsing process, event handling, and memory usage. Built on Node.js Transform streams for maximum efficiency.
Transform stream class that parses email data and emits events for headers, text content, and attachments as they are processed.
/**
 * Advanced streaming email parser class
 * @param options - Parsing configuration options
 */
class MailParser extends Transform {
  constructor(options?: ParseOptions);

  /** Parsed email headers (available after 'headers' event) */
  headers: Map<string, any>;
  /** Raw header lines (available after 'headers' event) */
  headerLines: HeaderLine[];
  /** HTML content (available after parsing completes) */
  html?: string;
  /** Plain text content (available after parsing completes) */
  text?: string;
  /** Text converted to HTML (available after parsing completes) */
  textAsHtml?: string;
  /** Email subject (available after 'headers' event) */
  subject?: string;
  /** From address (available after 'headers' event) */
  from?: AddressObject;
  /** To addresses (available after 'headers' event) */
  to?: AddressObject;
  /** CC addresses (available after 'headers' event) */
  cc?: AddressObject;
  /** BCC addresses (available after 'headers' event) */
  bcc?: AddressObject;
  /** Email date (available after 'headers' event) */
  date?: Date;
  /** Message ID (available after 'headers' event) */
  messageId?: string;
  /** List of attachments processed */
  attachmentList: AttachmentData[];
}

Usage Examples:
const { MailParser } = require('mailparser');
const fs = require('fs');

// Basic streaming parsing
const parser = new MailParser();

parser.on('headers', headers => {
  console.log('Subject:', headers.get('subject'));
  console.log('From:', headers.get('from'));
});

// Read data from the stream in object mode
const readData = () => {
  let data;
  while ((data = parser.read()) !== null) {
    if (data.type === 'text') {
      console.log('Text content:', data.text);
      console.log('HTML content:', data.html);
    }
    if (data.type === 'attachment') {
      console.log('Attachment:', data.filename);
      // Process attachment content
      const chunks = [];
      data.content.on('data', chunk => chunks.push(chunk));
      data.content.on('end', () => {
        const content = Buffer.concat(chunks);
        console.log('Attachment content length:', content.length);
        // size is only calculated once the content stream has been read
        console.log('Size:', data.size);
        data.release(); // Must call to continue processing
        readData(); // Continue reading after attachment is processed
      });
      return; // Wait for attachment to complete before reading more
    }
  }
};

parser.on('readable', readData);

parser.on('end', () => {
  console.log('Parsing complete');
});

// Pipe email data to parser
fs.createReadStream('email.eml').pipe(parser);

The MailParser class emits several events during the parsing process.
// Headers event - emitted when email headers are parsed
on(event: 'headers', listener: (headers: Map<string, any>) => void): this;
// Header lines event - emitted with raw header data
on(event: 'headerLines', listener: (lines: HeaderLine[]) => void): this;
// Readable event - emitted when data is available to read
on(event: 'readable', listener: () => void): this;
// Stream reading method - call to get text/attachment data
read(): TextData | AttachmentData | null;
// End event - emitted when parsing is complete
on(event: 'end', listener: () => void): this;
// Error event - emitted when parsing errors occur
on(event: 'error', listener: (err: Error) => void): this;

Returned by the read() method when text content is parsed.
interface TextData {
  /** Always 'text' for text content */
  type: 'text';
  /** HTML version of the email content */
  html?: string;
  /** Plain text version of the email content */
  text?: string;
  /** Plain text converted to HTML format */
  textAsHtml?: string;
}

Returned by the read() method for each attachment found in the email.
interface AttachmentData {
  /** Always 'attachment' for attachment data */
  type: 'attachment';
  /** Readable stream containing attachment content */
  content: Stream;
  /** MIME content type of the attachment */
  contentType: string;
  /** Part identifier within the email structure */
  partId?: string;
  /** Content-Disposition header value */
  contentDisposition?: string;
  /** Filename from Content-Disposition or Content-Type */
  filename?: string;
  /** Content-ID header value (with angle brackets) */
  contentId?: string;
  /** Clean Content-ID without angle brackets */
  cid?: string;
  /** Whether attachment is related to email content (embedded) */
  related?: boolean;
  /** All headers for this attachment part */
  headers: Map<string, any>;
  /** Content checksum (calculated after content is read) */
  checksum?: string;
  /** Size in bytes (calculated after content is read) */
  size?: number;
  /** Function to call when done processing attachment - REQUIRED */
  release(): void;
}

Additional methods available on the MailParser instance for advanced use cases.
/**
 * Update image links in HTML content using a custom replacement function
 * @param replaceCallback - Function to generate replacement URLs
 * @param done - Callback when processing is complete
 */
updateImageLinks(
  replaceCallback: (attachment: AttachmentData, done: (err: Error | null, url?: string) => void) => void,
  done: (err: Error | null, html?: string) => void
): void;

/**
 * Convert plain text to HTML with link detection
 * @param str - Plain text string to convert
 * @returns HTML formatted string
 */
textToHtml(str: string): string;

/**
 * Format email addresses as HTML
 * @param addresses - Address objects to format
 * @returns HTML formatted address string
 */
getAddressesHTML(addresses: Address[]): string;

/**
 * Format email addresses as plain text
 * @param addresses - Address objects to format
 * @returns Plain text formatted address string
 */
getAddressesText(addresses: Address[]): string;

Important pattern for handling attachment streams correctly:
parser.on('data', data => {
  if (data.type === 'attachment') {
    // Store attachment chunks
    let chunks = [];
    let size = 0;

    data.content.on('readable', () => {
      let chunk;
      while ((chunk = data.content.read()) !== null) {
        chunks.push(chunk);
        size += chunk.length;
      }
    });

    data.content.on('end', () => {
      // Process complete attachment
      const buffer = Buffer.concat(chunks);
      console.log(`Processed ${data.filename}: ${buffer.length} bytes`);
      // CRITICAL: Must call release() to continue parsing
      data.release();
    });

    data.content.on('error', err => {
      console.error('Attachment stream error:', err);
      data.release(); // Release even on error
    });
  }
});

For processing large emails efficiently:
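const { MailParser } = require('mailparser');
const fs = require('fs');
const path = require('path');

const parser = new MailParser({
  checksumAlgo: 'sha256', // Use stronger checksum if needed
  maxHtmlLengthToParse: 5 * 1024 * 1024 // Limit HTML parsing to 5MB
});

// Stream attachments to disk instead of buffering them in memory.
// data.size is only set after the content has been read, so it cannot be
// used to pick out large attachments before streaming starts. Every
// attachment must also be released, or parsing stops at the first one.
parser.on('data', data => {
  if (data.type === 'attachment') {
    // filename is optional and comes from the message: use a fallback
    // and drop any directory components
    const name = path.basename(data.filename || 'attachment.bin');
    const writeStream = fs.createWriteStream(path.join('/tmp', name));
    data.content.pipe(writeStream);
    writeStream.on('finish', () => {
      console.log(`Attachment saved: ${name}`);
      data.release();
    });
  }
});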
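
The disk-streaming handler above releases the attachment only when the write finishes, so a failed write would leave the parser waiting. One possible way to cover that case is sketched below with `pipeline` from Node's `stream/promises` module, releasing in a `finally` block. The helper name `saveAttachment` and its `dir` parameter are hypothetical and not part of mailparser, and it is an assumption that calling `release()` after a failed write is acceptable to the parser.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const { pipeline } = require('node:stream/promises');

// Hypothetical helper (not part of mailparser): streams one attachment
// into `dir` and always calls release(), even if the write fails.
async function saveAttachment(data, dir) {
  // The filename comes from the message, so keep only its last path segment
  // and fall back to a fixed name when nothing usable is left.
  const base = path.basename(data.filename || '');
  const safeName = base && base !== '.' && base !== '..' ? base : 'attachment.bin';
  const target = path.join(dir, safeName);
  try {
    // pipeline() handles backpressure and rejects if either stream errors.
    await pipeline(data.content, fs.createWriteStream(target));
    return target;
  } finally {
    // Assumption: releasing after a failed write is acceptable to the parser.
    data.release();
  }
}
```

It could be wired into the parser with something like `parser.on('data', data => { if (data.type === 'attachment') saveAttachment(data, '/tmp').catch(console.error); });`. The filename check is a basic guard against directory traversal, not a complete sanitizer.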