Convert Word documents from docx to simple HTML and Markdown
Utilities for handling underline and other styling elements in document conversion.
Create underline element with specified HTML tag name.
function element(name: string): (html: any) => any;

name: HTML tag name to use for the underline element (e.g., "u", "span", "em")

Returns a function that wraps the HTML it is given in an element with the specified tag name, for underline styling.
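The signature describes a factory: calling it with a tag name returns a second function that does the wrapping. The sketch below imitates that shape with plain strings, purely for illustration. `wrapInElement` is a hypothetical name, and mammoth's own function works on the library's internal HTML representation rather than on strings.

```javascript
// Illustration only: a string-based stand-in for the factory pattern.
// This is NOT mammoth's implementation; wrapInElement is a hypothetical name.
function wrapInElement(name) {
    // The returned function remembers `name` and wraps whatever it receives.
    return function (html) {
        return `<${name}>${html}</${name}>`;
    };
}

const asUnderline = wrapInElement("u");
console.log(asUnderline("signed here")); // <u>signed here</u>
```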
const mammoth = require("mammoth");
// Create underline element using <u> tag
const underlineElement = mammoth.underline.element("u");
// Create underline element using <span> tag (no class is applied)
const underlineSpan = mammoth.underline.element("span");

By default, underlining in Word documents is ignored because underlined text can be confused with links in HTML. However, you can enable underline processing by adding a style mapping for u.
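Because underlines are dropped unless a mapping for u is present, it can be handy to add one to an existing options object without overwriting a mapping the caller already chose. The helper below is a minimal sketch of that idea. `withUnderlineMapping` is a hypothetical name, not part of mammoth, and the sketch assumes `styleMap` is an array of mapping strings.

```javascript
// Hypothetical helper (not part of mammoth): add an underline mapping
// to an options object unless one is already present.
// Assumes options.styleMap, if given, is an array of strings.
function withUnderlineMapping(options, target) {
    const styleMap = (options && options.styleMap) || [];
    // A mapping counts as an underline mapping if it starts with "u =>".
    const alreadyMapped = styleMap.some((mapping) => /^\s*u\s*=>/.test(mapping));
    if (alreadyMapped) {
        return {...options, styleMap: [...styleMap]};
    }
    return {...options, styleMap: [...styleMap, `u => ${target}`]};
}

console.log(withUnderlineMapping({styleMap: ["b => strong"]}, "em").styleMap);
// [ 'b => strong', 'u => em' ]
```

The returned object is a copy, so the caller's original options are left untouched and can be reused for conversions where underlines should stay ignored.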
const options = {
    styleMap: [
        "u => em" // Convert underlines to emphasis
    ]
};
mammoth.convertToHtml({path: "document.docx"}, options);

// Only the first mapping that matches is applied, so list a single mapping for u.
const options = {
    styleMap: [
        "u => u" // Convert to HTML <u> tag
        // Alternatives, to be used in place of the mapping above:
        // "u => span.underline"  (convert to span with CSS class)
        // "u => strong"          (treat underlines as bold)
        // The u matcher takes no modifiers, so "u:lang(en) => em" is not a valid mapping.
    ]
};

// Convert underlines to emphasis
const emphasisOptions = {
    styleMap: ["u => em"]
};
// Convert underlines to spans with CSS class
const spanOptions = {
    styleMap: ["u => span.underlined-text"]
};
// Convert underlines to strong tags
const strongOptions = {
    styleMap: ["u => strong"]
};
// Ignore underlines completely (default behavior)
const ignoreOptions = {
    styleMap: [] // No mapping for 'u' means underlines are ignored
};

const options = {
    styleMap: [
        // Text formatting
        "b => strong",
        "i => em",
        "u => span.underline",
        "strike => del",
        // Paragraph styles
        "p[style-name='Heading 1'] => h1",
        "p[style-name='Heading 2'] => h2",
        "p[style-name='Code Block'] => pre",
        // Character styles
        "r[style-name='Code'] => code",
        "r[style-name='Emphasis'] => em"
    ]
};
mammoth.convertToHtml({path: "document.docx"}, options);

Mammoth has default behaviors for common text formatting:
Bold: <strong> tags
Italic: <em> tags
Strikethrough: <s> tags
Superscript: <sup> tags
Subscript: <sub> tags

const options = {
    styleMap: [
        "b => span.bold", // Override bold to use span instead of strong
        "i => span.italic", // Override italic to use span instead of em
        "u => u", // Enable underline (normally ignored)
        "strike => span.strike" // Override strikethrough to use span instead of s
    ]
};

When using CSS classes in style mappings, you'll need to provide corresponding CSS:
/* CSS for custom underline styles */
.underline {
    text-decoration: underline;
}
.underlined-text {
    text-decoration: underline;
    color: blue;
}
.custom-emphasis {
    text-decoration: underline;
    font-style: italic;
    color: #cc0000;
}

// Use the CSS classes in style mappings
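const options = {
    styleMap: [
        "u => span.underline",
        // u accepts no attribute selectors, so match a character style instead
        // (assumes the document defines a character style named 'Important')
        "r[style-name='Important'] => span.custom-emphasis"
    ]
};

The u matcher applies to all explicitly underlined text and accepts no attribute, class, or language modifiers. To target specific text, match a character style with r instead:

const styleMapExamples = [
    "u => em", // All explicitly underlined text to emphasis
    "r[style-name='Important'] => strong", // Runs with a specific character style name
    "r.CustomClass => span.highlight" // Runs with a specific character style ID
];

// For documents where underlines indicate emphasis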
const emphasisDocument = {
    styleMap: [
        "u => em", // Underlines become emphasis
        "b => strong", // Bold remains strong
        "i => span.italic" // Italic becomes span for styling control
    ]
};
mammoth.convertToHtml({path: "emphasis-doc.docx"}, emphasisDocument);

// For legal documents where underlines should be preserved
const legalDocument = {
    styleMap: [
        "u => u", // Preserve underlines as <u>
        "p[style-name='Signature'] => p.signature" // Special signature styles
    ]
};
mammoth.convertToHtml({path: "legal-doc.docx"}, legalDocument);

// For academic papers with specific formatting needs
const academicPaper = {
    styleMap: [
        "u => span.term", // Underlined terms
        "b => strong", // Bold for emphasis
        "i => em", // Italic for emphasis
        "p[style-name='Definition'] => div.definition" // Definition blocks
    ]
};
mammoth.convertToHtml({path: "paper.docx"}, academicPaper);
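
The three scenarios above differ only in their style maps, so they can be kept in one lookup table and selected by name. The following is a minimal sketch of that arrangement, not an established mammoth pattern. `presets`, `optionsFor`, and `convertWithPreset` are hypothetical names, and the converter is passed in as an argument so the snippet does not itself depend on mammoth being installed.

```javascript
// Hypothetical presets mirroring the emphasis, legal, and academic examples.
const presets = {
    emphasis: ["u => em", "b => strong", "i => span.italic"],
    legal: ["u => u", "p[style-name='Signature'] => p.signature"],
    academic: [
        "u => span.term",
        "b => strong",
        "i => em",
        "p[style-name='Definition'] => div.definition"
    ]
};

// Build an options object for a named preset.
function optionsFor(kind) {
    const styleMap = presets[kind];
    if (!styleMap) {
        throw new Error(`Unknown preset: ${kind}`);
    }
    return {styleMap: [...styleMap]}; // copy so callers cannot mutate the preset
}

// Convert a file with a preset. `convertToHtml` is supplied by the caller
// and is expected to resolve to an object with `value` and `messages`.
async function convertWithPreset(convertToHtml, path, kind) {
    const result = await convertToHtml({path: path}, optionsFor(kind));
    // Surface any warnings, such as style mappings that were not understood.
    result.messages.forEach((message) => console.warn(message.message));
    return result.value; // the generated HTML
}
```

In real use the converter would be a thin wrapper such as `(input, options) => mammoth.convertToHtml(input, options)`, for example `convertWithPreset(convert, "legal-doc.docx", "legal").then(console.log)`.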