# Document Transforms

Utilities for modifying document elements before conversion, so that the document structure can be preprocessed with custom logic.

**Note**: The API for document transforms should be considered unstable and may change between versions. Pin to a specific version if you rely on this behavior.

## transforms.paragraph

Apply a transformation to every paragraph element in the document.

```javascript { .api }
function paragraph(transform: (element: any) => any): (element: any) => any;
```

### Parameters

- `transform`: Function that takes a paragraph element and returns the modified element

### Returns

A function that applies `transform` to every paragraph in the element tree it is given. Pass it as the `transformDocument` option.

### Usage Example

```javascript
const mammoth = require("mammoth");

function transformParagraph(element) {
    // Convert center-aligned paragraphs to headings
    if (element.alignment === "center" && !element.styleId) {
        return {...element, styleId: "Heading2"};
    }
    return element;
}

const options = {
    transformDocument: mammoth.transforms.paragraph(transformParagraph)
};

mammoth.convertToHtml({path: "document.docx"}, options);
```
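
To make the traversal concrete, here is a minimal sketch of how a type-targeted wrapper can reach paragraphs at any depth. `applyToType` is a hypothetical stand-in written for illustration, not the library's implementation, and the sample tree is hand-built with a simplified shape so the snippet runs without a `.docx` file.

```javascript
// Hypothetical stand-in for a type-targeted wrapper such as
// mammoth.transforms.paragraph. This is NOT the library's implementation.
function applyToType(type, transform) {
    return function walk(element) {
        // Rebuild children first so nested elements are visited too.
        if (element.children) {
            element = {...element, children: element.children.map(walk)};
        }
        return element.type === type ? transform(element) : element;
    };
}

function transformParagraph(element) {
    if (element.alignment === "center" && !element.styleId) {
        return {...element, styleId: "Heading2"};
    }
    return element;
}

// Hand-built sample tree (assumed shape; table rows and cells omitted for brevity).
const sampleDocument = {
    type: "document",
    children: [
        {type: "paragraph", alignment: "center", styleId: null, children: []},
        {type: "paragraph", alignment: "left", styleId: null, children: []},
        {type: "table", children: [
            {type: "paragraph", alignment: "center", styleId: null, children: []}
        ]}
    ]
};

const transformed = applyToType("paragraph", transformParagraph)(sampleDocument);

console.log(transformed.children[0].styleId);             // Heading2
console.log(transformed.children[2].children[0].styleId); // Heading2 (nested paragraph)
console.log(sampleDocument.children[0].styleId);          // null (input is not mutated)
```

Because each level is copied with the spread operator, the original tree is left untouched, which keeps the transform safe to reuse.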

## transforms.run

Apply a transformation to run elements (text runs) in the document.

```javascript { .api }
function run(transform: (element: any) => any): (element: any) => any;
```

### Parameters

- `transform`: Function that takes a run element and returns the modified element

### Returns

A transformation function that can be used with the `transformDocument` option.

### Usage Example

```javascript
const mammoth = require("mammoth");

function transformRun(element) {
    // Convert runs set in a monospace font to code.
    // `font` holds the font name as a string (or null when unset).
    if (element.font === "Courier New") {
        return {...element, styleId: "Code"};
    }
    return element;
}

const options = {
    transformDocument: mammoth.transforms.run(transformRun)
};
```
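
`transformDocument` accepts a single function, so applying both a paragraph pass and a run pass means combining them. The sketch below shows one way to do that. `composeTransforms`, `styleCenteredParagraphs`, and `styleBoldRuns` are hypothetical names introduced here; the two passes are simplified stand-ins for the functions returned by `mammoth.transforms.paragraph(...)` and `mammoth.transforms.run(...)`, so the snippet runs on its own.

```javascript
// Hypothetical helper: chains element transforms left to right.
function composeTransforms(...transforms) {
    return element => transforms.reduce(
        (current, transform) => transform(current),
        element
    );
}

// Stand-in passes. Each takes the document element and returns a new one,
// which is the same contract as a transformDocument function.
function styleCenteredParagraphs(doc) {
    return {
        ...doc,
        children: doc.children.map(paragraph =>
            paragraph.alignment === "center"
                ? {...paragraph, styleId: "Heading2"}
                : paragraph)
    };
}

function styleBoldRuns(doc) {
    return {
        ...doc,
        children: doc.children.map(paragraph => ({
            ...paragraph,
            children: paragraph.children.map(run =>
                run.isBold ? {...run, styleId: "Strong"} : run)
        }))
    };
}

// Hand-built sample: one centered paragraph containing one bold run.
const sampleDoc = {
    type: "document",
    children: [{
        type: "paragraph",
        alignment: "center",
        styleId: null,
        children: [{type: "run", isBold: true, styleId: null, children: []}]
    }]
};

const transformDocument = composeTransforms(styleCenteredParagraphs, styleBoldRuns);
const composedResult = transformDocument(sampleDoc);

console.log(composedResult.children[0].styleId);             // Heading2
console.log(composedResult.children[0].children[0].styleId); // Strong
```

With the library installed, the same helper could presumably be given `mammoth.transforms.paragraph(transformParagraph)` and `mammoth.transforms.run(transformRun)` as its arguments, since both return element-to-element functions.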

## transforms.getDescendants

Get all descendant elements of a document element.

```javascript { .api }
function getDescendants(element: any): any[];
```

### Parameters

- `element`: The document element to traverse

### Returns

Array of all descendants of `element`, at every depth of the tree. The element itself is not included.

### Usage Example

```javascript
const mammoth = require("mammoth");

function analyzeDocument(documentElement) {
    const allDescendants = mammoth.transforms.getDescendants(documentElement);
    console.log(`Document contains ${allDescendants.length} elements`);

    allDescendants.forEach(function(descendant) {
        console.log(`Element type: ${descendant.type}`);
    });
}
```
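
As an illustration of what can be done with a flat list of descendants, the sketch below tallies elements by type. `collectDescendants` is a hypothetical stand-in for `getDescendants` so that the snippet runs without the library; the real function may return the same elements in a different order. `countByType` and the sample tree are also assumptions made for this example.

```javascript
// Stand-in for mammoth.transforms.getDescendants (assumption: the real
// function may order its results differently).
function collectDescendants(element) {
    const children = element.children || [];
    return children.flatMap(child => [child, ...collectDescendants(child)]);
}

// Tally elements by their `type` property.
function countByType(elements) {
    const counts = {};
    for (const element of elements) {
        counts[element.type] = (counts[element.type] || 0) + 1;
    }
    return counts;
}

// Hand-built sample paragraph with two runs, each holding one text element.
const sampleParagraph = {
    type: "paragraph",
    children: [
        {type: "run", children: [{type: "text", value: "Hello, "}]},
        {type: "run", children: [{type: "text", value: "world"}]}
    ]
};

const counts = countByType(collectDescendants(sampleParagraph));
console.log(counts); // { run: 2, text: 2 }
```

The paragraph itself is absent from the tally because only its descendants are collected.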

## transforms.getDescendantsOfType

Get all descendant elements of a specific type from a document element.

```javascript { .api }
function getDescendantsOfType(element: any, type: string): any[];
```

### Parameters

- `element`: The document element to traverse
- `type`: The element type to filter for (e.g., "paragraph", "run", "table")

### Returns

Array of descendant elements matching the specified type.

### Usage Example

```javascript
const mammoth = require("mammoth");

function countParagraphs(documentElement) {
    const paragraphs = mammoth.transforms.getDescendantsOfType(documentElement, "paragraph");
    console.log(`Document contains ${paragraphs.length} paragraphs`);
    return paragraphs;
}

function findTables(documentElement) {
    const tables = mammoth.transforms.getDescendantsOfType(documentElement, "table");
    return tables;
}
```
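
A common use of type filtering is recovering the plain text of a paragraph, whose text elements are nested inside runs rather than attached to the paragraph directly. The following is a minimal sketch under that assumption. `descendantsOfType` is a hypothetical stand-in for `getDescendantsOfType`, and `paragraphText` and the sample tree are invented for illustration.

```javascript
// Stand-in for mammoth.transforms.getDescendantsOfType (illustrative only).
function descendantsOfType(element, type) {
    const children = element.children || [];
    return children.flatMap(child => [
        ...(child.type === type ? [child] : []),
        ...descendantsOfType(child, type)
    ]);
}

// Join the values of every text element found anywhere below the paragraph.
function paragraphText(paragraph) {
    return descendantsOfType(paragraph, "text")
        .map(textElement => textElement.value)
        .join("");
}

// Hand-built sample document with two paragraphs.
const sampleBody = {
    type: "document",
    children: [
        {type: "paragraph", children: [
            {type: "run", children: [{type: "text", value: "Release "}]},
            {type: "run", children: [{type: "text", value: "notes"}]}
        ]},
        {type: "paragraph", children: [
            {type: "run", children: [{type: "text", value: "TODO: add changelog"}]}
        ]}
    ]
};

const paragraphTexts = descendantsOfType(sampleBody, "paragraph").map(paragraphText);
console.log(paragraphTexts); // [ 'Release notes', 'TODO: add changelog' ]
```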

## Manual Element Transformation

For more complex transformations, you can write your own recursive transformation function:

```javascript { .api }
function transformElement(element) {
    // Transform children first, copying the element rather than mutating it
    if (element.children) {
        const children = element.children.map(transformElement);
        element = {...element, children: children};
    }

    // Apply specific transformations based on element type
    // (transformParagraph and transformRun as in the usage examples above)
    if (element.type === "paragraph") {
        return transformParagraph(element);
    } else if (element.type === "run") {
        return transformRun(element);
    }

    return element;
}
```

### Usage Example

```javascript
const mammoth = require("mammoth");

function transformElement(element) {
    // Recursively transform children first
    if (element.children) {
        const children = element.children.map(transformElement);
        element = {...element, children: children};
    }

    // Transform paragraphs
    if (element.type === "paragraph") {
        // Convert center-aligned paragraphs to headings
        if (element.alignment === "center" && !element.styleId) {
            return {...element, styleId: "Heading2"};
        }

        // Convert paragraphs with specific text patterns.
        // Text elements sit inside runs, so collect them from all descendants
        // rather than from the paragraph's direct children.
        const text = mammoth.transforms.getDescendantsOfType(element, "text")
            .map(child => child.value)
            .join("");

        if (text.startsWith("TODO:")) {
            return {...element, styleId: "TodoItem"};
        }
    }

    // Transform runs
    if (element.type === "run") {
        // Convert monospace font runs to code (`font` is the font name string)
        if (element.font === "Courier New") {
            return {...element, styleId: "Code"};
        }
    }

    return element;
}

const options = {
    transformDocument: transformElement
};

mammoth.convertToHtml({path: "document.docx"}, options);
```
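
When the chain of `if` checks on `element.type` grows, one possible refactoring is to look up a handler by type. The sketch below is only one way to organize this and is not part of the library. `createTransformer`, the `handlers` map, and the `"Strong"` style ID are hypothetical names chosen for the example, which runs on a hand-built tree.

```javascript
// Hypothetical factory: builds a recursive transform from a Map of
// element type -> handler function.
function createTransformer(handlers) {
    return function transformElement(element) {
        // Transform children first, copying instead of mutating.
        if (element.children) {
            element = {...element, children: element.children.map(transformElement)};
        }
        const handler = handlers.get(element.type);
        return handler ? handler(element) : element;
    };
}

const handlers = new Map([
    ["paragraph", p => (p.alignment === "center" && !p.styleId
        ? {...p, styleId: "Heading2"}
        : p)],
    ["run", r => (r.isBold ? {...r, styleId: "Strong"} : r)]
]);

// Hand-built sample tree (assumed shape).
const sampleTree = {
    type: "document",
    children: [{
        type: "paragraph",
        alignment: "center",
        styleId: null,
        children: [{
            type: "run",
            isBold: true,
            styleId: null,
            children: [{type: "text", value: "Title"}]
        }]
    }]
};

const transformTree = createTransformer(handlers);
const dispatched = transformTree(sampleTree);

console.log(dispatched.children[0].styleId);             // Heading2
console.log(dispatched.children[0].children[0].styleId); // Strong
```

Element types without a registered handler, such as `"text"` here, pass through unchanged.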

## Common Element Types

Document elements you might encounter during transformation:

- `"paragraph"`: Paragraph elements
- `"run"`: Text runs within paragraphs
- `"text"`: Text content
- `"table"`: Table elements
- `"tableRow"`: Table row elements
- `"tableCell"`: Table cell elements
- `"hyperlink"`: Link elements
- `"image"`: Image elements
- `"break"`: Break elements, such as line breaks
- `"noteReference"`: Footnote and endnote references

## Element Properties

Common properties found on document elements:

### Paragraph Elements
- `type`: "paragraph"
- `styleId`: Style identifier from the document
- `styleName`: Human-readable style name
- `alignment`: Text alignment ("left", "center", "right", "justify")
- `children`: Array of child elements

### Run Elements
- `type`: "run"
- `font`: Font name as a string, or `null` if none is set
- `isBold`: Boolean indicating bold formatting
- `isItalic`: Boolean indicating italic formatting
- `isUnderline`: Boolean indicating underline formatting
- `isStrikethrough`: Boolean indicating strikethrough formatting
- `verticalAlignment`: "superscript", "subscript", or "baseline"
- `children`: Array of child elements (usually text)

### Text Elements
- `type`: "text"
- `value`: The actual text content
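
To show how these properties might be read inside a transform, here is a small sketch that summarizes the formatting of a run. `activeFormatting` and `FORMAT_FLAGS` are hypothetical names, and the sample run is hand-built from the properties listed above rather than taken from a parsed document.

```javascript
// Boolean formatting flags listed for run elements.
const FORMAT_FLAGS = ["isBold", "isItalic", "isUnderline", "isStrikethrough"];

// Return the names of the formatting features that are active on a run.
function activeFormatting(run) {
    const flags = FORMAT_FLAGS.filter(flag => run[flag] === true);
    // Only "superscript" and "subscript" are treated as a vertical shift.
    if (run.verticalAlignment === "superscript" || run.verticalAlignment === "subscript") {
        flags.push(run.verticalAlignment);
    }
    return flags;
}

// Hand-built sample run (assumed shape).
const sampleRun = {
    type: "run",
    isBold: true,
    isItalic: false,
    isUnderline: true,
    isStrikethrough: false,
    verticalAlignment: "superscript",
    children: [{type: "text", value: "2"}]
};

console.log(activeFormatting(sampleRun)); // [ 'isBold', 'isUnderline', 'superscript' ]
```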