# XML Validation

External validation with xmllint checks that generated sitemaps comply with the sitemap XML schemas. It adds a layer of validation beyond the built-in JavaScript validation.

## Capabilities

### xmlLint Function

Validates XML content against the official sitemap schema using the external xmllint tool.

```typescript { .api }
/**
 * Verify the passed in XML is valid using xmllint external tool
 * Requires xmllint to be installed on the system
 * @param xml - XML content as string or readable stream
 * @returns Promise that resolves on valid XML, rejects with error details
 * @throws XMLLintUnavailable if xmllint is not installed
 */
function xmlLint(xml: string | Readable): Promise<void>;
```

**Usage Examples:**

```typescript
import { xmlLint } from "sitemap";
import { createReadStream } from "fs";

// Validate XML file
try {
  await xmlLint(createReadStream("sitemap.xml"));
  console.log("Sitemap is valid!");
} catch ([error, stderr]) {
  if (error.name === 'XMLLintUnavailable') {
    console.error("xmllint is not installed");
  } else {
    console.error("Validation failed:", stderr.toString());
  }
}

// Validate XML string
const xmlContent = `<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2023-01-01T00:00:00.000Z</lastmod>
    <changefreq>daily</changefreq>
    <priority>1.0</priority>
  </url>
</urlset>`;

try {
  await xmlLint(xmlContent);
  console.log("XML is schema-compliant");
} catch (error) {
  console.error("Schema validation failed");
}
```

### Integration with Sitemap Generation

Validate generated sitemaps to ensure they meet official specifications:

```typescript
import { SitemapStream, xmlLint, streamToPromise } from "sitemap";

async function generateAndValidateSitemap() {
  // Create sitemap
  const sitemap = new SitemapStream({
    hostname: "https://example.com"
  });

  sitemap.write({ url: "/", changefreq: "daily", priority: 1.0 });
  sitemap.write({ url: "/about", changefreq: "monthly", priority: 0.7 });
  sitemap.end();

  // Get XML content
  const xmlBuffer = await streamToPromise(sitemap);

  // Validate against schema
  try {
    await xmlLint(xmlBuffer.toString());
    console.log("Generated sitemap is valid");
    return xmlBuffer;
  } catch ([error, stderr]) {
    console.error("Generated sitemap is invalid:", stderr.toString());
    throw error;
  }
}
```

## Error Handling

### XMLLintUnavailable Error

This error is thrown when xmllint is not installed on the system:

```typescript
import { xmlLint, XMLLintUnavailable } from "sitemap";

try {
  await xmlLint("<invalid>xml</invalid>");
} catch (reason) {
  // Other examples in this document destructure the rejection as
  // [error, stderr], so unwrap an array before the instanceof check.
  const error = Array.isArray(reason) ? reason[0] : reason;
  if (error instanceof XMLLintUnavailable) {
    console.error("Please install xmllint:");
    console.error("Ubuntu/Debian: sudo apt-get install libxml2-utils");
    console.error("macOS: brew install libxml2");
    console.error("Or skip validation by not using xmlLint function");
  } else {
    console.error("Validation error:", error);
  }
}
```

### Validation Error Handling

```typescript
import { xmlLint } from "sitemap";

async function validateWithFallback(xmlContent: string) {
  try {
    await xmlLint(xmlContent);
    return { isValid: true, errors: [] };
  } catch ([error, stderr]) {
    if (error && error.name === 'XMLLintUnavailable') {
      console.warn("xmllint not available, skipping schema validation");
      return { isValid: null, errors: ["xmllint not available"] };
    } else {
      const errorMessage = stderr ? stderr.toString() : error?.message || "Unknown error";
      return { isValid: false, errors: [errorMessage] };
    }
  }
}
```
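
The examples in this document handle the rejection in two shapes: destructured as `[error, stderr]`, and as a single error object. A small normalizing helper can make calling code tolerant of either. The following is a minimal sketch under that assumption; `describeLintFailure` and `LintFailure` are hypothetical names and are not part of the sitemap package.

```typescript
// Hypothetical helper (not part of the sitemap package): turns whatever
// xmlLint rejected with into a plain, loggable object.
interface LintFailure {
  unavailable: boolean; // true when xmllint itself appears to be missing
  message: string;      // best-effort human-readable detail
}

function describeLintFailure(reason: unknown): LintFailure {
  // Assumption: the rejection is either [error, stderr] or a bare error.
  const [error, stderr] = Array.isArray(reason) ? reason : [reason, undefined];

  const name = error instanceof Error ? error.name : "";
  const unavailable = name === "XMLLintUnavailable";

  // Prefer xmllint's stderr output; fall back to the error message.
  const stderrText = stderr == null ? "" : String(stderr).trim();
  const message =
    stderrText ||
    (error instanceof Error ? error.message : "") ||
    "Unknown error";

  return { unavailable, message };
}
```

Inside a `catch (reason)` block, `describeLintFailure(reason)` can then replace the manual destructuring, which keeps logging code the same regardless of which shape arrives.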

## CLI Integration

The xmlLint function is also used by the command-line interface:

```bash
# Validate a sitemap file using CLI
npx sitemap --validate sitemap.xml

# Validate will output "valid" or error details
npx sitemap --validate invalid-sitemap.xml
# Output: Error details from xmllint
```

## Installation Requirements

To use xmlLint validation, you need to install xmllint on your system:

### Ubuntu/Debian

```bash
sudo apt-get install libxml2-utils
```

### macOS

```bash
brew install libxml2
```

### Windows

- Download libxml2 from http://xmlsoft.org/downloads.html
- Or use Windows Subsystem for Linux (WSL)
- Or use Docker with a Linux container

### Docker Example

```dockerfile
FROM node:18
RUN apt-get update && apt-get install -y libxml2-utils
COPY . /app
WORKDIR /app
RUN npm install
```
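
Because validation depends on an external binary, it can help to check for it once at startup rather than discovering its absence on the first validation call. The following is a minimal sketch; `isXmllintAvailable` is a hypothetical helper (not part of the sitemap package), and it assumes that `xmllint --version` exits with status 0 when the tool is installed.

```typescript
import { spawnSync } from "child_process";

// Hypothetical preflight check: returns true when an `xmllint` binary can be
// started from the current PATH and exits cleanly, false otherwise.
function isXmllintAvailable(): boolean {
  // Assumption: `xmllint --version` prints version info and exits with 0.
  const result = spawnSync("xmllint", ["--version"], { stdio: "ignore" });
  // `error` is set when the process could not be spawned (e.g. ENOENT).
  return result.error === undefined && result.status === 0;
}

if (!isXmllintAvailable()) {
  console.warn("xmllint not found on PATH; schema validation will be skipped");
}
```

The check is synchronous for simplicity, so it is best run once during startup or in a build script rather than per request.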

## Schema Validation Details

xmlLint validates against the official sitemap schemas:

- **Core sitemap**: `http://www.sitemaps.org/schemas/sitemap/0.9`
- **Image extension**: `http://www.google.com/schemas/sitemap-image/1.1`
- **Video extension**: `http://www.google.com/schemas/sitemap-video/1.1`
- **News extension**: `http://www.google.com/schemas/sitemap-news/0.9`

The validation ensures:

- Proper XML structure and encoding
- Correct namespace declarations
- Valid element nesting
- Required attributes are present
- Data types match schema requirements
- URL limits are respected (50,000 URLs max per sitemap)
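
The 50,000-URL limit can also be checked cheaply before xmllint is invoked, which avoids spawning a process for an obviously oversized file. The following is a rough sketch, not a substitute for schema validation; `countUrlEntries` and `exceedsUrlLimit` are hypothetical helpers, and the count is a regular-expression approximation rather than a real XML parse.

```typescript
const MAX_URLS_PER_SITEMAP = 50000;

// Hypothetical pre-check: counts opening <url> tags with a regular expression.
// This is an approximation, not an XML parser: it does not account for
// namespace prefixes, comments, or CDATA sections.
function countUrlEntries(xml: string): number {
  // Matches "<url>" and "<url ...>", but not "<urlset ...>".
  const matches = xml.match(/<url[\s>]/g);
  return matches ? matches.length : 0;
}

function exceedsUrlLimit(xml: string, limit: number = MAX_URLS_PER_SITEMAP): boolean {
  return countUrlEntries(xml) > limit;
}
```

A caller might run `exceedsUrlLimit(xml)` first and only hand the document to `xmlLint` when it returns `false`.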

## Advanced Usage

### Batch Validation

```typescript
import { xmlLint } from "sitemap";
import { readdir, createReadStream } from "fs";
import { promisify } from "util";

const readdirAsync = promisify(readdir);

async function validateSitemapDirectory(directory: string) {
  const files = await readdirAsync(directory);
  const sitemapFiles = files.filter(f => f.endsWith('.xml'));

  const results = await Promise.allSettled(
    sitemapFiles.map(async (file) => {
      try {
        await xmlLint(createReadStream(`${directory}/${file}`));
        return { file, valid: true };
      } catch (error) {
        return { file, valid: false, error };
      }
    })
  );

  results.forEach((result, index) => {
    if (result.status === 'fulfilled') {
      const { file, valid, error } = result.value;
      console.log(`${file}: ${valid ? 'VALID' : 'INVALID'}`);
      if (!valid) {
        console.error(`  Error: ${error}`);
      }
    }
  });
}
```

### Custom Schema Validation

While the built-in function uses the standard sitemap schema, you can use xmllint directly for custom validation:

```typescript
import { execFile } from "child_process";
import { promisify } from "util";

const execFileAsync = promisify(execFile);

async function validateAgainstCustomSchema(xmlFile: string, schemaFile: string) {
  try {
    await execFileAsync('xmllint', [
      '--schema', schemaFile,
      '--noout',
      xmlFile
    ]);
    return true;
  } catch (error) {
    console.error("Custom schema validation failed:", error);
    return false;
  }
}
```

## Best Practices

1. **Optional Validation**: Always handle `XMLLintUnavailable` gracefully
2. **CI/CD Integration**: Include xmllint in build containers for automated validation
3. **Development vs Production**: Use validation in development and testing; consider skipping it in production for performance
4. **Error Reporting**: Capture and log validation errors for debugging
5. **Schema Updates**: Keep xmllint updated to support the latest sitemap specifications
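
The development-versus-production practice can be expressed as a small policy function. The following is a sketch only: `shouldValidateSitemaps` is a hypothetical helper, and the `SITEMAP_VALIDATE` override variable is an illustrative assumption that the sitemap package itself does not read.

```typescript
// Hypothetical policy helper: decides whether to run schema validation.
// Assumes the conventional NODE_ENV variable plus an illustrative
// SITEMAP_VALIDATE override; neither is read by the sitemap package.
function shouldValidateSitemaps(env: Record<string, string | undefined>): boolean {
  const override = env["SITEMAP_VALIDATE"];
  // An explicit override wins in either direction.
  if (override === "1" || override === "true") return true;
  if (override === "0" || override === "false") return false;
  // Default: validate everywhere except production.
  return env["NODE_ENV"] !== "production";
}
```

In a Node.js application this would typically be called as `shouldValidateSitemaps(process.env)` before invoking `xmlLint`. Taking the environment as a parameter keeps the policy easy to unit test.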