# Linkinator

Linkinator is a broken link checker and site crawler that provides both a CLI and an API for validating links in HTML, markdown, and local files. It supports concurrent scanning with configurable parallelism, recursive crawling, multiple link types (absolute, relative, redirects), regex-based URL filtering for inclusion and exclusion, an automatic static server for scanning local files, and multiple output formats (text, JSON, CSV).

## Package Information

- **Package Name**: linkinator
- **Package Type**: npm
- **Language**: TypeScript
- **Installation**: `npm install linkinator`

## Core Imports

```typescript
import { check, LinkChecker, LinkState, DEFAULT_USER_AGENT } from "linkinator";
```

For CommonJS:

```javascript
const { check, LinkChecker, LinkState, DEFAULT_USER_AGENT } = require("linkinator");
```

## Basic Usage

```typescript
import { check, LinkState } from "linkinator";

// Simple link checking
const result = await check({ path: "https://example.com" });
console.log(`Passed: ${result.passed}`);
console.log(`Total links: ${result.links.length}`);

// Check local files with markdown support
const localResult = await check({
  path: "./docs/",
  recurse: true,
  markdown: true,
});

// Filter broken links
const brokenLinks = localResult.links.filter(
  (link) => link.state === LinkState.BROKEN
);
```
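
Once broken links have been filtered out, it is often useful to know which page each one came from. The sketch below groups broken URLs by their `parent` field. It is illustrative only: `groupBrokenByParent`, the `"(root)"` placeholder key, and the sample data are assumptions, and the types are local copies of the shapes listed under Types so the snippet runs without installing the package or touching the network.

```typescript
// Local copies of the documented result shapes (see the Types section).
// With the real package, compare against LinkState.BROKEN instead of a string.
type LinkState = "OK" | "BROKEN" | "SKIPPED";

interface LinkResult {
  url: string;
  status?: number;
  state: LinkState;
  parent?: string;
}

interface CrawlResult {
  passed: boolean;
  links: LinkResult[];
}

// Hypothetical helper: collect broken URLs under the page that linked to them.
function groupBrokenByParent(result: CrawlResult): Map<string, string[]> {
  const byParent = new Map<string, string[]>();
  for (const link of result.links) {
    if (link.state !== "BROKEN") continue;
    // Links without a parent go under a placeholder key (an assumption of this sketch).
    const parent = link.parent ?? "(root)";
    const urls = byParent.get(parent) ?? [];
    urls.push(link.url);
    byParent.set(parent, urls);
  }
  return byParent;
}

// Hand-written sample standing in for the value returned by `await check(...)`.
const sample: CrawlResult = {
  passed: false,
  links: [
    { url: "https://example.com/", status: 200, state: "OK" },
    { url: "https://example.com/missing", status: 404, state: "BROKEN", parent: "https://example.com/" },
    { url: "https://example.com/gone", status: 410, state: "BROKEN", parent: "https://example.com/" },
    { url: "mailto:someone@example.com", state: "SKIPPED", parent: "https://example.com/" },
  ],
};

groupBrokenByParent(sample).forEach((urls, parent) => {
  console.log(`${parent}: ${urls.length} broken link(s)`);
});
```

In real use, the `sample` object would be replaced by the `CrawlResult` that `check()` resolves with.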

## Architecture

Linkinator is built around several key components:

- **Link Checker**: Core scanning engine with event-driven architecture for real-time progress tracking
- **Queue System**: Concurrent task management with configurable concurrency and retry mechanisms
- **Static Server**: Automatic HTTP server for local file scanning with markdown compilation support
- **Link Parser**: HTML parsing engine that extracts links from various HTML attributes and srcset specifications
- **Configuration System**: Flexible config file support with CLI flag override capabilities
- **CLI Interface**: Command-line tool with comprehensive options for automation and CI/CD integration
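
To make the Link Parser's `srcset` handling more concrete, here is a simplified sketch of how candidate URLs can be pulled out of a `srcset` value. This is not Linkinator's actual parser: `urlsFromSrcset` is a hypothetical name, and the sketch assumes URLs contain no commas or whitespace, which rules out inputs such as `data:` URIs.

```typescript
// Simplified sketch: split a srcset value into its candidate URLs.
// Assumption: each candidate is "<url> [descriptor]" and candidates are comma-separated.
function urlsFromSrcset(srcset: string): string[] {
  return srcset
    .split(",")
    // Keep only the URL part, dropping width/density descriptors such as "480w" or "2x".
    .map((candidate) => candidate.trim().split(/\s+/)[0] ?? "")
    // Discard empty entries left by stray or trailing commas.
    .filter((url) => url.length > 0);
}

console.log(urlsFromSrcset("hero-480.png 480w, hero-960.png 960w"));
// prints [ 'hero-480.png', 'hero-960.png' ]
```

Each extracted URL would then be resolved against the page URL and checked like any other link.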

## Capabilities

### Link Scanning

Core link checking functionality for validating URLs and local files. Available as a one-shot `check()` call that resolves once the whole crawl has finished, or as an event-driven `LinkChecker` that reports results in real time while scanning.

```typescript { .api }
function check(options: CheckOptions): Promise<CrawlResult>;

class LinkChecker extends EventEmitter {
  check(options: CheckOptions): Promise<CrawlResult>;
  on(event: 'link', listener: (result: LinkResult) => void): this;
  on(event: 'pagestart', listener: (link: string) => void): this;
  on(event: 'retry', listener: (details: RetryInfo) => void): this;
}
```

[Link Scanning](./link-scanning.md)
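
The `link` and `retry` events listed above lend themselves to simple progress reporting. The following is a minimal sketch, not part of the library: `trackProgress`, `LinkEventSource`, and `ScanProgress` are hypothetical names, and the structural `on` type is an assumption chosen so that any EventEmitter-style object, which should include a `LinkChecker` instance, can be passed in.

```typescript
// Minimal structural type: anything with an EventEmitter-style `on` method.
interface LinkEventSource {
  on(event: string, listener: (...args: any[]) => void): unknown;
}

// Subset of the documented LinkResult fields that this sketch reads.
interface LinkEvent {
  url: string;
  state: string;
}

interface ScanProgress {
  checked: number;
  broken: number;
  retries: number;
}

// Hypothetical helper: keep running totals from the 'link' and 'retry' events.
function trackProgress(source: LinkEventSource): ScanProgress {
  const progress: ScanProgress = { checked: 0, broken: 0, retries: 0 };
  source.on("link", (result: LinkEvent) => {
    progress.checked += 1;
    if (result.state === "BROKEN") progress.broken += 1;
  });
  source.on("retry", () => {
    progress.retries += 1;
  });
  return progress;
}

// Intended usage with the real package (kept as a comment: it needs linkinator and network access):
//   const checker = new LinkChecker();
//   const progress = trackProgress(checker);
//   await checker.check({ path: "https://example.com" });
//   console.log(progress);
```

The returned object is updated in place, so it can be read at any point during the scan as well as after `check()` resolves.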

### Configuration Management

Configuration system supporting both programmatic options and file-based configuration with CLI flag precedence.

```typescript { .api }
function getConfig(flags: Flags): Promise<Flags>;

interface CheckOptions {
  concurrency?: number;
  port?: number;
  path: string | string[];
  recurse?: boolean;
  timeout?: number;
  markdown?: boolean;
  linksToSkip?: string[] | ((link: string) => Promise<boolean>);
  serverRoot?: string;
  directoryListing?: boolean;
  retry?: boolean;
  retryErrors?: boolean;
  retryErrorsCount?: number;
  retryErrorsJitter?: number;
  urlRewriteExpressions?: UrlRewriteExpression[];
  userAgent?: string;
}
```

[Configuration](./configuration.md)
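
The `urlRewriteExpressions` option pairs a `RegExp` with a replacement string. As a rough illustration of what such rules can express, the sketch below applies them in order with `String.prototype.replace`. The sequential application, the `applyRewrites` helper, and the example hosts are assumptions of this sketch rather than documented behavior of the library.

```typescript
// Local copy of the documented shape (see the Types section).
interface UrlRewriteExpression {
  pattern: RegExp;
  replacement: string;
}

// Sketch: apply each rule in order, feeding every rule the previous rule's output.
function applyRewrites(url: string, rules: UrlRewriteExpression[]): string {
  return rules.reduce(
    (current, rule) => current.replace(rule.pattern, rule.replacement),
    url
  );
}

// Hypothetical rules: point staging links at a local server and drop index.html.
const rewriteRules: UrlRewriteExpression[] = [
  { pattern: /^https:\/\/staging\.example\.com/, replacement: "http://localhost:8080" },
  { pattern: /\/index\.html$/, replacement: "/" },
];

console.log(applyRewrites("https://staging.example.com/docs/index.html", rewriteRules));
// prints http://localhost:8080/docs/
```

An array like `rewriteRules` has the shape that `CheckOptions.urlRewriteExpressions` expects.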

### Command Line Interface

Command-line tool providing direct access to linkinator functionality with comprehensive options for automation and CI/CD integration.

```typescript { .api }
// CLI executable available at: linkinator
// Usage: linkinator LOCATIONS [--arguments]
```

The CLI supports all `CheckOptions` through command-line flags and can output results in multiple formats (TEXT, JSON, CSV) with configurable verbosity levels.

## Types

```typescript { .api }
interface CrawlResult {
  passed: boolean;
  links: LinkResult[];
}

interface LinkResult {
  url: string;
  status?: number;
  state: LinkState;
  parent?: string;
  failureDetails?: Array<Error | GaxiosResponse>;
}

interface GaxiosResponse {
  status: number;
  statusText: string;
  headers: Record<string, string>;
  data: any;
  config: any;
}

interface RetryInfo {
  url: string;
  secondsUntilRetry: number;
  status: number;
}

enum LinkState {
  OK = 'OK',
  BROKEN = 'BROKEN',
  SKIPPED = 'SKIPPED',
}

interface UrlRewriteExpression {
  pattern: RegExp;
  replacement: string;
}

/** Default user agent string used for HTTP requests */
const DEFAULT_USER_AGENT: string;
```
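
Because `failureDetails` mixes `Error` objects and response objects, consumers need to narrow each entry before reading it. The sketch below shows one way to do that with `instanceof`. `describeFailure` and `ResponseLike` are hypothetical names, `ResponseLike` is a trimmed local copy of the response shape above, and the sample entries are made up for illustration.

```typescript
// Trimmed local copy of the documented response shape: only the fields read here.
interface ResponseLike {
  status: number;
  statusText: string;
}

type FailureDetail = Error | ResponseLike;

// Hypothetical helper: turn one failureDetails entry into a short message.
function describeFailure(detail: FailureDetail): string {
  if (detail instanceof Error) {
    // Network-level failures (DNS, timeouts, and so on) arrive as Error objects.
    return `error: ${detail.message}`;
  }
  // Otherwise the entry is an HTTP response with a status line.
  return `HTTP ${detail.status} ${detail.statusText}`.trim();
}

// Made-up entries standing in for a LinkResult's failureDetails array.
const failureSamples: FailureDetail[] = [
  new Error("request timed out"),
  { status: 404, statusText: "Not Found" },
];

failureSamples.forEach((detail) => console.log(describeFailure(detail)));
// prints "error: request timed out" then "HTTP 404 Not Found"
```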