# Document API

Core functionality for loading PDF documents from URLs, binary data, or streams. Handles document-level operations like metadata extraction, page navigation, and resource management.

## Capabilities

### Document Loading

`getDocument` is the primary entry point for loading PDF documents. It accepts a URL, binary data, or a `DocumentInitParameters` object, and returns a loading task rather than the document itself.

```javascript { .api }
/**
 * Loads a PDF document from various sources
 * @param src - Document source (URL, binary data, or parameters object)
 * @returns Promise-based loading task for document access
 */
function getDocument(src: string | Uint8Array | ArrayBuffer | DocumentInitParameters): PDFDocumentLoadingTask;

interface DocumentInitParameters {
  /** URL to the PDF document */
  url?: string;
  /** Binary PDF data as typed array */
  data?: Uint8Array | ArrayBuffer;
  /** HTTP headers for document requests */
  httpHeaders?: Record<string, string>;
  /** Include credentials in cross-origin requests */
  withCredentials?: boolean;
  /** Password for encrypted PDFs */
  password?: string;
  /** Expected document length for optimization */
  length?: number;
  /** Custom data range transport */
  range?: PDFDataRangeTransport;
  /** Custom worker instance */
  worker?: PDFWorker;
  /** Logging verbosity level (0-5) */
  verbosity?: number;
  /** Base URL for relative links */
  docBaseUrl?: string;
  /** URL path to character mapping files */
  cMapUrl?: string;
  /** Whether character maps are binary packed */
  cMapPacked?: boolean;
  /** Custom character map reader factory */
  CMapReaderFactory?: any;
  /** Use system fonts when available */
  useSystemFonts?: boolean;
  /** URL path to standard font data */
  standardFontDataUrl?: string;
  /** Custom standard font data factory */
  StandardFontDataFactory?: any;
  /** Use worker for fetch operations */
  useWorkerFetch?: boolean;
  /** JavaScript evaluation support */
  isEvalSupported?: boolean;
  /** OffscreenCanvas support for rendering */
  isOffscreenCanvasSupported?: boolean;
  /** Maximum canvas area in bytes */
  canvasMaxAreaInBytes?: number;
  /** Disable @font-face rules */
  disableFontFace?: boolean;
  /** Include extra font properties */
  fontExtraProperties?: boolean;
  /** Enable XFA form support */
  enableXfa?: boolean;
  /** Owner document for DOM operations */
  ownerDocument?: Document;
  /** Disable byte range requests */
  disableRange?: boolean;
  /** Disable streaming */
  disableStream?: boolean;
  /** Disable auto-fetch of missing data */
  disableAutoFetch?: boolean;
  /** Enable PDF debugging features */
  pdfBug?: boolean;
}
```

**Usage Examples:**

```javascript
import { getDocument } from "pdfjs-dist";

// Load from URL
const loadingTask = getDocument("https://example.com/document.pdf");
const pdf = await loadingTask.promise;

// Load from binary data
const arrayBuffer = await fetch("document.pdf").then(r => r.arrayBuffer());
const loadingTask2 = getDocument(new Uint8Array(arrayBuffer));
const pdf2 = await loadingTask2.promise;

// Load with configuration
const loadingTask3 = getDocument({
  url: "document.pdf",
  httpHeaders: { "Authorization": "Bearer token" },
  cMapUrl: "./cmaps/",
  cMapPacked: true
});
const pdf3 = await loadingTask3.promise;
```
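
Because `getDocument` returns a loading task that holds resources until it is destroyed, it can help to pair every load with a guaranteed cleanup. The following is a minimal sketch of that pattern, not part of the library: `withDocument` is a hypothetical helper name, and it only assumes the task exposes the `promise` property and `destroy()` method described in this document.

```javascript
// Hypothetical helper (not a pdfjs-dist export): run `use(pdf)` and always
// release the loading task afterwards, whether loading or `use` fails.
// `loadingTask` is anything shaped like PDFDocumentLoadingTask.
async function withDocument(loadingTask, use) {
  try {
    const pdf = await loadingTask.promise;
    return await use(pdf);
  } finally {
    // Awaiting is harmless if destroy() returns nothing and correct if it
    // returns a promise.
    await loadingTask.destroy();
  }
}
```

With the library imported, a call might look like `const pageCount = await withDocument(getDocument("document.pdf"), (pdf) => pdf.numPages);`.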

### Document Loading Task

Represents an ongoing document loading operation with progress tracking and cancellation support.

```javascript { .api }
interface PDFDocumentLoadingTask {
  /** Promise that resolves to the loaded PDF document */
  promise: Promise<PDFDocumentProxy>;
  /** Destroy/cancel the loading task; resolves once teardown has finished */
  destroy(): Promise<void>;
  /** Document loading progress callback */
  onProgress?: (progressData: OnProgressParameters) => void;
  /** Password required callback for encrypted documents */
  onPassword?: (updatePassword: (password: string) => void, reason: number) => void;
}

interface OnProgressParameters {
  /** Bytes loaded so far */
  loaded: number;
  /** Total bytes to load (if known) */
  total?: number;
  /** Loading progress percentage */
  percent?: number;
}
```
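
The two callbacks are plain properties assigned on the task before awaiting its `promise`. Below is a small sketch of how they might be wired up. `attachCallbacks`, `onPercent`, and `askPassword` are hypothetical names introduced for illustration; the percentage is computed from `loaded` and `total` so the sketch does not depend on `percent` being populated.

```javascript
// Hypothetical helper: attach progress and password callbacks to a loading task.
// `onPercent(n)` receives an integer 0-100; `askPassword(reason)` is an assumed
// caller-supplied function that returns the password string to try.
function attachCallbacks(loadingTask, { onPercent, askPassword }) {
  loadingTask.onProgress = ({ loaded, total }) => {
    // `total` may be undefined when the size is unknown, so guard the division.
    if (total > 0) {
      onPercent(Math.min(100, Math.round((loaded / total) * 100)));
    }
  };
  loadingTask.onPassword = (updatePassword, reason) => {
    // `reason` tells the caller why a password is being requested.
    updatePassword(askPassword(reason));
  };
  return loadingTask;
}
```

In PDF.js the `reason` values are, to my knowledge, exported as `PasswordResponses` (need password versus incorrect password); treat that mapping as something to confirm against the version you use.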

### PDF Document Proxy

Main interface for interacting with a loaded PDF document, providing access to pages, metadata, and document-level operations.

```javascript { .api }
interface PDFDocumentProxy {
  /** Number of pages in the document */
  numPages: number;
  /** Document fingerprints for caching */
  fingerprints: string[];
  /** Loading parameters used */
  loadingParams: DocumentInitParameters;
  /** Loading task that created this document */
  loadingTask: PDFDocumentLoadingTask;

  /**
   * Get a specific page by number (1-indexed)
   * @param pageNumber - Page number (1 to numPages)
   * @returns Promise resolving to page proxy
   */
  getPage(pageNumber: number): Promise<PDFPageProxy>;

  /**
   * Get page index from page reference
   * @param ref - Page reference object
   * @returns Promise resolving to 0-based page index
   */
  getPageIndex(ref: RefProxy): Promise<number>;

  /**
   * Get named destinations in the document
   * @returns Promise resolving to destination mapping
   */
  getDestinations(): Promise<{ [name: string]: any }>;

  /**
   * Get specific destination by ID
   * @param id - Destination identifier
   * @returns Promise resolving to destination array
   */
  getDestination(id: string): Promise<any[] | null>;

  /**
   * Get document outline/bookmarks
   * @returns Promise resolving to outline tree
   */
  getOutline(): Promise<any[]>;

  /**
   * Get document permissions
   * @returns Promise resolving to permission flags, or null when the
   *   document defines no permissions
   */
  getPermissions(): Promise<number[] | null>;

  /**
   * Get document metadata
   * @returns Promise resolving to metadata object
   */
  getMetadata(): Promise<{ info: any; metadata: Metadata | null; contentDispositionFilename?: string }>;

  /**
   * Get document data as Uint8Array
   * @returns Promise resolving to document bytes
   */
  getData(): Promise<Uint8Array>;

  /**
   * Get download info for saving
   * @returns Promise resolving to download information
   */
  getDownloadInfo(): Promise<{ length: number }>;

  /**
   * Get document statistics
   * @returns Promise resolving to stats object
   */
  getStats(): Promise<{ streamTypes: any; fontTypes: any }>;

  /**
   * Get page labels/numbering information
   * @returns Promise resolving to label array
   */
  getPageLabels(): Promise<string[] | null>;

  /**
   * Get page layout setting
   * @returns Promise resolving to layout name
   */
  getPageLayout(): Promise<string>;

  /**
   * Get page mode setting
   * @returns Promise resolving to mode name
   */
  getPageMode(): Promise<string>;

  /**
   * Get viewer preferences
   * @returns Promise resolving to preferences object
   */
  getViewerPreferences(): Promise<any>;

  /**
   * Get document attachments
   * @returns Promise resolving to attachments object
   */
  getAttachments(): Promise<{ [filename: string]: any }>;

  /**
   * Get document open action
   * @returns Promise resolving to open action
   */
  getOpenAction(): Promise<any>;

  /**
   * Get optional content configuration
   * @param params - Configuration parameters
   * @returns Promise resolving to config object
   */
  getOptionalContentConfig(params?: { intent?: string }): Promise<OptionalContentConfig>;

  /**
   * Get mark info for accessibility
   * @returns Promise resolving to mark info object
   */
  getMarkInfo(): Promise<any>;

  /**
   * Get annotations filtered by type
   * @param types - Array of annotation types to include
   * @param pageIndexesToSkip - Page indices to skip
   * @returns Promise resolving to annotations array
   */
  getAnnotationsByType(types: number[], pageIndexesToSkip?: number[]): Promise<any[]>;

  /**
   * Get JavaScript actions in document
   * @returns Promise resolving to actions object
   */
  getJSActions(): Promise<{ [name: string]: any }>;

  /**
   * Get field objects for forms
   * @returns Promise resolving to field mapping
   */
  getFieldObjects(): Promise<{ [id: string]: any }>;

  /**
   * Check if document has JavaScript actions
   * @returns Promise resolving to boolean
   */
  hasJSActions(): Promise<boolean>;

  /**
   * Get calculate order for form fields
   * @returns Promise resolving to field order array
   */
  getCalculationOrderIds(): Promise<string[]>;

  /**
   * Clean up document resources
   * @param keepLoadedFonts - Keep loaded fonts in memory
   * @returns Promise resolving when clean-up has finished
   */
  cleanup(keepLoadedFonts?: boolean): Promise<void>;

  /**
   * Destroy document and release all resources
   * @returns Promise resolving when destruction has finished
   */
  destroy(): Promise<void>;

  /**
   * Get structure tree for accessibility
   * @returns Promise resolving to structure tree
   */
  getStructTree(pageIndex: number): Promise<any>;

  /**
   * Save document with annotations
   * @param annotationStorage - Annotation storage to include
   * @param filename - Filename for saved document
   * @param options - Save options
   * @returns Promise resolving to saved document bytes
   */
  saveDocument(annotationStorage?: AnnotationStorage, filename?: string, options?: any): Promise<Uint8Array>;

  /**
   * Get cached page number for reference
   * @param ref - Page reference object
   * @returns Cached page number or null
   */
  cachedPageNumber(ref: RefProxy): number | null;
}
```

**Usage Examples:**

```javascript
import { getDocument } from "pdfjs-dist";

// Load and inspect document
const pdf = await getDocument("document.pdf").promise;

console.log(`Document has ${pdf.numPages} pages`);

// Get metadata
const metadata = await pdf.getMetadata();
console.log("Title:", metadata.info.Title);
console.log("Author:", metadata.info.Author);

// Get first page
const page = await pdf.getPage(1);

// Get outline
const outline = await pdf.getOutline();
if (outline) {
  console.log("Document has bookmarks");
}

// Check permissions (null means the document sets no restrictions)
const permissions = await pdf.getPermissions();
const canPrint = permissions === null || permissions.includes(4); // PRINT permission
```
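
`getDestination` and `getPageIndex` are typically combined to turn an outline entry or link target into a page number. The sketch below shows one way to do that under stated assumptions: `destinationToPageNumber` is a hypothetical helper, and it assumes a destination is either a name (string) or an explicit array whose first element is a page reference or, in some documents, an integer page index.

```javascript
// Hypothetical helper: resolve a destination to a 1-based page number.
// `pdf` is anything shaped like PDFDocumentProxy (getDestination, getPageIndex).
async function destinationToPageNumber(pdf, dest) {
  // Named destinations are strings and must be looked up first.
  const explicitDest =
    typeof dest === "string" ? await pdf.getDestination(dest) : dest;
  if (!Array.isArray(explicitDest) || explicitDest.length === 0) {
    return null; // Unknown or malformed destination.
  }
  const target = explicitDest[0];
  if (Number.isInteger(target)) {
    return target + 1; // Already a 0-based page index.
  }
  // Otherwise treat it as a page reference; getPageIndex is 0-based.
  const pageIndex = await pdf.getPageIndex(target);
  return pageIndex + 1;
}
```

The result can be passed straight to `pdf.getPage(...)`, which expects a 1-based page number.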

### Data Range Transport

Custom transport mechanism for handling byte-range requests, useful for streaming large documents or custom data sources.

```javascript { .api }
class PDFDataRangeTransport {
  /**
   * Constructor for custom data range transport
   * @param length - Total data length
   * @param initialData - Initial chunk of data
   * @param progressiveDone - Whether progressive loading is complete
   * @param contentDispositionFilename - Suggested filename
   */
  constructor(
    length: number,
    initialData: Uint8Array,
    progressiveDone?: boolean,
    contentDispositionFilename?: string
  );

  /**
   * Request a specific data range
   * @param begin - Start byte position
   * @param end - End byte position
   */
  requestDataRange(begin: number, end: number): void;

  /**
   * Abort all pending requests
   * @param reason - Abort reason
   */
  abort(reason?: any): void;
}
```
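
A custom transport is usually written as a subclass that overrides `requestDataRange` and hands the requested bytes back to the library. The following is a rough sketch that serves ranges from an in-memory `Uint8Array`. Several things here are assumptions rather than facts stated above: `createInMemoryTransport` is a hypothetical name, the base class is passed in as a parameter so the sketch has no imports, `end` is treated as exclusive, and the bytes are delivered through an `onDataRange(begin, chunk)` method that the listing above does not show (it exists in PDF.js's implementation as far as I know, but verify it for your version).

```javascript
// Sketch: a transport that answers range requests from bytes already in memory.
// `BaseTransport` is expected to be PDFDataRangeTransport from pdfjs-dist.
function createInMemoryTransport(BaseTransport, bytes, initialChunkSize = 65536) {
  class InMemoryRangeTransport extends BaseTransport {
    requestDataRange(begin, end) {
      // Copy the slice and deliver it asynchronously, as a network-backed
      // transport would. `onDataRange` is assumed to exist on the base class.
      const chunk = bytes.slice(begin, end);
      queueMicrotask(() => this.onDataRange(begin, chunk));
    }

    abort() {
      // Nothing to cancel: no real I/O is pending in this sketch.
    }
  }
  // Hand the library the total length plus a first chunk to start parsing.
  return new InMemoryRangeTransport(bytes.length, bytes.slice(0, initialChunkSize));
}
```

The returned object would then be passed as the `range` option of `getDocument`, for example `getDocument({ range: createInMemoryTransport(PDFDataRangeTransport, bytes) })`.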

### Document Build Information

Version and build information for the PDF.js library.

```javascript { .api }
const build: {
  version: string;
  date: string;
};

const version: string;
```

**Usage Examples:**

```javascript
import { build, version } from "pdfjs-dist";

console.log(`PDF.js version: ${version}`);
console.log(`Build date: ${build.date}`);
```

## Error Handling

```javascript { .api }
class InvalidPDFException extends Error {
  constructor(msg: string);
}

class MissingPDFException extends Error {
  constructor(msg: string);
}

class PasswordException extends Error {
  constructor(msg: string, code: number);
}

class UnexpectedResponseException extends Error {
  constructor(msg: string, status: number);
}
```

Common error scenarios, each of which rejects the loading task's `promise`:

- Invalid or corrupted PDF files reject with `InvalidPDFException`
- Missing or network-inaccessible files reject with `MissingPDFException`
- Password-protected documents reject with `PasswordException` when no valid password is supplied
- Unexpected HTTP responses reject with `UnexpectedResponseException`
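
The scenarios above can be told apart in a single `catch` block. Here is a minimal sketch of one way to do that; `describeLoadError` is a hypothetical helper, and matching on `error.name` (rather than `instanceof`) is an assumption that each exception sets `name` to its class name. It reads `status` from `UnexpectedResponseException`, mirroring the constructor signature shown above.

```javascript
// Hypothetical helper: map a rejected loading promise to a readable message.
// Matching on `error.name` is an assumption; `instanceof` checks against the
// imported exception classes would work as well.
function describeLoadError(error) {
  switch (error?.name) {
    case "InvalidPDFException":
      return "The file is not a valid PDF.";
    case "MissingPDFException":
      return "The file could not be found or fetched.";
    case "PasswordException":
      return "A password is required or the supplied one was incorrect.";
    case "UnexpectedResponseException":
      return `The server responded with status ${error.status}.`;
    default:
      return `Unexpected error: ${error?.message ?? String(error)}`;
  }
}
```

A typical call site wraps the load: `try { await getDocument(url).promise; } catch (err) { showMessage(describeLoadError(err)); }`, where `showMessage` stands in for whatever the application uses to report errors.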