Tessl Tile for pypi/xhtml2pdf@0.2.0

# Document Processing

This page covers the core functions for converting HTML and CSS content to PDF documents. They are the main entry points to xhtml2pdf's conversion capabilities and handle everything from simple HTML strings to complex documents with external resources.

## Capabilities

### Main Document Conversion

`pisaDocument` is the primary function for converting HTML to PDF. It accepts several kinds of input source and output destination, along with a range of processing parameters.

```python { .api }
def pisaDocument(
    src,
    dest=None,
    dest_bytes=False,
    path="",
    link_callback=None,
    debug=0,
    default_css=None,
    xhtml=False,
    encoding=None,
    xml_output=None,
    raise_exception=True,
    capacity=100 * 1024,
    context_meta=None,
    encrypt=None,
    signature=None,
    **kwargs
):
    """
    Convert HTML to PDF with full control over processing options.

    Args:
        src: HTML source - can be:
            - str: HTML content as string
            - file-like object: Open file or BytesIO
            - filename: Path to HTML file
        dest: Output destination - can be:
            - file-like object: Open file or BytesIO for writing
            - filename: Path for output PDF file
            - None: Return PDF content in context
        dest_bytes (bool): If True and dest is None, return bytes
        path (str): Base path for resolving relative URLs and file paths
        link_callback (callable): Custom function to resolve URLs and file paths
            Signature: callback(uri, rel) -> resolved_uri
        debug (int): Debug level 0-2, higher values provide more logging
        default_css (str): Custom default CSS to apply before document CSS
        xhtml (bool): Force XHTML parsing mode instead of HTML5
        encoding (str): Character encoding for source document
            If None, encoding is auto-detected from HTML meta tags
        xml_output: XML output configuration options
        raise_exception (bool): Raise exceptions on conversion errors
        capacity (int): Memory capacity in bytes for temporary files
        context_meta (dict): Additional metadata to add to PDF context
        encrypt (dict): PDF encryption settings with keys:
            - userPassword: User password for PDF
            - ownerPassword: Owner password for PDF
            - canPrint: Allow printing (bool)
            - canModify: Allow modifications (bool)
            - canCopy: Allow copying content (bool)
            - canAnnotate: Allow annotations (bool)
        signature (dict): PDF digital signature settings
        **kwargs: Additional processing options

    Returns:
        pisaContext: Processing context object with attributes:
            - err (int): Number of errors encountered
            - warn (int): Number of warnings encountered
            - log (list): List of log messages
            - dest: Output destination (if dest_bytes=True, contains PDF bytes)
    """
```

#### Usage Examples

**Basic HTML string to PDF file:**

```python
from xhtml2pdf import pisa

html = "<html><body><h1>Hello World</h1></body></html>"
with open("output.pdf", "wb") as dest:
    result = pisa.pisaDocument(html, dest)
    if result.err:
        print(f"Errors: {result.log}")
```

**Convert with custom CSS and base path:**

```python
from xhtml2pdf import pisa

custom_css = """
@page {
    size: A4;
    margin: 2cm;
}
body { font-family: Arial; }
"""

html = """
<html>
    <body>
        <h1>Report</h1>
        <img src="chart.png" />
    </body>
</html>
"""

with open("report.pdf", "wb") as dest:
    result = pisa.pisaDocument(
        html,
        dest,
        path="/path/to/resources/",  # Base path for resolving chart.png
        default_css=custom_css,
        debug=1
    )
```

**Convert with custom link callback:**

```python
from xhtml2pdf import pisa
import os

def link_callback(uri, rel):
    """
    Resolve relative URLs to absolute file paths.
    """
    if uri.startswith(('http://', 'https://')):
        return uri

    # Convert relative paths to absolute paths
    if not os.path.isabs(uri):
        return os.path.join('/path/to/assets/', uri)
    return uri

html = '<html><body><img src="images/logo.png" /></body></html>'
with open("output.pdf", "wb") as dest:
    result = pisa.pisaDocument(html, dest, link_callback=link_callback)
```
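
When the HTML comes from an untrusted source, a callback like the one above will happily return any absolute path it is given. The following is a minimal sketch of a stricter variant that keeps local lookups inside one asset directory. The factory name `make_link_callback` and the `base_dir` argument are illustrative assumptions, not part of xhtml2pdf; only the `callback(uri, rel)` shape comes from the signature documented above.

```python
import os


def make_link_callback(base_dir):
    """Return a callback(uri, rel) that resolves local URIs inside base_dir.

    Hypothetical helper for illustration; not part of xhtml2pdf.
    """
    base_dir = os.path.abspath(base_dir)

    def link_callback(uri, rel):
        # Leave remote resources untouched.
        if uri.startswith(("http://", "https://")):
            return uri

        # Treat a leading slash as relative to base_dir as well.
        candidate = os.path.abspath(os.path.join(base_dir, uri.lstrip("/")))

        # Refuse anything that escapes base_dir, such as "../secret.txt".
        if os.path.commonpath([base_dir, candidate]) != base_dir:
            raise ValueError(f"URI escapes asset directory: {uri!r}")
        return candidate

    return link_callback
```

The returned function can be passed as `link_callback=make_link_callback("/path/to/assets/")`. Raising on an escaping path is one design choice; returning the URI unchanged or logging and skipping the resource may suit other applications better.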

**Return PDF as bytes:**

```python
from xhtml2pdf import pisa
import io

html = "<html><body><h1>Document</h1></body></html>"
output = io.BytesIO()
result = pisa.pisaDocument(html, dest=output)

if not result.err:
    pdf_bytes = output.getvalue()
    # Use pdf_bytes as needed
```

### Document Story Creation

`pisaStory` is a lower-level function that creates ReportLab story objects from HTML content, giving more granular control over the conversion process.

```python { .api }
def pisaStory(
    src,
    path="",
    link_callback=None,
    debug=0,
    default_css=None,
    xhtml=False,
    encoding=None,
    context=None,
    xml_output=None,
    **kwargs
):
    """
    Create ReportLab story from HTML source without generating PDF.

    This function provides lower-level access to the conversion process,
    allowing you to work with the ReportLab story directly before PDF generation.

    Args:
        src: HTML source (string, file-like object, or filename)
        path (str): Base path for relative resource resolution
        link_callback (callable): Custom URL/file resolution function
        debug (int): Debug level for logging (0-2)
        default_css (str): Custom default CSS stylesheet
        xhtml (bool): Use XHTML parsing mode
        encoding (str): Character encoding for source
        context (pisaContext): Existing context to use (creates new if None)
        xml_output: XML output options
        **kwargs: Additional processing options

    Returns:
        pisaContext: Processing context with story in context.story attribute
    """
```

#### Usage Example

```python
from xhtml2pdf.document import pisaStory
from reportlab.pdfgen import canvas
from reportlab.lib.pagesizes import A4

html = """
<html>
    <body>
        <h1>Chapter 1</h1>
        <p>Content here...</p>
    </body>
</html>
"""

# Create story from HTML
context = pisaStory(html, debug=1)

if not context.err:
    # Use the story with ReportLab directly
    pdf_canvas = canvas.Canvas("custom.pdf", pagesize=A4)
    # ... custom processing with context.story
    pdf_canvas.save()
```

### Error Document Generation

`pisaErrorDocument` is a utility function that generates an error document when conversion fails, so that errors can be reported in a user-friendly form.

```python { .api }
def pisaErrorDocument(dest, c):
    """
    Generate a PDF document containing error information.

    Args:
        dest: Output destination for error PDF
        c (pisaContext): Context containing error information

    Returns:
        pisaContext: Updated context after error document generation
    """
```

### PDF Encryption Helper

`get_encrypt_instance` is a utility function that creates a PDF encryption instance from encryption configuration data.

```python { .api }
def get_encrypt_instance(data):
    """
    Create PDF encryption instance from configuration data.

    Args:
        data (dict): Encryption configuration with keys:
            - userPassword (str): User password
            - ownerPassword (str): Owner password
            - canPrint (bool): Allow printing
            - canModify (bool): Allow modifications
            - canCopy (bool): Allow copying
            - canAnnotate (bool): Allow annotations

    Returns:
        Encryption instance for PDF generation
    """
```

#### Usage Example

```python
from xhtml2pdf import pisa

html = "<html><body><h1>Confidential</h1></body></html>"

encrypt_config = {
    'userPassword': 'user123',
    'ownerPassword': 'owner456',
    'canPrint': True,
    'canModify': False,
    'canCopy': False,
    'canAnnotate': False
}

with open("secure.pdf", "wb") as dest:
    result = pisa.pisaDocument(html, dest, encrypt=encrypt_config)
```
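
Because the encryption settings are a plain dictionary, a misspelled key is easy to miss. Below is a small sketch of a builder that only accepts the six keys listed above and denies every permission unless it is explicitly granted. The function name `build_encrypt_config` and the deny-by-default policy are assumptions made for illustration; xhtml2pdf itself does not provide this helper.

```python
# Permission keys taken from the encrypt settings documented above.
PERMISSION_KEYS = ("canPrint", "canModify", "canCopy", "canAnnotate")


def build_encrypt_config(user_password, owner_password, **permissions):
    """Build an encrypt dict, rejecting unknown keys (hypothetical helper)."""
    unknown = sorted(set(permissions) - set(PERMISSION_KEYS))
    if unknown:
        raise KeyError(f"Unknown permission keys: {unknown}")

    config = {
        "userPassword": user_password,
        "ownerPassword": owner_password,
    }
    # Deny by default: a permission is True only when passed as truthy.
    for key in PERMISSION_KEYS:
        config[key] = bool(permissions.get(key, False))
    return config


encrypt_config = build_encrypt_config("user123", "owner456", canPrint=True)
```

The resulting `encrypt_config` has the same shape as the dictionary in the usage example and can be passed as `encrypt=encrypt_config`.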

## Advanced Processing Options

### Memory Management

The `capacity` parameter sets the memory capacity, in bytes, used for temporary files during conversion:

- **Default**: 100 KB (`100 * 1024` bytes), suitable for most documents
- **Large documents**: Increase to 1 MB or more for better performance
- **Memory-constrained**: Decrease to 50 KB or less

```python
# `html` and `dest` are defined as in the earlier examples

# For large documents
result = pisa.pisaDocument(html, dest, capacity=1024*1024)  # 1MB

# For memory-constrained environments
result = pisa.pisaDocument(html, dest, capacity=50*1024)    # 50KB
```
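
To build intuition for what a byte capacity on a temporary file means, the standard library offers a comparable mechanism: `tempfile.SpooledTemporaryFile` keeps data in memory until it exceeds `max_size`, then moves it to disk. The sketch below is an analogy only, on the assumption that `capacity` acts as a similar threshold; it does not show xhtml2pdf's own implementation, and `spooled_write` is a hypothetical name.

```python
import tempfile


def spooled_write(chunks, capacity=100 * 1024):
    """Write chunks to a spooled buffer and report (bytes_written, spilled).

    Analogy for a capacity threshold; not xhtml2pdf's implementation.
    """
    with tempfile.SpooledTemporaryFile(max_size=capacity) as buf:
        for chunk in chunks:
            buf.write(chunk)
        # `_rolled` is a CPython implementation detail, read here only
        # to show whether the buffer has moved from memory to disk.
        return buf.tell(), buf._rolled


small = spooled_write([b"x" * 10])            # stays in memory
large = spooled_write([b"x" * (200 * 1024)])  # exceeds 100 KB, spills to disk
```

Under this model, a larger capacity trades memory for fewer disk writes, which matches the guidance in the list above.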

### Debug Levels

Each debug level provides a different amount of processing information:

- **0**: No debug output (default)
- **1**: Basic processing information and warnings
- **2**: Detailed processing steps and CSS parsing information

```python
# `html` and `dest` are defined as in the earlier examples
result = pisa.pisaDocument(html, dest, debug=2)
for log_entry in result.log:
    print(log_entry)
```

### Context Metadata

Additional metadata can be embedded in the PDF:

```python
metadata = {
    'author': 'John Doe',
    'title': 'My Document',
    'subject': 'Sample PDF',
    'creator': 'My Application'
}

result = pisa.pisaDocument(html, dest, context_meta=metadata)
```

## Return Values and Error Handling

All document processing functions return a `pisaContext` object with these key attributes:

- **`err`** (int): Number of errors encountered (0 = success)
- **`warn`** (int): Number of warnings generated
- **`log`** (list): Detailed log messages for debugging
- **`dest`**: Output destination or PDF bytes (if dest_bytes=True)

```python
result = pisa.pisaDocument(html, dest)

# Check for success
if result.err:
    print(f"Conversion failed with {result.err} errors")
    for msg in result.log:
        if 'ERROR' in str(msg):
            print(f"Error: {msg}")
else:
    print("PDF generated successfully")

# Handle warnings
if result.warn:
    print(f"Generated with {result.warn} warnings")
```
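
The same checks can be wrapped in a small function so that calling code deals with one summary value. This is a sketch under the assumption that the result exposes the `err`, `warn`, and `log` attributes listed above; `summarize_result` is a hypothetical helper, and a `SimpleNamespace` stands in for a real `pisaContext` so the snippet runs without performing a conversion.

```python
from types import SimpleNamespace


def summarize_result(result):
    """Collapse err/warn/log into a plain dict (hypothetical helper)."""
    errors = int(getattr(result, "err", 0) or 0)
    warnings = int(getattr(result, "warn", 0) or 0)
    messages = [str(msg) for msg in (getattr(result, "log", None) or [])]
    return {
        "ok": errors == 0,
        "errors": errors,
        "warnings": warnings,
        "messages": messages,
    }


# Stand-in object with the documented attributes, used instead of a real context.
fake_result = SimpleNamespace(err=0, warn=2, log=["first warning", "second warning"])
summary = summarize_result(fake_result)
```

A caller can then branch on `summary["ok"]` and log `summary["messages"]` only when `summary["warnings"]` or `summary["errors"]` is non-zero.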

## Backward Compatibility

The legacy `CreatePDF` alias for `pisaDocument` remains available:

```python { .api }
CreatePDF = pisaDocument  # Backward compatibility alias
```

```python
from xhtml2pdf.pisa import CreatePDF

# Legacy usage (deprecated but still works)
result = CreatePDF(html, dest)
```

## Types

```python { .api }
class pisaContext:
    """
    Processing context returned by document processing functions.

    Attributes:
        err (int): Error count
        warn (int): Warning count
        log (list): Processing log messages
        dest: Output destination or PDF content
        story (list): ReportLab story elements (from pisaStory)
        cssText (str): Processed CSS content
        path (str): Base path for resources
    """
```
