Tessl Tile for pypi/pelican@4.11.0

# Content Reading

Reader classes for parsing different markup formats, including Markdown, reStructuredText, and HTML. Readers extract metadata, process content, and convert markup to HTML for theme rendering.

## Capabilities

### Readers Manager

Central reader manager that coordinates the format-specific readers and caches parsed results to speed up rebuilds.

```python { .api }
class Readers(FileStampDataCacher):
    """
    Content reader manager with caching support.

    Parameters:
    - settings (dict): Site configuration dictionary
    - cache_name (str, optional): Cache identifier for file caching
    """
    def __init__(self, settings: dict, cache_name: str = ""): ...

    def read_file(
        self,
        base_path: str,
        path: str,
        content_class=Page,
        fmt: str = None
    ) -> Content:
        """
        Read and parse a content file.

        Parameters:
        - base_path (str): Base directory path
        - path (str): File path relative to base_path
        - content_class (class, optional): Content class to instantiate (default: Page)
        - fmt (str, optional): Force a specific format reader instead of inferring it from the file extension

        Returns:
        Content: Parsed content object with metadata and HTML content
        """

    # Available readers (populated from settings)
    readers: dict[str, BaseReader]  # Format -> Reader mapping
```
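
As a rough illustration of the format-to-reader mapping described above, the sketch below picks a reader by file extension. `SketchReader`, its subclasses, `build_reader_map`, and `pick_reader` are hypothetical stand-ins that do not import Pelican; the real `Readers` class also handles caching and content instantiation.

```python
import os


class SketchReader:
    """Hypothetical stand-in for a format reader (not Pelican's BaseReader)."""
    file_extensions = ["static"]


class SketchMarkdownReader(SketchReader):
    file_extensions = ["md", "markdown", "mkd"]


class SketchRstReader(SketchReader):
    file_extensions = ["rst"]


def build_reader_map(reader_classes):
    """Map each supported extension to an instance of its reader class."""
    mapping = {}
    for cls in reader_classes:
        for ext in cls.file_extensions:
            mapping[ext] = cls()
    return mapping


def pick_reader(readers, path, fmt=None):
    """Return the reader for `path`; passing `fmt` forces a specific format."""
    if fmt is None:
        # Infer the format from the file extension, without the leading dot.
        fmt = os.path.splitext(path)[1].lstrip(".").lower()
    if fmt not in readers:
        raise KeyError(f"no reader registered for format {fmt!r}")
    return readers[fmt]


readers = build_reader_map([SketchMarkdownReader, SketchRstReader])
print(type(pick_reader(readers, "articles/my-post.md")).__name__)
print(type(pick_reader(readers, "notes.txt", fmt="rst")).__name__)
```

The second call shows the effect of forcing a format: the `.txt` extension is ignored and the reStructuredText stand-in is selected.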

### Base Reader Class

Foundation class for all content format readers. It provides the shared interface for reading a file and processing its metadata.

```python { .api }
class BaseReader:
    """
    Base class for content format readers.

    Parameters:
    - settings (dict): Site configuration dictionary
    """
    def __init__(self, settings: dict): ...

    enabled: bool = True  # Whether this reader is enabled
    file_extensions: list[str]  # Supported file extensions

    def read(self, source_path: str) -> tuple[str, dict]:
        """
        Read and parse content file.

        Parameters:
        - source_path (str): Path to content file

        Returns:
        tuple: (HTML content string, metadata dictionary)
        """

    def process_metadata(self, name: str, value: str) -> Any:
        """
        Process an individual metadata field.

        Parameters:
        - name (str): Metadata field name
        - value (str): Raw metadata value

        Returns:
        Any: Processed value; fields without a registered processor are returned unchanged
        """
```

### reStructuredText Reader

Reader for reStructuredText (.rst) files using the docutils library for parsing and HTML generation.

```python { .api }
class RstReader(BaseReader):
    """
    reStructuredText content reader.

    Supports:
    - Standard reStructuredText syntax
    - Custom Pelican directives (code highlighting, etc.)
    - Metadata extraction from docutils meta fields
    - Math rendering via MathJax
    - Custom role and directive registration
    """

    file_extensions: list[str] = ['rst']

    def read(self, source_path: str) -> tuple[str, dict]:
        """
        Parse reStructuredText file and extract content/metadata.

        Uses docutils for parsing with Pelican-specific settings and directives.
        Supports custom roles and directives for enhanced functionality.
        """
```

### Markdown Reader

Reader for Markdown (.md, .markdown, .mkd, .mdown) files using the Python-Markdown library with configurable extensions.

```python { .api }
class MarkdownReader(BaseReader):
    """
    Markdown content reader.

    Supports:
    - Standard Markdown syntax
    - Configurable Python-Markdown extensions
    - Metadata extraction from YAML front matter or meta extension
    - Code highlighting via Pygments
    - Table support, footnotes, and other extensions
    """

    file_extensions: list[str] = ['md', 'markdown', 'mkd', 'mdown']

    def read(self, source_path: str) -> tuple[str, dict]:
        """
        Parse Markdown file and extract content/metadata.

        Uses Python-Markdown with configurable extensions.
        Metadata can be extracted from YAML front matter or meta extension.
        """
```

### HTML Reader

Reader for HTML (.html, .htm) files that extracts metadata from HTML meta tags and preserves HTML content.

```python { .api }
class HTMLReader(BaseReader):
    """
    HTML content reader.

    Supports:
    - Raw HTML content preservation
    - Metadata extraction from HTML meta tags
    - Title extraction from <title> tag
    - Custom metadata via <meta> tags
    """

    file_extensions: list[str] = ['html', 'htm']

    def read(self, source_path: str) -> tuple[str, dict]:
        """
        Parse HTML file and extract content/metadata.

        Extracts metadata from HTML meta tags and preserves HTML content as-is.
        Useful for importing existing HTML content or custom layouts.
        """
```

## Reader Configuration

### Markdown Configuration

Configure Markdown reader behavior in settings:

```python
# In pelicanconf.py
MARKDOWN = {
    'extension_configs': {
        'markdown.extensions.codehilite': {'css_class': 'highlight'},
        'markdown.extensions.extra': {},
        'markdown.extensions.meta': {},
        'markdown.extensions.toc': {'permalink': True},
    },
    'output_format': 'html5',
}
```

### reStructuredText Configuration

Configure reStructuredText reader behavior:

```python
# In pelicanconf.py
DOCUTILS_SETTINGS = {
    'smart_quotes': True,
    'initial_header_level': 2,
    'syntax_highlight': 'short',
    'input_encoding': 'utf-8',
    'math_output': 'MathJax',
}
```

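### Custom Readers

Register custom readers for additional formats by mapping each file extension to a reader class:

```python
# In pelicanconf.py
# READERS maps file extensions to reader classes, so import the classes first.
# `path.to.custom` is a placeholder module path.
from path.to.custom import TextReader, OrgModeReader

READERS = {
    'txt': TextReader,
    'org': OrgModeReader,
}
```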

## Metadata Processing

### Common Metadata Fields

All readers process these standard metadata fields:

- `title`: Content title
- `date`: Publication date (ISO format or custom format)
- `modified`: Last modification date
- `category`: Content category (articles only)
- `tags`: Comma-separated tags (articles only)
- `slug`: URL slug (auto-generated if not provided)
- `author`: Author name
- `authors`: Multiple authors (comma-separated)
- `summary`: Content summary/description
- `lang`: Content language code
- `status`: Content status (published, draft, hidden)
- `template`: Custom template name
- `save_as`: Custom output file path
- `url`: Custom URL path
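
To make the field descriptions above concrete, here is a minimal sketch of turning raw string metadata into typed values using only the standard library. The helpers `parse_date`, `split_list`, `simple_slug`, and `normalize` are hypothetical and are not Pelican's own processors; in particular, the accepted date layouts and the slug rule are assumptions of this sketch.

```python
import re
from datetime import datetime


def parse_date(raw):
    """Try a few common layouts; these formats are assumptions of this sketch."""
    for layout in ("%Y-%m-%d %H:%M", "%Y-%m-%d"):
        try:
            return datetime.strptime(raw.strip(), layout)
        except ValueError:
            continue
    raise ValueError(f"unrecognized date: {raw!r}")


def split_list(raw):
    """Split a comma-separated field and drop empty entries."""
    return [item.strip() for item in raw.split(",") if item.strip()]


def simple_slug(title):
    """Naive ASCII slug; Pelican's own slug rules are configurable and differ."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def normalize(raw_metadata):
    """Convert raw string metadata into typed values."""
    meta = dict(raw_metadata)
    if "date" in meta:
        meta["date"] = parse_date(meta["date"])
    for key in ("tags", "authors"):
        if key in meta:
            meta[key] = split_list(meta[key])
    # Derive a slug from the title only when none was provided.
    meta.setdefault("slug", simple_slug(meta.get("title", "")))
    return meta


meta = normalize({
    "title": "My Article Title",
    "date": "2023-01-15 10:30",
    "tags": "tutorial, programming",
})
print(meta["slug"], meta["tags"], meta["date"].isoformat())
```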

### Metadata Format Examples

#### Markdown with YAML Front Matter

```markdown
---
title: My Article Title
date: 2023-01-15 10:30
category: Python
tags: tutorial, programming
author: John Doe
summary: A comprehensive guide to Python programming.
---

# Article Content

Content goes here...
```

#### Markdown with Meta Extension

```markdown
Title: My Article Title
Date: 2023-01-15 10:30
Category: Python
Tags: tutorial, programming
Author: John Doe
Summary: A comprehensive guide to Python programming.

# Article Content

Content goes here...
```
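
The header layout shown above can be approximated with a few lines of standard-library Python. This is only a sketch of the idea: `split_meta_header` is a hypothetical helper, and the actual Python-Markdown meta extension handles more cases (such as multi-line values) than this version does.

```python
def split_meta_header(text):
    """Split 'Key: value' header lines from the body at the first blank line."""
    metadata = {}
    lines = text.splitlines()
    body_start = len(lines)
    for index, line in enumerate(lines):
        if not line.strip():
            # A blank line ends the header block.
            body_start = index + 1
            break
        key, sep, value = line.partition(":")
        if not sep:
            # Not a header line: treat everything from here on as body.
            body_start = index
            break
        metadata[key.strip().lower()] = value.strip()
    return metadata, "\n".join(lines[body_start:])


source = """Title: My Article Title
Date: 2023-01-15 10:30
Tags: tutorial, programming

# Article Content
"""
metadata, body = split_meta_header(source)
print(metadata["date"])
print(body)
```

Splitting on the first colon only keeps values such as `10:30` intact.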

#### reStructuredText

```rst
My Article Title
================

:date: 2023-01-15 10:30
:category: Python
:tags: tutorial, programming
:author: John Doe
:summary: A comprehensive guide to Python programming.

Article Content
---------------

Content goes here...
```

#### HTML

```html
<html>
<head>
    <title>My Article Title</title>
    <meta name="date" content="2023-01-15 10:30">
    <meta name="category" content="Python">
    <meta name="tags" content="tutorial, programming">
    <meta name="author" content="John Doe">
    <meta name="summary" content="A comprehensive guide to Python programming.">
</head>
<body>
    <h1>Article Content</h1>
    <p>Content goes here...</p>
</body>
</html>
```
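
To show how metadata like this can be pulled out of a document, the sketch below uses the standard library's `html.parser.HTMLParser`. `MetaTagCollector` is a hypothetical class written for this example and is not Pelican's `HTMLReader` implementation; it collects only the `<title>` text and `<meta name=... content=...>` pairs.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.metadata = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            name = attributes.get("name")
            if name and attributes.get("content") is not None:
                self.metadata[name.lower()] = attributes["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text is recorded only while inside the <title> element.
        if self._in_title:
            self.metadata["title"] = self.metadata.get("title", "") + data


collector = MetaTagCollector()
collector.feed(
    '<html><head><title>My Article Title</title>'
    '<meta name="date" content="2023-01-15 10:30">'
    '<meta name="tags" content="tutorial, programming">'
    '</head><body><p>Content goes here...</p></body></html>'
)
print(collector.metadata)
```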

## Usage Examples

### Using Readers Directly

```python
from pelican.contents import Article
from pelican.readers import Readers
from pelican.settings import read_settings

# Load settings and create readers
settings = read_settings('pelicanconf.py')
readers = Readers(settings)

# Read a Markdown file
content = readers.read_file(
    base_path='content',
    path='articles/my-post.md',
    content_class=Article
)

print(content.title)     # Article title
print(content.content)   # HTML content
print(content.metadata)  # Raw metadata dictionary
```

### Custom Reader Implementation

```python
from pelican.readers import BaseReader
import json

class JsonReader(BaseReader):
    """Custom reader for JSON content files."""

    file_extensions = ['json']

    def read(self, source_path):
        """Read JSON file and extract content/metadata."""
        with open(source_path, 'r', encoding='utf-8') as f:
            data = json.load(f)

        # Extract content and metadata
        content = data.get('content', '')
        metadata = {k: v for k, v in data.items() if k != 'content'}

        # Process metadata using the base class method,
        # which returns only the processed value
        processed_metadata = {}
        for name, value in metadata.items():
            processed_metadata[name] = self.process_metadata(name, str(value))

        return content, processed_metadata

# Register custom reader
# In pelicanconf.py (READERS maps extensions to reader classes):
# READERS = {'json': JsonReader}
```

### Reader Integration with Generators

```python
from pelican.contents import Article
from pelican.generators import Generator

class CustomGenerator(Generator):
    """Generator that uses readers to process content."""

    def generate_context(self):
        """Generate content using readers."""
        content_files = self.get_content_files()

        for content_file in content_files:
            # Use readers to parse file
            content = self.readers.read_file(
                base_path=self.path,
                path=content_file,
                content_class=Article
            )

            # Process content
            self.process_content(content)

    def get_content_files(self):
        """Get list of content files to process."""
        # Implementation depends on file discovery strategy
        return []

    def process_content(self, content):
        """Process parsed content."""
        # Add to context or perform custom processing
        pass
```

### Metadata Processing Customization

```python
from pelican.readers import BaseReader
from datetime import datetime

class CustomReader(BaseReader):
    """Reader with custom metadata processing."""

    def process_metadata(self, name, value):
        """Custom metadata processing logic."""
        # The base implementation returns only the processed value
        value = super().process_metadata(name, value)

        # Custom date parsing, applied only if the value is still a string
        if name == 'date':
            if isinstance(value, str):
                try:
                    value = datetime.strptime(value, '%Y-%m-%d %H:%M')
                except ValueError:
                    value = datetime.strptime(value, '%Y-%m-%d')

        # Custom tag processing, applied only if the value is still a string
        elif name == 'tags':
            if isinstance(value, str):
                value = [tag.strip() for tag in value.split(',')]

        return value
```
