# Core Conversion Functions

The core conversion functions convert Creole to HTML, and HTML to Creole, ReStructuredText, or Textile. They are the main interface to python-creole and cover the most common conversion scenarios with sensible defaults.

## Capabilities

### Creole to HTML Conversion

Convert Creole markup to HTML with support for macros, custom block rules, and different line break modes.

```python { .api }
def creole2html(markup_string: str, debug: bool = False,
                parser_kwargs: dict = None, emitter_kwargs: dict = None,
                block_rules: tuple = None, blog_line_breaks: bool = True,
                macros: dict = None, verbose: int = None, stderr = None,
                strict: bool = False) -> str
```

**Parameters:**
- `markup_string`: Creole markup text to convert
- `debug`: Enable debug output for parser and emitter
- `parser_kwargs`: Additional parser configuration (deprecated)
- `emitter_kwargs`: Additional emitter configuration (deprecated)
- `block_rules`: Custom block-level parsing rules
- `blog_line_breaks`: Use blog-style line breaks (True) vs wiki-style (False)
- `macros`: Dictionary of macro functions for extending markup
- `verbose`: Verbosity level for output
- `stderr`: Error output stream
- `strict`: Enable strict Creole 1.0 compliance mode

**Returns:** HTML string

**Usage Examples:**

```python
from creole import creole2html

# Basic conversion
html = creole2html("This is **bold** text")
# Returns: '<p>This is <strong>bold</strong> text</p>'

# With macros
macros = {'code': lambda ext, text: f'<pre><code class="{ext}">{text}</code></pre>'}
html = creole2html('<<code ext="python">>\nprint("hello")\n<</code>>', macros=macros)

# With custom image sizing (non-standard extension)
html = creole2html('{{image.jpg|Alt text|90x120}}')
# Returns: '<p><img src="image.jpg" title="Alt text" alt="Alt text" width="90" height="120" /></p>'

# Strict mode (standard Creole only)
html = creole2html('{{image.jpg|Alt text|90x120}}', strict=True)
# Returns: '<p><img src="image.jpg" title="Alt text|90x120" alt="Alt text|90x120" /></p>'
```
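
A macro is an ordinary Python callable, so it can be written and tested on its own before it is passed to `creole2html`. The sketch below is a hypothetical `code_macro` function, not part of python-creole. It assumes the same call convention as the lambda above (the tag attribute arrives as `ext`, the macro body as `text`) and additionally escapes both values with the standard library's `html.escape`, so that source code containing `<` or `&` does not break the generated HTML.

```python
from html import escape


def code_macro(ext, text):
    """Hypothetical macro: wrap the macro body in an escaped <pre><code> block.

    Assumes the call convention of the lambda above: `ext` comes from the
    tag attribute, `text` is the raw body between the macro tags.
    """
    # quote=True also escapes double quotes, which matters inside class="..."
    css_class = escape(ext, quote=True)
    # The body is element content, so quotes can stay as they are
    body = escape(text, quote=False)
    return f'<pre><code class="{css_class}">{body}</code></pre>'


# The macro can be exercised without python-creole installed:
snippet = code_macro("python", "a < b & c")
print(snippet)

# Passing it on would then use the dictionary form shown above:
# html = creole2html(markup, macros={'code': code_macro})
```

Keeping the macro as a named function rather than a lambda makes it easier to unit-test and to reuse across several `creole2html` calls.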

### HTML to Creole Conversion

Convert HTML markup back to Creole format with configurable handling of unknown tags and strict compliance options.

```python { .api }
def html2creole(html_string: str, debug: bool = False,
                parser_kwargs: dict = None, emitter_kwargs: dict = None,
                unknown_emit = None, strict: bool = False) -> str
```

**Parameters:**
- `html_string`: HTML markup to convert
- `debug`: Enable debug output
- `parser_kwargs`: Additional parser configuration (deprecated)
- `emitter_kwargs`: Additional emitter configuration (deprecated)
- `unknown_emit`: Handler function for unknown HTML tags
- `strict`: Enable strict Creole output mode

**Returns:** Creole markup string

**Usage Examples:**

```python
from creole import html2creole
from creole.shared.unknown_tags import transparent_unknown_nodes

# Basic conversion
creole = html2creole('<p>This is <strong>bold</strong> text</p>')
# Returns: 'This is **bold** text'

# Handle unknown tags transparently
creole = html2creole('<p>Text with <unknown>content</unknown></p>',
                     unknown_emit=transparent_unknown_nodes)
# Returns: 'Text with content'

# Convert complex HTML structures
html = '''
<h1>Heading</h1>
<ul>
<li>Item 1</li>
<li>Item 2</li>
</ul>
<p>Paragraph with <a href="http://example.com">link</a></p>
'''
creole = html2creole(html)
# Returns: '= Heading =\n\n* Item 1\n* Item 2\n\nParagraph with [[http://example.com|link]]'
```

### HTML to ReStructuredText Conversion

Convert HTML to ReStructuredText markup for documentation systems and Sphinx integration.

```python { .api }
def html2rest(html_string: str, debug: bool = False,
              parser_kwargs: dict = None, emitter_kwargs: dict = None,
              unknown_emit = None) -> str
```

**Parameters:**
- `html_string`: HTML markup to convert
- `debug`: Enable debug output
- `parser_kwargs`: Additional parser configuration (deprecated)
- `emitter_kwargs`: Additional emitter configuration (deprecated)
- `unknown_emit`: Handler function for unknown HTML tags

**Returns:** ReStructuredText markup string

**Usage Examples:**
122
123
```python
124
from creole import html2rest
125
126
# Basic conversion
127
rest = html2rest('<p>This is <strong>bold</strong> and <em>italic</em> text</p>')
128
# Returns: 'This is **bold** and *italic* text'
129
130
# Convert headings and lists
131
html = '''
132
<h1>Main Title</h1>
133
<h2>Subtitle</h2>
134
<ul>
135
<li>First item</li>
136
<li>Second item</li>
137
</ul>
138
<p>Link to <a href="https://example.com">example</a></p>
139
'''
140
rest = html2rest(html)
141
# Returns ReStructuredText with proper heading underlines and reference links
142
```

### HTML to Textile Conversion

Convert HTML to Textile markup format.

```python { .api }
def html2textile(html_string: str, debug: bool = False,
                 parser_kwargs: dict = None, emitter_kwargs: dict = None,
                 unknown_emit = None) -> str
```

**Parameters:**
- `html_string`: HTML markup to convert
- `debug`: Enable debug output
- `parser_kwargs`: Additional parser configuration (deprecated)
- `emitter_kwargs`: Additional emitter configuration (deprecated)
- `unknown_emit`: Handler function for unknown HTML tags

**Returns:** Textile markup string

**Usage Examples:**
164
165
```python
166
from creole import html2textile
167
168
# Basic conversion
169
textile = html2textile('<p>This is <strong>bold</strong> and <i>italic</i> text</p>')
170
# Returns: 'This is *bold* and __italic__ text'
171
172
# Convert links and formatting
173
html = '<p>Visit <a href="http://example.com">example site</a> for more <em>information</em></p>'
174
textile = html2textile(html)
175
# Returns: 'Visit "example site":http://example.com for more __information__'
176
```

### HTML Document Tree Parsing

Low-level function that parses HTML into a document tree for advanced processing.

```python { .api }
def parse_html(html_string: str, debug: bool = False) -> DocNode
```

**Parameters:**
- `html_string`: HTML markup to parse
- `debug`: Enable debug output

**Returns:** Document tree root node

**Usage Examples:**

```python
from creole import parse_html

# Parse HTML into document tree
html = '<p>Hello <strong>world</strong></p>'
doc_tree = parse_html(html)

# Access document structure
debug = True  # Set to False to skip the tree dump
if debug:
    doc_tree.debug()  # Print tree structure
```
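
For the advanced processing mentioned above, a document tree is typically walked recursively. The following is a minimal sketch and not part of python-creole. It assumes that each node exposes a `kind` string and a `children` list; both attribute names are assumptions about `DocNode` that should be checked against the installed version. A small stand-in class, with purely illustrative node kinds, lets the sketch run on its own.

```python
from dataclasses import dataclass, field


@dataclass
class StubNode:
    """Stand-in for a tree node; `kind` and `children` are assumed names."""
    kind: str
    children: list = field(default_factory=list)


def iter_nodes(node, depth=0):
    """Yield (depth, node) pairs in document order (pre-order traversal)."""
    yield depth, node
    for child in getattr(node, "children", []):
        yield from iter_nodes(child, depth + 1)


# Hand-built stand-in tree; the node kinds are illustrative only
tree = StubNode("document", [
    StubNode("p", [StubNode("data"), StubNode("strong", [StubNode("data")])]),
])

outline = [(depth, node.kind) for depth, node in iter_nodes(tree)]
print(outline)
```

If the assumptions hold, the same generator could be applied to the value returned by `parse_html`, for example to count or filter nodes by kind.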

## Error Handling

All conversion functions expect Unicode strings (`str`) and raise `AssertionError` when given non-Unicode input such as `bytes`. When the HTML or markup is malformed, the functions attempt to degrade gracefully rather than raise exceptions.
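
Because of this input check, data read as bytes (for example from a file opened in binary mode) has to be decoded before conversion. A minimal sketch of such a guard follows. `ensure_text` is a hypothetical helper, not part of python-creole, and UTF-8 is only an assumed default encoding.

```python
def ensure_text(data, encoding="utf-8"):
    """Return `data` as str, decoding bytes-like input first.

    Hypothetical helper: the default encoding is an assumption, pass the
    real one if it is known.
    """
    if isinstance(data, str):
        return data
    if isinstance(data, (bytes, bytearray)):
        return bytes(data).decode(encoding)
    raise TypeError(f"expected str or bytes, got {type(data).__name__}")


raw = "Caf\u00e9 **bold**".encode("utf-8")
markup = ensure_text(raw)
print(markup)

# `markup` is now a str and can be handed to a converter,
# e.g. creole2html(markup)
```

An explicit check like this also fails with a clearer `TypeError` for unsupported input types, instead of relying on the assertion inside the converter.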

Unknown HTML tags are handled according to the `unknown_emit` parameter:
- `None` (default): Remove unknown tags, keep content
- `raise_unknown_node`: Raise exception on unknown tags
- `transparent_unknown_nodes`: Pass through content only
- `escape_unknown_nodes`: Escape the tags as text
- Custom function: Provide your own handling logic
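
A custom handler can be sketched as follows. The sketch assumes that the handler is called with the emitter and the unknown node, like the built-in handlers, and that `node.kind` and `emitter.emit_children()` are available. Treat these as assumptions to verify against the installed python-creole version. The stub classes exist only so that the example runs without the library.

```python
def bracket_unknown_nodes(emitter, node):
    """Hypothetical handler: keep the content and note which tag was dropped."""
    content = emitter.emit_children(node)  # assumed emitter method
    return f"[{node.kind}: {content}]"     # assumed node attribute


# --- stand-ins so the sketch runs without python-creole ---
class StubNode:
    def __init__(self, kind, text):
        self.kind = kind
        self.text = text


class StubEmitter:
    def emit_children(self, node):
        return node.text


result = bracket_unknown_nodes(StubEmitter(), StubNode("unknown", "content"))
print(result)  # [unknown: content]

# With the library, the handler would be passed like the built-in ones:
# html2creole(html_string, unknown_emit=bracket_unknown_nodes)
```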