# Core Conversion Functions

Primary functions for converting Chinese characters to pinyin with various output options, heteronym support, and error handling customization.

## Capabilities

### Main Pinyin Conversion

The primary function for converting Chinese characters to pinyin with comprehensive options for output style, heteronym handling, and error processing.

```python { .api }
def pinyin(hans, style=Style.TONE, heteronym=False, errors='default', strict=True, v_to_u=False, neutral_tone_with_five=False):
    """
    Convert Chinese characters to pinyin.

    Parameters:
    - hans (str): Chinese characters to convert
    - style (Style): Output style (default: Style.TONE)
    - heteronym (bool): Return all possible pronunciations for polyphonic characters (default: False)
    - errors (str): Error handling strategy ('default', 'ignore', 'replace', 'exception') (default: 'default')
    - strict (bool): Strict mode for character processing (default: True)
    - v_to_u (bool): Convert 'v' to 'ü' in output (default: False)
    - neutral_tone_with_five (bool): Use '5' for neutral tone in numeric styles (default: False)

    Returns:
    list: List of lists, where each inner list contains pinyin for one character.
          With heteronym=True, inner lists may contain multiple pronunciations.
    """
```

#### Usage Examples

```python
from pypinyin import pinyin, Style

# Basic conversion with tone marks
result = pinyin('中国')
print(result)  # [['zhōng'], ['guó']]

# Heteronym support: an inner list may hold several pronunciations
result = pinyin('中心', heteronym=True)
print(result)  # [['zhōng', 'zhòng'], ['xīn']]

# Different styles
result = pinyin('中国', style=Style.TONE3)
print(result)  # [['zhong1'], ['guo2']]

result = pinyin('中国', style=Style.INITIALS)
print(result)  # [['zh'], ['g']]

# Character conversion options
result = pinyin('女', style=Style.TONE2, v_to_u=True)
print(result)  # [['nü3']] instead of [['nv3']]

# Error handling
result = pinyin('中国abc', errors='ignore')
print(result)  # [['zhōng'], ['guó']]

# 'replace' emits hexadecimal code points: a=61, b=62, c=63
result = pinyin('中国abc', errors='replace')
print(result)  # [['zhōng'], ['guó'], ['616263']]
```
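
Because `pinyin` returns one inner list per item, callers often collapse the result before displaying it. The following is a minimal sketch of that step, not part of the library: `first_readings` is a hypothetical helper, and the list literals are copied from the documented outputs above so that the snippet runs without pypinyin installed.

```python
def first_readings(nested):
    """Keep the first reading from each inner list (hypothetical helper)."""
    return [readings[0] for readings in nested if readings]


# Literal mirroring the documented output of pinyin('中国')
plain = [['zhōng'], ['guó']]
flat = first_readings(plain)
joined = ' '.join(flat)
print(joined)  # zhōng guó

# Literal mirroring the documented output of pinyin('中心', heteronym=True)
polyphonic = [['zhōng', 'zhòng'], ['xīn']]
print(first_readings(polyphonic))  # ['zhōng', 'xīn']
```

Taking the first reading is only one possible policy; an application that needs every pronunciation would iterate over the inner lists instead.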

### Simplified Pinyin Conversion

Optimized function for simple pinyin conversion without heteronym results, ideal for most common use cases.

```python { .api }
def lazy_pinyin(hans, style=Style.NORMAL, errors='default', strict=True, v_to_u=False, neutral_tone_with_five=False, tone_sandhi=False):
    """
    Convert Chinese characters to pinyin (lazy mode - no heteronyms).

    Parameters:
    - hans (str): Chinese characters to convert
    - style (Style): Output style (default: Style.NORMAL)
    - errors (str): Error handling strategy ('default', 'ignore', 'replace', 'exception') (default: 'default')
    - strict (bool): Strict mode for character processing (default: True)
    - v_to_u (bool): Convert 'v' to 'ü' in output (default: False)
    - neutral_tone_with_five (bool): Use '5' for neutral tone in numeric styles (default: False)
    - tone_sandhi (bool): Apply tone sandhi processing rules (default: False)

    Returns:
    list: Flat list of pinyin strings, one per character.
    """
```

#### Usage Examples

```python
from pypinyin import lazy_pinyin, Style

# Simple conversion
result = lazy_pinyin('中国')
print(result)  # ['zhong', 'guo']

# With tones
result = lazy_pinyin('中国', style=Style.TONE)
print(result)  # ['zhōng', 'guó']

# Tone sandhi processing (the effect only shows in a style that includes tones)
result = lazy_pinyin('一个', style=Style.TONE, tone_sandhi=True)
print(result)  # Readings with tone change rules applied

# First letters only
result = lazy_pinyin('中华人民共和国', style=Style.FIRST_LETTER)
print(result)  # ['z', 'h', 'r', 'm', 'g', 'h', 'g']
```
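
Since `lazy_pinyin` returns a flat list, its items can be paired with the source characters, for example to build ruby-style annotations. The sketch below is illustrative only: `annotate` is a hypothetical helper, the syllable list is copied from the documented output above, and the pairing assumes every input character is Chinese. A run of unrecognized characters such as `'abc123'` is returned as a single item, which would break the one-to-one alignment, so the helper checks lengths first.

```python
def annotate(text, syllables):
    """Pair each character with its syllable (hypothetical helper).

    Assumes one syllable per character; raises if the lengths differ.
    """
    if len(text) != len(syllables):
        raise ValueError("text and syllables differ in length")
    return list(zip(text, syllables))


# Literal mirroring the documented output of lazy_pinyin('中国')
pairs = annotate('中国', ['zhong', 'guo'])
print(pairs)  # [('中', 'zhong'), ('国', 'guo')]
```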

### URL Slug Generation

Generate URL-friendly slug strings from Chinese characters using pinyin conversion with customizable separators.

```python { .api }
def slug(hans, style=Style.NORMAL, heteronym=False, separator='-', errors='default', strict=True):
    """
    Generate slug string from Chinese characters.

    Parameters:
    - hans (str): Chinese characters to convert
    - style (Style): Output style (default: Style.NORMAL)
    - heteronym (bool): Include all pronunciations for polyphonic characters (default: False)
    - separator (str): Separator between pinyin syllables (default: '-')
    - errors (str): Error handling strategy ('default', 'ignore', 'replace', 'exception') (default: 'default')
    - strict (bool): Strict mode for character processing (default: True)

    Returns:
    str: URL-friendly slug string.
    """
```

#### Usage Examples

```python
from pypinyin import slug, Style

# Basic slug generation
result = slug('中国')
print(result)  # 'zhong-guo'

# Custom separator
result = slug('中国', separator='_')
print(result)  # 'zhong_guo'

# With tones (not typical for URLs)
result = slug('中国', style=Style.TONE)
print(result)  # 'zhōng-guó'

# Heteronym handling
result = slug('银行', heteronym=True, separator='_')
print(result)  # Every reading returned for each character, joined by '_'

# Mixed content
result = slug('北京大学2023')
print(result)  # 'bei-jing-da-xue-2023'
```
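
One way to picture the relationship between `slug` and `pinyin` is as a flatten-and-join over the nested result. The sketch below illustrates that idea under stated assumptions and is not presented as the library's implementation: `join_slug` is a hypothetical helper, and the nested literal mirrors what `pinyin('中国', style=Style.NORMAL)` is documented to return.

```python
from itertools import chain


def join_slug(nested, separator='-'):
    """Flatten a list of reading lists and join with a separator (hypothetical helper)."""
    return separator.join(chain.from_iterable(nested))


# Literal mirroring pinyin('中国', style=Style.NORMAL)
nested = [['zhong'], ['guo']]
print(join_slug(nested))                  # zhong-guo
print(join_slug(nested, separator='_'))   # zhong_guo
```

Under this model, an inner list with several readings contributes all of them to the slug, which is why `heteronym=True` is rarely what a URL needs.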

## Error Strategies

All core functions support four error handling strategies:

- **'default'**: Keep unrecognized characters as-is in the output
- **'ignore'**: Skip unrecognized characters entirely
- **'replace'**: Replace unrecognized characters with their hexadecimal Unicode code points (without the `\u` prefix)
- **'exception'**: Raise `PinyinNotFoundException` for unrecognized characters

```python
# Demonstration of error handling
from pypinyin import lazy_pinyin
from pypinyin.exceptions import PinyinNotFoundException

text = '中国abc123'

# Default: keep unrecognized characters
result = lazy_pinyin(text, errors='default')
print(result)  # ['zhong', 'guo', 'abc123']

# Ignore: skip unrecognized characters
result = lazy_pinyin(text, errors='ignore')
print(result)  # ['zhong', 'guo']

# Replace: substitute unrecognized characters with hex code points
result = lazy_pinyin(text, errors='replace')
print(result)  # ['zhong', 'guo', '616263313233']  # Unicode codes without \u

# Exception: raise error for unrecognized characters
try:
    result = lazy_pinyin(text, errors='exception')
except PinyinNotFoundException as e:
    print(f"Exception raised: {e.message}")
    print(f"Problematic chars: {e.chars}")
```
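
The string produced by the `'replace'` strategy can be checked by hand. The sketch below reproduces the encoding in plain Python, assuming each character is written as its lowercase hexadecimal code point with no padding and no separator; `hex_code_points` is a hypothetical helper, not a pypinyin function.

```python
def hex_code_points(chars):
    """Concatenate the hex code point of each character (hypothetical helper)."""
    return ''.join(format(ord(ch), 'x') for ch in chars)


# a=0x61, b=0x62, c=0x63, '1'=0x31, '2'=0x32, '3'=0x33
print(hex_code_points('abc123'))  # 616263313233

# U+2606 WHITE STAR, repeated twice
print(hex_code_points('☆☆'))      # 26062606
```

Because the code points are concatenated without a delimiter, the result cannot always be decoded back unambiguously; it is best treated as an opaque placeholder.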