
# Core Conversion Functions

Primary functions for converting Chinese characters to pinyin with various output options, heteronym support, and error handling customization.

## Capabilities

### Main Pinyin Conversion

The primary function for converting Chinese characters to pinyin with comprehensive options for output style, heteronym handling, and error processing.

```python { .api }
def pinyin(hans, style=Style.TONE, heteronym=False, errors='default', strict=True, v_to_u=False, neutral_tone_with_five=False):
    """
    Convert Chinese characters to pinyin.

    Parameters:
    - hans (str): Chinese characters to convert
    - style (Style): Output style (default: Style.TONE)
    - heteronym (bool): Return all possible pronunciations for polyphonic characters (default: False)
    - errors (str): Error handling strategy ('default', 'ignore', 'replace', 'exception') (default: 'default')
    - strict (bool): Strict mode for character processing (default: True)
    - v_to_u (bool): Convert 'v' to 'ü' in output (default: False)
    - neutral_tone_with_five (bool): Use '5' for neutral tone in numeric styles (default: False)

    Returns:
    list: List of lists, where each inner list contains pinyin for one character.
    With heteronym=True, inner lists may contain multiple pronunciations.
    """
```

#### Usage Examples

```python
from pypinyin import pinyin, Style

# Basic conversion with tone marks
result = pinyin('中国')
print(result)  # [['zhōng'], ['guó']]

# Heteronym support - multiple pronunciations
result = pinyin('银行', heteronym=True)
print(result)  # [['yín'], ['háng', 'xíng']]

# Different styles
result = pinyin('中国', style=Style.TONE3)
print(result)  # [['zhong1'], ['guo2']]

result = pinyin('中国', style=Style.INITIALS)
print(result)  # [['zh'], ['g']]

# Character conversion options
result = pinyin('女', style=Style.TONE2, v_to_u=True)
print(result)  # [['nü3']] instead of [['nv3']]

# Error handling
result = pinyin('中国abc', errors='ignore')
print(result)  # [['zhōng'], ['guó']]

# 'replace' emits hexadecimal code points: a=61, b=62, c=63
result = pinyin('中国abc', errors='replace')
print(result)  # [['zhōng'], ['guó'], ['616263']]
```
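
Because `pinyin()` returns one inner list per character, callers often need to collapse that nested shape. The following is a minimal sketch that works on sample data copied from the heteronym example above rather than calling the library; `first_readings` and `has_heteronym` are hypothetical helper names, not part of pypinyin.

```python
# Sample data copied from the heteronym example above; no library call is made here.
nested = [['yín'], ['háng', 'xíng']]


def first_readings(result):
    """Keep only the first listed reading for each character (hypothetical helper)."""
    return [readings[0] for readings in result if readings]


def has_heteronym(result):
    """Return True if any character carries more than one reading (hypothetical helper)."""
    return any(len(readings) > 1 for readings in result)


print(first_readings(nested))  # ['yín', 'háng']
print(has_heteronym(nested))   # True
```

Taking the first reading is only one possible policy; whether it is the right one depends on how the readings are ordered in the result you actually receive.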

### Simplified Pinyin Conversion

Optimized function for simple pinyin conversion without heteronym results, ideal for most common use cases.

```python { .api }
def lazy_pinyin(hans, style=Style.NORMAL, errors='default', strict=True, v_to_u=False, neutral_tone_with_five=False, tone_sandhi=False):
    """
    Convert Chinese characters to pinyin (lazy mode - no heteronyms).

    Parameters:
    - hans (str): Chinese characters to convert
    - style (Style): Output style (default: Style.NORMAL)
    - errors (str): Error handling strategy ('default', 'ignore', 'replace', 'exception') (default: 'default')
    - strict (bool): Strict mode for character processing (default: True)
    - v_to_u (bool): Convert 'v' to 'ü' in output (default: False)
    - neutral_tone_with_five (bool): Use '5' for neutral tone in numeric styles (default: False)
    - tone_sandhi (bool): Apply tone sandhi processing rules (default: False)

    Returns:
    list: Flat list of pinyin strings, one per character.
    """
```

#### Usage Examples

```python
from pypinyin import lazy_pinyin, Style

# Simple conversion
result = lazy_pinyin('中国')
print(result)  # ['zhong', 'guo']

# With tones
result = lazy_pinyin('中国', style=Style.TONE)
print(result)  # ['zhōng', 'guó']

# Tone sandhi processing
result = lazy_pinyin('一个', tone_sandhi=True)
print(result)  # Applies tone change rules

# First letters only
result = lazy_pinyin('中华人民共和国', style=Style.FIRST_LETTER)
print(result)  # ['z', 'h', 'r', 'm', 'g', 'h', 'g']
```
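
One common use of the flat list is to build a compact lookup key, for example for prefix search over names. The sketch below assumes the list has already been produced (the data is copied from the `FIRST_LETTER` example above); `acronym` and `matches_prefix` are hypothetical helpers introduced for illustration only.

```python
def acronym(letters):
    """Join per-character first letters into one lowercase key (hypothetical helper)."""
    return ''.join(letters).lower()


def matches_prefix(query, letters):
    """Return True if the acronym starts with the typed query (hypothetical helper)."""
    return acronym(letters).startswith(query.lower())


# Sample data copied from the FIRST_LETTER example above; no library call is made here.
letters = ['z', 'h', 'r', 'm', 'g', 'h', 'g']

print(acronym(letters))               # zhrmghg
print(matches_prefix('ZHR', letters))  # True
print(matches_prefix('zg', letters))   # False
```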

### URL Slug Generation

Generate URL-friendly slug strings from Chinese characters using pinyin conversion with customizable separators.

```python { .api }
def slug(hans, style=Style.NORMAL, heteronym=False, separator='-', errors='default', strict=True):
    """
    Generate slug string from Chinese characters.

    Parameters:
    - hans (str): Chinese characters to convert
    - style (Style): Output style (default: Style.NORMAL)
    - heteronym (bool): Include all pronunciations for polyphonic characters (default: False)
    - separator (str): Separator between pinyin syllables (default: '-')
    - errors (str): Error handling strategy ('default', 'ignore', 'replace', 'exception') (default: 'default')
    - strict (bool): Strict mode for character processing (default: True)

    Returns:
    str: URL-friendly slug string.
    """
```

#### Usage Examples

```python
from pypinyin import slug, Style

# Basic slug generation
result = slug('中国')
print(result)  # 'zhong-guo'

# Custom separator
result = slug('中国', separator='_')
print(result)  # 'zhong_guo'

# With tones (not typical for URLs)
result = slug('中国', style=Style.TONE)
print(result)  # 'zhōng-guó'

# Heteronym handling
result = slug('银行', heteronym=True, separator='_')
print(result)  # 'yin_hang_xing'

# Mixed content
result = slug('北京大学2023')
print(result)  # 'bei-jing-da-xue-2023'
```
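
The heteronym example above shows every reading ending up in the slug. One way to picture this is as flattening a `pinyin()`-style nested list and joining the pieces with the separator. The sketch below models that behavior on literal data; `join_slug` is a hypothetical helper and is not claimed to be how the library implements `slug`.

```python
from itertools import chain


def join_slug(nested, separator='-'):
    """Flatten a nested pinyin list and join the pieces (hypothetical helper)."""
    return separator.join(chain.from_iterable(nested))


# Literal inputs mirroring the slug examples above; no library call is made here.
print(join_slug([['zhong'], ['guo']]))                        # zhong-guo
print(join_slug([['yin'], ['hang', 'xing']], separator='_'))  # yin_hang_xing
```

Seen this way, `heteronym=True` is rarely what you want for URLs, since each extra reading adds another segment to the slug.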

## Error Strategies

All core functions support four error handling strategies:

- **'default'**: Keep unrecognized characters as-is in the output
- **'ignore'**: Skip unrecognized characters entirely
- **'replace'**: Replace unrecognized characters with their hexadecimal Unicode code points (without the `\u` prefix)
- **'exception'**: Raise `PinyinNotFoundException` for unrecognized characters

```python
from pypinyin import lazy_pinyin

# Demonstration of error handling
text = '中国abc123'

# Default: keep unrecognized characters
result = lazy_pinyin(text, errors='default')
print(result)  # ['zhong', 'guo', 'abc123']

# Ignore: skip unrecognized characters
result = lazy_pinyin(text, errors='ignore')
print(result)  # ['zhong', 'guo']

# Replace: substitute unrecognized characters with unicode codes
result = lazy_pinyin(text, errors='replace')
print(result)  # ['zhong', 'guo', '616263313233']  # Unicode codes without \u

# Exception: raise error for unrecognized characters
from pypinyin import PinyinNotFoundException
try:
    result = lazy_pinyin(text, errors='exception')
except PinyinNotFoundException as e:
    print(f"Exception raised: {e.message}")
    print(f"Problematic chars: {e.chars}")
```
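
To make the `'replace'` strategy described above concrete, the sketch below computes the hexadecimal code points for a run of unrecognized characters by hand. It is an illustration of the description ("Unicode code points without the `\u` prefix"), not the library's own code, and `hex_code_points` is a hypothetical helper name.

```python
def hex_code_points(chars):
    """Concatenate each character's code point in lowercase hex (hypothetical helper)."""
    return ''.join('%x' % ord(ch) for ch in chars)


# 'a'=0x61, 'b'=0x62, 'c'=0x63, '1'=0x31, '2'=0x32, '3'=0x33
print(hex_code_points('abc'))     # 616263
print(hex_code_points('abc123'))  # 616263313233
```

Note that the result is not reversible in general, because code points of different lengths are concatenated without a delimiter.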