# CLI Interface

Command-line interface and programmatic functions for charset detection and file normalization. The same functionality is available as a shell command and as importable Python functions.

## Capabilities

### Command-Line Detection

The primary CLI detection function. It processes one or more files and writes structured results to stdout in JSON format.

```python { .api }
def cli_detect(
    paths: list[str],
    alternatives: bool = False,
    normalize: bool = False,
    minimal: bool = False,
    replace: bool = False,
    force: bool = False,
    threshold: float = 0.2,
    verbose: bool = False
) -> None:
    """
    CLI detection function for processing multiple files.

    Parameters:
    - paths: List of file paths to analyze
    - alternatives: Output complementary possibilities if any (JSON list format)
    - normalize: Permit normalization of input files
    - minimal: Only output charset to STDOUT, disabling JSON output
    - replace: Replace files when normalizing instead of creating new ones
    - force: Replace files without asking for confirmation
    - threshold: Custom maximum chaos allowed in decoded content (0.0-1.0)
    - verbose: Display complementary information and detection logs

    Returns:
    None (outputs to stdout)

    Note: This function handles multiple files and outputs JSON results to stdout
    """
```

**Usage Example:**

```python
from charset_normalizer.cli import cli_detect

# Analyze single file
cli_detect(['document.txt'])

# Analyze with alternatives and verbose output
cli_detect(['data.csv'], alternatives=True, verbose=True)

# Normalize files with replacement
cli_detect(['file1.txt', 'file2.csv'], normalize=True, replace=True, force=True)

# Use custom detection threshold
cli_detect(['mixed_encoding.txt'], threshold=0.15, verbose=True)
```

### Interactive Confirmation

Helper function for interactive yes/no prompts in CLI operations.

```python { .api }
def query_yes_no(question: str, default: str = "yes") -> bool:
    """
    Ask a yes/no question via input() and return the answer.

    Parameters:
    - question: Question string presented to the user
    - default: Presumed answer if user just hits Enter ("yes", "no", or None)

    Returns:
    bool: True for "yes", False for "no"

    Raises:
    ValueError: If default is not "yes", "no", or None

    Note: Used internally by CLI for confirmation prompts
    """
```

**Usage Example:**

```python
from charset_normalizer.cli import query_yes_no

# Basic yes/no prompt
if query_yes_no("Do you want to continue?"):
    print("Proceeding...")
else:
    print("Cancelled")

# Default to "no"
if query_yes_no("Delete all files?", default="no"):
    print("Files deleted")

# Require explicit answer
answer = query_yes_no("Are you sure?", default=None)
```
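
To make the documented behaviour concrete, here is a minimal sketch of how a prompt with these semantics could be written. `ask_yes_no` and its `read` parameter are hypothetical names used for illustration; this is not the library's implementation, only an approximation of the behaviour described above (a default answer on Enter, and `ValueError` for an unsupported default).

```python
from typing import Callable, Optional

# Accepted spellings; anything else triggers a re-prompt.
_ANSWERS = {"yes": True, "y": True, "no": False, "n": False}


def ask_yes_no(
    question: str,
    default: Optional[str] = "yes",
    read: Callable[[str], str] = input,
) -> bool:
    """Hypothetical yes/no prompt (illustration only, not the library code)."""
    if default not in ("yes", "no", None):
        raise ValueError(f"invalid default answer: {default!r}")

    # Capitalise the option that an empty reply selects.
    hint = {"yes": "[Y/n]", "no": "[y/N]", None: "[y/n]"}[default]

    while True:
        reply = read(f"{question} {hint} ").strip().lower()
        if reply == "" and default is not None:
            return _ANSWERS[default]
        if reply in _ANSWERS:
            return _ANSWERS[reply]
        # Unrecognised (or empty, with no default) input: ask again.
```

Passing the reader as a parameter keeps the prompt testable without a terminal, for example `ask_yes_no("Continue?", read=lambda _: "y")`.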

## Shell Command Usage

The charset-normalizer package provides the `normalizer` command-line tool:

```bash
# Basic detection
normalizer document.txt

# Multiple files with alternatives
normalizer file1.txt file2.csv --with-alternative

# Normalize files in place
normalizer data.txt --normalize --replace --force

# Verbose detection with custom threshold
normalizer mixed_encoding.txt --verbose --threshold 0.15

# Minimal output (encoding name only)
normalizer simple.txt --minimal
```
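
When the tool has to be driven from another program, one option is to build the argument list and run the executable as a subprocess. The sketch below is illustrative only: `build_normalizer_argv` and `run_normalizer` are hypothetical helper names, the sketch uses only the flags shown above, and it assumes the `normalizer` executable is on `PATH`.

```python
import subprocess
from typing import List, Optional, Sequence


def build_normalizer_argv(
    paths: Sequence[str],
    alternatives: bool = False,
    normalize: bool = False,
    replace: bool = False,
    force: bool = False,
    minimal: bool = False,
    verbose: bool = False,
    threshold: Optional[float] = None,
) -> List[str]:
    """Translate keyword options into a `normalizer` argument list (hypothetical helper)."""
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")

    argv = ["normalizer", *paths]
    # Boolean options map one-to-one onto the flags shown in the shell examples.
    for enabled, flag in (
        (alternatives, "--with-alternative"),
        (normalize, "--normalize"),
        (replace, "--replace"),
        (force, "--force"),
        (minimal, "--minimal"),
        (verbose, "--verbose"),
    ):
        if enabled:
            argv.append(flag)
    if threshold is not None:
        argv += ["--threshold", str(threshold)]
    return argv


def run_normalizer(paths: Sequence[str], **options) -> str:
    """Run the CLI and return its stdout; assumes `normalizer` is on PATH."""
    completed = subprocess.run(
        build_normalizer_argv(paths, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return completed.stdout
```

For example, `build_normalizer_argv(['data.txt'], normalize=True, replace=True, force=True)` yields the same arguments as the in-place normalization command above.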

## JSON Output Format

The CLI outputs structured JSON results for programmatic consumption:

```json
{
  "path": "/path/to/document.txt",
  "encoding": "utf_8",
  "encoding_aliases": ["utf-8", "u8", "utf8"],
  "alternative_encodings": ["ascii"],
  "language": "English",
  "alphabets": ["Basic Latin"],
  "has_sig_or_bom": false,
  "chaos": 0.02,
  "coherence": 0.85,
  "unicode_path": null,
  "is_preferred": true
}
```

When `--with-alternative` is used, output becomes an array of results:

```json
[
  {
    "path": "/path/to/document.txt",
    "encoding": "utf_8",
    "language": "English",
    "chaos": 0.02,
    "coherence": 0.85,
    "is_preferred": true
  },
  {
    "path": "/path/to/document.txt",
    "encoding": "iso-8859-1",
    "language": "English",
    "chaos": 0.05,
    "coherence": 0.82,
    "is_preferred": false
  }
]
```
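
Because the output is a single object in one case and an array in the other, consumers usually need to normalize the shape before reading fields. A minimal sketch, assuming only the field names shown in the samples above; `parse_results` and `preferred_encoding` are hypothetical helpers, not part of charset-normalizer.

```python
import json
from typing import Any, Dict, List, Optional


def parse_results(output: str) -> List[Dict[str, Any]]:
    """Return the CLI results as a list, whether one object or an array was printed."""
    data = json.loads(output)
    return data if isinstance(data, list) else [data]


def preferred_encoding(output: str, path: Optional[str] = None) -> Optional[str]:
    """Return the encoding of the first entry flagged `is_preferred`.

    If `path` is given, only entries for that path are considered.
    Returns None when no matching entry is found.
    """
    for entry in parse_results(output):
        if path is not None and entry.get("path") != path:
            continue
        if entry.get("is_preferred"):
            return entry.get("encoding")
    return None
```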

## Integration Patterns

### Script Integration

```python
import sys
import json
from charset_normalizer.cli import cli_detect
from io import StringIO

# Capture CLI output programmatically
old_stdout = sys.stdout
sys.stdout = buffer = StringIO()

try:
    cli_detect(['document.txt'])
    output = buffer.getvalue()
finally:
    # Always restore the real stdout, even if detection fails
    sys.stdout = old_stdout

# Parse and report only after stdout is restored; printing while it is
# still redirected would write into the buffer instead of the terminal
result = json.loads(output)
print(f"Detected encoding: {result['encoding']}")
```
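
The same capture can be written with `contextlib.redirect_stdout`, which restores `sys.stdout` automatically when the block exits. The helper below is a sketch: `capture_json` is a hypothetical name, and it assumes the wrapped callable prints exactly one JSON document to stdout.

```python
import contextlib
import io
import json
from typing import Any, Callable


def capture_json(func: Callable[..., Any], *args: Any, **kwargs: Any) -> Any:
    """Call `func`, capture what it prints to stdout, and parse it as JSON."""
    buffer = io.StringIO()
    # redirect_stdout swaps sys.stdout for the duration of the block and
    # puts the original back even if `func` raises.
    with contextlib.redirect_stdout(buffer):
        func(*args, **kwargs)
    return json.loads(buffer.getvalue())
```

With the function documented above, this would be called as `capture_json(cli_detect, ['document.txt'])`.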

### Batch Processing

```python
from charset_normalizer.cli import cli_detect
import os

# Process all text files in directory
text_files = [f for f in os.listdir('.') if f.endswith('.txt')]
cli_detect(text_files, alternatives=True, verbose=True)
```
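
`os.listdir` only sees the top level of one directory. If files in subdirectories should be included, the path list can be built with `pathlib` first. `collect_files` is a hypothetical helper written for this example.

```python
from pathlib import Path
from typing import Iterable, List


def collect_files(root: str, suffixes: Iterable[str] = (".txt",)) -> List[str]:
    """Recursively list regular files under `root` whose suffix is in `suffixes`."""
    wanted = {s.lower() for s in suffixes}  # compare suffixes case-insensitively
    return sorted(
        str(p)
        for p in Path(root).rglob("*")
        if p.is_file() and p.suffix.lower() in wanted
    )
```

The returned list can then be passed as the `paths` argument shown above.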

### Safe File Normalization

```python
from charset_normalizer.cli import cli_detect, query_yes_no


def safe_normalize_files(file_paths):
    """Safely normalize files with user confirmation."""
    # First, detect encodings
    cli_detect(file_paths, verbose=True)

    # Ask for confirmation
    if query_yes_no(f"Normalize {len(file_paths)} files?"):
        cli_detect(file_paths, normalize=True, replace=True)
        print("Files normalized successfully")
    else:
        print("Normalization cancelled")


# Usage
safe_normalize_files(['doc1.txt', 'doc2.csv'])
```

## Error Handling

The CLI functions handle various error conditions:

- **File not found**: Skips missing files with warning
- **Permission errors**: Reports access issues and continues
- **Binary files**: Automatically skips non-text content
- **Encoding failures**: Reports problematic files and continues
- **User interruption**: Handles Ctrl+C gracefully

For programmatic usage, wrap CLI calls in `try`/`except` blocks:

```python
from charset_normalizer.cli import cli_detect

try:
    cli_detect(['problematic_file.bin'])
except KeyboardInterrupt:
    print("Detection interrupted by user")
except Exception as e:
    print(f"CLI error: {e}")
```
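
When many files are processed, it can help to call the detection function once per file so that one failure does not hide the results for the rest. The sketch below is an assumption-laden illustration: `detect_each` is a hypothetical helper, and `detect` stands for any callable that takes a list of paths, such as the function documented above. It also catches `SystemExit`, since a CLI entry point may exit rather than raise an ordinary exception.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def detect_each(
    paths: Iterable[str],
    detect: Callable[[List[str]], object],
) -> Tuple[List[str], Dict[str, str]]:
    """Call `detect([path])` per file; return (succeeded, {path: error message})."""
    succeeded: List[str] = []
    failed: Dict[str, str] = {}
    for path in paths:
        try:
            detect([path])
        except KeyboardInterrupt:
            raise  # let Ctrl+C stop the whole batch
        except SystemExit as exc:
            # A CLI-style function may call sys.exit() instead of raising.
            failed[path] = f"exited with status {exc.code}"
        except Exception as exc:
            failed[path] = f"{type(exc).__name__}: {exc}"
        else:
            succeeded.append(path)
    return succeeded, failed
```

The second return value maps each failing path to a short message, which can be logged or shown to the user after the batch completes.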