# Utilities and Types

Utility functions for text processing and validation, plus the type definitions used throughout the jc library. Together they form the foundation for parser development and error handling.

## Exception Classes

Custom exception classes for jc-specific error handling.

```python { .api }
class ParseError(Exception):
    """
    General parsing error exception.

    Raised when a parser encounters data it cannot process correctly.
    Used by individual parsers to indicate parsing failures.
    """

class LibraryNotInstalled(Exception):
    """
    Missing library dependency exception.

    Raised when a parser requires an optional dependency that is not installed.
    Allows parsers to gracefully handle missing optional libraries.
    """
```

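As a rough sketch of how a parser might raise these two exceptions: the snippet below defines local stand-ins with the same names so that it runs without jc installed. The helpers `load_optional` and `parse_key_value` are hypothetical and are not part of jc.

```python
import importlib


# Local stand-ins that mirror the exception names above, so the sketch
# runs without jc installed. In real code, import them from jc.exceptions.
class ParseError(Exception):
    pass


class LibraryNotInstalled(Exception):
    pass


def load_optional(module_name: str):
    """Hypothetical helper: import an optional dependency or raise."""
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise LibraryNotInstalled(
            f'The {module_name} library is not installed.'
        ) from e


def parse_key_value(line: str) -> dict:
    """Hypothetical parser: split a 'key=value' line or raise ParseError."""
    if '=' not in line:
        raise ParseError(f'Not a key=value line: {line!r}')
    key, value = line.split('=', 1)
    return {key.strip(): value.strip()}
```

Raising a dedicated exception type lets callers separate "the input was bad" from "the environment is incomplete" without inspecting error messages.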
## Type Definitions

Type aliases and `TypedDict` definitions used throughout the jc library.

```python { .api }
from typing import Dict, List, Union, Optional, Iterable, Iterator, TypedDict, Any, Tuple
from types import ModuleType

# Basic type aliases
JSONDictType = Dict[str, Any]
CustomColorType = Dict[Any, str]
StreamingOutputType = Iterator[Union[JSONDictType, Tuple[BaseException, str]]]

# Parser metadata structure (Python 3.8+)
ParserInfoType = TypedDict('ParserInfoType', {
    "name": str,                  # Parser module name
    "argument": str,              # CLI argument form (--parser-name)
    "version": str,               # Parser version
    "description": str,           # Parser description
    "author": str,                # Parser author
    "author_email": str,          # Author email
    "compatible": List[str],      # Compatible platforms
    "magic_commands": List[str],  # Commands that trigger magic mode
    "tags": List[str],            # Parser categorization tags
    "documentation": str,         # Parser documentation
    "streaming": bool,            # Whether parser supports streaming
    "plugin": bool,               # Whether parser is a plugin
    "hidden": bool,               # Whether parser is hidden from lists
    "deprecated": bool            # Whether parser is deprecated
}, total=False)

# Timestamp format structure
TimeStampFormatType = TypedDict('TimeStampFormatType', {
    'id': int,               # Format identifier
    'format': str,           # strftime format string
    'locale': Optional[str]  # Locale for parsing
})
```

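To illustrate what `total=False` and the `StreamingOutputType` union mean in practice, here is a small sketch. It re-declares trimmed copies of the types locally so it runs without jc installed, and the `stream_lines` generator is a hypothetical example of one way the tuple member of the union could be used, not jc's own streaming logic.

```python
from typing import Any, Dict, Iterator, List, Tuple, TypedDict, Union

# Local, trimmed re-declarations so the sketch runs without jc installed.
JSONDictType = Dict[str, Any]
StreamingOutputType = Iterator[Union[JSONDictType, Tuple[BaseException, str]]]

ParserInfoType = TypedDict('ParserInfoType', {
    'name': str,
    'version': str,
    'tags': List[str],
    'streaming': bool,
}, total=False)

# total=False makes every key optional, so a partial dict still type-checks.
partial_info: ParserInfoType = {'name': 'my_parser', 'streaming': True}


def stream_lines(lines: List[str]) -> StreamingOutputType:
    """Hypothetical generator: yield a dict per line, or (exception, line)."""
    for line in lines:
        try:
            yield {'value': int(line)}
        except ValueError as e:
            # Hand the error and the offending line back to the caller
            yield (e, line)


results = list(stream_lines(['1', 'x', '3']))
```

At runtime a `TypedDict` is an ordinary `dict`, so missing keys are read with `.get()` as usual; the type only constrains what a static checker accepts.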
## Text Processing Utilities

Core utility functions for text manipulation and encoding handling.

```python { .api }
import sys
from typing import TextIO

def _asciify(string: str) -> str:
    """
    Convert Unicode string to ASCII with simple character replacements.

    Parameters:
    - string: Input Unicode string

    Returns:
    - ASCII-compatible string with non-ASCII characters converted
    """

def _safe_print(
    string: str,
    sep: str = ' ',
    end: str = '\n',
    file: TextIO = sys.stdout,
    flush: bool = False
) -> None:
    """
    Print output safely for both UTF-8 and ASCII encoding systems.

    Parameters:
    - string: String to print
    - sep: Separator string
    - end: String appended after the output
    - file: Output file object
    - flush: Whether to flush output
    """

def _safe_pager(string: str) -> None:
    """
    Display output in pager safely for both UTF-8 and ASCII systems.

    Parameters:
    - string: String to display in pager
    """
```

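The "try UTF-8, fall back to ASCII" idea behind these helpers can be sketched as follows. This is a minimal illustration rather than jc's implementation: `asciify_sketch` and `safe_print_sketch` are hypothetical names, and the escaping relies on Python's built-in `ascii()`.

```python
import sys
from typing import Optional, TextIO


def asciify_sketch(string: str) -> str:
    """Escape non-ASCII characters using Python's built-in ascii()."""
    # ascii() returns a quoted repr, so strip the surrounding quotes
    escaped = ascii(string)[1:-1]
    # ascii() also escapes newlines; turn them back into real newlines
    return escaped.replace('\\n', '\n')


def safe_print_sketch(string: str, file: Optional[TextIO] = None) -> None:
    """Print the string, falling back to the ASCII form on encoding errors."""
    target = sys.stdout if file is None else file
    try:
        print(string, file=target)
    except UnicodeEncodeError:
        print(asciify_sketch(string), file=target)
```

The sketch ignores edge cases such as literal backslashes in the input, which `ascii()` doubles.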
## Parser Development Utilities

Helper functions and constants for developing custom parsers.

```python { .api }
from typing import List

# Global quiet flag for suppressing warnings
CLI_QUIET: bool = False

def warning_message(message_lines: List[str]) -> None:
    """
    Display warning message unless quiet mode is enabled.

    Parameters:
    - message_lines: List of warning message lines to display
    """

def error_message(message_lines: List[str]) -> None:
    """
    Display error message to stderr.

    Parameters:
    - message_lines: List of error message lines to display
    """
```

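A minimal sketch of the quiet-flag pattern described above. The names `QUIET`, `format_message`, `warn_sketch`, and `error_sketch` are hypothetical, the message layout is invented for illustration, and sending warnings to stderr is an assumption rather than something the signatures above state.

```python
import sys
from typing import List

# Hypothetical stand-in for the module-level CLI_QUIET flag.
QUIET = False


def format_message(label: str, message_lines: List[str]) -> str:
    """Join lines under a label, indenting continuation lines to align."""
    first, *rest = message_lines or ['']
    indent = ' ' * (len(label) + 2)
    return '\n'.join([f'{label}: {first}'] + [indent + line for line in rest])


def warn_sketch(message_lines: List[str]) -> None:
    """Write a warning to stderr (assumed) unless the quiet flag is set."""
    if QUIET:
        return
    print(format_message('warning', message_lines), file=sys.stderr)


def error_sketch(message_lines: List[str]) -> None:
    """Write an error to stderr regardless of the quiet flag."""
    print(format_message('error', message_lines), file=sys.stderr)
```

Taking a list of lines rather than one string lets the helper control indentation of multi-line messages in a single place.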
## Date and Time Utilities

Functions for parsing and formatting timestamps and dates.

```python { .api }
from typing import Any, Dict, Optional, Union

def timestamp_convert(
    timestamp: Union[int, float, str],
    format_hint: Optional[str] = None
) -> Dict[str, Any]:
    """
    Convert various timestamp formats to structured datetime information.

    Parameters:
    - timestamp: Timestamp in various formats (epoch, ISO, etc.)
    - format_hint: Optional format hint for parsing

    Returns:
    - Dictionary with parsed datetime components
    """

def iso_datetime_parse(
    iso_string: str,
    timezone_aware: bool = True
) -> Dict[str, Any]:
    """
    Parse ISO 8601 datetime strings into structured format.

    Parameters:
    - iso_string: ISO 8601 formatted datetime string
    - timezone_aware: Whether to include timezone information

    Returns:
    - Dictionary with parsed datetime components
    """
```

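The source does not specify the shape of the returned dictionary, so the following is only a sketch of the general technique using the standard library. The function name `iso_parse_sketch` and the keys `iso`, `epoch`, and `timezone_aware` are assumptions made for illustration.

```python
from datetime import datetime
from typing import Any, Dict


def iso_parse_sketch(iso_string: str) -> Dict[str, Any]:
    """Parse an ISO 8601 string into a small dictionary (hypothetical keys)."""
    # datetime.fromisoformat() rejects a trailing 'Z' before Python 3.11
    if iso_string.endswith('Z'):
        iso_string = iso_string[:-1] + '+00:00'
    dt = datetime.fromisoformat(iso_string)
    aware = dt.tzinfo is not None
    return {
        'iso': dt.isoformat(),
        # Only report an epoch when the offset is known; a naive value
        # would otherwise be interpreted in the local timezone.
        'epoch': int(dt.timestamp()) if aware else None,
        'timezone_aware': aware,
    }


parsed = iso_parse_sketch('2024-01-02T03:04:05Z')
```

Returning `None` for the epoch of a naive timestamp avoids results that silently depend on the machine's local timezone.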
## Data Validation Utilities

Functions for validating and cleaning parser inputs and outputs.

```python { .api }
from typing import Any

def normalize_key(key: str) -> str:
    """
    Normalize dictionary keys for consistent output formatting.

    Parameters:
    - key: Raw key string

    Returns:
    - Normalized key string (lowercase, underscores, etc.)
    """

def clean_json_output(data: Any) -> Any:
    """
    Clean and normalize JSON output for consistency.

    Parameters:
    - data: Raw parsed data structure

    Returns:
    - Cleaned data structure suitable for JSON serialization
    """
```

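One plausible reading of "lowercase, underscores, etc." is sketched below. The exact rules jc applies are not given in this section, so `normalize_key_sketch` and `normalize_keys` are hypothetical helpers that show the general approach rather than the library's behavior.

```python
import re
from typing import Any


def normalize_key_sketch(key: str) -> str:
    """Lowercase the key and collapse non-alphanumeric runs to underscores."""
    key = key.strip().lower()
    key = re.sub(r'[^a-z0-9]+', '_', key)
    # Drop underscores left over at either end
    return key.strip('_')


def normalize_keys(data: Any) -> Any:
    """Apply normalize_key_sketch to every dict key in a nested structure."""
    if isinstance(data, dict):
        return {normalize_key_sketch(k): normalize_keys(v) for k, v in data.items()}
    if isinstance(data, list):
        return [normalize_keys(item) for item in data]
    return data
```

For example, `normalize_key_sketch('Total Size (MB)')` yields `'total_size_mb'`.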
## Plugin System Utilities

Functions supporting the plugin system for custom parser development.

```python { .api }
from types import ModuleType

def get_user_data_dir() -> str:
    """
    Get user data directory for jc plugins and configuration.

    Returns:
    - Path to user-specific jc data directory
    """

def validate_plugin_parser(parser_module: ModuleType) -> bool:
    """
    Validate that a module conforms to parser interface requirements.

    Parameters:
    - parser_module: Module to validate

    Returns:
    - True if module is a valid parser, False otherwise
    """
```

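The interface requirements are not spelled out here, so the sketch below assumes a parser module needs an `info` attribute and a callable `parse`. Both that assumption and the name `looks_like_parser` are illustrative only.

```python
from types import ModuleType


def looks_like_parser(module: ModuleType) -> bool:
    """Assumed check: the module has `info` and a callable `parse`."""
    return hasattr(module, 'info') and callable(getattr(module, 'parse', None))


# Build two throwaway modules in memory to exercise the check
good = ModuleType('good_parser')
good.info = {'name': 'good_parser'}
good.parse = lambda data, raw=False, quiet=False: {'data': data}

bad = ModuleType('bad_parser')  # has neither `info` nor `parse`
```

Checking with `hasattr` and `callable` keeps the validation free of side effects, since nothing in the candidate module is executed beyond its import.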
## Usage Examples

### Error Handling

```python
import subprocess

import jc
from jc.exceptions import ParseError, LibraryNotInstalled

# Capture the text output of the command to parse
dig_output = subprocess.check_output(['dig', 'example.com'], text=True)

try:
    data = jc.parse('dig', dig_output)
except ParseError as e:
    print(f"Parsing failed: {e}")
except LibraryNotInstalled as e:
    print(f"Missing dependency: {e}")
```

### Type Annotations

```python
from typing import List

import jc
from jc.jc_types import JSONDictType, ParserInfoType

def process_parser_output(data: JSONDictType) -> None:
    # Process parsed data with proper typing
    if 'timestamp' in data:
        print(f"Timestamp: {data['timestamp']}")

def get_all_parser_metadata() -> List[ParserInfoType]:
    # Function with proper return type annotation
    return jc.all_parser_info()
```

### Text Processing

```python
from jc.utils import _safe_print, _asciify

# Safe printing for different encodings
unicode_text = "Processing file: résumé.txt"
_safe_print(unicode_text)

# Convert unicode to ASCII when needed
ascii_text = _asciify(unicode_text)
print(ascii_text)  # "Processing file: r\xe9sum\xe9.txt"
```

### Custom Parser Development

```python
from jc.jc_types import JSONDictType, ParserInfoType
from jc.exceptions import ParseError
import jc.utils

# Parser metadata
info: ParserInfoType = {
    'name': 'my_parser',
    'description': 'Parse my custom format',
    'author': 'Your Name',
    'version': '1.0',
    'compatible': ['linux', 'darwin'],
    'tags': ['file']
}

def parse(data: str, quiet: bool = False, raw: bool = False) -> JSONDictType:
    """Parse custom data format"""
    try:
        # Parsing logic here
        result = {'parsed': True, 'data': data.strip()}
        return jc.utils.clean_json_output(result)
    except Exception as e:
        if not quiet:
            jc.utils.warning_message([f"Parse error: {e}"])
        # Chain the original exception so the traceback is preserved
        raise ParseError(f"Failed to parse data: {e}") from e
```

### Plugin Directory Setup

```python
import os
from jc.utils import get_user_data_dir

# Placeholder: the source text of your parser module goes here
custom_parser_code = '"""my_custom_parser: parser source goes here"""\n'

# Set up plugin directory
plugin_dir = os.path.join(get_user_data_dir(), 'jcparsers')
os.makedirs(plugin_dir, exist_ok=True)

# Create custom parser file
parser_file = os.path.join(plugin_dir, 'my_custom_parser.py')
with open(parser_file, 'w') as f:
    f.write(custom_parser_code)
```

Together, these utilities, exception classes, and type definitions underpin jc's extensible parser system, error handling, and cross-platform compatibility.