# File Processing and Batch Operations

Functions for processing individual files and directories with proper encoding detection, pattern matching, and batch operation support. These functions enable integration into larger workflows and tools.

## Capabilities

### File Processing

Core functions for processing individual files with proper encoding detection and error handling.

```python { .api }
def fix_file(
    filename: str,
    args: Mapping[str, Any],
    standard_out: IO[str] | None = None
) -> int:
    """
    Runs fix_code() on a file and returns exit status.

    Processes a single file with autoflake transformations, handling encoding
    detection, file I/O, and error reporting.

    Args:
        filename: Path to the Python file to process
        args: Dictionary of configuration options (same keys as fix_code parameters)
        standard_out: Optional output stream for results (defaults to stdout)

    Returns:
        Exit status code (0 for success, non-zero for errors)
    """
```

### File Discovery and Filtering

Functions for finding and filtering Python files based on patterns and criteria.

```python { .api }
def find_files(
    filenames: list[str],
    recursive: bool,
    exclude: Iterable[str]
) -> Iterable[str]:
    """
    Yields filenames that match the specified criteria.

    Discovers Python files from input paths, with support for recursive
    directory traversal and pattern-based exclusion.

    Args:
        filenames: List of file or directory paths to process
        recursive: Whether to recurse into subdirectories
        exclude: Glob patterns for files/directories to exclude

    Yields:
        Paths to Python files that match the criteria
    """
```
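
The traversal described in the docstring above can be approximated with the standard library alone. The following is a minimal sketch under stated assumptions, not autoflake's implementation: `walk_python_files` is a hypothetical helper, it recognizes Python files by the `.py` extension only, and it applies each exclusion pattern to both the full path and the base name.

```python
import fnmatch
import os
from collections.abc import Iterable, Iterator


def walk_python_files(
    paths: Iterable[str],
    recursive: bool,
    exclude: Iterable[str],
) -> Iterator[str]:
    """Hypothetical helper: yield .py files under paths, skipping excluded ones."""
    patterns = list(exclude)

    def excluded(path: str) -> bool:
        base = os.path.basename(path)
        return any(
            fnmatch.fnmatch(path, p) or fnmatch.fnmatch(base, p) for p in patterns
        )

    for path in paths:
        if os.path.isdir(path):
            if not recursive:
                continue  # directories are skipped unless recursion is requested
            for root, dirs, files in os.walk(path):
                # Prune excluded directories in place so os.walk never enters them.
                dirs[:] = sorted(
                    d for d in dirs if not excluded(os.path.join(root, d))
                )
                for name in sorted(files):
                    candidate = os.path.join(root, name)
                    if name.endswith(".py") and not excluded(candidate):
                        yield candidate
        elif not excluded(path):
            yield path  # explicitly named files are passed through
```

Pruning `dirs` in place is a design choice of this sketch: it keeps the walk from descending into excluded directories at all, rather than filtering their contents afterwards.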

```python { .api }
def is_python_file(filename: str) -> bool:
    """
    Returns True if filename refers to a Python source file.

    Checks both file extension (.py, .pyi) and shebang line to identify
    Python files, including executable scripts.

    Args:
        filename: Path to file to check

    Returns:
        True if file is identified as Python source code
    """
```
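
A two-stage check like the one described above (extension first, shebang second) might look as follows. This is only a sketch: `looks_like_python`, `SHEBANG_RE`, and the `max_bytes` limit are hypothetical names and values chosen for illustration, and the pattern autoflake actually uses may differ.

```python
import re

# Assumed pattern for illustration; the regex autoflake uses may differ.
SHEBANG_RE = re.compile(r"^#!.*\bpython")


def looks_like_python(filename: str, max_bytes: int = 128) -> bool:
    """Hypothetical helper: extension check first, then a shebang check."""
    if filename.endswith((".py", ".pyi")):
        return True
    try:
        with open(filename, "rb") as handle:
            # Only the first line matters, so read a bounded number of bytes.
            first_line = handle.readline(max_bytes).decode("utf-8", errors="replace")
    except OSError:
        return False  # unreadable paths and directories count as non-Python
    return bool(SHEBANG_RE.match(first_line))
```

Checking the extension before opening the file avoids any I/O for the common case.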

```python { .api }
def is_exclude_file(filename: str, exclude: Iterable[str]) -> bool:
    """
    Returns True if file matches any exclusion pattern.

    Uses glob pattern matching to determine if a file should be excluded
    from processing based on user-specified patterns.

    Args:
        filename: Path to file to check
        exclude: Iterable of glob patterns for exclusion

    Returns:
        True if file matches any exclusion pattern
    """
```
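
Glob matching of this kind is available in the standard library's `fnmatch` module. The sketch below illustrates how such patterns behave; it is an assumption about how the check could look, not a copy of autoflake's code. `matches_any` is a hypothetical name, and testing both the full path and the base name is an illustrative choice.

```python
import fnmatch
import os
from collections.abc import Iterable


def matches_any(filename: str, patterns: Iterable[str]) -> bool:
    """Hypothetical helper: True if the path or its base name matches a pattern."""
    base_name = os.path.basename(filename)
    return any(
        fnmatch.fnmatch(filename, pattern) or fnmatch.fnmatch(base_name, pattern)
        for pattern in patterns
    )


# In fnmatch, "*" also matches path separators, so "*/tests/*" covers nested paths.
print(matches_any("src/tests/unit/test_io.py", ["*/tests/*"]))  # True
print(matches_any("src/app/main.py", ["*/tests/*", "test_*.py"]))  # False
print(matches_any("src/app/test_main.py", ["test_*.py"]))  # True
```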

```python { .api }
def match_file(filename: str, exclude: Iterable[str]) -> bool:
    """
    Returns True if file is acceptable for processing.

    Combines Python file detection and exclusion pattern matching to
    determine if a file should be processed.

    Args:
        filename: Path to file to check
        exclude: Iterable of glob patterns for exclusion

    Returns:
        True if file should be processed (is Python and not excluded)
    """
```

### Encoding Detection and File I/O

Functions for handling file encoding detection and safe file operations.

```python { .api }
def open_with_encoding(
    filename: str,
    encoding: str | None,
    mode: str = "r",
    limit_byte_check: int = -1
) -> IO[str]:
    """
    Opens file with specified or detected encoding.

    Handles encoding detection and file opening with proper error handling
    for various encoding scenarios commonly found in Python projects.

    Args:
        filename: Path to file to open
        encoding: Specific encoding to use, or None for auto-detection
        mode: File open mode (default "r")
        limit_byte_check: Byte limit for encoding detection (-1 for no limit)

    Returns:
        Opened file handle with correct encoding
    """
```

```python { .api }
def detect_encoding(filename: str, limit_byte_check: int = -1) -> str:
    """
    Returns the detected encoding of a Python file.

    Uses Python's tokenize module to detect file encoding from PEP 263
    encoding declarations or defaults to utf-8.

    Args:
        filename: Path to file to analyze
        limit_byte_check: Byte limit for detection (-1 for no limit)

    Returns:
        Detected encoding name (e.g., 'utf-8', 'latin-1')
    """
```
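
The PEP 263 detection mentioned above is exposed by the standard library as `tokenize.detect_encoding`. The sketch below shows that call in isolation; `sniff_encoding` is a hypothetical helper, and the `latin-1` fallback on errors is an assumption made for illustration rather than documented behavior.

```python
import os
import tempfile
import tokenize


def sniff_encoding(filename: str) -> str:
    """Hypothetical helper: read the PEP 263 declaration, falling back on errors."""
    try:
        with open(filename, "rb") as handle:
            # detect_encoding reads at most two lines via the readline callable.
            encoding, _ = tokenize.detect_encoding(handle.readline)
        return encoding
    except (SyntaxError, LookupError, UnicodeDecodeError):
        return "latin-1"  # assumed fallback for illustration


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "legacy.py")
    with open(path, "wb") as handle:
        handle.write(b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n")
    print(sniff_encoding(path))  # iso-8859-1
```

Note that `tokenize` normalizes the declared name, so a `latin-1` declaration is reported as `iso-8859-1`.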

### Diff Generation

Functions for generating and formatting diffs showing changes made to code.

```python { .api }
def get_diff_text(
    old: Sequence[str],
    new: Sequence[str],
    filename: str
) -> str:
    """
    Returns unified diff text between old and new content.

    Generates a unified diff format showing changes made to a file,
    suitable for display in terminals or integration with version control.

    Args:
        old: Original lines of content
        new: Modified lines of content
        filename: File name to display in diff header
 
    Returns:
        Unified diff as a string
    """
```
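
Unified diffs over line sequences can be produced with the standard library's `difflib.unified_diff`. The following sketch shows the general technique; `unified_diff_text` is a hypothetical helper, and the `original/` and `fixed/` header prefixes are illustrative assumptions rather than a statement of what `get_diff_text` emits.

```python
import difflib
from collections.abc import Sequence


def unified_diff_text(old: Sequence[str], new: Sequence[str], filename: str) -> str:
    """Hypothetical helper: join difflib's unified diff lines into one string."""
    diff = difflib.unified_diff(
        old,
        new,
        fromfile="original/" + filename,  # header labels are illustrative
        tofile="fixed/" + filename,
    )
    return "".join(diff)


# Lines keep their trailing newlines, as produced by splitlines(keepends=True).
before = ["import os\n", "print('hi')\n"]
after = ["print('hi')\n"]
print(unified_diff_text(before, after, "example.py"))
```

When the two sequences are identical, `difflib.unified_diff` yields nothing, so the helper returns an empty string.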

### Text Processing Utilities

Utility functions for processing text and preserving formatting.

```python { .api }
def get_indentation(line: str) -> str:
    """
    Returns the leading whitespace from a line.

    Extracts indentation (spaces and tabs) to preserve Python code structure
    when modifying lines.

    Args:
        line: Source code line to analyze

    Returns:
        Leading whitespace characters
    """
```

```python { .api }
def get_line_ending(line: str) -> str:
    """
    Returns the line ending characters from a line.

    Detects and preserves original line endings (\\n, \\r\\n, \\r) to maintain
    file format consistency.

    Args:
        line: Source code line to analyze

    Returns:
        Line ending characters found in the line
    """
```
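
To show why these two utilities are useful together, the sketch below rewrites the body of a line while keeping its original indentation and line ending. `leading_whitespace` and `trailing_newline` are hypothetical stand-ins written for this example; their edge-case behavior (for instance on blank lines) is an assumption and may not match autoflake's functions.

```python
def leading_whitespace(line: str) -> str:
    """Hypothetical helper: return the run of whitespace that starts the line."""
    if not line.strip():
        return ""  # blank lines carry no meaningful indentation
    return line[: len(line) - len(line.lstrip())]


def trailing_newline(line: str) -> str:
    """Hypothetical helper: return the line-ending characters, if any."""
    stripped = line.rstrip("\r\n")
    return line[len(stripped):]


# Replace the statement but keep the mixed indentation and the CRLF ending.
line = "\t    value = compute()\r\n"
rebuilt = leading_whitespace(line) + "value = 0" + trailing_newline(line)
print(repr(rebuilt))  # '\t    value = 0\r\n'
```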

## Usage Examples

### Processing Single Files

```python
import autoflake

# Process a single file with basic options
exit_code = autoflake.fix_file(
    "example.py",
    {
        "remove_unused_variables": True,
        "remove_all_unused_imports": True,
        "in_place": True
    }
)

if exit_code == 0:
    print("File processed successfully")
else:
    print("Error processing file")
```

### Finding Files for Batch Processing

```python
import autoflake

# Find all Python files in a project, excluding tests
files = list(autoflake.find_files(
    ["src/", "lib/"],
    recursive=True,
    exclude=["*/test_*.py", "*/tests/*", "__pycache__"]
))

print(f"Found {len(files)} Python files to process")
```

### Custom File Detection

```python
import autoflake
import os

# Check if files are Python before processing
for filename in os.listdir("."):
    if autoflake.is_python_file(filename):
        if not autoflake.is_exclude_file(filename, ["__pycache__"]):
            print(f"Would process: {filename}")
```

### Encoding Handling

```python
import autoflake

# Detect encoding before processing
encoding = autoflake.detect_encoding("script.py")
print(f"File encoding: {encoding}")

# Open with detected encoding
with autoflake.open_with_encoding("script.py", encoding) as f:
    content = f.read()

# Process with encoding awareness
cleaned = autoflake.fix_code(content, remove_unused_variables=True)
```

### Diff Generation

```python
import autoflake

# Sample input; in practice this would be read from a file
source_code = "import os\n\nprint('hello')\n"

# Generate diff showing changes
original_lines = source_code.splitlines(keepends=True)
cleaned_code = autoflake.fix_code(source_code, remove_unused_variables=True)
cleaned_lines = cleaned_code.splitlines(keepends=True)

diff = autoflake.get_diff_text(original_lines, cleaned_lines, "example.py")
print(diff)
```

### Batch Processing with Error Handling

```python
import autoflake

def process_project(root_dir, config):
    """Process all Python files in a project directory."""
    files = autoflake.find_files(
        [root_dir],
        recursive=True,
        exclude=config.get("exclude", [])
    )

    results = {"success": 0, "errors": 0}

    for filename in files:
        try:
            exit_code = autoflake.fix_file(filename, config)
            if exit_code == 0:
                results["success"] += 1
            else:
                results["errors"] += 1
                print(f"Error processing {filename}")
        except Exception as e:
            results["errors"] += 1
            print(f"Exception processing {filename}: {e}")

    return results

# Usage
config = {
    "remove_unused_variables": True,
    "remove_all_unused_imports": True,
    "in_place": True
}

results = process_project("src/", config)
print(f"Processed {results['success']} files successfully, {results['errors']} errors")
```