# Target Management

Target management system for discovering, filtering, and processing files for Semgrep scanning. The target manager handles file discovery, language detection, exclusion patterns (including `.gitignore` rules), and file size limits.

## Capabilities

### Target Manager

Primary class for managing scan targets and file discovery.

```python { .api }
class TargetManager:
    """
    Manages target files for scanning with file discovery and filtering.

    Handles file discovery, language detection, gitignore patterns,
    and file size limits for efficient scanning.

    Attributes:
    - scanning_root (ScanningRoot): Root directory configuration
    - target_files (list): Discovered target files
    - filtered_files (FilteredFiles): Files excluded from scanning
    - include_patterns (list): Patterns for file inclusion
    - exclude_patterns (list): Patterns for file exclusion
    """
    def __init__(self, scanning_root, **kwargs): ...

    def get_all_targets(self): ...
    def filter_by_language(self, language): ...
    def filter_by_size(self, max_size): ...
    def apply_exclusions(self, patterns): ...
    def get_target_count(self): ...

class ScanningRoot:
    """
    Root directory configuration for scanning.

    Defines the base directory and scanning parameters
    for target discovery.

    Attributes:
    - path (str): Root directory path
    - respect_gitignore (bool): Whether to respect .gitignore files
    - baseline_handler (BaselineHandler): Handler for baseline comparison
    """
    def __init__(self, path, **kwargs): ...

    def get_path(self): ...
    def is_valid(self): ...
    def get_gitignore_patterns(self): ...

class TargetScanResult:
    """
    Results of target file scanning and discovery.

    Contains discovered files, exclusions, and metadata
    about the target discovery process.

    Attributes:
    - targets (list): List of Target objects
    - filtered_files (FilteredFiles): Excluded files with reasons
    - stats (dict): Discovery statistics
    """
    def __init__(self, targets, filtered_files): ...

    def get_target_count(self): ...
    def get_filtered_count(self): ...
    def get_total_size(self): ...
```
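
The stubs above do not show how discovery and filtering fit together. As a rough illustration only, the following standard-library sketch walks a directory and applies exclude patterns, include patterns, and a size cap, in that order. The function name `discover_sketch`, the ordering of the checks, and the reason strings other than `file_too_large` are assumptions made for this example; none of it is taken from Semgrep's implementation.

```python
from fnmatch import fnmatch
from pathlib import Path


def discover_sketch(root, include, exclude, max_bytes):
    """Split files under root into targets and exclusions (hypothetical helper)."""
    targets, excluded = [], {}
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(root).as_posix()
        # fnmatch's "*" also matches "/", which is looser than gitignore semantics.
        if any(fnmatch(rel, pattern) for pattern in exclude):
            excluded[rel] = "matched_exclude_pattern"
        elif not any(fnmatch(path.name, pattern) for pattern in include):
            excluded[rel] = "not_included"
        elif path.stat().st_size > max_bytes:
            excluded[rel] = "file_too_large"
        else:
            targets.append(rel)
    return targets, excluded
```

Recording a reason for every skipped file, rather than silently dropping it, is what makes a per-reason summary possible later.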

### Target Information

Classes for representing individual scan targets.

```python { .api }
class Target:
    """
    Represents a single file target for scanning.

    Attributes:
    - path (str): File path relative to scanning root
    - language (str): Detected programming language
    - size (int): File size in bytes
    - encoding (str): File encoding (utf-8, etc.)
    """
    def __init__(self, path, language=None): ...

    def get_path(self): ...
    def get_language(self): ...
    def get_size(self): ...
    def is_binary(self): ...

class TargetInfo:
    """
    Additional metadata about target files.

    Provides extended information about files including
    modification times, permissions, and content analysis.

    Attributes:
    - target (Target): Associated target file
    - last_modified (datetime): File modification time
    - permissions (str): File permissions
    - line_count (int): Number of lines in file
    """
    def __init__(self, target): ...

    def get_metadata(self): ...
    def analyze_content(self): ...
```
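
To make the `TargetInfo` attributes more concrete, here is a minimal sketch of how such metadata could be gathered with the standard library. `collect_metadata` is a hypothetical helper, and returning a plain dict with these keys is an assumption; the actual return shape of `get_metadata()` is not specified above.

```python
import stat
from datetime import datetime
from pathlib import Path


def collect_metadata(path):
    """Return a dict of file metadata (hypothetical stand-in for get_metadata)."""
    file_path = Path(path)
    info = file_path.stat()
    # Read as bytes so that files with unknown encodings do not raise decode errors.
    with file_path.open("rb") as handle:
        line_count = sum(1 for _ in handle)
    return {
        "size": info.st_size,
        "last_modified": datetime.fromtimestamp(info.st_mtime),
        "permissions": stat.filemode(info.st_mode),
        "line_count": line_count,
    }
```

Counting lines in binary mode trades accuracy on exotic line endings for robustness against encoding problems, which seems a reasonable default for a discovery pass.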

### File Filtering

Classes for managing file filtering and exclusions.

```python { .api }
class FilteredFiles:
    """
    Collection of files excluded from scanning with reasons.

    Tracks files that were excluded and the reasons for exclusion
    such as size limits, language filters, or ignore patterns.

    Attributes:
    - excluded_files (list): List of excluded file paths
    - exclusion_reasons (dict): Mapping of files to exclusion reasons
    - patterns_matched (dict): Which patterns matched each file
    """
    def __init__(self): ...

    def add_excluded_file(self, path, reason): ...
    def get_excluded_count(self): ...
    def get_exclusions_by_reason(self, reason): ...
    def get_exclusion_summary(self): ...

class FileErrorLog:
    """
    Log of file processing errors during target discovery.

    Tracks files that couldn't be processed due to errors
    like permission issues, encoding problems, or corruption.

    Attributes:
    - error_files (list): Files with processing errors
    - error_details (dict): Detailed error information
    """
    def __init__(self): ...

    def add_error(self, path, error): ...
    def get_error_count(self): ...
    def get_errors_by_type(self, error_type): ...

class FileTargetingLog:
    """
    Log of file targeting decisions and statistics.

    Provides detailed logging of the file discovery process
    including timing, patterns matched, and decisions made.

    Attributes:
    - discovered_files (int): Total files discovered
    - included_files (int): Files included for scanning
    - excluded_files (int): Files excluded from scanning
    - timing_stats (dict): Performance timing information
    """
    def __init__(self): ...

    def log_discovery(self, file_count): ...
    def log_inclusion(self, path, reason): ...
    def log_exclusion(self, path, reason): ...
    def get_summary(self): ...
```
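
The `FilteredFiles` interface is essentially a mapping from exclusion reasons to file paths. The sketch below shows one way such a collection might behave. `ExclusionTracker` is a hypothetical stand-in written for this example, and grouping paths by reason internally is an assumption rather than a description of the real class.

```python
from collections import defaultdict


class ExclusionTracker:
    """Hypothetical stand-in for FilteredFiles: records why each file was skipped."""

    def __init__(self):
        self._by_reason = defaultdict(list)

    def add_excluded_file(self, path, reason):
        self._by_reason[reason].append(path)

    def get_excluded_count(self):
        return sum(len(paths) for paths in self._by_reason.values())

    def get_exclusions_by_reason(self, reason):
        # .get avoids creating an empty entry for a reason that was never recorded.
        return list(self._by_reason.get(reason, []))

    def get_exclusion_summary(self):
        return {reason: len(paths) for reason, paths in self._by_reason.items()}
```

Keying on the reason keeps both per-reason lookups and the summary cheap; a lookup by file path would need a second index.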

## Usage Examples

### Basic Target Management

```python
from semgrep.target_manager import TargetManager, ScanningRoot

# Create scanning root
root = ScanningRoot(
    path="./src",
    respect_gitignore=True
)

# Create target manager
target_manager = TargetManager(
    scanning_root=root,
    max_target_bytes=1000000,  # 1MB limit
    include_patterns=["*.py", "*.js"],
    exclude_patterns=["**/node_modules/**"]
)

# Discover targets
targets = target_manager.get_all_targets()
print(f"Found {len(targets)} files to scan")

# Filter by language
python_targets = target_manager.filter_by_language("python")
print(f"Python files: {len(python_targets)}")
```
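
Filtering by language presupposes some way of assigning a language to each file. This document does not say how detection works, so the following is only a sketch under the assumption that the file extension decides. The mapping table and both function names are illustrative and are not Semgrep's actual language table or API.

```python
from pathlib import PurePosixPath

# Illustrative mapping only; not Semgrep's actual language table.
EXTENSION_LANGUAGES = {
    ".py": "python",
    ".js": "javascript",
    ".ts": "typescript",
    ".go": "go",
}


def guess_language(path):
    """Return a language name for path, or None if the extension is unknown."""
    return EXTENSION_LANGUAGES.get(PurePosixPath(path).suffix.lower())


def filter_paths_by_language(paths, language):
    """Keep only the paths whose guessed language equals language."""
    return [p for p in paths if guess_language(p) == language]
```

Extension matching is cheap but cannot classify extensionless scripts; a real detector would likely need further signals such as shebang lines.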

### Advanced Target Filtering

```python
from semgrep.target_manager import TargetManager, FilteredFiles

# Reuses the `target_manager` instance created in the
# Basic Target Management example.

# Get target discovery results.
# Note: discover_targets() is not listed in the TargetManager API above;
# it is assumed here to return a TargetScanResult.
result = target_manager.discover_targets()

# Examine excluded files
filtered = result.filtered_files
print(f"Excluded {filtered.get_excluded_count()} files")

# Get exclusions by reason
size_exclusions = filtered.get_exclusions_by_reason("file_too_large")
print(f"Files excluded for size: {len(size_exclusions)}")

# Get exclusion summary
summary = filtered.get_exclusion_summary()
for reason, count in summary.items():
    print(f"{reason}: {count} files")
```

### Working with Individual Targets

```python
from semgrep.target_manager import Target, TargetInfo

# Create target
target = Target(
    path="src/main.py",
    language="python"
)

# Get target information
info = TargetInfo(target)
metadata = info.get_metadata()

print(f"File: {target.get_path()}")
print(f"Language: {target.get_language()}")
print(f"Size: {target.get_size()} bytes")
print(f"Last modified: {metadata['last_modified']}")
print(f"Line count: {metadata['line_count']}")
```