# Rules and Matches

Classes and functions for working with semgrep rules and processing scan results, including rule validation and match filtering.

## Capabilities

### Rule Management

Core rule representation and manipulation.

```python { .api }
class Rule:
    """
    Represents a semgrep rule with all its properties.

    Attributes:
    - id (str): Unique rule identifier
    - message (str): Human-readable description of what the rule finds
    - languages (list): Programming languages this rule applies to
    - severity (str): Rule severity level (INFO, WARNING, ERROR)
    - pattern (str): Primary pattern to match
    - patterns (list): Complex pattern combinations
    - metadata (dict): Additional rule metadata
    - paths (dict): File path inclusion/exclusion patterns
    - fix (str): Suggested fix for autofix functionality
    """
    def __init__(self, raw: Dict[str, Any], yaml: Optional[YamlTree[YamlMap]] = None) -> None: ...

    def validate(self): ...
    def get_languages(self): ...
    def matches_language(self, language): ...
    def get_severity(self): ...
```
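
The language helpers listed above can be pictured with a small stand-in class. This is a minimal sketch rather than semgrep's implementation: `SketchRule` is a hypothetical name, it wraps only a plain dictionary, and the case-insensitive language comparison is an assumption made for illustration.

```python
from typing import Any, Dict, List


class SketchRule:
    """Hypothetical stand-in that wraps a raw rule dictionary."""

    def __init__(self, raw: Dict[str, Any]) -> None:
        self._raw = raw

    @property
    def id(self) -> str:
        return str(self._raw["id"])

    def get_languages(self) -> List[str]:
        # Return a copy so callers cannot mutate the rule's own list.
        return list(self._raw.get("languages", []))

    def matches_language(self, language: str) -> bool:
        # Assumption: language names are compared case-insensitively.
        wanted = language.lower()
        return any(lang.lower() == wanted for lang in self.get_languages())

    def get_severity(self) -> str:
        return str(self._raw["severity"])


rule = SketchRule(
    {
        "id": "hardcoded-password",
        "languages": ["python"],
        "severity": "ERROR",
    }
)
print(rule.matches_language("Python"))
print(rule.matches_language("go"))
```

Keeping the raw dictionary private and exposing read-only accessors mirrors the `raw` argument of `Rule.__init__`, but the real class may store and normalize its data differently.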

### Match Processing

Classes for representing and processing scan results.

```python { .api }
class RuleMatch:
    """
    Represents a single finding from a semgrep scan.

    Attributes:
    - match (dict): Core match data with location and content
    - message (str): Human-readable match message
    - severity (str): Match severity (INFO, WARNING, ERROR)
    - metadata (dict): Match metadata and extra information
    - path (str): File path where match was found
    - start (dict): Start position (line, col, offset)
    - end (dict): End position (line, col, offset)
    - extra (dict): Additional match information
    - check_id (str): Rule ID that generated this match
    - fix (str): Suggested fix text if available
    """
    def __init__(self, match_dict): ...

    def get_lines(self): ...
    def get_code_snippet(self): ...
    def has_fix(self): ...
    def to_dict(self): ...

class RuleMatches:
    """
    Collection of rule matches with filtering and sorting capabilities.

    Methods for filtering by severity, file patterns, and rule IDs.
    """
    def __init__(self, matches=None): ...

    def add_match(self, match): ...
    def filter_by_severity(self, severities): ...
    def filter_by_path_pattern(self, pattern): ...
    def filter_by_rule_ids(self, rule_ids): ...
    def sort_by_file(self): ...
    def sort_by_severity(self): ...
    def to_dict(self): ...
```
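
To make the filtering and sorting methods concrete, the sketch below applies the same ideas to plain dictionaries. It is an illustration under stated assumptions, not the library's code: the free functions, the `SEVERITY_ORDER` ranking (most severe first), and the use of shell-style globs via `fnmatch` are all choices made here.

```python
from fnmatch import fnmatch
from typing import Any, Dict, Iterable, List

Finding = Dict[str, Any]

# Assumed ordering, most severe first; not taken from the library.
SEVERITY_ORDER = {"ERROR": 0, "WARNING": 1, "INFO": 2}


def filter_by_severity(findings: Iterable[Finding], severities: Iterable[str]) -> List[Finding]:
    wanted = set(severities)
    return [f for f in findings if f["severity"] in wanted]


def filter_by_path_pattern(findings: Iterable[Finding], pattern: str) -> List[Finding]:
    # Assumption: patterns are shell-style globs such as "src/*.py".
    return [f for f in findings if fnmatch(f["path"], pattern)]


def sort_by_severity(findings: Iterable[Finding]) -> List[Finding]:
    # sorted() is stable, so findings of equal severity keep their input order.
    return sorted(findings, key=lambda f: SEVERITY_ORDER[f["severity"]])


findings = [
    {"check_id": "r1", "path": "src/app.py", "severity": "INFO"},
    {"check_id": "r2", "path": "src/db.py", "severity": "ERROR"},
    {"check_id": "r3", "path": "tests/test_app.py", "severity": "WARNING"},
]

errors = filter_by_severity(findings, ["ERROR"])
in_src = filter_by_path_pattern(findings, "src/*.py")
ranked = sort_by_severity(findings)
print([f["check_id"] for f in ranked])
```

Each helper returns a new list and leaves its input untouched, so filters can be chained in any order.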

### Rule Validation

Functions for validating rule syntax and schema compliance.

```python { .api }
def validate_single_rule(rule_dict):
    """
    Validate a single rule against the semgrep schema.

    Parameters:
    - rule_dict (dict): Rule definition to validate

    Returns:
        Rule: Validated rule object

    Raises:
        InvalidRuleSchemaError: If rule validation fails
    """

def validate_rule_schema(rules):
    """
    Validate multiple rules against schema.

    Parameters:
    - rules (list): List of rule dictionaries

    Returns:
        list: List of validated Rule objects

    Raises:
        InvalidRuleSchemaError: If any rule validation fails
    """
```
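
The validate-or-raise contract described above might look roughly like the following. This is a simplified sketch, not the real schema check: `check_rule`, `check_rules`, and `SketchRuleSchemaError` are hypothetical names, and the required keys and allowed severities are assumptions drawn from the attribute lists in this document.

```python
from typing import Any, Dict, List

# Assumed minimal set of required keys, mirroring the Rule attributes above.
REQUIRED_KEYS = ("id", "message", "languages", "severity")
ALLOWED_SEVERITIES = {"INFO", "WARNING", "ERROR"}


class SketchRuleSchemaError(ValueError):
    """Hypothetical local stand-in for InvalidRuleSchemaError."""


def check_rule(rule_dict: Dict[str, Any]) -> Dict[str, Any]:
    missing = [key for key in REQUIRED_KEYS if key not in rule_dict]
    if missing:
        raise SketchRuleSchemaError(f"missing required keys: {', '.join(missing)}")
    if rule_dict["severity"] not in ALLOWED_SEVERITIES:
        raise SketchRuleSchemaError(f"unknown severity: {rule_dict['severity']!r}")
    return rule_dict


def check_rules(rules: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Stops at the first invalid rule, so one bad rule rejects the whole batch.
    return [check_rule(rule) for rule in rules]


try:
    check_rules([{"id": "no-message", "languages": ["python"], "severity": "ERROR"}])
except SketchRuleSchemaError as exc:
    print(f"Rejected: {exc}")
```

Raising on the first failure keeps callers from working with a partially validated rule set, which matches the "if any rule validation fails" wording above.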

## Types

### Rule Pattern Types

```python { .api }
class PatternType:
    """
    Enumeration of pattern types used in rules.

    Values:
    - PATTERN: Simple pattern matching
    - PATTERN_EITHER: Match any of multiple patterns
    - PATTERN_ALL: Match all patterns
    - PATTERN_NOT: Exclude matches for pattern
    - PATTERN_INSIDE: Pattern must be inside another pattern
    - PATTERN_REGEX: Regular expression pattern
    """
    PATTERN = "pattern"
    PATTERN_EITHER = "pattern-either"
    PATTERN_ALL = "pattern-all"
    PATTERN_NOT = "pattern-not"
    PATTERN_INSIDE = "pattern-inside"
    PATTERN_REGEX = "pattern-regex"

class SeverityLevel:
    """
    Rule and match severity levels.

    Values:
    - INFO: Informational findings
    - WARNING: Potential issues that should be reviewed
    - ERROR: Definite issues that should be fixed
    """
    INFO = "INFO"
    WARNING = "WARNING"
    ERROR = "ERROR"
```
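
Because each constant above is the key under which that operator appears in a rule, the values can be used to inspect a rule dictionary. The sketch below recasts the constants as a standard-library `Enum` for that purpose; `SketchPatternType` and `pattern_types_in` are hypothetical names, and whether the real `PatternType` is an `Enum` is not stated in this document.

```python
from enum import Enum
from typing import Any, Dict, List


class SketchPatternType(str, Enum):
    """Hypothetical Enum version of the PatternType constants above."""

    PATTERN = "pattern"
    PATTERN_EITHER = "pattern-either"
    PATTERN_ALL = "pattern-all"
    PATTERN_NOT = "pattern-not"
    PATTERN_INSIDE = "pattern-inside"
    PATTERN_REGEX = "pattern-regex"


def pattern_types_in(rule_dict: Dict[str, Any]) -> List[SketchPatternType]:
    """Return the pattern types that appear as top-level keys of a rule."""
    return [ptype for ptype in SketchPatternType if ptype.value in rule_dict]


rule_dict = {
    "id": "demo",
    "pattern-either": [{"pattern": "eval(...)"}, {"pattern": "exec(...)"}],
}
found = pattern_types_in(rule_dict)
print([ptype.name for ptype in found])

# Looking a member up by value fails loudly on unknown operator names.
print(SketchPatternType("pattern-regex") is SketchPatternType.PATTERN_REGEX)
```

Only top-level keys are inspected here, so the nested `pattern` entries inside `pattern-either` are not reported; a recursive walk would be needed to find those.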

### Match Location Types

```python { .api }
class Position:
    """
    Represents a position in source code.

    Attributes:
    - line (int): Line number (1-based)
    - col (int): Column number (1-based)
    - offset (int): Character offset from file start
    """
    line: int
    col: int
    offset: int

class Location:
    """
    Represents a location span in source code.

    Attributes:
    - start (Position): Start position
    - end (Position): End position
    - path (str): File path
    """
    start: Position
    end: Position
    path: str
```
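
The 1-based line numbers above need a small adjustment before they can index into a list of lines. The following sketch shows one way to pull the source lines covered by a location; `SketchPosition`, `SketchLocation`, and `lines_for` are hypothetical names built with `dataclasses`, and treating the end position as lying just past the last matched character is an assumption.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SketchPosition:
    line: int    # 1-based
    col: int     # 1-based
    offset: int  # characters from the start of the file


@dataclass(frozen=True)
class SketchLocation:
    start: SketchPosition
    end: SketchPosition
    path: str


def lines_for(location: SketchLocation, source: str) -> List[str]:
    """Return every source line touched by the location, inclusive."""
    all_lines = source.splitlines()
    # Convert the 1-based start line to a 0-based index; the end line
    # stays as-is because slice upper bounds are exclusive.
    return all_lines[location.start.line - 1 : location.end.line]


source = 'import os\npassword = "hunter2"\nprint(password)\n'
loc = SketchLocation(
    # Assumption: end col/offset point just past the last matched character.
    start=SketchPosition(line=2, col=1, offset=10),
    end=SketchPosition(line=2, col=21, offset=30),
    path="app.py",
)
print(lines_for(loc, source))
print(source[loc.start.offset : loc.end.offset])
```

Line-based slicing returns whole lines, while the offset-based slice returns only the matched span; which one a snippet helper should use depends on how much context the reader needs.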

## Usage Examples

### Working with Rules

```python
from semgrep.rule import Rule
from semgrep.config_resolver import validate_single_rule

# Create rule from dictionary
rule_dict = {
    "id": "hardcoded-password",
    "pattern": "password = \"$VALUE\"",
    "message": "Hardcoded password found",
    "languages": ["python"],
    "severity": "ERROR"
}

rule = validate_single_rule(rule_dict)
print(f"Rule ID: {rule.id}")
print(f"Languages: {rule.get_languages()}")
print(f"Severity: {rule.get_severity()}")
```

### Processing Scan Results

```python
from semgrep.rule_match import RuleMatches, SeverityLevel

# Assume we already have scan results; run_scan_and_return_json,
# target_manager, and config are placeholders defined elsewhere
results = run_scan_and_return_json(target_manager, config)

# Create RuleMatches collection
matches = RuleMatches(results.get('results', []))

# Filter high severity matches
high_severity_matches = matches.filter_by_severity([SeverityLevel.ERROR])

# Process each match
for match in high_severity_matches:
    print(f"File: {match.path}")
    print(f"Line: {match.start['line']}")
    print(f"Message: {match.message}")
    print(f"Code: {match.get_code_snippet()}")

    if match.has_fix():
        print(f"Suggested fix: {match.fix}")
```
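
When only the JSON dictionary is at hand, the same kind of summary can be built without the wrapper classes. The sketch below assumes each entry in `results` carries `check_id`, `path`, `start`, and an `extra` object holding `message` and `severity`; the sample data is hand-written for illustration and is not real scan output, and both helper names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Any, Dict, List

# Hand-written sample in the assumed shape; not produced by a real scan.
sample = {
    "results": [
        {
            "check_id": "hardcoded-password",
            "path": "src/db.py",
            "start": {"line": 12, "col": 1, "offset": 240},
            "extra": {"message": "Hardcoded password found", "severity": "ERROR"},
        },
        {
            "check_id": "debug-print",
            "path": "src/db.py",
            "start": {"line": 30, "col": 5, "offset": 702},
            "extra": {"message": "Leftover debug print", "severity": "INFO"},
        },
        {
            "check_id": "hardcoded-password",
            "path": "src/app.py",
            "start": {"line": 4, "col": 1, "offset": 58},
            "extra": {"message": "Hardcoded password found", "severity": "ERROR"},
        },
    ]
}


def count_by_severity(results: List[Dict[str, Any]]) -> Counter:
    return Counter(r["extra"]["severity"] for r in results)


def group_by_path(results: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    grouped: Dict[str, List[Dict[str, Any]]] = defaultdict(list)
    for r in results:
        grouped[r["path"]].append(r)
    return dict(grouped)


results = sample.get("results", [])
print(count_by_severity(results))
for path, found in sorted(group_by_path(results).items()):
    lines = [r["start"]["line"] for r in found]
    print(f"{path}: {len(found)} finding(s) on lines {lines}")
```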