# Package Parsing

Extraction and parsing of package specifications from requirements files, handling various formats including extras, comments, and pip options.

## Capabilities

### Package Detection from Requirements Files

Parses packages from requirements files, extracting clean package specifications while filtering out comments, pip options, and other non-package lines.

```python { .api }
class PackagesDetector:
    def __init__(self, requirements_files):
        """
        Initialize packages detector with requirements files.

        Args:
            requirements_files (list): List of requirements file paths to parse
        """

    def get_packages(self):
        """
        Return the list of detected packages from all requirements files.

        Returns:
            list: List of package specification strings (e.g., "django==3.2.0")
        """

    def detect_packages(self, requirements_files):
        """
        Parse packages from all provided requirements files.

        Args:
            requirements_files (list): List of requirements file paths to process

        Side effects:
            Populates self.packages with parsed package specifications
        """
```

### Requirements Line Processing

Processes individual lines from requirements files, handling various formats and filtering out non-package content.

```python { .api }
def _process_req_line(self, line):
    """
    Process a single line from a requirements file.

    Args:
        line (str): Single line from requirements file

    Processing rules:
        - Skip empty lines and whitespace-only lines
        - Skip comment lines (starting with #)
        - Skip pip options (-f, --find-links, -i, --index-url, etc.)
        - Handle inline comments by extracting package part
        - Add valid package specifications to self.packages
    """
```
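
The processing rules above can be sketched as a standalone function. This is a minimal illustration rather than the library's implementation: `process_req_line` and `PIP_OPTION_PREFIXES` are hypothetical names, and the prefix list is assumed from the options listed in this document.

```python
# Hypothetical prefix list, assumed from the pip options named in this document.
PIP_OPTION_PREFIXES = (
    "-f", "--find-links",
    "-i", "--index-url", "--extra-index-url", "--no-index",
    "-r", "-Z", "--always-unzip",
)


def process_req_line(line, packages):
    """Append the package specification found in `line` to `packages`, if any."""
    line = line.strip()
    if not line:
        return  # empty or whitespace-only line
    if line.startswith("#"):
        return  # full-line comment
    if line.startswith(PIP_OPTION_PREFIXES):
        return  # pip option, not a package
    # Inline comment: keep only the text before the first '#'.
    spec = line.split("#", 1)[0].strip()
    if spec:
        packages.append(spec)


packages = []
sample_lines = [
    "django==3.2.0  # web framework\n",
    "\n",
    "# a comment\n",
    "-i https://pypi.org/simple/\n",
    "  celery[redis]==5.1.0  \n",
]
for raw in sample_lines:
    process_req_line(raw, packages)

print(packages)  # ['django==3.2.0', 'celery[redis]==5.1.0']
```

Passing the list in explicitly keeps the sketch testable without a class; the documented method appends to `self.packages` instead.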

## Supported Package Formats

### Standard Package Specifications

```
django==3.2.0       # Exact version pin
requests>=2.25.1    # Minimum version (note: only == is processed for upgrades)
flask~=2.0.0        # Compatible version
```

### Package Extras

```
django[rest]==3.2.0                 # Package with extras
requests[security,socks]==2.25.1    # Multiple extras
```

### Complex Specifications

```
git+https://github.com/user/repo.git#egg=package    # VCS packages (filtered out)
-e .                                                # Editable installs (filtered out)
./local/package                                     # Local packages (filtered out)
```

## Filtered Content

The parser filters out pip options, comments, and empty lines.

### Pip Options (Filtered Out)

```
-f http://example.com/packages/    # --find-links
--find-links http://example.com/
-i http://pypi.example.com/        # --index-url
--index-url http://pypi.example.com/
--extra-index-url http://extra.com/
--no-index
-r other-requirements.txt          # -r inclusions (handled by RequirementsDetector)
-Z                                 # --always-unzip
--always-unzip
```

### Comments and Inline Comments

```
# This is a comment line              # Filtered out
django==3.2.0  # This is inline       # Extracts: django==3.2.0
```

### Empty Lines

```
                    # Filtered out

django==3.2.0       # Processed
                    # Filtered out
```

## Usage Examples

### Basic Package Detection

```python
from pip_upgrader.packages_detector import PackagesDetector

# Parse packages from requirements files
filenames = ['requirements.txt', 'requirements/dev.txt']
detector = PackagesDetector(filenames)
packages = detector.get_packages()

print(packages)
# Output: ['django==3.2.0', 'requests==2.25.1', 'pytest==6.2.4']
```

### Example Requirements File Processing

Given a requirements file:

```
# requirements.txt
django==3.2.0
requests>=2.25.1  # HTTP library
flask~=2.0.0

# Development dependencies
-r requirements/base.txt
pytest==6.2.4
black==21.5.4  # Code formatter

# Index configuration
-i https://pypi.org/simple/
--extra-index-url https://test.pypi.org/simple/

# Empty line above and below

celery[redis]==5.1.0
```

`PackagesDetector` would extract:

```python
[
    'django==3.2.0',
    'requests>=2.25.1',
    'flask~=2.0.0',
    'pytest==6.2.4',
    'black==21.5.4',
    'celery[redis]==5.1.0'
]
```

### Integration with Requirements Detection

```python
from pip_upgrader.requirements_detector import RequirementsDetector
from pip_upgrader.packages_detector import PackagesDetector

# First detect requirements files
req_detector = RequirementsDetector(None)  # Auto-detect
filenames = req_detector.get_filenames()

# Then parse packages from those files
pkg_detector = PackagesDetector(filenames)
packages = pkg_detector.get_packages()

print(f"Found {len(packages)} packages in {len(filenames)} files")
```

## Line Processing Details

### Comment Handling

The parser handles full-line, indented, and inline comments:

```python
# Full line comment - skipped
django==3.2.0  # Package with inline comment - extracts package
    # Indented comment - skipped
django==3.2.0  # comment with multiple # symbols - extracts package
```
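
One way to cover these cases is a single regular expression. This document does not say how the parser locates the comment marker, so the sketch below follows pip's documented rule instead, where `#` starts a comment only at the beginning of a line or after whitespace. `COMMENT_RE` and `strip_comment` are hypothetical names.

```python
import re

# '#' starts a comment at the start of the line or when preceded by whitespace.
COMMENT_RE = re.compile(r"(^|\s+)#.*$")


def strip_comment(line):
    """Return `line` without its comment and without surrounding whitespace."""
    return COMMENT_RE.sub("", line).strip()


examples = [
    "# Full line comment",
    "django==3.2.0  # inline comment",
    "    # Indented comment",
    "django==3.2.0  # comment with multiple # symbols",
]
for text in examples:
    print(repr(strip_comment(text)))
```

Under this rule a `#` that is not preceded by whitespace, such as the `#egg=` fragment of a VCS URL, is left intact, whereas splitting on the first `#` would truncate it.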

### Whitespace Handling

```python
"django==3.2.0"        # Standard format
" django==3.2.0 "      # Leading/trailing whitespace stripped
"\tdjango==3.2.0\n"    # Tabs and newlines handled
```

### Pip Options Detection

The parser identifies and filters pip options using prefix matching:

```python
# These lines are filtered out:
"-f http://example.com"
"--find-links http://example.com"
"-i http://pypi.example.com"
"--index-url http://pypi.example.com"
"--extra-index-url http://extra.com"
"--no-index"
"-r base.txt"
"-Z"
"--always-unzip"
```

## Error Handling

- **File read errors**: Handled by the calling code (RequirementsDetector validates files first)
- **Invalid package specifications**: Not validated; such lines are added as-is and may cause failures in later pipeline stages
- **Encoding issues**: Files are read with the default encoding, so encoding errors may cause parsing failures
- **Empty files**: Handled gracefully; the result is an empty packages list
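
Callers who want to avoid the default-encoding pitfall can read the files themselves before parsing. The following is a sketch under assumptions, not part of the documented API: `read_requirement_lines` is a hypothetical helper, and reading as UTF-8 is an assumed choice.

```python
import os
import tempfile


def read_requirement_lines(path, encoding="utf-8"):
    """Return the lines of a requirements file, or [] if it cannot be read or decoded."""
    try:
        with open(path, encoding=encoding) as handle:
            return handle.read().splitlines()
    except (OSError, UnicodeDecodeError) as exc:
        # Report and skip instead of aborting the whole run.
        print(f"Skipping {path}: {exc}")
        return []


# Demonstration with a temporary file that contains a non-ASCII comment.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "requirements.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("# dépendances\ndjango==3.2.0\n")
    lines = read_requirement_lines(path)
    missing = read_requirement_lines(os.path.join(tmp, "missing.txt"))

print(lines)
print(missing)
```

Returning an empty list for unreadable files mirrors the empty-file behavior described above; re-raising the exception is the stricter alternative when a missing file should stop the run.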

## Output Format

The detector returns a list of package specification strings exactly as they appear in the requirements files, with comments and surrounding whitespace removed. The format is compatible with pip and follows the requirements file specification:

- `package==1.0.0` (exact version)
- `package>=1.0.0` (minimum version)
- `package~=1.0.0` (compatible version)
- `package[extra1,extra2]==1.0.0` (with extras)

Note: For upgrade detection, the system primarily works with `==` specifications, as these provide the current pinned version needed for comparison with available updates.
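
To illustrate the note, a consumer of the returned list might separate exact pins from other specifiers as follows. `split_pinned` is a hypothetical helper, and dropping extras from the name is an assumption; this document does not describe how later pipeline stages treat them.

```python
def split_pinned(spec):
    """Return (name, version) for an exact pin such as 'name==1.0', else None."""
    requirement = spec.split(";", 1)[0]  # drop any environment marker
    name, sep, version = requirement.partition("==")
    version = version.strip()
    # Reject specs without '==' and the '===' arbitrary-equality operator.
    if not sep or not version or version.startswith("="):
        return None
    name = name.split("[", 1)[0].strip()  # drop extras such as '[redis]'
    if not name:
        return None
    return name, version


specs = ["django==3.2.0", "requests>=2.25.1", "flask~=2.0.0", "celery[redis]==5.1.0"]
pinned = [result for result in map(split_pinned, specs) if result is not None]
print(pinned)  # [('django', '3.2.0'), ('celery', '5.1.0')]
```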