# Document Validation

Validate SPDX documents against the official specification, with detailed error reporting and support for multiple SPDX versions ("SPDX-2.2" and "SPDX-2.3").

## Capabilities

### Full Document Validation

Validate a complete SPDX document against the official specification and collect every problem found.

```python { .api }
def validate_full_spdx_document(document: Document, spdx_version: str = None) -> List[ValidationMessage]:
    """
    Validate complete SPDX document against SPDX specification.

    Performs comprehensive validation including:
    - SPDX version compatibility
    - Document structure validation
    - Package information validation
    - File information validation
    - Relationship validation
    - License information validation
    - Checksum validation
    - SPDX ID validation

    Args:
        document: SPDX document to validate
        spdx_version: SPDX version to validate against ("SPDX-2.2" or "SPDX-2.3").
            If None, uses version from document

    Returns:
        List[ValidationMessage]: List of validation errors/warnings.
            Empty list if document is valid
    """
```

### Creation Info Validation

Validate document creation information and metadata.

```python { .api }
def validate_creation_info(creation_info: CreationInfo, spdx_version: str) -> List[ValidationMessage]:
    """
    Validate document creation information.

    Validates:
    - SPDX version format
    - SPDX ID format
    - Document namespace URI format
    - Creator information
    - License list version compatibility

    Args:
        creation_info: Document creation info to validate
        spdx_version: SPDX version for validation rules

    Returns:
        List of validation messages
    """
```

### Package Validation

Validate package information and metadata.

```python { .api }
def validate_packages(packages: List[Package], spdx_version: str, document: Optional[Document] = None) -> List[ValidationMessage]:
    """
    Validate all packages in document.

    Validates:
    - Package SPDX IDs
    - Download locations
    - Verification codes
    - License information
    - External package references
    - Package relationships

    Args:
        packages: List of packages to validate
        spdx_version: SPDX version for validation rules
        document: Optional parent document, for checks that need document-level context

    Returns:
        List of validation messages
    """
```

### File Validation

Validate file information and metadata.

```python { .api }
def validate_files(files: List[File], spdx_version: str, document: Optional[Document] = None) -> List[ValidationMessage]:
    """
    Validate all files in document.

    Validates:
    - File SPDX IDs
    - File paths
    - Checksums and algorithms
    - License information
    - File types
    - Copyright information

    Args:
        files: List of files to validate
        spdx_version: SPDX version for validation rules
        document: Optional parent document, for checks that need document-level context

    Returns:
        List of validation messages
    """
```

### Relationship Validation

Validate relationships between SPDX elements.

```python { .api }
def validate_relationships(relationships: List[Relationship], spdx_version: str) -> List[ValidationMessage]:
    """
    Validate all relationships in document.

    Validates:
    - Relationship types
    - SPDX element references
    - Relationship consistency
    - Required relationships

    Args:
        relationships: List of relationships to validate
        spdx_version: SPDX version for validation rules

    Returns:
        List of validation messages
    """
```

### License Validation

Validate license information and expressions.

```python { .api }
def validate_extracted_licensing_infos(
    extracted_licensing_infos: List[ExtractedLicensingInfo],
    spdx_version: str
) -> List[ValidationMessage]:
    """
    Validate extracted licensing information.

    Validates:
    - License ID format
    - License text content
    - License references
    - License expressions

    Args:
        extracted_licensing_infos: List of extracted licenses to validate
        spdx_version: SPDX version for validation rules

    Returns:
        List of validation messages
    """
```

### Checksum Validation

Validate checksums and algorithms.

```python { .api }
def validate_checksums(checksums: List[Checksum]) -> List[ValidationMessage]:
    """
    Validate checksum information.

    Validates:
    - Checksum algorithm support
    - Checksum format
    - Checksum value format

    Args:
        checksums: List of checksums to validate

    Returns:
        List of validation messages
    """
```
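
To make "checksum value format" concrete, the following is a minimal standalone sketch, not the library's implementation. It assumes that checksum values are lowercase hexadecimal digests whose length depends on the algorithm. The names `EXPECTED_HEX_LENGTH` and `check_checksum_value` are hypothetical, and the algorithm table covers only a small illustrative subset.

```python
import re
from typing import Dict, List

# Hypothetical subset: hex-digest lengths for a few common algorithms.
EXPECTED_HEX_LENGTH: Dict[str, int] = {
    "MD5": 32,
    "SHA1": 40,
    "SHA256": 64,
    "SHA512": 128,
}


def check_checksum_value(algorithm: str, value: str) -> List[str]:
    """Return a list of problems; an empty list means the value looks well-formed."""
    expected = EXPECTED_HEX_LENGTH.get(algorithm.upper())
    if expected is None:
        return [f"unsupported algorithm: {algorithm}"]

    problems: List[str] = []
    if len(value) != expected:
        problems.append(
            f"{algorithm} value must have {expected} hex characters, got {len(value)}"
        )
    # Assumption: digests are written in lowercase hexadecimal.
    if not re.fullmatch(r"[0-9a-f]+", value):
        problems.append(f"{algorithm} value must be lowercase hexadecimal")
    return problems
```

Like the validators above, the sketch returns an empty list on success, so callers can treat any non-empty result as a failure.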

### SPDX ID Validation

Validate SPDX identifier formats and references.

```python { .api }
def get_list_of_all_spdx_ids(document: Document) -> List[str]:
    """
    Get all SPDX IDs present in document.

    Args:
        document: SPDX document

    Returns:
        List of all SPDX IDs in document
    """

def validate_spdx_id_format(spdx_id: str) -> List[ValidationMessage]:
    """
    Validate SPDX ID format.

    Args:
        spdx_id: SPDX ID to validate

    Returns:
        List of validation messages
    """
```
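
As an illustration of what an SPDX ID format check involves, here is a small sketch based on the SPDX 2.x rule that an identifier has the form `SPDXRef-<idstring>`, where the idstring consists of letters, digits, `.` and `-`, and that a reference into another document is prefixed with `DocumentRef-<idstring>:`. The helper `looks_like_spdx_id` is hypothetical and is not how the library implements its check.

```python
import re

# idstring: letters, digits, "." and "-" (per the SPDX 2.x specification).
_INTERNAL_ID = re.compile(r"SPDXRef-[A-Za-z0-9.\-]+")
_EXTERNAL_ID = re.compile(r"DocumentRef-[A-Za-z0-9.\-]+:SPDXRef-[A-Za-z0-9.\-]+")


def looks_like_spdx_id(spdx_id: str, allow_external: bool = False) -> bool:
    """Check the syntactic form only; does not check that the ID exists in a document."""
    if _INTERNAL_ID.fullmatch(spdx_id):
        return True
    # External references are accepted only when the caller opts in.
    return allow_external and bool(_EXTERNAL_ID.fullmatch(spdx_id))
```

A syntactic check like this complements `get_list_of_all_spdx_ids`: the format check says whether an ID is well-formed, while the list of known IDs says whether a reference actually resolves.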

### URI Validation

Validate URI formats used in SPDX documents.

```python { .api }
def validate_uri_format(uri: str) -> List[ValidationMessage]:
    """
    Validate URI format.

    Args:
        uri: URI to validate

    Returns:
        List of validation messages
    """
```
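
One URI that every document carries is its namespace. The SPDX 2.x specification describes it as an absolute URI that must not contain the `#` delimiter. The sketch below shows a rough standalone check along those lines using only the standard library. `check_namespace_uri` is a hypothetical helper, and the real validator may apply stricter or different rules.

```python
from typing import List
from urllib.parse import urlsplit


def check_namespace_uri(uri: str) -> List[str]:
    """Return a list of problems; an empty list means the URI passed these checks."""
    problems: List[str] = []
    if "#" in uri:
        problems.append("namespace must not contain '#'")
    try:
        scheme = urlsplit(uri).scheme
    except ValueError:
        # urlsplit rejects some malformed inputs, e.g. unbalanced IPv6 brackets.
        return problems + ["namespace is not a parseable URI"]
    if not scheme:
        problems.append("namespace must be an absolute URI with a scheme")
    return problems
```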

## Usage Examples

### Basic Document Validation

```python
from spdx_tools.spdx.validation.document_validator import validate_full_spdx_document
from spdx_tools.spdx.parser.parse_anything import parse_file

# Parse and validate document
document = parse_file("example.spdx")
validation_messages = validate_full_spdx_document(document)

if validation_messages:
    print("Document validation failed:")
    for message in validation_messages:
        print(f"  - {message.validation_message}")
        if hasattr(message, 'context'):
            print(f"    Context: {message.context}")
else:
    print("Document is valid!")
```

### Version-Specific Validation

```python
# Validate against specific SPDX version
validation_messages = validate_full_spdx_document(document, "SPDX-2.3")

# Check document version
print(f"Document version: {document.creation_info.spdx_version}")
```

### Component-Specific Validation

```python
from spdx_tools.spdx.validation.package_validator import validate_packages
from spdx_tools.spdx.validation.file_validator import validate_files
from spdx_tools.spdx.validation.relationship_validator import validate_relationships

# Validate specific components
package_errors = validate_packages(document.packages, "SPDX-2.3")
file_errors = validate_files(document.files, "SPDX-2.3")
relationship_errors = validate_relationships(document.relationships, "SPDX-2.3")

print(f"Package validation errors: {len(package_errors)}")
print(f"File validation errors: {len(file_errors)}")
print(f"Relationship validation errors: {len(relationship_errors)}")
```

### Pre-Write Validation

```python
from spdx_tools.spdx.writer.write_anything import write_file

# Validation is enabled by default in write_file
try:
    write_file(document, "output.json")  # validate=True by default
    print("Document written successfully (validation passed)")
except Exception as e:
    print(f"Validation or writing failed: {e}")

# Skip validation for faster writing (not recommended)
write_file(document, "output.json", validate=False)
```

### Detailed Error Analysis

```python
def analyze_validation_errors(validation_messages):
    """Analyze and categorize validation errors."""
    error_types = {}

    for message in validation_messages:
        # Categorize by error type
        if "SPDX ID" in message.validation_message:
            error_types.setdefault("SPDX ID", []).append(message)
        elif "license" in message.validation_message.lower():
            error_types.setdefault("License", []).append(message)
        elif "checksum" in message.validation_message.lower():
            error_types.setdefault("Checksum", []).append(message)
        else:
            error_types.setdefault("Other", []).append(message)

    for error_type, messages in error_types.items():
        print(f"{error_type} errors ({len(messages)}):")
        for msg in messages:
            print(f"  - {msg.validation_message}")

# Analyze errors
validation_messages = validate_full_spdx_document(document)
if validation_messages:
    analyze_validation_errors(validation_messages)
```

### Validation in CI/CD Pipeline

```python
import sys
from spdx_tools.spdx.parser.parse_anything import parse_file
from spdx_tools.spdx.validation.document_validator import validate_full_spdx_document

def validate_spdx_file(file_path: str) -> bool:
    """Validate SPDX file for CI/CD pipeline."""
    try:
        document = parse_file(file_path)
        validation_messages = validate_full_spdx_document(document)

        if validation_messages:
            print(f"❌ Validation failed for {file_path}:")
            for message in validation_messages:
                print(f"  - {message.validation_message}")
            return False
        else:
            print(f"✅ {file_path} is valid")
            return True

    except Exception as e:
        print(f"❌ Error processing {file_path}: {e}")
        return False

# Use in CI script
if __name__ == "__main__":
    files_to_validate = sys.argv[1:]
    # Build the full list first so every file is checked and reported;
    # a generator passed to all() would stop at the first failing file.
    results = [validate_spdx_file(f) for f in files_to_validate]
    sys.exit(0 if all(results) else 1)
```

## Types

```python { .api }
from typing import List, Optional
from dataclasses import dataclass
from enum import Enum

class SpdxElementType(Enum):
    """Types of SPDX elements for validation context."""
    DOCUMENT = "Document"
    PACKAGE = "Package"
    FILE = "File"
    SNIPPET = "Snippet"
    RELATIONSHIP = "Relationship"
    ANNOTATION = "Annotation"
    EXTRACTED_LICENSING_INFO = "ExtractedLicensingInfo"

@dataclass
class ValidationContext:
    """Context information for validation messages."""
    spdx_id: Optional[str]
    element_type: SpdxElementType

@dataclass
class ValidationMessage:
    """Validation error or warning message."""
    validation_message: str
    context: ValidationContext
```
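
Because each message carries a `context`, reports can be grouped by element type instead of by matching on message text. The sketch below illustrates this with local stand-in classes that mirror the shapes listed above, so it runs without the library installed. The stand-ins, the reduced enum, and the helper `group_by_element_type` are assumptions for illustration only.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


# Stand-ins mirroring the documented types (reduced to what the example needs).
class SpdxElementType(Enum):
    DOCUMENT = "Document"
    PACKAGE = "Package"
    FILE = "File"


@dataclass
class ValidationContext:
    spdx_id: Optional[str]
    element_type: SpdxElementType


@dataclass
class ValidationMessage:
    validation_message: str
    context: ValidationContext


def group_by_element_type(
    messages: List[ValidationMessage],
) -> Dict[SpdxElementType, List[str]]:
    """Collect message texts under the element type recorded in their context."""
    grouped: Dict[SpdxElementType, List[str]] = defaultdict(list)
    for message in messages:
        grouped[message.context.element_type].append(message.validation_message)
    return dict(grouped)


sample = [
    ValidationMessage("missing download location",
                      ValidationContext("SPDXRef-Package", SpdxElementType.PACKAGE)),
    ValidationMessage("invalid checksum value",
                      ValidationContext("SPDXRef-File", SpdxElementType.FILE)),
    ValidationMessage("invalid license expression",
                      ValidationContext("SPDXRef-Package", SpdxElementType.PACKAGE)),
]
report = group_by_element_type(sample)
```

With real results from `validate_full_spdx_document`, the same grouping function should apply as long as the messages expose `context.element_type` as described in this section.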