# Command Line Interface

GFFtk's command-line interface gives direct access to its conversion, analysis, and manipulation functions through subcommands. Each subcommand corresponds to a Python function that can also be called programmatically.

## Capabilities

### Format Conversion CLI

Main command-line interface for format conversion operations.

```python { .api }
def convert(args):
    """
    Command-line interface for format conversion operations.

    Provides CLI access to all format conversion functions including
    GFF3, GTF, TBL, GenBank, and protein/transcript extraction with
    flexible filtering and output options.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - input: Input file path
        - fasta: Genome FASTA file path
        - output: Output file path
        - format: Output format (gff3, gtf, tbl, genbank, proteins, etc.)
        - table: Genetic code table
        - grep: Filter patterns to include
        - grepv: Filter patterns to exclude
        - debug: Enable debug output

    Returns:
    None
    """
```

### Consensus Prediction CLI

Command-line interface for consensus gene prediction.

```python { .api }
def consensus(args):
    """
    Command-line interface for EvidenceModeler-like consensus prediction.

    Combines multiple gene prediction sources with protein and transcript
    evidence to generate high-quality consensus gene models using
    configurable weights and validation criteria.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - fasta: Genome FASTA file path
        - genes: List of gene prediction file paths
        - proteins: List of protein alignment file paths
        - transcripts: List of transcript alignment file paths
        - weights: Source weight configuration file
        - output: Output consensus GFF3 file
        - minscore: Minimum score threshold
        - repeats: Repeat annotation file for filtering
        - debug: Enable debug output

    Returns:
    None
    """
```

### Annotation Comparison CLI

Command-line interface for comparing two genome annotations.

```python { .api }
def compare(args):
    """
    Command-line interface for annotation comparison analysis.

    Compares two genome annotations to identify differences, calculate
    similarity metrics, and generate detailed comparison reports with
    feature-level analysis and statistics.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - old: Path to reference annotation file
        - new: Path to query annotation file
        - fasta: Genome FASTA file path
        - output: Output comparison report path
        - debug: Enable debug output

    Returns:
    None
    """
```

### Annotation Statistics CLI

Command-line interface for calculating annotation statistics.

```python { .api }
def stats(args):
    """
    Command-line interface for annotation statistics calculation.

    Calculates comprehensive statistics for genome annotations including
    gene counts, feature distributions, sequence lengths, and quality
    metrics with detailed reporting options.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - input: Input annotation file path
        - fasta: Genome FASTA file path (optional)
        - output: Output statistics file path
        - format: Output format (text, json, csv)

    Returns:
    None
    """
```

### GFF3 File Sorting CLI

Command-line interface for sorting GFF3 files by genomic coordinates.

```python { .api }
def sort(args):
    """
    Command-line interface for GFF3 file sorting.

    Sorts GFF3 files by genomic coordinates ensuring proper feature
    ordering and parent-child relationships are maintained.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - input: Input GFF3 file path
        - output: Output sorted GFF3 file path

    Returns:
    None
    """

def sortGFF3(input, output):
    """
    Sort GFF3 file by genomic coordinates.

    Parameters:
    - input (str): Input GFF3 file path
    - output (str): Output sorted GFF3 file path

    Returns:
    None
    """
```
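To make "sorting by genomic coordinates" concrete, the sketch below orders feature lines by sequence ID, start, and descending end, with a small feature-type rank as a tie-breaker so that a gene precedes its mRNA and exons. It is a standard-library illustration under stated assumptions, not GFFtk's implementation: `sort_gff3_lines` and `TYPE_RANK` are hypothetical names, sequence IDs are compared as plain strings, and all `#` directive lines are simply moved to the top.

```python
# Illustrative only: this is NOT GFFtk's sortGFF3 implementation.
# Assumed parent-first order for features sharing identical coordinates.
TYPE_RANK = {"gene": 0, "mRNA": 1, "exon": 2, "CDS": 3}


def sort_gff3_lines(lines):
    """Return GFF3 lines with directives first, then features in coordinate order."""
    directives = [ln for ln in lines if ln.startswith("#")]
    features = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(line):
        cols = line.rstrip("\n").split("\t")
        seqid, ftype = cols[0], cols[2]
        start, end = int(cols[3]), int(cols[4])
        # -end puts the longer (enclosing) feature first when starts are equal;
        # the type rank breaks exact ties so parents precede children.
        return (seqid, start, -end, TYPE_RANK.get(ftype, len(TYPE_RANK)))

    return directives + sorted(features, key=sort_key)


example = [
    "##gff-version 3",
    "chr1\tsrc\texon\t100\t200\t.\t+\t.\tID=e1;Parent=m1",
    "chr1\tsrc\tmRNA\t100\t500\t.\t+\t.\tID=m1;Parent=g1",
    "chr1\tsrc\tgene\t100\t500\t.\t+\t.\tID=g1",
    "chr1\tsrc\tgene\t10\t50\t.\t-\t.\tID=g0",
]
sorted_lines = sort_gff3_lines(example)
```

A production sorter would also need natural ordering of sequence IDs (so `chr2` comes before `chr10`) and explicit handling of `Parent` attributes, both of which this sketch leaves out.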

### GFF3 Sanitization CLI

Command-line interface for cleaning and validating GFF3 files.

```python { .api }
def sanitize(args):
    """
    Command-line interface for GFF3 file sanitization.

    Cleans and validates GFF3 files by fixing common format issues,
    removing invalid features, and ensuring compliance with GFF3
    specification requirements.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - input: Input GFF3 file path
        - output: Output sanitized GFF3 file path
        - strict: Enable strict validation mode

    Returns:
    None
    """
```

### Feature Renaming CLI

Command-line interface for systematic renaming of annotation features.

```python { .api }
def rename(args):
    """
    Command-line interface for feature renaming operations.

    Systematically renames annotation features using configurable
    patterns and rules to ensure consistent naming conventions
    across annotation files.

    Parameters:
    - args (argparse.Namespace): Parsed command-line arguments containing:
        - input: Input annotation file path
        - output: Output renamed annotation file path
        - pattern: Renaming pattern specification
        - prefix: Prefix for new feature names

    Returns:
    None
    """
```

## Usage Examples

### Basic Format Conversion

```bash
# Convert GFF3 to GTF format
gfftk convert -i annotation.gff3 -f genome.fasta -o output.gtf

# Extract protein sequences
gfftk convert -i annotation.gff3 -f genome.fasta -o proteins.faa --output-format proteins

# Convert with filtering
gfftk convert -i annotation.gff3 -f genome.fasta -o filtered.gff3 --grep product:kinase
```

### Consensus Prediction

```bash
# Basic consensus prediction
gfftk consensus -f genome.fasta -g augustus.gff3 genemark.gff3 -p proteins.gff3 -o consensus.gff3

# With custom weights and repeat filtering
gfftk consensus -f genome.fasta -g augustus.gff3 genemark.gff3 \
    -p proteins.gff3 -t transcripts.gff3 \
    -w weights.txt --repeats repeats.bed -o consensus.gff3
```

### Annotation Analysis

```bash
# Compare two annotations
gfftk compare --old reference.gff3 --new updated.gff3 -f genome.fasta -o comparison.txt

# Calculate statistics
gfftk stats -i annotation.gff3 -f genome.fasta -o stats.txt

# Sort GFF3 file
gfftk sort -i unsorted.gff3 -o sorted.gff3
```

### File Processing

```bash
# Sanitize GFF3 file
gfftk sanitize -i messy.gff3 -o clean.gff3

# Rename features systematically
gfftk rename -i annotation.gff3 -o renamed.gff3 --prefix GENE
```
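When these commands need to run from a Python script rather than a shell, one option is to build the argument list and hand it to `subprocess.run`. The sketch below reuses only the flags shown in the examples above; `gfftk_command` and `run_gfftk` are hypothetical helper names, and the code assumes the `gfftk` executable is on `PATH` when it is actually run.

```python
import shutil
import subprocess


def gfftk_command(subcommand, *flags):
    """Build an argv list such as ['gfftk', 'sort', '-i', 'in.gff3', '-o', 'out.gff3']."""
    return ["gfftk", subcommand, *map(str, flags)]


def run_gfftk(subcommand, *flags):
    """Run gfftk if it is on PATH; return the CompletedProcess, or None if it is missing."""
    if shutil.which("gfftk") is None:
        return None
    # check=True raises CalledProcessError on a non-zero exit status
    return subprocess.run(gfftk_command(subcommand, *flags), check=True)


# Building the command does not require gfftk to be installed
cmd = gfftk_command("sort", "-i", "unsorted.gff3", "-o", "sorted.gff3")
```

Passing a list instead of a single shell string avoids quoting problems with file names that contain spaces.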

## Programmatic Access

All CLI commands can be accessed programmatically by importing the corresponding functions:

```python
import argparse
from gfftk.convert import convert
from gfftk.consensus import consensus
from gfftk.compare import compare
from gfftk.stats import stats
from gfftk.sort import sort
from gfftk.sanitize import sanitize
from gfftk.rename import rename

# Create argument namespace (equivalent to CLI args)
args = argparse.Namespace(
    input='annotation.gff3',
    fasta='genome.fasta',
    output='output.gtf',
    format='gtf',
    table=1,
    # grep and grepv are listed among convert()'s arguments above;
    # empty lists are assumed here to mean "no filtering"
    grep=[],
    grepv=[],
    debug=False
)

# Call conversion function
convert(args)
```
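Because a hand-built `Namespace` lacks any attribute that is not set explicitly, a small factory can help keep scripts from omitting one. The sketch below fills in every attribute listed for `convert()` in this document; the helper name `make_convert_args` and all default values are assumptions for illustration, not GFFtk's documented defaults.

```python
import argparse

# Attribute names follow the convert() parameter list in this document;
# the default values are assumptions, not GFFtk's documented defaults.
CONVERT_DEFAULTS = {
    "format": "gff3",
    "table": 1,
    "grep": [],
    "grepv": [],
    "debug": False,
}


def make_convert_args(input, fasta, output, **overrides):
    """Build a Namespace carrying every attribute convert() is documented to read."""
    unknown = set(overrides) - set(CONVERT_DEFAULTS)
    if unknown:
        raise TypeError(f"unexpected option(s): {sorted(unknown)}")
    values = {**CONVERT_DEFAULTS, **overrides}
    # Copy list values so separate Namespaces never share mutable state
    values = {k: list(v) if isinstance(v, list) else v for k, v in values.items()}
    return argparse.Namespace(input=input, fasta=fasta, output=output, **values)


args = make_convert_args("annotation.gff3", "genome.fasta", "output.gtf", format="gtf")
```

Rejecting unknown keyword names catches typos such as `grpe=` early, instead of letting them silently create an attribute that nothing reads.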

## Command Reference

| Command | Function | Description |
|---------|----------|-------------|
| `gfftk convert` | `convert()` | Format conversion and sequence extraction |
| `gfftk consensus` | `consensus()` | EvidenceModeler-like consensus prediction |
| `gfftk compare` | `compare()` | Annotation comparison and analysis |
| `gfftk stats` | `stats()` | Annotation statistics calculation |
| `gfftk sort` | `sort()` | GFF3 coordinate-based sorting |
| `gfftk sanitize` | `sanitize()` | GFF3 validation and cleaning |
| `gfftk rename` | `rename()` | Systematic feature renaming |

Each command provides comprehensive help via `gfftk <command> --help` with detailed parameter descriptions and usage examples.
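
The command-to-function mapping in the table can be pictured as an `argparse` subparser dispatch, where each subcommand stores the function that should receive the parsed `Namespace`. The sketch below is a generic illustration of that pattern, not GFFtk's actual entry point: the `sort` and `sanitize` functions are placeholders, the program name `gfftk-sketch` is made up, and the long option names `--input` and `--output` are assumptions.

```python
import argparse


def sort(args):
    """Placeholder standing in for a real command function."""
    return f"sort {args.input} -> {args.output}"


def sanitize(args):
    """Placeholder standing in for a real command function."""
    return f"sanitize {args.input} -> {args.output}"


def build_parser():
    """Create a parser with one subcommand per (name, function) pair."""
    parser = argparse.ArgumentParser(prog="gfftk-sketch")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, func in (("sort", sort), ("sanitize", sanitize)):
        sub = subparsers.add_parser(name)
        sub.add_argument("-i", "--input", required=True)
        sub.add_argument("-o", "--output", required=True)
        # Each subcommand remembers which function should receive the Namespace
        sub.set_defaults(func=func)
    return parser


args = build_parser().parse_args(["sort", "-i", "unsorted.gff3", "-o", "sorted.gff3"])
result = args.func(args)
```

With this layout, adding a command means adding one function and one subparser, and the same function stays callable from Python with a hand-built `Namespace`.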