Process CRISPR screening data to identify essential genes and hit candidates. Performs quality control, log fold change calculation, z-score-based sgRNA scoring, and hit calling for pooled CRISPR screens including viability, drug resistance, and synthetic lethality studies.
Analyze pooled CRISPR screening data from count matrix to hit identification. Covers QC assessment (Gini index, read depth, dropout, replicate correlation), log fold change calculation, z-score-based sgRNA scoring, and multi-threshold hit calling.
python -m py_compile scripts/main.py
python scripts/main.py --help
Upstream: fastqc-report-interpreter → crispr-screen-analyzer
Downstream: crispr-screen-analyzer → go-kegg-enrichment → hit-validation-planner
Fallback template: If scripts/main.py fails or required files are absent, report: (a) which file is missing, (b) which QC metrics can still be computed, (c) the manual command equivalent for the failed step.
| Parameter | Type | Required | Description |
|---|---|---|---|
| --counts, -c | string | Yes | sgRNA count matrix file |
| --samples, -s | string | Yes | Sample annotation CSV file |
| --control | string | No | Control sample names (comma-separated) |
| --treatment, -t | string | No | Treatment sample names (comma-separated) |
| --output, -o | string | No | Output directory |
| --fdr | float | No | FDR threshold (default: 0.05; must be between 0 and 1) |
| --seed | int | No | Random seed for reproducibility (default: 42) |
# QC assessment only
python scripts/main.py --counts sgrna_counts.txt --samples samples.csv --output qc_results
# Full differential analysis
python scripts/main.py \
--counts sgrna_counts.txt --samples samples.csv \
--control "Ctrl_1,Ctrl_2,Ctrl_3" \
--treatment "Drug_1,Drug_2,Drug_3" \
--output drug_screen --fdr 0.05 --seed 42
| Metric | Target | Action if Failed |
|---|---|---|
| Gini index | < 0.3 | Check MOI; consider repeating screen |
| Total reads | > 10M/sample | Increase sequencing depth |
| Zero-count sgRNAs | < 5% | Verify library quality at transduction |
| Replicate correlation | > 0.7 | Investigate batch effects; flag in output |
Note: Replicate correlation (Pearson or Spearman) is defined between replicate samples, and samples with correlation < 0.7 should be flagged with a warning in the QC output. This is a known implementation gap: the metric is documented here but not yet computed by the current script.
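The QC metrics in the table above can be sketched in plain Python. This is an illustrative sketch, not the code in scripts/main.py: the function names (`gini_index`, `zero_count_fraction`, `pearson`, `qc_summary`) are hypothetical, and it assumes the Gini index is computed on raw counts (some tools log-transform counts first, which gives different values).

```python
import math


def gini_index(counts):
    """Gini index of non-negative read counts (0 = perfectly even library)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Sorted-rank formula: G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    weighted = sum(rank * x for rank, x in enumerate(values, start=1))
    return 2.0 * weighted / (n * total) - (n + 1) / n


def zero_count_fraction(counts):
    """Fraction of sgRNAs with zero reads in one sample."""
    if not counts:
        return 0.0
    return sum(1 for c in counts if c == 0) / len(counts)


def pearson(x, y):
    """Pearson correlation between two equal-length replicate vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        return float("nan")  # correlation is undefined for a constant vector
    return cov / (sd_x * sd_y)


def qc_summary(counts, min_reads=10_000_000, max_gini=0.3, max_zero=0.05):
    """Compare one sample's counts against the targets in the QC table."""
    total = sum(counts)
    gini = gini_index(counts)
    zero = zero_count_fraction(counts)
    return {
        "total_reads": total,
        "gini": gini,
        "zero_fraction": zero,
        "pass_reads": total > min_reads,
        "pass_gini": gini < max_gini,
        "pass_zero": zero < max_zero,
    }
```

For replicate correlation, counts are commonly log-transformed (for example with log2(count + 1)) before calling `pearson`, so that a few very abundant sgRNAs do not dominate the result; whether to do so here is a choice the source does not specify.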
| Category | Criteria | Interpretation |
|---|---|---|
| Essential | FDR < 0.05, LFC < −1 | Required for cell viability |
| Drug Sensitive | FDR < 0.05, LFC < −1 | Synthetic lethal with treatment |
| Drug Resistant | FDR < 0.05, LFC > 1 | Confers resistance |
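The thresholds in the table above could be applied as in the following sketch. Note that Essential and Drug Sensitive share the same numeric criteria, so the label depends on the type of screen. The function name `classify_hit`, the `screen_type` values, and the "No call" label are assumptions made for illustration; they are not taken from scripts/main.py.

```python
def classify_hit(lfc, fdr, screen_type, fdr_threshold=0.05, lfc_threshold=1.0):
    """Assign a hit category from LFC and FDR using the table's criteria.

    screen_type is a hypothetical switch: "viability" for essentiality
    screens, "drug" for treatment-versus-control screens.
    """
    if fdr >= fdr_threshold:
        return "No call"
    if lfc < -lfc_threshold:
        # Same numeric criteria; the interpretation depends on the screen.
        return "Essential" if screen_type == "viability" else "Drug Sensitive"
    if lfc > lfc_threshold and screen_type == "drug":
        return "Drug Resistant"
    return "No call"


# Example calls with made-up values
print(classify_hit(-2.0, 0.01, "viability"))
print(classify_hit(1.5, 0.01, "drug"))
```

The comparisons are strict, matching the "<" and ">" in the table, so an LFC of exactly −1 or 1 is not called.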
The scoring method uses z-score normalization at the sgRNA level (not true Robust Rank Aggregation). LFC uses a pseudocount of +1: log2((treatment+1)/(control+1)). FDR correction uses Benjamini-Hochberg. For gene-level analysis, use a dedicated downstream tool such as MAGeCK (which implements RRA) or BAGEL2 (which uses Bayesian essentiality scoring).
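A minimal sketch of this scoring pipeline, using only the standard library, might look like the code below. The LFC formula and the Benjamini-Hochberg correction follow the description above. Two steps are assumptions, because the source does not specify them: z-scores are computed against the mean and population standard deviation of all sgRNA LFCs, and p-values come from a two-sided test against a standard normal distribution. All function names are hypothetical.

```python
import math
import statistics


def log_fold_change(treatment, control):
    """LFC with pseudocount +1: log2((treatment + 1) / (control + 1))."""
    return math.log2((treatment + 1) / (control + 1))


def z_scores(values):
    """Z-score each value against the whole list (assumed reference set)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


def two_sided_p(z):
    """Two-sided p-value under a standard normal (an assumption here)."""
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Example with made-up counts for four sgRNAs
treatment = [3, 40, 120, 95]
control = [100, 45, 110, 100]
lfc = [log_fold_change(t, c) for t, c in zip(treatment, control)]
fdr = benjamini_hochberg([two_sided_p(z) for z in z_scores(lfc)])
```

With so few sgRNAs the z-scores are not meaningful; the example only shows how the pieces connect.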
Set --seed (default: 42) to ensure reproducible results across runs. The seed is recorded in the output metadata file. The script must call np.random.seed(args.seed) at the start of main().
For complex multi-constraint requests, always produce all of these sections:
Every response must make these explicit:
This skill accepts: sgRNA count matrices and sample annotation files from pooled CRISPR screens.
If the request does not involve CRISPR screen data analysis — for example, asking to design sgRNA libraries, perform RNA-seq differential expression, or interpret non-CRISPR genomic data — do not proceed. Instead respond:
"crispr-screen-analyzer is designed to analyze pooled CRISPR screening data. Your request appears to be outside this scope. Please provide an sgRNA count matrix and sample annotation file, or use a more appropriate tool for your task."
- If --control or --treatment do not match columns in the count matrix, report the mismatch and list available sample names.
- If --fdr is not between 0 and 1, reject with: Error: --fdr must be between 0 and 1.
- If scripts/main.py fails, report the failure point, summarize what can still be completed safely, and provide a manual fallback.
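The input checks described above could be sketched with `argparse` as follows. The flags mirror the parameter table, and the --fdr error text is taken from this document. Everything else is an assumption: the function names `build_parser` and `validate_args`, the wording of the sample-mismatch message, and the choice to treat the --fdr bounds as exclusive.

```python
import argparse


def build_parser():
    """Command-line flags as listed in the parameter table."""
    parser = argparse.ArgumentParser(prog="crispr-screen-analyzer")
    parser.add_argument("--counts", "-c", required=True)
    parser.add_argument("--samples", "-s", required=True)
    parser.add_argument("--control")
    parser.add_argument("--treatment", "-t")
    parser.add_argument("--output", "-o")
    parser.add_argument("--fdr", type=float, default=0.05)
    parser.add_argument("--seed", type=int, default=42)
    return parser


def validate_args(args, available_samples):
    """Return a list of error messages; an empty list means inputs are valid."""
    errors = []
    # Assumption: bounds are exclusive, since an FDR of 0 or 1 is not useful.
    if not 0 < args.fdr < 1:
        errors.append("Error: --fdr must be between 0 and 1.")
    for flag in ("control", "treatment"):
        raw = getattr(args, flag)
        if not raw:
            continue
        names = [name.strip() for name in raw.split(",") if name.strip()]
        missing = [name for name in names if name not in available_samples]
        if missing:
            # Hypothetical message wording; the source only requires that the
            # mismatch is reported and the available names are listed.
            errors.append(
                f"Error: --{flag} samples not found in count matrix: "
                f"{', '.join(missing)}. "
                f"Available samples: {', '.join(available_samples)}"
            )
    return errors


# Example: one unknown control sample and an out-of-range FDR
args = build_parser().parse_args(
    ["-c", "sgrna_counts.txt", "-s", "samples.csv",
     "--control", "Ctrl_1,Ctrl_9", "--fdr", "1.5"]
)
for message in validate_args(args, ["Ctrl_1", "Ctrl_2", "Drug_1"]):
    print(message)
```

Collecting all errors before reporting lets the user fix every input problem in one pass rather than one rerun per mistake.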