neoantigen-predictor

Predict patient-specific neoantigen candidate peptides with high immunogenicity based on HLA typing and tumor mutation profiles, for tumor immunotherapy target screening.

Quick Check

python -m py_compile scripts/main.py
python scripts/main.py --help
python scripts/main.py --hla "HLA-A*02:01" --mutations mutations.csv --output results.json

When to Use

Use this skill to predict neoantigens from tumor mutation data and patient HLA typing.
Use this skill to screen high-priority immunotherapy targets based on MHC binding affinity and immunogenicity scores.
Use this skill for data analysis tasks that require explicit assumptions, bounded scope, and a reproducible output format.
Use this skill when you need a documented fallback path for missing inputs, execution errors, or partial evidence.

Workflow

Validate input first (hard gate): Confirm the request is within scope. If vaccine design, clinical trial interpretation, or general genomics analysis is requested, emit the scope refusal before any processing.
Confirm the user objective, required inputs, and non-negotiable constraints before doing detailed work.
Validate that the request matches the documented scope and stop early if the task would require unsupported assumptions.
Use the packaged script path or the documented reasoning path with only the inputs that are actually available.
Return a structured result that separates assumptions, deliverables, risks, and unresolved items.
If execution fails or inputs are incomplete, switch to the fallback path and state exactly what blocked full completion.

Function Overview

Neoantigens are variant peptides generated by non-synonymous mutations in tumor cells, presented by the patient's HLA molecules and recognized by T cells. This tool integrates:

Mutant Peptide Generation — Extract 8-11mer variant peptides from mutation sites
HLA Binding Prediction — Predict peptide binding affinity to patient HLA molecules
Immunogenicity Assessment — Assess potential to elicit immune response
Priority Ranking — Comprehensive scoring to screen optimal neoantigen candidates
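The peptide generation step can be illustrated with a sliding window over the mutant protein sequence. This is a minimal sketch, not the code in scripts/main.py: the function name `mutant_peptides` and its parameters are hypothetical, and it only covers single-residue substitutions.

```python
from typing import List, Sequence


def mutant_peptides(protein: str, pos: int, alt: str,
                    lengths: Sequence[int] = (8, 9, 10, 11)) -> List[str]:
    """Return every peptide of the given lengths that contains the mutated residue.

    protein: wild-type protein sequence (one-letter codes)
    pos:     1-based position of the substituted residue
    alt:     replacement amino acid (one-letter code)
    """
    idx = pos - 1
    if not 0 <= idx < len(protein):
        raise ValueError(f"position {pos} is outside the protein (length {len(protein)})")

    # Apply the substitution to obtain the mutant sequence.
    mutant = protein[:idx] + alt + protein[idx + 1:]

    peptides = []
    for k in lengths:
        # A window of length k covers idx when its start lies in [idx - k + 1, idx];
        # clamp that range so the window stays inside the sequence.
        first = max(0, idx - k + 1)
        last = min(idx, len(mutant) - k)
        for start in range(first, last + 1):
            peptides.append(mutant[start:start + k])
    return peptides
```

Near the ends of a protein fewer windows fit, so the number of peptides per length can be smaller than the length itself.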

Input Format

HLA Typing Input

Format	Example	Description
Standard Nomenclature	`HLA-A*02:01`	WHO standard HLA nomenclature
Simplified	`A0201`	Omits the `HLA-` prefix, `*`, and `:`
Multiple alleles	`HLA-A*02:01,A*11:01,B*07:02`	Comma-separated list
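The accepted spellings in the table above could be normalized to the standard form before prediction. The sketch below is an assumption about how that might be done, not the parser in scripts/main.py: `normalize_hla` and `parse_hla_list` are hypothetical names, and the pattern only covers two-field class I alleles (HLA-A, -B, -C).

```python
import re
from typing import List

# Optional "HLA-" prefix, gene letter, optional "*", two digits, optional ":", two digits.
_HLA_PATTERN = re.compile(r"^(?:HLA-)?([ABC])\*?(\d{2}):?(\d{2})$", re.IGNORECASE)


def normalize_hla(allele: str) -> str:
    """Convert e.g. 'A0201' or 'A*02:01' to 'HLA-A*02:01'."""
    match = _HLA_PATTERN.match(allele.strip())
    if match is None:
        raise ValueError(f"unrecognized HLA allele: {allele!r}")
    gene, group, protein = match.groups()
    return f"HLA-{gene.upper()}*{group}:{protein}"


def parse_hla_list(text: str) -> List[str]:
    """Split a comma-separated allele string and normalize each entry."""
    return [normalize_hla(item) for item in text.split(",") if item.strip()]
```

Rejecting unrecognized alleles early keeps a typo from silently reaching the binding predictor.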

Mutation Data Input

VCF Format:

#CHROM  POS     ID  REF ALT QUAL    FILTER  INFO
chr17   7579472 .   G   A   100     PASS    GENE=TP53;AA=p.R273H
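A record like the one above can be turned into the mutation dictionary used by the Python API with a few lines of parsing. This is a sketch under assumptions: `parse_vcf_record` is a hypothetical helper, and the `GENE` and `AA` INFO keys are taken from the example record rather than from the VCF standard, so real annotated VCFs may use different keys.

```python
from typing import Dict, Optional, Union


def parse_vcf_record(line: str) -> Dict[str, Union[str, int, None]]:
    """Parse one VCF data line into a mutation dictionary."""
    fields = line.split()  # handles tab- or space-separated columns
    if len(fields) < 8:
        raise ValueError("expected at least 8 VCF columns")
    chrom, pos, _id, ref, alt, _qual, filt, info = fields[:8]

    # INFO is a ';'-separated list; keep only the KEY=VALUE entries.
    info_map = dict(item.split("=", 1) for item in info.split(";") if "=" in item)

    gene: Optional[str] = info_map.get("GENE")
    protein_change: Optional[str] = info_map.get("AA")
    return {
        "gene": gene,
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "filter": filt,
        "protein_change": protein_change,
    }
```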

Table Format (CSV):

Gene	Chrom	Position	Ref	Alt	Protein_Change
TP53	chr17	7579472	G	A	p.R273H
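The `Protein_Change` column carries the information needed to build mutant peptides. A minimal sketch of parsing it is shown below; `parse_protein_change` and `Missense` are hypothetical names, and the pattern only accepts single-residue substitutions written with one-letter codes (frameshifts, indels, and three-letter notation are out of scope here).

```python
import re
from typing import NamedTuple


class Missense(NamedTuple):
    ref_aa: str    # wild-type residue
    position: int  # 1-based protein position
    alt_aa: str    # mutant residue


_AA = "ACDEFGHIKLMNPQRSTVWY"
_MISSENSE = re.compile(rf"^p\.([{_AA}])(\d+)([{_AA}])$")


def parse_protein_change(change: str) -> Missense:
    """Parse a substitution such as 'p.R273H'."""
    match = _MISSENSE.match(change.strip())
    if match is None:
        raise ValueError(f"unsupported protein change: {change!r}")
    return Missense(match.group(1), int(match.group(2)), match.group(3))
```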

Usage

Command Line

python scripts/main.py \
  --hla "HLA-A*02:01,HLA-A*11:01,B*07:02" \
  --vcf mutations.vcf \
  --output neoantigen_results.json

python scripts/main.py \
  --hla-file hla_genotype.txt \
  --mutations mutations.csv \
  --peptide-length 9,10,11 \
  --rank-cutoff 0.5 \
  --output results.json

Python API

from scripts.main import NeoantigenPredictor

predictor = NeoantigenPredictor()
hla_alleles = ["HLA-A*02:01", "HLA-A*11:01", "HLA-B*07:02"]
mutations = [{"gene": "TP53", "chrom": "chr17", "pos": 7579472, "ref": "G", "alt": "A", "protein_change": "p.R273H"}]
results = predictor.predict(hla_alleles=hla_alleles, mutations=mutations, peptide_length=[9, 10], mhc_method="netmhcpan")
high_affinity = predictor.filter_by_binding(results, rank_threshold=0.5)

Scoring Algorithms

MHC Binding Affinity

Metric	Threshold
Rank %	< 0.5% = strong binder; 0.5-2% = weak binder
IC50 (nM)	< 50 nM = high affinity; 50-500 nM = intermediate affinity
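The thresholds above translate directly into a small classifier. This is an illustrative sketch: the function names are hypothetical, and the labels for values beyond the listed thresholds ("non-binder", "low") are assumptions, since the table does not name them.

```python
def classify_by_rank(rank_percent: float) -> str:
    """Classify a peptide by its %Rank value (in percent, e.g. 0.3 means 0.3%)."""
    if rank_percent < 0:
        raise ValueError("rank must be non-negative")
    if rank_percent < 0.5:
        return "strong"
    if rank_percent < 2.0:
        return "weak"
    return "non-binder"  # assumed label for ranks of 2% or more


def classify_by_ic50(ic50_nm: float) -> str:
    """Classify a peptide by its predicted IC50 in nM."""
    if ic50_nm < 0:
        raise ValueError("IC50 must be non-negative")
    if ic50_nm < 50:
        return "high"
    if ic50_nm < 500:
        return "intermediate"
    return "low"  # assumed label for IC50 of 500 nM or more
```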

Priority Score

priority_score = (
    0.40 * (1 - rank_percentile) +   # MHC binding
    0.35 * immunogenicity_score +     # Immunogenicity
    0.25 * clinical_score             # Expression, clonality
)
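As a worked example, the weighted sum can be wrapped in a function with range checks. The sketch assumes that all three inputs are scaled to [0, 1], with `rank_percentile` expressed as a fraction (so a 0.5% rank is 0.005); the source does not state the scale, so treat this as an assumption.

```python
def priority_score(rank_percentile: float,
                   immunogenicity_score: float,
                   clinical_score: float) -> float:
    """Combine the three components using the weights 0.40 / 0.35 / 0.25."""
    inputs = {
        "rank_percentile": rank_percentile,
        "immunogenicity_score": immunogenicity_score,
        "clinical_score": clinical_score,
    }
    for name, value in inputs.items():
        # Assumed scale: every component lies in [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")

    return (
        0.40 * (1 - rank_percentile)   # MHC binding: lower rank scores higher
        + 0.35 * immunogenicity_score  # immunogenicity
        + 0.25 * clinical_score        # expression, clonality
    )
```

Under these assumptions, a peptide with a rank of 0.005, an immunogenicity score of 0.8, and a clinical score of 0.6 scores 0.398 + 0.28 + 0.15 = 0.828.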

Algorithm Limitations

MHC binding prediction accuracy is roughly 85% at the Rank < 0.5% threshold
Immunogenicity predictions correlate only about 60-70% with experimental results, so they require experimental validation
Does not consider HLA molecule expression levels on cell surface
Cannot predict immune tolerance or suppressive T cell responses

Clinical Application Notes

Important: This tool is for research purposes only. Prediction results must not be the sole basis for clinical decisions.

All candidate neoantigens require experimental validation (e.g., ELISPOT, tetramer staining)
Consider patient immune status and treatment history
Assess potential autoimmune toxicity risks
Combine with tumor microenvironment immune infiltration status

Dependencies

Python 3.8+ (strictly required; the script uses the dataclasses module, which is in the standard library from Python 3.7 onward)
biopython, pandas, numpy, requests
NetMHCpan 4.1 (optional, local install for improved performance)

Prerequisites

pip install -r requirements.txt

Input Validation

This skill accepts: patient HLA typing data and tumor mutation profiles (VCF, CSV, or FASTA format) for the purpose of predicting neoantigen candidates and immunotherapy targets.

If the user's request does not involve neoantigen prediction from HLA and mutation data — for example, asking to design vaccines, interpret clinical trial results, or perform general genomics analysis — do not proceed with the workflow. Instead respond:

"neoantigen-predictor is designed to predict neoantigen candidates from HLA typing and tumor mutation data for immunotherapy research. Your request appears to be outside this scope. Please provide HLA alleles and mutation data, or use a more appropriate tool for your task."

Do not continue the workflow when the request is out of scope, missing HLA typing or mutation data, or would require clinical decision-making. For missing inputs, state exactly which fields are missing.
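Reporting exactly which fields are missing can be automated with a small check before any prediction runs. The sketch below is hypothetical: `missing_inputs` is not part of scripts/main.py, and the list of mutation fields is assumed from the Python API example rather than from a documented schema.

```python
from typing import Dict, List, Optional, Sequence

# Assumed field names, taken from the mutation dictionary in the Python API example.
MUTATION_FIELDS = ("gene", "chrom", "pos", "ref", "alt", "protein_change")


def missing_inputs(hla_alleles: Optional[Sequence[str]],
                   mutations: Optional[Sequence[Dict[str, object]]]) -> List[str]:
    """Return the names of missing inputs; an empty list means nothing is missing."""
    missing: List[str] = []
    if not hla_alleles:
        missing.append("hla_alleles")
    if not mutations:
        missing.append("mutations")
    for i, mutation in enumerate(mutations or []):
        for field in MUTATION_FIELDS:
            # Treat absent keys, None, and empty strings as missing.
            if mutation.get(field) in (None, ""):
                missing.append(f"mutations[{i}].{field}")
    return missing
```

An empty result means the workflow can proceed; otherwise the returned names can be quoted directly in the response.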

Fallback Behavior

If scripts/main.py fails or required inputs are incomplete:

Report the exact failure point and error message (sanitized).
State what can still be completed (e.g., peptide generation without binding prediction if NetMHCpan is unavailable).
Manual fallback: use --variant-peptides peptides.fasta to skip mutation processing and predict binding for pre-generated peptides directly.
Do not fabricate binding scores, immunogenicity values, or clinical interpretations.
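For the manual fallback, pre-generated peptides need to be written as FASTA before they are passed to `--variant-peptides`. A minimal sketch follows; `peptides_to_fasta` is a hypothetical helper, and the `>pepN` record IDs are an assumption, since the header format the script expects is not documented here.

```python
from typing import Iterable


def peptides_to_fasta(peptides: Iterable[str]) -> str:
    """Format peptides as FASTA text, skipping blanks and duplicates."""
    seen = set()
    records = []
    for peptide in peptides:
        peptide = peptide.strip().upper()
        if not peptide or peptide in seen:
            continue
        seen.add(peptide)
        # Assumed header convention: sequential IDs pep1, pep2, ...
        records.append(f">pep{len(seen)}\n{peptide}\n")
    return "".join(records)
```

The returned text can be saved as peptides.fasta, for example with `pathlib.Path("peptides.fasta").write_text(...)`, and then supplied to the script.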

Output Requirements

Every final response must make these items explicit when relevant:

Objective or requested deliverable
Inputs used and assumptions introduced
Workflow or decision path
Core result, recommendation, or artifact
Constraints, risks, caveats, or validation needs (always include research-only disclaimer)
Unresolved items and next-step checks

Error Handling

If required inputs are missing, state exactly which fields are missing and request only the minimum additional information.
If the task goes outside the documented scope, stop instead of guessing or silently widening the assignment.
If scripts/main.py fails, report the failure point, summarize what still can be completed safely, and provide a manual fallback.
Do not fabricate files, citations, data, search results, or execution outcomes.

Response Template

Use the following fixed structure for non-trivial requests:

Objective
Inputs Received
Assumptions
Workflow
Deliverable
Risks and Limits (always include research-only disclaimer)
Next Checks

For stress/multi-constraint requests, also include:

Constraints checklist (compliance, performance, error paths)
Explicit boundary statement confirming no clinical decisions were made
Unresolved items with explicit blocking reasons

If the request is simple, you may compress the structure, but always keep the research disclaimer and scope limits explicit.

Repository: aipoch/medical-research-skills
Commit: ca9aaa4

Last updated: about 2 months ago
Created: about 2 months ago
