
backtesting-frameworks

Build robust, production-grade backtesting systems that avoid common pitfalls and produce reliable strategy performance estimates.


Quality

41%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security (by Snyk)

Passed

No known issues


Quality

Discovery

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a clear domain (backtesting) but relies on qualitative adjectives ('robust', 'production-grade', 'reliable') rather than listing concrete actions. It completely lacks a 'Use when...' clause, making it harder for Claude to know when to select this skill. The description would benefit significantly from specific actions and explicit trigger conditions.

Suggestions

Add a 'Use when...' clause with trigger terms like 'backtest', 'trading strategy', 'historical simulation', 'strategy evaluation', 'portfolio backtest'.

Replace vague qualifiers ('robust', 'production-grade') with specific concrete actions such as 'simulate trades against historical data, model slippage and transaction costs, calculate risk-adjusted returns, detect lookahead bias'.

Include common file types or tool references users might mention, such as 'OHLCV data', 'equity curves', 'Sharpe ratio', or 'drawdown analysis'.
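Taken together, these suggestions could produce a description like the one sketched below. The wording is illustrative only, not the maintainer's actual frontmatter:

```yaml
---
name: backtesting-frameworks
description: >
  Simulate trading strategies against historical OHLCV data: model slippage
  and transaction costs, detect lookahead and survivorship bias, and report
  risk-adjusted metrics such as Sharpe ratio, max drawdown, and equity curves.
  Use when the user mentions backtesting, a trading strategy, historical
  simulation, strategy evaluation, or a portfolio backtest.
---
```

A description in this shape names concrete actions, carries the trigger terms listed above, and gives the agent an explicit 'Use when...' signal.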

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | Names the domain (backtesting systems) and mentions some qualities (robust, production-grade, avoid pitfalls, reliable performance estimates), but doesn't list specific concrete actions like 'simulate trades', 'calculate Sharpe ratios', 'handle slippage modeling', etc. | 2 / 3 |
| Completeness | Describes what it does at a high level (build backtesting systems) but completely lacks a 'Use when...' clause or any explicit trigger guidance for when Claude should select this skill. Per the rubric, a missing 'Use when...' clause caps completeness at 2, and the 'what' is also somewhat vague, warranting a score of 1. | 1 / 3 |
| Trigger Term Quality | Includes 'backtesting', which is a strong trigger term, and 'strategy performance' is relevant. However, it misses common variations users might say, like 'backtest', 'trading strategy', 'historical simulation', 'portfolio testing', 'quantitative finance', or 'strategy evaluation'. | 2 / 3 |
| Distinctiveness / Conflict Risk | 'Backtesting systems' is a fairly specific niche that wouldn't overlap with most skills, but the vague phrasing around 'production-grade systems' and 'strategy performance' could potentially overlap with general software engineering or quantitative analysis skills. | 2 / 3 |
| **Total** | | 7 / 12 |

Passed

Implementation

50%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill is concise and well-structured at a high level, but critically lacks actionability — the core instructions read like a table of contents rather than executable guidance. Without any concrete code examples, specific commands, or detailed steps in the main file, Claude would struggle to act on this skill without immediately needing the external resource. The workflow sequence is present but lacks validation checkpoints important for a complex multi-step process like backtesting.

Suggestions

Add at least one concrete, executable code example in the main SKILL.md (e.g., a minimal backtest loop skeleton with realistic cost modeling) so the skill is actionable without requiring the external resource.

Make the workflow steps more specific with explicit validation checkpoints — e.g., 'Verify no future data leakage by checking that all features use only data available at signal time' and 'Validate results by comparing in-sample vs out-of-sample Sharpe ratios.'

Expand the instructions section to include specific patterns for common pitfalls (look-ahead bias detection, survivorship bias handling) with concrete checks rather than abstract bullet points.
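The first suggestion asks for a minimal, executable backtest loop with realistic cost modeling. A sketch of what such an example could look like follows; the function names, cost figures, and toy data are illustrative assumptions, not content from the skill itself:

```python
# Minimal backtest loop sketch: the signal sees only data available before
# the current bar (a lookahead-bias guard), and position changes pay a flat
# commission plus slippage, both expressed in basis points.
def backtest(prices, signal_fn, cost_bps=5.0, slippage_bps=2.0):
    """prices: list of floats; signal_fn(history) -> -1, 0, or +1."""
    equity = 1.0
    position = 0
    curve = [equity]
    for t in range(1, len(prices)):
        # Point-in-time discipline: pass prices[:t], never prices[t].
        target = signal_fn(prices[:t])
        ret = prices[t] / prices[t - 1] - 1.0
        # Earn the bar's return with the position held going into the bar.
        equity *= 1.0 + position * ret
        if target != position:
            # Charge commission + slippage only when the position changes.
            equity *= 1.0 - (cost_bps + slippage_bps) / 10_000
            position = target
        curve.append(equity)
    return curve

# Toy usage: a naive momentum signal on a synthetic uptrend.
prices = [100 + i for i in range(20)]
curve = backtest(prices, lambda h: 1 if len(h) > 1 and h[-1] > h[-2] else 0)
```

Even a skeleton this small makes the skill actionable on its own: the agent can extend it with the detailed patterns in the external resource rather than starting from abstract bullet points.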

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The content is lean and efficient. It avoids explaining what backtesting is or how trading works, assumes Claude's competence, and every section serves a clear purpose without padding. | 3 / 3 |
| Actionability | The instructions are entirely abstract and vague ('Build point-in-time data pipelines,' 'Implement event-driven simulation') with no concrete code, commands, specific examples, or executable guidance. Everything actionable is deferred to an external resource file. | 1 / 3 |
| Workflow Clarity | There is a rough sequence implied (define hypothesis → build pipelines → implement simulation → use splits), but steps lack specificity, there are no validation checkpoints, and no feedback loops for error recovery in what is inherently a multi-step, error-prone process. | 2 / 3 |
| Progressive Disclosure | There is a reference to an external resource file for detailed patterns, which is good structure. However, the SKILL.md itself provides almost no substantive quick-start content; it's essentially just a pointer with bullet-point abstractions, making the overview too thin to be useful on its own. | 2 / 3 |
| **Total** | | 8 / 12 |

Passed
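One concrete form the missing validation checkpoint could take is an in-sample vs out-of-sample Sharpe comparison. The sketch below is a hypothetical illustration; the split fraction, decay threshold, and function names are assumptions, not part of the skill:

```python
import statistics

def sharpe(returns, eps=1e-12):
    """Per-period Sharpe ratio (no annualization), guarded against zero vol."""
    mean = statistics.fmean(returns)
    stdev = statistics.pstdev(returns)
    return mean / (stdev + eps)

def split_check(returns, train_frac=0.7, max_decay=0.5):
    """Flag likely overfitting when the out-of-sample Sharpe collapses
    relative to the in-sample Sharpe."""
    cut = int(len(returns) * train_frac)
    s_in = sharpe(returns[:cut])
    s_out = sharpe(returns[cut:])
    decayed = s_in > 0 and s_out < s_in * max_decay
    return s_in, s_out, decayed
```

A checkpoint like this gives the workflow a feedback loop: if `decayed` is flagged, the agent revisits the strategy rather than reporting the in-sample result.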

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 10 / 11 Passed

Validation for skill structure

| Criteria | Description | Result |
| --- | --- | --- |
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| **Total** | | 10 / 11 |

Passed
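The frontmatter_unknown_keys warning is typically resolved by nesting non-standard keys under a metadata block. A hypothetical before/after, with an illustrative key name (the actual offending key is not shown in this report):

```yaml
# Before: an unrecognized top-level key triggers the warning
---
name: backtesting-frameworks
description: ...
category: finance        # unknown top-level key
---

# After: the key is nested under metadata
---
name: backtesting-frameworks
description: ...
metadata:
  category: finance
---
```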

Repository: sickn33/antigravity-awesome-skills (Reviewed)
