backend-tests

Run backend tests and code quality checks for OPRE OPS. Covers ops_api pytest, data_tools pytest, and nox linting/formatting sessions. Use this skill when the user wants to run backend tests, check code quality, lint Python code, run pytest, or verify their backend changes pass CI checks — even if they just say "run the tests" or "does this pass".

Quality: 67 (Does it follow best practices?)

Impact: No eval scenarios have been run

Security (by Snyk): Passed. No known issues

Quality

Content

65%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The body is highly actionable with concrete executable commands and clear sequencing, but loses points for redundant boilerplate that could be tightened, missing validation gates on the batch CI workflow, and a monolithic single-file structure with no progressive disclosure into reference files.

Suggestions

Consolidate the repeated 'cd backend/<pkg>' boilerplate and the verbose 'pytest -v --tb=short' variants into a shared note to remove redundant tokens across sections.

Add explicit validation gates to the 'all'/CI workflow (e.g., 'if a step fails, report and decide whether to continue before the next step') and a clearer validate→fix→retry loop for test failures, rather than running all four steps unconditionally.

Consider offloading the per-argument command recipes, Key File Locations, or Common Issues into a reference file to shrink the ~200-line body and give the skill genuine progressive disclosure.
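The validation-gate suggestion above can be sketched as a small shell helper. This is a minimal illustration, not the skill's actual implementation: `run_gated` is a hypothetical function name, and the step list in the usage comment is an assumption assembled from the commands quoted in this review.

```shell
#!/usr/bin/env bash
# Sketch only: run_gated is a hypothetical helper, not part of the skill.

# Run each command string in order. Stop at the first failure and return
# that step's 1-based index; return 0 when every step passed.
run_gated() {
  local total=$# step=0 cmd
  for cmd in "$@"; do
    step=$((step + 1))
    echo "Step ${step}/${total}: ${cmd}"
    # The subshell keeps any 'cd' inside a step from leaking into the next.
    if ! (eval "$cmd"); then
      echo "Step ${step}/${total} failed; fix it and re-run before continuing." >&2
      return "$step"
    fi
  done
  echo "All ${total} steps passed."
}

# Example usage (assumed step list; adjust to the skill's real 'all' sequence):
# run_gated \
#   "cd backend/ops_api && pipenv run pytest" \
#   "cd backend/ops_api && pipenv run nox -s lint"
```

Because the helper returns the index of the failing step, a caller can report exactly where the run stopped, apply a fix, and re-run from the top, which gives the validate, fix, retry loop the review asks for instead of running every step unconditionally.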

Dimension: Conciseness
Score: 2 / 3
Reasoning: The body is mostly efficient concrete commands with little concept-explanation, but redundancy could be tightened: the verbose 'pytest -v --tb=short' variant, repeated 'cd backend/<pkg>' boilerplate, and a default-help section that restates the seven numbered cases above. This fits 'mostly efficient but could be tightened' rather than the 'every token earns its place' level 3.

Dimension: Actionability
Score: 3 / 3
Reasoning: Fully executable, copy-paste-ready commands throughout ('cd backend/ops_api', 'pipenv run pytest', 'pipenv run nox -s lint', 'docker info > /dev/null 2>&1 && echo ...', and specific test paths), with no pseudocode. This matches the level 3 anchor and is clearly above the incomplete-guidance level 2.

Dimension: Workflow Clarity
Score: 2 / 3
Reasoning: Sequencing is clear (numbered cases, the 'all' Step 1/4–4/4 run, a Docker pre-flight check), but the batch CI workflow lacks explicit per-step validation gates or a validate→fix→retry feedback loop, and only data_tools has a pre-flight checkpoint. Per the rubric's batch-operation guideline, missing validation caps this at 2 rather than 3.

Dimension: Progressive Disclosure
Score: 2 / 3
Reasoning: No bundle/reference files exist and the ~200-line body is a single monolithic file. Sections are well-organized and navigable, but everything is inline with nothing offloaded to reference files, fitting 'content that should be separate is inline' (level 2) rather than the reference-split level 3.

Total: 9 / 12 (Passed)

Description

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description is strong across all dimensions: it states concrete actions, includes natural trigger phrasing with casual variants, explicitly covers both what and when, and is tightly scoped to the OPRE OPS backend tooling. No changes needed.

Dimension: Specificity
Score: 3 / 3
Reasoning: Lists multiple concrete actions ('ops_api pytest, data_tools pytest, and nox linting/formatting sessions' and 'lint Python code, run pytest') rather than vague language, matching the 'lists multiple specific concrete actions' anchor and exceeding the 'some actions' level 2.

Dimension: Completeness
Score: 3 / 3
Reasoning: Explicitly answers both what ('Run backend tests and code quality checks... Covers ops_api pytest, data_tools pytest, and nox linting/formatting sessions') and when ('Use this skill when the user wants to run backend tests... even if they just say "run the tests"'), satisfying the explicit-trigger level 3 anchor.

Dimension: Trigger Term Quality
Score: 3 / 3
Reasoning: Covers natural terms users would say ('run backend tests', 'check code quality', 'lint Python code', 'run pytest', plus casual variants 'run the tests' and 'does this pass'), giving good coverage including common variations, above level 2's 'missing common variations'.

Dimension: Distinctiveness Conflict Risk
Score: 3 / 3
Reasoning: Tied to a specific repo and toolset (OPRE OPS, ops_api/data_tools, pytest, nox) with distinct triggers, occupying a clear niche unlikely to fire for unrelated skills, matching the level 3 'clear niche with distinct triggers' anchor.

Total: 12 / 12 (Passed)

Validation

87%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 14 / 16 passed

Validation for skill structure

Criteria: allowed_tools_field
Result: Warning
Description: 'allowed-tools' contains unusual tool name(s)

Criteria: frontmatter_unknown_keys
Result: Warning
Description: Unknown frontmatter key(s) found; consider removing or moving to metadata

Total: 14 / 16 (Passed)

Repository: HHS/OPRE-OPS
Reviewed
