backend-tests

Run backend tests and code quality checks for OPRE OPS. Covers ops_api pytest, data_tools pytest, and nox linting/formatting sessions. Use this skill when the user wants to run backend tests, check code quality, lint Python code, run pytest, or verify their backend changes pass CI checks — even if they just say "run the tests" or "does this pass".

Quality

88%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Passed

No known issues

Quality

Discovery

100%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

This is an excellent skill description that hits all the marks. It names specific tools and test suites, provides a clear 'Use this skill when...' clause with both formal and casual trigger phrases, and is scoped to a specific project making it highly distinctive. The third-person voice is used correctly throughout.

Dimension	Reasoning	Score
Specificity	Lists multiple specific concrete actions: 'ops_api pytest', 'data_tools pytest', 'nox linting/formatting sessions'. Names the specific project (OPRE OPS) and specific tools (pytest, nox).	3 / 3
Completeness	Clearly answers both 'what' (run backend tests and code quality checks covering ops_api pytest, data_tools pytest, nox linting/formatting) and 'when' (explicit 'Use this skill when...' clause with multiple trigger scenarios including casual phrases).	3 / 3
Trigger Term Quality	Excellent coverage of natural trigger terms: 'run backend tests', 'check code quality', 'lint Python code', 'run pytest', 'pass CI checks', 'run the tests', 'does this pass'. Includes both formal and casual phrasings users would naturally say.	3 / 3
Distinctiveness / Conflict Risk	Highly distinctive: scoped to a specific project (OPRE OPS), specific test suites (ops_api, data_tools), and specific tools (pytest, nox). The combination of project name and backend-specific tooling makes it very unlikely to conflict with other skills.	3 / 3
	Total	12 / 12 Passed

Implementation

77%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a strong, highly actionable skill that provides clear executable commands for every scenario and good workflow sequencing with validation steps. Its main weakness is length — while most content earns its place, the skill could be slightly more concise by trimming explanatory prose. The single-file structure is acceptable given no bundle exists, but a more complex project might benefit from splitting reference material.

Suggestions

Trim explanatory prose like the loaded_db fixture description and the 'Note on data_tools lint' paragraph to just the essential actionable takeaway (e.g., 'data_tools has known pre-existing lint violations — only report issues in changed files').

Consider extracting 'Key File Locations' and 'Common Issues' into a separate reference file if the bundle grows, to keep the main skill focused on execution flow.
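The takeaway proposed in the first suggestion ("only report issues in changed files") could be implemented roughly as follows. This is a minimal sketch, not the skill's actual logic: the function name `filter_to_changed_files`, the example file paths, and the assumption that each lint finding is printed as `path:line:col: message` are all hypothetical.

```python
from pathlib import PurePosixPath


def filter_to_changed_files(lint_lines, changed_files):
    """Keep only lint findings whose file path is in changed_files.

    Assumes each finding starts with "path:line:col: message", a layout
    many Python linters print. Lines without a colon are dropped.
    """
    changed = {PurePosixPath(p) for p in changed_files}
    kept = []
    for line in lint_lines:
        path, sep, _rest = line.partition(":")
        if sep and PurePosixPath(path.strip()) in changed:
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical findings and paths, for illustration only.
    findings = [
        "data_tools/src/legacy.py:10:1: E501 line too long",
        "data_tools/src/new_loader.py:3:1: F401 unused import",
    ]
    changed = ["data_tools/src/new_loader.py"]
    for finding in filter_to_changed_files(findings, changed):
        print(finding)
```

In practice the list of changed files would come from version control, and pre-existing violations in untouched files would simply never reach the report.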

Dimension	Reasoning	Score
Conciseness	The skill is fairly long but most content is actionable commands and routing logic. Some sections could be tightened (e.g., the data_tools fixture explanation, the 'Note on data_tools lint' paragraph), and the help text in section 7 is somewhat redundant with the section headers themselves. Overall mostly efficient but not maximally lean.	2 / 3
Actionability	Every section provides concrete, copy-paste-ready bash commands with exact paths, tool invocations, and flags. The routing logic for $ARGUMENTS is explicit, the test path conventions are documented, and the expected output format (summary table) is specified. Troubleshooting includes specific error messages and fixes.	3 / 3
Workflow Clarity	The 'all/ci' workflow is clearly sequenced as Steps 1-4 with a final summary checkpoint. The data_tools section includes a Docker pre-flight validation check. The format-check section includes a feedback loop (check -> offer to auto-fix). The routing logic across 7 argument cases is unambiguous with clear decision criteria.	3 / 3
Progressive Disclosure	The content is well-structured with clear section headers and a logical flow from argument routing to key files to common issues. However, at ~150 lines it's a monolithic file with no references to supporting documents. The 'Key File Locations' and 'Common Issues' sections could potentially be split out, though the lack of bundle files means there's nothing to reference.	2 / 3
	Total	10 / 12 Passed

Validation

81%

Warnings & errors only

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 9 / 11 Passed

Validation for skill structure

Criteria	Description	Result
allowed_tools_field	'allowed-tools' contains unusual tool name(s)	Warning
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning

	Total	9 / 11 Passed
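The frontmatter_unknown_keys warning recommends removing unknown keys or moving them to metadata. The sketch below shows one way the second option might look once the frontmatter has been parsed into a dictionary. The `KNOWN_KEYS` set, the function name `move_unknown_keys`, and the example key `argument-hint` are assumptions made for illustration; the review does not say which keys triggered the warning, and the validator's real list of recognised keys may differ.

```python
# Hypothetical set of recognised top-level keys; the real spec may differ.
KNOWN_KEYS = {"name", "description", "license", "allowed-tools", "metadata"}


def move_unknown_keys(frontmatter):
    """Return a copy with unrecognised top-level keys nested under 'metadata'."""
    cleaned = {k: v for k, v in frontmatter.items() if k in KNOWN_KEYS}
    unknown = {k: v for k, v in frontmatter.items() if k not in KNOWN_KEYS}
    if unknown:
        # Preserve any metadata already present, then add the moved keys.
        metadata = dict(cleaned.get("metadata") or {})
        metadata.update(unknown)
        cleaned["metadata"] = metadata
    return cleaned


if __name__ == "__main__":
    # "argument-hint" stands in for whichever key the validator flagged.
    parsed = {"name": "backend-tests", "argument-hint": "[all]"}
    print(move_unknown_keys(parsed))
```

The input dictionary is left unmodified, so the original frontmatter can still be compared against the cleaned version before anything is written back.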

Repository: HHS/OPRE-OPS
Commit: e2a9461

Reviewed: 2 days ago

