Python development with ruff, mypy, pytest - TDD and type safety
55
45%
Does it follow best practices?
Impact
Pending
No eval scenarios have been run
Passed
No known issues
Optimize this skill with Tessl
npx tessl skill review --optimize ./skills/python/SKILL.md
Quality
Discovery
32%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
The description reads more like a tag list than a skill description. It names specific tools, which aids matching, but it lacks concrete actions, a 'Use when...' clause, and the natural-language trigger terms users would employ. It needs significant expansion to work as a selector among many skills.
Suggestions
Add a 'Use when...' clause specifying triggers, e.g., 'Use when the user asks about Python testing, linting, type checking, or wants to follow TDD practices with ruff, mypy, or pytest.'
Expand with concrete actions, e.g., 'Writes tests first following TDD methodology, configures and runs ruff for linting, enforces type safety with mypy, and runs pytest test suites.'
Include common natural language variations users might say, such as 'linting', 'type checking', 'unit tests', 'test-driven development', 'code quality', and 'Python testing'.
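The three suggestions above can be turned into a quick pre-check on a draft description. The sketch below is a hypothetical helper, not Tessl's scoring logic: the function name `missing_description_elements` and the term list are assumptions taken from the wording of this review.

```python
# Hypothetical pre-check for a skill description; not Tessl's actual rubric.
# The terms are the natural-language variations listed in the suggestions.
TRIGGER_TERMS = (
    "linting",
    "type checking",
    "unit tests",
    "test-driven development",
    "code quality",
    "Python testing",
)


def missing_description_elements(description: str) -> list[str]:
    """Return the suggested elements that the description does not contain."""
    text = description.lower()
    missing: list[str] = []
    if "use when" not in text:
        missing.append("'Use when...' clause")
    for term in TRIGGER_TERMS:
        if term.lower() not in text:
            missing.append(f"trigger term: {term}")
    return missing


# The description under review, copied from the page title.
current = "Python development with ruff, mypy, pytest - TDD and type safety"
print(missing_description_elements(current))
```

A plain substring match is deliberately crude; it only flags whether the wording suggested by the review is present, not whether the description reads well.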
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Names the domain (Python development) and lists specific tools (ruff, mypy, pytest) along with methodologies (TDD, type safety), but doesn't describe concrete actions like 'run tests', 'lint code', 'check types', or 'write test-first implementations'. | 2 / 3 |
| Completeness | Provides a partial 'what' (Python development with specific tools) but completely lacks a 'when' clause or any explicit trigger guidance. Per rubric guidelines, a missing 'Use when...' clause caps completeness at 2, and the 'what' is also weak enough to warrant a 1. | 1 / 3 |
| Trigger Term Quality | Includes relevant tool names (ruff, mypy, pytest) and concepts (TDD, type safety) that users might mention, but misses common variations like 'linting', 'type checking', 'unit tests', 'testing', 'Python tests', or 'code quality'. | 2 / 3 |
| Distinctiveness Conflict Risk | The specific tool names (ruff, mypy, pytest) help distinguish it from generic Python skills, but 'Python development' is broad enough to overlap with other Python-related skills. The combination of tools provides some distinctiveness but could still conflict with general Python coding skills. | 2 / 3 |
| Total | | 7 / 12 Passed |
Implementation
57%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
The skill provides excellent concrete, executable examples for Python tooling configuration and patterns, making it highly actionable. However, it's a monolithic file that mixes quick-reference material with lengthy CI/CD configs that would be better as separate references. Despite the description mentioning TDD, there's no explicit TDD workflow with validation checkpoints, and some content (anti-patterns, basic project structure) is knowledge Claude already possesses.
Suggestions
Add an explicit TDD workflow section with clear steps: 1. Write failing test → 2. Run pytest (verify failure) → 3. Implement → 4. Run pytest (verify pass) → 5. Run ruff/mypy → 6. Refactor
Move GitHub Actions, pre-commit config, and detailed patterns into separate referenced files (e.g., CI.md, PATTERNS.md) to reduce the main skill's token footprint
Remove anti-patterns list and basic project structure that Claude already knows, or condense to project-specific conventions only
Add a quick-start summary at the top showing the core development loop commands: `ruff check .`, `mypy --strict src/`, `pytest`
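The test-first loop in the first suggestion could look like the sketch below. `slugify` is a hypothetical function chosen purely for illustration and is not taken from the reviewed skill. In a real project the tests would live under `tests/` and the implementation under `src/`, with `pytest` run after each step.

```python
import re


# Step 1: write the tests first. Run before slugify exists, pytest fails
# with a NameError (an ImportError in a real package layout), which is the
# expected failing state for step 2.
def test_slugify_lowercases_and_joins_words() -> None:
    assert slugify("Hello World") == "hello-world"


def test_slugify_drops_punctuation() -> None:
    assert slugify("  Type-safe, tested!  ") == "type-safe-tested"


# Step 3: add the smallest implementation that makes the tests pass.
def slugify(text: str) -> str:
    """Lowercase text and join its alphanumeric runs with hyphens."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return "-".join(words)
```

The annotations on every function are there so that step 5 (`mypy --strict`) has something to check; refactoring in step 6 is safe only once the tests pass again.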
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | Generally efficient with good use of code examples, but includes some content Claude already knows well (project structure conventions, anti-patterns like mutable default arguments, bare except clauses). The GitHub Actions and pre-commit sections are quite lengthy and could be referenced externally. | 2 / 3 |
| Actionability | Provides fully executable, copy-paste ready code examples throughout: pyproject.toml config, pytest examples with Arrange/Act/Assert, GitHub Actions workflow, pre-commit config, and concrete Python patterns with good/bad comparisons. | 3 / 3 |
| Workflow Clarity | The skill covers multiple tools (ruff, mypy, pytest) with clear configuration but lacks an explicit development workflow sequence. There's no TDD workflow despite the description mentioning TDD, and no validation/feedback loop for the development cycle (e.g., write test → run test → implement → verify). | 2 / 3 |
| Progressive Disclosure | Everything is in a single monolithic file with no references to external files for detailed content. The GitHub Actions, pre-commit config, and patterns sections could easily be split into separate reference files. The only reference is 'Load with: base.md' which is unclear. | 1 / 3 |
| Total | | 8 / 12 Passed |
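The missing validation loop noted under Workflow Clarity could be as small as a script that runs the three commands from the quick-start suggestion in order. This is a sketch under assumptions: `run_checks` and its `runner` parameter are hypothetical names, and the `src/` path is taken from the suggestion rather than from the skill itself.

```python
import subprocess
import sys
from collections.abc import Callable, Sequence

# Commands quoted from the review's quick-start suggestion.
CHECKS: tuple[tuple[str, ...], ...] = (
    ("ruff", "check", "."),
    ("mypy", "--strict", "src/"),
    ("pytest",),
)


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command, check=False).returncode


def run_checks(
    checks: Sequence[Sequence[str]] = CHECKS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> int:
    """Run checks in order; stop at the first failure and return its code."""
    for command in checks:
        code = runner(command)
        if code != 0:
            print(f"failed: {' '.join(command)}", file=sys.stderr)
            return code
    return 0
```

Calling `sys.exit(run_checks())` from a script entry point would give the loop a single exit status. The `runner` parameter exists only so the ordering logic can be tested without the tools installed.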
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| frontmatter_unknown_keys | Unknown frontmatter key(s) found; consider removing or moving to metadata | Warning |
| Total | 10 / 11 Passed | |
d4ddb03