Technical leadership guidance for engineering teams, architecture decisions, and technology strategy. Use when assessing technical debt, scaling engineering teams, evaluating technologies, making architecture decisions, establishing engineering metrics, or when user mentions CTO, tech debt, technical debt, team scaling, architecture decisions, technology evaluation, engineering metrics, DORA metrics, or technology strategy.
91
88%
Does it follow best practices?
Impact
92%
1.43x
Average score across 6 eval scenarios
Passed
No known issues
Quality
Discovery
92%
Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.
This is a strong skill description that clearly articulates what it does and when to use it, with comprehensive trigger terms covering both formal and colloquial variations. The main weakness is the breadth of the domain—covering everything from team scaling to architecture decisions to engineering metrics—which could create overlap with more specialized skills in those individual areas. Overall, it follows best practices with third-person voice, explicit 'Use when' clause, and natural keyword coverage.
| Dimension | Reasoning | Score |
|---|---|---|
| Specificity | Lists multiple specific concrete actions: assessing technical debt, scaling engineering teams, evaluating technologies, making architecture decisions, establishing engineering metrics. These are clear, actionable capabilities. | 3 / 3 |
| Completeness | Clearly answers both 'what' (technical leadership guidance for engineering teams, architecture decisions, and technology strategy) and 'when' with an explicit 'Use when...' clause listing specific trigger scenarios and terms. | 3 / 3 |
| Trigger Term Quality | Excellent coverage of natural terms users would say: 'CTO', 'tech debt', 'technical debt', 'team scaling', 'architecture decisions', 'technology evaluation', 'engineering metrics', 'DORA metrics', 'technology strategy'. Includes both formal and informal variations (e.g., 'tech debt' and 'technical debt'). | 3 / 3 |
| Distinctiveness Conflict Risk | While terms like 'CTO', 'DORA metrics', and 'tech debt' are fairly distinctive, 'architecture decisions' and 'technology evaluation' could overlap with general software engineering or DevOps skills. The scope is broad enough that it might conflict with more specialized skills in any of its sub-domains. | 2 / 3 |
| Total | 11 / 12 Passed | |
Implementation
85%
Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.
This is a well-structured CTO advisor skill with strong actionability through concrete workflows, scoring frameworks, and specific metrics targets. The progressive disclosure is excellent, with clear references to deeper materials. The main weakness is moderate verbosity — some sections explain concepts Claude already understands (general leadership principles, what ADRs are conceptually) and the 'Key Questions' section is largely filler that Claude could generate on its own.
Suggestions
Trim the 'Key Questions a CTO Asks' section — Claude can generate these; instead, use that space for more specific decision heuristics or thresholds.
Reduce explanatory text in sections like Culture and Crisis Management that describe well-known engineering leadership concepts Claude already knows, keeping only the specific opinionated guidance (e.g., the 1:2 senior:junior ratio).
| Dimension | Reasoning | Score |
|---|---|---|
| Conciseness | The skill is fairly comprehensive but includes some content Claude already knows (e.g., explaining what ADRs are, general leadership platitudes like 'blameless post-mortems'). The 'Key Questions a CTO Asks' section and some of the cultural guidance are things Claude could generate without instruction. However, the structured tables and specific thresholds add genuine value. | 2 / 3 |
| Actionability | The skill provides concrete, executable workflows with specific commands, scoring templates with weighted criteria, example output tables, specific numeric thresholds (e.g., tech debt ratio < 25%, MTTR < 1 hour), and clear formulas for prioritization (Severity × Blast Radius / Cost-to-fix). The build vs buy scoring matrix and tech debt inventory are copy-paste ready frameworks. | 3 / 3 |
| Workflow Clarity | Multi-step workflows are clearly sequenced with explicit validation checkpoints. The Tech Debt Assessment workflow includes a checklist before presenting to stakeholders, the ADR workflow has a validation checkpoint with specific criteria before finalizing, and both include feedback loops (review with tech leads, ensure capacity fits). The Build vs Buy workflow chains into the ADR workflow for documentation. | 3 / 3 |
| Progressive Disclosure | The skill provides a clear overview with well-signaled one-level-deep references to detailed materials (references/technology_evaluation_framework.md, references/engineering_metrics.md, references/architecture_decision_records.md). Content is appropriately split between the main skill file and reference documents, with inline content kept to actionable summaries and tables. | 3 / 3 |
| Total | 11 / 12 Passed | |
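The prioritization formula cited under Actionability (Severity × Blast Radius / Cost-to-fix) can be illustrated with a minimal sketch. Only the formula itself comes from the review; the `DebtItem` class, the 1 to 5 scales, the engineer-week unit, and the sample items are hypothetical and are not taken from the skill.

```python
from dataclasses import dataclass


@dataclass
class DebtItem:
    """One tech debt entry. Field scales and units are assumptions."""
    name: str
    severity: int       # assumed scale: 1 (minor) to 5 (critical)
    blast_radius: int   # assumed scale: 1 (one module) to 5 (whole system)
    cost_to_fix: float  # assumed unit: estimated engineer-weeks

    def priority(self) -> float:
        # Formula quoted in the review: Severity x Blast Radius / Cost-to-fix
        if self.cost_to_fix <= 0:
            raise ValueError("cost_to_fix must be positive")
        return self.severity * self.blast_radius / self.cost_to_fix


def rank(items):
    """Return items sorted from highest to lowest priority score."""
    return sorted(items, key=lambda item: item.priority(), reverse=True)


# Hypothetical inventory, for illustration only.
items = [
    DebtItem("legacy auth module", severity=5, blast_radius=4, cost_to_fix=8.0),
    DebtItem("flaky CI pipeline", severity=3, blast_radius=5, cost_to_fix=2.0),
    DebtItem("unused feature flags", severity=1, blast_radius=2, cost_to_fix=1.0),
]

for item in rank(items):
    print(f"{item.name}: {item.priority():.2f}")
```

Under these assumed inputs, a cheap fix with a wide blast radius outranks a more severe but expensive one, which is the behavior the ratio is meant to encourage.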
Validation
90%
Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.
Validation — 10 / 11 Passed
Validation for skill structure
| Criteria | Description | Result |
|---|---|---|
| metadata_field | 'metadata' should map string keys to string values | Warning |
| Total | 10 / 11 Passed | |
967fe01