giuseppe-trisciuoglio/developer-kit

Comprehensive developer toolkit providing reusable skills for Java/Spring Boot, TypeScript/NestJS/React/Next.js, Python, PHP, AWS CloudFormation, AI/RAG, DevOps, and more.

SKILL.md (plugins/developer-kit-specs/skills/task-quality-kpi/)

---
name: task-quality-kpi
description: Objective task quality evaluation framework using quantitative KPIs. KPIs are automatically calculated by a hook when task files are modified and saved to TASK-XXX--kpi.json. Use when: reading KPI data for task evaluation, understanding quality metrics, deciding whether to iterate or approve based on data.
allowed-tools: Read, Write
---

Task Quality KPI Framework

Overview

The Task Quality KPI Framework provides objective, quantitative metrics for evaluating task implementation quality.

Key Architecture: KPIs are auto-generated by a hook; you read the results rather than running scripts yourself.

```
┌─────────────────────────────────────────────────────────────┐
│  HOOK (auto-executes)                                       │
│  Trigger: PostToolUse on TASK-*.md                          │
│  Script: task-kpi-analyzer.py                               │
│  Output: TASK-XXX--kpi.json                                 │
├─────────────────────────────────────────────────────────────┤
│  SKILL / AGENT (reads output)                               │
│  Input: TASK-XXX--kpi.json                                  │
│  Action: Make evaluation decisions                          │
└─────────────────────────────────────────────────────────────┘
```

Why This Architecture?

| Problem | Solution |
| --- | --- |
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |

KPI File Location

After any task file modification, find KPI data at:

docs/specs/[ID]/tasks/TASK-XXX--kpi.json

KPI Categories

```
┌─────────────────────────────────────────────────────────────┐
│                    OVERALL SCORE (0-10)                     │
├─────────────────────────────────────────────────────────────┤
│  Spec Compliance (30%)                                      │
│  ├── Acceptance Criteria Met (0-10)                         │
│  ├── Requirements Coverage (0-10)                           │
│  └── No Scope Creep (0-10)                                  │
├─────────────────────────────────────────────────────────────┤
│  Code Quality (25%)                                         │
│  ├── Static Analysis (0-10)                                 │
│  ├── Complexity (0-10)                                      │
│  └── Patterns Alignment (0-10)                              │
├─────────────────────────────────────────────────────────────┤
│  Test Coverage (25%)                                        │
│  ├── Unit Tests Present (0-10)                              │
│  ├── Test/Code Ratio (0-10)                                 │
│  └── Coverage Percentage (0-10)                             │
├─────────────────────────────────────────────────────────────┤
│  Contract Fulfillment (20%)                                 │
│  ├── Provides Verified (0-10)                               │
│  └── Expects Satisfied (0-10)                               │
└─────────────────────────────────────────────────────────────┘
```

Category Weights

| Category | Weight | Why |
| --- | --- | --- |
| Spec Compliance | 30% | Most important: did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |
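The weighted sum behind the overall score can be sketched in a few lines of Python. The weights come from the table above; the helper itself is illustrative, not part of the hook:

```python
# Category weights from the table above, as fractions of the overall score.
WEIGHTS = {
    "Spec Compliance": 0.30,
    "Code Quality": 0.25,
    "Test Coverage": 0.25,
    "Contract Fulfillment": 0.20,
}

def overall_score(category_scores):
    """Weighted sum of per-category scores, each on a 0-10 scale."""
    return round(sum(WEIGHTS[cat] * score
                     for cat, score in category_scores.items()), 2)

# Example: a task scoring 8.5 on Spec Compliance and 8.0 everywhere else
# yields 0.30*8.5 + 0.25*8.0 + 0.25*8.0 + 0.20*8.0 = 8.15 overall.
```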

When to Use

  • Reading KPI data for task quality evaluation
  • Understanding quality metrics and scoring breakdown
  • Deciding whether to iterate or approve based on quantitative data
  • Integrating KPI checks into automated loops (agents_loop.py)
  • Generating evidence-based evaluation reports

Instructions

1. Reading KPI Data (Primary Use)

DO NOT run scripts - read the auto-generated file:

Read the KPI file:
  docs/specs/001-feature/tasks/TASK-001--kpi.json
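A minimal read helper, assuming only the standard library (the function name is illustrative):

```python
import json
from pathlib import Path

def load_kpi(path):
    """Return the parsed KPI dict, or None if the hook has not generated it yet."""
    kpi_file = Path(path)
    if not kpi_file.exists():
        return None
    return json.loads(kpi_file.read_text())

# kpi = load_kpi("docs/specs/001-feature/tasks/TASK-001--kpi.json")
```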

2. Understanding the Data

The KPI file contains:

```json
{
  "task_id": "TASK-001",
  "evaluated_at": "2026-01-15T10:30:00Z",
  "overall_score": 8.2,
  "passed_threshold": true,
  "threshold": 7.5,
  "kpi_scores": [
    {
      "category": "Spec Compliance",
      "weight": 30,
      "score": 8.5,
      "weighted_score": 2.55,
      "metrics": {
        "acceptance_criteria_met": 9.0,
        "requirements_coverage": 8.0,
        "no_scope_creep": 8.5
      },
      "evidence": [
        "Acceptance criteria: 9/10 checked",
        "Requirements coverage: 8/10"
      ]
    }
  ],
  "recommendations": [
    "Code Quality: Moderate improvements possible"
  ],
  "summary": "Score: 8.2/10 - PASSED"
}
```

3. Making Decisions

Use overall_score and passed_threshold:

```
IF passed_threshold == true:
  → Task meets quality standards
  → Approve and proceed

IF passed_threshold == false:
  → Task needs improvement
  → Check recommendations for specific targets
  → Create fix specification
```
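The branch above can be expressed as a small decision helper (names are illustrative):

```python
def decide(kpi):
    """Map auto-generated KPI data to a review decision."""
    if kpi["passed_threshold"]:
        return "approve"
    # Below threshold: surface the recommendations as concrete fix targets.
    return "fix: " + "; ".join(kpi["recommendations"])
```

The string return value is a stand-in for whatever your workflow does next: approve the task, or create a fix specification from the recommendations.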

Integration with Workflow

In Task Review (evaluator-agent)

## Review Process

1. Read KPI file: TASK-XXX--kpi.json
2. Extract overall_score and kpi_scores
3. Read task file to validate
4. Generate evaluation report
5. Decision based on passed_threshold

In agents_loop

```python
# Check KPI file exists
kpi_path = spec_path / "tasks" / f"{task_id}--kpi.json"

if kpi_path.exists():
    kpi_data = json.loads(kpi_path.read_text())

    if kpi_data["passed_threshold"]:
        # Quality threshold met
        advance_state("update_done")
    else:
        # Need more work
        fix_targets = kpi_data["recommendations"]
        create_fix_task(fix_targets)
        advance_state("fix")
else:
    # KPI not generated yet - task may not be implemented
    log_warning("No KPI data found")
```

Multi-Iteration Loop

Instead of stopping after a fixed maximum of three retries, iterate until the quality threshold is met:

```
Iteration 1: Score 6.2 → FAILED → Fix: Improve test coverage
Iteration 2: Score 7.1 → FAILED → Fix: Refactor complex functions
Iteration 3: Score 7.8 → PASSED → Proceed
```

Each iteration updates the KPI file automatically on task save.
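Sketched as a loop, where `run_iteration` is a hypothetical callable standing in for one implement-and-save cycle that returns the fresh KPI dict the hook wrote:

```python
def iterate_until_passed(run_iteration, max_iterations=10):
    """Repeat implement/fix cycles until the KPI threshold is met.

    A safety cap still exists, but it bounds runaway loops rather than
    forcing a fixed retry count.
    """
    for attempt in range(1, max_iterations + 1):
        kpi = run_iteration(attempt)  # hook regenerates TASK-XXX--kpi.json on save
        if kpi["passed_threshold"]:
            return kpi
    raise RuntimeError("Quality threshold not reached within the safety cap")
```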

Threshold Guidelines

| Score | Quality Level | Action |
| --- | --- | --- |
| 9.0-10.0 | Exceptional | Approve, document best practices |
| 8.0-8.9 | Good | Approve with minor notes |
| 7.0-7.9 | Acceptable | Approve (if threshold 7.5) |
| 6.0-6.9 | Below Standard | Request specific improvements |
| < 6.0 | Poor | Significant rework required |

Recommended Thresholds

| Project Type | Threshold | Rationale |
| --- | --- | --- |
| Production MVP | 8.0 | High quality required |
| Internal Tool | 7.0 | Good enough |
| Prototype | 6.0 | Functional over perfect |
| Critical System | 8.5 | No compromises |

Metric Details

Spec Compliance Metrics

Acceptance Criteria Met

  • Calculates: (checked_criteria / total_criteria) * 10
  • Source: Task file checkbox count
  • Example: 9/10 checked = 9.0

Requirements Coverage

  • Calculates: Count of REQ-IDs this task covers
  • Source: traceability-matrix.md
  • Example: 4 requirements covered = 8.0

No Scope Creep

  • Calculates: (implemented_files / expected_files) * 10
  • Source: Task "Files to Create" vs actual files
  • Penalizes: Missing files or unexpected additions
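The real calculation lives in task-kpi-analyzer.py, which you should not run directly; purely as an illustration, the stated formulas behave roughly like:

```python
def acceptance_criteria_met(checked, total):
    """(checked_criteria / total_criteria) * 10, e.g. 9/10 checked -> 9.0."""
    return round(checked / total * 10, 1) if total else 0.0

def no_scope_creep(implemented, expected):
    """(implemented_files / expected_files) * 10. Capping at 10 is an
    assumption here; the real metric also penalizes unexpected additions."""
    return min(10.0, round(implemented / expected * 10, 1)) if expected else 0.0
```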

Code Quality Metrics

Static Analysis

  • Java: Maven Checkstyle
  • TypeScript: ESLint
  • Python: ruff
  • Score: 10 if passes, 5 if issues found

Complexity

  • Calculates: Functions >50 lines
  • Score: 10 - (long_functions_ratio * 5)
  • Penalizes: Large, complex functions
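A sketch of the stated formula (illustrative only; the hook computes this):

```python
def complexity_score(long_functions, total_functions):
    """10 - (long_functions_ratio * 5), where 'long' means over 50 lines."""
    if total_functions == 0:
        return 10.0
    ratio = long_functions / total_functions
    return round(10 - ratio * 5, 1)

# 2 long functions out of 10 -> ratio 0.2 -> score 9.0
```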

Patterns Alignment

  • Checks: Knowledge Graph patterns
  • Source: knowledge-graph.json
  • Validates: Implementation follows project patterns

Test Coverage Metrics

Unit Tests Present

  • Calculates: min(10, test_files * 5)
  • 2 test files = maximum score
  • Penalizes: Missing tests

Test/Code Ratio

  • Calculates: (test_count / code_count) * 10
  • 1:1 ratio = 10/10
  • Ideal: At least 1 test file per code file

Coverage Percentage

  • Source: Coverage reports (JaCoCo, lcov, etc.)
  • Calculates: coverage_percent / 10
  • 80% coverage = 8.0

Contract Fulfillment Metrics

Provides Verified

  • Checks: Files exist and export expected symbols
  • Source: Task provides frontmatter
  • Validates: Contract satisfied

Expects Satisfied

  • Checks: Dependencies provide required files/symbols
  • Source: Task expects frontmatter
  • Validates: Prerequisites met

When KPI File is Missing

If TASK-XXX--kpi.json doesn't exist:

  1. Task was never modified - Hook runs on file save
  2. Hook failed - Check Claude Code logs
  3. Task is new - Save the file first to trigger hook

DO NOT try to calculate KPIs manually. The hook runs automatically when:

  • Task file is saved (Write tool)
  • Task file is edited (Edit tool)

Best Practices

1. Always Check KPI File Exists

Before evaluating:

Check if KPI file exists:
  docs/specs/[ID]/tasks/TASK-XXX--kpi.json

If missing:
  - Task may not be implemented yet
  - Ask user to save the task file first

2. Trust the Metrics

The KPIs are objective. Only override with documented evidence:

  • Critical security issue not in metrics
  • Logic error not caught by static analysis
  • Exceptional quality not measured

3. Iterate on Low KPIs

Target specific categories:

❌ "Fix code quality issues"
✅ "Improve Code Quality KPI from 5.2 to 7.0:
    - Complexity: Refactor processData() (5→8)
    - Patterns: Add error handling (6→8)"

4. Track KPI Trends

Monitor quality over time:

Sprint 1: Average KPI 6.8
Sprint 2: Average KPI 7.3 (+0.5)
Sprint 3: Average KPI 7.9 (+0.6)

Troubleshooting

KPI File Not Generated

Check:

  1. Hook enabled in hooks.json
  2. Task file name matches pattern TASK-*.md
  3. File was actually saved (not just viewed)

KPI Scores Seem Wrong

Validate:

  1. Check evidence field for data sources
  2. Verify files exist at expected paths
  3. Some metrics need build tools (Maven, npm)

Low Scores Despite Good Code

Possible causes:

  • Missing test files
  • No coverage report generated
  • Acceptance criteria not checked
  • Lint rules too strict

Fix the root cause, not just the score.

Examples

Example 1: Reading KPI Data

Read the KPI file to evaluate task quality:
  docs/specs/001-feature/tasks/TASK-042--kpi.json

Based on the data:
- Overall score: 6.8/10 (below threshold)
- Lowest KPI: Test Coverage (5.0/10)
- Recommendation: Add unit tests

Decision: REQUEST FIXES - target Test Coverage improvement

Example 2: Iteration Decision

Iteration 1 KPI: Score 6.2 → FAILED
- Spec Compliance: 7.0 ✓
- Code Quality: 5.5 ✗
- Test Coverage: 6.0 ✗

Fix targets:
1. Refactor complex functions (Code Quality)
2. Add test coverage (Test Coverage)

Iteration 2 KPI: Score 7.8 → PASSED ✓

Example 3: agents_loop Integration

```python
# In agents_loop, after implementation step
kpi_file = spec_dir / "tasks" / f"{task_id}--kpi.json"

if kpi_file.exists():
    kpi = json.loads(kpi_file.read_text())

    if kpi["passed_threshold"]:
        print(f"✅ Task passed quality check: {kpi['overall_score']}/10")
        advance_state("update_done")
    else:
        print(f"❌ Task failed quality check: {kpi['overall_score']}/10")
        print("Recommendations:")
        for rec in kpi["recommendations"]:
            print(f"  - {rec}")
        advance_state("fix")
```

References

  • evaluator-agent.md - Agent that uses KPI data for evaluation
  • hooks.json - Hook configuration for auto-generation
  • task-kpi-analyzer.py - Hook script (do not execute directly)
  • agents_loop.py - Orchestrator that reads KPI for decisions
