
giuseppe-trisciuoglio/developer-kit

Comprehensive developer toolkit providing reusable skills for Java/Spring Boot, TypeScript/NestJS/React/Next.js, Python, PHP, AWS CloudFormation, AI/RAG, DevOps, and more.


plugins/developer-kit-specs/skills/task-quality-kpi/SKILL.md

---
name: task-quality-kpi
description: Objective task quality evaluation framework using quantitative KPIs. KPIs are automatically calculated by a hook when task files are modified and saved to TASK-XXX--kpi.json. Use when: reading KPI data for task evaluation, understanding quality metrics, deciding whether to iterate or approve based on data.
allowed-tools: Read, Write
---

Task Quality KPI Framework

Overview

The Task Quality KPI Framework provides objective, quantitative metrics for evaluating task implementation quality.

Key Architecture: KPIs are auto-generated by a hook - you read the results, not run scripts.

┌─────────────────────────────────────────────────────────────┐
│  HOOK (auto-executes)                                       │
│  Trigger: PostToolUse on TASK-*.md                          │
│  Script: task-kpi-analyzer.py                               │
│  Output: TASK-XXX--kpi.json                                 │
├─────────────────────────────────────────────────────────────┤
│  SKILL / AGENT (reads output)                               │
│  Input: TASK-XXX--kpi.json                                  │
│  Action: Make evaluation decisions                          │
└─────────────────────────────────────────────────────────────┘

Why This Architecture?

| Problem | Solution |
|---|---|
| Skills can't execute scripts | Hook auto-runs on file save |
| Subjective review_status | Quantitative 0-10 scores |
| "Looks good to me" | Evidence-based evaluation |
| Binary pass/fail | Graduated quality levels |

KPI File Location

After any task file modification, find KPI data at:

docs/specs/[ID]/tasks/TASK-XXX--kpi.json

KPI Categories

┌─────────────────────────────────────────────────────────────┐
│                    OVERALL SCORE (0-10)                     │
├─────────────────────────────────────────────────────────────┤
│  Spec Compliance (30%)                                      │
│  ├── Acceptance Criteria Met (0-10)                         │
│  ├── Requirements Coverage (0-10)                           │
│  └── No Scope Creep (0-10)                                  │
├─────────────────────────────────────────────────────────────┤
│  Code Quality (25%)                                         │
│  ├── Static Analysis (0-10)                                 │
│  ├── Complexity (0-10)                                      │
│  └── Patterns Alignment (0-10)                              │
├─────────────────────────────────────────────────────────────┤
│  Test Coverage (25%)                                        │
│  ├── Unit Tests Present (0-10)                              │
│  ├── Test/Code Ratio (0-10)                                 │
│  └── Coverage Percentage (0-10)                             │
├─────────────────────────────────────────────────────────────┤
│  Contract Fulfillment (20%)                                 │
│  ├── Provides Verified (0-10)                               │
│  └── Expects Satisfied (0-10)                               │
└─────────────────────────────────────────────────────────────┘

Category Weights

| Category | Weight | Why |
|---|---|---|
| Spec Compliance | 30% | Most important - did we build what was asked? |
| Code Quality | 25% | Technical excellence |
| Test Coverage | 25% | Verification and confidence |
| Contract Fulfillment | 20% | Integration with other tasks |
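The weighting above can be sketched as a simple weighted average. This is an illustrative sketch, not the hook's actual implementation (`task-kpi-analyzer.py` may compute it differently); the category names follow the table:

```python
# Sketch: combine per-category 0-10 scores into the overall score.
# Weights are percentages that sum to 100, as in the table above.
WEIGHTS = {
    "Spec Compliance": 30,
    "Code Quality": 25,
    "Test Coverage": 25,
    "Contract Fulfillment": 20,
}

def overall_score(category_scores: dict) -> float:
    """category_scores maps category name -> 0-10 score."""
    total = sum(WEIGHTS[cat] * score for cat, score in category_scores.items())
    return round(total / 100, 1)

# Example: strong spec compliance and contracts, weaker tests
print(overall_score({
    "Spec Compliance": 8.5,
    "Code Quality": 7.0,
    "Test Coverage": 6.0,
    "Contract Fulfillment": 9.0,
}))  # -> 7.6
```

Note how a weak Test Coverage score drags the overall result below a 8.0 threshold even when other categories are strong.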

When to Use

  • Reading KPI data for task quality evaluation
  • Understanding quality metrics and scoring breakdown
  • Deciding whether to iterate or approve based on quantitative data
  • Integrating KPI checks into automated loops (agents_loop.py)
  • Generating evidence-based evaluation reports

Instructions

1. Reading KPI Data (Primary Use)

DO NOT run scripts - read the auto-generated file:

Read the KPI file:
  docs/specs/001-feature/tasks/TASK-001--kpi.json

2. Understanding the Data

The KPI file contains:

{
  "task_id": "TASK-001",
  "evaluated_at": "2026-01-15T10:30:00Z",
  "overall_score": 8.2,
  "passed_threshold": true,
  "threshold": 7.5,
  "kpi_scores": [
    {
      "category": "Spec Compliance",
      "weight": 30,
      "score": 8.5,
      "weighted_score": 2.55,
      "metrics": {
        "acceptance_criteria_met": 9.0,
        "requirements_coverage": 8.0,
        "no_scope_creep": 8.5
      },
      "evidence": [
        "Acceptance criteria: 9/10 checked",
        "Requirements coverage: 8/10"
      ]
    }
  ],
  "recommendations": [
    "Code Quality: Moderate improvements possible"
  ],
  "summary": "Score: 8.2/10 - PASSED"
}
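A common follow-up to loading this file is finding the weakest category to target first. A minimal sketch, assuming the field names shown in the example above:

```python
import json
from pathlib import Path

def weakest_category(kpi_path: str) -> tuple:
    """Return (category, score) for the lowest-scoring KPI category."""
    kpi = json.loads(Path(kpi_path).read_text())
    worst = min(kpi["kpi_scores"], key=lambda s: s["score"])
    return worst["category"], worst["score"]
```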

3. Making Decisions

Use overall_score and passed_threshold:

IF passed_threshold == true:
  → Task meets quality standards
  → Approve and proceed

IF passed_threshold == false:
  → Task needs improvement
  → Check recommendations for specific targets
  → Create fix specification
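The decision rule above can be sketched in Python (the function name is hypothetical; `kpi` is the parsed KPI JSON):

```python
def decide(kpi: dict) -> str:
    """Map a parsed KPI file to an approve/fix decision."""
    if kpi["passed_threshold"]:
        return "approve"  # task meets quality standards
    # Otherwise surface the hook's recommendations as concrete fix targets
    targets = kpi.get("recommendations", [])
    return "fix: " + "; ".join(targets)
```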

Integration with Workflow

In Task Review (evaluator-agent)

## Review Process

1. Read KPI file: TASK-XXX--kpi.json
2. Extract overall_score and kpi_scores
3. Read task file to validate
4. Generate evaluation report
5. Decision based on passed_threshold

In agents_loop

import json

# Check that the KPI file exists
# (spec_path and task_id come from the surrounding loop)
kpi_path = spec_path / "tasks" / f"{task_id}--kpi.json"

if kpi_path.exists():
    kpi_data = json.loads(kpi_path.read_text())

    if kpi_data["passed_threshold"]:
        # Quality threshold met
        advance_state("update_done")
    else:
        # Needs more work: turn recommendations into fix targets
        fix_targets = kpi_data["recommendations"]
        create_fix_task(fix_targets)
        advance_state("fix")
else:
    # KPI not generated yet - the task may not be implemented
    log_warning("No KPI data found")

Multi-Iteration Loop

Instead of capping at a fixed number of retries (e.g., 3), iterate until the quality threshold is met:

Iteration 1: Score 6.2 → FAILED → Fix: Improve test coverage
Iteration 2: Score 7.1 → FAILED → Fix: Refactor complex functions  
Iteration 3: Score 7.8 → PASSED → Proceed

Each iteration updates the KPI file automatically on task save.
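The iterate-until-threshold pattern above could be sketched as follows. `read_kpi` and `implement_fixes` are hypothetical helpers standing in for the orchestrator's actual steps:

```python
# Sketch: loop until the KPI threshold is met, with a safety cap.
# read_kpi() returns the parsed KPI JSON (refreshed by the hook on
# each task save); implement_fixes() applies the recommendations.
def quality_loop(read_kpi, implement_fixes, max_iterations=10):
    for i in range(1, max_iterations + 1):
        kpi = read_kpi()
        if kpi["passed_threshold"]:
            return i, kpi["overall_score"]
        implement_fixes(kpi["recommendations"])
    raise RuntimeError("Quality threshold not reached")
```

A `max_iterations` cap is still worth keeping as a guard against tasks that never converge.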

Threshold Guidelines

| Score | Quality Level | Action |
|---|---|---|
| 9.0-10.0 | Exceptional | Approve, document best practices |
| 8.0-8.9 | Good | Approve with minor notes |
| 7.0-7.9 | Acceptable | Approve if it clears the threshold (e.g., 7.5) |
| 6.0-6.9 | Below Standard | Request specific improvements |
| < 6.0 | Poor | Significant rework required |

Recommended Thresholds

| Project Type | Threshold | Rationale |
|---|---|---|
| Production MVP | 8.0 | High quality required |
| Internal Tool | 7.0 | Good enough |
| Prototype | 6.0 | Functional over perfect |
| Critical System | 8.5 | No compromises |

Metric Details

Spec Compliance Metrics

Acceptance Criteria Met

  • Calculates: (checked_criteria / total_criteria) * 10
  • Source: Task file checkbox count
  • Example: 9/10 checked = 9.0

Requirements Coverage

  • Calculates: Count of REQ-IDs this task covers
  • Source: traceability-matrix.md
  • Example: 4 requirements covered = 8.0

No Scope Creep

  • Calculates: (implemented_files / expected_files) * 10
  • Source: Task "Files to Create" vs actual files
  • Penalizes: Missing files or unexpected additions
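The two explicit formulas above can be sketched directly (the requirements-coverage mapping is not fully specified here, so it is omitted; the real analyzer may also penalize unexpected extra files differently):

```python
def acceptance_criteria_score(checked: int, total: int) -> float:
    """(checked_criteria / total_criteria) * 10, from the task's checkboxes."""
    return round(checked / total * 10, 1) if total else 0.0

def scope_creep_score(implemented: int, expected: int) -> float:
    """Literal (implemented_files / expected_files) * 10, capped at 10."""
    if expected == 0:
        return 10.0
    return min(10.0, round(implemented / expected * 10, 1))
```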

Code Quality Metrics

Static Analysis

  • Java: Maven Checkstyle
  • TypeScript: ESLint
  • Python: ruff
  • Score: 10 if passes, 5 if issues found

Complexity

  • Calculates: Functions >50 lines
  • Score: 10 - (long_functions_ratio * 5)
  • Penalizes: Large, complex functions
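The complexity formula above, sketched with function lengths as input (an illustration, not the analyzer's actual code):

```python
def complexity_score(function_lengths: list) -> float:
    """10 - (long_functions_ratio * 5), where 'long' means > 50 lines."""
    if not function_lengths:
        return 10.0
    long_ratio = sum(1 for n in function_lengths if n > 50) / len(function_lengths)
    return round(10 - long_ratio * 5, 1)
```

With this formula even a file where every function is long still scores 5.0, so complexity alone never fails a task.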

Patterns Alignment

  • Checks: Knowledge Graph patterns
  • Source: knowledge-graph.json
  • Validates: Implementation follows project patterns

Test Coverage Metrics

Unit Tests Present

  • Calculates: min(10, test_files * 5)
  • 2 test files = maximum score
  • Penalizes: Missing tests

Test/Code Ratio

  • Calculates: (test_count / code_count) * 10
  • 1:1 ratio = 10/10
  • Ideal: At least 1 test file per code file

Coverage Percentage

  • Source: Coverage reports (JaCoCo, lcov, etc.)
  • Calculates: coverage_percent / 10
  • 80% coverage = 8.0

Contract Fulfillment Metrics

Provides Verified

  • Checks: Files exist and export expected symbols
  • Source: Task provides frontmatter
  • Validates: Contract satisfied

Expects Satisfied

  • Checks: Dependencies provide required files/symbols
  • Source: Task expects frontmatter
  • Validates: Prerequisites met
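The file-existence half of these checks can be sketched as below; the symbol-export verification the analyzer performs is omitted here, and the scoring scale is an assumption:

```python
from pathlib import Path

def provides_verified(provided_paths: list) -> float:
    """Score the fraction of 'provides' files that actually exist on disk."""
    if not provided_paths:
        return 10.0
    existing = sum(1 for p in provided_paths if Path(p).exists())
    return round(existing / len(provided_paths) * 10, 1)
```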

When KPI File is Missing

If TASK-XXX--kpi.json doesn't exist:

  1. Task was never modified - Hook runs on file save
  2. Hook failed - Check Claude Code logs
  3. Task is new - Save the file first to trigger hook

DO NOT try to calculate KPIs manually. The hook runs automatically when:

  • Task file is saved (Write tool)
  • Task file is edited (Edit tool)

Best Practices

1. Always Check KPI File Exists

Before evaluating:

Check if KPI file exists:
  docs/specs/[ID]/tasks/TASK-XXX--kpi.json

If missing:
  - Task may not be implemented yet
  - Ask user to save the task file first

2. Trust the Metrics

The KPIs are objective. Only override with documented evidence:

  • Critical security issue not in metrics
  • Logic error not caught by static analysis
  • Exceptional quality not measured

3. Iterate on Low KPIs

Target specific categories:

❌ "Fix code quality issues"
✅ "Improve Code Quality KPI from 5.2 to 7.0:
    - Complexity: Refactor processData() (5→8)
    - Patterns: Add error handling (6→8)"

4. Track KPI Trends

Monitor quality over time:

Sprint 1: Average KPI 6.8
Sprint 2: Average KPI 7.3 (+0.5)
Sprint 3: Average KPI 7.9 (+0.6)

Troubleshooting

KPI File Not Generated

Check:

  1. Hook enabled in hooks.json
  2. Task file name matches pattern TASK-*.md
  3. File was actually saved (not just viewed)

KPI Scores Seem Wrong

Validate:

  1. Check evidence field for data sources
  2. Verify files exist at expected paths
  3. Some metrics need build tools (Maven, npm)

Low Scores Despite Good Code

Possible causes:

  • Missing test files
  • No coverage report generated
  • Acceptance criteria not checked
  • Lint rules too strict

Fix the root cause, not just the score.

Examples

Example 1: Reading KPI Data

Read the KPI file to evaluate task quality:
  docs/specs/001-feature/tasks/TASK-042--kpi.json

Based on the data:
- Overall score: 6.8/10 (below threshold)
- Lowest KPI: Test Coverage (5.0/10)
- Recommendation: Add unit tests

Decision: REQUEST FIXES - target Test Coverage improvement

Example 2: Iteration Decision

Iteration 1 KPI: Score 6.2 → FAILED
- Spec Compliance: 7.0 ✓
- Code Quality: 5.5 ✗
- Test Coverage: 6.0 ✗

Fix targets:
1. Refactor complex functions (Code Quality)
2. Add test coverage (Test Coverage)

Iteration 2 KPI: Score 7.8 → PASSED ✓

Example 3: agents_loop Integration

import json

# In agents_loop, after the implementation step
# (spec_dir and task_id come from the surrounding loop)
kpi_file = spec_dir / "tasks" / f"{task_id}--kpi.json"

if kpi_file.exists():
    kpi = json.loads(kpi_file.read_text())

    if kpi["passed_threshold"]:
        print(f"✅ Task passed quality check: {kpi['overall_score']}/10")
        advance_state("update_done")
    else:
        print(f"❌ Task failed quality check: {kpi['overall_score']}/10")
        print("Recommendations:")
        for rec in kpi["recommendations"]:
            print(f"  - {rec}")
        advance_state("fix")

References

  • evaluator-agent.md - Agent that uses KPI data for evaluation
  • hooks.json - Hook configuration for auto-generation
  • task-kpi-analyzer.py - Hook script (do not execute directly)
  • agents_loop.py - Orchestrator that reads KPI for decisions
