Estimates implementation time for web development tasks (frontend and/or backend) by analyzing the existing codebase and calibrating for an AI coding agent as executor — not a human developer. Use when the user asks about effort, sizing, or feasibility: 'how long', 'how much work', 'estimate this', 'what is the effort', 'breakdown this task', 'can we do this in X days', 'is this a big task', 'how complex is', 'what's involved in', 'fits in the sprint', 'rough sizing', 't-shirt size', 'story points'. Also use when the user describes a feature and implicitly wants to know scope — e.g. 'we need to add X to the app', 'thinking about building Y', 'is this feasible by Friday'. Supports batch estimation from any structured source (BMAD output, spec folders, PRDs, backlogs, task lists) — use when the user mentions 'estimate the stories', 'estimate the epic', 'scan the backlog', 'estimate all tasks', 'estimate the specs', or points to a folder of task/story/spec files.
Score: 95
Does it follow best practices? 94%
Impact: 98%
Average score across 5 eval scenarios: 1.40x
Status: Passed, no known issues
{
"context": "Tests whether the agent uses the batch estimation workflow with a consolidated summary table, estimates each task individually with per-task sizing, identifies cross-task dependencies, suggests an implementation order, and avoids redundant codebase analysis across tasks.",
"type": "weighted_checklist",
"checklist": [
{
"name": "Consolidated summary table",
"description": "Output includes a single table or matrix comparing all 5 features side-by-side with at least task name, time estimate, and size for each",
"max_score": 12
},
{
"name": "All five features estimated",
"description": "Every one of the 5 features has an individual time estimate (not just a lump total)",
"max_score": 10
},
{
"name": "Per-task T-shirt size",
"description": "Each feature is assigned its own T-shirt size (XS/S/M/L/XL)",
"max_score": 8
},
{
"name": "Grand total provided",
"description": "Output includes a total time range summing across all 5 features",
"max_score": 8
},
{
"name": "Implementation order suggested",
"description": "Output recommends a sequencing or priority order for the 5 features with reasoning",
"max_score": 10
},
{
"name": "Cross-task dependencies noted",
"description": "Output identifies at least one dependency between features (e.g. roles needed before invitations, or activity feed depending on event model)",
"max_score": 10
},
{
"name": "Dark mode sized smallest",
"description": "The dark mode feature receives the smallest estimate (XS or S) given strong existing support from Tailwind and shadcn/ui",
"max_score": 8
},
{
"name": "Permissions sized largest",
"description": "The role-based permissions feature receives one of the larger estimates (M, L, or XL) reflecting its cross-cutting, security-critical nature",
"max_score": 8
},
{
"name": "Time ranges not points",
"description": "All time values are expressed as ranges, not single point estimates",
"max_score": 8
},
{
"name": "Stack detected as TypeScript",
"description": "Output identifies the stack as TypeScript/Next.js",
"max_score": 5
},
{
"name": "Per-task risk or confidence",
"description": "Each feature has either a risk note or confidence level indicating relative uncertainty",
"max_score": 5
},
{
"name": "Shared assumptions listed",
"description": "Output includes assumptions that apply across multiple or all features",
"max_score": 8
}
]
}
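Assuming a straightforward scoring harness (the actual eval runner is not shown here), a `weighted_checklist` like the one above can be scored by summing the points awarded per item, capping each at its `max_score`, and normalizing by the checklist's total. A minimal sketch, where the item names and awarded values are illustrative:

```python
def score_weighted_checklist(checklist, awarded):
    """Sum awarded points per item (capped at max_score), normalize to a percentage."""
    total_max = sum(item["max_score"] for item in checklist)
    total = sum(
        min(awarded.get(item["name"], 0), item["max_score"])
        for item in checklist
    )
    return round(100 * total / total_max, 1)

# Hypothetical grading of two of the twelve items above
checklist = [
    {"name": "Consolidated summary table", "max_score": 12},
    {"name": "All five features estimated", "max_score": 10},
]
awarded = {"Consolidated summary table": 12, "All five features estimated": 5}
print(score_weighted_checklist(checklist, awarded))  # 77.3
```

The twelve `max_score` values in the config sum to 100, so with the full checklist the raw point total and the percentage coincide.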