Write and maintain Behavior-Driven Development tests with Gherkin and Cucumber. Use when defining acceptance scenarios, writing feature files, implementing step definitions, running Three Amigos sessions, or diagnosing BDD test quality issues. Keywords: bdd, gherkin, cucumber, given when then, feature files, step definitions, acceptance criteria, three amigos, example mapping.
Use this skill when behavior needs to be specified and validated in shared, business-readable scenarios.
Do not use BDD feature files for low-level unit behavior that is internal and not stakeholder-facing.
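As a sketch of the kind of stakeholder-facing behavior this skill targets, a minimal feature file might look like the following (the checkout domain, feature name, and step wording are illustrative, not part of the skill):

```gherkin
# Hypothetical example; feature and step names are illustrative.
Feature: Order checkout
  As a shopper, I want my order confirmed after checkout

  Scenario: Successful checkout
    Given my cart contains 1 item
    When I submit the order
    Then I should see "Order confirmed"
```

Each scenario reads as plain business language: the Given establishes context, the When describes one action, and the Then states an observable outcome.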
Use npx cucumber-js --dry-run to identify missing step definitions before full execution.

npx cucumber-js features
Expected result: scenario pass/fail summary and a non-zero exit code on failures.
npx cucumber-js --dry-run features
Expected result: undefined steps listed without full execution.
npx cucumber-js --tags "@smoke and not @wip"
Expected result: only matching scenarios execute.
npx cucumber-js --format json:reports/cucumber-report.json
Expected result: machine-readable execution report for CI/reporting.
Step definition mismatch:
npx cucumber-js --dry-run features
Expected result: lists undefined steps that need implementation or have mismatched patterns.
Async timing issues:
Check that step definitions return or await promises. Steps that don't wait for async operations will complete before actions finish.
Pattern: Ensure async functions use await for all asynchronous calls.
sh skills/skill-quality-auditor/scripts/evaluate.sh bdd-testing --json
Expected result: updated quality dimensions and grade.
WHY: Implementation-centric steps break when internals change.
BAD: When I click the submit button and call validateForm()
GOOD: When I submit the form
Consequence: Scenarios become brittle and unreadable to stakeholders.
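The contrast can be sketched in Gherkin (the scenario names and step wording are illustrative):

```gherkin
# BAD: implementation-centric, breaks when the UI changes
Scenario: Submit form (imperative)
  When I click the "#submit" button

# GOOD: declarative, describes intent
Scenario: Submit form (declarative)
  When I submit the form
  Then I should see "Order confirmed"
```

The declarative step survives a redesign of the form because the how lives in the step definition, not the feature file.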
WHY: Missing perspectives create ambiguous or incomplete acceptance behavior.
BAD: Engineering writes scenarios alone from assumptions.
GOOD: Product, QA, and engineering align on examples first.
Consequence: Rework increases and acceptance disputes appear late.
WHY: Unverifiable outcomes cannot fail meaningfully.
BAD: Then it should work correctly
GOOD: Then I should see "Order confirmed"
Consequence: Tests pass without validating user-visible behavior.
WHY: Scenario order dependence creates flaky suites.
BAD: Scenario B assumes data created by scenario A.
GOOD: Each scenario creates or mocks its own prerequisites.
Consequence: Parallel runs and CI become unstable.
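Independence can be sketched in Gherkin by having every scenario provision its own data in its Given steps (the order identifiers below are illustrative):

```gherkin
# Each scenario provisions its own data; no cross-scenario ordering.
Feature: Order history

  Scenario: View an existing order
    Given an order "A-100" exists for my account
    Then I should see "A-100" in my order history

  Scenario: Cancel an order
    Given an order "B-200" exists for my account
    When I cancel order "B-200"
    Then I should see "Order cancelled"
```

Because neither scenario reads state the other wrote, they can run in any order or in parallel.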
references/principles-core-philosophy.md
references/gherkin-syntax.md
references/principles-three-amigos.md
references/principles-example-mapping.md
references/cucumber-setup.md

Install with Tessl CLI
npx tessl i pantheon-ai/bdd-testing