
pantheon-ai/bdd-testing

Write and maintain Behavior-Driven Development tests with Gherkin and Cucumber. Use when defining acceptance scenarios, writing feature files, implementing step definitions, running Three Amigos sessions, or diagnosing BDD test quality issues. Keywords: bdd, gherkin, cucumber, given when then, feature files, step definitions, acceptance criteria, three amigos, example mapping.

Does it follow best practices?

Evaluation: 96%

1.04x agent success when using this tile

Validation for skill structure


Evaluation results

92%

5%

E-Commerce Checkout Feature Development

Feature file creation and business language

| Criteria | Without context | With context |
| --- | --- | --- |
| Business language usage | 93% | 93% |
| Non-developer readability | 93% | 93% |
| Observable behavior focus | 87% | 93% |
| Given/When/Then structure | 80% | 100% |
| Avoids implementation details | 87% | 93% |
| Specific observable outcomes | 90% | 80% |
| Single behavior focus | 80% | 90% |
| Deterministic scenarios | 80% | 90% |

With context: $0.5932 · 6m 50s · 25 turns · 21 in / 6,849 out tokens

98%

3%

CI/CD Integration for BDD Test Suite

Cucumber execution and reporting

| Criteria | Without context | With context |
| --- | --- | --- |
| Basic cucumber execution | 100% | 100% |
| Dry-run validation | 100% | 100% |
| Tag-based filtering | 80% | 100% |
| JSON report generation | 100% | 100% |
| Multiple execution methods | 100% | 100% |
| Proper file paths | 100% | 80% |
| Exit code awareness | 80% | 100% |
| Report file organization | 100% | 100% |

With context: $0.5143 · 2m 50s · 23 turns · 369 in / 6,046 out tokens

91%

4%

Step Definition Implementation for Order Processing

Step definition implementation

| Criteria | Without context | With context |
| --- | --- | --- |
| Business language mapping | 100% | 100% |
| Avoids implementation details | 90% | 90% |
| Maintains scenario meaning | 93% | 93% |
| Proper Cucumber syntax | 100% | 100% |
| Step reusability | 73% | 80% |
| Abstraction layer | 67% | 87% |
| No brittle dependencies | 90% | 90% |

With context: $0.8657 · 4m · 35 turns · 2,250 in / 10,436 out tokens

100%

13%

User Account Management Test Suite

Scenario independence and structure

| Criteria | Without context | With context |
| --- | --- | --- |
| Given/When/Then structure | 93% | 100% |
| Scenario independence | 85% | 100% |
| No cross-scenario coupling | 90% | 100% |
| Deterministic execution | 80% | 100% |
| Self-contained setup | 87% | 100% |
| Single behavior focus | 80% | 100% |
| Cleanup consideration | 100% | 100% |

With context: $0.6997 · 4m 35s · 31 turns · 376 in / 8,562 out tokens

98%

6%

Payment Processing Feature Specification

Three Amigos process and anti-patterns

| Criteria | Without context | With context |
| --- | --- | --- |
| Three Amigos collaboration | 90% | 100% |
| Avoids vague outcomes | 95% | 95% |
| Observable behavior specification | 87% | 93% |
| Stakeholder alignment evidence | 93% | 100% |
| Specific examples over abstractions | 93% | 100% |
| Anti-pattern avoidance | 90% | 100% |
| Business language consistency | 100% | 100% |

With context: $0.7398 · 3m 29s · 28 turns · 133 in / 9,626 out tokens

100%

Test Strategy Decision for Microservice Architecture

BDD scope boundaries and when not to use BDD

| Criteria | Without context | With context |
| --- | --- | --- |
| Identifies internal implementation exclusions | 100% | 100% |
| Stakeholder-facing behavior identification | 100% | 100% |
| Clear scope boundaries | 100% | 100% |
| Alternative testing recommendations | 100% | 100% |
| Business readability principle | 100% | 100% |
| Implementation detail avoidance | 100% | 100% |

With context: $0.6212 · 3m 15s · 24 turns · 1,319 in / 7,249 out tokens

94%

1%

Legacy BDD Test Suite Refactoring

BDD test maintenance and refactoring workflows

| Criteria | Without context | With context |
| --- | --- | --- |
| Duplication identification | 93% | 93% |
| Scenario consolidation strategy | 95% | 95% |
| Step definition refactoring | 90% | 95% |
| Maintains scenario readability | 93% | 93% |
| Systematic refactoring approach | 93% | 93% |
| Preserves test intent | 90% | 90% |
| Reusable step library design | 100% | 100% |

With context: $1.0099 · 5m 6s · 28 turns · 5,548 in / 12,527 out tokens

97%

Enterprise CI/CD Pipeline BDD Integration

Advanced Cucumber workflow and CI integration

| Criteria | Without context | With context |
| --- | --- | --- |
| Multiple output formats | 100% | 100% |
| Advanced tag strategies | 100% | 100% |
| CI/CD pipeline integration | 100% | 100% |
| Failure analysis workflow | 100% | 100% |
| Performance and timing considerations | 100% | 100% |
| Report organization and usage | 90% | 100% |
| Configuration management | 90% | 70% |
| Exit code handling | 80% | 100% |

With context: $0.9504 · 5m 12s · 34 turns · 403 in / 13,324 out tokens

Install with Tessl CLI

npx tessl i pantheon-ai/bdd-testing
Evaluated agent: Claude Code
