Curated library of 39 AI agent skills for Ruby on Rails development. Organized by category: planning, testing, code-quality, ddd, engines, infrastructure, api, patterns, context, orchestration, and workflows. Includes 5 callable workflow skills (rails-tdd-loop, rails-review-flow, rails-setup-flow, rails-quality-flow, rails-engines-flow) for complete development cycles. Covers code review, architecture, security, testing (RSpec), engines, service objects, DDD patterns, and TDD automation.
A systematic process for improving tessl evaluation scores across all skills in this library.
This guide provides a repeatable workflow for diagnosing and fixing skill evaluation failures. It was developed while optimizing rails-engine-release and is designed to be applicable to any skill in the library.
Full-library eval on claude-sonnet-4-6 (32 scenarios, tile igmarin/rails-agent-skills):
Per the library's eval strategy: a skill that only beats baseline marginally is under-specified. The gap between baseline and with-context is the signal that skills are earning their token cost. Chasing the baseline up means codifying what the model already knows — bloat without signal.
What this means in practice:
| Scenario | Skill | Baseline | With ctx | Lift |
|---|---|---|---|---|
| S32 | ticket-planning | 30% | 100% | +70 |
| S8 | ruby-api-client-integration | 40% | 100% | +60 |
| S24 | generate-tasks | 43% | 100% | +57 |
| S13 | ruby-api-client-integration | 45% | 100% | +55 |
| S4 | refactor-safely | 60% | 100% | +40 |
| S14 | create-prd | 62% | 100% | +38 |
| S10 | rails-code-conventions (logging + backtrace) | 65% | 100% | +35 |
| S3 | ruby-service-objects | 71% | 100% | +29 |
| S12 | rails-graphql-best-practices | 71% | 100% | +27 |
| S27 | yard-documentation (inline tagged notes) | 76% | 100% | +24 |
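The Lift column above is the with-context score minus the baseline score, in percentage points. A minimal Python sketch of that arithmetic follows. The `ScenarioResult` type, the `min_lift` cutoff of 10 points, and the `S99` row are assumptions made for illustration and are not part of the library or the tessl CLI.

```python
from typing import NamedTuple


class ScenarioResult(NamedTuple):
    scenario: str
    skill: str
    baseline: int      # baseline score, in percent
    with_context: int  # with-context score, in percent


def lift(result: ScenarioResult) -> int:
    """Lift in percentage points: with-context score minus baseline score."""
    return result.with_context - result.baseline


def marginal(results, min_lift=10):
    """Return results whose lift is below min_lift (an assumed cutoff)."""
    return [r for r in results if lift(r) < min_lift]


# Three rows copied from the table above, plus one hypothetical row (S99).
results = [
    ScenarioResult("S32", "ticket-planning", 30, 100),
    ScenarioResult("S8", "ruby-api-client-integration", 40, 100),
    ScenarioResult("S27", "yard-documentation", 76, 100),
    ScenarioResult("S99", "hypothetical-skill", 92, 95),
]

# Largest lift first, mirroring the ordering of the table.
for r in sorted(results, key=lift, reverse=True):
    print(f"{r.scenario:>4}  {r.skill:<30} {lift(r):+d}")

print("marginal:", [r.scenario for r in marginal(results)])
```

Under these assumptions, only the hypothetical `S99` row would be flagged as a marginal win over baseline.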
- tessl CLI with eval capabilities

Before running scenario evaluations, check the skill's intrinsic quality score:

    tessl skill review --optimize rails-engine-release

This analyzes the skill file itself for:
Example improvement:
This step catches skill-level issues before running expensive scenario evaluations.
Run the evaluation and identify failing criteria:

    tessl eval view --last

Document the scores for both scenarios:
Compare the two scenarios to understand the problem type:
| Pattern | Meaning | Solution Approach |
|---|---|---|
| Baseline low, With-context high | Skill provides necessary guidance | Good - skill is valuable |
| Baseline high, With-context low | Context dilution or conflicting signals | Fix: Reduce noise, strengthen signal |
| Both low | Missing or incorrect instructions | Fix: Add explicit requirements |
| Both high | Skill is well-optimized | Document as reference |
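The four patterns in the table can be read as a two-by-two decision on whether each score counts as high. The sketch below encodes that reading in Python. The `high` threshold of 0.8 is an assumption, since the guide does not define where "low" ends and "high" begins.

```python
def classify(baseline: float, with_context: float, high: float = 0.8) -> str:
    """Map a (baseline, with-context) score pair to a pattern from the table.

    Scores are fractions in [0, 1]. The `high` threshold is an assumed value.
    """
    base_high = baseline >= high
    ctx_high = with_context >= high
    if not base_high and ctx_high:
        return "skill provides necessary guidance"
    if base_high and not ctx_high:
        return "context dilution or conflicting signals"
    if not base_high and not ctx_high:
        return "missing or incorrect instructions"
    return "skill is well-optimized"


# Example: a 43% baseline with a 100% with-context score.
print(classify(0.43, 1.00))
```

Choosing a different threshold shifts the boundary between the rows, so it is worth fixing one value before comparing skills.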
For each failing criterion, determine the root cause:
Symptom: Agent doesn't mention required element in output

Fix: Add explicit requirement to Output Style section
Example from rails-engine-release:
## Output Style
When asked to prepare a release, your output MUST include:
5. **Gemspec verification** — Explicitly state that gemspec metadata
(authors, description, files, dependencies) was verified
6. **Test suite status** — Confirm the full test suite passes
(`bundle exec rspec`) before proceeding

Symptom: Scores drop when more files are present

Fix: Implement progressive disclosure
## Extended Resources (Progressive Disclosure)
Load these files only when their specific content is needed:
- **[assets/checklist.md](assets/checklist.md)** — Use when you need
detailed verification steps
- **[references/advanced.md](references/advanced.md)** — Use for edge cases
or complex scenarios

Symptom: Important instruction exists but agent doesn't prioritize it

Fix: Move to prominent position with explicit imperative language
Priority order for maximum score improvement:
1. Output Style section (highest impact for baseline scores)
2. Progressive disclosure (highest impact for with-context scores)
   - `assets/`, `references/`
3. Quick Reference table (medium impact for both)
4. HARD-GATE section (if applicable)
Run the evaluation again:

    tessl eval run --skill rails-engine-release

Verify both scenarios achieve 100% or an acceptable threshold.
Iterate if needed: return to Step 1 with new scores.
Problem: Agent doesn't mention verifying gemspec metadata
Solution: Add to Output Style:
5. **Gemspec verification** — Explicitly state that gemspec metadata
(authors, description, files, dependencies) was verified against
tested Rails/Ruby versions

Problem: Agent doesn't confirm test suite passes
Solution: Add to Output Style:
6. **Test suite status** — Confirm the full test suite passes
(`bundle exec rspec`) before proceeding to build

Problem: Agent doesn't produce upgrade instructions for host apps
Solution:
- `assets/upgrade_template.md` for reference

Problem: Agent doesn't mention blockers or state "no blockers"
Solution: Add to Output Style:
7. **Release blockers** — Call out any open issues preventing release,
or explicitly state "No blockers"

| Check | Baseline | With Context |
|---|---|---|
| Gemspec verified | 2/8 (25%) | 5/8 (63%) |
| Test suite mentioned | 7/8 (88%) | 8/8 (100%) |
| Upgrade notes produced | 3/10 (30%) | 10/10 (100%) |
| Total | 86/100 | 97/100 |
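Each cell in the table above reports a `passed/total` count for one check. A small Python sketch can turn those cells into pass rates and order the checks so that the weakest with-context result is looked at first. The helper names `pass_rate` and `weakest_first` are hypothetical, and the rows are copied from the table rather than read from any tessl output.

```python
import re

# Matches the "passed/total" part of a cell such as "5/8 (63%)".
CELL = re.compile(r"(\d+)/(\d+)")


def pass_rate(cell: str) -> float:
    """Return passed/total as a fraction for a cell like '5/8 (63%)'."""
    match = CELL.search(cell)
    if match is None:
        raise ValueError(f"no passed/total count in {cell!r}")
    passed, total = int(match.group(1)), int(match.group(2))
    return passed / total


def weakest_first(rows: dict) -> list:
    """Order check names by with-context pass rate, lowest first."""
    return sorted(rows, key=lambda name: pass_rate(rows[name][1]))


# (baseline, with-context) cells copied from the table above.
rows = {
    "Gemspec verified": ("2/8 (25%)", "5/8 (63%)"),
    "Test suite mentioned": ("7/8 (88%)", "8/8 (100%)"),
    "Upgrade notes produced": ("3/10 (30%)", "10/10 (100%)"),
}

for name in weakest_first(rows):
    baseline, with_context = rows[name]
    print(f"{name:<24} {pass_rate(baseline):.3f} -> {pass_rate(with_context):.3f}")
```

With these rows, the gemspec check sorts first because it is the only one still below a full with-context pass rate.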
- `assets/examples.md` had circular references and wrong paths
- `assets/examples.md` with clear relative paths and loading instructions

| Check | Baseline | With Context |
|---|---|---|
| Feature branch task 0.0 | 0/10 (0%) | 10/10 (100%) |
| TDD write-spec sub-task | 3/10 (30%) | 10/10 (100%) |
| TDD run-spec-fail sub-task | 0/10 (0%) | 10/10 (100%) |
| TDD run-spec-pass sub-task | 0/8 (0%) | 8/8 (100%) |
| YARD post-implementation gate | 0/10 (0%) | 10/10 (100%) |
| Relevant Files section | 0/8 (0%) | 8/8 (100%) |
| Total | 43/100 | 100/100 |
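Wrong relative paths of the kind noted for `assets/examples.md` can be caught before an eval run. The Python sketch below lists Markdown links in a skill file whose targets do not exist on disk. It assumes links use the `[text](relative/path)` form shown in the Extended Resources examples and that paths resolve relative to the skill file's own directory. The function name `broken_links` is hypothetical.

```python
from __future__ import annotations

import re
from pathlib import Path

# Captures the target of a Markdown link: [text](target)
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_links(skill_file: Path) -> list[str]:
    """Return relative link targets in skill_file that do not exist on disk."""
    text = skill_file.read_text(encoding="utf-8")
    base = skill_file.parent
    missing = []
    for target in LINK.findall(text):
        # Skip external URLs and in-page anchors; only local files are checked.
        if "://" in target or target.startswith("#"):
            continue
        path = target.split("#", 1)[0]  # drop any trailing #fragment
        if not (base / path).exists():
            missing.append(target)
    return missing
```

Calling `broken_links(Path("SKILL.md"))` from a skill's directory would return an empty list when every referenced file is present. This check does not detect circular references between files.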
Added Output Style section with 7 explicit requirements mapping directly to evaluation criteria:
Rewrote frontmatter description to include explicit trigger words:
When you see this pattern, first determine if it's a signal problem or a training knowledge gap:
Signal Problem (fixable):
Training Knowledge Gap (inherent limitation):
For generate-tasks:
Use this structure to maximize eval scores from the start. The same six sections are codified in skill-structure.md — that doc is the canonical SKILL.md shape the reinforcement pass enforces; this template is the fillable version.
---
name: skill-name
description: >
Use when [concrete trigger]. Covers [specific topics].
Trigger words: [symptom], [tool], [scenario].
---
# Skill Title
Use this skill when [brief trigger description].
## Quick Reference
| Aspect | Rule |
|--------|------|
| [Key point] | [Specific rule] |
| [Key point] | [Specific rule] |
## HARD-GATE (if applicable)

DO NOT [forbidden action]. ALWAYS [required action].
## Core Process
1. [Step with specific command or pattern]
2. [Step with specific command or pattern]
3. [Step with specific command or pattern]
## Extended Resources (Progressive Disclosure)
Load these files only when needed:
- **[assets/template.md](assets/template.md)** — Use when [specific condition]
- **[references/advanced.md](references/advanced.md)** — Use when [specific condition]
## Output Style
When asked to [task], your output MUST include:
1. **[Criterion name]** — [Explicit requirement]
2. **[Criterion name]** — [Explicit requirement]
3. **[Criterion name]** — [Explicit requirement]
## Examples
[Minimal working example]
## Integration
| Skill | When to chain |
|-------|---------------|
| [skill-name] | When [condition] |

Use this guide as the starting point for fixing any skill in the library:

    # 1. Check skill quality first
    tessl skill review --optimize <skill-name>
    # 2. Run scenario evaluations
    tessl eval run . --variant without-context --variant with-context
    # 3. View specific failures
    tessl eval view --last
    # 4. Follow this guide's workflow (Step 0 → Step 5)

For each skill below 100%:

- `tessl skill review --optimize`
- `tessl eval view --last`

Priority order for optimization:
Document new patterns discovered in the Maintainer Notes section.
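When bringing a skill in line with the template, a quick structural check can confirm that the expected `##` sections are present. The Python sketch below assumes the six required sections are the non-optional headings in the template (everything except HARD-GATE). That list is a reading of this guide, and skill-structure.md remains the canonical source. The name `missing_sections` is hypothetical.

```python
from __future__ import annotations

import re

# Assumed required sections, taken from the template's non-optional headings.
REQUIRED_SECTIONS = [
    "Quick Reference",
    "Core Process",
    "Extended Resources",
    "Output Style",
    "Examples",
    "Integration",
]

# Matches level-2 headings only, e.g. "## Output Style".
HEADING = re.compile(r"^##\s+(.+?)\s*$", re.MULTILINE)


def missing_sections(skill_markdown: str) -> list[str]:
    """Return the required section names with no matching '## ' heading."""
    headings = HEADING.findall(skill_markdown)
    return [
        section
        for section in REQUIRED_SECTIONS
        if not any(heading.startswith(section) for heading in headings)
    ]


sample = """# Skill Title
## Quick Reference
## Core Process
## Extended Resources (Progressive Disclosure)
## Examples
"""

print(missing_sections(sample))
```

Prefix matching is used so that a heading such as "Extended Resources (Progressive Disclosure)" still counts as the "Extended Resources" section.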
Maintainer Notes: