CtrlK
BlogDocsLog inGet started
Tessl Logo

igmarin/rails-agent-skills

Curated library of 28 public AI agent skills for Ruby on Rails development. Organized by category: testing, code-quality, engines, infrastructure, api, and context. Covers code review, architecture, security, testing (RSpec), engines, Hotwire, and TDD automation. Shared Ruby skills (YARD docs, DDD, service objects) have moved to ruby-core-skills. Repository agents remain documented in GitHub but are intentionally excluded from the Tessl tile.

93

1.78x
Quality

95%

Does it follow best practices?

Impact

93%

1.78x

Average score across 28 eval scenarios

SecuritybySnyk

Passed

No known issues

Overview
Quality
Evals
Security
Files

SKILL.mdagents/background-job/

name:
background-job
license:
MIT
description:
Orchestrates robust background job implementation: design job → TDD implementation → configure retry/discard strategies → test failure scenarios → production monitoring. Use when adding async processing, implementing background jobs, or configuring job queues. Trigger: background job, async processing, sidekiq, solid queue, active job, job queue, worker.
metadata:
{"version":"1.0.0","user-invocable":"true","entry_point":"Invoke when implementing background jobs with proper retry/discard strategies and monitoring","phases":"Phase 1: Job Design, Phase 2: TDD Implementation, Phase 3: Retry/Discard Configuration, Phase 4: Testing & Monitoring","hard_gates":"Job Design Complete, Tests Pass, Retry Strategy Configured, Failure Scenarios Tested","dependencies":[{"source":"self","skills":["implement-background-job","write-tests"]},{"source":"ruby-core-skills","skills":["tdd-process"]}],"keywords":"rails, background-job, async, sidekiq, solid-queue, active-job, retry, monitoring"}

Background Job Agent

Orchestrates robust background job implementation with TDD discipline, proper retry/discard strategies, comprehensive failure scenario testing, and production monitoring to ensure reliable async processing.


Phase 1: Job Design

Objective: Define job responsibilities, idempotency strategy, and error classification before writing code.

Steps:

  1. Job Purpose — Define trigger conditions, input parameters, expected output/side effects, and criticality.
  2. Idempotency — Design job to be safely re-runnable: use unique job keys, status checks, or sentinel timestamps.
  3. Error Classification — Classify all anticipated errors:
    • Transient (network timeouts, rate limits) → retry
    • Permanent (invalid data, record not found) → discard
    • Configuration (missing credentials) → alert
  4. Queue & Timeout — Assign queue priority and set execution timeout.

HARD GATE — Job Design Complete:

  • Purpose, trigger, input/output defined
  • Idempotency strategy specified
  • All errors classified as transient/permanent
  • Queue and timeout values chosen

If gate fails: Clarify requirements before implementation.


Phase 2: TDD Implementation

Objective: Implement job logic under TDD discipline.

Steps:

  1. Choose unit vs. integration test approach.
  2. Write failing tests covering: successful execution, idempotency (run twice = same result), transient error raises, permanent error discards.
  3. Confirm tests FAIL for the right reason (job not yet implemented).
  4. Propose implementation approach and wait for explicit user approval.
  5. Implement job; confirm tests PASS.
  6. Run full test suite — confirm no regressions.

HARD GATE — Tests Pass:

  • Tests exist and run
  • Tests failed before implementation
  • All tests pass after implementation
  • Full suite green

Example job test skeleton:

# spec/jobs/order_confirmation_email_job_spec.rb
RSpec.describe OrderConfirmationEmailJob do
  let(:order) { create(:order, :completed) }

  it 'sends confirmation email' do
    expect(EmailService).to receive(:send_confirmation).with(order.id, order.customer_email, order.total)
    described_class.perform_now(order.id, order.customer_email, order.total)
  end

  it 'is idempotent' do
    expect(EmailService).to receive(:send_confirmation).once
    2.times { described_class.perform_now(order.id, order.customer_email, order.total) }
  end

  it 'raises on transient errors so retry triggers' do
    allow(EmailService).to receive(:send_confirmation).and_raise(EmailService::TimeoutError)
    expect { described_class.perform_now(order.id, order.customer_email, order.total) }.to raise_error(EmailService::TimeoutError)
  end
end

Example job implementation skeleton:

# app/jobs/order_confirmation_email_job.rb
class OrderConfirmationEmailJob < ApplicationJob
  queue_as :default

  retry_on  EmailService::TimeoutError,    wait: :exponentially_longer, attempts: 5
  retry_on  EmailService::RateLimitError,  wait: :exponentially_longer, attempts: 3
  discard_on ActiveRecord::RecordNotFound
  discard_on EmailService::InvalidEmailError

  def perform(order_id, customer_email, order_total)
    order = Order.find(order_id)
    return if order.email_sent_at.present?   # idempotency guard

    EmailService.send_confirmation(order_id, customer_email, order_total)
    order.update!(email_sent_at: Time.current)
  rescue EmailService::TimeoutError, EmailService::RateLimitError => e
    Rails.logger.error("[#{self.class}] transient error: #{e.message}")
    raise
  end
end

Note: discard_on handles permanent errors at the framework level — no rescue block is needed for them. The rescue block above covers only transient errors that need logging before being re-raised to trigger retry.


Phase 3: Retry/Discard Configuration

Objective: Harden job for production with correct retry backoff, discard rules, timeouts, and monitoring hooks.

Steps:

  1. Choose backend (Solid Queue for Rails 8+, Sidekiq for high scale) and configure worker concurrency.
  2. Apply retry_on with exponential backoff and a capped attempt count (3–5) for every transient error class.
  3. Apply discard_on for every permanent error class; log discards.
  4. Set job execution timeout and queue timeout at the worker/config level.
  5. Wire error tracking (e.g., Sentry) and metrics (e.g., StatsD/Datadog) in ApplicationJob callbacks.

Solid Queue (Rails 8+) snippet:

# config/initializers/solid_queue.rb
SolidQueue.configure { |c| c.worker = { processes: 2, threads: 5, polling_interval: 1 } }

Sidekiq snippet:

# config/initializers/sidekiq.rb
Sidekiq.configure_server { |c| c.redis = { url: ENV['REDIS_URL'] } }

Monitoring hook in ApplicationJob:

class ApplicationJob < ActiveJob::Base
  around_perform do |job, block|
    start = Time.current
    block.call
    StatsD.timing("jobs.#{job.class.name.underscore}.duration", Time.current - start)
    StatsD.increment("jobs.#{job.class.name.underscore}.success")
  rescue StandardError
    StatsD.increment("jobs.#{job.class.name.underscore}.failure")
    raise
  end
end

HARD GATE — Retry Strategy Configured:

  • retry_on declared for every transient error with backoff and attempt cap
  • discard_on declared for every permanent error with logging
  • Timeouts configured at job and worker level
  • Metrics/alerting wired

If gate fails: Job is not production-ready.


Phase 4: Failure Scenario Testing & Monitoring

Objective: Verify retry/discard behaviour under injected failures and confirm observability.

Steps:

  1. Inject transient errors → assert job raises (triggering retry logic).
  2. Inject permanent errors → assert job does not raise and error is logged.
  3. Confirm timeout handling (stub slow operations).
  4. Verify metrics increment on success and failure paths.
  5. Confirm queue-depth alerts fire when queue backs up.

Example failure scenario tests:

RSpec.describe OrderConfirmationEmailJob do
  let(:order) { create(:order, :completed) }

  it 'logs and re-raises on transient error' do
    allow(EmailService).to receive(:send_confirmation).and_raise(EmailService::TimeoutError)
    expect(Rails.logger).to receive(:error).with(/transient error/)
    expect { described_class.perform_now(order.id, order.customer_email, order.total) }
      .to raise_error(EmailService::TimeoutError)
  end

  it 'discards silently on permanent error' do
    allow(EmailService).to receive(:send_confirmation).and_raise(EmailService::InvalidEmailError)
    expect { described_class.perform_now(order.id, "bad", order.total) }.not_to raise_error
  end
end

HARD GATE — Failure Scenarios Tested:

  • Retry path tested (raises on transient error)
  • Discard path tested (no raise on permanent error)
  • Error logging assertions pass
  • Metrics verified on success and failure
  • Performance acceptable under expected load

If gate fails: Address failure scenarios before deploying.


HARD GATE: Production Readiness

Never deploy a background job without:

  • Idempotency guard implemented and tested
  • All transient errors covered by retry_on with backoff
  • All permanent errors covered by discard_on with logging
  • Failure scenario tests passing
  • Metrics and error-tracking wired
  • Timeouts configured

Error Recovery

Job fails repeatedly in production:

  1. Check retry patterns and error rates in monitoring.
  2. Review logs for error class and stack trace.
  3. Classify error (transient vs. permanent) and adjust retry_on/discard_on if mis-classified.
  4. Fix root cause; redeploy.

Queue backs up:

  1. Scale worker processes/threads.
  2. Promote critical jobs to a higher-priority queue.
  3. Optimise job execution time or batch size.

Anti-Patterns to Avoid

  • Non-idempotent jobs — always guard against duplicate execution.
  • Missing retry/discard — never deploy without both strategies configured.
  • Silent failures — always log and track errors.
  • Unbounded retries — cap attempts (3–5 is typical).
  • Blocking operations — keep jobs short; offload slow I/O.
  • No monitoring — wire metrics before going to production.

agents

background-job

SKILL.md

README.md

tile.json