igmarin/rails-agent-skills

Curated library of 28 public AI agent skills for Ruby on Rails development. Organized by category: testing, code-quality, engines, infrastructure, api, and context. Covers code review, architecture, security, testing (RSpec), engines, Hotwire, and TDD automation. Shared Ruby skills (YARD docs, DDD, service objects) have moved to ruby-core-skills. Repository agents remain documented in GitHub but are intentionally excluded from the Tessl tile.

name:: background-job
license:: MIT
description:: Orchestrates robust background job implementation: design job → TDD implementation → configure retry/discard strategies → test failure scenarios → production monitoring. Use when adding async processing, implementing background jobs, or configuring job queues. Trigger: background job, async processing, sidekiq, solid queue, active job, job queue, worker.
metadata:: {"version":"1.0.0","user-invocable":"true","entry_point":"Invoke when implementing background jobs with proper retry/discard strategies and monitoring","phases":"Phase 1: Job Design, Phase 2: TDD Implementation, Phase 3: Retry/Discard Configuration, Phase 4: Testing & Monitoring","hard_gates":"Job Design Complete, Tests Pass, Retry Strategy Configured, Failure Scenarios Tested","dependencies":[{"source":"self","skills":["implement-background-job","write-tests"]},{"source":"ruby-core-skills","skills":["tdd-process"]}],"keywords":"rails, background-job, async, sidekiq, solid-queue, active-job, retry, monitoring"}

Background Job Agent

Name: igmarin/rails-agent-skills
Rating: 93.21 (1 reviews)
Author: igmarin

Orchestrates robust background job implementation with TDD discipline, proper retry/discard strategies, comprehensive failure scenario testing, and production monitoring to ensure reliable async processing.

Phase 1: Job Design

Objective: Define job responsibilities, idempotency strategy, and error classification before writing code.

Steps:

Job Purpose — Define trigger conditions, input parameters, expected output/side effects, and criticality.
Idempotency — Design job to be safely re-runnable: use unique job keys, status checks, or sentinel timestamps.
Error Classification — Classify all anticipated errors:
- Transient (network timeouts, rate limits) → retry
- Permanent (invalid data, record not found) → discard
- Configuration (missing credentials) → alert
Queue & Timeout — Assign queue priority and set execution timeout.
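The error classification step can be written down as data before any job code exists. The following is a minimal plain-Ruby sketch: the error class names and the `JobErrors` module are hypothetical placeholders, and falling back to `:alert` for unclassified errors is an assumption of this sketch, not something the skill prescribes.

```ruby
# Hypothetical error classes standing in for whatever the real service raises.
module JobErrors
  class Timeout < StandardError; end
  class RateLimited < StandardError; end
  class InvalidInput < StandardError; end
  class MissingCredentials < StandardError; end

  # Each anticipated error maps to exactly one handling strategy.
  STRATEGY = {
    Timeout => :retry,             # transient
    RateLimited => :retry,         # transient
    InvalidInput => :discard,      # permanent
    MissingCredentials => :alert   # configuration
  }.freeze

  # Returns :retry, :discard, or :alert for the given error instance.
  # Unclassified errors fall back to :alert (an assumption of this sketch).
  def self.strategy_for(error)
    match = STRATEGY.find { |klass, _strategy| error.is_a?(klass) }
    match ? match.last : :alert
  end
end

JobErrors.strategy_for(JobErrors::Timeout.new)      # => :retry
JobErrors.strategy_for(JobErrors::InvalidInput.new) # => :discard
```

A table like this translates directly into `retry_on` and `discard_on` declarations later, and makes a missing classification easy to spot in review.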

HARD GATE — Job Design Complete:

Purpose, trigger, input/output defined
Idempotency strategy specified
All errors classified as transient, permanent, or configuration
Queue and timeout values chosen

If gate fails: Clarify requirements before implementation.

Phase 2: TDD Implementation

Objective: Implement job logic under TDD discipline.

Steps:

Choose unit vs. integration test approach.
Write failing tests covering: successful execution, idempotency (run twice = same result), transient error raises, permanent error discards.
Confirm tests FAIL for the right reason (job not yet implemented).
Propose implementation approach and wait for explicit user approval.
Implement job; confirm tests PASS.
Run full test suite — confirm no regressions.

HARD GATE — Tests Pass:

Tests exist and run
Tests failed before implementation
All tests pass after implementation
Full suite green

Example job test skeleton:

# spec/jobs/order_confirmation_email_job_spec.rb
RSpec.describe OrderConfirmationEmailJob do
  let(:order) { create(:order, :completed) }

  it 'sends confirmation email' do
    expect(EmailService).to receive(:send_confirmation).with(order.id, order.customer_email, order.total)
    described_class.perform_now(order.id, order.customer_email, order.total)
  end

  it 'is idempotent' do
    expect(EmailService).to receive(:send_confirmation).once
    2.times { described_class.perform_now(order.id, order.customer_email, order.total) }
  end

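  it 'raises on transient errors so retry triggers' do
    allow(EmailService).to receive(:send_confirmation).and_raise(EmailService::TimeoutError)
    # Call #perform directly: perform_now routes the error through retry_on,
    # which re-enqueues the job instead of raising until attempts run out.
    expect { described_class.new.perform(order.id, order.customer_email, order.total) }.to raise_error(EmailService::TimeoutError)
  end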
end

Example job implementation skeleton:

# app/jobs/order_confirmation_email_job.rb
class OrderConfirmationEmailJob < ApplicationJob
  queue_as :default

  # :polynomially_longer replaced the deprecated :exponentially_longer in Rails 7.1
  retry_on  EmailService::TimeoutError,    wait: :polynomially_longer, attempts: 5
  retry_on  EmailService::RateLimitError,  wait: :polynomially_longer, attempts: 3
  discard_on ActiveRecord::RecordNotFound
  discard_on EmailService::InvalidEmailError

  def perform(order_id, customer_email, order_total)
    order = Order.find(order_id)
    return if order.email_sent_at.present?   # idempotency guard

    EmailService.send_confirmation(order_id, customer_email, order_total)
    order.update!(email_sent_at: Time.current)
  rescue EmailService::TimeoutError, EmailService::RateLimitError => e
    Rails.logger.error("[#{self.class}] transient error: #{e.message}")
    raise
  end
end

Note: discard_on handles permanent errors at the framework level — no rescue block is needed for them. The rescue block above covers only transient errors that need logging before being re-raised to trigger retry.
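The idempotency guard in the skeleton's `perform` method (return early when `email_sent_at` is already set) can be exercised in isolation. The sketch below uses hypothetical in-memory stand-ins (`SketchOrder`, `FakeMailer`, `send_confirmation_once`) rather than Active Record or the real `EmailService`, purely to show the check-then-mark pattern.

```ruby
# Hypothetical in-memory stand-in for the Order record.
SketchOrder = Struct.new(:id, :email_sent_at)

# Hypothetical mailer that records each delivery instead of sending anything.
class FakeMailer
  attr_reader :deliveries

  def initialize
    @deliveries = []
  end

  def send_confirmation(order_id)
    @deliveries << order_id
  end
end

# Sends the confirmation at most once per order.
def send_confirmation_once(order, mailer, now: Time.now)
  return :skipped if order.email_sent_at # idempotency guard

  mailer.send_confirmation(order.id)
  order.email_sent_at = now              # mark as done so a re-run is a no-op
  :sent
end

order  = SketchOrder.new(1, nil)
mailer = FakeMailer.new
send_confirmation_once(order, mailer) # => :sent
send_confirmation_once(order, mailer) # => :skipped
mailer.deliveries                     # => [1]
```

With real Active Record models and several workers running concurrently, the check and the write are not atomic, so a row lock (for example `order.with_lock`) or a unique database constraint would likely be needed on top of this guard.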

Phase 3: Retry/Discard Configuration

Objective: Harden job for production with correct retry backoff, discard rules, timeouts, and monitoring hooks.

Steps:

Choose backend (Solid Queue for Rails 8+, Sidekiq for high scale) and configure worker concurrency.
Apply retry_on with a growing backoff (wait: :polynomially_longer on Rails 7.1+, :exponentially_longer on older versions) and a capped attempt count (3–5) for every transient error class.
Apply discard_on for every permanent error class; log discards.
Set job execution timeout and queue timeout at the worker/config level.
Wire error tracking (e.g., Sentry) and metrics (e.g., StatsD/Datadog) in ApplicationJob callbacks.
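To judge whether an attempt cap of 3 to 5 is reasonable, it helps to see how long the built-in backoff actually waits. Rails documents the delay for `:polynomially_longer` (named `:exponentially_longer` before Rails 7.1) as roughly `executions**4 + 2` seconds plus random jitter. The sketch below reproduces that formula with jitter left out; the helper names are hypothetical.

```ruby
# Delay in seconds before the retry that follows execution number `executions`,
# per the documented formula with the random jitter term omitted.
def polynomial_backoff(executions)
  (executions**4) + 2
end

# `attempts` counts the first run too, so there are attempts - 1 retries.
def backoff_schedule(attempts)
  (1...attempts).map { |executions| polynomial_backoff(executions) }
end

backoff_schedule(5)     # => [3, 18, 83, 258]
backoff_schedule(5).sum # => 362
```

With `attempts: 5` the job therefore gives up about six minutes after the first failure, which is worth comparing against how long the upstream outage or rate-limit window is expected to last.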

Solid Queue (Rails 8+) snippet:

# config/queue.yml (Solid Queue reads worker settings from YAML, not a Ruby initializer)
production:
  workers:
    - { queues: "*", processes: 2, threads: 5, polling_interval: 1 }

Sidekiq snippet:

# config/initializers/sidekiq.rb
Sidekiq.configure_server { |c| c.redis = { url: ENV['REDIS_URL'] } }

Monitoring hook in ApplicationJob:

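class ApplicationJob < ActiveJob::Base
  # Assumes the statsd-instrument gem, which provides StatsD.measure and StatsD.increment.
  around_perform do |job, block|
    metric = "jobs.#{job.class.name.underscore}"
    start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    block.call
    elapsed_ms = (Process.clock_gettime(Process::CLOCK_MONOTONIC) - start) * 1000
    StatsD.measure("#{metric}.duration", elapsed_ms) # timers are reported in milliseconds
    StatsD.increment("#{metric}.success")
  rescue StandardError
    StatsD.increment("#{metric}.failure")
    raise
  end
end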

HARD GATE — Retry Strategy Configured:

retry_on declared for every transient error with backoff and attempt cap
discard_on declared for every permanent error with logging
Timeouts configured at job and worker level
Metrics/alerting wired

If gate fails: Job is not production-ready.

Phase 4: Failure Scenario Testing & Monitoring

Objective: Verify retry/discard behaviour under injected failures and confirm observability.

Steps:

Inject transient errors → assert perform re-raises them so retry_on can reschedule the job (perform_now itself does not raise until attempts are exhausted).
Inject permanent errors → assert job does not raise and error is logged.
Confirm timeout handling (stub slow operations).
Verify metrics increment on success and failure paths.
Confirm queue-depth alerts fire when queue backs up.
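Verifying that metrics increment on both paths does not require a live StatsD server. One option, sketched below in plain Ruby, is to inject a recorder object and assert on what it captured. `FakeMetrics` and `with_job_metrics` are hypothetical names; the wrapper mirrors the success/failure structure of the `around_perform` hook rather than calling it.

```ruby
# Hypothetical recorder that mimics the increment call used by the monitoring hook.
class FakeMetrics
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def increment(key)
    @counts[key] += 1
  end
end

# Runs the block, counts success or failure, and re-raises any error
# so that retry/discard handling further up still sees it.
def with_job_metrics(metrics, job_name)
  result = yield
  metrics.increment("jobs.#{job_name}.success")
  result
rescue StandardError
  metrics.increment("jobs.#{job_name}.failure")
  raise
end

metrics = FakeMetrics.new
with_job_metrics(metrics, "order_confirmation_email_job") { :ok }

begin
  with_job_metrics(metrics, "order_confirmation_email_job") { raise "boom" }
rescue RuntimeError
  # expected: the wrapper re-raises after counting the failure
end

metrics.counts
# => {"jobs.order_confirmation_email_job.success"=>1,
#     "jobs.order_confirmation_email_job.failure"=>1}
```

In an RSpec suite the same check is usually written with message expectations on the real client constant; the injected recorder is simply the dependency-free version of that idea.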

Example failure scenario tests:

RSpec.describe OrderConfirmationEmailJob do
  let(:order) { create(:order, :completed) }

  it 'logs and re-raises on transient error' do
    allow(EmailService).to receive(:send_confirmation).and_raise(EmailService::TimeoutError)
    expect(Rails.logger).to receive(:error).with(/transient error/)
    # Call #perform directly: perform_now would hand the error to retry_on,
    # which re-enqueues the job instead of raising.
    expect { described_class.new.perform(order.id, order.customer_email, order.total) }
      .to raise_error(EmailService::TimeoutError)
  end

  it 'discards silently on permanent error' do
    allow(EmailService).to receive(:send_confirmation).and_raise(EmailService::InvalidEmailError)
    expect { described_class.perform_now(order.id, "bad", order.total) }.not_to raise_error
  end
end

HARD GATE — Failure Scenarios Tested:

Retry path tested (raises on transient error)
Discard path tested (no raise on permanent error)
Error logging assertions pass
Metrics verified on success and failure
Performance acceptable under expected load

If gate fails: Address failure scenarios before deploying.

HARD GATE: Production Readiness

Never deploy a background job without:

Idempotency guard implemented and tested
All transient errors covered by retry_on with backoff
All permanent errors covered by discard_on with logging
Failure scenario tests passing
Metrics and error-tracking wired
Timeouts configured

Error Recovery

Job fails repeatedly in production:

Check retry patterns and error rates in monitoring.
Review logs for error class and stack trace.
Classify error (transient vs. permanent) and adjust retry_on/discard_on if mis-classified.
Fix root cause; redeploy.

Queue backs up:

Scale worker processes/threads.
Promote critical jobs to a higher-priority queue.
Optimise job execution time or batch size.

Anti-Patterns to Avoid

Non-idempotent jobs — always guard against duplicate execution.
Missing retry/discard — never deploy without both strategies configured.
Silent failures — always log and track errors.
Unbounded retries — cap attempts (3–5 is typical).
Blocking operations — keep jobs short; offload slow I/O.
No monitoring — wire metrics before going to production.
