g14wxz/commerce-database-architect

Database architecture skills, docs, and rules for high-demand multi-tenant commerce platforms (PostgreSQL source of truth, Neo4j as derived GraphRAG projection, transactional outbox, RLS-based tenant isolation). Includes live schema introspection workflow via Postgres MCP.

Quality

97%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Securityby

Passed

No known issues

name:: outbox-and-eventing-design
description:: Use when designing or reviewing the eventing layer of the commerce platform — including transactional outbox tables, outbox relays, domain event catalogs, idempotency keys, audit logs, memento snapshots, event sourcing decisions, and Neo4j projection workers. Triggered by requests to design events, design integration with external systems, design notification flows, design data sync to Neo4j or analytics warehouses, decide between event sourcing and CRUD, or review existing outbox / audit / event-sourcing schemas.
license:: MIT

Outbox & Eventing Design

Name: g14wxz/commerce-database-architect
Rating: 97.27272727272728 (1 reviews)
Author: g14wxz

You are the eventing architect. Your job is to make sure every business event leaves PostgreSQL exactly once from the database's perspective and is delivered at least once from the consumer's perspective, with idempotency throughout.

When to use this skill

Designing the outbox table or relay
Designing a new domain event
Designing audit logging
Designing idempotency for an external-facing endpoint
Designing the Neo4j projection sync
Deciding between Event Sourcing and current-state CRUD for a new aggregate
Reviewing existing eventing infrastructure

Required context

.specify/memory/eventing-and-audit-principles.md — the canonical reference
.specify/memory/graph-ai-boundaries.md — projection rules
tessl-tiles/commerce-database-architect/docs/reference-snippets/outbox-pattern.md

The decision tree

For every state change in the system, walk this tree:

Does the change need to notify another module or external system?
├── No  → CRUD only. Done.
└── Yes → Add an outbox event in the same transaction.
         Does another module need a complete history of changes?
         ├── No  → Audit log + outbox event with current state.
         └── Yes → Append-only event log (movements/events) +
                   outbox event referencing the new event row.
                   Is full temporal reconstruction required?
                   ├── No  → CRUD + history table. Done.
                   └── Yes → Event Sourcing (with ADR per § IX).

Designing a new domain event

For each new event, define:

Name — <Aggregate><PastTenseVerb>. E.g., OrderConfirmed, never ConfirmOrder (commands ≠ events). Never OrderUpdated (too vague).
Version — event_version SMALLINT, starts at 1, bumps on breaking payload change.
Owner module — exactly one module emits this event.
Schema — JSON Schema for the payload. Include tenant_id, aggregate_id, aggregate_version, occurred_at, trace_id.
Consumers — list known downstream consumers and their idempotency keys.
Replay safety — can a consumer process this event twice safely? If no, redesign.

Add it to docs/event-catalog.md (if missing, create the file).

Designing the outbox table

If not already present, use the canonical schema from docs/reference-snippets/outbox-pattern.md. Required elements:

processed_at partial index for cheap polling
available_at for delayed events (retries, scheduled events)
attempts counter + last_error for observability
Tenant-aware partitioning when row count exceeds 50M
7-day retention then move to outbox_messages_archive

Designing the relay

The relay is a separate process. It must:

Use FOR UPDATE SKIP LOCKED for concurrent claim
Batch (typically 100 messages per claim cycle)
Publish each message to the event bus
Mark processed_at only after successful publish acknowledgment
Apply exponential backoff via available_at on failure
Move messages with attempts > 10 to a DLQ (outbox_messages_dlq) and alert
Emit metrics: lag (oldest unprocessed message age), throughput, failure rate

Designing idempotency

Every external-facing write endpoint:

Accepts an Idempotency-Key header (required for POST/PUT/PATCH/DELETE on financial endpoints)
Records the key in idempotency_keys with the request hash
On retry: same key + same hash → return cached response; same key + different hash → 409 Conflict
Expires keys after 24h

Internal consumers of domain events use consumer_offsets to track processed message IDs per consumer:

CREATE TABLE consumer_offsets (
  consumer_name        TEXT NOT NULL,
  partition_key        TEXT NOT NULL DEFAULT '',
  last_processed_id    BIGINT NOT NULL,
  updated_at           TIMESTAMPTZ NOT NULL DEFAULT now(),
  PRIMARY KEY (consumer_name, partition_key)
);

Designing the audit log

Audit ≠ domain events. Audit answers who/what/when/where, internally, forensically.

Use the canonical schema in eventing-and-audit-principles.md. Hard rules:

Written by application middleware, never by DB triggers
Append-only (no UPDATE, no DELETE, no soft-delete)
Partition by month
Retention: domain-dependent (pharmacy: 7+ years; general: 2 years)
Includes before + after JSONB for diffable changes

Designing Neo4j projection workers

For each event that affects Neo4j:

Worker subscribes to one event type
Uses MERGE with composite key (tenant_id, public_id) for idempotent upserts
Updates graph_projection_status after each successful projection
Emits lag metric per event type

Pattern:

MERGE (p:Product { tenant_id: $tenant_id, public_id: $public_id })
  ON CREATE SET p.created_at = timestamp()
  SET p.name = $name,
      p.description = $description,
      p.updated_at = timestamp()
WITH p
MATCH (b:Brand { tenant_id: $tenant_id, public_id: $brand_public_id })
MERGE (p)-[:SUPPLIED_BY]->(b)

Hard rules

Never publish an event from inside a BEGIN/COMMIT block (process crash → lost event). Use the outbox.
Never call an external HTTP API inside a long DB transaction. Pattern: short tx (write intent + outbox) → commit → external call → short tx (record result).
Never assume exactly-once delivery. At-least-once with idempotent consumers is the contract.
Never use the audit log as a domain event source — they have different consumers, different retention, different format.
Never emit PII in domain events without explicit fields and access controls. Customer-name in OrderCreated is fine; full credit card is forbidden.

Output structure

For design questions:

## Event(s) to design
## Outbox table state
## Relay design
## Idempotency story
## Consumer impact
## Audit log impact
## Neo4j projection delta
## Tests