g14wxz/commerce-database-architect

Database architecture skills, docs, and rules for high-demand multi-tenant commerce platforms (PostgreSQL source of truth, Neo4j as derived GraphRAG projection, transactional outbox, RLS-based tenant isolation). Includes live schema introspection workflow via Postgres MCP.


docs/reference-snippets/outbox-pattern.md

Outbox Pattern — Reference Implementation

This is the canonical outbox implementation for the platform. Skills cite this file rather than re-deriving the pattern. Copy it as-is for new projects; deviate only with an ADR.


Table

CREATE TABLE outbox_messages (
  id              BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  tenant_id       BIGINT NOT NULL,
  aggregate_type  TEXT NOT NULL,                          -- 'Order', 'Product'
  aggregate_id    BIGINT NOT NULL,                        -- the affected row
  event_type      TEXT NOT NULL,                          -- 'OrderConfirmed'
  event_version   SMALLINT NOT NULL DEFAULT 1,
  payload         JSONB NOT NULL,
  occurred_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
  available_at    TIMESTAMPTZ NOT NULL DEFAULT now(),     -- for delayed/retry
  processed_at    TIMESTAMPTZ,
  attempts        SMALLINT NOT NULL DEFAULT 0,
  last_error      TEXT,
  trace_id        TEXT
);

-- Cheap polling: partial index on the pending set
CREATE INDEX outbox_messages_pending_idx
  ON outbox_messages (available_at, id)
  WHERE processed_at IS NULL;

-- Optional: tenant-aware lookup for debugging
CREATE INDEX outbox_messages_tenant_aggregate_idx
  ON outbox_messages (tenant_id, aggregate_type, aggregate_id);

When the table grows beyond 50M rows, partition by month on occurred_at.
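A declaration-level sketch of that layout, assuming the table is created partitioned from the start; Postgres cannot convert a populated table in place (that would need a create-and-swap migration), and any primary key on a partitioned table must include the partition column:

CREATE TABLE outbox_messages (
  id              BIGINT GENERATED ALWAYS AS IDENTITY,    -- identity on partitioned tables needs PostgreSQL 17+
  occurred_at     TIMESTAMPTZ NOT NULL DEFAULT now(),
  -- ...remaining columns exactly as in the table above...
  PRIMARY KEY (id, occurred_at)                           -- unique constraints must include the partition key
) PARTITION BY RANGE (occurred_at);

CREATE TABLE outbox_messages_202605
  PARTITION OF outbox_messages
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');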


Writing an event (in the same transaction as the state change)

BEGIN;

UPDATE sales_orders
   SET status = 'confirmed',
       confirmed_at = now(),
       version = version + 1
 WHERE id = $1
   AND version = $2;

INSERT INTO outbox_messages (
  tenant_id, aggregate_type, aggregate_id, event_type, payload, trace_id
) VALUES (
  $tenant_id, 'Order', $order_id, 'OrderConfirmed', $payload::jsonb, $trace_id
);

COMMIT;

Either both rows are persisted or neither is. No external publish happens inside the transaction.


Relay query (canonical)

Multiple relay workers can run concurrently. FOR UPDATE SKIP LOCKED lets them claim disjoint sets without contention.

WITH claimed AS (
  SELECT id
    FROM outbox_messages
   WHERE processed_at IS NULL
     AND available_at <= now()
   ORDER BY id
   LIMIT 100
   FOR UPDATE SKIP LOCKED
)
SELECT m.*
  FROM outbox_messages m
  JOIN claimed c ON c.id = m.id;

The relay then publishes each message to the event bus. On success:

UPDATE outbox_messages
   SET processed_at = now()
 WHERE id = ANY($1::bigint[]);

On failure (transient):

-- Linear backoff: 30 seconds per attempt, capped at 8 attempts' worth (4 minutes)
UPDATE outbox_messages
   SET attempts = attempts + 1,
       available_at = now() + (LEAST(attempts + 1, 8) * INTERVAL '30 seconds'),
       last_error = $error
 WHERE id = $id;

On permanent failure (attempts > 10), move to DLQ:

WITH moved AS (
  DELETE FROM outbox_messages WHERE id = $id RETURNING *
)
INSERT INTO outbox_messages_dlq SELECT * FROM moved;
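outbox_messages_dlq is referenced but not defined in this snippet. A minimal sketch, assuming it mirrors the outbox schema; the trailing dead_lettered_at column is an added assumption and, because it has a default, keeps the INSERT ... SELECT * above valid:

CREATE TABLE outbox_messages_dlq (
  LIKE outbox_messages INCLUDING DEFAULTS INCLUDING CONSTRAINTS
);

ALTER TABLE outbox_messages_dlq
  ADD COLUMN dead_lettered_at TIMESTAMPTZ NOT NULL DEFAULT now();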

Archive

After 7 days, processed messages move to outbox_messages_archive, partitioned by month, retained for 90 days unless compliance requires longer.

CREATE TABLE outbox_messages_archive (
  LIKE outbox_messages INCLUDING DEFAULTS INCLUDING CONSTRAINTS
) PARTITION BY RANGE (occurred_at);

-- Initial partition
CREATE TABLE outbox_messages_archive_202605
  PARTITION OF outbox_messages_archive
  FOR VALUES FROM ('2026-05-01') TO ('2026-06-01');

Hourly archival job (idempotent):

WITH moved AS (
  DELETE FROM outbox_messages
   WHERE processed_at IS NOT NULL
     AND processed_at < now() - INTERVAL '7 days'
   RETURNING *
)
INSERT INTO outbox_messages_archive SELECT * FROM moved;

Drop archive partitions older than 90 days monthly.
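A sketch of that monthly drop, with an illustrative partition name; DETACH ... CONCURRENTLY (PostgreSQL 14+) avoids holding an ACCESS EXCLUSIVE lock on the parent:

ALTER TABLE outbox_messages_archive
  DETACH PARTITION outbox_messages_archive_202602 CONCURRENTLY;

DROP TABLE outbox_messages_archive_202602;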


Consumer offsets

Consumers track their position to enable replay and ensure idempotency.

CREATE TABLE consumer_offsets (
  consumer_name        TEXT NOT NULL,
  partition_key        TEXT NOT NULL DEFAULT '',
  last_processed_id    BIGINT NOT NULL,
  last_processed_at    TIMESTAMPTZ NOT NULL DEFAULT now(),
  PRIMARY KEY (consumer_name, partition_key)
);

A consumer reads:

SELECT *
  FROM outbox_messages
 WHERE id > COALESCE((
         SELECT last_processed_id
           FROM consumer_offsets
          WHERE consumer_name = $name AND partition_key = $partition
       ), 0)                      -- start from the beginning if no offset row exists yet
   AND processed_at IS NOT NULL
 ORDER BY id
 LIMIT $batch_size;

After processing, update the offset in the same transaction as the consumer's state change.
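A minimal sketch of that commit pattern; the projection table and parameters are illustrative, not part of this reference:

BEGIN;

-- The consumer's own state change (illustrative projection table)
INSERT INTO order_search_projection (order_id, status)
VALUES ($order_id, 'confirmed')
ON CONFLICT (order_id) DO UPDATE SET status = EXCLUDED.status;

-- Advance the offset in the same transaction: a crash replays from the last
-- committed position instead of skipping or double-applying messages
INSERT INTO consumer_offsets (consumer_name, partition_key, last_processed_id)
VALUES ($name, $partition, $last_id)
ON CONFLICT (consumer_name, partition_key)
  DO UPDATE SET last_processed_id = EXCLUDED.last_processed_id,
                last_processed_at = now();

COMMIT;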


Observability

Required metrics:

  • outbox_lag_seconds (gauge) — now() - MIN(occurred_at) WHERE processed_at IS NULL; query sketch below
  • outbox_throughput (counter) — messages successfully published, for deriving the per-second rate
  • outbox_failures (counter) — by event_type
  • outbox_dlq_depth (gauge) — COUNT(*) FROM outbox_messages_dlq
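A sketch of the lag gauge, assuming the metrics exporter polls it with plain SQL:

-- Returns 0 when the pending set is empty
SELECT COALESCE(EXTRACT(EPOCH FROM now() - MIN(occurred_at)), 0) AS outbox_lag_seconds
  FROM outbox_messages
 WHERE processed_at IS NULL;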

Required alerts:

  • Lag > 60s for 5 minutes
  • Any message with attempts > 10
  • DLQ depth > 0

Common mistakes (don't)

  • Publishing directly to the bus (whether inside or just after the BEGIN/COMMIT block) instead of writing the outbox row. Publish and commit cannot be made atomic: one can succeed while the other fails, producing a ghost event or a lost event.
  • Polling without FOR UPDATE SKIP LOCKED. Concurrent relays will double-process or contend on the same rows.
  • Indexing the whole table instead of just the pending set. Wastes write capacity. Use the partial index above.
  • Setting processed_at before the publish ack returns. If the publish fails after the update, the event is lost.
  • Marking old messages dead without moving them to the DLQ. You lose the forensic record needed when an incident requires re-processing.
