Guides application developers in designing correct and performant transaction patterns for CockroachDB, covering transaction lifetime, implicit vs explicit transactions, retry handling with exponential backoff, pushing invariants into SQL, selective pessimistic locking, set-based operations, connection pooling, prepared statements, keyset pagination, follower reads, and separating business logic from database logic. Use when building applications on CockroachDB, designing transaction workflows, handling retries, optimizing application-layer database interactions, or configuring connection pools.
Guides application developers through the design principles and implementation patterns needed to build correct, performant, and resilient applications on CockroachDB. Covers the full spectrum from transaction scoping and retry logic to connection pooling and observability.
Complement to SQL skills: For SQL syntax, schema design, and query optimization, see cockroachdb-sql. For benchmarking transaction formulations under contention, see benchmarking-transaction-patterns.
Transactions must include only the minimal set of SQL operations needed for one atomic state change. Do not place remote API calls, service-to-service requests, loops, expensive computation, or artificial waits inside a CockroachDB transaction.
Long-lived transactions increase intent lifetime, contention, and retry probability in CockroachDB's distributed, optimistic-concurrency architecture.
Anti-pattern:

```java
@Transactional
public void createOrder(Order order) {
    orderRepository.save(order);
    paymentGateway.charge(order); // external call inside TX
}
```

Correct approach — split the logic:

```java
@Transactional
public void createOrderRecord(Order order) {
    orderRepository.save(order);
}

// Outside the transaction
paymentGateway.charge(order);
```

Why it matters:
Long-lived transactions hold write intents longer, increasing contention and the probability of 40001 retry errors.

CockroachDB automatically wraps each individual SQL statement in a transaction when running in autocommit mode. Do not wrap single INSERT, UPDATE, DELETE, or SELECT statements in explicit BEGIN/COMMIT.
Preferred:

```sql
INSERT INTO orders (id, status)
VALUES (gen_random_uuid(), 'open');
```

Avoid:

```sql
BEGIN;
INSERT INTO orders (id, status)
VALUES (gen_random_uuid(), 'open');
COMMIT;
```

Benefits: simpler code paths, lower latency (fewer round trips), less resource usage, and fewer retry concerns, since single-statement transactions are easier for CockroachDB to retry automatically.
When multiple SQL operations must succeed or fail together, use explicit transactions with BEGIN/COMMIT. Because CockroachDB defaults to SERIALIZABLE isolation, transaction retries are a normal part of correct execution under contention.
```sql
BEGIN;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
UPDATE accounts SET balance = balance + 100 WHERE id = 2;
COMMIT;
```

Client-side retry loop with exponential backoff:

```python
import random
import time

from psycopg.errors import SerializationFailure  # psycopg 3; psycopg2.errors in psycopg2

def execute_with_retry(conn, txn_logic):
    backoff = 0.1
    while True:
        try:
            with conn.transaction() as txn:
                txn_logic(txn)
            return
        except SerializationFailure:
            # 40001: back off with jitter, then retry the whole unit of work
            time.sleep(backoff + random.uniform(0, 0.1))
            backoff = min(backoff * 2, 2.0)
```

Advanced retry with the cockroach_restart savepoint protocol:
```sql
BEGIN;
SAVEPOINT cockroach_restart;
-- transactional work
RELEASE SAVEPOINT cockroach_restart;
COMMIT;
```

WARNING: Generic savepoints do NOT work as retry mechanisms. CockroachDB aborts the entire transaction on a 40001 serialization failure, and ROLLBACK TO SAVEPOINT on a regular savepoint cannot recover it -- the transaction remains in an aborted state. Only the special SAVEPOINT cockroach_restart protocol (where the client catches the error, rolls back to the savepoint, and re-executes the work) is supported. For most applications, a full-transaction retry loop is simpler and recommended.
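The client-side half of the protocol can be sketched in Python. This is a minimal sketch, not CockroachDB's official driver code: it assumes a DB-API style cursor and driver exceptions that expose the SQLSTATE as a `pgcode` attribute (as psycopg2's do); `run_with_restart_savepoint` and `do_work` are illustrative names.

```python
SERIALIZATION_FAILURE = "40001"  # SQLSTATE for retryable errors

def run_with_restart_savepoint(cur, do_work, max_attempts=10):
    """Drive one unit of work through the cockroach_restart protocol."""
    cur.execute("BEGIN")
    cur.execute("SAVEPOINT cockroach_restart")
    for _ in range(max_attempts):
        try:
            do_work(cur)  # the transactional statements
            cur.execute("RELEASE SAVEPOINT cockroach_restart")
            cur.execute("COMMIT")
            return
        except Exception as exc:
            if getattr(exc, "pgcode", None) != SERIALIZATION_FAILURE:
                cur.execute("ROLLBACK")
                raise  # non-retryable: surface to the caller
            # 40001: roll back to the special savepoint and re-run the work
            cur.execute("ROLLBACK TO SAVEPOINT cockroach_restart")
    cur.execute("ROLLBACK")
    raise RuntimeError("transaction did not commit after %d attempts" % max_attempts)
```

Compared to the full-transaction retry loop above, this keeps the server-side transaction (and its timestamp) alive across retries, at the cost of more intricate client code.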
SQLSTATE guidance:
| Code | Meaning | Action |
|---|---|---|
| 40001 | Serialization / retryable | Retry the entire unit of work with backoff and jitter |
| 40003 | Ambiguous result / indeterminate commit | Do not blindly replay non-idempotent work |
| 08xx / 57xx | Network or server transient issues | Retry carefully, account for ambiguous commits |
| 23xxx | Constraint and application errors | Usually should not be retried |
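The table above can be reduced to a small classifier that error handlers consult before deciding whether to re-run transaction logic. A sketch; the category names are illustrative, not a standard API:

```python
def classify_sqlstate(code):
    """Map a SQLSTATE code to a coarse retry decision."""
    if code == "40001":
        return "retry"          # retry the whole unit of work with backoff
    if code == "40003":
        return "verify"         # ambiguous commit: confirm effects before replaying
    if code[:2] in ("08", "57"):
        return "retry-careful"  # transient network/server trouble; commits may be ambiguous
    return "fail"               # constraint/application errors: surface to the caller
```

A retry loop can then branch on the returned category instead of matching raw codes inline.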
Read-only transactions perform retrieval only and make no writes. Marking them as read-only allows CockroachDB to avoid unnecessary write intents, reduce contention with writers, and enable follower or bounded-staleness reads.
```sql
BEGIN;
SET TRANSACTION READ ONLY;
SELECT * FROM customers WHERE region = 'US-East';
COMMIT;
```

Do not fetch state into application code, modify it in memory, and write it back. Prefer atomic SQL, constraints, guarded UPDATEs, UPSERT, INSERT ... ON CONFLICT, and CTE-based mutations.
Anti-pattern:
```python
balance = db.fetch("SELECT balance FROM accounts WHERE id = 123")
balance += 100
db.execute("UPDATE accounts SET balance = %s WHERE id = 123", (balance,))
```

Preferred atomic SQL:
```sql
UPDATE accounts
SET balance = balance + 100
WHERE id = 123;
```

Guarded write with invariant enforcement:
```sql
UPDATE customer_daily_limits
SET used_total = used_total + $2
WHERE customer_id = $1
  AND day = current_date
  AND used_total + $2 <= daily_limit;
```

Atomic CTE pattern:
```sql
WITH limit_row AS (
  SELECT customer_id, day
  FROM customer_daily_limits
  WHERE customer_id = $1 AND day = current_date
  FOR UPDATE
), spend AS (
  UPDATE customer_daily_limits AS l
  SET remaining_limit = l.remaining_limit - $2,
      used_total = l.used_total + $2
  FROM limit_row
  WHERE l.customer_id = limit_row.customer_id
    AND l.day = limit_row.day
    AND l.remaining_limit >= $2
  RETURNING l.customer_id, l.day
), ins AS (
  INSERT INTO transfers (customer_id, amount, direction, created_at)
  SELECT $1, $2, 'debit', now()
  FROM spend
  RETURNING id AS transfer_id
)
SELECT transfer_id FROM ins;
```

Key approaches:

- UPDATE ... SET col = col + 1 for atomic arithmetic
- UNIQUE, CHECK, NOT NULL, and FOREIGN KEY constraints
- UPSERT or INSERT ... ON CONFLICT instead of read-before-write existence checks

CockroachDB defaults to optimistic concurrency, which works well for most workloads. For hot rows or contention-heavy read-before-write paths, SELECT ... FOR UPDATE reduces retry churn by making contenders wait instead of race.
```sql
BEGIN;
SELECT balance FROM accounts WHERE id = 1 FOR UPDATE;
UPDATE accounts SET balance = balance - 100 WHERE id = 1;
COMMIT;
```

Use when:

- Hot rows see frequent read-modify-write cycles under contention.
- Retry churn from 40001 aborts is measurably hurting throughput.
Counterintuitive contention insight: Adding more application pods or threads targeting the same hot rows does NOT increase throughput -- it decreases it. With N concurrent writers on the same row, only 1 can commit per round; the other N-1 are aborted with 40001 and must retry. More concurrency on hot data means more wasted work and lower TPS. Solutions: use SELECT ... FOR UPDATE to serialize access, use atomic UPDATE SET balance = balance + amount to eliminate the read-modify-write cycle, or distribute writes across multiple rows.
Trade-off: Overusing pessimistic locks can introduce waiting chains or deadlocks. Reserve for hot paths and contention-heavy workloads.
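One way to "distribute writes across multiple rows" is a sharded counter. A sketch, assuming a hypothetical `counter_shards` table keyed on `(counter_id, shard)` and a DB-API style `execute` callable; the names and shard count are illustrative:

```python
import random

NUM_SHARDS = 8  # more shards = less per-row contention, slightly costlier reads

def increment_counter(execute, counter_id, amount):
    """Write to one randomly chosen shard row instead of a single hot row."""
    shard = random.randrange(NUM_SHARDS)
    execute(
        "UPDATE counter_shards SET value = value + %s "
        "WHERE counter_id = %s AND shard = %s",
        (amount, counter_id, shard),
    )
```

Reads then aggregate across shards, e.g. `SELECT sum(value) FROM counter_shards WHERE counter_id = $1`, trading a slightly wider read for contention-free writes.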
CockroachDB performs best with set-oriented SQL rather than many small client-driven statements. This reduces round trips, shortens contention windows, and improves throughput.
Row-by-row anti-pattern:
```python
for row in rows:
    db.execute(
        "UPDATE accounts SET balance = balance + 10 WHERE id = %s",
        (row.id,)
    )
```

Set-based preferred:
```sql
UPDATE accounts
SET balance = balance + 10
WHERE region = 'US-East';
```

Batch INSERT:
```sql
INSERT INTO trades (id, symbol, price)
VALUES
  (1, 'AAPL', 180),
  (2, 'GOOG', 125),
  (3, 'AMZN', 140);
```

Batch UPDATE with UNNEST:
```sql
WITH incoming AS (
  SELECT *
  FROM UNNEST(
    ARRAY['u1', 'u2', 'u3']::STRING[],
    ARRAY['active', 'inactive', 'active']::STRING[]
  ) AS t(id, new_status)
)
UPDATE users AS u
SET status = incoming.new_status,
    updated_at = now()
FROM incoming
WHERE u.id = incoming.id;
```

Maintenance batching with LIMIT:

```sql
DELETE FROM sessions
WHERE expires_at < now()
LIMIT 10000;
```

JDBC batching (Java): use addBatch/executeBatch instead of per-row executeUpdate. This sends all rows in a single network round trip rather than N individual round trips, eliminating idle time that can account for ~50% of transaction latency in chatty workloads.
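The same round-trip saving is available from any language by assembling one multi-row INSERT. A sketch; `batch_insert_sql` is an illustrative helper, and the `%s` placeholder style assumes a DB-API driver:

```python
def batch_insert_sql(table, columns, rows):
    """Build a single multi-row INSERT statement plus its flat parameter list."""
    row_ph = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO {} ({}) VALUES {}".format(
        table, ", ".join(columns), ", ".join([row_ph] * len(rows)))
    params = [v for row in rows for v in row]  # flatten the row tuples
    return sql, params
```

Usage: `sql, params = batch_insert_sql("trades", ["id", "symbol", "price"], rows)` followed by a single `cur.execute(sql, params)` sends every row in one statement.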
Declarative TTL:
```sql
ALTER TABLE events
SET (ttl_expiration_expression = 'created_at + INTERVAL ''7 DAY''');
```

Many analytics, dashboard, and display-oriented queries do not need the absolute latest value. CockroachDB supports follower reads and bounded-staleness reads from follower replicas with lower latency.
Basic follower read:
```sql
SELECT * FROM orders
AS OF SYSTEM TIME '-5s';
```

Bounded staleness:

```sql
SELECT * FROM inventory
AS OF SYSTEM TIME with_max_staleness(INTERVAL '10s');
```

Read-write split pattern for heavy reads: When a workflow reads a large payload (e.g., KYC JSON document) and then updates a status field, split it into three phases: (1) read outside the transaction with AS OF SYSTEM TIME for a conflict-free snapshot, (2) process in the application layer, (3) start a short write-only transaction. This avoids holding write intents during the heavy read.
Use when: Dashboards, analytics, ETL, display-only reads, or large-payload workflows where the read and write can be separated.
Avoid when: The workflow requires the latest transactional state for a subsequent write decision.
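The three phases can be sketched as follows, with `fetchone` and `execute` standing in for driver calls, `validate` for the application-layer processing, and the `documents` table an illustrative stand-in; none of these names come from CockroachDB itself:

```python
def process_document(fetchone, execute, validate, doc_id):
    """Read on a snapshot, process in the app, then write in a short txn."""
    # Phase 1: heavy read outside any write transaction, on a stale snapshot
    payload = fetchone(
        "SELECT payload FROM documents AS OF SYSTEM TIME '-5s' WHERE id = %s",
        (doc_id,))
    # Phase 2: expensive processing happens while holding no locks or intents
    status = "APPROVED" if validate(payload) else "REJECTED"
    # Phase 3: short write-only transaction touching just the status column
    execute("UPDATE documents SET status = %s WHERE id = %s", (status, doc_id))
    return status
```

Because phase 1 reads a slightly stale snapshot, this split only fits workflows where the write decision tolerates that staleness, as the "Avoid when" note above cautions.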
As the OFFSET grows, CockroachDB must scan and discard more rows. Keyset pagination uses the last row's ordered key values to jump directly to the next page.
OFFSET/LIMIT (inefficient at depth):
```sql
SELECT id, order_date, customer_id
FROM orders
ORDER BY id
LIMIT 100 OFFSET 5000;
```

Keyset pagination (preferred):

```sql
SELECT id, order_date, customer_id
FROM orders
WHERE id > 5000
ORDER BY id
LIMIT 100;
```

Multi-column keyset:

```sql
SELECT id, created_at, customer_id
FROM orders
WHERE (created_at, id) > ('2025-01-01 00:00:00', 5000)
ORDER BY created_at, id
LIMIT 100;
```

Trade-off: keyset pagination is ideal for next/previous navigation but not for arbitrary "jump to page 73" UX.
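Wrapped in a loop, keyset pagination streams an entire table in fixed-size pages. A sketch assuming a `fetchall` callable that returns row tuples ordered by id; the function name is illustrative:

```python
def iter_orders(fetchall, page_size=100):
    """Yield every order, one keyset page at a time."""
    last_id = 0  # assumes positive, ascending integer ids
    while True:
        rows = fetchall(
            "SELECT id, order_date, customer_id FROM orders "
            "WHERE id > %s ORDER BY id LIMIT %s",
            (last_id, page_size))
        if not rows:
            return
        yield from rows
        last_id = rows[-1][0]  # the last key seen drives the next page
```

Each page is an indexed range scan starting at `last_id`, so the cost per page stays flat no matter how deep the iteration goes.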
Prepared statements reuse query structure and bind new values, improving performance through plan reuse and protecting against SQL injection.
Unsafe dynamic string concatenation:
```python
query = f"SELECT * FROM users WHERE username = '{user_input}'"
cursor.execute(query)
```

Prepared / parameterized execution:

```python
cursor.execute("SELECT * FROM users WHERE username = %s;", (user_input,))
```

Plan reuse:
```sql
PREPARE get_balance AS
SELECT balance FROM accounts WHERE id = $1;
EXECUTE get_balance(1001);
EXECUTE get_balance(2002);
```

Select only the columns you need. SELECT * increases network payload, memory usage, and CPU cost, and it prevents narrower index-only scans.

```sql
-- Avoid
SELECT * FROM users WHERE id = 101;

-- Preferred
SELECT name, email FROM users WHERE id = 101;
```

Schema evolution impact: if a later schema change adds profile_picture BYTEA, queries using SELECT * automatically pull that extra data. Explicit projections avoid this hidden performance regression.
Sequential or monotonically increasing primary keys create write hotspots. Keys and indexes should distribute reads and writes across ranges evenly.
Hotspot anti-pattern:
```sql
CREATE TABLE orders (
  id SERIAL PRIMARY KEY,
  customer_id UUID,
  region STRING
);
```

Randomized key:

```sql
CREATE TABLE orders (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  customer_id UUID,
  region STRING
);
```

Hash-sharded index:

```sql
CREATE INDEX orders_by_id_hash
ON orders (id)
USING HASH WITH (bucket_count = 16);
```

Composite key for natural distribution:
```sql
CREATE TABLE sales (
  region_id STRING,
  order_id UUID DEFAULT gen_random_uuid(),
  PRIMARY KEY (region_id, order_id)
);
```

Enforce explicit PKs cluster-wide:

```sql
SET CLUSTER SETTING sql.defaults.require_explicit_primary_keys.enabled = true;
```

Opening new database connections is expensive. Pooling reuses live connections to improve performance and prevent overload.
HikariCP guidance:
- maximumPoolSize: (vCPUs * 4) / number_of_pool_instances
- minimumIdle: equal to maximumPoolSize
- maxLifetime: 30 min (add jitter +/- 5 min)
- idleTimeout: 5-10 min typical
- keepaliveTime: slightly shorter than infrastructure timeout (~5 min)
- connectionTimeout: 10-30 s typical
- autoCommit: true unless using explicit transactions only

Example stable configuration:

```yaml
maximum-pool-size: 12
minimum-idle: 12
max-lifetime: 1800000
idle-timeout: 600000
keepalive-time: 300000
connection-timeout: 10000
auto-commit: true
pool-name: ingestionPool
```

CockroachDB should manage ACID reads, writes, and schema-level integrity. The application layer should orchestrate workflows, external services, queues, and long-running work.
Inside the transaction:

- The minimal reads and writes for one atomic state change
- Constraint and invariant enforcement in SQL

Outside the transaction:

- External API and service-to-service calls
- Queue publishing, loops, and long-running or expensive computation
Asynchronous workflow pattern:
```python
def handle_order(order):
    db.execute("INSERT INTO orders (id, status) VALUES (%s, %s)", (order.id, 'PENDING'))
    publish_event('process_order', {'order_id': order.id})
```

CockroachDB has a practical limit of ~16MB per transaction payload. This limit applies to the TOTAL data written in a single transaction, not just individual rows.
Two ways to hit the limit:

- A single very large row (for example, an oversized JSON or blob column)
- Many modest rows whose combined writes exceed the limit

Guidelines:

- Chunk bulk inserts, updates, and deletes into bounded batches
- Split large payloads across multiple rows or transactions
Exceeding the limit causes "split failed while applying backpressure to Put" errors: large Raft proposals block consensus, range splits stall, and the system applies backpressure.
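Bulk loads can stay under the limit by splitting the work into a series of bounded transactions. A sketch; `CHUNK_ROWS` is a tunable guess, `executemany` stands in for a driver call, and the `events` table is illustrative:

```python
CHUNK_ROWS = 1000  # size each chunk so its total bytes stay well under ~16MB

def insert_in_chunks(executemany, rows):
    """Write a large dataset as a series of bounded transactions."""
    for start in range(0, len(rows), CHUNK_ROWS):
        chunk = rows[start:start + CHUNK_ROWS]
        # each call commits independently in autocommit mode
        executemany("INSERT INTO events (id, payload) VALUES (%s, %s)", chunk)
```

Note the trade-off: chunked loads are no longer atomic as a whole, so a failure mid-load leaves earlier chunks committed; idempotent inserts (e.g., ON CONFLICT DO NOTHING) make re-running safe.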
Set session-level guardrails to catch runaway queries and missing WHERE clauses during development and testing:
```sql
SET transaction_rows_read_err = 10000;
SET transaction_rows_written_err = 1000;
```

These cause transactions that exceed the thresholds to fail with an explicit error rather than silently consuming cluster resources.
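Applied at connection setup, the guardrails travel with every session a pool hands out. A minimal sketch; `set_dev_guardrails` is an illustrative helper name and `execute` stands in for a driver call:

```python
def set_dev_guardrails(execute, max_read=10000, max_written=1000):
    """Install per-session row-count limits for dev/test environments."""
    execute("SET transaction_rows_read_err = %d" % max_read)
    execute("SET transaction_rows_written_err = %d" % max_written)
```

Many pool libraries expose a per-connection init hook (e.g., HikariCP's connectionInitSql) where the equivalent SET statements can be registered once.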
Single-user correctness is not sufficient. Test with realistic concurrency to surface retries, hotspots, contention, and workload-specific bottlenecks.
Quick start:
```shell
cockroach workload init bank 'postgresql://root@localhost:26257?sslmode=disable'
cockroach workload run bank --concurrency=64 --duration=10m
```

See monitoring-and-concurrency-testing for detailed contention queries, validation checklists, and Prometheus metrics.
Actively monitor query latency, contention, retries, and data distribution using EXPLAIN ANALYZE, crdb_internal.transaction_contention_events, DB Console SQL Activity, and Key Visualizer.
See monitoring-and-concurrency-testing for live contention queries, Prometheus metrics, and external monitoring integration.
| Scenario | Recommended Pattern |
|---|---|
| Single SQL statement | Implicit transaction (autocommit) |
| Multiple statements, all-or-nothing | Explicit transaction with retry loop |
| Read current state before write on hot rows | SELECT ... FOR UPDATE |
| Historical, display, or reporting read | AS OF SYSTEM TIME / follower reads |
| Batch of records in memory | UNNEST / VALUES / batch SQL |
| Multi-step business rule in one operation | Single-statement CTE |
- Handle 40001 serialization errors with a retry loop rather than surfacing them to users
- Make writes idempotent where possible (e.g., INSERT ... ON CONFLICT DO NOTHING)
- Run EXPLAIN ANALYZE on production queries that modify data