Use this skill when you need to archive historical database records to reduce primary database size. It automates moving old data to archive tables or cold storage (S3, Azure Blob, GCS). Trigger with phrases like "archive old database records", "implement data retention policy", "move historical data to cold storage", or "reduce database size with archival".
Install with the Tessl CLI:

`npx tessl i github:jeremylongshore/claude-code-plugins-plus-skills --skill archiving-databases89`
Quality: 88% (does it follow best practices?) · Impact: pending (no eval scenarios have been run)
Implement automated data archival pipelines that move historical records from primary database tables to archive storage (archive tables, S3, Azure Blob, or GCS) based on age, status, or access frequency criteria.
Prerequisites:

- `psql` or `mysql` CLI for executing archival queries
- `aws s3`, `az storage`, or `gsutil` CLI for cloud storage uploads
- `SELECT pg_size_pretty(pg_total_relation_size('table_name'))` to identify archival candidates

Identify archival candidates by finding large tables with time-based data:
- `SELECT relname, n_live_tup, pg_size_pretty(pg_total_relation_size(relid)) FROM pg_stat_user_tables ORDER BY pg_total_relation_size(relid) DESC LIMIT 10`

Define archival criteria for each table:
- Age-based: `WHERE created_at < NOW() - INTERVAL '1 year'`
- Status-based: `WHERE status IN ('completed', 'cancelled', 'expired')`
- Combined: `WHERE created_at < NOW() - INTERVAL '6 months' AND status = 'completed'`
- Estimate the impact first: `SELECT COUNT(*), pg_size_pretty(SUM(pg_column_size(t.*))) FROM table_name t WHERE <criteria>`

Handle referential integrity by archiving in dependency order:
- `SELECT COUNT(*) FROM active_child WHERE parent_id IN (SELECT id FROM parent WHERE <archive_criteria>)`

Create archive destination tables matching the source schema plus metadata columns:
- `CREATE TABLE orders_archive (LIKE orders INCLUDING ALL)`
- `ALTER TABLE orders_archive ADD COLUMN archived_at TIMESTAMPTZ DEFAULT NOW()`
- `ALTER TABLE orders_archive ADD COLUMN archive_batch_id UUID`

Implement the archival operation as an atomic batch:
- `SELECT gen_random_uuid() AS batch_id`
- `INSERT INTO orders_archive SELECT *, NOW(), batch_id FROM orders WHERE <criteria>`
- `SELECT COUNT(*) FROM orders_archive WHERE archive_batch_id = batch_id`
- `DELETE FROM orders WHERE id IN (SELECT id FROM orders_archive WHERE archive_batch_id = batch_id)`

For cloud storage archival, export data to files before upload:
- `COPY (SELECT * FROM orders WHERE <criteria>) TO '/tmp/archive_orders_2023.csv' WITH CSV HEADER`
- `gzip /tmp/archive_orders_2023.csv`
- `aws s3 cp /tmp/archive_orders_2023.csv.gz s3://archive-bucket/orders/2023/ --sse aws:kms`

Process archival in batches to avoid long-running transactions and excessive lock time.
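The batched, transactional archival loop described above can be sketched in Python. This is a minimal illustration using the standard-library `sqlite3` module so it runs anywhere; a production version would target PostgreSQL (e.g. via `psycopg2`) and generate the batch id server-side with `gen_random_uuid()`. The table and column names follow the examples above, and the demo schema, cutoff, and batch size are assumptions for illustration.

```python
import sqlite3
import uuid

def archive_batch(conn, cutoff, batch_size=50_000):
    """Move one batch of old orders into orders_archive atomically.
    Returns the number of rows archived (0 when nothing is left)."""
    batch_id = str(uuid.uuid4())
    with conn:  # one transaction: INSERT, verify, and DELETE commit together
        conn.execute(
            """INSERT INTO orders_archive (id, created_at, status, archived_at, archive_batch_id)
               SELECT id, created_at, status, datetime('now'), ?
               FROM orders WHERE created_at < ? LIMIT ?""",
            (batch_id, cutoff, batch_size),
        )
        moved = conn.execute(
            "SELECT COUNT(*) FROM orders_archive WHERE archive_batch_id = ?",
            (batch_id,),
        ).fetchone()[0]
        conn.execute(
            "DELETE FROM orders WHERE id IN "
            "(SELECT id FROM orders_archive WHERE archive_batch_id = ?)",
            (batch_id,),
        )
    return moved

# Demo schema and data: 5 old rows (2022) and 5 recent rows (2025)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, created_at TEXT, status TEXT)")
conn.execute("""CREATE TABLE orders_archive
                (id INTEGER, created_at TEXT, status TEXT,
                 archived_at TEXT, archive_batch_id TEXT)""")
rows = [(i, "2022-01-01" if i % 2 else "2025-01-01", "completed") for i in range(10)]
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)

total = 0
while (n := archive_batch(conn, "2024-01-01", batch_size=2)) > 0:
    total += n
print(total)  # 5 rows older than the cutoff, archived in batches of 2
```

Because each batch is its own transaction, an interruption loses at most the in-flight batch, and the `archive_batch_id` column records exactly which rows each run moved.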
Run `VACUUM ANALYZE` on source tables after archival to reclaim disk space and update statistics. For large archival operations (>30% of the table), consider `VACUUM FULL` during a maintenance window (it requires an exclusive lock).
Implement data retrieval procedures for archived data:
- `UNION ALL` between active and archive tables

Schedule recurring archival with a cron job or database scheduler; run it weekly or monthly. Include monitoring that alerts on: archival job failure, unexpected archive volume (too many or too few records), and source table size not decreasing after archival.
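The three alert conditions above can be captured in a small post-run check. A sketch, where the function name and the expected-volume thresholds are assumptions to be tuned per table:

```python
def check_archival_run(rows_archived, size_before, size_after,
                       expected_min=1_000, expected_max=1_000_000):
    """Return a list of alert strings for one archival run (empty = healthy)."""
    alerts = []
    if rows_archived < expected_min:
        alerts.append(f"too few records archived: {rows_archived}")
    if rows_archived > expected_max:
        alerts.append(f"unexpectedly large archive volume: {rows_archived}")
    if rows_archived > 0 and size_after >= size_before:
        alerts.append("source table size did not decrease after archival")
    return alerts

print(check_archival_run(50_000, 120, 95))  # healthy run -> []
print(check_archival_run(0, 100, 100))      # job archived nothing -> alert
```

The scheduler (cron or pg_cron) would call this after each run and route any non-empty result to the alerting channel.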
| Error | Cause | Solution |
|---|---|---|
| Foreign key violation during DELETE | Active child records still reference archived parent | Archive child records first; verify no active references exist before deleting parent records |
| Disk space not reclaimed after archival | PostgreSQL marks deleted rows as dead tuples but does not release space | Run VACUUM FULL table_name during maintenance window; or use pg_repack for online space reclamation |
| Archive batch interrupted mid-transaction | Network failure, timeout, or crash during archival | Transaction rollback ensures atomicity; restart from the last completed batch using batch_id tracking |
| Cloud storage upload fails | Network timeout, credential expiration, or bucket permissions | Implement retry with exponential backoff; verify credentials before starting; use multipart upload for files >100MB |
| Archived data needed for audit | Compliance request requires access to archived records | Query archive tables directly; or restore from cloud storage using the archive manifest to locate the correct files |
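The retry-with-exponential-backoff guidance in the table can be sketched as a generic wrapper. The function names and delay schedule here are illustrative assumptions; a real version would wrap the `aws s3 cp` call or an SDK upload:

```python
import time

def with_backoff(fn, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on exception with delays 1s, 2s, 4s, ..."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * (2 ** attempt))

# Simulated flaky upload: fails twice, then succeeds
calls = {"n": 0}
def flaky_upload():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("network timeout")
    return "uploaded"

delays = []  # inject a fake sleep so the demo records delays instead of waiting
print(with_backoff(flaky_upload, sleep=delays.append))  # uploaded
print(delays)  # [1.0, 2.0]
```

Injecting the `sleep` function keeps the wrapper testable; production code would use the default `time.sleep` and verify credentials before the first attempt.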
Archiving 2 years of completed orders to reduce database size by 60%: An orders table with 50M rows (120GB) contains 30M completed orders older than 1 year. Archival moves these to orders_archive in batches of 50,000 rows over 3 hours during off-peak. Source table drops to 20M rows (48GB). VACUUM reclaims 72GB. Query performance on active orders improves by 40%.
Tiered archival to S3 with Parquet format: Orders 6-12 months old move to archive tables (warm tier, queryable via SQL). Orders older than 12 months export to S3 as Parquet files (cold tier, retrievable on request). Parquet format reduces storage costs by 80% compared to CSV. Archive manifest tracks 156 Parquet files across 36 monthly partitions.
GDPR-compliant data retention with automatic purging: Archival script moves user data older than 3 years to archive tables. A separate purge job permanently deletes archive records older than 7 years. Both jobs log actions to an immutable audit trail. Monthly compliance report shows record counts by age tier and confirms purge completion.
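The GDPR-style tiering in this scenario reduces to classifying each record by age. A sketch, with the tier boundaries taken from the example above (archive at 3 years, purge at 7) and the function name as an assumption:

```python
def retention_action(age_years, archive_after=3, purge_after=7):
    """Map a record's age to the retention action the policy requires."""
    if age_years >= purge_after:
        return "purge"    # permanently delete from the archive
    if age_years >= archive_after:
        return "archive"  # move from primary tables to archive tables
    return "keep"         # stays in the primary database

print([retention_action(y) for y in (1, 4, 8)])  # ['keep', 'archive', 'purge']
```

The archival job acts on records classified `archive`, the purge job on records classified `purge`, and both log to the immutable audit trail.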
062ff50