Use when ConfigHub's Unit Data and the cluster's live state for that Unit have diverged — phrases like "reconcile drift", "the cluster changed out of band", "someone kubectl edit'd this", "ConfigHub and the cluster disagree", "accept the live changes", "overwrite the cluster with ConfigHub", "refresh from live", "who owns this drift?", "we have drift on app-a in prod". Runs `cub unit refresh` to pull current live state, `cub unit diff` against Data, walks the decide-who-wins decision (ConfigHub wins → re-apply; cluster wins → absorb; merge → selective reconcile), and executes the chosen path. Do not load for revision history rewind (use `rollback-revision`), for post-apply verification / three-way agreement checks (use `verify-apply`), for the first-time apply of a newly-bound Unit (use `cub-apply`), or for importing wholesale live resources into a brand-new Unit (use `import-from-cluster`).
Resolve divergence between a Unit's Data (what ConfigHub thinks is the desired state) and its LiveData/LiveState (what the cluster currently has). The decision isn't mechanical — it's a judgment call about which side should be authoritative for the drift, and the right answer depends on what the drift contains and how it got there.
| Resolution | Meaning | Commands |
|---|---|---|
| ConfigHub wins | Drift is unauthorized / noise (manual kubectl edit, controller mutation, drift-by-default fields). ConfigHub's Data is the source of truth. | Re-apply via the cub-apply skill — `cub unit apply <slug>`. If the drift recurs, address the source (who's running the out-of-band edits, what controller is rewriting the field) rather than treating it as a repeat symptom. |
| Cluster wins | Drift represents an intentional change made in-cluster that should now be absorbed into ConfigHub. | `cub unit refresh <slug>` (pulls LiveData into Data as a new head revision; no re-apply is needed, since the cluster already matches). |
| Selective merge | Some drift accepted, some rejected. Common when a controller adds annotations/fields you want to keep while a hand-edit you want to reject sits in the same Unit. | `cub unit refresh` to pull LiveData into a staging revision, then cub-mutate (functions) to keep only the subset you want, then apply. |
The decision goes through the user, not the skill. The skill's job is to show the diff clearly, name the likely sources of drift, and execute the user's chosen resolution.
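The three resolutions reduce to a small dispatch the operator drives. The helper below is a hypothetical sketch (not part of the skill); the `cub` invocations it prints are the ones from the table, and the resolution names are this doc's:

```shell
# Hypothetical helper: echo the first command for the user's chosen resolution.
# Resolution names and cub invocations come from the table above.
resolve_drift() {
  slug="$1"; space="$2"; resolution="$3"
  case "$resolution" in
    confighub-wins)  echo "cub unit apply $slug --space $space --wait" ;;
    cluster-wins)    echo "cub unit refresh $slug --space $space" ;;
    selective-merge) echo "cub unit refresh $slug --space $space" ;;  # then cub-mutate, then apply
    *) echo "unknown resolution: $resolution" >&2; return 1 ;;
  esac
}

resolve_drift app-a prod confighub-wins
```

Even with a wrapper like this, the skill should still surface the diff and confirm with the user before executing whichever command wins.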
Use when:
- `cub unit refresh` (run manually or by another skill) surfaced changes and the user needs to decide what to do.

Don't use when:
- You need a revision-history rewind: use rollback-revision.
- You need post-apply verification: use verify-apply. That skill assumes the apply has completed; this one handles divergence.
- `cub unit list` shows a Unit with LiveRevisionNum != LastAppliedRevisionNum. That indicates an incomplete or stuck apply or destroy, not drift.
- You're doing the first-time apply of a newly-bound Unit: use the cub-apply skill.
- You're importing wholesale live resources: use import-from-cluster (which creates a Unit from live state; drift-reconcile operates on existing Units).

Preflight: confirm `cub organization list` succeeds (proves a valid token; `cub context get` / `cub info` / `cub version` don't require one), then find the Unit's Target and Worker:

```shell
cub unit get <slug> --space <s> -o jq='{TargetID: .Unit.TargetID, BridgeWorker: .BridgeWorker.Slug}'
cub worker status --space <worker-space> <worker-slug>
```

Before the skill decides anything, know that the Worker's Kubernetes bridge already strips fields managed by a long list of controllers during refresh / import: HPA / VPA, Deployment / ReplicaSet / StatefulSet / DaemonSet / Job / CronJob controllers, the scheduler and cluster-autoscaler / descheduler, Istio / Linkerd sidecar injectors, Traefik ingress, cert-manager, plus status fields across the board. Full list: https://github.com/confighub/sdk/blob/main/bridge-impl/kubernetes/kubernetes_lib.go (search for ignoredFieldManagers).
Consequences:
- The HPA-writes-`spec.replicas` case usually doesn't show up as drift — the Worker strips it. If it does, the field manager isn't the HPA; dig into what's writing it.
- What does survive: manual edits (`kubectl edit`, debugging patches), controllers not in the ignored list, and fields the admission chain writes but attributes to the original applier (cleaned-up SecurityContext fields, for example).

Frame the "name the likely source" step around what the Worker leaves in, not around every category of controller activity: most of the noise has already been removed.
Read the drift as Data vs LiveData — both are cleaned of .status, controller-managed fields, and the other noise the Worker already elides (references/cub-cli.md → "A Unit's four 'what's in it' views"). Apples to apples.
```shell
# Point diagnosis at one Unit first, even for bulk cases — the resolution choice
# should be informed by examples, not a summary.
cub unit diff <slug> --space <s> --from=LastAppliedRevisionNum --to=LiveRevisionNum

# The cluster state corresponding to the unit, cleaned, at the time of the last
# action (apply, refresh, or import; will be empty after destroy):
cub unit livedata <slug> --space <s>

# For cluster debugging (status, managedFields, full detail), use livestate —
# NOT for drift diffs (too noisy against Data). Also current as of the last
# action (apply, refresh, or import; will be empty after destroy).
cub unit livestate <slug> --space <s>
```

For a preview of what a refresh would bring in, without touching the Unit, use `--dry-run`. Refresh queues a unit-action rather than creating a new Unit revision:
```shell
opID=$(cub unit refresh --wait --space <s> <slug> --dry-run -o jq='.QueuedOperationID')
cub unit-action get --space <s> <slug> "$opID" --data   # the refreshed data that would be returned
```

On a dry run, the Data / LiveData / LiveState the operation computed stay in the QueuedOperation record; read them via `cub unit-action get` rather than expecting them on the Unit. Useful to answer "what would refresh absorb?" before committing to the refresh.
Check for drift with refresh before editing configuration data, so you are operating on up-to-date data rather than discovering divergence at apply time.

`cub unit apply` also supports `--dry-run`, which tests whether updates will work in the cluster once the configuration data has been changed. Resource creates sometimes can't be verified this way because their dependencies haven't actually been created.
After Worker-side elision (above), what's left is usually one of:
- Manual `kubectl edit` / `kubectl patch`. Someone debugged in prod and left the edit in place (a "break glass" operational change). Ask the user whether the edit is meant to stick (absorb → cluster wins) or was a stopgap (ConfigHub wins, re-apply to overwrite).
- Admission-chain mutations. These land with the `manager` field pointing at the client that sent the request, so the Worker's field-manager elision doesn't catch them. These usually represent real workload requirements; absorb into Data.
- Controllers not in the ignored list. Check the `manager` field in the live object's `metadata.managedFields` (via `cub unit livestate`) to identify them. Absorb if the field is legitimately controller-owned; restore from Data if the write is wrong.

Walk through the diff with the user and reach one of the three resolutions. For bulk drift, decide per-Unit or group Units that have the same drift shape and decide for the group.
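To put a name to a suspect writer, the live object's `managedFields` can also be read directly from the cluster. A read-only diagnosis sketch — resource kind, name, and namespace are placeholders, and `jq` is assumed to be available:

```shell
# List every field manager that has written to the object, with when and how.
# --show-managed-fields re-includes metadata.managedFields, which kubectl
# normally hides in its output.
kubectl get deployment <name> -n <namespace> --show-managed-fields -o json \
  | jq '.metadata.managedFields[] | {manager, operation, time}'
```

A `manager` like `kubectl-edit` or `kubectl-patch` points at a human; an unfamiliar controller name points at something the Worker's ignored list doesn't cover.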
Re-apply the Unit's Data. The cub-apply skill owns this — hand off:
```shell
cub unit apply <slug> --space <s> --wait
```

If drift recurs on the same Unit, identify the source (a teammate still running `kubectl edit`, a mutating admission controller rewriting a field, an HPA writing `spec.replicas`) and address that source. Re-applying on a schedule treats a recurring cause as a repeat symptom; the fix belongs at the source (educate the teammate, adjust the admission policy, remove `spec.replicas` from Data so the HPA owns it).
To undo out-of-band cluster edits explicitly (not just a re-apply — capture what's there, then reset to the prior Data): refresh to absorb live state into a new head, then restore back to the pre-refresh head and apply. This records both the observation and the revert as distinct revisions:
```shell
cub unit refresh <slug> --space <s>   # new head = live state (N)
cub unit update <slug> --space <s> --restore -1 \
  --change-desc "Revert out-of-band cluster changes to <slug>. Restored to pre-refresh head (N-1).
User prompt: <verbatim>
Clarifications: <condensed — what was changed in-cluster and why it's rejected>"
cub unit apply <slug> --space <s> --wait
```

Use this when it's important for the audit trail to show what was in the cluster, not just that ConfigHub's Data was re-applied.
Note that due to asynchronous triggers and changes made by link resolution (aka "resolve"), refresh and other mutations sometimes generate two revisions rather than one. Either create a tag and set it with `cub unit tag` before the refresh, or inspect the result with `cub revision list` after running `cub unit refresh`.
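A minimal sketch of the tag-then-inspect sequence; the tag name is arbitrary, and the exact `cub unit tag` argument order is an assumption inferred from this doc's other commands:

```shell
cub unit tag <slug> --space <s> pre-refresh   # assumed arg order: mark the current head
cub unit refresh <slug> --space <s>
cub revision list <slug> --space <s>          # count the revisions created since the tag
```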
Absorb live state into Data:
```shell
cub unit refresh <slug> --space <s>
```

`cub unit refresh` creates a new head revision whose Data matches current LiveData. Review the revision:

```shell
cub unit diff <slug> --space <s> --from=-1   # new head vs prior head
```

Refresh updates LiveRevisionNum and LastAppliedRevisionNum to match the new HeadRevisionNum, so an apply is not necessary; the changes are already in the cluster.
If you had unapplied changes before the refresh, merge them into the updated configuration data so they can be applied:

```shell
cub unit update --patch --space <s> <slug> \
  --merge-source Self --merge-base PreviousLiveRevisionNum --merge-end Before:HeadRevisionNum \
  --change-desc "Merge unapplied changes from before refresh"
```

Refresh into a new head, then use cub-mutate to reject the parts you don't want:
```shell
cub unit refresh <slug> --space <s>
# New head contains everything — the accepted sidecar annotations AND the stopgap kubectl edit.
# Reject the kubectl edit with a function (example: strip a specific label back to the prior value).
cub function set --space <s> --unit <slug> \
  --change-desc "Keep sidecar annotations; reject debug label left by manual kubectl edit.
User prompt: <verbatim>
Clarifications: <condensed>" \
  -o mutations \
  -- set-label app.kubernetes.io/debug "-"   # example; use the function that matches
cub unit apply <slug> --space <s> --wait
```

For more complex merges (many fields to keep / reject), a whole-unit rewrite fallback (via cub-mutate Shape 6) may be cleaner than a chain of functions. Use the function path when the merge is three or fewer fields.
Reconciliation handles the drift that's present. If the same drift is likely to recur, the fix is at the source, not in the reconciliation cadence:
- Out-of-band human edits: tighten `kubectl` RBAC on the namespace; route changes through cub-mutate next time.
- Controller-owned fields: remove them from Data (e.g., `spec.replicas` from a Deployment that's under an HPA). If it isn't in Data, it can't drift.

Scheduling a drift-reconcile run for regular recurrence is a workaround, not a fix — use it only when you can't change the source.
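For the RBAC angle, a quick read-only check of whether a given user can still write workloads out of band; user and namespace are placeholders, and `kubectl auth can-i` is standard kubectl:

```shell
# Each prints "yes" or "no"; run once per verb you intend to revoke.
kubectl auth can-i patch  deployments -n <namespace> --as=<user>
kubectl auth can-i update deployments -n <namespace> --as=<user>
kubectl auth can-i delete deployments -n <namespace> --as=<user>
```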
Apply the same loop per-Unit, or for same-shape drift use bulk commands:
```shell
# Bulk drift detection check.
cub unit refresh --space "*" --filter <app>-home/<app>-app --wait --dry-run -o mutations
# Bulk refresh.
cub unit refresh --space "*" --filter <app>-home/<app>-app --wait
# Bulk re-apply (ConfigHub wins resolution).
cub unit apply --space "*" --filter <app>-home/<app>-app --wait
```

Always `--dry-run` first on bulk refresh — you're creating new head revisions on every matching Unit.
Tools:
- `cub unit refresh` / `diff` / `livestate` / `livedata` / `update` / `tag`, `cub unit-action get`, `cub function set`, read-only `kubectl get` / `kubectl describe` for diagnosis.

Don't:
- Use `kubectl edit` / `apply` / `patch` / `delete` to "fix" drift — that creates more drift. If a cluster-side fix is genuinely needed (e.g., a broken resource that won't accept a re-apply), do it through cub-mutate + cub-apply.
- Treat a stuck or failed apply as drift: hand off to verify-apply to clear it first, then re-check.

Done when a follow-up `cub unit refresh --wait --dry-run -o mutations` shows no additional changes.

Verify:
- `cub unit get <slug> --space <s> --web` — current state including LiveRevisionNum and drift indicators.
- `cub revision list <slug> --space <s> --web` — the refresh + apply revisions with their `--change-desc`.

References:
- references/cub-cli.md — `--change-desc` scope; `-o mutations` on mutating calls.
- references/functions-catalog.md — functions for selective merge (`strip-metadata-*`, `set-label`, `set-annotation`, `set-cel`).

Related skills: cub-apply (runtime for ConfigHub-wins), cub-mutate (surgical edits during selective merge), verify-apply (post-apply verification; this skill is the divergence counterpart when an apply converged but the cluster has since diverged, or when a "drift" report is actually a stuck/failed apply), rollback-revision (ConfigHub-history rewind, different problem).