Autonomous multi-agent task orchestration with dependency analysis, parallel tmux/Codex execution, and self-healing heartbeat monitoring. Use for large projects with multiple issues/tasks that need coordinated parallel execution.
Install with Tessl CLI
npx tessl i github:jdrhyne/agent-skills --skill task-orchestrator
Autonomous orchestration of multi-agent builds using tmux + Codex with self-healing monitoring.
Load the senior-engineering skill alongside this one for engineering principles.
A JSON file defining all tasks, their dependencies, files touched, and status.
{
  "project": "project-name",
  "repo": "owner/repo",
  "workdir": "/path/to/worktrees",
  "created": "2026-01-17T00:00:00Z",
  "model": "gpt-5.2-codex",
  "modelTier": "high",
  "phases": [
    {
      "name": "Phase 1: Critical",
      "tasks": [
        {
          "id": "t1",
          "issue": 1,
          "title": "Fix X",
          "files": ["src/foo.js"],
          "dependsOn": [],
          "status": "pending",
          "worktree": null,
          "tmuxSession": null,
          "startedAt": null,
          "lastProgress": null,
          "completedAt": null,
          "prNumber": null
        }
      ]
    }
  ]
}
dependsOn array enforces ordering
# 1. Create working directory
WORKDIR="${TMPDIR:-/tmp}/orchestrator-$(date +%s)"
mkdir -p "$WORKDIR"
# 2. Clone repo for worktrees
git clone https://github.com/OWNER/REPO.git "$WORKDIR/repo"
cd "$WORKDIR/repo"
# 3. Create tmux socket
SOCKET="$WORKDIR/orchestrator.sock"
# 4. Initialize manifest
cat > "$WORKDIR/manifest.json" << 'EOF'
{
  "project": "PROJECT_NAME",
  "repo": "OWNER/REPO",
  "workdir": "WORKDIR_PATH",
  "socket": "SOCKET_PATH",
  "created": "TIMESTAMP",
  "model": "gpt-5.2-codex",
  "modelTier": "high",
  "phases": []
}
EOF
# Fetch all open issues
gh issue list --repo OWNER/REPO --state open --json number,title,body,labels > issues.json
# Group by files mentioned in issue body
# Tasks touching same files should serialize
# For each task, create isolated worktree
cd "$WORKDIR/repo"
git worktree add -b fix/issue-N "$WORKDIR/task-tN" main
SOCKET="$WORKDIR/orchestrator.sock"
# Create session for task
tmux -S "$SOCKET" new-session -d -s "task-tN"
# Launch Codex (uses gpt-5.2-codex with reasoning_effort=high from ~/.codex/config.toml)
# Note: Model config is in ~/.codex/config.toml, not CLI flag
tmux -S "$SOCKET" send-keys -t "task-tN" \
  "cd $WORKDIR/task-tN && codex --yolo 'Fix issue #N: DESCRIPTION. Run tests, commit with good message, push to origin.'" Enter
#!/bin/bash
# check_progress.sh - Run via heartbeat
WORKDIR="$1"
SOCKET="$WORKDIR/orchestrator.sock"
MANIFEST="$WORKDIR/manifest.json"
STALL_THRESHOLD_MINS=20
check_session() {
  local session="$1"
  local task_id="$2"
  # Capture recent output
  local output=$(tmux -S "$SOCKET" capture-pane -p -t "$session" -S -50 2>/dev/null)
  # Check for completion indicators
  if echo "$output" | grep -qE "(All tests passed|Successfully pushed|❯ $)"; then
    echo "DONE:$task_id"
    return 0
  fi
  # Check for errors
  if echo "$output" | grep -qiE "(error:|failed:|FATAL|panic)"; then
    echo "ERROR:$task_id"
    return 1
  fi
  # Check for stall (prompt waiting for input)
  if echo "$output" | grep -qE "(\? |Continue\?|y/n|Press any key)"; then
    echo "STUCK:$task_id:waiting_for_input"
    return 2
  fi
  echo "RUNNING:$task_id"
  return 0
}
# Check all active sessions
for session in $(tmux -S "$SOCKET" list-sessions -F "#{session_name}" 2>/dev/null); do
  check_session "$session" "$session"
done
When a task is stuck, the orchestrator should:
Waiting for input → Send appropriate response
tmux -S "$SOCKET" send-keys -t "$session" "y" Enter
Error/failure → Capture logs, analyze, retry with fixes
# Capture error context (create the logs directory first so the redirect cannot fail)
mkdir -p "$WORKDIR/logs"
tmux -S "$SOCKET" capture-pane -p -t "$session" -S -100 > "$WORKDIR/logs/$task_id-error.log"
# Kill and restart with error context
tmux -S "$SOCKET" kill-session -t "$session"
tmux -S "$SOCKET" new-session -d -s "$session"
# Read the tail of the log that was just written, not a bare error.log in the current directory
tmux -S "$SOCKET" send-keys -t "$session" \
  "cd $WORKDIR/$task_id && codex --model gpt-5.2-codex-high --yolo 'Previous attempt failed with: $(tail -20 "$WORKDIR/logs/$task_id-error.log"). Fix the issue and retry.'" Enter
No progress for 20+ mins → Nudge or restart
# Check git log for recent commits
cd "$WORKDIR/$task_id"
LAST_COMMIT=$(git log -1 --format="%ar" 2>/dev/null)
# If no commits in threshold, restart
# Add to cron (every 15 minutes)
cron action:add job:{
"label": "orchestrator-heartbeat",
"schedule": "*/15 * * * *",
"prompt": "Check orchestration progress at WORKDIR. Read manifest, check all tmux sessions, self-heal any stuck tasks, advance to next phase if current is complete. Do NOT ping human - fix issues yourself."
}
# 1. Fetch issues
gh issue list --repo OWNER/REPO --state open --json number,title,body > /tmp/issues.json
# 2. Analyze for dependencies (files mentioned, explicit deps)
# Group into phases:
# - Phase 1: Critical/blocking issues (no deps)
# - Phase 2: High priority (may depend on Phase 1)
# - Phase 3: Medium/low (depends on earlier phases)
# 3. Within each phase, identify:
# - Parallel batch: Different files, no deps → run simultaneously
# - Serial batch: Same files or explicit deps → run in order
Write manifest.json with all tasks, dependencies, file mappings.
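The parallel/serial split above comes down to whether two tasks list any file in common. A minimal sketch of that check, assuming each task's files are passed as a space-separated string with no spaces or glob characters in the names; `files_overlap` and `batch_mode` are hypothetical helpers, not part of the skill:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit 0) if two space-separated file lists share a file.
# Relies on word splitting, so file names must not contain spaces or glob characters.
files_overlap() {
  local a b
  for a in $1; do
    for b in $2; do
      [ "$a" = "$b" ] && return 0
    done
  done
  return 1
}

# Hypothetical helper: report how two tasks should be batched.
batch_mode() {
  if files_overlap "$1" "$2"; then
    echo "serial"
  else
    echo "parallel"
  fi
}

batch_mode "ceo_root_manager.js" "workspace_validator.js"            # prints: parallel
batch_mode "kill_switch.js container_executor.js" "kill_switch.js"   # prints: serial
```

This covers file overlap only. Explicit dependencies still have to be recorded in `dependsOn`.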
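# Create worktrees for Phase 1 tasks (reads id and issue number from the manifest; requires jq)
cd "$WORKDIR/repo"
jq -r '.phases[0].tasks[] | "\(.id) \(.issue)"' "$WORKDIR/manifest.json" |
while read -r id issue; do
  git worktree add -b "fix/issue-$issue" "$WORKDIR/task-$id" main
done
# Launch tmux sessions for the parallel batch
# (assumed here to be the Phase 1 tasks whose dependsOn array is empty)
for id in $(jq -r '.phases[0].tasks[] | select(.dependsOn | length == 0) | .id' "$WORKDIR/manifest.json"); do
  tmux -S "$SOCKET" new-session -d -s "task-$id"
  tmux -S "$SOCKET" send-keys -t "task-$id" \
    "cd $WORKDIR/task-$id && codex --model gpt-5.2-codex-high --yolo '$PROMPT'" Enter
done
Heartbeat checks every 15 mins: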
# When task completes successfully
cd "$WORKDIR/task-$id"
git push -u origin "fix/issue-$issue"
gh pr create --repo OWNER/REPO \
--head "fix/issue-$issue" \
--title "fix: Issue #$issue - $TITLE" \
--body "Closes #$issue
## Changes
[Auto-generated by Codex orchestrator]
## Testing
- [ ] Unit tests pass
- [ ] Manual verification"
# After all PRs merged or work complete
tmux -S "$SOCKET" kill-server
cd "$WORKDIR/repo"
# Remove every task worktree created under the working directory
for worktree in "$WORKDIR"/task-*; do
  git worktree remove --force "$worktree"
done
rm -rf "$WORKDIR"
| Status | Meaning |
|---|---|
| pending | Not started yet |
| blocked | Waiting on dependency |
| running | Codex session active |
| stuck | Needs intervention (auto-heal) |
| error | Failed, needs retry |
| complete | Done, ready for PR |
| pr_open | PR created |
| merged | PR merged |
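The heartbeat script prints result lines such as `DONE:task-t1` or `STUCK:task-t1:waiting_for_input`, and the manifest stores one of the statuses in the table. One possible mapping between the two is sketched below. `status_from_check` is a hypothetical helper, and the mapping itself is an assumption read off the names rather than something the skill specifies:

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a check_session result line into a manifest status.
# The DONE/ERROR/STUCK/RUNNING prefixes are the ones check_progress.sh prints.
status_from_check() {
  case "$1" in
    DONE:*)    echo "complete" ;;
    ERROR:*)   echo "error" ;;
    STUCK:*)   echo "stuck" ;;
    RUNNING:*) echo "running" ;;
    *)
      echo "unrecognized result: $1" >&2
      return 1
      ;;
  esac
}

status_from_check "STUCK:task-t1:waiting_for_input"   # prints: stuck
```

The `blocked`, `pr_open`, and `merged` statuses are not derivable from pane output and would have to be set by the orchestrator from dependency and PR state.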
{
  "project": "nuri-security-framework",
  "repo": "jdrhyne/nuri-security-framework",
  "phases": [
    {
      "name": "Phase 1: Critical",
      "tasks": [
        {"id": "t1", "issue": 1, "files": ["ceo_root_manager.js"], "dependsOn": []},
        {"id": "t2", "issue": 2, "files": ["ceo_root_manager.js"], "dependsOn": ["t1"]},
        {"id": "t3", "issue": 3, "files": ["workspace_validator.js"], "dependsOn": []}
      ]
    },
    {
      "name": "Phase 2: High",
      "tasks": [
        {"id": "t4", "issue": 4, "files": ["kill_switch.js", "container_executor.js"], "dependsOn": []},
        {"id": "t5", "issue": 5, "files": ["kill_switch.js"], "dependsOn": ["t4"]},
        {"id": "t6", "issue": 6, "files": ["ceo_root_manager.js"], "dependsOn": ["t2"]},
        {"id": "t7", "issue": 7, "files": ["container_executor.js"], "dependsOn": []},
        {"id": "t8", "issue": 8, "files": ["container_executor.js", "egress_proxy.js"], "dependsOn": ["t7"]}
      ]
    }
  ]
}
Parallel execution in Phase 1:
Parallel execution in Phase 2:
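As a rough illustration of how `dependsOn` gates Phase 2, the sketch below checks each Phase 2 task from the example manifest against the set of completed task ids. `deps_satisfied` is a hypothetical helper, and the id/dependency pairs are copied by hand from the manifest rather than parsed from it. It considers `dependsOn` only and does not check for file overlap:

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if every id in $1 (space-separated dependencies)
# appears in $2 (space-separated completed task ids).
deps_satisfied() {
  local dep
  for dep in $1; do
    case " $2 " in
      *" $dep "*) ;;     # dependency already complete, keep checking
      *) return 1 ;;     # at least one dependency is still outstanding
    esac
  done
  return 0
}

completed="t1 t2 t3"   # assumption: all Phase 1 tasks have finished

# "id:deps" pairs copied from Phase 2 of the example manifest
for entry in "t4:" "t5:t4" "t6:t2" "t7:" "t8:t7"; do
  id="${entry%%:*}"
  deps="${entry#*:}"
  if deps_satisfied "$deps" "$completed"; then
    echo "ready: $id"
  else
    echo "waiting: $id"
  fi
done
```

With Phase 1 complete, this reports t4, t6, and t7 as ready, while t5 and t8 wait on t4 and t7 respectively.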
--model gpt-5.2-codex-high
When using codex exec --full-auto, the sandbox:
git push fails with "Could not resolve host"
~/nuri_workspace
The heartbeat should check for:
username@hostname path %, worker is done
git log @{u}.. --oneline shows commits not on remote
When detected, the orchestrator (not the worker) should:
gh pr create
# In heartbeat, for each task:
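cd /tmp/orchestrator-*/task-tN
# Treat the worker as finished when the last non-empty pane line ends in a prompt character
if tmux -S "$SOCKET" capture-pane -p -t "task-tN" | grep -v '^[[:space:]]*$' | tail -n 1 | grep -qE '(%|\$|❯) ?$'; then
  # Worker finished, check for unpushed work
  if git log @{u}.. --oneline | grep -q .; then
    git push -u origin HEAD
    gh pr create --title "$(git log --format=%s -1)" --body "Closes #N" --base main
  fi
fi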
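The "shell prompt visible" test is a heuristic, so it can help to isolate it in a function that can be exercised without a live tmux session. A minimal sketch, assuming prompts end in `%`, `$`, or `❯` with an optional trailing space; `pane_shows_prompt` is a hypothetical name:

```shell
#!/usr/bin/env bash
# Hypothetical helper: read captured pane text on stdin and succeed if the
# last non-empty line ends in a prompt character (%, $, or ❯), optionally
# followed by one space.
pane_shows_prompt() {
  local last
  last=$(grep -v '^[[:space:]]*$' | tail -n 1)
  case "$last" in
    *%|*"% "|*\$|*"\$ "|*❯|*"❯ ") return 0 ;;
    *) return 1 ;;
  esac
}

# Example with canned pane text instead of a live session
printf 'Successfully pushed\nuser@host task-t1 %% \n\n' | pane_shows_prompt && echo "worker idle"
```

In the heartbeat it would be fed from `tmux -S "$SOCKET" capture-pane -p -t "task-tN"`. Output that happens to end in one of these characters, such as a progress line ending in `100%`, will be misread as a prompt, so the unpushed-commit check remains the real gate.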