drive-process

Use when you need to drive a CLI program through command-and-response interaction via wsh. Examples: "run a build command and check the output", "interact with an installer that asks questions", "execute a sequence of shell commands and handle errors".

Quality

64%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Securityby

Risky

Do not use without reviewing

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/drive-process/SKILL.md

IMPORTANT: EXECUTION CONTEXT This skill describes what to do — domain patterns and decision-making. It does NOT describe how to call the API.

If you have wsh_* tools (check your toolkit for wsh_send_input, wsh_get_screen, etc.): use them directly. Operation names in this skill generally map to tool names (e.g., "send input" → wsh_send_input). When in doubt, list your available wsh_* tools.

If you do NOT have wsh_* tools: you are in HTTP/curl fallback mode. DO NOT GUESS endpoints or CLI subcommands. Load the full API reference first: search your workspace for skills/core/ and read SKILL.md. It contains every endpoint with working curl examples and a bootstrap sequence.

Quick bootstrap: curl -sf --unix-socket ${XDG_RUNTIME_DIR:-/tmp}/wsh/default.http.sock http://localhost/health — if that fails: wsh server -L agent-$$ --ephemeral & and retry.

wsh:drive-process — Driving CLI Programs

You're operating a terminal programmatically. You send input, wait for output to settle, read the screen, and decide what to do next. This skill teaches you the patterns and pitfalls.

The Loop

Every interaction follows the same shape:

Send input — a command, a response to a prompt, a keystroke
Wait for idle — output settles, suggesting the program may be idle. Choose your timeout based on what you expect:
- Fast commands (ls, cat, echo): 500-1000ms
- Build/install commands: 3000-5000ms
- Network operations: 2000-3000ms Idle is a hint, not a guarantee. The program may still be working — it just hasn't produced output recently.
Read the screen — see what happened
Decide — did the command succeed? Is there a prompt waiting for input? Did something go wrong? Act accordingly.

When re-polling idle (e.g., the command isn't done yet), pass back the generation from the previous response as last_generation to avoid busy-loop storms. Or use fresh=true for simplicity.

Sending a Command

Always include a newline to "press Enter":

send input: npm install\n

Without the trailing \n, you've typed the text but haven't submitted it. Sometimes that's what you want (e.g., building up a command before sending), but usually you want the newline.

Reading the Result

After waiting for idle, read the screen. Prefer plain format when you just need text content. Use styled when formatting matters (e.g., distinguishing error output highlighted in red).

If the output is long, it may have scrolled off screen. Use scrollback to get the full history.

Handling Interactive Prompts

Many programs ask questions and wait for a response. After reading the screen, look for patterns like:

[Y/n] or [y/N] — yes/no confirmation
Password: or Enter passphrase: — credential prompts
> or ? — interactive selection (fzf, inquirer, etc.)
(yes/no) — full-word confirmation (e.g., SSH host verification)
Press any key to continue

Respond naturally — send the appropriate input:

send input: y\n
send input: yes\n

For password prompts, note that the terminal will not echo your input back. The screen will look unchanged after you type. Wait for idle after sending — the program will advance.

Control Characters

Emergency exits and special actions: Ctrl+C (interrupt), Ctrl+D (EOF/exit), Ctrl+Z (suspend), Ctrl+L (clear screen), Ctrl+U (clear line), Escape.

If a command hangs, try Ctrl+C first. If unresponsive, Ctrl+Z to suspend then kill %1.

Detecting Success and Failure

After reading the screen, look for signals:

Success indicators:

A fresh shell prompt ($, #, >) on the last line
Explicit success messages ("done", "completed", "ok")
Exit code 0 if visible

Failure indicators:

Words like "error", "failed", "fatal", "denied", "not found"
Stack traces or tracebacks
A shell prompt after unexpectedly short output
Non-zero exit codes

When in doubt, check the exit code explicitly:

send input: echo $?\n

A 0 means the previous command succeeded. Anything else is a failure.

Long-Running Commands

Some commands run for minutes or longer — builds, downloads, test suites. Waiting for idle will return when output pauses, but the command may not be done.

Strategies:

Poll in a loop. Wait for idle, read the screen, check if a shell prompt has returned. If not, wait again:

wait for idle (timeout: 5000ms)
read screen
# No prompt yet? Wait again.

Use scrollback for full output. Long commands produce output that scrolls off screen. After the command finishes, read scrollback to get everything:

read scrollback (offset: 0, limit: 500)

Don't set unreasonably long idle timeouts. A timeout_ms=30000 means you'll wait 30 seconds of silence before getting a response. Prefer shorter timeouts with repeated polls — it lets you observe intermediate progress and react if something goes wrong.

Common Patterns

Chained Commands

When you need to run several commands in sequence, you have two options. Run them as separate send/wait/read cycles when you need to inspect output between steps:

# Step 1
send: cd /project
wait, read — verify directory exists

# Step 2
send: npm install
wait, read — check for errors

# Step 3
send: npm test
wait, read — check results

Or chain with && when intermediate output doesn't matter:

send: cd /project && npm install && npm test
wait, read — check final result

Prefer separate cycles. They give you the chance to detect problems early and adjust.

Piped Commands

Pipes work naturally. Send the full pipeline:

send: grep -r "TODO" src/ | wc -l

Background Processes

If you start a background process (&), it won't block the shell prompt. But its output may interleave with future commands. Consider redirecting output:

send: ./long-task.sh > /tmp/task.log 2>&1 &

Then check on it later:

send: cat /tmp/task.log

Pagers

Commands like git log, man, or less enter a pager that waits for keyboard navigation. If you just need the content, bypass the pager:

send: git log --no-pager
send: PAGER=cat man ls

If you're already stuck in a pager, press q to exit:

send: q

Heredocs and Multi-Line Input

To write multi-line content, use heredocs:

send: cat > /tmp/config.yaml << 'EOF'\n
send: key: value\n
send: other: thing\n
send: EOF\n

Pitfalls

Don't skip the wait

It's tempting to send input immediately after the previous command. Don't. If the shell hasn't finished processing, your input may land in the wrong place — or be swallowed entirely. Always wait for idle before sending the next input.

Don't assume the screen is everything

The screen shows only the last N lines (typically 24 rows). A command that produced 500 lines of output will have 476 lines in scrollback. If you need full output, read scrollback.

Watch for prompts you didn't expect

Installers, package managers, and system tools love to ask surprise questions. If you read the screen and see no shell prompt but also no obvious output-in-progress, look for a prompt waiting for your response.

Destructive commands

You are operating a real terminal on a real machine. rm, DROP TABLE, git push --force — these do real damage. Before running destructive commands:

Confirm with the human via overlay, panel, or input capture
Double-check paths and arguments
Prefer dry-run flags when available (--dry-run, --whatif, -n)

Knowing when to give up

If a command is stuck and not responding to Ctrl+C, don't hammer it with more input. Strategies in order:

Send Ctrl+C
Wait a moment, try Ctrl+C again
Send Ctrl+Z to suspend, then kill %1
Tell the human what's happening and ask for help

Shell state persists

You're in a real shell session. Environment variables you set, directories you cd into, background jobs you spawn — they all persist. Be mindful of the state you leave behind.

Repository: deepgram/wsh
Commit: 4863aaf

Last updated: 10 days ago
Created: 10 days ago

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.