Diagnoses and fixes conflicts between Tailscale and proxy/VPN tools (Shadowrocket, Clash, Surge) on macOS. Covers five conflict layers - (1) route hijacking, (2) HTTP proxy env var interception, (3) system proxy bypass, (4) SSH ProxyCommand double tunneling, and (5) VM/container runtime proxy propagation (OrbStack/Docker). Includes SOP for remote development via SSH tunnels with proxy-safe Makefile patterns. Use when Tailscale ping works but SSH/HTTP times out, when browser returns 503 but curl works, when git push fails with "failed to begin relaying via HTTP", when Docker pull times out behind TUN/VPN, when setting up Tailscale SSH to WSL instances, or when bootstrapping remote dev environments over Tailscale.
Install with the Tessl CLI:

```bash
npx tessl i github:fernandezbaptiste/claude-code-skills --skill tunnel-doctor92
```
Diagnose and fix conflicts when Tailscale coexists with proxy/VPN tools on macOS, with specific guidance for SSH access to WSL instances.
Proxy/VPN tools on macOS create conflicts at five independent layers. Layers 1-3 affect Tailscale connectivity; Layer 4 affects SSH git operations; Layer 5 affects VM/container runtimes:
| Layer | What breaks | What still works | Root cause |
|---|---|---|---|
| 1. Route table | Everything (SSH, curl, browser) | tailscale ping | tun-excluded-routes adds en0 route overriding Tailscale utun |
| 2. HTTP env vars | curl, Python requests, Node.js fetch | SSH, browser | http_proxy set without NO_PROXY for Tailscale |
| 3. System proxy (browser) | Browser only (HTTP 503) | SSH, curl (both with/without proxy) | Browser uses VPN system proxy; DIRECT rule routes via Wi-Fi, not Tailscale utun |
| 4. SSH ProxyCommand double tunnel | git push/pull (intermittent) | ssh -T (small data) | connect -H creates HTTP CONNECT tunnel redundant with Shadowrocket TUN; landing proxy drops large/long-lived transfers |
| 5. VM/Container proxy propagation | docker pull, docker build | Host curl, running containers | VM runtime (OrbStack/Docker Desktop) auto-injects or caches proxy config; removing proxy makes it worse (VM traffic via TUN → TLS timeout) |
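Before matching a specific scenario, a combined triage pass can surface which layer is misbehaving. This is an illustrative sketch, not part of the original SOP; the Tailscale IP is a placeholder, and the `scutil`/`docker` checks assume macOS with Docker installed:

```bash
# Hedged triage sketch — run each layer's quick check in one pass (macOS).
TS_IP="100.64.0.1"   # placeholder: replace with a real Tailscale peer IP
echo "— Layer 1 (route table):";   route -n get "$TS_IP" | grep -E 'gateway|interface'
echo "— Layer 2 (HTTP env vars):"; env | grep -i proxy
echo "— Layer 3 (system proxy):";  scutil --proxy | grep -E 'HTTPEnable|HTTPProxy|HTTPPort'
echo "— Layer 4 (SSH config):";    grep -A3 'Host github.com' ~/.ssh/config 2>/dev/null
echo "— Layer 5 (Docker proxy):";  docker info 2>/dev/null | grep -i proxy
```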
Determine which scenario applies:
- Browser returns 503 but curl and SSH both work → System proxy bypass conflict (Step 2C)
- App redirects to localhost and the browser can't follow → SSH tunnel needed (Step 2D)
- make status / scripts that curl localhost fail when a proxy is set → localhost proxy interception (Step 2E)
- git push/pull fails with FATAL: failed to begin relaying via HTTP → SSH double tunnel (Step 2F)
- docker pull fails with TLS handshake timeout, or docker build can't fetch base images → VM/container proxy propagation (Step 2G)
- git clone fails with Connection closed by 198.18.x.x → TUN DNS hijack for SSH (Step 2H)
- SSH returns operation not permitted → Tailscale SSH config issue (Step 4)
- be-child ssh exits with code 1 → WSL snap sandbox issue (Step 5)

Key distinctions:

- SSH does not use the http_proxy/NO_PROXY env vars. If SSH works but HTTP doesn't → Layer 2.
- curl uses the http_proxy env var, NOT the system proxy. The browser uses the system proxy (set by the VPN). If curl works but the browser doesn't → Layer 3.
- tailscale ping works but regular ping doesn't → Layer 1 (route table corrupted).
- ssh -T git@github.com works but git push fails intermittently → Layer 4 (double tunnel).
- curl https://... works but docker pull times out → Layer 5 (VM proxy propagation).
- DNS returns 198.18.x.x virtual IPs → TUN DNS hijack (Step 2H).

Check if proxy env vars are intercepting Tailscale HTTP traffic:
```bash
env | grep -i proxy
```

Broken output — a proxy is set but NO_PROXY doesn't exclude Tailscale:

```text
http_proxy=http://127.0.0.1:1082
https_proxy=http://127.0.0.1:1082
NO_PROXY=localhost,127.0.0.1        ← Missing Tailscale!
```

Fix — add the Tailscale MagicDNS domain + CIDR to NO_PROXY:
```bash
export NO_PROXY="localhost,127.0.0.1,.ts.net,100.64.0.0/10,192.168.*,10.*,172.16.*"
```

| Entry | Covers | Why |
|---|---|---|
| `.ts.net` | MagicDNS domains (`host.tailnet.ts.net`) | Matched before DNS resolution |
| `100.64.0.0/10` | Tailscale IPs (100.64.* – 100.127.*) | Precise CIDR, no public IP false positives |
| `192.168.*,10.*,172.16.*` | RFC 1918 private networks | LAN should never be proxied |
Two layers complement each other: .ts.net handles domain-based access, 100.64.0.0/10 handles direct IP access.
NO_PROXY syntax pitfalls — see references/proxy_conflict_reference.md for the compatibility matrix.
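Because clients differ in how they parse NO_PROXY (some match literal suffixes only, others understand CIDR), a direct behavioral test is safer than assuming. A hedged check, with the host and port placeholders as used below:

```bash
# If the verbose output shows a CONNECT to 127.0.0.1:1082, the NO_PROXY
# entry is not being honored and you may need an explicit host/IP entry.
NO_PROXY=".ts.net,100.64.0.0/10" curl -sv --connect-timeout 5 \
  http://<host>.ts.net:<port>/health 2>&1 | grep -iE 'connect|trying' | head -3
```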
Verify the fix:
```bash
# Both must return HTTP 200:
NO_PROXY="...(new value)..." curl -s --connect-timeout 5 http://<host>.ts.net:<port>/health -w "HTTP %{http_code}\n"
NO_PROXY="...(new value)..." curl -s --connect-timeout 5 http://<tailscale-ip>:<port>/health -w "HTTP %{http_code}\n"
```

Then persist in your shell config (~/.zshrc or ~/.bashrc).
Check if a proxy tool hijacked the Tailscale CGNAT range:
```bash
route -n get <tailscale-ip>
```

Healthy output — traffic goes through the Tailscale interface:

```text
destination: 100.64.0.0
  interface: utun7         # Tailscale interface (utunN varies)
```

Broken output — a proxy hijacked the route:

```text
destination: 100.64.0.0
    gateway: 192.168.x.1   # Default gateway
  interface: en0           # Physical interface, NOT Tailscale
```

Confirm with the full route table:

```bash
netstat -rn | grep 100.64
```

Two competing routes indicate a conflict:

```text
100.64/10   192.168.x.1   UGSc   en0     ← Proxy added this (wins)
100.64/10   link#N        UCSI   utun7   ← Tailscale route (loses)
```

Root cause: On macOS, UGSc (Static Gateway) takes priority over UCSI (Cloned Static Interface) for the same prefix length.
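A small helper along these lines (an illustrative sketch, not part of the original SOP) can flag the hijack without reading raw route output; the probe address is an arbitrary IP inside the CGNAT range:

```bash
# Warn if the Tailscale CGNAT range is routed via a physical interface.
iface=$(route -n get 100.64.0.1 2>/dev/null | awk '/interface:/ {print $2}')
case "$iface" in
  utun*) echo "OK: 100.64.0.0/10 routed via $iface (Tailscale)" ;;
  *)     echo "CONFLICT: routed via ${iface:-unknown} — a proxy tool likely hijacked the route" ;;
esac
```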
Symptom: Browser shows HTTP 503 for http://<tailscale-ip>:<port>, but both curl --noproxy '*' and curl (with proxy env var) return 200. SSH also works.
Root cause: The browser uses the system proxy configured by the VPN profile (Shadowrocket/Clash/Surge). The proxy matches IP-CIDR,100.64.0.0/10,DIRECT and tries to connect directly — but "directly" means via the Wi-Fi interface (en0), NOT through Tailscale's utun interface. The proxy process itself doesn't have a route to Tailscale IPs, so the connection fails with 503.
Diagnosis:
```bash
# curl with the proxy env var works (curl connects to the proxy port, but traffic flows differently)
curl -s -o /dev/null -w "%{http_code}" http://<tailscale-ip>:<port>/
# → 200

# Browser gets 503 because it goes through the VPN system proxy, not the http_proxy env var
```

Fix — add the Tailscale CGNAT range to skip-proxy in the proxy tool config:
For Shadowrocket, in [General]:
```text
skip-proxy = 192.168.0.0/16, 10.0.0.0/8, 172.16.0.0/12, 100.64.0.0/10, localhost, *.local, captive.apple.com
```

skip-proxy tells the system "bypass the proxy entirely for these addresses." The browser then connects directly through the OS network stack, where Tailscale's routing table correctly handles the traffic.
Why skip-proxy works but tun-excluded-routes doesn't:
- skip-proxy: bypasses the HTTP proxy layer only. Traffic still flows through the TUN interface and Tailscale's utun handles it. Safe.
- tun-excluded-routes: removes the CIDR from TUN routing entirely. This creates a competing en0 route that overrides Tailscale. Breaks everything.

Symptom: The dev server runs on a remote machine (e.g., a Mac mini reached via Tailscale). You access http://<tailscale-ip>:3010 in the browser. Login/signup works, but after auth the app redirects to http://localhost:3010/, which fails — localhost on your machine isn't running the dev server.
Root cause: The app's APP_URL (or equivalent) is set to http://localhost:3010. Auth libraries (Better-Auth, NextAuth, etc.) use this URL for callback redirects. Changing APP_URL to the Tailscale IP introduces Shadowrocket proxy conflicts and breaks local development on the remote machine.
Fix — SSH local port forwarding. This avoids all three conflict layers entirely:
```bash
# Forward local port 3010 to the remote machine's localhost:3010
ssh -NL 3010:localhost:3010 <tailscale-ip>

# Or with autossh for auto-reconnect (recommended for long sessions)
autossh -M 0 -f -N -L 3010:localhost:3010 \
  -o "ServerAliveInterval=30" \
  -o "ServerAliveCountMax=3" \
  -o "ExitOnForwardFailure=yes" \
  <tailscale-ip>
```

Now access http://localhost:3010 in the browser. Auth redirects to localhost:3010 → tunnel → remote dev server → works correctly.
Why this is the best approach:
- No .env changes needed — APP_URL=http://localhost:3010 works everywhere
- No proxy conflicts — localhost is always in skip-proxy
- Install autossh: brew install autossh (macOS) or apt install autossh (Linux)
Kill background tunnel: pkill -f 'autossh.*<tailscale-ip>'
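To confirm the background tunnel is actually up before relying on it, a quick hedged check (port 3010 as used above):

```bash
# Is the forwarded port listening locally?
nc -z localhost 3010 && echo "tunnel up" || echo "tunnel down"
# Or inspect the listener directly:
lsof -nP -iTCP:3010 -sTCP:LISTEN
```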
Symptom: Makefile targets or scripts that curl localhost (health checks, warmup routes) fail or timeout when http_proxy is set globally in the shell.
Root cause: http_proxy=http://127.0.0.1:1082 is set in ~/.zshrc but no_proxy doesn't include localhost. All curl commands send localhost requests through the proxy.
Fix — add --noproxy localhost to all localhost curl commands in scripts:
```make
# WRONG — fails when http_proxy is set
	@curl -sf http://localhost:9000/minio/health/live && echo "OK"

# CORRECT — always bypasses the proxy for localhost
	@curl --noproxy localhost -sf http://localhost:9000/minio/health/live && echo "OK"
```

Alternatively, set no_proxy globally in ~/.zshrc:
```bash
export no_proxy=localhost,127.0.0.1
```

Symptom: ssh -T git@github.com succeeds consistently, but git push or git pull fails intermittently with:
```text
FATAL: failed to begin relaying via HTTP.
Connection closed by UNKNOWN port 65535
```

Small operations (auth, fetch metadata) work; large data transfers fail.
Root cause: When Shadowrocket TUN is active, it already routes all TCP traffic through its VPN tunnel. If SSH config also uses ProxyCommand connect -H, data flows through two proxy layers — the landing proxy drops large/long-lived HTTP CONNECT connections.
Diagnosis:
```bash
# 1. Confirm Shadowrocket TUN is active
ifconfig | grep '^utun'

# 2. Check SSH config for ProxyCommand
grep -A5 'Host github.com' ~/.ssh/config

# 3. Confirm: removing ProxyCommand fixes push
GIT_SSH_COMMAND="ssh -o ProxyCommand=none" git push origin main
```

Fix — remove ProxyCommand and switch to ssh.github.com:443. See references/proxy_conflict_reference.md § SSH ProxyCommand and Git Operations for the full SSH config, why port 443 helps, and fallback options when the VPN is off.
Symptom: docker pull or docker build fails with net/http: TLS handshake timeout or Internal Server Error from auth.docker.io, while host curl to the same URLs works fine.
Applies to: OrbStack, Docker Desktop, or any VM-based Docker runtime on macOS with Shadowrocket/Clash TUN active.
Root cause: VM-based Docker runtimes (OrbStack, Docker Desktop) run the Docker daemon inside a lightweight VM. The VM's outbound traffic takes a different network path than host processes:
```text
Host process (curl):  Process → TUN (Shadowrocket) → landing proxy → internet  ✅
VM process (Docker):  Docker daemon → VM bridge → host network → TUN → ???     ❌
```

The TUN handles host-originated traffic correctly but may drop or delay VM-bridged traffic (different TCP stack, MTU, keepalive behavior).
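One way to confirm the split is to issue the same request from the host and from inside the VM. This is a hedged sketch — it relies on a small curl image already being present locally, since pulling may itself be broken:

```bash
# Host path — originates on macOS, goes through the TUN directly:
curl -sI --connect-timeout 10 https://registry-1.docker.io/v2/ | head -1

# VM path — originates inside the Docker VM via the bridge:
docker run --rm curlimages/curl:latest \
  -sI --connect-timeout 10 https://registry-1.docker.io/v2/ | head -1
```

If the first succeeds and the second hangs or times out, the problem is the VM's network path, not the registry.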
Three sub-problems and their fixes:
OrbStack's network_proxy: auto reads http_proxy from the shell environment and writes it to ~/.orbstack/config/docker.json. Crucially, orbctl config set network_proxy none does NOT clean up docker.json — the cached proxy persists.
Diagnosis:
# OrbStack config says "none" but Docker still shows proxy
orbctl config get network_proxy # → "none"
docker info | grep -i proxy # → HTTP Proxy: http://127.0.0.1:1082 ← stale!
# The real source of truth:
cat ~/.orbstack/config/docker.json
# → {"proxies": {"http-proxy": "http://127.0.0.1:1082", ...}} ← cached!Fix — DON'T remove the proxy. Instead, add precise no-proxy to prevent localhost interception while keeping the proxy as the VM's outbound channel:
```bash
# Write corrected config (keeps proxy, adds no-proxy for local traffic)
python3 -c "
import json
config = {
    'proxies': {
        'http-proxy': 'http://127.0.0.1:1082',
        'https-proxy': 'http://127.0.0.1:1082',
        'no-proxy': 'localhost,127.0.0.1,::1,192.168.128.0/24,100.64.0.0/10,host.internal,*.local'
    }
}
json.dump(config, open('$HOME/.orbstack/config/docker.json', 'w'), indent=2)
"

# Full restart (not just the docker engine)
orbctl stop && sleep 3 && orbctl start
```

Why NOT remove the proxy: when TUN is active, removing the Docker proxy means VM traffic goes directly through the bridge → TUN path, which causes TLS handshake timeouts. The proxy provides a working outbound channel because OrbStack maps the host's 127.0.0.1 into the VM.
| Docker config | Traffic path | Result |
|---|---|---|
| Proxy ON, no no-proxy | Docker → proxy → TUN → internet | Docker Hub ✅, localhost probes ❌ |
| Proxy OFF | Docker → VM bridge → host → TUN → internet | TLS timeout ❌ |
| Proxy ON + no-proxy | External: Docker → proxy → internet; Local: Docker → direct | Both work ✅ |
Deploy scripts that curl localhost inside the Docker environment will route through the proxy. Fix by adding NO_PROXY at the script level:
```bash
# In deploy.sh or similar scripts:
_local_bypass="localhost,127.0.0.1,::1"
if [[ -n "${NO_PROXY:-}" ]]; then
  export NO_PROXY="${_local_bypass},${NO_PROXY}"
else
  export NO_PROXY="${_local_bypass}"
fi
export no_proxy="$NO_PROXY"

# Use 127.0.0.1 instead of localhost in probe URLs (some proxy implementations
# only match the exact string "localhost" in no-proxy, not the resolved IP)
curl http://127.0.0.1:3001/health   # ✅ bypasses proxy
curl http://localhost:3001/health   # ❌ may still go through the proxy
```

Verify the fix:
```bash
# Docker proxy check (should show proxy + no-proxy)
docker info | grep -iE "proxy|No Proxy"

# Pull test
docker pull --quiet hello-world

# Local probe test
curl -s http://127.0.0.1:3001/health
```

Symptom: git clone/fetch/push fails with Connection closed by 198.18.0.x port 443. ssh -T git@github.com may also fail. DNS resolution returns 198.18.x.x addresses instead of real IPs.
Root cause: Shadowrocket TUN intercepts all DNS queries and returns virtual IPs in the 198.18.0.0/15 range. It then routes traffic to these virtual IPs through the TUN for protocol-aware proxying. HTTP/HTTPS works because the landing proxy understands these protocols, but SSH-over-443 (used by GitHub) gets mishandled — the TUN sees port 443 traffic, expects HTTPS, and drops the SSH handshake.
Diagnosis:
```bash
# DNS returns a virtual IP (TUN hijack)
nslookup ssh.github.com
# → 198.18.0.26  ← Shadowrocket virtual IP, NOT a real GitHub IP

# Direct IP works (bypasses the DNS hijack)
ssh -o HostName=140.82.112.35 -o Port=443 git@github.com
# → "Hi user! You've successfully authenticated"
```

Fix — use a direct IP in the SSH config to bypass the DNS hijack:
```text
# ~/.ssh/config
Host github.com
    # GitHub SSH server real IP (bypasses the TUN DNS hijack).
    # Note: ssh_config does not allow trailing comments after a value.
    HostName 140.82.112.35
    Port 443
    User git
    ServerAliveInterval 60
    ServerAliveCountMax 3
    IdentityFile ~/.ssh/id_ed25519
```

GitHub SSH server IPs (as of 2026, verify with dig +short ssh.github.com @8.8.8.8):
- 140.82.112.35 (primary)
- 140.82.112.36 (alternate)

Trade-off: hardcoded IPs break if GitHub changes them. Monitor ssh -T git@github.com — if it starts failing, update the IP. A cron job can automate this:
```bash
# Weekly check (add to crontab)
0 9 * * 1 dig +short ssh.github.com @8.8.8.8 | head -1 > /tmp/github-ssh-ip.txt
```

Alternative (if you control the Shadowrocket rules): add GitHub SSH IPs to a DIRECT rule so the TUN passes them through without protocol inspection:
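A hedged companion sketch that compares the freshly resolved IP against the one pinned in ~/.ssh/config (assumes the `Host github.com` block shown above):

```bash
# Alert when the pinned GitHub SSH IP no longer matches reality.
current=$(dig +short ssh.github.com @8.8.8.8 | head -1)
pinned=$(awk '/^Host github.com/{f=1} f && /HostName/{print $2; exit}' ~/.ssh/config)
if [ -n "$current" ] && [ "$current" != "$pinned" ]; then
  echo "GitHub SSH IP changed: config=$pinned, resolved=$current — update ~/.ssh/config"
fi
```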
```text
IP-CIDR,140.82.112.0/24,DIRECT
IP-CIDR,192.30.252.0/22,DIRECT
```

This is more robust but requires proxy tool config access.
Identify the proxy tool and apply the appropriate fix. See references/proxy_conflict_reference.md for detailed instructions per tool.
Key principle: Do NOT use tun-excluded-routes to exclude 100.64.0.0/10. This causes the proxy to add an en0 route that overrides Tailscale. Instead, let the traffic enter the proxy TUN and use a DIRECT rule to pass it through.
Universal fix — add this rule to any proxy tool:
```text
IP-CIDR,100.64.0.0/10,DIRECT
IP-CIDR,fd7a:115c:a1e0::/48,DIRECT
```

After applying fixes, verify:
```bash
route -n get <tailscale-ip>
# Should show the Tailscale utun interface, NOT en0
```

If SSH connects but returns operation not permitted, the Tailscale ACL may require browser authentication for each connection.
In the Tailscale admin console's ACL editor, ensure the SSH section uses "action": "accept":
"ssh": [
{
"action": "accept",
"src": ["autogroup:member"],
"dst": ["autogroup:self"],
"users": ["autogroup:nonroot", "root"]
}
]Note: "action": "check" requires browser authentication each time. Change to "accept" for non-interactive SSH access.
If SSH connects and ACL passes but fails with be-child ssh exit code 1 in tailscaled logs, the snap-installed Tailscale has sandbox restrictions preventing SSH shell execution.
Diagnosis — check WSL tailscaled logs:
```bash
# For snap installs:
sudo journalctl -u snap.tailscale.tailscaled -n 30 --no-pager

# For apt installs:
sudo journalctl -u tailscaled -n 30 --no-pager
```

Look for:
```text
access granted to user@example.com as ssh-user "username"
starting non-pty command: [/snap/tailscale/.../tailscaled be-child ssh ...]
Wait: code=1
```

Fix — replace the snap install with the apt installation:
```bash
# Remove the snap version
sudo snap remove tailscale

# Install the apt version
curl -fsSL https://tailscale.com/install.sh | sh

# Start with SSH enabled
sudo tailscale up --ssh
```

Important: the new installation may be assigned a different Tailscale IP. Check with tailscale status --self.
Run a complete connectivity test:
```bash
# 1. Check the route is correct
route -n get <tailscale-ip>

# 2. Test TCP connectivity
nc -z -w 5 <tailscale-ip> 22

# 3. Test SSH
ssh -o ConnectTimeout=10 -o StrictHostKeyChecking=no <user>@<tailscale-ip> 'echo SSH_OK && hostname && whoami'
```

All three must pass. If step 1 fails, revisit Step 3. If step 2 fails, check WSL sshd or the firewall. If step 3 fails, revisit Steps 4-5.
Proactive setup guide for remote development over Tailscale with proxy tools. Follow these steps before encountering problems.
Verify basic SSH connectivity first:

```bash
ssh <tailscale-ip> 'echo ok'
```

Any Makefile target that curls localhost must use --noproxy localhost. This is required because http_proxy is often set globally in ~/.zshrc (common in China), and Make inherits shell environment variables.
```make
## ── Health Checks ─────────────────────────────────────

status: ## Health check dashboard
	@echo "=== Dev Infrastructure ==="
	@docker exec my-postgres pg_isready -U postgres 2>/dev/null && echo "PostgreSQL: OK" || echo "PostgreSQL: FAIL"
	@curl --noproxy localhost -sf http://localhost:9000/minio/health/live >/dev/null 2>&1 && echo "MinIO: OK" || echo "MinIO: FAIL"
	@curl --noproxy localhost -sf http://localhost:3001/api/status >/dev/null 2>&1 && echo "API: OK" || echo "API: FAIL"

## ── Route Warmup ──────────────────────────────────────

warmup: ## Pre-compile key routes (run after the dev server is ready)
	@echo "Warming up dev server routes..."
	@echo -n "  /api/health → " && curl --noproxy localhost -s -o /dev/null -w '%{http_code} (%{time_total}s)\n' http://localhost:3010/api/health
	@echo -n "  /           → " && curl --noproxy localhost -s -o /dev/null -w '%{http_code} (%{time_total}s)\n' http://localhost:3010/
	@echo "Warmup complete."
```

Rules:
- Every curl http://localhost call MUST include --noproxy localhost
- Docker commands (docker exec) are unaffected by http_proxy — no fix needed
- redis-cli and pg_isready connect via TCP directly — no fix needed

Add these targets for remote development via Tailscale SSH tunnels:
```make
## ── Remote Development ────────────────────────────────

REMOTE_HOST ?= <tailscale-ip>
TUNNEL_FORWARD ?= -L 3010:localhost:3010

tunnel: ## SSH tunnel to remote machine (foreground)
	ssh -N $(TUNNEL_FORWARD) $(REMOTE_HOST)

tunnel-bg: ## SSH tunnel to remote machine (background, auto-reconnect)
	autossh -M 0 -f -N $(TUNNEL_FORWARD) \
		-o "ServerAliveInterval=30" \
		-o "ServerAliveCountMax=3" \
		-o "ExitOnForwardFailure=yes" \
		$(REMOTE_HOST)
	@echo "Tunnel running in background. Kill with: pkill -f 'autossh.*$(REMOTE_HOST)'"
```

Design decisions:
| Choice | Rationale |
|---|---|
| `?=` (conditional assign) | Allows override: `make tunnel REMOTE_HOST=100.x.x.x` |
| `TUNNEL_FORWARD` as a variable | Supports multi-port: `make tunnel TUNNEL_FORWARD="-L 3010:localhost:3010 -L 9000:localhost:9000"` |
| `autossh -M 0` | Disables autossh's own monitoring port; relies on ServerAliveInterval instead (more reliable through NAT) |
| `ExitOnForwardFailure=yes` | Fails immediately if the port is already bound, instead of silently running without a tunnel |
| Kill hint uses `autossh.*$(REMOTE_HOST)` | Precise pattern — won't accidentally kill other SSH sessions |
Install autossh: brew install autossh (macOS) or apt install autossh (Linux/WSL)
When the project requires multiple services (dev server + object storage + API gateway):
```bash
# Forward multiple ports in one tunnel
make tunnel TUNNEL_FORWARD="-L 3010:localhost:3010 -L 9000:localhost:9000 -L 3001:localhost:3001"

# Or define a project-specific default in the Makefile
TUNNEL_FORWARD ?= -L 3010:localhost:3010 -L 9000:localhost:9000
```

Each -L flag is independent. If one port is already bound locally, ExitOnForwardFailure=yes aborts the entire tunnel — fix the port conflict first.
SSH non-login shells don't load ~/.zshrc, so nvm/Homebrew tools and proxy env vars are unavailable. Prefix all remote commands with source ~/.zshrc 2>/dev/null;. See references/proxy_conflict_reference.md § SSH Non-Login Shell Pitfall for details and examples.
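A quick illustration of the difference (node via nvm is just an example of a tool that lives outside the default non-login PATH):

```bash
# Non-login shell: ~/.zshrc is not loaded, so nvm/Homebrew paths may be missing
ssh <tailscale-ip> 'node --version'                               # may fail: command not found
# Sourcing ~/.zshrc first restores PATH and proxy env vars
ssh <tailscale-ip> 'source ~/.zshrc 2>/dev/null; node --version'  # works
```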
For Makefile targets that run remote commands:
```make
REMOTE_CMD = ssh $(REMOTE_HOST) 'source ~/.zshrc 2>/dev/null; $(1)'

remote-status: ## Check remote dev server status
	$(call REMOTE_CMD,curl --noproxy localhost -sf http://localhost:3010/api/health && echo "OK" || echo "FAIL")
```

First-time setup on the remote machine:

```bash
# 1. Clone repo and install dependencies
ssh <tailscale-ip>
cd /path/to/project
git clone git@github.com:user/repo.git && cd repo
pnpm install # Add --registry https://registry.npmmirror.com if in China
# 2. Copy .env from local machine (run on local)
scp .env <tailscale-ip>:/path/to/project/repo/.env
# 3. Start Docker infrastructure
make up && make status
# 4. Run database migrations
bun run db:migrate
# 5. Start dev server
bun run dev
```

Daily workflow from the local machine:

```bash
# 1. Start tunnel
make tunnel-bg
# 2. Open browser
open http://localhost:3010
# 3. Auth, coding, testing — everything works as if local
# 4. When done, kill tunnel
pkill -f 'autossh.*<tailscale-ip>'
```

```text
Browser → localhost:3010 → SSH tunnel → Remote localhost:3010 → Dev server
                                 ↓
                  Auth redirects to localhost:3010
                                 ↓
         Browser follows redirect → same tunnel → works
```

The key insight: APP_URL=http://localhost:3010 in .env is correct for both local and remote development. The SSH tunnel makes the remote server's localhost accessible as the local machine's localhost. Auth callback redirects to localhost:3010 always resolve correctly.
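For concreteness, the relevant .env lines might look like this (key names are illustrative — use whatever your app and auth library actually read):

```bash
# .env — identical for local dev and tunneled remote dev
APP_URL=http://localhost:3010
AUTH_CALLBACK_URL=http://localhost:3010/api/auth/callback   # hypothetical key name
```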
Before starting remote development, verify:
- Tailscale connected on both machines: tailscale status
- SSH over Tailscale works: ssh <tailscale-ip> 'echo ok'
- Proxy tool [Rule] has IP-CIDR,100.64.0.0/10,DIRECT
- skip-proxy includes 100.64.0.0/10
- tun-excluded-routes does NOT include 100.64.0.0/10
- NO_PROXY includes .ts.net,100.64.0.0/10
- autossh installed: which autossh
- Makefile localhost curls use --noproxy localhost
- Remote health check works: ssh <ip> 'source ~/.zshrc 2>/dev/null; curl --noproxy localhost -sf http://localhost:3010/'
- Tunnel works: make tunnel-bg && curl -sf http://localhost:3010/