Workflow 1 adaptation for robotics and embodied AI. Orchestrates a robotics-aware literature survey, idea generation, novelty check, and critical review to go from a broad robotics direction to benchmark-grounded, simulation-first ideas. Use when the user says "robotics idea discovery", "机器人找idea" (find a robotics idea), "embodied AI idea", "机器人方向探索" (explore robotics directions), "sim2real 选题" (sim2real topic selection), or wants ideas for manipulation, locomotion, navigation, drones, humanoids, or general robot learning.
Orchestrate a robotics-specific idea discovery workflow for: $ARGUMENTS
This skill chains four sub-skills into a single automated pipeline:
/research-lit → /idea-creator (robotics framing) → /novelty-check → /research-review
   (survey)          (filter + pilot plan)         (verify novel)     (critical feedback)

But every phase must be grounded in robotics-specific constraints:
The goal is not to produce flashy demos. The goal is to produce ideas that are:
- sim-first — Prefer simulation or offline-log pilots before any hardware execution
- explicit approval only — Never assume physical robot access or approval
- gpt-5.4 — External reviewer model via a secondary Codex agent

Override inline, e.g.:
/idea-discovery-robot "bimanual manipulation" — only sim ideas, no real robot
or
/idea-discovery-robot "drone navigation" — focus on CoRL/RSS, 2 pilot ideas max
Follow the phases in order. Do not stop after a checkpoint unless the user explicitly asks you to stop or redirect.
If AUTO_PROCEED=true and the user does not respond, continue immediately to the next phase using the strongest sim-first, benchmark-grounded option.
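The phase order and checkpoint gating above can be modeled as a short sketch. This is purely illustrative: the slash commands are dispatched by the agent as prompts, not called as functions, and the `run_pipeline` name is hypothetical.

```python
# Hypothetical orchestration sketch of the four-phase chain.
# The slash commands are stand-ins for agent sub-skill invocations.
PHASES = ["/research-lit", "/idea-creator", "/novelty-check", "/research-review"]

def run_pipeline(direction, auto_proceed=True):
    transcript = []
    for command in PHASES:
        transcript.append(f'{command} "{direction}"')
        if not auto_proceed:
            break  # stop at the checkpoint and wait for the user
    return transcript
```

With AUTO_PROCEED enabled, all four phases run back to back; otherwise the pipeline pauses after the first checkpoint.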
Before generating ideas, extract or infer this Robotics Problem Frame from $ARGUMENTS and local project context:
If some fields are missing, make explicit assumptions and default to the cheapest sim-first setup that fits the direction.
Write this frame into working notes before moving on. Every later decision should reference it.
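One minimal way to capture the frame in working notes is a small structure; this is a sketch only, and every field name and default below is an assumption, not something prescribed by the sub-skills.

```python
# Hypothetical working-notes structure for the Robotics Problem Frame.
# All field names and defaults are illustrative.
from dataclasses import dataclass

@dataclass
class RoboticsProblemFrame:
    embodiment: str = "single-arm manipulator"          # assumed default
    task_family: str = "tabletop pick-and-place"        # assumed default
    observations: tuple = ("RGB-D", "proprioception")
    action_interface: str = "end-effector delta pose"
    available_assets: tuple = ()                        # simulators, datasets, harnesses
    constraints: tuple = ("sim-first", "no hardware without explicit approval")
```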
Invoke:
/research-lit "$ARGUMENTS — focus venues: CoRL, RSS, ICRA, IROS, RA-L, TRO, Science Robotics"

Then reorganize the findings using a robotics lens instead of a generic ML lens.
For each relevant paper, classify:
| Axis | Examples |
|---|---|
| Embodiment | single-arm, mobile manipulator, humanoid, drone, quadruped |
| Task | pick-place, insertion, navigation, locomotion, long-horizon rearrangement |
| Learning setup | RL, BC, IL, offline RL, world model, planning, diffusion policy |
| Observation | RGB, RGB-D, proprioception, tactile, language |
| Action abstraction | torque, joint velocity, end-effector delta pose, waypoint planner |
| Eval regime | pure sim, sim+real, real-only, offline benchmark |
| Benchmark | ManiSkill, RLBench, Isaac Lab, Habitat, Meta-World, CALVIN, LIBERO, custom |
| Metrics | success rate, collision rate, intervention count, path length, latency, energy |
| Main bottleneck | sample inefficiency, brittleness, reset cost, perception drift, sim2real gap |
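In practice, each surveyed paper becomes one row keyed by the axes above, and rows are grouped to expose gaps. The sketch below is illustrative only; the two entries are placeholders, not real papers.

```python
# Illustrative landscape-matrix sketch: one row per paper, grouped by
# (embodiment, benchmark) to surface underexplored combinations.
from collections import defaultdict

rows = [
    {"title": "placeholder A", "embodiment": "single-arm", "benchmark": "ManiSkill",
     "eval_regime": "pure sim", "bottleneck": "sample inefficiency"},
    {"title": "placeholder B", "embodiment": "quadruped", "benchmark": "Isaac Lab",
     "eval_regime": "sim+real", "bottleneck": "sim2real gap"},
]

matrix = defaultdict(list)
for row in rows:
    matrix[(row["embodiment"], row["benchmark"])].append(row["title"])
```

Cells that stay empty across many papers are candidate gaps for Phase 2.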
When refining the survey, prioritize:
Do not stop at "who got the best success rate." Explicitly identify underexplored embodiment/benchmark combinations and the bottlenecks shared across papers.
Checkpoint: Present the landscape to the user in robotics terms:
🤖 Robotics survey complete. I grouped the field by embodiment, benchmark, action interface, and sim2real setup.
Main gaps:
1. [...]
2. [...]
3. [...]
Should I generate ideas under this framing, or should I narrow to a specific robot / benchmark / modality?

Generate ideas only after the robotics frame is explicit.
Invoke the existing idea generator, but pass the Robotics Problem Frame and landscape matrix into the prompt so it does not produce generic ML ideas:
/idea-creator "$ARGUMENTS — robotics frame: [paste Robotics Problem Frame] — focus venues: CoRL, RSS, ICRA, IROS, RA-L — benchmark-specific ideas only — sim-first pilots — no real-robot execution without explicit approval — require failure metrics and baseline clarity"

Then rewrite and filter the output using the robotics-specific rules below.
Each candidate idea must include:
Prefer ideas that:
Downrank ideas that are mostly:
For each idea, reject or heavily downrank it if the benchmark is unclear, the required hardware is inaccessible without approval, the novelty is weak, or no fair evaluation is possible.
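The hard-filter step can be sketched as a small function. The rejection reasons mirror the "Eliminated Ideas" section of the report template; the field names and exact logic here are assumptions.

```python
# Minimal hard-filter sketch; field names are illustrative.
def reject_reasons(idea):
    reasons = []
    if not idea.get("benchmark"):
        reasons.append("benchmark unclear")
    if idea.get("pilot_type") == "real" and not idea.get("hardware_approved"):
        reasons.append("hardware inaccessible")
    if not idea.get("baselines"):
        reasons.append("no fair evaluation")
    return reasons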
Checkpoint: Present the ranked robotics ideas before novelty checking:
💡 Robotics ideas generated. Top candidates:
1. [Idea 1] — Embodiment: [...] — Benchmark: [...] — Pilot: sim/offline — Risk: LOW/MEDIUM/HIGH
2. [Idea 2] — Embodiment: [...] — Benchmark: [...] — Pilot: sim/offline — Risk: LOW/MEDIUM/HIGH
3. [Idea 3] — requires hardware / weak benchmark / high risk
Should I carry the top sim-first ideas into novelty checking and external review?
(If no response, I'll continue with the strongest benchmark-grounded ideas.)

For the top ideas, design a minimal validation package.
If the repository already contains a usable simulator, benchmark harness, or offline dataset pipeline, you may validate the top 1-3 ideas there. If not, do not force execution. Produce a concrete pilot plan instead.
By default, pilots should be one of: a simulation benchmark run, or an offline evaluation on logged data.
Only propose a real-robot pilot if the user explicitly wants that.
For each surviving idea, specify:
- Embodiment:
- Benchmark / simulator:
- Baselines:
- Pilot type: sim / offline / real
- Compute estimate:
- Human/operator time:
- Success metrics:
- Failure metrics:
- Safety concerns:
- What result would count as positive signal:
- What negative result would still be publishable:

Never auto-proceed to physical robot testing. If an idea needs hardware, label it "needs physical validation" and stop at the pilot plan. If no cheap sim/offline pilot exists, keep the idea in the report but label it high execution risk.
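The hardware-gating rule above can be sketched as follows; the function and field names are illustrative, and the labels come from this phase's rules.

```python
# Sketch of the hardware-gating rule: sim/offline pilots proceed,
# real-robot pilots never run automatically. Field names are assumptions.
def pilot_label(idea, hardware_approved=False):
    if idea.get("pilot_type") in ("sim", "offline"):
        return "run pilot"
    if hardware_approved:
        return "real-robot pilot (explicit approval)"
    if idea.get("cheap_sim_or_offline_alternative"):
        return "needs physical validation"
    return "needs physical validation (high execution risk)"
```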
After Phase 3, continue to Phase 4 even if you only produced a pilot plan rather than running a pilot. Lack of immediate execution is not a reason to stop the workflow.
For each top idea, run:
/novelty-check "[idea description with embodiment + task family + benchmark + sensor stack + controller/policy class + sim2real angle + target venues: CoRL/RSS/ICRA/IROS/RA-L]"

Robotics novelty checks must compare along embodiment, task family, benchmark, sensor stack, policy class, and sim2real angle, not just the method name.
Be especially skeptical of ideas that are just an existing ML method transplanted onto a new embodiment or benchmark without a robotics-specific insight.
If the method is not novel but the finding or evaluation protocol is, say that explicitly.
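That distinction can be made explicit with a tiny verdict sketch; the function name and wording are illustrative.

```python
# Illustrative verdict sketch: state explicitly when the method is not
# novel but the finding or evaluation protocol is.
def novelty_verdict(method_novel, finding_novel, protocol_novel):
    if method_novel:
        return "novel method"
    if finding_novel or protocol_novel:
        return "known method; novel finding or evaluation protocol"
    return "not novel"
```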
Invoke:
/research-review "[top idea with robotics framing, embodiment, benchmark, baselines, pilot plan, evaluation metrics, and sim2real/hardware risks — review as CoRL/RSS/ICRA reviewer]"

Frame the reviewer as a senior CoRL / RSS / ICRA reviewer. Ask them to focus on the evaluation protocol, baseline fairness, and sim2real/hardware risk.
Update the report with the reviewer's minimum viable evidence package.
Write or update IDEA_REPORT.md with a robotics-specific structure so it stays compatible with downstream workflows.
# Robotics Idea Discovery Report
**Direction**: $ARGUMENTS
**Date**: [today]
**Pipeline**: research-lit → idea-creator (robotics framing) → novelty-check → research-review
## Robotics Problem Frame
- Embodiment:
- Task family:
- Observation / action interface:
- Available assets:
- Constraints:
## Landscape Matrix
[grouped by embodiment, benchmark, and bottleneck]
## Ranked Ideas
### Idea 1: [title] — RECOMMENDED
- Embodiment:
- Benchmark / simulator:
- Bottleneck addressed:
- Pilot type: sim / offline / real
- Positive signal:
- Novelty:
- Reviewer score:
- Hardware risk:
- Next step:
## Eliminated Ideas
- [idea] — killed because benchmark unclear / hardware inaccessible / novelty weak / no fair evaluation
## Evidence Package for the Top Idea
- Required baselines:
- Required metrics:
- Required failure cases:
- Whether real robot evidence is mandatory:
## Next Steps
- [ ] Implement sim-first pilot
- [ ] Run /novelty-check on the final idea wording
- [ ] Only after approval: consider hardware validation

After this workflow identifies a strong robotics idea:
/idea-discovery-robot "direction" ← you are here
implement sim-first pilot
/run-experiment ← if infrastructure exists
/auto-review-loop "top robotics idea"

If no simulator or benchmark is available yet, stop at the report and ask the user to choose whether to build infrastructure or pivot to a more executable idea.