
auto-review-loop-llm

Autonomous research review loop using any OpenAI-compatible LLM API. Configure via llm-chat MCP server or environment variables. Trigger with "auto review loop llm" or "llm review".
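The description says configuration happens via the llm-chat MCP server or environment variables. As a rough sketch of what the environment-variable path might look like, the snippet below builds an OpenAI-compatible chat-completions request from env vars. The variable names (`LLM_API_BASE`, `LLM_API_KEY`, `LLM_MODEL`) are illustrative assumptions, not the skill's documented keys.

```python
import json
import os

def build_chat_request(prompt):
    """Build an OpenAI-compatible /chat/completions payload from env vars.

    LLM_API_BASE, LLM_API_KEY, and LLM_MODEL are hypothetical names;
    the skill's actual configuration keys may differ.
    """
    base = os.environ.get("LLM_API_BASE", "https://api.openai.com/v1")
    model = os.environ.get("LLM_MODEL", "gpt-4o-mini")
    url = f"{base}/chat/completions"
    headers = {
        "Authorization": f"Bearer {os.environ.get('LLM_API_KEY', '')}",
        "Content-Type": "application/json",
    }
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return url, headers, json.dumps(body)

url, headers, body = build_chat_request("Review this draft.")
```

Because the payload shape is the standard OpenAI chat format, any OpenAI-compatible provider can be targeted just by changing the base URL.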


Quality

47%

Does it follow best practices?

Impact

Pending

No eval scenarios have been run

Security by Snyk

Critical

Do not install without reviewing

Optimize this skill with Tessl

npx tessl skill review --optimize ./skills/skills-codex/auto-review-loop-llm/SKILL.md

Quality

Discovery

40%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a specific tool integration (OpenAI-compatible LLM API, llm-chat MCP server) and provides explicit trigger phrases, which is helpful. However, it fails to explain what the skill actually does in concrete terms — what is being reviewed, what inputs it takes, what outputs it produces, and what the 'loop' entails. The lack of specificity about capabilities significantly weakens its utility for skill selection.

Suggestions

Add concrete actions describing what the review loop does, e.g., 'Iteratively reviews research papers/code by sending content to an external LLM, collecting feedback, and refining outputs until quality criteria are met.'

Expand the 'when' clause with natural use cases, e.g., 'Use when the user wants automated multi-pass review of documents, research papers, or code using an external LLM for feedback.'

Clarify what 'research review' means in practice — does it review academic papers, code, data analysis, or something else? This would reduce ambiguity and improve distinctiveness.

Dimension | Reasoning | Score

Specificity

The description says 'autonomous research review loop' but never explains what concrete actions are performed — what is being reviewed, what outputs are produced, or what steps the loop involves. 'Research review loop' is abstract and vague.

1 / 3

Completeness

It partially answers 'what' (autonomous research review loop using an LLM API) and provides explicit trigger phrases, but the 'what' is too vague to be meaningful — it doesn't explain what the loop actually does. The 'when' is addressed via trigger phrases but lacks context about use cases.

2 / 3

Trigger Term Quality

It includes trigger phrases like 'auto review loop llm' and 'llm review', and mentions 'OpenAI-compatible LLM API', but these read as prescribed rather than natural. A user is more likely to say 'review my research' or 'automated review' than the exact phrases given.

2 / 3

Distinctiveness / Conflict Risk

The mention of 'llm-chat MCP server' and 'OpenAI-compatible LLM API' provides some distinctiveness, and the specific trigger phrases help. However, 'research review' is broad enough to potentially overlap with other review or research-related skills.

2 / 3

Total: 7 / 12 (Passed)

Implementation

55%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

The skill excels at actionability and workflow clarity with concrete, executable examples and a well-defined multi-phase loop with validation checkpoints. However, it suffers significantly from verbosity — the provider table, repeated prompt templates (shown 3 times), and inline configuration examples bloat the content far beyond what's needed. The lack of progressive disclosure means all reference material is crammed into one file rather than being appropriately split.

Suggestions

Move the supported providers table and MCP configuration examples to a separate PROVIDERS.md or CONFIG.md reference file, linking to it from the main skill.

Consolidate the three nearly-identical review prompt templates into a single parameterized template, noting that Round 2+ should include previous review summary and changes.

Move the curl fallback method to a separate FALLBACK.md file since MCP is the primary method, keeping only a brief mention and link in the main skill.

Remove the provider-specific details (8 providers with URLs and models) — Claude can look these up or the user can configure them; only show one example provider in the config block.
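The consolidation suggested above can be sketched as a single parameterized template. The wording and parameter names here are illustrative, not the skill's actual prompt text: the assumption is that Round 2+ differs from Round 1 only by appending the previous review summary and the changes made.

```python
def review_prompt(content, round_num=1, previous_summary=None, changes=None):
    """One parameterized template replacing three near-identical ones.

    Prompt wording is a hypothetical stand-in for the skill's templates.
    """
    prompt = (
        f"Review round {round_num}. Score the work 1-10 and list concrete "
        f"improvements.\n\n{content}"
    )
    if round_num > 1:
        # Round 2+ carries forward context from the prior iteration.
        prompt += f"\n\nPrevious review summary:\n{previous_summary}"
        prompt += f"\n\nChanges since last round:\n{changes}"
    return prompt
```

This keeps one source of truth for the prompt, so edits to the scoring rubric do not have to be repeated in three places.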

Dimension | Reasoning | Score

Conciseness

The skill is excessively verbose. The supported providers table with 8 Chinese/international LLM providers is largely unnecessary padding. The curl fallback examples, MCP configuration JSON, and repeated review prompt templates (shown 3 times with minor variations) significantly bloat the content. Much of this (provider URLs, JSON config format) is reference material Claude doesn't need inline.

1 / 3

Actionability

The skill provides fully concrete, executable guidance: exact MCP tool call syntax, complete curl commands, specific JSON schemas for state persistence, exact prompt templates, and clear threshold values (score >= 6/10). Everything is copy-paste ready.

3 / 3

Workflow Clarity

The workflow is clearly sequenced through Phases A-E with explicit validation checkpoints (Phase B has a STOP condition, state persistence at end of Phase E, recovery check at initialization). The feedback loop of review → implement → re-review is well-defined with clear termination conditions.

3 / 3

Progressive Disclosure

This is a monolithic wall of text with no references to external files for detailed content. The provider table, prompt templates, curl examples, and MCP configuration could all be split into separate reference files. Everything is inlined in one large document with no navigation structure beyond section headers.

1 / 3

Total: 8 / 12 (Passed)
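The review → implement → re-review loop with a score threshold and end-of-phase state persistence, as described in the Workflow Clarity and Actionability notes, might look roughly like this. The function names, state-file format, and `max_rounds` cap are assumptions for illustration; only the 6/10 threshold comes from the review itself.

```python
import json

THRESHOLD = 6  # per the review: the loop terminates at score >= 6/10

def run_review_loop(draft, review_fn, revise_fn, max_rounds=5,
                    state_path="review_state.json"):
    """Sketch of the review -> implement -> re-review loop.

    review_fn returns (score, feedback); revise_fn returns an updated
    draft. These callables and the state schema are hypothetical, not
    the skill's actual API.
    """
    state = {}
    for round_num in range(1, max_rounds + 1):
        score, feedback = review_fn(draft)
        state = {"round": round_num, "score": score, "feedback": feedback}
        with open(state_path, "w") as f:
            json.dump(state, f)      # persist state so a crash can recover
        if score >= THRESHOLD:       # clear termination condition
            return draft, state
        draft = revise_fn(draft, feedback)
    return draft, state              # give up after max_rounds
```

Writing state after every round is what makes the recovery check at initialization possible: on restart, the loop can resume from the last persisted round instead of starting over.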

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Repository
wanshuiyin/Auto-claude-code-research-in-sleep
Reviewed


Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.