research

Research prediction markets - base rates, resolution rules, historical data

Quality

27%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Security

Advisory

Suggest reviewing before use

Optimize this skill with Tessl

npx tessl skill review --optimize ./src/skills/bundled/research/SKILL.md

Quality

Content

22%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This skill reads more like a feature specification or product mockup than an actionable skill for Claude. It defines commands and shows idealized outputs but provides no concrete guidance on how to actually perform research — no data sources, no tools, no APIs, no verification methods. The base rate numbers in examples appear fabricated with no sourcing methodology, which is particularly problematic for a research-focused skill.

Suggestions

Add concrete, actionable steps for how to actually research base rates — specify data sources to query, tools to use (web search, specific databases), and how to validate findings.

Include a clear workflow: e.g., 1) Search for historical data, 2) Verify with multiple sources, 3) Calculate rates, 4) Flag uncertainty/confidence levels.

Define what the /baserate, /resolution, and /history commands actually do mechanically — are these web searches? Database lookups? Clarify the tools Claude should use.

Add explicit guidance on how to handle uncertainty and when to caveat findings, especially since prediction market research involves probabilistic claims that could be misleading if presented without proper sourcing.
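Steps 3 and 4 of the suggested workflow (calculate rates, flag uncertainty) could be sketched as follows. This is a minimal illustration, not part of the reviewed skill: the names `wilson_interval`, `base_rate`, and `BaseRate` are hypothetical, and the interval-width cutoffs for the confidence labels are arbitrary assumptions. It assumes the historical outcomes have already been collected and verified against multiple sources.

```python
import math
from dataclasses import dataclass


@dataclass
class BaseRate:
    rate: float       # observed frequency of the event
    low: float        # lower bound of the 95% Wilson interval
    high: float       # upper bound of the 95% Wilson interval
    n: int            # number of historical cases
    confidence: str   # coarse label derived from interval width


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("need at least one historical case")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)


def base_rate(outcomes: list[bool]) -> BaseRate:
    """Compute a base rate and flag how much the sample size supports it."""
    n = len(outcomes)
    k = sum(outcomes)
    low, high = wilson_interval(k, n)
    width = high - low
    # Assumed cutoffs: a narrow interval means the sample supports the estimate.
    if width < 0.2:
        confidence = "high"
    elif width < 0.4:
        confidence = "medium"
    else:
        confidence = "low"
    return BaseRate(rate=k / n, low=low, high=high, n=n, confidence=confidence)


# Illustrative input only: 7 occurrences in 10 made-up historical cases.
result = base_rate([True] * 7 + [False] * 3)
print(f"{result.rate:.0%} (95% CI {result.low:.0%} to {result.high:.0%}, "
      f"n={result.n}, confidence: {result.confidence})")
```

Reporting the interval and sample size alongside the point estimate addresses the concern about unsourced base rate numbers: a rate from ten cases reads very differently from one drawn from several hundred.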

Dimension	Reasoning	Score
Conciseness	The skill is reasonably concise but includes some unnecessary enumeration of research areas (political, economic, sports) that Claude already knows how to categorize. The examples section partially duplicates what the output format already shows.	2 / 3
Actionability	The skill provides no executable code, no actual data sources, no API endpoints, and no concrete methods for looking up base rates or resolution rules. The commands (/baserate, /resolution, /history) are defined but there's no implementation or guidance on how to actually perform the research — it just shows idealized outputs with no path to producing them.	1 / 3
Workflow Clarity	There is no clear workflow or sequence of steps for conducting research. The skill lists commands and example outputs but never explains how to go from a query to a result — no data sources to check, no verification steps, no process for validating base rate calculations.	1 / 3
Progressive Disclosure	The content is organized into logical sections (commands, research areas, examples, output format) which provides some structure. However, there are no references to external files for deeper content, and the research areas section is a flat list that could either be removed or expanded into separate reference documents.	2 / 3
	Total	6 / 12 Passed

Description

32%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description identifies a reasonably niche domain (prediction markets) and lists some relevant subtopics, but it reads more like a tag list than a proper skill description. It critically lacks any 'Use when...' guidance, and the actions described are vague topic areas rather than concrete capabilities. Adding explicit trigger conditions and more specific action verbs would significantly improve skill selection accuracy.

Suggestions

Add an explicit 'Use when...' clause, e.g., 'Use when the user asks about prediction market probabilities, forecasting questions, or wants to analyze market odds and historical accuracy.'

Replace the topic list with concrete action verbs describing what the skill does, e.g., 'Researches prediction market questions by calculating base rates, analyzing resolution criteria, and comparing historical forecasting data.'

Include common user-facing trigger terms and platform names like 'forecasting', 'Polymarket', 'Metaculus', 'probabilities', 'calibration', 'betting odds' to improve keyword coverage.
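The suggestions above can be turned into a rough pre-check for a description string. The sketch below is hypothetical and does not reproduce how Tessl scores descriptions: `check_description` and `TRIGGER_TERMS` are illustrative names, and the term list is simply the one given in the third suggestion.

```python
import re

# Assumed term list, copied from the suggestion above (lowercased for matching).
TRIGGER_TERMS = [
    "forecasting",
    "polymarket",
    "metaculus",
    "probabilities",
    "calibration",
    "betting odds",
]


def check_description(description: str) -> dict:
    """Report whether a 'Use when' clause exists and which trigger terms are absent."""
    text = description.lower()
    return {
        "has_use_when": bool(re.search(r"\buse when\b", text)),
        "missing_terms": [term for term in TRIGGER_TERMS if term not in text],
    }


# The description as currently published for this skill.
current = "Research prediction markets - base rates, resolution rules, historical data"

# A candidate built by joining the two example sentences from the suggestions.
candidate = (
    "Researches prediction market questions by calculating base rates, "
    "analyzing resolution criteria, and comparing historical forecasting data. "
    "Use when the user asks about prediction market probabilities, forecasting "
    "questions, or wants to analyze market odds and historical accuracy."
)

for label, text in [("current", current), ("candidate", candidate)]:
    print(label, check_description(text))
```

Simple substring matching is enough for a sanity check, though it says nothing about whether the description reads naturally or avoids overlap with other skills.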

Dimension	Reasoning	Score
Specificity	Names the domain (prediction markets) and some actions (research base rates, resolution rules, historical data), but these are more like topic areas than concrete actions. It doesn't specify what it actually does with these (e.g., 'calculates base rates', 'retrieves resolution criteria', 'analyzes historical accuracy').	2 / 3
Completeness	Describes a rough 'what' (researching prediction market topics) but completely lacks a 'when' clause. There is no 'Use when...' or equivalent trigger guidance, which per the rubric should cap completeness at 2, and since the 'what' is also weak, this scores a 1.	1 / 3
Trigger Term Quality	Includes relevant domain keywords like 'prediction markets', 'base rates', 'resolution rules', and 'historical data' which users might naturally say. However, it misses common variations like 'forecasting', 'Polymarket', 'Metaculus', 'probabilities', 'calibration', or 'betting markets'.	2 / 3
Distinctiveness Conflict Risk	The prediction markets domain is fairly niche, which helps distinctiveness. However, 'research' and 'historical data' are generic enough that they could overlap with general research or data analysis skills. The lack of explicit trigger boundaries increases conflict risk.	2 / 3
	Total	7 / 12 Passed

Validation

90%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation — 10 / 11 Passed

Validation for skill structure

Criteria	Description	Result
frontmatter_unknown_keys	Unknown frontmatter key(s) found; consider removing or moving to metadata	Warning
	Total	10 / 11 Passed

Repository: alsk1992/CloddsBot
Commit: e71a5f6

Reviewed: 2 months ago
