analyze-results

Analyze ML experiment results: compute statistics, build comparison tables, and generate insights. Use when the user says "analyze results" or "compare", or needs to interpret experimental data.

Analyze Experiment Results

Analyze: $ARGUMENTS

Workflow

Step 1: Locate Results

Find all relevant JSON/CSV result files:

Check figures/, results/, or project-specific output directories
Parse JSON and CSV results into structured data
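
A minimal sketch of this step, using only the Python standard library. The function names `find_result_files` and `load_records` are hypothetical, and the default directory names simply mirror the ones listed above; a real project may keep its outputs elsewhere.

```python
import csv
import json
from pathlib import Path

# Assumed search locations; adjust to the project's actual output directories.
DEFAULT_DIRS = ("results", "figures")
RESULT_SUFFIXES = {".json", ".csv"}


def find_result_files(root, dirs=DEFAULT_DIRS):
    """Return sorted paths of all .json and .csv files under the given directories."""
    found = []
    for name in dirs:
        base = Path(root) / name
        if base.is_dir():
            found.extend(
                p for p in base.rglob("*")
                if p.is_file() and p.suffix.lower() in RESULT_SUFFIXES
            )
    return sorted(found)


def load_records(path):
    """Parse one result file into a list of dict records."""
    path = Path(path)
    if path.suffix.lower() == ".json":
        with path.open() as f:
            data = json.load(f)
        # A top-level object is treated as one record, a top-level list as many.
        return data if isinstance(data, list) else [data]
    with path.open(newline="") as f:
        # CSV values arrive as strings; numeric conversion is left to the caller.
        return list(csv.DictReader(f))
```

Keeping discovery and parsing separate makes it easy to print the list of files found before anything is loaded, which helps confirm that no run was silently skipped.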

Step 2: Build Comparison Table

Organize results by:

Independent variables: model type, hyperparameters, data config
Dependent variables: primary metric (e.g., perplexity, accuracy, loss), secondary metrics
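Delta vs. baseline: always compute the relative improvement over the baseline, (value - baseline) / baseline, and note whether higher or lower is better for the metric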
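
One way to attach the delta column, sketched under assumptions: each row is a dict with a `name` key and one numeric metric, and `build_comparison`, `relative_delta`, and the `lower_is_better` flag are hypothetical names introduced here for illustration.

```python
def relative_delta(value, baseline):
    """Relative change of value against baseline, as a fraction of |baseline|."""
    if baseline == 0:
        raise ValueError("baseline is zero; relative change is undefined")
    return (value - baseline) / abs(baseline)


def build_comparison(rows, metric, baseline_name, lower_is_better=False):
    """Return copies of rows with delta-vs-baseline columns added.

    rows: list of dicts, each with a "name" key and a numeric `metric` key.
    """
    matches = [r for r in rows if r["name"] == baseline_name]
    if not matches:
        raise ValueError(f"no row named {baseline_name!r}")
    baseline = matches[0][metric]

    table = []
    for r in rows:
        delta = relative_delta(r[metric], baseline)
        # Flip the sign for metrics such as loss or perplexity, so that a
        # positive improvement always means "better than baseline".
        improvement = -delta if lower_is_better else delta
        table.append({
            **r,
            "delta_pct": 100 * delta,
            "improvement_pct": 100 * improvement,
        })
    return table
```

Reporting both the signed delta and the direction-aware improvement avoids the common confusion where a negative number is good for perplexity or loss but bad for accuracy.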

Step 3: Statistical Analysis

If there are multiple seeds: report mean ± std across seeds and check that the result reproduces across them
If sweeping a parameter: identify trends (monotonic, U-shaped, plateau)
Flag outliers or suspicious results
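
The three checks above can be sketched with the standard library alone. Everything here is an assumption rather than a prescribed method: the function names are hypothetical, the trend labels are a simple sign-of-difference heuristic, and the outlier rule is the modified z-score based on the median and MAD, which is one common choice among several.

```python
import statistics


def summarize_seeds(values):
    """Mean and sample standard deviation across seeds."""
    mean = statistics.mean(values)
    std = statistics.stdev(values) if len(values) > 1 else 0.0
    return mean, std


def classify_trend(ys, tol=0.0):
    """Label a metric swept over an ordered parameter.

    ys must be ordered by the swept parameter. tol is the smallest change
    treated as a real change (a user-chosen threshold).
    """
    diffs = [b - a for a, b in zip(ys, ys[1:])]
    signs = [0 if abs(d) <= tol else (1 if d > 0 else -1) for d in diffs]
    nonzero = [s for s in signs if s != 0]
    if not nonzero:
        return "flat"
    # Collapse runs of equal sign to count how often the direction changes.
    turns = [s for i, s in enumerate(nonzero) if i == 0 or s != nonzero[i - 1]]
    if len(turns) == 1:
        direction = "increasing" if turns[0] > 0 else "decreasing"
        # A final near-zero change is reported as a plateau.
        if signs[-1] == 0:
            return f"{direction}, then plateau"
        return f"monotonic {direction}"
    if turns == [-1, 1]:
        return "U-shaped"
    if turns == [1, -1]:
        return "inverted U-shaped"
    return "non-monotonic"


def flag_outliers(values, threshold=3.5):
    """Indices whose modified z-score (median and MAD based) exceeds threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []  # no spread to measure against
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - med) / mad > threshold
    ]
```

With only a handful of seeds, both the standard deviation and the outlier flags are rough indicators; a flagged run is a prompt to inspect its logs, not proof that it is invalid.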

Step 4: Generate Insights

For each finding, structure as:

Observation: what the data shows (with numbers)
Interpretation: why this might be happening
Implication: what this means for the research question
Next step: what experiment would test the interpretation
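
To keep every finding in the same four-part shape, the structure above can be captured in a small record type. This is only an illustrative sketch; the `Finding` class and its `render` method are hypothetical names, not part of the skill.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    observation: str      # what the data shows, with numbers
    interpretation: str   # why this might be happening
    implication: str      # what it means for the research question
    next_step: str        # experiment that would test the interpretation

    def render(self, index):
        """Format the finding as a numbered plain-text block."""
        return "\n".join([
            f"Finding {index}",
            f"Observation: {self.observation}",
            f"Interpretation: {self.interpretation}",
            f"Implication: {self.implication}",
            f"Next step: {self.next_step}",
        ])
```

Because all four fields are required, a finding cannot be constructed without an interpretation or a proposed follow-up, which enforces the structure at write time.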

Step 5: Update Documentation

If findings are significant:

Propose updates to project notes or experiment reports
Draft a concise finding statement (1-2 sentences)

Output Format

Always include:

Raw data table
Key findings (numbered, concise)
Suggested next experiments (if any)
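
A minimal sketch of assembling that output as plain text. The helper names `render_table` and `render_report` are hypothetical, and the layout (fixed-width columns, numbered findings) is just one reasonable rendering of the three required parts.

```python
def render_table(rows, columns):
    """Render dict rows as a fixed-width plain-text table."""
    cells = [[str(r.get(c, "")) for c in columns] for r in rows]
    widths = [
        max([len(c)] + [len(row[i]) for row in cells])
        for i, c in enumerate(columns)
    ]

    def line(values):
        # Pad each cell to its column width; trim trailing spaces.
        return "  ".join(v.ljust(w) for v, w in zip(values, widths)).rstrip()

    header = line(columns)
    rule = line(["-" * w for w in widths])
    return "\n".join([header, rule] + [line(row) for row in cells])


def render_report(rows, columns, findings, next_experiments=()):
    """Combine the raw data table, numbered findings, and optional next steps."""
    parts = ["Raw data", render_table(rows, columns), "", "Key findings"]
    parts += [f"{i}. {text}" for i, text in enumerate(findings, 1)]
    if next_experiments:
        # The section is omitted entirely when nothing is suggested.
        parts += ["", "Suggested next experiments"]
        parts += [f"{i}. {text}" for i, text in enumerate(next_experiments, 1)]
    return "\n".join(parts)
```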

Repository: wanshuiyin/Auto-claude-code-research-in-sleep
Commit: fe5963c

Last updated: about 18 hours ago
Created: about 18 hours ago
