Audit experiment integrity before claiming results. Uses cross-model review (GPT-5.4) to check for fake ground truth, score normalization fraud, phantom results, and insufficient scope. Use when the user says "审计实验" (audit the experiment), "check experiment integrity", "audit results", "实验诚实度" (experiment integrity), or after experiments complete and before claims are written.
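As a rough illustration of the cross-model review flow described above, the sketch below gathers experiment artifacts and asks a second model to flag the four failure modes the skill targets. The helper `call_reviewer_model`, the prompt wording, and the directory layout are hypothetical placeholders, not this skill's actual implementation.

```python
# Hedged sketch only: one way a cross-model integrity audit could be wired up.
from pathlib import Path

CHECKS = [
    "fake ground truth",
    "score normalization fraud",
    "phantom results",
    "insufficient scope",
]

def call_reviewer_model(prompt: str) -> str:
    """Placeholder for the reviewing-model client (e.g. GPT-5.4); replace with a real call."""
    raise NotImplementedError

def audit_experiment(results_dir: str) -> str:
    # Collect the experiment artifacts the reviewer should inspect.
    artifacts = "\n\n".join(
        f"--- {p} ---\n{p.read_text(errors='replace')}"
        for p in sorted(Path(results_dir).rglob("*"))
        if p.is_file()
    )
    prompt = (
        "Audit the following experiment artifacts for integrity issues: "
        + ", ".join(CHECKS)
        + ". Report each suspected issue with the evidence you relied on.\n\n"
        + artifacts
    )
    return call_reviewer_model(prompt)
```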
Score: 89
Does it follow best practices? 88%
Impact: Pending. No eval scenarios have been run.
Risky: Do not use without reviewing.
Security: 1 high severity finding. Review it carefully before deciding whether to use this skill.
The skill handles credentials insecurely by requiring the agent to include secret values verbatim in its generated output. This exposes credentials in the agent’s context and conversation history, creating a risk of data exfiltration.
Insecure credential handling detected (high risk: 0.80). The reviewer is explicitly asked to read config and result files and to produce "exact file:line references" and detailed findings. Secrets present in those files (API keys, tokens, cookies) can therefore be read and reproduced verbatim in the audit outputs, creating an exfiltration risk.
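One way to reduce this exposure, offered purely as a hedged sketch and not something the skill currently does, is to mask likely secrets before file contents ever reach the reviewer, so any verbatim quotes in the audit output are already redacted. The patterns below are illustrative and would need tuning to the actual config formats involved.

```python
import re

# Rough, illustrative patterns for common secret shapes; a real deployment would
# extend these to cover the config formats actually present in the repo.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),                               # OpenAI-style API keys
    re.compile(r"(?i)(api[_-]?key|token|secret|cookie)\s*[:=]\s*\S+"),  # key=value style credentials
]

def redact_secrets(text: str) -> str:
    """Replace likely credential values with a placeholder before review."""
    for pattern in SECRET_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text

# Usage: redact each artifact before it is concatenated into the review prompt,
# e.g. artifact_text = redact_secrets(path.read_text(errors="replace")).
```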
2028ac4
If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.