llmobs-testing

This skill should be used when the user asks to "write LLMObs tests", "add tests for LLM Observability", "test an LLMObs plugin", "llmobs test", "llmobs spec", "test llm observability", "assertLlmObsSpanEvent", "useLlmObs", "getEvents", "MOCK_STRING", "MOCK_NOT_NULLISH", "MOCK_NUMBER", "MOCK_OBJECT", "VCR cassette", "record cassette", "replay cassette", "vcr proxy", "llmobs cassette", "test chat completions", "test streaming", "test embeddings", "test agent runs", "test orchestration", "test workflow", "llmobs span event", "LLMObs test strategy", "LlmObsCategory test", "LLM_CLIENT test", "MULTI_PROVIDER test", "ORCHESTRATION test", "INFRASTRUCTURE test", "span kind llm test", "span kind workflow test", "inputMessages", "outputMessages", "token metrics", "llmobs span validation", "cassette not generated", "re-record cassette", "127.0.0.1:9126", or needs to write, modify, or debug tests for any LLMObs plugin in dd-trace-js.

Quality

62%

Does it follow best practices?

Impact

—

No eval scenarios have been run

Securityby

Advisory

Suggest reviewing before use

Fix and improve this skill with Tessl

tessl review fix ./.agents/skills/llmobs-testing/SKILL.md

1 medium severity finding. This skill can be installed but you should review these findings before use.

Medium

W011: Third-party content exposure detected (indirect prompt injection risk)

What this means

The skill exposes the agent to untrusted, user-generated content from public third-party sources, creating a risk of indirect prompt injection. This includes browsing arbitrary URLs, reading social media posts or forum comments, and analyzing content from unknown websites.

Why it was flagged

Third-party content exposure detected (high risk: 0.90). Yes — the skill's required workflow (SKILL.md and references/vcr-cassettes.md) instructs tests to make real HTTP calls to external LLM providers via a VCR proxy (e.g., baseURL http://127.0.0.1:9126/vcr/{provider}) and ingest provider responses (cassettes) which are arbitrary third-party outputs used to drive assertions and test behavior.

Report incorrect finding

Repository: DataDog/dd-trace-js
Commit: 50aa025

Audited: 2 months ago
Security analysis

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.