llmobs-testing

This skill should be used when the user asks to "write LLMObs tests", "add tests for LLM Observability", "test an LLMObs plugin", "llmobs test", "llmobs spec", "test llm observability", "assertLlmObsSpanEvent", "useLlmObs", "getEvents", "MOCK_STRING", "MOCK_NOT_NULLISH", "MOCK_NUMBER", "MOCK_OBJECT", "VCR cassette", "record cassette", "replay cassette", "vcr proxy", "llmobs cassette", "test chat completions", "test streaming", "test embeddings", "test agent runs", "test orchestration", "test workflow", "llmobs span event", "LLMObs test strategy", "LlmObsCategory test", "LLM_CLIENT test", "MULTI_PROVIDER test", "ORCHESTRATION test", "INFRASTRUCTURE test", "span kind llm test", "span kind workflow test", "inputMessages", "outputMessages", "token metrics", "llmobs span validation", "cassette not generated", "re-record cassette", "127.0.0.1:9126", or needs to write, modify, or debug tests for any LLMObs plugin in dd-trace-js.

56

Quality

62%

Does it follow best practices?

Impact

No eval scenarios have been run

Security by Snyk

Advisory

Suggest reviewing before use


Security

1 medium severity finding. This skill can be installed but you should review these findings before use.

Medium

W011: Third-party content exposure detected (indirect prompt injection risk)

What this means

The skill exposes the agent to untrusted, user-generated content from public third-party sources, creating a risk of indirect prompt injection. This includes browsing arbitrary URLs, reading social media posts or forum comments, and analyzing content from unknown websites.

Why it was flagged

Third-party content exposure detected (high risk: 0.90). The skill's required workflow (SKILL.md and references/vcr-cassettes.md) instructs tests to make real HTTP calls to external LLM providers through a VCR proxy (e.g., baseURL http://127.0.0.1:9126/vcr/{provider}) and to ingest the recorded provider responses (cassettes). Those responses are arbitrary third-party output, yet they drive assertions and test behavior.
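
To make the flagged pattern concrete, here is a minimal sketch of how a test might derive the proxy base URL for a given provider. The helper name `vcrBaseUrl`, the slug check, and the `"openai"` provider value are assumptions made for illustration; only the host, port, and `/vcr/{provider}` path shape come from the finding itself, and none of this is taken from dd-trace-js.

```typescript
// Hypothetical helper, not part of dd-trace-js or the skill.
// Host and port come from the finding; the path shape is /vcr/{provider}.
const VCR_PROXY_ORIGIN = "http://127.0.0.1:9126";

function vcrBaseUrl(provider: string): string {
  // Accept only simple lowercase slugs so the provider name cannot alter the path.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(provider)) {
    throw new Error(`unexpected provider name: ${provider}`);
  }
  return `${VCR_PROXY_ORIGIN}/vcr/${provider}`;
}

// Example: the base URL a test client would be configured with.
// "openai" is an assumed provider slug used only for illustration.
const exampleBaseUrl: string = vcrBaseUrl("openai");
```

Whatever the proxy replays from a cassette under that URL originated with the external provider, which is why the audit treats recorded responses as untrusted input.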

Repository: DataDog/dd-trace-js
Audited: Security analysis (Snyk)