
jbvc/cpp-testing

Use only when writing/updating/fixing C++ tests, configuring GoogleTest/CTest, diagnosing failing or flaky tests, or adding coverage/sanitizers.

Quality: 75% (Does it follow best practices?)

Impact: Pending (No eval scenarios have been run)

Security by Snyk: Passed (No known issues)


Quality

Discovery

72%

Based on the skill's description, can an agent find and select it at the right time? Clear, specific descriptions lead to better discovery.

The description excels at trigger term quality and distinctiveness, clearly carving out a niche for C++ testing with GoogleTest/CTest. However, it is structured entirely as a 'when to use' clause without a separate 'what it does' statement, which weakens completeness. Adding an explicit capability statement before the trigger clause would strengthen it.

Suggestions

Add an explicit 'what' statement before the trigger clause, e.g., 'Provides guidance on writing C++ unit tests, configuring GoogleTest/CTest in CMake, debugging flaky tests, and setting up coverage/sanitizer builds.'

Expand specificity by listing concrete actions like 'create test fixtures, configure CMakeLists.txt for CTest, add ASan/TSan/UBSan flags, interpret coverage reports'.

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Specificity | The description names the domain (C++ tests) and mentions several actions (writing, updating, fixing, configuring, diagnosing, adding coverage/sanitizers), but these are somewhat general verbs rather than deeply specific concrete actions like 'generate test fixtures' or 'set up CMakeLists.txt for CTest'. | 2 / 3 |
| Completeness | The description has a strong 'when' clause ('Use only when...') but the 'what does this do' part is essentially embedded within the when clause rather than explicitly stated. It tells Claude when to use it but doesn't clearly describe what the skill provides or teaches (e.g., best practices, templates, configuration patterns). | 2 / 3 |
| Trigger Term Quality | Includes strong natural trigger terms users would say: 'C++ tests', 'GoogleTest', 'CTest', 'failing tests', 'flaky tests', 'coverage', 'sanitizers'. These cover common variations of how users would describe testing-related tasks in C++. | 3 / 3 |
| Distinctiveness Conflict Risk | Very clearly scoped to C++ testing with GoogleTest/CTest, coverage, and sanitizers. The combination of language (C++), framework (GoogleTest/CTest), and task type (testing) creates a distinct niche unlikely to conflict with other skills. | 3 / 3 |
| Total | | 10 / 12 |

Passed

Implementation

64%

Reviews the quality of instructions and guidance provided to agents. Good implementation is clear, handles edge cases, and produces reliable results.

This is a solid, actionable skill with excellent executable code examples covering the full C++ testing workflow from basic tests through coverage and sanitizers. Its main weaknesses are moderate verbosity (redundant examples, concepts Claude already knows) and a monolithic structure that could benefit from splitting detailed reference material into separate files. The workflow sections could be strengthened with explicit validation checkpoints, particularly around coverage and sanitizer workflows.

Suggestions

Remove the duplicate basic unit test example (CalculatorTest) since it's nearly identical to the TDD example, and trim the 'Core Concepts' section to only project-specific conventions Claude wouldn't already know.

Add explicit validation checkpoints to the coverage and sanitizer workflows (e.g., 'verify coverage meets threshold', 'check sanitizer output for errors before proceeding').

Split the coverage, sanitizer, and fuzzing sections into separate referenced files (e.g., COVERAGE.md, SANITIZERS.md) to improve progressive disclosure and reduce the main file's token footprint.
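The 'verify coverage meets threshold' checkpoint suggested above could be sketched in C++ roughly as follows. This is a minimal illustration, not code from the reviewed skill: the function names, the line-count inputs, and the rule for an empty file set are all assumptions, and in practice the counts would come from whatever coverage tool the build already uses.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of the reviewed skill or any coverage tool).
// Returns line coverage as a percentage in the range [0, 100].
double line_coverage_percent(std::size_t covered_lines, std::size_t total_lines) {
    // Precondition: a report cannot cover more lines than exist.
    assert(covered_lines <= total_lines);
    if (total_lines == 0) {
        // Assumption: with nothing to cover, treat coverage as complete.
        return 100.0;
    }
    return 100.0 * static_cast<double>(covered_lines) /
           static_cast<double>(total_lines);
}

// Validation checkpoint: true when measured coverage reaches the minimum.
// A workflow would stop and report when this returns false.
bool coverage_meets_threshold(std::size_t covered_lines,
                              std::size_t total_lines,
                              double min_percent) {
    return line_coverage_percent(covered_lines, total_lines) >= min_percent;
}
```

A workflow step could call `coverage_meets_threshold(covered, total, 80.0)` after the coverage report is generated and refuse to continue when it returns false. The 80% figure is only an example; the review does not name a threshold.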

| Dimension | Reasoning | Score |
| --- | --- | --- |
| Conciseness | The skill is fairly comprehensive but includes some unnecessary verbosity. The 'When NOT to Use' section, the 'Core Concepts' bullet list restating things Claude already knows (TDD, isolation, mocks vs fakes), and the basic unit test example being nearly identical to the TDD example add redundant tokens. The fixture example includes inline stub implementations that pad length. However, most content is useful reference material. | 2 / 3 |
| Actionability | The skill provides fully executable code examples for unit tests, fixtures, mocks, CMake configuration, coverage setup (both GCC and Clang), sanitizer configuration, and test running commands. The CMake quickstart is copy-paste ready with complete FetchContent setup, and bash commands are specific and complete. | 3 / 3 |
| Workflow Clarity | The TDD workflow (RED → GREEN → REFACTOR) is clearly sequenced, and the debugging failures section has a reasonable sequence. However, the debugging section lacks explicit validation checkpoints: it says 'expand to full suite once the root cause is fixed' but doesn't include a verify step. The coverage workflow is a sequence of commands but lacks validation (e.g., checking coverage thresholds). For operations like sanitizer builds, there's no feedback loop for addressing found issues. | 2 / 3 |
| Progressive Disclosure | The content is well-structured with clear section headers and logical organization, but it's monolithic: everything is in one file with no references to external files for detailed topics like coverage, sanitizers, or fuzzing. The fuzzing appendix and alternatives section could be separate files. At ~250 lines, some content (like the full coverage and sanitizer CMake configs) could be split out with clear references. | 2 / 3 |
| Total | | 9 / 12 |

Passed
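The missing verify step noted under Workflow Clarity could look something like the sketch below: after a fix, rerun the previously failing or flaky test body several times and confirm it never fails before expanding to the full suite. `RepeatResult` and `run_repeatedly` are hypothetical names introduced here for illustration; they are not GoogleTest or CTest APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Hypothetical summary of a repeated run (illustrative only).
struct RepeatResult {
    std::size_t runs = 0;      // How many times the test body was executed.
    std::size_t failures = 0;  // How many of those executions returned false.

    // A test is considered stable only if no execution failed.
    bool stable() const { return failures == 0; }
};

// Runs `test_body` exactly `runs` times and counts the failing executions.
// `test_body` returns true on pass and false on failure.
RepeatResult run_repeatedly(const std::function<bool()>& test_body,
                            std::size_t runs) {
    // Precondition: an empty std::function cannot be invoked.
    assert(static_cast<bool>(test_body) && "test_body must be callable");
    RepeatResult result;
    for (std::size_t i = 0; i < runs; ++i) {
        ++result.runs;
        if (!test_body()) {
            ++result.failures;
        }
    }
    return result;
}
```

With real GoogleTest binaries the same idea is usually expressed on the command line, for example with `--gtest_repeat`, or with `ctest --repeat until-fail:<n>` on recent CMake versions, rather than with a hand-written loop. The helper above only illustrates the checkpoint logic: proceed when `stable()` is true, otherwise return to debugging.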

Validation

100%

Checks the skill against the spec for correct structure and formatting. All validation checks must pass before discovery and implementation can be scored.

Validation: 11 / 11 Passed

Validation for skill structure

No warnings or errors.

Reviewed
