evilissimo/implementation-integrity-review

Reviews repositories, pull requests, diffs, and agent-generated code for reward hacking, fake completion, defensive theater, architectural bypasses, weakened guarantees, hidden fallbacks, and misleading abstractions.

Quality: 97% (Does it follow best practices?)
Impact: 100%, 1.09x (average score across 6 eval scenarios)
Security: Passed, no known issues

Reward Hacking Patterns

Name: evilissimo/implementation-integrity-review
Rating: 98.5 (1 review)
Author: evilissimo

Reward hacking means optimizing the implementation for the review, test suite, benchmark, or acceptance checklist while failing the real contract. Treat these patterns as leads and confirm them with code references before reporting.

Fixture-Specific Behavior

Branches that check for known test values, filenames, IDs, timestamps, hosts, users, or environment names.
Parsers that accept only the examples from tests or docs while rejecting valid variants.
Hardcoded outputs that match snapshots, golden files, or expected examples.
Mock-only behavior that cannot work with the real dependency.

Evidence to collect: the special-case condition, the corresponding fixture or test input, and a plausible valid input that would fail.
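As a minimal sketch of the evidence described above, the following contrasts a parser that only recognizes a fixture value with one that honors the general contract. All names here (`FIXTURE_INPUT`, `parse_date_hacked`, `parse_date_general`, `fails_on_valid_variant`) are hypothetical and chosen for illustration only.

```python
from datetime import date

# Hypothetical fixture value that the test suite happens to use.
FIXTURE_INPUT = "2024-01-15"


def parse_date_hacked(text: str) -> date:
    """Suspicious: only recognizes the exact string from the test fixture."""
    if text == FIXTURE_INPUT:  # the special-case condition to cite in a finding
        return date(2024, 1, 15)
    raise ValueError(f"unsupported date: {text!r}")


def parse_date_general(text: str) -> date:
    """Honors the assumed contract: any valid ISO calendar date (YYYY-MM-DD)."""
    return date.fromisoformat(text)


def fails_on_valid_variant(parser, valid_input: str) -> bool:
    """Return True if the parser rejects an input the contract says is valid."""
    try:
        parser(valid_input)
    except ValueError:
        return True
    return False
```

A finding built on this shape would cite the `text == FIXTURE_INPUT` branch, the test that supplies that value, and a valid input such as `"2024-02-29"` that the hacked parser rejects.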

Test-Only Branches

Production code checks NODE_ENV, PYTEST_CURRENT_TEST, CI, JEST_WORKER_ID, or similar variables to bypass behavior.
Code imports test helpers, fixtures, or fake providers outside test modules.
Runtime behavior differs when coverage, debug, or local test settings are enabled.
A branch exists only to satisfy a narrow failing test rather than the contract.

Do not report legitimate dependency injection by itself. Report it when the test path is easier than production and the production contract is unverified or broken.

Weakened Assertions And Coverage

Assertions move from exact behavior to "truthy", "not null", "called once", "does not throw", or snapshot-only checks.
Failing tests are skipped, xfailed, quarantined, deleted, or narrowed without replacing the lost coverage.
Integration tests become unit tests with mocks that remove the risky behavior.
CI filters, path ignores, or optional jobs exclude the changed code.

Evidence to collect: before/after test intent, the guarantee lost, and the production code now able to regress unnoticed.
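To make the lost guarantee concrete, here is a small sketch in which a weakened test still passes against a regressed implementation while the exact test catches it. The discount functions and both tests are hypothetical examples, not taken from any real project.

```python
def apply_discount_regressed(price_cents: int, percent: int) -> int:
    """Deliberate bug for illustration: the discount is never applied."""
    return price_cents


def apply_discount_correct(price_cents: int, percent: int) -> int:
    """Assumed contract: subtract the given percentage, rounding down."""
    return price_cents - price_cents * percent // 100


def weakened_test(apply_discount) -> None:
    """After the change: only checks that something truthy came back."""
    result = apply_discount(1000, 10)
    assert result is not None
    assert result


def exact_test(apply_discount) -> None:
    """Before the change: pins the exact expected behavior."""
    assert apply_discount(1000, 10) == 900
    assert apply_discount(1000, 0) == 1000


def passes(test, implementation) -> bool:
    """Run a test function against an implementation and report the outcome."""
    try:
        test(implementation)
    except AssertionError:
        return False
    return True
```

Running `passes(weakened_test, apply_discount_regressed)` succeeds even though the discount is broken, which is the regression the weakened assertion can no longer detect.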

Fake Completion

Functions return success while work is queued nowhere, persisted nowhere, or sent to no dependency.
TODO, placeholder, or stub paths are reachable in production.
New APIs expose status fields that always say complete, synced, healthy, or enabled.
Errors are converted to empty lists, default objects, or success responses.

Report fake completion when a caller would reasonably believe the requested work happened.
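A minimal sketch of fake completion, assuming a hypothetical `InMemoryStore` that can be marked unavailable: the first function reports success even when nothing was persisted, while the second lets the failure reach the caller.

```python
from dataclasses import dataclass, field


class StoreUnavailable(Exception):
    """Raised by the hypothetical store when it cannot accept writes."""


@dataclass
class InMemoryStore:
    """Stand-in for a real persistence dependency."""
    rows: list = field(default_factory=list)
    available: bool = True

    def insert(self, row: dict) -> None:
        if not self.available:
            raise StoreUnavailable("store is down")
        self.rows.append(row)


def save_order_fake(store: InMemoryStore, order: dict) -> dict:
    """Suspicious: the error is swallowed and the caller is told the work is done."""
    try:
        store.insert(order)
    except StoreUnavailable:
        pass  # failure converted into a success response
    return {"status": "complete"}


def save_order_honest(store: InMemoryStore, order: dict) -> dict:
    """Reports completion only if the insert actually happened."""
    store.insert(order)  # any failure propagates to the caller
    return {"status": "complete"}
```

The reportable fact is that `save_order_fake` returns `{"status": "complete"}` while `store.rows` stays empty.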

Fake Resiliency

Retry loops do not retry the failing operation, ignore final failure, or do not respect timeouts/backoff.
Circuit breakers, fallbacks, or caches never change state or are scoped so narrowly they cannot protect production calls.
Async/concurrent code still runs serially while advertising parallel behavior.
Cache keys ignore inputs, tenants, permissions, locale, or version.

Evidence to collect: the advertised resilience property and the exact path that fails to provide it.
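The following sketch shows one of the retry failure modes listed above: a loop that retries but silently ignores the final failure, next to a version that backs off and re-raises. Both helpers and their parameters (`attempts`, `base_delay`, `sleep`) are hypothetical names chosen for this example.

```python
import time


def retry_fake(operation, attempts: int = 3):
    """Suspicious: retries, but returns None when every attempt fails."""
    result = None
    for _ in range(attempts):
        try:
            result = operation()
            break
        except ConnectionError:
            continue  # no backoff, and the final failure is never surfaced
    return result


def retry_honest(operation, attempts: int = 3, base_delay: float = 0.1, sleep=time.sleep):
    """Retries with exponential backoff and re-raises the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except ConnectionError as exc:
            last_error = exc
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)  # wait before the next attempt
    raise last_error
```

The `sleep` parameter is injected only so the backoff can be observed or skipped in a test; the semantics of the retry itself are the same either way.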

Benchmark Or Metric Gaming

Fast paths bypass correctness checks only under benchmark input sizes or benchmark command names.
Metrics are renamed, suppressed, sampled away, or reset to hide regressions.
Health checks are changed to report process liveness while dependencies are failing.
Performance improvements come from skipping required work rather than making it cheaper.

Report only when the metric change hides real operational or correctness risk.
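As an illustration of the health-check pattern above, this sketch contrasts a check that only proves the process is running with one that probes its dependencies. The function names and the `checks` mapping are assumptions made for the example.

```python
def health_liveness_only() -> dict:
    """Always reports ok as long as the process can answer at all."""
    return {"status": "ok"}


def health_with_dependencies(checks: dict) -> dict:
    """Runs each dependency probe and reports which ones failed.

    `checks` maps a dependency name to a zero-argument callable that
    raises an exception when the dependency is unhealthy.
    """
    failures = {}
    for name, check in checks.items():
        try:
            check()
        except Exception as exc:  # any probe failure counts as unhealthy
            failures[name] = str(exc)
    status = "ok" if not failures else "degraded"
    return {"status": status, "failures": failures}
```

A liveness-only endpoint is not a problem in itself. It becomes reportable when it replaces a dependency-aware check that operators or load balancers rely on.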

False Positive Controls

These are usually legitimate:

Test fixtures contained entirely in tests.
Dependency injection that exercises the same semantics as production.
Narrow acceptance tests that are supplemented by lower-level coverage.
Feature flags that default to the safe path and report unsupported or degraded behavior explicitly.
Mocking external services while preserving contract tests for the adapter.
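One way to check that a test double keeps production semantics is to run a single contract check against both the double and the real adapter. The sketch below assumes a hypothetical key-value store contract (`put` overwrites, `get` raises `KeyError` for missing keys) and uses an in-memory SQLite database as a stand-in for the production adapter.

```python
import sqlite3


class FakeStore:
    """Test double backed by a plain dict."""

    def __init__(self) -> None:
        self._data = {}

    def put(self, key: str, value: str) -> None:
        self._data[key] = value

    def get(self, key: str) -> str:
        return self._data[key]  # raises KeyError for missing keys


class SqliteStore:
    """Stand-in for the production adapter, using in-memory SQLite."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT NOT NULL)")

    def put(self, key: str, value: str) -> None:
        self._db.execute("INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value))

    def get(self, key: str) -> str:
        row = self._db.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
        if row is None:
            raise KeyError(key)
        return row[0]


def check_store_contract(store) -> None:
    """Contract shared by every implementation, fake or real."""
    store.put("a", "1")
    store.put("a", "2")
    assert store.get("a") == "2"  # put overwrites the previous value
    try:
        store.get("missing")
    except KeyError:
        pass
    else:
        raise AssertionError("missing keys must raise KeyError")
```

Calling `check_store_contract` on both `FakeStore()` and `SqliteStore()` gives some evidence that tests written against the fake exercise the same semantics as production.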

Prefer one strong finding over several weak pattern matches. The strongest reward-hacking findings show the incentive target, the code path optimized for that target, and the real behavior left broken.
