CtrlK
BlogDocsLog inGet started
Tessl Logo

flows-code-review

Run the technical (code) review step of Flows app certification. Produces three artifacts under reviews/code-review/feedback-round-<N>/: review-files.md (per-file inventory), review-packages.md (dependency audit), and code-review-report.md (scored report with Must Fix / Should Fix / Nice Fix items). Use when the user asks for a Flows code review, technical review, pre-submit review, app certification code review, or "run flows-code-review". Re-run until 0 open Must Fix items remain before moving on to flows-design-review.

Invalid
This skill can't be scored yet
Validation errors are blocking scoring. Review and fix them to unlock Quality, Impact and Security scores. See what needs fixing →
SKILL.md
Quality
Evals
Security

Flows Code Review

This skill is the technical review step of the Flows app certification flow:

flows-app-brief  →  build  →  flows-code-review (this skill, repeat until clean)  →  flows-design-review  →  flows-external-app-submit

Pre-checks

Before starting, confirm:

  • package.json exists (Node/TS project)
  • We are inside a git repository (git rev-parse --git-dir)
  • If App-Brief.md is missing at repo root, warn the user that flows-app-brief should be run first, but continue.

Determine the feedback round: look at reviews/code-review/. If it doesn't exist, use feedback-round-1/. Otherwise increment to the next missing round number.

Step 1 — Load review guidance

Before any scoring, read all relevant skill files. The skills ship alongside this skill in the same repo and are available locally. For each skill below, Glob '**/skills/<name>/SKILL.md' and Read the first match.

Required reads (always):

SkillInforms criteria
correctness-and-error-handling1.1 (bugs), 1.4 (tests), plus error-state patterns
security1.2 (CDF via SDK), 1.3 (dependencies, CVEs)
dependencies-audit1.3 (packages audit)
test-coverage1.4 (coverage gate), 1.6 (testability)
code-quality1.5 (dead code), 1.6 (patterns)
dm-limits-and-best-practices2.1–2.5 (DMS query patterns, limits, rate)
performance2.3–2.4 (pagination, call rate)
design3.1 (Aura consistency)

Conditional read:

  • setup-flows-auth — read if the app imports connectToHostApp, useHostApp, or configures OAuth in Vite.

Note on DMS patterns: the dm-limits-and-best-practices skill is the canonical local reference for search vs joins, cardinality, and query-time semantics — no external semantic-knowledge/ submodule is needed.

Do not proceed to scoring until all skill reads are complete.

Step 2 — Build the file inventory (review-files.md)

List all .ts and .tsx files in the app (exclude node_modules, dist, .cognite-bundles). For each file assess:

  • Structure (1–5): file size, single responsibility, clear organization
  • Quality (1–5): type safety, no any/unknown casts, proper error handling, no dead code, clean naming
  • Patterns (1–5 or N/A): DI/testability, correct platform patterns, no anti-patterns; N/A for pure utils/types
  • Tests: adequate | has tests with meaningful gaps | no test file | N/A (presentational-only or barrel)

Write to reviews/code-review/feedback-round-<N>/review-files.md:

## File inventory: [app name]

| File | Structure | Quality | Patterns | Tests | Notes |
| ---- | --------- | ------- | -------- | ----- | ----- |
| src/main.tsx | 5 | 5 | N/A | N/A | Clean entry point |
| src/services/cdfHelper.ts | 3 | 2 | 2 | ✗ | No DI; raw sdk.get/post; no test file (1.2, 1.6, 2.5) |
| ... | ... | ... | ... | ... | ... |

Step 3 — Build the package inventory (review-packages.md)

Gather dependency state with two one-shot commands — do not loop per package (npm view <pkg> is one network call per package; 40–80 calls = 2–4 minutes):

npm outdated --json 2>/dev/null || true   # exits 1 when outdated packages exist — normal; parse stdout regardless
npm audit --json 2>/dev/null || true       # exits 1 when vulnerabilities exist — parse stdout regardless

From npm outdated --json: packages absent from the output are up-to-date; packages present show current / wanted / latest. Flag any in dependencies (not devDependencies) that are ≥ 1 major behind.

From npm audit --json: parse severity counts and per-package advisory details.

Deprecated status: npm outdated does not report this. Only spot-check with npm view <pkg> deprecated for packages that already triggered another flag (≥ 1 major behind, in audit advisories, or unfamiliar name).

Assign health: Pass (up-to-date or ≤ 1 minor behind, 0 critical/high CVEs) | Warn (1 major behind in dependencies, or moderate CVE) | Fail (≥ 2 majors behind, or high/critical CVE, or deprecated).

Write to reviews/code-review/feedback-round-<N>/review-packages.md:

## Package audit: [app name]

### Dependencies

| Package | Used version | Latest | Deprecated | CVEs | Health |
| ------- | ------------ | ------ | ---------- | ---- | ------ |
| react | ^18.2.0 | 18.3.1 | No | 0 | Pass |
| some-old-lib | ^1.0.0 | 1.0.3 | No | 0 | Fail |
| ... | ... | ... | ... | ... | ... |

### Security audit

| Severity | Count |
| -------- | ----- |
| Critical | 0 |
| High | 0 |
| Moderate | 0 |
| Low | 0 |

#### Vulnerabilities

| Package | Severity | Title | Patched in | Advisory |
| ------- | -------- | ----- | ---------- | -------- |
| example-pkg | High | Prototype pollution | >=2.1.0 | https://github.com/advisories/GHSA-xxxx |

Step 4 — Assess test coverage

Run the test suite with coverage:

npx vitest run --coverage   # Vite/Vitest projects
npx jest --coverage          # Jest projects
npm test -- --coverage       # generic fallback

Record the output: framework, pass/fail counts, statement/branch/function/line percentages. If the test suite fails to run or no framework is configured, note explicitly — absence of a working test setup is itself a finding for criterion 1.4.

Hard gate: Overall line coverage must be ≥ 80% for the app to be approved. Coverage applies to all .ts/.tsx files under src/ (excluding test files, type declarations, and main.tsx). Apps that exclude production files from coverage measurement must be re-measured at full scope. Coverage exclusions hiding untested code are themselves a finding.

Also note per each non-trivial file: is there a corresponding .test.ts / .spec.ts? If not, mark Tests: ✗ in the file inventory. Assess whether hooks and services use dependency injection via React context rather than direct imports (vi.mock usage is a red flag unless justified with a comment).

Step 5 — Score the 12 criteria

Evaluate all 12 criteria using the rubric below. For each, state the score and one or two sentences of evidence (file paths, patterns observed, or gaps). Mark N/A with explanation if a criterion is not applicable.

1.1 No known bugs

Check: Open issues, obvious regressions, error paths untested, or TODO: fix in critical paths.

ScoreWhy
5No known defects; edge cases handled or explicitly out of scope with safe failure modes.
4Minor issues only; none affect core user flows or data integrity.
3Some known bugs or rough edges; workarounds exist or impact is limited.
2Material bugs, unreliable flows, or silent failures; users or data could be harmed.
1Broken primary flows, data corruption risk, or security-adjacent defects.

1.2 CDF access via Cognite SDK only

Check: All traffic to CDF goes through the official Cognite SDK. Flag any fetch/axios/raw REST to CDF-like URLs that bypasses the SDK.

ScoreWhy
5CDF usage is exclusively via the SDK; non-CDF calls are minimal and intentional.
4SDK used for CDF; one borderline or legacy call worth confirming.
3Mix of SDK and direct calls with weak justification.
2Repeated or critical paths use raw CDF HTTP instead of the SDK.
1Custom CDF clients, token handling outside SDK patterns, or undisclosed CDF endpoints.

1.3 Dependencies and packages

Check: package.json / lockfiles; unmaintained packages, duplicate majors, known CVEs, license issues, install scripts.

ScoreWhy
5Dependencies are current, trustworthy, and scanned; no red flags.
4Small updates or pinning tweaks needed; no serious supply-chain signals.
3Outdated or heavy deps; plan to upgrade documented.
2High-risk packages, excessive bundle, or unclear provenance.
1Known-vulnerable, deprecated, or malicious-adjacent dependency usage.

1.4 Test coverage

Check: Unit and integration tests for critical logic, API contracts, and error handling; CI runs tests; coverage adequate for risk.

Hard gate: Coverage tooling must be configured and working. Overall line coverage must be ≥ 80%.

ScoreWhy
5Critical paths well covered; ≥ 80% line coverage; failures easy to localize.
4Good coverage (≥ 80%) with a few gaps in secondary modules.
3Core flows partially tested; coverage below 80% or important branches missing.
2Sparse tests; coverage well below 80%; refactors would be unsafe.
1No meaningful automated tests, or coverage tooling not configured.

1.5 Dead code and maintainability

Check: Unused exports, unreachable branches, commented-out blocks, duplicate utilities.

Hard gate: Unreachable pages, entirely unused files, and significant dead code blocks must be removed before approval (score 1–2, must fix).

ScoreWhy
5Codebase is lean; dead paths removed or clearly feature-flagged with owners.
4Minor cruft; quick cleanup possible.
3Noticeable dead code or duplication; increases review and bug surface.
2Large unused areas or confusing structure; hard to reason about behavior.
1Unmaintainable tangle; dead code masks real execution paths.

1.6 Coding patterns and testability

Check: Dependency injection via React context; interface-based services; ViewModel pattern for pages.

// Preferred: injectable via context
const defaultDependencies = { useDataSource, useAnalytics };
export type UseMyHookContextType = typeof defaultDependencies;
export const UseMyHookContext = createContext<UseMyHookContextType>(defaultDependencies);
export function useMyHook() {
  const { useDataSource } = useContext(UseMyHookContext);
}
ScoreWhy
5DI via context throughout; interface-based services; ViewModel hooks for pages; type-safe mocks.
4Mostly good patterns; one or two hooks import directly but still testable.
3Mixed: some injectable, others import directly; vi.mock used frequently.
2Most hooks hard-code dependencies; tests rely on module mocking throughout.
1No DI; global singletons or direct SDK calls throughout.

2.1 DMS query patterns (search / query vs heavier paths)

Check: For read-heavy workloads where filters allow, prefer APIs that hit Elasticsearch (query or search on instances) rather than overusing list paths that stress Postgres.

ScoreWhy
5Read paths use the lightest suitable DMS API; writes and exceptions are documented.
4Mostly correct; one or two calls could move to query/search.
3Mix of appropriate and heavy reads; performance risk under load.
2Default pattern is heavier than needed; likely to stress Postgres or DMS.
1Systematic misuse of APIs contrary to platform guidance.

2.2 Server-side filtering

Check: Filters, limits, and projections applied in the API request — not by downloading large result sets and filtering in the browser.

ScoreWhy
5Payloads are tight; filtering expressed in queries; pagination is real, not simulated.
4Minor over-fetch; easy to tighten.
3Repeated patterns of client-side filtering on medium result sets.
2Large downloads with in-memory filtering; obvious scalability issue.
1Unbounded or near-unbounded reads with client-side reduction.

2.3 Limits, pagination, and prefetch

Check: Low, explicit limits; cursor- or page-based retrieval; no aggressive prefetch of pages the user may never need.

ScoreWhy
5Limits and paging match UX; no wasteful speculative loading.
4Reasonable; small tuning opportunities.
3Occasional high limits or redundant page fetches.
2Frequent large pages or prefetch storms.
1Unbounded lists, deep prefetch chains, or N+1 DMS patterns.

2.4 Rate of calls — do not hammer DMS

Check: Debouncing, batching, caching; no tight loops of identical requests; no re-fetching on every render without need.

ScoreWhy
5Request rate proportional to user intent; caching and deduplication in place.
4Mostly fine; a hot path could batch or debounce slightly.
3Chatty UI or polling without backoff; risk under concurrent users.
2Clear risk of overwhelming DMS or shared quotas.
1Tight loops, runaway polling, or duplicate parallel identical calls.

2.5 Throttling and 429 responses — backoff with jitter

Check: On 429 (or similar rate-limit signals), clients use exponential backoff with jitter, respect Retry-After, and avoid synchronized retries.

ScoreWhy
5Backoff + jitter implemented; retries bounded; behavior degrades gracefully.
4Backoff exists; jitter or caps could be improved.
3Naive fixed-interval retries or missing handling for throttling.
2Aggressive retries on 429; thundering herd risk.
1No handling; infinite or immediate tight retries on errors.

3.1 Aura design system

Check: Prefer Aura components and tokens for layout, forms, tables, feedback, and typography; custom UI should align with Aura spacing, color, and interaction patterns.

ScoreWhy
5Aura used consistently; custom pieces match design language.
4Mostly Aura; isolated custom widgets with acceptable alignment.
3Mix of ad-hoc UI and Aura; some inconsistency for users.
2Largely non-Aura; visually disconnected from platform.
1Clashing patterns, inaccessible controls, or no alignment with Aura when feasible.

Note: a score of 3 does not block initial approval. Accessibility gaps (missing aria-label, unlabeled inputs) should be escalated to Should Fix regardless of the Aura score.

Step 6 — Write code-review-report.md

Write reviews/code-review/feedback-round-<N>/code-review-report.md following this template:

# [App name] — Flows code review

This document is the platform review for [App name], conducted as part of the Cognite Flows app certification process.

## What this review covers

- **Protect the user and the customer** — no known bugs, correct SDK usage, healthy dependencies, adequate test coverage, and a clean codebase.
- **Protect Cognite services** — efficient DMS query patterns, server-side filtering, bounded pagination, and graceful rate-limit handling.
- **Protect the brand** — UI consistency with the Aura design system.

Scores are 1–5. A score of 1–2 on any criterion blocks approval. Score 3 is acceptable with tracked follow-up. Scores 4–5 are good.

## Path to approval

This review found **[N] must-fix item(s)** that block approval. [One sentence on should-fix count and overall state.] Once the must-fix items are addressed, re-run `flows-code-review`.

---

## Review details

### Summary
[2–4 sentences: positives first, then a clear statement of how many blocking issues and what they are]

### Reviewed commit
`<full SHA>`

### Test coverage
- **Framework:** [Vitest / Jest / None]
- **Tests run:** [N passed, N failed, N skipped]
- **Coverage:** [Statements: X% | Branches: X% | Functions: X% | Lines: X%]
- **Notable gaps:** [uncovered critical paths, or "N/A"]

### Package & security summary
- **Total packages:** [N] dependencies, [N] devDependencies
- **Health:** [N] pass, [N] warn, [N] fail
- **Vulnerabilities:** [N] critical, [N] high, [N] moderate, [N] low
- Full details: see `review-packages.md`

### Scores

| Area | Criterion | Score | Notes |
| ---- | --------- | ----- | ----- |
| User & customer | 1.1 Known bugs | /5 | |
| User & customer | 1.2 CDF via SDK | /5 | |
| User & customer | 1.3 Packages | /5 | |
| User & customer | 1.4 Tests & coverage | /5 | |
| User & customer | 1.5 Dead code | /5 | |
| User & customer | 1.6 Patterns & testability | /5 | |
| Cognite services | 2.1 DMS query patterns | /5 | |
| Cognite services | 2.2 Server-side filter | /5 | |
| Cognite services | 2.3 Limits & pages | /5 | |
| Cognite services | 2.4 Call rate | /5 | |
| Cognite services | 2.5 429 backoff | /5 | |
| Brand | 3.1 Aura | /5 | |

### Must fix before deploy
- [ ] [description] — [file:line] — [criterion ref]
  _Impact:_ [one sentence explaining the user/customer consequence]

### Should fix before deploy
- [ ] [description] — [file:line] — [criterion ref]

### Nice to fix before deploy
- [ ] [description] — [file:line] — [criterion ref]

Categorization:

  • Must fix (score 1–2): security issues, broken core flows, unbounded API calls, missing SDK usage for CDF, test coverage below 80% or tooling not configured, unreachable pages or significant dead code, pervasive hard-coded dependencies with no DI path. Each must-fix item must include an _Impact:_ note explaining the user/customer consequence.
  • Should fix (score 3): missing test files for non-trivial services/hooks/utils, missing tests for critical paths, client-side filtering of large datasets, missing backoff/retry, accessibility gaps, vi.mock overuse without justification.
  • Nice to fix (score 4 gaps): minor Aura inconsistencies, small cleanup, non-critical package updates, minor dead exports.

End the file with a machine-readable summary block (exact format required by flows-external-app-submit):

## Summary

- Must Fix open: <integer>
- Should Fix open: <integer>
- Nice Fix open: <integer>

Use exactly those labels and the singular line format. When all Must Fix items are resolved, the line MUST read Must Fix open: 0.

Step 7 — Verify

After writing all three artifacts, cross-check them:

  1. Re-read code-review-report.md, review-files.md, and review-packages.md.
  2. Verify every criterion 1.1–1.6, 2.1–2.5, 3.1 appears in the Scores table with an actual score (not blank).
  3. Verify every must-fix item (score 1–2) has an _Impact:_ note.
  4. Verify the Summary block is present and the integer counts match the actual number of open checklist items.
  5. Verify that no gh api calls or curl to raw.githubusercontent.com were made during this skill run — all guidance should have come from local skill reads (Step 1) and the inlined rubric above.

Print to the terminal after writing:

Must Fix open: <n>
Should Fix open: <n>
Nice Fix open: <n>

When to stop

Re-run this skill until Must Fix open: 0 in the latest round's code-review-report.md. Only then proceed to flows-design-review.

Repository
cognitedata/builder-skills
Last updated
Created

Is this your skill?

If you maintain this skill, you can claim it as your own. Once claimed, you can manage eval scenarios, bundle related skills, attach documentation or rules, and ensure cross-agent compatibility.