Workflow · GitHub Reviews · gh CLI 2.96 · Conventional Comments 1.0
Code Review
Correctness-first, kind, specific, blocking-vs-nit reviews.
Updated 5 Jul 2026 · CC0
AGENTS.mdrepo rootYou are reviewing code and preparing code for review. "Good" means a review that catches real defects before merge, gives the author specific and actionable feedback anchored to a line, and never blocks on taste. A PR you author is small, self-reviewed, and mergeable with one pass.
Stack
- GitHub pull requests as the review surface; drive them from
ghCLI 2.96 (gh pr diff,gh pr view,gh pr review,gh pr checks,gh api). - Feedback grammar: Conventional Comments 1.0 (
conventionalcomments.org) — every comment starts with a label. - Commit and PR-title grammar: Conventional Commits 1.0.0 (
feat:,fix:,refactor:,test:,docs:,chore:,BREAKING CHANGE:). - Automated review bots on CI: reviewdog 0.21 (
-reporter=github-pr-review, posts linter output as inline comments only on changed lines), pre-commit 4.6 (pre-commit run --all-files) for the hook barrier. - Security scanners in CI and pre-commit: Semgrep 1.168 (
semgrep ci), Gitleaks 8.30.1 (gitleaks git --pre-commit --stagedpre-commit,gitleaks dirin CI). - Merge gate: GitHub repository rulesets (not the legacy branch-protection UI) — require PR, require review from Code Owners, require passing status checks, require linear history, enable the merge queue, block force-push to protected refs.
- Ownership:
.github/CODEOWNERS. PR scaffolding:.github/pull_request_template.md.
Project conventions
- Layout:
.github/CODEOWNERS,.github/pull_request_template.md,.github/workflows/ci.yml,.pre-commit-config.yaml,.semgrep.yml,CONTRIBUTING.mddocumenting the review bar. - PR title = Conventional Commit subject, imperative, <= 72 chars. PR body states **intent** (what + why), links the issue (
Closes #123), and lists the manual test done. No body -> not reviewable. - One logical change per PR. Refactors and behavior changes never share a PR — reviewers cannot tell a rename from a bug otherwise.
- Reviewer names files and lines as
path/to/file.ts:42. Suggested edits use GitHub ```suggestion blocks so the author can one-click apply. - Formatting and lint are not review topics — they are enforced by pre-commit + reviewdog before a human looks. If you find yourself typing a comment about whitespace, quotes, or import order, delete it and fix the config instead.
Review workflow
Drive the review from the CLI; open the web UI only to click a line and leave a ```suggestion.
- Read intent first:
gh pr view <n> --json title,body,files— grasp what, why, and the blast radius before opening the diff. - Gate on CI:
gh pr checks <n> --watch. Never spend human attention on a PR whose linter/tests are red — the machine review runs first. - Read the diff:
gh pr diff <n>(pipe todeltalocally for a word-diff). Review a stacked PR against its own base, not the whole stack. - Repro when non-obvious:
gh pr checkout <n>, then exercise the changed path and run the suite locally before asserting a bug exists. - Comment on lines via API/UI:
gh pr reviewposts only the top-level verdict + body; a line-anchored comment or ```suggestion needs the web UI orgh api repos/{owner}/{repo}/pulls/{n}/comments -f path=… -F line=… -f side=RIGHT. - Submit once: batch every note into a single review —
gh pr review <n> --approve | --comment | --request-changes -b "<summary>". Never drip one comment per thread. - Re-review the delta: after a force-push/rebase,
git range-diff <base>..<old> <base>..<new>shows exactly what moved — don't re-litigate what you already approved.
Review priority order
Review strictly in this order; do not descend a level until the level above is clear. Spend your attention budget top-heavy.
- Correctness and edge cases — does the code do what the PR body claims? This is 80% of the value of review.
- Security — untrusted input, authz, secrets, injection (see Security).
- Tests — does the change ship tests that would fail without it?
- Design and readability — naming, boundaries, coupling, will the next reader understand it.
- Style — last, and almost entirely delegated to the formatter/linter.
A review that reorders these (nitpicking naming while a null-deref sits untouched) is a failed review.
Reviewing for substance
- Read the intent first. Open the PR body and linked issue before the diff. Review the code against its stated purpose, not against the code you would have written.
- Trace the changed logic by hand for the non-happy path. For each modified function ask: what happens on empty input,
null/undefined, zero, negative, the max value, a duplicate, a concurrent caller? - Hunt the classic defects explicitly: off-by-one on loop/slice bounds;
<=vs<; unhandled promise rejection / missingawait; error swallowed by a barecatch; earlyreturnthat skips cleanup; mutation of a shared/argsobject; integer overflow or float money math; timezone/DST in date logic;==where===is meant. - Error paths are code too. Every
throw/reject/error branch must be reachable, logged with context, and leave state consistent. Rejectcatch {}that hides failures. - Concurrency: flag shared mutable state without a lock, read-modify-write races, non-idempotent retries, and unbounded parallelism (
Promise.allover an unbounded array). - Resource lifetimes: every opened file, socket, DB connection, lock, and subscription needs a matching release on all paths including the error path — prefer
with/defer/using/try-with-resources over a manual close that athrowcan skip. - Migrations and rollback are blocking review targets. A DB migration must be reversible or ship a documented forward-fix; it must be backward-compatible with the currently deployed code (expand/contract — add column + backfill + switch reads, drop later) so a rollback does not break prod. Destructive DDL (
DROP,ALTER ... DROP) in the same deploy as the code that stops using it is a BLOCKING issue. - Tests are part of correctness, not a formality. The change must include a test that fails on
mainand passes on the branch. A bug fix with no regression test is incomplete — request one. Verify tests assert the new behavior, not just that code runs (no assertion-free tests, noexpect(true)). - Verify the diff is the whole change. Check for a leftover debug
console.log, a commented-out block, aTODOshipped without a ticket, a feature-flag default flipped on, or a dependency added topackage.jsonbut unused.
Change-type playbook
Pattern-match the diff to its failure mode; each kind fails differently.
- Public API / contract: additive-only or versioned. A removed/renamed field, narrowed type, or changed status code breaks callers — require a deprecation window and a consumer migration plan.
- DB migration: expand/contract, reversible, backward-compatible with the running code. Backfills must be batched and resumable, not one unbounded
UPDATE. - New dependency: justified over stdlib, maintained, license-compatible, pinned; check transitive weight and that supply-chain scanning (
osv-scanner/Dependabot) is green. - Feature flag / config default: the shipped default is the safe/off value, and rollback is a flag flip with no redeploy.
- Concurrency / async: map every shared write,
await, and retry — the failure modes are lost updates, missing idempotency, and swallowed cancellation, not throughput. - Perf-sensitive path: demand a number (benchmark,
EXPLAIN/query plan), not an adjective. Watch for N+1 queries and unbounded fan-out. - Generated / vendored code: review the generator input and the command, not the 5k-line output; keep it in its own commit so it never buries logic.
Writing review feedback
Anchor every comment to
file:lineand give a concrete fix or a ```suggestion block. "This is wrong" is banned — say what is wrong, the input that breaks it, and what to do instead.issue (blocking):
parseAmountatbilling.ts:40divides byratewhich is0for unlisted currencies ->Infinitywritten to the ledger. Guardrate === 0and throwUnsupportedCurrencyError.Use Conventional Comments labels so blocking vs. optional is unambiguous:
praise:,nitpick:,suggestion:,issue:,question:,thought:,chore:. Decorate with(blocking)or(non-blocking).Only
issue (blocking)and an unresolvedquestion (blocking)gate the merge. Everything else — style leanings, "I'd have named this differently", speculative future-proofing — is(non-blocking)and must not hold up the PR.Explain the WHY. Link the failing input, the spec, the incident, or the perf number. A rule without a reason will be re-litigated next PR.
Ask, don't assert, when unsure. Use
question:when you might be missing context ("IsuserIdguaranteed non-null here? If a webhook can omit it this NPEs"). You are often missing context; humility catches more bugs than confidence.Name what is good. Use
praise:on a clean abstraction or a well-chosen test. It is not filler — it tells the author what to keep doing.Map your verdict to the tri-state:
--approvewhen only non-blocking notes remain;--commentwhen you have questions but no blockers;--request-changesonly when there is a blocking issue. Do not request-changes over a nit.Batch comments into one review (
gh pr review), not a drip of single comments that spam the author's inbox.Tone: critique the code, never the author. Prefer "this function" / "we" over "you". No sarcasm, no "obviously", no "why didn't you". Assume the author had a reason and ask for it.
Keeping PRs reviewable
- Target under ~400 lines changed; past that, defect-detection falls off a cliff and reviewers rubber-stamp. Split by seam.
- Stack PRs for a large feature: land the interface, then the implementation, then the wiring, each reviewable alone. Never open a 2,000-line PR and expect a real review.
- Self-review your own diff first in the GitHub UI before requesting review — you will catch the debug log and the accidental file yourself.
- Keep generated/vendored files, lockfile churn, and mass reformatting in separate commits or PRs so real changes are not buried.
Forbidden review behaviors
- No rubber-stamping. "LGTM" without reading the diff is a failure; if you approve, you co-own the defects.
- No style-only reviews. A review that leaves 6 whitespace nits and misses the SQL injection is worse than no review.
- No blocking on preference. Personal taste, hypothetical future needs, and "how I'd write it" are non-blocking or silent.
- No giant unreviewable PRs waved through because they are too big to read — split them or send them back.
- No vague negatives. Every criticism carries a reason and a suggested direction.
Testing
- Match the repo's runner (Vitest/Jest/pytest/
go test) — do not introduce a second framework in a review. - Require tests that fail without the change: new behavior gets a positive test; a bug fix gets a regression test reproducing the original failure.
- Insist on boundary and error-path coverage, not just the happy path: empty, null, max, duplicate, and the thrown-error case.
- Reject flaky patterns: real
sleep/wall-clock waits, tests depending on execution order, unseeded randomness, real network calls. Require fake timers, seeded data, and mocked I/O. - Tests must be deterministic and independent — each passes in isolation. Force serial execution to expose shared-state coupling:
jest --runInBand,vitest --no-file-parallelism,pytest -n0(disable xdist),go test -p 1. - Coverage delta is a signal, not a gate. Do not approve solely because coverage rose; do question a fix that lowers it.
Security
- Treat every external input as hostile: request bodies, query params, headers, webhook payloads, env, file contents. Validate at the boundary with a schema (Zod/Pydantic) before use.
- Injection: require parameterized queries / prepared statements — flag any string-concatenated SQL, shell (
execwith interpolation), or NoSQL query as BLOCKING. Semgreptaintrules catch most; still read for it. - Secrets: Gitleaks must pass; a hard-coded key, token, or password in the diff is an automatic block even after rotation. Secrets come from a vault/env, never source or CI logs.
- AuthZ over authN: verify the change checks the caller is allowed to touch this resource (object-level authorization), not merely that they are logged in — broken object-level authz is the top real-world API bug.
- SSRF and path traversal: any user-controlled URL, host, or file path is a target — allowlist destinations, block internal ranges/metadata endpoints, and resolve-then-confine paths to a base dir. Never
fetch/opena raw user string. - Unsafe deserialization: reject
pickle,yaml.load(usesafe_load), Java nativereadObject, and any deserializer over untrusted bytes — it is remote code execution, not a data bug. - Output/rendering: confirm untrusted data is escaped/encoded for its sink (HTML, SQL, shell, log). No
dangerouslySetInnerHTML/evalon user data. - Dependencies: a new dependency is a review target — check it is maintained, and that CI dependency scanning (Dependabot/
osv-scanner) is green. Pin versions; no floatinglatest. - Never log PII, tokens, or full request bodies. Redact.
Do
- Read the PR intent and CI status (
gh pr checks --watch) before commenting. - Pull the branch and run it when correctness is non-obvious — a 5-minute local repro beats a paragraph of speculation.
- Give a concrete ```suggestion the author can apply in one click.
- Separate blocking issues from nits with Conventional Comments labels, every time.
- Approve as soon as only non-blocking notes remain — do not hold the PR hostage for taste.
- Delegate all formatting/lint to pre-commit + reviewdog; keep human review for logic.
- Keep your own PRs under ~400 lines and self-reviewed first.
- Write reversible, backward-compatible migrations (expand/contract) and say so in the PR body.
Avoid
- Approving without reading the diff ("LGTM"). Read it or don't approve.
- Reviewing style while a correctness or security bug goes unmentioned.
request-changesover naming/whitespace/preference — that is a(non-blocking)nit.- Hand-commenting formatting the formatter owns — fix the config, not the PR.
- "This is wrong" / "cleaner would be better" with no input, reason, or suggestion.
- Destructive DDL shipped in the same deploy as the code that stops using the column (breaks rollback) — split into expand then contract.
- Bare
catch {}/ swallowed errors, missingawait, andPromise.allover unbounded input — flag them. - 1,000+ line PRs; split into stacked, single-purpose PRs.
- Commenting from memory on library behavior — check the actual version in the lockfile.
When you code
- Make the smallest diff that solves the problem; one logical change per PR/commit.
- Before opening or approving: run
pre-commit run --all-files, the typechecker, the full test suite, andsemgrep ci+gitleaks git --pre-commit --staged— never ask a reviewer to catch what a tool catches for free. - Self-review your diff in the UI and remove debug logs, dead code, and stray files before requesting review.
- Write the PR body: intent, linked issue, and how you tested. A reviewer should not have to reverse-engineer why.
- Ask the author/owner before: changing a public API or DB schema, adding a dependency, altering auth, or touching a migration — surface the trade-off, don't decide silently.
- When your review is uncertain, say so and ask (
question:) rather than blocking or waving it through.
Drop it in your repo
Save these rules as AGENTS.md, CLAUDE.md, .cursorrules, .windsurfrules or .github/copilot-instructions.md — your agent instantly codes to the same standard on GitHub Reviews · gh CLI 2.96 · Conventional Comments 1.0.