AI Code Review Needs a Threat Model

Reviewing AI-generated code as if it were merely junior code misses the point. The risk profile is different.

A junior engineer may misunderstand a system, but they can explain their path, remember the discussion, and learn the local consequences. A model can generate code that looks locally excellent while relying on assumptions that appear nowhere in the codebase.

The thesis

AI code review needs an explicit threat model: what kinds of harm are plausible when a machine produces a convincing patch?

Without that threat model, reviewers overfocus on visible style and underfocus on hidden behavior.

The production pattern

The dangerous AI patches are rarely absurd. They are reasonable in the abstract and wrong in the specific.

They add a cache without respecting invalidation semantics. They make a retry idempotent in the test but not in the real workflow. They simplify an authorization branch whose awkwardness encoded a product rule. They replace a boring local helper with a clever abstraction that does not match operational ownership.
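To make the retry case concrete, here is a minimal sketch. Everything in it is hypothetical: the Ledger class, the idempotency key, and the call_with_retry helper are names I made up for illustration, not code from any real system. It assumes a caller that loses the first response and sends the same request again.

```python
class Ledger:
    """Hypothetical in-memory payment ledger, used only for illustration."""

    def __init__(self):
        self.charges = []      # every charge that was actually applied
        self._seen_keys = {}   # idempotency key -> index of the applied charge

    def charge_naive(self, account, amount):
        # Every call applies a new charge, so a retried request charges twice.
        self.charges.append((account, amount))
        return len(self.charges) - 1

    def charge_idempotent(self, key, account, amount):
        # A repeated key returns the original charge instead of applying another.
        if key in self._seen_keys:
            return self._seen_keys[key]
        self.charges.append((account, amount))
        self._seen_keys[key] = len(self.charges) - 1
        return self._seen_keys[key]


def call_with_retry(operation, attempts=2):
    """Simulate a caller that loses the first response and repeats the request."""
    result = None
    for _ in range(attempts):
        result = operation()
    return result


naive = Ledger()
call_with_retry(lambda: naive.charge_naive("acct-1", 100))

safe = Ledger()
call_with_retry(lambda: safe.charge_idempotent("req-42", "acct-1", 100))

print(len(naive.charges), len(safe.charges))
```

A unit test that calls the charge function once cannot tell these two implementations apart. Only a test that replays the request can.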

These are not syntax failures. They are system interpretation failures.

The model

I use six threat categories for AI-generated changes.

Correctness drift: the code solves a nearby problem but changes edge-case behavior, ordering, time semantics, or compatibility.

Contract drift: the code changes an API, event, schema, permission rule, or error shape without naming the change.

Security drift: the code broadens access, mishandles untrusted input, weakens validation, logs sensitive data, or treats model output as trusted fact.

Operational drift: the code adds latency, retries, concurrency, background work, or failure coupling without updating observability and recovery paths.

Maintainability drift: the code introduces abstractions, generated patterns, or duplicated logic that future owners will struggle to debug.

Evidence drift: the tests pass but prove the wrong thing, overfit the generated implementation, or omit the failure mode that motivated the change.
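Evidence drift is the easiest of these to show in a few lines. The sketch below is an invented example: apply_discount and the rule that a total never goes negative are assumptions I am making for illustration. The first check restates the implementation and passes. The second states the requirement and fails.

```python
def apply_discount(total_cents, discount_cents):
    # Hypothetical generated implementation. It ignores the (assumed) product
    # rule that an order total must never drop below zero.
    return total_cents - discount_cents


def check_mirrors_implementation():
    # Passes, but it only restates the formula, so it proves nothing about the rule.
    assert apply_discount(1000, 300) == 1000 - 300


def check_requirement():
    # States the rule directly: an oversized discount must not produce a negative total.
    assert apply_discount(1000, 1500) >= 0


check_mirrors_implementation()

try:
    check_requirement()
    requirement_holds = True
except AssertionError:
    requirement_holds = False

print(requirement_holds)
```

A suite made only of checks like the first one can be fully green while the behavior that motivated the change is still broken.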

The review checklist:

What is the worst plausible production behavior if this assumption is wrong?
Which contracts changed without being called out?
What input is trusted that used to be checked?
What happens under retry, concurrency, timeout, and partial failure?
Are tests protecting requirements or implementation details?
Would rollback restore the old behavior cleanly?

Where this goes wrong

Threat modeling can become paranoia. Not every AI patch needs a security review, and treating every generated helper as a critical risk will slow teams down until they avoid the tool entirely.

The right level depends on blast radius. A local rendering cleanup and a permission-system change should not receive the same review.
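One way to make that proportionality explicit is to derive the review depth from the paths a patch touches. This is a rough sketch, not a recommendation of specific tiers: the directory prefixes, the tier names, and the choice of a standard default are all assumptions for illustration.

```python
from enum import IntEnum


class ReviewTier(IntEnum):
    LIGHT = 1
    STANDARD = 2
    DEEP = 3


# Hypothetical path prefixes; a real mapping would come from the team's own layout.
TIER_BY_PREFIX = {
    "auth/": ReviewTier.DEEP,
    "billing/": ReviewTier.DEEP,
    "jobs/": ReviewTier.STANDARD,
    "ui/": ReviewTier.LIGHT,
}


def review_tier(changed_paths, default=ReviewTier.STANDARD):
    """Return the highest tier required by any changed path."""
    tiers = []
    for path in changed_paths:
        matched = [tier for prefix, tier in TIER_BY_PREFIX.items()
                   if path.startswith(prefix)]
        # Unknown paths fall back to the default rather than to the lightest tier.
        tiers.append(max(matched) if matched else default)
    return max(tiers, default=default)


print(review_tier(["ui/button.py"]).name)
print(review_tier(["ui/button.py", "auth/roles.py"]).name)
```

The highest tier wins, so a rendering cleanup that also touches a permission file gets the deeper review.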

There is also a human bias problem. Once a reviewer knows a patch was AI-generated, they may either distrust it reflexively or accept it as if it had already been machine-checked. Both reactions are lazy. The threat model should discipline the review, not replace judgment.

What I do now

For any nontrivial generated patch, I ask the author or agent to identify touched contracts and expected failure modes. If the answer is empty, the review is not ready.
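That gate can be written down as a small check. The sketch below is hypothetical: PatchDeclaration and ready_for_review are illustrative names, and I am assuming that an explicit entry such as "none: comment-only change" counts as an answer while a blank entry does not.

```python
from dataclasses import dataclass, field


@dataclass
class PatchDeclaration:
    """Hypothetical record of what the author or agent declares about a patch."""
    touched_contracts: list = field(default_factory=list)
    expected_failure_modes: list = field(default_factory=list)


def ready_for_review(declaration):
    # Assumption: blank or whitespace-only entries are treated as no answer.
    contracts = [c for c in declaration.touched_contracts if c.strip()]
    failures = [f for f in declaration.expected_failure_modes if f.strip()]
    return bool(contracts) and bool(failures)


empty = PatchDeclaration()
filled = PatchDeclaration(
    touched_contracts=["order-created event payload"],
    expected_failure_modes=["consumer receives the event twice"],
)

print(ready_for_review(empty), ready_for_review(filled))
```

The check says nothing about whether the declared contracts are the right ones. It only keeps an unanswered question from reaching a reviewer.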

I pay special attention to code that crosses trust boundaries: user input, authentication, authorization, billing semantics, persisted data, background processing, external calls, and operational controls.

I also prefer negative tests. AI-generated test suites often prove that the new happy path works. I want tests showing that invalid, unauthorized, duplicated, delayed, or partially failed inputs behave correctly.
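Here is a small sketch of what I mean, using an invented coupon redemption rule. The redeem function, the CouponError type, and the coupon fields are all hypothetical. Each negative check asserts two things: the request is refused, and no state changes as a result.

```python
class CouponError(Exception):
    """Raised when a coupon redemption must be rejected."""


def redeem(coupon, user, redeemed):
    """Hypothetical redemption rule; `redeemed` is the set of codes already used."""
    if not coupon.get("code"):
        raise CouponError("invalid coupon")
    if coupon.get("owner") != user:
        raise CouponError("coupon belongs to another user")
    if coupon["code"] in redeemed:
        raise CouponError("coupon already redeemed")
    redeemed.add(coupon["code"])
    return coupon["value"]


def rejects(coupon, user, redeemed):
    """Return True if redemption is refused and `redeemed` is left unchanged."""
    before = set(redeemed)
    try:
        redeem(coupon, user, redeemed)
    except CouponError:
        return redeemed == before
    return False


def run_negative_checks():
    coupon = {"code": "SAVE5", "owner": "alice", "value": 500}
    redeemed = set()
    assert rejects({"code": "", "owner": "alice", "value": 500}, "alice", redeemed)  # invalid
    assert rejects(coupon, "mallory", redeemed)       # unauthorized
    assert redeem(coupon, "alice", redeemed) == 500   # the single happy path
    assert rejects(coupon, "alice", redeemed)         # duplicated


run_negative_checks()
```

Three of the four checks are negative. That ratio is closer to what I want from a generated suite than the reverse.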

The missing threat: misplaced authority

There is one threat category I call out separately in review because it hides inside otherwise clean code: misplaced authority.

Misplaced authority happens when generated code decides something that should have remained a policy, product, or operator decision. It might choose a default permission, infer a workflow status, convert uncertainty into a final answer, silence an alert, or treat a missing value as safe. The code may be syntactically ordinary. The problem is that it moved decision rights without saying so.

I look for this especially around conditionals. A branch that used to be awkward may have encoded a real exception. A null check may have preserved the difference between unknown and false. A manual approval may have existed because the system could not safely automate the last step.
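The null-check case fits in a few lines. In this hypothetical sketch, fraud_flag is None while screening has not finished. The function names and return values are invented for illustration.

```python
from typing import Optional


def decide_original(fraud_flag: Optional[bool]) -> str:
    # None means "screening has not finished", which is different from False.
    if fraud_flag is None:
        return "manual_review"
    return "reject" if fraud_flag else "approve"


def decide_simplified(fraud_flag: Optional[bool]) -> str:
    # Hypothetical cleanup: one branch fewer, but truthiness folds None into
    # False, so an unscreened order is now approved automatically.
    return "reject" if fraud_flag else "approve"


for flag in (True, False, None):
    print(flag, decide_original(flag), decide_simplified(flag))
```

The two functions agree on True and False, so a test suite that never passes None sees no difference. The simplified version has quietly taken the unknown case away from the human reviewer.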

The review question is simple: who was allowed to make this decision before the patch, and who makes it after? If the answer changed, the pull request needs to name that change directly.

This is also where AI assistance can overfit to elegance. Models often prefer simpler control flow, fewer branches, and consistent abstractions. Production systems sometimes keep ugly boundaries because they represent negotiated risk. Removing the ugliness without understanding the negotiation is not cleanup. It is an unauthorized design change.

When in doubt, I ask for the old decision rule in plain language before accepting the new one.

Closing takeaway

Review machine-generated code by threat category, not by polish: correctness, contracts, security, operations, maintainability, and evidence. Then ask who held each decision before the patch and who holds it after.