Automate PR review with Claude Code and @claude

Mark Kendall
19 hours ago
The key is not “let AI approve everything.” The key is:

Local Claude Code helps the developer produce a better PR.

Pipeline automation proves the PR is safe.

Claude/GitHub/Azure review agents reduce human review effort.

The human reviewer makes the final judgment.

Claude Code already supports GitHub PR/issue automation through @claude mentions, GitHub Actions, and automatic code review workflows. Anthropic’s docs say Claude can analyze PRs, create PRs, implement features, fix bugs, and follow repo standards through CLAUDE.md; their Code Review feature uses multiple agents to look for different issue classes, verifies findings, ranks severity, and posts inline review comments.

Here is the full operating model I would put in place.

1. The PR review process should have 4 layers

Layer 1 — Developer local review before PR

This is where Claude Code is most valuable.

Before the developer opens the PR, they run Claude locally inside the repo and ask it to review the change against the story, the intent file, the repo standards, and the tests.

The developer should not just say:

review my code

They should say something closer to:

Review this branch before I open a PR.

Use the Jira story, the intent file, CLAUDE.md, and the existing repo patterns.

Check:

1. Did I implement only the requested scope?

2. Did I follow the repo architecture and naming conventions?

3. Are there missing tests?

4. Are there risky changes to auth, data, config, API contracts, or database behavior?

5. Are there obvious regressions?

6. Is the PR small enough and understandable?

7. What should I fix before opening the PR?

Do not rewrite code yet. Give me a prioritized review.

Then the developer fixes the obvious stuff before pushing.

This matters because the best PR review is the one that catches dumb mistakes before the team sees them.

Layer 2 — Automated checks on push

When the developer pushes the branch, normal CI should run immediately.

This is not where Claude should replace the basics. This is where you run deterministic checks:

Build

Unit tests

Integration tests

Lint

Format

Type check

Dependency scan

Secret scan

Static security scan

API contract tests

Migration checks

Coverage threshold

Claude should not be the first line of defense for things a machine can already prove.

Layer 3 — AI PR review in GitHub or Azure DevOps

Once the PR opens, Claude should review the diff and post findings.

In GitHub, use Claude Code GitHub Actions or Claude Code Review. The GitHub Action can respond to @claude, review PRs, implement fixes, answer questions, and run on your GitHub runner.

The best setup is usually:

Small PRs: automatic Claude review

Large PRs: manual @claude review once

Security-sensitive PRs: required AI security review

External contributor PRs: require human approval before AI workflow runs

Anthropic’s Code Review supports automatic review once after PR creation, review after every push, or manual review using @claude review / @claude review once. That manual option is important because repeated review on every push can burn cost and create noise.
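The routing above can be sketched as a small decision function. This is a minimal sketch under stated assumptions, not a definitive policy: the `PullRequest` fields, the 400-line cutoff, and the mode names are all hypothetical and would need tuning per repo.

```python
from dataclasses import dataclass


@dataclass
class PullRequest:
    """Minimal PR facts needed for routing (all fields are illustrative)."""
    lines_changed: int
    touches_sensitive_paths: bool
    from_external_contributor: bool
    human_approved_ai_run: bool = False


# Hypothetical cutoff between a "small" and a "large" PR; tune per repo.
SMALL_PR_MAX_LINES = 400


def choose_review_mode(pr: PullRequest) -> str:
    """Map a PR to one of the review modes described above."""
    # External contributions wait for a human before any AI workflow runs.
    if pr.from_external_contributor and not pr.human_approved_ai_run:
        return "hold-for-human-approval"
    # Security-sensitive changes always get the required security review.
    if pr.touches_sensitive_paths:
        return "required-security-review"
    # Small PRs are cheap to review automatically.
    if pr.lines_changed <= SMALL_PR_MAX_LINES:
        return "automatic-review"
    # Large PRs get a single, manually requested review to limit cost and noise.
    return "manual-review-once"
```

The order of the checks is the design choice that matters: trust comes first, then risk, then size.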

Layer 4 — Human reviewer decision

The human reviewer still owns the merge.

Their job changes from “read every line from scratch” to:

Validate the intent

Validate the architecture

Validate the risky areas

Validate the tests

Validate Claude’s findings

Approve only if the evidence is clean

The reviewer should ask:

Does this PR satisfy the story?

Is the scope controlled?

Did the developer include evidence?

Did CI pass?

Did Claude find anything important?

Were important comments resolved?

Are the tests meaningful?

Is the code maintainable by this team?

That is the new human PR role.

Not less important. More focused.

2. What should happen locally vs pipeline vs GitHub vs Azure DevOps?

Here is the clean division.

| Area | Best place | Why |
| --- | --- | --- |
| Understanding the story | Local Claude Code | Developer needs fast interactive reasoning |
| Refactoring before PR | Local Claude Code | Safe before team review |
| Test generation | Local Claude Code first, CI second | Claude helps write tests; CI proves them |
| Build/test/lint/typecheck | Pipeline | Deterministic and repeatable |
| Security scan | Pipeline + AI security review | Need both rules and reasoning |
| PR explanation | Local + PR template | Developer should explain the change |
| Diff review | GitHub/Azure PR automation | Review happens where discussion lives |
| Architecture review | Claude + senior reviewer | AI assists, human decides |
| Approval | Human reviewer | Do not let AI be the final approver yet |
| Merge gate | GitHub/Azure branch policy | Enforce the process |

Simple rule:

Local is for making the PR better.

Pipeline is for proving it.

GitHub/Azure is for discussing it.

Humans are for approving it.

3. The agents I would create

Do not start with 20 agents. Start with 6 useful ones.

1. Intent Compliance Agent

Checks whether the PR actually implements the story.

It looks at:

Jira story

Intent file

Acceptance criteria

Changed files

Test evidence

It answers:

Implemented

Partially implemented

Out of scope

Missing acceptance criteria

This is huge for your intent-driven engineering model.

2. Architecture Agent

Checks whether the implementation follows the repo’s structure.

It looks for:

Wrong layer

Business logic in controller

Duplicated patterns

Bypassing shared services

Bad dependency direction

Breaking existing conventions

This is especially important in the .NET/DataSpring-style environment where teams may be working across multiple repos.

3. Test Quality Agent

Not just “are there tests?”

It checks:

Do tests cover the acceptance criteria?

Are edge cases covered?

Are tests meaningful or just shallow?

Did production code change without test changes?

Are integration/API tests needed?

This is important because CI can pass with bad tests. Recent research on GitHub Actions-era PRs warns that test-code review can become marginalized when teams rely heavily on CI, so you want a specific test-review lane, not just “tests passed.”
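One of those checks, "did production code change without test changes?", is mechanical enough to run before the agent even starts. Here is a rough sketch, assuming the changed paths come from the PR diff. The file suffixes and test-naming conventions are assumptions, not something the repo standards above define.

```python
from pathlib import PurePosixPath

# Assumed conventions; adjust to the repo's real layout.
SOURCE_SUFFIXES = {".cs", ".ts", ".tsx", ".py"}
TEST_DIRS = {"test", "tests", "__tests__"}
TEST_NAME_SUFFIXES = ("Tests", "Test", "_test", ".test", ".spec")


def is_test_file(path: str) -> bool:
    """Heuristic: the file lives in a test directory or is named like a test."""
    p = PurePosixPath(path)
    for part in p.parts[:-1]:
        # Matches folders such as tests/ or a .NET project like Orders.Tests/.
        if part.lower() in TEST_DIRS or part.endswith(".Tests"):
            return True
    stem = p.stem  # "button.test.tsx" -> "button.test"
    return stem.startswith("test_") or stem.endswith(TEST_NAME_SUFFIXES)


def untested_source_changes(changed_paths: list[str]) -> list[str]:
    """Return changed source files when the PR touches no test file at all."""
    sources = [
        p for p in changed_paths
        if PurePosixPath(p).suffix in SOURCE_SUFFIXES and not is_test_file(p)
    ]
    has_tests = any(is_test_file(p) for p in changed_paths)
    return [] if has_tests else sources
```

A non-empty result does not prove the tests are missing. It only tells the Test Quality Agent and the human reviewer where to look first.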

4. Security Agent

Checks:

Auth bypass

Authorization gaps

Secrets

Injection

Unsafe deserialization

PII logging

Broken access control

Dependency risks

Dangerous config changes

Anthropic has a Claude Code Security Reviewer GitHub Action that is diff-aware, comments on PRs, works across languages, and focuses on semantic security findings. Their repo also warns that it is not hardened against prompt injection, so it should only be used to review trusted PRs.

5. Data/API Contract Agent

For .NET/data-heavy enterprise apps, this one matters.

Checks:

Database migration risk

Schema compatibility

DTO/API contract changes

Backward compatibility

Null handling

Pagination/filtering/sorting behavior

Data import/export correctness

This is where many enterprise bugs live.

6. PR Summarizer / Reviewer Assistant

Creates a clean PR summary:

What changed

Why it changed

Files touched

Risk areas

Test evidence

Migration notes

Reviewer checklist

Known limitations

This helps the human reviewer move faster.

4. Sub-agents by repo type

For a multi-repo enterprise client, I would create repo-specific sub-agents.

For example:

frontend-reviewer

api-reviewer

database-reviewer

security-reviewer

test-reviewer

devops-reviewer

documentation-reviewer

architecture-reviewer

For .NET specifically:

dotnet-api-reviewer

entity-framework-reviewer

dependency-injection-reviewer

controller-service-pattern-reviewer

configuration-reviewer

logging-observability-reviewer

xunit-test-reviewer

For React:

react-component-reviewer

state-management-reviewer

accessibility-reviewer

api-client-reviewer

ui-regression-reviewer

For DevOps:

github-actions-reviewer

azure-pipelines-reviewer

terraform-reviewer

kubernetes-reviewer

secrets-config-reviewer

The point is not fancy names. The point is specialized review lenses.

Anthropic’s hook system supports lifecycle automation around Claude Code sessions, including events like session start, prompt submit, pre-tool-use, post-tool-use, file changes, and subagent start/stop. That means you can build local automation around Claude’s behavior, not just GitHub workflows.
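As an illustration of local automation, here is a sketch of a pre-tool-use hook script that blocks edits to protected files. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input.file_path`, and that exit code 2 blocks the tool call; check both against the current hooks documentation. The protected globs are hypothetical.

```python
import json
import sys
from fnmatch import fnmatchcase

# Hypothetical protected paths; edits here should go through a human.
PROTECTED_GLOBS = ["*.env", "*/migrations/*", "*/appsettings.Production.json"]


def check_event(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event.

    Assumes the event carries `tool_name` and `tool_input.file_path`.
    """
    # Only file-editing tools are of interest; everything else passes.
    if event.get("tool_name") not in {"Edit", "Write"}:
        return 0, ""
    path = event.get("tool_input", {}).get("file_path", "")
    for pattern in PROTECTED_GLOBS:
        if fnmatchcase(path, pattern):
            # Assumed convention: exit code 2 blocks the call.
            return 2, f"Blocked edit to protected path: {path}"
    return 0, ""


def main() -> None:
    """Hook entry point: read the event from stdin and exit with the verdict."""
    code, message = check_event(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    sys.exit(code)


# When saved as the hook script, finish the file with a call to main().
# It is left uncalled here so check_event can be imported and tested.
```

Keeping the decision in `check_event` and the I/O in `main` means the rule can be unit-tested without a running Claude session.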

5. The actual PR workflow I would implement

Here is the end-to-end flow.

Step 1 — Developer pulls story and intent

Before coding:

Read Jira story

Read intent file

Read repo CLAUDE.md

Ask Claude to create implementation plan

Confirm files likely to change

Confirm tests required

Claude prompt:

Create an implementation plan for this story.

Use:

- intent.md

- CLAUDE.md

- existing repo patterns

- acceptance criteria

Return:

1. files likely to change

2. tests required

3. risks

4. questions/blockers

5. implementation sequence

Do not write code yet.

Step 2 — Developer codes with Claude Code

Claude can implement, but the developer stays in control.

For each feature:

small change

run tests

review diff

commit

repeat

Do not let Claude generate a 2,000-line PR without checkpoints.
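A checkpoint can be as simple as measuring the working diff. The sketch below sums the counts from `git diff --numstat` output. The 300-line limit is a hypothetical threshold, not a recommendation from any tool.

```python
# Hypothetical limit on changed lines between checkpoints; tune per team.
MAX_LINES_PER_CHECKPOINT = 300


def total_changed_lines(numstat: str) -> int:
    """Sum added and deleted lines from `git diff --numstat` output."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        added, deleted = parts[0], parts[1]
        # Binary files are reported as "-<TAB>-<TAB>path" and have no counts.
        if added.isdigit() and deleted.isdigit():
            total += int(added) + int(deleted)
    return total


def needs_checkpoint(numstat: str, limit: int = MAX_LINES_PER_CHECKPOINT) -> bool:
    """True when the diff has grown past the limit and should be reviewed."""
    return total_changed_lines(numstat) > limit
```

Feed it the output of `git diff --numstat main...HEAD` and stop to run tests, review, and commit when it returns true.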

Step 3 — Local pre-PR review

Before pushing:

claude review branch against main

run build

run tests

run lint

generate PR summary

The developer should fix everything obvious before the PR opens.

Step 4 — PR opens with required template

The PR template should force evidence.

Example:

## Story / Intent

Link:

## What changed

## Acceptance criteria covered

## Test evidence

- [ ] Unit tests

- [ ] Integration tests

- [ ] Manual test

- [ ] API contract test

- [ ] Migration tested

## Risk areas

- [ ] Auth/security

- [ ] Database

- [ ] API contract

- [ ] Config/secrets

- [ ] Performance

- [ ] UI behavior

## Claude local review completed?

- [ ] Yes

## Notes for reviewer
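A template only forces evidence if something checks it. The sketch below is one way a CI step could validate the PR description against the headings in the example template. The function names are illustrative, and the rule that at least one test-evidence box must be ticked is my assumption.

```python
import re

REQUIRED_HEADINGS = [
    "## Story / Intent",
    "## What changed",
    "## Acceptance criteria covered",
    "## Test evidence",
    "## Risk areas",
    "## Notes for reviewer",
]


def split_sections(body: str) -> dict[str, str]:
    """Map each '## ' heading line to the text beneath it."""
    sections: dict[str, str] = {}
    current = None
    for line in body.splitlines():
        if line.startswith("## "):
            current = line.strip()
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return sections


def template_problems(body: str) -> list[str]:
    """Return human-readable problems; an empty list means the body passes."""
    sections = split_sections(body)
    problems = [f"missing section: {h}" for h in REQUIRED_HEADINGS if h not in sections]
    evidence = sections.get("## Test evidence")
    # Assumed rule: at least one test-evidence checkbox must be ticked.
    if evidence is not None and not re.search(r"- \[[xX]\]", evidence):
        problems.append("no test evidence box is checked")
    return problems
```

A failing check should block the PR with the list of problems, so the developer fixes the description before a reviewer opens it.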

Step 5 — GitHub/Azure pipeline runs

Required checks:

build

unit tests

integration tests

lint/format

typecheck

coverage

dependency scan

secret scan

SAST

container scan if applicable

migration validation

For GitHub:

on:
  pull_request:
    branches: [ main, develop ]

jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup
        run: echo "install SDKs here"
      - name: Build
        run: echo "dotnet build or npm build"
      - name: Test
        run: echo "dotnet test or npm test"

For Azure DevOps, same idea:

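trigger:
  branches:
    include:
      - main
      - develop

# Note: the YAML pr: trigger applies to GitHub and Bitbucket Cloud repos.
# For Azure Repos Git, configure PR validation through a branch policy.
pr:
  branches:
    include:
      - main
      - develop

stages:
  - stage: Validate
    jobs:
      - job: BuildAndTest
        steps:
          - script: echo "restore/build/test here"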

Step 6 — Claude PR review runs

Use one of three modes:

Mode 1: automatic review on every PR

Mode 2: manual @claude review once

Mode 3: path-specific review only for risky files

For early teams, I would use:

Manual @claude review once for normal PRs

Automatic review for high-risk paths

Required security review for auth/data/config changes

Why? Because full automatic review on every push can create cost/noise. Anthropic’s docs explicitly provide @claude review once for one-time review when you do not want every subsequent push to trigger another review.

Step 7 — Human reviewer reviews evidence first

Reviewer sequence:

1. Read story and intent.

2. Read PR summary.

3. Check changed files.

4. Check CI results.

5. Read Claude findings.

6. Review tests.

7. Inspect risky areas manually.

8. Ask Claude targeted questions if needed.

9. Approve, request changes, or split PR.

Good reviewer prompt inside PR:

@claude review this PR specifically for:

1. missed acceptance criteria

2. architecture violations

3. missing tests

4. data/API contract risk

5. security issues

Ignore style nits unless they create maintainability risk.

Step 8 — Claude helps respond to review comments

After human or Claude comments:

@claude fix the issues from the review comments, but do not change public API behavior unless explicitly required.

Or:

@claude explain whether this reviewer comment is valid based on the existing repo patterns.

Step 9 — Final merge gate

Merge only when:

CI green

Important Claude findings resolved or accepted as false positives

Required human approval complete

PR template complete

No unresolved critical comments

Branch up to date
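Branch protection should enforce most of this, but it helps to state the gate as one explicit check. A minimal sketch follows; `PrState` and its fields are hypothetical and would be filled from the GitHub or Azure DevOps API in practice.

```python
from dataclasses import dataclass, field


@dataclass
class PrState:
    """Snapshot of the facts the merge gate cares about (illustrative)."""
    ci_green: bool
    approvals: int
    required_approvals: int
    template_complete: bool
    branch_up_to_date: bool
    # Important Claude findings that are neither fixed nor explicitly
    # accepted as false positives.
    open_important_findings: list = field(default_factory=list)
    unresolved_critical_comments: int = 0


def merge_blockers(pr: PrState) -> list:
    """Return every reason the PR cannot merge; empty means it can."""
    blockers = []
    if not pr.ci_green:
        blockers.append("CI is not green")
    for finding in pr.open_important_findings:
        blockers.append(f"Claude finding not resolved or dismissed: {finding}")
    if pr.approvals < pr.required_approvals:
        blockers.append("required human approval missing")
    if not pr.template_complete:
        blockers.append("PR template incomplete")
    if pr.unresolved_critical_comments:
        blockers.append("unresolved critical comments")
    if not pr.branch_up_to_date:
        blockers.append("branch is behind target")
    return blockers
```

Returning all blockers at once, rather than failing on the first, gives the developer one list to work through.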

6. What GitHub Actions should you use?

Minimum stack:

actions/checkout

actions/setup-node or actions/setup-dotnet

dependency caching

test runner

coverage reporter

secret scanning

dependency scanning

CodeQL

Claude Code Action

Claude Code Security Review

For GitHub-native security:

CodeQL

Dependabot

Secret scanning

Branch protection

Required status checks

CODEOWNERS

For Claude-specific:

anthropics/claude-code-action

anthropics/claude-code-security-review

Claude Code Review, if available/enabled for the repo

The Claude Code Action repository says it supports PR/issue integration, code review, implementation, structured outputs, progress tracking, and running on your own GitHub runner.

A practical GitHub PR review workflow would look like this conceptually:

name: Claude PR Review

on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]

permissions:
  contents: read
  pull-requests: write
  issues: write

jobs:
  claude-review:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Claude Code Review
        # In real use, pin to a release tag or commit SHA rather than @main.
        uses: anthropics/claude-code-action@main
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: |
            Review this pull request.

            Focus on:
            - acceptance criteria
            - architecture
            - test quality
            - security
            - data/API contract risk
            - maintainability

            Do not nitpick formatting unless it creates real risk.

You would tune this per repo.

7. What should go in CLAUDE.md

Every repo needs a strong CLAUDE.md. Claude follows project standards through CLAUDE.md in GitHub workflows, so this becomes part of your operating model.

Recommended sections:

# Repo Purpose

What this repo does.

# Architecture

Layers, boundaries, dependency rules.

# Coding Standards

Naming, patterns, error handling, logging.

# Testing Standards

Required test types and commands.

# Security Rules

Auth, secrets, PII, logging, validation.

# PR Rules

Small PRs, required summary, evidence.

# Do Not Do

No broad refactors.

No public contract changes without approval.

No database migration without migration notes.

No secrets in code.

No bypassing shared services.

# Commands

Build command.

Test command.

Lint command.

Run locally command.

For the new client, this is probably one of your biggest missing pieces.

A lot of teams say “Claude isn’t helping,” but Claude is being dropped into repos with no operating instructions, no intent files, no standards file, no review checklist, and no pipeline evidence.

That is not Claude failing.

That is the engineering system being under-specified.
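Checking for under-specification can itself be automated. This small sketch reports which of the recommended sections a CLAUDE.md is missing. The exact heading names come from the list above and are a convention of this article, not something Claude Code requires.

```python
REQUIRED_SECTIONS = [
    "Repo Purpose",
    "Architecture",
    "Coding Standards",
    "Testing Standards",
    "Security Rules",
    "PR Rules",
    "Do Not Do",
    "Commands",
]


def missing_sections(claude_md: str) -> list[str]:
    """List recommended top-level sections absent from a CLAUDE.md text."""
    # Collect "# Heading" lines, ignoring case and surrounding whitespace.
    present = {
        line[2:].strip().lower()
        for line in claude_md.splitlines()
        if line.startswith("# ")
    }
    return [s for s in REQUIRED_SECTIONS if s.lower() not in present]
```

Run it across every repo in the client's estate, for example with `missing_sections(Path("CLAUDE.md").read_text())`, to see where the operating instructions are thinnest.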

8. GitHub vs Azure DevOps

GitHub

GitHub is better right now for Claude Code automation because Claude Code has official GitHub Actions integration, comment-based @claude workflows, PR annotations, and GitHub-native review flows.

Use GitHub for:

@claude PR review

automatic PR comments

Claude Code Action

security review action

CODEOWNERS

required checks

branch protection

Azure DevOps

Azure DevOps is still fine for enterprise CI/CD, especially in Microsoft/.NET shops.

Use Azure DevOps for:

build validation

release pipelines

environment approvals

deployment gates

test plans

work item traceability

enterprise governance

But if the repo is in Azure DevOps and you want Claude automation, you may need more custom scripting around Claude Code/Claude API/Agent SDK rather than the clean GitHub-native experience. Anthropic says Claude Code GitHub Actions is built on top of the Claude Agent SDK, which can be used to build custom automation workflows beyond GitHub Actions.

My practical recommendation:

If source control is GitHub: use GitHub Actions + Claude Code Action.

If source control is Azure Repos: use Azure Pipelines for deterministic checks, then call Claude through a scripted review job or custom bot.

If the enterprise already uses Azure Boards but hosts its repos on GitHub: keep Boards for work tracking and GitHub for PR automation.
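For the Azure Repos case, the custom bot mostly comes down to posting the review text back onto the PR. The sketch below only builds the request. It is based on my reading of the Azure DevOps REST API's pull request threads endpoint, so verify the URL shape, API version, and field values against the current reference before relying on it. All argument values are placeholders.

```python
import json
from urllib.parse import quote

# Assumed API version; check which version the organization supports.
API_VERSION = "7.1"


def thread_request(organization: str, project: str, repository_id: str,
                   pull_request_id: int, markdown: str) -> tuple[str, str]:
    """Build (url, json_body) for creating a comment thread on a PR.

    Sending the request (with a PAT or pipeline token) is left to the caller.
    """
    url = (
        f"https://dev.azure.com/{quote(organization)}/{quote(project)}"
        f"/_apis/git/repositories/{quote(repository_id)}"
        f"/pullRequests/{pull_request_id}/threads?api-version={API_VERSION}"
    )
    body = {
        "comments": [
            # Assumed values: commentType 1 is a plain text comment.
            {"parentCommentId": 0, "content": markdown, "commentType": 1}
        ],
        "status": 1,  # assumed: 1 marks the thread as active
    }
    return url, json.dumps(body)
```

The review text itself would come from a scripted Claude call earlier in the same pipeline job. Separating "produce the review" from "post the review" keeps each step easy to test.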

9. How much automation is too much?

For this client phase, I would not start with full autonomous approval.

Start here:

Phase 1 — Human-led, AI-assisted

Developer uses Claude locally before PR

CI runs normal checks

Reviewer uses @claude review once

Human approves

This is the safest starting point.

Phase 2 — Required AI review for risky paths

Automatically run Claude review when PR touches:

auth

security

database migrations

API contracts

payment/financial logic

PII

config/secrets

infrastructure
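Path-based triggering can be sketched as a glob-to-label map. The patterns below are hypothetical examples for a .NET-style layout; the real ones depend entirely on how each repo is organized.

```python
from fnmatch import fnmatchcase

# Hypothetical glob -> risk label map; real paths depend on the repo.
RISKY_PATHS = {
    "src/Auth/*": "auth",
    "*/Migrations/*": "database migrations",
    "*/Contracts/*": "API contracts",
    "*/Payments/*": "payment/financial logic",
    "*appsettings*.json": "config/secrets",
    "infra/*": "infrastructure",
}


def risk_labels(changed_paths: list[str]) -> list[str]:
    """Return the sorted risk areas touched by the changed files."""
    labels = {
        label
        for path in changed_paths
        for pattern, label in RISKY_PATHS.items()
        if fnmatchcase(path, pattern)  # "*" also matches across "/" here
    }
    return sorted(labels)


def requires_ai_review(changed_paths: list[str]) -> bool:
    """True when the PR touches at least one risky path."""
    return bool(risk_labels(changed_paths))
```

The labels double as review instructions: pass them into the Claude prompt so the review concentrates on the areas the PR actually touched.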

Phase 3 — Agentic fix assistance

Claude can respond to comments and push suggested fixes, but the developer reviews them.

@claude fix the missing test coverage

@claude update the PR summary

@claude address the security finding

Phase 4 — Policy-based automation

Use structured outputs from Claude to classify PRs:

low risk

medium risk

high risk

needs architecture review

needs security review

needs data review

Then route reviewers automatically.
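Routing from a structured classification might look like the sketch below. The JSON shape, the reviewer group names, and the rule that high-risk PRs always get a senior reviewer are all assumptions for illustration; they are not an output format defined by Claude Code.

```python
import json

ALLOWED_RISK = {"low", "medium", "high"}

# Hypothetical reviewer groups keyed by classification flags.
REVIEWERS_BY_FLAG = {
    "needs_architecture_review": "architecture-reviewers",
    "needs_security_review": "security-reviewers",
    "needs_data_review": "data-reviewers",
}


def route_reviewers(classification_json: str) -> list[str]:
    """Turn a JSON classification into the reviewer groups to request.

    Assumed schema: {"risk": "low|medium|high", "needs_*": bool, ...}.
    """
    data = json.loads(classification_json)
    risk = data.get("risk")
    # Reject anything outside the agreed vocabulary instead of guessing.
    if risk not in ALLOWED_RISK:
        raise ValueError(f"unexpected risk level: {risk!r}")
    groups = [
        group for flag, group in REVIEWERS_BY_FLAG.items()
        if data.get(flag) is True
    ]
    # Assumed policy: high-risk PRs always get a senior reviewer as well.
    if risk == "high":
        groups.append("senior-reviewers")
    return groups
```

Validating the output strictly matters here: a classifier that returns an unknown label should fail loudly, not silently route the PR to nobody.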

Phase 5 — Semi-autonomous maintenance PRs

Allow Claude to open PRs for:

docs updates

dependency bumps

test cleanup

dead code removal

small refactors

lint fixes

Still human-approved.

10. The reviewer’s new checklist

Give every reviewer this:

# Human PR Reviewer Checklist

## 1. Intent

- Does the PR satisfy the story?

- Are all acceptance criteria covered?

- Is anything out of scope?

## 2. Scope

- Is the PR small enough?

- Are unrelated refactors mixed in?

- Should this be split?

## 3. Architecture

- Does it follow repo patterns?

- Are boundaries respected?

- Is business logic in the right layer?

## 4. Tests

- Are tests meaningful?

- Do they cover edge cases?

- Did the PR change behavior without adding tests?

## 5. Risk

- Auth/security?

- Database?

- API contract?

- Config/secrets?

- Performance?

- Logging/PII?

## 6. Evidence

- CI passed?

- Claude review completed?

- Important findings resolved?

- Manual testing documented?

## 7. Decision

- Approve

- Request changes

- Ask for split

- Escalate to architecture/security/data reviewer

11. My recommended target architecture

For your client, I would propose this as the “PR Intelligence Layer.”

Developer Workstation
└── Claude Code
    ├── story understanding
    ├── implementation planning
    ├── local review
    ├── test generation
    └── PR summary generation

Repository
├── CLAUDE.md
├── intent.md
├── PR template
├── CODEOWNERS
├── test commands
└── architecture rules

CI/CD Pipeline
├── build
├── test
├── lint
├── typecheck
├── coverage
├── dependency scan
├── secret scan
├── SAST
└── migration/API checks

GitHub / Azure PR Layer
├── Claude PR Review
├── Claude Security Review
├── path-specific reviewers
├── reviewer assignment
├── required checks
└── merge policy

Human Review
├── intent validation
├── architecture judgment
├── risk review
├── test review
└── approval

That is the model.

Not random Claude usage.

An engineered PR system.

12. The simple rollout plan

I would roll this out in 10 working days.

Days 1–2: Baseline

Pick one repo.

Add:

CLAUDE.md

PR template

review checklist

local Claude review prompt

Days 3–4: CI hardening

Make sure every PR runs:

build

tests

lint

security scan

coverage

Days 5–6: Claude GitHub review

Add:

Claude Code Action

manual @claude review once

security review for risky paths

Days 7–8: Specialized review prompts

Create prompts/commands for:

intent review

architecture review

test review

security review

data/API review

Days 9–10: Metrics

Start tracking:

PR cycle time

review time

number of review comments

defects found before merge

defects found after merge

rework rate

CI failure rate

Claude findings accepted/rejected

That last metric is important. Anthropic’s Code Review UI even lets users rate findings with thumbs up/down; their docs say those reactions are collected after merge to tune the reviewer, though reactions do not trigger re-review.
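Two of these metrics are easy to compute from a simple export of merged PRs. The sketch below assumes a hypothetical `PrRecord` shape; where the accepted and rejected counts come from (reactions, labels, a spreadsheet) is up to the team.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class PrRecord:
    """One merged PR, as exported from the team's tooling (illustrative)."""
    opened_at: datetime
    merged_at: datetime
    findings_accepted: int
    findings_rejected: int


def acceptance_rate(records: list) -> Optional[float]:
    """Share of Claude findings the team accepted; None if there were none."""
    accepted = sum(r.findings_accepted for r in records)
    total = accepted + sum(r.findings_rejected for r in records)
    return accepted / total if total else None


def median_cycle_hours(records: list) -> float:
    """Median open-to-merge time in hours (raises if records is empty)."""
    return median(
        (r.merged_at - r.opened_at).total_seconds() / 3600 for r in records
    )
```

A falling acceptance rate is the signal to revisit the review prompt or CLAUDE.md, since it usually means the reviewer is producing noise the team has learned to ignore.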

13. The big message to the team

Here is the line I would use with them:

We are not using Claude Code to replace PR review. We are using Claude Code to make every PR arrive cleaner, better explained, better tested, and easier for a human to approve.

That is the whole shift.

The developer should no longer throw a PR over the wall.

The PR should arrive with:

clear intent

small scope

test evidence

AI pre-review

CI evidence

risk notes

reviewer guidance

That is how you get from “Claude wrote some code” to “Claude improved the software delivery system.”
