
Automate PR review with Claude Code and @claude
- Mark Kendall
- 19 hours ago
- 12 min read
The key is not “let AI approve everything.” The key is:
Local Claude Code helps the developer produce a better PR.
Pipeline automation proves the PR is safe.
Claude/GitHub/Azure review agents reduce human review effort.
The human reviewer makes the final judgment.
Claude Code already supports GitHub PR/issue automation through @claude mentions, GitHub Actions, and automatic code review workflows. Anthropic’s docs say Claude can analyze PRs, create PRs, implement features, fix bugs, and follow repo standards through CLAUDE.md; their Code Review feature uses multiple agents to look for different issue classes, verifies findings, ranks severity, and posts inline review comments.
Here is the full operating model I would put in place.
1. The PR review process should have 4 layers
Layer 1 — Developer local review before PR
This is where Claude Code is most valuable.
Before the developer opens the PR, they run Claude locally inside the repo and ask it to review the change against the story, the intent file, the repo standards, and the tests.
The developer should not just say:
review my code
They should say something closer to:
Review this branch before I open a PR.
Use the Jira story, the intent file, CLAUDE.md, and the existing repo patterns.
Check:
1. Did I implement only the requested scope?
2. Did I follow the repo architecture and naming conventions?
3. Are there missing tests?
4. Are there risky changes to auth, data, config, API contracts, or database behavior?
5. Are there obvious regressions?
6. Is the PR small enough and understandable?
7. What should I fix before opening the PR?
Do not rewrite code yet. Give me a prioritized review.
Then the developer fixes the obvious stuff before pushing.
This matters because the best PR review is the one that catches dumb mistakes before the team sees them.
Layer 2 — Automated checks on push
When the developer pushes the branch, normal CI should run immediately.
This is not where Claude should replace the basics. This is where you run deterministic checks:
Build
Unit tests
Integration tests
Lint
Format
Type check
Dependency scan
Secret scan
Static security scan
API contract tests
Migration checks
Coverage threshold
Claude should not be the first line of defense for things a machine can already prove.
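To make "deterministic" concrete, here is a minimal sketch of a local runner that mirrors the pipeline checks before a push. The commands in `EXAMPLE_CHECKS` are placeholders I made up so the script runs anywhere; swap in your real build, test, and lint commands.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    passed: bool
    output: str


def run_checks(checks: dict[str, list[str]]) -> list[CheckResult]:
    """Run each command in order and record pass/fail from its exit code."""
    results = []
    for name, command in checks.items():
        proc = subprocess.run(command, capture_output=True, text=True)
        results.append(
            CheckResult(
                name=name,
                passed=proc.returncode == 0,
                output=(proc.stdout + proc.stderr).strip(),
            )
        )
    return results


def all_green(results: list[CheckResult]) -> bool:
    """True only when every check exited with code 0."""
    return all(result.passed for result in results)


# Placeholder commands (assumption): replace with e.g. ["dotnet", "test"]
# or ["npm", "test"] for a real repo.
EXAMPLE_CHECKS = {
    "build": [sys.executable, "-c", "print('build ok')"],
    "unit tests": [sys.executable, "-c", "import sys; sys.exit(0)"],
}
```

The point of the design is that pass/fail comes from exit codes, not from anyone's interpretation, which is exactly what Claude should not be asked to replace.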
Layer 3 — AI PR review in GitHub or Azure DevOps
Once the PR opens, Claude should review the diff and post findings.
In GitHub, use Claude Code GitHub Actions or Claude Code Review. The GitHub Action can respond to @claude, review PRs, implement fixes, answer questions, and run on your GitHub runner.
The best setup is usually:
Small PRs: automatic Claude review
Large PRs: manual @claude review once
Security-sensitive PRs: required AI security review
External contributor PRs: require human approval before AI workflow runs
Anthropic’s Code Review supports automatic review once after PR creation, review after every push, or manual review using @claude review / @claude review once. That manual option is important because repeated review on every push can burn cost and create noise.
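The "best setup" list above can be written down as a small policy function. This is only a sketch: the 400-line threshold for a "small" PR and the field names are my assumptions, not something Anthropic or GitHub defines.

```python
from dataclasses import dataclass
from enum import Enum


class ReviewMode(Enum):
    AUTOMATIC = "automatic Claude review"
    MANUAL_ONCE = "manual @claude review once"
    SECURITY_REQUIRED = "required AI security review"
    HUMAN_APPROVAL_FIRST = "human approval before AI workflow runs"


@dataclass
class PullRequest:
    changed_lines: int
    security_sensitive: bool
    external_contributor: bool


# Hypothetical cut-off between "small" and "large"; tune per repo.
SMALL_PR_MAX_LINES = 400


def choose_review_mode(pr: PullRequest) -> ReviewMode:
    """Pick a review mode, checking the most restrictive rule first."""
    if pr.external_contributor:
        # Untrusted content should not reach the AI workflow unapproved.
        return ReviewMode.HUMAN_APPROVAL_FIRST
    if pr.security_sensitive:
        return ReviewMode.SECURITY_REQUIRED
    if pr.changed_lines <= SMALL_PR_MAX_LINES:
        return ReviewMode.AUTOMATIC
    return ReviewMode.MANUAL_ONCE
```

The ordering matters: an external contributor's PR is gated on human approval even when it is small or security-sensitive.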
Layer 4 — Human reviewer decision
The human reviewer still owns the merge.
Their job changes from “read every line from scratch” to:
Validate the intent
Validate the architecture
Validate the risky areas
Validate the tests
Validate Claude’s findings
Approve only if the evidence is clean
The reviewer should ask:
Does this PR satisfy the story?
Is the scope controlled?
Did the developer include evidence?
Did CI pass?
Did Claude find anything important?
Were important comments resolved?
Are the tests meaningful?
Is the code maintainable by this team?
That is the new human PR role.
Not less important. More focused.
2. What should happen locally vs pipeline vs GitHub vs Azure DevOps?
Here is the clean division.
| Area | Best place | Why |
| --- | --- | --- |
| Understanding the story | Local Claude Code | Developer needs fast interactive reasoning |
| Refactoring before PR | Local Claude Code | Safe before team review |
| Test generation | Local Claude Code first, CI second | Claude helps write tests; CI proves them |
| Build/test/lint/typecheck | Pipeline | Deterministic and repeatable |
| Security scan | Pipeline + AI security review | Need both rules and reasoning |
| PR explanation | Local + PR template | Developer should explain the change |
| Diff review | GitHub/Azure PR automation | Review happens where discussion lives |
| Architecture review | Claude + senior reviewer | AI assists, human decides |
| Approval | Human reviewer | Do not let AI be the final approver yet |
| Merge gate | GitHub/Azure branch policy | Enforce the process |
Simple rule:
Local is for making the PR better.
Pipeline is for proving it.
GitHub/Azure is for discussing it.
Humans are for approving it.
3. The agents I would create
Do not start with 20 agents. Start with 6 useful ones.
1. Intent Compliance Agent
Checks whether the PR actually implements the story.
It looks at:
Jira story
Intent file
Acceptance criteria
Changed files
Test evidence
It answers:
Implemented
Partially implemented
Out of scope
Missing acceptance criteria
This is huge for your intent-driven engineering model.
2. Architecture Agent
Checks whether the implementation follows the repo’s structure.
It looks for:
Wrong layer
Business logic in controller
Duplicated patterns
Bypassing shared services
Bad dependency direction
Breaking existing conventions
This is especially important in the .NET/DataSpring-style environment where teams may be working across multiple repos.
3. Test Quality Agent
Not just “are there tests?”
It checks:
Do tests cover the acceptance criteria?
Are edge cases covered?
Are tests meaningful or just shallow?
Did production code change without test changes?
Are integration/API tests needed?
This is important because CI can pass with bad tests. Recent research on GitHub Actions-era PRs warns that test-code review can become marginalized when teams rely heavily on CI, so you want a specific test-review lane, not just “tests passed.”
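One of these checks, "did production code change without test changes?", is cheap to automate before any AI gets involved. A minimal sketch follows; the naming conventions in `is_test_file` and the suffix list are assumptions about a typical .NET/TypeScript/Python repo and will need adjusting.

```python
from pathlib import PurePosixPath

# Assumed conventions; adjust to the repo.
TEST_DIR_NAMES = {"test", "tests", "spec", "__tests__"}
CODE_SUFFIXES = {".cs", ".ts", ".tsx", ".js", ".py"}


def is_test_file(path: str) -> bool:
    """Guess whether a path is test code from its folder or file name."""
    p = PurePosixPath(path)
    folders = [part.lower() for part in p.parts[:-1]]
    in_test_dir = any(
        folder in TEST_DIR_NAMES or folder.endswith(".tests") for folder in folders
    )
    # Case-sensitive on purpose so "latest.py" is not mistaken for a test.
    named_like_test = p.stem.endswith(
        ("Test", "Tests", ".test", ".spec", "_test")
    ) or p.stem.startswith("test_")
    return in_test_dir or named_like_test


def production_changes_without_tests(changed_files: list[str]) -> list[str]:
    """Return changed production files if the change touches no test file."""
    code = [f for f in changed_files if PurePosixPath(f).suffix in CODE_SUFFIXES]
    tests = [f for f in code if is_test_file(f)]
    production = [f for f in code if not is_test_file(f)]
    return production if production and not tests else []
```

A non-empty result is a prompt for the Test Quality Agent or the human reviewer, not an automatic failure: some changes legitimately need no new tests.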
4. Security Agent
Checks:
Auth bypass
Authorization gaps
Secrets
Injection
Unsafe deserialization
PII logging
Broken access control
Dependency risks
Dangerous config changes
Anthropic has a Claude Code Security Reviewer GitHub Action that is diff-aware, comments on PRs, works across languages, and focuses on semantic security findings, but their repo also warns it is not hardened against prompt-injection and should be used carefully on trusted PRs.
5. Data/API Contract Agent
For .NET/data-heavy enterprise apps, this one matters.
Checks:
Database migration risk
Schema compatibility
DTO/API contract changes
Backward compatibility
Null handling
Pagination/filtering/sorting behavior
Data import/export correctness
This is where many enterprise bugs live.
6. PR Summarizer / Reviewer Assistant
Creates a clean PR summary:
What changed
Why it changed
Files touched
Risk areas
Test evidence
Migration notes
Reviewer checklist
Known limitations
This helps the human reviewer move faster.
4. Sub-agents by repo type
For a multi-repo enterprise client, I would create repo-specific sub-agents.
For example:
frontend-reviewer
api-reviewer
database-reviewer
security-reviewer
test-reviewer
devops-reviewer
documentation-reviewer
architecture-reviewer
For .NET specifically:
dotnet-api-reviewer
entity-framework-reviewer
dependency-injection-reviewer
controller-service-pattern-reviewer
configuration-reviewer
logging-observability-reviewer
xunit-test-reviewer
For React:
react-component-reviewer
state-management-reviewer
accessibility-reviewer
api-client-reviewer
ui-regression-reviewer
For DevOps:
github-actions-reviewer
azure-pipelines-reviewer
terraform-reviewer
kubernetes-reviewer
secrets-config-reviewer
The point is not fancy names. The point is specialized review lenses.
Anthropic’s hook system supports lifecycle automation around Claude Code sessions, including events like session start, prompt submit, pre-tool-use, post-tool-use, file changes, and subagent start/stop. That means you can build local automation around Claude’s behavior, not just GitHub workflows.
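As an illustration of local automation, here is a sketch of a pre-tool-use hook script that refuses edits to protected paths. Treat the details as assumptions to verify against the current hooks reference: I am assuming the hook receives a JSON event on stdin with `tool_name` and `tool_input.file_path` fields, and that exit code 2 blocks the tool call. The protected globs are hypothetical.

```python
import json
import sys
from fnmatch import fnmatch

# Hypothetical protected paths for this repo.
PROTECTED_GLOBS = ["*.env", "*/migrations/*", "*/secrets/*"]

ALLOW = 0
BLOCK = 2  # assumption: this exit code blocks the tool call


def evaluate(event: dict) -> tuple[int, str]:
    """Decide whether a file-editing tool call should be blocked."""
    # Assumed field names: "tool_name" and "tool_input" -> "file_path".
    if event.get("tool_name") not in {"Edit", "Write"}:
        return ALLOW, ""
    file_path = event.get("tool_input", {}).get("file_path", "")
    for pattern in PROTECTED_GLOBS:
        if fnmatch(file_path, pattern):
            return BLOCK, f"{file_path} matches {pattern}; ask a human first."
    return ALLOW, ""


def main() -> int:
    """Read the hook event from stdin and report the decision."""
    code, message = evaluate(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code
```

To use it as a script, end the file with `sys.exit(main())` and register it as a hook command in the Claude Code settings. Keeping the decision in `evaluate` makes the rule testable without a live session.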
5. The actual PR workflow I would implement
Here is the end-to-end flow.
Step 1 — Developer pulls story and intent
Before coding:
Read Jira story
Read intent file
Read repo CLAUDE.md
Ask Claude to create implementation plan
Confirm files likely to change
Confirm tests required
Claude prompt:
Create an implementation plan for this story.
Use:
- existing repo patterns
- acceptance criteria
Return:
1. files likely to change
2. tests required
3. risks
4. questions/blockers
5. implementation sequence
Do not write code yet.
Step 2 — Developer codes with Claude Code
Claude can implement, but the developer stays in control.
For each feature:
small change
run tests
review diff
commit
repeat
Do not let Claude generate a 2,000-line PR without checkpoints.
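A checkpoint rule is easier to follow when something measures it. The sketch below parses the output of `git diff --numstat` (tab-separated added, deleted, path; binary files show `-`) and flags when the working change has outgrown a budget. The 300-line and 15-file limits are arbitrary defaults I picked for illustration.

```python
def parse_numstat(numstat: str) -> list[tuple[str, int, int]]:
    """Turn `git diff --numstat` text into (path, added, deleted) rows."""
    rows = []
    for line in numstat.splitlines():
        if not line.strip():
            continue
        added, deleted, path = line.split("\t", 2)
        # Binary files are reported as "-" for both counts.
        rows.append(
            (
                path,
                0 if added == "-" else int(added),
                0 if deleted == "-" else int(deleted),
            )
        )
    return rows


def checkpoint_needed(numstat: str, max_lines: int = 300, max_files: int = 15) -> bool:
    """True when the change is large enough that it should be committed or split."""
    rows = parse_numstat(numstat)
    total_lines = sum(added + deleted for _, added, deleted in rows)
    return total_lines > max_lines or len(rows) > max_files
```

Feed it the text from `git diff --numstat main...HEAD` (or the unstaged diff) after each Claude-assisted change and stop to review when it returns `True`.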
Step 3 — Local pre-PR review
Before pushing:
ask Claude to review the branch against main
run build
run tests
run lint
generate PR summary
The developer should fix everything obvious before the PR opens.
Step 4 — PR opens with required template
The PR template should force evidence.
Example:
## Story / Intent
Link:
## What changed
-
## Acceptance criteria covered
-
## Test evidence
- [ ] Unit tests
- [ ] Integration tests
- [ ] Manual test
- [ ] API contract test
- [ ] Migration tested
## Risk areas
- [ ] Auth/security
- [ ] Database
- [ ] API contract
- [ ] Config/secrets
- [ ] Performance
- [ ] UI behavior
## Claude local review completed?
- [ ] Yes
## Notes for reviewer
-
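A template only forces evidence if something checks that it was filled in. Here is a minimal sketch of a validator for the PR body, assuming the section titles from the template above and two rules of my own choosing: the story link must not be empty, and at least one test evidence box must be ticked.

```python
import re

REQUIRED_SECTIONS = [
    "Story / Intent",
    "What changed",
    "Acceptance criteria covered",
    "Test evidence",
    "Risk areas",
    "Claude local review completed?",
    "Notes for reviewer",
]


def split_sections(body: str) -> dict[str, str]:
    """Map each '## Title' heading to the text beneath it."""
    sections: dict[str, list[str]] = {}
    current = None
    for line in body.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {title: "\n".join(lines).strip() for title, lines in sections.items()}


def validate_pr_body(body: str) -> list[str]:
    """Return a list of problems; an empty list means the template is filled in."""
    sections = split_sections(body)
    problems = [f"missing section: {s}" for s in REQUIRED_SECTIONS if s not in sections]
    if "Test evidence" in sections and not re.search(
        r"- \[[xX]\]", sections["Test evidence"]
    ):
        problems.append("no test evidence box is checked")
    if "Story / Intent" in sections and not re.search(
        r"Link:[ \t]*\S+", sections["Story / Intent"]
    ):
        problems.append("story link is empty")
    return problems
```

Run as a required status check, this turns "PR template complete" from a reviewer's judgment call into a deterministic gate.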
Step 5 — GitHub/Azure pipeline runs
Required checks:
build
unit tests
integration tests
lint/format
typecheck
coverage
dependency scan
secret scan
SAST
container scan if applicable
migration validation
For GitHub:
on:
  pull_request:
    branches: [ main, develop ]
jobs:
  build-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup
        run: echo "install SDKs here"
      - name: Build
        run: echo "dotnet build or npm build"
      - name: Test
        run: echo "dotnet test or npm test"
For Azure DevOps, same idea:
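trigger:
  branches:
    include:
      - main
      - develop
# Note: this pr block applies to GitHub and Bitbucket Cloud repos.
# For Azure Repos Git, configure a build validation branch policy instead.
pr:
  branches:
    include:
      - main
      - develop
stages:
  - stage: Validate
    jobs:
      - job: BuildAndTest
        steps:
          - script: echo "restore/build/test here"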
Step 6 — Claude PR review runs
Use one of three modes:
Mode 1: automatic review on every PR
Mode 2: manual @claude review once
Mode 3: path-specific review only for risky files
For early teams, I would use:
Manual @claude review once for normal PRs
Automatic review for high-risk paths
Required security review for auth/data/config changes
Why? Because full automatic review on every push can create cost/noise. Anthropic’s docs explicitly provide @claude review once for one-time review when you do not want every subsequent push to trigger another review.
Step 7 — Human reviewer reviews evidence first
Reviewer sequence:
1. Read story and intent.
2. Read PR summary.
3. Check changed files.
4. Check CI results.
5. Read Claude findings.
6. Review tests.
7. Inspect risky areas manually.
8. Ask Claude targeted questions if needed.
9. Approve, request changes, or split PR.
Good reviewer prompt inside PR:
@claude review this PR specifically for:
1. missed acceptance criteria
2. architecture violations
3. missing tests
4. data/API contract risk
5. security issues
Ignore style nits unless they create maintainability risk.
Step 8 — Claude helps respond to review comments
After human or Claude comments:
@claude fix the issues from the review comments, but do not change public API behavior unless explicitly required.
Or:
@claude explain whether this reviewer comment is valid based on the existing repo patterns.
Step 9 — Final merge gate
Merge only when:
CI green
Important Claude findings resolved or accepted as false positives
Required human approval complete
PR template complete
No unresolved critical comments
Branch up to date
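The merge conditions above translate directly into a gate function. This is a sketch with field names I invented; in practice the values would come from your branch policy, the CI status API, and the PR's review threads.

```python
from dataclasses import dataclass


@dataclass
class PrState:
    ci_green: bool
    open_important_findings: int  # not resolved and not accepted as false positive
    human_approvals: int
    required_approvals: int
    template_complete: bool
    unresolved_critical_comments: int
    branch_up_to_date: bool


def merge_blockers(state: PrState) -> list[str]:
    """Return every reason the PR cannot merge yet; empty means it can."""
    blockers = []
    if not state.ci_green:
        blockers.append("CI is not green")
    if state.open_important_findings > 0:
        blockers.append("important Claude findings are still open")
    if state.human_approvals < state.required_approvals:
        blockers.append("required human approval is missing")
    if not state.template_complete:
        blockers.append("PR template is incomplete")
    if state.unresolved_critical_comments > 0:
        blockers.append("critical comments are unresolved")
    if not state.branch_up_to_date:
        blockers.append("branch is behind the target branch")
    return blockers
```

Returning the full list of reasons, rather than a single boolean, tells the developer everything left to fix in one pass.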
6. What GitHub Actions should you use?
Minimum stack:
actions/checkout
actions/setup-node or actions/setup-dotnet
dependency caching
test runner
coverage reporter
secret scanning
dependency scanning
CodeQL
Claude Code Action
Claude Code Security Review
For GitHub-native security:
CodeQL
Dependabot
Secret scanning
Branch protection
Required status checks
CODEOWNERS
For Claude-specific:
anthropics/claude-code-action
anthropics/claude-code-security-review
Claude Code Review, if available/enabled for the repo
The Claude Code Action repository says it supports PR/issue integration, code review, implementation, structured outputs, progress tracking, and running on your own GitHub runner.
A practical GitHub PR review workflow would look like this conceptually:
name: Claude PR Review
on:
  pull_request:
    types: [opened, synchronize, reopened, ready_for_review]
  issue_comment:
    types: [created]
  pull_request_review_comment:
    types: [created]
permissions:
  contents: read
  pull-requests: write
  issues: write
jobs:
  claude-review:
    if: github.event.pull_request.draft == false
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Claude Code Review
        # @main tracks the latest commit; pin a release tag or commit SHA for production use.
        uses: anthropics/claude-code-action@main
        with:
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: |
            Review this pull request.
            Focus on:
            - acceptance criteria
            - architecture
            - test quality
            - security
            - data/API contract risk
            - maintainability
            Do not nitpick formatting unless it creates real risk.
You would tune this per repo.
7. What should go in CLAUDE.md?
Every repo needs a strong CLAUDE.md. Claude follows project standards through CLAUDE.md in GitHub workflows, so this becomes part of your operating model.
Recommended sections:
# Repo Purpose
What this repo does.
# Architecture
Layers, boundaries, dependency rules.
# Coding Standards
Naming, patterns, error handling, logging.
# Testing Standards
Required test types and commands.
# Security Rules
Auth, secrets, PII, logging, validation.
# PR Rules
Small PRs, required summary, evidence.
# Do Not Do
No broad refactors.
No public contract changes without approval.
No database migration without migration notes.
No secrets in code.
No bypassing shared services.
# Commands
Build command.
Test command.
Lint command.
Run locally command.
For the new client, this is probably one of your biggest missing pieces.
A lot of teams say “Claude isn’t helping,” but Claude is being dropped into repos with no operating instructions, no intent files, no standards file, no review checklist, and no pipeline evidence.
That is not Claude failing.
That is the engineering system being under-specified.
8. GitHub vs Azure DevOps
GitHub
GitHub is better right now for Claude Code automation because Claude Code has official GitHub Actions integration, comment-based @claude workflows, PR annotations, and GitHub-native review flows.
Use GitHub for:
@claude PR review
automatic PR comments
Claude Code Action
security review action
CODEOWNERS
required checks
branch protection
Azure DevOps
Azure DevOps is still fine for enterprise CI/CD, especially in Microsoft/.NET shops.
Use Azure DevOps for:
build validation
release pipelines
environment approvals
deployment gates
test plans
work item traceability
enterprise governance
But if the repo is in Azure DevOps and you want Claude automation, you may need more custom scripting around Claude Code/Claude API/Agent SDK rather than the clean GitHub-native experience. Anthropic says Claude Code GitHub Actions is built on top of the Claude Agent SDK, which can be used to build custom automation workflows beyond GitHub Actions.
My practical recommendation:
If source control is GitHub: use GitHub Actions + Claude Code Action.
If source control is Azure Repos: use Azure Pipelines for deterministic checks, then call Claude through a scripted review job or custom bot.
If the enterprise already uses Azure Boards for work tracking but hosts its repos on GitHub: keep Boards for work tracking and GitHub for PR automation.
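For the Azure Repos case, the scripted review job is mostly prompt assembly around a diff. The sketch below shows that part only. `ask_claude` is a hypothetical callable you would implement with whatever client you use (Claude Code in headless mode, the API, or the Agent SDK); nothing here calls a real Anthropic interface, and the 60,000-character diff budget is an arbitrary assumption.

```python
from typing import Callable

# Hypothetical budget so a huge diff does not blow up the request.
MAX_DIFF_CHARS = 60_000

REVIEW_FOCUS = [
    "acceptance criteria",
    "architecture",
    "test quality",
    "security",
    "data/API contract risk",
    "maintainability",
]


def build_review_prompt(diff: str, standards: str) -> str:
    """Assemble the review request from the diff and the repo standards."""
    if len(diff) > MAX_DIFF_CHARS:
        diff = diff[:MAX_DIFF_CHARS] + "\n[diff truncated]"
    focus = "\n".join(f"- {item}" for item in REVIEW_FOCUS)
    return (
        "Review this pull request diff.\n"
        f"Focus on:\n{focus}\n"
        "Do not nitpick formatting unless it creates real risk.\n\n"
        f"Repo standards (from CLAUDE.md):\n{standards}\n\n"
        f"Diff:\n{diff}"
    )


def review_diff(diff: str, standards: str, ask_claude: Callable[[str], str]) -> str:
    """Send the prompt through the injected client; skip empty diffs."""
    if not diff.strip():
        return "No changes to review."
    return ask_claude(build_review_prompt(diff, standards))
```

A pipeline step would produce the diff with `git diff`, read CLAUDE.md, call `review_diff`, and post the returned text as a PR comment through the Azure DevOps REST API.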
9. How much automation is too much?
For this client phase, I would not start with full autonomous approval.
Start here:
Phase 1 — Human-led, AI-assisted
Developer uses Claude locally before PR
CI runs normal checks
Reviewer uses @claude review once
Human approves
This is the safest starting point.
Phase 2 — Required AI review for risky paths
Automatically run Claude review when PR touches:
auth
security
database migrations
API contracts
payment/financial logic
PII
config/secrets
infrastructure
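Deciding whether a PR "touches" one of these areas can be done with path patterns before any AI review is triggered. A minimal sketch follows; every glob in `RISKY_PATHS` is a made-up example of how a .NET repo might be laid out, so replace them with your own.

```python
from fnmatch import fnmatch

# Hypothetical path patterns per risk category; adjust to the repo layout.
RISKY_PATHS = {
    "auth": ["*/Auth/*", "*/auth/*"],
    "database migrations": ["*/Migrations/*", "*.sql"],
    "API contracts": ["*/Contracts/*", "*openapi*.yaml"],
    "config/secrets": ["*appsettings*.json", "*.env"],
    "infrastructure": ["*.tf", ".github/workflows/*"],
}


def risk_categories(changed_files: list[str]) -> list[str]:
    """Return the sorted risk categories hit by the changed files."""
    hit = set()
    for path in changed_files:
        for category, patterns in RISKY_PATHS.items():
            if any(fnmatch(path, pattern) for pattern in patterns):
                hit.add(category)
    return sorted(hit)
```

If the result is non-empty, the workflow runs the required Claude review; otherwise it leaves the PR on the cheaper manual path.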
Phase 3 — Agentic fix assistance
Claude can respond to comments and push suggested fixes, but the developer reviews them.
@claude fix the missing test coverage
@claude update the PR summary
@claude address the security finding
Phase 4 — Policy-based automation
Use structured outputs from Claude to classify PRs:
low risk
medium risk
high risk
needs architecture review
needs security review
needs data review
Then route reviewers automatically.
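Routing could look like the sketch below. The JSON shape (`risk` plus a `needs` list) and the reviewer group names are assumptions of mine; the real structure is whatever schema you ask Claude to return.

```python
import json

# Hypothetical reviewer groups; map these to your CODEOWNERS teams.
REVIEWER_GROUPS = {
    "architecture": "architecture-reviewers",
    "security": "security-reviewers",
    "data": "data-reviewers",
}
VALID_RISK = {"low", "medium", "high"}


def route_reviewers(classification_json: str) -> list[str]:
    """Turn a structured classification into an ordered list of reviewer groups."""
    data = json.loads(classification_json)
    risk = data.get("risk")
    if risk not in VALID_RISK:
        # Fail loudly rather than silently under-reviewing a PR.
        raise ValueError(f"unexpected risk level: {risk!r}")
    reviewers = ["default-reviewers"]
    for need in data.get("needs", []):
        group = REVIEWER_GROUPS.get(need)
        if group and group not in reviewers:
            reviewers.append(group)
    if risk == "high":
        reviewers.append("senior-reviewers")
    return reviewers
```

Validating the classification before acting on it matters here: a malformed or unexpected model output should stop the routing step, not fall through to the lightest review.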
Phase 5 — Semi-autonomous maintenance PRs
Allow Claude to open PRs for:
docs updates
dependency bumps
test cleanup
dead code removal
small refactors
lint fixes
Still human-approved.
10. The reviewer’s new checklist
Give every reviewer this:
# Human PR Reviewer Checklist
## 1. Intent
- Does the PR satisfy the story?
- Are all acceptance criteria covered?
- Is anything out of scope?
## 2. Scope
- Is the PR small enough?
- Are unrelated refactors mixed in?
- Should this be split?
## 3. Architecture
- Does it follow repo patterns?
- Are boundaries respected?
- Is business logic in the right layer?
## 4. Tests
- Are tests meaningful?
- Do they cover edge cases?
- Did the PR change behavior without adding tests?
## 5. Risk
- Auth/security?
- Database?
- API contract?
- Config/secrets?
- Performance?
- Logging/PII?
## 6. Evidence
- CI passed?
- Claude review completed?
- Important findings resolved?
- Manual testing documented?
## 7. Decision
- Approve
- Request changes
- Ask for split
- Escalate to architecture/security/data reviewer
11. My recommended target architecture
For your client, I would propose this as the “PR Intelligence Layer.”
Developer Workstation
└── Claude Code
    ├── story understanding
    ├── implementation planning
    ├── local review
    ├── test generation
    └── PR summary generation

Repository
├── CLAUDE.md
├── intent.md
├── PR template
├── CODEOWNERS
├── test commands
└── architecture rules

CI/CD Pipeline
├── build
├── test
├── lint
├── typecheck
├── coverage
├── dependency scan
├── secret scan
├── SAST
└── migration/API checks

GitHub / Azure PR Layer
├── Claude PR Review
├── Claude Security Review
├── path-specific reviewers
├── reviewer assignment
├── required checks
└── merge policy

Human Review
├── intent validation
├── architecture judgment
├── risk review
├── test review
└── approval
That is the model.
Not random Claude usage.
An engineered PR system.
12. The simple rollout plan
I would roll this out in 10 working days.
Days 1–2: Baseline
Pick one repo.
Add:
PR template
review checklist
local Claude review prompt
Days 3–4: CI hardening
Make sure every PR runs:
build
tests
lint
security scan
coverage
Days 5–6: Claude GitHub review
Add:
Claude Code Action
manual @claude review once
security review for risky paths
Days 7–8: Specialized review prompts
Create prompts/commands for:
intent review
architecture review
test review
security review
data/API review
Days 9–10: Metrics
Start tracking:
PR cycle time
review time
number of review comments
defects found before merge
defects found after merge
rework rate
CI failure rate
Claude findings accepted/rejected
That last metric is important. Anthropic’s Code Review UI even lets users rate findings with thumbs up/down; their docs say those reactions are collected after merge to tune the reviewer, though reactions do not trigger re-review.
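Two of these metrics, PR cycle time and the share of Claude findings accepted, are easy to compute once you log a record per merged PR. A minimal sketch, assuming a record shape I made up for illustration:

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import median
from typing import Optional


@dataclass
class PrRecord:
    opened: datetime
    merged: datetime
    findings_accepted: int
    findings_rejected: int


def median_cycle_hours(records: list[PrRecord]) -> float:
    """Median time from PR open to merge, in hours."""
    return median((r.merged - r.opened).total_seconds() / 3600 for r in records)


def acceptance_rate(records: list[PrRecord]) -> Optional[float]:
    """Share of Claude findings the team accepted; None when there were none."""
    accepted = sum(r.findings_accepted for r in records)
    total = accepted + sum(r.findings_rejected for r in records)
    return accepted / total if total else None
```

A falling acceptance rate is a signal to tighten the review prompt or CLAUDE.md; a rising one suggests the reviewer is earning its place in the flow.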
13. The big message to the team
Here is the line I would use with them:
We are not using Claude Code to replace PR review. We are using Claude Code to make every PR arrive cleaner, better explained, better tested, and easier for a human to approve.
That is the whole shift.
The developer should no longer throw a PR over the wall.
The PR should arrive with:
clear intent
small scope
test evidence
AI pre-review
CI evidence
risk notes
reviewer guidance
That is how you get from “Claude wrote some code” to “Claude improved the software delivery system.”
