
Architecture overview

When a pull request is opened (or updated), GitHub sends a webhook to MergeWatch. The event flows through a review pipeline that ends with review comments posted back to the PR. MergeWatch supports two deployment modes with different architectures. The review pipeline and agent behavior are identical in both.

SaaS architecture

This diagram shows the managed SaaS architecture running in MergeWatch’s AWS account.

Self-Hosted architecture

This diagram shows the self-hosted architecture running on your infrastructure with Docker.

Step-by-step flow

1. PR opened or updated

A developer opens a pull request or pushes new commits. GitHub fires a pull_request webhook event (opened, synchronize, or reopened).

2. Webhook received

SaaS: API Gateway receives the POST request and invokes the WebhookHandler Lambda (512 MB, 30 s timeout).
Self-Hosted: The Express server receives the POST request on the /webhook endpoint (port 3000).
In both modes, the handler validates the payload against the webhook secret using HMAC-SHA256. Invalid signatures are rejected with a 401.
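
The signature check described above can be sketched as follows. This is a minimal illustration, not MergeWatch's actual handler: the function name verifySignature is hypothetical, and it assumes the handler has access to the raw (unparsed) request body and to the X-Hub-Signature-256 header that GitHub sends with each delivery.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

/**
 * Returns true when `signatureHeader` (the X-Hub-Signature-256 value,
 * formatted as "sha256=<hex digest>") matches the HMAC-SHA256 of the raw body.
 * `verifySignature` is a hypothetical helper name used for illustration.
 */
function verifySignature(
  rawBody: string,
  signatureHeader: string | undefined,
  secret: string,
): boolean {
  if (!signatureHeader || !signatureHeader.startsWith("sha256=")) return false;

  const expected =
    "sha256=" + createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");

  const a = Buffer.from(expected, "utf8");
  const b = Buffer.from(signatureHeader, "utf8");

  // timingSafeEqual throws when lengths differ, so check the length first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

A handler built on this would respond with 401 whenever the function returns false. The digest must be computed over the exact bytes GitHub sent, so the check has to run before any JSON parsing or re-serialization of the body.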

3. Review pipeline starts

SaaS: The handler invokes the ReviewAgent Lambda asynchronously using InvocationType.Event (fire-and-forget). The ReviewAgent Lambda (1024 MB, 300 s timeout) picks up the event.
Self-Hosted: The Express server acknowledges the webhook, then runs the agent pipeline asynchronously in the same process.
In both modes, the PR diff is fetched from the GitHub API using the installation token, then fanned out to eight specialized agents.

4. Multi-agent parallel review

Eight agents run concurrently via Promise.all(); see "The multi-agent pipeline" below. Each agent receives the diff and returns structured JSON findings.

5. Orchestration

The orchestrator agent receives all findings, deduplicates overlapping comments, ranks them by severity and confidence, and produces a merge readiness score (1 to 5).

6. Results posted to GitHub

MergeWatch posts a Check Run with a status and conclusion, plus inline review comments on the relevant lines.
  • MergeWatch adds a :eyes: reaction to the PR when starting a review to signal that analysis is underway.
  • On re-review, MergeWatch dismisses stale reviews before posting new ones.
  • The summary comment is edited in place (not duplicated) on re-review.
SaaS: The review record is written to DynamoDB.
Self-Hosted: The review record is written to PostgreSQL.

The multi-agent pipeline

Eight specialized agents run in parallel. Each is a separate LLM invocation with a focused system prompt.
| Agent | Responsibility | Example finding |
| --- | --- | --- |
| Security | SQL injection, XSS, secrets in code, dependency vulnerabilities | "User input passed to exec() without sanitization" |
| Bug detection | Null derefs, off-by-ones, race conditions, logic errors | "Array index i + 1 can exceed arr.length" |
| Style | Naming conventions, dead code, missing types, readability | "Exported function processData has no JSDoc" |
| Error handling | Empty catch blocks, swallowed errors, unhandled promise rejections | "Promise rejection in fetchUser() is caught but silently ignored" |
| Test coverage | Missing tests for new public functions, untested edge cases | "New exported function validateInput() has no corresponding test" |
| Comment accuracy | Misleading or outdated code comments, incorrect JSDoc | "JSDoc says @returns string but function returns Promise<string>" |
| Summary | Human-readable summary of the PR's intent and scope | "Adds rate limiting to the /api/upload endpoint" |
| Diagram | Generates a Mermaid diagram of the changed control flow | Mermaid sequence or flowchart of the new code path |
All eight agents run via Promise.all() — the total latency is bounded by the slowest agent, not the sum.
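
The fan-out can be sketched as below. This is an illustrative sketch only: stubAgent, runAgents, and the simplified Finding type are hypothetical stand-ins, and a timer replaces the real LLM call so the latency behavior is visible.

```typescript
type Finding = { file: string; line: number; title: string };
type Agent = (diff: string) => Promise<Finding[]>;

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical stand-in for an LLM-backed agent: waits, then returns one finding.
function stubAgent(name: string, delayMs: number): Agent {
  return async (diff) => {
    await sleep(delayMs);
    return [{ file: "src/example.ts", line: 1, title: `${name}: read ${diff.length} chars` }];
  };
}

// Start every agent at once; results come back in the same order as `agents`.
async function runAgents(agents: Agent[], diff: string): Promise<Finding[][]> {
  return Promise.all(agents.map((agent) => agent(diff)));
}
```

With three stub agents delayed by 50, 100, and 150 ms, runAgents resolves after roughly 150 ms rather than 300 ms. Note that Promise.all rejects as soon as any one promise rejects, so a pipeline that needs to tolerate a single failing agent would have to catch errors per agent; whether MergeWatch does so is not stated here.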

Agent output schema

Every agent (except summary and diagram) returns an array of findings in this shape:
{
  "file": "src/api/handler.ts",
  "line": 42,
  "severity": "critical",
  "confidence": 92,
  "title": "Unsanitized input passed to shell exec",
  "description": "The `command` variable includes user-supplied input from `req.body.cmd` and is passed directly to `child_process.exec()`. This allows arbitrary command injection.",
  "suggestion": "Use `execFile()` with an explicit argument array, or validate `cmd` against an allowlist."
}
| Field | Type | Description |
| --- | --- | --- |
| file | string | Relative path to the file in the diff |
| line | number | Line number in the new version of the file |
| severity | string: "critical", "warning", or "info" | How urgent the finding is |
| confidence | number (1-100) | How confident the agent is that this is a real issue, not a false positive |
| title | string | One-line summary of the finding |
| description | string | Detailed explanation |
| suggestion | string | Recommended fix |
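
For consumers of this schema, the table translates fairly directly into a TypeScript type plus a runtime guard. The sketch below is an assumption about how one might validate agent output, not MergeWatch's own code; isFinding and parseFindings are hypothetical names, and the requirement that line be a positive integer is inferred rather than documented.

```typescript
type Severity = "critical" | "warning" | "info";

interface Finding {
  file: string;
  line: number;
  severity: Severity;
  confidence: number; // documented range: 1-100
  title: string;
  description: string;
  suggestion: string;
}

const SEVERITIES: readonly string[] = ["critical", "warning", "info"];

// Type guard: true only when every field has the documented type and range.
// Assumption: `line` must be a positive integer.
function isFinding(value: unknown): value is Finding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.file === "string" &&
    typeof v.line === "number" && Number.isInteger(v.line) && v.line >= 1 &&
    typeof v.severity === "string" && SEVERITIES.includes(v.severity) &&
    typeof v.confidence === "number" && v.confidence >= 1 && v.confidence <= 100 &&
    typeof v.title === "string" &&
    typeof v.description === "string" &&
    typeof v.suggestion === "string"
  );
}

// Parses a JSON array and keeps only well-formed findings.
// JSON.parse throws on malformed JSON; callers decide how to handle that.
function parseFindings(raw: string): Finding[] {
  const parsed: unknown = JSON.parse(raw);
  return Array.isArray(parsed) ? parsed.filter(isFinding) : [];
}
```

Filtering out malformed entries instead of rejecting the whole array is a design choice made for this sketch; a stricter pipeline could fail the agent's run when any entry is invalid.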

Orchestrator behavior

After all agents complete, the orchestrator:
  1. Deduplicates — if the security agent and the bug agent both flag the same line for the same root cause, only the higher-confidence finding is kept.
  2. Ranks — findings are sorted by severity (critical > warning > info), then by confidence descending.
  3. Scores — a merge readiness score from 1 to 5 is computed:
| Score | Criteria | Meaning |
| --- | --- | --- |
| 5 | No critical findings, at most 1 warning | No issues found. Ship it. |
| 4 | No critical findings, 2-3 warnings | Minor suggestions only. |
| 3 | No critical findings but 4+ warnings, OR exactly 1 critical finding | Warnings present. Review recommended before merge. |
| 2 | 2 or more critical findings | Critical findings present. Do not merge without addressing. |
| 1 | 3+ critical findings or a major security vulnerability | High-confidence critical findings. Merge blocked. |
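
The three orchestrator steps can be sketched as plain functions. Several things here are assumptions made for illustration: "same root cause" is approximated as "same file and line", the rows for scores 1 and 2 overlap so the sketch checks score 1 first, and "major security vulnerability" is passed in as a hypothetical boolean because its definition is not given. The function names are also hypothetical.

```typescript
type Severity = "critical" | "warning" | "info";

interface Finding {
  file: string;
  line: number;
  severity: Severity;
  confidence: number;
  title: string;
}

const SEVERITY_RANK: Record<Severity, number> = { critical: 0, warning: 1, info: 2 };

// Step 1: keep only the highest-confidence finding per location.
// Assumption: same file + line stands in for "same root cause".
function dedupe(findings: Finding[]): Finding[] {
  const best = new Map<string, Finding>();
  for (const f of findings) {
    const key = `${f.file}:${f.line}`;
    const current = best.get(key);
    if (!current || f.confidence > current.confidence) best.set(key, f);
  }
  return Array.from(best.values());
}

// Step 2: severity first (critical > warning > info), then confidence descending.
function rank(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) =>
      SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity] ||
      b.confidence - a.confidence,
  );
}

// Step 3: merge readiness score, checking the most severe row first.
function score(findings: Finding[], majorSecurityVulnerability = false): 1 | 2 | 3 | 4 | 5 {
  const critical = findings.filter((f) => f.severity === "critical").length;
  const warnings = findings.filter((f) => f.severity === "warning").length;
  if (critical >= 3 || majorSecurityVulnerability) return 1;
  if (critical === 2) return 2;
  if (critical === 1 || warnings >= 4) return 3;
  if (warnings >= 2) return 4;
  return 5;
}
```

Calling score(rank(dedupe(allFindings))) reproduces the table row by row. Because deduplication compares confidence only, a higher-confidence warning can replace a lower-confidence critical finding on the same line, which affects the score.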

Codebase awareness

MergeWatch supports agentic file fetching for cross-file context. When enabled, agents can request additional files from the repository via the GitHub API to understand surrounding context — such as imported modules, type definitions, or test files — beyond what the PR diff includes. Codebase awareness is controlled by three configuration options in .mergewatch.yml:
| Property | Type | Default | Description |
| --- | --- | --- | --- |
| codebaseAwareness | boolean | true | Enable agentic file fetching. When true, agents can request additional files for cross-file context. |
| maxFileRequestRounds | number | 1 | Maximum rounds of file fetching per agent. Each round lets the agent request more files based on what it learned in the previous round. Maximum value is 2. |
| maxContextKB | number | 256 | Maximum total size (in KB) of fetched files per agent. Prevents runaway context expansion on large repositories. |
Codebase awareness is enabled by default. To revert to diff-only analysis (faster, lower cost), set codebaseAwareness: false in your .mergewatch.yml.
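
To make the interaction between these three options concrete, here is a rough sketch of how defaults, the round cap, and the size budget could combine. It is a model of the documented behavior, not MergeWatch's implementation: resolveConfig, collectContext, and the FileRequester callback are hypothetical, and skipping a file that would exceed the budget (rather than stopping the round) is an assumption.

```typescript
interface CodebaseAwarenessConfig {
  codebaseAwareness: boolean;
  maxFileRequestRounds: number;
  maxContextKB: number;
}

// Defaults and cap taken from the table above.
const DEFAULTS: CodebaseAwarenessConfig = {
  codebaseAwareness: true,
  maxFileRequestRounds: 1,
  maxContextKB: 256,
};
const MAX_ROUNDS = 2;

// Apply defaults, then cap the number of rounds at the documented maximum.
function resolveConfig(
  overrides: Partial<CodebaseAwarenessConfig> = {},
): CodebaseAwarenessConfig {
  const merged = { ...DEFAULTS, ...overrides };
  return {
    ...merged,
    maxFileRequestRounds: Math.min(merged.maxFileRequestRounds, MAX_ROUNDS),
  };
}

type RequestedFile = { path: string; sizeKB: number };
// Hypothetical callback: which files does the agent ask for in this round?
type FileRequester = (round: number, fetched: RequestedFile[]) => RequestedFile[];

// Run up to maxFileRequestRounds rounds without exceeding maxContextKB in total.
function collectContext(
  config: CodebaseAwarenessConfig,
  requestFiles: FileRequester,
): RequestedFile[] {
  if (!config.codebaseAwareness) return []; // diff-only analysis
  const fetched: RequestedFile[] = [];
  let usedKB = 0;
  for (let round = 1; round <= config.maxFileRequestRounds; round++) {
    const wanted = requestFiles(round, fetched);
    if (wanted.length === 0) break;
    for (const file of wanted) {
      // Assumption: a file that would exceed the budget is skipped.
      if (usedKB + file.sizeKB > config.maxContextKB) continue;
      fetched.push(file);
      usedKB += file.sizeKB;
    }
  }
  return fetched;
}
```

Under this model, setting maxFileRequestRounds to 5 behaves the same as setting it to 2, and setting codebaseAwareness to false skips fetching entirely.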
Future plans: A planned enhancement will add an embedding index of the full repository, enabling agents to understand cross-file dependencies and architectural patterns without explicit file fetching.

Infrastructure details

Self-Hosted

Express Server (Docker)

  • Image: ghcr.io/santthosh/mergewatch:latest
  • Port: 3000
  • Webhook endpoint: /webhook
  • Role: Validate signature, fetch diff, invoke agents, run orchestrator, post results

PostgreSQL

  • Storage: Review history, installation config, repo settings
  • ORM: Drizzle ORM (packages/storage-postgres)
  • Managed by: Docker Compose sidecar

SaaS

WebhookHandler Lambda

  • Memory: 512 MB
  • Timeout: 30 seconds
  • Role: Validate HMAC-SHA256 signature, parse event, invoke ReviewAgent (async)

ReviewAgent Lambda

  • Memory: 1024 MB
  • Timeout: 300 seconds (5 min)
  • Role: Fetch diff, invoke agents, run orchestrator, post results

DynamoDB Tables

  • installations — GitHub App installation config, repo settings
  • reviews — Review history, agent findings, scores
  • Billing mode: On-demand (pay-per-request)

Amazon Bedrock

  • Default model: us.anthropic.claude-sonnet-4-20250514-v1:0
  • Role: Powers all eight review agents and the orchestrator

Data flow

Where your data goes depends on which deployment model you choose.

Self-Hosted

Everything stays on your infrastructure. MergeWatch has zero access. You run the Docker container, you own the data, you see every log. MergeWatch (the company) never sees your code, your diffs, your review results, or your LLM usage.

Managed SaaS

Everything runs in MergeWatch’s infrastructure. Your diff is processed by MergeWatch’s Lambda and sent to MergeWatch’s Bedrock. This is the fastest setup but offers the least data isolation.
What MergeWatch (the company) can see in each mode:
  • Self-hosted: Nothing. MergeWatch has no access to your infrastructure, your code, your diffs, or your review results. Zero telemetry is sent back.
  • Managed SaaS: MergeWatch sees everything: the diff, the LLM prompts and responses, and the review results. This is the trade-off for zero-setup convenience.
If your security posture requires that code never leave your infrastructure, use self-hosted.

Check Runs

MergeWatch uses the GitHub Check Runs API to report results. Each review creates a Check Run with:
  • status: queued → in_progress → completed
  • conclusion: success (score 4-5), neutral (score 3), or failure (score 1-2)
  • output.title: Merge readiness score and finding count
  • output.summary: The summary agent’s output plus the orchestrator’s ranked findings
This integrates with GitHub’s branch protection rules — you can require the MergeWatch check to pass before merging.

GitHub review events

MergeWatch maps merge-readiness scores to GitHub review events:
| Merge score | GitHub review event | Effect |
| --- | --- | --- |
| 4-5 | APPROVE | PR shows as approved by MergeWatch |
| 3 | COMMENT | PR shows review comments without approval |
| 1-2 | REQUEST_CHANGES | PR shows changes requested by MergeWatch |
This integrates with GitHub’s required reviewers — if MergeWatch is a required reviewer, low scores can block merging.
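
The Check Run conclusion and the review event are both derived from the same score bands, so the two mappings can be expressed as one small function. The sketch below uses the documented values; the function name toGitHubOutcome and its return shape are hypothetical.

```typescript
type Score = 1 | 2 | 3 | 4 | 5;
type Conclusion = "success" | "neutral" | "failure";
type ReviewEvent = "APPROVE" | "COMMENT" | "REQUEST_CHANGES";

// Maps a merge readiness score to the Check Run conclusion and review event.
function toGitHubOutcome(score: Score): { conclusion: Conclusion; reviewEvent: ReviewEvent } {
  if (score >= 4) return { conclusion: "success", reviewEvent: "APPROVE" };
  if (score === 3) return { conclusion: "neutral", reviewEvent: "COMMENT" };
  return { conclusion: "failure", reviewEvent: "REQUEST_CHANGES" };
}
```

A score of 3 yields neither a passing required review nor a failing check, so whether it blocks a merge depends on how your branch protection rules are configured.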

Quickstart

Install MergeWatch and get your first review in under 10 minutes.

Configuration

Customize agent behavior, skip rules, and model selection via .mergewatch.yml.