How Scanning Works
How Secbez finds real vulnerabilities — code graph, multi-agent reasoning, validation, and a deterministic-first pipeline.
A Secbez scan is a typed pipeline. Every finding has to defend itself with evidence before it reaches you. The short version of what happens:
- We parse your repository into a deep code graph — every function, class, route, call edge, and dataflow path. The graph sees your codebase as a connected system, not as isolated files.
- We reason over the graph with specialized AI agents. They explore the code, identify high-signal regions, and propose findings against a strict evidence contract.
- Every finding is validated before it survives. Findings that can't be defended with concrete evidence are dropped or flagged for review.
- Surviving findings are deduped against your baseline, evaluated against your policy, and published to GitHub and the dashboard with the supporting evidence and a fix plan.
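The steps above can be read as a chain of typed stages, each taking a list of findings and returning a list of findings. The following is a minimal sketch of that shape, not Secbez's implementation: the `Finding` fields, the stage names, and `run_pipeline` are all hypothetical, and deduplication here is only within a single list rather than against a stored baseline.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Finding:
    rule: str                        # hypothetical rule identifier
    location: str                    # hypothetical "path:line" string
    evidence: Tuple[str, ...] = ()   # concrete evidence items backing the finding


# Every stage has the same type, so stages can be composed in any order.
Stage = Callable[[List[Finding]], List[Finding]]


def validate(findings: List[Finding]) -> List[Finding]:
    """Keep only findings that carry at least one piece of evidence."""
    return [f for f in findings if f.evidence]


def dedupe(findings: List[Finding]) -> List[Finding]:
    """Drop repeats of the same (rule, location) pair, keeping the first."""
    seen = set()
    unique = []
    for f in findings:
        key = (f.rule, f.location)
        if key not in seen:
            seen.add(key)
            unique.append(f)
    return unique


def run_pipeline(findings: List[Finding], stages: List[Stage]) -> List[Finding]:
    """Apply each stage in order to the output of the previous one."""
    for stage in stages:
        findings = stage(findings)
    return findings
```

Calling `run_pipeline(candidates, [validate, dedupe])` on a list of candidate findings returns only the evidenced, unique ones.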
The code graph
The graph is the foundation. It lets us reason about your codebase as a whole: how data flows from request to response, which routes are reachable, which middlewares dominate which paths, which functions handle money or PII, and where authorization is enforced and where it isn't. Each scan operates on its own isolated snapshot, so concurrent scans never see each other's state.
Because dataflow is part of the graph itself, the analysis isn't a one-pass regex over text. It's structured reasoning over a real, queryable representation of your code.
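To illustrate what a query over such a graph might look like, here is a small sketch that asks whether any call path reaches a sensitive sink without passing through an authorization guard. The graph contents, node names, and the `unguarded_paths` helper are invented for the example; a real code graph would be far richer than an adjacency dictionary.

```python
from collections import deque

# Hypothetical call graph: each node maps to the nodes it calls.
CALLS = {
    "route:/admin/export": ["require_admin"],
    "require_admin": ["export_users"],
    "route:/debug/export": ["export_users"],
    "export_users": ["db.query"],
}


def unguarded_paths(graph, source, sink, guards):
    """Return every path from source to sink that passes through no guard node."""
    paths = []
    queue = deque([[source]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == sink:
            paths.append(path)
            continue
        for nxt in graph.get(node, []):
            # Skip guarded edges and avoid revisiting nodes on the same path.
            if nxt in guards or nxt in path:
                continue
            queue.append(path + [nxt])
    return paths
```

In this toy graph, the admin route reaches `db.query` only through `require_admin`, so it has no unguarded path, while the debug route reaches the same sink directly.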
Multi-agent reasoning
We use specialized AI agents to reason over the graph. Each agent is constrained: it can only confirm a finding when the evidence is concrete (attacker-controlled input, reachable sink, missing or bypassable enforcement, no compensating control). When the evidence is incomplete, the agent returns "inconclusive" — never "vulnerable." We refuse to ship a finding we can't defend.
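The evidence contract described above can be sketched as a function over four checks, where each check is either established, refuted, or unknown. The field names and the `verdict` function are hypothetical, and the third outcome (`"rejected"`, for a check that is definitely false) is an assumption of this sketch rather than something the text specifies.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evidence:
    # None means "not established either way"; field names are hypothetical.
    attacker_controlled_input: Optional[bool] = None
    reachable_sink: Optional[bool] = None
    enforcement_missing_or_bypassable: Optional[bool] = None
    no_compensating_control: Optional[bool] = None


def verdict(e: Evidence) -> str:
    checks = (
        e.attacker_controlled_input,
        e.reachable_sink,
        e.enforcement_missing_or_bypassable,
        e.no_compensating_control,
    )
    if all(c is True for c in checks):
        return "confirmed"      # every element of the contract is established
    if any(c is False for c in checks):
        return "rejected"       # assumption: a refuted check rules the finding out
    return "inconclusive"       # something is unknown, so never "vulnerable"
```

The key property is that `"confirmed"` is unreachable unless all four checks are positively established; missing evidence can only produce `"inconclusive"`.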
For high-stakes findings, an additional validation step re-examines the source and either upholds the finding or downgrades its severity. This is recall-preserving: validation never removes a finding, and it can downgrade one but never invent one.
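One way to picture the "downgrade only" rule is to clamp whatever severity the validator proposes to the original severity. The `Severity` levels and both helper functions below are assumptions made for illustration.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical severity scale; higher value means more severe.
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


def apply_validation(original: Severity, proposed: Severity) -> Severity:
    """Validation may keep or lower severity, never raise it."""
    return min(original, proposed)


def validate_all(findings, reexamine):
    """Re-examine each finding; the set of finding ids is unchanged.

    findings: dict mapping finding id -> Severity
    reexamine: callable mapping finding id -> proposed Severity
    """
    return {
        fid: apply_validation(sev, reexamine(fid))
        for fid, sev in findings.items()
    }
```

Because the output has exactly the same keys as the input, this step cannot add or remove findings; it can only leave a severity as it was or lower it.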
Deterministic-first
The pipeline is deterministic-first by design. AI cannot mint a finding without supporting deterministic evidence. Gating decisions never depend on AI availability — if models are unreachable, the scan still completes with the deterministic part of the pipeline. The contract for "what counts as a confirmed vulnerability" is fixed; only the reasoning over evidence is AI-driven.
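A rough sketch of the deterministic-first idea follows. It assumes candidates are dictionaries with an `evidence` list and that the AI step is an optional callable that raises `ConnectionError` when the model is unreachable; both are illustrative choices, not the product's actual interfaces.

```python
def gate(candidates, ai_review=None):
    """Return the findings that survive, with an optional AI note attached.

    Which candidates survive depends only on deterministic evidence.
    The AI step can annotate a finding but cannot create or remove one here.
    """
    results = []
    for c in candidates:
        if not c["evidence"]:
            # No deterministic evidence: nothing is minted, AI is not consulted.
            continue
        note = None
        if ai_review is not None:
            try:
                note = ai_review(c)
            except ConnectionError:
                # Model unreachable: the scan still completes without the note.
                note = None
        results.append({"id": c["id"], "evidence": c["evidence"], "ai_note": note})
    return results
```

Running `gate` with a working reviewer and with one that always raises yields the same set of finding ids; only the `ai_note` field differs.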
Baseline, policy, publish
After the gate, findings are matched against the repository baseline and your suppression rules. New findings (introduced by the change) gate the PR; baseline findings are tracked but don't gate it. A policy decision maps the result to pass / warn / fail, and the GitHub Check Run reflects that.
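The baseline and policy step might be sketched as two small functions: one that splits findings into new and tracked, and one that maps the new findings to a decision. The fingerprint fields, the severity names, and the `fail_at` threshold are assumptions for the example; the actual matching and policy rules are not described here.

```python
SEVERITY_ORDER = {"low": 0, "medium": 1, "high": 2, "critical": 3}  # hypothetical scale


def fingerprint(finding):
    """Hypothetical stable identity for a finding across commits."""
    return (finding["rule"], finding["path"], finding["symbol"])


def classify(findings, baseline, suppressed):
    """Split findings into (new, tracked); suppressed findings are skipped."""
    new, tracked = [], []
    for f in findings:
        fp = fingerprint(f)
        if fp in suppressed:
            continue
        (tracked if fp in baseline else new).append(f)
    return new, tracked


def decide(new_findings, fail_at="high"):
    """Map new findings to pass / warn / fail; baseline findings play no part."""
    if not new_findings:
        return "pass"
    worst = max(SEVERITY_ORDER[f["severity"]] for f in new_findings)
    return "fail" if worst >= SEVERITY_ORDER[fail_at] else "warn"
```

Only the `new` list is passed to `decide`, which mirrors the rule that baseline findings are tracked without gating the PR.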
Supported languages
The graph engine covers JavaScript, TypeScript, Python, Go, Rust, Java, C/C++, C#, Kotlin, PHP, Ruby, and Swift — each with full graph indexing and dataflow. See Supported Languages for the full matrix and Enterprise framework-coverage commitments.
Invariants
These are non-negotiable across every scan:
- Every finding carries a code location and a snippet.
- Budgets (file, byte, candidate, time) are enforced and surfaced — nothing is silently dropped.
- Secrets are redacted before any AI call.
- (repo, sha, mode) is the idempotency key, so retries are safe under at-least-once delivery.
- Gating never depends on AI availability.
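To show why an idempotency key makes retries safe, the sketch below stores one result per (repo, sha, mode) and returns the stored result when the same request is delivered again. The hashing scheme, the `ScanStore` class, and the in-memory dictionary are illustrative assumptions only.

```python
import hashlib


def idempotency_key(repo: str, sha: str, mode: str) -> str:
    """Derive one stable key from the (repo, sha, mode) triple."""
    # A NUL separator keeps ("ab", "c") distinct from ("a", "bc").
    raw = "\x00".join((repo, sha, mode))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class ScanStore:
    """In-memory stand-in for whatever durable store a real system would use."""

    def __init__(self):
        self._results = {}
        self.runs = 0  # how many scans actually executed

    def run_once(self, repo, sha, mode, scan):
        key = idempotency_key(repo, sha, mode)
        if key not in self._results:
            self.runs += 1
            self._results[key] = scan(repo, sha, mode)
        # A redelivered request gets the stored result instead of a second scan.
        return self._results[key]
```

Delivering the same request twice runs the scan once; changing any element of the triple, such as the mode, produces a different key and a separate scan.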