Severity & Confidence
How Secbez assigns severity using CVSS 3.1 and confidence based on evidence completeness, so teams can prioritize.
Every finding has two independent ratings: severity (how bad is it if exploited?) and confidence (how complete is the evidence?). Treat them as separate axes — a high-severity, low-confidence finding is not the same as a low-severity, high-confidence one.
Severity
Severity is derived from a CVSS 3.1 base score computed from the deterministic evidence (attack vector, attack complexity, privileges required, user interaction, scope, and impact on confidentiality, integrity, and availability). The numeric score is then bucketed:
| Level | CVSS Score | Meaning | Examples |
|---|---|---|---|
| Critical | 9.0 – 10.0 | Severe impact, exploitable remotely with no authentication. | Unauthenticated RCE, SQLi exposing all data |
| High | 7.0 – 8.9 | Significant impact, relatively easy to exploit. | Authenticated SQLi, IDOR on sensitive resources |
| Medium | 4.0 – 6.9 | Moderate impact, or requires specific conditions. | XSS requiring user interaction, missing auth on non-sensitive endpoint |
| Low | 0.1 – 3.9 | Limited impact or hard to exploit. | Information disclosure, missing security headers |
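As a rough illustration of the scoring and bucketing above, the sketch below computes a CVSS 3.1 base score from its base metrics and maps it to the levels in the table. The metric weights and rounding rule follow the public CVSS 3.1 specification; the function names (`base_score`, `severity_level`) are hypothetical and are not Secbez's API. The `"None"` bucket for a score of 0.0 comes from the CVSS specification and is not part of the table above.

```python
import math

# CVSS 3.1 base metric weights, as published in the specification.
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}   # attack vector
AC = {"L": 0.77, "H": 0.44}                        # attack complexity
UI = {"N": 0.85, "R": 0.62}                        # user interaction
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}             # C / I / A impact
# Privileges Required weights depend on scope (U = unchanged, C = changed).
PR = {
    "U": {"N": 0.85, "L": 0.62, "H": 0.27},
    "C": {"N": 0.85, "L": 0.68, "H": 0.5},
}


def roundup(value: float) -> float:
    """Round up to one decimal place using the CVSS 3.1 integer method."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (math.floor(scaled / 10000) + 1) / 10.0


def base_score(av, ac, pr, ui, scope, c, i, a) -> float:
    """Compute the CVSS 3.1 base score from single-letter metric values."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    if scope == "U":
        impact = 6.42 * iss
    else:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    exploitability = 8.22 * AV[av] * AC[ac] * PR[scope][pr] * UI[ui]
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if scope == "C":
        total *= 1.08
    return roundup(min(total, 10))


def severity_level(score: float) -> str:
    """Bucket a base score into the levels from the table (hypothetical helper)."""
    if score >= 9.0:
        return "Critical"
    if score >= 7.0:
        return "High"
    if score >= 4.0:
        return "Medium"
    if score >= 0.1:
        return "Low"
    return "None"  # 0.0 in the CVSS spec; not listed in the table above


# Unauthenticated, network-reachable, full C/I/A impact (e.g. unauthenticated RCE).
score = base_score("N", "L", "N", "N", "U", "H", "H", "H")
print(score, severity_level(score))  # 9.8 Critical
```

Changing only privileges required to low (`pr="L"`), as with an authenticated SQLi, gives 8.8 under the same formula, which falls in the High bucket.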
Confidence
Confidence reflects how complete the deterministic and graph evidence is, including whether an invariant agent reached a confirming verdict.
| Level | Meaning |
|---|---|
| High | Direct exploit path observable in code: attacker-controlled source, reachable sink, no compensating control. |
| Medium | The vulnerability is likely real but depends on one verifiable assumption (specific configuration, runtime condition). |
| Low | Multiple unverifiable assumptions. Often a structural pattern match without a verified taint or auth-chain proof. |
Findings where the agent reached an inconclusive verdict — for example, an IDOR candidate where the tenant predicate could not be located in the dominating path — are tagged low-confidence and routed to the needs-review lane rather than gating the PR.
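The confidence levels and the needs-review routing can be sketched as follows. This is a minimal illustration, not Secbez's implementation: the `Evidence` fields, the `Verdict` enum, the lane name `"pr-gate"`, and the choice to treat a single unverifiable assumption as low confidence are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    CONFIRMED = "confirmed"
    INCONCLUSIVE = "inconclusive"


@dataclass
class Evidence:
    # All field names are hypothetical; they mirror the wording of the table.
    attacker_controlled_source: bool
    reachable_sink: bool
    compensating_control: bool
    verifiable_assumptions: int      # e.g. a specific configuration
    unverifiable_assumptions: int
    verdict: Verdict


def confidence(ev: Evidence) -> str:
    """Map evidence completeness to a confidence level."""
    # An inconclusive agent verdict is always tagged low-confidence.
    if ev.verdict is Verdict.INCONCLUSIVE:
        return "low"
    direct_path = (
        ev.attacker_controlled_source
        and ev.reachable_sink
        and not ev.compensating_control
    )
    no_assumptions = (
        ev.verifiable_assumptions == 0 and ev.unverifiable_assumptions == 0
    )
    if direct_path and no_assumptions:
        return "high"
    if ev.verifiable_assumptions == 1 and ev.unverifiable_assumptions == 0:
        return "medium"
    # Anything else, including multiple unverifiable assumptions, is low.
    return "low"


def lane(level: str) -> str:
    """Low-confidence findings go to review instead of gating the PR."""
    return "needs-review" if level == "low" else "pr-gate"


idor_candidate = Evidence(
    attacker_controlled_source=True,
    reachable_sink=True,
    compensating_control=False,
    verifiable_assumptions=0,
    unverifiable_assumptions=0,
    verdict=Verdict.INCONCLUSIVE,  # tenant predicate could not be located
)
print(lane(confidence(idor_candidate)))  # needs-review
```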
How context affects scoring
Cross-file context directly influences the final score:
- A SQL injection sink reachable from an unauthenticated HTTP route raises both severity and confidence.
- A missing auth check on an endpoint that handles money or PII is rated more severe than one on a status page.
- A vulnerable function with no callers in the graph (dead code) is downgraded.
- A sanitizer or framework-level barrier in the dominating path (verified, not just named) downgrades the finding or suppresses it entirely.
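The rules above can be read as a small decision function over cross-file context. The sketch below only returns the direction of the adjustment, because the text does not say by how much a score moves. The `Context` fields, the returned labels, and the order in which the rules are checked are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Context:
    # Hypothetical flags a code graph might expose for a finding.
    reachable_from_unauthenticated_route: bool = False
    handles_money_or_pii: bool = False
    has_callers: bool = True
    verified_barrier_in_dominating_path: bool = False


def context_adjustment(ctx: Context) -> str:
    """Return the direction in which context moves a finding's rating."""
    # A verified sanitizer or framework barrier takes precedence (assumed).
    if ctx.verified_barrier_in_dominating_path:
        return "downgrade-or-suppress"
    # Dead code: no callers in the graph.
    if not ctx.has_callers:
        return "downgrade"
    # Exposure or sensitive data raises the rating.
    if ctx.reachable_from_unauthenticated_route or ctx.handles_money_or_pii:
        return "raise"
    return "unchanged"


print(context_adjustment(Context(reachable_from_unauthenticated_route=True)))
# raise
```

Checking the barrier first reflects one plausible reading: a verified control in the dominating path matters regardless of how exposed the route is.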
How to prioritize
Work in this order:
1. Critical, high confidence: confirmed, severe. Patch immediately.
2. High, high confidence: confirmed, high impact. Patch in the current sprint.
3. Critical / high, medium confidence: likely real, single assumption. Verify, then patch.
4. Needs-review findings: evidence was incomplete. A reviewer with codebase context should triage.
5. Medium / low: fold into routine maintenance.
Don't dismiss high-severity, low-confidence findings outright. They often represent real risk that the deterministic engine could not fully prove on its own — exactly the surface where a human reviewer adds value.
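One way to apply this ordering to an exported list of findings is a sort key that maps each severity and confidence pair to a tier. This is a sketch under stated assumptions: findings are represented as plain `(id, severity, confidence)` tuples, every low-confidence finding is treated as belonging to the needs-review lane, and the names `priority_tier` and `prioritize` are hypothetical.

```python
def priority_tier(severity: str, confidence: str) -> int:
    """Map a finding to a tier from 1 (patch first) to 5 (routine maintenance)."""
    if severity in ("critical", "high"):
        if confidence == "high":
            return 1 if severity == "critical" else 2
        if confidence == "medium":
            return 3
    # Assumption: any low-confidence finding is in the needs-review lane.
    if confidence == "low":
        return 4
    return 5


def prioritize(findings):
    """Sort (id, severity, confidence) tuples by tier; ties keep input order."""
    return sorted(findings, key=lambda f: priority_tier(f[1], f[2]))


findings = [
    ("F-1", "medium", "high"),
    ("F-2", "critical", "low"),
    ("F-3", "high", "high"),
    ("F-4", "critical", "medium"),
    ("F-5", "critical", "high"),
]
print([f[0] for f in prioritize(findings)])
# ['F-5', 'F-3', 'F-4', 'F-2', 'F-1']
```

Note that the critical, low-confidence finding `F-2` still sorts ahead of routine medium and low work, which matches the advice not to dismiss such findings.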