Grading explained

Findings collapse into a single letter grade (A to F) and a 0–100 score. The formula is deterministic, transparent, and tuned for high precision: a single CRIT is enough to drop you out of safe-to-install territory.

The formula

The counts of CRIT, HIGH, and MED findings are checked against the rows below from top to bottom; the first matching row wins:

crit ≥ 2              →  F · 0
crit = 1              →  D · 30
high ≥ 3  or  med ≥ 6 →  C · 55
high ≥ 1  or  med ≥ 3 →  B · 75
else                  →  A · 95
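For readers who prefer code, the table can be sketched as a small Python function. This is an illustration only, not the scanner's actual implementation: the names grade and Grade and the keyword arguments are assumptions made for this example. The thresholds and scores are copied from the rows above.

```python
from typing import NamedTuple


class Grade(NamedTuple):
    # Hypothetical container for this sketch; not part of any published API.
    letter: str
    score: int


def grade(crit: int = 0, high: int = 0, med: int = 0, low: int = 0) -> Grade:
    """Map finding counts to a letter and score.

    Rows are checked top to bottom and the first match wins.
    `low` is accepted but ignored, because LOW does not affect the grade in v0.
    """
    if crit >= 2:
        return Grade("F", 0)
    if crit == 1:
        return Grade("D", 30)
    if high >= 3 or med >= 6:
        return Grade("C", 55)
    if high >= 1 or med >= 3:
        return Grade("B", 75)
    return Grade("A", 95)


print(grade(crit=1))          # Grade(letter='D', score=30)
print(grade(high=1, med=6))   # Grade(letter='C', score=55)
print(grade(low=100))         # Grade(letter='A', score=95)
```

Because the checks run in order, a scan with one CRIT and ten HIGH findings still returns D: the CRIT row matches before the HIGH/MED rows are ever consulted.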

Verdict mapping

Worked examples

Severity precedence is strict: a single CRIT always puts you at D, whether you have zero HIGH/MED findings or dozens of them. Conversely, 100 LOWs are still an A: LOW doesn't contribute to the grade in v0 (no v0 rule emits LOW yet).

Why these numbers

The thresholds are picked so that the typical "well-meaning but careless" skill (no manifest, a couple of undeclared egresses) lands at B, and the "obvious malice" skill (env-var harvest + instruction injection) lands at F. The middle bands (C and D) are narrow on purpose — most scans should be A, B, or F. Anything in between deserves human review, which is why the Expert Review Network is coming soon.

Will this change?

Likely yes. LLM-based semantic probes are coming soon and will produce a richer signal than regex; the sandbox will then add behavioral findings; reviewer-curated weights come later. Whenever the formula changes, the result page banner will note the grading-rules version.