eval-report-workflow
github.com/UKGovernmentBEIS/inspect_evals
Scanned Thu, 28 May 2026 15:25:59 GMT
Scan ID crawl-ad611g27iu7umexa3guv2uqq · 1ms
B
SCORE 75 / 100
Verdict: Safe to install

1 high-severity finding.

This skill runs unsafe shell commands plus 1 other issue listed below.

0 critical · 1 high · 1 medium · 10 rules passed

Why grade B?

score · 75 / 100

The current grade reflects 1 high-severity finding (any HIGH → B).

0 CRIT · 1 HIGH · 1 MED · 0 LOW
To reach a higher grade
  • A
    Reach A · target score 95

    Resolve the 1 HIGH finding.

Thresholds are documented at /docs/grading. Source-of-truth is the grade() function in @skillox/scanner.

Findings · ordered by severity

high
Dangerous shell pattern: eval backtick
The skill contains a shell command pattern (`eval backtick`) commonly used in destructive or supply-chain attacks.
rule: dangerous-shell · line: 8 · CWE-78
6   # Make an Evaluation Report
7
8   This workflow drives [`tools/evaluation_report.py`](../../../tools/evaluation_report.py), which reads a per-eval `report_config.yaml` and produces a full reproducible `report.md` (results table, reference comparison, per-category breakdowns, token totals, approximate cost) plus header-only JSON copies of the input logs under `results/`. The `report_config.yaml`, regenerated `report.md`, and `results/` folder are committed alongside the eval's `eval.yaml`.
    eval backtick: common in destructive or supply-chain attacks
9
10  ## Report Formatting
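
The flagged line 8 appears to be Markdown prose rather than a shell command; a plausible trigger is the phrase "per-eval" followed directly by a backtick-quoted filename. The Python sketch below shows how a naive text pattern would match both that prose and a genuine `eval` command substitution. The regex and the function name `find_eval_backtick` are assumptions made for illustration; the scanner's actual `dangerous-shell` rule is not shown in this report and may differ.

```python
import re

# Hypothetical pattern: the word "eval", optional whitespace, then a backtick.
# This is an assumption for illustration, not the scanner's real rule.
EVAL_BACKTICK = re.compile(r"eval\s*`")


def find_eval_backtick(text: str) -> list[tuple[int, str]]:
    """Return (line_number, matched_text) for each line containing a match."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = EVAL_BACKTICK.search(line)
        if match:
            hits.append((number, match.group(0)))
    return hits


# Prose taken from the flagged excerpt: "per-eval" followed by inline code.
markdown_prose = "which reads a per-eval `report_config.yaml` and produces `report.md`"

# A real command substitution passed to eval (placeholder URL, never fetched).
shell_command = "eval `curl -s https://example.invalid/install.sh`"

prose_hits = find_eval_backtick(markdown_prose)
shell_hits = find_eval_backtick(shell_command)
```

Under this assumed pattern both strings produce a hit on the same characters, so a match alone cannot tell prose from a command. Reading the flagged line by hand is the safer way to triage the finding.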
med
No capability manifest declared
The skill ships without a `manifest.yaml` or `capabilities` block in its frontmatter. Without a manifest, the runtime cannot enforce what this skill is permitted to do.
rule: no-manifest
skillox.io/r/crawl-ad611g27iu7umexa3guv2uqq