All 12 rules

Every v0 rule is a function of the parsed SKILL.md or of its repository's metadata. Pattern-based rules are pure: they only scan the text. Provenance-based rules call the GitHub API for repository metadata. Each rule emits findings with a severity, a line range, and (for line-based rules) ±2 lines of context.
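As a rough illustration of the finding shape described above, the sketch below models a finding and the ±2-line context window. The `Finding` class, its field names, and `context_window` are hypothetical names chosen for this example, not the scanner's actual types.

```python
from dataclasses import dataclass, field


@dataclass
class Finding:
    # Hypothetical record shape; field names are assumptions for illustration.
    rule: str
    severity: str  # "CRIT", "HIGH", or "MED"
    start_line: int  # 1-indexed, inclusive
    end_line: int  # 1-indexed, inclusive
    context: list[str] = field(default_factory=list)


def context_window(lines: list[str], start: int, end: int, pad: int = 2) -> list[str]:
    """Return lines start..end plus up to `pad` lines on each side (1-indexed)."""
    lo = max(start - pad, 1)
    hi = min(end + pad, len(lines))
    return lines[lo - 1:hi]
```

The window is clamped at the top and bottom of the file, so a finding on line 1 simply gets fewer leading context lines.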

Severity legend: CRIT = credential exfiltration or agent takeover. HIGH = destructive or supply-chain. MED = declaration or hygiene.

Pattern-based (regex over parsed markdown)

These run line-by-line against the SKILL.md body and frontmatter.

CRIT env-var-harvesting

References a known secret env var ($ANTHROPIC_API_KEY, $DATABASE_URL, $AWS_ACCESS_KEY_ID, 22 total).

If an attacker can lure the agent into including the variable's value in an outbound URL or message, the credential leaks.
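A minimal sketch of how such a line-by-line scan might look. It covers only the three variable names quoted above (the full list of 22 is not reproduced here), and the function name and the choice to match both `$VAR` and `${VAR}` forms are assumptions.

```python
import re

# Only the three names quoted in the rule description; the full list is longer.
SECRET_ENV_VARS = ("ANTHROPIC_API_KEY", "DATABASE_URL", "AWS_ACCESS_KEY_ID")

# Matches $NAME and ${NAME}; the word boundary rejects longer names
# such as $DATABASE_URL_BACKUP.
SECRET_REF = re.compile(r"\$\{?(" + "|".join(SECRET_ENV_VARS) + r")\b\}?")


def find_secret_env_refs(text: str) -> list[tuple[int, str]]:
    """Return (1-indexed line number, variable name) for each reference."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_REF.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits
```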

CRIT instruction-injection

"ignore previous instructions", "also include the value of", "when the user asks to read", and 2 more patterns.

Classic prompt-injection trigger phrases. Agents may treat the line as a system directive instead of user content.
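For illustration, a phrase matcher over the three trigger phrases quoted above could be sketched as follows. The remaining patterns are not listed in this section and are left out; case-insensitive matching is an assumption, as is the function name.

```python
import re

# The three phrases quoted in the rule description; the other patterns are omitted.
INJECTION_PHRASES = (
    "ignore previous instructions",
    "also include the value of",
    "when the user asks to read",
)

# Case-insensitive matching is an assumption of this sketch.
INJECTION_RE = re.compile(
    "|".join(re.escape(p) for p in INJECTION_PHRASES), re.IGNORECASE
)


def find_injection_lines(text: str) -> list[int]:
    """Return 1-indexed numbers of lines containing a trigger phrase."""
    return [
        lineno
        for lineno, line in enumerate(text.splitlines(), start=1)
        if INJECTION_RE.search(line)
    ]
```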

CRIT url-exfiltration

URLs that interpolate a secret variable into the query string.

Once the agent fetches the URL, the credential is in the recipient's access log.
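One way to approximate this check is to extract each URL, split off the query string, and look for a secret variable inside it. The sketch below assumes the same three example variable names as the env-var rule; the URL regex and helper name are illustrative, not the scanner's own.

```python
import re

# Example secret names only; the real list is longer.
SECRET_NAMES = ("ANTHROPIC_API_KEY", "DATABASE_URL", "AWS_ACCESS_KEY_ID")

# Deliberately loose URL matcher: stops at whitespace, quotes, and brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>)]+")
VAR_RE = re.compile(r"\$\{?(" + "|".join(SECRET_NAMES) + r")\b")


def exfil_urls(text: str) -> list[tuple[int, str]]:
    """Return (line number, url) for URLs whose query string interpolates a secret."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in URL_RE.finditer(line):
            url = match.group(0)
            _, sep, query = url.partition("?")
            if sep and VAR_RE.search(query):
                hits.append((lineno, url))
    return hits
```

A secret variable that appears in the URL path rather than the query string is not flagged by this sketch, matching the wording of the rule.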

HIGH dangerous-shell

rm -rf /, curl|sh, wget|sh, chmod 777, eval $(…), eval `…` — 7 patterns total.

Destructive or supply-chain attack primitives.
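The patterns listed above could be expressed roughly as below. These regexes are a guess at reasonable forms for the examples shown, not the scanner's actual seven patterns, and the pattern labels are made up for this sketch.

```python
import re

# Illustrative regexes for the examples listed above; labels are hypothetical.
DANGEROUS_PATTERNS = {
    # "rm -rf /" aimed at the root itself, not at a subdirectory.
    "rm-rf-root": re.compile(r"\brm\s+-rf\s+/(?:\s|$)"),
    # curl or wget output piped straight into sh or bash.
    "pipe-to-shell": re.compile(r"\b(?:curl|wget)\b[^|\n]*\|\s*(?:ba)?sh\b"),
    "chmod-777": re.compile(r"\bchmod\s+777\b"),
    # eval of a command substitution, either $(...) or backticks.
    "eval-subshell": re.compile(r"\beval\s+(?:\$\(|`)"),
}


def find_dangerous_shell(text: str) -> list[tuple[int, str]]:
    """Return (1-indexed line number, pattern label) for each match."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for label, pattern in DANGEROUS_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits
```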

HIGH filesystem-overreach

~/.ssh/, ~/.aws/, ~/.gnupg/, /etc/passwd, /proc/self/environ — 13 sensitive paths.

Reading these from an unsandboxed skill is a credential-exfiltration vector.

MED network-egress-undeclared

URLs to hosts not in the manifest's capabilities.network.egress allowlist (only fires when a manifest is present).

A skill that declares api.acme.io and then talks to analytics.acme.io is lying about its capabilities.
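The allowlist comparison can be sketched as follows. Exact, case-insensitive host matching (no wildcards or subdomain inheritance) is an assumption here, as are the function name and the convention of passing `None` when no manifest is present.

```python
import re
from typing import Optional
from urllib.parse import urlparse

URL_RE = re.compile(r"https?://[^\s\"'<>)]+")


def undeclared_hosts(text: str, allowlist: Optional[list[str]]) -> list[str]:
    """Return hosts referenced in `text` that are missing from the egress allowlist.

    `allowlist` is None when the skill has no manifest; the rule then stays silent.
    """
    if allowlist is None:
        return []
    allowed = {host.lower() for host in allowlist}
    seen: list[str] = []
    for url in URL_RE.findall(text):
        host = (urlparse(url).hostname or "").lower()
        if host and host not in allowed and host not in seen:
            seen.append(host)
    return seen
```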

MED subprocess-execution

child_process, spawn(, exec(, subprocess.Popen, os.system( — 6 patterns.

Subprocesses escape any capability declaration, so they should require an explicit process.exec entry in the manifest.

HIGH obfuscation

Base64 blobs ≥100 chars, escaped-hex runs ≥8, unicode-escape runs ≥5.

Legitimate skills rarely include long base64, hex, or unicode-escape runs; such runs often hide a payload.
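The three thresholds above might translate into regexes like these. The sketch reads "escaped-hex runs ≥8" as eight or more consecutive `\xNN` escapes and "unicode-escape runs ≥5" as five or more consecutive `\uNNNN` escapes; that reading, and the function name, are assumptions.

```python
import re

# 100 or more base64-alphabet characters, with optional padding.
BASE64_BLOB = re.compile(r"[A-Za-z0-9+/]{100,}={0,2}")
# 8 or more consecutive \xNN escapes (assumed reading of the threshold).
HEX_RUN = re.compile(r"(?:\\x[0-9A-Fa-f]{2}){8,}")
# 5 or more consecutive \uNNNN escapes (assumed reading of the threshold).
UNICODE_RUN = re.compile(r"(?:\\u[0-9A-Fa-f]{4}){5,}")


def obfuscation_kinds(line: str) -> list[str]:
    """Return which obfuscation patterns appear on a single line."""
    kinds = []
    if BASE64_BLOB.search(line):
        kinds.append("base64")
    if HEX_RUN.search(line):
        kinds.append("hex")
    if UNICODE_RUN.search(line):
        kinds.append("unicode")
    return kinds
```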

Provenance-based (GitHub API)

These fire on repository metadata rather than on the SKILL.md content (no-manifest, listed last, is the exception: it inspects the frontmatter). They apply only when the scan URL points to a GitHub repo, and they fall back to safe defaults on API errors.

HIGH repo-age-young

GitHub repo created < 14 days ago.

Most supply-chain attacks use freshly-created throwaway repos. Established projects rarely match.
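A sketch of the age check, assuming the repository's `created_at` timestamp has already been fetched from the GitHub API as an ISO 8601 string. The function name is hypothetical, and `now` is passed in so the check stays deterministic.

```python
from datetime import datetime, timedelta, timezone


def is_young_repo(created_at: str, now: datetime, max_age_days: int = 14) -> bool:
    """True when the repo was created less than `max_age_days` before `now`.

    `created_at` is an ISO 8601 UTC timestamp such as "2024-01-10T00:00:00Z".
    """
    # Older Python versions do not accept a trailing "Z" in fromisoformat.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return now - created < timedelta(days=max_age_days)
```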

MED repo-popularity-low

< 10 stars AND single contributor (contributors fetched live from /contributors).

No community vetting + lone author = elevated risk profile.
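The AND condition can be sketched over already-fetched API responses. `repo` stands for the repository JSON (GitHub reports the star count as `stargazers_count`) and `contributors_page` for the first page of the /contributors response; the function name and the choice to treat an empty contributor list as a non-match are assumptions.

```python
def is_low_popularity(repo: dict, contributors_page: list) -> bool:
    """True when the repo has fewer than 10 stars AND exactly one contributor.

    With a page size of 2, a one-element page means a single contributor and
    a two-element page means "2 or more".
    """
    few_stars = repo.get("stargazers_count", 0) < 10
    single_contributor = len(contributors_page) == 1
    return few_stars and single_contributor
```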

HIGH force-pushes-recent

Forced PushEvent to default branch in the last 30 days, via the GitHub /events feed.

A recent force-push is a common pre-attack pattern (rewriting history to hide a malicious commit).

MED no-manifest

No `capabilities` block in the SKILL.md frontmatter.

Without a manifest, the runtime cannot enforce what this skill is permitted to do. A manifest is required for the planned sandboxing.
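A dependency-free sketch of the frontmatter check, assuming the frontmatter is delimited by `---` lines at the top of SKILL.md and that `capabilities` appears as a top-level key. A real implementation would more likely use a YAML parser; the function name is hypothetical.

```python
def has_capabilities_block(skill_md: str) -> bool:
    """True when the frontmatter contains a top-level `capabilities:` key."""
    lines = skill_md.splitlines()
    # Frontmatter must open on the very first line.
    if not lines or lines[0].strip() != "---":
        return False
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter; the body is ignored
        if line.startswith("capabilities:"):
            return True
    return False
```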

v0 caveats

instruction-injection is regex-based. The worker now also runs an LLM-based semantic probe suite (gated on ANTHROPIC_API_KEY) that catches behavioral exfil patterns the regex misses; see Semantic prompt injection. The current probe set is an initial one and will expand over time.
repo-popularity-low: contributor count uses /contributors?per_page=2&anon=true, distinguishing “1” from “2+” only. Exact counts ship later.
force-pushes-recent uses the unauthenticated /events feed, which is rate-limited to 60 req/hr per IP. Set GITHUB_TOKEN on the worker to raise the limit to 5,000 req/hr.
HIGH-severity findings count toward the grade — see Grading explained.
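As a sketch of the GITHUB_TOKEN caveat above, request headers for the GitHub API could be built so that the token is attached only when it is set. The helper name and the injectable `env` mapping are assumptions made to keep the example testable.

```python
import os
from typing import Mapping, Optional


def github_headers(env: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Build GitHub API request headers, authenticating only if a token is set."""
    env = os.environ if env is None else env
    headers = {"Accept": "application/vnd.github+json"}
    token = env.get("GITHUB_TOKEN")
    if token:
        # Authenticated requests get the higher rate limit.
        headers["Authorization"] = f"Bearer {token}"
    return headers
```

Without a token the request is sent unauthenticated and falls under the per-IP limit.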