After the Pipeline Runs
Parts 1–10 of this series covered how GeistScope discovers, probes, and documents
vulnerabilities. The pipeline ends with a findings/ directory of markdown files.
Three tools handle what comes next: turning a finding into a reportable submission,
understanding what a decompiled binary is doing, and building an exploit scaffold
from a CVE.
All three use the same llm-client backend: Anthropic’s Claude Sonnet 4.6 when
ANTHROPIC_API_KEY is set and a local Ollama model otherwise, plus a deterministic
offline mode for testing.
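The selection order can be sketched in a few lines. This is a Python illustration of the rule as described, not code from the GeistScope repository; the function name `select_backend` and the three returned labels are hypothetical.

```python
import os


def select_backend(env=None, offline=False):
    """Pick a backend label following the order described in the text.

    Hypothetical helper: "offline", "anthropic", and "ollama" are
    illustrative labels, not identifiers taken from llm-client.
    """
    env = os.environ if env is None else env
    if offline:
        return "offline"      # deterministic mode, no model call
    if env.get("ANTHROPIC_API_KEY"):
        return "anthropic"    # hosted model when a key is present
    return "ollama"           # local model otherwise
```

Passing the environment in as a mapping keeps the rule testable without touching real credentials.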
mg-report: From Finding to Submission
HackerOne reports follow a structure: title, severity with CVSS vector, description,
steps to reproduce, impact, and evidence. Writing that from scratch for every finding
is time-consuming and inconsistent. mg-report generates a ready-to-submit draft.
mg-report generate target-bounty 2026-05-15-003
The tool reads the finding markdown, the engagement scope.json, and the
fingerprinter’s summary.json. It passes all three to the LLM with a structured
prompt that asks for each section of the report explicitly. The response gets written
to findings/reports/2026-05-15-003.md.
CVSS scoring is computed in Rust, not by the model. The cvss module takes the
finding’s severity and category and returns a vector string and numeric score:
CVSS:3.1/AV:N/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:N → 9.1 Critical
The score feeds into the report template so the model doesn’t have to reason about CVSS math — it just fills in the narrative sections.
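To see why this is better done in code than by a model, here is a sketch of the CVSS v3.1 base score arithmetic, using the metric weights and Roundup rule from the public specification. It is written in Python for illustration and is not the project's Rust cvss module; how that module maps severity and category to a vector is not shown here.

```python
# Metric weights from the CVSS v3.1 specification.
WEIGHTS = {
    "AV": {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2},
    "AC": {"L": 0.77, "H": 0.44},
    "UI": {"N": 0.85, "R": 0.62},
    "CIA": {"H": 0.56, "L": 0.22, "N": 0.0},
}
# Privileges Required is weighted differently when Scope is changed.
PR_UNCHANGED = {"N": 0.85, "L": 0.62, "H": 0.27}
PR_CHANGED = {"N": 0.85, "L": 0.68, "H": 0.5}


def roundup(value):
    """Spec-defined Roundup: smallest one-decimal number >= value."""
    scaled = round(value * 100000)
    if scaled % 10000 == 0:
        return scaled / 100000.0
    return (scaled // 10000 + 1) / 10.0


def base_score(vector):
    """Compute the base score for a complete CVSS v3.1 base vector."""
    m = dict(part.split(":") for part in vector.split("/")[1:])
    changed = m["S"] == "C"
    cia = WEIGHTS["CIA"]
    iss = 1 - (1 - cia[m["C"]]) * (1 - cia[m["I"]]) * (1 - cia[m["A"]])
    if changed:
        impact = 7.52 * (iss - 0.029) - 3.25 * (iss - 0.02) ** 15
    else:
        impact = 6.42 * iss
    pr = (PR_CHANGED if changed else PR_UNCHANGED)[m["PR"]]
    exploitability = (
        8.22 * WEIGHTS["AV"][m["AV"]] * WEIGHTS["AC"][m["AC"]]
        * pr * WEIGHTS["UI"][m["UI"]]
    )
    if impact <= 0:
        return 0.0
    total = impact + exploitability
    if changed:
        total *= 1.08
    return roundup(min(total, 10))


def rating(score):
    """Map a base score to the qualitative severity rating scale."""
    if score == 0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"
```

For the vector shown above, `base_score` returns 9.1 and `rating` returns "Critical". The integer-based `roundup` avoids the floating-point edge cases the specification warns about.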
Bulk mode. When you want to generate reports for everything that’s ready:
mg-report generate target-bounty --all-unconfirmed
This iterates findings with status confirmed or triaged and generates reports
for any that don’t have one yet. --force overwrites existing reports if you want
to regenerate after changing the finding description.
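The selection rule amounts to a filter over the findings. A minimal Python sketch, assuming each finding is a dict with hypothetical `id` and `status` keys and that existing reports are tracked as a set of finding IDs:

```python
READY_STATUSES = {"confirmed", "triaged"}


def select_for_report(findings, existing_reports, force=False):
    """Return the IDs of findings that should get a report generated.

    findings: list of dicts with hypothetical "id" and "status" keys.
    existing_reports: set of finding IDs that already have a report.
    force: when True, findings with an existing report are included too.
    """
    return [
        f["id"]
        for f in findings
        if f["status"] in READY_STATUSES
        and (force or f["id"] not in existing_reports)
    ]
```

With `force=False`, a finding that already has a report is skipped; with `force=True`, it is selected again so the report can be rewritten.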
Disclosure. For findings that need coordinated disclosure rather than a bug bounty submission:
mg-report disclose target-bounty 2026-05-15-003 \
--vendor "ACME Corp" \
--contact [email protected] \
--timeline-days 90
This writes two files: a CVE-formatted writeup in findings/disclosures/ and a
disclosure email draft ready to send. The 90-day timeline is the default; it goes
into the email with the calculated disclosure date.
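The date arithmetic is simple enough to show. This sketch assumes the clock starts on the day the vendor is first notified, which the text above does not specify; `disclosure_date` is a hypothetical name.

```python
from datetime import date, timedelta


def disclosure_date(notified_on, timeline_days=90):
    """Return the planned public disclosure date.

    Assumes the timeline is counted in calendar days from the
    notification date (an assumption, not confirmed by the tool).
    """
    return notified_on + timedelta(days=timeline_days)
```

Under that assumption, a vendor notified on 2026-05-15 with the default 90 days gets a disclosure date of 2026-08-13.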
mg-recopilot: Reverse-Engineering with Context
Binary analysis — specifically, reading decompiler output — is one of those tasks where an LLM is genuinely useful. The variable names are garbage, the control flow is tangled, and pattern-matching against known vulnerability classes is exactly what models are good at.
mg-recopilot operates on decompiled pseudocode dropped into the engagement
directory under re/<binary>/raw/<function>.c. The output goes next to the input:
<function>.md (human-readable analysis) and <function>.json (structured data
for the harness).
mg-recopilot analyze target-bounty \
--binary target-api \
--function process_user_input
The analysis prompt instructs the model to produce six structured sections:
- Function Purpose — what this function is supposed to do
- Variable Map — local variables and their likely semantic meaning
- Control Flow Notes — branches, loops, edge cases
- Suspicious Logic — anything that looks like it could be a vulnerability
- Exploit Primitives — concrete primitives present: buffer overflows, use-after-free, integer overflow, format string, type confusion, etc.
- Suggested Next Steps — what to look at next: which call sites to check, which inputs to trace, which offsets to verify in a debugger
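Because the section list is fixed, a response can be checked for completeness before anything is written to disk. The sketch below assumes the model returns markdown with each section under a `## <name>` heading; that heading format and both function names are assumptions for illustration, not confirmed details of mg-recopilot.

```python
import re

SECTIONS = [
    "Function Purpose",
    "Variable Map",
    "Control Flow Notes",
    "Suspicious Logic",
    "Exploit Primitives",
    "Suggested Next Steps",
]


def split_sections(markdown):
    """Split a response into {section name: body text}.

    Only headings that match one of the six expected names start
    a new section; any other line is treated as body text.
    """
    found = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.*\S)\s*$", line)
        if match and match.group(1) in SECTIONS:
            current = match.group(1)
            found[current] = []
        elif current is not None:
            found[current].append(line)
    return {name: "\n".join(body).strip() for name, body in found.items()}


def missing_sections(markdown):
    """List expected sections that the response did not include."""
    parsed = split_sections(markdown)
    return [name for name in SECTIONS if name not in parsed]
```

A non-empty result from `missing_sections` is a cheap signal that the response should be retried or flagged rather than saved as a finished analysis.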
The model is given the raw pseudocode (capped at 128 KB) plus any binary manifest
dropped in re/<binary>/manifest.json: architecture, compiler flags, linked
libraries, and known CVEs for the binary version. That context changes what counts as
suspicious: a memcpy without a bounds check is more interesting in a network-facing
binary than in an offline utility.
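Assembling that input might look like the following sketch. It assumes the cap is a plain truncation at 128 * 1024 bytes and that the manifest is prepended as pretty-printed JSON; both choices, and the function names, are illustrative rather than taken from the tool.

```python
import json

MAX_PSEUDOCODE_BYTES = 128 * 1024  # assumes the cap means 128 * 1024 bytes


def cap_pseudocode(data):
    """Truncate raw pseudocode bytes to the cap and decode leniently."""
    return data[:MAX_PSEUDOCODE_BYTES].decode("utf-8", errors="replace")


def build_user_prompt(pseudocode, manifest=None):
    """Combine the optional manifest and the pseudocode into one prompt."""
    parts = []
    if manifest is not None:
        rendered = json.dumps(manifest, indent=2, sort_keys=True)
        parts.append("Binary manifest:\n" + rendered)
    parts.append("Decompiled pseudocode:\n" + pseudocode)
    return "\n\n".join(parts)
```

Decoding with `errors="replace"` keeps a truncation that lands mid-character from raising an exception.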
mg-exploitgen: Scaffolding From a CVE
When you have a CVE description and a target environment, the gap between “I know
this is vulnerable” and “I have a working exploit” is mostly research and boilerplate.
mg-exploitgen generates the scaffold that fills that gap.
mg-exploitgen scaffold target-bounty \
--cve CVE-2026-0001 \
--cve-description cve.md \
--target-env target-env.json
cve.md is a plain-text description of the vulnerability, such as the NVD advisory,
the researcher’s writeup, or the vendor’s patch notes. target-env.json describes
the deployment: OS, architecture, compiler version, ASLR/NX status, whether PIE
is enabled, and any other protections the binary has.
The model generates a scaffold under engagements/<name>/exploits/CVE-2026-0001/:
exploits/CVE-2026-0001/
├── runbook.md ← step-by-step exploitation notes
├── exploit.py ← skeleton exploit with offsets TBD
├── notes.md ← researcher notes, open questions
└── references.md ← CVE links, patch commits, related writeups
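The layout itself is easy to reproduce. The sketch below builds the four paths and writes whatever text it is handed; it only creates files and contains no exploit logic. The function names and the `contents` mapping are hypothetical.

```python
from pathlib import Path

# File names and descriptions as listed in the scaffold layout above.
SCAFFOLD_FILES = {
    "runbook.md": "step-by-step exploitation notes",
    "exploit.py": "skeleton exploit with offsets TBD",
    "notes.md": "researcher notes, open questions",
    "references.md": "CVE links, patch commits, related writeups",
}


def scaffold_paths(engagement_dir, cve_id):
    """Return the four scaffold file paths for one CVE."""
    root = Path(engagement_dir) / "exploits" / cve_id
    return [root / name for name in SCAFFOLD_FILES]


def write_scaffold(engagement_dir, cve_id, contents):
    """Create the directory and write each file.

    contents maps file name -> text; files without an entry are
    created empty so the layout is always complete.
    """
    paths = scaffold_paths(engagement_dir, cve_id)
    paths[0].parent.mkdir(parents=True, exist_ok=True)
    for path in paths:
        path.write_text(contents.get(path.name, ""), encoding="utf-8")
    return paths
```

Creating all four files even when some are empty means downstream tooling can rely on the layout without checking for missing files.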
No network access. The tool reads from disk and calls the LLM. It doesn’t verify the CVE is real, doesn’t query external databases, and doesn’t download anything. The quality of the scaffold depends entirely on the quality of the description and env spec you provide.
The runbook is the most useful output — it’s a prose explanation of the exploitation path given what the model knows about the vulnerability class and the target constraints. An experienced researcher will recognize what’s plausible and what needs verification. A less experienced one gets a structured starting point rather than a blank page.
The Pattern Across All Three
Each tool reads engagement context, calls llm-client.complete(system, user),
and writes structured output. The --offline flag skips the LLM call and writes
a deterministic placeholder — useful in CI and for testing the pipeline without
API access.
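Reduced to its skeleton, the shared pattern looks something like this. It is a Python sketch of the shape, not the Rust implementation: `run_tool`, the system prompt text, and the placeholder format are all hypothetical, and `complete` stands in for the llm-client call.

```python
import hashlib
import json


def run_tool(context, complete, offline=False):
    """Build a prompt from engagement context and return the output text.

    context: dict of engagement data (finding, scope, and so on).
    complete: callable taking (system, user) and returning a string.
    offline: when True, skip the model and return a placeholder that
             depends only on the input, so repeated runs match.
    """
    system = "Analyze the engagement context and answer in markdown."
    user = json.dumps(context, sort_keys=True)
    if offline:
        digest = hashlib.sha256(user.encode("utf-8")).hexdigest()[:12]
        return f"[offline placeholder {digest}]"
    return complete(system, user)
```

Deriving the placeholder from a hash of the input is one way to make offline output deterministic while still changing when the context changes, which is handy for snapshot tests in CI.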
The harness exposes all three as endpoints: report.generate, recopilot.analyze,
exploitgen.scaffold. An AI operator working through the harness can trigger report
generation, request pseudocode analysis, and scaffold exploits in the same JSON
conversation it uses for recon and fuzzing.
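A dispatcher for those three endpoints could be as small as the sketch below. The endpoint names come from the text; the request and response envelope (`endpoint`, `params`, `ok`, `result`, `error`) is an assumption made for illustration, since the harness's actual JSON format is not described here.

```python
import json


def make_dispatcher(handlers):
    """Build a dispatch function from {endpoint name: handler callable}."""

    def dispatch(raw_request):
        request = json.loads(raw_request)
        handler = handlers.get(request.get("endpoint"))
        if handler is None:
            return json.dumps({"ok": False, "error": "unknown endpoint"})
        # Hypothetical envelope: params are passed as keyword arguments.
        result = handler(**request.get("params", {}))
        return json.dumps({"ok": True, "result": result})

    return dispatch


# Stand-in handlers; the real tools would do the work described above.
dispatch = make_dispatcher({
    "report.generate": lambda finding_id: f"report for {finding_id}",
    "recopilot.analyze": lambda binary, function: f"{binary}:{function}",
    "exploitgen.scaffold": lambda cve: f"scaffold for {cve}",
})
```

A call such as `dispatch('{"endpoint": "report.generate", "params": {"finding_id": "2026-05-15-003"}}')` returns a JSON string with `ok` set to true and the handler's result.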
That’s the full loop: discover, probe, document, analyze, report — all from one engagement directory, all with structured data the AI can read and act on.
This is the final post in the GeistScope series. The full codebase is on GitHub.