Overview

Over the past few months, our team has been evaluating security-focused large language models (LLMs) on internal infrastructure. These models are designed to identify vulnerabilities in codebases automatically, helping us patch weaknesses before they become real threats. Among the models tested, none captured our attention more than Mythos Preview from Anthropic. As part of Project Glasswing, we were granted early access to Mythos Preview and directed it at over fifty of our own repositories to observe its capabilities, its limitations, and the operational shifts needed to scale such a tool.

This tutorial shares our findings and provides a step-by-step guide for running similar security audits with Mythos Preview. We’ll cover what the model does exceptionally well (exploit chain construction and proof generation) and where other models fall short. By the end, you’ll understand how to set up, run, and interpret results from Mythos Preview within your own security pipeline.

Mastering Security Audits with Mythos Preview: A Practical Guide to Exploit Chain Construction and Proof Generation — Source: blog.cloudflare.com

Prerequisites

Access to Mythos Preview: You must have an invitation or subscription to use Mythos Preview. Contact Anthropic for details.
Infrastructure: A dedicated environment (cloud or on-premises) with sufficient compute resources (GPU recommended) to run the model and compile/test exploits.
Source Code Repositories: At least one target repository containing code you wish to analyze. Ensure you have proper permissions to scan the code.
Basic Security Knowledge: Understanding of common vulnerability types (use-after-free, buffer overflow, etc.) and exploit mechanisms (ROP chains, control flow hijacking).
Python and Scripting: Familiarity with Python and shell scripting to automate tasks and interpret model output.

Step-by-Step Instructions

1. Setting Up the Environment

Before launching Mythos Preview, prepare a sandboxed environment that mimics your production setup. This environment will execute any generated proof-of-concept code safely. Use Docker or a virtual machine to isolate the scanning process.

# Example: create a Docker container with the necessary compilers and tools
# On the host: start an interactive container
docker run -it --name mythos-sandbox ubuntu:22.04 /bin/bash
# Inside the container: install the toolchain
apt update && apt install -y gcc g++ python3 python3-pip git
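
For running generated proof-of-concept binaries, a stricter container than the interactive one above is usually worth the effort. The following is a minimal sketch, not part of any Mythos Preview tooling: the function name, the `/work` mount point, and the resource limits are illustrative assumptions, while the Docker flags themselves are standard `docker run` options.

```python
import shlex
from pathlib import Path


def build_sandbox_command(image: str, workdir: str, command: list[str]) -> list[str]:
    """Build a `docker run` argv for a locked-down, throwaway container."""
    host_dir = Path(workdir).resolve()
    return [
        "docker", "run", "--rm",        # remove the container when it exits
        "--network", "none",            # no network access from inside the sandbox
        "--cap-drop", "ALL",            # drop all Linux capabilities
        "--pids-limit", "256",          # cap the number of processes (assumed value)
        "--memory", "1g",               # cap memory use (assumed value)
        "-v", f"{host_dir}:/work:ro",   # mount the proof directory read-only
        "-w", "/work",
        image,
        *command,
    ]


# Print the command instead of running it, so it can be reviewed first.
argv = build_sandbox_command("ubuntu:22.04", ".", ["./poc"])
print(shlex.join(argv))
```

Building the argument list separately from executing it keeps the isolation settings easy to review, and passing a list (rather than a shell string) to `subprocess.run` avoids quoting problems.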

Install Mythos Preview’s API client if provided, or use direct HTTP calls to the model’s endpoint. The exact authentication method will depend on your agreement with Anthropic.

2. Configuring Mythos Preview

Mythos Preview requires configuration parameters to tailor its scans. The model accepts:

Target repository path
Scan depth (e.g., “deep” for full control-flow analysis)
Exploit chain construction mode (enable or disable)
Proof generation loop (number of iterations, error tolerance)
Output format (JSON, plain text, etc.)

# Sample configuration file (mythos_config.json)
{
  "repo_path": "/home/user/sample_app",
  "scan_depth": "deep",
  "exploit_chain": true,
  "proof_iterations": 5,
  "output": "json"
}

Submit the configuration via the API or command-line interface. For example:

mythos-cli --config mythos_config.json
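
Before submitting a configuration, it can help to check it locally. The sketch below assumes the five keys and value types shown in the sample file; that is an inference from the sample, not a documented schema, and `validate_config` is a hypothetical helper.

```python
import json

# Expected keys and types, inferred from the sample configuration.
# This is an assumption, not a documented schema.
EXPECTED_FIELDS = {
    "repo_path": str,
    "scan_depth": str,
    "exploit_chain": bool,
    "proof_iterations": int,
    "output": str,
}


def validate_config(text: str) -> list[str]:
    """Return a list of problems found in a JSON config string (empty if none)."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc}"]
    if not isinstance(config, dict):
        return ["top-level value must be a JSON object"]

    problems = []
    for key, expected in EXPECTED_FIELDS.items():
        if key not in config:
            problems.append(f"missing key: {key}")
        elif type(config[key]) is not expected:
            # `type(...) is` rejects True/False where an int is expected
            problems.append(f"{key} should be {expected.__name__}")

    iterations = config.get("proof_iterations")
    if type(iterations) is int and iterations < 1:
        problems.append("proof_iterations should be at least 1")
    return problems
```

Note that JSON has no comment syntax, so a label line such as `# Sample configuration file (mythos_config.json)` must not appear inside the file itself, or parsing will fail.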

3. Running the Security Scan

Once configured, initiate the scan. Mythos Preview parses the repository, identifies potential vulnerabilities, and attempts to chain them into exploit sequences. A scan may take several hours, depending on codebase size and the configured scan depth. Monitor progress through the logs:

tail -f /var/log/mythos/scan.log

During execution, the model outputs intermediate findings, including identified bug primitives and potential chains. Record these for later analysis.
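
One simple way to record intermediate findings is to filter the scan log for relevant lines and save them to a separate file. The log format is not specified here, so this sketch matches on keywords only; the keywords `primitive` and `chain` and both function names are assumptions to adapt to whatever your log actually contains.

```python
from typing import Iterable, Iterator

# Hypothetical keywords: adjust them to the wording your scan log uses.
KEYWORDS = ("primitive", "chain")


def collect_findings(lines: Iterable[str], keywords: tuple[str, ...] = KEYWORDS) -> Iterator[str]:
    """Yield stripped log lines that mention any keyword (case-insensitive)."""
    lowered = tuple(k.lower() for k in keywords)
    for line in lines:
        text = line.lower()
        if any(k in text for k in lowered):
            yield line.strip()


def save_findings(log_path: str, out_path: str) -> int:
    """Copy matching lines from the log into out_path and return how many matched."""
    count = 0
    with open(log_path, encoding="utf-8", errors="replace") as log, \
            open(out_path, "w", encoding="utf-8") as out:
        for finding in collect_findings(log):
            out.write(finding + "\n")
            count += 1
    return count
```

For example, `save_findings("/var/log/mythos/scan.log", "findings.txt")` would snapshot the matching lines seen so far.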

4. Analyzing Exploit Chain Construction

One of Mythos Preview’s standout features is exploit chain construction. A real-world attack rarely uses a single bug; it chains multiple small primitives together. For example, a use-after-free might be combined with an arbitrary read/write primitive, followed by control-flow hijacking via ROP chains. Mythos Preview reasons about these combinations at a level resembling a senior security researcher.

After the scan completes, review the chain output. The model will present a step-by-step reasoning path, showing how each primitive connects. Validate this reasoning manually or with auxiliary tools. Example output snippet:

"chain": [
  {"step": 1, "vuln": "use-after-free in function foo()", "action": "trigger free after reallocation"},
  {"step": 2, "vuln": "heap corruption leads to arbitrary write", "action": "overwrite function pointer"},
  {"step": 3, "vuln": "control flow hijack", "action": "execute ROP chain"}
]
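
A chain in the shape shown above can be given a quick structural check before manual review. This is a minimal sketch that assumes each entry carries the `step`, `vuln`, and `action` fields from the snippet; it only verifies that the chain is well-formed and numbered in order, and says nothing about whether the chain is actually exploitable. The helper names are hypothetical.

```python
def check_chain(chain: list[dict]) -> list[str]:
    """Return structural problems in a chain: missing fields or out-of-order steps."""
    problems = []
    for position, link in enumerate(chain, start=1):
        missing = [f for f in ("step", "vuln", "action") if f not in link]
        if missing:
            problems.append(f"entry {position}: missing {', '.join(missing)}")
        elif link["step"] != position:
            problems.append(f"entry {position}: expected step {position}, got {link['step']}")
    return problems


def summarize(chain: list[dict]) -> str:
    """Render the chain as one line per step for manual review."""
    return "\n".join(f"{link['step']}. {link['vuln']} -> {link['action']}" for link in chain)


example = [
    {"step": 1, "vuln": "use-after-free in function foo()", "action": "trigger free after reallocation"},
    {"step": 2, "vuln": "heap corruption leads to arbitrary write", "action": "overwrite function pointer"},
    {"step": 3, "vuln": "control flow hijack", "action": "execute ROP chain"},
]
print(check_chain(example) or summarize(example))
```

Since the snippet is a fragment (`"chain": [...]`), wrap it in braces before passing it to `json.loads` if you are reading it from raw output.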

If the chain appears plausible, proceed to the next step.

5. Validating Proof Generation

Finding a bug is one thing; proving it is exploitable is another. Mythos Preview excels at proof generation: it writes code that triggers the suspected bug, compiles it in a scratch environment, and runs it. If the expected behavior occurs (e.g., crash, privilege escalation), the model records it as a successful proof. If not, it reads the failure output, adjusts its hypothesis, and retries. This iterative loop is critical because a vulnerability without a working proof remains speculation.
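
The try, observe, adjust, retry cycle described above can be sketched as a small loop. This is an illustration of the control flow only, not the model's internal implementation: `run` and `revise` are hypothetical callables standing in for "compile and execute the candidate" and "produce a new candidate from the failure output".

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Attempt:
    source: str      # candidate proof-of-concept source code
    succeeded: bool  # did the run show the expected behavior?
    output: str      # compiler/runtime output used for the next revision


def proof_loop(
    initial_source: str,
    run: Callable[[str], tuple[bool, str]],
    revise: Callable[[str, str], str],
    max_iterations: int = 5,
) -> list[Attempt]:
    """Try a proof; on failure, revise it from the failure output, up to a limit."""
    attempts = []
    source = initial_source
    for _ in range(max_iterations):
        succeeded, output = run(source)
        attempts.append(Attempt(source, succeeded, output))
        if succeeded:
            break
        source = revise(source, output)  # adjust the hypothesis from the failure
    return attempts
```

Returning the full attempt history, rather than only the final result, makes it possible to audit how a proof was reached and to spot loops that never converge.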

To examine proof attempts, look at the generated code and logs:

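// Example proof script generated by Mythos Preview
#include <stdio.h>
#include <stdlib.h>  // malloc, free
#include <string.h>  // strcpy
int main(void) {
  char *buf = malloc(10);
  free(buf);
  // Use-after-free trigger: write through the dangling pointer
  strcpy(buf, "AAAA");
  return 0;
}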

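Check the output of the compiled program. Did it crash as expected? Keep in mind that a small use-after-free write like this one often does not segfault on its own, because the freed chunk usually still lies in memory mapped to the process; compiling the proof with AddressSanitizer (-fsanitize=address in GCC or Clang) makes the error reliably visible. If the proof fails, review the error message and see whether the model retries with an adjusted payload. A well-functioning proof generation cycle should converge on a reliable proof within its iteration budget.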
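
Interpreting the result of a proof run can be automated. The sketch below classifies how a process ended using two facts that hold on POSIX systems: Python's `subprocess` reports death by signal N as return code -N, and AddressSanitizer reports contain the text `ERROR: AddressSanitizer` on stderr. The function names and category labels are illustrative assumptions.

```python
import signal
import subprocess


def classify_result(returncode: int, stderr: str) -> str:
    """Interpret how a proof-of-concept process ended (POSIX semantics)."""
    if "ERROR: AddressSanitizer" in stderr:
        return "asan-report"        # sanitizer caught the memory error
    if returncode < 0:
        # subprocess reports death by signal N as returncode -N
        return f"killed-by-{signal.Signals(-returncode).name}"
    if returncode == 0:
        return "clean-exit"         # ran to completion: bug not demonstrated
    return f"exit-code-{returncode}"


def run_proof(binary: str, timeout: float = 10.0) -> str:
    """Run a compiled proof (inside the sandbox) and classify the outcome."""
    done = subprocess.run(
        [binary], capture_output=True, text=True, errors="replace", timeout=timeout
    )
    return classify_result(done.returncode, done.stderr)
```

A `clean-exit` result does not show the bug is absent; it only means this particular proof did not demonstrate it.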

Common Mistakes

Skipping the Sandbox: Without proper sandboxing, running exploit code can damage your system or corrupt data. Always run generated proofs in an isolated environment.
Over-reliance on One Model: While Mythos Preview is powerful, no model is perfect. Cross-check findings with other tools and manual review to avoid false positives or missed chains.
Ignoring the Iterative Loop: The proof generation loop is not automatic magic—you must monitor it. If the model fails repeatedly, adjust configuration (e.g., increase iterations or relax tolerance).
Misinterpreting Chain Output: An apparently logical chain might contain hidden assumptions. Verify each step with a debugger or static analysis before accepting it as valid.
Neglecting Infrastructure Scaling: The original Project Glasswing tests revealed that running Mythos Preview at scale requires architectural changes—e.g., parallelizing scans and storing intermediate state. Plan for these upfront.
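
The last point, parallelizing scans and storing intermediate state, can be sketched as follows. This is one possible design under stated assumptions, not a description of how Project Glasswing was run: `scan_one` is a hypothetical callable that scans a single repository, and threads are used on the assumption that each scan mostly waits on an external process or API.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable


def scan_all(
    repos: list[str],
    scan_one: Callable[[str], dict],
    state_file: Path,
    max_workers: int = 4,
) -> dict:
    """Scan repos in parallel, saving each result so an interrupted run can resume."""
    state = json.loads(state_file.read_text()) if state_file.exists() else {}
    # Skip repos already finished in a previous run; retry failed ones.
    pending = [r for r in repos if state.get(r, {}).get("status") != "done"]

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(scan_one, repo): repo for repo in pending}
        for future in as_completed(futures):
            repo = futures[future]
            try:
                state[repo] = {"status": "done", "result": future.result()}
            except Exception as exc:  # keep going if one scan fails
                state[repo] = {"status": "failed", "error": str(exc)}
            # Persist after every completed scan (written only from this thread).
            state_file.write_text(json.dumps(state, indent=2))
    return state
```

Writing the state file after each completed scan means a multi-hour run that is interrupted loses at most the scans still in flight.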

Summary

Mythos Preview represents a significant advance in automated security analysis, particularly in its ability to construct exploit chains and generate working proofs. By following the steps outlined in this guide—setting up a sandboxed environment, configuring the model, running scans, analyzing chains, and validating proofs—you can leverage its capabilities effectively. Remember that while Mythos Preview outperforms general-purpose frontier models at the crucial step of stitching vulnerabilities together, it still benefits from human oversight. For teams ready to invest in infrastructure and process changes, Mythos Preview offers a powerful addition to the security toolkit.
