How I Use Foundry in Smart Contract Audits

People assume an auditor's toolbox is some sprawling thing. For me, most of the day-to-day is reading code by hand and then reaching for Foundry to prove what I think I've found. That's really it. I read until I have a suspicion, then I write a test that either confirms it or embarrasses me. Foundry is the thing that turns "I think this is broken" into "here, watch it break."

So this is how I actually use it. Not a tutorial on every command, just the three things I lean on during a real engagement: proof-of-concept exploit tests, Anvil for poking at a contract live, and the gas report when a client is paying for one.

The POC is the whole point

A finding without a working proof of concept is a claim. That's the blunt version. I can write the most carefully reasoned paragraph in the world about why a function is exploitable, and a sponsor is still within their rights to read it, shrug, and mark it disputed. A POC removes that argument. When you hand someone a forge test that sets up an attacker, runs the exploit, and asserts the bad outcome, there's nothing left to debate. The test either passes or it doesn't, and if it passes against a setup that matches how the contract is really deployed, the bug is real.

The other thing a POC does, which matters just as much, is protect me from myself. It's easy to talk yourself into a vulnerability that isn't there. You trace a path, it looks ugly, you get excited, and you've half-written the report before you've checked whether the exploit actually lands. Writing the test forces the question. More than once I've sat down to prove a "high" and watched it fall apart in setUp, because some modifier I'd skimmed past was doing exactly the job I assumed it wasn't. Better I find that out in my editor than a judge finds it out in my report.

I'll be honest about why I care about this so much. On my AI Arena audit I turned up around six issues, and some of them never got validated because I had trouble getting my POC tests to run cleanly. I left findings on the table that I'm fairly sure were real, simply because I couldn't demonstrate them. That stung, and it's the reason I drilled on Foundry until writing a POC stopped being the hard part of the job. The bug-finding is the skill. But if you can't demonstrate the bug, you didn't really finish the work.

What a POC test actually looks like

Here's the shape I use. Most of my POCs are some variant of this: a setUp that deploys the target and funds an attacker, and a single test_Exploit_... that performs the attack and then asserts the thing that should never be allowed to happen.

This one is modeled on the bug class I see constantly, where a privileged or trusted input isn't actually validated, so a caller can hand the contract values it was never supposed to accept. It's generic on purpose, but it's real Foundry.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";
import {Vault} from "../src/Vault.sol";

contract VaultExploitTest is Test {
    Vault internal vault;

    address internal owner = makeAddr("owner");
    address internal attacker = makeAddr("attacker");

    function setUp() public {
        vm.prank(owner);
        vault = new Vault();

        // The vault holds funds that should only move on the owner's say-so.
        vm.deal(address(vault), 100 ether);
    }

    function test_Exploit_UnauthorizedWithdraw() public {
        uint256 attackerBefore = attacker.balance;

        // The attacker calls a path that trusts a caller-supplied address
        // without checking that the caller is allowed to set it.
        vm.prank(attacker);
        vault.setBeneficiary(attacker);

        vm.prank(attacker);
        vault.withdraw(100 ether);

        // The assertion is the finding. If this passes, the vault let an
        // unprivileged caller drain it, which is the whole bug.
        assertEq(attacker.balance, attackerBefore + 100 ether);
        assertEq(address(vault).balance, 0);
    }
}

A few things I do deliberately here. I name the attacker and the owner with makeAddr so the trace reads like a story instead of a wall of hex. I use vm.prank right before the call that matters, not three lines early, so it's obvious who's calling what. And the assertion is written as the abuse, not the absence of it. I'm asserting that the attacker walked away with the money, because that's the sentence I want in the report. When that test goes green, the finding writes itself.
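The test imports ../src/Vault.sol, which isn't shown. For anyone who wants to run the POC end to end, here is a minimal sketch of a vault it would pass against. This contract is hypothetical and written only to match the calls in the test (setBeneficiary, withdraw); the real targets I see are rarely this obvious, but the missing check has the same shape.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical target for the POC above. Not taken from any real codebase.
contract Vault {
    address public owner;
    address public beneficiary;

    constructor() {
        owner = msg.sender;
        beneficiary = msg.sender;
    }

    /// BUG: no access control. The intended check would be something like
    /// require(msg.sender == owner, "not owner");
    function setBeneficiary(address newBeneficiary) external {
        beneficiary = newBeneficiary;
    }

    /// Only the beneficiary may withdraw, but anyone can become the beneficiary.
    function withdraw(uint256 amount) external {
        require(msg.sender == beneficiary, "not beneficiary");
        (bool ok, ) = payable(beneficiary).call{value: amount}("");
        require(ok, "transfer failed");
    }

    receive() external payable {}
}
```

Adding the owner check to setBeneficiary should make the exploit test fail with a revert, which is also a cheap way to confirm a proposed fix.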

Run it the normal way and lean on the trace when something doesn't line up:

forge test --match-test test_Exploit_UnauthorizedWithdraw -vvvv

The -vvvv is not optional for me when a POC misbehaves. The full trace shows every call, every revert reason, every return value, and nine times out of ten the reason my exploit "doesn't work" is sitting right there in the trace: a require I forgot about, a value that came back zero, a prank that wore off because vm.prank only covers the next call. Read the trace before you start doubting the bug.
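The "prank that wore off" case is worth seeing once in isolation. The sketch below uses a hypothetical CallerRecorder contract that only stores msg.sender, so the difference between vm.prank and vm.startPrank shows up directly in the assertions.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical helper: records whoever called it last.
contract CallerRecorder {
    address public lastCaller;

    function ping() external {
        lastCaller = msg.sender;
    }
}

contract PrankScopeTest is Test {
    CallerRecorder internal recorder;
    address internal attacker = makeAddr("attacker");

    function setUp() public {
        recorder = new CallerRecorder();
    }

    function test_PrankCoversOnlyNextCall() public {
        vm.prank(attacker);
        recorder.ping();
        assertEq(recorder.lastCaller(), attacker);

        // No prank is active any more, so the caller is the test contract.
        recorder.ping();
        assertEq(recorder.lastCaller(), address(this));
    }

    function test_StartPrankCoversUntilStopped() public {
        vm.startPrank(attacker);
        recorder.ping();
        recorder.ping();
        vm.stopPrank();

        assertEq(recorder.lastCaller(), attacker);
    }
}
```

The same thing bites when a call's argument is itself an external call: the inner call uses up the prank before the outer one runs. Reading the value into a local variable first avoids it.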

Anvil and cast, for when reading isn't enough

Sometimes the code won't tell me what I need to know. The logic branches on some state I can't fully hold in my head, or a function's return value depends on a chain of getters and I want to see the actual number, not my mental model of it. That's when I stop reading and start poking.

I'll spin up a local node, or fork mainnet if the protocol depends on real on-chain state:

# Plain local devnet
anvil

# Or fork from a live network to inspect real deployed state
anvil --fork-url $RPC_URL

Then I just call things with cast and look at what comes back. No test harness, no assertions, just interrogating the contract directly:

# Read a view function and see the real value, not the one I assumed
cast call $VAULT "beneficiary()(address)" --rpc-url http://localhost:8545

# Send a transaction from one of Anvil's funded accounts and watch what happens
cast send $VAULT "setBeneficiary(address)" $ATTACKER \
  --rpc-url http://localhost:8545 --private-key $ANVIL_KEY

This is the part of the workflow that feels least like "auditing" and most like just messing around, but it pulls its weight. Reading a contract tells you what it's supposed to do. Calling it on a fork tells you what it does. When those two disagree, you've usually found something, and a few minutes of cast call will surface a disagreement that an hour of staring at the source might not. Once I see the contract actually behave the way I suspected, I take it back into a proper forge test and turn it into a POC. The Anvil session is reconnaissance; the test is the evidence.
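Carrying a cast session over into a test can look roughly like this. It's a sketch under assumptions: the IVault interface is hypothetical and only mirrors the two functions called with cast above, and RPC_URL and VAULT are environment variable names I'm assuming are set in the shell that runs forge.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import {Test} from "forge-std/Test.sol";

// Hypothetical interface matching the functions poked at with cast.
interface IVault {
    function beneficiary() external view returns (address);
    function setBeneficiary(address newBeneficiary) external;
}

contract VaultForkTest is Test {
    IVault internal vault;
    address internal attacker = makeAddr("attacker");

    function setUp() public {
        // Fork the network behind RPC_URL and point at the deployed contract.
        // Both values come from the environment (assumed names).
        vm.createSelectFork(vm.envString("RPC_URL"));
        vault = IVault(vm.envAddress("VAULT"));
    }

    function test_Exploit_AnyoneCanSetBeneficiary() public {
        vm.prank(attacker);
        vault.setBeneficiary(attacker);

        // Same observation as the cast call, now written as an assertion.
        assertEq(vault.beneficiary(), attacker);
    }
}
```

For a report I'd usually pass a block number as the second argument to vm.createSelectFork so the POC keeps passing after the live state moves on.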

The gas report, and being honest about where it sits

forge test --gas-report gives you a per-function breakdown of gas usage across your whole test run:

forge test --gas-report

You get a table for each contract: min, average, median, and max gas for every function your tests touch, plus deployment cost. On a public contest I rarely care. But on a private engagement it's a deliverable clients actually ask about, and I treat it as one.

Here's the order of operations I'm honest with clients about: security first, gas second. I am not going to recommend a gas optimization that introduces even the smallest amount of risk, and I'm wary of the whole genre of "clever" gas tricks that make code harder to reason about, because hard-to-reason-about code is where bugs live. The security review is the product. The gas work comes after, once the contracts are sound.

That said, when the security pass is done and I've got the contracts loaded into a test suite anyway, the gas report is nearly free to produce and clients genuinely value it. A protocol that's about to deploy is looking at real money in deployment cost and real money in what their users will pay per transaction for years. If I can point at the three functions eating the most gas, flag the storage layout that's costing them an extra SLOAD on a hot path, or note that a loop is re-reading a storage value on every iteration when it could read it once into a local variable, that's money I just saved them on top of the safety work they came for. It's a secondary deliverable, but it's a real one, and the gas report is how I find what's worth flagging instead of guessing.
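The loop case is the easiest one to show. This is a hypothetical contract, written only to illustrate the pattern: two functions that return the same total, one reading the array length from storage on every iteration and one reading it once.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

// Hypothetical contract, only here to illustrate the pattern.
contract RewardList {
    uint256[] internal rewards;

    function add(uint256 amount) external {
        rewards.push(amount);
    }

    // Reads rewards.length from storage in the loop condition on every pass.
    function totalUncached() external view returns (uint256 total) {
        for (uint256 i = 0; i < rewards.length; i++) {
            total += rewards[i];
        }
    }

    // Reads the length once and keeps it in a local variable.
    function totalCached() external view returns (uint256 total) {
        uint256 len = rewards.length;
        for (uint256 i = 0; i < len; i++) {
            total += rewards[i];
        }
    }
}
```

If a test calls both functions on the same data, they show up as separate rows in the gas report, which is how I'd check whether the change is worth mentioning. The saving per iteration is usually small and depends on compiler and optimizer settings, so I'd measure before recommending it.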

The short version

Foundry is where my suspicions go to get tested. I read code until something looks wrong, I poke at it with Anvil and cast when reading isn't enough, and then I write a POC that proves the bug beyond argument. The gas report rides along on private work because the contracts are already in a test harness and clients care about it.

If you want the longer version of how the reading-and-tracing part fits together, I wrote that up separately in how I actually audit a contract. And if you've got a protocol you want reviewed by someone who'll prove the findings instead of just asserting them, here's how I work.