How I Actually Audit a Smart Contract

People ask me what an audit actually looks like, and they usually expect some secret tooling or a clever trick. It isn't that. My process is mostly discipline and order. The order matters more than any single tool, because most bugs that survive into production aren't hidden in clever code. They're sitting in plain sight in a function nobody traced all the way through. So this is the routine I run, in the order I run it, and why each step earns its place.

Start with the contest page

If it's a public contest, the contest page is the first thing I read, end to end. Not skimmed. Read. It tells me what's in scope, what's out, what the sponsor already knows is shaky, and what they're worried about. It also tells me where they don't want me looking, which is sometimes more interesting than where they do.

This step is free reconnaissance. Someone has already done the work of summarizing the system and flagging the soft spots, and I'd be foolish to skip it. If it's a private engagement there's no contest page, but the same idea applies: I want the scope and the sponsor's own anxieties on the table before I form any opinions of my own.

Read the company docs before the code

This is the step people are most tempted to skip, and skipping it is a mistake. Before I open a single contract, I read the company's documentation in full. The goal here isn't technical. It's intent. I want to understand what the company is trying to achieve and what they expect their code to do.

That sounds obvious, but it's the whole game. Almost every serious bug is a gap between what the code is supposed to do and what it actually does. You cannot find that gap if you only know one side of it. If I read the code first, I learn what it does, and then I start unconsciously assuming that whatever it does is what it's meant to do. That's exactly the trap that lets bugs slide right past you. Reading the docs first gives me the intended behavior as an independent reference, so when the code disagrees with it later, I notice.

So docs come before code, on purpose. I'm building the "should" before I ever look at the "is."

Then the README, for the dev's logic

After the company docs, I go into the repository and read the README completely. The docs told me what the business wants. The README tells me how the developers think. It's where they explain the architecture, the build steps, the assumptions, the shortcuts. It's the closest thing I get to sitting next to them while they explain their own design.

By the time I'm done with the docs and the README, I have two mental models: what the protocol is supposed to do, and how the team believes they built it. Now I can go check whether those two things are actually true.

Pull the repo and read the code by hand

Now I pull the repo and start reading the codebase manually. Not searching for a specific bug yet, just reading, to understand it. I want the shape of the system in my head: which contracts exist, who talks to whom, where the money and the permissions live. This is slow and it's supposed to be slow. You can't reason about a system you don't actually understand, and there's no shortcut to understanding except reading.

Build it and run the tests, because tests are gold

Next I build it. First just to confirm it builds and the existing tests pass, because a repo that doesn't build cleanly is already telling you something. But the real reason I run the tests is that tests are gold for an auditor.

Here's why. A test suite is a written record of what the developers thought about. Every test is a scenario they cared enough to check, which means the gaps in the test suite are a map of what they didn't think about. The edge cases they never wrote a test for are exactly the cases they probably never reasoned through in the code either. When I see a function with thorough happy-path tests and nothing covering what happens on a weird input, an empty array, a reentrant call, or a malicious caller, that's not a dead end. That's an arrow pointing at where to dig.

So I read the tests almost like documentation of the developers' blind spots. They reveal what the devs missed in their own logic and implementation, and that's usually where the bugs are.
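As a rough illustration of a test gap, here is a minimal sketch. RewardSplitter, its split function, and both test names are hypothetical, invented for this example. The tests are written as plain require checks so the file needs no test library. The first test is the kind that usually exists. The second is the kind that usually doesn't.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical contract under audit: splits an amount evenly across recipients.
contract RewardSplitter {
    mapping(address => uint256) public credited;

    function split(uint256 amount, address[] calldata recipients) external {
        // With an empty array this divides by zero and reverts with a panic.
        uint256 share = amount / recipients.length;
        for (uint256 i = 0; i < recipients.length; i++) {
            credited[recipients[i]] += share;
        }
    }
}

/// Hypothetical test contract using plain require checks.
contract RewardSplitterTest {
    RewardSplitter private splitter = new RewardSplitter();

    // The happy path: two recipients, an amount that divides evenly.
    function testSplitTwoRecipients() external {
        address[] memory recipients = new address[](2);
        recipients[0] = address(0xA11CE);
        recipients[1] = address(0xB0B);
        splitter.split(100, recipients);
        require(splitter.credited(address(0xA11CE)) == 50, "first share");
        require(splitter.credited(address(0xB0B)) == 50, "second share");
    }

    // The edge case nobody wrote: an empty recipient list.
    function testSplitEmptyArrayReverts() external {
        address[] memory none = new address[](0);
        try splitter.split(100, none) {
            revert("expected a revert for empty recipients");
        } catch {
            // The call reverted, here because of the division by zero.
        }
    }
}
```

If only the first test exists in the repo, the empty-array path, and the remainder lost when the amount doesn't divide evenly, are the places I'd go read the implementation twice.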

An automated pass before the manual one

Before I commit to the slow manual work, I run an automated pass: Slither and Cyfrin's Aderyn for static analysis, Echidna for fuzzing, depending on the codebase. I run these early, not late.

The order is deliberate. These tools are fast and they're tireless, and they'll surface the mechanical stuff in minutes: shadowed variables, unchecked return values, reentrancy patterns, the well-known footguns. There's no reason to spend my human attention hunting for things a static analyzer finds for free. Let the machine clear the obvious layer first, and let the noise it produces point me at areas worth a closer look. Then I spend my actual brain on the things the tools can't reason about: business logic, intent, the gap between the docs and the code.

Automated first, manual second. The machine handles breadth so I can spend depth where it counts.
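To make "the mechanical stuff" concrete, here is a small sketch of two patterns that static analyzers typically flag. LeakyVault, IERC20Like, and every function name are hypothetical, and the contract is deliberately broken.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Minimal token interface, assumed for this sketch only.
interface IERC20Like {
    function transfer(address to, uint256 amount) external returns (bool);
}

/// Hypothetical, intentionally vulnerable contract. Do not deploy.
contract LeakyVault {
    mapping(address => uint256) public balances;
    IERC20Like public immutable token;

    constructor(IERC20Like token_) {
        token = token_;
    }

    function deposit() external payable {
        balances[msg.sender] += msg.value;
    }

    // Reentrancy pattern: the external call happens before the balance is
    // zeroed, so a malicious receiver can call withdraw() again mid-transfer.
    function withdraw() external {
        uint256 amount = balances[msg.sender];
        (bool ok, ) = msg.sender.call{value: amount}("");
        require(ok, "send failed");
        balances[msg.sender] = 0;
    }

    // Unchecked return value: transfer() reports failure by returning false,
    // and that result is silently dropped here.
    // (Access control is omitted only to keep the sketch short.)
    function payToken(address to, uint256 amount) external {
        token.transfer(to, amount);
    }
}
```

Neither issue needs any understanding of what the protocol is for, which is why it makes sense to let a tool find them.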

Now the manual audit: follow every function

This is the core of the work. I start following each function by hand. Not reading it in isolation and deciding it looks fine, but tracing how it interacts with other functions and other contracts. Where do its inputs come from? Who's allowed to call it? What does it call in turn, and what does that thing assume? A function can be perfectly correct on its own and still be a disaster because of who can reach it and with what.

That last part is where most of my real findings come from: following inputs back to their source. A value that looks safe inside one function is often something the caller fully controls, and once you trace it back far enough you realize the protocol is trusting data it should never have trusted.

That's exactly what happened on my AI Arena audit. The minting function took a fighter's type and DNA straight from the caller and only checked that you owned the mint pass you were burning. Nothing verified that the traits you passed in were the ones the protocol intended. Reading the inputs carefully and following where they actually came from is the whole reason that bug fell out. I wrote the full thing up in "AI Arena: hand-picking rare fighter NFTs via redeemMintPass." It's a good example of how this step pays off, because the function looked completely fine until you asked where its values were coming from.
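The shape of that bug can be sketched in a few lines. This is a simplified, hypothetical reconstruction of the pattern described above, not the actual AI Arena code. FighterMinter, issuePass, redeemPass, and the storage layout are all invented for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Simplified, hypothetical sketch of the pattern. Not the real contract.
contract FighterMinter {
    struct Fighter {
        address owner;
        uint8 fighterType;
        uint256 dna;
    }

    mapping(uint256 => address) public passOwner; // mint pass id => holder
    Fighter[] public fighters;

    // Hypothetical helper so the sketch is self-contained; ungated for brevity.
    function issuePass(uint256 passId, address to) external {
        require(passOwner[passId] == address(0), "pass exists");
        passOwner[passId] = to;
    }

    function redeemPass(uint256 passId, uint8 fighterType, uint256 dna) external {
        // The only check: the caller owns the pass being burned.
        require(passOwner[passId] == msg.sender, "not pass owner");
        delete passOwner[passId]; // burn the pass

        // Trace these two values back to their source: they come straight
        // from calldata, so the caller picks whatever traits they like.
        fighters.push(Fighter(msg.sender, fighterType, dna));
    }
}
```

Read on its own, redeemPass looks fine: it checks ownership and burns the pass. The problem only shows up when you ask who chose fighterType and dna. One possible fix, under the same assumptions, is to derive the traits from data the protocol committed to when the pass was issued rather than accepting them as arguments.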

Hammer the obvious high-risk spots

Once I've traced the logic, I go hit the usual suspects directly. Access control first. Who can call the privileged functions, and is every one of them actually gated? A surprising amount of the time, the answer is "this one isn't," and that's your whole finding.

// The kind of thing I'm checking for: is this gate actually here,
// and is it on every path that mutates privileged state?
function setTreasury(address newTreasury) external onlyOwner {
    treasury = newTreasury;
}

The trick is that the missing modifier never looks dramatic. The function with the onlyOwner left off looks exactly like the twenty functions around it that have it. You only catch it by checking every single one rather than assuming the pattern holds.
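Here is a minimal sketch of how undramatic that looks in context. ProtocolConfig and setFeeBps are hypothetical names; setTreasury is the function from the snippet above, placed next to an ungated sibling.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical contract: one privileged setter is gated, one is not.
contract ProtocolConfig {
    address public owner;
    address public treasury;
    uint256 public feeBps;

    modifier onlyOwner() {
        require(msg.sender == owner, "not owner");
        _;
    }

    constructor() {
        owner = msg.sender;
    }

    // Gated, as expected.
    function setTreasury(address newTreasury) external onlyOwner {
        treasury = newTreasury;
    }

    // Same shape, same neighborhood, no gate: anyone can change the fee.
    function setFeeBps(uint256 newFeeBps) external {
        feeBps = newFeeBps;
    }
}
```

Nothing about setFeeBps stands out when you scroll past it. Listing every external and public function that writes privileged state, then checking each one for its modifier, is what catches it.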

After access control I go after anything involving randomness or external data. VRF usage, oracle reads, price feeds, anything where a value the protocol depends on could be nudged or predicted. If a result is supposed to be random but is actually a deterministic function of an input someone controls, it isn't random, it's a lookup table they can browse. If a price comes from a source that can be manipulated within a transaction, the whole thing built on top of it is exposed. These spots are obvious targets precisely because they're high value, so I make sure I go at them deliberately rather than hoping I stumble across them.
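The "lookup table" point can be shown with a short sketch. LootRoll, roll, and previewRoll are hypothetical names, and the example covers only the randomness case, not price manipulation.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

/// Hypothetical contract: a "random" roll that is fully predictable.
contract LootRoll {
    mapping(address => uint256) public lastRoll;

    // The result depends only on the caller's address and a seed the caller
    // supplies, so the caller knows the outcome before sending the transaction.
    function roll(uint256 userSeed) external returns (uint256) {
        uint256 result = previewRoll(msg.sender, userSeed);
        lastRoll[msg.sender] = result;
        return result;
    }

    // Anyone can evaluate this off-chain and keep trying seeds until the
    // result is the one they want.
    function previewRoll(address player, uint256 userSeed) public pure returns (uint256) {
        return uint256(keccak256(abi.encodePacked(player, userSeed))) % 100;
    }
}
```

The hash makes the output look random, but every input to it is known to the caller. previewRoll just makes explicit what an attacker would compute off-chain anyway.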

Why the order is the method

If there's one thing I'd want a founder to take from this, it's that the sequence is the point. Docs before code, so I know the intended behavior before the implementation can bias me. Tests early, because they show me where the team's attention ran out. Automated tools before manual review, so I'm not spending human focus on what a script can find. And manual tracing and the targeted high-risk checks last, because by then I actually understand the system well enough to know what "wrong" would look like.

None of these steps are exotic. That's deliberate. A good audit isn't a magic trick, it's the refusal to skip the boring parts. The bugs that take down protocols are almost never clever. They're the function nobody traced to the end, the input nobody followed back to its source, the modifier nobody noticed was missing.

If that's the kind of careful reading you want on your contracts before they ship, here's how I work.