<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/">
    <channel>
        <title>nowarp Blog</title>
        <link>https://nowarp.io/blog</link>
        <description>nowarp Blog</description>
        <lastBuildDate>Fri, 24 Apr 2026 00:00:00 GMT</lastBuildDate>
        <docs>https://validator.w3.org/feed/docs/rss2.html</docs>
        <generator>https://github.com/jpmonette/feed</generator>
        <language>en</language>
        <item>
            <title><![CDATA[Compiler Testing — Part 1: Coverage-Guided Fuzzing with Grammars and LLMs]]></title>
            <link>https://nowarp.io/blog/compiler-testing-part-1</link>
            <guid>https://nowarp.io/blog/compiler-testing-part-1</guid>
            <pubDate>Fri, 24 Apr 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[Compiler fuzzing for small languages is a specific problem: few optimization passes, tiny corpora, thin docs. This post covers how coverage-guided fuzzing and LLM-assisted tooling adapt to smart-contract compilers, including a literature overview, related projects, and evaluation results. The campaign found 100+ compiler bugs across Sui Move, Cairo, Solang, Solidity, and Leo. None of them are lexer or parser crashes on malformed input: every bug was triggered by a structurally valid program against a mature, audited, production compiler.]]></description>
            <content:encoded><![CDATA[<p>Compiler fuzzing for small languages is a specific problem: few optimization passes, tiny corpora, thin docs. This post covers how coverage-guided fuzzing and LLM-assisted tooling adapt to smart-contract compilers, including a literature overview, related projects, and evaluation results. The campaign found <strong>100+ compiler bugs</strong> across <a href="https://github.com/MystenLabs/sui/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Sui Move</a>, <a href="https://github.com/starkware-libs/cairo/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Cairo</a>, <a href="https://github.com/hyperledger-solang/solang/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Solang</a>, <a href="https://github.com/argotorg/solidity/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Solidity</a>, and <a href="https://github.com/ProvableHQ/leo/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Leo</a>. None of them are lexer or parser crashes on malformed input: every bug was triggered by a structurally valid program against a mature, audited, production compiler.</p>
<p>This post may be useful to you if you:</p>
<ul>
<li class="">Develop, maintain, or test a programming language, especially one targeting smart contracts</li>
<li class="">Do structure-aware fuzzing against real-world targets</li>
</ul>
<p>The post is organized as follows:</p>
<ol>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#background" class="">Background</a></strong> — related work, existing approaches, and what makes small-language fuzzing different from C/C++</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#fuzzing-harness-and-configuration" class="">Fuzzing harness and configuration</a></strong> — harness design, fuzzer orchestration, tuning for compiler targets</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#custom-mutators" class="">Custom mutators</a></strong> — leveraging LLMs and tree-sitter grammars in AFL++ mutators</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#corpus-and-dictionaries" class="">Corpus and dictionaries</a></strong> — corpus collection, minimization, dictionary construction</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#triage-workflow" class="">Triage workflow</a></strong> — deduplication, minimization assisted by tools and LLM, and report filing</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#evaluation" class="">Evaluation</a></strong> — all targets, all results, consolidated</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#conclusion-and-further-work" class="">Conclusion and further work</a></strong> — summarizes the post, notes what comes next, lists the published tools</li>
</ol>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="background">Background<a href="https://nowarp.io/blog/compiler-testing-part-1#background" class="hash-link" aria-label="Direct link to Background" title="Direct link to Background" translate="no">​</a></h2>
<p>Fuzzing is one of the approaches for finding bugs in compilers. While it does not provide correctness guarantees, it enables you to uncover hidden bugs by generating corner cases that users rarely trigger. Compilers are particularly good targets – they process complex structured input through multiple transformation passes with internal invariants and assumptions.</p>
<p>In the simplest case, the goal is to find compiler crashes, i.e. internal compiler errors (ICEs). This is the easy case because no fuzzing oracle is needed: just execute the compilation pipeline on fuzz data and collect crashes. This post focuses only on ICEs; other kinds of errors will be covered in a later part.</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-valid-leo-crash-bfe7fda170a23b351190cbe685a6c2bd.png" style="width:75%"></div>
<div align="center"><em>Valid ICE: hex literal as array index → compiler panic (<a href="https://github.com/ProvableHQ/leo/issues/29229" target="_blank" rel="noopener noreferrer" class="">leo#29229</a>)</em></div>
<br>
<p>Issues found this way pose a low risk to end users. They may crash the compiler itself or the tooling built on it (e.g. analyzers, LSP), which blocks the user from writing the code they planned and disrupts development, but they don't affect the running program.</p>
<p>The standard fuzzing technique when source code is available is coverage-guided fuzzing. Popular fuzzers operate at the byte and bit level, but compilers accept structured input. Random byte mutations mostly trigger lexer/parser errors and rarely reach later passes. That's why grammar-aware fuzzing exists.</p>
<p><strong>Key idea of grammar-aware fuzzing</strong>: generate syntactically correct programs that likely pass the lexer/parser and hit internals of the compiler. This way, we challenge later passes like the typechecker, semantic analysis, and codegen – trying to violate some invariants and assumptions the compiler developers made.</p>
<p>While challenging the lexer/parser is easy, it was intentionally skipped for all the compilers. In small teams and small languages, nobody really cares if input containing 5000 sequential <code>(</code> symbols will crash the parser. This kind of issue is very common and could be easily found, but it is not worth the time to report or fix, because no sane user will ever write code like this.</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-malformed-cairo-3df8dd061394436514a29be7f25dd1eb.png"></div>
<div align="center"><em>Cairo: malformed AST (unterminated <code>$</code> in macro rule) → out-of-bounds access. Such bugs <strong>were not considered valid</strong> and <strong>not reported</strong>.</em></div>
<br>
<p>Most existing research on grammar-aware compiler fuzzing targets C/C++ compilers. Some of these approaches transfer to smart-contract languages, some do not. Here is what makes these targets different:</p>
<ul>
<li class=""><strong>Few optimization passes</strong> – smart-contract languages focus on correctness, not runtime speed, and are developed by small teams. Program generators (e.g. <a href="https://github.com/csmith-project/csmith" target="_blank" rel="noopener noreferrer" class="">CSmith</a> or <a href="https://github.com/intel/yarpgen" target="_blank" rel="noopener noreferrer" class="">YARPGen</a>) or EMI mutators (like Hermes <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[7]</a> or XDead <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[8]</a>) that target miscompilations from aggressive optimizations are of limited use here.</li>
<li class=""><strong>Simpler execution environments</strong> – smart-contract languages compile to a blockchain runtime, not to general-purpose platforms. This means fewer codegen paths and a simpler runtime, which limits the surface for deep codegen bugs.</li>
<li class=""><strong>Rust as implementation language</strong> – many of these compilers are written in Rust, which determines the tooling (cargo-fuzz, AFL++ Rust bindings) and the crash patterns we target: panics, unprotected unwraps, index-out-of-bounds.</li>
<li class=""><strong>Low popularity</strong> – fewer real-world examples are available, which limits corpus collection and approaches that rely on injecting existing code snippets into the fuzzing process <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[1]</a>.</li>
<li class=""><strong>Often poor documentation</strong> – approaches leveraging language documentation or specification <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[3]</a> are limited, though they work when teams explicitly care about good docs.</li>
<li class=""><strong>Tree-sitter grammars available</strong> – smart-contract languages typically ship <a href="https://tree-sitter.github.io/" target="_blank" rel="noopener noreferrer" class="">tree-sitter</a> grammars for tooling (IDE extensions, syntax highlighting), while <a href="https://www.antlr.org/" target="_blank" rel="noopener noreferrer" class="">ANTLR4</a> grammars are rare. This makes tools leveraging tree-sitter work well out of the box.</li>
</ul>
<p>The fuzzing campaign for ICE requires the following parts to be implemented:</p>
<ul>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#fuzzing-harness-and-configuration" class="">Fuzzing harness</a></strong> – executes the compilation pipeline on fuzz inputs and collects crashes. Needed to run fuzzers in persistent mode and filter out benign panics like stack overflows from parser bugs.</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#custom-mutators" class="">Custom mutators</a></strong> – implement grammar-aware mutation rules on top of AFL++. Default byte-level mutators can't generate structurally valid programs, so custom mutators are what actually get past the parser.</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#corpus-and-dictionaries" class="">Corpus</a></strong> – a collection of seed programs that mutations are derived from. All grammar-aware mutators operate on these inputs, so corpus quality directly determines mutation quality.</li>
<li class=""><strong><a href="https://nowarp.io/blog/compiler-testing-part-1#dictionaries" class="">Fuzzing dictionaries</a></strong> – lists of language-specific tokens fed to default mutators (if used). Help byte-level mutations produce valid-looking fragments instead of pure noise.</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="fuzzing-harness-and-configuration">Fuzzing harness and configuration<a href="https://nowarp.io/blog/compiler-testing-part-1#fuzzing-harness-and-configuration" class="hash-link" aria-label="Direct link to Fuzzing harness and configuration" title="Direct link to Fuzzing harness and configuration" translate="no">​</a></h2>
<p>A fuzzing harness is a program that the fuzzers run in persistent mode: it receives fuzz inputs, feeds them to the compilation pipeline, and reports crashes. It also filters out benign panics, such as stack overflows typically caused by lexer/parser bugs.</p>
<p>The main fuzzer used is AFL++. Of the fuzzers used here, it is the most mature, provides an API for writing custom mutators, and has the richest configuration options. honggfuzz and libFuzzer use different mutation algorithms, which increases coverage when they are combined with AFL++.</p>
<p>In some campaigns, honggfuzz and libFuzzer were executed in a single thread and were supplementary; the main work was done by AFL++.</p>
<p>While AFL++ <a href="https://github.com/AFLplusplus/AFLplusplus//blob/4e5c0469ad9d56060317ebdc88027e2143f7b979/docs/Changelog.md#L931" target="_blank" rel="noopener noreferrer" class="">provides an option</a> to sync with foreign fuzzers, you still need a separate harness binary for each fuzzer.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="multifuzz-unified-orchestration"><code>multifuzz</code>: unified orchestration<a href="https://nowarp.io/blog/compiler-testing-part-1#multifuzz-unified-orchestration" class="hash-link" aria-label="Direct link to multifuzz-unified-orchestration" title="Direct link to multifuzz-unified-orchestration" translate="no">​</a></h3>
<p>To simplify configuration and orchestration of multiple fuzzers, a lightweight orchestrator called <a href="https://github.com/jubnzv/multifuzz" target="_blank" rel="noopener noreferrer" class=""><code>multifuzz</code></a> was implemented. It solves three tasks:</p>
<ul>
<li class="">Unified Rust API to configure all three fuzzers in a single config</li>
<li class="">Explicit configuration for all the fuzzers – all <a href="https://aflplus.plus/docs/env_variables/" target="_blank" rel="noopener noreferrer" class="">env variables</a> and fuzzer arguments must be described explicitly in the config, zero hidden options</li>
<li class="">CLI to manage individual fuzzing instances: start, stop, restart</li>
</ul>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-multifuzz-harness-842c942f7cb610d44080e6fe8b05dcc2.png"></div>
<div align="center"><em>multifuzz harness: single Rust macro shared by AFL++, honggfuzz, and libFuzzer</em></div>
<br>
<p>Overall, it adds a zero-overhead configuration layer that sets up a single harness for all three fuzzers and manages them at runtime. Everything is 100% explicit – the tool does not introduce any fancy defaults, so you have to <a href="https://aflplus.plus/docs/fuzzing_in_depth/" target="_blank" rel="noopener noreferrer" class="">read the documentation</a>.</p>
<p>Here is an <a href="https://github.com/nowarp/move-fuzz/blob/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/crates/source-multifuzz/multifuzz.toml" target="_blank" rel="noopener noreferrer" class="">example configuration</a> used to fuzz the Sui Move compiler that shows how multiple workers with different options may be configured.</p>
<p><code>multifuzz</code> itself is optional. You could achieve the same results with a Makefile or custom scripts, and/or by running <a href="https://github.com/tmux/tmux/" target="_blank" rel="noopener noreferrer" class="">tmux</a> sessions for each fuzzer worker.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="fuzzers-configuration">Fuzzers configuration<a href="https://nowarp.io/blog/compiler-testing-part-1#fuzzers-configuration" class="hash-link" aria-label="Direct link to Fuzzers configuration" title="Direct link to Fuzzers configuration" translate="no">​</a></h3>
<p>To achieve the best fuzzer performance for grammar-aware testing, the following options were used:</p>
<ul>
<li class=""><strong>Selective instrumentation</strong> – used to focus fuzzing on specific places in the source code, like recently added features in the compiler. The approach used AFL++ partial instrumentation and is well described in <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[2]</a> and the <a href="https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.instrument_list.md" target="_blank" rel="noopener noreferrer" class="">documentation</a>.</li>
<li class=""><strong>No complex byte-level mutators</strong> were used – cmp-log (or redqueen <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[6]</a>), fuzzers <a href="https://github.com/microsvuln/awesome-afl" target="_blank" rel="noopener noreferrer" class="">involving symbolic execution</a>, and <a href="https://github.com/AngoraFuzzer/Angora" target="_blank" rel="noopener noreferrer" class="">Angora</a> (which uses taint traces from inputs) were all skipped, since they are designed to guide bit/byte-level mutations. They bring little benefit to grammar-aware fuzzing, and their execution overhead slows down the fuzzer.</li>
<li class=""><strong>Timeouts</strong> – compilers are slow, and some mutations may generate code that increases compilation time, e.g. by hitting constant evaluation or generating many entries. The timeout was typically set around 1000ms – enough to keep the corpus clean and avoid cluttering it with useless inputs.</li>
<li class=""><strong>Memory limits</strong> – some targets eat RAM; a special case is Cairo, which uses <a href="https://github.com/salsa-rs/salsa" target="_blank" rel="noopener noreferrer" class="">Salsa</a> – an incremental computation library with its own cache. Other projects may also consume a lot of memory when dealing with large generated inputs. Setting an explicit memory limit with the AFL++ <code>-m</code> option is required for such targets.</li>
</ul>
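<p>As a rough illustration of the timeout and memory-limit options above, the sketch below builds an <code>afl-fuzz</code> command line from Rust. The <code>-i</code>, <code>-o</code>, <code>-t</code>, and <code>-m</code> flags are standard <code>afl-fuzz</code> options; the <code>corpus</code>, <code>out</code>, and <code>./harness</code> paths and the <code>afl_command</code> helper are placeholders, not the configuration used in the campaigns.</p>

```rust
use std::process::Command;

/// Build an `afl-fuzz` invocation with an explicit timeout and memory limit.
/// The directory names and the harness path are placeholder assumptions.
fn afl_command(timeout_ms: u32, mem_limit_mb: u32) -> Command {
    let mut cmd = Command::new("afl-fuzz");
    cmd.arg("-i").arg("corpus") // seed corpus directory (placeholder)
        .arg("-o").arg("out") // output directory (placeholder)
        .arg("-t").arg(timeout_ms.to_string()) // per-execution timeout in ms
        .arg("-m").arg(mem_limit_mb.to_string()) // memory limit in MB
        .arg("--")
        .arg("./harness"); // harness binary (placeholder)
    cmd
}
```

<p>Calling <code>afl_command(1000, 4096)</code> yields a command with the 1000 ms timeout mentioned above; the 4096 MB limit is only an example value and would need tuning per target.</p>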
<p>Otherwise, the fuzzing process relies mostly on custom mutators. Fuzzer configuration follows the <a href="https://aflplus.plus/docs/fuzzing_in_depth/" target="_blank" rel="noopener noreferrer" class="">AFL++ documentation</a>.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="custom-mutators">Custom mutators<a href="https://nowarp.io/blog/compiler-testing-part-1#custom-mutators" class="hash-link" aria-label="Direct link to Custom mutators" title="Direct link to Custom mutators" translate="no">​</a></h2>
<p>Recent versions of AFL++ provide <a href="https://github.com/AFLplusplus/AFLplusplus/blob/4e5c0469ad9d56060317ebdc88027e2143f7b979/custom_mutators/rust/README.md" target="_blank" rel="noopener noreferrer" class="">Rust API bindings</a> to write custom mutators, which simplify development — smart-contract languages are often written in Rust, so you can trigger their internals (e.g. parser, AST) directly from the custom mutator.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="ad-hoc-custom-mutator">Ad-hoc custom mutator<a href="https://nowarp.io/blog/compiler-testing-part-1#ad-hoc-custom-mutator" class="hash-link" aria-label="Direct link to Ad-hoc custom mutator" title="Direct link to Ad-hoc custom mutator" translate="no">​</a></h3>
<p>The first attempt was simple: after reading the <a href="https://github.com/agroce/afl-compiler-fuzzer" target="_blank" rel="noopener noreferrer" class="">experiment</a> conducted by Alex Groce <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[13]</a>, the idea was to create a Move-specific AFL++ custom mutator. The result is a <a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/move" target="_blank" rel="noopener noreferrer" class="">small mutator</a> written in C that swaps common language symbols (e.g. <code>{</code> and <code>[</code>), replaces and deletes code blocks, and provides some Move-specific mutations. It uses the custom mutator API and does not fork AFL++.</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-move-mutator-cab49c7323cd37cc2ca5044633ee5768.png" style="width:75%"></div>
<div align="center"><em>Example of ad-hoc Move-specific mutations implemented</em></div>
<br>
<p>The problems with this approach:</p>
<ol>
<li class="">To be generic enough to target all C-style-syntax compilers, it has to sacrifice language-specific patterns</li>
<li class="">It relies heavily on a good corpus</li>
<li class="">It is too focused on havoc-style mutations without respecting program structure</li>
</ol>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="afl-ts-tree-sitter-based-afl-mutator"><code>afl-ts</code>: Tree-sitter based AFL++ mutator<a href="https://nowarp.io/blog/compiler-testing-part-1#afl-ts-tree-sitter-based-afl-mutator" class="hash-link" aria-label="Direct link to afl-ts-tree-sitter-based-afl-mutator" title="Direct link to afl-ts-tree-sitter-based-afl-mutator" translate="no">​</a></h3>
<p>Instead of mutating bytes, we could mutate the AST directly. <a href="https://tree-sitter.github.io/" target="_blank" rel="noopener noreferrer" class="">Tree-sitter</a> grammars give you typed nodes to swap, delete, and splice — preserving program structure.</p>
<p>A similar tool and approach already exist in the Rust ecosystem: <a href="https://github.com/langston-barrett/tree-splicer" target="_blank" rel="noopener noreferrer" class="">tree-splicer</a> is used by <a href="https://github.com/langston-barrett/tree-crasher" target="_blank" rel="noopener noreferrer" class="">tree-crasher</a> and <a href="https://github.com/matthiaskrgr/icemaker" target="_blank" rel="noopener noreferrer" class="">icemaker</a> to find ICEs in the rustc compiler. Similar mutation algorithms are applied in multiple <a href="https://nowarp.io/blog/compiler-testing-part-1#other-grammar-aware-fuzzers" class="">grammar-aware fuzzing projects</a>, which typically use <a href="https://www.antlr.org/" target="_blank" rel="noopener noreferrer" class="">ANTLR4</a> grammars, uncommon among smart-contract languages. However, tree-splicer is a standalone tool, not an AFL++ custom mutator.</p>
<p>The <a href="https://github.com/jubnzv/afl-ts" target="_blank" rel="noopener noreferrer" class=""><code>afl-ts</code></a> mutator integrates grammar-aware mutations into AFL++ as a custom mutator. It is fully configurable via environment variables and works with any tree-sitter grammar built with modern tree-sitter. Instead of tweaking the mutator to add the language <a href="https://github.com/langston-barrett/tree-splicer/pull/3/changes#diff-0aef08b59d5e277120a6ed6f290f2171683340f202fb8aaf745cb0f4a1d4e7bdR4" target="_blank" rel="noopener noreferrer" class="">as needed in tree-splicer</a>, the user just points to the grammar shared library via the <code>TS_GRAMMAR</code> env variable; the language function name can be set via <code>TS_LANG_FUNC</code>, but the tool can usually deduce it from the grammar filename.</p>
<p>Here is the complete table of mutations it performs:</p>
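<p>To make the filename deduction concrete, here is a minimal sketch of how a language function name could be guessed from a grammar path. It relies only on the tree-sitter convention that a parser exports a C function named <code>tree_sitter_&lt;language&gt;</code>. The helpers <code>guess_lang_func</code> and <code>resolve_lang_func</code> are hypothetical, and the actual rules in <code>afl-ts</code> may differ.</p>

```rust
use std::path::Path;

/// Guess the exported language function for a grammar shared library.
/// Hypothetical helper: the real deduction rules in afl-ts may differ.
fn guess_lang_func(grammar_path: &str) -> Option<String> {
    // File name without directory and extension, e.g. "libtree-sitter-move".
    let stem = Path::new(grammar_path).file_stem()?.to_str()?;
    // Drop the usual library prefixes to isolate the language name.
    let lang = stem
        .strip_prefix("libtree-sitter-")
        .or_else(|| stem.strip_prefix("tree-sitter-"))
        .unwrap_or(stem);
    if lang.is_empty() {
        return None;
    }
    // Dashes are not valid in C identifiers, so map them to underscores.
    Some(format!("tree_sitter_{}", lang.replace('-', "_")))
}

/// An explicit override (e.g. the value of TS_LANG_FUNC) wins over the guess.
fn resolve_lang_func(override_name: Option<&str>, grammar_path: &str) -> Option<String> {
    match override_name {
        Some(name) => Some(name.to_string()),
        None => guess_lang_func(grammar_path),
    }
}
```

<p>In a real mutator, the override would come from something like <code>std::env::var("TS_LANG_FUNC").ok()</code>, and the resulting symbol would then be looked up in the shared library named by <code>TS_GRAMMAR</code>.</p>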
<table><thead><tr><th>Strategy</th><th>Weight</th><th>What it does</th></tr></thead><tbody><tr><td><code>ts-del</code></td><td>20</td><td>Delete a named AST subtree</td></tr><tr><td><code>ts-bank</code></td><td>20</td><td>Replace subtree with type-compatible one from corpus bank (<code>TSSymbol</code> match)</td></tr><tr><td><code>ts-add</code></td><td>20</td><td>Replace subtree with type-compatible one from AFL++'s <code>add_buf</code></td></tr><tr><td><code>ts-swap</code></td><td>15</td><td>Swap two sibling nodes of the same type</td></tr><tr><td><code>ts-shrink</code></td><td>10</td><td>Replace node with a same-type descendant (always reduces size)</td></tr><tr><td><code>ts-lit</code></td><td>5</td><td>Replace leaf with random literal</td></tr><tr><td><code>ts-dup</code></td><td>3</td><td>Duplicate a subtree adjacent to itself</td></tr><tr><td><code>ts-ins</code></td><td>7</td><td>Insert a type-compatible bank subtree after a node (grows input, capped at 2x)</td></tr><tr><td><code>ts-range</code></td><td>4</td><td>Replace a contiguous run of same-symbol siblings with a same-symbol run from <code>add_buf</code> or 1..3 concatenated bank entries</td></tr><tr><td><code>ts-chaos</code></td><td>2</td><td>Bypass the type-safety filter on <code>ts-bank</code> / <code>ts-add</code> / <code>ts-range</code> / <code>ts-kins</code> / <code>ts-stutter</code>: splice a random bank (or <code>add_buf</code>) node into the destination regardless of <code>TSSymbol</code>, or stutter the envelope of any parent around any descendant. Produces deliberately ungrammatical inputs to increase coverage.</td></tr><tr><td><code>ts-kdel</code></td><td>10</td><td>Delete 1..3 contiguous children from a run of same-symbol siblings, swallowing one adjacent separator so the remaining list stays well-formed</td></tr><tr><td><code>ts-kins</code></td><td>10</td><td>Insert 1..3 same-symbol children at a random boundary of a same-symbol sibling run. 
Donors come from <code>add_buf</code>, the bank, or a duplicated existing member. Separator is detected from the existing list</td></tr><tr><td><code>ts-stutter</code></td><td>4</td><td>Pick a parent <code>P</code> and a same-symbol descendant <code>C</code>, then repeat <code>P</code>'s prefix/suffix envelope <code>N</code> times around <code>C</code> (<a href="https://gitlab.com/akihe/radamsa" target="_blank" rel="noopener noreferrer" class="">radamsa</a>-style tree stutter). Type-safe by default; chaos mode drops the symbol-equality filter.</td></tr></tbody></table>
<p>Weights represent the relative probability of each mutation being applied (they sum to 130, not 100).</p>
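<p>One common way to turn such weights into a choice is a cumulative-weight walk, sketched below. This assumes the weights are used proportionally, which is an interpretation of the table and not necessarily how <code>afl-ts</code> implements selection. Under that assumption <code>ts-del</code> would be picked with probability 20/130, roughly 15%. The function names are illustrative.</p>

```rust
/// Strategy names and weights copied from the table above.
const STRATEGIES: &[(&str, u32)] = &[
    ("ts-del", 20),
    ("ts-bank", 20),
    ("ts-add", 20),
    ("ts-swap", 15),
    ("ts-shrink", 10),
    ("ts-lit", 5),
    ("ts-dup", 3),
    ("ts-ins", 7),
    ("ts-range", 4),
    ("ts-chaos", 2),
    ("ts-kdel", 10),
    ("ts-kins", 10),
    ("ts-stutter", 4),
];

/// Sum of all weights; a random number is reduced modulo this value.
fn total_weight() -> u32 {
    STRATEGIES.iter().map(|(_, w)| *w).sum()
}

/// Map a random number to a strategy: walk the table and subtract
/// each weight until the roll falls inside one strategy's bucket.
fn pick_strategy(random: u32) -> &'static str {
    let mut roll = random % total_weight();
    for &(name, weight) in STRATEGIES {
        if roll < weight {
            return name;
        }
        roll -= weight;
    }
    unreachable!("roll is always below the total weight")
}
```

<p>The caller supplies the random number, so the sketch stays independent of any particular random number generator.</p>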
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-ts-add-1d52b933e8f77ec740007850cfefff17.png"></div>
<div align="center"><em><code>ts-add</code> replaces a function element's contents with another from the same file. <code>ts-bank</code> does the same across files in the queue.</em></div>
<br>
<p>Typically, corpus files grow a little when using <code>ts-ins</code> a lot, but not significantly, because the addition must add some coverage to be kept by AFL++.</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-ts-swap-6a6b5e8a819b9a4401d7cb388aac9641.png"></div>
<div align="center"><em><code>ts-swap</code> picks matching elements (here, function return types) and swaps them.</em></div>
<br>
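<p>At the source level, a swap of two sibling nodes comes down to exchanging the text of two non-overlapping byte ranges. The sketch below shows only that step on plain strings; in a tree-sitter based mutator the ranges would come from the nodes' start and end byte offsets. <code>swap_spans</code> is a hypothetical helper, not code from <code>afl-ts</code>.</p>

```rust
use std::ops::Range;

/// Swap the text of two byte ranges, where `first` must end before `second` starts.
/// Returns None if a range is invalid, splits a UTF-8 character, or the ranges overlap.
fn swap_spans(src: &str, first: Range<usize>, second: Range<usize>) -> Option<String> {
    // `get` rejects out-of-bounds, reversed, and non-char-boundary ranges.
    let a = src.get(first.clone())?;
    let b = src.get(second.clone())?;
    if first.end > second.start {
        return None;
    }
    let mut out = String::with_capacity(src.len());
    out.push_str(&src[..first.start]); // text before the first node
    out.push_str(b); // second node moved into the first slot
    out.push_str(&src[first.end..second.start]); // text between the nodes
    out.push_str(a); // first node moved into the second slot
    out.push_str(&src[second.end..]); // text after the second node
    Some(out)
}
```

<p>Because the rewrite works on byte offsets, it needs no AST-to-source printer: everything outside the two ranges is copied through unchanged.</p>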
<p>This mutator alone found lots of bugs; most of the <a href="https://github.com/hyperledger-solang/solang/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Solang</a> and <a href="https://github.com/argotorg/solidity/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Solidity</a> findings came from it.</p>
<p>The quality of the tree-sitter grammar matters — grammars producing too many <code>ERROR</code> nodes on valid input degrade mutation quality. Here are the grammars used:</p>
<table><thead><tr><th>Target</th><th>Tree-sitter grammar</th></tr></thead><tbody><tr><td>Sui Move</td><td><a href="https://github.com/MystenLabs/sui/tree/main/external-crates/move/tooling/tree-sitter" target="_blank" rel="noopener noreferrer" class="">tree-sitter-move</a></td></tr><tr><td>Cairo</td><td><a href="https://github.com/starkware-libs/tree-sitter-cairo" target="_blank" rel="noopener noreferrer" class="">tree-sitter-cairo</a></td></tr><tr><td>Leo</td><td><a href="https://github.com/r001/tree-sitter-leo" target="_blank" rel="noopener noreferrer" class="">tree-sitter-leo</a></td></tr><tr><td>Solidity / Solang</td><td><a href="https://github.com/JoranHonig/tree-sitter-solidity" target="_blank" rel="noopener noreferrer" class="">tree-sitter-solidity</a></td></tr></tbody></table>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="metamut-style-mutators">MetaMut-style mutators<a href="https://nowarp.io/blog/compiler-testing-part-1#metamut-style-mutators" class="hash-link" aria-label="Direct link to MetaMut-style mutators" title="Direct link to MetaMut-style mutators" translate="no">​</a></h3>
<p>Beyond tree-sitter mutations, we want language-specific operations that test semantic and codegen passes — without hand-writing them. MetaMut solves this.</p>
<p>The MetaMut paper <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[5]</a> describes an approach to generating language-specific mutations using LLMs. While the experiment in the paper <a href="https://github.com/icsnju/MetaMut" target="_blank" rel="noopener noreferrer" class="">focused</a> on C and C++ compilers, it can be applied to Rust-based smart-contract languages as well.</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-metamut-scheme-5eceb27d01540ba7678430781c2ad8d7.png"></div>
<div align="center"><em>MetaMut pipeline (source: <a href="https://connglli.github.io/pdfs/metamut_asplos24.pdf" target="_blank" rel="noopener noreferrer" class="">original paper</a>)</em></div>
<br>
<p>We will consider the MetaMut-style mutator developed for Sui Move: <a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove" target="_blank" rel="noopener noreferrer" class="">MetaMove</a>. While the approach is applicable to other languages, we will focus on Move: the MetaMove repository contains <a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove/src/mutators" target="_blank" rel="noopener noreferrer" class="">884 unique mutators</a> plus all the <a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove/scripts" target="_blank" rel="noopener noreferrer" class="">scripts</a> needed to demonstrate the approach. While the core idea is similar to MetaMut, the implementation differs in several ways.</p>
<p>From the implementation perspective, it consists of these components:</p>
<ol>
<li class=""><strong>Rust mutator library</strong> – a small Rust library that lets the model create custom mutators using the compiler's AST without reading the whole compiler codebase on each step. It contains a simplified AST and some logic to call custom mutators.</li>
<li class=""><strong>Script to invent new mutators</strong> – combines mutating operations (<code>swap</code>, <code>toggle</code>, ...) with all available AST elements, generates descriptions of how each mutation should work, and saves the results.</li>
<li class=""><strong>Script to implement new mutators</strong> – takes the descriptions generated by the previous script and the AST from the library, and calls the model to generate mutations with compilation feedback.</li>
<li class=""><strong>Script to verify the generated mutators</strong> – checks whether all of them can be applied and whether they generate syntactically valid code.</li>
</ol>
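<p>The "invent" step in the list above is essentially an enumeration: each mutating operation is paired with each AST element, and every pair becomes a candidate for the model to describe. A minimal sketch of that enumeration follows. The function name and the plain-string output are assumptions for illustration; the actual scripts in the repository may structure this differently.</p>

```rust
/// Pair every mutating operation with every AST target to get candidate
/// mutator ideas. In the pipeline described above, each candidate would
/// then be handed to the model to produce a full description.
fn candidate_mutators(ops: &[&str], targets: &[&str]) -> Vec<String> {
    let mut out = Vec::with_capacity(ops.len() * targets.len());
    for op in ops {
        for target in targets {
            out.push(format!("{op} {target}"));
        }
    }
    out
}
```

<p>For example, two operations (<code>swap</code>, <code>toggle</code>) and two targets give four candidates, so the candidate count grows as the product of both lists.</p>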
<p>The experiment used <strong>Sonnet 4.6</strong> to invent and generate the mutators.</p>
<p>The sections below cover the implementation and its differences from the original approach in greater detail.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="rust-mutator-library">Rust mutator library<a href="https://nowarp.io/blog/compiler-testing-part-1#rust-mutator-library" class="hash-link" aria-label="Direct link to Rust mutator library" title="Direct link to Rust mutator library" translate="no">​</a></h4>
<p>The library wraps the Move parser, walks the AST, collecting target categories (expressions, if/match/loop, let bindings, function calls, etc.), and exposes a single <code>MuAstContext</code> that each mutator operates on. Each generated mutator implements a <code>MoveMutator</code> trait with four methods: <code>name()</code>, <code>description()</code>, <code>needs()</code> returning a bitmask of required AST targets, and <code>mutate()</code> that edits the source via byte-offset rewriting — no AST-to-source serializer needed.</p>
<p>Pre-filtering by <code>needs()</code> is the key efficiency trick: the fuzz loop computes the available target kinds once per input, and only mutators whose <code>needs()</code> overlap with those kinds get invoked. A minimal example:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">impl</span><span class="token plain"> </span><span class="token class-name">MoveMutator</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token class-name">SwapBinOp</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">name</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token lifetime-annotation symbol" style="color:#36acaa">'static</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">str</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"SwapBinOp"</span><span class="token plain"> </span><span class="token punctuation" 
style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">description</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token lifetime-annotation symbol" style="color:#36acaa">'static</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">str</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"Swap a binary operator with a compatible one"</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">needs</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token keyword" 
style="color:#00009f">u32</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token constant" style="color:#36acaa">TK_BINOP</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">mutate</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> ctx</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">mut</span><span class="token plain"> </span><span class="token class-name">MuAstContext</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">bool</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">// No binary operator in this input: report a no-op (assumes an Option return)</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token class-name">Some</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">binop</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> ctx</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">pick_random_binop</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">return</span><span class="token plain"> </span><span class="token boolean" style="color:#36acaa">false</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> replacement </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> ctx</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">compatible_op</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">binop</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">kind</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ctx</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">replace_text</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">binop</span><span class="token punctuation" style="color:#393A34">.</span><span class="token plain">loc</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> replacement</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token boolean" style="color:#36acaa">true</span><span class="token 
plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
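<p>The <code>needs()</code> pre-filter described above can be sketched as a bitmask check. This is a reduced illustration, not the driver's actual code: the trimmed-down <code>Mutator</code> trait, the <code>TK_MATCH</code> and <code>TK_ABILITY</code> flags, the <code>ReorderMatchArms</code> mutator, and the <code>applicable</code> helper are all hypothetical names.</p>

```rust
// Hypothetical target-kind bit flags; only TK_BINOP appears in the post.
pub const TK_BINOP: u32 = 1 << 0;
pub const TK_MATCH: u32 = 1 << 1;
pub const TK_ABILITY: u32 = 1 << 2;

/// Reduced stand-in for the mutator trait: only what filtering needs.
pub trait Mutator {
    fn name(&self) -> &'static str;
    /// Bitmask of target kinds this mutator can act on.
    fn needs(&self) -> u32;
}

pub struct SwapBinOp;
pub struct ReorderMatchArms; // hypothetical second mutator

impl Mutator for SwapBinOp {
    fn name(&self) -> &'static str { "SwapBinOp" }
    fn needs(&self) -> u32 { TK_BINOP }
}

impl Mutator for ReorderMatchArms {
    fn name(&self) -> &'static str { "ReorderMatchArms" }
    fn needs(&self) -> u32 { TK_MATCH }
}

/// Names of the mutators worth invoking for an input whose target kinds
/// are summarized in `available`. The mask is computed once per input,
/// so each mutator costs a single bitwise AND to rule in or out.
pub fn applicable(mutators: &[Box<dyn Mutator>], available: u32) -> Vec<&'static str> {
    mutators
        .iter()
        .filter(|m| m.needs() & available != 0)
        .map(|m| m.name())
        .collect()
}
```

<p>With an input that contains binary operators and abilities but no <code>match</code>, only <code>SwapBinOp</code> survives the filter, so the loop never pays for a mutator that is bound to return <code>false</code>.</p>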
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="inventing-mutators-with-llm">Inventing mutators with LLM<a href="https://nowarp.io/blog/compiler-testing-part-1#inventing-mutators-with-llm" class="hash-link" aria-label="Direct link to Inventing mutators with LLM" title="Direct link to Inventing mutators with LLM" translate="no">​</a></h4>
<p>The invent phase produces <code>(Name, Description)</code> pairs — each named <code>{Action}{Structure}</code> (e.g. <code>SwapBinOp</code>, <code>ToggleMutability</code>) — that feed the implementation phase. The <a href="https://github.com/nowarp/move-fuzz/blob/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove/prompts/invent_mutator.txt" target="_blank" rel="noopener noreferrer" class="">prompt</a> combines two catalogs: 15 generic actions from the paper plus Move-specific AST structures (BinOp, Match, Ability, Visibility, ModuleDef, etc.).</p>
<p>Mutation actions:</p>
<div class="language-text codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">swap      — Replace one element with a compatible alternative</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">remove    — Delete an element from the program</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">add       — Insert a new element into the program</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">duplicate — Copy an element and insert the copy nearby</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">negate    — Invert or negate an element's meaning</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">modify    — Change an element's value or property</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">inline    — Replace a reference with the thing it refers to</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">wrap      — Surround an element with a new construct</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">unwrap    — Remove a surrounding construct, keeping inner content</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">reorder   — Change the order of sibling elements</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">lift      — Move an element to an outer/higher scope</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">sink      — Move an element to an 
inner/lower scope</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">split     — Break one element into two separate ones</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">merge     — Combine two elements into one</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">toggle    — Flip a boolean-like property on/off</span><br></div></code></pre></div></div>
<p>These are generic enough to apply to smart-contract languages as-is. The prompt explicitly asks for <strong>syntactically valid mutations only</strong> — anything that fails to parse is rejected later in the <a href="https://nowarp.io/blog/compiler-testing-part-1#validating-mutators" class="">Validating phase</a>.</p>
<p>The LLM occasionally hallucinates impossible combinations (e.g. <code>negate ModuleDef</code>) and invents descriptions to fit. That's fine for fuzzing: the mutator still changes program structure and opens new paths, and the <a href="https://nowarp.io/blog/compiler-testing-part-1#validating-mutators" class="">Validating phase</a> catches anything that doesn't actually modify code or produces invalid syntax. Here's how <code>negate ModuleDef</code> got interpreted:</p>
<blockquote>
<p>"Find two <code>module NAME { ... }</code> declarations in the same file and swap their identifiers, breaking fully-qualified callers and exercising the resolver's duplicate-symbol / shadowing paths."</p>
</blockquote>
<p>Differences from the paper:</p>
<ul>
<li class="">Batched generation — 8 operations per target per prompt, saves tokens</li>
<li class="">Caching and skip logic for batches that failed in the first iteration</li>
<li class="">A configuration option to prioritize specific target structures (e.g. recently-added <code>enum</code>/<code>match</code> for Sui Move)</li>
</ul>
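<p>To make the naming scheme and the batching concrete, here is a rough sketch. It assumes the candidates are a plain cross product of the 15 actions with one target structure, which the real prompt may not do exactly; <code>candidates</code> and <code>batches</code> are illustrative helpers, not part of the generation script.</p>

```rust
/// The 15 generic actions listed above.
pub const ACTIONS: [&str; 15] = [
    "swap", "remove", "add", "duplicate", "negate", "modify", "inline", "wrap",
    "unwrap", "reorder", "lift", "sink", "split", "merge", "toggle",
];

/// Capitalize an action for the `{Action}{Structure}` scheme: "swap" -> "Swap".
fn capitalize(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => first.to_uppercase().collect::<String>() + chars.as_str(),
        None => String::new(),
    }
}

/// All candidate `{Action}{Structure}` names for one target structure.
pub fn candidates(structure: &str) -> Vec<String> {
    ACTIONS
        .iter()
        .map(|action| format!("{}{}", capitalize(action), structure))
        .collect()
}

/// Split one structure's candidates into prompts of at most `batch`
/// operations each. `batch` must be non-zero (`chunks` panics on 0).
pub fn batches(structure: &str, batch: usize) -> Vec<Vec<String>> {
    candidates(structure)
        .chunks(batch)
        .map(|chunk| chunk.to_vec())
        .collect()
}
```

<p>Under these assumptions, a batch size of 8 means each structure costs two prompts (8 + 7 operations) instead of 15.</p>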
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="implementing-mutators">Implementing mutators<a href="https://nowarp.io/blog/compiler-testing-part-1#implementing-mutators" class="hash-link" aria-label="Direct link to Implementing mutators" title="Direct link to Implementing mutators" translate="no">​</a></h4>
<p>The implement phase turns each <code>(Name, Description)</code> pair into a compiled Rust mutator registered in the driver. The <a href="https://github.com/nowarp/move-fuzz/blob/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove/prompts/implement_mutator.txt" target="_blank" rel="noopener noreferrer" class="">prompt</a> contains the μAST API reference and a reference implementation (<code>SwapBinOp</code>). The LLM returns Rust code as text — it has no filesystem access. The <a href="https://github.com/nowarp/move-fuzz/blob/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove/scripts/generate.py" target="_blank" rel="noopener noreferrer" class="">generation script</a> writes each response to <code>src/mutators/{name}.rs</code>, runs <code>cargo check</code>, and on failure sends the code plus compiler error back for a refinement pass (up to 10 rounds). Mutators that still don't compile are dropped.</p>
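<p>The check-and-refine loop can be sketched as follows. The actual generation script is written in Python; this Rust version only mirrors the control flow described above. The <code>check</code> and <code>refine</code> closures are hypothetical hooks standing in for "write the file and run <code>cargo check</code>" and "send the code plus compiler error back to the LLM".</p>

```rust
/// Outcome of the compile-and-refine loop for one generated mutator.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    /// Code that passed the check, plus the number of refinement rounds used.
    Compiled { code: String, rounds: u32 },
    /// Still failing after the round budget: the mutator is dropped.
    Dropped,
}

/// Check the initial code, then allow up to `max_rounds` refinements.
/// `check` returns the compiler error text on failure (hypothetical hook).
/// `refine` receives the code and the error and returns new code (hypothetical hook).
pub fn compile_with_refinement<C, R>(
    initial: String,
    max_rounds: u32,
    mut check: C,
    mut refine: R,
) -> Outcome
where
    C: FnMut(&str) -> Result<(), String>,
    R: FnMut(&str, &str) -> String,
{
    let mut code = initial;
    for round in 0..=max_rounds {
        match check(&code) {
            Ok(()) => return Outcome::Compiled { code, rounds: round },
            // Budget exhausted: do not ask for another fix.
            Err(_) if round == max_rounds => break,
            Err(error) => code = refine(&code, &error),
        }
    }
    Outcome::Dropped
}
```

<p>Injecting the two hooks keeps the loop testable without a model or a compiler; in a real pipeline <code>max_rounds</code> would be 10, matching the limit above.</p>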
<p>Mutator quality varies: many are simple, some are hallucinated. That's fine at scale. Each target project yields 700–1000 action/structure combinations to invent mutators from, and the <a href="https://nowarp.io/blog/compiler-testing-part-1#validating-mutators" class="">Validating phase</a> filters out the ones that don't actually modify code or that generate garbage. About 7% of mutators needed manual fixes after validation to become useful.</p>
<p>Some mutators are primitive but effective. <code>WrapExpressionStmt</code>, generated for Leo, found 4 ICEs despite its simplicity:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">//! WrapExpressionStmt: Wrap an expression statement in an assert or call.</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">impl</span><span class="token plain"> </span><span class="token class-name">LeoMutator</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">for</span><span class="token plain"> </span><span class="token class-name">WrapExpressionStmt</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">name</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token lifetime-annotation symbol" style="color:#36acaa">'static</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">str</span><span 
class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"WrapExpressionStmt"</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// Generated by the model in the Invent phase</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">description</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token lifetime-annotation symbol" style="color:#36acaa">'static</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">str</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token string" style="color:#e3116c">"Wraps an expression statement in an assert or redundant call to \</span><br></div><div class="token-line" style="color:#393A34"><span 
class="token string" style="color:#e3116c">         test type-checking and circuit generation on nested expressions"</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">fn</span><span class="token plain"> </span><span class="token function-definition function" style="color:#d73a49">mutate</span><span class="token punctuation" style="color:#393A34">(</span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> ctx</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token keyword" style="color:#00009f">mut</span><span class="token plain"> </span><span class="token class-name">MuAstContext</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">-&gt;</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">bool</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">// ... 
pick an expression from one of the statements if available</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">// Wrapping the expression found</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> wrapped </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">match</span><span class="token plain"> ctx</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">rand_index</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token number" style="color:#36acaa">0</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=&gt;</span><span class="token plain"> </span><span class="token macro property" style="color:#36acaa">format!</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"assert_eq({}, {});"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> expr</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> expr</span><span 
class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token number" style="color:#36acaa">1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=&gt;</span><span class="token plain"> </span><span class="token macro property" style="color:#36acaa">format!</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"assert_neq({}, 0u32);"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> expr</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token number" style="color:#36acaa">2</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=&gt;</span><span class="token plain"> </span><span class="token macro property" style="color:#36acaa">format!</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"assert({} == {});"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> expr</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> expr</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token number" style="color:#36acaa">3</span><span class="token plain"> </span><span class="token operator" 
style="color:#393A34">=&gt;</span><span class="token plain"> </span><span class="token macro property" style="color:#36acaa">format!</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"let {} = {};"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> ctx</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">generate_unique_name</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string" style="color:#e3116c">"_w"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> expr</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        ctx</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">replace_text</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">target</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&amp;</span><span class="token plain">wrapped</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">        </span><span 
class="token boolean" style="color:#36acaa">true</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="validating-mutators">Validating mutators<a href="https://nowarp.io/blog/compiler-testing-part-1#validating-mutators" class="hash-link" aria-label="Direct link to Validating mutators" title="Direct link to Validating mutators" translate="no">​</a></h4>
<p>The implement phase only guarantees compilation — not useful behavior. The <a href="https://github.com/nowarp/move-fuzz/blob/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove/scripts/validate_syntax.py" target="_blank" rel="noopener noreferrer" class="">validation script</a> runs each registered mutator against clean source files from the corpus and classifies the output via <code>move-check</code>:</p>
<ol>
<li class="">Sample N compilable files from the corpus (no parser/lexer errors in baseline)</li>
<li class="">Apply each mutator to K files with different seeds (parallel workers)</li>
<li class="">Classify resulting errors by <code>move-check</code> category:<!-- -->
<ul>
<li class=""><strong>category 1</strong> (parser/lexer) → invalid syntax, mutator <strong>rejected</strong></li>
<li class=""><strong>categories 2-4</strong> (name resolution, unbound variables, type errors) → acceptable, these are exactly the passes we want to test</li>
</ul>
</li>
<li class="">Flag mutators that never apply (always a no-op, wasting CPU). For example, a generated top-level-declaration mutator looked for <code>use</code> among function statements</li>
</ol>
<p>The script also highlights gaps in the corpus – if a mutator never applies, the construct it needs is likely missing from the corpus.</p>
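<p>The accept/reject rule from the steps above fits in a few lines. This sketch makes two assumptions the post does not state: a single category-1 error is enough to reject a mutator, and any other error category counts as acceptable. <code>Run</code>, <code>Verdict</code>, and <code>judge</code> are illustrative names, not the validation script's API.</p>

```rust
/// Result of applying a mutator to one corpus file.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Run {
    /// The mutator left the file unchanged.
    NoOp,
    /// The mutated file passed the checker without errors.
    Clean,
    /// The mutated file failed with this error category (1 = parser/lexer).
    Error(u8),
}

#[derive(Debug, PartialEq)]
pub enum Verdict {
    Accepted,
    /// Produced invalid syntax on at least one file.
    Rejected,
    /// Never changed any file: wasted CPU, or a gap in the corpus.
    NeverApplies,
}

/// Classify one mutator from its runs over the sampled files.
pub fn judge(runs: &[Run]) -> Verdict {
    if runs.iter().any(|r| matches!(r, Run::Error(1))) {
        Verdict::Rejected
    } else if runs.iter().all(|r| matches!(r, Run::NoOp)) {
        Verdict::NeverApplies
    } else {
        Verdict::Accepted
    }
}
```

<p>A stricter or looser threshold (for example, tolerating a small share of syntax errors) would only change the first branch.</p>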
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://nowarp.io/blog/compiler-testing-part-1#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h4>
<p>Having hundreds of LLM-generated mutators challenging semantic and codegen passes <em>almost for free</em> is a big win for compiler fuzzing — it opens new paths and increases coverage without hand-writing a program generator or spending time on custom coverage-guided tooling.</p>
<p>It works best when combined with other grammar-aware mutators like the <a href="https://nowarp.io/blog/compiler-testing-part-1#afl-ts-tree-sitter-based-afl-mutator" class="">tree-sitter splice mutator</a>, which adds randomness and uncovers more subtle cases.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="built-in-afl-mutators">Built-in AFL++ mutators<a href="https://nowarp.io/blog/compiler-testing-part-1#built-in-afl-mutators" class="hash-link" aria-label="Direct link to Built-in AFL++ mutators" title="Direct link to Built-in AFL++ mutators" translate="no">​</a></h3>
<p>AFL++ ships with several <a href="https://github.com/AFLplusplus/AFLplusplus/tree/stable/custom_mutators" target="_blank" rel="noopener noreferrer" class="">custom mutators</a> in its distribution. Multiple mutators can be stacked via <code>AFL_CUSTOM_MUTATOR_LIBRARY</code>. Different mutation algorithms hit different paths, so combining a grammar-aware mutator with byte-level alternatives increases overall coverage.</p>
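<p>As a small illustration of stacking: AFL++ accepts several libraries in <code>AFL_CUSTOM_MUTATOR_LIBRARY</code>, separated by semicolons. The sketch below builds such an invocation without running it. The library paths, the <code>corpus</code> and <code>out</code> directory names, and the harness path are placeholders.</p>

```rust
use std::process::Command;

/// Join library paths the way AFL++ expects them: separated by `;`.
pub fn mutator_library_value(libs: &[&str]) -> String {
    libs.join(";")
}

/// Build (but do not run) an `afl-fuzz` invocation with the mutators stacked.
/// "corpus" and "out" are placeholder directory names.
pub fn afl_command(libs: &[&str], harness: &str) -> Command {
    let mut cmd = Command::new("afl-fuzz");
    cmd.env("AFL_CUSTOM_MUTATOR_LIBRARY", mutator_library_value(libs))
        .args(["-i", "corpus", "-o", "out", "--", harness, "@@"]);
    cmd
}
```

<p>Calling <code>afl_command(&amp;["/path/to/first.so", "/path/to/second.so"], "./harness")</code> yields a command whose environment carries both libraries; spawning it is left to the caller.</p>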
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="autotokens">autotokens<a href="https://nowarp.io/blog/compiler-testing-part-1#autotokens" class="hash-link" aria-label="Direct link to autotokens" title="Direct link to autotokens" translate="no">​</a></h4>
<p>A grammar-free token fuzzer that splits input into tokens and shuffles them with different strategies. It learns its token pool from the <code>-x</code> dictionary and <code>CMPLOG</code>, and mutates below the grammar level. Useful as a lightweight complement to <code>afl-ts</code> — it picks up on tokens the grammar-aware mutator does not know (e.g. identifiers present in the corpus but not captured in the AST).</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="radamsa">radamsa<a href="https://nowarp.io/blog/compiler-testing-part-1#radamsa" class="hash-link" aria-label="Direct link to radamsa" title="Direct link to radamsa" translate="no">​</a></h4>
<p><a href="https://gitlab.com/akihe/radamsa" target="_blank" rel="noopener noreferrer" class="">radamsa</a> is a general-purpose byte-level fuzzer with several strategies that transfer to compiler fuzzing:</p>
<ul>
<li class=""><code>sed-tree-stutter</code> — generates deeply nested expressions (e.g. <code>f(g(h(f(g(h(f(g(h(x)))))))))</code>), often exhausting the parser's stack and occasionally triggering typechecker errors. <a href="https://nowarp.io/blog/compiler-testing-part-1#afl-ts-tree-sitter-based-afl-mutator" class=""><code>afl-ts</code></a> implements the same strategy at the grammar level; radamsa operates at byte level and is more aggressive.</li>
<li class=""><code>rand-as-count</code> — appends large <code>A</code>-strings, useful for hitting integral-type boundaries and array length checks</li>
<li class="">Byte-level glyph injection — adds "interesting" symbols (unicode glyphs, control bytes) that crash the lexer/parser. Lexer/parser bugs are <a href="https://nowarp.io/blog/compiler-testing-part-1#background" class="">out of scope</a> here, but the strategy may be useful if you target them.</li>
<li class="">Boundary literal injection — swaps numeric values with edge cases (0, MAX, negative, large numbers) to stress integer overflow paths. <a href="https://nowarp.io/blog/compiler-testing-part-1#afl-ts-tree-sitter-based-afl-mutator" class=""><code>afl-ts</code></a>'s <code>ts-lit</code> already does this at the grammar level.</li>
</ul>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-radamsa-output-61ea764518a77912030ba094802ae907.png"></div>
<div align="center"><em>radamsa introduced a large literal reproducing <a href="https://github.com/argotorg/solidity/issues/16619" target="_blank" rel="noopener noreferrer" class="">solidity#16619</a></em></div>
<br>
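<p>To show the shape of input that tree stuttering produces, here is a toy generator. It is not radamsa's algorithm: radamsa works on raw bytes and does not guarantee balanced output, while this hypothetical <code>stutter</code> helper simply repeats a chain of calls around a leaf.</p>

```rust
/// Repeat a chain of unary calls `repeats` times around `leaf`,
/// then close every parenthesis that was opened.
pub fn stutter(chain: &[&str], repeats: usize, leaf: &str) -> String {
    let depth = chain.len() * repeats;
    let mut out = String::new();
    for _ in 0..repeats {
        for name in chain {
            out.push_str(name);
            out.push('(');
        }
    }
    out.push_str(leaf);
    out.push_str(&")".repeat(depth));
    out
}
```

<p>Raising <code>repeats</code> into the thousands gives a quick way to probe recursion limits in a parser or typechecker with an input that is still syntactically valid.</p>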
<p>Caveats:</p>
<ul>
<li class="">Most radamsa output is parser/lexer noise that should be filtered during triage</li>
<li class="">Very large inputs slow the harness (e.g. constant evaluation on huge numbers)</li>
</ul>
<p>Not the primary mutator, but running it on one worker adds corpus diversity.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="other-grammar-aware-fuzzers">Other grammar-aware fuzzers<a href="https://nowarp.io/blog/compiler-testing-part-1#other-grammar-aware-fuzzers" class="hash-link" aria-label="Direct link to Other grammar-aware fuzzers" title="Direct link to Other grammar-aware fuzzers" translate="no">​</a></h3>
<p>There are other open-source fuzzers and custom mutators that may improve a fuzzing campaign. Some are integrated into AFL++, while others can be used as external fuzzers (AFL++ <code>-F</code> flag).</p>
<p>Some fuzzers from papers and open-source projects overlap in ideas with the approaches used here. They are listed because they may help if you are writing your own tooling, or may suit your target language better:</p>
<ul>
<li class=""><a href="https://github.com/atnwalk/atnwalk" target="_blank" rel="noopener noreferrer" class="">ATNwalk</a> – provides grammar-aware mutations and has <a href="https://github.com/AFLplusplus/AFLplusplus//blob/4e5c0469ad9d56060317ebdc88027e2143f7b979/custom_mutators/atnwalk/README.md" target="_blank" rel="noopener noreferrer" class="">built-in AFL++ integration</a>, but is inconvenient to use because it requires a high-quality ANTLR4 grammar.</li>
<li class=""><a href="https://github.com/HexHive/Gramatron" target="_blank" rel="noopener noreferrer" class="">Gramatron</a> – grammar-aware fuzzer that operates on grammar automata, which was used for fuzzing an experimental language for the <a href="https://ton.org/en" target="_blank" rel="noopener noreferrer" class="">TON</a> blockchain; the main issue was the automaton generation algorithm. Grammars can be generated using <a href="https://github.com/jubnzv/treesitter-to-gramatron" target="_blank" rel="noopener noreferrer" class="">this script</a> or manually, but they must be very minimal. This is acceptable for dynamically typed languages like JavaScript as described in the paper <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[9]</a>, but for statically typed blockchain languages the grammars blow up the fixpoint algorithm that generates the automaton.</li>
<li class=""><a href="https://github.com/fuzz4all/fuzz4all" target="_blank" rel="noopener noreferrer" class="">Fuzz4All</a> – uses LLM-based generation to fuzz compilers <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[12]</a>. A good option to <a href="https://nowarp.io/blog/compiler-testing-part-1#corpus-and-dictionaries" class="">extend the corpus</a> or run separately alongside the coverage-guided fuzzer.</li>
<li class=""><a href="https://github.com/uw-pluverse/perses/tree/master/kitten" target="_blank" rel="noopener noreferrer" class="">Kitten</a> – an ANTLR4-based program generator that recently found 328 bugs in common compilers. It uses grammar-aware mutations similar to tree-splicer or <code>afl-ts</code>, with additional strategies like rarity-weighted target selection, kleene-targeted mutations, and top-down grammar generation powered by ANTLR4 <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[10]</a>.</li>
<li class=""><a href="https://github.com/ncsu-swat/IssueMut" target="_blank" rel="noopener noreferrer" class="">IssueMut</a> – the same idea as MetaMut, but previous findings are used as a source of mutations <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[11]</a>.</li>
</ul>
<p>These are interesting sources of related mutation strategies that may be used to improve the tooling.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="corpus-and-dictionaries">Corpus and dictionaries<a href="https://nowarp.io/blog/compiler-testing-part-1#corpus-and-dictionaries" class="hash-link" aria-label="Direct link to Corpus and dictionaries" title="Direct link to Corpus and dictionaries" translate="no">​</a></h2>
<p>The corpus feeds the mutators. Grammar-aware mutators like <a href="https://nowarp.io/blog/compiler-testing-part-1#afl-ts-tree-sitter-based-afl-mutator" class=""><code>afl-ts</code></a> and <a href="https://nowarp.io/blog/compiler-testing-part-1#metamut-style-mutators" class="">MetaMut-style</a> splice, swap, and delete subtrees from corpus entries. Byte-level mutators like radamsa and autotokens extract tokens from the same files. Corpus quality directly determines mutation quality — mutators can only produce what they can see.</p>
<p>A good corpus is small, diverse, and covers a broad surface of the language. Small because havoc and custom splice-style mutators run faster on small files, and oversized entries slow the whole campaign. Diverse because grammar-aware mutators only splice what's present in the corpus — missing language constructs stay unreachable. The tension between "small" and "diverse" is resolved by minimization: collect broadly, then trim to the smallest set that still covers the same paths (discussed in the next subsection).</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="collecting-corpus-files">Collecting corpus files<a href="https://nowarp.io/blog/compiler-testing-part-1#collecting-corpus-files" class="hash-link" aria-label="Direct link to Collecting corpus files" title="Direct link to Collecting corpus files" translate="no">​</a></h3>
<p>The most straightforward way to seed the corpus is to collect existing source files, remove large and slow inputs, and <a href="https://aflplus.plus/docs/fuzzing_in_depth/#b-making-the-input-corpus-unique" target="_blank" rel="noopener noreferrer" class="">minimize</a> the corpus.</p>
<p>To collect the initial corpus you could start with:</p>
<ul>
<li class="">compiler's test suite and examples</li>
<li class="">projects on GitHub</li>
<li class="">datasets: verifier/scan projects often provide useful data, e.g. the decompiled <a href="https://github.com/MystenLabs/sui-packages/" target="_blank" rel="noopener noreferrer" class=""><code>sui-packages</code></a> for Sui Move or the <a href="https://huggingface.co/datasets/Zellic/all-ethereum-contracts" target="_blank" rel="noopener noreferrer" class="">Zellic dataset of Ethereum contracts</a></li>
</ul>
<p>The initial metrics for evaluating corpus coverage are the <a href="https://afl-1.readthedocs.io/en/latest/user_guide.html#map-coverage" target="_blank" rel="noopener noreferrer" class="">AFL++ stats</a> and the code coverage you can get with <code>llvm-cov</code>/<code>gcov</code>.</p>
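<p>The collection step can be scripted before handing the result to <code>afl-cmin</code>. A minimal sketch, assuming a 4 KiB size cutoff and content-hash deduplication; the function name <code>collect_seeds</code> and the cutoff are illustrative choices, not values used in the campaign. It does not detect slow inputs, which still need a timed run against the harness.</p>

```python
import hashlib
from pathlib import Path

MAX_SEED_BYTES = 4096  # assumed cutoff; tune per target


def collect_seeds(src_root: Path, out_dir: Path, suffix: str,
                  max_bytes: int = MAX_SEED_BYTES) -> int:
    """Copy unique, small source files from src_root into out_dir; return the count."""
    out_dir.mkdir(parents=True, exist_ok=True)
    seen = set()
    # sorted() materializes the file list before anything is written to out_dir
    for path in sorted(src_root.rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        data = path.read_bytes()
        if not data or len(data) > max_bytes:
            continue  # skip empty and oversized inputs
        digest = hashlib.sha256(data).hexdigest()
        if digest in seen:
            continue  # byte-identical duplicate
        seen.add(digest)
        (out_dir / f"{digest[:16]}{suffix}").write_bytes(data)
    return len(seen)
```

<p>Hash-based file names make the step idempotent: re-running it over a larger checkout only adds files that were not already present.</p>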
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="tsgen-tree-sitter-based-generation"><code>tsgen</code>: tree-sitter-based generation<a href="https://nowarp.io/blog/compiler-testing-part-1#tsgen-tree-sitter-based-generation" class="hash-link" aria-label="Direct link to tsgen-tree-sitter-based-generation" title="Direct link to tsgen-tree-sitter-based-generation" translate="no">​</a></h3>
<p>Sometimes you get low coverage even after seeding with existing open-source code for the compiler. This happens when the language is powerful enough that not all of its features are actively used, or when new features have just been introduced.</p>
<p>This was the case for Cairo fuzzing. To cover the gaps, a small utility was created: <a href="https://github.com/jubnzv/tsgen" target="_blank" rel="noopener noreferrer" class=""><code>tsgen</code></a>.</p>
<p>It generates a seed corpus directly from a tree-sitter <code>grammar.json</code>. The generator walks the grammar recursively — at each <code>CHOICE</code> node it picks an alternative, at each <code>REPEAT</code> it picks a count, and at each terminal it samples from a dictionary (optionally augmented by identifiers and literals harvested from real source files). A min-depth pre-pass prevents infinite recursion through self-referential rules (<code>expression → binary_op → expression → ...</code>), and generated programs are validated with the compiled parser to drop anything that doesn't parse.</p>
<p>After generating the corpus, run <code>afl-cmin</code> on it. Practical results: 150k generated Solidity files reduced to ~1300 unique seeds under 1024 bytes each; for the Cairo grammar it was ~700 seeds. That's a lot of seeds for free — the process takes less time than exploring the corpus from the ground up with grammar-aware mutators, and harvesting identifiers from real source files gives better diversity.</p>
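<p>The recursive walk with a min-depth pre-pass can be illustrated on a toy grammar. This is a simplified sketch and not the actual <code>tsgen</code> code: <code>TOY_RULES</code> imitates the shape of the <code>rules</code> object in a tree-sitter <code>grammar.json</code>, <code>TERMINALS</code> is a hypothetical dictionary for terminal rules, and validation with the compiled parser is omitted.</p>

```python
import random

INF = float("inf")

# Toy grammar in the shape of the "rules" object of a tree-sitter grammar.json.
TOY_RULES = {
    "source": {"type": "REPEAT1", "content": {"type": "SYMBOL", "name": "statement"}},
    "statement": {"type": "SEQ", "members": [
        {"type": "STRING", "value": "let"},
        {"type": "SYMBOL", "name": "identifier"},
        {"type": "STRING", "value": "="},
        {"type": "SYMBOL", "name": "expression"},
        {"type": "STRING", "value": ";"},
    ]},
    "expression": {"type": "CHOICE", "members": [
        {"type": "SYMBOL", "name": "number"},
        {"type": "SYMBOL", "name": "identifier"},
        {"type": "SYMBOL", "name": "binary_op"},
    ]},
    "binary_op": {"type": "SEQ", "members": [
        {"type": "SYMBOL", "name": "expression"},
        {"type": "STRING", "value": "+"},
        {"type": "SYMBOL", "name": "expression"},
    ]},
    "identifier": {"type": "PATTERN", "value": "[a-z]+"},
    "number": {"type": "PATTERN", "value": "\\d+"},
}

# Hypothetical dictionary: terminal rules are sampled instead of expanding regexes.
TERMINALS = {"identifier": ["v0", "v1"], "number": ["0", "1", "255"]}


def node_depth(node, depth):
    """Minimal number of nested rule expansions needed to finish `node`."""
    kind = node["type"]
    if kind in ("STRING", "PATTERN", "BLANK", "REPEAT"):
        return 0
    if kind == "SYMBOL":
        return depth[node["name"]]
    if kind == "CHOICE":
        return min(node_depth(m, depth) for m in node["members"])
    if kind == "SEQ":
        return max(node_depth(m, depth) for m in node["members"])
    return node_depth(node["content"], depth)  # REPEAT1, FIELD, PREC_*, ALIAS, TOKEN


def min_depths(rules):
    """Fixpoint pre-pass: cheapest terminating expansion depth for every rule."""
    depth = {name: INF for name in rules}
    changed = True
    while changed:
        changed = False
        for name, rule in rules.items():
            new = 0 if name in TERMINALS else 1 + node_depth(rule, depth)
            if new < depth[name]:
                depth[name], changed = new, True
    return depth


def generate(rules, start, rng, budget=6):
    """Expand `start` randomly, never picking a branch that cannot finish in budget."""
    depth = min_depths(rules)
    if depth[start] == INF:
        raise ValueError(f"rule {start!r} has no finite expansion")

    def expand(node, left):
        kind = node["type"]
        if kind == "STRING":
            return [node["value"]]
        if kind == "BLANK":
            return []
        if kind == "PATTERN":
            raise NotImplementedError("inline patterns need a dictionary entry")
        if kind == "SYMBOL":
            name = node["name"]
            if name in TERMINALS:
                return [rng.choice(TERMINALS[name])]
            return expand(rules[name], left - 1)
        if kind == "CHOICE":
            fits = [m for m in node["members"] if node_depth(m, depth) <= left]
            return expand(rng.choice(fits), left)
        if kind == "SEQ":
            return [tok for m in node["members"] for tok in expand(m, left)]
        if kind in ("REPEAT", "REPEAT1"):
            low = 1 if kind == "REPEAT1" else 0
            high = 2 if node_depth(node["content"], depth) <= left else low
            count = rng.randint(low, high)
            return [tok for _ in range(count) for tok in expand(node["content"], left)]
        return expand(node["content"], left)

    root = {"type": "SYMBOL", "name": start}
    return " ".join(expand(root, max(budget, depth[start])))


if __name__ == "__main__":
    print(generate(TOY_RULES, "source", random.Random(7)))
```

<p>The pre-pass is what stops <code>expression → binary_op → expression</code> from recursing forever: once the remaining budget is smaller than the cheapest expansion of <code>binary_op</code>, that alternative is no longer offered at the <code>CHOICE</code> node.</p>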
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="llm-based-seed-generation-with-coverage-feedback">LLM-based seed generation with coverage feedback<a href="https://nowarp.io/blog/compiler-testing-part-1#llm-based-seed-generation-with-coverage-feedback" class="hash-link" aria-label="Direct link to LLM-based seed generation with coverage feedback" title="Direct link to LLM-based seed generation with coverage feedback" translate="no">​</a></h3>
<p>Another option: generate seeds with an LLM.</p>
<p>Obvious starting points:</p>
<ul>
<li class="">Explore previous findings for the repo and ask the LLM to generate seeds based on them.</li>
<li class="">Explore typically bug-prone code based on experience — optimizations, code generation, IR transformations — and ask the LLM to generate code that triggers specific places: constant folding, pattern matching compilation, etc.</li>
<li class="">Explore documentation and specification; generate seeds targeting rarely-used or tricky constructions.</li>
<li class="">Use <code>git blame</code> on reachable code paths to generate seeds targeting recently introduced changes.</li>
</ul>
<p>This works well when you are just starting the campaign and seeding the corpus. After that, its usefulness is limited, because grammar-aware fuzzers will hit most of the paths anyway.</p>
<p>To avoid wasting time and tokens on already-covered paths, involve code coverage: look at which paths are not yet triggered and write a simple script that asks the LLM to generate seeds for specific gaps, then check coverage again in a feedback loop.</p>
<p>While code coverage <a href="https://danielhall.io/code-coverage-is-a-terrible-metric" target="_blank" rel="noopener noreferrer" class="">is not a good metric</a>, it at least lets you make sure your corpus doesn't have complete gaps.</p>
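<p>The gap-finding half of that loop can be sketched as follows, assuming the coverage report was produced by <code>llvm-cov export</code> in its JSON format (a <code>data</code> list whose entries carry a <code>functions</code> array with <code>name</code>, <code>count</code>, and <code>filenames</code>). The prompt wording and the <code>ask_llm</code> callback are hypothetical placeholders for whichever model client you use; re-measuring coverage after adding the seeds is left out.</p>

```python
def uncovered_functions(export: dict, path_filter: str = "") -> list:
    """List functions with zero executions in an `llvm-cov export` JSON document."""
    names = set()
    for unit in export.get("data", []):
        for fn in unit.get("functions", []):
            if fn.get("count", 0) != 0:
                continue  # already reached by the corpus
            files = fn.get("filenames", [])
            if path_filter and not any(path_filter in f for f in files):
                continue  # outside the compiler passes we care about
            names.add(fn["name"])
    return sorted(names)


def build_prompt(language: str, function_name: str) -> str:
    """Hypothetical prompt template; adapt the wording to the model in use."""
    return (
        f"Write a small, valid {language} program that is likely to make the "
        f"compiler execute the function `{function_name}`. "
        "Output only the source code."
    )


def seeds_for_gaps(export, language, ask_llm, path_filter="", limit=10):
    """Ask the model for one candidate seed per uncovered function."""
    gaps = uncovered_functions(export, path_filter)[:limit]
    return {name: ask_llm(build_prompt(language, name)) for name in gaps}
```

<p>Filtering by path (for example a code generation directory) keeps the token budget focused on later passes rather than on the lexer and parser.</p>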
<p>Additionally, approaches like <a href="https://github.com/ise-uiuc/WhiteFox" target="_blank" rel="noopener noreferrer" class="">WhiteFox</a><a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[3]</a> that leverage language documentation for fuzzing were <a href="https://gusarich.com/blog/fuzzing-with-llms" target="_blank" rel="noopener noreferrer" class="">successfully applied for TON</a>. But this requires good documentation and does not scale well when testing multiple compilers.</p>
<p>Another idea that worked: compile your corpus and execute what gets compiled. This reveals bytecode opcodes not covered by the corpus. For example, for Sui Move there were a few extremely rare opcodes related to pattern matching that were absent from 400k decompiled contracts and from the initial corpus, but got covered later this way.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="identifier-renaming-with-tree-sitter">Identifier renaming with tree-sitter<a href="https://nowarp.io/blog/compiler-testing-part-1#identifier-renaming-with-tree-sitter" class="hash-link" aria-label="Direct link to Identifier renaming with tree-sitter" title="Direct link to Identifier renaming with tree-sitter" translate="no">​</a></h3>
<p>A problem the fuzzer encounters when working with a generated corpus and grammar-aware mutations is the rate of semantic errors. Mutations often shuffle identifiers and code structure, producing lots of "undeclared variable" errors that do not let the fuzzer open new paths.</p>
<p>The solution: write a script that renames all identifiers to deterministic names and saves the result to a separate renamed corpus. Here is <a href="https://gist.github.com/jubnzv/10649e33865430d88de8eaa91fa50e9e" target="_blank" rel="noopener noreferrer" class="">a 50-line script</a> doing this with <code>tree-sitter-solidity</code>.</p>
<p>The simplest approach that works: name all identifiers with a uniform pattern, e.g. <code>v0</code>, <code>v1</code>, ... Save this corpus and let <code>afl-ts</code> (in particular the <code>ts-bank</code> mutation) find the errors.</p>
<div align="center"><img style="width:70%" src="https://nowarp.io/assets/images/2026-04-17-named-corpus-finding-e7d1ad10ebeba599f78f0a93f79b842a.png"></div>
<div align="center"><em>Solidity: <code>ts-kdel</code> mutation on a renamed corpus → ICE (<a href="https://github.com/argotorg/solidity/issues/16636" target="_blank" rel="noopener noreferrer" class="">solidity#16636</a>). Without renaming, this would only trigger an "undeclared variable" error.</em></div>
<br>
<p>The approach is straightforward and a similar one is used in the generation routine of <a href="https://github.com/softsec-kaist/codealchemist" target="_blank" rel="noopener noreferrer" class="">CodeAlchemist</a> — a program generator for fuzzing JavaScript engines <a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="">[4]</a> — for the same purpose.</p>
<p>Additionally, you may want to seed stdlib identifiers or language keywords to challenge the semantic passes.</p>
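<p>To show the renaming idea without a tree-sitter dependency, here is a regex-based stand-in. It is not the linked script: it swaps the syntax tree for a token regex, the <code>KEEP</code> set is a deliberately incomplete, assumed list of Solidity keywords and builtins, and identifiers inside strings and comments get renamed too.</p>

```python
import re

# Assumed, incomplete keep-list; a real run needs the full keyword/builtin set.
KEEP = {
    "contract", "function", "returns", "return", "public", "private", "pure",
    "view", "uint256", "uint8", "bool", "address", "if", "else", "for", "while",
    "memory", "storage", "pragma", "solidity", "true", "false",
}

# The lookbehind keeps the pattern from matching inside literals such as 0x1F.
IDENT = re.compile(r"(?<![A-Za-z0-9_$])[A-Za-z_$][A-Za-z0-9_$]*")


def rename_identifiers(source: str, keep=KEEP) -> str:
    """Rename every non-keyword identifier to v0, v1, ... in order of first use."""
    mapping = {}

    def repl(match: re.Match) -> str:
        name = match.group(0)
        if name in keep:
            return name
        if name not in mapping:
            mapping[name] = f"v{len(mapping)}"
        return mapping[name]

    return IDENT.sub(repl, source)


if __name__ == "__main__":
    print(rename_identifiers("contract Token { function total(uint256 supply) public {} }"))
```

<p>Because every file draws names from the same <code>v0</code>, <code>v1</code>, ... pool, a subtree spliced from one corpus entry into another is more likely to reference a name that is declared there.</p>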
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="dictionaries">Dictionaries<a href="https://nowarp.io/blog/compiler-testing-part-1#dictionaries" class="hash-link" aria-label="Direct link to Dictionaries" title="Direct link to Dictionaries" translate="no">​</a></h3>
<p>A fuzzing dictionary is a list of tokens that the fuzzer inserts into inputs during mutations. All three fuzzers support them: AFL++ via <a href="https://aflplus.plus/docs/fuzzing_in_depth/#c-using-multiple-cores" target="_blank" rel="noopener noreferrer" class=""><code>-x</code></a>, honggfuzz via <code>--dict</code>, and libFuzzer via <code>-dict</code>. Use a custom dictionary for the language whenever you run AFL++ without <code>AFL_CUSTOM_MUTATOR_ONLY=1</code>: it lets the havoc pass insert meaningful constructions into the code.</p>
<p>Ideas for initial dictionary setup:</p>
<ul>
<li class="">Language's grammar or parser implementation</li>
<li class="">Common patterns from documentation and examples</li>
<li class="">Names of standard functions and language elements (e.g. possible modifiers or values of <code>pragma</code>)</li>
<li class=""><code>{</code>, <code>[</code> and similar symbols – these often give interesting results</li>
<li class="">Constructions involved in previous crashes (find in regression unit tests and/or GitHub search for previous ICEs)</li>
<li class=""><a href="https://aflplus.plus/docs/env_variables/#5-settings-for-afl-clang-fast--afl-clang-fast-afl-clang-lto-afl-gcc-fast" target="_blank" rel="noopener noreferrer" class=""><code>AFL_LLVM_DICT2FILE</code></a> — auto-extracts string comparisons from the target at compile time. Useful as a supplement, but for compiler fuzzing a hand-crafted dictionary from the language grammar is more effective</li>
</ul>
<p>Focus on keeping entries atomic; avoid long constructions. For example, instead of <code>let a = &amp;mut x</code> add <code>&amp;mut</code> and <code>let</code> separately – havoc and grammar-aware mutators will figure it out by combining them with existing identifiers and operations.</p>
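<p>A small helper can enforce the "atomic entries" rule while emitting the quoted, one-token-per-line dictionary format that AFL++ and libFuzzer read. This is a sketch under assumptions: the 16-character cutoff and the helper names are illustrative, and tokens containing whitespace are simply dropped.</p>

```python
def is_atomic(token: str, max_len: int = 16) -> bool:
    """Keep short, whitespace-free tokens (the max_len cutoff is an assumption)."""
    return 0 < len(token) <= max_len and not any(c.isspace() for c in token)


def dict_entry(token: str) -> str:
    """Quote one token, escaping backslashes, quotes, and non-printable bytes."""
    out = []
    for byte in token.encode("utf-8"):
        ch = chr(byte)
        if ch in '\\"':
            out.append("\\" + ch)
        elif 0x20 <= byte < 0x7F:
            out.append(ch)
        else:
            out.append(f"\\x{byte:02x}")
    return '"' + "".join(out) + '"'


def render_dictionary(tokens) -> str:
    """Return dictionary text with one quoted entry per line, sorted and deduplicated."""
    unique = sorted({t for t in tokens if is_atomic(t)})
    return "".join(dict_entry(t) + "\n" for t in unique)


if __name__ == "__main__":
    print(render_dictionary(["let", "mut", "let a = mut x", "pragma", "{", "["]), end="")
```

<p>Write the returned text to a file and pass that file to the fuzzer's dictionary option.</p>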
<p>After running the corpus for a while, it is a good idea to:</p>
<ul>
<li class="">Check coverage: the operators and constructions that are hit most rarely are good candidates for the dictionary</li>
<li class="">Include constructions from findings made by the fuzzer</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="triage-workflow">Triage workflow<a href="https://nowarp.io/blog/compiler-testing-part-1#triage-workflow" class="hash-link" aria-label="Direct link to Triage workflow" title="Direct link to Triage workflow" translate="no">​</a></h2>
<p>The typical output of a fuzzer is a number of crash and hang (timeout) files — usually big files with lots of irrelevant garbage. Additionally, some crashes are duplicated despite the AFL++ deduplication mechanism, because the same bug may be caused by different syntactic constructions leading to different triggering paths. The goal of triaging is to remove duplicated crashes first, and then minimize the remaining files to a <a href="https://stackoverflow.com/help/minimal-reproducible-example" target="_blank" rel="noopener noreferrer" class="">minimal reproducible example</a> (MRE) to report.</p>
<p>The suggested approach:</p>
<ol>
<li class=""><a href="https://nowarp.io/blog/compiler-testing-part-1#deduplication" class="">Deduplicate</a> crashes with a triage script that analyzes the backtrace of crash/hang callsites</li>
<li class=""><a href="https://nowarp.io/blog/compiler-testing-part-1#minimization" class="">Minimize</a> the results — manually, using tooling, or with LLMs</li>
<li class=""><a href="https://nowarp.io/blog/compiler-testing-part-1#report-filing" class="">Report filing</a></li>
</ol>
<p>Each stage is described below.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="deduplication">Deduplication<a href="https://nowarp.io/blog/compiler-testing-part-1#deduplication" class="hash-link" aria-label="Direct link to Deduplication" title="Direct link to Deduplication" translate="no">​</a></h3>
<p>A long campaign produces hundreds of crash files for a handful of underlying bugs. Most of them are duplicates of the same panic triggered by different inputs, plus a tail of "benign" panics (stack overflows from parser bugs, intentional TODO errors, etc.) that should not be reported. A short script handles the filtering and grouping.</p>
<p>The algorithm:</p>
<ol>
<li class="">Collect crash inputs from all AFL++ output dirs across workers</li>
<li class="">Replay each crash against the harness with backtrace enabled</li>
<li class="">Filter out benign panics by matching known patterns (unimplemented features, stack overflows, etc.)</li>
<li class="">Extract the throw location from the backtrace</li>
<li class="">Normalize the panic message — strip identifiers, source locations, numbers, etc.</li>
<li class="">Group and cache crashes to avoid reporting old bugs again</li>
</ol>
<div align="center"><img style="width:75%" src="https://nowarp.io/assets/images/2026-04-17-triage-script-085b3fef0e62ad93d40dd58ac164f67a.png"></div>
<div align="center"><em><code>triage.py</code> output for the Solidity campaign: 157 bugs considered unique by AFL grouped into 16 unique locations</em></div>
<br>
<p>While the script seems easy to implement with LLMs, make sure it works correctly — especially backtrace parsing and deduplication logic — to avoid losing valid bugs.</p>
<p>An example implementation of such a script for a Solidity fuzzing campaign is available as a <a href="https://gist.github.com/jubnzv/827f06a16e2127c1bfed17de0c139619" target="_blank" rel="noopener noreferrer" class="">gist</a>.</p>
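<p>The normalization and grouping steps of the algorithm can be sketched independently of the linked gist. The snippet below assumes the replay step has already produced a panic message and a throw location for each crash file; the benign patterns, the placeholder tokens, and the function names are illustrative assumptions.</p>

```python
import re
from collections import defaultdict

# Assumed patterns for panics that should not be reported.
BENIGN_PATTERNS = [
    re.compile(r"stack overflow"),
    re.compile(r"not yet implemented"),
    re.compile(r"\bTODO\b"),
]


def normalize(message: str) -> str:
    """Strip source locations, quoted identifiers, and numbers from a panic message."""
    msg = re.sub(r"\S+\.\w+:\d+(?::\d+)?", "LOC", message)    # e.g. src/a.rs:10:5
    msg = re.sub(r"`[^`]*`|'[^']*'|\"[^\"]*\"", "ID", msg)     # quoted names
    msg = re.sub(r"\d+", "N", msg)                             # remaining numbers
    return msg.strip()


def group_crashes(reports, known=frozenset()):
    """Group (input_path, panic_message, throw_location) tuples by bug signature.

    `known` holds signatures reported in earlier runs, so old bugs are skipped.
    """
    groups = defaultdict(list)
    for path, message, location in reports:
        if any(p.search(message) for p in BENIGN_PATTERNS):
            continue
        key = (location, normalize(message))
        if key in known:
            continue
        groups[key].append(path)
    return dict(groups)
```

<p>Keying on both the throw location and the normalized message is a conservative choice: it may split one bug into two groups, but it is less likely to merge two different bugs and hide one of them.</p>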
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="minimization">Minimization<a href="https://nowarp.io/blog/compiler-testing-part-1#minimization" class="hash-link" aria-label="Direct link to Minimization" title="Direct link to Minimization" translate="no">​</a></h3>
<p>When minimizing a crash report manually, use the <a href="https://www.debuggingbook.org/html/DeltaDebugger.html" target="_blank" rel="noopener noreferrer" class="">delta debugging technique</a> — a classic troubleshooting approach.</p>
<p>Among the tools, <a href="https://github.com/uw-pluverse/perses" target="_blank" rel="noopener noreferrer" class="">perses</a> and <a href="https://github.com/langston-barrett/treereduce" target="_blank" rel="noopener noreferrer" class="">treereduce</a> can help — both provide grammar-aware reduction. <code>afl-tmin</code> is not a good fit here, because it operates at the bit/byte level and knows nothing about the grammar.</p>
<p>But typically manual minimization is not worth your time: you can safely delegate it to an LLM without any extra commands. Two things to watch for: tell the model not to report ASTs recovered after parsing errors, and check its result, because on weird-looking source it sometimes gives up without reproducing the bug.</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-04-17-minimized-solidity-8fb5a04c742f497d216648fde05b56b4.png"></div>
<div align="center"><em>Minimized by LLM: the original crash file and the minimized version (<a href="https://github.com/argotorg/solidity/issues/16622" target="_blank" rel="noopener noreferrer" class="">solidity#16622</a>)</em></div>
<br>
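<p>For reference, the delta debugging technique mentioned above fits in a few lines. This is a generic sketch of the classic <code>ddmin</code> procedure over a list of items (for example, source lines); the <code>still_fails</code> predicate is a placeholder that in practice would write the candidate to a file, run the compiler, and check for the same crash signature.</p>

```python
def ddmin(items, still_fails):
    """Shrink `items` to a 1-minimal list for which `still_fails` stays true."""
    if not still_fails(items):
        raise ValueError("the initial input does not reproduce the failure")
    n = 2  # current number of chunks
    while len(items) >= 2:
        chunk = max(1, len(items) // n)
        subsets = [items[i:i + chunk] for i in range(0, len(items), chunk)]
        reduced = False
        for i, subset in enumerate(subsets):
            complement = [x for j, s in enumerate(subsets) if j != i for x in s]
            if still_fails(subset):
                # One chunk alone reproduces the failure: restart on it.
                items, n, reduced = subset, 2, True
                break
            if still_fails(complement):
                # Removing one chunk keeps the failure: drop it.
                items, n, reduced = complement, max(n - 1, 2), True
                break
        if not reduced:
            if n >= len(items):
                break  # single-item granularity reached, nothing else removable
            n = min(len(items), n * 2)
    return items


if __name__ == "__main__":
    print(ddmin(list(range(20)), lambda xs: 3 in xs and 17 in xs))
```

<p>Line-based reduction often gets stuck on nested syntax, because removing a single line breaks brace balance; that is the case the grammar-aware reducers above handle better.</p>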
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="report-filing">Report filing<a href="https://nowarp.io/blog/compiler-testing-part-1#report-filing" class="hash-link" aria-label="Direct link to Report filing" title="Direct link to Report filing" translate="no">​</a></h3>
<p>After deduplication and minimization, you need to check for duplicates against existing issues and write a report.</p>
<p>LLMs work great here, but you need a good prompt:</p>
<ul>
<li class="">Ask the model to check for duplicates with <code>gh</code>.</li>
<li class="">Always ask it to reproduce with the real compiler, not with the fuzzing harness.</li>
<li class="">Write very concise reports without root cause analysis or suggested fixes — this avoids hallucinations. Don't write anything you did not check by yourself.</li>
</ul>
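<p>The duplicate check with <code>gh</code> can also be run as a plain script before the model is involved. A minimal sketch, assuming the GitHub CLI is installed and authenticated; the function names are hypothetical, and the search query should be a distinctive fragment of the normalized panic message.</p>

```python
import json
import subprocess


def gh_search_command(repo: str, query: str, limit: int = 20) -> list:
    """Build a `gh issue list` invocation that searches open and closed issues."""
    return [
        "gh", "issue", "list",
        "--repo", repo,
        "--state", "all",
        "--search", query,
        "--limit", str(limit),
        "--json", "number,title,url",
    ]


def find_possible_duplicates(repo: str, query: str, run=subprocess.run) -> list:
    """Return matching issues as dicts; `run` is injectable so the call can be stubbed."""
    proc = run(gh_search_command(repo, query), capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)
```

<p>Searching closed issues as well matters here: a crash that was already fixed on the main branch may still reproduce on the release being fuzzed.</p>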
<p>First, triage a few reports by yourself, then write a CLAUDE.md triage guide for the model to follow. The complete move-fuzz triage/minimization/reporting prompt <a href="https://github.com/nowarp/move-fuzz/blob/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/CLAUDE.md" target="_blank" rel="noopener noreferrer" class="">is available</a> in the <code>move-fuzz</code> repo.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="evaluation">Evaluation<a href="https://nowarp.io/blog/compiler-testing-part-1#evaluation" class="hash-link" aria-label="Direct link to Evaluation" title="Direct link to Evaluation" translate="no">​</a></h2>
<p>Here are the results of the campaign. All the findings go beyond the lexer and parser and were triggered by later compilation passes. The findings for Sui Move, Leo, and Cairo are confirmed and almost all have been fixed. Solang and Solidity bugs are under triage at the time of publishing (Apr 2026).</p>
<p>Here is the complete table:</p>
<table><thead><tr><th>Compiler</th><th>ICEs found</th><th>Issues</th></tr></thead><tbody><tr><td><a href="https://github.com/MystenLabs/sui/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Sui Move</a></td><td>27</td><td><a href="https://github.com/MystenLabs/sui/issues/25349" target="_blank" rel="noopener noreferrer" class="">#25349</a> <a href="https://github.com/MystenLabs/sui/issues/25450" target="_blank" rel="noopener noreferrer" class="">#25450</a> <a href="https://github.com/MystenLabs/sui/issues/25451" target="_blank" rel="noopener noreferrer" class="">#25451</a> <a href="https://github.com/MystenLabs/sui/issues/25452" target="_blank" rel="noopener noreferrer" class="">#25452</a> <a href="https://github.com/MystenLabs/sui/issues/25453" target="_blank" rel="noopener noreferrer" class="">#25453</a> <a href="https://github.com/MystenLabs/sui/issues/25454" target="_blank" rel="noopener noreferrer" class="">#25454</a> <a href="https://github.com/MystenLabs/sui/issues/25455" target="_blank" rel="noopener noreferrer" class="">#25455</a> <a href="https://github.com/MystenLabs/sui/issues/25456" target="_blank" rel="noopener noreferrer" class="">#25456</a> <a href="https://github.com/MystenLabs/sui/issues/25457" target="_blank" rel="noopener noreferrer" class="">#25457</a> <a href="https://github.com/MystenLabs/sui/issues/25458" target="_blank" rel="noopener noreferrer" class="">#25458</a> <a href="https://github.com/MystenLabs/sui/issues/25459" target="_blank" rel="noopener noreferrer" class="">#25459</a> <a href="https://github.com/MystenLabs/sui/issues/25460" target="_blank" rel="noopener noreferrer" class="">#25460</a> <a href="https://github.com/MystenLabs/sui/issues/25472" target="_blank" rel="noopener noreferrer" class="">#25472</a> <a href="https://github.com/MystenLabs/sui/issues/25529" target="_blank" rel="noopener noreferrer" class="">#25529</a> <a href="https://github.com/MystenLabs/sui/issues/25548" 
target="_blank" rel="noopener noreferrer" class="">#25548</a> <a href="https://github.com/MystenLabs/sui/issues/25595" target="_blank" rel="noopener noreferrer" class="">#25595</a> <a href="https://github.com/MystenLabs/sui/issues/25607" target="_blank" rel="noopener noreferrer" class="">#25607</a> <a href="https://github.com/MystenLabs/sui/issues/25608" target="_blank" rel="noopener noreferrer" class="">#25608</a> <a href="https://github.com/MystenLabs/sui/issues/25650" target="_blank" rel="noopener noreferrer" class="">#25650</a> <a href="https://github.com/MystenLabs/sui/issues/25711" target="_blank" rel="noopener noreferrer" class="">#25711</a> <a href="https://github.com/MystenLabs/sui/issues/25750" target="_blank" rel="noopener noreferrer" class="">#25750</a> <a href="https://github.com/MystenLabs/sui/issues/25775" target="_blank" rel="noopener noreferrer" class="">#25775</a> <a href="https://github.com/MystenLabs/sui/issues/25790" target="_blank" rel="noopener noreferrer" class="">#25790</a> <a href="https://github.com/MystenLabs/sui/issues/25825" target="_blank" rel="noopener noreferrer" class="">#25825</a> <a href="https://github.com/MystenLabs/sui/issues/25826" target="_blank" rel="noopener noreferrer" class="">#25826</a> <a href="https://github.com/MystenLabs/sui/issues/25846" target="_blank" rel="noopener noreferrer" class="">#25846</a> <a href="https://github.com/MystenLabs/sui/issues/26110" target="_blank" rel="noopener noreferrer" class="">#26110</a></td></tr><tr><td><a href="https://github.com/ProvableHQ/leo/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Leo</a></td><td>22</td><td><a href="https://github.com/ProvableHQ/leo/issues/29218" target="_blank" rel="noopener noreferrer" class="">#29218</a> <a href="https://github.com/ProvableHQ/leo/issues/29219" target="_blank" rel="noopener noreferrer" class="">#29219</a> <a href="https://github.com/ProvableHQ/leo/issues/29220" target="_blank" rel="noopener 
noreferrer" class="">#29220</a> <a href="https://github.com/ProvableHQ/leo/issues/29221" target="_blank" rel="noopener noreferrer" class="">#29221</a> <a href="https://github.com/ProvableHQ/leo/issues/29222" target="_blank" rel="noopener noreferrer" class="">#29222</a> <a href="https://github.com/ProvableHQ/leo/issues/29223" target="_blank" rel="noopener noreferrer" class="">#29223</a> <a href="https://github.com/ProvableHQ/leo/issues/29224" target="_blank" rel="noopener noreferrer" class="">#29224</a> <a href="https://github.com/ProvableHQ/leo/issues/29225" target="_blank" rel="noopener noreferrer" class="">#29225</a> <a href="https://github.com/ProvableHQ/leo/issues/29226" target="_blank" rel="noopener noreferrer" class="">#29226</a> <a href="https://github.com/ProvableHQ/leo/issues/29227" target="_blank" rel="noopener noreferrer" class="">#29227</a> <a href="https://github.com/ProvableHQ/leo/issues/29229" target="_blank" rel="noopener noreferrer" class="">#29229</a> <a href="https://github.com/ProvableHQ/leo/issues/29230" target="_blank" rel="noopener noreferrer" class="">#29230</a> <a href="https://github.com/ProvableHQ/leo/issues/29305" target="_blank" rel="noopener noreferrer" class="">#29305</a> <a href="https://github.com/ProvableHQ/leo/issues/29306" target="_blank" rel="noopener noreferrer" class="">#29306</a> <a href="https://github.com/ProvableHQ/leo/issues/29307" target="_blank" rel="noopener noreferrer" class="">#29307</a> <a href="https://github.com/ProvableHQ/leo/issues/29309" target="_blank" rel="noopener noreferrer" class="">#29309</a> <a href="https://github.com/ProvableHQ/leo/issues/29314" target="_blank" rel="noopener noreferrer" class="">#29314</a> <a href="https://github.com/ProvableHQ/leo/issues/29315" target="_blank" rel="noopener noreferrer" class="">#29315</a> <a href="https://github.com/ProvableHQ/leo/issues/29316" target="_blank" rel="noopener noreferrer" class="">#29316</a> <a href="https://github.com/ProvableHQ/leo/issues/29324" 
target="_blank" rel="noopener noreferrer" class="">#29324</a> <a href="https://github.com/ProvableHQ/leo/issues/29325" target="_blank" rel="noopener noreferrer" class="">#29325</a> <a href="https://github.com/ProvableHQ/leo/issues/29326" target="_blank" rel="noopener noreferrer" class="">#29326</a></td></tr><tr><td><a href="https://github.com/hyperledger-solang/solang/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Solang</a></td><td>20</td><td><a href="https://github.com/hyperledger-solang/solang/issues/1862" target="_blank" rel="noopener noreferrer" class="">#1862</a> <a href="https://github.com/hyperledger-solang/solang/issues/1863" target="_blank" rel="noopener noreferrer" class="">#1863</a> <a href="https://github.com/hyperledger-solang/solang/issues/1864" target="_blank" rel="noopener noreferrer" class="">#1864</a> <a href="https://github.com/hyperledger-solang/solang/issues/1865" target="_blank" rel="noopener noreferrer" class="">#1865</a> <a href="https://github.com/hyperledger-solang/solang/issues/1866" target="_blank" rel="noopener noreferrer" class="">#1866</a> <a href="https://github.com/hyperledger-solang/solang/issues/1867" target="_blank" rel="noopener noreferrer" class="">#1867</a> <a href="https://github.com/hyperledger-solang/solang/issues/1868" target="_blank" rel="noopener noreferrer" class="">#1868</a> <a href="https://github.com/hyperledger-solang/solang/issues/1869" target="_blank" rel="noopener noreferrer" class="">#1869</a> <a href="https://github.com/hyperledger-solang/solang/issues/1870" target="_blank" rel="noopener noreferrer" class="">#1870</a> <a href="https://github.com/hyperledger-solang/solang/issues/1871" target="_blank" rel="noopener noreferrer" class="">#1871</a> <a href="https://github.com/hyperledger-solang/solang/issues/1872" target="_blank" rel="noopener noreferrer" class="">#1872</a> <a href="https://github.com/hyperledger-solang/solang/issues/1873" target="_blank" rel="noopener 
noreferrer" class="">#1873</a> <a href="https://github.com/hyperledger-solang/solang/issues/1874" target="_blank" rel="noopener noreferrer" class="">#1874</a> <a href="https://github.com/hyperledger-solang/solang/issues/1876" target="_blank" rel="noopener noreferrer" class="">#1876</a> <a href="https://github.com/hyperledger-solang/solang/issues/1877" target="_blank" rel="noopener noreferrer" class="">#1877</a> <a href="https://github.com/hyperledger-solang/solang/issues/1878" target="_blank" rel="noopener noreferrer" class="">#1878</a> <a href="https://github.com/hyperledger-solang/solang/issues/1879" target="_blank" rel="noopener noreferrer" class="">#1879</a> <a href="https://github.com/hyperledger-solang/solang/issues/1880" target="_blank" rel="noopener noreferrer" class="">#1880</a> <a href="https://github.com/hyperledger-solang/solang/issues/1881" target="_blank" rel="noopener noreferrer" class="">#1881</a> <a href="https://github.com/hyperledger-solang/solang/issues/1882" target="_blank" rel="noopener noreferrer" class="">#1882</a></td></tr><tr><td><a href="https://github.com/argotorg/solidity/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Solidity</a></td><td>20</td><td><a href="https://github.com/argotorg/solidity/issues/16610" target="_blank" rel="noopener noreferrer" class="">#16610</a> <a href="https://github.com/argotorg/solidity/issues/16611" target="_blank" rel="noopener noreferrer" class="">#16611</a> <a href="https://github.com/argotorg/solidity/issues/16612" target="_blank" rel="noopener noreferrer" class="">#16612</a> <a href="https://github.com/argotorg/solidity/issues/16613" target="_blank" rel="noopener noreferrer" class="">#16613</a> <a href="https://github.com/argotorg/solidity/issues/16614" target="_blank" rel="noopener noreferrer" class="">#16614</a> <a href="https://github.com/argotorg/solidity/issues/16615" target="_blank" rel="noopener noreferrer" class="">#16615</a> <a 
href="https://github.com/argotorg/solidity/issues/16616" target="_blank" rel="noopener noreferrer" class="">#16616</a> <a href="https://github.com/argotorg/solidity/issues/16617" target="_blank" rel="noopener noreferrer" class="">#16617</a> <a href="https://github.com/argotorg/solidity/issues/16618" target="_blank" rel="noopener noreferrer" class="">#16618</a> <a href="https://github.com/argotorg/solidity/issues/16619" target="_blank" rel="noopener noreferrer" class="">#16619</a> <a href="https://github.com/argotorg/solidity/issues/16620" target="_blank" rel="noopener noreferrer" class="">#16620</a> <a href="https://github.com/argotorg/solidity/issues/16621" target="_blank" rel="noopener noreferrer" class="">#16621</a> <a href="https://github.com/argotorg/solidity/issues/16622" target="_blank" rel="noopener noreferrer" class="">#16622</a> <a href="https://github.com/argotorg/solidity/issues/16624" target="_blank" rel="noopener noreferrer" class="">#16624</a> <a href="https://github.com/argotorg/solidity/issues/16627" target="_blank" rel="noopener noreferrer" class="">#16627</a> <a href="https://github.com/argotorg/solidity/issues/16628" target="_blank" rel="noopener noreferrer" class="">#16628</a> <a href="https://github.com/argotorg/solidity/issues/16629" target="_blank" rel="noopener noreferrer" class="">#16629</a> <a href="https://github.com/argotorg/solidity/issues/16630" target="_blank" rel="noopener noreferrer" class="">#16630</a> <a href="https://github.com/argotorg/solidity/issues/16633" target="_blank" rel="noopener noreferrer" class="">#16633</a> <a href="https://github.com/argotorg/solidity/issues/16636" target="_blank" rel="noopener noreferrer" class="">#16636</a></td></tr><tr><td><a href="https://github.com/starkware-libs/cairo/issues?q=is%3Aissue%20author%3Ajubnzv" target="_blank" rel="noopener noreferrer" class="">Cairo</a></td><td>11</td><td><a href="https://github.com/starkware-libs/cairo/issues/9785" target="_blank" rel="noopener noreferrer" 
class="">#9785</a> <a href="https://github.com/starkware-libs/cairo/issues/9786" target="_blank" rel="noopener noreferrer" class="">#9786</a> <a href="https://github.com/starkware-libs/cairo/issues/9787" target="_blank" rel="noopener noreferrer" class="">#9787</a> <a href="https://github.com/starkware-libs/cairo/issues/9788" target="_blank" rel="noopener noreferrer" class="">#9788</a> <a href="https://github.com/starkware-libs/cairo/issues/9789" target="_blank" rel="noopener noreferrer" class="">#9789</a> <a href="https://github.com/starkware-libs/cairo/issues/9790" target="_blank" rel="noopener noreferrer" class="">#9790</a> <a href="https://github.com/starkware-libs/cairo/issues/9791" target="_blank" rel="noopener noreferrer" class="">#9791</a> <a href="https://github.com/starkware-libs/cairo/issues/9797" target="_blank" rel="noopener noreferrer" class="">#9797</a> <a href="https://github.com/starkware-libs/cairo/issues/9798" target="_blank" rel="noopener noreferrer" class="">#9798</a> <a href="https://github.com/starkware-libs/cairo/issues/9799" target="_blank" rel="noopener noreferrer" class="">#9799</a> <a href="https://github.com/starkware-libs/cairo/issues/9824" target="_blank" rel="noopener noreferrer" class="">#9824</a></td></tr><tr><td><strong>Total</strong></td><td><strong>100</strong></td><td></td></tr></tbody></table>
<p>The campaign ran on a 2019 Intel i7 U-series CPU and did not take much time. The goal was to validate the approach, not to find every possible bug, because running the infrastructure takes resources. Most findings came from initial corpus generation and high-quality mutators; coverage-guided path exploration contributed much less.</p>
<p>Here is the concrete configuration used in fuzzing campaigns:</p>
<ul>
<li class="">Sui Move and Leo were fuzzed mostly with crafted MetaMut-style mutators with additional afl-ts instances. Default AFL++ mutations with custom dicts were applied for some workers. honggfuzz and libfuzzer workers with dicts were applied in 1 thread for some time.</li>
<li class="">Solidity and Solang both were fuzzed with <code>afl-ts</code> workers mostly because solidity has a large corpus of contracts (e.g. <a href="https://huggingface.co/datasets/Zellic/all-ethereum-contracts" target="_blank" rel="noopener noreferrer" class="">Zellic dataset</a>, lots of regression tests for previous findings in the Solidity repo) and good <a href="https://github.com/JoranHonig/tree-sitter-solidity" target="_blank" rel="noopener noreferrer" class="">up-to-date tree-sitter grammar</a>. Default AFL++ mutations with custom dicts were applied for some workers.</li>
<li class="">Cairo – a MetaMut-style mutators with only mutations for rare constructions and <code>afl-ts</code>. AFL++ mutations were disabled – since the fuzzing campaign there has extremely low stability and often hits memory limits because of <a href="https://github.com/salsa-rs/salsa" target="_blank" rel="noopener noreferrer" class="">Salsa</a>, any extra executions are expensive because they clutter the corpus making the fuzzing less effective. Thus, only grammar-aware mutations were applied there.</li>
</ul>
<p>Most of the bugs were found in Sui Move, likely because the approach is most developed there: Sui Move was the initial fuzzing target, as a follow-up to <a href="https://nowarp.io/blog/skry/" target="_blank" rel="noopener noreferrer" class="">previous work</a>, so the tooling matured on it before being applied to the other compilers.</p>
<p>There are no precise statistics or comparisons showing which custom mutators give the best results, though most findings were made by custom MetaMut-style mutations and the <code>afl-ts</code> mutator. This is not a paper evaluating a single mutator: if your goal is also to find bugs in production code, use every approach that gives proven results with low effort, and combine multiple fuzzers for better corpus diversity and quicker path discovery.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="challenges">Challenges<a href="https://nowarp.io/blog/compiler-testing-part-1#challenges" class="hash-link" aria-label="Direct link to Challenges" title="Direct link to Challenges" translate="no">​</a></h3>
<p>Some challenges encountered during these campaigns:</p>
<ul>
<li class=""><strong>Corpus growth with big files</strong> — if you start without a good initial corpus and enable <code>ts-add</code>, the corpus accumulates oversized entries that slow down the whole campaign. Minimize early and aggressively.</li>
<li class=""><strong>Stability of stateful compilers</strong> — Cairo uses <a href="https://github.com/salsa-rs/salsa" target="_blank" rel="noopener noreferrer" class="">Salsa</a>, an incremental computation library. While convenient for tooling development, it complicates fuzzing: the fuzzing state has to be reset every N iterations to avoid OOM, and the MetaMut-style mutator has to be tweaked accordingly. Move and Leo are mostly stable; any minor issues are likely caused by map type usage, but they don't affect the campaign.</li>
<li class=""><strong>Tree-sitter grammar quality</strong> — the whole pipeline (corpus generation, <code>afl-ts</code>, renaming script) relies heavily on the grammar parsing valid source without <code>ERROR</code> nodes. While <code>afl-ts</code> tries to recover from <code>ERROR</code> nodes by inserting syntactically valid code via its <code>ts-chaos</code> strategy, it is more efficient to run on a clean grammar.</li>
<li class=""><strong>Reproducibility across versions</strong> — compilers move fast. An ICE found at HEAD may already be fixed by the time someone triages the report, or the minimizer may shift the behavior to a different internal error. Pin the submodule version in the harness and include the exact commit in every report.</li>
</ul>
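<p>The "minimize early" advice from the first item can be approximated with a cheap pre-pass before a coverage-based minimizer such as <code>afl-cmin</code> runs. The sketch below is only an illustration and not part of the published tooling: the <code>prune_corpus</code> helper, the size threshold, and the file names are assumptions. It deletes seeds above a byte limit and byte-identical duplicates.</p>

```python
import hashlib
import tempfile
from pathlib import Path


def prune_corpus(corpus_dir: Path, max_bytes: int) -> list[Path]:
    """Delete oversized and byte-identical seeds; return the removed paths.

    Hypothetical helper: a cheap filter, not a replacement for afl-cmin.
    """
    removed: list[Path] = []
    seen_digests: set[str] = set()
    # Sort so that the surviving copy of a duplicate is deterministic.
    for seed in sorted(p for p in corpus_dir.iterdir() if p.is_file()):
        data = seed.read_bytes()
        digest = hashlib.sha256(data).hexdigest()
        if len(data) > max_bytes or digest in seen_digests:
            seed.unlink()
            removed.append(seed)
        else:
            seen_digests.add(digest)
    return removed


# Demo on a throwaway directory; the 4096-byte limit is an arbitrary example.
with tempfile.TemporaryDirectory() as tmp:
    corpus = Path(tmp)
    (corpus / "small.move").write_bytes(b"module a::m {}")
    (corpus / "copy.move").write_bytes(b"module a::m {}")
    (corpus / "huge.move").write_bytes(b"x" * 10_000)
    print(sorted(p.name for p in prune_corpus(corpus, max_bytes=4096)))
```

<p>A size filter like this knows nothing about coverage, so it may drop a large seed that reaches unique paths. It is probably best treated as a first pass that keeps the later coverage-based minimization fast.</p>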
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion-and-further-work">Conclusion and further work<a href="https://nowarp.io/blog/compiler-testing-part-1#conclusion-and-further-work" class="hash-link" aria-label="Direct link to Conclusion and further work" title="Direct link to Conclusion and further work" translate="no">​</a></h2>
<p>This blog post shares experience setting up a cheap, fast fuzzing campaign to find ICEs in the compiler of a non-mainstream language. The approach and tooling are reproducible for any compiler.</p>
<p>Two new AFL++ grammar-aware mutators are introduced: the <a href="https://nowarp.io/blog/compiler-testing-part-1#afl-ts-tree-sitter-based-afl-mutator" class=""><code>afl-ts</code></a> mutator, which works with any tree-sitter grammar, and a <a href="https://nowarp.io/blog/compiler-testing-part-1#metamut-style-mutators" class="">MetaMut-style</a> LLM-generated mutator that produces hundreds of language-specific operations from a few prompts. Both proved effective in finding ICEs.</p>
<p><a href="https://nowarp.io/blog/compiler-testing-part-1#corpus-and-dictionaries" class="">Corpus and dictionary setup</a> is covered with practical advice: collect broadly, minimize aggressively, mix manual dictionary entries with <code>AFL_LLVM_DICT2FILE</code> auto-generation. Helper tools (tsgen, validation scripts) are included.</p>
<p><a href="https://nowarp.io/blog/compiler-testing-part-1#triage-workflow" class="">Minimization and triage</a> are LLM-assisted: a CLAUDE.md triage guide handles bucketing and MRE generation, while <code>afl-cmin</code> and perses (or an LLM directly) shrink test cases. Concise prompts without root cause analysis reduce hallucination.</p>
<p>This post is mainly experience sharing: before digging into tools and literature, this setup took a couple of weeks; with the approach described here, it takes 1-2 days to get real findings.</p>
<p>We intentionally don't consider testing approaches that require more time and effort to implement. Oracles, miscompilation, and implementation/specification mismatch errors are out of scope and will be described in the next part, since this post is already very long.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="projects-discussed">Projects discussed<a href="https://nowarp.io/blog/compiler-testing-part-1#projects-discussed" class="hash-link" aria-label="Direct link to Projects discussed" title="Direct link to Projects discussed" translate="no">​</a></h3>
<p>A list of small utilities, mutators, and tools recently published and used in the project:</p>
<ul>
<li class=""><a href="https://github.com/jubnzv/afl-ts" target="_blank" rel="noopener noreferrer" class="">jubnzv/afl-ts</a> – grammar-aware AFL++ mutator leveraging tree-sitter</li>
<li class=""><a href="https://github.com/jubnzv/tsgen" target="_blank" rel="noopener noreferrer" class="">jubnzv/tsgen</a> – utility for seeding fuzzing corpora using tree-sitter grammars</li>
<li class=""><a href="https://github.com/jubnzv/multifuzz" target="_blank" rel="noopener noreferrer" class="">jubnzv/multifuzz</a> – unified configuration, orchestration, and Rust API for AFL++/honggfuzz/libFuzzer — no implicit settings or overhead</li>
<li class=""><a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc" target="_blank" rel="noopener noreferrer" class="">nowarp/move-fuzz</a> – fuzzer and mutators for Sui Move<!-- -->
<ul>
<li class=""><a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/crates/source-multifuzz" target="_blank" rel="noopener noreferrer" class="">Fuzzing harness</a>, <a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/dicts" target="_blank" rel="noopener noreferrer" class="">dictionaries</a>, and <a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/scripts" target="_blank" rel="noopener noreferrer" class="">scripts</a></li>
<li class=""><a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/metamove" target="_blank" rel="noopener noreferrer" class="">MetaMut-style mutator</a> with 884 custom mutations for Sui Move</li>
<li class=""><a href="https://github.com/nowarp/move-fuzz/tree/f59321cb299c4877d64493d4c0a95d2f54f5f7bc/custom_mutators/move" target="_blank" rel="noopener noreferrer" class="">Ad-hoc Move mutator</a></li>
</ul>
</li>
<li class="">Many ad-hoc scripts demonstrating the approaches — see the post.</li>
</ul>
<p>Not published yet:</p>
<ul>
<li class="">Leo: fuzzing harness with utilities and MetaMut-style mutator</li>
<li class="">Cairo: fuzzing harness with utilities and MetaMut-style mutator</li>
<li class="">Solidity and Solang: fuzzing harness with utilities</li>
<li class="">Any experiments beyond the scope of the described techniques</li>
</ul>
<p>If you work on any of the compilers mentioned, reach out — happy to share repo access.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="references">References<a href="https://nowarp.io/blog/compiler-testing-part-1#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References" translate="no">​</a></h2>
<ol>
<li class=""><a href="https://shao-hua-li.github.io/assets/pdf/2024_pldi_creal_final.pdf" target="_blank" rel="noopener noreferrer" class="">Li et al – Boosting Compiler Testing by Injecting Real-World Code</a> (2024)</li>
<li class=""><a href="https://arxiv.org/pdf/2505.02464v1" target="_blank" rel="noopener noreferrer" class="">Paaßen et al – Targeted Fuzzing for Unsafe Rust Code: Leveraging Selective Instrumentation</a> (2025)</li>
<li class=""><a href="https://arxiv.org/abs/2310.15991" target="_blank" rel="noopener noreferrer" class="">Yang et al – WhiteFox: White-Box Compiler Fuzzing Empowered by Large Language Models</a> (2023)</li>
<li class=""><a href="https://taesoo.kim/pubs/2020/park:die.pdf" target="_blank" rel="noopener noreferrer" class="">Park et al – Fuzzing JavaScript Engines with Aspect-preserving Mutation</a> (2020)</li>
<li class=""><a href="https://connglli.github.io/pdfs/metamut_asplos24.pdf" target="_blank" rel="noopener noreferrer" class="">Ou et al – The Mutators Reloaded: Fuzzing Compilers with Large Language Model Generated Mutation Operators</a> (2024)</li>
<li class=""><a href="https://schumilo.de/publications/redqueen/NDSS19-Redqueen.pdf" target="_blank" rel="noopener noreferrer" class="">Aschermann et al – REDQUEEN: Fuzzing with Input-to-State Correspondence</a> (2019)</li>
<li class=""><a href="https://www.vuminhle.com/pdf/oopsla16.pdf" target="_blank" rel="noopener noreferrer" class="">Sun et al – Finding compiler bugs via live code mutation</a> (2016)</li>
<li class=""><a href="https://haoxintu.github.io/files/icse2024-nier-camera-ready.pdf" target="_blank" rel="noopener noreferrer" class="">Tu et al – Beyond a Joke: Dead Code Elimination Can Delete Live Code</a> (2024)</li>
<li class=""><a href="https://nebelwelt.net/files/21ISSTA.pdf" target="_blank" rel="noopener noreferrer" class="">Srivastava et al – Gramatron: Effective Grammar-Aware Fuzzing</a> (2021)</li>
<li class=""><a href="https://cs.uwaterloo.ca/~cnsun/public/publication/issta25-tool/issta25-tool.pdf" target="_blank" rel="noopener noreferrer" class="">Xie et al – Kitten: A Simple Yet Effective Baseline for Evaluating LLM-Based Compiler Testing Techniques</a> (2025)</li>
<li class=""><a href="https://arxiv.org/pdf/2510.07834v1" target="_blank" rel="noopener noreferrer" class="">Liu et al – Bug Histories as Sources of Compiler Fuzzing Mutators</a> (2025)</li>
<li class=""><a href="https://arxiv.org/pdf/2308.04748" target="_blank" rel="noopener noreferrer" class="">Xia et al – Fuzz4All: Universal Fuzzing with Large Language Models</a> (2024)</li>
<li class=""><a href="https://agroce.github.io/cc22.pdf" target="_blank" rel="noopener noreferrer" class="">Groce et al – Making No-Fuss Compiler Fuzzing Effective</a> (2022)</li>
</ol>]]></content:encoded>
            <author>jubnzv@gmail.com (Georgiy Komarov)</author>
            <category>fuzzing</category>
            <category>compilers</category>
            <category>llm</category>
            <category>sui</category>
            <category>move</category>
            <category>ethereum</category>
            <category>compilers-testing</category>
        </item>
        <item>
            <title><![CDATA[Skry: Hybrid LLM Static Analysis for Sui Move]]></title>
            <link>https://nowarp.io/blog/skry</link>
            <guid>https://nowarp.io/blog/skry</guid>
            <pubDate>Sat, 17 Jan 2026 00:00:00 GMT</pubDate>
            <description><![CDATA[A hybrid static analysis + LLM security tool for Sui Move, focused on access control, governance, and centralization issues.]]></description>
            <content:encoded><![CDATA[<p>This is an overview of a new hybrid static analysis + LLM security tool for Sui Move, focused on access control, governance, and centralization issues. Skry uses static analysis to narrow candidates, applies targeted LLM classification, and then runs interprocedural and cross-module taint propagation, so the final detection is done by static analysis. This avoids most LLM hallucinations and reaches bugs pure static analysis can't. Proof-of-concept source code <a href="https://github.com/nowarp/skry" target="_blank" rel="noopener noreferrer" class="">is available</a>.</p>
<p>The blog post contains the following sections:</p>
<ol>
<li class=""><strong>Static analysis + LLM:</strong> description of the approach</li>
<li class=""><strong>Skry: Design &amp; implementation:</strong> analyzer pipeline, key features, and what makes it different</li>
<li class=""><strong>Evaluation:</strong> findings on real-world contracts, detection accuracy, reproduced audit findings</li>
<li class=""><strong>Conclusion:</strong> current use cases and future work</li>
</ol>
<p>The project is currently a proof of concept. It does find issues in real-world projects with a low false-positive rate, but needs more work to make its analysis more precise and capable.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="static-analysis--llm">Static analysis + LLM<a href="https://nowarp.io/blog/skry#static-analysis--llm" class="hash-link" aria-label="Direct link to Static analysis + LLM" title="Direct link to Static analysis + LLM" translate="no">​</a></h2>
<p>LLMs are already used in smart-contract security. Typical usage is straightforward: provide source code as input and ask the model to identify potential vulnerabilities.</p>
<p>Some real issues have been found this way, but practical experimentation exposes several limitations:</p>
<ol>
<li class=""><strong>Noise:</strong> without strict scoping, the model reasons about irrelevant code and paths.</li>
<li class=""><strong>Cost:</strong> large contexts and RAG-style setups significantly increase inference cost.</li>
<li class=""><strong>Non-determinism:</strong> results are fuzzy and difficult to reproduce.</li>
<li class=""><strong>High false-positive rate:</strong> models tend to over-report issues without semantic grounding.</li>
</ol>
<p>Using LLMs alone is not effective for systematic vulnerability detection. The issue is not the model itself, but the lack of structure, constraints, and analysis scope. As a result, approaches that combine deterministic methods with LLMs have been proposed. In static analysis + LLM systems, existing tools integrate LLMs into the analysis pipeline either to extend detection logic <a href="https://nowarp.io/blog/skry#references" class="">[1]</a> <a href="https://nowarp.io/blog/skry#references" class="">[2]</a> or to reduce the false-positive rate <a href="https://nowarp.io/blog/skry#references" class="">[3]</a>.</p>
<p>In these approaches, static analysis performs the core reasoning and defines the analysis scope, while LLMs are used only for properties that cannot be reliably inferred statically, such as semantic intent or project-specific logic. The goal is not to replace static analysis, but to extend it where classic techniques rely on approximations or heuristics.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="skry-design--implementation">Skry: Design &amp; implementation<a href="https://nowarp.io/blog/skry#skry-design--implementation" class="hash-link" aria-label="Direct link to Skry: Design &amp; implementation" title="Direct link to Skry: Design &amp; implementation" translate="no">​</a></h2>
<p>Skry is a static program analyzer that uses LLMs to reduce reliance on heuristics and overapproximations when dealing with bugs that are difficult to express using classic static analysis alone. LLMs are used only for data classification and limited semantic reasoning about smart contract constructs. Bug detection and soundness remain the responsibility of the static analyzer.</p>
<p>Internally, the analyzer collects information about the contract as Datalog-style facts. These facts are later queried using a small eDSL based on <a href="https://github.com/hylang/hy" target="_blank" rel="noopener noreferrer" class="">Hy</a> macros. This design allows both structural and semantic information to be reused across analyses and rules.</p>
<p>The overall pipeline architecture is shown below:</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-01-17-pipeline-999a303e7a6d126a0db09e310291bac5.png"></div>
<p>The implementation is based on Python, with Hy used for the eDSL. Choosing Python is <em>intellectually violent</em>, but sufficient for the proof-of-concept version when set up carefully. Python simplifies integration with tree-sitter for parsing, LLM APIs, and provides quality testing and debugging support. It also simplifies future integration with external tooling that may be worth considering, such as probabilistic Datalog engines or SMT solvers.</p>
<p>The analyzer is:</p>
<ul>
<li class="">source-level and currently supports Sui Move only,</li>
<li class="">interprocedural and cross-module for taint and dataflow analysis,</li>
<li class="">path-insensitive in the current version.</li>
</ul>
<p>The following sections describe specific components in more detail, including the rule system, fact representation, LLM-based classification, and detection of access control and centralization risks.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="scope-and-focus">Scope and focus<a href="https://nowarp.io/blog/skry#scope-and-focus" class="hash-link" aria-label="Direct link to Scope and focus" title="Direct link to Scope and focus" translate="no">​</a></h3>
<p>Skry is intentionally focused on a narrow set of security patterns that are difficult to express using classic static analysis:</p>
<ul>
<li class=""><strong>Access control:</strong> capability misuse, missing authorization, pause bypass, generic type safety.</li>
<li class=""><strong>Centralization:</strong> admin drain patterns, missing audit events, immutable configuration, single-step ownership.</li>
<li class=""><strong>Structural checks:</strong> double initialization, missing transfers, duplicated branches, weak randomness.</li>
</ul>
<p>Structural issues come for free when building a static analyzer and are included when detected.
Access-control and centralization issues depend on semantic properties such as what a capability represents, who owns what, where privilege boundaries are, and what the project intends. Traditional tools usually cannot model this with confidence and either guess based on heuristics or ignore them.</p>
<p>Skry's detection logic is deterministic. LLM-based classification is used only to extract the minimal semantic information in a constrained scope needed to reason about access control patterns where static analysis alone cannot.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="static-analysis-and-code-facts">Static analysis and code facts<a href="https://nowarp.io/blog/skry#static-analysis-and-code-facts" class="hash-link" aria-label="Direct link to Static analysis and code facts" title="Direct link to Static analysis and code facts" translate="no">​</a></h3>
<p>The analyzer implements a classic monotone framework and interprocedural taint analysis, propagating information across Move modules and packages.</p>
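<p>To make the monotone-framework idea concrete, here is a minimal worklist sketch of taint propagation over a control-flow graph. It is a simplified, intraprocedural illustration and not Skry's implementation: the statement encoding, the <code>taint_fixpoint</code> function, and the example program are all assumptions. Each node assigns one variable, the join is set union, and the transfer function kills the assigned variable unless it reads something tainted.</p>

```python
from collections import deque


def taint_fixpoint(stmts, succs, entry, sources):
    """Compute the set of tainted variables at the exit of every CFG node.

    stmts:   node -> (assigned variable, tuple of variables read)
    succs:   node -> list of successor nodes
    sources: variables considered tainted on entry (hypothetical encoding)
    """
    preds = {node: [] for node in stmts}
    for node, targets in succs.items():
        for target in targets:
            preds[target].append(node)

    out = {node: frozenset() for node in stmts}
    worklist = deque(stmts)
    while worklist:
        node = worklist.popleft()
        # Join: union of predecessor states, plus the sources at the entry.
        incoming = set(sources) if node == entry else set()
        for pred in preds[node]:
            incoming |= out[pred]
        # Transfer: reassignment clears taint unless a tainted value is read.
        assigned, reads = stmts[node]
        state = incoming - {assigned}
        if any(var in incoming for var in reads):
            state.add(assigned)
        new_out = frozenset(state)
        if new_out != out[node]:
            out[node] = new_out
            worklist.extend(succs.get(node, ()))
    return out


# a = user_input; b = a; a = const; c = b
STMTS = {
    0: ("a", ("user_input",)),
    1: ("b", ("a",)),
    2: ("a", ("const",)),
    3: ("c", ("b",)),
}
SUCCS = {0: [1], 1: [2], 2: [3]}

result = taint_fixpoint(STMTS, SUCCS, entry=0, sources={"user_input"})
print(sorted(result[3]))
```

<p>Because both the join and the transfer function are monotone over finite sets, the loop reaches a fixpoint. An interprocedural version would additionally map arguments to parameters at call sites, which this sketch leaves out.</p>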
<p>In addition to classic IRs common in static analyzers, the tool stores extracted information about the source code as Datalog-style facts. This is done for extensibility: users can access these facts directly from the eDSL to create new rules, regardless of whether the information is purely structural or “semantic” data gathered from LLM classification. This approach can also be used to generate project-specific rules or to cover common vulnerability patterns.</p>
<p>The code facts are defined in <a href="https://github.com/nowarp/skry/blob/05ee2ea0c86e7d57bca5df3e4240177332cd2db4/src/core/facts.py" target="_blank" rel="noopener noreferrer" class="">src/core/facts.py</a> and represent structural and semantic information. The goal of this separation is flexibility: it enables the user or the model to combine these facts into custom rules without changing the analyzer itself or using it as a framework.</p>
<p>A typical dump of code facts produced via <code>--dump-facts=&lt;DIR&gt;</code> contains information about each function and struct in the project, including LLM classifications and project categories. This output can be used for debugging and for designing new project-specific rules. It prints the code fragments of all the modules with the relevant facts. Here is a small example:</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-01-17-facts-dump-0e18d6059a6e67e3bb2bbe03f1e85869.png"></div>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="rules-and-structural-filters">Rules and structural filters<a href="https://nowarp.io/blog/skry#rules-and-structural-filters" class="hash-link" aria-label="Direct link to Rules and structural filters" title="Direct link to Rules and structural filters" translate="no">​</a></h3>
<p>The main detection logic is implemented in a small macro-based eDSL. Hy is chosen for dual interoperability with Python and its macro capabilities, which allow rules to be expressed in a concise format. This makes rules easier to read and, if needed, easier to generate via an LLM for a specific project.</p>
<p>After collecting structural information, the eDSL is used to directly access these facts, along with utilities to manipulate and combine them.</p>
<p>Currently, the tool supports <a href="https://github.com/nowarp/skry/blob/05ee2ea0c86e7d57bca5df3e4240177332cd2db4/src/rules" target="_blank" rel="noopener noreferrer" class="">45 rules</a>.</p>
<p>Here is <a href="https://github.com/nowarp/skry/blob/05ee2ea0c86e7d57bca5df3e4240177332cd2db4/rules/centralization.hy#L41" target="_blank" rel="noopener noreferrer" class="">one of the available rules</a>:</p>
<div class="language-clojure codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-clojure codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token comment" style="color:#999988;font-style:italic">;; ---------------------------------------------------------------------------</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">;; centralized-reward-distribution - Admin picks lottery/game winners</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">;; ---------------------------------------------------------------------------</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">;; Gaming project where admin-controlled function distributes rewards to</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">;; admin-chosen recipients. 
No verifiable on-chain randomness - users must</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">;; trust the admin to be fair.</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">;; Impact: Unfair game - legitimate players never win.</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">defrule</span><span class="token plain"> centralized-reward-distribution</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token symbol" style="color:#36acaa">:severity</span><span class="token plain"> </span><span class="token symbol" style="color:#36acaa">:medium</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token symbol" style="color:#36acaa">:categories</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">[</span><span class="token symbol" style="color:#36acaa">:centralization</span><span class="token plain"> </span><span class="token symbol" style="color:#36acaa">:fairness</span><span class="token punctuation" style="color:#393A34">]</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token symbol" style="color:#36acaa">:description</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"Admin-controlled reward distribution - no verifiable winner selection"</span><span class="token 
plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token symbol" style="color:#36acaa">:match</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">fun</span><span class="token plain"> </span><span class="token symbol" style="color:#36acaa">:public</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token symbol" style="color:#36acaa">:filter</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">and</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">project-category?</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"gaming"</span><span class="token plain"> facts ctx</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">        </span><span class="token comment" style="color:#999988;font-style:italic">;; Gaming/lottery/gambling project</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">checks-sender*?</span><span class="token plain"> f facts ctx</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">                 </span><span class="token comment" style="color:#999988;font-style:italic">;; Admin-gated (transitive sender check)</span><span 
class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">transfers-from-shared-object?</span><span class="token plain"> f facts ctx</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">   </span><span class="token comment" style="color:#999988;font-style:italic">;; Extracts from shared pool</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">has-param-type?</span><span class="token plain"> f </span><span class="token string" style="color:#e3116c">"address"</span><span class="token plain"> facts ctx</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">       </span><span class="token comment" style="color:#999988;font-style:italic">;; Has address param = admin picks recipient</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">            </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">transfers-from-sender?</span><span class="token plain"> f facts ctx</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">;; Not user withdrawing own funds</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span 
class="token plain">            </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">not</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token function" style="color:#d73a49">is-init?</span><span class="token plain"> f facts ctx</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">)</span><br></div></code></pre></div></div>
<p>Here, the rule file uses helper functions that combine existing code facts to make them more convenient to use. The naming is literal: functions ending with <code>?</code> return booleans, and <code>*</code> indicates transitive behavior.</p>
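<p>To illustrate the difference between a plain predicate and its transitive <code>*</code> variant, here is a minimal Python sketch. The call-graph representation and the names <code>checks_sender</code> and <code>checks_sender_transitive</code> are assumptions made for this example; they are not taken from the Skry source.</p>

```python
from typing import Dict, Set

# Hypothetical call graph: function name -> names of the functions it calls.
CallGraph = Dict[str, Set[str]]


def checks_sender(func: str, direct_checks: Set[str]) -> bool:
    """Plain predicate: the function itself contains a sender check."""
    return func in direct_checks


def checks_sender_transitive(func: str, calls: CallGraph, direct_checks: Set[str]) -> bool:
    """Transitive variant: the function or any of its callees checks the sender."""
    seen: Set[str] = set()
    stack = [func]
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # already visited; also guards against recursive calls
        seen.add(current)
        if checks_sender(current, direct_checks):
            return True
        stack.extend(calls.get(current, set()))
    return False


# Toy input: `payout` delegates its check to `assert_admin`.
calls = {"payout": {"assert_admin"}, "assert_admin": set(), "deposit": set()}
direct = {"assert_admin"}
```

<p>With this input, the plain predicate is false for <code>payout</code>, while the transitive one is true because the check is reached through a callee.</p>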
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="llm-classification">LLM classification<a href="https://nowarp.io/blog/skry#llm-classification" class="hash-link" aria-label="Direct link to LLM classification" title="Direct link to LLM classification" translate="no">​</a></h3>
<p>LLM classification is used in three cases:</p>
<ol>
<li class=""><strong>Project-wide feature classification:</strong> for example, the presence of a global pause, <a href="https://docs.sui.io/guides/developer/packages/upgrade" target="_blank" rel="noopener noreferrer" class="">versioning</a>, and project categories. This information is used to adjust specific rules.</li>
<li class=""><strong>Data classification:</strong> for each struct, the tool determines whether it contains sensitive data, configuration parameters, or protocol invariants, and whether it is intended to be owned by a privileged user.</li>
<li class=""><strong>Rule double-checking:</strong> in the <code>:classify</code> section for a subset of rules, to reduce the false-positive rate by handling subtle Move patterns or intentional design decisions.</li>
</ol>
<p>The generated prompts are based on <a href="https://jinja.palletsprojects.com/en/stable/" target="_blank" rel="noopener noreferrer" class="">Jinja2</a> templates and are available in <a href="https://github.com/nowarp/skry/blob/05ee2ea0c86e7d57bca5df3e4240177332cd2db4/src/prompts" target="_blank" rel="noopener noreferrer" class="">src/prompts/</a>.</p>
<p>Here is the LLM output for a real-world prompt generated from the <a href="https://github.com/nowarp/skry/blob/05ee2ea0c86e7d57bca5df3e4240177332cd2db4/src/prompts/classify/sensitivity_batch.j2" target="_blank" rel="noopener noreferrer" class="">data sensitivity classification template</a>:</p>
<div class="language-json codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-json codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token punctuation" style="color:#393A34">[</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"field"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"LockerCap::liquidation_ratio"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"reason"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"economic"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"confidence"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.85</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" 
style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"field"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"LockerCap::price_with_discount_ratio"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"reason"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"economic"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"confidence"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.90</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" 
style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"field"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"LockerCap::inactivation_delay"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"reason"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token string" style="color:#e3116c">"availability"</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token property" style="color:#36acaa">"confidence"</span><span class="token operator" style="color:#393A34">:</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">0.75</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">]</span><br></div></code></pre></div></div>
<p>This output is then used to generate the following semantic facts:</p>
<ul>
<li class=""><code>liquidation_ratio</code> → <code>economic</code> (manipulate = unfair liquidations)</li>
<li class=""><code>price_with_discount_ratio</code> → <code>economic</code> (manipulate = steal funds via discounts)</li>
<li class=""><code>inactivation_delay</code> → <code>availability</code> (manipulate = trap lockers or let them escape early)</li>
</ul>
<p>The model also correctly ignored counters, timestamps, and data structures, thanks to the reduced scope and the concrete prompt.</p>
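<p>The step from the JSON response to semantic facts can be sketched in a few lines of Python. This is an illustration only: the tuple layout of a fact and the <code>MIN_CONFIDENCE</code> cut-off are assumptions, not documented Skry behavior.</p>

```python
import json
from typing import List, Tuple

# Assumed cut-off for illustration; not a documented Skry default.
MIN_CONFIDENCE = 0.7


def to_facts(llm_output: str, min_confidence: float = MIN_CONFIDENCE) -> List[Tuple[str, str, str]]:
    """Turn the JSON array returned by the model into (struct, field, reason) facts."""
    facts = []
    for entry in json.loads(llm_output):
        if entry["confidence"] >= min_confidence:
            # "LockerCap::liquidation_ratio" -> ("LockerCap", "liquidation_ratio")
            struct, _, field = entry["field"].partition("::")
            facts.append((struct, field, entry["reason"]))
    return facts


raw = """[
  {"field": "LockerCap::liquidation_ratio", "reason": "economic", "confidence": 0.85},
  {"field": "LockerCap::inactivation_delay", "reason": "availability", "confidence": 0.75}
]"""
facts = to_facts(raw)
```

<p>Filtering on the reported confidence is one simple way to keep low-certainty classifications out of the fact base; whether and where such a threshold is applied is a design choice.</p>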
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="access-control-and-centralization-risks-detection">Access control and centralization risks detection<a href="https://nowarp.io/blog/skry#access-control-and-centralization-risks-detection" class="hash-link" aria-label="Direct link to Access control and centralization risks detection" title="Direct link to Access control and centralization risks detection" translate="no">​</a></h3>
<p>In Sui Move, access control is expressed through <a href="https://move-book.com/object/ownership/" target="_blank" rel="noopener noreferrer" class="">ownership</a> of <a href="https://move-book.com/programmability/capability/" target="_blank" rel="noopener noreferrer" class="">capability objects</a>. A capability is a first-class object that grants permission to perform a restricted operation. Functions require specific capability types as parameters, and only callers that own the corresponding object can invoke those operations. Capabilities are typically created during initialization and transferred explicitly, making access control decisions explicit and traceable in the code.</p>
<p>The tool implements <a href="https://github.com/nowarp/skry/blob/05ee2ea0c86e7d57bca5df3e4240177332cd2db4/src/anlaysis/cap_graph.py" target="_blank" rel="noopener noreferrer" class="">a simple IR</a> based on code facts that tracks capabilities in a graph. A Mermaid dump of this IR (accessible through <code>--dump-cap-graph=&lt;DIR&gt;</code>) looks like this:</p>
<div align="center"><img src="https://nowarp.io/assets/images/2026-01-17-capgraph-bf1e8fffedb5e33e97c6e194f1e4ff48.png"></div>
<p>Skry models this access control structure using a capability graph derived from code facts. The graph captures which addresses own which capabilities, which functions require those capabilities, and which objects are mutated as a result. This representation makes privilege boundaries, capability hierarchies, and sensitive state transitions explicit and analyzable.</p>
<p>In addition, the Move ownership model distinguishes between shared and owned objects. This distinction is critical for access control analysis, as shared objects enable global access patterns, while owned objects enforce per-address authority.</p>
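<p>As a rough illustration of the idea, the sketch below models a tiny capability graph in Python. The class, its fields, and the two queries are hypothetical and much simpler than the IR in the repository; they only show how capability requirements, mutated objects, and the shared/owned distinction can be combined.</p>

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CapabilityGraph:
    """Toy capability graph; all names here are hypothetical."""
    # capability type -> functions that require it as a parameter
    required_by: Dict[str, Set[str]] = field(default_factory=dict)
    # function -> objects it mutates
    mutates: Dict[str, Set[str]] = field(default_factory=dict)
    # objects that are shared (globally accessible) rather than owned
    shared_objects: Set[str] = field(default_factory=set)

    def require(self, cap: str, func: str) -> None:
        self.required_by.setdefault(cap, set()).add(func)

    def mutate(self, func: str, obj: str) -> None:
        self.mutates.setdefault(func, set()).add(obj)

    def objects_guarded_by(self, cap: str) -> Set[str]:
        """Objects mutated by functions that require the given capability."""
        guarded: Set[str] = set()
        for func in self.required_by.get(cap, set()):
            guarded |= self.mutates.get(func, set())
        return guarded

    def unguarded_shared_writers(self) -> List[str]:
        """Functions that mutate a shared object without requiring any capability."""
        gated = set().union(*self.required_by.values())
        return sorted(
            func for func, objs in self.mutates.items()
            if func not in gated and objs & self.shared_objects
        )


graph = CapabilityGraph(shared_objects={"Pool"})
graph.require("AdminCap", "set_fee")
graph.mutate("set_fee", "Pool")
graph.mutate("drain", "Pool")  # no capability required: worth a closer look
```

<p>Here <code>graph.unguarded_shared_writers()</code> returns <code>["drain"]</code>, the kind of candidate a rule would then refine with further checks such as sender validation.</p>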
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="evaluation">Evaluation<a href="https://nowarp.io/blog/skry#evaluation" class="hash-link" aria-label="Direct link to Evaluation" title="Direct link to Evaluation" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="evaluation-setup">Evaluation Setup<a href="https://nowarp.io/blog/skry#evaluation-setup" class="hash-link" aria-label="Direct link to Evaluation Setup" title="Direct link to Evaluation Setup" translate="no">​</a></h3>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="data">Data<a href="https://nowarp.io/blog/skry#data" class="hash-link" aria-label="Direct link to Data" title="Direct link to Data" translate="no">​</a></h4>
<p>We evaluated the tool on a set of real-world Sui contracts with source code available on GitHub.</p>
<p>The following criteria were used to select projects:</p>
<ul>
<li class="">smart contracts only (no libraries),</li>
<li class="">only <a href="https://blog.sui.io/move-vs-sui-move-explainer/" target="_blank" rel="noopener noreferrer" class="">Sui Move</a>; Aptos and other Move-based ecosystems were excluded,</li>
<li class="">production-quality projects: no forks, tutorials, or hackathon artifacts,</li>
<li class="">primarily targeting <a href="https://move-book.com/guides/2024-migration-guide/" target="_blank" rel="noopener noreferrer" class="">Move 2024</a>, with a small number of older projects included.</li>
</ul>
<p>In total, we identified <a href="https://gist.github.com/jubnzv/901144d8976b3bb8f8220a5b338c33aa" target="_blank" rel="noopener noreferrer" class="">94 projects matching these criteria</a>.</p>
<p>In addition, audit reports from prior security assessments were used to reproduce historical findings.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="approach">Approach<a href="https://nowarp.io/blog/skry#approach" class="hash-link" aria-label="Direct link to Approach" title="Direct link to Approach" translate="no">​</a></h4>
<p>The evaluation approach was to reproduce critical access control issues on historical code from large Sui projects and to test non-critical rules on a collected set of contracts.</p>
<p>To evaluate the tool on large, production-ready projects, we reintroduced findings previously reported in audits by top firms. This was done by applying targeted mutations to production code that matched those findings and verifying that the tool detected them.</p>
<p>The evaluation focuses only on manually verified, non-exploitable issues. These findings are used to demonstrate the tool’s detection capabilities. All cases were reviewed manually to confirm that they either reflect intentional design decisions that warrant manual validation or originate from historical source code. No new exploitable vulnerabilities are claimed.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="models-and-the-cost-of-evaluation">Models and the cost of evaluation<a href="https://nowarp.io/blog/skry#models-and-the-cost-of-evaluation" class="hash-link" aria-label="Direct link to Models and the cost of evaluation" title="Direct link to Models and the cost of evaluation" translate="no">​</a></h4>
<p>For evaluation purposes, two models were used:</p>
<ul>
<li class=""><strong>DeepSeek</strong> — chosen for its inexpensive API and precise adherence to prompt instructions.</li>
<li class=""><strong>Opus 4.5</strong> — provides the best accuracy and demonstrates strong "knowledge" of Move; available via Claude Code and callable from the CLI, making it cheaper than API-based usage.</li>
</ul>
<p>In addition to API-based execution, the tool supports multiple modes, including a manual mode for prompt debugging and integration with <a href="https://claude.com/product/claude-code" target="_blank" rel="noopener noreferrer" class="">Claude Code</a>.</p>
<p>The typical number of prompts and overall cost depend on the project. Larger codebases and a higher number of potentially sensitive functions result in more prompts. In practice, small projects (around three Move files) require approximately seven prompts, while large production projects require around 30–40.</p>
<p>The tool also uses caching: all LLM prompts and responses are cached and reused in subsequent executions unless a fresh run is explicitly forced.</p>
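<p>A prompt/response cache of this kind can be sketched as follows. The on-disk layout, the hashing scheme, and the <code>force</code> flag are assumptions for illustration, and <code>ask</code> stands in for the real LLM call.</p>

```python
import hashlib
import json
import tempfile
from pathlib import Path
from typing import Callable, List


def cached_query(prompt: str, model: str, ask: Callable[[str], str],
                 cache_dir: Path, force: bool = False) -> str:
    """Return the cached response for (model, prompt); call `ask` only on a miss."""
    key = hashlib.sha256(f"{model}\n{prompt}".encode("utf-8")).hexdigest()
    path = cache_dir / f"{key}.json"
    if path.exists() and not force:
        return json.loads(path.read_text(encoding="utf-8"))["response"]
    response = ask(prompt)  # stand-in for the real LLM call
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps({"prompt": prompt, "response": response}), encoding="utf-8")
    return response


# Demo with a fake model that counts how often it is actually called.
calls: List[str] = []


def fake_llm(prompt: str) -> str:
    calls.append(prompt)
    return f"response {len(calls)}"


with tempfile.TemporaryDirectory() as tmp:
    cache = Path(tmp)
    first = cached_query("classify struct Vault", "demo-model", fake_llm, cache)
    second = cached_query("classify struct Vault", "demo-model", fake_llm, cache)
    forced = cached_query("classify struct Vault", "demo-model", fake_llm, cache, force=True)
```

<p>Keying the cache on both the model name and the full prompt text means that any change to the template or the analyzed code produces a new entry rather than a stale hit.</p>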
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="critical-access-control-issues">Critical access control issues<a href="https://nowarp.io/blog/skry#critical-access-control-issues" class="hash-link" aria-label="Direct link to Critical access control issues" title="Direct link to Critical access control issues" translate="no">​</a></h3>
<p>These rules detect high-impact access control and capability-handling issues that are typically exploitable and unacceptable in production code.</p>
<p>These rules include:</p>
<ul>
<li class=""><code>unprotected-pause</code> – Access control issue where a user can manipulate the global lock mechanism (critical).</li>
<li class=""><code>sensitive-internal-public-exposure</code> – Internal helper exposed as <code>public</code>.</li>
<li class=""><code>generic-type-mismatch</code> – Generic type parameter used without validation.</li>
<li class=""><code>arbitrary-recipient-drain</code> – Transfer to a user-controlled address without authorization.</li>
<li class=""><code>missing-authorization</code> – Entry function reaches a dangerous sink without an authorization check.</li>
<li class=""><code>user-asset-write-without-ownership</code> – Write to user assets without ownership proof.</li>
<li class=""><code>missing-destroy-guard</code> – Capability destruction without authorization.</li>
<li class=""><code>capability-takeover</code> – Capability can be acquired by an unauthorized address.</li>
<li class=""><code>capability-leak-via-store</code> – Capability stored in a shared object field.</li>
<li class=""><code>phantom-type-mismatch</code> – Capability guard uses a different phantom type than the target.</li>
<li class=""><code>test-only-missing</code> – Public function returns a privileged capability without <code>#[test_only]</code>.</li>
<li class=""><code>duplicated_branch_condition</code> – Same branch condition appears multiple times.</li>
</ul>
<p>These issues have <strong>critical severity</strong>. A valid finding in production code is likely exploitable, and therefore such issues are not expected to appear in audited projects.</p>
<p>To validate the tool, previously reported audit findings were reintroduced, and the analyzer was tested to ensure it detects them. The evaluated protocols are production-grade, large, and have mature, audited codebases. The critical issues listed below were present in earlier audits and are not deployed in live systems; the purpose here is tool evaluation.</p>
<p>Reproduced findings include:</p>
<table><thead><tr><th>Finding ID</th><th>Project</th><th>Analyzer’s warning</th></tr></thead><tbody><tr><td>STG-03</td><td>Navi</td><td><code>sensitive-internal-public-exposure</code></td></tr><tr><td>POOL-01</td><td>Navi</td><td><code>arbitrary-recipient-drain</code></td></tr><tr><td>OS-NVI-ADV-00</td><td>Navi</td><td><code>generic-type-mismatch</code></td></tr><tr><td>AMA-1</td><td>Balanced</td><td><code>sensitive-internal-public-exposure</code></td></tr></tbody></table>
<p>This validation approach reintroduces known security issues from audits, runs the analyzer on the modified code, and checks that the expected warnings are produced. This approach is used because unaudited source code for large production projects is generally no longer available.</p>
<p>To validate these warnings, the following <a href="https://gist.github.com/jubnzv/ae20fbb849c4b74c58a5a158edb709da" target="_blank" rel="noopener noreferrer" class="">mutations</a> were applied.</p>
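<p>The check performed after each mutation can be summarized by a small Python sketch. The finding IDs and rule names come from the table above; the function name and the shape of the <code>reported</code> input are assumptions, and collecting rule names from the analyzer output is deliberately left out.</p>

```python
from typing import Dict, Iterable, List

# Expected warning per reintroduced finding, taken from the table above.
EXPECTED: Dict[str, str] = {
    "STG-03": "sensitive-internal-public-exposure",
    "POOL-01": "arbitrary-recipient-drain",
    "OS-NVI-ADV-00": "generic-type-mismatch",
    "AMA-1": "sensitive-internal-public-exposure",
}


def missed_findings(reported: Dict[str, Iterable[str]]) -> List[str]:
    """Return finding IDs whose expected rule is absent from the analyzer output.

    `reported` maps a finding ID to the rule names emitted on the mutated code.
    """
    return [
        finding for finding, rule in EXPECTED.items()
        if rule not in set(reported.get(finding, ()))
    ]


# Hypothetical run in which only the first mutation was detected as expected.
example = {
    "STG-03": ["sensitive-internal-public-exposure"],
    "POOL-01": ["missing-authorization"],
}
missed = missed_findings(example)
```

<p>An empty result means every reintroduced finding triggered the expected warning.</p>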
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="pause-and-version-usage-anomalies">Pause and version usage anomalies<a href="https://nowarp.io/blog/skry#pause-and-version-usage-anomalies" class="hash-link" aria-label="Direct link to Pause and version usage anomalies" title="Direct link to Pause and version usage anomalies" translate="no">​</a></h3>
<p>A pause or global lock is a common pattern in smart contracts and is typically used for security purposes. The absence of a pause mechanism may indicate a centralization issue, a missing check, or a valid design decision. In most cases, such issues have medium severity and should be mentioned in audits. Only in rare cases does a missing pause check lead to severe or unexpected behavior.</p>
<p>In the current implementation, there are three related rules. The <code>pause-check-missing</code> and <code>version-check-missing</code> rules detect anomalies in version and global pause usage. If a public function omits a check that is consistently applied in similar functions, it is reported.</p>
<p>While a missing version check in <a href="https://docs.sui.io/guides/developer/packages/upgrade" target="_blank" rel="noopener noreferrer" class="">upgradable Move contracts</a> is considered a critical vulnerability, missing pause checks are often intentional design decisions or non-exploitable bugs.</p>
<p>An <a href="https://github.com/rocknwa/multisig-treasury/blob/ac51bb98a4011f16dbd56eeb237d681d42061088/sources/treasury.move#L419" target="_blank" rel="noopener noreferrer" class="">example of such a warning</a> is shown below:</p>
<pre class="wrap-code">[HIGH][pause-check-missing][sources/treasury.move:419:5] in function 'multisig_treasury::treasury::create_simple_proposal'</pre>
<p>The treasury has an <a href="https://github.com/rocknwa/multisig-treasury/blob/ac51bb98a4011f16dbd56eeb237d681d42061088/sources/treasury.move#L107" target="_blank" rel="noopener noreferrer" class=""><code>is_frozen</code> state</a> that is checked in <a href="https://github.com/rocknwa/multisig-treasury/blob/ac51bb98a4011f16dbd56eeb237d681d42061088/sources/treasury.move#L853" target="_blank" rel="noopener noreferrer" class=""><code>create_emergency_proposal</code></a>, but <code>create_proposal</code> and <code>create_simple_proposal</code> never check it. As a result, this is highlighted by the rule as a pause check anomaly. In this specific case, it may be a deliberate design choice if the freeze mechanism is intended to block only the emergency fast-track.</p>
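<p>The anomaly heuristic can be approximated with a short Python sketch. The grouping of "similar" functions and the <code>min_checked_peers</code> threshold are assumptions; the actual rule may use different criteria.</p>

```python
from typing import Dict, List


def missing_check_anomalies(has_check: Dict[str, bool], min_checked_peers: int = 1) -> List[str]:
    """Report functions that omit a check performed by enough of their peers.

    `has_check` maps each function in a group of similar public functions to
    whether it performs the pause (or version) check.
    """
    checked = [name for name, ok in has_check.items() if ok]
    if len(checked) >= min_checked_peers:
        return sorted(name for name, ok in has_check.items() if not ok)
    return []  # nobody performs the check, so there is no pattern to deviate from


# The treasury example: only the emergency path checks `is_frozen`.
proposal_fns = {
    "create_emergency_proposal": True,
    "create_proposal": False,
    "create_simple_proposal": False,
}
anomalies = missing_check_anomalies(proposal_fns)
```

<p>For the treasury example this yields <code>create_proposal</code> and <code>create_simple_proposal</code>, matching the functions discussed above.</p>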
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="centralization-risks">Centralization risks<a href="https://nowarp.io/blog/skry#centralization-risks" class="hash-link" aria-label="Direct link to Centralization risks" title="Direct link to Centralization risks" translate="no">​</a></h3>
<p>Centralization risks correspond to intentional design decisions or missing features where privileged user(s) have excessive control over critical project logic.</p>
<p>Below are some example rules and real-world findings.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="single-step-ownership"><code>single-step-ownership</code><a href="https://nowarp.io/blog/skry#single-step-ownership" class="hash-link" aria-label="Direct link to single-step-ownership" title="Direct link to single-step-ownership" translate="no">​</a></h4>
<p>The rule highlights single-step ownership transfers.</p>
<p>In the source code, this appears as the following pattern:</p>
<div class="language-rust codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-rust codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">public fun </span><span class="token function" style="color:#d73a49">admin_transfer</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">ac</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token class-name">AdminCap</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> recipient</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> address</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token namespace" style="opacity:0.7">transfer</span><span class="token namespace punctuation" style="opacity:0.7;color:#393A34">::</span><span class="token function" style="color:#d73a49">transfer</span><span class="token punctuation" style="color:#393A34">(</span><span class="token plain">ac</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> recipient</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>The issue with this pattern is that if the current admin provides an incorrect address (e.g., due to a typo, phishing attack, or copy-paste error), the admin capability is irrecoverably lost, with no mechanism to cancel or correct the transfer.</p>
<p>Here are example findings highlighting similar patterns:</p>
<pre class="wrap-code">[HIGH][single-step-ownership][liquid_staking/sources/ownership.move:33:5] in function 'liquid_staking::ownership::transfer_owner'</pre>
<pre class="wrap-code">[HIGH][single-step-ownership][liquid_staking/sources/ownership.move:47:5] in function 'liquid_staking::ownership::transfer_operator'</pre>
<p>Both functions <a href="https://github.com/Sui-Volo/volo-liquid-staking-contracts/blob/3798d6fa6ebca9007cde2c082dd4f6efba2b81c0/liquid_staking/sources/ownership.move#L47" target="_blank" rel="noopener noreferrer" class=""><code>transfer_operator</code></a> and <a href="https://github.com/Sui-Volo/volo-liquid-staking-contracts/blob/3798d6fa6ebca9007cde2c082dd4f6efba2b81c0/liquid_staking/sources/ownership.move#L33" target="_blank" rel="noopener noreferrer" class=""><code>transfer_owner</code></a> implement the single-step ownership pattern. These can be improved by using a two-step transfer pattern: the current owner calls, for example, <code>offer(new_addr)</code> to set a pending recipient, and the new owner then calls <code>claim()</code> to accept. This ensures the recipient address is valid and controlled, and allows cancellation in case of a mistake.</p>
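<p>The two-step flow can be modeled independently of Move. The Python sketch below is a language-neutral state machine, not contract code; the class name and the <code>cancel</code> method are assumptions added for illustration, while <code>offer</code> and <code>claim</code> follow the naming used above.</p>

```python
from typing import Optional


class TwoStepOwnership:
    """Language-neutral model of an offer/claim ownership transfer."""

    def __init__(self, owner: str) -> None:
        self.owner = owner
        self.pending: Optional[str] = None

    def offer(self, caller: str, new_owner: str) -> None:
        """Step 1: the current owner nominates a pending recipient."""
        if caller != self.owner:
            raise PermissionError("only the current owner can offer ownership")
        self.pending = new_owner

    def cancel(self, caller: str) -> None:
        """The owner can withdraw an offer, e.g. after spotting a typo."""
        if caller != self.owner:
            raise PermissionError("only the current owner can cancel an offer")
        self.pending = None

    def claim(self, caller: str) -> None:
        """Step 2: the nominated recipient accepts, proving control of the address."""
        if self.pending is None or caller != self.pending:
            raise PermissionError("only the pending recipient can claim ownership")
        self.owner, self.pending = caller, None


cap = TwoStepOwnership("0xA")
cap.offer("0xA", "0xTYPO")  # mistaken address: nothing is lost yet
cap.cancel("0xA")
cap.offer("0xA", "0xB")
cap.claim("0xB")            # ownership moves only after the recipient accepts
```

<p>The key property is that ownership never changes until the recipient acts, so a mistaken address leaves the current owner in control.</p>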
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="centralized-reward-distribution"><code>centralized-reward-distribution</code><a href="https://nowarp.io/blog/skry#centralized-reward-distribution" class="hash-link" aria-label="Direct link to centralized-reward-distribution" title="Direct link to centralized-reward-distribution" translate="no">​</a></h4>
<p>Detects admin-controlled reward distribution in projects classified as "gaming", where there is no verifiable winner selection.</p>
<p>An <a href="https://github.com/dravynn/sui-smartcontract/blob/47bf71ad325cf80727c1730efb92d8e725d27657/sources/core.move#L147" target="_blank" rel="noopener noreferrer" class="">example warning</a> for a lottery project:</p>
<pre class="wrap-code">[MEDIUM][centralized-reward-distribution][sources/core.move:147:5] in function 'rtmtree::longshot_jackpot::goal_shot'</pre>
<p>An admin-gated function decides who receives rewards and when. The player address is <a href="https://github.com/dravynn/sui-smartcontract/blob/47bf71ad325cf80727c1730efb92d8e725d27657/sources/core.move#L150" target="_blank" rel="noopener noreferrer" class="">passed as a parameter</a>, but <a href="https://github.com/dravynn/sui-smartcontract/blob/47bf71ad325cf80727c1730efb92d8e725d27657/sources/core.move#L153" target="_blank" rel="noopener noreferrer" class="">execution is controlled by the admin</a>. A malicious admin can refuse to distribute rewards to legitimate winners.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="admin-bypasses-pause"><code>admin-bypasses-pause</code><a href="https://nowarp.io/blog/skry#admin-bypasses-pause" class="hash-link" aria-label="Direct link to admin-bypasses-pause" title="Direct link to admin-bypasses-pause" translate="no">​</a></h4>
<p>Detects a possible centralization risk when an admin can bypass the global lock mechanism.</p>
<p>Here is an example warning:</p>
<pre class="wrap-code">[INFO][admin-bypasses-pause][lending_core/sources/pool.move:228:5] in function 'lending_core::pool::withdraw_treasury'</pre>
<p><a href="https://github.com/naviprotocol/navi-smart-contracts/blob/916c63b628bf75ffbdee38c3dd698c7292afe517/lending_core/sources/pool.move#L228" target="_blank" rel="noopener noreferrer" class=""><code>withdraw_treasury</code></a> requires the <code>PoolAdminCap</code> capability. A pause mechanism exists via the <code>Storage.paused</code> field, which is checked through <code>when_not_paused()</code> in user-facing lending functions. There is no pause check here: <code>withdraw_treasury</code> does not take <code>Storage</code> as a parameter and therefore cannot check the pause state. This creates a centralization risk: an admin can drain the treasury even when the protocol is paused.</p>
<p>The severity is informational, reflecting the fact that this may be an intentional design decision, while still requiring additional attention.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="missing-admin-event"><code>missing-admin-event</code><a href="https://nowarp.io/blog/skry#missing-admin-event" class="hash-link" aria-label="Direct link to missing-admin-event" title="Direct link to missing-admin-event" translate="no">​</a></h4>
<p>Generates an informational warning when critical protocol changes may require emitting an event. This is similar to <a href="https://github.com/crytic/slither/wiki/Detector-Documentation#missing-events-access-control" target="_blank" rel="noopener noreferrer" class="">Slither’s missing event</a> detector, but relies on LLM classification.</p>
<p>Some example informational warnings that require manual validation:</p>
<pre class="wrap-code">[INFO][missing-admin-event][sources/patience.move:160:5] in function 'patience::patience::withdraw_fee' - extracts value. Emit event with: amount</pre>
<pre class="wrap-code">[INFO][missing-admin-event][sources/patience.move:166:5] in function 'patience::patience::admin_transfer' - transfers to user-controlled address. Emit event with: recipient</pre>
<pre class="wrap-code">[INFO][missing-admin-event][contracts/core/sources/prize_pool.move:161:1] in function 'anglerfish::prize_pool::claim_protocol_fee' - extracts value. Emit event with: amount</pre>
<pre class="wrap-code">[INFO][missing-admin-event][contracts/core/sources/prize_pool.move:172:1] in function 'anglerfish::prize_pool::claim_treasury_reserve' - extracts value. Emit event with: amount</pre>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="protocol-configuration-parameters-and-invariants">Protocol configuration parameters and invariants<a href="https://nowarp.io/blog/skry#protocol-configuration-parameters-and-invariants" class="hash-link" aria-label="Direct link to Protocol configuration parameters and invariants" title="Direct link to Protocol configuration parameters and invariants" translate="no">​</a></h3>
<p>In Sui Move, all data is expressed in structs.
Struct fields that hold protocol settings fall into one of two groups:</p>
<ul>
<li class="">mutable <em>configuration parameters</em> — values expected to change during the contract’s lifecycle. Examples: <code>fee_rate</code>, <code>loyalty_address</code>.</li>
<li class="">immutable <em>protocol invariants</em> — parameters set once during initialization. Examples: <code>protocol_fee_bps</code>, <code>decimals</code> (e.g., to manipulate specific tokens).</li>
</ul>
<p>Some detectors rely on LLM-based classification and highlight anomalies that should be checked manually:</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="missing-mutable-config-setter"><code>missing-mutable-config-setter</code><a href="https://nowarp.io/blog/skry#missing-mutable-config-setter" class="hash-link" aria-label="Direct link to missing-mutable-config-setter" title="Direct link to missing-mutable-config-setter" translate="no">​</a></h4>
<p>The rule relies on LLM classification for struct fields and static analysis that propagates data. If the model detects that a struct field represents a configuration value that is expected to be mutable, the analysis checks whether any setters exist for it.</p>
<p>Some example real-world findings:</p>
<pre class="wrap-code">[MEDIUM][missing-mutable-config-setter][core/sources/satlayer_pool.move:55:1] field 'satlayer_core::satlayer_pool::Vault.min_deposit_amount'</pre>
<p>The <a href="https://github.com/satlayer/satlayer-sui/blob/5958934f918ebae2cb2f1ad0919783c9114c0340/core/sources/satlayer_pool.move#L55" target="_blank" rel="noopener noreferrer" class=""><code>Vault</code></a> struct has admin setters for similar configuration fields:</p>
<ul>
<li class=""><code>staking_cap</code> → <code>set_staking_cap</code></li>
<li class=""><code>withdrawal_cooldown</code> → <code>update_withdrawal_time</code></li>
<li class=""><code>is_paused</code> → <code>toggle_vault_pause</code></li>
<li class=""><code>caps_enabled</code> → <code>set_caps_enabled</code></li>
<li class=""><code>min_deposit_amount</code> → <em>no setter</em></li>
</ul>
<p>This makes the finding valid: <code>min_deposit_amount</code> is correctly classified as a configuration field and may require a setter, unless this is an intentional design decision.</p>
<pre class="wrap-code">[MEDIUM][missing-mutable-config-setter][sources/curve.move:42:5] field 'pumpfun::curve::Configurator.swap_fee'</pre>
<p>The admin has setters for all other <a href="https://github.com/angel10x/Sui-Pump--Move/blob/9aa1d364128816ebc80cec4240ae655220289c91/sources/curve.move#L42" target="_blank" rel="noopener noreferrer" class=""><code>Configurator</code></a> fields, but not <code>swap_fee</code>. This looks like an oversight correctly classified and highlighted by the analyzer. If the swap fee needs adjustment, the admin has no way to change it.</p>
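<p>The check behind this rule can be sketched roughly as follows. This is a minimal illustration, not Skry's actual implementation: the <code>StructFacts</code> type, the <code>"config"</code> label, and the assumption that every listed <code>Vault</code> field was classified as mutable configuration are all hypothetical. Only the field and setter names come from the SatLayer finding above.</p>

```python
from dataclasses import dataclass, field


@dataclass
class StructFacts:
    """Hypothetical per-struct facts; not Skry's real data model."""
    name: str
    # Field name -> label assigned by the LLM classifier (assumed labels).
    labels: dict
    # Field name -> functions that write to it, as found by static analysis.
    writers: dict = field(default_factory=dict)


def missing_config_setters(facts):
    """Return fields labeled as mutable config that no function writes to."""
    return [
        name
        for name, label in facts.labels.items()
        if label == "config" and not facts.writers.get(name)
    ]


vault = StructFacts(
    name="Vault",
    labels={
        "staking_cap": "config",
        "withdrawal_cooldown": "config",
        "is_paused": "config",
        "caps_enabled": "config",
        "min_deposit_amount": "config",
    },
    writers={
        "staking_cap": ["set_staking_cap"],
        "withdrawal_cooldown": ["update_withdrawal_time"],
        "is_paused": ["toggle_vault_pause"],
        "caps_enabled": ["set_caps_enabled"],
    },
)

print(missing_config_setters(vault))  # ['min_deposit_amount']
```

<p>Keeping the LLM labels and the statically derived writers in separate maps mirrors the split described above: the model only classifies fields, while the setter lookup stays deterministic.</p>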
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="mutable-protocol-invariant"><code>mutable-protocol-invariant</code><a href="https://nowarp.io/blog/skry#mutable-protocol-invariant" class="hash-link" aria-label="Direct link to mutable-protocol-invariant" title="Direct link to mutable-protocol-invariant" translate="no">​</a></h4>
<p>The rule is similar to <code>missing-mutable-config-setter</code>, but detects the opposite pattern: if the LLM classifies a field as an immutable protocol invariant set once during initialization, the analysis reports any function that writes to it afterwards.</p>
<p>Some real-world findings:</p>
<pre class="wrap-code">[HIGH][mutable-protocol-invariant][sources/lootbox.move:118:5] in function 'suigar::lootbox::edit_lootbox' writes to invariant 'suigar::lootbox::LootBox.reward_amounts'</pre>
<pre class="wrap-code">[HIGH][mutable-protocol-invariant][sources/lootbox.move:118:5] in function 'suigar::lootbox::edit_lootbox' writes to invariant 'suigar::lootbox::LootBox.reward_probabilities'</pre>
<p>This means that the LLM classified two fields of <a href="https://github.com/Suigar-Gaming/suigar-contracts/blob/500ce3dec9fd2c3816d3eeeaea0c3bce5283ed15/sources/lootbox.move#L41" target="_blank" rel="noopener noreferrer" class=""><code>LootBox</code></a> as protocol invariants that should be immutable, but they are modified in <code>edit_lootbox</code>. If the contract were to separate purchase and reveal into two transactions, an admin could change reward distributions between these steps, causing users to receive different payouts than expected at purchase time. While this may be an intentional design decision or a centralization risk, it warrants <code>medium</code> severity.</p>
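<p>A rough sketch of the opposite check, under stated assumptions: the <code>Write</code> record, the <code>is_initializer</code> flag, and the <code>new_lootbox</code> constructor are hypothetical names introduced for illustration, and the real analyzer may represent writes differently. The <code>edit_lootbox</code> function and the two field names are taken from the findings above.</p>

```python
from typing import NamedTuple


class Write(NamedTuple):
    """Hypothetical fact: a function writes to a struct field."""
    function: str
    field: str
    is_initializer: bool  # True when the write happens while creating the object


def invariant_violations(invariants, writes):
    """Return writes that touch an LLM-classified invariant outside initialization."""
    return [w for w in writes if w.field in invariants and not w.is_initializer]


invariants = {"LootBox.reward_amounts", "LootBox.reward_probabilities"}
writes = [
    # `new_lootbox` is an assumed constructor name, not taken from the contract.
    Write("new_lootbox", "LootBox.reward_amounts", True),
    Write("edit_lootbox", "LootBox.reward_amounts", False),
    Write("edit_lootbox", "LootBox.reward_probabilities", False),
]

for w in invariant_violations(invariants, writes):
    print(f"in function '{w.function}' writes to invariant '{w.field}'")
```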
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="evaluation-results">Evaluation Results<a href="https://nowarp.io/blog/skry#evaluation-results" class="hash-link" aria-label="Direct link to Evaluation Results" title="Direct link to Evaluation Results" translate="no">​</a></h3>
<p>The problem with precision evaluation is that non-exploitable findings often resemble intentional design decisions, even if these don't follow security best practices. As a result, it is impossible to provide an <em>exact</em> number for non-critical warnings.</p>
<p>For small projects with 3–4 modules, the tool typically generates 0 to 5 warnings, including some unclear cases that require manual review, some true positives, and some false positives. Overall, it is not noisy. For the largest production codebases, the tool generates around 40 warnings, including a number of valid concerns that deserve discussion during a security assessment.</p>
<p>Key insights on false positives:</p>
<ul>
<li class="">Many false positives are related to limitations in the current (proof-of-concept) analyzer's implementation; a more mature codebase would significantly reduce the rate.</li>
<li class="">LLM misclassification does occur, but can be mitigated by providing more focused context and <a href="https://www.promptingguide.ai/techniques/cot" target="_blank" rel="noopener noreferrer" class="">requesting explicit reasoning steps</a>, followed by prompt refinement. While 100% accuracy or guarantees are not possible, precision can be improved at the cost of higher inference overhead.</li>
</ul>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="challenges">Challenges<a href="https://nowarp.io/blog/skry#challenges" class="hash-link" aria-label="Direct link to Challenges" title="Direct link to Challenges" translate="no">​</a></h4>
<p>The key challenges we encountered include:</p>
<ul>
<li class="">It is impossible to distinguish design decisions from valid findings without direct contact with project owners, which is especially relevant for non-critical findings.</li>
<li class="">In many cases, there is no source code available to reproduce previous audit findings. Projects are often renamed, removed, or do not publish the audited code, making historical verification difficult.</li>
<li class="">Most evaluated projects are mature and well audited, so access control issues are relatively rare.</li>
<li class="">Cost is a minor concern. Having Claude Code installed allowed us to run the tool on around 100 contracts, occasionally hitting usage limits.</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://nowarp.io/blog/skry#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h2>
<p>The tool is still a <em>proof of concept</em>. The approach has been tested and it works. The analyzer produces valid warnings that belong in a security assessment for Sui Move smart contracts. However, the analysis is not yet comprehensive or systematic and requires further work.</p>
<p>Skry is a static analyzer at its core – this is where most engineering effort belongs. The current version shows feasibility, not precision. Improving accuracy requires improving the analysis engine itself. Move is a relatively large language, and supporting more real-world patterns will require additional engineering effort.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="current-use-cases">Current use cases<a href="https://nowarp.io/blog/skry#current-use-cases" class="hash-link" aria-label="Direct link to Current use cases" title="Direct link to Current use cases" translate="no">​</a></h3>
<p>Despite its proof-of-concept status, the analysis is relatively cheap and already usable in practice:</p>
<ul>
<li class=""><strong>Bug finding before deployment:</strong> can be used as an additional check before audit and deployment.</li>
<li class=""><strong>Audit assistance:</strong> the primary use case. The tool highlights issues for manual validation and can surface potential audit findings.</li>
<li class=""><strong>Centralization risk validation:</strong> can be used to assess trust assumptions and administrative control when evaluating a project.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="future-work">Future work<a href="https://nowarp.io/blog/skry#future-work" class="hash-link" aria-label="Direct link to Future work" title="Direct link to Future work" translate="no">​</a></h3>
<p>The core of the tool is the static analysis engine. Improving it will improve both static precision and LLM-based classification by enabling more specific and constrained prompts.</p>
<p>Some important areas for future work include:</p>
<ul>
<li class=""><strong>Advanced Sui Move pattern support:</strong> for example, OTW-based access control, address-based access control patterns (more common in Aptos, but sometimes present in Sui in the wild), and more comprehensive tainted data propagation covering all language constructs.</li>
<li class=""><strong>IR improvement:</strong> the current IR is relatively simplistic and expresses only the information required for access control detection.</li>
<li class=""><strong>Path sensitivity:</strong> full symbolic execution for Move is complex but feasible; introducing simple sink-protection facts and dominance relations in the IR is a reasonable starting point.</li>
<li class=""><strong>Object lifecycle tracking:</strong> as an extension of dataflow analysis to cover additional patterns.</li>
<li class=""><strong>Execution optimization:</strong> routines involving fact processing and interprocedural, cross-module analysis can be optimized to improve performance.</li>
<li class=""><strong>Dependency management:</strong> understanding external dependencies is important; the implementation may require integration with the build system to obtain actual dependency sources.</li>
</ul>
<p>While improving the analysis engine will make the analyzer more accurate and effective, new rule categories beyond access control and governance can also be introduced:</p>
<ul>
<li class=""><strong>Variable misuse:</strong> using variables of the correct type in an incorrect context (e.g., incorrect argument ordering or checking properties of the wrong variable, as in <a href="https://github.com/switchboard-xyz/sui/issues/3" target="_blank" rel="noopener noreferrer" class="">this example</a>). Papers such as <a href="https://www.semanticscholar.org/paper/Detection-of-Variable-Misuse-Using-Static-Analysis-Morgachev-Ignatyev/1d0c04632d3289970a2aae69b92d876ec34571e2" target="_blank" rel="noopener noreferrer" class="">this one</a> cover this topic in detail. A proper implementation would require IR improvements to reduce the number of LLM requests.</li>
<li class=""><strong>Oracle patterns:</strong> while less widespread compared to access control issues, oracle API misuse still appears in audit findings and is worth covering.</li>
<li class=""><strong>Flash loan and slippage issues:</strong> although Sui smart contract design <a href="https://blog.trailofbits.com/2025/09/10/how-sui-move-rethinks-flash-loan-security/" target="_blank" rel="noopener noreferrer" class="">protects</a> against many flash loan attacks, some patterns still appear in audits and could be addressed.</li>
<li class=""><strong>Cross-project pattern extraction:</strong> while invariant checking is typically handled by a different class of tools, the existing eDSL could be leveraged to generate new rules using LLMs, based on patterns extracted from existing projects via a RAG-style approach combined with code mutation.</li>
</ul>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="references">References<a href="https://nowarp.io/blog/skry#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References" translate="no">​</a></h2>
<ol>
<li class=""><a href="https://arxiv.org/pdf/2404.04306" target="_blank" rel="noopener noreferrer" class="">Xia et al – AuditGPT: Auditing Smart Contracts with ChatGPT</a></li>
<li class=""><a href="https://arxiv.org/pdf/2308.03314" target="_blank" rel="noopener noreferrer" class="">Sun et al – GPTScan: Detecting Logic Vulnerabilities in Smart Contracts by Combining GPT with Program Analysis</a></li>
<li class=""><a href="https://arxiv.org/pdf/2504.11711" target="_blank" rel="noopener noreferrer" class="">Li et al – The Hitchhiker’s Guide to Program Analysis, Part II: Deep Thoughts by LLMs</a></li>
</ol>]]></content:encoded>
            <author>jubnzv@gmail.com (Georgiy Komarov)</author>
            <category>skry</category>
            <category>move</category>
            <category>sui</category>
            <category>static analysis</category>
            <category>llm</category>
        </item>
        <item>
            <title><![CDATA[TON Security Risks: A Static Analysis Perspective]]></title>
            <link>https://nowarp.io/blog/ton-security-risks</link>
            <guid>https://nowarp.io/blog/ton-security-risks</guid>
            <pubDate>Sun, 26 Jan 2025 00:00:00 GMT</pubDate>
            <description><![CDATA[Exploring static analysis capabilities and limitations for TON smart contracts security through Misti.]]></description>
            <content:encoded><![CDATA[<p>Smart contracts are unforgiving. A single bug can vaporize millions of dollars. If you're coming from web development, forget everything you know about "move fast and break things" - here, breaking things means <em>actually breaking things</em>. With money. Real money.</p>
<p>This is where static analysis comes in. It's a technique that examines your code before deployment to automatically detect potential vulnerabilities. While no automated tool can guarantee security, static analysis can identify common pitfalls early in development.</p>
<p>This post:</p>
<ul>
<li class="">Explores static analysis capabilities and limitations for smart contract security.</li>
<li class="">Shows how this fits into the <a href="https://ton.org/" target="_blank" rel="noopener noreferrer" class="">TON</a> security landscape through <a href="https://nowarp.io/tools/misti/" target="_blank" rel="noopener noreferrer" class="">Misti</a>.</li>
</ul>
<p>Understanding static program analysis enables you to add an additional layer of automated security verification to your development process, catching some vulnerabilities before they reach production.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="static-analysis-for-web3-and-ton">Static Analysis for Web3 and TON<a href="https://nowarp.io/blog/ton-security-risks#static-analysis-for-web3-and-ton" class="hash-link" aria-label="Direct link to Static Analysis for Web3 and TON" title="Direct link to Static Analysis for Web3 and TON" translate="no">​</a></h2>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="static-program-analysis-101">Static Program Analysis 101<a href="https://nowarp.io/blog/ton-security-risks#static-program-analysis-101" class="hash-link" aria-label="Direct link to Static Program Analysis 101" title="Direct link to Static Program Analysis 101" translate="no">​</a></h3>
<p>Security tooling is an essential part of modern smart contract development, serving as the first line of defense against vulnerabilities. While manual code review remains crucial, automated analysis tools can systematically identify classes of bugs that would be tedious and error-prone to catch by hand.</p>
<p>Static program analysis examines code without executing it. The classic approach used in program analysis and compiler design is the Monotone Framework <a href="https://nowarp.io/blog/ton-security-risks#references" class="">[4]</a>. This builds abstract models of your program using <a href="https://en.wikipedia.org/wiki/Control-flow_graph" target="_blank" rel="noopener noreferrer" class="">control flow graphs</a> (CFGs) and <a href="https://en.wikipedia.org/wiki/Data-flow_analysis" target="_blank" rel="noopener noreferrer" class="">data flow analysis</a> to reason about program behavior without the undecidability of analyzing all possible runtime scenarios.</p>
<p>The analysis pipeline could be illustrated like this:</p>
<div align="center"><img src="https://nowarp.io/assets/images/2025-01-26-pipeline-6204d425768116cef91972258563e1d2.png"></div>
<p>In essence, the approach is straightforward: we read and analyze the code structure to identify potential security vulnerabilities, without executing the code itself.</p>
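<p>To make the monotone framework idea more concrete, here is a minimal worklist solver for a toy forward analysis. It is a sketch under assumptions, not Misti's implementation: the CFG shape, the node names, and the "variables that may have been assigned" property are invented for illustration. The lattice is the powerset of variable names, the join is set union, and each transfer function only adds elements, so it is monotone and the loop reaches a fixed point.</p>

```python
def solve_forward(cfg, gen):
    """Worklist solver for out[n] = (union of out[p] for predecessors p) | gen[n]."""
    # Build the predecessor map from the successor lists.
    preds = {n: set() for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].add(n)

    out = {n: frozenset() for n in cfg}  # start from the bottom element
    worklist = list(cfg)
    while worklist:
        n = worklist.pop()
        in_n = frozenset().union(*(out[p] for p in preds[n]))
        new_out = in_n | gen[n]
        if new_out != out[n]:
            out[n] = new_out
            worklist.extend(cfg[n])  # successors must be recomputed
    return out


# entry -> loop -> (body -> loop) | exit
cfg = {"entry": ["loop"], "loop": ["body", "exit"], "body": ["loop"], "exit": []}
# Variables assigned in each node.
gen = {"entry": {"x"}, "loop": set(), "body": {"y"}, "exit": set()}

result = solve_forward(cfg, gen)
print(sorted(result["exit"]))  # ['x', 'y']
```

<p>The loop back edge is what makes the fixed-point iteration necessary: <code>y</code> reaches <code>exit</code> only after the facts from <code>body</code> have flowed back through <code>loop</code>.</p>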
<p>Other analysis techniques can be used to <a href="https://en.wikipedia.org/wiki/Symbolic_execution" target="_blank" rel="noopener noreferrer" class="">explore concrete paths of execution</a>, <a href="https://en.wikipedia.org/wiki/Model_checking#Symbolic_model_checking" target="_blank" rel="noopener noreferrer" class="">prove properties on a limited domain of the program</a>, <a href="https://en.wikipedia.org/wiki/Abstract_interpretation" target="_blank" rel="noopener noreferrer" class="">prove program properties with more accuracy</a>, and so on, but the core issue is always the same: <a href="https://en.wikipedia.org/wiki/Rice%27s_theorem" target="_blank" rel="noopener noreferrer" class="">static undecidability</a>.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="application-to-ton-smart-contracts">Application to TON Smart Contracts<a href="https://nowarp.io/blog/ton-security-risks#application-to-ton-smart-contracts" class="hash-link" aria-label="Direct link to Application to TON Smart Contracts" title="Direct link to Application to TON Smart Contracts" translate="no">​</a></h3>
<p>Smart contracts are just programs executing on a blockchain. They have some differences, but fundamentally the same techniques of classic program analysis can be applied to smart contract analysis.</p>
<p>Web3 security is a dynamic, actively developing field where new approaches are constantly being tested, but it is still a wild west: the tools' impact on actual bug finding is suboptimal <a href="https://nowarp.io/blog/ton-security-risks#references" class="">[1]</a>. Despite that, we have a large set of approaches described in papers and applied in commercial and free tools, and we know about past bugs typical for other blockchains; this knowledge can be extrapolated to TON.</p>
<p>TON creates its own unique architectural issues, while most of the generic web3 bugs from research and practical experience are still valid for TON. It has a runtime environment based on a stack machine and a couple of imperative languages without advanced language design solutions, which makes it quite similar to existing blockchain environments.</p>
<p>Examples of common security and functionality issues present in TON contracts include <a href="https://nowarp.io/blog/ton-security-risks#references" class="">[2]</a>:</p>
<div align="center"><img src="https://nowarp.io/assets/images/2025-01-26-web3-bugs-779ce142d570a47396f08249eff24cda.png"></div>
<p>But the most complicated bugs in TON are related to its unique actor model and asynchronous message-passing <a href="https://nowarp.io/blog/ton-security-risks#references" class="">[3]</a>:</p>
<ul>
<li class="">Partial execution: state mutations due to asynchronous message passing</li>
<li class="">Man in the middle in message flow</li>
<li class=""><em>Anything</em> that requires understanding the specification of the system and therefore requires a manual audit</li>
</ul>
<p>Here's the key point: complex bugs require understanding the system, and sometimes even developers don't fully understand it.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="misti-ton-static-analyzer">Misti: TON Static Analyzer<a href="https://nowarp.io/blog/ton-security-risks#misti-ton-static-analyzer" class="hash-link" aria-label="Direct link to Misti: TON Static Analyzer" title="Direct link to Misti: TON Static Analyzer" translate="no">​</a></h2>
<p><a href="https://nowarp.io/tools/misti" target="_blank" rel="noopener noreferrer" class="">Misti</a> is a source-level analyzer for <a href="https://tact-lang.org/" target="_blank" rel="noopener noreferrer" class="">Tact</a> contracts. It is based on the monotone framework, which works exactly as described above, and combines it with <a href="https://nowarp.io/tools/misti/docs/next/hacking/souffle" target="_blank" rel="noopener noreferrer" class="">Datalog-based analyses</a>. It has all the limitations typical for this approach; these are inherent and by design.</p>
<p>Misti covers different categories of <a href="https://nowarp.io/tools/misti/docs/next/detectors" target="_blank" rel="noopener noreferrer" class="">security and optimization issues</a>:</p>
<ul>
<li class="">Cell storage issues</li>
<li class="">Resource exhaustion vectors potentially leading to <a href="https://en.wikipedia.org/wiki/Denial-of-service_attack" target="_blank" rel="noopener noreferrer" class="">DoS</a></li>
<li class="">Arithmetic issues</li>
<li class="">Unauthorized access to critical functions and contract's state</li>
<li class="">Code optimization</li>
<li class="">Generic suspicious patterns: things we have learned from web3 security in the past</li>
</ul>
<p>Let's consider some concrete case studies of analyses Misti implements for both generic smart contract issues and TON/Tact-specific problems.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="cell-storage-issues">Cell storage issues<a href="https://nowarp.io/blog/ton-security-risks#cell-storage-issues" class="hash-link" aria-label="Direct link to Cell storage issues" title="Direct link to Cell storage issues" translate="no">​</a></h3>
<p>TON <a href="https://docs.ton.org/v3/concepts/dive-into-ton/ton-blockchain/cells-as-data-storage" target="_blank" rel="noopener noreferrer" class="">stores</a> persistent data in <code>Cell</code> structures. A cell is a low-level primitive containing up to <code>1023</code> bits of data and up to <code>4</code> references to other cells, which are used to build high-level data structures.</p>
<p>A potential issue with cells arises when a read or write operation violates these limits. When the contract tries either to load non-existing data from a cell or to write data or references beyond the specified limits, it triggers the <code>CellUnderflow</code> and <code>CellOverflow</code> <a href="https://docs.ton.org/v3/documentation/tvm/tvm-exit-codes#standard-exit-codes" target="_blank" rel="noopener noreferrer" class="">compute phase exceptions</a>, respectively:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">beginCell</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token variable" style="color:#36acaa">b1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">storeInt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">data</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">257</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" 
style="color:#393A34"><span class="token plain"></span><span class="token variable" style="color:#36acaa">b1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">storeInt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">balance</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">257</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token variable" style="color:#36acaa">b1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">storeInt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">owner_data</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">257</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span 
class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">// CellOverflow: storing more than 1023 bits</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token variable" style="color:#36acaa">b1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">storeInt</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">msg</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">info</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">257</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><br></div></code></pre></div></div>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">s1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">beginCell</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">// Creating a Slice with 1 reference</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">storeRef</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">c</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           </span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">endCell</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">           </span><span class="token punctuation" 
style="color:#393A34">.</span><span class="token function" style="color:#d73a49">asSlice</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">ref1</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">s1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">loadRef</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">// OK</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">ref2</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">s1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">loadRef</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"> </span><span 
class="token comment" style="color:#999988;font-style:italic">// CellUnderflow</span><br></div></code></pre></div></div>
<p>Because of the asynchronous nature of TON, these issues may lead to unexpected message flow disrupting the logic of the contract.</p>
<p>The main difficulty in detecting these issues is that in Tact, cell operations may be performed through different data structures such as Builder, Slice, Cell, Struct, and Message, and detection may require reasoning about the source code across multiple function and method calls.</p>
<p>The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/CellBounds/" target="_blank" rel="noopener noreferrer">CellBounds</a></b> detector tries to handle this by statically inspecting the source code.</p>
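<p>The core bookkeeping behind such a check can be sketched as follows. This is a simplified illustration and not the CellBounds implementation: the <code>find_first_overflow</code> function and its operation encoding are hypothetical, and it only handles straight-line stores on a single builder, ignoring branches, loops, and calls. The limits are the cell limits stated above.</p>

```python
MAX_BITS = 1023  # data bits per cell
MAX_REFS = 4     # references per cell


def find_first_overflow(ops):
    """Return the index of the first store that exceeds the cell limits, or None.

    `ops` is a sequence of ("bits", n) or ("ref",) store operations applied
    to one builder in order (a hypothetical encoding used for this sketch).
    """
    bits = 0
    refs = 0
    for index, op in enumerate(ops):
        if op[0] == "bits":
            bits += op[1]
        else:
            refs += 1
        if bits > MAX_BITS or refs > MAX_REFS:
            return index
    return None


# Four storeInt(..., 257) calls: 3 * 257 = 771 bits fit, the fourth needs 1028.
print(find_first_overflow([("bits", 257)] * 4))  # 3
```

<p>The same counters, tracked per slice and decremented on loads, would cover the <code>CellUnderflow</code> case.</p>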
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="tact-specific-issues">Tact-specific issues<a href="https://nowarp.io/blog/ton-security-risks#tact-specific-issues" class="hash-link" aria-label="Direct link to Tact-specific issues" title="Direct link to Tact-specific issues" translate="no">​</a></h3>
<p>There are plenty of Tact-specific issues covered by Misti. Let's consider some of them with source code examples.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="function-arguments-in-tact-are-immutable">Function arguments in Tact are immutable<a href="https://nowarp.io/blog/ton-security-risks#function-arguments-in-tact-are-immutable" class="hash-link" aria-label="Direct link to Function arguments in Tact are immutable" title="Direct link to Function arguments in Tact are immutable" translate="no">​</a></h4>
<p>Tact passes function arguments by value, so mutating a parameter only changes a local copy and the change is never visible at the call site. The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/ArgCopyMutation/" target="_blank" rel="noopener noreferrer">ArgCopyMutation</a></b> detector finds these cases (unless the developer explicitly returns the modified parameter):</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">fun</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">setA</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">a</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Int</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">m</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token generics builtin">map</span><span class="token generics punctuation" style="color:#393A34">&lt;</span><span class="token generics builtin">Int</span><span class="token generics punctuation" style="color:#393A34">,</span><span class="token generics"> </span><span class="token generics builtin">Int</span><span class="token generics punctuation" style="color:#393A34">&gt;</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: `m` won't be modified in the callsite</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token variable" 
style="color:#36acaa">m</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">set</span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">key</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">a</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
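<p>A minimal sketch of the "explicitly returns the modified parameter" variant mentioned above. It assumes the same surrounding contract as the snippet above (with a <code>key</code> field); the return type is the only change to the signature:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">fun setA(a: Int, m: map&lt;Int, Int&gt;): map&lt;Int, Int&gt; {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  // OK: mutate the local copy, then hand it back to the caller</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  m.set(self.key, a);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  return m;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>
<p>The caller then has to assign the result back, for example <code>self.m = self.setA(a, self.m);</code>, where <code>self.m</code> is a hypothetical contract field used only for illustration.</p>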
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="some-exit-codes-are-reserved">Some exit codes are reserved<a href="https://nowarp.io/blog/ton-security-risks#some-exit-codes-are-reserved" class="hash-link" aria-label="Direct link to Some exit codes are reserved" title="Direct link to Some exit codes are reserved" translate="no">​</a></h4>
<p>Codes from 0 to 255 <a href="https://docs.tact-lang.org/book/exit-codes/" target="_blank" rel="noopener noreferrer" class="">are reserved</a> by Tact and TON. Throwing them from contract code can break the expected behavior of the contract, so the developer should never use them. The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/ExitCodeUsage/" target="_blank" rel="noopener noreferrer">ExitCodeUsage</a></b> detector evaluates the possible numeric values passed as exit codes in order to detect suspicious cases like these:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-literal string" style="color:#e3116c">"test"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: Throwing the reserved `128` code</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token function" style="color:#d73a49">nativeThrowUnless</span><span class="token punctuation" style="color:#393A34">(</span><span class="token number" style="color:#36acaa">128</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">sender</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">owner</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
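<p>For comparison, a sketch of the same receiver with a code outside the reserved range. The value <code>1024</code> is an arbitrary choice for illustration; any code above 255 avoids the overlap described above:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">receive("test") {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  // OK: 1024 lies outside the reserved 0..255 range</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  nativeThrowUnless(1024, sender() == self.owner);</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>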
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="dont-overlap-string-receivers-values">Don't overlap string receivers values<a href="https://nowarp.io/blog/ton-security-risks#dont-overlap-string-receivers-values" class="hash-link" aria-label="Direct link to Don't overlap string receivers values" title="Direct link to Don't overlap string receivers values" translate="no">​</a></h4>
<p>Tact has <a href="https://docs.tact-lang.org/book/receive/" target="_blank" rel="noopener noreferrer" class="">text receivers</a> which accept a particular string as a message. The issue arises when a generic string receiver (defined as <code>receive(msg: String)</code>) also handles these messages:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">contract</span><span class="token plain"> </span><span class="token class-name">Test</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token string-literal string" style="color:#e3116c">"test"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">msg</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">String</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: "test" message should be handled 
in `receive("test")`</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">msg</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token string-literal string" style="color:#e3116c">"test"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">/*...*/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>This leads to unexpected control flow making some receivers unreachable. The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/StringReceiversOverlap/" target="_blank" rel="noopener noreferrer">StringReceiversOverlap</a></b> detector handles this.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="choose-better-tact-api">Choose better Tact API<a href="https://nowarp.io/blog/ton-security-risks#choose-better-tact-api" class="hash-link" aria-label="Direct link to Choose better Tact API" title="Direct link to Choose better Tact API" translate="no">​</a></h4>
<p>Tact is under active development: new releases can introduce features that make code more efficient, deprecate existing features, or add safer alternatives to potentially dangerous functions.</p>
<p>Examples of these functions include but are not limited to:</p>
<ol>
<li class=""><a href="https://docs.tact-lang.org/ref/core-advanced#nativesendmessage" target="_blank" rel="noopener noreferrer" class=""><code>nativeSendMessage</code></a> should be replaced with <a href="https://docs.tact-lang.org/book/send/" target="_blank" rel="noopener noreferrer" class=""><code>send</code></a></li>
<li class=""><a href="https://docs.tact-lang.org/ref/core-advanced/#nativerandom" target="_blank" rel="noopener noreferrer" class=""><code>nativeRandom</code></a> should be replaced with <a href="https://docs.tact-lang.org/ref/core-random/#randomint" target="_blank" rel="noopener noreferrer" class=""><code>randomInt</code></a></li>
<li class="">Tact provides optimized versions of <code>send</code>: <a href="https://docs.tact-lang.org/ref/core-common/#deploy" target="_blank" rel="noopener noreferrer" class=""><code>deploy</code></a> and <a href="https://docs.tact-lang.org/ref/core-common/#message" target="_blank" rel="noopener noreferrer" class=""><code>message</code></a></li>
</ol>
<p>Here is some illustrating code:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">contract</span><span class="token plain"> </span><span class="token class-name">Test</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: Prefer more effective `deploy` function</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">init</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">initOf</span><span class="token plain"> </span><span class="token constant" style="color:#36acaa">A</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token function" style="color:#d73a49">send</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">SendParameters</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">code</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">init</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">code</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">/* ... */</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain" style="display:inline-block"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: Prefer `emptySlice()`</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">    </span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">s</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Slice</span><span class="token plain"> </span><span class="token operator" 
style="color:#393A34">=</span><span class="token plain"> </span><span class="token function" style="color:#d73a49">emptyCell</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">asSlice</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>Keeping up with every change across versions of the Tact compiler is not easy, so Misti tracks them for you. Examples of detectors of this category include <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/PreferredStdlibApi/" target="_blank" rel="noopener noreferrer">PreferredStdlibApi</a></b> and <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/SuboptimalSend/" target="_blank" rel="noopener noreferrer">SuboptimalSend</a></b>.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="arithmetic-issues">Arithmetic issues<a href="https://nowarp.io/blog/ton-security-risks#arithmetic-issues" class="hash-link" aria-label="Direct link to Arithmetic issues" title="Direct link to Arithmetic issues" translate="no">​</a></h3>
<p>A classic arithmetic issue in smart contracts is division before multiplication. Integer division discards the remainder, and a subsequent multiplication amplifies that loss. If the remainder is not handled, the contract might lose user funds or tokens on such operations:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">a</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Int</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">10</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Int</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">3</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">c</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Int</span><span class="token plain"> 
</span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token number" style="color:#36acaa">2</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token comment" style="color:#999988;font-style:italic">// Bad: Division before multiplication</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">result</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Int</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">a</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">/</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">b</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">*</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">c</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div></code></pre></div></div>
<p>The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/DivideBeforeMultiply/" target="_blank" rel="noopener noreferrer">DivideBeforeMultiply</a></b> detector can detect these cases.</p>
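<p>A small worked example of the difference between the two orderings. With the values used above (10, 3, 2) both orderings happen to produce 6, so this sketch uses illustrative values chosen to make the lost remainder visible:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">let a: Int = 10;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">let b: Int = 4;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">let c: Int = 2;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// Bad: 10 / 4 = 2 (remainder 2 is dropped), then 2 * 2 = 4</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">let lossy: Int = a / b * c;</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">// Better: 10 * 2 = 20, then 20 / 4 = 5</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">let exact: Int = a * c / b;</span><br></div></code></pre></div></div>
<p>Multiplying first keeps the intermediate value exact, at the cost of a larger intermediate result, so the operands should be small enough not to overflow.</p>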
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="use-random-properly">Use random properly<a href="https://nowarp.io/blog/ton-security-risks#use-random-properly" class="hash-link" aria-label="Direct link to Use random properly" title="Direct link to Use random properly" translate="no">​</a></h4>
<p>TVM implements some <a href="https://docs.ton.org/v3/guidelines/smart-contracts/security/random-number-generation" target="_blank" rel="noopener noreferrer" class="">PRG functionality</a> (pseudorandom generation). When using it in Tact, the developer should pay attention to API usage and read the documentation carefully. The seed must always be initialized first, using either <a href="https://docs.tact-lang.org/ref/core-advanced/#nativepreparerandom" target="_blank" rel="noopener noreferrer" class=""><code>nativePrepareRandom</code></a> from the Tact standard library or TVM assembly. <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/EnsurePrgSeed/" target="_blank" rel="noopener noreferrer">EnsurePrgSeed</a></b> ensures the random seed is set up before accessing randomness features.</p>
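<p>A minimal sketch of the intended call order, assuming the low-level <code>nativeRandom</code> function is used directly and therefore the seed has to be prepared by hand:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">// OK: the seed is initialized before the first random value is requested</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">nativePrepareRandom();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">let r: Int = nativeRandom();</span><br></div></code></pre></div></div>
<p>Omitting the first call is the kind of pattern the detector is meant to report.</p>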
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="unprotected-calls-or-state-changes">Unprotected calls or state changes<a href="https://nowarp.io/blog/ton-security-risks#unprotected-calls-or-state-changes" class="hash-link" aria-label="Direct link to Unprotected calls or state changes" title="Direct link to Unprotected calls or state changes" translate="no">​</a></h3>
<p>Smart contracts usually have privileged operations, e.g. destroying the contract, sending funds, or changing critical parameters through incoming messages. All of this functionality must be protected from arbitrary users to avoid cases like <a href="https://blog.openzeppelin.com/on-the-parity-wallet-multisig-hack-405a8c12e8f7" target="_blank" rel="noopener noreferrer" class="">The Parity Wallet Hack</a>.</p>
<p>Here is some code illustrating this for Tact:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">receive</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">s1</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token builtin">Slice</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token keyword" style="color:#00009f">let</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">a</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">=</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">s1</span><span class="token punctuation" style="color:#393A34">.</span><span class="token function" style="color:#d73a49">loadAddress</span><span class="token punctuation" style="color:#393A34">(</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: Anyone could send funds to an arbitrary address</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token 
plain">  </span><span class="token comment" style="color:#999988;font-style:italic">// The protection would be to `require` a specific sender address</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token function" style="color:#d73a49">send</span><span class="token punctuation" style="color:#393A34">(</span><span class="token class-name">SendParameters</span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">to</span><span class="token punctuation" style="color:#393A34">:</span><span class="token plain"> </span><span class="token variable" style="color:#36acaa">a</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token comment" style="color:#999988;font-style:italic">/*...*/</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">}</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/UnprotectedCall/" target="_blank" rel="noopener noreferrer">UnprotectedCall</a></b> detector flags such cases so they can be fixed before deployment.</p>
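<p>A sketch of the protected variant hinted at in the comment above. It assumes the contract stores a trusted address in a hypothetical <code>owner</code> field; the elided <code>SendParameters</code> fields are left as in the original snippet:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token plain">receive(s1: Slice) {</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  // OK: reject the message unless it comes from the stored owner</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  require(sender() == self.owner, "Only owner can send funds");</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  let a = s1.loadAddress();</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  send(SendParameters{ to: a, /*...*/ });</span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">}</span><br></div></code></pre></div></div>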
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="generic-code-issues">Generic code issues<a href="https://nowarp.io/blog/ton-security-risks#generic-code-issues" class="hash-link" aria-label="Direct link to Generic code issues" title="Direct link to Generic code issues" translate="no">​</a></h3>
<p>There are plenty of issues typical not only for smart contracts but for ordinary programs as well. While in <em>non-safety-critical systems</em> these issues typically don't lead to severe damage, in contracts the consequences might be much worse. Let's consider some case studies:</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="be-careful-with-copy-paste">Be careful with copy-paste<a href="https://nowarp.io/blog/ton-security-risks#be-careful-with-copy-paste" class="hash-link" aria-label="Direct link to Be careful with copy-paste" title="Direct link to Be careful with copy-paste" translate="no">​</a></h4>
<p>A typical case is when two branches need slightly different logic and parametrizing it would overcomplicate the code. Be careful to update every branch after copy-pasting, and again later when refactoring:</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#393A34;--prism-background-color:#f6f8fa"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar" style="color:#393A34;background-color:#f6f8fa"><code class="codeBlockLines_e6Vv"><div class="token-line" style="color:#393A34"><span class="token keyword" style="color:#00009f">if</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">(</span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">lockPeriod</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">&gt;</span><span class="token plain"> </span><span class="token constant" style="color:#36acaa">HALF_YEAR</span><span class="token punctuation" style="color:#393A34">)</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token function" style="color:#d73a49">require</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">sender</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">owner</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string-literal string" style="color:#e3116c">"Only owner can trade"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" 
style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">else</span><span class="token plain"> </span><span class="token punctuation" style="color:#393A34">{</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token comment" style="color:#999988;font-style:italic">// Bad: The developer forgot to update the copy-pasted branch</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain">  </span><span class="token function" style="color:#d73a49">require</span><span class="token punctuation" style="color:#393A34">(</span><span class="token variable" style="color:#36acaa">sender</span><span class="token plain"> </span><span class="token operator" style="color:#393A34">==</span><span class="token plain"> </span><span class="token keyword" style="color:#00009f">self</span><span class="token punctuation" style="color:#393A34">.</span><span class="token variable" style="color:#36acaa">owner</span><span class="token punctuation" style="color:#393A34">,</span><span class="token plain"> </span><span class="token string-literal string" style="color:#e3116c">"Only owner can trade"</span><span class="token punctuation" style="color:#393A34">)</span><span class="token punctuation" style="color:#393A34">;</span><span class="token plain"></span><br></div><div class="token-line" style="color:#393A34"><span class="token plain"></span><span class="token punctuation" style="color:#393A34">}</span><br></div></code></pre></div></div>
<p>The <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/BranchDuplicate/" target="_blank" rel="noopener noreferrer">BranchDuplicate</a></b> detector finds identical branches in code.</p>
<h4 class="anchor anchorTargetStickyNavbar_Vzrq" id="get-rid-of-dead-code">Get rid of dead code<a href="https://nowarp.io/blog/ton-security-risks#get-rid-of-dead-code" class="hash-link" aria-label="Direct link to Get rid of dead code" title="Direct link to Get rid of dead code" translate="no">​</a></h4>
<p>Dead code is not as simple as it sounds. It not only clutters your codebase but often indicates that the developer:</p>
<ul>
<li class="">Forgot to implement the intended logic (e.g. unused constant or write-only field)</li>
<li class="">Didn't check the error returned by a function (this can lead to control flow anomalies; search for <code>weird ERC20 attack</code> and read <a href="https://nowarp.io/blog/ton-security-risks#references" class="">[5]</a> for more)</li>
</ul>
<p>Examples of dead code detectors in Misti are: <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/NeverAccessedVariables/" target="_blank" rel="noopener noreferrer">NeverAccessedVariables</a></b>, <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/ReadOnlyVariables/" target="_blank" rel="noopener noreferrer">ReadOnlyVariables</a></b>, <b><a href="https://nowarp.io/tools/misti/docs/next/detectors/UnusedExpressionResult/" target="_blank" rel="noopener noreferrer">UnusedExpressionResult</a></b>.</p>
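<p>To make the first bullet concrete, here is a minimal sketch of a contract that contains both an unused constant and a write-only field. The <code>Vault</code> contract, the <code>WITHDRAW_FEE</code> constant and the <code>lastSender</code> field are hypothetical names made up for this illustration; they are not taken from a real project or from the Misti documentation.</p>
<div class="language-tact codeBlockContainer_Ckt0 theme-code-block"><div class="codeBlockContent_QJqH"><pre tabindex="0" class="prism-code language-tact codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv">contract Vault {
    // Hypothetical constant: declared but never read,
    // so the intended fee is never charged.
    const WITHDRAW_FEE: Int = ton("0.05");

    owner: Address;
    // Hypothetical field: assigned below but never read anywhere.
    lastSender: Address;

    init(owner: Address) {
        self.owner = owner;
        self.lastSender = owner;
    }

    receive("withdraw") {
        require(sender() == self.owner, "Only owner can withdraw");
        self.lastSender = sender();
        // The fee was presumably meant to be deducted here,
        // but WITHDRAW_FEE is never referenced.
        send(SendParameters{
            to: self.owner,
            value: 0,
            mode: SendRemainingBalance,
            bounce: false
        });
    }
}
</code></pre></div></div>
<p>Neither declaration changes the contract's behavior, which is exactly why they are easy to miss in review: the code compiles and runs, yet the fee logic the developer had in mind is absent. Dead code detectors are designed to point at declarations like these.</p>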
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="conclusion">Conclusion<a href="https://nowarp.io/blog/ton-security-risks#conclusion" class="hash-link" aria-label="Direct link to Conclusion" title="Direct link to Conclusion" translate="no">​</a></h2>
<p>We've covered the basics of static program analysis and its application to the TON security landscape. Here are some concrete steps you should consider to increase the security of your contracts.</p>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="increase-security-of-your-project">Increase security of your project<a href="https://nowarp.io/blog/ton-security-risks#increase-security-of-your-project" class="hash-link" aria-label="Direct link to Increase security of your project" title="Direct link to Increase security of your project" translate="no">​</a></h3>
<ul>
<li class=""><strong>Integrate security tools:</strong> Use every opportunity to make your code secure. Static analysis is a good first line of defense, catching common vulnerabilities early and automatically. Add Misti to your <a href="https://nowarp.io/tools/misti/docs/tutorial/ci-cd" target="_blank" rel="noopener noreferrer" class="">CI/CD</a> and integrate it <a href="https://t.me/nowarp_io/4" target="_blank" rel="noopener noreferrer" class="">into the development process</a>.</li>
<li class=""><strong>Apply development practices:</strong> Testing, design, formal specification, or at least documentation. This topic deserves a separate post and may be covered in this blog later. Prefer <a href="https://tact-lang.org/" target="_blank" rel="noopener noreferrer" class="">a safe language</a>.</li>
<li class=""><strong>Set up processes:</strong> Security is not only about analysis and audits: you should plan your secure development and incident response processes in advance.</li>
<li class=""><strong>Get a security audit:</strong> While static analysis enhances the security process, it cannot replace thorough manual audits. For production contracts, professional security audits remain essential. You can find our contact details on <a href="https://nowarp.io/" target="_blank" rel="noopener noreferrer" class="">nowarp.io</a> or browse the security teams collaborating with TF.</li>
</ul>
<h3 class="anchor anchorTargetStickyNavbar_Vzrq" id="future-directions-in-misti">Future Directions in Misti<a href="https://nowarp.io/blog/ton-security-risks#future-directions-in-misti" class="hash-link" aria-label="Direct link to Future Directions in Misti" title="Direct link to Future Directions in Misti" translate="no">​</a></h3>
<p>There are many opportunities in TON security automation. Our concrete steps in Misti for the coming months:</p>
<ul>
<li class="">Implement <a href="https://github.com/nowarp/misti/issues/254" target="_blank" rel="noopener noreferrer" class="">IFDS with path-sensitivity tracking</a> to improve the accuracy of interprocedural taint analysis</li>
<li class="">Implement more Tact detectors using advanced static analysis techniques. The concrete roadmap will be available in the <a href="https://github.com/nowarp/misti/milestones" target="_blank" rel="noopener noreferrer" class="">GitHub milestones</a>.</li>
<li class="">Improve integrability and API to support third-party developers</li>
<li class="">Provide better tooling for auditors to actually understand the structure of contracts</li>
</ul>
<p>Overall, Misti continues to follow the development of the Tact language and keeps improving its support for it, to make development in Tact smoother and more secure.</p>
<h2 class="anchor anchorTargetStickyNavbar_Vzrq" id="references">References<a href="https://nowarp.io/blog/ton-security-risks#references" class="hash-link" aria-label="Direct link to References" title="Direct link to References" translate="no">​</a></h2>
<ol>
<li class=""><a href="https://yanniss.github.io/symvalic-oopsla21.pdf" target="_blank" rel="noopener noreferrer" class="">Smaragdakis et al. – Symbolic Value Analysis for Smart Contracts</a></li>
<li class=""><a href="https://github.com/ZhangZhuoSJTU/Web3Bugs/blob/main/papers/icse23.pdf" target="_blank" rel="noopener noreferrer" class="">Zhang et al. – Demystifying Exploitable Bugs in Smart Contracts</a></li>
<li class=""><a href="https://docs.ton.org/v3/guidelines/smart-contracts/security/secure-programming" target="_blank" rel="noopener noreferrer" class="">TON Documentation: Secure Smart Contract Programming</a></li>
<li class=""><a href="https://cs.au.dk/~amoeller/spa/spa.pdf" target="_blank" rel="noopener noreferrer" class="">Anders Møller and Michael I. Schwartzbach – Static Program Analysis</a></li>
<li class=""><a href="https://dl.acm.org/doi/10.1145/3605768.3623546" target="_blank" rel="noopener noreferrer" class="">Gan et al. – Why Trick Me: The Honeypot Traps on Decentralized Exchanges</a></li>
</ol>]]></content:encoded>
            <author>jubnzv@gmail.com (Georgiy Komarov)</author>
            <category>misti</category>
            <category>ton</category>
            <category>static analysis</category>
        </item>
    </channel>
</rss>