[vm] Fix several correctness issues in Move VM runtime (#19052)
What specific code changed
Seven files in the Move VM runtime were patched:
- interpreter.rs: Added a guard that dispatch functions must have ≥1 argument and removed a dead VecPack operand-stack pop for closures.
- loader/function.rs & loader/script.rs: Replaced the infallible Bytecode::into() conversion with the fallible Instruction::try_from() so that malformed bytecodes are caught at load time instead of at runtime.
- runtime_ref_checks.rs: Replaced the panicking safe_unwrap_err! on VecPack(n) with a safe cast *n as usize.
- runtime_type_checks.rs: Added an explicit comment that Exists instructions deliberately skip the key ability check because read-only existence tests are not exploitable.
- storage/ty_layout_converter.rs: Added a 4-line comment explaining why delayed-field layouts bypass the cache.
- move-vm/types/src/instr.rs: Changed the VecPack immediate from u64 to u16 and introduced the new fallible impl TryFrom<Bytecode> for Instruction that returns PartialVMResult.
Why this change was made
The runtime had latent correctness holes:
- Dynamic dispatch could be called on a native with zero parameters, violating the convention that the last argument carries the function selector.
- Bytecode loading used an infallible into() conversion, so any future invalid opcode would silently become Instruction::Nop instead of failing.
- VecPack carried a u64 count but the VM limits vectors to MAX_VEC_SIZE = u16::MAX; the wider type invited overflow and the accompanying safe_unwrap_err! could panic.
These issues were found during internal audits for the v1.9 release.
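The hazard of a count that is wider than the limit it must respect can be shown with plain Rust integer conversions. The sketch below is not taken from the VM; the helper names narrow_lossy and narrow_checked are hypothetical and exist only to contrast a silent `as` cast with a checked conversion.

```rust
use std::convert::TryFrom;

/// Narrow with `as`: out-of-range values silently keep only the low 16 bits.
/// (Hypothetical helper, for illustration only.)
fn narrow_lossy(n: u64) -> u16 {
    n as u16
}

/// Narrow with `u16::try_from`: out-of-range values are reported as `None`
/// instead of being truncated. (Hypothetical helper, for illustration only.)
fn narrow_checked(n: u64) -> Option<u16> {
    u16::try_from(n).ok()
}
```

With these helpers, narrow_lossy(65_536) yields 0 while narrow_checked(65_536) yields None, which is why a checked conversion (or an explicit range check before the cast) is the safer pattern for a count that must fit in u16.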
How it works technically
The diff introduces a new TryFrom implementation in instr.rs:
    impl TryFrom<Bytecode> for Instruction {
        ...
        fn try_from(bc: Bytecode) -> PartialVMResult<Self> {
            match bc {
                ...
                Bytecode::VecPack(si, n) => {
                    ensure!(n <= u16::MAX as u64, StatusCode::MALFORMED);
                    Instruction::VecPack(si, n as u16)
                }
                ...
            }
        }
    }

Loader phases now invoke Instruction::try_from instead of into(), so any out-of-range n triggers an early PartialVMError and aborts loading. The interpreter’s dynamic-dispatch path explicitly checks function.param_tys().is_empty() and bails with INVARIANT_VIOLATION if true. Reference-checking for VecPack now uses the narrowed u16 directly, eliminating the try_into().unwrap() panic.
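As a rough, self-contained illustration of the load-time rejection described above, the following sketch uses simplified stand-in types rather than the real Move VM definitions. The two-variant Bytecode and Instruction enums, the LoadError type, and the load_code function are all assumptions made for this example; the real code returns PartialVMResult and covers the full instruction set.

```rust
use std::convert::TryFrom;

// Hypothetical, simplified stand-ins for the real Move VM types.
#[derive(Debug, PartialEq)]
enum Bytecode {
    Nop,
    VecPack(u16, u64), // (signature index, element count)
}

#[derive(Debug, PartialEq)]
enum Instruction {
    Nop,
    VecPack(u16, u16), // count narrowed to u16
}

// Assumed error type; the real code reports a PartialVMError.
#[derive(Debug, PartialEq)]
enum LoadError {
    Malformed(String),
}

impl TryFrom<Bytecode> for Instruction {
    type Error = LoadError;

    fn try_from(bc: Bytecode) -> Result<Self, Self::Error> {
        Ok(match bc {
            Bytecode::Nop => Instruction::Nop,
            Bytecode::VecPack(si, n) => {
                // Checked narrowing: fail instead of truncating.
                let count = u16::try_from(n).map_err(|_| {
                    LoadError::Malformed(format!("VecPack count {} exceeds u16::MAX", n))
                })?;
                Instruction::VecPack(si, count)
            }
        })
    }
}

/// Converts a whole code unit, stopping at the first malformed bytecode.
fn load_code(code: Vec<Bytecode>) -> Result<Vec<Instruction>, LoadError> {
    code.into_iter().map(|bc| Instruction::try_from(bc)).collect()
}
```

Collecting into a Result means a single out-of-range VecPack makes the whole load fail, which mirrors the "reject at load time" behaviour the report describes without claiming to reproduce the loader's actual control flow.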
Where it fits in the Aptos pipeline
These changes sit inside the Move VM runtime, between consensus ordering and state-commit:
Consensus → (Quorum Store) → Block-STM → Move VM → State DB
The loader runs when a block is first processed by Block-STM; the interpreter runs during every transaction execution; the type- and reference-checkers run only when the paranoid mode flag is enabled (default on mainnet since v1.7). Therefore the patches harden both the initial bytecode validation and the speculative execution path.
What the implications are
- Any module or script containing a VecPack with count > 65535 will now be rejected at load time instead of causing undefined behaviour.
- Dynamic dispatch on improperly defined natives fails deterministically rather than corrupting the operand stack.
- The paranoid checker no longer panics on large vectors; it continues with correct reference counting.
- No on-chain state migration is required—the fixes are purely in the VM implementation, so nodes only need to upgrade.
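The deterministic failure for dynamic dispatch can be pictured with a small guard. This is only a sketch under assumptions: the Function struct, the VmError enum, and the selector_index helper are hypothetical, and only the param_tys().is_empty() check and the invariant-violation outcome come from the report.

```rust
// Assumed error type standing in for the VM's INVARIANT_VIOLATION status.
#[derive(Debug, PartialEq)]
enum VmError {
    InvariantViolation(String),
}

// Hypothetical, simplified function descriptor.
struct Function {
    name: String,
    param_tys: Vec<String>,
}

impl Function {
    fn param_tys(&self) -> &[String] {
        &self.param_tys
    }
}

/// Returns the index of the last parameter, which by convention carries the
/// function selector. A dispatch target with no parameters is rejected up
/// front instead of letting `len() - 1` underflow later.
fn selector_index(function: &Function) -> Result<usize, VmError> {
    if function.param_tys().is_empty() {
        return Err(VmError::InvariantViolation(format!(
            "dispatch function {} must have at least one parameter",
            function.name
        )));
    }
    Ok(function.param_tys().len() - 1)
}
```

Checking emptiness before computing the index keeps the failure an ordinary error value, which is the property the implication above relies on.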
ELI5 — Explain Like I'm 5
Imagine the Move VM is a picky chef who follows a recipe book (bytecode) to cook every transaction. Georgy found three pages in the book that could make the chef slice his finger or serve raw food.
First, the recipe for “dynamic dispatch” forgot to say “you must have at least one ingredient on the counter”; the chef could try to grab air and crash. The patch adds a line that checks the counter isn’t empty.
Second, the old cookbook used a magic marker that turned any scribble into “do nothing” instead of warning you the page was torn. The new marker refuses to cook if it sees a torn page, so bad recipes are thrown out before the stove is lit.
Third, one recipe asked the chef to pack 100 000 items into a picnic basket that only holds 65 535 sandwiches; the chef used to panic when he realised it wouldn’t fit. Now the recipe is rewritten to use a smaller number that always fits, and the chef stays calm.
Because the fixes live only in the kitchen (VM), once validators upgrade their chef, every transaction from that moment on is safer without changing anything already stored in the pantry (blockchain state).