Published papers, technical reports, and talks online.


Hongxu Cai
Zhong Shao
Alexander Vaynberg


Self-modifying code (SMC), in this paper, broadly refers to any program that purposely loads, generates, or mutates code at runtime. It is widely used in many of the world's critical software systems to support runtime code generation and optimization, dynamic loading and linking, OS kernel boot-loading, just-in-time compilation, binary translation, virtual machine monitor, or dynamic code encryption and obfuscation. Unfortunately, SMC is also extremely difficult to reason about: existing formal verification techniques---including Hoare logic, type system, and proof-carrying code---consistently maintain the assumption that program code stored in memory is fixed and immutable. This severely limits the applicability and power of today's program verification systems.

This paper presents a novel yet simple extension of Hoare-logic-like framework to support modular verification of general von-Neumann machine code with runtime code manipulation. By dropping the assumption that code memory is fixed and immutable, we are forced to apply local reasoning and separation logic at the very beginning, and treat program code uniformly as regular data structure. We address the interaction between separation and code memory and show how to establish the frame rules for local reasoning even in the presence of SMC. Our system is realistic, but designed to be highly generic, so that it can support assembly code under all modern CPUs (including both x86 and MIPS). Our system is expressive and fully mechanized. We prove its soundness in the Coq proof assistant and demonstrate its power by certifying a series of realistic examples and applications---all of which can directly run on the SPIM simulator or any stock x86 hardware.


  • In Proc. 2007 ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI'07), San Diego, California, pages 66-77, June 2007. ©2007 ACM.

  • Technical Report YALEU/DCS/TR-1379 (Extended Version)

  • Coq Implementaion (tar.gz) [108k]
    • Encoding of the general target machine (GTM);
    • Encoding of the assertion language and separation logic including the common properties;
    • Encoding of GCAP0 on GTM including the soundness proof, frame rules, and other derived rules.
    • Encoding of GCAP1 on GTM including the soundness proof, frame rules, and other derived rules.
    • Encoding of GCAP2 on GTM including the soundness proof, frame rules, and other derived rules.
    • Certified examples.