Inside Ohm's PEG-to-Wasm Compiler

Source: ohmjs.org | Rating: ⭐⭐⭐⭐ (4/5)

Ohm v18 beta features a complete rewrite of the parsing engine that compiles PEG grammars to WebAssembly. Reported result: 50x faster parsing at 10% of the memory usage of v17.

Technical Highlights:
  • From AST interpretation to Wasm compilation: Previous versions interpreted the PExpr tree; v18 compiles it to Wasm
  • Compile-time loop structure: Alternative matching is inlined at compile time rather than dispatched at runtime
  • Arena allocation: CST nodes use bump allocation in Wasm linear memory instead of GC'd JS objects
  • Compact node layout: 32-bit offset references instead of full pointers
  • Rule application as function calls: Each grammar rule becomes its own Wasm function
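
The arena and compact-layout points can be illustrated with a small TypeScript sketch. It is not Ohm's actual node format: the NodeArena class, its field order (rule id, match length, child count, child offsets), and the plain ArrayBuffer standing in for Wasm linear memory are all assumptions made for illustration.

```typescript
// Hypothetical node layout (an assumption, not Ohm's real format):
//   bytes 0..3   rule id
//   bytes 4..7   match length
//   bytes 8..11  child count
//   bytes 12..   one 32-bit byte offset per child
const WORD = 4; // bytes per 32-bit field
const HEADER_BYTES = 3 * WORD;

class NodeArena {
  private readonly view: DataView;
  private top = WORD; // byte offset 0 is reserved to mean "no node"

  constructor(capacityBytes: number) {
    // A plain ArrayBuffer stands in for Wasm linear memory here.
    this.view = new DataView(new ArrayBuffer(capacityBytes));
  }

  // Bump allocation: reserving space only advances one counter.
  allocNode(ruleId: number, matchLength: number, children: number[]): number {
    const size = HEADER_BYTES + children.length * WORD;
    if (this.top + size > this.view.byteLength) {
      throw new Error("arena exhausted");
    }
    const node = this.top;
    this.top += size;
    this.view.setUint32(node, ruleId, true);
    this.view.setUint32(node + WORD, matchLength, true);
    this.view.setUint32(node + 2 * WORD, children.length, true);
    children.forEach((child, i) => {
      this.view.setUint32(node + HEADER_BYTES + i * WORD, child, true);
    });
    return node; // a 32-bit offset, not a JS object reference
  }

  ruleId(node: number): number {
    return this.view.getUint32(node, true);
  }

  matchLength(node: number): number {
    return this.view.getUint32(node + WORD, true);
  }

  childCount(node: number): number {
    return this.view.getUint32(node + 2 * WORD, true);
  }

  child(node: number, index: number): number {
    return this.view.getUint32(node + HEADER_BYTES + index * WORD, true);
  }

  bytesUsed(): number {
    return this.top;
  }
}

// Usage: two leaves and a parent that refers to them by offset.
const arena = new NodeArena(1024);
const leafA = arena.allocNode(1, 1, []);
const leafB = arena.allocNode(1, 2, []);
const parent = arena.allocNode(0, 3, [leafA, leafB]);
```

Because children are stored as offsets into one buffer, the whole tree lives in a single allocation that the garbage collector never has to trace node by node.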

The key insight is that a grammar's tree of parsing expressions (PExprs) can be compiled directly to WebAssembly rather than interpreted. The code generation phase transforms the PExpr tree into Wasm instructions, with each rule becoming a function.
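
To make the interpret-versus-compile distinction concrete, here is a hedged TypeScript sketch that compiles a toy PExpr tree into closures. Closures are a stand-in for Wasm code generation, which would be far longer to show; the PExpr shape, the compileGrammar function, and the example grammar are all hypothetical and not taken from Ohm. What carries over is the idea: the dispatch on expression kind happens once, when the grammar is compiled, and each rule ends up as its own function.

```typescript
// Toy parsing-expression tree (hypothetical shape, not Ohm's PExpr classes).
type PExpr =
  | { kind: "terminal"; text: string }
  | { kind: "seq"; items: PExpr[] }
  | { kind: "alt"; choices: PExpr[] }
  | { kind: "apply"; rule: string };

// A compiled matcher returns the new input position, or -1 on failure.
type Matcher = (input: string, pos: number) => number;

function compileGrammar(rules: Map<string, PExpr>): Map<string, Matcher> {
  const compiled = new Map<string, Matcher>();

  // The switch on expr.kind runs once per node, at compile time.
  function compile(expr: PExpr): Matcher {
    switch (expr.kind) {
      case "terminal": {
        const text = expr.text;
        return (input, pos) =>
          input.slice(pos, pos + text.length) === text ? pos + text.length : -1;
      }
      case "seq": {
        const items = expr.items.map(compile);
        return (input, pos) => {
          let p = pos;
          for (const item of items) {
            p = item(input, p);
            if (p < 0) return -1;
          }
          return p;
        };
      }
      case "alt": {
        // PEG ordered choice: the first alternative that succeeds wins.
        const choices = expr.choices.map(compile);
        return (input, pos) => {
          for (const choice of choices) {
            const p = choice(input, pos);
            if (p >= 0) return p;
          }
          return -1;
        };
      }
      case "apply": {
        // Rule application becomes a call to that rule's own function.
        const name = expr.rule;
        return (input, pos) => {
          const target = compiled.get(name);
          if (target === undefined) throw new Error("unknown rule: " + name);
          return target(input, pos);
        };
      }
    }
  }

  rules.forEach((body, name) => {
    compiled.set(name, compile(body));
  });
  return compiled;
}

// Example grammar: greeting = ("hello" | "hi") " " name ; name = "world" | "ohm"
const toyRules = new Map<string, PExpr>();
toyRules.set("greeting", {
  kind: "seq",
  items: [
    { kind: "alt", choices: [{ kind: "terminal", text: "hello" }, { kind: "terminal", text: "hi" }] },
    { kind: "terminal", text: " " },
    { kind: "apply", rule: "name" },
  ],
});
toyRules.set("name", {
  kind: "alt",
  choices: [{ kind: "terminal", text: "world" }, { kind: "terminal", text: "ohm" }],
});

const matchers = compileGrammar(toyRules);

function matchGreeting(input: string): number {
  const greeting = matchers.get("greeting");
  if (greeting === undefined) throw new Error("missing rule");
  return greeting(input, 0);
}
```

In this sketch the loop over alternatives still runs at match time; a code generator that emits Wasm can go further and unroll it into straight-line code, which is what the "inlined at compile time" highlight describes.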

Memory management uses region-based allocation: all CST nodes share the lifetime of their parent MatchResult, so they can be deallocated in bulk.
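
The lifetime rule can be sketched in a few lines of TypeScript. The Region and MatchResultSketch classes, the parseSketch helper, and the 16-byte node size are hypothetical names and values chosen for illustration, not Ohm's API. The point is only that freeing every node costs one counter reset, however many nodes the match produced.

```typescript
// A region hands out offsets and frees everything at once.
class Region {
  private top = 0;
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  alloc(size: number): number {
    if (this.top + size > this.capacity) {
      throw new Error("region exhausted");
    }
    const offset = this.top;
    this.top += size;
    return offset;
  }

  bytesUsed(): number {
    return this.top;
  }

  // Bulk deallocation: all nodes are dropped by resetting one counter.
  releaseAll(): void {
    this.top = 0;
  }
}

// Hypothetical stand-in for a match result that owns its nodes' region.
class MatchResultSketch {
  readonly region: Region;
  readonly rootNode: number;

  constructor(region: Region, rootNode: number) {
    this.region = region;
    this.rootNode = rootNode;
  }

  // Ending the result's lifetime ends the lifetime of every node in it.
  dispose(): void {
    this.region.releaseAll();
  }
}

// Pretend parse: allocates nodeCount fixed-size nodes, the last one is the root.
function parseSketch(region: Region, nodeCount: number): MatchResultSketch {
  let root = 0;
  for (let i = 0; i < nodeCount; i++) {
    root = region.alloc(16);
  }
  return new MatchResultSketch(region, root);
}

const region = new Region(4096);
const result = parseSketch(region, 100);
const usedDuringMatch = region.bytesUsed();
result.dispose();
const usedAfterDispose = region.bytesUsed();
```

The trade-off is that no single node can be freed early; that is acceptable here because nothing outlives the result that owns the region.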

Tags: Ohm, PEG, WebAssembly, Compilers, Parsing, AssemblyScript