asm-rs

A pure Rust multi-architecture assembly engine for offensive security. Zero unsafe, no_std-compatible, designed for embedding in exploit compilers, JIT engines, security tools, and shellcode generators.


Project maintained by hupe1980

Architecture Guide

Deep dive into the asm-rs assembler pipeline, module responsibilities, encoding architecture, and testing strategy.


Pipeline Overview

Source Text
     │
     ▼
┌──────────────┐
│ Preprocessor │  Expands macros, loops, conditionals
└──────┬───────┘  (.macro/.rept/.irp/.if)
       │
       ▼
┌─────────┐
│  Lexer  │  Zero-copy tokenization into Token<'src>
└────┬────┘  with source spans for error reporting
     │
     ▼
┌─────────┐
│  Parser  │  Produces intermediate representation (IR)
└────┬────┘  from token stream using Intel syntax rules
     │
     ▼
┌───────────┐
│ Optimizer │  Peephole optimizations (zero-idiom, MOV narrowing,
└─────┬─────┘  AND→TEST conversion) when OptLevel::Size is active
      │
      ▼
┌──────────┐
│ Encoder  │  Translates IR instructions into machine code
└────┬─────┘  bytes with relocations and relax info
     │
     ▼
┌──────────┐
│  Linker  │  Resolves labels, relaxes branches (Szymanski),
└────┬─────┘  applies relocations, produces final output
     │
     ▼
  Output Bytes + Labels + Applied Relocations

Module Responsibilities

Preprocessor

The preprocessor operates on raw source text before the lexer. It performs text-level expansion of macros, loops, and conditional assembly directives.
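To make "text-level expansion" concrete, here is a minimal sketch of a repeat-loop expander. It is not the crate's implementation: `expand_rept` is a hypothetical helper, it assumes the GAS convention that a `.rept N` block is closed by `.endr`, and it ignores nesting and error reporting.

```rust
/// Expands `.rept N` ... `.endr` blocks by plain text repetition.
/// Hypothetical helper for illustration; nested loops are not handled.
fn expand_rept(source: &str) -> String {
    let mut out = String::new();
    let mut lines = source.lines();
    while let Some(line) = lines.next() {
        if let Some(count) = line.trim().strip_prefix(".rept") {
            // A malformed count is treated as zero in this sketch.
            let n: usize = count.trim().parse().unwrap_or(0);
            // Collect the loop body up to the closing `.endr`.
            let mut body = Vec::new();
            for inner in lines.by_ref() {
                if inner.trim() == ".endr" {
                    break;
                }
                body.push(inner);
            }
            // Emit the body `n` times; the lexer only ever sees the result.
            for _ in 0..n {
                for body_line in &body {
                    out.push_str(body_line);
                    out.push('\n');
                }
            }
        } else {
            out.push_str(line);
            out.push('\n');
        }
    }
    out
}
```

Because the expansion happens on text, the later stages need no knowledge of loops: `"nop\n.rept 2\ninc eax\n.endr\nret\n"` becomes four ordinary instruction lines.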

Macro definitions (.macro / .endm):

Repeat loops:

Conditional assembly:

Expression evaluator (recursive-descent):

Design decisions:


Lexer

The lexer performs zero-copy tokenization of assembly source text. It produces a Vec<Token<'src>> where each token carries:

Design decisions:
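As a rough illustration of zero-copy tokenization with source spans (a sketch only, not the crate's lexer), the following splits a line into tokens that borrow from the input. The `Token` and `Span` shapes and the `tokenize` function are assumptions made for the example; the real `Token<'src>` carries more information.

```rust
/// Byte range of a token in the source, kept for error reporting.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Span {
    start: usize,
    end: usize,
}

/// A token that borrows its text from the source instead of copying it.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Token<'src> {
    text: &'src str,
    span: Span,
}

fn make_token(src: &str, start: usize, end: usize) -> Token<'_> {
    Token { text: &src[start..end], span: Span { start, end } }
}

/// Splits `src` into word tokens and single-character punctuation tokens.
fn tokenize(src: &str) -> Vec<Token<'_>> {
    let mut tokens = Vec::new();
    let mut word_start = None;
    for (i, c) in src.char_indices() {
        let is_word = c.is_alphanumeric() || c == '_' || c == '.';
        match (is_word, word_start) {
            (true, None) => word_start = Some(i),
            (false, Some(s)) => {
                tokens.push(make_token(src, s, i));
                word_start = None;
            }
            _ => {}
        }
        // Anything that is neither a word character nor whitespace
        // becomes a one-character token of its own.
        if !is_word && !c.is_whitespace() {
            tokens.push(make_token(src, i, i + c.len_utf8()));
        }
    }
    if let Some(s) = word_start {
        tokens.push(make_token(src, s, src.len()));
    }
    tokens
}
```

No token owns a `String`; the only allocation is the `Vec` itself, and every span can be mapped back to a line and column when reporting an error.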


Parser

The parser consumes &[Token<'_>] and produces Vec<Statement>. It handles:

The parser is a simple recursive-descent parser producing flat IR statements. Instruction fields are fully stack-allocated: Mnemonic (inline [u8; 24]), OperandList (inline [Operand; 6]), PrefixList (inline [Prefix; 4]) — yielding zero heap allocations per instruction.
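The inline-storage idea can be sketched as follows. Only the 24-byte capacity comes from the description above; the type name `InlineMnemonic`, its fields, and its methods are hypothetical.

```rust
/// Fixed-capacity mnemonic stored inline, so it never touches the heap.
/// Hypothetical type; only the 24-byte capacity is taken from the text.
#[derive(Clone, Copy)]
struct InlineMnemonic {
    bytes: [u8; 24],
    len: u8,
}

impl InlineMnemonic {
    /// Returns `None` if the text does not fit in the inline buffer.
    fn new(text: &str) -> Option<Self> {
        if text.len() > 24 {
            return None;
        }
        let mut bytes = [0u8; 24];
        bytes[..text.len()].copy_from_slice(text.as_bytes());
        Some(Self { bytes, len: text.len() as u8 })
    }

    fn as_str(&self) -> &str {
        // The buffer was filled from a valid `&str`, so decoding cannot
        // fail; the fallback only avoids a panic path in this sketch.
        std::str::from_utf8(&self.bytes[..self.len as usize]).unwrap_or("")
    }
}
```

A value of this type is `Copy` and occupies 25 bytes, so a whole instruction built from such parts can live on the stack.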

Constant Expression Evaluator

| Precedence | Operators | Associativity |
|---|---|---|
| 1 (lowest) | \| (bitwise OR) | Left |
| 2 | ^ (bitwise XOR) | Left |
| 3 | & (bitwise AND) | Left |
| 4 | <<, >> (shift) | Left |
| 5 | +, - (add/sub) | Left |
| 6 | *, /, % (mul/div/mod) | Left |
| 7 | unary -, ~ (negate, NOT) | Right |
| 8 (highest) | atoms: numbers, constants, (expr) | |
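The precedence levels above can be implemented with precedence climbing, one common way to write a recursive-descent expression evaluator. The sketch below is an assumption about how such an evaluator might look, not the crate's code: it handles decimal literals only (no named constants), and all names (`eval`, `Eval`, `OPS`) are invented for the example.

```rust
/// Binary operators with the precedence levels from the table (1 = loosest).
/// Two-character operators come first so `<<` is not mistaken for `<`.
const OPS: [(&str, u8); 10] = [
    ("<<", 4), (">>", 4), ("|", 1), ("^", 2), ("&", 3),
    ("+", 5), ("-", 5), ("*", 6), ("/", 6), ("%", 6),
];

struct Eval<'a> {
    s: &'a [u8],
    pos: usize,
}

fn shift_amount(n: i64) -> Option<u32> {
    if (0..64).contains(&n) { Some(n as u32) } else { None }
}

impl<'a> Eval<'a> {
    fn skip_ws(&mut self) {
        while self.pos < self.s.len() && self.s[self.pos] == b' ' {
            self.pos += 1;
        }
    }

    fn peek_op(&mut self) -> Option<(&'static str, u8)> {
        self.skip_ws();
        let rest = &self.s[self.pos..];
        OPS.iter().copied().find(|(op, _)| rest.starts_with(op.as_bytes()))
    }

    fn expr(&mut self, min_prec: u8) -> Option<i64> {
        let mut lhs = self.unary()?;
        while let Some((op, prec)) = self.peek_op() {
            if prec < min_prec {
                break;
            }
            self.pos += op.len();
            // Left associativity: the right operand must bind strictly tighter.
            let rhs = self.expr(prec + 1)?;
            lhs = match op {
                "|" => lhs | rhs,
                "^" => lhs ^ rhs,
                "&" => lhs & rhs,
                "<<" => lhs.checked_shl(shift_amount(rhs)?)?,
                ">>" => lhs.checked_shr(shift_amount(rhs)?)?,
                "+" => lhs.checked_add(rhs)?,
                "-" => lhs.checked_sub(rhs)?,
                "*" => lhs.checked_mul(rhs)?,
                "/" => lhs.checked_div(rhs)?,
                _ => lhs.checked_rem(rhs)?,
            };
        }
        Some(lhs)
    }

    /// Levels 7 and 8: unary operators, parentheses, and numbers.
    fn unary(&mut self) -> Option<i64> {
        self.skip_ws();
        let c = *self.s.get(self.pos)?;
        match c {
            b'-' => { self.pos += 1; self.unary()?.checked_neg() }
            b'~' => { self.pos += 1; Some(!self.unary()?) }
            b'(' => {
                self.pos += 1;
                let v = self.expr(1)?;
                self.skip_ws();
                if self.s.get(self.pos) == Some(&b')') { self.pos += 1; Some(v) } else { None }
            }
            _ => {
                let start = self.pos;
                while self.pos < self.s.len() && self.s[self.pos].is_ascii_digit() {
                    self.pos += 1;
                }
                std::str::from_utf8(&self.s[start..self.pos]).ok()?.parse().ok()
            }
        }
    }
}

/// Evaluates a constant expression; `None` signals a syntax or arithmetic error.
fn eval(src: &str) -> Option<i64> {
    let mut e = Eval { s: src.as_bytes(), pos: 0 };
    let v = e.expr(1)?;
    e.skip_ws();
    if e.pos == e.s.len() { Some(v) } else { None }
}
```

Note the consequence of shift sitting below add/sub in the table: `1 << 2 + 1` evaluates as `1 << (2 + 1)`, the same as in C.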

AT&T / GAS Syntax Support

When Syntax::Att is active, the parser switches to AT&T operand parsing:

| Feature | AT&T Syntax | Intel Equivalent |
|---|---|---|
| Register prefix | %rax, %eax | rax, eax |
| Immediate prefix | $42, $0xFF | 42, 0xFF |
| Operand order | movq $1, %rax (src, dst) | mov rax, 1 (dst, src) |
| Memory | disp(%base, %index, scale) | [base + index*scale + disp] |
| Segment override | %fs:0x28(%rax) | fs:[rax + 0x28] |
| Mnemonic suffix | movq, addl, movb | Size from operands |
| Indirect | *%rax, *(%rax) | rax, [rax] |

Mnemonic translations: movzbl → movzx, movsbl → movsx, movslq → movsxd, cltq → cdqe, cqto → cqo, etc.
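The memory-operand mapping in the table is mostly a reordering of the same parts. A minimal sketch, assuming a hypothetical helper `att_mem_to_intel` that works on strings (the real parser works on tokens and also handles segment overrides, which this sketch does not):

```rust
/// Rewrites `disp(%base, %index, scale)` as `[base + index*scale + disp]`.
/// Hypothetical string-level helper; segment overrides are not handled.
fn att_mem_to_intel(operand: &str) -> Option<String> {
    let open = operand.find('(')?;
    let close = operand.rfind(')')?;
    if close < open {
        return None;
    }
    let disp = operand[..open].trim();
    // Inside the parentheses: base, optional index, optional scale.
    let mut parts = operand[open + 1..close]
        .split(',')
        .map(|p| p.trim().trim_start_matches('%'));
    let base = parts.next().unwrap_or("");
    let index = parts.next().unwrap_or("");
    let scale = parts.next().unwrap_or("1");

    let mut terms: Vec<String> = Vec::new();
    if !base.is_empty() {
        terms.push(base.to_string());
    }
    if !index.is_empty() {
        terms.push(format!("{}*{}", index, scale));
    }
    if !disp.is_empty() {
        terms.push(disp.to_string());
    }
    if terms.is_empty() {
        return None;
    }
    Some(format!("[{}]", terms.join(" + ")))
}
```

For example, `8(%rbx, %rcx, 4)` maps to `[rbx + rcx*4 + 8]`. A negative displacement comes out as `+ -8`, which a real implementation would normalize.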


Optimizer

The peephole optimizer runs after parsing and before encoding. When OptLevel::Size is active, it rewrites individual instructions into equivalent forms with shorter encodings.

| Pattern | Replacement | Savings |
|---|---|---|
| mov reg64, 0 / mov reg32, 0 | xor reg32, reg32 | 5–7 → 2 bytes |
| mov reg64, small_imm (0 < imm ≤ u32::MAX) | mov reg32, imm32 | 7 → 5 bytes |
| and reg64, u32_imm | and reg32, u32_imm | Saves 1 byte (REX removed) |
| and reg, reg (same register) | test reg, reg | Equivalent, better for flags |

Design: the optimizer operates at the IR Statement level. Each optimization is a pure function returning Option<Statement>, which makes new patterns easy to add.
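The "pure function returning Option<Statement>" shape can be sketched like this. The `Statement` and `Operand` types below are simplified stand-ins invented for the example, not the crate's IR, and only two of the patterns from the table are shown.

```rust
/// Simplified stand-ins for the IR; the real types are richer.
#[derive(Debug, Clone, PartialEq)]
enum Operand {
    Reg64(u8),
    Reg32(u8),
    Imm(i64),
}

#[derive(Debug, Clone, PartialEq)]
struct Statement {
    mnemonic: &'static str,
    operands: Vec<Operand>,
}

/// `mov reg, 0` becomes `xor reg32, reg32`; `None` means "pattern not matched".
fn zero_idiom(stmt: &Statement) -> Option<Statement> {
    if stmt.mnemonic != "mov" {
        return None;
    }
    match stmt.operands.as_slice() {
        [Operand::Reg64(r), Operand::Imm(0)] | [Operand::Reg32(r), Operand::Imm(0)] => {
            Some(Statement {
                mnemonic: "xor",
                operands: vec![Operand::Reg32(*r), Operand::Reg32(*r)],
            })
        }
        _ => None,
    }
}

/// `mov reg64, imm` with 0 < imm <= u32::MAX becomes `mov reg32, imm`,
/// relying on 32-bit writes zero-extending into the full 64-bit register.
fn mov_narrowing(stmt: &Statement) -> Option<Statement> {
    if stmt.mnemonic != "mov" {
        return None;
    }
    match stmt.operands.as_slice() {
        [Operand::Reg64(r), Operand::Imm(v)] if *v > 0 && *v <= i64::from(u32::MAX) => {
            Some(Statement {
                mnemonic: "mov",
                operands: vec![Operand::Reg32(*r), Operand::Imm(*v)],
            })
        }
        _ => None,
    }
}

/// Applies the first matching pass, or returns the statement unchanged.
fn optimize(stmt: Statement) -> Statement {
    let passes: [fn(&Statement) -> Option<Statement>; 2] = [zero_idiom, mov_narrowing];
    passes.iter().find_map(|pass| pass(&stmt)).unwrap_or(stmt)
}
```

Adding a pattern means writing one more function and appending it to `passes`. One caveat a real pass has to weigh: `xor` modifies the flags while `mov` does not, so the zero-idiom rewrite is only safe where the flags are not live.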


Encoder

The encoder translates one Instruction into machine code bytes.

x86-64 encoding:

Zero-allocation design:


Unified x86 Dispatch

The x86 module covers ~725 mnemonics across these encoding classes:

| Class | Examples |
|---|---|
| Fixed encoding (85) | Zero-operand instructions via const sorted table + binary search |
| ALU class (8) | ADD/OR/ADC/SBB/AND/SUB/XOR/CMP |
| Unary class | NOT/NEG/MUL/DIV/IDIV |
| Shift class | SHL/SHR/SAR/ROL/ROR/RCL/RCR |
| Condition-code class | All 16 conditions → Jcc, SETcc, CMOVcc |
| SSE/SSE2/SSE3/SSSE3/SSE4 | 100+ SIMD instructions |
| AVX/AVX2 (VEX) | 300+ instructions with FMA3, permutes, broadcasts |
| AVX-512 (EVEX) | 120+ instructions with ZMM0-ZMM31 |
| BMI1/BMI2, ADX, TSX | Specialized extensions |
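The "const sorted table + binary search" approach for fixed encodings might look like the sketch below. The table holds only a handful of entries chosen for the example, and the names `FIXED` and `encode_fixed` are assumptions; the byte values are the standard x86-64 encodings of these instructions.

```rust
/// A few zero-operand instructions, sorted by mnemonic so that
/// binary search is valid. Illustrative subset only.
const FIXED: &[(&str, &[u8])] = &[
    ("cdqe", &[0x48, 0x98]),
    ("cqo", &[0x48, 0x99]),
    ("hlt", &[0xF4]),
    ("int3", &[0xCC]),
    ("leave", &[0xC9]),
    ("nop", &[0x90]),
    ("ret", &[0xC3]),
    ("syscall", &[0x0F, 0x05]),
];

/// Looks up a fixed encoding in O(log n) without any allocation.
fn encode_fixed(mnemonic: &str) -> Option<&'static [u8]> {
    FIXED
        .binary_search_by(|(name, _)| name.cmp(&mnemonic))
        .ok()
        .map(|i| FIXED[i].1)
}
```

The table lives in read-only data and needs no initialization at startup, which suits a no_std setting. The one invariant to maintain is the sort order, which is cheap to check in a unit test.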

ARM32 Encoder

Key features:


Thumb / Thumb-2 Encoder

Shares arm.rs with the ARM32 encoder:


AArch64 Encoder

Key features:


RISC-V Encoder

Key features:


Linker

The linker collects encoded fragments and resolves references:

Fragment model:

Szymanski’s branch relaxation:

  1. All relaxable branches start as short form
  2. Each iteration checks displacements
  3. Out-of-range branches promoted to long form
  4. Growth is monotonic — once promoted, never shrinks
  5. Guaranteed convergence (max 100 iterations)
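The five steps above can be sketched as a fixed-point loop. This is a simplified model, not the crate's linker: the `Item` type is invented for the example, branch targets are given as item indices, and the short and long sizes are those of x86 `jmp rel8` and `jmp rel32`.

```rust
/// A fragment is either opaque bytes of a known length or a relaxable branch
/// whose target is the index of another item (or `items.len()` for the end).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Item {
    Bytes(usize),
    Branch { target: usize, long: bool },
}

const SHORT_LEN: usize = 2; // x86 `jmp rel8`
const LONG_LEN: usize = 5; // x86 `jmp rel32`

fn size(item: &Item) -> usize {
    match *item {
        Item::Bytes(n) => n,
        Item::Branch { long: false, .. } => SHORT_LEN,
        Item::Branch { long: true, .. } => LONG_LEN,
    }
}

/// Promotes out-of-range short branches until nothing changes.
/// Returns the number of iterations, or `None` if the cap of 100 is hit.
fn relax(items: &mut [Item]) -> Option<usize> {
    for iteration in 1..=100 {
        // Recompute every offset under the current short/long choices.
        let mut offsets = Vec::with_capacity(items.len() + 1);
        let mut pos = 0usize;
        for item in items.iter() {
            offsets.push(pos);
            pos += size(item);
        }
        offsets.push(pos);

        let mut changed = false;
        for i in 0..items.len() {
            if let Item::Branch { target, long: false } = items[i] {
                // rel8 is measured from the end of the short branch.
                let from = (offsets[i] + SHORT_LEN) as i64;
                let disp = offsets[target] as i64 - from;
                if !(-128..=127).contains(&disp) {
                    // Promotion is one-way, so sizes only ever grow.
                    items[i] = Item::Branch { target, long: true };
                    changed = true;
                }
            }
        }
        if !changed {
            return Some(iteration);
        }
    }
    None
}
```

Promoting one branch can push another out of range, which is why the loop repeats. Since each branch is promoted at most once, the number of productive iterations is bounded by the number of branches.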

Relocations — 17 relocation kinds across all architectures:

| Kind | Architecture | Description |
|---|---|---|
| X86Relative | x86 | RIP-relative displacement |
| Absolute | All | Raw LE byte write (1–8 bytes) |
| ArmBranch24 | ARM32 | B/BL 24-bit offset |
| ArmLdrLit | ARM32 | LDR literal 12-bit |
| Aarch64Jump26 | AArch64 | B/BL 26-bit offset |
| Aarch64Branch19 | AArch64 | B.cond/CBZ/CBNZ |
| Aarch64Branch14 | AArch64 | TBZ/TBNZ |
| RvJal20 | RISC-V | JAL 21-bit J-type |
| RvBranch12 | RISC-V | B-type 13-bit |
| RvAuipc20 | RISC-V | AUIPC+JALR pair |
| ThumbBranch8/11 | Thumb | 16-bit branches |
| ThumbBl/BranchW | Thumb | 32-bit branches |
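To show what applying a relocation involves, here is a sketch for two of the kinds listed: an x86 32-bit relative displacement and the AArch64 26-bit branch offset. The function names and signatures are hypothetical. The x86 version assumes the displacement is the last field of the instruction, so it is measured from the end of the 4-byte field.

```rust
/// Patches a 4-byte little-endian PC-relative displacement at `offset`.
/// Assumes the displacement field ends the instruction.
fn apply_x86_rel32(code: &mut [u8], offset: usize, target: usize) -> Option<()> {
    let next = offset.checked_add(4)?;
    let disp = target as i64 - next as i64;
    if disp < i64::from(i32::MIN) || disp > i64::from(i32::MAX) {
        return None; // target out of rel32 range
    }
    code.get_mut(offset..next)?
        .copy_from_slice(&(disp as i32).to_le_bytes());
    Some(())
}

/// Patches the imm26 field of an AArch64 B/BL instruction at `offset`.
/// The offset is relative to the branch itself and counted in 4-byte words.
fn apply_aarch64_jump26(code: &mut [u8], offset: usize, target: usize) -> Option<()> {
    let delta = target as i64 - offset as i64;
    // Must be word-aligned and fit a signed 26-bit word offset (±128 MiB).
    if delta % 4 != 0 || delta < -(1 << 27) || delta >= (1 << 27) {
        return None;
    }
    let field = code.get_mut(offset..offset.checked_add(4)?)?;
    let mut insn = u32::from_le_bytes([field[0], field[1], field[2], field[3]]);
    // Keep the 6 opcode bits, replace the low 26 bits.
    insn = (insn & 0xFC00_0000) | (((delta >> 2) as u32) & 0x03FF_FFFF);
    field.copy_from_slice(&insn.to_le_bytes());
    Some(())
}
```

Both functions return `None` rather than silently truncating when the target is out of range, which is the point at which a linker would either report an error or fall back to a longer form.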

Encoding Format Reference

x86-64 Instruction Format

┌────────┬────┬───────┬─────┬──────────────┬───────────┐
│ Prefix │REX │Opcode │ModRM│     SIB      │Disp/Imm   │
│ (opt)  │opt │1-3 B  │(opt)│    (opt)     │  (opt)    │
└────────┴────┴───────┴─────┴──────────────┴───────────┘

REX Prefix (0x40–0x4F)

  0  1  0  0  W  R  X  B
                │  │  │  └─ Extension of ModRM.rm or SIB.base
                │  │  └──── Extension of SIB.index
                │  └─────── Extension of ModRM.reg
                └────────── 64-bit operand size
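The bit layout above translates directly into a small helper. The function name `rex` is an assumption made for the example; the bit positions are those shown in the diagram.

```rust
/// Builds a REX prefix byte: 0100WRXB.
fn rex(w: bool, r: bool, x: bool, b: bool) -> u8 {
    0x40 | (u8::from(w) << 3) | (u8::from(r) << 2) | (u8::from(x) << 1) | u8::from(b)
}
```

For instance, a 64-bit operation on low registers needs only W, giving 0x48, while setting W, R, and B together gives 0x4D.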

ModR/M Byte

  mod   reg/opcode   r/m
  [7:6]   [5:3]     [2:0]

SIB Byte

  scale   index   base
  [7:6]   [5:3]   [2:0]

A SIB byte is required when RSP or R12 is the base register, or when the address uses a scaled index.
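A short sketch of how the ModR/M and SIB bytes fit together, including the RSP/R12 case. The helper names are hypothetical, and `mov_load_disp8` covers only one form, `mov r64, [base + disp8]` (opcode 8B /r), with register numbers 0 to 15.

```rust
/// Packs mod[7:6], reg[5:3], rm[2:0].
fn modrm(mode: u8, reg: u8, rm: u8) -> u8 {
    ((mode & 3) << 6) | ((reg & 7) << 3) | (rm & 7)
}

/// Packs scale[7:6], index[5:3], base[2:0]. `scale` is the 2-bit encoded value.
fn sib(scale: u8, index: u8, base: u8) -> u8 {
    ((scale & 3) << 6) | ((index & 7) << 3) | (base & 7)
}

/// Encodes `mov r64, [base + disp8]`, adding a SIB byte when needed.
/// `dst` and `base` are register numbers 0..=15 (rax = 0, rsp = 4, r12 = 12).
fn mov_load_disp8(dst: u8, base: u8, disp: i8) -> Vec<u8> {
    // REX.W for 64-bit operands; REX.R and REX.B carry bit 3 of each register.
    let rex = 0x48 | ((dst >> 3) << 2) | (base >> 3);
    let mut out = vec![rex, 0x8B];
    if base & 7 == 0b100 {
        // rm = 100 means "SIB follows", so RSP and R12 cannot be encoded
        // in ModR/M directly. index = 100 in the SIB means "no index".
        out.push(modrm(0b01, dst, 0b100));
        out.push(sib(0, 0b100, base));
    } else {
        out.push(modrm(0b01, dst, base));
    }
    out.push(disp as u8);
    out
}
```

With these helpers, `mov rax, [rsp + 8]` comes out as `48 8B 44 24 08`, one byte longer than `mov rcx, [rbx + 16]` (`48 8B 4B 10`) because of the extra SIB byte.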


Testing Strategy

The project maintains an extensive test suite across multiple categories:

Zero warnings, zero clippy warnings, Miri clean.