A pure Rust multi-architecture assembly engine for offensive security. Zero unsafe, no_std-compatible, designed for embedding in exploit compilers, JIT engines, security tools, and shellcode generators.
Deep dive into the asm-rs assembler pipeline, module responsibilities, encoding architecture, and testing strategy.
Source Text
│
▼
┌──────────────┐
│ Preprocessor │ Expands macros, loops, conditionals
└──────┬───────┘ (.macro/.rept/.irp/.if)
│
▼
┌─────────┐
│ Lexer │ Zero-copy tokenization into Token<'src>
└────┬────┘ with source spans for error reporting
│
▼
┌─────────┐
│ Parser │ Produces intermediate representation (IR)
└────┬─────┘ from token stream using Intel syntax rules
│
▼
┌───────────┐
│ Optimizer │ Peephole optimizations (zero-idiom, MOV narrowing,
└─────┬─────┘ AND→TEST conversion) when OptLevel::Size is active
│
▼
┌──────────┐
│ Encoder │ Translates IR instructions into machine code
└────┬─────┘ bytes with relocations and relax info
│
▼
┌──────────┐
│ Linker │ Resolves labels, relaxes branches (Szymanski),
└────┬─────┘ applies relocations, produces final output
│
▼
Output Bytes + Labels + Applied Relocations
The preprocessor operates on raw source text before the lexer. It performs text-level expansion of macros, loops, and conditional assembly directives.
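As a rough illustration of text-level expansion, the sketch below expands `.rept count` / `.endr` blocks on raw lines before any tokenization happens. It is not the asm-rs implementation: `expand_rept` and `expand_lines` are hypothetical names, the count must be a decimal literal, and `.irp`/`.irpc` (which also close with `.endr`) are ignored.

```rust
/// Expand every `.rept N` ... `.endr` block in `src` by repeating its body N times.
/// Hypothetical helper for illustration; nested blocks are handled by recursion.
fn expand_rept(src: &str) -> Result<String, String> {
    let lines: Vec<&str> = src.lines().collect();
    let mut out = Vec::new();
    expand_lines(&lines, &mut out)?;
    Ok(out.join("\n"))
}

fn expand_lines(lines: &[&str], out: &mut Vec<String>) -> Result<(), String> {
    let mut i = 0;
    while i < lines.len() {
        let line = lines[i].trim();
        if let Some(arg) = line.strip_prefix(".rept") {
            let count: usize = arg
                .trim()
                .parse()
                .map_err(|_| format!("bad .rept count: {:?}", arg.trim()))?;
            // Scan forward for the matching `.endr`, tracking nesting depth.
            let mut depth = 1;
            let mut j = i + 1;
            while j < lines.len() {
                let t = lines[j].trim();
                if t.starts_with(".rept") {
                    depth += 1;
                } else if t == ".endr" {
                    depth -= 1;
                    if depth == 0 {
                        break;
                    }
                }
                j += 1;
            }
            if depth != 0 {
                return Err(".rept without matching .endr".to_string());
            }
            // Re-expand the body each time so nested blocks are handled too.
            let body = &lines[i + 1..j];
            for _ in 0..count {
                expand_lines(body, out)?;
            }
            i = j + 1;
        } else {
            out.push(lines[i].to_string());
            i += 1;
        }
    }
    Ok(())
}
```

Because the expansion works purely on text, the later stages never see the directive itself, only the repeated body.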
Macro definitions (.macro / .endm):

- Parameter substitution (\param)
- Default parameter values (.macro name reg=rax)
- Variadic parameters (:vararg) for collecting remaining arguments
- The \@ counter

Repeat loops:

- .rept count / .endr: repeat body count times
- .irp symbol, value1, value2, ... / .endr: iterate, substituting symbol
- .irpc symbol, string / .endr: iterate over characters

Conditional assembly:

- .if expression / .else / .elseif expression / .endif
- .ifdef symbol / .ifndef symbol
- defined(symbol) function in expressions

Expression evaluator (recursive-descent):

- Operators: ||, &&, |, ^, &, ==, !=, <, >, <=, >=, <<, >>, +, -, *, /, %, unary !, -, ~
- Literals: 0x hex, 0b binary, 0o octal, character 'A'

Design decisions:

- Assembler.emit(): transparent to callers

The lexer performs zero-copy tokenization of assembly source text. It produces
a Vec<Token<'src>> where each token carries:
- Kind: Ident, Number, Directive, LabelDef, Comma, etc.
- Text: Cow<'src, str> borrowing directly from the source string (zero
  allocation for identifiers, numbers, directives, and punctuation; only
  string/char literals and numeric labels allocate)

Design decisions:

- Numeric values stored as i128 in the token
- Case-insensitive matching via eq_ignore_ascii_case() (zero allocation)
- Hash (#) starts comments
- Numeric labels (1:, 1b, 1f) recognized at lexer level
- #[inline] on hot helpers: parse_number_at, hex_digit

The parser consumes &[Token<'_>] and produces Vec<Statement>. It handles:

- Memory operands: [base + index*scale + disp] (stored as Box<MemoryOperand> to shrink Operand enum)
- Directives (.byte, .word, .equ, .align, .fill, .syntax, etc.)
- Segment overrides (fs:[rax], %fs:0x28(%rax))
- Syntax selection via parse_with_syntax()

The parser is a simple recursive-descent parser producing flat IR statements.
Instruction fields are fully stack-allocated: Mnemonic (inline [u8; 24]),
OperandList (inline [Operand; 6]), PrefixList (inline [Prefix; 4]) —
yielding zero heap allocations per instruction.
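To make the stack-allocated layout concrete, here is a minimal sketch of an inline mnemonic buffer along the lines of the `[u8; 24]` storage described above. `InlineMnemonic` and its methods are assumptions made for illustration and do not reflect the actual asm-rs type.

```rust
/// Hypothetical inline mnemonic: up to 24 ASCII bytes stored without heap allocation.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct InlineMnemonic {
    buf: [u8; 24],
    len: u8,
}

impl InlineMnemonic {
    /// Copies `s` into the inline buffer, lowercased. Returns `None` if it is
    /// longer than 24 bytes or contains non-ASCII characters.
    fn new(s: &str) -> Option<Self> {
        let bytes = s.as_bytes();
        if bytes.len() > 24 || !s.is_ascii() {
            return None;
        }
        let mut buf = [0u8; 24];
        for (dst, src) in buf.iter_mut().zip(bytes) {
            *dst = src.to_ascii_lowercase();
        }
        Some(Self { buf, len: bytes.len() as u8 })
    }

    /// Borrows the stored mnemonic as a string slice.
    fn as_str(&self) -> &str {
        // Only ASCII bytes are ever stored, so the conversion cannot fail in practice.
        core::str::from_utf8(&self.buf[..self.len as usize]).unwrap_or("")
    }
}
```

The type is `Copy` and uses only `core`, so it fits a no_std setting; the trade-off is a fixed upper bound on mnemonic length.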
| Precedence | Operators | Associativity |
|---|---|---|
| 1 (lowest) | \| (bitwise OR) | Left |
| 2 | ^ (bitwise XOR) | Left |
| 3 | & (bitwise AND) | Left |
| 4 | <<, >> (shift) | Left |
| 5 | +, - (add/sub) | Left |
| 6 | *, /, % (mul/div/mod) | Left |
| 7 | unary -, ~ (negate, NOT) | Right |
| 8 (highest) | atoms: numbers, constants, (expr) | n/a |
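The precedence table above maps naturally onto a precedence-climbing evaluator. The following is a minimal sketch under several assumptions: it works on `i128`, accepts only decimal and `0x` literals, has no symbol lookup, and all names (`Parser`, `eval`, `BINOPS`) are hypothetical rather than taken from asm-rs.

```rust
use std::convert::TryFrom;

/// Binary operators with their level from the table (1 = lowest precedence).
/// Two-character operators come first so that `<<` is matched as a whole.
const BINOPS: [(&str, u8); 10] = [
    ("<<", 4), (">>", 4), ("|", 1), ("^", 2), ("&", 3),
    ("+", 5), ("-", 5), ("*", 6), ("/", 6), ("%", 6),
];

struct Parser<'a> {
    src: &'a [u8],
    pos: usize,
}

impl Parser<'_> {
    fn skip_ws(&mut self) {
        while self.src.get(self.pos) == Some(&b' ') {
            self.pos += 1;
        }
    }

    fn peek_binop(&mut self) -> Option<(&'static str, u8)> {
        self.skip_ws();
        let rest = &self.src[self.pos..];
        BINOPS.iter().copied().find(|(sym, _)| rest.starts_with(sym.as_bytes()))
    }

    /// Parses operators whose level is at least `min_prec`.
    fn expr(&mut self, min_prec: u8) -> Result<i128, String> {
        let mut lhs = self.unary()?;
        while let Some((op, prec)) = self.peek_binop() {
            if prec < min_prec {
                break;
            }
            self.pos += op.len();
            // `prec + 1` on the right-hand side makes every binary operator left-associative.
            let rhs = self.expr(prec + 1)?;
            lhs = apply(op, lhs, rhs)?;
        }
        Ok(lhs)
    }

    /// Level 7 (unary -, ~, right-associative) and level 8 (atoms).
    fn unary(&mut self) -> Result<i128, String> {
        self.skip_ws();
        match self.src.get(self.pos) {
            Some(b'-') => { self.pos += 1; Ok(self.unary()?.wrapping_neg()) }
            Some(b'~') => { self.pos += 1; Ok(!self.unary()?) }
            Some(b'(') => {
                self.pos += 1;
                let v = self.expr(1)?;
                self.skip_ws();
                if self.src.get(self.pos) != Some(&b')') {
                    return Err("expected ')'".to_string());
                }
                self.pos += 1;
                Ok(v)
            }
            _ => self.number(),
        }
    }

    fn number(&mut self) -> Result<i128, String> {
        let start = self.pos;
        while self.pos < self.src.len() && self.src[self.pos].is_ascii_alphanumeric() {
            self.pos += 1;
        }
        let text = core::str::from_utf8(&self.src[start..self.pos]).map_err(|e| e.to_string())?;
        let parsed = match text.strip_prefix("0x") {
            Some(hex) => i128::from_str_radix(hex, 16),
            None => text.parse::<i128>(),
        };
        parsed.map_err(|_| format!("bad number {:?} at byte {}", text, start))
    }
}

fn apply(op: &str, a: i128, b: i128) -> Result<i128, String> {
    let shift = u32::try_from(b).ok();
    let v = match op {
        "|" => Some(a | b),
        "^" => Some(a ^ b),
        "&" => Some(a & b),
        "<<" => shift.and_then(|n| a.checked_shl(n)),
        ">>" => shift.and_then(|n| a.checked_shr(n)),
        "+" => a.checked_add(b),
        "-" => a.checked_sub(b),
        "*" => a.checked_mul(b),
        "/" => a.checked_div(b),
        "%" => a.checked_rem(b),
        _ => None,
    };
    v.ok_or_else(|| format!("cannot evaluate {} {} {}", a, op, b))
}

/// Evaluates a whole expression; trailing input is an error.
fn eval(src: &str) -> Result<i128, String> {
    let mut p = Parser { src: src.as_bytes(), pos: 0 };
    let v = p.expr(1)?;
    p.skip_ws();
    if p.pos != p.src.len() {
        return Err(format!("trailing input at byte {}", p.pos));
    }
    Ok(v)
}
```

Note that, following the table, shifts bind more loosely than addition, so `1 << 2 + 1` evaluates as `1 << (2 + 1)`.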
When Syntax::Att is active, the parser switches to AT&T operand parsing:
| Feature | AT&T Syntax | Intel Equivalent |
|---|---|---|
| Register prefix | %rax, %eax | rax, eax |
| Immediate prefix | $42, $0xFF | 42, 0xFF |
| Operand order | movq $1, %rax (src, dst) | mov rax, 1 (dst, src) |
| Memory | disp(%base, %index, scale) | [base + index*scale + disp] |
| Segment override | %fs:0x28(%rax) | fs:[rax + 0x28] |
| Mnemonic suffix | movq, addl, movb | Size from operands |
| Indirect | *%rax, *(%rax) | rax, [rax] |
Mnemonic translations: movzbl→movzx, movsbl→movsx, movslq→movsxd,
cltq→cdqe, cqto→cqo, etc.
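The memory and segment-override rows of the AT&T table can be illustrated with a small string-level converter. This is only a sketch of the mapping, not how the asm-rs parser works (which operates on tokens); `att_mem_to_intel` is a hypothetical name, and negative displacements are passed through verbatim rather than folded into a subtraction.

```rust
/// Rewrites an AT&T memory operand `seg:disp(%base,%index,scale)` as the Intel
/// form `seg:[base + index*scale + disp]`. Returns `None` for anything that is
/// not a parenthesized memory operand.
fn att_mem_to_intel(operand: &str) -> Option<String> {
    let operand = operand.trim();
    // Optional segment override such as `%fs:`.
    let (segment, rest) = match operand.split_once(':') {
        Some((seg, rest)) => (Some(seg.strip_prefix('%')?), rest),
        None => (None, operand),
    };
    let open = rest.find('(')?;
    let inner = rest[open + 1..].strip_suffix(')')?;
    let disp = rest[..open].trim();

    // Fields inside the parentheses: base, index, scale (all optional).
    let mut fields = inner.split(',').map(str::trim);
    let base = fields.next().unwrap_or("");
    let index = fields.next().unwrap_or("");
    let scale = fields.next().filter(|s| !s.is_empty()).unwrap_or("1");
    if fields.next().is_some() {
        return None;
    }

    let mut terms: Vec<String> = Vec::new();
    if !base.is_empty() {
        terms.push(base.strip_prefix('%')?.to_string());
    }
    if !index.is_empty() {
        terms.push(format!("{}*{}", index.strip_prefix('%')?, scale));
    }
    if !disp.is_empty() {
        terms.push(disp.to_string());
    }
    if terms.is_empty() {
        return None;
    }

    let body = format!("[{}]", terms.join(" + "));
    Some(match segment {
        Some(seg) => format!("{}:{}", seg, body),
        None => body,
    })
}
```

For example, `%fs:0x28(%rax)` becomes `fs:[rax + 0x28]`, matching the segment-override row.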
The peephole optimizer runs after parsing, before encoding.
It transforms individual instructions for shorter machine code when
OptLevel::Size is active.
| Pattern | Replacement | Savings |
|---|---|---|
| mov reg64, 0 / mov reg32, 0 | xor reg32, reg32 | 5–7 → 2 bytes |
| mov reg64, small_imm (0 < imm ≤ u32::MAX) | mov reg32, imm32 | 7 → 5 bytes |
| and reg64, u32_imm | and reg32, u32_imm | Saves 1 byte (REX removed) |
| and reg, reg (same register) | test reg, reg | Equivalent, better for flags |
Design: the optimizer operates at the IR Statement level. Each optimization is a
pure function returning Option<Statement>, which makes new patterns easy to add.
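The "pure function returning an Option" design can be sketched with a toy IR. Everything here (`Reg`, `Stmt`, the pass names, `optimize`) is a hypothetical stand-in for the real Statement type, and only three of the patterns from the table are modeled.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Reg {
    R64(u8),
    R32(u8),
}

/// Toy IR, far smaller than a real instruction representation.
#[derive(Clone, Debug, PartialEq, Eq)]
enum Stmt {
    MovImm { dst: Reg, imm: u64 },
    Xor { dst: Reg, src: Reg },
    And { dst: Reg, src: Reg },
    Test { a: Reg, b: Reg },
}

fn narrow(r: Reg) -> Reg {
    match r {
        Reg::R64(n) | Reg::R32(n) => Reg::R32(n),
    }
}

/// mov reg, 0  =>  xor reg32, reg32
fn zero_idiom(s: &Stmt) -> Option<Stmt> {
    match s {
        Stmt::MovImm { dst, imm: 0 } => Some(Stmt::Xor { dst: narrow(*dst), src: narrow(*dst) }),
        _ => None,
    }
}

/// mov reg64, imm (0 < imm <= u32::MAX)  =>  mov reg32, imm
fn mov_narrowing(s: &Stmt) -> Option<Stmt> {
    match s {
        Stmt::MovImm { dst: Reg::R64(n), imm } if *imm != 0 && *imm <= u32::MAX as u64 => {
            Some(Stmt::MovImm { dst: Reg::R32(*n), imm: *imm })
        }
        _ => None,
    }
}

/// and reg, reg (same register)  =>  test reg, reg
fn and_to_test(s: &Stmt) -> Option<Stmt> {
    match s {
        Stmt::And { dst, src } if dst == src => Some(Stmt::Test { a: *dst, b: *dst }),
        _ => None,
    }
}

/// Applies the first pass that matches; otherwise returns the statement unchanged.
fn optimize(s: Stmt) -> Stmt {
    const PASSES: [fn(&Stmt) -> Option<Stmt>; 3] = [zero_idiom, mov_narrowing, and_to_test];
    for pass in PASSES.iter() {
        if let Some(rewritten) = pass(&s) {
            return rewritten;
        }
    }
    s
}
```

Adding a new pattern in this shape means writing one more function and appending it to the pass list, which is the extensibility property the design note describes.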
The encoder translates one Instruction into machine code bytes.
x86-64 encoding:
Zero-allocation design:
- InstrBytes: stack-allocated [u8; 32] replaces per-instruction Vec<u8>
- FragmentBytes::Inline(InstrBytes) for instructions; Heap(Vec<u8>) only for data
- Operand::Memory(Box<MemoryOperand>): boxed to shrink Operand from 56 to ~32 bytes
- No listing overhead when enable_listing() is not called

The x86 module covers ~725 mnemonics across these encoding classes:
| Class | Examples |
|---|---|
| Fixed encoding (85) | Zero-operand instructions via const sorted table + binary search |
| ALU class (8) | ADD/OR/ADC/SBB/AND/SUB/XOR/CMP |
| Unary class | NOT/NEG/MUL/DIV/IDIV |
| Shift class | SHL/SHR/SAR/ROL/ROR/RCL/RCR |
| Condition-code class | All 16 conditions → Jcc, SETcc, CMOVcc |
| SSE/SSE2/SSE3/SSSE3/SSE4 | 100+ SIMD instructions |
| AVX/AVX2 (VEX) | 300+ instructions with FMA3, permutes, broadcasts |
| AVX-512 (EVEX) | 120+ instructions with ZMM0-ZMM31 |
| BMI1/BMI2, ADX, TSX | Specialized extensions |
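The condition-code class works because Jcc, SETcc, and CMOVcc all embed the same 4-bit condition number in the low nibble of their opcode. The sketch below uses the standard x86 opcode bases (0x70, 0F 80, 0F 90, 0F 40); the `Cond` enum and function names are illustrative assumptions, not the asm-rs API.

```rust
/// The 16 x86 condition codes in their hardware numbering (0 = O ... 15 = G).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(u8)]
enum Cond {
    O = 0, NO, B, AE, E, NE, BE, A, S, NS, P, NP, L, GE, LE, G,
}

/// Jcc rel8: opcode 0x70 + cc, followed by an 8-bit displacement.
fn jcc_short(cc: Cond, rel8: i8) -> [u8; 2] {
    [0x70 | cc as u8, rel8 as u8]
}

/// Jcc rel32: opcode 0F 80+cc, followed by a little-endian 32-bit displacement.
fn jcc_near(cc: Cond, rel32: i32) -> [u8; 6] {
    let d = rel32.to_le_bytes();
    [0x0F, 0x80 | cc as u8, d[0], d[1], d[2], d[3]]
}

/// SETcc opcode bytes: 0F 90+cc (the ModRM byte follows).
fn setcc_opcode(cc: Cond) -> [u8; 2] {
    [0x0F, 0x90 | cc as u8]
}

/// CMOVcc opcode bytes: 0F 40+cc (the ModRM byte follows).
fn cmovcc_opcode(cc: Cond) -> [u8; 2] {
    [0x0F, 0x40 | cc as u8]
}
```

With this shape, one condition table serves all three instruction families, and aliases such as JZ/JE only need to map to the same `Cond` value.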
Key features:
Shares arm.rs with ARM32 encoder:
- .thumb/.arm mode switching, .thumb_func LSB

Key features:

Key features:

- li decomposition for large constants
- .option rvc/.option norvc

The linker collects encoded fragments and resolves references:
Fragment model:
- Fragment::Fixed: fixed-size data
- Fragment::Align: dynamic alignment padding (multi-byte NOP for x86)
- Fragment::Relaxable: branches with short/long forms
- Fragment::Org: advance location counter

Szymanski’s branch relaxation:
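The general idea of branch relaxation over a fragment list can be sketched as a grow-only fixpoint: start every branch in its short form, recompute offsets, promote any branch whose displacement no longer fits, and repeat until nothing changes. This is a simplified illustration and not necessarily the exact algorithm the linker implements; `Frag`, `relax`, and the 2-byte/5-byte sizes (x86 `jmp rel8` and `jmp rel32`) are assumptions.

```rust
use std::convert::TryFrom;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Frag {
    /// Fixed-size bytes of the given length.
    Fixed(usize),
    /// A jump to the start of fragment `target` (or to the end if `target == len`).
    Jump { target: usize, long: bool },
}

const SHORT_LEN: usize = 2; // opcode + rel8
const LONG_LEN: usize = 5; // opcode + rel32

fn size(f: &Frag) -> usize {
    match f {
        Frag::Fixed(n) => *n,
        Frag::Jump { long: false, .. } => SHORT_LEN,
        Frag::Jump { long: true, .. } => LONG_LEN,
    }
}

/// Promotes short jumps to long ones until all displacements fit.
/// Returns the final start offset of every fragment plus the total size.
/// Panics if a jump targets an index beyond `frags.len()`.
fn relax(frags: &mut [Frag]) -> Vec<usize> {
    loop {
        // Lay out all fragments with their current sizes.
        let mut offsets = Vec::with_capacity(frags.len() + 1);
        let mut pos = 0usize;
        for f in frags.iter() {
            offsets.push(pos);
            pos += size(f);
        }
        offsets.push(pos);

        // Promote every short jump whose rel8 displacement is out of range.
        let mut changed = false;
        for (i, f) in frags.iter_mut().enumerate() {
            if let Frag::Jump { target, long } = f {
                if !*long {
                    let disp = offsets[*target] as i64 - (offsets[i] + SHORT_LEN) as i64;
                    if i8::try_from(disp).is_err() {
                        *long = true;
                        changed = true;
                    }
                }
            }
        }
        if !changed {
            return offsets;
        }
    }
}
```

Because jumps only ever grow, the loop terminates after at most one pass per jump, and growing one jump can push another out of range, which is why the layout is recomputed on every pass.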
Relocations — 17 relocation kinds across all architectures:
| Kind | Architecture | Description |
|---|---|---|
| X86Relative | x86 | RIP-relative displacement |
| Absolute | All | Raw LE byte write (1–8 bytes) |
| ArmBranch24 | ARM32 | B/BL 24-bit offset |
| ArmLdrLit | ARM32 | LDR literal 12-bit |
| Aarch64Jump26 | AArch64 | B/BL 26-bit offset |
| Aarch64Branch19 | AArch64 | B.cond/CBZ/CBNZ |
| Aarch64Branch14 | AArch64 | TBZ/TBNZ |
| RvJal20 | RISC-V | JAL 21-bit J-type |
| RvBranch12 | RISC-V | B-type 13-bit |
| RvAuipc20 | RISC-V | AUIPC+JALR pair |
| ThumbBranch8/11 | Thumb | 16-bit branches |
| ThumbBl/BranchW | Thumb | 32-bit branches |
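As an illustration of what applying a relocation involves, the sketch below patches two of the simpler kinds from the table into an output buffer. The function names echo the relocation kinds but are hypothetical, and the relative variant assumes the 32-bit displacement is the last field of the instruction, so that the next instruction starts right after it.

```rust
use std::convert::TryFrom;

/// Patches a 32-bit PC-relative displacement at `field`. x86 measures the
/// displacement from the address of the next instruction, assumed here to be
/// `field + 4`.
fn apply_x86_relative(code: &mut [u8], field: usize, target: usize) -> Result<(), String> {
    let next_ip = field + 4;
    let disp = target as i64 - next_ip as i64;
    let disp = i32::try_from(disp).map_err(|_| format!("displacement {} out of range", disp))?;
    let slot = code
        .get_mut(field..field + 4)
        .ok_or("relocation outside buffer")?;
    slot.copy_from_slice(&disp.to_le_bytes());
    Ok(())
}

/// Writes an unsigned `value` as `width` little-endian bytes (1 to 8) at `field`.
fn apply_absolute(code: &mut [u8], field: usize, width: usize, value: u64) -> Result<(), String> {
    if !(1..=8).contains(&width) {
        return Err(format!("unsupported width {}", width));
    }
    if width < 8 && (value >> (8 * width)) != 0 {
        return Err(format!("value {:#x} does not fit in {} bytes", value, width));
    }
    let slot = code
        .get_mut(field..field + width)
        .ok_or("relocation outside buffer")?;
    slot.copy_from_slice(&value.to_le_bytes()[..width]);
    Ok(())
}
```

The fixed-width RISC and Thumb kinds differ mainly in how the computed offset is split and scattered across instruction bit fields, rather than in the offset arithmetic itself.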
┌────────┬────┬───────┬─────┬──────────────┬───────────┐
│ Prefix │REX │Opcode │ModRM│ SIB │Disp/Imm │
│ (opt) │opt │1-3 B │(opt)│ (opt) │ (opt) │
└────────┴────┴───────┴─────┴──────────────┴───────────┘
0 1 0 0 W R X B
│ │ │ └─ Extension of ModRM.rm or SIB.base
│ │ └──── Extension of SIB.index
│ └─────── Extension of ModRM.reg
└────────── 64-bit operand size
mod reg/opcode r/m
[7:6] [5:3] [2:0]

- mod=00: [r/m] (no displacement)
- mod=01: [r/m + disp8]
- mod=10: [r/m + disp32]
- mod=11: register direct

scale index base
[7:6] [5:3] [2:0]
Required when using RSP/R12 as base or scaled index addressing.
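The byte layouts above can be tied together with a small encoding sketch for two forms of 64-bit MOV. The helper names are hypothetical and the sketch returns a `Vec<u8>` for brevity, unlike the inline buffers described earlier; the opcodes 0x89 (MOV r/m64, r64) and 0x8B (MOV r64, r/m64) are the standard x86-64 ones. Registers are given by their hardware number (rax = 0, rbx = 3, rsp = 4, r8 = 8, r12 = 12).

```rust
/// REX prefix: 0100WRXB.
fn rex(w: bool, r: bool, x: bool, b: bool) -> u8 {
    0x40 | ((w as u8) << 3) | ((r as u8) << 2) | ((x as u8) << 1) | (b as u8)
}

/// ModRM byte: mod [7:6], reg/opcode [5:3], r/m [2:0]. Only the low 3 bits of
/// each register number are used; the fourth bit goes into REX.
fn modrm(mode: u8, reg: u8, rm: u8) -> u8 {
    ((mode & 3) << 6) | ((reg & 7) << 3) | (rm & 7)
}

/// SIB byte: scale [7:6], index [5:3], base [2:0].
fn sib(scale_log2: u8, index: u8, base: u8) -> u8 {
    ((scale_log2 & 3) << 6) | ((index & 7) << 3) | (base & 7)
}

/// mov dst, src with two 64-bit registers (opcode 0x89, mod = 11).
fn mov_r64_r64(dst: u8, src: u8) -> Vec<u8> {
    vec![rex(true, src >= 8, false, dst >= 8), 0x89, modrm(0b11, src, dst)]
}

/// mov dst, [base + disp8] (opcode 0x8B, mod = 01).
fn mov_r64_mem_disp8(dst: u8, base: u8, disp: i8) -> Vec<u8> {
    let mut out = vec![rex(true, dst >= 8, false, base >= 8), 0x8B];
    if base & 7 == 0b100 {
        // RSP/R12 as base: r/m = 100 selects a SIB byte, and index = 100 means "no index".
        out.push(modrm(0b01, dst, 0b100));
        out.push(sib(0, 0b100, base));
    } else {
        out.push(modrm(0b01, dst, base));
    }
    out.push(disp as u8);
    out
}
```

For instance, `mov_r64_r64(0, 3)` yields `48 89 D8` (mov rax, rbx), while `mov_r64_mem_disp8(0, 4, 8)` yields `48 8B 44 24 08` (mov rax, [rsp + 8]), where the extra `24` is the SIB byte forced by using RSP as the base.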
The project maintains an extensive test suite across multiple categories:
- cargo-fuzz targets for all architectures

Zero warnings, zero clippy warnings, Miri clean.