The fxsave and fxrstor instructions

The IA-32 instruction set has some darn cool instructions.

The fxsave/fxrstor pair, for example, works on a 512-byte memory area that we can treat as a stack:

Info
The fxsave instruction saves the current state of the x87 FPU, MMX technology, XMM, and MXCSR registers to a 512-byte memory location specified in the destination operand.
Info
The fxrstor instruction reloads the x87 FPU, MMX technology, XMM, and MXCSR registers from the 512-byte memory image specified in the source operand. The manual also states that “this data should have been written to memory previously using the FXSAVE instruction”.
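
To get a feel for that 512-byte image, here is a minimal sketch, assuming Linux x86-64 with NASM and ld. The names `fxarea`, `mxcsr_now` and the `FX_*` constants are mine, not from the manual. It saves the state with fxsave and checks that the MXCSR copy at offset 0x18 matches what stmxcsr reports; the exit status is 0 when they agree.

```nasm
; Assumed build: nasm -f elf64 layout.asm -o layout.o && ld layout.o -o layout
default rel                   ; RIP-relative addressing for memory operands

; Offsets inside the FXSAVE image (names are illustrative)
FX_MXCSR equ 0x18             ; MXCSR copy
FX_ST0   equ 0x20             ; ST0/MM0, one 16-byte slot per register
FX_XMM0  equ 0xa0             ; XMM0, one 16-byte slot per register

section .bss
  alignb 16                   ; fxsave/fxrstor need a 16-byte aligned operand
  fxarea    resb 512
  mxcsr_now resd 1

section .text
  global _start

_start:
  fxsave  [fxarea]            ; dump the x87/MMX/XMM/MXCSR state
  stmxcsr [mxcsr_now]         ; read MXCSR directly for comparison

  mov     eax, [fxarea + FX_MXCSR]
  xor     edi, edi            ; exit status 0: both copies agree
  cmp     eax, [mxcsr_now]
  je      .done
  mov     edi, 1              ; exit status 1: mismatch
.done:
  mov     eax, 60             ; sys_exit
  syscall
```

After running it, `echo $?` should print 0. `FX_ST0` and `FX_XMM0` are not used by the check; they are listed because the rest of this post relies on the xmm0 slot at 0xa0.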

The save and restore instructions allow us to do some cool tricks:

  • save a “large” amount of data on the stack
  • swap register values (not necessarily the way fxch does)
  • pack data from multiple registers
  • unpack data into multiple registers
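
The register swap from the list above could look like this. It is only a sketch, assuming Linux x86-64 with NASM and ld; `val_a`, `val_b` and `fxarea` are hypothetical names. It saves the state, swaps the xmm0 and xmm1 slots (offsets 0xa0 and 0xb0) in memory, restores, and exits with status 0 if xmm0 now holds what xmm1 had.

```nasm
; Assumed build: nasm -f elf64 swap.asm -o swap.o && ld swap.o -o swap
default rel

section .data
  align 16
  val_a dq 0x1111111111111111, 0x2222222222222222
  val_b dq 0x3333333333333333, 0x4444444444444444

section .bss
  alignb 16                   ; fxsave/fxrstor need a 16-byte aligned operand
  fxarea resb 512

section .text
  global _start

_start:
  movdqa  xmm0, [val_a]
  movdqa  xmm1, [val_b]
  fxsave  [fxarea]

  ; swap the two 16-byte slots, 8 bytes at a time
  mov     rax, [fxarea + 0xa0]
  mov     rdx, [fxarea + 0xb0]
  mov     [fxarea + 0xa0], rdx
  mov     [fxarea + 0xb0], rax
  mov     rax, [fxarea + 0xa8]
  mov     rdx, [fxarea + 0xb8]
  mov     [fxarea + 0xa8], rdx
  mov     [fxarea + 0xb8], rax

  fxrstor [fxarea]            ; xmm0 and xmm1 come back exchanged

  ; check: every byte of xmm0 must equal val_b
  pcmpeqb  xmm0, [val_b]
  pmovmskb eax, xmm0          ; one mask bit per byte, 0xffff if all match
  xor      edi, edi
  cmp      eax, 0xffff
  je       .done
  mov      edi, 1
.done:
  mov      eax, 60            ; sys_exit
  syscall
```

Packing and unpacking work the same way: write or read the slots with ordinary mov instructions between the fxsave and the fxrstor.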

I will now show how you can save some code on that stack and later restore it into registers for further execution.

We first need some code and data to re-use:

; the bytes on the right are the encodings; "|" marks a 16-byte (xmm-sized) chunk boundary
section .data
  align 64
  regsave times 0x200 db 0x90

  msg db "hello",0xa,0x0

section .text
  global _start

exit_0:
  mov eax, 1    ; b8 01 00 00 00
  mov edi, 1    ; bf 01 00 00 00
  mov rsi, msg  ; 48 be 00 00 00 00|00 00 00 00
  mov edx, 7    ; ba 07 00 00 00
  syscall       ; 0f 05

  xor rdi, rdi  ; 48 31 ff
  mov rax, 60   ; b8 3c|00 00 00
  syscall       ; 0f 05

exit_1:
  mov edi, 1    ; bf 01 00 00 00
  mov eax, 0x3c ; b8 3c 00 00 00
  syscall       ; 0f 05

Now we copy the code into the xmm registers and store them in the regsave area:

_start:
  ; load the code into xmm0-xmm2 in 128-bit chunks
  movdqu xmm0, [exit_0 + 0x10 * 0]
  movdqu xmm1, [exit_0 + 0x10 * 1]
  movdqu xmm2, [exit_0 + 0x10 * 2]

  ; dump the registers into the regsave area (xmm0 lands at offset 0xa0)
  fxsave [regsave]
Note
Compilers use the xmm registers a lot, for instance to inline small memcpy calls, so you might prefer to copy the exit_0 code into registers other than the xmm0, xmm1 and xmm2 used here.

At that point, regsave+0xa0 contains the exit_0 function across the saved xmm0, xmm1 and xmm2 registers:

0x4030a0:       0xb8    0x1     0x0     0x0     0x0     0xbf    0x1     0x0
0x4030a8:       0x0     0x0     0x48    0xbe    0x0     0x32    0x40    0x0
0x4030b0:       0x0     0x0     0x0     0x0     0xba    0x7     0x0     0x0
0x4030b8:       0x0     0xf     0x5     0x48    0x31    0xff    0xb8    0x3c
0x4030c0:       0x0     0x0     0x0     0xf     0x5     0xbf    0x1     0x0
0x4030c8:       0x0     0x0     0xb8    0x3c    0x0     0x0     0x0     0xf

We now have a copy of the exit_0 function that you can execute. If you cannot execute it right away, you can use fxrstor to reload the registers and craft an execution from there. Here are some ways to do it:

  ; restore registers
  fxrstor  [regsave]

  ; execute from the regsave data (the page must be executable)
  mov rax, regsave
  add rax, 0xa0 ; xmm0 offset
  push rax
  ret

or:

  ; restore registers
  fxrstor  [regsave]

  ; rebuild the code on the stack from the registers (needs an executable stack)
  sub     rsp, 0x10
  movdqu  [rsp], xmm2
  sub     rsp, 0x10
  movdqu  [rsp], xmm1
  sub     rsp, 0x10
  movdqu  [rsp], xmm0
  jmp rsp
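
On a typical Linux setup neither the data section nor the stack is executable, so both variants fault as written. Here is one way around it for the first variant. It is a sketch under my own assumptions: Linux x86-64, NASM and ld, and a system that allows a page to be made readable, writable and executable with mprotect (syscall 10). The `payload` label and the page-aligned section are my choices, not part of the code above.

```nasm
; Assumed build: nasm -f elf64 run.asm -o run.o && ld run.o -o run
default rel

section .data align=4096         ; regsave starts on a page boundary
  regsave times 0x200 db 0x90
  msg     db "hello", 0xa

section .text
  global _start

payload:                         ; safe to copy: msg is a 64-bit absolute address
  mov eax, 1                     ; sys_write
  mov edi, 1                     ; stdout
  mov rsi, msg
  mov edx, 6                     ; length of "hello\n"
  syscall
  xor edi, edi
  mov eax, 60                    ; sys_exit(0)
  syscall
  times 48 - ($ - payload) db 0x90   ; pad to three 16-byte chunks

_start:
  movdqu xmm0, [payload + 0x10 * 0]
  movdqu xmm1, [payload + 0x10 * 1]
  movdqu xmm2, [payload + 0x10 * 2]
  fxsave [regsave]               ; payload now sits at regsave+0xa0

  ; mprotect(regsave, 4096, PROT_READ|PROT_WRITE|PROT_EXEC)
  mov eax, 10                    ; sys_mprotect
  lea rdi, [regsave]
  mov esi, 4096
  mov edx, 7
  syscall
  test rax, rax
  jnz .fail                      ; non-zero return means the kernel refused

  lea rax, [regsave + 0xa0]      ; xmm0 slot
  jmp rax                        ; run the saved copy

.fail:
  mov edi, 1
  mov eax, 60                    ; sys_exit(1)
  syscall
```

If mprotect succeeds, the program prints hello from the copy inside regsave and exits with status 0; a hardened system may refuse the call, in which case the exit status is 1.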
Tip
The x87 FPU also works as a register stack (or barrel). You might find the fld/fstp instructions useful.
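
As a small illustration of that stack behaviour, here is a sketch (again assuming Linux x86-64 with NASM and ld; `a`, `b`, `out0` and `out1` are hypothetical names) that pushes two values with fld and pops them with fstp. The values come back in reverse order, and the exit status is 0 when they do.

```nasm
; Assumed build: nasm -f elf64 x87.asm -o x87.o && ld x87.o -o x87
default rel

section .data
  a dq 1.5
  b dq 2.5

section .bss
  out0 resq 1
  out1 resq 1

section .text
  global _start

_start:
  fld   qword [a]       ; st0 = 1.5
  fld   qword [b]       ; st0 = 2.5, st1 = 1.5
  fstp  qword [out0]    ; pops 2.5 (last in, first out)
  fstp  qword [out1]    ; pops 1.5

  xor   edi, edi        ; exit status 0: popped in reverse order
  mov   rax, [out0]
  cmp   rax, [b]
  jne   .fail
  mov   rax, [out1]
  cmp   rax, [a]
  je    .done
.fail:
  mov   edi, 1
.done:
  mov   eax, 60         ; sys_exit
  syscall
```

Both constants are exactly representable, so the round trip through the 80-bit x87 registers returns the same bit patterns.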