ZydisInfo – the disassembler that breaks the code, twice

The moment I heard of machine code and its opcodes… I fell in love. Being able to understand machine code from just looking at the binary (okay, mostly its hexadecimal representation) felt like magic. And since many simple x86 assembly instructions are quite easy to decipher, I really liked the fact I could not only ‘read some of the code’ by just looking at binary, but also use that knowledge to patch code here and there, too.

Of course, today everyone knows about nopping code with 0x90, or changing the conditional jumps 0x74 and 0x75 to 0xEB, but back then it was something special. Unfortunately, once you learn the basics, this feeling doesn’t last long, because the opcodes got… complicated, and they did so pretty quickly, too. The FPU, MMX, SSEn, and AVXn instructions are not for the faint-hearted, and it takes a lot of effort to understand them on a mathematical level, let alone memorize their opcodes. On top of that, new CPUs arrived, bytecode in many different forms is a thing, and we also have code virtualizers, so now it’s really prohibitive to even think of learning all of it… unless you are a dedicated low-level code fan.
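For readers who have never done it, the two classic patches mentioned above can be sketched in a few lines of Python. This is only an illustration on an in-memory buffer: the helper names `force_jump` and `nop_out` and the sample bytes are made up for this example, and a real patch would also need the right file offset.

```python
NOP = 0x90        # one-byte NOP
JZ_SHORT = 0x74   # JZ/JE rel8
JNZ_SHORT = 0x75  # JNZ/JNE rel8
JMP_SHORT = 0xEB  # JMP rel8


def force_jump(code: bytes, offset: int) -> bytes:
    """Turn a short JZ/JNZ at `offset` into an unconditional short JMP."""
    if code[offset] not in (JZ_SHORT, JNZ_SHORT):
        raise ValueError(f"no short JZ/JNZ at offset {offset}")
    patched = bytearray(code)
    patched[offset] = JMP_SHORT  # the rel8 displacement byte stays as it was
    return bytes(patched)


def nop_out(code: bytes, offset: int, length: int) -> bytes:
    """Overwrite `length` bytes starting at `offset` with NOPs."""
    if offset < 0 or offset + length > len(code):
        raise ValueError("range falls outside the buffer")
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return bytes(patched)


# Hypothetical sample: test eax, eax / jz +5 / five filler bytes
sample = bytes.fromhex("85 C0 74 05 11 22 33 44 55")
print(force_jump(sample, 2).hex(" "))  # the jump is now always taken
print(nop_out(sample, 2, 2).hex(" "))  # the jump is never taken
```

Both patches keep the instruction length unchanged, which is why they are so convenient: nothing after the patched bytes has to move.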

Still, even in 2023 it really helps to know some of the most important opcodes, at least in the x86/x64 world. Malware uses many tricks to obfuscate code, abuses opcodes to force incorrect disassembly, or triggers exceptions on undocumented instructions. Patching is also still a thing, and knowing at least a subset of the most popular opcodes helps to quickly understand what is going on. For example, if some random routine is looking for specific byte values that correspond to known opcodes, it’s really handy to recognize them and quickly make an educated guess that we are looking at some sort of length disassembler, or a hooking/unhooking routine…
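To make the hooking example concrete, here is a rough Python sketch of the kind of byte comparisons such a routine tends to contain. The function name `guess_hook` and the choice of patterns are assumptions made for illustration; the list covers a few common jump-style stubs and is by no means exhaustive.

```python
from typing import Optional


def guess_hook(prologue: bytes) -> Optional[str]:
    """Return a label if the first bytes of a function look like a jump stub."""
    if prologue[:1] == b"\xE9":
        return "jmp rel32"
    if prologue[:2] == b"\xFF\x25":
        return "jmp [mem]"
    # 68 imm32 C3: push an address, then return into it
    if len(prologue) >= 6 and prologue[0] == 0x68 and prologue[5] == 0xC3:
        return "push imm32; ret"
    # 48 B8 imm64 FF E0: load a 64-bit address into rax, then jump to it
    if prologue[:2] == b"\x48\xB8" and prologue[10:12] == b"\xFF\xE0":
        return "mov rax, imm64; jmp rax"
    return None


print(guess_hook(bytes.fromhex("E9 10 20 30 40")))  # a jump stub
print(guess_hook(bytes.fromhex("48 89 5C 24 08")))  # mov [rsp+8], rbx, an ordinary prologue
```

Seeing constants such as 0xE9, 0x25FF or 0xB848 compared against the start of a function in someone else's code is a strong hint of exactly this sort of check.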

Let’s admit it though – we can’t learn it all, so, it’s time to cheat a bit and then hopefully win some…

Knowing how complicated all of this became, for a long time I dreamed of a tool that takes a series of bytes, interprets it as code, and breaks it down into smaller chunks where the respective parts of the alleged machine instruction are clearly deconstructed, described, and represented: the prefixes, the opcode itself, the operation direction, the size of the argument, the MOD, REG, and R/M fields, the SIB byte, the IMM and DISP parts, etc., all extracted and presented to the user in a nice way…
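For a taste of what such a breakdown involves, the bit fields of the ModRM and SIB bytes can be pulled apart with a few shifts and masks. The Python below is a minimal sketch with made-up helper names; the direction/width helper only makes sense for the classic opcodes that actually carry those two bits (for example 0x88 to 0x8B), which is an assumption the caller has to check.

```python
from typing import NamedTuple


class ModRM(NamedTuple):
    mod: int  # bits 7-6: addressing mode
    reg: int  # bits 5-3: register operand (or opcode extension)
    rm: int   # bits 2-0: register/memory operand


class SIB(NamedTuple):
    scale: int  # multiplier 1, 2, 4 or 8 (decoded from bits 7-6)
    index: int  # bits 5-3: index register number
    base: int   # bits 2-0: base register number


def split_modrm(b: int) -> ModRM:
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte")
    return ModRM(mod=b >> 6, reg=(b >> 3) & 0b111, rm=b & 0b111)


def split_sib(b: int) -> SIB:
    if not 0 <= b <= 0xFF:
        raise ValueError("expected a single byte")
    return SIB(scale=1 << (b >> 6), index=(b >> 3) & 0b111, base=b & 0b111)


def direction_and_width(opcode: int) -> tuple:
    """Return (d, w): d=1 means REG is the destination, w=0 means 8-bit operands."""
    return (opcode >> 1) & 1, opcode & 1


print(split_modrm(0xD8))          # as in 89 D8, mov eax, ebx
print(split_sib(0x88))
print(direction_and_width(0x8B))  # MOV r32, r/m32
```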

After thinking about it for a long time, I finally asked about a tool like this only last week…

Thanks to Steve Eckels, we now know that such a tool does exist! It’s called ZydisInfo, and it was created by Joel Höner et al. (with Florian Bernd creating most of ZydisInfo, as per this tweet).

Over the last few days I spent some time playing around with ZydisInfo and I am really impressed. This is a fantastic educational tool that many students and assembler lovers will find absolutely delightful to work with.

Let’s see a few examples:

ZydisInfo -64 "90" (NOP)

no surprise here…

ZydisInfo -64 "74 01" (short conditional jump, JZ)

no surprise here either…

ZydisInfo -64 "67 8B 04 C1" (mov eax, dword ptr ds:[ecx+eax*8])

a more complicated example and it still works like a charm…
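To see where that operand comes from, the same four bytes can be decoded by hand. The Python below is a deliberately narrow sketch, not a disassembler: the function `decode_mov_sib` is a made-up name, and it only handles opcode 8B in 64-bit mode with a ModRM byte that announces a SIB byte, an optional 0x67 prefix, and no displacement. It also leaves out the `ds:` segment shown above.

```python
REG32 = ("eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi")
REG64 = ("rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi")


def decode_mov_sib(code: bytes) -> str:
    """Decode [67] 8B ModRM SIB for the mod=00, rm=100 case only."""
    pos = 0
    addr_regs = REG64                    # default address size in 64-bit mode
    if code[:1] == b"\x67":              # address-size override prefix
        addr_regs = REG32
        pos = 1
    if len(code) < pos + 3:
        raise ValueError("input too short")
    if code[pos] != 0x8B:                # MOV r32, r/m32
        raise NotImplementedError("only opcode 8B is handled")
    modrm, sib = code[pos + 1], code[pos + 2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if mod != 0 or rm != 4:              # rm=100 means a SIB byte follows
        raise NotImplementedError("only mod=00, rm=100 is handled")
    scale, index, base = 1 << (sib >> 6), (sib >> 3) & 7, sib & 7
    if index == 4 or base == 5:          # special encodings: no index / disp32
        raise NotImplementedError("no-index and disp32 forms are not handled")
    return (f"mov {REG32[reg]}, "
            f"dword ptr [{addr_regs[base]}+{addr_regs[index]}*{scale}]")


print(decode_mov_sib(bytes.fromhex("67 8B 04 C1")))
```

Here ModRM 0x04 splits into mod=00, reg=000 (eax), rm=100, and SIB 0xC1 into scale=8, index=000 (eax), base=001 (ecx), giving `mov eax, dword ptr [ecx+eax*8]`. Dropping the 0x67 prefix switches the address registers to their 64-bit names.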

Isn’t that cool?

Joel et al., you really killed it! Touché!

Documenting the undocumented – Excel’s SaveAs method…

A few days ago @kernelv0id asked about an undocumented Excel format that he observed being used by one of the payloads he was analysing. He saw a malicious .xlsb file dropping a file that was being saved with a file format equal to 3. For those who don’t know, the Excel API ‘SaveAs’ takes a bunch of arguments, including a file name and the file format that we want the file to be saved in. According to this page, number ‘3’ is undocumented.

This triggered my interest so I quickly tested what that saved file may look like. To my surprise, it was just a TAB-separated text file!

A-ha.

This gave me an excuse to write a simple test macro that tries running the ‘SaveAs’ method with all the file formats from 0 to 62:

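Sub x()
    Dim i As Long, f As String
    ' Formats that Excel refuses to save are skipped instead of stopping the loop
    On Error Resume Next
    For i = 0 To 62
        ' Zero-pad the number so the output files are named out\00 .. out\62
        f = "out\" & Format(i, "00")
        ActiveWorkbook.SaveAs Filename:=f, FileFormat:=i
    Next i
End Sub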

and to cross-reference the results with the documented file formats, which led me to this final table, sorted by file format constant.

The TSV, PDF, and XPS formats are great to see… Why hasn’t Microsoft documented these yet?

I believe the Office suite hides a lot of secrets from us. It’s time to start digging!