MZ File format flavors & malware

Analyzing files that start with the ‘MZ’ magic value is the “daily bread” of reverse engineers. The reason is pretty simple: if you look at the first bytes of your average Windows executable, you will notice that the majority of them start with these 2 magic letters. Since it’s the most common file format that malware analysts work with, in this post I will take a deeper (but still high-level) look at files of this type.

There are so many types of executables starting with ‘MZ’ that looking at the first 2 bytes is often not enough. In fact, there are so many flavors of MZ files that it’s pretty hard to list them all, but let’s try anyway:

  • 16-bit, 32-bit and 64-bit executables
  • PC and mobile executables
  • x86, x64 (AMD64), IA64, etc.
  • .NET
  • Executables for Windows 3.1 and Windows 9x/NT (‘NE’ vs. ‘PE’)
  • Drivers for Windows 3.1/Windows 9x and Windows NT (‘LE’ vs. ‘PE’)
  • GUI applications and console applications
  • User mode executables (processes, services; usually saved as files with the .exe or .scr extension) and Dynamic-Link Libraries (usually saved as files with the .dll extension; others are saved as .cpl, .ocx, .vbx, etc.)
  • User mode executables (processes) and services (service processes)
  • Kernel mode drivers (.sys, .drv) and kernel mode libraries (also saved with a .sys file extension)
  • Standard DLLs and COM DLLs (e.g. ActiveX, Browser Helper Objects)
  • Standard DLLs and Service DLLs (loaded by svchost.exe)
  • Dedicated DLL files (e.g. LSP, Shell extensions, deskbands, Plugins, MSGINA, windows hooks, etc.)
  • Old-school standalone executables (‘DOS type’)
  • Files produced by various compilers: Microsoft Visual Studio, Borland Delphi, Visual Basic, mingw32, gcc and many more.
  • Files produced by various script compilers e.g. perl2exe, py2exe, php2exe, AutoIt, WinBatch, etc.
  • Installers e.g. Nullsoft, InnoSetup, Wise, Vyse, etc.
  • Resource-only files e.g. fonts
  • Executables with overlays
  • Executables with appended data
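
Several of the distinctions above (DOS vs. NE vs. LE vs. PE, 32-bit vs. 64-bit, GUI vs. console, EXE vs. DLL) can be read straight from the headers. Here is a minimal sketch of that idea, not a full parser. The function name classify_mz and the small lookup tables are my own and deliberately incomplete. It assumes the whole file is already in memory, and that the field at offset 0x3C is meaningful, which is not guaranteed for old DOS executables.

```python
import struct

# Signatures that may sit at the offset stored in e_lfanew (file offset 0x3C).
SECONDARY = {b"NE": "NE", b"LE": "LE", b"LX": "LX"}
# Small, deliberately incomplete lookup tables.
MACHINES = {0x014C: "x86", 0x8664: "x64 (AMD64)", 0x0200: "IA64", 0x01C0: "ARM"}
SUBSYSTEMS = {1: "native", 2: "GUI", 3: "console"}
IMAGE_FILE_DLL = 0x2000


def classify_mz(data: bytes) -> dict:
    """Return a rough description of an MZ file held in memory."""
    if data[:2] != b"MZ":
        return {"format": "not MZ"}
    if len(data) < 0x40:
        # Too short to hold e_lfanew; treat it as a plain DOS executable.
        return {"format": "DOS"}
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    sig = data[e_lfanew:e_lfanew + 4]
    if sig[:2] in SECONDARY:
        return {"format": SECONDARY[sig[:2]]}
    if sig != b"PE\0\0":
        # No recognizable secondary header: assume an old-school DOS file.
        return {"format": "DOS"}
    opt = e_lfanew + 24  # optional header follows signature + COFF header
    if len(data) < opt + 70:
        return {"format": "PE", "note": "truncated headers"}
    (machine,) = struct.unpack_from("<H", data, e_lfanew + 4)
    (flags,) = struct.unpack_from("<H", data, e_lfanew + 22)
    (magic,) = struct.unpack_from("<H", data, opt)
    (subsystem,) = struct.unpack_from("<H", data, opt + 68)
    return {
        "format": "PE",
        "machine": MACHINES.get(machine, hex(machine)),
        "bits": {0x10B: 32, 0x20B: 64}.get(magic),  # PE32 vs. PE32+
        "subsystem": SUBSYSTEMS.get(subsystem, str(subsystem)),
        "dll": bool(flags & IMAGE_FILE_DLL),
    }
```

Calling classify_mz(open(path, "rb").read()) on a sample gives a first, coarse bucket. Everything else on the list (compiler, installer type, .NET, and so on) needs further checks on top of it.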

From a malware analysis point of view, we also have to include another categorization, one closely related to the “extra” file properties often added by malware authors, including:

  • compression (packing)
  • encryption
  • wrapping
  • obfuscation
  • protection
  • corruption
  • virtualization
  • misleading information
  • anti-techniques
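
A common first hint that a file (or a single section) is compressed or encrypted is its byte entropy: packed or encrypted data tends to sit close to 8 bits per byte, while plain code and text usually score noticeably lower. The sketch below computes Shannon entropy over a byte string. The names shannon_entropy and looks_packed are mine, and the 7.0 threshold is only an assumption for illustration, not a reliable rule.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def looks_packed(data: bytes, threshold: float = 7.0) -> bool:
    """Rough heuristic only; the threshold is an assumed value."""
    return shannon_entropy(data) >= threshold
```

High entropy on its own proves nothing (embedded archives or images score high too), so this is best treated as a reason to look closer rather than as a verdict.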

Finally, we can classify files by the presence and content of the following metadata:

  • Rich header
  • Number of Sections
  • Characteristics of Sections (writable, readable, executable, etc.)
  • Characteristics of import and export tables
  • Debugging information (including timestamps and paths to .PDB files)
  • Resources information
  • Digital signatures
  • Appended data
  • Compiler-specific information, e.g. debug information, or PACKAGEINFO for Delphi applications
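
Some of this metadata is cheap to extract. As a rough illustration for PE files only, the sketch below walks the section table to get the number of sections and their read/write/execute flags, and estimates how much data is appended after the last section. The names list_sections and appended_size are my own. Note the assumption baked into the second one: anything past the last section counts as appended data, which also includes an embedded digital signature if there is one.

```python
import struct

IMAGE_SCN_MEM_EXECUTE = 0x20000000
IMAGE_SCN_MEM_READ = 0x40000000
IMAGE_SCN_MEM_WRITE = 0x80000000


def list_sections(data: bytes) -> list:
    """Return name, raw location and permissions for each PE section."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ file")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("not a PE file")
    (count,) = struct.unpack_from("<H", data, e_lfanew + 6)
    (opt_size,) = struct.unpack_from("<H", data, e_lfanew + 20)
    table = e_lfanew + 24 + opt_size  # section table follows optional header
    sections = []
    for i in range(count):
        # Each section header is 40 bytes long.
        (name, _vsize, _vaddr, raw_size, raw_ptr,
         _, _, _, _, flags) = struct.unpack_from("<8sIIIIIIHHI", data, table + i * 40)
        perms = (
            ("r" if flags & IMAGE_SCN_MEM_READ else "-")
            + ("w" if flags & IMAGE_SCN_MEM_WRITE else "-")
            + ("x" if flags & IMAGE_SCN_MEM_EXECUTE else "-")
        )
        sections.append({
            "name": name.rstrip(b"\0").decode("ascii", "replace"),
            "raw_ptr": raw_ptr,
            "raw_size": raw_size,
            "perms": perms,
        })
    return sections


def appended_size(data: bytes) -> int:
    """Bytes found after the end of the last section's raw data."""
    ends = [s["raw_ptr"] + s["raw_size"] for s in list_sections(data) if s["raw_size"]]
    return max(0, len(data) - max(ends, default=0))
```

A section that is both writable and executable, or a large amount of appended data, is the kind of property that makes a sample stand out during triage.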

This is all super high-level but, as you may guess, analyzing each kind of executable on this list requires a completely different approach.

 

Update #1:

Fixed a mistake related to NE/PE: NE files were replaced by PE files on 32-bit Windows. Thx to Imaginative (one of the best reversers I know) for picking it up 🙂

Update #2:

Just to clarify: NE files still run on Win XP, and the format is still used to store .fon files (Thx Ange @ corkami.com, he is one of the best binary magicians out there!)