
Batch decompilation with IDA / Hex-Rays Decompiler

July 4, 2019 in IDA/Hex-Rays, Random ideas, Reversing, Silly, Tips & Tricks, Trivia

If you are very used to 32-bit IDA, you may sometimes find yourself in a blind alley when you try to port a working solution to 64-bit IDA. This was the case with my old batch decompilation script.

The way it works is very simple – for every <file> in a folder, run IDA in its automation/batch mode, decompile the <file>, and finally save the result in a <file>.c file – more or less like below (I am omitting the loop):

c:\Ida\idaw.exe -A -Ohexrays:-new:%%k.c:ALL "%%k"

Nothing could be simpler.

Until you run it with the 64-bit idaw64.exe:

c:\Ida\idaw64.exe -A -Ohexrays:-new:%%k.c:ALL "%%k"

It doesn’t work. It loads idaw64 and just stays there.

The gotcha is in the plugin name. The 64-bit decompiler’s plugin is not called hexrays, and it’s not hexrays64 either. It is actually hexx64 (hexx64.dll).

So, you have to run this instead:

c:\Ida\idaw64.exe -A -Ohexx64:-new:%%k.c:ALL "%%k"

It’s ridiculously trivial, but it’s always the little things.

Also, interestingly, when you google hexx64.dll or hexx64.p64 you only get a few hits. As if not too many ppl ever came across the issue.

Another gotcha is that if you run it with too many files, your system’s performance will deteriorate quickly. I don’t know if it is memory fragmentation/leaks, or something else, but after running the script on a number of samples I observed my VM dying on me and requiring a restart due to low memory (despite no other process running on a 2G RAM guest). If you know what causes it I would be grateful if you could let me know.

The third gotcha is to rely on the text version of IDA for this task – it is faster than the GUI version. At least in my experience.

Finally, the last gotcha is to remove all the other plugins from IDA’s Plugins directory, other than the one you are using, e.g. hexrays. Why? This may look like nothing, but IDA enumerates and loads all of them _each_ time it starts.
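
For completeness, the loop that was omitted above could look roughly like the sketch below. It is Python rather than a batch file, and it is only a sketch: the helper names build_command and decompile_folder, the timeout, and the skipping of existing .c files are illustrative assumptions. Only the -A switch, the -O<plugin>:-new:<file>.c:ALL option and the two plugin names come from the command lines above.

```python
import subprocess
from pathlib import Path

# Plugin names as described above: "hexrays" for idaw.exe, "hexx64" for idaw64.exe.
PLUGIN_BY_EXE = {"idaw.exe": "hexrays", "idaw64.exe": "hexx64"}


def build_command(ida_exe, sample):
    """Build the batch-mode command line for one sample (hypothetical helper)."""
    exe_name = ida_exe.replace("\\", "/").rsplit("/", 1)[-1].lower()
    plugin = PLUGIN_BY_EXE[exe_name]
    return [ida_exe, "-A", f"-O{plugin}:-new:{sample}.c:ALL", sample]


def decompile_folder(ida_exe, folder, timeout_s=600):
    """Run IDA once per file, one after another, so only one instance is alive."""
    # sorted() takes a snapshot first, so files IDA creates are not picked up.
    for path in sorted(Path(folder).iterdir()):
        if not path.is_file() or path.suffix == ".c":
            continue  # skip sub-folders and previously produced .c output
        try:
            # cwd=folder keeps the file names relative, like %%k in the batch loop
            subprocess.run(build_command(ida_exe, path.name),
                           cwd=folder, timeout=timeout_s)
        except subprocess.TimeoutExpired:
            print(f"timed out: {path.name}")


if __name__ == "__main__":
    # Only prints the command; call decompile_folder() to actually run IDA.
    print(build_command(r"c:\Ida\idaw64.exe", "sample.exe"))
```

Running the samples strictly one at a time, with a timeout, is a guess at keeping the memory problem described above in check; it does not explain it.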

Returning the call – ‘moshi moshi’, the API way (a.k.a. API cold calling), Part 2

April 27, 2019 in Malware Analysis, Reversing

In my old post I described the idea of initializing registers with a predefined value using calls to APIs that return predictable values. Such an approach can help to develop novel anti-* tricks, in particular ones targeting emulation of any sort, and imho it is certainly worth the attention of anyone who tries to:

  • write code bypassing AV/EDR in a generic way
  • create obfuscators that may try to hide some logic of the program… outside of it…
  • reverse static analysis in a more generic way — reversers will need to take these into account to cover additional cases of opaque predicates

The idea of expanding on the old post was with me for a while and I finally decided to provide more examples of OS code that can be ‘repurposed’.

It took a long time, because it takes a lot of code browsing/searches. And I think I only touched the surface, because to be effective in this area one needs a proper database of OS code snippets that is searchable the same way as ROP gadgets databases.

So, there you go… if you are looking for an interesting reversing project: try to automate finding opaque predicates in as many libs as possible!
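
To make the ‘searchable database’ idea a bit more concrete, here is a minimal Python sketch. The table is seeded with a handful of the return values listed further down in this post (as observed there, not re-verified here), and the names CONSTANT_RETURNS, index_by_value and candidates_for are made up for illustration. A real database would of course be generated by a tool, not typed by hand.

```python
from collections import defaultdict

# API name -> value left in eax, as listed later in this post (32-bit view).
CONSTANT_RETURNS = {
    "FreeResource": 0x0,
    "VerifyConsoleIoHandle": 0x0,
    "FlushInstructionCache": 0x1,
    "ImmReleaseContext": 0x1,
    "SetEncryptedFileMetadata": 0x32,
    "RegLoadMUIStringA": 0x78,
    "GetCurrentProcess": 0xFFFFFFFF,
    "ElfReportEventAndSourceW": 0xC0000002,
    "SslChangeNotify": 0x80090029,
}


def index_by_value(returns):
    """Invert the table: returned value -> sorted list of APIs producing it."""
    by_value = defaultdict(list)
    for api, value in returns.items():
        by_value[value].append(api)
    return {value: sorted(apis) for value, apis in by_value.items()}


def candidates_for(value, index):
    """Which known APIs could be called to get `value` into eax?"""
    return index.get(value, [])


if __name__ == "__main__":
    index = index_by_value(CONSTANT_RETURNS)
    for wanted in (0x1, 0x78, 0x1234):
        print(hex(wanted), candidates_for(wanted, index))
```

The same index works in the other direction for a reverser: if a call target is in the table, the branch that follows it is a candidate opaque predicate.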

Browsing through the MSDN documentation and code of various OS libraries it’s quite easy to spot the functions that are no longer supported/deprecated or otherwise not maintained; they are a perfect target for abuse:

  • For example, looking at the ‘Network DDE Reference’ one can immediately recognize the potential of old functions exported by Nddeapi.dll; they return NDDE_NOT_IMPLEMENTED (14) by default. There are exceptions of course – a couple of NetDDE APIs return 0 – but this DLL can help to initialize registers to one of these two values w/o much effort
  • The good ol’ kernel32.dll is also full of surprises. The 16-bit legacy code is still there, even if in a dormant state. Calling a very old exported function, UTRegister, with wrong parameters can give you 1 in eax, but if you provide 3 good arguments i.e.
UTRegister(0, "mem16.dll", 0, "GetMemory", &ptr, 0, 0)

You will get 0 in eax, plus a bonus – a pointer to a callback (a function stored in the ptr variable). If you call this callback function, you will get 0x2000 in eax.

So, you can use this particular API to get 0, 1, or 0x2000 into eax.

  • Incorrect parameters passed to some very well-known functions can provide some unexpected results – we just need to make a deliberate mistake in our code 🙂
    For example, the good old LoadLibraryEx can help to initialize eax to 0x57 (ERROR_INVALID_PARAMETER) if you pass a nonzero value as a second argument when you call the function
  • LockResource is one of the best wrappers – it is doing nothing other than moving the first argument passed to the function to eax
  • I_CryptGetLruEntryIdentifier is similar, but the value will be increased by 8 (there are tons of functions that are similar)
  • FreeResource does nothing and returns 0
  • If we need to initialize a DWORD value at some specific location (pointer), the msvcr120_clr0400 ! _vacopy API may come in handy; it takes a pointer and a value as its arguments and does the dirty job for us
  • Many GDI functions return 0 by default e.g. EngQueryEMFInfo, FixBrushOrgEx, GdiPlayJournal
  • Then there is always the trivial example of GetCurrentProcess, which always returns 0xFFFFFFFF (the pseudo-handle, -1)
  • CloseProfileUserMapping, GdiSupportsFontChangeEvent, GdiEntry16, ImmReleaseContext, SetConsoleMaximumWindowSize always return 1; there are lots of APIs that behave this way
  • NtVdm64CreateProcessInternalW, VerifyConsoleIoHandle always return 0
  • Many unimplemented functions return 0x78 (ERROR_CALL_NOT_IMPLEMENTED) e.g. RegLoadMUIStringA
  • Many unimplemented functions return 0x32 (ERROR_NOT_SUPPORTED) e.g. SetEncryptedFileMetadata
  • ElfReportEventAndSourceW will give you 0xC0000002 (STATUS_NOT_IMPLEMENTED)
  • SslChangeNotify will give you 0x80090029 (NTE_NOT_SUPPORTED)
  • MD5Init and similar functions can initialize some buffers with MD5 (or other hash functions’) initial values e.g. 0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476; these could be used w/o using actual hash calculation code
  • Cleverly used encryption functions can give you a predictable set of values (e.g. initialize a buffer with a specific pattern, encrypt it, use some of the data from the buffer as offsets to other data structures/callbacks)
  • FlushInstructionCache returns 1; I was curious about it and even asked about it on Twitter. It would seem the x86 architecture doesn’t need it, but it’s good to call it for future compatibility. So, eax=1 it is
  • GetLargePageMinimum can give you a predictable value if you discard the lower bits of it e.g. 0x7FFExxxx
  • Many DLLs with DllInstall, DllRegisterServer exports will return 0 when these APIs are called
  • SetLastError/GetLastError can transfer data via OS data buffers
  • Many msvbvm60 functions return 0x80010007 (RPC_E_SERVER_DIED)
  • msvcr* and msvcp* DLLs contain lots of functions that can help to transform data in a more or less unpredictable way
  • Some of the exported functions are so old school that they can be used to cause predictable exceptions e.g. reading/writing to I/O ports
  • COM return values can be predictable as well:
    • S_OK Operation successful 0x00000000
    • E_ABORT Operation aborted 0x80004004
    • E_ACCESSDENIED General access denied error 0x80070005
    • E_FAIL Unspecified failure 0x80004005
    • E_HANDLE Handle that is not valid 0x80070006
    • E_INVALIDARG One or more arguments are not valid 0x80070057
    • E_NOINTERFACE No such interface supported 0x80004002
    • E_NOTIMPL Not implemented 0x80004001
    • E_OUTOFMEMORY Failed to allocate necessary memory 0x8007000E
    • E_POINTER Pointer that is not valid 0x80004003
    • E_UNEXPECTED Unexpected failure 0x8000FFFF
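
Some of these COM values are not independent constants at all. An HRESULT packs a severity bit, a facility and a 16-bit code, and the 0x8007xxxx ones are plain Win32 error codes wrapped in the Win32 facility (7), which is what the HRESULT_FROM_WIN32 macro does. The short Python sketch below shows this; split_hresult and hresult_from_win32 are illustrative helper names, not part of any library.

```python
FACILITY_WIN32 = 7


def split_hresult(hr):
    """Return (severity, facility, code) of a 32-bit HRESULT."""
    severity = (hr >> 31) & 0x1   # 1 = failure
    facility = (hr >> 16) & 0x7FF  # facility field above the 16-bit code
    code = hr & 0xFFFF
    return severity, facility, code


def hresult_from_win32(error):
    """Mimic HRESULT_FROM_WIN32 for non-zero Win32 error codes up to 0xFFFF."""
    return 0x80000000 | (FACILITY_WIN32 << 16) | (error & 0xFFFF)


if __name__ == "__main__":
    # E_INVALIDARG carries ERROR_INVALID_PARAMETER (0x57) in its low word.
    print(split_hresult(0x80070057))
    print(hex(hresult_from_win32(0x57)))
```

So an API returning E_INVALIDARG and an API setting ERROR_INVALID_PARAMETER are, in a sense, delivering the same 0x57.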

And so on and so forth.
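
One more note on the MD5Init item from the list: the four initial values look random, but on a little-endian machine they land in memory as a very regular byte pattern, so a buffer initialized this way is a handy source of predictable bytes. A tiny Python check follows; the constant and function names are made up for illustration, and where exactly the state sits inside the context structure is not covered here.

```python
import struct

# Standard MD5 initial state words (A, B, C, D).
MD5_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)


def state_bytes(words):
    """Lay out four 32-bit words the way x86 stores them (little-endian)."""
    return struct.pack("<4I", *words)


if __name__ == "__main__":
    # prints 0123456789abcdeffedcba9876543210
    print(state_bytes(MD5_INIT).hex())
```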

Many API functions can be called with incorrect parameters only to return a predictable error. Some of them are mentioned above, but the ones I looked for primarily return an error immediately (i.e. they are deprecated or legacy and are just a simple block of ‘mov eax, error_value/ret’ code); they are not the best choice.

It is much better to look for existing, popular, and working APIs, and call them with some unexpected error-prone arguments. Not necessarily fuzzing them, but fuzzing itself could be used to get some predictable results too.

Also…

Many DLLs that are created as traditional ‘utility’ DLLs export tons of mathematical, logical, worker, environmental, string functions that can be used to deliver (un)predictable code and data. Not a single sandbox or emulator handles them all, let alone scripts for IDA or Ghidra.

Finally, there are tons of other DLLs provided by 3rd party vendors, where the signed code offers all the primitives to deliver some value, string, or an opportunity to execute a more complex code block (e.g. adding stuff to Registry, writing files, hooking keyboard, mouse, etc.) that can be re-used.

Also…

Remember that there is always a way to instrument your own program.

Debugging functions, SEH, VEH, VirtualProtect, etc. can help to modify buffers as they are processed by the OS code or external DLL APIs. It’s hard to manipulate these buffers on a kernel level of course, but in the userland you can do lots of trickery this way. Causing predictable exceptions triggered inside the OS libs is a nice way to swap these buffers on the way to be processed (and outside of the actual program’s code). You can do the same for the return values.

You could even use a Trap Flag to fully trace through a certain API and only change the returned value at the very last moment…

I don’t know how many such APIs are out there atm, but there are enough to make some interesting code decisions.