IDA colonoscopy

One of the most annoying things I come across during analysis is … function names. It’s great to have many of them resolved, either via FLIRT or symbols, but the length of some of these function names makes the code really hard to read.

This is especially noticeable with ‘basic’ string functions that hide behind constructs like:

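std::basic_string<ushort,std::char_traits<ushort>,std::allocator<ushort>,_STL70>::assign(std::basic_string<ushort,std::char_traits<ushort>,std::allocator<ushort>,_STL70> const &,uint,uint)
std::basic_string<ushort,std::char_traits<ushort>,std::allocator<ushort>,_STL70>::operator=(ushort const *)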

Why not simply ‘assign’ and ‘operator=’?

It’s because it’s puristic and accurate, that’s why 🙂

Reading code listings relying on these functions is difficult, and it involves a lot of mental processing to find the actual method name in these long strings.

I got bored doing so and coded a very badly written IDAPython script that replaces these names with a shorter version. Again, this is blasphemy to both IDA and IDAPython, so you have been warned.

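import re

import idaapi
import idautils
import idc

# Demangle using IDA's "short" demangled-name settings.
mask = idc.GetLongPrm(idc.INF_SHORT_DN)

for func_ea in idautils.Functions():
    function_name = idc.GetFunctionName(func_ea)
    function_name_dem = idc.Demangle(function_name, mask)
    if function_name_dem is not None:
        function_name = function_name_dem
    # Skip functions that already carry a hex_ name from a previous run.
    m = re.search(r'hex_', function_name, re.IGNORECASE)
    if not m:
        print(function_name)
        # Capture the method name that follows basic_string<...>::
        m = re.search(r'basic_string.*?::([^:=]+)\(', function_name, re.IGNORECASE)
        if m:
            short_fun = m.group(1)
            # Cut everything from the first operator/template character on.
            short_fun1 = re.sub(r'[\(=< ~\'\"\+\`-].+$', '', short_fun)
            # Names must be unique, so append a counter until MakeName succeeds
            # (starting the counter at 0 is an assumption).
            cnt = 0
            while True:
                short_fun = 'hex_string_' + short_fun1 + "_" + str(cnt)
                res = idc.MakeName(func_ea, short_fun)
                if res:
                    print(short_fun)
                    break
                cnt = cnt + 1
                if cnt > 1000:
                    break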
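
If you want to check what the two regular expressions do before letting them loose on a database, the name-shortening logic can be tried outside IDA. The snippet below is only a sketch: short_method_name and unique_name are hypothetical helpers that do not exist in the script above, and the set of taken names merely stands in for IDA refusing a duplicate name.

```python
import re

# Same patterns as in the IDAPython script, written as raw strings.
METHOD_RE = re.compile(r"basic_string.*?::([^:=]+)\(", re.IGNORECASE)
TAIL_RE = re.compile(r"""[(=< ~'"+`-].+$""")


def short_method_name(demangled):
    """Return the method name following basic_string<...>::, or None."""
    m = METHOD_RE.search(demangled)
    if m is None:
        return None
    # Drop everything from the first operator/template character on.
    return TAIL_RE.sub("", m.group(1))


def unique_name(method, taken, limit=1000):
    """Emulate the rename loop: try hex_string_<method>_<cnt> until one is free."""
    for cnt in range(limit + 1):
        candidate = "hex_string_%s_%d" % (method, cnt)
        if candidate not in taken:  # stands in for a successful rename
            taken.add(candidate)
            return candidate
    return None  # gave up, like the cnt > 1000 guard
```

One thing this makes visible: because the capture group excludes ‘=’, a name ending in operator=( never matches, so only methods like assign get the short hex_string_ name.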

The result:



Batch decompilation with IDA / Hex-Rays Decompiler

If you are very used to 32-bit IDA, you may sometimes find yourself in a blind alley when you try to port your working solution to 64-bit IDA. This was the case with my old batch decompilation script.

The way it works is very simple – for every <file> in a folder, run IDA in its automation/batch mode, decompile the <file>, and finally save the result in a <file>.c file – more or less like below (I am omitting the loop):

c:\Ida\idaw.exe -A -Ohexrays:-new:%%k.c:ALL "%%k"

Nothing could be simpler.

Until you run it with the 64-bit idaw64.exe:

c:\Ida\idaw64.exe -A -Ohexrays:-new:%%k.c:ALL "%%k"

It doesn’t work. It loads idaw64 and just stays there.

The gotcha is in the plug-in name. The 64-bit decompiler’s plugin name is not hexrays, and it’s not hexrays64 either. It is actually hexx64.dll.

So, you have to run this instead:

c:\Ida\idaw64.exe -A -Ohexx64:-new:%%k.c:ALL "%%k"

It’s ridiculously trivial, but it’s always the little things.
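
For completeness, the omitted loop could look roughly like the sketch below. It is not my original batch file: build_command and decompile_folder are hypothetical helpers, and running IDA with the sample folder as the working directory (so that only bare file names end up inside the colon-separated -O option) is an assumption. The executable path and plugin name are the ones from the command lines above.

```python
import os
import subprocess


def build_command(ida_exe, plugin, sample):
    """Build one IDA batch command line as an argument list.

    -A runs IDA autonomously; -O<plugin>:-new:<out>:ALL asks the
    decompiler plugin to write all functions to <out>.
    """
    return [ida_exe, "-A", "-O%s:-new:%s.c:ALL" % (plugin, sample), sample]


def decompile_folder(folder, ida_exe=r"c:\Ida\idaw64.exe", plugin="hexx64"):
    """Run IDA once per file in `folder`, one sample at a time."""
    for name in sorted(os.listdir(folder)):
        # Skip sub-folders and the .c files produced by earlier runs.
        if not os.path.isfile(os.path.join(folder, name)) or name.endswith(".c"):
            continue
        # Assumption: cwd=folder keeps drive-letter colons out of the -O option.
        subprocess.call(build_command(ida_exe, plugin, name), cwd=folder)
```

For the 32-bit case, pass idaw.exe and hexrays instead, e.g. decompile_folder(r"c:\samples", r"c:\Ida\idaw.exe", "hexrays"), where c:\samples is just a placeholder folder.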

Also, interestingly, when you google hexx64.dll or hexx64.p64 you only get a few hits, as if not many people have ever come across the issue.

Another gotcha is that if you run it with too many files, your system’s performance will deteriorate quickly. I don’t know if it is memory fragmentation, leaks, or something else, but after running the script on a number of samples I observed my VM dying on me and requiring a restart due to low memory (despite no other process running on a guest with 2 GB of RAM). If you know what causes it, I would be grateful if you could let me know.

The third gotcha: use the text version of IDA for this task. It is faster than the GUI version, at least in my experience.

Finally, the last gotcha: remove every plugin from IDA’s Plugins directory except the one you are using, e.g. hexrays. Why? This may look like nothing, but IDA enumerates and loads all of them _each_ time it starts.