HexDive 0.6 – new strings and more -Context…

October 18, 2012 in HexDive, Malware Analysis, Software Releases

Update

I received a question from Pedro about the APIs that are commonly used by keyloggers, which I mentioned in the context of one of the screenshots. The APIs I had in mind were MonitorFromPoint and GetMonitorInfoA (used for taking screenshots on multiple monitors) and a few others that can be seen both in the screenshot and inside the example_hdive_qC.txt file. This was an ambiguous statement for a few reasons (the APIs can be part of a clean framework or unit/module, a keylogger is not an infostealer, etc.), so I am clarifying it for future readers.

Last, but not least: obviously, the only way to confirm that any APIs highlighted by HexDive are used for malicious purposes is to do more in-depth analysis. All HexDive does is identify APIs and strings of interest for the malware analyst 🙂

Old post

The new version is 25% larger (what a bloatware! :)) as it brings in a huge number of new strings:

  • PE Section names and other packer identifiers
  • Installer-related strings
  • Identifiers of script-to-exe type tools e.g. perl2exe, py2exe, exerb, winbatch
  • Lots of known CLSID strings

It is slowly getting to the point where I wanted it to be when I started writing it. I also think I have finally worked out how to present the data extracted from a file in a way that:

  • shows as many interesting strings as possible
  • is as readable as possible
  • still provides information about each string’s context
  • lets you quickly find the string in a hex editor
  • in full-output mode, allows for easy parsing
  • avoids missing strings as much as possible

So, with all that said, the new contextual output is introduced in this version.

It works the same way as the old -c option, but it removes keywords and duplicated lines from the output (not perfectly, but well enough). I must mention here that contextual output requires a wide screen (a terminal of at least 120 columns), but I hope that if you do malware analysis you have this available 🙂 (feel free to let me know if you need narrower output, so I can accommodate that in a future version).
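To make the idea of contextual output more concrete, here is a minimal Python sketch. It is not HexDive's code: the keyword list, the function names and the one-neighbour context radius are all assumptions made for illustration. It extracts printable ASCII strings with their file offsets (handy for jumping to them in a hex editor), then returns each keyword hit together with its neighbouring strings, never emitting the same line twice.

```python
import re

# Hypothetical keyword list for illustration; HexDive's real dictionary is far larger.
KEYWORDS = ("password", "username", "GetMonitorInfoA", "MonitorFromPoint")


def extract_strings(data: bytes, min_len: int = 6):
    """Return (offset, text) pairs for runs of printable ASCII of at least min_len bytes."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [(m.start(), m.group().decode("ascii")) for m in pattern.finditer(data)]


def contextual_hits(strings, keywords=KEYWORDS, radius=1):
    """Return keyword hits plus `radius` neighbouring strings, without duplicates."""
    seen = set()  # indexes of strings already emitted
    out = []
    for i, (_, text) in enumerate(strings):
        if not any(k.lower() in text.lower() for k in keywords):
            continue
        first = max(0, i - radius)
        last = min(len(strings), i + radius + 1)
        for j in range(first, last):
            if j not in seen:
                seen.add(j)
                out.append(strings[j])
    return out


if __name__ == "__main__":
    blob = b"\x00\x01username\x00AAAAAA\x00\x02password\x00zzzzzzzz\x00"
    for offset, text in contextual_hits(extract_strings(blob)):
        print(f"{offset:08X}  {text}")
```

Printing the offset in hex next to each string is a design choice that mirrors the goal of finding the string quickly in a hex editor; the real tool's output layout may differ.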

The new contextual output option is available as a capitalized -c, i.e. -C. You can run it in many ways, e.g.:

hdive -C
hdive -aC
hdive -afC

See the example below and, as usual, I would be grateful if you let me know whether it works for you or if you spot any issues.

Example Session

This is a sample of a new malware, downloaded quite recently.

Running hdive on it first:

hdive -C // note capital letter

 

The file is UPXd, and we can see some Borland strings (Boolean/False/True/Char/etc.).
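As a rough illustration of the kind of packer check involved (this is not HexDive's actual logic), the Python sketch below looks for the section names UPX0 and UPX1 and the UPX! marker that UPX normally leaves in a packed file. The function names and the marker list are assumptions made for this example, and a match is only a hint, since these markers can be altered or stripped.

```python
from pathlib import Path

# Byte markers that UPX normally leaves in a packed file (section names and magic).
UPX_MARKERS = (b"UPX0", b"UPX1", b"UPX!")


def looks_upx_packed(data: bytes) -> bool:
    """Heuristic: True if any typical UPX marker appears anywhere in the data."""
    return any(marker in data for marker in UPX_MARKERS)


def file_looks_upx_packed(path: str) -> bool:
    """Convenience wrapper (hypothetical helper) that reads the file from disk first."""
    return looks_upx_packed(Path(path).read_bytes())
```

Calling file_looks_upx_packed on the sample path would give a quick yes/no before deciding whether to try unpacking.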

We can unpack it using upx.exe

upx -d test\sample.exe -o test\sample.exe.unpacked

…and then run hdive again:

hdive -qC test\sample.exe.unpacked
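If you do this often, the two steps can be scripted. The Python sketch below simply wraps the two command lines shown above with subprocess; the function names are hypothetical, and it assumes that upx and hdive are on the PATH and that hdive writes its report to standard output.

```python
import subprocess


def unpack_command(sample: str) -> list[str]:
    """Build the upx command from the post: upx -d <sample> -o <sample>.unpacked"""
    return ["upx", "-d", sample, "-o", sample + ".unpacked"]


def hdive_command(path: str) -> list[str]:
    """Build the hdive command from the post: hdive -qC <path>"""
    return ["hdive", "-qC", path]


def unpack_and_scan(sample: str) -> str:
    """Unpack the sample with upx, run hdive on the result and return its output.

    Assumption: hdive prints its report to stdout.
    """
    subprocess.run(unpack_command(sample), check=True)
    result = subprocess.run(
        hdive_command(sample + ".unpacked"),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

With check=True, a failure in either tool raises an exception instead of silently scanning a file that was never unpacked.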

Now it looks much better and it’s definitely Borland.

Scrolling down, we can see lots of juicy info: APIs that are commonly used by keyloggers.

Going further, we can see Winsock functions and strings, as well as Delphi components (units) listed together with ‘username’ and ‘password’.

And finally, there are lots of HTTP-related strings, as well as another unit name from Borland.

There are more interesting strings there; you can see the full output of the commands in the attached text files. Read on.

Out of curiosity, I compared the output of the following commands:

  • strings -q -n 6 // a minimum length of 6 is usually a good choice, as it removes a lot of junk
  • hdive -q
  • hdive -qC

on the very same sample and then compared the file sizes and number of lines in each file.

These are the results:

dir example_*
2012-10-19  01:24            17,185 example_hdive_q.txt
2012-10-19  01:24            61,364 example_hdive_qC.txt
2012-10-19  01:24            58,199 example_strings_qn6.txt

wc -l example*
   1336 example_hdive_q.txt
    529 example_hdive_qC.txt
   3777 example_strings_qn6.tx

It would seem (and mind you, it is a very subjective statement :)) that hdive can be quite a time saver! Instead of reviewing over 3.5K lines of strings output, you end up reviewing about 35% of that with hdive -q (1336 lines), or about 14% with hdive -qC (529 lines), and you immediately get the juicy keywords and, with -C, their context (this can of course still be improved).
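For the record, the percentages follow directly from the wc -l counts listed above. A few lines of Python reproduce the arithmetic (the variable and function names are just for illustration):

```python
# Line counts taken from the wc -l output in this post.
line_counts = {
    "strings -q -n 6": 3777,
    "hdive -q": 1336,
    "hdive -qC": 529,
}


def share(lines: int, total: int) -> float:
    """Percentage of `total` represented by `lines`, rounded to one decimal."""
    return round(100 * lines / total, 1)


baseline = line_counts["strings -q -n 6"]
for command, lines in line_counts.items():
    print(f"{command:<16} {lines:>5} lines  {share(lines, baseline):>5}% of strings output")
```

Note that this compares line counts only; the -qC file is actually the largest in bytes because each line carries its context.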

You can download the files here:

  • examples:

Enjoy!
