I received a question from Pedro about the "APIs that are commonly used by keyloggers" that I mentioned in the context of one of the screenshots. The APIs I had in mind were MonitorFromPoint and GetMonitorInfoA (used for taking screenshots on multiple monitors) and a few others that can be seen both in the screenshot and inside the example_hdive_qC.txt file. The statement was ambiguous for a few reasons (the APIs can be part of a clean framework or unit/module, a keylogger is not an infostealer, etc.), so I am clarifying it for future readers.
Last, but not least: obviously, the only way to confirm that any APIs highlighted by HexDive are used for malicious purposes is to do a more in-depth analysis. All HexDive does is identify APIs and strings of interest to the malware analyst.
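To illustrate what "identification only" means, here is a minimal Python sketch of the general idea: pull printable ASCII runs out of a file and flag the ones containing known API names, together with their file offsets. This is not how HexDive is implemented; the function name find_api_names, the two-entry keyword set (taken from the APIs named above) and the minimum length of 6 are assumptions made for the example.

```python
import re

# Hypothetical, tiny keyword set; a real tool would ship a far larger list.
API_NAMES_OF_INTEREST = {b"MonitorFromPoint", b"GetMonitorInfoA"}


def find_api_names(data: bytes, min_len: int = 6):
    """Return sorted (offset, name) pairs for known API names found in data.

    Only printable ASCII runs of at least min_len bytes are inspected,
    which mimics what a strings-like tool would extract.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    hits = []
    for match in pattern.finditer(data):
        for name in API_NAMES_OF_INTEREST:
            pos = match.group().find(name)
            if pos != -1:
                # Offset is relative to the start of the file,
                # so it can be looked up directly in a hex editor.
                hits.append((match.start() + pos, name.decode("ascii")))
    return sorted(hits)


if __name__ == "__main__":
    sample = b"\x00\x01GetMonitorInfoA\x00junk\x00xxMonitorFromPoint\x00"
    for offset, name in find_api_names(sample):
        print(f"0x{offset:08x}  {name}")
```

A hit from a scan like this only says that the name is present in the file. Whether the API is actually called, and for what purpose, still has to be established by proper analysis.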
The new version is 25% larger (what bloatware! :)) as it brings in a huge number of new strings:
- PE Section names and other packer identifiers
- Installer-related strings
- Identifiers of script-to-exe type tools e.g. perl2exe, py2exe, exerb, winbatch
- Lots of known CLSID strings
It is slowly getting to the point where I wanted it to be when I started writing it. I also think I finally got right how to present the data extracted from a file in a way that:
- shows as many interesting strings as possible
- is as readable as possible
- still provides information about each string’s context
- makes it quick to find the string in a hex editor
- allows easy parsing in full-output mode
- misses as few strings as possible
So, with all that said, the new contextual output is introduced in this version.
It works the same way as the old -c option, but it removes keywords and duplicated lines from the output (not perfectly, but well enough). I must mention here that contextual output requires a wide screen (a terminal at least 120 columns wide), but I hope that if you do malware analysis you have this available (feel free to let me know if you need a narrower output, so I can accommodate that in a future version).
The new contextual output option is available as a capitalized -c, i.e. -C. You can run it in many ways, e.g.
See the example below and, as usual, I would be grateful if you let me know whether it works for you or if you spot any issues.
This is a sample of new malware, downloaded quite recently.
Running hdive on it first:
hdive -C // note capital letter
The file is packed with UPX, and we can see some Borland strings (Boolean/False/True/Char/etc.).
We can unpack it using upx.exe:
upx -d test\sample.exe -o test\sample.exe.unpacked
…and then run hdive again:
hdive -qC test\sample.exe.unpacked
Now it looks much better and it’s definitely Borland.
Scrolling down, we can see lots of juicy info: APIs that are commonly used by keyloggers,
then, going further, winsock functions and strings, as well as Delphi components (units) listed together with ‘username’ and ‘password’:
There are more interesting strings there; you can see the full output of the commands in the attached text files. Read on.
Out of curiosity, I compared the output of the following commands:
- strings -q -n 6 // a minimum length of 6 is usually a good choice, as it removes a lot of junk
- hdive -q
- hdive -qC
on the very same sample and then compared the file sizes and number of lines in each file.
These are the results:
dir example_*
2012-10-19 01:24 17,185 example_hdive_q.txt
2012-10-19 01:24 61,364 example_hdive_qC.txt
2012-10-19 01:24 58,199 example_strings_qn6.txt

wc -l example*
1336 example_hdive_q.txt
529 example_hdive_qC.txt
3777 example_strings_qn6.tx
It would seem (and mind you, it is a very subjective statement :)) that hdive can be quite a time saver! Instead of reviewing over 3.5K lines of strings output, you end up reviewing about 35% of that with hdive -q (1336 of 3777 lines), or 529 lines with hdive -qC, and you immediately get the juicy keywords and their context (this can of course still be improved).
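For anyone who wants to double-check the arithmetic, the ratios follow directly from the wc -l line counts listed above. The short Python sketch below just redoes the division; the helper name share_of_baseline is hypothetical and not part of hdive.

```python
# Line counts copied from the wc -l output in this post.
line_counts = {
    "strings -q -n 6": 3777,
    "hdive -q": 1336,
    "hdive -qC": 529,
}


def share_of_baseline(counts, baseline="strings -q -n 6"):
    """Return each tool's line count as a percentage of the baseline's count."""
    base = counts[baseline]
    return {name: round(100 * n / base, 1) for name, n in counts.items()}


if __name__ == "__main__":
    for name, pct in share_of_baseline(line_counts).items():
        print(f"{name:<16} {line_counts[name]:>5} lines  {pct:>5}% of baseline")
```

Line counts are only a rough proxy for review effort, since the contextual -C lines are much wider than plain strings output (compare the file sizes in the dir listing).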
You can download the files here: