{"id":1394,"date":"2012-10-18T17:42:36","date_gmt":"2012-10-18T17:42:36","guid":{"rendered":"http:\/\/www.hexacorn.com\/blog\/?p=1394"},"modified":"2012-11-12T12:58:39","modified_gmt":"2012-11-12T12:58:39","slug":"hexdive-0-6-new-strings-and-more-context","status":"publish","type":"post","link":"https:\/\/www.hexacorn.com\/blog\/2012\/10\/18\/hexdive-0-6-new-strings-and-more-context\/","title":{"rendered":"HexDive 0.6 &#8211; new strings and more -Context&#8230;"},"content":{"rendered":"<p><strong>Update<\/strong><\/p>\n<p>I have received a question from Pedro about the APIs that are commonly used by keyloggers which I mentioned in a context of one of the screenshots; The APIs I had in mind were MonitorFromPoint and GetMonitorInfoA (used for taking screenshots on multiple monitors) and a few others that can be seen on both screenshot and inside the <a href=\"https:\/\/www.hexacorn.com\/examples\/2012-10-19_example_hdive_qC.txt\">example_hdive_qC.txt<\/a> file; this was an ambiguous statement for a few reasons (APIs can be part of a clean framework or unit\/module, keylogger is not an infostealer, etc.), so I am clarifying it for the future reader;<\/p>\n<p>Last, but not least &#8211; obviously the only way to confirm that any APIs highlighted by HexDive are used for malicious purposes is by doing more in-depth analysis &#8211; the only thing HexDive does is identification of APIs and strings of interest for the malware analyst \ud83d\ude42<\/p>\n<p><strong>Old post<\/strong><\/p>\n<p>New version is 25% larger (what a bloatware! :)) as it brings in a huge number of new strings:<\/p>\n<ul>\n<li><a title=\"Random Stats from 1.2M samples \u2013 PE Section Names\" href=\"https:\/\/www.hexacorn.com\/blog\/2012\/10\/14\/random-stats-from-1-2m-samples-pe-section-names\/\">PE Section names<\/a> and other packer identifiers<\/li>\n<li>Installer-related strings<\/li>\n<li>Identifiers of script-to-exe type tools e.g. 
perl2exe, py2exe, exerb, winbatch<\/li>\n<li>Lots of known CLSID strings<\/li>\n<\/ul>\n<p>It is slowly getting to the point where I wanted it to be when I started writing it. I also think I have finally worked out how to present the data extracted from a file in a way that:<\/p>\n<ul>\n<li>shows as many interesting strings as possible<\/li>\n<li>is as readable as possible<\/li>\n<li>still provides information about the string&#8217;s context<\/li>\n<li>makes it quick to find the string in a hex editor<\/li>\n<li>in full-output mode, allows for easy parsing<\/li>\n<li>avoids missing strings as much as possible<\/li>\n<\/ul>\n<p>So, with all that said, the new contextual output is introduced in this version.<\/p>\n<p>It works the same way as the old version <strong><span style=\"color: #ff0000;\">-c<\/span><\/strong>, but it removes keywords and duplicated lines from the output (not perfectly, but well enough). I must mention here that contextual output requires a wide screen (a terminal of at least 120 columns), but I hope that if you do malware analysis you have this available \ud83d\ude42\u00a0 (feel free to let me know if you need a narrower output, so I can accommodate that in a future version).<\/p>\n<p>The new contextual output option is available as a capitalized <strong><span style=\"color: #ff0000;\">-c<\/span><\/strong>, i.e. 
<strong><span style=\"color: #ff0000;\">-C<\/span><\/strong> &#8211; You can run it in many ways, e.g.<\/p>\n<pre style=\"padding-left: 30px;\">hdive -C<\/pre>\n<pre style=\"padding-left: 30px;\">hdive -aC<\/pre>\n<pre style=\"padding-left: 30px;\">hdive -afC<\/pre>\n<p>See example below and as usual, I would be grateful if you let me know if it works for you or if you spot issues.<\/p>\n<h4><strong>Example Session<br \/>\n<\/strong><\/h4>\n<p>This is a sample of a new <a href=\"https:\/\/www.virustotal.com\/file\/9f66e1965d55fcf731fe9fc11c0e744de7c30383c0eb97542b7300d7a575e503\/analysis\/\">malware<\/a>, downloaded quite recently.<\/p>\n<p>Running hdive on it first:<\/p>\n<pre style=\"padding-left: 30px;\">hdive -C \/\/ note capital letter<\/pre>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_1.png\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-medium wp-image-1398\" title=\"hdive_06_1\" src=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_1-300x137.png\" alt=\"\" width=\"300\" height=\"137\" srcset=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_1-300x137.png 300w, https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_1.png 989w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>The file is UPXd, and we can see some Borland strings (Boolean\/False\/True\/Char\/etc.).<\/p>\n<p>We can unpack it using upx.exe<\/p>\n<pre style=\"padding-left: 30px;\">upx -d test\\sample.exe -o test\\sample.exe.unpacked<\/pre>\n<p><a href=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_21.png\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-medium wp-image-1400\" title=\"hdive_06_2\" src=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_21-300x98.png\" alt=\"\" width=\"300\" height=\"98\" 
srcset=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_21-300x98.png 300w, https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_21.png 725w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>&#8230;and then run hdive again:<\/p>\n<pre style=\"padding-left: 30px;\">hdive -qC test\\sample.exe.unpacked<\/pre>\n<p>Now it looks much better and it&#8217;s definitely Borland.<\/p>\n<p>Scrolling down we can see lots of juicy info &#8211; APIs that are commonly used by keyloggers,<\/p>\n<p><a href=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_3.png\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-medium wp-image-1401\" title=\"hdive_06_3\" src=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_3-300x153.png\" alt=\"\" width=\"300\" height=\"153\" srcset=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_3-300x153.png 300w, https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_3.png 981w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>then going further, we can see winsock functions and strings, as well as Delphi components (units), listed together with &#8216;username&#8217; and &#8216;password&#8217;:<\/p>\n<p><a href=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_4.png\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-medium wp-image-1402\" title=\"hdive_06_4\" src=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_4-300x153.png\" alt=\"\" width=\"300\" height=\"153\" srcset=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_4-300x153.png 300w, https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_4.png 981w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a> and finally lots of HTTP-related strings, as well as another unit name from Borland:<\/p>\n<p><a 
href=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_5.png\"><img decoding=\"async\" loading=\"lazy\" class=\"aligncenter size-medium wp-image-1403\" title=\"hdive_06_5\" src=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_5-300x153.png\" alt=\"\" width=\"300\" height=\"153\" srcset=\"https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_5-300x153.png 300w, https:\/\/www.hexacorn.com\/blog\/wp-content\/uploads\/2012\/10\/hdive_06_5.png 981w\" sizes=\"(max-width: 300px) 100vw, 300px\" \/><\/a><\/p>\n<p>There are more interesting strings there &#8211; you can see output of the command by viewing all the attached text files; read on.<\/p>\n<p>Out of curiosity, I compared the output of the following commands:<\/p>\n<ul>\n<li>strings -q -n 6 \/\/ this is usually a good length allowing removing a lot of junk<\/li>\n<li>hdive -q<\/li>\n<li>hdive -qC<\/li>\n<\/ul>\n<p>on the very same sample and then compared the file sizes and number of lines in each file.<\/p>\n<p>These are the results:<\/p>\n<pre style=\"padding-left: 30px;\">dir example_*\r\n2012-10-19\u00a0 01:24\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 17,185 example_hdive_q.txt\r\n2012-10-19\u00a0 01:24\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 61,364 example_hdive_qC.txt\r\n2012-10-19\u00a0 01:24\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 58,199 example_strings_qn6.txt\r\n<br style=\"padding-left: 30px;\" \/>wc -l example*\r\n\u00a0 1336 example_hdive_q.txt\r\n\u00a0\u00a0 529 example_hdive_qC.txt\r\n\u00a0 3777 example_strings_qn6.tx<\/pre>\n<p>It would seem (and mind you, it is a very subjective statement :)) that hdive can be quite a time saver! 
Instead of reviewing over 3.5K lines of strings output, you end up reviewing about 35% of that with hdive -q (1336 lines), or about 14% with hdive -qC (529 lines), and immediately get juicy keywords and their context (this can, of course, still be improved).<\/p>\n<p>You can download the files here:<\/p>\n<ul>\n<li><a href=\"https:\/\/hexacorn.com\/download.php?f=hdive.exe\">hdive.exe<\/a> or <a href=\"https:\/\/hexacorn.com\/download.php?f=hdive.zip\">hdive.zip<\/a><\/li>\n<\/ul>\n<ul>\n<li>examples:<\/li>\n<\/ul>\n<blockquote>\n<ul>\n<li><a href=\"https:\/\/www.hexacorn.com\/examples\/2012-10-19_example_hdive_q.txt\">example_hdive_q.txt<\/a><\/li>\n<li><a href=\"https:\/\/www.hexacorn.com\/examples\/2012-10-19_example_hdive_qC.txt\">example_hdive_qC.txt<\/a><\/li>\n<li><a href=\"https:\/\/www.hexacorn.com\/examples\/2012-10-19_example_strings_qn6.txt\">example_strings_qn6.txt<\/a><\/li>\n<\/ul>\n<\/blockquote>\n<p>Enjoy!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Update I have received a question from Pedro about the APIs that are commonly used by keyloggers which I mentioned in a context of one of the screenshots; The APIs I had in mind were MonitorFromPoint and GetMonitorInfoA (used for &hellip; <a href=\"https:\/\/www.hexacorn.com\/blog\/2012\/10\/18\/hexdive-0-6-new-strings-and-more-context\/\">Continue reading <span 
class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[23,9,5],"tags":[],"_links":{"self":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/1394"}],"collection":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/comments?post=1394"}],"version-history":[{"count":11,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/1394\/revisions"}],"predecessor-version":[{"id":1442,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/1394\/revisions\/1442"}],"wp:attachment":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/media?parent=1394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/categories?post=1394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/tags?post=1394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}