{"id":1936,"date":"2013-06-06T02:17:30","date_gmt":"2013-06-06T02:17:30","guid":{"rendered":"http:\/\/www.hexacorn.com\/blog\/?p=1936"},"modified":"2013-06-06T02:46:56","modified_gmt":"2013-06-06T02:46:56","slug":"clustering-and-advancedin-depth-malware-analysis-with-hexdive-pro","status":"publish","type":"post","link":"https:\/\/www.hexacorn.com\/blog\/2013\/06\/06\/clustering-and-advancedin-depth-malware-analysis-with-hexdive-pro\/","title":{"rendered":"Clustering and Advanced\/In-depth Malware Analysis with HexDive Pro"},"content":{"rendered":"<p>A few months ago I introduced a new tool called <a href=\"https:\/\/www.hexacorn.com\/blog\/category\/software-releases\/hexdive\/\">HexDive<\/a>. The tool speeds up analysis of strings that are extracted from Portable Executable (PE) files. It does so by showing only those strings that are most relevant from a malware analysis perspective.<\/p>\n<p>Strings extracted directly from a PE file certainly have some value, but that value is limited by many factors, including:<\/p>\n<ul>\n<li><strong>Compression<\/strong> (code and\/or data is decompressed only when the program is executed)<\/li>\n<li><strong>Encryption<\/strong> (code and\/or data is decrypted only when the program is executed)<\/li>\n<li><strong>Obfuscation<\/strong> (code and\/or data is hidden among a lot of junk code and data)<\/li>\n<li><strong>Wrapping<\/strong> (code and\/or data is hidden deep inside the file and &#8216;unwrapped&#8217; only when the program is executed)<\/li>\n<li><strong>Dynamic code loading<\/strong> (code injections and shellcode that may be hidden using the techniques described above)<\/li>\n<li><strong>The environment<\/strong> (code and\/or data is not a part of the malware itself, but is extracted from the system on which it is executed)<\/li>\n<li><strong>The nature of run-time<\/strong> (the code and data seen depend on the environment and on the code branches taken inside the malware)<\/li>\n<li><strong>Anti-analysis tricks<\/strong> (what we see depends heavily on 
the malware&#8217;s ability to detect that it is running inside a sandbox or under monitoring tools, e.g. a debugger)<\/li>\n<\/ul>\n<p>To address this, HexDive Pro takes analysis to the next level and makes it possible to extract many run-time artifacts produced by a running program.<\/p>\n<p>This includes:<\/p>\n<ul>\n<li>API calls and their parameters<\/li>\n<li>Hex dumps and strings extracted from buffers allocated at run-time (including the stack)<\/li>\n<li>Code injections and shellcode<\/li>\n<li>Wrapped code<\/li>\n<li>Screenshots of all windows<\/li>\n<li>Very specific features of the malware that can help to uniquely identify it<\/li>\n<li>and it can do a few other things that I will keep secret for the moment, but will reveal in upcoming posts \ud83d\ude42<\/li>\n<\/ul>\n<p>To demonstrate what HexDive Pro can do, all I have to do is point to what I have posted over the last few months.<\/p>\n<p>In fact, most of the clustering, batch analysis and malware analysis posts were heavily influenced by the results provided by HexDive Pro. 
The results the tool has provided thus far have helped me to:<\/p>\n<ul>\n<li>&#8230; discover the hidden code inside <a title=\"ZeroAccess death match with Shell_NotifyIconW\" href=\"https:\/\/www.hexacorn.com\/blog\/2012\/09\/18\/zeroaccess-death-match-with-shell_notifyiconw\/\">ZeroAccess<\/a><\/li>\n<li>&#8230; cluster the ZeroAccess samples in my collection to find out which contain code using <a title=\"HMFT 0.3 + Extended Attributes, short update\" href=\"https:\/\/www.hexacorn.com\/blog\/2013\/02\/17\/hmft-3-0-extended-attributes-short-update\/\">Extended Attributes (NTFS)<\/a> and to create a list of all known EA names used by this malware<\/li>\n<li>&#8230; cluster the APT1 <a title=\"Clustering and Batch Analysis of APT1 sampleset\" href=\"https:\/\/www.hexacorn.com\/blog\/2013\/03\/04\/clustering-and-batch-analysis-of-apt1-sampleset\/\">sampleset<\/a> in <a title=\"Clustering and Batch Analysis of APT1 sampleset, part 2\" href=\"https:\/\/www.hexacorn.com\/blog\/2013\/03\/05\/clustering-and-batch-analysis-of-apt1-sampleset-part-2\/\">many<\/a> <a title=\"Clustering and Batch Analysis of APT1 sampleset, part 3\" href=\"https:\/\/www.hexacorn.com\/blog\/2013\/03\/12\/clustering-and-batch-analysis-of-apt1-sampleset-part-3\/\">ways<\/a>.<\/li>\n<li>&#8230; instantly discover strings in the <a title=\"Quick look at\u2026\" href=\"https:\/\/www.hexacorn.com\/blog\/2012\/05\/29\/quick-look-at\/\">Flame<\/a> malware<\/li>\n<li>and other posts, more or less influenced by it (including various statistics)<\/li>\n<\/ul>\n<p>The results of these experiments helped me a great deal in tweaking the code so that it is as useful as possible.<\/p>\n<p>On the surface, HexDive Pro works like a typical API monitor &#8211; it runs malware under its control and uses various tricks to intercept traces of its execution. 
Going deeper, it combines the best pieces of <a href=\"https:\/\/www.hexacorn.com\/blog\/category\/software-releases\/hexacorn-application-monitor\/\">Application Monitor<\/a>, <a href=\"https:\/\/www.hexacorn.com\/blog\/category\/software-releases\/hexdive\/\">HexDive<\/a>, <a href=\"https:\/\/www.hexacorn.com\/blog\/category\/software-releases\/hmft\/\">HMFT<\/a> and <a href=\"https:\/\/www.hexacorn.com\/blog\/category\/software-releases\/hstrings\/\">Hstrings<\/a>, and also leverages information from numerous databases of artifacts (both static and dynamic) that I have gathered over years of malware analysis.<\/p>\n<p>All of these combined efforts produce a tool that makes it possible to gain in-depth knowledge of the analyzed malware within 30-180 seconds.<\/p>\n<p>In fact, the APT1 clustering data I posted <a href=\"https:\/\/www.hexacorn.com\/blog\/2013\/03\/04\/clustering-and-batch-analysis-of-apt1-sampleset\/\">here<\/a> was generated pretty quickly using HexDive Pro. The results posted were just the tip of the iceberg, as the output contained all the juice that one can otherwise extract manually only after hours of painstaking analysis. If you multiply that by the number of samples, the performance gain is tremendous.<\/p>\n<p>Anyone who does malware analysis professionally knows how tedious in-depth analysis can be. Anyone who doesn&#8217;t is forced to rely on writeups by antivirus companies, help from peers and search engines.<\/p>\n<p>With HexDive Pro you will often be able to learn more about malware than you can read online, and you will also be able to verify what you read in AV writeups. On occasion, the tool will also fail miserably, which could mean that you have stumbled upon a new trick to inject code, a new trick to escape tracing, or a new 0day that helps the malware run free. 
Or there may be a bug.<\/p>\n<p>Such is the life of software like this \ud83d\ude42<\/p>\n<p>Last, but not least &#8211; the intended audience for the tool is:<\/p>\n<ul>\n<li>Forensic investigators who don&#8217;t have malware analysis skills.<\/li>\n<li>Beginner and intermediate-level malware analysts.<\/li>\n<li>Anyone who wants to do batch analysis and clustering of their samplesets.<\/li>\n<li>Anyone who wants to analyze not only malware, but any Windows software (32-bit only); the tool provides an in-depth look into the inner workings of software applications and may be useful in security\/vulnerability assessments.<\/li>\n<li>Hardcore malware analysts may benefit from the tool as well, but they probably already have adequate or better private tools of their own.<\/li>\n<\/ul>\n<p>I have tested it extensively. It is a private tool that evolved from a few API monitors I wrote in the past, many other tools\/scripts I have written, and my own experience doing in-depth malware analysis, so I hope it will be useful to the community.<\/p>\n<p>The first version is coming soon. Stay tuned!<\/p>\n<p>Note: The software will be available commercially only.<\/p>\n<h2><strong>Some more examples<\/strong><\/h2>\n<p>The following artifacts are extracted instantly:<\/p>\n<ul>\n<li>List of APIs extracted at run-time:\n<ul>\n<li>Gets Procedure Address: WS2_32.dll, accept<\/li>\n<li>Gets Procedure Address: WS2_32.dll, bind<\/li>\n<li>Gets Procedure Address: WS2_32.dll, closesocket<\/li>\n<li>Gets Procedure Address: WS2_32.dll, connect<\/li>\n<li>Gets Procedure Address: WS2_32.dll, getpeername<\/li>\n<li>Gets Procedure Address: WS2_32.dll, getsockname<\/li>\n<li>Gets Procedure Address: WS2_32.dll, getsockopt<\/li>\n<li>&#8230;<\/li>\n<\/ul>\n<\/li>\n<li>User agents used by the malware<\/li>\n<li>Information about the stealing capabilities of the malware (e.g. targeted applications)<\/li>\n<li>Files that the malware tries to find on the system (e.g. 
to actually run)<\/li>\n<li>Various tricks to escape analysis\/HIPS<\/li>\n<li>Various tricks to detect monitoring tools<\/li>\n<li>Access to PhysicalDevices (memory, drives) &#8211; usually bypassing HIPS and infecting MBR<\/li>\n<li>Buffers (read\/written files, read\/written memory, etc.)<\/li>\n<\/ul>\n<pre style=\"padding-left: 60px;\"><span style=\"font-size: xx-small;\"><strong>Injected\/wrapped .exe<\/strong>\r\n4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF 00 00 - MZ.............. \r\n00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................ \r\n0E 1F BA 0E 00 B4 09 CD 21 B8 01 4C CD 21 54 68 - ........!..L.!Th \r\n74 20 62 65 20 72 75 6E 20 69 6E 20 44 4F 53 20 - t be run in DOS \u00a0\r\nD7 52 82 ED 93 33 EC BE 93 33 EC BE 93 33 EC BE - .R...3...3...3.. \r\n10 3B B0 BE 92 33 EC BE 1D 3B B3 BE 97 33 EC BE - .;...3...;...3.. \r\n52 69 63 68 93 33 EC BE 00 00 00 00 00 00 00 00 - Rich.3..........\r\n50 45 00 00 4C 01 06 00 01 A6 4A 46 00 00 00 00 - PE..L.....JF....\r\nB8 00 00 00 00 00 00 00 40 00 00 00 00 00 00 00 - ........@.......\r\n00 00 00 00 00 00 00 00 00 00 00 00 E0 00 00 00 - ................\r\n69 73 20 70 72 6F 67 72 61 6D 20 63 61 6E 6E 6F - is program canno\r\n6D 6F 64 65 2E 0D 0D 0A 24 00 00 00 00 00 00 00 - mode....$.......\r\n10 3B B1 BE 94 33 EC BE 93 33 ED BE 8A 33 EC BE - .;...3...3...3..\r\n10 3B B2 BE 92 33 EC BE 10 3B B6 BE 92 33 EC BE - .;...3...;...3..\r\n00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 - ................\r\n00 00 00 00 E0 00 02 21 0B 01 05 0C 00 90 00 00 - .......!........\r\n\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0\u00a0 \u00a0\r\n<strong>MBR code<\/strong>\r\n33 C0 8E D0 BC 00 7C FB 50 07 50 1F FC BE 1B 7C - 3.....|.P.P....|\r\n38 6E 00 7C 09 75 13 83 C5 10 E2 F4 CD 18 8B F5 - 
8n.|.u..........\r\nF0 AC 3C 00 74 FC BB 07 00 B4 0E CD 10 EB F2 88 - ..&lt;.t...........\r\n80 7E 04 0C 74 05 A0 B6 07 75 D2 80 46 02 06 83 - .~..t....u..F...\r\nBC 81 3E FE 7D 55 AA 74 0B 80 7E 10 00 74 C8 A0 - ..&gt;.}U.t..~..t..\r\n00 B4 08 CD 13 72 23 8A C1 24 3F 98 8A DE 8A FC - .....r#..$?.....\r\n0A 77 23 72 05 39 46 08 73 1C B8 01 02 BB 00 7C - .w#r.9F.s......|\r\nBF 1B 06 50 57 B9 E5 01 F3 A4 CB BD BE 07 B1 04 - ...PW...........\r\n83 C6 10 49 74 19 38 2C 74 F6 A0 B5 07 B4 07 8B - ...It.8,t.......\r\n4E 10 E8 46 00 73 2A FE 46 10 80 7E 04 0B 74 0B - N..F.s*.F..~..t.\r\n46 08 06 83 56 0A 00 E8 21 00 73 05 A0 B6 07 EB - F...V...!.s.....\r\nB7 07 EB A9 8B FC 1E 57 8B F5 CB BF 05 00 8A 56 - .......W.......V\r\n43 F7 E3 8B D1 86 D6 B1 06 D2 EE 42 F7 E2 39 56 - C..........B..9V\r\n8B 4E 02 8B 56 00 CD 13 73 51 4F 74 4E 32 E4 8A - .N..V...sQOtN2..<\/span><\/pre>\n","protected":false},"excerpt":{"rendered":"<p>A few months ago I introduced a new tool called HexDive. The tool speeds up analysis of strings that are extracted from portable executable files (PE). 
It does it by showing only these strings that are the most relevant from &hellip; <a href=\"https:\/\/www.hexacorn.com\/blog\/2013\/06\/06\/clustering-and-advancedin-depth-malware-analysis-with-hexdive-pro\/\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":[],"categories":[28,19,32,9,5],"tags":[],"_links":{"self":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/1936"}],"collection":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/comments?post=1936"}],"version-history":[{"count":15,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/1936\/revisions"}],"predecessor-version":[{"id":1946,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/posts\/1936\/revisions\/1946"}],"wp:attachment":[{"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/media?parent=1936"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/categories?post=1936"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.hexacorn.com\/blog\/wp-json\/wp\/v2\/tags?post=1936"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}