adam | Hexacorn | Page 451

I have recently been toying with clustering of various malicious sample sets: running files through a sandbox and static analysis tools, then normalizing the output and building histograms from it. The results are not mind-blowing, but they are encouraging. They help group malware families into separate buckets, improve log parsing routines, and in some cases can also be leveraged to quickly discover hidden properties of the malware, e.g. encryption keys, User Agents, HTTP verbs, etc. These may then be used for more in-depth analysis of proxy logs and the like.
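As a rough illustration of the "normalize, then histogram" idea, here is a minimal Python sketch. The normalization rules, the placeholder tokens (`<GUID>`, `<HEX>`, `<N>`) and the function names are all assumptions made for this example, not the routines actually used for the results described above.

```python
import re
from collections import Counter, defaultdict

# Hypothetical normalization rules; order matters (most specific first).
_RULES = [
    (re.compile(r"\{?[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12}\}?", re.I), "<GUID>"),
    (re.compile(r"\b[0-9a-f]{8,}\b", re.I), "<HEX>"),
    (re.compile(r"\d+"), "<N>"),
]


def normalize(value: str) -> str:
    """Lowercase a raw attribute value and replace variable parts with tokens."""
    out = value.lower()
    for pattern, token in _RULES:
        out = pattern.sub(token, out)
    return out


def histogram(samples):
    """samples: dict mapping sample id -> iterable of raw attribute values.

    Returns (counts, buckets): how many samples share each normalized value,
    and which sample ids fall into each bucket.
    """
    counts = Counter()
    buckets = defaultdict(set)
    for sample_id, values in samples.items():
        # Count each normalized value once per sample.
        for key in {normalize(v) for v in values}:
            counts[key] += 1
            buckets[key].add(sample_id)
    return counts, buckets


if __name__ == "__main__":
    counts, buckets = histogram({
        "s1": ["Mutex_1234", "Mutex_77"],
        "s2": ["mutex_9"],
        "s3": ["other"],
    })
    for key, n in counts.most_common():
        print(n, key, sorted(buckets[key]))
```

Values that differ only in their random or numeric parts collapse into the same bucket, so the most frequent buckets are natural candidates for family groupings.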

Here is a short list of ‘clusterable’ attributes just in case you want to design your own clustering solution and are looking for a quick cheat list; it is certainly far from being complete, but may give you some pointers:

STATIC

File Name
File Extension
File Size
File Type
- This will have a lot of ‘subtypes’ – for MZ files see details here and here
- For executables – sequence of bytes at the entry point, and at the real entry point (for main, wmain, DllMain, as well as for VB, Delphi code)
- For PE file – for each of these: their names where applicable, sizes, flags, entropy, strings:
  - sections (for list of known sections see here)
  - import tables
  - export tables
- For PE file –
  - PE type
  - Image base
  - Compilation/debug time stamps
  - Resources – number, topology
  - Debug strings
File Entropy
Compiler (PEiD, etc.)
Packer, protector
File hashes (MD5, SHA1, CTPH, …)
Extracted strings
Presence and characteristics of appended data (e.g. installers)
Sequences of code
- Disassembled code
- Decompiled code
- Selected code (e.g. map of calls)
Detection by various AVs
Multimedia properties (e.g. width, height, EXIF data, etc.)
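A few of the static attributes above (file size, hashes, entropy, a crude file type check) can be collected with nothing but the Python standard library. This is only a sketch: the function names and the returned dictionary layout are made up for the example, and CTPH or PE parsing would need third-party tools that are not shown here.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of the byte distribution, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total) for n in Counter(data).values())


def static_features(data: bytes) -> dict:
    """Collect a handful of cheap, clusterable static attributes."""
    return {
        "size": len(data),
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "entropy": round(shannon_entropy(data), 2),
        "is_mz": data[:2] == b"MZ",  # crude check for the MZ header only
    }


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        print(static_features(fh.read()))
```

Rounding the entropy is a deliberate choice here: coarse values bucket better in a histogram than full-precision floats.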

DYNAMIC

Accessed IPs
Accessed URLs
GET and POST Queries
User Agents
Ports used
Created/accessed Mutexes/mutants
Created/accessed Atoms
Created/accessed Window names
Created/accessed Window classes
Created/accessed Windows topology
Windows’ visibility
Windows’ Unicodeness
Windows’ topology
Windows’ titles
Windows’ classes
Crypto used + built-in or API-based
Popular strings used (e.g. copyright banners as seen here)
Execution paths (code, sequences, code blocks, API sequences)
Use of location-independent code
Use of escalation of privileges tricks
Use and type of code injection
Use of kernel drivers (including system DLLs)
Use of stolen certificates
Use of anti-* techniques
Use of 0days
Use of timestomping
Use of dynamically built strings (run-time)
Use of code to adjust privileges
Use of keylogging techniques (and what type: hook, API hook, etc.)
Use of external tools (e.g. cmd.exe, reg.exe, net.exe)
Use of autorun.inf
Use of DKOM
Use of code directly accessing physical drives
Use of code directly accessing physical memory
Use of code directly accessing BIOS
Use of hypervisor
MBR – code modification
MBR – partition table modification
Passwords used for encryption and to access (e.g. FTP/SMTP/IRC)
Dropped file locations, names
Searched path locations, registry names
Targeted applications (e.g. browser, mail, IM and P2P clients, etc.)
Added/modified registry entries
APIs executed and their arguments
- Type of APIs (kernel32 Win32 APIs or ntdll Zw/Nt APIs)
- Delays used in waiting functions
- APIs/techniques used for memory allocation (heap, virtual*, stack-based, etc.)
- APIs/techniques used for self-deletion
- APIs/techniques used for running other .exes
- APIs/techniques used for network (winsock or wininet/also Rtl functions from ntdll)
- APIs/techniques used for network enumeration (Net*, WNet*, Domain*)
- Process enumeration APIs
- …
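To make the "API sequences" attribute a bit more concrete, here is one common way to compare two API traces: break each trace into n-grams and measure the overlap with the Jaccard index. This technique is swapped in purely for illustration; the list above does not say how the sequences were actually compared, and the function names and the sample traces are hypothetical.

```python
def api_ngrams(trace, n=3):
    """Return the set of n-grams (as tuples) found in a list of API names."""
    return {tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: |a & b| / |a | b| (1.0 if both are empty)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    # Hypothetical traces, e.g. as logged by a sandbox.
    t1 = ["CreateFileW", "WriteFile", "CloseHandle", "CreateProcessW"]
    t2 = ["CreateFileW", "WriteFile", "CloseHandle", "ZwSetEaFile"]
    print(jaccard(api_ngrams(t1), api_ngrams(t2)))
```

Samples whose similarity exceeds some threshold can then be placed in the same bucket; a suitable threshold depends on the sample set and has to be tuned by hand.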

Let me interrupt you here…

Okay, okay, I get it!!! It is a never-ending list!!!

update

fixed the title of the post – it’s obviously a version 0.3 and not 3.0 🙂

old post

In my last post I talked about detecting Extended Attributes (used by ZeroAccess malware) using HMFT. Today I got a chance to update it a bit with some more information.

First of all, I clustered some of the ZeroAccess samples I had and came up with a comprehensive list (limited, of course, by the sample set I have) of file locations and their Extended Attributes that are used by the malware:

%SYSTEMROOT%\system32\services.exe::731
%USERPROFILE%\appdata\local\a4ca9b9c\u::@@@
%USERPROFILE%\AppData\Local\{0c9c4ca4-c3a9-47cf-2e3e-4db8bf2ad457}\U::001
%SYSTEMROOT%\$NtUninstallKB16214$\2764741532\U::CFG

You can find a full list of samples using EAs together with hashes (md5_sha1) here.

Secondly, I added some code to HMFT and now it can dump Extended Attribute’s name (and some printable content of the EA value) as well:

   RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 224
      LengthOfAttributeD       = 40
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 0
      FlagsW                   = 0
      AttributeIdentifierW     = 4
      --
      SizeOfContentD          = 16
      OffsetToContentW        = 24
      --
        MFTA_EA
            OfsNextEAD      = 16
            FlagsB          = 0
            EaNameLenB      = 3
            EaValueLenW     = 3
            EaName = FOO
            EaValue= bar

Using the newer version of HMFT on one of the ZeroAccess samples gives the following result after postprocessing with the eads.pl script:

After the HMFT update, eads.pl had to be slightly modified:

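use strict;
use warnings;

my $f='';   # file name taken from the most recent "FileName = ..." line
my $l='';   # previous input line
while (<>)
{
  s/[\r\n]+//g;
  $f = $1 if /FileName = (.+)$/;
  # previous line was an MFTA_EA* header: report that the file has an EA record
  print "$f has $1 record\n" if ($l =~ /(MFTA_EA(_[A-Z]+)?)/);
  # EA name, printed as file::name
  print "$f:".":$1\n" if (/EaName = (.+)$/);
  # named data stream (ADS), printed as file:stream
  print "$f:$1\n" if ($l =~ /MFTA_DATA/&&/AttributeName = (.+)$/);
  $l = $_;
}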

Btw. if you look at the screenshot above you will notice the :SummaryInformation ADS used by this sample (5D23ACF4C2221B687BC96A2701786C13 / AB7EEC68F9438E31523D0A67E7612CA666C8F56A) as well. It can be seen even more clearly in the Process Monitor window during the malware installation:

In terms of APIs used by ZeroAccess to create EAs, I finally came across a few samples that use ZwSetEaFile to do so. Interestingly, none of the samples used this API to create an EA for services.exe; all the samples using this API create the following EA:

%USERPROFILE%\appdata\local\a4ca9b9c\u::@@@

(Please refer to the older post for more information about the context of this discussion.)

You can download the latest HMFT here.

Author Archives: adam

Clustering and Batch Analysis

HMFT 0.3 + Extended Attributes, short update