
JumpLists file names and AppID calculator

April 30, 2013 in Forensic Analysis, Software Releases

JumpList files are an interesting forensic artifact and as such they have been thoroughly explored by many researchers over the last 2-3 years. There is a lot of material out there, and there are also many tools that parse the structure of JumpList files quite well. This is why in this post I will focus not on the content of JumpList files, but on their… file names.

Algorithm

The JumpList file names are created using hash-like values that in turn are based on something that is called AppID. The Forensics Wiki lists many known Jump List file names based on AppIDs; examples include:

  • 918e0ecb43d17e23 used by Notepad (32-bit)
  • 9b9cdc69c1c24e2b used by Notepad (64-bit)
  • 1bc392b8e104a00e used by Remote Desktop

and so on and so forth. The data from Forensics Wiki has been harvested from many sources and it’s a very useful reference for further research.

The algorithm to create a hash-like value is actually ‘sort of known’. There are posts out there suggesting that the AppID is nothing but a CRC64 sum taken from the application path. For example, in this post, an Anonymous poster provided a Hex-Rays Decompiler code snapshot taken from shell32.dll showing how the AppID is generated. When I came across this particular comment I decided to verify it. I applied a CRC64 sum to an example path and compared it with the expected known file name, and since you are reading this post you are probably guessing that it failed miserably :)

Okay, so since it failed and since the algorithm didn’t seem to have been explored in depth yet, I thought I would give it a go. It turned out to be quite simple, but there were a few challenges on the way that may be interesting to know about, so I describe them below. I also ended up writing a perl script that I called AppID calculator (appid_calc.pl). It allows you to calculate an AppID based on a provided string – more about it below as well. You can find a download link to the script at the bottom of this post.

Challenges

Using the code snippet I referred to earlier as guidance, I quickly found the code responsible for generating AppIDs, put the appropriate breakpoints in a debugger, and… immediately understood why the CRC64 (path) didn’t work for me earlier :)

The CRC64 algorithm is indeed applied to a path, but there are a few quirks:

  • The path is first converted to Unicode
  • If the path is located in one of the locations that the system recognizes and treats in a special way, the path is normalized first
  • The CRC64(Path) algorithm applies only to AppIDs automatically generated by the system; at any point in time an application can change its AppID using the SetCurrentProcessExplicitAppUserModelID API, or even apply a window-specific AppID by using IPropertyStore::SetValue to change the PKEY_AppUserModel_ID property of a particular window
  • On top of that, the CRC64 uses a non-standard polynomial

First, let’s talk about the CRC64. There are many CRC algorithms out there. In fact, they differ not only in their length in bits (CRC16, CRC32, CRC64), but also in the configuration of a particular implementation. There are obviously many standard configurations (Wikipedia describes quite a few), but the one used in AppID generation is not on the standard list. I know, because the very first thing I tried was to use all the standard configurations, and all of them failed :-) .

The actual code used by the system relies on a precalculated lookup table, but googling around for the numbers from the table only brought 2-3 hits. In such a case, the usual way of solving the issue is to rip the code from the source and reimplement it, e.g. in perl. This could be done easily. The 2-3 hits I mentioned earlier refer to code that was created as a result of reverse engineering the thumbcache.dll file – it turns out that the very same CRC64 configuration/implementation is used in that DLL.

Exploring the properties of CRC I eventually managed to deduce the CRC configuration and the actual polynomial used to generate the lookup table.

The polynomial used by the AppID algorithm is 0x92C64265D32139A4.

Once I found it out I went to Google again, and this time I also got only 2-3 hits. The first two were the Thumb Cache-related code I already mentioned. The last one was the Microsoft page describing the use of this particular polynomial in an ADSStreamHeader structure:

Crc (8 bytes): A bit-reversed CRC-64 hash of the FCIADS stream from the TimeStamp field to the end of the structure that can be used to validate the integrity of the FCIADS stream. The cyclic redundancy check (CRC) polynomial is x**64 + x**61 + x**58 + x**56 + x**55 + x**52 + x**51 + x**50 + x**47 + x**42 + x**39 + x**38 + x**35 + x**33 + x**32 + x**31 + x**29 + x**26 + x**25 + x**22 + x**17 + x**14 + x**13 + x**9 + x**8 + x**6 + x**3 + 1, with the leading 1 implied. The normal representation is 0x92C64265D32139A4.
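The relationship between the exponent list in the quote and the constant 0x92C64265D32139A4 can be checked with a short sketch. The Python below is not the code from appid_calc.pl; it is only an illustration that rebuilds the constant from the exponents (placing the x**k term at bit 63-k, which is the bit-reversed form) and generates a lookup table from it. The initial value, the final XOR and the UTF-16LE input encoding used by crc64() and hash_string() are assumptions made for the example, and both function names are hypothetical; the post only says that the path is converted to Unicode and does not spell out the remaining CRC parameters.

```python
# Exponents of the CRC polynomial as listed in the Microsoft description
# (the x**64 term is implied and therefore left out).
EXPONENTS = [61, 58, 56, 55, 52, 51, 50, 47, 42, 39, 38, 35, 33, 32,
             31, 29, 26, 25, 22, 17, 14, 13, 9, 8, 6, 3, 0]

MASK64 = 0xFFFFFFFFFFFFFFFF


def reflected_constant(exponents):
    """Bit-reversed representation: the x**k term is stored at bit 63-k."""
    value = 0
    for k in exponents:
        value |= 1 << (63 - k)
    return value


POLY = reflected_constant(EXPONENTS)


def make_table(poly):
    """Build the 256-entry lookup table for a reflected (LSB-first) CRC."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


TABLE = make_table(POLY)


def crc64(data, init=MASK64, xor_out=0):
    """Table-driven reflected CRC-64; init and xor_out are ASSUMED defaults."""
    crc = init
    for b in data:
        crc = TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return (crc ^ xor_out) & MASK64


def hash_string(text):
    """Hash a string after converting it to UTF-16LE (assumed encoding)."""
    return "%016x" % crc64(text.encode("utf-16-le"))


if __name__ == "__main__":
    print("polynomial constant: 0x%016X" % POLY)
    print("table[0x80]:         0x%016X" % TABLE[0x80])
```

In a reflected table the entry for byte 0x80 equals the polynomial constant itself, which is a quick way to recognize the configuration from a ripped lookup table. Whether hash_string() reproduces a documented file name depends on the assumed parameters and on how the input string is cased and normalized, so those should be checked against a known pair before relying on the output.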

That was a good sign and I could now start implementing the appid calculator w/o ripping the lookup tables.

The second issue to solve was the normalization. The paths are normalized using KNOWNFOLDERIDs, so it’s a simple search and replace before applying the CRC.

One aspect of normalization I need to mention is… ambiguity. Depending on the OS (32- vs. 64-bit), different KNOWNFOLDERIDs are applied during the normalization of the path, and it’s quite confusing. I suggest reading the Microsoft page I linked to above for further details.
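The search-and-replace step, including the ambiguity, can be sketched as follows. This is a hypothetical illustration rather than the logic of appid_calc.pl: the FOLDERID_System GUID is the one that appears among the sample inputs listed in the Download section, FOLDERID_SystemX86 comes from Microsoft's KNOWNFOLDERID list, and the prefix table, the function name normalize() and the decision to return every candidate for an ambiguous path are assumptions of this sketch.

```python
# Two KNOWNFOLDERID GUIDs; a real table would contain more entries.
FOLDERID_SYSTEM = "{1AC14E77-02E7-4E5D-B744-2EB1AE5198B7}"
FOLDERID_SYSTEMX86 = "{D65231B0-B2F1-4857-A4CE-A8E7C6EA7D27}"

# Each folder prefix maps to the GUIDs that may apply to it. More than one
# GUID means the path is ambiguous (32-bit vs. 64-bit view of the folder).
# This mapping is an assumption made for illustration.
KNOWN_PREFIXES = {
    "c:\\windows\\system32": [FOLDERID_SYSTEM, FOLDERID_SYSTEMX86],
    "c:\\windows\\syswow64": [FOLDERID_SYSTEMX86],
}


def normalize(path):
    """Return all candidate normalized forms of a path."""
    lowered = path.lower()
    for prefix, guids in KNOWN_PREFIXES.items():
        if lowered == prefix or lowered.startswith(prefix + "\\"):
            rest = path[len(prefix):]
            return [guid + rest for guid in guids]
    # Paths outside the known folders are left untouched.
    return [path]


if __name__ == "__main__":
    for candidate in normalize("c:\\windows\\system32\\notepad.exe"):
        print(candidate)
```

Returning a list makes the ambiguity explicit: each candidate would be hashed separately and compared against the file names found on the system.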

Last, but not least – quite a lot of applications use the SetCurrentProcessExplicitAppUserModelID API to change their AppID after they are executed. For example, the following applications do it (AppID – application name):

  • Microsoft.Silverlight.Offline – Silverlight
  • Microsoft.InternetExplorer.Default – Internet Explorer
  • VMware.Workstation.vmplayer – VMware Player
  • Microsoft.Windows.MediaPlayer32 – Windows Media Player (32-bit)
  • Microsoft.Windows.MediaPlayer64 – Windows Media Player (64-bit)

For this reason, attempting to find e.g. the AppID of c:\program files\Internet Explorer\iexplore.exe doesn’t really make sense, as all IE windows are grouped under the Microsoft.InternetExplorer.Default AppID.

Examples

AppIDs of Internet Explorer and Sticky Notes

appid_1

These can be confirmed by looking at the Forensics Wiki:

  • Microsoft.InternetExplorer.Default – 28C8B86DEAB549A1

appid_2

  • Microsoft.Windows.StickyNotes – 337ED59AF273C758

appid_3

 Notepad

appid_4

You may notice that in this example there are 2 different AppIDs shown. This is because of the ambiguity I mentioned earlier; applications running on 64-bit systems can be executed in more than one configuration, and since WOW64 folder redirection is happening, the AppID needs to be calculated in context.

The Notepad path looks the same to both the 32- and 64-bit application (because of WOW64 folder redirection):

  • c:\windows\system32\notepad.exe

but the AppID depends on the type of the Notepad .exe file:

  • if it is 32-bit, the AppID is 918E0ECB43D17E23
  • if 64-bit, the AppID is 9B9CDC69C1C24E2B.

This can also be confirmed via the Forensics Wiki:

appid_6

Internet Explorer – via path

It gets even more complicated with the Program Files folder, as it has two versions – with and without (x86) – and 32-/64-bit applications both ‘see’ Program Files the same way. As an example, we could try to generate a hash for Internet Explorer in various configurations by running the appid calculator and providing it with the path c:\Program Files\Internet Explorer\iexplore.exe. As mentioned earlier, IE uses an AppID that it sets up during launch, so you should never see the AppIDs shown on the screenshot below, but it is a simple example that shows the various configurations of the Program Files folder using a well-known path.

appid_5

Again, I strongly suggest reading the Microsoft article about KNOWNFOLDERIDs. The appid calculator provides a link to it as well if the path is known to be ambiguous (system32, program files, program files\common).

Download

You can find the script here. This is a first version, coded in a hurry so it may contain bugs. If you find any issues, please let me know. Thanks!

To run:

perl appid_calc.pl

If no argument is passed to it, it will calculate a few sample AppIDs – the examples illustrate various ways one can provide the path to the script:

  • c:\windows\notepad.exe
  • c:\windows\system32\notepad.exe
  • c:\windows\syswow64\notepad.exe
  • {1AC14E77-02E7-4E5D-B744-2EB1AE5198B7}\notepad.exe
  • c:\program files\Internet Explorer\iexplore.exe
  • MICROSOFT.INTERNETEXPLORER.DEFAULT

Java cache file names

April 19, 2013 in Forensic Analysis, Software Releases

I was wondering how Java generates the file names for its temporary cache files, and after googling around I found the answer in the Java source code – the function responsible is called generateCacheFileName and its implementation has changed over time; here is how it is done in JDK 5 and 6/7:

JDK 5.xx

Files are saved in the following location:

  • %USERPROFILE%\Application Data\Sun\Java\Deployment\
    cache\javapi\v1.0\[cachefilename]

The procedure for generating [cachefilename] is described here:

JDK 6.xx-7.xx

Files are saved in the following location:

  • %USERPROFILE%\Local Settings\Application Data\Sun\Java\Deployment\
    cache\6.0\[cachebucket]\[cachefilename]

The procedure for generating [cachebucket]\[cachefilename] is described here:

The code

I ripped the code from these sources and created a simple Java snippet that helps to test the cache file name for a given URL. At the moment it has a small bug, but I hope you won’t notice it :)

Example – JRE 1.5

I googled around and found an old applet that worked under JRE 1.5, then visited the page so that the cached files could be created; the URL passed to the cachename Java program produces exactly the same result:

javacache_1

Example – JRE 1.6-1.7

I simply visited the Oracle web page that detects the browser and let the applet load:

javacache_2

Download

You can download the code here.

To compile, run:

javac cachename.java

To execute, run:

java cachename url

 

RegRipper Ripper (3R) and the list of reg keys covered by RR plugins

April 4, 2013 in 3RPG, Forensic Analysis

update

Updated 3R to cover the latest archive from the RegRipper site – plugins20130403.zip (new version introduced over 40 new scripts)

old post

I got curious about which keys are already covered by the existing 280+ RegRipper plugins, so I wrote a quick and dirty script to retrieve the data from all plugins in an automated way. For the fun of it, I named the script RegRipper Ripper (3R).

The script is here, and the result of running it over the latest bundle is available here.

You may use the list to see what’s already covered and… avoid writing a plugin for a key that is already handled.

3R is a dumb script, so there are a few things I had to fix manually (but still inside the script, so it can be used to regenerate the tables any time it is needed, e.g. after a bundle update). I hope there are no mistakes, but if you spot any, please let me know and I will fix them. Thanks!

3RPG – 4 RegRipper Plugins in 15 minutes

March 15, 2013 in 3RPG, Forensic Analysis, Software Releases

In this post I show how to quickly develop 4 plugins using 3RPG. Except for the documentation (this post) it took barely 10-15 minutes.

You can download plugins here.

01. Detecting presence of 7zip on the system

7Zip has a key in the following location

HKEY_LOCAL_MACHINE\SOFTWARE\7-Zip

This is enough to build the script:

01_7zip1

Note that the name of the script is automatically prefixed with an underscore (7zip -> _7zip) for names starting with digits (it’s because perl doesn’t ‘like’ it).

Also, when you paste the 7zip registry key and change the focus, 3RPG will automatically strip the HKEY_LOCAL_MACHINE\SOFTWARE part:

01_7zip2

Now click the code – 3RPG will automatically select it all for your convenience.

01_7zip3

You can now copy this to any editor and save – use a name highlighted in red and with an extension .pl i.e. _7zip.pl.

Then run:

perl rip.pl -r SOFTWARE.copy0 -p _7zip

The result:

01_7zip4
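The two automatic adjustments mentioned in this example (prefixing a script name that starts with a digit, and stripping the HKEY_LOCAL_MACHINE\SOFTWARE part of a pasted key) are easy to picture in code. The sketch below is not 3RPG's own JavaScript; it is a hypothetical Python rendering of the described behaviour, and both function names are made up for the illustration.

```python
import re


def plugin_name(name):
    # Names starting with a digit get an underscore prefix (7zip -> _7zip).
    return "_" + name if re.match(r"\d", name) else name


def strip_hive_prefix(key):
    # Drop a leading HKEY_LOCAL_MACHINE\SOFTWARE\ part, ignoring case.
    # Only the prefix mentioned in the post is handled here.
    return re.sub(r"^HKEY_LOCAL_MACHINE\\SOFTWARE\\", "", key, flags=re.IGNORECASE)


if __name__ == "__main__":
    print(plugin_name("7zip"))
    print(strip_hive_prefix("HKEY_LOCAL_MACHINE\\SOFTWARE\\7-Zip"))
```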

02. Listing persistent network mappings

All mapped drives are listed under the following key:

HKEY_CURRENT_USER\Network

Again, we run through the same exercise as previously – this time we include ‘Yes, scan subkeys, depth=2’

02_netmap1

Then run:

perl rip.pl -r NTUSER.DAT -p netmap

and the result is:

02_netmap2b

03. Listing all possible CLSID autostart entries

Amongst the various lesser-known autostart mechanisms that I listed in my older post is adding or re-using entries of COM servers. Such a technique can be used to introduce man-in-the-middle code for legitimate plugins, shell extensions, etc.

The information about the COM servers is stored under the following key:

HKEY_LOCAL_MACHINE\SOFTWARE\Classes\CLSID

The names of DLLs, EXEs, etc. are usually listed under {Default} value, so the plugin below will list (going recursively through the whole node) all possible {Default} values listed under CLSID node.

03_clsid1

We run it as:

perl rip.pl -r Software2 -p clsid

And the results are:

03_clsid2

This is not a perfect solution, as many {Default} values don’t include a file name, but we could either grep the results for a specific extension, e.g. dll, or patch the script manually and add a better routine (e.g. only list values under InprocServer32 and LocalServer32).

03_clsid3

Last, but not least – running this plugin often probably doesn’t make sense as it’s very slow, but it is a simple example that demonstrates how to search for {Default} values.

04. Listing keys with binary data

This is just another simple example showing how REG_BINARY data is presented in the output of plugins generated with 3RPG.

For the example, I will look at the key

HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\
CurrentVersion\Print\Printers\Microsoft XPS Document Writer

associated with Microsoft XPS Document Writer and its value Default DevMode.

I don’t know what exactly is inside this value, but since it contains a binary blob, it will serve the purpose here.

04_xps1

We run it as:

perl rip.pl -r Software2 -p xps

And the results are:

04_xps2

That’s it! Thanks for reading!

3RPG – Rapid RegRipper Plugin Development

March 14, 2013 in 3RPG, Forensic Analysis, Software Releases

Inspired by DFIR posts from users (often non-programmers) requesting help with writing/improving RegRipper plugins, I created a new tool that aims at making the development of RR plugins much faster.

The tool is called 3RPG and it’s oriented mainly at non-programmers and less experienced programmers. Of course, if you are an old school perl programmer, go ahead and try it as well. Any feedback and comments will be much appreciated.

What is 3RPG?

3RPG is a web form that helps you to quickly build Plugins for RegRipper by Harlan Carvey.

You just need to fill-in a few fields and the code of the new plugin will be ‘developed’ instantly in front of your eyes.

You can go and check how it works here – a screenshot worth 1000 words should help you get the idea:

3rpg_1

Benefits a.k.a. why 3RPG was created?

If you are a non-programmer…

  • You can use a web form to instantly create your own RegRipper Plugin for a specific registry node/key
  • If you need to add extra features, you can pass such script with example data to more experienced RegRipper plugin programmers – trust me, they will appreciate the effort you put into research and will be more eager to help
  • You can save 3RPG as an HTML page and use it offline

If you are a programmer…

  • You know that writing new RegRipper plugins ‘by hand’ is kinda painful, i.e. it’s easier to modify an existing script to add features than to start from scratch
  • Creating new scripts is usually a copy and paste game – there is always a chance for making a silly typo or mistake
  • In general – in many cases simply (recursively) enumerating a specific registry node/key and cherry-picking something with a simple filter is enough
  • Also, adding a generic data print mechanism for all possible registry data types helps to quickly ‘analyze’ plugins’ output w/o any extra effort
  • ..and this is exactly what the 3RPG offers; more complex scenarios require (obviously) some manual coding
  • You can also fetch the template and adjust it to your needs manually – I am confident that with small modifications it may support all possible registry retrieval needs
  • If you are curious about technical details, I talk about it at the bottom of this post

How to use 3RPG?

Just go to the 3RPG Wizard, fill in the form (takes 1-2 minutes), then copy and paste the resulting script and save to a file – once you do, you are ready to go!

To run/test the script, use the newly created file (here myscript) with RegRipper:

perl rip.pl -r <hive> -p myscript

For a typical script, these fields are required:

  • a script name e.g. myplugin.pl
  • a hive name(s) e.g. Software
  • a node e.g. Microsoft\Windows\CurrentVersion\Run
  • a key name/value (works like a filter) e.g. x86
  • if you want to scan subkeys (recursively, you can also specify the depth)
  • if you want to include Wow6432Node keys (typically, you do since many new systems are 64-bit)

and then leave the rest of the fields with default values.

Share!

If you write a new plugin, share the script with the community. If you do, please fill in the rest of the fields to avoid generic/default values in the scripts. Thanks!

 

Examples

Software \ Run key enumeration

Implementing a classic Run key enumeration for the Software hive is easy – it’s actually already written for you on the 3RPG page (it’s based on default values of 3RPG).

Just copy the script from 3RPG page

3rpg_1c

and save it as ‘myscript.pl’, then run it as:

rip.pl -r SOFTWARE.copy0 -p myscript

Running it with a test hive gives the following results:

3rpg_2

Software \ Run key enumeration with a specific value

This is a similar example to the previous one; we just want to narrow down the search by looking for e.g. ‘MSN’.

We just need to type ‘msn’ (it’s case insensitive) in the ‘What keys/values would you like to include?‘ field:

3rpg_3

Saving the resulting script and running it as before will only show keys/values/data for values/data that contain ‘msn’ (keys are not checked as you are enumerating recursively anyway).

3rpg_4

Technical details

3RPG is a web form. It’s written in HTML + JavaScript. As a base for the plug-in I relied on my old generic RR plugin template that I used in the past. It exploits the fact that the registry data is stored in a tree-like fashion, so recursive enumeration is a natural way of parsing such data w/o going into intricacies of parsing specific keys, values, and conditional processing. It is also very similar to the way command line reg.exe works when executed with ‘query’ or ‘query /s’.

Currently, the following features are supported:

  • 3RPG is interactive – changes to the script are instantly visible and highlighted in the source code
  • A script name can be specified from the form
  • A hive can be selected manually, but the script will try to select the correct one based on the key, i.e. some hive name(s) are automatically selected when key names including substrings like ‘HKEY_LOCAL_MACHINE\Software’ are pasted
  • Enumeration of keys can be recursive, with a specified depth
  • Filtering of key names/values is possible
  • Code for parsing Wow6432Node nodes can be added with a single click
  • Data dumping is supported for all registry data types (non-printable data is printed as hex)
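The recursive enumeration with a depth limit, a filter and a generic data dump can be pictured with a small sketch. This is not the Perl template that 3RPG emits; it is a hypothetical Python illustration that walks a toy dictionary standing in for a registry node, and the names enumerate_key(), format_data() and SAMPLE are invented for the example.

```python
def format_data(data):
    # Non-printable (binary) data is shown as hex, everything else as text.
    return data.hex() if isinstance(data, bytes) else str(data)


def enumerate_key(node, path, depth, needle=None):
    """Yield (key path, value name, data), recursing at most `depth` levels.

    `node` is a toy stand-in for a registry key: a dict with 'values'
    (name -> data) and 'subkeys' (name -> node). If `needle` is given,
    only values whose name or data contain it (case-insensitively) are kept.
    """
    for name, data in node.get("values", {}).items():
        text = (name + " " + format_data(data)).lower()
        if needle is None or needle.lower() in text:
            yield path, name, data
    if depth > 0:
        for sub_name, sub_node in node.get("subkeys", {}).items():
            sub_path = path + "\\" + sub_name
            yield from enumerate_key(sub_node, sub_path, depth - 1, needle)


# Made-up sample data for illustration only.
SAMPLE = {
    "values": {"MSNMessenger": "msnmsgr.exe /background"},
    "subkeys": {
        "OptionalComponents": {
            "values": {"Installed": b"\x01\x00"},
            "subkeys": {},
        }
    },
}

if __name__ == "__main__":
    for key, name, data in enumerate_key(SAMPLE, "Run", depth=1):
        print(key, "|", name, "|", format_data(data))
```

Because key names are not filtered, the walk always descends into subkeys and applies the filter only to value names and data, which mirrors the behaviour described in the ‘msn’ example.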

Bugs

It’s the first version, so bugs are there for sure; if you spot any, please do let me know.

Thanks in advance!

HMFT 0.3 + Extended Attributes, short update

February 17, 2013 in Anti-Forensics, Compromise Detection, Forensic Analysis, HMFT, Malware Analysis

update

fixed the title of the post  – it’s obviously a version 0.3 and not 3.0 :-)

old post

In my last post I talked about detecting Extended Attributes (used by ZeroAccess malware) using HMFT.  Today I got a chance to update it a bit with some more information.

First of all, I clustered some of the ZeroAccess samples I had and came up with a comprehensive list (limited, of course, by the sample set I have) of file locations and their Extended Attributes that are used by the malware:

  • %SYSTEMROOT%\system32\services.exe::731
  • %USERPROFILE%\appdata\local\a4ca9b9c\u::@@@ 
  • %USERPROFILE%\AppData\Local\{0c9c4ca4-c3a9-47cf-2e3e-4db8bf2ad457}\U::001
  • %SYSTEMROOT%\$NtUninstallKB16214$\2764741532\U::CFG

You can find a full list of samples using EAs together with hashes (md5_sha1) here.

Secondly, I added some code to HMFT and now it can dump Extended Attribute’s name (and some printable content of the EA value) as well:

   RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 224
      LengthOfAttributeD       = 40
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 0
      FlagsW                   = 0
      AttributeIdentifierW     = 4
      --
      SizeOfContentD          = 16
      OffsetToContentW        = 24
      --
        MFTA_EA
            OfsNextEAD      = 16
            FlagsB          = 0
            EaNameLenB      = 3
            EaValueLenW     = 3
            EaName = FOO
            EaValue= bar

Using the newer version of HMFT on one of the ZeroAccess samples gives the following result after postprocessing with the eads.pl script:

2013-02-17_zeroaccess_ea1

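After the HMFT update, eads.pl had to be slightly modified:

use strict;
use warnings;
my $f='';   # name of the file the current record belongs to
my $l='';   # previous input line
while (<>)
{
  s/[\r\n]+//g;                   # strip line endings
  $f = $1 if /FileName = (.+)$/;  # remember the most recent file name
  # the previous line was an MFTA_EA or MFTA_EA_INFORMATION marker
  print "$f has $1 record\n" if ($l =~ /(MFTA_EA(_[A-Z]+)?)/);
  # EA name, printed as file::name (the string is split so that perl
  # does not read "$f::" as a package name)
  print "$f:".":$1\n" if (/EaName = (.+)$/);
  # named data attribute (ADS), printed as file:stream
  print "$f:$1\n" if ($l =~ /MFTA_DATA/&&/AttributeName = (.+)$/);
  $l = $_;
}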

Btw. if you look at the screenshot above you will notice the :SummaryInformation ADS used by this sample (5D23ACF4C2221B687BC96A2701786C13/ AB7EEC68F9438E31523D0A67E7612CA666C8F56A) as well – it can be seen even better in the Process Monitor window during the malware installation:

2013-02-17_zeroaccess_ea2

In terms of APIs used by ZeroAccess to create EAs, I finally came across a few samples that use ZwSetEaFile to do so. Interestingly, none of the samples used this API to create an EA for services.exe – all the samples using this API create the following EA:

  • %USERPROFILE%\appdata\local\a4ca9b9c\u::@@@

(Please refer to the older post for more information about the context of this discussion.)

You can download latest hmft here.

 

Detecting Extended Attributes (ZeroAccess) and other Frankenstein’s Monsters with HMFT

January 25, 2013 in Anti-Forensics, Compromise Detection, Forensic Analysis, HMFT, Malware Analysis

The topic of Extended Attributes (EA) has been recently covered in an excellent post by Corey. Entitled Extracting ZeroAccess from NTFS Extended Attributes, it goes into (amazing) depth explaining what an EA is and how to extract this artifact from the system. It’s pure forensic gold, and if you haven’t read this post yet, please go ahead and do so before reading mine.

Similarly to Corey, I was very interested in researching EA, and I finally took some time tonight to have a deeper look at it myself. I actually wanted to dig in the code more than the $MFT artifacts alone not only to have something to write about (after all, Corey already covered everything! :-) ), but also because I wanted to see how the EA is actually created and what system functions/APIs are used by malware. The reason behind this curiosity was improvement of my analysis tools and techniques, and a few other ideas that I will be quiet about for the moment.

I first assumed that the ZeroAccess’ EAs are created using ZwSetEaFile/NtSetEaFile function from ntdll.dll. I saw this API name popping up on some blogs and I saw it being referenced in my ZeroAccess memory/file dumps so it was a natural ‘breakpoint’ choice for OllyDbg analysis:

zeroaccess_ea_1

To my surprise, none of the samples I checked used this function at all!

Curious, I started digging into it a bit more and realized that for the samples I looked at, the EAs are actually created not by  ZwSetEaFile/NtSetEaFile function, but by ZwCreateFile/NtCreateFile.

Surprised?

I was!

Looking at the documentation, you can see the following function parameters described on MSDN:

NTSTATUS NtCreateFile(
  _Out_     PHANDLE FileHandle,
  _In_      ACCESS_MASK DesiredAccess,
  _In_      POBJECT_ATTRIBUTES ObjectAttributes,
  _Out_     PIO_STATUS_BLOCK IoStatusBlock,
  _In_opt_  PLARGE_INTEGER AllocationSize,
  _In_      ULONG FileAttributes,
  _In_      ULONG ShareAccess,
  _In_      ULONG CreateDisposition,
  _In_      ULONG CreateOptions,
  _In_      PVOID EaBuffer,
  _In_      ULONG EaLength
);

Yes, it’s that simple.

One thing to note – the EA is added to files on both Windows XP and Windows 7, but I observed the modification of services.exe only under Windows 7. On Windows XP, the malware only added an EA to the ‘U’ file and nothing else.

Okay, I mentioned I had a couple of ideas why I wanted to research this feature. Now it’s time to reveal them!

Idea #1 – POC

Once I found out what APIs are being used by the malware, I was also able to produce a simple snippet of code that reproduces the functionality:

.586
.MODEL FLAT,STDCALL

 o equ OFFSET
 include    windows.inc
 include    kernel32.inc
 includelib kernel32.lib
 include    ntdll.inc
 includelib ntdll.lib
 include    masm32.inc
 includelib masm32.lib

IO_STATUS_BLOCK STRUCT
    union
    Status        dd ?
    Pointer        dd ?
    ends
    Information    dd ?
IO_STATUS_BLOCK ENDS

.data?
 file db 256 dup (?)
 fa   db 256 dup (?)
 _FILE_FULL_EA_INFORMATION struct
   NextEntryOffset dd ?
   Flags           db ?
   EaNameLength    db ?
   EaValueLength   dw ?
   EaName          db ?
 _FILE_FULL_EA_INFORMATION ends
 FEA equ _FILE_FULL_EA_INFORMATION
 io IO_STATUS_BLOCK <>
.code
  Start:
  invoke GetCL,1, o file
  lea    edi,[fa+_FILE_FULL_EA_INFORMATION.EaName]
  invoke GetCL,2, edi
  invoke lstrlenA,edi
  lea    esi,[fa+_FILE_FULL_EA_INFORMATION.EaNameLength]
  mov    [esi],al
  add    edi,eax
  inc    edi
  invoke GetCL,3, edi
  invoke lstrlenA,edi
  lea    esi,[fa+_FILE_FULL_EA_INFORMATION.EaValueLength]
  mov    [esi],al
  add    edi,eax
  invoke CreateFileA, o file, \
                      GENERIC_WRITE, \
                      0, \
                      NULL, \
                      CREATE_NEW, \
                      FILE_ATTRIBUTE_NORMAL, \
                      NULL
  xchg   eax,ebx
  mov    eax,edi
  sub    eax,o fa
  invoke NtSetEaFile,ebx,o io,o fa, eax
  invoke CloseHandle,ebx
  invoke ExitProcess,0
END Start

This code can be used for testing purposes in a lab environment.

You can either compile the code yourself using masm32 or you can use a precompiled binary – download it here.

To run:

ea.exe <full path name to a file> <EA name> <EA value>

e.g.:

ea.exe g:\test.txt foo bar

Remember to specify a full path to the file. Also, choose a non-existing file name (the program won’t work with files that are already present).

Last, but not least – there are no error checks; you can add them yourself if you wish :-)
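For readers who do not read MASM, the buffer that the POC assembles in memory before calling NtSetEaFile can be sketched in a few lines. The layout follows the _FILE_FULL_EA_INFORMATION structure from the listing above (a 4-byte next-entry offset, a flags byte, a name-length byte, a 2-byte value length, then the name, a NUL terminator and the value). The Python below only builds and parses such a buffer; it does not call any Windows API or touch the file system, and the names build_ea() and parse_ea() are hypothetical.

```python
import struct

# NextEntryOffset (dd), Flags (db), EaNameLength (db), EaValueLength (dw)
HEADER = struct.Struct("<IBBH")


def build_ea(name, value, flags=0):
    """Build a single entry; NextEntryOffset is 0 because it is the last one."""
    header = HEADER.pack(0, flags, len(name), len(value))
    # The name is followed by a NUL byte, then the value starts.
    return header + name + b"\x00" + value


def parse_ea(buf):
    """Parse the first entry of a buffer and return (name, value)."""
    _next, _flags, name_len, value_len = HEADER.unpack_from(buf, 0)
    name_start = HEADER.size
    value_start = name_start + name_len + 1  # skip the NUL after the name
    name = buf[name_start:name_start + name_len]
    value = buf[value_start:value_start + value_len]
    return name, value


if __name__ == "__main__":
    buf = build_ea(b"foo", b"bar")
    print(len(buf), buf.hex())
    print(parse_ea(buf))
```

For the example arguments foo and bar the buffer is 15 bytes long, which is the same length the POC computes by subtracting the start of the buffer from the end of the value.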

Idea #2 – Reduce the FUD factor

While it is a novel technique, it is not very advanced - a single API call does all the dirty work to _create_ the EA.

To _detect_ an EA is not very difficult either – as long as you have the right tool to do so :-)

Idea #3 – Show how to detect EA on a live system

Now that I got a POC, I can run it:

ea.exe g:\test.txt foo bar

and then analyze changes introduced to the file system.

I can do it quickly  with hmft.

hmft -l g: mft_list

I tested the program on a small drive that I use for my tests. I formatted it first to ensure its MFT is clean:

hmft_ea_1

I then opened the mft_list file in Total Commander’s Lister and searched for MFTA_EA.

hmft_ea_2

I am pasting the full record for your reference:

  [FILE]
    SignatureD                    = 1162627398
    OffsetToFixupArrayW           = 48
    NumberOfEntriesInFixupArrayW  = 3
    LogFileSequenceNumberQ        = 1062946
    SequenceValueW                = 1
    LinkCountW                    = 1
    OffsetToFirstAttributeW       = 56
    FlagsW                        = 1
    UsedSizeOfMFTEntryD           = 368
    AllocatedSizeOfMFTEntryD      = 1024
    FileReferenceToBaseRecordQ    = 0
    NextAttributeIdD              = 5
   --

    RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 16
      LengthOfAttributeD       = 96
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 0
      FlagsW                   = 0
      AttributeIdentifierW     = 0
      --
      SizeOfContentD          = 72
      OffsetToContentW        = 24
      --
        MFTA_STANDARD_INFORMATION
            CreationTimeQ         = 130036100539989520
            ModificationTimeQ     = 130036100539989520
            MFTModificationTimeQ  = 130036100539989520
            AccessTimeQ           = 130036100539989520
            FlagsD                = 32
            MaxNumOfVersionsD     = 0
            VersionNumberD        = 0
            ClassIdD              = 0
            OwnerIdD              = 0
            SecurityIdD           = 261
            QuotaQ                = 0
            USNQ                  = 0
            CreationTime (epoch)    = 1359136453
            ModificationTime (epoch)  = 1359136453
            MFTModificationTime (epoch)  = 1359136453
            AccessTime (epoch)           = 1359136453
   --

    RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 48
      LengthOfAttributeD       = 112
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 0
      FlagsW                   = 0
      AttributeIdentifierW     = 2
      --
      SizeOfContentD          = 82
      OffsetToContentW        = 24
      --
        MFTA_FILE_NAME
            ParentID6             = 5
            ParentUseIndexW       = 5
            CreationTimeQ         = 130036100539989520
            ModificationTimeQ     = 130036100539989520
            MFTModificationTimeQ  = 130036100539989520
            AccessTimeQ           = 130036100539989520
            CreationTime (epoch)    = 1359136453
            ModificationTime (epoch)  = 1359136453
            MFTModificationTime (epoch)  = 1359136453
            AccessTime (epoch)           = 1359136453
            AllocatedSizeQ        = 0
            RealSizeQ             = 0
            FlagsD                = 32
            ReparseValueD         = 0
            LengthOfNameB         = 8
            NameSpaceB            = 3
     FileName = test.txt
   --

    RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 128
      LengthOfAttributeD       = 24
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 24
      FlagsW                   = 0
      AttributeIdentifierW     = 1
      --
      SizeOfContentD          = 0
      OffsetToContentW        = 24
      --
        MFTA_DATA
   --

   
    RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 208
      LengthOfAttributeD       = 32
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 0
      FlagsW                   = 0
      AttributeIdentifierW     = 3
      --
      SizeOfContentD          = 8
      OffsetToContentW        = 24
      --
        MFTA_EA_INFORMATION
   --

    RESIDENT ATTRIBUTE
      AttributeTypeIdentifierD = 224
      LengthOfAttributeD       = 40
      NonResidentFlagB         = 0
      LengthOfNameB            = 0
      OffsetToNameW            = 0
      FlagsW                   = 0
      AttributeIdentifierW     = 4
      --
      SizeOfContentD          = 16
      OffsetToContentW        = 24
      --
        MFTA_EA

There are two EA-related entries here:

  • MFTA_EA_INFORMATION
  • MFTA_EA record

Manual analysis like this is quite tiring, so we can write a short Perl snippet to help with post-processing:

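use strict;
use warnings;
my $f='';   # most recent FileName seen in the hmft output
my $l='';   # previous input line
while (<>)
{
  s/[\r\n]+//g;
  $f = $1 if /FileName = (.+)$/;
  # report EA-related attributes (MFTA_EA, MFTA_EA_INFORMATION) seen on the previous line
  print "$f has $1 record\n" if ($l =~ /(MFTA_EA(_[A-Z]+)?)/);
  $l = $_;
}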

Saving it as ea.pl and running it as:

ea.pl mft_list

produces the following output:

[screenshot: hmft_ea_3]

Idea #4 – Detect ZeroAccess with hmft

It’s simple :)

  • I ran hmft before the ZeroAccess installation
  • Then I infected my test box
  • I then ran hmft after the ZeroAccess installation

[screenshot: zeroaccess_ea_2]

At this stage, all I had to do was to run ea.pl on both outputs and I got the following results:

[screenshot: zeroaccess_ea_3]

Or, for the sake of copy & paste (and web bots :) ):

r:\>ea.pl before_installation
V20~1.6 has MFTA_EA_INFORMATION record
V20~1.6 has MFTA_EA record

r:\>ea.pl after_installation
U has MFTA_EA_INFORMATION record
U has MFTA_EA record
V20~1.6 has MFTA_EA_INFORMATION record
V20~1.6 has MFTA_EA record
U has MFTA_EA_INFORMATION record
U has MFTA_EA record
services.exe has MFTA_EA_INFORMATION record
services.exe has MFTA_EA record

As we can see, the malware activity is immediately visible.
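The before/after comparison can also be automated. Below is a minimal sketch that takes the two ea.pl outputs and prints only the records that were not in the baseline; the helper name new_ea_lines is hypothetical, and the sample data is simply the output quoted above.

```perl
use strict;
use warnings;

# Return the lines of @$after that are not accounted for by @$before.
# Duplicates are handled by counting, so a record that appears twice
# after infection but once before is still reported once.
sub new_ea_lines {
    my ($before, $after) = @_;
    my %baseline;
    $baseline{$_}++ for @$before;
    my @new;
    for my $line (@$after) {
        if ($baseline{$line}) { $baseline{$line}--; next; }
        push @new, $line;
    }
    return @new;
}

# ea.pl output quoted in this post
my @before = (
    'V20~1.6 has MFTA_EA_INFORMATION record',
    'V20~1.6 has MFTA_EA record',
);
my @after = (
    'U has MFTA_EA_INFORMATION record',
    'U has MFTA_EA record',
    'V20~1.6 has MFTA_EA_INFORMATION record',
    'V20~1.6 has MFTA_EA record',
    'U has MFTA_EA_INFORMATION record',
    'U has MFTA_EA record',
    'services.exe has MFTA_EA_INFORMATION record',
    'services.exe has MFTA_EA record',
);

print "NEW: $_\n" for new_ea_lines(\@before, \@after);
```

In practice the two arrays would be read from the before_installation and after_installation outputs rather than hardcoded.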

Btw. V20~1.6 is a $MFT FILE record that refers to C:\Windows\CSC\v2.0.6 and is related to Offline files (client-side caching). I don’t have any information about the content of this EA. Perhaps someone will be more curious than me to poke around there :-)

Idea #5 – Create a Frankenstein’s monster

Using EA and ADS (Alternate Data Streams) with a single file is also possible.

You can use ea.exe to create such a Frankenstein’s monster in 2 simple steps:

  • by running it first with a filename only – this will create the EA record
  • and then re-running it with a stream name – this will create the ADS, but EA for ADS will fail (sometimes it’s OK to fail :) )

The result is shown on the following screenshot:
[screenshot: ea_frankensteins_monster_1]

Using hmft and a combination of ea.pl and ads.pl (posted in an older post related to HMFT) in a single eads.pl script:

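use strict;
use warnings;
my $f='';   # most recent FileName seen in the hmft output
my $l='';   # previous input line
while (<>)
{
  s/[\r\n]+//g;
  $f = $1 if /FileName = (.+)$/;
  # EA-related attribute on the previous line
  print "$f has $1 record\n" if ($l =~ /(MFTA_EA(_[A-Z]+)?)/);
  # named $DATA attribute (ADS): MFTA_DATA on the previous line, AttributeName on this one
  print "$f:$1\n" if ($l =~ /MFTA_DATA/&&/AttributeName = (.+)$/);
  $l = $_;
}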

we can easily detect such a beast as well.

That’s all, thanks for reading!

Malware attacking POS systems

December 19, 2012 in Compromise Detection, Forensic Analysis, Malware Analysis

Recently there have been quite a lot of technical posts about RAM scrapers targeting Point Of Sale (POS) systems, i.e. malware stealing track data directly from the memory of systems involved in processing credit cards within the Payment Card Industry (PCI). I am speaking – of course – about the Dexter malware. You can find selected (good, technical and informative) articles covering this particular malware here: Verizon, Seculert, Volatility Labs, Trustwave.

It’s good to see that the actual samples are now either being shared publicly or at least discussions about their internals are becoming available to the public eye. Xylitol is definitely leading here as he has been talking about this topic and specific samples a few times this year (example here and here), and sporadically, some of the PFI companies write a blog post or two, or present their findings at security conferences. One thing worth mentioning here is that some ‘juicy’ knowledge about specific RAM scraping samples has been shared many times in the past, but it has never gained as much exposure as it probably should have, e.g. many hashes of RAM scrapers have been mentioned in public advisories from card schemes (here, here, and here). Still, access to the actual samples is very limited, plus the hashes of samples keep changing (they are often recompiled for each new compromise).

RAM Scraping and theft of data in transit

What is ‘RAM Scraping’?

RAM scraping is a different way of saying that malware reads and parses data directly from the memory (or a file containing a memory dump) of a legitimate application responsible for credit card processing. Such ‘sniffing’ is usually scheduled to run at regular intervals. The malware can also directly ‘plug’ or hook into the payment application’s internals and analyze the content of its buffers used to temporarily store credit card data in transit.

RAM scraping is not a new idea; many carding attacks over at least the last 5 years have relied on this technique and are described in detail by Trustwave and Verizon Business – well-known security companies that specialize in PFI investigations. The RAM scraping technique is extremely simple, effective and… quiet – except for the time when hackers come to the system to install the malware and occasionally come back to extract the accumulated data, there is not much suspicious or easily identifiable activity going on on the compromised system.

It’s the ‘in transit’ aspect of RAM scraping that makes the attack so successful; even if the credit card data never touches the disk (e.g. on a properly hardened and configured system), the malware can still intercept it as it is injected into a transaction process and actively participates in it as an ‘observer’. It acts in a way similar to a man-in-the-middle attack with no modification of data involved (in other words, whatever application is processing – it will be first ‘seen’ by malware before it is passed to the legitimate payment processing application; and this is when data gets sniffed/stolen/dumped).

In the first method of RAM scraping mentioned above, the malware acts as an active ‘observer’ of other processes’ memory, constantly analyzing it and looking for card data. It uses the ReadProcessMemory API to access the memory of a targeted process.

The second one is more complex as it interacts directly with a targeted application – it can be a patched / modified binary or a code patch applied to the running application. Writing such a patch requires either good familiarity (on a programmatic level) with the payment application, or the attacker needs to spend some time reverse engineering the application internals to know where to hook into its card processing functions. In a way, it is like plugin code attached to the legitimate software. A very good example of complex malware using this technique was the infamous ATM malware first described by Threatexpert back in 2009.

The malware targeting POS systems comes in all flavors. It is written in Perl, Python, .NET, Delphi, C, and sometimes these are just legitimate applications modified to serve a malicious purpose e.g. winpcap, ngrep, etc.

There is currently no good protection for this kind of attack on a software level (although system hardening, blocking access to the process’ memory, immediately cleaning buffers used for credit cards, and even /randomly/ introducing dummy yet incorrect track data inside the application buffers could possibly help; if you are a merchant, ask POS vendors about it; if you are a POS vendor, feel free to ask me more about it).

Other types of POS malware & hacking techniques

For the sake of completeness it is worth mentioning that some malware variants include code to cover other areas of the system as well and apart from memory scraping they can sniff unencrypted track data from network (again ‘data in transit’), or use traditional keyloggers to intercept track data directly as it enters the system used for swiping the cards e.g. in hotels or restaurants (card readers present themselves to the system as a keyboard, hence track data can be intercepted via keystroke interception).

One can find PAN/Track harvesters working as sniffers putting the network card interface in promiscuous mode, or as specific modules injected into specific processes (a more targeted approach), keyloggers, screen grabbers, and so on and so forth. Some techniques are even simpler – enabling legitimate flags/settings responsible for debugging or logging, or sometimes even simply increasing log verbosity, changes the behavior of the POS application so that it starts storing PANs/Track data (and the hacker just needs to re-visit the system a bit later to harvest the data). In some cases attackers also downgrade the applications to restore older, vulnerable versions of POS software on the compromised system. Such modifications are usually very subtle and, since they don’t even require malware to be active on the system, very hard to detect.

On the server side, the attacker may change the script responsible for card processing to transfer data to the attacker’s destination immediately after site users enter it – sometimes such data is stored in a local file as well. Other attacks rely on SQL injection, and card data is dumped directly from the database to the attacker’s client/tool. Older malware would also use SMTP or FTP to transfer data out in real time, but it’s really old school and doesn’t work in more and more environments. While the ‘smash and grab’ approach still works, the mission to ‘stay quiet and steal as long as possible’ is a trend that has been growing over the last few years. Using a cliche metaphor, hackers now build oasis-like wells that act as card reservoirs to which they come back to fetch newly harvested data once in a while.

Example malware attacking POS

I will describe here a few specific examples of malware targeting POS systems. There are not too many publicly available samples, but since they have now been out there in the wild for quite some time (thanks to Xylitol for sharing the samples via his blog), let’s get to business and describe what we got there…

lanst.exe

MD5         D770ADBEE04D14D6AA2F188247AF16D0
SHA1        2474EC06E46605D60AC2B04B20998EB052AF275F

It’s a perl2exe compiled executable.

Perl2exe executables contain an encrypted perl code that is decrypted during run-time and interpreted by the embedded perl processor/interpreter; because of this, we can extract the perl code during run-time.

Lanst.exe’s perl code looks like this (we can save it as lanst.pl and even run):

[screenshot: POS_Malware_1]

It is obvious from looking at the screenshot that there seem to be some funny unrecognized characters in the source code.

It’s a good occasion to use hstrings:

hstrings  -ps0 lanst.pl > lanst.pl.probe

It will probe all the encodings it knows and save the output data into lanst.pl.probe file.

Browsing through the lanst.pl.probe file using Total Commander’s Lister we can see:

[screenshot: POS_Malware_2]

Okay, so encoding is cp866,OEM Russian; Cyrillic (DOS).

We can now go back to lanst.pl and use Lister’s Encodings menu to change the default encoding to 866.

[screenshot: POS_Malware_3]

Et voilà!

We get nice Perl code with Russian comments:

[screenshot: POS_Malware_4]

The code itself is not that interesting – it is a boring card scanner that tries to check if the attacked system stores any track data; it is multithreaded and can scan the local system, its shares and computers in a domain. It also allows for file and file extension exclusion/inclusion to speed up system analysis. Admittedly, it is a nicely written triage script. And yes, I lied – it is actually quite interesting after all – very efficient code that does exactly what it is supposed to do in only a few dozen lines of Perl.

Notably, the source code includes a version number 1.4a

$version="Version 1.4a MultiThread from 22.04.2008";

and a code that prevents it from running if it is executed after a certain date.

$dietime = 1207392905+(86400*30*2);
if ( time  > $dietime ) { die("Can't open Handle/Tie.PM!"); };

This variant ‘dies’ if the date is 60 days past Sat, 05 Apr 2008 10:55:05 GMT – as you can see from the code above, it produces a misleading error message if executed at a wrong time.
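The arithmetic behind the expiration date is easy to check. A minimal sketch, using the constant from the sample and a hypothetical fmt_gmt helper to print the timestamps in GMT:

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format an epoch value as a GMT timestamp.
sub fmt_gmt {
    my ($epoch) = @_;
    return strftime("%Y-%m-%d %H:%M:%S", gmtime($epoch));
}

my $reference = 1207392905;                   # constant taken from the sample
my $dietime   = $reference + (86400*30*2);    # + 60 days, as in the sample

print "reference: ", fmt_gmt($reference), " GMT\n";   # 2008-04-05 10:55:05
print "expires:   ", fmt_gmt($dietime),   " GMT\n";   # 2008-06-04 10:55:05
```

So the 60-day window (86400 seconds × 30 × 2 = 5184000 seconds) closes on 4 June 2008, after which the script only prints the bogus error.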

Let’s take two important notes here:

  • It is a very old sample! And since its version is 1.4, the earlier versions must exist.
  • If you read my older post, you may recall that built-in ‘expiration date’ is one of the reasons why dynamic analysis is often not enough

A simple test on a dirty box (with a dummy Track data inside the track_samples.txt file) produces the following output:

[screenshot: POS_Malware_6]

Quite a nicely behaving hacking tool, isn’t it? The guys who run it must feel really happy when they see it hitting the jackpot. Not so funny though if the track data comes from your own card that you used at the compromised restaurant a few months ago.

Another aspect worth mentioning: the code creates various output files, and ccfind.log is the most important amongst them as it contains the track data found on the scanned system together with the file names. If you come across this file in your case, congratulations – you have found a smoking gun…

[screenshot: POS_Malware_7]

The lanst.exe is both a triage/reconnaissance tool and a harvesting machine that is looking for easy targets on compromised systems i.e. files storing unencrypted track data that are ready for immediate extraction.

It is not a RAM scraper per se, but I describe it here because it can be often found inside the ‘toolchests’. Traces of such tools being used are also a good indicator of a compromise.

dnsmgr.exe

MD5         3004CE6CB7C44605CDF971B74DB3A079
SHA1        F023B5F5CD8B85B266D0A0AD416136FDA27577EF

Another perl2exe compiled script. The decompiled code presents itself as yet another card parser that searches for Track 1 and Track 2 patterns in a specific set of files. This is a scraper using a technique similar to the one used in lanst.exe (regular expressions matching two types of track data) – yet again it is actually not a RAM scraper, but a file scraper.

If it sounds a bit confusing, it is because the files it parses are actually memory dumps obtained using a dedicated memory dumping tool. That is, the actual memory dumping part is implemented in a separate program.

One note here: memory dumping programs are typically part of hacker’s toolchest and since the functionality is trivial and easy to implement they are not described in this post; notably, memory dumping/parsing techniques are not carding/hacking-specific – many reverse engineers, penetration testers, and other security pros often use such tools during malware analysis, debugging sessions, pentesting or auditing gigs. Gaming cheat engines also use the same functionality.

Going back to dnsmgr.exe – as mentioned, there are two components involved here:

  • one is a memory dumper that enumerates memory blocks from the process(es) that is/are of carders’ interest e.g. an application processing card data
  • the second one is a parser (dnsmgr.exe) – it analyzes the dumped data looking for track 1 and 2 patterns – fragments of the parser are shown below.

[screenshot: POS_Malware_8]

It is a first generation of RAM scraping malware and, as you can see, it is not very advanced on a programmatic level, but it worked well for quite some time (at least 3 years AFAIK; some may still be present on some POS systems even today!)

The second generation of RAM scrapers combined memory dumping and parsing functionality into a small executable, as shown in the next example.

rdasrv.exe

MD5         D9A3FB2BFAC89FEA2772C7A73A8422F2
SHA1        06A0F4ED13F31A4D291040AE09D0D136D6BB46C3

This is a second generation of RAM scrapers; it has already been described by various AV companies, so here just for completeness: it is code written in Delphi that runs as a service; it enumerates memory blocks of processes and reads them one by one, on the way utilizing regex patterns that match Tracks 1 and 2 – whatever matches these patterns is intercepted and preserved in a locally created file.

[screenshot: POS_Malware_5]

As mentioned earlier, it is a service, so it has to be installed, then started:

[screenshot: POS_Malware_9]

While running, it creates a c:\windows\system32\data.txt file that contains intercepted information – Track data:

[screenshot: POS_Malware_A]

Last, but not least, it can also be uninstalled:

[screenshot: POS_Malware_B]

compenum.exe

MD5         BCC61BDF1A2F4CE0F17407A72BA65413
SHA1        B026397615ED9B63396EB5A4DF102DB706992E0E

MD5         C5C3341FBDD38C041E550D5DFF187A8F
SHA1        6686CE1C9B9809034333EEBD546523AE91491DB6

Two samples that are simple LAN recon/enumeration tools – they utilize WNet* functions to enumerate resources. Both accept 3 command line arguments, and BCC61BDF1A2F4CE0F17407A72BA65413 accepts an extra one (-createbat). The meaning of the command line arguments is as follows:

  • -nocomment: extra info on WNET output (NETRESOURCE.lpComment)
  • -domains: disable information about domains
  • -fullinfo: output full information
  • -createbat: output everything to a batch file (‘play.bat’ or ‘p’)
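The hashes listed for the samples in this post can be used for a quick triage of suspicious files. Below is a minimal sketch using the core Digest::MD5 and Digest::SHA modules; the file_hashes helper and the %known_md5 table are illustrative names, and since such samples are often recompiled per compromise, a miss proves nothing.

```perl
use strict;
use warnings;
use Digest::MD5;
use Digest::SHA;

# MD5 values quoted in this post, mapped to the file names used here.
my %known_md5 = (
    'D770ADBEE04D14D6AA2F188247AF16D0' => 'lanst.exe',
    '3004CE6CB7C44605CDF971B74DB3A079' => 'dnsmgr.exe',
    'D9A3FB2BFAC89FEA2772C7A73A8422F2' => 'rdasrv.exe',
    'BCC61BDF1A2F4CE0F17407A72BA65413' => 'compenum.exe',
    'C5C3341FBDD38C041E550D5DFF187A8F' => 'compenum.exe',
);

# Return (MD5, SHA1) of a file as uppercase hex, or an empty list on error.
sub file_hashes {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;
    binmode($fh);
    my $md5 = uc Digest::MD5->new->addfile($fh)->hexdigest;
    seek($fh, 0, 0);                       # rewind for the second digest
    my $sha1 = uc Digest::SHA->new(1)->addfile($fh)->hexdigest;
    close($fh);
    return ($md5, $sha1);
}

# Usage: perl hashcheck.pl file1 file2 ...
for my $path (@ARGV) {
    my ($md5, $sha1) = file_hashes($path) or next;
    my $hit = $known_md5{$md5};
    print "$path  MD5=$md5  SHA1=$sha1", ($hit ? "  <-- matches $hit" : ""), "\n";
}
```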

Conclusion

These are old samples and there is nothing new here; RAM scraping malware is not very complex when compared to far more advanced families like Zeus or ZeroAccess, but this is enough to harvest credit card numbers and later extract them from compromised systems. There are a lot more variants written by other carding groups, yet the samples are not available publicly; most of them work the same way though – user mode components targeting the memory of specific processes or all processes using the ReadProcessMemory API or direct hooks in the payment applications’ code/libraries; kernel drivers are rare. Dexter’s arrival suggests that POS systems are gaining some attention and may be targeted even more than in previous years.

If you admin a POS system, don’t be frightened, but consider taking a step forward towards getting your systems PCI DSS compliant. While it’s not perfect, it will definitely improve the security posture of your organization.

hstrings – when all strings are attached…

November 5, 2012 in Forensic Analysis, hstrings, Malware Analysis, Software Releases

TL;DR;

a new strings tool that attempts to extract localized strings e.g. French, Chinese from an input file; see example below

Intro

Traditional strings utilities are usually limited to ANSI/Unicode-LE/Unicode-BE strings. This is understandable as these are the most prevalent types of strings that we come across in our daily work. However, many files exist that contain more strings – these we usually miss as they contain accented letters, and these break the typical string extraction algorithms. On top of that, there are a lot of various character encodings out there that make it non-trivial to pick up the right bytes in a regular expression or a state machine. One can have accented letters saved as Unicode-LE, Unicode-BE, UTF8, or using one of many legacy encodings e.g. Windows Code Pages or IBM EBCDIC encodings.

For quite some time I had in mind an idea to write a smarter strings extraction program that would take this localization/encoding mess into account, so even before I released RUStrings I had already been thinking of writing something more generic. In other words, I wanted to write a tool that can extract strings from a file in any well-known encoding and language possible.
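To illustrate how accented letters break a typical extractor, here is a minimal sketch of the classic ‘runs of printable ASCII’ approach (this is not how hstrings works; ascii_strings is a hypothetical helper). The input is a fragment of the Luxembourgish sample used later in this post, saved as Windows-1252:

```perl
use strict;
use warnings;

# Classic approach: keep runs of at least $min printable ASCII bytes.
sub ascii_strings {
    my ($data, $min) = @_;
    $min ||= 4;
    my @found;
    while ($data =~ /([\x20-\x7E]{$min,})/g) {
        push @found, $1;
    }
    return @found;
}

# "ménger Stieren" in Windows-1252: the accented "é" is the single byte 0xE9
my $bytes = "m\xE9nger Stieren";
print "$_\n" for ascii_strings($bytes);   # prints "nger Stieren"
```

The byte 0xE9 falls outside the printable ASCII range, so the word is cut in two and the leading "m" is dropped for being too short.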

As usual – I didn’t know what trouble I am getting myself into when I began :) .

As mentioned earlier, there are many encodings used by various platforms and the same string of bytes can be… random garbage… or it can represent a string of characters encoded in one of at least 150 possible encodings including not only legacy encodings, but also Unicode. And not Unicode seen as a subset of characters belonging to the ASCII set interleaved by zeros (‘simplified Unicode’ that string extraction tools rely on), but Unicode that includes blocks dedicated to specific languages and letters e.g. Chinese, Cyrillic, Hangul, etc.

The tool I present below attempts to:

  • read an input file,
  • walk through the file content
  • apply heuristics and find characters encoded as:
    • bytes (ANSI and other legacy character sets)
    • words (Unicode LE, Unicode BE, and DBCS)
    • byte sequences (utf-8, utf-7, MBCS – multibyte encodings e.g. iso-2022-jp (Japanese), GB18030 Simplified Chinese etc.)
  • it then normalizes these code points to Unicode LE
  • and appends the strings to an output file for a specific encoding

At this stage the program is in alpha as I am still not sure how to present the output properly. Currently the program generates a lot of output files. Way too many. But it is not trivial to make it simpler.

From a data processing perspective it is actually quite a complex problem – since bytes can be interpreted in many ways, the program needs to show all of the possible strings extracted from a file. The same string of bytes can be easily interpreted as some legacy ANSI code page (actually, simultaneously almost all of them), or as a Chinese multibyte encoding – it then needs to normalize the output to Unicode, so we have multiple Unicode streams coming out of multiple decoders at the same location in the file. My detection algorithm relies on state machine-like heuristics and it outputs data as it goes through the file. Since the various encoding heuristics are applied at once (one pass through a file), outputting data to a single file may cause race conditions and streams from various decoders can start interleaving – leading to a mess. So, currently the output goes to different files. I have a few ideas on how to solve this, but each has a trade-off associated with it, so stay tuned :)
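The ‘same bytes, many interpretations’ problem is easy to demonstrate with Perl’s core Encode module. The sketch below is not the hstrings algorithm, only an illustration; decode_all is a hypothetical helper, and the six bytes are the word “Привет” saved as Windows-1251:

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Decode the same byte string with several legacy code pages.
sub decode_all {
    my ($bytes, @encodings) = @_;
    my %out;
    for my $enc (@encodings) {
        my $copy = $bytes;                 # keep the original bytes untouched
        $out{$enc} = decode($enc, $copy);
    }
    return %out;
}

my $bytes = "\xCF\xF0\xE8\xE2\xE5\xF2";    # "Привет" in Windows-1251
my %views = decode_all($bytes, qw(cp1251 koi8-r cp866));

for my $enc (sort keys %views) {
    # re-encode as UTF-8 for display
    printf "%-7s %s\n", $enc, encode('UTF-8', $views{$enc});
}
```

All three decoders succeed and each returns six valid characters, but only one of them is the text that was actually written. This is why every candidate stream has to be kept until something (or someone) can judge which one makes sense.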

Okay, enough babbling and boring theory – let’s look at some example.

EXAMPLE

First, we need to create a few text sample files that contain some random text in various languages encoded in many different encodings.

I generated a few nonsensical lorem-ipsum texts with Lorem Ipsum Generator.

Russian

Нам аутым убяквюэ нолюёжжэ ад. Нам граэкы компльыктётюр нэ. Квуй видырэр ёнэрмйщ ку, прё ат фиэрэнт элььэефэнд эррорибуз. Ан нам фэюгаят юлламкорпэр интылльэгэбат. Пэр декам квюаэчтио эа, эним витаэ июварыт вэл экз, эа емпэтюсъ элыктрам шэа. Ед съюммо ыльигэнди мэль, ыам эи кхоро кэтэро зальютатуж, одео нюмквуам мэнтётюм эа квуй.

Chinese

主谷三間機望飼営電時始能快本面一界。約握企曜回金忙出行場説必確天下員週。連芸止嘩健集人説火忘冠率庭泉。田位国以供地紹臣同旅百出済理強波。球告続況時心断主別重並行県邦不康。記悪暮投氏性善治地長中消。小作解共供小田民覧花伝聞団点。止都要空性難改大境新真権軽降真細登皇。読道決集房休講員軟渡慎無告書。社風理載当宿竹金来簡月教。

Greek

Ιδ φιμ ιλλυδ αλικυαμ συσιπιθ, ετ ηαβεο σανστυς κυι, θεμπορ λυπταθυμ σομπρεχενσαμ μει αν. Υθροκυε νολυισε νες ετ, αδχυς οφφισιις ινφιδυντ αδ σεα. Συ νες λιβρις θιμεαμ. Φιξ μαζιμ λυπταθυμ δελισαθισιμι υθ. Περ υθ πωσε μυνερε.

Luxembourgish

As Fläiß ménger Stieren dat. An och sinn Stret gewalteg, wär am gutt d’Land hinnen, wäit eraus ménger si dee. Feld löschteg mä gei. Fu sou deser Riesen, Blummen löschteg hun jo.

I then saved these files with different encodings:

  • Russian: 1251, koi8-R, Unicode-BE, Unicode-LE, UTF8
  • Chinese: utf8, GB2312, GB18030
  • Greek: Unicode-BE, 1253
  • Luxembourgish: 1252, Unicode-LE

Once done, I combined all of the files into one large file – now the sample file contains multiple texts in multiple different languages saved in multiple different character encodings.

Running hstrings over the file produces multiple output files.
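A similar test file can be produced programmatically. The following is a minimal sketch using the core Encode module, limited to the Russian sample and the encodings listed for it; the build_sample helper and the newline separator between the copies are my assumptions for illustration (the original files were saved from an editor and simply combined):

```perl
use strict;
use warnings;
use utf8;                       # this source contains a UTF-8 string literal
use Encode qw(encode);

# Encode the same text in each listed encoding and join the results.
sub build_sample {
    my ($text, @encodings) = @_;
    return join "\n", map { encode($_, $text) } @encodings;
}

my $text   = "Нам аутым убяквюэ нолюёжжэ ад.";
my $sample = build_sample($text, qw(cp1251 koi8-r UTF-16BE UTF-16LE UTF-8));

# Usage: perl mksample.pl > sample.bin
binmode(STDOUT);
print $sample;
```

GB18030 is left out here because, as far as I know, it is not covered by the Encode modules shipped with Perl and needs an extra module.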

Yes, it’s quite a lot and reviewing them all is atm an overkill; I have already mentioned that I am still thinking how to improve the presentation layer :-)

The rule of thumb is to start with Windows ANSI code pages, UTF8, Unicode-LE (ULE*) and Unicode-BE (UBE*) and, of course, cheat – we can go ahead and look at the files associated with the encodings we used in the example above i.e. Russian, Greek, etc. After all, it’s just an example :)

Previewing the result files gives us the following:

  • h_GB18030,GB18030 Simplified Chinese (4 byte); Chinese Simplified (GB18030)

  • h_windows-1253,ANSI Greek; Greek (Windows)

  • h_windows-1251,ANSI Cyrillic; Cyrillic (Windows)

  • h_windows-1252,ANSI Latin 1; Western European (Windows)

So, it would seem that it works…

I will be releasing the first version of hstrings soon.

Thanks for reading!

Prefetch file names and UNC paths

October 29, 2012 in Anti-Forensics, Forensic Analysis

In one of the older posts, I talked about how the Prefetch file names are created. Today I was looking at program execution from network shares i.e. originating from the UNC paths and realized that I have not included these in the original article.

VM Shares

To test what happens, I launched WinXP under windbg and put a breakpoint on the hashing function and then executed a test file from a shared VM folder – the screenshot shows the mapping between the drive and the UNC path where the executable is placed:

Once executed, windbg popped up and I could trace the full path to the file in a Memory window.

As it seems, nothing really surprising:

  • z:\test.exe is executed
  • it is mapped to its UNC path \\vmware-host\Shared Folders\X\test.exe
  • which is then prepended with a device name responsible for HGFS file system (used internally by VM) to form a final string used in a hash calculation
  • \DEVICE\HGFS\VMWARE-HOST\SHARED FOLDERS\X\TEST.EXE

Real share

Now, that was the case with a ‘fake’ share created by the VM software.

What about a real share?

Following the same procedure:

  • I mapped a host \\H\C$ drive as N: inside the guest system with ‘net use’
  • and then executed N:\test.exe

The result shown below is not very surprising either as now the path refers to LANMANREDIRECTOR:

  • \DEVICE\LANMANREDIRECTOR\H\C$\TEST.EXE

Substed paths

And in case you are curious what happens to drives created with subst…

For drives mapped locally using ‘subst drive: path’ e.g.

subst g: .

there is no difference as the device will refer to HARDDISKVOLUME### (where ### is hard drive’s number) – I don’t include screenshot here as I hope this example doesn’t need one.

However, using subst in a slightly different way i.e. referring to the target path via localhost’s IP, e.g.

subst g: \\127.0.0.1\c$

will cause the Prefetch file name to be created using the following path:

  • \DEVICE\LANMANREDIRECTOR\127.0.0.1\C$\TEST.EXE

As you can see, each of the test files created a different hash.

In other words, there are plenty of ways to abuse the file naming of the prefetch file and it’s quite hard to write a universal hash calculator to cover all these cases – it really depends on the environment and there are lots of tricks to confuse the system + I bet there are a few more that wait to be uncovered.
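For the cases observed above, the string that goes into the hash calculation can be rebuilt from the UNC path. Below is a minimal sketch; unc_to_device_path is a hypothetical helper, and it only covers the two device names seen in these tests (HGFS and LANMANREDIRECTOR), which the caller has to choose:

```perl
use strict;
use warnings;

# Rebuild the uppercase device path observed in the tests above.
# $device is 'HGFS' (VM shared folders) or 'LANMANREDIRECTOR' (real shares).
sub unc_to_device_path {
    my ($unc, $device) = @_;
    (my $rest = $unc) =~ s/^\\\\//;          # drop the leading "\\"
    return uc "\\DEVICE\\$device\\$rest";
}

print unc_to_device_path('\\\\vmware-host\\Shared Folders\\X\\test.exe', 'HGFS'), "\n";
print unc_to_device_path('\\\\H\\C$\\test.exe', 'LANMANREDIRECTOR'), "\n";
print unc_to_device_path('\\\\127.0.0.1\\c$\\test.exe', 'LANMANREDIRECTOR'), "\n";
```

The three calls reproduce the three paths listed in this post; which device applies to a given mapping is exactly the environment-dependent part that makes a universal calculator hard.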