Introducing filighting and the future of DFIR tools

Filighting (FIle highLIGHTING) is a proof-of-concept idea that I implemented in Perl: a naive clustering and data reduction algorithm modeled on the way software is built on the Windows platform.

TL;DR: The algo is as follows:

  • enumerate all the files in a directory
  • read all the files one by one and try to see if any of them contain actual references to other files
  • cross-reference these
  • profit

Yup. It’s that simple.
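To make the steps concrete, here is a minimal Python sketch of the same idea. It is not the original Perl script: the function names (read_files, cross_reference) and the matching rule (a case-insensitive search for the bare file name in the raw bytes) are assumptions made for illustration.

```python
import os
from collections import defaultdict


def read_files(directory):
    """Steps 1-2: enumerate files under `directory` and read each one as raw bytes."""
    contents = {}
    for root, _dirs, names in os.walk(directory):
        for name in names:
            path = os.path.join(root, name)
            with open(path, "rb") as handle:
                contents[os.path.relpath(path, directory)] = handle.read()
    return contents


def cross_reference(contents):
    """Step 3: map each file to the set of other files whose bytes mention its name.

    Assumed simplification: only the bare file name is searched for, and the
    comparison ignores case for ASCII letters only.
    """
    lowered = {path: data.lower() for path, data in contents.items()}
    referenced_by = defaultdict(set)
    for target in contents:
        needle = os.path.basename(target).lower().encode("utf-8")
        for source, data in lowered.items():
            if source != target and needle in data:
                referenced_by[target].add(source)
    return dict(referenced_by)
```

Calling cross_reference(read_files(some_directory)) would give the raw material for the 'profit' step: every file that appears as a key is mentioned by at least one other file in the same directory tree.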

How is Windows software built?

Windows software can be built in many ways, using various programming languages, platforms and frameworks.

For the purpose of this post we will focus on the most typical software packages that contain a couple of components:

  • Main program file – the actual program – portable executable (.exe)
  • Additional executable files – typically libraries, but sometimes other .exe and kernel mode drivers (.exe, .dll, .sys, .ocx, etc.)
  • Localization/Language files (e.g. .lng, .mui, etc.)
  • Configuration files (.cfg, etc.)
  • Templates (.template, .theme, etc.)
  • Databases (.db, .sql, etc.)
  • Readme files, Help files (.txt, .hlp, etc.)
  • GFX files (.jpg, .png, etc.)
  • Plug-ins (.dll, etc.)
  • and whatever else that is required for the program to offer some functionality
  • + Registry entries (which I skip in this post)

Notably, there are programs that are basically a single executable. Many OS programs used to be just a simple .exe, e.g. Notepad.exe or Calc.exe. In newer versions of Windows they rely on additional localization files (.mui), or are just links to other programs, e.g. Calc.exe on Windows 10 linking to a Metro application. While programs that are just single executables are not the focus of this post, they certainly could be highlighted as possible ‘orphans’ by the very same algorithm, or its spin-offs.
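As a rough illustration of the 'orphans' idea, the sketch below flags files that neither mention nor are mentioned by any other file in the set. The function name find_orphans and the plain substring matching are assumptions of this sketch, not part of the original script.

```python
def find_orphans(contents):
    """Return names of files that neither mention nor are mentioned by any other file.

    `contents` maps a file name to its raw bytes. Matching is a plain
    substring search that ignores ASCII case (an assumption of this sketch).
    """
    lowered = {name: data.lower() for name, data in contents.items()}
    linked = set()
    for target in contents:
        needle = target.lower().encode("utf-8")
        for source, data in lowered.items():
            if source != target and needle in data:
                # Both ends of a link count as "connected".
                linked.update((source, target))
    return sorted(set(contents) - linked)
```

An orphan is not automatically suspicious (a standalone Notepad.exe would be one), but it is a file that the clustering cannot explain and that therefore stays in the analyst's view.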

Okay, what can we do with this knowledge?

Knowing that software contains many files gives us a hint that there must be some links between them all that are somehow established during the compilation, installation, or program use phases.

  • The build process may compile hardcoded file names into the final main program file and/or its libraries, configuration files, etc.
  • The installation program drops the files in respective folders and creates configuration files, registry entries, etc.
  • Program use is whatever the user or the application does at run time, and it affects which files are created, added, modified, etc.
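Hardcoded file names compiled into Windows binaries are commonly stored either as ASCII/ANSI text or as UTF-16LE ('wide') strings, so a reference search probably wants to try both. The helper names below (name_variants, mentions) are hypothetical, and the sketch is only reliable for ASCII file names.

```python
def name_variants(file_name):
    """Byte patterns under which a hardcoded file name may appear in a binary.

    Both a single-byte and a UTF-16LE ("wide") encoding are produced,
    lowercased so they can be compared against lowercased file contents.
    """
    lowered = file_name.lower()
    return [lowered.encode("utf-8"), lowered.encode("utf-16-le")]


def mentions(data, file_name):
    """True if `data` contains `file_name` in any variant, ignoring ASCII case."""
    # bytes.lower() only touches ASCII letters, which is enough for ASCII names.
    haystack = data.lower()
    return any(variant in haystack for variant in name_variants(file_name))
```

For example, mentions(b"\x00\x01TOTALCMD.EXE\x00", "totalcmd.exe") is true, and so is a search over the same name stored as wide characters.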

While it is hard to keep track of it all, it certainly makes sense to try to imagine these interconnections and attempt to create a hidden graph that connects all these components together.

It is also tempting to imagine that recognizing these connections would allow us to cluster files into buckets that could then be hidden from the ‘view’ during analysis!
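One simple way to turn such a graph into buckets is to take its connected components: every group of files tied together by at least one chain of references becomes a cluster. The sketch below assumes the links have already been extracted as (source, target) pairs; cluster_files is a hypothetical name, and treating links as undirected is a design choice of this example.

```python
from collections import defaultdict


def cluster_files(names, links):
    """Group files into buckets: the connected components of the reference graph.

    `names` is an iterable of file names; `links` is an iterable of
    (source, target) pairs meaning "source mentions target". Direction is
    ignored, since a link in either direction ties two files together.
    """
    neighbours = defaultdict(set)
    for source, target in links:
        neighbours[source].add(target)
        neighbours[target].add(source)

    seen = set()
    clusters = []
    for name in sorted(names):
        if name in seen:
            continue
        # Walk everything reachable from `name` with an explicit stack.
        stack, component = [name], set()
        while stack:
            current = stack.pop()
            if current in component:
                continue
            component.add(current)
            stack.extend(neighbours[current] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters
```

Files with no links at all come out as single-element clusters, which makes them easy to separate from the buckets that can be hidden.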

This is not an easy thing to do for the whole file system, but it works pretty well for selected scenarios, and for single directories in particular. And there are really a lot of ways to improve this, especially if file formats are taken into account and links are tracked not only between files, but also between files and the Registry.

As usual: subject to further research!

Weaknesses

It’s very easy to abuse. You just need to drop files that reference each other and, to make it even trickier, also reference ‘good’ files on the system.

Installations that span more than one folder are also problematic (the ‘Common Files’ subfolder is a good example of a ‘multi-folder’ installation).

Protected files – usually compressed or virtualized main program executables – won’t reveal references to other files.
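One possible mitigation, not part of the original idea, is to at least flag files whose contents look compressed or encrypted, so that a missing reference is not mistaken for a missing relationship. A common heuristic for this is byte entropy. The sketch below is only that, a heuristic: the 7.2 bits-per-byte threshold is an illustrative assumption, and legitimately compressed resources will trigger it too.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of `data` in bits per byte (0.0 for empty input, at most 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(data).values()
    )


def looks_packed(data, threshold=7.2):
    """Heuristic flag; the 7.2 threshold is an illustrative assumption, not a standard."""
    return shannon_entropy(data) >= threshold
```

A file flagged this way could be reported as 'opaque' instead of being treated as a file with no outgoing links.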

There are probably more…

Still… I do believe this is the future of DFIR tools, even if the possible implementations may vary a lot from the idea I am discussing here.

‘Known hashes’ is good.

‘Known hashes+files’ is good+.

Time for a simple example

Okay, just writing about stuff is not enough.

Let’s see how it works in practice.

In this test I install Total Commander – the latest 32-bit version from http://www.ghisler.com/download.htm

Once installed, the installation folder contains the following list of files:

  • CABRK.DLL
  • CGLPT64.SYS
  • CGLPT9X.VXD
  • CGLPTNT.SYS
  • DEFAULT.BAR
  • DESCRIPT.ION
  • FRERES32.DLL
  • HISTORY.TXT
  • KEYBOARD.TXT
  • NO.BAR
  • NOCLOSE.EXE
  • REGISTER.RTF
  • SFXHEAD.SFX
  • SHARE_NT.EXE
  • SIZE!.TXT
  • TC7Z.DLL
  • TC7ZIPIF.DLL
  • TCMADMIN.EXE
  • TCMDLZMA.DLL
  • TCMDX64.EXE
  • TCUNINST.EXE
  • TCUNINST.WUL
  • TCUNZLIB.DLL
  • TcUsbRun.exe
  • TOTALCMD.CHM
  • TOTALCMD.EXE
  • TOTALCMD.EXE.MANIFEST
  • TOTALCMD.INC
  • UNACEV2.DLL
  • UNRAR.DLL
  • UNRAR9X.DLL
  • WC32TO16.EXE
  • WCMICONS.DLL
  • WCMICONS.INC
  • WCMZIP32.DLL
  • WCUNINST.WUL
  • wcx_ftp.ini
  • wincmd.ini
  • LANGUAGE\WCMD_CHN.INC
  • LANGUAGE\WCMD_CHN.LNG
  • LANGUAGE\WCMD_CHN.MNU
  • LANGUAGE\WCMD_CZ.INC
  • LANGUAGE\WCMD_CZ.LNG
  • LANGUAGE\WCMD_CZ.MNU
  • LANGUAGE\WCMD_DAN.INC
  • LANGUAGE\WCMD_DAN.LNG
  • LANGUAGE\WCMD_DAN.MNU
  • LANGUAGE\WCMD_DEU.INC
  • LANGUAGE\WCMD_DEU.LNG
  • LANGUAGE\WCMD_DEU.MNU
  • LANGUAGE\WCMD_DUT.INC
  • LANGUAGE\WCMD_DUT.LNG
  • LANGUAGE\WCMD_DUT.MNU
  • LANGUAGE\WCMD_ENG.MNU
  • LANGUAGE\WCMD_ESP.INC
  • LANGUAGE\WCMD_ESP.LNG
  • LANGUAGE\WCMD_ESP.MNU
  • LANGUAGE\WCMD_FRA.INC
  • LANGUAGE\WCMD_FRA.LNG
  • LANGUAGE\WCMD_FRA.MNU
  • LANGUAGE\WCMD_HUN.INC
  • LANGUAGE\WCMD_HUN.LNG
  • LANGUAGE\WCMD_HUN.MNU
  • LANGUAGE\WCMD_ITA.INC
  • LANGUAGE\WCMD_ITA.LNG
  • LANGUAGE\WCMD_ITA.MNU
  • LANGUAGE\WCMD_KOR.INC
  • LANGUAGE\WCMD_KOR.LNG
  • LANGUAGE\WCMD_KOR.MNU
  • LANGUAGE\WCMD_NOR.LNG
  • LANGUAGE\WCMD_NOR.MNU
  • LANGUAGE\WCMD_POL.LNG
  • LANGUAGE\WCMD_POL.MNU
  • LANGUAGE\WCMD_ROM.INC
  • LANGUAGE\WCMD_ROM.LNG
  • LANGUAGE\WCMD_ROM.MNU
  • LANGUAGE\WCMD_RUS.INC
  • LANGUAGE\WCMD_RUS.LNG
  • LANGUAGE\WCMD_RUS.MNU
  • LANGUAGE\WCMD_SK.LNG
  • LANGUAGE\WCMD_SK.MNU
  • LANGUAGE\WCMD_SVN.INC
  • LANGUAGE\WCMD_SVN.LNG
  • LANGUAGE\WCMD_SVN.MNU
  • LANGUAGE\WCMD_SWE.LNG
  • LANGUAGE\WCMD_SWE.MNU

This is quite a lot of files. If you come across them during an examination, you won’t be able to tell at a glance which ones are legit and which are not. You need to browse through them all, and that takes a lot of human cycles.

Using a simple script which implements the aforementioned algo, I was able to generate the following list of links established between all these files. Files are sorted by ‘which file is the most popular’ or, in other words, ‘which file is referenced by others the most frequently’. Each top-level entry shows a file and the number of files that reference it, followed by the referencing files:

  • wcmzip32.dll 21
    • DESCRIPT.ION
    • HISTORY.TXT
    • WCMD_CHN.LNG
    • WCMD_CZ.LNG
    • WCMD_DAN.LNG
    • WCMD_DEU.LNG
    • WCMD_DUT.LNG
    • WCMD_ESP.LNG
    • WCMD_FRA.LNG
    • WCMD_HUN.LNG
    • WCMD_ITA.LNG
    • WCMD_KOR.LNG
    • WCMD_NOR.LNG
    • WCMD_POL.LNG
    • WCMD_ROM.LNG
    • WCMD_RUS.LNG
    • WCMD_SK.LNG
    • WCMD_SVN.LNG
    • WCMD_SWE.LNG
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • tcuninst.exe 20
    • DESCRIPT.ION
    • HISTORY.TXT
    • WCMD_CHN.LNG
    • WCMD_CZ.LNG
    • WCMD_DAN.LNG
    • WCMD_DEU.LNG
    • WCMD_DUT.LNG
    • WCMD_ESP.LNG
    • WCMD_FRA.LNG
    • WCMD_HUN.LNG
    • WCMD_ITA.LNG
    • WCMD_KOR.LNG
    • WCMD_NOR.LNG
    • WCMD_POL.LNG
    • WCMD_ROM.LNG
    • WCMD_RUS.LNG
    • WCMD_SK.LNG
    • WCMD_SVN.LNG
    • WCMD_SWE.LNG
    • TOTALCMD.EXE
  • descript.ion 18
    • HISTORY.TXT
    • WCMD_CHN.LNG
    • WCMD_DEU.LNG
    • WCMD_DUT.LNG
    • WCMD_ESP.LNG
    • WCMD_FRA.LNG
    • WCMD_HUN.LNG
    • WCMD_ITA.LNG
    • WCMD_KOR.LNG
    • WCMD_NOR.LNG
    • WCMD_POL.LNG
    • WCMD_ROM.LNG
    • WCMD_RUS.LNG
    • WCMD_SK.LNG
    • WCMD_SVN.LNG
    • WCMD_SWE.LNG
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • totalcmd.inc 14
    • DESCRIPT.ION
    • HISTORY.TXT
    • WCMD_CHN.INC
    • WCMD_CZ.INC
    • WCMD_DAN.INC
    • WCMD_DEU.INC
    • WCMD_FRA.INC
    • WCMD_FRA.LNG
    • WCMD_HUN.INC
    • WCMD_KOR.INC
    • WCMD_ROM.LNG
    • WCMD_RUS.INC
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • wcx_ftp.ini 6
    • HISTORY.TXT
    • WCMD_CZ.LNG
    • WCMD_RUS.LNG
    • WCMD_SK.LNG
    • TCUNINST.EXE
    • TOTALCMD.EXE
  • noclose.exe 5
    • DESCRIPT.ION
    • HISTORY.TXT
    • KEYBOARD.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • unrar.dll 5
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
    • UNRAR9X.DLL
  • totalcmd.exe 5
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.EXE
    • TCUNINST.WUL
    • TcUsbRun.exe
  • tc7z.dll 4
    • DESCRIPT.ION
    • TC7ZIPIF.DLL
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • sfxhead.sfx 4
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • tcmdx64.exe 4
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • wcmicons.dll 4
    • DEFAULT.BAR
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • cglptnt.sys 4
    • CGLPT64.SYS
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
  • tcmadmin.exe 4
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • unrar9x.dll 4
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • tcunzlib.dll 4
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • tcusbrun.exe 3
    • DESCRIPT.ION
    • HISTORY.TXT
    • TCUNINST.WUL
  • freres32.dll 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • share_nt.exe 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • cabrk.dll 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • wc32to16.exe 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • wincmd.ini 3
    • HISTORY.TXT
    • TCUNINST.EXE
    • TOTALCMD.EXE
  • default.bar 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • tc7zipif.dll 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • unacev2.dll 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • tcmdlzma.dll 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • cglpt9x.vxd 3
    • DESCRIPT.ION
    • TCUNINST.WUL
    • TOTALCMD.EXE
  • wcuninst.wul 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • history.txt 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • tcuninst.wul 2
    • DESCRIPT.ION
    • TCUNINST.EXE
  • register.rtf 2
    • WCMD_FRA.LNG
    • TCUNINST.WUL
  • size!.txt 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • totalcmd.chm 2
    • TCUNINST.EXE
    • TCUNINST.WUL
  • totalcmd.exe.manifest 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • cglpt64.sys 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • wcmd_deu.lng 2
    • HISTORY.TXT
    • TCUNINST.WUL
  • wcmicons.inc 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • no.bar 2
    • DESCRIPT.ION
    • TCUNINST.WUL
  • wcmd_deu.mnu 1
    • TCUNINST.WUL
  • wcmd_pol.mnu 1
    • TCUNINST.WUL
  • wcmd_hun.mnu 1
    • TCUNINST.WUL
  • wcmd_kor.inc 1
    • TCUNINST.WUL
  • wcmd_dut.lng 1
    • TCUNINST.WUL
  • wcmd_rom.inc 1
    • TCUNINST.WUL
  • wcmd_swe.lng 1
    • TCUNINST.WUL
  • wcmd_swe.mnu 1
    • TCUNINST.WUL
  • wcmd_svn.inc 1
    • TCUNINST.WUL
  • wcmd_cz.lng 1
    • TCUNINST.WUL
  • wcmd_dut.inc 1
    • TCUNINST.WUL
  • wcmd_kor.lng 1
    • TCUNINST.WUL
  • wcmd_kor.mnu 1
    • TCUNINST.WUL
  • wcmd_cz.inc 1
    • TCUNINST.WUL
  • wcmd_fra.inc 1
    • TCUNINST.WUL
  • wcmd_rus.inc 1
    • TCUNINST.WUL
  • wcmd_cz.mnu 1
    • TCUNINST.WUL
  • wcmd_fra.mnu 1
    • TCUNINST.WUL
  • wcmd_ita.mnu 1
    • TCUNINST.WUL
  • wcmd_nor.mnu 1
    • TCUNINST.WUL
  • wcmd_esp.mnu 1
    • TCUNINST.WUL
  • wcmd_rom.mnu 1
    • TCUNINST.WUL
  • wcmd_dan.inc 1
    • TCUNINST.WUL
  • wcmd_deu.inc 1
    • TCUNINST.WUL
  • wcmd_rus.mnu 1
    • TCUNINST.WUL
  • wcmd_hun.lng 1
    • TCUNINST.WUL
  • wcmd_chn.mnu 1
    • TCUNINST.WUL
  • wcmd_eng.mnu 1
    • TCUNINST.WUL
  • wcmd_ita.lng 1
    • TCUNINST.WUL
  • wcmd_dan.mnu 1
    • TCUNINST.WUL
  • wcmd_sk.lng 1
    • TCUNINST.WUL
  • wcmd_pol.lng 1
    • TCUNINST.WUL
  • wcmd_sk.mnu 1
    • TCUNINST.WUL
  • keyboard.txt 1
    • TCUNINST.WUL
  • wcmd_dan.lng 1
    • TCUNINST.WUL
  • wcmd_esp.lng 1
    • TCUNINST.WUL
  • wcmd_chn.inc 1
    • TCUNINST.WUL
  • wcmd_nor.lng 1
    • TCUNINST.WUL
  • wcmd_fra.lng 1
    • TCUNINST.WUL
  • wcmd_rom.lng 1
    • TCUNINST.WUL
  • wcmd_esp.inc 1
    • TCUNINST.WUL
  • wcmd_chn.lng 1
    • TCUNINST.WUL
  • wcmd_svn.lng 1
    • TCUNINST.WUL
  • wcmd_ita.inc 1
    • TCUNINST.WUL
  • wcmd_rus.lng 1
    • TCUNINST.WUL
  • wcmd_dut.mnu 1
    • TCUNINST.WUL
  • wcmd_hun.inc 1
    • TCUNINST.WUL
  • wcmd_svn.mnu 1
    • TCUNINST.WUL

The simple example – what does it tell us?

While simple, the example above allows us to link all of the files produced during the installation of Total Commander and build a cluster which we could call ‘totalcmd’.
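The listing above is essentially an inverted index (file name to the files that mention it) sorted by the number of referrers. A minimal sketch of how such a report could be produced from that mapping follows. The function name popularity_report is hypothetical, and ties are broken alphabetically here, which is an assumption: the listing above does not appear to use that order.

```python
def popularity_report(referenced_by):
    """Format a {file: set of referrers} mapping, most-referenced file first.

    Ties are broken alphabetically by file name (an assumption of this sketch).
    """
    ranked = sorted(
        referenced_by.items(),
        key=lambda item: (-len(item[1]), item[0].lower()),
    )
    lines = []
    for target, referrers in ranked:
        # Header line: lowercased file name and its referrer count.
        lines.append(f"{target.lower()} {len(referrers)}")
        lines.extend(f"    {name}" for name in sorted(referrers))
    return "\n".join(lines)
```

Files that never show up as a key in the mapping are the ones nothing else refers to, and they are worth listing separately.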

I’d love to see a DFIR tool that would let me implement this sort of clustering and then hide such filighted files with a click of a mouse. Applying the same logic to other directories (e.g. Program Files) one by one could then allow us to build such clusters automatically and exclude these files from the ‘view’ as well.
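In code, 'hiding with a click' boils down to subtracting the members of selected clusters from the file listing. The sketch below assumes clusters are kept as a mapping from a label (such as 'totalcmd') to a set of paths. Both the structure and the name filter_view are hypothetical.

```python
def filter_view(all_files, clusters, hidden):
    """Return the files still shown after hiding every cluster named in `hidden`.

    `clusters` maps a cluster label (e.g. 'totalcmd') to the set of file
    paths it contains; unknown labels in `hidden` are silently ignored.
    """
    suppressed = set()
    for label in hidden:
        suppressed |= clusters.get(label, set())
    # Preserve the original listing order for the files that remain.
    return [path for path in all_files if path not in suppressed]
```

With one cluster per installation directory, the same call could hide several packages at once and leave only the files no cluster accounts for.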

Utilizing such automatically generated clusters plus clusters of whitelisted/blacklisted software (potentially focused on problematic cases) could significantly reduce analysis time (on top of other data reduction techniques).

See the second part here.

Wow6432Node key stats

I recently came back to play with strings artifacts extracted from a decently sized sample set. Looking at a normalized, clustered data set is always a good starting point for research. It can be very boring, but every once in a while you will find something interesting.

To kick it off, here are some stats about the Wow6432Node key that I generated overnight.

With 64-bit boxes becoming pretty much the norm, we naturally see more and more samples referring to this Registry key. If there is one reason for us to look at this data, it is to find out if there are perhaps some keys under Wow6432Node that may deserve some special attention… Who knows, maybe some new persistence mechanism or some new, interesting artifact is out there waiting for someone to discover it.

Obviously, stats may be misleading, so use them at your own risk. Also, not all the keys are necessarily malicious. It’s just a bunch of keys that specifically refer to Wow6432Node, and are extracted from a large sample set.

Looking at the data below one thing strikes me immediately – the Run and RunOnce keys are pretty low on the list. Either software authors are not hardcoding them to avoid heuristic detections, or… there is really not that much software that modifies these keys directly.

  179506 software\wow6432node\microsoft\windows\
  42517 software\wow6432node\clients\startmenuinternet
  23631 software\wow6432node\microsoft\windows\currentversion\uninstall\avast
   5074 software\wow6432node\javasoft\java runtime environment
   4859 software\wow6432node\javasoft\java development kit
   3274 software\wow6432node\beattool
   3020 software\wow6432node\avast
   2601 software\wow6432node\sweetim
   1861 software\wow6432node\avira
   1686 software\wow6432node\microsoft\internet explorer\extensions\{ebd24bd3-e272-4fa3-a8ba-c5d709757cab}
   1641 software\wow6432node\sweet-pagesoftware
   1641 software\wow6432node\awesomehpsoftware
   1639 software\wow6432node\webssearchessoftware
   1638 software\wow6432node\qone8software
   1638 software\wow6432node\microsoft\windows\currentversion\uninstall\{c4ed781c-7394-4906-aaff-d6ab64ff7c38}
   1638 software\wow6432node\microsoft\windows\currentversion\uninstall\{889df117-14d1-44ee-9f31-c5fb5d47f68b}
   1638 software\wow6432node\classes\clsid\{4aa46d49-459f-4358-b4d1-169048547c23}
   1637 software\wow6432node\aartemissoftware
   1636 software\wow6432node\avg
   1551 software\wow6432node\microsoft\windows\currentversion\uninstall
   1515 software\wow6432node\avast software
   1465 wow6432node\clsid\
   1399 software\wow6432node\baidu security\antivirus
   1387 software\wow6432node\google\chrome\extensions
   1141 \software\wow6432node\baidu security\pc faster
    913 software\wow6432node\microsoft\windows\currentversion\uninstall\avira
    623 software\wow6432node\omiga-plussoftware\omiga-plushp
    583 software\wow6432node\red gate\
    559 wow6432node\clsid\%s
    502 software\wow6432node
    434 software\wow6432node\microsoft\internet explorer\extensions
    417 software\wow6432node\mozilla\mozilla firefox
    403 software\wow6432node\microsoft\windows\currentversion\uninstall\
    384 software\wow6432node\microsoft\internet explorer\toolbar
    372 software\wow6432node\mozilla\zvu.com\%s\main
    372 software\wow6432node\mozilla\zvu.com
    363 software\wow6432node\microsoft\windows\currentversion\run
    356 software\wow6432node\{smartassembly}
    326 software\wow6432node\microsoft\office\outlook\addins
    295 hkey_local_machine\software\wow6432node\vitalwerks\duc
    281 software\wow6432node\babylontoolbar\babylontoolbar
    265 software\wow6432node\brapp
    263 software\wow6432node\microsoft\windows\currentversion\runonce
    253 software\wow6432node\asktoolbar\macro
    215 software\wow6432node\mozilla\mozilla firefox\
    204 software\wow6432node\realnetworks\dlp
    189 software\wow6432node\microsoft\net framework setup\ndp\
    186 software\wow6432node\qone8software\qone8hp
    168 software\wow6432node\v9software
    163 software\wow6432node\qvo6software\qvo6hp
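
A tally like the one above could be produced along the following lines. This is a sketch under stated assumptions, not the tooling actually used: it assumes the strings have already been extracted per sample, normalizes them by trimming and lowercasing, and counts each key at most once per sample. Whether the numbers above are per sample or per occurrence is not stated, so that choice is a guess, and the function names are hypothetical.

```python
from collections import Counter


def normalize(value):
    """Lowercase and trim a raw string so equivalent key paths collapse together."""
    return value.strip().lower()


def wow6432node_stats(strings_per_sample):
    """Count, per key string, how many samples contain it at least once."""
    counts = Counter()
    for strings in strings_per_sample:
        # A set ensures each sample contributes at most one hit per key.
        keys = {normalize(s) for s in strings if "wow6432node" in s.lower()}
        counts.update(keys)
    return counts.most_common()


def format_stats(stats):
    """Render (key, count) pairs with right-aligned counts, one pair per line."""
    return "\n".join(f"{count:>7} {key}" for key, count in stats)
```

Note that no path normalization beyond case and whitespace is attempted, which matches the listing above, where variants with a leading or trailing backslash are counted as separate entries.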