In my previous post about clustering, I mentioned that it can be used as an efficient data reduction technique. I also provided some examples of timestamps that could be useful for detecting suspicious files on the system. One of them was the compilation time embedded inside Portable Executables (PE). It turns out that putting this idea into practice is easy, and today I wrote a simple Perl script that implements this functionality in a few dozen lines of code.
The script scans a directory (recursively, if requested) and finds all Portable Executables. It then reads their compilation timestamps and groups them into clusters. Each cluster is a ‘bucket’ holding all binaries compiled within a window of 1 day (86400 seconds). You can play around with the script and change the value of CLUSTER_BOUNDARY to e.g. 30 days (2592000 seconds) and see what happens.
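For readers who would rather experiment in Python, here is a rough sketch of the same idea. It is not the Perl script itself, only an approximation of the steps described above. The function names (`pe_timestamp`, `scan`, `cluster`) are made up for illustration, and two things are assumptions: that a ‘window’ means a fixed bucket obtained by rounding the timestamp down to a multiple of CLUSTER_BOUNDARY, and that the PE header sits within the first 4 KiB of the file.

```python
import os
import struct
from collections import defaultdict

CLUSTER_BOUNDARY = 86400  # bucket width in seconds (1 day)


def pe_timestamp(head):
    """Return the COFF TimeDateStamp from the start of a file, or None if not a PE."""
    if len(head) < 0x40 or head[:2] != b"MZ":
        return None
    (e_lfanew,) = struct.unpack_from("<I", head, 0x3C)  # offset of the "PE\0\0" signature
    if e_lfanew + 12 > len(head) or head[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None
    # After the signature: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4)
    (stamp,) = struct.unpack_from("<I", head, e_lfanew + 8)
    return stamp


def scan(root, recursive=False):
    """Yield (path, timestamp) for every PE file found under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        if not recursive:
            dirnames.clear()  # stops os.walk from descending into subdirectories
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    head = fh.read(4096)  # assumption: PE header is in the first 4 KiB
            except OSError:
                continue  # unreadable file, skip it
            stamp = pe_timestamp(head)
            if stamp is not None:
                yield path, stamp


def cluster(stamped, boundary=CLUSTER_BOUNDARY):
    """Group (path, timestamp) pairs into buckets keyed by the bucket start time."""
    buckets = defaultdict(list)
    for path, stamp in stamped:
        buckets[stamp - stamp % boundary].append(path)
    return {start: sorted(paths) for start, paths in sorted(buckets.items())}
```

Calling `cluster(scan(some_dir, recursive=True))` returns a dictionary mapping each bucket's start time to the files that fall into it. Fixed buckets are the simplest choice, but two files compiled a minute apart on either side of midnight UTC land in different buckets; grouping by the gap between consecutive timestamps would avoid that at the cost of a little more code.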
In the screenshot below you can see the script at work – finding all PE files and grouping them into clusters:
And after processing the whole folder, the resulting clusters are printed out:
One needs to quickly scroll through these groups and look at isolated / orphaned files or small groups; this should hopefully help in finding the bad apples. You can also toy around with the script over clean directories to see what intel you can gather from the compilation timestamps of all PE files inside some specific directory.
For example, after running it over the c:\windows\system32 directory of various Windows flavors you may spot some interesting patterns:
- Portable Executables that are part of the Windows OS are not built in alphabetical order (I originally hoped they were – it could be an interesting pattern to use to spot ‘out-of-orderly’ named executables sandwiched between 2 clean OS files)
- Still, many OS binaries are compiled sequentially (a few minutes apart), so many of them can be easily ignored in analysis
- On Windows XP and Vista, DLLs and EXEs seem to be compiled separately (this is an interesting pattern, as seeing an .exe in a sequence of .dlls should be immediately treated as suspicious; note that system updates may affect this pattern)
- On Windows 7 both EXEs and DLLs seem to be compiled w/o any specific pattern 🙁
- A clean installation has a very small number of clusters within the system32 directory; updated/patched binaries make analysis harder (still, updates will most likely be seen as separate clusters)
- Files dropped by installers, malware, as well as packed executables, compiled scripts (e.g. perl2exe), etc. should stand out, even if timestomped – see how the psexec service executable stands out in the screenshot below.
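The ‘scroll through the groups and look for isolated files or small groups’ step could also be automated. The sketch below is only an illustration: it assumes the clusters are already available as a dictionary mapping a bucket start time (seconds since 1970, UTC) to a list of file names, and both the `small_clusters` helper and its `max_size` threshold are hypothetical names chosen for this example.

```python
import time


def small_clusters(clusters, max_size=2):
    """Return only the clusters holding at most max_size files, oldest first."""
    return {
        start: paths
        for start, paths in sorted(clusters.items())
        if len(paths) <= max_size
    }


def describe(start):
    """Format a bucket start time as a UTC date for display."""
    return time.strftime("%Y-%m-%d", time.gmtime(start))
```

On a clean installation most binaries fall into a handful of large clusters, so whatever survives this filter is a short list worth eyeballing. A threshold of 2 is arbitrary; updates and patches also produce small clusters, so expect false positives.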
Compilation time is a very useful characteristic of a Portable Executable. Malware authors occasionally zero it or change it to a random value, but this is still not a very common practice. This, of course, is very good news for investigators and forensic analysts. If the timestamp is real (not tampered with), the compilation time of a malicious sample will most likely differ from the ‘typical’ timestamps that can be found e.g. within the system32 directory. As mentioned earlier, PECluester should be able to group such randomly dropped files into separate cluster(s) even if the file system (e.g. $MFT) timestamps are timestomped.
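Zeroed or random values are the easy case, because they can be flagged without any clustering at all. Here is a minimal sketch of such a sanity check. The `classify` function and its labels are made up for this example, and the lower bound of 1 January 1993 is an assumption (roughly when the PE format first shipped), not a rule taken from any specification.

```python
import calendar
import time

# Assumption: nothing was legitimately compiled as a PE before 1993.
EARLIEST_PLAUSIBLE = calendar.timegm((1993, 1, 1, 0, 0, 0))


def classify(stamp, now=None):
    """Give a rough verdict on a PE compilation timestamp (seconds since 1970, UTC)."""
    now = time.time() if now is None else now
    if stamp == 0:
        return "zeroed"
    if stamp > now:
        return "future"
    if stamp < EARLIEST_PLAUSIBLE:
        return "too old"
    return "plausible"
```

A check like this only catches crude tampering. It says nothing about a timestamp that was deliberately chosen to blend in with its neighbours.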
Speaking of the devil. I mentioned ‘perfect timestomping’ in the title of this post.
Perfect timestomping of a Portable Executable would require not only changing the metadata on the file system, but also changing the PE file’s compilation time (and all timestamps inside the PE file that could reveal its compilation time) to some carefully chosen value that blends with the compilation times of system files (especially for malware dropped inside system folders; for malware within application/temp data folders this, of course, is not that useful).
So, how would we go about finding such perfectly timestomped files?
The good news for forensic investigators is that the compilation timestamp is only one of many possible timestamps that can be found inside a typical Portable Executable. Unless the malware author takes really good care of all these timestamps (either understands the Portable Executable file format quite well or uses a specialized tool), there is a high chance one may find some inconsistencies. Not many PE timestamps are properly updated at compilation time (e.g. Resources and the Import Table have placeholders for timestamps, but these are often zeroed by the compiler), but some structures may include real timestamps, e.g. the Debug Directory, as shown in the screenshot below.
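To make the Debug Directory idea concrete, here is a hedged sketch that pulls the header timestamp and every Debug Directory timestamp out of a PE image and reports the ones that disagree. It follows the documented PE layout (data directory index 6, 28-byte IMAGE_DEBUG_DIRECTORY entries), but the function names are invented for this example, the file is assumed to be well formed (a truncated or malformed file raises `struct.error`), and it looks at no other timestamp-bearing structures.

```python
import struct

IMAGE_DIRECTORY_ENTRY_DEBUG = 6
DEBUG_ENTRY_SIZE = 28  # size of one IMAGE_DEBUG_DIRECTORY entry


def pe_timestamps(data):
    """Return (header_timestamp, [debug_timestamps]) for a PE image held in bytes."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ file")
    (pe,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe:pe + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    coff = pe + 4
    nsections, header_ts = struct.unpack_from("<HI", data, coff + 2)
    (opt_size,) = struct.unpack_from("<H", data, coff + 16)
    opt = coff + 20
    (magic,) = struct.unpack_from("<H", data, opt)
    dirs = opt + (112 if magic == 0x20B else 96)  # PE32+ vs PE32 layout
    (ndirs,) = struct.unpack_from("<I", data, dirs - 4)  # NumberOfRvaAndSizes
    if ndirs <= IMAGE_DIRECTORY_ENTRY_DEBUG:
        return header_ts, []
    rva, size = struct.unpack_from("<II", data, dirs + 8 * IMAGE_DIRECTORY_ENTRY_DEBUG)
    if rva == 0 or size == 0:
        return header_ts, []
    # Translate the RVA into a file offset using the section table.
    sections = opt + opt_size
    offset = None
    for i in range(nsections):
        vsize, vaddr, rsize, rptr = struct.unpack_from("<IIII", data, sections + 40 * i + 8)
        if vaddr <= rva < vaddr + max(vsize, rsize):
            offset = rva - vaddr + rptr
            break
    if offset is None:
        return header_ts, []
    stamps = []
    for pos in range(offset, offset + size - DEBUG_ENTRY_SIZE + 1, DEBUG_ENTRY_SIZE):
        (ts,) = struct.unpack_from("<I", data, pos + 4)  # TimeDateStamp follows Characteristics
        stamps.append(ts)
    return header_ts, stamps


def mismatched_debug_stamps(data):
    """Return the non-zero debug timestamps that differ from the header timestamp."""
    header_ts, debug = pe_timestamps(data)
    return [ts for ts in debug if ts != 0 and ts != header_ts]
```

A non-empty result from `mismatched_debug_stamps` is not proof of tampering, only a hint that the header timestamp and the debug data were not written at the same moment and that the file deserves a second look.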
Other clues about the compilation time can be related to:
- embedded files (author might have forgotten to clean up their timestamps)
- copyright banners for statically linked libraries
- standard ‘template’ program icon (e.g. win32 applications created via templates in a RAD environment always use the same standard icon unless the author changes it; icons change between RAD versions and may give some clues as to the ‘age’ of the malware)
- libraries/compiler signatures – this is difficult as it requires libraries of known patterns, IDA Pro’s FLIRT signatures come to mind here and may give some hints, but associating these with a specific date is close to impossible
- even harder – code that is specific to the compiler version, such as exception handlers, prologue/epilogue code, compilation flags, etc.
Back to PECluester – imho you can use it as an alternative to AV scans and a toy for further research. Go ahead and experiment. Enjoy!
You can download the script here.