Hunting for additional PE timestamps

January 4, 2019 in Batch Analysis, Clustering, File Formats ZOO, Malware Analysis, Reversing

Over the years I have published a number of posts about file timestamps. Not the file system timestamps, but the timestamps hidden inside the actual file content.

I wrote that there is a way to ‘heuristically’ carve timestamps from binary files. I have also provided some compilation timestamp stats for PE files, and discussed lesser-known Java folder timestamps. And when we talk about PE files specifically, there is a discussion of the PE compilation timestamp in the context of timestomping. There is also a bit of rambling about the infamous Borland/Delphi resource timestamp.

Anyone who ever looked at the PE file specification or various PE Dump reports knows there are possibly more timestamps hidden inside these executable files.

For starters, we can look for timestamps that are sometimes present in additional PE file areas, e.g. hidden inside a debug section, or even in the file’s signature – if they exist. There is also information about the compiler / linker version, the Rich Header, the .NET version, imported and exported APIs, imported DLLs, etc. All of these may help to narrow down the timeframe in which the file was created. The DiE tool does a good job of helping with the extraction of some of this information.

We can do a lot of guesswork based on the information available inside the metadata as well, for example by simply looking at the file’s Version information block, or manifest. Then there are good old strings: if we are lucky, the unencrypted strings embedded in a main file/configuration/update can give us an immediate answer. We may also rely on indirect references to time, e.g. by looking at versions of statically compiled libraries (sometimes you can see actual version strings), bugs in the code, and debug / verbosity logs that have not been stripped from the release version. Sometimes dates are included in the PDB string itself, and sometimes there is never-shown usage info, or even dead code, that may include a dated ‘copyright’ note from the malicious author. These may also be included as comments inside scripts, in case the binary file is a host for interpreted code (this happens pretty often with older malware based on AHK, VBE, etc.). Advanced malware analysts can often deduce the version / rough time period of protector layers by just looking at the code (they can, because they write decrypters for this mess and often adjust their code, even on a daily basis).

Now, all of these can be modified or forged, because they are very well known. But there are more timestamps we can look at.

When we read the PE file documentation we can notice that it is rich in descriptions of additional timestamp fields.

The problem is… most of the time they are zeroed.

But…

It is actually not always the case!

I was reading the description of the VS_FIXEDFILEINFO structure, and I realized that I never seriously looked at the two timestamp fields it includes:

dwFileDateMS Type: DWORD

The most significant 32 bits of the file’s 64-bit binary creation date and time stamp.

dwFileDateLS Type: DWORD

The least significant 32 bits of the file’s 64-bit binary creation date and time stamp.

So, I wrote a quick parser to search my PE metadata logs for samples where the values of these fields are not zero and, once decoded, fall within a reasonable timestamp range (between January 1st, 2000 and today).
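A minimal sketch of such a check follows. It is not the original parser: it works on raw file bytes rather than on metadata logs, it simply scans for the VS_FIXEDFILEINFO signature (0xFEEF04BD) instead of walking the resource tree, and it assumes the 64-bit value is a FILETIME (100-nanosecond intervals since January 1st, 1601 UTC). The function names are made up for illustration.

```python
import struct
from datetime import datetime, timedelta, timezone

VS_FFI_SIGNATURE = 0xFEEF04BD            # dwSignature of VS_FIXEDFILEINFO
FIXEDFILEINFO = struct.Struct("<13I")    # the structure is 13 little-endian DWORDs
FILETIME_EPOCH = datetime(1601, 1, 1, tzinfo=timezone.utc)
YEAR_2000 = datetime(2000, 1, 1, tzinfo=timezone.utc)


def filetime_to_datetime(ms, ls):
    """Combine dwFileDateMS/dwFileDateLS, ASSUMING the value is a FILETIME."""
    ticks = (ms << 32) | ls              # 100-ns intervals since 1601-01-01
    return FILETIME_EPOCH + timedelta(microseconds=ticks // 10)


def find_file_dates(data, earliest=YEAR_2000, latest=None):
    """Return (offset, datetime) for every plausible, non-zero file date."""
    latest = latest or datetime.now(timezone.utc)
    needle = struct.pack("<I", VS_FFI_SIGNATURE)
    hits = []
    pos = data.find(needle)
    while pos != -1:
        if pos + FIXEDFILEINFO.size <= len(data):
            fields = FIXEDFILEINFO.unpack_from(data, pos)
            ms, ls = fields[11], fields[12]   # dwFileDateMS, dwFileDateLS
            if ms or ls:                      # skip the usual all-zero case
                try:
                    stamp = filetime_to_datetime(ms, ls)
                except OverflowError:         # value decodes beyond year 9999
                    stamp = None
                if stamp is not None and earliest <= stamp <= latest:
                    hits.append((pos, stamp))
        pos = data.find(needle, pos + 1)
    return hits
```

Calling `find_file_dates(open(path, "rb").read())` on a sample returns an empty list in the common all-zero case. A raw signature scan can also hit unrelated data that happens to contain the same four bytes, so the range check doubles as a sanity filter.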

To my surprise, the script started spitting out names of samples that had these fields populated. Very often the values would be identical to the PE compilation timestamp, but in some cases they would be off by a few minutes, and sometimes by days or months. Since such cases give us a range between the compilation of the version info and the compilation of the actual PE file, they could provide additional information about the timeframe in which the sample was actively developed.
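To illustrate the comparison, here is a rough sketch that reads the TimeDateStamp field from the PE file header and turns a pair of timestamps into a window. The helper names are hypothetical, the header parsing is deliberately minimal (no validation beyond the MZ and PE signatures), and the version info timestamp is expected to be already decoded into a timezone-aware datetime.

```python
import struct
from datetime import datetime, timezone


def pe_compile_time(data):
    """Read TimeDateStamp from the COFF file header of a PE image."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)   # offset of the PE header
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # Signature (4 bytes) + Machine (2) + NumberOfSections (2) precede the field
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def development_window(compile_time, version_time):
    """Order the two timestamps and return (start, end, gap)."""
    start, end = sorted((compile_time, version_time))
    return start, end, end - start
```

A gap of zero means the two values agree. Anything larger is only a hint about the development timeframe, since either field can be modified.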

An example of such a file can be found here.

An obvious question appears: which compilers/linkers produce executables with these fields properly populated?

I don’t know at this stage. Also, despite some good results, I need to emphasize that most samples do not include a valid timestamp. Still… when it’s available… why shouldn’t we be extracting it?
