Analysis of binary data is always challenging. Data can be encrypted, encoded, and stored in any number of proprietary formats. Understanding what the data represents and how it is stored is non-trivial. It typically involves either analyzing the code that writes stuff to a file, or trying our luck by guessing a possible structure of the actual data. The typical approach is to simply look at the data and its properties.
This can involve checking its entropy and how it changes over the file, looking for patterns typically associated with popular compression algorithms, attempting to brute-force various trivial encryption algos, and checking if any data is recognizable as a string, Unicode string, localized string, a potential absolute or relative offset to other data, or maybe a byte-, word-, or dword-long length field preceding the data, etc.
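As a rough illustration of the entropy check mentioned above, here is a minimal Python sketch that computes Shannon entropy over fixed-size windows. The function names, the 256-byte window and the step are my own assumptions for illustration; real tools may slide the window differently or smooth the result.

```python
import math
from collections import Counter


def shannon_entropy(chunk):
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(data, window=256, step=256):
    """Yield (offset, entropy) for each window; window and step are arbitrary picks."""
    last_start = max(len(data) - window, 0)
    for offset in range(0, last_start + 1, step):
        yield offset, shannon_entropy(data[offset:offset + window])


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        for offset, value in entropy_profile(handle.read()):
            print("%08X %.3f" % (offset, value))
```

Values close to 8 bits per byte usually point at compressed or encrypted regions, while long runs of low values tend to be padding, text or simple structures.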
One of the most popular tools used to analyze unknown data is binwalk, and it has helped me on many occasions by providing hints on what is possibly ‘in the file’. Sometimes even the fact that it didn’t recognize anything interesting was a good hint – typically meaning encryption, or something really unusual/proprietary.
Existing tools are always handy, but I can’t count how many quick & dirty (and often completely stupid) scripts I wrote to get some data to look more ‘reasonable’ and ‘normal’.
In today’s post I am showing a simple example of such ‘unknown data analysis script’.
When we see a binary file, we typically run ‘strings’ on it and gather nice, readable ‘printable’ data for analysis. The ‘non-printable’ data is also interesting though, so another tool I often run is a strings-like script that carves timestamps out. This comes in handy for smaller files, especially those that look like a config, a quarantine file, or really anything that looks like it may have timestamps embedded in it.
Carving follows a simple rule – read 4 or 8 bytes, convert them to an epoch using various conversion algos (based on the assumed timestamp format), check whether the epoch converts to a date between the years 2000 and 2015, and if it does – just print it out, together with the offset and some extra metadata.
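The rule above can be sketched in a few lines of Python. This is not the actual script, only my approximation of the idea: the function names are made up, the scan tries every byte offset, all values are assumed to be little-endian and UTC, and only three formats are covered (DOS date/time, Windows FILETIME, 32-bit Unix epoch). It may well report a different set of false positives than the real tool.

```python
import struct
from datetime import datetime, timezone

# Seconds between 1601-01-01 (FILETIME origin) and 1970-01-01 (Unix origin).
FILETIME_UNIX_DELTA = 11644473600

# Plausibility window from the post: years 2000 through 2015 (UTC assumed).
WINDOW_START = int(datetime(2000, 1, 1, tzinfo=timezone.utc).timestamp())
WINDOW_END = int(datetime(2016, 1, 1, tzinfo=timezone.utc).timestamp())


def epoch_from_unix(raw):
    """4 bytes, little-endian seconds since 1970."""
    return struct.unpack("<I", raw)[0]


def epoch_from_filetime(raw):
    """8 bytes, little-endian count of 100 ns ticks since 1601."""
    return struct.unpack("<Q", raw)[0] // 10_000_000 - FILETIME_UNIX_DELTA


def epoch_from_dostime(raw):
    """4 bytes: DOS time word followed by DOS date word (assumed order)."""
    time_word, date_word = struct.unpack("<HH", raw)
    try:
        stamp = datetime(
            1980 + (date_word >> 9), (date_word >> 5) & 0x0F, date_word & 0x1F,
            time_word >> 11, (time_word >> 5) & 0x3F, (time_word & 0x1F) * 2,
            tzinfo=timezone.utc,
        )
    except ValueError:  # impossible field values, e.g. month 0 or hour 31
        return None
    return int(stamp.timestamp())


# (label, size in bytes, converter); the labels mirror the tool's output.
FORMATS = [
    ("DOSTIME", 4, epoch_from_dostime),
    ("FILETIME", 8, epoch_from_filetime),
    ("EPOCH", 4, epoch_from_unix),
]


def carve(data):
    """Yield (offset, label, epoch, raw_bytes) for every plausible hit."""
    for offset in range(len(data)):
        for label, size, convert in FORMATS:
            raw = data[offset:offset + size]
            if len(raw) < size:
                continue
            epoch = convert(raw)
            if epoch is not None and WINDOW_START <= epoch < WINDOW_END:
                yield offset, label, epoch, raw


def format_hit(offset, label, epoch, raw):
    """Render one hit as offset,type,hex epoch,date,raw bytes."""
    when = datetime.fromtimestamp(epoch, tz=timezone.utc)
    return "%08X,%-8s,%08X,%s,%s" % (
        offset, label, epoch, when.strftime("%Y-%m-%d %H:%M:%S"), raw.hex().upper())


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as handle:
        for hit in carve(handle.read()):
            print(format_hit(*hit))
```

Run it as a script with a file name as the only argument. Narrowing the year window is the cheapest way to cut down on noise, since every format is tried at every offset.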
     00 01 02 03 04 05 06 07 08 09 0A 0B 0C 0D 0E 0F  0123456789ABCDEF
---------------------------------------------------------------------------
00 : 80 86 F6 34 00 C0 5D CE 56 CF CD 01 00 40 FA 13  ...4..].V....@..  00
10 : 0F 00 CE 01 00 40 8B B7 0F 16 CE 01 00 80 59 DA  .....@........Y.  16
20 : 6B 2E CE 01 00 00 BE D2 FE 45 CE 01 00 A4 03 01  k........E......  32
30 : 85 95 C2 01                                      ....              36
Looking at such binary data doesn’t give us much useful information.
Running TimeCraver over it gives us the following:
===========================================
TimeCraver v0.1, Hexacorn.com, 2015-08-23
===========================================
00000000,DOSTIME ,44C257B0,2006-07-22 16:52:00,8086F634
00000004,FILETIME,50B94880,2012-12-01 00:00:00,00C05DCE56CFCD01
0000000A,EPOCH   ,400001CD,2004-01-10 13:44:45,CD010040
0000000C,FILETIME,510B0580,2013-02-01 00:00:00,0040FA130F00CE01
00000012,EPOCH   ,400001CE,2004-01-10 13:44:46,CE010040
00000014,FILETIME,512FEF7F,2013-02-28 23:59:59,00408BB70F16CE01
0000001C,FILETIME,5158CDFF,2013-03-31 23:59:59,008059DA6B2ECE01
00000024,FILETIME,51805B00,2013-05-01 00:00:00,0000BED2FE45CE01
00000026,EPOCH   ,45FED2BE,2007-03-19 18:13:18,BED2FE45
0000002C,FILETIME,3DE3D068,2002-11-26 19:50:00,00A403018595C201
The first column is an offset, followed by the timestamp type, then hexadecimal EPOCH calculated from the data, then its YYYY-MM-DD hh:mm:ss representation and finally the actual bytes from the file that are converted to EPOCH.
The data is immediately more readable and certain conclusions can be drawn. If you look at the offsets, distance between them and type of timestamps you may actually ‘see through’ the data and potentially ‘define’ a reasonable structure.
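One quick way to ‘see through’ the hits is to count the distances between consecutive offsets of the same timestamp type. The sketch below is only an illustration: the hit list is typed in by hand from the output above, and `offset_gaps` is a made-up helper name.

```python
from collections import Counter

# (offset, type) pairs copied from the carver output above.
HITS = [
    (0x00, "DOSTIME"), (0x04, "FILETIME"), (0x0A, "EPOCH"),
    (0x0C, "FILETIME"), (0x12, "EPOCH"), (0x14, "FILETIME"),
    (0x1C, "FILETIME"), (0x24, "FILETIME"), (0x26, "EPOCH"),
    (0x2C, "FILETIME"),
]


def offset_gaps(hits, kind):
    """Count the distances between consecutive hits of one timestamp type."""
    offsets = sorted(offset for offset, label in hits if label == kind)
    return Counter(later - earlier for earlier, later in zip(offsets, offsets[1:]))


if __name__ == "__main__":
    for kind in ("FILETIME", "EPOCH"):
        print(kind, dict(offset_gaps(HITS, kind)))
```

A single dominant gap that equals the size of the timestamp itself suggests back-to-back records; scattered gaps, like the ones the EPOCH hits produce here, are more likely coincidental matches.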
In this particular case, we can see that the FILETIME hits sit exactly 8 bytes apart, starting at offset 4 – the data looks like a sequence of FILETIME records. Following this logic, we can guess that the structure of the file is potentially like this:
00000000,DOSTIME ,44C257B0,2006-07-22 16:52:00,8086F634
00000004,FILETIME,50B94880,2012-12-01 00:00:00,00C05DCE56CFCD01
0000000C,FILETIME,510B0580,2013-02-01 00:00:00,0040FA130F00CE01
00000014,FILETIME,512FEF7F,2013-02-28 23:59:59,00408BB70F16CE01
0000001C,FILETIME,5158CDFF,2013-03-31 23:59:59,008059DA6B2ECE01
00000024,FILETIME,51805B00,2013-05-01 00:00:00,0000BED2FE45CE01
0000002C,FILETIME,3DE3D068,2002-11-26 19:50:00,00A403018595C201
I can confirm it, since it is one of the test files I created.
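Once a structure like this has been guessed, it can be checked by parsing the whole file in one go. Below is a minimal sketch that treats the 52 sample bytes as one DOS date/time followed by six FILETIME values. The layout string, the function name and the returned dictionary are assumptions made for this example, not a documented format.

```python
import struct
from datetime import datetime, timedelta, timezone

# The 52 bytes from the hex dump above.
SAMPLE = bytes.fromhex(
    "8086F63400C05DCE56CFCD010040FA13"
    "0F00CE0100408BB70F16CE01008059DA"
    "6B2ECE010000BED2FE45CE0100A40301"
    "8595C201"
)

# Guessed layout: DOS time word, DOS date word, then six 64-bit FILETIMEs.
LAYOUT = struct.Struct("<HH6Q")
FILETIME_ORIGIN = datetime(1601, 1, 1, tzinfo=timezone.utc)


def parse_guessed_structure(data):
    """Decode the guessed layout; raises struct.error if the size is wrong."""
    time_word, date_word, *filetimes = LAYOUT.unpack(data)
    dos_stamp = datetime(  # DOS timestamps carry no time zone, so keep it naive
        1980 + (date_word >> 9), (date_word >> 5) & 0x0F, date_word & 0x1F,
        time_word >> 11, (time_word >> 5) & 0x3F, (time_word & 0x1F) * 2,
    )
    return {
        "dostime": dos_stamp,
        # FILETIME counts 100 ns ticks; integer division by 10 gives microseconds.
        "filetimes": [FILETIME_ORIGIN + timedelta(microseconds=ft // 10)
                      for ft in filetimes],
    }


if __name__ == "__main__":
    record = parse_guessed_structure(SAMPLE)
    print(record["dostime"])
    for stamp in record["filetimes"]:
        print(stamp)
```

If the guess were wrong, the unpack would either fail on the size or produce dates that make no sense, which is a useful sanity check in itself.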
The script can be found here.
Happy craving & carving !
Bonus: if you look at the data in the Registry, you will find more timestamps there than you might expect. This is a subject for another post.
The bonus will be here faster than expected – it turns out Andrew Case, Jerry Stormo, Joseph Sylve, and Vico Marziale wrote an awesome Python script for timestamp carving in the Registry.