DriverPack – Clean PDB paths

Unique PDB debug paths embedded inside malware are useful for detecting other variants of the same malicious family. (This does not apply to more advanced malware families whose authors either wipe the paths out, use a randomized string, or use a programming language and compiler that don’t leave these forensic artifacts behind.)

The very same approach can be used to classify ‘good’ files. The only problem is finding a nice, sorted sample set of clean files from which we can extract a larger list of ‘good’ PDB paths.

Luckily, very well organized sample sets of good, clean files exist and can be downloaded easily and quickly. DriverPack is one example. After you download their torrent you get 32GB of popular driver files, neatly sorted into subdirectories named after both the driver class (audio, video, etc.) and the vendor, i.e. the company providing the software added to the pack.

The bonus is that many of these files are relatively fresh (although you will find a lot of oldies there too).

Running a simple parser over the extracted files, I created a quick and dirty list of clean PDB paths mapped to vendor names in no time. How useful is that? Again, you can build automated yara rules, use the list in offline analysis, and speed up triage in forensic investigations w/o relying on hash sets, fuzzy hash sets, etc.
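For illustration, here is a minimal sketch of what such a parser could look like. It is not the parser used for the list above. It assumes the PDB path sits in a CodeView PDB 7.0 record (the `RSDS` signature, a 16-byte GUID, a 4-byte age, then a NUL-terminated path), that the path is plain ASCII, and that the pack is unpacked as `<root>/<driver class>/<vendor>/...`. The directory layout, the extension filter, and the function names are all assumptions made for the example.

```python
from __future__ import annotations

import re
from pathlib import Path

# CodeView PDB 7.0 record: b"RSDS" + 16-byte GUID + 4-byte age + NUL-terminated path.
# Assumption: the path is printable ASCII and ends in ".pdb".
RSDS_RE = re.compile(rb"RSDS.{20}([\x20-\x7e]{1,260}?\.[pP][dD][bB])\x00", re.DOTALL)

# Assumption: only these extensions are worth scanning.
PE_SUFFIXES = {".sys", ".dll", ".exe"}


def extract_pdb_paths(data: bytes) -> list[str]:
    """Return every PDB path found in RSDS records inside a raw file image."""
    return [m.group(1).decode("ascii") for m in RSDS_RE.finditer(data)]


def build_vendor_map(root: Path) -> dict[str, set[str]]:
    """Map each PDB path to the vendor directories it was seen under.

    Assumes a hypothetical layout of root/<class>/<vendor>/.../<file>.
    """
    mapping: dict[str, set[str]] = {}
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix.lower() not in PE_SUFFIXES:
            continue
        parts = path.relative_to(root).parts
        vendor = parts[1] if len(parts) > 2 else "unknown"
        for pdb in extract_pdb_paths(path.read_bytes()):
            mapping.setdefault(pdb, set()).add(vendor)
    return mapping
```

Scanning raw bytes for the signature is crude (it does not walk the PE debug directory and reads each file fully into memory), but it is good enough for a quick and dirty pass over a big tree of files.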

Da Li’L World of DLL Exports and Entry Points, Part 5

The previous parts of this series were done ‘manually’. I would come across some new type of DLL and would jot down its properties so I would have a point of reference if I came across these in the future. The ‘manual’ part also involved reading MSDN, as many of the DLL types I covered are nicely described there.

There is another way to enhance the list: doing it a bit more automatically. Such a list could, f.ex., be incorporated into your yara set, or become a part of tools like DiE.

Over 8 years ago I tried to collect a corpus of signed DLLs dropped by NullSoft installers; my list included over 2200 different DLLs. I will use this list today to show how we can create a table of interesting file properties that in turn could be converted into a detection ruleset.

Using sigcheck we can extract version info from these signed DLLs, and then enrich it with a list of exported APIs (f.ex. using pefile) and the internal DLL names. This is pretty much enough to create a decent detection data set.
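The pefile part of that step could be sketched roughly as follows. This only covers the exports and the internal DLL name stored in the export directory; the sigcheck version info and the signature check are left out. `DllProfile`, `profile_from_file` and `matches_known` are hypothetical names made up for this example, and the `known` table is assumed to be something you build yourself from your own corpus.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class DllProfile:
    internal_name: str       # DLL name stored in the export directory
    exports: frozenset[str]  # names of exported functions (ordinal-only exports skipped)


def profile_from_file(path: str) -> DllProfile | None:
    """Read the internal name and named exports of a DLL; None if it has no exports."""
    import pefile  # third-party; imported lazily so matches_known() works without it

    pe = pefile.PE(path, fast_load=True)
    try:
        pe.parse_data_directories(
            directories=[pefile.DIRECTORY_ENTRY["IMAGE_DIRECTORY_ENTRY_EXPORT"]]
        )
        export_dir = getattr(pe, "DIRECTORY_ENTRY_EXPORT", None)
        if export_dir is None:
            return None
        raw_name = pe.get_string_at_rva(export_dir.struct.Name) or b""
        exports = frozenset(
            sym.name.decode("ascii", "replace") for sym in export_dir.symbols if sym.name
        )
        return DllProfile(raw_name.decode("ascii", "replace"), exports)
    finally:
        pe.close()


def matches_known(profile: DllProfile, known: dict[str, frozenset[str]]) -> bool:
    """True if the internal name is in the table and all expected exports are present."""
    expected = known.get(profile.internal_name.lower())
    return expected is not None and expected <= profile.exports
```

The match is deliberately a subset test rather than an exact comparison, on the assumption that newer builds of the same DLL may add exports while keeping the old ones. A match here says nothing about the signature, which still has to be verified separately.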

Of course, these files are very old, so to make the best use of this idea one would need to process a larger data set with newer files.

What is the benefit of using such a ruleset? As long as files are signed and show the listed properties, you may classify many clean files automatically w/o relying on exact hashes, fuzzy hashes, or antivirus scans. And yes, as I mentioned in the past, such a list of properties can be abused by malware authors, but then again, these files are actually signed, so checking the signature is a good way to sift the real ones from the fake ones.