In a recently published Definitive Dossier of Devilish Debug Details, Steve Miller is going on a very entertaining adventure of looking at PDB paths of known malware campaigns and authors. I love this article, because I have always felt that PDB is a great forensic artifact, often overlooked, and even if I did some research on it in the past myself, I have never seen a comprehensive study on a level that Steve delivered.
Inspired by it, I had a quick look at PDB paths of… primarily clean files. I am saying primarily, because while I am almost certain that most of them are clean one can never be sure 100%… To support the claim, I can list a couple of paths I found in this (allegedly) clean corpora suggesting that clean probably means different things to different people:
- C:\Users\lol g\Desktop\PowerBiosServer_20561\PowerBiosServer_20080428\PowerBiosServer\obj\Release\PowerBiosServer.pdb
I still believe that most of these are clean, and… perhaps an honest mistake made these paths incorporated into final executables ;), and who knows… maybe even some of them got signed 😉
Looking at all these paths we can draw some quick conclusions:
- We could use them to generate a bunch of good yara signatures that catch good stuff; helps with clustering
- Of course, since the file is now public, it means that bad guys could re-use existing paths to bypass aforementioned potential yara sigs by making them trigger on bad stuff pretending to be a good stuff
- We see that Perforce, SVN, CVS, GIT are popular repos and perhaps their presence can indicate a proper software development practice at a company that generated the executables (could this alone be a good indicator for determining if the file is benign?)
- Lots of different programming languages in use
- Lots of personal build environments (1K user unique names under c:\users folder alone!)
- Some coders compiled programs under an Administrator account (in fairness, my corpora are files between 2000-2019, so plenty of files come from the old-school times when Admin was a default for everything)
- There are traces of some beautiful build environments out there; seriously, these are symptoms of very mature development practices visible directly in some of these PDB paths (their clusters)
- Surprisingly, many paths are outside of C: drive — could this be a generic indicator of ‘good’ too?
- Also, some of the usernames are clearly test-related; I am curious if these are overlooked in a final build, or some files were ‘leaked’? (test, Test, SKtester, nbtester, cvcctest, Pretest, tester, test5, TestUser, Test2, Test05, TestPC, Pinocchio_test)
- We have users from all over the place: English/American, Chinese, Indian, Irish, French, Korean, Russian, Arabic, etc.
You can download a zipped archive with PDB paths here.
Note: This file is watermarked; you cannot use it for commercial purposes.