In the recently published Definitive Dossier of Devilish Debug Details, Steve Miller goes on a very entertaining adventure, looking at the PDB paths of known malware campaigns and authors. I love this article because I have always felt that the PDB path is a great, often overlooked forensic artifact, and although I did some research on it myself in the past, I have never seen a study as comprehensive as the one Steve delivered.
Inspired by it, I had a quick look at PDB paths of… primarily clean files. I say primarily because, while I am almost certain that most of them are clean, one can never be 100% sure… To support that caveat, here are a couple of paths I found in this (allegedly) clean corpus, suggesting that clean probably means different things to different people:
- D:\TEMP\fuckingasus\Debug\fuckingasus.pdb
- D:\Work\pgtool\svn\pgtoolfuck\Release\RTNicPgW32.pdb
- D:\Work\pgtool\svn\pgtoolfuck\x64\Release\RTNicPgW64.pdb
- d:\tmp\1driver\fuck4\rtl818xb\platform\ndis6\usb\objfre_wlh_x86\i386\rtl8187.pdb
- C:\TMP\shit\msikbd.2k\objfre\i386\msikbd2k.pdb
- c:\WORK\XPSDriver\oishitts_view\oishitts_xpsdrv093_051208_build\XPSRenderer092\xpsdriver\AquaFilter\Release\Win32\AquaFilter.pdb
- C:\Users\lol g\Desktop\PowerBiosServer_20561\PowerBiosServer_20080428\PowerBiosServer\obj\Release\PowerBiosServer.pdb
I still believe that most of these are clean, and… perhaps these paths ended up in the final executables through an honest mistake ;), and who knows… maybe some of them even got signed 😉
Looking at all these paths we can draw some quick conclusions:
- We could use them to generate a bunch of good YARA signatures that catch good stuff, which would also help with clustering
- Of course, since the file is now public, bad guys could re-use the existing paths to abuse those potential YARA sigs, making them trigger on bad stuff pretending to be good stuff
- We see that Perforce, SVN, CVS and Git are popular version control systems, and perhaps their presence indicates proper software development practice at the company that built the executables (could this alone be a good indicator that a file is benign?)
- Lots of different programming languages in use
- Lots of personal build environments (1K unique user names under the c:\users folder alone!)
- Some coders compiled programs under an Administrator account (in fairness, my corpus covers files from 2000-2019, so plenty of them come from the old-school times when Admin was the default for everything)
- There are traces of some beautiful build environments out there; seriously, some of these PDB paths (and the clusters they form) directly show signs of very mature development practices
- Surprisingly, many paths are outside of the C: drive; could this be a generic indicator of ‘good’ too?
- Also, some of the usernames are clearly test-related; I am curious whether these were overlooked in a final build, or whether some files were ‘leaked’? (test, Test, SKtester, nbtester, cvcctest, Pretest, tester, test5, TestUser, Test2, Test05, TestPC, Pinocchio_test)
- We have users from all over the place: English/American, Chinese, Indian, Irish, French, Korean, Russian, Arabic, etc.
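If you want to poke at a list of PDB paths yourself, the observations above (drive letters, user names under c:\users, version control hints, test accounts) can be pulled out with a few lines of Python. This is only a minimal sketch and not the tooling I used: the function names `pdb_features` and `summarize`, the `VCS_HINTS` keyword set and the "contains test" check are all assumptions made for illustration, and the sample paths are taken from the list earlier in this post.

```python
import ntpath  # Windows path rules, works on any OS
import re
from collections import Counter

# Hypothetical keyword set: directory names that hint at a version control system.
VCS_HINTS = {"perforce", "svn", "cvs", "git"}

# Captures the user name in paths such as C:\Users\<name>\...
USER_RE = re.compile(r"^[a-z]:\\users\\([^\\]+)\\", re.IGNORECASE)


def pdb_features(path):
    """Return coarse, heuristic features for a single PDB path string."""
    drive, rest = ntpath.splitdrive(path)
    # Whole path components only, so "git" does not match "digital".
    components = [c.lower() for c in rest.split("\\") if c]
    match = USER_RE.match(path)
    user = match.group(1) if match else None
    return {
        "drive": drive.upper(),
        "user": user,
        # Crude assumption: any user name containing "test" is test-related.
        "test_user": bool(user) and "test" in user.lower(),
        "vcs": sorted(VCS_HINTS.intersection(components)),
    }


def summarize(paths):
    """Aggregate the per-path features over a whole list of PDB paths."""
    feats = [pdb_features(p) for p in paths]
    return {
        "drives": Counter(f["drive"] for f in feats),
        "unique_users": len({f["user"].lower() for f in feats if f["user"]}),
        "vcs": Counter(v for f in feats for v in f["vcs"]),
    }


# Three sample paths copied from the list earlier in this post.
SAMPLES = [
    r"C:\Users\lol g\Desktop\PowerBiosServer_20561\PowerBiosServer_20080428\PowerBiosServer\obj\Release\PowerBiosServer.pdb",
    r"D:\Work\pgtool\svn\pgtoolfuck\Release\RTNicPgW32.pdb",
    r"c:\WORK\XPSDriver\oishitts_view\oishitts_xpsdrv093_051208_build\XPSRenderer092\xpsdriver\AquaFilter\Release\Win32\AquaFilter.pdb",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        print(pdb_features(sample))
    print(summarize(SAMPLES))
```

Matching whole path components keeps the false positives down, but it will also miss folders like `svn_repo`, so treat the counts as a rough first pass rather than a verdict on whether a file is good or bad.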
You can download a zipped archive with PDB paths here.
Note: This file is watermarked; you cannot use it for commercial purposes.