PDBs… from the good sauce…

One of the earliest public sample-clustering attempts I ever made was a search for the most prevalent username among the PDB paths extracted from my malware repository circa… 2013. Long time ago. Yup. The winning account was (unsurprisingly): ‘Administrator’.

7 years later we are seeing more PDB path research, and Steve Miller at FireEye did a lot of work in this space. Nick, who is one of my fav malware researchers, chipped in on the Twitter thread related to Steve’s research and pointed readers to my old blog posts, so I felt somewhat obliged to follow up.


By looking at PDBs from the goodware.



Malware is not the only thing that embeds PDB paths; lots of goodware does too, that is… drivers, installers, do-something files from your favorite or not-so-favorite vendor (aka it’s often the vendor supporting your video and audio cards, as well as vendors shipping laptops with lots of OEM software crappe pre-installed ‘out of the box’).
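For anyone wanting to reproduce this at home: the PDB path usually sits in the CodeView debug record, right after an ‘RSDS’ signature, a 16-byte GUID and a 4-byte age. Below is a minimal sketch of pulling it out by scanning raw bytes. The function name and the printable-ASCII assumption are mine, and this is not the tooling used for the stats in this post; a proper PE parser walking the debug directory would be more robust.

```python
import re

# CodeView "RSDS" record layout: 4-byte signature, 16-byte GUID, 4-byte age,
# followed by a NUL-terminated PDB path.
# Assumption: the path is printable ASCII and ends in ".pdb".
RSDS_RE = re.compile(
    rb"RSDS[\x00-\xff]{20}([\x20-\x7e]{1,260}?\.[pP][dD][bB])\x00"
)


def extract_pdb_paths(data: bytes) -> list:
    """Return every PDB path found after an RSDS signature in raw file bytes."""
    return [m.group(1).decode("ascii") for m in RSDS_RE.finditer(data)]
```

Usage would be along the lines of extract_pdb_paths(open(sample, "rb").read()), where sample is whatever file you are triaging. A raw scan like this can also pick up RSDS-looking bytes that are not a real debug record, so treat the output as candidates.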

Still interested?

You should be… A list of good PDB paths can be easily turned into a ‘Good Yara’ repo. And that means… you can exclude many clean samples early, as they come in, just by looking at their PDB paths.

So… what do these ‘good’ PDB paths look like?

Here are some stats….

D:\binaries.x86fre\SCP_WPA	50766
e:\SourceCode\AsMultiLang\AsMultiLang\release	37478
c:\CCView$\jmerchan_view_ASE_Installers\ASE_Installers\Iif2\Installer\Hdmi\Resource\Src\Release	34064
c:\CCView\jgonz2x_Staging_view\ASE_Installers\Iif2\Installer\Hdmi\Resource\Src\Debug	25642
c:\share\anarayan_latest_main\gfx_Development\SourceCUI2\igfx\TvWizIns\TVconfig\Resource\NEW_SRC	22776
c:\ccviews\atjes_L10N_ASE_Staging\ASE_Installers\Iif2\Installer\Chipset\Resource\Src\Debug	21446
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudbus\azalia\objfre_wnet_x86\i386	18614
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudio\objfre_wnet_x86\i386	18614
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudpropshortcut\objfre_wnet_x86\i386	18614
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudprop\objfre_wnet_x86\i386	18614
E:\projects 2009\DLL\AsAcpi\AsAcpi\Release	15693
c:\ccview\jgonz2_RCR1022521_view\ASE_Installers\IIF2\Installer\HDMI\Resource\SRC\Debug	15044
e:\Code\Eddy\AI Suite II\Source\AI-Suite II	11434
V:\TPMCLIENT\Bin\Win32\Release	10689
o:\BTW\btw1.2\bin\amd64	10246
G:\binaries.x86fre\SCP_WPA	10227
y:\ASE_Installers\Iif2\Installer\Hdmi\Resource\Src\Release	9940
c:\documents and settings\administrator\my documents\projects\dll\pngio\release	9856
C:\Symbols\Release	9674

This is just a top 20, and one can definitely build some Yara sigs around these. If you want the whole list DM me.
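As a rough illustration of what ‘building Yara sigs around these’ could look like, here is a small sketch that turns a list of PDB directories into the text of a single Yara rule. The rule name, the helper names and the choice of an MZ check plus ‘any of them’ are my assumptions, not a finished signature.

```python
def yara_escape(text: str) -> str:
    """Escape backslashes and double quotes for a Yara text string."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def build_goodware_rule(rule_name: str, pdb_dirs: list) -> str:
    """Render one Yara rule that fires on any of the given PDB directories."""
    lines = ["rule " + rule_name, "{", "    strings:"]
    for i, path in enumerate(pdb_dirs):
        # nocase because the same path shows up with different casing
        lines.append('        $p%d = "%s" ascii nocase' % (i, yara_escape(path)))
    lines += [
        "    condition:",
        "        uint16(0) == 0x5A4D and any of them",  # MZ header + any path
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_goodware_rule(
        "goodware_pdb_dirs",  # hypothetical rule name
        [r"D:\binaries.x86fre\SCP_WPA", r"V:\TPMCLIENT\Bin\Win32\Release"],
    ))
```

Short, generic directories such as C:\Symbols\Release are poor candidates for this, and a hit should only be treated as a triage hint, since nothing stops anyone from embedding the same path.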

Is there a risk malware guys will re-use these? Absolutely. This is why I only publish the top 20.

What about the usernames?

Looking at the stats I can pinpoint the following user accounts:

Chunyung	40752
cc4build	10161
chunyung	7822
test	5945
releng	5120
chunyung.RTDOMAIN	3905
dnandy	3520
newport10gc	3505
karl	2993
DEV	2811
tachun.cmedia	2667
SW	2618
cvcctest	2575
Test	2422
ws	2385
rkosana	2119
Administrator	2103
vyeh	1993
jim	1837
celitc	1799

Yes, it doesn’t tell us much, other than indicating that my ‘good’ sample set is somewhat biased toward the productions of the mystical ‘Chunyung’. I have to work on it and add more diversity to this corpus… In the meantime… whatever doesn’t match these ‘good’ PDB paths is probably… a bad guy. So yeah… if you want to build some ‘goodware’ sigs out of it, please DM me and I will share the full PDB dataset with you.
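The username stats above can be approximated with a few lines of Python. This is a sketch under the assumption that a username is the folder directly under ‘Users’ or ‘Documents and Settings’; the regex and function name are mine, and the original numbers may have been produced differently.

```python
import re
from collections import Counter

# Assumption: the profile folder name is the account name.
USER_RE = re.compile(
    r"^[a-z]:\\(?:users|documents and settings)\\([^\\]+)\\",
    re.IGNORECASE,
)


def count_usernames(pdb_paths):
    """Tally profile folder names; case is preserved, so 'Test' != 'test'."""
    counts = Counter()
    for path in pdb_paths:
        match = USER_RE.match(path)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Calling count_usernames(paths).most_common(20) would then give a top 20. Keeping the case as-is mirrors the list above, where ‘Chunyung’ and ‘chunyung’ are counted separately.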

In terms of the directories, the stats show us this:

D:\binaries.x86fre\SCP_WPA\	82596
e:\SourceCode\AsMultiLang\AsMultiLang\release\	69012
c:\ccviews\atjes_L10N_ASE_Staging\ASE_Installers\Iif2\Installer\Chipset\Resource\Src\Debug\	66753
c:\CCView$\jmerchan_view_ASE_Installers\ASE_Installers\Iif2\Installer\Hdmi\Resource\Src\Release\	59019
c:\CCView\jgonz2x_Staging_view\ASE_Installers\Iif2\Installer\Hdmi\Resource\Src\Debug\	52848
c:\ccview\jgonz2_RCR1022521_view\ASE_Installers\IIF2\Installer\HDMI\Resource\SRC\Debug\	51078
c:\share\anarayan_latest_main\gfx_Development\SourceCUI2\igfx\TvWizIns\TVconfig\Resource\NEW_SRC\	45916
V:\TPMCLIENT\Bin\Win32\Release\	32175
E:\8168\vc98\self\bin\x86\	29523
E:\projects 2009\DLL\AsAcpi\AsAcpi\Release\	28525
E:\8665\vc98\mfc\mfc.bbt\src\	27055
E:\8972\vc98\self\bin\x86\	25890
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudbus\azalia\objfre_wnet_x86\i386\	25555
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudpropshortcut\objfre_wnet_x86\i386\	25554
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudprop\objfre_wnet_x86\i386\	25554
e:\hdaudio\srv03\source\drivers\oem\src\wdm\audio\drivers\hdaudio\hdaudio\objfre_wnet_x86\i386\	25554
y:\ASE_Installers\Iif2\Installer\Hdmi\Resource\Src\Release\	24840
C:\Symbols\Release\	23829
T:\__test_sys\__outputs\NNT-SNB32-W86_andmitri\mediasdk_tags_Win7_MFTs_15.31_promoted_53672\samples\_build\Win32\Release\	21504
E:\8447\vc98\mfc\mfc.bbt\src\	20980
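A directory tally like this one is easy to sketch with the standard library. The snippet below uses ntpath so Windows-style paths are split correctly on any OS; the function name is mine, and the counting is case-sensitive, which seems to match how the paths are listed above.

```python
import ntpath
from collections import Counter


def count_directories(pdb_paths):
    """Count how often each parent directory (with trailing backslash) occurs."""
    counts = Counter()
    for path in pdb_paths:
        directory = ntpath.dirname(path)
        if directory:  # skip bare file names that carry no directory
            counts[directory.rstrip("\\/") + "\\"] += 1
    return counts
```

count_directories(paths).most_common(20) would then produce a listing in the same shape as the one above.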

PDB Goodness

In a recently published Definitive Dossier of Devilish Debug Details, Steve Miller goes on a very entertaining adventure, looking at the PDB paths of known malware campaigns and authors. I love this article because I have always felt that the PDB path is a great, often overlooked forensic artifact, and even though I did some research on it myself in the past, I have never seen a study as comprehensive as the one Steve delivered.

Inspired by it, I had a quick look at PDB paths of… primarily clean files. I say primarily because, while I am almost certain that most of them are clean, one can never be 100% sure… To support the claim, I can list a couple of paths I found in this (allegedly) clean corpus, suggesting that clean probably means different things to different people:

  • D:\TEMP\fuckingasus\Debug\fuckingasus.pdb
  • D:\Work\pgtool\svn\pgtoolfuck\Release\RTNicPgW32.pdb
  • D:\Work\pgtool\svn\pgtoolfuck\x64\Release\RTNicPgW64.pdb
  • d:\tmp\1driver\fuck4\rtl818xb\platform\ndis6\usb\objfre_wlh_x86\i386\rtl8187.pdb
  • C:\TMP\shit\msikbd.2k\objfre\i386\msikbd2k.pdb
  • c:\WORK\XPSDriver\oishitts_view\oishitts_xpsdrv093_051208_build\XPSRenderer092\xpsdriver\AquaFilter\Release\Win32\AquaFilter.pdb
  • C:\Users\lol g\Desktop\PowerBiosServer_20561\PowerBiosServer_20080428\PowerBiosServer\obj\Release\PowerBiosServer.pdb

I still believe that most of these are clean, and… perhaps an honest mistake got these paths incorporated into the final executables ;), and who knows… maybe some of them even got signed 😉

Looking at all these paths we can draw some quick conclusions:

  • We could use them to generate a bunch of good Yara signatures that catch good stuff; this helps with clustering
  • Of course, since the file is now public, bad guys could re-use existing paths to abuse the aforementioned potential Yara sigs, making them trigger on bad stuff pretending to be good stuff
  • We see that Perforce, SVN, CVS, GIT are popular repos and perhaps their presence can indicate a proper software development practice at a company that generated the executables (could this alone be a good indicator for determining if the file is benign?)
  • Lots of different programming languages in use
  • Lots of personal build environments (1K unique user names under the c:\users folder alone!)
  • Some coders compiled programs under an Administrator account (in fairness, my corpus covers files from 2000-2019, so plenty of files come from the old-school times when Admin was a default for everything)
  • There are traces of some beautiful build environments out there; seriously, these are symptoms of very mature development practices visible directly in some of these PDB paths (their clusters)
  • Surprisingly, many paths are outside of the C: drive; could this be a generic indicator of ‘good’ too?
  • Also, some of the usernames are clearly test-related; I am curious if these are overlooked in a final build, or some files were ‘leaked’? (test, Test, SKtester, nbtester, cvcctest, Pretest, tester, test5, TestUser, Test2, Test05, TestPC, Pinocchio_test)
  • We have users from all over the place: English/American, Chinese, Indian, Irish, French, Korean, Russian, Arabic, etc.
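A couple of the observations above (drive letters other than C:, traces of Perforce/SVN/CVS/Git) are easy to quantify. Here is a minimal sketch; the token list, the whole-token matching and the function names are my assumptions, and a hit only means the word appears in the path, not that the tool was actually used.

```python
import re
from collections import Counter

# Assumption: a VCS shows up as a whole token somewhere in the path.
VCS_TOKENS = {"perforce", "svn", "cvs", "git"}


def drive_letter(path):
    """Return the upper-cased drive letter, or None for UNC/relative paths."""
    match = re.match(r"^([A-Za-z]):[\\/]", path)
    return match.group(1).upper() if match else None


def vcs_hints(path):
    """Return the VCS names that appear as whole tokens in the path."""
    tokens = set(re.split(r"[^a-z0-9]+", path.lower()))
    return tokens & VCS_TOKENS


def summarize(pdb_paths):
    """Tally drive letters and VCS hints across a list of PDB paths."""
    drives, vcs = Counter(), Counter()
    for path in pdb_paths:
        letter = drive_letter(path)
        if letter:
            drives[letter] += 1
        vcs.update(vcs_hints(path))
    return drives, vcs
```

Matching whole tokens avoids counting words like ‘digital’ as Git, at the cost of missing names glued to other text.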

You can download a zipped archive with PDB paths here.

Note: This file is watermarked; you cannot use it for commercial purposes.