Good Exports are real

Collecting ‘good’ samples helps to discover a lot interesting patterns. In my old post I focused on the PDB paths extracted from the DriverPack driver collection, yesterday I touched on the list of ‘file names associated with good known kernel drivers’, and today I will cover the function names exported by a very large corpora of ‘good’ DLL samples.

You may ask what is the value here, and I can answer that ‘this is how the normal looks like’.

How is that useful to the Threat Hunting crowd?

If you monitor rundll32 invocations referencing DLLs and their API functions you may quickly discover a lot of anomalies. Any invocation referring to a non-OS DLL is suspicious. Any invocation referring to a DLL in a suspicious location is… suspicious. Any process using unusual constructs is suspicious. Any process invoking DLL exported functions via ordinal numbers is suspicious. Any process referencing API ordinals via negative or large ordinal numbers is super-suspicious, too.

These are great ‘suspicious’ tests, but we can do more.

The ‘StartW’ export used by Cobalt Strike DLLs is a good example. Invocations of this function are not necessarily ‘suspicious’ by default, because we don’t have a point of reference. There are so many legitimate invocations of rundll32 executing exported functions from so many DLLs that it’s hard to zoom-in on this particular function and declare that it’s bad. Again, we need a point of reference, of sort.

The list of functions exported by ‘good’ DLLs is far longer than expected: 11375507 unique entries, with many very popular and some only occurring once. You can download an archived text file referencing many ‘good export names’ from here.

There are so many uses for this set:

  • known-good names for threat hunting purposes
  • a very fertile ground for a deeper lolbin research
  • a very fertile ground for discovering new vulnerabilities

The set is watermarked hence you have been warned. You cannot use this set for any commercial reason. You cannot create any commercial detection based on this data. The only exceptions are: fully unlimited use by law enforcement, and for educational and non-commercial research purposes only.