Using make_sc_hash_db.py to create API hashing DBs

If you have ever used the shellcode_hashes IDA plugin from Mandiant, you have probably also used make_sc_hash_db.py before. But if you haven't, this post is for you.

The focus of the article is on the make_sc_hash_db.py script – it is used to generate a SQLite database sc_hashes.db that in turn is used by shellcode_hash_search.py (run from the IDA GUI) to identify immediate values that could be hashes of known APIs inside the decompiled binary. It's fast and super handy for position-independent code analysis, including inline and implanted PE file loaders that rely on such API hashing functionality (multiple API hashing algos are supported).
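One of the many hashing algorithms such tools support is the classic 32-bit ROR-13 additive hash popularized by Metasploit shellcode. Below is a minimal sketch of that hash, plus a toy name-to-hash table in SQLite – note that the table layout here is purely illustrative and is not the actual sc_hashes.db schema:

```python
import sqlite3

def ror32(value, count):
    """Rotate a 32-bit value right by `count` bits."""
    return ((value >> count) | (value << (32 - count))) & 0xFFFFFFFF

def ror13_hash(name):
    """Classic ROR-13 additive hash over an ASCII export name."""
    h = 0
    for ch in name:
        h = ror32(h, 13)
        h = (h + ord(ch)) & 0xFFFFFFFF
    return h

# Illustrative schema -- the real sc_hashes.db layout differs.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hashes (hash INTEGER, lib TEXT, api TEXT)")
for api in ("LoadLibraryA", "GetProcAddress", "VirtualAlloc"):
    db.execute("INSERT INTO hashes VALUES (?, ?, ?)",
               (ror13_hash(api), "kernel32.dll", api))
db.commit()

# Reverse lookup: immediate value spotted in shellcode -> API name.
row = db.execute("SELECT lib, api FROM hashes WHERE hash = ?",
                 (ror13_hash("VirtualAlloc"),)).fetchone()
print(row)  # prints ('kernel32.dll', 'VirtualAlloc')
```

The reverse lookup at the end is exactly what the IDA-side script does for every immediate value it encounters, just against a much larger precomputed table.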

As per the readme, the script can be called with the following arguments:

python make_sc_hash_db.py <database name> <dll directory>

The best approach is of course to run it on a subset of the c:\windows\system32 directory, with a focus on the most common libraries, and the sc_hashes.db speaks to that directly, including API hashes only for the following libraries:

  • advapi32.dll
  • advpack.dll
  • chrome.dll
  • comctl32.dll
  • comdlg32.dll
  • crypt32.dll
  • dnsapi.dll
  • gdi32.dll
  • hal.dll
  • imagehlp.dll
  • kernel32.dll
  • lsass.exe
  • mpr.dll
  • msvcrt.dll
  • netapi32.dll
  • nss3.dll
  • ntdll.dll
  • ntoskrnl.exe
  • odbc32.dll
  • ole32.dll
  • oleaut32.dll
  • psapi.dll
  • shell32.dll
  • shfolder.dll
  • shlwapi.dll
  • termdd.sys
  • urlmon.dll
  • user32.dll
  • userenv.dll
  • winhttp.dll
  • wininet.dll
  • winmm.dll
  • winsta.dll
  • ws2_32.dll
  • wship6.dll
  • wsock32.dll


Still, it's also handy to have a larger data set available. When I played with it a few years ago, I generated hashes for the whole C:\windows\system32 directory. Why? Because you never know when you will stumble upon a hash value that is not represented inside sc_hashes.db.

Now, you may think that replacing the default sc_hashes.db with your full_blown_system32_dataset.db is the best idea ever, but it's not. The sc_hashes.db is a 50MB file, and the full_blown one is ~600MB. SQLite is fast, but IDA+Python+SQLite, not so much. So, you have been warned.

The bottom line:

Use the default sc_hashes.db for all your cases first, and only if you find hashes outside of this set, look for the hash inside the full_blown one (either via the SQLite interface, or via grep/rg on a text export). Finally, if you discover which DLL the API hash belongs to, you can always generate a new SQLite DB based on that single DLL (it just needs to be copied to a working directory for the script to process).
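The "grep a text export" step can be sketched in a few lines: dump every row of the database as one line of text, then search it for the stray immediate value. The schema and the second entry below are made up for illustration; the first hash, 0x0726774C, is the well-known Metasploit block_api hash for kernel32.dll!LoadLibraryA:

```python
import sqlite3

# Illustrative schema -- adjust names to whatever your generated DB uses.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE hashes (hash INTEGER, lib TEXT, api TEXT)")
db.executemany("INSERT INTO hashes VALUES (?, ?, ?)", [
    (0x0726774C, "kernel32.dll", "LoadLibraryA"),  # well-known Metasploit hash
    (0x1AB2B6A3, "example.dll", "SomeExportA"),    # made-up entry
])

# Dump every row as "0xHASH<tab>lib<tab>api" so grep/rg can chew on it.
with open("hashes.txt", "w") as f:
    for h, lib, api in db.execute("SELECT hash, lib, api FROM hashes"):
        f.write(f"0x{h:08X}\t{lib}\t{api}\n")

# grep-style lookup of a stray immediate value seen in the disassembly:
needle = "0x0726774C"
hits = [line for line in open("hashes.txt") if needle in line]
print(hits)
```

A flat text file like this is also trivially diffable between tool versions, which the binary SQLite file is not.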

And if you don't understand any of it, just download this file (45MB warning). It includes many hashes and many APIs. You can simply grep it for the unknown API hash. Who knows, maybe you will get lucky…

Delphi API monitoring with Frida, Part 3

In part 1 and part 2 we looked at individual APIs, and I hinted that we can automate the generation of handlers. Today we will do exactly that.

The attached Python code reads the PE file and then searches for code patterns that represent a couple of popular Delphi API functions responsible for string operations.

For every occurrence found, we generate a handler and print out the offsets (the first number is the file position, and the second is the RVA that we need to pass to frida-trace).
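The core of such a generator can be sketched as a byte-pattern scan plus a file-offset-to-RVA conversion. The section values and the byte pattern below are hypothetical (a real script would parse the section table from the PE header, e.g. with pefile, and match actual Delphi prologue bytes):

```python
# Hypothetical section table entry: (PointerToRawData, SizeOfRawData, VirtualAddress)
SECTIONS = [(0x400, 0x8000, 0x1000)]  # a single ".text"-like section

def file_offset_to_rva(offset):
    """Translate a raw file offset into an RVA using the section table."""
    for raw, raw_size, va in SECTIONS:
        if raw <= offset < raw + raw_size:
            return offset - raw + va
    raise ValueError(f"offset {offset:#x} not inside any section")

def find_pattern(data, pattern):
    """Yield every file offset where `pattern` occurs in `data`."""
    pos = data.find(pattern)
    while pos != -1:
        yield pos
        pos = data.find(pattern, pos + 1)

# Fake code buffer with the pattern planted at raw offsets 0x44c and 0x470.
PATTERN = bytes.fromhex("8b c0 e8")  # made-up byte sequence, not a real Delphi stub
data = bytearray(b"\x90" * 0x1000)
data[0x44c:0x44c + 3] = PATTERN
data[0x470:0x470 + 3] = PATTERN

for off in find_pattern(bytes(data), PATTERN):
    rva = file_offset_to_rva(off)
    print(f"file offset {off:#x} -> RVA {rva:#x}")  # the RVA goes to frida-trace -a
```

For each RVA found, the script would then also emit a handler stub into frida-trace's handler directory so the traced call logs its string arguments.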

Once handlers are generated, we can run frida-trace:

frida-trace -f c:\test\foo.exe -a foo.exe!4c04 -a foo.exe!4e70 -a foo.exe!4fac -a foo.exe!4c48

This is a full log. Lots of string goodness, right? Note the IOCs, and the fact that processes seem to be enumerated in order to check whether possibly targeted processes are present.

And in case you are wondering, the sample in question (foo.exe) is 00008EB74EEAEFFC64E85F8B0978D4EB056FCF390264A0D4C7D4A15ED5356DD3.