String analysis for n00bs

I like to demo this little Windows executable to anyone who thinks they are doing reverse engineering right by running the available automated static and dynamic analysis tools and trusting them blindly.

The sample is a PE32 that is 2560 bytes long. Running ‘strings’ over it produces these results:

!This program cannot be run in DOS mode.
Rich
.text
`.rdata
@.data
8/u
ExitProcess
GetCommandLineA
kernel32.dll
GetStdHandle
WriteFile
Hello World!
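
For readers who have never looked at what such a tool does, here is a minimal Python sketch of a 'strings'-style scan. It is not the tool used above: the function name ascii_strings, the min_len parameter and the blob bytes are all made up for illustration, and it only handles ASCII (real tools usually scan for UTF-16 as well).

```python
import re

def ascii_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]

# Made-up bytes standing in for a file on disk; this is NOT the actual sample.
blob = b"\x00\x01.text\x00\x00\x90\x90Hello World!\x00\xff\xfeab\x00kernel32.dll\x00"

print(ascii_strings(blob))
# ['.text', 'Hello World!', 'kernel32.dll']
```

The scan reports only byte runs that are literally present in the file. Anything the program assembles at run time is invisible to it.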

Running it from a CLI prints the following text to STDOUT:

Hello World!

One can say that both static and dynamic analysis give us the same output. Based on this, the obvious conclusion is that this small binary is a simple CLI program that prints ‘Hello World!’ when executed.

Except that only code analysis reveals that the program behaves differently if we pass a ‘/h’ argument to it.

In that case, dynamic analysis shows the following string being printed to STDOUT:

Hello Baby!!
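
To make the trick concrete, here is a hypothetical Python model of that kind of behaviour. It is not a decompilation of the sample: the function name greet, the XOR key 0x20 and the way the alternate text is built are assumptions chosen only to show how a second message can exist without appearing as a plain string.

```python
def greet(command_line: str) -> str:
    # The default greeting is the only readable literal in this model.
    message = bytearray(b"Hello World!")
    if "/h" in command_line.split():
        # Hypothetical trick: the tail is stored XOR-ed with 0x20 and decoded
        # at run time, so a string scan would not find "Hello Baby!!".
        message[6:] = bytes(c ^ 0x20 for c in b"bABY\x01\x01")
    return message.decode("ascii")

print(greet("sample.exe"))     # Hello World!
print(greet("sample.exe /h"))  # Hello Baby!!
```

A sandbox that runs the file with no arguments only ever exercises the first path.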

Static analysis was done right. Default dynamic analysis was done right. And code analysis was done right too. It’s just the automation that failed.

Just a reminder that we can’t blindly trust the automation, because it only sees the obvious. And command line arguments are not the only way to trigger execution of a different branch of code. It could be a guard rail of any sort: time of day, OS locale, a delayed payload, a payload downloaded from a site that is not available at the moment, etc.

In the interest of full disclosure: I have not ‘analyzed’ this sample with any AI framework, so I am still hopeful that at least some of them would see through this little mischief.

Rundll32 goes to hell…

Parsing command line invocations is fun, because it’s impossible to do it right (all the time).

Imagine a test DLL that exports a function called foobar. We place this DLL in the c:\test directory and name it like this:

test.dll, #666

We can then use rundll32.exe to execute the foobar function using the following syntax:

rundll32 "c:\test\test.dll, #666", foobar

A different version can use the following name:

test.dll,abcxyz

with the invocation:

rundll32 "test.dll,abcxyz", foobar

We do need the quotes, because without them rundll32.exe cannot handle file names with a comma in them (for obvious reasons). The full path is not needed if we are in the same directory. The gist is that these are all proper DLL file names!

What do your sophisticated regexes that extract the DLL name and the API’s ordinal number or API name from this sort of invocation tell you today?
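
To see how this goes wrong, here is a sketch of the kind of regex that tends to end up in detection logic. The pattern NAIVE and the helper naive_parse are made up for illustration and are not taken from any particular product. The pattern works on the textbook case and confidently returns the wrong answer for both file names above.

```python
import re

# Hypothetical "good enough" pattern: DLL path up to the first comma,
# then an export name or an ordinal such as #123.
NAIVE = re.compile(
    r'^rundll32(?:\.exe)?\s+"?(?P<dll>[^,"]+)"?\s*,\s*(?P<export>#?\w+)',
    re.IGNORECASE,
)

def naive_parse(cmdline):
    m = NAIVE.match(cmdline)
    return (m.group("dll"), m.group("export")) if m else None

# Textbook case: parsed as expected.
print(naive_parse(r'rundll32 c:\test\test.dll, foobar'))

# File name "test.dll, #666": the regex reports ordinal #666 instead of foobar.
print(naive_parse(r'rundll32 "c:\test\test.dll, #666", foobar'))

# File name "test.dll,abcxyz": the regex reports export abcxyz instead of foobar.
print(naive_parse(r'rundll32 "test.dll,abcxyz", foobar'))
```

In the last two cases the regex never fails to match, so nothing signals that the extracted DLL name and export are both wrong.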

And then here’s another case for your consideration – create a test DLL with the following exports:

  • A
  • W
  • AA
  • AW
  • WA
  • WW

When you run the following invocations:

rundll32 c:\test\test.dll, A
rundll32 c:\test\test.dll, W

– which of these 6 exported functions will get executed?

I provided an answer to this question a few years ago, and here’s the DebugView output:
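
For a mental model of what to expect: Microsoft's old knowledge base write-up of the rundll32 interface describes the entry point lookup as trying the requested name with a W appended, then with an A appended, and only then the bare name. The Python sketch below models only that described order. It does not call rundll32, the function name resolve_export is made up, and whether a given Windows build behaves exactly this way is an assumption to verify on a real system.

```python
def resolve_export(requested, exports):
    """Model of the assumed lookup order: nameW, then nameA, then name."""
    for candidate in (requested + "W", requested + "A", requested):
        if candidate in exports:
            return candidate
    return None

# The six exports of the test DLL described above.
exports = {"A", "W", "AA", "AW", "WA", "WW"}

print(resolve_export("A", exports))  # AW under the assumed order
print(resolve_export("W", exports))  # WW under the assumed order
```

Under that assumption, neither invocation runs the export that was literally named on the command line.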

The bottom line is that you can’t use regexes for parsing command line invocations or make assumptions w/o running into many corner cases.