Adding character(s) to Command Line processing

In my old post about certutil I mentioned that it accepts a number of lesser-known Unicode characters on its command line. It is also well known that PowerShell accepts several Unicode variations of “-”.

What’s new? You may ask…

Processing the command line was never easy. All operating systems, their various shells, and many command line tools come with their own parsing ideas and quirks, but I bet that whoever designed many of these argument parsers didn’t really see the Unicode character set coming…

In recent years we have moved away from a simple world of “-”, “--”, and “/” as command/option switches towards a world that is, well… still kinda developing.

In 2024 we have a number of popular Windows programs that accept a lot of Unicode characters as ‘special’ (either as a part of a command line, or ‘pasted’ to the program):

  • \t (Unicode 0x0009) – <Character Tabulation> (HT, TAB) // \t needs to be interpreted
  • \n (Unicode 0x000A) – (EOL, LF, NL) // \n needs to be interpreted
  • \r (Unicode 0x000D) – <Carriage Return> (CR) // \r needs to be interpreted
  • “ ” (Unicode 0x0020) – Space (SP) // ignore quotes
  • " (Unicode 0x0022) – Quotation Mark
  • ' (Unicode 0x0027) – Apostrophe
  • - (Unicode 0x002D) – Hyphen-Minus
  • / (Unicode 0x002F) – Solidus, slash, forward slash
  • – (Unicode 0x0096 – mapped to 0xFB in codepage 437)
  • “ ” (Unicode 0x00A0) – No-Break Space (NBSP) // ignore quotes
  • – (Unicode 0x2013) – En Dash
  • — (Unicode 0x2014) – Em Dash
  • “ (Unicode 0x201C) – Left Double Quotation Mark
  • ” (Unicode 0x201D) – Right Double Quotation Mark
  • “ ” (Unicode 0x202F) – Narrow No-Break Space (NNBSP) // ignore quotes
  • − (Unicode 0x2212) – Minus Sign
  • and possibly more

While not all programs accept these yet, we can already list a few that actually do:

  • certutil.exe
  • powershell.exe
  • pwsh.exe
  • certreq.exe
  • conhost.exe

You may ask… what’s the big deal?

Well, the big deal is that a whole industry, obsessively focused on detection engineering and fixated on “recognizable command line patterns”, was shaped by assumptions about how command line arguments are passed to programs.

These Unicode characters break a lot of these assumptions…
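To make this concrete, here is a minimal Python sketch of how a naive pattern can miss a look-alike dash, and how folding the look-alikes back to ASCII before matching could help. The mapping uses only characters from the list above; the function name, the regex, and the sample command line (including the AAAA placeholder) are made up for illustration and are not taken from any real detection product.

```python
import re

# Assumed mapping, built only from characters listed earlier in this post.
# It is certainly incomplete ("and possibly more").
LOOKALIKES = {
    "\u2013": "-",   # En Dash
    "\u2014": "-",   # Em Dash
    "\u2212": "-",   # Minus Sign
    "\u201c": '"',   # Left Double Quotation Mark
    "\u201d": '"',   # Right Double Quotation Mark
    "\u00a0": " ",   # No-Break Space
    "\u202f": " ",   # Narrow No-Break Space
}
TABLE = str.maketrans(LOOKALIKES)


def normalize_cmdline(cmdline: str) -> str:
    """Fold the look-alike characters into their plain ASCII equivalents."""
    return cmdline.translate(TABLE)


# Hypothetical "recognizable command line pattern" that expects an ASCII hyphen.
NAIVE = re.compile(r"powershell(\.exe)?\s+-enc", re.IGNORECASE)

# Made-up sample: an En Dash in front of the switch, placeholder payload.
sample = "powershell.exe \u2013enc AAAA"

print(bool(NAIVE.search(sample)))                     # pattern on raw text
print(bool(NAIVE.search(normalize_cmdline(sample))))  # pattern after folding
```

Whether to normalize at ingestion or at query time is a design choice; either way, it pays to keep the raw command line around for triage.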

Bitmap hunting in SPL, Part 2

In my previous post I introduced the concept of bitmap hunting. Today I will show another example that helps to find a sequence of more than 2 events.

Consider this artificially generated sequence of events:

| makeresults           | eval _time=_time + 01 | eval evt="Run" | eval program="outlook.exe"
| append [| makeresults | eval _time=_time + 02 | eval evt="Run" | eval program="firefox.exe"]
| append [| makeresults | eval _time=_time + 03 | eval evt="Run" | eval program="firefox.exe"]
| append [| makeresults | eval _time=_time + 04 | eval evt="File" | eval file="...\invoice.lnk" ]
| append [| makeresults | eval _time=_time + 05 | eval evt="Run" | eval program="cscript.exe"]
| append [| makeresults | eval _time=_time + 06 | eval evt="Run" | eval program="powershell.exe"]
| append [| makeresults | eval _time=_time + 07 | eval evt="Run" | eval program="mshta.exe"]
| append [| makeresults | eval _time=_time + 08 | eval evt="File" | eval file="...\bar.tmp" ]
| append [| makeresults | eval _time=_time + 09 | eval evt="Run" | eval program="svchost.exe"]
| append [| makeresults | eval _time=_time + 10 | eval evt="Run" | eval program="outlook.exe"]
| append [| makeresults | eval _time=_time + 11 | eval evt="Run" | eval program="dllhost.exe"]
| append [| makeresults | eval _time=_time + 12 | eval evt="File" | eval file="...\foo.tmp" ]
| append [| makeresults | eval _time=_time + 13 | eval evt="Run" | eval program="cscript.exe"]
| append [| makeresults | eval _time=_time + 14 | eval evt="Run" | eval program="powershell.exe"]
| append [| makeresults | eval _time=_time + 15 | eval evt="Run" | eval program="mshta.exe"]

giving us this data:

It’s completely fictional, but you can see two clusters of cscript, powershell, and mshta executions, one of which follows a file creation event for a shortcut file with the LNK extension (often used by malware).

Let’s say we want to find all the clusters of these 3 programs being executed AFTER any LNK file was created, and ignore the others.

We can first create a bitmap by adding the following code:

| eval b=
   case (
         evt="Run" and program="cscript.exe", "c",
         evt="Run" and program="powershell.exe", "p",
         evt="Run" and program="mshta.exe", "m",
         evt="File" and like (file, "%.lnk"), "l",
         1=1," "
        ) 
| eventstats list(b) as allb
| eval allb_bitmap = mvjoin(allb,"")
| table _time, allb_bitmap, b, evt, file, program

giving us this:

We can clearly see 2 interesting clusters, but only one fitting our criteria.
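For readers without a Splunk instance at hand, the same bitmap can be reproduced with a minimal Python sketch. It mirrors the case() logic above on the fictional events; the tuple layout and the to_letter helper are my own illustrative names, and the elided file paths are kept exactly as elided.

```python
# (offset in seconds, event type, program or file name) - the fictional data set
EVENTS = [
    (1, "Run", "outlook.exe"),
    (2, "Run", "firefox.exe"),
    (3, "Run", "firefox.exe"),
    (4, "File", r"...\invoice.lnk"),
    (5, "Run", "cscript.exe"),
    (6, "Run", "powershell.exe"),
    (7, "Run", "mshta.exe"),
    (8, "File", r"...\bar.tmp"),
    (9, "Run", "svchost.exe"),
    (10, "Run", "outlook.exe"),
    (11, "Run", "dllhost.exe"),
    (12, "File", r"...\foo.tmp"),
    (13, "Run", "cscript.exe"),
    (14, "Run", "powershell.exe"),
    (15, "Run", "mshta.exe"),
]

RUN_LETTERS = {"cscript.exe": "c", "powershell.exe": "p", "mshta.exe": "m"}


def to_letter(evt: str, name: str) -> str:
    """Map one event to a single character, like the case() expression."""
    if evt == "Run" and name in RUN_LETTERS:
        return RUN_LETTERS[name]
    if evt == "File" and name.endswith(".lnk"):
        return "l"
    return " "  # everything else becomes a blank on the map


# One character per event, in time order: the one-dimensional map.
bitmap = "".join(to_letter(evt, name) for _, evt, name in EVENTS)
print(repr(bitmap))
```

The interesting part is that, once the events are reduced to a string, finding a sequence becomes a plain substring search.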

We can obviously exclude the rows where b is blank (a single space), but we still need to split these lcpm and cpm clusters into separate buckets.

And the bucket is the word!

Modifying our earlier code a bit:

| eval b=
   case (
         evt="Run" and program="cscript.exe", "c",
         evt="Run" and program="powershell.exe", "p",
         evt="Run" and program="mshta.exe", "m",
         evt="File" and like (file, "%.lnk"), "l",
         1=1," "
        ) 
| bucket _time span=10s
| eventstats list(b) as allb by _time
| eval allb_bitmap = mvjoin(allb,"")
| where b!=" " and like(allb_bitmap, "%lcpm%")
| table _time, allb_bitmap, evt, file, program

we get this:

We can now start the triage!
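The bucketing step can be sketched in Python as well. This is a rough model and not what Splunk does internally: I am assuming that bucket snaps _time down to fixed multiples of the span, and find_clusters is a hypothetical helper name. The second call shows a caveat worth keeping in mind: if the interesting sequence happens to straddle a bucket boundary, the pattern will not match.

```python
from collections import defaultdict


def find_clusters(events, span=10, pattern="lcpm"):
    """Group (time, letter) pairs into fixed-width buckets and keep the
    non-blank rows of every bucket whose bitmap contains the pattern."""
    buckets = defaultdict(list)
    for t, letter in events:
        buckets[t - t % span].append((t, letter))  # snap down to a multiple of span

    hits = []
    for start in sorted(buckets):
        rows = buckets[start]
        bucket_bitmap = "".join(letter for _, letter in rows)
        if pattern in bucket_bitmap:
            hits.extend((t, bucket_bitmap, letter) for t, letter in rows if letter != " ")
    return hits


# The bitmap from the fictional data set, one letter per second, offsets 1..15.
letters = "   lcpm     cpm"
events = list(enumerate(letters, start=1))

hits = find_clusters(events)
print([(t, letter) for t, _, letter in hits])

# Same events shifted by 3 seconds: "lcp" ends one bucket and "m" starts the next.
shifted = [(t + 3, letter) for t, letter in events]
print(find_clusters(shifted))
```

One possible mitigation is to run the search with two different spans or offsets, at the cost of some duplicate hits to triage.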

Of course, there is some manual effort in this exercise and it may not always be possible to fully automate it, but I hope you can see the potential of this technique.

And see how cheap that is! There are no summary indexes, nested queries, complex statistics involved – it’s just a simple exercise of putting interesting events on a one-dimensional map, and then breaking them down into manageable clusters.