Subfrida v0.1

As many of you know, I am a big fan of the Frida framework. I love its intuitiveness and flexibility, especially when it comes to auto-generating handlers for hooked functions, even randomly chosen ones.

In my older Frida Delphi project I focused on functions that I could define. Today, I will focus on functions that are unknown.

How?

We are going to write an IDAPython script that will generate simple logging/tracing function stubs for all the subroutines that IDA ‘sees’ inside the executable.

When you load an executable into IDA, it parses the program’s segments, recognizes the code and, within it, many functions. We don’t really know or care what they do; we only know that they exist. FLIRT signatures help to recognize some of them, but identifying the rest is non-trivial.

So, the value proposition here is that we use Frida to run the program, log calls to every subroutine ‘discovered’ or ‘recognized’ by IDA, and print out the strings that the subroutine arguments may point to when the function is executed. For this exercise we will try to log ANSI and WIDE strings potentially passed to these functions, as well as strings delivered in their output.

Why?

This may help us to quickly understand the inner workings of the program, in some lucky cases extract IOCs, and overall help in reverse engineering efforts, especially for samples written in modern languages like Rust, Go, or Nim.

The idea sounds great, but there is a problem. One that I don’t know how to solve, but by publishing my partial research, I hope someone more knowledgeable will help me to address it… The problem is that any error in your onEnter or onLeave Frida handler function forces the script to bail out.

It’s a pity.

My ‘original’ code for this exercise looked like this:

import os
import shutil
import idautils
import idaapi
import idc
import re

idf = idc.get_idb_path()

print ("Original IDA File: %s" % idf)

# re.match only matches at the start of the path, so search for the extension at its end
m = re.search(r"\.idb$", idf)

arch = 0
if m:
    arch = 32
    print ("- 32-bit")
else:
    arch = 64
    print ("- 64-bit")

if arch == 32:
    idf = idf.replace('.idb','.frida')
else:
    idf = idf.replace('.i64','.frida')

print ("Output idf: %s" % idf)

filename=re.sub(r"\.frida", "", re.sub(r"^.+[\\/]", "", idf))
handlers=re.sub(r"[^\\/]+$", "", idf) + "__handlers__" + "/" + filename + "/"

if os.path.isdir(handlers):
    print ("Deleting old handlers directory: %s" % handlers)
    shutil.rmtree(handlers)

# makedirs also creates the parent __handlers__ directory if it is missing
os.makedirs(handlers)

print ("Saving frida input file to '%s'" % idf)
print ("Saving '%s' handlers to '%s'" % (filename, handlers) )
g = open(idf, 'w')
base = idaapi.get_imagebase()
for f in idautils.Functions():
    dism_addr = list(idautils.FuncItems(f))
    ofs = "%X"%(dism_addr[0]-base)
    g.write ("-a %s!0x%s\n" % (filename, ofs))
    h = open(handlers + "/" + "sub_"+ofs+".js", 'w')
    h.write("""
{

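  onEnter(log, args, state) {
    let out = 'onenter: """+ofs+"""\\n'

    log(out)

    // this.args does not exist until we create it; onLeave reads it later
    this.args = []

    for (let i = 0; i < 4; i++)
    {
       if (!args[i].isNull())
       {
          console.log(args[i].readUtf8String());
          console.log(args[i].readUtf16String());
          const a = args[i].readUtf8String(256)
          // a is a string (or null), so test its length rather than comparing it to 0
          if (a !== null && a.length > 0)
          {
             out = out + ' [' + i + ']a ' + JSON.stringify(a) + '\\n'
          }
          const w = args[i].readUtf16String(256)
          if (w !== null && w.length > 0)
          {
             out = out + ' [' + i + ']w ' + JSON.stringify(w) + '\\n'
          }
       }
       this.args [i] = args [i]
    }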

    if (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null)
    {
        state ['log_file']=new File('logfile.bin', 'wb');
    }

    if (! (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null) )
    {
        state ['log_file'].write(out);
        state ['log_file'].flush();
    }

  },

  onLeave(log, retval, state) {
    let out = 'onleave: """+ofs+"""\\n'

    log(out)

    for (let i = 0; i < 4; i++)
    {
       if (!this.args[i].isNull())
       {
          console.log(this.args[i].readUtf8String());
          console.log(this.args[i].readUtf16String());
          const a = this.args[i].readUtf8String(256)
          // a is a string (or null), so test its length rather than comparing it to 0
          if (a !== null && a.length > 0)
          {
             out = out + ' [' + i + ']a ' + JSON.stringify(a) + '\\n'
          }
          const w = this.args[i].readUtf16String(256)
          if (w !== null && w.length > 0)
          {
             out = out + ' [' + i + ']w ' + JSON.stringify(w) + '\\n'
          }
       }
    }

    if (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null)
    {
        state ['log_file']=new File('logfile.bin', 'wb');
    }

    if (! (typeof state ['log_file'] === 'undefined' || state ['log_file'] === null) )
    {
        state ['log_file'].write(out);
        state ['log_file'].flush();
    }
  }
}
    """)
    h.close()


g.close()

When executed in IDA on Windows, the code generates:

  • a .frida file with a list of RVA addresses for frida-trace to intercept
  • a list of generic handlers and their code for all these subroutines that simply try to log the first 4 arguments passed to these functions, both on entry and on return.
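The naming of these two outputs can be sketched without the regular expressions. This is a minimal illustration that runs outside IDA; the helper names frida_paths and trace_option are made up for this example, and it assumes, as the script above does, that the IDA database is named after the module (for example sample.exe.i64 for sample.exe).

```python
import os


def frida_paths(idb_path):
    """Return (options_file, module_name, handlers_dir) for an IDA database path."""
    root, ext = os.path.splitext(idb_path)
    if ext.lower() not in (".idb", ".i64"):
        raise ValueError("not an IDA database: %s" % idb_path)
    # sample.exe.i64 -> sample.exe.frida, stored next to the database
    options_file = root + ".frida"
    # Assumption: the database base name equals the module name seen by Frida
    module_name = os.path.basename(root)
    handlers_dir = os.path.join(os.path.dirname(root), "__handlers__", module_name)
    return options_file, module_name, handlers_dir


def trace_option(module_name, rva):
    """Format one '-a MODULE!OFFSET' line, the same shape the script writes."""
    return "-a %s!0x%X" % (module_name, rva)
```

Checking the extension explicitly also avoids the case where neither suffix is replaced and the output path ends up being the database itself.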

Unfortunately, Frida is very sensitive and any error during processing of these handlers forces a bail out :(.

So, after toying around with different variations of this, and similar code, I came up with this dumb script:

import os
import shutil
import idautils
import idaapi
import idc
import re

idf = idc.get_idb_path()

print ("Original IDA File: %s" % idf)

# re.match only matches at the start of the path, so search for the extension at its end
m = re.search(r"\.idb$", idf)

arch = 0
if m:
    arch = 32
    print ("- 32-bit")
else:
    arch = 64
    print ("- 64-bit")

if arch == 32:
    idf = idf.replace('.idb','.frida')
else:
    idf = idf.replace('.i64','.frida')

print ("Output idf: %s" % idf)

filename=re.sub(r"\.frida", "", re.sub(r"^.+[\\/]", "", idf))
handlers=re.sub(r"[^\\/]+$", "", idf) + "__handlers__" + "/" + filename + "/"

if os.path.isdir(handlers):
    print ("Deleting old handlers directory: %s" % handlers)
    shutil.rmtree(handlers)

# makedirs also creates the parent __handlers__ directory if it is missing
os.makedirs(handlers)

print ("Saving frida input file to '%s'" % idf)
print ("Saving '%s' handlers to '%s'" % (filename, handlers) )
g = open(idf, 'w')
base = idaapi.get_imagebase()
for f in idautils.Functions():
    dism_addr = list(idautils.FuncItems(f))
    ofs = "%X"%(dism_addr[0]-base)
    g.write ("-a %s!0x%s\n" % (filename, ofs))
    h = open(handlers + "/" + "sub_"+ofs+".js", 'w')
    h.write("""
{

  onEnter(log, args, state) {
    let out = 'onenter: """+ofs+"""\\n'
    log(out)

    // this.args does not exist until we create it; onLeave reads it later
    this.args = []

    for (let i = 0; i < 4; i++)
    {
       // save the argument first, so it is stored even if a read below throws
       this.args [i] = args [i]
       console.log(' - '+ args[i] + 'a->' + args[i].readUtf8String()+'\\n');
       console.log(' - '+ args[i] + 'w->' + args[i].readUtf16String()+'\\n');
    }
  },

  onLeave(log, retval, state) {
    let out = 'onleave: """+ofs+"""\\n'
    log(out)
    for (let i = 0; i < 4; i++)
    {
       console.log(' - '+ this.args[i] + 'a->' + this.args[i].readUtf8String()+'\\n');
       console.log(' - '+ this.args[i] + 'w->' + this.args[i].readUtf16String()+'\\n');
    }

  }
}
    """)
    h.close()


g.close()

It at least populates the console log with anything that may be of interest, and we can grep or rg it to our liking…
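One possible mitigation for the bail-out problem, offered as an untested sketch rather than a confirmed fix: Frida reports a failed memory read as a JavaScript exception, so each read can sit in its own try/catch and a bad pointer then skips one value instead of aborting the handler. Whether this is enough to keep frida-trace running in the scenario described here is an assumption. The names make_handler, SAFE_DUMP_JS and HANDLER_JS are made up for this example.

```python
# JavaScript fragment shared by both callbacks. Each read has its own
# try/catch, so an unreadable argument is skipped instead of raising.
SAFE_DUMP_JS = """
    for (let i = 0; i < 4; i++) {
      for (const [tag, read] of [['a', p => p.readUtf8String()], ['w', p => p.readUtf16String()]]) {
        try {
          const s = read(this.args[i]);
          if (s !== null && s.length > 0) {
            out += ' [' + i + ']' + tag + ' ' + JSON.stringify(s) + '\\n';
          }
        } catch (e) {
          // unreadable memory, or not a pointer at all: ignore this value
        }
      }
    }
"""

# Handler skeleton in the same bare-object form the scripts above generate.
HANDLER_JS = """
{
  onEnter(log, args, state) {
    let out = 'onenter: __OFS__\\n';
    // keep the first four arguments so onLeave can look at them again
    this.args = [args[0], args[1], args[2], args[3]];
__DUMP__
    log(out);
  },

  onLeave(log, retval, state) {
    let out = 'onleave: __OFS__\\n';
__DUMP__
    log(out);
  }
}
"""


def make_handler(ofs):
    """Return the JavaScript handler source for the subroutine at offset `ofs` (hex string)."""
    dump = SAFE_DUMP_JS.strip("\n")
    return HANDLER_JS.replace("__DUMP__", dump).replace("__OFS__", ofs)
```

In the generator loop this would replace the long inline string with h.write(make_handler(ofs)).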

Stuffing up the WINDIR env. var. with THE SPACE

I love revisiting the ‘there is nothing else to be found there anymore’ cases and I described this process here.

Recently, I’ve been thinking of the WINDIR environment variable. I have already covered a few cases where WoW executables could be forced to execute any executable of our choice by modifying the WINDIR environment variable, but it crossed my mind that we may try something new…

If you google what the environment variable maximum length is you will discover it is (allegedly) 32,767 characters. Luckily, Raymond Chen wrote this post that gives us a bit more (reliable) insight.

So, my thinking was: if I can force the WINDIR environment variable to fill the whole space used by the environment variables, then in cases where it is used to expand a path (e.g. to the 64-bit executable from the 32-bit WoW executable, as in the cases I linked to above), some path truncation may happen that renders the ‘expanded’ version of the path in an unpredictable way. That is, I was hoping that if WINDIR is long enough, e.g. close to the alleged maximum length of 32K characters, the rest of the path would be truncated, potentially giving us an opportunity to run literally any executable on the system this way…

That didn’t happen 🙁

After playing around with it I eventually gave up. No truncation occurred, and while the results depended on the WoW executable I tested (msra.exe, w32tm.exe, launchtm.exe), I have not identified a way to exploit this.

However…

I do want to showcase one interesting observation from this attempt.

The msra.exe was a very interesting study.

I generated a batch file that was nearly 65K in size. It’s just a rather lengthy SET WINDIR=<~64K spaces>..\..\windows\notepad.exe.
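A batch file of this shape can be produced with a few lines of Python. This is a sketch only: the exact amount of padding is not stated beyond ‘~64K’, so 64 * 1024 spaces is an assumption, and the function names and the windir.bat file name are made up for this example.

```python
def build_set_windir(padding_len, suffix=r"..\..\windows\notepad.exe"):
    """Return one batch line: SET WINDIR=<padding_len spaces><suffix>, CRLF-terminated."""
    return "SET WINDIR=" + " " * padding_len + suffix + "\r\n"


def write_batch(path, padding_len=64 * 1024):
    """Write the single-line batch file and return its size in characters."""
    line = build_set_windir(padding_len)
    # newline="" keeps the explicit CRLF from being translated again on Windows
    with open(path, "w", newline="") as handle:
        handle.write(line)
    return len(line)
```

Calling write_batch("windir.bat") then gives a file that can be run from the cmd.exe session before starting the target executable.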

So, I ran it from cmd.exe terminal, and then executed c:\WINDOWS\SysWOW64\msra.exe.

To my surprise, the program ran, even though the WINDIR environment variable was far longer than the alleged maximum!

Secondly, while it took a few seconds for the msra.exe to load, it did eventually show an interesting message box as an indication of an error:

The lesson learned is that the environment block can be far larger than 32K!

Using System Informer/Process Hacker we can look at the msra.exe process environment block, and here it is — a WINDIR variable taking ~64K:

(you can select all, copy it to clipboard, then paste it in Notepad, save file and check file size).
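As an alternative to the clipboard route, the sizes can be measured from a Python process started in the same cmd.exe session, since it inherits that environment. This is a small sketch with made-up helper names; block_size only approximates the block as NAME=value entries, each followed by a terminator, plus one final terminator.

```python
import os


def largest_variables(env=None, top=3):
    """Return the `top` (name, value length in characters) pairs with the longest values."""
    env = os.environ if env is None else env
    sizes = [(name, len(value)) for name, value in env.items()]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]


def block_size(env=None):
    """Approximate environment block size in characters."""
    env = os.environ if env is None else env
    # each entry is NAME, '=', value and one terminator; one more terminator ends the block
    return sum(len(name) + 1 + len(value) + 1 for name, value in env.items()) + 1
```

Printing largest_variables() in the modified session should list WINDIR first if the oversized value was actually inherited.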

What this example teaches is that you should trust, but verify.

And it’s still possible I missed something and WINDIR or many other environment variables can be abused to do things no one ever considered…