Sysmon – ideas, and gotchas

This post is about making sure a sysmon config works as it should. It also introduces a few unusual ideas and highlights a couple of gotchas that perhaps not everyone thinks of when touching sysmon for the first time.

Despite reading about sysmon capabilities a lot, I only recently seriously looked at it from a threat hunting perspective.

There are many ‘template’ sysmon configs available online. They are an excellent start for anyone who wants to build their own. There are also many presentations about Sysmon and its detection capabilities. If you need to find them, just google around. Many of them cover the basics well and offer some good in-depth food for thought; they should definitely be consumed before attempting to build your own config…

Let’s begin…

Coverage + Versioning

If you deploy your sysmon config to a single system, or install it in a small lab, config deployment and versioning are not a problem. Anytime you change your config, you just reconfigure the sysmon program manually. And if needed, you can just restart the service, or restart the computer.

In the real world though, deployment and updates are more complicated. The basic production issues I came across, or heard of, are:

  • Sysmon service doesn’t automatically start after a system restart
  • Sysmon accepts a new config, but event forwarding stops working
  • Sysmon generates tons of events & needs to be switched off, or tuned asap (typically for a small subset of hosts where some new noisy program was installed; ironically, very often it’s a security tool that causes all this noise)
  • The noise degrades performance of a system; owners are not happy
  • A good config can still cause performance degradation on virtual machines

There are more scenarios, I am sure of it, but to use our favorite line from corporate jargon: the bottom line is that we want to know:

  • how many systems we have & their OS versions
  • what % of these systems have sysmon properly installed
  • what the sysmon service status is on these systems (running/not running first, and then more detailed service states for troubleshooting)
  • whether events are being forwarded to a log aggregation system
  • which version of the sysmon config is being used by each system
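The checklist above can be turned into a small coverage report. This is only a sketch: the `HostStatus` record and all of its field names are hypothetical, and the data would have to come from whatever inventory, endpoint management, and log aggregation tools you actually have.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class HostStatus:
    # Hypothetical record; map the fields to your own CMDB / SIEM exports.
    hostname: str
    os_version: str
    sysmon_installed: bool
    service_state: Optional[str]   # e.g. "RUNNING", "STOPPED"; None if unknown
    forwarding_events: bool        # events recently seen in the log aggregator
    config_version: Optional[str]  # e.g. taken from the last Event 16


def summarize(hosts: Iterable[HostStatus]) -> dict:
    """Answer the coverage questions for a list of hosts."""
    hosts = list(hosts)
    total = len(hosts)
    installed = [h for h in hosts if h.sysmon_installed]

    def pct(count: int) -> float:
        # All percentages are relative to the full inventory.
        return round(100.0 * count / total, 1) if total else 0.0

    return {
        "total_hosts": total,
        "os_versions": dict(Counter(h.os_version for h in hosts)),
        "installed_pct": pct(len(installed)),
        "running_pct": pct(sum(h.service_state == "RUNNING" for h in installed)),
        "forwarding_pct": pct(sum(h.forwarding_events for h in installed)),
        "service_states": dict(Counter(h.service_state or "UNKNOWN" for h in installed)),
        "config_versions": dict(Counter(h.config_version or "UNKNOWN" for h in installed)),
        # Installed but silent: the hosts to troubleshoot first.
        "silent_hosts": sorted(h.hostname for h in installed if not h.forwarding_events),
    }
```

The `silent_hosts` list is the interesting part: sysmon is supposedly there, yet nothing arrives in the log aggregator.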

Many issues listed earlier are not strictly Sysmon problems. IT dependencies are always tricky. They become even more visible when we deal with a large, multinational company. And despite being a great piece of software, sysmon doesn’t have its own deployment ‘command center’. Everything has to be done by hand, and can be subject to PEBKAC, various corporate rules (change management, business cases, compliance), laws (DAB, GDPR), and politics.

Again, the bottom line is that even if we are not responsible for deployment, we want and should be tracking it all so we can troubleshoot issues as soon as they are spotted. Absence of data is itself an incident.

In terms of config changes, you can use Sysmon Event 16 to track the version of the XML file used to deploy the config. Using data from the event we can check the config file’s SHA1 and look at its actual path. It may come in handy to include a version number in the sysmon config file name so you can extract it, e.g.:

Event 16
Sysmon config state changed:
UtcTime: 2019-02-13 20:57:26.626
Configuration: <path>\sysmon_v1.xml
ConfigurationFileHash: SHA1=<sha1>
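Extracting the version from such an event could look roughly like this. It is a sketch that assumes the event text has the layout shown above, that the config file name follows the `_v<number>.xml` convention suggested here, and that the events arrive as hypothetical `(hostname, message)` pairs exported from your log aggregator.

```python
import re
from datetime import datetime

# Field names follow the Event 16 text above; the "_v<number>.xml" naming
# convention is the one suggested in this post, not a Sysmon feature.
FIELD_RE = re.compile(
    r"^(UtcTime|Configuration|ConfigurationFileHash):[ \t]*(.*?)[ \t\r]*$",
    re.MULTILINE,
)
VERSION_RE = re.compile(r"_v(\d+(?:\.\d+)*)\.xml$", re.IGNORECASE)


def parse_event16(message):
    """Pull time, path, hash and version out of one Event 16 message."""
    fields = dict(FIELD_RE.findall(message))
    path = fields.get("Configuration", "")
    match = VERSION_RE.search(path)
    return {
        # Raises KeyError if the message has no UtcTime line.
        "time": datetime.strptime(fields["UtcTime"], "%Y-%m-%d %H:%M:%S.%f"),
        "path": path,
        "hash": fields.get("ConfigurationFileHash"),
        "version": match.group(1) if match else None,
    }


def latest_config_per_host(events):
    """events: iterable of (hostname, event_16_message) pairs."""
    latest = {}
    for host, message in events:
        parsed = parse_event16(message)
        if host not in latest or parsed["time"] > latest[host]["time"]:
            latest[host] = parsed
    return latest
```

Keeping only the newest Event 16 per host gives a ‘current config version’ table, provided the events are still available in the first place.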

It’s all nice and cozy, but… imagine that your sysmon config update happens only once a quarter. In the meantime, new systems are added, old systems are being removed, people responsible for deployment change. If you want to find out what version of config you have on all these systems at the end of the quarter you now need to query your log aggregation tool for that whole quarter of data to find the last Event 16 from all systems … Good luck with that.

Of course, you can probably gather info on the deployment status from other sources, e.g. from the release notes, etc. but I am just highlighting a very important problem that has to be taken into account from a planning and maintenance perspective. I don’t know a good, generic solution for that problem at the moment. Any ideas, and advice from the trenches welcome.

And there is one more decision to make. How many configs to use, and how to maintain them? Config used for servers may be different (more sensitive & noisy) than for workstations. That’s yet another dimension to take into account from a deployment, and maintenance perspective.

Of course, lab testing, unit testing of new rules, and gradual release are a different story, but also important.

Access to the config file

Another thing to consider is config security. First of all, who writes and modifies the config? It’s a task that carries a lot of responsibility, and only selected people should have access to it.

Secondly, if you drop a config file on a system where sysmon is deployed, ensure that ACLs are in place for both the file and the Registry key so no one can read its textual or binary form. Delete the XML file immediately after the config update. While the config data can be obtained from the Registry, we can always try to make the life of attackers a bit harder.

Rules and tagging

Do yourself a favor and tag everything from the very start. This will help a lot with troubleshooting individual rules, and their classes. Use as many different rule names as possible. Don’t just rely on Mitre techniques. For example, someone running powershell.exe and a program loading the PowerShell automation DLL are not equivalent, even though both map to the same technique (T1086: PowerShell).

Also, your rules will overlap. I once spent a lot of time ‘fixing’ my broken rule. Until I realized (after tagging all rules!) that I was looking at the wrong rule. This was actually the reason I started tagging everything.

Rules and their order

The XML syntax is unfortunately not the friendliest way to write a config with a few hundred, sometimes even thousands of rules.

I usually take a mixed sorting approach where rules that are grouped under a certain common category are put together. They are then sorted by the condition, and then by the actual artifact value. This helps me to keep the stuff in some order.

I also use a very simple visual aid. For all clusters, I align the artifact values to the same column e.g.:

<Image condition="image" name="CScript"    >cscript.exe</Image>
<Image condition="image" name="HTA"        >mshta.exe</Image>
<Image condition="image" name="PowerShell" >powershell.exe</Image>

It makes reading / reviewing much easier.

Rules & ‘end with’ optimization

Sysmon is very busy. It processes a lot of stuff that our rules don’t trigger on. We really want it to bail out on that non-matching stuff as quickly as possible, and catch the stuff specified by rules even faster.

@Swiftonsecurity made an interesting discovery with regards to the way ‘end with’ rules are processed. If used wisely, they may significantly improve processing speed of your rules.

The reason for this is that the ‘end with’ comparison doesn’t start the comparison from the end of the string as one would expect, but from a position somewhere inside the longer of the compared strings. This reduces the number of comparisons needed to bail out for strings that are clearly different.

For example, if your rule says ‘Path ends with c:\foobar\foo.exe’, and sysmon observes a Path c:\foobar\test.csv, the comparison will bail out immediately after the first comparison, i.e. when the ‘:’ of the observed path is compared against the leading ‘c’ of the rule value:

c:\foobar\test.csv
c:\foobar\foo.exe

This is because sysmon finds the earliest position in the longer string where the shorter string would have to begin (counting from the end of the longer string), and starts the comparison from there.

If the comparison started from the beginning, it would need to walk through the full directory path, which is identical in our example, and only fail when ‘t’ is compared against ‘f’:

c:\foobar\test.csv
c:\foobar\foo.exe

So, optimizing rules using this trick is a pretty good idea.
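To make the difference tangible, here is a toy model of both comparison strategies that counts how many characters are compared before bailing out. It only illustrates the behavior described above and is not Sysmon’s actual code; the function names are made up, and the case-insensitive comparison is an assumption.

```python
def ends_with(value, suffix):
    """Return (matched, comparisons) for a suffix check that starts at the
    position where the suffix would have to begin and walks forward."""
    if len(suffix) > len(value):
        return False, 0  # suffix cannot fit, nothing to compare
    start = len(value) - len(suffix)
    comparisons = 0
    for i, ch in enumerate(suffix):
        comparisons += 1
        # Case-insensitive comparison is an assumption of this sketch.
        if value[start + i].lower() != ch.lower():
            return False, comparisons
    return True, comparisons


def starts_with(value, prefix):
    """Same idea, but comparing from the beginning of both strings."""
    if len(prefix) > len(value):
        return False, 0
    comparisons = 0
    for i, ch in enumerate(prefix):
        comparisons += 1
        if value[i].lower() != ch.lower():
            return False, comparisons
    return True, comparisons


observed = r"c:\foobar\test.csv"
rule = r"c:\foobar\foo.exe"
print(ends_with(observed, rule))    # (False, 1)  ':' vs 'c'
print(starts_with(observed, rule))  # (False, 11) 't' vs 'f'
```

With paths that share long common prefixes (and on Windows most of them do), the suffix-anchored check gives up after one or two characters where a prefix check has to walk the whole shared directory part first.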

Process Access Rule

I love it and I hate it. This is an extremely tricky rule to use.

Anytime a process opens a handle to another process, it uses a very specific access mask which states what access is being requested. The bitmask defines lots of different privileges. These privileges include the ability to terminate a process, create threads, duplicate handles, and read and/or write the virtual memory of a process. You can read all the gory details in the Microsoft article I linked to.

The good news is that it is an excellent way to detect any sort of code/data injections. The bad news is that legitimate software literally abuses the OpenProcess API. Whether it’s just a result of copypasta, or the beauty of legacy code, it is not uncommon for processes opening handles to other processes to request and be granted full-blown access called PROCESS_ALL_ACCESS (0x1FFFFF). Windows Explorer does it all the time, so do many other processes, and this includes AV software too.

The even ‘badder’ news is that sysmon doesn’t support bitmask comparison, so if we want to detect the presence of specific bits e.g. ones responsible for reading/writing memory:

  • PROCESS_VM_READ (0x0010)
  • PROCESS_VM_WRITE (0x0020)

we need to come up with an idea how to detect these w/o using a bitmask.

The good news is that these values are relatively small numbers, and fit within one byte (256 possible values); we can find all possible values of the byte where either one, or both of these bits (0x10 and/or 0x20) are enabled. We can then generate rules using the ‘end with’ condition on the GrantedAccess field.

Yup, it means a matrix of all the values with these bits ON:

condition="end with" 10
condition="end with" 11
condition="end with" 12
...
condition="end with" fd
condition="end with" fe
condition="end with" ff
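Nobody wants to type that matrix by hand, so it can be generated. A minimal sketch, assuming GrantedAccess is logged as a hex string so that its last two characters represent the low byte; the rule name used here is just a placeholder.

```python
PROCESS_VM_READ = 0x0010
PROCESS_VM_WRITE = 0x0020
MASK = PROCESS_VM_READ | PROCESS_VM_WRITE


def low_byte_suffixes(mask=MASK):
    """All two-digit hex values of the low byte with at least one mask bit set."""
    return [f"{value:02x}" for value in range(0x100) if value & mask]


def as_rules(suffixes, name="VM read/write"):
    """Render 'end with' rules; the rule name is a hypothetical tag."""
    return [
        f'<GrantedAccess condition="end with" name="{name}">{suffix}</GrantedAccess>'
        for suffix in suffixes
    ]


suffixes = low_byte_suffixes()
print(len(suffixes))               # 192
print(suffixes[:3], suffixes[-3:])  # ['10', '11', '12'] ['fd', 'fe', 'ff']
print(as_rules(suffixes)[0])
```

Only the ranges 10-3f, 50-7f, 90-bf, and d0-ff qualify, which is 192 rules in total; values such as 00 or 40 are left out because neither bit is set there.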

There is more bad news though. Now that you limit the rules only to those that read/write memory, you need to exclude all the source/target images that you think should be ignored. This is subject to the threat model embraced by your org, and a lot of manual analysis of the sysmon logs. It is a good rule of thumb to narrow down the list of target processes to the most common ones that are subject to memory reading or injections. These include lsass.exe, explorer.exe, svchost.exe, and a couple of others. It’s a hard decision to make really, as we are limiting the visibility.

And… the good news is that you have now significantly limited the number of events going to your log aggregator. From there, you can build more dynamic exclusions where you can pair more than one field to make a better exclusion.

Get ready for a lot of frustration

Yes.

In my recent post I complain about everyone focusing on detecting mimikatz. It works great in demos, it is a well-known marketing driver to refer to it, but there are a lot of things that can be done that are outside of mimikatz. We need more work on these additional targets.

If you have experience with any commercial EDR, your sysmon research will confirm that the tool is a great, free substitute for these products.
BUT
You are missing a lot of the flexibility that some of the commercial tools have offered for many years.

For example, the ability to build more complex rules needs to be delegated to a log aggregation system. Since we are already stripping down the number of events that sysmon is logging using our targeted rules, we now have two levels of filtering to maintain. It’s actually hard to manage w/o making a mistake.

Managing your ruleset is also really challenging. If you start walking through available configs you may notice that some of them contain typos. Don’t be too critical: you will make these typos too. Since it’s XML, you need to use an editor that can at least help you check the validity of the syntax. Otherwise you will be chasing unicorns.

Where the rules are absolute (e.g. full path, or registry entry), it’s relatively easy to keep track of them e.g. in a separate sheet. When you start using keywords, infixes, and more ‘wide’ rules… avoiding duplicates will become really hard. There is also additional complexity with regards to the WOW64 subsystem. For many rules it’s handy to mirror system32 and syswow64 as well as Program Files and Program Files (x86) rules.
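The mirroring part is easy to automate when rules are generated from a list of paths. A small sketch with a made-up helper name; it only handles the two directory pairs mentioned above and does nothing clever about registry paths.

```python
from typing import List

# (64-bit location, 32-bit counterpart)
PAIRS = (
    ("\\System32\\", "\\SysWOW64\\"),
    ("\\Program Files\\", "\\Program Files (x86)\\"),
)


def mirror_paths(path: str) -> List[str]:
    """Return the path itself plus its 32/64-bit counterpart, if it has one."""
    lowered = path.lower()
    variants = [path]
    for native, wow in PAIRS:
        for src, dst in ((native, wow), (wow, native)):
            idx = lowered.find(src.lower())
            if idx != -1:
                # Swap only the matched directory component.
                variants.append(path[:idx] + dst + path[idx + len(src):])
    return variants


print(mirror_paths(r"C:\Windows\System32\cscript.exe"))
print(mirror_paths(r"C:\Temp\a.exe"))
```

Generating both variants from a single source entry also helps with the duplicate problem: the sheet needs one line, not two.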

Testing rules is also not that easy. For trivial cases where we trigger on a file name we can do it on the spot. For code injection, accessing lsass.exe, running some lolbins, running processes as SYSTEM, accessing registry keys protected by ACL, etc. you may need to dedicate a solid amount of time to make it work and get the appropriate approvals to start testing…

Good luck!

PE files and the Easy Programming Language (EPL)

If you ever came across portable executables that include references to enigmatic modules called:

  • krnln.fne
  • krnln.fnr
  • eAPI.fne
  • RegEx.fnr

and many other libraries with .fne or .fnr file extensions, or perhaps found some of these files during a forensic exam, then this post is for you.

These executables are generated by the so-called Easy Programming Language (EPL), a RAD, Visual Basic-like programming language and software development environment available from this Chinese company, and also available from this website. It’s not super popular, but it definitely has a following in China, as programs are still being written in it. Including malware.

If you are in a hurry, you can download and play with the actual RAD v 4.01 from here or here.

When you install it, you will quickly notice that it populates the c:\Program Files\EPL\lib folder with all these familiar libraries:

btdownload.fne, cncnv.fne, com.run, cominf.run, downlib.fne, dp1.fne, eAPI.fne, eCalc.fne, EChartBar.fne, eCompress.fne, EDataStructure.fne, eDB.fne, edroptarget.fne, eExcel2000.fne, eGrid.fne, eImgConverter.fne, EInterProcess.fne, eMMedia.fne, eNetIntercept.fne, ePPT2000.fne, ERawSock.fne, ESpeechEngine.fne, ESPI11.dll, ESSLayer.fne, EThread.fne, ewizard.fne, eWord2000.fne, EXMLParser.fne, Exmlrpc.fne, ExtMenu.fne, HtmlView.fne, iext.fne, iext.fnr, iext2.fne, iext3.fne, internet.fne, isapi.fne, Javalib.fne, krnln.fne, krnln.fnr, mp3.run, mysql.fne, ocx.run, odbcdb.run, OPenGL.fne, PhoneCortrol.fne, pop3.fne, portio.fne, RegEx.fne, script.fne, shell.fne, sock.fne, spec.fne, twain.fne, Warning.txt, WNet.fne, xplib.fne

Since the user interface is in English, we can easily load one of the samples provided with the framework e.g. Funny Ball Game:

We can then compile the game and run it:

Now that we compiled and ran the executable, we can look at the file itself.

While the framework requires you to register before you can build standalone programs, it still provides a way to compile & test them. For test purposes it provides a small stub executable that launches programs, and does so from a Temporary directory. The commercial version allows you to package it all into one standalone .exe. It’s the ‘packaged’ version of the .exe we will typically come across ‘in the wild’.

Looking at the stub .exe:

we can notice the following strings of interest:

  • krnln.fne
  • GetNewSock
  • Software\FlySky\E\Install
  • Not found the kernel library or the kernel library is invalid!
  • Failed to allocate memory!
  • / MADE BY E COMPILER – WUTAO

Looking at the PE file properties, we can see that the stub is pretty old:

  • TimeDateStamp: 0x3925136B (GMT: Fri May 19 10:11:55 2000)

I am not sure if this timestamp is good enough for any identification as I don’t have enough samples. Plus, this is a stub used for testing. Still, it could be used to run actual E programs (you can find some on github and elsewhere).

The easiest way to identify the program compiled with EPL is to look at its sections:

The .ecode section name is very characteristic, and I have added it to my PE Section list. You may also come across an .edata section. This, together with the list of strings and modules listed earlier (and probably a few more that can be found online, since one can create new ones), should be enough to ID the files (e.g. via Yara).
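If Yara is not at hand, the section check is simple enough to do with a few lines of Python and no PE library. This is a sketch with made-up function names that reads only the COFF header and the section table. It treats .ecode as the marker and deliberately ignores .edata, because that name is also the conventional one for export tables in regular PE files.

```python
import struct
from datetime import datetime, timezone

EPL_MARKER_SECTION = ".ecode"


def read_pe_summary(data: bytes):
    """Return (section_names, TimeDateStamp as UTC datetime) for a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # COFF file header: 20 bytes right after the PE signature.
    (_machine, section_count, timestamp, _symtab, _symcount,
     optional_size, _characteristics) = struct.unpack_from("<HHIIIHH", data, e_lfanew + 4)
    table = e_lfanew + 4 + 20 + optional_size
    names = []
    for i in range(section_count):
        raw = data[table + i * 40: table + i * 40 + 8]  # 40-byte headers, 8-byte names
        names.append(raw.split(b"\0", 1)[0].decode("ascii", errors="replace"))
    return names, datetime.fromtimestamp(timestamp, tz=timezone.utc)


def looks_like_epl(data: bytes) -> bool:
    names, _ = read_pe_summary(data)
    return any(name.lower() == EPL_MARKER_SECTION for name in names)
```

Running `read_pe_summary(open(path, "rb").read())` on a sample gives the section list and the compile timestamp in one go; a TimeDateStamp of 0x3925136B decodes to 2000-05-19 10:11:55 UTC.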

Since the file format is quite obscure, and dynamic analysis of these programs is not well researched, many AV and sandbox vendors list some of the artifacts created by the EPL framework as malicious, just because they are there, for instance this Registry key entry:

  • HKCU\Software\FlySky\E\Install\Path

It actually points to a location where the clean libraries are; on a system where the development environment is installed it points to:

  • C:\Program Files\EPL\lib\

In cases where the .exe is standalone, when the program is executed the libraries are automatically unpacked to a temporary folder e.g.:

  • %Temp%\E_4
  • %Temp%\E_N4

The registry entry will then point to that directory. This is obviously a possible persistence mechanism, but its value is pretty low by today’s standards…

During the development phase programs are stored in files with a .e file extension. Some programmers distribute them in this form as well (again, that’s what you can find on github and elsewhere!).

Looking at a sample .e file and its top few bytes we can see a magic string ‘CNWTEPRG’:

References to a MainForm string, and the top bytes (BM) of a bitmap (.bmp) file are clearly visible. The graphic files can be carved out easily from a .e file. Plus, we can always load the file into the actual developer environment to see the source code it holds.
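Both the magic check and the carving can be sketched in a few lines. Assumptions: the magic sits within the first 16 bytes of the file (the exact offset is not verified here), and embedded bitmaps are stored as plain, uncompressed .bmp files with a standard header. The function names are made up.

```python
import struct

EPL_SOURCE_MAGIC = b"CNWTEPRG"
# Header sizes of the common DIB header variants, used as a sanity check.
KNOWN_DIB_HEADER_SIZES = {12, 40, 52, 56, 64, 108, 124}


def has_epl_source_magic(data: bytes, window: int = 16) -> bool:
    """True if the magic string shows up near the top of the file."""
    return EPL_SOURCE_MAGIC in data[:window]


def carve_bitmaps(data: bytes):
    """Return (offset, bitmap_bytes) for every plausible embedded BMP."""
    found = []
    pos = data.find(b"BM")
    while pos != -1:
        if pos + 18 <= len(data):
            # File size, reserved, pixel data offset, DIB header size.
            size, reserved, pixel_offset, dib_size = struct.unpack_from("<IIII", data, pos + 2)
            if (reserved == 0
                    and dib_size in KNOWN_DIB_HEADER_SIZES
                    and 14 + dib_size <= pixel_offset <= size <= len(data) - pos):
                found.append((pos, data[pos:pos + size]))
        pos = data.find(b"BM", pos + 1)
    return found
```

The header sanity checks matter because the two bytes ‘BM’ appear by chance all over binary data; without them the carver produces mostly garbage.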

For the sample game I have shown earlier, we can double click the form and end up in a code window shown below:

Looking for existing tools that can understand the internal file format of PE files generated by the EPL (especially the packaged ones), I came across a very old Chinese tool called E-Code Explorer. You can download it from here (Web Archive copy over Google Translate :)). Since the E-Code Explorer interface is in Chinese, it’s a bit tricky to operate (screenshot below is from here):

Looking for a good candidate .exe file that could be opened with this tool I checked the actual development environment. Not surprisingly, it’s also (at least partially!) written in EPL! After poking around I found out that c:\Program Files\EPL\setup\mksetup.exe can be loaded without any issues:

The program reads the internal structures of the .exe, recognizes its e-code signature, and version. It also lists a lot of information in a way similar to what other tools do for other frameworks, e.g. lists e-code modules the program relies on:

and calculates offsets for internal sections:

This, and a traditional tree-like browser (shown below) may come in handy during malware analysis:

Also, when I looked at this file format a few years ago I recall seeing an actual description of the internal structures of the standalone .exes, but can’t find it at the moment.

I guess this file format is more a curiosity than anything else, but it is yet another PE file type to be aware of. What’s more, I am aware of tools written in EPL actually being found during forensic investigations so you may actually come across the .fne, .fnr files during your exams…