
WerFault – command line switches v0.1

September 20, 2019 in EDR, threat hunting

I posted about werfault.exe a couple of times before. Some of the posts focused on persistence mechanisms, some on lolbinish behavior, but I thought it would be good to dedicate some time to describing the actual command line arguments this program accepts…

Why?

In my opinion, werfault.exe accepts the most bizarre combinations of command line arguments on the Windows platform. And despite the werfault.exe process being executed so many times, we have yet to see a comprehensive description of the switches it relies on. What makes them stand out is that:

  • at first glance, they look completely random
  • they rely on a bunch of weird, unusual and undocumented arguments, and finally,
  • many of them expect values in a numerical, often hexadecimal format that confuses every single analyst who has ever laid eyes on it…

The summary below is my first stab at this topic, so it may not be the most complete reference, BUT… we have to start somewhere.

The key to understanding werfault.exe command line arguments is to focus on the first switch being used. Yes, the very first thing werfault.exe does when it’s invoked is check the why:

  • -e: SQM Escalation
    • -e -p <num> -t <num> -r <num> -a <num> -f <num>
    • -e -p <num> -t <num> -r <num> -a <num> -f <num> -h <num>
  • -k: kernel-related
    • -k -lc <dump file name>
    • -k -lcq
    • -k -q
    • -k -rq
    • -k -l <string> <string> — live kernel
    • -k -lc <string> <string> — live kernel
  • -p: ?
    • -p <num> -h <num>
  • -pr: ?
  • -pss: ?
  • -s: process executed via SilentProcessExit mechanism
    • -s -t <num> -i <num> -e <num> -c <num>
  • -u: user mode
    • -u -p <num> -s <num>
  • /h – elevated hang reporting
    • /h /shared <shared>
    • /h /shared <shared> /t <num> /p <num>
  • /hc – ?
  • ??? -nonelevated – ??

The command line switch separator (- or /) that I listed above is actually important: its hardcoded form is what the program expects and compares against. This is somewhat unusual and departs from the typical pattern we are familiar with, where either of these two characters (- or /) is accepted as a switch indicator.
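To make the "first switch tells you the why" idea concrete, here is a minimal Python sketch that labels a werfault.exe command line based on its first switch only. The mapping mirrors the list above (the "?" entries stay unknown, and -nonelevated is left out because its position is unclear). The function name, the sample command line, and the case-sensitive exact match are my assumptions for illustration, not something lifted from werfault.exe itself.

```python
import shlex

# First switch -> reason, as listed above. Entries marked "?" in the
# list are kept as "unknown" rather than guessed.
FIRST_SWITCH = {
    "-e": "SQM escalation",
    "-k": "kernel-related",
    "-p": "unknown",
    "-pr": "unknown",
    "-pss": "unknown",
    "-s": "SilentProcessExit",
    "-u": "user mode",
    "/h": "elevated hang reporting",
    "/hc": "unknown",
}


def classify_werfault(cmdline: str) -> str:
    """Label a werfault.exe command line by its first switch (hypothetical helper)."""
    # posix=False keeps Windows backslashes intact while still honoring quotes.
    tokens = shlex.split(cmdline, posix=False)
    if len(tokens) < 2:
        return "no switches"
    # tokens[0] is the image path; tokens[1] is the first switch.
    # Exact match on purpose: "-h" is not treated as "/h".
    # Case-sensitivity is an assumption of this sketch.
    return FIRST_SWITCH.get(tokens[1], "unlisted")


if __name__ == "__main__":
    # The numeric values are made up for the example.
    print(classify_werfault(r"C:\Windows\system32\WerFault.exe -u -p 1234 -s 456"))
```

Run over Sysmon or 4688 process creation logs, something like this gives a quick breakdown of why werfault.exe was invoked; anything that comes back as "unlisted" is a candidate for a closer look.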

I am aware of many other command line switches, but I am still browsing through the code, so I will update this post when I get more info.

What are the lessons learned here?

If you see werfault.exe processes in Sysmon or 4688 logs, try to figure out what their execution indicates. Sometimes, it may be an early warning of malware trying to do something that is prohibited on newer versions of Windows, but was fully acceptable on older ones. Also, if any program crashes, and it involves werfault.exe, you can use it to provide feedback to the vendor/software developer…

There is literally a lot of goodness that can come out of looking at werfault.exe process invocations in general. Whatever crashes, hangs, or breaks usual patterns is always an interesting thing to look at.

Moar and Moar Agents – sthap!

July 27, 2019 in EDR, Preaching

$Vendors love agents.

  • One does the AV
  • One does the DFIR
  • One does the EDR
  • One does the CIDS
  • One does the DLP
  • One does the FIM
  • One does the IAM
  • One does the SSO
  • One does the Event Forwarding
  • One does the Asset Inventory
  • One does the Client Proxy
  • One does the Managed Updates
  • One does the Vulnerability Management
  • One does the Employee Monitoring, on demand
  • One does the Conferencing
  • etc.

Some claim they are agent-less, but under the hood they use WMI, psexec, GPO, SCCM, etc.

Every single agent adds to the list of events that are generated and collected by the system, and often by other agents. Every single one steals CPU, RAM, HDD cycles. Almost every single agent runs other programs. Almost every single agent works by spawning multiple processes at regular intervals. Almost every agent that is noisy renders all detections for the MITRE ATT&CK Discovery tactic useless.

A quick digression: I used to have a work laptop with 4GB RAM. At least once a day my work would come to a halt. I always had Outlook, Chrome, and Microsoft Teams open. At that special time of day an agent would kick off its work and my computer’s CPU/RAM usage would jump to 100%. I couldn’t switch between apps, and literally had to wait a good 5-10 minutes each time for the agent to stop before I could resume my work.

This has to sthap.

We all know that we need that Magic Unicorn single-vendor solution that works for Win/OSX/Lin + offers AV+EDR+DFIR+FIM+DLP+CIDS+VM+SSO+IAM in one + uses minimum resources + is cheap :). Atm all of these features are typically addressed by solutions from different vendors & the moar of them make a claim to your box, the worse the performance will be.

Let me focus on EDR here for a moment as they ARE one of the worst resource hogs, especially these ‘solutions’ that rely on polling. IMHO tools that primarily use this approach to collect data have to go, and pronto & I would personally never (re-)invest in them; polling is not only very 2011, but it literally misses stuff, adds a lot of stress to the endpoint, data synchronization and accuracy are questionable, and so on and so forth. Ah, and these solutions often piss off analysts a lot: so often they want to triage a system & they can’t, cuz the system is offline.

To elaborate on the ‘synchronization and accuracy’ bit:

  • system offline or on a different network –> no data accessible at all –> delays in triage/analysis
  • if you are doing env sweeps, you end up polling a few times to ensure you collect data from ‘all’ systems; the ‘all’ is just wishful thinking, because you have no control over it; also, as a result, some systems that are always online end up being polled more than once (resources wasted)
  • datasets are not synchronized & you get duplicates since you will get a few batches with different timestamps

So… IMHO polling will always give you imperfect data to work with; it just doesn’t work in a field that is so close to Digital Forensics + doesn’t help to answer questions that will be asked by management:

  • how many systems in our env have this and that artifact present? you will never be able to answer with 100% certainty
  • is our env. clean? yeah, right… 75% of boxes replied to our query with a negative result, others didn’t, so… we are 75% clean
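The arithmetic behind the "we are 75% clean" problem is easy to sketch. Below is a minimal Python example, assuming a known inventory of hosts and a list of sweeps, each being the set of hosts that answered that poll. The function name and the host names are hypothetical and only there for illustration.

```python
from typing import Dict, Iterable, List


def sweep_coverage(inventory: Iterable[str], sweeps: List[Iterable[str]]) -> Dict:
    """Summarize what repeated polling sweeps actually covered (hypothetical helper)."""
    known = set(inventory)
    seen = set()
    responses = 0
    for sweep in sweeps:
        # Only count hosts that are part of the inventory.
        responding = set(sweep) & known
        responses += len(responding)
        seen |= responding
    return {
        # Fraction of the inventory that answered at least once.
        "coverage": len(seen) / len(known),
        # Hosts you can say nothing about.
        "never_seen": sorted(known - seen),
        # Polls spent on hosts that had already answered (wasted resources).
        "redundant_polls": responses - len(seen),
    }


if __name__ == "__main__":
    inventory = ["pc1", "pc2", "pc3", "pc4"]
    sweeps = [["pc1", "pc2"], ["pc1", "pc2", "pc3"]]
    print(sweep_coverage(inventory, sweeps))
```

With four hosts and the two sweeps above, coverage comes out at 0.75, pc4 is never seen, and two polls were redundant: more sweeps cost more on the boxes that are always online, and still don't get you to ‘all’.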

Plus, they often rely on third party/OS binaries to do the job + often use interpreted languages (slow, cuz interpreters are often executed as external programs that add to the event noise, especially the ‘Process creation’ event pool).

What I find the most hilarious is the fact that actual malware can squeeze system info collection, password grabbing, screen grabbing, video recording, vnc modules, shell, etc. into <100KB of code; most vendors use RAD, Java, scripts and end up with awful bloatware.

What I am trying to say is that the EDR tools worth looking at:

  • integrate with the OS at the lowest possible level (AV integrates at a low level for a reason; also, look at Sysmon)
  • collect all real-time events
  • send data off the box ASAP (any data stored on the box can be compromised/deleted/modified)
  • send data out by any means necessary (multiple protocols?)
  • send stuff to cloud anytime box goes online (no matter what network)
  • use native code (machine code) for main event collector modules instead of interpreted language –> performance / minimal footprint
  • use a single service process (supported by a kernel driver, when necessary) instead of multiple processes
  • don’t spawn other processes: native code-based modules collect data as needed, loaded as a DLL or always present (the interception of events is code that can be VERY lean; the bulkier the code, the crappier the solution; red flags: .NET, Java, Powershell, VBScript, Python, WMI, psexec, etc.)
  • run queries on data / analyze outside of the endpoint

Basically: the agent intercepts, collects, caches, sends out to cloud when any network is available & asap, then sleeps until the next event occurs.
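The intercept, cache, send, sleep cycle can be sketched in a few lines. This is a toy model in Python for readability only (a real collector would be native code, as argued above); the class name, the `send` callback and its True/False contract are all assumptions made for the example.

```python
from collections import deque
from typing import Callable, Deque


class Forwarder:
    """Toy event forwarder: cache every event, ship it as soon as a send succeeds."""

    def __init__(self, send: Callable[[dict], bool]) -> None:
        # `send` is a hypothetical transport: returns True if the event
        # left the box, False if no network is available right now.
        self._send = send
        self._cache: Deque[dict] = deque()

    def on_event(self, event: dict) -> None:
        # Intercept + cache, then try to ship immediately.
        self._cache.append(event)
        self.flush()

    def flush(self) -> int:
        """Send cached events in order; stop at the first failure."""
        sent = 0
        while self._cache:
            if not self._send(self._cache[0]):
                break  # offline: keep the event cached for the next attempt
            self._cache.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._cache)
```

Nothing runs between events: the only work happens inside `on_event`, so the agent effectively sleeps until the next event occurs, and anything cached while offline goes out in order on the first successful send.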

Of course, the solution may have extra modes for deploying heavy-weight stuff e.g. scripts, DFIR modules (memory dumping, artifacts collection, etc) + prevention modules etc., but this is used only during actual analysis, not triage.

So, what I covered is a basic architectural requirement:

  • An agent acts as an event forwarder ONLY & sends events to a Collector + can launch heavy ‘forensic’ modules / programs as needed
    • Events that are collected must be configurable, ideally (pre-processing –> fewer events –> better performance/less storage/less bandwidth)
  • Collector acts as a repository of events
    • Just store & index
    • Perhaps apply some generic out of the box rules/tests (VT, vendors’ IOCs, yara, etc.) and trigger alerts
  • Console allows querying Collector events, setting up watch lists, managing rulesets, etc.
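The Collector side of this split can be sketched just as briefly. The Python below is a minimal in-memory model, assuming events are flat dicts with hashable values; the class, its method names, and the sample rule are hypothetical, and a real Collector would obviously sit on top of proper storage.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple


class Collector:
    """Toy collector: store, index, apply simple rules, answer console queries."""

    def __init__(self, rules: Dict[str, Callable[[dict], bool]] = None) -> None:
        self.events: List[dict] = []
        # (field, value) -> positions of matching events in self.events
        self.index: Dict[Tuple[str, object], List[int]] = defaultdict(list)
        self.rules = rules or {}
        self.alerts: List[Tuple[str, int]] = []

    def ingest(self, event: dict) -> None:
        # Just store & index...
        pos = len(self.events)
        self.events.append(event)
        for field, value in event.items():
            self.index[(field, value)].append(pos)
        # ...then run the generic rules and record any alert.
        for name, predicate in self.rules.items():
            if predicate(event):
                self.alerts.append((name, pos))

    def query(self, field: str, value: object) -> List[dict]:
        """What a Console would call: exact-match lookup via the index."""
        return [self.events[i] for i in self.index.get((field, value), [])]


if __name__ == "__main__":
    rules = {"werfault_seen": lambda e: e.get("image", "").lower().endswith("werfault.exe")}
    collector = Collector(rules)
    collector.ingest({"host": "pc1", "image": "C:\\Windows\\explorer.exe"})
    collector.ingest({"host": "pc1", "image": "C:\\Windows\\System32\\WerFault.exe"})
    print(collector.query("host", "pc1"))
    print(collector.alerts)
```

All the querying and rule matching happens here, off the endpoint, which is the whole point: the agent stays a dumb forwarder and the analysis cost lands on the Collector.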

Coming back to agents as a whole — it’s time for some consolidation to happen… As usual, big players will be the winners as only they can afford to acquire and integrate.