That’s a very fine Chardonnay you’re not drinking

August 26, 2019 in threat hunting


This post is vague on names, vendors, products. Simple reason: I don’t want to get sued.

However, I give you all the tools to go and find the vendors that abuse the trust ‘your’ users put in them.

Old Post

This post is an attempt to look at a threat that is often overlooked in our typical threat hunting scenarios: unintended data leakage.

Unintended means that users do not have any plan to steal data and ruin a company. They simply lack the technical knowledge that would allow them to assess the risk of installing certain applications. Who am I kidding. It’s not only non-techies who fall for it, but more technical people as well. I saw evidence of the latter on a number of occasions. Yup, and the evidence includes yours truly.

How do we leak data in an unintended way?

The simplest example is the auto-complete/auto-suggest functionality. As you type stuff in various search boxes (OS, browser address bar, search bar, etc.), the data is often sent key by key to some remote server instantly. In return, a suggested word can be provided and the editor can help us type faster by ‘guessing’ what the next word will be. This is a great and handy functionality, but the risk is that you may accidentally paste something sensitive and hit Enter. It did happen to me on more than one occasion over the years. I tend to flip between many windows pretty quickly, and sometimes such automation mode fails me. Additionally, if I work between two different OSes, e.g. Windows and macOS, and with many host/guest combos, my strong habits on one OS don’t translate well to the others. As such, wrong combinations of keys lead to a booboo. I have now developed a habit of always looking at the search box before I hit Enter, but I wouldn’t say this is a foolproof solution…

We can argue that these are just accidents and can be handled quickly. I agree. I wanted to describe a trivial case before I go to a more interesting area: the software.

There is a group of software that relies on heavy interaction with servers for a very simple reason: it works on the same principles as the auto-complete/auto-suggest functionality, but applies them not only to typed keys, but to pretty much everything that has a textual form.

By now, I guess you know what I am talking about: translation software and, with a lower impact, spelling and grammar correction software, text-to-speech and voice-to-text software, as well as many online document converters (e.g. DOC to PDF).

I will focus on describing translation software only, but many of the aspects covered below apply to other software too, and similar threat hunting techniques can be used to find it.

From a purely technical perspective, many translation applications/plug-ins use techniques that are very similar to the ones you would expect from a proper keylogger. They monitor keys entered from the keyboard, they monitor the clipboard, they monitor active / foreground windows as well as the mouse cursor position to know where to grab text from. Some of them go as far as using ScreenOCR/ScreenICR to grab text from pictures, or from custom controls that don’t support text retrieval via any API. They also install add-ins and plug-ins to improve their ability to harvest text from word processors, email clients, etc. in a native way. They have… lots of potential.

Anytime they grab that text, it is translated almost instantly.

Applications of this sort have existed for many years. Back in the day though, they would rely heavily on offline dictionaries and translation algorithms that would reside in libraries stored on the user’s system. The need to connect to a remote server was minimal (only for updates). Today, most of them are cloud-based. It is obviously better this way from a quality perspective: dictionaries are always up to date, no need to transfer large files, users can exchange translations, etc.

The only problem is that in an effort to be as user-friendly as possible, they grab all the possible stuff without much control from the user and send it out.

If you are a ‘lucky’ threat hunter, you will look at your logs and see very rich GET requests. If you are less lucky, you will see only POST requests, and for these you need to collect some PCAPs to inspect the request bodies.
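As a rough illustration of what hunting for ‘rich’ GET requests could look like, here is a minimal Python sketch. It assumes a hypothetical log format of one request per line, `METHOD URL`, which will not match any particular proxy product; the function name and the 200-character threshold are made up for the example. It simply adds up the length of the decoded query string values and reports the requests that carry a lot of text.

```python
from urllib.parse import urlsplit, parse_qsl


def rich_get_requests(log_lines, min_chars=200):
    """Return (url, decoded_length) pairs for GET requests with long query values.

    Assumption: every log line looks like 'METHOD URL' (hypothetical format).
    """
    hits = []
    for line in log_lines:
        parts = line.split()
        if len(parts) < 2 or parts[0].upper() != "GET":
            continue  # POST bodies are not in this kind of log; those need PCAPs
        url = parts[1]
        query = urlsplit(url).query
        # parse_qsl undoes percent-encoding and '+', so the length reflects
        # the text the user actually had on screen
        total = sum(len(value) for _, value in parse_qsl(query, keep_blank_values=True))
        if total >= min_chars:
            hits.append((url, total))
    return hits
```

The threshold is a tuning knob: a search suggestion is usually a handful of characters, while a pasted paragraph or email is hundreds.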

The ‘richest’ translation applications I have seen are those that send whole paragraphs, memos, content of emails, email threads, pretty much… everything. Some of them will even include process names for the windows the text snapshot was taken from, or additional attributes telling the vendor how the text was grabbed. They actually include lots of metadata in these requests.
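Process names leaking into requests are easy to search for. The sketch below is only an illustration: the list of names is a small, hypothetical starting set, and the input is assumed to be a URL or request body taken from your own logs or PCAPs.

```python
import re
from urllib.parse import unquote_plus

# Illustrative starting list only; extend it with whatever runs in your environment.
PROCESS_NAMES = ("outlook", "winword", "excel", "powerpnt")

_PATTERN = re.compile(
    r"\b(" + "|".join(PROCESS_NAMES) + r")(\.exe)?\b",
    re.IGNORECASE,
)


def process_names_in(text):
    """Return the sorted, lower-cased process names mentioned in a URL or body."""
    decoded = unquote_plus(text)  # undo percent-encoding before matching
    return sorted({match.group(1).lower() for match in _PATTERN.finditer(decoded)})
```

The word boundaries keep `excel` from matching inside `excellent`, but expect some noise anyway; a hit is a reason to look at the request, not a verdict.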

Now that I have your attention, my suggestion for your hunting exercise is as follows:

  • Network Logs
    • Look for popular process names in your logs, e.g. Outlook, Winword, etc.; hits may be related to translation software or to any other threat (PUA, malware, etc.), so they are worth reviewing one way or another
    • Look for domains that are related to translation; the following list of initial keywords should help:
      • transl
      • traduct
      • lingv
      • dict
      • thesau
      • spellc
  • Endpoint
    • Look for the presence of translation software
    • Download and install it in a VM, and test how it works
    • If you see it sending stuff out, assess what sort of data is being included
    • Look for cool functionality: ScreenOCR, ‘follow the window’, etc.
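
The domain keyword search from the list above can be sketched in a few lines of Python. The keywords are the ones listed; the function names and the idea of feeding it a plain list of domains (e.g. extracted from proxy or DNS logs) are assumptions made for the example.

```python
# Keywords taken from the list above.
KEYWORDS = ("transl", "traduct", "lingv", "dict", "thesau", "spellc")


def matching_keywords(domain):
    """Return the keywords that appear anywhere in the domain name."""
    name = domain.lower().rstrip(".")
    return [keyword for keyword in KEYWORDS if keyword in name]


def suspicious_domains(domains):
    """Map each matching domain to the keywords it matched."""
    hits = {}
    for domain in domains:
        found = matching_keywords(domain)
        if found:
            hits[domain] = found
    return hits
```

Plain substring matching is deliberately loose, so short keywords such as `dict` will also match unrelated names like `prediction.example.com`. Treat the output as a list to review rather than a blocklist.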

Note: not all of these applications are bad. Many of them ‘behave’, so you don’t need to kill’em all. But monitoring at least — yes, definitely.

taskhost.exe $(Arg0) & its other arguments

July 1, 2019 in threat hunting

While looking at Sysmon logs on Windows 7 I noticed a strange process entry that had the following properties:

  • services.exe – as a parent process
  • taskhost.exe – as an image
  • $(Arg0) – as a command line argument
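
To see which arguments taskhost.exe gets across many hosts, the process creation events can be summarised with a short script. This is a sketch under assumptions: the events are taken to be already exported as dictionaries keyed by the Sysmon process creation field names `Image`, `ParentImage` and `CommandLine`, the parent is taken to be `services.exe`, and the helper names are made up.

```python
import ntpath
from collections import Counter


def split_args(command_line):
    """Return everything after the executable in a command line."""
    cmd = command_line.strip()
    if cmd.startswith('"'):
        # Quoted executable path: arguments start after the closing quote.
        end = cmd.find('"', 1)
        return cmd[end + 1:].strip() if end != -1 else ""
    return cmd.partition(" ")[2].strip()


def taskhost_arguments(events):
    """Count the argument strings of taskhost.exe launched by services.exe."""
    counts = Counter()
    for event in events:
        # ntpath handles Windows paths regardless of where the script runs.
        image = ntpath.basename(event.get("Image", "")).lower()
        parent = ntpath.basename(event.get("ParentImage", "")).lower()
        if image == "taskhost.exe" and parent == "services.exe":
            counts[split_args(event.get("CommandLine", ""))] += 1
    return counts
```

Sorting the resulting counter by frequency puts the rare arguments at the bottom, which is usually where the interesting entries are.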

Anytime you see a placeholder / reference like this you start wondering whether it is a bug or a feature.

After grepping all .exe and .dll files under the Windows directory I couldn’t find any references to $(Arg0). Only after grepping all files did I finally come across the following task entry:

  • c:\WINDOWS\System32\Tasks\Microsoft\Windows\RAC
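
A search like this can be reproduced with a small Python sketch. It assumes the text may be stored either as plain ASCII or as UTF-16 (little-endian), so it looks for both byte sequences; the function name is hypothetical, and each file is read fully into memory, which is fine for a one-off hunt but not for huge files.

```python
import os


def find_files_containing(root, needle):
    """Return paths under root whose raw bytes contain needle (ASCII or UTF-16LE)."""
    patterns = (needle.encode("ascii"), needle.encode("utf-16-le"))
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as handle:
                    data = handle.read()
            except OSError:
                continue  # locked or unreadable file; skip it
            if any(pattern in data for pattern in patterns):
                hits.append(path)
    return hits
```

For example, `find_files_containing(r"C:\Windows", "$(Arg0)")` would walk the whole Windows directory without filtering by file extension.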

After looking at other Task XML files I noticed there are other variants of such command line arguments under the <Data> field. As far as I know, they are not reported anywhere in the dedicated Task Scheduler interface or in Autoruns.

Other entries found:

    • Microsoft\Windows\CertificateServicesClient\SystemTask
    • Microsoft\Windows\Customer Experience Improvement Program\UsbCeip
  • USER
    • Microsoft\Windows\CertificateServicesClient\UserTask
  • KEYROAMING
    • Microsoft\Windows\CertificateServicesClient\UserTask-Roam
  • <![CDATA[$(Arg0)]]>
    • Microsoft\Windows\SideShow\GadgetManager
  • <![CDATA[$(Arg1)]]>
    • Microsoft\Windows\Media Center\MediaCenterRecoveryTask
    • Microsoft\Windows\Media Center\ObjectStoreRecoveryTask
    • Microsoft\Windows\Media Center\PvrRecoveryTask
    • Microsoft\Windows\Media Center\PvrScheduleTask
    • Microsoft\Windows\Media Center\SqlLiteRecoveryTask
  • PageNotZero
    • Microsoft\Windows\MemoryDiagnostic\CorruptionDetector
  • Decompression
    • Microsoft\Windows\MemoryDiagnostic\DecompressionFailureDetector
  • <![CDATA[Logon]]>
    • Microsoft\Windows\Offline Files\Logon Synchronization
  • $(Arg0)
    • Microsoft\Windows\RAC\RacTask
    • Microsoft\Windows\Task Manager\Interactive

So, if you come across weird command line arguments used by taskhost.exe, the Tasks folder is the place to look. Note that the CDATA notation, which I left intact (copied directly from the files), will not be present in the logs. As such, if you see e.g. ‘taskhost.exe KEYROAMING’, it is coming from the following entry:

  • Microsoft\Windows\CertificateServicesClient\UserTask-Roam
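
The lookup from an argument back to its task can be automated. The sketch below walks a copy of the Tasks folder, parses each file as Task Scheduler XML and collects the Data value of every ComHandler action. It assumes the standard Task Scheduler XML namespace shown in the code and that non-XML files can simply be skipped; the function name is made up. The XML parser resolves CDATA sections, so the values come out the way they appear in the logs.

```python
import os
import xml.etree.ElementTree as ET
from collections import defaultdict

# Namespace used by Task Scheduler XML files.
TASK_NS = "{http://schemas.microsoft.com/windows/2004/02/mit/task}"


def com_handler_data(tasks_root):
    """Map each ComHandler Data value to the task files that use it."""
    mapping = defaultdict(list)
    for dirpath, _dirs, files in os.walk(tasks_root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                tree = ET.parse(path)
            except (ET.ParseError, OSError):
                continue  # not XML or not readable; skip it
            for handler in tree.iter(TASK_NS + "ComHandler"):
                data = handler.find(TASK_NS + "Data")
                value = (data.text or "").strip() if data is not None else ""
                if value:
                    mapping[value].append(os.path.relpath(path, tasks_root))
    return dict(mapping)
```

With the result in hand, `com_handler_data(tasks_root).get("KEYROAMING")` returns the task files whose Data value is that argument.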