
When your TODO list is always short of something…

June 1, 2019 in Archaeology, Clustering, threat hunting

There is one place Windows coders visit less often than any other: their program’s resource section. This is either a sacred or a scary place, depending on the coders’ inner fears or strengths – but no matter which one it is, with this post I want to draw your attention to the program’s VERSIONINFO section.

If a program is coded in Visual Studio, its versioninfo is typically pre-populated with a bunch of boilerplate TODO strings.

It is the place where we specify the name of the program, its version and description, add a copyright notice, add the name of the company that produced it, etc. Some programmers adjust these automatically via intricate command line tools or scripts, some edit these manually. Some… forget to do so…

Now… creating software is exciting. Writing your new, revolutionary code is an exhilarating experience (or a dull one, if you are bored to death in your job), and with all these cool (boring) features you are adding so quickly, the poor versioninfo section is typically left behind.

Until the day you sign.

I mean, the day you literally sign your release .exe and then… it becomes IMMORTAL.

Why immortal?

It’s signed… kinda forever. Certificate revocations happen quite rarely. And if legitimate, your .exe will most likely end up on many popular download and mirroring sites, and/or in AV and EDR repositories. And if it has a built-in vulnerability, backdoor, or some stupid logic flaw or dependency, and worse – if there is an active exploit/trick for it – you can be sure that it will become a long-living part of the arsenal of attacker groups. And if you still have doubts… nothing is ever permanently deleted from the internet anyway 😉

Back to your signed file…

If you have not edited these versioninfo properties, they will make it into the signed build. If there is no proper QC process, the version information bits that you were supposed to fill in early in the process will be left untouched, e.g.:

  • TODO: File Description
  • TODO: Company Name

Yup. Anyone viewing the properties of your signed file will see these placeholders.

It’s ugly. And… it’s so… common.

Over the years I have seen many signed binaries with the version info populated with the default ‘TODO’ boilerplate. I find it to be a truly interesting phenomenon. I mean, who would forget it before signing? Right?

I was curious how many companies actually did forget to change that boilerplate at some stage in the past. Obviously, I don’t have all the files of the world, so the list below is biased, but on the other hand, I have been collecting clean files for many years now… so I think it’s still kinda representative – especially since many big names are on it.

The examples are mostly a single hash per company, but there are obviously more hashes per company:

  • Acer Incorporated
    • 2C11C7769AFBA91795309335E95364610B64A0D5
  • Advanced Micro Devices
    • E7C988AAE8A28537418BAAF9C6D627F1E9A71D3D
  • Alps Electric Co.
    • F0081345B04426F2327CC83BCB78549F266D02B2
  • ASUSTeK Computer Inc.
    • 4EFE7101C6E34FF6654B27C0E356B3906077228F
  • AzureWave Technologies
    • B502DF357711B544A9D358F4BBD5A570AC94A2D8
  • Broadcom Corporation
    • 163BD9619BE13A149CD0FC23E9591D0B557FF8AC
  • Clearwire Corporation
    • 3A34C830547BF851D803351D9F0F9FE5A756ECAB
  • Conexant Systems
    • B0464EA6E85DCF73BCDB7F59E2FE39E1F30F0BC6
  • CyberLink
    • E9376D0887317F7294887AD51C40DB8C775D1439
  • Dell Incorporated
    • 404E4A8C86A6E45328B324B5BAC0CA1022C46F7C
  • DTS
    • 3FDB841A79DC5A19581608A9F7DC277D646DD2B9
    • C3E0A08681282C2F07647B62607FC02F684E2302
  • Hewlett Packard
    • 9A947DB8463B010CE60FAE48C995C2349E913026
  • HTC Corp.
    • C2702513C3B1E8782A5C09845EA3AD9DD6150272
  • Huawei Technologies Co.
    • D4BA877C07342C2EE4DD07D35A3FCF065C28A7D9
  • Intel Corporation-Mobile Wireless Group
    • CB92B9366322BADD7FE8468586D6BCF5F3EDD267
    • 05B5F6413FE4AE449007B6FC70FBB196DA5F3881
  • Lenovo Information Products (Shenzhen) Co.
    • 71F10E73D71A651CC2A31B5D5B4A4710A386E419
  • Logitech
    • A81A76EFF455966DBAFEFE33AFA0005B128652A3
  • Magic Control Technology Corp.
    • 334367D94B8096BE8109DBAA7B2DF43CF7EE4DB3
  • Microsoft Windows Hardware Compatibility Publisher
    • 638A2003B302900AF56AF7BE9FB83F6EC022F391
  • Penpower Technology Ltd.
    • 10ECABAE634F48F57594BE5A72723415B3FC2192
  • Ralink Technology Corporation
    • E75E67EF40D83C4819E47F138AA3AE24955787EF
  • Realtek Semiconductor Corp
    • D9F1DCC120C684D233AA7F2C234AC6A3C35C0B4C
  • Samsung Electronics CO.
    • A37D12C80ACF024F6F0D1EABD919BB3149C2B4EC
  • Sentelic Corporation
    • B53564DE1380AF1169F8D356E8BC8C917310F627
  • Softex Incorporated
    • 679528394EF830468E2A335C597CD75E8096B8CD
  • Stardock Corporation
    • 6CF78EF4597F9646B01654AD5ABFE4F73A93006B
  • VIA Technologies Inc.
    • A8CBD410AEA6A7C5E850B8659725E736BBFAA30F
  • Wave Systems Corp.
    • EB357F2096537F6AA1A24366258F781AD5DE9CFF

And I list these not to shame the companies. As I mentioned – I think it’s an interesting phenomenon from a sample clustering perspective, and I don’t think it really affects security in any significant way. At the end of the day, a signed file is still a signed file so you can be sure it’s ‘better’ than an unsigned one. Yes, there are malware families signed with a stolen or a forged certificate, but still, you will always be better off trusting signed files more than non-signed ones.

Still… how does it affect us, Blue teamers?

I don’t have all the answers and welcome comments. Obviously, adding exclusions/whitelists that treat ‘TODO’ strings as a trusted ‘Signed Publisher’ is not a good idea. This means we need to whitelist these files by hashes or paths instead.

If found on the investigated system, these files may attract the attention of forensic investigators as well (e.g. if they look at autostart records and detect entries that point to signed TODO executables). After so many heavily publicized reports about stolen certs and supply chain attacks, it’s only natural to immediately form a hypothesis along the lines of ‘if I see TODO: strings, this system was probably attacked by someone using malware signed with a stolen cert’.

Finally, it also exposes a less-discussed weakness of the signed files:

  • We often forget that there are programmers behind every signed file
  • These coders make mistakes; many mistakes

– a.k.a. if there is a bug in versioninfo, there could be a bug in the code. Or… it could be a feature :-).

Last, but not least: if you produce software on a regular basis, please add a QC stage to your process where the release build is always checked for TODO entries. Kill the build if needed, because it is a proper build breaker. Fix it. Release good releases. Thank you.
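A minimal sketch of such a build-breaking check: version resource strings are stored as UTF-16LE inside the PE file, so a raw byte search for "TODO:" in that encoding can catch leftover boilerplate without parsing the resource tree. The function names are hypothetical, and a search this crude will also flag "TODO:" text stored elsewhere in the binary, so treat hits as something to review rather than proof.

```python
import re

# "TODO:" encoded as UTF-16LE, followed by any run of printable ASCII
# characters (each one stored as a byte plus a zero byte in UTF-16LE).
_TODO_RE = re.compile(
    re.escape("TODO:".encode("utf-16-le")) + rb"(?:[\x20-\x7e]\x00)*"
)


def find_todo_strings(data: bytes) -> list[str]:
    """Return every UTF-16LE 'TODO: ...' string found in the raw bytes."""
    return [m.group(0).decode("utf-16-le") for m in _TODO_RE.finditer(data)]


def has_todo_boilerplate(path: str) -> bool:
    """True if the file at `path` still carries TODO: placeholder strings."""
    with open(path, "rb") as fh:
        return bool(find_todo_strings(fh.read()))
```

A release script could call has_todo_boilerplate() on every artifact just before signing and exit with a non-zero status on the first hit.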

Event, Event on the wall, who’s the fairest of them all? Part 2

May 30, 2019 in threat hunting

In part 1 I highlighted possible unexplored areas where we can look for additional interesting Events. Programmatic access to Event Log templates that Brian pointed me to makes this analysis even more straightforward (and desirable!).

Let’s start with the bad news first.

After exporting all the unique field names on Windows 10 I realized there are over 600 unique items across all the available templates. That’s a lot. It is hard to find a common denominator between all of them.

These fields will be most likely localized (not sure if they finally fixed it in new versions of Windows, or plan to, because it’s a major pain in the neck).

Also, if you study the output file (generated via the PowerShell snippet shared in part 1) you will notice that these templates exist in different versions, hence some of the fields will not always be available in the logs we have. It makes ingestion of these logs a bit more problematic too (parsers need to cater for different versions). Plus, your queries will have to take it into account as well.

  • 4688 version 0
  • 4688 version 1
  • 4688 version 2
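To see what this means for parsers and queries, here is a rough sketch that works out which fields are safe to rely on across all versions of one Event ID. The field sets below are abbreviated, illustrative placeholders rather than a dump of the real templates (take the real lists from your own export), and the function names are hypothetical.

```python
# Illustrative, abbreviated field sets per template version of Event ID 4688.
# These are placeholders for the sketch; use your own template export instead.
TEMPLATES_4688 = {
    0: {"SubjectUserName", "NewProcessId", "NewProcessName", "ProcessId"},
    1: {"SubjectUserName", "NewProcessId", "NewProcessName", "ProcessId",
        "CommandLine"},
    2: {"SubjectUserName", "NewProcessId", "NewProcessName", "ProcessId",
        "CommandLine", "ParentProcessName"},
}


def common_fields(templates: dict[int, set[str]]) -> set[str]:
    """Fields present in every version, i.e. safe to query unconditionally."""
    return set.intersection(*templates.values())


def version_specific_fields(templates: dict[int, set[str]]) -> dict[int, set[str]]:
    """For each version, the fields that are not available in all versions."""
    shared = common_fields(templates)
    return {version: fields - shared for version, fields in templates.items()}
```

Anything returned by version_specific_fields() is a field that a parser should treat as optional and a query should not assume is present.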

Finally, these field names have a long way to travel before they reach your log aggregation system, and they rarely arrive named the way they were originally. It is almost a given that these fields will be named differently than in the original templates (e.g. AccountName will become AcctName, acct, useraccount, etc.). Hence, you need to dig up the actual field names used by your log aggregation system and match them against the templates. If you are only just starting to use a log aggregation system, pay attention and influence the decisions that will ensure these fields are named the same way as in the templates, wherever possible!
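Matching renamed fields back to the templates can be as simple as a lookup table. The sketch below assumes a hand-maintained alias map built from the example names above; the map contents and the function name are hypothetical and would need to reflect whatever your own pipeline actually emits.

```python
# Hypothetical alias table: keys are lower-cased names as they arrive in the
# log aggregation system, values are the names used in the original templates.
FIELD_ALIASES = {
    "acctname": "AccountName",
    "acct": "AccountName",
    "useraccount": "AccountName",
}


def to_template_names(event: dict[str, str]) -> dict[str, str]:
    """Rename known aliases to template field names; keep other keys as-is."""
    return {
        FIELD_ALIASES.get(key.lower(), key): value
        for key, value in event.items()
    }
```

For example, to_template_names({"AcctName": "alice", "EvID": "4624"}) renames the first key to AccountName and leaves EvID alone.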

Last, but not least – not all of these events will be set up (they won’t be logged), not all of them will be properly forwarded even if logged, and not all of them will be delivered in a unified way across all the systems. This means a constant battle to audit our log sources and confirm that we still ‘see’ things, and on all the assets we want.

For the better news.

In my previous post I mentioned Event IDs that include references to process names. These process names do not always mean exactly the same thing (sometimes it’s a full file path, sometimes a DLL name, or a component name), but we at least kinda know what to expect:

  • CallerProcessName
  • LogonProcessName
  • NewProcessName
  • ParentProcessName
  • ProcessName
  • TargetProcessName

With that info we can very quickly sift through our data and see what useful events we can find. From there, it’s not far to actual alerts and dashboards.

Here’s an example SPL query for statistics of events that include a process name one way or another:

EvID=4611 OR EvID=4615 OR EvID=4616 OR EvID=4624 OR EvID=4625 OR EvID=4648 OR
EvID=4649 OR EvID=4656 OR EvID=4657 OR EvID=4658 OR EvID=4660 OR EvID=4661 OR
EvID=4663 OR EvID=4670 OR EvID=4673 OR EvID=4674 OR EvID=4688 OR EvID=4689 OR
EvID=4696 OR EvID=4703 OR EvID=4798 OR EvID=4799 OR EvID=4818 OR EvID=4904 OR
EvID=4905 OR EvID=4907 OR EvID=4911 OR EvID=4913 OR EvID=4985 OR EvID=5039 OR
EvID=5050 OR EvID=5051 OR EvID=5712 OR EvID=6417 OR EvID=6418
| stats count by EvID

We can also look at top 100 events:

EvID=4611 OR EvID=4615 OR EvID=4616 OR EvID=4624 OR EvID=4625 OR EvID=4648 OR
EvID=4649 OR EvID=4656 OR EvID=4657 OR EvID=4658 OR EvID=4660 OR EvID=4661 OR
EvID=4663 OR EvID=4670 OR EvID=4673 OR EvID=4674 OR EvID=4688 OR EvID=4689 OR
EvID=4696 OR EvID=4703 OR EvID=4798 OR EvID=4799 OR EvID=4818 OR EvID=4904 OR
EvID=4905 OR EvID=4907 OR EvID=4911 OR EvID=4913 OR EvID=4985 OR EvID=5039 OR
EvID=5050 OR EvID=5051 OR EvID=5712 OR EvID=6417 OR EvID=6418
| fillnull value="" CallerProcessName LogonProcessName NewProcessName
ParentProcessName ProcessName TargetProcessName
| head 100
| table _time, EvID, CallerProcessName, LogonProcessName, NewProcessName,
ParentProcessName, ProcessName, TargetProcessName

As you run it in e.g. Splunk (Verbose mode), you can start adding additional fields that show up, and also remove Event IDs that are too noisy (set them aside for more targeted analysis).

The goal is to find rare events for immediate alerts, noisy events that nonetheless offer good filtering opportunities, and finally any others that can enrich our detections, even if only by being present on a detailed timeline.
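Once the per-Event-ID counts are exported, a first triage pass along these lines could look like the sketch below. The thresholds and the function name are hypothetical and need tuning for your environment; the split is purely by volume and says nothing about how useful an event actually is.

```python
from collections import Counter


def triage_event_ids(event_ids, rare_max=5, noisy_min=1000):
    """Split Event IDs into rare, noisy and other buckets by occurrence count."""
    counts = Counter(event_ids)
    buckets = {"rare": [], "noisy": [], "other": []}
    for event_id, count in sorted(counts.items()):
        if count <= rare_max:
            buckets["rare"].append(event_id)      # candidates for immediate alerts
        elif count >= noisy_min:
            buckets["noisy"].append(event_id)     # need filtering before use
        else:
            buckets["other"].append(event_id)     # possible timeline enrichment
    return buckets
```

The input is simply one Event ID per logged event, e.g. the EvID column of an exported search result.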

Here’s a list of other interesting field groups to play with:

  • Network IP/Addresses

ClientAddress, ClientIPAddress, DestAddress, IpAddress, IpAddresses, IpPort, IpProtocol, LocalAddress, NASIPv4Address, NASIPv6Address, PeerPrivateAddress, RemoteAddress, RemoteIpAddress, RemotePrivateAddress, SourceAddr, SourceAddress, SourcePort, TargetName, TargetServer, TargetServerName

  • Paths

HomePath, KeyFilePath, ObjectPath, ObjectVirtualPath, ProfilePath, ScriptPath, ShareLocalPath

  • Algorithms

AlgorithmName, CryptoAlgorithms

  • Status/Result

AuditStatusCode, EAPErrorCode, Error, ErrorCode, FailureCode, FailureReason, LoggingResult, QuarantineSystemHealthResult, ReplicationStatusCode, SecurityError, Status, StatusCode, SubStatus

  • Packages

AuthenticationPackageName, LmPackageName, NotificationPackageName, PackageName, SecurityPackageName

  • Dates/Timestamps

ClientCreationTime, Duration, ExpirationTime, LockoutDuration, MembershipExpirationTime, MMLifetime, NewDate, NewTime, PreviousDate, PreviousTime, ProcessCreationTime, QuarantineGraceTime, TGT Lifetime

Unfortunately, I don’t have a ready-to-use recipe for all the events extracted from templates (400+ IDs!). Some are obviously uninteresting, some are interesting, but not feasible to use due to volumes. Others could be very interesting, but legitimate software written in an old-school way is indirectly abusing them (e.g. requesting higher privileges than needed by default and this is immediately logged, often many times per minute).

Another thing is that even within a single Event there may be subgroups that we could focus on (a trivial example: filtering by LogonType can narrow down logon events, but there is more).

Still, we can try to come up with some bundles of interesting events:

Code Integrity related:

  • 5038 Code integrity determined that the image hash of a file is not valid. The file could be corrupt due to unauthorized
  • 6281 Code Integrity determined that the page hashes of an image file are not valid. The file could be improperly signed without
  • 6410 Code integrity determined that a file does not meet the security requirements to load into a process.

Attack related:

  • 4618 A monitored security event pattern has occurred.
  • 4649 A replay attack was detected.
  • 4961 IPsec dropped an inbound packet that failed a replay check. If this problem persists, it could indicate a replay attack
  • 5148 The Windows Filtering Platform has detected a DoS attack and entered a defensive mode; packets associated with this attack
  • 5149 The DoS attack has subsided and normal processing is being resumed.
  • 5479 The IPsec Policy Agent service was stopped. Stopping this service can put the computer at greater risk of network attack or expose the computer to potential security risks.

Policy-violation related:

  • 6423 The installation of this device is forbidden by system policy.
  • 6424 The installation of this device was allowed, after having previously been forbidden by policy.

Other possibly interesting:

  • 4612 Internal resources allocated for the queuing of audit messages have been exhausted, leading to the loss of some audits.
  • 4695 Unprotection of auditable protected data was attempted.
  • 4793 The Password Policy Checking API was called.
  • 4797 An attempt was made to query the existence of a blank password for an account.
  • 4864 A namespace collision was detected.

And as I am finishing this post I am really curious if anyone has ever attempted to build flowcharts that would map Event IDs to the actual lifecycle of activities happening on Windows. While some events are atomic (e.g. a system time change), many events are clustered together around the lifecycle of networks, systems, logons, services, accounts, groups, tickets, policies, certificates, etc.

Finally, one thing that makes for an interesting observation: grepping the templates for words like ‘virus’, ‘malware’, ‘threat’ I find nothing. This confirms that the primary role of Windows Events is not supporting threat hunting activities. While we all suffer and complain about the noise they generate, let’s be grateful that they are out there.
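The keyword check is easy to repeat on your own template export. Here is a small sketch, using a few event descriptions quoted earlier in this post as stand-in data; the function name is hypothetical and the real input would be the full export.

```python
# A few event descriptions quoted earlier in this post, keyed by Event ID.
DESCRIPTIONS = {
    4649: "A replay attack was detected.",
    5149: "The DoS attack has subsided and normal processing is being resumed.",
    6423: "The installation of this device is forbidden by system policy.",
}


def grep_templates(descriptions, keywords):
    """Return sorted Event IDs whose description contains any keyword
    (case-insensitive substring match)."""
    lowered = [keyword.lower() for keyword in keywords]
    return sorted(
        event_id
        for event_id, text in descriptions.items()
        if any(keyword in text.lower() for keyword in lowered)
    )
```

On this sample, searching for 'virus', 'malware' and 'threat' returns nothing, while 'attack' returns 4649 and 5149.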