The story of an underTAG that tried to wear a mitre…

Today we tag everything with MITRE ATT&CK techniques. I like it, but I’d also like a bit more flexibility. So I mix the ‘proper’ MITRE tags with my own (not only sub-technique tags, but also my own, more specific tags).

Why?

Say… you detect that System.Management.Automation.dll or System.Management.Automation.ni.dll is loaded into a process. Does it mean it is T1086: PowerShell? Or does it mean that the process is using a DLL that offers automation capabilities, and may not even be doing anything dodgy? I suspect (s/suspect/know/) that calling it T1086: PowerShell w/o providing that extra info loses a lot of context.

Why not call it what it is: <some magic prefix>: PowerShell Automation DLL?

Is loading WinSCard.dll always an instance of T1098: Account Manipulation, or of T1003: Credential Dumping? Or is it just a DLL that _may_ be used for nefarious purposes, but most of the time is not? (Lots of legitimate processes use it; analysing Sysmon logs is a nice eye-opener here.)

Why not call it <some magic prefix>: Smart Card API DLL?
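
Just to illustrate the idea, here is a minimal sketch of such dual tagging in C. The ‘MY’ prefix, the lookup table, and the mappings are all made up for the example – the point is simply that every event carries both the descriptive tag and the ATT&CK one:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical mapping: each module load gets BOTH a descriptive tag
   (what the DLL actually is) and the ATT&CK technique it *may* indicate. */
struct tag_map {
    const char *module;
    const char *custom_tag;
    const char *attack_tag;
};

static const struct tag_map tags[] = {
    { "system.management.automation.dll",
      "MY: PowerShell Automation DLL", "T1086: PowerShell" },
    { "winscard.dll",
      "MY: Smart Card API DLL",        "T1003: Credential Dumping" },
};

/* Emit both tags, so the analyst sees the context, not just the scariest
   possible interpretation (_stricmp is the MSVC case-insensitive compare). */
static void tag_event(const char *module)
{
    for (size_t i = 0; i < sizeof(tags) / sizeof(tags[0]); i++) {
        if (_stricmp(module, tags[i].module) == 0) {
            printf("%s | %s\n", tags[i].custom_tag, tags[i].attack_tag);
            return;
        }
    }
    printf("(untagged) %s\n", module);
}

int main(void)
{
    tag_event("WinSCard.dll"); /* -> MY: Smart Card API DLL | T1003: ... */
    return 0;
}
```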

As usual, at the end of this tagged event, or cluster of events, there is a poor soul – the underdog of this story – who is tasked with analysing it. If our tagging is as conservative as the mindset of politicians who studied at Eton… then so will be the quality of the analysis, the statistics, and the actual response.

And it is easy to imagine the confusion of analysts seeing events tagged with a vague name. For example, a net.exe command that accesses user/account data, plus the loading of WinSCard.dll, may make them assume there is a cluster of ‘account manipulation’ events indicating an attack. Triaging such a vague cluster of events is time wasted… There is a response cost, and there is an opportunity cost at play here. The example is over-simplistic, of course, but the devil is in the details. Especially for the analysts.

I’d say… given the way most events are logged today, often w/o much correlation or data enrichment at the source, the event collection process should make every possible attempt to contextualize every single event as much as it can.

We can argue that combining data at its destination – in aggregation systems, SIEMs, collectors, or even ticketing systems, or on the way to them – is possible and, actually, desirable… but today’s reality is that we need that single event to provide as many answers as possible…

This is because we know that data enrichment at the destination is a serious pain in the neck and relies heavily on a lot of dependencies (up-to-date asset inventories, lists of employees, DHCP mappings; then there is the lack of, or poor, support for nested queries and their poor performance, which forces us to use a lot of lookup tables that need to be kept up to date and require a lot of maintenance). And if we need to go back to the system that generated the event to fetch additional artifacts, or to enrich data manually, then we are probably still stuck in a triage process A.D. 2010…
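
For illustration, a toy sketch of what enrichment at the source can mean in practice: the sensor attaches whatever context is free at collection time, so nobody has to join it in later. The event fields are made up; only the two Windows API calls are real:

```c
#include <windows.h>
#include <stdio.h>

#pragma comment(lib, "advapi32.lib") /* GetUserNameA lives in advapi32 */

int main(void)
{
    char host[MAX_COMPUTERNAME_LENGTH + 1] = "?";
    char user[256] = "?";
    DWORD hlen = sizeof(host), ulen = sizeof(user);

    /* Context that costs nothing at the source, but would require
       lookup tables (and their maintenance) at the destination. */
    GetComputerNameA(host, &hlen);
    GetUserNameA(user, &ulen);

    printf("{\"event\":\"image_load\",\"image\":\"winscard.dll\","
           "\"tag\":\"MY: Smart Card API DLL\","
           "\"host\":\"%s\",\"user\":\"%s\"}\n", host, user);
    return 0;
}
```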

So… if we must tag stuff, let’s make it work for us and our analysts, and let it act as the true data enrichment tool it was meant to be… If the event, the cluster of events, or the detection based on them is not actionable, then it’s… noise.

A short wishlist for tool writers

2019-01-20: Updated to add Brian‘s suggestion.

Prologue

We have so many tools now. It’s like almost every week there is a new tool, a plug-in, or an update announced.

Happy days!

Yet, many of them still surprise us with deficiencies that should be very well understood by now. Ones that we should not really see in 2019 anymore.

Exhibit #1.

No binaries.

Rant:

Yes, we know that everyone loves compiling binaries from sources, but seriously… who does, really?

Especially since it compiles in your build environment but doesn’t in others. And obviously everyone who wants to test your program is an experienced developer.

Re-creating your build environment means installing the same compilers, make tools, and dependencies – often libraries or packages that are no longer obtainable in the versions you have installed. At this stage the build environment is already different from yours.

And once the environment is ready, everyone loves spending time fixing various code issues, suppressing or ignoring warnings, adding missing header files, and often even modifying build commands.

It often takes a few hours. (In fairness, I must bow here to the Mimikatz and SQLite3 authors – their code compiles like a charm!)

Wishlist item:

If you build a tool for yourself, keep it to yourself. If you write for everyone, then please ship binaries that give everyone a chance to try your software w/o wasting time on building. Not everyone who will is a developer. Not everyone who is a developer will.

Exhibit #2.

Missing dependencies.

Rant:

@#$%^&*

Wishlist item:

Static linking, or adding a prerequisite to install the Microsoft Visual C++ Redistributable Package, may help in this case. (With MSVC, for instance, building with /MT instead of /MD links the C runtime statically, so there is nothing to redistribute.)

But actually…

This is not a library problem. It is a testing problem.

Yes, everything works perfectly in your build environment, but have you tried running it outside of it? The dependency issue would be immediately visible if you tried to run it on a plain vanilla OS, ideally a few of them, and, lo and behold, if you tried non-English OS versions too.

Exhibit #3.

Portability and backward compatibility are dead.

Your program only runs on Windows version XYZ.

Rant:

So, you write the tool with the latest, shiniest compiler. It just happens to introduce requirements or dependencies that make it work only on a specific Windows version or newer. Nothing in the program actually uses any features specific to Windows version XYZ or newer, as it relies on Windows APIs available since Windows 95, but… the program won’t work on older versions of Windows.

Wishlist item:

Of course, test it on old versions. If it doesn’t work, find out why:

  • Is it the compiler? See if you can change flags or settings. Can you use an older version of the compiler?
  • Is it static linking to a DLL or an API introduced in recent years? Load the DLL / resolve the API dynamically (see the sketch after this list). Note that many Sysinternals tools are still backward compatible because they do exactly this!
  • Is it an overly demanding MajorOperatingSystemVersion / MinorOperatingSystemVersion pair (an OS version requirement that is simply too high w/o any reason)? Adjust them automatically during the build process.
  • Is it a dependency on a library that doesn’t work on older versions of Windows? Fair enough: you can either find a different library, or make it clear that this is the reason the software doesn’t work on older versions.
  • Also, double-check whether the troublesome library provides a service that is used all the time, or only in certain, rare instances; consider making those features dependent on the availability of the library, loaded dynamically; this way the program can still do most of its work on older OS versions.
  • Test the final product on older versions of Windows.
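
To make the second bullet concrete, here is a minimal sketch of dynamic API resolution. GetTickCount64 (kernel32, Vista and newer) stands in for any ‘too new’ API – if it is not there, the program degrades gracefully instead of refusing to load:

```c
#include <windows.h>
#include <stdio.h>

typedef ULONGLONG (WINAPI *GETTICKCOUNT64)(void);

int main(void)
{
    /* kernel32.dll is always loaded, so GetModuleHandle is enough. */
    HMODULE k32 = GetModuleHandleA("kernel32.dll");
    GETTICKCOUNT64 pGetTickCount64 =
        (GETTICKCOUNT64)GetProcAddress(k32, "GetTickCount64");

    if (pGetTickCount64) {
        printf("Uptime: %llu ms\n", pGetTickCount64());
    } else {
        /* API not present on this OS: fall back to the 32-bit variant
           that has existed since Windows 95. */
        printf("Uptime: %lu ms\n", GetTickCount());
    }
    return 0;
}
```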

Exhibit #4.

Tools crashing. Often during the first run.

Rant:

@#$%^&(*

Wishlist:

  • detect & resolve dependencies (e.g. on specific .NET versions)
  • handle these exceptions (see the sketch after this list); this is so much easier now than 20 years ago – it’s a built-in feature for us to use
  • more code review?
  • more error checking?
  • more testing in general?
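
For the exception handling item, a minimal sketch using structured exception handling – the built-in feature in question (MSVC-specific __try/__except; do_work() is a placeholder for the tool’s real job):

```c
#include <windows.h>
#include <stdio.h>

/* Placeholder for the tool's real job: parsing untrusted input,
   touching optional components, etc. */
static int do_work(void)
{
    return 0;
}

int main(void)
{
    __try {
        return do_work();
    } __except (EXCEPTION_EXECUTE_HANDLER) {
        /* A message and a clean exit instead of a crash dialog
           on the user's very first run. */
        fprintf(stderr, "Unexpected error 0x%08lX - please report it!\n",
                GetExceptionCode());
        return 1;
    }
}
```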

Exhibit #5.

Tools showing so much that they eventually show nothing.

Rant:

We know that many tools are POCs, so it’s hard to let go. Any time you look at a problem, e.g. file format parsing, you just have to include all the possible fields, highlight all the nuances your tool can extract or understand, and, of course, send it all to the output. Users will enjoy it.

You also need to make the output fancy: format it, colorize it, add an ASCII art logo and a copyright banner, and use Unicode output characters that show up nicely on your terminal, the one configured to use a Unicode font, etc.

Wishlist item:

By all means, add verbose/debug logs to your tool, but think of the users. What is it that they want from your tool? How do they use it? Who really are these users? Noobs, experienced practitioners, advanced pros, hardcore hackers? For example, for a PE parser, do they need to know all the gory characteristics of the file when they only want its very basic properties (the ones that can drive their next steps in analysis)? What are the real and most common use cases?

Less is often more.

  • Add debug/verbosity, but make it optional (see the sketch after this list).
  • Consider saving such logs to a file, not to standard output.
  • Add options to disable copyright banners.
  • Avoid whistles and fireworks. UI metaphors have gone down the drain over the last few years, but we can still try to make the UI user-friendly.
  • Think of the audience. Ask the audience what works, and what doesn’t.
  • Learn from other tool makers.
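
And a minimal sketch of the first two bullets combined: terse output by default, details only on request, and diagnostics kept off standard output (the flag names are, of course, just an example):

```c
#include <stdio.h>
#include <stdarg.h>
#include <string.h>

static int verbosity = 0; /* 0 = answers only */

/* Diagnostics go to stderr so the primary output stays clean and
   parsable; redirecting them to a log file is then trivial. */
static void vlog(int level, const char *fmt, ...)
{
    if (level > verbosity)
        return;
    va_list ap;
    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
}

int main(int argc, char **argv)
{
    for (int i = 1; i < argc; i++) {
        if (strcmp(argv[i], "-v") == 0)  verbosity = 1;
        if (strcmp(argv[i], "-vv") == 0) verbosity = 2;
    }

    printf("file.exe: PE32, 3 sections\n");          /* the answer */
    vlog(1, "parsed the PE optional header\n");      /* the detail */
    vlog(2, "raw DOS stub: 64 bytes\n");             /* the gore   */
    return 0;
}
```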

Exhibit #6.

Help.

Rant:

How do I use your tool? Twitter animations are cool, a YouTube video with your presentation is great, screenshots are fantastic. But… can I have basic written documentation, please?

What problems does it try to solve? This should be the first line of any documentation. You can’t assume that everyone who visits your page knows.

What file formats are supported? What is the desired output? What are the known issues? What is the competition or, if you don’t want to write about them, how do I verify the output of your tool? What are the references to documents, functional specifications, and older work that you relied on?

Finally, what are the command line arguments? Give a loooong list of examples. Be generous. Treat users like they have never seen a computer before. It will save them a lot of time.

Seriously, this is such a pain that it boggles my mind how much time I sometimes spend looking for an example usage of some tools. One that actually works. Because the ones that don’t are endless. And how many times have I actually had to reverse engineer a binary to discover the proper usage for that particular version…

And if you want a good example of how to do it right, look at the programs written by NirSoft. They all share a similar GUI. They also have very good documentation pages, all of which are quite uniform, so reading one makes it easy to read the others. They include very detailed information about the command line arguments the programs support. You will also find a lot of information about known issues, requests for feedback, licensing info, and lots and lots of useful hints on how to use the programs.

Exhibit #7.

Missing feedback opportunities.

Brian provided a good suggestion to add to the list – a suggestion aimed more at users than at tool developers, but it reminds me that feedback is not always easy to provide, because developers simply forget to tell us how.

Rant:

If your program doesn’t work on my system, or with my samples, how can you learn about it if there is no way to provide feedback?

Wishlist item:

An email address, a Twitter handle, or enabled bug reporting/comments on your repo/blog will do wonders.

And if you are the user, please provide feedback to the tool writers!

Exhibit #8.

Not everything is a tool.

Rant:

So, you used 20 libraries and wrote 50 lines of code that read JSON, convert it to XML, transform it with AI, and then output a movie rendering a unicorn, printed on a virtual 3D printer, galloping over the rainbow.

It’s great you are proving yourself. It’s great you are trying. It’s great the program you wrote works for you. There is no sarcasm here. This is how we all learn.

You are in luck. There are so many libraries available now that writing code that performs extremely complex tasks in just a few lines is trivial. We should respect that.

AND

If there are 50 other programs doing exactly the same thing, and often doing it better. If the tool is half-baked for the sake of a demo at a con, and/or only works well on your test data. If the whole thing is so simple that any average developer could implement it w/o much trouble. If the actual snippets can be found on Stack Overflow. Then please, please do not call it a tool. Calling it a POC is enough.

This is not to discourage you from coding. This is to encourage you to assess your code’s quality & usefulness in the grand scheme of things. If you claim that your software can do a specific thing, assess a certain quality of some data, or extract certain properties, and someone can quickly prove that these are not done right because of the limited scope of the original idea that drove the development, then as a POC it is still a very valuable asset, but as part of somebody’s toolbox it is completely useless. Hence, not a tool.

Epilogue:

Okay, it’s easy to rant and play the blame game. I have coded lots of bad programs myself, and some of them are still available on this website. Making everyone happy is very hard. Programming itself is actually very hard. Testing is even harder. And writing documentation is the most undesirable task coders have to face.

BUT.

I think there is a minimum of responsibility to bear for anyone who releases programs publicly and announces them to the world. It is basic empathy for the users and their needs.

And if you release code, a POC, a tool, or a suite, and it can actually be tested quickly, won’t crash on its first run, and delivers the expected output, you will actually make a dent in the industry.