The art of writing (for IT Sec)

When I wrote my first DFIR report it was terrible. After receiving the commented version back from my reviewers my heart sank. I felt I was not going to make it. While I love the technical and investigative bits, and had some good wins on that particular investigation… somehow, I was unable to communicate them. And since I had always liked to write I was really surprised (a.k.a. shocked a.k.a. ego-hurt-badly).

All these hours of work put into the report didn’t matter, all these cool technical bits I described didn’t matter – when the doc came back to me it was pretty much a different document… Yup, so many comments and corrections. I literally couldn’t see my original content. There was so much ‘Adam, you are doing it wrong’… Ouch.

I must add that it was for a Law Enforcement case, so it was a big deal.

I went back and forth on these comments with my reviewers and finally…

  • Got that report into decent shape & submitted it to the LE
  • Realized that writing for the general public or blogging is not the same as writing for DFIR, especially for LE

And it became especially clear when I received a letter to show up in court and to testify… Imagine my horror. I was a noob and yes, that absolutely terrible report was going to be talked about. And I would be questioned on its content…

Holy cow.

It’s actually pretty intimidating. Confidence from the safety of your home or office seat is one thing, but talking about your work in Court is something completely different. The guys who ask you questions will try to break you and show you as an incompetent clown, and your report and work may lose credibility… After a mandatory panic attack I started asking around. Some of my peers had been through this before and gave me many hints: only answer the questions asked; don’t add any extra info; don’t speculate; don’t be afraid to share a professional opinion, but keep it concise; don’t get emotional; watch out for attempts to dismiss your evidence or target your credibility (personal attacks, etc.). So… YES. That was pretty intimidating, to say the least.

I kinda got lucky on that one and eventually didn’t have to testify, because the guy pleaded guilty (my report actually helped to persuade him!!!), but from there on I learned to be more careful, more humble, and definitely more organized with regard to what I write, especially commercially.

It’s really easy to make claims; it’s much harder to support them: to describe evidence, build a proper case, argument, or timeline, or, where there is no evidence, to at least offer an educated guess or a professional opinion (including contextualizing circumstantial evidence).

Think about it for a second: from a DFIR perspective we use a lot of tools to extract and interpret evidence. While we are happy building timelines, the whole process of data extraction and interpretation could be called into question. How do we know, or how are we so sure, that the programs we use extract and interpret data correctly?

Notably, what you know, or what you think you know, will be scrutinized in any possible way, so as you write your report you do need to re-read a lot of older documents or reference materials to avoid the mistake of making a statement that is easy to prove incorrect, inaccurate, or too general. This may ruin your case. To give you an example… Say… you describe that programs always load in a certain way under Windows, and that’s the only way to run programs. Be careful not to make an overstatement or misrepresentation. As it turns out, there are a lot of other ways to run code on Windows, whether via shellcode, exploits, side-loading, etc. The moment you are caught with statements that can be proven inaccurate your credibility may suffer.

This is where this article begins.

Whether you write a DFIR report, pentesting report, malware write-up, Threat Intel doc, or just fill in a ticket or even post on the blog or Twitter, think for a second about the following:

  • Who is your audience?
  • Who is your audience that you don’t know of?
    • Tickets are often reviewed by Compliance/Audit teams
    • Your most Senior Management may do it one day, even if whimsically
    • In case of a breach, tickets related to the breach-related events/incidents may become evidence in Court
  • How accurate is your description?
    • Did you write about facts or share an opinion?
    • Did you use language that may not be fit for the purpose? Slang, vulgarisms, personal opinions, puns, jokes, commentary, etc. have no place in these cases
    • Can a non-technical person understand what you wrote? Will they understand how it will affect them?
    • If it is a ticket, is there a closure? You shouldn’t close tickets with no closure statements even if it’s just a simple ‘Based on the investigation, there is no further risk, and the ticket can be closed’; it helps you, helps your manager, and helps the org if these statements are there
  • Will the audience focus on the headline only, the summary, or the gory details?
  • Are you the first one to publish about it? Do your homework – and always give credit to any relevant older research, if you can find it. Update your post if you find the references later, or someone provides you with a link (you will be surprised how many times people send me links to web.archive.org where some long forgotten blog/PDF from the early noughties discusses some similar topic I just wrote about thinking it’s a novelty)
  • Assume that at least one person will come back to you with comments that will bring a revolution to your thought process (e.g. to point out gaps in your thinking, suggest/reference older /often better/ research on the same topic, or better, more efficient approach to the same problem); anticipate it and accept it in a humble way; remember to thank these guys – they not only read your stuff, they enrich your knowledge!!
  • Assume you may need to explain your claims in ELI5 fashion one day, and finally…
  • If possible, describe what you did so it can be replicated, and/or re-analyzed; share code, data, examples, queries, attach files, results, add comments how you interpreted them.
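
The last point above (describe what you did so it can be replicated) can be made a habit with a small step log. What follows is only a minimal sketch, not a standard tool or format: the function names `sha256_of` and `log_step`, the field names, and the JSON Lines layout are all assumptions made for illustration. Each entry records when a step was taken, what was done, how the result was interpreted, and a SHA-256 hash of every file involved, so a reviewer can check they are looking at the same data.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large evidence files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_step(log_path: Path, action: str, interpretation: str, files=()) -> dict:
    """Append one analysis step to a JSON Lines log (hypothetical format)."""
    entry = {
        "utc_time": datetime.now(timezone.utc).isoformat(),
        "action": action,                  # what was done, in plain words
        "interpretation": interpretation,  # how the result was read
        "files": {str(f): sha256_of(Path(f)) for f in files},
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

A call such as `log_step(Path("steps.jsonl"), "ran strings on sample.bin", "no readable URLs found", ["sample.bin"])` appends one line per step; the file names here are placeholders. Appending rather than rewriting keeps the dead ends in the log too, which is usually what a second analyst needs in order to retrace the work.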

This sounds trivial and kinda overdone, right? Let’s see…

  • Twitter is mainly opinions – who cares
  • Tickets’ content is almost never read by anyone – who cares
  • Blogs are blogs – who cares
  • Malware reports are now so generic that they are primarily part of a PR machine, and are actually really easy to write (most of the time = quick intro, some IDA/Olly/Xdbg/Ghidra/dnSpy screenshots of walking through malware stages, finally a conclusion with a marketing bit and then yara+IOCs); they can also be semi-automatically generated from sandboxes – who cares
  • Red Team/Pentest reports are also semi-automated in many ways, and often just focus on an extensive list of vulnerabilities found by scanners, or an ‘I pwned you, patch your systems, kthxbye’ bit if they managed to actually compromise some systems; notably, red teams, similarly to DFIR teams, need a lot of willpower and incentive to keep logs of all the steps they take; why? because it’s often poking around w/o any success for many hours; when they hit the jackpot, they immediately chase the leads (DFIR) or explore new paths (red team); this is _hard_ to document, because excitement takes over – still, who cares
  • DFIR reports, even if still manually written, more and more suffer/benefit from automation too; copypasta and generalizations are the norm, and a predictable TOC (often enforced by standards e.g. in PFI breaches) is there too
  • Finally, Threat Intel is kinda a beast of its own; from literal forwards of PDFs through copypasta exercises to actual valuable intel pieces affecting your org (it was very bad a few years ago, but it’s getting better and better).

Notably, other industries suffer from templates and copypasta as well, so it’s not a phenomenon that is infosec-centric. So many T&S, commercial reports, surveys, searches, etc. are not only non-conclusive, but almost all of them are written in a ‘we don’t take any responsibility’ way. Searches and reports are also typically direct exports from databases, and while in some cases they may get enriched by a quick, yet superficial ‘personal touch’ to make them more credible, they are just an easy source of revenue for the companies that own these databases. Sadly, infosec is following in these footsteps. And while we are all pressured by time, and billable hours are what matter… it will be quite a shame if we end up delivering the same vague content as a part of BAU (Business As Usual).

This is where this article begins being practical.

Lenny Zeltser published Writing Tips for IT Professionals. If you have not read it, please do so. This is a great tutorial on how to be strategic about your writing.

Also, for anything you write assume that LE, C-level guys, firms engaged commercially to re-do/confirm/audit your DFIR / pentest analysis, and experts in the industry will read it at some stage. Also… assume these reports will become public… cuz… breaches.

So… try to write in a defensive way, and make your lack of knowledge known (where applicable). Suggest avenues for additional research if you can. Don’t claim anything 100%, but at the same time use common sense so that your article doesn’t overuse words like ‘allegedly’, ‘possibly’, ‘probably’, ‘reportedly’, ‘supposedly’, etc. Be honest, be humble. Focus on facts, not editorializing.
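
One rough way to notice when the hedging gets out of hand is to count it. The sketch below is only an illustration: the word list is taken from the paragraph above, the function name `hedge_report` is made up, and no ‘acceptable’ density is implied. It is a prompt to re-read a draft, not a rule.

```python
import re
from collections import Counter

# Hedge words taken from the paragraph above; extend the tuple to taste.
HEDGES = ("allegedly", "possibly", "probably", "reportedly", "supposedly")


def hedge_report(text: str) -> dict:
    """Count hedge words and express them per 1,000 words of text."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w in HEDGES)
    total = sum(counts.values())
    per_thousand = 1000 * total / len(words) if words else 0.0
    return {"words": len(words), "hedges": dict(counts), "per_1000": per_thousand}
```

Running it over a draft and over a report you consider well written gives two numbers to compare; a large gap is a hint to check which of your hedges actually reflect uncertainty in the evidence.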

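Also… use the Alexiou Principle; it’s such a simple, yet powerful recipe for writing almost any report/write-up within the infosec space in a defensive way. Its 4 points are: what question are you trying to answer, what data do you need to answer it, how do you extract and analyze that data, and what does the data tell you. If you include these 4 points it’s almost guaranteed that all the questions asked by a client, LE, or sponsor will be addressed. The fewer follow-ups on the report, the better a writer you are.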

Finally, you need to practice. The more you write, the better you will get at it. Also, read documents that are within the same audience spectrum — if you need to write DFIR reports, read available public reports about breaches. Cherry-pick language, statements, as well as formatting style, and the document organization.

And last, but not least – do peer review, if possible. Ask more senior guys to look at what you write. Ask them if there is anything that sounds too vague. Correct it.

And to be honest, this post is a good example of bad writing. I mixed up a lot of things and didn’t have much structure here; if you read this far, thank you.

Creolisation, Tergiversation and Equivocation of IR language

A lot of fun is made of the marketing language of infosec. Anyone who is a bit technical knows that it’s a snake oil game that aims at selling at all costs, and the cyber terms coined by the marketing gurus make us all shake our heads (cyber pathogens, cyber Armageddon, cyber Pearl Harbor, cyber 9/11, etc.).

For a change, I’d like to talk about the language of the people working in IR. I find it quite interesting and actually struggle a lot with adapting to use certain terms as they sound quite foreign to me, if not pretentious.

Newcomers entering this field don’t have an easy life, at least from a linguistic perspective. The field is relatively new, and many people still enter it by chance, or thanks to their background in various ‘related’ disciplines: law enforcement, digital forensics, audits, fraud analysis, network engineering, system architecture, reverse engineering, malware analysis, intelligence services, helpdesk, as well as completely unrelated ones: chemistry, biology, medicine, music, and many other disciplines. They bring their habits, language, points of view, and attitude, which I think make an impact on the IR lingo: one that resembles a pompous creole language of sorts.

Many people who came to IR with Digital Forensics experience tend to be very cautious and make lots of statements that are very much aligned with the legal responsibility they encountered as forensic experts testifying in courts. They bring tons of words and statements that often may feel like weasel words to technical people who never experienced the harsh scrutiny witnesses face in court. Hence, we start saying ‘allegedly’, ‘probably’, ‘it would seem’, ‘evidence suggests’, ‘I believe’, etc. more often than in the past. Everything is possible, but… everything is also uncertain.

The non-technical individuals with a background in the military and intelligence brought us a very large corpus of terms that even a few years ago no one in infosec had heard of. There are no more ‘bad guys’, ‘virus writers’, and ‘hackers’. Now we all talk about ‘actors’, ‘adversaries’, ‘intel’, ‘TTPs’, ‘indicators’, ‘HUMINT’, ‘SIGINT’, etc. and since we entered geopolitics we also have ‘attribution’, ‘nation state actors’, plus ‘red teams’, and ‘blue teams’. And let’s not forget to mention the popular units ‘8200’ or ‘61398’. Oh, and we totally ‘nuke’ things.

Let’s admit it. Compliance guys came up with a lot of good ideas. While many technical people don’t like compliance, or auditors, and they perceive these ‘checkbox activities’ as the core ignorance of this industry, it is really important to highlight that these compliance frameworks do impact organizations in a very positive way. They bring structure, force orgs to create processes introducing accountability, affect the architecture, and change the way they do business. As for the language, we all now know about ‘confidentiality’, ‘integrity’, and ‘availability’, don’t we? We also know about ‘business resilience’, or ‘disaster recovery’. And lo and behold – we even started thinking more about the business we protect than just looking at the technical aspects of attacks and just eyeballing the blinkenlights. While being a ‘cost center’ it is important to have a bit of a thought about the ‘customer’, and where the monies come from. And in my experience the last bit appears in conversations far more often now than, say, 10 years ago (in technical circles). Then we have ‘findings’, ‘RFIs’, ‘risk scores’, ‘risk posture’, ‘risk management’, and ‘data in transit’, ‘data at rest’, and lo and behold… ‘security controls’, and ‘acceptable use policy violations’. POS malware also brought a lot of opportunities to discuss ‘magnetic stripe’, ‘track data’, and ATMs. IR is becoming compliance on so many fronts!

Then we have network engineers; even today we can come across guys who use somewhat archaic terms like ‘octets’ for bytes being transmitted in packets. You probably rarely hear of datagrams, but you definitely hear ‘egress’, ‘ingress’, ‘routing’ all the time. Many younger people find these concepts a bit unclear as in 2018 we all tend to think of uploading / downloading, or sending / receiving data, because… well… that’s how the internet works today (in general, I think the mindset of many people entering IR now is on a much higher level of the OSI model than, say… in 2000).

Scientific language brought us ‘viruses’ or ‘samples’ of course, but there are now also ‘implants’, ‘payloads’, ‘detonation’, and ‘anomalies’, ‘regression’, ‘machine learning’, ‘clustering’, and ‘graphs’. And then the whole gallery of code names borrowed from the animal kingdom (‘pandas’, ‘bears’, ‘kittens’, ‘tigers’, etc.). We do ‘Proofs of Concept’, in the ‘labs’, and we work on our ideas starting with a ‘hypothesis’. And as for medicine… some time in 2017 there was a Twitter question about the tech terms that have their roots in medicine. I, among others, contributed quite a few answers to that thread. I thought it would be nice to just drop a superset of IR-related terms here:

abort, agent, anatomy (of a virus), anomaly, antiviral, assessment, attack, backbone, backtracking, bacteria, blackout, blue pill, buffer, cell, census, channel, check-up, clone, compress, congestion, contagion, containment, contamination, defect, defense, diagnose, diagnostics, disc, disease, disinfect, dissection, dissemination, DNA, downstream, epidemics, eradication, exercise, extract, gene, genetic, heartbeat, host, hub, hygiene, immune, immunize, implant, indicator, infection, infestation, influenza, inject, injection, inoculation, isolation, lab, life-support, malignant, microbe, monitoring, mutation, nematode, outbreak, patch, pathogen, pathology, patient 0, pattern, penetration, post mortem, probe, prophylactics, quarantine, recovery, red pill, remedies, replication, retrovirus, sample, sanitization, scanning, screen, segment, spread, stat, stop the bleeding, strain (as in malware strain), stress test, subject, system health, tag, test, transmission, trauma, triage, USB condom, vaccine, vector (as in attack vector), virus, vitals, vulnerabilities, worm, x-rays (type of malware scanning), zombie

And last, but not least – let’s not forget about the ‘centrifuges’. Who in infosec would have ever imagined talking about stuff like this 10 years ago… ???

Despite all the efforts to stay technical and binary, it would seem that we are more and more vague, indecisive, perhaps in way over our heads. We are accidental ‘jacks of all trades’ whose roles deal with more ambiguity, uncertainty and pure ignorance (our own!) in need of quick and urgent fixing than any other IT position. That ignorance is not a fault; we just don’t know everything, and we always find something new to learn.

We are cyber-warriors, cyber-ninjas, white hats, busticati, evangelists, thought leaders, and even celebrity CISOs. But perhaps also, and often without any bad intent, just very lucky career-oriented, fad-driven, over-entitled imposters and… kinda infosec bots. I am confident in my belief that we should wait for more evidence to support my hypothesis, and until then, let’s tentatively agree that IR is an art, and if we lived in ancient Greece, there would totally be a dedicated muse for that.