(Not) Mapping Firefox extension IDs to their names

I have mapped an extensive list of Chrome plug-in IDs to their names before. Of course, I knew for a long time that I would need to take a look at Firefox Add-ons too… And in fairness, I did… I actually made a few attempts in the past, but without much success.

If you are not familiar with Firefox Add-on IDs, here’s your starter pack.

The forensic ramifications of the choices described in that article cannot be overlooked:

  • Add-on IDs are not MANDATORY – yup, some Add-ons use (unique) IDs, some don’t
  • There are at least 2 versions of the Add-on JSON manifest to take care of (the manifest being the manifest.json file that stores all the info about the plug-in – or so you’d wish; read below)

What the article doesn’t tell you is that the Firefox Add-on world is a total mess.

What that means is that you can’t just download all Firefox Add-ons in one go and then parse their manifests to build a table with all the names, IDs, and their descriptions…

Oh, no… this would be too easy.

The moment you try, you will discover that:

  • many manifest.json files do not follow the JSON standard and can’t be parsed with standard Python libraries (f.ex. comments are not allowed and generate exceptions; hint: json5 works)
  • many entries inside these files include Unicode characters or encoded Unicode characters that can’t be rendered properly with standard Python libraries
  • many newer manifest.json files reference external _locales\<language-code> folders storing one or more messages.json files that define the localized strings referenced inside the manifest.json files; these external references are easy to spot, as they are all prefixed with __MSG_, but it’s an additional complexity
  • it’s all nice and cozy, except:
    • the list of locale-specific directories is messy and non-standard: en, en-us, en_us, en_gb – and this is just for US and UK English; it’s simply not quality-checked at all
    • many messages.json files do not include all localized fields (yup, your en\messages.json may define 2 strings, and ru\messages.json 20!)
    • many messages.json files include typos
    • many messages.json files store the field names referenced by manifest.json files in a way that makes them inaccessible, f.ex. a fully capitalized string inside the manifest.json file and a lower-case or camel-case one inside the messages.json – seriously, is no one checking this when Add-ons are submitted? 😲
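To make the above more concrete, here is a minimal Python sketch of the workarounds one ends up writing. It is not a full parser: it only strips whole-line // comments as a stdlib fallback (the json5 package handles the malformed files more robustly), and the manifest content, locale data, and key names below are made-up examples:

```python
import json
import re

def load_manifest(text):
    """Parse manifest.json content that may violate strict JSON (f.ex. // comments).
    Prefer the json5 package in practice; this fallback only strips
    whole-line // comments before retrying with the stdlib parser."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        stripped = re.sub(r"^\s*//.*$", "", text, flags=re.MULTILINE)
        return json.loads(stripped)

def resolve_msg(manifest, locales, field="name"):
    """Resolve a __MSG_key__ placeholder against messages.json data.
    `locales` maps a locale directory name to its parsed messages.json dict.
    Keys are matched case-insensitively, because real Add-ons mix
    ExtName / extname / EXTNAME between the two files."""
    value = manifest.get(field, "")
    m = re.fullmatch(r"__MSG_(.+)__", value)
    if not m:
        return value  # plain string, nothing to resolve
    key = m.group(1).lower()
    default = manifest.get("default_locale", "en")
    # Try the default locale first, then fall back to any locale that
    # happens to define the key (locale files are often incomplete).
    for loc in [default] + sorted(locales):
        msgs = {k.lower(): v for k, v in locales.get(loc, {}).items()}
        if key in msgs:
            return msgs[key].get("message", value)
    return value  # unresolved; keep the placeholder

manifest = load_manifest("""
{
    // comments like this break the stdlib json parser
    "name": "__MSG_extName__",
    "default_locale": "en"
}
""")
print(resolve_msg(manifest, {"en": {"EXTNAME": {"message": "Some Add-on"}}}))
# prints: Some Add-on
```

Note the deliberate case mismatch between __MSG_extName__ and EXTNAME; a strict, case-sensitive lookup would return nothing for many real Add-ons.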

I wish this was over, but it is not. So, I don’t have a Firefox Add-on ID-to-name mapping for you yet. I only have this rant 🙂 Sorry 🙂

Bitmap Hunting in SPL

One of the most annoying hunting exercises is detecting a sequence of failures followed by a success. Brute-force attacks, dictionary attacks, and finally password spray attacks have all this in common: lots of failures, sometimes followed by a success.

The problem is stated clearly, but there is no easy solution.

Why?

Most logs are stateless. Every log row describes an event, and every row is detached from the others. Combing them, combining them, clustering them, and extracting some juice out of them is a detection engineering art of its own…

So, yes… it’s actually hard to detect these types of sequences, and it’s usually very expensive. F.ex., Splunk offers its transaction command for situations like this, but it’s a very, very bad choice: it affects performance too much.

There is a more elegant solution out there though… and I call it bitmap hunting.

Instead of getting fixated on the sequence of events that fits our narrative (a set of failures followed by a success, in the example above), we focus on building a bitmap of ALL states registered by the respective telemetry – states that we can always group by endpoint name, user, time/time bucket, etc.

Let’s look at an example:

| makeresults | eval endpoint="sys01" | eval username="test" | eval status = 0  
| append [| makeresults | eval endpoint="sys01" | eval username="foo" | eval status = 0] 
| append [| makeresults | eval endpoint="sys01" | eval username="bar" | eval status = 0] 
| append [| makeresults | eval endpoint="sys01" | eval username="abc" | eval status = 0] 
| append [| makeresults | eval endpoint="sys01" | eval username="nimda" | eval status = 0] 
| append [| makeresults | eval endpoint="sys01" | eval username="root" | eval status = 0] 
| append [| makeresults | eval endpoint="sys01" | eval username="r00t" | eval status = 0] 
| append [| makeresults | eval endpoint="sys01" | eval username="john.doe" | eval status = 1]
| append [| makeresults | eval endpoint="sys02" | eval username="jane.doe" | eval status = 0] 
| append [| makeresults | eval endpoint="sys02" | eval username="jane.doe" | eval status = 1]
| table _time, endpoint, username, status

These SPL commands build a list of fake events for us, where 2 endpoints, sys01 and sys02, register their logon events, and where the endpoint, username, and status fields/columns include all the info about the set of events occurring. In essence, this is what it looks like (ignore the _time column, as I didn’t want to clutter the commands above even more):

We can use the status of all events (success=1, failure=0) to build a bitmap of all of them by grouping them all together by the endpoint:

| stats list(status) as allstatuses, list(username) as allusernames by endpoint
| eval allstatuses_bitmap = mvjoin(allstatuses,"")
| table endpoint, allstatuses_bitmap, allusernames

The result gives us this:

As you can clearly see, it’s pretty easy now to ‘guess’ that on sys02 the user jane.doe likely just made a typo or hit some other minor issue that led to the account being logged on after the first failure, while the sys01 system experienced a barrage of logon attempts with different user names that eventually led to a successful logon. The sys01 endpoint should definitely be investigated.
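Outside Splunk, the same grouping-and-concatenation step can be sketched in a few lines of Python; the event tuples below mirror the fake data generated above, and the loop is the rough equivalent of stats list(status) by endpoint followed by mvjoin:

```python
from collections import defaultdict

# (endpoint, username, status) tuples mirroring the fake events above
events = [
    ("sys01", "test", 0), ("sys01", "foo", 0), ("sys01", "bar", 0),
    ("sys01", "abc", 0), ("sys01", "nimda", 0), ("sys01", "root", 0),
    ("sys01", "r00t", 0), ("sys01", "john.doe", 1),
    ("sys02", "jane.doe", 0), ("sys02", "jane.doe", 1),
]

# Equivalent of | stats list(status) by endpoint + mvjoin(...,"")
bitmaps = defaultdict(str)
usernames = defaultdict(list)
for endpoint, user, status in events:
    bitmaps[endpoint] += str(status)
    usernames[endpoint].append(user)

print(dict(bitmaps))
# prints: {'sys01': '00000001', 'sys02': '01'}
```

The barrage against sys01 and the harmless double attempt on sys02 are immediately legible from the two strings alone.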

Looking at the bitmap created by all the logon statuses, we can quickly devise logic to detect f.ex. successful password spray/brute-force/dictionary attacks:

| stats list(status) as allstatuses, list(username) as allusernames by endpoint
| eval allstatuses_bitmap = mvjoin(allstatuses,"")
| where like(allstatuses_bitmap, "%0001")
| table endpoint, allstatuses_bitmap, allusernames

In the example above, we detect at least 3 consecutive failed logons immediately followed by a successful logon at the end of the sequence.

And yes, it will hit False Positives too (legitimate logons will be amongst the malicious ones), but the number of failed logons will usually be high enough to be a good indicator of badness – plus, at least we now have something to triage…

p.s. logon events are just one example; you can convert any condition into a bitmap, and as such, you can build more complex detections too (f.ex. more than two specific events present in a sequence of events)
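As a sketch of that generalization: nothing forces the alphabet to be 0/1. Assign each event type of interest its own symbol, concatenate per entity, and match patterns over the resulting string. The event names, symbols, and regex below are made-up examples, not anything from a specific log source; the default pattern mimics the like(..., "%0001") filter above, except it matches anywhere in the sequence, not only at the end:

```python
import re

# Map each event type of interest to one symbol; any telemetry condition works.
SYMBOLS = {"fail": "0", "success": "1", "lockout": "L"}

def suspicious(event_types, pattern=r"0{3,}1"):
    """Flag a per-entity event sequence whose bitmap matches a pattern,
    here: three or more failures immediately followed by a success."""
    bitmap = "".join(SYMBOLS[e] for e in event_types)
    return re.search(pattern, bitmap) is not None

print(suspicious(["fail"] * 5 + ["success"]))            # prints: True
print(suspicious(["fail", "success"]))                   # prints: False
print(suspicious(["fail"] * 3 + ["success", "lockout"])) # prints: True
```

With a richer alphabet, a single regex can express sequences like ‘two lockouts followed by a success’ that would otherwise take several stateful joins to detect.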