
7z & Multi-volume archives

May 5, 2020 in Anti-Forensics, File Formats ZOO

This post is pretty much a summary of what I tweeted the other day. I thought it would be good to post it in one place in case anyone is interested.

7z is a very flexible archiver, and its support for multi-volume archives is just a natural consequence of other archivers supporting this feature, including Rar and WinRar, which have had it implemented since like… for-e-v-e-r.

While experimenting with 7z I noticed that it’s possible to make multi-volume archives with its first file being just… 1 byte. I was perplexed by it and started digging, because if this is the case, all the forensic tools that look for file magic, or even file extensions may need to rewrite their parsers.

So, for starters… you can run this:

7z a foo.7z -v1 -v10M

This creates a number of 7z archives with the first one being just one byte long:

The naming convention is interesting. As you can see, the standard .7z file extension is followed by 001 and 002 for the two consecutive volumes created:

  • foo.7z.001
  • foo.7z.002

When I tested the same thing for Zip archives (still using 7z), I ran this:

7z a foo.zip -tzip -v2 -v10M

I noticed that for Zip files 7z needs at least 2 bytes in its first volume, but everything else works the same way as for .7z.

Finally, you can use Rar / WinRar to build a Rar archive, and then keep ‘R’ – the first character of the Rar! signature – in the first volume file, and the rest of the archive in another as demonstrated on this picture:

When I looked at the actual code of 7z to see how it interprets file extensions:

I noticed that the code is relying on only the first character of its file extension to determine whether it is a zip or rar multi-volume archive. So… you can rename the volumes to:

  • foo.r.01
  • foo.r.02

or

  • foo.z.01
  • foo.z.02

And you will still be able to process these with 7z.
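To illustrate why checking a single file's magic is not enough here, below is a minimal sketch of a detector that looks at the joined volumes instead. The helper names (`sibling_volumes`, `logical_head`, `detect_archive`) are made up for this example, and it assumes the volumes sit in one directory and differ only by a numeric suffix, as in the listings above.

```python
from pathlib import Path

# Leading bytes of the logical (joined) archive for the formats discussed here.
SIGNATURES = {
    "7z": b"7z\xbc\xaf\x27\x1c",
    "zip": b"PK\x03\x04",
    "rar": b"Rar!\x1a\x07",
}


def sibling_volumes(first_volume):
    """Return the volumes sharing first_volume's stem, ordered by numeric suffix."""
    first = Path(first_volume)
    stem = first.name.rsplit(".", 1)[0]  # "foo.7z.001" -> "foo.7z"
    volumes = [
        p for p in first.parent.iterdir()
        if p.name.startswith(stem + ".") and p.name[len(stem) + 1:].isdigit()
    ]
    return sorted(volumes, key=lambda p: int(p.name.rsplit(".", 1)[1]))


def logical_head(volumes, size=8):
    """Read the first `size` bytes of the stream formed by joining the volumes."""
    head = b""
    for volume in volumes:
        with open(volume, "rb") as handle:
            head += handle.read(size - len(head))
        if len(head) >= size:
            break
    return head


def detect_archive(first_volume):
    """Guess the archive type from the joined volumes, not from one file."""
    head = logical_head(sibling_volumes(first_volume))
    for name, magic in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None
```

With a 1-byte foo.7z.001, the first volume alone never matches any signature, but `detect_archive("foo.7z.001")` still sees the full magic once the second volume is read.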

While the good ol’ Unix tools can do the same (split, cat, gzip, etc.) it is still an interesting find. 7z is used a lot on Windows boxes and people w/o much technical knowledge could use these built-in features w/o a need to learn much of *NIX CLI.

It really kinda changes the dynamics of the way file extensions should be understood and processed, and how access to archive files – especially those that by design are multi-volume – should be handled, especially by security tools. And in particular, those that help in malware detection and post-intrusion analysis. And perhaps… this is just the tip of the iceberg.

Re-sauce, Part 1

April 24, 2020 in Archaeology, Clustering, File Formats ZOO, Forensic Analysis

PE Resources are like an unwanted child of malware analysis and reverse engineering. Almost no one talks about them and… this post is going to… make it worse ;).

Let’s take a large number of ‘bad’ samples, export their resource information, and do some data crunching… we now have some stats.

What are the most popular resources?

These are:

4720830   RT_ICON (3) -
4703093   RT_GROUP_ICON (14) -
3445748   RT_VERSION (16) -
2574034   RT_MANIFEST (24) -
2291058   RT_DIALOG (5) -
2022739   RT_STRING (6) -
1564623   RT_RCDATA (10) -
1193659   RT_BITMAP (2) -
1159726    'DVCLAL' -
1050941    'PACKAGEINFO' -
 931572    'MAINICON' -
 903265   RT_CURSOR (1) -
 884868   RT_GROUP_CURSOR (12) -
 557473    'BBABORT' -
 551898    'BBALL' -
 551836    'BBOK' -
 551785    'BBNO' -
 551023    'BBRETRY' -
 542886    'BBIGNORE' -
 542836    'BBHELP' -
 542834    'BBCLOSE' -
 542593    'BBYES' -
 541708    'BBCANCEL' -
 498816    'PREVIEWGLYPH' -
 497272    'DLGTEMPLATE' -
 358081   RT_MENU (4) -
 199615    'TFORM1' -
 174781   RT_ACCELERATOR (9) -

Those with an RT_ prefix are standard resource types defined by Microsoft, and the ones in apostrophes are strings that ‘tag’ (or ‘name’) the resources according to the developer’s wishes…

Given a number of these ‘named’ ones used repeatedly (as shown by the list above) you can guess that they are somehow ‘known’, or a part of some ‘standard’ — and yup, these are primarily from Borland/Delphi/Embarcadero family of executables that include standard GUI elements from this platform. All ‘BB*’ and ‘T*’ come from this environment. Additionally ‘PACKAGEINFO’ is a resource I covered a little bit in the past – it lists all the packages the executable uses (a good IOC except no one writes malware in Delphi anymore).
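For anyone who wants to reproduce this kind of tally, here is a rough sketch. It assumes you have already exported one identifier per resource, either an integer ID or a string name; the `label` and `tally` helpers are hypothetical and not the tooling used for the numbers above. The ID-to-name table only covers the standard types that appear in the list.

```python
from collections import Counter

# Standard resource type IDs (subset that shows up in the list above).
RT_NAMES = {
    1: "RT_CURSOR", 2: "RT_BITMAP", 3: "RT_ICON", 4: "RT_MENU",
    5: "RT_DIALOG", 6: "RT_STRING", 9: "RT_ACCELERATOR", 10: "RT_RCDATA",
    12: "RT_GROUP_CURSOR", 14: "RT_GROUP_ICON", 16: "RT_VERSION",
    24: "RT_MANIFEST",
}


def label(resource_id):
    """Render an integer ID as 'RT_xxx (n)' and a string name in apostrophes."""
    if isinstance(resource_id, int):
        return f"{RT_NAMES.get(resource_id, 'UNKNOWN')} ({resource_id})"
    return f"'{resource_id}'"


def tally(resource_ids):
    """Count identifiers across all samples, most frequent first."""
    return Counter(label(r) for r in resource_ids).most_common()


if __name__ == "__main__":
    # Toy input standing in for the exported resource information.
    for name, count in tally([3, 3, 14, "DVCLAL", 3, "BBOK"]):
        print(f"{count:8d}   {name}")
```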

Surprisingly, modern PE Viewers and Editors do not parse PE resources very well. They only show the most popular resource types, because the others are often … undocumented. I really don’t like to look at resources in hex view. We can do better.

Let’s start with these that are ‘kinda documented’.

For instance, Resource Hacker can handle some Delphi resources (e.g. forms) pretty well:

A popular ‘Typelib’ resource:

can be viewed with OleView:

The ‘Registry’ is typically an embedded ‘.reg’ file.

A ‘FOMB’ is a binary MOF that was described in this post by FireEye and can be decoded using bmfdec.

What about the others?

And this is where it gets really difficult…

Looking at resources embedded in Windows 10 exe, dll, ocx files one can very quickly build a list of more or less enigmatic-looking resource names:

  • ACCELERATOR
  • ANICURSOR
  • AVI
  • BINARY
  • BITMAP
  • BITMAP4
  • BRANDING_METADATA_RES
  • BRANDING_REQUIRED_RESOURCEID_MAP
  • CERT
  • CODEPAGES
  • CODEPAGESEXT
  • CURSOR
  • DATA_FILE
  • DATAFILERESOURCE
  • DGML
  • DIALOG
  • DUI
  • EDPAUTOPROTECTIONALLOWEDAPPINFOID
  • EDPENLIGHTENEDAPPINFOID
  • EDPPERMISSIVEAPPINFOID
  • EMBEDDEDDATA
  • FILES
  • FLEX_TABLE
  • FLEXDL
  • FONT
  • FONTDIR
  • FONTFALLBACK
  • GIF
  • GROUP_CURSOR
  • GROUP_ICON
  • HTML
  • HWB
  • HWXLANGID
  • IBC
  • ICON
  • IMAGE
  • JPEG
  • JS
  • JSON
  • JSON_RESPONSE
  • MANIFEST
  • MENU
  • MESSAGETABLE
  • MOFDATA
  • MSTESTROOT
  • MUI
  • PNG
  • PNGFILE
  • PRELOAD
  • PRXFILE
  • RCDATA
  • REGINST
  • REGISTRY
  • RGSLIST
  • SCHEMA
  • SIAMDB
  • SKDFILE
  • SRGRAMMAR
  • STYLE_XML
  • TESTROOT
  • TEXT
  • TEXTINCLUDE
  • TUNINGSPACE
  • TYPELIB
  • UIFILE
  • VR_ETW_MANIFEST
  • VR_ETW_RESOURCE
  • VSGEXP
  • WAVE
  • WEVT_TEMPLATE
  • XML
  • XML_FILE
  • XML_SCHEMA
  • XMLFILE
  • XSD
  • XSDFILE
  • XSLFILE

Yup. Some are easy to handle (just by looking at their name e.g. AVI, BITMAP, XML), but… this is just Windows 10.

Time will tell if we will ever see a PE editor/viewer that can handle all of, or at least most of these well.

In the meantime…

Resources are something you may want to look at more closely. Starting today.

Why?

Because of this guy:

I got it from the resources of Norton SystemWorks circa 2002-2003. Do you even remember this software existed?

One of the cool side effects of poking around in many resources is coming across weird, unusual strings, texts, images, movies, you name it. You will find developer pictures that were not meant for the general public, ‘tagging’ images with names of developers or project managers, jokes, and whatever else. Yes, there is cheesy, there is porn, there are obscenities, there are also Easter Eggs.

If you want to start building your own collection, it couldn’t be easier…

You can simply use:

  • 7z l <filename> .rsrc
    • to list all the resources of a <filename>
  • 7z x <filename> .rsrc
    • to extract them.

And then start data crunching:

  • Icons are interesting, especially if re-used for malicious purposes (e.g. Adobe, Microsoft) –> there are existing yara sigs for these!
  • Manifest may include references to other executables/DLLs loaded
  • Manifest may also include references to rights required for running the executable (e.g. look for level=”requireAdministrator”)
  • Language information may be helpful with attribution (beware of false flags)
  • Version Information lists lots of interesting information that can be correlated with the information extracted from certificates / signatures, if present
  • Delphi resources are fairly well documented and can be extracted, especially the aforementioned package names — can help to at least cluster samples as per the modules used (may sometimes highlight similar families, plus good for yara sigs)
  • Everything else should be extracted and checked against typical file types/magic:
    • BMP
    • PNG
    • GIF
    • JPG
    • AVI
    • Wav
    • Rtf
    • Ico
    • Cur
    • PE files
    • LE files (Linear Executable, an older format that typically also starts with an MZ stub)
    • MZ files (yup, plain DOS)
    • UTF8/Unicode BOMs
    • Office files
    • etc.
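The magic check from the last bullet could look roughly like this. It is only a sketch: the `identify` and `classify_executable` helpers are names invented for this example, the signature table is far from complete, and short signatures such as BM or the BOMs will produce false positives on random data.

```python
import struct

# (label, leading bytes) pairs for the simple "starts with" checks.
MAGIC = [
    ("PNG", b"\x89PNG\r\n\x1a\n"),
    ("GIF", b"GIF87a"),
    ("GIF", b"GIF89a"),
    ("JPG", b"\xff\xd8\xff"),
    ("BMP", b"BM"),
    ("RTF", b"{\\rtf"),
    ("ICO", b"\x00\x00\x01\x00"),
    ("CUR", b"\x00\x00\x02\x00"),
    ("OLE2 (legacy Office)", b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"),
    ("ZIP (possibly OOXML Office)", b"PK\x03\x04"),
    ("UTF-8 BOM", b"\xef\xbb\xbf"),
    ("UTF-16 LE BOM", b"\xff\xfe"),
    ("UTF-16 BE BOM", b"\xfe\xff"),
]


def classify_executable(data):
    """Tell PE/NE/LE/LX apart from a plain DOS MZ file via the e_lfanew pointer."""
    if len(data) >= 0x40:
        (offset,) = struct.unpack_from("<I", data, 0x3C)
        tag = data[offset:offset + 4]
        if tag == b"PE\x00\x00":
            return "PE"
        if tag[:2] in (b"NE", b"LE", b"LX"):
            return tag[:2].decode("ascii")
    return "MZ"


def identify(data):
    """Return a best-guess label for an extracted resource, or None."""
    # RIFF containers carry their subtype at offset 8 (AVI and WAVE here).
    if data[:4] == b"RIFF" and data[8:12] in (b"AVI ", b"WAVE"):
        return data[8:12].decode("ascii").strip()
    if data[:2] == b"MZ":
        return classify_executable(data)
    for name, magic in MAGIC:
        if data.startswith(magic):
            return name
    return None
```

Running `identify()` over every file extracted from .rsrc gives a quick first pass; anything that comes back as None is a candidate for a closer look in a hex view.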

Resources are a very important metadata source for analysts. If you are lucky you may not only get the visuals, but also timestamps (e.g. in Delphi executables).
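The two manifest checks from the list above (requested rights and referenced assemblies) can be sketched with nothing but the standard XML parser. This assumes the manifest has already been extracted as raw bytes; `summarize_manifest` is a hypothetical helper, and it ignores XML namespaces on purpose, since manifests in the wild are not consistent about which asm.v* namespace they use.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def summarize_manifest(manifest_bytes):
    """Pull the requested execution level and dependent assembly names."""
    root = ET.fromstring(manifest_bytes)
    level = None
    dependencies = []
    for element in root.iter():
        name = local_name(element.tag)
        if name == "requestedExecutionLevel":
            level = element.get("level")
        elif name == "dependentAssembly":
            for child in element.iter():
                if local_name(child.tag) == "assemblyIdentity":
                    dependencies.append(child.get("name"))
    return {"level": level, "dependencies": dependencies}
```

A result with `level` set to requireAdministrator, or with an unexpected entry in `dependencies`, is worth a second look.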

Be err… resourceful.