Reversing w/o reversing – how to become Alex in practice, Part 2

My post from yesterday was written in a hurry so I didn’t have a chance to cover everything. So, time for part II.

Okay, let’s start from the old stuff.

The really old stuff

There is a great web site called vetusware.com that collects abandonware. Once you start searching it you will get stuck and spend hours downloading some really esoteric software. There is tons of 16-bit software. There is also lots of 32-bit software from the 90s and noughties. There _are_ very old SDK and DDK packages there. You do want to download them in case they include descriptions and definitions that have been removed from later versions of the SDK/DDK.

You just need to go there and start downloading. The gems you can find include software from the early days of Microsoft, Borland, Wordstar, IBM, OS/2 and so on and so forth. This is where it all started. For PC, at least.

The echoes of Int 21h

This one is for Alex. Just kidding – the thing is that before the internet took the shape it has today, many coders and reversers relied on just a handful of knowledge sources.

One of the most important things you wanted to get your hands on back in the day was Ralf Brown’s Interrupt List. This is a nostalgic piece of beauty. It was way ahead of its time and is to date one of the best compilations ever made of descriptions of the programming interfaces of tons of APIs. It was a Bible for DOS coders.

The fact that these API functions were called or executed via software interrupts doesn’t matter. Ralf assembled an impressive collection of knowledge in one piece. There is Microsoft DOS int 21h, int 25h, int 26h, there is VESA for graphic cards, there are low-level int 13h functions for HDDs, there are hardware interrupts int 08h, int 09h, there are interrupts used internally by viruses, as well as extensions used by various software, and so on and so forth.

If you ever need a reference for analysing 16-bit code, Ralf’s Interrupt List it is.

Echoes of early 32-bit coding

Okay, today you have StackOverflow, and everyone programs in Qt, .NET, Electron, etc. Back in the day it was Win32 API, MFC, AFC, Borland, Delphi, CodeGear, and finally Embarcadero. And of course, Alex’s favorite – Visual Basic (I don’t mention Java, because Java people are from a different camp). And people who talked about programming did it on Usenet, IRC, or on web sites like codeproject.com or codeguru.com.

Many early malware creations borrowed code from the two sites I mentioned, because the code quality was decent, and most importantly – these were the only places, apart from some articles in MSDN, Dr. Dobb’s, and sometimes Usenet, where you could find some ‘juice’ back then.

Even today you will find a lot of great articles there. Even if old, they do cover the foundations of many technologies we take for granted today or have already forgotten about, e.g.:

  • DDE (Dynamic Data Exchange)
  • OLE (Object Linking and Embedding)
  • COM (Component Object Model)
  • MFC (Microsoft Foundation Classes), and
  • AFC (Application Foundation Classes).

There is also a lot of information and code in pure Windows API – it makes it much more valuable than some easy to digest .NET code that hides a lot of details from you (this is not to say .NET is bad; not at all, and quite the opposite; what you can do with .NET via PowerShell today is absolutely amazing). Still, good to look at the old-school stuff if you want to know how ‘raw’ COM interfaces work. There are multiple layers today that both simplify and obfuscate a lot, but when you start digging you will get to the bottom of it.

On this note, you should also get familiar with tools like OleView, OleWoo that allow you to analyze interfaces embedded inside many system DLLs. And there is also OleView’s .NET equivalent from James Forshaw called OleViewDotNet.

Old Software

It’s great to get access to old software. You’re gonna like OldApps.com and oldversion.com. If you need to do diffs between versions of the same software, or play around with legacy software to see if it can still be used (e.g. connected to some old legacy servers), these are good places to start.

I cover why you need a repo of clean software below. Read on.

Old Software and PADs

Ever heard of PAD files?

Back in the day everyone was selling shareware. To sell shareware you had to publish and promote it. Publishing on one site was easy, publishing on 200 sites was hard. And keeping all of them updated was even tougher.

This is why some clever shareware authors (Association of Software Professionals) came up with the idea of a PAD file.

It’s basically an XML file that includes vital information about the software, e.g. name, vendor name, web site, and also the places you can download the software from. PAD stands for Portable Application Description and apart from the page I linked to you can read more about it on Wikipedia. For us, the most important bit is the juice, and that means the actual PAD files: the more the merrier.

Why?

Every single one leads you to an executable. And its future updates.

If you can collect a large sample set of legitimate software you can actually build a nice repo of so-called clean samples. This can help you to extract e.g. clean strings, signatures of clean functions, actual Authenticode signatures of vendors (if the software is signed), feed your engine with a list of clean URLs, download stuff on a regular basis to ensure it is whitelisted, and so on and so forth.
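To make the "every PAD leads you to an executable" idea concrete, here is a minimal sketch of pulling the program name, vendor and download URLs out of a PAD file. The element names follow the PAD layout as I understand it (XML_DIZ_INFO, Program_Info, Company_Info, Web_Info/Download_URLs); the sample document, the vendor and the URL are made up for illustration, and real-world PADs can be messier than this.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed PAD file used only for illustration.
SAMPLE_PAD = b"""<?xml version="1.0" encoding="UTF-8"?>
<XML_DIZ_INFO>
  <Company_Info>
    <Company_Name>Example Soft</Company_Name>
  </Company_Info>
  <Program_Info>
    <Program_Name>Example Tool</Program_Name>
    <Program_Version>1.2</Program_Version>
  </Program_Info>
  <Web_Info>
    <Download_URLs>
      <Primary_Download_URL>http://example.com/tool.exe</Primary_Download_URL>
      <Secondary_Download_URL></Secondary_Download_URL>
    </Download_URLs>
  </Web_Info>
</XML_DIZ_INFO>
"""


def parse_pad(xml_bytes):
    """Extract name, vendor, version and download URLs from PAD XML bytes."""
    root = ET.fromstring(xml_bytes)
    urls = []
    # Collect every non-empty entry under Download_URLs (primary, secondary, ...).
    for node in root.iterfind("Web_Info/Download_URLs/*"):
        text = (node.text or "").strip()
        if text:
            urls.append(text)
    return {
        "name": root.findtext("Program_Info/Program_Name", default="").strip(),
        "version": root.findtext("Program_Info/Program_Version", default="").strip(),
        "vendor": root.findtext("Company_Info/Company_Name", default="").strip(),
        "urls": urls,
    }


if __name__ == "__main__":
    print(parse_pad(SAMPLE_PAD))
```

Run this over a directory of collected PADs and you end up with a list of vendor names and download URLs to feed into whatever fetches and stores your clean samples.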

From today’s perspective there are many caveats, of course. There are many cases of PADs being abused by PUA, adware, etc. Secondly, we are now very aware of supply-chain attacks, so we can’t fully trust all the downloaded binaries. Nevertheless PADs are an ‘easy win’ when it comes to a source of many clean samples. Yes, you need them. Even if just for testing your YARA sigs, AV definitions, etc.

And… that’s it for the part 2. And there is a part III coming 🙂

Reversing w/o reversing – how to become Alex in practice

If you are new to reversing, or want to get better please watch this excellent presentation by Alex Ionescu. He nailed it: reversers specialize in never-ending acquisition of knowledge and… hamstering ISOs.

Stealing somebody’s preso titles is not a nice thing to do, so I apologize in advance. I hope I am not doing anything wrong tho – I just thought it would be nice to add 5 cents to Alex’ preso. And I mean the practical bit for newcomers.

So, with this post I try to answer the following question: “okay, so Alex is now officially old and joined the elderly that remember int 21h, but what can I do to catch up? cuz most of these ideas and tricks he presented are already dead, because of time, money, access, change of politics, discontinuation of services and/or technical and NDA restrictions we can’t legally bypass, all of which stop us from accessing this mountain of knowledge?”.

So, how would one go about re-creating what he was able to do when he started nearly 20 years ago?

I won’t tell you how to change your work patterns, but I will tell you how to gather data that can rapidly put you right behind him.

Everything starts and ends with data. Start collecting it. Cherish it. Hamster it. Process it. Do not commit digital Tsundoku.

Right… so here it goes…

Google dorks

Yes, ridiculous, but in 2019 we can still use them a lot. You can use them for pretty much everything.

A simple “index of /” + “file name”, sometimes enhanced with “ext:” can give you access to a lot of data.

For instance, how would you go about looking for libcurl dlls?
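As a rough sketch, the query can be assembled from a file name like this. The helper name build_dork is mine, and the only operators used are the ones mentioned above (“index of /” and “ext:”).

```python
def build_dork(file_name, ext=None):
    """Build an open-directory search query for a given file name."""
    # Quote both parts so the search engine treats them as exact phrases.
    parts = ['"index of /"', f'"{file_name}"']
    if ext:
        # Accept "dll" as well as ".dll".
        parts.append(f"ext:{ext.lstrip('.')}")
    return " ".join(parts)


if __name__ == "__main__":
    for name in ("libcurl", "libcurl-4"):
        print(build_dork(name, ext="dll"))
```

Once you have a list of file names, looping over it like this gives you a batch of queries to try one by one.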

FTPing w/o FTPing

FTP repos are almost non-existent today, but you can still find them sometimes (e.g. some vendors still use them). You actually want to find FTP search engines more than the actual FTP sites though.

Why?

Because you don’t want to use them to actually download files per se (although if you can, then it’s great, but most of them link to dead repositories today).

What you need them for is… the file lists.

Your strongest research tool online is the file name. The more unique, the better. Once you have many of these, you can immediately build a google dork for a specific search. With that you have a good chance of finding an actual copy of a file, or files.

Collecting libraries

Every once in a while I go on a hunt for all the possible versions of some library, both static and dynamic versions. Basically, any copy of the library in any form. I need it because I want to have them at hand for comparisons.

Many researchers build collections of libraries like this, because when you don’t have a compilation time in an analyzed executable (e.g. it got wiped out) the version of the library can give you some temporal point of reference.
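For context, the compilation time in question lives in the TimeDateStamp field of the PE file header. Below is a minimal sketch of reading it with nothing but the standard library; the function name is mine. A zero value is treated as "wiped", though a non-zero value can of course be faked or, with reproducible builds, not be a real date at all.

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(data):
    """Return the PE TimeDateStamp as a UTC datetime, or None if it is zero."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        raise ValueError("not an MZ executable")
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        raise ValueError("PE signature not found")
    # File header after the signature: Machine (2 bytes),
    # NumberOfSections (2 bytes), then TimeDateStamp (4 bytes).
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    if stamp == 0:
        return None
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as f:
        print(pe_timestamp(f.read(4096)))
```

When this returns None (or something absurd), the versions of the libraries linked into the file are your next best clue for dating it.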

Another reason for collecting libraries is signatures. If you have a lot of copies of the same library you can build sigs that will help you to name functions inside an executable that uses a statically linked copy of that library.

What to look for?

Anything lib*: libcurl, libpng, libpcre, and then the usual suspects e.g. adlib, pcre, sqlite3, etc.

Downloading DDK and SDK stuff

Yes. You want as many versions of the SDK and DDK as possible.

Yes. You can find them.

How?

Already told you: you need these file names. Badly.

So, how to go about finding the names of the files you want to find?

You primarily need ISO names. Most DDKs and SDKs used to be distributed this way. Lots of them are still there. Somewhere.

How do you find them? You need at least a few file names as a seed 🙂

Here’s a few names:

  • GRMSDK_EN_DVD.iso
  • GRMSDKIAI_EN_DVD.iso
  • GRMSDKX_EN_DVD.iso

Typically, once you know the name of one or two ISOs you will quickly find tons of others. Somehow there is a tendency for anyone who uses them for whatever reason to cluster them (or their names) together with others.

Once you start browsing you will find actual links to downloads. Click the link to see an example. What did I tell you? Many of them are still on the Microsoft web site. So click through the Google results and eventually you will find downloadable versions. And typically, you will get clusters of downloads.

So now you have file names, and links.

Go on. Download it all. Some may not work. Most won’t. But this is one step closer. You could also ask around: if you have a file name, the file is much easier to find than if you ask for a specific SDK version.

Downloading OS ISOs

Plus, there is more.

Okay, you can download them from warez sites, but it’s not recommended. Apart from legal issues, moral principles, there is also a problem of file integrity and malware.

And there are still some ways, but yes, honestly… the rich are privileged. If you don’t have an MSDN subscription you are kinda screwed. In the past you could at least get access to many ISOs via a very cheap TechNet subscription, but this one is gone as well.

The good news is that many ISOs for more recent Windows versions are actually online & often available directly from the Microsoft site!

Again, you need file names.

Or… you need a site that already did all the work for you. I obviously don’t endorse the web site, and you are visiting and using it at your own risk. If you browse it though you will find a lot of OS ISO file names. These may lead you to further searches + actual downloads. Yes, lo and behold, many links present there lead you to the Microsoft site where you can download actual OS ISOs.

Here’s a list of example ISO names:

  • de_windows_7_starter_with_sp1_x86_dvd_u_678545.iso
  • cs_windows_7_enterprise_with_sp1_x86_dvd_u_677695.iso
  • ct_windows_10_multi-edition_version_1709_updated_sept_2017_x86_dvd_100090807.iso
  • ct_windows_10_multi-edition_version_1709_updated_sept_2017_x64_dvd_100090806.iso
  • fr_windows_10_multi-edition_version_1709_updated_sept_2017_x86_dvd_100090827.iso
  • fr_windows_10_multi-edition_version_1709_updated_sept_2017_x64_dvd_100090825.iso
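Names like these are regular enough to be split into parts, which helps when you want to cluster a big list by language, product or architecture. The pattern below is only my guess based on the handful of names listed above (two-letter prefix, product, x86/x64, "dvd", optional "u", numeric ID); it is not an official naming scheme and it will miss names that look different, such as the GRMSDK ones.

```python
import re

# Pattern inferred from the example names only; an assumption, not a spec.
ISO_NAME = re.compile(
    r"^(?P<lang>[a-z]{2})_(?P<product>.+)_(?P<arch>x86|x64)"
    r"_dvd_(?:u_)?(?P<id>\d+)\.iso$",
    re.IGNORECASE,
)


def parse_iso_name(name):
    """Split an ISO file name into lang/product/arch/id, or return None."""
    match = ISO_NAME.match(name)
    return match.groupdict() if match else None


if __name__ == "__main__":
    names = [
        "de_windows_7_starter_with_sp1_x86_dvd_u_678545.iso",
        "fr_windows_10_multi-edition_version_1709_updated_sept_2017_x64_dvd_100090825.iso",
        "GRMSDK_EN_DVD.iso",
    ]
    for n in names:
        print(n, "->", parse_iso_name(n))
```

With the parts separated you can, for example, take one known name and generate search queries for the same product in other languages or for the other architecture.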

Also, you may like this link. It includes a lot of hashes for MSDN-related content.
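A list of known hashes is also the simplest answer to the file integrity problem: whatever corner of the internet an ISO came from, you can check it against the published value. Here is a minimal sketch, assuming the published hashes are SHA-1 hex strings (swap the algorithm if your list uses something else); the function names are mine.

```python
import hashlib


def file_sha1(path, chunk_size=1 << 20):
    """Compute the SHA-1 of a file, reading in chunks so big ISOs fit in memory."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare a file's SHA-1 with a published hex string (case-insensitive)."""
    return file_sha1(path) == published_hex.strip().lower()


if __name__ == "__main__":
    import sys
    # Usage: script.py <file> <expected sha1 hex>
    print(matches_published(sys.argv[1], sys.argv[2]))
```

A mismatch does not tell you what is wrong with the file, only that it is not the one the hash list describes, which is reason enough not to trust it.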

Dead Links

Once you start hunting for ISO files you will notice that in some cases you will find dead links. If you have file names though you should be able to find _some_ repos online that still keep them. Again, you can also ask around.

Downloading Very Old Stuff

It’s time to try the Web Archive ISO project. For example, this link shows you all the ISOs hosted there that come from Microsoft. These look like winners:

  • EN_WIN2000_PRO_SP4.ISO
  • dos71floppy.zip
  • windowsmeisoandbootdisk.zip

Again, keep an eye on the file names. These may lead you to weird corners of the internet where someone somewhere is still sitting on these old Resource Kits, SDKs, DDKs, etc.

Good luck.