Archaeology | Hexacorn

If you are new to reversing, or want to get better, please watch this excellent presentation by Alex Ionescu. He nailed it: reversers specialize in the never-ending acquisition of knowledge and… hamstering ISOs.

Stealing somebody’s preso titles is not a nice thing to do, so I apologize in advance. I hope I am not doing anything wrong tho – I just thought it would be nice to add 5 cents to Alex’ preso. And I mean the practical bit for newcomers.

So, with this post I try to answer the following question: “okay, so Alex is now officially old and has joined the elderly who remember int 21h, but what can I do to catch up? cuz most of the ideas and tricks he presented are already dead because of time, money, access, changes of politics, discontinuation of services and/or technical and NDA restrictions that we can’t legally bypass and that stop us from accessing this mountain of knowledge?”.

So, how would one go about re-creating what he was able to do when he started nearly 20 years ago?

I won’t tell you how to change your work patterns, but I will tell you how to gather data that can rapidly put you right behind him.

Everything starts and ends with data. Start collecting it. Cherish it. Hamster it. Process it. Do not commit digital Tsundoku.

Right… so here it goes…

Google dorks

Yes, ridiculous, but in 2019 we can still use them a lot. You can use them for pretty much everything.

A simple “index of /” + “file name”, sometimes enhanced with “ext:” can give you access to a lot of data.

For instance, how would you go about looking for libcurl dlls?
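One way to answer that is to build the dork strings programmatically, so the same recipe can be reused for every file name you collect. The sketch below only assembles query strings and search URLs; the helper names (build_dork, search_url) and the example DLL names are illustrative assumptions, not something taken from a tool.

```python
from urllib.parse import quote_plus


def build_dork(file_name, ext=None):
    """Return an open-directory dork for one file name."""
    parts = ['"index of /"', f'"{file_name}"']
    if ext:
        # Optional "ext:" operator, as mentioned in the text.
        parts.append(f"ext:{ext}")
    return " ".join(parts)


def search_url(query):
    """Percent-encode the query into a regular Google search URL."""
    return "https://www.google.com/search?q=" + quote_plus(query)


# Illustrative file names only; substitute whatever names you have collected.
for name in ("libcurl.dll", "libcurl-4.dll"):
    dork = build_dork(name)
    print(dork)
    print(search_url(dork))
```

Paste the printed URLs into a browser and click through the results by hand.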

FTPing w/o FTPing

FTP repos are almost non-existent today, but you can still find them sometimes (e.g. some vendors still use them). You actually want to find FTP search engines more than the actual FTP sites though.

Why?

Because you don’t want to use them to actually download files per se (although if you can, then it’s great, but most of them link to dead repositories today).

What you need them for is… the file lists.

Your strongest online research tool is the file name. The more unique, the better. Once you have many of these, you can immediately build a Google dork for a specific search. With that you have a high chance of finding an actual copy of the file, or files.
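A minimal sketch of the "file lists first" idea: take the text of a saved listing (an FTP index, a search engine result page, anything), pull out every file name with an interesting extension, and turn the unique ones into dorks. The extension list and function names are assumptions chosen for illustration.

```python
import re
from collections import Counter

# Extensions worth keeping; an arbitrary starting set, adjust to taste.
WANTED = re.compile(r"[\w.+-]+\.(?:iso|zip|exe|dll|lib|msi)\b", re.IGNORECASE)


def harvest_names(listing_text):
    """Count every interesting file name found in a saved listing."""
    return Counter(m.group(0).lower() for m in WANTED.finditer(listing_text))


def to_dorks(names):
    """Turn each unique file name into an open-directory dork."""
    return [f'"index of /" "{name}"' for name in sorted(names)]


sample = """
-rw-r--r-- 1 ftp ftp 1234 Jan 01 2010 GRMSDK_EN_DVD.iso
-rw-r--r-- 1 ftp ftp 5678 Jan 01 2010 GRMSDKX_EN_DVD.iso
-rw-r--r-- 1 ftp ftp   42 Jan 01 2010 readme.txt
"""
for dork in to_dorks(harvest_names(sample)):
    print(dork)
```

The counts are kept so that, across many listings, you can see which names show up everywhere and which are rare enough to make a precise search.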

Collecting libraries

Every once in a while I go on a hunt for all the possible versions of some library, both static and dynamic versions. Basically, any copy of the library in any form. I need it because I want to have them at hand for comparisons.

Many researchers build collections of libraries like this, because when you don’t have a compilation time in an analyzed executable (e.g. it got wiped out) the version of the library can give you some temporal point of reference.
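For reference, the compilation time in question is the TimeDateStamp field of the COFF header inside a PE file. The sketch below reads it with nothing but the standard library, so you can see quickly whether a sample still carries it or whether you need to fall back on library versions. The function name is made up for this example; a zero stamp is treated as "wiped".

```python
import struct
from datetime import datetime, timezone


def pe_timestamp(data):
    """Return the COFF TimeDateStamp of a PE image as a UTC datetime, or None."""
    if len(data) < 0x40 or data[:2] != b"MZ":
        return None
    # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if len(data) < e_lfanew + 12 or data[e_lfanew:e_lfanew + 4] != b"PE\0\0":
        return None
    # COFF header after the signature: Machine (2), NumberOfSections (2), TimeDateStamp (4).
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    if stamp == 0:
        return None  # wiped or never set
    return datetime.fromtimestamp(stamp, tz=timezone.utc)
```

Usage is simply pe_timestamp(open(path, "rb").read()). Keep in mind that a non-zero value can still be forged, so treat it as a hint and cross-check it against the library versions you find inside.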

Another reason for collecting libraries is signatures. If you have a lot of copies of the same library you can build sigs that will help you name functions inside an executable that uses a statically linked library.
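To make the signature idea concrete, here is a deliberately simplified sketch: take the bytes of the same function from several builds of a library, keep the bytes that never change, and wildcard the ones that do (typically relocated addresses and call targets). Real signature tooling is far more elaborate; this is not how any specific product works, and the sample bytes are invented.

```python
def build_signature(samples):
    """Merge byte strings into a pattern; None marks a wildcard position."""
    if not samples:
        raise ValueError("need at least one sample")
    length = min(len(s) for s in samples)
    pattern = []
    for i in range(length):
        values = {s[i] for s in samples}
        pattern.append(values.pop() if len(values) == 1 else None)
    return pattern


def find_signature(pattern, blob):
    """Return every offset in blob where the pattern matches."""
    hits = []
    for off in range(len(blob) - len(pattern) + 1):
        if all(p is None or blob[off + i] == p for i, p in enumerate(pattern)):
            hits.append(off)
    return hits


def render(pattern):
    """Show the pattern in the usual hex-with-?? notation."""
    return " ".join("??" if p is None else f"{p:02X}" for p in pattern)


# Two invented copies of the "same" function that differ only in a call target.
copies = [
    bytes.fromhex("558BECE8112233445DC3"),
    bytes.fromhex("558BECE8AABBCCDD5DC3"),
]
sig = build_signature(copies)
print(render(sig))
```

The more copies of a library you have, the better the wildcards reflect what really varies between builds, which is the whole point of hoarding them.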

What to look for?

Anything lib*: libcurl, libpng, libpcre, and then the usual suspects e.g. adlib, pcre, sqlite3, etc.

Downloading DDK and SDK stuff

Yes. You want as many versions of the SDK and DDK as possible.

Yes. You can find them.

How?

Already told you: you need these file names. Badly.

So, how to go about finding the names of the files you want to find?

You primarily need ISO names. Most DDKs and SDKs used to be distributed this way. Lots of them are still there. Somewhere.

How do you find them? You need at least a few file names as a seed 🙂

Here’s a few names:

GRMSDK_EN_DVD.iso
GRMSDKIAI_EN_DVD.iso
GRMSDKX_EN_DVD.iso

Typically, once you know the name of one or two ISOs you will quickly find tons of others. Somehow there is a tendency for anyone who uses them for whatever reason to cluster them (or their names) together with others.

Once you start browsing you will find actual links to downloads. Click the link to see an example. What did I tell you? Many of them are still on the Microsoft web site. So click through the Google results and eventually you will find downloadable versions. And typically, you will get clusters of downloads.

So now you have file names, and links.

Go on. Download it all. Some may not work. Most won’t. But this is one step closer. You could also ask around: asking for a file name is much easier than asking for a specific SDK version.
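Before queuing up hundreds of downloads it can help to triage the links first. The sketch below sends a HEAD request to each URL and splits the list into alive and dead. The function names are assumptions, and some servers reject HEAD requests, so a "dead" verdict here only means "check by hand".

```python
import urllib.error
import urllib.request


def head_status(url, timeout=10):
    """Send a HEAD request and return the HTTP status, or None on failure."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # server answered, but with an error status
    except (urllib.error.URLError, OSError, ValueError):
        return None  # DNS failure, timeout, malformed URL, ...


def triage(urls, probe=head_status):
    """Split URLs into (alive, dead) using the given probe function."""
    alive, dead = [], []
    for url in urls:
        status = probe(url)
        (alive if status is not None and status < 400 else dead).append(url)
    return alive, dead
```

The probe is a parameter so you can swap in a GET-based check, a proxy-aware one, or a stub for offline testing.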

Downloading OS ISOs

Plus, there is more.

Okay, you can download them from warez sites, but it’s not recommended. Apart from legal issues and moral principles, there is also the problem of file integrity and malware.

And there are still some ways, but yes, honestly… the rich are privileged. If you don’t have an MSDN subscription you are kinda screwed. In the past you could at least get access to many ISOs via a very cheap Technet subscription, but this one is gone as well.

The good news is that many ISOs for more recent Windows versions are actually online & often available directly from the Microsoft site!

Again, you need file names.

Or… you need a site that already did all the work for you. I obviously don’t endorse the web site, and you are visiting and using it at your own risk. If you browse it though you will find a lot of OS ISO file names. These may lead you to further searches + actual downloads. Yes, lo and behold, many links present there lead you to the Microsoft site, where you can download the actual OS ISOs.

Here’s a list of example ISO names:

de_windows_7_starter_with_sp1_x86_dvd_u_678545.iso
cs_windows_7_enterprise_with_sp1_x86_dvd_u_677695.iso
ct_windows_10_multi-edition_version_1709_updated_sept_2017_x86_dvd_100090807.iso
ct_windows_10_multi-edition_version_1709_updated_sept_2017_x64_dvd_100090806.iso
fr_windows_10_multi-edition_version_1709_updated_sept_2017_x86_dvd_100090827.iso
fr_windows_10_multi-edition_version_1709_updated_sept_2017_x64_dvd_100090825.iso
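These names follow a visible pattern: a language prefix, the product description, the architecture, then a media ID. A small parser makes it easier to group what you have and spot the language or architecture you are still missing. The regular expression below is inferred only from the examples above, so treat it as an assumption; names that do not fit simply return None.

```python
import re

# Pattern guessed from the example names above; not an official naming spec.
ISO_NAME = re.compile(
    r"^(?P<lang>[a-z]{2})_(?P<product>.+)_(?P<arch>x86|x64)"
    r"_dvd_(?:u_)?(?P<media_id>\d+)\.iso$"
)


def parse_iso_name(name):
    """Split an ISO file name into its parts, or return None if it does not fit."""
    match = ISO_NAME.match(name.lower())
    return match.groupdict() if match else None


def group_by_product(names):
    """Map (product, arch) to the list of language prefixes seen for it."""
    groups = {}
    for name in names:
        info = parse_iso_name(name)
        if info:
            groups.setdefault((info["product"], info["arch"]), []).append(info["lang"])
    return groups
```

Note that the media ID differs per language and architecture, so you cannot guess a sibling name by swapping the prefix; the grouping only tells you what to go searching for.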

Also, you may like this link. It includes a lot of hashes for MSDN-related content.
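Published hashes are the best defence against the integrity and malware problem mentioned earlier: hash what you downloaded and compare. A minimal sketch follows; it assumes the list you are comparing against publishes SHA-1 (the algorithm is a parameter in case it does not), and the function names are made up.

```python
import hashlib


def file_hash(path, algorithm="sha1", chunk_size=1 << 20):
    """Hash a file in chunks so multi-gigabyte ISOs do not have to fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex, algorithm="sha1"):
    """Compare a file against a published hex digest, ignoring case and whitespace."""
    return file_hash(path, algorithm) == published_hex.strip().lower()
```

A mismatch does not always mean tampering (it may just be a different build of the same product), but a match against a trusted list is about as good as it gets.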

Dead Links

Once you start hunting for ISO files you will notice that in some cases you will find dead links. If you have file names though you should be able to find _some_ repos online that still keep them. Again, you can also ask around.

Downloading Very Old Stuff

It’s time to try the Web Archive ISO project. For example, this link shows you all ISOs hosted there that come from Microsoft. These look like winners:

EN_WIN2000_PRO_SP4.ISO
dos71floppy.zip
windowsmeisoandbootdisk.zip
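If you prefer to query the archive programmatically, archive.org exposes an advanced search endpoint that can return JSON. The sketch below only builds the URLs. The query string is a guess at what such a search could look like (it is not the link from this post), and the helper names are hypothetical.

```python
from urllib.parse import quote, urlencode


def archive_search_url(query, fields=("identifier", "title"), rows=50):
    """Build an archive.org advanced-search URL that returns JSON."""
    params = [("q", query)]
    params += [("fl[]", field) for field in fields]
    params += [("rows", rows), ("output", "json")]
    return "https://archive.org/advancedsearch.php?" + urlencode(params)


def archive_download_url(identifier, file_name):
    """Build the direct download URL for one file inside an archive.org item."""
    return f"https://archive.org/download/{quote(identifier)}/{quote(file_name)}"


# ASSUMPTION: an illustrative query, adjust the fields and terms to your needs.
print(archive_search_url('mediatype:software AND creator:"Microsoft"'))
```

Each result carries an item identifier; the item's file list then gives you, once again, file names to feed into your other searches.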

Again, keep an eye on the file names. These may lead you to weird corners of the internet where someone somewhere is still sitting on these old Resource Kits, SDKs, DDKs, etc.

Good luck.

This is not very important research, really. Just a ‘blurb’ of what I observed during my quick tests.

So…

First of all, I noticed that .dctx files are being handled by this program:

C:\Windows\System32\IME\shared\IMEWDBLD.EXE

These are dictionary files (source) and are compiled to some other binary format (.dctc AFAICT). These dictionaries seem to be heavily used (and needed?) for Asian languages, so most of the info on them can be found online on forums discussing Japanese and Chinese language keyboard input.

Examples: here, and here.

When you open a .dctx file on Windows 10 you will be presented with this dialog box:

When we click OK, we will see another dialog box:

I have not figured out what that means, but it seems to be a highly prevalent error and many users report it. I couldn’t bypass it despite toying around with various parameters embedded inside my test .dctx file. I tried to use variations of English language (US vs. UK), different encoding, etc., but it always comes back with the same error.

Also, after looking at IMEWDBLD.EXE, I noticed that it takes a -v <logfile> command line argument (where -v stands for -verbose, I guess). Using it during testing is a better alternative to the non-descriptive dialog box shown above. After trying to open the very same .dctx with IMEWDBLD.EXE and the -v flag enabled, I observed this in the output of the log file:

Error: Encountered fatal error(0x80070057:The parameter is incorrect.).
Error: There is a problem with the dictionary file. Please try to download again.
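For repeated experiments, a tiny wrapper that runs the tool with -v and returns the log saves a lot of clicking. This is only a sketch: the argument order (dictionary path first, then -v and the log file) is an assumption, since only the existence of the -v <logfile> flag is known, and the log is returned as raw bytes because its encoding is not established here.

```python
import subprocess
from pathlib import Path

IMEWDBLD = r"C:\Windows\System32\IME\shared\IMEWDBLD.EXE"


def build_command(dctx_path, log_path):
    # ASSUMPTION: dictionary path first, then -v <logfile>; the order is not confirmed.
    return [IMEWDBLD, str(dctx_path), "-v", str(log_path)]


def run_and_read_log(dctx_path, log_path):
    """Run the dictionary builder (Windows only) and return the raw log bytes."""
    subprocess.run(build_command(dctx_path, log_path), check=False)
    log = Path(log_path)
    return log.read_bytes() if log.exists() else b""
```

Calling run_and_read_log("test.dctx", "out.log") in a loop over mutated .dctx files makes it easy to see whether any change produces a different error.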

Unfortunately, this error is very prevalent inside the binary (IMEWDBLD.EXE), so I didn’t spend too much time trying to figure it out. Okay, if you must know, 0x80070057 stands for an invalid argument. Would be really handy to know which argument triggered it… hmm…..
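For completeness, here is how that number breaks down. An HRESULT packs a failure bit, a facility and a 16-bit code; in 0x80070057 the facility is 7 (Win32) and the code is 0x57, i.e. 87, the Win32 ERROR_INVALID_PARAMETER. The helper name is made up, and the decoding follows the standard HRESULT layout.

```python
FACILITY_WIN32 = 7
ERROR_INVALID_PARAMETER = 87


def decode_hresult(value):
    """Split a 32-bit HRESULT into its failure bit, facility and code."""
    value &= 0xFFFFFFFF
    return {
        "failure": bool(value >> 31),       # top bit set means failure
        "facility": (value >> 16) & 0x7FF,  # 11-bit facility field
        "code": value & 0xFFFF,             # low 16 bits
    }


print(decode_hresult(0x80070057))
```

So the HRESULT is just a wrapped Win32 error, which is also why it tells you nothing about which parameter was wrong.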

So, that’s it really.

If you want to play around, this is a minimalistic sample .dctx file you can try to import on your Windows 10 system. Download, and double click. That’s it.

Bonus

I think the IME components are not very well researched and can potentially offer mechanisms that will allow for less-known attacks focused on:

persistence
bypassing security controls
RCE

Why?

They seem to be developed for a niche (but not negligible due to number!) group of users in Asia (Japanese, Chinese), and most likely have been poorly tested. The last IME-related research I could find is here.

Why?

If you look at IMEWDBLD.EXE binary you will notice a bunch of flags that are not documented anywhere on the internet. Hence, they could be limited to a test environment at MS, or only taken into account on OS versions that require IME. The lower the scope, the lower the testing priority. A.K.A. if it is not documented on the Internet, then it’s likely internal.

Some food for thought:

HKLM\SOFTWARE\Microsoft\IME\PlugInDict
EncryptAllPlugInDict
DisableAllPlugInDict
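To see what is actually present on a given box, you can dump every value under that key. The sketch below assumes nothing about which values exist or what type they have; it simply enumerates whatever is there, and returns an empty dict when run on a non-Windows system or when the key is missing or unreadable. The function name is hypothetical.

```python
try:
    import winreg  # standard library, Windows only
except ImportError:
    winreg = None

KEY_PATH = r"SOFTWARE\Microsoft\IME\PlugInDict"


def read_plugin_dict_values():
    """Return {value name: data} for the PlugInDict key, or {} if unavailable."""
    if winreg is None:
        return {}
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value_count = winreg.QueryInfoKey(key)[1]
            values = {}
            for index in range(value_count):
                name, data, _value_type = winreg.EnumValue(key, index)
                values[name] = data
            return values
    except OSError:
        return {}


print(read_plugin_dict_values())
```

Comparing the output before and after importing a dictionary is a cheap way to learn which of these values the IME components touch.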

Command line arguments for IMEWDBLD.EXE:

-encrypt <unknown>
-pluginguid <guid>
-w <unknown>
-pm <unknown>
-v <logfile> – saves the verbose info to logfile
-nofilter <unknown>
-testing <unknown>
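A list like this can be pulled out of a binary without a disassembler: extract the ASCII and UTF-16LE strings and keep the ones that look like switches. The sketch below shows that approach. It is an illustration rather than the exact method used for the list above, and it will miss flags that the binary stores without the leading dash.

```python
import re

ASCII_RUN = re.compile(rb"[\x20-\x7e]{2,}")
UTF16_RUN = re.compile(rb"(?:[\x20-\x7e]\x00){2,}")
FLAG = re.compile(r"-[A-Za-z]\w*")


def extract_strings(blob):
    """Collect printable ASCII and UTF-16LE strings from raw bytes."""
    found = {m.group(0).decode("ascii") for m in ASCII_RUN.finditer(blob)}
    found |= {m.group(0).decode("utf-16-le") for m in UTF16_RUN.finditer(blob)}
    return found


def candidate_flags(blob):
    """Return the strings that look like command line switches, sorted."""
    return sorted(s for s in extract_strings(blob) if FLAG.fullmatch(s))


# Synthetic test data; for the real thing pass open(path, "rb").read().
demo = (
    b"\x00\x01" + "-encrypt".encode("utf-16-le") + b"\x00\x00"
    + "-v".encode("utf-16-le") + b"\x00\x00" + b"-testing\x00hello\x00"
)
print(candidate_flags(demo))
```

The minimum run length is two characters on purpose, so short switches such as -v and -w are not dropped; the price is more noise before filtering.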

Reversing w/o reversing – how to become Alex in practice

Dictionary files (.dctx)