Java – y u haz no class? (proxy logs patterns: class.class, com.class, edu.class, net.class, org.class)

July 9, 2013 in Forensic Analysis, Malware Analysis, Proxy Logs Analysis

TL;DR: This post explains why we see /class.class, /com.class, /edu.class, /net.class and /org.class in proxy logs. The first is requested when the .class file is missing on the server; the other four are requested when Java applet code uses the JavaScript engine (Rhino).

Intro

Proxy log analysis is the art of finding suspicious stuff in a HUGE amount of logs generated by web traffic (and sometimes traffic on other ports as well). These logs are evidence of:

  • Employees browsing internet
  • Employees doing something ‘funny’ (tunneling, proxy bypass, etc.)
  • Computer running software connecting to the Internet
  • System getting updates
  • and… of course, malicious activity – both the one that indicates malware trying to sneak in on the system by means of a drive-by, and one that is a sign of malware actively running on the system.

Typically, to analyze logs efficiently one relies on the classification of IPs/URLs/domain names, etc. provided automatically by various vendors/engines, and/or on the IPs/host names shared by various threat intel companies and groups. Other approaches include statistics, histograms, and in general data mining of your own logs in any possible way. I often prefer the latter, because the evidence of all web-based incidents in your org is right there, and only analyzing it can help you find the stuff that AV, external threat intel and other security controls miss.

The patterns

The patterns that are the topic of this post are very characteristic and can help cherry-pick Java-specific badness in the logs. I (like many others) came across them during manual data mining and, since I don’t recall seeing them described anywhere else, I thought it may be useful to document them. The other reason that motivated me to write this post is clarification: I noticed some researchers incorrectly attribute this legitimate Java traffic to exploit packs. While such traffic may be generated by Java code that is part of an exploit pack, the patterns themselves are NOT an indication of malware/exploit/drive-by activity.

All of these strange .class requests are a side effect of one or more of the following being true at the time the request goes through the proxy to the web server serving the content:

  • Legitimate, clean Java file OR exploit pack is buggy or deployed incorrectly
  • Legitimate, clean Java file OR exploit pack deployed correctly and working, but… removed since (we don’t care how)
  • Specific components of Java are being utilized

In other words, the reason the following requests:

  • /class.class
  • /com.class
  • /edu.class
  • /net.class
  • /org.class

are present in the web logs is Java itself, when it is launched by a browser to render APPLET/EMBED/OBJECT HTML tags.

Since it’s fun to experiment and you may want to do so yourself, I am providing a bunch of examples that you can use to test the whole thing in your own lab. The examples were executed on IE 8.0 with Java 1.6.

Requests to a non-existing .class file

We write a simple HTML page, place it on the test web server and load it in IE.

The code is as follows:

<html>
<applet code="code_only.class" width="0" height="0"></applet>
</html>

When such code is rendered by the browser, it loads Java, and Java requests the code_only.class file. If the file doesn’t exist on the server, we will observe the following traffic:

[Image: java_logs_code_only]

The request for the non-existing code_only.class file is followed by Java requesting the code_only/class.class file.

From the analyst’s perspective: searching for all instances of requests to /class.class in your logs may give you a list of potential new drive-by web sites (note, sometimes corporate web sites are buggy and you may see /class.class requests from non-malicious IPs too, so keep your eyes open!).
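A minimal sketch of that search, assuming the proxy log has already been reduced to one requested URL per line (real proxy log formats vary, and the class and method names here are made up for illustration):

```java
import java.net.URI;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class ClassDotClassHunter {
    /** Returns the sorted set of hosts that received a request ending in /class.class. */
    static Set<String> suspiciousHosts(List<String> urls) {
        Set<String> hosts = new TreeSet<>();
        for (String url : urls) {
            URI uri;
            try {
                uri = URI.create(url.trim());
            } catch (IllegalArgumentException e) {
                continue; // skip lines that are not valid URLs
            }
            String path = uri.getPath();
            if (path != null && uri.getHost() != null && path.endsWith("/class.class")) {
                hosts.add(uri.getHost());
            }
        }
        return hosts;
    }

    public static void main(String[] args) {
        // Hypothetical lab URLs, not taken from real logs
        List<String> urls = Arrays.asList(
            "http://lab.example/test/code_only.class",
            "http://lab.example/test/code_only/class.class",
            "http://intranet.example/index.html");
        System.out.println(suspiciousHosts(urls)); // prints [lab.example]
    }
}
```

The result is only a list of candidates: as noted above, buggy but harmless sites will show up in it too.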

Requests to a non-existing .class file, take 2

Since we know that Java will request class.class file if it cannot find the original code_only.class file, we can try to place code_only/class.class file on the server and observe what happens when the same page is requested (ensure you clear both IE and Java cache before you do + kill java.exe instance!):

[Image: java_logs_code_only_dirclass]

No surprises here, the file is actually downloaded.

Requests to a non-existing archive (.jar) file

<html>
<applet archive="archive_only.jar" width="0" height="0"></applet>
</html>

When such code is rendered and archive_only.jar doesn’t exist on the server, we will observe the following traffic:

[Image: java_logs_archive_only]

Strange. Why is there no traffic to archive_only.jar at all?

If you look at the HTML code above, you will notice that there is no code attribute.

No code == no rendering. Yes, it’s just a buggy HTML code. I put it here only for the sake of completeness, because from the analysis perspective, it has no value as you won’t be able to observe anything specific related to this behavior in the proxy logs at all (the traffic ‘doesn’t happen’ here).

Requests to a non-existing archive (.jar) file, take 2

Okay, we now add the code attribute.

<html>
<applet archive="archive_and_code.jar" code="archive_and_code.class" width="0" height="0"></applet>
</html>

When such code is rendered we will observe the following traffic:

[Image: java_logs_archive_and_code]

The request for the non-existing archive_and_code.jar is seen first and is repeated a few times; Java then attempts to load the non-existing archive_and_code.class file. Since that doesn’t exist either, it then requests archive_and_code/class.class, same as in the first two cases discussed above.
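The request order seen so far can be sketched as a small helper that predicts which relative paths should show up in the logs for an APPLET tag when none of the files exist on the server. It only mirrors what was observed in this lab (IE 8.0, Java 1.6); the class and method names are made up, and the idea that Java falls back to reading the whole code value as a dotted class name is an assumption about why the class.class request appears.

```java
import java.util.ArrayList;
import java.util.List;

class AppletRequestPredictor {
    /**
     * Predicts the relative paths requested for an APPLET tag whose files are all missing.
     * archive may be null when the tag has no archive attribute.
     */
    static List<String> expectedRequests(String archive, String code) {
        List<String> paths = new ArrayList<>();
        if (code == null || code.isEmpty()) {
            return paths; // no code attribute: no traffic was observed at all
        }
        if (archive != null && !archive.isEmpty()) {
            paths.add(archive); // the JAR is tried first (and may be retried)
        }
        // Class name with the .class suffix stripped, e.g. "code_only"
        String stripped = code.endsWith(".class")
                ? code.substring(0, code.length() - ".class".length())
                : code;
        paths.add(stripped.replace('.', '/') + ".class");
        // Assumed fallback: the full value read as a dotted name,
        // i.e. package "code_only", class "class"
        if (code.endsWith(".class")) {
            paths.add(code.replace('.', '/') + ".class");
        }
        return paths;
    }

    public static void main(String[] args) {
        System.out.println(expectedRequests(null, "code_only.class"));
        // prints [code_only.class, code_only/class.class]
        System.out.println(expectedRequests("archive_and_code.jar", "archive_and_code.class"));
        // prints [archive_and_code.jar, archive_and_code.class, archive_and_code/class.class]
    }
}
```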

Request to an existing archive (.jar) file, but without the proper class

We will use the same code:

<html>
<applet archive="archive_and_code.jar" code="archive_and_code.class" width="0" height="0"></applet>
</html>

and also drop a simple JAR (I just used a random zip archive) file named archive_and_code.jar on the server. It doesn’t contain the proper class.

When such code is rendered we will observe the following traffic:

[Image: java_logs_presarchive_and_coder]

The request for the existing archive_and_code.jar is seen first. Since in this case the archive doesn’t contain the actual archive_and_code.class file, Java ignores it and goes on to try loading the non-existing archive_and_code.class file from the server.

Since that doesn’t exist either, Java then requests archive_and_code/class.class. All in all, almost the same behavior as in the earlier cases.

Request to an existing archive (.jar) file, and with a proper class file

Re-using the same HTML code and JAR file as in the previous case, but adding the archive_and_code.class file to the JAR, makes the traffic simpler: only the JAR file is requested.

[Image: java_logs_presarchive_and_prescode]

No surprises here either. Java downloads the JAR file, extracts the .class file and executes the code. If the file is corrupted or in a completely incorrect format, Java just ignores it (hopefully; perhaps there are still some 0-day bugs waiting to be found in this area).

Getting requests to com.class, edu.class, net.class, org.class

Okay, at first these can be a bit puzzling, as they very often appear in the logs in very close proximity to requests related to exploit packs, and it’s natural to assume they are part of malicious traffic. Well, to some extent they are, but as stated earlier, they are a side effect of how the Java engine behaves and are not the result of specific requests made by an exploit pack or malicious Java code.

It turns out that the requests for these 4 classes are a result of applet code (it could be an exploit pack, but also legitimate code) using a ScriptEngineManager object, the javax.script class through which Java code obtains script engines, including one that can execute JavaScript. To see when exactly this behavior triggers, let’s implement a simple Java applet first:

import java.applet.Applet;
import javax.script.ScriptEngineManager;

public class call_ScriptEngineManager extends Applet
{
    public void init()
    {
        // Only create the manager; no script engine is requested yet
        ScriptEngineManager factory = new ScriptEngineManager();
    }
}

All this code does is instantiate the ScriptEngineManager object.

Launching the applet from an HTML page:

<html>
<applet code="call_ScriptEngineManager.class" width="0" height="0"></applet>
</html>

leads to the following traffic being generated:

[Image: java_logs_ScriptEngineManager1]

Oops. It looks like the instantiation alone is not enough.

Oh well, obviously there is something missing in my code. ScriptEngineManager is a manager of script engines, so we need to tell it which engine we want to work with!

Let’s modify the applet code pasted above and add one line requesting the JavaScript engine to be instantiated:

import java.applet.Applet;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;

public class call_ScriptEngineManager extends Applet
{
    public void init()
    {
        ScriptEngineManager factory = new ScriptEngineManager();
        // The added line: ask the manager for the JavaScript engine
        ScriptEngine engine = factory.getEngineByName("JavaScript");
    }
}

Loading such an applet leads to the following traffic being observed:

[Image: java_logs_ScriptEngineManager2]

Well, there you have it. Requests to:

  • /com.class
  • /edu.class
  • /net.class
  • /org.class

appear when we instantiate the JavaScript engine via ScriptEngineManager.
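Since the four requests show up together, one way to hunt for them is to flag any host and directory under which all four names were requested. A rough sketch, again assuming one requested URL per input line; the class name, method name and sample URLs are hypothetical:

```java
import java.net.URI;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TopPackageProbeFinder {
    private static final Set<String> PROBES =
        new HashSet<>(Arrays.asList("com.class", "edu.class", "net.class", "org.class"));

    /** Returns host+directory prefixes under which all four probe names were requested. */
    static Set<String> scriptEngineLocations(List<String> urls) {
        Map<String, Set<String>> seen = new TreeMap<>();
        for (String url : urls) {
            URI uri;
            try {
                uri = URI.create(url.trim());
            } catch (IllegalArgumentException e) {
                continue; // skip lines that are not valid URLs
            }
            String path = uri.getPath();
            if (uri.getHost() == null || path == null) continue;
            int slash = path.lastIndexOf('/');
            if (slash < 0) continue;
            String file = path.substring(slash + 1);
            if (!PROBES.contains(file)) continue;
            // Group by host plus directory, e.g. "lab.example/applets/"
            String key = uri.getHost() + path.substring(0, slash + 1);
            seen.computeIfAbsent(key, k -> new HashSet<>()).add(file);
        }
        Set<String> hits = new TreeSet<>();
        for (Map.Entry<String, Set<String>> e : seen.entrySet()) {
            if (e.getValue().size() == PROBES.size()) hits.add(e.getKey());
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
            "http://lab.example/applets/com.class",
            "http://lab.example/applets/edu.class",
            "http://lab.example/applets/net.class",
            "http://lab.example/applets/org.class",
            "http://other.example/org.class");
        System.out.println(scriptEngineLocations(urls)); // prints [lab.example/applets/]
    }
}
```

A hit only says that an applet at that location used the JavaScript engine; as stressed earlier, that alone is not proof of an exploit pack.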

What follows is just a guess (I don’t know the Rhino source code and architecture at all), but looking at the source code, the activity seems to be triggered by the following snippets (if anyone can confirm it, please let me know, thanks!):

  • src\org\mozilla\javascript\ScriptRuntime.java

    static String[] getTopPackageNames() {
        // Include "android" top package if running on Android
        return "Dalvik".equals(System.getProperty("java.vm.name")) ?
            new String[] { "java", "javax", "org", "com", "edu", "net", "android" } :
            new String[] { "java", "javax", "org", "com", "edu", "net" };
    }
    [...]
    for (String packageName : getTopPackageNames()) {
        new LazilyLoadedCtor(scope, packageName,
                "org.mozilla.javascript.NativeJavaTopPackage", sealed, true);
    }
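To illustrate why a lookup of a bare top-level name would surface as a request such as /com.class: a class loader turns a class name into a resource path by replacing dots with slashes and appending .class. The sketch below uses a made-up loader that only records the path each failed lookup maps to. It does not reproduce Rhino or the real applet class loader, and whether Rhino actually performs such lookups remains the guess stated above.

```java
import java.util.ArrayList;
import java.util.List;

class TopPackageLookupDemo {
    /** Hypothetical loader that records the path each unresolved class name maps to. */
    static class RecordingLoader extends ClassLoader {
        final List<String> requested = new ArrayList<>();

        @Override
        protected Class<?> findClass(String name) throws ClassNotFoundException {
            // An applet loader would fetch this path over HTTP; here we only record it
            requested.add("/" + name.replace('.', '/') + ".class");
            throw new ClassNotFoundException(name);
        }
    }

    /** Tries to load each name as a class and returns the paths that were looked up. */
    static List<String> probe(String... names) {
        RecordingLoader loader = new RecordingLoader();
        for (String name : names) {
            try {
                loader.loadClass(name);
            } catch (ClassNotFoundException expected) {
                // no class is literally named "com", "edu", etc.
            }
        }
        return loader.requested;
    }

    public static void main(String[] args) {
        System.out.println(probe("com", "edu", "net", "org"));
        // prints [/com.class, /edu.class, /net.class, /org.class]
    }
}
```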

How can we use this data?

I personally came up with a few ideas, which I used directly to enhance the defenses of the org I was protecting. All of them relied on knowing the new paths / file name patterns used by exploit packs. These change often, and keeping ahead of the curve allows us to:

  • Define regular expressions or static patterns for proxy log blocks / alerts (these can also be added to a SIEM if you use one for proxy log analysis)
  • Search online and find actual domain names using these – if done manually this is a very tedious job, but can be automated – these can be blocked as well
  • Find actual new malware samples for submitting to AV (malicious web sites sometimes don’t protect against directory listing and knowing a specific file name makes it possible to find the full repository of new malware samples)

There are probably more ideas possible.