About Obfuscation - RetroGuard Documentation



Java bytecode (*.class files) retains almost all of the information, apart from comments and formatting, that is in Java source (*.java) files. Using a tool called a decompiler, a hostile competitor can easily reverse engineer your Java classes. To counter this threat, you can obfuscate your class files before distributing your software.

The obfuscation process strips all unnecessary information from the classes. This includes the line number tables, local variable names and source file names used by debuggers. Also, class, interface, field and method identifiers are renamed to render them meaningless. The Java virtual machine, which runs your bytecode, does not care at all about these changes. However, the decompiled version of these classes is extremely difficult to understand, frustrating any attempt to reverse engineer your code. The changes that an obfuscator makes to your Java classes are not reversible - there is no automated way for a reverse engineer to recover the lost information about your code.

An additional benefit of obfuscation is a substantial reduction in the size of your Java classes, due to the removal of unnecessary information and the replacement of long, human-readable identifiers with short, machine-generated names. This size reduction leads to faster download times for your Java applets, and the ability to pack more features into your MIDlets running on small devices like cellphones and PDAs.
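To make the size argument concrete, here is a minimal sketch of a short-name generator. It is an illustration only: the class `ShortNames`, its methods, and the naming sequence (a, b, ..., z, aa, ab, ...) are assumptions made for this example and are not taken from RetroGuard.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not RetroGuard's actual naming scheme.
final class ShortNames {

    // Returns the index-th name in the sequence a, b, ..., z, aa, ab, ...
    static String nameFor(int index) {
        StringBuilder sb = new StringBuilder();
        int n = index;
        do {
            sb.append((char) ('a' + n % 26)); // least significant letter first
            n = n / 26 - 1;
        } while (n >= 0);
        return sb.reverse().toString();
    }

    // Maps each distinct identifier to the next unused short name.
    static Map<String, String> rename(List<String> identifiers) {
        Map<String, String> mapping = new LinkedHashMap<>();
        for (String id : identifiers) {
            if (!mapping.containsKey(id)) {
                mapping.put(id, nameFor(mapping.size()));
            }
        }
        return mapping;
    }
}
```

With this sketch, an identifier such as `calculateInvoiceTotal` (21 characters) would be stored as a single character wherever the class file records the name.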

To determine which classes are to be obfuscated, most obfuscators start at a single entry point (usually the 'main' method of an application, or the 'Applet'-derived class for an applet), and construct a tree of all classes accessible from that point. Unfortunately, this method is quite limiting and works only in simple cases. If your Java code has multiple entry points (for example several applications, applets, or JavaBeans, or code intended to be used as a Java library), this method is just not flexible enough.
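The single-entry-point approach described above amounts to a graph traversal. The following sketch assumes the class references have already been extracted into a map; the class `Reachability`, the method `reachableFrom`, and the class names in the usage note are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of the single-entry-point strategy; not taken from any real obfuscator.
final class Reachability {

    // 'references' maps a class name to the names of the classes it refers to.
    static Set<String> reachableFrom(String entryPoint, Map<String, List<String>> references) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>();
        pending.add(entryPoint);
        while (!pending.isEmpty()) {
            String current = pending.remove();
            if (seen.add(current)) { // true only the first time a class is visited
                List<String> next = references.getOrDefault(current, Collections.<String>emptyList());
                pending.addAll(next);
            }
        }
        return seen;
    }
}
```

If `Main` and `PluginApplet` both refer to `Helper`, a traversal that starts at `Main` finds `Main` and `Helper` but never visits `PluginApplet`. A second entry point is simply invisible to this strategy, which is the limitation the paragraph above describes.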

Instead, RetroGuard obfuscates all classes and interfaces within a JAR file. JAR files are the industry standard mechanism for packaging Java classes for distribution, and it is easy to package your classes as a JAR using the 'jar' utility distributed with the Java Development Kit from Sun Microsystems. Any number of entry points to the JAR can be specified using a RetroGuard script file. This allows the obfuscation process to be completely flexible.

A technique used by several obfuscators is to introduce corrupt bytecode into the obfuscated Java classes. These corruptions are prohibited by the definitive text, the Java Virtual Machine Specification by Lindholm and Yellin, but happen to go unnoticed by current virtual machine implementations. The corruptions are sufficient to break some of the simpler decompilers on the market. This class corruption is a very dangerous course to take, however, since virtual machines will certainly enforce the constraints of the Specification much more strictly in the future. At that point, code which uses this 'corrupting obfuscation' will simply fail.

Corruption of classes is unacceptable - one cannot afford to ship Java bytecode which only sometimes runs, or fails completely on some virtual machines. For this reason the RetroGuard obfuscator produces only verifiable bytecode in full compliance with the Java Virtual Machine Specification. Instead of corrupting the bytecode, RetroGuard uses heavy overloading of identifiers (multiple uses of method names within a class) and the introduction of Java source-code keywords as identifiers to make it almost impossible to understand decompiled Java classes.
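The effect of heavy overloading can be sketched with a hand-made example. Both classes below are hypothetical and were renamed by hand, not by RetroGuard. The second class behaves exactly like the first, but every member shares the name 'a'. Java source-code keywords used as identifiers cannot be shown here, because such names are only possible in bytecode and would not compile as source.

```java
// Hypothetical original class, written for this illustration.
final class Invoice {
    private int total;

    void addItem(int price) { total += price; }

    void addItem(int price, int quantity) { total += price * quantity; }

    int total() { return total; }

    String describe(String customer) { return customer + ": " + total; }
}

// The same logic after a hand-made renaming: the class, the field and all
// four methods are called 'a' and are told apart only by their signatures.
final class a {
    private int a;

    void a(int a) { this.a += a; }

    void a(int a, int b) { this.a += a * b; }

    int a() { return a; }

    String a(String a) { return a + ": " + this.a; }
}
```

A reader of the second class has to resolve every occurrence of 'a' from its context and argument types, which is what makes decompiled output of this kind so tedious to follow.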

Another technique that is often suggested to prevent decompilation is encryption of Java classes and the use of a custom classloader to decrypt them. However, since the decrypted classes can always be intercepted using a modified version of the 'java.lang.ClassLoader' method 'defineClass', that technique is fundamentally flawed. The issue is explained very clearly in Vladimir Roubtsov's article at JavaWorld.
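The weakness can be seen in a minimal sketch of such a classloader. Everything specific here is an assumption made for illustration: the class name `DecryptingClassLoader`, the in-memory map of encrypted classes, and the single-byte XOR that stands in for a real cipher.

```java
import java.util.Map;

// Hypothetical sketch of an "encrypted classes" loader; XOR is a toy stand-in for a real cipher.
final class DecryptingClassLoader extends ClassLoader {
    private final Map<String, byte[]> encryptedClasses; // class name -> encrypted class file
    private final byte key;

    DecryptingClassLoader(Map<String, byte[]> encryptedClasses, byte key) {
        this.encryptedClasses = encryptedClasses;
        this.key = key;
    }

    // XOR with a one-byte key; applying it twice restores the input.
    static byte[] xor(byte[] data, byte key) {
        byte[] out = new byte[data.length];
        for (int i = 0; i < data.length; i++) {
            out[i] = (byte) (data[i] ^ key);
        }
        return out;
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] encrypted = encryptedClasses.get(name);
        if (encrypted == null) {
            throw new ClassNotFoundException(name);
        }
        byte[] plain = xor(encrypted, key);
        // 'plain' is now an ordinary, unencrypted class file. Whatever cipher is
        // used, the bytes must reach defineClass in this form, so a modified
        // defineClass can write them to disk.
        return defineClass(name, plain, 0, plain.length);
    }
}
```

However strong the cipher, the virtual machine can only define a class from plaintext bytes, so the decrypted class file always passes through 'defineClass' where it can be captured.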


 Copyright © 1998-2007 Retrologic Systems.
 All rights reserved.