Many claims have been made for the security of Java. A lot of these claims have been rather exaggerated, but underlying them is the fact that security was designed-in at an early stage in the development of the language. Saying that Java has strong security is like challenging the world to find the holes in it, which is exactly what has happened. Some very clever (and very devious) people have been applying their brain-power to the problem of breaking down the Java defenses.
In this chapter we give a high-level view of how Java defends itself and then summarize the different ways in which it can be attacked.
Most of the books on the subject deal with Java as a programming language. As a programming language it has much to recommend it. Its syntax is very like C, but with many of the features that hurt your brain removed. It is strongly object-oriented, but it avoids the more obscure corners of the O-O world.
For most programming languages the question of "how secure is it" does not arise. It's the application that needs to implement security, not the language it is written in. However, Java is many other things in addition to being a programming language:
It is not surprising that Java has become so widely accepted, so quickly. Before we look at the security issues, let us review some Java fundamentals.
There are a number of different components to Java:
Once you have installed the JDK, you can start creating Java source code and compiling it. Java is like any other high-level programming language, in that you write the source code in an English-like form. The source code then has to be converted into a form that the machine can understand before it can be executed. To perform this conversion for a normal language, the code is usually either compiled (converted once and stored as machine code) or interpreted (converted and executed at runtime).
Java combines these two approaches. The source code has to be compiled with a Java compiler, such as javac, before it can be used. This is a conventional compilation, but the output it produces is not machine-specific code; instead it is bytecode, a system-independent format. We will take a closer look at how bytecode is constructed in Java Bytecode.
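For example, consider a trivial class (the class name here is purely illustrative):

// HelloWeb.java -- a minimal class, used only to illustrate the compile step
public class HelloWeb {
    public static void main(String[] args) {
        System.out.println("Hello from bytecode");
    }
}

Compiling it with javac HelloWeb.java produces HelloWeb.class, a file containing bytecode rather than machine code for any particular processor; the same .class file can be carried to any system that has a JVM.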
In order to execute, the compiled code has to be processed by an interpreter, which is part of the Java execution environment (known as the Java virtual machine, or JVM). The JVM is a runtime platform, providing a number of built-in system services, such as thread support, memory management and I/O, in addition to the interpreter.
Java is an object-oriented language, meaning that a program is composed of a number of object classes, each containing data and methods. One result of this is that, although a program may consist of just a single class, when you have compiled it into bytecode only a small proportion of the code that gets executed is likely to be in the resulting .class file. The rest of the function will be in other classes that the main program references. The JVM uses dynamic linking to load these classes as they are needed. As an example, consider the simple applet contained in the following two Java source files:
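(The listings are not reproduced in full in this extract; the sketches below show what the two files might look like, based on the description that follows. The button label, the colors and the event-handling details are our own assumptions.)

// pointlessButton.java -- illustrative reconstruction of the first listing
import java.applet.Applet;
import JamJar.examples.Button;

public class pointlessButton extends Applet {
    public void init() {
        // Add our own Button class, not java.awt.Button
        add(new Button("Press me"));
    }
}

// Button.java, in the JamJar.examples package -- illustrative reconstruction
package JamJar.examples;

import java.awt.Color;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;

public class Button extends java.awt.Button {
    public Button(String label) {
        super(label);
        // Behave like a normal button, but change color under the mouse pointer
        addMouseListener(new MouseAdapter() {
            public void mouseEntered(MouseEvent e) {
                setBackground(Color.yellow);
            }
            public void mouseExited(MouseEvent e) {
                setBackground(Color.lightGray);
            }
        });
    }
}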
The first listing, pointlessButton.java, is an applet that places a button on the Web page. It is not a very useful button, but we like it. Instead of using the standard AWT Button class it uses a class of our own, also called Button (see the second listing), but in a locally-written package. This works like a normal button, except that it changes color when you move the mouse pointer over it. Running the pointlessButton Applet shows two copies of the applet running in a Web page.
The total size of the bytecode for this example is only 2 KB. However, the two classes cause a lot of other code to be dynamically loaded, either as a result of inheritance (defined by the extends keyword in the class definition) or by instantiation (when a class creates an instance of another class with the new keyword). Classes Loaded for the pointlessButton Applet shows the hierarchy of classes that could potentially be loaded to run our simple applet (this is a simplified view, because it does not consider classes that may be invoked by classes above the lowest level of the hierarchy).
This diagram illustrates a number of things about Java classes:
Java is unusual in the breadth of function that its built-in class frameworks provide; however, for a project of any complexity you are likely to employ graphical tools, such as a visual application builder (VAB), to link together predefined components, thereby reducing the code you have to write to the core logic of the application. Examples of VABs include IBM VisualAge for Java and Lotus Development's BeanMachine.
A component in this context is a package of Java classes that perform a given function. The JavaBeans definition describes a standard for components, known as Beans. Basically, a Bean is a package of code containing both development and runtime components that:
From this list you can infer that, although a Bean is mostly made up of Java classes, it can also include other files containing persistent information and other resources, such as graphical elements. These elements are all packed (or pickled) together in a JAR (Java Archive) file.
From a security viewpoint, VABs and Beans do not affect the underlying strengths and weaknesses of Java. However, they may add more uncertainty, in that your application now includes sizeable chunks of code that you did not directly write. Their ability to provide interfaces to other component architectures may also cause problems, as we discuss in Interfaces and Architectures.
We have said that the Java virtual machine operates on the stream of bytecode as an interpreter. This means that it processes bytecode while the program is running and converts it to "real" machine code that it executes on the fly. You can think of a computer program as being like a railroad track, with the train representing the execution point at any given time. In the case of an interpreted program it is as if this train has a machine mounted on it, which builds the track immediately in front of the train and tears it up behind. It's no way to run a railroad.
Fortunately, in the case of Java, the virtual machine is not interpreting high-level language instructions, but bytecode. This is really machine code, written for the JVM instruction set, so the interpreter has much less analysis to do, resulting in execution times that are very fast. The JVM often uses "Just in Time" (JIT) compiler techniques to allow programs to execute faster, for example, by translating bytecode into optimized local code once and subsequently running it directly. Advances in JIT technology are making Java run faster all the time. IBM is one of many organizations exploring the technology. Check the IBM Tokyo research lab site at http://www.trl.ibm.co.jp for project information.
Before the JVM can start this interpretation process, it has to do a number of things to set up the environment in which the program will run. This is the point at which the built-in security of Java is implemented. There are three parts to the process:
So how do these classes get loaded? When the browser finds an <applet> tag in a page, it starts the Java virtual machine which, in turn, invokes the applet class loader. This is, itself, a Java class which contains the code for fetching the bytecode of the applet and presenting it to the JVM in an executable form. The bytecode includes a list of referenced classes and the JVM works through the list, checks to see if the class is already loaded and attempts to load it if not. It first tries to load from the local disk, using a platform-specific function provided by the browser. In our example, this is the way that all of the core Java classes are loaded. If the class name is not found on the local disk, the JVM again calls the class loader to retrieve the class from the Web server, as in the case of the JamJar.examples.Button class (above).
The class loader is just another Java class, albeit one with a very specific function. An application can declare any number of class loaders, each of which could be targeted at specific class types. The same is not true of an applet. The security manager prevents an applet from creating its own class loader. Clearly, if an applet can somehow circumvent this limitation it can subvert the class loading process and potentially take over the whole browser machine.
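To make this concrete, here is a minimal, hypothetical sketch of an application class loader; a real loader would add its own security checks and a genuine mechanism for fetching bytecode:

public class SimpleClassLoader extends ClassLoader {
    protected Class loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class c = findLoadedClass(name);           // have we loaded it already?
        if (c == null) {
            try {
                c = findSystemClass(name);         // try the local, trusted classes first
            } catch (ClassNotFoundException e) {
                byte[] b = loadClassBytes(name);   // otherwise fetch the bytecode ourselves
                c = defineClass(name, b, 0, b.length);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    // Placeholder: a real loader would read name.class from disk or the network
    private byte[] loadClassBytes(String name) throws ClassNotFoundException {
        throw new ClassNotFoundException(name);
    }
}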
The JVM keeps track of which class loader was responsible for loading any particular class. It also keeps classes loaded by different applets separate from each other.
At first sight, the job of the class file verifier may appear to be redundant. After all, bytecode is only generated by the Java compiler, so if it is not correctly formatted and valid, surely the compiler needs to be fixed, rather than having to go through the overhead of checking each time a program is run?
Unfortunately, life is not that simple. The compiled program is just a file of type ".class" containing a string of bytes, so it could be created or modified using any binary editor. Given this fact, the Java virtual machine has to treat any code from an external source as potentially damaged and therefore in need of verification.
In fact, Java divides the world into two parts, Trusted and Untrusted. Trusted code includes the "local" Java classes which are shipped as part of the JVM and sometimes other classes on the local disk (detailed implementation varies between vendors). Everything else is untrusted and therefore must be checked by the class file verifier. As we have seen, these are also the classes that the applet class loader is responsible for fetching. Where the Class File Verifier Fits illustrates this relationship.
We will look in detail at the things that the class file verifier checks in The Class File Verifier.
You can see that, for an applet, the class loader and the class file verifier need to operate as a team, if they are to succeed in their task of making sure that only valid, safe code is executed.
From a security point of view the accuracy of the job done by the class file verifier is critical. There are a large number of possible bytecode programs, and the class file verifier has the job of determining the subset of them that are safe to run, by testing against a set of rules. There is a further subset of these verifiable programs: those that are the result of compiling a legal Java program. Decisions the Class File Verifier Has to Make illustrates this. The rules in the class file verifier should aim to make the verifiable set as near as possible to the set of Java programs. This limits the scope for an attacker to create bytecode that subverts the safety features in Java and the protection of the security manager.
The third component involved in loading and running a Java program is the security manager. This is similar to the class loader in that it is a Java class (java.lang.SecurityManager) that any application can extend for its own purpose.
The SecurityManager class provides a number of check methods associated with specific actions that may be risky. For example, there is a checkRead method which receives a file reference as an argument. If you want your security manager to prevent the program from reading that particular file, you code checkRead to throw a security exception.
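For example, a minimal sketch of such a security manager (the file name is purely illustrative):

public class MySecurityManager extends SecurityManager {
    // Veto any attempt to read one particular file; other checks are left untouched
    public void checkRead(String file) {
        if ("secrets.txt".equals(file)) {
            throw new SecurityException("read access to " + file + " refused");
        }
        // falling through without throwing means the read is allowed
    }
}

An application would install it by calling System.setSecurityManager(new MySecurityManager()); from then on, any attempt by the program to read secrets.txt results in a SecurityException.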
Although any application can supply its own SecurityManager subclass, it is most commonly found when executing an applet, that is, within a Web browser. The security manager built into your browser is wholly responsible for enforcing the sandbox restrictions: the set of rules that control what things an applet is allowed to do on your browser machine. Any flaw in the coding of the security manager, or any failure by the core classes to invoke it, could compromise the ability to run untrusted code securely.
The main objectives of the sandbox are to:
This last objective is the key to all of the others. This is because the security manager is, itself, a built-in class so if an attacker can corrupt or bypass it, all control is lost.
The Security Manager is part of the local browser code, so the implementation of the sandbox restrictions is the responsibility of each browser vendor. However, they all have the same objectives, so the result is a set of restrictions that is common across most vendors' implementations:
We will look at the sandbox restrictions in more detail in What the Security Manager Does.
We have discussed two parts of the world of Java, the development environment and the execution environment. The third part is where the world of Java meets the rest of the world, that is, the capabilities it provides for extending Java function and integrating with applications of other types. The simplest example is the way that a Java applet is created and integrated into a Web page by writing the program as a subclass of the Applet class and then specifying the class name in an <applet> HTML tag. In return, Java provides classes such as URL and a number of methods for accessing a Web server.
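As a small, hedged illustration of this kind of integration, the following applet (the class name and the file it fetches are our own inventions) uses the URL class to read a line of text from the Web server it was loaded from:

import java.applet.Applet;
import java.awt.Graphics;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class GreetingApplet extends Applet {
    private String greeting = "";

    public void init() {
        try {
            // Build a URL relative to the page the applet came from
            URL source = new URL(getDocumentBase(), "greeting.txt");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(source.openStream()));
            greeting = in.readLine();
            in.close();
        } catch (Exception e) {
            greeting = "could not read greeting: " + e;
        }
    }

    public void paint(Graphics g) {
        g.drawString(greeting, 20, 20);
    }
}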
Another simple way to extend Java is by the use of native methods. These are sections of code written in some other, less exciting, language, which provide access to native system interfaces. For example, imagine an organization with a helpdesk application which provides a C API for creating new problem records. You may well want to use this so that your new Java application can perform self-diagnosis and automatically report any faults it finds. One way to do so is to create a native method to translate between Java and the helpdesk application's API (a sketch of the mechanics appears below). This provides simple extensibility, but at the cost of portability and flexibility, because:
The Java purist will deprecate this kind of application. In fact, although the quest for 100% Pure Java sounds like an academic exercise, there are a number of real-world advantages to only using well-defined, architected interfaces, not the least of which is that the security aspects have (presumably) already been considered.
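To show the mechanics of the native method approach described above, here is a hedged sketch; the class, method and library names are hypothetical, and the C code that wraps the helpdesk API is not shown:

public class HelpdeskReporter {
    // Implemented in C, in a platform-specific library that wraps the
    // helpdesk product's API; returns a problem record number
    public native int createProblemRecord(String description);

    static {
        // Load the native library (helpdesk.dll, libhelpdesk.so, ...) when the class is first used
        System.loadLibrary("helpdesk");
    }
}

The native library has to be built separately for every platform on which the application runs, which is exactly the portability cost referred to above.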
As projects using Java have matured from being interesting exercises in technology into mission-critical applications, so the need has arisen for more complex interactions with the outside world. The Java applet gives a very effective way to deliver client function without having to install and maintain code on every client. However, the application you create this way still needs access to data and function contained in existing "legacy" systems. 2 With JDK 1.1, JavaSoft has introduced a number of new interfaces and architectures for this kind of integration. The objective is to enable applications to be written in 100% Pure Java, while still delivering the links to the outside world that real requirements demand.
Some of the more notable interfaces of this kind are:
The interfaces that we have briefly described above illustrate a big issue in Java. The applet environment, fenced in as it is by the sandbox restrictions, is a relatively safe platform (only "relatively" safe, because it relies on software controls that have been found to contain bugs and because it provides limited protection from nuisances such as denial of service attacks). However, in the real world we need to extend the security model to allow more powerful applications and interfaces.
The security model needs to answer questions such as the following:
The answers to questions of this kind lie in cryptography, and JDK 1.1 introduces the Java Cryptography Architecture (JCA) to define the way that cryptographic tools are made available to Java code.
The derivation of the word "cryptography" is from Greek and means literally "secret writing." Modern cryptography is still concerned with keeping data secret, but the ability to authenticate a user (and hence apply some kind of access control) is even more important.
Although there are many cryptographic techniques and protocols, they mostly fall into one of three categories:
Bulk encryption
This is the modern equivalent of "secret writing." A bulk encryption algorithm uses a key to scramble (encrypt) data for transmission or storage. It can then only be unscrambled (or decrypted) using the same key. Bulk encryption is so called because it is effective for securing large chunks of data. Some common algorithms are DES, IDEA and RC4.
Public key encryption
This is also a technique for securing data but instead of using a single key for encryption and decryption, it uses two related keys, known as a key pair. If data is encrypted using one of the keys it can only be decrypted using the other, and vice versa. Compared to bulk encryption, public key encryption is computationally expensive and is therefore not suited to large amounts of data. The most commonly-used algorithm for public key encryption is the RSA system.
Hashing
A secure hash is an algorithm that takes a stream of data and creates a fixed-length digest of it. This digest is, in effect, a unique "fingerprint" for the data. Hashing functions are often found in the context of digital signatures. This is a method for authenticating the source of a message, formed by encrypting a hash of the source data. Public key encryption is used to create the signature, so it effectively ties the signed data to the owner of the key pair that created the signature.
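As a small illustration of the hashing category, JCA's MessageDigest engine class (discussed further below) computes such a digest; the algorithm name SHA is just one example:

import java.security.MessageDigest;

public class DigestExample {
    public static void main(String[] args) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA");
        byte[] fingerprint = md.digest("The data to be fingerprinted".getBytes());
        // Any change to the input produces a completely different digest
        System.out.println("Digest is " + fingerprint.length + " bytes long");
    }
}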
We describe the process of creating a digital signature in The Security Classes in Practice.
JCA is described as a provider architecture. It is designed to allow different vendors to provide their own implementation of the cryptographic tools and other administrative functions. This makes a very flexible framework that will cater for future requirements and allow vendor independence.
The architecture defines a series of classes, called engine classes, that are representations of general cryptographic functions. So, for example, there are several different standards for digital signatures, which differ in the details of their implementation but which, at a high level, are very similar. A single engine class (java.security.Signature) represents all of the variations. The actual implementation of the different signature algorithms is done by a provider class, which may be offered by any of a number of vendors.
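A sketch of how the Signature engine class is used (DSA is the signature algorithm supplied with the default provider; the other details here are illustrative):

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignatureExample {
    public static void main(String[] args) throws Exception {
        // Generate a DSA key pair
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");
        kpg.initialize(1024);
        KeyPair pair = kpg.generateKeyPair();

        byte[] data = "Message to be signed".getBytes();

        // Sign with the private key...
        Signature signer = Signature.getInstance("DSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] signature = signer.sign();

        // ...and verify with the matching public key
        Signature verifier = Signature.getInstance("DSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(data);
        System.out.println("Signature valid: " + verifier.verify(signature));
    }
}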
The provider architecture has the virtue of offering a standard interface to the programmer who wants to use a cryptographic function, while at the same time having the flexibility to handle different underlying standards and protocols. The providers may be added either statically or dynamically. Sun, the default provider, provides:
Support for the management of keys and access control lists was not in the initial release of JDK 1.1, but will be provided later.
We discuss the JCA in more detail in Introducing JCA: the Provider Concept.
Unfortunately, only a subset of the cryptographic possibilities is implemented in JDK 1.1. It includes all of the engine classes needed for digital signatures, plus a provider package, but nothing for bulk or public key encryption. The reason for this is the restrictions placed by the US government on the export of cryptographic technology.
The National Security Agency (NSA) is responsible for monitoring communications between the US and the rest of the world, aiming to intercept such things as the messages of unfriendly governments and organized crime. Clearly, it is not a good thing for such people to have access to unbreakable encryption, so the US Government sets limits on the strength of cipher that a US company can export for commercial purposes. 3 This applies to any software that can be used for "general purpose" encryption. So, the SUN provider package that comes with JDK 1.1 can include the full-strength RSA public key algorithm, but it can only be used as part of a digital signature process and not for general encryption.
Finally, in 1996, the US government relaxed the export rules. The promise is that any strength of encryption may be exported, so long as it provides a technique for key recovery, that is, a way for the NSA to retrieve the encryption key if they need to break the code.
The JavaSoft response to the current restrictions was to define two related packages for cryptography in Java. JCA is the exportable part, which contains the tools for signatures and is partially implemented in JDK 1.1. The not-for-export part is the Java Cryptography Extensions (JCE), which include the general purpose encryption capabilities. These consist of engine classes for bulk and public key encryption, plus an extension to the Sun provider package that offers the DES bulk encryption algorithm.
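For comparison, a sketch of the sort of code the JCE engine classes support, written against the javax.crypto interfaces; treat the exact class and method names as illustrative, since the API has evolved across JCE releases:

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class BulkEncryptionExample {
    public static void main(String[] args) throws Exception {
        // Generate a DES key and use it to encrypt some data
        KeyGenerator kg = KeyGenerator.getInstance("DES");
        SecretKey key = kg.generateKey();

        Cipher cipher = Cipher.getInstance("DES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal("Attack at dawn".getBytes());

        // Only the same key will recover the plaintext
        cipher.init(Cipher.DECRYPT_MODE, key);
        System.out.println(new String(cipher.doFinal(ciphertext)));
    }
}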
The eventual aim is to develop a full strength, exportable cryptographic toolkit with key recovery built into it.
Using JCA, it is possible for a Java application or applet to create its own digital signatures. This allows you to write more sophisticated programs, but a more common scenario is where you want an applet to do something that the sandbox restrictions normally forbid. In this case, the browser user needs to be convinced that the applet is from a trustworthy source. The way this is achieved is by digitally signing the applet.
The signature on an applet links the code to the programmer or administrator who created or packaged it. However, the user has to be able to check that the signature is valid. The signer enables this by providing a public key certificate. We discuss this in detail in Java Gets Out of Its Box.
When you receive an applet that has been digitally signed you know where it came from and you can make a judgement about whether or not it is trustworthy. Next, you want to exercise some access control. For example, an applet may want to use your hard disk to store some configuration information. You probably have no objection to it doing so, but that does not mean that you are happy for it to overwrite any file on the system. This is the difference between a binary trust model ("I trust you, do what you like") and a fine-grained trust model ("tell me what you want to do and I'll decide whether I trust you").
Other types of executable content, such as browser plug-ins and ActiveX, currently use the binary model. By contrast, signed Java operates on top of the very tight sandbox restrictions. This means that fine-grained controls can be implemented. At the time of writing, the standards for controlling access were still evolving. We discuss the different approaches in JAR Files and Applet Signing.
In the early days of most software developments, security is a long way down the list of priorities. This makes Java very unusual, in that security has been an important consideration from the very beginning. No doubt this is because the environment to which the infant language has been exposed in its formative years is a cruel and unforgiving one: the Internet. In this section we take a cracker's-eye view: what opportunities do we have to abuse a Java applet, to make it do our dastardly deeds for us?
The Java applet that runs in your Web browser has had an unusually long and interesting life history. Along the way it has passed through a number of phases, each of which is in some way vulnerable to attack. Perils in the Life of an Applet illustrates the points of peril in the life of an applet.
Let us look at the points of vulnerability in some detail:
Normally you would expect the bytecode generated by a compiler to reflect the source code you feed in. However, a compiler can easily be hacked so that it adds its own, nefarious, functions. Even worse, a compiler could produce bytecode output that cannot be a translation of normal Java source code. This would be a way to introduce code to exploit some frailty of the Java code verification process, for example.
Although a hacked compiler is the most obvious example of a compromised programming tool, the same concern also applies to other parts of the programmer's arsenal, such as editors, development toolkits and pre-built components.
Any applets you import in this way should be treated with suspicion. This raises a moral question: how responsible should you feel if your Web site somehow damages a client connecting to it, even if you are not ultimately responsible for the content that caused the damage? Most reasonable people will agree that there is a duty of care which should be balanced against your desire to build the world's most dynamic and attractive Web site. Indeed it would be a good idea to check whether your agreements with others mean that you have a formal legal duty of care. You do not want a thoughtlessly-included applet to result in your being sued.
Spoofing is not just a problem for Java applets, of course. Any Web content can be attacked in this way. With Java this gives the attacker an opportunity to execute a malicious applet or try to exploit security holes in the browser environment. However, the risk is small compared to that of downloading and installing a conventional program in this kind of environment.
Signed applets can again help with this problem. An attacker may be able to substitute subversive class files to attack the browser, but it is much more difficult to forge the class signature.
Of these two, the first is more likely. There have been a number of well-publicized security breaches found in the Java virtual machine components. The best description of how these operate can be found in Java Security: Hostile Applets, Holes, and Antidotes, by Felten and McGraw. You can also find more up-to-date online information at the sites listed in Sources of Information about Java Security. The best way to protect yourself is to make sure you are aware of the latest breaches and install the fixes as they arrive.
The possibility of installing a browser that has been tampered with is a real one, although there are considerable practical hurdles for an attacker to overcome in creating such a thing. If you do as we recommend (above) and install the latest fixes, you will inevitably be running a downloaded version of the browser. There is some small risk that this could be a hacked version, but no examples of this have yet been detected.
A Java applet is an obvious vehicle to mount an attack, because it can install itself uninvited and probe for weaknesses. And, of course, this is why so much thought has gone into the construction of the sandbox and the JDK 1.1 code-signing capabilities.
A Java application, on the other hand, is a much less obvious target. There are many ways in which such an application could be implemented, for example:
To a cracker, the fact that the application is written in Java rather than any other language is not really important. The strategies that he or she would use to search for vulnerabilities are the same. For example:
As we said earlier, vulnerabilities of this kind apply to applications written in Java just as they do to applications written in any other programming environment. However, Java does include safety features that make it harder for an attacker to find a flaw. These safety features work at two levels:
Java source
The Java language uses strong type constraints to prevent the programmer from accessing data in an inconsistent way. You can cast objects from one type to another, but only if the two classes are related by inheritance; that is, one must be a superclass of the other. This does not operate symmetrically, which means you can always cast from a subclass to its superclass, but not always vice versa. Referring again to Classes Loaded for the pointlessButton Applet, you could access an instance of the Button class as an Object, but you could not access a Button as a Panel.
Furthermore, Java prevents you from having direct access to program memory. In C it is common to use a pointer to locate a variable in memory and then to manipulate the pointer to process the data in it. This is a frequent source of coding errors, due to the pointer becoming corrupted or iterating beyond the end of the data. Java allows a variable to be accessed only through the methods of the object that contains it, thereby removing this class of error.
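A small sketch of these casting rules, using standard AWT classes:

import java.awt.Button;
import java.awt.Panel;

public class CastDemo {
    public static void main(String[] args) {
        Button b = new Button("demo");
        Object o = b;                // always legal: a Button is an Object
        Button back = (Button) o;    // downcast allowed, and checked at run time
        // Panel p = (Panel) b;      // rejected by the compiler: a Button is not a Panel
    }
}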
Bytecode
The Java virtual machine is type-aware. In other words, most of the primitive machine instructions are associated with a particular type. This means that the JVM also applies the type constraints that the compiler imposes on the Java source. In fact, this job is split between the class file verifier, which handles everything that can be statically checked, and the JVM, which deals with runtime exceptions. Contrast this with other languages, in which the compiler produces microprocessor machine code. In this case the program is just handled as a sequence of bytes, with no concept of the data types that are being represented.
The JVM is also, at a basic level, strongly compartmentalized, mirroring the object orientation of the Java source. This means that each method in the code has its own execution stack and only has access to the memory of the class instance for which it was invoked.
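You can see this type information by disassembling a compiled class with the javap tool. For a trivial method that adds two int values:

public class Adder {
    int add(int a, int b) {
        return a + b;
    }
}

Running javap -c Adder shows the body of add as typed instructions, roughly iload_1, iload_2, iadd, ireturn. Each of these works only on int values; the class file verifier rejects bytecode that, say, tries to load one of those slots as an object reference.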
This first part of the book has been a tour through the many aspects of Java security. You should now have a good high-level understanding of the issues involved and the mechanisms that are at work. In the next section we look under the covers, at the detailed operation of the Java virtual machine and the security classes.
1. In fact we are guilty of using an improper name construction here. If your package will be used together with packages from other sources, you should follow the naming standard laid down in the Java Language Specification, by Gosling, Joy and Steele. In our case this would lead to a package name something like com.ibm.JamJar.examples. If you want to know more about the Java language specification, refer to http://java.sun.com/docs/books/jls/.
2. "Legacy" seems to be the current word-of-the-month to describe any computer system that does not fit the brave new architecture under discussion. It is an unfortunate choice, in that it implies a system that is outdated or inadequate. You may have a state-of-the-art relational database that is critical to the running of your business, but to the Web-based application that depends on the data it contains, it is still a "legacy system".
3. Cipher strength is controlled by the size of the key used in the encryption algorithm. Current export rules limit the key size for bulk encryption to 40 bits, which can now be cracked in a matter of hours with quite modest computing facilities. Each extra bit doubles the key space, so a key size of 64 bits is 16 million times tougher than 40 bits. A similar rule applies to public key encryption, where an export-quality 512-bit modulus is inadequate, but a 1024-bit modulus is expected to remain effective for the next ten years, at least for commercial use.