Many claims have been made for the security of Java. A lot of these claims have been rather exaggerated, but underlying them is the fact that security was designed-in at an early stage in the development of the language. Saying that Java has strong security is like challenging the world to find the holes in it, which is exactly what has happened. Some very clever (and very devious) people have been applying their brain-power to the problem of breaking down the Java defenses.
In this chapter we give a high-level view of how Java defends itself and then summarize the different ways in which it can be attacked.
Most of the books on the subject deal with Java as a programming language. As a programming language it has much to recommend it. Its syntax is very like C, but with many of the features that hurt your brain removed. It is strongly object-oriented, but it avoids the more obscure corners of the O-O world.
For most programming languages the question of "how secure is it" does not arise. It's the application that needs to implement security, not the language it is written in. However, Java is many other things in addition to being a programming language:
It is not surprising that Java has become so widely accepted, so quickly. Before we look at the security issues, let us review some Java fundamentals.
There are a number of different components to Java:
Once you have installed the JDK, you can start creating Java source code and compiling it. Java is like any other high-level programming language, in that you write the source code in an English-like form. The source code then has to be converted into a form that the machine can understand before it can be executed. To perform this conversion for a normal language, the code is usually either compiled (converted once and stored as machine code) or interpreted (converted and executed at runtime).
Java combines these two approaches. The source code has to be compiled with a Java compiler, such as javac, before it can be used. This is a conventional compilation, but the output it produces is not machine-specific code; instead it is bytecode, a system-independent format. We will take a closer look at how bytecode is constructed in Java Bytecode.
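For example, consider a trivial class (the class name here is purely illustrative):

// HelloWeb.java -- a minimal class, used only to illustrate the compile step
public class HelloWeb {
    public static void main(String[] args) {
        System.out.println("Hello from bytecode");
    }
}

Compiling it with javac HelloWeb.java produces HelloWeb.class, a file containing bytecode rather than machine code for any particular processor; the same .class file can be carried to any system that has a JVM.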
In order to execute, the compiled code has to be processed by an interpreter, which is part of the Java execution environment (known as the Java virtual machine, or JVM). The JVM is a runtime platform, providing a number of built-in system services, such as thread support, memory management and I/O, in addition to the interpreter.
Java is an object-oriented language, meaning that a program is composed of a number of object classes, each containing data and methods. One result of this is that, although a program may consist of just a single class, when you have compiled it into bytecode only a small proportion of the code that gets executed is likely to be in the resulting .class file. The rest of the function will be in other classes that the main program references. The JVM uses dynamic linking to load these classes as they are needed. As an example, consider the simple applet contained in the following two Java source files:
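(The listings are not reproduced in full in this extract; the sketches below show what the two files might look like, based on the description that follows. The button label, the colors and the event-handling details are our own assumptions.)

// pointlessButton.java -- illustrative reconstruction of the first listing
import java.applet.Applet;
import JamJar.examples.Button;

public class pointlessButton extends Applet {
    public void init() {
        // Add our own Button class, not java.awt.Button
        add(new Button("Press me"));
    }
}

// Button.java, in the JamJar.examples package -- illustrative reconstruction
package JamJar.examples;

import java.awt.Color;
import java.awt.event.MouseAdapter;
import java.awt.event.MouseEvent;

public class Button extends java.awt.Button {
    public Button(String label) {
        super(label);
        // Behave like a normal button, but change color under the mouse pointer
        addMouseListener(new MouseAdapter() {
            public void mouseEntered(MouseEvent e) {
                setBackground(Color.yellow);
            }
            public void mouseExited(MouseEvent e) {
                setBackground(Color.lightGray);
            }
        });
    }
}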
The first listing, pointlessButton.java, is an applet that places a button on the Web page. It is not a very useful button, but we like it. Instead of using the standard AWT Button class it uses a class of our own, also called Button (see the second listing), but in a locally-written package. This works like a normal button, except that it changes color when you move the mouse pointer over it. Running the pointlessButton Applet shows two copies of the applet running in a Web page.
The total size of the bytecode for this example is only 2 KB. However, the two classes cause a lot of other code to be dynamically loaded, either as a result of inheritance (defined by the extends keyword in the class definition) or by instantiation (when a class creates an instance of another class with the new keyword). Classes Loaded for the pointlessButton Applet shows the hierarchy of classes that could potentially be loaded to run our simple applet (this is a simplified view, because it does not consider classes that may be invoked by classes above the lowest level of the hierarchy).
This diagram illustrates a number of things about Java classes:
Java is unusual in the breadth of function that its built-in class frameworks provide; however, for a project of any complexity you are likely to employ graphical tools, such as a visual application builder (VAB), to link together predefined components, thereby reducing the code you have to write to the core logic of the application. Examples of VABs include IBM VisualAge for Java and Lotus Development's BeanMachine.
A component in this context is a package of Java classes that perform a given function. The JavaBeans definition describes a standard for components, known as Beans. Basically, a Bean is a package of code containing both development and runtime components that:
From this list you can infer that, although a Bean is mostly made up of Java classes, it can also include other files containing persistent information and other resources, such as graphical elements. These elements are all packed (or pickled) together in a JAR (Java Archive) file.
From a security viewpoint, VABs and Beans do not affect the underlying strengths and weaknesses of Java. However, they may add more uncertainty, in that your application now includes sizeable chunks of code that you did not directly write. Their ability to provide interfaces to other component architectures may also cause problems, as we discuss in Interfaces and Architectures.
We have said that the Java virtual machine operates on the stream of bytecode as an interpreter. This means that it processes bytecode while the program is running and converts it to "real" machine code that it executes on the fly. You can think of a computer program as being like a railroad track, with the train representing the execution point at any given time. In the case of an interpreted program it is as if this train has a machine mounted on it, which builds the track immediately in front of the train and tears it up behind. It's no way to run a railroad.
Fortunately, in the case of Java, the virtual machine is not interpreting high-level language instructions, but bytecode. This is really machine code, written for the JVM instruction set, so the interpreter has much less analysis to do, resulting in execution times that are very fast. The JVM often uses "Just in Time" (JIT) compiler techniques to allow programs to execute faster, for example, by translating bytecode into optimized local code once and subsequently running it directly. Advances in JIT technology are making Java run faster all the time. IBM is one of many organizations exploring the technology. Check the IBM Tokyo research lab site at http://www.trl.ibm.co.jp for project information.
Before the JVM can start this interpretation process, it has to do a number of things to set up the environment in which the program will run. This is the point at which the built-in security of Java is implemented. There are three parts to the process:
So how do these classes get loaded? When the browser finds an <applet> tag in a page, it starts the Java virtual machine which, in turn, invokes the applet class loader. This is, itself, a Java class which contains the code for fetching the bytecode of the applet and presenting it to the JVM in an executable form. The bytecode includes a list of referenced classes and the JVM works through the list, checks to see if the class is already loaded and attempts to load it if not. It first tries to load from the local disk, using a platform-specific function provided by the browser. In our example, this is the way that all of the core Java classes are loaded. If the class name is not found on the local disk, the JVM again calls the class loader to retrieve the class from the Web server, as in the case of the JamJar.examples.Button class (above).
The class loader is just another Java class, albeit one with a very specific function. An application can declare any number of class loaders, each of which could be targeted at specific class types. The same is not true of an applet. The security manager prevents an applet from creating its own class loader. Clearly, if an applet can somehow circumvent this limitation it can subvert the class loading process and potentially take over the whole browser machine.
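To make this concrete, here is a minimal, hypothetical sketch of an application class loader; a real loader would add its own security checks and a genuine mechanism for fetching bytecode:

public class SimpleClassLoader extends ClassLoader {
    protected Class loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        Class c = findLoadedClass(name);           // have we loaded it already?
        if (c == null) {
            try {
                c = findSystemClass(name);         // try the local, trusted classes first
            } catch (ClassNotFoundException e) {
                byte[] b = loadClassBytes(name);   // otherwise fetch the bytecode ourselves
                c = defineClass(name, b, 0, b.length);
            }
        }
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    // Placeholder: a real loader would read name.class from disk or the network
    private byte[] loadClassBytes(String name) throws ClassNotFoundException {
        throw new ClassNotFoundException(name);
    }
}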
The JVM keeps track of which class loader was responsible for loading any particular class. It also keeps classes loaded by different applets separate from each other.
At first sight, the job of the class file verifier may appear to be redundant. After all, bytecode is only generated by the Java compiler, so if it is not correctly formatted and valid, surely the compiler needs to be fixed, rather than having to go through the overhead of checking each time a program is run?
Unfortunately, life is not that simple. The compiled program is just a file of type ".class" containing a string of bytes, so it could be created or modified using any binary editor. Given this fact, the Java virtual machine has to treat any code from an external source as potentially damaged and therefore in need of verification.
In fact, Java divides the world into two parts, Trusted and Untrusted. Trusted code includes the "local" Java classes which are shipped as part of the JVM and sometimes other classes on the local disk (detailed implementation varies between vendors). Everything else is untrusted and therefore must be checked by the class file verifier. As we have seen, these are also the classes that the applet class loader is responsible for fetching. Where the Class File Verifier Fits illustrates this relationship.
We will look in detail at the things that the class file verifier checks in The Class File Verifier.
You can see that, for an applet, the class loader and the class file verifier need to operate as a team, if they are to succeed in their task of making sure that only valid, safe code is executed.
From a security point of view the accuracy of the job done by the class file verifier is critical. There are a large number of possible bytecode programs, and the class file verifier has the job of determining the subset of them that are safe to run, by testing against a set of rules. There is a further subset of these verifiable programs: those that are the result of compiling a legal Java program. Decisions the Class File Verifier Has to Make illustrates this. The rules in the class file verifier should aim to make the verifiable set as near as possible to the set of Java programs. This limits the scope for an attacker to create bytecode that subverts the safety features in Java and the protection of the security manager.
The third component involved in loading and running a Java program is the security manager. This is similar to the class loader in that it is a Java class (java.lang.SecurityManager) that any application can extend for its own purpose.
The SecurityManager class provides a number of check methods associated with specific actions that may be risky. For example, there is a checkRead method which receives a file reference as an argument. If you want your security manager to prevent the program from reading that particular file, you code checkRead to throw a security exception.
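For example, a minimal sketch of such a security manager (the file name is purely illustrative):

public class MySecurityManager extends SecurityManager {
    // Veto any attempt to read one particular file; other checks are left untouched
    public void checkRead(String file) {
        if ("secrets.txt".equals(file)) {
            throw new SecurityException("read access to " + file + " refused");
        }
        // falling through without throwing means the read is allowed
    }
}

An application would install it by calling System.setSecurityManager(new MySecurityManager()); from then on, any attempt by the program to read secrets.txt results in a SecurityException.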
Although any application can supply its own SecurityManager subclass, it is most commonly found when executing an applet, that is, within a Web browser. The security manager built into your browser is wholly responsible for enforcing the sandbox restrictions: the set of rules that control what things an applet is allowed to do on your browser machine. Any flaw in the coding of the security manager, or any failure by the core classes to invoke it, could compromise the ability to run untrusted code securely.
The main objectives of the sandbox are to:
This last objective is the key to all of the others. This is because the security manager is, itself, a built-in class so if an attacker can corrupt or bypass it, all control is lost.
The Security Manager is part of the local browser code, so the implementation of the sandbox restrictions is the responsibility of each browser vendor. However, they all have the same objectives, so the result is a set of restrictions that is common across most vendors' implementations:
We will look at the sandbox restrictions in more detail in What the Security Manager Does.
We have discussed two parts of the world of Java, the development environment and the execution environment. The third part is where the world of Java meets the rest of the world, that is, the capabilities it provides for extending Java function and integrating with applications of other types. The simplest example is the way that a Java applet is created and integrated into a Web page by writing the program as a subclass of the Applet class and then specifying the class name in an <applet> HTML tag. In return, Java provides classes such as URL and a number of methods for accessing a Web server.
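As a small, hedged illustration of this kind of integration, the following applet (the class name and the file it fetches are our own inventions) uses the URL class to read a line of text from the Web server it was loaded from:

import java.applet.Applet;
import java.awt.Graphics;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class GreetingApplet extends Applet {
    private String greeting = "";

    public void init() {
        try {
            // Build a URL relative to the page the applet came from
            URL source = new URL(getDocumentBase(), "greeting.txt");
            BufferedReader in = new BufferedReader(
                    new InputStreamReader(source.openStream()));
            greeting = in.readLine();
            in.close();
        } catch (Exception e) {
            greeting = "could not read greeting: " + e;
        }
    }

    public void paint(Graphics g) {
        g.drawString(greeting, 20, 20);
    }
}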
Another simple way to extend Java is by the use of native methods. These are sections of code written in some other, less exciting, language, which provide access to native system interfaces. For example, imagine an organization with a helpdesk application which provides a C API for creating new problem records. You may well want to use this so that your new Java application can perform self-diagnosis and automatically report any faults it finds. One way to do so is to create a native method to translate between Java and the helpdesk application's API (a sketch of the mechanics appears below). This provides simple extensibility, but at the cost of portability and flexibility, because:
The Java purist will deprecate this kind of application. In fact, although the quest for 100% Pure Java sounds like an academic exercise, there are a number of real-world advantages to only using well-defined, architected interfaces, not the least of which is that the security aspects have (presumably) already been considered.
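To show the mechanics of the native method approach described above, here is a hedged sketch; the class, method and library names are hypothetical, and the C code that wraps the helpdesk API is not shown:

public class HelpdeskReporter {
    // Implemented in C, in a platform-specific library that wraps the
    // helpdesk product's API; returns a problem record number
    public native int createProblemRecord(String description);

    static {
        // Load the native library (helpdesk.dll, libhelpdesk.so, ...) when the class is first used
        System.loadLibrary("helpdesk");
    }
}

The native library has to be built separately for every platform on which the application runs, which is exactly the portability cost referred to above.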
As projects using Java have matured from being interesting exercises in technology into mission-critical applications, so the need has arisen for more complex interactions with the outside world. The Java applet gives a very effective way to deliver client function without having to install and maintain code on every client. However, the application you create this way still needs access to data and function contained in existing "legacy" systems. 2 With JDK 1.1, JavaSoft has introduced a number of new interfaces and architectures for this kind of integration. The objective is to enable applications to be written in 100% Pure Java, while still delivering the links to the outside world that real requirements demand.
Some of the more notable interfaces of this kind are:
The interfaces that we have briefly described above illustrate a big issue in Java. The applet environment, fenced in as it is by the sandbox restrictions, is a relatively safe platform (only "relatively" safe, because it relies on software controls that have been found to contain bugs and because it provides limited protection from nuisances such as denial of service attacks). However, in the real world we need to extend the security model to allow more powerful applications and interfaces.
The security model needs to answer questions such as the following:
The answers to questions of this kind lie in cryptography, and JDK 1.1 introduces the Java Cryptography Architecture (JCA) to define the way that cryptographic tools are made available to Java code.
The derivation of the word "cryptography" is from Greek and means literally "secret writing." Modern cryptography is still concerned with keeping data secret, but the ability to authenticate a user (and hence apply some kind of access control) is even more important.
Although there are many cryptographic techniques and protocols, they mostly fall into one of three categories:
Bulk encryption
This is the modern equivalent of "secret writing." A bulk encryption algorithm uses a key to scramble (encrypt) data for transmission or storage. It can then only be unscrambled (or decrypted) using the same key. Bulk encryption is so called because it is effective for securing large chunks of data. Some common algorithms are DES, IDEA and RC4.
Public key encryption
This is also a technique for securing data but instead of using a single key for encryption and decryption, it uses two related keys, known as a key pair. If data is encrypted using one of the keys it can only be decrypted using the other, and vice versa. Compared to bulk encryption, public key encryption is computationally expensive and is therefore not suited to large amounts of data. The most commonly-used algorithm for public key encryption is the RSA system.
Hashing
A secure hash is an algorithm that takes a stream of data and creates a fixed-length digest of it. This digest is, in effect, a unique "fingerprint" for the data. Hashing functions are often found in the context of digital signatures. This is a method for authenticating the source of a message, formed by encrypting a hash of the source data. Public key encryption is used to create the signature, so it effectively ties the signed data to the owner of the key pair that created the signature.
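As a small illustration of the hashing category, JCA's MessageDigest engine class (discussed further below) computes such a digest; the algorithm name SHA is just one example:

import java.security.MessageDigest;

public class DigestExample {
    public static void main(String[] args) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA");
        byte[] fingerprint = md.digest("The data to be fingerprinted".getBytes());
        // Any change to the input produces a completely different digest
        System.out.println("Digest is " + fingerprint.length + " bytes long");
    }
}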
We describe the process of creating a digital signature in The Security Classes in Practice.
JCA is described as a provider architecture. It is designed to allow different vendors to provide their own implementation of the cryptographic tools and other administrative functions. This makes a very flexible framework that will cater for future requirements and allow vendor independence.
The architecture defines a series of classes, called engine classes, that are representations of general cryptographic functions. So, for example, there are several different standards for digital signatures, which differ in the details of their implementation but which, at a high level, are very similar. A single engine class (java.security.Signature) represents all of the variations. The actual implementation of the different signature algorithms is done by a provider class, which may be offered by any of a number of vendors.
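A sketch of how the Signature engine class is used (DSA is the signature algorithm supplied with the default provider; the other details here are illustrative):

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.Signature;

public class SignatureExample {
    public static void main(String[] args) throws Exception {
        // Generate a DSA key pair
        KeyPairGenerator kpg = KeyPairGenerator.getInstance("DSA");
        kpg.initialize(1024);
        KeyPair pair = kpg.generateKeyPair();

        byte[] data = "Message to be signed".getBytes();

        // Sign with the private key...
        Signature signer = Signature.getInstance("DSA");
        signer.initSign(pair.getPrivate());
        signer.update(data);
        byte[] signature = signer.sign();

        // ...and verify with the matching public key
        Signature verifier = Signature.getInstance("DSA");
        verifier.initVerify(pair.getPublic());
        verifier.update(data);
        System.out.println("Signature valid: " + verifier.verify(signature));
    }
}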
The provider architecture has the virtue of offering a standard interface to the programmer who wants to use a cryptographic function, while at the same time having the flexibility to handle different underlying standards and protocols. The providers may be added either statically or dynamically. Sun, the default provider, provides:
Support for the management of keys and access control lists was not in the initial release of JDK 1.1, but will be provided later.
We discuss the JCA in more detail in Introducing JCA: the Provider Concept.
Unfortunately, only a subset of the cryptographic possibilities is implemented in JDK 1.1. It includes all of the engine classes needed for digital signatures, plus a provider package, but nothing for bulk or public key encryption. The reason for this is the restrictions placed by the US government on the export of cryptographic technology.
The National Security Agency (NSA) is responsible for monitoring communications between the US and the rest of the world, aiming to intercept such things as the messages of unfriendly governments and organized crime. Clearly, it is not a good thing for such people to have access to unbreakable encryption, so the US Government sets limits on the strength of cipher that a US company can export for commercial purposes. 3 This applies to any software that can be used for "general purpose" encryption. So, the SUN provider package that comes with JDK 1.1 can include the full-strength RSA public key algorithm, but it can only be used as part of a digital signature process and not for general encryption.
Finally, in 1996, the US government relaxed the export rules. The promise is that any strength of encryption may be exported, so long as it provides a technique for key recovery, that is, a way for the NSA to retrieve the encryption key if they need to break the code.
The JavaSoft response to the current restrictions was to define two related packages for cryptography in Java. JCA is the exportable part, which contains the tools for signatures and is partially implemented in JDK 1.1. The not-for-export part is the Java Cryptography Extensions (JCE), which include the general purpose encryption capabilities. These consist of engine classes for bulk and public key encryption, plus an extension to the Sun provider package that offers the DES bulk encryption algorithm.
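For comparison, a sketch of the sort of code the JCE engine classes support, written against the javax.crypto interfaces; treat the exact class and method names as illustrative, since the API has evolved across JCE releases:

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

public class BulkEncryptionExample {
    public static void main(String[] args) throws Exception {
        // Generate a DES key and use it to encrypt some data
        KeyGenerator kg = KeyGenerator.getInstance("DES");
        SecretKey key = kg.generateKey();

        Cipher cipher = Cipher.getInstance("DES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal("Attack at dawn".getBytes());

        // Only the same key will recover the plaintext
        cipher.init(Cipher.DECRYPT_MODE, key);
        System.out.println(new String(cipher.doFinal(ciphertext)));
    }
}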
The eventual aim is to develop a full strength, exportable cryptographic toolkit with key recovery built into it.
Using JCA, it is possible for a Java application or applet to create its own digital signatures. This allows you to write more sophisticated programs, but a more common scenario is where you want an applet to do something that the sandbox restrictions normally forbid. In this case, the browser user needs to be convinced that the applet is from a trustworthy source. The way this is achieved is by digitally signing the applet.
The signature on an applet links the code to the programmer or administrator who created or packaged it. However, the user has to be able to check that the signature is valid. The signer enables this by providing a public key certificate. We discuss this in detail in Java Gets Out of Its Box.
When you receive an applet that has been digitally signed you know where it came from and you can make a judgement about whether or not it is trustworthy. Next, you want to exercise some access control. For example, an applet may want to use your hard disk to store some configuration information. You probably have no objection to it doing so, but that does not mean that you are happy for it to overwrite any file on the system. This is the difference between a binary trust model ("I trust you, do what you like") and a fine-grained trust model ("tell me what you want to do and I'll decide whether I trust you").
Other types of executable content, such as browser plug-ins and ActiveX, currently use the binary model. By contrast, signed Java operates on top of the very tight sandbox restrictions. This means that fine-grained controls can be implemented. At the time of writing, the standards for controlling access were still evolving. We discuss the different approaches in JAR Files and Applet Signing.
In the early days of most software developments, security is a long way down the list of priorities. This makes Java very unusual, in that security has been an important consideration from the very beginning. No doubt this is because the environment to which the infant language has been exposed in its formative years is a cruel and unforgiving one: the Internet. In this section we take a cracker's-eye view: what opportunities do we have to abuse a Java applet, to make it do our dastardly deeds for us?
The Java applet that runs in your Web browser has had an unusually long and interesting life history. Along the way it has passed through a number of phases, each of which is in some way vulnerable to attack. Perils in the Life of an Applet illustrates the points of peril in the life of an applet.
Let us look at the points of vulnerability in some detail:
Normally you would expect the bytecode generated by a compiler to reflect the source code you feed in. However, a compiler can easily be hacked so that it adds its own, nefarious, functions. Even worse, a compiler could produce bytecode output that cannot be a translation of normal Java source code. This would be a way to introduce code to exploit some frailty of the Java code verification process, for example.
Although a hacked compiler is the most obvious example of a compromised programming tool, the same concern also applies to other parts of the programmer's arsenal, such as editors, development toolkits and pre-built components.
Any applets you import in this way should be treated with suspicion. This raises a moral question: how responsible should you feel if your Web site somehow damages a client connecting to it, even if you are not ultimately responsible for the content that caused the damage? Most reasonable people will agree that there is a duty of care which should be balanced against your desire to build the world's most dynamic and attractive Web site. Indeed it would be a good idea to check whether your agreements with others mean that you have a formal legal duty of care. You do not want a thoughtlessly-included applet to result in your being sued.
Spoofing is not just a problem for Java applets, of course. Any Web content can be attacked in this way. With Java this gives the attacker an opportunity to execute a malicious applet or try to exploit security holes in the browser environment. However, the risk is small compared to that of downloading and installing a conventional program in this kind of environment.
Signed applets can again help with this problem. An attacker may be able to substitute subversive class files to attack the browser, but it is much more difficult to forge the class signature.
Of these two, the first is more likely. There have been a number of well-publicized security breaches found in the Java virtual machine components. The best description of how these operate can be found in Java Security: Hostile Applets, Holes, and Antidotes, by Felten and McGraw. You can also find more up-to-date online information at the sites listed in Sources of Information about Java Security. The best way to protect yourself is to make sure you are aware of the latest breaches and install the fixes as they arrive.
The possibility of installing a browser that has been tampered with is a real one, although there are considerable practical hurdles for an attacker to overcome in creating such a thing. If you do as we recommend (above) and install the latest fixes, you will inevitably be running a downloaded version of the browser. There is some small risk that this could be a hacked version, but no examples of this have yet been detected.
A Java applet is an obvious vehicle to mount an attack, because it can install itself uninvited and probe for weaknesses. And, of course, this is why so much thought has gone into the construction of the sandbox and the JDK 1.1 code-signing capabilities.
A Java application, on the other hand, is a much less obvious target. There are many ways in which such an application could be implemented, for example:
To a cracker, the fact that the application is written in Java rather than any other language is not really important. The strategies that he or she would use to search for vulnerabilities are the same. For example:
As we said earlier, vulnerabilities of this kind apply to applications written in Java just as they do to applications written in any other programming environment. However, Java does include safety features that make it harder for an attacker to find a flaw. These safety features work at two levels:
Java source
The Java language uses strong type constraints to prevent the programmer from accessing data in an inconsistent way. You can cast objects from one type to another, but only if the two classes are related by inheritance; that is, one must be a superclass of the other. This does not operate symmetrically, which means you can always cast from a subclass to its superclass, but not always vice versa. Referring again to Classes Loaded for the pointlessButton Applet, you could access an instance of the Button class as an Object, but you could not access a Button as a Panel.
Furthermore, Java prevents you from having direct access to program memory. In C it is common to use a pointer to locate a variable in memory and then to manipulate the pointer to process the data in it. This is a frequent source of coding errors, due to the pointer becoming corrupted or iterating beyond the end of the data. Java allows a variable to be accessed only through the methods of the object that contains it, thereby removing this class of error.
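A small sketch of these casting rules, using standard AWT classes:

import java.awt.Button;
import java.awt.Panel;

public class CastDemo {
    public static void main(String[] args) {
        Button b = new Button("demo");
        Object o = b;                // always legal: a Button is an Object
        Button back = (Button) o;    // downcast allowed, and checked at run time
        // Panel p = (Panel) b;      // rejected by the compiler: a Button is not a Panel
    }
}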
Bytecode
The Java virtual machine is type-aware. In other words, most of the primitive machine instructions are associated with a particular type. This means that the JVM also applies the type constraints that the compiler imposes on the Java source. In fact, this job is split between the class file verifier, which handles everything that can be statically checked, and the JVM, which deals with runtime exceptions. Contrast this with other languages, in which the compiler produces microprocessor machine code. In this case the program is just handled as a sequence of bytes, with no concept of the data types that are being represented.
The JVM is also, at a basic level, strongly compartmentalized, mirroring the object orientation of the Java source. This means that each method in the code has its own execution stack and only has access to the memory of the class instance for which it was invoked.
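You can see this type information by disassembling a compiled class with the javap tool. For a trivial method that adds two int values:

public class Adder {
    int add(int a, int b) {
        return a + b;
    }
}

Running javap -c Adder shows the body of add as typed instructions, roughly iload_1, iload_2, iadd, ireturn. Each of these works only on int values; the class file verifier rejects bytecode that, say, tries to load one of those slots as an object reference.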
This first part of the book has been a tour through the many aspects of Java security. You should now have a good high-level understanding of the issues involved and the mechanisms that are at work. In the next section we look under the covers, at the detailed operation of the Java virtual machine and the security classes.
1. In fact we are guilty of using an improper name construction here. If your package will be used together with packages from other sources, you should follow the naming standard laid down in the Java Language Specification, by Gosling, Joy and Steele. In our case this would lead to a package name something like com.ibm.JamJar.examples. If you want to know more about the Java language specification, refer to http://java.sun.com/docs/books/jls/.
2. "Legacy" seems to be the current word-of-the-month to describe any computer system that does not fit the brave new architecture under discussion. It is an unfortunate choice, in that it implies a system that is outdated or inadequate. You may have a state-of-the-art relational database that is critical to the running of your business, but to the Web-based application that depends on the data it contains, it is still a "legacy system".
3. Cipher strength is controlled by the size of the key used in the encryption algorithm. Current export rules limit the key size for bulk encryption to 40 bits, which can now be cracked in a matter of hours with quite modest computing facilities. Each extra bit doubles the key space, so a key size of 64 bits is 16 million times tougher than 40 bits. A similar rule applies to public key encryption, where an export-quality 512-bit modulus is inadequate, but a 1024-bit modulus is expected to remain effective for the next ten years, at least for commercial use.