RMI, Dynamic Proxies, and the Evolution of Deployment

Contents

Stubs, Skeletons, the RMI Runtime, Serialization, and Dynamic Codebases

No More Skeletons: JDK 1.2

Detouring Through Dynamic Proxies

Client-Side Fun and Games

JDK 1.5 Eliminating the Stubs (The Implementation)

Closing Thoughts

I was nosing around the beta of JDK 1.5 a few weeks back, and I noticed that, once again, Java's Remote Method Invocation (RMI) framework has evolved towards simplicity. In particular, the following paragraph caught my eye:

Dynamic Generation of Stub Classes
This release adds support for the dynamic generation of stub classes at runtime, obviating the need to use the Java Remote Method Invocation (Java RMI) stub compiler, rmic, to pregenerate stub classes for remote objects.

This is good news for RMI fans -- it simplifies both development and deployment without incurring any real cost. The changes in 1.5 also started me thinking about the changes in RMI over the past few releases. When RMI first shipped, it looked a lot like CORBA. A lighter-weight and Java-specific CORBA, to be sure, but with the exception of dynamic code loading, it had the same feel, and the same basic structure -- you define interfaces, create stubs and skeletons, and so on.

Slowly and subtly, that's changed. Over the past few releases of the JDK, RMI has evolved into a low-process and lightweight framework for strongly typed remote method invocation. And that's what this article's about. First, I'll review the basics of RMI, what stubs and skeletons are, and how they are used in distributed systems. Hopefully, this will mostly be old hat; in any case, it will also constitute a review of what was "state of the art" in the early versions of RMI. After laying that groundwork, I'm going to ratchet the technical level (a.k.a. "geek quotient") up a little bit and talk about the elimination of skeletons (which happened in JDK 1.2). Then I'll cover the introduction of dynamic proxies (which happened in JDK 1.3) and talk about how that, and aspect-oriented programming, changed the way people built frameworks. And then I'll close by examining the elimination of stubs in more detail.

Stubs, Skeletons, the RMI Runtime, Serialization, and Dynamic Codebases

Before we start, let's review some terminology. Without going into a deep terminological rathole, we're going to use client/server terminology. The client is the application that initiates the action, and the server is the application that does the heavy lifting.

In most object distribution frameworks, method calls from the client to the server are mediated by stubs and skeletons. Stubs are automatically generated pieces of code that live in the client process and represent the server. Programmer code calls a method on the stub, which then forwards the call to the server.

More precisely, the stub forwards the call to the skeleton. The skeleton is another piece of automatically generated code; it lives inside of the server's process. The role of the skeleton is simple: it receives messages from the stub and forwards them to the "real application." Even more precisely, the communication between the stub and the skeleton goes through an additional layer, which I like to refer to as the RMI runtime. The RMI runtime performs a number of simple tasks, mostly related to resource sharing and reuse. Two of the most important are:

Socket reuse. Clients talk to servers over sockets. Sockets are expensive -- opening them takes both time and resources. In most cases, you can reuse a socket repeatedly, to send multiple messages, and that's exactly what RMI does. RMI even shares sockets across instances of servers. If one client is talking to three distant servers inside of one process, all of the communication can potentially travel over a single socket connection.
Thread pooling. One of the things that people often overlook is that each remote method invocation to the server implicitly consumes a thread. (How else could the method call be made?) You could imagine that the skeleton has a "listening thread" that creates additional threads whenever it receives a call. But if you have enough knowledge to think of that, you probably already figured out that having a pool of threads available (and reused across all the skeletons) is a much better idea.

This is summarized in Figure 1:

Figure 1. Remote Method calls go "down the picture" (through the stub, into the runtime, over the wire, through the runtime, into the skeleton, and out into the server code)

This involves a fair number of classes. The class diagram for a simple application (in this case, a mythical Calculator application where the server knows how to do arithmetic) is shown in Figure 2.

Figure 2. Classes involved in a simple RMI application. Note the parallel structure -- every application-specific class mirrors a library class

Note that Figure 2 only contains one class the programmer has to implement (and one interface). Everything else either comes with the JDK or is automatically generated.

Let's go back to Figure 1 now. The important thing about Figure 1 is that it's incomplete; it's a snapshot of what a running system, where the client application already has an active stub to an active server, looks like. But how did the client get or create that stub?

In most RMI applications a naming service, such as the RMI Registry or JNDI, is used to solve this problem. At runtime, here's what happens:

The RMI registry is either already running or is launched by the server process.
Code within the server process creates the actual server instance (for example, an instance of Calculator). As part of this, an instance of the skeleton class is automatically created and handed to the RMI runtime.
Code within the server process registers the actual server instance with the naming service under a logical name (for example, calculator).
The client application requests a stub for the server from the naming service, using the logical name.
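The four steps above can be sketched in a single JVM. This is a minimal illustration, not production code: the class name CalculatorRegistrySketch, the implementation class CalculatorImpl, the add method, and the loopback host setting are all assumptions made for the example. It also assumes JDK 1.5 or later, where no rmic step is required; on the early releases described here you would first have to generate the stub and skeleton classes.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.registry.LocateRegistry;
import java.rmi.registry.Registry;
import java.rmi.server.UnicastRemoteObject;

class CalculatorRegistrySketch {

    /** The remote interface: the one interface the programmer writes. */
    public interface Calculator extends Remote {
        int add(int a, int b) throws RemoteException;
    }

    /** The server class; the UnicastRemoteObject constructor exports it. */
    public static class CalculatorImpl extends UnicastRemoteObject implements Calculator {
        private static final long serialVersionUID = 1L;

        public CalculatorImpl() throws RemoteException {
            super(); // export on an anonymous port
        }

        @Override
        public int add(int a, int b) {
            return a + b;
        }
    }

    /** Walks through the four steps and returns 2 + 3 as computed via the stub. */
    public static int runOnce(int registryPort) throws Exception {
        // Assumption: single-machine demo, so stubs advertise the loopback address.
        System.setProperty("java.rmi.server.hostname", "127.0.0.1");

        // Step 1: here the registry is launched by the server process.
        Registry registry = LocateRegistry.createRegistry(registryPort);
        CalculatorImpl server = null;
        try {
            // Step 2: create (and thereby export) the actual server instance.
            server = new CalculatorImpl();

            // Step 3: register it under a logical name. Because this registry
            // lives in the same JVM, nothing is serialized at this point.
            registry.rebind("calculator", server);

            // Step 4: the "client" asks the naming service for a stub by name.
            Registry clientView = LocateRegistry.getRegistry("127.0.0.1", registryPort);
            Calculator stub = (Calculator) clientView.lookup("calculator");
            return stub.add(2, 3);
        } finally {
            // Unexport everything so that the JVM can exit.
            if (server != null) {
                UnicastRemoteObject.unexportObject(server, true);
            }
            UnicastRemoteObject.unexportObject(registry, true);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("2 + 3 = " + runOnce(Registry.REGISTRY_PORT));
    }
}
```

In a real deployment the registry, the server, and the client would be three separate processes, which is exactly where the class deployment questions discussed next come from.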

Under the covers, the third and fourth steps are a bit tricky. What really happens in the third step is:

A stub for the server object is created.
The stub is serialized (the instance data for that stub is turned into a sequence of bytes).
The bytes are sent to the naming service, which uses them to create a new instance of the stub class (deserialization).

And what really happens in the fourth step is:

A stub for the server object is created.
The stub is serialized (the instance data for that stub is turned into a sequence of bytes).
The bytes are sent to the client application, which uses them to create a new instance of the stub class (deserialization).

That is, in order to create a stub at the client, you usually wind up having class definitions deployed both at the naming service and the client application.

Deploying classes can be awkward, though. For one thing, it's easy to forget to generate and deploy skeletons and stubs, especially at the naming service. People often overlook the need to deploy (or redeploy) classes at the naming service, especially when deploying a "hotfix" to a server. More seriously, a lot of environments are set up with a small number of naming services that are shared by a large number of different applications. In such environments, a single naming service can be used by multiple versions of an application, and a single client application might need to talk to multiple versions of a server application (each with a different stub). In such environments, it's often problematic to deploy class files on a naming server, or on a client machine. Which is why dynamic classloading got so much publicity back in the early days of RMI -- it helps to solve the deployment issues.

The idea behind dynamic classloading is simple and seductive: if you annotate the instance data (which is being collected by serialization) with a URL from which the class file can be downloaded, then you don't need to deploy the classes to the file system -- they're created automatically by the classloader when the instance data arrives. In essence, this is the same idea as "lazy fetching" (where database objects are unpopulated until their fields are accessed) or just-in-time compiling (where conversion to native code is delayed until the virtual machine decides it's necessary). It's a bit of a marketing mystery why Sun used the term "dynamic classloading" instead of "just-in-time deployment."

In the dynamic classloading scenario, once you do a few chores, magic happens and your deployment obligations are reduced. In terms of the above class diagram, you get the following:

Figure 3. Classes and deployment obligations. The ones with blue shading need to be deployed with the client; the ones with red shading can be deployed dynamically

Note: You can also dynamically deploy other class files. It's just done less often.

No More Skeletons: JDK 1.2

That's where we were when JDK 1.1 shipped. RMI was a nice framework that simplified building distributed applications. You had to build stubs and skeletons and either deploy them with your application or arrange for them to be dynamically downloaded. But you had the same problems with other frameworks (like CORBA) too, and anyway, if you really cared about deployment issues, dynamic code loading demoed really well.

In JDK 1.2, things changed a little. The Java Remote Method Protocol (JRMP), which is the actual "bytes on the wire" protocol used by RMI, was revised. One part of the thinking behind the revision went something like this:

The client is sending a method call to the server.
In order to do so, the client really has to know a lot of information.
In fact, the client knows so much that, well, the skeleton is unnecessary. With a little bit of boilerplate reflective code, the RMI runtime can dispatch the method call directly, without the skeleton.
And wouldn't it be cool to get rid of skeletons?
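To make the "boilerplate reflective code" idea concrete, here is a minimal sketch of a generic dispatcher that does what a generated skeleton used to do. It is an illustration under stated assumptions, not the RMI runtime's actual code: the class and method names are hypothetical, and the method is identified by name and parameter types, whereas JRMP 1.2 identifies it by a hash computed from the method's name and descriptor.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

class ReflectiveDispatchSketch {

    /** Stand-in for the "real application" object on the server. */
    public static class Calculator {
        public int add(int a, int b) {
            return a + b;
        }
    }

    /**
     * One generic routine replaces every per-class skeleton: locate the
     * requested method reflectively, then invoke it on the target object.
     */
    public static Object dispatch(Object target, String methodName,
                                  Class<?>[] parameterTypes, Object[] args)
            throws Throwable {
        Method method = target.getClass().getMethod(methodName, parameterTypes);
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException e) {
            throw e.getCause(); // rethrow what the server method itself threw
        }
    }

    public static void main(String[] args) throws Throwable {
        Object result = dispatch(new Calculator(), "add",
                new Class<?>[] { int.class, int.class }, new Object[] { 2, 3 });
        System.out.println(result); // prints 5
    }
}
```

Since the dispatcher needs nothing that is specific to Calculator, it can live in the runtime once instead of being generated for every server class.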

And so it came to pass. If you wanted to, you could use rmic to generate stubs that used the new version of JRMP (using the -v1.2 command line argument) and then never have to worry about the skeletons. This change simplified server-side deployment issues a little bit, but the world was, in general, underwhelmed by it. After all, the hard part of deployment isn't updating the server; it's updating the clients.

Detouring Through Dynamic Proxies

JDK 1.3 introduced dynamic proxies. A dynamic proxy is just an automatically generated class that implements a set of interfaces. The important point is that the interfaces are specified at runtime, and the implementations of all of the methods are handled by a single invocation handler that you define.

The following code snippet illustrates the general idea:

// Uses java.lang.reflect.Constructor, InvocationHandler, and Proxy.
MimickingInvocationHandler handler = new MimickingInvocationHandler (....);
Class[] delegationInterfaces = new Class[] { RichardNixon.class, 
     GeraldFord.class, RichardSimmons.class};

// Ask the runtime to generate (or reuse) a class that implements the interfaces.
Class richLittleProxyClass = Proxy.getProxyClass(
     getClass().getClassLoader(), delegationInterfaces);

// Every proxy class has a public constructor that takes an InvocationHandler.
Constructor proxyClassConstructor =
     richLittleProxyClass.getConstructor(new Class[] {
     InvocationHandler.class});

return proxyClassConstructor.newInstance(new Object[] {handler});

This rather artificial code results in the creation of a new class object and a new instance. In this example, a class will be created that implements the RichardNixon, GeraldFord, and RichardSimmons interfaces (and we will store it in the variable richLittleProxyClass). It does so using an instance of MimickingInvocationHandler, which simply must implement the invoke method:

public Object invoke(Object proxy, Method method, Object[] args)
      throws Throwable {
   // All methods that are called on the RichLittleProxyClass 
   // actually wind up calling this method.
   ...
}
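For readers who want something they can run, here is a small self-contained sketch of the same idea. It uses Proxy.newProxyInstance, which combines the class lookup and the constructor call in one step. The methods on the two interfaces (speak and wave), the class name MimicProxySketch, and the behavior of the handler are assumptions invented for the example.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class MimicProxySketch {

    public interface RichardNixon { String speak(); } // hypothetical method
    public interface GeraldFord { String wave(); }    // hypothetical method

    /** A single handler answers every call made on the proxy. */
    public static class MimickingInvocationHandler implements InvocationHandler {
        @Override
        public Object invoke(Object proxy, Method method, Object[] args) {
            if (method.getDeclaringClass() == Object.class) {
                // equals, hashCode, and toString are routed here as well.
                switch (method.getName()) {
                    case "equals":   return proxy == args[0];
                    case "hashCode": return System.identityHashCode(proxy);
                    default:         return "MimicProxy";
                }
            }
            // Report which interface method was called.
            return method.getDeclaringClass().getSimpleName() + "." + method.getName();
        }
    }

    /** Creates one object that implements both interfaces at runtime. */
    public static Object createMimic() {
        return Proxy.newProxyInstance(
                MimicProxySketch.class.getClassLoader(),
                new Class<?>[] { RichardNixon.class, GeraldFord.class },
                new MimickingInvocationHandler());
    }

    public static void main(String[] args) {
        Object mimic = createMimic();
        System.out.println(((RichardNixon) mimic).speak()); // prints RichardNixon.speak
        System.out.println(((GeraldFord) mimic).wave());    // prints GeraldFord.wave
    }
}
```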

If you're a distributed systems guy, this looks wonderful -- the obvious thing to do is wrap your servers in logging, authentication, and processor control layers that are all defined using dynamic proxies. (Distributed systems guys tend to have this reaction a lot. At the first AspectJ talk I ever attended, a distributed systems guy leaned over and said "Cool! We can use this to simplify authentication.")

Note: In fact, if you're an EJB guy, you're probably thinking that EJB already has those wrappers and asking yourself if they're implemented using dynamic proxies. Some EJB containers are.

Authentication is a good example. You might have a whole lot of methods that require an authentication token and check permissions. For example, you might have the following authentication rules implemented inside of the isValid(...) method of an AuthenticationChecker class.

Check that the authentication token is a valid authentication token.
Check that the IP address of the caller matches the IP address associated to the token.
Get the permissions for the token and check whether this method is allowed.

And then somehow tie this in to your server using a dynamic proxy. Ideally, you'd have something like this:

A class that is the "server" for the purposes of RMI (e.g., incoming remote method invocations are sent to it) and that uses an invocation handler that first checks authentication and then forwards the call.
A class that would have been the server, but is now the server's delegate instead.
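Within a single process (no RMI involved), the arrangement might look like the following sketch. Everything here is an assumption made for illustration: the Calculator interface, the convention that the token is the first argument, the hard-coded token, and the signature of isValid, which the article leaves unspecified. A real checker would also verify the caller's IP address and per-method permissions.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

class AuthenticationProxySketch {

    public interface Calculator {
        int add(String token, int a, int b);
    }

    /** The class that "would have been the server"; now it is the delegate. */
    public static class CalculatorDelegate implements Calculator {
        @Override
        public int add(String token, int a, int b) {
            return a + b;
        }
    }

    /** Hypothetical checker that only knows one valid token. */
    public static class AuthenticationChecker {
        public boolean isValid(String token, Method method) {
            return "secret-token".equals(token);
        }
    }

    /** Checks authentication first, then forwards the call to the delegate. */
    public static class AuthenticatingHandler implements InvocationHandler {
        private final Object delegate;
        private final AuthenticationChecker checker;

        public AuthenticatingHandler(Object delegate, AuthenticationChecker checker) {
            this.delegate = delegate;
            this.checker = checker;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            if (method.getDeclaringClass() != Object.class) {
                // Assumed convention: the token is the first argument.
                String token = (args != null && args.length > 0 && args[0] instanceof String)
                        ? (String) args[0] : null;
                if (!checker.isValid(token, method)) {
                    throw new SecurityException("Not authorized: " + method.getName());
                }
            }
            try {
                return method.invoke(delegate, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        }
    }

    public static Calculator secure(Calculator delegate) {
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(),
                new Class<?>[] { Calculator.class },
                new AuthenticatingHandler(delegate, new AuthenticationChecker()));
    }

    public static void main(String[] args) {
        Calculator calculator = secure(new CalculatorDelegate());
        System.out.println(calculator.add("secret-token", 2, 3)); // prints 5
        try {
            calculator.add("bogus", 2, 3);
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```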

Unfortunately, this doesn't work without a lot more effort on your part. For one thing, the proxy classes that are generated are subclasses of java.lang.reflect.Proxy, which means they can't also be subclasses of UnicastRemoteObject (or any other RMI server base class). While, strictly speaking, it's not necessary for servers to subclass UnicastRemoteObject, it's very convenient. Moreover, the client still needs stubs, and this means that you're going to have to write them by hand (or build something very similar to rmic to generate them).

So when JDK 1.3 shipped (and this was around the time that AspectJ first started to get some traction, as well), there was a lot of speculation about proxies and servers. The potential of dynamic proxies, and more generally, of aspect-oriented programming for server construction is obvious. Consequently, many object distribution and application frameworks have begun to make this style of proxy use core to their architecture. But it wasn't (and still isn't) possible in RMI.

Client-Side Fun and Games

On the server side, dynamic proxies are tantalizing. They hint at what we want to do, but they don't quite allow us to get it done. On the client side, however, the addition of dynamic proxies in JDK 1.3 allows us to do a very cool thing. In order to explain it, we need to discuss a slight change in RMI's serialization algorithm. This change actually happened in JDK 1.2.2, but at the time, everyone viewed it as an entirely innocuous bug fix. Here's the relevant snippet from the release notes:

Serializing remote objects (since 1.2.2)
Prior to 1.2.2, an attempt to pass an unexported remote object in a RMI call would result in a java.rmi.StubNotFoundException. This exception was a result of the RMI runtime's failure to locate a stub object during an attempt to replace a remote object implementation with its corresponding stub. In 1.2.2 and later releases, an unexported remote object passed in an RMI call will no longer result in an exception, but rather the remote object will be serialized instead of its stub. If the remote object implementation is not serializable, an attempt to pass an unexported object in an RMI call will result in a java.rmi.RemoteException with the nested exception java.io.NotSerializableException.

This might not seem like a very big deal -- it simply says that if an object implements both Remote and Serializable, then if it's not listening on a socket, it gets serialized (otherwise, a stub to it gets serialized).

The interesting part is that you can write dynamic proxies that include both Remote and Serializable in their list of interfaces (and, since neither Remote nor Serializable declare any methods, it's easy to do so). For example, suppose you implement an RMI server as a subclass of UnicastRemoteObject, as is traditional. Then you wrap it in a proxy, and both the proxy and the invocation handler implement Serializable. You can then bind the proxy into your naming service. It gets serialized out, and stored there. (And since it will contain a reference to the server, the serialized copy of the invocation handler will contain a reference to a stub for the server.) Figure 4 illustrates the situation:

Figure 4. Wrapping a server with a dynamic proxy. The key point is that the skeleton, which listens for remote method invocations, is bound to the actual RMI server (not to the dynamic proxy)
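The serialization half of this trick can be tried without any network at all. The sketch below is only an approximation of the situation in Figure 4: it uses plain Java serialization rather than RMI's marshalling streams, and the handler's target is an ordinary serializable object (LocalCalculator, a hypothetical stand-in) where the RMI case would end up with a stub to the exported server. All class names are assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.rmi.Remote;
import java.rmi.RemoteException;

class SerializableProxySketch {

    public interface Calculator extends Remote {
        int add(int a, int b) throws RemoteException;
    }

    /** Stand-in for what the handler references (in RMI, a stub to the server). */
    public static class LocalCalculator implements Calculator, Serializable {
        private static final long serialVersionUID = 1L;

        @Override
        public int add(int a, int b) {
            return a + b;
        }
    }

    /** A serializable handler, so the whole proxy can travel as bytes. */
    public static class ForwardingHandler implements InvocationHandler, Serializable {
        private static final long serialVersionUID = 1L;
        private final Calculator target;

        public ForwardingHandler(Calculator target) {
            this.target = target;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause();
            }
        }
    }

    /** The proxy lists Remote and Serializable among its interfaces. */
    public static Calculator wrap(Calculator target) {
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(),
                new Class<?>[] { Calculator.class, Remote.class, Serializable.class },
                new ForwardingHandler(target));
    }

    /** Serializes an object to bytes and reads back an independent copy. */
    public static Object roundTrip(Object original) throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(original);
        }
        try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Calculator original = wrap(new LocalCalculator());
        Calculator copy = (Calculator) roundTrip(original);
        System.out.println(Proxy.isProxyClass(copy.getClass())); // prints true
        System.out.println(copy.add(2, 3));                      // prints 5
    }
}
```

The copy is a fully working proxy whose invocation handler was rebuilt from bytes, which is what allows a naming service (and later a client) to hold one.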

Going one step further, the invocation handler is actually downloaded to, and executes on, the client. The way it works is illustrated in Figure 5. There's an instance of the dynamic proxy on both the client and the server, but it winds up getting bypassed on the server because the remote method invocation uses the stub to the subclass of UnicastRemoteObject.

Figure 5. The remote call flow. The client talks to the dynamic proxy, which talks to the stub, etcetera. The key point here is that the dynamic proxy on the server side is skipped over

I haven't found a killer use for this yet, but it can be useful (for implementing some simple caching and retry strategies). And, as I said, it's quite intriguing.
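As a rough idea of what a retry strategy could look like, here is a sketch of an invocation handler that retries a call when it fails with a RemoteException. It is a local simulation under stated assumptions: FlakyCalculator is a hypothetical stand-in for a stub whose first two calls fail, and a handler meant to be shipped to the client as described above would also need to implement Serializable.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.rmi.RemoteException;

class RetryProxySketch {

    public interface Calculator {
        int add(int a, int b) throws RemoteException;
    }

    /** Simulated unreliable target: fails twice, then succeeds. */
    public static class FlakyCalculator implements Calculator {
        public int calls = 0;

        @Override
        public int add(int a, int b) throws RemoteException {
            if (++calls < 3) {
                throw new RemoteException("simulated network failure");
            }
            return a + b;
        }
    }

    /** Retries calls that fail with RemoteException, up to maxAttempts times. */
    public static class RetryingHandler implements InvocationHandler {
        private final Object target;
        private final int maxAttempts;

        public RetryingHandler(Object target, int maxAttempts) {
            if (maxAttempts < 1) {
                throw new IllegalArgumentException("maxAttempts must be at least 1");
            }
            this.target = target;
            this.maxAttempts = maxAttempts;
        }

        @Override
        public Object invoke(Object proxy, Method method, Object[] args) throws Throwable {
            Throwable lastFailure = null;
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                try {
                    return method.invoke(target, args);
                } catch (InvocationTargetException e) {
                    lastFailure = e.getCause();
                    if (!(lastFailure instanceof RemoteException)) {
                        throw lastFailure; // application errors are not retried
                    }
                }
            }
            throw lastFailure; // every attempt failed
        }
    }

    public static Calculator withRetries(Calculator target, int maxAttempts) {
        return (Calculator) Proxy.newProxyInstance(
                Calculator.class.getClassLoader(),
                new Class<?>[] { Calculator.class },
                new RetryingHandler(target, maxAttempts));
    }

    public static void main(String[] args) throws RemoteException {
        FlakyCalculator flaky = new FlakyCalculator();
        Calculator calculator = withRetries(flaky, 3);
        System.out.println(calculator.add(2, 3)); // prints 5
        System.out.println(flaky.calls);          // prints 3
    }
}
```

Blind retries are only safe for calls that can be repeated without side effects, which is one reason this remains a "simple" strategy.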

JDK 1.5 Eliminating the Stubs (the Implementation)

At this point, we've talked about how to get rid of skeletons using reflection and about how to use dynamic proxies in various contexts, including wrapping stubs on the client side to get a very simple form of aspect-oriented programming.

The next step is to use dynamic proxies to get rid of stubs entirely, instead of simply wrapping them. After all, stubs don't have any special magic; they're instances of an automatically generated class designed to forward method calls over the wire. Abstractly, there's no reason they couldn't simply be replaced by a proxy object.

Unfortunately, before JDK 1.5, the structure of RMI made it impossible for you to get rid of stubs without a significant amount of skill and effort (and an even more significant amount of brittle code).

In JDK 1.5? Well, this article started with a very short quote. I think you're ready for the longer version now.

Dynamic Generation of Stub Classes
This release adds support for the dynamic generation of stub classes at runtime, obviating the need to use the Java(tm) Remote Method Invocation (Java RMI) stub compiler, rmic, to pregenerate stub classes for remote objects. Note that rmic must still be used to pregenerate stub classes for remote objects that need to support clients running on earlier versions. When an application exports a remote object (using the constructors or static exportObject methods of the classes java.rmi.server.UnicastRemoteObject or java.rmi.activation.Activatable) and a pregenerated stub class for the remote object's class cannot be loaded, the remote object's stub will be a java.lang.reflect.Proxy instance (whose class is dynamically generated) with a java.rmi.server.RemoteObjectInvocationHandler as its invocation handler. An existing application can be deployed to use dynamically generated stub classes unconditionally (that is, whether or not pregenerated stub classes exist) by setting the system property java.rmi.server.ignoreStubClasses to true. If this property is set to true, pregenerated stub classes are never used.

In short: dynamic proxies replace stubs.
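You can check this for yourself on JDK 1.5 or later. The sketch below exports a remote object for which no pregenerated stub class exists and then inspects the stub that comes back. The class and method names (DynamicStubSketch, Calculator, CalculatorImpl, describeStub) are assumptions made for the example.

```java
import java.lang.reflect.Proxy;
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.UnicastRemoteObject;

class DynamicStubSketch {

    public interface Calculator extends Remote {
        int add(int a, int b) throws RemoteException;
    }

    /** A plain implementation; rmic was never run for this class. */
    public static class CalculatorImpl implements Calculator {
        @Override
        public int add(int a, int b) {
            return a + b;
        }
    }

    /**
     * Exports a CalculatorImpl and reports what kind of stub was produced:
     * the invocation handler's class name if the stub is a dynamic proxy,
     * otherwise the name of the pregenerated stub class.
     */
    public static String describeStub() throws RemoteException {
        CalculatorImpl server = new CalculatorImpl();
        Remote stub = UnicastRemoteObject.exportObject(server, 0); // anonymous port
        try {
            if (Proxy.isProxyClass(stub.getClass())) {
                return Proxy.getInvocationHandler(stub).getClass().getName();
            }
            return stub.getClass().getName();
        } finally {
            // Unexport so that the JVM can exit.
            UnicastRemoteObject.unexportObject(server, true);
        }
    }

    public static void main(String[] args) throws RemoteException {
        System.out.println(describeStub());
    }
}
```

With no CalculatorImpl_Stub class on the classpath, the reported name should be that of the RemoteObjectInvocationHandler mentioned in the release notes.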

There are two very interesting codas to this, though. The first is that the earlier trick, of wrapping the server in a proxy, still works with dynamically generated stubs. The second is that the dynamic proxy class is apparently generated independently on the client and server -- if you launch an application without enabling dynamic code loading, and without creating stubs or skeletons, and with clients and servers running from separate classpaths, you'll still be able to connect clients with servers.

Closing Thoughts

RMI has been remarkably stable for a long time. Changes have been slow and incremental. But each change has been towards simplifying development and deployment, without any accompanying sacrifices. It's still a strongly typed system, it still preserves as much of the local-process call syntax and semantics as is reasonable, and it still is entirely Java. But, over time, it's also gotten more agile. Slowly and subtly, RMI has evolved into a low-process, low-deployment-overhead, and lightweight framework for strongly typed remote method invocation. It's become as good for very dynamic environments as any of the more loosely coupled frameworks without sacrificing any of its original strengths.