Sunday, August 16, 2009

Article: Meta-Programming Java

In the last couple of years, if you mention the term meta-programming, people's ears perk up ... and they start looking around for Ruby. That's fair; Ruby makes a lot of meta-programming concepts very, very easy. However, that doesn't mean you can't do any meta-programming in Java; you just are a bit more limited and need a lot more infrastructure.

Tapestry 5, both the web framework and the underlying Inversion of Control container, is rife with meta-programming options. Let's talk about one of the most versatile: the thunk.

Thunks and Laziness

A thunk is a placeholder for a value to be computed as-needed. The Haskell programming language makes great use of these; thunks are the essence of lazy programming: each thunk represents a set of parameters to a function[1] and the function itself.

The upshot of this is that when you see a function call (or other expression) in Haskell code, what really happens is that a thunk of the invocation of that function is created to capture the values to be passed in (some of which may themselves be thunks of other expressions). It's only when the value is needed, when the result of the expression is used in some other expression that is evaluated, that the thunk itself gets evaluated; the function is invoked, the return value is cached in the thunk and returned. This makes the order in which things happen in Haskell very difficult to predict, especially from the outside. Because of thunks, algorithms that look tail recursive aren't (the recursive call is just another thunk, evaluated serially). Further, algorithms that appear to be infinite, aren't: the thunks ensure that just values that are actually needed are ever computed.
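To make the "evaluate on first use, then cache" idea concrete in Java terms, here is a minimal sketch of a memoizing thunk. It is not how Haskell implements laziness and it is not Tapestry code; the names Lazy, LazyDemo and expensiveSquare are made up for illustration, and it assumes Java 8 or later for the lambda syntax.

```java
import java.util.function.Supplier;

class LazyDemo {
    /** Hypothetical memoizing thunk: the computation runs at most once, on first use. */
    static final class Lazy<T> implements Supplier<T> {
        private Supplier<T> computation; // dropped once the value is known
        private T value;
        private boolean evaluated;

        Lazy(Supplier<T> computation) {
            this.computation = computation;
        }

        @Override
        public synchronized T get() {
            if (!evaluated) {
                value = computation.get();
                evaluated = true;
                computation = null; // release whatever the computation captured
            }
            return value;
        }
    }

    static int calls = 0;

    /** Stand-in for an expensive function; counts how often it really runs. */
    static int expensiveSquare(int n) {
        calls++;
        return n * n;
    }

    public static void main(String[] args) {
        // Creating the thunk captures the argument but computes nothing.
        Lazy<Integer> thunk = new Lazy<>(() -> expensiveSquare(12));
        System.out.println("calls before use: " + calls);

        // The first get() evaluates; the second returns the cached value.
        int sum = thunk.get() + thunk.get();
        System.out.println("sum: " + sum);
        System.out.println("calls after use: " + calls);
    }
}
```

The caller has to call get() explicitly here, which is exactly the limitation the typed thunks described later are meant to remove.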

It's an elegant and powerful approach, and it's even fast, because the fastest code is the code that is never executed in the first place.

Other languages have this feature; Clojure reflects its Lisp heritage in that almost everything operates in terms of accessing, iterating and transforming collections ... and all of those collection operations are lazy as well. Unlike Haskell, this is more a function of a carefully crafted standard library than a direct offshoot of the language, but the end result is quite similar.

But what happens when you want to accomplish some of these features (such as lazy evaluation) within the tight constraints of standard Java? That's when you need to get creative!

Thunks in Tapestry 5

Tapestry 5 uses thunks in many different places; the most common one is the use of proxies for Tapestry 5 IoC services. In Tapestry 5 every service has an interface[2]. Let's take a peek at a typical service in Tapestry 5, to illustrate the typed-thunk concept.

Listing 1: ComponentMessagesSource.java

public interface ComponentMessagesSource
{
    Messages getMessages(ComponentModel componentModel, Locale locale);

    InvalidationEventHub getInvalidationEventHub();
}

The purpose of the ComponentMessagesSource service is to provide a Messages object representing a particular component's message catalog. This is part of Tapestry's localization support: every page and component has easy access to its own message bundle, which includes messages inherited from base components and from a global message catalog.

A central tenet of Tapestry 5 is that service instantiation is lazy: services are only constructed as needed. What does "as needed" mean? It means, the first time any method of the service is invoked. This kind of lazy instantiation is accomplished by using thunks. So for a service such as ComponentMessagesSource, there will be a class somewhat like ComponentMessagesSourceThunk to handle the lazy instantiation:

Listing 2: ComponentMessagesSourceThunk.java

public class ComponentMessagesSourceThunk implements ComponentMessagesSource
{
    private final ObjectCreator creator;

    public ComponentMessagesSourceThunk(ObjectCreator creator) { this.creator = creator; }

    // The creator yields the real service implementation, not another thunk.
    private ComponentMessagesSource delegate() { return (ComponentMessagesSource) creator.createObject(); }

    public Messages getMessages(ComponentModel componentModel, Locale locale)
    {
        return delegate().getMessages(componentModel, locale);
    }

    public InvalidationEventHub getInvalidationEventHub()
    {
        return delegate().getInvalidationEventHub();
    }
}

You won't find the above class in the Tapestry source code: it is generated on-the-fly by Tapestry. That's great, because I know I'd hate to have to supply a service interface, a service implementation and a thunk class for each service; the interface and implementation are already plenty! One of the reasons that Tapestry all but requires that services have a service interface is to support the automatic creation of thunks or other proxies around the interface.

However, you can see the pattern: every method of the interface is, of course, implemented in the thunk. That's what it means to implement an interface. Each method obtains the delegate and then re-invokes the same method with the same parameters on the delegate. The trick is that the first time any of these methods is invoked, the delegate does not yet exist. The ObjectCreator will create the delegate object during that first invocation, and keep returning it subsequently. That's the essence of lazy instantiation.
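The "create on the first invocation, keep returning it afterwards" behavior can be sketched in a few lines. This is only an illustration under assumptions: the ObjectCreator interface below is a local stand-in with the same single createObject() method used in the listings, and OnceOnlyCreator is a hypothetical name, not Tapestry's class.

```java
class CachingCreatorDemo {
    /** Local stand-in for the ObjectCreator in the listings, so the sketch compiles alone. */
    interface ObjectCreator {
        Object createObject();
    }

    /** Hypothetical wrapper: calls the real creator once, then hands back the same object. */
    static final class OnceOnlyCreator implements ObjectCreator {
        private final ObjectCreator delegate;
        private Object cached;
        private boolean created;

        OnceOnlyCreator(ObjectCreator delegate) {
            this.delegate = delegate;
        }

        @Override
        public synchronized Object createObject() {
            if (!created) {
                cached = delegate.createObject();
                created = true;
            }
            return cached;
        }
    }

    static int constructions = 0;

    public static void main(String[] args) {
        ObjectCreator creator = new OnceOnlyCreator(() -> {
            constructions++; // stands in for an expensive service constructor
            return new StringBuilder("expensive service");
        });

        Object first = creator.createObject();
        Object second = creator.createObject();
        System.out.println("same instance: " + (first == second));
        System.out.println("constructions: " + constructions);
    }
}
```

A thunk like ComponentMessagesSourceThunk can then call createObject() on every method invocation without worrying about building the service twice.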

The point here is that for any interface, you can create a typed-thunk that can stand in for the real object, hiding the real object's lifecycle: it gets created on demand by the ObjectCreator. Code that uses the thunk has no way of telling the thunk from the real objects ... the thunk implements all the methods of the interface and performs the right behaviors when those methods get invoked.

Creating Thunks Dynamically

Before we can talk about using thunks, we need to figure out how to create them dynamically, at runtime. Let's start by specifying the interface for a service that can provide thunks on demand, then figure out the implementation of that service.

Listing 3: ThunkCreator.java

public interface ThunkCreator
{
    /**
     * Creates a Thunk of the given proxy type.
     *
     * @param proxyType     type of object to create (must be an interface)
     * @param objectCreator provides an instance of the same type on demand (may be invoked multiple times)
     * @param description   to be returned from the thunk's toString() method
     * @param <T>           type of thunk
     * @return thunk of given type
     */
    <T> T createThunk(Class<T> proxyType, ObjectCreator objectCreator, String description);
}

Remember that this is just an automated way of producing instances of classes similar to ComponentMessagesSourceThunk. A simple implementation of this service is possible using JDK Proxies:

Listing 4: ThunkCreatorImpl.java

public class ThunkCreatorImpl implements ThunkCreator
{
    public <T> T createThunk(Class<T> proxyType, final ObjectCreator objectCreator, final String description)
    {
        InvocationHandler handler = new InvocationHandler()
        {
            public Object invoke(Object proxy, Method method, Object[] args) throws Throwable
            {
                if (method.getName().equals("toString") && method.getParameterTypes().length == 0)
                    return description;

                return method.invoke(objectCreator.createObject(), args);
            }
        };

        Object proxy = Proxy.newProxyInstance(Thread.currentThread().getContextClassLoader(),
                                              new Class[] { proxyType },
                                              handler);

        return proxyType.cast(proxy);
    }
}

JDK Proxies were introduced way back in JDK 1.3 and caused a real flurry of activity because they are so incredibly useful. A call to Proxy.newProxyInstance() will create an object conforming to the provided interfaces (here specified as the proxyType parameter). Every method invocation is routed through a single InvocationHandler object. The InvocationHandler simply re-routes method invocations to the object returned from objectCreator.createObject().
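Here is a small, self-contained way to try the JDK proxy approach outside of Tapestry. It is a sketch under assumptions: ObjectCreator is a local stand-in, Greeter is a made-up interface, and two details differ from Listing 4 by my own choice (it uses the interface's class loader, and it unwraps the InvocationTargetException that Method.invoke() throws so callers see the original exception).

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

class ProxyThunkDemo {
    /** Local stand-in for the ObjectCreator in the listings. */
    interface ObjectCreator {
        Object createObject();
    }

    /** Hypothetical service interface used only for this demonstration. */
    interface Greeter {
        String greet(String name);
    }

    static <T> T createThunk(Class<T> proxyType, ObjectCreator creator, String description) {
        InvocationHandler handler = (proxy, method, args) -> {
            // toString() is answered by the thunk itself, without creating the delegate.
            if (method.getName().equals("toString") && method.getParameterCount() == 0)
                return description;
            try {
                return method.invoke(creator.createObject(), args);
            } catch (InvocationTargetException ex) {
                throw ex.getCause(); // rethrow what the real object threw
            }
        };

        Object proxy = Proxy.newProxyInstance(proxyType.getClassLoader(),
                                              new Class<?>[] { proxyType },
                                              handler);
        return proxyType.cast(proxy);
    }

    static int created = 0;

    public static void main(String[] args) {
        Greeter thunk = createThunk(Greeter.class, () -> {
            created++;
            return (Greeter) name -> "Hello, " + name;
        }, "<Greeter Thunk>");

        System.out.println(thunk);                    // handled by the thunk itself
        System.out.println("created: " + created);    // delegate not built yet
        System.out.println(thunk.greet("Tapestry"));  // first real call builds the delegate
        System.out.println("created: " + created);
    }
}
```

Note that this thunk asks the creator for the delegate on every call; caching the delegate is the creator's job, which matches the "may be invoked multiple times" remark in the ThunkCreator Javadoc.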

Tapestry's implementation of ThunkCreator uses the Javassist bytecode manipulation library to generate a custom class at runtime. The generated class is much closer to the example ComponentMessagesSourceThunk; it doesn't use JDK proxies or reflection. This means that Java's HotSpot compiler can do a better job optimizing the code. In reality, you'll be hard pressed to spot a difference in performance unless you use these thunks inside a very tight loop.

Great so far; now let's think about how we could use this in another way. What if you have a service that returns an object that is expensive to construct and may not even get used? An example of this in Tapestry is the Messages object, obtained from the ComponentMessagesSource service. Building a Messages instance for a component involves a lot of hunting around the classpath looking for properties files, not just for the component but for its base-class and for application-wide message bundles. That means a lot of I/O and a lot of blocking, waiting for the disk drive to catch up. In many cases, these Messages objects are injected into components, but aren't used immediately. In terms of getting markup into the user's browser faster, avoiding all of those file lookups and file reads until absolutely necessary is an appreciable win.

Our goal is to intercept the call to ComponentMessagesSource.getMessages() and capture the parameters to the method. Instead of invoking the method, we want to return a thunk that encapsulates the method call. This is where we can really start to talk about meta-programming, not just programming: we aren't going to change the ComponentMessagesSource service implementation to accomplish this, we are going to meta-program the service. This is a key point: A Tapestry service is the sum of its interface, its implementation, and all the other parts provided by Tapestry. We can use Tapestry to augment the behavior of a service without changing the implementation of the service itself.

This approach is in stark contrast to, say, Ruby. When meta-programming Ruby you often end up writing and rewriting the methods defined by the class in place. In Java, you will instead layer on new objects implementing the same interface to provide the added behavior.

Accomplishing all this is surprisingly easy ... given the infrastructure that Tapestry 5 IoC already provides.

Lazy Advice

The goal with lazy advice is that invoking a method on a service short-circuits the method invocation: a thunk is returned that is a replacement for the return value of the method. Invoking a method on a thunk will invoke the actual service method, then re-invoke the method on the actual value returned from the method.

[Image 1: Lazy Advice Thunk]

This is shown in image 1. The service method is represented by the blue line. The advice intercepts the call (remembering the method parameters) and returns a thunk. Later, the caller invokes a method on the thunk (the green line). The thunk will invoke the service method using the saved parameters (this is the lazy part), then re-invoke the method on the returned value.

To the caller, there is no evidence that the thunk even exists; the service method just returns faster than it should, and the first method invocation on the return value takes a little longer than it should.

Now we know what the solution is going to look like ... but how do we make it actually happen? How do we get "in there" to advise service methods?

Advising Service Methods

Tapestry's Inversion of Control Container is organized around modules: classes that define services. This is in contrast to Spring, which relies on verbose XML files. Tapestry uses a naming convention to figure out what methods of a module class do what. Methods whose name starts with "build" define services (and are ultimately used to instantiate them). Other method name prefixes have different meanings.

Module method names prefixed with "advise" act as a hook for a limited amount of Aspect Oriented Programming. Tapestry allows an easy way to provide around advice on method invocations ... a more intrusive system such as AspectJ can easily intercept access to fields or even the construction of classes and has more facilities for limiting the scope of advice so that it only applies to invocations in specific classes or packages. Of course, it works by significantly rewriting the bytecode of your classes and Tapestry's IoC container aims for a lighter touch.

Being able to advise service methods was originally intended to support logging of method entry and exit, or other cross-cutting concerns such as managing transactions or enforcing security access constraints. However, the same mechanism can go much further, controlling when method invocations occur, in much the same way that the lazy thunk described above operates.

Listing 5 shows the method advice for the ComponentMessagesSource service.

Listing 5: TapestryModule.java

    @Match("ComponentMessagesSource")
    public static void adviseLazy(LazyAdvisor advisor, MethodAdviceReceiver receiver)
    {
        advisor.addLazyMethodInvocationAdvice(receiver);
    }

This method is used to advise a specific service, identified by the service's unique id, here "ComponentMessagesSource". An advisor method may advise many different services; we could use glob names or regular expressions to match a wider range of services. An advisor method receives a MethodAdviceReceiver as a parameter; additional parameters are injected services. The intent of module classes is to contain a minimal amount of code, so it makes sense to move the real work into a service, especially because it is so easy to inject services directly into the advisor method.

The LazyAdvisor service, built into Tapestry, does most of the work:

Listing 6: LazyAdvisorImpl.java

public class LazyAdvisorImpl implements LazyAdvisor
{
    private final ThunkCreator thunkCreator;

    public LazyAdvisorImpl(ThunkCreator thunkCreator)
    {
        this.thunkCreator = thunkCreator;
    }

    public void addLazyMethodInvocationAdvice(MethodAdviceReceiver methodAdviceReceiver)
    {
        for (Method m : methodAdviceReceiver.getInterface().getMethods())
        {
            if (filter(m))
                addAdvice(m, methodAdviceReceiver);
        }
    }

    private void addAdvice(Method method, MethodAdviceReceiver receiver)
    {
        final Class thunkType = method.getReturnType();

        final String description = String.format("<%s Thunk for %s>",
                                                 thunkType.getName(),
                                                 InternalUtils.asString(method));

        MethodAdvice advice = new MethodAdvice()
        {
            /**
             * When the method is invoked, we don't immediately proceed. Instead, we return a thunk instance
             * that defers its behavior to the lazily invoked invocation.
             */
            public void advise(final Invocation invocation)
            {
                ObjectCreator deferred = new ObjectCreator()
                {
                    public Object createObject()
                    {
                        invocation.proceed();

                        return invocation.getResult();
                    }
                };

                ObjectCreator cachingObjectCreator = new CachingObjectCreator(deferred);

                Object thunk = thunkCreator.createThunk(thunkType, cachingObjectCreator, description);

                invocation.overrideResult(thunk);
            }
        };

        receiver.adviseMethod(method, advice);
    }

    private boolean filter(Method method)
    {
        if (method.getAnnotation(NotLazy.class) != null) return false;

        if (!method.getReturnType().isInterface()) return false;

        for (Class extype : method.getExceptionTypes())
        {
            if (!RuntimeException.class.isAssignableFrom(extype)) return false;
        }

        return true;
    }
}

The core of the LazyAdvisor service is in the addAdvice() method. A MethodAdvice inner class is defined; the MethodAdvice interface has only a single method, advise(). The advise() method will be passed an Invocation that represents the method being invoked. The Invocation captures parameters passed in as well as the return value or any checked exceptions that are thrown. Invoking the proceed() method continues on to the original method of the service[3].

At this point, the thunk encapsulates the original method invocation; we even have an object for that: the Invocation instance originally passed to the advise() method. Invoking any method on the thunk will cause the ObjectCreator.createObject() method to be triggered: this is where we finally invoke proceed() and return the value for the lazily invoked method.
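If you want to see the whole flow run without a Tapestry container, the following sketch imitates it with plain JDK proxies. Everything in it is an assumption made for illustration: Messages and MessagesSource are simplified stand-ins, lazily(), once() and thunk() are hypothetical helpers, and a JDK proxy plays the role that Tapestry's method advice plays in Listing 6.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.function.Supplier;

class LazyAdviceDemo {
    interface Messages { String get(String key); }                        // simplified stand-in
    interface MessagesSource { Messages getMessages(String component); }  // simplified stand-in

    static int lookups = 0;

    /** Wraps a service so that methods returning an interface hand back a lazy thunk. */
    static <S> S lazily(Class<S> serviceType, S service) {
        InvocationHandler advice = (proxy, method, args) -> {
            Class<?> returnType = method.getReturnType();
            if (!returnType.isInterface())
                return invoke(method, service, args);  // cannot be thunked; call through
            // Capture the invocation, but do not proceed yet.
            return thunk(returnType, once(() -> invoke(method, service, args)));
        };
        return serviceType.cast(Proxy.newProxyInstance(
                serviceType.getClassLoader(), new Class<?>[] { serviceType }, advice));
    }

    /** Proxy that forwards every call to the lazily obtained real return value. */
    static Object thunk(Class<?> type, Supplier<Object> deferred) {
        InvocationHandler handler = (proxy, method, args) -> invoke(method, deferred.get(), args);
        return Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] { type }, handler);
    }

    /** Runs the source at most once and caches its result. */
    static Supplier<Object> once(Supplier<Object> source) {
        return new Supplier<Object>() {
            private Object value;
            private boolean done;

            @Override
            public synchronized Object get() {
                if (!done) { value = source.get(); done = true; }
                return value;
            }
        };
    }

    /** Reflective call that rethrows unchecked exceptions from the target unchanged. */
    static Object invoke(Method method, Object target, Object[] args) {
        try {
            return method.invoke(target, args);
        } catch (InvocationTargetException ex) {
            Throwable cause = ex.getCause();
            if (cause instanceof RuntimeException) throw (RuntimeException) cause;
            if (cause instanceof Error) throw (Error) cause;
            throw new IllegalStateException(cause);
        } catch (IllegalAccessException ex) {
            throw new IllegalStateException(ex);
        }
    }

    public static void main(String[] args) {
        MessagesSource real = component -> {
            lookups++;  // stands in for the expensive properties-file search
            return key -> component + "." + key;
        };
        MessagesSource advised = lazily(MessagesSource.class, real);

        Messages messages = advised.getMessages("Login");  // returns a thunk right away
        System.out.println("lookups after getMessages: " + lookups);
        System.out.println(messages.get("title"));         // first use triggers the real call
        System.out.println(messages.get("submit"));        // cached result is reused
        System.out.println("lookups after use: " + lookups);
    }
}
```

The sketch only handles unchecked exceptions cleanly, which lines up with the filter() method in Listing 6 skipping methods that declare checked exceptions: once the call is deferred, the caller of the original method is no longer around to catch them.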

Other uses for Thunks

In essence, this thunk approach gives you the ability to control the context in which a method is executed: is it executed right now, or only when needed? It is only a little jump from that to executing the method in a background thread. In fact, Tapestry includes a ParallelExecutor service that can be used for just that.
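To give a feel for that "little jump", here is a rough sketch that starts the work on another thread immediately and returns a thunk that blocks only when a method is actually called on it. It uses the standard java.util.concurrent classes rather than Tapestry's service, whose API I am not reproducing here; Report and inBackground() are hypothetical names.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BackgroundThunkDemo {
    /** Hypothetical result type; it must be an interface so it can be proxied. */
    interface Report {
        String summary();
    }

    /** Submits the work right away and returns a thunk standing in for its result. */
    static <T> T inBackground(ExecutorService executor, Class<T> type, Callable<T> work) {
        Future<T> future = executor.submit(work);

        InvocationHandler handler = (proxy, method, args) -> {
            T result;
            try {
                result = future.get();  // blocks only if the work is not finished yet
            } catch (ExecutionException ex) {
                throw ex.getCause();    // surface the failure from the background task
            }
            try {
                return method.invoke(result, args);
            } catch (InvocationTargetException ex) {
                throw ex.getCause();
            }
        };

        return type.cast(Proxy.newProxyInstance(
                type.getClassLoader(), new Class<?>[] { type }, handler));
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<Report> work = () -> {
                Thread.sleep(100);  // stands in for slow work
                Report result = () -> "42 rows processed";
                return result;
            };

            Report report = inBackground(executor, Report.class, work);
            System.out.println("thunk returned; the work continues in the background");
            System.out.println(report.summary());  // waits for the result if necessary
        } finally {
            executor.shutdown();
        }
    }
}
```

The difference from the lazy thunk is only in when the work starts: here it begins immediately, and the thunk defers just the waiting.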

Conclusion

Type-safe thunks are a powerful and flexible technique for controlling when (or even if) a method is invoked without sacrificing type safety. Unlike more intrusive techniques that rely on manipulating the bytecode of existing classes, type-safe thunks can be easily and safely introduced into existing code bases. More than that, this exercise opens up many exciting possibilities: these techniques (coding to interfaces, multiple objects with the same interface, delegation) open up a path to a more fluid, more responsive, more elegant approach to coding complex behaviors and interactions ... while reducing the total line count and complexity of your code.

One of the things I am most happy about in Tapestry is the way in which we can build up complex behaviors from simple pieces. Everything stacks together, concisely and with minimum fuss:

  • We can create a thunk around an ObjectCreator, to defer the instantiation of an object
  • We can capture a method invocation and convert that into an ObjectCreator and a lazy thunk
  • We can advise a method without changing the actual implementation, to provide the desired laziness
  • Tapestry can call an advisor method of our module when constructing the ComponentMessagesSource service
  • We can inject services that do the advising right into advisor methods

Footnotes

1 Actually, all functions in Haskell take exactly one parameter which is both mind-blowing and not relevant to the discussion.

2 Services can be based on classes rather than interfaces, but then you lose a lot of these interface-based features, such as lazy proxies.

3 Or, if the method has been advised multiple times, invoking proceed() may invoke the next piece of advice. For example, you may have added advice to a method for logging method entry and exit, and for managing database transactions as well as lazy evaluation.

2 comments:

Michael Buckley said...

Neat stuff -- too bad it is so hard to do in Java.

Where do you see this used in Tapestry? My worry is that very few service calls are functionally pure. Losing control of the order of evaluation is scary. Bertrand Meyer identifies the semi-colon as the essential control flow operator for a reason.

Howard said...

The deferred execution part is used with care in internals of the framework. The related meta-programming is pervasive!