Thursday, June 24, 2010

Tapestry 5.2: Improved Query Parameter Support

I just checked in some very nice changes for Tapestry 5.2; you can now easily store data about a page in the URL as query parameters:

  @ActivationRequestParameter
  private String name;

By annotating a page (not a component!) field this way, the field will be mapped to the query parameter "name". When a page render link or component event link for the page is created, the current value of the field will be added as parameter "name". When that link is triggered to form a request, the parameter will be read and the field updated from the query parameter value.

It isn't limited to strings ... it uses the whole ValueEncoder machinery so that you can encode numbers or even Hibernate entities (represented in the URL as their primary key).

Cool stuff, if I do say so myself. Even I'm still learning how to flex the massive amount of meta-programming muscle that Tapestry provides. It turns out that the combination of component method advice with custom events triggered on the page can do some really sophisticated things!

Tuesday, June 08, 2010

Who Wants The Func? Gotta Have That Func!

I've been entranced by the concept of laziness since I first really considered it while teaching myself a bit of Haskell. Laziness is the idea that no computation takes place until it is actually needed ... an idea that is common in the functional programming world and one that works best with immutable data.

Why immutable? This has been covered extensively elsewhere, but the gist is that when you have any kind of mutable data (any field that can ever change its value), you add time as an invisible input to your expressions. Literally, the time that any single expression is evaluated relative to other changes to mutable state will affect the outcome of the expression, often in non-predictable ways. In the mathematical world, a function will always return the same value for the same inputs ... in the fuzzy, dirty world of Object Orientation, a method invocation may return different values at different times based on mutable state. Not necessarily mutable state in the object being invoked, but in some other object, somewhere, that the invoked object depends on.

This is why parallel programming in the OO world seems so hard. It requires locks on mutable data, which brings its own problems, such as deadlocks. It can feel like a tottering house of cards.

But remove mutability from this picture and an entirely different world emerges. Functions do behave as functions; same inputs: same result. Side effects disappear, because there's no mutable state. Evaluation of expressions is no longer linked to time: it can be evaluated in parallel threads, or can be deferred until absolutely needed.

That last bit is the laziness. Laziness is a way to bootstrap your code up to a simpler, clearer expression of your algorithms ... once you embrace laziness, you can see that a good amount of the code you write (using mutable data especially) is a case of premature optimization.

Back to Tapestry; as far back as Tapestry 4.0 (where HiveMind and the use of Inversion of Control and Dependency Injection where introduced), Tapestry's internal code has had many functional characteristics. The base unit of work in the Tapestry IoC container is an interface, not a function ... but often those interfaces have a single method. That makes the interfaces look a lot like a function, ready for the kind of composition possible in a functional programming language. Sure, it's a bit clumsy compared to a real functional programming language ... but the power is still there.

Tapestry 5 uses these features to handle a lot of Aspect Oriented Programming concepts; for instance, services are lazily instantiated, and they can be decorated and advised to provide cross-cutting concerns. In fact, Tapestry uses functional composition extensively for all kinds of meta-programming.

Meanwhile outside the realm of Tapestry, my exposure to Clojure has really sold me on the functional approach, and I take to immutable data structures like a warm, comforting blanket. I miss all that when I'm working with ordinary Lists and Sets from Collections API.

Given that Tapestry does a lot of complex things, I started work on a simple functional library. What I've created is not nearly as complex as Functional Java; I think it does less, but does it more cleanly. It's more focused.

The idea is that you'll create a Flow from some source (usually, a Collection). You can then map, filter, reduce, append, concatenate, and iterate the values inside the Flow. Further, the Flows are lazy (as with Haskell and Clojure); all evaluation is deferred until absolutely necessary, and it's thread safe. You can also have infinite flows.

It all starts in a static class F (for Functional) that has the initial factories for Flows. This example uses the F.range() method to create a Flow for a range of integers:

System.out.println(F.range(1, 100).filter(F.gt(10)).map(
  new Mapper() {
   public Integer map(Integer value) {
    return value * 2;
   }
  }).take(3).toList());

When executed, this code prints the following: [22, 24, 26]. That is, it dropped the values less than or equal to 10; it then multiplied each remaining value by 2 and converted the first three to a list.

  • F.range() creates a lazy Flow of integers in the range (from 1 to 99; the upper range is non-inclusive).
  • filter() is a Flow method that keeps only some values, based on a Predicate
  • F.gt() is a static factory method, it creates a Predicate that compares a Number value from the Flow against the provided value
  • map() is a Flow method that is applied to each value in the Flow
  • take() takes a limited number of values from the front of the Flow
  • toList() converts the Flow into a non-modifiable List

Here we mapped from Integer to Integer, but it would have been possible to map to a different type. At each stage, a new (immutable) Flow object is created.

What about laziness? Well if we modify the code a bit:

System.out.println(F.range(1, 100).filter(F.gt(10)).map(
  new Mapper() {
   public Integer map(Integer value) {
    System.out.println("** mapping " + value);
    return value * 2;
   }
  }).take(3).toList());
The new output is:
** mapping 11
** mapping 12
** mapping 13
[22, 24, 26]

... in other words, although we write code that makes it appear that the entire Flow is transformed by the map() call, the reality is that individual values from the original flow are mapped, just once, as needed. The code we write focuses on the flow of transformations, from the input, to the final result: "start with the range, retain the values greather than 10, multiply each by two, keep just the first three".

Does this make a difference? Not with trivial cases like this example. The functional code could be rewritten in standard Java as:

List<Integer> result = new ArrayList<Integer>();
for (int i = 1; i < 100; i++)
{
  if (i >= 10)
  {
    result.add(i * 2);
  }
}

result = result.subList(0, 3);

System.out.println(result);

Yes, this code is shorter, but it does more work (computing many doubles values that are not needed). We could do some extra work to keep a count of result values and end the loop earlier, but that increases the cyclomatic complexity even further. The extra work is not a big deal here, but if the transformations were more expensive (say, re-drawing images in different sizes, or reading data from a database) the work not done unnecessarily would become quite significant.

And is the traditional Java code really shorter? What if we create a reusable factory function:

public static Mapper<Integer,Integer> multiplyBy(final int multiplicand)
{
  return new Mapper<Integer, Integer>() 
  {
    public Integer map(Integer value)
    {
      return value * multiplicand;
    }
  };
}

Then our original Flow expressions becomes:

System.out.println(F.range(1, 100).filter(F.gt(10)).map(multiplyBy(2)).take(3).toList());

... meaning that, once we have a good collection of these Mapper and Predicate factory methods, we can have the efficient, lazy code and it will be more concise and readable.

Anyway, tapestry-func is a work in progress, but it's very promising, and already being used in both Tapestry 5.2, and in some of my clients' code.