Saturday, September 23, 2006

Type Coercion in Tapestry 5

I just finished a bit of work I'm very proud of ... a fairly comprehensive type coercion framework for Tapestry 5.

Here's the problem: with the way you bind parameters in Tapestry, you are often supplying a value in one format (say, a String) when the type of the parameter (defined by the variable to which the @Parameter annotation is attached) is of another type, say int.

So ... who's responsible for converting that String into an Integer? Tapestry. Get used to that answer, because that's a big theme in Tapestry 5.

At the core of the solution is a simple interface for performing type coercions:

public interface Coercion<S, T>
{
    T coerce(S input);
}

Gussied up inside all that generics goodness is the idea that an object gets passed in, and some operation takes place that returns an object of a different type. Perhaps the input is a String and the output is a Double.
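
To make the String-to-Double case concrete, here is a minimal sketch of one such coercion. The interface is repeated so the snippet compiles on its own; the class name CoercionExamples and the constant STRING_TO_DOUBLE are invented for illustration and are not taken from Tapestry.

```java
// Same shape as the Coercion interface above, repeated so this sketch is self-contained.
interface Coercion<S, T> {
    T coerce(S input);
}

// Hypothetical holder class; the name is an assumption for this example only.
final class CoercionExamples {
    // Parses a String into a Double, ignoring surrounding whitespace.
    // Throws NumberFormatException if the text is not a valid number.
    static final Coercion<String, Double> STRING_TO_DOUBLE =
            input -> Double.valueOf(input.trim());

    private CoercionExamples() {
    }
}
```

Calling CoercionExamples.STRING_TO_DOUBLE.coerce("3.5") hands back a Double; the caller never needs to know how the parsing is done.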

Now, we dress that up with a wrapper that helps Tapestry determine what the Coercion converts from (source/input) and to (target/output):

public class CoercionTuple<S, T>
{
    public CoercionTuple(Class<S> sourceType, Class<T> targetType, Coercion<S, T> coercer)
    {
      . . .
    }

    public Coercion<S, T> getCoercion() . . .

    public Class<S> getSourceType() . . .

    public Class<T> getTargetType() . . .
}

My brief look at Haskell influenced the naming ("tuple") and a lot of the overall design.
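
As a rough illustration of how the elided parts of such a wrapper might be filled in, here is one plausible shape. It is a sketch, not the actual Tapestry source: the field layout, the toString() format, and the TupleDemo helper are all assumptions.

```java
// Same shape as the Coercion interface above, repeated so this sketch is self-contained.
interface Coercion<S, T> {
    T coerce(S input);
}

// Assumed implementation: an immutable holder for source type, target type and coercion.
final class CoercionTuple<S, T> {
    private final Class<S> sourceType;
    private final Class<T> targetType;
    private final Coercion<S, T> coercion;

    CoercionTuple(Class<S> sourceType, Class<T> targetType, Coercion<S, T> coercion) {
        this.sourceType = sourceType;
        this.targetType = targetType;
        this.coercion = coercion;
    }

    Coercion<S, T> getCoercion() {
        return coercion;
    }

    Class<S> getSourceType() {
        return sourceType;
    }

    Class<T> getTargetType() {
        return targetType;
    }

    // Hypothetical debugging aid, e.g. "String --> Long".
    @Override
    public String toString() {
        return sourceType.getSimpleName() + " --> " + targetType.getSimpleName();
    }
}

// Hypothetical helper showing how a tuple might be built.
final class TupleDemo {
    static CoercionTuple<String, Long> stringToLong() {
        // Long.valueOf does not skip whitespace, so trim first.
        return new CoercionTuple<>(String.class, Long.class,
                input -> Long.valueOf(input.trim()));
    }
}
```

Keeping the Class objects next to the coercion matters because the generic parameters are erased at runtime; without them there would be nothing to search on.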

Now we have a service, TypeCoercer, that can perform the conversions:

public interface TypeCoercer
{
    <S, T> T coerce(S input, Class<T> targetType);
}

The TypeCoercer is seeded with a number of common coercion tuples (thanks to Tapestry IoC, you can contribute more if you need to). From these tuples, the service can locate the correct coercion.

Now the neat part is that if there isn't an exact match for a coercion, that's not a problem. The service will search the tuple space and build a new coercion by combining the existing ones.

For example, there's a built-in String to Double tuple, and a built-in Number to Long tuple. The TypeCoercer will see that there's no way to convert String (or any of its superclasses or extended interfaces) directly to an Integer, so it will start searching among the tuples that do apply.

This all happens automatically. Say you pass in a StringBuffer instead of a String; TypeCoercer will construct the compound coercion Object to String, String to Long, Long to Integer.
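
The StringBuffer example can be sketched as a breadth-first walk over the tuples. This is a simplified stand-in, not Tapestry's actual algorithm: it only counts coercion steps and treats any tuple whose source type is assignable from the current type as applicable. The names Tuple, SimpleCoercer, describePath and CompoundDemo are hypothetical, and only the three tuples named above are registered.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;
import java.util.function.Function;

// Hypothetical, loosely typed stand-in for a coercion tuple.
final class Tuple {
    final Class<?> source;
    final Class<?> target;
    final Function<Object, Object> fn;

    Tuple(Class<?> source, Class<?> target, Function<Object, Object> fn) {
        this.source = source;
        this.target = target;
        this.fn = fn;
    }
}

final class SimpleCoercer {
    private final List<Tuple> tuples = new ArrayList<>();

    <S, T> SimpleCoercer add(Class<S> source, Class<T> target, Function<S, T> fn) {
        tuples.add(new Tuple(source, target, in -> fn.apply(source.cast(in))));
        return this;
    }

    // Breadth-first search: the first path that reaches the target has the fewest steps.
    List<Tuple> findPath(Class<?> sourceType, Class<?> targetType) {
        List<Tuple> empty = new ArrayList<>();
        if (targetType.isAssignableFrom(sourceType)) {
            return empty; // already the right type, nothing to do
        }
        Set<Class<?>> seen = new HashSet<>();
        seen.add(sourceType);
        Queue<List<Tuple>> queue = new ArrayDeque<>();
        expand(sourceType, empty, seen, queue);
        while (!queue.isEmpty()) {
            List<Tuple> path = queue.remove();
            Class<?> reached = path.get(path.size() - 1).target;
            if (targetType.isAssignableFrom(reached)) {
                return path;
            }
            expand(reached, path, seen, queue);
        }
        throw new IllegalArgumentException(
                "No coercion from " + sourceType.getName() + " to " + targetType.getName());
    }

    // Queue one longer path for every applicable tuple leading to a type not yet seen.
    private void expand(Class<?> from, List<Tuple> soFar, Set<Class<?>> seen,
                        Queue<List<Tuple>> queue) {
        for (Tuple t : tuples) {
            if (t.source.isAssignableFrom(from) && seen.add(t.target)) {
                List<Tuple> next = new ArrayList<>(soFar);
                next.add(t);
                queue.add(next);
            }
        }
    }

    <T> T coerce(Object input, Class<T> targetType) {
        Object value = input;
        for (Tuple t : findPath(input.getClass(), targetType)) {
            value = t.fn.apply(value);
        }
        return targetType.cast(value);
    }

    // Lists the steps that would be applied, separated by commas.
    String describePath(Class<?> sourceType, Class<?> targetType) {
        StringBuilder sb = new StringBuilder();
        for (Tuple t : findPath(sourceType, targetType)) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(t.source.getSimpleName()).append(" --> ").append(t.target.getSimpleName());
        }
        return sb.toString();
    }
}

final class CompoundDemo {
    static SimpleCoercer newCoercer() {
        return new SimpleCoercer()
                .add(Object.class, String.class, o -> o.toString())
                .add(String.class, Long.class, s -> Long.valueOf(s.trim()))
                .add(Long.class, Integer.class, l -> l.intValue());
    }
}
```

With these three tuples, CompoundDemo.newCoercer().describePath(StringBuffer.class, Integer.class) returns "Object --> String, String --> Long, Long --> Integer", and coercing new StringBuffer("42") to Integer.class yields 42. A real implementation would also need to rank candidate tuples by how far up the inheritance chain they sit, which this sketch ignores.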

Writing this code was very pleasurable. Too often the things I work on are too simple: move datum A to slot B. I get the whole design for such a piece of code all at once, and it's just a scramble to get it coded (and tested, and documented) before that mental image fades. This time I had to work hard (despite the very small amount of code involved) to really understand the problem space and the algorithm to make it all work ... then back it up with a good number of tests.

8 comments:

  1. Anonymous 8:58 PM

    very very good ideas! thanks!

  2. Anonymous 10:02 PM

    Great feature. It might be good to support conversion from/to BigInteger and BigDecimal by default as well. These are often used when dealing with monetary values.

  3. Anonymous 10:30 AM

    Yes.

    Providing BigDecimal support out of the box makes the application more db friendly, since many default JDBC types are BigDecimal.

  4. The only real limiting aspect of this approach is that it is entirely based on type, and not the content. This can bite you on some of the common cases of converting a string. You can end up with multiple tuples that are each applicable (say String to Long and String to Double) and no easy way to determine which is the right one to use.

    A bad situation would be one that chose String to Long to Float rather than String to Double to Float, since the first path would not be able to parse decimal numbers!


    I think the final system will have String to Long, String to Double, String to BigInteger and String to BigDecimal. It will also have Long to Double and vice versa, Long to BigInteger and vice versa, Double to BigDecimal and vice versa, Double to Float and vice versa, and Long to (each of the integer types) and vice versa.

    Basically, we're creating a couple of "clusters" of tuples that can be coerced cheaply.

    In this way, if you are looking for an integer type, the cheapest path will be String to Long to Integer (rather than, say, String to Double to Long to Integer).

    Likewise, if you are looking for String to Float, it will be String to Double to Float (rather than String to Long to Double to Float, or String to BigDecimal to Double to Float).

    What's nice about this discussion is that the implementation does not change, just the base set of coercion tuples.

  5. Great feature to have - it will certainly make life much easier! Out of interest, what does happen if there are two equal paths to achieve the same conversion, as for example in your comment above? From the more real-world mappings in your comment above I can't immediately see any duplicate paths. However, if the system is extensible then someone could always create a duplicate path (although I concede it's unlikely in practice). Picking one at random could lead to unpredictable behaviour.

  6. There's a particular order in which it navigates the tree. It looks for the "shortest path", where length is determined primarily by the number of coercion steps, but it also takes into account how far up the inheritance chain from the source type it has to go to find a coercion.

  7. Howard, have you thought about giving each coercion a flag saying whether it can lose information when coercing from input to output? E.g. String to BigDecimal doesn't lose any, but String to Long can, and Double to Long can too. Then you can choose the coercion path whose lossy step is closest to the end of the path. By that rule the path String -> Double -> Float is better than String -> Long -> Float.

  8. I have thought about adding a "cost" value for the coercion, so that inexact coercions would be more costly than exact coercions, but that doesn't seem to work as well as the current, simple system.

    It just occurred to me to add a describe() method that would return a description of the coercion from A to B. This would allow users with questions (or who have contributed coercions) to verify that the expected coercions are triggered for a particular pair of input and output types.

