Tapestry Training -- From The Source

Let me help you get your team up to speed in Tapestry ... fast. Visit howardlewisship.com for details on training, mentoring and support!

Friday, June 29, 2012

Gradle CoffeeScript Compilation

As I'm working on rebooting JavaScript support in Tapestry, one of my goals is to be able to author Tapestry's JavaScript as CoffeeScript. Tapestry will ultimately have a runtime option to dynamically compile CoffeeScript to JavaScript, but I don't want that as a dependency of the core library; that means I need to have that compilation (transpilation?) occur at build time.

Fortunately, this is the kind of thing Gradle does well! My starting point for this is the Web Resource Optimizer for Java project, which includes a lot of code, often leveraging Rhino (for JavaScript) or JRuby, to perform a number of common web resource processing tasks, including compiling CoffeeScript to JavaScript. WRO4J has an Ant task, but I wanted to build something more idiomatic to Gradle.

Here's what I've come up with so far. This is an external build script, separate from the project's main build script:
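In outline, the script looks something like the following sketch. The class name, paths, wro4j coordinates, and compiler invocation are my assumptions here, not the exact original listing:

```groovy
// coffeeScript.gradle -- a sketch of the external build script described below.
// The wro4j coordinates/version and the compiler invocation are assumptions.
buildscript {
    repositories { mavenCentral() }
    dependencies { classpath "ro.isdc.wro4j:wro4j-extensions:1.4.6" }
}

class CompileCoffeeScript extends DefaultTask {
    def srcDir = "src/main/coffeescript"
    def outputDir = "${project.buildDir}/compiled-coffeescript"

    @InputDirectory
    File getSrcDir() { project.file(srcDir) }

    @OutputDirectory
    File getOutputDir() { project.file(outputDir) }

    @TaskAction
    void doCompile() {
        def output = getOutputDir()

        // Simplest approach for a handful of files: wipe and recompile everything.
        project.delete output
        output.mkdirs()

        project.fileTree(dir: getSrcDir(), include: "**/*.coffee").visit { visit ->
            if (visit.directory) return

            // FileVisitDetails makes it easy to compute the output file from
            // the input file's path relative to the source directory.
            def relative = visit.relativePath.pathString.replaceAll(/\.coffee$/, ".js")
            def outputFile = new File(output, relative)
            outputFile.parentFile.mkdirs()

            // Compile visit.file into outputFile using wro4j's Rhino-based
            // CoffeeScript compiler here (exact wro4j API elided).
        }
    }
}

// Export the task class so build scripts applying this file can use it.
ext.CompileCoffeeScript = CompileCoffeeScript
```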

The task finds all .coffee files in the input directory (ignoring everything else) and generates a corresponding .js file in the output directory.

The @InputDirectory and @OutputDirectory annotations allow Gradle to decide when the task's action is needed: if any file changed in the directories provided by these methods, then the task must be rerun. Gradle doesn't tell us exactly what changed, or create the directories, or anything ... that's up to us.

Since I only expect to have a handful of CoffeeScript files, the easiest thing to do was to simply delete the output directory and recompile all CoffeeScript input files on any change. The @TaskAction annotation directs Gradle to invoke the doCompile() method when inputs (or outputs) have changed since the previous build.

It's enlightening to note that the visit passed to the closure on line 34 is, in fact, a FileVisitDetails, which makes it really easy to, among other things, work out the output file based on the relative path from the source directory to the input file.

One of my stumbling points was setting up the classpath to pull in WRO4J and its dependencies; the buildscript configuration on line 4 is specific to this single build script, which is very obvious once you've worked it out. This is actually excellent for re-use, as it means that minimal changes are needed to the build.gradle that makes use of this build script. Earlier I had, incorrectly, assumed that the main build script had to set up the classpath for any external build scripts.

Also note line 54; the CompileCoffeeScript task must be exported from this build script to the project, so that the project can actually make use of it.

The changes to the project's build.gradle build script are satisfyingly small:

The apply from: brings in the CompileCoffeeScript task. We then use that class to define a new task, using the defaults for srcDir and outputDir.

The last part is really interesting: We are taking the existing processResources task and adding a new input directory to it ... but rather than explicitly talk about the directory, we simply supply the task. Gradle will now know to make the compileCoffeeScript task a dependency of the processResources task, and add the task's output directory (remember that @OutputDirectory annotation?) as another source directory for processResources. This means that Gradle will seamlessly rebuild JavaScript files from CoffeeScript files, and those JavaScript files will then be included in the final WAR or JAR (or on the classpath when running test tasks).
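In Gradle syntax, that wiring might look like the following (a sketch; the script and task names are assumptions):

```groovy
// build.gradle -- sketch of the wiring described above
apply from: "coffeeScript.gradle"

task compileCoffeeScript(type: CompileCoffeeScript)

processResources {
    // Supplying the task itself (not a directory) lets Gradle infer both
    // the task dependency and the extra input directory.
    from compileCoffeeScript
}
```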

Notice the things I didn't have to do: I didn't have to create a plugin, or write any XML, or seed my Gradle extension into a repository, or even come up with a fully qualified name for the extension. I just wrote a file, and referenced it from my main build script. Gradle took care of the rest.

There's room for some improvement here; I suspect I could do a little hacking to change the way compilation errors are reported (right now it is just a big ugly stack trace). I could use a thread pool to do compilation in parallel. I could even back away from the delete-the-output-directory-and-recompile-all approach ... but for the moment, what I have works and is fast enough.

Luke Daly leaked that there will be an experimental CoffeeScript compilation plugin coming in 1.1, so I probably won't waste more cycles on this. For only a couple of hours of research and experimentation, I was able to learn a lot and get something really useful put together; the lessons I've learned mean that the next one of these I do will be even easier!

Getting CrashPlan to work on Mac after JDK 1.7 upgrade

After upgrading my Mac to JDK 1.7, I noticed that CrashPlan stopped working. Given just how much trouble I got into a while back, when I lost a month's worth of pictures and videos of my son, I'm now a belt-and-suspenders kind of backup guy ... the belt may be Time Machine, but the suspenders are CrashPlan.

After working it out with their technical staff, it turns out that CrashPlan is not compatible with JDK 1.7 on the Mac, which is rather unfortunate. And, having installed JDK 1.7, CrashPlan was trying to use that version!

Getting it working is pretty easy, however; most of the information is available in their recipe for starting and stopping the CrashPlan engine. The key part is that the com.crashplan.engine.plist file includes an entry where you can specify the path to the Java executable; with a bit of experimentation to find the right path, it appears to be working.

The only change I made to the file was to change the path; the first entry for the ProgramArguments key:
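For reference, the relevant fragment looks something like this. Treat the file location and the Java path below as assumptions to verify locally; the exact path varies by machine and JDK install:

```xml
<!-- Fragment of com.crashplan.engine.plist (location assumed to be under
     /Library/LaunchDaemons); only the first ProgramArguments entry changes,
     pointing it at a Java 6 VM instead of the default java on the PATH. -->
<key>ProgramArguments</key>
<array>
    <string>/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home/bin/java</string>
    <!-- remaining arguments unchanged -->
</array>
```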

You will, of course, need to sudo to edit the file and to stop and relaunch the CrashPlan engine. Now, having set this up (and, coincidentally, renewed my annual subscription), my computer is happily chugging away, making the backups.

Friday, June 08, 2012

Latency numbers every programmer should know

This is interesting stuff: Jonas Bonér organized some general latency data from Peter Norvig as a Gist, and others expanded on it. What's interesting is how scaling time up by a billion converts a CPU instruction cycle into approximately one heartbeat, and yields a disk seek time of "a semester in university".

This is easier to read on the original Gist page. Sorry about my blog's narrow formatting.

Another view of this data is as an animated presentation.

Wednesday, June 06, 2012

Synchronized Considered Harmful

News flash: concurrency is hard. Any time you have mutable data and multiple threads, you are just asking for abuse, and synchronized is simply not going to be your savior ... but the concurrency facilities of JDK 1.5 just might be.

I was recently contacted by a client who was load testing their Tapestry 5.3.3 application; they were using Tomcat 6.0.32 with 500 worker threads, on a pretty beefy machine: Intel Xeon X7460 @ 2.66GHz, OpenJDK 64-Bit Server VM (14.0-b16, mixed mode). That's a machine with six cores and 16 MB of L2 cache.

For all that power, they were tapping out at 450 requests per second. That's not very good when you have 500 worker threads ... it means that you've purchased memory and processing power just to see all those worker threads block, and you get to see your CPU utilization stay low. When synchronization is done properly, increasing the load on the server should push CPU utilization to 100%, and response time should be close to linear with load (that is to say, all the threads should be equally sharing the available processing resources) until the hard limit is reached.

Fortunately, these people approached me not with a vague performance complaint, but with a detailed listing of thread contention hotspots.

The goal with Tapestry has always been to build the code right initially, and optimize the code later if needed. I've gone through several cycles of this over the past couple of years, optimizing page construction time, or memory usage, or throughput performance (as here). In general, I follow Brian Goetz's advice: write simple, clean, code and let the compiler and Hotspot figure out the rest.

Another piece of advice from Brian is that "uncontested synchronized calls are very cheap". Many of the hotspots located by my client were, in fact, simple synchronized methods that did some lazy initialization. Here's an example:
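A minimal sketch of that kind of synchronized lazy getter (hypothetical names, not Tapestry's actual code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the synchronized lazy-initialization hotspot described above.
// Names are hypothetical; the real code lives inside Tapestry's internals.
public class LazyMessages {
    private List<String> messages;

    public synchronized List<String> getMessages() {
        if (messages == null) {
            // Relatively expensive, and often never needed at all.
            messages = loadMessages();
        }
        return messages;
    }

    private List<String> loadMessages() {
        List<String> result = new ArrayList<>();
        result.add("greeting=Hello");
        return result;
    }
}
```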

In this example, getting the messages can be relatively time consuming and expensive, and is often not necessary at all. That is, in most instances of the class, the getMessages() method is never invoked. There were a bunch of similar examples of optional things that are often not needed ... but can be heavily used in the instances where they are needed.

It turns out that "uncontested" really means uncontested: You better be sure that no two threads are ever hitting synchronized methods of the same instance at the same time. I chatted with Brian at the Hacker Bed & Breakfast about this, and he explained that you can quickly go from "extremely cheap" to "asymptotically expensive" when there's any potential for contention. The synchronized keyword is very limited in one area: when exiting a synchronized block, all threads that are waiting for that lock must be unblocked, but only one of those threads gets to take the lock; all the others see that the lock is taken and go back to the blocked state. That's not just a lot of wasted processing cycles: often the context switch to unblock a thread also involves paging memory off the disk, and that's very, very, expensive.

Enter ReentrantReadWriteLock: this is an alternative that allows any number of readers to share a lock, but only a single writer. When a thread attempts to acquire the write lock, the thread blocks until all reader threads have released the read lock. The cost of managing the ReentrantReadWriteLock's state is somewhat higher than synchronized, but has the huge advantage of letting multiple reader threads operate simultaneously. That means much, much higher throughput.

In practice, this means you must acquire the shared read lock to look at a field, and acquire the write lock in order to change the field.

ReentrantReadWriteLock is smart about only waking the right thread or threads when either the read lock or the write lock is released. You don't see the same thrash you would with synchronized: if a thread is waiting for the write lock, and another thread releases it, ReentrantReadWriteLock will (likely) just unblock the one waiting thread.

Using synchronized is easy; with an explicit ReentrantReadWriteLock there's a lot more code to manage:
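A sketch of the shape the rewritten getter takes (hypothetical names again), with the "lock dance" and double check discussed below:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// The lazy getter rewritten around ReentrantReadWriteLock; a sketch, not
// Tapestry's actual code. Separate methods avoid nested try ... finally.
public class LockedMessages {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

    private List<String> messages;

    public List<String> getMessages() {
        List<String> result = tryReadMessages();
        return result != null ? result : createMessages();
    }

    private List<String> tryReadMessages() {
        lock.readLock().lock();
        try {
            return messages;
        } finally {
            lock.readLock().unlock();
        }
    }

    private List<String> createMessages() {
        lock.writeLock().lock();
        try {
            // Double check: another thread may have grabbed the write lock
            // and initialized the field in the window between our read-lock
            // release and our write-lock acquisition.
            if (messages == null) {
                messages = loadMessages();
            }
            // Downgrade: acquire the read lock before releasing the write
            // lock (ReentrantReadWriteLock permits this).
            lock.readLock().lock();
        } finally {
            lock.writeLock().unlock();
        }
        try {
            return messages;
        } finally {
            lock.readLock().unlock();
        }
    }

    private List<String> loadMessages() {
        List<String> result = new ArrayList<>();
        result.add("greeting=Hello");
        return result;
    }
}
```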

I like to avoid nested try ... finally blocks, so I broke it out into separate methods.

Notice the "lock dance": it is not possible to acquire the write lock if any thread, even the current thread, has the read lock. This opens up a tiny window where some other thread might pop in, grab the write lock and initialize the messages field. That's why it is desirable to double check, once the write lock has been acquired, that the work has not already been done.

Also notice that things aren't quite symmetrical: with ReentrantReadWriteLock it is allowable for the current thread to acquire the read lock before releasing the write lock. This helps to minimize context switches when the write lock is released, though it isn't expressly necessary.

Is the conversion effort worth it? Well, so far, simply by converting synchronized to ReentrantReadWriteLock, adding a couple of additional caches (also using ReentrantReadWriteLock), plus some work optimizing garbage collection by expanding the eden space, we've seen some significant improvements; from 450 req/sec to 2000 req/sec ... and there's still a few minor hotspots to address. I think that's been worth a few hours of work!

Tuesday, June 05, 2012

Things I Learned at Hacker Bed & Breakfast

  • You can spend any amount of money on Scotch, and it keeps getting better
  • A Java instance field that is assigned exactly once via lazy initialization does not have to be synchronized or volatile (as long as you can accept race conditions across threads to assign to the field); this is from Rich Hickey
  • Scotch + Random Internet Images + Speakers == Fun; giving off the cuff talks on silly subjects, with random internet images as your "slides"
  • No one's that interested in Java anymore (at least, not at Hacker B&B)
  • Only ride a zip line backwards if you are prepared to accept the consequences
  • The linkage between honey and fried chicken has not yet penetrated into Research Triangle, despite my best efforts
  • The cost of using the synchronized keyword can go asymptotic with enough threads and contention, making ReentrantReadWriteLock much cheaper in comparison
  • Stu Halloway's house appears to have been designed by the same team that designed the Tardis, and is also bigger on the inside than the outside
  • It is much more efficient to just hand Brian Goetz the $50 buy in, rather than go through the motions of playing poker with him
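The single-assignment lazy initialization mentioned in the second bullet is worth a concrete shape. Here's a sketch (hypothetical names) of the benign-race idiom, the same one java.lang.String uses to cache its hash code; it's safe only because the cached value is immutable and the computation is deterministic:

```java
// Race-tolerant lazy initialization: the field is neither volatile nor
// synchronized. Two threads may both compute and assign, but they assign
// equal values, so any value a thread observes is correct.
public class LazyFib {
    private final int n;
    private Integer cached;  // may be written more than once, always with an equal value

    public LazyFib(int n) { this.n = n; }

    public int value() {
        Integer result = cached;     // read the field exactly once
        if (result == null) {
            result = compute();
            cached = result;         // benign race
        }
        return result;
    }

    private int compute() {
        int a = 0, b = 1;
        for (int i = 0; i < n; i++) {
            int next = a + b;
            a = b;
            b = next;
        }
        return a;
    }
}
```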

Friday, April 20, 2012

Yet More Spock Magic: Mocks

Spock's built-in mock object capabilities are just a dream to use ... unlike other systems I've used, it doesn't get in your way, or force you to think backwards or inside out. Once again, some listings. These are for tests of Tapestry IoC's AspectDecorator service, which is used to create a wrapper interceptor around some other object. The test below shows how a supplied MethodAdvice callback object is invoked by the interceptor, if the advice is associated with the invoked method.

TestNG with EasyMock (Java)

Even this example is a bit streamlined, as some of the mock object capabilities, such as the methods newMock(), replay(), and verify(), are inherited from the TestBase base class.

Spock

Spock's wonderful when: / then: blocks organize the behavior into a stimulus followed by a response; using EasyMock, you have to train the mock objects for the response before introducing the stimulus (the method invocation). Further, with EasyMock there's one API for methods that return a fixed response, a second API for methods that throw an exception, and a third API for methods where the result must be calculated; in Spock the value after the >> operator is either a literal value, or a closure that can do what it likes, such as the one attached to MethodAdvice.advice() that checks for the expected method name, and then proceed()s to the delegate mock object's method.
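As a generic illustration of that style (not the Tapestry test itself; the types and names here are invented, and this sits inside a Spock Specification):

```groovy
interface NameService { String name() }

class Greeter {
    private final NameService service
    Greeter(NameService service) { this.service = service }
    String greet() { "Hello, " + service.name() }
}

def "greeting is built from the service-supplied name"() {
    def service = Mock(NameService)

    when:
    def greeting = new Greeter(service).greet()

    then:
    1 * service.name() >> "World"   // fixed response; a closure works too:
                                    // >> { throw new RuntimeException("boom") }
    greeting == "Hello, World"
}
```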

I think that a reasonable developer, even without a thorough understanding of Spock, would get the gist of what this test does (perhaps with a little side note about the interaction system inside the then: block). On the other hand, I've seen, when training people with TestNG and EasyMock, that it very rarely sinks in immediately, if at all.

Thursday, April 19, 2012

Yet Another Bit of Spock Love

I'm gradually converting a back-log of existing tests to Spock ... and some of them convert so beautifully, it hurts. Here's an example:

Before (Java and TestNG)

After (Spock)

What a difference; the data-driven power of the where: block makes this stuff a bit of a snap, and you can see in one place, at a glance, what's going on. IDEA even lines up all the pipe characters automatically (wow!). It's obvious how the tests execute, and easy to see how to add new tests for new cases. By comparison, the TestNG version looks like a blob of code ... it takes a bit more scrutiny to see exactly what it is testing and how.
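A generic example of the where: block style (invented, not the converted Tapestry test):

```groovy
// Inside a Spock Specification; each table row becomes its own test case.
def "maximum of #a and #b is #c"() {
    expect:
    Math.max(a, b) == c

    where:
    a | b | c
    1 | 3 | 3
    7 | 4 | 7
    0 | 0 | 0
}
```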

In addition, the propertyMissing() trick means that any property (including public static fields) of Calendar is accessible without qualification, making things look even nicer. This is what they mean by writing an "executable specification", rather than just writing code.

I can't say this enough: using any other framework for testing Java or Groovy code would simply not be logical.

Friday, March 09, 2012

Node and Callbacks

One of the fears people have with Node is the callback model. Node operates as a single thread: you must never do any work, especially any I/O, that blocks, because with only a single thread of execution, any blocking call stalls the entire process.

Instead, everything is organized around callbacks: you ask an API to do some work, and it invokes a callback function you provide when the work completes, at some later time. There are some significant tradeoffs here ... on the one hand, the traditional Java Servlet API approach involves multiple threads and mutable state in those threads, and often those threads are in a blocked state while I/O (typically, communicating with a database) is in progress. However, multiple threads and mutable data means locks, deadlocks, and all the other unwanted complexity that comes with it.

By contrast, Node is a single thread, and as long as you play by the rules, all the complexity of dealing with mutable data goes away. You don't, for example, save data to your database, wait for it to complete, then return a status message over the wire: you save data to your database, passing a callback. Some time later, when the data is actually saved, your callback is invoked, at which point you can return your status message. It's certainly a trade-off: some of the local code is more complicated and a bit harder to grasp, but the overall architecture can be lightning fast, stable, and scalable ... as long as everyone plays by the rules.

Still, the callback approach makes people nervous, because deeply nested callbacks can be hard to follow. I've seen this when teaching Ajax as part of my Tapestry Workshop.

I'm just getting started with Node, but I'm building an application that is very client-centered; the Node server mostly exposes a stateless, RESTful API. In that model, the Node server doesn't do anything too complicated that requires nested callbacks, and that's nice. You basically figure out a single operation based on the URL and query parameters, execute some logic, and have the callback send a response.

There's still a few places where you might need an extra level of callbacks. For example, I have a (temporary) API for creating a bunch of test data, at the URL /api/create-test-data. I want to create 100 new Quiz objects in the database, then once they are all created, return a list of all the Quiz objects in the database. Here's the code:

It should be pretty easy to pick out the logic for creating test data at the end. This is normal Node JavaScript, but if it looks a little odd, that's because it's actually decompiled CoffeeScript. For me, the first rule of coding Node is: always code in CoffeeScript! In its original form, the nesting of the callbacks is a bit more palatable:

What you have there is a count, remaining, and a single callback that is invoked for each Quiz object that is saved. When that count hits zero (we only expect each callback to be invoked once), it is safe to query the database and, in the callback from that query, send a final response. Notice the slightly odd structure, where we tend to define the final step (doing the final query and sending the response) first, then layer on top of that the code that does the work of adding Quiz objects, with the callback that figures out when all the objects have been created.

The CoffeeScript makes this a bit easier to follow, but between the ordering of the code, and the three levels of callbacks, it is far from perfect, so I thought I'd come up with a simple solution for managing things more sensibly. Note that I'm 100% certain that this issue has been tackled by any number of developers previously ... I'm using the excuse of getting comfortable with Node and CoffeeScript as an excuse to embrace some Not Invented Here syndrome. Here's my first pass:

The Flow object is a kind of factory for callback wrappers; you pass it a callback and it returns a new callback that you can pass to the other APIs. The join callback is invoked only once every callback added this way has been invoked. In other words, the added callbacks are invoked in parallel (well, at least, in no particular order), and the join callback is invoked only after all of them have completed.
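The pattern itself is language-neutral. Here's a rough Java sketch of the Flow idea (the original was CoffeeScript; the names below are approximations, and this single-threaded sketch ignores Node's event-loop details):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the Flow pattern: add() wraps callbacks and counts them; once
// every wrapped callback has fired, the join callbacks run.
public class Flow {
    private final AtomicInteger outstanding = new AtomicInteger();
    private final List<Runnable> joinCallbacks = new ArrayList<>();

    // Wrap a callback; the returned Runnable is what you hand to the async API.
    public Runnable add(Runnable callback) {
        outstanding.incrementAndGet();
        return () -> {
            callback.run();
            if (outstanding.decrementAndGet() == 0) {
                joinCallbacks.forEach(Runnable::run);
            }
        };
    }

    // Register a callback to run after all added callbacks have been invoked.
    public Flow join(Runnable callback) {
        joinCallbacks.add(callback);
        return this;
    }
}
```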

In practice, this simplifies the code quite a bit:

So instead of quiz.save (err) -> ... it becomes quiz.save flow.add (err) -> ..., or in straight JavaScript: quiz.save(flow.add(function(err) { ... })).

So things are fun; I'm actually enjoying Node and CoffeeScript at least as much as I enjoy Clojure, which is nice, because it's been years (if ever) since I've enjoyed the actual coding in Java (though I've liked the results of my coding, of course).

Monday, February 27, 2012

Plastic: Advanced Example

Plastic is Tapestry's built-in Aspect Oriented Programming library, which primarily operates at the byte code level, but shields you from most byte code level thinking: normally, your code is implemented in terms of having method invocations or field reads and writes passed to callback objects that act as delegates or filters.

Sometimes, though, you need to get a little more low-level and generate the implementation of a method more directly. Plastic includes a fluent interface for this as well: InstructionBuilder.

This is an example from Tapestry's Inversion of Control (IoC) container code; the proxy instance is what's exposed to other services, and it encapsulates two particular concerns: first, the late instantiation of the actual service implementation, and second, the ability to serialize the proxy object (even though the services and other objects are decidedly not serializable).

In terms of serialization, what actually gets serialized is a ServiceProxyToken object; when a ServiceProxyToken is later de-serialized, it can refer back to the equivalent proxy object in the new JVM and IoC Service Registry. The trick is to use the magic writeReplace() method so that when the proxy is serialized, the token is written instead. Here's the code:
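Setting the Plastic-generated code aside, the serialization trick itself is plain java.io and easy to demonstrate standalone: writeReplace() substitutes the token when the proxy is serialized, and the token's readResolve() finds the live proxy again on deserialization. A self-contained sketch, with hypothetical names and a toy registry, not Tapestry's actual code:

```java
import java.io.*;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration of the writeReplace()/readResolve() pairing described above.
public class SerializationProxyDemo {

    // Toy "service registry" keyed by service id; in Tapestry this role is
    // played by the IoC Service Registry in the new JVM.
    static final Map<String, ServiceProxy> REGISTRY = new ConcurrentHashMap<>();

    // The proxy itself carries no serializable state; it writes a token instead.
    public static class ServiceProxy implements Serializable {
        transient String serviceId;

        ServiceProxy(String serviceId) {
            this.serviceId = serviceId;
            REGISTRY.put(serviceId, this);
        }

        // Called by ObjectOutputStream: serialize the token, not the proxy.
        private Object writeReplace() {
            return new ServiceProxyToken(serviceId);
        }
    }

    // The token carries just enough state to find the proxy again.
    public static class ServiceProxyToken implements Serializable {
        final String serviceId;

        ServiceProxyToken(String serviceId) { this.serviceId = serviceId; }

        // Called by ObjectInputStream: resolve back to the live proxy.
        private Object readResolve() {
            return REGISTRY.get(serviceId);
        }
    }

    public static ServiceProxy roundTrip(ServiceProxy proxy) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(proxy);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (ServiceProxy) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```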

To kick things off, we use the PlasticProxyFactory service to create a proxy that implements the service's interface.

The callback passed to createProxy() is passed the PlasticClass object. This is initially an implementation of the service interface where each interface method does nothing.

The basic setup includes making the proxy implement Serializable and creating and injecting values into new fields for the other data that's needed.

Next, a method called delegate() is created; it is responsible for lazily creating the real service when first needed. This is actually encapsulated inside an instance of ObjectCreator; the delegate() method simply invokes the create() method and casts the result to the service interface.

The methods on InstructionBuilder have a very close correspondence to JVM byte codes. So, for example, loading an instance field involves ensuring that the object containing the field is on the stack (via loadThis()), then consuming the this value and replacing it with the instance field value on the stack, which requires knowing the class name, field name, and field type of the field to be loaded. Fortunately, the PlasticField knows all this information, which streamlines the code.

Once the ObjectCreator is on the stack, a method on it can be invoked; at the byte code level, this requires the class name for the class containing the method, the return type of the method, and the name of the method (and, for methods with parameters, the parameter types). The result of that is the service implementation instance, which is cast to the service interface type and returned.

Now that the delegate() method is in place, it's time to make each method invocation on the proxy invoke delegate() and then re-invoke the method on the late-instantiated service implementation. Because this kind of delegation is so common, it's supported by the delegateTo() method.

introduceMethod() can access an existing method or create a new one; for the writeReplace() method, the introduceMethod() call creates a new, empty method. The call to changeImplementation() is used to replace the default empty method implementation with a new one; again, loading an injected field value, but then simply returning it.

Finally, because I feel strongly about including a useful toString() method in virtually all objects, this is also made easy in Plastic.

Once the class has been defined, it's just a matter of invoking the newInstance() method on the ClassInstantiator object to instantiate a new instance of the proxy class. Behind the scenes, Plastic has created a constructor to set the injected field values, but another of the nice parts of the Plastic API is that you don't have to manage that: ClassInstantiator does the work.

I'm pretty proud of the Plastic APIs in general; I think they strike a good balance between making common operations simple and concise, but still providing you with an escape-valve to more powerful (or more efficient) mechanisms, such as the InstructionBuilder examples above. Of course, the deeply-nested callback approach can be initially daunting, but that's mostly a matter of syntax, which may be addressed in JDK 8 with the addition of proper closures to the Java language.

I strongly feel that Plastic is a general purpose tool, that goes beyond inversion of control and the other manipulations that are specific to Tapestry ... and Plastic was designed specifically to be reused outside of Tapestry. It seems like it could be used for anything from implementing simple languages and DSLs, to providing all kinds of middleware code in new domains ... I have a fuzzy idea involving JMS and JSON with a lot of wiring and dispatch that could be handled using Plastic. I'd love to hear other people's ideas!

Thursday, February 23, 2012

Tests: Auto Launch Application, or Expect Running?

Does your test suite launch your application or expect it to be running already? This question came up while working on a client project; I launched the Selenium-based test suite and everything failed ... no requests got processed and I spent some time tracking down why the test suite was failing to launch the application.

I chatted about this with my client ... and found out that the Selenium test suite expects the application to already be running. It starts up Selenium Server, sure, but the application-under-test should already be running.

And suddenly, I realized I had a Big Blind Spot, dating back to when I first started using Selenium. For years, my testing pattern was that the test suite would start up the application, run the Selenium tests, then shut it down ... and this makes perfect sense for testing the Tapestry framework, where the test suite starts and stops a number of different mini-applications (and has to run headless on a continuous integration server).

But an application is different, and there are a lot of advantages to testing against the running application ... including being able to quickly and easily reproduce any failures in the running application. Also, without the time to start and stop the application, the test suite runs a bit faster.

Certainly, the command line build needs to be able to start and stop the application for headless and hands-off testing. However, the test suite as run from inside your IDE ... well, maybe not.

So ... how do you handle this on your projects?

Friday, January 27, 2012

LinkedIn Etiquette

I've used LinkedIn for many years now, long before I joined Facebook ... I liked the concept of never losing contact information for business contacts and technologists. It just seemed like a good idea (though I do sometimes wonder if LinkedIn has any particular purpose).

I tend to only connect with people I've met in person, or at least talked to on the phone.

One thing that drives me crazy about LinkedIn is that when requesting a connection, you aren't forced to customize the greeting message from the default "I'd like to add you to my professional network."

As far as I'm concerned, this default greeting is like no greeting at all, and it's a sign that you are just trolling for contacts. Just like you should always write a cover letter for a resume, you should always customize this greeting message.

Your chances of successfully making a connection go up significantly if you do customize the message ... and with me, if you don't, your chances drop to near zero.

Thursday, January 26, 2012

Tapestry Advantages

A summary of a discussion about the advantages of Tapestry over Struts:

  • Exceptional exception reporting
  • Significantly less code
  • Live class reloading
  • Sensible defaults, especially for SEO-friendly URLs
  • Great community
  • Flexibility and customizability

Interestingly, the quality of Tapestry's documentation was mentioned ... favorably! Between the revised home page, and Tapestry JumpStart (and Igor's coming book), I think we're headed in the right direction in terms of documentation going from a liability to an asset.

Tuesday, January 24, 2012

Tapestry 5.4: Focus on JavaScript

Tapestry 5.3.1 is out in the wild ... and if Tapestry is to stay relevant, Tapestry 5.4 is going to need to be something quite (r)evolutionary.

There was some confusion on the Tapestry developer mailing list in advance of this blog post; I'd alluded that it was coming, and some objected to such pronouncements coming out fully formed, without discussion. In reality, this is just a distillation of ideas, a starting point, and not a complete, finalized solution. If it's more detailed than some discussions of Tapestry's evolution in the past, that just means that the mailing list discussion and eventual implementation will be that much better informed.

In posts and other conversations, I've alluded to my vision for Tapestry 5.4. As always, the point of Tapestry is to allow developers to code less, deliver more, and that has been the focus of Tapestry on the server side: everything drives that point: terseness of code and templates, live class reloading, and excellent feedback are critical factors there. Much of what went into Tapestry 5.3 strengthened those points ... enhancements to Tapestry's meta-programming capabilities, improvements to the IoC container, and reducing Tapestry's memory footprint in a number of ways. I have one client reporting a 30% reduction in memory utilization, and another reporting a 30 - 40% improvement in execution speed.

Interestingly, I think that for Tapestry to truly stay relevant, it needs to shift much, much, more of the emphasis to the client side. For some time, Tapestry has been walking a fine line with regards to the critical question of where does the application execute? Pre-Ajax, that was an easy question: the application runs on the server, with at most minor JavaScript tricks and validations on the client. As the use of Ajax has matured, and customer expectations for application behavior in the browser have expanded, it is no longer acceptable to say that Tapestry is page based, with limited Ajax enhancements. Increasingly, application flow and business logic need to execute in the browser, and the server-side's role is to orchestrate and facilitate the client-side application, as well as to act as a source and sink of data ultimately stored in a database.

As Tapestry's server-side has matured, the client side has not kept sufficient pace. Tapestry does include some excellent features, such as how it allows the server-side to drive client-side JavaScript in a modular and efficient way. However, that is increasingly insufficient ... and the tension caused by give-and-take between client-side and server-side logic has grown with each release.

Nowhere is this more evident than in how Tapestry addresses HTML forms. This has always been a tricky issue in Tapestry, because the dynamic rendering that can occur needs to be matched by dynamic form submission processing. In Tapestry, the approach is to serialize, into the form, instructions that will be used when the form is submitted (see the store() method of the FormSupport API). These instructions are used during the processing of the form submission request to re-configure the necessary components, and direct them to read their query parameters, perform validations, and push updated values back into server-side object properties. If you've ever wondered what the t:formdata hidden input field inside every Tapestry form is about ... well, now you know: it's a serialized stream of Java objects, GZipped and MIME encoded.
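That encoding pipeline is straightforward to sketch with the JDK alone. This is an approximation of the general technique (serialize, GZip, Base64/MIME encode), not Tapestry's exact stream format:

```java
import java.io.*;
import java.util.Base64;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch of a t:formdata-style payload: a serialized stream of Java objects,
// GZipped and MIME (Base64) encoded. Names and details are approximations.
public class FormDataCodec {

    public static String encode(Serializable instructions) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out =
                    new ObjectOutputStream(new GZIPOutputStream(bytes))) {
                out.writeObject(instructions);
            }
            return Base64.getMimeEncoder().encodeToString(bytes.toByteArray());
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static Object decode(String encoded) {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new ByteArrayInputStream(
                        Base64.getMimeDecoder().decode(encoded))))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }
}
```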

However, relative to many other things in Tapestry, this is a bit clumsy and limited. You start to notice this when you see the tepid response to questions on the mailing list such as "how to do cross-field validation?" Doing more complicated things, such as highly dynamic form layouts, or forms with even marginal relationships between fields, can be problematic (though still generally possible) ... but it requires a bit too much internal knowledge of Tapestry, and the in-browser results feel a bit kludgy, a bit clumsy. Tapestry starts to feel like it is getting in the way, and that's never acceptable.

Simply put, Tapestry's abstractions over forms and fields are both leaky and insufficient. Tapestry is trying to do too much, and simply can't keep up with modern, reasonable demands for responsiveness and usability in the client. We've become used to pages rebuilding and reformatting themselves even while we're typing. For Tapestry to understand how to process the form submission, it needs a model of what the form looks like on the client side, and it simply doesn't have one. There isn't an effective way to maintain such a model without significantly restricting what is possible on the client side, or requiring much more data to be passed in requests, or stored server-side in the session.

The primary issue here is the overall form submission cycle, especially combined with Tapestry's need to serialize commands into the form (as the hidden t:formdata field). Once you add Ajax to the mix, where new fields and rules are created dynamically (on the server side) and installed into the client-side DOM ... well, it gets harder and harder to manage. Add a few more complications (such as a mix of transient and persistent Hibernate entities, or the dynamic creation of sub-entities and relationships) and it can be a brain burner getting Tapestry to do the right thing when the form is submitted: you need to understand exactly how Tapestry processes that t:formdata information, and how to add your own callbacks into the callback stream to accomplish exactly the right thing at exactly the right time. Again, this is not the Tapestry way, where things are expected to just work.

Further, there is some doubt about even the desirability of the overall model. In many cases, it makes sense to batch together a series of changes to individual properties ... but in many more, it is just as desirable for individual changes to filter back to the server (and the database) as the user navigates. Form-submit-and-re-render is a green screen style of user interaction. Direct interaction is the expectation now, and that's something Tapestry should embrace.

What's the solution, then? Well, it's still very much a moving target. The goal is to make creating client-side JavaScript libraries easier, to make it easier to integrate with libraries such as jQuery (and its vast library of extensions), make things simpler and more efficient on the client side, and not sacrifice the features that make Tapestry fun and productive in the first place.

Overall Vision

The overall vision breaks down into a number of steps:

  • Reduce or remove outside dependencies
  • Modularize JavaScript
  • Change page initializations to use modules
  • Embrace client-side controller logic

Of course, each of these steps depends on the others, so there is no ideal order in which to discuss them.

Reducing and removing outside dependencies

Tapestry's client-side strength has always been lots of "out of the box" functionality: client-side validation, Zones and other Ajax-oriented behaviors, and a well-integrated system for performing page-level initializations.

However, this strength is also a weakness, since that out of the box behavior is too tightly tied to the Prototype and Scriptaculous libraries ... reasonable choices in 2006, but out-of-step with the industry today. Not just in terms of the momentum behind jQuery, but also in terms of very different approaches, such as Sencha/ExtJS and others.

It was a conscious decision in 2006 not to attempt an abstraction layer before I understood all the abstractions; the intervening years have given me time to understand them. Now the big problems are momentum and backwards compatibility.

By removing unnecessary behaviors, such as animations, we can reduce Tapestry's client-side needs. Tapestry needs to be able to attach event handlers to elements. It needs to be able to easily locate elements via unique ids, or via CSS selectors. It needs to be able to run Ajax requests and handle the responses, including dynamic updates to elements.

All of these things are reasonable to abstract, and by making it even easier to execute JavaScript as part of a page render or page update (something already present in Tapestry 5.3), currently built-in features (such as animations) can be delegated to the application, which is likely a better choice in any case.

Modularizing JavaScript

Tapestry has always been careful to avoid client-side namespace pollution. Through release 5.2, most of Tapestry's JavaScript was encapsulated in the Tapestry object. In Tapestry 5.3, a second object, T5, was introduced with the intention that it gradually replace the original Tapestry object (though this post represents a change in direction).

However, that's not enough. Too often, users have created in-line JavaScript, or JavaScript libraries that define "bare" variables and functions (ultimately added to the browser's window object). This causes problems, including collisions between components (that provide competing definitions of objects and functions), or behavior that varies depending on whether the JavaScript was added to the page as part of a full-page render, or via an Ajax partial page render.

The right approach is to encourage and embrace some form of JavaScript module architecture, where there are no explicit global variables or functions, and all JavaScript is evaluated inside a function, allowing for private variables and functions.
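The module pattern described above can be sketched in a few lines of plain JavaScript; this is a generic illustration, not Tapestry code:

```javascript
// A function-as-module: state is private to the function scope,
// and only the explicitly returned object is visible outside.
var counter = (function () {
  var count = 0; // private: never attached to the global object

  function increment() {
    return ++count;
  }

  return { increment: increment }; // the module's public surface
})();

console.log(counter.increment()); // → 1
console.log(counter.increment()); // → 2
console.log(typeof count);        // → "undefined" (no leakage outside the module)
```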

Currently, I'm thinking in terms of RequireJS as the way to organize the JavaScript. Tapestry would facilitate organizing its own code into modules, as well as application-specific (or even page-specific) JavaScript modules. This would mean that de-referencing the T5 object would no longer occur (outside of some kind of temporary compatibility mode).

For example, clicking a button inside some container element might, under 5.3, publish an event using Tapestry's client-side publish/subscribe system. In the following example, the click events bubble up from the buttons (with the button CSS class name) to a container element, and are then published under the topic name button-clicked.
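A minimal, self-contained sketch of that 5.3-era interaction follows. The T5.pubsub stub below is an illustrative stand-in for Tapestry's real client library, and the DOM wiring is shown as comments since it assumes Prototype running in a browser:

```javascript
// Stub of Tapestry 5.3's client-side publish/subscribe (illustration only).
var T5 = {
  pubsub: {
    _topics: {},
    subscribe: function (topic, listener) {
      (this._topics[topic] = this._topics[topic] || []).push(listener);
    },
    publish: function (topic, message) {
      (this._topics[topic] || []).forEach(function (listener) {
        listener(message);
      });
    }
  }
};

// In the browser (Prototype-style), clicks on ".button" elements bubble
// up to the container element and are republished as a topic:
//
//   container.observe("click", function (event) {
//     var button = event.findElement(".button");
//     if (button) T5.pubsub.publish("button-clicked", button.id);
//   });

// Any other module can react without knowing about the DOM wiring:
var clicked = [];
T5.pubsub.subscribe("button-clicked", function (id) {
  clicked.push(id);
});

T5.pubsub.publish("button-clicked", "save-button");
console.log(clicked); // → [ 'save-button' ]
```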

Consider this an abbreviated example, as it doesn't explain where the element variable is defined or initialized; the important part is the interaction with Tapestry's client-side library: the reference to the T5.pubsub.publish function.

Under 5.4, using the RequireJS require function, this might be coded instead as:
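Here is a sketch of the shape such code might take. The tiny require shim below exists only to make the example self-contained; in a real page, RequireJS supplies require and loads the t5/pubsub module from the server:

```javascript
// Tiny AMD-flavored shim, standing in for RequireJS (illustration only):
var modules = {
  "t5/pubsub": {
    publish: function (topic, message) {
      return topic + ": " + message;
    }
  }
};
function require(deps, callback) {
  callback.apply(null, deps.map(function (name) { return modules[name]; }));
}

// The 5.4-style code: dependencies are declared by name, and the module
// is passed into the function -- no global T5 object is dereferenced.
var published;
require(["t5/pubsub"], function (pubsub) {
  // container.observe("click", function (event) { ... });  // DOM wiring elided
  published = pubsub.publish("button-clicked", "save-button");
});

console.log(published); // → "button-clicked: save-button"
```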

Here, the t5/pubsub module will be loaded by RequireJS and passed as a parameter into the function, which is automatically executed. So, this supports JavaScript modularization, and leverages RequireJS's ability to load modules on-the-fly, as needed.

Notice the difference between the two examples: in the first, coding as a module was optional (but recommended), since the necessary publish() function was accessible either way. In the 5.4 example, coding using JavaScript modules is virtually required: the anonymous function passed to require() is effectively a module, but it's only through the use of require() (or RequireJS's define()) that the publish() function can be accessed.

This is both the carrot and the stick; the carrot is how easy it is to declare dependencies and have them passed in to your function-as-a-module. The stick is that (eventually) the only way to access those dependencies is by providing a module and declaring dependencies.

Change page initializations to use modules

Tapestry has a reasonably sophisticated system for allowing components to describe their JavaScript requirements as they render, in the form of the JavaScriptSupport environmental (an environmental is a kind of per-thread/per-request service object). Methods on JavaScriptSupport allow a component to request that a JavaScript library be imported into the page (though this is most commonly accomplished using the Import annotation), and to request that initialization functions be executed.

Part of Tapestry's Ajax support is that in an Ajax request, the JavaScriptSupport methods can still be invoked, but a completely different implementation is responsible for integrating those requests into the overall reply (which in an Ajax request is a JSON object, rather than a simple stream of HTML).

Here's an example component from the TapX library:

The @Import annotation directs that a stack (a set of related JavaScript libraries, defined elsewhere) be imported into the page; alternately, the component could import any number of specific JavaScript files, located either in the web application context folder, or on the classpath.

Inside the afterRender() method, the code constructs a JSONObject of data needed on the client side to perform the operation. The call to addInitializerCall references a function by name: this function must be added to the T5.initializers namespace object. Notice the naming, tapxExpando: a prefix to identify the library, and to prevent collisions with any other application or library that also adds its own functions to the T5.initializers object.

The JavaScript library includes the function that will be invoked:
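A sketch of what such a function might look like; the tapxExpando name follows the library-prefix convention described above, but the spec fields and the DOM work (shown as comments) are invented for illustration:

```javascript
// Illustrative only: the real TapX function's spec fields differ.
var T5 = T5 || { initializers: {} };

// Invoked by Tapestry's initialization machinery with the JSONObject
// that was assembled on the server in afterRender().
T5.initializers.tapxExpando = function (spec) {
  // In the browser, this would locate the component's markup and attach
  // click handlers to expand/collapse the content, e.g. (Prototype-style):
  //   var container = $(spec.clientId);
  //   container.down(".tx-title").observe("click", toggle);
  // Here we just echo back the configuration the server sent down.
  return { clientId: spec.clientId, expanded: !!spec.expanded };
};

var config = T5.initializers.tapxExpando({ clientId: "expando-1" });
console.log(config.clientId); // → "expando-1"
```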

Under 5.4, this would largely be the same except:

  • There will be a specific Java package for each library (or the application) to store library modules.
  • The JavaScriptSupport environmental will have new methods to reference a function, inside a module, to invoke.
  • Stacks will consist not just of individual libraries, but also modules, following the naming and packaging convention.

Embrace client-side controller logic

The changes discussed so far only smooth out a few rough edges; they still position Tapestry code, running on the server, as driving the entire show.

As alluded to earlier, for any sophisticated user interface, the challenge is to coordinate the client-side user interface (in terms of form fields, DOM elements, and query parameters) with the server-side components; this is encoded into the hidden t:formdata field. However, it is my opinion that, for any dynamic form, Tapestry is at or near the end of the road for this approach.

Instead, it's time to embrace client-side logic, written in JavaScript, in the browser. Specifically, break away from HTML forms, and embrace a more dynamic structure, one where "submitting" a form always works through an Ajax update ... and what is sent is not a simple set of query parameters and values, but a JSON representation of what was updated, changed, or created.

My specific vision is to integrate Backbone.js (or something quite similar), to move this logic solidly to the client side. This is a fundamental change: one where the client-side is free to change and reconfigure the UI in any way it likes, and is ultimately responsible for packaging up the completed data and sending it to the server.

When you are used to the BeanEditForm component, this might feel like a step backwards, as you end up responsible for writing a bit more code (in JavaScript) to implement the user interface, input validations, and relationships between fields. However, as fun as BeanEditForm is, the declarative approach to validation on the client and the server has proven to be limited and limiting, especially in the face of cross-field relationships. We could attempt to extend the declarative nature, introducing rules or even scripting languages to establish the relationships ... or we could move to a situation that puts the developer back in the driver's seat.

Further, there are some who will be concerned that this is a violation of the DRY principle; however, I subscribe to a different philosophy: that client-side and server-side validation are fundamentally different in any case. This is discussed in an excellent blog post by Ian Bicking.

Certainly there will be components and services to assist with this process, in terms of extracting data into JSON format, and converting JSON data into a set of updates to the server-side objects. There are also a number of security concerns that necessitate careful validation of what comes up from the client in the Ajax request. Further, there will be new bundled libraries to make it easier to build these dynamic user interfaces.

Conclusion

In this vision of Tapestry's future, the server-side framework shifts from being the focus of all behavior to being the facilitator: it paints the broad strokes on the server, but the key interactions end up working exclusively on the client.

I'm sure this view will be controversial: after all, on the surface, what the community really wants is just "jQuery instead of Prototype". However, all of the factors described in the above sections are, I feel, critical to keeping Tapestry relevant by embracing the client-side in the way that the client-side demands.

I think this change in focus is a big deal; I think it is also necessary for Tapestry to stay relevant in the medium to long term. I've heard from many individual developers (not necessarily Tapestry users) that what they really want is "just jQuery and a restful API"; I think Tapestry can be that restful API, but by leveraging many of Tapestry's other strengths, it can be a lot more. Building something right on the metal feels empowering ... until you start missing all the infrastructure that Tapestry provides, including best-of-class exception reporting, on-the-fly JavaScript aggregation and minimization, and (of course) live class reloading during development.

I'm eager to bring Tapestry to the forefront of web application development ... and to deliver it fast! Monitor the Tapestry developer mailing list to see how this all plays out.

Review: Gradle Class with Luke Daley

Last week, Luke Daley arrived in Portland to teach a three-day Gradle class; the folks at Gradleware were nice enough to let me audit the class (so it only cost me a couple of thousand dollars of lost billing revenue to attend). My goals for the class were to gain a deeper understanding of how Gradle works, so that I could write more efficient builds, diagnose problems, and write my own plugins. The class scored very high on all of those counts!

Much of the first day was spent on basics, including a very useful introduction to the Groovy programming language (which is the basis of the Gradle DSL). Even though I have used Groovy pretty extensively for testing purposes over the last couple of years, there were features I've glossed over in the past that turned out to be very useful.

The class is taught largely bottom-up: from the basics of declaring tasks, then actions on tasks, gradually working up towards defining task inputs and outputs. Compiling Java came pretty late, which seems curious since that's the primary job of Gradle, but it makes sense from a bottom-up approach, as SourceSets can then be described correctly.

The class alternates between lecture and discussion, and short focused labs. At the end of the third day, we worked on tuning up our own builds, passing the video projector cable around ... we had a good time making the Tapestry build more efficient and readable, as well as adding new features to it.

I'm feeling very good about my Gradle skills after attending the course; I've already been able to make further improvements to the Tapestry build, and I have plans for more involved changes going forward.

The return on investment for this class is a bit tricky; Gradle simply does so much straight out of the box, and even without a strong understanding of its structure, you can kind of get it to do what you want just by guesswork and Googling. If you've been using Gradle for a while, I give this a cautious recommendation: recouping the three days of invested time may take a while. On the other hand, if you are currently dependent on Ant or Maven ... sign up for the next class and get yourself switched over, today!

Friday, January 13, 2012

Hackergarten in PDX - Friday January 20th


Merlyn Albery-Speyer is organizing a Hackergarten while Luke Daley (creator of Geb, and Gradle committer) is in town to run an in-depth Gradle training. I'll be there, working on Tapestry, or Gradle, or a video game, or something.

Please see Merlyn's blog to RSVP. I look forward to meeting and coding with more PDX peeps!



Thursday, December 29, 2011

Adding "Ajax Throbbers" to Zone updates

A common desire in Tapestry is for Zone updates to automatically include a throbber (or "spinner") displayed while the Ajax update is in process. This is, unfortunately, a space where the built-in Tapestry 5.3 Zone functionality is a bit lacking. Fortunately, it's not too hard to add it in after the fact.

This solution involves a JavaScript library, two CSS stylesheet files (one is IE specific), plus the "throbber" image. Typically, you'll bind all of these things together in your application's Layout component.

First, the JavaScript. We need to intercept links and forms that update a Zone. When such a request starts, we add a <div> to the top of the Zone's client-side element. When the update from the server arrives, the entire content of the Zone's element will be replaced (so we don't have to worry about clearing the <div> explicitly).
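A sketch of the idea follows; buildOverlay() is a hypothetical helper, and the event wiring (shown as comments) assumes Prototype and Tapestry 5.3's client-side constants running in a browser:

```javascript
// Hypothetical helper: the markup inserted at the top of the zone's element.
function buildOverlay() {
  return '<div class="zone-ajax-overlay"></div>';
}

// In the browser, the wiring might look like this (Prototype-style);
// the zone lookup here is illustrative, not Tapestry's actual linkage:
//
//   document.observe("dom:loaded", function () {
//     $$("form").each(function (form) {
//       form.observe(Tapestry.FORM_PROCESS_SUBMIT_EVENT, function () {
//         var zone = $(form.readAttribute("data-update-zone"));
//         if (zone) zone.insert({ top: buildOverlay() });
//       });
//     });
//   });

console.log(buildOverlay()); // → <div class="zone-ajax-overlay"></div>
```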



When a form is submitted with Ajax, to update a Zone, Tapestry fires a client-side event on the Form; the Tapestry.FORM_PROCESS_SUBMIT_EVENT constant provides the event name. The primary handler for this event is the code that actually performs the XMLHttpRequest and sets up handlers for the response; the above code adds a second handler that adds the Ajax overlay.

Likewise, when a link is used to update a Zone, there's a second client-side event; again, the primary handler for the event does the actual Ajax work, but the same logic allows the Zone to be decorated with the overlay.

The overlay consists of a <div> that will visually mark the entire zone's content and consume any mouse clicks during the Ajax update. The CSS associated with the zone-ajax-overlay CSS class sets up a translucent background color and the spinning Ajax throbber.

Next up is the CSS:
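What follows is an illustrative reconstruction of such a stylesheet; the zone-ajax-overlay class name comes from the text, while the colors, timings, and image path are placeholder guesses:

```css
/* Illustrative sketch: covers the zone and consumes clicks during the update.
   Assumes the zone's element establishes a positioning context. */
.zone-ajax-overlay {
  position: absolute;
  top: 0;
  left: 0;
  width: 100%;
  height: 100%;
  background: #ffffff url("throbber.gif") no-repeat center center; /* placeholder image */
  opacity: 0.6;
  /* Keep the overlay invisible briefly, so fast requests never flicker: */
  -moz-animation: overlay-reveal 0.4s;
  -webkit-animation: overlay-reveal 0.4s;
}

@-moz-keyframes overlay-reveal {
  0%   { opacity: 0; }
  75%  { opacity: 0; }
  100% { opacity: 0.6; }
}

@-webkit-keyframes overlay-reveal {
  0%   { opacity: 0; }
  75%  { opacity: 0; }
  100% { opacity: 0.6; }
}
```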



This little bit of CSS is doing quite a bit. Firstly, if the Ajax request is very quick, there will be an annoying flicker; to combat this, we've set up a simple CSS animation that delays the appearance of the overlay momentarily, long enough that fast requests will just see the new content pop into place. There's probably a bit of room here to tweak the exact timing.

Alas, in the current world, we need to do a bit of work to support both Firefox (the -moz prefix) and WebKit (Safari, Chrome, the -webkit prefix). This is really calling out for a SASSy solution.

You'll also see an animated image for the throbber. I used ajaxload.info to create one.

But what about Internet Explorer? It doesn't understand the animation logic, and it does CSS opacity differently from the others. Fortunately, we can segregate those differences in a separate CSS file.



Lastly, we put all this together inside the application's Layout component:



The @Import annotation does the easy imports of the main CSS and JavaScript.
Tapestry 5.3 supports IE conditional stylesheets ... but this requires just a bit of code, as the @Import annotation doesn't support adding a condition (a fairly rare requirement).

Instead, the IE-specific CSS is injected into the page as an Asset object; this can be combined with StylesheetOptions to form a StylesheetLink, which can be imported into the page.

With this in place, every page will include both CSS stylesheets (one as an IE-only conditional comment) and the necessary client-side logic ... and every Zone update will get this uniform treatment.

There are some limitations here; in Tapestry it's possible for the server side to push updates into multiple Zones. The client side doesn't even know that's happening until it gets the reply, so there's no universal way to add overlays to multiple zones when the request is initiated.

Secondly, in rare cases, a Zone update may only update other Zones, and leave the initiating Zone's content unchanged. In that case, you may find that the Zone's throbber is still in place after the response is handled! I'll leave it as an exercise to the reader on how to deal with that.




Thursday, December 22, 2011

Dissecting a Tapestry Operation Trace

I'm helping out a client who is having a problem using Spock and Tapestry 5.3 together. The Spock/Tapestry integration was created for Tapestry 5.2, and some subtle change in the Tapestry 5.3 IoC container has broken the integration, so running even a simple test results in an exception, with a very big stack trace:



The part at the top (the numbered entries) is the operation trace: Tapestry tracks what it's doing at all times (using a stack stored inside a ThreadLocal) just so it can report this information in case there's an error. That's part of Tapestry's commitment to useful feedback. Operation tracing was enhanced in Tapestry 5.3 to be more verbose and far-reaching.

The operation trace is providing a wealth of information about how Tapestry got to the point where an exception was thrown. This is much more useful than just the huge stack trace (about 400 frames!) since Tapestry, by design, tends to call through the very same methods repeatedly; stack traces are less useful when what counts are the parameters to the methods, rather than the methods themselves.

It takes a while to figure out, but the key operations are:



That operation corresponds to invoking this method:



Translated to English, this code says:

When starting up the Registry (the odd name for the Tapestry IoC container), execute this block of code, that checks to see if early startup of Hibernate is desired and, if so, forces the initialization of Hibernate by invoking the getConfiguration() method (otherwise, the initialization would happen lazily the first time a request needed to use the Hibernate Session).

The @Symbol annotation means that the parameter's value is derived from a Tapestry configuration symbol, which is a flexible, late-binding way to configure services as they are instantiated. In other words, because of the use of a symbol, rather than a constant, the actual value passed in can't be determined until runtime ... which is a good thing; it means a developer can configure the symbol's value locally, but a different default value is in effect for the production application. Sometimes you want early startup, sometimes you don't.

In order to resolve the value of a symbol, Tapestry must instantiate the SymbolSource service; it has its own configuration that depends on other services, including ApplicationDefaults, FactoryDefaults, as well as a few other simple objects that implement the SymbolProvider interface, but are not services.

There's also a hidden dependency here: Starting in Tapestry 5.3, Tapestry will attempt to type-coerce contributed values (in this case, symbol values contributed to ApplicationDefaults or FactoryDefaults) from their actual type, to the expected type. That shows up in operation 14:



Those integers and booleans need to be converted to Strings; Tapestry 5.3 invokes the full machinery of the TypeCoercer service to do this coercion, seen as operation 15.

At operations 21 - 23, Tapestry sees the ThreadLocale service (which stores the active Locale used during processing of the request; something that can vary on a request-by-request basis). The ThreadLocale service uses a special service lifecycle that enforces that the instance is stored as a per-thread singleton, not a per-Registry singleton, and will be discarded at the end of each request.

The ServiceLifecycleSource service is the source for these ServiceLifecycle objects.

At operations 28 - 31, the Spock/Tapestry integration code gets involved. It adds a special service lifecycle just for values that are part of a Spock specification ... and we're finally reaching the problem point!



The Spock/Tapestry integration is using the addInstance() method, which instantiates a class with dependencies; this is operation 30. This is the problem point, but it's not obvious why it causes an eventual exception.

Because of the use of addInstance(), Tapestry must locate and inject the dependencies of the PerIterationServiceLifecycle class, including the IPerIterationManager service (operation 31).

In Tapestry, there is a mechanism to replace services with overrides; this is the ServiceOverride service and its configuration. It's super handy for extending Tapestry in completely unexpected ways.

That brings us to some code, new in Tapestry 5.3, at operation 38:



And that brings us to the actual cause. Notice the @Symbol annotation ... remember way back to operation 7, that required the TypeCoercer (operation 15) ... well, we're not done with that yet, but this production override code has a @Symbol annotation that requires the TypeCoercer ... which is still in the middle of being instantiated.

Yes, this takes a lot of detective work ... this is something of an Achilles' heel of Tapestry's IoC container: since much of the functionality of the container is defined in terms of other functionality of the container, you can get into these hidden dependency cycles when tweaking some of the more fundamental aspects of Tapestry, such as TypeCoercer contributions, or adding new service lifecycles. This is unfortunate, since so much else in Tapestry's web framework and IoC container Just Works™.

In terms of fixing this ... it turns out the Spock/Tapestry integration has some other dependencies on 5.2, making use of internal classes and constructors that no longer exist in 5.3. I'll be forking their code shortly to produce a 5.3-compatible version.

However, my take-away here is that the system works: the emphasis on feedback, and the generation of useful operation traces, makes this detective work possible at all. The alternative would have taken far, far longer ... using the debugger to try to work backwards to what Tapestry was trying to do. It's so much better to have Tapestry simply tell you what you need to know!

Wednesday, November 23, 2011

Mac Tips: Preventing Aperture from Launching

Since I tend to connect and disconnect my Android phone to my Mac pretty often, I got frustrated that it kept launching Aperture every time (it used to do the same with iPhoto, before I switched).

In any case, the solution for this is easy enough; with the phone connected, launch Image Capture, select the phone, and choose "No application" from the drop-down list in the bottom left corner:



This is under Lion; your mileage may vary under earlier versions. Hope it helps!



Tuesday, November 22, 2011

Gradle Training in Portland

It's no mystery that I'm a big Gradle fan ... it's by far the best build tool I've ever seen. Build scripts are concise and readable, and Gradle is the easiest build tool to configure and extend as well.

I fought an uphill battle with the other Tapestry developers to replace Maven with Gradle. Yes, we had an existing Maven build and it worked (most of the time, until it didn't). Now past the switchover, we're really reaping the benefits: faster builds, fewer headaches, and it's much easier to add new sub-modules. Despite the achievements I've made with Gradle, there's a lot about this powerful tool I still need to learn. So I was surprised and pleased when Tim Berglund contacted me to help find a venue in Portland, Oregon for Gradle training.

They're now advertising that the three-day training is coming January 17-19, 2012; details here.

I've seen Tim speak at a number of No Fluff Just Stuff and Uberconf sessions; he's very good. If you are frustrated using Ant or Maven, you need to learn Gradle, and I can't see how getting an intensive brain dump could do anything but save you time and money.

Thursday, October 20, 2011

Some Tapestry Stories

A ways back, I published a call for anyone interested in free Tapestry 5 Laptop stickers (that call is still open!). You get the stickers, I get a story. Here's a few highlights, in no particular order:

Robert B., USA
The SC Medicaid Web Portal enables doctor's offices and hospitals, using the Web, to enter and submit claims for patients enrolled in Medicaid in South Carolina.
Steve E., Singapore
We're producing an on-line matching system for Non Deliverable Forwards (NDFs) in the currency finance market for a brokerage firm. (Or I guess in layman's terms, a gambling system!)
Kejo S., Switzerland
We are working now since almost 3 years with T5 and we created an application platform for ABB with it and really a lot of other projects. I think we wrote already far more than 1 million lines around your framework! Often your framework is fun and sometimes pain, but it's always amazing how lean and clean your source is! Overall I think it is the best frontend framework in the java ecosystem right now! Great Work!
Ivan K., Belarus
We just started new project on t5 - online collectible card game.
Michael L., Germany
We started a Tapestry5 project to build our extranet application, that links into the ERP to provide realtime information to stakeholders and supporting internal workflows. We're just at the beginning and implementing more and more stuff. However, we looked at different Web-Frameworks but Tapestry5 simply rocked!
Greg P., Australia
I'm using Tapestry to create liftyourgame.com. A site that allows people to achieve their goals.
Szymon B., Poland
We use Tapestry 5 in our economic information service neurobiz.pl. It gives users access to the information about businesses operating in Poland and registered in National Court Registry.
Dominik Hurnaus, Austria
Working on a large CRM system for an automotive customer.
James S., USA
My team and I are working on a web-based interface for a fuzzy lookup and matching engine we've developed. I've also started messing around Tapestry for a few of my personal projects. I started using T5 a couple months ago, and so far I'm loving it.
Nenad N., Serbia
I am working with 5 other developers on mobile portals developed with Tapestry for multiple clients.
Dragan S., Macedonia
I was a GSOC developer and now I'm trying to do new cool stuff with Tapestry like websocket integration with node.js and rabbitmq.
Volker B., Austria
Our project is a dealer management system which supports dealers and workshops of the VW Group's brands and the Porsche sports car brand in all sorts of operational processes in a modern and flexible way ... in our company I think there are about 80-100 people that are using Tapestry.
Daniel J., Canada
Assessment dashboards for schools in southwest SK, Canada
William O., USA
We are working on a number of cool Facebook apps using Tapestry. One's called My Social Rankings ( mysocialrankings.com ), and the other is called Blingville (blingville.com).
Peter P., Slovakia
We are developing web applications for broker companies using Tapestry 5, and its great to develop with Tapestry.
Pablo N., Argentina
We are using Tapestry for http://www.squidjob.com (migrating out of GWT). The site is THE place for finding service providers for anything.
Joost S., the Netherlands
Yanomo is time tracking, project management and invoicing software for the rest of us. Use it and "You know more" :)
Alexander G., Belarus
We have been using Tapestry for about 6 years in our projects. Our current project is web administration console for RadiumOne Display (www.radiumone.com) platform. We are very happy with our stack consisting from Tapestry5+Spring+Hibernate+jQuery.

As usual, I see a lot more Tapestry adoption outside the US; I wonder if it's about programming culture ... or about Tapestry localization support? I tend to see European developers as having more freedom to work with less mainstream technologies ... but when I ask them about this, they always seem to think that it's the US developers who have that freedom. I guess the grass is always greener.