Tapestry Central: Goodbye, Digester!

Ah, the evolution of XML parsing in Tapestry. Tapestry is very much driven by validated XML files (for page and component specifications, application specifications, and library specifications). In the earliest days, Tapestry was tied directly to Xerces. Later, it switched over to JAXP. I had reams of code that would walk the DOM tree and construct the specification objects from the XML.

As a nod to efficiency, I switched over in 3.0 to use Digester, but that's caused a lot of grief in its own right. It seems like the version Tapestry uses was always in conflict with whatever version was in use by the servlet container, especially Tomcat. That caused a lot of grief.

Meanwhile, Digester drags along some of its own dependencies, jakarta-collections and jakarta-beanutils. More JAR hell, keeping all those JARs and versions straight.

No more; I replaced Digester with an ad-hoc parser derived from (and sharing code with) the HiveMind module deployment descriptor parser. It uses a stack to track the objects being constructed (that's borrowed from Digester), but uses a simple case statement and some coding discipline, to deal with recognizing new elements and processing them. I haven't done any timings, but comparing this code to the Digester code leads me to think that this will have a substantial edge ... which will be even more important once Tapestry supports reloading of page templates and specifications. In addition, inside the monolithic SpecificationParser class, it's a lot clearer what's going on. The old code had to create some number (usually three, sometimes six) Digester rule objects for each Digester pattern (patterns are matched against elements on Digester's stack to determine which rules fire). The new code is almost entirely just private methods: beginState() methods that decide what state to enter based on the current element, enterState() methods that create new specification objects, push them onto the stack, and change to a new parser state, and endState() methods invoked when a close tag is found, to finalize created objects and pop them off the stack, and return the parser state to its earlier value.

Next up; the JDom-based parser for the Tapestry mock unit test suite. First, I want to use SDL, not XML, for these scripts. Second, I want them to run much, much faster and I suspect that a lot of time is being spent in JDom. Third, I want thrown assertion exceptions to have line-precise error reporting.

And fourth ... part of Tapestry 3.1 will be to productize this approach to testing Tapestry applications.

Tapestry Central

Monday, May 03, 2004

Goodbye, Digester!

No comments:

Post a Comment