Friday, September 19, 2003

Excellent discourse on WebObjects and Tapestry

Drew Davidson, (co-?)creator of OGNL, recently posted this message about WebObjects to the Tapestry user mailing list:
I went to WWDC 2001 full of excitement about WO, since at that time they were just getting the pure Java 5.0 version out the door. I had talked with them previously about how they were going to transition from ObjC to Java and was concerned about the API.

Their main focus was to make it API compatible with ObjC - something I thought was ridiculous. From their point of view, I guess that their customers were clamoring for this because it would make transitioning their applications from ObjC to Java easier; most of the WO work we did at Running Start (where I worked with Erik Hatcher) during 2000 involved using the ObjC-Java bridge and the Java API to WO that was provided. So there was an existing Java API for WO.

However, this decision had long-term consequences for users. The API as presented by the ObjC-Java bridge was an Objective-C API - no two ways about it. It was not Java-like in the least. The API used internal collections (NSArray, NSDictionary, etc.) instead of the soon-to-be-standard Java Collections in java.util. The method names were ObjC-like and did not follow JavaBeans patterns where that would have made sense. ObjC concepts that Java does not support, like categories and inherited class methods, came through as rough translations, and sometimes the disconnect was obvious.

Short term, this is a benefit, as developers can more easily port their code.

Long term, however, this means that you must live in the Apple WO world instead of the bigger Java world. Adaptors must be written when interaction is required between the APIs, and standard tools for doing JavaBeans stuff just won't fit without writing custom BeanInfo objects for the necessary classes.

The WO API itself was written very long ago (it started at NeXT back in 1994, for crying out loud!) and it was written from the perspective that you build a WO *application* that acts as its own application server. It does not take into account the notion of separate Application Servers, and therefore solves the entire Application Server role itself, from soup to nuts. At the time it was written, this was sensible.

But the world has moved on. Servlets are now the standard way to deploy Java web applications. Apple has made efforts to integrate WO with Servlets, and even with JSP, but IMNSHO they have not really made it feasible to develop and deploy this way. The API is just too elderly; they needed to have ripped a new page off the notebook, taken the concepts, and started over. The concepts are still very valid. Nothing else (up until Tapestry) had attempted to really componentize the web, and that is very much to NeXT's (then Apple's) credit. When Howard started over with the basic assumptions of Application Servers, Servlets, etc. and applied the excellent concepts from WO, he created Tapestry. Tapestry is, to my mind, what WO 5.0 should have been. The failure was due to Apple's doing what the customer *wanted* and not what they *needed*. I'm sure that Apple's rationale for doing what they did was to support their customer base and not lose those customers. But look at their market now - they are niche. Then again, the 3 WO customers left probably had a very easy time porting their 4.5 applications to 5.0.

BTW, I mentioned the collections problem at WWDC (in one of the sessions, in front of everyone) and was brushed off by Ernie Prabakhar (the manager in charge of WO at the time), who claimed that their collections were not functionally compatible (mainly due to null handling). I told him that I had integrated the two without conflict using some facades that wrapped the NS stuff as Java Collections. It worked, but was wasteful because of all of the adaptors hanging around and the constant awareness you had to have at integration points. They really had no interest in solving this problem, and it was clear they just wanted me to shut up and go away because I was expressing concern and critique instead of praise and adulation.

You, Jamie, are correct in your assessment that WO has stopped evolving. It has become a niche solution wherein you have to buy into the entire Apple web development package, rather than being able to use the tools you want and exclude those you don't. The only company that can get away with this kind of behaviour and still keep market share is Microsoft.

- Drew
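
(An aside, for illustration: the kind of facade Drew describes - wrapping the NS collections so they can be used as standard Java collections - might look roughly like the sketch below. This assumes the WebObjects NSArray API with count() and objectAtIndex(); it is not his actual code.)

    import java.util.AbstractList;

    import com.webobjects.foundation.NSArray;

    // A read-only java.util.List view over a WebObjects NSArray.
    public class NSArrayListFacade extends AbstractList
    {
        private final NSArray _array;

        public NSArrayListFacade(NSArray array)
        {
            _array = array;
        }

        public Object get(int index)
        {
            return _array.objectAtIndex(index);
        }

        public int size()
        {
            return _array.count();
        }
    }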

Wednesday, September 17, 2003

Components Vs. MDA? (vs. J2EE)

I've been reading some of Rickard's blog entries on Model Driven Architecture, and the resulting maelstrom at TheServerSide.

I have a pretty deeply ingrained abhorrence of code generation. Not bytecode enhancement at runtime, though; that I'm really into. There's no baggage with that: no chance for the build process to screw it up, no need for a developer to go modify the output, no chance for bad merges or lost changes.

I've never liked code generation; my early exposure to NeXT and InterfaceBuilder really showed off why it wasn't necessary for GUIs, and I still don't think it's needed for enterprise apps.

Code generation is a last resort that should only be used when real techniques fail. For me, real techniques are combinations of components and service-oriented architectures ... the kind of stuff you see in Tapestry and HiveMind.

In MDA, you define your system in terms of objects and relationships ... but then you spew out code representing that system that isn't isomorphic to the system you defined. If it were truly isomorphic, you would be able to reconstitute your original model from it, but that's simply not reasonable.

In my vision of a proper MDA, the objects and relationships in the model would be directly expressed by matching objects and relationships in the runtime image. In Tapestry terms ... application extensions and helper beans; in HiveMind terms, services and implementations. In both cases, I would expect that the MDA would provide a library of the components (and service implementations) referenced in the model and the runtime image.

Tapestry has room for meta properties in all the key elements; it is quite reasonable for a tool to plug meta data into pages, components and helper beans to support seamless round-trip engineering ... reading the output to reconstitute the original model. It wouldn't be, couldn't be, perfect ... but it would be better than spewing out code.

Still, if the underlying problem is the complexity of J2EE in terms of getting reasonably sized projects working ... then J2EE is overused and should be bypassed. EJBs don't lend themselves to real, lightweight, agile development models (XP or not) ... they are too reliant on the container and on the runtime environment to be easily tested outside of that container. Any reasonable development model involves running frequent tests ... for example, Tapestry has around 400 tests that run in 40 seconds (and I want to speed that up significantly). HiveMind has 160 tests that run in three seconds. Did I break something? Click the test button.
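
That's the payoff of staying out of the container: a plain service implementation can be exercised from an ordinary JUnit test with no packaging or deployment step. A trivial, made-up example (not one of the actual Tapestry or HiveMind tests):

    import junit.framework.TestCase;

    // Invented example of a container-free service under test.
    class Calculator
    {
        public int add(int a, int b)
        {
            return a + b;
        }
    }

    public class TestCalculator extends TestCase
    {
        public void testAdd()
        {
            Calculator calculator = new Calculator();

            assertEquals(7, calculator.add(3, 4));
        }
    }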

At work, for me to test an EJB change requires a build (to package the EJB and deploy it), which is currently taking about ten minutes. You read that right ... we have some issues with our build environment (they're being worked on, but ... "House built on a bad foundation ...")

So, with every change requiring a minimum of ten minutes to validate (usually, much, much longer), agile is dead. Refactoring to use HiveMind (or other microkernels) could support a fast, robust, agile methodology ... and bring into question whether MDA makes sense.

Meanwhile, my other peeve with MDA is that it assumes there will be one uber-developer (the architect) who can create the model, and create the code generation scripts (or modify the supplied ones). How many people can wear that many hats? Doesn't that create a choke point? Isn't that another block to agility ... you don't fix your code, you have to find and fix the code that generates your code, regenerate all, retest, hope it didn't break something outside your fiefdom.

MDA. Offshore outsourcing. Mission statements. ISO 9000. What's the next fad that's going to distract me from getting actual work done?

Monday, September 15, 2003

Presenting at ApacheCon

I'll be presenting an all-too-short Tapestry session at ApacheCon on Nov 17th at 11am. I have 60 minutes to talk Tapestry and handle Q&A ... glad I already have some materials ready, but I don't know how much I can cover in that amount of time.

Friday, September 12, 2003

HiveMind call for participation

HiveMind is, I believe, ready to go! The last few days have been spent refining the framework in a number of small ways. Essl Christian has gotten very involved, even contributing a useful patch that addresses the complex class loader environment within Tomcat. He's also provided some great input that led to further shaving away of unwanted code. Sometimes coding is like sculpting ... you just remove the stuff you don't absolutely need.

I'm now interested in putting together a real community to bring HiveMind forward. This will mean putting together a real proposal to move HiveMind out of the sandbox and into Jakarta commons proper. A community around HiveMind would be a good start.

My vision for HiveMind going forward:

  • Select a starting contributors community. Me, Essl, Bill Lear?, Harish?, David Solis?
  • Move to Jakarta commons w/ a charter, voting and all the dressing. Yea! More administration!
  • Move the source code over from sandbox to proper. Reorganize it into a Maven multiproject
  • Progress to a 1.0 release
  • (In parallel) Begin to add additional sub-projects to wrap common code as reusable HiveMind modules. Examples: object pools, database pools, Lucene?, Betwixt?, Hibernate, etc.
  • Conquer the world, and all that

So, if interested, drop me an e-mail or post to the Jakarta Commons Developers list.

Sometimes testing numbers just don't make sense

Based on what I did yesterday, I thought I'd see if I could improve the efficiency of the LoggingInterceptor by directly invoking methods on the commons-logging wrapper class, rather than through the Log interface. Fortunately, I decided to do some tests to see about relative costs. I don't fully understand the results (Windows XP, Sun JDK 1.4, Hotspot server) ... obviously, Hotspot is doing something interesting to the code.

Run        Log4J Direct   Jakarta (Interface)   Jakarta (Impl)
Run #1          741               711                 701
Run #2         1622               671                 551
Run #3          661               521                 541
Run #4          641               540                 571
Run #5          621               541                 551
Run #6          610               521                 551
Run #7          611               531                 560
Run #8          611               531                 551
Run #9          621               521                 550
Run #10         611               521                 551
Run #11         611               530                 551
Run #12         611               521                 551
Run #13         611               520                 561
Run #14         611               531                 550
Run #15         611               511                 531
Run #16         581               510                 581
Run #17         621               531                 601
Run #18         631               571                 570
Run #19         621               541                 571
Run #20         621               541                 570

The tests invoked isDebugEnabled() 50,000,000 times. I did what I could to help ensure that things didn't get unrealistically inlined.
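
The timing loop was roughly of this shape (a simplified sketch, not the actual harness; the class name and the accumulation trick are just indicative):

    import org.apache.commons.logging.Log;
    import org.apache.commons.logging.LogFactory;
    import org.apache.log4j.Logger;

    public class IsDebugEnabledTiming
    {
        private static final int COUNT = 50000000;

        public static void main(String[] args)
        {
            Logger logger = Logger.getLogger(IsDebugEnabledTiming.class);
            Log log = LogFactory.getLog(IsDebugEnabledTiming.class);

            // Accumulate the result so Hotspot can't discard the calls entirely.
            boolean result = false;

            long start = System.currentTimeMillis();
            for (int i = 0; i < COUNT; i++)
                result |= logger.isDebugEnabled(); // Log4J Logger, direct

            System.out.println("Log4J direct: " + (System.currentTimeMillis() - start));

            start = System.currentTimeMillis();
            for (int i = 0; i < COUNT; i++)
                result |= log.isDebugEnabled(); // through the commons-logging Log interface

            System.out.println("Via Log interface: " + (System.currentTimeMillis() - start));

            // The third column made the same call through the concrete
            // commons-logging Log4JCategoryLog class rather than the interface.

            System.out.println(result);
        }
    }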

What I don't understand is why going through the Log interface is sometimes faster than either going through the Log4J Logger class, or through the commons-logging Log4JCategoryLog class. I checked the commons-logging code ... Log4JCategoryLog is a very thin wrapper that delegates to a Logger.

Anyway, until I find better tests for figuring out performance, there's no reason to change the LoggingInterceptor code.

Thursday, September 11, 2003

Improved HiveMind efficiency

I was thinking about interceptors in the shower this morning (best place to do serious thinking) and was also thinking about the performance results I got a little ways back. It seems like the big hit was the constant calling of interface methods instead of instance methods. It's well known that calling a method on an interface is much slower than calling a method on an object instance ... the JVM has to do a more expensive dynamic lookup and there are fewer options for optimizing the call.

One idea I had was to rework how interceptors work, so that all the interceptors would work together to make a single interceptor class. Didn't seem like the right approach, since it would be a lot of work, and I think it will be rare for a service to have more than two interceptors. Hard to test, hard to implement, no assurance of a return.

Next, I considered having each interceptor subclass from the previous; this would allow calls to use super.foo() rather than _inner.foo(), which would be more efficient (a super-class invocation rather than an interface invocation).

Then I realized that I always have the actual type and instance of each interceptor (or the core implementation) ... which means I can build interceptors in terms of concrete classes rather than interfaces.
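
In hand-written form (the real classes are generated with Javassist at runtime; these names are invented just to show the shape of the change), the difference is roughly:

    interface Adder
    {
        int add(int a, int b);
    }

    class AdderImpl implements Adder
    {
        public int add(int a, int b)
        {
            return a + b;
        }
    }

    // Before: the interceptor only knows the next object by its service
    // interface, so every delegating call is an interface invocation.
    class InterceptorViaInterface implements Adder
    {
        private final Adder _inner;

        InterceptorViaInterface(Adder inner)
        {
            _inner = inner;
        }

        public int add(int a, int b)
        {
            return _inner.add(a, b); // invokeinterface
        }
    }

    // After: the core implementation is built first, so the generated
    // interceptor can hold the concrete type of whatever it wraps and
    // delegate with an ordinary virtual call.
    class InterceptorViaClass implements Adder
    {
        private final AdderImpl _inner; // concrete class, not the interface

        InterceptorViaClass(AdderImpl inner)
        {
            _inner = inner;
        }

        public int add(int a, int b)
        {
            return _inner.add(a, b); // invokevirtual on a concrete class
        }
    }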

To see if there was any benefit, I needed to test. I extended my test harness to add two new Adder services; the first is like the Singleton service, but has one NullInterceptor (this interceptor simply calls through to the next object). The second has two interceptors. The first run (JDK 1.4, Hotspot server) showed the cost of those interceptors:

Run      Bean   Interface   Singleton   Deferred   Threaded   One Interceptor   Two Interceptors
Run #1    211      390         2183        2824       3185          2393               9174
Run #2    250     2343         2324        2864       3014          2434               9203
Run #3    240     2354         2323        2844       3054          2394               9253
Run #4    241     2333         2353        2824       3045          2403               9183
Run #5    231     2353         2333        2825       3064          2383               9194

Compare the "Singleton" column to "One Interceptor" and "Two Interceptors". I don't exactly have a theory for why the difference between one and two interceptors is so large. Also, the anomalously low Interface value in Run #1 (390) is not a mistake; we've seen it before ... that appears to be Hotspot making an early optimization while there is only a single implementation of the interface; later it must go back and rewrite things when new implementations show up and it can no longer be certain that interface test.Adder is always class test.impl.AdderImpl.

I then reworked the interceptor code to use the actual class to call the next inner-most object (interceptor or implementation). This is possible because of the order of construction: the core implementation is constructed first, then the lowest order interceptor is wrapped around it, then the next lowest order interceptor, and so forth. This time there was a dramatic change:

Run      Bean   Interface   Singleton   Deferred   Threaded   One Interceptor   Two Interceptors
Run #1    210      361         1993        2593       2925          2203               2233
Run #2    210     2153         2144        2613       2804          2194               2213
Run #3    220     2143         2153        2594       2784          2203               2203
Run #4    221     2143         2153        2583       2804          2194               2193
Run #5    220     2143         2153        2574       2814          2193               2213

And so we see the desired outcome; adding interceptors is now a small, incremental cost.

Why is this important? We often hear about the 80/20 rule: that 80% of the performance problems are in 20% of the code. My concern is that all the extra layers of interface invocations will become a visible factor, cumulatively ... but because all those calls are so completely, widely distributed, they will be impossible to track down. Sort of like grit in the machinery ... no single, large point of failure, but a kind of overall friction in the works.

Given just how many iterations of my code loop I need to get measurable results (50,000,000), that's probably not a large concern ... still it's nice to claim I've optimized HiveMind for performance.

Tuesday, September 09, 2003

More substantial docs on HiveMind

I've added several new documentation items to the HiveMind site. One describes how to "bootstrap" the Registry ... exactly what code you execute to locate and parse all the HiveMind module deployment descriptors and form a Registry (it's just a couple of lines). A second page describes how to override a service with a new service. Finally, a big page is a case study of how HiveMind is used in Vista to manage startup and shutdown logic.
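
For a sense of what that "couple of lines" looks like (a sketch based on the RegistryBuilder API; details in the current sandbox code may differ, and MyService is just a made-up service interface):

    import org.apache.hivemind.Registry;
    import org.apache.hivemind.impl.RegistryBuilder;

    public class Bootstrap
    {
        // Hypothetical service interface, assumed to be defined elsewhere
        // and contributed by some module's deployment descriptor.
        interface MyService
        {
            void doSomething();
        }

        public static void main(String[] args)
        {
            // Locates and parses the module deployment descriptors visible
            // on the classpath and constructs the Registry from them.
            Registry registry = RegistryBuilder.constructDefaultRegistry();

            // Services are then obtained from the Registry by fully qualified id.
            MyService service =
                (MyService) registry.getService("com.example.MyService", MyService.class);

            service.doSomething();
        }
    }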

Meanwhile, HiveMind is really taking root at WebCT. Several new projects within Vista will leverage HiveMind in one way or another; all of these fit the HiveMind pattern: service oriented, with distributed configuration needs. For example, a new data access layer is in the works, and different tools and services will contribute DAOs (Data Access Objects) for different types of databases. Rob Lorring has just put together a JMX-based monitoring package using HiveMind ... different tools contribute different monitors. The new caching service is configured using HiveMind as well.

The recent call for contributors to form a HiveMind community is also coming along, with a few people popping up to see what it's all about. Once I have a free moment (a laughable concept) I'll put together a proposal to move HiveMind from the sandbox into Jakarta commons proper.

Friday, September 05, 2003

HiveMind --- now with downloads

I hacked together some stuff to allow downloads (binary and source) of HiveMind. It's served out of my home page at jakarta.apache.org. There's some definite interest building in HiveMind ... pretty soon, it'll be time to form an actual community and propose moving it out of the commons sandbox.

Wednesday, September 03, 2003

"Tapestry in Action" cover

I got goosebumps! Manning has decided on the cover illustration and here's what the book will look like:

With luck, the first chapters will show up on MEAP in about two weeks.

"Tapestry in Action" will be MEAP

Got some confirmation that the book will be available as incremental, online PDFs prior to dead-tree format. Not sure at this time what the schedule is ... there's still a lot of copy editing, indexing and production to be done.

In other news ... I'm trying to set up the necessary definitions to allow Javassist to build inside Gump. I'm awaiting S. Chiba's permission, even though I probably don't need it -- can't see why he'd object. This will be good for other Gump projects that have a Javassist dependency, such as Tapestry and HiveMind.