Tapestry Training -- From The Source

Let me help you get your team up to speed in Tapestry ... fast. Visit howardlewisship.com for details on training, mentoring and support!

Thursday, September 11, 2003

Improved HiveMind efficiency

I was thinking about interceptors in the shower this morning (the best place to do serious thinking) and was also thinking about the performance results I got a little while back. It seems like the big hit was the constant calling of interface methods instead of instance methods. It's well known that calling a method through an interface is much slower than calling a method on an object instance ... the JVM has to do a more expensive dynamic lookup, and there are fewer options for optimizing the call.

One idea I had was to rework how interceptors work, so that all the interceptors would cooperate to build a single interceptor class. That didn't seem like the right approach: it would be a lot of work, and I think it will be rare for a service to have more than two interceptors. Hard to test, hard to implement, no assurance of a return.

Next, I considered having each interceptor subclass from the previous; this would allow calls to use super.foo() rather than _inner.foo(), which would be more efficient (a super-class invocation rather than an interface invocation).

Then I realized that I always have the actual type and instance of each interceptor (or the core implementation) ... which means I can build interceptors in terms of concrete classes rather than interfaces.
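As a rough illustration of the two wiring styles (the Adder name follows the test harness described below; the interceptor classes here are hypothetical stand-ins for the classes HiveMind builds, not its actual generated code):

```java
// The service interface and its core implementation.
interface Adder {
    int add(int a, int b);
}

class AdderImpl implements Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

// Original style: the interceptor holds the next object by its
// interface type, so every pass-through is an interface invocation.
class InterfaceInterceptor implements Adder {
    private final Adder _inner;

    InterfaceInterceptor(Adder inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b);
    }
}

// Reworked style: the interceptor knows the concrete class of the
// next inner object, so the pass-through is an ordinary virtual call.
class ConcreteInterceptor implements Adder {
    private final AdderImpl _inner;

    ConcreteInterceptor(AdderImpl inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b);
    }
}
```

Both versions behave identically; only the declared type of the `_inner` field changes, and with it the kind of method dispatch the JVM performs.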

To see if there was any benefit, I needed to test. I extended my test harness to add two new Adder services; the first is like the Singleton service, but has one NullInterceptor (this interceptor simply calls through to the next object). The second has two interceptors. The first run (JDK 1.4, HotSpot server) showed the cost of those interceptors:

Run     Bean  Interface  Singleton  Deferred  Threaded  One Interceptor  Two Interceptors
Run #1   211        390       2183      2824      3185             2393              9174
Run #2   250       2343       2324      2864      3014             2434              9203
Run #3   240       2354       2323      2844      3054             2394              9253
Run #4   241       2333       2353      2824      3045             2403              9183
Run #5   231       2353       2333      2825      3064             2383              9194

Compare the "Singleton" column to "One Interceptor" and "Two Interceptors". I don't exactly have a theory for why the difference between one and two interceptors is so large. Also, the value in red is not a mistake; we've seen it before ... that appears to be HotSpot making an early optimization when there is only a single implementation of the interface; later it must go back and rewrite things when new implementations show up and it can no longer be certain that interface test.Adder is always class test.impl.AdderImpl.

I then reworked the interceptor code to use the actual class to call the next inner-most object (interceptor or implementation). This is possible because of the order of construction: the core implementation is constructed first, then the lowest-order interceptor is wrapped around it, then the next-lowest-order interceptor, and so forth. This time there was a dramatic change:

Run     Bean  Interface  Singleton  Deferred  Threaded  One Interceptor  Two Interceptors
Run #1   210        361       1993      2593      2925             2203              2233
Run #2   210       2153       2144      2613      2804             2194              2213
Run #3   220       2143       2153      2594      2784             2203              2203
Run #4   221       2143       2153      2583      2804             2194              2193
Run #5   220       2143       2153      2574      2814             2193              2213

And so we see the desired outcome; adding interceptors is now a small, incremental cost.
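The wrapping order described above (core first, then each interceptor, each typed to the concrete class of what it wraps) can be sketched in plain Java; the interceptor class names are hypothetical stand-ins for what HiveMind builds:

```java
interface Adder {
    int add(int a, int b);
}

class AdderImpl implements Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

// Lowest-order interceptor: constructed second, its inner field is the
// concrete core class, so its pass-through is a plain virtual call.
class FirstInterceptor implements Adder {
    private final AdderImpl _inner;

    FirstInterceptor(AdderImpl inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b);
    }
}

// Next interceptor: constructed third, typed to the concrete class of
// the first interceptor, again avoiding an interface dispatch.
class SecondInterceptor implements Adder {
    private final FirstInterceptor _inner;

    SecondInterceptor(FirstInterceptor inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b);
    }
}
```

Clients still see only the Adder interface on the outermost object; the concrete typing is purely internal to the chain, which is why this rework is invisible to service consumers.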

Why is this important? We often hear about the 80/20 rule: that 80% of the performance problems are in 20% of the code. My concern is that all the extra layers of interface invocations would become a visible factor, cumulatively ... but because all those calls are so widely distributed, they would be impossible to track down. A sort of grit in the machinery ... no single, large point of failure, but a kind of overall friction in the works.

Given just how many iterations of my code loop I need to get measurable results (50,000,000), that's probably not a large concern ... still, it's nice to be able to claim I've optimized HiveMind for performance.
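The measurement itself is just a tight loop over the service call. A minimal sketch of that kind of timing harness (the 50,000,000 iteration count comes from the text; the class and method names here are hypothetical):

```java
interface Adder {
    int add(int a, int b);
}

class AdderImpl implements Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

public class AdderBenchmark {
    private static final int ITERATIONS = 50000000;

    // Times ITERATIONS calls through the service; returns elapsed milliseconds.
    static long time(Adder adder) {
        long start = System.currentTimeMillis();
        int sum = 0;
        for (int i = 0; i < ITERATIONS; i++)
            sum += adder.add(i, 1);
        // Reference the accumulated result so the JIT can't discard the loop.
        if (sum == Integer.MIN_VALUE)
            System.out.println("never happens in practice");
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        System.out.println("Singleton: " + time(new AdderImpl()) + " ms");
    }
}
```

With millisecond-resolution timing, a loop this long is needed to lift per-call costs of a few nanoseconds above the noise, which is exactly why such a heavily amortized overhead is unlikely to matter in a real application.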
