Tapestry Central: Improved HiveMind efficiency

I was thinking about interceptors in the shower this morning (best place to do serious thinking) and was also thinking about the performance results I got a little ways back. It seems like the big hit is the constant calling of interface methods instead of instance methods. It's well known that calling a method on an interface is much slower than calling a method on an object instance ... the JVM has to do a more expensive dynamic lookup and there are fewer options for optimizing the call.

One idea I had was to rework how interceptors work, so that all the interceptors would work together to make a single interceptor class. Didn't seem like the right approach, since it would be a lot of work, and I think it will be rare for a service to have more than two interceptors. Hard to test, hard to implement, no assurance of a return.

Next, I considered having each interceptor subclass from the previous; this would allow calls to use super.foo() rather than _inner.foo(), which would be more efficient (a super-class invocation rather than an interface invocation).
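A minimal sketch of that subclassing idea, written by hand for illustration. The names FirstInterceptor and SecondInterceptor and the calls counters are hypothetical, and AdderImpl is only a stand-in for the real service implementation. The one thing it relies on is that a super.add() call compiles to an invokespecial instruction rather than an invokeinterface.

```java
// Service interface; the Adder name comes from the test services in this post.
interface Adder {
    int add(int a, int b);
}

// Core implementation (a stand-in for test.impl.AdderImpl).
class AdderImpl implements Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

// Hypothetical first interceptor: extends the core implementation and
// reaches it with super.add() instead of holding an _inner reference.
class FirstInterceptor extends AdderImpl {
    int calls; // counts invocations so the pass-through is observable

    @Override
    public int add(int a, int b) {
        calls++;
        return super.add(a, b);
    }
}

// Hypothetical second interceptor: extends the previous interceptor.
class SecondInterceptor extends FirstInterceptor {
    int outerCalls;

    @Override
    public int add(int a, int b) {
        outerCalls++;
        return super.add(a, b);
    }
}

final class SubclassDemo {
    public static void main(String[] args) {
        SecondInterceptor service = new SecondInterceptor();
        System.out.println(service.add(2, 3));
    }
}
```

One consequence worth noting: in this arrangement the interceptor is the implementation object rather than a wrapper around an existing instance, so it only works when the implementation class can be subclassed.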

Then I realized that I always have the actual type and instance of each interceptor (or the core implementation) ... which means I can build interceptors in terms of concrete classes rather than interfaces.

To see if there was any benefit, I needed to test. I extended my test harness to add two new Adder services; the first is like the Singleton service, but has one NullInterceptor (this interceptor simply calls through to the next object). The second has two Interceptors. First run (JDK 1.4, Hotspot server) showed the cost of those interceptors:

Run	Bean	Interface	Singleton	Deferred	Threaded	One Interceptor	Two Interceptors
Run #1	211	390	2183	2824	3185	2393	9174
Run #2	250	2343	2324	2864	3014	2434	9203
Run #3	240	2354	2323	2844	3054	2394	9253
Run #4	241	2333	2353	2824	3045	2403	9183
Run #5	231	2353	2333	2825	3064	2383	9194

Compare column "Singleton" to "One Interceptor" and "Two Interceptors". I don't exactly have a theory for why the difference between one and two interceptors is so large. Also, the value in red (the 390 under "Interface" in Run #1) is not a mistake, we've seen it before ... that appears to be Hotspot making an early optimization when there is only a single implementation of the interface; later it must go back and rewrite things when new implementations show up and it can no longer be certain that interface test.Adder is always class test.impl.AdderImpl.
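For readers who want to try something similar, here is a rough sketch of a timing loop in the spirit of the harness described above. It is not the actual harness: the Harness class, timeMillis and checksum are made-up names, the services are plain hand-written objects rather than HiveMind services, and only the Adder, AdderImpl and NullInterceptor names and the 50,000,000 iteration count come from this post.

```java
// Service interface, named after the test.Adder interface in this post.
interface Adder {
    int add(int a, int b);
}

// Core implementation (a stand-in for test.impl.AdderImpl).
final class AdderImpl implements Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

// Does nothing except call through to the next object, which it
// holds by interface type (the original, slower arrangement).
final class NullInterceptor implements Adder {
    private final Adder _inner;

    NullInterceptor(Adder inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b);
    }
}

final class Harness {
    // The running sum is published here so the JIT cannot discard the loop.
    static long checksum;

    // Calls adder.add() the given number of times; returns elapsed milliseconds.
    static long timeMillis(Adder adder, int iterations) {
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sum += adder.add(i, 1);
        }
        long elapsed = System.nanoTime() - start;
        checksum = sum;
        return elapsed / 1_000_000;
    }

    public static void main(String[] args) {
        int iterations = 50_000_000;
        Adder core = new AdderImpl();
        Adder one = new NullInterceptor(core);
        Adder two = new NullInterceptor(one);

        System.out.println("Run\tNo Interceptor\tOne Interceptor\tTwo Interceptors");
        for (int run = 1; run <= 5; run++) {
            System.out.println("Run #" + run
                + "\t" + timeMillis(core, iterations)
                + "\t" + timeMillis(one, iterations)
                + "\t" + timeMillis(two, iterations));
        }
    }
}
```

Absolute numbers from a loop like this depend heavily on the JVM version and flags, so they should only be compared against each other within a single run.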

I then reworked the interceptor code to use the actual class to call the next inner-most object (interceptor or implementation). This is possible because of the order of construction: the core implementation is constructed first, then the lowest order interceptor is wrapped around it, then the next lowest order interceptor, and so forth. This time there was a dramatic change:

Run	Bean	Interface	Singleton	Deferred	Threaded	One Interceptor	Two Interceptors
Run #1	210	361	1993	2593	2925	2203	2233
Run #2	210	2153	2144	2613	2804	2194	2213
Run #3	220	2143	2153	2594	2784	2203	2203
Run #4	221	2143	2153	2583	2804	2194	2193
Run #5	220	2143	2153	2574	2814	2193	2213

And so we see the desired outcome; adding interceptors is now a small, incremental cost.
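The reworked arrangement can be sketched by hand as follows. The class names InnerInterceptor, OuterInterceptor and ChainDemo are hypothetical, and this is an illustration of the idea rather than the code HiveMind actually builds. The point is that each _inner field is declared with the concrete class of the object it wraps, so the call compiles to invokevirtual on a known class instead of invokeinterface.

```java
// Service interface, named after the test.Adder interface in this post.
interface Adder {
    int add(int a, int b);
}

// Core implementation (a stand-in for test.impl.AdderImpl).
final class AdderImpl implements Adder {
    public int add(int a, int b) {
        return a + b;
    }
}

// Lowest order interceptor: wraps the core implementation and knows
// its concrete class, because the core already exists when this is built.
final class InnerInterceptor implements Adder {
    private final AdderImpl _inner;

    InnerInterceptor(AdderImpl inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b); // class method call, not an interface call
    }
}

// Next interceptor: wraps the previous interceptor by its concrete class.
final class OuterInterceptor implements Adder {
    private final InnerInterceptor _inner;

    OuterInterceptor(InnerInterceptor inner) {
        _inner = inner;
    }

    public int add(int a, int b) {
        return _inner.add(a, b);
    }
}

final class ChainDemo {
    // Builds the stack inside-out, in the construction order described above.
    static Adder build() {
        AdderImpl core = new AdderImpl();
        InnerInterceptor first = new InnerInterceptor(core);
        OuterInterceptor second = new OuterInterceptor(first);
        return second; // callers still see only the Adder interface
    }

    public static void main(String[] args) {
        System.out.println(build().add(2, 3));
    }
}
```

Only the outermost call from client code still goes through the interface; every hop inside the stack targets a concrete class.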

Why is this important? We often hear about the 80/20 rule, that 80% of the performance problems are in 20% of the code. My concern is that all the extra layers of interface invocations will become a visible factor, cumulatively ... but because all those calls are so completely, widely distributed, they will become impossible to track down. Sort of grit in the machinery ... no single, large point of failure, but a kind of overall friction in the works.

Given just how many iterations of my code loop I need to get measurable results (50,000,000), that's probably not a large concern ... still it's nice to claim I've optimized HiveMind for performance.

Thursday, September 11, 2003
