I've been doing a bit of work on the Tapestry 5 code base. I'm really interested in making Tapestry 5 screaming fast, and since the code is based on JDK 1.5, we can use its built-in concurrency support (java.util.concurrent). Previously, I've blogged about using an aspect to enforce read and write locks. I decided to write a simple benchmark to see what the relative costs were.
As with any benchmark, it's only an approximation. I tried enough tricks to ensure that HotSpot wouldn't get in there and over-optimize things, but you can never tell. HotSpot is a devious piece of software.
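The post does not say which tricks were used. As a rough sketch of two common ones (a warm-up pass before timing, and publishing each result to a volatile field so the JIT cannot discard the work as dead code), something like the following would do. The class name `TinyBench` and its methods are hypothetical and are not taken from the actual benchmark.

```java
import java.util.function.LongSupplier;

public class TinyBench {
    /** Written after every run so the measured work has an observable effect. */
    static volatile long sink;

    /** Calls the operation `iterations` times and sums the results. */
    static long run(LongSupplier op, int iterations) {
        long total = 0;
        for (int i = 0; i < iterations; i++) {
            total += op.getAsLong();
        }
        return total;
    }

    /** Warms up first, then returns the average nanoseconds per call of the timed pass. */
    static double nanosPerCall(LongSupplier op, int warmup, int iterations) {
        sink = run(op, warmup);          // give HotSpot a chance to compile the hot path
        long start = System.nanoTime();
        long result = run(op, iterations);
        long elapsed = System.nanoTime() - start;
        sink = result;                   // consume the result of the timed pass
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        double cost = nanosPerCall(() -> System.identityHashCode(args), 100_000, 1_000_000);
        System.out.println("approx ns/call: " + cost);
    }
}
```

Even with both tricks in place, the numbers are only indicative; the JIT can still inline or reorder more than you expect.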
I got interesting, and strange, results:
For a baseline, I executed the code with no synchronization whatsoever (simple). The cost of synchronization (synched) shows that synchronization is pretty darn cheap, just a small increment on top of the baseline code. The aspect graph shows the cost of using the @Synchronized aspect to maintain a reentrant read/write lock (that is, a shared read lock combined with an exclusive write lock). Finally, the rw graph shows the cost of writing code that maintains the read/write lock in normal code (rather than having it added via the aspect).
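To make the variants concrete, here is a minimal sketch of what the three hand-written ones might look like. The `Counter` interface and the class names are made up for illustration and are not the benchmark's actual code; the aspect variant is left out because it depends on the @Synchronized aspect itself.

```java
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class LockVariants {
    /** Common shape for the three hand-written variants (hypothetical). */
    interface Counter {
        long get();
        void increment();
    }

    /** "simple": no synchronization at all; not safe across threads. */
    static final class SimpleCounter implements Counter {
        private long value;
        public long get() { return value; }
        public void increment() { value++; }
    }

    /** "synched": one exclusive monitor guards readers and writers alike. */
    static final class SynchedCounter implements Counter {
        private long value;
        public synchronized long get() { return value; }
        public synchronized void increment() { value++; }
    }

    /** "rw": a shared read lock plus an exclusive write lock, written by hand. */
    static final class RwCounter implements Counter {
        private final ReadWriteLock lock = new ReentrantReadWriteLock();
        private long value;

        public long get() {
            lock.readLock().lock();
            try {
                return value;
            } finally {
                lock.readLock().unlock();
            }
        }

        public void increment() {
            lock.writeLock().lock();
            try {
                value++;
            } finally {
                lock.writeLock().unlock();
            }
        }
    }

    /** Runs a fixed number of increments on one thread and reads back the total. */
    static long exercise(Counter counter, int iterations) {
        for (int i = 0; i < iterations; i++) {
            counter.increment();
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(exercise(new SimpleCounter(), 1000));
        System.out.println(exercise(new SynchedCounter(), 1000));
        System.out.println(exercise(new RwCounter(), 1000));
    }
}
```

The try/finally around every lock and unlock pair is the boilerplate that an aspect can add for you.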
Synchronization has some overhead. Using the @Synchronized aspect is about 4x as expensive as just using the synchronized keyword on a method. Strangely, the aspect version operates faster than the pure code version, for reasons I can't explain, except that it must have something to do with how AspectJ weaves my code (a lot of code I write ends up as private static methods after weaving, which may have some runtime performance advantage).
These results demonstrate an important tradeoff: if your application only occasionally has multiple threads hitting the same methods, then you might want to choose synchronized, since you aren't in much danger of serializing your threads. By serializing, I mean that only one thread is running and all other threads are blocked, waiting for that thread to complete a synchronized block. Serialized threads are what cause throughput for a web site to be bad, even though the CPU isn't maxed out ... it's basically Moe, Larry and Curly fighting to get through a single, narrow door all at the same time (they race to claim the single, exclusive lock).
Tapestry, on the other hand, will have a number of choke points where many threads will try to simultaneously access the same resource (without modifying it). In those cases, a shared read lock (with the occasional exclusive write lock) costs a little more per thread, but allows multiple threads to operate simultaneously ... and that leads to much higher throughput. Here, Moe, Larry and Curly get to walk through their own individual doors (that is, each of them has a non-exclusive read lock of their own).
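The difference between the two doors can be checked directly. In this small sketch (the class and method names are hypothetical), one thread holds a lock and a second thread is asked whether it could acquire it right now. A ReentrantLock stands in for the exclusive lock, since a synchronized monitor offers no non-blocking "try".

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class SharedReadDemo {
    /** Asks a second thread whether it can grab the given lock right now. */
    static boolean otherThreadCanAcquire(Lock lock) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            return pool.submit(() -> {
                boolean acquired = lock.tryLock();
                if (acquired) {
                    lock.unlock();
                }
                return acquired;
            }).get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    /** This thread holds the read lock; can another thread read too? */
    static boolean secondReaderAllowed() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.readLock().lock();
        try {
            return otherThreadCanAcquire(rw.readLock());
        } finally {
            rw.readLock().unlock();
        }
    }

    /** This thread holds the read lock; can another thread write? */
    static boolean writerAllowedDuringRead() {
        ReentrantReadWriteLock rw = new ReentrantReadWriteLock();
        rw.readLock().lock();
        try {
            return otherThreadCanAcquire(rw.writeLock());
        } finally {
            rw.readLock().unlock();
        }
    }

    /** This thread holds an exclusive lock; can another thread get in at all? */
    static boolean secondThreadAllowedByExclusiveLock() {
        ReentrantLock exclusive = new ReentrantLock();
        exclusive.lock();
        try {
            return otherThreadCanAcquire(exclusive);
        } finally {
            exclusive.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println(secondReaderAllowed());                // true
        System.out.println(writerAllowedDuringRead());            // false
        System.out.println(secondThreadAllowedByExclusiveLock()); // false
    }
}
```

Readers pass each other freely, while a writer (or anyone facing an exclusive lock) has to wait at the door.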
As with any benchmark, my little test bench is far, far from a simulation of real life. But I think I can continue to make use of @Synchronized without worrying about tanking the application. In fact, just as I predicted Tapestry 4 would outperform Tapestry 3, I believe Tapestry 5 will outperform Tapestry 4, by at least as much.