It keeps both a (cursor-based) DOM model and an interface-based Java model of the same document, and keeps the two synchronized. I don't quite know how to explain this, but a document looks like a standard DOM, yet can also be treated as a tree of strongly-typed Java objects (represented as interfaces); the Java interfaces are generated from an XML schema document (using a command-line tool).
There seems to be a lot of magic in there; when you create new nodes on the Java tree, it can usually figure out automatically where to put the corresponding nodes in the DOM tree, based on the XML schema.
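To make that idea a bit more concrete, here's a rough sketch using nothing but the JDK's own DOM classes. This is not the library's API and certainly not how it is implemented; the `Person` interface, the `PersonView` wrapper, and the hard-coded `ORDER` list are all hypothetical stand-ins I made up for what a schema compiler might generate. It only shows the general trick: a typed interface sits on top of a DOM element, and setters use the schema's declared element order to decide where a new child belongs.

```java
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class TypedDomSketch {

    // Hypothetical typed view, standing in for a schema-generated interface.
    interface Person {
        String getName();
        void setName(String value);
        String getEmail();
        void setEmail(String value);
    }

    // Assumed child order, as an xs:sequence in an imagined schema would declare it.
    static final List<String> ORDER = List.of("name", "email");

    // The typed view holds no data of its own; every call reads or writes the DOM.
    static final class PersonView implements Person {
        private final Element element;

        PersonView(Element element) { this.element = element; }

        public String getName() { return get("name"); }
        public void setName(String value) { set("name", value); }
        public String getEmail() { return get("email"); }
        public void setEmail(String value) { set("email", value); }

        private Element find(String tag) {
            for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE && n.getNodeName().equals(tag)) {
                    return (Element) n;
                }
            }
            return null;
        }

        private String get(String tag) {
            Element child = find(tag);
            return child == null ? null : child.getTextContent();
        }

        private void set(String tag, String value) {
            Element child = find(tag);
            if (child == null) {
                child = element.getOwnerDocument().createElement(tag);
                // insertBefore with a null reference node appends at the end.
                element.insertBefore(child, firstLaterSibling(tag));
            }
            child.setTextContent(value);
        }

        // First existing child that the schema orders after `tag`, or null if none.
        private Node firstLaterSibling(String tag) {
            int rank = ORDER.indexOf(tag);
            for (Node n = element.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n.getNodeType() == Node.ELEMENT_NODE
                        && ORDER.indexOf(n.getNodeName()) > rank) {
                    return n;
                }
            }
            return null;
        }
    }

    // Creates an empty <person/> root element in a fresh DOM document.
    static Element newPersonElement() throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement("person");
        doc.appendChild(root);
        return root;
    }

    public static void main(String[] args) throws Exception {
        Element root = newPersonElement();
        Person person = new PersonView(root);
        person.setEmail("ann@example.org"); // set deliberately out of schema order
        person.setName("Ann");
        // The DOM still ends up with <name> ahead of <email>.
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            System.out.println(n.getNodeName() + " = " + n.getTextContent());
        }
    }
}
```

Because the typed view is just a thin wrapper around the element, anything done through the interface is immediately visible to code walking the DOM, and vice versa. The real library presumably does something far more sophisticated, but the schema-order lookup is the part that felt like magic to me.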
They seem to follow a good, pragmatic split in their approach to the API: convenience methods for doing things the way you usually want to, and formal methods to give you total control.
This has definitely moved up my list of technologies to check out. I can't imagine that it's as efficient as SAX and Digester, and it's based on XML schema (for good or ill).
It appears to behave reasonably well, even when the input document is damaged. Validation is optional. They've also done some difficult, interesting things to maximize speed; for instance, the XML schema is "compiled" into a binary metadata format (based on the Java class format), so you get the benefits of Schema without the cost of parsing those giant schema documents. They claim 100% compatibility with W3C XML Schema, and performance about the same as Xerces and much faster than JAXB R1.