Java DOM ObjectBuilder

Mark L. Fussell fussellm at
Fri Dec 12 14:22:25 GMT 1997

I have a first pass at an ObjectBuilder that generates objects based on 
the W3C Java DOM Interfaces[*].  So any XML-Parser with a BuilderClient 
[currently MS-XML and Aelfred] can generate DOM objects, including the 
Model information itself.  It is also easy to modify both the objects and 
the construction process to be different from the DOM-specific ones (e.g. 
"Tag"-specific objects instead of generic Elements).  This applies to the 
DTD objects (use a different Node, ElementDefinition, or any other 
interface/class) as well as the normal Element content.

If enough people are interested I will try to make a specific release of 
this code and the minimum amount of MONDO that is needed to make it work 
(see below for size information), otherwise I will include it as an 
example in the next MONDO release.  The rest of this mail just discusses 
the details a bit more.


The DOM ObjectBuilding process can generally be 1-pass (direct) from the 
parser, except for the DTD, which parsers digest first and which must be 
'redescribed' to the builder.  For Aelfred, it looks something like this:

 XmlParser->| XmlProcessor  |-->DOMObBuilder->SpecificFactory->>DOMObject
            | BuilderClient |^                 or BeanFactory     v
            +---------------+ \<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<</

Where the '->>' indicates the Factory actually creates an object (at 
least conceptually) and the '<<<' is a return line for that object to be 
used in the subsequent recipe.  Generally the arrows to the right are the 
response to an ESIS type of event, but the ordering for building is 
sometimes a little different (attribute processing occurs inside an 
object's context, not before).  You can think of Recipes as a more 
general ESIS event model with a feedback loop.
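
As a rough illustration of the feedback loop in the diagram, here is a 
minimal sketch in present-day Java.  Every name in it (RecipeLoopSketch, 
NodeFactory, Node, and this BuilderClient) is a hypothetical stand-in and 
not the MONDO or Aelfred code; it only assumes that the client keeps a 
stack of the objects handed back by the factory.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

final class RecipeLoopSketch {
    /** Hypothetical factory: the '->>' step that actually creates an object. */
    interface NodeFactory {
        Node create(String name);
    }

    /** Minimal stand-in for a DOM node; not the W3C interface. */
    static class Node {
        final String name;
        final List<String> attributes = new ArrayList<>();
        final List<Node> children = new ArrayList<>();

        Node(String name) { this.name = name; }

        @Override public String toString() {
            StringBuilder sb = new StringBuilder("<" + name);
            for (String a : attributes) sb.append(' ').append(a);
            sb.append('>');
            for (Node c : children) sb.append(c);
            return sb.append("</").append(name).append('>').toString();
        }
    }

    /** Hypothetical stack-based client driven by ESIS-like parser events. */
    static class BuilderClient {
        private final NodeFactory factory;
        private final Deque<Node> stack = new ArrayDeque<>();
        private Node root;

        BuilderClient(NodeFactory factory) { this.factory = factory; }

        void startElement(String name) {
            Node created = factory.create(name);          // '->>' create
            if (stack.isEmpty()) root = created;
            else stack.peek().children.add(created);
            stack.push(created);                          // '<<<' object comes back as context
        }

        /** Attributes are applied inside the current object's context. */
        void attribute(String name, String value) {
            stack.peek().attributes.add(name + "=\"" + value + "\"");
        }

        void endElement() { stack.pop(); }

        Node result() { return root; }
    }

    static String demo() {
        BuilderClient client = new BuilderClient(Node::new);
        client.startElement("Period");
        client.attribute("unit", "day");
        client.startElement("start");
        client.endElement();
        client.startElement("end");
        client.endElement();
        client.endElement();
        return client.result().toString();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The point of the sketch is only the ordering: the object is created first, 
returned to the client, and the attribute event is then applied to it, 
rather than being collected before the element exists.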

Sending the DTD across is the one exception in terms of the ESIS analogy, 
because it is not a part of the event flow.  It needs to be redescribed 
as soon as it is available.  For Aelfred, the DTD is sent to the builder 
at the 'doctypeDecl' which looks like this at the moment [We are in the 

    public void doctypeDecl (XmlParser p, String name, String pubid, String sysid)

        Enumeration enum = p.declaredElements();
        while (enum.hasMoreElements()) {
            String elementName = (String) enum.nextElement();


Conceptually, the recipe for a DocumentType looks like:
    <Document (
        <DocumentType externalSubset=(
                name = "Period"
                contentModel = 
                        connector  = <OR>
                        occurrence = <REP>
                        tokens     = (
                            <ElementToken name="start">
                            <ElementToken name="end">
or in an XML-Recipe form it would look like:
            <ElementDefinition name="Period">
                  <ElementToken name="start"/>
                  <ElementToken name="end"/>
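
To make the 'redescribing' step more concrete, the following is a minimal 
sketch in present-day Java of turning a digested content model into an 
ElementDefinition-like object.  It assumes the parser can hand back the 
content model as a string such as "(start|end)*" and it only handles one 
flat group; the class shapes, the enum values, and the redescribe method 
are illustrative assumptions, not the actual DOM or MONDO classes.

```java
import java.util.ArrayList;
import java.util.List;

final class DtdRedescribeSketch {
    enum Connector { OR, SEQ }
    enum Occurrence { ONCE, OPT, PLUS, REP }

    /** Hypothetical model group: connector, occurrence, and token names. */
    static final class ModelGroup {
        final Connector connector;
        final Occurrence occurrence;
        final List<String> tokenNames;

        ModelGroup(Connector c, Occurrence o, List<String> names) {
            this.connector = c;
            this.occurrence = o;
            this.tokenNames = names;
        }
    }

    /** Hypothetical element definition holding one content model. */
    static final class ElementDefinition {
        final String name;
        final ModelGroup contentModel;

        ElementDefinition(String name, ModelGroup contentModel) {
            this.name = name;
            this.contentModel = contentModel;
        }

        String describe() {
            return name + ": " + contentModel.connector + " "
                 + contentModel.occurrence + " " + contentModel.tokenNames;
        }
    }

    /** Redescribes one flat group such as "(start|end)*"; no nesting. */
    static ElementDefinition redescribe(String name, String model) {
        String m = model.trim();
        Occurrence occ = Occurrence.ONCE;
        char last = m.charAt(m.length() - 1);
        if (last == '*') occ = Occurrence.REP;
        else if (last == '+') occ = Occurrence.PLUS;
        else if (last == '?') occ = Occurrence.OPT;
        if (occ != Occurrence.ONCE) m = m.substring(0, m.length() - 1);

        m = m.substring(1, m.length() - 1);               // drop the parentheses
        Connector conn = m.indexOf('|') >= 0 ? Connector.OR : Connector.SEQ;

        List<String> names = new ArrayList<>();
        for (String part : m.split("[|,]")) names.add(part.trim());
        return new ElementDefinition(name, new ModelGroup(conn, occ, names));
    }

    public static void main(String[] args) {
        System.out.println(redescribe("Period", "(start|end)*").describe());
    }
}
```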

The DTD recipe and the normal Element content recipes are shipped to the 
ObjectBuilder which has the necessary factories to build objects from the 
recipe.  For the DTD recipes it builds pre-known and very specific 
classes: "Document", "ElementDefinition", "ModelGroup", etc.  For the 
Element content the ObjectBuilder currently builds a generic Element 
hierarchy.  The construction process for both the DTD and the Elements 
can be easily (and almost arbitrarily) changed.  The two semi-constants 
are the DOM recipes, which are encoded into the DOM-oriented BuilderClient, 
and the source document itself.  It is also easy to turn on and off the 
DTD generation in the BuilderClient, and the result of a document without 
a DTD is a DOM Document object with a null DocumentType.
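
One way to picture how the construction process can be changed is a 
builder that looks up a factory per tag name and falls back to a generic 
Element.  The sketch below is in present-day Java and is only an 
assumption about the general shape; this ObjectBuilder, its register and 
build methods, and the Period class are hypothetical and are not the 
MONDO API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

final class FactorySwapSketch {
    /** Generic element, used when no tag-specific factory is registered. */
    static class Element {
        final String tagName;

        Element(String tagName) { this.tagName = tagName; }

        String describe() { return "Element(" + tagName + ")"; }
    }

    /** Hypothetical "Tag"-specific class for <Period>. */
    static class Period extends Element {
        Period() { super("Period"); }

        @Override String describe() { return "Period object"; }
    }

    /** Hypothetical builder with swappable per-tag factories. */
    static final class ObjectBuilder {
        private final Map<String, Function<String, Element>> factories = new HashMap<>();

        void register(String tag, Function<String, Element> factory) {
            factories.put(tag, factory);
        }

        Element build(String tag) {
            // Fall back to the generic Element when nothing is registered.
            return factories.getOrDefault(tag, Element::new).apply(tag);
        }
    }

    static String demo() {
        ObjectBuilder builder = new ObjectBuilder();
        builder.register("Period", tag -> new Period());
        return builder.build("Period").describe() + ", "
             + builder.build("start").describe();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The recipe coming from the BuilderClient stays the same in both cases; 
only the factory chosen for a given name differs.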

SIZE and Other Info

The total amount of MONDO-oriented DOM Building code is about 10K.  This 
is divided into 6 factories for the enumerated types, 1 factory for 
Document, and 1 main builder.  The rest of the DOM was done with a Bean 
Factory.  The BuilderClient is another 9K for a stack-based version 
(Aelfred) and a bit less for an object-based version (MS-XML).  
BuilderClients are pretty easy to write, about two hours or so for me, 
but I haven't gotten around to the other parsers yet. 
    MONDO itself is a bit large (~100K, plus it requires a ~100K general 
library), but I am trying to produce a version (mindo) that only includes 
what is needed for this type of task, which may be 50K for mindo and 
40-60K for the general library.

The DOM interfaces are about 10K and the skeleton classes are 16K.  The 
classes only serve the purpose of construction and printing (i.e. 
dumping).  More interesting classes would be quite a bit larger.

[*] Note that I modified the DOM interfaces to: (1) fix what I thought 
were bugs or deprecated behavior, (2) provide some extra services (e.g. 
Integer objects for the 'int's), (3) collapse specific types into more 
generic Map and List collections, and (4) add a naming convention (i.e. 
suffixing an interface which only has constants in it with 'Constants').  
Changing things back into the original form should be easy (I have the 
originals from the spec also) and should have little significance to the 
rest of the process.


For more information on MONDO see
Part of the design document is in HTML now and for this particular topic 
(XML->DOM Objects), you might want to look at Chapters 2&4 at:

mark.fussell at

