Java DOM ObjectBuilder
Mark L. Fussell
fussellm at alumni.caltech.edu
Fri Dec 12 14:22:25 GMT 1997
I have a first pass at an ObjectBuilder that generates objects based on
the W3C Java DOM Interfaces[*]. Any XML parser that has a BuilderClient
[currently MS-XML and Aelfred] can therefore generate DOM objects, including
the Model information itself. It is also easy to modify both the objects and
the construction process to differ from the DOM-specific ones (e.g.
"Tag"-specific objects instead of generic Elements). This applies to the
DTD objects (use a different Node, ElementDefinition, or any other
interface/class) as well as the normal Element content.
If enough people are interested I will try to make a specific release of
this code and the minimum amount of MONDO that is needed to make it work
(see below for size information), otherwise I will include it as an
example in the next MONDO release. The rest of this mail just discusses
the details a bit more.
-------------------------------------------------------
The DOM ObjectBuilding process can generally be 1-pass (direct) from the
parser, except for the DTD, which parsers digest first and which must then
be 'redescribed' to the builder. For Aelfred, it looks something like this:
           +---------------+
XmlParser->| XmlProcessor  |-->DOMObBuilder->SpecificFactory->>DOMObject
           | BuilderClient |^                or BeanFactory      v
           +---------------+ \<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<</
Where the '->>' indicates the Factory actually creates an object (at
least conceptually) and the '<<<' is a return line for that object to be
used in the subsequent recipe. Generally the arrows to the right are the
response to an ESIS type of event, but the ordering for building is
sometimes a little different (attribute processing occurs inside an
object's context, not before). You can think of Recipes as a more
general ESIS event model with a feedback loop.
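To make the "events plus feedback loop" idea a bit more concrete, here is a minimal sketch. The four event names (startObject, startParameter, finishParameter, finishObject) are the ones used in the doctypeDecl code in this message; everything else (SketchBuilder, SketchFactory, the value method, the Frame bookkeeping) is hypothetical and is not MONDO's actual API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical factory: turns one finished recipe into an object. */
interface SketchFactory {
    Object create(String recipeName, Map<String, List<Object>> parameters);
}

/** Hypothetical builder: receives recipe events and feeds built objects back upward. */
class SketchBuilder {
    /** One object that is still being described. */
    private static final class Frame {
        final String recipeName;
        final Map<String, List<Object>> parameters = new LinkedHashMap<>();
        String openParameter;
        Frame(String recipeName) { this.recipeName = recipeName; }
    }

    private final Deque<Frame> stack = new ArrayDeque<>();
    private final SketchFactory factory;
    private Object result;

    SketchBuilder(SketchFactory factory) { this.factory = factory; }

    void startObject(String recipeName) { stack.push(new Frame(recipeName)); }

    void startParameter(String name) {
        Frame top = stack.peek();
        top.openParameter = name;
        top.parameters.computeIfAbsent(name, k -> new ArrayList<>());
    }

    /** Simple values (e.g. attributes) are described inside the object's context. */
    void value(Object v) { addToOpenParameter(stack.peek(), v); }

    void finishParameter() { stack.peek().openParameter = null; }

    void finishObject() {
        Frame done = stack.pop();
        // The '->>' step: the factory actually creates the object.
        Object built = factory.create(done.recipeName, done.parameters);
        if (stack.isEmpty()) {
            result = built;
        } else {
            // The '<<<' step: the new object is returned for use in the enclosing recipe.
            addToOpenParameter(stack.peek(), built);
        }
    }

    Object result() { return result; }

    private static void addToOpenParameter(Frame frame, Object v) {
        if (frame.openParameter == null) {
            throw new IllegalStateException("no open parameter");
        }
        frame.parameters.get(frame.openParameter).add(v);
    }
}
```

The point of the stack is only that a finished object lands in whatever parameter of its parent is currently open, which is the feedback line in the diagram.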
Sending the DTD across is the one exception in terms of the ESIS analogy,
because it is not a part of the event flow. It needs to be redescribed
as soon as it is available. For Aelfred, the DTD is sent to the builder
in the 'doctypeDecl' callback, which looks like this at the moment [we are
in the XmlProcessor/BuilderClient]:
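    // Needs: import java.util.Enumeration;
    public void doctypeDecl (XmlParser p, String name, String pubid,
                             String sysid)
    {
        // Open the DocumentType recipe and its "externalSubset" parameter.
        this.startObject(DOM_DOCUMENT_TYPE_RECIPE);
        this.startParameter("externalSubset");
        // Redescribe each element declaration the parser has digested.
        // (The variable is 'names' because 'enum' is reserved in later Java.)
        Enumeration names = p.declaredElements();
        while (names.hasMoreElements()) {
            String elementName = (String) names.nextElement();
            buildObjectForElementDefNamed_in(elementName,p);
        }
        this.finishParameter();
        this.finishObject();
    }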
Conceptually, the recipe for a DocumentType looks like:
-----------------
<Document (
    <DocumentType externalSubset=(
        <ElementDefinition
            name = "Period"
            contentModel =
                <ModelGroup
                    connector = <OR>
                    occurrence = <REP>
                    tokens = (
                        <ElementToken name="start">
                        <ElementToken name="end">
                    )
                >
        >
        ...
-----------------
or in an XML-Recipe form it would look like:
-----------------
<Document>
  <DocumentType>
    <externalSubset>
      <ElementDefinition name="Period">
        <contentModel><ModelGroup>
          <connector><OR/></connector>
          <occurrence><REP/></occurrence>
          <tokens>
            <ElementToken name="start"/>
            <ElementToken name="end"/>
            ...
-----------------
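For anyone who wants to play with the XML-Recipe form, here is a small sketch that holds a recipe in memory and prints it in roughly that shape. RecipeNode and its methods are hypothetical names made up for this illustration (they are not MONDO classes), the sketch only covers the ElementDefinition fragment, and it does no XML escaping.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical recipe node: a type name, simple attributes, and nested object parameters. */
class RecipeNode {
    final String type;
    final Map<String, String> attributes = new LinkedHashMap<>();
    final Map<String, List<RecipeNode>> parameters = new LinkedHashMap<>();

    RecipeNode(String type) { this.type = type; }

    /** Adds a simple (string-valued) attribute, e.g. name="Period". */
    RecipeNode attr(String name, String value) {
        attributes.put(name, value);
        return this;
    }

    /** Adds one or more nested recipes under the named parameter. */
    RecipeNode param(String name, RecipeNode... values) {
        Collections.addAll(parameters.computeIfAbsent(name, k -> new ArrayList<>()), values);
        return this;
    }

    /** Renders this recipe as XML; attribute values are not escaped in this sketch. */
    String toXml() {
        StringBuilder out = new StringBuilder();
        write(out, "");
        return out.toString();
    }

    private void write(StringBuilder out, String indent) {
        out.append(indent).append('<').append(type);
        attributes.forEach((k, v) -> out.append(' ').append(k).append("=\"").append(v).append('"'));
        if (parameters.isEmpty()) {
            out.append("/>\n");
            return;
        }
        out.append(">\n");
        for (Map.Entry<String, List<RecipeNode>> p : parameters.entrySet()) {
            out.append(indent).append("  <").append(p.getKey()).append(">\n");
            for (RecipeNode child : p.getValue()) {
                child.write(out, indent + "    ");
            }
            out.append(indent).append("  </").append(p.getKey()).append(">\n");
        }
        out.append(indent).append("</").append(type).append(">\n");
    }
}

class RecipeDemo {
    /** Builds the "Period" ElementDefinition fragment from the recipe above. */
    static RecipeNode periodDefinition() {
        return new RecipeNode("ElementDefinition").attr("name", "Period")
            .param("contentModel", new RecipeNode("ModelGroup")
                .param("connector", new RecipeNode("OR"))
                .param("occurrence", new RecipeNode("REP"))
                .param("tokens",
                    new RecipeNode("ElementToken").attr("name", "start"),
                    new RecipeNode("ElementToken").attr("name", "end")));
    }

    public static void main(String[] args) {
        System.out.print(periodDefinition().toXml());
    }
}
```

The line breaks come out slightly differently from the hand-written recipe (every object gets its own line), but the nesting of objects and parameters is the same.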
The DTD recipe and the normal Element content recipes are shipped to the
ObjectBuilder which has the necessary factories to build objects from the
recipe. For the DTD recipes it builds pre-known and very specific
classes: "Document", "ElementDefinition", "ModelGroup", etc. For the
Element content the ObjectBuilder currently builds a generic Element
hierarchy. The construction process for both the DTD and the Elements
can be easily (and almost arbitrarily) changed. The two semi-constants
are the DOM recipes, which are encoded into the DOM-oriented BuilderClient,
and the source document itself. It is also easy to turn DTD generation
on and off in the BuilderClient; the result for a document without
a DTD is a DOM Document object with a null DocumentType.
SIZE and Other Info
===================
The total amount of MONDO-oriented DOM Building code is about 10K. This
is divided into 6 factories for the enumerated types, 1 factory for
Document, and 1 main builder. The rest of the DOM was done with a Bean
Factory. The BuilderClient is another 9K for a stack-based version
(Aelfred) and a bit less for an object-based version (MS-XML).
BuilderClients are pretty easy to write, about two hours or so for me,
but I haven't gotten around to the other parsers yet.
MONDO itself is a bit large (~100K, plus a ~100K general library that it
requires), but I am trying to produce a version (mindo) that includes only
what is needed for this type of task; that may come to 50K for mindo and
40-60K for the general library.
The DOM interfaces are about 10K and the skeleton classes are 16K. The
classes only serve the purpose of construction and printing (i.e.
dumping). More interesting classes would be quite a bit larger.
[*] Note that I modified the DOM interfaces to: (1) fix what I thought
were bugs or deprecated behavior, (2) provide some extra services (e.g.
Integer objects for the 'int's), (3) collapse specific types into more
generic Map and List collections, and (4) add a naming convention (i.e.
suffixing an interface which only has constants in it with 'Constants').
Changing things back into the original form should be easy (I also have
the originals from the spec) and should have little effect on the
rest of the process.
==========================================
For more information on MONDO see
http://www.chimu.com/projects/mondo
Part of the design document is in HTML now and for this particular topic
(XML->DOM Objects), you might want to look at Chapters 2&4 at:
http://www.chimu.com/projects/mondo/design/part0002.html
http://www.chimu.com/projects/mondo/design/part0004.html
--Mark
mark.fussell at chimu.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)