Content roles in XML

Eve L. Maler elm at arbortext.com
Thu Jan 22 19:32:24 GMT 1998


At 01:50 PM 1/22/98 -0500, Jim Amsden wrote:
>I'm very new to XML, and in developing a DTD for JAR files, JavaBeans, and
>Rational Rose petal files, I experienced a recurring problem. The EventSet
>element of the JavaBeans DTD is exemplary. Here's a fragment of the JavaBeans
>DTD I came up with:
>
><!ELEMENT EventSet (Annotation*, Method, Method, Method+)>
><!ATTLIST EventSet
>  %FeatureDescriptor;
>
>  listenerType CDATA #REQUIRED
>  isInDefaultEventSet (true | false) "false"
>  isUnicast (true | false) "false"
>>

Even before I continued reading, my reaction to the Method, Method, Method+
part of your content model was that "There have got to be some additional
semantics here"...

>The content of an Event set includes two required methods, and a collection of
>other methods. In the DTD, there's no way that I know of to indicate the roles
>these methods play in the EventSet. I would like to say something like:
>
><!ELEMENT EventSet (Annotation*, addListenerMethod, removeListenerMethod,
>eventMethod+)>
><!ATTLIST EventSet
>  %FeatureDescriptor;
>
>  listenerType CDATA #REQUIRED
>  isInDefaultEventSet (true | false) "false"
>  isUnicast (true | false) "false"
>>
>
>where addListenerMethod, removeListenerMethod, and eventMethod are all Method
>elements. This more clearly describes the content of an EventSet and avoids
>using positioning only to capture the meaning of element content.

I believe this is the ideal solution.  It doesn't matter if
addListenerMethod, removeListenerMethod, and eventMethod all share
identical content models and even attribute lists; they are obviously still
three different-enough things to get their own element types.

(Note that if you just had Method with an attribute indicating which of the
three types any one element is, you'd get the same processing power but not
the same validation power -- that is, you couldn't use the DTD to check
that one add-listener method, one remove-listener method, and at least one
event method are present, in that order.)
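
To make the lost validation concrete, here is a minimal Python sketch of the check an application would have to do for itself under the attribute approach. The "role" and "name" attributes, the role values, and the listenerType value are all assumptions made up for illustration; none of them come from the DTD quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance: a single Method element type whose assumed "role"
# attribute says which of the three kinds it is. The remove-listener method
# is missing, yet (Annotation*, Method, Method, Method+) is still satisfied.
DOC = """
<EventSet listenerType="java.awt.event.ActionListener">
  <Method role="add" name="addActionListener"/>
  <Method role="event" name="actionPerformed"/>
  <Method role="event" name="actionCancelled"/>
</EventSet>
"""

REQUIRED_ROLES = ("add", "remove", "event")


def missing_roles(event_set):
    """Return the required roles that no Method child declares."""
    present = {method.get("role") for method in event_set.findall("Method")}
    return [role for role in REQUIRED_ROLES if role not in present]


event_set = ET.fromstring(DOC)
print(missing_roles(event_set))  # ['remove']
```

With three distinct element types, a validating parser would reject this document before the application ever saw it.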

>I could use
>parameter entities to achieve this effect as in:
>
><!ENTITY % addListenerMethod "Method">
><!ENTITY % removeListenerMethod "Method">
><!ENTITY % eventMethod "Method">
>
><!ELEMENT EventSet (Annotation*, %addListenerMethod;, %removeListenerMethod;,
>%eventMethod;+)>
><!ATTLIST EventSet
>  %FeatureDescriptor;
>
>  listenerType CDATA #REQUIRED
>  isInDefaultEventSet (true | false) "false"
>  isUnicast (true | false) "false"
>>

Using parameter entities would clarify for you, the DTD writer and
maintainer, what's going on.  However, it doesn't expose the semantics to
application software: the parser expands the references, so the content
model and the document instance both contain only Method.  Parameter
entities can be thought of as just "macros," as far as your purpose is
concerned.  So having three different element types, while seemingly
similar to the parameter entity solution, is radically more powerful.
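
A rough Python sketch of what application code sees under the parameter entity version: the entity names are gone, and the roles have to be recovered from position. The instance document, the "name" attribute on Method, and the listenerType value are assumed for illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance valid against the parameter-entity version of the
# DTD. The "name" attribute on Method is assumed for illustration.
DOC = """
<EventSet listenerType="java.awt.event.ActionListener">
  <Method name="addActionListener"/>
  <Method name="removeActionListener"/>
  <Method name="actionPerformed"/>
</EventSet>
"""

event_set = ET.fromstring(DOC)

# Only the element type name Method is visible to the application.
child_tags = sorted({child.tag for child in event_set})

# The roles have to be recovered from position alone.
add_listener, remove_listener, *event_methods = event_set.findall("Method")

print(child_tags)
print(add_listener.get("name"), remove_listener.get("name"))
```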

>Is this reasonable? Good XML DTD style? Not too much of a runtime overhead? A
>common practice? Note that this probably wouldn't help with the parsed XML as
>there would be a Method element for each method. You couldn't ask an EventSet
>element for its addListenerMethod content like you could ask it for its
>isUnicast attribute. You'd have to know to get the first Method in the content.
>Of course an extensible parser with factory methods for constructing parse tree
>nodes could hide the position dependence and provide more meaningful accessors.
>
>I guess what I'm looking for is a way to capture (using UML terms) the
>association roles between the EventSet Class and the Method Class. There are 3
>associations between these two classes, and I need a way to distinguish them.
>
>Anyone have any other ideas? Has anyone else experienced this situation?

This is a common problem in SGML/XML modeling.  DTD designers are often
reluctant to invent new element types if the structure would be identical
to element types the designer has already "bought."  However, I believe
this is false economy.  Your application software will still have to treat
the first Method, the second Method, and the third-through-nth Methods
differently, so it sure smells like you've got three different things. :-)
By creating unique element types, you expose the meaning to both software
and humans.
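
As an illustration of what that exposure buys the application, here is a small Python sketch that asks an EventSet for its parts by element type name rather than by position. The instance document, the "name" attributes, and the listenerType value are assumptions; only the three element type names come from the proposal above.

```python
import xml.etree.ElementTree as ET

# Hypothetical instance using the three distinct element types. The "name"
# attribute is assumed for illustration.
DOC = """
<EventSet listenerType="java.awt.event.ActionListener">
  <addListenerMethod name="addActionListener"/>
  <removeListenerMethod name="removeActionListener"/>
  <eventMethod name="actionPerformed"/>
</EventSet>
"""

event_set = ET.fromstring(DOC)

# Each role is looked up by its own element type name, not by its position.
add_listener = event_set.find("addListenerMethod")
remove_listener = event_set.find("removeListenerMethod")
event_methods = event_set.findall("eventMethod")

print(add_listener.get("name"))
print(remove_listener.get("name"))
print([method.get("name") for method in event_methods])
```

The lookups keep working if the content model is later reordered or gains new children, which is not true of code that counts Method elements.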

This isn't to say that it's not useful to use context to treat an element
differently; deciding these points is a matter of "feel" sometimes.  But,
overall, I'd rather use parentage than linear-order context to determine
fundamental processing of elements.

	Eve

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



