Is it OK to rely on invalidity?

Joshua E. Smith jesmith at
Tue May 4 17:39:56 BST 1999

Are my questions getting hard enough yet?  I wouldn't want you all to be
bored! ;)

Suppose you were using XML to define an object-oriented programming
language.  The language has an object model behind it which is chock full
o' classes.  For example, there's a class "color" with attributes "red",
"blue" and "green".  So in my XML-ish programming language, I can state:

<color name="my_color" red="128" blue="255" />

to make an instance.  Presumably, this would be backed up by a DTD like:

<!ATTLIST color
          name ID #IMPLIED
          red NMTOKEN "0"
          green NMTOKEN "0"
          blue NMTOKEN "0" >

Now suppose that I am going to allow the users of my language to create
their own classes.  A user-defined class would specify the member objects
which are to be instanced when the class is instanced.  For example,

<class name="color-parts" parameters="r g b" >
  <color name="whole-color" red="r" blue="b" green="g" />
  <color name="red-part" red="r" />
  <color name="blue-part" blue="b" />
  <color name="green-part" green="g" />
</class>

Now wouldn't it be cool if my user could then instance his own class like

<color-parts name="my-color" r="128" g="0" b="255" />

and my application would then make all those constituent members?

I think so, but to pull that off, my user would have to extend the DTD to
declare his class.  Otherwise, while being well-formed, this program is not
valid.  To make it valid, the programmer would have to extend the DTD with
a lot of gobbledygook I bet the programmer won't really get (I'm assuming
that the XML community will be split between those who write DTDs and those
who use them).
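
To make the "gobbledygook" concrete, here is a rough sketch of what such
an extension might look like.  It rests on assumptions that are not in my
language as described: a root element called "program", an external DTD
file called "language.dtd", and a parameter entity "user-classes" that the
DTD author left in as a hook.  Without a hook like that, the user cannot
widen the content model from the internal subset at all, since an element
type may only be declared once.

```xml
<!-- language.dtd: hypothetical external subset shipped with the language.
     Declarations for color and class are omitted here. -->
<!ENTITY % user-classes "">
<!ELEMENT program (color | class %user-classes;)*>

<!-- The user's program.  The internal subset is read before the external
     one, so this declaration of user-classes wins over the empty default. -->
<!DOCTYPE program SYSTEM "language.dtd" [
  <!ENTITY % user-classes "| color-parts">
  <!ELEMENT color-parts EMPTY>
  <!ATTLIST color-parts
            name ID      #IMPLIED
            r    NMTOKEN #REQUIRED
            g    NMTOKEN #REQUIRED
            b    NMTOKEN #REQUIRED >
]>
<program>
  <color-parts name="my-color" r="128" g="0" b="255" />
</program>
```

Every new class means another entry in the entity plus another pair of
declarations, all written by the programmer rather than the DTD author.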

The alternative, which doesn't require DTD extension, would be to instance
user-defined classes using the slightly uglier mechanism:

<instance class="color-parts" name="my-color" parameters="128 0 255" />

Now instances of user-defined classes look different from built-in classes,
which is almost never the case in OO languages.
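
For comparison, the generic form needs only one fixed declaration in the
DTD, something like the sketch below.  The attribute types are my guess at
reasonable choices, not a settled design; NMTOKENS happens to accept a
list like "128 0 255".

```xml
<!-- One declaration covers every user-defined class. -->
<!ELEMENT instance EMPTY>
<!ATTLIST instance
          class      NMTOKEN  #REQUIRED
          name       ID       #IMPLIED
          parameters NMTOKENS #IMPLIED >
```

The price is that the parser can no longer check anything class-specific:
whether the named class exists, or whether the right number of parameters
was supplied, is left entirely to the application.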

Another problem which pops up is making references to the generated
objects.  If I want to refer to one of them in some other object's
attributes, I cannot use an IDREF and still be valid (since, as far as the
XML parser is concerned, there is no such object -- my-color and red-part
are two different things).

Instead, I'd have to use an NMTOKEN, which sucks because in most cases the
attribute really is an IDREF, and I bet some cool editors are going to be
able to show those inter-object relationships graphically to the user.
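
A small sketch of the IDREF problem, with made-up names throughout: the
"program" root, the "pen" element, and the dotted name
"my-color.red-part" for a generated object are all hypothetical, used
only to show where validation breaks.

```xml
<?xml version="1.0"?>
<!DOCTYPE program [
  <!ELEMENT program (color | pen)*>
  <!ELEMENT color EMPTY>
  <!ATTLIST color
            name  ID      #IMPLIED
            red   NMTOKEN "0"
            green NMTOKEN "0"
            blue  NMTOKEN "0" >
  <!ELEMENT pen EMPTY>
  <!ATTLIST pen
            color IDREF #REQUIRED >
]>
<program>
  <color name="my_color" red="128" blue="255" />
  <!-- Valid: an element in this document carries the ID my_color. -->
  <pen color="my_color" />
  <!-- Invalid: no element carries this ID; the object would exist only
       once the application has expanded the class. -->
  <pen color="my-color.red-part" />
</program>
```

Declaring the attribute as NMTOKEN instead would make both references
pass, but then the parser checks nothing about the target.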

So the question: is it OK to rely on invalidity?  If I just let validity
slide, I can provide a nicer interface for user-defined classes.  What do I
lose?  Will a validating editor barf if it sees a just-well-formed element?
Or will it just gracefully mark the element as being suspect (which I
think the user of the language can handle, since it is a user-defined class)?


Is there a way to get the slick syntax I want without giving up validity?

-Joshua Smith

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: and on CD-ROM/ISBN 981-02-3594-1