XML-Data: advantages over DTD syntax?
cbullard at hiwaay.net
Wed Oct 1 05:03:04 BST 1997
Murray Altheim wrote:
> I've been (honestly) trying to give this XML markup schema idea a fair shake,
> and while I wouldn't enjoy writing the DTD (based on the examples, I think
> the syntax is verbose and ugly) I do think this does come down to an
> advantage, at least as far as my own XML parser experiments go.
I agree. It isn't that a single parser processes it; it is what
can be done with it. UP FRONT: DTDs or Schemas. I think these
are different animals for different kinds of design. Past experiences
with design efforts like HyTime, MID, etc. left me, and still
leave me, puzzling about what should be defined in the markup
language(s) and what is best done in the objects.
> My parser builds a Java Vector array of what I call 'ContentObjects', and
> I've enumerated types for the various types of content objects. Using the
> same parser, I would simply add several more enumerated types (for element
> declaration, attlist declaration, etc.) to the list and let the thing
> attack a DTD'ed document instance. That would be obviously easier than
> writing the parser to understand SGML markup declarations.
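A minimal sketch of the approach described above, written in present-day Java
rather than the Java of 1997. Only the name 'ContentObjects' and the use of a
Vector come from the quoted text; the enum constants, the record, and the
helper methods are hypothetical names chosen for illustration. The point is
that declarations become two more enumerated kinds in the same list, so no
separate reader for SGML markup declarations is needed.

```java
import java.util.Vector;

// Hypothetical sketch: only "ContentObjects" and the Vector come from the
// message; every other name here is an assumption made for illustration.
class ContentObjectsDemo {

    // Enumerated kinds of content object. The last two are the extra kinds
    // one would add so that declarations arrive as ordinary markup.
    enum Kind { START_TAG, END_TAG, TEXT, ELEMENT_DECL, ATTLIST_DECL }

    // One parsed item: its kind plus the name or character data it carries.
    record ContentObject(Kind kind, String value) { }

    // Stand-in for parser output on a tiny document whose declarations
    // are written in instance syntax ahead of the content.
    static Vector<ContentObject> sample() {
        Vector<ContentObject> v = new Vector<>();
        v.add(new ContentObject(Kind.ELEMENT_DECL, "book"));
        v.add(new ContentObject(Kind.ATTLIST_DECL, "book"));
        v.add(new ContentObject(Kind.START_TAG, "book"));
        v.add(new ContentObject(Kind.TEXT, "Some text"));
        v.add(new ContentObject(Kind.END_TAG, "book"));
        return v;
    }

    // Count the objects that are declarations rather than document content.
    static long countDeclarations(Vector<ContentObject> objects) {
        return objects.stream()
                .filter(o -> o.kind() == Kind.ELEMENT_DECL
                          || o.kind() == Kind.ATTLIST_DECL)
                .count();
    }

    public static void main(String[] args) {
        Vector<ContentObject> objects = sample();
        for (ContentObject o : objects) {
            System.out.println(o.kind() + " " + o.value());
        }
        System.out.println("declarations: " + countDeclarations(objects));
    }
}
```

Everything after the parse treats declarations and content uniformly; working
out the document model from the declaration objects is a separate step that
this sketch does not attempt.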
Ok. Having never written a parser, I believe you. Creating objects
using the XML/SGML markup as an interpreted source for properties
is what XML should *standardize* IMO. What would be very interesting
to hear is opinions on how much and which parts of the document
framework properties should be expressible in the XML, and what parts
should be in other notations. For example, we have to look at how
scripting is to be done since, despite SGML's resistance to procedural
languages, internal scripting is a part of the modern document
Of course, the contenders are ECMAScript and Java (IMHO) because
other notations within the framework support those languages as internal
> But from that point on, figuring out how the document model is structured
> seems pretty much the same, just a different approach to getting the
> declarations into the Vector array. I don't see any other particular
> advantages to the syntax, and as I said earlier, it seems harder to read
> (to me) and certainly more verbose.
Also agreed. I guess I have problems with the idea of using
the instance syntax because I think of that as data (old SGML habit)
and I think of a DTD as expressing automata. I understand how
they have adapted that as attributes, but I don't like that model.
To me, the element types are active. I am comfortable with
the current DTD syntax.
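
To make the "DTD as automata" view concrete: a content model built from
sequence, choice and the ?, * and + occurrence indicators is a regular
expression over child element names. The Java sketch below is an illustration
under stated assumptions, not any parser's actual method. It rewrites a
content model into a java.util.regex pattern and lets the regex engine stand
in for the automaton. The class and method names are hypothetical, and
#PCDATA, mixed content and the SGML & connector are not handled.

```java
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical sketch: treats a DTD content model as a regular expression
// over the sequence of child element names.
class ContentModelDemo {

    // Translate a content model such as "(title, author+, chapter*)" into a
    // regex in which every element name is one space-terminated token.
    static Pattern compile(String model) {
        String regex = model
                .replaceAll("\\s+", "")                       // drop whitespace
                .replaceAll("[A-Za-z_][\\w.-]*", "(?:$0 )")   // name -> token group
                .replace(",", "");                            // sequence = concatenation
        return Pattern.compile(regex);
    }

    // Check whether a list of child element names satisfies the model.
    static boolean isValid(Pattern model, List<String> children) {
        StringBuilder tokens = new StringBuilder();
        for (String child : children) {
            tokens.append(child).append(' ');
        }
        return model.matcher(tokens).matches();
    }

    public static void main(String[] args) {
        Pattern book = compile("(title, author+, chapter*)");
        System.out.println(isValid(book, List.of("title", "author", "chapter")));
        System.out.println(isValid(book, List.of("author", "title")));
    }
}
```

The choice connector | and the occurrence indicators pass through unchanged
because they mean the same thing in both notations, which is the sense in
which the element type declaration is "active" rather than plain data.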
> By and large though I agree with you. DTDs are hard enough to read now;
> adding all that extra markup cruft seems a step backward. It requires the
> reader to compose the content model in their head based on interpretation
> of the schema markup, which relies a great deal on whitespace (!) IMO.
That's the difference, perhaps. In some sense, the schema approach
is an exercise in entity/attribute modeling a la a relational
A *mythical* SGML designer sees a document from which he is abstracting
and to which he is adding structure in terms of Type/attribute(s)OfType
that occur at some frequency and in some order. While some have
certainly been able to express that relationally, it isn't the
conceptual model I prefer for human-digestible text.
That said, if it enables a syntax better suited to object-oriented
use, then these are two ways to create markup for the
same information. I have no problems with that.
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)