Proposal Critique - XML DTDs to XML docs
Simon St.Laurent
SimonStL at classic.msn.com
Fri May 22 02:27:18 BST 1998
>There is a reason that we usually choose not to have circular
>specifications. First, reading and writing them is often a pain. Second,
>the two become interdependent.
Yes, there is the 'ingrown toenail' metaphor for standards that rely on each
other too closely and turn into a mess.
> [re: using href declarations for references]
>I thought that you wanted to use XLink and XPointer?
Of course I want to use XLink and XPointer. The href declaration is the
tiniest piece of the XLink standard, and seems fairly well established, if not
indeed set in stone. I'd be happy to use the full XLink spec, but realize
that not everyone needs it. Fine. Make href a part of the 'Level 1' spec and
pray that XLink doesn't migrate to entirely different terminology. It's no
worse than SYSTEM and PUBLIC are now, certainly.
>What would the rules be? What would extensions be allowed to do and not
>do?
For now, because this is simply a 'representation', I expected the same rules
to hold for these DTDs with regard to document syntax as apply now. Maybe I
should have written a complete section on behavior; maybe I will.
>I guess I don't understand the difference between adding things and
>changing the fundamental rules of the "level 1" parse. DTDs DO change the
>fundamental rules of the fundamental parse. What could be more fundamental
>than this:
Here we begin to see where the communications breakdown has set in, and maybe
we can unravel it. You see entities as modifying the rules of the 'fundamental
parse'. I see entities as riding along on the rules of the 'fundamental
parse' to make their changes. To me, the basic rules for parsing establish a
syntax for documents, including a set of rules for including entities. Using
an entity is just taking advantage of those rules, _not_ modifying them in any
way. I see the distinction between expanding an entity and including (or
transcluding) information from a link as a minor technical skirmish that
should have been settled long ago, not a major battle over the fundamental
shape of documents.
Maybe that's what I get for working in HyperCard and HTML all these years...
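As a rough illustration of the "riding along" view, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree. The document text and the entity name "signature" are made up for this example. The parser follows the same rules in both cases; the declaration in the internal subset only supplies the replacement text.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the internal DTD subset declares one entity.
DECLARED = """<!DOCTYPE note [
  <!ENTITY signature "Simon St.Laurent">
]>
<note>Regards, &signature;</note>"""

# The same reference with no declaration anywhere.
UNDECLARED = "<note>Regards, &signature;</note>"

# The parser expands the reference using its ordinary entity rules.
note = ET.fromstring(DECLARED)
print(note.text)

# Without the declaration the same rules apply, and the reference is
# simply reported as an error.
try:
    ET.fromstring(UNDECLARED)
    undeclared_failed = False
except ET.ParseError as err:
    undeclared_failed = True
    print("parse error:", err)
```

Nothing about the syntax changed between the two runs; only the set of declared entities did.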
>We could restrict DTD extension to data typing, but that strikes me as a
>step backwards. Verification is going to be (and should be) increasingly
>the job of non-DTD schemata.
>...
>Verification should be handled at a different level and by a different
>piece of software than the parser.
I think this philosophy reflects SGML's heritage in document management.
Developers who'd like to apply XML to other tasks may find this heritage
distracting or indeed disturbing, given the DTD's current lack of
extensibility. It's not hard to imagine database developers who need to use
XML coming up with a really simple schema like:
<Element Name="FirstName" Type="Text" Size="50" />
<Element Name="LastName" Type="Text" Size="50" />
<Element Name="BirthDate" Type="Date" />
Then they could just use a PI to tell their application to check their
well-formed document ("Who the hell needs a DTD anyway? Like who came up with
_that_?") against this schema. Something like:
<?WhoNeedsDTDs simpleschema="http://www.simonstl.com/schema.jnk"?>
This doesn't really do any harm; part of the joy of well-formed documents is
that you can chuck all the rest of the goodies in XML and build it yourself.
Still, to me, this loses a lot. I'd like to see developers use DTDs, and I
think that describing the structure of these documents is important for many
reasons: easier use with editors, easier-built storage systems, and, of
course, error-checking.
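To make the do-it-yourself route concrete, here is a minimal sketch in Python of how an application might check a well-formed document against a schema like the one above. Several things here are assumptions for illustration only: the <Schema> wrapper element, the <Person> sample documents, the reading of Size as a maximum character count, and the YYYY-MM-DD date format. Fetching the schema named in the PI is left out; the schema text is inlined instead.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumed wrapper: the <Element> lines need a single root to be parsed.
SCHEMA = """
<Schema>
  <Element Name="FirstName" Type="Text" Size="50" />
  <Element Name="LastName" Type="Text" Size="50" />
  <Element Name="BirthDate" Type="Date" />
</Schema>
"""

# Made-up sample documents.
GOOD_DOC = """
<Person>
  <FirstName>Jane</FirstName>
  <LastName>Doe</LastName>
  <BirthDate>1970-01-01</BirthDate>
</Person>
"""

BAD_DOC = """
<Person>
  <FirstName>Jane</FirstName>
  <LastName>Doe</LastName>
  <BirthDate>sometime in May</BirthDate>
</Person>
"""


def load_rules(schema_text):
    """Map each declared element name to a (type, max size or None) pair."""
    rules = {}
    for element in ET.fromstring(schema_text).findall("Element"):
        size = element.get("Size")
        rules[element.get("Name")] = (
            element.get("Type"),
            int(size) if size else None,
        )
    return rules


def check(document_text, rules):
    """Return a list of error strings; an empty list means the checks passed."""
    errors = []
    for child in ET.fromstring(document_text):
        if child.tag not in rules:
            errors.append(f"{child.tag}: not declared in the schema")
            continue
        type_name, size = rules[child.tag]
        value = (child.text or "").strip()
        if size is not None and len(value) > size:
            errors.append(f"{child.tag}: longer than {size} characters")
        if type_name == "Date":
            try:
                # Assumed date format; the schema above does not specify one.
                datetime.strptime(value, "%Y-%m-%d")
            except ValueError:
                errors.append(f"{child.tag}: {value!r} is not a YYYY-MM-DD date")
    return errors


rules = load_rules(SCHEMA)
good_errors = check(GOOD_DOC, rules)
bad_errors = check(BAD_DOC, rules)
print(good_errors)
print(bad_errors)
```

Note what the sketch does not check: element order, nesting, and required elements. That is the structural information a DTD already carries, which is why a standalone schema of this kind ends up repeating it.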
Making DTDs extensible in clearly defined ways (and not your <!MY-OWN-ENTITY >
critter) seems like a good way to bring these folks in. By providing a
structure that developers can use to ensure interoperability of their
documents, as well as extend to include data-type verification, I think we'd be
able to keep more developers in the habit of using DTDs.
Which brings us to the core of the issue:
>In other words, I think that we should be reducing the responsibilities of
>the DTD, rather than expanding them. A whole new syntax for a core part
>of the language would make XML much more complicated than it is now.
Right now, the options for including verification on top of the DTD structure
look pretty ugly. Namespaces, schemas, and PIs pile on top of each other to
drive documents into the ground. These sorts of extensions are going to
sprout. I'd like to give them a good place to grow, a single document that
provides a complete picture of a document model's content. Do you really want
stacks of schemas floating around as well as the style sheets, scripts, link
group documents, and the DTD? I don't feel the need to put _everything_ in
one place - style sheets, scripts, and link information seem better managed
outside this framework and don't cause endless repetition of the document
structure.
Does it really make sense to define the DTD once for XML 1.0 validation and
define an entirely separate but redundant structure for data type validation?
If SGML compatibility is your highest aspiration, it certainly may. To me,
it doesn't make sense.
Maybe the XML-Data crew will get their ubercombination to work. I'd rather
start by getting DTDs made extensible and more easily managed first, and then
add the schemas later, without requiring redundant structures. This doesn't
seem like that bizarre a goal.
Simon St.Laurent
Dynamic HTML: A Primer / XML: A Primer / Cookies
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)