XML and (K)Office

David Megginson david at megginson.com
Fri Mar 26 04:07:05 GMT 1999


James Robertson writes:

[on using Namespaces in spreadsheet formats]

 > It:
 > 
 > * Breaks validation. We are no longer able to ensure that the
 >   files we are reading/creating are correct and useful.

DTD validation cannot guarantee that a file is correct or useful; it
can only guarantee that it matches a few BNF-like productions (that's
helpful in itself because it allows some code simplification, but not as
much as some people let on).  DTDs are great for guided authoring, but
that's a different area.

Furthermore, Namespaces itself doesn't break DTD validation -- it's a
different layer.  The possibility of receiving unexpected information
does break validation, but it does so with or without namespaces; with 
namespaces, at least, you can clearly distinguish what's been added.

 > * Still has the variations between applications, so that a reader
 >   of a given format still needs to know 100% about what is that
 >   format.

Not at all -- it can use what it understands and apply simple rules to
the rest (ignore it as in RDF, skip to the top level and process the
children, etc.).
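
As a rough sketch of those two rules (not anyone's actual spreadsheet
format): the namespace URIs, element names, and the read_cells() helper
below are all invented for illustration.  The reader uses the elements
in the namespace it knows and either drops anything else outright or
steps over the unknown wrapper and keeps processing its children.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URI, invented for this sketch.
SHEET_NS = "http://example.org/ns/spreadsheet"

DOC = """\
<s:sheet xmlns:s="http://example.org/ns/spreadsheet"
         xmlns:x="http://example.org/ns/vendor-extension">
  <s:row>
    <s:cell>1</s:cell>
    <x:highlight colour="red">
      <s:cell>2</s:cell>
    </x:highlight>
    <x:note>checked by hand</x:note>
  </s:row>
</s:sheet>
"""


def known(elem):
    """True if the element is in the one namespace this reader understands."""
    return elem.tag.startswith("{" + SHEET_NS + "}")


def local(elem):
    """Local name of an element, with any {namespace} prefix stripped."""
    return elem.tag.rsplit("}", 1)[-1]


def read_cells(elem, skip_unknown_subtree=False):
    """Collect cell values, applying a simple rule to foreign elements."""
    cells = []
    for child in elem:
        if known(child):
            if local(child) == "cell":
                cells.append(child.text)
            else:
                cells.extend(read_cells(child, skip_unknown_subtree))
        elif not skip_unknown_subtree:
            # Rule B: ignore the unknown wrapper, but process its children.
            cells.extend(read_cells(child, skip_unknown_subtree))
        # Rule A (skip_unknown_subtree=True): drop the foreign subtree whole.
    return cells


root = ET.fromstring(DOC)
lenient = read_cells(root)                            # rule B
strict = read_cells(root, skip_unknown_subtree=True)  # rule A
print(lenient, strict)
```

Either way the reader never needs to know what x:highlight or x:note
mean; the namespace alone tells it that they are additions.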

 > Without the rigour of a DTD, we've got nothing.

DTDs may be rigorous or lax, depending on the designer.  Here's a DTD
for spreadsheets:

  <!ELEMENT spreadsheet (#PCDATA)>

Just dump in the comma-delimited file, and escape any XML delimiters.
Now you have a DTD, and you still have nothing.
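
To make that concrete, here is a small sketch using only the Python
standard library; the CSV content is made up, and since the standard
parser is non-validating it only shows that the result is well-formed
and that the "structure" is a single blob of text.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

# Made-up comma-delimited content that happens to contain XML delimiters.
csv_text = 'item,formula\nwidgets,"=IF(A1<B1 & C1>0, 1, 0)"\n'

# Wrap the escaped text in the one-element document type from above.
doc = (
    "<!DOCTYPE spreadsheet [\n"
    "  <!ELEMENT spreadsheet (#PCDATA)>\n"
    "]>\n"
    "<spreadsheet>" + escape(csv_text) + "</spreadsheet>"
)

root = ET.fromstring(doc)

# The document parses, but it has no child elements at all:
# every row and cell is still hidden inside one text node.
print(len(root), repr(root.text))
```

A consumer still has to parse the CSV by hand, so the DTD has bought
nothing beyond a wrapper element.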

 > Particularly since this data may well live long, and is not
 > some transient "sent over the web" data.

That means that the format should be well-documented and validatable;
DTDs can help (and it's nice that they work with off-the-shelf tools),
but they're not worth much by themselves.

 > How will future users make sense of the format without
 > a DTD?

I've written dozens (hundreds?) of DTDs and a book on them, so I'm
quite comfortable saying that a DTD does not guarantee that users can
make sense of a format.  It is helpful in many ways, but good
documentation, examples, sample code, etc. are at least as important.

Would you like to code in C++ based only on the BNF for the language?
Of course not.

Is it possible to code in C++ without ever having seen the BNF (or
whatever they use) in the ANSI spec?  Thousands do, some well and some
poorly.

That said, I think that DTDs are wonderfully useful and will be around
for a long time -- I doubt that any other schema standards that come
out will be nearly so light-weight.


All the best,


David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



