XML trade off 1 - DTD vs XML Schema

Paul Prescod paul at prescod.net
Tue Aug 3 22:29:15 BST 1999


Mark Birbeck wrote:
> 
> 
> <statusReport>
>   <time>1201</time>
>   <station>123</station>
>   <status>56</status>
> </statusReport>
> 
> why bother sending more schema info than the name of the root document
> and the two children that it has? 

Why bother sending any schema information *at all*? If the first system
has the schema and the document, why doesn't it just validate before
passing the information to the second system? If the second system,
conversely, can only deal with information that adheres to certain
rules, then why wouldn't it supply the schema itself? It knows what
those rules are!
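
A rough sketch of that second case, in Python with only the standard library. The names EXPECTED_ROOT, EXPECTED_CHILDREN and accepts() are made up for illustration, and the rules are simply the ones implied by the sample document above. The receiver carries its own rules and checks each incoming document against them, with no schema on the wire:

```python
import xml.etree.ElementTree as ET

# Hypothetical rules the receiving system already knows about its input.
EXPECTED_ROOT = "statusReport"
EXPECTED_CHILDREN = ["time", "station", "status"]

def accepts(document):
    """Return True if the document is well-formed and has the expected shape."""
    try:
        root = ET.fromstring(document)
    except ET.ParseError:
        return False  # not even well-formed XML
    if root.tag != EXPECTED_ROOT:
        return False
    # Compare the ordered list of child element names with the expected one.
    return [child.tag for child in root] == EXPECTED_CHILDREN

sample = (
    "<statusReport><time>1201</time><station>123</station>"
    "<status>56</status></statusReport>"
)
print(accepts(sample))
```

Nothing here depends on what the sender transmits besides the data itself, which is the point of the argument.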

This passing of schemas "at runtime" with the data can only be useful for
something OTHER than validation. The schema must provide some
information that helps in the interpretation of the data. You could have
just put that information IN the data and made it completely
self-describing. Therefore sending a schema to describe data that is
already coming down the pipe is at best a minimization.

Unless I am completely confused, schemas exist to be sent in advance to
be read by humans. These humans use the schemas to build software
without reams and reams of error checking. Any other use for schemas
seems to me to be a mere convenience.

> For this same reason, I
> must say I am surprised that fans of XML can be looking to use non-XML
> syntaxes to define any type of data, unless totally
> unavoidable/impractical.

Definitions of impractical vary. Should URLs be expressed as
<protocol><machine><path><fragment> elements? Should XPaths be
element-based? Should SQL statements? Should Perl code? Where do you
draw the line?

Insofar as a) content models are regular expressions, b) regular
expressions are trivial to parse and c) XML-structured representations
of regular expressions are incredibly verbose, I tend to think that
XML-structured content models ARE impractical.
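
To make points a) and b) concrete, here is a minimal sketch in Python that turns a DTD-style content model into an ordinary regular expression over child element names. The function names content_model_to_regex() and matches() are invented for this example. It handles element content only (names, parentheses, and the `,` `|` `?` `*` `+` operators), not #PCDATA or mixed content, and it does not try to reject malformed models:

```python
import re

def content_model_to_regex(model):
    """Translate a DTD-style content model into a compiled regex.

    Each element name becomes a "name " token (name plus one space), so a
    list of child names can be matched as a single string.
    """
    parts = []
    for token in re.findall(r"[A-Za-z_][\w.-]*|[(),|?*+]", model):
        if token == ",":
            continue  # a sequence is plain concatenation in a regex
        elif token == "(":
            parts.append("(?:")
        elif token in ")|?*+":
            parts.append(token)  # these operators mean the same thing
        else:
            parts.append("(?:" + re.escape(token) + " )")
    return re.compile("".join(parts))

def matches(model, child_names):
    """Check an ordered list of child element names against a content model."""
    text = "".join(name + " " for name in child_names)
    return content_model_to_regex(model).fullmatch(text) is not None

print(matches("(time, station, status?)", ["time", "station", "status"]))
print(matches("(time, station, status?)", ["station", "time"]))
```

The whole translation is a dozen lines because the compact syntax already is a regular expression in all but spelling. An element-based encoding of the same model would need a parser for nested sequence, choice and repetition elements before it could get this far.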

> Finally, no-one has come back on my point from previous emails, that if
> you want to be able to index and manipulate the massive amount of XML
> data that will exist in coming years, often using non-standard schemas,
> you will need to be able to manipulate the meta-meta-data. And what
> better tool to use to define this than good old XML?

XML is not a data manipulation language. What you are really talking
about are XSL, SAX and the DOM. These can be taught to parse non-XML
syntaxes. In fact, they already do. XSL and the DOM can parse and
interpret namespace declarations, for example, and SAX will soon be able
to as well. XSL and the DOM will soon be able to parse XPaths too.
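
As an illustration only, using a present-day toolkit rather than the XSL, SAX or DOM implementations discussed here: Python's xml.etree.ElementTree resolves namespace declarations while parsing and accepts a limited XPath subset as a plain string. The namespace URI urn:example:reports and the prefixes are assumptions made up for this sketch:

```python
import xml.etree.ElementTree as ET

doc = """<r:statusReport xmlns:r="urn:example:reports">
  <r:time>1201</r:time>
  <r:station>123</r:station>
  <r:status>56</r:status>
</r:statusReport>"""

root = ET.fromstring(doc)

# The xmlns:r declaration is interpreted by the parser: the prefix is gone
# and the tag carries the namespace URI in {uri}local form.
root_tag = root.tag

# The path expression is a non-XML mini-syntax handed to the tool as a string.
# The prefix used in the query ("rep") is bound here, independently of the
# prefix used in the document ("r").
ns = {"rep": "urn:example:reports"}
station = root.find("./rep:station", ns).text

print(root_tag, station)
```

Neither the namespace declaration nor the path expression is element-structured, yet the tool handles both.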

 Paul Prescod

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




