A Plea for Schemas

Thomas B. Passin tpassin at mitretek.org
Mon Nov 1 22:37:43 GMT 1999


Everyone seems to have something really useful to say about schemas.  I'd like
to add a few thoughts, too.

First, one of the strengths of XML, as I see it, is that you don't NEED to have
a DTD - validation is optional.  This means you can still parse and use a
document if you can't get hold of its DTD (or, let us hope, its schema of
whatever variety).  Perhaps you would really like to have it when you create the
document, and the user-end processor could of course be designed to require the
DTD.  But at least you can design your system not to need it if you want.

This is useful because you can now work if the internet isn't available, or if
you didn't download the current version of the DTD, etc.  It also means there
are fewer internet connections, so it is faster.  When I tried out several SGML
browsers a few years ago, it seemed like I was always waiting for one or another
part of the system to load - and sometimes some crucial part failed to come in
at all.  I ended up not very happy with the process.  I imagine that's one
reason that XML, which was developed to "put SGML on the Internet", doesn't
require the DTD.  So we should not require any other schema to be present,
either.
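
As a rough illustration of working without the DTD, here is a minimal sketch
using Python's standard-library ElementTree parser, which checks
well-formedness but does not validate.  The document, the element names, the
DTD URL and the read_memo() helper are all made up for the example.

```python
import xml.etree.ElementTree as ET

# The DOCTYPE names an external DTD at a hypothetical, unreachable URL.
DOC = """<?xml version="1.0"?>
<!DOCTYPE memo SYSTEM "http://example.invalid/memo.dtd">
<memo>
  <to>xml-dev</to>
  <body>Validation is optional.</body>
</memo>"""


def read_memo(text):
    """Parse without validating; only well-formedness is checked."""
    # The standard-library parser does not fetch the external DTD,
    # so no network connection is needed.
    root = ET.fromstring(text)
    return root.findtext("to"), root.findtext("body")


print(read_memo(DOC))
```

The catch is that a document relying on entities or default attribute values
declared only in the external DTD would not come through intact, so this
suits documents designed to stand on their own.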

This suggests that if an XML document specifies a schema whose type is not known
to the XML processor, the processor should ignore it.  This is consistent with
current processors being allowed to ignore entities defined in external DTDs, if
the processor is not doing validation.  In turn, this reinforces the notion that
it is useful not to require the user-end processor to do validation-type things.
And this in turn is very applicable to use over the internet - the less
complexity you require, the more likely that things will work.

Second, the suggestions for a general way to specify alternate schema styles
sound really interesting.  Possibly it could be done now through a Processing
Instruction without needing any syntax change to XML.  Or, as Microsoft is
trying out in IE5, through a specific namespace - but that would only work for
XML-syntax schemas, and therefore not for DTDs as we know them.  But I don't
think that a general purpose XML processor should be required to understand them
all or to convert between them.
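
To sketch how the Processing Instruction idea might look, together with the
"ignore what you don't understand" rule from the first point: the PI target
"schema", its type/href pseudo-attributes, the type names and the helper
functions below are all my own assumptions, loosely modelled on the
xml-stylesheet PI.  None of this is an agreed syntax.

```python
import re
from xml.dom import minidom

# Hypothetical document: the <?schema ...?> PI is an invented convention.
DOC = """<?xml version="1.0"?>
<?schema type="x-unknown-kind" href="http://example.invalid/memo.xyz"?>
<memo><to>xml-dev</to></memo>"""

# Schema kinds this particular (imaginary) processor knows how to use.
KNOWN_SCHEMA_TYPES = {"dtd"}


def declared_schemas(text):
    """Return (type, href) pairs from <?schema ...?> PIs at document level."""
    doc = minidom.parseString(text)
    found = []
    for node in doc.childNodes:
        is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == "schema":
            # PI data is plain text, so pseudo-attributes are picked out by hand.
            attrs = dict(re.findall(r'(\w+)="([^"]*)"', node.data))
            found.append((attrs.get("type"), attrs.get("href")))
    return found


def split_schemas(text):
    """Separate the declared schemas into usable and silently ignored ones."""
    usable, ignored = [], []
    for kind, href in declared_schemas(text):
        (usable if kind in KNOWN_SCHEMA_TYPES else ignored).append((kind, href))
    return usable, ignored


usable, ignored = split_schemas(DOC)
print("usable:", usable)
print("ignored:", ignored)
```

The document still parses and can be processed either way; an unrecognized
schema type just ends up in the ignored list instead of causing a failure.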

Third, with regard to the discussions about a schema repository, what needs
should it serve? That is the salient question that should be worked out before
many of the other questions can be answered.  Since I think that the need for
on-line or user-end schemas should be minimized, I think a repository should not
be needed for on-line, day-in and day-out processing.  In this view, a schema
repository would be more of an archive, to be used for reference and to get
copies of schemas for use by developers and, perhaps, creators of content (human
or otherwise).

This is starting to seem pretty long-winded, so I'll stop here.

Tom Passin

From: Simon St.Laurent <simonstl at simonstl.com>


>At 12:31 PM 11/1/99 +0100, Matthew Gertner wrote:
>>I totally agree. The idea of some kind of discovery mechanism for
>>schemas has already been batted around this list (and I believe I heard
>>something about some W3C activity starting in this area?).
>
>The more I've worked with XML, the less convinced I've become that central
>repositories for schemas hold a meaningful answer to information
>processing.  While access to prior work is useful for reference, learning,
>and some avoidance of reinventing the wheel, I don't think the dream of
>schema repositories as standards bodies makes sense.  (I used to, really!)
>
>>Getting all
>>the competing approaches behind a standard schema language is a major
>>prerequisite for this.
>
>I'm not convinced that this is actually critical.  You can start with DTDs
>or XML-Data and move around as necessary - try out XML Authority
>(www.extensibility.com) for one example of a tool that makes it very easy
>to convert among different schema vocabularies.  Data type information can
>be stored in DTDs quite easily, for instance, and put into a schema format
>when appropriate.  This approach reduces the seemingly high cost of
>switching infrastructures.
>
>Rick Jelliffe's Schematron is another interesting option, providing
>supplemental tests that aren't typically included in schemas.  Multiple
>schema languages may in fact provide useful tools that a single vocabulary
>and structure might not easily include.
>
>>It is hard to justify investing too much in
>>schema development and infrastructure if the final form and capabilities
>>of this schema language are still unclear. How and whether schemas will
>>then be made available is very much up in the air. The most promising
>>option I see is to create some kind of schema marketplace that will
>>enable people to get their schemas out there, with the metadata
>>necessary for others to find them, and let "free market" competition
>>decide which schemas will gain general acceptance. There are already
>>some efforts of this type, but they smack a little of marketing
>>manoeuvres controlled by a single company and not real attempts to
>>create an open marketplace.
>
>It's interesting, though, that the 'marketing manoeuvres' foster this kind
>of open competition, while the more neutral body has set up a formal
>process for settling on schemas through closed committees.
>
>I'm not sold on the need for single schemas for particular markets, nor do
>I think a single repository is going to make that much difference (except
>perhaps for PR).
>
>In some cases, companies will be able to agree on industry-wide standards,
>but I don't think that approach is the only path forward.  XML's
>transformability (courtesy of its structures plus XSL, Omnimark, MDSAX, and
>other tools) opens the door to a Babel-like world in which we have a
>significant - and adequate - chance of understanding each other without
>having to proceed in lockstep.
>



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




