Round 2: How an XML instance document references an XML Schema

Wed Jan 5 23:19:00 GMT 2000

John Aldridge <john.aldridge at informatix.co.uk> writes:

> At 09:22 04/01/00 -0500, "Roger L. Costello" <costello at mitre.org> wrote:
> 
> >There has been a considerable amount of discussion (and confusion) on
> >how an XML instance document indicates the XML Schema(s) that it
> >conforms to.  I am not sure that it is yet clear in people's minds on
> >how to do it.  I will take a stab at explaining it, based upon the
> >discussions.
> 
> (snip very helpful exposition)
> 
> I'm still struggling, however, to understand how this is all intended to
> work in an environment which is not continuously connected to the internet.
>  Even on machines which are themselves well connected, it's surely
> unacceptable to have one's data become unusable because the machine in
> Outer Mongolia which holds the schema has crashed.
> 
> Note that this is not just a matter of validation, because the schema can
> supply default attribute values.  The data can become meaningless in the
> absence of a schema.
> 
> I'd hoped to find a statement such as "a general-purpose schema-aware
> processor must provide some catalogue facility which allows the
> specification of a location from which to fetch the schema corresponding to
> an NS URI.  Only in the absence of such a catalogue entry may the processor
> attempt to dereference the URI given by the schemaLocation attribute".

As I've tried to convey in other messages in this and related threads, 
the XML Schema design is VERY concerned with precisely the issue you
raise above, namely, schema validation should not be a hostage to
connectivity and/or URL stability.  Our approach was, however, NOT to
design YACM (Yet Another Catalog Mechanism), but allow for ANY
alternative schema location mechanism which people come up with.  I
hope a careful reading of chapter 4 of the PWD [1] will clarify this
for you.

For myself, I envisage schema validators working in a similar way
to XT, James Clark's XSLT implementation: you will be able to invoke a
schema validator with explicit specification of the schema(s) you wish
applied, or you can leave it to the validator (not an option XT
provides).  The XML Schema PWD allows for one, the other, or both, but
observes that only the schemaLocation approach gives interoperability
(at the price of fragility).
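
To make the "one, the other, or both" idea concrete, here is a minimal
sketch of one possible lookup order: explicitly supplied schemas first,
then a local catalog, then schemaLocation hints from the document, and
only then the namespace URI itself.  Nothing here is taken from the PWD
or from any real validator.  The function names, the precedence order,
the instance namespace URI in XSI_NS, and the reading of schemaLocation
as whitespace-separated namespace/location pairs are all assumptions
made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumed instance namespace URI; drafts may use a different one, so it
# is a parameter rather than a fixed fact.
XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def schema_location_hints(root, xsi_ns=XSI_NS):
    """Collect namespace -> location hints from schemaLocation attributes.

    Assumes the attribute value is a whitespace-separated list of
    namespace/location pairs. The first hint seen for a namespace wins.
    """
    hints = {}
    attr = "{%s}schemaLocation" % xsi_ns
    for elem in root.iter():
        value = elem.get(attr)
        if value:
            tokens = value.split()
            for ns, loc in zip(tokens[0::2], tokens[1::2]):
                hints.setdefault(ns, loc)
    return hints


def locate_schema(ns, explicit=None, catalog=None, hints=None):
    """Return (source, location) for a namespace.

    Hypothetical precedence: caller-supplied schemas, then a local
    catalog, then document hints, then the namespace URI as a last resort.
    """
    tables = (
        ("explicit", explicit),
        ("catalog", catalog),
        ("schemaLocation", hints),
    )
    for source, table in tables:
        if table and ns in table:
            return source, table[ns]
    return "namespace-uri", ns
```

With a catalog entry present, the document's own hint is never
dereferenced, which is the behaviour asked for above; without one, the
hint is used, and failing that the namespace URI is all that is left.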

> I'm also puzzled about the semantics of a namespace declaration without a
> corresponding schemaLocation attribute.  Does it mean:
> 
> (a) Names in the namespace do not have an association to a schema.  No
> validation is to be performed (and no attribute defaults are to be supplied).

Certainly not.  See chapter 4 again, and the discussion above.  The
validator is allowed to dereference the namespace URI, or look it up
in a catalog, or . . .

> (b) Unless the processor provides some alternative method of locating the
> applicable schema, then the data cannot be interpreted and an error occurs.

That will always be true, regardless of how things are specified:  a
schema validator confronted with a document with elements in a
namespace for which it neither is given nor can discover a schema will 
necessarily declare defeat.
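
As a rough illustration of "declaring defeat", the sketch below gathers
the namespaces actually used by elements in a document and raises an
error for any that have no schema available.  The names SchemaNotFound,
element_namespaces and require_schemas are hypothetical, and how the
"available" set gets filled (command line, catalog, schemaLocation,
dereferencing the namespace URI) is deliberately left open.

```python
import xml.etree.ElementTree as ET


class SchemaNotFound(Exception):
    """Raised when an element namespace has no schema to validate against."""


def element_namespaces(root):
    """Return the set of namespace URIs used by element names in the tree."""
    found = set()
    for elem in root.iter():
        tag = elem.tag
        # ElementTree spells qualified names as "{namespace-uri}local-name".
        if isinstance(tag, str) and tag.startswith("{"):
            found.add(tag[1:].split("}", 1)[0])
    return found


def require_schemas(root, available):
    """Raise SchemaNotFound unless every element namespace is in `available`."""
    missing = sorted(element_namespaces(root) - set(available))
    if missing:
        raise SchemaNotFound("no schema for: " + ", ".join(missing))
```

Elements in no namespace are ignored here; whether a validator should
treat those as an error is a separate question.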

ht
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht at cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
