Round 2: How an XML instance document references an XML Schema
Henry S. Thompson
ht at cogsci.ed.ac.uk
Thu Jan 6 13:49:08 GMT 2000
John Aldridge <john.aldridge at informatix.co.uk> writes:
> At 23:18 05/01/00 +0000, ht at cogsci.ed.ac.uk (Henry S. Thompson) wrote:
> >John Aldridge <john.aldridge at informatix.co.uk> writes:
> >> I'd hoped to find a statement such as "a general-purpose schema-aware
> >> processor must provide some catalogue facility which allows the
> >> specification of a location from which to fetch the schema corresponding to
> >> an NS URI. Only in the absence of such a catalogue entry may the processor
> >> attempt to dereference the URI given by the schemaLocation attribute".
> >As I've tried to convey in other messages in this and related threads,
> >the XML Schema design is VERY concerned with precisely the issue you
> >raise above, namely, schema validation should not be a hostage to
> >connectivity and/or URL stability. Our approach was, however, NOT to
> >design YACM (Yet Another Catalog Mechanism), but allow for ANY
> >alternative schema location mechanism which people come up with. I
> >hope a careful reading of chapter 4 of the PWD will clarify this
> >for you.
> I did carefully read Chapter 4, honest, but still struggled to understand
> the way the flexibility it includes should be used. Note that I did not
> suggest above that the document should include a specific catalogue design;
> just that I'd hoped it would mandate the existence of _some_ catalogue.
> >For myself, I envisage schema validators working in a similar way
> >to XT, James Clark's XSLT implementation: you will be able to invoke a
> >schema validator with explicit specification of the schema(s) you wish
> By which you mean (I think) "explicit specification of _how to locate_ the
> schema(s) you wish applied,". Presumably you are not intended to be able
> to request that elements be validated against a schema with a
> targetNamespace which does not match the namespace from which the elements
> to be validated are drawn?
Both points correct: how to _locate_, and targetNamespaces must
always match (except in the case where there is none, but that's
another can of worms).
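To make the matching rule concrete, here is a minimal sketch in Python (not anything the PWD prescribes). It assumes the schema document carries a targetNamespace attribute on its root element and compares that with the namespace of the instance's root element. The function names and the 'stuff' element are hypothetical, and the no-namespace case is handled naively on purpose.

```python
import xml.etree.ElementTree as ET


def namespace_of(tag):
    """Return the namespace URI from an ElementTree '{uri}local' tag, or None."""
    if tag.startswith("{"):
        return tag[1:].split("}", 1)[0]
    return None


def schema_applies(schema_text, instance_text):
    """True when the schema's targetNamespace equals the instance root's namespace.

    Assumption: a schema with no targetNamespace is treated as matching only
    un-namespaced elements. The real rules for that case are not settled here.
    """
    schema_root = ET.fromstring(schema_text)
    instance_root = ET.fromstring(instance_text)
    target = schema_root.get("targetNamespace")  # None if the attribute is absent
    return target == namespace_of(instance_root.tag)


# Illustrative documents only; the element name 'stuff' is made up.
SCHEMA = '<schema targetNamespace="http://www.informatix.co.uk/Stuff"/>'
GOOD = '<s:stuff xmlns:s="http://www.informatix.co.uk/Stuff"/>'
BAD = '<s:stuff xmlns:s="uri:other"/>'
```

A validator front end could call schema_applies() before validation starts and refuse a user-supplied schema whose targetNamespace does not match.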
> > or you can leave it to the validator (Not an option XT
> >provides). The XML Schema PWD allows for one, the other, or both, but
> >observes that only the schemaLocation approach gives interoperability
> >(at the price of fragility).
> OK, that's very helpful. So, when writing an XML file, I should start it:
> <?xml version="1.0"?>
> And then say to the customers for this data:
> You must process this data either
> (a) in an environment with reliable access to
> http://www.informatix.co.uk/Stuff/Stuff.xsd (in which case you
> may use any "general-purpose schema-aware" XML processor), or,
> (b) you are constrained to use only those XML processors which
> allow you to specify that the schema for the namespace
> http://www.informatix.co.uk/Stuff is to be found in some other
> location accessible to you.
> In the context of the obligation "...unless directed otherwise
> general-purpose schema-aware processors must attempt to dereference each
> schema URI...", the existence of a catalogue or other mechanism for
> locating a schema counts as "directed otherwise".
Well, not the existence alone, but the existence plus some indication,
from user or application choice, to use what exists.
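A rough sketch of that distinction, in Python: an alternative location table only counts as "directed otherwise" when the user or application also asks for it to be used. Everything here is an assumption for illustration (the function name, the use_overrides flag, the local file path); it is not an API of any real processor.

```python
def resolve_schema_location(namespace, schema_location_hint=None,
                            overrides=None, use_overrides=True):
    """Pick the location from which to fetch the schema for `namespace`.

    overrides:     hypothetical catalogue-like mapping {namespace URI: location}
    use_overrides: the user/application indication that the mapping should
                   actually be consulted; its mere existence is not enough
    Returns the chosen location, or None when nothing is known.
    """
    if use_overrides and overrides and namespace in overrides:
        return overrides[namespace]
    # Not directed otherwise: fall back to the schemaLocation hint.
    return schema_location_hint


NS = "http://www.informatix.co.uk/Stuff"
HINT = "http://www.informatix.co.uk/Stuff/Stuff.xsd"
# The local path below is invented for the example.
LOCAL = {NS: "file:///schemas/Stuff.xsd"}
```

With these values, resolve_schema_location(NS, HINT, LOCAL) picks the local copy, while passing use_overrides=False (or no mapping at all) falls back to the schemaLocation URI.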
> I guess I'm just suspicious that, in the absence of specific requirements,
> processors will not bother to implement any such alternative mechanism.
> After all, the language quoted in the previous paragraph is very similar to
> that describing DTD links: "An XML processor ... may use the public
> identifier to try to generate an alternative URI. If the processor is
> unable to do so, it must use the URI specified in the system literal".
You can't make people provide interoperable solutions, only encourage
them to do so, you're right.
> . . .
> I guess I was really confused about the relation between schemas and
> namespaces.
> I understand your answer to mean that by using a name from a namespace, and
> then using a schema-aware processor, you are automatically claiming that
> the element conforms to the schema for that namespace.
> There is no such thing, to a schema-aware processor, as a namespace without
> an associated schema.
That's close, but there are undoubtedly some grey areas. In the
simplest case: a schema-validator is validating the content of some
element with a schema for its namespace and encounters an element name
from a different namespace. What happens? If neither schemaLocation
nor built-in information nor namespace-URI-based search yield a
schema, there is a problem. Let's look a little harder at how this
1) The instance looks like this
<a:root xmlns:a='uri:a' xmlns:b='uri:b'>
The content model the validator is working with, within a schema for
the uri:a namespace, looks like this:
<element ref='a' . . ./>
Now this latter reference is not allowed unless there's an <import>
statement for it. But that <import> may not contain a
'schemaLocation' attribute, or the URI specified there may not be
accessible, etc. At that point an error should be raised.
2) The instance is the same, but the relevant content model looks like
this:
<element ref='a' . . ./>
This, and related cases, are the grey area mentioned above. The WG
has not yet decided exactly what the detailed schema-validation story
is wrt validation within material which in the first instance is
allowed by a wildcard particle in a content model.
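For the error case in 1), the lookup can be pictured as a chain of strategies tried in turn, with an error raised only when none of them yields a schema. The Python below is a sketch under assumptions: the strategy names mirror the three sources mentioned above (schemaLocation, built-in information, namespace-URI-based search), but their order, the exception type and the dictionary stand-ins are all hypothetical.

```python
class SchemaNotFoundError(LookupError):
    """Raised when no lookup strategy yields a schema for a namespace."""


def find_schema(namespace, strategies):
    """Try each (name, lookup) pair in order; return the first hit.

    Each lookup takes a namespace URI and returns a schema or None.
    """
    for name, lookup in strategies:
        schema = lookup(namespace)
        if schema is not None:
            return name, schema
    raise SchemaNotFoundError(f"no schema found for namespace {namespace!r}")


# Hypothetical stand-ins for real lookups.
schema_location_hints = {"uri:a": "<schema for uri:a>"}
built_in = {}
strategies = [
    ("schemaLocation", schema_location_hints.get),
    ("built-in", built_in.get),
    ("namespace-URI search", lambda ns: None),  # pretend the URI is unreachable
]
```

Here find_schema("uri:a", strategies) succeeds via the schemaLocation entry, while find_schema("uri:b", strategies) raises SchemaNotFoundError, which corresponds to the point at which the validator should report an error.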
> Thanks for your help, both here and on other topics to which I've not
> contributed but have followed with interest.
You're welcome: you, and the rest of xml-dev, are our launch
customers. . . :-)
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht at cogsci.ed.ac.uk