XML Schemas and Namespaces
Henry S. Thompson
ht at cogsci.ed.ac.uk
Fri Dec 31 15:32:18 GMT 1999
Roger Costello <costello at mitre.org> writes:
> Hi Folks,
>
> I have a couple of questions with regard to the use of namespaces in
> XML Schemas.
>
> 1. As has been recently discussed, the method for an XML instance
> document to indicate the XML Schema that it conforms to is with the
> schemaLocation attribute. For example:
>
> <?xml version="1.0"?>
> <BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
xmlns="http://www.somewhere.org/BookCatalogue"
> xsi:schemaLocation=
> "http://www.somewhere.org/BookCatalogue
>
> http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
> ...
> </BookCatalogue>
>
> At the root element (BookCatalogue) of this XML instance document I am
> using schemaLocation to indicate the XML Schema that it conforms to.
>
> The problem is this: when I defined BookCatalogue (in BookCatalogue.xsd)
> I didn't define any attributes for it. I certainly didn't define
> xmlns:xsi nor xsi:schemaLocation as attributes. Thus, this XML
> instance document is invalid, right?
No, it's fine. Note it has no DTD, so validity (an XML 1.0 concept)
is not relevant. Defining a DTD for it, which appropriately allowed
for namespace prefixes, would be possible but tedious.
It's SCHEMA-valid (or at least it's not obviously NOT schema-valid,
given the addition of a default namespace declaration as above)
because
a) xmlns:xsi and xmlns are not attributes, they are namespace
declarations, and they're just fine as such: no declarations for them
are required in BookCatalogue.xsd;
b) xsi:schemaLocation is an attribute, but by definition such an
attribute is always schema-valid provided its contents are coherent,
which they are in this case.
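
Points (a) and (b) can be seen in what a namespace-aware parser reports for
the instance above. This is a minimal sketch, assuming Python's standard
xml.etree.ElementTree as the parser (my choice for illustration only; the
variable names are hypothetical). It is not a schema validator: it only
shows that the two xmlns declarations never reach the attribute set, while
xsi:schemaLocation does.

```python
import xml.etree.ElementTree as ET

# The instance from the question, with the default namespace declaration added.
instance = """<BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
               xmlns="http://www.somewhere.org/BookCatalogue"
               xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
                 http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
</BookCatalogue>"""

XSI = "http://www.w3.org/1999/XMLSchema/instance"

root = ET.fromstring(instance)

# (a) The namespace declarations are consumed by the parser. Their effect is
# visible in the element's expanded name, not in its attributes.
element_name = root.tag
attribute_names = sorted(root.attrib)

# (b) xsi:schemaLocation is a genuine attribute. Its value is a
# whitespace-separated list of (namespace URI, schema location) pairs.
tokens = root.attrib["{" + XSI + "}schemaLocation"].split()
pairs = list(zip(tokens[0::2], tokens[1::2]))

print(element_name)
print(attribute_names)
print(pairs)
```

The only attribute left for a schema processor to account for is
xsi:schemaLocation, which is exactly the one that needs no declaration.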
> The nice thing about DOCTYPE was that it separated the mechanism for
> declaring the associated schema (i.e., the DTD) from the information
> items (i.e., the elements). With schemaLocation the mechanism for
> declaring the associated schema is intertwined with the information
> items.
>
> Thus, it seems that when an XML Schema is written the author must try to
> anticipate how instance documents will use it and add in xmlns:xsi and
> xsi:schemaLocation attributes to the elements being defined in the
> schema. For my example, I would need to define BookCatalogue as:
>
> <element name="BookCatalogue">
> <type>
> <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
> <attribute name="xmlns:xsi" type="URI"/>
> <attribute name="xsi:schemaLocation" type="string"/>
> </type>
> </element>
>
> I must be misunderstanding something fundamental. This is obviously
> ridiculous.
I hope the comments above clarify that you don't need either of those
attribute declarations.
> 2. My second question has to do with referencing elements within an XML
> Schema. Consider this schema:
>
> <?xml version="1.0"?>
> <!DOCTYPE schema SYSTEM "xml-schema.dtd"[
> <!ATTLIST schema xmlns:cat CDATA #IMPLIED>
> ]>
> <schema xmlns="http://www.w3.org/1999/XMLSchema"
> targetNamespace="http://www.somewhere.org/BookCatalogue"
> xmlns:cat="http://www.somewhere.org/BookCatalogue">
> <element name="BookCatalogue">
> <type>
> <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
> </type>
> </element>
> <element name="Book">
> <type>
> <element ref="cat:Title"/>
> <element ref="cat:Author"/>
> <element ref="cat:Date"/>
> <element ref="cat:ISBN"/>
> <element ref="cat:Publisher"/>
> </type>
> </element>
> <element name="Title" type="string"/>
> <element name="Author" type="string"/>
> <element name="Date" type="date"/>
> <element name="ISBN" type="string"/>
> <element name="Publisher" type="string"/>
> </schema>
>
> Note that we define the Book element and in the BookCatalogue element it
> is referenced using cat:Book
>
> <element name="BookCatalogue">
> <type>
> <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
> </type>
> </element>
>
> My understanding is that the reason for prefixing Book with cat: is to
> indicate "the Book element that we are referencing comes from the cat:
> namespace". The cat: namespace is defined at the top of the schema to
> be the same as the targetNamespace. Thus, the cat: namespace refers to
> this schema document.
I'd say, more carefully: "The prefix cat denotes a namespace URI which
is the same as the namespace URI identifying the target namespace of
this schema. Thus references to schema components in that namespace
refer to components defined in this schema."
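
To make that gloss concrete, here is a small sketch in Python, again
assuming xml.etree.ElementTree. The schema is a trimmed version of the one
quoted above (only Book and Title are kept), and the helper name resolve is
hypothetical. It expands each prefixed ref into a (namespace URI, local
name) pair and checks it against the components this schema declares in its
target namespace. Nothing is dereferenced at any point.

```python
import io
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/1999/XMLSchema}"

# Trimmed version of the BookCatalogue schema from the question.
schema = b"""<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
  <element name="BookCatalogue">
    <type>
      <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
    </type>
  </element>
  <element name="Book">
    <type>
      <element ref="cat:Title"/>
    </type>
  </element>
  <element name="Title" type="string"/>
</schema>"""


def resolve(qname, prefixes):
    """Expand a prefixed name into a (namespace URI, local name) pair."""
    prefix, _, local = qname.rpartition(":")
    return prefixes[prefix], local


# Collect the prefix bindings and the root element in a single pass.
prefixes = {}
root = None
for event, payload in ET.iterparse(io.BytesIO(schema), events=("start-ns", "start")):
    if event == "start-ns":
        prefix, uri = payload
        prefixes[prefix] = uri
    elif root is None:
        root = payload

target = root.get("targetNamespace")

# Top-level declarations define components in the target namespace.
declared = {(target, el.get("name")) for el in root.findall(XSD + "element")}

# Every ref is a pure name comparison against those components.
refs = [resolve(el.get("ref"), prefixes) for el in root.iter(XSD + "element") if el.get("ref")]
unresolved = [r for r in refs if r not in declared]

print(refs)
print(unresolved)
```

The prefix is only a lookup key for the namespace URI, and the URI is only
compared for equality with the target namespace URI.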
> Here's my question: it appears to me that namespaces are being used
> here to "point" to things. In this case, cat: is "pointing" to the
> current document (the XML Schema). Isn't this a violation of the
> namespace spec, which says that there is no guarantee that there is
> anything at the URI referenced by a namespace?
The fact that you can't depend on dereferencing a namespace URI is
fundamental to our design. I hope the above gloss helps clarify that
we're not cheating here. It may be helpful to consider the
intermediate case of the <import> concept. Here are some excerpts
from the schema for schemas, but they could be from BookCatalogue.xsd:
<schema xmlns="http://www.w3.org/1999/XMLSchema"
targetNamespace="http://www.w3.org/1999/XMLSchema"
xmlns:x="http://www.w3.org/XML/1998/namespace">
<import namespace="http://www.w3.org/XML/1998/namespace"
schemaLocation="http://www.w3.org/XML/1998/xml.xsd"/>
<element name="info">
<type content="mixed">
<any minOccurs="0" maxOccurs="*"/>
<attribute name="source" type="uri"/>
<attributeGroup ref="x:lang"/>
</type>
</element>
</schema>
The <attributeGroup> element references a group named 'lang' in a
namespace with the namespace URI
"http://www.w3.org/XML/1998/namespace", which we recognise as the
namespace for XML itself. The import statement tells us we can find a
schema for the namespace with that namespace URI at
http://www.w3.org/XML/1998/xml.xsd, and indeed if you look there you
will find a schema with a declaration of an attributeGroup named
'lang'. In other words, <import> establishes the connection between a
namespace URI used in explicit schema references and a schema which
discharges those references, in much the same way that
'xsi:schemaLocation' establishes the connection between the namespace
URI used in IMPLICIT schema references in an instance and a schema
which discharges them.
To close the conceptual loop, you can think of the 'targetNamespace'
attribute on a schema as providing the wherewithal for an implied
<import> statement, e.g.
<import namespace="http://www.somewhere.org/BookCatalogue"
schemaLocation=""/>
This is just what is meant by saying that every schema is taken to be
defining components in its target namespace.
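
One way to picture both the explicit and the implied <import> is as entries
in a table from namespace URI to schema location. The sketch below assumes
Python's xml.etree.ElementTree; the function name schema_locations and its
own_location parameter are hypothetical. It reads only the <import> from
the excerpt above. The xmlns:x declaration is left out because the sketch
resolves no prefixed names, and because, as far as I know, current
namespace-aware parsers reject binding any prefix other than xml to that
namespace URI.

```python
import xml.etree.ElementTree as ET

XSD = "{http://www.w3.org/1999/XMLSchema}"

# The <import> from the excerpt above, inside a minimal schema element.
schema = """<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.w3.org/1999/XMLSchema">
  <import namespace="http://www.w3.org/XML/1998/namespace"
          schemaLocation="http://www.w3.org/XML/1998/xml.xsd"/>
</schema>"""


def schema_locations(schema_root, own_location=""):
    """Map each namespace URI this schema refers to onto a schema location."""
    # The implied import: the target namespace is discharged by this very
    # document, modelled here as an empty schemaLocation.
    locations = {schema_root.get("targetNamespace"): own_location}
    # The explicit imports: one entry per <import> element.
    for imp in schema_root.findall(XSD + "import"):
        locations[imp.get("namespace")] = imp.get("schemaLocation")
    return locations


locations = schema_locations(ET.fromstring(schema))
for namespace_uri, location in locations.items():
    print(namespace_uri, "->", repr(location))
```

The namespace URI is never fetched in this sketch. It is only a key, and
the location to look in comes from the schemaLocation value beside it.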
Hope this helps,
ht
Note I've tried to be careful to distinguish four things in my answers
above:
namespaces;
schemas;
namespace URIs;
prefixes.
Although doing this makes things more prolix, it avoids
misunderstandings, and I commend it to you in messages on this topic.
--
Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
Fax: (44) 131 650-4587, e-mail: ht at cogsci.ed.ac.uk
URL: http://www.ltg.ed.ac.uk/~ht/