(Many) XML Schema Questions

Roger Costello costello at mitre.org
Mon Dec 27 22:13:03 GMT 1999


Hi Folks,

I have been making my way through the new XML Schema spec
and have numerous questions.

1. My first question is on how XML Schemas are to be
referenced by XML instance documents.

It is my understanding that in an XML Schema you 
use the targetNamespace attribute to indicate
the namespace of the XML Schema document, and 
then in an XML instance document you reference
the XML Schema using this namespace.

For example, suppose that I create an XML Schema
for BookCatalogues:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue">
     ...
</schema>

Now, in my XML instance document I refer to it using
a namespace declaration:

<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
    <Book>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
    </Book>
        ...
</BookCatalogue>

The namespace declaration in this XML instance document
is to be interpreted as: "All the stuff between <BookCatalogue> 
and </BookCatalogue> conforms to the schema defined at 
this namespace."

Thus, my first question is: do I have a correct understanding
of the purpose of targetNamespace, and of how an XML instance
document is to indicate that it conforms to an XML Schema?

2. The namespace spec clearly indicates that there is
no guarantee that there is anything at the URL referenced
by a namespace.  However, with XML Schemas, it seems
that the namespace referenced in an XML instance document
must necessarily reference an XML Schema.  Is this a violation
of the namespace spec, or is it an application-specific 
usage of namespaces?
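
To make the question concrete: a namespace-aware parser that does no schema validation treats the namespace name purely as an opaque string and never fetches anything from it. Below is a minimal sketch using Python's standard xml.etree.ElementTree, which is only my choice for illustration and not anything the XML Schema spec prescribes. The helper split_tag is a hypothetical name, and the instance document is the one from question 1, trimmed to a single Book.

```python
import xml.etree.ElementTree as ET

# Instance document from question 1, trimmed for brevity.
doc = """<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
    <Book>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
    </Book>
</BookCatalogue>"""


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return None, tag


# Parsing succeeds without any network access: the namespace name is
# only used to qualify element names, not to locate a resource.
root = ET.fromstring(doc)
namespace, local = split_tag(root.tag)
print(namespace)  # http://www.somewhere.org/BookCatalogue
print(local)      # BookCatalogue
```

Whether a schema-aware processor is then allowed, or required, to dereference that string is exactly what I am asking.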

3. In question #1 I showed how an XML instance document
may reference the XML Schema that it conforms to:

<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
   ...
</BookCatalogue>

How does an XML Parser know that it is to go to
BookCatalogue.xsd at this URL?  Should, instead, the
namespace declaration be:

<BookCatalogue
xmlns="http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">

4. With XML Schemas is it possible for an XML instance document to
be composed of fragments, each conforming to a different
XML Schema?  For example:

<?xml version="1.0"?>
<Library>
    <BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
        <Book>
            <Title>Illusions The Adventures of a Reluctant Messiah</Title>
            <Author>Richard Bach</Author>
            <Date>1977</Date>
            <ISBN>0-440-34319-4</ISBN>
            <Publisher>Dell Publishing Co.</Publisher>
        </Book>
            ...
    </BookCatalogue>
    <Employees xmlns="http://www.somewhere-else.org/Personnel">
        <Name>John Doe</Name>
        <Position>Reference Manager</Position>
    </Employees>
</Library>

Here we see the BookCatalogue fragment conforming to one XML Schema
and the Employees fragment conforming to another XML Schema.  Is this
capability the intent of the XML Schema spec?
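
As a rough sketch of what I have in mind, the fragments can at least be told apart by namespace, so that each one could in principle be handed to whatever validates that namespace. The code below uses Python's standard xml.etree.ElementTree purely for illustration. The helper namespace_of is a hypothetical name, the document is a trimmed copy of the example above, and no actual schema validation is performed.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the Library example: two fragments, two namespaces.
doc = """<?xml version="1.0"?>
<Library>
    <BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
        <Book>
            <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        </Book>
    </BookCatalogue>
    <Employees xmlns="http://www.somewhere-else.org/Personnel">
        <Name>John Doe</Name>
    </Employees>
</Library>"""


def namespace_of(element):
    """Return the namespace name of an element, or None if unqualified."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None


root = ET.fromstring(doc)

# Map each namespace name to the fragment (child of Library) that uses it.
fragments = {namespace_of(child): child for child in root}

for namespace, fragment in fragments.items():
    # Every element inside a fragment inherits the fragment's default namespace.
    consistent = all(namespace_of(e) == namespace for e in fragment.iter())
    print(namespace, consistent)
```

Note that the Library element itself is in no namespace here, so it is not covered by either schema.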


5. Section 3.6 of the XML Schema spec deals with derived types.
Below is an example of a Book type being derived (by extension)
from a Publication type:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
               targetNamespace="http://www.xfront.org/BookCatalogue"
               xmlns:cat="http://www.xfront.org/BookCatalogue">
    <type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
    </type>
    <type name="Book" source="cat:Publication" derivedBy="extension">
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </type>
    <element name="Catalogue">
        <type>
            <element name="CatalogueEntry" minOccurs="0" maxOccurs="*"
                     type="cat:Publication"/>
        </type>
    </element>
</schema>

The CatalogueEntry element is of type Publication.  Thus, in an
XML instance document a CatalogueEntry element can contain
either a Publication or a Book (since a Book is a Publication).
Here's an example XML instance document:

<?xml version="1.0"?>
<Catalogue xmlns="http://www.xfront.org/BookCatalogue"
           xmlns:xsi="http://www.w3.org/1999/XMLSchema-instance">
        <CatalogueEntry>
                <Title>Staying Young Forever</Title>
                <Author>Karin Granstrom Jordan, M.D.</Author>
                <Date>December, 1999</Date>
        </CatalogueEntry>
        <CatalogueEntry xsi:type="Book">
                <Title>Illusions The Adventures of a Reluctant Messiah</Title>
                <Author>Richard Bach</Author>
                <Date>1977</Date>
                <ISBN>0-440-34319-4</ISBN>
                <Publisher>Dell Publishing Co.</Publisher>
        </CatalogueEntry>
        <CatalogueEntry xsi:type="Book">
                <Title>The First and Last Freedom</Title>
                <Author>J. Krishnamurti</Author>
                <Date>1954</Date>
                <ISBN>0-06-064831-7</ISBN>
                <Publisher>Harper &amp; Row</Publisher>
        </CatalogueEntry>
</Catalogue>

Here the first CatalogueEntry is a Publication, and the next
two CatalogueEntry elements are Books.  Note the use of an
attribute (xsi:type) to indicate that an entry is a Book.
The XML Schema spec says that when an element in an XML
instance document uses a type other than the declared source
type (Publication), we must indicate which derived type is
being used (Book).  Why?  Surely an XML Parser would be able to
figure out the type, just as compilers are able to do so.

6. What is the default value for the content attribute of
the type element?  Is it elementOnly?

7. Consider this example of type deriving (by restriction):

<type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
</type>
<type name="SingleAuthorPublication" source="cat:Publication"
      derivedBy="restriction">
        <element name="Author" type="string" maxOccurs="1"/>
</type>

Here we see the type SingleAuthorPublication is a Publication
which is restricted to a single author.

My question is, do you need a namespace qualifier
for the element that you are restricting?  e.g.,

     <element name="cat:Author" type="string" maxOccurs="1"/>

8. Type deriving allows us to create a new type that is an
extension of another type.  It also allows us to create a
new type that is a restriction of another type.  What if you
want to create a new type that is both an extension and a
restriction?  For example, suppose that I would like to derive
a Magazine type from the Publication type, where Author is
restricted to zero occurrences and the type is extended by
adding an Editor.  How do I do that?  Create an intermediate
type that is a restriction and then extend that?  Not very
elegant, I would say.

9. I guess that I don't understand what equivClass is all about.
How does it differ from extension and restriction types?

10. Section 3.7 of the XML Schema spec talks about creating
unique identifiers using keys.  Here's an example of using
key and keyref:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:lib CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.xfront.org/Library"
        xmlns:lib="http://www.xfront.org/Library">
    <element name="Library">
        <type>
            <element name="BookCatalogue">
                <type>
                     <element name="Book" minOccurs="0" maxOccurs="*">
                          <type>
                              <element name="Title" type="string"/>
                              <element name="Author" type="string"/>
                              <element name="Date" type="string"/>
                              <element name="ISBN" type="string"/>
                              <element name="Publisher" type="string"/>
                              <attribute name="Category" minOccurs="1">
                                  <datatype source="string">
                                      <enumeration value="autobiography"/>
                                      <enumeration value="non-fiction"/>
                                      <enumeration value="fiction"/>
                                  </datatype>
                              </attribute> 
                              <attribute name="InStock" type="boolean"
                                         default="false"/>
                              <attribute name="Reviewer" type="string"
                                         default=""/>
                          </type>
                     </element>
                </type>
            </element>
            <element name="CheckoutRegister">
                <type>
                    <element name="Person" type="string"/>
                    <element name="Book">
                        <type content="empty">
                            <attribute name="titleRef" type="string"/>
                            <attribute name="categoryRef" type="string"/>
                        </type>
                    </element>
                </type>
            </element>
        </type>
    </element>
    <key name="bookKey">
        <selector>Library/BookCatalogue/Book</selector>
        <field>Title</field>
        <field>@Category</field>
    </key>
    <keyref name="bookRef" refer="bookKey">
        <selector>Library/CheckoutRegister/Book</selector>
        <field>@titleRef</field>
        <field>@categoryRef</field>
    </keyref>
</schema>

Note how the key element declares that the combination
of the contents of Title plus the value of the attribute Category
is unique and represents the key.  Note the XPath expression
which locates the relevant nodes.  My question is this: what
is the XPath expression relative to?  What's the current node?
Is it the root of the document?  In the XML Schema spec
the examples use XPath expressions that begin with .//
What's the current node being indicated by the dot?
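
Here is a small sketch of why the context node matters. It uses the limited XPath subset in Python's standard xml.etree.ElementTree, which is only my stand-in for illustration and not how a schema processor is specified to evaluate selectors. The sample data is trimmed and uses no namespace, and key_values is a hypothetical helper that mimics the two fields of the key (Title and @Category).

```python
import xml.etree.ElementTree as ET

# Trimmed, namespace-free sample data (an assumption made for brevity).
doc = """<Library>
    <BookCatalogue>
        <Book Category="fiction"><Title>Illusions</Title></Book>
        <Book Category="non-fiction"><Title>The First and Last Freedom</Title></Book>
    </BookCatalogue>
</Library>"""

library = ET.fromstring(doc)

# With the Library element as the context node, a path that starts with
# "Library/" looks for a Library child of Library and matches nothing.
from_selector = library.findall("Library/BookCatalogue/Book")

# The same steps minus the leading "Library/" do match.
from_library = library.findall("BookCatalogue/Book")

# ".//" searches all descendants of whatever the context node is.
anywhere_below = library.findall(".//Book")


def key_values(books):
    """Collect the (Title, @Category) pair that the key treats as one value."""
    return [(book.findtext("Title"), book.get("Category")) for book in books]


keys = key_values(anywhere_below)
is_unique = len(keys) == len(set(keys))
print(len(from_selector), len(from_library), len(anywhere_below), is_unique)
```

So the selector Library/BookCatalogue/Book only works if the context is the document root rather than the Library element, which is why I am asking what the spec intends.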

11. What does maxOccurs="*" mean with respect to the <any/> element? 
e.g., what's the difference between:

   <element name="free-form">
      <type>
            <any/>
      </type>
   </element>

and

   <element name="free-form">
      <type>
            <any maxOccurs="*"/>
      </type>
   </element>

I ask this because the any element can contain
any well-formed XML document, so maxOccurs seems
to have no meaning.

Thanks for any answers you can provide.  /Roger

