About XML document linkage and Schemas.

Thu Nov 25 21:12:10 GMT 1999

Hi Mark,

Mark said:
Since you can never know all the occurrences of the meta data for a
document, I would have thought this is impossible. Of course you could
link to the meta data you happen to know about, but then it gives the
appearance of completeness when there is none. Shouldn't you just link
the meta data to the document, for which RDF is very suitable?

Didier replies:
Of course you cannot know in advance all the possible meta data about an XML
document. But let's say that you want to provide a limited set for a
context like, for instance, e-commerce. And if we think a bit more about the
meta data topic, what we want, most of the time, is not the whole set but
rather what is pertinent to the context.

Use case:
You have an invoice formatted as an XML document to send to a business
partner. Some choices are given to you:
a) use the biztalk framework, which provides some meta data for your
document and transforms your document into a biztalk document fragment.
b) use another mechanism where your document remains the document and either
1) has, associated to it, a certain meta data set packaged in a separate
document (i.e. an href) or 2) includes the limited meta data set.
c) other options...

To rely on a) means that Microsoft decides what to include in this
limited set of meta data, such as:
a) to whom or what this document is targeted
b) from whom or what this document is coming
c) for which purpose this document is used
etc...

The good point is that the meta data set is defined, the bad point is that a
monopoly controls this set :-)

If, however, there is a standard way to refer to or include a set of meta
data particular to a certain XML document, then a monopoly is not at the
heart of the decision process on how you format your document (transformed
into a document fragment).

So, for instance, if we were able to do:

<myInvoice xmlns="http://www.xml.org/Myinvoice">
  <!-- this URI would point to a page containing links about this name
       space, as currently found on the W3C site for their own name spaces -->
  ..... some content here......
  <rdf:RDF xmlns:rdf=".....">  <!-- again, same thing as above -->
    ... all the limited meta data set here.....
  </rdf:RDF>
  ... other content here....
</myInvoice>

So here what is received by the trading partner is a "myInvoice" document
type or, more particularly, a "myInvoice" name space or vocabulary defined
somewhere by a) a schema, b) a human readable document explaining the
schema. Preferably, the document at the other end of the URI would be an
XML document able to contain more than just the schema.
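
To make the embedded case concrete, here is a minimal sketch in Python of
how a receiver might pull the limited meta data set out of such an invoice.
The RDF name space URI is the one from the RDF recommendation; everything
else (the "urn:example:tx#" vocabulary, its to/from/purpose properties, the
sample values and the function name) is an assumption made up for
illustration only.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Hypothetical invoice: the "ex" vocabulary and all values are invented
# for this sketch; only the overall shape follows the example above.
INVOICE = """\
<myInvoice xmlns="http://www.xml.org/Myinvoice">
  <total>120.00</total>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:ex="urn:example:tx#">
    <rdf:Description rdf:about="urn:example:invoice:42">
      <ex:to>Buyer Ltd.</ex:to>
      <ex:from>Seller Inc.</ex:from>
      <ex:purpose>payment request</ex:purpose>
    </rdf:Description>
  </rdf:RDF>
</myInvoice>
"""


def embedded_metadata(xml_text):
    """Collect the properties of every rdf:Description found in the document.

    Returns a dict mapping each property's local name to its text value.
    """
    root = ET.fromstring(xml_text)
    result = {}
    for description in root.iter(f"{{{RDF_NS}}}Description"):
        for prop in description:
            # ElementTree tags look like "{namespace}local"; keep the local part.
            local_name = prop.tag.split("}", 1)[-1]
            result[local_name] = (prop.text or "").strip()
    return result


metadata = embedded_metadata(INVOICE)
print(metadata)
```

The invoice stays the main document here; the meta data is just one more
fragment inside it that the receiver may read or ignore.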

The good thing is that now your document is no longer a fragment in a
document controlled by a monopoly but a document that you and your business
partners agreed on. The bad thing is that you do not have a monopoly to
organize your life :-). No, seriously, you would have to define with your
business partners the limited meta data set required for a transaction
context.

So, on one hand, there is a way to convey a limited meta information set:
create a new document type, name space or vocabulary, as is currently done
with the biztalk framework. In this case, the "MyInvoice" name space is
included in the main biztalk document as a fragment. Thus, the biztalk
document has been created mainly to convey a limited meta data set, and the
mechanism used to convey the document and its meta data is to transform the
original document into a fragment and embed this fragment in the meta data
vehicle, which by the way becomes the main document. What is driving the
cow? The tail or the head?
On the other hand, your document could stay a document, with any meta data
set included as a fragment (or linked as an external meta data document).

So, the real issue about the whole biztalk affair, or any biztalk-like kind
of framework, is: what is driving the cow? The tail or the head? Or, said
more formally ;-), do we embed the document as a fragment in the meta data
document, or do we instead include the meta data as a fragment in the
document? The tail? The head? Which one? :-))

My concern or focus is: is there a standard way to include a limited meta
data set in a document? Currently, it seems that yes, I can embed an RDF
fragment to do so. However, there is no standard way to refer to an external
meta data document (i.e. having an href to the RDF document or an href to a
collection of RDF documents). So the document may embed an RDF fragment in a
standard way using the RDF recommendation, but there is no standard way to
have a document refer to its related meta data (as I said, a limited set
pertinent to the context).
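
To illustrate what such a reference could look like, here is a sketch that
borrows the form of the xml-stylesheet processing instruction. Please read
it as an assumption only: the "metadata-link" processing instruction, the
file names and the function name are all hypothetical, and no specification
defines them. The code simply lists the links a user agent would find in the
document prolog.

```python
import re
from xml.dom import minidom

# Hypothetical document: <?xml-stylesheet?> is a real W3C mechanism, while
# <?metadata-link?> is invented here purely to show the missing counterpart.
DOCUMENT = """\
<?xml-stylesheet href="invoice.xsl" type="text/xsl"?>
<?metadata-link href="invoice-meta.rdf" type="application/rdf+xml"?>
<myInvoice xmlns="http://www.xml.org/Myinvoice">
  <total>120.00</total>
</myInvoice>
"""

# Pseudo-attributes inside a processing instruction: name="value"
PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


def linked_resources(xml_text, target):
    """Return one dict of pseudo-attributes per top-level PI named `target`."""
    doc = minidom.parseString(xml_text)
    links = []
    for node in doc.childNodes:
        is_pi = node.nodeType == node.PROCESSING_INSTRUCTION_NODE
        if is_pi and node.target == target:
            links.append(dict(PSEUDO_ATTR.findall(node.data)))
    return links


meta_links = linked_resources(DOCUMENT, "metadata-link")
style_links = linked_resources(DOCUMENT, "xml-stylesheet")
print(meta_links)
print(style_links)
```

Several such instructions could appear in one document, which would cover
the "collection of RDF documents" case as well.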

Note that this last feature is very important. Let's say now that I send
you a document about XML. It could be created dynamically by a tool
which will send you not only the document but also a collection of meta
information about the XML subject. The meta data collection could cover
other related topics, and thus could use something like topic map links,
information on who created the document, etc... The browser can then show
you what's related to this document. To do so, we need a standard mechanism
to point from an XML document to a limited meta data set. I said limited
throughout this message because, as you said, we won't provide the whole set
of meta data about a document but always a limited set: the set of meta data
that we own, are aware of or simply can provide or, even more pertinent, the
meta data related to the context.

Thus, we can now envision that an XML document is not an island but a
document that _could_ have:
a) a link to, or an embedded, style sheet for rendition
b) a link to, or an embedded, limited set of meta data about this document
c) a link to, or an embedded, set of other kinds of meta data, like
information about its structure and what is valid (different from (b),
which provides more contextual or semantic information about the content
than about the form or the structure)
d) obviously, links in the document content itself to other documents

To better imagine that an XML document could contain a limited meta data
set, it suffices to imagine that the XML document is not a simple file
resident on a system and simply accessed through a file system or an HTTP
GET, but more like a dynamically created document for which the user agent
may ask to:
a) get all the information (i.e. the document itself and related
information, through standardized links or standard fragments)
b) get only the document without any related information
c) get only a fraction of the related information.
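
The three requests above can be sketched as one small function on the
sending side. This is only an illustration under assumptions: the mode names
("all", "document", "partial"), the function name and the "urn:example:tx#"
vocabulary are invented, and the meta data is assumed to be embedded as an
rdf:RDF fragment.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ROOT = f"{{{RDF_NS}}}RDF"

# Hypothetical invoice with an embedded meta data fragment (values invented).
INVOICE = """\
<myInvoice xmlns="http://www.xml.org/Myinvoice">
  <total>120.00</total>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:ex="urn:example:tx#">
    <rdf:Description rdf:about="urn:example:invoice:42">
      <ex:to>Buyer Ltd.</ex:to>
      <ex:from>Seller Inc.</ex:from>
      <ex:purpose>payment request</ex:purpose>
    </rdf:Description>
  </rdf:RDF>
</myInvoice>
"""


def view(xml_text, mode="all", keep=()):
    """Return the document as requested by the user agent.

    mode "all":      document plus all embedded meta data (case a)
    mode "document": document only, rdf:RDF fragments removed (case b)
    mode "partial":  keep only meta data properties whose local name
                     is listed in `keep` (case c)
    """
    if mode not in ("all", "document", "partial"):
        raise ValueError(f"unknown mode: {mode!r}")
    root = ET.fromstring(xml_text)
    if mode == "all":
        return root
    # Snapshot the element list first so removals do not disturb iteration.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag != RDF_ROOT:
                continue
            if mode == "document":
                parent.remove(child)
            else:  # "partial": prune unwanted properties in each description
                for description in child:
                    for prop in list(description):
                        if prop.tag.split("}", 1)[-1] not in keep:
                            description.remove(prop)
    return root


full = view(INVOICE, "all")
bare = view(INVOICE, "document")
partial = view(INVOICE, "partial", keep={"purpose"})
print(ET.tostring(partial, encoding="unicode"))
```

With linked rather than embedded meta data the same idea would apply, except
that the sender would add or omit the links instead of the fragments.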

Hope I gave a better view of my XML world view and of part of the
rationale behind what I said.

Cheers
Didier PH Martin
mailto:martind at netfolder.com
http://www.netfolder.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1