About XML document linkage and Schemas.

Didier PH Martin martind at netfolder.com
Sat Nov 27 20:38:23 GMT 1999

Hi Mark,

Sorry for the delay in answering, but your comment required some thought.

From the point of view of validity (as per the XML 1.0 recommendation) you
are right: it is not valid to embed an RDF element if it and its
descendants are not declared in the document's DTD. Conversely, it
is valid (as per the XML 1.0 recommendation) to embed the <MyInvoice>
element in an <rdf:description> element. So I won't comment on whether the
meta data format should include the document or the other way around, nor
on whether RDF is meta data at all or simply a different data model, nor
will I say that one is better than the other. I'll just play by the rules
and see whether they are incongruous, strange or simply too restrictive in
the light of what we are trying to do today or of what we have learned to
do.

Funny things that we can observe today with the current XML framework:

a) We can include a link to a style sheet in a document and still have this
document valid as per the XML 1.0 recommendation, mainly because the link
takes the form of a processing instruction. A funny thing to note is that
the style sheet reference is at the same time a link and, from another
point of view, a processing instruction. And from the document's structural
point of view it is a non-existent thing, a new incarnation of the
invisible man.

If, because of transport constraints, we have to include the entire style
sheet content in the document, then the <xsl:stylesheet> element and all
its descendants would have to be declared in the document's DTD for the
document to be parsed with validity in mind. Thus, if we want the original
document's structural integrity respected (always from the XML 1.0 point of
view), we cannot create a compound document (a document composed of
fragments, with all the fragments resident in the same text "package"); we
are restricted to linking to an outside style sheet (in a separate text
"package") rather than including it. Yet in certain circumstances
transporting a single "package" is a necessity. To create a compound
document, we would therefore need a format other than XML to carry the
document and the style sheet in the same text "package". Let's call this
constraint "use case 1", and remember that under certain circumstances it
would be easier and beneficial to have all this information (i.e. the
document and its style sheet) in the same text "package", either to
minimize the ping-pong effect or simply to transport the whole through a
messaging system that imposes such a constraint (e.g. SMTP or message-based
middleware).
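
To make use case 1 concrete, here is a minimal Python sketch. It is not a
real DTD validator: the set DECLARED is a hypothetical stand-in for the
element declarations of an invoice DTD, and the <Total> element, the
invoice.xsl file name and the function name are assumptions made up for
illustration. It only reports which elements in the "package" were never
declared.

```python
from xml.dom import minidom

# Hypothetical stand-in for the element declarations of the invoice DTD.
DECLARED = {"MyInvoice", "Total"}

# Style sheet linked through a processing instruction (separate "package").
LINKED = (
    '<?xml-stylesheet type="text/xsl" href="invoice.xsl"?>'
    "<MyInvoice><Total>42</Total></MyInvoice>"
)

# Style sheet embedded in the document (single "package").
# In a real DTD the xmlns:xsl attribute would need a declaration too.
COMPOUND = (
    '<MyInvoice xmlns:xsl="http://www.w3.org/1999/XSL/Transform">'
    '<xsl:stylesheet version="1.0"><xsl:template match="/"/></xsl:stylesheet>'
    "<Total>42</Total>"
    "</MyInvoice>"
)


def undeclared_elements(xml_text, declared=DECLARED):
    """Return, in document order, the names of elements not in `declared`."""
    doc = minidom.parseString(xml_text)
    return [
        el.tagName
        for el in doc.getElementsByTagName("*")
        if el.tagName not in declared
    ]


if __name__ == "__main__":
    print(undeclared_elements(LINKED))
    print(undeclared_elements(COMPOUND))
```

The linked form passes this toy check because the processing instruction is
not an element at all; the compound form trips over every element that
belongs to the style sheet.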

b) In the same vein, if we were to include a reference to meta data about a
particular context concerning this document, we could use the same
mechanism as in (a) and still respect the original document's structural
integrity (I mean here without affecting or modifying it). By including a
processing instruction that serves as a link to a meta data document (i.e.
an RDF document), and that an RDF engine also treats as precisely that, a
processing instruction, we get what we want. Unless, that is, we add the
constraint of a single text "package" to transport both the document and
its meta data. To be valid as per the XML 1.0 recommendation, the document
cannot include the meta data <RDF> element without a modification of its
structure; it can only link to it through a processing instruction, which
is like the invisible man (remember the movie? :-) for an XML 1.0
validating parser.

c) The only thing that does not affect an XML 1.0 valid document is its own
DTD, which can be included in the same text "package" without affecting the
document's integrity. Thus, the DTD and the document itself can travel in
the same unit.
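
As a small illustration of (c), the sketch below uses Python's standard
xml.dom.minidom to read a document that carries its DTD as an internal
subset. The element declarations and the <Total> element are assumptions
made up for the example, and minidom does not validate; the point is only
that the DTD rides in the same text "package" while the element structure
stays what it was.

```python
from xml.dom import minidom

# The same invoice, once bare and once with its DTD in the internal subset.
BARE = "<MyInvoice><Total>42</Total></MyInvoice>"
WITH_DTD = (
    "<!DOCTYPE MyInvoice ["
    "<!ELEMENT MyInvoice (Total)>"
    "<!ELEMENT Total (#PCDATA)>"
    "]>" + BARE
)


def element_names(xml_text):
    """Return the element names of the document, in document order."""
    doc = minidom.parseString(xml_text)
    return [el.tagName for el in doc.getElementsByTagName("*")]


def internal_subset(xml_text):
    """Return the internal DTD subset as text, or None if there is none."""
    doctype = minidom.parseString(xml_text).doctype
    return None if doctype is None else doctype.internalSubset


if __name__ == "__main__":
    print(element_names(BARE) == element_names(WITH_DTD))
    print(internal_subset(WITH_DTD))
```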

So, what is the vision of the world from the XML 1.0 recommendation's point
of view? Or, if XML 1.0 were a philosopher, what would he say? That only
the document type definition is part of the document. If you want to
include something else (a joke, a meta data context, a style sheet, a virus
:-) in the document without affecting its structural integrity, do it with
a processing instruction or a comment. Both are incarnations of the
invisible man for an XML validating parser.
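
The "invisible man" can be seen with a few lines of Python and the standard
xml.dom.minidom parser (a sketch only; minidom is a non-validating parser,
and the <Total> element and the invoice.xsl file name are assumptions made
up for the example). The processing instruction and the comment are present
as nodes of the document, yet the element structure, which is what a
content model describes, is identical with or without them.

```python
from xml.dom import minidom

PLAIN = "<MyInvoice><Total>42</Total></MyInvoice>"
DECORATED = (
    '<?xml-stylesheet type="text/xsl" href="invoice.xsl"?>'
    "<!-- a joke, a meta data context, whatever -->"
    + PLAIN
)


def element_names(xml_text):
    """Return the element names of the document, in document order."""
    doc = minidom.parseString(xml_text)
    return [el.tagName for el in doc.getElementsByTagName("*")]


def prolog_instructions(xml_text):
    """Return (target, data) for each top-level processing instruction."""
    doc = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in doc.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]


if __name__ == "__main__":
    # Same element structure with and without the extra nodes.
    print(element_names(PLAIN) == element_names(DECORATED))
    # The link is still there for any application that cares to look.
    print(prolog_instructions(DECORATED))
```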


Let's call that use case #3 (#1 being the style sheet affair, #2 being the
meta data affair, #3 being both; what a busy man :-)
But now, how do we aggregate things and still validate them? We are waiting
for an answer from the Schema WG. Can we add complementary things to a
document without affecting its structural integrity? A DTD does not affect
the structural integrity; a style sheet does, and so does a meta data
context. Yet all of them are complementary elements. Have we regressed from
what we learned about compound documents (remember the frameworks developed
at the beginning of the 90s, just before we all came down with compound
document amnesia)? Or did we learn anything at all?

So I can include new stuff that is not necessarily part of the original XML
document, as long as I include it as processing instructions or comments.
But it is hard to aggregate, add complementary information and create rich
content while still being valid.

So, for an XML citizen, being valid is too restrictive. Maybe this is why
most of these citizens choose to be only "well formed".

Didier PH Martin
mailto:martind at netfolder.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)