Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)
Mark Birbeck
Mark.Birbeck at iedigital.net
Sat Feb 20 17:36:13 GMT 1999
Walter Underwood wrote:
> At 02:27 PM 2/19/99 -0800, Jeffrey E. Sussna wrote:
> >I think this issue needs to be addressed. It may be the case that
> >the stream contains, not one, but many documents, where each
> >information "packet" is a document. Or perhaps the afore-mentioned
> >notion of "document fragment" is introduced, and each packet is a
> >fragment.
>
> [snip]
> A series of document-packets (packuments?)
> should work fine.
I'm not sure whether introducing new terminology - fragments,
packuments, etc. - clarifies anything.
As I read XML 1.0 there is nothing wrong with interpreting an XML 'file'
or 'stream' as being made up of a number of XML documents. Many of the
discussions that have taken place on this list have been a little
confusing due to the physical and logical notions of a document being
merged.
To explain - from section 2.8 of XML 1.0, we know that:
<?xml version="1.0"?>
<greeting>Hello, world!</greeting>
is a well formed document, and so is this:
<greeting>Hello, world!</greeting>
So, you could say that:
<issue>
  <article>
    <para>Para 1</para>
    <para>Para 2</para>
    <para>Para 3</para>
  </article>
  <article>
    <para>Para 1</para>
    <para>Para 2</para>
  </article>
  <article>
    <para>Para 1</para>
    <para>Para 2</para>
    <para>Para 3</para>
    <para>Para 4</para>
  </article>
</issue>
is one document for an issue of a magazine, but it also 'contains' three
more documents - one for each article in that issue. An end-tag is
therefore effectively the end of a document - even if that document
sits inside another document (in the *logical* sense in which the word
is used in the spec). I therefore don't think we need the notion of a
'document fragment', because in XML 1.0 terms a fragment *is* a
document.
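One rough way to test this reading is to serialise each article subtree
on its own and hand it back to a parser as if it were a standalone
document. The sketch below uses Python's standard xml.etree.ElementTree
module purely for illustration; the helper name subtree_documents is
made up for this example and is not part of any tool discussed in this
thread.

```python
import xml.etree.ElementTree as ET

# The magazine issue from the example above, with three articles.
ISSUE = """<issue>
<article><para>Para 1</para><para>Para 2</para><para>Para 3</para></article>
<article><para>Para 1</para><para>Para 2</para></article>
<article><para>Para 1</para><para>Para 2</para><para>Para 3</para><para>Para 4</para></article>
</issue>"""


def subtree_documents(xml_text, tag):
    """Return every <tag> subtree of xml_text serialised as its own string.

    Hypothetical helper: each returned string is meant to be parseable
    as a standalone well-formed document.
    """
    root = ET.fromstring(xml_text)
    docs = []
    for element in root.iter(tag):
        # tostring() also emits the whitespace that follows the element
        # (its "tail"), so strip it to leave just the element itself.
        docs.append(ET.tostring(element, encoding="unicode").strip())
    return docs


articles = subtree_documents(ISSUE, "article")
for doc in articles:
    # Re-parsing succeeds only if the subtree is well formed by itself.
    reparsed = ET.fromstring(doc)
    print(reparsed.tag, len(reparsed.findall("para")))
```

Each of the three strings parses without the surrounding issue element.
This only holds so simply because the example uses no entities declared
in a DTD; a subtree that relied on such declarations would need them
carried along.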
Whether this approach is of any use to you obviously depends on what you
are doing. In our case we have stored all the data that makes up the
articles and issues of a magazine in an object-type database, and then
built interfaces onto it that allow any node and its children to be
exported as XML, as if they were a document. This means that the notion
of a document that we normally have (the physical one) is no good, since
all 'documents' are dynamic and can start at any point in the tree. More
than that, they could be the result of queries that combine nodes from
separate areas (say all articles about India, no matter what issue they
appear in) or they could be a subset of children from a node (all
articles in a certain issue that are by one author).
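As a minimal sketch of that kind of export, assuming an in-memory
element tree as a stand-in for our object database: the function
export_query, the wrapper element name "result", and the topic, author
and n attributes are all invented for this illustration and are not
what our system actually uses.

```python
import copy
import xml.etree.ElementTree as ET

# Stand-in for the stored data: two issues, three articles.
STORE = """<magazine>
<issue n="1">
<article topic="India" author="A"><para>One</para></article>
<article topic="Peru" author="B"><para>Two</para></article>
</issue>
<issue n="2">
<article topic="India" author="B"><para>Three</para></article>
</issue>
</magazine>"""


def export_query(start, predicate, wrapper="result"):
    """Export every article under `start` that matches `predicate`.

    The matches are copied under a new root element so that the output
    has a single document element, wherever the nodes came from.
    """
    out = ET.Element(wrapper)
    for article in start.iter("article"):
        if predicate(article):
            # Copy, so the stored tree is left untouched.
            out.append(copy.deepcopy(article))
    return ET.tostring(out, encoding="unicode")


root = ET.fromstring(STORE)

# All articles about India, no matter what issue they appear in.
india = export_query(root, lambda a: a.get("topic") == "India")

# A subset of one node's children: articles in issue 1 by author B.
issue_one = root.find("issue[@n='1']")
by_b = export_query(issue_one, lambda a: a.get("author") == "B")

print(india)
print(by_b)
```

Both results are ordinary well-formed documents, even though neither
exists as a physical file anywhere.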
So, this interpretation of a document is crucial in situations of
dynamic XML export. As Marc says:
> What could be accomplished is a unified solution to problems addressed
> and/or recognized in SAX, XSL, queries, DOM, and fragments. It also
> provides a model for a data server as an XML 'document' constructor.
We now treat our web servers logically as 'XML servers', holding either
one massive document or thousands of smaller ones, whichever way you
want to slice it.
(BTW, DTDs can be dynamically created too, if you're worrying that this
presentation only deals with well-formed documents.)
Regards,
Mark
Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London
W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck at iedigital.net
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1