Streams, protocols, documents and fragments

Borden, Jonathan jborden at
Wed Feb 24 02:50:00 GMT 1999

> Jonathan Borden writes:
> > <term>document</term> is defined as in the XML spec. documents are well
> > formed. when a document fragment is isolated from its parent document, it
> > becomes a standalone document.
> Sounds fine so far...
> > a document may contain a prolog. a document fragment may not. a document may
> > contain a !DOCTYPE definition (DTD), a document fragment may not. Hence all
> > document fragments are legal documents but not all documents are legal
> > document fragments.
> I think I follow what you are saying, but I'm confused why you would
> choose to define a document fragment in this way.  Why can't it
> contain a prolog?

	Because then it is a document. My sole purpose in discussing 'document
fragments' was that the thread had gotten stuck on the notion that a
continuous XML stream would contain a single long document (perhaps without a
closing tag) and that the actual PDUs would consist of document fragments. The
point is that if we create a protocol on a stream which transmits multiple
documents, there is no loss of functionality compared with a solution
employing 'document fragments'.

> Are you assuming that document fragment must be
> produced as a reduction of a parent document?  It strikes me as very
> odd to define 'document fragment' as a superset of 'document'.

	On the contrary: if all legal doc frags were in fact docs, then doc would
be a superset of doc frag. But it has been pointed out that doc frags aren't
always legal docs (when non-default charsets are used).
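
One way to read the charset objection, as a rough sketch only: a fragment
that has lost its encoding declaration is parsed as UTF-8 by default, so bytes
in another charset no longer form a legal document. The element name and
sample text below are made up for illustration, and Python's standard
xml.etree.ElementTree stands in for "an XML processor".

```python
import xml.etree.ElementTree as ET

# Hypothetical sample content: one element containing a non-ASCII character,
# serialized in ISO-8859-1 rather than the default UTF-8.
fragment = "<name>caf\u00e9</name>".encode("iso-8859-1")

# The same bytes with an XML declaration naming the encoding in front.
document = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + fragment


def parses(data):
    """Return True if the parser accepts data as a well-formed document."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print("with declaration:   ", parses(document))
    print("without declaration:", parses(fragment))
```

The bare fragment is rejected because the byte 0xE9 is not valid UTF-8 in
that position; with the declaration in place the same bytes are accepted.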

> > So, the problem here is not one with XML, rather the protocol used to
> > transmit documents, HTTP and SMTP send one MIME message per PDU, streaming
> > protocols can be defined which transmit multiple documents.
> But the definition of XML processor does become a problem here.  If
> the stream consists of multiple XML documents, one must use an
> XML-aware processor to parse it.  But this had better be a
> non-conforming XML processor, since according to the spec a
> 'conforming XML processor' must cry foul if its input doesn't have one
> and only one root element.

	It is the responsibility of the *protocol* to pass a document to the XML
parser. There is no requirement that the stream be passed unadulterated to
the XML parser. The suggestion of delimiting characters allows the protocol
layer to easily chop the stream into individual documents.
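
A minimal sketch of such a protocol layer, under stated assumptions: the
delimiter (the ASCII record separator, 0x1E, which is not a legal character
inside an XML 1.0 document) and the function names are my own choices for
illustration, not something fixed by this thread. The XML parser only ever
sees one complete document at a time.

```python
import xml.etree.ElementTree as ET

# Assumed delimiter for this sketch: ASCII record separator (0x1E).
DELIMITER = b"\x1e"


def split_documents(chunks, delimiter=DELIMITER):
    """Yield each delimiter-separated document from an iterable of byte chunks."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        while True:
            head, sep, tail = buffer.partition(delimiter)
            if not sep:
                break  # no complete document buffered yet
            buffer = tail
            if head.strip():
                yield head
    if buffer.strip():
        # Whatever remains when the stream ends is treated as a final document.
        yield buffer


def parse_stream(chunks):
    """Hand each document to the XML parser separately; return the root elements."""
    return [ET.fromstring(doc) for doc in split_documents(chunks)]


if __name__ == "__main__":
    # Chunk boundaries need not line up with document boundaries.
    stream = [b"<a>1</a>\x1e<b", b">2</b>\x1e<c/>"]
    for root in parse_stream(stream):
        print(root.tag)
```

Each call to the parser receives exactly one root element, so a conforming
XML processor has nothing to complain about.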

Jonathan Borden
