Streams, protocols, documents and fragments was: RE: Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)

Borden, Jonathan jborden at
Mon Feb 22 03:26:27 GMT 1999

James Tauber wrote:

> <ExecutiveSummary>
> Mark and I agree on and are both excited about the value of the concepts
> we are discussing: namely promoting an element in a document to the status
> of document element in its very own document.
> I'm also interested in the reverse: demoting a document element to the
> status of a normal element in a larger document.
> Our disagreement seems to stem from the fact that I don't believe you have
> an "XML document" until you serialise as well-formed XML text. That's my
> understanding of the XML 1.0 REC.
> Mark uses the terms "physical" and "logical" XML documents where, by the
> former I think he means what I think of as an XML document in the sense of
> the XML 1.0 REC, i.e. serialised text. By the latter I think he means a
> more abstract representation of the type being developed by the XML
> Infoset WG.
> In fairness to Mark, the terms "physical" and "logical" are used in this
> way in the Infoset Requirements. However, I would argue that the term "XML
> document", at least as used in the XML 1.0 REC, is only ever "physical".
> There is an equivalent logical representation but that is yet to be
> standardised by the Infoset WG.
> Most people probably think I am just being pedantic.
> I am.
> I'm also trying to follow the spec :-)
> </ExecutiveSummary>

	I think this whole discussion is getting muddled because terminology from
different domains is being used interchangeably.

Some definitions:

<term>document</term> is defined as in the XML spec. Documents are
well-formed. When a document fragment is isolated from its parent document,
it becomes a standalone document.

A document may contain a prolog; a document fragment may not. A document may
contain a !DOCTYPE declaration (DTD); a document fragment may not. Hence all
document fragments are legal documents, but not all documents are legal
document fragments.
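
To make the "isolated fragment becomes a standalone document" idea concrete,
here is a minimal sketch using Python's standard xml.etree.ElementTree. The
function name promote, the element names and the sample price are made up
for illustration; nothing here comes from the XML spec or the Infoset
Requirements.

```python
import copy
import xml.etree.ElementTree as ET


def promote(parent_text, path):
    """Return the element found at `path` as the text of its own document.

    `promote` is a hypothetical helper: it parses the parent document,
    copies one element out of it, and serialises that element alone, so
    the result has no prolog and no DOCTYPE declaration.
    """
    root = ET.fromstring(parent_text)
    element = root.find(path)
    if element is None:
        raise ValueError("no element matches %r" % path)
    standalone = copy.deepcopy(element)
    # Text that followed the element inside its parent belongs to the
    # parent, not to the new document, so drop it.
    standalone.tail = None
    return ET.tostring(standalone, encoding="unicode")


parent = '<portfolio><quote symbol="IBM">171.5</quote> trailing text</portfolio>'
fragment = promote(parent, "quote")
# The isolated fragment parses on its own as a well-formed document.
assert ET.fromstring(fragment).tag == "quote"
```

The reverse direction, demoting a document element into a larger document,
is just an append of the parsed root onto some other element.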

<term>stream</term> is ambiguous but generally refers to a series of bits or
bytes or characters. In general, a stream behaves similarly to a socket.

<term>protocol</term> is layered above a network transport, or socket, and
defines a mutually agreed upon mechanism to exchange messages and other

So what does this have to do with XML? The canonical example of streamed
XML is the stock ticker. Assuming each stock quote is transmitted as a
document, HTTP can employ a particular URL, e.g.
http://wherever/quotes/next, to return the next quote as a single document.
Suppose we wish to transmit 100 quotes as distinct documents. This does not
work with HTTP, which returns a single MIME message response for each
request. The solutions would be to 1) employ multipart messages, 2) wrap the
quotes in a single document, or 3) use another protocol.
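
As a rough illustration of option 2, the sketch below wraps several quote
documents in one enclosing document so they fit in a single response. The
wrapper element name "quotes", the helper name wrap_documents and the sample
prices are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def wrap_documents(doc_texts, wrapper_tag="quotes"):
    """Demote each document element to a child of one wrapper element.

    `wrap_documents` and the default wrapper tag are hypothetical names.
    Any prolog on the individual documents is dropped by the parser.
    """
    wrapper = ET.Element(wrapper_tag)
    for text in doc_texts:
        wrapper.append(ET.fromstring(text))
    return ET.tostring(wrapper, encoding="unicode")


quote_docs = [
    '<quote symbol="IBM">171.5</quote>',
    '<quote symbol="SUNW">97.25</quote>',
]
combined = wrap_documents(quote_docs)
symbols = [q.get("symbol") for q in ET.fromstring(combined)]
```

The receiver gets one well-formed document and can promote each child back
to a document of its own if it needs to.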

Suppose we use raw sockets? Nothing prevents sending one document after
another down the socket. The end of one document and the start of another
are unambiguous, assuming the documents are well-formed.
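
A minimal sketch of how a receiver might find those boundaries by tracking
element depth: a document ends when the tag that closes its root element is
seen. This is a deliberately simplified scanner, not a real XML parser. It
assumes each document is a bare root element with no prolog, comments, CDATA
sections or processing instructions, and no '<' or '>' inside attribute
values. The function name and sample data are invented for the example.

```python
import re

# Matches a start tag, end tag or empty-element tag (simplified, see above).
TAG = re.compile(r"<(/?)([A-Za-z_][\w.\-:]*)[^<>]*?(/?)>")


def split_documents(text):
    """Split back-to-back well-formed documents by tracking element depth."""
    docs = []
    depth = 0
    start = 0
    for match in TAG.finditer(text):
        is_end_tag = bool(match.group(1))
        is_empty_tag = bool(match.group(3))
        if is_end_tag:
            depth -= 1
            if depth == 0:
                # The root element just closed: one document is complete.
                docs.append(text[start:match.end()])
        elif is_empty_tag:
            if depth == 0:
                # An empty root element is a whole document by itself.
                docs.append(match.group(0))
        else:
            if depth == 0:
                start = match.start()
            depth += 1
    return docs


received = (
    '<quote symbol="IBM"><price>171.5</price></quote>'
    '<quote symbol="SUNW"><price>97.25</price></quote>'
)
documents = split_documents(received)
```

A production receiver would drive an incremental XML parser instead of a
regular expression, but the depth-counting idea is the same.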

So the problem here is not with XML but with the protocol used to transmit
documents. HTTP and SMTP send one MIME message per PDU; streaming protocols
can be defined which transmit multiple documents.
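
One way such a streaming protocol could be defined (purely an assumption for
illustration, not an existing standard) is to prefix each document with its
length in bytes, so the receiver does not have to parse the XML to find the
boundary. The sketch uses an in-memory buffer in place of a socket, and the
function names are hypothetical.

```python
import io
import struct


def write_document(stream, xml_text):
    """Send one document as a 4-byte big-endian length plus UTF-8 bytes."""
    payload = xml_text.encode("utf-8")
    stream.write(struct.pack("!I", len(payload)))
    stream.write(payload)


def read_documents(stream):
    """Yield documents from the stream until it is exhausted."""
    while True:
        header = stream.read(4)
        if not header:
            return  # clean end of stream
        if len(header) < 4:
            raise EOFError("truncated length prefix")
        (length,) = struct.unpack("!I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated document")
        yield payload.decode("utf-8")


# io.BytesIO stands in for a connected socket's file object.
channel = io.BytesIO()
write_document(channel, '<quote symbol="IBM">171.5</quote>')
write_document(channel, '<quote symbol="SUNW">97.25</quote>')
channel.seek(0)
received_docs = list(read_documents(channel))
```

Length prefixes trade a little overhead for a receiver that can hand each
complete document to an ordinary, non-streaming parser.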

Jonathan Borden
