Streaming XML and SAX

David Megginson david at
Wed Feb 24 21:35:57 GMT 1999

Tom Harding writes:

 > David Megginson wrote:
 > > 1. Use a non-XML mechanism for separating XML packets -- that
 > >    way, there's not a tight dependency between the stream-handler
 > >    and the parser (the stream handler knows the bounds of each
 > >    packet without doing any XML parsing).
 > What bit sequence would you use as a separator and how would you
 > ensure that no conceivable encoding would produce it spuriously?

I'm talking about characters, not bit sequences.  For a simple
solution, you should provide the entire stream in the same character
encoding (remember that a transport protocol is allowed to override
the encoding in the XML declaration or encoding declaration).
Otherwise, the packets will need to be escaped somehow.
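
To illustrate separating on characters rather than bit sequences, here
is a minimal sketch in Python.  It assumes the whole stream arrives in
one encoding (UTF-8 here) and that packets are separated by a form feed
(U+000C), which is not a legal character in XML 1.0 and so cannot occur
inside a well-formed packet.  The separator choice and the name
split_packets are illustrative assumptions, not part of any proposal in
this thread.

```python
import codecs

# Assumption: form feed as separator; it is not a legal XML 1.0 character.
SEPARATOR = "\x0c"


def split_packets(chunks, encoding="utf-8"):
    """Yield each packet as a str from an iterable of byte chunks.

    The bytes are decoded to characters first, so the separator is
    matched as a character and never as a raw bit sequence.
    """
    # An incremental decoder copes with multi-byte characters that
    # straddle two chunks.
    decoder = codecs.getincrementaldecoder(encoding)()
    buffer = ""
    for chunk in chunks:
        buffer += decoder.decode(chunk)
        while SEPARATOR in buffer:
            packet, buffer = buffer.split(SEPARATOR, 1)
            if packet.strip():
                yield packet
    # Flush the decoder and emit any trailing packet without a separator.
    buffer += decoder.decode(b"", final=True)
    if buffer.strip():
        yield buffer
```

Because decoding happens before splitting, the same code should work
for any encoding the transport declares, as long as the entire stream
uses it.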

 > > 2. Separate information about the packets from the packets
 > >    themselves.  The information could be linear, or it could
 > >    itself be XML packets of a different sort.  You should not
 > >    have to parse an entire packet to know its sequencing, etc.
 > How could you terminate a document with another doc element?  The
 > only thing allowed after all legitimate Misc markup at the end of a
 > document is more Misc markup.

But you're not performing XML parsing at all until you take the stream 
apart first -- in other words, all the XML parser sees is the part
between the separator characters.  This is the kind of layered
approach that makes for simple, maintainable systems.
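
The layering described above can be sketched as follows: the stream is
taken apart first, and each piece is then handed to a SAX parser as a
complete document of its own.  This is only a sketch under assumptions:
the form-feed separator, the ElementNames handler, and parse_stream are
hypothetical names, and the packets are assumed to carry no encoding
declaration that conflicts with UTF-8.

```python
import xml.sax

# Assumption: form feed as separator; it is not a legal XML 1.0 character.
SEPARATOR = "\x0c"


class ElementNames(xml.sax.ContentHandler):
    """Collect element names, standing in for real application logic."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_stream(text):
    """Split the stream first, then parse each packet independently."""
    results = []
    for packet in text.split(SEPARATOR):
        if not packet.strip():
            continue  # skip empty pieces, e.g. after a trailing separator
        handler = ElementNames()
        # The parser only ever sees one complete document at a time.
        xml.sax.parseString(packet.encode("utf-8"), handler)
        results.append(handler.names)
    return results
```

Each call to the parser starts and ends a document normally, so the
question of what may follow the document element never arises.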

 > > Putting a PI in the XML packet itself seems a little awkward to me.
 > How about thinking of it as a "network-ready" document?  Or if you
 > like, explicitly define a "packet" as such a document.

No, it still looks like a messy architecture to me, because the
transport layer has to know about the packets -- it has to parse the
XML to get information about what it's looking at, and that adds
complexity and inefficiency.  A clean architecture should separate the
layers completely, and use XML only where it has an obvious advantage
over other approaches.
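
One way to keep the transport layer free of XML parsing is to carry
the sequencing information in a small non-XML header in front of each
packet.  The sketch below uses a hypothetical frame format, an ASCII
line "<sequence> <length>" followed by exactly that many bytes of
payload; the format and the name read_frames are assumptions made for
illustration, and length-prefixing is a swapped-in alternative to the
separator character discussed earlier.

```python
import io


def read_frames(stream):
    """Yield (sequence_number, payload_bytes) from a binary stream.

    Hypothetical frame layout: an ASCII header line "<seq> <length>\n"
    followed by exactly <length> bytes of XML, which is never inspected.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream
        seq_text, length_text = header.decode("ascii").split()
        length = int(length_text)
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated frame")
        yield int(seq_text), payload
```

For example, iterating over read_frames(io.BytesIO(b"1 4\n<a/>"))
gives the sequence number and the raw packet bytes without any XML
parsing; the payload can then be passed to the parser as a separate
step.  Counting bytes also avoids the need to escape the payload.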

All the best,


David Megginson                 david at

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: and on CD-ROM/ISBN 981-02-3594-1