Streaming XML and SAX
David Megginson
david at megginson.com
Wed Feb 24 21:35:57 GMT 1999
Tom Harding writes:
> David Megginson wrote:
>
> > 1. Use a non-XML mechanism for separating XML packets -- that
> > way, there's not a tight dependency between the stream-handler
> > and the parser (the stream handler knows the bounds of each
> > packet without doing any XML parsing).
>
> What bit sequence would you use as a separator and how would you
> ensure that no conceivable encoding would produce it spuriously?
I'm talking about characters, not bit sequences. For a simple
solution, you should provide the entire stream in the same character
encoding (remember that a transport protocol is allowed to override
the encoding in the XML declaration or encoding declaration).
Otherwise, the packets will need to be escaped somehow.
> > 2. Separate information about the packets from the packets
> > themselves. The information could be linear, or it could
> > itself be XML packets of a different sort. You should not
> > have to parse an entire packet to know its sequencing, etc.
>
> How could you terminate a document with another doc element? The
> only thing allowed after all legitimate Misc markup at the end of a
> document is more Misc markup.
But you're not performing XML parsing at all until you take the stream
apart first -- in other words, all the XML parser sees is the part
between the separator characters. This is the kind of layered
approach that makes for simple, maintainable systems.
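A minimal sketch of this layering in Python, under assumptions that are not part of the original message: the separator is taken to be U+001E (a control character that is not a legal XML 1.0 character, so it cannot occur inside a well-formed packet), the whole stream has already been decoded to text in one encoding, and the names split_packets, ElementNames, and parse_stream are hypothetical. The stream handler only looks for the separator; the SAX parser only ever sees the text between separators.

```python
import io
import xml.sax

# Assumption: U+001E (record separator) is used between packets.
# It is not a legal character in XML 1.0 documents.
SEPARATOR = "\x1e"


def split_packets(stream, separator=SEPARATOR, chunk_size=4096):
    """Yield the text between separators without doing any XML parsing."""
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Everything before the last separator is complete; keep the rest.
        *complete, buffer = buffer.split(separator)
        for packet in complete:
            if packet.strip():
                yield packet
    if buffer.strip():
        yield buffer


class ElementNames(xml.sax.ContentHandler):
    """Toy SAX handler that records element names in document order."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_stream(stream):
    """Parse each packet as its own XML document; return names per packet."""
    results = []
    for packet in split_packets(stream):
        handler = ElementNames()
        xml.sax.parseString(packet, handler)
        results.append(handler.names)
    return results


if __name__ == "__main__":
    demo = io.StringIO("<a><b/></a>" + SEPARATOR + "<c/>" + SEPARATOR)
    print(parse_stream(demo))
```

Because each packet is handed to the parser as a complete document, the rule that only Misc markup may follow the document element is never violated: the parser never sees the next packet.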
> > Putting a PI in the XML packet itself seems a little awkward to me.
>
> How about thinking of it as a "network-ready" document? Or if you
> like, explicitly define a "packet" as such a document.
No, it still looks like a messy architecture to me, because the
transport layer has to know about the packets -- it has to parse the
XML to get information about what it's looking at, and that adds
complexity and inefficiency. A clean architecture should separate the
layers completely, and use XML only where it has an obvious advantage
over other approaches.
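One way to keep packet information out of the XML, sketched in Python under assumptions of my own choosing: each frame is an ASCII header line holding a sequence number and a payload length, followed by exactly that many payload bytes. The frame layout and the names write_frame and read_frames are hypothetical, not something proposed in this thread. The point is only that the transport layer can sequence and delimit packets without parsing any XML.

```python
import io


def write_frame(out, seq, payload):
    """Write one frame: an ASCII header line, then the raw payload bytes."""
    out.write(f"{seq} {len(payload)}\n".encode("ascii"))
    out.write(payload)


def read_frames(stream):
    """Yield (seq, payload) pairs using only the header.

    The payload is treated as opaque bytes and is never parsed here.
    """
    while True:
        header = stream.readline()
        if not header:
            return  # clean end of stream
        seq_text, length_text = header.decode("ascii").split()
        length = int(length_text)
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated frame")
        yield int(seq_text), payload


if __name__ == "__main__":
    buf = io.BytesIO()
    write_frame(buf, 2, b"<b/>")
    write_frame(buf, 1, b"<a>x</a>")
    buf.seek(0)
    # Reorder by sequence number before handing payloads to an XML parser.
    for seq, payload in sorted(read_frames(buf)):
        print(seq, payload)
```

With a length in the header, the payload can be in any encoding, since the transport counts bytes and never searches the payload for a separator.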
All the best,
David
--
David Megginson david at megginson.com
http://www.megginson.com/
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1