Streaming XML (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)

Mark Birbeck Mark.Birbeck at iedigital.net
Sun Feb 21 14:28:41 GMT 1999


Couldn't find a coin - so I suppose I should respond:

Marcelo Cantos wrote:
> On Sat, Feb 20, 1999 at 04:08:24PM -0000, Mark Birbeck wrote:
> > then we can't put anything on that
> > wire other than news headlines (and really you shouldn't process
> > anything until you receive that closing element, but I know that's
> > what people are requesting they can do).
> 
> I disagree with that last parenthesised remark.  Stream-based parsers
> do and indeed should process data as it arrives.  XML browsers _most
> certainly_ should do so.
> 
> Not that I disagree with your overall point (I haven't really given it
> that much thought), but the above is definitely wrong IMO.

You seem to have missed the point of the discussion. The question is
whether it is legitimate to open a stream of XML with some sort of
element like:

    <stockPrices>

and then spend the rest of the day sending out things like:

    <stockPrice>
        <ticker>MSFT</ticker>
        <price>1000</price>
    </stockPrice>

and then at the end of the day, sending:

    </stockPrices>

No one so far in the discussion has argued that this is good XML
(except you, Marcelo, but you can be excused because you haven't given
it much thought), because if you were validating this you should not
(CAN NOT!) say the document 'stockPrices' is valid until you receive the
closing tag. That would mean you couldn't process the intervening prices
until you had validated the entire document, which in turn would make
your data feed useless.

So, what people are discussing is whether there is any way of keeping
within the principles of XML and still having an 'infinite document' or
an 'open-ended document' or whatever. In other words, how can we
correctly process those intervening 'stockPrice' elements when we
haven't yet received the complete document to which they belong? Now,
you just say that stream-based parsers *should* do this. But if you
think that, back it up. Everyone else in this discussion has said why
they are for or against such an approach. If it's obvious to you, then
please share.
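
To make the question concrete, here is a minimal sketch in Python (my
choice of language for illustration; nothing in this thread prescribes
it) using the standard library's non-validating incremental parser,
xml.etree.ElementTree.XMLPullParser. The helper name feed_and_collect is
hypothetical. It shows that each 'stockPrice' can be read as it arrives,
but also that the parser cannot confirm the 'stockPrices' document is
even complete, let alone valid, while the closing tag is still missing.

```python
import xml.etree.ElementTree as ET


def feed_and_collect(parser, chunk):
    """Feed one chunk and return the (ticker, price) pairs completed by it."""
    parser.feed(chunk)
    # flush() only exists on newer Python releases; call it when available
    # so that already-fed data is parsed immediately.
    flush = getattr(parser, "flush", None)
    if flush is not None:
        flush()
    prices = []
    for _event, elem in parser.read_events():
        if elem.tag == "stockPrice":
            prices.append((elem.findtext("ticker"), elem.findtext("price")))
    return prices


parser = ET.XMLPullParser(events=("end",))

# The stream opens with the container element...
feed_and_collect(parser, "<stockPrices>")

# ...and prices trickle in for the rest of the day.
seen = feed_and_collect(
    parser,
    "<stockPrice><ticker>MSFT</ticker><price>1000</price></stockPrice>",
)

# Ending the input now fails, because </stockPrices> never arrived.
try:
    parser.close()
    document_complete = True
except ET.ParseError:
    document_complete = False
```

Note that this parser checks well-formedness only; it does not validate
against a DTD, so it sidesteps rather than settles the validity problem
described above.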

My contribution to the discussion, to which I *did* give much thought,
was to argue that it is not very good programming practice anyway to
hold a stream open for 8 hours. Instead we should remove the containing
'stockPrices' document and send lots of 'stockPrice' documents
throughout the day. This has many advantages: it stays consistent with
current XML approaches, it lets us send multiple 'types' of data along
one wire, and it lets us send a DTD with each document, or even an
abbreviated DTD if required. In short, my disagreement was with trying
to map 'the stream' to 'the document' rather than to 'a carrier of many
documents', and I argued that we already have everything we need in
XML 1.0 to implement very powerful stream processing.
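
As a rough sketch of the 'carrier of many documents' idea, again in
Python: each item on the wire is a small, complete document that is
parsed on its own, and the root element name tells the receiver which
'type' of data it has. The one-document-per-line framing, the function
name handle_documents, the 'headline' element, and the sample headline
text are all assumptions made for illustration; how documents are
delimited on the wire is left open here.

```python
import xml.etree.ElementTree as ET


def handle_documents(lines):
    """Parse each non-empty line as its own complete XML document."""
    results = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Raises ET.ParseError if this one document is not well-formed;
        # documents before and after it are unaffected.
        root = ET.fromstring(line)
        if root.tag == "stockPrice":
            results.append(
                ("price", root.findtext("ticker"), root.findtext("price"))
            )
        elif root.tag == "headline":
            results.append(("headline", root.text))
        else:
            results.append(("unknown", root.tag))
    return results


# Hypothetical wire contents: two document types sharing one stream.
wire = [
    "<stockPrice><ticker>MSFT</ticker><price>1000</price></stockPrice>",
    "<headline>Markets open higher</headline>",
]
results = handle_documents(wire)
```

Because every document is complete when it is handed to the parser, a
validating parser could check each one against its DTD before acting on
it. The standard-library parser used here does not validate, so that
step is not shown.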

Regards,

Mark




