Fw: Trees versus event streams

F. Chahuneau - General Manager fcha at Berger-Levrault.fr
Sat Mar 1 11:10:21 GMT 1997


[Peter Murray-Rust (Peter at ursus.demon.co.uk), Thu, 27 Feb 1997]

> My current problem may highlight this. A CML document is highly
> tree-structured and contains no mixed content, so that eventStreams don't
> contribute much. BUT it also includes chunks of HTML where a tree
> structure is quite inappropriate. If I take a Lark-based approach (or my
> own parser) the HTML gets rendered into a tree. I am now hacking this
> back into an event stream to render the hypertext. Not only does it
> take more effort, but I'm sure that holding HTML as a tree has a
> memory hit. Ideally when I'm parsing CML, and come to the
> tag <XHTML> (sic) which contains <BODY>, I'd like to tell the parser
> 'stop parsing as a tree and just hold a hypertext string until </XHTML>

Peter,

This kind of consideration is precisely what led us to define a *dual* 
programming paradigm when designing the Balise SGML processing language 
(http://www.balise.com).

Being able to switch back and forth between these two useful and
complementary abstractions of an SGML document (a "tree of typed nodes
with attributes" vs. an "ESIS or ESIS+ event stream") is, in our
experience, often required when you have to express complex processing
tasks on SGML documents but still want to keep your code as concise as
possible.
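
To make the switch between the two abstractions concrete, here is a
minimal sketch in Python. It is not Balise code and does not reflect the
Balise API; the names DualHandler, parse_dual and tree_tags are
hypothetical, chosen only for this illustration. It assumes a SAX-style
parser as the event source and builds a subtree only for the element
names listed in tree_tags, reporting everything else as plain events.

```python
import xml.sax
from xml.etree.ElementTree import TreeBuilder


class DualHandler(xml.sax.ContentHandler):
    """Record events normally, but build a subtree for chosen elements."""

    def __init__(self, tree_tags):
        super().__init__()
        self.tree_tags = set(tree_tags)  # element names handled in tree mode
        self.builder = None              # active TreeBuilder while in tree mode
        self.depth = 0                   # nesting depth inside the current subtree
        self.events = []                 # events seen in event-stream mode
        self.subtrees = []               # completed subtrees (Element objects)

    def startElement(self, name, attrs):
        if self.builder is None and name in self.tree_tags:
            self.builder = TreeBuilder()  # switch to tree mode
        if self.builder is not None:
            self.builder.start(name, dict(attrs))
            self.depth += 1
        else:
            self.events.append(("start", name))

    def characters(self, content):
        if self.builder is not None:
            self.builder.data(content)
        elif content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        if self.builder is not None:
            self.builder.end(name)
            self.depth -= 1
            if self.depth == 0:           # subtree complete: back to event mode
                self.subtrees.append(self.builder.close())
                self.builder = None
        else:
            self.events.append(("end", name))


def parse_dual(xml_text, tree_tags):
    """Parse xml_text, returning (events, subtrees)."""
    handler = DualHandler(tree_tags)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events, handler.subtrees
```

With tree_tags set to {"atom"}, for instance, each atom element of a
CML-like document would come back as a small tree, while an embedded
XHTML chunk would be seen only as start, text and end events.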

Neither paradigm is inherently better than the other: it all depends on
what you want to express. If you want your code to remain legible and
maintainable (i.e., related in a straightforward way to the processing
idea it expresses), then you really need both in some cases. If you are
interested, this idea is further developed in the following paper:
"Event Driven or Tree Manipulation Approaches to SGML Transformation",
presented at SGML'96 and available at
"http://www.balise.com/current/articles/lecluse.htm".

> We *could* do this with a PI, but would have to all agree.

Doing this with a PI does not seem to be the best idea, since it leaves
the application programmer no choice as to which mode she wants to use
for what, while the best choice may depend entirely on what she wants to
do. What is required is the ability to switch between tree and
event-stream mode on any GI event. For maximum generality, you also need
to be able to generate an event stream during (sub-)tree traversal, not
necessarily of the *original* tree, but possibly of one which you have
modified or created through your application.
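
Generating an event stream from a tree traversal can be sketched as
follows, again in Python rather than Balise. The function name
tree_to_events and the event tuples are assumptions made for this
illustration; they only approximate ESIS-style start, data and end
events. The tree is modified before being streamed, to show that the
events need not correspond to the document as originally parsed.

```python
import xml.etree.ElementTree as ET


def tree_to_events(elem):
    """Yield ESIS-like events for elem and its descendants, in document order."""
    yield ("start", elem.tag, dict(elem.attrib))
    if elem.text:
        yield ("data", elem.text)
    for child in elem:
        yield from tree_to_events(child)
        if child.tail:                     # text following the child's end tag
            yield ("data", child.tail)
    yield ("end", elem.tag)


root = ET.fromstring("<p>Some <b>bold</b> text</p>")
root.find("b").tag = "strong"              # modify the tree before streaming it
events = list(tree_to_events(root))
```

The same generator can be called on any subtree, so an application that
works mostly in tree mode can still hand a chosen fragment to code
written against an event stream.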

In the world of traditional SGML applications, sheer document size is
frequently an issue, so that tree mode must often be used sparingly.
In the case of XML or HTML fragments, this problem is probably
negligible. The rationale for maintaining an "event stream" paradigm in
an XML API is, therefore, not to save memory, but simply that it might
be the most appropriate in some cases.

 
        
_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
       _/ François CHAHUNEAU                 phone: [+33] 1 40 64 43 00  _/
      _/ Directeur Général/General Manager                              _/
     _/ AIS S.A.                             FAX: [+33] 1 40 64 43 10  _/
    _/ 15-17 rue Rémy Dumoncel    email: fcha at ais.berger-levrault.fr  _/
   _/ 75014, Paris, FRANCE        WWW: http://www.berger-levrault.fr _/
  _/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




