XML API specification

Peter Murray-Rust Peter at ursus.demon.co.uk
Thu Feb 27 15:06:17 GMT 1997


In message <199702271419.JAA24115 at nathaniel.ebt> gtn at ebt.com (Gavin Nicol) writes:
> I think that for the *parser*, we should define an event-handling
> interface, as it is much simpler to build certain applications
> that way, and because you can build a tree from a stream of
> events if you need to.

In CoST, Joe English supported both eventStreams and trees (I'm sure Joe will
have some wisdom on this one).  I started off using the event mechanism
and switched to a tree-based one, but I suspect that this was due to the
nature of the application.
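
To make Gavin's point concrete, here is a minimal sketch of building a tree
from a stream of events.  It assumes Python's standard expat binding as the
event source; the Node class, the build_tree function and the element names
in the usage note are hypothetical, chosen only for illustration.

```python
import xml.parsers.expat


class Node:
    """One element in the tree (hypothetical structure, for illustration)."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = attrs
        self.children = []  # child Node objects and text strings, in order


def build_tree(xml_text):
    """Build a Node tree purely from start/end/character-data events."""
    roots = []   # receives the single document element
    stack = []   # elements that are currently open

    def start(name, attrs):
        node = Node(name, attrs)
        # Attach to the innermost open element, or record it as the root.
        (stack[-1].children if stack else roots).append(node)
        stack.append(node)

    def end(name):
        stack.pop()

    def chars(data):
        # Skip whitespace-only runs; keep everything else as a text child.
        if stack and data.strip():
            stack[-1].children.append(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_text, True)
    return roots[0]
```

Calling build_tree('<molecule id="m1"><atom>C</atom><atom>O</atom></molecule>')
returns a Node named "molecule" with two "atom" children.  An application that
only needs the events can skip the stack altogether, which is why the event
interface is the more basic of the two.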

My current problem may highlight this.  A CML document is highly
tree-structured and contains no mixed content, so that eventStreams don't
contribute much.  BUT it also includes chunks of HTML, where a tree structure
is quite inappropriate.  If I take a Lark-based approach (or use my own
parser) the HTML gets turned into a tree.  I am now hacking this
back into an event stream to render the hypertext.  Not only does it
take more effort, but I'm sure that holding HTML as a tree has a
memory hit.  Ideally, when I'm parsing CML and come to the
tag <XHTML> (sic), which contains <BODY>, I'd like to tell the parser
'stop parsing as a tree and just hold a hypertext string until </XHTML>'.
We *could* do this with a PI, but we would all have to agree on it.
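
A rough sketch of the behaviour I am describing, under several assumptions:
it uses Python's standard expat binding, the input is UTF-8 bytes, and the
function name parse_hybrid, the (name, attrs, children) tuple layout and the
MOL element are all hypothetical.  It builds a tree as usual, but when it
meets the nominated tag it stops building nodes and keeps the source text of
that element as a single string.

```python
import xml.parsers.expat


def parse_hybrid(xml_bytes, raw_tag="XHTML"):
    """Return a (name, attrs, children) tree in which every raw_tag
    element is held as one source string instead of as nodes."""
    parser = xml.parsers.expat.ParserCreate()
    roots = []                        # receives the document element
    stack = []                        # tree nodes that are currently open
    raw = {"depth": 0, "start": 0}    # state while inside a raw region

    def start(name, attrs):
        if raw["depth"]:
            raw["depth"] += 1         # nested element inside the raw region
            return
        if name == raw_tag:
            raw["depth"] = 1
            # Byte offset of the '<' that opens the raw element.
            raw["start"] = parser.CurrentByteIndex
            return
        node = (name, attrs, [])
        (stack[-1][2] if stack else roots).append(node)
        stack.append(node)

    def end(name):
        if raw["depth"]:
            raw["depth"] -= 1
            if raw["depth"] == 0:
                # CurrentByteIndex points at the end tag; include it.
                close = xml_bytes.index(b">", parser.CurrentByteIndex) + 1
                text = xml_bytes[raw["start"]:close].decode("utf-8")
                (stack[-1][2] if stack else roots).append(text)
            return
        stack.pop()

    def chars(data):
        # Text inside the raw region is already covered by the slice.
        if not raw["depth"] and stack and data.strip():
            stack[-1][2].append(data)

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.CharacterDataHandler = chars
    parser.Parse(xml_bytes, True)
    return roots[0]
```

With the input b'<CML><MOL id="m1"/><XHTML><BODY>See <B>this</B>
note</BODY></XHTML></CML>' (on one line), the CML and MOL elements come back
as nodes while the whole <XHTML>...</XHTML> element comes back as one string.
Note that the parser still checks the held region for well-formedness; it
just does not allocate nodes for it.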

> 
> Some questions that will affect the API are whether one sees empty
> elements as elements containing nothing, or as elements unable to
Yes.

> contain anything, and whether entity/attribute type information needs
> to be passed across the API.

I have been convinced that entity information needs to be preserved, and I
assume there are people who are concerned about attribute_type.  If
nothing else, the attribute type is probably critical for ID/IDREF.
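
To illustrate why the declared type matters for ID/IDREF, here is a small
sketch.  It assumes Python's standard expat binding, which reports ATTLIST
declarations from the internal DTD subset through its AttlistDeclHandler
callback; the function name dangling_idrefs and the atom/bond element names
in the test document are hypothetical.  Without the type information crossing
the API, the application could not tell which attribute values are
identifiers and which are references.

```python
import xml.parsers.expat


def dangling_idrefs(xml_bytes):
    """Return the IDREF values that match no ID in the document,
    using only attribute types declared in the internal DTD subset."""
    attr_types = {}   # (element name, attribute name) -> declared type
    ids = set()
    refs = []

    def attlist_decl(elname, attname, type_, default, required):
        # Called once per attribute declared in an ATTLIST.
        attr_types[(elname, attname)] = type_

    def start(name, attrs):
        for attname, value in attrs.items():
            declared = attr_types.get((name, attname))
            if declared == "ID":
                ids.add(value)
            elif declared == "IDREF":
                refs.append(value)

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = attlist_decl
    parser.StartElementHandler = start
    parser.Parse(xml_bytes, True)
    # Checked only after the whole document has been read, so that
    # forward references are resolved too.
    return [r for r in refs if r not in ids]
```

For a document that declares atom/id as ID and bond/from as IDREF, and that
contains <atom id="a1"/>, <bond from="a1"/> and <bond from="a9"/>, the
function returns ['a9'].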
> 
> What do people think? How much information must the parser pass 
> along?

At least what comes out of sgmls/ESIS, probably with general entities
added.  We also need to know the DOCTYPE info.
[...]

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



