XML API specification
Peter Murray-Rust
Peter at ursus.demon.co.uk
Thu Feb 27 15:06:17 GMT 1997
In message <199702271419.JAA24115 at nathaniel.ebt> gtn at ebt.com (Gavin Nicol) writes:
> I think that for the *parser*, we should define an event-handling
> interface, as it is much simpler to build certain applications
> that way, and because you can build a tree from a stream of
> events if you need to.
In CoST Joe English supported both eventStreams and trees (I'm sure Joe will
have some wisdom on this one). I started off using the event mechanism
and switched to a tree-based one but I suspect that this was the nature of the
application.
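
As an illustration of the point quoted above, that a tree can be built from a stream of events, here is a minimal sketch using Python's standard xml.sax module. It is not CoST's or Lark's interface; the names Node, TreeBuilder and build_tree are hypothetical, chosen only for this example.

```python
import xml.sax


class Node:
    """One element in the tree (hypothetical structure for this sketch)."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs.items())
        self.children = []  # child Nodes and text strings, in document order


class TreeBuilder(xml.sax.ContentHandler):
    """Turns start/end/character events into a tree of Node objects."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []  # currently open elements, innermost last

    def startElement(self, name, attrs):
        node = Node(name, attrs)
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def endElement(self, name):
        self.stack.pop()

    def characters(self, content):
        # Whitespace-only text is dropped, which suits element-only content.
        if self.stack and content.strip():
            self.stack[-1].children.append(content)


def build_tree(xml_bytes):
    handler = TreeBuilder()
    xml.sax.parseString(xml_bytes, handler)
    return handler.root
```

An application that only needs the events can implement the handler directly and never pay for the tree; one that needs the tree adds a builder like this on top.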
My current problem may highlight this. A CML document is highly
tree-structured and contains no mixed content, so that eventStreams don't
contribute much. BUT it also includes chunks of HTML where a tree structure
is quite inappropriate. If I take a Lark-based approach (or my own
parser) the HTML gets rendered into a tree. I am now hacking this
back into an event stream to render the hypertext. Not only does it
take more effort, but I'm sure that holding HTML as a tree has a
memory hit. Ideally, when I'm parsing CML and come to the
tag <XHTML> (sic), which contains <BODY>, I'd like to tell the parser
'stop parsing as a tree and just hold a hypertext string until </XHTML>'.
We *could* do this with a PI, but we would all have to agree on it.
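
The mixed behaviour described above can be approximated on top of an event interface. The following is a rough sketch only, assuming Python's standard xml.sax module and well-formed hypertext: the handler builds a tree until it meets the chosen element, then keeps that element's content as a single re-serialized string until the matching end tag. HybridHandler, parse_hybrid and the raw_element parameter are hypothetical names, and the parser underneath still tokenizes the hypertext; only the tree is avoided.

```python
import xml.sax
from xml.sax.saxutils import escape, quoteattr


class HybridHandler(xml.sax.ContentHandler):
    """Builds a tree, but holds the content of one element as a string."""

    def __init__(self, raw_element="XHTML"):
        super().__init__()
        self.raw_element = raw_element
        self.root = None
        self.stack = []   # open tree nodes (dicts), innermost last
        self.raw = None   # list of string pieces while capturing, else None
        self.depth = 0    # nesting depth inside the captured element

    def startElement(self, name, attrs):
        if self.raw is not None:
            # Inside the captured element: re-serialize instead of building nodes.
            attr_text = "".join(f" {k}={quoteattr(v)}" for k, v in attrs.items())
            self.raw.append(f"<{name}{attr_text}>")
            self.depth += 1
            return
        node = {"name": name, "attrs": dict(attrs.items()), "children": []}
        if self.stack:
            self.stack[-1]["children"].append(node)
        else:
            self.root = node
        self.stack.append(node)
        if name == self.raw_element:
            self.raw = []
            self.depth = 0

    def endElement(self, name):
        if self.raw is not None and self.depth > 0:
            self.raw.append(f"</{name}>")
            self.depth -= 1
            return
        node = self.stack.pop()
        if self.raw is not None:
            # End tag of the captured element: store its content as one string.
            node["children"] = ["".join(self.raw)]
            self.raw = None

    def characters(self, content):
        if self.raw is not None:
            self.raw.append(escape(content))
        elif self.stack and content.strip():
            self.stack[-1]["children"].append(content)


def parse_hybrid(xml_bytes, raw_element="XHTML"):
    handler = HybridHandler(raw_element)
    xml.sax.parseString(xml_bytes, handler)
    return handler.root
```

Here the switch is triggered by the element name, which the application has to know in advance; a PI or some other agreed signal would be needed for the parser to make the switch on its own.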
>
> Some questions that will affect the API are whether one sees empty
> elements as elements containing nothing, or as elements unable to
Yes.
> contain anything, and whether entity/attribute type information needs
> to be passed across the API.
I have been convinced that entity information needs to be preserved and I
assume there are people who are concerned about attribute_type. If
nothing else, this is probably critical for ID/IDREF.
>
> What do people think? How much information must the parser pass
> along?
At least what comes out of sgmls/ESIS, probably with general entities
added. We also need to know the DOCTYPE info.
[...]
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)