YAXPAPI (Yet Another XML Parser API) - an XDEV proposal

Peter Murray-Rust peter at ursus.demon.co.uk
Sun Dec 14 07:34:39 GMT 1997

At 16:07 13/12/97 -0800, Tim Bray wrote:
>At 12:03 AM 14/12/97, Peter Murray-Rust wrote:
>>I am listing the main calls from Lark and AElfred that I find useful. As
>>you can see there is a great similarity - I confess that I find the AElfred
>>ones slightly easier to understand.
>OK, let's get concrete.  I think that the AElfred callbacks each having
>an XMLParser argument is a good idea.  Also AElfred's names are better,
>the "Do*" prefix in Lark is silly.  So on the event-stream stuff, I'd
>go with the AElfred model modulo the following changes:

This seems eminently reasonable - if DavidM is listening I suggest we can
get this sorted very quickly.

>>  attribute(XmlParser, String, String, boolean) 
>It seems completely wrong to have an attribute event separate from
>start-element events.  To start with, it suggests that the order of 
>attributes is significant, which is incorrect.  Secondly, since much
>element-specific processing depends on what attributes are there, it is 
>less convenient for the application programmer.  Third, if the processor
>(as it must) does defaulting, he's going to have to do some attribute
>list wrangling anyhow, so it can't really be extra work.  

I cut the documentation out to save space on the list. The boolean is
isSpecified (although this doesn't match the documentation for the
Parameters).

>What's the boolean?  I don't think the application author should
>have to deal with anything but the name and value of attributes.
>Anyhow, I'd go with 
>startElement(XmlParser processor, String type, Attribute[] attributes);

So would I.
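
A minimal sketch of what an application might look like against that
startElement signature. Only the signature itself comes from the thread;
the fields of Attribute, the empty XmlParser stand-in, and the
ElementHandler and LinkCollector names are assumptions made up for
illustration. It shows the application looking attributes up by name,
so their order never matters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical attribute type: just a name and a value, as Tim suggests.
final class Attribute {
    final String name;
    final String value;
    Attribute(String name, String value) { this.name = name; this.value = value; }
}

// Stand-in for the parser object that is passed to every callback.
interface XmlParser { }

// Hypothetical handler interface carrying only the proposed callback.
interface ElementHandler {
    void startElement(XmlParser processor, String type, Attribute[] attributes);
}

// Collects the href of every <a> element, wherever the attribute appears.
final class LinkCollector implements ElementHandler {
    final List<String> hrefs = new ArrayList<>();

    @Override
    public void startElement(XmlParser processor, String type, Attribute[] attributes) {
        if (!type.equals("a")) return;
        String href = find(attributes, "href");
        if (href != null) hrefs.add(href);
    }

    // Look up by name, since attribute order carries no meaning.
    static String find(Attribute[] attributes, String name) {
        for (Attribute a : attributes) {
            if (a.name.equals(name)) return a.value;
        }
        return null;
    }
}

final class StartElementDemo {
    public static void main(String[] args) {
        LinkCollector handler = new LinkCollector();
        XmlParser parser = new XmlParser() { };
        handler.startElement(parser, "a", new Attribute[] {
            new Attribute("title", "home"), new Attribute("href", "index.html") });
        handler.startElement(parser, "p", new Attribute[0]);
        System.out.println(handler.hrefs);
    }
}
```

With all attributes delivered in one call, the element-specific decision
(here, "is there an href?") is made in one place rather than being
pieced together from separate attribute() events.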

>and lose the attribute() method.
>>  data(XmlParser, String) 
>I feel that the 2nd argument should not be a String.  It is a recipe
>for disastrous inefficiency if the processor has to cook up a 
>java.lang.String object for every little chunk of text.  Lark uses two
>arguments, a char[] array and a character count; the app can
>make a String if it needs to.  If you find this awkward, create
>a new data type called Text so that if you need a String you
>can make it with lazy-evaluation in Text.toString(), but if you
>don't need it you don't build it.

Seems reasonable.
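
A rough sketch of the lazily evaluated Text type Tim describes, under
some assumptions: the offset field, the isWhitespace() helper and the
caching of the String are additions for illustration, not anything
Lark or AElfred is known to do. The String is built only if
toString() is actually called.

```java
// Hypothetical Text type: wraps the parser's char[] buffer and builds
// a java.lang.String only on demand.
final class Text {
    private final char[] buffer;
    private final int offset;   // assumption: chunk may start mid-buffer
    private final int length;
    private String cached;      // built at most once, on first toString()

    Text(char[] buffer, int offset, int length) {
        this.buffer = buffer;
        this.offset = offset;
        this.length = length;
    }

    int length() { return length; }

    char charAt(int index) { return buffer[offset + index]; }

    // Example of work that needs no String at all.
    boolean isWhitespace() {
        for (int i = 0; i < length; i++) {
            if (!Character.isWhitespace(buffer[offset + i])) return false;
        }
        return true;
    }

    @Override
    public String toString() {
        if (cached == null) cached = new String(buffer, offset, length);
        return cached;
    }
}

final class TextDemo {
    public static void main(String[] args) {
        char[] buf = "  hello ".toCharArray();
        Text padding = new Text(buf, 0, 2);
        Text word = new Text(buf, 2, 5);
        System.out.println(padding.isWhitespace()); // no String allocated here
        System.out.println(word);                   // String built on first use
    }
}
```

One caveat with this design: the object is only valid while the parser
leaves the underlying buffer alone, so an application that wants to
keep the text past the callback would need to call toString() (or copy
the characters) before returning.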

>Also, it shouldn't be named "data" - it should be named
>characterData or charData or text or some such term that can
>be mapped directly to the spec.
>>  resolveEntity(XmlParser, String, String, URL) 
>I don't think entities have any place in the first cut of this 
>interface.  The processor exists to make these problems go away.

Lark has entities:
public boolean doSystemTextEntity(Entity e, String name, String extID)
and two others...

>Lark has a thing where if any callback returns 'true', the
>parser drops out of its loop... which is awfully useful and easy
>I think.  Lark will also re-enter, but this need not be a requirement.
>Also, for application programmers, especially dealing with smallish
>objects, a tree interface is very natural.  I've written both
>event-stream and tree apps using Lark, and the trees are a lot
>easier to use for anything even moderately complex.  So the API 
>should have Element, Attribute, and Text classes. 

I won't quarrel with this. I would be very happy for a tree interface,
because JUMBO is based on trees. However I didn't want to subclass Lark's
trees if we decided on a different one, because unlike an event stream,
that could take a major rewrite of JUMBO. IFF we can standardise now, I'll
be very happy.
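
To make the two ideas quoted above concrete, here is a small sketch
that combines them: callbacks that return true to make the parser drop
out of its loop, and a tree built from the event stream. Everything in
it is hypothetical. The Handler interface, the TitleGrabber and
EventReplay classes, and the cut-down Element (no Attribute or Text
classes, character data kept in a StringBuilder) are illustration only,
and the "parser" merely replays canned events.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical, cut-down tree node.
final class Element {
    final String type;
    final List<Element> children = new ArrayList<>();
    final StringBuilder text = new StringBuilder(); // character data directly inside
    Element(String type) { this.type = type; }
}

// Callbacks return true to ask the driver to stop.
interface Handler {
    boolean startElement(String type);
    boolean charData(char[] buffer, int length);
    boolean endElement(String type);
}

// Builds a tree from the events and stops once the first <title> is complete.
final class TitleGrabber implements Handler {
    private final Deque<Element> open = new ArrayDeque<>();
    Element root;
    Element title;

    public boolean startElement(String type) {
        Element e = new Element(type);
        if (open.isEmpty()) root = e; else open.peek().children.add(e);
        open.push(e);
        return false;
    }

    public boolean charData(char[] buffer, int length) {
        open.peek().text.append(buffer, 0, length);
        return false;
    }

    public boolean endElement(String type) {
        Element e = open.pop();
        if (type.equals("title")) { title = e; return true; }
        return false;
    }
}

// Stand-in for a parser: replays canned events and honours the stop request.
final class EventReplay {
    // Each event is {"start", type}, {"text", data} or {"end", type}.
    // Returns how many events were delivered before stopping.
    static int run(String[][] events, Handler h) {
        int delivered = 0;
        for (String[] ev : events) {
            delivered++;
            boolean stop;
            if (ev[0].equals("start")) {
                stop = h.startElement(ev[1]);
            } else if (ev[0].equals("text")) {
                char[] chars = ev[1].toCharArray();
                stop = h.charData(chars, chars.length);
            } else {
                stop = h.endElement(ev[1]);
            }
            if (stop) break;
        }
        return delivered;
    }
}
```

An application that only wants the title never pays for the rest of
the document, while the part of the tree that was built is still an
ordinary tree it can walk.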

>And it shouldn't (sorry Peter) be called YAXPAPI - how about SAX, Simple

Of course it shouldn't - I would second the use of Simple somewhere in it.

>API for XML?  Maybe SAX-J for the Java bindings. -Tim
Sounds great. Let's make sure we get 100% of the way this time.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
