YAXPAPI (Yet Another XML Parser API) - an XDEV proposal
Tim Bray
tbray at textuality.com
Sat Dec 13 17:58:18 GMT 1997
At 03:19 PM 13/12/97, Peter Murray-Rust wrote:
I agree with Peter that we should just buckle down and get on with what used
to be known as XAPI.
But my approach would be quite different. I think that the first step
should be the end-user's API, the kind of thing that someone using a SMIL
or RDF processor would need. Such a person really doesn't want to wrestle
with entities and references and PIs and marked sections; all they want
is elements and attributes and the basic doctype info; they want the
processor to deal with entities and refs and quote marks and white space in
markup and encodings and so on.
This would go a long way to address the whinings of the RDF & SMIL type
people, who thought XML just meant elements and attributes. I think that
from their point of view, it should be; all the other stuff in the syntax
is strictly to support authoring and management convenience.
It should come in event-stream flavor and tree flavor.
Minimal event stream API:
1. Doctype, returns: root type, external subset system/public idents
2. Element start, returns: type, attribute name-value pairs, whether it's empty
3. Text
4. End Element, returns: type
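For concreteness, here is a rough sketch of how the four events above might be written down as a Java callback interface. Every name in it (MinimalHandler, EventLog, EventDemo, the method names and the sample file name) is made up for illustration and is not taken from Lark or any other processor, and the hand-written event sequence merely stands in for a real parser. It uses present-day Java collections rather than anything available in 1997.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical callback interface: one method per event in the minimal list.
interface MinimalHandler {
    void doctype(String rootType, String systemId, String publicId);
    void startElement(String type, Map<String, String> attributes, boolean empty);
    void text(String chunk);
    void endElement(String type);
}

// An application-side handler that simply records each event as a string.
class EventLog implements MinimalHandler {
    final List<String> events = new ArrayList<>();

    public void doctype(String rootType, String systemId, String publicId) {
        events.add("doctype " + rootType + " system=" + systemId + " public=" + publicId);
    }

    public void startElement(String type, Map<String, String> attributes, boolean empty) {
        events.add("start " + type + " " + attributes + " empty=" + empty);
    }

    public void text(String chunk) {
        events.add("text " + chunk);
    }

    public void endElement(String type) {
        events.add("end " + type);
    }
}

class EventDemo {
    // Stands in for a processor: fires the events a parser might report
    // for <greeting lang="en">hello</greeting> with an external subset.
    static EventLog run() {
        EventLog log = new EventLog();
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("lang", "en");
        log.doctype("greeting", "greeting.dtd", null);
        log.startElement("greeting", attrs, false);
        log.text("hello");
        log.endElement("greeting");
        return log;
    }

    public static void main(String[] args) {
        for (String event : run().events) {
            System.out.println(event);
        }
    }
}
```

The application never sees entities, references, PIs or marked sections here; whether an empty element also fires an End Element event is left open, as it is in the list above.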
Minimal tree API:
1. Document, with methods: root type, system ID, public ID, root element
2. Element, with methods: parent, children, attributeValueByName, allAttributes
3. Attribute, with methods: name, value
4. Text (presumably hiding lazy evaluation)
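A minimal sketch of the tree flavor, again in Java and again under assumptions: the class and method names follow the list above, but the constructors, the setAttribute and add helpers, and the TreeDemo class are invented here purely so the example can build a small tree by hand. Children are held as plain Objects in a List, and the application tells Text from Element with instanceof.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class Text {
    private final String value;
    Text(String value) { this.value = value; }
    String value() { return value; }
}

class Attribute {
    private final String name;
    private final String value;
    Attribute(String name, String value) { this.name = name; this.value = value; }
    String name() { return name; }
    String value() { return value; }
}

class Element {
    private final String type;
    private Element parent;
    private final List<Object> children = new ArrayList<>();
    private final Map<String, Attribute> attributes = new LinkedHashMap<>();

    Element(String type) { this.type = type; }

    String type() { return type; }
    Element parent() { return parent; }
    List<Object> children() { return children; }
    List<Attribute> allAttributes() { return new ArrayList<>(attributes.values()); }

    String attributeValueByName(String name) {
        Attribute a = attributes.get(name);
        return a == null ? null : a.value();
    }

    // Builder helpers (assumptions, not part of the proposed API).
    Element setAttribute(String name, String value) {
        attributes.put(name, new Attribute(name, value));
        return this;
    }

    Element add(Object child) {
        if (child instanceof Element) {
            ((Element) child).parent = this;
        }
        children.add(child);
        return this;
    }
}

class Document {
    private final String rootType, systemId, publicId;
    private final Element rootElement;

    Document(String rootType, String systemId, String publicId, Element rootElement) {
        this.rootType = rootType;
        this.systemId = systemId;
        this.publicId = publicId;
        this.rootElement = rootElement;
    }

    String rootType() { return rootType; }
    String systemId() { return systemId; }
    String publicId() { return publicId; }
    Element rootElement() { return rootElement; }
}

class TreeDemo {
    // Collects all character data under an element, using instanceof
    // to distinguish Text children from Element children.
    static String textOf(Element e) {
        StringBuilder sb = new StringBuilder();
        for (Object child : e.children()) {
            if (child instanceof Text) {
                sb.append(((Text) child).value());
            } else if (child instanceof Element) {
                sb.append(textOf((Element) child));
            }
        }
        return sb.toString();
    }

    // Hand-built tree for <p class="intro">Hello <em>world</em></p>.
    static Document sample() {
        Element em = new Element("em").add(new Text("world"));
        Element p = new Element("p").setAttribute("class", "intro")
                .add(new Text("Hello ")).add(em);
        return new Document("p", "p.dtd", null, p);
    }

    public static void main(String[] args) {
        Document doc = sample();
        System.out.println(doc.rootType() + ": " + textOf(doc.rootElement()));
    }
}
```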
I acknowledge this is grossly insufficient for basing an editor on. You want
that, use the DOM. Only a few choices have design implications:
1. How children are returned. The possibilities are to have Element and
Text crammed into the same class with a method for asking which is which,
or to have separate Text and Element classes, so that children returns an
Object array or a Vector and you can find out what kind of child each
member is using the instanceof operator. I favor the latter; Lark does this.
2. Whether it's worthwhile putting children into a special ChildList class
with enumerator and indexing, as opposed to a native array or Vector, so
you can hide lazy evaluation behind it. I favor the ChildList class; the
DOM does this but Lark doesn't.
3. Whether the processor should be required to coalesce adjacent Text
objects. Suppose you have <a>foo <!--comment--> bar &ref; <?pi?>baz</a>;
it's immensely less work if the processor can give this to the app
as 4 Text chunks. I think most of the processors do this now.
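To make choices 2 and 3 a little more tangible, here is one possible shape for a ChildList that hides lazy evaluation and hands back uncoalesced Text chunks. It is only a sketch: the Supplier-based loader, the isLoaded method, the ChildListDemo class, and the four chunk strings are all assumptions invented for the example (the chunk boundaries in particular are made up and do not claim to show where a real processor would split the sample above).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

class Text {
    private final String value;
    Text(String value) { this.value = value; }
    String value() { return value; }
}

// Hypothetical ChildList: offers indexing and enumeration, and defers
// building the children until the application first asks for them.
class ChildList implements Iterable<Object> {
    private Supplier<List<Object>> loader;
    private List<Object> loaded;

    ChildList(Supplier<List<Object>> loader) { this.loader = loader; }

    private List<Object> items() {
        if (loaded == null) {
            loaded = loader.get(); // the lazy step happens here, once
            loader = null;
        }
        return loaded;
    }

    boolean isLoaded() { return loaded != null; }
    int size() { return items().size(); }
    Object get(int index) { return items().get(index); }

    @Override
    public Iterator<Object> iterator() { return items().iterator(); }
}

class ChildListDemo {
    // Stands in for a processor that reports four separate Text chunks
    // instead of coalescing them into one.
    static ChildList sample() {
        return new ChildList(() -> {
            List<Object> out = new ArrayList<>();
            for (String s : new String[] {"foo ", " bar ", "REF ", "baz"}) {
                out.add(new Text(s));
            }
            return out;
        });
    }

    // An application that wants one string can do the coalescing itself.
    static String joinText(ChildList children) {
        StringBuilder sb = new StringBuilder();
        for (Object child : children) {
            if (child instanceof Text) {
                sb.append(((Text) child).value());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ChildList kids = sample();
        System.out.println(kids.size() + " chunks: " + joinText(kids));
    }
}
```

The point of the wrapper is that the application's code is the same whether the children were built eagerly or on first access, and joining adjacent chunks is a few lines on the application side for anyone who needs it.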
If I formalized and published this, it would look a lot like part of
Lark's interface, but I bet all the other parsers could implement it.
Should I? -Tim
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)