SAX RFD: ModSAX Predefined Features

David Brownell db at eng.sun.com
Tue Mar 9 06:57:28 GMT 1999


Again, I think that unifying these under the generic get/set
API (with Boolean.TRUE and Boolean.FALSE objects as values
for features that are really boolean) could be useful.
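
To make the unified get/set idea concrete, here's a minimal
sketch in Java.  The class name "FeatureTable" and its set/get/
isEnabled methods are made up for illustration and are not part
of any SAX draft; only the feature URI comes from the proposal.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for parser settings; not part of any SAX draft.
final class FeatureTable {
    private final Map<String, Object> values = new HashMap<>();

    // One setter for everything: boolean features pass Boolean.TRUE or
    // Boolean.FALSE, richer settings pass whatever object they need.
    void set(String uri, Object value) {
        if (uri == null || value == null) {
            throw new IllegalArgumentException("feature URI and value are required");
        }
        values.put(uri, value);
    }

    // Returns the stored value, or null if the feature was never set.
    Object get(String uri) {
        return values.get(uri);
    }

    // Convenience for features that are really boolean; unset counts as false.
    boolean isEnabled(String uri) {
        return Boolean.TRUE.equals(values.get(uri));
    }

    public static void main(String[] args) {
        FeatureTable table = new FeatureTable();
        table.set("http://xml.org/sax/features/validation", Boolean.TRUE);
        System.out.println(table.isEnabled("http://xml.org/sax/features/validation"));
    }
}
```

The point of the Object-valued setter is that a non-boolean
setting needs no second API; whether that's worth losing
compile-time type checks is a judgment call.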

Documentation for each feature should specify whether it's
changeable mid-parse ... I'd suggest "no" as the default answer!
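
One way a "no by default" rule could be enforced, sketched in
Java.  Everything here ("MidParseGuard", "allowMidParseChange",
"startParse", "endParse") is a hypothetical name, and using
normalize-text as the opted-in feature is only for illustration,
not a claim about how that feature should behave.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: settings are frozen during a parse unless a
// feature has been explicitly marked as changeable mid-parse.
final class MidParseGuard {
    private final Map<String, Object> values = new HashMap<>();
    private final Set<String> changeableMidParse = new HashSet<>();
    private boolean parsing = false;

    // Opt a feature in to mid-parse changes; the default answer is "no".
    void allowMidParseChange(String uri) {
        changeableMidParse.add(uri);
    }

    void startParse() { parsing = true; }

    void endParse() { parsing = false; }

    // Rejects the change if a parse is running and the feature is not opted in.
    void set(String uri, Object value) {
        if (parsing && !changeableMidParse.contains(uri)) {
            throw new IllegalStateException("cannot change mid-parse: " + uri);
        }
        values.put(uri, value);
    }

    Object get(String uri) {
        return values.get(uri);
    }

    public static void main(String[] args) {
        MidParseGuard guard = new MidParseGuard();
        guard.allowMidParseChange("http://xml.org/sax/features/normalize-text");
        guard.startParse();
        guard.set("http://xml.org/sax/features/normalize-text", Boolean.TRUE);
        try {
            guard.set("http://xml.org/sax/features/validation", Boolean.TRUE);
        } catch (IllegalStateException refused) {
            System.out.println(refused.getMessage());
        }
    }
}
```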

Mike Dacon commented about the "API archaeology" aspect of this
name; perhaps the "Parser2" style naming convention can avoid
losing technical context (i.e. this is still a parser, even
if it's parsing a DOM or a stream of SAX events :-).


> 1. http://xml.org/sax/features/validation

Good.  (I'm curious if folks prefer one parser, which can
have this feature toggled, vs two, where the parser comes
with at least an initial value.)


> 2A. http://xml.org/sax/features/external-general-entities
> 2B. http://xml.org/sax/features/external-parameter-entities

Right, two kinds of parsed entities, two control knobs.
Validating parsers must refuse to change these knobs.
(OK, _five_ kinds of parser -- validating, and four kinds
of nonvalidating parser!  ;-)
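
The arithmetic behind "five" can be checked with a short Java
sketch.  "EntityKnobs" and its methods are hypothetical names;
the sketch assumes a validating parser always reads both kinds
of external entity and treats re-setting a knob to true as a
harmless no-op rather than a change.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical model of the two entity knobs; not a real parser.
final class EntityKnobs {
    static final String GENERAL =
        "http://xml.org/sax/features/external-general-entities";
    static final String PARAMETER =
        "http://xml.org/sax/features/external-parameter-entities";

    private final boolean validating;
    private boolean general = true;
    private boolean parameter = true;

    EntityKnobs(boolean validating) {
        this.validating = validating;
    }

    void set(String uri, boolean value) {
        if (!GENERAL.equals(uri) && !PARAMETER.equals(uri)) {
            throw new IllegalArgumentException("unknown feature: " + uri);
        }
        // A validating parser refuses to turn either knob off.
        if (validating && !value) {
            throw new IllegalStateException("validating parser must read external entities");
        }
        if (GENERAL.equals(uri)) {
            general = value;
        } else {
            parameter = value;
        }
    }

    String describe() {
        return (validating ? "validating" : "nonvalidating")
            + " general=" + general + " parameter=" + parameter;
    }

    // Tries every combination and keeps the ones that are not refused.
    static Set<String> allKinds() {
        Set<String> kinds = new LinkedHashSet<>();
        boolean[] choices = { true, false };
        for (boolean v : choices) {
            for (boolean g : choices) {
                for (boolean p : choices) {
                    EntityKnobs knobs = new EntityKnobs(v);
                    try {
                        knobs.set(GENERAL, g);
                        knobs.set(PARAMETER, p);
                        kinds.add(knobs.describe());
                    } catch (IllegalStateException refused) {
                        // Combination not reachable for a validating parser.
                    }
                }
            }
        }
        return kinds;
    }

    public static void main(String[] args) {
        for (String kind : allKinds()) {
            System.out.println(kind);
        }
    }
}
```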


> 3. http://xml.org/sax/features/namespaces

I'd rather have this just kick in modified XML syntax rules
(e.g. entity names may never be scoped, and scoped names may
have only one interior colon).
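
A rough Java sketch of what those two syntax rules amount to.
The class and method names are hypothetical, and the checks look
only at colons; they do not validate the rest of the XML Name
production.

```java
// Hypothetical colon checks for the "modified syntax rules" idea.
final class NamespaceSyntax {

    // Entity names may never be scoped: no colon anywhere.
    static boolean isLegalEntityName(String name) {
        return !name.isEmpty() && name.indexOf(':') < 0;
    }

    // Other names may be unscoped, or scoped with exactly one colon
    // that is neither the first nor the last character.
    static boolean isLegalScopedName(String name) {
        int first = name.indexOf(':');
        if (first < 0) {
            return !name.isEmpty();
        }
        boolean interior = first > 0 && first < name.length() - 1;
        boolean onlyOne = name.indexOf(':', first + 1) < 0;
        return interior && onlyOne;
    }

    public static void main(String[] args) {
        String[] samples = { "p", "html:p", "a:b:c", ":p", "p:" };
        for (String s : samples) {
            System.out.println(s + " " + isLegalScopedName(s));
        }
    }
}
```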

With that, one can layer the rest of namespace processing
on top in any of several fashions.  A DOM can be built which
exposes namespace declarations; or a filter can munge names
and strip out the declarations.  The "munge" feature could
get its own namespace URI.
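
For the filter flavor, here's a minimal Java sketch of the two
steps: pull the declarations out of an attribute list, then
munge an element name.  All names are hypothetical, the
"{uri}local" output form is just one possible munging
convention picked for the example, and scoping across nested
elements is left out entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical "munge" filter step; the {uri}local form is an assumption.
final class NameMunger {

    // Moves xmlns and xmlns:prefix attributes into scope (prefix -> URI,
    // with "" as the default namespace) and returns the other attributes.
    static Map<String, String> stripDeclarations(Map<String, String> attrs,
                                                 Map<String, String> scope) {
        Map<String, String> rest = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : attrs.entrySet()) {
            String name = e.getKey();
            if (name.equals("xmlns")) {
                scope.put("", e.getValue());
            } else if (name.startsWith("xmlns:")) {
                scope.put(name.substring("xmlns:".length()), e.getValue());
            } else {
                rest.put(name, e.getValue());
            }
        }
        return rest;
    }

    // Rewrites prefix:local (or an unprefixed name under a default
    // namespace) as {uri}local; other unprefixed names pass through.
    static String mungeElementName(String qname, Map<String, String> scope) {
        int colon = qname.indexOf(':');
        String prefix = colon < 0 ? "" : qname.substring(0, colon);
        String local = colon < 0 ? qname : qname.substring(colon + 1);
        String uri = scope.get(prefix);
        if (uri == null) {
            if (colon < 0) {
                return qname; // no default namespace in scope
            }
            throw new IllegalArgumentException("unbound prefix: " + prefix);
        }
        return "{" + uri + "}" + local;
    }

    public static void main(String[] args) {
        Map<String, String> attrs = new LinkedHashMap<>();
        attrs.put("xmlns:h", "http://example.org/h"); // made-up URI
        attrs.put("id", "x");
        Map<String, String> scope = new LinkedHashMap<>();
        System.out.println(stripDeclarations(attrs, scope));
        System.out.println(mungeElementName("h:p", scope));
    }
}
```

A DOM builder that wants to expose the declarations would
simply skip the strip step and keep the attributes as they are.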


> 4. http://xml.org/sax/features/unbuffered-input
>   True means ensure that the parser does not buffer input from a
>   Reader or InputStream supplied by the application (actually,
>   one-character look-ahead will usually be required); false means do
>   not ensure that the parser does not buffer input.  This feature might
>   be useful for reading multiple documents from a single stream.

I'm not sure this is a common enough feature to need to be
predefined ... support for "XML Islands" within HTML may become
important, but much of this can be done (at least in Java) by
requiring pushback to be done at appropriate points.
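
The pushback approach might look something like this in Java,
using java.io.PushbackReader.  The class and method names are
hypothetical, and treating '<' as the point where one consumer
hands the stream to the next is an assumption made to keep the
example small.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch of one-character look-ahead with pushback.
final class PushbackDemo {

    // Reads text up to, but not including, the next '<'.  The '<' is
    // pushed back so the next consumer of the reader still sees it.
    static String readTextRun(PushbackReader in) throws IOException {
        StringBuilder text = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '<') {
                in.unread(c);
                break;
            }
            text.append((char) c);
        }
        return text.toString();
    }

    // Returns { leading text, everything left on the shared reader }.
    static String[] splitAtMarkup(String input) {
        try (PushbackReader in = new PushbackReader(new StringReader(input))) {
            String text = readTextRun(in);
            StringBuilder rest = new StringBuilder();
            int c;
            while ((c = in.read()) != -1) {
                rest.append((char) c);
            }
            return new String[] { text, rest.toString() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] parts = splitAtMarkup("hello <doc/>");
        System.out.println("[" + parts[0] + "] [" + parts[1] + "]");
    }
}
```

Nothing is lost between consumers as long as each one pushes
back whatever look-ahead it did not use.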


> http://xml.org/sax/features/normalize-text

This is a good filter feature, I think.


Lars suggested a "Catalog" feature.  There are different sorts of
catalog, and they need configuration, so the value of this could
be a URI for the catalog, not just a boolean.  Plus, this would
seem to be up to the "EntityResolver" to handle ... yes?  It'd
perhaps suggest that one could ask the next filter in the stream
for the resolver it was using ... :-)



Good discussion, gang!

- Dave




