[Q] How should SAX support Namespaces?

Fri Jul 24 23:25:06 BST 1998

Toby Speight scripsit:

> David> How should SAX support namespaces?  I can think of three
> David> options:
> 
> I'd like to add a fourth: define namespace support as a layer above
> SAX, which can interface with any SAX parser, and produce output
> similar to that of SAX with additional information.  Then this
> "namespace library" can do one of your proposed actions:

I absolutely agree with this.  In fact, I think that this simple
species of namespace support should be provided as part of the
SAX helper library, so that everyone has it readily available.
That way, SAX-compliant XML parsers can just do XML and leave
namespaces up to common code.

> David> 1. Simply ignore them, and require the XML application to do
> David>    the work (all of the necessary information is passed on by
> David>    SAX).

This corresponds to not using a helper class at all.

> David> 2. Use the current interface, but allow namespace-aware SAX
> David>    processors to prepend namespace URIs to element type and
> David>    attribute names, as in
> David>    startElement("urn:www.megginson.com:doc", ...)
> David>    endElement("urn:www.megginson.com:doc", ...)

This corresponds to a helper class that is itself a SAX-compliant
parser.  Just initialize it with the object representing some other
(real) parser, and it decodes namespace PIs and alters the
startElement, endElement etc. calls.

I must point out, however, that since ":" is valid in URIs
it should *not* be used as the separator between URIs and Names,
and instead some character not valid in either should be used.
Since we are in a Java/Unicode context, I propose '\u2020' DAGGER
as the separator: both Arial Sans Unicode and Bitstream Cyberbit
have glyphs for it, which assists debugging.
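
To make the naming scheme concrete, here is a minimal sketch of the
name handling such a filter might do.  The class DaggerNames and its
methods declare, expand and split are hypothetical names of mine, not
part of SAX, and the sketch assumes the filter has already collected
the prefix-to-URI declarations from the document.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: rewrites "prefix:local" as URI + DAGGER + local. */
final class DaggerNames {
    /** U+2020 DAGGER, the separator proposed above. */
    static final char SEP = '\u2020';

    // prefix -> namespace URI (assumed to be filled in by the filter
    // as it encounters namespace declarations in the document)
    private final Map<String, String> uriByPrefix = new HashMap<>();

    void declare(String prefix, String uri) {
        uriByPrefix.put(prefix, uri);
    }

    /** Expands a declared prefix; any other name is returned unchanged. */
    String expand(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) return name;
        String uri = uriByPrefix.get(name.substring(0, colon));
        if (uri == null) return name;
        return uri + SEP + name.substring(colon + 1);
    }

    /** Splits an expanded name into {uri, local}; uri is null if no separator. */
    static String[] split(String expanded) {
        int i = expanded.indexOf(SEP);
        if (i < 0) return new String[] { null, expanded };
        return new String[] { expanded.substring(0, i), expanded.substring(i + 1) };
    }
}
```

Because the separator cannot occur in a URI or a Name, split can
always recover both halves unambiguously, which a ":" separator
could not guarantee.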

> David> 3. Revise org.xml.sax.AttributeList and org.xml.sax.DocumentHandler
> David>    to include the namespace as a separate (possibly-null) argument:
> David>
> David>    startElement("urn:www.megginson.com", "doc", ...)
> David>    endElement("urn:www.megginson.com", "doc", ...)

And this would be a helper class that exports an interface related
to, but different from, the SAX 1.0 interface.
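
A rough sketch of what that might look like, purely for illustration:
the interface NsHandler and the class NsSplitter are hypothetical
names, the attribute arguments elided in David's example are left out
here, and the prefix declarations are again assumed to have been
gathered already.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical callback interface: the namespace URI is a separate, possibly-null argument. */
interface NsHandler {
    void startElement(String uri, String local);
    void endElement(String uri, String local);
}

/** Hypothetical helper: takes SAX-1.0-style names and forwards them split. */
final class NsSplitter {
    private final Map<String, String> uriByPrefix = new HashMap<>();
    private final NsHandler handler;

    NsSplitter(NsHandler handler) {
        this.handler = handler;
    }

    void declare(String prefix, String uri) {
        uriByPrefix.put(prefix, uri);
    }

    void startElement(String name) {
        String[] parts = resolve(name);
        handler.startElement(parts[0], parts[1]);
    }

    void endElement(String name) {
        String[] parts = resolve(name);
        handler.endElement(parts[0], parts[1]);
    }

    // Returns {uri, local}; uri is null when the name has no declared prefix.
    private String[] resolve(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) return new String[] { null, name };
        String uri = uriByPrefix.get(name.substring(0, colon));
        if (uri == null) return new String[] { null, name };
        return new String[] { uri, name.substring(colon + 1) };
    }
}
```

An application written against this would have to implement NsHandler
rather than DocumentHandler, which is the cost of this option.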

> 4.  Apply an application-specified re-mapping to the names.  So the
>     above could be initialised with
>        ns_processor.setPrefix("urn:www.megginson.com", "davids-ns");
>     and have its document handler called with
>        startElement("davids-ns:doc", ...)
>        endElement("davids-ns:doc", ...)

This would be another helper class, also a SAX 1.0 compliant parser.
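
Something along these lines, perhaps.  Only setPrefix comes from
Toby's example; the class name PrefixRemapper and the methods declare
and remap are my own hypothetical names, and the document's own
prefix declarations are assumed to be fed in by the filter.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical helper: swaps the document's prefix for one the application chose. */
final class PrefixRemapper {
    // document prefix -> namespace URI (as declared in the document)
    private final Map<String, String> uriByDocPrefix = new HashMap<>();
    // namespace URI -> prefix the application wants to see
    private final Map<String, String> appPrefixByUri = new HashMap<>();

    void setPrefix(String uri, String appPrefix) {
        appPrefixByUri.put(uri, appPrefix);
    }

    void declare(String docPrefix, String uri) {
        uriByDocPrefix.put(docPrefix, uri);
    }

    /** Rewrites "docPrefix:local" as "appPrefix:local"; other names pass through. */
    String remap(String name) {
        int colon = name.indexOf(':');
        if (colon < 0) return name;
        String uri = uriByDocPrefix.get(name.substring(0, colon));
        String appPrefix = (uri == null) ? null : appPrefixByUri.get(uri);
        if (appPrefix == null) return name;
        return appPrefix + ":" + name.substring(colon + 1);
    }
}
```

Whatever prefix the document author happened to pick, the application
then always sees "davids-ns:doc" once it has called
setPrefix("urn:www.megginson.com", "davids-ns").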

> I don't particularly like (2) above, since it means that different SAX
> parsers may return different values for the same document.

I don't understand this comment.

> I do feel that namespace handling is orthogonal to syntactic parsing
> of XML, and that SAX itself should confine itself to the latter.  I'm
> not sure whether namespace processing should happen between SAX and
> the application (layered approach) or separately "at the side" of the
> application (application developers' utility class).

Layered, layered, says I.  Option #2's class would be IMHO the
most useful, since you can plug it into an existing SAX-using
application with zero changes to the interface, and just start
testing for names like "http://purl.oclc.org/NET/xschema\u2020XSchema".

-- 
John Cowan	http://www.ccil.org/~cowan		cowan at ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
