[Q] How should SAX support Namespaces?

Tue Jul 21 18:31:38 BST 1998

Peter Murray-Rust wrote:
> ...
> The ElementNames generate a UniversalName which for
> convenience inside the program I hold as nsString+SEPARATOR+localName.

this is better managed as 
 namespace : (nsString x universalName*)
 universalName : (namespace x localName)

these are the "symbols" which i have alluded to in the past.
the representation saves a lot on references to properties of the namespace itself.
you also don't need the separator. you operate directly on universal names.
it also saves you from the fate which you describe below.
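
a minimal sketch of the two structures above, to make the idea concrete. the class and method names (Namespace, UniversalName, find, intern) are my own assumptions for illustration, not part of SAX or of any proposal on this list.

```python
class UniversalName:
    """a symbol: (namespace x localName). one instance per distinct name."""

    def __init__(self, namespace, local_name):
        self.namespace = namespace
        self.local_name = local_name

    def __repr__(self):
        return f"<UniversalName {self.namespace.ns_string!r} {self.local_name!r}>"


class Namespace:
    """a namespace: (nsString x universalName*)."""

    _registry = {}  # nsString -> Namespace (hypothetical process-wide table)

    def __init__(self, ns_string):
        self.ns_string = ns_string
        self.names = {}  # localName -> UniversalName

    @classmethod
    def find(cls, ns_string):
        # return the one Namespace object for this name string, creating it once
        ns = cls._registry.get(ns_string)
        if ns is None:
            ns = cls._registry[ns_string] = cls(ns_string)
        return ns

    def intern(self, local_name):
        # return the one UniversalName for this local name, creating it once
        sym = self.names.get(local_name)
        if sym is None:
            sym = self.names[local_name] = UniversalName(self, local_name)
        return sym


cml = Namespace.find("http://xml-cml.org")
molecule = cml.intern("Molecule")
```

no separator appears anywhere: the namespace is reached through the symbol (molecule.namespace), not parsed back out of a string.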

> (The
> spec suggests an ordered pair - if we can find a SEPARATOR which is
> guaranteed not to occur in a URN it just makes it a bit easier (this is
> DavidM's #2 but with something other than COLON).) [It never sees the light
> of day, anyway].
> ...
>         The problem I face is with other specs (especially XPointer). These will
> have to be revised to fit namespaces, since I think relying on a prefix in
> a given document may be very dangerous. Thus I'd like to be able to search
> for <CML:Molecule> in a document using XPointer but cannot rely on the
> 'CML'. [I know that some people say XPointer shouldn't be used for such
> 'searches' but my will is weak.] The XPointer spec will have to read
> something like:
> descendant(2,%universalName{[http://xml-cml%SEPARATOR]?Molecule})

this problem is inherent in the string representation of universal names. it's
one of the reasons why that representation is a bad idea. if one manages
universal names as symbols, the problem does not exist.
the process which interns the xpointer maps the name token in the descendant
clause to the same symbol to which the parser mapped the token decoded from the
document. the comparisons are then pointer equality. neither the uri nor the
respective prefixes matter to the application.
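
a rough sketch of that interning step, under some assumptions: the symbol table, the Symbol class, and the functions intern_name and intern_qname are hypothetical, and how an xpointer would declare its own prefix bindings is not settled anywhere, so the xptr_bindings table is invented for the example.

```python
_symbols = {}  # (nsString, localName) -> Symbol; hypothetical shared table


class Symbol:
    __slots__ = ("ns_string", "local_name")

    def __init__(self, ns_string, local_name):
        self.ns_string = ns_string
        self.local_name = local_name


def intern_name(ns_string, local_name):
    # one Symbol object per (namespace, local name) pair
    key = (ns_string, local_name)
    sym = _symbols.get(key)
    if sym is None:
        sym = _symbols[key] = Symbol(ns_string, local_name)
    return sym


def intern_qname(qname, bindings, default_ns=None):
    # resolve the prefix through the bindings in force where the token was read
    prefix, sep, local = qname.partition(":")
    if not sep:
        return intern_name(default_ns, qname)
    return intern_name(bindings[prefix], local)


# the document author chose "ChemML"; the xpointer author chose "CML".
doc_bindings = {"ChemML": "http://xml-cml.org"}
xptr_bindings = {"CML": "http://xml-cml.org"}  # assumed declaration mechanism

from_document = intern_qname("ChemML:Molecule", doc_bindings)
from_xpointer = intern_qname("CML:Molecule", xptr_bindings)

# the match is an identity test; no uri or prefix string is compared here
matches = from_document is from_xpointer
```

the prefixes are consulted only while interning. after that the application holds symbols and never sees "CML" or "ChemML" again.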

> where the [...]? means optional and the %universalName operator means 'use
> the UniversalName' (which may or may not have a prefix according to what the
> document author decided). This will then cater for a document like:
> 
> <?xml:namespace ns="http://xml-cml.org" prefix="CML"?>
> <?xml:namespace ns="http://xml-cml.org" prefix="ChemML"?>
> <CML>
>   <ChemML:Molecule>
>     <ATOMS>...</ATOMS>
>   </ChemML:Molecule>
> </CML>
> 
> This might appear perverse but all three elements types can 'belong to the
> CML DTD'. [I am not invoking scoping.] In a multiauthor document I think
> it's quite possible that we shall see:
> <P>, <HTML:P> and <H:P> all referring to the HTML paragraph element.
> 
> I also pass over the rather hairy problem of validating DTDs.

what do universal names change about validation?

> 
> I wonder whether namespace-aware DTD software has to add defaults on the
> basis of Universal names and not element types. Thus:
> 
> <?xml:namespace ns="http://xml-cml.org" prefix="CML"?>
> <?xml:namespace ns="http://xml-cml.org" prefix="ChemML"?>
> <!DOCTYPE CML [
> <!ATTLIST CML:Molecule title CDATA #FIXED "A molecule">
> ]>
> <CML>
>   <ChemML:Molecule>...</ChemML:Molecule>
> </CML>
> 
> What attributes does the <ChemML:Molecule> element have??

if you've used symbols rather than strings, the symbol "ChemML:Molecule" is
the same as the symbol "CML:Molecule", since you've specified that they are in
the same namespace. a default declared for one is therefore a default declared
for the other.
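
to illustrate with the example document: a sketch in which the attribute defaults from the ATTLIST are keyed by symbol rather than by the element type string. the names used here (Symbol, intern_qname, attlist_defaults, effective_attributes) are hypothetical, and the #FIXED check is left out, so this only shows the lookup.

```python
_symbols = {}  # (nsString, localName) -> Symbol; hypothetical shared table


class Symbol:
    __slots__ = ("ns_string", "local_name")

    def __init__(self, ns_string, local_name):
        self.ns_string = ns_string
        self.local_name = local_name


def intern_qname(qname, bindings):
    # map a prefixed or unprefixed name to its unique symbol
    prefix, sep, local = qname.partition(":")
    key = (bindings[prefix], local) if sep else (None, qname)
    sym = _symbols.get(key)
    if sym is None:
        sym = _symbols[key] = Symbol(*key)
    return sym


# both prefixes are declared for the same namespace, as in the example
bindings = {"CML": "http://xml-cml.org", "ChemML": "http://xml-cml.org"}

# <!ATTLIST CML:Molecule title CDATA #FIXED "A molecule">
attlist_defaults = {
    intern_qname("CML:Molecule", bindings): {"title": "A molecule"},
}


def effective_attributes(qname, explicit, bindings):
    # start from the defaults recorded for the symbol, then add explicit values
    attrs = dict(attlist_defaults.get(intern_qname(qname, bindings), {}))
    attrs.update(explicit)
    return attrs


attrs = effective_attributes("ChemML:Molecule", {}, bindings)
```

under these assumptions the <ChemML:Molecule> element picks up the title declared for CML:Molecule, because both tokens intern to one symbol.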

> 
> By tackling this at the SAX level we have a really wonderful opportunity to
> help ensure that ambiguities are as few as possible. I suspect that some
> exciting areas will arise - this is new territory!
> 

yes, i agree.

  1  :  do it right once for at least tags and attribute names;
+ 1  :  expose the interface to applications for use with attribute values
        and other tokens
-------------
= the immense benefit of saving people a lot of trouble.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/