The Peace Process: DOM and namespaces...

Thu Feb 11 11:42:59 GMT 1999

Tyler Baker writes:

 > The DOM has an unstated implication that it reflects a valid XML
 > document.  If you make a call to getNodeName() on an Element node,
 > it is expected to return a valid XML name.

Perhaps, but violating an 'unstated implication' can hardly make
something illegal (Tyler's original claim).  What Tyler actually
seems to be suggesting is that expanding QNames in the DOM goes
against the original spirit of the API, not against its letter.

 > > The physical representation of an XML document (as defined by XML 1.0)
 > > is not allowed to have characters like '/' and '@' in element and
 > > attribute names, but the DOM is not a physical representation; it is an
 > > API providing access to one view of a document's information set, and
 > > as such, it is not governed by the Name production in XML 1.0.

 > This is one way of looking at it.  But this is not clear and there
 > is no mechanism defined to tell an application whether the DOM is
 > using these illegal names or not.  If you write the DOM Document
 > back out to XML, you are writing out illegal names because you
 > don't know if you are writing out prefixes + local part or
 > namespace + local part.

You have to check anyway -- what if someone's HTML DOM implementation
allowed names containing illegal characters?  Presumably, however, you
have turned on namespace munging somewhere in your DOMBuilder (however
that works), so you know what you're getting.  Namespace munging
should *never* take place by default for vanilla XML 1.0 processing
(in Expat, for example, it is a user-configurable option, and for SAX
1.0, it is handled by third-party filters [which are surprisingly easy
to write]).
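
As a rough illustration of both points (the opt-in switch and the check
before writing), here is a minimal sketch using the Expat binding in
Python's standard library.  The sample document, the helper names
element_names and is_writable_name, and the ASCII-only name pattern are
all assumptions made for the example; the pattern is a simplification
of the XML 1.0 Name production, not a full implementation of it.

```python
import re
import xml.parsers.expat

# Simplified, ASCII-only approximation of the XML 1.0 Name production.
# The real production also admits many non-ASCII letters.
ASCII_NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")

# Hypothetical sample document, used only for this illustration.
DOC = '<a:doc xmlns:a="http://example.com/ns"><a:item/></a:doc>'


def element_names(xml_text, namespace_separator=None):
    """Collect element names as Expat reports them.

    With namespace_separator=None (the default), no namespace
    processing takes place; with a one-character separator, Expat
    reports each name as namespace URI + separator + local part.
    """
    parser = xml.parsers.expat.ParserCreate(
        namespace_separator=namespace_separator)
    names = []
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names


def is_writable_name(name):
    """Return True if the name passes the simplified Name check."""
    return ASCII_NAME.fullmatch(name) is not None


plain = element_names(DOC)          # prefix + local part
expanded = element_names(DOC, " ")  # namespace + local part

for name in plain + expanded:
    print(repr(name), is_writable_name(name))
```

Because the expanded form is only produced when the caller asks for it,
the application already knows which kind of name it is holding, and the
name check catches the expanded ones before they are written back out.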

 > > The XML 1.0 spec does not even require processors to report element
 > > names, so in terms of conformance, anything goes, kids.
 > 
 > How is anyone supposed to reliably build any sort of architecture
 > on XML if everything is this ambiguous?

We're working on it, but you'd be surprised by what you can do even
with partial specs.  XML 1.0 defines the physical representation of a
document as a string of characters; the DOM defines an API into
structured information, such as XML and HTML documents.  There is a WG 
right now working on the XML Information Set, which will provide some
glue between the two -- I'll keep everyone posted.
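
To make the distinction between the character string and the API view
concrete, here is a small sketch using Python's xml.dom.minidom (chosen
only because it is readily available; any DOM implementation should
behave similarly).  The two sample strings and the helper name view are
assumptions for the example.

```python
from xml.dom import minidom

# Two different physical representations (hypothetical samples):
# different quoting, different whitespace, empty-element tag versus
# a start-tag/end-tag pair.
TEXT_A = "<a b='1'/>"
TEXT_B = '<a   b="1"></a>'


def view(xml_text):
    """Return what the DOM exposes for the root element:
    its name, its attributes, and how many children it has."""
    root = minidom.parseString(xml_text).documentElement
    return root.tagName, dict(root.attributes.items()), len(root.childNodes)


print(view(TEXT_A))
print(view(TEXT_B))
print(view(TEXT_A) == view(TEXT_B))
```

The strings differ character by character, yet the API reports the same
structured information for both.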

All the best,

David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/
