Why SAX needs namespace support

Wed Jan 27 17:42:40 GMT 1999

james anderson wrote:

> Tyler Baker wrote:
> >
> > ...
> >
> > Implementing namespaces in an XML parser is a trivial task, but dealing with
> > namespaces at the application level is a totally different story, especially with
> > this namespaces scoping stuff.  How do you deal with namespaces in the DOM?  I
> > mean if you copy a node here, and insert a node there, this entire namespace
> > scoping stuff gets all out of whack.
>
> It gets out of whack only if one models it incorrectly. Does anyone here even
> know what the "upward funarg" problem is?. In any case, it has yet to be shown
> that it is necessary to handle the scoping in the DOM at all, as the
> identifiers in the DOM should be universal names.

Well then, when you are building a DOM tree, should you pretend that namespace
attributes are not really attributes at all, but pseudo-attributes that are not part of
the document structure and exist only temporarily in the serialized form of an XML
file?  Or should they be copied into the DOM tree like all other attributes?
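
To make the question concrete, here is a minimal sketch of the "copy them like any
other attribute" option.  It uses Python's xml.dom.minidom purely as an illustration (an
assumption on my part, not one of the DOM implementations under discussion), and the
urn:example:a namespace is made up.  The declaration is kept as an attribute node on the
element that carried it, so a subtree copied out of that element keeps its prefix but
not the declaration the prefix depends on.

```python
from xml.dom import minidom

# Hypothetical document: the prefix "a" is declared on the root only.
SOURCE = '<a:root xmlns:a="urn:example:a"><a:child/></a:root>'

doc = minidom.parseString(SOURCE)
root = doc.documentElement

# The namespace declaration is stored as an ordinary attribute node.
attrs = root.attributes
decl_names = [attrs.item(i).name for i in range(attrs.length)]

# Copy the child out of its parent and serialize the copy on its own.
# The prefix travels with the copy; the xmlns:a declaration does not.
orphan_xml = root.firstChild.cloneNode(True).toxml()

print(decl_names)
print(orphan_xml)
```

The copy still knows its namespace URI internally (namespaceURI is set on the node), but
the text written out no longer says so, which is the scoping problem in miniature.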

Attributes, whether they are xml:space, xmlns:, or whatever, are fundamental parts of
the document as of XML 1.0.  Should we assume that namespace nodes (as they are called
in XSL) should be automatically generated as necessary when writing out a DOM document
as well?  Doing this makes things sloppy, because the application programmer no longer
has any direct control over the actual output of writing a DOM document to a file.
Reading one XML document into a DOM tree and then writing that tree back out as
another XML document may produce strikingly different results.
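
A small sketch of that round-trip effect, using Python's xml.etree.ElementTree as a
stand-in for a tree that stores only universal names and regenerates the declarations on
output.  The choice of library and the urn:example:config namespace are assumptions for
illustration only.

```python
import xml.etree.ElementTree as ET

# Hypothetical input: the author chose the prefix "cfg".
SOURCE = '<cfg:settings xmlns:cfg="urn:example:config"><cfg:item/></cfg:settings>'

root = ET.fromstring(SOURCE)

# In memory only the universal name survives, in "{uri}local" form.
print(root.tag)

# On output the serializer has to invent a prefix, since the original
# one was discarded at parse time.
round_tripped = ET.tostring(root, encoding="unicode")
print(round_tripped)
```

The two documents mean the same thing to a namespace-aware parser, but the author's
prefix is gone from the text.  ElementTree does let the application register a preferred
prefix for a URI before serializing, but that is extra work the application has to know
to do.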

> >  Of course you could say "well the
> > application programmer will just have to deal with that" and then the application
> > programmer would say "why should I deal with it at all".
>
> No. The better implementation is intrinsic in the parser. Then the problem
> does not exist on the application level.

If all you care about is reading XML with namespaces and that is it, then great,
namespaces don't matter because once you are done doing whatever you did with the
document, you basically throw it away.  But what if you are modeling object state using
XML for a component-based application?  When you have components nested inside other
components, things can get very sloppy very fast.
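
For the read-only case, here is a minimal sketch of what namespace handling inside the
parser looks like from the application side.  It uses Python's xml.sax with its
namespaces feature switched on, which is an assumption for illustration only and not the
SAX interface being debated in this thread; NameCollector and universal_names are
hypothetical names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Records the (namespace URI, local name) pair of every element."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With namespace processing on, `name` is a (uri, localname) tuple;
        # the prefix used in the source text plays no part in it.
        self.names.append(name)


def universal_names(xml_text):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return handler.names


# Two documents that differ only in the prefix they chose.
first = universal_names('<a:doc xmlns:a="urn:example:x"><a:p/></a:doc>')
second = universal_names('<b:doc xmlns:b="urn:example:x"><b:p/></b:doc>')
print(first == second)
```

The handler never sees the prefixes, so reading really is painless.  Nothing in this
sketch helps with writing the tree back out, which is where the nested-component case
gets awkward.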

> > ... I have had to create my own Java <--> XML architecture
> > because nothing out there is suitable for my needs or the needs of developers who
> > will hopefully be working with this client application.
>
> You need only ensure that all identifiers you use are universal names. It is
> unfortunate that current parsers still model identifiers as strings. I would
> hope that this will change. As long as the serializer does the inversion
> mapping to that which the parser did, the application has nothing to do with
> scoping issues.

Aha, the need for a Name type.
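
A rough sketch of what such a type could look like, under my own assumptions: Name and
resolve are hypothetical, and the bindings mapping stands for whatever prefix
declarations are in scope at the point where the qualified name appears (the empty
string keys the default namespace).

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Name:
    """A universal name: namespace URI plus local part, no prefix."""
    uri: str    # "" when the name is in no namespace
    local: str


def resolve(qname: str, bindings: Mapping[str, str],
            is_attribute: bool = False) -> Name:
    """Map a qualified name to a Name using the prefix bindings in scope."""
    prefix, sep, local = qname.partition(":")
    if not sep:
        # Unprefixed element names take the default namespace;
        # unprefixed attribute names are in no namespace.
        uri = "" if is_attribute else bindings.get("", "")
        return Name(uri, qname)
    if prefix not in bindings:
        raise KeyError(f"unbound prefix: {prefix}")
    return Name(bindings[prefix], local)


# Different prefixes, same URI: the two names compare equal.
same = resolve("a:item", {"a": "urn:example:x"}) == \
       resolve("b:item", {"b": "urn:example:x"})
print(same)
```

Because the type is frozen it can be used as a dictionary key, so an application can
compare and look up names without ever touching prefixes.  The serializer still has to
choose prefixes on the way out.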

> >
> > OK the point is, namespaces as they are currently defined I feel make practical
> > use of XML in this regard too difficult for developer novices to deal with.  I
> > would not even have wasted a year of my life on XML if I thought that its goals
> > were targeted exclusively for the browser centric world because that world does
> > not apply at all to how I am using XML.  XML 1.0 did the job.  Namespaces do
> > not.
>
> If namespaces do not do the job, then the application presumes an inadequate
> model for universal names. Which has nothing directly to do with their
> encoding in XML.

I am just saying that when you read an XML file into memory and then spit it back out
as a document, the document structure should not change, aside from maybe stripping
comments and pretty-printing issues.  If people don't feel this is important, then I
have a fundamental disagreement with XML's direction.

The design goals for XML are:
     1. XML shall be straightforwardly usable over the Internet.
     2. XML shall support a wide variety of applications.
     3. XML shall be compatible with SGML.
     4. It shall be easy to write programs which process XML documents.
     5. The number of optional features in XML is to be kept to the absolute
        minimum, ideally zero.
     6. XML documents should be human-legible and reasonably clear.
     7. The XML design should be prepared quickly.
     8. The design of XML shall be formal and concise.
     9. XML documents shall be easy to create.
    10. Terseness in XML markup is of minimal importance.

I suppose there perhaps should have been a #11 which said "XML shall be usable by people
without CS degrees".  Sorry for the sarcasm, but it seems like a lot of people debating
this issue are victims of their own intelligence.  Did the W3C Namespaces WG actually
do any sort of research among average users of XML?  Of course not, because these drafts
and recommendations are all done in secret among the "best and the brightest" without
any outside opinion whatsoever.  Does this mean the people that make up the "Namespaces
in XML" WG are stupid?  Not at all.  It just means that the process that the W3C uses
is stupid.  Of course I suppose I am being "master of the obvious" here, but the W3C
still just does not get it that "end-users matter".

> > Plain and simple "Namespaces in XML" looks ugly,
>
> You are correct on this one point.
>
> > ... feels ugly, and therefore is not
> > practical at all for lots of applications that need to be simple, yet need some
> > namespaces mechanism nevertheless.
> >
> > Yes your customers, but that is in the document world.  You are right they
> > probably could care less.  Perhaps I should scrap XML support altogether and
> > stick with just Java Object Serialization.
>
> The problem of uniquely identifying names does not disappear if you take this
> approach. The serialization form changes. That is all. The requirement, that
> you model them adequately in the application, does not.

Well, if we go down the path that XML will only be generated automatically and modified
automatically directly by applications, then why on earth not just use a binary format
which is more compact, lightweight, and does not require some fancy XML parser to
handle things?  HTML was successful because just about any computer novice could create
an HTML file in Microsoft Windows Notepad and then upload it to their ISP's web
server.  Am I a lone wolf here in my belief that XML should be simple enough for use as
configuration files, user profile data (that may be edited manually), or simple object
state?  So the basic question here is: is XML supposed to be broadly adopted, or just
another niche specification?  Niche specifications don't excite me; broadly adopted
specifications do.

> >
> > Yah, but try and use namespaces with dynamically built DOM trees (or any other
> > object tree implementation that maps to XML).
>
> Go back and look at the note which I posted just before Christmas on "External
> DTD & namespaces". The DOM manipulation and the serialization are the easy part.
>
> > ... It can be a major pain in the rear
> > if you have multiple components that have no knowledge of each other and whose
> > externalized content is merged into one XML document tree.
>
> This claim is not correct. The serialization is actually the easiest part of
> the problem.
> Yes, the proposed serialization form for prefix bindings is convoluted. The
> distinctions between kinds of namespaces is baroque. These things do not
> matter to an application. It is the job of the parser to get these right and
> to map the qualified names to easily managed internal representations for
> universal names. Once that is accomplished, there need be no special coding in
> the application to accommodate namespaces.

Very true iff you don't care about end-users doing anything with XML files.  I have
much more faith in the average computer user than it seems a lot of people do here.  I
actually believe normal human beings are capable of making simple changes to XML
documents manually if they need to.  Ideally, applications will be generating XML files
most of the time, but there are situations where people will want and need direct
control.  If you could assume that applications generate and process all of the markup
all of the time, you could eliminate validation, do tag minimization, eliminate line
number counting, and drop plenty of other features that carry significant overhead and
are only there to support robust error reporting.

Also, the last time I checked, pretty much 99.9% of people using XSL are building
stylesheets manually.  Since people will be writing XML documents by hand, as is the
case with XML stylesheets, forcing them to deal with this namespaces issue when
the argument here is that "the tools will take care of the job" is rather
contradictory.  Tim Bray mentioned a quote along the lines that the two hardest things
in computer science are namespaces and document caching (I think I am wrong about the
latter).  What is the purpose of sticking in namespaces at all if it is one of the most
complicated issues in computer science?  Sticking one of the most complicated issues in
computer science into XML does not lend itself well to the argument that "XML documents
shall be easy to create."

Last but not least, XML has a lot of potential in the EDI space.  EDI, though ugly, has
no concept of namespaces, and for years people have gotten by without them.  If you make
dealing with XML more of a pain than dealing with EDI, then why on earth would anyone
want to make the switch other than because they heard that "XML is cool and the best
new thing since sliced bread"?  Sooner or later the hype will die down and actual
usability will become an issue when companies and application developers decide whether
to support XML.  After all, no one wants to support technologies that bring in lots of
tech support calls.

Tyler

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)