A very meaty editor discussion

David Megginson ak117 at freenet.carleton.ca
Tue Apr 28 19:56:38 BST 1998

Ray writes:

 > First, I'll mention my problems using SAX as perhaps a catalyst
 > for SAX Level 2 / Authoring Interface.
 > 1) no ability to get at DTD info. If a user has internal entities,
 >    a dtd, etc.... loading up a document and saving it via SAX
 >    results in corruption.

First, I'd like to say that I'm flattered that people want to use SAX
for this kind of thing, when its original target audience was very
different (I guess that a Hyundai in the lot is better than a BMW
that's still on order).

Secondly, though, I'll pick a nit here and assert that an XML document
that passes through SAX is normalised, but not corrupted; that is, if
I pass a document through SAX once (with an XML-writing
DocumentHandler), then pass it through again, the results will be the
same both times.
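
A rough sketch of that claim, using Python's xml.sax module as a
stand-in (an assumption on my part: it is a later port of the SAX
design, not the Java SAX discussed here), with its XMLGenerator
playing the role of the XML-writing DocumentHandler.  The sample
document and the name pass_through_sax are made up for illustration:

```python
import io
import xml.sax
from xml.sax.saxutils import XMLGenerator

# Hypothetical sample document: an internal entity plus a comment.
SOURCE = """<?xml version="1.0"?>
<!DOCTYPE note [
  <!ENTITY who "Ray">
]>
<note><!-- a lexical comment -->Hello, &who;!</note>
"""

def pass_through_sax(xml_text):
    """Parse xml_text and write the reported events back out as XML."""
    out = io.StringIO()
    writer = XMLGenerator(out, encoding="utf-8")
    xml.sax.parseString(xml_text.encode("utf-8"), writer)
    return out.getvalue()

once = pass_through_sax(SOURCE)   # entity expanded, comment and DOCTYPE gone
twice = pass_through_sax(once)    # nothing left to normalise

print(once == twice)
```

The first pass loses the DOCTYPE, the entity reference and the
comment; the second pass has nothing further to change, which is the
sense in which the output is normalised rather than corrupted.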

 > 2) no ability to get at unparsed entities (DOM supplies this I
 >    believe) This is important if you want to preserve the look of 
 >    the source.

This is not exactly true, though there was a bug in the January draft
of SAX.  In the January draft (which you're probably using), unparsed
entity and notation information was delivered in a clumsy way through
the AttributeMap interface, so you did get information about all
notations or unparsed entities that were actually referenced (except
in the case of an ENTITIES attribute with multiple entities -- that
was the bug).

SAX 1.0, which is just about to leave beta (doesn't _anyone_ have bug
reports, aside from one JavaDoc typo?), has a new, much simpler
DTDHandler interface which reports all notation and unparsed entity
declarations to the application (but no other DTD information).  You
can take a peek at 1.0beta at
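
To give a feel for the shape of such an interface, here is a sketch
using the DTDHandler class from Python's xml.sax (an assumption: a
later port whose method names may differ from the Java interface).
The class DeclCollector, the function collect_decls and the sample
document are hypothetical:

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Hypothetical sample: one notation, one unparsed entity, one internal entity.
DOC = b"""<!DOCTYPE doc [
  <!NOTATION gif SYSTEM "image/gif">
  <!ENTITY logo SYSTEM "logo.gif" NDATA gif>
  <!ENTITY who "Ray">
  <!ELEMENT doc EMPTY>
  <!ATTLIST doc pic ENTITY #IMPLIED>
]>
<doc pic="logo"/>"""

class DeclCollector(DTDHandler):
    """Record every notation and unparsed entity declaration reported."""
    def __init__(self):
        self.notations = []
        self.unparsed = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.unparsed.append((name, systemId, ndata))

def collect_decls(doc):
    parser = xml.sax.make_parser()
    decls = DeclCollector()
    parser.setContentHandler(ContentHandler())  # ignore document content
    parser.setDTDHandler(decls)
    parser.parse(io.BytesIO(doc))
    return decls

decls = collect_decls(DOC)
print(decls.notations, decls.unparsed)
```

Note that the internal entity "who" never reaches the handler: only
notations and unparsed entities are reported, and no other DTD
information.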


 > I believe in the power of text, and I think everything should be
 > editable via a text editor and that information should be
 > preserved.  If I can't author a document in emacs, load it up into
 > an editor, and go back and forth, something is wrong.

I agree strongly, which is why I probably wouldn't use SAX level 1 for
an editor (it's meant for downstream processing, where lexical things
don't matter).  The DOM would seem to be the best match, since an
editor will want to store a document tree anyway, but I am happy to go
ahead with a level 2 SAX if the XML-Dev members convince themselves
(and me) that it could fill a niche that the DOM cannot.
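
As a small illustration of why a tree API suits an editor better,
here is a sketch using Python's xml.dom.minidom (an assumption: just
one DOM-style implementation, chosen because it is easy to run).  The
function name pass_through_dom is made up:

```python
from xml.dom import minidom

# Hypothetical sample: a comment that an author would want to keep.
SOURCE = "<note><!-- keep me -->Hello</note>"

def pass_through_dom(xml_text):
    """Build a document tree, then serialise the root element again."""
    document = minidom.parseString(xml_text)
    return document.documentElement.toxml()

print(pass_through_dom(SOURCE))
```

The comment is stored as a node in the tree, so it survives the
round trip, whereas a plain SAX level 1 pass never reports it.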

 > 3) comments and CDATA have been mentioned before. I'll note that
 >    even Netscape's Composer doesn't preserve comments! I believe
 >    comments are fundamentally important, even in the simplest
 >    XML applications, because it allows an author to annotate
 >    a file inline and transfer it anywhere. You could of course
 >    envision annotation being done differently, say with
 >    a separate annotation DTD and XML-Link/Xpointers, but would
 >    it be as human readable as simple source comments?

Since you mentioned Emacs LISP, think of the two kinds of comments
in the following ELISP function:

 ;; This function says hello

 (defun hello ()
   "Display a friendly message at the bottom of the window."
   (message "Hello!"))

The first comment,

  ;; This function says hello

is purely lexical: it has no special significance to the Emacs
byte-code compiler, which will simply discard it.  This is the
equivalent of a <!-- --> comment in XML source.

The second comment (called a doc string),

  "Display a friendly message at the bottom of the window."

is more important -- the compiler will preserve this information with
the function definition, and will use it to provide interactive help
to the end-user.  This is the equivalent of an annotation included in
XML source as an element or an attribute value.

Comments are important to authors, which is why they would certainly
appear in a level-2 SAX if we made one.  They _must not_ be important
to downstream processing, though; if they are, then they belong in XML
markup (possibly even as a pointer to a different document).
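
The same distinction, sketched for XML with Python's xml.sax (again
an assumption, standing in for a SAX level 1 parser).  The element
and attribute names (functions, function, name, doc) are invented for
the example:

```python
import xml.sax
from xml.sax.handler import ContentHandler

# Hypothetical sample: one lexical comment, one annotation held in markup.
DOC = b"""<functions>
  <!-- This function says hello -->
  <function name="hello"
            doc="Display a friendly message at the bottom of the window."/>
</functions>"""

class DocCollector(ContentHandler):
    """Downstream processor: keeps the annotation carried in an attribute."""
    def __init__(self):
        super().__init__()
        self.docs = {}

    def startElement(self, name, attrs):
        if name == "function":
            self.docs[attrs["name"]] = attrs.get("doc")

def collect_docs(doc):
    handler = DocCollector()
    xml.sax.parseString(doc, handler)
    return handler.docs

print(collect_docs(DOC))
```

The handler has no callback through which the <!-- --> comment could
arrive, while the doc attribute is delivered like any other piece of
markup.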

All the best,


David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)