SAX needs from our point of view

Michael Amster mamster at webeasy.com
Thu Apr 23 23:32:59 BST 1998


Quoting Ray Cromwell:

>Ok, now that I've started a flame war and gotten that off my chest :), 
>I'd like to nominate the three biggest features I'd like in SAX Level 2 
>(or SAX2.0), in order of importance.
>1) access to DTD information 
>2) comments, CDATA, and location information for Attributes 
>3) sax.util classes that take an ElementFactory (which return DOM 
>interfaces), and build a tree. (maybe Don Park would like to contribute 
>this). IBM's XML for Java is a starting point, but it has the fatal flaw 
>that the return values of the ElementFactory are not the DOM interfaces 
>(such as Element or PI) but IBM base classes, like TXElement or PI, 
>which means you are forced to inherit from TXElement instead of just 
>implementing Element.

In our case, we embed XML languages inside our own language, which
controls the flow of execution. We therefore have a real need for an
accurate reproduction of the XML elements parsed, so that they can be
written back out correctly. The issue matters most in distinguishing
between text and CDATA. Let me illustrate with a simple example:

<WEIF COND="true">
	<WETHEN>
		<ARBITRARYXML/>
		<![CDATA[
			This is data with &references; which should not be parsed!
		]]>
		<MOREXML>
			This is just text
		</MOREXML>
	</WETHEN>
</WEIF>

When a SAX parser reports this, nothing differentiates the CDATA from
ordinary text. Now let's say that we want to write the subset of arbitrary
XML back out from our DOM or other object structure:

		<ARBITRARYXML/>
			This is data with &references; which should not be parsed!
		<MOREXML>
			This is just text
		</MOREXML>

Now you see that what was CDATA will have all of its references expanded
when it is reparsed. We really do want SAX to preserve CDATA as distinct
from text. I can live without comments and, to some degree, I can even do
with less DTD info, but I hope that CDATA and text are reported differently
through the interface. It should not substantially complicate things for
parser writers or application developers if it is just a DocumentHandler
event.
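
To make the request concrete, here is a rough sketch of an application
that is told where CDATA starts and ends. It is only an illustration under
assumptions: it uses Python's xml.sax rather than the Java interfaces under
discussion, and it relies on that library's optional lexical-handler
property (startCDATA/endCDATA callbacks), which is not part of the
interface being discussed here. CdataAwareWriter, rewrite and SAMPLE are
made-up names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler
from xml.sax.saxutils import escape, quoteattr


class CdataAwareWriter(ContentHandler):
    """Hypothetical handler: writes the document back out, keeping CDATA."""

    def __init__(self):
        super().__init__()
        self.out = []          # pieces of the rewritten document
        self.in_cdata = False  # True between startCDATA() and endCDATA()

    # --- ordinary content events ---
    def startElement(self, name, attrs):
        rendered = "".join(" %s=%s" % (k, quoteattr(v)) for k, v in attrs.items())
        self.out.append("<%s%s>" % (name, rendered))

    def endElement(self, name):
        self.out.append("</%s>" % name)

    def characters(self, content):
        # Inside CDATA copy the data verbatim; elsewhere escape & < >.
        self.out.append(content if self.in_cdata else escape(content))

    # --- lexical events (the expat reader expects all five methods) ---
    def startCDATA(self):
        self.in_cdata = True
        self.out.append("<![CDATA[")

    def endCDATA(self):
        self.in_cdata = False
        self.out.append("]]>")

    def comment(self, text):
        pass  # comments are dropped in this sketch

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass


def rewrite(xml_text):
    """Parse xml_text and return it re-serialized with CDATA preserved."""
    handler = CdataAwareWriter()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setProperty(property_lexical_handler, handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return "".join(handler.out)


SAMPLE = (
    "<WETHEN><ARBITRARYXML/>"
    "<![CDATA[data with &references; which should not be parsed!]]>"
    "<MOREXML>This is just text</MOREXML></WETHEN>"
)

print(rewrite(SAMPLE))
```

Because the CDATA markers are written back out, the result can be reparsed
without &references; being treated as an entity reference. Note that in
this sketch the empty ARBITRARYXML element comes back as a start/end tag
pair, since the parser reports both forms the same way.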

-MA
~-~-~-~-~-~-~-~-~-~-~-~-~-~-WEBEASY-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~-~
Michael Amster					mamster at webeasy.com
4676 Admiralty Way, Suite 300			Tel: 310.576.0770
Marina Del Rey, CA 90292		       	Fax: 310.576.2011