Preserving white space and entity references in DataChannel XJP

Tom McCann tom at DataChannel.com
Tue Aug 3 13:22:30 BST 1999


Sorry for the delay in responding.  Here's the response from the developer:

For things like <element1> &lt; fubar &gt; </element1>, the parser is
handling the entity reference nodes correctly. When preserveWhitespace is
on, the 2nd (childNodes.item(1)) and 5th (childNodes.item(4)) child nodes of
element1 are entityRef nodes. Their nodeValues are "<" and ">",
respectively.
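
To check which child sits at which index, it can help to print every child of
the element with its position, node type and nodeValue. The following is only
a sketch: it uses the standard org.w3c.dom interfaces through JAXP as a
stand-in, not the XJP API, and the class and helper names (ChildNodeDump,
parse, describeChildren) are made up for illustration. A standard JAXP parser
will normally expand &lt; and &gt; into the surrounding text, so its output
will not match the XJP node layout described above; the point is only the
index-by-index inspection.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNodeDump {

    // Hypothetical helper: parse an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // Short label for the node types that matter in this discussion.
    static String typeName(short type) {
        switch (type) {
            case Node.TEXT_NODE:             return "TEXT";
            case Node.ENTITY_REFERENCE_NODE: return "ENTITY_REF";
            case Node.ELEMENT_NODE:          return "ELEMENT";
            default:                         return "OTHER(" + type + ")";
        }
    }

    // One line per child: zero-based index, node type, and nodeValue in brackets.
    static List<String> describeChildren(Node parent) {
        List<String> lines = new ArrayList<>();
        NodeList kids = parent.getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            Node kid = kids.item(i);
            lines.add("item(" + i + ") " + typeName(kid.getNodeType())
                    + " [" + kid.getNodeValue() + "]");
        }
        return lines;
    }

    public static void main(String[] args) {
        Document doc = parse("<element1> &lt; fubar &gt; </element1>");
        for (String line : describeChildren(doc.getDocumentElement())) {
            System.out.println(line);
        }
    }
}
```

Printing the zero-based index next to each node makes it easier to see that
"the 2nd child" means childNodes.item(1), and to spot any unexpected TEXT
nodes in between.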

I found a bug while testing this. The parser does not reset the whitespace
buffer correctly. As a result, the 3rd child node of element1
(childNodes.item(2)) is a TEXT node whose nodeValue is a single space. This
node should not be there.

My guess is that this user did not get the child item number right. The
bug I just found creates an extra TEXT node, which could be confusing.
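
Code that only needs the text of an element can avoid depending on exact
child indexes (and on stray TEXT nodes) by concatenating the text of all
children itself. The sketch below is an illustration under assumptions, not
XJP code: it uses standard org.w3c.dom interfaces via JAXP, and TextCollector,
parse and collectText are hypothetical names. It also assumes the standard DOM
convention that an entity reference node carries its replacement text in its
children, which differs from the nodeValue behavior described above for XJP.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

public class TextCollector {

    // Hypothetical helper: parse an XML string with the default JAXP parser.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse: " + xml, e);
        }
    }

    // Concatenate all character data found below the given node.
    static String collectText(Node node) {
        StringBuilder sb = new StringBuilder();
        append(node, sb);
        return sb.toString();
    }

    private static void append(Node node, StringBuilder sb) {
        switch (node.getNodeType()) {
            case Node.TEXT_NODE:
            case Node.CDATA_SECTION_NODE:
                // Character data: take the value as-is, whitespace included.
                sb.append(node.getNodeValue());
                break;
            case Node.ELEMENT_NODE:
            case Node.ENTITY_REFERENCE_NODE:
                // Containers: descend; an entity reference's children hold
                // its replacement text in the standard DOM.
                for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
                    append(c, sb);
                }
                break;
            default:
                // Comments, processing instructions, etc. contribute no text.
                break;
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<element1> &lt; fubar &gt; </element1>");
        System.out.println("[" + collectText(doc.getDocumentElement()) + "]");
    }
}
```

Because the helper walks whatever children are present, an extra
whitespace-only TEXT node changes the spacing of the result but not the
logic, and no child index is hard-coded.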

HTH

Tom McCann
Director of Engineering
DataChannel Inc.
http://www.datachannel.com/



> -----Original Message-----
> From: Vance Christiaanse [mailto:vance at textwise.com]
> Sent: Friday, July 30, 1999 12:19 PM
> To: Erik James Freed
> Cc: Xml-Dev; Keith Swenson
> Subject: Re: Preserving white space and entity references in DataChannel XJP
> 
> 
> Step 1:
> > > Erik James Freed wrote:
> > >
> > > I am experiencing some strange behavior with the datachannel XML
> > > parser package (the most recent one).
> > > In my reading of the DOM spec, this is not appropriate behavior, but
> > > perhaps I am missing something.
> > >
> > > The behavior is that when I do a 'setPreserveWhiteSpace(true)' before
> > > parsing a document, and the document
> > > contains strings with entity references such as:
> > >
> > >     <element1> &lt; fubar &gt; </element1>
> > >
> > > when I then do a getText() on element1, what is returned is a
> > > java.lang.String that contains a null (char 0) for each entity
> > > reference.
> > >
> > > These nulls of course confound the rest of the code I am writing.
> > > Inside the DOM tree the entity reference objects are happily holding
> > > the appropriate text representation, i.e. '<' and '>'.
> > >
> > > Turning off white space preservation makes the getText() place
> > > appropriate decoded entity references in the resulting string.
> > >
> > > Bug or feature?
> 
> Step 2:
> I wrote:
> > I don't see a setPreserveWhiteSpace(...) method or preserveWhiteSpace
> > class or instance variable in the DOM spec and I don't see getText() or
> > a text variable either. The answer to "bug or feature" would be up to
> > the
> > 
> > > datachannel XML parser package
> 
> Step 3:
> Erik wrote:
> > Vance,
> > 
> > Yes indeed this is not a pure DOM/XML issue, however the DC extension
> > does purport to adhere to standard XML concepts.
> > 
> > The following is from the datachannel documentation on the
> > PreserveWhiteSpace parameter:
> > 
> >         "As per the XML Language Specification, this specifies the white
> > space handling for the application; that is, the default white space
> > handling to apply when xml:space="default". If preserveWhiteSpace is
> > true, all white space will be preserved regardless of the setting of
> > any xml:space attributes in the document. The white space will be
> > preserved by additional text nodes being present in the tree. If
> > preserveWhiteSpace is false, then the values of the xml:space attribute
> > specified in the document will determine whether white space is
> > preserved or not."
> > 
> > So with that clarification is this a bug or a feature?
> 
> Step 4:
> I don't know, unfortunately. I've been studying the DOM and I just
> wanted to clarify its boundaries. Hopefully someone familiar with the
> DataChannel XML parser package will answer!
> 
> Vance
> 
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN
981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
