"Save as XML"

David Megginson ak117 at freenet.carleton.ca
Fri Feb 6 14:57:05 GMT 1998


Peter Murray-Rust writes:

 > I have a problem to know what to do with "save XML" on JUMBO. In
 > the SAXDemo routine characters(), DavidM converts non printing
 > chars to escaped variants *e.g. asc(10) -> &#10;, but does *not*
 > convert & to &amp;. This means that any XML file that contains &
 > will produce invalid XML output.

Sorry for any confusion there -- I had originally used '\n' and '\r',
then decided to use character references to be more XML-like.  I
realise, though, that that gives the unintended appearance of an
attempt to produce XML-parseable character data.  Perhaps I should go
back to C-like escapes.
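
For comparison, output that is meant to be parsed again needs at least
& and < escaped in character data, plus the quote character inside
attribute values.  The following is only a minimal Python sketch of
that idea, not the SAXDemo code; the function names escape_text and
escape_attribute are made up for illustration.

```python
def escape_text(data: str) -> str:
    """Escape character data so it can be written between tags."""
    # '&' has to be handled first; otherwise the '&' introduced by the
    # later replacements would itself be escaped a second time.
    data = data.replace("&", "&amp;")
    data = data.replace("<", "&lt;")
    # '>' strictly needs escaping only in the sequence ']]>', but
    # escaping it everywhere is harmless and simpler.
    data = data.replace(">", "&gt;")
    return data


def escape_attribute(value: str) -> str:
    """Escape a value that will be written inside double quotes."""
    return escape_text(value).replace('"', "&quot;")


print(escape_text("a & b < c ]]> d"))
print(escape_attribute('say "hi" & go'))
```

Python's standard library offers xml.sax.saxutils.escape() and
quoteattr() for the same job; the hand-written version above is shown
only to make the order of the replacements visible.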

 > What is the appropriate strategy? Should a "save XML" application
 > convert all five chars (&, <, >, ', ") to their escaped
 > equivalents? Or none? Or just the first two. [In my own community I
 > don't think using <![CDATA[ is a good idea because people won't
 > have any idea what is going on and they will get it wrong.  In any
 > case - as pointed out - it doesn't overcome the random occurrence
 > of ']]>' ].

This taps into an earlier discussion about what is and is not
significant information in an XML document.  For example, if the
general entity &name; is set to "David Megginson", then the following
two fragments are exactly equivalent for many XML applications:

FRAGMENT 1:

  <x


  y="z">My name is<!--

  here's a comment


  --> &name;.</x>


FRAGMENT 2:

  <x y="z">My name is David Megginson.</x>


Some authoring and repository tools, however, will want to preserve
the general entity reference, the comment, and the whitespace (even
inside the start tag).  In SGML, you can use grove plans to specify
what information is and is not significant to an application -- but
there is still a lack of detailed standards for the information set
(or sets) returned by an XML parser.
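
As a rough illustration of the equivalence, here is a minimal Python
sketch using the standard xml.etree.ElementTree module.  It assumes
that &name; is declared in an internal DTD subset (the declaration is
not part of the fragments as posted) and that the comment opens with
the standard <!-- delimiter.  The helper name summarize is made up for
this example.

```python
import xml.etree.ElementTree as ET

# Assumed declaration for &name;, not part of the fragments as posted.
DOCTYPE = '<!DOCTYPE x [<!ENTITY name "David Megginson">]>\n'

FRAGMENT_1 = """<x


  y="z">My name is<!--

  here's a comment


  --> &name;.</x>"""

FRAGMENT_2 = '<x y="z">My name is David Megginson.</x>'


def summarize(document: str):
    """Return the tag, attributes, and text reported for the root element."""
    root = ET.fromstring(document)
    return root.tag, root.attrib, root.text


first = summarize(DOCTYPE + FRAGMENT_1)
second = summarize(DOCTYPE + FRAGMENT_2)

# With the default tree builder the comment, the entity reference, and
# the whitespace inside the start tag leave no trace in the result.
print(first == second)
```

This is exactly the kind of application for which the two fragments
are interchangeable; a tool that has to write Fragment 1 back out
unchanged would need a parser interface that reports more than this
one does by default.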


All the best,


David

-- 
David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
      http://home.sprynet.com/sprynet/dmeggins/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



