Confusion about conditional sections

Richard L. Goerwitz III richard at goon.stg.brown.edu
Sat Feb 20 23:57:27 GMT 1999


> Once you see a conditional ignore section, can you effectively...
> just scan for <![ and ]]> parts of the text in side there without
> doing regular parsing?

As several others have pointed out, productions 61-65 don't quite say
this.  You still have to be sure that <![ and ]]> are nesting
correctly, even inside of ignored sections.

Just to re-quote the prose section that explains this point:

> Note that for reliable parsing, the contents of even ignored
> conditional sections must be read in order to detect nested
> conditional sections and ensure that the end of the outermost
> (ignored) conditional section is properly detected.
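
To make the quoted rule concrete, here is a minimal sketch (in Python,
and only my reading of productions 63-65, not code from any real
parser) of how the end of an ignored section might be located.  The
function name find_ignore_end and its arguments are hypothetical.  It
counts <![ and ]]> and nothing else, so it pays no attention to
strings or comments:

```python
def find_ignore_end(text: str, start: int) -> int:
    """Return the index just past the ]]> that closes an ignored section.

    `start` is assumed to point at the first character after the
    opening "<![IGNORE[".  Only "<![" and "]]>" are recognized; quoted
    strings and comments get no special treatment.
    """
    depth = 1  # we are already inside one conditional section
    i = start
    while i < len(text):
        if text.startswith("<![", i):
            depth += 1  # nested conditional section opens
            i += 3
        elif text.startswith("]]>", i):
            depth -= 1  # one level closes
            i += 3
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated conditional section")


# Nesting is tracked, so the inner ]]> does not end the outer section.
nested = "<![IGNORE[ a <![INCLUDE[ b ]]> c ]]> tail"
end = find_ignore_end(nested, len("<![IGNORE["))
print(repr(nested[end:]))

# A ]]> inside a quoted string, however, ends the section early.
quoted = '<![IGNORE[ <!ENTITY x "]]>"> ]]>'
end = find_ignore_end(quoted, len("<![IGNORE["))
print(repr(quoted[end:]))
```

The second call shows the behavior at issue: the scan stops at the
]]> inside the entity value and leaves the rest of the declaration
behind as if it were live markup.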

In practical use, I'm not sure how useful this behavior is, because
one of the main uses of conditional sections, for many shops, is to
divert SGML and XML portions of a DTD into their own conditional
sections.  You want to be able to simply cut and paste, and not worry
about whether quoted strings inside markup or comments contain a ]]>
sequence.

If the reason the standard was worded the way it was is to make
processing simple, then I can only shrug.  XML is looking as though it
is going to become a monster.  Asking parsers to pay attention to
strings and comments inside of ignored conditional sections is not
going to make it much more difficult to grok than it already is, and
it may eliminate a source of problems.
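
For what it's worth, here is a rough sketch (Python again) of the kind
of scanning I have in mind.  This is NOT what the XML spec requires;
it is an illustration of the alternative, and the names
find_ignore_end_lenient and skip_declaration are made up for the
example.  It assumes that quoted strings only matter inside markup
declarations that begin with <! followed by a letter, and that
comments run from <!-- to the next -->:

```python
def skip_declaration(text: str, i: int) -> int:
    """Skip a declaration such as <!ENTITY ...>, honoring quoted strings.

    `i` points at the "<!".  Returns the index just past the closing ">".
    """
    i += 2
    while i < len(text):
        ch = text[i]
        if ch in "\"'":
            close = text.find(ch, i + 1)  # jump to the matching quote
            if close == -1:
                raise ValueError("unterminated quoted string")
            i = close + 1
        elif ch == ">":
            return i + 1
        else:
            i += 1
    raise ValueError("unterminated markup declaration")


def find_ignore_end_lenient(text: str, start: int) -> int:
    """Like a plain <![ / ]]> counter, but skips comments and declarations.

    `start` is assumed to point just past the opening "<![IGNORE[".
    """
    depth = 1
    i = start
    while i < len(text):
        if text.startswith("<!--", i):
            close = text.find("-->", i + 4)
            if close == -1:
                raise ValueError("unterminated comment")
            i = close + 3  # nothing inside a comment counts
        elif text.startswith("<![", i):
            depth += 1
            i += 3
        elif text.startswith("<!", i) and text[i + 2:i + 3].isalpha():
            i = skip_declaration(text, i)
        elif text.startswith("]]>", i):
            depth -= 1
            i += 3
            if depth == 0:
                return i
        else:
            i += 1
    raise ValueError("unterminated conditional section")


sample = '<![IGNORE[ <!-- ]]> --> <!ENTITY x SDATA "e.g., ]]>"> ]]> tail'
end = find_ignore_end_lenient(sample, len("<![IGNORE["))
print(repr(sample[end:]))
```

The extra bookkeeping is two small branches on top of the nesting
count, which is roughly the cost I am arguing parsers could bear.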

=================  test.xml ==================
<!DOCTYPE test SYSTEM "test.dtd" [
  <!ENTITY % sgml_section "IGNORE">
  <!ENTITY % xml_section "INCLUDE">
]>
<test>&agrave;</test>

=================  test.dtd ==================
<!ELEMENT test (#PCDATA)>
<![%xml_section;[
  <!ENTITY agrave "[some unicode escape]">
  <!ENTITY egrave "[some unicode escape]">
  <!ENTITY igrave "[some unicode escape]">
]]>
<![%sgml_section;[
  <!-- This section should be able to hold comments and
   -   markup containing strings.  There should be no
   -   problems with ]]> sequences in either of these
   -   contexts.
   -->
  <!ENTITY igrave SDATA "anything we please, e.g., <!-- ]]> -->">
  <!ENTITY agrave SDATA "anything we please, e.g., ]]>">
  <!ENTITY egrave SDATA "anything we please, e.g., ]>">

  <!-- I don't see why we couldn't have a CDATA section
    -  here, since we're in an IGNORE section.
    -->
  <![CDATA[ This CDATA section should be ignored. ]> ]]]>
]]>
<!-- And our parser should be smart enough to close the
  -  IGNORE section above, and not here: ]]>
  -->

Richard Goerwitz
Scholarly Technology Group
