Attribute normalisation and character entities
Arjun Ray
aray at q2.net
Mon Jan 24 16:47:48 GMT 2000
On 24 Jan 2000, Richard Tobin wrote:
> Section 3.3.3 seems to me to say that character references are not
> subject to the translation to #x20 [...]
> The errata (http://www.w3.org/XML/xml-19980210-errata) re-writes this
> section but does not appear to change it in this respect.
>
> However the Oasis test suite, in tests sa02 and not-sa02, requires
> that they are replaced with spaces.
>
> Which is correct?
If the intent is to do it the SGML way, then 3.3.3 is correct. In fact, I
think 3.3.3 (as clarified in the errata) is the best explanation I've seen
of this! :-)
The SGML gotcha here has to do with the 'SEPCHAR' category. A numeric
character reference is always character data at the point it occurs, and
so doesn't get *parsed* as SEPCHAR (and thus isn't normalized afterwards
for attributes with non-CDATA declared values).
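For the XML side of the question, the rule in 3.3.3 can be sketched in a few lines of Python. This is only an illustration, not a conforming implementation: the function name normalize_cdata_attribute is made up for this example, entity references are not handled, the value is assumed to be CDATA-typed, and line ends are assumed to have been normalized already. It shows that a literal white space character becomes #x20 while a character reference to the same character is passed through untouched.

```python
import re

# Matches a numeric character reference such as &#10; or &#xA;
_CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")


def normalize_cdata_attribute(raw: str) -> str:
    """Hypothetical helper: apply the 3.3.3 rules to a raw CDATA attribute value.

    Assumptions: no entity references in the value, and line ends are
    already normalized to a single line feed.
    """
    out = []
    i = 0
    while i < len(raw):
        m = _CHAR_REF.match(raw, i)
        if m:
            # Character reference: append the referenced character as-is.
            code = int(m.group(1), 16) if m.group(1) else int(m.group(2))
            out.append(chr(code))
            i = m.end()
        elif raw[i] in "\t\n\r ":
            # Literal white space: append a single #x20 instead.
            out.append(" ")
            i += 1
        else:
            out.append(raw[i])
            i += 1
    return "".join(out)


print(repr(normalize_cdata_attribute("blah1&#10;blah2")))  # reference survives
print(repr(normalize_cdata_attribute("blah1\nblah2")))     # literal LF is replaced
```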
Try this file with nsgmls:
===
<!DOCTYPE foo [
<!ELEMENT foo - - (#PCDATA) >
<!ATTLIST foo
bar CDATA #IMPLIED
baz NAMES #IMPLIED
>
]>
<foo bar="blah1&#10;blah2" baz="grape&#10;banana">...</foo>
===
This won't validate. So
a) Replace '&#10;' with '&#RE;'. Now, it will validate. (because RE is a
SEPCHAR when parsed.)
b) Replace it with '&lf;' and add a declaration in the DTD
<!ENTITY lf "&#10;" >
This, too, will validate (because the character reference substitution
occurs when the entity declaration is *parsed*, and so is a regular
literal whitespace character by the time the entity reference is used.)
c) Change the entity declaration to
<!ENTITY lf CDATA "&#10;" >
and now, it won't validate any more. (because the recursive parsing rule
has been short-circuited.)
d) Repeat (b) and (c) with 'RE' for '10' in the entity declaration. Same
difference in results.
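A rough XML analogue of the entity cases can be tried with Python's standard library parser. This is a sketch under assumptions, not a rerun of the nsgmls experiment: XML has no CDATA entities, so the second case is approximated by escaping the ampersand (&#38;#10;) so that the character reference only appears when the entity is used. The helper name attr_value is hypothetical. Per 3.3.3, an entity whose literal value is the character reference &#10; has a literal line feed as its replacement text, which should be normalized to a space when the entity is referenced in an attribute value; the escaped form should keep the line feed.

```python
import xml.etree.ElementTree as ET


def attr_value(entity_literal: str) -> str:
    """Hypothetical helper: declare entity lf with the given literal value,
    reference it inside an attribute, and return the parsed attribute value."""
    doc = (
        f'<!DOCTYPE foo [<!ENTITY lf "{entity_literal}">]>'
        '<foo bar="blah1&lf;blah2"/>'
    )
    return ET.fromstring(doc).get("bar")


# Reference expanded when the declaration is parsed: the replacement text
# is a literal line feed, which is normalized like any literal white space.
via_literal = attr_value("&#10;")

# Ampersand escaped: the replacement text is the string "&#10;", so the
# character reference is only recognized at the point of use.
via_escaped = attr_value("&#38;#10;")

print(repr(via_literal), repr(via_escaped))
```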
Ain't this fun? ;)
Arjun
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1