parsing entity values

Richard Tobin richard at
Mon Jan 25 16:15:41 GMT 1999

> ><!ENTITY % ap "&#38;#39;" > ( 38 = "&" , 39 = "'" )
> ><!ENTITY msg "he said %ap;hi!%ap;" >

> Right. The replacement text for ap is
>     &#39;


> With msg, the parameter entity is included as part of the replacement text
> and so the replacement text of msg is
>     he said &#39;hi!&#39;


See the table in section 4.4.  We have a parameter entity reference in an
entity value, so it is "included in literal".  4.4.5 says "[the parameter
entity's] replacement text is processed in place of the reference itself
as though it were part of the document at the location the reference was
recognised, except that a single or double quote character [...] will not
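terminate the literal".  So the &#39; is processed as if it had occurred
directly in the definition of msg: the character reference is expanded
there, and the replacement text of msg is
    he said 'hi!'

You can't see the difference in this case, because both replacement texts
produce the same characters once &msg; is referenced, but if we had: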

<!ENTITY % less "&#38;#60;">
<!ENTITY % more "&#38;#62;">
<!ENTITY elt "%less;=%more;">

the replacement text of elt would be

    <=>

and should be detected as a syntax error if &elt; occurred in the
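
To make the rule concrete, here is a minimal Python sketch of how the
replacement text of an entity value could be computed under the reading of
4.4.5 given above.  It is only an illustration, not taken from any real
parser: the function name replacement_text and the dictionary pes are made up
for this example, general entity references are simply left alone, and there
is no check for undefined or recursive parameter entities.

```python
import re

# A decimal or hex character reference, or a parameter-entity reference.
_REF = re.compile(r"&#([0-9]+);|&#x([0-9A-Fa-f]+);|%([A-Za-z_:][\w.:-]*);")


def replacement_text(literal, param_entities):
    """Return the replacement text for the entity value `literal`.

    `param_entities` maps parameter entity names to their replacement text.
    Hypothetical helper: an undefined name raises KeyError, and a
    self-referencing parameter entity would recurse without end.
    """
    def expand(match):
        dec, hexa, pe_name = match.groups()
        if dec is not None:
            return chr(int(dec))
        if hexa is not None:
            return chr(int(hexa, 16))
        # "Included in literal": the parameter entity's replacement text is
        # processed as though it had been written here, so character
        # references inside it are expanded as well.
        return replacement_text(param_entities[pe_name], param_entities)

    # re.sub does not rescan the text it substitutes, so "&#38;#39;" becomes
    # "&#39;" and stops there, as it should for a single declaration.
    return _REF.sub(expand, literal)


pes = {}
pes["ap"] = replacement_text("&#38;#39;", pes)
pes["less"] = replacement_text("&#38;#60;", pes)
pes["more"] = replacement_text("&#38;#62;", pes)

msg = replacement_text("he said %ap;hi!%ap;", pes)
elt = replacement_text("%less;=%more;", pes)

print(pes["ap"])  # &#39;
print(msg)        # he said 'hi!'
print(elt)        # <=>
```

The declarations are processed in document order, so each parameter entity
already has its replacement text by the time a later entity value refers to
it.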

Phil suggests that having to keep track of where the quotes are
special makes the parsing quite difficult; I don't think this is true,
though perhaps it depends on how your parser works.  Mine just checks
to see whether it read the quote character from the same entity that it
read the opening quote from.
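
The check described above can be sketched roughly as follows.  This is not
the code of my parser, just a small Python illustration of the idea, and the
names chars_with_source and read_literal are invented for it.  Every
character is paired with a marker for the entity it was read from, and a
quote only closes the literal if its marker is the one the opening quote had.

```python
def chars_with_source(text, param_entities, source=None):
    """Yield (char, source) pairs, expanding %name; references in place.

    Each inclusion of a parameter entity gets a fresh marker object, so
    characters can be compared by where they were read from.
    """
    if source is None:
        source = object()  # marker for the outermost text
    i = 0
    while i < len(text):
        if text[i] == "%":
            end = text.find(";", i)
            if end != -1 and text[i + 1:end] in param_entities:
                name = text[i + 1:end]
                # Read the entity's replacement text as a new source.
                yield from chars_with_source(
                    param_entities[name], param_entities, object()
                )
                i = end + 1
                continue
        yield text[i], source
        i += 1


def read_literal(stream):
    """Read one quoted literal from an iterator of (char, source) pairs."""
    quote, opened_in = next(stream)
    if quote not in ("'", '"'):
        raise ValueError("expected an opening quote")
    out = []
    for ch, src in stream:
        # Only a quote read from the same entity as the opening quote
        # terminates the literal.
        if ch == quote and src is opened_in:
            return "".join(out)
        out.append(ch)
    raise ValueError("unterminated literal")


# Hypothetical parameter entity whose replacement text is a double quote.
pes = {"q": '"'}
stream = chars_with_source('"he said %q;hi!%q;" trailing', pes)
value = read_literal(stream)
print(value)  # he said "hi!"
```

The quotes that come from %q; end up inside the literal, and only the final
quote in the outer text closes it.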

-- Richard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at