Confused about & in entity literal

John Cowan cowan at
Tue May 11 20:21:01 BST 1999

roddey at wrote:

> That's what got me confused in the first place. My original implementation was
> just to ignore ampersands in the original scan of the entity value.

Be sure to expand character references immediately, while scanning
the entity value, though; the ability to define "lt" and the like
depends on it.
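
A minimal sketch of that expansion step, in Python. The function name
expand_char_refs is hypothetical, and the sketch does not check that the
resulting code point is a legal XML character. It makes a single pass, so
the output is not rescanned, and general entity references are left alone:

```python
import re

# Matches &#ddd; (decimal) or &#xhhh; (hexadecimal) character references.
_CHAR_REF = re.compile(r"&#(?:x([0-9a-fA-F]+)|([0-9]+));")


def expand_char_refs(literal: str) -> str:
    """Replace each character reference with its character, in one pass.

    General entity references such as &foo; are left untouched.
    Assumption: no check that the code point is a legal XML Char.
    """
    def repl(match: re.Match) -> str:
        hex_digits, dec_digits = match.group(1), match.group(2)
        code = int(hex_digits, 16) if hex_digits else int(dec_digits)
        return chr(code)

    return _CHAR_REF.sub(repl, literal)
```

With the declaration <!ENTITY lt "&#38;#60;">, the literal "&#38;#60;"
becomes the replacement text "&#60;", which only turns into "<" when the
entity is later referenced.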

> But, if raw
> ampersands are not allowed, then I'm just asking whether I'm obligated to prove
> that any ampersand is at least provisionally part of a general entity (even if
> it later turns out not to be a legal one) during the scan of the entity value.

BNF rule 9 says that an EntityValue can't contain a stray & (or %)
except as the start of a well-formed Reference (or PEReference), so
a stray one is a well-formedness error (which you must catch), even
if the entity is never referenced.

All you actually have to do is ensure that the character after the &
(if it is not #, see above) is a name-start character, and that every
character up to the ; is a name character.  There is no need (and
in fact it is forbidden) to look up the supposed entity name anywhere.
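
A rough sketch of that check, in Python. The function name
check_entity_value is hypothetical, and NAME_START / NAME_CHARS are an
ASCII-only approximation of the real name character classes, which admit
many more Unicode characters. The check is purely syntactic and never
looks a name up:

```python
import string

# Assumption: ASCII-only stand-ins for the XML name character classes.
NAME_START = set(string.ascii_letters) | {"_", ":"}
NAME_CHARS = NAME_START | set(string.digits) | {".", "-"}


def check_entity_value(value: str) -> bool:
    """Return True if every & or % in value starts a well-formed reference.

    Only the syntax is checked; entity names are never looked up.
    """
    i = 0
    while i < len(value):
        ch = value[i]
        if ch not in "&%":
            i += 1
            continue
        end = value.find(";", i + 1)
        if end == -1:
            return False  # no terminating semicolon
        body = value[i + 1:end]
        if ch == "&" and body.startswith("#"):
            # Character reference: &#ddd; or &#xhhh;
            digits = body[1:]
            if digits.startswith("x"):
                ok = len(digits) > 1 and all(
                    c in string.hexdigits for c in digits[1:]
                )
            else:
                ok = digits != "" and all(c in string.digits for c in digits)
        else:
            # General or parameter entity reference: &Name; or %Name;
            ok = (
                body != ""
                and body[0] in NAME_START
                and all(c in NAME_CHARS for c in body)
            )
        if not ok:
            return False
        i = end + 1  # resume scanning after the semicolon
    return True
```

So "&undefined;" passes, since it is well-formed even though nothing by
that name need exist, while "AT&T" and "& amp;" are rejected.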

John Cowan		cowan at
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: and on CD-ROM/ISBN 981-02-3594-1