Confused about & in entity literal

John Cowan cowan at locke.ccil.org
Tue May 11 20:21:01 BST 1999


roddey at us.ibm.com wrote:

> That's what got me confused in the first place. My original implementation was
> just to ignore ampersands in the original scan of the entity value.

Be sure to expand character references greedily, though; being able
to declare "lt" and the like depends on it.
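
To illustrate why that matters, here is a minimal sketch in Python of
expanding character references in an entity literal while leaving general
entity references untouched. The names CHAR_REF and expand_char_refs are
made up for this example, and the sketch does not check that the resulting
code point is a legal XML character. The spec's own declaration of "lt"
uses the literal "&#38;#60;": a single pass turns &#38; into &, and the
result is not rescanned, so the stored replacement text is "&#60;".

```python
import re

# Matches a decimal (&#60;) or hexadecimal (&#x3C;) character reference.
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

def expand_char_refs(literal: str) -> str:
    """Expand character references once; general entity refs are left alone."""
    def repl(m: re.Match) -> str:
        hex_digits, dec_digits = m.group(1), m.group(2)
        if hex_digits is not None:
            return chr(int(hex_digits, 16))
        return chr(int(dec_digits))
    # re.sub scans left to right and never rescans its own output,
    # so "&#38;#60;" becomes "&#60;" rather than "<".
    return CHAR_REF.sub(repl, literal)

print(expand_char_refs("&#38;#60;"))
```

A reference such as &amp; passes through unchanged, since it is only
expanded later, when the entity itself is referenced.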

> But, if raw
> ampersands are not allowed, then I'm just asking whether I'm obligated to prove
> that any ampersand is at least provisionally part of a general entity (even if
> it later turns out not to be a legal one) during the scan of the entity value.

BNF rule 9 says that an EntityValue can't contain a stray & except
as part of a well-formed PEReference or Reference, so it's a
well-formedness error (which you must catch) to have one, even if the
entity is never referenced.

All you actually have to do is ensure that the character after the &
(if it is not #, see above) is a name-start character, and that every
character from there up to the ; is a name character.  There is no need
(and in fact it is forbidden) to look up the supposed entity name anywhere.
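
The check described above can be sketched in Python as follows. This is
only an illustration, not a conforming implementation: NAME is an
ASCII-only approximation of the XML Name production (the real one admits
many non-ASCII letters, combining characters, and extenders), and the
names REFERENCE and check_entity_value are invented for the example. The
% case is included because rule 9 excludes a bare % in the same way.

```python
import re

# ASCII-only approximation of the XML 1.0 Name production (an assumption
# made to keep the example short).
NAME = r"[A-Za-z_:][A-Za-z0-9._:-]*"

REFERENCE = re.compile(
    r"&#[0-9]+;"             # decimal character reference
    r"|&#x[0-9A-Fa-f]+;"     # hexadecimal character reference
    rf"|&{NAME};"            # general entity reference, syntax only
    rf"|%{NAME};"            # parameter entity reference, syntax only
)

def check_entity_value(value: str) -> bool:
    """Return True if every & or % in value begins a well-formed reference.

    The entity name is never looked up; only its syntax is checked.
    """
    pos = 0
    while pos < len(value):
        if value[pos] in "&%":
            m = REFERENCE.match(value, pos)
            if m is None:
                return False      # stray & or %: well-formedness error
            pos = m.end()
        else:
            pos += 1
    return True

print(check_entity_value("fish &amp; chips"))   # well-formed reference
print(check_entity_value("fish & chips"))       # stray ampersand
```

Note that "&undefined;" passes this check even though no such entity
has been declared; whether the name resolves is a separate question that
only arises when the entity is actually referenced.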

-- 
John Cowan	http://www.ccil.org/~cowan		cowan at ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)
