ANNOUNCE: Apache XML encoding detector update

John Cowan cowan at locke.ccil.org
Mon Apr 26 17:07:25 BST 1999


Matthew Sergeant (EML) wrote:

> Also, a note to John Cowan: In reading through your detector I noticed that
> it only checks for g=["']...["'], not the full encoding=["']...["'] -
> Personally I don't think that's safe, but having read your C code (and
> knowing C's poor handling of strings) I can understand why you did it... <g>

Actually, it *is* safe.  Once we see "<?xml ", where the last character
can be any whitespace character, we know beyond doubt that this is
an XML declaration/text declaration.  The first "g" appearing in
the declaration is necessarily the last letter of "encoding", unless
XML is extended to a version number that includes the letter "g",
which is most unlikely.  The only other way that a "g" can appear is
within the name of the charset itself.
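
A minimal sketch of that scan in C follows. It is not the code from the
detector itself: the function name find_encoding, its signature, and the
helper is_xml_space are hypothetical, and it assumes the declaration has
already been read into a NUL-terminated, ASCII-compatible buffer (the real
detector also has to cope with other byte layouts).

```c
#include <stddef.h>
#include <string.h>

/* Whitespace as permitted by XML 1.0: space, tab, CR, LF. */
static int is_xml_space(char c)
{
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/*
 * Hypothetical helper: copy the encoding name from an XML or text
 * declaration into out.  Returns 1 on success, 0 if the input is not
 * a declaration, has no encoding, or the name does not fit in out.
 */
int find_encoding(const char *decl, char *out, size_t outlen)
{
    const char *p, *end;
    char quote;
    size_t n;

    if (outlen == 0)
        return 0;

    /* Must begin with "<?xml" followed by one whitespace character. */
    if (strncmp(decl, "<?xml", 5) != 0 || !is_xml_space(decl[5]))
        return 0;

    /* Confine the search to the declaration itself. */
    end = strstr(decl, "?>");
    if (end == NULL)
        return 0;

    /* The first 'g' can only be the last letter of "encoding". */
    p = memchr(decl + 6, 'g', (size_t)(end - (decl + 6)));
    if (p == NULL)
        return 0;
    p++;

    /* Expect optional whitespace, '=', optional whitespace, a quote. */
    while (p < end && is_xml_space(*p))
        p++;
    if (p >= end || *p != '=')
        return 0;
    p++;
    while (p < end && is_xml_space(*p))
        p++;
    if (p >= end || (*p != '"' && *p != '\''))
        return 0;
    quote = *p++;

    /* Copy up to the matching quote; a 'g' in the name is harmless now. */
    for (n = 0; p < end && *p != quote; p++) {
        if (n + 1 >= outlen)
            return 0;
        out[n++] = *p;
    }
    if (p >= end)
        return 0;               /* unterminated literal */
    out[n] = '\0';
    return 1;
}
```

Called on a declaration that carries an encoding, find_encoding fills out
with the charset name and returns 1; a declaration with only version and
standalone contains no "g" at all, so it returns 0.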

-- 
John Cowan	http://www.ccil.org/~cowan		cowan at ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



