UTF-16 in external entities

Steve Schafer pandeng at telepath.com
Tue Feb 1 03:56:14 GMT 2000

I have a question regarding what appears to be an inconsistency
between the original XML 1.0 spec and one of the errata. Erratum E44
contains the following:

"Add the following to the second paragraph after the list (this also
takes care of the previous erratum on UTF-7): 'Note: Since external
parsed entities in UTF-16 may begin with any character...'"

The second paragraph of section 4.3.3, however, says:

"Entities encoded in UTF-16 must begin with the Byte Order Mark..."

I'm inclined to believe the original spec. If an external parsed
entity encoded in UTF-16 does in fact begin with the Byte Order Mark,
then I can't see any way in which the autodetection algorithm would
fail (barring the pathological case of an 8-bit-encoded entity that
happens to begin with the bytes FE FF or FF FE, of course).
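
To make the point concrete, here is a minimal sketch in Python of the
Byte Order Mark check I have in mind. The function name detect_utf16_bom
is my own invention, not anything from the spec, and the sketch only
looks at the two UTF-16 marks; it ignores every other case that a real
autodetection routine would need to handle.

```python
from typing import Optional


def detect_utf16_bom(data: bytes) -> Optional[str]:
    """Guess the UTF-16 byte order from a leading Byte Order Mark.

    Hypothetical helper for illustration only: returns "utf-16-be" or
    "utf-16-le" when the first two bytes are a BOM, otherwise None.
    """
    prefix = data[:2]
    if prefix == b"\xfe\xff":   # U+FEFF serialized big-endian
        return "utf-16-be"
    if prefix == b"\xff\xfe":   # U+FEFF serialized little-endian
        return "utf-16-le"
    return None                 # no UTF-16 BOM present


# An entity that begins with the BOM is identified unambiguously.
with_bom = "\ufeff<a/>".encode("utf-16-be")
print(detect_utf16_bom(with_bom))

# The pathological case: an 8-bit (here Latin-1) entity whose first
# two bytes happen to be FE FF is misread as big-endian UTF-16.
pathological = "\u00fe\u00ff".encode("latin-1")
print(detect_utf16_bom(pathological))
```

With the BOM required, the check cannot miss a UTF-16 entity; the only
way it goes wrong is the false positive shown in the second call.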

-Steve Schafer
