ANNOUNCE: xml encoding detector in C

Paul Langer Paul.Langer at
Fri Apr 23 12:09:43 BST 1999

At Friday, April 23, 1999 12:11 AM John Cowan wrote:

> I have written an XML encoding detector function in C.
> [snip]
> I believe it handles all the cases in Appendix F correctly,
> including the EBCDIC one.

One remark on the EBCDIC handling:

Your program returns "EBCDIC-CP-US" if it detects EBCDIC
without an explicit encoding declaration (see comment:
 /* better than nothing */).

I do not think that this behaviour is "better than nothing".
The XML spec says "Parsed entities which are stored in an encoding
other than UTF-8 or UTF-16 must begin with a text declaration containing
an encoding declaration" (Section 4.3.3, Character Encoding in Entities).

And if you want to define a default, what makes "EBCDIC-CP-US"
more desirable than e.g. "ebcdic-cp-is"?
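
To make the suggestion concrete, here is a minimal sketch of the stricter
behaviour: report an error when the EBCDIC signature is seen but no
encoding declaration was found, instead of falling back to a default name.
It is not taken from John's detector. The function name check_ebcdic, the
detect_status codes, and the "declared" parameter are all hypothetical and
only illustrate the idea. The four signature bytes are the EBCDIC form of
"<?xm" listed in Appendix F of the spec.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result codes; not taken from the announced detector. */
enum detect_status {
    DETECT_OK,           /* EBCDIC family and an encoding was declared */
    DETECT_NOT_EBCDIC,   /* leading bytes are not the EBCDIC signature */
    DETECT_ERR_NO_DECL   /* EBCDIC family, but no encoding declaration */
};

/* "<?xm" in EBCDIC, as listed in Appendix F of the XML 1.0 spec. */
static const unsigned char EBCDIC_SIG[4] = { 0x4C, 0x6F, 0xA7, 0x94 };

/*
 * buf, len:  the first bytes of the entity
 * declared:  encoding name read from the declaration, or NULL if none
 *            (assumed to have been extracted by the caller)
 * out:       receives the declared name on DETECT_OK, NULL otherwise
 */
enum detect_status check_ebcdic(const unsigned char *buf, size_t len,
                                const char *declared, const char **out)
{
    assert(out != NULL);
    *out = NULL;

    if (len < sizeof EBCDIC_SIG ||
        memcmp(buf, EBCDIC_SIG, sizeof EBCDIC_SIG) != 0)
        return DETECT_NOT_EBCDIC;

    /* No default here: the signature only identifies the EBCDIC family,
       not which code page is in use. */
    if (declared == NULL || declared[0] == '\0')
        return DETECT_ERR_NO_DECL;

    *out = declared;
    return DETECT_OK;
}
```

A caller would treat DETECT_ERR_NO_DECL as an error in the entity rather
than guessing a code page.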

All the best,

Paul Langer                           PL at
Software AG                           Tel. +49-6151-92-1912
Uhlandstr. 12                         Fax  +49-6151-92-1613
D-64297 Darmstadt 
