Characters having an ASCII value > 127

Richard Tobin richard at cogsci.ed.ac.uk
Fri Sep 18 14:45:40 BST 1998


> I guess, to correctly interpret and display those characters I have to
> know the character set which was used to encode the original text file.
> How can I communicate this character set to an XML parser?

You can do this by putting an encoding declaration in the XML
declaration at the start of the file.  For example, if the document
is in ISO Latin 1, officially named ISO-8859-1, you can use

 <?xml version="1.0" encoding="ISO-8859-1"?>

Without an encoding declaration (or a charset parameter on the MIME type
if the document comes from an HTTP server) a conforming parser will treat
it as UTF-8 (or as UTF-16 if it starts with a byte order mark), and any
byte above 127 will be misinterpreted or rejected as malformed.
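
To make the difference concrete, here is a minimal sketch in Python using
the standard library's xml.etree.ElementTree. The helper name parse_text
and the sample documents are my own, purely for illustration; other
parsers may report the failure differently.

```python
import xml.etree.ElementTree as ET

LATIN1_DECL = b'<?xml version="1.0" encoding="ISO-8859-1"?>\n'
NO_ENCODING = b'<?xml version="1.0"?>\n'


def parse_text(prolog: bytes, body: bytes):
    """Return the root element's text, or None if the parser rejects the input.

    Hypothetical helper for this example only.
    """
    try:
        return ET.fromstring(prolog + body).text
    except ET.ParseError:
        return None


# "café" encoded as ISO-8859-1: the é is the single byte 0xE9.
cafe = "<w>caf\u00e9</w>".encode("iso-8859-1")

# With the declaration, byte 0xE9 is decoded as Latin 1.
cafe_declared = parse_text(LATIN1_DECL, cafe)

# Without it the parser assumes UTF-8, and 0xE9 followed by "<" is not
# a valid UTF-8 sequence, so the document is rejected.
cafe_undeclared = parse_text(NO_ENCODING, cafe)

# The Latin 1 bytes 0xC3 0xA9 (two characters in Latin 1) happen to form
# one valid UTF-8 sequence, so here the damage is silent: the text is
# read without error, but as a different string.
pair = b"<w>\xc3\xa9</w>"
pair_declared = parse_text(LATIN1_DECL, pair)
pair_undeclared = parse_text(NO_ENCODING, pair)
```

The second case is the more dangerous one in practice, since nothing
signals that the text has been read in the wrong character set.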

Of course, any particular parser may not support the character set you
happen to be using; the only encodings every conforming parser must
accept are UTF-8 and UTF-16.
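
One way to find out is simply to try it. The following is a rough sketch,
again in Python with xml.etree.ElementTree; the function name
parser_accepts and the made-up encoding name "x-no-such-charset" are
assumptions for the example, not part of any library.

```python
import xml.etree.ElementTree as ET


def parser_accepts(encoding_name: str) -> bool:
    """Report whether this parser accepts a document declaring the encoding.

    Hypothetical helper. The probe document is pure ASCII, so only the
    declared encoding name is being tested, not the byte decoding itself
    (an ASCII-incompatible encoding such as EBCDIC would need a probe
    that is actually encoded that way).
    """
    probe = (
        f'<?xml version="1.0" encoding="{encoding_name}"?>\n<probe/>'
    ).encode("ascii")
    try:
        ET.fromstring(probe)
        return True
    except Exception:
        # The exact exception type depends on the parser and on how the
        # encoding fails, so this sketch deliberately catches broadly.
        return False


latin1_ok = parser_accepts("ISO-8859-1")
bogus_ok = parser_accepts("x-no-such-charset")
```

A check like this only tells you about the one parser you ran it
against; a different parser may give a different answer for the same
encoding name.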

-- Richard
