encoding incompatibilities between XT and XP in FOP

James Tauber jtauber at jtauber.com
Fri Jul 30 03:16:43 BST 1999


----- Original Message -----
From: Chris Maden <crism at oreilly.com>
> [James Tauber]
> > I stepped through the code and it appears XP is treating it as
> > big-endian UTF-16. By the time XP is reading off its buffer of
> > bytes, they are 0x00 0xE2 0x20 0xAC 0x00 0xA2.
>
> What is XP using to read the file?  You mentioned you were using
> Microsoft's Java implementation; I suspect that the problem is there.
> The conversion of 0x80 to 0x20AC makes me very suspicious, because the
> use of 0x80 for Euro is a relatively recent Windows codepage change,
> so I'm inclined to suspect the Microsoft Java implementation.

I did too until I tried it with Sun's JDK and it behaved identically.

In com.jclark.xml.sax.Driver, there is a method openInputSource (returning an OpenEntity) that begins:

private OpenEntity openInputSource(org.xml.sax.InputSource inputSource)
    throws IOException {
    Reader reader = inputSource.getCharacterStream();
    String encoding;
    InputStream in;
    if (reader != null) {
      in = new ReaderInputStream(reader);
      encoding = "UTF-16";
    }
    else {
      in = inputSource.getByteStream();
      encoding = inputSource.getEncoding();
    }
    ...

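The encoding gets set there and never changes, so whenever the InputSource supplies a character stream, XP reads the resulting bytes as UTF-16 regardless of how the original file was encoded.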
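The byte sequence quoted above can be reproduced with a small sketch. It assumes, and this is only an assumption not established in this thread, that the original file contained the bytes 0xE2 0x80 0xA2 and that the caller's Reader decoded them with the Windows-1252 codepage before the characters were re-encoded as big-endian UTF-16. The class and method names are hypothetical and are not part of XP or XT.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingRoundTrip {

    // Decode raw bytes with an assumed charset (as a Reader would),
    // then re-encode the resulting characters as big-endian UTF-16.
    static byte[] misdecodeThenUtf16(byte[] raw, String assumedCharset) {
        String text = new String(raw, Charset.forName(assumedCharset));
        return text.getBytes(StandardCharsets.UTF_16BE);
    }

    // Render bytes in the same "0x00 0xE2 ..." style used in the message.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < bytes.length; i++) {
            if (i > 0) {
                sb.append(' ');
            }
            sb.append(String.format("0x%02X", bytes[i] & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed original file bytes (hypothetical input for illustration).
        byte[] raw = {(byte) 0xE2, (byte) 0x80, (byte) 0xA2};

        // Decoded as Windows-1252: 0x80 becomes U+20AC (Euro sign).
        System.out.println(hex(misdecodeThenUtf16(raw, "windows-1252")));

        // Decoded as UTF-8: the three bytes form a single character.
        System.out.println(hex(misdecodeThenUtf16(raw, "UTF-8")));
    }
}
```

Under that assumption the first line printed is 0x00 0xE2 0x20 0xAC 0x00 0xA2, matching what XP sees in its buffer, while decoding the same bytes as UTF-8 yields a single UTF-16 code unit, 0x20 0x22. That would be consistent with the 0x80 to 0x20AC conversion happening in the Reader before XP ever sees the data.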

James Tauber


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




