encoding problem fixed
david-b at pacbell.net
Tue Aug 3 17:38:28 BST 1999
Elliotte Rusty Harold wrote:
> It's possible to start with an InputStream to read the XML declaration,
> then chain that InputStream to an InputStreamReader once the encoding is
> known and never use the InputStream directly again. Since the XML
> declaration is ASCII (possibly aside from a byte order mark) this isn't
> all that difficult to implement.
Make that a "PushbackInputStream" ... remember that XML (and text)
declarations are optional, and if you're not tying into the parser's
prolog logic, you'll need to feed it a stream of characters.
Consider a document starting "<tag>" where instead of "t" you've got
a multibyte UTF-8 encoded name. Without some pushback there, you're
likely in severe trouble unless you handle the UTF-8 directly ... that
"ASCII only" assumption can fail _very_ quickly.
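A minimal sketch of the pushback approach, under some loud assumptions: the
class name EncodingSniffer, the 128-byte peek limit, and the UTF-8 default
are all mine, and only ASCII-compatible encodings are handled (no UTF-16 or
EBCDIC detection). The point is just that every byte that was peeked at gets
pushed back, so a document with no declaration and a multibyte first name
still reaches the reader intact.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.PushbackInputStream;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of any parser: peek at the head of the
// stream, pick an encoding, push the bytes back, then hand out a Reader.
class EncodingSniffer {
    // Assumed upper bound on how far to look for the XML declaration.
    private static final int PEEK = 128;
    private static final Pattern ENC = Pattern.compile(
        "encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    static Reader open(InputStream in) throws IOException {
        PushbackInputStream pb = new PushbackInputStream(in, PEEK);
        byte[] buf = new byte[PEEK];
        int n = 0;
        // Read up to PEEK bytes; a single read() may return fewer.
        while (n < PEEK) {
            int r = pb.read(buf, n, PEEK - n);
            if (r < 0) break;
            n += r;
        }
        // Skip a UTF-8 byte order mark if present (it is not pushed back).
        int skip = (n >= 3 && (buf[0] & 0xFF) == 0xEF
                && (buf[1] & 0xFF) == 0xBB && (buf[2] & 0xFF) == 0xBF) ? 3 : 0;
        // ISO-8859-1 maps each byte to one char, so this never fails
        // on multibyte data; it is only used to inspect the ASCII part.
        String head = new String(buf, skip, n - skip, StandardCharsets.ISO_8859_1);
        String encoding = "UTF-8"; // assumed default when nothing is declared
        boolean hasDecl = head.startsWith("<?xml") && head.length() > 5
                && Character.isWhitespace(head.charAt(5));
        if (hasDecl) {
            int end = head.indexOf("?>");
            if (end > 0) {
                Matcher m = ENC.matcher(head.substring(0, end));
                if (m.find()) encoding = m.group(1);
            }
        }
        // Push back everything after the BOM, declaration included, so the
        // consumer sees the document from its first character.
        if (n > skip) pb.unread(buf, skip, n - skip);
        return new InputStreamReader(pb, encoding);
    }
}
```

After this the original InputStream is never touched directly again; the
parser (or whatever consumes characters) reads only from the returned Reader.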
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)