Feeler for SML (Simple Markup Language)
david at megginson.com
Sat Nov 13 11:55:08 GMT 1999
"Don Park" <donpark at docuverse.com> writes:
> Right. This means that SML is not a good choice for 'documents' nor
> encoding data with lots of foreign characters.
Like, say, a database with the names of subscribers to a Chinese
e-mag, or a collection of information about Arabic movies.
Right now, it happens that a few large English-speaking former British
colonies (U.S., Canada, Australia, New Zealand) and Western Europe
make up a majority of the computer-using world, but since we make up a
small minority of the world in general, I expect that to change
rapidly. Data from Turkey, Saudi Arabia, Japan, or Korea can have
exactly the same problem as a document.
Actually, in the end I found that supporting UTF-16/UCS-2 as well as
UTF-8 wasn't that hard in AElfred (again, just a few lines of code,
and I didn't use the Java 1.1 library stuff). The hard part about
Unicode is that there's such a wide range of characters allowed and
not allowed in each context, and XML 1.0 *requires* parsers to report
all of those errors. That's a problem with UTF-8 as well as UTF-16.
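To make the two halves of that concrete, here is a minimal sketch in Java. It is not AElfred's actual code: the class and method names (XmlCharSketch, isXmlChar, decodeUtf16BE) are made up for illustration, and it assumes big-endian input with no byte-order mark. The decoding really is only a few lines; the checking shown here covers only the Char production from XML 1.0, and the rules for name characters are a much longer set of ranges.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not taken from AElfred.
final class XmlCharSketch {

    /** True if cp matches the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /**
     * Decodes big-endian UTF-16 bytes into code points, combining
     * surrogate pairs and rejecting anything that is not an XML Char.
     * Assumes the caller has already stripped any byte-order mark.
     */
    static List<Integer> decodeUtf16BE(byte[] in) {
        if (in.length % 2 != 0) {
            throw new IllegalArgumentException("odd number of bytes");
        }
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < in.length; i += 2) {
            int cp = ((in[i] & 0xFF) << 8) | (in[i + 1] & 0xFF);
            if (cp >= 0xD800 && cp <= 0xDBFF) {
                // High surrogate: a low surrogate must follow.
                if (i + 3 >= in.length) {
                    throw new IllegalArgumentException("truncated surrogate pair");
                }
                int low = ((in[i + 2] & 0xFF) << 8) | (in[i + 3] & 0xFF);
                if (low < 0xDC00 || low > 0xDFFF) {
                    throw new IllegalArgumentException("unpaired high surrogate");
                }
                cp = 0x10000 + ((cp - 0xD800) << 10) + (low - 0xDC00);
                i += 2; // skip the low surrogate we just consumed
            }
            // An unpaired low surrogate falls through and fails this check.
            if (!isXmlChar(cp)) {
                throw new IllegalArgumentException(
                    "not an XML 1.0 Char: U+" + Integer.toHexString(cp).toUpperCase());
            }
            out.add(cp);
        }
        return out;
    }
}
```

A caller would pass the raw bytes of an entity, e.g. XmlCharSketch.decodeUtf16BE(bytes), and treat the exception as a fatal error. A UTF-8 decoder needs exactly the same isXmlChar check after it has assembled each code point, which is why the validation cost does not depend on the encoding.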
All the best,
David Megginson david at megginson.com
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)