UTF-8 space efficiency

Kragen Sitaker kragen at pobox.com
Sat Nov 13 21:14:17 GMT 1999


Steve Schafer wrote:
> On Sat, 13 Nov 1999 12:05:17 -0500 (EST), you wrote:
> 
> >And it's "filesystem-safe", which means you can use it in filenames
> >without modifying the filesystem to be Unicode-aware.
> 
> That's not necessarily true. UTF-8 uses octets having values in the
> range of 0x00 through 0xF4. Those greater than 0x7F can cause problems
> with some file systems.

I believe you.  Do you know which filesystems they are?  Unix in
general is pretty charset-agnostic, except for reserving 0x00 and 0x2F
as pathname terminator and pathname separator.
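
A quick way to check the claim, as a rough sketch rather than anything
authoritative: in UTF-8 every byte of a multi-byte sequence is 0x80 or
above, so the two bytes Unix reserves can only come from a literal NUL or
"/" character.  The function names and sample filenames below are made up
for illustration.

```python
# Sketch: UTF-8 never emits the bytes Unix reserves in pathnames
# (0x00 terminator, 0x2F separator) unless the text itself contains
# a NUL or a '/' character. All names here are hypothetical.

RESERVED = {0x00, 0x2F}


def reserved_bytes_in(name: str) -> set:
    """Return the reserved byte values found in the UTF-8 encoding of name."""
    return RESERVED & set(name.encode("utf-8"))


def multibyte_is_high(ch: str) -> bool:
    """True if every byte of the character's UTF-8 encoding is >= 0x80."""
    return all(b >= 0x80 for b in ch.encode("utf-8"))


if __name__ == "__main__":
    # Made-up filenames; ascii() keeps the output printable on any terminal.
    for sample in ["r\u00e9sum\u00e9.txt", "\u65e5\u672c\u8a9e.xml", "a/b"]:
        print(ascii(sample), sorted(reserved_bytes_in(sample)))
```

Only the last sample should report a reserved byte, and only because it
contains an actual "/".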

IIRC, this was the major reason the Bell Labs guys invented UTF-8 -- so
the Plan 9 filesystem could stay dumb about character sets.

-- 
<kragen at pobox.com>       Kragen Sitaker     <http://www.pobox.com/~kragen/>
The Internet stock bubble didn't burst on 1999-11-08.  Hurrah!
<URL:http://www.pobox.com/~kragen/bubble.html>


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




