Binary Data in XML

Tim Bray tbray at textuality.com
Wed Sep 30 18:09:41 BST 1998


Suppose I wrote up a NOTE, which should occupy less than one page, proposing
a reserved attribute xml:packed with, for the moment, only two
allowed values, "none" and "base64".  The default value is "none".
If an element has xml:packed="base64" this means that

(a) the content of the element to which this is attached must be
    pure #PCDATA, no child elements and no references, and
(b) the content is encoded in base64, with leading and trailing spaces allowed.
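
To make rules (a) and (b) concrete, here is a rough sketch of what a
consumer might do on top of an existing parser.  It is only an
illustration under stated assumptions: the function name unpack is made
up, Python's ElementTree is used purely as a convenient parser, and the
choice to reject non-base64 characters inside the content is mine, not
something the proposal spells out.

```python
import base64
import xml.etree.ElementTree as ET

# The xml: prefix is always bound to this namespace, so ElementTree
# reports xml:packed under this qualified name.
XML_NS = "http://www.w3.org/XML/1998/namespace"
PACKED_ATTR = "{" + XML_NS + "}packed"


def unpack(elem):
    """Return decoded bytes for xml:packed="base64", or None for "none"."""
    mode = elem.get(PACKED_ATTR, "none")  # the default value is "none"
    if mode == "none":
        return None
    if mode != "base64":
        raise ValueError("unknown xml:packed value: %r" % mode)
    # Rule (a): no child elements.  References cannot be checked at this
    # layer, because the parser has already expanded them.
    if len(elem) != 0:
        raise ValueError("packed content must be pure #PCDATA")
    # Rule (b): base64, leading and trailing spaces allowed.
    # Assumption: strip() also drops tabs and newlines at the ends, and
    # validate=True rejects anything outside the base64 alphabet.
    return base64.b64decode((elem.text or "").strip(), validate=True)


if __name__ == "__main__":
    doc = ET.fromstring('<icon xml:packed="base64">  aGVsbG8=  </icon>')
    print(unpack(doc))
```

An element without the attribute is treated the same as
xml:packed="none", which matches the proposed default.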

This obviously couldn't retroactively become part of XML 1.0, but
if it went through a process and became a W3C recommendation, I bet
every parser author in the world would support it in about 15 minutes.

Base64 (a 4-for-3 encoding) inflates the data by 33%, so I thought about
perhaps inventing Base128 (8-for-7) or maybe even a higher level to cut
down the wastage, but Base64 has the advantage that it sticks to ASCII
and so avoids UTF-8/ISO-8859 confusion, and I bet Mr. LZW will eat that
33% anyhow...
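
The arithmetic behind those numbers, plus a quick way to test the
compression bet, can be sketched as follows.  Assumptions: zlib's
DEFLATE stands in for LZW here (it is a different algorithm, just the
one that ships with Python), Base128 is hypothetical so only its ratio
is computed, and the helper names overhead and sizes are invented for
the example.

```python
import base64
import random
import zlib


def overhead(out_units, in_units):
    """Fractional size increase of an out_units-for-in_units encoding."""
    return out_units / in_units - 1


def sizes(raw):
    """Return (raw, base64, compressed base64) sizes in bytes."""
    encoded = base64.b64encode(raw)
    return len(raw), len(encoded), len(zlib.compress(encoded, 9))


if __name__ == "__main__":
    print("Base64 overhead:  %.1f%%" % (100 * overhead(4, 3)))
    print("Base128 overhead: %.1f%%" % (100 * overhead(8, 7)))
    # Pseudo-random bytes are a worst case: the raw data itself is
    # incompressible, so any saving comes from the base64 layer alone.
    rng = random.Random(0)
    raw = bytes(rng.getrandbits(8) for _ in range(30000))
    print("raw, base64, compressed base64:", sizes(raw))
```

Running it on real payloads rather than random bytes would give a
fairer picture of how much of the overhead compression recovers.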

I also thought about xml:encoding=, but that conflicts with
encoding= in the XML declaration in a confusing way.

Are there any gotchas I'm missing?  Don't know if I could persuade
one of the WGs to take it up, but it seems pretty obvious that there
is not only industry demand but in fact people doing this already, so
the case is pretty strong I think. -Tim

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



