Representing Large Tabular Data Blocks

Joshua E. Smith jesmith at kaon.com
Fri Nov 19 18:16:49 GMT 1999


>The natural XML solution would be of course to embed each
>data value within an element or element attribute.  Such as:
><Point XYZ="324.1241 121.1214 -12.4521" NORMAL="0.0 0.0 1.0"/>
>However, when you replicate this point element a hundred 
>thousand times or so, you get an enormous increase in file
>size.  Thus raising the question of XML efficiency.  
>
>One possible solution is to compress the resulting files.
>However, this is a very undesirable option in my case for
>a variety of reasons.  Primarily compatibility and dependency
>issues.

Zlib can be statically linked into your code and is quite portable, so I'd
definitely recommend taking another look at compression.  All the fat of
XML just disappears when you zip it.  (However, the receiver will still
spend a lot of time parsing all those tags, so if performance is a goal,
forget this approach for large data sets.)
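
To get a rough feel for how much of the markup overhead compression
recovers, here is a minimal sketch using Python's zlib module.  The
<Points> wrapper, the build_points_xml helper, and the random coordinates
are all made up for illustration; real data will compress differently,
so treat the printed ratio as indicative only.

```python
import random
import zlib


def build_points_xml(count, seed=0):
    """Build a synthetic document of <Point/> elements (illustrative data only)."""
    rng = random.Random(seed)
    lines = ["<Points>"]  # hypothetical wrapper element, not from the original post
    for _ in range(count):
        x, y, z = (rng.uniform(-500.0, 500.0) for _ in range(3))
        lines.append(
            f'<Point XYZ="{x:.4f} {y:.4f} {z:.4f}" NORMAL="0.0 0.0 1.0"/>'
        )
    lines.append("</Points>")
    return "\n".join(lines).encode("ascii")


raw = build_points_xml(100_000)
packed = zlib.compress(raw)

# The round trip must give back exactly the bytes we started with.
assert zlib.decompress(packed) == raw

print(f"raw: {len(raw)} bytes, compressed: {len(packed)} bytes")
print(f"ratio: {len(packed) / len(raw):.2f}")
```

Note that this only measures file size.  The receiver still has to
decompress and then parse every tag, which is the performance cost
mentioned above.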

If you are really concerned with file size, it isn't clear why you would be
using ASCII to represent numbers.  How about encoding them as big-endian
IEEE single-precision floating point, and then base-64 encoding the whole
mess?  That works out to about 5.3 bytes per datum (4 bytes per float,
times 4/3 for base-64), which is less than your examples are showing.
(Of course, you could use fixed point or some other representation for the
numbers to reduce their bits further.)
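
As a minimal sketch of that encoding, assuming Python's struct and base64
modules (the encode_floats and decode_floats names are hypothetical
helpers, and how the resulting string gets embedded in the XML is left
open):

```python
import base64
import struct


def encode_floats(values):
    """Pack values as big-endian IEEE single precision, then base-64 encode."""
    packed = struct.pack(f">{len(values)}f", *values)
    return base64.b64encode(packed).decode("ascii")


def decode_floats(text):
    """Reverse encode_floats: base-64 decode, then unpack 4-byte floats."""
    packed = base64.b64decode(text)
    return list(struct.unpack(f">{len(packed) // 4}f", packed))


# The six numbers from the quoted <Point/> example: XYZ followed by NORMAL.
point = [324.1241, 121.1214, -12.4521, 0.0, 0.0, 1.0]

encoded = encode_floats(point)
decoded = decode_floats(encoded)

# 6 floats * 4 bytes = 24 bytes, and base-64 turns every 3 bytes into 4
# characters, so this comes to 32 characters, or 32 / 6 = 5.33 per datum.
print(encoded)
print(len(encoded) / len(point))
```

The decoded values match the originals only to single precision (roughly
seven significant digits), so this trades a little accuracy for size.
When the byte count is not a multiple of three, base-64 padding adds up
to two extra characters per block.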


-Joshua Smith


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




