<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META content="text/html; charset=ISO-8859-1" http-equiv=Content-Type>
<META content="MSHTML 5.00.2314.1000" name=GENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=#ffffff>
<DIV><FONT size=2>I think if we remember how this thread originated, it would
be useful to reach a general consensus that the use of attributes depends on what
you are doing.</FONT></DIV>
<DIV> </DIV>
<DIV><FONT size=2>We started discussing XML RPC and wire technologies using XML,
such as SOAP. These do not implement *any* kind of compression, so the use of
attributes is advantageous for (more) efficient communication.</FONT></DIV>
<DIV><FONT size=2>Where compression is used, it seems (from what I have seen,
anyway) that the choice between attributes and child elements matters much less.</FONT></DIV>
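<DIV><FONT size=2>As a rough illustration of the uncompressed case, the following
Python sketch serializes one record both ways and counts the bytes that would go
over the wire. The record, its field names, and the helper names attribute_form
and child_form are assumptions made for the example; they are not taken from
XML-RPC or SOAP.</FONT></DIV>
<PRE>
```python
import xml.etree.ElementTree as ET


def attribute_form(word, fields):
    """Serialize one record with the extra fields carried as attributes."""
    element = ET.Element("word", fields)
    element.text = word
    return ET.tostring(element)


def child_form(word, fields):
    """Serialize the same record with each extra field as a child element."""
    element = ET.Element("word")
    element.text = word
    for name, value in fields.items():
        ET.SubElement(element, name).text = value
    return ET.tostring(element)


# Hypothetical sample record; the field names and values are illustrative only.
fields = {"time": "938020560", "twenty": "7", "twentystr": "eight"}
attr_bytes = attribute_form("aardvark", fields)
child_bytes = child_form("aardvark", fields)

print(len(attr_bytes), "bytes with attributes")
print(len(child_bytes), "bytes with child elements")
```
</PRE>
<DIV><FONT size=2>Each child element repeats its name in an opening and a closing
tag, which is where the extra uncompressed bytes come from.</FONT></DIV>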
<DIV> </DIV>
<DIV><FONT size=2>Rgds,</FONT></DIV>
<DIV><FONT size=2>Steven</FONT></DIV>
<DIV> </DIV>
<DIV>Steven Livingstone<BR>Glasgow, Scotland.<BR>+44 7771 957 280</DIV>
<DIV> </DIV>
<DIV>Professional XML<BR><A
href="http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861003110">http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861003110</A><BR>Professional
Site Server 3, Wrox Press<BR><A
href="http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696">http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696</A><BR>Professional
Site Server 3.0 Commerce Edition, Wrox Press<BR><A
href="http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505">http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505</A></DIV>
<BLOCKQUOTE
style="BORDER-LEFT: #000000 2px solid; MARGIN-LEFT: 5px; MARGIN-RIGHT: 0px; PADDING-LEFT: 5px; PADDING-RIGHT: 0px">
<DIV style="FONT: 10pt arial">----- Original Message ----- </DIV>
<DIV
style="BACKGROUND: #e4e4e4; FONT: 10pt arial; font-color: black"><B>From:</B>
<A href="mailto:andrewl@microsoft.com" title=andrewl@microsoft.com>Andrew
Layman</A> </DIV>
<DIV style="FONT: 10pt arial"><B>To:</B> <A href="mailto:xml-dev@ic.ac.uk"
title=xml-dev@ic.ac.uk>xml-dev@ic.ac.uk</A> </DIV>
<DIV style="FONT: 10pt arial"><B>Sent:</B> Wednesday, September 22, 1999 7:19
PM</DIV>
<DIV style="FONT: 10pt arial"><B>Subject:</B> RE: RFC: Attributes and
XML-RPC</DIV>
<DIV><BR></DIV>
<DIV><FONT size=2><FONT color=#0000ff><FONT face=Arial><SPAN
class=136121718-22091999>These results are consistent with tests that I have
run against actual XML files generated from databases.<SPAN
class=673591718-22091999> After compression, there is little difference
between different syntactic families.</SPAN></SPAN></FONT></FONT></FONT></DIV>
<BLOCKQUOTE style="MARGIN-RIGHT: 0px">
<DIV align=left class=OutlookMessageHeader dir=ltr><FONT face=Tahoma
size=2>-----Original Message-----<BR><B>From:</B> Mark Nutter [<A
href="mailto:mnutter@fore.com">mailto:mnutter@fore.com</A>]<BR><B>Sent:</B>
Wednesday, September 22, 1999 10:26 AM<BR><B>To:</B> <A
href="mailto:xml-dev@ic.ac.uk">xml-dev@ic.ac.uk</A><BR><B>Subject:</B> RE:
RFC: Attributes and XML-RPC<BR><BR></DIV></FONT>At 12:16 PM 09/22/99 -0400,
Hunter, David wrote:<BR>
<BLOCKQUOTE type="cite" cite>So even if you<BR>compress the files, the
attribute version will be able to compress to 50%<BR>smaller than the
other file. Again, 2KB isn't a lot, but if we're
talking<BR>megabytes in size, 50% is a lot.</BLOCKQUOTE><BR>I wrote a quick
Perl script to take /usr/dict/words and turn it into an XML file, with some
artificially generated "attributes". In the resulting file, named
attrib.xml, each &lt;word&gt; element carries the additional information as
attributes. I did the same thing to produce a file called child.xml,
except that the additional information is presented as a child element
instead of as an attribute. Here are the results:<BR><BR><TT>$
./make.pl<BR>$ ls -l<BR>total 13004<BR>-rw-rw-r-- 1
mnutter mnutter 5811852 Sep 22 13:16
attrib.xml<BR>-rw-rw-r-- 1 mnutter mnutter
7445892 Sep 22 13:16 child.xml<BR>-rwxr-xr-x 1 mnutter
mnutter 976 Sep 22 13:16 make.pl<BR>$
gzip attrib.xml<BR>$ gzip child.xml<BR>$ ls -l<BR>total
1127<BR>-rw-rw-r-- 1 mnutter mnutter
671039 Sep 22 13:16 attrib.xml.gz<BR>-rw-rw-r-- 1 mnutter
mnutter 472394 Sep 22 13:16
child.xml.gz<BR>-rwxr-xr-x 1 mnutter
mnutter 976 Sep 22 13:16
make.pl<BR><BR></TT>I used gzip as an example of off-the-shelf compression
technology. As you can see, even though the raw child.xml file is
larger, the compressed version is *smaller* than the corresponding
implementation with attributes.<BR><BR>This may not be true in all cases, of
course, but I expect it often will be, because gzip's DEFLATE algorithm replaces
repeated strings, such as recurring tag names, with short
back-references.<BR><BR>For your reference, here is the Perl script I used
to create the two files:<BR><BR>open WORDS, "&lt;/usr/dict/words" or die "Couldn't open dictionary.\n";<BR>
open ATTRIB, "&gt;attrib.xml" or die "Couldn't open attrib.xml\n";<BR>
open CHILD, "&gt;child.xml" or die "Couldn't open child.xml\n";<BR><BR>
@twenty_strings = qw(one two three four five six seven eight nine ten<BR>
&nbsp;&nbsp;&nbsp;&nbsp;eleven twelve thirteen fourteen fifteen sixteen<BR>
&nbsp;&nbsp;&nbsp;&nbsp;seventeen eighteen nineteen twenty);<BR><BR>
print ATTRIB "&lt;attrib&gt;\n";<BR>
print CHILD "&lt;child&gt;\n";<BR><BR>
while($word = &lt;WORDS&gt;)<BR>
{<BR>
&nbsp;&nbsp;$time = time();<BR>
&nbsp;&nbsp;$timestr = localtime($time);<BR>
&nbsp;&nbsp;$twenty = rand % 20;<BR>
&nbsp;&nbsp;$twentystr = $twenty_strings[$twenty];<BR>
&nbsp;&nbsp;print ATTRIB &lt;&lt;EOM;<BR>
&nbsp;&nbsp;&lt;word time="$time" timestr="$timestr" twenty="$twenty"<BR>
&nbsp;&nbsp;&nbsp;&nbsp;twentystr="$twentystr"&gt;$word&lt;/word&gt;<BR>
EOM<BR>
&nbsp;&nbsp;print CHILD &lt;&lt;EOM;<BR>
&nbsp;&nbsp;&lt;word&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;time&gt;$time&lt;/time&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;timestr&gt;$timestr&lt;/timestr&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;twenty&gt;$twenty&lt;/twenty&gt;<BR>
&nbsp;&nbsp;&nbsp;&nbsp;&lt;twentystr&gt;$twentystr&lt;/twentystr&gt;<BR>
&nbsp;&nbsp;&lt;/word&gt;<BR>
EOM<BR>
}<BR><BR>
print ATTRIB "&lt;/attrib&gt;\n";<BR>
print CHILD "&lt;/child&gt;\n";<BR><BR>
close CHILD;<BR>
close ATTRIB;<BR>
close WORDS;<BR><BR><BR>
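<DIV>The experiment above can be approximated with Python's standard library, as a
minimal sketch. It is not the script used for the figures in this message: it
assumes a uniformly random value from 0 to 19 for the twenty field, leaves out
timestr, and writes the word into both forms (the child.xml branch of the Perl
script as quoted does not print $word), so its numbers will differ from the
listing. The names build_documents and report, the seed, the starting timestamp,
and the sample word list are hypothetical.</DIV>
<PRE>
```python
import gzip
import random
import xml.etree.ElementTree as ET

NUMBER_WORDS = [
    "one", "two", "three", "four", "five", "six", "seven", "eight", "nine",
    "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen", "sixteen",
    "seventeen", "eighteen", "nineteen", "twenty",
]


def build_documents(words, seed=0):
    """Return (attribute_xml, child_xml) as bytes for the same records."""
    rng = random.Random(seed)
    attrib_root = ET.Element("attrib")
    child_root = ET.Element("child")
    for index, word in enumerate(words):
        twenty = rng.randrange(20)
        fields = {
            "time": str(938020560 + index),  # assumed timestamp values
            "twenty": str(twenty),
            "twentystr": NUMBER_WORDS[twenty],
        }
        # Attribute form: the extra fields ride on the word element itself.
        ET.SubElement(attrib_root, "word", fields).text = word
        # Child form: the same word, with one nested element per field.
        record = ET.SubElement(child_root, "word")
        record.text = word
        for name, value in fields.items():
            ET.SubElement(record, name).text = value
    return ET.tostring(attrib_root), ET.tostring(child_root)


def report(words):
    """Map each form to its (raw, gzip-compressed) size in bytes."""
    attrib_xml, child_xml = build_documents(words)
    return {
        "attrib": (len(attrib_xml), len(gzip.compress(attrib_xml))),
        "child": (len(child_xml), len(gzip.compress(child_xml))),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for /usr/dict/words.
    sample = ["word%d" % n for n in range(5000)]
    for name, (raw, packed) in report(sample).items():
        print(name, raw, packed)
```
</PRE>
<DIV>Calling report() with the lines of a real word list in place of the sample
shows how the raw and compressed sizes compare on a given data set.</DIV><BR>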
<DIV>-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-</DIV><BR>
<DIV>Mark Nutter, &lt;mnutter@fore.com&gt;</DIV>
<DIV>Internet Applications Developer</DIV>
<DIV>FORE Systems</DIV>
<DIV>Some people are atheists 'til the day they
die.</DIV></BLOCKQUOTE></BLOCKQUOTE></BODY></HTML>