RFC: Attributes and XML-RPC

Steven Livingstone ceo at citix.com
Sat Sep 25 11:10:37 BST 1999


I think if we remember how this thread originated, it would be useful to reach a general consensus that the use of attributes depends on what you are doing.

We started discussing XML RPC and wire technologies using XML, such as SOAP. These do not implement *any* kind of compression, so the use of attributes is advantageous for (more) efficient communication.
Where compression is used, it seems (from what I have seen, anyway) that the choice between attributes and child elements matters much less.
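
To make the uncompressed case concrete, here is a minimal sketch that counts the bytes of one record written both ways. The record and its field names (item, id, name) are made up for illustration and do not come from XML-RPC or SOAP; the point is only that closing tags add bytes that attributes avoid when nothing is compressed.

```shell
#!/bin/sh
# One hypothetical record in attribute form and in child-element form.
# The element and field names are illustrative assumptions.
attr_form='<item id="42" name="widget"/>'
child_form='<item><id>42</id><name>widget</name></item>'

# ${#var} is the string length; for pure ASCII this equals the byte count.
echo "attribute form:     ${#attr_form} bytes"
echo "child-element form: ${#child_form} bytes"
```

The gap grows with the number of fields, since each child element repeats its name in a closing tag.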

Rgds,
Steven

Steven Livingstone
Glasgow, Scotland.
+44 7771 957 280

Professional XML
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861003110
Professional Site Server 3, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696
Professional Site Server 3.0 Commerce Edition, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505
  ----- Original Message ----- 
  From: Andrew Layman 
  To: xml-dev at ic.ac.uk 
  Sent: Wednesday, September 22, 1999 7:19 PM
  Subject: RE: RFC: Attributes and XML-RPC


  These results are consistent with tests that I have run against actual XML files generated from databases.  After compression, there is little difference between different syntactic families.
    -----Original Message-----
    From: Mark Nutter [mailto:mnutter at fore.com]
    Sent: Wednesday, September 22, 1999 10:26 AM
    To: xml-dev at ic.ac.uk
    Subject: RE: RFC: Attributes and XML-RPC


    At 12:16 PM 09/22/99 -0400, Hunter, David wrote:

      So even if you
      compress the files, the attribute version will be able to compress to 50%
      smaller than the other file.  Again, 2KB isn't a lot, but if we're talking
      megabytes in size, 50% is a lot.

    I wrote a quick perl script to take /usr/dict/words and turn it into an XML file, with some artificially generated "attributes".  In the resulting file named attrib.xml, each <word> tag contains the additional information as attributes.  I did the same thing to produce a file called child.xml, except that the additional information is presented as a child element instead of as an attribute.  Here are the results:

    $ ./make.pl
    $ ls -l
    total 13004
    -rw-rw-r--   1 mnutter  mnutter   5811852 Sep 22 13:16 attrib.xml
    -rw-rw-r--   1 mnutter  mnutter   7445892 Sep 22 13:16 child.xml
    -rwxr-xr-x   1 mnutter  mnutter       976 Sep 22 13:16 make.pl
    $ gzip attrib.xml
    $ gzip child.xml
    $ ls -l
    total 1127
    -rw-rw-r--   1 mnutter  mnutter    671039 Sep 22 13:16 attrib.xml.gz
    -rw-rw-r--   1 mnutter  mnutter    472394 Sep 22 13:16 child.xml.gz
    -rwxr-xr-x   1 mnutter  mnutter       976 Sep 22 13:16 make.pl

    I used gzip as an example of off-the-shelf compression technology.  As you can see, even though the raw child.xml file is larger, the compressed version is *smaller* than the corresponding implementation with attributes.

    This may not be true in all cases, of course, but I expect it often will, due to the way such compression algorithms work.
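
The listings above can be turned into ratios with a little arithmetic. This is a rough sketch: the byte counts are copied from the two `ls -l` outputs, while the variable names and the `pct` helper are my own and not part of the original test.

```shell
#!/bin/sh
# Byte counts copied from the two `ls -l` listings above.
attrib_raw=5811852
child_raw=7445892
attrib_gz=671039
child_gz=472394

# pct A B: print A as a percentage of B, to one decimal place.
# (Hypothetical helper; awk handles the floating-point division.)
pct() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", 100 * a / b }'
}

echo "attrib.xml.gz is $(pct "$attrib_gz" "$attrib_raw")% of attrib.xml"
echo "child.xml.gz  is $(pct "$child_gz" "$child_raw")% of child.xml"
echo "child.xml     is $(pct "$child_raw" "$attrib_raw")% of attrib.xml"
echo "child.xml.gz  is $(pct "$child_gz" "$attrib_gz")% of attrib.xml.gz"
```

To repeat the measurement without replacing the original files, `gzip -c attrib.xml | wc -c` prints the compressed size and leaves attrib.xml in place.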

    For your reference, here is the Perl script I used to create the two files:

    open WORDS, "</usr/dict/words" or die "Couldn't open dictionary.\n";
    open ATTRIB, ">attrib.xml" or die "Couldn't open attrib.xml\n";
    open CHILD, ">child.xml" or die "Couldn't open child.xml\n";

    @twenty_strings = qw(one two three four five six seven eight nine ten
                         eleven twelve thirteen fourteen fifteen sixteen
                         seventeen eighteen nineteen twenty);

    print ATTRIB "<attrib>\n";
    print CHILD "<child>\n";

    while($word = <WORDS>)
    {
        $time = time();
        $timestr = localtime($time);
        $twenty = int(rand(20));    # random integer index, 0..19
        $twentystr = $twenty_strings[$twenty];
        print ATTRIB <<EOM;
      <word time="$time" timestr="$timestr" twenty="$twenty"
            twentystr="$twentystr">$word</word>
    EOM
        print CHILD <<EOM;
      <word>
        <time>$time</time>
        <timestr>$timestr</timestr>
        <twenty>$twenty</twenty>
        <twentystr>$twentystr</twentystr>
      </word>
    EOM
    }

    print ATTRIB "</attrib>\n";
    print CHILD "</child>\n";

    close CHILD;
    close ATTRIB;
    close WORDS;



    -_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-


    Mark Nutter, <mnutter at fore.com>
    Internet Applications Developer
    FORE Systems
    Some people are atheists 'til the day they die.