Taxonomies in XML

Peter Murray-Rust peter at ursus.demon.co.uk
Thu Apr 16 17:07:40 BST 1998


At 22:31 15/04/98 -0800, John Totten wrote:
>The 30 or so XML files that represent the El Limon Weeds Collection
>(one separate file for each weed) will impress a Web Master but not a
>botanist because you cannot produce a taxonomy from them.
>	 How can you add nodes and unlimited nesting to XML documents?
>
If your taxonomy is fixed and consists of a single hierarchy then XML is
the most natural way to express it :-).  I have done this for protein
sequences on the WWW and come up with something like:
<LIST TITLE="EUKARYOTA">
  <LIST TITLE="METAZOA">
    <LIST TITLE="CHORDATA">
      <LIST TITLE="VERTEBRATA">
        <LIST TITLE="TETRAPODA">
          <LIST TITLE="MAMMALIA">
            <LIST TITLE="EUTHERIA">
              <LIST TITLE="PRIMATES">
                <ITEM TITLE="HOMO SAPIENS"/>
              </LIST>
            </LIST>
          </LIST>
        </LIST>
      </LIST>
    </LIST>
  </LIST>
</LIST>

This allows nesting of any depth and displays beautifully in a
tree-structured browser. [I shall be releasing JUMBO2 very shortly - when
SAX is finalised - and this will be one of the examples to show an
essentially non-textual application of XML.]
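
For anyone without a tree-structured browser to hand, here is a minimal sketch of walking such a document, assuming Python and its standard xml.etree.ElementTree module (neither is part of JUMBO). The helper names outline and lineage are hypothetical, and the sample is a shortened version of the hierarchy above.

```python
import xml.etree.ElementTree as ET

# Shortened version of the nested LIST/ITEM sample, for illustration only.
TAXONOMY = """<LIST TITLE="EUKARYOTA">
  <LIST TITLE="METAZOA">
    <LIST TITLE="CHORDATA">
      <ITEM TITLE="HOMO SAPIENS"/>
    </LIST>
  </LIST>
</LIST>"""


def outline(node, depth=0):
    """Return one indented line per element, depth-first (hypothetical helper)."""
    lines = ["  " * depth + node.get("TITLE", "")]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines


def lineage(node, title, path=()):
    """Return the TITLEs from the root down to the named element, or None."""
    path = path + (node.get("TITLE"),)
    if node.get("TITLE") == title:
        return list(path)
    for child in node:
        found = lineage(child, title, path)
        if found:
            return found
    return None


root = ET.fromstring(TAXONOMY)
print("\n".join(outline(root)))
print(" > ".join(lineage(root, "HOMO SAPIENS")))
```

Because both functions recurse on child elements, the depth of nesting does not need to be known in advance.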

[I have added whitespace to the above example for human benefit. Exercise:
If you are new to XML, how would you decide whether the whitespace was
'ignorable'? :-)]
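
One hedged way to start on the exercise: without a DTD declaring that LIST has element content, a non-validating parser cannot know the whitespace is ignorable, so it reports it as ordinary character data. The sketch below assumes Python's xml.etree.ElementTree (a non-validating parser) and simply shows where the whitespace ends up.

```python
import xml.etree.ElementTree as ET

doc = """<LIST TITLE="PRIMATES">
  <ITEM TITLE="HOMO SAPIENS"/>
</LIST>"""

root = ET.fromstring(doc)

# Whitespace between <LIST ...> and <ITEM .../> is kept as text of LIST.
print(repr(root.text))

# Whitespace between <ITEM .../> and </LIST> is kept as the tail of ITEM.
print(repr(root[0].tail))

# An application that knows LIST holds only elements can discard it itself.
is_only_whitespace = root.text.strip() == "" and root[0].tail.strip() == ""
print(is_only_whitespace)
```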

Note that I dare not venture further than this because taxonomies are much
more complex than this - usually dynamic and hence requiring attention to
renaming, equivalences, the possibility of multiple parents, etc. Much the
same problems as with orgCharts :-). But if you have a fixed taxonomy, XML
is wonderful. Try doing the above with a relational database and asking
non-experts to create the input; I suspect any scientist could work
with the XML above almost without thinking.

Why did I include the 'data' in the TITLE attribute rather than in element
content? Mainly because I had a simple display routine that picked up TITLE
attributes rather than element content :-).  If I redid it now I might
move things to element content.
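
As a rough sketch of what that move could look like, the following assumes Python's xml.etree.ElementTree and a hypothetical NAME child element to hold the former TITLE value. The element name and the helper titles_to_content are illustrative choices, not anything defined in the original markup.

```python
import xml.etree.ElementTree as ET


def titles_to_content(node):
    """Move each TITLE attribute into a leading <NAME> child, in place.

    NAME is a hypothetical element name chosen for this sketch.
    """
    title = node.attrib.pop("TITLE", None)
    # Convert the original children first, before NAME is inserted.
    for child in list(node):
        titles_to_content(child)
    if title is not None:
        name = ET.Element("NAME")
        name.text = title
        node.insert(0, name)


root = ET.fromstring('<LIST TITLE="PRIMATES"><ITEM TITLE="HOMO SAPIENS"/></LIST>')
titles_to_content(root)
converted = ET.tostring(root, encoding="unicode")
print(converted)
```

Using a child element rather than bare text inside LIST avoids mixed content, so each LIST still contains only elements.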

	HTH

	P.


Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



