When to use attributes vs. elements

Mon Feb 8 19:41:26 GMT 1999

Dan Brickley asks several questions in a mail of 1999-02-08 having to do
with serializing graphs of data per the "canonical format" recommendations
in http://www.w3.org/TandS/QL/QL98/pp/microsoft-serializing.html.  Since his
mail was lengthy, I have not copied it here.

Let me take another stab at explaining the idea.  XML has two principal ways
to explicitly express a relationship among elements: containment and idrefs.
Idrefs always express a directed, labeled relationship between two elements;
they always have this meaning and they never have any other meaning.

If elements all have ids, and the relationships between the elements in a
document are all expressed via idrefs, then the document -- per normal XML
rules -- corresponds to a graph whose nodes are the elements and whose edges
are the idref attributes.

Given this, one can make the suggestion that graphs _should_ be serialized
in this way, nodes as elements and edges as idrefs.  A reader, knowing no
more conventions than the ordinary meaning of idrefs, will observe the
correct graph structure.
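
As a rough illustration of such a reader, here is a minimal Python sketch.
The sample document, the tag and attribute names, and the function
read_graph are all hypothetical.  A real reader would learn which
attributes are idrefs from the DTD; since ElementTree does not expose
attribute types, the sketch assumes that any attribute whose value matches
a known id is an idref.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: every element that is a node carries an id, and every
# relationship is an attribute whose value is the id of another element.
DOC = """
<data>
  <person id="p1" employer="c1" spouse="p2"/>
  <person id="p2" employer="c1"/>
  <company id="c1"/>
</data>
"""

def read_graph(xml_text):
    """Return (nodes, edges) using only the id/idref convention."""
    root = ET.fromstring(xml_text)
    # Nodes: every element with an id attribute, keyed by that id.
    nodes = {el.get("id"): el.tag for el in root.iter() if el.get("id")}
    edges = set()
    for el in root.iter():
        source = el.get("id")
        if source is None:
            continue
        for name, value in el.attrib.items():
            # Assumption: an attribute whose value names a known id is an idref.
            if name != "id" and value in nodes:
                edges.add((source, name, value))  # (from, label, to)
    return nodes, edges

nodes, edges = read_graph(DOC)
for edge in sorted(edges):
    print(edge)
```

Each edge comes out as a (source id, attribute name, target id) triple, so
the attribute name serves as the edge label and the direction runs from the
element that holds the attribute to the element it names.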

Of course, XML permits a great deal more flexibility than this.  One can,
for example, take advantage of contextual knowledge and use containment to
imply certain kinds of edges.  If one does this, then a naive reader will
only observe the explicit edges, and will not be able to reconstruct the
implied ones.  But -- to answer Dan's second question -- this does not mean
that a reader needs to have complete knowledge of the implications of the
abbreviations employed.  Even a naive reader will decode the graph correctly
to whatever extent it is explicit, that is, to whatever extent it uses the
conventions advocated in the "canonical format."  

The same point stated differently: If an XML instance uses a different set
of conventions, a naive reader will find some elements whose relationships
are unknown to him.  But he will not find relationships that he interprets
incorrectly.
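
A small Python sketch may make this concrete.  The document below is
hypothetical: it nests one person inside a company, intending containment
to imply employment, and states the other relationships as idrefs.  The
reader, also hypothetical, knows only the id/idref convention and assumes
that an attribute whose value matches a known id is an idref.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: p1's employer is implied by containment only;
# p2's employer and p1's spouse are stated explicitly as idrefs.
DOC = """
<data>
  <company id="c1">
    <person id="p1" spouse="p2"/>
  </company>
  <person id="p2" employer="c1"/>
</data>
"""

def read_explicit_edges(xml_text):
    """Collect only the edges stated through idref-style attributes."""
    root = ET.fromstring(xml_text)
    ids = {el.get("id") for el in root.iter() if el.get("id")}
    edges = set()
    for el in root.iter():
        source = el.get("id")
        if source is None:
            continue
        for name, value in el.attrib.items():
            # Assumption: an attribute whose value names a known id is an idref.
            if name != "id" and value in ids:
                edges.add((source, name, value))
    return ids, edges

ids, explicit = read_explicit_edges(DOC)

# The graph the author intended, including the edge implied by containment.
intended = explicit | {("p1", "employer", "c1")}

missing = intended - explicit   # edges the naive reader cannot see
wrong = explicit - intended     # edges the naive reader would get wrong
```

Here the naive reader still finds all three nodes and both explicit edges.
The one implied edge is missing from its view, and it reports no edge that
the author did not intend.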

This is the main point of the paper.

The paper addresses another point, and perhaps this has led Dan to some
confusion.  The paper notes that many XML documents will reflect graphs that
could have been rendered into the canonical format but were not, even though
there is a deterministic mapping from the document's syntax to canonical
syntax.  It goes on to note that such a mapping could work well in practice,
and that we have a range of options for implementing it, from simple
declarations in a schema, to architectural forms, to XSL.
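
One way to picture such a mapping, offered only as a sketch: a table of
rules stands in for the schema declarations, saying which idref attribute a
given parent/child nesting implies.  The RULES table, the function
to_canonical, and the sample document are hypothetical and are not taken
from the paper; Python is used here simply for brevity where the paper
mentions schema declarations, architectural forms, or XSL.

```python
import xml.etree.ElementTree as ET

# Hypothetical rule table: a <person> nested in a <company> implies an
# "employer" idref from the person to the company.
RULES = {("company", "person"): "employer"}

DOC = """
<data>
  <company id="c1">
    <person id="p1" spouse="p2"/>
  </company>
  <person id="p2" employer="c1"/>
</data>
"""

def to_canonical(xml_text, rules):
    """Flatten the document and turn implied edges into explicit idrefs."""
    root = ET.fromstring(xml_text)
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    flat = ET.Element(root.tag)
    for el in root.iter():
        if el.get("id") is None:
            continue
        # Copy the element without its children, keeping its attributes.
        copy = ET.SubElement(flat, el.tag, dict(el.attrib))
        parent = parent_of.get(el)
        if parent is not None and parent.get("id") is not None:
            label = rules.get((parent.tag, el.tag))
            if label is not None:
                # Make the containment-implied edge explicit.
                copy.set(label, parent.get("id"))
    return flat

flat = to_canonical(DOC, RULES)
print(ET.tostring(flat, encoding="unicode"))
```

After the transformation every node is a direct child of the root and every
relationship is carried by an attribute, so a reader that knows only the
idref convention can recover the whole graph.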

But the main point of the paper was to observe that the facilities needed to
express graphs already exist in XML if they are used properly.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)