element content vs. element attribute

Sun Dec 21 09:09:47 GMT 1997

On Sat, 20 Dec 1997, Ray Waldin asked:

> when should data be contained by elements?  Or conversely, when should
> data be an attribute of an element instead of contained by that element? 

There are a number of issues that may help here, depending on how the
information is going to be used...

Some pragmatics first:

* it's often easiest for people writing ad-hoc parsers if you only use
  elements; there's only one syntax to handle

* if you will ever need to have more complex structured values with markup
  in them, they will need to be in element content, because XML (like SGML)
  has a restriction that you can't put element markup inside attributes

* if you want the information to be displayed in XML or HTML or SGML browsers
  most or all of the time, use content, as the style sheets are generally
  less flexible with attributes.

* it's relatively easy to strip out all attribute values and make a
  pared-down instance, if that's useful

* attributes are good for things like a transcriber's interpretations of
  a text, which are not part of the actual content
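
To illustrate the point about stripping attributes, here is a minimal
sketch in Python using the standard library's xml.etree.ElementTree. The
function name strip_attributes and the sample line of transcription are
made up for this example; they are not from any particular tool.

```python
import xml.etree.ElementTree as ET


def strip_attributes(xml_text):
    """Return the document as a string with every attribute removed."""
    root = ET.fromstring(xml_text)
    # iter() visits the root element and all of its descendants.
    for element in root.iter():
        element.attrib.clear()
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample: attributes carry the transcriber's annotations,
# element content carries the text itself.
sample = (
    '<line n="12" hand="scribe-B">In the '
    '<place cert="low">north</place> field</line>'
)
print(strip_attributes(sample))
```

The element content survives untouched, which is why keeping annotations
in attributes makes the pared-down instance so cheap to produce.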

A philosophical view:

* attributes may be used for annotating the element tree; in other words,
  you could use them to store element properties.
  For example:
    <boiler MinTemperature="7" MaxTemperature="320">
	steam
	water
	gunk
    </boiler>
  Unfortunately, a practical example would add units and tolerance to
  the temperature, and then you need to use either elements or a non-XML
  substructure inside the attribute value:
    <boiler MinTemperature="7 {units K} {tolerance {plus 3} {minus 2}}"....>
  This is generally unsatisfactory because the substructure is not XML
  and so is invisible to an XML parser; so
    <boiler>
      <MinTemperature>
        <unit ref="SIUnits#Kelvin" abbr="K">Kelvin</unit>
        <value plus="3" minus="2">7</value>
      </MinTemperature>
    </boiler>
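
As a rough sketch of why the element form is easier to work with, the
Python below reads a well-formed copy of that markup with the standard
library's xml.etree.ElementTree; no hand-written parser for a brace
syntax is needed. The function name read_min_temperature and the decision
to return a (lowest, nominal, highest, unit) tuple are assumptions made
for the example.

```python
import xml.etree.ElementTree as ET

BOILER = """
<boiler>
  <MinTemperature>
    <unit ref="SIUnits#Kelvin" abbr="K">Kelvin</unit>
    <value plus="3" minus="2">7</value>
  </MinTemperature>
</boiler>
"""


def read_min_temperature(xml_text):
    """Return (lowest, nominal, highest, unit abbreviation)."""
    root = ET.fromstring(xml_text)
    temperature = root.find("MinTemperature")
    value = temperature.find("value")
    unit = temperature.find("unit")
    nominal = float(value.text)
    # The tolerance is kept in attributes on <value>.
    lowest = nominal - float(value.get("minus"))
    highest = nominal + float(value.get("plus"))
    return lowest, nominal, highest, unit.get("abbr")


print(read_min_temperature(BOILER))
```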

Clearly you could take those items I have left as attributes and turn
them into elements, and in fact any element E with attribute list A and
content model C can be converted into an element E' with content model
    E.atts(A), E.content(C)
e.g.
    <!ELEMENT Boiler-prime
        (Boiler-prime-attributes, Boiler-prime-content)
    >

    <!ELEMENT Boiler-prime-attributes
        (
          MinTemperature-prime,
          MaxTemperature-prime
        )
    >

    <!ELEMENT Boiler-prime-content
        (steam-prime, water-prime, gunk-prime)
    >

It is therefore possible to think of attributes as syntactic sugar for
a very restricted kind of content model.
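
The same conversion can be sketched on an instance rather than on the
DTD. This is only an illustration in Python with the standard library's
xml.etree.ElementTree: the function name attributes_to_elements is
invented, only the top-level element is converted (children are copied
unchanged), and the sample boiler uses empty child elements as an
assumption so that there is a content model to carry over.

```python
import copy
import xml.etree.ElementTree as ET


def attributes_to_elements(element):
    """Build E' from E: one child holds the old attributes, one the content."""
    prime = ET.Element(element.tag + "-prime")
    atts = ET.SubElement(prime, prime.tag + "-attributes")
    content = ET.SubElement(prime, prime.tag + "-content")
    # Each attribute becomes a child element whose text is the old value.
    for name, value in element.attrib.items():
        ET.SubElement(atts, name).text = value
    # The original content moves, unchanged, under the content wrapper.
    content.text = element.text
    for child in element:
        content.append(copy.deepcopy(child))
    return prime


boiler = ET.fromstring(
    '<boiler MinTemperature="7" MaxTemperature="320">'
    "<steam/><water/><gunk/></boiler>"
)
print(ET.tostring(attributes_to_elements(boiler), encoding="unicode"))
```

Note that every attribute value is treated here as a plain string, which
is exactly the CDATA case; the sketch does nothing about ID or IDREF
constraints.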

Unfortunately, this is not quite correct, because XML attributes support
a set of constraints on their content which is entirely different to
that supported for elements.

If you only ever use CDATA, ID and name group attributes, retain ID
attributes as attributes, and convert name group token lists to
corresponding empty elements, the conversion still applies.

In theory, then, attributes are a useful but limited shorthand in most
cases, but essential for IDREF and other cases that are not supported
in element content.

In practice, they can be used to make an instance more readable, or to
reduce file size, or to distinguish between different sorts of information.

Hope this helps.

Lee (tired at 4 am!)

-- 
Liam Quin --  the barefoot typographer -- Toronto
lq-text: freely available Unix text retrieval
IRC: Learn about XML/SGML/XSL/XLL/DSSSL on irc.dragonnet.org in #xml
email address: l i a m q u i n, at host: i n t e r l o g  dot  c o m

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)