Element/Attribute Distinction Considered Harmful

Simon St.Laurent simonstl at simonstl.com
Sun Aug 29 18:21:41 BST 1999


After writing the usual 'when to use elements and when to use attributes' bit
for a new book and then spending some time close up with the XLink specs, I'm
really starting to wonder if we haven't painted ourselves into a corner by
treating leaf elements and attributes differently.

I'm not complaining about notation - in fact, I'm quite fond of attributes as
an abbreviated form that makes clearer (at times) that given information is
perhaps annotation for another component rather than a component of its own.

My concerns arise more with schema development and attribute usage in
situations like XLink.  By using tools that require a given piece of
information to appear as an attribute, we gain the use of one set of tools
(defaulting and somewhat better constraints) at the cost of another set of
tools (the ability to annotate that information itself, or to plain old store
multiple levels of information, plus the ever-lingering question of whether
attribute sequence matters).

For a simple example, I'll use the HTML 4.0 IMG element and three of its
attributes: SRC, LONGDESC, and USEMAP.  All three of them take URIs for
values:
<!ATTLIST IMG
        SRC             CDATA   #REQUIRED
        LONGDESC        CDATA   #IMPLIED
        USEMAP          CDATA   #IMPLIED>

(In the HTML 4.01 spec, they use %URI; for the attribute type, which resolves
to CDATA.  They also use lower case for the attribute names.)

All three of these attributes are locators - they carry URIs that can be used
to retrieve additional information about the image.  SRC is the 'primary'
attribute, without which there isn't an image, while LONGDESC provides a link
to extra description of the image and USEMAP points to an image map resolver,
itself another set of links.

Because XLink follows the current expectations of schemas, and because XLink
relies on the element/attribute distinction for its understanding of which
containers hold link-relevant information, XLink cannot support this use case
directly.  _If_ IMG stored SRC, LONGDESC, and USEMAP as child elements, XLink
could support this easily.  However, XLink provides no support for 'extended
links' where all the linking information is presented as attributes rather
than child elements.  In the current usage of XML and schemas, it has no
formal vocabulary available to it that lets it handle this case.
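
To make the contrast concrete, here is a minimal sketch in Python (standard
library xml.etree.ElementTree only) that rewrites the attribute form of IMG
into the child-element form and then annotates one of the locators.  The file
names, the helper name locators_to_children, and the 'title' annotation are
assumptions made up for illustration; none of them comes from HTML or XLink.

```python
import xml.etree.ElementTree as ET

# Attribute form, as in HTML 4.0 (file names are invented for illustration).
attr_form = ET.fromstring(
    '<IMG SRC="photo.png" LONGDESC="photo-desc.html" USEMAP="#photo-map"/>'
)


def locators_to_children(img):
    """Return a new element with each attribute of img as a child element."""
    out = ET.Element(img.tag)
    for name, value in img.attrib.items():
        child = ET.SubElement(out, name)
        child.text = value
    return out


elem_form = locators_to_children(attr_form)

# A child element can carry its own annotation; an attribute value cannot.
# 'title' is a hypothetical annotation, not taken from any specification.
elem_form.find("LONGDESC").set("title", "Extended description")

print(ET.tostring(elem_form, encoding="unicode"))
```

The point of the sketch is only that the element form leaves room for further
structure on each locator, which the attribute form does not.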

Another case that's generated some (reasonably violent) comment is the use of
the style attribute in HTML for use with CSS.  In HTML, you can do inline
styling by attaching a style attribute to the element to be styled.  That
single attribute may hold hundreds of different style properties, using a
commonly understood convention.  For example, 

<P style="font-size:18pt; font-weight:bold; color:#42426F;
font-family:sans-serif">This is my crazy paragraph!</P>

The style attribute here stores four name-value pairs, violating quite
thoroughly the principle that an attribute represents a single value and
irritating some folks enormously.  The same information could have been
represented as:

<P>
<style font-size="18pt" font-weight="bold" color="#42426F"
font-family="sans-serif" />This is my crazy paragraph!</P>

or even:
<P>
<style>
        <font-size>18pt</font-size>
        <font-weight>bold</font-weight>
        <color>#42426F</color>
        <font-family>sans-serif</font-family>
</style>
This is my crazy paragraph!</P>

You get the idea...
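
The mapping between these forms is mechanical.  As a rough sketch, assuming
Python's standard xml.etree.ElementTree and a deliberately naive split on ';'
and ':' (real CSS parsing has to cope with quoted strings and more), a
paragraph like the one above can be turned into the child-element form.  The
helper names parse_style and style_to_children are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_style(style):
    """Split a CSS-like declaration string into (name, value) pairs (naive)."""
    pairs = []
    for decl in style.split(";"):
        if ":" not in decl:
            continue  # skip empty or malformed fragments
        name, value = decl.split(":", 1)
        pairs.append((name.strip(), value.strip()))
    return pairs


def style_to_children(elem):
    """Move elem's style attribute into a leading <style> child element."""
    pairs = parse_style(elem.attrib.pop("style", ""))
    if not pairs:
        return elem
    style = ET.Element("style")
    for name, value in pairs:
        ET.SubElement(style, name).text = value
    # Keep the paragraph text, now following the <style> element.
    style.tail = elem.text
    elem.text = None
    elem.insert(0, style)
    return elem


p = ET.fromstring(
    '<P style="font-size:18pt; font-weight:bold; color:#42426F; '
    'font-family:sans-serif">This is my crazy paragraph!</P>'
)
style_to_children(p)
print(ET.tostring(p, encoding="unicode"))
```

Nothing is lost or gained in the conversion, which is rather the point: the
two forms hold the same name-value pairs.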

It seems like child elements and attributes are both basically name-value
pairs.  In the case of child elements, we treat order as significant, while
in the case of attributes XML 1.0 says order is not significant, though the
question of whether it should matter keeps coming back.
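
A small illustration of how that difference in ordering shows up in practice,
assuming Python's standard xml.etree.ElementTree (which exposes attributes as
a dict and children as a sequence); the variable names are only for this
sketch:

```python
import xml.etree.ElementTree as ET

# Same attributes, different order.
a = ET.fromstring('<style font-size="18pt" color="#42426F"/>')
b = ET.fromstring('<style color="#42426F" font-size="18pt"/>')
attrs_equal = a.attrib == b.attrib  # dict comparison ignores order

# Same child elements, different order.
c = ET.fromstring(
    "<style><font-size>18pt</font-size><color>#42426F</color></style>"
)
d = ET.fromstring(
    "<style><color>#42426F</color><font-size>18pt</font-size></style>"
)
children_equal = [(e.tag, e.text) for e in c] == [(e.tag, e.text) for e in d]

print(attrs_equal, children_equal)  # prints: True False
```

The same four characters of information compare equal in one notation and
unequal in the other, purely because of which container was chosen.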

Apart from being the status quo, does this approach really make sense?  Am I
the only one wondering if maybe it's time to start breaking down this
distinction - at least within schemas - to give XML some of the flexibility
the current system denies?  (RDF certainly took that approach, but wrapped it
in lots of alien - to XML - concepts that I think obscured its value.)  It's
taken me a long time to reach these conclusions - until recently, I really
liked that status quo, though without much good cause.

I'd love to hear people's opinions on this subject, if indeed anyone else
considers it worth addressing.  It has immediate impact on developments like
schemas, as well as significant implications for XLink and potentially CSS
style application.

Simon St.Laurent
XML: A Primer (2nd Ed - September)
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1