DTD for RDF? How is that possible?

Rick Jelliffe ricko at allette.com.au
Thu May 6 19:39:05 BST 1999


 From: Roger L. Costello <costello at mitre.org>

>I recall seeing a posting a few months back where someone posted a DTD

That was me!  See http://www.ascc.net/xml/en/utf-8/resource_index.html

>for RDF.  Upon reflection I find this puzzling.  How could you possibly
>create a DTD for RDF?

Not at all. It was trivial to make a DTD for almost all of the RDF
syntax.

I think the DTD is a lot clearer than their formal syntax.  In fact,
their formal syntax leaves out rdf:subject, rdf:object, rdf:predicate
and rdf:type.  I have put these in today.

The only thing that eludes DTDs is the unlimited number of rdf:_n
attributes. I have put in 8 and you can add as many more as you need.
They seem a particularly gratuitous carbuncle on RDF to me: I trust
everyone will boycott that abbreviated form (the GratAbbCarb?).
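
To illustrate the point about rdf:_n, here is a minimal sketch of how a
fixed number of these attributes might be declared. It is not copied
from the DTD mentioned above: the choice of rdf:Bag as the carrying
element and the CDATA/#IMPLIED defaults are assumptions made for the
example.

```xml
<!-- Sketch only. A DTD has no way to say "rdf:_n for every n",
     so each attribute has to be listed by hand. -->
<!ATTLIST rdf:Bag
    rdf:_1 CDATA #IMPLIED
    rdf:_2 CDATA #IMPLIED
    rdf:_3 CDATA #IMPLIED
    rdf:_4 CDATA #IMPLIED
    rdf:_5 CDATA #IMPLIED
    rdf:_6 CDATA #IMPLIED
    rdf:_7 CDATA #IMPLIED
    rdf:_8 CDATA #IMPLIED>
```

A document that needs rdf:_9 would fail validation until one more line
is added, which is the limitation being described.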

>It is my understanding that RDF documents can only be "well-formed".
>There is no such thing as a "valid" RDF document.  Correct?  /Roger

But for any particular group of documents you can make up a DTD for
them. And you can use a variant of the DTD to constrain your users to
prevent them from using the rdf:_n attributes. That would be a prudent
move. So in fact you can have lots of DTDs which generate data that
complies with RDF.

And you can generate a DTD (like mine) which should accept any RDF
document (provided, of course, that you complete the DTD to include
any domain-specific element names: these can be declared in the
internal subset).
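
As a rough sketch of what "completing" the DTD could look like, the
document below adds domain-specific declarations in the internal subset
of its DOCTYPE declaration. Everything named here is hypothetical: the
file name rdf.dtd, the ex: prefix and its namespace URI, and the
ex:title and ex:creator elements. It also assumes that the external DTD
declares rdf:RDF and rdf:Description with content ANY, along with the
xmlns:rdf and rdf:about attributes.

```xml
<?xml version="1.0"?>
<!DOCTYPE rdf:RDF SYSTEM "rdf.dtd" [
  <!-- Hypothetical domain-specific property elements. -->
  <!ELEMENT ex:title   (#PCDATA)>
  <!ELEMENT ex:creator (#PCDATA)>
  <!-- The namespace declaration is an attribute too, so it must be
       declared for the document to be valid. -->
  <!ATTLIST rdf:RDF xmlns:ex CDATA #IMPLIED>
]>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <rdf:Description rdf:about="http://example.org/report">
    <ex:title>An example report</ex:title>
    <ex:creator>A. N. Author</ex:creator>
  </rdf:Description>
</rdf:RDF>
```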

>...  By definition, the child elements in
>rdf:Description are arbitrary:
>
><!ELEMENT rdf:RDF (rdf:Description)*>
><!ELEMENT rdf:Description (???)*>
>

In order to think about the DTDs, we need to split the structure into
levels:

* The first is the direct structure (this is the DTD that I give). It
says that an rdf:RDF element has a declared content type of ANY. That
is easy, and it is what the RDF spec says, under all the fluff.
Similarly, rdf:Description can have a content model of ANY. It is
important to realize that "ANY" signifies this arbitrariness...

* The second is the "architectural" structure. (I didn't give a DTD for
this: I just put it in comments and a parameter entity; I could have
used ISO's Architectural Forms declarations to declare this
architectural structure.) This says that a child of an rdf:RDF must be
    ( rdf:Description | rdf:Bag | rdf:Set | rdf:Seq )
and gives the appropriate attributes for these. Because these are
architectural elements, they do not have to be signified in the element
type identifier (i.e., you can use any name, as long as you have the
correct attributes and the correct content models). rdf:Description
would have an architectural content model of (rdf:PropertyElement)*
and the rdf:PropertyElement has its due attributes.
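
The two levels above might be written down roughly as follows. This is
a sketch under assumptions, not the text of the DTD referred to: the
parameter entity name is made up for the example, and the entity is
declared only as documentation and never referenced in a content model.

```xml
<!-- Direct structure: this is all a validating parser checks. -->
<!ELEMENT rdf:RDF         ANY>
<!ELEMENT rdf:Description ANY>

<!-- Architectural structure, recorded for human readers and for any
     architecture-aware tool. The group is copied from the text above;
     the entity name is hypothetical. -->
<!ENTITY % rdf.RDF.arch.children
    "( rdf:Description | rdf:Bag | rdf:Set | rdf:Seq )">

<!-- rdf:Description, architecturally: ( rdf:PropertyElement )* -->
```

Because the content type is ANY, an ordinary parser accepts any declared
element as a child; the stricter rule lives only in the entity and the
comment.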

So conventional DTD modeling sees this as two structures, each of which
can be described using DTDs.  Since the RDF people didn't use
architectures, they are forced to use BNF (which is incomplete w.r.t.
XML, and incomplete and confusing w.r.t. RDF) and are banished to cry
for other forms of extended context declarations.

The document can be validated against the direct DTD only, but the
indirect DTD, if constructed, could be used to validate it using an
architectural validator. (Probably that XAF tool could do it. You could
also use XSL to do this kind of validation: I have an article on the
same site, "Using XSL for Structural Validation", which looks into it.
Of course, unless XSL has some way to check for names generated from
numbers, it cannot validate all possible rdf:_n attributes, but that is
a flaw in RDF.)

At least RDF stands as a notable application that doesn't use the
direct element type identifier to key the content model. Following
Murata Makoto's excellent XTech99 talk about performing set operations
on content models, I have been a little afraid that other forms of
validation (e.g. architectural validation using attribute names, or
content model validations that follow IDREFs instead of subelements)
have been thrown out.  Which is a shame, because parallel content
models provide some nice capabilities (as RDF may, in about a million
years, prove).

I still find it difficult to see RDF as anything other than a way of
making implicit relationships, which every DTD designer builds into
their DTD, explicit. This allows generic tools which understand the
relationships. But the generic tools still need to understand the
schemas, so I think RDF does not take us very far in practice, apart
from the drab tasks of managing, navigating and visualizing. It is not
a very thick layer, so it is a pity they have such a restrictive
syntax: the whole thing should have been an architecture that could
have been retrofitted to any existing DTD. A great opportunity missed,
IMHO.

Rick Jelliffe


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



