RDF, again

Wed Nov 24 14:47:14 GMT 1999

Paul Prescod <paul at prescod.net> writes:

> The thing I find confusing about the RDF syntax is that the element
> type name can be either an RDF type name or an RDF property. XML
> makes no distinction and that's why I think that it is difficult to
> use for object oriented interchange. Your example doesn't run into
> that problem really because it only goes one level deep. But what
> does the RDF for this CSS-style object representation look like:
> 
> person{
> 	name: person-name{ first: "Paul"; last: "Prescod"};
> 	address: snail-mail-address{
> 		street: street-address{
> 			number: 5936; 
> 			street: "Lovers Lane"};
> 		city: city( #!Dallas );
> 		state: state( #!Texas )};
> 	siblings: #sibling1 #sibling2 #sibling3}
> 
> (curly braces for structures, parentheses for primitive types,
> concatenation for lists, semicolons for property separators)

Without minimization (except for using the class name as the element
type in place of an explicit rdf:type), you get something like this,
which is fully normalized and thus suitable for B2B data exchange:

  <!-- Example 1 of 3 (almost no minimization) -->

  <megg:Person rdf:ID="sibling0">
    <megg:name rdf:resource="#id002"/>
    <megg:address rdf:resource="#id003"/>
    <megg:sibling rdf:resource="#sibling1"/>
    <megg:sibling rdf:resource="#sibling2"/>
    <megg:sibling rdf:resource="#sibling3"/>
  </megg:Person>

  <megg:PersonName rdf:ID="id002">
    <megg:firstName>Paul</megg:firstName>
    <megg:lastName>Prescod</megg:lastName>
  </megg:PersonName>

  <megg:SnailMailAddress rdf:ID="id003">
    <megg:street rdf:resource="#id005"/>
    <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
    <megg:state rdf:resource="http://www.places.org/us/tx/"/>
  </megg:SnailMailAddress>

  <megg:StreetAddress rdf:ID="id005">
    <megg:number>5936</megg:number>
    <megg:streetName>Lovers Lane</megg:streetName>
  </megg:StreetAddress>

With a bit of minimization (and some denormalization), you get
something like this:

  <!-- Example 2 of 3 (moderate minimization) -->

  <megg:Person rdf:ID="sibling0">
    <megg:name>
      <megg:PersonName>
        <megg:firstName>Paul</megg:firstName>
        <megg:lastName>Prescod</megg:lastName>
      </megg:PersonName>
    </megg:name>
    <megg:address>
      <megg:SnailMailAddress>
        <megg:street>
          <megg:StreetAddress>
            <megg:number>5936</megg:number>
            <megg:streetName>Lovers Lane</megg:streetName>
          </megg:StreetAddress>
        </megg:street>
        <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
        <megg:state rdf:resource="http://www.places.org/us/tx/"/>
      </megg:SnailMailAddress>
    </megg:address>
    <megg:sibling rdf:resource="#sibling1"/>
    <megg:sibling rdf:resource="#sibling2"/>
    <megg:sibling rdf:resource="#sibling3"/>
  </megg:Person>

With really ferocious minimization, you can get down to this (but you
lose some class names):

  <!-- Example 3 of 3 (maximum minimization) -->

  <megg:Person rdf:ID="sibling0">
    <megg:name firstName="Paul" lastName="Prescod"/>
    <megg:address rdf:parseType="Resource">
      <megg:street number="5936" streetName="Lovers Lane"/>
      <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
      <megg:state rdf:resource="http://www.places.org/us/tx/"/>
    </megg:address>
    <megg:sibling rdf:resource="#sibling1"/>
    <megg:sibling rdf:resource="#sibling2"/>
    <megg:sibling rdf:resource="#sibling3"/>
  </megg:Person>

Certainly, it's easy for a person to read at this level, but it's
quite tricky to process; I'm surprised that several RDF processors
have actually come out.
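
As a rough illustration of why this is tricky, here is a minimal Python
sketch of a reader for the alternating node/property layout used in the
examples above.  It is not a conforming RDF parser: it ignores
rdf:about, containers, rdf:aboutEachPrefix and much else.  The megg
namespace URI and all of the function names are assumptions made up for
the sketch; the post never gives them.

```python
import itertools
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
MEGG = "http://example.org/megg#"  # assumption: hypothetical namespace for megg:

_ids = itertools.count(1)

def blank():
    """Make up a label for a node that has no rdf:ID."""
    return "_:b%d" % next(_ids)

def read_node(elem, triples):
    """Node (class) element: the tag is the rdf:type, the children are properties."""
    ident = elem.get("{%s}ID" % RDF)
    subject = "#" + ident if ident else blank()
    if elem.tag != "{%s}Description" % RDF:
        triples.append((subject, "rdf:type", elem.tag))
    read_properties(subject, elem, triples)
    return subject

def read_properties(subject, elem, triples):
    """Every child of a node is a property element."""
    for prop in elem:
        triples.append((subject, prop.tag, read_value(prop, triples)))

def read_value(prop, triples):
    """Property element: the value can be spelled in several different ways."""
    resource = prop.get("{%s}resource" % RDF)
    if resource is not None:                       # reference to another node
        return resource
    if prop.get("{%s}parseType" % RDF) == "Resource":
        obj = blank()                              # anonymous node with no class
        read_properties(obj, prop, triples)
        return obj
    if len(prop):                                  # nested node element
        return read_node(prop[0], triples)
    attrs = [(k, v) for k, v in prop.attrib.items() if not k.startswith("{" + RDF)]
    if attrs:                                      # properties folded into attributes
        obj = blank()
        for name, text in attrs:
            triples.append((obj, name, '"%s"' % text))
        return obj
    return '"%s"' % (prop.text or "").strip()      # plain literal

def triples_of(xml_text):
    """Return (subject, property, value) tuples for one top-level node element."""
    triples = []
    read_node(ET.fromstring(xml_text), triples)
    return triples

EXAMPLE_3 = """
<megg:Person xmlns:rdf="%s" xmlns:megg="%s" rdf:ID="sibling0">
  <megg:name firstName="Paul" lastName="Prescod"/>
  <megg:address rdf:parseType="Resource">
    <megg:street number="5936" streetName="Lovers Lane"/>
    <megg:city rdf:resource="http://www.places.org/us/tx/dallas/"/>
    <megg:state rdf:resource="http://www.places.org/us/tx/"/>
  </megg:address>
  <megg:sibling rdf:resource="#sibling1"/>
  <megg:sibling rdf:resource="#sibling2"/>
  <megg:sibling rdf:resource="#sibling3"/>
</megg:Person>
""" % (RDF, MEGG)

triples = triples_of(EXAMPLE_3)
type_count = sum(1 for t in triples if t[1] == "rdf:type")
```

Even this toy version needs five branches just to decide what a
property's value is.  Run over Example 3 it records only one rdf:type
statement (for megg:Person), whereas the same walk over Example 2
records four, which is the loss of class names mentioned above.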

The RDF committee had to make a difficult choice: to what extent
should they complicate the syntax to help get buy-in from the initial
implementors?

When the original SGML committee had to make that choice, strong
vested interests (especially in publishing, from what I've heard) were
able to force a horrendous complexity on the ISO 8879:1986 grammar.

When the XML committee had to make the same choice, they held the
line much better, but vested interests (especially in the SGML world)
were able to force them to include some kruft like notations and
external unparsed entities.

When the RDF committee had to make the same choice, strong vested
interests (especially in the HTML world) were able to force them to
include heavy optional minimization and some bizarre kruft like the
rdf:aboutEachPrefix attribute.

No International Standard *or* consortium spec is free from this kind
of horse trading -- it's just a fact of life.  In the end, the kruft
in XML didn't hurt it all that much, while the syntactic kruft in SGML
pretty much did it in.  The jury is still out on RDF, but the
convoluted syntax puts it dangerously close to the edge.

> Note also that common XML usage puts datatypes in the schema or
> elsewhere. To recognize an integer as such you need the schema or some
> other external knowledge. Some DTDs have <int> elements but that's so
> ugly that it hasn't really "caught on." The XML world is very
> inconsistent in its thought about the appropriateness of dependence on
> the schema. I think that that dependence is slowly creeping back into
> vogue.

I think that the RDF committee intended to handle datatypes that way
too, but they were waiting for some general XML datatyping facility.
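
The dependence on external knowledge can be sketched in a few lines of
Python.  The property names come from the examples above; the SCHEMA
table and the typed() helper are hypothetical stand-ins for whatever
datatyping facility eventually supplies the types.

```python
# Hypothetical "schema": external knowledge about which properties hold integers.
# Nothing in the XML itself says that the street number is numeric.
SCHEMA = {"number": int}

def typed(prop, text):
    """Convert literal text using the schema; leave it a string if the schema is silent."""
    convert = SCHEMA.get(prop, str)
    return convert(text)

# Literal values exactly as a parser would hand them over: all strings.
record = {"number": "5936", "streetName": "Lovers Lane"}
values = {prop: typed(prop, text) for prop, text in record.items()}
```

Without the SCHEMA entry, "5936" stays a string like any other literal.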

> Speaking on behalf of the devil, I'd say that in one week we could
> define (or just find) an S-expression-like language with none of these
> weaknesses and in less time we could write a parser for it. It could
> have "XML element" as a primary data type for embedded XML and could
> also be embedded IN XML.

Not only could we, but many of us have -- I've written quite a few
thousand lines of LISP in my life, and I know that it works fine for
representing data structures, but nobody uses it.  XML also works
fine, and everyone uses it.  So, let's get on to the interesting
stuff, and actually start doing something with information rather than
just marking it up.

All the best,

David

-- 
David Megginson                 david at megginson.com
           http://www.megginson.com/
