Short Essay: Squeezing RDF into a Java Object Model

Roger L. Costello costello at mitre.org
Mon May 3 20:29:29 BST 1999


David,

I see where you are going with this - develop an API for RDF.  Out of
curiosity, why isn't the SAX API adequate?  After all, RDF is just XML. 
Let the application deal with it.  /Roger
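
To make the question concrete, here is a minimal sketch of what "let the
application deal with it" might look like on top of SAX. The class name
TripleCollector and the "subject | predicate | object" output format are
assumptions made up for illustration. The handler covers only the simplest
case in the quoted message (a literal-valued property directly inside an
rdf:Description with an unqualified about attribute).

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: rebuilds "subject | predicate | object" strings
// from raw SAX events for literal-valued properties only.
class TripleCollector extends DefaultHandler {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    final List<String> triples = new ArrayList<>();
    private String subject;    // taken from the enclosing rdf:Description
    private String predicate;  // namespace URI + local name of the property element
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (RDF_NS.equals(uri) && "Description".equals(local)) {
            // Unqualified "about" attribute, as written in the examples.
            subject = atts.getValue("", "about");
        } else if (subject != null && !RDF_NS.equals(uri)) {
            predicate = uri + local;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // Only keep character data that belongs to a property element.
        if (predicate != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (predicate != null && predicate.equals(uri + local)) {
            triples.add(subject + " | " + predicate + " | " + text);
            predicate = null;
        } else if (RDF_NS.equals(uri) && "Description".equals(local)) {
            subject = null;
        }
    }

    // Parses an RDF/XML string and returns the collected triples.
    static List<String> parse(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        TripleCollector handler = new TripleCollector();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.triples;
    }
}
```

Even this much requires the application to track the current subject and
to assemble predicate URIs itself. The sketch does nothing for rdf:resource,
aboutEachPrefix, xml:lang, or parseType="Literal".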

David Megginson wrote:
> 
> The more I work with RDF, the more I find it fascinating in the
> abstract but annoying in the concrete.
> 
> The biggest problem is that RDF claims an extremely simple data model
> 
>   statement: subject, predicate, object
> 
> but that the model does not even come close to describing what
> information actually appears in an RDF statement.  Let's start with
> the most naive mapping into a Java object model:
> 
>   public interface RDFStatement
>   {
>     public abstract String getSubject ();
>     public abstract String getPredicate ();
>     public abstract String getObject ();
>   }
> 
> This will work fine for something like the following:
> 
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>            xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description about="http://www.megginson.com/">
>   <dc:Title>Megginson Technologies</dc:Title>
>   </rdf:Description>
>   </rdf:RDF>
> 
>   statement.getSubject()   => "http://www.megginson.com/"
>   statement.getPredicate() => "http://www.purl.org/dc#Title"
>   statement.getObject()    => "Megginson Technologies"
> 
> However, it falls apart quickly when the value of the property is a
> resource:
> 
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>            xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description about="http://www.megginson.com/">
>   <dc:Creator rdf:resource="http://home.sprynet.com/sprynet/dmeggins/"/>
>   </rdf:Description>
>   </rdf:RDF>
> 
>   statement.getSubject()   => "http://www.megginson.com/"
>   statement.getPredicate() => "http://www.purl.org/dc#Creator"
>   statement.getObject()    => "http://home.sprynet.com/sprynet/dmeggins/"
> 
> In the first case, the object was a literal, and in the second case,
> the object is a resource; however, the naive interface does not make
> this information available.  The only solution is to add a new
> property to the Java interface:
> 
>   public interface RDFStatement
>   {
>     public abstract String getSubject ();
>     public abstract String getPredicate ();
>     public abstract String getObject ();
>     public abstract boolean objectIsResource ();
>   }
> 
> Now, for the first example, we have
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Title"
>   statement.getObject()        => "Megginson Technologies"
>   statement.objectIsResource() => false
> 
> and for the second example, we have
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Creator"
>   statement.getObject()        => "http://home.sprynet.com/sprynet/dmeggins/"
>   statement.objectIsResource() => true
> 
> Unfortunately, we're not nearly through yet.  The next nasty bit comes
> from the aboutEachPrefix attribute.  For example, here's a modified
> version of the first example:
> 
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>            xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>   <dc:Title>Megginson Technologies</dc:Title>
>   </rdf:Description>
>   </rdf:RDF>
> 
> Now, this description no longer applies just to
> http://www.megginson.com/, but to *all* resources whose URIs begin
> with http://www.megginson.com/ (a constantly-changing set, and, in the
> case of CGIs or Servlets, potentially infinite).  As a result, the
> following information is no longer sufficient:
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Title"
>   statement.getObject()        => "Megginson Technologies"
>   statement.objectIsResource() => false
> 
> We need to modify the interface once again:
> 
>   public interface RDFStatement
>   {
>     public abstract String getSubject ();
>     public abstract String getPredicate ();
>     public abstract String getObject ();
>     public abstract boolean subjectIsPrefix ();
>     public abstract boolean objectIsResource ();
>   }
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Title"
>   statement.getObject()        => "Megginson Technologies"
>   statement.subjectIsPrefix()  => true
>   statement.objectIsResource() => false
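
A rough sketch of what the extra flag means for a consumer: deciding
whether a statement describes a given resource becomes a prefix test
rather than an equality test. The class PrefixAwareStatement and its
appliesTo method are hypothetical names, not part of the interface above.

```java
// Hypothetical value class carrying the same fields as the getters above.
final class PrefixAwareStatement {
    private final String subject;
    private final String predicate;
    private final String object;
    private final boolean subjectIsPrefix;
    private final boolean objectIsResource;

    PrefixAwareStatement(String subject, String predicate, String object,
                         boolean subjectIsPrefix, boolean objectIsResource) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.subjectIsPrefix = subjectIsPrefix;
        this.objectIsResource = objectIsResource;
    }

    String getSubject() { return subject; }
    String getPredicate() { return predicate; }
    String getObject() { return object; }
    boolean subjectIsPrefix() { return subjectIsPrefix; }
    boolean objectIsResource() { return objectIsResource; }

    // True if this statement describes the given resource URI:
    // a prefix match for aboutEachPrefix, an exact match otherwise.
    boolean appliesTo(String resourceUri) {
        return subjectIsPrefix
            ? resourceUri.startsWith(subject)
            : resourceUri.equals(subject);
    }
}
```

Because the set of matching URIs is open-ended, the statement cannot be
expanded into one statement per resource ahead of time. The test has to be
applied each time a resource is looked up.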
> 
> But wait -- there's more.  The RDF spec states that the 'xml:lang'
> attribute does not modify the data model, but rather, is a property of
> the (underspecified) literal.  Consider the following (RDF purists
> would prefer to use an rdf:Alt, but let's keep things simple):
> 
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>            xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>   <dc:Subject xml:lang="en">markup</dc:Subject>
>   <dc:Subject xml:lang="fr">balisage</dc:Subject>
>   </rdf:Description>
>   </rdf:RDF>
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Subject"
>   statement.getObject()        => "markup"
>   statement.subjectIsPrefix()  => true
>   statement.objectIsResource() => false
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Subject"
>   statement.getObject()        => "balisage"
>   statement.subjectIsPrefix()  => true
>   statement.objectIsResource() => false
> 
> The language distinction is missing from our model, so we have to add
> yet another property to the Java interface:
> 
>   public interface RDFStatement
>   {
>     public abstract String getSubject ();
>     public abstract String getPredicate ();
>     public abstract String getObject ();
>     public abstract boolean subjectIsPrefix ();
>     public abstract boolean objectIsResource ();
>     public abstract String getObjectLang ();
>   }
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Subject"
>   statement.getObject()        => "markup"
>   statement.subjectIsPrefix()  => true
>   statement.objectIsResource() => false
>   statement.getObjectLang()    => "en"
> 
>   statement.getSubject()       => "http://www.megginson.com/"
>   statement.getPredicate()     => "http://www.purl.org/dc#Subject"
>   statement.getObject()        => "balisage"
>   statement.subjectIsPrefix()  => true
>   statement.objectIsResource() => false
>   statement.getObjectLang()    => "fr"
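
As an illustration of why the language tag has to travel with the literal,
here is a small sketch that picks the value for a preferred language from
several candidates. LangLiteral and pick are hypothetical names. Matching
on the exact tag (ignoring case) is a simplifying assumption, not full
language-range matching.

```java
import java.util.List;
import java.util.Optional;

// Hypothetical pairing of a literal value with its xml:lang tag.
final class LangLiteral {
    final String value;
    final String lang;  // e.g. "en" or "fr"; null when no xml:lang applies

    LangLiteral(String value, String lang) {
        this.value = value;
        this.lang = lang;
    }

    // Returns the first candidate whose tag equals preferredLang, ignoring case.
    static Optional<String> pick(List<LangLiteral> candidates, String preferredLang) {
        for (LangLiteral candidate : candidates) {
            if (preferredLang.equalsIgnoreCase(candidate.lang)) {
                return Optional.of(candidate.value);
            }
        }
        return Optional.empty();
    }
}
```

Without getObjectLang(), the two dc:Subject statements would be
indistinguishable to code like this.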
> 
> We're still not done.  Take a look at the following:
> 
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>            xmlns:megg="http://www.megginson.com/ns#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>   <megg:poem rdf:parseType="Literal">
>    <poem>
>     <line>Roses are red,</line>
>     <line>Violets are blue</line>
>     <line>Sugar is sweet,</line>
>     <line>And I love you.</line>
>    </poem>
>   </megg:poem>
>   </rdf:Description>
>   </rdf:RDF>
> 
> Since the <megg:poem> element sets the 'rdf:parseType' attribute to
> "Literal", the contents of the element will not be interpreted as RDF
> markup.  As a result, the value of this statement is a literal string:
> 
>   statement.getObject() => "
>    <poem>
>     <line>Roses are red,</line>
>     <line>Violets are blue</line>
>     <line>Sugar is sweet,</line>
>     <line>And I love you.</line>
>    </poem>
> "
>   statement.objectIsResource() => false
> 
> If I were to round-trip this back to XML, however, how would I know
> that it was meant to be XML markup?  My software might just as easily
> generate the following:
> 
>   <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>            xmlns:megg="http://www.megginson.com/ns#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>   <megg:poem rdf:parseType="Literal">
>    &lt;poem&gt;
>     &lt;line&gt;Roses are red,&lt;/line&gt;
>     &lt;line&gt;Violets are blue&lt;/line&gt;
>     &lt;line&gt;Sugar is sweet,&lt;/line&gt;
>     &lt;line&gt;And I love you.&lt;/line&gt;
>    &lt;/poem&gt;
>   </megg:poem>
>   </rdf:Description>
>   </rdf:RDF>
> 
> This probably isn't what I want.  As a result, I have to add more
> information to my Java interface to note whether the literal value is
> meant to be read as XML markup:
> 
>   public interface RDFStatement
>   {
>     public abstract String getSubject ();
>     public abstract String getPredicate ();
>     public abstract String getObject ();
>     public abstract boolean subjectIsPrefix ();
>     public abstract boolean objectIsResource ();
>     public abstract boolean objectIsXML ();
>     public abstract String getObjectLang ();
>   }
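
The round-trip problem can be sketched as a serializer that consults the
new flag before writing a property element. LiteralWriter, escape, and
writeProperty are hypothetical helpers, and the sketch assumes the element
name is already a valid qualified name with its prefix declared elsewhere.

```java
// Hypothetical serializer for a single literal-valued property element.
final class LiteralWriter {
    // Escapes the characters that are markup-significant in XML character data.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    // Writes the value verbatim when it is meant to be XML markup,
    // and escapes it when it is plain text.
    static String writeProperty(String qName, String value, boolean objectIsXML) {
        if (objectIsXML) {
            return "<" + qName + " rdf:parseType=\"Literal\">" + value + "</" + qName + ">";
        }
        return "<" + qName + ">" + escape(value) + "</" + qName + ">";
    }
}
```

With the flag set, the poem's markup is written back as elements. Without
it, the same string would come out as escaped text, as in the unwanted
output shown earlier.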
> 
> At this point, it might make sense to split this out into different
> classes:
> 
>   public interface RDFComponent
>   {
>     public abstract String getValue ();
>   }
> 
>   public interface RDFSubject extends RDFComponent
>   {
>     public abstract boolean isPrefix ();
>   }
> 
>   public interface RDFPredicate extends RDFComponent
>   {
>   }
> 
>   public interface RDFObject extends RDFComponent
>   {
>     public abstract boolean isResource ();
>     public abstract boolean isXML ();
>     public abstract String getLang ();
>   }
> 
>   public interface RDFStatement
>   {
>     public abstract RDFSubject getSubject ();
>     public abstract RDFPredicate getPredicate ();
>     public abstract RDFObject getObject ();
>   }
> 
> Obviously, there's a much more complex model underlying RDF than the
> spec lets on, and that model affects not only the ease or difficulty
> of implementing an object model, but also the difficulty of many
> standard operations like queries against a collection of RDF
> statements and storage in a relational database.
> 
> I'd love to hear from others on this list who've worked with RDF.
> It's full of some very good ideas, but I'm afraid that the underlying
> (and hidden) conceptual complexity might stunt any serious
> implementation.
> 
> All the best,
> 
> David
> 
> --
> David Megginson                 david at megginson.com
>            http://www.megginson.com/
> 
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
> To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)

