Short Essay: Squeezing RDF into a Java Object Model
Roger L. Costello
costello at mitre.org
Mon May 3 20:29:29 BST 1999
David,
I see where you are going with this - develop an API for RDF. Out of
curiosity, why isn't the SAX API adequate? After all, RDF is just XML.
Let the application deal with it. /Roger
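
For illustration, here is roughly what "letting the application deal with it" through SAX could look like for the simplest case: one rdf:Description with literal-valued properties. This is only a sketch under assumptions. The class name TripleHandler and the String[] triple representation are made up for this example, and it deliberately ignores rdf:resource, aboutEachPrefix, xml:lang, and parseType.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects {subject, predicate, object} triples for
// literal-valued properties only.
class TripleHandler extends DefaultHandler {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    final List<String[]> triples = new ArrayList<String[]>();
    private String subject;    // "about" value of the enclosing Description
    private String predicate;  // namespace URI + local name of the open property
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        if (RDF_NS.equals(uri) && "Description".equals(local)) {
            // The attribute is unqualified in the examples, so its namespace is "".
            subject = atts.getValue("", "about");
        } else if (subject != null && !RDF_NS.equals(uri)) {
            predicate = uri + local;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        if (predicate != null) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        if (predicate != null) {
            triples.add(new String[] { subject, predicate, text.toString() });
            predicate = null;
        } else if (RDF_NS.equals(uri) && "Description".equals(local)) {
            subject = null;
        }
    }

    // Parses an RDF/XML string with a namespace-aware SAX parser.
    static List<String[]> parse(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            TripleHandler handler = new TripleHandler();
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
            return handler.triples;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
}
```

Even this much needs namespace handling and a small state machine, and it would report a property carrying rdf:resource as an empty literal.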
David Megginson wrote:
>
> The more I work with RDF, the more I find it fascinating in the
> abstract but annoying in the concrete.
>
> The biggest problem is that RDF claims an extremely simple data model
>
> statement: subject, predicate, object
>
> but the model does not even come close to describing what
> information actually appears in an RDF statement. Let's start with
> the most naive mapping into a Java object model:
>
> public interface RDFStatement
> {
>   public abstract String getSubject ();
>   public abstract String getPredicate ();
>   public abstract String getObject ();
> }
>
> This will work fine for something like the following:
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description about="http://www.megginson.com/">
>     <dc:Title>Megginson Technologies</dc:Title>
>   </rdf:Description>
> </rdf:RDF>
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Title"
> statement.getObject() => "Megginson Technologies"
>
> However, it falls apart quickly when the value of the property is a
> resource:
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description about="http://www.megginson.com/">
>     <dc:Creator rdf:resource="http://home.sprynet.com/sprynet/dmeggins/"/>
>   </rdf:Description>
> </rdf:RDF>
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Creator"
> statement.getObject() => "http://home.sprynet.com/sprynet/dmeggins/"
>
> In the first case, the object was a literal, and in the second case,
> the object is a resource; however, the naive interface does not make
> this information available. The only solution is to add a new
> property to the Java interface:
>
> public interface RDFStatement
> {
>   public abstract String getSubject ();
>   public abstract String getPredicate ();
>   public abstract String getObject ();
>   public abstract boolean objectIsResource ();
> }
>
> Now, for the first example, we have
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Title"
> statement.getObject() => "Megginson Technologies"
> statement.objectIsResource() => false
>
> and for the second example, we have
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Creator"
> statement.getObject() => "http://home.sprynet.com/sprynet/dmeggins/"
> statement.objectIsResource() => true
>
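A minimal sketch of how the extra flag might be carried and consumed. The interface is the one quoted above; SimpleStatement, its factory methods, and ResourceFlagDemo are hypothetical names invented for this example.

```java
// The interface from the post (the "public abstract" modifiers are implicit).
interface RDFStatement {
    String getSubject();
    String getPredicate();
    String getObject();
    boolean objectIsResource();
}

// Hypothetical immutable implementation, assumed for this sketch only.
final class SimpleStatement implements RDFStatement {
    private final String subject;
    private final String predicate;
    private final String object;
    private final boolean resource;

    private SimpleStatement(String subject, String predicate, String object, boolean resource) {
        this.subject = subject;
        this.predicate = predicate;
        this.object = object;
        this.resource = resource;
    }

    // The object is plain text.
    static SimpleStatement withLiteral(String subject, String predicate, String text) {
        return new SimpleStatement(subject, predicate, text, false);
    }

    // The object is the URI of another resource.
    static SimpleStatement withResource(String subject, String predicate, String uri) {
        return new SimpleStatement(subject, predicate, uri, true);
    }

    public String getSubject() { return subject; }
    public String getPredicate() { return predicate; }
    public String getObject() { return object; }
    public boolean objectIsResource() { return resource; }
}

class ResourceFlagDemo {
    // Any consumer has to branch on the flag, because getObject() alone
    // cannot tell a URI from a string that merely looks like one.
    static String describe(RDFStatement st) {
        return st.objectIsResource()
            ? "resource <" + st.getObject() + ">"
            : "literal \"" + st.getObject() + "\"";
    }

    public static void main(String[] args) {
        System.out.println(describe(SimpleStatement.withLiteral(
            "http://www.megginson.com/", "http://www.purl.org/dc#Title",
            "Megginson Technologies")));
        System.out.println(describe(SimpleStatement.withResource(
            "http://www.megginson.com/", "http://www.purl.org/dc#Creator",
            "http://home.sprynet.com/sprynet/dmeggins/")));
    }
}
```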
> Unfortunately, we're not nearly through yet. The next nasty bit comes
> from the aboutEachPrefix attribute. For example, here's a modified
> version of the first example:
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>     <dc:Title>Megginson Technologies</dc:Title>
>   </rdf:Description>
> </rdf:RDF>
>
> Now, this description no longer applies just to
> http://www.megginson.com/, but to *all* resources whose URIs begin
> with http://www.megginson.com/ (a constantly-changing set, and, in the
> case of CGIs or Servlets, potentially infinite). As a result, the
> following information is no longer sufficient:
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Title"
> statement.getObject() => "Megginson Technologies"
> statement.objectIsResource() => false
>
> We need to modify the interface once again:
>
> public interface RDFStatement
> {
>   public abstract String getSubject ();
>   public abstract String getPredicate ();
>   public abstract String getObject ();
>   public abstract boolean subjectIsPrefix ();
>   public abstract boolean objectIsResource ();
> }
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Title"
> statement.getObject() => "Megginson Technologies"
> statement.subjectIsPrefix() => true
> statement.objectIsResource() => false
>
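Because the set of matching resources cannot be enumerated, about the only thing an application can do with a prefix subject is test a given URI against it. A rough sketch, assuming a plain string-prefix comparison is what is intended; SubjectPattern, appliesTo, and PrefixDemo are hypothetical names, not part of any RDF API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical value class mirroring the subject-related part of the interface.
final class SubjectPattern {
    private final String subject;
    private final boolean prefix;

    SubjectPattern(String subject, boolean prefix) {
        this.subject = subject;
        this.prefix = prefix;
    }

    String getSubject() { return subject; }
    boolean subjectIsPrefix() { return prefix; }

    // True if a statement with this subject describes the given resource.
    // Assumption: prefix matching is a simple character-wise startsWith.
    boolean appliesTo(String resourceUri) {
        return prefix ? resourceUri.startsWith(subject) : resourceUri.equals(subject);
    }
}

class PrefixDemo {
    // Filters a known list of URIs down to those the pattern describes.
    static List<String> described(SubjectPattern pattern, List<String> uris) {
        List<String> result = new ArrayList<String>();
        for (String uri : uris) {
            if (pattern.appliesTo(uri)) {
                result.add(uri);
            }
        }
        return result;
    }
}
```

Note that a lookup keyed on the exact subject string no longer works once prefix subjects exist: every query for a resource has to consider each prefix statement as well.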
> But wait -- there's more. The RDF spec states that the 'xml:lang'
> attribute does not modify the data model, but rather, is a property of
> the (underspecified) literal. Consider the following (RDF purists
> would prefer to use an rdf:Alt, but let's keep things simple):
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:dc="http://www.purl.org/dc#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>     <dc:Subject xml:lang="en">markup</dc:Subject>
>     <dc:Subject xml:lang="fr">balisage</dc:Subject>
>   </rdf:Description>
> </rdf:RDF>
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Subject"
> statement.getObject() => "markup"
> statement.subjectIsPrefix() => true
> statement.objectIsResource() => false
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Subject"
> statement.getObject() => "balisage"
> statement.subjectIsPrefix() => true
> statement.objectIsResource() => false
>
> The language distinction is missing from our model, so we have to add
> yet another property to the Java interface:
>
> public interface RDFStatement
> {
>   public abstract String getSubject ();
>   public abstract String getPredicate ();
>   public abstract String getObject ();
>   public abstract boolean subjectIsPrefix ();
>   public abstract boolean objectIsResource ();
>   public abstract String getObjectLang ();
> }
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Subject"
> statement.getObject() => "markup"
> statement.subjectIsPrefix() => true
> statement.objectIsResource() => false
> statement.getObjectLang() => "en"
>
> statement.getSubject() => "http://www.megginson.com/"
> statement.getPredicate() => "http://www.purl.org/dc#Subject"
> statement.getObject() => "balisage"
> statement.subjectIsPrefix() => true
> statement.objectIsResource() => false
> statement.getObjectLang() => "fr"
>
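With the language recorded, an application can at least choose between the two otherwise identical statements. A small sketch, assuming a case-insensitive comparison of the xml:lang value is good enough; LangLiteral and LangDemo are hypothetical names used only here.

```java
import java.util.List;

// Hypothetical pairing of a literal object with its xml:lang value,
// corresponding to getObject() and getObjectLang() above.
final class LangLiteral {
    final String object;
    final String objectLang; // null when no xml:lang was given

    LangLiteral(String object, String objectLang) {
        this.object = object;
        this.objectLang = objectLang;
    }
}

class LangDemo {
    // Returns the first literal tagged with the requested language,
    // or null if none of the candidates carries that tag.
    static String pick(List<LangLiteral> candidates, String lang) {
        for (LangLiteral candidate : candidates) {
            if (candidate.objectLang != null && candidate.objectLang.equalsIgnoreCase(lang)) {
                return candidate.object;
            }
        }
        return null;
    }
}
```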
> We're still not done. Take a look at the following:
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:megg="http://www.megginson.com/ns#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>     <megg:poem rdf:parseType="Literal">
>       <poem>
>         <line>Roses are red,</line>
>         <line>Violets are blue</line>
>         <line>Sugar is sweet,</line>
>         <line>And I love you.</line>
>       </poem>
>     </megg:poem>
>   </rdf:Description>
> </rdf:RDF>
>
> Since the <megg:poem> element sets the 'rdf:parseType' attribute to
> "Literal", the contents of the element will not be interpreted as RDF
> markup. As a result, the value of this statement is a literal string:
>
> statement.getObject() => "
> <poem>
> <line>Roses are red,</line>
> <line>Violets are blue</line>
> <line>Sugar is sweet,</line>
> <line>And I love you.</line>
> </poem>
> "
> statement.objectIsResource() => false
>
> If I were to round-trip this back to XML, however, how would I know
> that it was meant to be XML markup? My software might just as easily
> generate the following:
>
> <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
>          xmlns:megg="http://www.megginson.com/ns#">
>   <rdf:Description aboutEachPrefix="http://www.megginson.com/">
>     <megg:poem>
>       &lt;poem&gt;
>       &lt;line&gt;Roses are red,&lt;/line&gt;
>       &lt;line&gt;Violets are blue&lt;/line&gt;
>       &lt;line&gt;Sugar is sweet,&lt;/line&gt;
>       &lt;line&gt;And I love you.&lt;/line&gt;
>       &lt;/poem&gt;
>     </megg:poem>
>   </rdf:Description>
> </rdf:RDF>
>
> This probably isn't what I want. As a result, I have to add more
> information to my Java interface to note whether the literal value is
> meant to be read as XML markup:
>
> public interface RDFStatement
> {
>   public abstract String getSubject ();
>   public abstract String getPredicate ();
>   public abstract String getObject ();
>   public abstract boolean subjectIsPrefix ();
>   public abstract boolean objectIsResource ();
>   public abstract boolean objectIsXML ();
>   public abstract String getObjectLang ();
> }
>
> At this point, it might make sense to split this out into different
> classes:
>
> public interface RDFComponent
> {
>   public abstract String getValue ();
> }
>
> public interface RDFSubject extends RDFComponent
> {
>   public abstract boolean isPrefix ();
> }
>
> public interface RDFPredicate extends RDFComponent
> {
> }
>
> public interface RDFObject extends RDFComponent
> {
>   public abstract boolean isResource ();
>   public abstract boolean isXML ();
>   public abstract String getLang ();
> }
>
> public interface RDFStatement
> {
>   public abstract RDFSubject getSubject ();
>   public abstract RDFPredicate getPredicate ();
>   public abstract RDFObject getObject ();
> }
>
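To make the round-trip concern concrete, here is one way a serializer might use the isResource and isXML flags when writing a property back out. It is a sketch only: SimpleObject and PropertyWriter are hypothetical, the escaping is the bare minimum, and the caller is assumed to supply a qualified name whose prefix is already declared.

```java
// The two interfaces from the post that the sketch needs.
interface RDFComponent {
    String getValue();
}

interface RDFObject extends RDFComponent {
    boolean isResource();
    boolean isXML();
}

// Hypothetical immutable implementation, assumed for this sketch only.
final class SimpleObject implements RDFObject {
    private final String value;
    private final boolean resource;
    private final boolean xml;

    SimpleObject(String value, boolean resource, boolean xml) {
        this.value = value;
        this.resource = resource;
        this.xml = xml;
    }

    public String getValue() { return value; }
    public boolean isResource() { return resource; }
    public boolean isXML() { return xml; }
}

class PropertyWriter {
    // Escapes character data; "&" must be replaced first.
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    static String escapeAttribute(String s) {
        return escape(s).replace("\"", "&quot;");
    }

    // Writes one property element for the given object.
    static String write(String qName, RDFObject object) {
        if (object.isResource()) {
            return "<" + qName + " rdf:resource=\""
                + escapeAttribute(object.getValue()) + "\"/>";
        }
        if (object.isXML()) {
            // Markup is emitted verbatim and flagged so it is not read as RDF.
            return "<" + qName + " rdf:parseType=\"Literal\">"
                + object.getValue() + "</" + qName + ">";
        }
        // Plain text: markup characters must be escaped.
        return "<" + qName + ">" + escape(object.getValue()) + "</" + qName + ">";
    }
}
```

Without the isXML flag the writer has no basis for choosing between the last two branches, which is exactly the ambiguity described above.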
> Obviously, there's a much more complex model underlying RDF than the
> spec lets on, and that model affects not only the ease or difficulty
> of implementing an object model, but also the difficulty of many
> standard operations like queries against a collection of RDF
> statements and storage in a relational database.
>
> I'd love to hear from others on this list who've worked with RDF.
> It's full of some very good ideas, but I'm afraid that the underlying
> (and hidden) conceptual complexity might stunt any serious
> implementation.
>
> All the best,
>
> David
>
> --
> David Megginson david at megginson.com
> http://www.megginson.com/
>
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
> To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)