XML-LINK

Peter Murray-Rust Peter at ursus.demon.co.uk
Sat May 31 01:12:08 BST 1997


I am trying to understand how XML-LINK might be used and would be
grateful for some gentle hints.  The motivation is to develop a
set of routines in JUMBO that are generic and will support a reasonable
variety of ways in which links might be used.  I am confident that there are
readers of this list who have clear ideas of how links might be used and
I hope they can spend a few minutes to give some *very* simple guidance.

<DISCLAIMER>
As we are all aware the XML-LINK spec is in early draft and is scheduled for
revision before July 1.  It is also widely agreed that some of the
terminology needs tightening and that some details of the syntax and the 
semantics need addressing.  So only a general approach is required.
</DISCLAIMER>

I hope it will be seen as helpful if I put forward my current understanding
of what XML-LINK is intended to do, and XML-DEVers can annotate my ramblings.
[They can use XML-LINK to do it :-), accepting that we have no means of 
addressing into my content.]  So here goes...

A link has ends which are called resources.  My current understanding is
that these can be thought of as points in the structure of a document, and
will often coincide with Elements.  I am as yet unclear about the total 
number of possible topologies of a link, and ask some questions here.

Structure and Behaviour.

My understanding is that a hyperdocument can have a link structure which is
independent of behaviour - it simply represents the structure of the 
information.  I'm happy with this - what I'm less clear about is whether
there are *commonly agreed semantics* for this, or whether it's all
application-dependent.  [If the answer to all my concerns is 'application-
dependent' then it will be a pity because everyone will write individual
link processors and there will be no reusability.]  I'm aware that all these
concerns are catered for by HyTime, but since I am ignorant of HyTime,
answers which refer to that won't be much use to me - ideally they should
be in the context of the current spec.

Thus I assume we can transmit structures like DAGs, linked lists, relational
tables, etc. by the use of XML-LINK without being concerned how they
are going to behave.  At this stage I'd like simply to address structure.

SIMPLE
The simplest link is XML-LINK="SIMPLE" and is an analogue of HTML's <A>
or <IMG>.  My view of it is exemplified by this fictitious XML
document:

<P>This is <A HREF="#foo" ID="A">resource A</A> which points to
<FOO ID="foo">the foo bird</FOO> (see picture 
<IMG HREF="foo.gif" TITLE="foo bird" ACTUATE="AUTO" SHOW="EMBED" ID="gif"/>)
</P>

Here there are two links, both being unidirectional.  I understand that the 
ends of the first link are the 'point' described by 'ID=A', and the point
described by ID=foo (though this is still being discussed).  If this is true,
then in a **tree-based** tool like JUMBO the ends of the link correspond
to nodes in the tree (labelled by ID=A and ID=foo).  The second link is harder
because the resource in foo.gif is not clear (perhaps it is the inode in
the UNIX system?).  

I have (I believe) implemented SIMPLE links in JUMBO.  Each Node has a method
isLink() which says whether it's the start of a SIMPLE link.  (I may have to
change this nomenclature when the other links become clearer.)  So, for
example, when process()ing a Node, JUMBO checks whether it isLink() and, if
so, what it points at (the value of HREF).  It seems to work.
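
As a rough illustration of the isLink()/HREF lookup described above, here is
a minimal sketch in Python (not JUMBO's actual code).  It assumes that any
element carrying an HREF attribute is the start of a SIMPLE link, that
'#name' means "the element in this document with ID=name", and that the IMG
tag is closed so the example parses.  The names is_link and resolve are
hypothetical.

```python
import xml.etree.ElementTree as ET

# The example document from the text, with the empty IMG element closed.
SOURCE = """<P>This is <A HREF="#foo" ID="A">resource A</A> which points to
<FOO ID="foo">the foo bird</FOO> (see picture
<IMG HREF="foo.gif" TITLE="foo bird" ACTUATE="AUTO" SHOW="EMBED" ID="gif"/>)
</P>"""


def is_link(node):
    """Assumption: a node starts a SIMPLE link if it carries an HREF."""
    return "HREF" in node.attrib


def resolve(root, node):
    """Return the target node for a '#id' HREF within this document.

    For anything else (e.g. foo.gif) the raw HREF string is returned,
    since what the resource 'is' is left open in the text.
    """
    href = node.get("HREF")
    if href.startswith("#"):
        by_id = {n.get("ID"): n for n in root.iter() if n.get("ID")}
        return by_id.get(href[1:])
    return href


root = ET.fromstring(SOURCE)
# Map the ID of each link start to whatever it points at.
targets = {n.get("ID"): resolve(root, n) for n in root.iter() if is_link(n)}
```

In this sketch targets["A"] is the FOO node itself, while targets["gif"] is
only the string "foo.gif", which mirrors the uncertainty about what the
resource at the far end of the second link really is.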

Note that in this model, the resource which is pointed to (ID=foo, or foo.gif)
is not required by XML-LINK to know anything about the link.  I assume it could
be argued both ways whether the pointedAt should or should not know what is
pointing at it.  [SHOW and ACTUATE are deliberately not discussed, although I
think they are straightforward (at least compared to EXTENDED).]


EXTENDED

EXTENDED is a container for an indefinite number of LOCATOR links.  [LOCATOR
has exactly the same syntax as SIMPLE but presumably has different
semantics.]  EXTENDED does not by itself define a resource and is normally
remote from the resources.  

I can see how a bi-directional link might be constructed from EXTENDED.
[It's the other multiplicities I don't feel so happy with.]  Does this 
example capture it?  

<P> Friends, Romans, Countrymen, <WORD ID="W1">lend</WORD> me your 
<WORD ID="W2">ears</WORD>.</P>
...
<ANNOTATION XML-LINK="EXTENDED" ID="link1">
<POINTER XML-LINK="LOCATOR" HREF="#W1" ROLE="verb"/>
<POINTER XML-LINK="LOCATOR" HREF="#W2" ROLE="noun"/>
</ANNOTATION>
...
We therefore have a bidirectional link between the verb and the noun, so
that each of them can locate the other.  Therefore, in JUMBO, there
has to be a pointer which is available to each Node.  My temptation would be
for each node to carry a hashtable of links to other nodes so that (say)
when W1 was asked what it linked to it would come up with a list of the
Nodes at the other end of its links.  W2 would be such a node.  On the other
hand it might point to the LINK (i.e. link1), and it might be clear from the
'contents' of link1 what the other end was.  Is this too restricted?
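
To make the two options concrete, here is a minimal Python sketch (not
JUMBO's code) of a per-resource table built from EXTENDED links.  Each entry
records both the link's ID and the resource at the other end, so either
lookup style is available.  It assumes the POINTER tags are closed, that the
fragments sit under a hypothetical DOC root, and that every resource in a
link can reach every other one.  The name link_table is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# The example from the text, wrapped in a hypothetical <DOC> root.
SOURCE = """<DOC>
<P>Friends, Romans, Countrymen, <WORD ID="W1">lend</WORD> me your
<WORD ID="W2">ears</WORD>.</P>
<ANNOTATION XML-LINK="EXTENDED" ID="link1">
<POINTER XML-LINK="LOCATOR" HREF="#W1" ROLE="verb"/>
<POINTER XML-LINK="LOCATOR" HREF="#W2" ROLE="noun"/>
</ANNOTATION>
</DOC>"""


def link_table(root):
    """Map each resource ID to a list of (link ID, other resource ID)."""
    table = defaultdict(list)
    for link in root.iter():
        if link.get("XML-LINK") != "EXTENDED":
            continue
        # Collect the IDs named by the LOCATOR children of this link.
        ends = [loc.get("HREF").lstrip("#") for loc in link
                if loc.get("XML-LINK") == "LOCATOR"]
        # Assumption: every end of the link can locate every other end.
        for end in ends:
            for other in ends:
                if other != end:
                    table[end].append((link.get("ID"), other))
    return table


table = link_table(ET.fromstring(SOURCE))
```

Asking the table about W1 yields link1 together with W2, and asking about W2
yields link1 together with W1, so each word can locate the other without
either WORD element carrying any link markup itself.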

I am not clear how this extends to 'multidirectional links'.  Here is a typical
problem.

to <WORD ID="W3">bear</WORD> the <WORD ID="W4">slings</WORD> and 
<WORD ID="W5">arrows</WORD> of
...
<ANNOTATION XML-LINK="EXTENDED" ID="link2">
<POINTER XML-LINK="LOCATOR" HREF="#W3" ROLE="verb"/>
<POINTER XML-LINK="LOCATOR" HREF="#W4" ROLE="noun"/>
<POINTER XML-LINK="LOCATOR" HREF="#W5" ROLE="noun"/>
</ANNOTATION>
...
Here I want to indicate that the verb 'bear' links to two nouns at the
same time and that each noun points to 'bear'.  But it isn't obvious that
this is the case (unless perhaps ROLE is used for that, and that doesn't
seem general).  The topology can be seen as a multidirectional link, with
a single 'end' and a double 'end'  (W3<-->(W4,W5)).  Alternatively it can 
be seen as two bidirectional links grouped together ((W3<-->W4),(W3<-->W5)).  
In either case I don't think I have captured this sufficiently well that it 
is capable of being automatically or semi-automatically processed.
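
The ambiguity can be shown with a small Python sketch that reads link2 in two
ways.  Neither reading is taken from the spec.  The first treats every pair
of resources as linked.  The second relies on a purely hypothetical
convention that resources with ROLE="verb" act as a hub and pair only with
the remaining resources.  The POINTER tags are closed and the fragment is
wrapped in a hypothetical DOC root so that it parses.

```python
import xml.etree.ElementTree as ET
from itertools import combinations

# The link2 example from the text, wrapped in a hypothetical <DOC> root.
SOURCE = """<DOC>
<P>to <WORD ID="W3">bear</WORD> the <WORD ID="W4">slings</WORD> and
<WORD ID="W5">arrows</WORD> of</P>
<ANNOTATION XML-LINK="EXTENDED" ID="link2">
<POINTER XML-LINK="LOCATOR" HREF="#W3" ROLE="verb"/>
<POINTER XML-LINK="LOCATOR" HREF="#W4" ROLE="noun"/>
<POINTER XML-LINK="LOCATOR" HREF="#W5" ROLE="noun"/>
</ANNOTATION>
</DOC>"""


def locators(link):
    """Return (resource ID, ROLE) for each LOCATOR child of a link."""
    return [(loc.get("HREF").lstrip("#"), loc.get("ROLE"))
            for loc in link if loc.get("XML-LINK") == "LOCATOR"]


def all_pairs(ends):
    """Reading 1: every resource is linked to every other resource."""
    return {frozenset((a, b)) for (a, _), (b, _) in combinations(ends, 2)}


def by_role(ends, hub_role="verb"):
    """Reading 2 (hypothetical convention): hub_role ends pair with the rest."""
    hubs = [i for i, role in ends if role == hub_role]
    rest = [i for i, role in ends if role != hub_role]
    return {frozenset((h, o)) for h in hubs for o in rest}


root = ET.fromstring(SOURCE)
link2 = next(n for n in root.iter() if n.get("XML-LINK") == "EXTENDED")
ends = locators(link2)
extra = all_pairs(ends) - by_role(ends)  # pairs only the first reading has
```

The two readings differ only in the W4<-->W5 pair: the first reading links
the two nouns to each other, the second does not.  Nothing in the markup
itself says which is intended, which is exactly the difficulty for generic
code.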

Guidance would be gratefully received, particularly if it makes it clear 
whether there is a generic way of supporting this in code.

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
