XML-LINK
Peter Murray-Rust
Peter at ursus.demon.co.uk
Wed Jun 25 23:53:37 BST 1997
I recently posted some concerns about XML-LINK on XML-WG and it was suggested
that XML-DEV would be more appropriate; I agree. The main question is to
what extent *generic* XML-link-processors can be built which are
application-independent. They would rely totally on the XML-LINK spec for their
implementation. I have been rereading Eliot Kimber's posting of 1997-05-31
on this list and have found it extremely helpful. Since no-one has challenged
any of the ideas or terminology there, I shall take that as a reference point
and try to use his terms consistently. (I am aware that July 1 may bring
additional clarification, but discussion here will help).
In many respects, XML-LINK behaviour has parallels to our discussions on
XML-LANG APIs. The draft specifies what goes in, but leaves 'what comes out'
more fluid. It's critical that we have consistent terminology in all of these
endeavours and outline the areas of complete agreement.
My primary concern is with the terms 'resource' and 'embed', where I believe
there is scope for added precision, and where I am not clear that all the
discussion on XML-WG about these has been consistent with Eliot's document.
It seems clear that link traversal requires us to have parsed documents in
'memory' (this could also mean persistent storage, etc.). A link connects
'nodes in trees' and 'resource' is essentially synonymous with 'node' (EK, P1.).
I will build a simple example, and then ask how it might be implemented:
a.xml:
<P>This is <A ID="A" XML-LINK="SIMPLE" HREF="b.xml#ID(B1)" TITLE="l1">
a <B>link</B></A> in a paragraph</P>
can be parsed to a tree (I use '-' to indicate childOf in a TOC-like structure,
and PC(string) indicates a child with #PCDATA content; whitespace problems
are ignored):
P
-PC(This is )
-A
--PC(a )
--B
---PC(link)
-PC( in a paragraph)
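
For anyone who wants to reproduce the outline above mechanically, here is a minimal sketch in Python using the standard xml.etree.ElementTree module. The function name outline() is my own invention, not part of any XML-LINK tool, and the sketch collapses runs of whitespace (so 'This is ' comes out as 'This is'), which is one arbitrary answer to the whitespace problems I am ignoring.

```python
import xml.etree.ElementTree as ET

# a.xml, exactly as given in the example above.
A_XML = '''<P>This is <A ID="A" XML-LINK="SIMPLE" HREF="b.xml#ID(B1)" TITLE="l1">
a <B>link</B></A> in a paragraph</P>'''


def pc(text, depth):
    """Format a #PCDATA run as PC(...), with whitespace collapsed (an assumption)."""
    return "-" * depth + "PC(" + " ".join(text.split()) + ")"


def outline(elem, depth=0):
    """Return the '-' outline: one line per element node or #PCDATA child."""
    lines = ["-" * depth + elem.tag]
    # Text before the first child element is the first #PCDATA child.
    if elem.text and elem.text.strip():
        lines.append(pc(elem.text, depth + 1))
    for child in elem:
        lines.extend(outline(child, depth + 1))
        # Text following a child element is a further #PCDATA child of elem.
        if child.tail and child.tail.strip():
            lines.append(pc(child.tail, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(A_XML))))
```

The point is only that the tree is a small set of nodes, any of which could serve as the 'point' at which a link is anchored.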
Now, from Eliot's posting I identify the node A as the resource at one end of
the link l1. The content of A is not relevant to the resource, since a node
is a point. [However some XML-WG postings seemed to imply that the content
of A is a resource, which is at variance with Eliot's explanation.]
For EXTENDED, INLINE="TRUE" I am less clear what the resource is in:
<MYLINK XML-LINK="EXTENDED" ID="family" INLINE="true">
<P>Here is the
<A XML-LINK="LOCATOR" ID="father" HREF="b.xml#ID(father)">father</A> and the
<A XML-LINK="LOCATOR" ID="mother" HREF="b.xml#ID(mother)">mother</A> and the
<A XML-LINK="LOCATOR" ID="baby" HREF="b.xml#ID(baby)">baby</A>
</P>
</MYLINK>
which parses to:
MYLINK
-P
--PC(Here is the)
--A
---PC(father)
--PC(and the)
--A
---PC(mother)
--PC(and the)
--A
---PC(baby)
Now this is a single link, with (presumably) a single end at the INLINE end.
So does this mean that the 'resource' of this link is the MYLINK node
with ID=family? Or are there three 'resources' at this end, the A nodes with
IDs of 'father', 'mother' and 'baby'?
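
To make the two readings concrete, here is a hedged sketch in Python that simply enumerates both. The function candidate_local_resources() and the labels 'single' and 'per_locator' are hypothetical names of mine; nothing here decides which reading the XML-LINK draft intends.

```python
import xml.etree.ElementTree as ET

# The EXTENDED link from the example above.
EXTENDED = '''<MYLINK XML-LINK="EXTENDED" ID="family" INLINE="true">
<P>Here is the
<A XML-LINK="LOCATOR" ID="father" HREF="b.xml#ID(father)">father</A> and the
<A XML-LINK="LOCATOR" ID="mother" HREF="b.xml#ID(mother)">mother</A> and the
<A XML-LINK="LOCATOR" ID="baby" HREF="b.xml#ID(baby)">baby</A>
</P>
</MYLINK>'''


def locators(link):
    """All descendants marked XML-LINK="LOCATOR", in document order."""
    return [e for e in link.iter() if e.get("XML-LINK") == "LOCATOR"]


def candidate_local_resources(link):
    """Return both readings of 'the resource at the inline end' as lists of IDs."""
    return {
        # Reading 1: the linking element itself is the one local resource.
        "single": [link.get("ID")],
        # Reading 2: each LOCATOR element is a local resource in its own right.
        "per_locator": [e.get("ID") for e in locators(link)],
    }


link = ET.fromstring(EXTENDED)
readings = candidate_local_resources(link)
remote_ends = [e.get("HREF") for e in locators(link)]

print(readings)
print(remote_ends)
```

A generic link processor has to pick one of these lists before it can report 'the ends of the link' to an application, which is why I think the question matters.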
*-*-*-*
Now for the other end of the link, and EMBED. I have implemented EMBED in JUMBO
like IMG in HTML:
<A HREF="foo.xml#ID(MOL)" SHOW="EMBED"/>
would locate the ID=MOL in foo.xml, process() it to create an object, which
would then display() itself in the document at the position where the A link
would be rendered. But I am more concerned about when the located node is
a (sub)tree which [XML-LINK] 'should be embedded, for the purposes of display
or processing in the body of the resource and at the location where the
traversal started'.
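
The locate-then-display step described above can be sketched roughly as follows. This is not JUMBO's code: the DOCUMENTS table and the content of foo.xml are invented stand-ins (the real foo.xml is not shown here), only the doc#ID(name) form of locator is handled, and the 'object' that displays itself is reduced to the text content of the located node.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical document store; the content of foo.xml is made up for illustration.
DOCUMENTS = {
    "foo.xml": '<DOC><MOL ID="MOL">benzene</MOL><MOL ID="OTHER">water</MOL></DOC>',
}

# Assumption: only locators of the form  document#ID(name)  are recognised.
LOCATOR = re.compile(r"^(?P<doc>[^#]+)#ID\((?P<id>[^)]+)\)$")


def locate(href):
    """Return the element with the requested ID, or None if it is not found."""
    match = LOCATOR.match(href)
    if match is None:
        raise ValueError("unsupported locator: " + href)
    root = ET.fromstring(DOCUMENTS[match.group("doc")])
    for elem in root.iter():
        if elem.get("ID") == match.group("id"):
            return elem
    return None


def embed(link):
    """Return what would be shown in place of the link, or None if nothing is embedded."""
    if link.get("SHOW") != "EMBED":
        return None
    target = locate(link.get("HREF"))
    if target is None:
        return None
    # Stand-in for process()/display(): just the text content of the located node.
    return "".join(target.itertext())


link = ET.fromstring('<A HREF="foo.xml#ID(MOL)" SHOW="EMBED"/>')
print(embed(link))
```

This covers the IMG-like case, where the located node is turned into a single displayable object. It says nothing yet about what to do when the located node is a subtree that has to be processed in place.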
Take the first example (a.xml), which links to b.xml, and assume b.xml
contains:
b.xml:
<P>This is <NODE ID="B1">
a <B>node</B></NODE> in a paragraph</P>
which is parsed to:
P
-PC(This is )
-NODE
--PC(a )
--B
---PC(node)
-PC( in a paragraph)
The link from ID="A" in a.xml links to node ID="B1" in b.xml. In one
interpretation, that's it - 'embed'ding is up to the application. But is there
any reasonable default behaviour?
(A) it could be traversed as if it were physically part of the a.xml document
(i.e. as if NODE were a child of A, and presumably the eldest sibling). The
processor would encounter A, process the node (only), find it had a LINK,
process that, then find A had content and process that. Note that the content
of A remains, and it would be application-dependent whether the *content* of A
was hidden or remained visible.
(B) Nothing happens unless BEHAVIOR is set. In that case, are there reasonable
values for it? And does the concept of embedding have any meaning?
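
Interpretation (A) amounts to a particular traversal order, which can be sketched as below. This is only my reading of (A), not behaviour defined by the XML-LINK draft: the names find_by_id(), resolve() and traverse() are hypothetical, and resolve() understands nothing beyond the b.xml#ID(...) locator used in this example.

```python
import xml.etree.ElementTree as ET

A_XML = '''<P>This is <A ID="A" XML-LINK="SIMPLE" HREF="b.xml#ID(B1)" TITLE="l1">
a <B>link</B></A> in a paragraph</P>'''

# b.xml, with NODE closed by </NODE> so that it is well-formed.
B_XML = '''<P>This is <NODE ID="B1">
a <B>node</B></NODE> in a paragraph</P>'''

B_ROOT = ET.fromstring(B_XML)


def find_by_id(root, wanted):
    """Return the first element whose ID attribute equals wanted, else None."""
    return next((e for e in root.iter() if e.get("ID") == wanted), None)


def resolve(link):
    """Assumption: only HREFs of the form b.xml#ID(name) are resolved."""
    href = link.get("HREF", "")
    prefix = "b.xml#ID("
    if href.startswith(prefix) and href.endswith(")"):
        return find_by_id(B_ROOT, href[len(prefix):-1])
    return None


def traverse(elem, order):
    """Visit elem, then any linked subtree, then elem's own children."""
    order.append((elem.tag, (elem.text or "").strip()))
    if elem.get("XML-LINK") == "SIMPLE":
        target = resolve(elem)
        if target is not None:
            # (A): the located subtree behaves as the eldest child of the link.
            traverse(target, order)
    for child in elem:
        traverse(child, order)
    return order


order = traverse(ET.fromstring(A_XML), [])
for tag, text in order:
    print(tag, repr(text))
```

In this sketch the processor meets A, then NODE and its B child from b.xml, and only afterwards A's own B child. Whether A's own content is then shown or hidden is left open, as it would be for the application.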
My own feeling is that (A) is the most reasonable default. With NEW we have a
separate window, with a separate namespace (so it doesn't matter if b.xml has
a different DTD from a.xml). So this 'window' has to be transported into the
current 'window'.
I'd value comments. If this seems to be a consensus view then I'll try to
implement it in JUMBO. At present I suspect that JUMBO has got this partly
right and partly wrong.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/