XML query languages and their encodings

Sat Apr 3 00:11:44 BST 1999

On Thu, Apr 01, 1999 at 09:50:31AM -0600, Paul Prescod wrote:
> Mark Birbeck wrote:
> > 
> > I sort of guessed it might be ;-) I was more getting at the idea of
> > context. The following is a 'list of nodes':
> > 
> >         <name>Mark</name>
> >         <name>Tracey</name>
> >         <name>Jan</name>
> 
> That's exactly my point. That's not a list of nodes. That's a list of XML
> elements. Nodes are abstract. Here's a concrete representation for them
> (and a containing element) for discussion purposes:
> 
> x= element( gi: "names",
>        content:
>              element( gi: "name", content: text( "Mark"))
>              element( gi: "name", content: text( "Tracey"))
>              element( gi: "name", content: text( "Jan")) )
> 
> Now in this abstract model a "list of nodes" is:
> 
> [x.content[0], x.content[1], x]
> 
> Do I know their context? Yes. Do I know their depth? Can I talk about
> nodes of different depths? Yes. In this brain-dead simple abstract model
> those issues are not complex at all.
> 
> Now if we want to encode these results for transmission between machines
> then all of the issues you raise are important. But that is a *separate
> issue*. It has nothing to do with the abstract concept of "node list".
> 
> "XML People" are encoding-focused so they always come back to the
> encoding. That's fine but it is also important to recognize that some
> things should be considered in the abstract domain -- like the result sets
> of query languages.

This is usually, but not always, the case.  At times, the abstract and
concrete domains interact in nontrivial ways.

In our own internal discussions on query models, the issue of
determining context has always had an impact on the conceptual model.
There are essentially two camps:
  1. return just the node (or subtree).
  2. return a "pointer" to the node.

Returning a pointer allows one to go back to the original document and
traverse up, down, back and forth at will.  It is the more powerful of
the two mechanisms.  On the other hand, it can involve a considerable
amount of network traffic, since each step the user takes through a
DOM-style tree crosses the client-server connection.
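
To make the trade-off between the two camps concrete, here is a rough
Python sketch.  Everything in it is an assumption made for
illustration: the Server class, its method names, the integer handles
used as "pointers", and the sample document are all hypothetical, and
the "network" is just a counter that is bumped on every call.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from any real system.
DOC = ET.fromstring(
    "<names>"
    "<person><firstname>Mark</firstname></person>"
    "<person><firstname>Tracey</firstname></person>"
    "</names>"
)


class Server:
    """Stand-in for a remote document store; each method call is one round trip."""

    def __init__(self, root):
        self.nodes = list(root.iter())  # all elements in document order
        self.handle_of = {n: i for i, n in enumerate(self.nodes)}
        # ElementTree keeps no parent links, so build a child -> parent map.
        self.parent_of = {c: p for p in self.nodes for c in p}
        self.round_trips = 0

    def query_subtrees(self, tag, text):
        # Camp 1: return just the matching nodes, serialised, with no context.
        self.round_trips += 1
        return [ET.tostring(n, encoding="unicode")
                for n in self.nodes if n.tag == tag and n.text == text]

    def query_pointers(self, tag, text):
        # Camp 2: return opaque handles that refer back into the document.
        self.round_trips += 1
        return [self.handle_of[n]
                for n in self.nodes if n.tag == tag and n.text == text]

    def parent(self, handle):
        # One navigation step costs one round trip.
        self.round_trips += 1
        p = self.parent_of.get(self.nodes[handle])
        return None if p is None else self.handle_of[p]

    def name(self, handle):
        self.round_trips += 1
        return self.nodes[handle].tag


server = Server(DOC)
subtrees = server.query_subtrees("firstname", "Mark")  # one trip, no context
pointers = server.query_pointers("firstname", "Mark")  # one trip...
parent_tag = server.name(server.parent(pointers[0]))   # ...plus two more
```

In this toy setup, finding the parent's name through a pointer takes
three calls in total, and the count grows with every further step of
traversal.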

Returning just the node (or, more generally, returning only what is
requested) has the disadvantage of throwing away context.  But this is
only a problem if your query language can't express your requirements
directly.  For instance, say you want to know the name of the parent
of each node returned.  In an SQL-like language, one might express it
thus:

  select **, parent(**).name from docs where //firstname?="Mark"

where '**' represents anything returned from the XQL subquery.  The
parent function is defined to return, for a given node list, the list
of parents of those nodes.  If the environment provides sufficient
support, one could even do this:

  define function toc(nodes)
  begin
      # define a function to return a table-of-contents.
      # ...
  end;

  select **, toc(**) ...

This would then return the nodes of a query and the related TOCs.
The essence of this approach is that you get back only as much context
as you want.  You also have the power to express whatever level or
complexity of context you want, and the server can do the work without
extra round trips.  Then again, if you are not interested in context,
you don't wear the cost of having it.
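
As a rough illustration of the "return only what is requested" idea,
the sketch below evaluates something like the select statement above
entirely on the server side, in Python.  The select and parent_name
functions, their signatures and the sample document are all
hypothetical; this is not how any particular XQL engine works, just
one way the shape of the result could look.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document for the sketch.
DOC = ET.fromstring(
    "<staff>"
    "<manager><firstname>Mark</firstname></manager>"
    "<engineer><firstname>Tracey</firstname></engineer>"
    "<engineer><firstname>Mark</firstname></engineer>"
    "</staff>"
)


def select(root, tag, text, *columns):
    """Return one row per matching node: the node plus each requested column.

    Each column is a function of (node, parent_of) that runs next to the
    data, so the caller gets the context it asked for in a single reply.
    """
    # ElementTree keeps no parent links, so build a child -> parent map.
    parent_of = {c: p for p in root.iter() for c in p}
    hits = [n for n in root.iter(tag) if n.text == text]  # plays the role of '**'
    rows = []
    for node in hits:
        row = [ET.tostring(node, encoding="unicode")]
        row.extend(col(node, parent_of) for col in columns)
        rows.append(tuple(row))
    return rows


def parent_name(node, parent_of):
    """Rough analogue of parent(**).name for a single node."""
    p = parent_of.get(node)
    return p.tag if p is not None else None


# Roughly: select **, parent(**).name from docs where //firstname = "Mark"
rows = select(DOC, "firstname", "Mark", parent_name)

# With no columns requested, no context is computed or returned.
bare = select(DOC, "firstname", "Tracey")
```

A user-defined function such as toc would simply be another column
function passed to select in the same way.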

The real (and nontrivial) problem with this approach is that
unlimited expressivity requires non-portable support from the
environment.  In practice I think this is a problem people are happy
to live with.  In the relational world, people using Oracle routinely
code up functions in PL/SQL that can be invoked in queries.

The point to be made from all this is that your query model will be
greatly affected by delivery considerations.  It is not necessarily a
case of getting the model right and then worrying about how to return
results in a practical setting.

Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)