Whence XQL?
Jonathan Robie
jonathan at texcel.no
Mon Mar 29 20:12:40 BST 1999
At 10:47 AM 3/29/99 +1000, Marcelo Cantos wrote:
>On Thu, Mar 25, 1999 at 10:09:15PM -0500, Jonathan Robie wrote:
>> Cool, you work on SIM? (Does that make you a SIMian?)
>
>Cute! It might just take off around here. :-)
I haven't been able to come up with a similar nickname for people who work
on XQL...
>I do wonder what proportion of people looking seriously at XQL are
>into text. We find WITHIN N to be exceedingly useful. It is also
>interesting to note that we only offer proximity at the word level and
>that this is all clients ever really want. We do also offer same
>sentence/paragraph queries, but virtually no-one uses them.
One full-text search engine vendor told me that their users did not use
proximity searching. This surprised me, but it was what convinced me that I
might be able to leave proximity out of even full-text extensions to XQL.
Most of what I have done with XML until fairly recently was with structured
documents rather than data, or with documents that also contain what has
classically been considered data. I am now starting to do more with XML for
data. I think that both Microsoft and Joe Lapp of webMethods have worked
more with data than with documents.
>It's an interesting angle, though not one I had considered (not that I
>have considered many angles :-). I had understood, perhaps
>incorrectly, that the only way to perform word-level boolean queries
>was to treat words abstractly as leaf nodes of the document tree
>rather than clumps of opaque string data. Under this conception, to
>find "other name", one would say:
>
> LINE[WORD="other"; WORD="name"]
>
>It could possibly be made legal to abbreviate the above to:
>
> LINE["other"; "name"]
XQL as-is does not allow this, but I have discussed this as a possible
extension in the section on "Integrating structured and full-text queries",
in http://www.w3.org/TandS/QL/QL98/pp/murata-san.html, a paper written
together with Makoto Murata-san. That extension makes the above syntax legal.
The other approach, which you have used above, is to pretend that there is
markup identifying the individual words - that's a perfectly valid approach
too.
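
A minimal sketch of that second approach, written in Python rather than XQL: the text of each LINE is split into words as if every word were its own leaf node, and a LINE matches when the two words sit next to each other. The function names and the sample document are hypothetical, and splitting on whitespace is an assumption; a real engine would tokenize more carefully.

```python
import xml.etree.ElementTree as ET

def words(element):
    """Treat the element's text content as a sequence of word leaves."""
    # Assumption: words are whitespace-separated, punctuation stays attached.
    return "".join(element.itertext()).split()

def lines_with_adjacent(root, first, second):
    """Return LINE elements in which `first` is immediately followed by `second`."""
    hits = []
    for line in root.iter("LINE"):
        w = words(line)
        # Compare each word with its immediate successor.
        if any(a == first and b == second for a, b in zip(w, w[1:])):
            hits.append(line)
    return hits

# Hypothetical sample document, for illustration only.
doc = ET.fromstring(
    "<SPEECH>"
    "<LINE>What's in a name? that which we call a rose</LINE>"
    "<LINE>By any other name would smell as sweet;</LINE>"
    "</SPEECH>"
)
matches = lines_with_adjacent(doc, "other", "name")
```

Here `matches` holds only the second LINE, since that is the only one where "other" directly precedes "name".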
>Which would be interpreted as: a Line element which is the parent of
>a leaf node equal to "other" immediately preceding a leaf node equal
>to "name". Now, support for proximity ("rose*" within 10 words of
>"sweet") would simply be a matter of:
>
> LINE["rose*" %10 "sweet"]
>
>(The %N syntax is borrowed from our query language.) Higher level
>proximities could be done like this:
>
> LINE["name"] %10 LINE["purple"]
>
>The operator simply adopts the level of its operands; mismatched
>operands constitute an error.
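
The level rule described in the quoted text can be sketched in Python under some loud assumptions: an operand is modelled as a (level, positions) pair, "rose*" is treated as a shell-style wildcard, and "within N" means the positions differ by at most N. None of these names come from SIM or XQL; they are hypothetical and only illustrate the idea.

```python
from fnmatch import fnmatchcase

def word_operand(tokens, pattern):
    """Positions of words matching `pattern` (shell-style * wildcard), tagged 'word'."""
    return ("word", [i for i, t in enumerate(tokens) if fnmatchcase(t, pattern)])

def line_operand(lines, word):
    """Indexes of lines that contain `word`, tagged 'line'."""
    return ("line", [i for i, l in enumerate(lines) if word in l.split()])

def within(left, right, n):
    """Sketch of `left %n right`: the operator adopts the operands' shared level."""
    (l_level, l_pos), (r_level, r_pos) = left, right
    if l_level != r_level:
        # Mismatched operands constitute an error.
        raise ValueError(f"mismatched operand levels: {l_level} vs {r_level}")
    return any(abs(i - j) <= n for i in l_pos for j in r_pos)

# Hypothetical sample data, for illustration only.
lines = ["a rose by any other name", "would smell as sweet"]
tokens = " ".join(lines).split()

word_hit = within(word_operand(tokens, "rose*"), word_operand(tokens, "sweet"), 10)
line_hit = within(line_operand(lines, "name"), line_operand(lines, "sweet"), 1)
```

Mixing a word-level operand with a line-level one raises `ValueError` instead of guessing at a level.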
I would have to think about how to fit that into the XQL grammar. Does it
have advantages over the function-based approach I suggested earlier?

  near("name", "purple", 10)

This fits into the XQL grammar without modification, it's just a matter of
introducing another function.
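
To make the intended semantics concrete, here is a rough Python sketch of what such a function might compute over the text of a single node. It is not part of XQL: the signature that takes the text explicitly, the case-insensitive matching, and the tokenization are all assumptions made for illustration.

```python
import re

def near(text, first, second, distance):
    """True if `first` and `second` occur within `distance` words of each other."""
    # Assumption: matching is case-insensitive and a word is a run of
    # letters, digits, underscores, or apostrophes.
    tokens = re.findall(r"[\w']+", text.lower())
    first_pos = [i for i, t in enumerate(tokens) if t == first.lower()]
    second_pos = [i for i, t in enumerate(tokens) if t == second.lower()]
    # Any pair of occurrences close enough satisfies the query.
    return any(abs(i - j) <= distance for i in first_pos for j in second_pos)

# Hypothetical sample text, for illustration only.
line = "By any other name would smell as sweet"
found = near(line, "name", "sweet", 10)
missing = near(line, "name", "purple", 10)
```

Because the distance is just another argument, no new operator or grammar production is needed, which is the appeal of the function-based form.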
Jonathan
jonathan at texcel.no
Texcel Research
http://www.texcel.no
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)