Assisted Search of XML document collections

Sat May 22 20:32:01 BST 1999

> 
> Hi, all
> 
> There is a modest effort being assembled to look at this prototypical
> problem:
> 
> PROBLEM - multitudes of XML documents. The collection is not necessarily
> static, but if dynamic only incrementally so. The business case that would
> apply is that it makes sense to markup the original documents using XML;
> it also makes sense to search a file which is a description of the
> document collection rather than the whole document collection. The
> derivative file is the "index" - we are not assuming that it itself need
> be XML.
Am I missing something? What is the difference between your "vision" and
that of, for example, GILS, which assigns a metalevel document, a so-called
information locator, to a resource? See http://www.gils.net/locator.html

> 
> I am choosing my language carefully as there seems to be an equal mixture
> of enthusiasm and coolness displayed towards an XML document collection
> indexing scheme. The fact of the matter is that so far we have identified
See the ASF for GILS (which also defined a distributed gathering concept):
http://asf.gils.net/framework.html

> a number of problems which are amenable to assisted search. We are not
> particularly concerned, at this point, in breaking any new ground in XML -
> rather, this is a project designed to address a subset of XML "usage"
> problems.
Or are you thinking (I am trying to understand the problem) of something like a
new take on DC, so that gatherers can create their own synthetic locator records?
Or of naming conventions? See http://www.gils.net/naming.html

> 
> Although I have announced this project on the perl-xml list, and it will
> concentrate on Perl, with and without XS, there is no reason that Java
> and/or C/C++ viewpoints are not welcome. We are primarily interested in
> exploring issues pertaining to the construction of a file that describes a
> collection of XML documents in a succinct fashion, most likely with a
> moderate to high degree of application specificity - i.e. there may not be
> a lot of defaults that make sense.
Keeping to GILS: http://asf.gils.net/semantic-map.html (following an ISO 11179
metamodel) to connect crawlers with compliant search engines...

> 
> We also wish to supply a useful API that search engine writers can use.
So is the project about designing a common development API for search engines?
Or a way for metasearch engines to interoperate with one another (such as GILS/ASF)?

Since I appear to be totally confused (and, as often, intrigued), a good starting
point might be to clarify, if possible, the objectives and goals (the problem is
clear) so that we can explore common ground.

-- 
______________________
<A HREF="whois://rs.internic.net/ecz">Edward C. Zimmermann</A>
<A HREF="http://www.bsn.com/">Basis Systeme netzwerk/Munich</A>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)