Assisted Search of XML document collections
Edward C. Zimmermann
edz at bsn.com
Sat May 22 22:11:55 BST 1999
> On Sat, 22 May 1999, Edward C. Zimmermann wrote:
> > Since I appear to be totally confused (and, as often, intrigued) a starting point
> > might be to, if possible, clarify the objectives and goals (the problem is clear)
> > to explore common ground.
> That is the stage we are at. I have this gut feeling that we need to
> define what it means to have a search engine operate on let's say 100,000
> documents marked up using XML, and what are the situations where it might
> make more sense to search a file which describes that collection.
100K documents is not a problem. Even on consumer PC hardware, a modestly performant
full-text engine can handle typical queries on such a small collection in fractions
of a second. The problem, beyond quantity, is rather that information resources
(XML, HTML or whatever) are not always static but often dynamic. That is, above all, one
of the fundamental flaws in the brute-force spider/crawl approach followed by
the major "Internet Engines" (alongside the impact on bandwidth, the half-life of
data, and all the other significant shortcomings).
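To make the point about small collections concrete, here is a minimal sketch of
the kind of inverted index a full-text engine is built around. It is only an
illustration, not any particular product's implementation: the function names
(tokenize, build_index, search) and the three sample documents are made up for
this example, and real engines add ranking, field/element awareness, and
on-disk storage on top of this.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of ids of the documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every query term (boolean AND)."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Look up the posting set for each term; unknown terms match nothing.
    postings = [index.get(term, set()) for term in terms]
    # Start from the smallest posting set so the intersections stay cheap.
    postings.sort(key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return result


# Hypothetical sample collection, standing in for the text of real documents.
docs = {
    1: "assisted search of xml document collections",
    2: "html pages change often",
    3: "xml documents are not always static",
}
index = build_index(docs)
print(search(index, "xml static"))
```

A query only touches the posting sets of its own terms, which is why lookup
cost depends far more on how common the terms are than on the size of the
collection. Note that the sketch does nothing about the harder problem named
above: once a document changes, its entries in the index are stale until it is
re-indexed.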
> Your best contribution would be to describe a business problem and tell us
> how you like to solve it.
Different problems, different methods, different tools.
Let's turn the tables: since I'm the confused soul, can you explain a business
problem and tell us how you might plan to "solve it"?
<A HREF="whois://rs.internic.net/ecz">Edward C. Zimmermann</A>
<A HREF="http://www.bsn.com/">Basis Systeme netzwerk/Munich</A>
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)