XML Search Engine
Borden, Jonathan
jborden at mediaone.net
Thu Nov 5 19:25:36 GMT 1998
Let me rephrase that: word/character proximity searching has been done for
decades and its utility is well known. The last time I addressed this in
detail was when I spent some time on the Hearsay project, an early speech
recognition system, during the early 1980s. The problem of German or
oriental words/phonemes/sentences etc. is fairly similar (perhaps identical)
to the problem of English-language speakers who slur their words together.
Speech processing programs have made great strides recently, yet this has
been a difficult nut to crack.
There are many people who believe that further refinements of these
well-known techniques are unlikely to yield dramatic improvements. Instead,
there are avenues of attack which operate at higher levels of the information
food chain, namely at the word phrase, syntactic and semantic levels. These
levels are well represented as grove structures, and XML/SGML search
techniques will likely yield significant results. Natural language
processing algorithms naturally express their output in groves, and
intelligent search is at this crossroads.
For example, suppose I am searching for big apples:
"This is a little green apple. Big deal."
Will "Big near apple" match?
How about "Big applied to apple"?
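
To make the question concrete, here is a minimal Python sketch, not taken from any real search engine. The function names (tokenize, near, near_in_same_sentence) and the three-word window are assumptions chosen for illustration. Splitting on sentence punctuation is only a crude stand-in for the structure a grove or XML markup would provide; a true "Big applied to apple" test would need a syntactic parse, which this sketch does not attempt.

```python
import re

TEXT = "This is a little green apple. Big deal."


def tokenize(text):
    """Return lowercase word tokens, dropping punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def near(text, a, b, window=3):
    """Flat proximity: True if a and b occur within `window` words of each other.

    The text is treated as one undifferentiated word stream, so sentence
    boundaries are invisible to this test.
    """
    words = tokenize(text)
    pos_a = [i for i, w in enumerate(words) if w == a]
    pos_b = [i for i, w in enumerate(words) if w == b]
    return any(abs(i - j) <= window for i in pos_a for j in pos_b)


def near_in_same_sentence(text, a, b, window=3):
    """Structure-aware proximity: both words must fall in the same sentence.

    Hypothetical helper: sentences are approximated by splitting on . ! ?
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    return any(near(s, a, b, window) for s in sentences)


if __name__ == "__main__":
    print(near(TEXT, "big", "apple"))
    print(near_in_same_sentence(TEXT, "big", "apple"))
```

The flat test matches because "Big" directly follows "apple" in the word stream, even though the two words belong to different sentences. The structure-aware test rejects the match, which is the behaviour someone searching for big apples would want.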
Jonathan Borden
JABR
http://jabr.ne.mediaone.net
>
>
> Borden, Jonathan wrote:
>
> > As you say, word/character proximity searching is not that interesting, and
> > if this is desired, XML doesn't have much to add to the current equation.
>
> I beg to disagree twice: a) proximity search is very important for anyone
> searching any reasonably sized database with a variety of texts; b) XML can
> help a lot, even though most non-XML-capable search engines can already offer
> proximity searching.
>
> We have been able to solve quite a number of problems using proximity. If we
> did not have it, we could still solve those problems, albeit spending
> much more effort, time, intelligence and CPU cycles.
>
> - fernando