Searching XML

Mark Birbeck Mark.Birbeck at iedigital.net
Tue Aug 31 11:05:41 BST 1999


It depends on the complexity of your XML. If it's fairly flat then a nice
'cheap and cheerful' method is to convert each XML document into an HTML
version with all the relevant info in META tags, and then index those HTML
files with standard indexing software. When a search finds one of the HTML
files, return the XML file that it is a place-holder for instead.

Although it's a poor relation to properly indexing XML files, it does
mean that when you get an index server that can handle XML your project
structure won't change a lot.
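
As a rough illustration of the place-holder idea, here is a minimal Python
sketch. It assumes flat XML where each childless element holds one piece of
searchable text. The function names, the "xml-source" META name and the
"<file>.xml.html" naming convention are all made up for this example; they
are not taken from any particular indexing product.

```python
import html
import xml.etree.ElementTree as ET


def placeholder_html(xml_text, xml_name):
    """Build an HTML stand-in whose META tags carry the XML's leaf text."""
    root = ET.fromstring(xml_text)
    metas = []
    for elem in root.iter():
        text = (elem.text or "").strip()
        # Only childless elements with text become META entries
        # (assumption: the XML is fairly flat).
        if len(elem) == 0 and text:
            metas.append('<meta name="%s" content="%s">'
                         % (html.escape(elem.tag, quote=True),
                            html.escape(text, quote=True)))
    # Hypothetical convention: record which XML file this page stands in for.
    metas.append('<meta name="xml-source" content="%s">'
                 % html.escape(xml_name, quote=True))
    title = html.escape(xml_name)
    return ("<html><head><title>%s</title>\n%s\n</head><body></body></html>"
            % (title, "\n".join(metas)))


def xml_for_hit(html_name):
    """Map a place-holder found by the indexer back to the XML it represents."""
    suffix = ".html"
    if html_name.endswith(suffix):
        return html_name[:-len(suffix)]
    return html_name
```

You would write the result of placeholder_html() next to each XML file (for
instance as "staff.xml.html"), point the indexer at the HTML files only, and
pass each hit through xml_for_hit() before showing it to the user.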

Best regards,

Mark


> -----Original Message-----
> From: Warren Hedley [mailto:w.hedley at auckland.ac.nz]
> Sent: 31 August 1999 04:04
> To: xml-dev at ic.ac.uk
> Subject: Searching XML
> 
> 
> Hey team
> 
> I have a number of HTML and XML files that are used to generate
> our website. We want to add search functionality to this site,
> so that we can look for keywords and text.
> 
> It has proven too slow to search through all of the files, so
> the method I suspect we would use would be to generate an
> additional database containing all of our main data (perhaps
> all words longer than 4 letters) that we could quickly look
> through to generate search results.
> 
> Does anyone know of an implementation of search functionality
> along these lines? (Perl modules would be nice.) Or can anyone
> suggest a better plan of attack?
> 
> Thanks
> 
> 
> -- 
> Warren Hedley
> Department of Engineering Science
> Auckland University
> New Zealand
> 
> xml-dev: A list for W3C XML Developers. To post, 
mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on
CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)





