XML / XSL Disparity
Michael Kay
M.H.Kay at eng.icl.co.uk
Fri Jun 5 12:56:37 BST 1998
>I am writing a very simple, Java based, search engine for XML
>documents.
Does it really need a new search engine? A new filter for an
existing search engine would be much more useful; then you
could concentrate your efforts on the XML-specific
functionality.
>As an example, imagine a date stored as 19980604 in an XML file.
>An XSL file converts that to 06-04-1998 for presentation. The user
>writes that date down and later searches for it but....
>
The idealistic answer is that to search for a date, you
should know it's a date rather than pretending it's text.
Unfortunately XML doesn't give you that information. You
could try to get it from an external source, e.g. an extended
XSchema.
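As a rough sketch of what that external type information would buy you: assume a hypothetical table mapping element names to types, standing in for whatever an extended schema might supply. The element name `orderdate`, the table and the function name are all invented for illustration. With that in hand, a query typed in a known display format can be converted back to the stored form before matching.

```python
from datetime import datetime

# Hypothetical type table; XML itself carries no such information.
ELEMENT_TYPES = {"orderdate": "date"}

STORED_FORMAT = "%Y%m%d"     # e.g. 19980604
DISPLAY_FORMAT = "%m-%d-%Y"  # e.g. 06-04-1998, as the style sheet renders it


def to_stored_form(element, query):
    """Convert a query string to the stored form if the element is a date."""
    if ELEMENT_TYPES.get(element) != "date":
        # Untyped content: fall back to matching the text as typed.
        return query
    return datetime.strptime(query, DISPLAY_FORMAT).strftime(STORED_FORMAT)


print(to_stored_form("orderdate", "06-04-1998"))  # 19980604
```

This only works because the display format is assumed to be known and fixed, which is exactly the assumption that tends not to hold.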
Searching on the presentation form of information rather
than the stored form is always tricky. You can do it by
presenting to the search engine (at indexing time) the
display version of the document rather than the original
XML, but this presupposes that everyone will use the same
style rules for display, and that the rules are not
context-sensitive (a British user would want to see the
above date as 4/6/1998). Perhaps more importantly in
practice, the display form of the document will have lost
the semantic information inherent in the XML tags, which can
potentially add greatly to the selectivity of the search.
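To make the context-sensitivity problem concrete, here is a small sketch in which the two style rules and the one-entry "index" are purely illustrative assumptions. The same stored value is rendered two ways, and an index built from one display form cannot answer a query written in the other.

```python
from datetime import datetime

stored = datetime.strptime("19980604", "%Y%m%d")

# Two hypothetical style rules applied to the same stored value.
us_display = stored.strftime("%m-%d-%Y")                   # 06-04-1998
uk_display = f"{stored.day}/{stored.month}/{stored.year}"  # 4/6/1998

# A toy index built at indexing time from the US display form only.
index = {us_display}

print(us_display in index)  # True
print(uk_display in index)  # False: the British user's query finds nothing
```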
I personally think that if the paradigm is "free text
searching" then it is only going to work well if the content
of the displayed text and the stored text are much the same.
Mike Kay