Automating Search Interfaces

Peter Murray-Rust peter at ursus.demon.co.uk
Tue Feb 17 23:06:37 GMT 1998


Posted on behalf of John Petit

>
>------------------------------------------
>This is a question about how the search scenario will play out on the
>web once XML becomes widely implemented. I have not seen this
>articulated in any of the specifications or articles on the web thus
>far. In lieu of that, I have imagined how it might work. I would like
>some feedback. Am I way off base?  Naturally the answer will have a big
>impact on the design of search engines and other services that I am
>creating.
>
>As particular industries and special interests standardize on their
>respective DTDs, Internet search engines will have to allow users to
>search by specific elements contained in those documents. In the typical
>search scenario, a user would use one of the major search services such
>as AltaVista or Yahoo. Let's say the user wanted to search across real
>estate listings, and these listings all used the same DTD. It seems that
>independent search engines need to interpret the DTD for a class of
>documents and present a query interface based on that DTD. The question
>is: how is the search engine to interpret the DTD and build an
>intelligent interface based on that DTD? Simply listing every element in
>the DTD is one approach, but an ugly one. Many DTDs will contain
>numerous elements which would only clutter and confuse a search
>interface.
>
>One solution may be to use DTD attributes to cue the search engines.
>Perhaps a "LEVEL" attribute could cue the searchers to display
>interfaces to predefined levels. The example below shows that the
>"LEVEL" attribute means that the "numbeds" element should always appear
>in a search query, that is, at the top level of searches. Any elements that
>did not have this level 1 attribute would not be shown in the search
>interface. If the "LEVEL" attribute was not found in the DTD, the
>default would show all of the elements with search fields next to them.
>
><!ELEMENT numbeds (#PCDATA)>
><!ATTLIST numbeds
>    XML-SQLTYPE CDATA #FIXED "INTEGER"
>    SNAME CDATA #FIXED "Number of beds"
>    LEVEL CDATA #FIXED "1">
>
>Search engines, upon seeing the "LEVEL" attribute, would configure their
>interface to have an "Additional Elements" button that would show the
>next level of elements. This would have the effect of shielding the user
>from an overwhelming mass of searchable elements.  Perhaps these
>mechanisms are in place, but I just do not see them.
>
>Another useful attribute would describe the "shown name" for a
>particular element. Element tags may not have as descriptive a name as
>they should in the DTD itself. For example, having "numbeds" appear in
>the user search interface would not be very user friendly. A much more
>descriptive string would be "Number of beds."
>
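The "LEVEL" and "SNAME" hints described above could be read by a search engine roughly as follows. This is only a minimal sketch, not a real DTD parser: it assumes the hints are declared as `CDATA #FIXED "value"` attributes, it extracts them with regular expressions, and the "lotsize" and "remarks" elements and the function names are invented for illustration.

```python
import re

# Hypothetical DTD fragment. Only "numbeds" comes from the message above;
# "lotsize" and "remarks" are made up for this sketch.
DTD = '''
<!ELEMENT numbeds (#PCDATA)>
<!ATTLIST numbeds
    XML-SQLTYPE CDATA #FIXED "INTEGER"
    SNAME CDATA #FIXED "Number of beds"
    LEVEL CDATA #FIXED "1">
<!ELEMENT lotsize (#PCDATA)>
<!ATTLIST lotsize
    SNAME CDATA #FIXED "Lot size"
    LEVEL CDATA #FIXED "2">
<!ELEMENT remarks (#PCDATA)>
'''

# Regex-based extraction: good enough for this fragment, not for DTDs
# with parameter entities, comments, or ">" inside attribute values.
ELEMENT_RE = re.compile(r'<!ELEMENT\s+([\w.-]+)')
ATTLIST_RE = re.compile(r'<!ATTLIST\s+([\w.-]+)(.*?)>', re.DOTALL)
FIXED_RE = re.compile(r'([\w.-]+)\s+CDATA\s+#FIXED\s+"([^"]*)"')


def read_hints(dtd):
    """Map each declared element name to its #FIXED hint attributes."""
    hints = {name: {} for name in ELEMENT_RE.findall(dtd)}
    for name, body in ATTLIST_RE.findall(dtd):
        hints.setdefault(name, {}).update(FIXED_RE.findall(body))
    return hints


def search_fields(dtd):
    """Split elements into (primary, additional) lists of (name, label)."""
    hints = read_hints(dtd)
    # If no element carries LEVEL, fall back to showing every element.
    uses_levels = any("LEVEL" in h for h in hints.values())
    primary, additional = [], []
    for name, h in hints.items():
        label = h.get("SNAME", name)  # the tag name is the fallback label
        if not uses_levels or h.get("LEVEL") == "1":
            primary.append((name, label))
        else:
            additional.append((name, label))
    return primary, additional


primary, additional = search_fields(DTD)
print("Shown by default:", primary)
print("Behind 'Additional Elements':", additional)
```

Elements without a level 1 hint end up in the second list, which is what an "Additional Elements" button would reveal.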
>The "XML-SQLTYPE" attribute indicates that "numbeds" is an integer. This
>is a form of strong typing that was described at one time by Tim Bray. I
>do not know the status of strong typing in XML, but strong typing
>would sure be useful in this situation. If a search engine knows that a
>field is going to be a number, then the engine can offer optional
>numeric operations. Useful operations might include searching by price
>range or, in this case, by a range for the number of bedrooms. Otherwise,
>how will an independent search engine or agent know that a particular
>field can be ranged and mathematically manipulated?
>
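To illustrate why a type hint matters for range searches, here is a small sketch. It assumes the engine has already read an "XML-SQLTYPE" value for a field. The type names other than INTEGER, the converter table, and the function names are hypothetical.

```python
# Hypothetical mapping from XML-SQLTYPE values to Python converters.
# Only "INTEGER" appears in the message above; "FLOAT" is an assumption.
NUMERIC_TYPES = {"INTEGER": int, "FLOAT": float}


def supports_range(sqltype):
    """An engine would offer min/max inputs only for numeric fields."""
    return sqltype in NUMERIC_TYPES


def in_range(text, sqltype, low=None, high=None):
    """Convert the element text using its declared type, then compare."""
    value = NUMERIC_TYPES[sqltype](text.strip())
    if low is not None and value < low:
        return False
    if high is not None and value > high:
        return False
    return True


# Element content always arrives as text, even for numeric fields.
listings = [{"numbeds": "2"}, {"numbeds": "4"}, {"numbeds": "10"}]

if supports_range("INTEGER"):
    matches = [l for l in listings
               if in_range(l["numbeds"], "INTEGER", low=3, high=10)]
    print(matches)
```

Without the hint the engine could only compare strings, and "10" sorts before "3" as text, so a range of 3 to 10 bedrooms would give wrong results.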
>I certainly do not think that these attributes should be mandatory, but
>it seems that there should be an agreed upon method of DTD construction
>that would give clues to search engines. I am clearly not an expert in
>this area, but I have not seen a solution to this in the XML proposals
>published thus far. Does anyone have an answer for this?
>
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



