XML Search Engine Holy War - Attributes vs. Elements
Len Bullard
cbullard at hiwaay.net
Sat Oct 16 17:40:44 BST 1999
DuCharme, Robert wrote:
>
> >1. Ignore Attributes altogether and index Elements and Character Data
> >only.
>
> >The feeling is that the use of attributes should be restricted (by
> >authors) and used to allow other scripts/applications to either include
> >or preclude the element and resultant children nodes from some sort of
> >processing, displaying or further manipulation.
>
> This shouldn't even be considered.
Yes. Although a schema designer is free to document a semantic, this
is application-level design. "All politics is local." A search engine
built on that premise is restricting its application space.
> Attributes are used for far more than
> what the above paragraph describes. Typical uses include many classic search
> criteria such as meta-information about authorship, revision stages, and
> revision dates.
Include security markings in that too. While it might not be smart
to use them in a web application, security markings in attributes
have been used in SGML DTDs. Redacting...
> The sole purpose of ID type attributes is to uniquely
> identify elements, and unique identifiers ought to be pretty handy when
> searching for information. A system that can quickly locate elements with a
> particular value in an IDREF type attribute would be very useful in link
> maintenance and implementation.
And as in X3D, for DEF/USE relationships (ID/IDREFS vs. NMTOKENS is
being discussed there). There are also examples of putting what others
might consider "content" into attributes to preserve a symmetry with
nodes and fields. Some use, and will use, XML just as a binding to an
abstract description (e.g., X3D). There is no simple case or practice
that enables an engine to ignore attribute values and types unless one
is blinding the engine by design.
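
To make the point concrete, here is a minimal sketch of an index that
keeps attribute values alongside element content. It uses Python's
standard xml.etree.ElementTree; the sample document, its attribute
names (author, revision, security, id, revised) and the build_index
helper are all made up for illustration and are not taken from any
particular search engine.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical sample: authorship, revision and security metadata live
# in attributes, not in element content.
SAMPLE = """<doc author="jsmith" revision="3" security="unclassified">
  <section id="s1" revised="1999-10-01">
    <title>Overview</title>
    <para>Attributes carry metadata.</para>
  </section>
  <section id="s2" revised="1999-10-12">
    <title>Details</title>
    <para>Elements carry content.</para>
  </section>
</doc>"""


def build_index(xml_text):
    """Return (element_index, attribute_index) for one document.

    element_index maps a lowercased word to the tags whose text holds it.
    attribute_index maps (attribute name, value) to the tags carrying it.
    """
    root = ET.fromstring(xml_text)
    element_index = defaultdict(list)
    attribute_index = defaultdict(list)
    for elem in root.iter():
        # Only elem.text is indexed; tail text in mixed content is
        # ignored to keep the sketch short.
        for raw in (elem.text or "").split():
            word = raw.strip(".,;:").lower()
            if word:
                element_index[word].append(elem.tag)
        for name, value in elem.attrib.items():
            attribute_index[(name, value)].append(elem.tag)
    return element_index, attribute_index


if __name__ == "__main__":
    words, attrs = build_index(SAMPLE)
    print(attrs[("author", "jsmith")])
    print(words["overview"])
```

An engine that skipped the second loop would have no way to answer a
query such as "everything revised on 1999-10-12", since that date
appears only in an attribute value.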
> but a nice
> thing about implementing storage of attributes is that they map more easily
> to relational databases where ID and IDREF attributes can be easily indexed
> for searching.
Yes. There are scripts and samples that do that now. What I see in
practice is that export and import systems start out using only
elements, mapping field names to GIs; then, after some experimentation,
they begin to rely more on GIs and attributes. Applying ID/IDREF depends
on how you use the names. It is good for primary key/foreign key
relationships if strict relational rules are followed, but when
packing/serializing, it isn't necessarily strict, so NMTOKENs may be
preferred. The question is one of requiring a validity pass from the
XML processor.
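
A rough sketch of the primary key/foreign key mapping, using Python's
standard sqlite3 and xml.etree.ElementTree modules. ElementTree does
not validate, so it has no idea which attributes are declared as ID or
IDREF; the sketch simply assumes that "id" plays the ID role and "ref"
the IDREF role. The table names, the sample document and the load
helper are likewise hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

SAMPLE = """<doc>
  <section id="s1"><title>Overview</title></section>
  <section id="s2"><title>Details</title><xref ref="s1"/></section>
</doc>"""

# Assumption: with no DTD-aware parser, attribute types are unknown, so
# the ID and IDREF roles are hard-coded by attribute name.
ID_ATTR = "id"
IDREF_ATTR = "ref"


def load(xml_text):
    """Load ID/IDREF attributes into an in-memory relational store."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute(
        "CREATE TABLE elements (id TEXT PRIMARY KEY, tag TEXT NOT NULL)"
    )
    conn.execute(
        "CREATE TABLE links ("
        " source_tag TEXT NOT NULL,"
        " target_id TEXT NOT NULL REFERENCES elements(id))"
    )
    root = ET.fromstring(xml_text)
    # First pass: store every ID so that forward references resolve.
    for elem in root.iter():
        if ID_ATTR in elem.attrib:
            conn.execute(
                "INSERT INTO elements VALUES (?, ?)",
                (elem.get(ID_ATTR), elem.tag),
            )
    # Second pass: store references; a dangling one violates the
    # foreign key and raises sqlite3.IntegrityError.
    for elem in root.iter():
        if IDREF_ATTR in elem.attrib:
            conn.execute(
                "INSERT INTO links VALUES (?, ?)",
                (elem.tag, elem.get(IDREF_ATTR)),
            )
    conn.commit()
    return conn


if __name__ == "__main__":
    db = load(SAMPLE)
    print(db.execute("SELECT source_tag, target_id FROM links").fetchall())
```

Here the database does the strict checking that a validating XML
processor would otherwise do: a duplicate "id" or a "ref" with no
matching "id" is rejected on load. Treating the same values as plain
name tokens amounts to dropping the two constraints.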
len
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1