What is XML for?

Rick Jelliffe ricko at allette.com.au
Sat Jan 30 06:58:50 GMT 1999


 From: Paul Prescod <paul at prescod.net>

Tim Bray wrote:
>>
>> I'm saying: I want a database that can do XML, by which I mean infinite
>> levels of attributed nested sequenced constructs.
>
>Please don't stop there! Like many of our "tools vendors" you've forgotten
>about (or chosen not to support) links. My worry about non-OO technologies
>is that I wonder how they support links (and queries on links) and RDF
>(and queries on RDF models) and topic maps (and queries on topic
>relationships). I suppose you could represent all of these as relations in
>a relational database.

Databases are concerned with access and retrieval: these are issues of
entity management. The keys of a record (i.e., an entity) are data
attributes (i.e., attributes of entities, not attributes of elements, by
definition). XML lacks data attributes but compensates somewhat by
forcing entity structure to be synchronous with element structure, so
an element's attributes can serve as data attributes (e.g., at the root
element of an entity).

This is not to say that the values of a record's keys are always, or
even usually, unrelated to the text or data in the element structure:
the key can be considered some presentation form of data within the
element structure. The more sophisticated the key recognition or pattern
matching on keys that the DBMS provides, the less the need for the key
(data attribute) to differ from the key-source (element or attribute
values).

For small-scale uses of XML, the distinctions between element and
entity, key and key-source, are not relevant: just a simple element
structure sitting in a file will do. Indeed, one of the strengths of XML
is that where there is a strong need for structured data but not for
entity management, XML gives a file-system-based alternative to simple
databases--look at what can be achieved by OmniMark, Perl, Balise, etc
without ever resorting to databases!

So the choice of entity management system (files, SCCS, RCS, an RDBMS,
an OODBMS, a network-structure DBMS, etc.) can come down to judgements
about the relationship between the data's entity structure and its
element structure: their nature and the performance requirements of
the particular project.

For example, if the data is highly regular at the top levels, with free
text down at the leaves, it might indicate that you should use an RDBMS
with the added convention that all text fields are XML (with fields used
as keys constrained to be #PCDATA, perhaps normalized).
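
A minimal sketch of that convention, assuming Python's sqlite3 module as
a stand-in RDBMS. The table name, column names, and fetch_body() helper
are hypothetical, invented only for illustration: the regular top-level
data lives in ordinary columns, and the free text lives in one column
holding XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: the key column holds plain text, the body column
# holds a well-formed XML fragment.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
conn.execute(
    "INSERT INTO article (title, body) VALUES (?, ?)",
    ("greeting", "<a>hello<b>world</b></a>"),
)


def fetch_body(title):
    """Look up a record by its key column and parse its XML text field."""
    row = conn.execute(
        "SELECT body FROM article WHERE title = ?", (title,)
    ).fetchone()
    return ET.fromstring(row[0]) if row else None


body = fetch_body("greeting")
```

Retrieval by key is left entirely to the DBMS; the element structure only
becomes visible once the text field has been parsed.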

On the other hand, if the text is free all the way up and down, and you
are not interested in searching or sorting the data, and you need to
support multiple versions of documents, you might convert all your
elements into entities:

<!DOCTYPE a SYSTEM "...">
<a>hello<b>world</b></a>
becomes
<!DOCTYPE a SYSTEM "..."
[
<!ENTITY a1 SYSTEM "database:a?session=1">
<!ENTITY b1 SYSTEM "database:b?session=1">
]>
&a1;

where the particular values of the entities a1 and b1 are generated by
the DBMS (i.e., based on session criteria, such as the version of the
document being sought).
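
To make the session idea concrete, here is a rough sketch in Python. It
is not a real XML parser or DBMS interface: the STORE dictionary stands
in for the database, the session 2 values are invented, and it assumes
that the stored text for a1 itself refers to &b1;, which the example
above implies but does not show.

```python
import re

# Hypothetical store: (entity name, session) -> replacement text.
# In the scheme described above, a DBMS would generate these values.
STORE = {
    ("a1", 1): "<a>hello&b1;</a>",
    ("b1", 1): "<b>world</b>",
    ("a1", 2): "<a>goodbye&b1;</a>",
    ("b1", 2): "<b>moon</b>",
}

ENTITY_REF = re.compile(r"&([A-Za-z_][\w.-]*);")


def expand(text, session):
    """Recursively replace &name; references with the session's stored text."""

    def substitute(match):
        key = (match.group(1), session)
        if key not in STORE:
            return match.group(0)  # leave unknown references untouched
        return expand(STORE[key], session)

    return ENTITY_REF.sub(substitute, text)
```

The same document entity, &a1;, yields different element structures
depending on the session. The sketch has no guard against entities that
refer to themselves, which a real system would need.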

Entity management is therefore a back-end question. DOM, which lets you
access data by the element structure, is primarily a middleware or
front-end system. For tightly coupled server systems in which the entity
structure directly represents the element structure, there is no reason
why data has to go DBMS->XML->DOM; it can go DBMS->DOM directly, since
DOM is an interface definition.
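
As an illustration of the DBMS->DOM route, the following sketch builds a
DOM tree straight from database rows, without serializing to XML text and
reparsing it. It assumes Python's sqlite3 and xml.dom.minidom; the node
table, its columns, and build_dom() are hypothetical, and it assumes each
parent row has a lower id than its children.

```python
import sqlite3
from xml.dom.minidom import Document

# Hypothetical storage: one row per element, linked to its parent by id.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE node (id INTEGER PRIMARY KEY, parent INTEGER, "
    "name TEXT, text TEXT)"
)
conn.executemany(
    "INSERT INTO node VALUES (?, ?, ?, ?)",
    [(1, None, "a", "hello"), (2, 1, "b", "world")],
)


def build_dom(conn):
    """Build a DOM document from rows; no XML text is generated or parsed."""
    doc = Document()
    elements = {}
    for node_id, parent, name, text in conn.execute(
        "SELECT id, parent, name, text FROM node ORDER BY id"
    ):
        element = doc.createElement(name)
        if text:
            element.appendChild(doc.createTextNode(text))
        # The root row has no parent and hangs off the document itself.
        target = doc if parent is None else elements[parent]
        target.appendChild(element)
        elements[node_id] = element
    return doc


doc = build_dom(conn)
```

Client code sees only the DOM interface; whether an XML serialization
ever existed behind it is invisible to that code.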

The reason for having a back-end/front-end distinction is that it gives
a way to distribute processing, especially into 3-tier architectures:
the server is the DBMS, the middleware is XML/DOM, and the client is the
user interface. The punter is not aware of the entity management system
or the element structures.

Rick Jelliffe


