What is a good database for very large collections?

Borden, Jonathan jborden at mediaone.net
Mon Feb 1 17:43:03 GMT 1999

> Can I try to shift it back to a vital question asked earlier, but not
> answered?
> What is a good database for XML?
> The criteria are:
>     * over 20,000,000 document fragments, each less than 256
> characters, each with some flat metadata, able to be incrementally
> reloaded onto the live system
>     * about 30 simultaneous users accessing about 10 fragments a minute
> each, grouped together (along with other dynamic data) and transformed,
> with a high need for immediate response

	How are the fragments selected? By query? If you can easily represent the
20M fragments in tabular form, and if you can easily represent the queries
in SQL, then a relational DB is the way to go. This is not a particularly
large, nor high-volume, application for an RDBMS.

	Ought you store the 20M fragments each in its own file? Probably not (a
big waste). Ought you employ an ODBMS? Not unless SQL wouldn't work well
(you could always load it into, say, Oracle/SQL Server/DB2 etc. vs. ODI/POET
etc. and test it out). My expectation would be that if you need to run
queries, the RDB will win.
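	To make the tabular approach concrete, here is a minimal sketch using
SQLite as a stand-in for Oracle/SQL Server/DB2. The table layout and column
names (fragment, doc_id, lang, updated) are hypothetical illustrations of
"fragment plus flat metadata", not anything from the original question; the
INSERT OR REPLACE shows one way the incremental-reload requirement could map
onto plain SQL.

```python
# Sketch: fragments with flat metadata in one relational table.
# SQLite is used here only because it is self-contained; the same
# schema would work in any RDBMS.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE fragment (
        id       INTEGER PRIMARY KEY,
        content  TEXT NOT NULL,     -- the XML fragment, < 256 chars
        doc_id   INTEGER NOT NULL,  -- flat metadata columns follow
        lang     TEXT,
        updated  TEXT
    )
""")
conn.execute("CREATE INDEX idx_fragment_doc ON fragment (doc_id)")

# Incremental reload: re-inserting a fragment by id replaces it in place,
# so batches can be loaded onto the live system without a rebuild.
rows = [
    (1, "<para>Example fragment one</para>", 42, "en", "1999-02-01"),
    (2, "<para>Example fragment two</para>", 42, "en", "1999-02-01"),
]
conn.executemany(
    "INSERT OR REPLACE INTO fragment VALUES (?, ?, ?, ?, ?)", rows
)

# A reader pulling the fragments for one document, grouped and ordered,
# which is the kind of query an RDBMS index answers quickly.
hits = conn.execute(
    "SELECT content FROM fragment WHERE doc_id = ? ORDER BY id", (42,)
).fetchall()
```

With an index on the grouping column, fetching ~10 fragments per user per
minute for 30 users is a trivial load for any of the databases named above.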

>     * constant data-mining tools using various ad hoc AI and linguistic
> retrieval software augmenting the metadata in the background.
> Rick

Jonathan Borden

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)