Mirrors and ftp sites

Ingo Macherius Ingo.Macherius at tu-clausthal.de
Thu Apr 10 14:18:58 BST 1997


Peter Murray-Rust said:

| 
| A number of people have told me that downloading my software 
| (either as applets or *.tar.gz) is slow and unreliable (i.e. times out).  
| I've experienced similar problems (especially to Austria).
| I think we would find it useful to have material related to XML-DEV
| mirrored and this is really an appeal to see if anyone is willing
| to offer some basic help.

In my understanding your intention is to improve accessibility to the
software. Improvement in this case means bandwidth and the ability to
download all related resources from one (mirror) server. It might also
include a kind of HOWTO for installing the various tools.

Exactly the same problem has been solved for the distribution of Perl
modules (-> http://www.perl.com/CPAN) by a very sophisticated system
based on mirrors, author ids and categories. Of course this would be
overkill for us right now; it might be better to start top-down than
bottom-up. Have a look and comment.

The CPAN mirror software also makes it easy to mirror whole websites,
but IMHO that job should be left to caches. Collecting links is covered
(pun intended) very well at www.sil.org/sgml, so there is no need for
us to do it.

My suggestion is to concentrate on the ftp level, with daily mirrors run
in a standardized way. This means all mirrors should have the same
organisation regarding directories, filenames and help texts, just like
CPAN.

What CPAN lacks is a nice HTML GUI. This is of course a field where XML
should be applied to produce HTML and other documents. My idea is the
following:
We make up a very simple-minded DTD for describing resources. Every site
holds a document consisting of a base element and (external) entities.
The entities contain the resource description elements and are
maintained by the author of the mirrored software on his/her
distribution server. They should include things like the anchor text for
entries in HTML pages, links to documentation and download sites, author
information etc. The entities are collected by running an XML/SGML
parser that resolves http SYSTEM identifiers. The software packages
themselves are fetched by the CPAN (or other) mirror software.
Based on the document that results once all external entities have been
collected (an error mechanism is needed for sites that are down), it
should be easy to create various views of the archive.
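
To make this concrete, here is a minimal sketch of the mechanism. All
names in it (the DTD, the elements, the *.res files, the URLs) are made
up for illustration; nothing is fixed yet:

	<!-- archive.dtd: the very simple-minded resource description DTD -->
	<!ELEMENT archive  (resource*)>
	<!ELEMENT resource (title, download, doc, author)>
	<!ELEMENT title    (#PCDATA)>  <!-- anchor text for HTML entry pages -->
	<!ELEMENT download (#PCDATA)>  <!-- URL of the download site -->
	<!ELEMENT doc      (#PCDATA)>  <!-- URL of the documentation -->
	<!ELEMENT author   (#PCDATA)>

	<!-- master document, held by each mirror; the parser resolves
	     the http SYSTEM identifiers and thereby collects all the
	     resource descriptions -->
	<!DOCTYPE archive SYSTEM "archive.dtd" [
	  <!ENTITY jumbo SYSTEM "http://distsite.example.org/jumbo.res">
	  <!ENTITY foo   SYSTEM "http://othersite.example.org/foo.res">
	]>
	<archive>
	  &jumbo;
	  &foo;
	</archive>

	<!-- jumbo.res: one resource description, maintained by the
	     author on his/her own distribution server -->
	<resource>
	  <title>JUMBO browser</title>
	  <download>http://distsite.example.org/dist/</download>
	  <doc>http://distsite.example.org/doc/</doc>
	  <author>Peter Murray-Rust</author>
	</resource>

Parsing the master document yields one document that contains every
description; a site that is down shows up as an unresolvable entity,
which is where the error mechanism mentioned above has to step in.
Producing HTML index pages or other views from the collected document
is then a plain down-translation job.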

BTW: There is a crude prototype implementation using this mechanism at
	http://www.tu-clausthal.de/cgi-bin/wwwdocd/homepage.cgi
It's written in 100% Perl and is a kind of DynaWeb for the poor. You
may browse the software and the DTD behind it at
	http://www.tu-clausthal.de/~inim/hp2/hp2/
What you see is just the surface; all networking code is missing. But
it may be a start. Java would probably be nicer. But hey, wasn't XML
supposed to be an independent standard for exchanging documents between
different applications?

If you think this sounds like overkill, let me know. IMHO this could be
a very useful application of XML.

	++im
-- 
Snail : Ingo Macherius // L'Aigler Platz 4 // D-38678 Clausthal-Zellerfeld
Mail  : Ingo.Macherius at tu-clausthal.de WWW: http://www.tu-clausthal.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Frank Zappa)

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



