trying to understand the role of XML

Jonathan Eisenzopf eisen at pobox.com
Fri Feb 12 10:02:35 GMT 1999


Matt Sergeant wrote:

> Markus Weltin wrote:
>
> > Is it common to produce XML docs from database queries,
> > and then convert them to HTML pages?
>
> Be careful with this. There's no reason to query a database, output XML
> and convert to HTML all on the server.

Why not? That's an example of using XML as a glue between a database
and HTML.

>
>
> One warning I will give to users here is: Don't convert XML to HTML on
> the server. Unless performance really isn't important. Or you can
> guarantee that most of your browsers will be 5.0 browsers and you only
> need to do the conversion for a few users. That's a lesson I learned the
> hard way this week. Unfortunately I think a lot of people will make this
> mistake, since more and more tools are coming onto the market to do
> just this (e.g. IBM's Java servlet XSL converter).
>

Whoa, careful there. I think you must be talking about dynamically
generating HTML from XML via a CGI-like script. Your warning has merit
for that case, but there are other scenarios where converting XML to
HTML on the server side is beneficial.

For instance, there's no reason why you couldn't pre-generate HTML from
XML. This would save processing time on the client and server.
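
As a rough illustration of pre-generation, here is a minimal Python
sketch. The element names (headlines, item, title, url) and the sample
data are assumptions made up for this example, not the format of any
real feed. The point is only that the conversion runs once, when the
XML changes, and the web server then hands out a static file.

```python
import xml.etree.ElementTree as ET
from html import escape
from pathlib import Path

# Hypothetical sample document; the element names are assumptions for illustration.
SAMPLE_XML = """<headlines>
  <item><title>XML 1.0 &amp; friends</title><url>http://example.com/a</url></item>
  <item><title>Second story</title><url>http://example.com/b</url></item>
</headlines>"""


def xml_to_html(xml_text: str) -> str:
    """Render every <item> element as an HTML list entry."""
    root = ET.fromstring(xml_text)
    rows = []
    for item in root.iter("item"):
        # Escape text again on output, since the parser has decoded entities.
        title = escape(item.findtext("title", default=""))
        url = escape(item.findtext("url", default=""), quote=True)
        rows.append(f'<li><a href="{url}">{title}</a></li>')
    return "<ul>\n" + "\n".join(rows) + "\n</ul>\n"


def pregenerate(xml_text: str, out_path: Path) -> None:
    """Run the conversion once and store the result as a static file."""
    out_path.write_text(xml_to_html(xml_text), encoding="utf-8")
```

Calling pregenerate(SAMPLE_XML, Path("headlines.html")) from a scheduled
job would leave nothing to convert at request time.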

Also, there are instances where processing XML dynamically is easier
and more efficient. As an example, I recently worked on a project that
collects news headlines from internetnews.com and puts them into a
compact summary that web sites then display on their own pages. Since
news is generally time sensitive, the summary must be generated regularly.

Previously, each client site queried internetnews.com every 5 minutes
or so. Once the page was retrieved (around 30k usually), the script
ripped the headlines out of the HTML and generated a news summary,
which users then included on their homepage. There were several problems
with this:
1. With around 30,000 registered sites, they were concerned that
resources were being diverted by the potentially 30,000 clients
querying the page every few minutes.
2. They were embedding non-HTML tags in HTML.
3. The embedded tags did not contain enough information to
categorize the news.
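
To put the first problem in numbers, here is a back-of-the-envelope
calculation in Python. It is a worst-case sketch that assumes all
30,000 registered sites really do poll every 5 minutes and that "30k"
means 30,000 bytes per page; actual traffic was presumably lower.

```python
# Figures are taken from the description above.
# Assumption: "30k" is read as 30,000 bytes, and every site polls on schedule.
CLIENTS = 30_000
PAGE_BYTES = 30_000
INTERVAL_SECONDS = 5 * 60

bytes_per_interval = CLIENTS * PAGE_BYTES  # one full polling round
bytes_per_second = bytes_per_interval / INTERVAL_SECONDS
rounds_per_day = 24 * 60 * 60 // INTERVAL_SECONDS
bytes_per_day = bytes_per_interval * rounds_per_day

print(f"{bytes_per_interval / 1e6:.0f} MB per polling round")
print(f"{bytes_per_second / 1e6:.1f} MB/s sustained")
print(f"{bytes_per_day / 1e9:.1f} GB per day")
```

Under those assumptions the main news server would be sending out about
900 MB per round, or roughly 3 MB/s around the clock, just to feed the
scrapers.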

The answer was to use XML as a glue between the news Web
server and the clients who want to display news headlines on their
sites. Instead of each client retrieving the news page every few
minutes, the headlines are now gathered, categorized, and XMLified
by a separate server. The clients then download the headlines in a
more compact and useful XML format from that server. The benefits are:
1. Significantly reduced bandwidth utilization.
2. Resource usage is diverted away from www.internetnews.com.
3. The XML format is easy to understand. It would be easy to convert
to another format or to write a separate client. We can pre-generate
or dynamically generate the news summaries in HTML, DHTML,
and Javascript. XML has made this transition easier and will also
allow for a higher level of extensibility in the future.
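
The "gather, categorize, XMLify, then convert" flow described above can
be sketched in a few lines of Python. This is not the harvester's actual
code: the records, the element names (news, headline, title, url) and
the function names are all hypothetical, and the real news.xml format
may look quite different.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical headline records; the real harvester's data model is not shown here.
HEADLINES = [
    {"category": "top-news", "title": "First story", "url": "http://example.com/1"},
    {"category": "bus-news", "title": "Second story", "url": "http://example.com/2"},
    {"category": "top-news", "title": "Third story", "url": "http://example.com/3"},
]


def build_feed(headlines, category):
    """Serialize the headlines of one category into a small XML document."""
    root = ET.Element("news", {"category": category})
    for h in headlines:
        if h["category"] != category:
            continue
        item = ET.SubElement(root, "headline")
        ET.SubElement(item, "title").text = h["title"]
        ET.SubElement(item, "url").text = h["url"]
    return ET.tostring(root, encoding="unicode")


def feed_to_javascript(xml_text):
    """Convert a feed into a JavaScript statement that defines a headline array."""
    root = ET.fromstring(xml_text)
    items = [
        {"title": h.findtext("title"), "url": h.findtext("url")}
        for h in root.iter("headline")
    ]
    return "var headlines = " + json.dumps(items) + ";"
```

Because every output format is derived from the same XML, adding another
one (HTML, DHTML, or something new) only means writing one more small
converter like feed_to_javascript.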

You can see the results at: http://www.webreference.com (left side)
There are also some examples at:
http://www.webreference.com/headlines/nh/examples

You can register to use the news harvester at:
http://www.webreference.com/headlines/nh/

There are multiple news categories. Here are a few URLs for the
resulting XML, which is generated every 5 minutes:
http://headlines.internet.com/internetnews/top-news/news.xml
http://headlines.internet.com/internetnews/bus-news/news.xml


Jonathan.








