Documents and Document Fragments (Was RE: XML Information Set Requirements, W3C Note 18-February-1999)

Didier PH Martin martind at
Sat Feb 20 18:27:02 GMT 1999

Hi Mark

We now treat our web servers logically as 'XML servers', with either one
massive document on or thousands of smaller ones, whichever way you want
to slice it.

If I understand you correctly, you are saying that a stream could start
with the XML declaration <?xml version="1.0"?>, which indicates that
whatever follows it is in XML format. The processor at the other end of the
stream would then process each start-tag/end-tag pair as a small
information unit. Is that what you are saying?
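
To make the idea concrete, here is a minimal sketch of such a processor in
Python, using the standard library's incremental parser. The function name
information_units and the <stream> and <unit> element names are made up for
illustration, and treating each direct child of the root element as one
"information unit" is my assumption, not something any spec defines.

```python
import xml.etree.ElementTree as ET


def information_units(chunks):
    """Yield (tag, text) for each completed child of the root element.

    `chunks` is any iterable of text fragments, standing in for data
    arriving over a stream. Fragment boundaries may fall anywhere,
    including in the middle of an element.
    """
    parser = ET.XMLPullParser(events=("start", "end"))
    depth = 0  # number of elements currently open
    for chunk in chunks:
        parser.feed(chunk)
        for event, elem in parser.read_events():
            if event == "start":
                depth += 1
            else:
                depth -= 1
                # depth == 1 means only the root is still open, so a
                # direct child of the root has just been closed.
                if depth == 1:
                    yield elem.tag, (elem.text or "").strip()


# Hypothetical stream, split at arbitrary points.
incoming = [
    '<?xml version="1.0"?><stream><unit>fir',
    "st</unit><unit>second</unit>",
    "</stream>",
]
units = list(information_units(incoming))
print(units)
```

Each unit is handed over as soon as its end tag arrives, so the receiver
never needs the whole "document" before it starts working.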

The concept seems appealing and easy to implement. I guess that the problem
lies with the word "document" and the legacy of meaning that this word
carries (about 5000 years of the notion of a document as a physical entity
with ink on it :-). In our specs we are doing what marketing people call
"name extension": using the same word everywhere because it sells. It seems
that the word "document" (again because of all the legacy meaning) conveys a
restricted understanding of what we can do with XML. "Information unit"
would probably be more appropriate.

I also understand that the W3C has to work with legacy too. That legacy is
called a file, and most of the time we get the implicit equation document =
file. I agree that a file is mentally perceived as closer to a physical
(meaning paper here) document than a stream is; the physical-world
equivalent of a stream is more like a river or a road.

So I guess that the word "format" would be more versatile and could be
applied to either a document (i.e. a file) or a stream. It would convey the
meaning that the content is formatted with XML structure. Then a piece of
markup could be called an information unit and be perceived as a single
information unit (obviously). Documents can be constructed from "information
units" and streams convey "information units". This would have the advantage
of applying to both worlds: a) documents = files and b) streams.

At one time in history we were calling our transportation vehicle a
"horse". Imagine the confusion if we were still calling our car a "horse".
What would you call a horse then? Is that what we are doing with the word
"document"?

So Mark, your comment about XML servers is useful from both the conceptual
and the practical point of view.

Didier PH Martin
mailto:martind at
