XML Information Set Requirements, W3C Note 18-February-1999

Jeffrey E. Sussna jes at kuantech.com
Fri Feb 19 18:44:07 GMT 1999


I completely agree with Clark. As someone working with real-time XML streams, I think this is very important. In particular, the whole notion of "document" needs to be thought through very carefully in the context of 1999, rather than the context of 1990 when SGML was developed. If I may grow philosophical for a moment, I believe that XML is at a crossroads. That crossroads can be defined by examining the term "markup". I believe that XML is actually moving away from being "markup" oriented. First of all, one can easily imagine an XML document where all leaf-level elements are EMPTY, and contain all their semantics within attributes. In that case, there is nothing to be "marked up". Furthermore, when you apply XML to things like database record interchange, it really isn't a text-oriented environment anymore. 
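
To make the attribute-only case concrete, here is a rough sketch in Python using the standard xml.etree.ElementTree module. The element and attribute names (orders, order, id, customer, total) are invented for illustration and do not come from any real schema. Every leaf element is empty, so the parsed result is just a list of records with no text to "mark up".

```python
import xml.etree.ElementTree as ET

# Hypothetical record-interchange document: every leaf element is empty
# and all of the data is carried in attributes.
DOC = """<orders>
  <order id="1001" customer="acme" total="19.95"/>
  <order id="1002" customer="globex" total="250.00"/>
</orders>"""

root = ET.fromstring(DOC)

# Each leaf becomes a plain record (a dict of attribute name -> value).
records = [dict(order.attrib) for order in root]

# No leaf element carries any character data.
has_text = any((order.text or "").strip() for order in root)
```

Here records looks much more like rows from a database table than like annotated text.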

I believe that XML points more towards type systems than markup. If you look at a programming language, it generally supports two things (I am being very poetic and not rigorous here): defining and instantiating data types, and defining and instantiating operations on data. XML supports the first. It provides a mechanism to create and exchange instances of data types between external systems that will provide the operations on those data. The realization that DTDs are inadequate, and that a more robust schema specification language is needed, points in the same direction.

If you approach XML as a type system, the concept of document loses its first-class status (or at least should, in my opinion). It is interesting that the concept of document (even physical document as file) has crept into programming languages, and has caused problems there as well. The C language #include directive is a physical rather than a logical mechanism. When you try to build a database-driven incremental build system, includes become problematic.

I would like to encourage the XML community to 1) pay attention to the lessons of 30 years of development in the arena of programming and type languages, and 2) not get bogged down by the historical baggage of the M in XML.

Jeff Sussna

-----Original Message-----
From: owner-xml-dev at ic.ac.uk [mailto:owner-xml-dev at ic.ac.uk]On Behalf Of
Clark Evans
Sent: Thursday, February 18, 1999 9:09 PM
To: Marcus Carr
Cc: xml-dev at ic.ac.uk
Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999


Marcus Carr wrote:

> I think you might be applying a meaning to that
> phrase that it doesn't deserve - it doesn't call XML
> a document standard, it uses the term "XML document",
> with document defined in the XML recommendation as:
> 
> "A data object is an XML document if it is well-formed, 
> as defined in this specification. A well-formed XML document 
> may in addition be valid if it meets certain further constraints."
> 
> This allows you to use the phrases "XML data object" and 
> an "XML document" interchangeably.
>
> This isn't incongruous with stream markup - you just need to 
> consider the stream as an XML document. Seriously though, you 
> probably wouldn't have the same concerns about "XML data object" ...

My concerns would be even greater.  This conjures up in my
mind a Java or C++ object where the complete stream 
has to be loaded in memory (or some other random-access
medium) before it can be used.  Yes, I know you can have 
a multi-threaded implementation so that you can start using 
the data object before it finishes reading, etc.  However, 
given the object model it is *reasonable* for the naive 
programmer to ask for something at the _end_ of the stream.  
This will cause the call to block until the stream ends.
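
A minimal sketch of the streaming alternative, assuming Python's
standard xml.sax module and a made-up feed of <tick> elements.  The
element and attribute names and the TickHandler and feed_chunks
helpers are all hypothetical.  The handler sees each element as its
chunk arrives, and nothing waits for the end of the stream, which in
this example never comes.

```python
import xml.sax


class TickHandler(xml.sax.ContentHandler):
    """Records each <tick> as soon as its start tag has been parsed."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        if name == "tick":
            self.seen.append((attrs["symbol"], float(attrs["price"])))


def feed_chunks(chunks):
    """Feed text chunks to an incremental SAX parser, one at a time."""
    handler = TickHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    progress = []
    for chunk in chunks:
        parser.feed(chunk)                 # parse only what has arrived
        progress.append(len(handler.seen))  # ticks handled so far
    return handler, progress


# The root element is never closed: the "document" has no end yet.
chunks = [
    '<ticks><tick symbol="IBM" price="171.5"/>',
    '<tick symbol="SUN" price="98.25"/>',
]
handler, progress = feed_chunks(chunks)
```

An object-style API would need the closing </ticks> before it could
hand back a complete tree; here each tick is usable as soon as it has
been read.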

If "data stream" processing was treated with *equal*
importance by the W3C committees, then they would see,
in many cases, that this complementary approach is at 
least as good as, or in some cases far superior to 
a "data object" approach.

Constantly viewing XML as a standard for the description 
of "data objects" and not "data streams" is a subtle and 
important bias.  It is taking object-orientation too far 
and discarding parallel stream processing, and its related
technologies like SAX and SAXON.

:) Clark

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)






