XML Information Set Requirements, W3C Note 18-February-1999
Marc.McDonald at Design-Intelligence.com
Sat Feb 20 02:02:42 GMT 1999
I agree with Sussna: thinking the concept of a document through more
fully brings out a number of interesting ideas.
Treating a document as a stream or information set allows it to be
distributed over a network. Instead of requiring the entire 'document'
to be transferred en masse as a file, it can be delivered piecewise
over a stream. Consider this just-in-time manufacturing of the
'document'.
Naturally, you can think of cases where only part of the entire
document is needed. Subsetting of the document tree is one of the
features of XSL.
Unifying these two ideas provides a new use for a DTD. It is not only a
means to describe the valid structure of a document, but can now also
advertise the information available. A site can be described as
capable of providing information sets in a set of structures defined
by DTDs (or their replacement). A consuming application could request
information by a pattern or query which would return the desired
subset of information.
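As a rough illustration of requesting a subset by pattern, the sketch below uses the path syntax of Python's standard xml.etree.ElementTree module. The catalogue content and the query() helper are hypothetical, invented only for this example; nothing here is taken from XSL or from any proposal on the list.

```python
import xml.etree.ElementTree as ET

# Hypothetical information set that a site might advertise.
CATALOG = """<catalog>
  <book year="1998"><title>SGML Basics</title></book>
  <book year="1999"><title>XML Streams</title></book>
</catalog>"""

def query(xml_text, pattern):
    """Return the serialized elements matching an ElementTree path pattern."""
    root = ET.fromstring(xml_text)
    return [ET.tostring(el, encoding="unicode").strip()
            for el in root.findall(pattern)]

# The consumer asks only for the 1999 entries, not the whole tree.
subset = query(CATALOG, "./book[@year='1999']")
```

Here the whole catalogue is still parsed on the serving side; the point is only that the consumer names a pattern and receives the matching subset.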
What could be accomplished is a unified solution to problems addressed
and/or recognized in SAX, XSL, queries, DOM, and fragments. It also
provides a model for a data server as an XML 'document' constructor.
In terms of architecture, it removes bottlenecks. Converting to a file
model is expensive when the information is large and could be used
piecemeal on the other side; it is a worst-case solution. A
demand-based stream model will create entire documents only if
required by the ultimate consumer of the information and otherwise
incrementally provide elements.
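A minimal sketch of the demand-based idea, assuming a Python generator stands in for the stream: elements are serialized one at a time as the consumer pulls them, and the complete document is assembled only on request. The names element_stream and build_document, and the record data, are hypothetical.

```python
import xml.etree.ElementTree as ET
from itertools import islice

def element_stream(records):
    """Lazily yield one serialized <item> element per record."""
    for i, text in enumerate(records):
        item = ET.Element("item", id=str(i))
        item.text = text
        yield ET.tostring(item, encoding="unicode")

def build_document(records):
    """Assemble the entire document, only when the consumer requires it."""
    return "<items>" + "".join(element_stream(records)) + "</items>"

# Hypothetical large source; nothing is serialized until it is pulled.
source = (f"record {n}" for n in range(1_000_000))

# A consumer that needs only the first two elements pays only for those.
first_two = list(islice(element_stream(source), 2))
```

A consumer that does need the whole 'document' calls build_document() and pays the full cost; every other consumer stops pulling as soon as it has what it needs.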
Marc B McDonald, Principal Software Scientist
Design Intelligence Inc, Seattle WA
http://www.design-intelligence.com
----------
From: Jeffrey E. Sussna [SMTP:jes at kuantech.com]
Sent: Friday, February 19, 1999 10:42 AM
To: 'Clark Evans'; 'Marcus Carr'
Cc: xml-dev at ic.ac.uk
Subject: RE: XML Information Set Requirements, W3C Note 18-February-1999
I completely agree with Clark. As someone working with real-time XML
streams, I think this is very important. In particular, the whole
notion of "document" needs to be thought through very carefully in the
context of 1999, rather than the context of 1990 when SGML was
developed. If I may grow philosophical for a moment, I believe that
XML is at a crossroads. That crossroads can be defined by examining
the term "markup". I believe that XML is actually moving away from
being "markup" oriented. First of all, one can easily imagine an XML
document where all leaf-level elements are EMPTY, and contain all
their semantics within attributes. In that case, there is nothing to
be "marked up". Furthermore, when you apply XML to things like
database record interchange, it really isn't a text-oriented
environment anymore.
I believe that XML points more towards type systems than markup. If
you look at a programming language, it generally supports 2 things (I
am being very poetic and not rigorous here): defining and
instantiating data types, and defining and instantiating operations on
data. XML supports the first. It provides a mechanism to create and
exchange instances of data types between external systems that will
provide the operations on those data. The realization that DTDs are
inadequate, and that a more robust schema specification language is
needed, points in the same direction.
If you approach XML as a type system, the concept of document loses
its first-class status (or at least should, in my opinion). It is
interesting that the concept of document (even physical document as
file) has crept into programming languages, and has caused problems
there as well. The C language include directive is a physical rather
than a logical mechanism. When you try to build a database-driven
incremental build system, includes become problematic.
I would like to encourage the XML community to 1) pay attention to the
lessons of 30 years of development in the arena of programming and type
languages, and 2) not get bogged down by the historical baggage of the
M in XML.
Jeff Sussna
-----Original Message-----
From: owner-xml-dev at ic.ac.uk [mailto:owner-xml-dev at ic.ac.uk] On Behalf Of Clark Evans
Sent: Thursday, February 18, 1999 9:09 PM
To: Marcus Carr
Cc: xml-dev at ic.ac.uk
Subject: Re: XML Information Set Requirements, W3C Note 18-February-1999
Marcus Carr wrote:
> I think you might be applying a meaning to that
> phrase that it doesn't deserve - it doesn't call XML
> a document standard, it uses the term "XML document",
> with document defined in the XML recommendation as:
>
> "A data object is an XML document if it is well-formed,
> as defined in this specification. A well-formed XML document
> may in addition be valid if it meets certain further constraints."
>
> This allows you to use the phrases "XML data object" and
> an "XML document" interchangeably.
>
> This isn't incongruous with stream markup - you just need to
> consider the stream as an XML document. Seriously though, you
> probably wouldn't have the same concerns about "XML data object"
...
My concerns would be even greater. This conjures up in my
mind a Java or C++ object where the complete stream
has to be loaded in memory (or some other random-access
medium) before it can be used. Yes, I know you can have
a multi-threaded implementation so that you can start using
the data object before it finishes reading, etc. However,
given the object model it is *reasonable* for the naive
programmer to ask for something at the _end_ of the stream.
This will cause the call to block until the stream ends.
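To make the contrast concrete, here is a small sketch using the incremental feed() interface of Python's standard xml.sax parser. The NameCollector handler and the two chunks are hypothetical. Events are normally delivered as complete tags arrive, though the exact timing depends on the underlying parser's buffering.

```python
import xml.sax

class NameCollector(xml.sax.ContentHandler):
    """Record each element name as its start tag is reported."""
    def __init__(self):
        super().__init__()
        self.seen = []

    def startElement(self, name, attrs):
        self.seen.append(name)

handler = NameCollector()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)

# First chunk of the stream: the document is not yet complete.
parser.feed("<log><entry id='1'/>")
seen_midstream = list(handler.seen)  # whatever has been reported so far

# Remainder of the stream arrives later.
parser.feed("<entry id='2'/></log>")
parser.close()
seen_at_end = list(handler.seen)
```

The handler never asks for anything at the end of the stream, so no call has to wait for the whole input before work can start.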
If "data stream" processing was treated with *equal*
importance by the W3C committees, then they would see,
in many cases, that this complementary approach is at
least as good as, or in some cases far superior to,
a "data object" approach.
Constantly viewing XML as a standard for the description
of "data objects" and not "data streams" is a subtle and
important bias. It is taking object-orientation too far
and discarding parallel stream processing and its related
technologies like SAX and SAXON.
:) Clark
xml-dev: A list for W3C XML Developers. To post,
mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on
CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)