XML Information Set Requirements, W3C Note 18-February-1999

Clark Evans clark.evans at manhattanproject.com
Mon Feb 22 06:29:20 GMT 1999

Paul Prescod wrote:
| I don't believe that the W3C has forgotten about stream processing. One of
| the more controversial parts of the XML namespace specification is
| intimately tied to stream processing (local namespaces). I think that I
| can safely say that when XML was being developed streaming uses were as
| high in the minds of the working group as tree-based uses. Stream based
| processing has always been more common in the SGML world than tree
| processing. Okay then, why are the DOM and XSL tree based? Well, the web
| infrastructure favors small documents inherently. Large streams must be
| broken up on the server side for performance reasons. Bandwidth, not RAM,
| is the limiting factor in Web user interfaces.

This clears up a great deal for me.  Thank you.  I hadn't seen
the direct relationship between using XML for stream processing
and local namespaces.  So would a similar controversy arise
if one proposed local architectures?  *evil grin* 

Jeff Sussna wrote:
| If you approach XML as a type system, the concept of document loses
| its first-class status (or at least should, in my opinion).

<warning up-time="34h">

I think I agree with this.  Please correct me if I'm wrong, but with
Property Sets, each node has a link directly to the 'document root'.
I see this as something which deserves consideration (among other
things) when the Information Set is defined.  Perhaps it can look like this:

BaseInfoSetItem
  // stuff common to both stream and object representations 

TreeInfoSetNode public BaseInfoSetItem
  // extra stuff that you get for free when the representation
  // is an in-memory graph, database wrapper, or some other
  // complete object with random access.
  TreeInfoSetNode *DocumentRoot();

EventInfoSetStack public BaseInfoSetItem
  // extra stuff relevant when you have an event based, stack 
  // representation of the information in question.  This would 
  // be the same as the 'visitor' interface for the TreeInfoSet?


Hmm. Just meandering...
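
To make the split above a bit more concrete, here is a minimal Python
sketch.  Only the three class names and DocumentRoot() come from the
outline above; the attribute names (name, parent, children,
open_ancestors) and the path() helper are assumptions made up for
illustration, not anything taken from a specification.

```python
class BaseInfoSetItem:
    """Stuff common to both stream and object representations."""

    def __init__(self, name):
        self.name = name  # hypothetical common property


class TreeInfoSetNode(BaseInfoSetItem):
    """Extra stuff you get when the whole graph is available."""

    def __init__(self, name, parent=None):
        super().__init__(name)
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def document_root(self):
        # Random access lets any node walk up to the document root.
        node = self
        while node.parent is not None:
            node = node.parent
        return node


class EventInfoSetStack(BaseInfoSetItem):
    """Event view: the current item plus the names still open above it."""

    def __init__(self, name, open_ancestors=()):
        super().__init__(name)
        self.open_ancestors = tuple(open_ancestors)

    def path(self):
        # Only the open ancestors are known; there is no whole tree to query.
        return "/".join(self.open_ancestors + (self.name,))


root = TreeInfoSetNode("doc")
para = TreeInfoSetNode("para", parent=TreeInfoSetNode("section", parent=root))
print(para.document_root().name)

event = EventInfoSetStack("para", open_ancestors=("doc", "section"))
print(event.path())
```

The point of the sketch is only that document_root() lives on the tree
side, while the event side knows nothing beyond what is on the stack.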

Rick Jelliffe replied to Jeff's note:
| XML is not a type system. A document is a graph of elements, data,
| comments and PIs with
|    * an ID namespace
|    * optionally some element type declarations
|    * optionally some entity declarations and notation declarations
|    * optionally namespace declarations which allow local type names to
|      be qualified by a URI
| In other words, the document is the block mechanism for metadata 
| and namespaces for a subtree of the entire hyper-document.

Hmm. Perhaps this would be a good starting place for
the 'definition' of a document.  Thus, would it be fair 
to say that a document is analogous to a database transaction?

If so, then my question becomes:  How can I express nested blocks?  

| XML is a labelling notation, not a type system.

I'm not sure I get the distinction.  When you label 
something, aren't you in effect classifying it, i.e.,
giving it a type?  And isn't a label required 
to identify a type?  

| If the document loses its first-class status, which of these things
| should be gotten rid of? Do you want arbitrary scoping of IDs, element
| type declarations, entity declarations, notation declarations and
| namespaces?  If so, you need some block mechanism to allow these.  

Hmm.  Well, I see a stream based system having a stack. 

Thus, each <tag> beginning something pushes the element 
onto the stack, and each </tag> pops the stack.  Thus, I see
<tag> and </tag> as my block mechanisms.   Is this too naive?
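
A rough sketch of what I mean, using Python's standard xml.sax module.
The StackHandler class and the element_paths() helper are names made up
for this example; the only real API used is xml.sax.ContentHandler and
xml.sax.parseString.

```python
import xml.sax


class StackHandler(xml.sax.ContentHandler):
    """Keeps a stack of the currently open elements."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of elements that are open right now
        self.paths = []   # path of every element, recorded when it opens

    def startElement(self, name, attrs):
        self.stack.append(name)                  # <tag> pushes
        self.paths.append("/".join(self.stack))

    def endElement(self, name):
        self.stack.pop()                         # </tag> pops


def element_paths(xml_text):
    handler = StackHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.paths


print(element_paths("<doc><a><b/></a><c/></doc>"))
# ['doc', 'doc/a', 'doc/a/b', 'doc/c']
```

Nothing here ever holds more than the open elements, which is why the
stack feels like a natural block mechanism for a stream.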

Don Park wrote:
| Why not have "from here on use these declarations" as the 
| default behavior and then introduce 'push' and 'pop' constructs? 
| 'push' would save current declaration settings and the 'pop' would 
| just restore to the saved settings.

Maybe I'm not getting it, but <tag> and </tag> provide the
push/pop mechanism automagically.  The only things that have
problems are those "from here on" things.  It's unfortunate
that SGML backwards compatibility dictates that this is the
default behavior... and thus <tag> </tag> can't provide
the push/pop mechanism?  *dazed*
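
Here is a small sketch of declarations scoped by <tag> and </tag>,
using namespace declarations as the example since those are already
local.  It assumes Python's xml.sax in its default (non namespace-aware)
mode, where xmlns attributes arrive as ordinary attributes; the
ScopedDeclarations class and the scopes_per_element() helper are
hypothetical names for this illustration only.

```python
import xml.sax


class ScopedDeclarations(xml.sax.ContentHandler):
    """Treats each element as a block that scopes its declarations."""

    def __init__(self):
        super().__init__()
        self.scopes = [{}]   # stack of prefix -> URI settings
        self.seen = []       # (element name, settings in force for it)

    def startElement(self, name, attrs):
        scope = dict(self.scopes[-1])        # 'push': inherit enclosing settings
        for attr, value in attrs.items():
            if attr == "xmlns":
                scope[""] = value            # default namespace
            elif attr.startswith("xmlns:"):
                scope[attr[len("xmlns:"):]] = value
        self.scopes.append(scope)
        self.seen.append((name, dict(scope)))

    def endElement(self, name):
        self.scopes.pop()                    # 'pop': restore enclosing settings


def scopes_per_element(xml_text):
    handler = ScopedDeclarations()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.seen


doc = ('<doc xmlns:a="urn:one"><inner xmlns:a="urn:two"><leaf/></inner>'
       '<after/></doc>')
for name, scope in scopes_per_element(doc):
    print(name, scope)
```

The <after/> element sees urn:one again without any explicit 'pop'
construct, because closing <inner> already restored the saved settings.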

I think I need to go back and do more real-world hacking; I'm 
starting to get a better feeling for XML.  Sorry for butting into 
the conversation again.

:) Clark Evans
