Parser compliance

Len Bullard cbullard at hiwaay.net
Tue Nov 23 01:39:15 GMT 1999


Steven R. Newcomb wrote:
> 
> My main point is, I think, also yours: it causes unnecessary grief to
> assume that the interchange structure of an XML resource needs to be
> the same as, or even correspond to, the API to the information that
> the XML resource carries.

That is true.  An API is designed to be an interface to information, 
not the other way around.  But define the lexical types, freeze the
feature sets (probably not a good idea, IMO), declare namespaces and
schemas, and you get a structure good enough to manage many kinds 
of easily proven good bits of information for reasonable transactions 
among interfaces, that is, at the system boundaries.

Define what goodness in information is:  coherence.  

Declare a lexicon, let namespace families aggregate by 
frequency and control referents (W) that, when valid, 
do not degrade the value of W, which may be modeled as phasing. 
Coherence is the capacity of the namespace family to carry 
information with the least noise through the most cycles.

Given a set of XML application languages (vocabularies), what element 
names are being shared and do these conflict semantically?  How is this
detected?
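
The mechanical half of that question can at least be sketched.  The
following is a minimal illustration, assuming Python's standard
xml.etree.ElementTree module and two invented vocabularies (the
urn:example namespaces and both sample documents are hypothetical).
It only reports local element names that turn up in more than one
namespace; whether those names actually conflict semantically is 
not something this check can decide.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def split_tag(tag):
    """Split ElementTree's '{namespace}local' form into (namespace, local)."""
    if tag.startswith("{"):
        namespace, local = tag[1:].split("}", 1)
        return namespace, local
    return "", tag


def shared_local_names(documents):
    """Return local element names used in more than one namespace,
    mapped to the set of namespaces that use them."""
    seen = defaultdict(set)
    for text in documents:
        for elem in ET.fromstring(text).iter():
            namespace, local = split_tag(elem.tag)
            seen[local].add(namespace)
    return {name: spaces for name, spaces in seen.items() if len(spaces) > 1}


# Hypothetical vocabularies, invented for illustration only.
INVOICE = ('<invoice xmlns="urn:example:billing">'
           '<title>Q4</title><total>10</total></invoice>')
BOOK = ('<book xmlns="urn:example:library">'
        '<title>XML</title><author>A</author></book>')

collisions = shared_local_names([INVOICE, BOOK])
for name, spaces in sorted(collisions.items()):
    print(name, sorted(spaces))
```

Here "title" is flagged because both families use it.  Deciding
whether a billing title and a library title mean the same thing still
takes a person, or a schema that says so.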

Markup needs semantics.  The information at some level, systemically, 
is bound to the machine.

So we have: (forgiveSimplification)

o  Event Parsing:  SAX, event-oriented, tag-stacker, system driven by 
pure names.

o  Object-oriented:  DOM, instantiate the property set as a tree of 
objects with interfaces

That works fine.  Why not?  The purpose of structured information is 
to bind it and share it among cooperative entities, where cooperation 
is defined as the transactions among defined boundaries: cooperation 
by the shared framework of interfaces.  The Browser doesn't care.  
It sees this all as services.  That's fine.  Good design.

> I mentioned the DOM only as an *example* of ready-to-run objects; I
> did not intend to say that the DOM is useful for every purpose.

Right.  DOM is good when you need an object in memory to talk to.
SAX is good when you need an object in memory to talk to you.
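
A minimal sketch of the two styles side by side, assuming Python's
standard xml.sax and xml.dom.minidom modules and an invented sample
document (the TagStacker class and the order/item markup are
hypothetical names for illustration).  The SAX handler is called by
the parser and keeps a stack of open tag names; the DOM tree is built
first and then queried.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical sample document, invented for illustration only.
DOC = "<order><item>bolt</item><item>nut</item></order>"


class TagStacker(xml.sax.ContentHandler):
    """SAX talks to you: the parser calls these methods as it reads."""

    def __init__(self):
        super().__init__()
        self.stack = []  # names of the currently open elements
        self.paths = []  # path recorded each time an element opens

    def startElement(self, name, attrs):
        self.stack.append(name)
        self.paths.append("/".join(self.stack))

    def endElement(self, name):
        self.stack.pop()


handler = TagStacker()
xml.sax.parseString(DOC.encode("utf-8"), handler)
print(handler.paths)  # ['order', 'order/item', 'order/item']

# DOM: you talk to it.  The whole tree is in memory and can be queried.
dom = minidom.parseString(DOC)
items = [node.firstChild.data for node in dom.getElementsByTagName("item")]
print(items)  # ['bolt', 'nut']
```

The tag-stacker never holds more than the current path, so it suits
long streams.  The DOM version costs a full tree but lets you ask 
questions in any order.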

> Ready-to-run objects (including your indexes) are ready-to-run
> objects, regardless of the process by which they were created.

But not by the definition of the interface.  To use the definition, 
at least at the IDL, you need a schematic representation of the
interface.  How do I know they are <readyToRun>?  Where in the markup
do I find this <readyToRun> tag?  (ok... yep, messages in markup,
right?)  DNA?

> Ready-to-run objects (whatever one may call them, and regardless of
> how they are implemented) are manufactured by all applications of XML,
> not just by some of them.

Sure.  That's the beauty of it.  Pick any good object, relational, 
or stringMachine system and it still works.  The hard problem is 
this: when the machines are through with it, does it still mean the 
same thing?  Can we prove it?

Does that matter if practice reveals we don't always have to 
be able to prove it?  We have to be able to prove it often 
enough and in enough formats that the maps in the right 
places don't become incoherent.

I was reading the Cocoon papers and the notion of "hot spots" 
is perfect because the optimization is by the frequency of 
use.  Good metaphor.  Some may want to look up some DARPA work 
from the early eighties (DICE project) on the application 
of the model of simulated annealing.

len


