Seeking a Dao of Groves (from Eliot Kimber)

Wed Feb 2 08:47:30 GMT 2000

Posted on behalf of Eliot Kimber [for technical list reasons].

[Editorial comment - this is the sort of posting that we need to identify
and keep for reference.]

>X-Mozilla-Status2: 00000000
>Message-ID: <38963E05.F1E03297 at drmacro.com>
>Date: Mon, 31 Jan 2000 19:59:33 -0600
>From: Eliot Kimber <drmacro at drmacro.com>
>X-Mailer: Mozilla 4.7 [en] (Win98; I)
>X-Accept-Language: en
>MIME-Version: 1.0
>To: XML Dev <xml-dev at ic.ac.uk>
>Subject: Re: Seeking a Dao of Groves
>References: <Pine.GSO.3.96.1000128071621.26755A-100000 at grind>
> <38924B5B.4FDA at hiwaay.net>
>Content-Type: text/plain; charset=us-ascii
>Content-Transfer-Encoding: 7bit
>
>[I originally sent this Sunday 30 Jan but I never saw it posted. My
>apologies if it appears twice.]
>
>Len Bullard wrote:
>
>> Why?  Politics and personalities are what they are, power, tactics
>> all of that, but they get us no closer to understanding HyTime. It
>> is true that I was around a bit after hyTime emerged, and worked
>> with its inventors.  It is true that by the time we got to the
>> Vancouver conference, and after some experience, I no longer had
>> the foggiest idea what Eliot and Steven were saying in their
>> presentations.
>
>And us you, Len :-)
> 
>> HyTime is brilliant, but brilliance blinds as well as illuminates.
>> Sometimes the best position for a light is behind, above and slightly
>> to the left.  So, a statement for finding the position:  standards
>> derail, in my experience, when the problem to be solved by them is
>> not (adequately understood | clearly stated | closed).  I am asking
>> Dr. Newcomb, the only one besides Dr. Goldfarb on this list who
>> was there at the beginning to verify or refute the following, and
>> fill in the rationale.  I would be delighted if Dr Goldfarb would
>> help.
>
>Steve hasn't responded yet, so I'll take the liberty of providing some
>answers as I believe I can do so with some authority.
> 
>> 1.  True or False:  hyTime started (little H deliberate) as a music
>> standard.
>
>True.  
>
>> The problem(s) to be solved were synchronization and an
>> application language for a musical notation.  What requirements of music
>> made the hyTime designers move into a larger scope of standardization
>> (Integrated Open Hypermedia:  Bibliographic Model).
>
>The story that I have been told by Drs. Goldfarb and Newcomb is that it
>was the U.S. DoD who said "this synchronization stuff is just what we
>need for things like battlefield planning and management, but if we put
>a music standard in our RFQs, we'll be laughed out of the Pentagon, so
>can y'all do us a favor and pull the generic stuff out into a separate
>standard? We'd be ever so grateful." Not being idiots, they did.
>
>There was also, I think, the general realization by the committee that
>the structures needed to do the synchronization required by music,
>opera, ballet, and other forms of time-based stuff are in fact general
>to a wide range of problems.  They could define a general facility and
>then use it as the underpinnings of a more specialized music
>representation standard (Standard Music Description Language, which
>Steve and I are trying to finish up now, as a matter of fact).
> 
>> 2.  True or False:  There originally WAS a hyTime DTD.  Why was
>> it abandoned?
>
>True. It was abandoned because a "DTD for HyTime" didn't solve the
>problem. You needed to be able to create your own documents, with your
>own element types and semantics, that took advantage of the generic
>linking and scheduling semantics that HyTime codified. The early drafts
>used complex systems of parameter entities but it was quickly realized
>that that wouldn't work, so they came up with the idea of architectures
>as a way of mapping from specialized document types to the general types
>defined by the HyTime standard.
>
>This is, for example, exactly what XLink does, for the same reason.
>
>HyTime is not, like HTML, simply about creating *a single way* to create
>hypertext documents. It's about enabling *any document* to also be a
>hypertext document. The second is inherently more complex than the
>first, but also much much more powerful.  This is why XLink is cool but
>also why it's more difficult to understand and implement than HTML.
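
[Editorial comment - a minimal sketch of the architecture idea described
above, added for reference and not taken from the HyTime standard. It
assumes a document whose own element types declare a general form through
an attribute. The attribute name "HyTime", the form name "clink", the
"linkend" attribute, and the sample element types are illustrative
assumptions modelled on HyTime's contextual link; real architectural
processing involves considerably more than this.]

```python
import xml.etree.ElementTree as ET

# A document with its own element types. Two of them declare that they
# play the role of a general link form; the element type names differ.
DOC = """
<manual>
  <xref HyTime="clink" linkend="intro">see the introduction</xref>
  <seealso HyTime="clink" linkend="sec2">related material</seealso>
  <para>No architectural role here.</para>
</manual>
""".strip()


def architectural_instances(root, arch_attr="HyTime"):
    """Yield (form name, local element type, element) for each element
    that declares an architectural form via the given attribute."""
    for elem in root.iter():
        form = elem.get(arch_attr)
        if form is not None:
            yield form, elem.tag, elem


root = ET.fromstring(DOC)

# A generic processor only looks at the declared form, never at the
# document-specific element type name.
links = [
    (form, tag, elem.get("linkend"))
    for form, tag, elem in architectural_instances(root)
]
print(links)
```

The point of the sketch is that "xref" and "seealso" are both handled by
the same generic link logic, while "para" is simply ignored by it.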
> 
>> 3.  When at its most widely studied, HyTime included an exhaustive
>> set of linking and location models.  At this point, the synchronization
>> facility was expressed using these.  Why did linking and location become
>> the dominant feature of HyTime?
>
>I think linking and addressing became dominant for at least three
>reasons:
>
>1. Linking and addressing are of immediate benefit to almost all typical
>SGML/XML applications (e.g., technical documentation in all its myriad
>forms). Therefore the first applications of any part of HyTime were
>going to be in linking applications. That was certainly what interested
>me in HyTime initially (we needed to SGMLify the already sophisticated
>linking in IBM's BookMaster application and HyTime seemed a very close
>match to our existing requirements). Tools like SoftQuad Panorama/Synex
>Viewport made it easy to use HyTime-based hyperlinking, at least in
>simple ways.  Most SGML practitioners already understood hyperlinking (or
>at least the requirement for it), so it was easier for them to see how
>HyTime could be of some value there.
>
>2. Scheduling is much more involved than linking and much more difficult
>to implement from scratch. In 1992, people were still struggling to get
>industrial-strength SGML page-model presentation systems implemented.
>What work was being done on the hypermedia parts of HyTime was being
>done in universities by people like Lloyd Rutledge. It was also very
>abstract and difficult to understand, certainly from the standard alone.
>The pool of people who might be interested in it was small at best and
>many, if not most, of them were already engaged in more immediate
>concerns, like implementing the linking parts of HyTime.  This is still
>largely the case, although things like SMIL are helping us to better
>understand the problem space.  This is really an area where you have to
>have a concrete application to really understand it or even be motivated
>to implement it. The folks who understand this part of HyTime best are
>still largely engaged in putting bread on their table and getting some
>basic infrastructure components implemented. But we are very close to
>having the technology base we need to make implementing the scheduling
>part of HyTime, if not easy, at least practical.
>
>3. The people who have the most to gain from implementing the scheduling
>stuff have the least ability to realize it: educators and archivists.
>HyTime is ultimately about providing an interchange/archival
>representation form for hypermedia information. This is of vital
>importance to educators who need to be able to create rich information
>presentations that are information system and presentation platform
>independent (that is, apply to hypermedia the same benefits of generic
>markup that technical information has enjoyed for years). It is of vital
>importance to archivists (how many people realize that there exists
>today *no standard way* to archive music except as print scores?). 
>However, these are two groups of people who have little money to spend
>on implementing standards like HyTime and, because they have little
>money, little influence on companies that could implement it.
>
>If you're Macromedia or Adobe, what financial motivation do you have to
>implement HyTime? Your biggest customers make untold millions of dollars
>selling the stuff they develop with your tools, so much profit that the
>cost of authoring and reauthoring is noise, no matter what it costs,
>because the authoring cost is a fraction of the total cost, such that
>optimizing it further would provide little absolute benefit to the
>business. Will you listen to the Disneys and Origins and Dreamworks of
>the world or to the people at the Texas School for the Blind who want an
>easy and sustainable way to make tomorrow's multimedia curriculum
>usable by the visually impaired (how do you run a point-and-click
>tutorial if you can't see the screen? How do you learn from it if you
>can't hear the words? How do you run it if you are paralyzed from the
>neck down?)?
>
>Len and I know that Macromedia Director would be a much more useful and
>interesting tool if it could save information in a HyTime-conforming
>format, but what motivation does anyone at Macromedia have to even learn
>that fact, much less put it into practice? Absolutely none. At least
>until the same legislators who are requiring both more multimedia
>content in schools and fully-accessible materials for all students
>realize that these two requirements cannot be met by current technology
>and make the use of HyTime (or its functional equivalent) required by
>law.
> 
>> 4.  True or False:  Groves are a concept borrowed from DSSSL, a
>> style language, itself, originally that was altered to include
>> Semantics.  
>
>False. Groves are a concept that both DSSSL and HyTime needed in order
>to be both compatible with each other and fully defined as standards.
>Both DSSSL and HyTime had notions of abstract trees, but neither had
>defined them formally. When, in 1995, it was realized that these two
>standards could not be completed unless they were based on the same
>basic model of what an SGML document is, the two working groups came
>together to develop a common technical solution to the problem. Groves
>were the result. The technical details came more or less equally from
>Charles Goldfarb, James Clark, and Peter Newcomb, with important
>contributions by the usual cast of suspects, including myself. 
>
>[As far as I know, the DSSSL standard always included the word
>"semantics" in its title (whatever that means--I'll leave that to Sharon
>Adler to explain).]
>
>> What requirement in a linking and location standard
>> resulted in a unification with the DSSSL groves concept?
>
>The requirement for a common underlying abstract model of what an SGML
>document is. Both DSSSL and HyTime depend on the ability to do
>addressing. DSSSL so that you can associate style with things, HyTime so
>that you can relate things together. To do addressing, you must formally
>define the structure of the thing being addressed. To do this you must
>have a formalism for talking about structures. That's what groves are.
>HyTime had the additional requirement that it had to be applicable to
>*all data types*, not just SGML/XML (with groves, DSSSL can also be
>applied to any type of data, but the standard as written does not
>explicitly recognize this fact). Therefore, it was not enough to define
>a formal model for SGML documents, we had to have a formalism from which
>the abstract model for any kind of data (including SGML) could be
>defined.
>
>Another requirement was that both standards needed a *fundamentally
>different view* of SGML documents that reflected the optimizations
>required by their differing uses. In particular, DSSSL needed to see
>processing instructions and individual characters while HyTime wanted to
>ignore processing instructions (by default) and treat sequences of
>characters as single objects for the purpose of addressing. 
>
>This requirement is reflected in groves through the "grove plan", which
>lets you say for a given grove type what things you do or don't want to
>see at the moment. This provided a formal way in which DSSSL and HyTime
>could both be based on the same SGML data model yet have different,
>incompatible default views.  This is a complicating factor for groves
>that could not be avoided (and turns out to be of tremendous utility in
>practice).
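
[Editorial comment - a minimal sketch of the grove plan idea described
above, added for reference. It is not the SGML property set: the node
classes ("element", "pi", "char", "data"), the two plan dictionaries, and
the apply_plan function are hypothetical names chosen for illustration.
It only shows how one underlying tree can yield two different default
views.]

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    cls: str                      # node class, e.g. "element", "pi", "char"
    value: str = ""               # element type name, PI text, or character
    children: List["Node"] = field(default_factory=list)


# Hypothetical plans: which node classes are visible, and whether runs of
# characters are exposed one by one or as a single "data" node.
DSSSL_PLAN = {"include": {"element", "pi", "char"}, "merge_chars": False}
HYTIME_PLAN = {"include": {"element", "char"}, "merge_chars": True}


def apply_plan(node, plan):
    """Return a new tree containing only what the plan exposes."""
    out = []
    for child in node.children:
        if child.cls not in plan["include"]:
            continue  # this view does not want to see this node class
        if child.cls == "char" and plan["merge_chars"]:
            if out and out[-1].cls == "data":
                out[-1].value += child.value  # extend the current run
            else:
                out.append(Node("data", child.value))
        else:
            out.append(apply_plan(child, plan))
    return Node(node.cls, node.value, out)


doc = Node("element", "p", [
    Node("char", "H"),
    Node("char", "i"),
    Node("pi", "page-break"),
    Node("element", "em", [Node("char", "o"), Node("char", "k")]),
])

dsssl_view = apply_plan(doc, DSSSL_PLAN)
hytime_view = apply_plan(doc, HYTIME_PLAN)
print([(c.cls, c.value) for c in dsssl_view.children])
print([(c.cls, c.value) for c in hytime_view.children])
```

Both views are derived from the same tree. Only the plan differs, which
is what lets the two consumers disagree about defaults without
disagreeing about the underlying model.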
> 
>> disaster.  So, yes, time for some simplifications.  Perhaps
>> understanding the way another standard tried to solve the
>> same problems is a clarifying experience.
>
>I don't think it's necessarily time for simplifications. What it is
>time for is stepping back and providing some basic definitional
>framework for all of the W3C specs. Ultimately, defining good
>abstractions simplifies the system by centralizing design and
>implementation effort and knowledge that would otherwise have to be
>replicated everywhere it was needed. Every XML-related specification
>needs a formal abstract data model for what XML (and possibly other
>stuff) is, and to date each specification has either defined its own
>(XLink/XPointer) or left it implicit (DOM). This makes the total much
>more complicated than it needs to be.
>
>The reason that the HyTime standard is so big is that it defines all of
>the infrastructure needed by the *relatively simple* HyTime
>architecture. That is, the linking and addressing parts of HyTime are
>relatively simple, taking no more than 80 pages to define (comparable to
>XLink, although bigger because it offers more facilities and more syntax
>choices). But to define the facilities of HyTime with something
>approaching mathematical precision, you need the following:
>
>1. A standardized, generic, abstract data representation facility
>(property sets and groves). You need this so that you can define
>addressing and other processing semantics without reference to any
>particular data type or implementation.
>
>2. A standardized, generic facility for mapping from any document to the
>syntax and semantics of the HyTime architecture (the Architectural Forms
>Definition Requirements (AFDR) annex). 
>
>3. A standardized definition of the abstract data model for SGML
>documents, defined in terms of item 1 (the SGML Property Set annex).
>
>4. A general architecture that defines those things the HyTime
>architecture needs that are not specifically related to linking and
>scheduling and that are useful to almost any SGML or XML application
>(the General Architecture).
>
>These are four of the six annexes that make up the "SGML Extended Facilities".
>The other two, Formal System Identifiers, and Lexical Type Definition
>Requirements, are not strictly required in order to define the HyTime
>architecture, but are useful nevertheless and represent long-standing
>requirements on SGML.
>
>Obviously, these should all be separate standards, published under
>separate cover, and referenced from HyTime, but for historical reasons,
>we did it backward and this is what we have. SC34/WG3 (the working group
>responsible for the HyTime standard) has discussed doing this breakup at
>some point in the future, but it's not being actively pursued at this
>time because the people who would do it (me) are too busy with other
>stuff just now.
>
>I observe that the W3C is making *exactly the same mistake* we made in
>not building the underlying necessary prerequisites first before trying
>to do things like define abstract models for XML documents and generic
>linking mechanisms. We have the excuse that we didn't know what we were
>doing because nobody had done it before. The W3C doesn't have that
>excuse.
>
>We are seeing all sorts of problems with the various specs that stem
>entirely from the lack of well-defined and agreed upon definitions for
>fundamentals. When we published the XML spec we said, as a working group
>"this spec is really not complete without a formal definition of the
>abstract representation of XML documents" but we knew we couldn't afford
>to delay the spec in order to do that. [Remember that two of the people
>chiefly responsible for the development of groves, James Clark and
>myself, were founding members of the XML Working Group.] Our expectation
>was that doing that definition would be the next order of business, in
>large part because we knew from experience that both XLink and XSL
>required it. Obviously, history took a different route and we are now
>left where we are, with the DOM defining an API over an unspecified data
>model, both XLink/XPointer and XSL defining their own hand-waved
>abstract models, info set still being worked on, and schemas only just
>now waking up to the fact that they need an abstract data model as well.
>
>EVERYONE INVOLVED SHOULD HAVE KNOWN BETTER. Many did. Lord knows I
>brought it up at every opportunity.
>
>I know that a lot of people got suckered into thinking that "XML is
>easy" and that therefore doing things like XLink and XSL and XML Schema
>will be easy too, in explicit contrast to HyTime, which is "hard". But
>of course that's a Big Lie. To the degree XML was "easy" (in the sense
>that it only took 18 months from the forming of the WG to publishing of
>the Rec) it was because it required no invention, it only required
>paring away those parts of SGML we really didn't need. That doesn't
>require any grasping of complex abstractions or layers of mapping or
>abstruse concepts like out-of-line links. But *everything after that
>does*. Defining and understanding the abstractions needed to build a
>complete system of standards is hard. It's hard intellectually, it's
>hard politically, it's hard socially, it's hard from a business
>perspective. A lot of people are simply not capable of working with or
>understanding abstractions. I've known many crackerjack programmers who
>could solve very difficult algorithmic and data structure problems who
>could never grasp groves (and not for lack of trying). With HyTime, we
>had the significant advantage that we were a fairly small group of
>people who worked very well together. We had no commercial interests to
>distract us. By contrast, the W3C is, by its nature, a set of very
>large groups of very diverse personalities and many competing commercial
>interests. It should be absolutely no surprise that progress has been as
>slow as it has been. In fact it's a surprise to me that any progress has
>been made at all. [Jon Bosak and the W3C leadership have done some
>admirable work in refining the W3C processes to try to work around some
>of these inherent problems.]
>
>I have no illusion that the world will some day wake up and embrace
>HyTime '97 as it exists today. It's big. It's complex, it's hard to
>grasp in many ways.  But I do fully expect that what we learned from
>doing HyTime will eventually influence whatever gets put into practice
>over the next 20 years. I do expect that people will realize that what's
>in HyTime *is there because it has to be there* and that if they want a
>system of standards that will serve them well for a long time (i.e.,
>more than 5 or 10 years) that they will need to build those sorts of
>things too. If they choose to borrow directly from HyTime, I will be
>very pleased, but if they invent their own stuff, that's ok too. We
>learn by doing. I certainly did.
>
>I doubt that much of the infrastructure defined by the HyTime standard
>will be used as is--it's too tied to SGML-specific ways of doing things;
>it reflects the best thinking of 1995, not 2005. But there is a lot of
>good stuff to be learned from what's there, a lot of valuable knowledge
>and mistakes that can be had for the low low price of actually reading
>the spec (and maybe asking some questions of gurus)
><ftp://ftp.ornl.gov/pub/sgml/wg8/document/n1920/html/>.
>
>Steve mentioned that I resigned from the W3C in protest. That's true. I
>also resigned because I had run out of patience trying to get people to
>understand the difficult technical issues that we had spent the last 10
>years coming to understand as we developed the HyTime and DSSSL specs. I
>decided to take my experience and put it to practical use solving
>immediate problems for clients who would not only listen to me but pay
>me too!
>
>Since that time I've been involved in several HyTime-based projects
>where we've put much of it into production and are continuing to do
>so. I've helped Steve's company get their GroveMinder[tm] product into
>production so we can use grove-based technology to solve people's
>problems. I tried to build a demonstration HyTime engine and gave it
>away (PHyLIS, www.phylis.com). Because I focus on large-scale systems,
>most of what is being developed in the W3C is not even relevant to what
>I'm doing at the moment, except to the degree that the information
>produced is eventually emitted in XML. But because I already have HyTime
>largely implemented, I don't have to wait for XLink to be finished or
>for someone to implement XML Schema. But I also realize that I am, at
>least today, fairly unique in this regard (although not as unique as you
>might think, given that you can buy GroveMinder--I know that, for
>example, ISOGEN's worthy competitor Calian has HyTime knowledge and
>experience that is almost comparable to ours [they don't employ any
>members of the HyTime editorial staff]). And there are people out there
>who don't or can't talk about what they're doing with HyTime, but they
>are out there.
>
>I fully expect that the W3C's efforts to develop XML-based
>specifications will generate important new insights into how to solve
>basic problems in information management and, eventually, provide more
>powerful tools than I have today and I look forward to that eagerly. But
>I certainly don't lose any sleep because I don't have them today. In the
>meantime, all I can see is a lot of very bright and hard working people
>working at an essentially futile task and spinning a lot of wheels, all
>because the fundamentals are being ignored. It's a shame, but so is
>world hunger and there's not much I can do about that either except to
>make what small contributions I can.
>
>Cheers,
>
>E.
> 

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ or CD-ROM/ISBN 981-02-3594-1
Please note: New list subscriptions and unsubscriptions
are  now ***CLOSED*** in preparation for list transfer to OASIS.