Inheritance in XML
Matthew Gertner
matthew at praxis.cz
Mon Apr 20 19:42:37 BST 1998
Paul,
Let me try to explain at least what I am envisioning as far as the putative
inheritance mechanism which I described is concerned. I am going to get
myself into trouble by saying this, but SGML was an attempt to avoid doing
5% of what is necessary (i.e. to do everything), and this led 10 years down
the line to the creation of XML. The same applies to HyTime. This doesn't
mean in any way, shape or form that either SGML or HyTime are not wonderful
things. They are absolutely amazing. But in many, many instances what is
needed is a simpler language/mechanism/etc. which only does 5% (I'd
rather bump this up to at least 50% personally...) but which is more
approachable. Isn't this what XML is all about?
A very uncharitable view (which I would certainly not endorse :-) would be
to say that you are setting the bar very high, and then rejecting ideas
which at least one person in this list thinks would be of great value
because they don't reach the level of functionality that you are imposing.
I don't get your point entirely in regards to subtyping and inheritance. Let
me explain how I understand the conventional OO usage of these terms.
Inheritance means that some thing gets some of its attributes/methods from a
base thing. Of course this doesn't necessarily make the derived thing a
subtype of the base thing: it may just inherit data members and not
interfaces. Subtyping means that some thing can be treated as another thing
in some cases because it shares at least one of that thing's interfaces. So
what I was describing is definitely inheritance and should reasonably be
considered subtyping as well (although we aren't actually talking about
interfaces).
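
A minimal Python sketch of the distinction drawn above. It is not taken
from the thread: the names Document, Invoice, Photo, Titled and
table_of_contents are made up for illustration. Invoice inherits data and
methods from Document and so can also be treated as one; Photo inherits
nothing but shares the title() interface, which is enough for code that
only relies on that interface.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Titled(Protocol):
    """Hypothetical interface: anything that can report a title."""

    def title(self) -> str: ...


class Document:
    """Base thing: one data member and one method."""

    def __init__(self, name: str) -> None:
        self.name = name

    def title(self) -> str:
        return self.name


class Invoice(Document):
    """Inheritance: gets `name` and `title()` from Document for free,
    and can be used wherever a Document is expected."""

    def __init__(self, name: str, total: float) -> None:
        super().__init__(name)
        self.total = total


class Photo:
    """No inheritance at all, but it shares the Titled interface."""

    def title(self) -> str:
        return "untitled photo"


def table_of_contents(items):
    """Relies only on the shared interface, not on any base class."""
    return [item.title() for item in items]
```

Here table_of_contents accepts all three classes, while
isinstance(Photo(), Document) is false: being usable through the interface
and inheriting from the base are separate questions.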
It seems to me that your objection to this boils down to the fact that you
also want to subtype without inheritance. I am approaching the problem from
a different standpoint. I appreciate your reaction to the extent that you
and many other people who know far more about these issues than I do have
given years of thought to them and not come up with an entirely satisfactory
solution. I started thinking about how to subtype content models (through
inheritance) in an entirely flexible way and gave up almost immediately.
It's a hard, hard problem. On the other hand, C++ has gotten away with
providing subtyping only through inheritance (you mentioned that it can but
I can't figure out how - please enlighten), and it's still a pretty useful
little language.
We have the advantage now of being at a new frontier: XML. There aren't many
standard XML DTDs to speak of, and certainly none that are built to exploit
subtyping through inheritance. However, if such a mechanism existed (and as
I say, it isn't rocket science), I truly believe that it would be quite
feasible to design small "component DTDs" which could be usefully extended
without needing to map element types or get into the guts of the content
model.
What I was implying in my example about CML was 3 things: 1) The processor
has access to the base DTD. 2) The processor has access to the derived DTD.
3) The processor knows about the inheritance (which is also a subtyping :-)
mechanism being used. This would enable it to get at the content of the base
element type without knowing what to do with the content of the derived
element type. This can't be done with cut and paste. There is some scope for
ambiguity here, but I can't think of any examples that do anything really
useful, so they could just be forbidden (i.e. sticking a (foo*) in front of
a content model that starts with its own (foo*)). In your example, you would
need to extend the processor to deal with images in titles, but at least it
wouldn't break older processors, which would still display the text of the
title.
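
A rough Python sketch of how an older processor might get at the base
content of a derived element, under the three assumptions listed above.
The DERIVED_FROM and BASE_CHILDREN tables and the as_base function are
hypothetical stand-ins for whatever a processor would actually read from
the base and derived DTDs; no such mechanism exists in XML 1.0.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical: derived element type -> base element type, as a processor
# might record it after reading the derived DTD.
DERIVED_FROM = {"MY-TITLE": "TITLE"}

# Hypothetical summary of the base DTD: which child element types each
# base element type allows. TITLE is (#PCDATA), so it allows none.
BASE_CHILDREN = {"TITLE": set()}


def as_base(element):
    """Return a copy of `element` viewed as its base element type.

    Children the base content model does not allow are dropped, but the
    character data that follows them is kept. Attributes are ignored to
    keep the sketch short.
    """
    base_tag = DERIVED_FROM.get(element.tag, element.tag)
    allowed = BASE_CHILDREN.get(base_tag, set())
    view = ET.Element(base_tag)
    view.text = element.text or ""
    last_kept = None
    for child in element:
        if child.tag in allowed:
            last_kept = copy.deepcopy(child)  # copies the child's tail too
            view.append(last_kept)
        elif last_kept is None:
            view.text += child.tail or ""
        else:
            last_kept.tail = (last_kept.tail or "") + (child.tail or "")
    return view


doc = ET.fromstring('<MY-TITLE>Inheritance <IMG src="toc.gif"/>in XML</MY-TITLE>')
print(as_base(doc).text)  # prints: Inheritance in XML
```

A table-of-contents generator written against TITLE could call as_base
first and carry on with plain text, which is the "still display the text
of the title" behaviour described above.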
So we really are talking about two different things. HyTime does a great job
with things like mapping element type names. It isn't going to die or go
away, and companies like Boeing and Bombardier who need that kind of
functionality and can afford to invest in it and climb the learning curve
are going to choose to use it. All that I'm saying is that analogous to the
way that XML tries to broaden the market for a lot of the great ideas in
SGML by simplifying it, we need a simple inheritance mechanism (that
implements subtyping) to be used with XML. Once again, this only makes sense
if DTDs are designed to take advantage of this mechanism and if there is
some central body for gathering these DTDs and their associated
documentation and ensuring that overlap doesn't occur. All I want is to be
able to scoot over to the DTD repository site, check for a standard
DTD for invoices, grab it, extend it with the two or three extra attributes
and/or contained element types that I need and use it, while still being
able to use any tools that are designed to work with the original invoice
DTD. I truly believe that this is where XML will really start to fulfill its
promise.
But then I may be crazy...
Matthew
-----Original Message-----
From: Paul Prescod <papresco at technologist.com>
To: xml-dev at ic.ac.uk <xml-dev at ic.ac.uk>
Date: Monday, April 20, 1998 3:33 PM
Subject: Re: Inheritance in XML
>Matthew Gertner wrote:
>>
>> * Terminology *
>>
>> I personally don't agree that there are carved-in-stone, well-understood
>> definitions for terms like "inheritance" and "subtyping" in XML.
>
>I don't think that anyone claimed that there is a well-understood
>definition for "inheritance" in any context -- even OO. But to be
>consistent with English, it must have something to do with "getting
>something for free." In the XML context the most obvious thing would be
>declarations.
>
>Subtyping is different. Subtyping comes straight from mathematics and is
>as old as logic (at least). A type defines a set of objects. A subtype
>describes a subset of those objects. Simple and precise.
>
>> Is
>> "subtyping" a better term? No, because it doesn't have the same resonance
>> as the word "inheritance" among non-programmer types.
>
>I don't know why you think that. Non-programmer types are likely to balk
>at either word, but at least subtyping is shorter, and can be precisely
>defined. Anyhow, it is not at all like the words are interchangeable. You
>can't pick and choose from words that already have meanings.
>
>> I'll make a first attempt:
>> "Inheritance in XML refers to the process of creating new element types
>> that duplicate the content model and attribute list of existing element
>> types (in the same or a separate "base" DTD), while extending these to
>> include additional attributes and/or content. As such, instances of the
>> new element types can be used wherever the base element type can be used,
>> and can be processed polymorphically by any external processor which
>> knows about the base element type."
>
>ACK! This definition was proven inadequate in the OO software world
>around a decade ago. Both C++ and Java allow subtyping without
>inheritance, and C++, Sather and Eiffel allow inheritance without
>subtyping (I suppose to get that in Java, you would have to use
>delegation). If we are going to borrow ideas from OO, then we should at
>least use the updated, modern ideas, not those that were accidentally
>confused in Simula 67 (and have been confused in programmers' minds ever
>since).
>
>The first major problem with your definition actually has nothing to do
>with the inheritance/subtyping conundrum. The biggest problem is that if
>you "extend" a content model, you are making a more flexible language,
>which *cannot* be processed polymorphically by an external processor
>which knows only about the base element type:
>
><!ELEMENT TITLE (#PCDATA)>
><!ELEMENT MY-TITLE (#PCDATA|IMG|FOO|BAR)>
>
>Now imagine software that generates a TOC from titles, presuming them to
>be strictly textual. What does it do with images in titles?
>
>Now let's talk about inheritance and subtyping. This is not a merely
>theoretical issue. It has important practical implications. The most
>interesting, important application of subtyping is allowing divergent
>evolution of compatible schemas. This is why architectural forms were
>invented. But for this to work, subtyping *must* be unhitched from
>inheritance.
>
>Suppose that Boeing has a content model:
>
><!ELEMENT AIRPLANE-DOC - - (FRONT, MIDDLE, REAR)>
>
>Bombardier has a similar model (after all, they are modelling the same
>thing):
>
><!ELEMENT AIRCRAFT-DOC - - (COCKPIT, STORAGE, TAIL)>
>
>How does inheritance help me to unify these models and validate that
>they are actually isomorphic? It doesn't. This is a job for subtyping. I
>can also come up with examples where inheritance is more useful without
>subtyping but you can always achieve this through other means (which is
>why Java does not support it).
>
>Inheritance is a code reuse mechanism, so you can always emulate it with
>cut and paste (or, parameter entities, or in a programming language with
>delegation). Subtyping is a type system extension. It is completely
>different.
>
>I can inherit stuff from my dad without becoming a dad. I can choose to
>be a dad without inheriting anything either from my dad, or the "class
>dad". They are different things.
>
>> * DTDs and schemata *
>>
>> Francois Chahuneau's article makes a very effective argument for why we
>> need to extend or replace DTD syntax (thanks Robin). XML-Data is a
>> reasonable attempt to do so, but it is understandably controversial
>> because it is such a radical departure from the existing syntax.
>
>I think that XML-Data should be controversial because from my reading it
>is just a mix and match combination of interesting features that people
>want in schemas without a coherent theory of how they should fit
>together. You can't just put 10 smart people into a working group and
>have them throw in their good ideas and expect a coherent result.
>XML-Data's inheritance mechanism does not take advantage of XML's nature
>as a sequence-oriented language for encoding documents. In other words,
>it doesn't solve the fundamental problem.
>
>> I quite like the idea of an
>> alternate, XML-based schema syntax, but the real lesson of XML-Data is
>> that creating an effective inheritance mechanism isn't rocket science.
>> All that is really needed is a keyword that says "this element type is
>> derived from that element type". Something like:
>>
>> <!element dog extends animal...
>
>Sure. This isn't rocket science. But it doesn't solve the fundamental
>problem at all. You haven't defined what happens to "BARK" sub-elements
>in "DOG". Without that definition, any software dealing with animals
>will croak on dogs. Which is exactly what subtyping was supposed to
>avoid....
>
>> More tricky than any of these technical issues is the question of what,
>> if anything, could be done to promote a mechanism of this sort. Obviously
>> this would require a change to the XML spec as well as modification to
>> all existing tools which process DTDs, so it's a pretty big deal. I
>> wonder if anyone besides me thinks that a simple mechanism like this
>> would make sense. If so, is there any room in the XML standards process
>> to discuss a change of this type at some point in the future (certainly
>> not for XML 1.0)?
>
>Personally, I have yet to see a decent proposal for inheritance and
>subtyping in SGML. Coming up with one is difficult, which is why I've
>spent the last year thinking about it. Dan Connolly has also spent
>several years thinking about it. I know that there are many others in
>the same boat. I think that we agree that it doesn't make sense to adopt
>a solution that solves only 5% of the problem, which is why you will see
>resistance to anything like that.
>
>We will know that we have a complete solution to the problem when HTML
>6.0 can be described as a subtype of HTML 5.0, and its behaviour in a
>"subtype aware" HTML 5.0 browser is predictable and well-defined.
>Further, HTML 6.0 must not just extend HTML 5.0 in trivial ways such as
>new <HEAD> tags. It must actually have new elements, with new content
>models mixed in at all levels. As I said, inheritance-at-the-end solves
>about 5% of this problem.
>
> Paul Prescod - http://itrc.uwaterloo.ca/~papresco
>
>"Journalism is good if you follow the rules. Don't allow the human
>rights groups to spoil your profession"
> - Col. Godwin Ugbo of the Nigerian military dictatorship
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)