Proposition: "SGML is Gumming Up the Works"

matt at matt at
Mon Sep 14 22:46:17 BST 1998

On Sat, 12 Sep 1998, Paul Prescod wrote:

> The hardest part of coming to a new domain is recognizing what parts of 
> what we know from other domains do NOT apply.
> On Fri, 11 Sep 1998, Mark Tucker wrote:
> >
> Saying that BNF is weaker than type systems is equivalent to saying that 
> hammers are weaker than screwdrivers. They are not comparable. One
> describes serialization syntax and the other describes a data model.
> If "type systems" could replace serializations, then we wouldn't need 
> XML, would we? We'd just use Java's type system.
> > So, we end up jumping through hoops to write DTD's to express DATA
> > which is very, very, very easily described in terms of modern
> > programming language type systems.  All the while, hearing a low chant:
> > "What kind of cretin are you? You don't want to *validate* your data! (shock)
> > You only want well-formed documents." -- NO and YES.  I don't care
> > if my document can be validated by a pitiful DTD.  I do care that 
> > it conform to a real type schema!
> "Bang. Bang. Bang. I think I bent my screwdriver." I hate to let you 
> down, but when you serialize your data model into XML, all you have is 
> characters. Characters have to be verified according to the techniques that 
> God and Chomsky provided for verifying character streams: regular 
> languages, context free grammars, regular tree grammars, etc.

You forgot to ask if he likes having his programs type-checked.  Ya gotta
lex and parse and build that AST before you can hope to type-check
something.  Different layers do different things.
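
To make the layering concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the toy language (integers, quoted strings, "+" and parentheses) and the names lex, parse, type_of, Lit and Add come from neither the thread nor any XML tool. Each layer rejects a different kind of mistake, and the type check cannot even start until the first two have produced a tree.

```python
import re
from dataclasses import dataclass

# Layer 1: lexing. Characters in, (kind, text) tokens out.
TOKEN_RE = re.compile(r'(?P<INT>\d+)|(?P<STR>"[^"]*")|(?P<OP>[+()])|(?P<WS>\s+)')

def lex(text):
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise SyntaxError(f"bad character {text[pos]!r} at offset {pos}")
        if m.lastgroup != "WS":          # whitespace is dropped here
            tokens.append((m.lastgroup, m.group()))
        pos = m.end()
    return tokens

# Layer 2: parsing. Tokens in, abstract syntax tree out.
@dataclass
class Lit:
    value: object

@dataclass
class Add:
    left: object
    right: object

def parse(tokens):
    pos = 0

    def atom():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        kind, text = tokens[pos]
        pos += 1
        if kind == "INT":
            return Lit(int(text))
        if kind == "STR":
            return Lit(text[1:-1])       # strip the surrounding quotes
        if text == "(":
            node = expr()
            if pos >= len(tokens) or tokens[pos][1] != ")":
                raise SyntaxError("expected ')'")
            pos += 1                     # the parens vanish from the tree
            return node
        raise SyntaxError(f"unexpected token {text!r}")

    def expr():
        nonlocal pos
        node = atom()
        while pos < len(tokens) and tokens[pos][1] == "+":
            pos += 1
            node = Add(node, atom())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing tokens after expression")
    return tree

# Layer 3: type checking. Works on the tree, never on raw characters.
def type_of(node):
    if isinstance(node, Lit):
        return type(node.value).__name__
    left, right = type_of(node.left), type_of(node.right)
    if left != right:
        raise TypeError(f"cannot add {left} and {right}")
    return left
```

In this sketch the input 1 + "a" gets through lex and parse without complaint, because it is syntactically fine, and only type_of objects to it. An input like 1 $ 2 never gets that far, because lex rejects it first.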

Not only do people not recognize what parts of what they know don't apply,
but they seem to forget what they learned as well.

> DTDs are much better than BNF. DTDs describe XML data. BNF describes a
> MUCH larger family of languages. If we were to use BNF, we would have to
> put constraints on the BNF that would make it almost identical to DTDs. 
> Here's the ironic part: you are right that it should be possible to use 
> the same element type name in multiple contexts as long as it isn't 
> ambiguous (as in C). I have a proposal for an extension to DTDs (or 
> schemas) that would allow that.
> The problem is that when you try to combine this advanced facility with 
> type system-based proposals (e.g. inheritance, subtyping, etc.) 
> everything goes to hell. The irony is that it is people who are screaming 
> for "types" instead of lexical constraints who are *weakening* the 
> lexical constraints that would make DTDs (or schemas) closer in power to BNF.
> Consider:
> <BODY>
>     a=<PAREN>B+1</PAREN>
> </BODY>
> What does it mean to "subclass" the PAREN element type when it is clearly 
> used in two different contexts with two different content models? The 
> answer: there is no PAREN type, really. There is a PAREN "tag" that can 
> be used in completely different ways in completely different contexts.

Why would anyone put a paren around args?  Args is already a grouping
construct - paren is redundant there.  In the second case, wouldn't you
rather use <EXPRESSION> than <PAREN>?  It always seemed to me that the
elements of the DTD should sit at least one level above lexing, but PAREN
is something the lexer does away with.  And doesn't it seem that ARGS and
EXPRESSION are subclasses of a parent grouping element?
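
As a rough sketch of what I mean by a parent grouping element, here it is in Python. The class names Group, Args and Expression and the to_xml method are hypothetical and invented for this example. They are not taken from any DTD or schema proposal, and the serializer does no escaping.

```python
from dataclasses import dataclass, field

@dataclass
class Group:
    """Hypothetical parent type: anything that groups child nodes."""
    children: list = field(default_factory=list)

    def tag(self):
        # The element name is derived from the class name.
        return type(self).__name__.upper()

    def to_xml(self):
        # Nested groups serialize themselves; anything else is plain text.
        inner = "".join(
            c.to_xml() if isinstance(c, Group) else str(c)
            for c in self.children
        )
        return f"<{self.tag()}>{inner}</{self.tag()}>"

class Args(Group):
    """Grouping of call arguments; no PAREN wrapper is needed."""

class Expression(Group):
    """Grouping of a subexpression, used in place of a PAREN tag."""
```

With these definitions, Expression(["B", "+", "1"]).to_xml() gives <EXPRESSION>B+1</EXPRESSION>. The grouping is carried by the element type itself, so nothing at the lexical level (a PAREN) has to survive into the document.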

> In my opinion, you must THROW OUT the notion of type to make progress on 
> this front. Of course, you can then re-introduce the notion of type at 
> some higher level. But I think that we should make this lexical level 
> powerful enough to do everything we need it to do before we move on to 
> the type level.

Are you calling for the resurrection of SHORTREFS?  Content models should
ideally address the abstract syntax tree.  Lexical constraints address
content.  If you want to cross them, you need something like SHORTREFS (or

Matthew Fuchs
matt at

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at