We do not need ampersand (was Re: XML-Data, "&" and inheritance)

Paul Prescod papresco at technologist.com
Tue Apr 28 22:24:31 BST 1998


Andrew Layman wrote:
> 
> Paul Prescod wrote "SGML defines both a language definition system and a
> (simple) type system."
> 
> You raise an issue that I'm not terribly familiar with: "Language system"
> vs. "Type system."  Could explain each of these terms and the distinction
> between them? Thanks.

It comes down to the semantics of the language. SGML allows you to define
element types. Thus there is some form of simple type system there.
Context-free grammars do not allow you to define types. BNFs do not allow
you to define types. Regular expressions do not allow you to define
types...and so forth. That's one of SGML's big differences -- it has types
and a simple type system. Those other things allow you to define
languages, but there are no implied or explicit semantics relating to
types. 

As I mentioned before, one reason (IMO) that SGML does not allow
context-sensitive content models (much) or context-sensitive attributes
(at all) is that a type is supposed to be one thing. All elements of a
type are supposed to share semantics. A linguistic view would treat
elements as just tokens that may or may not share semantics. More
precisely, a linguistic view would expect elements to share semantics when
they are used in the same context, but not necessarily when they are used
in another context.

Consider the grammar for C++. Do round brackets share semantics in that
language? Well, they are uniformly used to group things (duh!) but if you
ever try to write a C++ compiler (don't!!!) you will find that you do not
write code to handle "round brackets", because they are just syntactic
wrappers and their meaning is completely dependent on context. This is
more of a linguistic view.

In the type system view, you write one Java class per element type. In the
linguistic view you walk the parse tree, not expecting it to inherently
have the semantics you want, and translate into something more abstract
which you then unleash your Java classes on (which is how parsers often
work).
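
The two views can be contrasted in a minimal sketch. It is written in
Python rather than Java only for brevity, and every name in it (Node,
Title, Para, the chapter/figure/title/para elements) is a hypothetical
illustration, not something taken from SGML or XML-Data. In the first
half the element name alone selects a class; in the second half a walker
decides what a node means from where it appears.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A parse-tree node: an element name plus child nodes or text."""
    name: str
    children: list = field(default_factory=list)
    text: str = ""


# Type-system view: one handler class per element type. The element name
# alone selects the behaviour, wherever the element appears.
class Title:
    def render(self, node):
        return node.text.upper()


class Para:
    def render(self, node):
        return node.text


HANDLERS = {"title": Title(), "para": Para()}


def render_by_type(node):
    """Dispatch on element name; recurse into elements with no handler."""
    if node.name in HANDLERS:
        return [HANDLERS[node.name].render(node)]
    out = []
    for child in node.children:
        out.extend(render_by_type(child))
    return out


# Linguistic view: the walker decides what a node means from its context
# (here, just the name of its parent) and builds a more abstract result
# that later code can work on.
def interpret(node, parent=None):
    if node.name == "title":
        kind = "chapter-heading" if parent == "chapter" else "figure-caption"
        return [(kind, node.text)]
    if node.name == "para":
        return [("body", node.text)]
    out = []
    for child in node.children:
        out.extend(interpret(child, node.name))
    return out


doc = Node("chapter", [
    Node("title", text="Types"),
    Node("para", text="Some text."),
    Node("figure", [Node("title", text="A diagram")]),
])
```

With this sample tree, render_by_type(doc) treats both title elements
identically, while interpret(doc) labels one a chapter heading and the
other a figure caption.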

Of course there is a continuum between the two views of a document, which
is why SGML has successfully straddled the worlds for so long. Maybe it
can continue to and still advance. It seems, though, that we have
restrained its abilities as a language describer in order to not mess up
the type system (e.g. context sensitivity), and restrained its abilities
as a type system in order to not mess up the language (e.g. no subtyping).
We could continue to move forward with half solutions such as SGML's
"exceptions" that provide limited context sensitivity and XML-Data's
inheritance that provides limited subtyping, or we could try to separate
the layers completely.

That would imply, for instance, that instead of having element type
declarations, we would have productions (a semantic change) and
productions could use as much context sensitivity as they needed.
Attributes would not be tied to element types, but rather to contexts.
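
As a rough sketch of what "attributes tied to contexts" might mean, the
Python below keys the allowed attributes first by element name and then
by a (parent, element) pair. The tables, element names, and attribute
names are all hypothetical, and a parent/child pair is only the simplest
possible notion of context; a real design could use much richer ones.

```python
# Element-type view: allowed attributes depend only on the element name.
ATTRS_BY_TYPE = {
    "title": {"id"},
}

# Separated-layers view (hypothetical): allowed attributes depend on the
# context in which the element occurs, modelled here as (parent, element).
ATTRS_BY_CONTEXT = {
    ("chapter", "title"): {"id", "number"},
    ("figure", "title"): {"id"},
}


def allowed_by_type(element):
    """Attributes permitted on an element, regardless of where it occurs."""
    return ATTRS_BY_TYPE.get(element, set())


def allowed_by_context(parent, element):
    """Attributes permitted on an element inside a given parent."""
    return ATTRS_BY_CONTEXT.get((parent, element), set())
```

Here a title inside a chapter may carry a number attribute while a title
inside a figure may not, which the name-only table cannot express without
splitting title into two element types.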

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html
