Another look at namespaces

Simon St.Laurent simonstl at simonstl.com
Mon Sep 20 16:37:39 BST 1999


I think you're confusing my discussion of grammars you can build on top of
XML with XML's foundational grammar, which I earlier compared to an alphabet.

At 09:18 PM 9/20/99 +0800, James Tauber wrote:
>> Seems that it wouldn't be hard to create a formal
>> language that had classes of vocabulary (like noun, verb, adjective) and
>> fit them into patterns (subject[noun]-verb[verb]-object[noun]) that were
>> separate.
>
>This separation is merely partitioning the grammar into productions that
>take penultimate symbols to terminal symbols and all the other productions.
>
>Eg
>    [1] Sentence -> NP VP
>    [2] VP -> V NP
>    [3] NP -> Simon
>    [4] NP -> XML
>    [5] V -> likes
>
>What you are talking about is splitting productions 3-5 from 1-2. This is
>often done in natural language processing and many theories of (natural)
>language make a distinction between the lexicon and the syntactic rules. But
>we are talking about formal languages, not natural languages.

True, except that by taking this approach you can leave the grammar _open_
- allowing extension by adding new objects that are nouns, verbs, etc.  You
don't have to define the complete grammar in advance.  XML 1.0 only permits
a limited subset of this functionality through parameter entities, which
can be twisted into powerful tools.
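
A minimal sketch of the 'open' idea, assuming Python and the five
productions from James's example. The syntactic rules (1-2) stay fixed,
while the lexicon (3-5) lives in a separate table that can grow later. The
names RULES, lexicon, parse and accepts are illustrative only and do not
come from any XML tool.

```python
# Syntactic rules (productions 1-2 in the quoted example), kept fixed.
RULES = {
    "Sentence": ["NP", "VP"],
    "VP": ["V", "NP"],
}

# Lexicon (productions 3-5), kept separate so it can be extended later.
lexicon = {
    "Simon": "NP",
    "XML": "NP",
    "likes": "V",
}


def parse(symbol, words, pos=0):
    """Return the position after matching `symbol` at words[pos:], or None."""
    if symbol in RULES:
        # Non-terminal: match each child symbol in sequence.
        for child in RULES[symbol]:
            pos = parse(child, words, pos)
            if pos is None:
                return None
        return pos
    # Category symbol: look the next word up in the lexicon.
    if pos < len(words) and lexicon.get(words[pos]) == symbol:
        return pos + 1
    return None


def accepts(sentence):
    """True if the whole sentence matches the Sentence rule."""
    words = sentence.split()
    return parse("Sentence", words) == len(words)
```

With this split, a sentence using "SGML" is rejected until someone adds
lexicon["SGML"] = "NP", after which it is accepted without touching RULES.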

I think both natural and formal languages can be built to be 'extensible'
in a very broad sense of the term, with less constraint by formal grammar
than you appear to be proposing.

>I'm not sure I understand what you are saying here. When a user pieces
>together bits of different DTDs, they end up with a *single* DTD. This is a
>single grammar defining a single set of valid instances.

So multiple overlapping DTDs don't exist?  Even if, in fact, they may be
interoperable on multiple levels?  I also like Rick's example of
architectural forms, which permit validation of the same document against
multiple constraint sets.
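
A rough sketch of what checking one document against more than one
constraint set might look like, assuming Python's standard
xml.etree.ElementTree module. This is not an implementation of
architectural forms. The document and both checks are made up for
illustration, and neither check needs to know the other exists.

```python
import xml.etree.ElementTree as ET

# Hypothetical document, invented for this example.
DOC = "<order id='42'><item>widget</item><note>rush</note></order>"


def check_ordering(root):
    # Hypothetical constraint set 1: an order carries an id and at least one item.
    return (
        root.tag == "order"
        and "id" in root.attrib
        and root.find("item") is not None
    )


def check_annotations(root):
    # Hypothetical constraint set 2: every note element has non-empty text.
    return all((note.text or "").strip() for note in root.iter("note"))


root = ET.fromstring(DOC)
results = {
    "ordering": check_ordering(root),
    "annotations": check_annotations(root),
}
```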

>> Then there's the simpler case of well-formed documents, for which we can
>> _derive_ grammars, but can't make definitive statements above the level of
>> XML 1.0 conformance.
>
>Pardon? A grammar for well-formed documents doesn't need to be derived
>because it is in the XML 1.0 REC. It is a BNF augmented by WFCs and the odd
>bit of prose.

I'm discussing the grammar of the content built on top of XML, not the XML
itself. The content of a well-formed document is pretty nearly infinitely
extensible, without a grammar that describes content models, vocabulary,
etc.  This is where I think you're confusing my discussion of what you can
do on top of XML 1.0 with discussion of XML 1.0 itself.
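
To make the distinction concrete, here is a small sketch, again assuming
Python's standard xml.etree.ElementTree module. The parser enforces only
XML 1.0 well-formedness. Any vocabulary is accepted, and the most we can do
is derive the element names after the fact. The helper names and sample
documents are invented for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """True if the text parses as XML; no DTD or content model is consulted."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def vocabulary(xml_text):
    """Derive the element names actually used in a well-formed document."""
    root = ET.fromstring(xml_text)
    return sorted({element.tag for element in root.iter()})


memo = "<memo><to>James</to><body>Hi</body></memo>"
recipe = "<recipe><step>Mix</step><step>Bake</step></recipe>"
broken = "<a><b></a></b>"
```

Both memo and recipe pass is_well_formed despite sharing no vocabulary,
while broken fails because its tags are improperly nested.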

>It can. But formal languages are part of the picture because sometimes there
>are syntactic constraints. They might be loose, but they are still a
>grammar.

But is that grammar interesting any more, or is it like letters?  Something
you pay close attention to in elementary (grammar) school, and move beyond
as you build more things on top of them. 

>> It depends on what kind of 'formalizing' you want to do.  In many cases,
>> I'd suggest that we focus on 'relaxing', producing more flexible models
>> that aren't so concerned about locking everything down into a single
>> grammar and a single vocabulary.  It requires a change of mindset.
>
>A formal grammar is still a formal grammar even if it permits any of the
>terminal symbols in any order. A more flexible model is still a model. The
>moment you model the syntax, you have a formal grammar.

That's fine, as a foundation (XML 1.0), but do we want to be stuck with
this approach at higher levels?  'Modeling the syntax' does produce formal
grammars - but modeling that syntax doesn't automatically produce something
useful or necessary.

>> I don't think we're incompatibly far apart
>
>I actually agree with you completely in pretty much everything but
>terminology.

We seem to continue to disagree on the terminology, but agree that the
terminology is important.

>I think the XML community would generally agree that:
>
>1. certain classes of formal grammar are not sufficient for the syntactic
>constraints people wish to express
>2. syntax isn't all there is

Then we've probably reached the end of the discussion - I think we can
agree on this pretty reasonably.

>Linguists worked these out well before you and I were born, Simon :-)
>I think SGMLers did too which is one of the reasons that a Document Type
>Definition in SGML includes semantics as well as syntax (see another post
>where I follow on from Rick's comments relating to this)

SGML wasn't fond of leaving things open, however.  XML blasted open the
option of life without DTDs, giving us the possibility of neither semantics
nor syntax (at least on the level of content models and constraints on
vocabulary).

>As far as I can tell, no one is arguing that formal grammars are all we
>need. I am merely trying to clarify what formal grammars are so that people
>understand what is meant when someone says that a language has a grammar or
>that a DTD is a grammar.

I'm not sure that the clarification is actually useful, as I've stated
before. Thinking about formal grammars for XML seems to have a creeping
effect on the structures we propose for using XML.  Unless we can move
beyond the vision of a model for everything, I don't think we're going to
get real far.

Simon St.Laurent
XML: A Primer (2nd Ed - September)
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
