Semantics (was Re: Inheritance in XML [^*])

Paul Prescod papresco at technologist.com
Thu Apr 23 19:44:16 BST 1998


If XML had no semantics, then XSL, XLL and the DOM would have to
explicitly describe the mapping from syntactic features to the abstract
nodes that they work on. But they do not, because XML has semantic
concepts like "element", "element type", "notation" and "attribute" that
are *described by* the syntax. 

Here's what a language with no semantics looks like:

a -> b"q"c
a -> ca
b -> "d"
c -> "e"
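
To make that concrete, here is a minimal sketch of a recursive-descent parser for the grammar above. The function names (parse_a, parse) and the tuple shape of the tree are my own choices for illustration. The only thing the resulting tree can tell you is which production matched where; the labels "a", "b" and "c" carry no further meaning.

```python
def parse_a(s, i=0):
    """Parse nonterminal `a` starting at index i; return (tree, next_index)."""
    head = s[i:i + 1]
    if head == "d":
        # a -> b "q" c, with b -> "d" and c -> "e"
        if s[i + 1:i + 3] == "qe":
            return ("a", ("b", "d"), "q", ("c", "e")), i + 3
        raise ValueError(f"expected 'qe' at position {i + 1}")
    if head == "e":
        # a -> c a, with c -> "e"
        subtree, j = parse_a(s, i + 1)
        return ("a", ("c", "e"), subtree), j
    raise ValueError(f"unexpected input at position {i}")


def parse(s):
    """Parse a whole string; reject trailing input."""
    tree, j = parse_a(s)
    if j != len(s):
        raise ValueError(f"trailing input at position {j}")
    return tree


tree = parse("edqe")
print(tree)
```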

Even given a parse tree, you can't do anything interesting with this
language, because it has no semantics. But you can do lots of interesting
stuff with a "raw" XML parse tree, even if you do not know its DTD. For
instance you can build a DOM from it, apply a stylesheet to it, check its
validity, check its conformance to an XML-Data schema and so forth.
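
As a rough illustration of the DOM case, the sketch below uses Python's standard xml.dom.minidom to build a DOM from a document with no DTD and then walks it. The sample document and the describe helper are hypothetical, made up for this example. The point is that generic code can already ask for "elements", "element types" and "attributes" without knowing anything about the vocabulary.

```python
from xml.dom.minidom import parseString

# Hypothetical document; no DTD is supplied or consulted.
doc_text = '<memo priority="high"><to>Tim</to><body>See <em>spec</em>.</body></memo>'

dom = parseString(doc_text)


def describe(node, depth=0):
    """Return one indented line per element: its type name and attributes."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        attrs = dict(node.attributes.items())
        lines.append("  " * depth + f"{node.tagName} {attrs}")
        for child in node.childNodes:
            lines.extend(describe(child, depth + 1))
    return lines


outline = describe(dom.documentElement)
print("\n".join(outline))
```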

I think that what Tim and Jon mean to avoid is a battle royale over how
elements, attributes etc. fit into various ontological philosophies. I
don't think that that avoidance is useful, but I understand the
motivation. Nevertheless, I feel it is not accurate to claim that XML is
semantic-free. There are tons of semantics, both subtle ("element type")
and explicit ("initiate this network transaction in response to this
markup.")

Consider:

"validity constraint: A rule which applies to all valid XML documents.
Violations of validity constraints are errors; they must, at user option,
be reported by validating XML processors"

How can we tell a processor that it must trigger a *side-effect* with a
legitimate (but not valid) document, and then claim that we are not
describing semantics? There are other things like this in the XML spec:

"When an XML processor recognizes a reference to a parsed entity, in order
to validate the document, the processor must include its replacement text"
-- now we're initiating network transactions. That's a semantic.

"If there are no external markup declarations, the standalone document
declaration has no meaning." -- that would imply it already had meaning. I
don't believe that there is a distinction between "meaning" and
"semantics."

"If a non-validating parser does not include the replacement text, it must
inform the application that it recognized, but did not read, the entity."
-- a constraint on the interface between processors and applications.
That's a semantic.

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html
