Semantics

Fri Apr 24 14:19:32 BST 1998

At 03:14 24/04/98 -0400, Paul Prescod wrote:
[...]
>based on the described abstractions. For example, the DOM couldn't give
>two farts about the syntax of a document. It moves seamlessly between HTML
     ^^^^^
I assume this word is semantically void.

>syntax, XML syntax and could easily handle SGML (and probably VRML, or
>even PDF) syntax too. It cares about the abstract structure -- the tree of
>attributed-elements described by an XML document. If XML has no semantics,
>then how can it describe an abstract tree? If it doesn't describe a tree,
>then what the heck is the DOM based on?

There seem to me to be at least two types of 'semantics'. One is the
meaning attached to an element such as <FOO>, for which the XML community is
trying to work out mechanisms. These include:
	- stylesheets for adding carbon to cellulose for humans. (I hope this is
extended in the XSL process to include things other than paper and
pseudo-paper rendering, such as interactive processes).
	- mapping to algorithms (Java classes, ECMAScript, etc.)
	- linking to human- or machine-readable resources such as glossaries and
data dictionaries
	- architectural forms

I think we all agree on the need to develop communal mechanisms for this.

The other type of semantics is how the words and phrases in the XML specs
should be implemented in software or precise documents. The less clearly
defined the semantics are, the more variation is possible when humans try
to do this implementation. This is one reason why I have constantly raised
the problem of implied semantics in the spec and urged that we develop
software which implements them. 

Part of the reason why things like interfaces take so much effort is that
they involve semantics as well as many other aspects. Whilst I suspect this
has not been a major problem in SAX, it certainly is when we come to what
an 'XML processor' and an 'XML application' do. For example, I doubt if we
all agree on when a processor is required to validate a document, and what
is done with the result.  And I expect that we shall see an increasing
number of commandline switches or menu items in XML software to allow
humans to vary the semantics. 
For example, JUMBO2 reads and displays all whitespace. If there is no DTD
then no whitespace is 'ignorable'. If an element's content consists only of
whitespace then it's often 'obvious' to a human that this whitespace is not
required for humans. IMO it would be wrong to remove this automatically, so
I have a menu switch 'Delete Whitespace', which finds all such occurrences
and deletes them [either locally for selected nodes, or globally].
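
To make the point concrete, here is a minimal sketch of what such a
'Delete Whitespace' operation might look like. It is written in Python
against the standard xml.dom.minidom module and is not JUMBO2's own code;
the function names are hypothetical. It assumes the literal reading of the
rule above: an element is affected only if its entire content is whitespace
(space, tab, CR, LF), and mixed content is left alone.

```python
from xml.dom import minidom, Node

XML_WHITESPACE = " \t\r\n"  # the characters XML 1.0 treats as white space


def is_whitespace_only(element):
    """True if the element has content and all of it is whitespace-only text."""
    children = element.childNodes
    return len(children) > 0 and all(
        child.nodeType == Node.TEXT_NODE
        and child.data.strip(XML_WHITESPACE) == ""
        for child in children
    )


def delete_whitespace(root):
    """Empty every whitespace-only element at or below `root`.

    Pass a single element for a 'local' deletion, or the document element
    for a 'global' one. Returns the number of elements that were emptied.
    """
    emptied = 0
    stack = [root]
    while stack:
        node = stack.pop()
        if node.nodeType != Node.ELEMENT_NODE:
            continue
        if is_whitespace_only(node):
            for child in list(node.childNodes):
                node.removeChild(child)
            emptied += 1
        else:
            # Mixed or element content: leave its text alone, look deeper.
            stack.extend(node.childNodes)
    return emptied


doc = minidom.parseString("<doc><a>  \n </a><b> text </b><c> <d/> </c></doc>")
count = delete_whitespace(doc.documentElement)
print(count, doc.documentElement.toxml())
```

Here only <a> is emptied. The whitespace around <d/> inside <c> survives
because <c> also contains an element, which is exactly the kind of choice
that a different implementer might make differently.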

Wherever we can agree on and make available an algorithm that encodes our
semantics (or offers a choice between acceptable views), this is worth
highlighting and systematising; otherwise we risk descending into semantic
incompatibility.

	P.
Peter Murray-Rust, Director Virtual School of Molecular Sciences, domestic
net connection
VSMS http://www.nottingham.ac.uk/vsms, Virtual Hyperglossary
http://www.venus.co.uk/vhg

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/