Content roles in XML

W. Eliot Kimber eliot at isogen.com
Sun Jan 25 19:32:36 GMT 1998


Jim Amsden wrote:
> these methods play in the EventSet. I would like to say something
> like:
> 
> <!ELEMENT EventSet (Annotation*, addListenerMethod,
>                     removeListenerMethod, eventMethod+)>
> <!ATTLIST EventSet
>   %FeatureDescriptor;
> 
>   listenerType CDATA #REQUIRED
>   isInDefaultEventSet (true | false) "false"
>   isUnicast (true | false) "false"
> >
> 
> where addListenerMethod, removeListenerMethod, and eventMethod are all
> Method elements.

Using the standard SGML architecture mechanism ("architectural forms",
see "http://www.isogen.com/papers/archintro.html"), you can define a
"Method" superclass and then derive the three element types above from
it.

The basic process is this:

1. Define the superclass element type:

   <!ELEMENT Method ANY >

This is your "architectural form" (that is, an architectural element
type or "superclass").  A collection of such element types forms a
single "architecture".

Architecture definitions consist of two basic parts.  The first and most
important part is general documentation that describes the architecture
as a set of semantic objects, which may be done using any number of
existing formalisms for defining object classes and their properties
(including, but not limited to, prose).  This can be generically
referred to as the "schema" for the architecture: that is, the complete
set of rules, defined however you choose to define them.  The second
part is a set of SGML or XML element, attribute, and notation
declarations, which defines the rules for conformance to the
architecture that an SGML or XML parser can validate.  The declarations
that make up the architecture are physically separate from the documents
that use them: architecture declarations are used by reference, not by
incorporation as external DTD subsets are [in other words, the
architectural declarations are not a syntactic component of the
documents that point to them, while external DTD subsets are].

2. Assign a globally-unique name to the architecture so you can refer to
it formally.  An architecture is identified by a public identifier or
other form of URN as well as a short name to be used locally for mapping
to it [the short name need not be universally unique, but most
architectures choose conventional short names that are likely not to be
duplicated, e.g. "hytime"].  

In this example, you are defining the general classes that make up JAR
files, so you might call it something like "JAR-base-arch", with a
public ID like "+//IDN ibm.com//NOTATION JAR Base Architecture//EN" (I
don't know what an equivalent URN would look like, but such a URN would
work just as well; the key is that the name is 100% globally unique).

3. Use the architecture short name as the name of the attribute that
defines the specialization of elements from the base class:

<!ATTLIST addListenerMethod
   JAR-base-arch  NAME #FIXED "method"
>

Now addListenerMethod is clearly a method as "method" is defined by the
JAR-base-arch.
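
Applied to all three element types from Jim's content model, step 3
might look like the sketch below.  The two extra declarations are simply
copies of the one above and do not come from the original exchange.
They keep the SGML declared value NAME; XML DTDs have no NAME type, so
under XML you would write NMTOKEN (or CDATA) instead.

```xml
<!-- Sketch only: the same mapping attribute, repeated for each
     element type that specializes the "method" form. -->
<!ATTLIST addListenerMethod
   JAR-base-arch  NAME #FIXED "method"
>
<!ATTLIST removeListenerMethod
   JAR-base-arch  NAME #FIXED "method"
>
<!ATTLIST eventMethod
   JAR-base-arch  NAME #FIXED "method"
>
```

Because the value is #FIXED, instances never have to spell the attribute
out; the parser supplies it for every element of these three types.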

4. Declare the use of the JAR base architecture and provide the pointer
to its formal definition:

<?IS10744:arch name="JAR-base-arch"
   public-ID="+//IDN ibm.com//NOTATION JAR Base Architecture//EN"
>

Where the public ID identifies the documentation for the JAR
architecture (not the SGML or XML declarations for the types in it). 
This declaration registers the name "JAR-base-arch" as the local name
for the architecture, the name that will be used for the mapping
attribute used in step 3.

This is enough to establish the semantic relationship between the
specialized element types and the general class "method".  Any
JAR-base-arch-aware processor now has all the information it needs to
recognize the mapping and do the right thing.  This assumes that such a
processor has built-in knowledge of the rules of the architecture, which
is usually the case (e.g., just as Web browsers have built-in knowledge
of the rules for HTML).
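
As a rough illustration of what "recognizing the mapping" amounts to,
the sketch below shows a one-element document together with, in a
comment, the architectural view a JAR-base-arch-aware processor would
derive from it.  The processing instruction is closed with "?>" here
because the example is written as XML.

```xml
<?xml version="1.0"?>
<?IS10744:arch name="JAR-base-arch"
   public-ID="+//IDN ibm.com//NOTATION JAR Base Architecture//EN"
?>
<!-- Seen through the JAR-base-arch architecture, the element below
     is simply:   <method></method>
     The local element type name is irrelevant to that view; only the
     value of the mapping attribute counts. -->
<addListenerMethod JAR-base-arch="method"></addListenerMethod>
```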

If you want to enable XML or SGML validation in terms of the base
architecture, then you need to also point to the declarations for the
architectural DTD:

<?IS10744:arch name="JAR-base-arch"
   public-ID="+//IDN ibm.com//NOTATION JAR Base Architecture//EN"
   dtd-system-id="http://www.ibm.com/JAR/dtds/jar_arch.dtd"
>

Now architecture-aware processors (such as James Clark's SP) can
validate the document according to its mapping to the JAR base
architecture.  For example, it would tell you whether or not
addListenerMethod elements occur in the instance where the architectural
DTD says method elements are allowed.  Note that this is only validation
according to the rules of XML or SGML--full semantic validation
according to the full semantics of the architecture would require either
an architecture-specific processor or a specification of those rules in
some formalism other than SGML DTD syntax (thus the general requirement
for "schemas" in addition to normal DTD declarations).
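
To make that kind of check concrete, here is a small sketch.  The
"eventSet" form and its content model are assumptions invented for this
illustration (the only form defined so far is "method"), and the
contents of jar_arch.dtd shown in the comment are likewise hypothetical.

```xml
<?xml version="1.0"?>
<?IS10744:arch name="JAR-base-arch"
   public-ID="+//IDN ibm.com//NOTATION JAR Base Architecture//EN"
   dtd-system-id="http://www.ibm.com/JAR/dtds/jar_arch.dtd"
?>
<!-- Assumed (hypothetical) content of jar_arch.dtd:
       <!ELEMENT eventSet (method+) >
       <!ELEMENT method   ANY >
-->
<EventSet JAR-base-arch="eventSet">
  <!-- Each child maps to "method", so the architectural view is an
       eventSet containing three method elements. -->
  <addListenerMethod JAR-base-arch="method"></addListenerMethod>
  <removeListenerMethod JAR-base-arch="method"></removeListenerMethod>
  <eventMethod JAR-base-arch="method"></eventMethod>
</EventSet>
```

Under these assumed declarations the instance satisfies (method+).  An
EventSet with no method-derived children would be reported as invalid
with respect to the architecture, even though a DTD-less XML document
has no declarations of its own to violate.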

Here's how these declarations look when put together into a complete
document:

<?xml version="1.0"?>
<?IS10744:arch name="JAR-base-arch"
   public-ID="+//IDN ibm.com//NOTATION JAR Base Architecture//EN"
   dtd-system-id="http://www.ibm.com/JAR/dtds/jar_arch.dtd"
?>
<!DOCTYPE addListenerMethod [
 <!ELEMENT addListenerMethod ANY ><!-- whatever this should be -->
 <!-- NMTOKEN is used here because XML has no NAME attribute type -->
 <!ATTLIST addListenerMethod
   JAR-base-arch  NMTOKEN #FIXED "method"
>
]>
<addListenerMethod></addListenerMethod>

Or, using a DTD-less document (my general preference for XML):

<?xml version="1.0"?>
<?IS10744:arch name="JAR-base-arch"
   public-ID="+//IDN ibm.com//NOTATION JAR Base Architecture//EN"
   dtd-system-id="http://www.ibm.com/JAR/dtds/jar_arch.dtd"
?>
<addListenerMethod JAR-base-arch="method"></addListenerMethod>

The only thing the DOCTYPE declaration provides is the convenience of
default attribute values--it doesn't affect the interpretation of the
mapping.  This is cool because it means you can completely avoid
per-document declarations while still having the option of validating
against the architectural declarations, if provided.
In addition, the architectural declaration makes it clear what the
governing semantic definition(s) are.

Cheers,

Eliot
--
<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 95202.  214.953.0004
www.isogen.com
</Address>

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



