Entities in XSchema

Wed Jun 10 17:44:34 BST 1998

John Cowan wrote:
> 
> Paul Prescod wrote:
> 
> > SAX provides information on ID/IDREF attributes, but I don't think that
> > ESIS did, so this is more debatable.
> 
> But not all SAX-compliant parsers do: attribute type checking is a
> validity constraint only, so non-validating parsers are free to
> return CDATA as the type of all attributes.
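
John's point is easy to see in practice. Here is a minimal sketch, assuming Python's standard xml.sax module with its default expat-based (non-validating) reader; the names TypeCollector, DOC and reported_types are mine, purely for illustration. The document declares "id" as an ID attribute in its internal subset, and the handler records whatever type the parser reports for it.

```python
import xml.sax


class TypeCollector(xml.sax.ContentHandler):
    """Records the attribute type the parser reports for each attribute."""

    def __init__(self):
        super().__init__()
        self.types = {}

    def startElement(self, name, attrs):
        for attr_name in attrs.getNames():
            self.types[(name, attr_name)] = attrs.getType(attr_name)


# The internal subset declares fig/@id as an ID attribute.
DOC = b"""<!DOCTYPE doc [
<!ELEMENT doc (fig)>
<!ELEMENT fig EMPTY>
<!ATTLIST fig id ID #REQUIRED>
]>
<doc><fig id="f1"/></doc>"""


def reported_types(document):
    """Parse the document and return {(element, attribute): reported type}."""
    handler = TypeCollector()
    xml.sax.parseString(document, handler)
    return handler.types


print(reported_types(DOC))
```

With the standard reader I would expect the reported type to be CDATA even though the declaration says ID, which is exactly the freedom John describes. A validating parser could report something different.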

Doh! I didn't think of it before, but it doesn't really matter whether
ID/IDREF are passed through SAX. The question is whether we want our
schema language to be able to say that 

a) a particular attribute's values must be unique document-wide.
b) a particular attribute's value must refer to one of those globally
unique values.

I would give a qualified yes: Yes, we want to be able to say that
eventually. No, we don't have time to think it through completely now. For
instance, do we want a single ID namespace as XML/DTD has? Or do we want
one namespace per attribute? The latter seems more powerful to me! So a
"FIG" element could have both a FIGID and an ID. The FIGID might be
available only on "FIG" elements, and thus be a unique identifier within
that class of elements, without invading the namespace of "EXAMPLEs". Note
that I use the word namespace in a sense more or less unrelated to that of
the "namespaces" specification.
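
To make the "one namespace per attribute" idea concrete, here is a rough sketch of what a checker might do. It is only an illustration under my own assumptions: the function check_ids, the FIGREF attribute and the way the constraints are passed in are all hypothetical, not part of any XSchema proposal. Each identifying attribute gets its own set of seen values, so a FIGID and an ID may share a value without clashing, and each referring attribute is checked against one named set.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def check_ids(xml_text, id_attrs, ref_attrs):
    """Check uniqueness and references with one ID namespace per attribute.

    id_attrs:  attribute names whose values must be unique document-wide
               (hypothetical schema information).
    ref_attrs: maps a referring attribute name to the identifying
               attribute whose namespace it must point into.
    """
    root = ET.fromstring(xml_text)
    seen = defaultdict(set)  # one set of values per identifying attribute
    errors = []

    # First pass: collect identifiers and report duplicates.
    for elem in root.iter():
        for name in id_attrs:
            value = elem.get(name)
            if value is None:
                continue
            if value in seen[name]:
                errors.append(f"duplicate {name}={value!r}")
            seen[name].add(value)

    # Second pass: every reference must resolve within its own namespace.
    for elem in root.iter():
        for ref_name, target in ref_attrs.items():
            value = elem.get(ref_name)
            if value is not None and value not in seen[target]:
                errors.append(f"dangling {ref_name}={value!r} (no {target})")

    return errors


doc = (
    '<DOC><FIG FIGID="a" ID="a"/><EXAMPLE ID="b"/>'
    '<XREF FIGREF="a"/><XREF FIGREF="b"/></DOC>'
)
print(check_ids(doc, ["FIGID", "ID"], {"FIGREF": "FIGID"}))
```

Here FIGID="a" and ID="a" coexist happily, while FIGREF="b" is flagged because "b" exists only in the ID namespace, not the FIGID one. Two passes are used so that forward references work.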

IDREF should also be more powerful. It should allow full XPointers (though
an XSchema processor might only check local ones).

In other words, I say leave linking (ID/IDREF) out for now, but plan great
things for them in the future.

ENTITY and NOTATION seem more or less harmless. There is no obvious way to
make them more powerful, and they are inherently tied to information that
SHOULD BE available to an application of XML anyhow. After all, binary
entity and notation declarations are completely useless if the application
doesn't have access to them. Since they are fairly simple and in the realm
of application-level information, I think that we should support them.

> As one who is fairly SGML-ignorant, I would like to know:  Just what
> is in the ESIS?  (If you will, limit the answer to things XML also
> has.)

I think that the best summary is here:

http://www.jclark.com/sp/sgmlsout.htm

Things that must be turned on with a -o option are NON-ESIS. For example:

Aname val
    The next element to start has an attribute name with value val which
    takes one of the following forms:

    IMPLIED
        The value of the attribute is implied.

    CDATA data
        The attribute is character data. This is used for attributes
        whose declared value is CDATA.

    NOTATION nname
        The attribute is a notation name; nname will have been defined
        using a N command. This is used for attributes whose declared
        value is NOTATION.

    ENTITY name...
        The attribute is a list of general entity names. Each entity
        name will have been defined using an I, E or S command. This is
        used for attributes whose declared value is ENTITY or ENTITIES.

    TOKEN token...
        The attribute is a list of tokens. This is used for attributes
        whose declared value is anything else.

    ID token
        The attribute is an ID value. This will be output only if the
        -oid option is specified. Otherwise TOKEN will be used for ID
        values.

The last line indicates that ID differentiation is non-ESIS.

Note that ENTITY and NOTATION are supported, as in SAX.
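
For anyone who wants to poke at this output, here is a minimal sketch of reading one attribute line in the format quoted above. It assumes only what the excerpt says (an "A", the attribute name, then one of the listed forms); the function name parse_attribute_line is my own, and it does not decode any escape sequences that may appear in CDATA values.

```python
def parse_attribute_line(line):
    """Split an 'Aname val' line into (name, kind, value).

    value is None for IMPLIED, a list of names for ENTITY and TOKEN,
    and a plain string for CDATA, NOTATION and ID.
    Escape sequences inside CDATA values are left untouched.
    """
    line = line.rstrip("\n")
    if not line.startswith("A"):
        raise ValueError("not an attribute line: %r" % line)

    # The attribute name follows the 'A' and runs to the first space.
    name, _, rest = line[1:].partition(" ")
    # The next word says which form the value takes.
    kind, _, data = rest.partition(" ")

    if kind == "IMPLIED":
        return (name, kind, None)
    if kind in ("ENTITY", "TOKEN"):
        return (name, kind, data.split())
    if kind in ("CDATA", "NOTATION", "ID"):
        return (name, kind, data)
    raise ValueError("unknown attribute form: %r" % kind)


print(parse_attribute_line("AFIGID ID fig1"))
print(parse_attribute_line("ASRC ENTITY pic1 pic2"))
```

Note that an "ID" form shows up only when the -oid option is on; without it the same attribute arrives as TOKEN, so a consumer that cares about IDs cannot rely on plain ESIS output.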

 Paul Prescod  - http://itrc.uwaterloo.ca/~papresco

Three things are most perilous: Connectors that corrode
Unproven algorithms, and self-modifying code
http://www.geezjan.org/humor/computers/threes.html

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/