XML API specification

Richard Light richard at light.demon.co.uk
Thu Feb 27 14:38:38 GMT 1997


In message <4034 at ursus.demon.co.uk>, Peter Murray-Rust
<Peter at ursus.demon.co.uk> writes
>Richard,
>       This is very helpful.  It is probably patently apparent that I
>am struggling with groves, grove plans and so on.  I am also struggling
>with the SGMLPropertySet.  It is clear to me that a very high priority
>for people like me is a 'Gentle Introduction to Groves and
>PropertySets'.
>(Even if I could manage all of 10179 it's difficult to tell what is
>_important_ and what isn't :-).  I notice that the property set is
>itself ISO:17044 - does that mean it's an add-on to 8879?)

Hands up anyone who _isn't_ struggling with this stuff!  I know it makes
my brain hurt ;-)

All it's saying in DSSSL is that the _full_ definition of a property set 
is to be found in the HyTime standard (10744) - Section 6.7.  One 
potentially relevant thing that I don't know about is the HyTime 
Technical Corrigendum, which I think re-defines the SGML Property Set so 
that HyTime, DSSSL, SGML etc. will be all in line with each other.  And 
it's still being finalised.

>My very crude vision of groves is that these are a set of
>tree-structured views of the ESIS/XML tree, with properties being added
>at nodes for various purposes that I do not yet understand.  Am I on
>the right lines? :-)

I think so.  Anyway, as one pragmatist to another ...

My understanding of groves is that they take the tree structure idea one
step further than we are used to in dealing with SGML documents, and
thereby make everything simpler, but more verbose.

We are used to thinking of an SGML document as a tree structure of
elements, each with lots of miscellaneous additional properties 'hanging
off the side'.  The grove idea says "let's take this additional stuff,
and see that as part of the tree as well".  So an element node, for
example, now has a subnode containing its GI, and one subnode for each
[non-implied?] attribute.  Each of these attribute subnodes will in turn
have subsubnodes containing e.g. the attribute name and value.
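
To make that concrete, here is a rough Python sketch of an element node whose GI and attributes are part of the tree rather than hanging off the side.  The Node class, the element_node helper and the property keys "gi" and "atts" are my own illustrative names, not anything taken from the standard, and the attribute value is simplified to a plain string where a real grove would hold a list of nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Illustrative grove node: a class name plus a dict of named properties."""
    cls: str
    props: dict = field(default_factory=dict)


def element_node(gi, attributes):
    """Build an element node; each attribute becomes its own subnode."""
    att_nodes = [
        # Each attribute assignment carries its name and value as properties.
        Node("attasgn", {"name": name, "value": value})
        for name, value in attributes.items()
    ]
    return Node("element", {"gi": gi, "atts": att_nodes})


# Hypothetical element, roughly <MOLECULE ID="m1" TITLE="water">
el = element_node("MOLECULE", {"ID": "m1", "TITLE": "water"})
names = [a.props["name"] for a in el.props["atts"]]
```

Walking el.props["atts"] then gives you the attributes in the same way you would walk any other part of the tree, which is where the "simpler, but more verbose" trade-off comes from.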

The whole SGML Property Set is defined (naturally enough) as an SGML
document.  What you get is the definition of a number of groups, or
'property set modules' (<psmodule> elements).  Each contains the
definition of classes (<classdef> elements), and each class has zero
or more properties (<propdef> elements).  The <classdef>s must actually
_contain_ the <propdef>s that follow them, although there are no
</classdef> tags to make this explicit.  (Roll on XML, I say!)
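
The implied containment can be sketched as a small grouping step: each <propdef> belongs to the most recent <classdef>, and each <classdef> to the most recent <psmodule>.  This is only a sketch in Python over a flat list of (tag, name) pairs; the group_propdefs function and the record layout are made up for illustration, and placing attasgn in baseabs is an assumption for the sake of the example.

```python
def group_propdefs(records):
    """Nest a flat sequence of (tag, name) pairs into modules/classes/properties."""
    modules = []
    for kind, name in records:
        if kind == "psmodule":
            modules.append({"name": name, "classes": []})
        elif kind == "classdef":
            # A classdef is contained by the most recent psmodule.
            modules[-1]["classes"].append({"name": name, "props": []})
        elif kind == "propdef":
            # A propdef is contained by the most recent classdef.
            modules[-1]["classes"][-1]["props"].append(name)
    return modules


# Assumed input: the tags in document order, with no end tags recorded.
records = [
    ("psmodule", "baseabs"),
    ("classdef", "attasgn"),
    ("propdef", "value"),
    ("propdef", "name"),
]
modules = group_propdefs(records)
```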

So, to plunge right into 9.6 and take an example:

    <classdef rcsnm=attasgn appnm="attribute assignment"
    conprop=value dsepprop=tokensep clause="79002">
    <desc>
    An attribute assignment, whether specified or defaulted.
    <note>
    In the base module because of data attributes.

declares the class "attasgn" (full name "attribute assignment").
Below this:

    <propdef subnode rcsnm=value datatype=nodelist
    ac="attvaltk datachar sdata intignch entstart entend"
    clause="79401">
    <note>
    If the attribute value is tokenized, the children are of type
    attvaltk; otherwise, they are of the other allowed types.
    <when>
    The attribute is not an impliable attribute for which there is no
    attribute specification.

declares the first property of an attribute assignment, which is its
"value", and:

    <propdef rcsnm=name datatype=string strlex=name strnorm=general
    clause="93001">

declares the second property of an attribute assignment, which is its
"name".

As well as giving names to all these things (so we don't have to make
them up), this definition includes other potentially useful
information.  (In fact there are usually two names: a short 'Reference
Concrete Syntax' name and a longer application name, which was
specifically designed for use "in a programming or scripting
language".)  The DATATYPE attribute states what type of data the
property contains (string, node list, integer, etc.).  The AC attribute
says what types of subnodes the property is allowed to have (= a
'content model' for property nodes!).  STRNORM declares whether the
value is to be normalized.  And so on.
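
As a sketch of how a program might hold this information, the two <propdef>s above can be written as plain records, with the AC list used as a 'content model' check on subnodes.  The PropDef class and the allowed_children function are hypothetical names of my own; only the attribute values are copied from the excerpts above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropDef:
    """Illustrative record for one <propdef>; field names mirror its attributes."""
    rcsnm: str
    datatype: str
    ac: tuple = ()     # classes allowed as subnodes (empty if not a node property)
    strnorm: str = ""  # normalization rule for string values, if declared


VALUE = PropDef(
    rcsnm="value",
    datatype="nodelist",
    ac=("attvaltk", "datachar", "sdata", "intignch", "entstart", "entend"),
)
NAME = PropDef(rcsnm="name", datatype="string", strnorm="general")


def allowed_children(propdef, child_classes):
    """Return True if every child class is listed in the property's AC."""
    return all(c in propdef.ac for c in child_classes)


ok = allowed_children(VALUE, ["datachar", "sdata"])
bad = allowed_children(VALUE, ["element"])
```

Here ok is True because both classes appear in the AC list for "value", and bad is False because "element" does not.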

>My current understanding (PLEASE CORRECT THIS ANYONE!) is that a
>normalised WF XML document is isomorphic to its ESISStream.  IOW having
>read the XML document, applied any conditional clauses, substituted any
>entities, the XML document contents can be automatically converted to
>ESIS and vice versa.  [The reason this matters is that until recently I
>have been using sgmls to parse CML and input the ESIS into my tools.  I
>also have a crude XML-like parser :-).]

I can't answer that, but can point out that DSSSL specifies three
property set modules (baseabs, prlgabs0 and instabs) which together
'roughly' correspond to ESIS.  These would be the bits of the SGML
Property Set to examine first.

>> I'm not suggesting that the structures and their properties should be
>> expressed as in the DSSSL standard (Section 9.6 - horrendous!), but if
>
>Ah... I am looking at the section.  This was a bit I hoped was
>'unimportant'.  The XML-WG has been debating whether concepts from
>standards outside XML can be used without being explicitly in the XML
>spec.  I would hate to think that XML implicitly involved 10179:9.6.
>I can accept that it may/will come into PhaseIII.

My take on this is that you start from the XML spec, and find the
corresponding bits of 9.6 to give you a standard nomenclature.

Richard.

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



