Xapi-J: an architectural detail

Mon Aug 4 18:54:11 BST 1997

Richard Light wrote:

> In message <33E502A7.80A66A68 at datachannel.com>, John Tigue
> <john at datachannel.com> writes
> >
> >An XML document can be represented as a tree. In an XML document object
> >model there are things which are containers (e.g. a document is a
> >container and so is an element) and also things which are the content of
> >a container (e.g. a chunk of text is a content or even an element can be,
> >in the case of one element within another). To model these there are the
> >IContainer and IContent interfaces. The full source follows:
> >
> >public interface IContainer
> >     {
> >     public Enumeration getContents();
> >     public void insertContent( IContent aContent, IContent preceedingContent );
> >     public void appendContent( IContent aContent );
> >     public void removeContent( IContent aContent );
> >     }
> >
> >public interface IContent
> >     {
> >     public void setParent( IContainer aContainer );
> >     public IContainer getParent();
> >     public String getData();
> >     }
>
> These interfaces are mirrored in the SGML/XML Property Set.  In that,
> everything is a 'node', each with its own name and a set of properties.
> One of those properties is 'subnode' - having a subnode property makes a
> node, de facto, into a Container in your terminology.  The complete XML
> document can be represented as a 'grove' (tree structure) of these nodes.
>

I agree that grove is the way to go. I'm just trying to get all the
current processors on the same track before we move towards the grove
work.

> The parent-child relationship between elements of the XML document is
> more specific than this.  The full grove includes things like the DTD
> and processing instructions, which are nodes in the grove structure but
> do not exhibit 'parent-child' relationships to anything else.
>

How will we represent the DTD in order to reflect the effects of the
Bray Namespace Proposal?

> Nodes have some 'intrinsic properties', which apply whatever their
> particular type might be.  (Again, this mirrors your thinking very
> closely.)  These intrinsic properties are:
>
> object Node
>   property ClassNm  ; the name of the node's class
>   property GrovRoot ; the root of the grove of which the node forms a part
>   property SubPNs   ; the names of all the subnode properties exhibited
>                     ; by the node
>   property AllPNs   ; the names of all the properties exhibited by the node
>   property ChildPN  ; the name of the children property, when this class
>                     ; of node has children
>   property DataPN   ; the data property name (i.e. 'char' or 'string'),
>                     ; when this class of node contains data
>   property DSepPN   ; the data separator property name
>   property Parent   ; the node's parent
>   property TreeRoot ; the root of the parent-children tree [not the same
>                     ; as the 'grove root']
>   property Origin   ; the node that has this node as one of its subnode
>                     ; properties
>   property OTSRelPN ; the origin-to-subnode relationship property name
>
> I've given the full set of intrinsic node properties, really just to
> point out that all of this modeling has already been done before. Much
> of it is too detailed (and perhaps one level too abstract) to apply to
> Xapi-J.  However, I'm concerned that Xapi-J developers shouldn't just
> ignore the SGML property set and invent their own version.
>

Ignoring the SGML property set would be just plain stupid. I like to
drive cars not re-invent wheels.

> Expressing the only intrinsic property (parent) that is relevant to this
> discussion leads to:
>
> public interface XMLnode
>         {
>         public XMLnode parent();
>         }
>
> We could add in a couple of extra intrinsic properties, so you can get
> to the grove root and its origin from any node:
>
> public interface XMLnode
>         {
>         public XMLnode parent();
>         public XMLnode grovroot();
>         public XMLnode origin();
>         }
>

I absolutely agree that the Xapi-J interfaces are not done. I have tried
to bring the current processors together while mapping out the basics of
the object model. We will need to add more properties, as you point out.
One thing I would like to see is that we return appropriate objects as
much as possible. One particular processor out there has a
getAttribute() where you pass in a String and get back a String. I think
an IAttribute should be returned instead. That way other convenience
methods of the returned class can be used: for example, something like
isPercent() or isNumeric() for an attribute, not to mention all the
properties of, say, a character.
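
To make the point about returning an object rather than a String concrete,
here is a rough sketch. Only the name IAttribute comes from this thread;
getName(), getValue(), the exact meaning of isNumeric() and isPercent(), and
the SimpleAttribute class are all assumptions made up for illustration, not
part of any agreed Xapi-J interface.

```java
// Hypothetical sketch: only the name IAttribute appears in the thread.
// Every method here, and the SimpleAttribute class, is an assumption.
interface IAttribute {
    String getName();
    String getValue();
    boolean isNumeric();
    boolean isPercent();
}

final class SimpleAttribute implements IAttribute {
    // Optional sign, digits, optional fractional part.
    private static final String NUMBER = "[+-]?\\d+(\\.\\d+)?";

    private final String name;
    private final String value;

    SimpleAttribute(String name, String value) {
        this.name = name;
        this.value = value;
    }

    public String getName() { return name; }

    public String getValue() { return value; }

    // True when the whole trimmed value is a plain decimal number.
    public boolean isNumeric() {
        return value.trim().matches(NUMBER);
    }

    // True when the trimmed value is a plain decimal number followed by '%'.
    public boolean isPercent() {
        return value.trim().matches(NUMBER + "%");
    }
}
```

A caller that gets an IAttribute back from getAttribute() can then ask
questions of it directly instead of re-parsing a raw String each time.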

> I don't think we need separate IContainer and IContent interfaces -
> what's wrong with just INode (or XMLnode, as I have it)?
>

We could do that. Or maybe both with something like the following:

public interface XMLNode extends IContainer, IContent
    {
    ...
    }

I went with IContainer and IContent because I can do more precise
polymorphic message handling: the receiving method can make more
assumptions about what the passed object can do without casting to the
exact class. A cast in Java is checked at run time (because of
security), so it is more expensive.
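
A small sketch of what I mean by handling objects through the two
interfaces. The interfaces are the ones posted earlier, except that I have
added a type parameter to Enumeration. TextChunk, SimpleElement and TreeText
are hypothetical classes invented for this example, and the "insert
immediately after the preceding content" behaviour is my assumption about
what insertContent() should do.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// The two interfaces as posted, with a type parameter added to Enumeration.
interface IContainer {
    Enumeration<IContent> getContents();
    void insertContent(IContent aContent, IContent preceedingContent);
    void appendContent(IContent aContent);
    void removeContent(IContent aContent);
}

interface IContent {
    void setParent(IContainer aContainer);
    IContainer getParent();
    String getData();
}

// Hypothetical leaf: a chunk of text.
class TextChunk implements IContent {
    private final String text;
    private IContainer parent;

    TextChunk(String text) { this.text = text; }

    public void setParent(IContainer aContainer) { parent = aContainer; }
    public IContainer getParent() { return parent; }
    public String getData() { return text; }
}

// Hypothetical node that is both a container and a content.
class SimpleElement implements IContainer, IContent {
    private final List<IContent> children = new ArrayList<>();
    private IContainer parent;

    public Enumeration<IContent> getContents() {
        return Collections.enumeration(children);
    }

    // Assumption: the new content goes immediately after preceedingContent.
    // If preceedingContent is not a child, the new content goes first.
    public void insertContent(IContent aContent, IContent preceedingContent) {
        children.add(children.indexOf(preceedingContent) + 1, aContent);
        aContent.setParent(this);
    }

    public void appendContent(IContent aContent) {
        children.add(aContent);
        aContent.setParent(this);
    }

    public void removeContent(IContent aContent) {
        if (children.remove(aContent)) {
            aContent.setParent(null);
        }
    }

    public void setParent(IContainer aContainer) { parent = aContainer; }
    public IContainer getParent() { return parent; }

    // An element's data is the concatenated data of its contents.
    public String getData() { return TreeText.textOf(this); }
}

class TreeText {
    // Accepts any IContainer and never casts to a concrete class.
    static String textOf(IContainer container) {
        StringBuilder sb = new StringBuilder();
        Enumeration<IContent> e = container.getContents();
        while (e.hasMoreElements()) {
            sb.append(e.nextElement().getData());
        }
        return sb.toString();
    }
}
```

textOf() only knows it has an IContainer whose contents are IContent
objects; whether a given content is text or a nested element is resolved by
ordinary method dispatch on getData().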

> >These interfaces only express the methods for navigating a tree. A
> >particular class of objects would need to have some more methods to be
> >interesting. For example, the interface for an element is IElement. The
> >full source follows:
> >
> >public interface IElement extends IContent, IContainer
> >    {
> >     public String getType();
> >     public void setType( String aType );
> >     public void addAttribute( String name, String value );
> >     public void removeAttribute( String name );
> >     public IAttribute getAttribute( String attributeName );
> >     public java.util.Enumeration getAttributes();
> >     }
> >
> >The above states that an IElement can be a container and/or a content
> >and also has some other methods particular to being an element. So
> >although IElement does not directly have a method called getContents(),
> >it gets the method from its superinterface IContainer.
>
> We can do the same thing here:
>
> public interface XMLelement extends XMLnode
>         {
>         public String gi();
>         public void setType( String aType );
>         public void addAttribute( String name, String value );
>         public void removeAttribute( String name );
>         public XMLattribute getAttribute( String attributeName );
>         public XMLattlist atts();
>         }
>

I generally agree. I went for getGI() and setGI() at one point but the
spec forced getType() and setType(). Plus I believe that the work we
produce here will filter down to folks who are far less preoccupied
with XML. For them the term "generic identifier" or even "gi" would be
less readily grasped than "type". Either way, by following the get/set
naming convention we map to JavaBeans. getX() and setX() are slightly
wordier than X() and setX(), but the builder tools are geared for
recognising getX() and setX().

> Notice that I've left the middle four declarations more or less
> unchanged, for the following reason:
>
> There is definitely a useful distinction here, between those things
> which are _properties_ of a node within an XML document, like the GI of
> an element or its list of declared attributes, and _operations_ which
> the API lets you carry out on that node.
>
> The SGML/XML property set is entirely about the properties of an
> existing instance.  It provides no framework or precedent for API
> commands which _alter_ that instance, like SetType (which assigns or
> changes the GI of an element).  There, we are rather more on our own!
>

At first setType() might seem less than useful. And perhaps type should
be a parameter to the constructor and not modifiable (more on that
later). I got caught in a Java-specific detail related to the following:

    Class.forName("SomeClass").newInstance()

With this code Java objects can be instantiated from a String holding
the class name. That's handy for object serialization, among other
things; for example, say you had a repository of classes for specific
element types and you want to instantiate one during a parse. The point
is that in Java newInstance() only works with the default constructor;
parameters cannot be passed in. So there is a need for a separate method
for setting the type of the element. If we wanted to make the type
immutable then perhaps we could specify that the member field "type" can
only be set once. This kind of behavior shows up a lot in the JDK:
inside the property setter the field is checked for null, and if it is
already set an exception is thrown. Also in the JDK we see String and
StringBuffer, where String is immutable and StringBuffer is where
strings can be dynamically built up. Perhaps something like that for
Xapi-J.
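
A minimal sketch of the set-once idea combined with instantiation by class
name. SetOnceElement and ElementFactory are hypothetical names, and the
choice of IllegalStateException for a second call is my assumption. The
sketch goes through getDeclaredConstructor().newInstance(), which is what
later JDKs recommend in place of calling Class.newInstance() directly; the
constraint is the same either way, since only the no-argument constructor is
reachable like this.

```java
// Hypothetical element whose type can be assigned exactly once.
class SetOnceElement {
    private String type;

    // A no-argument constructor is required for creation by class name.
    public SetOnceElement() {
    }

    public String getType() {
        return type;
    }

    // Accepts the first non-null value and rejects any later call.
    public void setType(String aType) {
        if (aType == null) {
            throw new IllegalArgumentException("type must not be null");
        }
        if (type != null) {
            throw new IllegalStateException("type already set to " + type);
        }
        type = aType;
    }
}

class ElementFactory {
    // Creates an element from a class name, then sets its type separately,
    // because no arguments can be handed to the no-argument constructor.
    static SetOnceElement create(String className, String type) {
        try {
            Object instance =
                Class.forName(className).getDeclaredConstructor().newInstance();
            SetOnceElement element = (SetOnceElement) instance;
            element.setType(type);
            return element;
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot instantiate " + className, e);
        }
    }
}
```

A parser could call ElementFactory.create() with a class name looked up from
a repository of element classes; after that, the element's type behaves as
if it had been fixed at construction time.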

> I'm not sure if the Java API provides for a more elegant way of
> specifying a property than the one I've dreamt up - if it does, we
> should use it.
>

The only point I'm sure of is the getX() and setX() "design pattern".
Most Java developers casually consuming XML will use a JavaBean and we
should plan for that architecture.

> Hope this helps.
>

Definitely. Thanks.

> Richard Light
> SGML and Museum Information Consultancy
> richard at light.demon.co.uk
> 3 Midfields Walk
> Burgess Hill
> West Sussex RH15 8JA
> U.K.
> tel. (44) 1444 232067

--
John Tigue
Sr. Software Architect
DataChannel
http://www.datachannel.com
jtigue at datachannel.com
206-462-1999
