In HTML: XML Documents Are Objects! (or "Killing OO Softly With XML")

Tue Mar 10 21:35:42 GMT 1998

It looks like my last post was messed up a bit. Here's a text intro
with the entire document attached in HTML.  Sorry for the double post.
=======================================================================

"Wouldn't it be nice if one could simply tell an object
to serialize to XML, and then deserialize back into an object?"

As programmers, do you long for the old days when data was data and code was
code? Do you buy into the idea that the behavior associated with data should
be embedded within the application so as to restrict reuse of that data? Ah,
the good old days of relational databases! As currently used, XML lets you
revisit those days... but don't be persuaded by the dark force! Put
on your OO glasses and see the light!

Sure, XML provides incredible potential, and I am all for it. But in their
current form, XML documents are nothing more than mobile semi-structured
non-object databases (this is pretty cool, but not OO). Why is it that
programmers have suddenly forgotten all about objects just so they could
write XML? Is a return to relational databases that enticing? (Bleech!)
The only practical rationale for such an approach is that programmers
want to keep their data private. They don't want other applications to have
the ability to reuse that data. They accomplish this feat by embedding all
of the code associated with that data (formerly called "behaviors" in the
OO era) in their own applications. [Who's running this show anyway? Is
XML some kind of conspiracy to kill OO?]

Here's a simple example. You write an application that converts unformatted
poems into composite poem objects rich with behavior. You want to store these
poems, and share them with other applications that want to do things with
poems (whatever it is you do with poems). You define an XML structure and
start generating XML documents as a means to store and share the poems. Every
application (including yours) that reads in your poems using an XML parser
will see the poem as something similar to:

[This XML document was taken from an example accessible at the Microstar
website (distributors of the AElfred XML parser).  The file name is donne.xml.
Below is the parse tree for this document.]

root |-> Element |-> Element |-> Element
                 |           |-> Element
                 |           |-> Element |-> Element
                 |
                 |-> Element |-> Element
                             |-> Element
                             |-> Element
                             |-> Element

Pretty impressive, right? It sure doesn't look like a poem object, does it?
Once this structure has been generated, every single application will need
to supply its own code to understand how to navigate and interpret this
structure, and provide behavior for it. This is typical if you are a C
programmer, but let's be clear: this isn't OO. And while DOM takes us a bit
further, you still won't get the parser to produce a poem object and its
poem-specific behaviors from the XML document (but we still want DOM!).
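
To make that concrete, here is a minimal sketch of the navigation code
each application ends up writing against a generic tree. It assumes the
JDK's standard DOM API (javax.xml.parsers, org.w3c.dom) as a stand-in; it
is not AElfred or Lark code. The element names (poem, front, title, body,
stanza) are assumed for illustration, and the text content is invented
filler, not the contents of donne.xml.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

class GenericPoemReader {
    // Placeholder document: element names are assumed, text is filler.
    static final String XML =
        "<poem><front><title>Sample Title</title><author>Sample Author</author></front>"
      + "<body><stanza>first</stanza><stanza>second</stanza></body></poem>";

    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }

    // The application must know that the title lives at poem/front/title.
    static String title(Document doc) {
        Element front = firstChild(doc.getDocumentElement(), "front");
        return firstChild(front, "title").getTextContent();
    }

    // ...and that stanzas are the element children of poem/body.
    static int stanzaCount(Document doc) {
        Element body = firstChild(doc.getDocumentElement(), "body");
        int count = 0;
        for (Node c = body.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && "stanza".equals(c.getNodeName())) {
                count++;
            }
        }
        return count;
    }

    // Generic helper: first child element with the given name.
    static Element firstChild(Element parent, String name) {
        for (Node c = parent.getFirstChild(); c != null; c = c.getNextSibling()) {
            if (c instanceof Element && name.equals(c.getNodeName())) {
                return (Element) c;
            }
        }
        throw new IllegalArgumentException("missing element: " + name);
    }

    public static void main(String[] args) {
        Document doc = parse(XML);
        System.out.println(title(doc) + " has " + stanzaCount(doc) + " stanzas");
    }
}
```

Nothing in this tree knows it is a poem; all of that knowledge sits in
title() and stanzaCount(), and every other application has to write its
own copy.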

The process of generating XML strips the behavior out of the objects; or,
put differently, XML and related standards do not describe a mechanism
by which one can attach behavior to XML documents. The parser, in turn,
cannot therefore work miracles when it reads the data (which are no longer
objects) back into the application. Or can it? Why can't we view XML as a
serialized object representation? If we agree that this is not too
far-fetched, then why can't parsers deserialize or objectify the objects
contained in the XML documents, rather than simply handing us data and making
the applications do all of the work?  What if the parsers generated real
classes (with behavior!) instead of generic Element classes? (Perhaps this
would be more motivating if we talked about XML documents as orders, or
anything else, instead of poems.) The poem above would instead look like this:

root |-> poem    |-> front   |-> title
                 |           |-> author
                 |           |-> revision-history |-> item
                 |
                 |-> body    |-> stanza
                             |-> stanza
                             |-> stanza
                             |-> stanza

Oh, but could it be that simple? (The answer is "yes.") Would having a
parser output objects with type-specific behavior be useful? (Hmm...)
Would programmers really want to share their objects if they could?
(The answer should be "yes.") Even if they didn't want to share their
objects, or if nobody wanted their objects, why violate the principles
of OO and make the programmers' lives more difficult? Wouldn't it be
nice if one could simply tell an object to serialize to XML, and then
deserialize back into an object?
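
As a rough illustration of that round trip, here is a hand-written sketch
of a poem class that serializes itself to XML and rebuilds itself from it.
The class, its fields, and the element names are hypothetical, and parsing
uses the JDK's DOM API as an assumption; this is not the mechanism in the
extended Lark parser.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class Poem {
    final String title;
    final List<String> stanzas;

    Poem(String title, List<String> stanzas) {
        this.title = title;
        this.stanzas = new ArrayList<>(stanzas);
    }

    // Poem-specific behavior that stays with the data.
    int stanzaCount() {
        return stanzas.size();
    }

    // Serialize this object to an XML string.
    String toXml() {
        StringBuilder sb = new StringBuilder("<poem><front><title>")
            .append(escape(title)).append("</title></front><body>");
        for (String s : stanzas) {
            sb.append("<stanza>").append(escape(s)).append("</stanza>");
        }
        return sb.append("</body></poem>").toString();
    }

    // Deserialize: the caller gets a Poem back, not a tree of Elements.
    static Poem fromXml(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement();
            String title = root.getElementsByTagName("title").item(0).getTextContent();
            List<String> stanzas = new ArrayList<>();
            NodeList nodes = root.getElementsByTagName("stanza");
            for (int i = 0; i < nodes.getLength(); i++) {
                stanzas.add(nodes.item(i).getTextContent());
            }
            return new Poem(title, stanzas);
        } catch (Exception e) {
            throw new IllegalArgumentException("not a poem document", e);
        }
    }

    // Escape the characters that would break element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```

Usage is the one-liner the question asks for: Poem.fromXml(poem.toXml())
yields an object that still answers stanzaCount().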

With some VERY simple extensions to current parsers, this can occur,
and already has -- we've created an extended version of the Lark XML
parser which provides this capability. Our input to this extended
parser is the XML document and the type-specific classes (like poem)
extended with the basic ability to deserialize themselves.
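
One way to picture this arrangement is the sketch below: a reader is given
a table of element names and type-specific classes, and each class
rebuilds itself from its own element. Every name here (XmlDeserializable,
ObjectifyingReader, PoemObject, StanzaObject) is hypothetical, and the
JDK's DOM API is assumed for parsing. It is not the extended Lark API,
which is described in the attached document.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical contract: a class that can rebuild itself from its element.
interface XmlDeserializable {
    void readFrom(Element element, ObjectifyingReader reader);
}

class StanzaObject implements XmlDeserializable {
    String text = "";

    public void readFrom(Element element, ObjectifyingReader reader) {
        text = element.getTextContent();
    }

    // Type-specific behavior.
    int wordCount() {
        String t = text.trim();
        return t.isEmpty() ? 0 : t.split("\\s+").length;
    }
}

class PoemObject implements XmlDeserializable {
    final List<StanzaObject> stanzas = new ArrayList<>();

    public void readFrom(Element element, ObjectifyingReader reader) {
        NodeList nodes = element.getElementsByTagName("stanza");
        for (int i = 0; i < nodes.getLength(); i++) {
            Object child = reader.objectify((Element) nodes.item(i));
            if (child instanceof StanzaObject) {
                stanzas.add((StanzaObject) child);
            }
        }
    }

    int wordCount() {
        int total = 0;
        for (StanzaObject s : stanzas) {
            total += s.wordCount();
        }
        return total;
    }
}

class ObjectifyingReader {
    private final Map<String, Supplier<? extends XmlDeserializable>> types = new HashMap<>();

    void register(String elementName, Supplier<? extends XmlDeserializable> factory) {
        types.put(elementName, factory);
    }

    Object read(String xml) {
        try {
            return objectify(DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)))
                .getDocumentElement());
        } catch (Exception e) {
            throw new IllegalArgumentException("could not read document", e);
        }
    }

    // Registered names become typed objects; anything else stays a generic Element.
    Object objectify(Element element) {
        Supplier<? extends XmlDeserializable> factory = types.get(element.getNodeName());
        if (factory == null) {
            return element;
        }
        XmlDeserializable obj = factory.get();
        obj.readFrom(element, this);
        return obj;
    }
}
```

After registering "poem" and "stanza", read() hands back a PoemObject
whose wordCount() works immediately. With nothing registered, the same
call returns the plain Element, so existing generic code keeps working.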

The details are described in the attached document. The enhanced
version of Lark is freely available on request.

Paul.

--

********************************************************************
Paul Pazandak                                      pazandak at objs.com
Object Services and Consulting, Inc.             http://www.objs.com
Minneapolis, Minnesota 55420-5409                       612-881-6498
********************************************************************

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19980310/5f58d8db/ExtendingLark.htm