About DOM and CORBA (long)

Jeff Greif jmg at trivida.com
Sun Apr 4 00:38:28 BST 1999


Didier PH Martin asked about issues involving a C++ DOM mapping and binary compatibility
between different implementations of such a thing if they existed.

Didier,

The DOM IDL spec can be used just as it is if clients and servers communicate using CORBA.
Neither the implementation language on the server nor the implementation language of the
client stubs should matter (and the two can be different).  Almost all ORB implementations are
heavily optimized for single-address-space or single-machine client-server communication.

The reason the Java and ECMAScript DOM bindings exist is for programs that do not use an ORB
for communication -- those in the Java language can agree to use the Java bindings, and
different DOM implementations in the same language can interoperate, as you mention.  This is
unnecessary if the DOM is a server object in a CORBA ORB.  (In fact, the idltojava program and
the JacORB IDL compiler may produce different classes, but IIOP will handle any differences
when a JacORB client talks to an idltojava-based server.)

The binary compatibility you think you are looking for is a feature of the implementation
language, not of CORBA or IDL, which make no promise to provide any and by design should not
be expected to.  A single ORB may support several different implementation languages without
binary compatibility between the server objects they implement.  The compatibility is only at
the IDL level.

It is probably a bad idea to attempt a special C++ DOM mapping.  The C++ objects produced by
different compilers are, by explicit design, incompatible (see, for instance, the chapter on
linkage in the Stroustrup and Ellis ARM from 1990).  The layout of the objects is almost
entirely up to the implementation, as are the placement (or even the use) of vtables and the
runtime type information.  Each compiler uses a different name mangling scheme (again by
explicit design), so your link will fail if you try to mix code produced by different
compilers!  (If this didn't happen, the incompatibilities would cause various mysterious
failures at runtime.)  Binary compatibility, even on a single platform, is not part of the
picture.

Just in case I haven't made myself clear enough, suppose you buy the HighSpeedDOM DLL, which
was written in C++ and compiled using compiler X.  Now you get the DOM.h file, which defines
the classes and methods according to the new C++ DOM mapping.  You include this in your C++
client code, which instantiates a DOM object and calls some of these methods, and you compile
the client using compiler Y.  When you try to link against HighSpeed's DLL, the link will fail
simply because the names of the methods in your compiled code and in HighSpeed's code don't
agree.  If by some dreadful accident compiler X and compiler Y have chosen the same name
mangling scheme, then the program will fail at runtime, calling the wrong method with the
wrong arguments or looking in the wrong slot of the object for a data value, and memory will
probably be corrupted.  A company trying to produce a terrific DOM implementation in binary
form must compile it with each supported compiler.
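
To make the point about implementation-defined names and layout concrete, here is a minimal
sketch.  The class Node is a hypothetical stand-in for something a shared DOM.h might declare;
it is not taken from any real mapping.  Note that typeid(...).name() is not the linker symbol
itself, only an implementation-defined string, but it tends to show the same divergence
between compilers that the mangled method names do.

```cpp
#include <cstddef>
#include <string>
#include <typeinfo>

// Hypothetical class, standing in for one declared in a shared DOM.h header.
class Node {
public:
    virtual ~Node() {}
    virtual int childCount() const { return count_; }
private:
    int count_ = 0;
};

// The same data member without any virtual functions, for comparison.
struct PlainNode {
    int count_;
};

// Returns the compiler's own encoding of the type name.  The C++ standard
// leaves this string implementation-defined, so two compilers may disagree.
std::string encodedTypeName() {
    return typeid(Node).name();
}

// Bytes the compiler added to Node for its own bookkeeping (on mainstream
// compilers, a vtable pointer plus padding).  The standard does not fix
// this number, or where in the object those bytes live.
std::size_t hiddenBytes() {
    return sizeof(Node) - sizeof(PlainNode);
}
```

Printing encodedTypeName() and hiddenBytes() from the same source file built with two
different compilers is a quick way to see that neither value is something a client compiled
elsewhere can rely on.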

There might be some performance advantage in implementing the DOM in C++ (that is, generating
C++ server skeletons for the IDL and filling them in with code).  The idea would probably be
that parsing a document might go faster than in Java, and writing out a DOM tree might be
faster (these are examples of the larger or more complex operations that take place within the
DOM code, triggered by a single client interaction).  Client inquiries and modifications of
that DOM would incur extra overhead for passing through the ORB, but you'd hope this is small
compared to the work done in the DOM implementation, assuming the client and server are on the
same machine and preferably in the same address space.  Many factors determine whether this
hope is reasonable, but the moderate degree of success of CORBA indicates that it is at least
sometimes reasonable.  The overhead would be significant or dominant on tiny operations like
advancing an iterator.

Given that there is no binary compatibility, no interoperability of code from different
compilers, and no direct C++ to C++ serialization protocol, and that you might want to write a
DOM client in some other language, the best thing appears to be to just implement the DOM in
C++ if you think there are performance advantages or you have some other constraints, and to
call its methods through CORBA.

When you implement a CORBA object in C++, there are two levels of incompatibility:
  1.  The different translation of IDL to C++ classes and methods by different ORBs.  This is
handled by IIOP between the client ORB, which is calling, and the server ORB, which is
responding.
  2.  The different binary form of the objects and methods produced by the C++ compiler.  This
is completely irrelevant to the client, which might be compiled using a different C++
compiler, as the ORBs communicate essentially at the IDL level.
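
A rough sketch of why the second level does not matter to the client: the client-side stub
only builds and parses byte buffers, so it never touches the server object's layout or vtable.
The wire format below is a toy (one length byte, then the characters) and is not IIOP or
GIOP; ElementImpl, ElementStub, dispatch and the Transport type are hypothetical names made up
for illustration.  Only the operation names getAttribute and setAttribute come from the DOM
Element interface.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Toy encoding: one length byte followed by the characters.
void putString(Bytes& out, const std::string& s) {
    if (s.size() > 255) throw std::length_error("string too long for toy format");
    out.push_back(static_cast<std::uint8_t>(s.size()));
    out.insert(out.end(), s.begin(), s.end());
}

std::string getString(const Bytes& in, std::size_t& pos) {
    const std::size_t len = in.at(pos++);  // at() throws if the buffer is short
    if (pos + len > in.size()) throw std::out_of_range("truncated message");
    std::string s(reinterpret_cast<const char*>(in.data() + pos), len);
    pos += len;
    return s;
}

// Server side: the real object.  Its binary form never leaves the server.
class ElementImpl {
public:
    void setAttribute(const std::string& name, const std::string& value) {
        attrs_[name] = value;
    }
    std::string getAttribute(const std::string& name) const {
        auto it = attrs_.find(name);
        return it == attrs_.end() ? std::string() : it->second;
    }
private:
    std::map<std::string, std::string> attrs_;
};

// Server skeleton: decodes a request and invokes the implementation.
Bytes dispatch(ElementImpl& target, const Bytes& request) {
    std::size_t pos = 0;
    const std::string op = getString(request, pos);
    Bytes reply;
    if (op == "getAttribute") {
        putString(reply, target.getAttribute(getString(request, pos)));
    } else if (op == "setAttribute") {
        const std::string name = getString(request, pos);
        const std::string value = getString(request, pos);
        target.setAttribute(name, value);
    } else {
        throw std::invalid_argument("unknown operation: " + op);
    }
    return reply;
}

// Stands in for the connection between the client ORB and the server ORB.
using Transport = std::function<Bytes(const Bytes&)>;

// Client stub: offers the same operations but sees only bytes.
class ElementStub {
public:
    explicit ElementStub(Transport send) : send_(std::move(send)) {}
    void setAttribute(const std::string& name, const std::string& value) {
        Bytes req;
        putString(req, "setAttribute");
        putString(req, name);
        putString(req, value);
        send_(req);
    }
    std::string getAttribute(const std::string& name) {
        Bytes req;
        putString(req, "getAttribute");
        putString(req, name);
        const Bytes reply = send_(req);
        std::size_t pos = 0;
        return getString(reply, pos);
    }
private:
    Transport send_;
};
```

In this sketch the transport is just a function call, for example
ElementStub stub([&impl](const Bytes& r) { return dispatch(impl, r); });
but nothing in ElementStub would change if the bytes crossed a process or machine boundary
to a server built with a different compiler.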

Finally, it should be noted that the C++ vtable is in no way an element of any ANSI or ISO C++
standard.  COM is a specified binary interface that was deliberately designed to look like
(and hence be optimized for) C++ vtables produced by Microsoft compilers.  Any implementation
of COM objects produced by other compilers (on Intel or other platforms) must either have
matching vtable implementations or use some other trick (most likely another layer) to realize
the binary compatibility.  This is why it has taken so long to have COM implementations (not
widely used) on non-Windows platforms, and why there are probably still interoperability
problems between different platforms using COM.  COM is a way of imposing a binary
compatibility standard on top of C++ implementations, but it also adds non-trivial overhead.
I would see little reason to attempt to produce COM-based DOM interoperability rather than
CORBA-based unless you know only the Windows platform would be used.
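
As a rough illustration of what "a specified binary interface" means here, the sketch below
spells out a vtable by hand as a struct of function pointers, which is roughly how a compiler
without a matching vtable layout would have to present an object.  All the names
(NodeInterface, NodeVtbl, SimpleNode, createNode, countChildren) are hypothetical, and real
COM additionally fixes the calling convention and requires the IUnknown methods, which are
omitted to keep the sketch short.

```cpp
// The interface a client sees: a pointer to a table of function pointers.
// Because the layout is written out explicitly, it does not depend on how
// any particular compiler arranges its own vtables.
struct NodeVtbl;

struct NodeInterface {
    const NodeVtbl* vtbl;
};

struct NodeVtbl {
    int  (*childCount)(NodeInterface* self);
    void (*release)(NodeInterface* self);
};

// One possible implementation.  The interface must be the first member so
// that a pointer to it is also a pointer to the whole object.
struct SimpleNode {
    NodeInterface iface;
    int children;
};

int simpleChildCount(NodeInterface* self) {
    return reinterpret_cast<SimpleNode*>(self)->children;
}

void simpleRelease(NodeInterface* self) {
    delete reinterpret_cast<SimpleNode*>(self);
}

const NodeVtbl kSimpleVtbl = { simpleChildCount, simpleRelease };

// Factory: the only way a client obtains an object.
NodeInterface* createNode(int children) {
    SimpleNode* node = new SimpleNode{ { &kSimpleVtbl }, children };
    return &node->iface;
}

// Client code: every call goes through the explicit table.
int countChildren(NodeInterface* node) {
    return node->vtbl->childCount(node);
}
```

The extra indirection and the hand-maintained table are a small example of the added layer
mentioned above; a client would call createNode, use countChildren, and finish with
node->vtbl->release(node).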

Jeff


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



