XML Component API

Oren Ben-Kiki oren at capella.co.il
Sun Jan 17 09:44:29 GMT 1999


I raised the question of a standard API for XSL processors on the XSL mailing
list. The question quickly touched on the general issue of how to combine
XML processing modules, since there are two incompatible ways to pass XML
data - as an in-memory DOM tree or as "parsing" events.

I came up with the attached scheme. It allows writing all sorts of XML-related
components using either of the APIs, while still combining them easily to
achieve complex XML processing goals, with little or no loss of efficiency.

I understand that a new version of the SAX interface is being considered,
making it more like an implementation of the visitor pattern for DOM trees
than a simple parser interface. This is a chance to add something along the
lines of the attached API directly to the SAX interface, if that's the right
place for it.

Share & Enjoy,

    Oren Ben-Kiki

-------------- next part --------------

/**
 * An object which wishes to receive XML from another should implement this
 * interface. At least one of the methods should return a non-null value.
 */
interface XMLConsumer {

    /**
     * Return a SAX DocumentHandler if the object is interested in "parsing"
     * events (or, you can view this as applying the "Visitor" pattern
     * to an in-memory DOM tree). Returns null if the consumer won't accept
     * parsing events.
     */
    public DocumentHandler asSAXHandler();

    /**
     * Return a DOM Document handler if the object is interested in an
     * in-memory DOM tree. The consumer is free to modify this tree at will.
     * Returns null if the consumer won't accept a DOM tree.
     */
    public DOMDocumentHandler asDOMHandler();
};

/**
 * Accept a DOM tree from an XML producer. This is the equivalent of the
 * SAX DocumentHandler class. :TBD: Should this be in the DOM API spec? Is
 * there something there already which satisfies this need?
 */
interface DOMDocumentHandler {

    /**
     * Receive the DOM tree. The handler may do anything it wants with it.
     * :TBD: Ought there be a way by which the handler would indicate that
     * it is not going to change the tree?
     *
     * @exception   SAXException    If there is some error handling the DOM
     *                              tree. :TBD: Should this be a DOMException?
     *                              If so, then everywhere below that is
     *                              written "SAXException", a "DOMException"
     *                              should be added.
     */
    public void receiveDocument(Document document) throws SAXException;
};


/**
 * An object which generates XML. This interface is not used directly;
 * it is shared by XMLFilter and XMLSource.
 */
interface XMLProducer {

    /**
     * Set the consumer to which the XML data would be sent.
     * The producer asks the consumer for a handler which fits the type the
     * producer can supply. Some producers might know how to provide their data
     * in multiple formats. Typically producing one format is more efficient
     * than the other, so the producer should ask the consumer for a handler
     * for that format first and only then ask for a handler for the second one.
     *
     * @param       consumer        The consumer to send the XML to.
     * @exception   SAXException    If the consumer can't receive the data
     *                              format the producer creates.
     */
    public void setConsumer(XMLConsumer consumer) throws SAXException;
};

/**
 * An object which accepts XML from a producer and passes it on to a
 * consumer. Along the way, it might do arbitrary processing. One of the
 * more useful filters is one which can accept any form of XML data from
 * the producer and convert it to the form required by the consumer.
 */
interface XMLFilter extends XMLProducer, XMLConsumer {
};

/**
 * An object which accepts XML from one producer and sends it to multiple
 * consumers. A useful one is a multiplexer which clones DOM trees and
 * gives each consumer a copy. One which simply passes the same DOM tree
 * to all interested consumers is also possible, but should be used with
 * great care.
 */
interface XMLMultiplexer extends XMLConsumer {

    /**
     * Add a consumer. :TBD: How to handle adding the same one twice?
     *
     * @exception   SAXException    If the multiplexer can't accept this consumer.
     */
    public void addConsumer(XMLConsumer consumer) throws SAXException;

    /**
     * Remove a consumer. :TBD: How to handle removing a nonexistent one?
     * Throw a SAXException? Simply ignore the operation?
     */
    public void removeConsumer(XMLConsumer consumer);
};

/**
 * An XML source - one which generates the XML internally (or from another
 * format).
 */
interface XMLSource extends XMLProducer {

    /**
     * Sends the XML data to the consumer. If there is a processing
     * chain (via filters or multiplexers) then this triggers the whole
     * chain.
     *
     * @exception   SAXException    If the consumer can't handle the format
     *                              the producer can supply, there is a problem
     *                              generating the XML, or the consumer has
     *                              thrown an exception while handling it.
     */
    public void sendXml() throws SAXException;
};

/**
 * The SAX parser interface should be modified to implement XMLSource.
 */
interface XMLParser extends XMLSource {
    // A slightly modified version of the current interface.
};

/**
 * An XSL processor is "loaded" with an XSL stylesheet and from then on
 * serves as an XMLFilter - it takes an input XML document and converts
 * it into another.
 */
interface XSLProcessor extends XMLFilter {

    /**
     * Obtain the consumer to load with the XSL stylesheet XML representation.
     * This must be called before the filter is asked to process any XML data.
     */
    XMLConsumer getXSLConsumer();
};
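
To make the negotiation described for setConsumer concrete, here is a minimal
sketch of a producer that prefers SAX events and falls back to a DOM tree.
EventPreferringProducer, SaxOnlyConsumer, DomOnlyConsumer and NegotiationDemo
are hypothetical names that are not part of the proposal. The interfaces are
restated only so that the sketch compiles on its own, and DocumentHandler and
HandlerBase are taken to be the SAX 1 classes in org.xml.sax.

```java
import org.w3c.dom.Document;
import org.xml.sax.DocumentHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.SAXException;

// Interfaces restated from the proposal so that the sketch is self-contained.
interface DOMDocumentHandler {
    void receiveDocument(Document document) throws SAXException;
}

interface XMLConsumer {
    DocumentHandler asSAXHandler();
    DOMDocumentHandler asDOMHandler();
}

interface XMLProducer {
    void setConsumer(XMLConsumer consumer) throws SAXException;
}

// Hypothetical producer that can supply either format but prefers SAX events.
class EventPreferringProducer implements XMLProducer {
    private DocumentHandler saxHandler;
    private DOMDocumentHandler domHandler;

    public void setConsumer(XMLConsumer consumer) throws SAXException {
        // Ask for the cheaper format first, then fall back to the other one.
        saxHandler = consumer.asSAXHandler();
        domHandler = (saxHandler == null) ? consumer.asDOMHandler() : null;
        if (saxHandler == null && domHandler == null) {
            throw new SAXException("consumer accepts neither SAX events nor a DOM tree");
        }
    }

    // Reports which format was negotiated; "none" if no handler is held.
    String chosenFormat() {
        if (saxHandler != null) return "SAX";
        return (domHandler != null) ? "DOM" : "none";
    }
}

// Hypothetical consumers, used only to exercise the negotiation.
class SaxOnlyConsumer implements XMLConsumer {
    public DocumentHandler asSAXHandler() { return new HandlerBase(); }
    public DOMDocumentHandler asDOMHandler() { return null; }
}

class DomOnlyConsumer implements XMLConsumer {
    public DocumentHandler asSAXHandler() { return null; }
    public DOMDocumentHandler asDOMHandler() { return document -> { }; }
}

class NegotiationDemo {
    // Returns the negotiated format, or "rejected" if setConsumer throws.
    static String formatFor(XMLConsumer consumer) {
        EventPreferringProducer producer = new EventPreferringProducer();
        try {
            producer.setConsumer(consumer);
            return producer.chosenFormat();
        } catch (SAXException e) {
            return "rejected";
        }
    }
}
```

With these definitions, NegotiationDemo.formatFor(new SaxOnlyConsumer()) yields
"SAX" and NegotiationDemo.formatFor(new DomOnlyConsumer()) yields "DOM". A
producer that prefers DOM trees would simply ask in the opposite order.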

Also, some useful helper routines, controlled by system properties:

XMLParser makeXMLParser();
XSLProcessor makeXSLProcessor();
XMLFilter makeXMLConverter(); // Convert from any format to another.
XMLFilter makeXMLConverter(XMLConsumer consumer); // Call setConsumer as well.
XMLConsumer makeXMLWriter(String encoding);
XMLConsumer makeXMLWriter(); // Use UTF-8 by default.
XMLMultiplexer makeXMLMultiplexer(boolean toClone);
XMLMultiplexer makeXMLMultiplexer(); // Give each consumer a clone by default.
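
As one guess at what makeXMLMultiplexer(true) might hand back, here is a rough
sketch of a cloning multiplexer that handles only the DOM path. The names
CloningMultiplexer and MultiplexerDemo are hypothetical, and the answers it
gives to the open :TBD: questions (adding twice is a no-op, removing an unknown
consumer is ignored) are assumptions made for the sketch, not decisions of the
proposal. It also assumes the DOM implementation supports a deep cloneNode on
Document nodes, as the one bundled with the JDK does.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.DocumentHandler;
import org.xml.sax.SAXException;

// Interfaces restated from the proposal so that the sketch is self-contained.
interface DOMDocumentHandler {
    void receiveDocument(Document document) throws SAXException;
}

interface XMLConsumer {
    DocumentHandler asSAXHandler();
    DOMDocumentHandler asDOMHandler();
}

interface XMLMultiplexer extends XMLConsumer {
    void addConsumer(XMLConsumer consumer) throws SAXException;
    void removeConsumer(XMLConsumer consumer);
}

// Hypothetical DOM-only multiplexer that gives every consumer its own deep copy.
class CloningMultiplexer implements XMLMultiplexer, DOMDocumentHandler {
    private final List<XMLConsumer> consumers = new ArrayList<>();

    public void addConsumer(XMLConsumer consumer) throws SAXException {
        if (consumer.asDOMHandler() == null) {
            throw new SAXException("this sketch can only feed DOM consumers");
        }
        // Assumption: adding the same consumer twice is silently ignored.
        if (!consumers.contains(consumer)) consumers.add(consumer);
    }

    // Assumption: removing a consumer that was never added is silently ignored.
    public void removeConsumer(XMLConsumer consumer) { consumers.remove(consumer); }

    public DocumentHandler asSAXHandler() { return null; }
    public DOMDocumentHandler asDOMHandler() { return this; }

    public void receiveDocument(Document document) throws SAXException {
        for (XMLConsumer consumer : consumers) {
            // Each consumer may modify its tree freely, so each gets a clone.
            consumer.asDOMHandler().receiveDocument((Document) document.cloneNode(true));
        }
    }
}

class MultiplexerDemo {
    static XMLConsumer domConsumer(DOMDocumentHandler handler) {
        return new XMLConsumer() {
            public DocumentHandler asSAXHandler() { return null; }
            public DOMDocumentHandler asDOMHandler() { return handler; }
        };
    }

    // True if a change made by one consumer is invisible to the other and to the source.
    static boolean copiesAreIndependent() {
        try {
            Document original =
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            original.appendChild(original.createElement("root"));
            Document[] seen = new Document[2];
            CloningMultiplexer mux = new CloningMultiplexer();
            mux.addConsumer(domConsumer(d -> {
                d.getDocumentElement().setAttribute("touched", "yes");
                seen[0] = d;
            }));
            mux.addConsumer(domConsumer(d -> seen[1] = d));
            mux.asDOMHandler().receiveDocument(original);
            return seen[0] != seen[1]
                && seen[0].getDocumentElement().hasAttribute("touched")
                && !seen[1].getDocumentElement().hasAttribute("touched")
                && !original.getDocumentElement().hasAttribute("touched");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A non-cloning variant would pass the same Document to every handler, which is
exactly the case where one consumer's modifications leak into the others.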

So, once all this is in place, one may create an XSL processor, with logging
of the input given to it, by:

XMLSource xmlParser = makeXMLParser();
XMLFilter xslProcessor = makeXSLProcessor();
XMLMultiplexer logger = makeXMLMultiplexer();
XMLConsumer inputWriter = makeXMLWriter();
XMLConsumer outputWriter = makeXMLWriter();

// Initialize the parser and the writers...

xslProcessor.setConsumer(outputWriter);
logger.addConsumer(xslProcessor);
logger.addConsumer(inputWriter);
xmlParser.setConsumer(logger);
xmlParser.sendXml();

This assumes that all the consumers can accept the formats the producers give
them; otherwise, one would have to add instances of an appropriate converter
XMLFilter class between mismatching consumers and producers. Note that the cost
of such converters is zero if the consumer and producer do match, so it is a
good habit to add such a converter unless it is absolutely certain that
conversion is unnecessary:

xslProcessor.setConsumer(makeXMLConverter(outputWriter));
logger.addConsumer(makeXMLConverter(xslProcessor));
logger.addConsumer(makeXMLConverter(inputWriter));
xmlParser.setConsumer(logger);
xmlParser.sendXml();
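
The claim that a converter costs nothing when the formats already match can be
illustrated with a rough sketch: the converter hands the producer the
consumer's own handler, so it drops out of the chain entirely, and only when
the consumer wants a DOM tree does it interpose a tree builder. SketchConverter,
TreeBuilder and ConverterDemo are hypothetical names. Only the events-to-tree
direction is shown (the tree-to-events direction is left out), and the SAX 1
classes from org.xml.sax are assumed.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributeListImpl;

// Interfaces restated from the proposal so that the sketch is self-contained.
interface DOMDocumentHandler { void receiveDocument(Document document) throws SAXException; }
interface XMLConsumer { DocumentHandler asSAXHandler(); DOMDocumentHandler asDOMHandler(); }
interface XMLProducer { void setConsumer(XMLConsumer consumer) throws SAXException; }
interface XMLFilter extends XMLProducer, XMLConsumer { }

// Hypothetical converter: hands out the consumer's own handler when formats match.
class SketchConverter implements XMLFilter {
    private XMLConsumer target;

    public void setConsumer(XMLConsumer consumer) { target = consumer; }

    public DocumentHandler asSAXHandler() {
        DocumentHandler direct = target.asSAXHandler();
        if (direct != null) return direct; // match: the converter drops out of the chain
        DOMDocumentHandler dom = target.asDOMHandler();
        return (dom == null) ? null : new TreeBuilder(dom); // mismatch: build a tree
    }

    // Walking a DOM tree to fire events at a SAX-only consumer is not sketched here.
    public DOMDocumentHandler asDOMHandler() { return target.asDOMHandler(); }
}

// Turns SAX 1 events into a DOM tree and delivers it when the document ends.
class TreeBuilder extends HandlerBase {
    private final DOMDocumentHandler sink;
    private final Deque<Node> open = new ArrayDeque<>();
    private Document doc;

    TreeBuilder(DOMDocumentHandler sink) { this.sink = sink; }

    @Override public void startDocument() throws SAXException {
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
        } catch (ParserConfigurationException e) {
            throw new SAXException(e);
        }
        open.clear();
        open.push(doc);
    }
    @Override public void startElement(String name, AttributeList atts) {
        Element element = doc.createElement(name);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getName(i), atts.getValue(i));
        }
        open.peek().appendChild(element);
        open.push(element);
    }
    @Override public void characters(char[] ch, int start, int length) {
        open.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }
    @Override public void endElement(String name) { open.pop(); }
    @Override public void endDocument() throws SAXException { sink.receiveDocument(doc); }
}

class ConverterDemo {
    static XMLConsumer consumer(DocumentHandler sax, DOMDocumentHandler dom) {
        return new XMLConsumer() {
            public DocumentHandler asSAXHandler() { return sax; }
            public DOMDocumentHandler asDOMHandler() { return dom; }
        };
    }

    static String run() {
        try {
            SketchConverter converter = new SketchConverter();
            DocumentHandler direct = new HandlerBase();
            converter.setConsumer(consumer(direct, null));
            boolean sameObject = converter.asSAXHandler() == direct;
            Document[] received = new Document[1];
            converter.setConsumer(consumer(null, d -> received[0] = d));
            DocumentHandler handler = converter.asSAXHandler();
            handler.startDocument();
            handler.startElement("greeting", new AttributeListImpl());
            handler.characters("hi".toCharArray(), 0, 2);
            handler.endElement("greeting");
            handler.endDocument();
            Element root = received[0].getDocumentElement();
            return sameObject + " " + root.getTagName() + " " + root.getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

ConverterDemo.run() returns "true greeting hi": in the matching case the
producer receives the very same handler object the consumer supplied, and in
the mismatched case the DOM consumer receives a tree built from the events.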

