SAX2: ParserFactory

Miles Sabin msabin at cromwellmedia.co.uk
Thu Jan 13 13:57:07 GMT 2000


Whether or not XMLReader extends Parser, it looks as tho' this
utility class is going to have to be replaced for SAX2. Seeing
as we've got this opportunity, I'd like to propose we use an
alternative (or at least supplementary) mechanism for
specifying an XMLReader implementation class.

The current mechanism is to have the org.xml.sax.parser System
property specify the name of a Parser implementing class for
ParserFactory to instantiate. This works fine, but has a
couple of drawbacks. First, it's only possible for an
application to bind to one parser implementation using this
technique ... applications that want to use more than one
have to specify at least one of the parser classes explicitly. 
Second, it requires some external configuration: it isn't
enough just to put the parser implementation on the CLASSPATH
for the parser to be made available.
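
For reference, the property-based lookup amounts to something like
the following. This is only a rough sketch, not the actual
ParserFactory source: the class name PropertyLookupSketch is made up,
it is written in present-day Java, and it returns Object rather than
org.xml.sax.Parser so that it compiles without the SAX classes on the
CLASSPATH.

```java
// Hypothetical illustration of a System-property lookup; not the real ParserFactory.
final class PropertyLookupSketch {
    // The property name used by the current SAX mechanism.
    static final String PROPERTY = "org.xml.sax.parser";

    /** Instantiates the class named by the System property. */
    static Object makeParser() throws ReflectiveOperationException {
        String className = System.getProperty(PROPERTY);
        if (className == null) {
            // Nothing was configured externally, so there is nothing to load.
            throw new IllegalStateException("System property " + PROPERTY + " is not set");
        }
        return makeParser(className);
    }

    /** Instantiates the named class via its public no-argument constructor. */
    static Object makeParser(String className) throws ReflectiveOperationException {
        return Class.forName(className).getDeclaredConstructor().newInstance();
    }
}
```

Only one class name fits in the property, which is the first of the
two drawbacks above.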

There is another mechanism that avoids both of these
(admittedly minor) problems. The general idea is described
in the documentation for the new Service Provider mechanism
in the JDK 1.3 beta release,

  http://java.sun.com/products/jdk/1.3/docs/guide/jar/jar.html

We can't use that directly (two reasons: we want to support
pre 1.3, and in any case in 1.3 we'd have to reference a class
in sun.misc.*) but we can do the same sort of thing.

Here's how it works (sorry if this is a bit cryptic). Rather
than using a System property we use one or more provider files.
These are just text files somewhere on the CLASSPATH, accessible
via ClassLoader.getSystemResources(), which contain the names of
the concrete classes that implement XMLReader. An
XMLReaderFactory could return an Iterator or Enumeration over
the available implementation classes (more likely over some
meta-information providing access to the features of the
various installed parsers so an application can choose if
there's more than one) and allow an app to select and
instantiate one.

The neat thing about this is that if we choose a well known
location on the CLASSPATH to put the provider files it'll be
possible to bundle a parser implementation along with its
own provider file, making it possible to configure an app to
use it merely by making sure that the parser is on the
CLASSPATH. For example, suppose we put the provider files at
org/xml/sax2/XMLReader.providers, then I could distribute
a jar file of my parser that looked like this,

  com
    cromwellmedia
      markup
        xml
          NonValidatingXMLReaderImpl.class
          ValidatingXMLReaderImpl.class
  org
    xml
      sax2
        XMLReader.providers

where XMLReader.providers contains the lines,

  com.cromwellmedia.markup.xml.NonValidatingXMLReaderImpl
  com.cromwellmedia.markup.xml.ValidatingXMLReaderImpl

An XMLReaderFactory would be able to reference my providers
file via ClassLoader.getSystemResources() and hence
instantiate (or provide meta-information about) my two parsers.
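
As a rough illustration of what a factory might do with one such
providers file, here is a minimal sketch. Everything in it is an
assumption made for illustration: the class name ProviderFileSketch,
the one-class-name-per-line format with blank lines ignored, and the
choice to skip names that cannot be loaded. It is written in
present-day Java and returns plain Objects so that it stands alone
without any SAX2 classes.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of any SAX distribution.
final class ProviderFileSketch {

    /** Reads one class name per line, trimming whitespace and skipping blank lines. */
    static List<String> parse(Reader in) throws IOException {
        List<String> names = new ArrayList<>();
        BufferedReader reader = new BufferedReader(in);
        String line;
        while ((line = reader.readLine()) != null) {
            line = line.trim();
            if (!line.isEmpty()) {
                names.add(line);
            }
        }
        return names;
    }

    /** Instantiates every listed class that can be loaded; names that fail are skipped. */
    static List<Object> instantiateAll(List<String> classNames) {
        List<Object> instances = new ArrayList<>();
        for (String name : classNames) {
            try {
                // Requires a public no-argument constructor on the implementation class.
                instances.add(Class.forName(name).getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException | LinkageError e) {
                // Listed but not loadable: ignore it and carry on with the rest.
            }
        }
        return instances;
    }
}
```

A factory that offers meta-information instead would stop after
parse() and only instantiate the class the application picks.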

Note that this mechanism is compatible with someone else's
parser(s) being on the CLASSPATH too. So long
as that parser is in a separate jar file (or in a separate
class tree) ClassLoader.getSystemResources() will be able to
locate both provider files, so an XMLReaderFactory would be
able to amalgamate the files and offer access to all the
parsers on the CLASSPATH.
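
The amalgamation step could look roughly like this. Again a sketch
under assumptions: ProviderMergeSketch and its method names are made
up, the files are read as UTF-8, and duplicate class names are
dropped while first-seen order is kept. The resource path is the
example location suggested above, and the code is present-day Java.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper; not part of any SAX distribution.
final class ProviderMergeSketch {
    // Example location from the text above; any agreed path would do.
    static final String PROVIDERS = "org/xml/sax2/XMLReader.providers";

    /** Reads every provider file in turn and returns the distinct class names. */
    static List<String> merge(Enumeration<URL> providerFiles) throws IOException {
        Set<String> names = new LinkedHashSet<>();
        while (providerFiles.hasMoreElements()) {
            URL file = providerFiles.nextElement();
            try (BufferedReader in = new BufferedReader(
                    new InputStreamReader(file.openStream(), StandardCharsets.UTF_8))) {
                String line;
                while ((line = in.readLine()) != null) {
                    line = line.trim();
                    if (!line.isEmpty()) {
                        names.add(line); // the set silently ignores repeats
                    }
                }
            }
        }
        return new ArrayList<>(names);
    }

    /** Merges every copy of the provider file visible to the system class loader. */
    static List<String> allOnClasspath() throws IOException {
        return merge(ClassLoader.getSystemResources(PROVIDERS));
    }
}
```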

One complication tho'. ClassLoader.getSystemResources() (nb.
plural) is new in JDK 1.2. With a bit of fiddling about we can
get this mechanism to work with 1.1 as well, but with the
limitation that multiple provider files can't be automatically
amalgamated (i.e. if you had more than one parser you'd have
to stitch the two provider files together and put the manually
joined file at the front of the CLASSPATH).
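
One way the fiddling might go, purely as a guess: look up the plural
method reflectively so the class still loads where it doesn't exist,
and fall back to the singular ClassLoader.getSystemResource(), which
only returns the first match. ProviderLocatorSketch is a made-up
name, and the sketch uses present-day Java syntax (generics,
multi-catch style exception types) that a real 1.1 build would have
to avoid.

```java
import java.lang.reflect.Method;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper; not part of any SAX distribution.
final class ProviderLocatorSketch {

    /** Returns every copy of the resource if possible, otherwise at most the first one. */
    static List<URL> locate(String resourceName) {
        List<URL> urls = new ArrayList<>();
        try {
            // Resolved at run time, so a missing method is an exception rather than a link error.
            Method plural = ClassLoader.class.getMethod("getSystemResources", String.class);
            Enumeration<?> found = (Enumeration<?>) plural.invoke(null, resourceName);
            while (found.hasMoreElements()) {
                urls.add((URL) found.nextElement());
            }
        } catch (ReflectiveOperationException e) {
            // Plural lookup unavailable or failed: only the first file on the CLASSPATH is seen.
            URL first = ClassLoader.getSystemResource(resourceName);
            if (first != null) {
                urls.add(first);
            }
        }
        return urls;
    }
}
```

With the fallback path, a hand-joined provider file has to come
first on the CLASSPATH, as described above.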

Oh, and there's no reason why we couldn't support an
org.xml.sax2.parser System property as well.
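
Combining the two could be as simple as the sketch below. The
precedence (property first, then the first provider entry) is my
assumption and not something settled here, and ReaderSelectionSketch
is a made-up name.

```java
import java.util.List;

// Hypothetical helper; not part of any SAX distribution.
final class ReaderSelectionSketch {
    static final String PROPERTY = "org.xml.sax2.parser";

    /**
     * Picks a class name: the System property if it is set and non-blank,
     * otherwise the first provider entry, otherwise null.
     */
    static String chooseClassName(List<String> providerNames) {
        String override = System.getProperty(PROPERTY);
        if (override != null && !override.trim().isEmpty()) {
            return override.trim();
        }
        return providerNames.isEmpty() ? null : providerNames.get(0);
    }
}
```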

If there's any interest I can probably find the time to put
together a sample implementation.

Incidentally, I'd really like to see a mechanism of this sort
used for bootstrapping into a DOM implementation.

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin at cromwellmedia.com          http://www.cromwellmedia.com/






