SAX: Next Round
david at megginson.com
Wed Jan 20 21:24:17 GMT 1999
[I started a thread like this before the holidays, but then got drawn
away, so I'll try again.]
I've been thinking about what new SAX interfaces we need the most
(with much prodding from users). Here's what I think we need as a
minimum:
1. A standard filter interface (and perhaps an optional base class).
2. A handler interface for lexical events like comments, CDATA
sections, and entity references.
3. Some kind of namespace support.
I'd like to lose EntityResolver and DTDHandler (who uses them?), but I
don't know if we can.
1. Filter Interface
-------------------
My first inclination here was to have the filter interface extend
org.xml.sax.Parser, something like this:
public interface ParserFilter extends Parser
{
public abstract void setParent (Parser parser);
}
Upon more careful consideration, though, I think that we might want to
make Filter an entirely independent interface:
public interface Filter
{
public abstract void setParent (Parser parser);
}
The advantage to the second approach is that you can have something
like
public class SAXDriver implements Parser, LexicalProcessor, Filter,
NamespaceProcessor
{
}
and mix and match different capabilities (more on this below).
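To make the second approach a little more concrete, here is a rough sketch of a class that implements both Parser and the independent Filter interface, written against the SAX 1.0 types in org.xml.sax. The names UpperCaseFilter, StubParser and FilterDemo are invented for illustration, the interface is package-private only so that everything fits in one file, and the stub stands in for a real parser; none of this is a proposed API.

```java
import java.io.IOException;
import java.util.Locale;
import org.xml.sax.*;
import org.xml.sax.helpers.AttributeListImpl;

// The independent interface from the post.
interface Filter {
    void setParent(Parser parser);
}

// Hypothetical filter: upper-cases element names on their way to the client.
class UpperCaseFilter implements Parser, Filter, DocumentHandler {
    private Parser parent;
    private DocumentHandler client = new HandlerBase();

    public void setParent(Parser parser) { parent = parser; }

    // Parser side: configuration is handed straight to the parent.
    public void setLocale(Locale locale) throws SAXException { parent.setLocale(locale); }
    public void setEntityResolver(EntityResolver r) { parent.setEntityResolver(r); }
    public void setDTDHandler(DTDHandler h) { parent.setDTDHandler(h); }
    public void setErrorHandler(ErrorHandler h) { parent.setErrorHandler(h); }
    public void setDocumentHandler(DocumentHandler h) { client = h; }
    public void parse(InputSource source) throws SAXException, IOException {
        parent.setDocumentHandler(this);   // intercept the parent's events
        parent.parse(source);
    }
    public void parse(String systemId) throws SAXException, IOException {
        parse(new InputSource(systemId));
    }

    // DocumentHandler side: events are forwarded, element names rewritten.
    public void setDocumentLocator(Locator l) { client.setDocumentLocator(l); }
    public void startDocument() throws SAXException { client.startDocument(); }
    public void endDocument() throws SAXException { client.endDocument(); }
    public void startElement(String name, AttributeList atts) throws SAXException {
        client.startElement(name.toUpperCase(Locale.ROOT), atts);
    }
    public void endElement(String name) throws SAXException {
        client.endElement(name.toUpperCase(Locale.ROOT));
    }
    public void characters(char[] ch, int start, int length) throws SAXException {
        client.characters(ch, start, length);
    }
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        client.ignorableWhitespace(ch, start, length);
    }
    public void processingInstruction(String target, String data) throws SAXException {
        client.processingInstruction(target, data);
    }
}

// Stand-in parent that emits one fixed element, so the sketch runs
// without a real parser. It ignores its input source entirely.
class StubParser implements Parser {
    private DocumentHandler handler = new HandlerBase();
    public void setLocale(Locale locale) {}
    public void setEntityResolver(EntityResolver r) {}
    public void setDTDHandler(DTDHandler h) {}
    public void setErrorHandler(ErrorHandler h) {}
    public void setDocumentHandler(DocumentHandler h) { handler = h; }
    public void parse(InputSource source) throws SAXException {
        handler.startDocument();
        handler.startElement("doc", new AttributeListImpl());
        handler.endElement("doc");
        handler.endDocument();
    }
    public void parse(String systemId) throws SAXException {
        parse(new InputSource(systemId));
    }
}

class FilterDemo {
    // Chains stub -> filter -> client and returns the element names the client saw.
    static String run() {
        final StringBuilder seen = new StringBuilder();
        UpperCaseFilter filter = new UpperCaseFilter();
        filter.setParent(new StubParser());
        filter.setDocumentHandler(new HandlerBase() {
            @Override public void startElement(String name, AttributeList atts) {
                seen.append(name);
            }
        });
        try {
            filter.parse(new InputSource());
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return seen.toString();
    }
}
```

Because the filter is itself a Parser, the application configures it exactly as it would any other SAX driver, and filters can be stacked by passing one as the parent of the next.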
A third alternative (suggested by James Clark) is to have the filter
interface extend DocumentHandler rather than Parser -- i.e., to filter
on the client side:
public interface Filter extends DocumentHandler {
public void setDocumentHandler(DocumentHandler handler);
public void setParameter(String name, String value) throws SAXException;
}
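As a rough illustration of the client-side variant, the sketch below implements that interface with a filter that renames one element type and forwards every other event unchanged. RenameFilter, RenameDemo and the parameter names "from" and "to" are made up for this example; the interface is package-private only so that the sketch fits in one file.

```java
import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.HandlerBase;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.AttributeListImpl;

// Client-side filter interface as proposed in the post.
interface Filter extends DocumentHandler {
    void setDocumentHandler(DocumentHandler handler);
    void setParameter(String name, String value) throws SAXException;
}

// Hypothetical filter: renames one element type, passes everything else on.
class RenameFilter implements Filter {
    private DocumentHandler next = new HandlerBase();
    private String from = "";
    private String to = "";

    public void setDocumentHandler(DocumentHandler handler) { next = handler; }

    // "from" and "to" are invented parameter names; anything else is rejected.
    public void setParameter(String name, String value) throws SAXException {
        if (name.equals("from")) from = value;
        else if (name.equals("to")) to = value;
        else throw new SAXException("Unknown parameter: " + name);
    }

    private String map(String name) { return name.equals(from) ? to : name; }

    public void setDocumentLocator(Locator locator) { next.setDocumentLocator(locator); }
    public void startDocument() throws SAXException { next.startDocument(); }
    public void endDocument() throws SAXException { next.endDocument(); }
    public void startElement(String name, AttributeList atts) throws SAXException {
        next.startElement(map(name), atts);
    }
    public void endElement(String name) throws SAXException {
        next.endElement(map(name));
    }
    public void characters(char[] ch, int start, int length) throws SAXException {
        next.characters(ch, start, length);
    }
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        next.ignorableWhitespace(ch, start, length);
    }
    public void processingInstruction(String target, String data) throws SAXException {
        next.processingInstruction(target, data);
    }
}

class RenameDemo {
    // Drives the filter by hand, as a parser would, and records what the client sees.
    static String run() {
        final StringBuilder seen = new StringBuilder();
        RenameFilter filter = new RenameFilter();
        filter.setDocumentHandler(new HandlerBase() {
            @Override public void startElement(String name, AttributeList atts) {
                seen.append('<').append(name).append('>');
            }
            @Override public void endElement(String name) {
                seen.append("</").append(name).append('>');
            }
        });
        try {
            filter.setParameter("from", "para");
            filter.setParameter("to", "p");
            filter.startDocument();
            filter.startElement("para", new AttributeListImpl());
            filter.endElement("para");
            filter.endDocument();
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
        return seen.toString();
    }
}
```

With this design the parser never knows the filter exists: the application registers the filter as the parser's DocumentHandler and registers its own handler with the filter.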
2. Lexical Event Handler
------------------------
What do we really need here?
public interface LexicalHandler
{
public void startDTD (String name, String pubid, String sysid)
throws SAXException;
public void endDTD (String name) throws SAXException;
public void startExternalEntity (String name, String pubid, String sysid)
throws SAXException;
public void endExternalEntity (String name) throws SAXException;
public void startCDATA () throws SAXException;
public void endCDATA () throws SAXException;
public void comment (String data) throws SAXException;
}
I haven't checked, but I think that this gives us everything we need
for DOM Level 1.
Of course, we need a new parser interface that knows about lexical
handlers:
public interface LexicalParser extends Parser {
public void setLexicalHandler (LexicalHandler handler);
}
This could also be an entirely independent interface that doesn't
extend Parser:
public interface LexicalProcessor {
public void setLexicalHandler (LexicalHandler handler);
}
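Here is a small sketch of how an application might use these two interfaces together. The interfaces are restated so that the example compiles on its own, with `throws SAXException` on every callback (an assumption on my part); CommentCollector, StubDriver and LexicalDemo are hypothetical names, and the stub simply fires a fixed sequence of events in place of a real driver.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.SAXException;

// Restated from the post so the sketch is self-contained.
interface LexicalHandler {
    void startDTD(String name, String pubid, String sysid) throws SAXException;
    void endDTD(String name) throws SAXException;
    void startExternalEntity(String name, String pubid, String sysid) throws SAXException;
    void endExternalEntity(String name) throws SAXException;
    void startCDATA() throws SAXException;
    void endCDATA() throws SAXException;
    void comment(String data) throws SAXException;
}

interface LexicalProcessor {
    void setLexicalHandler(LexicalHandler handler);
}

// Hypothetical client: keeps comments from the document body, skips
// comments inside the DTD, and counts CDATA sections.
class CommentCollector implements LexicalHandler {
    final List<String> comments = new ArrayList<String>();
    int cdataSections = 0;
    private boolean inDTD = false;

    public void startDTD(String name, String pubid, String sysid) { inDTD = true; }
    public void endDTD(String name) { inDTD = false; }
    public void startExternalEntity(String name, String pubid, String sysid) {}
    public void endExternalEntity(String name) {}
    public void startCDATA() { cdataSections++; }
    public void endCDATA() {}
    public void comment(String data) {
        if (!inDTD) comments.add(data);
    }
}

// Stand-in driver: fires a fixed event sequence instead of parsing anything.
class StubDriver implements LexicalProcessor {
    private LexicalHandler handler;

    public void setLexicalHandler(LexicalHandler handler) { this.handler = handler; }

    void run() throws SAXException {
        handler.startDTD("doc", null, "doc.dtd");
        handler.comment("declared in the DTD");
        handler.endDTD("doc");
        handler.comment("in the body");
        handler.startCDATA();
        handler.endCDATA();
    }
}

class LexicalDemo {
    static CommentCollector run() {
        CommentCollector collector = new CommentCollector();
        StubDriver driver = new StubDriver();
        driver.setLexicalHandler(collector);
        try {
            driver.run();
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
        return collector;
    }
}
```

The startDTD/endDTD pair is what lets the client tell DTD comments apart from body comments; without it, both would arrive through the same comment callback with no context.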
3. Namespace Support
--------------------
Where should we go with this? At a minimum, we need to be able to
provide fully-expanded element and attribute names, but it would also
be nice to be able to know about namespace prefixes so that an
application can expand them in attribute values and character data
content. Here's a minimal approach:
public interface NamespaceParser extends Parser
{
}
Doesn't look like much, does it? We could provide an on-off switch:
public interface NamespaceParser extends Parser
{
public void enableNamespaceProcessing (boolean flag);
}
Another nice thing to do would be to allow the application to choose a
separator between the URI part and the local name:
public interface NamespaceParser extends Parser
{
public void enableNamespaceProcessing (boolean flag);
public void setSeparator (String sep);
}
To keep things pure, it might be a good idea not to have this extend
the Parser interface:
public interface NamespaceParser
{
public void enableNamespaceProcessing (boolean flag);
public void setSeparator (String sep);
}
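To show what the switch and the separator would mean for the names an application sees, here is a minimal sketch. NameExpander and its reportedName method are hypothetical, the default separator is an arbitrary choice, and the URI is a placeholder; a real driver would resolve the prefix itself rather than being handed the URI.

```java
// Restated from the post so the sketch is self-contained.
interface NamespaceParser {
    void enableNamespaceProcessing(boolean flag);
    void setSeparator(String sep);
}

// Hypothetical driver fragment: only the name-reporting logic is shown.
class NameExpander implements NamespaceParser {
    private boolean enabled = false;
    private String separator = "^";   // arbitrary default, an assumption

    public void enableNamespaceProcessing(boolean flag) { enabled = flag; }
    public void setSeparator(String sep) { separator = sep; }

    // Returns the name that would be reported for an element written as
    // rawName whose prefix resolves to uri (null when there is no namespace).
    String reportedName(String rawName, String uri) {
        if (!enabled || uri == null) {
            return rawName;                       // processing off: name as written
        }
        // indexOf returns -1 for an unprefixed name, so substring(0) keeps it whole.
        String local = rawName.substring(rawName.indexOf(':') + 1);
        return uri + separator + local;
    }
}
```

For example, with processing enabled and a space as separator, an element written as html:p whose prefix maps to http://example.com/ns would be reported as "http://example.com/ns p"; with processing disabled it stays "html:p".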
Now, the best way to find out about namespace prefixes is to introduce
a new handler interface:
public interface NamespaceHandler
{
public void startNamespacePrefix (String prefix, String uri)
throws SAXException;
public void endNamespacePrefix (String prefix)
throws SAXException;
}
Of course, NamespaceParser has to know about this:
public interface NamespaceParser
{
public void enableNamespaceProcessing (boolean flag);
public void setSeparator (String sep);
public void setNamespaceHandler (NamespaceHandler handler);
}
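One way an application could use these callbacks is to track which prefixes are in scope, so that it can expand prefixed names found in attribute values or character data. The sketch below is only an illustration: PrefixTracker, its expand method and PrefixDemo are invented names, and the handler interface is restated so that the example compiles on its own.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.SAXException;

// Restated from the post so the sketch is self-contained.
interface NamespaceHandler {
    void startNamespacePrefix(String prefix, String uri) throws SAXException;
    void endNamespacePrefix(String prefix) throws SAXException;
}

// Hypothetical client: remembers the prefix bindings currently in scope.
class PrefixTracker implements NamespaceHandler {
    // One stack of URIs per prefix; the top entry is the binding in scope.
    private final Map<String, Deque<String>> bindings =
        new HashMap<String, Deque<String>>();

    public void startNamespacePrefix(String prefix, String uri) {
        Deque<String> stack = bindings.get(prefix);
        if (stack == null) {
            stack = new ArrayDeque<String>();
            bindings.put(prefix, stack);
        }
        stack.push(uri);
    }

    public void endNamespacePrefix(String prefix) throws SAXException {
        Deque<String> stack = bindings.get(prefix);
        if (stack == null || stack.isEmpty()) {
            throw new SAXException("Prefix not in scope: " + prefix);
        }
        stack.pop();   // restores the outer binding, if there was one
    }

    // Expands "prefix:local" to uri + sep + local; names with no prefix or
    // an unbound prefix are returned unchanged.
    String expand(String qname, String sep) {
        int colon = qname.indexOf(':');
        if (colon < 0) return qname;
        Deque<String> stack = bindings.get(qname.substring(0, colon));
        if (stack == null || stack.isEmpty()) return qname;
        return stack.peek() + sep + qname.substring(colon + 1);
    }
}

class PrefixDemo {
    // Rebinds "x" in a nested scope, then unwinds both scopes.
    static String run() {
        PrefixTracker tracker = new PrefixTracker();
        StringBuilder out = new StringBuilder();
        try {
            tracker.startNamespacePrefix("x", "urn:outer");
            tracker.startNamespacePrefix("x", "urn:inner");
            out.append(tracker.expand("x:foo", "^")).append(' ');
            tracker.endNamespacePrefix("x");
            out.append(tracker.expand("x:foo", "^")).append(' ');
            tracker.endNamespacePrefix("x");
            out.append(tracker.expand("x:foo", "^"));
        } catch (SAXException e) {
            throw new RuntimeException(e);
        }
        return out.toString();
    }
}
```

The per-prefix stack matters because an inner element may rebind a prefix that an outer element already declared; the end callback then has to restore the outer binding rather than drop the prefix altogether.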
Finally, we might want a helper class that can split up names, etc.
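As a rough idea of what such a helper might look like, the sketch below splits an expanded name back into its URI and local parts. ExpandedName is a hypothetical class, not a proposed API, and representing "no namespace" as an empty URI is my assumption.

```java
// Hypothetical helper: splits "uri<sep>local" into its two parts.
final class ExpandedName {
    final String uri;         // "" when the name has no namespace part
    final String localName;

    private ExpandedName(String uri, String localName) {
        this.uri = uri;
        this.localName = localName;
    }

    static ExpandedName split(String name, String sep) {
        if (sep.isEmpty()) {
            throw new IllegalArgumentException("separator must not be empty");
        }
        // Split at the LAST separator: the URI part might contain the
        // separator string, while the local name is assumed not to.
        int at = name.lastIndexOf(sep);
        if (at < 0) {
            return new ExpandedName("", name);
        }
        return new ExpandedName(name.substring(0, at),
                                name.substring(at + sep.length()));
    }
}
```

Splitting at the last occurrence is a design choice that only works if the separator can never appear in a local name, which is one more reason to pick a character that is not legal in XML names.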
Do we munge all of this with inheritance, or keep a series of separate
mix-and-match interfaces?
Comments?
All the best,
David
--
David Megginson david at megginson.com
http://www.megginson.com/