dgd at cs.bu.edu
Thu Mar 6 16:57:29 GMT 1997
At 2:14 PM -0800 3/5/97, Bill Smith wrote:
>> Well, the Lark event-stream API sure looks & feels like a bunch of
>> callbacks. You make a Lark object, call its readXML() method with one
>> argument being a Handler object; Handler being a data-less class that
>> just has a bunch of methods called things like doPI() and doStartTag() and
>> doEntityReference() and doText() and so on; you'd normally subclass Handler
>> replacing the methods for the events you wanted to see, and pass in
>> that kind of object. Lark calls these upon recognizing
>> the constructs in the input stream, passing the byte offset info, the
>> element & entity stack (*if* you're treebuilding), and other currently
>> relevant info. These methods are all booleans; if any returns true,
>> Lark stops and returns control to whoever called readXML().
I like this boolean approach -- it lets the Handler object take back the
flow of control quite easily. If Lark could also break PCDATA up when the
buffer stalls, you could easily implement a browser-style application.
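
To make the control-flow point concrete, here is a minimal sketch of a handler that stops the event source and later lets it resume. Everything in it is an assumption for illustration: the method signatures are guesses rather than Lark's actual API, and ToyEventSource is a hypothetical stand-in that replays pre-tokenized events instead of parsing XML.

```java
import java.util.List;

class EarlyStopSketch {
    // Hypothetical handler in the style described above; these signatures
    // are assumptions, not Lark's real ones.
    interface Handler {
        boolean doStartTag(String name); // return true to stop the source
        boolean doText(String text);
    }

    // Stand-in for a parser: replays ready-made events so the control flow
    // can be shown without any real XML parsing.
    static class ToyEventSource {
        private final List<String> events; // "<name" is a start tag, anything else is text
        private int next = 0;

        ToyEventSource(List<String> events) { this.events = events; }

        // Delivers events until a handler method returns true or input runs out.
        // Returns true if the handler asked to stop; calling read() again
        // resumes with the next undelivered event.
        boolean read(Handler h) {
            while (next < events.size()) {
                String e = events.get(next++);
                boolean stop = e.startsWith("<")
                        ? h.doStartTag(e.substring(1))
                        : h.doText(e);
                if (stop) return true;
            }
            return false;
        }
    }

    // Takes control back as soon as it has seen one chunk of text.
    static class StopAtFirstText implements Handler {
        String seen;
        public boolean doStartTag(String name) { return false; }
        public boolean doText(String text) { seen = text; return true; }
    }
}
```

A caller could invoke read() in a loop and do its own work (redrawing a window, say) each time it returns true, which is the shape a browser-style application would need.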
>Another way to do this is to have the Lark object (or interface) define the
>event methods rather than have a separate Handler object. When it's time to
>parse something, create a subclass that overrides the (standard) event methods
>for the Lark object.
I don't like this quite as well for a generic API as I can see the use of
Handler objects that don't know how to parse -- they can be glued to other
event sources to run off of DB engines -- or even broken across a network
to provide an XML event-stream mechanism...
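
As a rough illustration of a Handler driven by something other than a parser, the sketch below feeds it from in-memory rows of the kind a DB query might return. The Handler signatures, the emitRows helper, and the Recorder class are all hypothetical names introduced for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class RowSourceSketch {
    // Hypothetical handler interface; signatures are assumptions, not Lark's API.
    interface Handler {
        boolean doStartTag(String name);
        boolean doText(String text);
    }

    // An event source with no parser in it: each column of each row becomes
    // a start-tag event followed by a text event. Returns true if the
    // handler asked to stop early.
    static boolean emitRows(List<Map<String, String>> rows, Handler h) {
        for (Map<String, String> row : rows) {
            for (Map.Entry<String, String> col : row.entrySet()) {
                if (h.doStartTag(col.getKey())) return true;
                if (h.doText(col.getValue())) return true;
            }
        }
        return false;
    }

    // A handler that only records what it is told; it neither knows nor
    // cares whether a parser or a database is on the other end.
    static class Recorder implements Handler {
        final List<String> log = new ArrayList<>();
        public boolean doStartTag(String name) { log.add("start:" + name); return false; }
        public boolean doText(String text) { log.add("text:" + text); return false; }
    }
}
```

The same Recorder could be handed to a real parser unchanged, which is the decoupling a separate Handler object buys.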
>A possible advantage to this method is that it makes clear the inheritance
>relationship between the "standard" parser and something more specific. It
>is also "easier" to create a more specific parser from an existing parser
>object - simply subclass the existing parser and override the methods
>required to provide the desired new functionality.
If the methods in the standard parser don't do something you are interested
in, you still have to override them all -- and I don't see what default
behavior would make sense other than "do nothing". It seems that you could
get the benefits of having that simply by supplying a predefined Handler
object that has null implementations for its methods.
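
A predefined do-nothing Handler of that kind might look roughly like this. The four method names come from the description of Lark quoted earlier, but their parameter lists, and the names NullHandler and TagCounter, are assumptions made for the sketch.

```java
class NullHandlerSketch {
    // Hypothetical interface; the parameter lists are guesses, not Lark's.
    interface Handler {
        boolean doPI(String target, String data);
        boolean doStartTag(String name);
        boolean doEntityReference(String name);
        boolean doText(String text);
    }

    // Predefined handler with null implementations: every event is ignored
    // and false is returned so the event source keeps going.
    static class NullHandler implements Handler {
        public boolean doPI(String target, String data) { return false; }
        public boolean doStartTag(String name) { return false; }
        public boolean doEntityReference(String name) { return false; }
        public boolean doText(String text) { return false; }
    }

    // A user overrides only the event of interest and inherits the rest.
    static class TagCounter extends NullHandler {
        int count = 0;
        @Override
        public boolean doStartTag(String name) { count++; return false; }
    }
}
```

This gives the convenience of overriding one method without tying the handler to a parser class.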
>The Lark model "hides" the inheritance relationship in the Handler object
>making it necessary to look inside a Lark object to determine the type of
>a given parser (something you might need to do when debugging). An
>alternative is to create a new parser object that contains a subclassed event
>handler. This makes it possible to distinguish the type of parser at the
>"outer" level but requires two new objects instead of one to perform the
The debugging issue is certainly a bit inconvenient. If we use interfaces
rather than classes for the API (almost certainly a good idea), then we can
create a Parser that also implements Handler.
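
One way this could look with interfaces is sketched below. The Parser interface, its single-argument-plus-input readXML signature, and the ToyParser tokenizer (which splits on whitespace and is in no sense a real XML parser) are all hypothetical and only serve to show the shape.

```java
class ParserIsHandlerSketch {
    // Both halves of the API as interfaces; signatures are assumptions.
    interface Handler {
        boolean doStartTag(String name);
        boolean doText(String text);
    }

    interface Parser {
        boolean readXML(String input, Handler h);
    }

    // Deliberately tiny stand-in for a parser: whitespace-separated tokens,
    // where "<name" counts as a start tag and anything else as text.
    static class ToyParser implements Parser {
        public boolean readXML(String input, Handler h) {
            for (String token : input.trim().split("\\s+")) {
                boolean stop = token.startsWith("<")
                        ? h.doStartTag(token.substring(1))
                        : h.doText(token);
                if (stop) return true;
            }
            return false;
        }
    }

    // A single object that is both the parser and its own handler, so its
    // concrete type is visible at the outer level when debugging.
    static class TextCollectingParser extends ToyParser implements Handler {
        final StringBuilder text = new StringBuilder();
        public boolean doStartTag(String name) { return false; }
        public boolean doText(String t) { text.append(t); return false; }

        boolean parse(String input) { return readXML(input, this); }
    }
}
```

Code that only wants events can still implement Handler alone, so nothing forces it to inherit from a parser.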
>I'm not a parser expert so the subclass model may not make any sense but this
>is a mechanism I have successfully used building other object-oriented
>(including GUI-based) systems. I have also used callbacks but find them most
>useful when forced to use C or other non-object-based languages.
I think that subclassing here just means that I might be forced to pull in
parser baggage (or null methods) when I want to implement a parser-free
event handler or event generator.
David Durand dgd at cs.bu.edu \ david at dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
MAPA: mapping for the WWW \__________________________
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)