lex, yacc, and xml
David Megginson
ak117 at freenet.carleton.ca
Mon Dec 22 22:10:15 GMT 1997
Ward Harold writes:
> <question name="why hand code parsers" class="potentially stupid">
> Why is it that all of the XML parsers/processors I've seen appear to be
> hand coded rather than generated via lex/yacc or flex/bison? I seem to
> recall seeing something to the effect that yacc/bison can't handle the
> class of grammar that XML falls into. Then again I'm not a compiler
> constructor, opted for the AI sequence in graduate school, so I may be
> imagining things. Even if there is a technical reason for eschewing
> parser generation surely the basic lexing and scanning could be done
> with lex/flex, no?
> </question>
This is actually a very good question, but I will second most of Tim's
comments. With Ælfred, I set out to produce a Java-based XML parser
under 20K (I missed by about 6K, but I'm still working on it). A
hand-crafted recursive-descent parser seemed like the only reasonable
choice, and it turned out to be very fast as well.
In fact, it is not much harder to write a recursive-descent parser
than it is to write out EBNF productions, at least not once you get
into a rhythm and write a few helper methods for lexical scanning
(like "readName()").
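
To illustrate the point, here is a minimal sketch of how EBNF productions
can map onto recursive-descent methods. It is not Ælfred's actual code: the
class name TinyXmlSketch, the two simplified productions, and every helper
except the readName() name mentioned above are assumptions made for this
example. It handles only nested elements and text, with no attributes,
entities, comments, or DTD, and its name rule is looser than the real XML one.

```java
// Hypothetical sketch: one method per (simplified) EBNF production.
class TinyXmlSketch {
    private final String in;   // the whole document, held in memory
    private int pos = 0;       // current read offset
    private final StringBuilder out = new StringBuilder();

    private TinyXmlSketch(String in) { this.in = in; }

    // Parses one document and returns a space-separated trace of events.
    static String parse(String text) {
        TinyXmlSketch p = new TinyXmlSketch(text);
        p.parseElement();
        if (p.pos != p.in.length()) throw p.error("trailing content");
        return p.out.toString().trim();
    }

    // element ::= '<' Name '>' content '</' Name '>'
    private void parseElement() {
        require('<');
        String name = readName();
        require('>');
        out.append("start:").append(name).append(' ');
        parseContent();
        require('<');
        require('/');
        String end = readName();
        if (!end.equals(name)) throw error("mismatched end tag " + end);
        require('>');
        out.append("end:").append(name).append(' ');
    }

    // content ::= (element | CharData)*
    private void parseContent() {
        while (pos < in.length()) {
            if (in.charAt(pos) == '<') {
                // An end tag belongs to the enclosing element; let the caller read it.
                if (pos + 1 < in.length() && in.charAt(pos + 1) == '/') return;
                parseElement();
            } else {
                int start = pos;
                while (pos < in.length() && in.charAt(pos) != '<') pos++;
                out.append("text:").append(in, start, pos).append(' ');
            }
        }
    }

    // Lexical helper. Simplified rule (an assumption, not the XML spec's):
    // Name ::= (Letter | '_') (Letter | Digit | '_' | '-' | '.')*
    private String readName() {
        int start = pos;
        if (pos < in.length() && (Character.isLetter(in.charAt(pos)) || in.charAt(pos) == '_')) {
            pos++;
            while (pos < in.length() && isNameChar(in.charAt(pos))) pos++;
        }
        if (start == pos) throw error("name expected");
        return in.substring(start, pos);
    }

    private static boolean isNameChar(char c) {
        return Character.isLetterOrDigit(c) || c == '_' || c == '-' || c == '.';
    }

    // Consumes exactly the character c or fails.
    private void require(char c) {
        if (pos >= in.length() || in.charAt(pos) != c) throw error("expected '" + c + "'");
        pos++;
    }

    private IllegalArgumentException error(String msg) {
        return new IllegalArgumentException(msg + " at offset " + pos);
    }

    public static void main(String[] args) {
        System.out.println(parse("<doc><p>hello</p></doc>"));
    }
}
```

Each production becomes one method, and the lexical work lives in small
helpers such as readName() and require(), so no separate lex-style scanner
is needed in this sketch.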
All the best,
David
--
David Megginson ak117 at freenet.carleton.ca
Microstar Software Ltd. dmeggins at microstar.com
http://home.sprynet.com/sprynet/dmeggins/
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)