lex, yacc, and xml
richard at cogsci.ed.ac.uk
Sat Dec 27 18:11:20 GMT 1997
> The currently available LT XML release (http://www.ltg.ed.ac.uk/software/xml/)
> uses a lex/yacc parser.
The ugliest part of this code is the DTD parsing, because you want
(say) SYSTEM returned in some places as a keyword, and in others as a
name. To achieve this, the yacc layer has to be constantly setting
the lexer mode ("lexical tie-ins"). Contrast this with C (surprise!)
where you can't have a variable with the same name as a keyword.
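
To make the tie-in concrete, here is a rough sketch in Python rather than lex/yacc. It is not the LT XML code: the `Lexer` class, the `expect_keyword` flag and `parse_entity_decl` are names invented for this illustration. The lexer returns SYSTEM as a keyword only when the parser has told it that a keyword is allowed at that point.

```python
import re

# Hypothetical keyword set and token pattern, for illustration only.
KEYWORDS = {"SYSTEM", "PUBLIC"}
TOKEN_RE = re.compile(
    r"\s*(?:(?P<word>[A-Za-z_][\w.-]*)"
    r"|(?P<lit>\"[^\"]*\")"
    r"|(?P<punct>[<>!]))"
)

class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0
        # The "mode": set and cleared by the parser, never by the lexer.
        self.expect_keyword = False

    def next_token(self):
        m = TOKEN_RE.match(self.text, self.pos)
        if m is None:
            return None  # end of input or an unrecognised character
        self.pos = m.end()
        if m.group("word") is not None:
            word = m.group("word")
            # Same spelling, different token, depending on the mode.
            if self.expect_keyword and word in KEYWORDS:
                return ("KEYWORD", word)
            return ("NAME", word)
        if m.group("lit") is not None:
            return ("LITERAL", m.group("lit")[1:-1])
        return ("PUNCT", m.group("punct"))

def parse_entity_decl(text):
    """Parse '<!ENTITY name SYSTEM "literal">' and return three tokens."""
    lex = Lexer(text)
    for expected in (("PUNCT", "<"), ("PUNCT", "!"), ("NAME", "ENTITY")):
        if lex.next_token() != expected:
            raise ValueError("not an entity declaration")
    name = lex.next_token()        # name position: keywords switched off
    lex.expect_keyword = True      # the tie-in: parser sets the lexer mode
    keyword = lex.next_token()
    lex.expect_keyword = False     # and has to remember to reset it
    literal = lex.next_token()
    return name, keyword, literal
```

With this sketch, `parse_entity_decl('<!ENTITY SYSTEM SYSTEM "foo.xml">')` gives a NAME token for the first SYSTEM and a KEYWORD token for the second. In a real yacc grammar the mode has to be set before the parser fetches its lookahead token, which is part of what makes this style awkward to maintain.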
As Henry said, performance is one reason we switched to a plain
C parser. Another is the question of 16-bit characters, though this
could probably have been kludged, since all the syntactically important
characters are < 128.
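
One way such a kludge might look, again as a Python sketch under assumptions of my own (the `STAND_IN` value and both function names are hypothetical, and this was not what LT XML did): fold every 16-bit code unit of 128 or above to a single ASCII stand-in before handing the text to an 8-bit lexer, keeping a one-to-one correspondence of positions so that tokens can be mapped back to the original text.

```python
# Hypothetical stand-in: any code unit >= 128 is shown to the 8-bit lexer
# as this one ASCII letter, on the assumption that such a character can
# only occur inside a name, a literal or character data.
STAND_IN = ord("x")

def code_units(text):
    """Split a string into 16-bit code units (UTF-16, little-endian)."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little")
            for i in range(0, len(raw), 2)]

def fold_to_8bit(units):
    """Map 16-bit code units to bytes that an 8-bit lexer can scan.

    Units below 128 pass through unchanged, so all markup delimiters
    survive; everything else becomes STAND_IN. Offsets are preserved:
    byte i of the result corresponds to units[i].
    """
    return bytes(u if u < 128 else STAND_IN for u in units)

units = code_units('<名前 id="1">')
folded = fold_to_8bit(units)
```

The folded stream loses the distinction between the non-ASCII characters, so a check that each one is a legal name character would have to be done separately against the original code units.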