XML processing experiments

Jarle Stabell jarle.stabell at dokpro.uio.no
Fri Nov 7 17:39:16 GMT 1997


David McKelvie wrote:
>>> "<!ENTITY name 'richard'> ... <p>my name is &name;</p>"
>
>It's worth pointing out that Richard wants ALL of the PCDATA of the
><p> element to be returned as one string of characters "my name is
>Richard", rather than as two strings "my name is " and "Richard".

Yes, but this requires copying (at least the first string) and a
concatenation.

Some applications may be more interested in the speedup which may result
from not doing this copying/concatenation, and will happily accept the
small increase in complexity of handling the separate pieces.
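
To make the trade-off concrete, here is a minimal sketch in Python. It is
only an illustration: the function name pcdata_pieces and the
(buffer, start, end) piece layout are assumptions made for this example,
not part of any existing parser. Each piece of character data is reported
as offsets into a buffer, and the single string is built only if the
application asks for it.

```python
# Hypothetical illustration; names and data layout are assumptions.
source = "<p>my name is &name;</p>"
entities = {"name": "richard"}  # as declared by <!ENTITY name 'richard'>


def pcdata_pieces(src, start, end, entities):
    """Yield (buffer, start, end) pieces for the character data in src[start:end].

    Literal text is reported as offsets into src. An entity reference is
    reported as offsets into its replacement text. The character data
    itself is never copied here; only the short entity name is sliced out
    for the dictionary lookup.
    """
    pos = start
    while pos < end:
        amp = src.find("&", pos, end)
        if amp == -1:
            yield (src, pos, end)
            return
        if amp > pos:
            yield (src, pos, amp)
        semi = src.index(";", amp, end)
        value = entities[src[amp + 1:semi]]
        yield (value, 0, len(value))
        pos = semi + 1


content_start = source.index(">") + 1
content_end = source.rindex("<")
pieces = list(pcdata_pieces(source, content_start, content_end, entities))

# Only an application that wants one string pays for the copy and join.
joined = "".join(buf[s:e] for buf, s, e in pieces)
```

An application that can work with the two pieces directly never needs to
execute the last line.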

I'm playing with a design involving two pluggable "ESIS-handlers". One is
"low-level": GIs, attribute names, attribute values, comments etc.
point directly into the source (typically via a file mapping or an
in-memory buffer).
The "low-level" ESIS-handler may copy the data into "real" strings,
concatenate consecutive PCDATA sections, build the tree, do validation
etc., and pass the events to an optional "higher-level" ESIS-handler.
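
A rough sketch of how the two layers could fit together, in Python. The
class and method names (LowLevelHandler, Recorder, start_element, pcdata,
characters, end_element) are invented for this example, and it assumes
every piece of data can be given as (start, end) offsets into one source
buffer. Entity replacement text would need a buffer reference as well,
which is left out here.

```python
class LowLevelHandler:
    """Receives events whose data are (start, end) offsets into source."""

    def __init__(self, source, higher=None):
        self.source = source
        self.higher = higher   # optional higher-level handler
        self._pending = []     # PCDATA spans not yet passed on

    def _flush(self):
        # Copy and concatenate consecutive PCDATA spans only when a
        # higher-level handler actually wants a "real" string.
        if self._pending and self.higher is not None:
            text = "".join(self.source[s:e] for s, e in self._pending)
            self.higher.characters(text)
        self._pending = []

    def start_element(self, start, end):
        self._flush()
        if self.higher is not None:
            self.higher.start_element(self.source[start:end])

    def pcdata(self, start, end):
        self._pending.append((start, end))

    def end_element(self, start, end):
        self._flush()
        if self.higher is not None:
            self.higher.end_element(self.source[start:end])


class Recorder:
    """A trivial higher-level handler that records the events it gets."""

    def __init__(self):
        self.events = []

    def start_element(self, gi):
        self.events.append(("start", gi))

    def characters(self, text):
        self.events.append(("chars", text))

    def end_element(self, gi):
        self.events.append(("end", gi))


source = "<p>one <![CDATA[two]]></p>"
recorder = Recorder()
low = LowLevelHandler(source, recorder)

# The offsets below are what a tokenizer would report for this source.
low.start_element(1, 2)   # GI "p"
low.pcdata(3, 7)          # "one "
low.pcdata(16, 19)        # "two", inside the CDATA section
low.end_element(24, 25)   # GI "p"
```

The higher-level handler sees a single "one two" string, while a
low-level handler used on its own (higher=None) never copies anything.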

I think/hope the layer which triggers the low-level events won't be very
different from Mr Clark's "quick and dirty" parser.

(Not sure yet whether the low-level handler should just receive events, or
whether it should query for the next event/token.)
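
The two options need not exclude each other. As a sketch only, in Python:
if the tokenizer is written in the "query" style, an event-driven
interface can be layered on top of it with a small loop. The names tokens
and push are made up for this example, and the regular expression handles
just bare start-tags, end-tags and text, nothing like full XML.

```python
import re

# Toy grammar: <name>, </name>, or a run of text without "<".
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)\s*>|([^<]+)")


def tokens(source):
    """Query style: lazily yield (kind, start, end) offsets into source."""
    pos = 0
    while pos < len(source):
        m = TOKEN.match(source, pos)
        if m is None:
            raise ValueError(f"unsupported markup at offset {pos}")
        if m.group(3) is not None:
            yield ("pcdata", m.start(3), m.end(3))
        else:
            kind = "end" if m.group(1) else "start"
            yield (kind, m.start(2), m.end(2))
        pos = m.end()


def push(source, handler):
    """Event style, built on the query style: call handler once per token."""
    for kind, start, end in tokens(source):
        handler(kind, start, end)


source = "<p>hello</p>"

# Query style: the consumer asks for tokens when it wants them.
pulled = list(tokens(source))

# Event style: the producer drives the consumer.
pushed = []
push(source, lambda kind, s, e: pushed.append((kind, source[s:e])))
```

Going the other way, from pushed events to a queryable stream, generally
needs buffering or a separate thread of control, which may be an argument
for making the lowest layer queryable.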


Cheers,
Jarle


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)



