SAX-J and the DPH (DJH?)

Chris Maden crism at
Wed Dec 31 15:39:10 GMT 1997

[Sean McGrath]
> >>The fun starts for the DPH when including the markup in the
> >>SED command is not an option due to the hierarchical sensitivity
> >>of the task. e.g. just telephone numbers occurring within the
> >>appendix elements and skipping those where the client attribute
> >>has the value = "Jones". That sort of thing.
> >>
> >>Maybe nothing short of a fully blown XML parser will do for these
> >>situations?
> Care to suggest a Perl implementation? I for one would find it very
> instructive to see what problems/issues/patterns emerge from
> studying a live example.

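$inappendix = 0;    # Perl has no TRUE/FALSE constants; a bareword FALSE is a true string

while (<>) {
    # Track, line by line, whether we are inside an <appendix> element.
    if (/<appendix/) {
        $inappendix = 1;
    }
    if (/<\/appendix/) {
        $inappendix = 0;
    }
    if ((/^(.*<telephone[^>]*>)555-1234(.*)$/) && $inappendix) {
        # Save the captures before running the next match.
        $pre = $1;
        $post = $2;
        # Leave the number alone when the line carries client="Jones".
        if (!(/client\s*=\s*["']Jones["']/)) {
            print $pre . "555-4321" . $post . "\n";
        }
        else {
            print $_;
        }
    }
    else {
        print $_;
    }
}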

This is a somewhat simplistic implementation; for instance, it assumes
that there won't be more than one <telephone> on a single input line,
and it doesn't scale well for multiple clients that you're filtering
out.  It also assumes no CDATA marked sections, and demonstrates why
</> makes the DPH's life impossible.  I haven't actually run this, but
it's similar enough to other things I've done that I think it should
work.
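
The first two limitations (one <telephone> per line, one hard-coded
client) could be loosened along the following lines. This is only a
sketch under assumptions, not a tested replacement: the names
%skip_client, fix_phone and rewrite are hypothetical, "Smith" is an
invented second client, the client attribute is assumed to sit on the
<telephone> start tag itself, and <appendix> start and end tags are
assumed to be on lines of their own. It still ignores CDATA marked
sections and </>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my $OLD = '555-1234';
my $NEW = '555-4321';

# Hypothetical set of clients whose numbers must not be changed.
my %skip_client = map { $_ => 1 } qw(Jones Smith);

# Decide the replacement for one <telephone ...>555-1234 occurrence.
# $tag is the whole start tag, $attrs is the text between the element
# name and the closing ">".
sub fix_phone {
    my ($tag, $attrs) = @_;
    my ($client) = $attrs =~ /\bclient\s*=\s*["']([^"']*)["']/;
    my $skip = defined $client && $skip_client{$client};
    return $tag . ($skip ? $OLD : $NEW);
}

# Rewrite a list of lines, tracking appendix state line by line.
sub rewrite {
    my (@lines) = @_;
    my $in_appendix = 0;
    for my $line (@lines) {
        $in_appendix = 1 if $line =~ /<appendix\b/;
        $in_appendix = 0 if $line =~ m{</appendix\b};
        next unless $in_appendix;
        # /g handles every <telephone> on the line; /e calls fix_phone.
        $line =~ s{(<telephone\b([^>]*)>)\Q$OLD\E}{fix_phone($1, $2)}ge;
    }
    return @lines;
}

# Small in-memory sample instead of reading <>.
my @sample = (
    qq{<telephone>555-1234</telephone>\n},
    qq{<appendix>\n},
    qq{<telephone client="Jones">555-1234</telephone> }
      . qq{<telephone client="Brown">555-1234</telephone>\n},
    qq{</appendix>\n},
);
print rewrite(@sample);
```

In the sample, only the Brown number inside the appendix is changed;
the Jones number and the number outside the appendix are left as they
were. Adding another client to filter out means adding a key to
%skip_client rather than another regexp.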

<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL> <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
