SAX-J and the DPH (DJH?)

Chris Maden crism at ora.com
Wed Dec 31 15:39:10 GMT 1997


[Sean McGrath]
> >>The fun starts for the DPH when including the markup in the
> >>SED command is not an option due to the hierarchical sensitivity
> >>of the task. e.g. just telephone numbers occurring within the
> >>appendix elements and skipping those where the client attribute
> >>has the value = "Jones". That sort of thing.
> >>
> >>Maybe nothing short of a fully blown XML parser will do for these
> >>situations?
> 
> Care to suggest a Perl implementation? I for one would find it very
> instructive to see what problems/issues/patterns emerge from
> studying a live example.

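# True (1) while the current line is inside an <appendix> element.
$inappendix = 0;

while (<>) {
    # Track entry to and exit from <appendix>.
    if (/<appendix/) {
        $inappendix = 1;
    }
    if (/<\/appendix/) {
        $inappendix = 0;
    }
    # Look for the old number directly after a <telephone> start-tag.
    if ((/^(.*<telephone[^>]*>)555-1234(.*)$/) && $inappendix) {
        $pre = $1;
        $post = $2;
        # Rewrite the number unless the line carries client="Jones".
        if (!(/client\s*=\s*["']Jones["']/)) {
            print $pre . "555-4321" . $post . "\n";
        }
        else {
            print $_;
        }
    }
    else {
        # Everything else passes through unchanged.
        print $_;
    }
}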

This is a somewhat simplistic implementation; for instance, it assumes
that there won't be more than one <telephone> on a single input line,
and it doesn't scale well for multiple clients that you're filtering
out.  It also assumes no CDATA marked sections, and demonstrates why
</> makes the DPH's life impossible.  I haven't actually run this, but
it's similar enough to other things I've done that I think it should
work.
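
A minimal sketch of how the first two limitations might be lifted, still without a real parser. It splits each line into tags and text so that several <telephone> elements on one line are handled in order, and it looks up the client in a hash rather than a hard-coded regex. The names rewrite_line, %skip_client, $old and $new, the second client "Smith", and the sample input are all made up for illustration. It still assumes that no tag spans a line break, that attribute values contain no ">", and that there are no comments or CDATA marked sections.

```perl
use strict;
use warnings;

# Assumed for illustration: clients to leave alone, and the two numbers.
my %skip_client = map { $_ => 1 } qw(Jones Smith);
my ($old, $new) = ('555-1234', '555-4321');

# Rewrite one line. $state is a hash ref carrying {in_appendix} and
# {armed} from one line to the next.
sub rewrite_line {
    my ($line, $state) = @_;
    my $out = '';
    # The capturing group makes split return the tags as well as the text.
    for my $piece (split /(<[^>]*>)/, $line) {
        next if $piece eq '';
        # "armed" means the previous piece was a <telephone> start-tag
        # whose content should be rewritten.
        my $armed = $state->{armed};
        $state->{armed} = 0;
        if ($piece =~ m{^<appendix[\s/>]}) {
            $state->{in_appendix} = 1;
        }
        elsif ($piece =~ m{^</appendix[\s>]}) {
            $state->{in_appendix} = 0;
        }
        elsif ($piece =~ m{^<telephone[\s/>]}) {
            # Check the client attribute on this tag only.
            my ($client) = $piece =~ /\bclient\s*=\s*["']([^"']*)["']/;
            my $skipped = defined $client && $skip_client{$client};
            $state->{armed} = 1 if $state->{in_appendix} && !$skipped;
        }
        elsif ($armed && $piece !~ /^</) {
            # Text directly after an eligible <telephone> start-tag.
            $piece =~ s/^\Q$old\E/$new/;
        }
        $out .= $piece;
    }
    return $out;
}

# Made-up sample input, one array element per line.
my @sample = (
    "<doc>\n",
    "<telephone>555-1234</telephone>\n",
    "<appendix>\n",
    qq{<telephone client="Jones">555-1234</telephone> }
      . qq{<telephone client="Brown">555-1234</telephone>\n},
    "</appendix>\n",
    "</doc>\n",
);

my %state = (in_appendix => 0, armed => 0);
print rewrite_line($_, \%state) for @sample;
```

With this sample, only the Brown number inside the appendix should change. Replacing the loop over @sample with a while (<>) loop would turn it back into a filter like the one above.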

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
