xml diff?

Sanjiva Weerawarana sanjiva at watson.ibm.com
Wed Dec 23 15:32:57 GMT 1998

Mark D. Anderson writes:
>Suppose I want the "diff" between two xml files.
>I can imagine a few approaches:
>- very-cheesy:
>just use "diff"
>- almost-as-cheesy:
>first do s/\>/\>\nUNIQUE/g to put the tags on separate lines,
>then use "diff", then restore by s/\nUNIQUE//g
>- graph-theoretic:
>surely there must be some CS work on algorithms for finding
>the least cost path between two trees, expressed as a sequence
>of operations? the simplest is with just the operations of
>add/delete of subtrees, but move and copy are interesting too.
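
For reference, the second approach quoted above (one tag per line, then
"diff") can be sketched in a few lines of Python. This is only an
illustration under my own assumptions: the helper names tag_per_line and
line_diff are hypothetical, and the standard difflib module stands in for
the "diff" program. Because the original strings are left untouched, no
UNIQUE marker is needed to restore them afterwards.

```python
import difflib
import re


def tag_per_line(xml_text):
    """Insert a newline after every '>' and return the resulting lines."""
    return re.sub(r">", ">\n", xml_text).splitlines()


def line_diff(old_xml, new_xml):
    """Return unified-diff lines for the two documents, split at each '>'."""
    return list(difflib.unified_diff(
        tag_per_line(old_xml), tag_per_line(new_xml),
        fromfile="old", tofile="new", lineterm=""))


old = "<a><b>1</b><c>2</c></a>"
new = "<a><b>1</b><c>3</c></a>"
diff_lines = line_diff(old, new)
print("\n".join(diff_lines))
```

The result is still a textual diff, so it knows nothing about element
nesting, attribute order, or insignificant whitespace.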

A tool that does exactly this is available from IBM alphaWorks. It
computes the edit distance between two DOM trees and produces a report
indicating which nodes have been changed, which have been added, and
which have been deleted. The report is itself given in XML.
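
To give a rough idea of what a changed/added/deleted report looks like,
here is a much simpler sketch in Python. It is not the algorithm the
alphaWorks tool uses: instead of computing a minimal edit distance, it
just pairs children by position, so one insertion near the front of a
child list shows up as a run of changes. The function name diff_trees and
the path-style node labels are my own assumptions, and the report is a
plain list rather than XML.

```python
import xml.etree.ElementTree as ET


def _text(node):
    """Element text with surrounding whitespace ignored."""
    return (node.text or "").strip()


def diff_trees(old, new, path=""):
    """Return (kind, path) pairs; kind is 'changed', 'added' or 'deleted'."""
    if old.tag != new.tag:
        # Different element names: treat as one deletion plus one addition.
        return [("deleted", f"{path}/{old.tag}"), ("added", f"{path}/{new.tag}")]
    here = f"{path}/{old.tag}"
    report = []
    if old.attrib != new.attrib or _text(old) != _text(new):
        report.append(("changed", here))
    old_kids, new_kids = list(old), list(new)
    # Naive positional pairing of children (not a minimal edit script).
    for o, n in zip(old_kids, new_kids):
        report.extend(diff_trees(o, n, here))
    for o in old_kids[len(new_kids):]:
        report.append(("deleted", f"{here}/{o.tag}"))
    for n in new_kids[len(old_kids):]:
        report.append(("added", f"{here}/{n.tag}"))
    return report


old_root = ET.fromstring("<a><b>1</b><c>2</c></a>")
new_root = ET.fromstring("<a><b>1</b><c>3</c><d/></a>")
report = diff_trees(old_root, new_root)
print(report)
```

A real tree-diff tool would search for the cheapest sequence of
operations, which is what makes the edit-distance approach interesting.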

A "patch" tool comes with it; it takes this report and patches one tree
to get to the other tree. A graphical UI allows you to apply the changes
one step at a time.

Check it out... it's pretty cool! (It was written by Paco Curbera, who works
down the hall from me, so yes, I am biased about it.)

Sanjiva Weerawarana, Ph.D.                      email:  sanjiva at watson.ibm.com
Research Staff Member                             tel: +1 914 784 7288 t/l 863
IBM TJ Watson Research Center                     fax:         +1 914 784 6324
Hawthorne, NY 10598, USA.            url: http://lanka.watson.ibm.com/~sanjiva
