Identity

Lars Marius Garshol larsga at ifi.uio.no
Wed Jun 23 17:30:09 BST 1999


* David Hunter
| 
| While it is true that the transition from vacuum to Jupiter is very
| gradual, making it hard to determine exactly where Jupiter starts
| and vacuum ends, it is probably irrelevant to every member of this
| list, because none of us will have occasion to visit Jupiter.  

It's relevant to the question of identity, though. But let's not start
a meta-discussion.

| In another post on this thread, Lars Marius Garshol asked if the
| following two URLs denote the same resource:
| 
| <URL: http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html>
| <URL: http://birk105.studby.uio.no/linker/XMLtools.html>
| 
| My question is, does it matter?  Is there a case where we need an
| application to know or think that these two URLs are the same? 

Definitely! When people do a search for 'Free XML software' on Google
I want them to get a result more or less like:

  <li><a href="http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html">
      Free XML software</a> (<a href="http://birk105.../">alternative</a>)

and not to see these as two completely unrelated sites.

| OTOH, are THESE two URLs the same:
| 
| <URL:  http://a.server.com/dir/page.asp>
| <URL:  http://a.server.com/dir/page.asp?param1=5&param2=6>
| 
| This, in my [small] mind, is a much more difficult question to
| answer, but again, is there a case where we need an application to
| know or think that these refer to the same thing?

Sure! Lots! Some examples:

 - a server log analyzer that provides a referral report should merge
   references from these two 
   (see <URL: http://birk105.studby.uio.no/birk/stats/wwwrefer.html>)

 - a search engine should know whether they are the same, just as with
   my example above

 - software that builds an offline copy of a web site should know
   whether to make separate copies for these two URLs

 and so on...
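
As a rough illustration of the log analyzer case, here is a minimal Python
sketch. It assumes the analyzer has decided, as a matter of policy, that the
query string does not affect identity; that is an assumption for this
example, not a general rule. The names strip_query and referral_report are
hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL without its query string and fragment."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


def referral_report(referrers, key=strip_query):
    """Count referrals, merging URLs that map to the same key."""
    return Counter(key(url) for url in referrers)


referrers = [
    "http://a.server.com/dir/page.asp",
    "http://a.server.com/dir/page.asp?param1=5&param2=6",
]
report = referral_report(referrers)
```

Passing a different key function changes what counts as "the same", so the
identity decision stays in one place instead of being scattered through the
application.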

And, BTW, it's by no means obvious that those two URLs really refer to
the same thing. I'm sure you'll agree that these two URLs refer to
different resources, for example:

<URL: http://www.80s.com/cgi-bin/valley.cgi?url=http%3A%2F%2F208.206.40.209%2Fmyfamily%2Froad.html>
<URL: http://www.80s.com/cgi-bin/valley.cgi?url=http%3A%2F%2F207.200.30.120%2F%47over%6Eor%2F%42ush.html>
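
To make the difference visible, the following Python sketch decodes the
url parameter of each of the two URLs above. It only uses the standard
urllib.parse module; the helper name target_of is hypothetical, and the
sketch assumes the parameter is always present.

```python
from urllib.parse import parse_qs, urlsplit


def target_of(url):
    """Return the percent-decoded value of the 'url' query parameter."""
    query = parse_qs(urlsplit(url).query)
    return query["url"][0]


a = ("http://www.80s.com/cgi-bin/valley.cgi"
     "?url=http%3A%2F%2F208.206.40.209%2Fmyfamily%2Froad.html")
b = ("http://www.80s.com/cgi-bin/valley.cgi"
     "?url=http%3A%2F%2F207.200.30.120%2F%47over%6Eor%2F%42ush.html")

same_script = urlsplit(a).path == urlsplit(b).path   # same CGI script
same_target = target_of(a) == target_of(b)           # different pages
```

Both URLs share the same host and path, so dropping the query string would
wrongly merge them; the decoded targets are two different pages.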

--Lars M.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




