Socat issues for XML

Paul Grosso paul at arbortext.com
Tue Sep 22 16:34:23 BST 1998


[John and I have been having a dialog on xml-dev at ic.ac.uk, but he
sent me one reply in the chain just to me, not to the list, so I
replied just to him.  He then replied to that message cc-ing the
list, so I figure I better send my posting to the list for context.
Sorry about this being somewhat out of order.  paul]

At 17:35 1998 09 21 -0400, John wrote:
>Paul Grosso scripsit:
>
>> I see three options:
>> 
>> 1.  say that your subset of TR9401 catalogs doesn't include OVERRIDE;
>> 2.  say that your subset "recognizes" OVERRIDE entries but ignores them;
>> 3.  say that your subset handles OVERRIDE.
>
>My current choice is 2.  For reference, my implementation understands
>PUBLIC, SYSTEM, DELEGATE, CATALOG, and BASE, and recognizes-but-ignores
>all other entries whether documented in 9401:1997 or not, as long as they
>follow the grammar given there:  (Name, Name?, Quoted-String*).

That is logically reasonable, though it suffers from the problem that
users will get unexpected results when they use existing catalogs.

>
>> Looking at the pros and cons, I'd opt for option 3:  a little more work
>> for your implementations seems preferable to the problems 1 and 2 will
>> mean for end users. 
>
>The objection to #3 is that I expect that system ids in XML documents
>will often be pro forma, and it will be more useful to use the catalog
>to find a local equivalent.

I'm not sure I follow.  Are you saying that, even if you set the initial
default of OVERRIDE to YES and a catalog writer explicitly puts in an
OVERRIDE NO entry, you think it is more "useful" to assume they didn't
mean it and ignore it?  That there is, in fact, no way to say "please
really use the system ids in my document" (which is, of course, precisely
what the "real" browsers will do until and unless they actually implement
a catalog, which I don't expect from MS&NS in the near term)?
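
To make the difference between options 2 and 3 concrete, here is a rough
Python sketch under my simplified reading of OVERRIDE: a SYSTEM entry
always applies, and a PUBLIC entry that follows OVERRIDE NO is skipped
when the external identifier carries a system id. The tuple layout, the
function name and the honor_override flag are made up for illustration;
this is not John's implementation.

```python
def resolve(entries, public_id, system_id=None, honor_override=True):
    """Resolve an external identifier against one catalog entry file.

    entries is a list of tuples such as ("OVERRIDE", "NO"),
    ("PUBLIC", pubid, target) or ("SYSTEM", sysid, target).
    honor_override=False models option 2 (recognize but ignore OVERRIDE).
    """
    # SYSTEM entries are tried first and are not affected by OVERRIDE.
    if system_id is not None:
        for entry in entries:
            if entry[0] == "SYSTEM" and entry[1] == system_id:
                return entry[2]

    override = True  # assumed initial default of OVERRIDE YES
    for entry in entries:
        if entry[0] == "OVERRIDE":
            if honor_override:
                override = entry[1].upper() == "YES"
        elif entry[0] == "PUBLIC" and entry[1] == public_id:
            # Under OVERRIDE NO, a document-supplied system id wins.
            if system_id is None or override:
                return entry[2]

    # Nothing applicable: fall back to the system id in the document.
    return system_id


catalog = [
    ("OVERRIDE", "NO"),
    ("PUBLIC", "-//Example//DTD Memo//EN", "local/memo.dtd"),
]
doc_sysid = "http://example.org/memo.dtd"

option3 = resolve(catalog, "-//Example//DTD Memo//EN", doc_sysid)
option2 = resolve(catalog, "-//Example//DTD Memo//EN", doc_sysid,
                  honor_override=False)
```

With option 3 the catalog writer's OVERRIDE NO is respected and the
document's own system id is used; with option 2 the same catalog quietly
yields the local copy instead.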

>> I'm guessing you've got a scenario in mind where there is
>> no SYSTEM or PUBLIC match in a given catalog entry file and
>> where there HAS been a matching DELEGATE or CATALOG entry,
>> BUT for some reason you want to ignore the DELEGATE or CATALOG 
>> entry that was put into the catalog (why?) and instead just 
>> give up now and use the system id in the external identifier.
>
>Because XML system ids have defined semantics, whereas public ids
>don't.  I would expect that a local catalog would be willing to
>defer public id interpretation to a "root catalog" using CATALOG,
>without necessarily wanting all system ids (URLs) to be decoded
>thereby.
>
>What I really want is a CATALOG-OF-PUBLIC-IDS-ONLY entry.

So why not put only public ids in that catalog?

You're only going to get to that catalog if you're still trying
to resolve an external identifier.  If you get to that catalog,
you've got to read/process it, so you're not saving any time.

It sounds like what you want is "ignore any SYSTEM type entries
in this catalog" but if the way to do that is to put a special
entry into that catalog, you already have to be able to access
and modify that catalog, so why not just omit the SYSTEM entries.

Perhaps you need (and maybe this is what you meant above) some 
sort of entry like CATALOG but that ignores SYSTEM entries in
the catalog-to-be-read.  But that still sends you off to read
that catalog, so you aren't saving any search time as you implied
in your earlier message.  All you're doing is inhibiting any
SYSTEM entries from matching, so you're making it even more likely
that your search will continue longer and wider. 
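
A small Python sketch of that point, assuming a hypothetical
ignore_system switch on the lookup (the store, the identifiers and the
function name are all invented for illustration). The catalog entry file
is loaded whether or not its SYSTEM entries are honored; the switch only
removes one way of getting a match.

```python
# Hypothetical "root catalog" holding both kinds of entries.
STORE = {
    "root-catalog": [
        ("PUBLIC", "-//Example//DTD Memo//EN", "dtds/memo.dtd"),
        ("SYSTEM", "http://example.org/memo.dtd", "dtds/memo.dtd"),
    ],
}


def search(name, public_id=None, system_id=None, ignore_system=False,
           store=STORE, reads=None):
    """Search one catalog entry file; reads records every file loaded."""
    if reads is not None:
        reads.append(name)  # the file is fetched and parsed either way
    entries = store[name]
    # SYSTEM entries are consulted only when they are not suppressed.
    if system_id is not None and not ignore_system:
        for keyword, key, target in entries:
            if keyword == "SYSTEM" and key == system_id:
                return target
    if public_id is not None:
        for keyword, key, target in entries:
            if keyword == "PUBLIC" and key == public_id:
                return target
    return None


reads = []
hit = search("root-catalog", system_id="http://example.org/memo.dtd",
             reads=reads)
miss = search("root-catalog", system_id="http://example.org/memo.dtd",
              ignore_system=True, reads=reads)
```

Both calls read "root-catalog"; the second one simply comes back empty,
so the search has to continue elsewhere.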

You're right that the more global a catalog, the less likely
it is that it will have a SYSTEM type entry.  But if global
catalogs have no SYSTEM entries, you don't have a problem, and
if they do, maybe it makes sense.  For example, say you download
the MathML Rec from http://www.w3.org/TR/REC-MathML/ to your
machine and then browse it.  There will be a graphic/icon on the
top that doesn't resolve.  If you look in the source, you'll see

  <DIV ALIGN=RIGHT>
  <A HREF="http://www.w3.org/"><IMG SRC="images/w3c_home.gif"
    ALT="W3C" BORDER=0 HEIGHT=48 WIDTH=72 ALIGN=LEFT></A>
  <B>REC-MathML-19980407</B>
  </DIV>

Note the <IMG SRC="images/w3c_home.gif"> which doesn't work
on your machine.  But if the W3C had a "global" catalog that 
had the entry:
  SYSTEM "images/w3c_home.gif"  "http://www.w3.org/images/w3c_home.gif"
and you had a catalog with the entry:
  CATALOG "www.w3.org/catalog"
then you'd get things resolving properly when you browse W3C 
documents that you've downloaded to your local machine.
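
As a rough Python sketch of that lookup (the in-memory CATALOGS table,
the "local-catalog" name and both function names are assumptions for
illustration; a real implementation would fetch and parse the catalog
entry files): try SYSTEM entries in the current file first, then follow
CATALOG entries, and fall back to the original system id if nothing
matches.

```python
# Hypothetical in-memory stand-ins for two catalog entry files.
CATALOGS = {
    "local-catalog": [
        ("CATALOG", "www.w3.org/catalog"),
    ],
    "www.w3.org/catalog": [
        ("SYSTEM", "images/w3c_home.gif",
         "http://www.w3.org/images/w3c_home.gif"),
    ],
}


def resolve_system(system_id, catalog_name, catalogs=CATALOGS, seen=None):
    """Return the mapped identifier, or None if no SYSTEM entry matches."""
    seen = set() if seen is None else seen
    if catalog_name in seen or catalog_name not in catalogs:
        return None  # avoid loops and unknown catalogs
    seen.add(catalog_name)
    entries = catalogs[catalog_name]
    # First look for a SYSTEM match in this catalog entry file.
    for entry in entries:
        if entry[0] == "SYSTEM" and entry[1] == system_id:
            return entry[2]
    # No match here: follow CATALOG entries in the order they appear.
    for entry in entries:
        if entry[0] == "CATALOG":
            found = resolve_system(system_id, entry[1], catalogs, seen)
            if found is not None:
                return found
    return None


def resolve_or_fallback(system_id, catalog_name="local-catalog"):
    """Use the catalog mapping if there is one, else the id as written."""
    mapped = resolve_system(system_id, catalog_name)
    return mapped if mapped is not None else system_id


icon = resolve_or_fallback("images/w3c_home.gif")
other = resolve_or_fallback("images/unlisted.gif")
```

Here icon ends up as the absolute W3C URL, while other is left as the
relative system id because no catalog mentions it.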
