Are notations dead, or just pining for the fjords? (was Re: SAX and delayed entity loading)

John Cowan cowan at locke.ccil.org
Fri Feb 26 20:25:51 GMT 1999


[This is an old e-mail that I wrote in early December, but for
some reason never posted.]

Liam R. E. Quin wrote:

> There are several fairly big problems with notations as defined.
> (1) the suggestion that one use the system identifier as a program to
>     run makes them a major security hole.

Indeed.  Wherefore that should not be done.

Clause 4.7 of the XML Rec was worded poorly (too subtly):

# [...] an external identifier for the notation which may allow an XML
# processor or its client application to locate a helper application
# capable of processing data in the given notation.

That does not mean, in general, that the external ID should point
directly to the helper application, since the application is
inevitably system-dependent and the document should not be.
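
A minimal sketch of that indirection, assuming the document carries only
a public identifier and each installation keeps its own lookup table.
The public identifier, the helper path, and the names LOCAL_HELPERS,
collect_notations and helper_for are all hypothetical, chosen only for
illustration; the parser is Python's standard xml.sax.

```python
import io
import xml.sax
from xml.sax.handler import DTDHandler

# The document names the notation; it says nothing about local programs.
DOC = b"""<!DOCTYPE doc [
  <!ELEMENT doc EMPTY>
  <!NOTATION gif PUBLIC "-//EXAMPLE//NOTATION GIF image//EN">
]>
<doc/>
"""

# Hypothetical per-system table, maintained outside the document.
LOCAL_HELPERS = {
    "-//EXAMPLE//NOTATION GIF image//EN": "/usr/local/bin/example-gif-viewer",
}


class NotationCollector(DTDHandler):
    """Records every notation declaration the parser reports."""

    def __init__(self):
        self.notations = {}

    def notationDecl(self, name, publicId, systemId):
        self.notations[name] = (publicId, systemId)


def collect_notations(data):
    parser = xml.sax.make_parser()
    collector = NotationCollector()
    parser.setDTDHandler(collector)
    parser.parse(io.BytesIO(data))
    return collector.notations


def helper_for(notations, name):
    """Map a notation to a local helper; never execute the identifier itself."""
    public_id, system_id = notations[name]
    return LOCAL_HELPERS.get(public_id) or LOCAL_HELPERS.get(system_id)


print(helper_for(collect_notations(DOC), "gif"))
```

The external identifier is used only as a lookup key, so an unknown
notation yields None rather than something to run.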

(Tim, I've cc'ed you because I think making this clear(er) would be
a useful addition to your XML annotations.)
 
> (2) the idea that you know the format in advance of images or other
>     referenced objects and hard-wire it into your document does not fit
>     the web model of content negotiation, in which the client sends
>     a list of formats, in order of preference, and the server sends
>     back the best available format, converting if necessary.

This point and the following one apply to the use of notations with
unparsed entities, which I am not now trying to defend.
Notations can be applied to elements as well through NOTATION
attributes, and that use I believe to be valuable.
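
A rough sketch of what such a use might look like, assuming a
hypothetical "formula" element whose "format" attribute is declared
with type NOTATION. The element and attribute names, the public
identifiers, and the HANDLERS table are invented for illustration.
The parser behind xml.etree is non-validating, so it does not check
the attribute value against the declared list.

```python
import xml.etree.ElementTree as ET

DOC = """<!DOCTYPE doc [
  <!ELEMENT doc (formula)*>
  <!ELEMENT formula (#PCDATA)>
  <!NOTATION tex PUBLIC "-//EXAMPLE//NOTATION TeX//EN">
  <!NOTATION plain PUBLIC "-//EXAMPLE//NOTATION plain text//EN">
  <!ATTLIST formula format NOTATION (tex | plain) #REQUIRED>
]>
<doc><formula format="tex">x^2 + y^2</formula><formula format="plain">x squared</formula></doc>
"""

# Hypothetical per-notation processing, chosen by the application.
HANDLERS = {
    "tex": lambda text: "[typeset as TeX] " + text,
    "plain": lambda text: text,
}


def render(xml_text):
    """Dispatch each formula's content on the notation named by its attribute."""
    root = ET.fromstring(xml_text)
    out = []
    for formula in root.iter("formula"):
        handler = HANDLERS[formula.get("format")]
        out.append(handler(formula.text or ""))
    return out


for line in render(DOC):
    print(line)
```

Here the notation labels the format of element content that is already
in the document, so nothing has to be fetched or negotiated.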

> (4) there is no way to give a notation for XML, since, by definition,
>     any external entity with an associated notation is an unparsed entity!
>     The distinction between parsed/unparsed should be nothing to do with
>     the format at all.

No, there is no way to give a notation for XML and expect it to be
parsed automatically --- that's not the same thing at all.  If you
want to reference a subdocument and do *not* want it parsed, then
XML unparsed entities are plausible.  An example would be an XML TOC,
which can be validated, rendered, etc. without incorporating the
individual chapters (also in XML) directly into it, as a parsed
entity would do.
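
The TOC case can be sketched as follows, assuming hypothetical chapter
files and a notation named "doc-xml" (names beginning with "xml" are
reserved, so that prefix is avoided). The TocReader class and read_toc
function are illustrative only. The SAX parser reports each unparsed
entity declaration through DTDHandler but never opens the chapter files.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

DOC = b"""<!DOCTYPE toc [
  <!ELEMENT toc (chapter)*>
  <!ELEMENT chapter EMPTY>
  <!ATTLIST chapter src ENTITY #REQUIRED>
  <!NOTATION doc-xml SYSTEM "http://www.w3.org/TR/REC-xml">
  <!ENTITY ch1 SYSTEM "chapter1.xml" NDATA doc-xml>
  <!ENTITY ch2 SYSTEM "chapter2.xml" NDATA doc-xml>
]>
<toc><chapter src="ch1"/><chapter src="ch2"/></toc>
"""


class TocReader(ContentHandler, DTDHandler):
    """Collects unparsed entity declarations and the chapters that cite them."""

    def __init__(self):
        super().__init__()
        self.entities = {}   # entity name -> (system id, notation name)
        self.chapters = []   # entity names in document order

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.entities[name] = (systemId, ndata)

    def startElement(self, name, attrs):
        if name == "chapter" and "src" in attrs:
            self.chapters.append(attrs["src"])


def read_toc(data):
    reader = TocReader()
    parser = xml.sax.make_parser()
    parser.setContentHandler(reader)
    parser.setDTDHandler(reader)
    parser.parse(io.BytesIO(data))
    # Resolve each chapter reference to its declared location and notation.
    return [(name,) + reader.entities[name] for name in reader.chapters]


for entry in read_toc(DOC):
    print(entry)
```

The application ends up with a list of chapter locations and can decide
for itself whether and when to parse each one.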
 
> The word "entity" confuses many people who come to XML (and SGML!) for
> the first time.  For one thing, it's already used in the relational
> database world to mean something entirely different.

Terminological buccaneering is unavoidable.
	-- Northrop Frye

> For another
> thing, XML has at least five meanings for the word entity -- and even
> the XML specification doesn't always say which kind it means at any
> given point.

No, there are five kinds of entities.  (It's true that CDATA has
two meanings, however, which is regrettable; in SGML it has even
more.)

> Frankly, the term "file" would be better for an external entity.

An external entity can be the result of a query as well, notably
if its system id contains a "?".

-- 
John Cowan	http://www.ccil.org/~cowan		cowan at ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)




