Lotsa laughs

James Clark jjc at jclark.com
Tue May 25 14:12:37 BST 1999


Actually, yes. I stand by what I said.  I've been through this in great
detail in the context of XSL/XPointer unification (because XSL patterns
use square brackets, which are not allowed by RFC 2396), including
checking with the editor of the Character Model WD.  The consequence of
the Character Model WD is that in an HTML or XML element or attribute
that represents a URI reference, you can use characters which RFC
2396 prohibits in URI references. Each such character will be
treated as if it had been escaped by %-encoding each byte of the
sequence that encodes the character in UTF-8. If it is an HTTP URL, then
the %-encoded sequence is what actually goes on the wire in the
GET request. The reason for doing this is that RFC 2396 prohibits all
non-ASCII characters, and it's a non-starter from an I18N perspective to
require that non-English users creating their documents by hand in a
text editor represent the characters of their language using a
completely unreadable sequence of %-escapes.
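
To make the rule concrete, here is a minimal sketch in Python of the
escaping described above. It is my own illustration, not text from the
Character Model WD: the function name escape_uri_reference and the
ALLOWED set are assumptions, built from the "reserved" and "unreserved"
characters of RFC 2396 plus "%" (so existing escapes pass through) and
"#" (the fragment delimiter). Anything outside that set is replaced by
the %-encoding of its UTF-8 bytes.

```python
import string

# Characters RFC 2396 permits in a URI reference (assumed set for this
# sketch): alphanumerics, the "mark" characters, the "reserved"
# characters, "%" for existing escapes and "#" as fragment delimiter.
ALLOWED = set(
    string.ascii_letters + string.digits
    + "-_.!~*'()"      # mark
    + ";/?:@&=+$,"     # reserved
    + "%#"
)


def escape_uri_reference(text: str) -> str:
    """%-encode every character that RFC 2396 does not allow.

    Each disallowed character is encoded in UTF-8 and every resulting
    byte is written as "%" followed by two uppercase hex digits.
    """
    out = []
    for ch in text:
        if ch in ALLOWED:
            out.append(ch)
        else:
            out.extend(f"%{byte:02X}" for byte in ch.encode("utf-8"))
    return "".join(out)


if __name__ == "__main__":
    print(escape_uri_reference("http://example.org/my file.html"))
    print(escape_uri_reference("doc.xml#item[1]"))
    print(escape_uri_reference("http://example.org/café"))
```

Under these assumptions a space becomes %20, the square brackets of an
XSL pattern become %5B and %5D, and a non-ASCII letter such as "é"
becomes the two-byte sequence %C3%A9, while the author still types the
readable form in the document.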

Chris Lilley wrote:
> 
> James Clark wrote:
> >
> > "John E. Simpson" wrote:
> > > Ah. Yes, I just checked using IE5 and can indeed see that IE5 "reads" URLs
> > > with embedded spaces just fine, by auto-encoding them to %20s. That's an
> > > interesting twist.
> >
> > This is in conformance with the W3C Character Model WD. See
> >
> >  http://www.w3.org/TR/WD-charmod#URIs
> 
> Actually, no.
> 
> What it is doing is taking content which is *not* in conformance with
> either the W3C Character Model or RFC2396, and silently fixing it up so
> that it *does* conform when dereferencing URLs.
> 
> This is supposed to happen on authoring (so the content is valid), not
> as a form of retrospective error correction.
> 
> --
> Chris


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)




