text/xml vs. application/xml

David Megginson ak117 at freenet.carleton.ca
Mon Dec 22 02:03:47 GMT 1997

MURATA Makoto writes:

 > >1) Are you certain that ignoring the encoding declaration is
 > >   conforming behaviour?  
 > Yes, I am certain that ignoring the encoding declaration for text/xml 
 > is conforming behaviour.  This is to allow transcoding.

Thank you again for your posting and for your work on the MIME types.

I do not have access to any clarifications that may have been posted
in the SIG, so I necessarily rely only on the text of the PR.  The
following appears in the (normative) section 4.3.3, "Character
Encoding in Entities":

   It is an error for an entity including an encoding declaration to
   be presented to the XML processor in an encoding other than that
   named in the declaration, or for an encoding declaration to occur
   other than at the beginning of an external entity.

On the other hand, the following appears in appendix F, "Autodetection
of Character Encodings (Non-Normative)":

   The second possible case occurs when the XML entity is accompanied
   by encoding information, as in some file systems and some network
   protocols. When multiple sources of information are available,
   their relative priority and the preferred method of handling
   conflict should be specified as part of the higher-level protocol
   used to deliver XML.  Rules for the relative priority of the
   internal label and the MIME-type label in an external header, for
   example, should be part of the RFC document defining the text/xml
   and application/xml MIME types.

If "internal label" means the encoding declaration, then this note
supports your statement; unfortunately, the note is non-normative,
while the excerpt that I quoted first is normative, so the first must
take precedence (unless I've missed something elsewhere in the PR).
If the paragraph in the non-normative appendix expresses the WG's true
intention, then the PR will need to be revised to support it.

I think, however, that it would be unfortunate if the charset
parameter were used.  Consider, for example, the following document,
encoded in ASCII (despite the incorrect claim in the encoding
declaration):

  <?xml version="1.0" encoding="ISO-10646-UCS-2"?>
  <doc>This is a sample XML document.</doc>

Let's say, now, that I place this document in a directory that is
accessible through both HTTP and anonymous FTP, and also put a copy on
my local machine.  Here's what will happen:

1) java EventDemo http://www.myhost.org/texts/sample.xml
   ==> receives charset="ISO-8859-1" as the default, ignores the 
       encoding declaration, produces correct output (accidentally),
       and reports no error.

2) java EventDemo ftp://ftp.myhost.org/pub/texts/sample.xml
   ==> reads the encoding declaration, realises that the document is
       _not_ in UCS-2, and reports an error (or worse, puts out
       garbage without reporting an error).

3) java EventDemo sample.xml
   ==> same as (2).

It is counter-intuitive that well-formedness depends on the
transmission protocol.
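
The three cases above can be sketched in a few lines of Java.  This is
only an illustration under stated assumptions, not the behaviour of any
real parser: the class and method names are hypothetical, the
ISO-8859-1 default for HTTP is simply the one assumed in case (1), and
the byte check is a crude stand-in for the autodetection described in
appendix F.

```java
import java.nio.charset.StandardCharsets;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; none of these names come from the PR or the MIME draft.
class EncodingLabelSketch {

    private static final Pattern ENCODING_DECL = Pattern.compile(
        "^<\\?xml[^>]*?encoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the name in the encoding declaration, or null if none is found. */
    static String declaredEncoding(byte[] entity) {
        // Reading the bytes as ISO-8859-1 is lossless and is enough to find
        // a declaration written in an ASCII-compatible encoding.
        int n = Math.min(entity.length, 200);
        String head = new String(entity, 0, n, StandardCharsets.ISO_8859_1);
        Matcher m = ENCODING_DECL.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    /** Picks the label a processor would trust if the external label wins. */
    static String trustedLabel(String scheme, String charsetParam, byte[] entity) {
        if (scheme.equals("http")) {
            // Assumption taken from case (1): ISO-8859-1 when no charset is sent.
            return charsetParam != null ? charsetParam : "ISO-8859-1";
        }
        // No external label (FTP, local file): fall back to the declaration,
        // or to UTF-8 when there is none.
        String declared = declaredEncoding(entity);
        return declared != null ? declared : "UTF-8";
    }

    /** Rough check: do the first bytes look like the kind of encoding named? */
    static boolean plausible(String label, byte[] entity) {
        if (entity.length < 2) {
            return false;
        }
        String name = label.toUpperCase(Locale.ROOT);
        boolean wideLabel = name.contains("UCS-2") || name.contains("UTF-16");
        int b0 = entity[0] & 0xFF;
        int b1 = entity[1] & 0xFF;
        boolean bom = (b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE);
        boolean nulPadded = (b0 == 0x00 && b1 == '<') || (b0 == '<' && b1 == 0x00);
        return wideLabel == (bom || nulPadded);
    }

    public static void main(String[] args) {
        byte[] sample = ("<?xml version=\"1.0\" encoding=\"ISO-10646-UCS-2\"?>\n"
            + "<doc>This is a sample XML document.</doc>\n")
            .getBytes(StandardCharsets.US_ASCII);
        for (String scheme : new String[] {"http", "ftp", "file"}) {
            String label = trustedLabel(scheme, null, sample);
            System.out.println(scheme + ": label=" + label
                + ", plausible=" + plausible(label, sample));
        }
    }
}
```

Run against the sample document, the sketch accepts the HTTP copy under
the ISO-8859-1 label and rejects the FTP and local copies under the
declared ISO-10646-UCS-2 label, although the bytes are identical in all
three cases.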

 > Rick Jelliffe proposed that only application/xml should be used in the 
 > XML SIG.  I will follow the consensus in the XML SIG or WG. 

Please feel free to repost this message to the SIG, if you think that
it will be helpful there.  

I strongly support Rick's suggestion for application/xml, partly
because it will avoid the requirement to make several last-minute
changes to the PR, and partly because it will save XML from being
trapped by some of the same constraints as HTML.  If typical (private)
users cannot post XML documents in their web space in languages other
than English, then the whole effort will be at least a partial
failure.

All the best,


David Megginson                 ak117 at freenet.carleton.ca
Microstar Software Ltd.         dmeggins at microstar.com
