Separating documents in a stream

Christopher R. Maden crism at exemplary.net
Sun Nov 14 09:41:15 GMT 1999


[Richard Anderson]
>Provided the xml pi is _always_ used that can be used as the separator and
>seems the cleanest approach to me.

... except the text of an XML *declaration* can appear in a document other
than at the beginning, for example inside a CDATA section:

<![CDATA[<?xml version="1.0"?>]]>

while the other suggested hacks (^L, ]]>]]>) are always illegal anywhere
within an XML document.
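
A minimal sketch of the problem, assuming Python's standard
xml.etree.ElementTree (which uses expat) as the parser. The names DECL,
doc_a, doc_b and is_well_formed are made up for illustration. It shows that a
document carrying the declaration text inside CDATA is still well-formed, that
a naive split on '<?xml ' then miscounts the documents in a stream, and that a
form feed is rejected as content:

```python
import xml.etree.ElementTree as ET

DECL = '<?xml version="1.0"?>'

# Hypothetical stream of two documents; the first carries the declaration
# text a second time, inside a CDATA section.
doc_a = DECL + '<a><![CDATA[' + DECL + ']]></a>'
doc_b = DECL + '<b/>'
stream = doc_a + doc_b

# doc_a is well-formed: the CDATA content is plain character data.
cdata_text = ET.fromstring(doc_a).text

# Splitting the stream on the declaration prefix finds three pieces,
# although only two documents were concatenated.
naive_pieces = [piece for piece in stream.split('<?xml ') if piece]


def is_well_formed(text):
    """Return True if the parser accepts text as a complete document."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


# A form feed (^L, U+000C) is not a legal XML 1.0 character.
form_feed_ok = is_well_formed('<a>\x0c</a>')
```

Under these assumptions cdata_text equals the declaration string, so the
separator really does occur inside a legal document.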

>The XML declaration is an indication to the application processing the
>document as to what version of the XML spec the following markup conforms
>to, what encoding is used, and whether or not there are dependencies on
>external entities within the document.  All of those are processing hints
>(read PI) for the XML processor, aren't they?

Except "PI" is a well- and clearly-defined term, and it does not mean
"processing hint".  Careful terminology is important when discussing
defined standards.

>Now, if you can tell me that the XML decl doesn't meet the definition of a
>processing instruction as defined by section 2.6 of the spec I'll change my
>mind.

That would be §2.6, productions [16] and [17]:

[16] PI       ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))

The string '<?xml ...?>' does not match PI because 'xml' does not match
PITarget.
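
The exclusion in production [17] can be sketched as a small check. This is
only an illustration: NAME_RE is an ASCII-only approximation of the Name
production (the real one admits many more characters), and is_pi_target is a
made-up helper name.

```python
import re

# ASCII-only approximation of the XML 1.0 Name production (an assumption
# made for brevity; the spec allows a much larger character repertoire).
NAME_RE = re.compile(r'[A-Za-z_:][A-Za-z0-9._:\-]*')


def is_pi_target(name):
    """Approximate production [17]: a Name other than 'xml' in any case."""
    if NAME_RE.fullmatch(name) is None:
        return False
    return name.lower() != 'xml'


checks = {name: is_pi_target(name)
          for name in ('xml', 'XML', 'Xml', 'xml-stylesheet', 'php')}
```

Only the exact three-letter name is excluded, in every case combination; a
longer name such as xml-stylesheet still matches PITarget.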

>To me, the XML declaration is simply a PI that has a special meaning.
>Certainly the SAX parser I use reports the XML declaration as a PI which is
>obviously very usual.  Is that unusual ?

I hope it is - that parser is broken.
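
One way to check what a given parser does is to collect every
processingInstruction event and see whether the declaration shows up. A
rough sketch, assuming Python's standard xml.sax module with its default
expat-based reader; PICollector, reported_pi_targets and the sample document
are made up for illustration:

```python
import xml.sax


class PICollector(xml.sax.ContentHandler):
    """Record the target of every processing instruction reported."""

    def __init__(self):
        super().__init__()
        self.targets = []

    def processingInstruction(self, target, data):
        self.targets.append(target)


def reported_pi_targets(document):
    """Parse a bytes document and return the PI targets the parser reports."""
    handler = PICollector()
    xml.sax.parseString(document, handler)
    return handler.targets


# Hypothetical sample: an XML declaration followed by one real PI.
sample = b'<?xml version="1.0"?><?render draft?><doc/>'
targets = reported_pi_targets(sample)
```

With this reader only the real PI ('render') is reported; the declaration
does not arrive as a processingInstruction event.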

>I will argue the XML spec doesn't define the declaration as a type of PI,
>but it does reserve the xml PItarget, and, quickly checking the nearest XML
>book on my desk (XML companion by Neil Bradley) that also says it is a PI.

That book is broken too.  That's one reason why careful use of terminology
is important - you may know what you meant, but in a public forum, others
will learn from what you say, and they may not know what you meant (though
they think they do).

The prose language in §2.6 is a little ambiguous, noting that "XML", "xml",
and so on are reserved for standardization.  However, the grammar for
PITarget explicitly forbids them; compare §2.3, where Names beginning with
xml are reserved, but the grammar permits them.

-Chris

--
Christopher R. Maden, Solutions Architect
Exemplary Technologies
One Embarcadero Center, Ste. 2405
San Francisco, CA 94111


