Separating documents in a stream

Christopher R. Maden crism at
Sun Nov 14 09:41:15 GMT 1999

[Richard Anderson]
>Provided the xml pi is _always_ used that can be used as the separator and
>seems the cleanest approach to me.

... except the XML *declaration* can appear in a document other than at the
start:

<![CDATA[<?xml version="1.0"?>]]>

while the other suggested hacks (^L, ]]>]]>) are always illegal anywhere
within an XML document.
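
To make the point concrete, here is a minimal sketch (not anything proposed in the thread; the function names and sample documents are invented for illustration) comparing a naive splitter that cuts on "<?xml " with one that cuts on a form feed (^L):

```python
FORM_FEED = "\x0c"  # ^L, which is not a legal character in an XML 1.0 document


def split_on_declaration(stream):
    """Naive splitter: treat every '<?xml ' as the start of a new document."""
    marker = "<?xml "
    parts = stream.split(marker)
    # parts[0] is whatever preceded the first marker (normally empty).
    return [marker + part for part in parts[1:]]


def split_on_form_feed(stream):
    """Split on a separator that cannot occur inside a well-formed document."""
    return [doc for doc in stream.split(FORM_FEED) if doc]


# Hypothetical sample documents; doc_a carries the declaration text in CDATA.
doc_a = '<?xml version="1.0"?><a><![CDATA[<?xml version="1.0"?>]]></a>'
doc_b = '<?xml version="1.0"?><b/>'

wrong = split_on_declaration(doc_a + doc_b)            # cuts doc_a in two
right = split_on_form_feed(doc_a + FORM_FEED + doc_b)  # recovers both documents
```

The first splitter returns three fragments for two documents, because it cannot tell the CDATA content from a real declaration without parsing.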

>The XML declaration is an indication to the application processing the
>document as to what version of the XML spec the following markup conforms
>to, what encoding is used, and whether or not there are dependencies on
>external entities within the document.  All of those are processing hints
>(read PI) for the XML processor, aren't they?

Except "PI" is a well- and clearly-defined term, and it does not mean
"processing hint".  Careful terminology is important when discussing
defined standards.

>Now, if you can tell me that the XML decl doesn't meet the definition of a
>processing instruction as defined by section 2.6 of the spec I'll change my

That would be §2.6, productions [16] and [17]:

[16] PI       ::= '<?' PITarget (S (Char* - (Char* '?>' Char*)))? '?>'
[17] PITarget ::= Name - (('X' | 'x') ('M' | 'm') ('L' | 'l'))

The string '<?xml ...?>' does not match PI because 'xml' does not match
PITarget.
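
Productions [16] and [17] can be checked mechanically. The following is only a rough sketch: is_pi is a made-up helper, and NAME is an ASCII-only approximation of the real Name production, which allows many more characters.

```python
import re

# ASCII-only approximation of the XML 1.0 Name production (an assumption made
# for brevity; the real production admits a much larger character set).
NAME = re.compile(r"[A-Za-z_:][A-Za-z0-9._:\-]*")


def is_pi(text):
    """Return True if text matches production [16], approximately."""
    if not (text.startswith("<?") and text.endswith("?>")):
        return False
    body = text[2:-2]
    match = NAME.match(body)
    if match is None:
        return False
    target, rest = match.group(0), body[match.end():]
    # Production [17]: PITarget is any Name except 'xml' in any letter case.
    if target.lower() == "xml":
        return False
    # Anything after the target must begin with white space (S) ...
    if rest and rest[0] not in " \t\r\n":
        return False
    # ... and must not itself contain the closing delimiter.
    return "?>" not in rest
```

Under these assumptions is_pi('<?xml version="1.0"?>') is False, while is_pi('<?xml-stylesheet href="a.css"?>') is True: only the exact name is excluded from PITarget, not every name that begins with it.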

>To me, the XML declaration is simply a PI that has a special meaning.
>Certainly the SAX parser I use reports the XML declaration as a PI which is
>obviously very usual.  Is that unusual ?

I hope it is - that parser is broken.
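
One way to see what a given parser does is to collect every processing-instruction event it reports. Here is a small sketch using Python's standard xml.sax module (chosen only for illustration, it is not the parser under discussion; PICollector, collect_pis and the sample document are invented names):

```python
import xml.sax


class PICollector(xml.sax.ContentHandler):
    """Record every processingInstruction event the parser reports."""

    def __init__(self):
        super().__init__()
        self.pis = []

    def processingInstruction(self, target, data):
        self.pis.append((target, data))


def collect_pis(document):
    """Parse a document given as bytes and return the reported PIs."""
    handler = PICollector()
    xml.sax.parseString(document, handler)
    return handler.pis


# Hypothetical sample: an XML declaration followed by a genuine PI.
sample = b'<?xml version="1.0"?><?render fast?><doc/>'
reported = collect_pis(sample)
```

With the expat-based reader that ships with Python, only the 'render' instruction should show up in the list; the declaration is not delivered as a PI event.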

>I will argue the XML spec doesn't define the declaration as a type of PI,
>but it does reserve the xml PItarget, and, quickly checking the nearest XML
>book on my desk (XML companion by Neil Bradley) that also says it is a PI.

That book is broken too.  That's one reason why careful use of terminology
is important - you may know what you meant, but in a public forum, others
will learn from what you say, and they may not know what you meant (though
they think they do).

The prose language in §2.6 is a little ambiguous, noting that "XML", "xml",
and so on are reserved for standardization.  However, the grammar for
PITarget explicitly forbids them; compare §2.3, where Names beginning with
xml are reserved, but the grammar permits them.


Christopher R. Maden, Solutions Architect
Exemplary Technologies
One Embarcadero Center, Ste. 2405
San Francisco, CA 94111
