Separating documents in a stream
John Tigue
john.tigue at tigue.com
Sun Nov 14 06:30:13 GMT 1999
Richard Anderson (rja at arpsolutions.demon.co.uk) wrote:
> Provided the xml pi is _always_ used that can be used
> as the separator and seems the cleanest approach to me.
>
> ----- Original Message -----
> From: David Megginson <david at megginson.com>
> > kragen at pobox.com (Kragen Sitaker) writes:
> >
> > > One protocol I remember reading about uses ]]>]]> to
> > > separate XML documents, since that's a sequence that
> > > can't occur inside a legal XML document.
> >
> > True, but ^L (formfeed) is not allowed within an XML
> > document either -- it's still my favourite as a separator.
I'd consider using multipart MIME to solve the problem
of separating documents in a stream.
Here's the spec:
IETF RFC 1341
Borenstein, N. and Freed, N.,
"MIME (Multipurpose Internet Mail Extensions): Mechanisms for
Specifying and Describing the Format of Internet Message Bodies"
http://www.ietf.org/rfc/rfc1341.txt
Multipart MIME is well defined and well tested, and free compliant
software exists. Also, because the technology is orthogonal to XML,
implementation code can be reused.
Here's an example of what multipart looks like:
MIME-Version: 1.0
Content-Type: multipart/digest; boundary="---- next message ----"

------ next message ----

...document 1 goes here...

------ next message ----

...document 2 goes here...

------ next message ------
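For anyone who wants to try this, here is a minimal sketch using Python's standard email package. It is only an illustration under a few assumptions: the function names pack_documents and unpack_documents are made up for this example, and it uses multipart/mixed with text/xml parts rather than multipart/digest, because the parts of a digest default to message/rfc822 rather than arbitrary documents.

```python
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def pack_documents(documents):
    """Wrap each XML document string in its own part of one multipart body.

    Hypothetical helper: multipart/mixed is an assumption made for this sketch.
    """
    container = MIMEMultipart("mixed")
    for doc in documents:
        # Each part is labelled text/xml; the library picks the boundary
        # and the transfer encoding.
        container.attach(MIMEText(doc, "xml", "utf-8"))
    return container.as_bytes()


def unpack_documents(raw):
    """Parse the multipart body and return the XML documents as strings."""
    message = message_from_bytes(raw)
    documents = []
    for part in message.walk():
        if part.is_multipart():
            continue  # skip the container itself
        charset = part.get_content_charset() or "utf-8"
        documents.append(part.get_payload(decode=True).decode(charset))
    return documents


stream = pack_documents(["<a>one</a>", "<b>two</b>"])
for xml_text in unpack_documents(stream):
    print(xml_text)
```

The sender never has to worry about whether the separator can occur inside a document, since the MIME library chooses a boundary and encodes each part for transport.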
There's even a sub-type 'multipart/related' which works well for
aggregate documents with interlinked parts and a defined root
entity (for example, for HTML email the root is the HTML document
and it links to the non-root images). multipart/related maintains
the internal links during transfer. XML documents are good
candidates for multipart/related.
Here's the multipart/related spec:
IETF RFC 2112
Levinson, E.,
"The MIME Multipart/Related Content-type"
http://www.ietf.org/rfc/rfc2112.txt
Here's a modified example from the spec. In this example there
are 3 parts: 1 HTML-like doc and 2 JPEGs.
Content-Type: Multipart/Related;
    boundary=example-2;
    start="<950118.AEBH@XIson.com>";
    type="text/xml"

--example-2
Content-Type: text/xml; charset=iso-8859-1;
    declaration="<950118.AEB0@XIson.com>"
Content-ID: <950118.AEBH@XIson.com>
Content-Description: Document

<doc>
This picture was taken by an automatic camera mounted ...
<image file="cid:950118.AECB@XIson.com" />
<para>
Now this is an enlargement of the area ...
<image file="cid:950118.AFDH@XIson.com" />
</para>
</doc>
--example-2
Content-Type: image/jpeg
Content-ID: <950118.AFDH@XIson.com>
Content-Transfer-Encoding: BASE64
Content-Description: Picture A

[encoded jpeg image]
--example-2
Content-Type: image/jpeg
Content-ID: <950118.AECB@XIson.com>
Content-Transfer-Encoding: BASE64
Content-Description: Picture B

[encoded jpeg image]
--example-2--
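Here is a rough sketch of the same idea in Python's standard email package: build a multipart/related body with an XML root part plus an image, then look the parts up by Content-ID on the receiving side. The helper names (build_related, index_by_cid), the example.test identifiers, and the fake image bytes are all invented for illustration; nothing here comes from the RFC itself.

```python
from email import message_from_bytes
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_related(xml_text, root_cid, images):
    """Build a multipart/related body: one XML root part plus JPEG parts.

    Hypothetical helper; `images` maps a Content-ID (no brackets) to bytes.
    """
    container = MIMEMultipart("related", type="text/xml",
                              start=f"<{root_cid}>")
    root = MIMEText(xml_text, "xml", "utf-8")
    root["Content-ID"] = f"<{root_cid}>"
    container.attach(root)
    for cid, data in images.items():
        # The subtype is given explicitly so the library does not try to
        # guess the image format from the bytes.
        image = MIMEImage(data, _subtype="jpeg")
        image["Content-ID"] = f"<{cid}>"
        container.attach(image)
    return container.as_bytes()


def index_by_cid(raw):
    """Return (root Content-ID, {Content-ID: decoded payload bytes})."""
    message = message_from_bytes(raw)
    parts = {}
    for part in message.walk():
        cid = part.get("Content-ID")
        if cid:
            parts[cid.strip("<>")] = part.get_payload(decode=True)
    start = message.get_param("start")
    return (start.strip("<>") if start else None), parts


xml_doc = '<doc><image file="cid:pic-a@example.test" /></doc>'
body = build_related(xml_doc, "root@example.test",
                     {"pic-a@example.test": b"\xff\xd8 not a real jpeg"})
root_id, by_cid = index_by_cid(body)
print(by_cid[root_id].decode("utf-8"))
```

A receiver would start from the part named by the start parameter and resolve each cid: reference in it by stripping the "cid:" prefix and looking the rest up in the index.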
-John
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)