DOCTYPE misunderstood

Peter Murray-Rust Peter at ursus.demon.co.uk
Fri May 9 10:39:14 BST 1997


In message <dc9jLEA+jsczEw38 at light.demon.co.uk> Richard Light writes:
[...]
> 
> I've been thinking about the issue of what comes at the head of an XML
> document.  This may be stating the obvious, but ...
> 
> While it would be generally agreed that you can't gratuitously stick any
> old <!DOCTYPE header onto a piece of well-formed XML, I think there is a
> case for architecting XML so that you _can_ hold the naked XML without
> _any_ header information, and prepend both DOCTYPE and style processing
> instructions at delivery time.
> 
> One reason is that you might want to author a document in chunks, and
> either publish/work with the chunks in their own right, or put those
> chunks together via a 'master document' containing lots of entity
> references to pull the chunks in.  For the first purpose, the free-
> standing chunks will require a DOCTYPE header, not least so you can
> create them in a structured XML-aware editor.  For the second purpose,
> they need to be 'naked', since you can't pull in an entity with a
> DOCTYPE at the beginning, and we don't have the SGML SUBDOC facility in
> XML.

This is a problem I have come up against, and it still concerns me.  I would
like to encourage authors to create documents in small reusable chunks; the
question is whether we use a construction like:

<!DOCTYPE CML [
<!ENTITY chunk1 SYSTEM "chunk1.cml">
... etc...
]>
<CML>
...
&chunk1;
</CML>

with the chunks (say) being:
<MOL>
...
</MOL>


or whether we use something like:

<!DOCTYPE CML [
<!ENTITY mini1 SYSTEM "mini1.cml">
]>
<CML>
<XLIST XML-LINK="EXTENDED">
<XVAR XML-LINK="LOCATOR" ACTUATE="AUTO" SHOW="EMBED" HREF="&mini1;"></XVAR>
</XLIST>
</CML>

with mini1.cml being:

<!DOCTYPE CML>
<MOL>
...
</MOL>

Now, I wrote the latter on the fly; it looks horribly clunky and is much
more difficult to implement.  Is it even *legal*, and will it do what I
want?  The advantage is that the mini version can be used in its own right
and we know what language it's in.  Chunks like:

<A>Foo
<B>Bar</B>
</A>

do not carry their DTD, and unwanted whitespace could easily creep in.
Constructions like:

<A
>Foo<B
>Bar</B
></A
>

might solve some, but not all, of the whitespace problem.
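
To make the whitespace point concrete, here is a minimal sketch in Python,
assuming the standard library's xml.etree.ElementTree parser stands in for
whatever XML processor is actually used.  The names LOOSE, TIGHT and
text_pieces are hypothetical; the two strings are the <A>/<B> chunks above.

```python
import xml.etree.ElementTree as ET

# The two spellings of the same chunk, copied from the examples above.
LOOSE = "<A>Foo\n<B>Bar</B>\n</A>"
TIGHT = "<A\n>Foo<B\n>Bar</B\n></A\n>"

def text_pieces(xml_text):
    """Return each run of character data in document order."""
    root = ET.fromstring(xml_text)
    return list(root.itertext())

# Line breaks between tags become character data in the first form;
# line breaks inside the tags themselves do not.
print(text_pieces(LOOSE))
print(text_pieces(TIGHT))
```

The second form only helps with line breaks that can be moved inside a tag;
whitespace within the text content itself is untouched.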

Since this must be a Well Investigated Problem, insight would be useful.
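
For the first construction (a master document pulling in naked chunks via
entity references), the following is a rough sketch, assuming Python's
standard xml.parsers.expat module.  The names MASTER, CHUNKS and expand are
hypothetical, the chunk content "benzene" is a placeholder, and the chunks
are held in a dictionary rather than read from files.  It is an illustration
of the behaviour under discussion, not a recommended implementation.

```python
import xml.parsers.expat

# Master document declaring one external entity, as in the first example.
MASTER = """<!DOCTYPE CML [
<!ENTITY chunk1 SYSTEM "chunk1.cml">
]>
<CML>
&chunk1;
</CML>"""

# Hypothetical in-memory stand-in for the chunk files on disk.
CHUNKS = {"chunk1.cml": "<MOL>benzene</MOL>"}

def expand(master, chunks):
    """Parse master, resolving external entities from chunks.

    Returns the element start/end events in document order.
    """
    events = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        events.append(("start", name))

    def end(name):
        events.append(("end", name))

    def external_ref(context, base, system_id, public_id):
        # Parse the chunk as an external parsed entity; the sub-parser
        # inherits the handlers set on the parent parser.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(chunks[system_id], True)
        return 1

    parser.StartElementHandler = start
    parser.EndElementHandler = end
    parser.ExternalEntityRefHandler = external_ref
    parser.Parse(master, True)
    return events

print(expand(MASTER, CHUNKS))

# A chunk that carries its own DOCTYPE cannot be pulled in this way.
try:
    expand(MASTER, {"chunk1.cml": "<!DOCTYPE CML>\n<MOL>benzene</MOL>"})
except xml.parsers.expat.ExpatError as err:
    print("chunk with DOCTYPE rejected:", err)
```

The second call shows the tension described above: the chunk that is usable
as a free-standing document is exactly the one the entity mechanism refuses.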

	P.

-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)