Auxiliary parses/subdocuments in XML

Norman Gray norman at
Mon Jan 24 17:04:01 GMT 2000


How is one supposed to manage subdocuments in XML, since reference to
entities with notation XML appears to be forbidden by the XML standard?

I'm working on converting to XML an SGML application which uses
subdocuments fairly extensively.  The application doesn't use these
via a general entity reference, but instead via an attribute with an
ENTITY type -- that is, `out of context', through an auxiliary parse.

Now, XML doesn't have SUBDOCs, but it does have the ENTITY attribute type
(sect. 4.4), plus NDATA entities, and the statement (4.4.6) that unparsed
entities in this context should have their entity and notation sysids
passed on to the application.  Add to this James Clark's remark in his
`SGML and XML' note, which says
`Subdocument entities.  These could be converted into NDATA entities
with a notation that indicates that they are SGML or XML'.  Add further
the miscellaneous chatter in the archives of this and other lists, which
seems to imply the same thing, and there seems to be no problem.

However: the definition of `unparsed entity' at the beginning of
section 4 of the XML spec states that an unparsed entity's contents
may not be XML.  This means that the following:

<?xml version="1.0"?>
<!DOCTYPE simple [
<!ELEMENT simple (xref)+>
<!ELEMENT xref EMPTY>
<!ATTLIST xref doc ENTITY #REQUIRED>
<!NOTATION XML SYSTEM 'xml'>
<!ENTITY s2 SYSTEM 'simple2.xml' NDATA XML>
]>
<simple>
<xref doc="s2"/>
</simple>

is valid according to nsgmls, but appears to violate the extra-syntactic
constraint in the XML spec.
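
For illustration, here is a minimal sketch, in Python with the standard
xml.sax module, of how an application might collect the entity and
notation system identifiers that section 4.4.6 says should be passed on.
It assumes a complete version of the document above: the EMPTY declaration
for xref, the #REQUIRED default and the notation's system identifier 'xml'
are assumptions made for the example, as are the names Collector and
collect_subdocs.  Python's default SAX driver (expat) is non-validating,
so the lookup from attribute value to entity is done by hand here.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Assumed complete form of the example document (declarations partly guessed).
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE simple [
<!ELEMENT simple (xref)+>
<!ELEMENT xref EMPTY>
<!ATTLIST xref doc ENTITY #REQUIRED>
<!NOTATION XML SYSTEM "xml">
<!ENTITY s2 SYSTEM "simple2.xml" NDATA XML>
]>
<simple><xref doc="s2"/></simple>
"""


class Collector(ContentHandler, DTDHandler):
    """Records notation and unparsed-entity declarations, plus xref targets."""

    def __init__(self):
        super().__init__()
        self.notations = {}   # notation name -> system identifier
        self.entities = {}    # entity name -> (system identifier, notation name)
        self.refs = []        # values of the doc attribute, in document order

    def notationDecl(self, name, publicId, systemId):
        self.notations[name] = systemId

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.entities[name] = (systemId, ndata)

    def startElement(self, name, attrs):
        if name == "xref" and "doc" in attrs:
            self.refs.append(attrs.getValue("doc"))


def collect_subdocs(data):
    """Return (entity, entity sysid, notation, notation sysid) per xref."""
    handler = Collector()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setDTDHandler(handler)
    parser.parse(io.BytesIO(data))
    result = []
    for ref in handler.refs:
        sysid, notation = handler.entities[ref]
        result.append((ref, sysid, notation, handler.notations.get(notation)))
    return result


if __name__ == "__main__":
    print(collect_subdocs(DOC))
```

Nothing in this sketch reads simple2.xml itself; the parser only reports
the identifiers, and what to do with them is left to the application.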

Am I reading this correctly?  I can see that there's an echo here of
SGML's ban (ISO 8879) on such a target entity being an SGML text
entity (as opposed to an SGML subdocument), but that ban seems
concerned with a slightly different issue[1].  However, it does seem to
have very effectively frustrated the sort of information-gathering
auxiliary parse I want to do here.
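
To make concrete what I mean by an information-gathering auxiliary
parse, here is a rough sketch in Python (standard xml.sax module).  The
FILES dictionary is a hypothetical stand-in for the file system, the
contents of simple2.xml are invented, and the handler names are mine; the
declarations in simple.xml are one assumed way of completing the example.
The main parse notes which ENTITY attribute values name entities with
notation XML, and each such entity is then parsed separately, here
recording only its root element name.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Hypothetical in-memory "file system", keyed by system identifier.
FILES = {
    "simple.xml": b"""<?xml version="1.0"?>
<!DOCTYPE simple [
<!ELEMENT simple (xref)+>
<!ELEMENT xref EMPTY>
<!ATTLIST xref doc ENTITY #REQUIRED>
<!NOTATION XML SYSTEM "xml">
<!ENTITY s2 SYSTEM "simple2.xml" NDATA XML>
]>
<simple><xref doc="s2"/></simple>
""",
    "simple2.xml": b"<simple2><title>Second document</title></simple2>",
}


class MainHandler(ContentHandler, DTDHandler):
    """Main parse: find xref targets that are entities with notation XML."""

    def __init__(self):
        super().__init__()
        self.xml_entities = {}  # entity name -> system identifier
        self.targets = []       # system identifiers to parse separately

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        if ndata == "XML":
            self.xml_entities[name] = systemId

    def startElement(self, name, attrs):
        ref = attrs.get("doc")
        if name == "xref" and ref in self.xml_entities:
            self.targets.append(self.xml_entities[ref])


class RootNameHandler(ContentHandler):
    """Auxiliary parse: record the name of the document element only."""

    def __init__(self):
        super().__init__()
        self.root = None

    def startElement(self, name, attrs):
        if self.root is None:
            self.root = name


def parse_bytes(data, handler):
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    if isinstance(handler, DTDHandler):
        parser.setDTDHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler


def auxiliary_parse(main_sysid):
    """Map each referenced subdocument's sysid to its root element name."""
    main = parse_bytes(FILES[main_sysid], MainHandler())
    return {sysid: parse_bytes(FILES[sysid], RootNameHandler()).root
            for sysid in main.targets}


if __name__ == "__main__":
    print(auxiliary_parse("simple.xml"))
```

The target is never pulled into the main document's parse; it is read
`out of context', which is the behaviour the SGML application relies on.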

If I am reading this correctly, is there an obvious best way to
proceed?  `Use XLink' would be an answer, I suppose, but seems a rather
long way about things; also, I'd rather be sure I _have_ to rule out
what has hitherto worked, before starting again with alternatives.

[ Apologies if anyone's already seen this question on comp.text.xml --
I posted it there before twigging that here was probably a better place,
and that anyone reading xml-dev was unlikely to have enough spare time
to read both. ]

All the best,


[1] The thread containing
    seems relevant here.

Norman Gray              
Physics and Astronomy, University of Glasgow, UK     norman at

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at
Archived as: or CD-ROM/ISBN 981-02-3594-1