XML & Entities inclusion against Inline Tag facilities.

Peter Murray-Rust Peter at ursus.demon.co.uk
Thu May 22 12:10:31 BST 1997


In message <199705220709.JAA28224 at ifhamy.insa-lyon.fr> Alexandre Mutel writes:

This is an important subject, and I am currently wrestling with the XML
linking spec.  I'd be happy to see a clear exposition of how XML
includes/transcludes documents, fragments, etc.

<INTERLUDE>
Although XML-DEV is not intended as a forum for beginners, there are a number
of questions - like the current one - which are legitimate to discuss if we 
don't have a lot of traffic.  I also think it's easy for developers to
misinterpret parts of the spec (I have done this in a major way and fairly
publicly with XML-LINK I think :-).  Also
<PLUG>
Since I am running a virtual course on XML and Java (see URL), it's useful to
know what questions come up :-)
</PLUG>
</INTERLUDE>

> hello,
> 
>    In XML specs (like SGML features), they talk about entities inclusions in
>    a document... Something like:
> 
>    <!DOCTYPE book [
> 	<!ELEMENT book (#PCDATA) >
> 	<!ENTITY  including SYSTEM "http://server1.com/index.txt">
>    ]>
>    <book>
>    &including;
>    </book>

This is indeed correct, and validating parsers are required to implement it.
For many applications it will simply be an insertion of the text in index.txt
at the point of the entity reference.  So if index.txt contained:

<P>That's all folks!</P>

the parser would create an intermediate instance:

<book>
<P>That's all folks!</P>
</book>

Note that if there is whitespace in the entity, this whitespace is
included in the document.  Also, if there are entity references in the
entity, *these* are then processed.

This facility only works for entities which are XML documents (but see
NOTATIONS).  They cannot have a DOCTYPE or subset and must correspond to a
well-formed document.  For example, an entity containing only

<P>That's all folks!

would not be allowed.  However, the spec (4.4(8)) says that if the processor
(i.e. the parser) is NOT validating the document, it doesn't have to
expand the entity.  I assume (contributions, please) that this would be
controlled through a parser switch (-E to expand entities, or similar).  That
means that your document could still parse (WF) even if the entity was not WF,
as long as expansion was disabled.
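
As a rough illustration of the two cases above (expansion on and off), here
is a minimal sketch using Python's standard xml.parsers.expat module, which is
a non-validating parser.  The RESOURCES table and the parse_book function are
hypothetical names invented for this example; the table stands in for fetching
the SYSTEM identifier, so nothing is downloaded, and the ELEMENT declaration
from the question is left out because expat does not validate.

```python
import xml.parsers.expat

# Hypothetical stand-in for fetching a SYSTEM identifier; nothing is downloaded.
RESOURCES = {
    "index.txt": "<P>That's all folks!</P>",
    "broken.txt": "<P>That's all folks!",  # not well-formed: <P> is never closed
}

DOC = """<!DOCTYPE book [
  <!ENTITY including SYSTEM "%s">
]>
<book>
&including;
</book>"""


def parse_book(system_id, expand=True):
    """Parse DOC and return the tags and text seen, as one string."""
    out = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: out.append("<%s>" % name)
    parser.EndElementHandler = lambda name: out.append("</%s>" % name)
    parser.CharacterDataHandler = out.append

    def external_entity(context, base, sys_id, pub_id):
        # The sub-parser inherits the handlers above, so the entity's
        # content is reported at the point of the reference.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(RESOURCES[sys_id], True)
        return 1

    if expand:
        # Without this handler expat simply skips the external entity.
        parser.ExternalEntityRefHandler = external_entity
    parser.Parse(DOC % system_id, True)
    return "".join(out)


if __name__ == "__main__":
    print(parse_book("index.txt"))
    print(parse_book("broken.txt", expand=False))
```

With expansion enabled, parse_book("broken.txt") should raise an ExpatError,
while the same document parses when expansion is off, which is the behaviour
described above.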

> 
>    Okay,they say that with XML-SGML a document can be built with document-part-
>    included using entities facilities.
>    HTML doesn't make use of external entities but it can do inline image through
>    some tag... In XML specs i doesn't see any reference to TAG or special attri-
>    butes that can handle inclusion of document component (text,image,object).

This will be done through XML-LINK.  This is much more powerful than HTML as
it can be applied to any element.  Here's how HTML's IMG would look in XML:

<!ATTLIST IMG 
    XML-LINK CDATA #FIXED "SIMPLE"
    SHOW     CDATA #FIXED "EMBED"
    ACTUATE  CDATA #FIXED "AUTO"
    HREF     CDATA #REQUIRED
>

This defines IMG to be a SIMPLE XML-LINK.  Its target 'resource' is
located through HREF, just as in HTML's A.  <A> behaves as if it had
ACTUATE="USER" SHOW="REPLACE", i.e. nothing happens till the user clicks it,
and then (usually) the display is replaced by the new 'resource'.  For IMG
the link is traversed as soon as it is encountered, and the resource is
embedded in the document (probably near the <IMG> element).
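
To show how an application might see these #FIXED attributes, here is a small
sketch using Python's xml.parsers.expat.  It assumes the ATTLIST is placed in
the internal subset with quoted default values, and relies on the parser
reporting defaulted attributes alongside the HREF given in the instance.  The
functions link_attributes and describe are hypothetical names for this
example, not part of XML-LINK or of any browser.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE IMG [
<!ATTLIST IMG
    XML-LINK CDATA #FIXED "SIMPLE"
    SHOW     CDATA #FIXED "EMBED"
    ACTUATE  CDATA #FIXED "AUTO"
    HREF     CDATA #REQUIRED
>
]>
<IMG HREF="logo.gif"/>"""


def link_attributes(doc):
    """Return the attributes reported for the first element of doc."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append(dict(attrs))
    parser.Parse(doc, True)
    return seen[0]


def describe(attrs):
    """Hypothetical dispatcher: describe the link behaviour in words."""
    when = "at once" if attrs.get("ACTUATE") == "AUTO" else "on user request"
    how = ("embedded in place" if attrs.get("SHOW") == "EMBED"
           else "replacing the display")
    return "%s: fetched %s, %s" % (attrs["HREF"], when, how)


if __name__ == "__main__":
    print(describe(link_attributes(DOC)))
```

The point of the design is that the application only looks at the link
attributes, never at the element name, so the same code would serve any
element declared this way.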

> 
>    I would like to know :
> 	- if in the future, we 'll only use external entities to include a
> 	  document component ?

No, you can use XML-LINK to refer to part of the current document, as well
as to external documents.  If the external documents are XML then it is
often straightforward to include them, but only if they have the same DOCTYPE.
If they have different DOCTYPEs we have a namespace problem, and we are still
wrestling with that one.  For example:

<CML>
The rate of this reaction is given by 
<A HREF="eqn1.xml">equation 1</A>
</CML>

where eqn1.xml might be written in MathML.

If the external entity is BINARY (i.e. not XML - it may still be ASCII) then
a NOTATION is required (e.g. for GIF).

I'll stop there and suggest someone else tells us how to use NOTATION 
because I haven't implemented it yet!!
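
In the meantime, here is a tentative sketch (not from any implementation of
mine) of how a NOTATION and an unparsed entity might be declared and then
read back through Python's xml.parsers.expat.  The entity name "logo", the
file "logo.gif", the "image/gif" identifier and the SRC attribute are all
assumptions made up for the example, as is the function unparsed_entities.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE book [
<!NOTATION GIF SYSTEM "image/gif">
<!ENTITY logo SYSTEM "logo.gif" NDATA GIF>
<!ATTLIST IMG SRC ENTITY #REQUIRED>
]>
<book><IMG SRC="logo"/></book>"""


def unparsed_entities(doc):
    """Return (notations, entities) as declared in the DTD of doc."""
    notations, entities = {}, {}
    parser = xml.parsers.expat.ParserCreate()

    def on_notation(name, base, system_id, public_id):
        notations[name] = system_id

    def on_unparsed(name, base, system_id, public_id, notation_name):
        # The parser never reads logo.gif; it only reports the declaration.
        entities[name] = (system_id, notation_name)

    parser.NotationDeclHandler = on_notation
    parser.UnparsedEntityDeclHandler = on_unparsed
    parser.Parse(doc, True)
    return notations, entities


if __name__ == "__main__":
    print(unparsed_entities(DOC))
```

The parser only hands over the names; finding a tool that understands the
notation is left to the application.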

> 	- anyelse, does XML will support special attributes for Tag to specify
> 	  that this Tag with this attributes can include something?

You can add XML-LINK attributes to ANY element, so you don't have to use
a single one like <TAG>.

> 	- or does this feature will be hardcoded in a browser, making the same
> 	  mistake than HTML?

Nothing is hardcoded in JUMBO, which is the first XML browser that I know
of :-).  If a browser manufacturer wishes to limit their browser to 
one particular XML application then good luck to them - maybe their
market is well-defined.  For example, if someone writes an XML browser
specifically for mobile phones, they may well hardcode their application.
I am strongly urging the scientific/technical/medical community to develop
interoperable components and with CML and MathML we are off to a good start.

A generic browser (like JUMBO) has to be prepared to implement XML-LINK and
XML-STYLE independently of the DTD.  It also has to be able to switch
DOCTYPES for different namespaces.  In principle it also has to be able
to find tools to deal with a number of common NOTATIONS (GIF, CGM, etc.)
and I hope that people will produce self-installing tools for those to
save the browser m'facturers having to reinvent it every time. 

For the major horizontal browser m'facturers, we shall have to wait and see.
I'm very much hoping there is a good API into XML browsers so that developers
can avoid having to render HTML, interface with mail, etc.

Let's have your postings...but keep them targeted to the development of XML
tools, resources, documents, tutorials, etc.

	P.



-- 
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/

xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo at ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa at ic.ac.uk)



