EMBED and validation

W. Eliot Kimber eliot at isogen.com
Mon Dec 1 17:31:21 GMT 1997

At 03:36 PM 12/1/97 UT, Simon St.Laurent wrote:
>>From a DOM perspective, EMBEDded material will almost certainly not be
>>considered part of the document tree containing the EMBED element.
>I very much look forward to seeing what the DOM does (or doesn't do) with
>EMBEDded material.  But is this an issue for the DOM in particular, or
>should the XML-Link spec give clearer direction about the nature of EMBEDded
>material?  Especially as some of the replies so far have said that an
>application _could_ include the EMBEDded material in the document tree _if_
>the developer so chose - which opens the door to multiple interpretations
>in a large way.

XML (or SGML) data can be used in one of two ways:

1. Use by value (you get the data syntactically).  This is what text
   entities are for.  A text entity is, by definition, part of the *character
   string* of the document that references it.  That means that the parser
   parses it at the point of reference and it must be valid or well formed
   (if the entire document is well formed).  A document with a text entity
   reference is identical, for parsing purposes, to a document with the
   reference replaced by the entity's replacement text (note that in base
   SGML ESIS, text entity references are not communicated by the parser).

2. Use by reference (you point to the data but don't get it syntactically).
   This is what XML Link means by "EMBED" and what HyTime means by 
   "value reference".  The referenced data is a separate, self-contained
   object and the parser does not parse it at the point of reference (if
   at all, as it may not be XML data).  For use-by-reference, it is up to the
   processing application to make sense of the reference, for example, 
   presenting a referenced image according to the active style settings or
   presenting a referenced document as though it had occurred in line, or
   providing an icon you can select to see the referenced thing.
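
A minimal Python sketch of the two cases, using the standard library's
xml.etree.ElementTree.  The element name "chapref" and its "href" and
"show" attributes are illustrative assumptions, not the XML Link syntax.
In the first document the parser expands the entity at the point of
reference; in the second it reports only the referencing element and never
looks at chap1.xml.

```python
import xml.etree.ElementTree as ET

# Use by value: the entity's replacement text is parsed where it is referenced.
BY_VALUE = """<!DOCTYPE book [
  <!ENTITY chap1 "<chapter><title>One</title></chapter>">
]>
<book>&chap1;</book>"""

# Use by reference: the parser sees only a pointer (hypothetical markup).
BY_REFERENCE = '<book><chapref href="chap1.xml" show="embed"/></book>'

by_value = ET.fromstring(BY_VALUE)
by_reference = ET.fromstring(BY_REFERENCE)

# The chapter is part of the first tree; the second holds just the reference.
value_children = [child.tag for child in by_value]
reference_children = [child.tag for child in by_reference]
print(value_children, reference_children)
```

What happens to the "chapref" element afterwards is entirely the
application's business.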

As for "document trees" (groves), the initial result is *never* a single
tree containing the results of parsing two documents (if the thing used by
reference is another document).  However, a processing application might
choose to construct a *new* tree that combines the two documents in some
way that makes sense *to the application*.  For example, I've written
several instances of a program that takes a tree of subdocuments and
creates a single instance from them.
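
As a rough sketch of that kind of program (not the programs mentioned
above), the following Python builds a *new* tree from separately parsed
documents.  The "embed" element, its "href" attribute, the document names,
and the in-memory STORAGE table are all assumptions made for illustration,
and there is no protection against circular references.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical storage objects, keyed by name.
STORAGE = {
    "book.xml": '<book><title>Guide</title>'
                '<embed href="ch1.xml"/><embed href="ch2.xml"/></book>',
    "ch1.xml": "<chapter><title>One</title></chapter>",
    "ch2.xml": '<chapter><title>Two</title><embed href="app.xml"/></chapter>',
    "app.xml": "<appendix><title>Notes</title></appendix>",
}

# Parsing yields one tree per document, never a single combined tree.
trees = {name: ET.fromstring(text) for name, text in STORAGE.items()}


def compose(name):
    """Return a new tree with each <embed> replaced by the referenced root."""
    root = copy.deepcopy(trees[name])  # leave the parsed trees untouched
    _expand(root)
    return root


def _expand(element):
    for index, child in enumerate(list(element)):
        if child.tag == "embed":
            replacement = compose(child.get("href"))
            replacement.tail = child.tail
            element[index] = replacement
        else:
            _expand(child)


combined = compose("book.xml")
print(ET.tostring(combined, encoding="unicode"))
```

The combined tree exists only because the application asked for it; the
per-document trees in "trees" are unchanged.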

Note that making the distinction between use by value and use by reference
keeps separate the storage and logical organization of the data, so that
data can be organized into storage objects independently of how it might be
used logically by reference.  For example, I might put all my chapters in a
single storage object (document entity) but use individual chapters by
reference (using element-level addressing).  It's also important to keep in
mind that, for XML and SGML, a reference to a document entity is usually
taken as shorthand for reference to that document's root element (that is
the HyTime default and, I assume, the TEI default).
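
A small Python sketch of both points, assuming a simplified "name#id"
reference syntax and an in-memory STORAGE table (neither is the XML Link
or HyTime addressing syntax): a bare reference to the document resolves to
its root element, and a reference with an ID resolves to one chapter
inside the same storage object.

```python
import xml.etree.ElementTree as ET

# All chapters live in one storage object (hypothetical content).
STORAGE = {
    "book.xml": '<book>'
                '<chapter id="c1"><title>One</title></chapter>'
                '<chapter id="c2"><title>Two</title></chapter>'
                '</book>',
}


def resolve(reference):
    """Resolve 'name' to the root element, or 'name#id' to the element with that id."""
    name, _, element_id = reference.partition("#")
    root = ET.fromstring(STORAGE[name])
    if not element_id:
        return root  # document reference taken as shorthand for the root
    for element in root.iter():
        if element.get("id") == element_id:
            return element
    raise KeyError(reference)


whole = resolve("book.xml")
second = resolve("book.xml#c2")
print(whole.tag, second.find("title").text)
```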

In HyTime's abstract processing model, use by reference is, by default,
transparent to processing applications because the HyTime engine redirects
the processing application to the data used by reference, making it look to
the processor as though there is but a single grove.  However, under the
covers the groves are distinct and processors can ask to view them that
way.  This is probably more sophistication than most XML processors (e.g.,
browsers) need provide, although more sophisticated browsers and hypertext
systems need this flexibility.
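
The idea can be approximated in a few lines of Python.  This is only a
sketch of the redirection, not HyTime's grove model or any engine's API;
the "embed" element, the GROVES table, and the children() function are
hypothetical names.

```python
import xml.etree.ElementTree as ET

STORAGE = {
    "book.xml": '<book><title>Guide</title><embed href="ch1.xml"/></book>',
    "ch1.xml": "<chapter><title>One</title></chapter>",
}

# One tree per document; they are never merged.
GROVES = {name: ET.fromstring(text) for name, text in STORAGE.items()}


def children(element, transparent=True):
    """Yield child elements, redirecting through references when transparent."""
    for child in element:
        if transparent and child.tag == "embed":
            yield GROVES[child.get("href")]  # redirect to the other tree
        else:
            yield child


book = GROVES["book.xml"]
transparent_view = [child.tag for child in children(book)]
distinct_view = [child.tag for child in children(book, transparent=False)]
print(transparent_view, distinct_view)
```

Unlike building a combined instance, nothing is copied here: the
transparent view hands back the root of the other tree itself.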


<Address HyTime=bibloc>
W. Eliot Kimber, Senior Consulting SGML Engineer
Highland Consulting, a division of ISOGEN International Corp.
2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
