AFs and the DPH
W. Eliot Kimber
eliot at isogen.com
Thu Oct 2 18:14:27 BST 1997
Peter Murray-Rust wrote:
> May I reduce my ignorance further by asking some simple questions:
> - must a DTD (or at least an ATTLIST) always be provided with the document
> instance?
Only if you want to avoid putting the mapping attributes in start tags
(moral equivalent of qualifying names a la colonization) or want to use
element types that are different from the architectural names
(remembering that by default, if an element has the same GI as a form in
the active architecture, it is mapped to it automatically).
The architecture mechanism was designed when you always had attribute
declarations, so it is optimized for reducing instance syntax by
providing attribute list declarations.
NOTE: If the architecture's meta-DTD is identical to what would be the
document's DTD if it had one (for documents without DTDs), then all
mapping is automatic and there's no need for additional attributes in
the instance. In other words, given a document with an explicit DTD,
you can remove the DTD, make it an architectural meta-DTD, and get the
same processing result. This is why I think architectures are key to
the success of XML: they let you eat the cake of DTD-less documents and
still have it (because the architecture processing gives you all the
validation and processing you need, but only when you want it and not
when you don't).
> - if so, how is this information going to be transmitted to the AF-aware
> processor. Will Xapi-J do this?
I'm not sure what you mean by 'this information'. Do you mean the
mapping itself? If the attributes are declared or specified, they're
simply part of the properties of the elements and any AF-aware processor
can examine the attributes to see whether there are any it
recognizes. Automatic mapping is slightly more work, because you have
to know what form you are looking for (either because you have
hard-coded it into your processing (e.g., if (gi == 'some-form') {}) or
because you are also looking at the meta-DTD). In the simple case, your
AF-aware processor is expecting certain element forms and attributes and
simply looks for them, rather than trying to do generalized architecture
processing. This is functionally equivalent to having a processor tied
to a particular DTD except that you look first for architectural
attributes and *then* at GIs, rather than starting with GIs.
Any abstract API (like Xapi-J) can be usefully enhanced to make getting
architecture-specific properties easier. For example, in the work I've
done with ADEPT*Editor, I created a set of functions to resolve
architectural mappings--these functions could easily be provided by
ADEPT out of the box. Likewise, any sufficiently complete document API
probably provides primitives that can be combined to provide
architecture-support functions--you can either do it yourself (as I have
for ADEPT and DSSSL) or make them part of the base API. The set of core
functions is fairly small:
- ArchFormOf            - Returns the form, if any, of an element for
                          a given arch. Applies architectural
                          automapping rules.
- IsArchitectural?      - Returns true if the element is architectural
                          for an arch.
- LocalAttributeNameFor - Given an architectural attribute name,
                          returns the name of the attribute of the
                          element that is mapped to the architectural
                          one (i.e., resolves architectural attribute
                          name remapping).
- ArchAttValue          - Returns the value of a given architectural
                          attribute. If none is specified, returns the
                          architecture-defined default (if known,
                          either because the meta-DTD is available or
                          because knowledge of the architecture is
                          hard-coded somewhere).
- ArchContentOf         - Resolves architectural content remapping.
- IsArchForm?           - Returns true if a given element is of the
                          specified form.
With these functions, it's pretty easy to do architecture-aware
processing just as you do DTD-aware processing, e.g.:

    $archform = &ArchFormOf($current_node, 'XML-LINK');
    # Form names are strings, so compare with eq rather than ==.
    if ($archform eq 'SIMPLE') {
        print STDERR "Found a simple link element\n";
    } elsif ($archform eq 'EXTENDED') {
        print STDERR "Found an extended link element\n";
    }

Versions of these functions are provided as part of the hy-lib.pl Perl
package mentioned below.
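
For readers without hy-lib.pl to hand, here is a minimal sketch of how ArchFormOf and IsArchForm might be written. It is not the hy-lib.pl code. The node representation (a hash with "gi" and "atts" keys), the %known_forms table, the folding of names to upper case, and the use of the architecture name as the form attribute name are all assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical table of the forms each architecture defines. A real
# processor would hard-code this or read it from the meta-DTD.
my %known_forms = (
    'XML-LINK' => { SIMPLE => 1, EXTENDED => 1, LOCATOR => 1 },
);

# Return the architectural form of $node for architecture $arch, or
# undef if the element is not architectural. Assumes the form attribute
# is named after the architecture (the default case).
sub ArchFormOf {
    my ($node, $arch) = @_;
    my $form = $node->{atts}{$arch};
    return uc $form if defined $form && length $form;
    # Automapping: a GI that matches a form name maps to that form.
    my $gi = uc $node->{gi};
    return $gi if $known_forms{$arch} && $known_forms{$arch}{$gi};
    return;
}

# True if $node is of form $form in architecture $arch.
sub IsArchForm {
    my ($node, $arch, $form) = @_;
    my $found = ArchFormOf($node, $arch);
    return defined($found) && $found eq uc($form);
}

# Three sample elements: explicitly mapped, automapped, and unmapped.
my $explicit = { gi => 'citation', atts => { 'XML-LINK' => 'simple' } };
my $implicit = { gi => 'extended', atts => {} };
my $plain    = { gi => 'para',     atts => {} };

for my $node ($explicit, $implicit, $plain) {
    my $form = ArchFormOf($node, 'XML-LINK');
    printf "%s -> %s\n", $node->{gi},
        defined $form ? $form : '(not architectural)';
}
```

The explicit attribute is checked before the GI, which matches the "attributes first, *then* GIs" order described earlier.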
> - is any other information required, or can the processor deduce from the
> values transmitted that this is an AF?
With a few reasonable assumptions, yes, the attributes alone are
sufficient. For completely general architectural processing, you need
to either have built-in knowledge of the architecture meta-DTD (e.g.,
you have a hard-wired HyTime-aware processor like Panorama) or you also
process the meta-DTD (like the code I posted recently to the Arbortext
mailing list for doing generalized architectural processing with
ADEPT*Editor).
> - if other information is required, how is it to be included (does SP
> require additional arguments/input, for example)
The attributes alone are sufficient for quick-and-dirty, hardwired
processors (such as my hy-lib.pl Perl code [now quite out of date, but
still illustrative of DPH architecture processing], which can be found
at "www.isogen.com/demos"). To be more generalized, you need the
architecture declaration attributes (provided as data attributes of the
architecture notation in the AFDR-defined approach). These attributes
tell you what the attribute names are for the attributes used in the
document, such as the name of the attribute that specifies the form
mapping, the renaming attribute, and so on. You need to process these
attributes when you want to process documents that don't use the
defaults. SP provides this processing automatically as part of its
generalized architectural processing.
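
As a rough illustration of what processing the declaration attributes involves, the sketch below resolves the control attribute names and then applies attribute renaming. The hash-based data structures are hypothetical. ArcFormA and ArcNamrA are the AFDR support attribute names as I understand them, and I am assuming that the form attribute defaults to the architecture name and that the renamer value is a list of "architectural-name local-name" pairs. Special tokens such as #DEFAULT and #CONTENT are not handled.

```perl
use strict;
use warnings;

# Resolve the names of the control attributes for one architecture.
# $decl holds the data attributes of the architecture notation
# (hypothetical representation as a plain hash).
sub ControlAttNames {
    my ($arch, $decl) = @_;
    return {
        form    => $decl->{ArcFormA} // $arch,  # assumed default: arch name
        renamer => $decl->{ArcNamrA},           # undef if renaming is unused
    };
}

# Given an architectural attribute name, return the name of the local
# attribute that carries it on $node. Assumes the renamer attribute value
# is a whitespace-separated list of "arch-name local-name" pairs.
sub LocalAttributeNameFor {
    my ($node, $names, $arch_att) = @_;
    my $renamer = $names->{renamer};
    if (defined $renamer && defined $node->{atts}{$renamer}) {
        my @tokens = split ' ', $node->{atts}{$renamer};
        while (@tokens >= 2) {
            my ($from, $to) = splice @tokens, 0, 2;
            return $to if lc($from) eq lc($arch_att);
        }
    }
    return $arch_att;  # not renamed: same name in the client document
}

# A document that declares HyNames as its renamer and leaves the form
# attribute name at its default.
my $names = ControlAttNames('HyTime', { ArcNamrA => 'HyNames' });
my $node  = {
    gi   => 'xref',
    atts => { HyTime => 'clink', HyNames => 'linkend target', target => 'fig1' },
};
my $local = LocalAttributeNameFor($node, $names, 'linkend');
print "linkend is carried by '$local' = $node->{atts}{$local}\n";
```

A processor that requires the defaults can skip ControlAttNames entirely and hard-code the two names.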
Again, for the simple case, you can just require people to use the
defaults and not worry about it. This is what Panorama does.
> - is the working of the processor completely automatic?
I'm not sure what you mean by 'working' in this case.
> - what is the output of the processor? (a grove?). Can it be represented
> by an ESIS stream?
The abstract architectural processing model is one in which there are at
least two groves: the one constructed from the parsing of the client
document and the one constructed from the architectural instance. The
nodes in the "architectural instance grove" have pointers back to the
nodes in the client document grove from which they were derived, so that
when processing the architectural grove you can get back to the client
document grove. In this model, you can process either grove and always
get whatever information you need about the other. The GroveMinder
product, being developed by TechnoTeacher, provides this grove model,
for example.
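
To make the two-grove idea concrete, here is a much simplified sketch using trees of Perl hashes rather than real groves. It has nothing to do with how GroveMinder works. The node layout and the "source" key are my own invention for illustration, only explicitly specified form attributes are considered (no automapping), and non-architectural elements are simply dropped with their architectural descendants promoted.

```perl
use strict;
use warnings;

# Build an architectural-instance tree from a client-document tree.
# Each derived node keeps a {source} reference back to the client node
# it came from. Returns a list of derived nodes, because a
# non-architectural element contributes only its architectural children.
sub DeriveArchTree {
    my ($node, $arch) = @_;
    my @kids = map { DeriveArchTree($_, $arch) } @{ $node->{children} || [] };
    my $form = $node->{atts}{$arch};
    return @kids unless defined $form;
    return { gi => $form, source => $node, children => \@kids };
}

# A small client document with one architectural element.
my $doc = {
    gi => 'doc', atts => {}, children => [
        { gi => 'para', atts => {}, children => [
            { gi       => 'xref',
              atts     => { 'XML-LINK' => 'SIMPLE', href => 'target.xml' },
              children => [] },
        ] },
    ],
};

my @arch = DeriveArchTree($doc, 'XML-LINK');
for my $n (@arch) {
    # Follow the back pointer to reach information in the client tree.
    print "$n->{gi} derived from <$n->{source}{gi}>, ",
          "href=$n->{source}{atts}{href}\n";
}
```

Going the other way (from a client node to its architectural counterpart) would need a second index, which is omitted here.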
Because architectural processing happens *after* parsing, ESIS isn't
really relevant: ESIS just tells you about the parse result, which
either gives you enough information to do the architectural processing
(at a minimum, the values of attributes) or does not. The only question
is one of completeness: does your ESIS output include everything you
need from the original document? For example, if you want to do complete
architectural processing, you need to get the attributes declared for
the architecture notation, but many systems don't provide data
attributes unless they're associated with a particular entity, which
architecture notations are not.
For the simplest, "I'm just looking for attributes" case, normal ESIS is
sufficient to enable architecture-aware processing.
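
The "just looking for attributes" case can be sketched against ESIS-style output as follows. This assumes the line format written by sgmls and nsgmls, in which "Aname type value" lines precede the "(GI" line of the element they belong to. The function name and the sample input are made up for illustration, the form attribute is assumed to be named after the architecture, and escapes in attribute values are not decoded.

```perl
use strict;
use warnings;

# Scan ESIS-style lines and return [GI, form] pairs for every element
# that specifies the form attribute of architecture $arch.
sub FindArchElements {
    my ($arch, @lines) = @_;
    my (%atts, @found);
    for my $line (@lines) {
        if ($line =~ /^A(\S+) (?:CDATA|TOKEN) (.*)$/) {
            $atts{ uc $1 } = $2;       # remember until the start tag
        } elsif ($line =~ /^A/) {
            next;                      # IMPLIED and other types: ignored here
        } elsif ($line =~ /^\((\S+)/) {
            push @found, [ $1, $atts{ uc $arch } ]
                if defined $atts{ uc $arch };
            %atts = ();                # attributes apply to one start tag only
        }
    }
    return @found;
}

# Hypothetical ESIS fragment for illustration.
my @esis = (
    'AXML-LINK CDATA SIMPLE',
    'AHREF CDATA target.xml',
    '(CITATION',
    '-see the demos',
    ')CITATION',
    '(PARA',
    ')PARA',
);

my @hits = FindArchElements('XML-LINK', @esis);
print "$_->[0] is a $_->[1] link\n" for @hits;
```

Nothing here needs the architecture declaration, which is why this only works for documents that stick to the default attribute names.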
Cheers,
Eliot