Javadoc comments or equivalent in XSchema
carl at chage.com
Sun Jul 26 05:36:16 BST 1998
> From: "Curt Arnold" <CurtA at worldnet.att.net>
> Date: Sat, 18 Jul 1998 01:42:10 -0500
> I have been using some JavaDoc-like comments in my XML-Data
> definitions to automatically generate HTML documentation for the document
> content. I have had my XML-Data to DTD converter modified to produce the
> HTML documentation at the same time as it builds the DTD.
> Has anybody else been doing anything in this area?
Yes, I've done something very similar, where I use a single file containing
data that defines the schema/DTD, and includes documentation in addition to
the declarations used by software. Actually, I'm not using xml (xml-data) to
hold the definitions, but an equivalent ASCII format more suitable for hand
editing and human reading (somewhat like RFC822 headers which can be trivially
reformatted into <tag>...</tag>).
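
A minimal sketch of that kind of reformatting, assuming a simple
"Field: value" layout with indented continuation lines. The field names and
the headers_to_xml function are invented for illustration; the actual file
format is not shown in this message.

```python
from xml.sax.saxutils import escape


def headers_to_xml(text):
    """Turn 'Field: value' lines into <Field>value</Field> lines.

    A line that starts with whitespace continues the previous field,
    as in RFC 822 header folding.
    """
    fields = []  # [name, value] pairs, kept in source order
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if line[0] in " \t" and fields:
            fields[-1][1] += " " + line.strip()  # folded continuation
        else:
            name, _, value = line.partition(":")
            fields.append([name.strip(), value.strip()])
    return "\n".join(f"<{n}>{escape(v)}</{n}>" for n, v in fields)


# Hypothetical sample entry for one element definition.
sample = """Name: zip
Title: Postal code
Example: 94086
Instructions: Enter the 5-digit code
  for the mailing address."""

print(headers_to_xml(sample))
```

Each header becomes one tagged line, and the folded Instructions text is
joined into a single value before it is wrapped.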
The single data file is used to:
1. Create the DTD and/or other definitions used by database software,
defining element types/fields.
2. Create a formatted HTML file which serves as documentation on the schema.
3. Create an alternate form for documentation as 'help' containing
instructions for data preparers. The generated help files contain href names
for links in forms, etc.
4. Create WWW forms used for data entry and edits. Forms can show examples
for fields as a guide for entry.
help window frame when clicking in a field.
Unlike other approaches, all documentation and definitions specific to a
schema/DTD are complete in this single source. There are no separate
(non-extracted) documentation files, misc notes in SGML comments, missing
documentation, etc. common with other approaches.
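
The single-source idea can be sketched as follows. This is only an
illustration under assumptions: the SCHEMA structure, its keys, and the two
generator functions are hypothetical and are not the software described
above. The point is that the DTD and the HTML documentation are both derived
from the same records.

```python
from html import escape

# Hypothetical single-source definitions: each record carries both the
# declaration data used by software and the human-readable documentation.
SCHEMA = [
    {"name": "zip", "content": "(#PCDATA)", "title": "Postal code",
     "description": "Five-digit code of the mailing address."},
    {"name": "city", "content": "(#PCDATA)", "title": "City name",
     "description": "City of the mailing address."},
]


def to_dtd(schema):
    """Emit one <!ELEMENT> declaration per definition."""
    return "\n".join(f"<!ELEMENT {d['name']} {d['content']}>" for d in schema)


def to_html(schema):
    """Emit a definition list documenting each element.

    Each entry gets a named anchor so forms and help pages can link to it.
    """
    rows = []
    for d in schema:
        name = escape(d["name"])
        title = escape(d["title"])
        rows.append(f'<dt><a name="{name}">{name}</a>: {title}</dt>')
        rows.append(f"<dd>{escape(d['description'])}</dd>")
    return "<dl>\n" + "\n".join(rows) + "\n</dl>"


print(to_dtd(SCHEMA))
print(to_html(SCHEMA))
```

Because both outputs come from one list of records, a definition cannot end
up declared but undocumented, or documented but undeclared.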
For enumerated data types, the schema definition data file contains the
values with documentation or else references enumerated values externally
(with documentation). These definitions define the contents of a pull-down
select list or radio/checkbox list in a form with brief
database definitions, the enumerated values are often undefined and/or
> I haven't been making a distinction between regular comments and processed
> comments and have only been using @see and @example formulations. Between
> minimal comments and information in the schema, it is able to do a
> reasonably good job documenting how to use the schema.
As with javadoc, the documentation associated with a DTD shouldn't be a bunch
of arbitrary HTML (IBTWSH) pages. The documentation for an element, etc.
consists of specific kinds of text that are assembled into various kinds of
documentation,
not something normally read in isolation. Javadoc uses the @ notation to
identify the semantics of the documentation text so it can format the content
appropriately, including href links and names.
In XML, of course, <tags> would be used to mark the semantics of the
documentation. In order to synthesize nicely readable HTML documentation and
all the other kinds of extracted documentation such as that mentioned above,
there needs to be a set of documentation elements defined. Failure to
define an appropriate set of tags means that it won't be possible to create
reasonable variants on types of documentation.
The <description> in xml-data is an example of the kind of semantic oriented
(vs presentation) tags needed, and also an example of what is severely wrong
with EDI and other typical database documentation: incomplete, inadequate
documentation. The examples of the xml-data <description> are not really
descriptions, but a few words more aptly called "title". Appropriate
documentation requires much more than a few words.
I would be willing to write up a proposal for a set of elements useful in
documentation, though it will have to be a week from now. Without thinking
about it, the kinds of documentation elements I've used are something like:
<Title> A short (one line) descriptive title
Description: The title is a short description associated with a name or
token, typically 1 line, which provides a more complete phrase than the
single word name. The title might appear with the name in a table of
contents, short summary reference, or 1 line comments found with declarations
in source code.
The title should be an alternative for the name or token, not a duplicate. If
no title other than the original name exists, it should be omitted rather
than reentering the element or attribute name.
<Prompt> A human-oriented phrase serving as an alternative for a name
Description: The name used for elements, attributes, and tokens often
contains abbreviations or other short notation that makes it more suited to
use in software systems. A prompt provides an alternative for use in forms,
questionnaires, or other presentations of the data used by humans.
Language specific alternatives for the name can be supplied using the prompt.
<Example> A sample data value illustrating the form and content
Description: An example serves as a simple means to illustrate the meaning of
an element or attribute, showing how data might be formatted and the
appropriate content. The example might be included in documentation or on a
data entry form. If more than one example is provided, the first one supplied
should be the one selected when only a single example is shown, e.g. on a
data entry form.
<Instructions> Detailed documentation written as preparer instructions
Description: The instructions form a detailed description and documentation
for an element or attribute, but written in the narrative style that might
appear in help documentation or instructions for filling out a form. This
would typically suffice for documentation but is written in a style oriented
for data preparers rather than programmers or readers.
<Description> Detailed documentation defining an element or attribute
Description: The description is a set of paragraphs providing a detailed
definition and other documentation for an element or attribute. This could be
used with the Instructions or as an alternative to Instructions. When used
with Instructions, the Description should be used to provide more detail not
normally of interest to data preparers, such as interpretation within
If the documentation is short, for example a phrase or sentence, then the
Title should be used instead of Description.
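
To illustrate how such semantic tags could drive generated output, here is a
minimal Python sketch that builds one data entry form row. The <ElementDef>
wrapper, its name attribute, and the form_field function are assumptions
made for this example; only the inner tag names come from the list above.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical markup: the <ElementDef> wrapper and its name attribute are
# invented here; the inner documentation tags follow the list above.
SOURCE = """
<ElementDef name="zip">
  <Title>Postal code</Title>
  <Prompt>ZIP code</Prompt>
  <Example>94086</Example>
  <Example>94086-1234</Example>
  <Instructions>Enter the ZIP code of the mailing address.</Instructions>
</ElementDef>
"""


def form_field(xml_text):
    """Build one HTML form row from the documentation elements."""
    elem = ET.fromstring(xml_text)
    name = elem.get("name")
    # Prompt is the human-oriented label; fall back to the raw name.
    label = elem.findtext("Prompt") or name
    # findtext returns the first match, so only the first Example is shown.
    example = elem.findtext("Example") or ""
    hint = f" (e.g. {escape(example)})" if example else ""
    return f'{escape(label)}: <input name="{escape(name)}">{hint}'


print(form_field(SOURCE))
```

The same source could feed a help page by reading Instructions instead, which
is why the tags mark what the text is rather than how it should look.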
Other tags such as the javadoc cross-referencing @ markers are also needed.
There are other tags needed that I don't have time to mention here. You might
have noticed some, e.g. with my use of "Title" and "Description" in the
In contrast to tags marking the semantics of the parts of the documentation,
use of certain presentation-oriented tags, e.g. <font> can cause serious
problems. IBTWSH is a good basis, but I don't think all the tags in this set
are appropriate for XSC documentation. Tags like <hr> should probably be
banned. Tags like <big> and <small> should be banned. The usual use is
something like "<big><b>Something</b></big><br>" because some author doesn't
like the way Netscape spaces <h3>Something</h3>. The usual reason for <small>
is to wrap everything, because some author thinks the fonts are too big when
viewed on his 21" monitor.
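
A restriction like this could be checked mechanically. The sketch below is
one possible approach using Python's standard html.parser module; the BANNED
set simply collects the tags named above and is an assumption, not an agreed
policy, and the banned_tags helper is hypothetical.

```python
from html.parser import HTMLParser

# Assumed ban list, taken from the tags named above; a real policy would
# have to be agreed on for XSC documentation.
BANNED = {"font", "hr", "big", "small"}


class BannedTagFinder(HTMLParser):
    """Collect every presentation-oriented start tag found in a fragment."""

    def __init__(self):
        super().__init__()
        self.found = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag names in lower case.
        if tag in BANNED:
            self.found.append(tag)


def banned_tags(fragment):
    """Return the banned tags used in a documentation fragment, in order."""
    finder = BannedTagFinder()
    finder.feed(fragment)
    finder.close()
    return finder.found


print(banned_tags("<big><b>Something</b></big><br>"))
print(banned_tags("<h3>Something</h3>"))
```

The first fragment is reported because of <big>, while the heading form
passes without complaint.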
I think a good way to understand the requirements for documentation would
be to write the XSC standard/specification in itself. I don't mean simply
encode the DTD as XSC xml, I mean something equivalent to the whole set of
HTML pages (or the complete xml-data document) should be represented as XSC
xml containing Doc elements. By processing the XSC-XSC, the XSC document
(something similar to the current one) would be extracted including the
appendix containing the DTD.
The usual case of a DTD plus some arbitrary side-file documentation means
documentation is not reusable, not readily accessible, and often ambiguous or
incomplete. Specifying *complete* documentation within the xml-data or
xml-xsc definition provides a single referenced file for the schema and in a
reusable form. With distributed Internet applications based on the WWW, easy
access and reuse of documentation (metadata) becomes more important. XSC
documentation has the potential to solve this, but only if implemented
Carl Hage C. Hage Associates
<mailto:carl at chage.com> Voice/Fax: 1-408-244-8410 1180 Reed Ave #51
<http://www.chage.com/chage/> Sunnyvale, CA 94086
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/