Announce: Topic Map Standard out for Final Committee Draft Ballot

Mon Nov 16 22:52:22 GMT 1998

Greetings,

I'm attaching a working proposal I made to the Content Development Working
Group of the VRML Consortium, which I chaired until recently. As it turns
out, Eric Pacquet recently posted to our mailing list, and I will be
forwarding that post to this list momentarily. I am doing so because the
issue of Topic Navigation Maps bears directly on this proposal and includes
the possibility of forming a reflector, if not a cross-standards working
group, aimed at unifying the manageability of information resources on the
web.

I also forwarded this paper to the Enterprise WG list of the VRML
Consortium, which is changing its name to the Web3D Consortium, in case you
didn't know. Daniel Lipkin, the chair of that group, has been concerned with
XML integration, specifically with metadata concepts being included as a
node in VRML files. So, at the risk of seeming overly wide in my interests,
I think Eric's suggestions bear looking at.

Look for that post momentarily.

Ciao,
Rex Brooks

At  8:40 AM 11/15/98 -0600, W. Eliot Kimber wrote:
>At 08:48 AM 11/14/98 -0800, Dave Winer wrote:
>>What kind of applications would we use the Topic Map structures for?
>>
>>It always helps me to understand this kind of stuff if I can understand a
>>compelling application for it.
>
>A topic map consists, fundamentally, of two kinds of things: topics and
>associations. A topic is an object that represents a single rhetorical
>topic or subject. For example, "XML parser" might be one topic in a topic
>map about XML. A topic serves to associate the abstract idea of the topic
>with occurrences of that topic:
>
><topic xml:link="extended" role="topic" id="xml.parser">
><name>
><basename>XML Parser</basename>
></name>
><occur xml:link="locator" role="parser-instance"
>       href="http://www.jclark.com/xp"/>
><occur xml:link="locator" role="parser-instance"
>       href="http://www.microsoft.com/msxml/"/>
></topic>
>
>Notice that this serves to impose the semantic label "XML Parser" onto the
>occurrences addressed by the occur element.  Thus a topic can assert that a
>given object is an occurrence of some kind of thing.  This lets you
>construct a classification or descriptive layer on top of existing data.
>The topics essentially represent opinions about the data.  Different topic
>map authors might express different opinions about the same data. Because
>the form the opinions are expressed in is standardized and consistent
>(topics), they can be reasonably compared to some degree.
>
>Because the topics are expressed formally as hyperlinks (here using XLink,
>but also doable using HyTime), they are naturally navigable using whatever
>hyperlinking support you have lying about (e.g., HyBrick, PHyLIS, etc.).
>
>Associations relate topics to each other. To continue the XML topic map
>idea, I might have a relation "standard-interface-for" that I use to relate
>the topic "XML parser" to the topic "SAX":
>
><assoc role="standard-interface-for" xml:link="extended">
><assocrl role="parser" href="#id(xml.parser)" xml:link="locator"/>
><assocrl role="interface" href="#id(sax)" xml:link="locator"/>
></assoc>
>
>In many ways, this is like RDF: you can impose properties onto data objects
>and relate data objects together using typed links.  It may be that topic
>maps are one way to express RDF abstractions, I don't know (I don't know
>enough about RDF).  But while RDF seems to be designed primarily to support
>the addition or representation of metadata about objects, topic maps are
>designed for the creation of knowledge bases imposed on data of any type,
>and, in particular.  In any case, Topic Maps are not intended to compete
>with RDF--they are, in essence, different views of the same abstraction:
>objects with properties and relations among them.
>
>So what would you use topic maps for? I think one compelling use is as an
>annotative or descriptive layer over things like encyclopedias,
>dictionaries, databases, and the like. They might be used to enhance
>management information systems by providing a simple but rich and
>standardized way to capture analysis applied to existing data, such as
>market reports, sales numbers, etc.
>
>Topic maps can be a way to augment search and retrieval by providing a form
>of index over a larger, more amorphous body of data.
>
>Many documents can be turned into topic maps simply by labeling the
>existing components as topics. For example, you can think of a command
>reference document as a topic map where every command description is a topic.
>
>The reason for standardizing this concept is that it lets you build generic
>topic map engines that understand the specific properties of topics and
>associations and can therefore manage knowledge of those properties in a
>crisp and efficient way, making the information available to processing
>systems. It also allows the meaningful and automatic merging of topic maps
>because it's clear how the components of each relate to each other as objects.
>
>I'm not sure I've answered the question very well, but maybe I've provided
>enough of a taste for what topic maps do that applications will suggest
>themselves.  I've left out a number of important and interesting details in
>the discussion above, but I think I've conveyed the flavor.
>
>Cheers,
>
>Eliot
>--
><Address HyTime=bibloc>
>W. Eliot Kimber, Senior Consulting SGML Engineer
>ISOGEN International Corp.
>2200 N. Lamar St., Suite 230, Dallas, TX 75202.  214.953.0004
>www.isogen.com
></Address>

-------------- next part --------------
Refining the VAN Database Concept

In the weeks leading up to my resignation from chair of the Content
Development Working Group, I worked with Mark Pflaging on some refinements
of the PROTO Repository, while attempting to obtain permission from Niclas
Olafsson for including his VRMLDoc Perl Program, or an adaptation of it.
There were several issues involved, beyond the surface efforts to get the
uploading capacity operational and develop a method for producing
documentation for every PROTO included in the repository.

Those work items were and remain necessary, but one of the tasks that most
needs to be done is to get the Database for the VAN started on the right
track, by which I mean the track it will inevitably follow. This is neither
intuitive from an artistic viewpoint, nor obvious from a business
application-centric software engineering standpoint. It is, in fact,
governed by the ISO/IEC conventions under development for the standardized
representations of information or data. My goal is to apply some
commonsensical adaptations of those developing standards to OUR developing
standard. (There is no doubt about it, I was born WAY too soon.)

One of the primary discussions we had early on in the CDWG was about what
we wanted the VAN to be. This discussion centered on the ease of use for
content creators who are primarily artists and new to VRML. These are the
fabled vast majority of users who will create the content necessary to
spread VRML to ubiquity.

However, until a much wider audience of VRML users is created, the
authors who will be populating the PROTO Repository of the VAN will mostly
NOT be artists, but computer enthusiasts and artists who happen to be VRML
enthusiasts. That's an important distinction to make right now because the
people we want to encourage to take up VRML content creation are not going
to know how to search our repository on their own to find the components we
want them to find to make that content. And they are going to have to be
taught VRML as well.

If you revisit the VAN Proposal, you will see that providing a means for
potential content creators to learn VRML and then move directly into using
the components in the VAN repositories is built in. However, it needs work,
real work in the writing and in the programming. I will certainly do what I
can.

Since the need of the Consortium right now is to get an operational
implementation of the VAN in place, I want to see to it that the ultimate
creation of the tools to make content creation EASIER goes on while that
simpler implementation is executed. Simpler in this case means simpler in
terms of programming and use by the computer literate, not for use by our
most important target audience. This needs to be understood so that we
don't end up "grandfathering" methodology that will be awkward and clumsy
to carry into the future.

What this preamble means is that the ultimate, "high-level front end" of
the PROTO Repository, and the other repositories of the VAN, has to be
easily searchable, where easily means easily usable by non-experts. One of
the most daunting elements of computer use by non-experts is the avalanche
of acronyms currently in use, so that is one of the first things we need to
avoid. Right after acronyms is jargon. What this means is that our VAN
needs to be searchable in plain language.

Unfortunately, plain language has almost nothing to do with database
design. So we have to take the database technology as we find it, and try
to make it work for our use, AND into the future of both our use and
database technology development. This need to have our database be
supported into the future is the primary reason why we decided we wanted a
Proprietary Database Solution specifically purchased or donated with the
future support written in. We chose Oracle at our face-to-face meeting at
SIGGRAPH98 in late July, and I specified Oracle 8i with a Silver Support
Plan when Platinum asked for the specification in late October, at which
time it was scheduled for release on November 4, 1998.

So, that's where things stood when I resigned. In the week since, I have
thought about this quite a bit. Michael Wagner has suggested that this
would make a good Master's Thesis, and it might very well. I will certainly
be happy to pass my thoughts on to the academic folks he has suggested, and
to work with them, if that is appropriate for them, but I think this needs
to be pursued by the content group for the original purpose for which it
was conceived.

It does not need to be a top priority for showing in February at VRML99.
Parallel development cannot help but benefit us all, especially if we are
all working with "real" multimedia elements in a "real" operational
context--not just in an academic "modelled" context.

Our next wave of VRML content creators will mostly be coming out of the
business world of advertising and media and communications, though,
hopefully, some will come from academia as well. So our immediate need,
beyond the most immediate need to demonstrate a proof of our concept at the
Symposium, is to make a useful, practical Resource Bank that can be put to
use next year in the world of commerce, which will pave the way for the
further spread of VRML.

Of course, that can't be our only goal. We need to make sure VRML is
taught, and used for teaching.

This is all a win-win-win situation, if we can all work together. Let's not
leave anyone out, not Macintosh users, not Linux users, not SGI users, not
Solaris users, not even NT users. And that means, as near as I can manage
to make out, an XML-enabled, Object-Oriented, Web-Accessible, Interoperable
Java-based Database built on the principle of practical usability, and
that means searchability by non-experts.

As you may have reckoned by now, I actually have an operational plan to
propose, based on the work we have already done in the PROTO Repository
prototype of Mark Pflaging, and the latest developments in XML, and in our
own Enterprise Working Group, too. See what you think.

I propose creating a software program capable of taking information about
incoming PROTO submissions based on several top-level character data fields
in a table form similar to what Mark Pflaging has been using in the PROTO
Repository prototype, and using that information to set the classification
of the PROTO in the Repository. If possible, it would be desirable for the
program also to create some basic, standard documentation of the PROTO
that could optionally be included when downloading the PROTO. Most of this
is already accomplished, but deserves mentioning nonetheless.

These fields would be the first data parsed by the program and would be the
keys to placing the PROTO in the repository. These fields would be:

1. PROTO Name: This will be the name of the PROTO as written in VRML,
which we should encourage our contributors to prepare in the Recommended
Naming Convention form we have elucidated on our website in the example:

* PROTO TwoColorTable [field...
* TwoColorTablePROTO.wrl
* TwoColorTableExample.wrl
* TwoColorFurniturePackage.wrl#TwoColorTable

The common reference to this comprehensive PROTO naming convention is the
"Verbose Naming Convention".

2. Author's Name: Fairly self-explanatory, I assume.

3. Date of Original Submission/Date of Last Revision/Version Number

4. Type of VRML PROTO: This is a new item in the Database Schema, and will
have to be checked for conformance with recognized standards, but I
envision this field as having the following data:

    A. Primary (Simple?) Static Component: This means that the PROTO
    contains no references to other objects, PROTOs, or multiple
    ImageTextures and has no mechanism to change state. This would refer
    mostly to geometry objects and sound objects, but possibly to
    MovieTextures or MPEG-4 mediatypes. This would also be the level of the
    VAN that would house elements not included in the UMEL.

    B. Secondary (Complex?) Static Component: This means that the PROTO
    contains references to other objects or PROTOs and may contain multiple
    ImageTextures, but has no mechanism to change state. This would refer to
    compound objects, object collections or scene graphs.

    C. Primary (Simple?) Dynamic Component (Behaviors): This means that the
    PROTO contains a single trigger that uses an EventIn and EventOut to
    send values to a ROUTE and cause a change of state.

    D. Secondary (Complex?) Dynamic Component (Behaviors): This means that
    the PROTO contains multiple triggers and Events, ROUTEs and changes of
    state.

5. Author's Descriptions: I propose that we discuss and arrive at a
standard set of Description Conventions similar in intent to the
Recommended Naming Convention, aimed at developing a secondary search, and
more advanced search mechanism based on the functions of the PROTOs, such
as Variable Transparency or Remote Object Detection as a kind of relational
refinement of the Type of VRML PROTO. This would probably have to be in a
second set of fields.
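
As a rough illustration only, the five fields above could be held in a
record like the following Python sketch. The class name, the attribute
names, the placeholder author and dates, and the helper functions are all
assumptions made for this example; none of them come from the PROTO
Repository prototype. The file names follow the pattern of the naming
example in item 1, and the description search is a plain substring match
standing in for whatever Description Conventions are eventually agreed.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List


@dataclass
class ProtoSubmission:
    """One repository entry built from the five top-level fields (hypothetical)."""
    proto_name: str        # field 1, e.g. "TwoColorTable"
    author: str            # field 2
    submitted: date        # field 3: date of original submission
    last_revised: date     # field 3: date of last revision
    version: str           # field 3: version number
    proto_type: str        # field 4: "A", "B", "C" or "D"
    descriptions: List[str] = field(default_factory=list)  # field 5

    def verbose_names(self, package: str) -> Dict[str, str]:
        """File names following the pattern of the naming example."""
        return {
            "proto_file": f"{self.proto_name}PROTO.wrl",
            "example_file": f"{self.proto_name}Example.wrl",
            "package_reference": f"{package}.wrl#{self.proto_name}",
        }


def find_by_description(records: List[ProtoSubmission],
                        term: str) -> List[ProtoSubmission]:
    """Secondary search: case-insensitive match of a term against descriptions."""
    wanted = term.casefold()
    return [r for r in records
            if any(wanted in d.casefold() for d in r.descriptions)]


table = ProtoSubmission(
    proto_name="TwoColorTable",
    author="Example Author",        # placeholder, not a real contributor
    submitted=date(1998, 11, 1),    # placeholder dates
    last_revised=date(1998, 11, 10),
    version="1.0",
    proto_type="A",
    descriptions=["Variable Transparency"],
)
matches = find_by_description([table], "transparency")
```

A search for "transparency" finds the table entry here because the match
ignores case; a real implementation would need something richer than
substring matching to be usable in plain language.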

This set of fields (or sets of fields, if a second set is necessary for a
more advanced top-level search of the database) should be able to classify
all PROTOs that VRML currently allows. It does not specifically accommodate
multiple mediatypes or streaming of components, but shouldn't disallow
either.
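
The four-way Type of VRML PROTO classification in item 4 can be sketched as
a small decision function. This is a minimal Python sketch under stated
assumptions: the feature counts are taken as already extracted by some
parser, the names ProtoType, ProtoFeatures and classify are invented for
this example, and the rule that the dynamic categories depend only on the
number of triggers is my reading of the descriptions, not a settled part of
the schema.

```python
from dataclasses import dataclass
from enum import Enum


class ProtoType(Enum):
    PRIMARY_STATIC = "A"
    SECONDARY_STATIC = "B"
    PRIMARY_DYNAMIC = "C"
    SECONDARY_DYNAMIC = "D"


@dataclass
class ProtoFeatures:
    """Counts assumed to be supplied by a parser (hypothetical structure)."""
    references: int      # other objects or PROTOs referenced
    image_textures: int  # ImageTexture nodes contained
    triggers: int        # mechanisms that cause a change of state


def classify(features: ProtoFeatures) -> ProtoType:
    """Apply the A-D categories to pre-extracted feature counts."""
    if features.triggers == 0:
        # Static: split on references and multiple ImageTextures.
        if features.references == 0 and features.image_textures <= 1:
            return ProtoType.PRIMARY_STATIC
        return ProtoType.SECONDARY_STATIC
    # Dynamic: a single trigger is Primary, several are Secondary.
    if features.triggers == 1:
        return ProtoType.PRIMARY_DYNAMIC
    return ProtoType.SECONDARY_DYNAMIC


simple_table = classify(ProtoFeatures(references=0, image_textures=1, triggers=0))
```

Keeping the decision in one function makes it easy to adjust if the
categories are revised after being checked against recognized standards.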

This proposal does not purport to produce or replace the searchable
taxonomy of attributes we discussed earlier, and which is included (though
not completed) in the current PROTO Repository prototype. I believe that
this taxonomy of attributes can possibly be accomplished by applying the
principles that form the recent Proposed Standard for Topic Navigation
Maps. This is an approach we haven't pursued, but which appears to be
workable, and would certainly make our repository more interchangeable with
the rest of the ISO/IEC standards.

I believe that two concepts at work in XML deserve to be considered as we
go forward in our attempt to develop the relational attribute taxonomy that
forms the heart of the Database, as opposed to this top-level sorting by
object category. These are Topics, which will allow groups of functionally
similar PROTOs to be easily searched and retrieved, and Groves, which
similarly group documents.
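
To make the Topics idea concrete, here is a minimal Python sketch of an
in-memory index that treats each plain-language topic as a label over the
PROTO files that are occurrences of it. It does not implement the Topic
Navigation Maps syntax; the TopicIndex class, its methods, and the
GlassPanelPROTO.wrl file name are assumptions made purely for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Set


class TopicIndex:
    """Maps topic names to the PROTO files that are occurrences of them."""

    def __init__(self) -> None:
        self._occurrences: Dict[str, Set[str]] = defaultdict(set)

    def add_occurrence(self, topic: str, proto_ref: str) -> None:
        # Topic names are case-folded so searches ignore capitalization.
        self._occurrences[topic.casefold()].add(proto_ref)

    def occurrences(self, topic: str) -> List[str]:
        """All PROTO files filed under a topic, in sorted order."""
        return sorted(self._occurrences.get(topic.casefold(), set()))

    def topics_for(self, proto_ref: str) -> List[str]:
        """All (case-folded) topics a given PROTO file is filed under."""
        return sorted(t for t, refs in self._occurrences.items()
                      if proto_ref in refs)


index = TopicIndex()
index.add_occurrence("Variable Transparency", "TwoColorTablePROTO.wrl")
index.add_occurrence("Variable Transparency", "GlassPanelPROTO.wrl")  # hypothetical file
index.add_occurrence("Furniture", "TwoColorTablePROTO.wrl")
transparent = index.occurrences("variable transparency")
```

Because one PROTO can sit under several topics, the same file can be found
by function ("Variable Transparency") or by kind ("Furniture") without
changing where it is stored.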

Please see this example of a Web-Based XML Application as an example of how
the SGML Tree Structure can work with the Grove concept and imagine how, as
an architectural model, VRML could be integrated within this structure,
and, more importantly, how this structure could be much more easily
depicted in VRML.

http://inf2.pira.co.uk/Grif/inria.html

My intent in this working proposal is to shape the database in the
direction of conformance with the ongoing development of XML and the recent
publication of the Topic Navigation Map Proposed Standard.

This proposal does not, as far as I can tell, require any fundamental
changes in the work Mark Pflaging and Gavin Bell and Daniel Woods have done
so far.

The VRML Parser is an excellent tool and a great addition to the PROTO
Repository. I was going to suggest something like this, but I guess that
would be redundant.

However, there are two items that I haven't specifically addressed in this
proposal.

First is the necessity of creating documentation for every PROTO that goes
into the Repository, whether by deriving it, by requiring it, or in some
other fashion, such as employing something along the lines of Niclas
Olafsson's VRMLDoc. Downloading it would be optional.

Second is the necessity for creating an open source license with options
that can satisfy the spectrum of interests we know to exist amongst our
pool of potential contributors in a way that protects the Consortium and
the CDWG from liability for the warrantability of the VAN's
contents, and protects the copyrights of the authors as the authors
require. As a co-chair of the VRML-IPR Task Group, I can only say that we
are working on it. We know that we will require a checkbox or radio button
that must be checked or marked off to signify acknowledgement of our
license requirements and disclaimers. This will occur on both the
submission and the downloading of components.

Well, that's it for right now. That's a lot more than enough for my poor
brain. It's full. Can't take no more.

Ciao,
Rex