Handling unknown elements?

Thu Apr 9 03:12:04 BST 1998

>>One dilemma I have been trying to figure out with XML is the problem of
>>handling unknown element types and what to do with their children.
>>For simple tree based data modeling this is pretty simple, if you come
>>across an unknown element that the application does not understand, you
>>just ignore it and all of its children.

XML is designed to work without a DTD, so an element that does not appear in
the DTD should probably still be rendered by the user agent.

A style sheet is another source of knowledge about an element. If neither
the style sheet nor the DTD refers to an element, the user agent should
probably just render it as an inline flow object.
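
A rough sketch of that inline fallback, in Python with the standard
xml.sax module. The handler name, the "known" set and the (text, ancestors)
output shape are assumptions made for illustration; they stand in for
whatever a real style sheet lookup would provide. Unknown elements are
treated as transparent, so their character data is kept and attributed to
the nearest known ancestors.

```python
import xml.sax


class InlineFallbackHandler(xml.sax.ContentHandler):
    """Collect text runs, tagging each with the known elements around it."""

    def __init__(self, known):
        super().__init__()
        self.known = set(known)  # hypothetical: names the style sheet covers
        self.stack = []          # known ancestors of the current position
        self.runs = []           # (text, tuple of known ancestors)

    def startElement(self, name, attrs):
        # Unknown elements are not pushed: they behave like plain inline flow.
        if name in self.known:
            self.stack.append(name)

    def endElement(self, name):
        if name in self.known:
            self.stack.pop()

    def characters(self, content):
        text = content.strip()
        if text:
            self.runs.append((text, tuple(self.stack)))


def render_runs(xml_text, known):
    """Parse xml_text and return the text runs with their known ancestors."""
    handler = InlineFallbackHandler(known)
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.runs


runs = render_runs("<B><I>Foo</I><I>Bar</I></B>", {"B"})
```

Here the text inside the unknown <I> elements still comes out, styled only
by the <B> that the application does understand.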

As for the attributes and their values, if there is no DTD the user agent
should probably build an array of the attributes it comes across, so that
they can be queried.
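
Something like the following would do for the attribute array; this is only
a sketch using Python's standard xml.sax module, and the class and function
names are made up for the example. With no DTD, every attribute value
arrives as a plain string.

```python
import xml.sax


class AttributeCollector(xml.sax.ContentHandler):
    """Record the attributes of every start tag, in document order."""

    def __init__(self):
        super().__init__()
        self.seen = []  # one (element name, {attribute: value}) per start tag

    def startElement(self, name, attrs):
        # Copy the attributes so they remain queryable after parsing ends.
        self.seen.append((name, dict(attrs.items())))


def collect_attributes(xml_text):
    """Parse xml_text and return the recorded (name, attributes) pairs."""
    handler = AttributeCollector()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.seen


seen = collect_attributes('<doc lang="en"><note id="n1">hi</note></doc>')
```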

If there is a DTD, the user agent should have the option of either validating
and reporting an error if the document does not comply, or just checking
for well-formedness.

It would be nice if it did both.
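
The well-formedness half is easy to sketch. The function below is an
illustration only (its name and return shape are assumptions); it uses
Python's standard xml.sax module, whose bundled expat parser is
non-validating, so checking a document against a DTD would need a separate
validating parser.

```python
import xml.sax


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, message)."""
    try:
        # A bare ContentHandler ignores all events; only parse errors matter.
        xml.sax.parseString(xml_text.encode("utf-8"), xml.sax.ContentHandler())
    except xml.sax.SAXParseException as exc:
        return False, f"line {exc.getLineNumber()}: {exc.getMessage()}"
    return True, None


ok, error = is_well_formed("<a><b></a>")  # <b> is never closed
```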

Frank

-----Original Message-----
From: Tyler Baker <tyler at infinet.com>
To: xml-dev at ic.ac.uk <xml-dev at ic.ac.uk>
Date: Wednesday, April 08, 1998 3:47 PM
Subject: Handling unknown elements?

>One dilemma I have been trying to figure out with XML is the problem of
>handling unknown element types and what to do with their children.
>
>For simple tree based data modeling this is pretty simple, if you come
>across an unknown element that the application does not understand, you
>just ignore it and all of its children.
>
>However, what if, as in the case of HTML, an application has mixed
>content where it understands the <B> tag for boldface text but does not
>understand the <I> tag for italicized text?  The actual character data
>may be a child of the <I> element in this case.
>
>In case anyone would like to know, I have designed an XML application
>framework that for now works fine for tree-based data modeling, but it
>really will have problems with documents that have all sorts of elements
>(and their properties) applied to the character content, as opposed to
>tree-based data modeling, where you simply have elements as nodes
>and the leaf nodes have the actual character content stored in them.
>
>The only alternative for documents is to use something like a DOM tree
>or else an event-based parser.  The framework I have designed is pretty
>much what you could call object-based, in the sense that when the parser
>encounters a start or empty-element tag it retrieves its name and asks
>the current parent element for an element to handle that tag's
>attributes and content.
>
>Does anyone have any ideas for a solution that could be both
>object-based and document-based?
>
>I have thought of maybe having an opaque "UNKNOWN" element handler
>object that would forward all queries for finding child elements to its
>parent element.  The problem with that is how you know, and tell the
>application, whether a particular tag should be treated as an
>object-based tag, where all of its children should certainly be
>ignored, or whether you should simply join all of its children
>(symbolically) to the "UNKNOWN" tag's parent tag.
>
>I know this might seem a little convoluted, but here is what I am trying
>to say, in XML:
>
><B>
>    <I>
>        Foo
>    </I>
>    <I>
>        Bar
>    </I>
></B>
>
>Using the opaque "UNKNOWN" element it would look like this in tree form
>if the <I> tag were unknown:
>
>                 <B>
>          |               |
>      <UNKNOWN>       <UNKNOWN>
>          |               |
>        "Foo"           "Bar"
>
>Symbolically this could be represented as simply:
>
>                 <B>
>          |               |
>        "Foo"           "Bar"
>
>Which in document format would evaluate to:
>
>                 <B>
>                  |
>              "FooBar"
>
>However, if I were to do all of this in object format, any unknown child
>elements of <B>, which in this case happens to be the <I> element, would
>be skipped, as well as all of the other sub-elements contained in <I>,
>regardless of their type.
>
>The only solution I can possibly think of to this dilemma is to have
>each element object carry a boolean flag that tells the XML application
>framework (which happens to be a parser now but could easily be built on
>top of SAX in half an hour) whether to ignore unknown child elements or
>else to join the children of unknown child elements as its own children.
>
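A minimal sketch of the boolean-flag idea above, in Python with the
standard xml.etree.ElementTree module. The function name, the "known" set
and the "join_flags" mapping (known element name to flag) are all
hypothetical; they only illustrate one way the flag could drive the
traversal. A known element either splices in the content of its unknown
children or skips those children entirely.

```python
import xml.etree.ElementTree as ET


def text_runs(elem, known, join_flags, join=None):
    """Return the text runs under elem, in document order.

    known:      set of element names the application understands
    join_flags: hypothetical per-element flag; True means "join the children
                of unknown child elements", False means "ignore them"
    join:       flag inherited from the nearest known ancestor (internal)
    """
    if join is None:
        join = join_flags.get(elem.tag, False)
    runs = []
    if elem.text and elem.text.strip():
        runs.append(elem.text.strip())
    for child in elem:
        if child.tag in known:
            # A known child applies its own flag to its own subtree.
            runs.extend(text_runs(child, known, join_flags))
        elif join:
            # Unknown child, join mode: treat its content as the parent's.
            runs.extend(text_runs(child, known, join_flags, join=True))
        # Unknown child, ignore mode: skip it and everything inside it.
        if child.tail and child.tail.strip():
            runs.append(child.tail.strip())  # tail text belongs to the parent
    return runs


doc = ET.fromstring("<B><I>Foo</I><I>Bar</I></B>")
joined = "".join(text_runs(doc, {"B"}, {"B": True}))
ignored = text_runs(doc, {"B"}, {"B": False})
```

With the flag set on <B>, the two runs join into "FooBar" as in the last
diagram; with it cleared, both <I> subtrees are dropped and nothing is left.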
>Anyone here got any better ideas on this?
>
>Tyler
>
>
>xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
>Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
>To (un)subscribe, mailto:majordomo at ic.ac.uk the following message;
>(un)subscribe xml-dev
>To subscribe to the digests, mailto:majordomo at ic.ac.uk the following message;
>subscribe xml-dev-digest
>List coordinator, Henry Rzepa (mailto:rzepa at ic.ac.uk)
>
>