XML-Data: advantages over DTD syntax?

Fri Oct 3 05:13:07 BST 1997

Rick Jelliffe wrote:
> 
> 2) Not generalized: Presumably XML-data is being developed
> to solve some real problem (they wouldn't be paying someone
> to come up with good ideas that just sound good, would they?).
> Since we do not have access to the problems that XML-data is
> supposed to solve, we have no way of testing whether it does
> indeed provide a good generalized approach that is significantly
> better or more flexible than SGML.

That is the requirements problem.

> (Reading between the lines, I think XML-data may be targeted
> at retrofitting slack HTML documents with inline generalized
> markup.  

Hmm, could be.  It may also be applicable to metatagging 
databases for template designs in relational systems 
that include fragments for tables.  Look at the template 
concepts in the latest developers mag for MS users. 

Look:  this IS the problem with no requirements.  We have 
no real way of knowing why a different schema of the 
same functional capability is required.  That is why I 
keep asking: what does it do we can't do with XML and a DTD? 
I'm not fighting it: I want to know why.  

> In other words, it probably has the assumption that
> we cannot use DTDs, because the target data is so slack that
> no DTD is possible. So they need something that can just
> markup parts of the data. This is an interesting problem,
> and one that ISO 8879 clearly does not address, except by
> external HyTime/XLL pointers into data, I guess.

Well, that is an interesting problem then.  If we want 
to go out there a bit further, we can talk about it 
as term vectors, velocity, voxels, etc.  We can have 
a self-adaptive data structure that automatically classifies 
elements by spatial distribution (see works of Matthew Chalmers).  
As a matter of fact, the use of DTDs or whatever schema could 
make his algorithms using force vectors work even better by 
reducing the stress factors prior to the computational 
cycles.

That actually works.  Easy to navigate to in 3D.  Do 
I want really lossy schemata with regard to frequency 
and occurrence ranges to do that?  No, I don't think so. 

> Can anyone in the XML-data conspiracy confirm this? It is
> a wild guess :-)

<rant>I think oligarchy is the right word, not conspiracy.  
It is the W3C policy for process.  Not a good deal for the 
community really.  Too many of us have to sit on 
this list with our pens too silent about the discussions, 
decisions and the reasons.  That is neither fair nor optimum.</rant>

> 3) Not markup: The approach of not separating out what
> can be known about data ahead of time (or for every
> instance) from the details of the particular instance
> is justified under the slogan "all metadata is data",
> which avoids the question "should data that has different
> significance be marked-up differently?".

A rose is a rose is a rose... kinda meaningless.
A rose(1) is a rose(2) is a rose(3) is what she meant. ;-)

> 4) Not a human-readable language.  Content models are simple,
> terse and convenient.  Of course, diagrams etc are simpler.
> But XML-data's verbosity will make reading any kind of
> lengthy DTD more complicated.

To me that means parsing in my head as I read the model 
is harder.  However, equivalent expressiveness isn't a 
sufficient reason to use another notation for the 
schemata unless other benefits are found that are 
compelling.  I don't think teaching DTDs to XML 
newbies is going to be that hard.  Never was to 
SGML newbies until I made them read very complex 
DTDs.  

Now, here is an interesting issue:  when the 
schema/DTD becomes complex, or an aggregate emerges by 
automated means (eg, using the force system to 
classify aggregates), how hard will the 
schemata of either syntax be to read?  IOW, 
people wanted parameter entities very badly.
This may be where they count for something 
more than string substitution because they are 
one way to label topical aggregation in an
automated classification system such as 
Chalmers describes.

> b) Order of extended element content models is not clear.
> If I derive a "cat" element type from "animal" element type
> and say it can also contain a "purr-volume" element type,
> is there any way of constraining where this element
> can go?  Ordering information in content models is vital
> for many processing applications, and for integrity.
> Can such additional element types go anywhere, or only
> at the end, or where?

That gets to the point of the DTD:  frequency and occurrence 
are explicitly defined.  That is the really powerful part of 
SGML with regard to a hierarchical instantiation.
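
To make that concrete, here is a minimal sketch of how a content 
model pins down order and occurrence.  It is only my illustration: 
it treats a model as a regular expression over child element names, 
handles nothing beyond names, ",", "|", grouping and "? * +", and 
the helper names (compile_model, conforms) and the "cat" model are 
made up for the example.  It is not how any particular SGML or XML 
parser does it.

```python
import re

def compile_model(model):
    """Translate a tiny DTD-style content model into a regex over child names.

    Each child name is encoded as 'name;' so adjacent names cannot run together.
    """
    tokens = re.findall(r"[A-Za-z_][\w.-]*|[(),|?*+]", model)
    parts = []
    for tok in tokens:
        if tok == ",":
            continue  # a sequence is plain concatenation in a regex
        elif tok == "(":
            parts.append("(?:")
        elif tok in (")", "|", "?", "*", "+"):
            parts.append(tok)  # same meaning in a regex as in the model
        else:
            parts.append("(?:" + re.escape(tok) + ";)")
    return re.compile("".join(parts))

def conforms(model, children):
    """True if the ordered list of child element names satisfies the model."""
    encoded = "".join(name + ";" for name in children)
    return compile_model(model).fullmatch(encoded) is not None

# Hypothetical model: one purr-volume, then one or more paws, in that order.
CAT = "(purr-volume, paws+)"
print(conforms(CAT, ["purr-volume", "paws", "paws"]))
print(conforms(CAT, ["paws", "purr-volume"]))
```

The first call satisfies the model; the second has the same elements 
in the wrong order and is rejected.  That is the ordering constraint 
Rick is asking about.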

> c) Is there any way of preventing derived elements from
> adding additional tags in particular places?
> 
> I think one weakness in current SGML is the lack of a
> #ANY  keyword for use inside content models, e.g.
>  <!ELEMENT cat (purr-volume, #ANY?, paws+)>

I like that.
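
A rough sketch of how such a keyword might behave, under my own 
assumption that #ANY stands for exactly one element of any type 
(Rick does not spell out the semantics, and nothing like this exists 
in ISO 8879).  The function names are hypothetical, and the model is 
again treated as a regular expression over child names:

```python
import re

NAME = r"[A-Za-z_][\w.-]*"

def model_to_regex(model):
    """Content model -> regex; '#ANY' is assumed to match any one element."""
    out = []
    for tok in re.findall(r"#ANY|" + NAME + r"|[(),|?*+]", model):
        if tok == ",":
            continue  # sequence is concatenation
        if tok == "#ANY":
            out.append("(?:[^;]+;)")  # any single encoded child name
        elif tok == "(":
            out.append("(?:")
        elif tok in (")", "|", "?", "*", "+"):
            out.append(tok)
        else:
            out.append("(?:" + re.escape(tok) + ";)")
    return re.compile("".join(out))

def allowed(model, children):
    """True if the ordered child names fit the model."""
    encoded = "".join(name + ";" for name in children)
    return model_to_regex(model).fullmatch(encoded) is not None

CAT = "(purr-volume, #ANY?, paws+)"
print(allowed(CAT, ["purr-volume", "collar", "paws"]))   # extra element in the open slot
print(allowed(CAT, ["collar", "purr-volume", "paws"]))   # extra element ahead of purr-volume
```

Under that assumption the first sequence is accepted and the second 
is not: a derived type can add one element, but only in the slot the 
base model left open.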

> XML-data
> really does not seem to have thought of content models
> as being a tool to manage information (i.e. to frustrate
> well-intentioned users from adding inappropriate elements in
> mission-critical places), just to describe it.

Properties of objects in which the object library already 
knows the things the DTD would tell it?

len

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev at ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/