J'ai décide d'écrire un livre sur l'Espace et le Temps
à l'intention du grand public après les conférences Loeb
que j'ai données à Harvard en 1982.
- foo
blabla
becomes blabla Two
words Twowords
Two
words Two words
Two words
blablaSPbloblo
We should
get rid of
line end
codes
We should get rid of line end codes
He said:
com.ms.xml.ParseException: Missing entity eacute
    at com.ms.xml.Parser.error(Parser.java:110)
    at com.ms.xml.Parser.scanEntityRef(Parser.java:440)
    at com.ms.xml.Parser.scanText(Parser.java:395)
    at com.ms.xml.Parser.parseText(Parser.java:1223)
    at com.ms.xml.Parser.parseElement(Parser.java:1081)
    at com.ms.xml.Parser.parseDocument(Parser.java:643)
    at com.ms.xml.Parser.parse(Parser.java:47)
    at com.ms.xml.Document.load(Document.java:171)
    at msxml.main(msxml.java:50)
Thanks for any help...
Pat.
==============================================================
bonhomme@loria.fr | Office : B.228
http://www.loria.fr/~bonhomme | Phone : 03 83 59 20 37
--------------------------------------------------------------
* Projet Aquarelle
http://aqua.inria.fr
* Serveur Silfide
http://www.loria.fr/Projet/Silfide
* Multilingual Concordancing
http://www.loria.fr/~bonhomme/lingua/
==============================================================
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From tbray at textuality.com Fri Sep 5 01:40:56 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:58:24 2004
Subject: Lark 0.91 available
Message-ID: <3.0.32.19970904163748.00838210@pop.intergate.bc.ca>
Hi - Lark 0.91 is now available at
http://www.textuality.com/Lark
Only one real difference - it now does Unicode. It reads the BOM and thus
UCS-2/UTF-16 (even byte-swaps); if there's no BOM, reads and tries to
use the encoding declaration, boots it if it says anything but "UTF-8" or
"UTF8". Successfully parses Murata-san's translation of the XML
spec, would love to get my hands on some more internationalized
XML; in particular with non-ASCII markup.
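The encoding detection described above can be sketched roughly as follows. This is an illustration in Python, not Lark's actual code: the function name `sniff_encoding` and the regular expression are assumptions made for the example. It checks for a UTF-16 byte order mark first and otherwise falls back to the encoding declaration, rejecting anything but "UTF-8" or "UTF8".

```python
import re

# Hypothetical helper, not part of Lark: guess the encoding of an XML byte stream.
ENCODING_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z0-9._-]+)["\']'
)

def sniff_encoding(data: bytes) -> str:
    # A byte order mark settles the question (and tells us the byte order).
    if data.startswith(b"\xfe\xff"):
        return "utf-16-be"
    if data.startswith(b"\xff\xfe"):
        return "utf-16-le"
    # No BOM: look at the encoding declaration, if there is one.
    match = ENCODING_DECL.match(data)
    if match is None:
        return "utf-8"
    declared = match.group(1).decode("ascii").upper()
    if declared not in ("UTF-8", "UTF8"):
        raise ValueError(f"unsupported encoding declaration: {declared}")
    return "utf-8"
```

A caller would pass the first few hundred bytes of the document and decode the rest with the returned codec name.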
Another 6K of .class files for I18n, sigh.
Lots of bug-fixes in the event-stream module. I had to write a
significant event-stream Lark application to pull the character classes
out of the XML spec in order to build the CharClasses.java file, and
ran across a few bodacious bugs in end-tag handling.
It's a bit bogus because it really doesn't do UTF-8 yet, just ASCII
masquerading as such. UTF-8 Real Soon Now.
Cheers, Tim Bray
tbray@textuality.com http://www.textuality.com/ +1-604-708-9592
From jjc at jclark.com Fri Sep 5 07:10:54 1997
From: jjc at jclark.com (James Clark)
Date: Mon Jun 7 16:58:24 2004
Subject: Character classification
References: <91B7E292027DCF1195CD08002BB690B002457407@RED-93-MSG>
Message-ID: <340F92EB.E84772CB@jclark.com>
Istvan Cseri wrote:
>
> For better speed I would suggest an alternative solution: use a quick
> array lookup for characters below 256 and go to the more expensive
> method above... It will do wonders with your parser.
Except of course when you're parsing non-Latin scripts.
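For reference, the quoted suggestion amounts to something like the sketch below (Python, for illustration only). `slow_is_letter` is a stand-in for whatever the "more expensive method" is in a real parser; here it simply borrows Python's own `str.isalpha`, which is an assumption and not the XML character classes.

```python
def slow_is_letter(ch: str) -> bool:
    # Stand-in for the expensive classification; a real parser would use
    # the character classes from the XML spec instead.
    return ch.isalpha()

# Precompute the answer once for code points 0..255.
LATIN1_IS_LETTER = [slow_is_letter(chr(cp)) for cp in range(256)]

def is_letter(ch: str) -> bool:
    cp = ord(ch)
    if cp < 256:
        return LATIN1_IS_LETTER[cp]  # fast path: one array lookup
    return slow_is_letter(ch)        # slow path for everything else
```

Text in non-Latin scripts takes the slow path for nearly every character, which is the objection raised here.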
There's another technique which exploits the fact that characters on the
same page often have similar properties, and this is true even more so
for characters in the same column.
The idea is to have a three-level table, the first level with 256
entries, the second and third levels with 16 entries. The entries for
the first and second levels are a (possibly null) pointer to a sub-table
plus a value; the entries for the third level are just values. To look
up the value for a character, you use the high 8 bits to index into the
first-level table; if the pointer part of the entry is null, then return
the value part of the entry; otherwise use the sub-table addressed by
the pointer; use the next 4 bits to index into that in a similar way,
and, if necessary, the bottom 4 bits to index into the bottom table.
This is I believe quite a well-known technique; I got it from Glenn
Adams.
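A minimal sketch of the three-level table, written in Python for illustration. It is not the SP code; the class and method names are made up for the example, and it assumes 16-bit code points only. First- and second-level entries are `[sub_table_or_None, value]` pairs, and third-level entries are plain values, as described above.

```python
class CharMap:
    """Three-level lookup table for 16-bit code points (256 x 16 x 16)."""

    def __init__(self, default=0):
        # Every page starts out unsplit: no sub-table, one value for all 256 cells.
        self.pages = [[None, default] for _ in range(256)]

    def lookup(self, cp: int):
        page, value = self.pages[cp >> 8]       # high 8 bits
        if page is None:
            return value
        column, value = page[(cp >> 4) & 0xF]   # next 4 bits
        if column is None:
            return value
        return column[cp & 0xF]                 # bottom 4 bits

    def set(self, cp: int, value):
        entry = self.pages[cp >> 8]
        if entry[0] is None:
            # Split the page: each column inherits the page-wide value.
            entry[0] = [[None, entry[1]] for _ in range(16)]
        col_entry = entry[0][(cp >> 4) & 0xF]
        if col_entry[0] is None:
            # Split the column: each cell inherits the column-wide value.
            col_entry[0] = [col_entry[1]] * 16
        col_entry[0][cp & 0xF] = value
```

Pages and columns whose characters all share one value never allocate a sub-table, so a lookup costs at most three index operations and the table stays small for scripts with uniform properties.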
You can use this to implement case-folding by storing the difference
between a character and its upper-case equivalent modulo 2^16.
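The arithmetic behind the case-folding trick can be shown in a few lines of Python. This is only an illustration under two assumptions: code points fit in 16 bits, and Python's `str.upper` stands in for the real case mapping. The deltas computed here are the values one would store in the table.

```python
def upper_delta(ch: str) -> int:
    """Difference between ch and its upper-case form, modulo 2**16."""
    upper = ch.upper()
    if len(upper) != 1 or ord(upper) > 0xFFFF:
        return 0  # no single 16-bit upper-case form: leave the character alone
    return (ord(upper) - ord(ch)) & 0xFFFF

def to_upper(ch: str, delta: int) -> str:
    # Adding the stored delta modulo 2**16 recovers the upper-case character.
    return chr((ord(ch) + delta) & 0xFFFF)
```

Storing the difference rather than the target character presumably helps because neighbouring letters tend to share the same delta ("a" through "z" all map with the same value), so whole columns collapse into a single entry.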
There's a C++ implementation of this in SP in include/CharMap.h.
James
From crism at ora.com Fri Sep 5 16:15:09 1997
From: crism at ora.com (Chris Maden)
Date: Mon Jun 7 16:58:25 2004
Subject: DSSSL Digest now publicly available
In-Reply-To: <998.199709051356@grogan.cogsci.ed.ac.uk> (ht@cogsci.ed.ac.uk)
Message-ID: <199709051417.KAA01715@geode.ora.com>
The announcement of the DSSSL Digest (or reference) at
SPSPblabla
becomes SPSPblabla
, so two line ends are discarded.
It seems nevertheless natural that these line ends are dropped.
BTW, this rule was in the first (11/14/96) XML draft.
There is a first problem with this approach: in
default content (preserved content will be examined later):
blabla
bloblo
blublu
The coding in this case is natural: bla, blo and blu are very
aesthetically aligned!
But: a line end code is discarded after "", it shouldn't be.
So: preserved elements need a special rule. It seems quite natural
they need a special rule concerning line end codes (and
space codes).
A possibility: the parser closes a "default" (not preserved) element,
and opens a "preserved" element: the line end codes after the start tag
and before the end tag are discarded. But for a preserved element
directly embedded in a preserved element, line end codes
are left intact.
*Rule 3*: WS in element content is discarded.
WS in element content *must* be discarded. The problem
is: without a DTD, one doesn't know if an element contains only
other elements.
Suppose we have :
).
*Rule 5*: except in preserved elements, consecutive WS characters
are reduced to a single space.
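To make Rule 5 concrete, here is a rough Python sketch of what it would do to character data. The function name and the `preserve` flag are invented for the example; WS is taken to mean space, tab, CR and LF.

```python
import re

WS_RUN = re.compile(r"[ \t\r\n]+")

def apply_rule_5(text: str, preserve: bool = False) -> str:
    # Preserved elements keep their whitespace untouched.
    if preserve:
        return text
    # Everywhere else, each run of WS characters becomes one space.
    return WS_RUN.sub(" ", text)
```

With this rule, two spaces typed after a full stop come out as one unless the element is preserved.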
I don't like this rule. If I put two spaces after a point, I mean two
spaces.
It's a typographic decision.
Rule 5 is meant to allow some indentation:
I need some
indentation.SPSPIndentation is needed.
I need some
indentation.SPSPIndentation is needed.
> Two
> words
> becomes
> Twowords
> The space between "Two" and "words" evaporated.
> Same thing with:
> Two
> words
> I don't think this particular problem is important: the encoding
> is not natural. It should be an error!
> I think everybody would write:
> Two words, or
> Two words
> , etc...
I have long thought that 'some' formatting options should simply be made
illegal, and that we should then ensure widespread knowledge of restrictions
to future document authors. This is the main example I had already considered.
> Inside a preserved element, line end codes are wrongly discarded
> after element start tags and before element end tags:
> blabla
> bloblo
> blublu
Again, I think this coding is very unnatural.
> *Rule 4*: Except in preserved elements (elements
> with a space attribute set to "PRESERVE") line end codes are
> discarded when preceded by a hard or
> soft hyphen (in the process, a soft hyphen is also discarded) and
> remaining line end codes are treated as space.
>
> The rule concerning hyphens is not necessary. If it's a hard hyphen,
> don't put it at line end (who would do that?)
It is in fact a very natural action, which I have seen many times.
> Moreover, there is no use in an XML source file to put a soft
> hyphen at line end. Who would do that? In my poor life, I have no occa-
> sion to see some text with hyphens at line end.
I have. Many times.
> *Rule 5*: except in preserved elements, consecutive WS characters
> are reduced to a single space.
>
> I don't like this rule. If I put two spaces after a point, I mean two
> spaces.
> It's a typographic decision.
> Rule 5 is meant to allow some indentation:
>
>
> He said:
>
> I need some
> indentation.SPSPIndentation is needed.
>
>
-style) elements.
3. Can't depend on the DTD or other declarations to control it.
The simplest proposal that does this is to pass all whitespace.
The only real drawback is that _some_ applications (like table formatters)
may have to explicitly ignore whitespace in _some_ contexts where a
traditional SGML parser would have been able to do it for them. Linking
applications must deal with (count), and can't ignore whitespace chunks
that in some cases may have little meaning to a user.
The benefits are "simplest possible rule", easy XML->XML transduction that
preserves the original formatting, a dependable way to count character data
in documents that contain whitespace, regardless of whether you have a DTD.
>> The recent change (to normalize all linends) fills the one hole the
>> previous proposal had -- because it was nearly certain that some
>> processes would blindly change CRLF and their ilk anyhow.
Note that this is the only data normalization permitted in XML, and that it
only warrants processes like the changing of line-ending conventions (eg
from PC to Mac) -- that we all know would have taken place anyway, causing
errors, even if they were explicitly prohibited by the standard.
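For concreteness, line-end normalization of this kind fits in one line of Python. This is a sketch that assumes LF as the normalized form; the function name is made up for the example.

```python
def normalize_line_ends(text: str) -> str:
    # Handle CRLF first so a PC line end does not turn into two line ends,
    # then map any remaining bare CR (Mac convention) to LF.
    return text.replace("\r\n", "\n").replace("\r", "\n")
```

Every other whitespace character passes through unchanged, which is the point of the "pass all whitespace" rule.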
>> My advice: don't waste your bytes complaining about this -- we've
>> heard it _all_ before -- and the solution that works best is to leave
>> it to the application.
>
>I am sure I will get
>convinced when I read the WG discussion :-)
>Or I fear the WG members will have to hear it all (and more)
>again :-))
My advice was just advice about what expectations you could have of
_results_ from whatever discussion ensues. Feel free to discuss whitespace
to your heart's content. But don't expect XML to change.
I'll see if there's any way the archives of the whitespace debate can be
made available, but I can honestly say that they're painful rather than
enlightening reading. Expect to devote several days to the reading, too, if
they do become public.
I was a chief proponent of the current approach, even at the beginning,
when most in the group did not want to do anything so radical, so I agree
that explanations of the decision are worthwhile -- and I've tried to
contribute such -- but I'm certainly not going to read an extended rehash
on the issue. I've devoted my pound(s) of flesh to whitespace already.
-- David
RE delenda est!
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From digitome at iol.ie Thu Sep 18 16:13:55 1997
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:58:28 2004
Subject: Re Whitespace
Message-ID: <199709181413.PAA06449@mail.iol.ie>
Sorry for the lateness of this reply. It got a bit lost in my out-box for a
while!
[Sean Mc Grath]
>>Throw out that grep, that text editor, that fgets(), that diff,sort,uniq
>>utility. They're all busted for XML use.
>
[David Durand]
>gets is of course Broken As Designed, as the cause of most security bugs in
>Unix systems.
Sorry David, I cannot let you get away with that one. I said *fgets()* which
is an entirely different function to gets(). It takes
three parameters, one of which is the maximum number of characters to read.
It is not Broken As Designed.
>
>Again, they are broken for XML use with files created a particular way.
>They are also broken for HTML files created the same way, and I don't hear
>the weeping and wailing.
No weeping and wailing required because it is typically possible to splice
line-ends into HTML *without affecting the content*. This is not the case
with XML.
>Can you suggest any solution to the "grep" problem other than requiring a
>fixed line-max in XML.
Yes. Ignore all line ends. I know this presents its own set of difficult
problems, but I'd prefer to tackle these - and maintain compatibility with a
decade's worth of tools - rather than break the tools.
> Do you think that that hideous hack to accomodate
>defective (if very useful) tools is really worth it.
Yes. Line oriented text processing has been a hugely popular paradigm for
many years now. I don't think of these tools as "defective" at all. I dare
say many wielders of these tools are of the same opinion. These people will
be rightly miffed at the suggestion that they are defective by virtue of the
use of a line oriented paradigm. They will also be rightly miffed that they
cannot bring their tools/skills to bear in the XML world.
>Can you suggest how we
>would determine that buffer size?
Question is Broken As Designed. No need for a silly fixed limit. Just a
recognition of the existence *of* limits and a standardised mechanism for
dealing with them.
Sean Mc Grath
sean@digitome.com
www.digitome.com
From dgd at cs.bu.edu Thu Sep 18 18:09:41 1997
From: dgd at cs.bu.edu (David G. Durand)
Date: Mon Jun 7 16:58:28 2004
Subject: Re Whitespace
In-Reply-To: <199709181413.PAA06449@mail.iol.ie>
Message-ID:
>Sorry for the lateness of this reply. It got a bit lost in my out-box for a
>while!
>
>[Sean Mc Grath]
>>>Throw out that grep, that text editor, that fgets(), that diff,sort,uniq
>>>utility There all busted for XML use.
>>
>[David Durand]
>>gets is of course Broken As Designed, as the cause of most security bugs in
>>Unix systems.
>
>Sorry David, I cannot let you get away with that one. I said *fgets()* which
>is an entirely different function to gets(). It takes
>three paramaters one of which is the maximum number of characters to read.
>It is not Broken As Designed.
No, but fgets (unlike gets) can deal with long lines --- you have to
recognize that you overflowed and make accommodations, but you can do the
right thing. I was giving you the benefit of the doubt, since gets, at
least, has the problem that you are raising, while fgets does not.
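The "recognize that you overflowed" pattern looks roughly like this. The sketch is in Python rather than C: `readline(size)` plays the role of fgets with a fixed buffer, and the function name and the tiny default buffer size are chosen only for the example. A chunk that does not end in a newline means the line did not fit, so the reader keeps going until it does.

```python
import io

def read_long_lines(stream, buffer_size=16):
    """Yield whole lines while reading at most buffer_size characters per call."""
    pieces = []
    while True:
        chunk = stream.readline(buffer_size)
        if not chunk:                 # end of input
            if pieces:
                yield "".join(pieces)
            return
        pieces.append(chunk)
        if chunk.endswith("\n"):      # the line (or its last piece) is complete
            yield "".join(pieces)
            pieces = []
```

For instance, `list(read_long_lines(io.StringIO("x" * 50 + "\n")))` returns the 50-character line in one piece even though no single read exceeds 16 characters.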
>>
>>Again, they are broken for XML use with files created a particular way.
>>They are also broken for HTML files created the same way, and I don't hear
>>the weeping and wailing.
>
>No weeping and wailing required because it is typically possible to splice in
>line-ends into HTML *without affecting the content*. This is not the case
>with XML.
Just try that in tables. You have to know the meaning of the markup, even
in HTML, if you want to do this. Now you can claim that table markup is
broken, and you might be right, but HTML does not support your argument.
Similarly for pre elements: You can't do anything to line-ends in there --
maybe I'm using a 20K line in there to force horizontal scrolling for a
rhetorical reason.
>>Can you suggest any solution to the "grep" problem other than requiring a
>>fixed line-max in XML.
>
>Yes. Ignore all line ends. I know this presents its own set of difficult
>problems
>but I'd prefer to tackle these - and maintain compatability with a decades
>worth
>of tools - rather than break the tools.
But this creates worse problems: lack of pre-style elements, inability to
write XML filters that preserve linespace just from generic XML parsers.
No way to use string offsets in linking.
>> Do you think that that hideous hack to accomodate
>>defective (if very useful) tools is really worth it.
>Yes. Line oriented text processing has been a hugely popular paradigm for
>many years now. I don't think of these tools as "defective" at all. I dare
>say many wielders of these tools are of the same opinion. These people will
>be rightly miffed at the suggestion that they are defective by virtue of the
>use of a line oriented paradigm. They will also be rightly miffed that they
>cannot bring their tools/skills to bear in the XML world.
But they can, they just need to limit their files to correspond to the
limitation of their tools. People do this all the time, without difficulty.
Of course if the world at large decides to abandon the "line paradigm" then
those who stick to it will be inconvenienced. But then if "the world" makes
the shift, then there's still not a very big problem, is there?
Even in that case, with some (usually minimal) human intervention, such
line-end conversion/insertion is trivial in practice.
I'm sorry I still don't see how this is _worse_ than what we have with text
files today. And compared to HTML and SGML, I think XML's rules are more
consistent, and useful for more things.
I deal with the Mac (where line == paragraph), as well as Unix, all the
time. This problem is not usually of more than 10 seconds concern on the
few times in a month that it comes to mind. On occasion, of course, I find
myself spending 1-10 minutes in an editor fixing things (usually by
invoking a "wrap" command of some sort).
>>Can you suggest how we
>>would determine that buffer size?
>Question is Broken As Designed. No need for a silly fixed limit. Just a
>recognition
>of the existence *of* limits and a standardised mechanism for dealing with
>them.
I can't imagine what such a mechanism is: IBM text editors for decades had
an 80-character limit. Some still work best with 72 column files. If XML is
supposed to require lines no longer than some limit, we need to specify
that limit in the standard. Otherwise all we can say is that any XML
processor is free to reject any document if the lines are "too long for
that tool". That's an even worse prescription for interoperability.
If there are limits, a standard has to tell you how to be safe and not
break any of those limits. At least, a good standard should.
-- David
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://www.dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
From digitome at iol.ie Thu Sep 18 20:40:28 1997
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:58:28 2004
Subject: Re Whitespace
Message-ID: <199709181840.TAA04606@mail.iol.ie>
[Sean Mc Grath]
>>
>>Sorry David, I cannot let you get away with that one. I said *fgets()* which
>>is an entirely different function to gets(). It takes
>>three paramaters one of which is the maximum number of characters to read.
>>It is not Broken As Designed.
>
[David Durand]
>No, but fgets (unlike gets) can deal with long lines --- you have to
>recognize that you overflowed and make accomodations, but you can do the
>right thing. iw as giving you the benefit of the doubt, since gets, at
>least, has the problem that you are raising, while fgets does not.
>
[Sean Mc Grath]
You mentioned gets(). I didn't. How your insertion of an irrelevant reference
to gets() can be construed as giving me "the benefit of the doubt" I don't know.
[Sean Mc Grath]
>>No weeping and wailing required because it is typically possible to splice in
>>line-ends into HTML *without affecting the content*. This is not the case
>>with XML.
>
[David Durand]
>Just try that in tables. You have to know the meaning of the markup, even
>in HTML, if you want to do this. Now you can claim that table markup is
>broken, and you might be right, but HTML does not suport your argument.
[Sean Mc Grath]
Why not? Why cannot I replace say, "" with " \n" everywhere?
The problem then reduces to long data chunks such as...
pre elements:-
[David Durand]
>
>Similarly for pre elements: You can't do anything to lineneds in there --
>maybe I'm using a 20K line in to force horisontal scrolling for a
>rhetorical reason.
[Sean Mc Grath]
Absolutely agreed. The pre case is fundamentally different.
These line-ends are truly part of the data and a processor that adds new ones
is blowing the integrity of the data. Thus the plausible argument in favour
of not using line-end as data content.
[David Durand]
>
>>>Can you suggest any solution to the "grep" problem other than requiring a
>>>fixed line-max in XML.
>>
[Sean Mc Grath]
>>Yes. Ignore all line ends. I know this presents its own set of difficult
>>problems
>>but I'd prefer to tackle these - and maintain compatability with a decades
>>worth
>>of tools - rather than break the tools.
>
[David Durand]
>But this creates worse problems:
[Sean Mc Grath]
Worse?
[David Durand]
>lack of pre-style elements
Broken As Designed. If something has to give I think pre elements should
be first to go.
Alternatively the problem can always be "arcformed" away. We use
DIGITOME CDATA #FIXED "PREFORM">
all the time. Our pretty printing, word wrapping SGML processing tools use
this to avoid adding extraneous WS that would blow the data content.
[David Durand]
>, inability to write XML filters that preserve linespace just from generic
>XML parsers.
[Sean Mc Grath]
Line ends (at least those) tipping up to start-end tags would *not* be part
of the data. They could thus be added/dropped without affecting the data.
The CGR output of the grove would be the final arbiter on "equivalence" and
the launching pad for offsets used in addressing.
>No way to use string offsets in linking.
If it ain't got a representation in the grove it ain't in the data and thus
is not counted when totting up offsets.
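To illustrate the proposal (this is not how XML as specified behaves), a deliberately naive Python sketch follows. It treats a line end directly after ">" or directly before "<" as not being data, ignores comments, CDATA sections and attribute values entirely, and uses a made-up function name.

```python
import re

# A line end immediately after ">" or immediately before "<".
TAG_TIPPING_LINE_END = re.compile(r"(?<=>)\n|\n(?=<)")

def drop_tag_tipping_line_ends(markup: str) -> str:
    return TAG_TIPPING_LINE_END.sub("", markup)
```

Under that rule, `"<p>\nblabla\n</p>"` and `"<p>blabla</p>"` reduce to the same string, so offsets counted on the result do not depend on where a tool chose to break lines next to tags. A line end inside the character data, as in `"<p>two\nwords</p>"`, is left alone.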
[David Durand]
>
>>> Do you think that that hideous hack to accomodate
>>>defective (if very useful) tools is really worth it.
[Sean Mc Grath]
>>Yes. Line oriented text processing has been a hugely popular paradigm for
>>many years now. I don't think of these tools as "defective" at all. I dare
>>say many wielders of these tools are of the same opinion. These people will
>>be rightly miffed at the suggestion that they are defective by virtue of the
>>use of a line oriented paradigm. They will also be rightly miffed that they
>>cannot bring their tools/skills to bear in the XML world.
[David Durand]
>But they can, they just need to limit their files to crrespond to the
>limitation of their tools. People do this all the time, without difficulty.
[Sean Mc Grath]
No difficulty?
Problem: I receive an XML file from a user who works with lines of <1024
characters in his tools.
I use <512. How do I munge his file to suit my tools? I can't without
blowing the data. If tag-tipping line ends were transient I could make
a stab at it. I would still have to address the ""
case. But hey! I never said this was simple! I just said that the alternate
set of problems this presents has the benefit of not throwing out our
existing line oriented tools and techniques.
[David Durand]
>Of course if the world at large decides to abandon the "line paradigm" then
>those who stick to it will be inconvenienced. But then if "the world" make
>the shift, then there's still not a very big problem, is there?
[Sean Mc Grath]
That is one-helluva shift IMHO! I am not sure to what extent the world is
a) aware of this aspect of XML
b) willing to bite that bullet.
[David Durand]
>if XML is
>supposed to require lines no longer than some limit, we need to specify
>that limit in the standard.
[Sean Mc Grath]
No we don't! We need to have a well defined mechanism whereby a tool with
a line length limit of N can work with XML with line length > N without
blowing the integrity of the data.
[David Durand]
>Otherwise all we can say is that any XML
>processor is free to reject any document if the lines are "too long for
>that tool". That's en even worse prescription for interoperability.
>
See above.
[David Durand]
>If there are limits, a standard has to tell you how to be safe and not
>break any of those limits. At least, a good standard should.
>
[Sean Mc Grath]
The standard does not have to establish a limit. It could help users
of "legacy" tools to *cope* with limits though. "Buy/build better tools" is one
line that can be taken but it is not the only one.
Sean Mc Grath
sean@digitome.com
www.digitome.com
From dgd at cs.bu.edu Thu Sep 18 21:38:33 1997
From: dgd at cs.bu.edu (David G. Durand)
Date: Mon Jun 7 16:58:28 2004
Subject: Re Whitespace
In-Reply-To: <199709181840.TAA04606@mail.iol.ie>
Message-ID:
At 1:40 PM -0500 9/18/97, Sean Mc Grath wrote:
>[David Durand]
>>No, but fgets (unlike gets) can deal with long lines --- you have to
>>recognize that you overflowed and make accomodations, but you can do the
>>right thing. iw as giving you the benefit of the doubt, since gets, at
>>least, has the problem that you are raising, while fgets does not.
>>
>[Sean Mc Grath]
>You mentioned gets(). I didn't. How your insertion of an irrelevant reference
>to gets() can be construed as giving me "the benefit of the doubt" I don't
>know.
Well, as fgets does not support your argument that "long lines cause
problems" I thought it might be a typo for gets (wh/ does have serious
problems w/ long lines, but is of course a canonical example of bad design,
and not something we want to accommodate).
As to fgets, I confess that I don't see that it should have any problem
with any file, newline-containing or not. Am I clear now?
>[David Durand]
>>Just try that in tables. You have to know the meaning of the markup, even
>>in HTML, if you want to do this. Now you can claim that table markup is
>>broken, and you might be right, but HTML does not suport your argument.
>
>[Sean Mc Grath]
>Why not? Why cannot I replace say, "" with " \n" everywhere?
>The problem then reduces to long data chunks such as...
>pre elements:-
Well, because people use tables to format, and that extra space queers the
pitch, inducing funny spacing behavior. Agreed that a better table model
could avoid this.
>[David Durand]
>>
>>Similarly for pre elements: You can't do anything to lineneds in there --
>>maybe I'm using a 20K line in to force horisontal scrolling for a
>>rhetorical reason.
>
>[Sean Mc Grath]
>Absolutely agreed. The pre case is fundamentally different.
>These line-ends are truly part of the data and a processor that adds new ones
>is blowing the integrity of the data. Thus the plausible argument in favour
>of not
>using line-end as data content.
I confess to not understanding why a lineend cannot occur at the beginning
of an element. Even SGML never proposed to remove more than _1_ such line
break.
So you want to take them all away, so that grep won't break.
>[David Durand]
>>
>>>>Can you suggest any solution to the "grep" problem other than requiring a
>>>>fixed line-max in XML.
>>>
>[Sean Mc Grath]
>>>Yes. Ignore all line ends. I know this presents its own set of difficult
>>>problems
>>>but I'd prefer to tackle these - and maintain compatability with a decades
>>>worth
>>>of tools - rather than break the tools.
Well, it makes data rather unrevealing.
And of course, the tools are only broken if common practice leads to the
use of long lines -- and if that becomes the case, then it will only have
been because the tools are _not_ actually that important.
This is a social argument that you have not addressed yet, but it cuts to
the core of why we should not do this... We get a simpler easier model, and
there is nothing to stop people from any self-imposed discipline their
tools require.
And if people are _not_ following such a discipline, then there's no reason
to worry about the tools, because it can only happen if people are not
using those tools for XML.
>[David Durand]
>>lack of pre-style elements
>
>Broken As Designed. If something has to give I think pre elements should
>be first to go.
Well, theoretically there's a lot of reasonableness to using explicit markup
for such line breaks. But, the pragmatist in me has to note that there has
been _no_ successful markup or document processing language without such a
feature (except for word-processors, but the case there is complicated
because the user never _sees_ the relevant representation).
>Alternatively the problem can alway be "arcformed" away. We use
> DIGITOME CDATA #FIXED "PREFORM">
>all the time. Our pretty printing, word wrapping SGML processing tools use
>this to
>avoid adding extraneous WS that would blow the data content.
Doesn't solve the problem you raised. That data has a long line in it and
grep crashes. You have to split the line, and take the consequences, or not
use grep.
If you don't allow arbitrary line-break introduction anywhere, you haven't
solved the legacy tool problem, which weakens your argument somewhat. If
you do, you've made it impossible to count on line-breaks _ever_ being
significant. The XML committee considered this and rejected it as too
divergent from current practice (that people did not want to give up).
>[David Durand]
>>, inability to write XML filters that preserve linespace just from generic
>>XML parsers.
>
>[Sean Mc Grath]
>Line ends (at least those) tipping up to start-end tags would *not* be part
>of the data. They
>could thus be added/dropped without effecting the data. The CGR output of
>the grove
>would be the final arbiter on "equivalence" and the launching pad for
>offsets used in
>addressing.
Yes, and the "looks the same in my editor" arbiter of equivalence would
fail. This has long been felt unacceptable by those who use such
transformations. If any hand-editing is involved it is unacceptable
behaviour to change all the line-ends.
>[Sean Mc Grath]
>>>Yes. Line oriented text processing has been a hugely popular paradigm for
>>>many years now. I don't think of these tools as "defective" at all. I dare
>>>say many wielders of these tools are of the same opinion. These people will
>>>be rightly miffed at the suggestion that they are defective by virtue of the
>>>use of a line oriented paradigm. They will also be rightly miffed that they
>>>cannot bring their tools/skills to bear in the XML world.
>[David Durand]
>>But they can, they just need to limit their files to crrespond to the
>>limitation of their tools. People do this all the time, without difficulty.
Yes. If your editor and tools have a 72 character line limit, you don't
create files with long lines. Then your tools always work. If you want
everyone's tools to always work, and you admit a maximum line-length for
tools, you need to pick that number so I can make files that won't toast
your software. Either that, or someone with different software will exceed
the limits of your software, of whose existence she has never even heard!
>
>[Sean Mc Grath]
>No difficulty?
>
>Problem : I receive an XML file from a user who works with <1024 lines in
>his tools.
>
>I use <512. how do I munge his file to suite my tools? I can't without
>blowing the data. If tag-tipping line ends were transient I could make
>a stab at it. I would still have to address the ""
>case. But hey! I never said this was simple! I just said that the alternate
>set of problems this presents have the benefit of not throwing out our
>existing line oriented tools and techniques.
Look, we have a solution. Proposing a new solution based on a new problem
(grep and other tools with hard line-length limitations) requires that the
new solution actually _solve_ the problem. Your solution does not solve the
problem you yourself pose, so it's hard for me to take seriously.
>[David Durand]
>>Of course if the world at large decides to abandon the "line paradigm" then
>>those who stick to it will be inconvenienced. But then if "the world" makes
>>the shift, then there's still not a very big problem, is there?
>
>[Sean Mc Grath]
>That is one-helluva shift IMHO! I am not sure to what extent the world is
> a) aware of this aspect of XML
> b) willing to bite that bullet.
In that case, they create files with short lines, and there is no bullet to
bite. The only way this problem can become common is if long lines become
very popular. I don't see how long lines can become popular if they create
fatal tool problems with popular tools. Either long lines will not be
common, or tools that cope with long lines will be common along with the
long lines themselves.
It's a simple feedback loop. No need to change the standard, just let
people's desire to share data feed back into the general knowledge of what
data is shareable.
>[David Durand]
>>if XML is
>>supposed to require lines no longer than some limit, we need to specify
>>that limit in the standard.
>
>[Sean Mc Grath]
>No we don't! We need to have a well defined mechanism whereby a tool with
>a line length limit of N can work with XML with line length > N without
>blowing the integrity of the data.
How do we do this for legacy tools like grep with a hard-compiled limit
(that is not documented, and varies from vendor to vendor)?
If files that work with arbitrary tools are to be possible, we need to know
the constraints that those tools impose.
>[David Durand]
>>Otherwise all we can say is that any XML
>>processor is free to reject any document if the lines are "too long for
>>that tool". That's an even worse prescription for interoperability.
>>
>See above.
I saw. I didn't see how you're going to fix grep (for your data\ndata
case). Or rather the "40K of data with no \n" case which is the real killer.
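To make the two cases concrete, here is a minimal Python sketch; it is not something proposed in the thread. It assumes a hypothetical helper, `wrap_in_tags`, that shortens lines by adding line ends only inside tags, where XML allows white space that is not part of the data. The helper names and the sample document are illustrative assumptions, and the regular expression deliberately ignores comments, CDATA sections and processing instructions.

```python
import re
import xml.etree.ElementTree as ET

# Matches start, end and empty-element tags. Simplifying assumption: the
# input has no comments, CDATA sections, processing instructions, or
# attribute values containing '>'.
TAG = re.compile(r"<(/?)([^<>]*?)(/?)>")


def wrap_in_tags(xml_text):
    """Insert a line end inside every tag, just before its '>' or '/>'."""
    def add_break(match):
        slash, body, empty = match.groups()
        return "<" + slash + body + "\n" + empty + ">"
    return TAG.sub(add_break, xml_text)


def longest_line(text):
    """Length of the longest line, which is what a line-limited tool cares about."""
    return max(len(line) for line in text.split("\n"))


def text_of(xml_text):
    """All character data in the document, in document order."""
    return "".join(ET.fromstring(xml_text).itertext())


# One long physical line: short tagged fields plus a 60-character run of data.
doc = '<order id="1"><item>widget</item><note>' + "x" * 60 + "</note></order>"
wrapped = wrap_in_tags(doc)

print(longest_line(doc), "->", longest_line(wrapped))
print(text_of(doc) == text_of(wrapped))  # True: the data itself is unchanged
```

The line ends added inside tags leave the character data alone, so lines made of markup can be shortened safely. The run of 60 characters inside `<note>` cannot be shortened this way, so the longest line is still bounded below by the longest stretch of data with no tag in it, which is the "40K of data with no \n" case.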
>[David Durand]
>>If there are limits, a standard has to tell you how to be safe and not
>>break any of those limits. At least, a good standard should.
>>
>
>[Sean Mc Grath]
>The standard does not have to establish a limit. It could help users
>of "legacy" tools to *cope* with limits though. "Buy/build better tools"
>is one
>line that can be taken but it is not the only one.
Well, how could the standard do that?
Actually, since the standard is almost certainly not going to change, I
don't really care how it could do it. My sense is that people won't do
without equivalents -- so you can never get total freedom to
remove/add line-ends. So since the problem is unsolvable, let's not waste
time, and complicate the standard to get a partial solution (i.e. a solution
that fails to solve the problem) at the cost of a popular feature.
-- David
I think that's it for me.
_________________________________________
David Durand dgd@cs.bu.edu \ david@dynamicDiagrams.com
Boston University Computer Science \ Sr. Analyst
http://www.cs.bu.edu/students/grads/dgd/ \ Dynamic Diagrams
--------------------------------------------\ http://www.dynamicDiagrams.com/
MAPA: mapping for the WWW \__________________________
xml-dev: A list for W3C XML Developers
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To unsubscribe, send to majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
List coordinator, Henry Rzepa (rzepa@ic.ac.uk)
From Peter at ursus.demon.co.uk Fri Sep 19 01:58:21 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:28 2004
Subject: XML-WG and XML-SIG deliberations
Message-ID: <10145@ursus.demon.co.uk>
Two postings on XML-DEV have explicitly or implicitly referred to the
discussion of XML-SIG and XML-WG. The formal position is that the
discussions of XML-WG (the current W3C-appointed decision-making body) and
XML-SIG (a group of experts who offer advice to XML-WG) are confidential to
W3C member organisations (and the invited experts). This confidentiality
is important as it represents part of the value of being a member of W3C.
There is potential confusion about the archives, since the XML discussion group
was originally called the 'WG' and its archives were (and are) public. They
ended about June 1997 (any precise dates and current URLs for these?) They
are of historical interest and there *might* be some useful discussion there
but there is a huge amount to read through. Maybe some of the whitespace
discussion is in the public archives, though I wouldn't rush.
The archives of XML-WG since June 1997 (?) are not publicly available. Nor
are those of XML-SIG.
However the discussion on this list, and the publicly reported developments
contributed by posters/readers of this list are valued by the XML-groups.
For example the recent WG posting emphasised the value of APIs and their
possible co-publication with XML specs.
The proposal for XSL (XML-STYLE) *is* publicly visible and URLs have been
posted on this list. Unfortunately for XML-DEVers, any XML-SIG and XML-WG
discussion on this is confidential. I leave it to any XML-WG readers of this
list to keep XML-DEV aware of what is happening. Perhaps it could be useful
to remind us of the proposed milestones/timescales for the various XML
components to be published/accepted.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From jeremy at allaire.com Fri Sep 19 06:15:15 1997
From: jeremy at allaire.com (Jeremy Allaire)
Date: Mon Jun 7 16:58:28 2004
Subject: Custom Tags
Message-ID: <34220BAC.2F83@allaire.com>
For anyone interested in CFML custom tags:
http://www.allaire.com/TagGallery/
From Alice.Portillo at PSS.Boeing.com Tue Sep 23 23:05:47 1997
From: Alice.Portillo at PSS.Boeing.com (Portillo, Christina)
Date: Mon Jun 7 16:58:28 2004
Subject: Use of Character Escape Codes
Message-ID:
Thought I would share Peter Flynn's response on escape codes with you all.
Christina Portillo
Product Definition and Image Technology
The Boeing Company Phone: 425.237.3351
PO Box 3707 M/S 6H-AF Fax: 425.237.3428
Seattle, WA 98124-2207 christina.portillo@boeing.com
> ----------
> From: Peter Flynn[SMTP:pflynn@imbolc.ucc.ie]
> Sent: Monday, September 22, 1997 7:15 PM
> To: Christina Portillo
> Subject: Use of Escape Codes and Characters
>
> At 20:13 22/05/97 +0100, you wrote:
> >Q == "Question
> >How do you encode in your XML document references to
> >characters above 126 in the ISO646 character set.
> >
> >So of the character classes defined in the standard: space, char,
> >letter, Base Char, Ideographic, CombiningChar, Letter, Digit,
> >Ignorable, and Extender which of these has to be escaped to be used in
> >a document. OR from what index value down must escape codes be used."
>
> I'm sorry to have delayed answering this but the character set question
> became rather vexed :-)
>
> The simple answer is you escape any code you can't type as a character
> or byte combination. In other words, if you are working in ASCII, but
> you can generate an e-acute with the correct code (ie ISO 10646, not
> Windows :-) then you should be able to do so, and embed that byte in
> the file. If you need a Hangul glyph and you can't type it, then you
> need to use the escaped code: presumably users on Hangul systems can
> generate all their own characters at the keyboard.
>
> But in practice I think we'll need to see how/if the browsers implement
> non-Latin character repertoires.
>
> ///Peter
>
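As an illustration of the advice above, the following Python sketch compares an e-acute embedded directly in the file with the same character written as an escaped code (a numeric character reference). It uses the standard library's `xml.etree.ElementTree`, which is an assumption of this example rather than a tool mentioned in the message; the sample documents are made up.

```python
import xml.etree.ElementTree as ET

# The same word three ways: the character typed directly (stored as UTF-8
# bytes), a decimal character reference, and a hexadecimal one.
literal = "<p>caf\u00e9</p>".encode("utf-8")
escaped_dec = b"<p>caf&#233;</p>"
escaped_hex = b"<p>caf&#xE9;</p>"

texts = [ET.fromstring(doc).text for doc in (literal, escaped_dec, escaped_hex)]
print(texts[0] == texts[1] == texts[2])  # True: the parser delivers one character

# A named entity is a different matter: it must be declared before use,
# so a document that uses &eacute; without a declaration is rejected.
try:
    ET.fromstring(b"<p>caf&eacute;</p>")
    named_entity_ok = True
except ET.ParseError as err:
    named_entity_ok = False
    print("rejected:", err)
```

Either spelling reaches the application as the same character, so the choice comes down to what the author can type and what encoding the file is saved in.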
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
From gannon at commerce.net Thu Sep 25 01:58:25 1997
From: gannon at commerce.net (Patrick Gannon)
Date: Mon Jun 7 16:58:28 2004
Subject: XML iMarket Project Planning Meeting
Message-ID: <01BCC907.BE1F2F00@arrow-d83.sierra.net>
CommerceNet XML iMarket Project Team,
The XML iMarket Project Planning Team will meet on Monday, October 6, 1997, 9:00am to 12:00pm PDT.
The meeting location will be the CommerceNet offices, 4005 Miranda Ave, Suite 175, Palo Alto, CA 94304 (650-858-1930) unless otherwise notified.
I will arrange for 800# conference call facilities for those unable to attend in person and send the 800# information to those who have replied and confirmed their interest in participating.
If you can attend, please reply confirming whether you will be able to attend in person or whether you will attend via the 800# conference call. Please note that attendance in person or by phone is limited to members of CommerceNet's Information Access Portfolio only.
The goal of the meeting is to develop a detailed project plan and Request For Proposal (if needed) to identify companies or consultants with expertise required to help on the project. The iMarket Project is designed to build on the XML catalog files and Document Type Definition files produced during the recently completed XML Catalog project. The general plan is to build a demonstration virtual marketplace which utilizes the multiple vendor XML catalogs with standard DTDs and allows shoppers to search for products across vendors by specifying product and merchant attributes. Another goal of this project is to demonstrate how the use of XML stylesheets will allow vendors/merchants to maintain "brand equity" while using common description templates (DTDs).
The XML Catalog tutorial and sample XML/DTD files are available for members at:
http://members.commerce.net/pw/portfolios/access/xml/xml-demo.html
CommerceNet IA Portfolio Members, please review these XML documents and let me know if you or someone else in your company is interested in participating.
Non-members, please reply if you are interested in becoming members or being put on the RFP list.
Thank you for your continued support.
Patrick Gannon, Executive Director
Information Access Portfolio, CommerceNet
http://www.commerce.net/services/portfolios/
------------------------------------------------------
President & CEO, Internet Shopping Directory, Inc.
865 Tahoe Blvd., Suite 211, Incline Village, NV 89451
702-831-2251 702-831-3925 (Fax)
mailto://patrick@shoppingdirect.com
http://www.shoppingdirect.com
From Jon.Bosak at eng.Sun.COM Thu Sep 25 17:53:42 1997
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:58:28 2004
Subject: XML iMarket Project Planning Meeting
In-Reply-To: (message from Arthur Keller on Thu, 25 Sep 97 6:02:38 PDT)
Message-ID: <199709251550.IAA13057@boethius.eng.sun.com>
| The requirement of standard DTDs by all vendors and participants
| presumes that these are adequate to satisfy the differentiation needs
| of the various participants. "Brand equity" is not sufficient
| differentiation. Rather, one company may use more detailed
| characteristics than another company in order to differentiate their
| products.
I think you're missing the point.
What I as a consumer want to be able to do is quite simple. I want to
be able to say, "Hey, I need a new jacket," sit down at my computer,
call up my find-a-product robot, enter my jacket parameters, and then
come back a while later to find all the jackets that fit those
parameters offered by all the vendors whose products I'm interested in
considering. If the catalog scheme isn't standardized enough to
support this, then I as a consumer am not interested in using it. If
one of the vendors differentiates itself by adopting a scheme of data
representation that doesn't allow this kind of transparent direct
comparison, then it differentiates itself right out of the class of
vendors I'm interested in, because if all it's giving me is the
ability to cruise its catalog in isolation, I can get the same
functionality from the printed version; it no longer participates in a
way that allows the net to add value to me as a consumer.
I'm not denying that vendors will want to differentiate their
offerings, but if they can't do it in a way that supports detailed
direct comparisons based on the differentia that I am interested in
*as a consumer* then they are simply not in the game at all.
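A rough Python sketch of the "find-a-product robot" described above, under the assumption that every vendor publishes a catalog with the same element names. The element names (`product`, `type`, `size`, `color`, `price`), the vendor names and the catalog contents are all invented for illustration; they do not come from the CommerceNet catalog DTDs.

```python
import xml.etree.ElementTree as ET

# Two hypothetical vendor catalogs that share one set of element names.
VENDOR_A = """<catalog vendor="Vendor A">
  <product><type>jacket</type><size>M</size><color>green</color><price>80</price></product>
  <product><type>jacket</type><size>L</size><color>blue</color><price>120</price></product>
</catalog>"""

VENDOR_B = """<catalog vendor="Vendor B">
  <product><type>jacket</type><size>M</size><color>green</color><price>95</price></product>
  <product><type>scarf</type><size>M</size><color>green</color><price>20</price></product>
</catalog>"""


def find_products(catalogs, **wanted):
    """Return (vendor, price) for every product matching all wanted fields."""
    hits = []
    for text in catalogs:
        root = ET.fromstring(text)
        for product in root.iter("product"):
            # A product matches only if every requested field has the wanted value.
            if all(product.findtext(field) == value for field, value in wanted.items()):
                hits.append((root.get("vendor"), float(product.findtext("price"))))
    # Cheapest first, so offers from different vendors are directly comparable.
    return sorted(hits, key=lambda hit: hit[1])


matches = find_products([VENDOR_A, VENDOR_B], type="jacket", size="M", color="green")
print(matches)
```

The query works across both catalogs only because they agree on the field names; a vendor whose catalog used a private scheme would simply never appear in `matches`.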
Jon
From srn at techno.com Thu Sep 25 22:12:40 1997
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 16:58:29 2004
Subject: XML iMarket Project Planning Meeting
In-Reply-To: <199709251550.IAA13057@boethius.eng.sun.com>
(Jon.Bosak@eng.Sun.COM)
Message-ID: <199709252008.QAA01199@bruno.techno.com>
[Jon Bosak:]
> What I as a consumer want to be able to do is quite simple. I want to
> be able to say, "Hey, I need a new jacket," sit down at my computer,
> call up my find-a-product robot, enter my jacket parameters, and then
> come back a while later to find all the jackets that fit those
> parameters offered by all the vendors whose products I'm interested in
> considering. If the catalog scheme isn't standardized enough to
> support this, then I as a consumer am not interested in using it. If
> one of the vendors differentiates itself by adopting a scheme of data
> representation that doesn't allow this kind of transparent direct
> comparison, then it differentiates itself right out of the class of
> vendors I'm interested in, because if all it's giving me is the
> ability to cruise its catalog in isolation, I can get the same
> functionality from the printed version; it no longer participates in a
> way that allows the net to add value to me as a consumer.
>
> I'm not denying that vendors will want to differentiate their
> offerings, but if they can't do it in a way that supports detailed
> direct comparisons based on the differentia that I am interested in
> *as a consumer* then they are simply not in the game at all.
There is a very serious problem here that bears strikingly on an
ongoing discussion in XML-land: the discussion of so-called
"namespaces". The idea that there will be consortia of vendors, or
any other sort of authority who will determine some list of names of
characteristics of each sort of product, so that characteristics can
be directly and automatically compared, is dangerous to innovation,
competition, and commerce, and it is totally unnecessary, too. It
will open the door for existing businesses to use such architectures
as weapons against upstarts in niche markets and in unusual or new
market combinations. Moreover, the use of information architectures
as weapons will always seem like perfectly reasonable business
practices, so it will be nobody's fault when new concepts fail to be
accepted in the marketplace, because the internet failed to live up to
its promise of helping people find what they are looking for and make
informed purchasing decisions. The macroeconomy will be damaged.
Andrew Layman (whom I do not know, but would like to) has laid out a
list of requirements for the implementation of namespaces which, if
used as guidance in the development of XML's namespace features, will
create a need for authorities who give "standard" names to such things
as product characteristics. The concentration of power in such
authorities will hinder innovation, by making it difficult to compare
products regarded as "out of category" for some authority's set of
defined names. I quote from Andrew's "Universal Names" posting of 23
September 1997 on the w3c-xml-sig@w3.org list:
[Andrew Layman:]
I've agreed to summarize the set of requirements that I have
championed in the past under the term "namespaces." Because this
word has also meant several alternate sets of requirements, I'm
temporarily using an entirely different term, "universal names," so
that we can understand this set of requirements without being
confused by other useful, but different, goals. ...
[Here] I'm going to describe one set of requirements, as best I
understand it, in my own words. The name is not important. This set
of requirements is. ...
Let me mention a few things that are not requirements of this
facility. They may be useful features in some other context, but
they are not needed in order to have universal names, and should not
be confused with universal names:
We do not require an ability to rename elements, so that they can be
called one thing in a schema and something else in a document instance.
We do not require the ability to associate multiple semantic meanings
with a single name.
In short, what we need, and all that we need, is a facility that
gives every element's type a universal name, and allows a single
element type to be known by the same name across disparate
documents, where the documents have different "document types" or
where there is no specific document type.
When Andrew Layman says, "We do not require an ability to rename
elements, so that they can be called one thing in a schema and
something else in a document instance," he is backhandedly stating a
requirement that conflicts with the evolutionary process of defining
and marketing new products. How will the catalog of everything that
is for sale handle a case where the same product characteristic, or
even the same entire product, arises from multiple industries
simultaneously, and each of those industries already uses its own
authoritative schema? Will the contents of documents have to be
duplicated and translated so as to conform with multiple schemas, so
that different comparisons can be made? If so, that will cause much
of the value of making the comparisons in the first place to be lost;
features regarded by authorities as "out of category" will simply
disappear. Imagine a single device that is a fax machine, a
telephone, a copier, a computer, and a stereo sound system. Should it
appear in a list of telephones? Maybe. Should the output wattage of
its amplifier be listable in a comparison with the output wattage of
other telephones? Maybe. Should the people who figure out what are
the interesting characteristics of telephones anticipate that output
wattage may be an important characteristic of telephones? It's
completely unrealistic to expect those people to anticipate that.
And, yet, it's an interesting and relevant statistic and it may be
important to some consumers.
The ugly truth is that we can't predict whether information that is
now thought to be irrelevant to other information (or, maybe we don't
even know about the existence of the other information yet) will turn
out to be semantically identical or semantically mappable. In my own
mind, anyway, the real justification for the existence of businesses
that provide "yellow pages on steroids" in support of internet
commerce is to provide the added value of mapping semantics to each
other in such a way that they can be directly compared, just as Jon
says. That mapping can be expressed in some proprietary fashion, or
it can be done using SGML documents that inherit from multiple SGML
architectures, or, if XML supports it, it can be done with XML
documents that inherit from multiple XML architectures, with no limit
on the number of XML architectures that can be inherited, and no
limits on the number of architectures that can usefully be fielded by
old and new industries. If Andrew Layman's much more limited
requirements govern the design of XML, though, XML documents that
represent such semantic mappings will be more costly to create and
maintain. (I guess you'd have to do it all with hyperlinks. Anything
can be done with hyperlinks, but that doesn't mean that everything
*should* be done with hyperlinks. In general, hyperlinks are best
regarded by information managers as a last resort because they cost
more to maintain and their structure is arbitrary and external. It's
better if the information, in effect, maps itself. Inheritable SGML
architectures allow information to map itself in complex ways. Why
shouldn't it be possible to accomplish the same end in XML, without
requiring the use of hyperlinks?)
So, I continue to harp on the importance of allowing a single element
to inherit multiple semantics (and/or the _same_ semantic differently
named or named within different namespaces). Andrew Layman says, "We
do not require the ability to associate multiple semantic meanings
with a single name." But, in my own mind, anyway, this really *is* a
requirement for cataloging companies to extract maximum value from
their listings at minimum information management cost in a dynamic,
non-authoritarian market environment. It would allow internet catalog
providers to map each new DTD into their existing DTDs simply by
tweaking their existing DTDs. For example, in the DTD for their
catalog of telephone products, when the output wattage issue first
arises (i.e., when a telephone appears on the market that lists an
output wattage), a declaration is added that allows the
characteristics listed in the DTD for the manufacturer's product
description document to be inherited. In the same declaration, the
features of the product, such as its "colour", can be mapped to the
things that are the same that are already in the DTD, (such as
"color"). The new feature, "outputWattage", can be made to appear
with a default value of "not applicable", so now all the existing
telephone product listings have this feature, and they can all respond
meaningfully (if uninterestingly) to queries about it. No need to
create and maintain (!) any hyperlinks. No need to write or maintain
any extra documents. One change in one place updates all telephone
products listed in the catalog, regardless of how many there are. The
amount of information stored hardly increases at all, but the value of
the information increases quite a lot. Essentially the same change
can be applied to the DTDs for stereo systems (now they can have a
redial feature, yes or no), the DTD for copiers, etc. Cheap and very
powerful, no? The catalog provider gets to add a terrific amount of
value at very little cost. New products can be found by consumers
even if they didn't know the hybrid category existed. ("I want a very
loud telephone. Hmmm.") New products for untried niches can be
usefully listed in multiple catalogs. Innovation is not penalized for
being unanticipated by the authorities who created DTDs for product
listings in various categories, or by the failure to recognize a
viable category. Indeed, there is no need for such authorities at
all. There is only a need for catalogers who can read and understand
incoming DTDs and perform these cheap semantic mapping tricks.
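The mapping trick can be imitated, very loosely, in a few lines of Python. This is only an analogy in plain dictionaries, not SGML architectures or DTD declarations; `NAME_MAP`, `CATALOG_DEFAULTS` and `to_catalog_listing` are hypothetical names chosen for this sketch, and the listings are made up.

```python
# Hypothetical mapping from a manufacturer's feature names to the names the
# cataloger already uses. This is the "one change in one place".
NAME_MAP = {"colour": "color"}

# Every feature the telephone catalog knows about, with the value reported
# for listings that say nothing about it.
CATALOG_DEFAULTS = {
    "color": "not applicable",
    "redial": "not applicable",
    "outputWattage": "not applicable",  # added when the loud telephone arrived
}


def to_catalog_listing(vendor_listing):
    """Express a vendor's listing in the cataloger's vocabulary."""
    listing = dict(CATALOG_DEFAULTS)
    for name, value in vendor_listing.items():
        # Names without a mapping are kept as the vendor wrote them.
        listing[NAME_MAP.get(name, name)] = value
    return listing


old_phone = to_catalog_listing({"color": "black", "redial": "yes"})
loud_phone = to_catalog_listing({"colour": "grey", "redial": "yes", "outputWattage": "40"})

print(old_phone["outputWattage"])   # not applicable
print(loud_phone["color"], loud_phone["outputWattage"])  # grey 40
```

Existing listings answer a query about `outputWattage` without being edited, and the new product's "colour" lines up with everyone else's "color".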
You can do all this now with SGML (as of August 1, 1997; see
http://www.ornl.gov/sgml/wg8/document/1920.htm). The only question is
whether XML will be able to do it. Maybe it doesn't matter; providers
of internet shopping directories can always maintain their source
information in SGML and simply deliver it in XML form, if they like.
(Or in HTML form, for that matter.)
-Steve
--
Steven R. Newcomb President
voice +1 716 271 0796 TechnoTeacher, Inc.
fax +1 716 271 0129 (courier: 23-2 Clover Park,
Internet: srn@techno.com Rochester NY 14618)
FTP: ftp.techno.com P.O. Box 23795
WWW: http://www.techno.com Rochester, NY 14692-3795 USA
From jwrobie at mindspring.com Thu Sep 25 22:34:16 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:29 2004
Subject: XML iMarket Project Planning Meeting
Message-ID: <1.5.4.32.19970925200436.01683f9c@pop.mindspring.com>
At 10:14 AM 9/25/97 -0700, ark@DB.Stanford.EDU wrote:
>I certainly agree with your goal, but I don't agree with the means.
>The experience I have is that standards do not work well in this area.
>What we need is an approach that allows the cross-comparison that you
>want, and yet allows for differentiation, experimentation, and
>evolution.
Perhaps the standards could describe architectural forms which would be the
basis for more individual DTDs created by each vendor. This allows searches
to be done for anything in the architectural forms, but still allows each
vendor to have additional information. Because each vendor has a DTD,
documents can still be validated when they are authored, even though they
have vendor-specific information. Because the DTDs are based on common
architectures, searches can be done across vendors.
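A small Python sketch of how such a search might look. It assumes, for illustration only, that each vendor-specific element carries a hypothetical `form` attribute naming the shared architectural form it corresponds to; in SGML practice an attribute like this would more likely be supplied as a fixed default by the vendor's DTD than typed into every instance. The element names and values are made up.

```python
import xml.etree.ElementTree as ET

# Each vendor uses its own element names and adds its own extra information.
VENDOR_A = '<jacket form="product"><cost form="price">80</cost><lining>fleece</lining></jacket>'
VENDOR_B = '<item form="product"><amount form="price">95</amount><warranty>2 years</warranty></item>'


def values_for_form(xml_text, form):
    """Text of every element that declares itself an instance of the given form."""
    root = ET.fromstring(xml_text)
    return [element.text for element in root.iter() if element.get("form") == form]


# The search names the shared form, not either vendor's element name.
prices = [values_for_form(doc, "price") for doc in (VENDOR_A, VENDOR_B)]
print(prices)
```

Vendor-specific elements such as `lining` and `warranty` stay in the documents but are invisible to a search that only knows the common forms.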
Jonathan
***************************************************************************
Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com
***************************************************************************
From gannon at commerce.net Fri Sep 26 00:24:09 1997
From: gannon at commerce.net (Patrick Gannon)
Date: Mon Jun 7 16:58:29 2004
Subject: XML & Catalogs
Message-ID: <01BCC9C3.094BCEA0@sphynx-d105.sierra.net>
Steven,
Nice to hear from someone who "gets it" regarding the impact of XML on future usage & searchability of internet catalogs.
Since this topic has spilled over from the original meeting posting and generated significant interest, I will request a listserv be established for xml-catalog. This will allow for application oriented discussions of XML that are not related to development (XML-DEV) or EDI (XML-EDI) issues that have their own listserv.
Patrick Gannon
From peat at erols.com Fri Sep 26 01:17:12 1997
From: peat at erols.com (peat)
Date: Mon Jun 7 16:58:29 2004
Subject: XML & Catalogs
Message-ID: <199709252308.TAA11756@smtp1.erols.com>
Before you do this, we need to ask ourselves: is there, or should there be, a
significant difference in namespace and other mechanisms depending on the use of
the object? Is there that much of a difference in how we describe an article,
say a "red sweater", if the item is in a catalog, stored in an object
repository or exchanged in a Purchase Order? Significant enough to split the
group?
Let me propose we introduce a collaborative means of keeping the collection
(which is still relatively small) of people together. The XML/EDI Group will
soon have this capability through its subgroups and via a generous donation
from an outside corporation. It should be up and running in a few weeks. Just a
thought, before splintering off the main path.
- Bruce
----------
Steven,
Nice to hear from someone who "gets it" regarding the impact of XML on future
usage & searchability of internet catalogs.
Since this topic has spilled over from the original meeting posting and
generated significant interest, I will request a listserv be established for
xml-catalog. This will allow for application oriented discussions of XML
that are not related to development (XML-DEV) or EDI (XML-EDI) issues that
have their own listserv.
Patrick Gannon
xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
----------
From Peter at ursus.demon.co.uk Fri Sep 26 01:22:26 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:29 2004
Subject: XML & Catalogs
Message-ID: <10312@ursus.demon.co.uk>
In message <01BCC9C3.094BCEA0@sphynx-d105.sierra.net> Patrick Gannon writes:
> Steven,
>
> Nice to hear from someone who "gets it" regarding the impact of XML on
> future usage & searchability of internet catalogs.
>
> Since this topic has spilled over from the original meeting posting and
> generated significant interest, I will request a listserv be established
> for xml-catalog. This will allow for application oriented discussions
I think there is potential confusion in the word 'catalog', because of the
SGML Open Catalog. Some XML software such as NXP supports such Catalogs,
although at present (I think) it is not formally part of XML.
If possible I would hope that 'XML Catalog' and xml-catalog (if they exist
at all) were reserved for this usage - otherwise there could be a lot of
confusion.
A general point is the use of the XML-* prefix. Within XML itself it is
reserved (e.g. xml-space, xml-link) and I think we should avoid pre-empting
possible uses of XML-*. Of course 'XML-DEV' falls into the same trap... :-)
I'm assuming that this is not a request for Henry and me to set up another
listserv, because one is about our limit :-).
P.
> of XML that are now related to development (XML-DEV) or EDI (XML-EDI)
> issues that have their own listserv.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From elm at arbortext.com Fri Sep 26 01:35:54 1997
From: elm at arbortext.com (Eve L. Maler)
Date: Mon Jun 7 16:58:29 2004
Subject: XML iMarket Project Planning Meeting
Message-ID: <3.0.32.19970925193549.00ab5490@village.doctools.com>
(I just posted this directly to xml-dev; if any of the iMarket folks wants
to post this to the original recipients of the thread, be my guest...)
At the Montreal face-to-face XML WG meeting, Eliot Kimber mentioned a cool
idea: Schemas can be in the business of providing synonyms for semantics
published in other schemas. Schemas can also be in the business of
providing mappings from names to multiple schemas.
Thus, if you want to use your own name for something, you can create a
schema (why not even use AF syntax?) that does nothing but map your name to
the "standard" one or to several "standard" ones. So my personal schema
can map eve:gazorninplat to both dc:subject and docbook:subject if I want
it to.
This could have some interesting consequences:
o You could chain schemas as much as necessary to get your desired effect.
o An interesting market in derivative schemas could develop.
o XML-only documents wouldn't require full AFDR functionality.
So Jonathan's suggestion below could be seen as a suggestion to create a
base schema using AFDR syntax, which others could use directly, or in
modified form by inserting another schema.
I don't know, maybe all this is obvious to everybody else, but seeing the
problem this way blows my mind. It makes me think that (ironically?) the
first obvious candidate for "non-DTD" schema syntax is AFDRs.
Eve
At 04:04 PM 9/25/97 -0400, Jonathan Robie wrote:
>Perhaps the standards could describe architectural forms which would be the
>basis for more individual DTDs created by each vendor. This allows searches
>to be done for anything in the architectural forms, but still allows each
>vendor to have additional information. Because each vendor has a DTD,
>documents can still be validated when they are authored, even though they
>have vendor-specific information. Because the DTDs are based on common
>architectures, searches can be done across vendors.
From srn at techno.com Fri Sep 26 05:11:18 1997
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 16:58:29 2004
Subject: Retraction and apology
In-Reply-To: <199709252008.QAA01199@bruno.techno.com> (srn@techno.com)
Message-ID: <199709260306.XAA01444@bruno.techno.com>
Some of you who received the note I sent to you earlier today should
not have received the material written by Andrew Layman that I quoted
and which was previously distributed only within the confines of W3C.
I should not have quoted it in a note that was being publicly
distributed.
In my own (pretty weak) defense: I didn't notice that, for example,
the xml-dev list was in the address list; I merely scanned the list of
addresses to verify that, in fact, it was a list with a lot of
insiders. I should have verified that the list contained no
*outsiders*, but I inexplicably failed to do that, blithely assuming
from the list's provenance, insider topic, insider tenor, and
recognizable insider addressees that it was a discussion taking place
within the family. I should have been more careful; this was
definitely a poor algorithm.
I must ask you folks who were not supposed to see the Layman material
to destroy it and forget it. Anyway, it's an internal discussion,
and, therefore, you can't know the context.
W3C people: I would not blame you for withdrawing my access to the
discussion. My blunder has caused some pain, and I regret that.
-Steve
--
Steven R. Newcomb President
voice +1 716 271 0796 TechnoTeacher, Inc.
fax +1 716 271 0129 (courier: 23-2 Clover Park,
Internet: srn@techno.com Rochester NY 14618)
FTP: ftp.techno.com P.O. Box 23795
WWW: http://www.techno.com Rochester, NY 14692-3795 USA
From Jon.Bosak at eng.Sun.COM Fri Sep 26 17:33:38 1997
From: Jon.Bosak at eng.Sun.COM (Jon Bosak)
Date: Mon Jun 7 16:58:29 2004
Subject: XML & Catalogs
In-Reply-To: <01BCC9C3.094BCEA0@sphynx-d105.sierra.net> (message from Patrick Gannon on Thu, 25 Sep 1997 14:55:12 -0700)
Message-ID: <199709261530.IAA13761@boethius.eng.sun.com>
| Since this topic has spilled over from the original meeting posting
| and generated significant interest, I will request a listserv be
| established for xml-catalog. This will allow for application oriented
| discussions of XML that are now related to development (XML-DEV) or
| EDI (XML-EDI) issues that have their own listserv.
Thanks, Patrick. Like Steve Newcomb, I didn't notice that this thread
was being copied to xml-dev when I posted to it. We should start over
on the new list server.
Jon
From Peter at ursus.demon.co.uk Fri Sep 26 21:18:34 1997
From: Peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun 7 16:58:29 2004
Subject: Retraction and apology
Message-ID: <10329@ursus.demon.co.uk>
In message <199709260306.XAA01444@bruno.techno.com> "Steven R. Newcomb" writes:
>
> I must ask you folks who were not supposed to see the Layman material
> to destroy it and forget it. Anyway, it's an internal discussion,
> and, therefore, you can't know the context.
Mailings to xml-dev are not only posted to subscribers, but also hypermailed.
I have no idea what people or robots copy material from this list, but I expect
that this happens. The messages are stored in a mail box, regenerated into
hypertext at regular intervals and it isn't feasible to delete messages from
the archive without a great deal of work. The moving finger writes... sorry.
P.
--
Peter Murray-Rust, domestic net connection
Virtual School of Molecular Sciences
http://www.vsms.nottingham.ac.uk/
From tbray at textuality.com Sat Sep 27 00:30:46 1997
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun 7 16:58:29 2004
Subject: First XML Book?
Message-ID: <3.0.32.19970926152707.00944510@pop.intergate.bc.ca>
Just got my copy in the mail of "Presenting XML", mostly by Richard Light,
from SamsNet. 400 pages, suffers from being a snapshot of a moving target,
but, I think, a worthy first volume in the soon-to-be-large XML library.
ISBN 1-57521-334-6. -Tim
From srn at techno.com Mon Sep 29 04:35:49 1997
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 16:58:29 2004
Subject: please consider whether
Message-ID: <199709290233.WAA01182@bruno.techno.com>
[Patrick Gannon:]
> Since this topic has spilled over from the original meeting posting
> and generated significant interest, I will request a listserv be
> established for xml-catalog. This will allow for application
> oriented discussions of XML that are now related to development
> (XML-DEV) or EDI (XML-EDI) issues that have their own listserv.
Patrick -- Here is a note to post on the listserv. -- Steve
**********************************************************************
This note asks those in the online product catalog business to
consider whether they need XML to support SGML Architectures --
multiple architectural inheritance. (Others may also find it
interesting.)
The designers of XML want to know why multiple architectural
inheritance is a feature that should remain unsupported, at least
temporarily.
If you want to use and benefit from the "SGML Architectures" notion
outlined in my earlier note (attached below), I believe you should now
consider (while you still have an option in the matter) whether you
want to be able to use XML for your company-internal "information
source code" for all the information that is the essence of your
company's value. An ISO standard alternative, SGML/HyTime, is also
available for that purpose.
On the one hand, SGML/HyTime is one helluva strong set of paradigms,
of which XML and all the things currently present in or planned for
XML (linking, addressing, metadata) are a proper subset. Together,
these paradigms put the information manager and owner in maximum
control of the cost of creating and maintaining information about
information.
On the other hand, XML will have a wider audience. XML data will flow
across the internet to an awful lot of users (or so we think, anyway)
who won't have full SGML/HyTime capabilities in their systems any time
soon.
If, because your internal databases are limited in functionality to
the representational power of XML, your internal applications cannot
deliver the cost-cutting power of SGML/HyTime for creating and
maintaining massive amounts of n-dimensional (and n-dimensionally
interrelated) information, maybe that's ok because the potential for
higher code maintenance costs is worth the convenience of being able
to dump copies of sections of your metadata source code directly out
to the internet. (Somehow the latter doesn't seem to me a very good
business idea, but that's for you to decide.)
You might be able to avoid having to make this decision early by
letting the w3c-xml-sig group know that your business applications
expect to benefit from multiple architectural inheritance a la SGML
Architectures, so you'd like to have XML support SGML Architectures
sooner, rather than later.
I'm not particular about whatever reason you may have for expressing
to the w3c-xml-sig group your interest (if any) in SGML Architectures;
I just think the online product catalog industry should consider doing
so, and very soon indeed.
I've already made clear my own reasons for bringing this issue up in
my earlier note. For your convenience, I'm attaching it below (sans
some stuff I shouldn't have put in in the first place because it was
from an unpublished W3C discussion about XML).
-Steve
--
Steven R. Newcomb President
voice +1 716 271 0796 TechnoTeacher, Inc.
fax +1 716 271 0129 (courier: 23-2 Clover Park,
Internet: srn@techno.com Rochester NY 14618)
FTP: ftp.techno.com P.O. Box 23795
WWW: http://www.techno.com Rochester, NY 14692-3795 USA
********************************************************************************
*** Not as originally posted. Unpublished W3C material has been deleted. ***
Date: Thu, 25 Sep 1997 16:08:44 -0400
Message-Id: <199709252008.QAA01199@bruno.techno.com>
From: "Steven R. Newcomb"
To: Jon.Bosak@eng.Sun.COM
CC: ark@DB.Stanford.EDU, gannon@commerce.net, brucek@agentsoft.com,
btait@mercantec.com, caallen@webmethods.com,
claire_celeste_carnes@ccm.jf.intel.com, dmarquis@kinetoscope.com,
f.deschamps@bull.com, harvey@eccnet.eccnet.com, jmt@commerce.net,
Jon.Bosak@eng.Sun.COM, jonathan@poet.com, jonlewis@cngroup.com,
marthao@icat.com, Michael.Leventhal@grif.fr, paul@arbortext.com,
pjordan@microstar.com, ptrevithick@bitstream.com, rcw@commerce.net,
smith@adobe.com, tbadger@kodak.com, trung@ondisplay.com,
weld@cs.washington.edu, xml-dev@ic.ac.uk, andrewl@microsoft.com,
higginsc@lanepowell.com
In-reply-to: <199709251550.IAA13057@boethius.eng.sun.com>
(Jon.Bosak@eng.Sun.COM)
Subject: Re: XML iMarket Project Planning Meeting
[Jon Bosak:]
> What I as a consumer want to be able to do is quite simple. I want to
> be able to say, "Hey, I need a new jacket," sit down at my computer,
> call up my find-a-product robot, enter my jacket parameters, and then
> come back a while later to find all the jackets that fit those
> parameters offered by all the vendors whose products I'm interested in
> considering. If the catalog scheme isn't standardized enough to
> support this, then I as a consumer am not interested in using it. If
> one of the vendors differentiates itself by adopting a scheme of data
> representation that doesn't allow this kind of transparent direct
> comparison, then it differentiates itself right out of the class of
> vendors I'm interested in, because if all it's giving me is the
> ability to cruise its catalog in isolation, I can get the same
> functionality from the printed version; it no longer participates in a
> way that allows the net to add value to me as a consumer.
>
> I'm not denying that vendors will want to differentiate their
> offerings, but if they can't do it in a way that supports detailed
> direct comparisons based on the differentia that I am interested in
> *as a consumer* then they are simply not in the game at all.
There is a very serious problem here that bears strikingly on an
ongoing discussion in XML-land: the discussion of so-called
"namespaces". The idea that there will be consortia of vendors, or
any other sort of authority who will determine some list of names of
characteristics of each sort of product, so that characteristics can
be directly and automatically compared, is dangerous to innovation,
competition, and commerce, and it is totally unnecessary, too. It
will open the door for existing businesses to use such architectures
as weapons against upstarts in niche markets and in unusual or new
market combinations. Moreover, the use of information architectures
as weapons will always seem like perfectly reasonable business
practices, so it will be nobody's fault when new concepts fail to be
accepted in the marketplace, because the internet failed to live up to
its promise of helping people find what they are looking for and make
informed purchasing decisions. The macroeconomy will be damaged.
*** Mr. (or Ms.) X *** (whom I do not know, but would like to) has
laid out a list of requirements for the implementation of namespaces
which, if used as guidance in the development of XML's namespace
features, will create a need for authorities who give "standard" names
to such things as product characteristics. The concentration of power
in such authorities will hinder innovation, by making it difficult to
compare products regarded as "out of category" for some authority's
set of defined names.
*** [To say that there is no industrial requirement for XML to support
multiple architectural inheritance is to place the design of
XML in conflict] *** with the evolutionary process of defining
and marketing new products. How will the catalog of everything that
is for sale handle a case where the same product characteristic, or
even the same entire product, arises from multiple industries
simultaneously, and each of those industries already uses its own
authoritative schema? Will the contents of documents have to be
duplicated and translated so as to conform with multiple schemas, so
that different comparisons can be made? If so, that will cause much
of the value of making the comparisons in the first place to be lost;
features regarded by authorities as "out of category" will simply
disappear. Imagine a single device that is a fax machine, a
telephone, a copier, a computer, and a stereo sound system. Should it
appear in a list of telephones? Maybe. Should the output wattage of
its amplifier be listable in a comparison with the output wattage of
other telephones? Maybe. Should the people who figure out what are
the interesting characteristics of telephones anticipate that output
wattage may be an important characteristic of telephones? It's
completely unrealistic to expect those people to anticipate that.
And, yet, it's an interesting and relevant statistic and it may be
important to some consumers.
The ugly truth is that we can't predict whether information that is
now thought to be irrelevant to other information (or, maybe we don't
even know about the existence of the other information yet) will turn
out to be semantically identical or semantically mappable. In my own
mind, anyway, the real justification for the existence of businesses
that provide "yellow pages on steroids" in support of internet
commerce is to provide the added value of mapping semantics to each
other in such a way that they can be directly compared, just as Jon
says. That mapping can be expressed in some proprietary fashion, or
it can be done using SGML documents that inherit from multiple SGML
architectures, or, if XML supports it, it can be done with XML
documents that inherit from multiple XML architectures, with no limit
on the number of XML architectures that can be inherited, and no
limits on the number of architectures that can usefully be fielded by
old and new industries. *** [Without multiple architectural
inheritance, XML documents that represent such semantic mappings will
be more costly to create and maintain. (I guess you'd have to do it
all with hyperlinks. Anything can be done with hyperlinks, but that
doesn't mean that everything *should* be done with hyperlinks. In
general, hyperlinks are best regarded by information managers as a
last resort because they cost more to maintain and their structure is
arbitrary and external. It's better if the information, in effect,
maps itself. Inheritable SGML architectures allow information to map
itself in complex ways. Why shouldn't it be possible to accomplish
the same end in XML, without requiring the use of hyperlinks?)
So, I continue to harp on the importance of allowing a single element
to inherit multiple semantics (and/or the _same_ semantic differently
named or named within different namespaces). *** [Other opinions
notwithstanding,] *** in my own mind, anyway, this really *is* a
requirement for cataloging companies to extract maximum value from
their listings at minimum information management cost in a dynamic,
non-authoritarian market environment. It would allow internet catalog
providers to map each new DTD into their existing DTDs simply by
tweaking their existing DTDs. For example, in the DTD for their
catalog of telephone products, when the output wattage issue first
arises (i.e., when a telephone appears on the market that lists an
output wattage), a declaration is added that allows the
characteristics listed in the DTD for the manufacturer's product
description document to be inherited. In the same declaration, the
features of the product, such as its "colour", can be mapped to the
equivalent things that are already in the DTD (such as
"color"). The new feature, "outputWattage", can be made to appear
with a default value of "not applicable", so now all the existing
telephone product listings have this feature, and they can all respond
meaningfully (if uninterestingly) to queries about it. No need to
create and maintain (!) any hyperlinks. No need to write or maintain
any extra documents. One change in one place updates all telephone
products listed in the catalog, regardless of how many there are. The
amount of information stored hardly increases at all, but the value of
the information increases quite a lot. Essentially the same change
can be applied to the DTDs for stereo systems (now they can have a
redial feature, yes or no), the DTD for copiers, etc. Cheap and very
powerful, no? The catalog provider gets to add a terrific amount of
value at very little cost. New products can be found by consumers
even if they didn't know the hybrid category existed. ("I want a very
loud telephone. Hmmm.") New products for untried niches can be
usefully listed in multiple catalogs. Innovation is not penalized for
being unanticipated by the authorities who created DTDs for product
listings in various categories, or by the failure to recognize a
viable category. Indeed, there is no need for such authorities at
all. There is only a need for catalogers who can read and understand
incoming DTDs and perform these cheap semantic mapping tricks.
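The telephone example above can be mimicked with a rough Python sketch. It is not SGML architectures and not DTD syntax; the `DECLARATION` dictionary and the `view` function are hypothetical stand-ins, assumed only to show how one declaration made in one place can give every listing the new "outputWattage" feature and map "colour" onto "color".

```python
# One declaration, added once to the (hypothetical) telephone catalog:
# synonyms map incoming names to existing ones, defaults supply new features.
DECLARATION = {
    "synonyms": {"colour": "color"},
    "defaults": {"outputWattage": "not applicable"},
}


def view(listing, declaration):
    """Return the listing as seen through the declaration; the input is not modified."""
    out = dict(declaration["defaults"])
    for name, value in listing.items():
        # Rename synonyms; names without a synonym pass through unchanged.
        out[declaration["synonyms"].get(name, name)] = value
    return out


# An existing listing, written before anyone thought of output wattage.
old_phone = {"color": "black", "redial": "yes"}
# A manufacturer's description of the hybrid device, using its own names.
loud_phone = {"colour": "red", "redial": "yes", "outputWattage": "40"}

for listing in (old_phone, loud_phone):
    print(view(listing, DECLARATION))
```

The stored listings are never edited; only the single declaration changes, which is the cost argument made above.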
You can do all this now with SGML (as of August 1, 1997; see
http://www.ornl.gov/sgml/wg8/document/1920.htm). The only question is
whether XML will be able to do it. Maybe it doesn't matter; providers
of internet shopping directories can always maintain their source
information in SGML and simply deliver it in XML form, if they like.
(Or in HTML form, for that matter.)
-Steve
--
Steven R. Newcomb President
voice +1 716 271 0796 TechnoTeacher, Inc.
fax +1 716 271 0129 (courier: 23-2 Clover Park,
Internet: srn@techno.com Rochester NY 14618)
FTP: ftp.techno.com P.O. Box 23795
WWW: http://www.techno.com Rochester, NY 14692-3795 USA
From paul_madsen at qmail.newbridge.com Mon Sep 29 16:11:21 1997
From: paul_madsen at qmail.newbridge.com (Paul Madsen)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID:
9:31 AM 29/09/97
Hi, I posted this to comp.text.sgml but didn't get much response (thanks J.R.)
_________
The XML-Data specification from Microsoft
(http://www.sil.org/sgml/xml-data9706223.htm) proposes that the logic
traditionally expressed in the DTD (content models, attribute lists,
entity definitions, etc.) be expressed using the syntax of XML instances
instead.
For instance, instead of the DTD element declaration
the XML-Data scheme rule would be something like
I'm attracted to the idea if only because it seems "cool".
But what does this gain us? What deficiencies with the DTD formalism does
it address?
Is it the ability to extend object types so that one class of object is a
specialization of another more general class?
Do not Architectural forms provide the traditional DTD syntax just that
ability?
Thanks for any insight.
Paul
From RMcDouga at JetForm.com Mon Sep 29 16:26:38 1997
From: RMcDouga at JetForm.com (Rob McDougall)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID:
If I remember correctly, the advantages are listed in the spec. The main
advantage being that you can include the XML-Data definition within the XML
file itself, so that you now can send a completely self-describing file
that can be read by a single (XML) parser.
Rob
=======================================================
Rob McDougall Phone: (613)751-4800 ext.5232
JetForm Corporation Fax: (613)594-8886
http://www.jetform.com mailto:rmcdouga@jetform.com
=======================================================
-----Original Message-----
From: Paul Madsen [SMTP:paul_madsen@qmail.newbridge.com]
Sent: Monday, September 29, 1997 9:46 AM
To: XML DEV
Subject: XML-Data: advantages over DTD syntax?
9:31 AM 29/09/97
Hi, I posted this to comp.text.sgml but didn't get much response (thanks
J.R.)
_________
The XML-Data specification from Microsoft
(http://www.sil.org/sgml/xml-data9706223.htm) proposes that the logic
traditionally expressed in the DTD (content models, attribute lists,
entity definitions, etc.) be expressed using the syntax of XML instances
instead.
For instance, instead of the DTD element declaration
the XML-Data scheme rule would be something like
I'm attracted to the idea if only because it seems "cool".
But what does this gain us? What deficiencies with the DTD formalism does
it address?
Is it the ability to extend object types so that one class of object is a
specialization of another more general class?
Do not Architectural forms provide the traditional DTD syntax just that
ability?
Thanks for any insight.
Paul
From michael at textscience.com Mon Sep 29 17:41:18 1997
From: michael at textscience.com (Michael Leventhal)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To:
Message-ID: <3.0.1.32.19970929080238.0083c5c0@aimnet.com>
At 09:46 AM 9/29/97 -0400, Paul Madsen wrote:
>But what does this gain us? What deficiencies with the DTD formalism does it
>address?
>
>Is it the ability to extend object types so that one class of object is a
>specialization of another more general class?
IMHO, this is a strong reason to chuck DTDs as they now exist. But not
a goal of XML-DATA.
>Do not Architectural forms provide the traditional DTD syntax just that
>ability?
So say some, but not really.
Michael Leventhal
______________________________________________________________________
Michael Leventhal Internet : http://www.grif.com
G R I F , S. A. Email : Michael.Leventhal@grif.fr
VP, Technology Telephone : 510-444-2962
1800 Lake Shore Ave Ste 14 Fax : 510-444-1672
Oakland, California 94606 France : (011) 33 1 30121430 (fr US)
______________________________________________________________________
From jwrobie at mindspring.com Mon Sep 29 17:51:05 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <1.5.4.32.19970929154902.00a202a4@pop.mindspring.com>
XML-Data adds several features that hard-core object oriented folks
appreciate:
1. True inheritance, with semantics more similar to that of OO
languages than indirect mechanisms used to simulate inheritance when
using architectural forms. Architectural forms do not really give us
what OO folks call inheritance.
2. Reflection - the ability to modify the content model at run-time.
3. The syntax for the content model is the same as the syntax for
data, making it easier to write code to manipulate both.
Of course, all existing SGML and XML tools know how to deal with DTDs,
and this is a rather major departure from traditional SGML. It has not
been blessed by any standardization committee. Given the way Microsoft
has approached Java, insisting that it need not implement the portable
libraries everyone else is using, and encouraging people to use their
platform-specific libraries instead, it is easy to wonder what will
happen to the SGML world if Microsoft is in control of an alternative
method of specifying content models.
According to MS representatives, there *will* be tools to transform
XML-Data content models into DTDs, but still, the "real" content model
is in the XML-Data. Is it worth it in order to gain true inheritance
and reflection? For some applications, it may well be. If Microsoft
controls XML-Data, and some vendors support it but others do not, will
we have the same kind of market fragmentation that we have in the Java
world today, where Microsoft is refusing to support the Java standard
libraries, and instead insists that developers should use their own
libraries, which run only on Windows operating systems?
Who knows!
Jonathan
***************************************************************************
Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com
***************************************************************************
From gray at interlog.com Mon Sep 29 18:08:46 1997
From: gray at interlog.com (Graydon Hoare)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To:
Message-ID:
> I'm attracted to the idea if only because it seems "cool".
I think the general reasoning behind xml-data and XSL (shiver of horror)
is that if we settle on a uniform representation for graph-structured data
in transit then we can (soon) live in a world where nobody has to write a
parser for the stuff ever again. I mean, a scheme parser isn't exactly
brain surgery so I'm less inclined to enjoy this argument when used in
favour of XSL, but XSL has other reasons for existing. writing a DTD
parser with architectural forms support is just another stumbling block to
wide deployment of XML, and xml-data nicely circumvents the question. You
can just write an XML parser (in a shoddy one-off proof of concept as many
people are busy writing) and write your validator in terms of the objects
the tried and true parser hands you. Given that those objects have really
simple property-querying methods, it makes your code better encapsulated,
less likely to mix validating with the parsing of architectural forms.
at least that's the principal advantage I see.
cool side note: you can use a DSSSL engine to customize an XML-DATA grove
and dump out a new document type ;) or at very least typeset the metadata
in a nice way..
-graydon
______________________
peccatum poena peccati
From peter at techno.com Mon Sep 29 18:46:19 1997
From: peter at techno.com (Peter Newcomb)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To: <1.5.4.32.19970929154902.00a202a4@pop.mindspring.com> (message
from Jonathan Robie on Mon, 29 Sep 1997 11:49:02 -0400)
Message-ID: <199709291643.MAA29767@exocomp.techno.com>
[Jonathan Robie on Mon, 29 Sep 1997 11:49:02 -0400]
> XML-Data adds several features that hard-core object oriented folks
> appreciate:
>
> 1. True inheritance, with semantics more similar to that of OO
> languages than indirect mechanisms used to simulate inheritance when
> using architectural forms. Architectural forms do not really give us
> what OO folks call inheritance.
Could you elaborate upon this distinction between architectural form
inheritance and "true OO inheritance"? What about XML-data makes it
capable of supporting "truer" inheritance than architectural forms?
-peter
--
Peter Newcomb TechnoTeacher, Inc.
peter@petes-house.rochester.ny.us peter@techno.com
http://www.petes-house.rochester.ny.us http://www.techno.com
From jwrobie at mindspring.com Mon Sep 29 19:29:20 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <1.5.4.32.19970929172831.0098672c@pop.mindspring.com>
At 12:43 PM 9/29/97 -0400, Peter Newcomb wrote:
>[Jonathan Robie on Mon, 29 Sep 1997 11:49:02 -0400]
>> XML-Data adds several features that hard-core object oriented folks
>> appreciate:
>>
>> 1. True inheritance, with semantics more similar to that of OO
>> languages than indirect mechanisms used to simulate inheritance when
>> using architectural forms. Architectural forms do not really give us
>> what OO folks call inheritance.
>
>Could you elaborate upon this distinction between architectural form
>inheritance and "true OO inheritance"? What about XML-data makes it
>capable of supporting "truer" inheritance than architectural forms?
Let me preface this by saying that I am fairly new to both XML-data and
architectural forms, and I am perfectly willing to be shown wrong on this
statement. Let me explain some properties I see in XML-Data which I have not
yet been able to mirror completely using architectural forms. Since you know
much more about architectural forms than I do, I'll let you tell me if there
is an exact equivalent using architectural forms. In fact, this could be a
great opportunity to do a better comparison than I can do by myself.
In C++, Java, Smalltalk, and other OO languages, if I say that "a duck is an
animal", that means: (1) a duck always has all the data associated with an
animal, (2) a duck has the behavior associated with an animal (unless you
specifically say that a duck does certain things differently), and (3)
references to generic animals can also point to ducks. To put this in
traditional OO terms, Duck inherits data, behavior, and type from Animal. In
SGML, it can't inherit behavior, but it can inherit data and type.
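The three properties listed above can be sketched in a few lines of Python. This is only an illustration of the OO side of the comparison; the class names, the `speak` method, and the `describe` function are hypothetical and appear in neither XML-Data nor the architectural forms material.

```python
class Animal:
    def __init__(self, name):
        self.name = name                      # (1) data every animal carries

    def speak(self):
        return f"{self.name} makes a sound"   # (2) default behavior


class Duck(Animal):
    def speak(self):                          # a duck does this one thing differently
        return f"{self.name} quacks"


class Cat(Animal):
    pass                                      # inherits data and behavior unchanged


def describe(animal: Animal) -> str:
    # (3) a reference to a generic Animal may point to any subclass instance
    return animal.speak()
```

Calling `describe` with a `Duck` uses the overridden behavior, while a `Cat` falls back to what `Animal` defines.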
Microsoft's XML-Data allows me to inherit data and type in a manner very
similar to OO languages. For instance, their description of XML-Data at
their XML standards page gives the following example:
Now I can use this type declaration to create an animalFriends element,
which is a list of pets:
So the pet hrefs can point to pets, cats, or dogs.
How would I create this schema using architectural forms?
Jonathan
***************************************************************************
Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com
***************************************************************************
From eliot at isogen.com Mon Sep 29 20:21:12 1997
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <3.0.32.19971129131853.00b5a2c8@swbell.net>
At 01:28 PM 9/29/97 -0400, Jonathan Robie wrote:
> To put this in
>traditional OO terms, Duck inherits data, behavior, and type from Animal. In
>SGML, it can't inherit behavior, but it can inherit data and type.
In fact, you can inherit behavior if your processor is architecture aware
such that you can write rules that will apply the architecture-specific
behavior in the absence of element-specific behavior. This could either be
indirectly through object-oriented processors where the implementing
element-specific objects inherit from architecture-specific objects or
explicitly through scripts that embody the architecture derivation rules,
e.g., something like this in DSSSL (here using a 'query' element rule):
(query (case (arch-form-of (current-node) 'myarch')
(('foo')
(make paragraph ...))
(('bar')
(make sequence ...))))
Behavior is simply processing code associated with types--the only question
is how is the binding done. With SGML, the binding is [almost] always
loose and indirect and architecture-based binding is just another level of
indirection, similar to, if not identical to, the indirection you get by
inheriting methods from supertypes.
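The same fallback idea can be sketched outside DSSSL. The Python below is a minimal illustration, assuming a made-up table that maps element types to an architectural form and two made-up handler tables; it is not how any particular architecture engine works.

```python
# Hypothetical mapping from element type to its architectural form.
ARCH_FORM = {"cat": "pet", "dog": "pet", "pet": "pet"}

# Hypothetical behavior bound to specific element types ...
ELEMENT_HANDLERS = {"dog": lambda name: f"walk {name}"}
# ... and behavior bound to architectural forms.
ARCH_HANDLERS = {"pet": lambda name: f"feed {name}"}


def process(element_type, name):
    """Apply element-specific behavior, else fall back to the form's behavior."""
    handler = ELEMENT_HANDLERS.get(element_type)
    if handler is None:
        form = ARCH_FORM.get(element_type)
        handler = ARCH_HANDLERS.get(form)
    if handler is None:
        raise KeyError(f"no behavior bound to {element_type!r}")
    return handler(name)
```

Here `dog` has its own rule, while `cat` has none and so picks up the rule bound to the `pet` form, which is the loose, indirect binding described above.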
>Microsoft's XML-Data allows me to inherit data and type in a manner very
>similar to OO languages. For instance, their description of XML-Data at
>their XML standards page gives the following example:
>
>Now I can use this type declaration to create an animalFriends element,
>which is a list of pets:
>
>So the pet hrefs can point to pets, cats, or dogs.
>
>How would I create this schema using architectural forms?
I see a one-level schema hierarchy from which the document in the example
is derived:
superclass animalFriends
contains pet+
superclass pet
contains ANY
attribute owner
attribute name
To duplicate this using architectures, I create a meta-DTD that defines the
two supertypes and a document that derives its element types from the
supertypes.
First the derived document, which declares its derivation from the
architecture (schema):
]>
Now the architectural meta-DTD, which defines the types:
The relationship of the types in the document to the types in the meta-DTD
is clear and machine processible (because of the architecture notation and
meta-DTD entity). The relationship of the individual elements to their
supertypes is clear, either through the automatic mapping (names in the
document automatically map to the same name in the architecture, e.g.,
'animalFriends' in the document maps to 'animalFriends' in the meta-DTD) or
through the explicit mapping as for the types cat and dog. The 'extends'
semantic is inherent in architectural derivation. The architecture conveys
no less information than the example and takes about the same amount of
characters in this case (the verbosity of the XML-Data syntax offset by the
need for the architecture notation and entity declaration in the document).
The architecture approach requires no specialized processors in order to
process the document by architecture-unaware processors and
architecture-aware processing can be added easily through either ad-hoc
means in style sheets or transforms or using more complete architecture
engines (e.g., SP, GroveMinder, etc.).
Note that neither the XML-Data nor the architectural meta-DTD are complete
definitions of the schema--you still need human-understandable definitions
of all the parts (what is a "pet"? What are the rules for pet names? What
are the rules for owner names? What, if any, is the significance of pet
element content? etc.). You also need to define the expected behavior for
the types in various contexts: formatting, transformation, online display,
etc. Neither the XML-Data nor the architecture formalism will or can
provide these--they must be provided by other means, mostly
non-standardized and relying heavily on prose to communicate ideas to
humans, not processing to computers.
The only really important part of the schema discussion is how is a schema
associated with its documentation and definitions and how are things
associated with that schema. That's why the architecture mechanism
requires that you declare the notation for the architecture--that is the
pointer to the authoritative definition of what the architecture rules are.
The meta-DTD for the architecture is just a convenience that makes it
easier to do processing and validation, but the presence of it doesn't give
you that much and the lack of it doesn't preclude doing architecture-based
processing. The same will be true of any other formal syntax for defining
the meta-syntax rules for documents. At least architectures use an
existing syntax that is well understood by all SGML tools.
Given that most XML tools will need to be able to deal with DTDs anyway, I
can see no compelling reason in the short term to define an alternative
syntax for DTDs. Rethinking how document schemas are created and managed
over the long term needs doing, no doubt, but that is a project that will
take years of careful study and thought and must be done in conjunction
with a major revision to SGML, one in which many different ideas and
requirements can be brought to bear.
In my opinion, none of the name-space requirements and none of the
DTD-editing requirements require a change to existing mechanisms in order
to be satisfied in a reasonable way. Given that, there can be no good
reason for trying to reinvent the DTD mechanism at this time and trying to
do so is a waste of time that is better spent on more pressing issues.
Certainly people are free to invent whatever document types they want for
representing schemas, but to suggest that any such definition should be
used as standard within XML or SGML is premature, unwise, and unwarranted.
If Microsoft (or anybody else) wants to build tools to support such a
system and see if people will use or buy them, let them do so. Let the
marketplace decide. But this is not an area of SGML or XML for which the
standards need to change at this time and we should not attempt to change
them.
Cheers,
E.
From ricko at allette.com.au Mon Sep 29 20:25:09 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <199709291829.EAA23860@jawa.chilli.net.au>
> From: Jonathan Robie
> Of course, all existing SGML and XML tools know how to deal with DTDs,
> and this is a rather major departure from traditional SGML. It has not
> been blessed by any standardization committee. Given the way Microsoft
> has approached Java, insisting that it need not implement the portable
> libraries everyone else is using, and encouraging people to use their
> platform-specific libraries instead, it is easy to wonder what will
> happen to the SGML world if Microsoft is in control of an alternative
> method of specifying content models.
XML-data would probably fail, that's what.
Because their form of schemas are so complicated and verbose to read
that you will need browsing tools to manipulate them. This in turn
gives schemas (even though they are written in XML) the nature
of binary objects rather than textual objects. It seems the weight
of experience is against people making successful schema languages
in non-textual forms.
For example, Bento and the OpenDoc storage system included API-driven
routines for decorating cleverly stored objects with all sorts of
interesting type information, including type conversion, and so it
can be considered -- in part -- a schema system. Failing to
have a text form, the thing failed to thrive. The XML-data
system does have a text form, but it complicates matters so much by
not having a simple text form (e.g. a separate declaration
syntax) that it seems to be unreadable.
In my view, declarations are actually a kind of processing instruction,
targeted at the parser or entity manager, which also may be of
interest to the application (sorry for using SGML jargon).
The XML-data view seems to be that they are, more essentially,
data rather than processing instructions. Tim Bray has said
frequently "metadata is data", to which I would say
"processing instructions are sometimes data, sometimes not".
Have the XML-data people ever made any requests to ISO for
suggested improvements to the declaration syntax to give
them the functionality they need? (This is unfair really,
since I think XML-data is an experimental system, and
therefore a good place to generate user requirements for
a less verbose syntax.) Have they proved that
a single-tag language is easier to use than one with multiple
types of tags?
I am certainly in 100% favour of schema systems and stronger typing
and abstracting interesting information about data into
header elements. I proposed the SEEALSO parameter in the
current WebSGML TC specifically to allow richer declarations
of syntax using any kind of exotic notations including natural
language, so I am the last person to say that SGML declarations
are enough for all uses.
But I am simply not convinced that XML-data represents a
usable alternative to the standard declarations (in the
same market), and I think XML-data should not compete
(or be talked about as competing!) with the standard
declarations. Their purposes are, I hope, quite
different.
Rick Jelliffe
From ddb at criinc.com Mon Sep 29 20:45:57 1997
From: ddb at criinc.com (Derek Denny-Brown)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <3.0.32.19970929114620.009b3590@mailhost.criinc.com>
At 01:28 PM 9/29/97 -0400, Jonathan Robie wrote:
>At 12:43 PM 9/29/97 -0400, Peter Newcomb wrote:
>>> [snip Jonathan Robie's original post]
>>Could you elaborate upon this distinction between architectural form
>>inheritance and "true OO inheritance"? What about XML-data makes it
>>capable of supporting "truer" inheritance than architectural forms?
>
>[snip]
>In C++, Java, Smalltalk, and other OO languages, if I say that "a duck is an
>animal", that means: (1) a duck always has all the data associated with an
>animal, (2) a duck has the behavior associated with an animal (unless you
>specifically say that a duck does certain things differently), and (3)
>references to generic animals can also point to ducks. To put this in
>traditional OO terms, Duck inherits data, behavior, and type from Animal. In
>SGML, it can't inherit behavior, but it can inherit data and type.
>[snip]
One thing which Henry Thompson's presentation at HyTime '97 brought forth
in my mind was SGML's lack of support for (3) above. Architectural forms
do little or nothing to rectify this, although AF could provide a solution
if used in an environment which supports simultaneous view of the source and
AF instances with links between the two. Part of the problem is that AF's
do little, if anything to make life easier when I want to build a DTD which
extends an existing DTD. I have to copy the existing DTD and modify it and
then add the AF meta-info which maps the new DTD back to the old. But now
I have a completely different DTD, from the point of view of _all_ existing
SGML software. Sure I can map my documents to the original, but I can not
see it as both... I must either remove all value added by my modified DTD,
or abandon existing options based on the original DTD, since the new
document is not conforming to the original DTD. Obviously, since I put the
time into building the new DTD, I think there is some significant value
added, but I can not leverage the value added while at the same time
leveraging the use of the existing DTD as a base architecture.
This is exactly what OO Inheritance allows a programmer to do. You need
an extra attribute? Easy! With AF's I either see the document as the new
DTD or I can not see the attribute... value lost either way.
I want to be able to treat it as the original DTD until that special moment
when I can test to see if this has my extended info.. and perform extra
processing based on that...
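As a rough sketch of that wish, the Python below processes every element with the base rules and only adds extra handling when the extended information is present. The `name` and `owner` attributes come from the pet example in this thread; the `lives` attribute is a hypothetical extension invented for the illustration.

```python
import xml.etree.ElementTree as ET


def summarize(pet):
    # Base processing: relies only on what the original document type defines.
    summary = f"{pet.get('name')} (owner: {pet.get('owner')})"
    # Extended processing: applied only when the extra attribute is present.
    lives = pet.get("lives")
    if lives is not None:
        summary += f", lives left: {lives}"
    return summary


doc = ET.fromstring(
    '<animalFriends>'
    '<pet name="Rex" owner="Ann"/>'
    '<pet name="Tom" owner="Bob" lives="9"/>'
    '</animalFriends>'
)
lines = [summarize(p) for p in doc.iter("pet")]
```

Both elements are handled as plain `pet` elements, and the second one also gets the extended treatment.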
-derek
Derek E. Denny-Brown II || ddb@criinc.com
"Reality is that which, || Seattle, WA USA
when you stop believing in it, || WWW/SGML/HyTime/XML
doesn't go away." -- P. K. Dick || Java/Perl/Scheme/C/C++
From ricko at allette.com.au Mon Sep 29 21:03:11 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:30 2004
Subject: Animal-friends implemented as a pattern (Re: XML-Data: advantages over DTD syntax?)
Message-ID: <199709291907.FAA24375@jawa.chilli.net.au>
> From: Jonathan Robie
>
> Now I can use this type declaration to create an animalFriends element,
> which is a list of pets:
>
> So the pet hrefs can point to pets, cats, or dogs.
>
> How would I create this schema using architectural forms?
And you do not even need architectural forms. Here is a very
simple pattern for doing everything your example does using
a single DTD and standard SGML! (The suffixes "-content"
and "-attributes" are reserved for use in patterns. The
attribute "is-a" is reserved to allow inheritance labelling.)
]>
If you want multiple inheritance, then you can just
define a different suffix, and search through attributes
based on that to collect the inheritance tree. I can
provide an example if anyone is interested.
Anyone who is aware of the pattern can see this and implement
it just as easily as they could using XML-data's syntax,
but without breaking SGML compatibility, which generating
new element types outside declarations does.
Rick Jelliffe
From jwrobie at mindspring.com Mon Sep 29 21:07:12 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:30 2004
Subject: Animal-friends implemented as a pattern (Re: XML-Data:
advantages over DTD syntax?)
Message-ID: <1.5.4.32.19970929190623.00a56820@pop.mindspring.com>
At 05:02 AM 9/30/97 +1000, Rick Jelliffe wrote:
>If you want multiple inheritance, then you can just
>define a different suffix, and search through attributes
>based on that to collect the inheritance tree. I can
>provide an example if anyone is interested.
Please!
Jonathan
***************************************************************************
Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com
***************************************************************************
From eliot at isogen.com Mon Sep 29 21:17:13 1997
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <3.0.32.19971129141415.00acab48@swbell.net>
At 11:46 AM 9/29/97 -0700, Derek Denny-Brown wrote:
>>specifically say that a duck does certain things differently), and (3)
>>references to generic animals can also point to ducks. To put this in
>>traditional OO terms, Duck inherits data, behavior, and type from Animal. In
>>SGML, it can't inherit behavior, but it can inherit data and type.
>>[snip]
>
>One thing which Henry Thompson's presentation at HyTime '97 brought forth
>in my mind was SGML's lack of support for (3) above. Architectural forms
>do little or nothing to rectify this, although AF could provide a solution
>if used in an environment which supports simultaneous view of the source and
>AF instances with links between the two.
I'm not sure I follow you. If you have an architecture-aware search
engine, then you should be able to do a query of the form "find all
elements derived from the form 'animal'", which will include both 'animal'
elements and 'duck' elements. How is this not 3? Or do I misunderstand
Henry's requirement?
Something in the system has to know that a duck is a kind of
animal--architectures convey this information as clearly as any other
method, so I don't see how they can't satisfy the requirement.
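A query of that kind can be sketched in Python as follows. This is an illustration only: the attribute name `zoo` stands in for whatever architectural form attribute a real architecture would declare, and the sample document is made up. Elements without the attribute are assumed to map to the form with the same name as the element, mirroring the automatic mapping.

```python
import xml.etree.ElementTree as ET

FORM_ATTR = "zoo"   # hypothetical architectural form attribute name


def derived_from(root, form):
    """Return elements whose architectural form is `form`.

    An element maps to a form either explicitly, through FORM_ATTR, or
    implicitly, when its own name equals the form name.
    """
    return [el for el in root.iter() if el.get(FORM_ATTR, el.tag) == form]


doc = ET.fromstring(
    '<farm>'
    '<animal name="generic"/>'
    '<duck zoo="animal" name="Donald"/>'
    '<tractor/>'
    '</farm>'
)
names = [el.tag for el in derived_from(doc, "animal")]
```

The query returns both the `animal` element and the `duck` element, and leaves out elements that have no relationship to the form.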
> Part of the problem is that AF's
>do little, if anything to make life easier when I want to build a DTD which
>extends an existing DTD. I have to copy the existing DTD and modify it and
>then add the AF meta-info which maps the new DTD back to the old. But now
>I have a completely different DTD, from the point of view of _all_ existing
>SGML software. Sure I can map my documents to the original, but I can not
>see it as both... I must either remove all value added by my modified DTD,
>or abandon existing options based on the original DTD, since the new
>document is not conforming to the original DTD. Obviously, since I put the
>time into building the new DTD, I think there is some significant value
>added, but I can not leverage the value added while at the same time
>leveraging the use of the existing DTD as a base architecture.
Again, I don't follow you. Either you really have a completely new DTD and
you have to define the processing for it completely or you have a DTD
derived from an architecture *and* you have architecture-aware processors
that let you apply the architecture-specific processing to your new
documents, leaving only the new stuff to be defined. How do architectures
not do this? How would the XML-Data proposal do this any better? In both
cases, it's a function of the processing code both providing the methods
for the base classes and the processing system understanding the derivation
hierarchy.
You can also use the trick of defining the architecture such that its
declarations (and in particular, the parameter entities used to configure
and modularize it) can be also used to create declarations for documents
derived from the architecture. In essence you combine architectural
derivation with the sort of clever modularization typified by the TEI and
Docbook declaration sets.
Your comments suggest that you are confusing *parsing* with *processing*.
Parsing is not an issue, because the document is either valid to its DTD or
it isn't, and is either valid with respect to the governing schema or isn't.
Whether or not the document is valid doesn't affect how it is *processed*
after parsing, which is purely a function of methods applied to types, not
parsing, and is entirely independent of how the type information got
associated with the data (whether by the architecture syntax or the
interpretation of some XML-Data document).
>This is exactly what OO Inheritance allows a programmer to do. You need
>an extra attribute? Easy! With AF's I either see the document as the new
>DTD or I can not see the attribute... value lost either way.
This is only true if you define your processing in terms of architectural
instances derived from documents, but clearly, that is not the way
architectures are intended to be used in the general case. The
architecture provides part of the processing and an architecture-aware
processor must be able to associate architecture-specific processing with a
document, but it's not an all-or-nothing proposition. I must always be
aware of the document's architectural nature as well as its base nature
unless the only processing I care about at the moment is that defined by
the architecture.
The XML-Data proposal (to the degree I understand it) and architectures
appear to convey exactly the same information about a schema and a
document's derivation from it. The fact that the XML-Data syntax appears
to be more "object-oriented" must be a red herring because in both cases
you are providing a purely declarative data description, not the definition
of active methods. The only way in which XML-Data might appear to be
object-oriented is XML-Data-specific semantics for generating complete
declarations from XML-Data specifications based on implication rules, but
these will either be effectively identical to features in the AFDR syntax,
such as multiple attlists for the same element type, or facilities of
limited utility, such as content model implication (which can be managed
pretty well with parameter entities). In other words, I don't see that
it's possible for anything like XML-Data to provide significantly more
assistance in creating and managing declaration sets and meta-DTDs than you
already get with the AFDR and normal SGML facilities.
This is why confusing architectures with object-oriented programming
approaches is so dangerous: they are not the same thing and thinking that
they are leads to erroneous conclusions and unrealistic expectations (such
as that content models can be somehow inherited in any but the most trivial
ways).
Note too that when you have DTD-less documents, problems of DTD syntax
munging go away because you don't have any DTD syntax to mung. Any munging
is managed by the creators of derived schemas. This is one of the beauties
of XML--it frees us from the need to conflate schema definition with the
definition of the parsing rules for document instances.
Cheers,
E.
From ricko at allette.com.au Mon Sep 29 21:53:30 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:30 2004
Subject: Animal-friends implemented as a pattern (Re: XML-Data:advantages over DTD syntax?)
Message-ID: <199709291958.FAA24998@jawa.chilli.net.au>
----------
> From: Jonathan Robie
> To: ricko@allette.com.au
> At 05:02 AM 9/30/97 +1000, Rick Jelliffe wrote:
>
> >If you want multiple inheritance, then you can just
> >define a different suffix, and search through attributes
> >based on that to collect the inheritance tree. I can
> >provide an example if anyone is interested.
>
> Please!
Here is a version which allows multiple inheritance.
(Some parenthesis problems fixed too.)
I have put in even empty attribute values, to make
the pattern uniform in every case, so please do not
confuse this simplicity for elaborateness!
To extract the inheritance tree, collect all attributes
with "-inherit" suffix. I think the only novel thing
is that people are not used to wildcard searches on
attribute names, but this is only prejudice.
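The wildcard search on attribute names described above can be sketched
as follows. This is a minimal Python illustration, not Rick's code: the
instance fragment, the element names (animal, pet, kitten), the
space-separated value format of the "-inherit" attributes and the helper
names are all assumptions. In a real document the fixed attributes would
be supplied by the DTD rather than written out in the instance.

```python
import xml.etree.ElementTree as ET

SUFFIX = "-inherit"

# Hypothetical instance. The attribute values are written out explicitly
# because ElementTree does not read #FIXED defaults from a DTD.
DOC = """
<zoo>
  <animal animal-inherit=""/>
  <pet pet-inherit="animal"/>
  <kitten kitten-inherit="pet animal-friend"/>
</zoo>
"""

def inheritance_tree(root, suffix=SUFFIX):
    """Map each element type to the types it immediately inherits from."""
    tree = {}
    for elem in root.iter():
        for name, value in elem.attrib.items():
            # The "wildcard search": match on the attribute name's suffix.
            if name.endswith(suffix):
                tree[elem.tag] = value.split()
    return tree

def ancestors(tree, name):
    """All types reachable from `name`, breadth first, without repeats."""
    seen = []
    queue = list(tree.get(name, []))
    while queue:
        parent = queue.pop(0)
        if parent not in seen:
            seen.append(parent)
            queue.extend(tree.get(parent, []))
    return seen

tree = inheritance_tree(ET.fromstring(DOC))
print(tree)
print(ancestors(tree, "kitten"))
```

Multiple inheritance falls out of allowing more than one name in the
attribute value; nothing else in the search changes.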
Also, I think because some tools require precompiled
DTDs, there is a general view in some circles that
DTDs are always compiled, and always made prior
to the generation of the instance. But that is
not intrinsic to SGML.
The PATTERN
-----------
This pattern reserves the suffixes:
-content for a parameter entity with the
element type's contents
-attributes for a parameter entity with the
element type's attributes
-inherit for a fixed attribute with the
element type's immediate inheritance
The pattern is
Where the delimiters {} indicate parameters of the template
which you or your application edit in.
The EXAMPLE
-----------
]>
Please note that I am not saying that this form is always
preferable to using AFs or XML-data. But it can be done
in XML as it stands now, keeping valid SGML declarations.
And, as has been mentioned, there should be interconversion
possible between the three forms, since they give the
same information. If XML-data requires the use of specialist
tools to manipulate, since it is so verbose, then this pattern
cannot be regarded as excessively verbose either,
since the same kind of tools can be constructed to simplify
creating new objects.
Rick Jelliffe
From srn at techno.com Mon Sep 29 22:15:42 1997
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun 7 16:58:30 2004
Subject: XML-Data: advantages over DTD syntax?
In-Reply-To: <3.0.1.32.19970929080238.0083c5c0@aimnet.com> (message from
Michael Leventhal on Mon, 29 Sep 1997 08:02:38 +0200)
Message-ID: <199709291827.OAA01640@bruno.techno.com>
[Paul Madsen:]
> Do not Architectural forms provide the traditional DTD syntax just that
> ability [to extend object types so that one class of object is a
> specialization of another more general class]?
[Michael Leventhal:]
> So say some but not really.
I'm one of those who say so. How "not really"?
-Steve
--
Steven R. Newcomb President
voice +1 716 271 0796 TechnoTeacher, Inc.
fax +1 716 271 0129 (courier: 23-2 Clover Park,
Internet: srn@techno.com Rochester NY 14618)
FTP: ftp.techno.com P.O. Box 23795
WWW: http://www.techno.com Rochester, NY 14692-3795 USA
From ddb at criinc.com Mon Sep 29 22:37:33 1997
From: ddb at criinc.com (Derek Denny-Brown)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <3.0.32.19970929133641.009a9100@mailhost.criinc.com>
At 02:14 PM 11/29/97 -0600, W. Eliot Kimber wrote:
>At 11:46 AM 9/29/97 -0700, Derek Denny-Brown wrote:
>I'm not sure I follow you. If you have an architecture-aware search
>engine, then you should be able to do a query of the form "find all
>elements derived from the form 'animal'", which will include both 'animal'
>elements and 'duck' elements. How is this not 3? Or do I misunderstand
>Henry's requirement?
This requires an AF-aware search engine. In addition, all current AF
systems can only view the instance as either the source or the AF. If the
search engine reports where it found the match, it would report it relative
to the AF, not the source document. As I implied in my original post:
>> although AF could provide a solution if used in
>> an environment which supports simultaneous view
>> of the source and AF instances with links between
>> the two.
a number of things start to change when you add an environment where you
can easily map back and forth between the two views.
>Again, I don't follow you. Either you really have a completely new DTD and
>you have to define the processing for it completely or you have a DTD
>derived from an architecture *and* you have architecture-aware processors
>that let you apply the architecture-specific processing to your new
>documents, leaving only the new stuff to be defined. How do architectures
>not do this? How would the XML-Data proposal do this any better? In both
>cases, it's a function of the processing code both providing the methods
>for the base classes and the processing system understanding the derivation
>hierarchy.
I want to build on tools which assume you are using an existing DTD, say a
custom editor environment. (note: this is not based on a real
implementation, but rather a mental exercise) From the point of view of
that tool I either am using a new DTD (since I can not have a nice PUBLIC
reference to the "standard" DTD, and the DTD is different in any case,
because I added elements to some content models) or I only give it the AF
and I have lost my value added elements. I am talking about today and
tomorrow, not next year. Next year there may be tools which allow better
use of AFs. I am not in a position where I have enough information to
really know what vendors plan to release next year. I am in a situation
where if it can not be done today, I can not use it, since my deadlines are
too tight to wait on future releases for most of the software. (note: if
you want grey hair at an early age, this is an excellent recipe. Managers
who do not want their staff to have grey hair should either take note or
buy lots of hair dye...)
I have never said that XML-Data provides anything better, since I do not
know enough about it to even compare it to AFs, which I do have a
reasonable understanding of, I think.
>You can also use the trick of defining the architecture such that its
>declarations (and in particular, the parameter entities used to configure
>and modularize it) can be also used to create declarations for documents
>derived from the architecture. In essence you combine architectural
>derivation with the sort of clever modularization typified by the TEI and
>Docbook declaration sets.
This requires that the original be well designed. A common request, which
is often ignored ;}
>Your comments suggest that you are confusing *parsing* with *processing*.
(Hopefully) no more than current tools force me to correlate them. They
should be separate, but are, more often than not, virtually synonymous.
Groves are setting the stage for a day when parsing and processing are
separated. At times I dream of that day, interspersed with my nightmares
imposed by current tools and requirements...
>Parsing is not an issue, because the document is either valid to its DTD or
>it isn't, and is either valid with respect the governing schema or isn't.
>Whether or not the document is valid doesn't affect how it is *processed*
>after parsing, which is purely a function of methods applied to types, not
>parsing, and is entirely independent of how the type information got
>associated with the data (whether by the architecture syntax or the
>interpretation of some XML-Data document).
The problem is that a number of tools/environments define a document's
model/style/environment by the DTD. If I have a special setup for editing
DocBook documents, that setup needs to make some assumptions about your
instance. It does not work when I hand it an instance which violates those
assumptions (because it is conformant to a DTD which uses DocBook as a base
architecture, rather than actually being conformant to the DocBook DTD).
If I have access to the source, I could go in and tweak it, but I would
have to do this either specifically for the new DTD or spend the time to
make the environment work with anything which remotely resembles
DocBook....more work than I want.
>>This is exactly what OO Inheritance allows a programmer to do. You need
>>an extra attribute? Easy! With AF's I either see the document as the new
>>DTD or I can not see the attribute... value lost either way.
>
>This is only true if you define your processing in terms of architectural
>instances derived from documents, but clearly, that is not the way
>architectures are intended to be used in the general case. The
>architecture provides part of the processing and an architecture-aware
>processor must be able to associate architecture-specific processing with a
>document, but it's not an all-or-nothing proposition. I must always be
>aware of the document's architectural nature as well as its base nature
>unless the only processing I care about at the moment is that defined by
>the architecture.
To an extent what I am asking for is an environment where I could build
tools using a traditional OO Inheritance model applied to the SGML AF
model. A DSSSL Style sheet where I would only have to define rules for new
elements (or changed elements).
>This is why confusing architectures with object-oriented programming
>approaches is so dangerous: they are not the same thing and thinking that
>they are leads to erroneous conclusions and unrealistic expectations (such
>as that content models can be somehow inherited in any but the most trivial
>ways).
I agree that AFs should definitely not be equated with OO programming. I do
see two things which any attempt to equate them does bring out.
1) DTD extension mechanisms which provide for simple type inheritance would
be very useful. AFs provide a limited solution, which presents new
difficulties. This is a problem with SGML. AFs are an excellent
workaround which stays within the system, and deserve considerable credit
for that. My real frustration is with SGML and the limits it imposes, not
AFs.
2) Tools which allow OOP-inheritance-style defaulting behaviour for
processing of elements based on element type or architectural type. AFs may
not map to OOP but they make OOP-based processing tools easier...
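Point 2 can be illustrated with a small dispatch table. This is a
sketch only, in Python rather than DSSSL, using a hypothetical
derivation map (duck derived from animal, mallard from duck) and
made-up handler functions. A rule is looked up first for the element
type itself, then for each form it derives from, so only new or changed
elements need rules of their own.

```python
# Hypothetical derivation map: element type -> form it derives from.
DERIVED_FROM = {"duck": "animal", "mallard": "duck"}

def render_animal(name, text):
    # Default behaviour inherited by every type derived from "animal".
    return f"[animal {name}] {text}"

def render_mallard(name, text):
    # Override for one changed element type.
    return f"[mallard] {text.upper()}"

# Rules exist only for the base form and for the one changed type.
RULES = {"animal": render_animal, "mallard": render_mallard}

def find_rule(element_type):
    """Walk up the derivation chain until some type has a rule."""
    current = element_type
    while current is not None:
        if current in RULES:
            return RULES[current]
        current = DERIVED_FROM.get(current)
    raise LookupError(f"no rule for {element_type!r} or its base forms")

def process(element_type, text):
    return find_rule(element_type)(element_type, text)

print(process("duck", "quack"))
print(process("mallard", "quack"))
```

Here "duck" has no rule of its own and falls back to the "animal" rule,
while "mallard" overrides it.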
>Note too that when you have DTD-less documents, problems of DTD syntax
>munging go away because you don't have any DTD syntax to mung. Any munging
>is managed by the creators of derived schemas. This is one of the beauties
>of XML--it frees us from the need to conflate schema definition with the
>definition of the parsing rules for document instances.
But this puts an added burden on the tools since all bets are off as to what
the structure looks like. AFs at least provide a set mechanism for mapping
to a known structure.
-derek
Derek E. Denny-Brown II || ddb@criinc.com
"Reality is that which, || Seattle, WA USA
when you stop believing in it, || WWW/SGML/HyTime/XML
doesn't go away." -- P. K. Dick || Java/Perl/Scheme/C/C++
From digitome at iol.ie Tue Sep 30 00:26:19 1997
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun 7 16:58:31 2004
Subject: XML-Data: advantages over DTD syntax?
Message-ID: <199709292226.XAA06786@GPO.iol.ie>
[Rick Jelliffe]
>
>Because their form of schemas are so complicated and verbose to read
>that you will need browsing tools to manipulate them. This in turn
>gives schemas (even though they are written in XML) the nature
>of binary objects rather than textual objects.
>
A good point. I have fond memories of being able to understand Make
files, for example! These days, with "advanced" tools, they are still
"text only" but they are pretty impenetrable and effectively locked in to
particular tools :-(
On the other hand, in the specific case of XML-Data I would have to say
I am in favour. DTDs are perfectly good "documents". XML's reputation as a
meta-language is, I think, positively served by its use to describe "itself" in
this way.
The approach obviously has its practical limits though. The further one gets
from "data", the closer one gets towards "algorithm", and the less *practical*
a tagged representation becomes. Full scale Scheme would be pretty
impenetrable in XML but it would be possible! The fact that it is entirely
possible is the important thing. It means (doesn't it????) that XML can be
viewed as the bed-rock on which all the other required syntactic "short hands"
can be based.
So XML could have 8879 DTDs. It could also have a DTD for 8879 DTDs.
Core XML could interpret the latter directly, supporting the 8879 syntax via
a transformation. Future syntaxes, methods etc.; for achieving what 8879 DTDs
achieve could then be cleanly layered on top.
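As a rough illustration of "supporting the 8879 syntax via a
transformation", the sketch below rewrites simple element declarations
as XML. It is a Python toy under heavy assumptions: the output
vocabulary (dtd, elementType, name, model) is invented here and is not
the XML-Data syntax, and the regular expression handles only
single-line <!ELEMENT ...> declarations with no parameter entities or
comments.

```python
import re
import xml.etree.ElementTree as ET

# Matches one-line declarations such as: <!ELEMENT kitten (#PCDATA)>
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+(\S+)\s+(.*?)\s*>")

def dtd_to_xml(dtd_text):
    """Return an XML tree with one <elementType> per element declaration."""
    root = ET.Element("dtd")  # invented vocabulary, for illustration only
    for name, model in ELEMENT_DECL.findall(dtd_text):
        decl = ET.SubElement(root, "elementType", name=name)
        # The content model is kept as an opaque string, not parsed further.
        decl.set("model", model)
    return root

SAMPLE = """
<!ELEMENT pet (kitten | puppy)*>
<!ELEMENT kitten (#PCDATA)>
"""

xml_form = dtd_to_xml(SAMPLE)
print(ET.tostring(xml_form, encoding="unicode"))
```

A fuller version would break the content model itself into elements;
keeping it as a string is simply the shortest thing that shows the
layering.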
Sean Mc Grath
sean@digitome.com
Digitome Electronic Publishing
http://www.digitome.com
From ricko at allette.com.au Tue Sep 30 06:56:20 1997
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun 7 16:58:31 2004
Subject: revised Animal-friends implemented as a pattern (Re: XML-Data:advantages over DTD syntax?)
Message-ID: <199709300500.PAA07205@jawa.chilli.net.au>
Someone has pointed out that the colonized syntax would be
appropriate and clearer. Here it is again (sorry!) with
colons. (I have also cleaned up the inheritance to bundle
things more, so please delete previous version.)
Actually, this following fragment is illegal, because
you cannot use ANY inside a content model. I am not sure how
to read the XML-data format here, but I think this exposes
a flaw in their example: if pet can contain any subelements,
what use is it to say it can also contain a kitten subelement?
Duplicate paths are a little worrying, if that is what they
have done.
If it were desired to use ANY in this way (i.e. different
to how SGML uses it), then it could be coped with by
parameterising includes and excludes in a similar fashion.
(Again I can provide an example if needed, but I hope not.)
----------
> From: Jonathan Robie
> To: ricko@allette.com.au
> At 05:02 AM 9/30/97 +1000, Rick Jelliffe wrote:
>
> >If you want multiple inheritance, then you can just
> >define a different suffix, and search through attributes
> >based on that to collect the inheritance tree. I can
> >provide an example if anyone is interested.
>
> Please!
Here is a version which allows multiple inheritance.
(Some parenthesis problems fixed too.)
I have put in even empty attribute values, to make
the pattern uniform in every case, so please do not
confuse this simplicity for elaborateness!
To extract the inheritance tree, collect all attributes
with ":inherit" suffix. I think the only novel thing
is that people are not used to wildcard searches on
attribute names, but this is only prejudice.
Also, I think because some tools require precompiled
DTDs, there is a general view in some circles that
DTDs are always compiled, and always made prior
to the generation of the instance. But that is
not intrinsic to SGML.
The PATTERN
-----------
This pattern reserves the suffixes:
contents for a parameter entity with the
element type's contents
attributes for a parameter entity with the
element type's attributes
inherit for a fixed attribute with the
element type's immediate inheritance
The pattern is
Where the delimiters {} indicate parameters of the template
which you or your application edit in.
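How an application might "edit in" the {} parameters can be sketched in
Python. The declaration layout below is an assumption reconstructed
only from the three reserved suffixes described above, so the actual
pattern may differ; the function name and the sample values are
hypothetical as well.

```python
# Assumed layout, inferred from the description of the three suffixes.
# It is not the original pattern.
TEMPLATE = """\
<!ENTITY % {name}:contents "{contents}">
<!ENTITY % {name}:attributes "{attributes}">
<!ELEMENT {name} (%{name}:contents;)>
<!ATTLIST {name}
    %{name}:attributes;
    {name}:inherit CDATA #FIXED "{inherit}">
"""

def declare(name, contents, attributes="", inherit=()):
    """Fill in the template for one element type.

    `inherit` lists the immediate parent types; several names give
    multiple inheritance, and an empty value keeps the pattern uniform.
    """
    return TEMPLATE.format(
        name=name,
        contents=contents,
        attributes=attributes,
        inherit=" ".join(inherit),
    )

out = declare("kitten", "#PCDATA", inherit=("pet",))
print(out)
```

Because every element type gets the same four declarations, a tool can
generate or edit them mechanically, which is the point made about
verbosity further on.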
The EXAMPLE
-----------
]>
Please note that I am not saying that this form is always
preferable to using AFs or XML-data. But it can be done
in XML as it stands now, keeping valid SGML declarations.
And, as has been mentioned, there should be interconversion
possible between the three forms, since they give the
same information. If XML-data requires the use of specialist
tools to manipulate, since it is so verbose, then this pattern
cannot be regarded as excessively verbose either,
since the same kind of tools can be constructed to simplify
creating new objects.
Rick Jelliffe
From ht at cogsci.ed.ac.uk Tue Sep 30 10:35:37 1997
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun 7 16:58:31 2004
Subject: Animal-friends implemented as a pattern (Re: XML-Data:advantages over DTD syntax?)
In-Reply-To: "Rick Jelliffe"'s message of Tue, 30 Sep 1997 05:54:19 +1000
References: <199709291958.FAA24998@jawa.chilli.net.au>
Message-ID: <715.199709300835@grogan.cogsci.ed.ac.uk>
Note that as written Rick's solution lacks a feature of the XML-Data
proposal, namely that e.g. in the internal subset I can add a new
declaration
and non-intrusively extend the content model of animal-friends. To
cover this Rick's solution would need place-holding empty parameter
entities in most of his existing entities, e.g.
[Note this is not valid XML, I don't think]
This I think completes the reductio -- the point is not that you can
do things with schemata that you can't do in XML, but that you can do
them in ways which are vastly more transparent and maintainable. Just
because we CAN write all logical formulae using only the Sheffer stroke
and constants doesn't mean we SHOULD do so. Occam didn't say "Don't
proliferate", he said "Don't proliferate beyond necessity".
Note also that I argued at the XML day in Montreal that to avoid the
dangers of multiple incompatible approaches to schemata, we should
always provide a semantics in terms of vanilla XML, which is how I'd
describe what Rick has shown is possible!
ht
From jwrobie at mindspring.com Tue Sep 30 13:16:16 1997
From: jwrobie at mindspring.com (Jonathan Robie)
Date: Mon Jun 7 16:58:31 2004
Subject: Animal-friends implemented as a pattern (Re: XML-Data:advantages over DTD syntax?)
Message-ID: <1.5.4.32.19970930111016.009ead94@pop.mindspring.com>
At 09:35 AM 9/30/97 BST, Henry S. Thompson wrote:
So now we have all the players!
Henry, could I ask you to list all the main advantages you see for XML-Data
over XML with architectural forms? Yesterday's traffic makes me think that
this would be a great place to discuss the issues in some depth. One side of
the debate seems to say that XML-Data adds no new functionality, and the
other says that it adds significant new functionality. At this point, I am
not convinced that I know enough to say one way or another.
Jonathan
***************************************************************************
Jonathan Robie jwrobie@mindspring.com http://www.mindspring.com/~jwrobie
POET Software, 3207 Gibson Road, Durham, N.C., 27703 http://www.poet.com
***************************************************************************
From zwang at pstat.ucsb.edu Tue Sep 30 21:25:55 1997
From: zwang at pstat.ucsb.edu (Zheng Wang)
Date: Mon Jun 7 16:58:31 2004
Subject: msxml contentmodel
Message-ID:
Hello,
We are trying to write an editor application that uses XML via the
MSXML parser. What we plan to do is to let the editor read the DTD and
then provide users with an interactive environment that they use to
fill out the content of the xml document.
The problem we have is that MSXML does not provide access to the
content model of the DTD through the Document class. The API it
provides is mainly through the Document class. We are not sure whether
Microsoft intended that the interface to the DTD content model not be
available (directly or indirectly) to the application. Could anyone
shed light on how to use MSXML to access the DTD content model, or
does anyone know if some of the other parsers (e.g., NXP, LARK)
provide an interface to the DTD content model? Also, how does this
relate to SGML groves as I have seen discussed on XML-DEV at various
times?
Thanks
Zheng and Matt,
NCEAS, UCSB