From marcelo at mds.rmit.edu.au  Mon Mar  1 00:49:03 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:32 2004
Subject: Streaming XML and SAX
In-Reply-To: <36D82244.DB014ECE@thinlink.com>; from Tom Harding on Sat, Feb 27, 1999 at 08:50:12AM -0800
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com>
Message-ID: <19990301114841.B4466@io.mds.rmit.edu.au>

On Sat, Feb 27, 1999 at 08:50:12AM -0800, Tom Harding wrote:
> David Megginson wrote:
> 
> > No, it still looks like a messy architecture to me, because the
> > transport layer has to know about the packets -- it has to parse
> > the XML to get information about what it's looking at, and
> > that adds complexity and inefficiency.  A clean architecture
> > should separate the layers completely, and use XML only where it
> > has an obvious advantage over other approaches.
> 
> It's amazing how two people can see things so differently.  I think
> it's supremely elegant that only the XML processor needs to look at
> data coming off the wire.  It's also as efficient as it gets.  Of
> course the software architecture that handles the documents emitted
> must be modular and extensible, but the task of parsing is done.

It has already been pointed out in this discussion that some
environments try to increase the throughput by dispatching documents
off to different threads.  A system with 50 CPUs is going to be
operating at as low as 2% capacity if it is forced to pipe the entire
parsing load through a single thread.  I don't see how you can argue
that this is efficient.

Nor do I agree that concentrating the workload at a single conceptual
point is elegant.  It is much more aesthetically pleasing to let the
protocol break up packets and let the XML parser parse XML.
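
A minimal sketch of that division of labour, under assumptions that are
not from this thread: the 4-byte length prefix, the function names and
the sample documents are all hypothetical. The framing code never looks
inside a document, and each document is handed to a worker to be parsed.

```python
import io
import struct
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def frame(doc: bytes) -> bytes:
    """Prefix one document with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(doc)) + doc


def read_frames(stream):
    """Yield each framed document without inspecting its contents."""
    while True:
        header = stream.read(4)
        if len(header) < 4:
            return  # end of stream
        (size,) = struct.unpack(">I", header)
        yield stream.read(size)


def root_tag(doc: bytes) -> str:
    """Stand-in for real per-document work: parse and report the root element."""
    return ET.fromstring(doc).tag


def parse_all(stream, workers=4):
    """Split the stream into documents, then parse them on a pool of workers."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(root_tag, read_frames(stream)))


wire = io.BytesIO(frame(b"<order id='1'/>") + frame(b"<invoice><total>9</total></invoice>"))
print(parse_all(wire))  # ['order', 'invoice']
```

In CPython the interpreter lock limits how much CPU-bound parsing threads
can overlap, so a process pool would be the closer analogue of the 50-CPU
case; the structure of the code stays the same.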


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From marcelo at mds.rmit.edu.au  Mon Mar  1 02:13:44 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:32 2004
Subject: Streams, protocols, documents and fragments
In-Reply-To: <36D6C618.D44846B6@thinlink.com>; from Tom Harding on Fri, Feb 26, 1999 at 08:04:40AM -0800
References: <A26F84C9D8EDD111A102006097C4CD0D054A1D@SOHOS002> <36D46640.94081620@thinlink.com> <14036.28838.719355.44002@localhost.localdomain> <36D479F1.28D796D9@thinlink.com> <14037.20555.720649.689770@localhost.localdomain> <36D59762.370372DB@thinlink.com> <14038.35650.792155.191827@localhost.localdomain> <36D6C618.D44846B6@thinlink.com>
Message-ID: <19990301131329.B6351@io.mds.rmit.edu.au>

On Fri, Feb 26, 1999 at 08:04:40AM -0800, Tom Harding wrote:
> David Megginson wrote:
> 
> > -- a general-purpose DOM would be *extremely* inefficient for
> > handling things like vector graphics or 3D worlds (to name only
> > two), though it is always possible to expose their optimised
> > object models through a DOM interface later if necessary.
> 
> In lots of applications, the data can't stay in an XML
> representation for very long anyway, because of what you're
> integrating it with/displaying it on/routing it through/converting
> it to/storing it in/etc... I view the DOM as a standard, OO way of
> manipulating the contents of a document.  It lets applications get
> work done, even without taking an end-to-end OO approach.  Perhaps
> I'm showing my bias here ;D

It's the translation process that hits hardest, however.  C and
FORTRAN compilers rarely build parse trees, because it is much more
efficient to generate code directly from token streams.  What you seem
to be suggesting is that a parser should pump an event stream straight
into DOM and then into another domain-specific structure.  This is
just adding an often gratuitous layer that can incur a massive
performance penalty for large documents (a 3D model of a refinery,
say).

In such circumstances I would much rather build the domain-specific
structure straight from the event stream.  (In fact, I have serious
reservations about using XML at all for 3D model transmission and
storage -- the markup tends to grossly outweigh the content, which
consists primarily of numbers.  Compression during transport _and_
storage would be a must).
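
As a rough illustration of building the domain-specific structure straight
from the event stream, here is a sketch using Python's SAX binding. The
`model` and `vertex` element names and the `x`/`y`/`z` attributes are made
up for the example; no tree is built at any point.

```python
import xml.sax


class VertexHandler(xml.sax.ContentHandler):
    """Collects vertices directly from SAX events, with no intermediate DOM."""

    def __init__(self):
        super().__init__()
        self.vertices = []

    def startElement(self, name, attrs):
        # Hypothetical vocabulary: <vertex x="..." y="..." z="..."/>
        if name == "vertex":
            self.vertices.append(
                (float(attrs["x"]), float(attrs["y"]), float(attrs["z"]))
            )


def load_vertices(xml_bytes):
    """Parse a document and return its vertices as a list of float triples."""
    handler = VertexHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.vertices


model = b"<model><vertex x='0' y='0' z='0'/><vertex x='1.5' y='2' z='-3'/></model>"
print(load_vertices(model))  # [(0.0, 0.0, 0.0), (1.5, 2.0, -3.0)]
```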


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From ricko at allette.com.au  Mon Mar  1 03:14:12 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:09:32 2004
Subject: XML and special Characters : unicode v3.0 ? 
Message-ID: <000a01be6391$ddcd2750$14f96d8c@NT.JELLIFFE.COM.AU>

 From: Baden Hughes <bmhughes@ozemail.com.au>

>I know that XML 1.0 allows you to use 'special' characters as included in
>the Unicode 2.0 specification. With the upcoming release of Unicode 3.0 how
>will we be able to refer to characters in 3.0 which were not in 2.0 ? The
>same way (meaning the actual version of Unicode spec is irrelevant as long
>as the method used is included in XML) or some new way ?
>
>For instance, the Sinhala character set was not in Unicode 2.0 but will be
>in 3.0. How do I get one of those characters in an XML document ? Or is that
>inconsequential to the document per se as it is simply a reference and its
>really up to the application to render it correctly ?

The document character set of XML is ISO 10646, as used by the Unicode
Consortium's character set Unicode. I think most people's strong
expectation is that XML will track ISO 10646, just as Unicode tracks it.
In fact, I think it is essential that XML automatically tracks ISO
10646: people will always try to do strange and interesting things with
characters and codes, and XML should try to allow as much freedom for
them to do this as possible.

Developers should be very wary of putting type-checking into their
systems which will cause future legitimate ISO 10646 characters to fail.
For example, when a new character is invented, like the Euro, the only
difficulty it should cause is if the font is not upgraded or if the
sort/type system doesn't allow new character registration.

We certainly need to abandon the expectation that the number of
characters is fixed or knowable, which is how some might interpret
material from the Unicode Consortium: a character set standard tries to
put in what is generally useful against some criteria--if your criteria
do not match, then you can quite legitimately decide that your character
is not found in the set: is Apple's "apple" character a real character?
are variant kanji characters real characters? are roman, fraktur, italic
and uncial "a" characters different? Is English "W" a different character
(i.e., "UU") from German "W" (i.e., "VV"), when using historical material?
In my book I use a dinosaur glyph as a word and would have liked to put
it in the index too: why is it not a character? Such questions can never
be resolved, but a character set must make a decision based on some
selection criteria; and those criteria will not be appropriate in every
situation.

The nice thing about markup is it lets us simulate the existence of a
character missing from a character set: however, we have no markup
conventions yet to do this systematically. There are no standard methods
for saying "when you find 'a' in this context, collate it differently"
for example (apart from, perhaps, language-tagged elements).

Rick Jelliffe




From tbray at textuality.com  Mon Mar  1 05:22:48 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:33 2004
Subject: Comments on WD-html-in-xml-19990224
Message-ID: <3.0.32.19990226132215.00ba4b00@pop.intergate.bc.ca>

At 03:07 PM 2/26/99 -0500, John Cowan wrote:
>1) I believe that the introduction of a media type "text/xhtml" is
>a mistake.  

I can see this point of view.

>Instead, it would be better to attach a media-type
>attribute specifying the formal public identifier of the DTD.

?!? find me somewhere in a W3C or IETF document where the FPI has
any standing.  Standards-anality aside, this is a real problem,
because there is *no interoperable resolution mechanism*.  Surely
you can't be serious.

 -Tim



From tomh at thinlink.com  Mon Mar  1 05:41:36 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun  7 17:09:33 2004
Subject: Streaming XML and SAX
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au>
Message-ID: <36DA2858.43F3EA7A@thinlink.com>

Marcelo Cantos wrote:

> It has already been pointed out in this discussion that some
> environments try to increase the throughput by dispatching documents
> off to different threads.  A system with 50 CPUs is going to be
> operating at as low as 2% capacity if it is forced to pipe the entire
> parsing load through a single thread.  I don't see how you can argue
> that this is efficient.

Even if you believe that parsing to convert markup into memory structures is slower than
back-end processing, if parsing is faster than the stream itself there is no difference in the
two approaches.  Anyway, in the general case the question is moot because there may be
inter-document dependencies, so you have to look inside the document before trying to
parallelize.

The whole point of this discussion was whether the document terminator ought to be XML or
non-XML.  Aside from the fact that I haven't yet seen a workable suggestion for a non-XML
terminator, it isn't necessary to completely examine a document or convert it to a tree just
to find an XML terminator.   As Nathan pointed out, you could write a semi-parser to find
terminators and then actually parse documents in parallel, but you'd need to suggest a way for
dealing with inter-document dependencies.




From martind at netfolder.com  Mon Mar  1 15:35:46 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:33 2004
Subject: Streaming XML and SAX
In-Reply-To: <004901be6348$9c8af4a0$c9a8a8c0@thing2>
Message-ID: <NBBBJPGDLPIHJGEHAKBAMEPGCNAA.martind@netfolder.com>

Hi Nathan,

<YourComment>
It seems like something is backwards here!

If an application is processing a series of documents, once it has a
universal type name for that document (root element name + namespace), it
knows how it wants to process the document and doesn't need a Pi. (What's
a Gi? Is that XML?)
</YourComment>

<Reply>
Yes, obviously a document (or name it the way you want - I don't want to
argue about streams vs documents :-) may not have any PI and may not have
any namespace reference. In that case only the GI is used for the pattern
match. Sorry, I forgot to spell out the complete resolution mechanism, which
is based on pattern matching: the router uses the pattern match to dispatch
to the right interpreter. The items matched are:

a) PI
b) namespace definition
c) Root GI

Any of these items could be used for a pattern match. Yes, a GI (generic
identifier) is part of SGML and therefore part of XML: it is simply the
element type name. In your example it could be something like "vendor-id".
So, because the interpreter is based on a pattern match mechanism,
everything that could be used for a pattern match can work. Actually, we
use the three items mentioned above.
</Reply>

<YourComment>
Also, you should be able to use the same parser for all document types and
then do the routing on the parse events, saving you from having to do a
"pre-parse" to determine the universal type name.
</YourComment>

<Reply>
Glad to see we both agree on the same mechanism. This is exactly what we do.
The router mechanism is just a temporary interpreter included in the parser
to load/unload the interpreters. To be precise, the mechanism is:
a) run the router as a special kind of interpreter
b) parse the document (always)
c) determine which interpreter to load, then load it and let it run
d) the interpreter runs until the end of the document
e) at the end of the document, the router/interpreter is loaded and run
again until a new interpreter is recognized
f) go to a)
The parser is always the same, only the interpreters are loaded/run and the
router is just a special kind of interpreter. Do you have a more efficient
mechanism to suggest?
</Reply>
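
The pattern-match step of the router described above could be sketched
roughly as follows. This is only an illustration using Python's expat
binding: the routing tables, the interpreter names and the `route` function
are hypothetical, and the sketch stops at choosing an interpreter rather
than handing the remaining parse events over to it.

```python
import xml.parsers.expat

# Hypothetical routing tables: pattern -> interpreter name.
PI_ROUTES = {"invoice-app": "invoice interpreter"}
NS_ROUTES = {"urn:example:orders": "order interpreter"}
GI_ROUTES = {"vendor-id": "vendor interpreter"}


def route(doc: bytes) -> str:
    """Return the interpreter selected by the first PI, namespace or root GI match."""
    chosen = []

    def on_pi(target, data):
        # a) a processing instruction seen before any other match
        if not chosen and target in PI_ROUTES:
            chosen.append(PI_ROUTES[target])

    def on_start(name, attrs):
        if chosen:
            return  # a decision was already made; ignore later elements
        # With namespace_separator=" ", expat reports qualified names as "uri local".
        uri, _, local = name.rpartition(" ")
        if uri in NS_ROUTES:        # b) namespace of the root element
            chosen.append(NS_ROUTES[uri])
        elif local in GI_ROUTES:    # c) root GI
            chosen.append(GI_ROUTES[local])
        else:
            chosen.append("default interpreter")

    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")
    parser.ProcessingInstructionHandler = on_pi
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)
    return chosen[0]


print(route(b"<vendor-id>42</vendor-id>"))                # vendor interpreter
print(route(b"<o:order xmlns:o='urn:example:orders'/>"))  # order interpreter
```

Because the decision is taken inside the ordinary parse events, there is no
separate pre-parse; the same parser instance would simply go on feeding the
selected interpreter.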

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From boblyons at unidex.com  Mon Mar  1 16:46:26 1999
From: boblyons at unidex.com (Robert C. Lyons)
Date: Mon Jun  7 17:09:33 2004
Subject: Need help getting IE 5.0
Message-ID: <01BE63D8.2D163B30@cc398234-a.etntwn1.nj.home.com>

IE 5.0 is no longer available on the Microsoft web site.
It will be available on March 18, but I can't wait that long.

I downloaded a copy of ie5setup.exe from www.download.com.
When I ran ie5setup.exe, I got the following error message:
"Setup was unable to download information about installation sites."

(Note that the ie5setup.exe program is small, and it needs to pull
many IE 5.0 components from the Microsoft web site.)

Any ideas on how I can install IE 5.0 on my computer (before March 18)?

Thanks.

Bob

------
Bob Lyons
EC Consultant
Unidex Inc.
1-732-975-9877
Fax: 1-732-975-9866
boblyons(at)unidex.com




From Livinsb at rbos.co.uk  Mon Mar  1 16:59:02 1999
From: Livinsb at rbos.co.uk (Livingstone, Stephen)
Date: Mon Jun  7 17:09:33 2004
Subject: Need help getting IE 5.0
Message-ID: <217258E84FF7CF11B4630001FA44B2D502CF055A@REFROWTECX1>

I have 24MB of IE5.0 files here as separate CAB files...

I could mail them to you if you want? (tomorrow)

steven

Steven Livingstone BSc MSc GradInstP
Corporate Systems Development (TCN)
Royal Bank Of Scotland.
mailto:livinsb@rbos.co.uk
+44 0131 523 4354 [x24354]

Networking Technical Associates,
Glasgow, Scotland.
mailto:ntw_uk@hotmail.com
+44 07771-957-280


> -----Original Message-----
> From:	Robert C. Lyons [SMTP:boblyons@unidex.com]
> Sent:	Monday, March 01, 1999 4:40 PM
> To:	xml-dev@ic.ac.uk
> Subject:	Need help getting IE 5.0
> 
> 
> *** Warning : this message originates from the Internet ****
> 
> IE 5.0 is no longer available on the Microsoft web site.
> It will be available on March 18, but I can't wait that long.
> 
> I downloaded a copy of ie5setup.exe from www.download.com.
> When I ran ie5setup.exe, I got the following error message:
> "Setup was unable to download information about installation sites."
> 
> (Note that the ie5setup.exe program is small, and it needs to pull
> many IE 5.0 components from the Microsoft web site.)
> 
> Any ideas on how I can install IE 5.0 on my computer (before March
> 18)?
> 
> Thanks.
> 
> Bob
> 
> ------
> Bob Lyons
> EC Consultant
> Unidex Inc.
> 1-732-975-9877
> Fax: 1-732-975-9866
> boblyons(at)unidex.com
> 
> 






From cowan at locke.ccil.org  Mon Mar  1 17:59:53 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:33 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <000301be6361$272d2480$5ffa6ccb@baden>
Message-ID: <36DAD563.5222F16A@locke.ccil.org>

Baden Hughes wrote:

> For instance, the Sinhala character set was not in Unicode 2.0 but will be
> in 3.0. How do I get one of those characters in an XML document ? Or is that
> inconsequential to the document per se as it is simply a reference and its
> really up to the application to render it correctly ?

There is a discrepancy between the prose, which says "legal Unicode/10646
characters" and references old versions of these standards, and
the BNF, which says the Char production handles everything except
known control characters (and even some of those).

Don't worry.  The problem will be resolved.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From tbray at textuality.com  Mon Mar  1 18:24:13 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:33 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <3.0.32.19990301102354.00b09cf0@pop.intergate.bc.ca>

At 12:58 PM 3/1/99 -0500, John Cowan wrote:
>> For instance, the Sinhala character set was not in Unicode 2.0 but will be
>> in 3.0. How do I get one of those characters in an XML document ? 
>
>There is a discrepancy between the prose, which says "legal Unicode/10646
>characters" and references old versions of these standards, and
>the BNF, which says the Char production handles everything except
>known control characters (and even some of those).

John's right.  And it's not the Sinhala that first brought it home, but
the Euro character, which is clearly OK per production [2] but isn't
a "legal yadda yadda yadda" per the particular amendment of 10646/Unicode
that the XML spec references.  The W3C has some I18n heavies trying
to figure out what to do - life is made more complicated by the fact
that the Unicode people and the IETF i18n people don't always point
in the same direction, sigh; did you know the BOM was legal in UTF-8?
And of course by the fact that Unicode/10646 is a moving target.

But the bottom line is (see the public errata to the XML spec)
that production [2] is normative; both in theory and in practice,
XML processors pass through everything in that range.  In practice,
I've never actually seen anything outside of the BMP, but the 
experts agree they're showing up real soon now.   

How to get it in? Something like &#x10333; I expect.  As a programmer,
it'll show up either as two UTF-16 surrogates or 4+-byte UTF-8 string,
neither of which will look in the slightest like hex 10333.  -Tim
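
To make the last point concrete, here is a small sketch in Python (chosen
for illustration only; the element name `a` is arbitrary). The reference
resolves to one character, and its encoded forms look nothing like the
digits 10333.

```python
import xml.etree.ElementTree as ET

# Resolve the numeric character reference through an XML parser.
text = ET.fromstring("<a>&#x10333;</a>").text

print(len(text), hex(ord(text)))       # 1 0x10333  (a single code point)
print(text.encode("utf-8").hex())      # f0908cb3   (4-byte UTF-8 sequence)
print(text.encode("utf-16-be").hex())  # d800df33   (surrogate pair D800 DF33)
```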




From cowan at locke.ccil.org  Mon Mar  1 18:36:42 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:33 2004
Subject: Comments on WD-html-in-xml-19990224
References: <3.0.32.19990226132215.00ba4b00@pop.intergate.bc.ca>
Message-ID: <36DADDF2.298060AF@locke.ccil.org>

Tim Bray wrote:

> ?!? find me somewhere in a W3C or IETF document where the FPI has
> any standing.  Standards-anality aside, this is a real problem,
> because there is *no interoperable resolution mechanism*.  Surely
> you can't be serious.

Sure I'm serious.  The XHTML document (clause 3.1) gives three standard
FPIs for XHTML Strict, XHTML Transitional, and XHTML Frameset,
and *requires* that every strictly conforming XHTML document
have a DOCTYPE that refers to one of them.  The associated URL
(systemid) is allowed to vary, but not the FPI.

This is modeled on HTML 4.0, of course; clause 7.2 of that
standard mandates the appearance of one of three FPIs as well.
Similarly, HTML 3.2 (third clause) documents mandate the appearance of a
single FPI, and HTML 2.0 (RFC 1866, clause 3.3) mandates the appearance
of one of five FPIs.

Resolution is irrelevant; it's the FPI itself that says what kind of
(X)HTML you have.

Table of FPIs:

-//W3C//DTD XHTML 1.0 Strict//EN
-//W3C//DTD XHTML 1.0 Transitional//EN
-//W3C//DTD XHTML 1.0 Frameset//EN

-//W3C//DTD HTML 4.0//EN
-//W3C//DTD HTML 4.0 Transitional//EN
-//W3C//DTD HTML 4.0 Frameset//EN

-//W3C//DTD HTML 3.2 Final//EN

-//IETF//DTD HTML 2.0//EN
-//IETF//DTD HTML 2.0 Level 2//EN
-//IETF//DTD HTML 2.0 Level 1//EN
-//IETF//DTD HTML 2.0 Strict//EN
-//IETF//DTD HTML 2.0 Strict Level 1//EN
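
For the XHTML entries, reading the FPI needs no resolution at all, as this
sketch with Python's expat binding suggests. The `FLAVOURS` table, the
function names and the system identifier in the sample are placeholders of
mine; the HTML 2.0 to 4.0 entries are SGML and would need an SGML-aware
reader instead.

```python
import xml.parsers.expat

# FPIs taken from the table above; the labels are illustrative.
FLAVOURS = {
    "-//W3C//DTD XHTML 1.0 Strict//EN": "XHTML 1.0 Strict",
    "-//W3C//DTD XHTML 1.0 Transitional//EN": "XHTML 1.0 Transitional",
    "-//W3C//DTD XHTML 1.0 Frameset//EN": "XHTML 1.0 Frameset",
}


def public_id(doc: bytes):
    """Return the FPI from the DOCTYPE declaration, or None if there is none."""
    found = []

    def on_doctype(name, system_id, pub_id, has_internal_subset):
        found.append(pub_id)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartDoctypeDeclHandler = on_doctype
    parser.Parse(doc, True)  # the external DTD is never fetched
    return found[0] if found else None


def flavour(doc: bytes) -> str:
    """Classify a document by its FPI alone."""
    return FLAVOURS.get(public_id(doc), "unknown")


sample = (
    b'<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" '
    b'"placeholder-system-id.dtd"><html/>'
)
print(flavour(sample))  # XHTML 1.0 Strict
```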

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Mon Mar  1 19:10:38 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:33 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <3.0.32.19990301102354.00b09cf0@pop.intergate.bc.ca>
Message-ID: <36DAE5FA.5BA2D70E@locke.ccil.org>

Timothaeus Bray scripsit:

> [D]id you know the BOM was legal in UTF-8?

The BOM isn't just a BOM, it's also the ZWNBSP (zero-width
non-breaking space; no, I do not know how to pronounce that
acronym) character, and is interpreted as a BOM only at the
beginning of UCS-2 or UTF-16 documents.  Not to worry; the character is
as near to a no-op as Unicode allows for.

> And of course by the fact that Unicode/10646 is a moving target.

Only sort of.  8859-1 is theoretically a moving target too, except
that all the slots are full; CP 1252 is a moving target that has
just moved (by adding the euro at 0x80).  In all these cases, characters 
can be added (in principle) but not moved or deleted (any more).
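
The CP 1252 change can be checked against any codec library that carries
the updated table. A quick sketch using Python's built-in codecs (the
`encodable` helper is only for this example):

```python
EURO = "\u20ac"


def encodable(ch: str, codec: str) -> bool:
    """Report whether a character has a slot in the given character set."""
    try:
        ch.encode(codec)
        return True
    except UnicodeEncodeError:
        return False


print(b"\x80".decode("cp1252") == EURO)  # True: the euro sits at 0x80
print(encodable(EURO, "cp1252"))         # True
print(encodable(EURO, "iso-8859-1"))     # False: no euro in 8859-1
```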
 
> In practice,
> I've never actually seen anything outside of the BMP, but the
> experts agree they're showing up real soon now.

Not until Unicode 4.0, unless someone wants to use the private-use
planes 15 and 16.
 
> How to get it in? Something like &#x10333; I expect.

Exactly so.  Or the decimal NCR equivalent.  Two NCRs representing
the surrogates separately would be erroneous by both Unicode/10646
definitions and XML definitions.
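
A short check of this with an expat-based parser, here Python's
ElementTree; the `is_well_formed` helper is hypothetical and other parsers
may word their errors differently:

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts the document, False on a parse error."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


print(is_well_formed("<a>&#x10333;</a>"))         # True: hexadecimal NCR
print(is_well_formed("<a>&#66355;</a>"))          # True: decimal NCR, 0x10333 == 66355
print(is_well_formed("<a>&#xD800;&#xDF33;</a>"))  # False: surrogates are not characters
```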

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From tbray at textuality.com  Mon Mar  1 19:26:18 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:33 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <3.0.32.19990301112529.00c0a5e0@pop.intergate.bc.ca>

At 02:09 PM 3/1/99 -0500, John Cowan wrote:
>Timothaeus Bray scripsit:
>
>> [D]id you know the BOM was legal in UTF-8?
>
>The BOM isn't just a BOM, it's also the ZWNBSP (zero-width
>non-breaking space; no, I do not know how to pronounce that
>acronym) character, and is interpreted as a BOM only at the
>beginning of UCS-2 or UTF-16 documents.  Not to worry; the character is
>as near to a no-op as Unicode allows for.

I think there is reason for worry.  In a UTF-16 document, you can
have a BOM and then the <?xml version=?>, and that PI will still
be recognized as the XML declaration.  The spec is, I think,
pretty clear, that a ZWNBSP or any other *data* character before
the XML declaration is verboten.  So... it seems that in UTF8,
a ZWNBSP as first character in the file isn't a data character.
Blecch.

 -Tim



From cowan at locke.ccil.org  Mon Mar  1 19:43:07 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:33 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <3.0.32.19990301112529.00c0a5e0@pop.intergate.bc.ca>
Message-ID: <36DAED75.86978455@locke.ccil.org>

Tim Bray wrote:

> So... it seems that in UTF8,
> a ZWNBSP as first character in the file isn't a data character.

Can you quote chapter and verse for this, either Unicode or 10646?
The latter spec tells you that the sequence EF BB BF may be used as
a *signature* at the beginning of UTF-8 data (since it is unlikely
to occur in any other kind), but does not IMHO imply that the
sequence is removable or doesn't represent a real ZWNBSP.
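
The two readings can be seen side by side in Python, whose codecs happen to
offer both (this shows the library's behaviour, not what either standard
requires):

```python
import codecs

ZWNBSP = "\ufeff"

# U+FEFF encoded as UTF-8 is exactly the signature sequence EF BB BF.
print(ZWNBSP.encode("utf-8").hex())               # efbbbf
print(codecs.BOM_UTF8 == ZWNBSP.encode("utf-8"))  # True

data = codecs.BOM_UTF8 + b"<?xml version='1.0'?><a/>"

as_character = data.decode("utf-8")      # keeps U+FEFF as a real first character
as_signature = data.decode("utf-8-sig")  # treats it as a signature and drops it

print(as_character[0] == ZWNBSP)  # True
print(as_signature[0])            # <
```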

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From tbray at textuality.com  Mon Mar  1 19:57:25 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:33 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <3.0.32.19990301115652.00c2d770@pop.intergate.bc.ca>

At 02:41 PM 3/1/99 -0500, John Cowan wrote:
>Tim Bray wrote:
>
>> So... it seems that in UTF8,
>> a ZWNBSP as first character in the file isn't a data character.
>
>Can you quote chapter and verse for this, either Unicode or 10646?

That is *exactly* the question that's now being pursued, and
I gather it is in play right now in the IETF (or was that Unicode, I
forget which).  -Tim



From russell at latticesemi.com  Mon Mar  1 20:06:20 1999
From: russell at latticesemi.com (Jerry Russell)
Date: Mon Jun  7 17:09:33 2004
Subject: Announce: XML directory/search engine
Message-ID: <Pine.SUN.4.02.9903011205530.9915-100000@tiger>

There is a new site devoted to sites and documents created in XML. You can
now begin submitting your sites. 

The new site is at:  worldwideweave.com

--------------------------------------------
Jerry Russell               Product Engineer
Lattice Semiconductor    408-428-6400 x. 274




From daniela at cnet.com  Mon Mar  1 20:09:54 1999
From: daniela at cnet.com (Daniel Austin)
Date: Mon Jun  7 17:09:33 2004
Subject: Content-Document-Type: was  (Re: MIME types vs. DOCTYPE)
Message-ID: <77A952A6B467D211855D00805F9521F11492E9@cnet10.cnet.com>


Greetings,


	Here I am speaking for myself, not the HTML Working Group or CNET:

> -----Original Message-----
> From: Walter Underwood [mailto:wunder@infoseek.com]
> Sent: Friday, February 26, 1999 9:42 AM
> To: xml-dev@ic.ac.uk
> Cc: www-html-editor@w3.org
> Subject: Re: Content-Document-Type: was (Re: MIME types vs. DOCTYPE)

<SNIP Content="off-topic"/>


> The objection about thin clients or palmtops not wanting to download
> large files doesn't really hold water. XML will generally be the 
> smallest files. Mine are almost always smaller than the corresponding
> HTML. Powerpoint, PDF, JPEG -- those are big files. 

	This is simply incorrect. The limited capabilities of thin clients
and the expense of transmission of the information require
capabilities-based analysis and profiling of documents on a per-client
basis. As an example, consider a web-enabled cellphone such as this one:
http://www.attws.com/business/pocketnet/index.html. The transmission costs to
this device vary greatly worldwide, from ~$1/minute in the US to ~$22/min in
Nairobi (actually you can only get basic cell phone via satellite in
Nairobi, but let's pretend.) If I send a 1/2 megabyte XHTML file to this
device, including its 100K CSS stylesheet, the user is entirely justified in
bringing legal action against me. The page would cost many tens or hundreds
of dollars to send, and of course could not be displayed. In fact the client
phone would necessarily display an HTTP error message (or its equivalent) on
the tiny screen. Not to mention the costs of transmitting the inevitable
~12k banner ad, which again cannot be displayed. (Information may want to be
free but information providers want to get paid.)

	At this point in time, no method other than MIME types exists for
informing the client of the type of content arriving, without first
downloading the entire file and then checking it, an obvious absurdity.
Doctypes, FPIs, etc. have all been suggested, but none of these solutions
provides the necessary level of transaction control required to identify
the content prior to content reception. Given the massive costs involved,
the client must always be allowed to reject content prior to downloading
the entire file.
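
A minimal sketch of such a header-only decision, in Python. The function
names, the set of accepted types and the size limit are hypothetical and
chosen purely for illustration; the only point is that the client decides
from the Content-Type (and, where present, Content-Length) headers before
any of the body is transferred.

```python
from email.message import Message


def parse_content_type(header_value):
    """Return the bare media type in lower case, e.g. 'text/html'."""
    msg = Message()
    msg["Content-Type"] = header_value
    return msg.get_content_type()


def should_download(headers, accepted_types, max_bytes):
    """Decide from the response headers alone whether to fetch the body.

    headers        -- dict of header name to value (hypothetical input shape)
    accepted_types -- media types this client is able to display
    max_bytes      -- largest body this client is willing to pay for
    """
    raw_type = headers.get("Content-Type", "application/octet-stream")
    if parse_content_type(raw_type) not in accepted_types:
        return False
    length = headers.get("Content-Length")
    if length is not None and int(length) > max_bytes:
        return False
    return True


if __name__ == "__main__":
    small_page = {"Content-Type": "text/html; charset=UTF-8",
                  "Content-Length": "2048"}
    print(should_download(small_page, {"text/html"}, 8192))
```

A check like this is only as good as the headers the server sends, which is
the reliability concern raised further down in this message.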


> Adding an XML-specific HTTP header line makes HTTP 1.1 more complex
> (shudder), and imposes an extra coding and testing burden on HTTP
> implementations. Also, it does nothing for XHTML over other 
> transports,
> like SMTP or FTP.


	It is also introducing a new set of dependencies for all XML
documents. Not feasible.


> Essentially, this is document information, not protocol information. 
> It belongs in the document. To describe the document out-of-line, 
> use RDF, not HTTP headers.

	Thin clients will almost necessarily reject all RDF documents (and
most XML documents in general). RDF is complex and experimental; I am
unconvinced that a cell phone should have to deal with it.


> Pragmatically, HTTP Content-type isn't even reliable. Somebody will 
> decide that Excel and XML are the same thing, and start serving 
> spreadsheets as text/xml. Cell phones have to deal with that world, 
> and adding things to the HTTP spec doesn't fix ignorant sysadmins. 

True; unfortunate; costly for the victims; possibly legally actionable.


> XHTML Spec comment: the spec doesn't mention application/xml. 
> It should. 
> If application/xml is never appropriate for XHTML (say, the UTF-16
> encoding is forbidden), then say so.

	The XHTML spec is very clear on this, explicitly stating the MIME
types that can be used. No other MIME types are *ever* appropriate. With
MIME types being used for document type identification, sending a document
with the wrong MIME type guarantees an error.


> 
> XHTML Spec comment: Are the Strict, Transitional, and Frameset DTDs
> subsets or extensions? Or neither? Is one a subset of another? These
> intentions should be spelled out in the spec so that future versions
> won't break them.
> 

The 3 XHTML DTDs are neither subsets nor extensions in a literal sense. They
correspond as closely as possible to the HTML 4.0 DTDs of the same names.
While to some extent the 'strict' DTD is a subset of the other two, it also
uses different content models for elements with the same name. One could
not, for practical purposes, use it as an external subset and include the
frameset DTD as an internal DTD subset without conflict between their
content models. I will not attempt to justify the division of HTML into
these 3 groupings - this was decided by the HTML 4.0 committee and is
loosely justified by the HTML 4.0 specification. Current attempts are
designed to follow this existing prior art to the greatest extent possible.

Regards,

D-



From mda at discerning.com  Mon Mar  1 21:04:56 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:09:34 2004
Subject: xml style questions
Message-ID: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com>

before you scream, this isn't about style sheets, and it
isn't about attributes vs. elements.
rather, this is more how to structure your document/data.

any words of wisdom regarding:

1) having an extra collection layer in the xml tree, like
<root><things><thing></thing><thing></thing></things><another></another></root>
vs.
<root><thing></thing><thing></thing><another></another></root>

2) having PCDATA vs. having a distinct "comment" or "description" element child:
<thing a="1" b="2">
this is the description of this thing
<some_child></some_child>
</thing>
vs.
<thing a="1" b="2">
<desc>this is the description of this thing</desc>
<some_child></some_child>
</thing>

-mda




From mrc at allette.com.au  Mon Mar  1 21:26:47 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:09:34 2004
Subject: xml style questions
References: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com>
Message-ID: <36DB05E6.B4CD1C4D@allette.com.au>


Mark D. Anderson wrote:

> any words of wisdom regarding:
>
> 1) having an extra collection layer in the xml tree, like
> <root><things><thing></thing><thing></thing></things><another></another></root>
> vs.
> <root><thing></thing><thing></thing><another></another></root>

Would you consider <thing> and <another> to be siblings? If so, I wouldn't compartmentalise.
Alternatively, if <thing> can appear after <another> but these have different significance, I
would compartmentalise.

> 2) having PCDATA vs. having a distinct "comment" or "description" element child:
> <thing a="1" b="2">
> this is the description of this thing
> <some_child></some_child>
> </thing>
> vs.
> <thing a="1" b="2">
> <desc>this is the description of this thing</desc>
> <some_child></some_child>
> </thing>

If you are going to have a need to deal with <desc> in some way and it could get mixed up with
other #PCDATA, I'd create an element. My instinct would be to mark it up as an element unless
the overhead was excessive, but I think that sort of thing is driven by (a) immediate or
foreseeable requirements, followed by (b) personal taste.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From db at Eng.Sun.COM  Mon Mar  1 22:20:52 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun  7 17:09:34 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com>
Message-ID: <36DB1161.893184EE@eng.sun.com>

roddey@us.ibm.com wrote:
> 
> Does the following violate the 'partial markup in entity' rule of XML?
> 
> <!ENTITY Part1 "<!ELEMENT ">
> <!ENTITY Part2 " Bubba ANY>">
> <!ENTITY Whole "%Part1;%Part2">
> %Whole;

I'll assume you intended to work with parameter entities; then as Richard
pointed out this can be legal ... if the three syntax errors are corrected
("<!ENTITY % Part" twice, "%Part2;") AND if this is found in an external
parameter entity not an internal one (which disallows PEs inside entity
declarations -- a WF constraint).


> So I'm assuming
> that this is ok, that the prohibition against partial markup refers to the
> eventual use of the entity, not to the definition thereof?

Right -- this would violate _validity_ constraints (but a nonvalidating
parser should accept it just fine):

	<!DOCTYPE Bubba [
	  <!ENTITY % Part1 "<!ELEMENT ">
	  <!ENTITY % Part2 " Bubba ANY>">
	  <!-- next is a validity error in both internal
		 and external subsets -->
	  %Part1;%Part2;
	]>
	<Bubba></Bubba>

Another way to make an error out of your declarations is to make the
PEs be external, not internal -- then they'd not match full grammatical
productions.

- Dave



From marcelo at mds.rmit.edu.au  Mon Mar  1 22:32:12 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:34 2004
Subject: Streaming XML and SAX
In-Reply-To: <36DA2858.43F3EA7A@thinlink.com>; from Tom Harding on Sun, Feb 28, 1999 at 09:40:40PM -0800
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> <36DA2858.43F3EA7A@thinlink.com>
Message-ID: <19990302093128.A19583@io.mds.rmit.edu.au>

On Sun, Feb 28, 1999 at 09:40:40PM -0800, Tom Harding wrote:
> Marcelo Cantos wrote:
> 
> > It has already been pointed out in this discussion that some
> > environments try to increase the throughput by dispatching
> > documents off to different threads.  A system with 50 CPU's is
> > going to be operating as low as 2% capacity if it is forced to
> > pipe the entire parsing load through a single thread.  I don't see
> > how you can argue that this is efficient.
> 
> Even if you believe that parsing to convert markup into memory
> structures is slower than back-end processing, if parsing is faster
> than the stream itself there is no difference in the two approaches.

That is an awfully big _if_ to enshrine in a standard (if that's where
all this brouhaha ultimately ends up).  What if client and server
are on the same machine?

> Anyway, in the general case the question is moot because there may
> be inter-document dependencies, so you have to look inside the
> document before trying to parallelize.

The question is far from moot since an enormous class of very
interesting problems does not fall into this category.  There are
myriad applications for self-contained XML packets.

Furthermore, inter-document dependencies are not a fundamental problem
for parallelisation.  Threads can talk to each other and block waiting
for other threads to finish parsing, while allowing other threads to
continue independent tasks.  You are suggesting that because in some
cases it isn't trivial to parallelise we should therefore never even
allow the possibility of such a thing to occur.

> The whole point of this discussion was whether the document
> terminator ought to be XML or non-XML.  Aside from the fact that I
> haven't yet seen a workable suggestion for a non-XML terminator,

I am frankly incredulous that there are no systems, protocols or
standards available today that adequately address the need to stream
multiple logical units of information.  This is not a new problem.
Let me suggest one off the top of my head: send a null terminated
decimal length, followed by a document.  This is sufficient to
dispatch data to multiple threads and raise concurrency levels.  Any
further processing can be done inside the parsers.
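
A rough sketch of that framing in Python, under stated assumptions: the
length is taken to be an ASCII decimal byte count terminated by a single
NUL byte, and the helper names (write_frame, read_frames, root_tag,
parse_all) are made up for illustration. The reader splits the stream
without looking inside any document, then hands each document to a worker.

```python
import io
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def write_frame(stream, document):
    """Write one document as: decimal byte length, NUL, document bytes."""
    stream.write(str(len(document)).encode("ascii") + b"\x00" + document)


def read_frames(stream):
    """Yield each framed document without inspecting its contents."""
    while True:
        digits = bytearray()
        while True:
            byte = stream.read(1)
            if not byte:
                if digits:
                    raise ValueError("stream ended inside a length prefix")
                return  # clean end of stream between frames
            if byte == b"\x00":
                break
            digits += byte
        length = int(digits.decode("ascii"))
        # A file-like object is assumed here; a raw socket may return
        # fewer bytes per read and would need a loop.
        document = stream.read(length)
        if len(document) != length:
            raise ValueError("stream ended inside a document")
        yield document


def root_tag(document):
    """Stand-in for real per-document work: parse and report the root."""
    return ET.fromstring(document).tag


def parse_all(stream, workers=4):
    """Dispatch every framed document to a pool of worker threads."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(root_tag, read_frames(stream)))


if __name__ == "__main__":
    buf = io.BytesIO()
    write_frame(buf, b"<a/>")
    write_frame(buf, b"<b><c/></b>")
    buf.seek(0)
    print(parse_all(buf))
```

The thread pool here only illustrates the dispatch step; it is not a
performance claim about any particular runtime.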

> it
> isn't necessary to completely examine a document or convert it to a
> tree just to find an XML terminator.

You can do better than a well-formedness parser?  What are you going
to do, grep for </doc>?

> As Nathan pointed out, you
> could write a semi-parser to find terminators and then actually
> parse documents in parallel, but you'd need to suggest a way for
> dealing with inter-document dependencies.

You get the threads to talk.  Inter-document dependencies are not and
need not be a protocol issue.

At the end of the day, the problem of streaming documents is not a
difficult one to solve at the protocol level (HTTP-NG will have it
built in, AFAIK).  Why do you want to complicate life by overloading
the parser's job?

Actually, my real question is, what on earth do you hope to gain?  Or
is this just a philosophical preference thing?


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From MarkM at SapphireGroup.com  Mon Mar  1 23:02:22 1999
From: MarkM at SapphireGroup.com (Mark Murphy)
Date: Mon Jun  7 17:09:34 2004
Subject: Looking for XML Filtering Projects
Message-ID: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com>

At XTech '99, I am delivering a presentation on information filtering
applied to XML -- given a source of new/changed XML-encoded data,
determining which of a set of people are interested in that XML based on
filter criteria.

I want to make sure I mention any relevant work in this area, besides my own
and other projects I'm already aware of (e.g., XTenit.com, XML-enabled
search tools like sgrep).

If you are working on information filtering applied to XML, and you would
like your project mentioned at XTech '99, please send me an e-mail
(MarkM@SapphireGroup.com) with relevant details, and I'll be sure to include
you in my presentation!

Mark L. Murphy
The Sapphire Group, Inc.
MarkM@SapphireGroup.com




From tomh at thinlink.com  Mon Mar  1 23:33:36 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun  7 17:09:34 2004
Subject: Streaming XML and SAX
References: <4.0.1.19990223210727.00e59d50@pop.hesketh.net> <14036.1186.399749.89131@localhost.localdomain> <36D46419.73F63780@thinlink.com> <14036.28216.379328.364771@localhost.localdomain> <36D82244.DB014ECE@thinlink.com> <19990301114841.B4466@io.mds.rmit.edu.au> <36DA2858.43F3EA7A@thinlink.com> <19990302093128.A19583@io.mds.rmit.edu.au>
Message-ID: <36DB2399.BC94B7E4@thinlink.com>

Marcelo Cantos wrote:

> Furthermore, inter-document dependencies are not a fundamental problem
> for parallelisation.  Threads can talk to each other and block waiting
> for other threads to finish parsing, while allowing other threads to
> continue independent tasks.  You are suggesting that because in some
> cases it isn't trivial to parallelise we should therefore never even
> allow the possibility of such a thing to occur.

I was not suggesting that.  I merely said that in the general case, knowing how to parallelize
requires looking at the data in the stream.  I propose that this data, like everything else,
be stored in XML and that before doing anything else, the endpoint ought to parse it.

I'm sorry if I gave the impression that I think XP is the solution to everything.  I merely
think it would be useful for a lot of things.  If you're judging it on the criterion of being
able to accomplish something that was impossible before, I'm not surprised you're
disappointed.




From marcelo at mds.rmit.edu.au  Mon Mar  1 23:39:37 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:34 2004
Subject: Looking for XML Filtering Projects
In-Reply-To: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com>; from Mark Murphy on Mon, Mar 01, 1999 at 06:02:08PM -0500
References: <000201be6437$8833fd40$4f9646d1@opal.sapphiregroup.com>
Message-ID: <19990302103859.B19583@io.mds.rmit.edu.au>

On Mon, Mar 01, 1999 at 06:02:08PM -0500, Mark Murphy wrote:
> At XTech '99, I am delivering a presentation on information filtering
> applied to XML -- given a source of new/changed XML-encoded data,
> determining which of a set of people are interested in that XML based on
> filter criteria.
> 
> I want to make sure I mention any relevant work in this area, besides my own
> and other projects I'm already aware of (e.g., XTenit.com, XML-enabled
> search tools like sgrep).

Our database server (SIM) has a facility for querying a database at
regular intervals.  The results are masked with a last-modified
filter, which is updated each time the query is issued.  This means
that users can run a session, build up queries (either by creating new
ones, or merging prior result sets with boolean operators) and then
save them.  They can then have those saved queries executed regularly
on any new or changed data and a notification sent to them in an
appropriate manner (e.g. an emailed page of abstracts and accompanying
links).

The beauty of this approach is that it conflates the concept of filter
and query.  Hence, users wishing to filter documents for items of
interest have the full expressive querying power of the database with
which to define their peculiar interests.
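
The idea can be sketched in a few lines of Python. This is not SIM's
interface, which is not shown here; Record, SavedQuery and the integer
timestamps are hypothetical stand-ins. A saved query is an ordinary
predicate, and the "filter" is just that predicate masked by the time of
its previous run.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Record:
    key: str
    text: str
    modified: int  # logical timestamp of the last change (hypothetical)


@dataclass
class SavedQuery:
    matches: Callable[[Record], bool]
    last_run: int = 0

    def run(self, records: List[Record], now: int) -> List[Record]:
        """Return matching records changed since the previous run."""
        hits = [r for r in records
                if r.modified > self.last_run and self.matches(r)]
        self.last_run = now  # update the last-modified mask
        return hits


def both(first, second):
    """Merge two queries with a boolean AND."""
    return lambda record: first(record) and second(record)


if __name__ == "__main__":
    mentions_xml = lambda r: "xml" in r.text.lower()
    mentions_sax = lambda r: "sax" in r.text.lower()
    query = SavedQuery(both(mentions_xml, mentions_sax))
    records = [Record("1", "Streaming XML and SAX", 5),
               Record("2", "XML style questions", 7)]
    print([r.key for r in query.run(records, now=10)])
```

Running the same saved query again later returns only records modified
after the previous run, which is the notification behaviour described
above.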


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From dalapeyre at mulberrytech.com  Mon Mar  1 23:47:28 1999
From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre)
Date: Mon Jun  7 17:09:34 2004
Subject: xml style questions
In-Reply-To: <072201be6426$a82897c0$0200a8c0@mdaxke.mediacity.com>
Message-ID: <v03020905b300c5c02931@DialupEudora>

Mark Anderson wrote:
>any words of wisdom regarding:
>1) having an extra collection layer in the xml tree, like
><root><things><thing></thing><thing></thing></things><another></another></root>
>vs.
><root><thing></thing><thing></thing><another></another></root>

If you have ANY reason to think you may need the collection layer,
put it in.  Reasons you might want it include things like:

  a) Reuse - <thing>s are frequently used together
     and you want electronic cut-and-paste and/or
     even a really stupid parsing algorithm to be
     able to find them all easily.

     The converse is the same, if you want to ignore
     all <thing>s, group them.

  b) You need some sort of behavior or formatting
     at the collection level.  This could be as simple
     as wanting a new indent level in the
     generated toc.  This is the most
     common reason in practice.

  c) For correct hierarchical layering, <thing>s
     just aren't as big and important as <another>s
     so they don't belong at the same level.

etc.  Yes, much of this could also be done by asking
if you are the first <thing> among your siblings, etc.
But sometimes event-driven processing is easier or faster
than tree walking, and a containing element gives you
your event.
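
To make the point about events concrete, here is a small sketch using
Python's standard xml.sax module. The handler class and the event labels
are invented for illustration. With the container present, the start and
end of the collection arrive as events of their own; without it, the
handler would have to track sibling positions itself.

```python
import xml.sax


class ThingsHandler(xml.sax.ContentHandler):
    """Record the events a stream-based processor would see."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        if name == "things":
            self.events.append("begin collection")
        elif name == "thing":
            self.events.append("thing")

    def endElement(self, name):
        if name == "things":
            self.events.append("end collection")


def collection_events(xml_text):
    """Parse xml_text and return the recorded event labels."""
    handler = ThingsHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


if __name__ == "__main__":
    grouped = ("<root><things><thing></thing><thing></thing></things>"
               "<another></another></root>")
    flat = "<root><thing></thing><thing></thing><another></another></root>"
    print(collection_events(grouped))
    print(collection_events(flat))
```

In the flat form only the individual thing events appear, so any
collection-level behavior has to be inferred from position.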


>2) having PCDATA vs. having a distinct "comment" or "description" element
>child:
><thing a="1" b="2">this is the description of this thing
><some_child></some_child></thing>
>vs.
><thing a="1" b="2"><desc>this is the description of this thing</desc>
><some_child></some_child></thing>

As a style issue, I favor the explicit description.  Makes programming life
easier all around, costs next to nothing.  Programs can easily find the two
equivalent, but, in my experience, people don't.
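
A short sketch of the difference from the programmer's side, using
Python's xml.etree.ElementTree. The two helper functions are hypothetical.
With mixed content the description has to be reassembled from the text
around the child elements; with the explicit element it is a single
lookup.

```python
import xml.etree.ElementTree as ET

MIXED = """<thing a="1" b="2">
this is the description of this thing
<some_child></some_child>
</thing>"""

EXPLICIT = """<thing a="1" b="2">
<desc>this is the description of this thing</desc>
<some_child></some_child>
</thing>"""


def description_from_mixed(thing):
    """Collect the loose text of <thing>: its own text plus child tails."""
    parts = [thing.text or ""]
    parts.extend(child.tail or "" for child in thing)
    return " ".join("".join(parts).split())


def description_from_element(thing):
    """Read the description from the dedicated <desc> child."""
    return thing.findtext("desc", default="").strip()


if __name__ == "__main__":
    print(description_from_mixed(ET.fromstring(MIXED)))
    print(description_from_element(ET.fromstring(EXPLICIT)))
```

Both calls produce the same string, which is the "programs can find them
equivalent" half of the argument; the second is simply easier to read and
harder to get wrong.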

--Debbie


======================================================================
Deborah Aleyne Lapeyre               mailto:dalapeyre@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9633
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================




From ralph at fsc.fujitsu.com  Tue Mar  2 00:58:04 1999
From: ralph at fsc.fujitsu.com (Ralph Ferris)
Date: Mon Jun  7 17:09:34 2004
Subject: HyBrick Support for XPointer
Message-ID: <3.0.5.32.19990301165634.00a7f3a0@pophost.fsc.fujitsu.com>

Previous announcements of HyBrick's support for XPointer have not detailed
which features are supported. One reason of course is that the discussion
of XPointer continues within the W3C WG. With the announcement of the most
recent version of HyBrick resulting in a significant number of downloads,
it looks like a good time to state which features are available.

Based on the March, 1998 XPointer draft, HyBrick users can test:

 - All absolute loc terms: root(), html(), id(), origin() 
 - All relative loc terms: child(), ancestor(), descendant(), following()
                            preceding(), fsibling(), psibling()
 - The attr() loc term    

Quick Intro: psibling Example

Here's a quick introduction to using these features:

- Go to the Samples\XLink-sample directory
- Open the readme.xml file
- Inside the first xlink element, under the first locator element:
  <locator role="Hubdocument Title" href="#root().child(1,title)"/>

insert:
<locator role="Overview" href="#id(p6).psibling(1,#element)"/>

- In the first p element start tag after <title>Overview</title>,
add the attribute/value pair id="p6".

- Go to the dtd directory and open the sample.dtd file.
- Add <!ATTLIST p id ID #IMPLIED> after the <!ELEMENT p ... declaration.

Now open the readme.xml file in HyBrick. 
- Left click on the title
- Note that Role[1] Overview is now available in the Locator List. 

Selecting this role causes HyBrick to scroll to 1.2 Overview.
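
For readers without HyBrick at hand, the effect of
id(p6).psibling(1,#element) can be approximated in Python with
xml.etree.ElementTree. This is only a sketch: the sample document below is
made up (it is not the readme.xml shipped with HyBrick), the psibling
helper is hypothetical, and the id is matched by attribute name rather
than by a DTD-declared ID type.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<section>
  <title>Overview</title>
  <p id="p6">First paragraph of the overview.</p>
  <p>Second paragraph.</p>
</section>"""


def psibling(root, element_id, n=1):
    """Return the n-th preceding element sibling of the element with the
    given id, counting backwards from that element, or None."""
    if n < 1:
        raise ValueError("n must be 1 or greater")
    # ElementTree has no parent pointers, so build a child-to-parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    target = next((el for el in root.iter()
                   if el.get("id") == element_id), None)
    if target is None or target not in parent_of:
        return None
    siblings = list(parent_of[target])
    position = siblings.index(target) - n
    return siblings[position] if position >= 0 else None


if __name__ == "__main__":
    found = psibling(ET.fromstring(SAMPLE), "p6")
    print(found.tag, found.text)
```

In the sample, the element immediately before the p with id="p6" is the
title, which matches the behavior described above of scrolling to the
Overview heading.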

Best regards,

Ralph E. Ferris
Fujitsu Software Corporation



From matt at veosystems.com  Tue Mar  2 01:42:26 1999
From: matt at veosystems.com (matt@veosystems.com)
Date: Mon Jun  7 17:09:34 2004
Subject: WET ICE Workshop on Integrating XML and Distributed Object Technologies
Message-ID: <19990302014213.19138.qmail@veosystems.com>

> Call for Participation to one of the workshops of WET ICE
> 
> 	IEEE 8th International Workshops on Enabling Technologies:
> 		Infrastructure for Collaborative Enterprises.
> 
> 	16-18 June 1999
> Stanford University, California USA
> 
> For more information: http://www.ida.liu.se/conferences/WETICE/
> ______________________________________________________
> 
> WET ICE Workshop on Integrating XML and Distributed Object Technologies
> 
> For more information: http://www.cerc.wvu.edu/workshop2/xmlobjects.html
> 
> Call for Papers and Workshop Description
> 
> The Internet world is being transformed before our eyes as open standards
> such as XML are being rapidly adopted. The XML technologies are being seen
> as a harbinger of various new functionality in numerous domains ranging
> from electronic commerce to electronic publishing to healthcare delivery to
> manufacturing to insurance. Various object-oriented technologies and
> standards such as Java, CORBA and DCOM have also progressed rapidly in the
> past few years. At this time, the industry and academia are seriously
> looking at the intersection of these technologies and what it means to the
> future of the object-web paradigm. This workshop aims to bring together
> participants who are seriously investigating the combined use of these
> technologies to support practical application needs in a variety of
> domains. The goal of this workshop is to investigate how XML and
> Distributed Object technologies such as Java, CORBA and DCOM can be
> integrated, leveraging the strengths each has to offer.
> 
>      Integrating XML and Distributed Object technologies
>      Advances in XML: DOM, SAX, XSL, Schemas, XLink as it relates to Objects
>      Advances in CORBA 3.0, Java, DCOM as it relates to XML
>      Tools and utilities that facilitate integration of XML and
>        object-technologies
>      Application of XML and Object technologies in E-commerce, Finance,
>        Healthcare, Publishing, Insurance and Manufacturing and System
>        Integration. The purpose of these examples should be to show
>        specific successful integration approaches of XML and objects.
> 
> Workshop Chairs:
> 
> V. "Juggy" Jagannathan
> Concurrent Engineering Research Center
> West Virginia University
> P.O. Box 6506
> Morgantown, WV, USA 26506-6506
> Email: juggy@cerc.wvu.edu
> 
> Matthew Fuchs
> Veo Systems, Inc.
> Email: matt@veosystems.com
> 
> ______________________________________________________
> 
> About WET ICE
> 
> WET ICE is an annual, international forum for state-of-the-art research in
> enabling technologies for collaboration.
> 
> WET ICE '99 will consist of parallel, three-day workshops on different
> topics related to collaboration technology. Each workshop will include
> paper presentations and working group discussions, with additional joint
> keynote sessions and a final joint session to summarize each group's
> findings.
> 
> What sets WET ICE apart from larger conferences is that the workshops are
> kept small enough to promote fruitful discussions on the latest technology
> developments, directions, problems, and requirements. Each group will
> produce a summary report which will appear in the post-proceedings to be
> published by IEEE Computer Society Press.
> 
> 




From murata at apsdc.ksp.fujixerox.co.jp  Tue Mar  2 02:31:41 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:09:34 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <199903020231.AA03678@murata.apsdc.ksp.fujixerox.co.jp>

John Cowan writes:
>Tim Bray writes:
>> In practice,
>> I've never actually seen anything outside of the BMP, but the
>> experts agree they're showing up real soon now.
>
>Not until Unicode 4.0, unless someone wants to use the private-use
>planes 15 and 16.

It is my understanding that Unicode 3.0 will have many ideographic 
characters which are outside of the BMP.

>John Cowan writes:
>Tim Bray writes:
>> So... it seems that in UTF8,
>> a ZWNBSP as first character in the file isn't a data character.
>
>Can you quote chapter and verse for this, either Unicode or 10646?
>The latter spec tells you that the sequence EF BB BF may be used as
>a *signature* at the beginning of UTF-8 data (since it is unlikely
>to occur in any other kind), but does not IMHO imply that the
>sequence is removable or doesn't represent a real ZWNBSP.

Attached is quoted from A2 of N1396 ISO/IEC 10646-1 Corrigendum 
no. 2 (First draft - revised to 30 April 1996), which was (is?) available 
at http://osiris.dkuug.dk/JTC1/SC2/WG2/docs/N1396.doc

The para most relevant to your question is:
>An application receiving data may either use these signatures to
>identify the coded representation form, or may ignore them and treat
>FEFF as the ZERO WIDTH NO-BREAK SPACE character.

How do you interpret this "or"?   One could argue that when EF BB BF 
is recognized as a signature, it is not treated as the ZWNBSP.  Unfortunately, 
every description of the BOM (even for UCS-2 or UTF-16) is unclear 
and subject to different interpretations, as I see it.

Cheers,


Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp
---------------------------------------------------------

Annex F
(informative)
The use of "signatures" to identify UCS


 This annex describes a convention for the identification of features
of the UCS, by the use of "signatures" within data streams of coded
characters. The convention makes use of the character ZERO WIDTH
NO-BREAK SPACE, and is applied by a certain class of applications.

When this convention is used, a signature at the beginning of a stream
of coded characters indicates that the characters following are
encoded in the UCS-2 or UCS-4 coded representation, and indicates the
ordering of the octets within the coded representation of each
character (see 6.3). It is typical of the class of applications
mentioned above, that some make use of the signatures when receiving
data, while others do not. The signatures are therefore designed in a
way that makes it easy to ignore them.

In this convention, the ZERO WIDTH NO-BREAK SPACE character has the
following significance when it is present at the beginning of a stream
of coded characters:

UCS-2 signature: FEFF

UCS-4 signature: 0000 FEFF

UTF-8 signature: EF BB BF

UTF-16 signature: FEFF

An application receiving data may either use these signatures to
identify the coded representation form, or may ignore them and treat
FEFF as the ZERO WIDTH NO-BREAK SPACE character.

If an application which uses one of these signatures recognises its
coded representation in reverse sequence (e.g. hexadecimal FFFE), the
application can identify that the coded representations of the
following characters use the opposite octet sequence to the sequence
expected, and may take the necessary action to recognise the
characters correctly.

NOTE - The hexadecimal value FFFE does not correspond to any coded
character within ISO/IEC 10646.
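
The signature convention in the annex can be sketched in Python. This is
only an illustration, not part of the corrigendum: the function name
split_signature and the labels are hypothetical, and the reversed UCS-4
pattern (FF FE 00 00) is an assumption extrapolated from the annex's
"reverse sequence" paragraph. The four-octet patterns are tested first
because the reversed UCS-4 signature begins with the reversed UCS-2 one.

```python
# Signatures from the annex, longest first so that a UCS-4 signature
# is not mistaken for a 16-bit one.
SIGNATURES = [
    (b"\x00\x00\xfe\xff", "UCS-4, octet order as written"),
    (b"\xff\xfe\x00\x00", "UCS-4, opposite octet order"),   # assumption
    (b"\xef\xbb\xbf", "UTF-8"),
    (b"\xfe\xff", "UCS-2/UTF-16, octet order as written"),
    (b"\xff\xfe", "UCS-2/UTF-16, opposite octet order"),
]


def split_signature(data):
    """Return (label, rest) if data starts with a known signature,
    otherwise (None, data) with the data left untouched."""
    for signature, label in SIGNATURES:
        if data.startswith(signature):
            return label, data[len(signature):]
    return None, data
```

A receiver that follows the first branch of the "or" would call
split_signature and discard the signature; one that follows the second
branch would skip the call and let FEFF through as a ZERO WIDTH NO-BREAK
SPACE.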


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From jborden at mediaone.net  Tue Mar  2 03:29:42 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:09:35 2004
Subject: Content-Document-Type: was  (Re: MIME types vs. DOCTYPE)
In-Reply-To: <77A952A6B467D211855D00805F9521F11492E9@cnet10.cnet.com>
Message-ID: <001601be645c$24395ae0$d3228018@jabr.ne.mediaone.net>

Daniel Austin wrote:

>
> 	At this point in time, no method other than MIME types exists for
> informing the client of the type of content arriving, without first
> downloading the entire file and then checking it, an obvious absurdity.
> Doctypes, FPIs, etc. have all been suggested, but none of these solutions
> provides the necessary level of transaction control required to identify
> the content prior to content reception. Given the massive costs involved,
> the client must always be allowed to reject content prior to downloading
> the entire file.

	Please explain what:

Content-type: text/xhtml

	can possibly do for you that:

Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd"

	cannot do. (Note: the use of a DTD as the doctype is only an example; the
doctype can point to any URI. Just like the XML namespace URI, the doctype URI
serves as a unique identifier and implies no particular meaning.)

>
>
>
> > Adding an XML-specific HTTP header line makes HTTP 1.1 more complex
> > (shudder), and imposes an extra coding and testing burden on HTTP
> > implementations. Also, it does nothing for XHTML over other
> > transports,
> > like SMTP or FTP.
>
>
> 	It is also introducing a new set of dependencies for all XML
> documents. Not feasible.

	Huh!? Both of these statements are patently false. Per RFC 822 and the
specs that follow it, adding a new header does not alter the syntax of HTTP
or SMTP in any way; it is specifically allowed. Both SMTP and HTTP can deal
with headers. FTP, of course, couldn't care less about text/xhtml or any
other MIME header, so that point is moot.

	The point is to create a generalizable mechanism for content negotiation
based on an XML namespace, DTD or schema. XHTML, like HTML 1.0 through HTML
4.0, is soon to be a historical oddity. I have nothing against HTML, but why
create a hack to solve a particular problem for XHTML version 1.0 (e.g.
text/xhtml) when a generalizable solution can be created for any XML
document type (e.g. text/xml; doctype=".../XHTML10.dtd")? This gives the best
of both worlds.
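
A rough Python sketch of how a client might act on such a header before
reading any of the body. The "doctype" parameter is the proposal made in
this message, not a registered MIME parameter, and the HANDLERS table and
the choose_handler name are hypothetical. Only the standard library's
header parsing is used.

```python
from email.message import Message

# Hypothetical dispatch table: document type URI -> handler name.
HANDLERS = {"http://www.w3.org/xhtml.dtd": "xhtml-renderer"}


def choose_handler(content_type_header):
    """Pick a handler from the Content-type header alone.

    Returns None when the media type is not text/xml, so the caller can
    reject the content without downloading it.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    if msg.get_content_type() != "text/xml":
        return None
    # get_param returns None when the parameter is absent.
    doctype = msg.get_param("doctype")
    return HANDLERS.get(doctype, "generic-xml")
```

A client that knows nothing about the doctype parameter still sees
text/xml and can fall back to generic XML handling, which a separate
text/xhtml type would not allow.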

Jonathan Borden
http://jabr.ne.mediaone.net




From rbourret at ito.tu-darmstadt.de  Tue Mar  2 08:49:06 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:35 2004
Subject: xml style questions
Message-ID: <01BE6490.F0D17680@grappa.ito.tu-darmstadt.de>

Mark D. Anderson wrote:

> any words of wisdom regarding:
>
> 1) having an extra collection layer in the xml tree, like
> <root><things><thing></thing><thing></thing></things><another></another></root>
> vs.
> <root><thing></thing><thing></thing><another></another></root>

Another reason for a collection layer is human readability. This is 
especially important if the document is normally edited/read by humans, 
less so if it is designed only to be written/read by machine.

-- Ron Bourret




From stefan at objectfarm.org  Tue Mar  2 11:45:24 1999
From: stefan at objectfarm.org (Stefan Kreutter)
Date: Mon Jun  7 17:09:35 2004
Subject: XPointer question
Message-ID: <v04011703b3017efd894b@[192.168.0.183]>

Hello there!

given the following XML-snippet:

<customers>
  <customer id="foo">
    <name>Bart Simpson</name>
  </customer>
  <customer id="bar">
    <name>Homer Simpson</name>
  </customer>
</customers>

can I use the following XPointer to get the customer ID of Bart Simpson:

root().child(all, customer).child(1,name).string(1, "Bart Simpson").ancestor(1, customer).attr(id)

I guess this should work, since the XPointer grammar allows an OtherTerm
to be placed after a StringTerm, but I'm not sure I understood the spec
completely.

Since string() might return portions of multiple nodes (see 3.7 of
WD-xptr-19980202) applying ancestor() seems a little problematic.

BTW is there a typo in the XPtr-spec? In grammar rule [2] it says:

  [2] OtherTerms ::= OtherTerm | OtherTerm . OtherTerm

shouldn't that be:

  [2] OtherTerms ::= OtherTerm | OtherTerm . OtherTerms

this would allow XPointers of any length not just one or two OtherTerms.
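
For comparison, the lookup the XPointer is meant to perform (find the
customer whose name is "Bart Simpson" and return its id attribute) can be
sketched with Python's ElementTree. This is not an XPointer
implementation and says nothing about whether the expression above is
legal under the working draft; the function name customer_id is
hypothetical.

```python
import xml.etree.ElementTree as ET

CUSTOMERS = """<customers>
  <customer id="foo">
    <name>Bart Simpson</name>
  </customer>
  <customer id="bar">
    <name>Homer Simpson</name>
  </customer>
</customers>"""


def customer_id(xml_text, name):
    """Return the id of the first customer whose <name> equals name."""
    root = ET.fromstring(xml_text)
    for customer in root.findall("customer"):
        # Matching on the whole element text avoids the problem of a
        # string match that spans several nodes.
        if customer.findtext("name") == name:
            return customer.get("id")
    return None
```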

-Stefan
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: text/enriched
Size: 1026 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990302/b5d13faa/attachment.bin
From msabin at cromwellmedia.co.uk  Tue Mar  2 12:07:28 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:09:35 2004
Subject: Encoding detection again ...
Message-ID: <c=US%a=_%p=Cromwell_Media%l=ODIN-990302115843Z-15463@odin.cromwellmedia.co.uk>

I've been browsing through the archives for an
answer to this question, but I haven't been able
to find anything that seems to give a completely
unambiguous answer ...

Appendix F of the spec says that given a document 
starting with the 4 octet sequence,

  00 3C 00 3F

I'm to infer BOM-less big-endian UTF-16, and 
given a document starting with,

  3C 00 3F 00

I'm to infer BOM-less little-endian UTF-16.

What I want to know is: why could these 
sequences not equally represent (respectively)
big-endian UCS-2 or little-endian UCS-2? In
other words, surely these octet sequences are
ambiguous, and hence the encoding should be
resolved definitively with either,

  <?xml version="1.0" encoding="UTF-16"?>

or,

  <?xml version="1.0" encoding="ISO-10646-UCS-2"?>

or an appropriate MIME header, ie.,

  Content-type: text/xml; charset="utf-16"

or,

  Content-type: text/xml; charset="ISO-10646-UCS-2"

Just so there's no confusion ... I'm assuming:

1. Unicode == UTF-16
2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's
   support for characters outside the BMP).
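
The two-step resolution described above (guess the family from the first
four octets, then read the encoding declaration to settle UTF-16 versus
UCS-2) can be sketched in Python. The function names are hypothetical,
the table covers only a few of the Appendix F patterns, and the regular
expression is a simplification of the XML declaration grammar.

```python
import re

# First-four-octet patterns for documents without a BOM.
FAMILIES = {
    b"\x00\x00\x00\x3c": "UCS-4, big-endian",
    b"\x3c\x00\x00\x00": "UCS-4, little-endian",
    b"\x00\x3c\x00\x3f": "16-bit, big-endian (UTF-16 or UCS-2)",
    b"\x3c\x00\x3f\x00": "16-bit, little-endian (UTF-16 or UCS-2)",
    b"\x3c\x3f\x78\x6d": "8-bit, ASCII-compatible",
}

ENCODING_DECL = re.compile(
    r'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def guess_family(data):
    """Guess the encoding family from the first four octets."""
    return FAMILIES.get(data[:4], "unknown")


def declared_encoding(data, codec):
    """Decode the start of the document with the guessed codec and
    return the name in the encoding declaration, if there is one."""
    head = data[:200].decode(codec, errors="replace")
    match = ENCODING_DECL.match(head)
    return match.group(1) if match else None
```

The octets alone only narrow the choice to a family; it is the
declaration (or an external MIME header) that names the actual encoding.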

-- 
Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England




From Michael.Kay at icl.com  Tue Mar  2 13:41:05 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:09:35 2004
Subject: xml style questions
Message-ID: <93CB64052F94D211BC5D0010A80013310EB351@wwmessd3.bra01.icl.co.uk>

> 
> any words of wisdom regarding:
> 
> 1) having an extra collection layer in the xml tree, 
> 2) having PCDATA vs. having a distinct "comment" or 
> "description" element child:

Firstly, the extra markup can be used to impose extra validity constraints,
which means your application has to do less checking.

Secondly, the extra markup can make XSL stylesheets a lot easier to write.
(In fact, without it they can be impossible...)

So if you're auto-generating the XML and if space isn't at a premium I would
include the extra tags. If it's manually edited it's a different story...
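
A small Python sketch of one thing the extra layer buys the consuming
code, using the element names from the original question. The function
names are hypothetical. With the wrapper, "no collection at all" and "an
empty collection" are distinguishable; in the flat form both look like
zero items.

```python
import xml.etree.ElementTree as ET

WRAPPED = "<root><things><thing/><thing/></things><another/></root>"
FLAT = "<root><thing/><thing/><another/></root>"


def count_things_wrapped(xml_text):
    """Return None if there is no <things> wrapper, else the item count."""
    things = ET.fromstring(xml_text).find("things")
    if things is None:
        return None
    return len(things.findall("thing"))


def count_things_flat(xml_text):
    """Return the number of <thing> children directly under the root."""
    return len(ET.fromstring(xml_text).findall("thing"))
```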

Mike Kay



From cowan at locke.ccil.org  Tue Mar  2 14:39:40 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:35 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> <36DB1161.893184EE@eng.sun.com>
Message-ID: <36DBF7DE.27FAFCD0@locke.ccil.org>

David Brownell wrote:

> Right -- this would violate _validity_ constraints (but a nonvalidating
> parser should accept it just fine):
> 
>         <!DOCTYPE Bubba [
>           <!ENTITY % Part1 "<!ELEMENT ">
>           <!ENTITY % Part2 " Bubba ANY>">
>           <!-- next is a validity error in both internal
>                  and external subsets -->
>           %Part1;%Part2;
>         ]>
>         <Bubba></Bubba>
> 

It's not 100% clear to me whether the reference to Part2 violates
the WFC "PEs in Internal Subset", which states (inter alia) that
"parameter-entity references can occur only where markup
declarations can occur".  After "%Part1;" which resolves to
" <!ELEMENT  ", a markup declaration is in fact illegal (since
we are already inside one).

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Tue Mar  2 15:22:49 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:35 2004
Subject: Content-Document-Type: was  (Re: MIME types vs. DOCTYPE)
References: <001601be645c$24395ae0$d3228018@jabr.ne.mediaone.net>
Message-ID: <36DC012D.FAA63A78@locke.ccil.org>

Jonathan Borden wrote:

>         Please explain what:
> 
> Content-type: text/xhtml
> 
>         can possibly do for you that:
> 
> Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd"
> 
>         cannot do. (Note: the use of doctype = dtd is an example, the doctype can
> point to any URI. Just like the XML namespace URI, the doctype URI serves as
> a unique identifier and implies no particular meaning.

I agree, except that I would prefer to see an FPI rather than (or
in addition to) a URI.  That would be extensible to HTML as well as
XHTML, and therefore to the text/html media type as well as the
text/xml media type.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Tue Mar  2 15:40:27 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:35 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <199903020231.AA03678@murata.apsdc.ksp.fujixerox.co.jp>
Message-ID: <36DC062C.73214454@locke.ccil.org>

MURATA Makoto wrote:

> It is my understanding that Unicode 3.0 will have many ideographic
> characters which are outside of the BMP.

The Unicode Consortium has indicated on its mailing list
that no non-BMP characters will appear in Unicode 3.0.
(Unless Vertical Extension A is being put in Plane 2 after all?)

> >An application receiving data may either use these signatures to
> >identify the coded representation form, or may ignore them and treat
> >FEFF as the ZERO WIDTH NO-BREAK SPACE character.
> How do you interpret this "or"?

I interpret it as "inclusive or", "and/or", "vel".

> One could argue that when EF BB BF
> is recognized as a signature, it is not treated as the ZWNBSP.

I think that it may or may not be treated as the ZWNBSP.  In any event,
the whole annex is informative, and describes "a convention [...]
applied by a certain class of applications".  It is reasonable to
suppose that XML is not in that class of applications, at least
so far as UTF-8 recognition is concerned.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From ajd100 at NAmerica.mot.com  Tue Mar  2 16:54:50 1999
From: ajd100 at NAmerica.mot.com (Dutra Juliana-AJD100)
Date: Mon Jun  7 17:09:35 2004
Subject: FW: Voice XML
Message-ID: <11EF19296147D211A7C100805F312AE7C027A0@s-il06ar.corp.mot.com>

fyi...
> Chiming in on voice standards: AT&T, Lucent Technologies and Motorola will
> announce today joint cooperation on a software language that allows users
> to access the Internet by voice. The companies are hoping that the
> language, called VXML, which stands for voice extensible markup language,
> will become a standard for voice commands to the Internet.
> 
> http://www.msnbc.com/news/245787.asp
> 
> Juliana Dutra - E-Business Strategies   
> =====================================
> Motorola,  Communications Enterprise, MMS
> Loc: IL06, Phone = 847-538-3101    Fax = 847-538-7791
> Intranet = http://mms.mot.com/ebusiness/
> =====================================
> 
> 



From kyu-hwang.yeon at bauer-partner.de  Tue Mar  2 18:08:26 1999
From: kyu-hwang.yeon at bauer-partner.de (Kyu Hwang Yeon)
Date: Mon Jun  7 17:09:35 2004
Subject: I wonder ...
Message-ID: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de>

Hi

I am looking for a way to reuse *.dtd files.  For example, I have book.dtd
and library.dtd, and I'd like to reuse book.dtd inside library.dtd
without rewriting the whole of library.dtd. (Maybe this is too silly a
question for people who subscribe to this group.)  Is this possible?
And if so, are there conditions that must be satisfied for that reuse?

Best regards,

Kyu Hwang




From nikita.ogievetsky at csfb.com  Tue Mar  2 19:18:38 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun  7 17:09:35 2004
Subject: XML behind XMLBars
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB442470A@SNYC11309>

Hi everybody, 
Let me present XMLBars to the community: XML-driven menu bars. I intended it
to serve as a simple and visual example of using XML to ease web design.
It seems to have turned into a nice web GUI tool.
I would highly appreciate your judgment and critique. Your contributions are
very welcome.

Here is my sin: namespaces are used to point to collections of document
fragments (rather than to element definitions).
Why not? It is more convenient for me to say <group ref:id=""> than to use
an XPointer or an entity:
		It is easier for people to read (not only parsers matter). 
		By changing the namespace (URN), all the references defined
with its alias will change automatically. 
It is also great for internationalization. And, of course, I can define multiple
namespaces of fragments. The fact that a URN doesn't have to be a real URL
makes the possibilities even greater.

-----------------
XML behind XMLBars    Menu Markup Language, if I may :) 

Menu bar rendering and formatting information is stored in XML and cached in
the DOM by a parser. Submenus are rendered only when the parent menu is
activated. The action to be fired on a menu click event is also stored in
XML. An action can be a link to a web page or a chunk of JavaScript code.
It can also be a sub-action, in which case the child menu inherits the
parent's action.
Actions can be parameterized. For example, in the following fragment
<MENU>
  <DESC>xml-dev archive</DESC>
  <LINK>http://www.lists.ic.ac.uk/hypermail/xml-dev/<PAR name="year"/><PAR name="month"/>/index.html</LINK>
  <SUBMENU>
    <DESC>1999</DESC>
    <SUBACTION>
      <SUB name="year">99</SUB>
    </SUBACTION>
    <SUBMENU>
      <DESC>January</DESC>
      <SUBACTION>
        <SUB name="month">01</SUB>
      </SUBACTION>
    </SUBMENU>
    <SUBMENU>
      <DESC>February</DESC>
      <SUBACTION>
        <SUB name="month">02</SUB>
      </SUBACTION>
    </SUBMENU>
  </SUBMENU>
</MENU>

two leaf submenus when clicked will point to:
http://www.lists.ic.ac.uk/hypermail/xml-dev/9901/index.html
and
http://www.lists.ic.ac.uk/hypermail/xml-dev/9902/index.html
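
The parameter inheritance described above can be sketched in Python. This
only illustrates the substitution rule and is not the XMLBars
implementation (which runs on the IE5 parser); the function names
expand_link and leaf_links are hypothetical, and the PAR placeholders are
written as empty elements so that the fragment is well-formed.

```python
import xml.etree.ElementTree as ET

MENU_XML = """
<MENU>
  <DESC>xml-dev archive</DESC>
  <LINK>http://www.lists.ic.ac.uk/hypermail/xml-dev/<PAR name="year"/><PAR name="month"/>/index.html</LINK>
  <SUBMENU>
    <DESC>1999</DESC>
    <SUBACTION><SUB name="year">99</SUB></SUBACTION>
    <SUBMENU>
      <DESC>January</DESC>
      <SUBACTION><SUB name="month">01</SUB></SUBACTION>
    </SUBMENU>
    <SUBMENU>
      <DESC>February</DESC>
      <SUBACTION><SUB name="month">02</SUB></SUBACTION>
    </SUBMENU>
  </SUBMENU>
</MENU>
"""


def expand_link(link, params):
    """Replace each <PAR name="..."/> in a LINK with its parameter value."""
    parts = [(link.text or "").strip()]
    for par in link.findall("PAR"):
        parts.append(params[par.get("name")])
        parts.append((par.tail or "").strip())
    return "".join(parts)


def leaf_links(menu, link=None, params=None):
    """Walk the menu tree; children inherit the parent's LINK and SUB values."""
    params = dict(params or {})
    own_link = menu.find("LINK")
    if own_link is not None:
        link = own_link
    for sub in menu.findall("SUBACTION/SUB"):
        params[sub.get("name")] = sub.text
    children = menu.findall("SUBMENU")
    if not children:
        return [expand_link(link, params)]
    links = []
    for child in children:
        links.extend(leaf_links(child, link, params))
    return links


LINKS = leaf_links(ET.fromstring(MENU_XML))
```

Each leaf collects the SUB values of all its ancestors, so the year is
stated once on the 1999 submenu and the month once per leaf.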

Most magazines and monthly publications have a similar structure, so a
reusable group of 12 month submenus helps. The 3 years of the XML-DEV
archive become as short as:
<MENU>
  <DESC>xml-dev archive</DESC>
  <LINK>http://www.lists.ic.ac.uk/hypermail/xml-dev/<PAR name="year"/><PAR name="month"/>/index.html</LINK>
  <SUBMENU>
    <DESC>1999</DESC>
    <SUBACTION>
      <SUB name="year">99</SUB>
    </SUBACTION>
    <SUBMENU>
      <SUBMENUGROUP ref:id="12months" xql:select="./SUB $le$ '04'"/>
    </SUBMENU>
  </SUBMENU>
  <SUBMENU>
    <DESC>1998</DESC>
    <SUBACTION>
      <SUB name="year">98</SUB>
    </SUBACTION>
    <SUBMENU>
      <SUBMENUGROUP ref:id="12months"/>
    </SUBMENU>
  </SUBMENU>
  <SUBMENU>
    <DESC>1997</DESC>
    <SUBACTION>
      <SUB name="year">97</SUB>
    </SUBACTION>
    <SUBMENU>
      <SUBMENUGROUP ref:id="12months" xql:select="./SUB $ge$ '04'"/>
    </SUBMENU>
  </SUBMENU>
</MENU>

The second, optional attribute xql:select filters the group: the first 4
months for the current year and the months starting with February for the
year 1997.
XMLBars, implemented using the IE5 beta parser, can be found at
http://www.cogx.com/XMLBar.
(Sorry, still working on a cross-browser implementation.)


Nikita Ogievetsky
Cogitech Inc.
http://www.cogx.com



From DuCharmR at moodys.com  Tue Mar  2 19:39:08 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:09:35 2004
Subject: I wonder ...
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1>

>I'd like to reuse book.dtd inside library.dtd without rewriting 
>whole library.dtd.

(First, this is really a question for the xml-l list or comp.text.xml.
xml-dev is for people developing XML software.)

This is what external parameter entities are for. Parameter entities
store pieces of a DTD, and "external" means "stored in a separate file"
(or the equivalent construct in your operating system). For example, if
book.dtd is the following:

  <!ELEMENT book (chapter+)>
  <!ELEMENT chapter (par+)>
  <!ELEMENT par (#PCDATA)>

your library.dtd file could look like this:   

  <!ELEMENT library (shelf+)>
  <!ELEMENT shelf (book+)>
  <!ENTITY % bookdtd SYSTEM "book.dtd">
  %bookdtd;


Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.




From jborden at mediaone.net  Tue Mar  2 20:12:31 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:09:36 2004
Subject: Content-Document-Type: was  (Re: MIME types vs. DOCTYPE)
Message-ID: <008201be64e8$0fc8ca00$0b2e249b@fileroom.Synapse>

John Cowan wrote:

>Jonathan Borden wrote:
>
>>         Please explain what:
>>
>> Content-type: text/xhtml
>>
>>         can possibly do for you that:
>>
>> Content-type: text/xml; doctype="http://www.w3.org/xhtml.dtd"
>>
>>         cannot do. (Note: the use of doctype = dtd is an example, the doctype can
>> point to any URI. Just like the XML namespace URI, the doctype URI serves as
>> a unique identifier and implies no particular meaning.
>
>I agree, except that I would prefer to see an FPI rather than (or
>in addition to) a URI.  That would be extensible to HTML as well as
>XHTML, and therefore to the text/html media type as well as the
>text/xml media type.
>

This is a good idea.

A general way to employ the Content-type header to specify a document type
is:

Content-type: text/xml; element="html";
              fpi="-//W3C//DTD XHTML 1.0 Strict//EN";
              uri="http://www.w3.org/XHTML.DTD"

This should apply to text/html, text/xml, text/sgml, application/xml etc.

deja vu all over again :-)

Jonathan Borden
http://jabr.ne.mediaone.net




From richard at goon.stg.brown.edu  Tue Mar  2 20:22:01 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:36 2004
Subject: Yet another niggling XML syntax question
References: <87256725.0000562C.00@d53mta03h.boulder.ibm.com> <36DB1161.893184EE@eng.sun.com> <36DBF7DE.27FAFCD0@locke.ccil.org>
Message-ID: <36DC4828.C133D8F2@goon.stg.brown.edu>

John Cowan wrote:

> > Right -- this would violate _validity_ constraints (but a nonvalidating
> > parser should accept it just fine):
> >
> >         <!DOCTYPE Bubba [
> >           <!ENTITY % Part1 "<!ELEMENT ">
> >           <!ENTITY % Part2 " Bubba ANY>">
> >           <!-- next is a validity error in both internal
> >                  and external subsets -->
> >           %Part1;%Part2;
> >         ]>
> >         <Bubba></Bubba>
> >
> 
> It's not 100% clear to me whether the reference to Part2 violates
> the WFC "PEs in Internal Subset"

To restate your message slightly:

The problem with %Part2; is that the markup unit starts with %Part1; and
ends with %Part2;, which is something parsed entities aren't supposed
to do.

Note:

The only reason you can get away with

  <!ENTITY % Part3 "%Part1;%Part2;">

is that section 4.3.2 of the XML 1.0 standard says that all internal
parsed entities are by definition well formed.

It's apparently an exception to the "proper nesting" rule, meant
specifically to allow cutting and pasting of parameter entities.  This is
also the motivation for suppressing the addition of spaces before and
after the entities inside the quotation marks above.

Would anyone agree that the standard is not altogether clear on this
point?

(Tim, if my comments are correct, it might make sense to edit them,
in some form, into your annotated version of the spec.)

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From slotter at maya.com  Tue Mar  2 20:40:55 1999
From: slotter at maya.com (Dave Slotter)
Date: Mon Jun  7 17:09:36 2004
Subject: Expat API
In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1>
Message-ID: <v04104403b301faab26bc@[192.70.254.157]>

Hi. I'm new to this list (just subscribed today) and searched the 
archives on expat, but it failed to answer my question.

My question is: where is the documentation on how to use the expat 
API? I downloaded version 1.0.2 and ported the code to run the sample 
program on my Macintosh, but I'm pretty much dead in the water. I 
tried sending email to the author (James Clark) twice in the last few 
days, but I have so far failed to receive a response. The comments in 
the header files do not seem to be sufficient.

What I am trying to do is parse some well-formed XML such as the 
following example so that I can get the tags (which the example shows 
me how to do) and then obtain the text.

-----

<?xml version="1.0" standalone="yes"?>

<DATAFILE>

<FOO ID="12345678">
  <NAME>cat</NAME>
  <TYPE>gray</TYPE>
</FOO>

</DATAFILE>

-----

For example, I would like to be able to obtain the <FOO> TAG as well 
as the FOO ID (12345678), then the <NAME> tag along with the enclosed 
text (cat) then the <TYPE> tag along with its enclosed text (gray).

However, the sample program only shows how to retrieve the tags.

If anyone has some example code, I would be grateful. If someone has 
documentation, that would be appreciated as well.
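
A minimal sketch of the callback pattern, written against Python's
xml.parsers.expat binding rather than the C API of expat 1.0.2 that the
question is about. The handler attributes used here are the Python
binding's; in C the corresponding registrations are done with
XML_SetElementHandler and XML_SetCharacterDataHandler. The function name
collect and the event tuples are hypothetical.

```python
import xml.parsers.expat

DOC = b"""<?xml version="1.0" standalone="yes"?>
<DATAFILE>
<FOO ID="12345678">
  <NAME>cat</NAME>
  <TYPE>gray</TYPE>
</FOO>
</DATAFILE>
"""


def collect(doc):
    """Return start-tag events (with attributes) and element text."""
    events = []
    text = []

    def start(name, attrs):
        text.clear()                      # begin a fresh text buffer
        events.append(("start", name, dict(attrs)))

    def chars(data):
        text.append(data)                 # may be called in several pieces

    def end(name):
        content = "".join(text).strip()
        if content:                       # ignore whitespace-only text
            events.append(("text", name, content))
        text.clear()

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = chars
    parser.EndElementHandler = end
    parser.Parse(doc, True)
    return events
```

The text is buffered and joined at the end tag because the parser is
free to deliver the character data of one element in more than one call.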

-Dave Slotter



From david at megginson.com  Tue Mar  2 21:10:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:36 2004
Subject: I wonder ...
In-Reply-To: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de>
References: <99Mar2.210244gmt+0100.27779@gatekeeper.bauer-partner.de>
Message-ID: <14044.21265.752204.753493@localhost.localdomain>

Kyu Hwang Yeon writes:

 > I am looking for a way to reuse *.dtd files.  For example, I have book.dtd
 > and library.dtd, and I'd like to reuse book.dtd inside library.dtd
 > without rewriting the whole of library.dtd. (Maybe this is too silly a
 > question for people who subscribe to this group.)  Is this possible?
 > And if so, are there conditions that must be satisfied for that reuse?

Try this:

  <!ELEMENT library (book+)>
  <!ENTITY % book SYSTEM "book.dtd">
  %book;


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From db at Eng.Sun.COM  Tue Mar  2 21:22:34 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun  7 17:09:36 2004
Subject: Encoding detection again ...
References: <c=US%a=_%p=Cromwell_Media%l=ODIN-990302115843Z-15463@odin.cromwellmedia.co.uk>
Message-ID: <36DC55FF.64C4408D@Eng.Sun.COM>

Miles Sabin wrote:
> 
> Appendix F of the spec says that given a document
> starting with the 4 octet sequence,
> 
>   00 3C 00 3F
> 
> I'm to infer BOM-less big-endian UTF-16, and
> given a document starting with,
> 
>   3C 00 3F 00
> 
> I'm to infer BOM-less little-endian UTF-16.

That is, the appendix _suggests_ (in a non-normative
fashion) that's the way to go.


> What I want to know is: why could these
> sequences not equally represent (respectively)
> big-endian UCS-2 or little-endian UCS-2?

They could ...

> 
> 1. Unicode == UTF-16
> 2. UCS-2 != UTF-16 (because UCS-2 lacks UTF-16's
>    support for characters outside the BMP).

Put it this way:  if you assume UTF-16, you're
safe either way because UTF-16 is a superset.

It'd be reasonable for an autodetecting algorithm
to support "downgrading" its guess from UTF-16 to
UCS-2, and it should probably do so if it's reporting
encoding mismatches as fatal errors.
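
One way to picture the "downgrade", as a Python sketch: decode the data
as UTF-16 and check whether every character lies inside the BMP. If so,
the same octets are also valid UCS-2. The function name fits_in_ucs2 is
hypothetical, and the default codec is only an example.

```python
def fits_in_ucs2(data, codec="utf-16-be"):
    """Return True if UTF-16 data contains no characters outside the BMP.

    Surrogate pairs decode to code points above U+FFFF, which UCS-2
    cannot represent. Malformed input (for example an unpaired
    surrogate) raises UnicodeDecodeError.
    """
    text = data.decode(codec)
    return all(ord(ch) <= 0xFFFF for ch in text)
```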

- Dave



From jes at kuantech.com  Tue Mar  2 21:48:18 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:09:36 2004
Subject: I wonder ...
In-Reply-To: <14044.21265.752204.753493@localhost.localdomain>
Message-ID: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>

This works fine, but (at least in IE 5) only for a single level. That is, you can't have another entity reference inside "book.dtd". To me, this significantly limits its usefulness (imagine not allowing a #include inside a file that was #included).

Jeff

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
David Megginson
Sent: Tuesday, March 02, 1999 1:09 PM
To: XML Development
Subject: re: I wonder ...


Kyu Hwang Yeon writes:

 > I am looking for a way to reuse *.dtd files.  For example, I have book.dtd
 > and library.dtd.  Then, I'd like to reuse book.dtd inside library.dtd
 > without rewriting whole library.dtd. (Maybe it is too silly question for
 > people who subscribe this new group)    I wonder it is possible?  Otherwise,
 > should certain conditions be satisfied for that reuse?

Try this:

  <!ELEMENT library (book+)>
  <!ENTITY % book SYSTEM "book.dtd">
  %book;


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar  2 21:51:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:36 2004
Subject: I wonder ...
In-Reply-To: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
References: <14044.21265.752204.753493@localhost.localdomain>
	<000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
Message-ID: <14044.23685.951735.30695@localhost.localdomain>

Jeffrey E. Sussna writes:

 > This works fine, but (at least in IE 5) only for a single
 > level. That is, you can't have another entity reference inside
 > "book.dtd". To me, this significantly limits its usefulness
 > (imagine not allowing a #include inside a file that was #included).

If IE 5 behaves this way, it is because of a bug, not because of a
limitation in the XML spec -- since XML support in IE is in early
days, I expect that Microsoft will fix this problem before the
official release.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Clark.Cooper at corporate.ge.com  Tue Mar  2 23:04:11 1999
From: Clark.Cooper at corporate.ge.com (Cooper, Clark (CORP, Consultant))
Date: Mon Jun  7 17:09:36 2004
Subject: Expat API
Message-ID: <014CB98EB81ED011B3E900805FE2D47A04F74B42@X01SCHCORPGE>

Dave Slotter <slotter@maya.com> wrote:
> My question is: where is the documentation on how to use the expat 
> API? I downloaded version 1.0.2 and ported the code to run the sample 
> program on my Macintosh, but I'm pretty much dead in the water

As far as I know, the include file is the documentation. Expat is used by the
perl module XML::Parser, which I maintain, but if you're having trouble with
just the include file, you'd be absolutely lost looking at Expat.xs (I get lost
looking at it sometimes). If you can use perl, I'd like to suggest XML::Parser
as a kinder, gentler interface to expat.

If you're not a perl kinda fella, here's a small example of using expat:

#include "xmlparse.h"   /* expat 1.0 header; later releases ship it as <expat.h> */
#include <stdio.h>
#include <stdlib.h>     /* exit() */

#define MAXLEV 512
#define BUFSIZE 4096

char indent[(MAXLEV + 1) * 2];
int level = 0;

void
start(void *data, const XML_Char *name, const XML_Char **atts)
{
  int offset;

  printf("\n%s> %s", indent, name);
  while (*atts) {
    printf(" %s='%s'", atts[0], atts[1]);
    atts += 2;
  }
  if (level >= MAXLEV) {
    fprintf(stderr, "Exceeded max level\n");
    exit(-1);
  }
  offset = level * 2;
  indent[offset]     = ' ';
  indent[offset + 1] = ' ';
  indent[offset + 2] = '\0';
  level++;
}  /* End start handler */

void
end(void *data, const XML_Char *name)
{
  level--;
  indent[level*2] = '\0';
  printf("\n%s< %s\n", indent, name);
}  /* End end handler */

void
text(void *data, const XML_Char *txt, int len)
{
  int i;

  printf("%s- ", indent);
  for (i = 0; i < len; i++)
    putchar(txt[i]);
}  /* End text handler */

int
main(int argc, char **argv)
{
  XML_Parser  prs;
  int stat;
  FILE * doc;

  if (argc < 2) {
    fprintf(stderr, "No filename supplied\n");
    exit(-1);
  }

  doc = fopen(argv[1], "r");
  if (! doc) {
    fprintf(stderr, "Couldn't open %s\n", argv[1]);
    exit(-1);
  }

  indent[0] = '\0';
  prs = XML_ParserCreate(NULL);
  if (! prs) {
    fprintf(stderr, "Couldn't create parser\n");
    exit(-1);
  }
  XML_SetElementHandler(prs, start, end);
  XML_SetCharacterDataHandler(prs, text);

  /* Feed the file to the parser one buffer at a time. */
  while (! feof(doc)) {
    int cnt;
    void *buff = XML_GetBuffer(prs, BUFSIZE);
    if (! buff) {
      fprintf(stderr, "Ran out of memory\n");
      exit(-1);
    }
    cnt = (int) fread(buff, 1, BUFSIZE, doc);
    if (ferror(doc)) {
      /* Without this check a read error would loop forever. */
      fprintf(stderr, "Read error on %s\n", argv[1]);
      exit(-1);
    }
    stat = XML_ParseBuffer(prs, cnt, 0);
    if (! stat) {
      fprintf(stderr, "Parse error at line %lu, column %lu\n",
              (unsigned long) XML_GetCurrentLineNumber(prs),
              (unsigned long) XML_GetCurrentColumnNumber(prs));
      exit(-1);
    }
  }
  fclose(doc);

  /* A final zero-length call with isFinal set tells expat the input is done. */
  stat = XML_ParseBuffer(prs, 0, 1);
  if (! stat) {
    fprintf(stderr, "Parse error at line %lu, column %lu\n",
            (unsigned long) XML_GetCurrentLineNumber(prs),
            (unsigned long) XML_GetCurrentColumnNumber(prs));
    exit(-1);
  }

  XML_ParserFree(prs);
  return 0;
}  /* End main */

--
Clark Cooper                  Logic Technologies,Inc
cccooper@ltionline.com
(518) 388-7451                650 Franklin St., Suite 304
coopercc@netheaven.com
	                                  Schenectady, NY  12305



From bmhughes at ozemail.com.au  Tue Mar  2 23:29:42 1999
From: bmhughes at ozemail.com.au (Baden Hughes)
Date: Mon Jun  7 17:09:36 2004
Subject: XML and special Characters : unicode v3.0 ?
In-Reply-To: <36DAE5FA.5BA2D70E@locke.ccil.org>
Message-ID: <000d01be64fc$1a3a09e0$0dce6ccb@baden>

Tim Bray writes:
> > In practice,
> > I've never actually seen anything outside of the BMP, but the
> > experts agree they're showing up real soon now.

John Cowan writes:
> Not until Unicode 4.0, unless someone wants to use the private-use
> planes 15 and 16.

Uh, that's gonna be a problem. How would you put in a PUA character in an
XML doc ? Still by the U+... ? (we have around 800 of them for the languages
we work with !!)

Baden




From falk at icon.at  Tue Mar  2 23:34:08 1999
From: falk at icon.at (Falk, Alexander)
Date: Mon Jun  7 17:09:36 2004
Subject: Please send non-English XML example documents
Message-ID: <A01C76E644CAD111B83A0000E8D8890E057BD3@melange.icon.co.at>

Skipped content of type multipart/alternative


From msabin at cromwellmedia.co.uk  Wed Mar  3 12:12:30 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:09:36 2004
Subject: Encoding detection again ...
Message-ID: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303120345Z-15958@odin.cromwellmedia.co.uk>

David Brownell wrote,
> Put it this way:  if you assume UTF-16, you're
> safe either way because UTF-16 is a superset.

Err ... is that true?

Maybe I'm being a bit obsessive about my 
interpretation of the various standards docs, but 
as far as I can see UCS-2 isn't a subset of
UTF-16. The BMP S-zone codes (D800-DFFF) are 
undefined but reserved in UCS-2, and so should 
not occur in a purportedly UCS-2 stream. I would 
expect a processor which encountered such codes to
either,

1. Spit out an error and give up.

or,

2. Quietly ignore them and continue processing 
   with the next 2 octets.

Obviously these codes are defined and legal
in UTF-16, so an incorrect assumption of UTF-16
when the stream was in fact broken UCS-2 would
produce unpredictably incorrect behaviour (ie.
the processor might continue processing a broken
doc in an indeterminate way).

In any case, on a less finickety note, I'd quite
like to be able to compute string lengths UCS-2
style where that's appropriate, because 2*byte-
length is a bit simpler than the UTF-16
equivalent ;-)
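
To make the length point concrete, here is a minimal C sketch. It assumes the text has already been decoded into an array of 16-bit code units in native byte order; the function names are illustrative and not taken from any library. It shows why the UCS-2 count is trivial while the UTF-16 count has to walk the data looking for surrogate pairs.

```c
#include <stddef.h>
#include <stdint.h>

/* True if the 16-bit unit lies in the S-zone (D800-DFFF). */
int is_surrogate(uint16_t u)
{
    return u >= 0xD800 && u <= 0xDFFF;
}

/* UCS-2: one character per 16-bit unit, so the character count is simply
   the unit count. Returns -1 if an S-zone code appears, since those codes
   are reserved in UCS-2 and should not occur in such a stream. */
long ucs2_length(const uint16_t *units, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        if (is_surrogate(units[i]))
            return -1;
    return (long)n;
}

/* UTF-16: a high surrogate (D800-DBFF) followed by a low surrogate
   (DC00-DFFF) encodes a single character outside the BMP.
   Returns -1 on an unpaired surrogate. */
long utf16_length(const uint16_t *units, size_t n)
{
    size_t i = 0;
    long count = 0;
    while (i < n) {
        uint16_t u = units[i];
        if (u >= 0xD800 && u <= 0xDBFF) {
            if (i + 1 >= n || units[i + 1] < 0xDC00 || units[i + 1] > 0xDFFF)
                return -1;          /* high surrogate without its partner */
            i += 2;
        } else if (u >= 0xDC00 && u <= 0xDFFF) {
            return -1;              /* stray low surrogate */
        } else {
            i += 1;
        }
        count++;
    }
    return count;
}
```

A stream that `ucs2_length` rejects may still be perfectly good UTF-16, which is exactly the ambiguity under discussion.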

Anyway, here's a slightly updated version of a 
proposal I mailed to Tim Bray yesterday ...

In the absence of an appropriate MIME header
the octet sequences,

1. FE FF 
2. FF FE
3. 00 3C 00 3F
4. 3C 00 3F 00

may be inferred to be,

1. big-endian indeterminately encoded 2 octet
   characters.

2. little-endian indeterminately encoded 2 octet
   characters.

3. BOM-less big-endian indeterminately encoded 2 
   octet characters.

4. BOM-less little-endian indeterminately encoded 
   2 octet characters.

If either of the following PIs are found,

  <?xml version="1.0" ?>
  <?xml version="1.0" encoding="UTF-16"?>

or, in cases (1) and (2), if *no* PI is found,
then encoding is resolved to UTF-16. Otherwise 
if,

  <?xml version="1.0" encoding="ISO-10646-UCS-2"?>

is found then encoding is resolved to UCS-2.
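
The two-stage rule above could be sketched in C roughly as follows. This is only an illustration of the proposal as worded here, not code from any parser: the type and function names are made up, the caller is assumed to have already located the XML declaration (if any) and extracted its encoding value, and encoding names are compared exactly for brevity even though a real processor would compare them case-insensitively.

```c
#include <stddef.h>
#include <string.h>

enum byte_order { ORDER_UNKNOWN, ORDER_BIG_ENDIAN, ORDER_LITTLE_ENDIAN };
enum two_octet_encoding { ENC_UNRESOLVED, ENC_UTF16, ENC_UCS2 };

struct sniff_result {
    enum byte_order order;
    int has_bom;            /* 1 for cases (1) and (2), 0 for (3) and (4) */
};

/* Stage 1: classify the leading octets into the four cases listed above. */
struct sniff_result sniff_two_octet(const unsigned char *buf, size_t len)
{
    static const unsigned char be_decl[4] = { 0x00, 0x3C, 0x00, 0x3F };
    static const unsigned char le_decl[4] = { 0x3C, 0x00, 0x3F, 0x00 };
    struct sniff_result r = { ORDER_UNKNOWN, 0 };

    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        r.order = ORDER_BIG_ENDIAN;
        r.has_bom = 1;
    } else if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        r.order = ORDER_LITTLE_ENDIAN;
        r.has_bom = 1;
    } else if (len >= 4 && memcmp(buf, be_decl, 4) == 0) {
        r.order = ORDER_BIG_ENDIAN;
    } else if (len >= 4 && memcmp(buf, le_decl, 4) == 0) {
        r.order = ORDER_LITTLE_ENDIAN;
    }
    return r;
}

/* Stage 2: refine the guess using the XML declaration.
   has_decl is nonzero if a declaration was found; encoding is its
   encoding value, or NULL if the declaration carries none. */
enum two_octet_encoding resolve_two_octet(struct sniff_result r,
                                          int has_decl,
                                          const char *encoding)
{
    if (r.order == ORDER_UNKNOWN)
        return ENC_UNRESOLVED;              /* not a 2-octet family at all */
    if (!has_decl)
        return r.has_bom ? ENC_UTF16 : ENC_UNRESOLVED;
    if (encoding == NULL || strcmp(encoding, "UTF-16") == 0)
        return ENC_UTF16;
    if (strcmp(encoding, "ISO-10646-UCS-2") == 0)
        return ENC_UCS2;
    return ENC_UNRESOLVED;                  /* some other declared encoding */
}
```

Keeping byte order and encoding as separate results mirrors the idea that the first octets only settle the former.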

This very complicated and isn't a zillion miles away 
from the current handling of UTF-8 vs. ISO 8859-x 
vs. US-ASCII.

Cheers,


Miles

-- 
Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England




From msabin at cromwellmedia.co.uk  Wed Mar  3 12:45:45 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:09:36 2004
Subject: Encoding detection again ...
Message-ID: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303123700Z-16334@odin.cromwellmedia.co.uk>

Sorry to follow up my own posting, but one thing needs a 
bit of clarification, and one typo needs correction.

I wrote,
> David Brownell wrote,
> > Put it this way:  if you assume UTF-16, you're
> > safe either way because UTF-16 is a superset.
> 
> Err ... is that true?
> 
> Maybe I'm being a bit obsessive about my 
> interpretation of the various standards docs, but 
> as far as I can see UCS-2 isn't a subset of
> UTF-16.

The question of UCS-2 being, or not being a subset of 
UTF-16 is a bit of a red herring. It is undoubtedly true 
that the set of octet pairs which are legal UCS-2 
characters is a subset of the set of octet pairs which 
are legal UTF-16 characters.

Appendix F suggests that octet sequences which could
equally well be interpreted as UTF-16 or UCS-2 may be 
assumed to be UTF-16, and *doesn't* include a clause
stating that this assumption should be revised in
the light of an explicit XML encoding declaration. I
think that clause should be added, in much the same
way as it is for UTF-8 vs. 8859-X.

Now the typo ...

> This very complicated and isn't a zillion miles away 
> from the current handling of UTF-8 vs. ISO 8859-x 
> vs. US-ASCII.

Please insert the word 'isn't' in the obvious
place ;-)

Cheers,


Miles

-- 
Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England




From MikeDacon at aol.com  Wed Mar  3 13:30:59 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:36 2004
Subject: SAX and DTDHandler
Message-ID: <9f7499ae.36dd3931@aol.com>

Hi Everyone,

I've been playing around with SAX and several of the parser 
implementations (primarily Sun's and IBM's).

The basics of DocumentHandler and ErrorHandler are 
straightforward and work well.

The interfaces EntityResolver and DTDHandler are still fuzzy.
I've searched for documents on these but have not found anything
of any depth.

My primary question is will SAX allow me to parse a DTD?
It doesn't seem so.  DTDHandler only handles unparsed Entity declarations 
(like binary data) and Notation declarations.  If it is the case that SAX does
not parse DTDs due to the fact that it does not want to perform validation then 
why bother with the above two cases?

I guess I don't understand the design philosophy in these respects.

All help is appreciated.

Thanks,

 - Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com




From elharo at metalab.unc.edu  Wed Mar  3 13:33:19 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:09:36 2004
Subject: DTD for Bibliographic Notation
In-Reply-To: <A01C76E644CAD111B83A0000E8D8890E057BD3@melange.icon.co.at>
Message-ID: <v03102803b302e883dc17@[168.100.203.234]>

Has anybody written a DTD for bibliographies?  Are there any standards
efforts in this area?  To be usable, this DTD would have to be public
domain or explicitly allow unrestricted reuse. I probably don't need to
modify it, but at a minimum I need to be able to republish it.


+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|        XML: Extensible Markup Language (IDG Books 1998)            |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+




From mintert at irb.informatik.uni-dortmund.de  Wed Mar  3 14:00:44 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun  7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd 
In-Reply-To: Your message of Sun, 28 Feb 1999 22:02:37 +0100.
             <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no> 
Message-ID: <199903031400.PAA23631@brown.informatik.uni-dortmund.de>


---------
 > There's a very nice document at:
 > 
 > http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm
 > 
 > Cheers,
 > Jarle Stabell

Thanks, Jarle!


Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
(copied from the above URL) with nsgmls. I'm using nsgmls 1.3 on SunOS 5.6
(Solaris 2). I have already parsed XML instances without problems, but in this
case it doesn't work. The following are the first lines of nsgmls output:

sm@brown(/tmp/sm){590}: /tmp/sm/sp-1.3/nsgmls/nsgmls -E 10 -w xml -s REC-xml-19980210.xml
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:17:W: named character reference
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:19:E: "X2014" is not a function name
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:61:17:W: named character reference
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:61:19:E: "X201C" is not a function name
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:62:17:W: named character reference
/tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:62:19:E: "X201D" is not a function name
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:101:9:E: document type does not allow element "ABSTRACT" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:142:8:E: document type does not allow element "PUBSTMT" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:146:11:E: document type does not allow element "SOURCEDESC" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:149:10:E: document type does not allow element "LANGUSAGE" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:153:13:E: document type does not allow element "REVISIONDESC" here
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:189:36:W: character "<" is the first character of a delimiter but occurred as data
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:371:8:E: end tag for "HEADER" which is not finished
/tmp/sm/sp-1.3/nsgmls/nsgmls:REC-xml-19980210.xml:787:7:W: character "<" is the first character of a delimiter but occurred as data


I enabled XML support as described on http://www.jclark.com/sp/xml.htm


      Set the SP_CHARSET_FIXED environment variable to YES.
      Set the SP_ENCODING environment variable to XML.
      Set the SGML_CATALOG_FILES environment variable to point to the file
        pubtext/xml.soc.
      Use the -wxml option.

      setenv SP_CHARSET_FIXED YES


What's wrong?  Any help is appreciated. Thanks in advance.


Bye,

 Stefan.

+-----------------------------------------------------------+
  Stefan Mintert
       UniDo:    mintert@irb.informatik.uni-dortmund.de
       private:  stefan@mintert.com
+-----------------------------------------------------------+

        "let the music keep our spirits high..."

                                (Jackson Browne)



From cowan at locke.ccil.org  Wed Mar  3 15:10:42 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:37 2004
Subject: I wonder ...
References: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
Message-ID: <36DD50B5.5904B0A6@locke.ccil.org>

Jeffrey E. Sussna wrote:

> This works fine, but (at least in IE 5) only for a single level. That
> is, you can't have another entity reference inside "book.dtd". To me,
> this significantly limits its usefulness (imagine not allowing a
> #include inside a file that was #included).

If so, that is a dreadful bug.  The XML specification has no such
limitations, although one might suppose that an implementation might
have a practical limit in the neighborhood of 50-100, because of
operating system limits on open files.
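
Nested inclusion with a practical depth cap can be sketched as below. This is an illustrative in-memory model only, not how any particular parser works: the entity table, `MAX_DEPTH`, and `expand()` are hypothetical names, and a real processor would be reading external files rather than strings.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_DEPTH 64   /* arbitrary illustrative cap, not from the XML spec */

struct entity {
    const char *name;  /* parameter entity name, e.g. "book" */
    const char *text;  /* replacement text; may hold further %name; refs */
};

/* Length of the name in a "%name;" reference; p points just past '%'.
   Returns 0 if this is not a reference (e.g. the '%' in "<!ENTITY % x"). */
static size_t ref_name_len(const char *p)
{
    size_t n = 0;
    while (isalnum((unsigned char)p[n]) || p[n] == '_' || p[n] == '-' ||
           p[n] == '.' || p[n] == ':')
        n++;
    return (n > 0 && p[n] == ';') ? n : 0;
}

static const char *lookup(const struct entity *tab, size_t ntab,
                          const char *name, size_t len)
{
    size_t i;
    for (i = 0; i < ntab; i++)
        if (strlen(tab[i].name) == len && strncmp(tab[i].name, name, len) == 0)
            return tab[i].text;
    return NULL;
}

/* Appends the expansion of src to the NUL-terminated buffer out, whose
   total capacity is cap (at least 1). Returns 0 on success, or -1 on an
   unknown entity, nesting deeper than MAX_DEPTH (which also catches an
   entity that refers to itself), or a full buffer. */
int expand(const struct entity *tab, size_t ntab, const char *src,
           char *out, size_t cap, int depth)
{
    size_t used = strlen(out);
    size_t n;

    if (depth > MAX_DEPTH)
        return -1;
    while (*src) {
        if (*src == '%' && (n = ref_name_len(src + 1)) > 0) {
            const char *text = lookup(tab, ntab, src + 1, n);
            if (text == NULL)
                return -1;
            if (expand(tab, ntab, text, out, cap, depth + 1) != 0)
                return -1;
            used = strlen(out);
            src += n + 2;           /* skip '%', the name, and ';' */
        } else {
            if (used + 1 >= cap)
                return -1;
            out[used++] = *src++;
            out[used] = '\0';
        }
    }
    return 0;
}
```

Called with an empty buffer and depth 0 on a string containing `%book;`, where the replacement text for `book` itself contains `%title;`, this yields all three levels of declarations in order.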

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Wed Mar  3 15:17:20 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:37 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <000d01be64fc$1a3a09e0$0dce6ccb@baden>
Message-ID: <36DD523B.F2EAFB7E@locke.ccil.org>

Baden Hughes wrote:

> Uh, that's gonna be a problem. How would you put in a PUA character in an
> XML doc ?  Still by the U+... ? (we have around 800 of them for the languages
> we work with !!)

Well, first of all there are 6400 private-use characters on the BMP,
so that gives you plenty of room to play with.  You cannot use
any kind of private-use character in element or attribute names,
which is good for interoperability; to incorporate them in
character data or attribute values, use a character reference
like &#xE000;.
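
For anyone generating such references programmatically, a small C sketch follows. The helper names are invented for illustration; the only facts relied on are the BMP private-use range U+E000 to U+F8FF (6400 code points) and the hexadecimal character reference syntax shown above.

```c
#include <stddef.h>
#include <stdio.h>

/* BMP private-use area: U+E000..U+F8FF, i.e. 0x1900 = 6400 code points. */
int is_bmp_private_use(unsigned long cp)
{
    return cp >= 0xE000UL && cp <= 0xF8FFUL;
}

/* Writes a hexadecimal character reference such as "&#xE000;" into buf.
   Returns the number of characters written (excluding the terminating
   NUL), or -1 if the reference does not fit in cap bytes. */
int format_char_ref(unsigned long cp, char *buf, size_t cap)
{
    int n = snprintf(buf, cap, "&#x%lX;", cp);
    return (n < 0 || (size_t)n >= cap) ? -1 : n;
}
```

The resulting string can be written into character data or an attribute value; it is not usable inside an element or attribute name.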

What will be more serious is that *normative* characters from the
Astral Planes aren't usable in XML names either.  Presumably,
when they actually show up, XML will be modified, so that we can
have element names in Egyptian hieroglyphics with attributes in
Sindarin.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From DuCharmR at moodys.com  Wed Mar  3 15:18:33 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:09:37 2004
Subject: DTD for Bibliographic Notation
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1>

> Elliotte Rusty Harold writes:
>Has anybody written a DTD for bibliographies?  

Have you looked at the bibliography module of DocBook?

  DocBook home page: http://www.oasis-open.org/docbook
  XML version of DocBook: http://www.nwalsh.com/docbook/xml
  file with XML's bibliography module:
http://www.nwalsh.com/docbook/xml/1.3/dbhierx.mod

Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.




From prb at uic.edu  Wed Mar  3 15:52:40 1999
From: prb at uic.edu (Paul R. Brown)
Date: Mon Jun  7 17:09:37 2004
Subject: DTD for Bibliographic Notation
Message-ID: <003701be658b$81c84d80$e7b2c183@razzmatazz.math.uic.edu>


The folks who built bibtex have already spent some time on this, so you
could use portions of their design.

    - Paul

-----Original Message-----
From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
Date: Wednesday, March 03, 1999 9:25 AM
Subject: DTD for Bibliographic Notation


>Has anybody written a DTD for bibliographies?  Are there any standards
>efforts in this area?  To be usable, this DTD would have to be public
>domain or explicitly allow unrestricted reuse. I probably don't need to
>modify it, but at a minimum I need to be able to republish it.




From elharo at metalab.unc.edu  Wed Mar  3 16:19:39 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:09:37 2004
Subject: DTD for Bibliographic Notation
In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1>
Message-ID: <v03102807b303116f7c52@[168.100.203.234]>

At 10:24 AM -0500 3/3/99, DuCharme, Robert wrote:
>> Elliotte Rusty Harold writes:
>>Has anybody written a DTD for bibliographies?
>
>Have you looked at the bibliography module of DocBook?
>

No, but I'll check it out. Thanks.

>  DocBook home page: http://www.oasis-open.org/docbook
>  XML version of DocBook: http://www.nwalsh.com/docbook/xml
>  file with XML's bibliography module:
>http://www.nwalsh.com/docbook/xml/1.3/dbhierx.mod
>
>Bob DuCharme       www.snee.com/bob       <bob@
>snee.com>  see www.snee.com/bob/xmlann for "XML:
>The Annotated Specification" from Prentice Hall.


+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|        XML: Extensible Markup Language (IDG Books 1998)            |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+




From david at megginson.com  Wed Mar  3 16:20:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:37 2004
Subject: SAX and DTDHandler
In-Reply-To: <9f7499ae.36dd3931@aol.com>
References: <9f7499ae.36dd3931@aol.com>
Message-ID: <14045.24597.828439.227541@localhost.localdomain>

MikeDacon@aol.com writes:

 > My primary question is will SAX allow me to parse a DTD?  It
 > doesn't seem so.  DTDHandler only handles unparsed Entity
 > declarations (like binary data) and Notation declarations.  If it
 > is the case that SAX does not parse DTDs due to the fact that it
 > does not want to perform validation then why bother with the above
 > two cases?

SAX doesn't parse anything -- it's just an interface.  Some (most?)
Java-based XML parsers that implement the SAX interface do happen to
perform validation, but that's outside the scope of SAX 1.0 itself
(we're talking about fixing that for ModSAX).

SAX 1.0 provides the DTDHandler interface because XML 1.0 requires
processors to report notations and unparsed entities.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From cowan at locke.ccil.org  Wed Mar  3 16:28:16 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:37 2004
Subject: SAX and DTDHandler
References: <9f7499ae.36dd3931@aol.com>
Message-ID: <36DD62E3.785602E1@locke.ccil.org>

MikeDacon@aol.com wrote:

> My primary question is will SAX allow me to parse a DTD?
> It doesn't seem so.  DTDHandler only handles unparsed Entity declarations
> (like binary data) and Notation declarations.  If it is the case that SAX does
> not parse DTDs due to the fact that it does not want to perform validation then
> why bother with the above two cases?

Remember that SAX is a front-end to various parsers with various
philosophies, validating (XML4J), non-validating but external-entity-
reading (Aelfred), non-validating and document-entity-only (XP).

SAX provides methods, for parsers that wish to do so, to report on
declared notations and unparsed entities, since these features
provide actual extensions to the basic element/attribute model.
Element and attribute list declarations cannot be reported through
SAX, since they are reckoned inessential.
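
SAX itself is a Java interface, but the shape of DTDHandler can be shown as a rough C analogue. The two callbacks below mirror the SAX 1.0 methods notationDecl(name, publicId, systemId) and unparsedEntityDecl(name, publicId, systemId, notationName); everything else (the struct, the report_* helpers, the counting callbacks) is hypothetical scaffolding for illustration.

```c
#include <stddef.h>

/* C analogue of the two SAX 1.0 DTDHandler callbacks. A NULL function
   pointer means the application is not interested in that event. */
struct dtd_handler {
    void *user_data;
    void (*notation_decl)(void *user_data, const char *name,
                          const char *public_id, const char *system_id);
    void (*unparsed_entity_decl)(void *user_data, const char *name,
                                 const char *public_id, const char *system_id,
                                 const char *notation_name);
};

/* What a parser might call on seeing a <!NOTATION ...> declaration. */
void report_notation(const struct dtd_handler *h, const char *name,
                     const char *public_id, const char *system_id)
{
    if (h != NULL && h->notation_decl != NULL)
        h->notation_decl(h->user_data, name, public_id, system_id);
}

/* What a parser might call on seeing <!ENTITY ... NDATA notation>. */
void report_unparsed_entity(const struct dtd_handler *h, const char *name,
                            const char *public_id, const char *system_id,
                            const char *notation_name)
{
    if (h != NULL && h->unparsed_entity_decl != NULL)
        h->unparsed_entity_decl(h->user_data, name, public_id, system_id,
                                notation_name);
}

/* Example application side: just count what gets reported. */
struct decl_counts {
    int notations;
    int unparsed_entities;
};

void count_notation(void *user_data, const char *name,
                    const char *public_id, const char *system_id)
{
    (void)name; (void)public_id; (void)system_id;
    ((struct decl_counts *)user_data)->notations++;
}

void count_unparsed_entity(void *user_data, const char *name,
                           const char *public_id, const char *system_id,
                           const char *notation_name)
{
    (void)name; (void)public_id; (void)system_id; (void)notation_name;
    ((struct decl_counts *)user_data)->unparsed_entities++;
}
```

Note that there is deliberately no callback for element or attribute list declarations, which matches the point that those cannot be reported through SAX 1.0.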

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Wed Mar  3 16:29:57 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
References: <199903031400.PAA23631@brown.informatik.uni-dortmund.de>
Message-ID: <36DD6335.4C40B6F5@locke.ccil.org>

Stefan Mintert wrote:

> Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
> (copied from the above URL) with nsgmls.

As the documentation for XMLspec warns, the current version of the
DTD is *not* the one used with the XML Recommendation, which used
a much older version.  So don't do that.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From mintert at irb.informatik.uni-dortmund.de  Wed Mar  3 17:14:19 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun  7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd 
In-Reply-To: Your message of Wed, 03 Mar 1999 11:28:37 -0500.
             <36DD6335.4C40B6F5@locke.ccil.org> 
Message-ID: <199903031713.SAA24548@brown.informatik.uni-dortmund.de>


 > > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
 > > (copied from the above URL) with nsgmls.
 > 
 > As the documentation for XMLspec warns, the current version of the
 > DTD is *not* the one used with the XML Recommendation, which used
 > a much older version.  So don't do that.

ooops, sorry; but that doesn't explain the parsing errors concerning the DTD:

spec.dtd:60:17:W: named character reference
spec.dtd:60:19:E: "X2014" is not a function name
[...]


BTW: It would be nice to use the XML spec as a valid document, not just a
well-formed document. Has anyone kept the old XMLspec DTD? (I guess it's
Revision 1.0, 7 April 1998)


Bye,

 Stefan.

+-----------------------------------------------------------+
  Stefan Mintert
       UniDo:    mintert@irb.informatik.uni-dortmund.de
       private:  stefan@mintert.com
+-----------------------------------------------------------+

        "let the music keep our spirits high..."

                                (Jackson Browne)



From jmcdonou at library.berkeley.edu  Wed Mar  3 17:58:23 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun  7 17:09:37 2004
Subject: DTD for Bibliographic Notation
In-Reply-To: <v03102803b302e883dc17@[168.100.203.234]>
References: <A01C76E644CAD111B83A0000E8D8890E057BD3@melange.icon.co.at>
Message-ID: <3.0.5.32.19990303094553.0097e990@library.berkeley.edu>

At 08:26 AM 3/3/1999 -0500, Elliotte Rusty Harold wrote:
>Has anybody written a DTD for bibliographies?  Are there any standards
>efforts in this area?  To be usable, this DTD would have to be public
>domain or explicitly allow unrestricted reuse. I probably don't need to
>modify it, but at a minimum I need to be able to republish it.
>

Mm, not to be Clinton-esque or anything, but it depends on what you
mean by bibliographies.  There are an awful lot of DTDs that include 
elements for bibliographic citation as part of a larger document
structure.  Some of the better known examples would include 
the <biblStruct> and <biblFull> elements within the TEI DTD, the <citation> 
element within the ETD-ML DTD (part of the Electronic Thesis and 
Dissertation project at Virginia Tech), the <bibliography> element within
the Encoded Archival Description DTD, and the <BiblioEntry> element in
DocBook.

There are standalone DTDs for capturing bibliographic information, but
they tend to be written by library geeks like me, and as a result, tend
to be a bit more detailed and extensive (read arcane and opaque) than 
what most people would think of when designing a DTD for bibliographies.  
The most authoritative work in these lines would probably be the 
MARC DTDs provided by the Library of Congress
(http://lcweb.loc.gov/marc/marcsgml.html), but understanding
those without copies of both the USMARC standard and the Anglo-American
Cataloguing Rules next to you is a non-trivial task.  If you want to
look over a simpler version of the MARC standard as an XML DTD, I
revised an SGML DTD that I did for MARC which you can grab at 
http://sunsite.berkeley.edu/~jmcdonou/USMARC.XML.DTD; again, knowledge
of the MARC standard is a big help on making heads or tails of the DTD, but
<Fld100>, <Fld245>, <Fld260>, and <Fld300> comprise most of what people
think of as basic bibliographic information.
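
As a minimal sketch of what pulling that basic information out might
look like: the four element names come from the paragraph above, and the
labels follow the usual MARC meanings of fields 100, 245, 260 and 300.
Everything else is an assumption made for illustration, in particular
the <Record> wrapper, the flat sample content, and the function and
dictionary names.  The real DTD may well nest subfields inside each
field, so the code simply flattens whatever text it finds.

```python
import xml.etree.ElementTree as ET

# MARC field elements named above, with informal labels for their meaning.
FIELD_LABELS = {
    "Fld100": "main_entry_personal_name",
    "Fld245": "title_statement",
    "Fld260": "publication",
    "Fld300": "physical_description",
}

# Hypothetical record; the wrapper element and the data are placeholders.
SAMPLE_RECORD = """<Record>
  <Fld100>Example, Author</Fld100>
  <Fld245>An example title</Fld245>
  <Fld260>Example City : Example Press, 1999</Fld260>
  <Fld300>123 p.</Fld300>
</Record>"""


def basic_bibliographic_info(record_xml):
    """Return the first occurrence of each basic field as flattened text."""
    root = ET.fromstring(record_xml)
    info = {}
    for element in root.iter():
        label = FIELD_LABELS.get(element.tag)
        if label is not None:
            # Join any nested subfield text and normalise whitespace.
            text = " ".join("".join(element.itertext()).split())
            info.setdefault(label, text)
    return info
```

A mapping from some other bibliographic DTD would need its own table in
place of FIELD_LABELS, which is exactly the per-format work described in
the next paragraph.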

If you're thinking that having all these different ways of encoding
bibliographic information is a headache waiting for those wanting to 
automate processing of bibliographic data from multiple sources, you're right.
But I don't think there's any way out of that one.  The needs of those doing
markup of bibliographic information vary quite a bit depending on whether
we're talking scholars reporting on their research, librarians, publishers,
students at various levels, etc.  Mapping between multiple forms of marked up
bibliographic data is something we're just going to have to live with.
I try to think of it as yet another clause in the text-encoding programmers'
full employment act.


Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000    (510) 642-5168          |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /  SGNORMPF!!!
         -- From the Famous Last Words file        |    ||||



From richard at cogsci.ed.ac.uk  Wed Mar  3 18:07:59 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:09:37 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd 
In-Reply-To: Stefan Mintert's message of Wed, 03 Mar 1999 18:13:45 +0100
Message-ID: <199903031807.SAA03605@stevenson.cogsci.ed.ac.uk>

> spec.dtd:60:17:W: named character reference
> spec.dtd:60:19:E: "X2014" is not a function name

Looks like it's not recognising XML-style character references -
presumably the line is

  <!ENTITY mdash  "&#x2014;">

Are you using a version of nsgmls that knows about XML?

-- Richard



From pgrosso at arbortext.com  Wed Mar  3 18:12:25 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:09:37 2004
Subject: Publication of first WD of the W3C XML Fragment Interchange Rec
Message-ID: <3.0.32.19990303121005.00decde8@pophost.arbortext.com>

The W3C XML Fragment WG [1] has just published its first Working Draft 
of the XML Fragment Interchange Recommendation [2].  Its abstract reads:

 The XML standard supports logical documents composed of possibly several 
 entities. It may be desirable to view or edit one or more of the entities or 
 parts of entities while having no interest, need, or ability to view or edit 
 the entire document. The problem, then, is how to provide to a recipient of 
 such a fragment the appropriate information about the context that fragment 
 had in the larger document that is not available to the recipient. The XML 
 Fragment WG is chartered with defining a way to send fragments of an XML 
 document--regardless of whether the fragments are predetermined entities or 
 not--without having to send all of the containing document up to the part in 
 question. This document defines Version 1.0 of the [eventual] W3C 
 Recommendation that addresses this issue. 

Interested parties are invited to review the specification and
report implementation experience.  As indicated in the document, 
comments should be sent to [3], (a publicly archived [4] list).  
Comments received by 1999 March 26 will be considered for a 
revision soon after.

All comments will be considered in light of the XML Fragment Requirements 
Document [5].  In particular, basic scope issues and design decisions 
will be reconsidered only when grave and previously unrecognized flaws 
are uncovered.  Requests for enhancement will typically be deferred 
for later versions of the specification under development unless the 
enhancement is uncontroversial and its incorporation would not 
materially delay production of the specification.

Paul Grosso
XML Fragment WG Chair
Daniel Veillard
W3C Staff Contact

[1] http://www.w3.org/XML/Activity.html#fragment-wg
[2] http://www.w3.org/TR/WD-xml-fragment
[3] mailto:www-xml-fragment-comments@w3.org
[4] http://lists.w3.org/Archives/Public/www-xml-fragment-comments/
[5] http://www.w3.org/TR/NOTE-XML-FRAG-REQ



From pgrosso at arbortext.com  Wed Mar  3 18:30:15 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:09:38 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd 
Message-ID: <3.0.32.19990303122916.00d2c4a0@pophost.arbortext.com>

At 18:13 1999 03 03 +0100, Stefan Mintert wrote:
>
 > 
 > http://www.w3.org/XML/1998/06/xmlspec-report-19980910.htm
 > 
>
> > > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
> > > (copied from the above URL) with nsgmls.
> > 
> > As the documentation for XMLspec warns, the current version of the
> > DTD is *not* the one used with the XML Recommendation, which used
> > a much older version.  So don't do that.
>
>ooops, sorry; but that doesn't explain the parsing errors concerning the DTD:
>
>spec.dtd:60:17:W: named character reference
>spec.dtd:60:19:E: "X2014" is not a function name
>[...]

1.  nsgmls is not an XML parser.  Those errors are probably because it's
    not recognizing &#X2014; (the hex version) as a numeric character
    reference.  You might try converting X2014 to a decimal number
    and seeing what happens.  Or, use an XML parser.

2.  The URL quoted above is old.  The latest are:

DTD:
  http://www.w3.org/XML/1998/06/xmlspec-19990205.dtd
Documentation:
  http://www.w3.org/XML/1998/06/xmlspec-report-19990205.htm
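
The decimal conversion suggested in point 1 can be sketched as follows.
This is only an illustration in Python; the function name
hex_refs_to_decimal is hypothetical, and the pattern assumes references
written with a lowercase "x", as XML spells them.

```python
import re

# Hexadecimal character references such as &#x2014;
_HEX_REF = re.compile(r"&#x([0-9A-Fa-f]+);")


def hex_refs_to_decimal(text):
    """Rewrite every hexadecimal character reference in decimal form."""
    return _HEX_REF.sub(lambda m: "&#%d;" % int(m.group(1), 16), text)
```

Applied to a declaration like <!ENTITY mdash "&#x2014;">, this gives
<!ENTITY mdash "&#8212;">, which a parser limited to decimal numeric
references should accept.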

paul
    



From tgraham at mulberrytech.com  Wed Mar  3 19:31:20 1999
From: tgraham at mulberrytech.com (Tony Graham)
Date: Mon Jun  7 17:09:38 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd 
In-Reply-To: <199903031400.PAA23631@brown.informatik.uni-dortmund.de>
References: <01BE6366.0E5EF230.jarle.stabell@dokpro.uio.no>
	<199903031400.PAA23631@brown.informatik.uni-dortmund.de>
Message-ID: <f9903031429400056@inu.menteith.com>

At 3 Mar 1999 15:00 +0100, Stefan Mintert wrote:
 > Now I try to parse the XML spec (REC-xml-19980210.xml) and the spec.dtd
 > (copied from the above URL) with nsgmls. I'm using nsgmls 1.3 on SunOS 5.6
 > (Solaris 2). I already parsed xml instances without problems but in this case
 > it doesn't work. Following are the first lines of nsgmls output:
 > 
 > sm@brown(/tmp/sm){590}: /tmp/sm/sp-1.3/nsgmls/nsgmls -E 10 -w xml -s REC-xml-19980210.xml
 > /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:17:W: named character reference
 > /tmp/sm/sp-1.3/nsgmls/nsgmls:spec.dtd:60:19:E: "X2014" is not a function name

Add -c/tmp/sm/sp-1.3/pubtext/xml.soc to the command line so nsgmls
reads the xml.soc catalog that tells it to use the SGML Declaration
for XML, xml.dcl.  That SGML Declaration tells nsgmls what hexadecimal
character references look like.  Without it, things like &#x2014; are
being interpreted as per ISO 8879:1986, which isn't doing you or the
parser any good.
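
For anyone scripting this, the sketch below just assembles the amended
command line.  The options are the ones quoted above plus the -c
catalog argument; the Python function name and the assumption that both
nsgmls and pubtext/xml.soc sit under one SP directory are illustrative
only.

```python
def nsgmls_xml_command(sp_root, document):
    """Build the nsgmls argument list with the XML catalog added."""
    nsgmls = sp_root + "/nsgmls/nsgmls"
    catalog = sp_root + "/pubtext/xml.soc"
    # -c<catalog> makes nsgmls read xml.soc, which points at xml.dcl.
    return [nsgmls, "-c" + catalog, "-E", "10", "-w", "xml", "-s", document]
```

The resulting list can be handed to subprocess.run() on a machine where
SP 1.3 is actually installed at that location.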

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================




From db at Eng.Sun.COM  Wed Mar  3 19:55:48 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun  7 17:09:38 2004
Subject: Encoding detection again ...
References: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303123700Z-16334@odin.cromwellmedia.co.uk>
Message-ID: <36DD9263.F26D063C@eng.sun.com>

> > > Put it this way:  if you assume UTF-16, you're
> > > safe either way because UTF-16 is a superset.
> >
> > Err ... is that true?
> >
> > Maybe I'm being a bit obsessive about my
> > interpretation of the various standards docs,

Given how many folk talk about UCS-2 lately (not many!)
that could well be true ... ;-)

> >	 but
> > as far as I can see UCS-2 isn't a subset of
> > UTF-16.
> 
> The question of UCS-2 being, or not being a subset of
> UTF-16 is a bit of a red herring. It is undoubtedly true
> that the set of octet pairs which are legal UCS-2
> characters is a subset of the set of octet pairs which
> are legal UTF-16 characters.

And more to the point, XML processors aren't required
to report such low level character encoding errors ...
this would be one.

 
> Appendix F suggests that octet sequences which could
> equally well be interpreted as UTF-16 or UCS-2 may be
> assumed to be UTF-16, and *doesn't* include a clause
> stating that this assumption should be revised in
> the light of an explicit XML encoding declaration. I
> think that clause should be added, in much the same
> way as it is for UTF-8 vs. 8859-X.

All of appendix F is non-normative; you're free to revise
or not, as you see fit, and it won't affect conformance.
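
To make the two options concrete, here is a minimal sketch of the kind
of detection appendix F describes, with the revision step bolted on.
The byte signatures are those listed in the appendix; the mapping to
Python codec names, the choice of cp037 for EBCDIC, and all function
names are assumptions made for illustration, not anything the spec
prescribes.

```python
import re

# First-bytes signatures from XML 1.0 appendix F, mapped to Python codecs.
# Four-byte signatures come first so they win over two-byte ones.
_SIGNATURES = [
    (b"\x00\x00\x00\x3c", "utf-32-be"),  # UCS-4, big-endian
    (b"\x3c\x00\x00\x00", "utf-32-le"),  # UCS-4, little-endian
    (b"\x00\x3c\x00\x3f", "utf-16-be"),  # UTF-16 big-endian, no byte order mark
    (b"\x3c\x00\x3f\x00", "utf-16-le"),  # UTF-16 little-endian, no byte order mark
    (b"\x3c\x3f\x78\x6d", "utf-8"),      # "<?xm" in some ASCII-compatible encoding
    (b"\x4c\x6f\xa7\x94", "cp037"),      # EBCDIC; the exact flavour is assumed
    (b"\xfe\xff", "utf-16-be"),          # byte order mark, big-endian
    (b"\xff\xfe", "utf-16-le"),          # byte order mark, little-endian
]

_DECL = re.compile(
    r'<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def provisional_encoding(head):
    """Guess an encoding family from the first bytes of the entity."""
    for signature, codec in _SIGNATURES:
        if head.startswith(signature):
            return codec
    return "utf-8"


def declared_encoding(head):
    """Read the encoding declaration using the provisional guess."""
    text = head.decode(provisional_encoding(head), errors="replace")
    match = _DECL.search(text)
    return match.group(1) if match else None


def effective_encoding(head):
    """Prefer the explicit declaration; fall back to the guess."""
    return declared_encoding(head) or provisional_encoding(head)
```

With this arrangement an entity whose bytes look like UTF-16 but which
declares ISO-10646-UCS-2 ends up labelled with the declared name.  A
processor that never revises would simply stop at
provisional_encoding().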

- Dave


> Now the typo ...
> 
> > This very complicated and isn't a zillion miles away
> > from the current handling of UTF-8 vs. ISO 8859-x
> > vs. US-ASCII.
> 
> Please insert the word 'isn't' in the obvious
> place ;-)
> 
> Cheers,
> 
> Miles
> 
> --
> Miles Sabin                          Cromwell Media
> Internet Systems Architect           5/6 Glenthorne Mews
> +44 (0)181 410 2230                  London, W6 0LJ
> msabin@cromwellmedia.co.uk           England
> 



From richard at goon.stg.brown.edu  Wed Mar  3 20:32:53 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:38 2004
Subject: Encoding detection again ...
References: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303123700Z-16334@odin.cromwellmedia.co.uk> <36DD9263.F26D063C@eng.sun.com>
Message-ID: <36DD9C3C.99DD529C@goon.stg.brown.edu>

David Brownell wrote:

> And more to the point, XML processors aren't required
> to report such low level character encoding errors ...
> this would be one.

On the face of things, this doesn't make sense.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From cowan at locke.ccil.org  Wed Mar  3 20:53:39 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:38 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd
References: <3.0.32.19990303122916.00d2c4a0@pophost.arbortext.com>
Message-ID: <36DDA11E.C07163B0@locke.ccil.org>

Paul Grosso wrote:

> 1.  nsgmls is not an XML parser.

The version included with SP 1.3 *is* an XML parser,
though not entirely defect-free.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Wed Mar  3 20:59:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:38 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <000d01be64fc$1a3a09e0$0dce6ccb@baden> <36DD523B.F2EAFB7E@locke.ccil.org> <36DD92EC.80B3B3DE@eng.sun.com>
Message-ID: <36DDA2A7.8E2F1E01@locke.ccil.org>

David Brownell wrote:

> Surely it's more important that Klingon markup be supported?  :-)

All Languages Are Equal (TM).
 
> I notice that a recent Linux distribution puts Klingon support
> into a chunk of private use area, so at least there's consistency
> that XML doesn't yet offer complete Klingon support!

Right.  Support for private-use characters in XML names will always
be a Bad Thing, because nobody outside the private user can tell
which characters are letters and which aren't, so it's either all
or none, and "none" is the most sensible choice.
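
A small sketch of why the choice is all-or-none, using Python's
unicodedata module purely for illustration (the helper names are
hypothetical).  The Unicode database gives every private-use code point
the same general category, "Co", and says nothing about whether any of
them behaves like a letter.

```python
import unicodedata


def is_private_use(char):
    """True if the character is a private-use code point (category Co)."""
    return unicodedata.category(char) == "Co"


def private_use_in_name(name):
    """Return the private-use characters found in a candidate XML name."""
    return [char for char in name if is_private_use(char)]
```

A name checker built on this can only accept or reject such characters
wholesale; it has no data from which to tell a private-use "letter"
from a private-use punctuation mark.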

Just be prepared to revisit XML so that Unicode 3.0 name and name-start
characters can get included.  This will allow the creation of
DTDs written in serious Real World languages like Macedonian, Syriac,
Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native
languages, Khmer, Mongolian, and Yi.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Wed Mar  3 21:04:29 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:38 2004
Subject: Encoding detection again ...
References: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303123700Z-16334@odin.cromwellmedia.co.uk> <36DD9263.F26D063C@eng.sun.com> <36DD9C3C.99DD529C@goon.stg.brown.edu>
Message-ID: <36DDA39D.EA98C73B@locke.ccil.org>

Richard L. Goerwitz wrote:

> On the face of things, this doesn't make sense.

For example, a document containing &#x80; and otherwise error-free
may be processed without error, although U+0080 is not a
legal Unicode character.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From db at Eng.Sun.COM  Wed Mar  3 21:06:07 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun  7 17:09:38 2004
Subject: Java Specification Request for XML
Message-ID: <36DD9EA1.2CEE7CEA@eng.sun.com>

There seems to have been some confusion regarding what Sun is trying
to do with its Java Specification Request for an XML Extension to the
Java Platform.

A Java Specification Request (JSR) is a request to develop a
specification; it is not a specification in itself.  What we did
a week ago is ask for comments regarding this proposal to begin
work on such an XML Extension specification.  If this is approved,
we will then follow the Java Community Process as described at
http://developer.java.sun.com/developer/jcp/ to actually develop
that specification.

The Java Community Process is an open, inclusive process and we
look forward to the active participation of all interested parties.

The process goes forward in several steps:

[1] The JSR is presented for comment (as you've seen)
[2] The JSR is approved (we hope)
[3] An expert group is formed to write the specification; this
    begins with a "Call for Experts" (CAFE) to participate.
[4] The expert group writes a first draft of the specification
[5] The draft is circulated to all Java technology licensees and
    Participants in the Java Community Process.
[6] Comments are collected, read, and responded to by the expert
    group, resulting in an improved specification.
[7] The refined specification is then released to the public for
    comment.
[8] Comments from the public are collected, read, and responded
    to by the expert group, resulting in more refinements.
[9] The final specification is produced by the expert group, along
    with a reference implementation and compatibility tests.

The key point is that everyone with internet access will get a
chance to review and comment on the emerging specification.

Note that the xml-dev community has already had input into the
proposed specification as evidenced by the referencing of the
SAX specification in the JSR as one of the starting documents.
Other specifications could be adopted by the expert group.

We look forward to the continued participation of the xml-dev
community in this work.

- Dave



From db at Eng.Sun.COM  Wed Mar  3 21:38:27 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun  7 17:09:38 2004
Subject: Encoding detection again ...
References: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303123700Z-16334@odin.cromwellmedia.co.uk> <36DD9263.F26D063C@eng.sun.com> <36DD9C3C.99DD529C@goon.stg.brown.edu>
Message-ID: <36DDAA6F.D432053A@eng.sun.com>

"Richard L. Goerwitz" wrote:
> 
> David Brownell wrote:
> 
> > And more to the point, XML processors aren't required
> > to report such low level character encoding errors ...
> > this would be one.
> 
> On the face of things, this doesn't make sense.

For example, character encodings are typically handled
many layers below the XML processor.  That processor
shouldn't be faulted for behaviors of the underlying
processor.

- Dave



From msabin at cromwellmedia.co.uk  Wed Mar  3 21:52:01 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:09:38 2004
Subject: Encoding detection again ...
Message-ID: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303214309Z-16544@odin.cromwellmedia.co.uk>

David Brownell wrote,
> "Richard L. Goerwitz" wrote:
> > David Brownell wrote:
> > > And more to the point, XML processors aren't
> > > required to report such low level character 
> > > encoding errors ... this would be one.
> > 
> > On the face of things, this doesn't make sense.
> 
> For example, character encodings are typically handled
> many layers below the XML processor.  That processor
> shouldn't be faulted for behaviors of the underlying
> processor.

Most of the time yes ... but remember we're discussing 
the interaction between encoding detection and encoding 
_declarations_. An XML processor has to have some 
involvement in that.

Cheers,


Miles

-- 
Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England




From db at Eng.Sun.COM  Wed Mar  3 22:03:28 1999
From: db at Eng.Sun.COM (David Brownell)
Date: Mon Jun  7 17:09:38 2004
Subject: Encoding detection again ...
References: <c=US%a=_%p=Cromwell_Media%l=ODIN-990303214309Z-16544@odin.cromwellmedia.co.uk>
Message-ID: <36DDB059.550023E7@eng.sun.com>

Miles Sabin wrote:
> 
> David Brownell wrote,
> > "Richard L. Goerwitz" wrote:
> > > David Brownell wrote:
> > > > And more to the point, XML processors aren't
> > > > required to report such low level character
> > > > encoding errors ... this would be one.
> > >
> > > On the face of things, this doesn't make sense.
> >
> > For example, character encodings are typically handled
> > many layers below the XML processor.  That processor
> > shouldn't be faulted for behaviors of the underlying
> > processor.
> 
> Most of the time yes ... but remember we're discussing
> the interaction between encoding detection and encoding
> _declarations_. An XML processor has to have some
> involvement in that.

But the error in question would show up after the encoding
declaration had been processed -- well after! -- so the
XML processor itself would no longer need involvement.

The non-normative "detection" can't involve the error ...
surrogates can't appear within encoding declarations.

In any case, it's OK for conformant processors to reject
UCS-2 out of hand, eliminating all possibility of such
an error!
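
For readers following along, the error under discussion comes down to
surrogates: UTF-16 encodes characters beyond the 16-bit range as a pair
of code units in the range D800-DFFF, and those values are not
characters to a strict UCS-2 reader.  The sketch below is illustrative
only; the function name is hypothetical and big-endian input without a
byte order mark is assumed.

```python
import struct


def surrogate_offsets(data):
    """Return the indexes of 16-bit big-endian units in D800-DFFF."""
    count = len(data) // 2
    units = struct.unpack(">%dH" % count, data[: count * 2])
    return [i for i, unit in enumerate(units) if 0xD800 <= unit <= 0xDFFF]
```

Run over an encoding declaration this returns an empty list, since the
declaration is plain ASCII text; run over later content containing a
character such as U+10000 it reports the two units of the pair.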

- Dave



From simonstl at simonstl.com  Wed Mar  3 22:24:41 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:38 2004
Subject: Java Specification Request for XML
In-Reply-To: <36DD9EA1.2CEE7CEA@eng.sun.com>
Message-ID: <199903032224.RAA10719@hesketh.net>

At 12:42 PM 3/3/99 -0800, David Brownell wrote:
>There seems to have been some confusion regarding what Sun is trying
>to do with its Java Specification Request for an XML Extension to the
>Java Platform.
>
>[...]
>
>The Java Community Process is an open, inclusive process and we
>look forward to the active participation of all interested parties.
>
>[...detailed list of process steps, excerpted..]
>
>[4] The expert group writes a first draft of the specification
>[5] The draft is circulated to all Java technology licensees and
>    Participants in the Java Community Process.
>[7] The refined specification is then released to the public for
>    comment.
>
>The key point is that everyone with internet access will get a
>chance to review and comment on the emerging specification.
>
>Note that the xml-dev community has already had input into the
>proposed specification as evidenced by the referencing of the
>SAX specification in the JSR as one of the starting documents.
>Other specifications could be adopted by the expert group.
>
>We look forward to the continued participation of the xml-dev
>community in this work.

This all sounds good, but I remain concerned (and wary) for a number of
reasons, and I didn't respond directly to your JSR commenting process
because I'm very uncertain about whether this development belongs in a
process controlled, however lightly, by a particular vendor.

The JCP is only a partially open process, as the sequence of steps above
demonstrates: Java technology licensees and 'Participants in the Java
Community Process' come in at step 5, the public only at step 7.  It seems
that the licensees and 'official' participants are still privileged, have
earlier access to the information, and potentially more impact on its
shape.  I don't expect to be one of the experts crafting the standard, but
I hope to be able to participate in the discussions as a real participant
and not just another spectator.

Given that SAX was developed (and is still developing) in a very open
forum, it seems like the JCP is moving into an area that was totally open
and moving it to an arena that is _less_ open.  There have been a lot of
criticisms of W3C process on this list, as I'm sure you've noticed, for
similar openness problems.  While the W3C does in some way respond to
public comments, there's no transparency - we have no way to know how much
they care.

I'd like to hear Sun make some _strong_ statements that they'll be
developing this API in a way more like the SAX process than the DOM
process, and that genuine transparency is the goal of the JCP rather than
Sun protecting what it sees as its interests in the Java/XML space.  I
think Sun could make a great contribution here, using its weight in the
Java community to help standardize XML processing and make it more
universally used, but I hope Sun isn't planning to use that weight to
direct the discussion and influence the final decisions unduly.

It's promising, but I think there are a lot of folks out here who are very
wary.  (See Elliotte Rusty Harold's comments at http://metalab.unc.edu/xml
for an example.)  I'm definitely wary, though I also have some real hopes.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From mscardin at us.oracle.com  Wed Mar  3 22:50:00 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun  7 17:09:38 2004
Subject: ANN: Oracle XML Class Generator for Java
Message-ID: <001701be65c7$febb5620$47be1990@mscardin-pc.us.oracle.com>

I would like to announce Oracle's second XML component beta release -
XML Class Generator for Java - now available for downloading and testing on
the Oracle Technology Network (OTN) XML site located at
http://technet.oracle.com.

The XML Class Generator will generate a set of Java source files based on an
input DTD. The generated Java source files can then be used to construct,
optionally validate, and print an XML document that complies with the DTD
specified. This is an early beta release and has the following features:

  * Creates Java classes from DTDs to enable the programmatic construction
    of XML documents.
  * Supports validation mode to assist debugging.
  * Works with the Oracle XML Parser in Java.
  * Creates documents conforming to the W3C XML 1.0 Recommendation.
  * Supports creating documents in the following encodings:

                  UTF-8
                  UTF-16
                  ISO-10646-UCS-2
                  ISO-10646-UCS-4
                  US-ASCII
                  EBCDIC-CP-US
                  ISO-8859-1
                  Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative
area for bug reporting, technical support, and discussing other Oracle/XML
issues.  This forum will be used for external as well as internal beta
testers.

Mark V. Scardina
Sr. Product Manager - Core Development
Server Technologies - Oracle Corporation
Oracle XML News http://www.oracle.com/xml




From mscardin at us.oracle.com  Wed Mar  3 23:00:19 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun  7 17:09:38 2004
Subject: ANN: Oracle XML Parser for Java - Production Release
Message-ID: <001801be65c9$6513c960$47be1990@mscardin-pc.us.oracle.com>

The production release of the Oracle XML Parser for Java
is available for download at http://technet.oracle.com/tech/xml.

  * Supports validation and non-validation modes.
  * Built-in error recovery until fatal error.
  * Supports the W3C XML 1.0 Recommendation.
  * Integrated Document Object Model (DOM) Level 1.0 API.
  * Integrated SAX 1.0 API.
  * Supports the W3C Proposed Recommendation for XML Namespaces.
  * Supports documents in the following encodings:

           UTF-8              BIG 5
           UTF-16             GB2312
           ISO-10646-UCS-2    EUC-JP
           ISO-10646-UCS-4    EUC-KR
           US-ASCII           KOI8-R
           EBCDIC-CP-*        ISO-2022-JP
           ISO-8859-1 to -9   ISO-2022-KR
           Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative
area for bug reporting, technical support, and discussing other Oracle/XML
issues.  This forum will be used for external as well as internal beta
testers.

Mark V. Scardina
Sr. Product Manager - Core Development
Server Technologies - Oracle Corporation
Oracle XML News http://www.oracle.com/xml



From dante at mstirling.gsfc.nasa.gov  Thu Mar  4 15:48:21 1999
From: dante at mstirling.gsfc.nasa.gov (Dante Lee)
Date: Mon Jun  7 17:09:38 2004
Subject: HTML Question
Message-ID: <Pine.LNX.3.95.990304112836.5033C-100000@mstirling.gsfc.nasa.gov>

Can someone look at the source of my web page and tell me why my links are
not coming up in the targeted frames?

The site is at:
http://mstirling.gsfc.nasa.gov/~dante/sharp98

All of the links are targeted to Frame 1, which is specified in the index
frame as the frame to the right.  However, all of the links pop up as new
windows.  Please help.  I think it has something to do with the javascript
in Frame1 (titlebox.html).  Thanx.


	          Dante M. Lee    Code 588
        	NASA/GSFC Greenbelt MD 20771
 	Voice = 301-521-1077   Bldg = 23  Rm = W415 
 	  Email = dante@mstirling.gsfc.nasa.gov
                  dante4@hotmail.com                                          
 



From dalapeyre at mulberrytech.com  Thu Mar  4 19:53:04 1999
From: dalapeyre at mulberrytech.com (Deborah Aleyne Lapeyre)
Date: Mon Jun  7 17:09:38 2004
Subject: DTD for Bibliographic Notation
In-Reply-To: <v03102807b303116f7c52@[168.100.203.234]>
References: <49092BAEAC84D2119B0600805FD40F9F120DC3@MDYNYCMSX1>
Message-ID: <v03020907b30437160193@DialupEudora>

The journal publishers have taken a cut at bibliographies.
A small sample:
   Elsevier's is available on their website (at least it used to be)
   John Wiley & Sons has one (See WILEY Interscience)
   CADMUS used to have theirs on their website
   Ovid has made theirs public as well (but I would not recommend it)

   PUBMED at NIH/NLM also has a very basic but nice subset
    (definitely available on their website).

--Debbie

======================================================================
Deborah Aleyne Lapeyre               mailto:dalapeyre@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9633
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================




From tbray at textuality.com  Thu Mar  4 20:15:31 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:39 2004
Subject: XML and special Characters : unicode v3.0 ?
Message-ID: <3.0.32.19990303213203.00bb79a0@pop.intergate.bc.ca>

At 03:59 PM 3/3/99 -0500, John Cowan wrote:
> This will allow the creation of
>DTDs written in serious Real World languages like Macedonian, Syriac,
>Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native
>languages, Khmer, Mongolian, and Yi.

John, this is unfair.  All the Macedonians and Sinhalese I've known
have an excellent sense of humor.  -Tim




From jpetit at 4thworldtele.com  Thu Mar  4 22:05:34 1999
From: jpetit at 4thworldtele.com (John Petit)
Date: Mon Jun  7 17:09:39 2004
Subject: XSL Pre-processing
Message-ID: <36DEA0E1.1EA10C7E@4thworldtele.com>

Is there any software out there that will allow me to do server side XSL
preprocessing of XML documents into HTML for display? This is
independent of the user's browser.
From donpark at quake.net  Thu Mar  4 22:06:46 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:09:39 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <00ac01be668b$3abc9f80$2ee044c6@arcot-main>

The first draft of the XML Fragment spec allows only one Fragbody.  Could
someone from the WG shed some light on why this constraint is important?

Multi-fragment packages are useful in many situations, such as query result
representation.  Although it is possible to define a packaging mechanism
that handles multiple fragments, fragment context information (FCI) must
then be provided separately for each fragment, because the spec does not
allow FCI to be shared by multiple fragments.

A possible example of a multi-fragment package follows:

<?xml version="1.0"?>
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="">
  <f:fcs>
    <transaction>
      <purchase>
        <book/>
        <f:fragbody idref="frag1"
          fragbodyref="http://sales.acme.com/trans/19990207-1234#root().child(1,purchase).child(2,book)"/>
        <f:fragbody idref="frag2"
          fragbodyref="http://sales.acme.com/trans/19990207-1234#root().child(1,purchase).child(5,book)"/>
      </purchase>
    </transaction>
  </f:fcs>

  <p:body id="frag1">
    <book>
      <Author>J. R. R. Tolkien</Author>
      <Title>The Book of Lost Tales (The History of Middle-Earth)</Title>
      <Edition>Mass Market Paperback Reprint edition (June 1992)</Edition>
      <ISBN>0345375211</ISBN>
      <Price currency="USD">4.79</Price>
      <Quantity>1</Quantity>
    </book>
  </p:body>
  <p:body id="frag2">
    <book>
      <Author>J. R. R. Tolkien</Author>
      <Title>The Book of Lost Tales (The History of Middle-Earth)</Title>
      <Edition>Mass Market Paperback Reprint edition (June 1992)</Edition>
      <ISBN>0345375211</ISBN>
      <Price currency="USD">4.79</Price>
      <Quantity>1</Quantity>
    </book>
  </p:body>
</p:package>
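
To make the shape of such a package concrete, here is a minimal Java
sketch of how a receiver might check that every fragbody reference has a
matching body.  It is only an illustration: the class PackageReader, its
method names, and the cut-down SAMPLE document are hypothetical, the
namespace URIs are simply the ones used in the example above, and the
parsing goes through the JDK's javax.xml.parsers DOM API rather than any
particular fragment-aware processor.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper: inspects a multi-fragment package like the one above. */
final class PackageReader {
    // Namespace URIs copied from the example package; assumptions, not normative values.
    static final String PKG_NS = "http://www.w3.org/XML/Package/1.0";
    static final String FRAG_NS = "http://www.w3.org/XML/Fragment/1.0";

    // Cut-down package with two fragment references and two bodies.
    static final String SAMPLE =
        "<p:package xmlns:p='" + PKG_NS + "' xmlns:f='" + FRAG_NS + "'>"
        + "<f:fcs><purchase>"
        + "<f:fragbody idref='frag1'/><f:fragbody idref='frag2'/>"
        + "</purchase></f:fcs>"
        + "<p:body id='frag1'><book/></p:body>"
        + "<p:body id='frag2'><book/></p:body>"
        + "</p:package>";

    private PackageReader() {}

    private static Document parse(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed to match elements by namespace URI
            return factory.newDocumentBuilder().parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a well-formed package", e);
        }
    }

    /** Collects one attribute from every element with the given namespace and local name. */
    private static List<String> attributeValues(Document doc, String ns, String local, String attr) {
        List<String> values = new ArrayList<>();
        NodeList nodes = doc.getElementsByTagNameNS(ns, local);
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(((Element) nodes.item(i)).getAttribute(attr));
        }
        return values;
    }

    /** Ids of all body elements, in document order. */
    static List<String> bodyIds(String xml) {
        return attributeValues(parse(xml), PKG_NS, "body", "id");
    }

    /** idref values on fragbody elements that have no body with the same id. */
    static List<String> unresolvedRefs(String xml) {
        Document doc = parse(xml);
        Set<String> ids = new HashSet<>(attributeValues(doc, PKG_NS, "body", "id"));
        List<String> missing = new ArrayList<>();
        for (String ref : attributeValues(doc, FRAG_NS, "fragbody", "idref")) {
            if (!ids.contains(ref)) {
                missing.add(ref);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        System.out.println(bodyIds(SAMPLE));
        System.out.println(unresolvedRefs(SAMPLE));
    }
}
```

Matching on namespace URI plus local name, rather than on the p: and f:
prefixes, keeps the check independent of whatever prefixes a sender picks.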

Comments?

Don Park
Docuverse




From cowan at locke.ccil.org  Thu Mar  4 22:19:54 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:39 2004
Subject: XML and special Characters : unicode v3.0 ?
References: <3.0.32.19990303213203.00bb79a0@pop.intergate.bc.ca>
Message-ID: <36DF06D2.F214E9F9@locke.ccil.org>

Tim Bray wrote:
 
> At 03:59 PM 3/3/99 -0500, John Cowan wrote:
> > This will allow the creation of
> >DTDs written in serious Real World languages like Macedonian, Syriac,
> >Divehi, Sinhala, Burmese, Ethiopic, Cherokee, various Canadian Native
> >languages, Khmer, Mongolian, and Yi.
> 
> John, this is unfair.  All the Macedonians and Sinhalese I've known
> have an excellent sense of humor.  -Tim

Well, several people have believed this was sarcasm on my part.
Not so.  When I said "serious Real World languages" I meant it.
Real people speak, understand, read, and write them in the
course of their day-to-day lives.

Ancient Egyptian and Sindarin don't fall into this category,
no matter that I am an enthusiast of both.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Thu Mar  4 23:00:30 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:39 2004
Subject: Unicode conformance, short version
References: <3.0.32.19990301212757.00a2e5b0@pop.intergate.bc.ca>
			<36DC0627.F2491FA8@locke.ccil.org>
			<36DC67FB.26E0@w3.org>
			<199903041310.WAA18593@sh.w3.mag.keio.ac.jp> <14046.38677.746315.899329@localhost.localdomain>
Message-ID: <36DF1059.76F2CC4E@locke.ccil.org>

Unicode folks have seen this, but XML folks haven't.

Here's John's Own Version Of Unicode Conformance:

1) Unicode characters are 16 bits long; deal with it.
2) Byte order is only an issue in files.
3) If you don't have a clue, assume big-endian.
4) Loose surrogates don't mean jack.
5) Neither do U+FFFE and U+FFFF (a.k.a. the zigamorph).
6) Leave the unassigned codepoints alone.
7) It's OK to be ignorant about a character, but not plain wrong.
8) Subsets are strictly up to you.
9) Canonical equivalence matters.
10) Don't garble what you don't understand.
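
Rules 2 and 3 can be sketched in a few lines of Java.  This is one
reading of them, not an official algorithm: the class Utf16ByteOrder and
its method names are hypothetical, and the sketch only looks for the two
UTF-16 byte order marks at the start of a byte array, falling back to
big-endian when neither is present.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper: picks a UTF-16 byte order for serialized data. */
final class Utf16ByteOrder {
    private Utf16ByteOrder() {}

    /** True when the data starts with a UTF-16 byte order mark in either order. */
    static boolean hasBom(byte[] data) {
        if (data.length < 2) {
            return false;
        }
        int b0 = data[0] & 0xFF;
        int b1 = data[1] & 0xFF;
        return (b0 == 0xFE && b1 == 0xFF) || (b0 == 0xFF && b1 == 0xFE);
    }

    /** FF FE means little-endian; FE FF or no mark at all means big-endian (rule 3). */
    static Charset detect(byte[] data) {
        if (hasBom(data) && (data[0] & 0xFF) == 0xFF) {
            return StandardCharsets.UTF_16LE;
        }
        return StandardCharsets.UTF_16BE;
    }

    /** Decodes the bytes with the detected order, skipping the mark if there is one. */
    static String decode(byte[] data) {
        int offset = hasBom(data) ? 2 : 0;
        return new String(data, offset, data.length - offset, detect(data));
    }

    public static void main(String[] args) {
        byte[] littleEndian = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        byte[] noMark = {0x00, 0x41};
        System.out.println(decode(littleEndian)); // prints "A"
        System.out.println(decode(noMark));       // prints "A"
    }
}
```

Once the bytes are decoded into a String the byte order question is gone,
which is the point of rule 2: it only matters for the serialized form.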

This is presented in the hope that it may be useful, but all
warranties (including implicit warranties of merchantability or
fitness for a particular purpose) are void.  Freely reusable,
except that John Cowan asserts the moral right to be known as author.


-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From crism at oreilly.com  Thu Mar  4 23:03:03 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun  7 17:09:39 2004
Subject: I wonder ...
In-Reply-To: <000301be64f6$1363e9c0$5118a8c0@kuantech1.quokka.com>
	(jes@kuantech.com)
Message-ID: <199903042301.SAA01051@ruby.ora.com>

[Jeffrey E. Sussna]
> This works fine, but (at least in IE 5) only for a single
> level. That is, you can't have another entity reference inside
> "book.dtd". To me, this significantly limits its usefulness
> (imagine not allowing a #include inside a file that was
> #included).

IE 5 has its parsing errors, but this is not one of them.  Error
messages I've seen when parsing DocBook indicate that it is definitely
following references to multiple levels.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From tbray at textuality.com  Thu Mar  4 23:16:40 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:39 2004
Subject: Unicode conformance, short version
Message-ID: <3.0.32.19990304151441.00c17b10@pop.intergate.bc.ca>

At 05:59 PM 3/4/99 -0500, John Cowan wrote:

>4) Loose surrogates don't mean jack.

There's reason to believe they mean severe breakage upstream, and in
mission-critical apps are probably grounds to halt and catch fire.
Anyhow, if you're reading a character stream and one of 'em has a
value between (decimal) 55296 and 57343 inclusive, it ain't XML
any longer. (And I believe all the serious XML processors actually
enforce this particular rule). -Tim
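
For anyone who wants to see the rule as code, here is a minimal Java
sketch of the check.  The ranges are the ones in the Char production of
the XML 1.0 Recommendation; the class XmlChars and its method names are
hypothetical.  Because Java strings are sequences of 16-bit units, the
sketch walks the string by code point, so a properly matched surrogate
pair counts as one legal character above U+FFFF while a loose surrogate
shows up as a value in the 55296 to 57343 range and is rejected.

```java
/** Hypothetical helper: checks characters against the XML 1.0 Char production. */
final class XmlChars {
    private XmlChars() {}

    /** True when the code point is allowed in an XML 1.0 document. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)      // stops just below the surrogate block
            || (cp >= 0xE000 && cp <= 0xFFFD)    // excludes U+FFFE and U+FFFF
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Index of the first illegal character in the string, or -1 if there is none. */
    static int firstIllegalIndex(String s) {
        int i = 0;
        while (i < s.length()) {
            int cp = s.codePointAt(i); // a loose surrogate comes back as its own value
            if (!isXmlChar(cp)) {
                return i;
            }
            i += Character.charCount(cp);
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(firstIllegalIndex("ab\uD800c"));       // prints 2
        System.out.println(firstIllegalIndex("a\uD83D\uDE00b"));  // prints -1
    }
}
```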



From pgrosso at arbortext.com  Thu Mar  4 23:26:39 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:09:39 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <3.0.32.19990304172546.00f10224@pophost.arbortext.com>

At 14:06 1999 03 04 -0800, Don Park wrote:
>The first draft of the XML Fragment spec allows only one Fragbody.  Could
>someone from the WG shed some light on why this constraint is important?

First, let me remind folks that only comments sent to the archived
mail list set up for comments are "officially" considered.  The WG
cannot promise to honor all requests for responses to questions posted 
on xml-dev.  

However, the answer to Don's question will probably address a lot of
other questions, and the WG did consider it carefully, so I would like
to answer that here.

One of the key principles in developing this version of the Fragment
Interchange spec was to define and remain within a limited scope.  The
problem was (1) to define what fragment context information is, (2) to 
define a fragment context specification notation, and (3) to define at 
least one interoperable method for associating a fragment context 
specification with a fragment body.

Although we did decide to address point (3) by defining a simple
"packaging" scheme, we were very careful to do the minimum necessary
to address point (3).  Specifically, we did not want to enlarge our scope
to include packaging methods in general.  It is expected that the XML
Activity of the W3C will consider ways to address packaging in the near
future, and the XML Fragment WG didn't want to do something that might
later constrain a more general solution.  Packaging multiple entities
in a single unit is likely to be a useful thing to do in general, of which
packaging multiple fragment bodies is just one example.  The WG didn't
want to define a way to address multiple fragment bodies and then discover,
when the more general problem is carefully considered, that our solution
wasn't a subset of the solution to the more general problem.

In summary, the WG is aware of lots of improvements, enhancements, and
extensions that could be made to an XML Fragment Interchange spec, but
we ruthlessly kept ourselves to the "minimum needed to declare victory."
We expect work on Schemas and Packaging and XLink and probably other
areas will all contribute technology that would be useful in a version 2
XML Fragment Interchange spec someday, but we believe that implementation
and user experience should prove the version 1 spec useful before we even
think about a version 2.

Of course, if you seriously believe that the spec is useless unless it
allows multiple fragment bodies per package, then that is a comment you
should make and attempt to support.  We don't want to come out with a
spec folks think is useless, but we were trying to keep it as minimal
as possible while still addressing the problem we defined as our scope.

paul

Paul Grosso
Chair, XML Fragment WG



From jtauber at jtauber.com  Fri Mar  5 00:27:45 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:39 2004
Subject: XSL Pre-processing
Message-ID: <005901be669e$efbf2520$0300000a@othniel.cygnus.uwa.edu.au>

I use James Clark's XT.

For a more complete list see http://www.xmlsoftware.com/xsl/

For examples of XSL I use to produce the above site, see
http://www.xmlsoftware.com/articles/xsl-by-example.html
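
As a rough picture of what server-side processing looks like, here is a
minimal Java sketch.  It does not use XT; it assumes the javax.xml.transform
API that ships with current JDKs and the XSLT 1.0 namespace, both of which
are assumptions on top of what is discussed here.  The class ServerSideXsl,
its method, and the tiny stylesheet and document are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical helper: turns XML into HTML on the server, before the browser sees it. */
final class ServerSideXsl {
    // Illustrative stylesheet: renders the title of a note as an HTML heading.
    static final String SHEET =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='html'/>"
        + "<xsl:template match='/note'>"
        + "<html><body><h1><xsl:value-of select='title'/></h1></body></html>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    private ServerSideXsl() {}

    /** Applies the stylesheet to the document and returns the serialized result. */
    static String toHtml(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(toHtml("<note><title>Hello</title></note>", SHEET));
    }
}
```

The browser only ever receives the HTML string, so the result does not
depend on whether the user's browser understands XML or XSL.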

James

-----Original Message-----
From: John Petit <jpetit@4thworldtele.com>

>Is there any software out there that will allow me to do server side XSL
>preprocessing of XML documents into HTML for display? This is
>independent of the user's browser.




From donpark at quake.net  Fri Mar  5 00:37:58 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:09:39 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <004001be66a0$5aff8bd0$2ee044c6@arcot-main>

Paul,


>First, let me remind folks that only comments sent to the archived
>mail list set up for comments are "officially" considered.  The WG
>cannot promise to honor all requests for responses to questions posted
>on xml-dev.


Sorry about that.  I couldn't find the e-mail address of the mailing list
(the W3C site was down) when I sent my message, so I had to punt to xml-dev.

>Of course, if you seriously believe that the spec is useless unless it
>allows multiple fragment bodies per package, then that is a comment you
>should make and attempt to support.  We don't want to come out with a
>spec folks think is useless, but we were trying to keep it as minimal
>as possible while still addressing the problem we defined as our scope.


I found the spec very useful, timely, and clear.  It was not my intention to
delay, divert, or hamper the progress of the XML Fragment spec.  It was also
not my intention to imply that the WG overlooked something important.

I withdraw my comment since it does not fall under the intended scope of the
spec.

Best,

Don Park
Docuverse




From cadams at cascadecc.com  Fri Mar  5 01:16:58 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun  7 17:09:39 2004
Subject: Opinions requested
Message-ID: <000701be66a5$d03f5100$01010101@development.cascade>

Forgive me for the generic question, I'm to the point of betting the bank on
XML, and I'm looking for a pat on the back, or a voice of warning....

We are starting from scratch on our next generation product, and from what
I've read and seen, XML seems to fit the bill (Content Management, mixed with
WIDL RPC functionality, seems right up our alley).  I'm looking hard at ODBMS
systems and laying out the DB via XML (storing XML directly).  We have a
wealth of in-house Java and COM/DCOM experience, but none with ODBMS or XML.

Do I understand correctly that, at an item level, I can:
	1. name it (URI)?
		a. possibly supply some security to it?
	2. revision it?
	3. meta-data it?
		a. can meta-data have meta-data?

Would I be foolish to base my whole object system storage on xml, or on
ODBMS for that matter?  Are they cooked, are they ready for real world apps?

Once again, I'm sorry for the generic question, I have read the FAQ's, the
ODBMS webpages, several books etc.  I'm looking for the advice of those in
the trenches - Is it safe to make XML the foundation of my new product?

Should I grab a shovel, and jump in the trenches with you, or is this a deep
dark hole?


Thanks in advance, for all who might reply.


Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax:   435-654-1482




From tbray at textuality.com  Fri Mar  5 01:28:11 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:39 2004
Subject: Opinions requested
Message-ID: <3.0.32.19990304172718.00ba7c80@pop.intergate.bc.ca>

At 06:16 PM 3/4/99 -0700, Chad Adams wrote:
>Forgive me for the generic question, I'm to the point of betting the bank on
>XML, and I'm looking for a pat on the back, or a voice of warning....

You might get more helpful help if you described the problem you're trying
to solve. 

On the other hand, anything that has XML and ODBMS and Java and
COM/DCOM in it has to be A Good Thing; ask any analyst or 
prognosticator.  You might have to hire some of those two-headed
programmers, though. -Tim



From MikeDacon at aol.com  Fri Mar  5 01:53:52 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:39 2004
Subject: ModSax Suggestion
Message-ID: <d32bfc76.36df36fb@aol.com>

Hi Everyone,

While SAX does a good job as an event-based interface
to Parsers, it would be nice to add a few methods to
receive a DOM representation back from a reference to an org.xml.sax.Parser.

Something like:

/* The events flag turns SAX event callbacks on or off during the parse. */
org.w3c.dom.Document parse(InputSource is, boolean events)
    throws SAXException;
org.w3c.dom.Document parse(java.lang.String uri, boolean events)
    throws SAXException;

If a SAX driver did not want to produce a DOM, it could either simply
return null, or it could report that through an added method like:

boolean isDomCapable();

The above would let me use the ParserFactory to seamlessly switch 
between Parser implementations and get a DOM tree without building
one myself.  It is fruitless for me to build a DOM tree when almost all
the parser implementations provide that ability.  I just want a way to get
at that functionality in a simple and standard way (thus SAX). 
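
Put together, the suggestion might look like the Java sketch below.
Everything in it beyond the three proposed signatures is an assumption for
illustration: DomCapableParser is a hypothetical stand-alone interface (it
does not extend org.xml.sax.Parser, to keep the example short),
JaxpBackedParser is a hypothetical driver that borrows the JDK's
javax.xml.parsers DOM builder, and the sketch only builds the tree; it does
not deliver SAX events when the events flag is true.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical interface holding only the methods proposed above; not part of SAX. */
interface DomCapableParser {
    Document parse(InputSource is, boolean events) throws SAXException;

    Document parse(String uri, boolean events) throws SAXException;

    boolean isDomCapable();
}

/** Hypothetical driver that hands tree building to the JDK's DOM builder. */
final class JaxpBackedParser implements DomCapableParser {
    @Override
    public Document parse(InputSource is, boolean events) throws SAXException {
        // Sketch limitation: the events flag is accepted but no SAX events are fired.
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            return factory.newDocumentBuilder().parse(is);
        } catch (ParserConfigurationException | IOException e) {
            throw new SAXException(e);
        }
    }

    @Override
    public Document parse(String uri, boolean events) throws SAXException {
        return parse(new InputSource(uri), events);
    }

    @Override
    public boolean isDomCapable() {
        return true;
    }
}

/** Shows the calling pattern: ask whether a DOM is available, then request one. */
final class ModSaxDemo {
    private ModSaxDemo() {}

    static String rootNameOf(String xml) {
        DomCapableParser parser = new JaxpBackedParser();
        if (!parser.isDomCapable()) {
            throw new IllegalStateException("driver cannot build a DOM");
        }
        try {
            Document doc = parser.parse(new InputSource(new StringReader(xml)), false);
            return doc.getDocumentElement().getNodeName();
        } catch (SAXException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rootNameOf("<order><item/></order>")); // prints "order"
    }
}
```

The caller never names a concrete parser class in rootNameOf beyond the one
line that creates it, which is where a factory lookup would slot in.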

Thoughts?

 - Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com




From jes at kuantech.com  Fri Mar  5 01:58:51 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:09:39 2004
Subject: Opinions requested
In-Reply-To: <000701be66a5$d03f5100$01010101@development.cascade>
Message-ID: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>

I will not comment on the advisability of using an ODBMS, because 1) it's
out of scope for this group, and 2) it's a highly religious topic.  However,
I will comment on the question of whether to store your data directly as
XML, and confess that I don't understand the question.

XML is a great interchange language; i.e., a way to move data between
systems.  Generally speaking, however, each particular system has its own
optimal internal representation.  In an RDBMS, for example, it's tables.  In
a Java program it's objects, and so forth.  There is not (AFAIK) yet any such
thing as an XDBMS (though you could consider a file system of XML documents
plus a web server to resolve URLs to those documents as such a thing).

Anyway, my approach would be to store data in the most natural format for
the given storage technology, and define translations to and from XML to
move data between systems.
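
As a small illustration of that approach, here is a Java sketch in which
the program's natural representation is an ordinary object and XML appears
only at the boundary.  The Book class, its fields, and the element and
attribute names are all hypothetical, and the sketch assumes the DOM and
javax.xml.transform APIs bundled with current JDKs.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical domain object: a plain Java object inside, XML only for interchange. */
final class Book {
    final String title;
    final String isbn;

    Book(String title, String isbn) {
        this.title = title;
        this.isbn = isbn;
    }

    /** Translation out: builds a small DOM tree and serializes it. */
    String toXml() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element book = doc.createElement("book");
            book.setAttribute("isbn", isbn);
            Element titleElement = doc.createElement("title");
            titleElement.setTextContent(title); // the serializer escapes markup characters
            book.appendChild(titleElement);
            doc.appendChild(book);

            Transformer serializer = TransformerFactory.newInstance().newTransformer();
            serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            serializer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not serialize book", e);
        }
    }

    /** Translation in: parses the interchange form back into an object. */
    static Book fromXml(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            Element book = doc.getDocumentElement();
            String title = book.getElementsByTagName("title").item(0).getTextContent();
            return new Book(title, book.getAttribute("isbn"));
        } catch (Exception e) {
            throw new IllegalArgumentException("not a valid book document", e);
        }
    }

    public static void main(String[] args) {
        Book original = new Book("The Book of Lost Tales", "0345375211");
        Book copy = Book.fromXml(original.toXml());
        System.out.println(copy.title + " / " + copy.isbn);
    }
}
```

A relational store would do the same thing with rows instead of fields; the
two translation methods are the only place that knows about XML.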

Jeff

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
Chad Adams
Sent: Thursday, March 04, 1999 5:17 PM
To: xml-dev@ic.ac.uk
Subject: Opinions requested


Forgive me for the generic question, I'm to the point of betting the bank on
XML, and I'm looking for a pat on the back, or a voice of warning....

We are starting from scratch on our next generation product, from what I've
read and seen - xml seems to fit the bill (Content Management, mixed with
WIDL RPC functionality seems right up our alley).  I'm looking hard at ODBMS
systems and laying out the DB via xml (storing xml directly).  We have a
wealth of in-house Java and COM/DCOM experience, but none with ODBMS or XML.

Do I understand correctly that, at an item level, I can:
	1. name it (URI)?
		a. possibly supply some security to it?
	2. revision it?
	3. meta-data it?
		a. can meta-data have meta-data?

Would I be foolish to base my whole object system storage on xml, or on
ODBMS for that matter?  Are they cooked, are they ready for real world apps?

Once again, I'm sorry for the generic question, I have read the FAQ's, the
ODBMS webpages, several books etc.  I'm looking for the advice of those in
the trenches - Is it safe to make XML the foundation of my new product?

Should I grab a shovel, and jump in the trenches with you, or is this a deep
dark hole?


Thanks in advance, for all who might reply.


Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax:   435-654-1482


From MikeDacon at aol.com  Fri Mar  5 02:18:56 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:39 2004
Subject: Opinions requested
Message-ID: <8703ae19.36df3add@aol.com>

Hi Chad,

In a message dated 3/4/99 8:25:02 PM Eastern Standard Time,
cadams@cascadecc.com writes:
> Forgive me for the generic question, I'm to the point of betting the bank on
>  XML, and I'm looking for a pat on the back, or a voice of warning....
>  

Before you bet the bank, you need to make sure you are not
dependent on any part of the XML family of specifications that is
not complete or that lacks a variety of stable implementations from
different vendors.

XML will revolutionize the web ... but the key word there is "will".

A small company cannot afford to wait for a market to mature.

As one who has been part of a small company that jumped on a 
technology too soon in the maturity curve (like Java 1.02), 
I would recommend caution.

Best wishes,

 - Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com


From marcelo at mds.rmit.edu.au  Fri Mar  5 03:31:37 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:39 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
Message-ID: <36DF4CE1.7F4D3681@simdb.com>

"Jeffrey E. Sussna" wrote:

> There is not (AFAIK) yet any such thing as an XDBMS (though you could consider a file system of XML documents plus a web server to resolve URLs to those documents as such a thing).

I am continually surprised to hear remarks such as this.  SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities).  As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields -- both of these features can be accessed through arbitrary expressions, which can be complete programs, for instance a field can contain every other word of paragraphs whose parent section has a "priority" attribute with a numerical value less than 5; it also provides arbitrary document fragmenting capabilities at the application level).  And the weaknesses are not intrinsic to our model -- we have full structure queries slated for the near future (probably in the next six months).
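
To make the field example concrete, here is a minimal Python sketch of that one expression ("every other word of paragraphs whose parent section has a priority attribute below 5"). It is not SIM's API or syntax; it only restates the rule using the standard ElementTree module, and the element names "section" and "para" are assumptions made for the illustration.

```python
import xml.etree.ElementTree as ET


def every_other_word_field(xml_text, limit=5):
    """Collect every other word of paragraphs whose parent section
    carries a numeric "priority" attribute below `limit`."""
    root = ET.fromstring(xml_text)
    words = []
    for section in root.iter("section"):
        raw = section.get("priority")
        try:
            priority = float(raw) if raw is not None else None
        except ValueError:
            priority = None  # non-numeric priority: treat as absent
        if priority is None or priority >= limit:
            continue
        # findall() with a bare tag matches direct children only,
        # so `section` really is the paragraph's parent.
        for para in section.findall("para"):
            text = "".join(para.itertext())
            words.extend(text.split()[::2])
    return words
```

In a database the result would be indexed as a field in its own right rather than recomputed per query; the sketch only shows what the field would contain.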

SIM is just one of many XDBMS's available on the market, and is one of the fastest, if not _the_ fastest, and most scalable available (at the very least, it is a country mile ahead of (R|OO)DBMS's in terms of XML performance, contrary to the ever-popular notion that the latter are inherently faster than the former -- one client, after migrating their application from a popular RDBMS to SIM, removed the stop button from the query dialog because no-one ever got a chance to see it).

Anyway, enough shameless marketing: XDBMS's do exist today, and they do support high performance storage, querying and retrieval.


Cheers,
Marcelo Cantos

http://www.simdb.com/~marcelo/


From cadams at cascadecc.com  Fri Mar  5 06:40:43 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun  7 17:09:40 2004
Subject: Opinions requested - more detail on what my thinking is.
Message-ID: <000101be66d3$172277a0$01010101@development.cascade>

What is AFAIK?

Maybe I've been confused by ODBMS product/sales documentation.  At least
three of them that I have looked at (Object Design,  Ardent and Poet) seem
to have fairly extensive XML API's as well as other tools that support xml
storage in their databases.

For example, poet supplies a check-in/check-out utility that is used as a
version control system for content management of the xml structure stored
directly into the DB.  They also supply a browser utility that directly
accesses the DB, giving an xml tree navigation, and display - I assume they
are using something like microsoft's xml parser that renders to html and
displays it.

I assume that they are providing a set of java classes that model xml which
is then stored directly in the DB (feed it an xml file and it stores an xml
object graph representing the document).  I assume upon retrieval it simply
streams (maybe as simple as toString()) its xml representation to the xml
consumer who parses/renders it per dtd or whatever - no conversion
processing is needed in the path until the consumer, keeping speed optimal,
(pushing expensive parsing work to the client, relieving a busy server with
time to dish up more).

versus

storing some java non-xml object in the database, you then retrieve the
object from the database and wrap the information of the object into xml -
and then ship it to some xml consumer, who then parses/renders it back into
the non-xml objects form.

It also seems to me that if the objects that you are storing are not xml
objects, you have lost the concept of Content Management or at least made it
more complex to implement.

I think this is where "betting the bank" comes in.

To architect the system the second way would be to code a class per possible
unique xml element.  You would then need to write classes to pull these
"atomic" elements together etc.  Upon retrieval you would then create the
xml for transport.  This would isolate the DB storage and the client from
xml because it would be your own animal, giving you extensibility via normal
java class programming.
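
A minimal sketch of this second, class-per-element approach, in Python for brevity (the same shape would apply to Java classes). The Paragraph and Section classes and their element names are hypothetical; the XML is produced only at retrieval time, for transport.

```python
import xml.etree.ElementTree as ET


class Paragraph:
    """Hypothetical "atomic" content class: one class per element type."""

    def __init__(self, text):
        self.text = text

    def to_element(self):
        el = ET.Element("paragraph")
        el.text = self.text
        return el


class Section:
    """Hypothetical aggregate class that pulls atomic pieces together."""

    def __init__(self, title, children):
        self.title = title
        self.children = list(children)

    def to_element(self):
        el = ET.Element("section", {"title": self.title})
        for child in self.children:
            el.append(child.to_element())
        return el


def for_transport(obj):
    """Create the xml for transport from the stored (non-xml) objects."""
    return ET.tostring(obj.to_element(), encoding="unicode")
```

The cost shows up in the class count: every new element type needs a new class with its own to_element(), which is the per-element coding effort described above.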

To architect with xml from the bottom up puts XML Content Management at the
very root of the design.  You are dependent upon the xml protocol (not your
own custom objects) to give you extensibility.  Custom tags, meta-data,
naming, versioning, whatever else xml gives you, must be versatile enough to
emulate the java class hierarchies of complex inheritance and aggregation
graphs (as used in the option above).  This allows for the same authoring
tools used to develop content, to also develop navigation and other
parameters that will be utilized at run time by the consumer of the xml.
I'm assuming the big buy here is code will only need to be written for the
authoring tool, and the xml consumer.  All delivery from the db to the
client (even via complex n-tier systems) would require very little, or no
coding by us.

Client code would parse out the displayable portions to html and display it.
An applet would obtain the custom tags, meta-data etc. to make runtime
decisions on what to do, based on things that could happen as the user
interacts with the page.

Our need:

Author, name, store, revision, reuse, retrieve - pieces of documents, that
can then be combined with other documents, which can in turn be combined
with others ...

Documents are composed of text, video, audio, graphics ...  All the goodies
of style sheets etc. would be used.

Custom xml tags would not be published at this time - interoperability with
the world is not the driving requirement - ease of transporting
documentation + special controls from an n-tier DB system running our code
to a thin client running our code is.

Meta data and custom tags would be used for several reasons - for example;
enhance search and selection algorithms for authoring, bury navigational
control data/logic that could be used at run time to help select the next
element to display, bury management hooks that would trigger widl rpc to
other processes based on runtime states etc.

I'm also assuming that an ODBMS could deliver up complex linked, deeply
nested xml documents faster than open/read/closing hundreds of possible
files to assemble some document.  Concurrent open file handles also present
a problem ...

Have I missed the boat on what the ODBMS companies with XML Content
Management Systems have to offer me?

Chad


Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com
Phone: 435-654-6304
fax:   435-654-1482


From wperry at fiduciary.com  Fri Mar  5 07:23:19 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun  7 17:09:40 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com>
Message-ID: <36DF864B.B458299D@fiduciary.com>

Marcelo Cantos wrote:

> "Jeffrey E. Sussna" wrote:
>
> > There is not (AFAIK) yet any such thing as an XDBMS
>
> I am continually surprised to hear remarks such as this.  SIM _is_ an XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and full content query capabilities).  As an XDBMS it has weaknesses (it only supports predefined indexes and limited structure querying), but in some ways provides a model that is even richer than XML (it provides structure below element level, and has the concept of fields

In addition to this vision of an XML database, there has been much discussion of XML as a front end or a query-and-response framework for data stores, but I would argue that such applications of XML markup are not an XML database. A true XML database is shaped by the essential characteristics of XML itself:  it should be freely eXtensible; it should be defined and manipulated by Markup; and it should be cast in a Document Structure within which Elements identify Data Constructs, and Attributes provide Data Characterization.

Like XML itself, the XML database is fundamentally mismatched to the familiar storage and transmission frameworks of filesystem, relational table, object serialization or data stream. In the first case, any item--document, data table, or executable--whether 'text' or binary--which is committed to storage in a filesystem is treated as a file:  that is, as unitary and indivisible within the perspective and capabilities of the filesystem. A word processing program may, by opening a document, be able to identify and to manipulate as individual elements the sentences, paragraphs and chapters of that document. By contrast, the filesystem in which that document is stored reads, writes, renames, searches for or deletes the document as a whole. In XML terms, the filesystem sees the document as a single element--a root. Regardless of how many subelements we might mark up within that <root>, the
filesystem--designed for a generic 'file-like' document, is capable of manipulating only one.

In a similar way, a relational table--and the database engine behind it--can store, index, or construct joins upon only those data records which correspond to the schema of the table. While it is possible to use SQL or proprietary database tools to rewrite an existing table to a different schema, that is substantially different from submitting to a database engine, as an entry to a particular table, a single record which follows a unique schema of its own.

In the terms of both filesystem and relational table, an XML document is effectively a BLOB, in that its specifically XML structure is outside the ability of either to discern or to make any use of. Just as, for example, with audio or video content more commonly recognized as BLOBs, the filesystem or relational database engine is obliged to invoke a particular, content-specific processor in order to understand, and then to implement, the structure conveyed by markup in every XML document. Yet this need for pre-defined, content-specific handlers obviates the benefits of XML as a general solution. Indeed, it is not really XML at all if the markup possibilities are circumscribed by the need to conform to what a pre-defined handler can implement.

XML, by definition, is freely extensible. This fundamental characteristic trumps any hoped-for convenience in processing to be achieved by defining 'standard' tagsets, industry-wide 'domain' procedures, or normative namespace references. That this essential capability of XML is irreconcilably mismatched to conventional filesystems and relational databases means that if we are building true XML tools we are obliged to create new equivalents of the filesystem and the database which do conform to the extensible nature of XML. 'Internally' extensibility means that the structural definition of existing XML documents may be altered at any time by indicating, in a document instance, new subelements of the elements previously defined or, occasionally, consolidating--and eliminating--previously defined elements in favor of more general ones. This is not simple re-arrangement of the elements of an XML
document, but a fundamental re-definition of its structure. 'Externally' the extensibility of XML means that documents, arriving from any number of (not necessarily well-known) sources, may claim recognition by our XML database engine and expect, for example, to be accepted as input data, solely because the document root element has a tag which matches one defined in our system. Of course, below that apparently familiar root element may lie subelements whose type we have not seen before, or which are structured in a different hierarchy than we expect, or whose tag names are unfamiliar variants of what we use 'internally'.

A true XML database engine must inherently and efficiently handle the demands of both this internal and external extensibility. Effectively this means that the data schema must (potentially) be rewritten with every new 'record' accepted, or altered, in the database. That is, if we posit that those 'records' are XML documents then, as XML documents, they may be marked up at any time to a finer (or coarser) elemental granularity, and a true XML database engine must respond by reading, writing, querying, and generally processing them in sync with the markup. In the case of 'external' items--effectively data entry submitted to the XML database--the database engine must identify the schema with the data source. That is, it must understand that the markup of items originating from one source may be aliases of the markup in documents from another source and, again, may present a finer or coarser
elemental granularity than analogous documents from a different source.

What is missing in this, of course, is the traditional role of the DTD for validation. It is omitted because XML 1.0 defines two very different markup and processing disciplines, distinguished by whether there is a DTD, and in order to build XML tools it is necessary to choose which of these definitions we are following. XML is routinely introduced as both of its very different selves. Newcomers are usually first lured in with the promise of unlimited markup:  define your own tags which exactly suit your unique situation. Only after they have bitten for that bait are they told about the limitations imposed by the DTD. Yet the fact is that XML 1.0 defines one XML in which the DTD is omitted, and a simple and logical projection of that definition leads to an XML where markup is freely extensible and the data schema is what the sum of the markup in the system at any moment implies.
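
One way to picture a schema that is "what the sum of the markup in the system at any moment implies" is the following minimal Python sketch. It is an illustration under stated assumptions, not a description of any existing engine: the function name and the shape of the result are invented for the example, and it records only which child elements and attributes have been seen under each element name across DTD-less documents.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def implied_schema(documents):
    """Derive a schema from the markup actually present: for each element
    name, the child element names and attribute names seen so far."""
    children = defaultdict(set)
    attributes = defaultdict(set)
    for text in documents:
        for el in ET.fromstring(text).iter():
            attributes[el.tag].update(el.attrib)
            children[el.tag].update(child.tag for child in el)
    return {
        tag: {
            "children": sorted(children[tag]),
            "attributes": sorted(attributes[tag]),
        }
        for tag in children
    }
```

Each newly accepted document can widen the result, which is the sense in which the schema is potentially rewritten with every new record. Recognizing that tags from different sources are aliases of one another is a separate problem that this sketch does not attempt.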

Respectfully,

Walter Perry


From cadams at cascadecc.com  Fri Mar  5 08:50:40 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun  7 17:09:40 2004
Subject: Opinions requested
In-Reply-To: <36DF864B.B458299D@fiduciary.com>
Message-ID: <000301be66e5$33f1c860$01010101@development.cascade>

Walter,

Thanks for the reply.  If I understand what you are saying, it does seem
kind of weird that they would spec the DTD instead of just going with the
schema - since that's what schema is for.

Also, having taken the bait, my assumption was that any given xml document
might be a mixture of both (i.e., several dtd schemes + several free floating
custom tags with schema all mixed into one happy root).  If the consumer of
the file knows what they are looking at (either dtd or custom tag wise),
do it; otherwise ignore it.  Is it not this simple?
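
For a well-formed document read without validation, the "do it, otherwise ignore it" consumer can be sketched roughly as follows. This is a minimal Python illustration; the handler table and the tag names in the usage note are hypothetical.

```python
import xml.etree.ElementTree as ET


def consume(xml_text, handlers):
    """Apply a handler to every element whose tag is known and skip the
    rest. Returns (results, ignored_tags) in document order."""
    results = []
    ignored = []
    for el in ET.fromstring(xml_text).iter():
        handler = handlers.get(el.tag)
        if handler is None:
            ignored.append(el.tag)  # unknown markup: ignore it
        else:
            results.append(handler(el))  # known markup: do it
    return results, ignored
```

For example, with handlers for "title" and "video" only, a document such as <page><title>Hi</title><nav next="p2"/><video src="a.mpg"/></page> yields results for the title and the video, while "page" and "nav" are reported as ignored. The caveat is validation: if the document declares a DTD and a validating parser is used, elements that the DTD does not declare are validity errors rather than something to be quietly skipped.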

Your paragraph on "XML, by definition, is freely extensible ..." as well as
the following paragraph describes what I hope the XML Content Management
classes supplied by the ODBMS manufacturers would do for me.

I'm not sure if this is considered "overloading" the functionality of
Content Management, but I believe it is one of the concepts of XML. I not only
want the implied authoring flexibility of content management (arrange text,
video, audio, graphics etc. into segments and sub-segments) on the data
store side, but also to embed custom elements (in or around the displayable
elements) that determine some runtime programmatic behavior of the consumer
of the document.

As yet another overloading, secondary to the content management, I'm also
hoping that XML can be used in what you have implied might be an impure
way: as a query-and-response mechanism.  If I can avoid licensing yet
another product to get mine to market (i.e., objectspace, weblogic), or
coding to rmi or some other remoting technology, happy day!

Am I looking for the silver bullet that does not exist?

Chad


From s861766 at mail86.yzu.edu.tw  Fri Mar  5 09:31:53 1999
From: s861766 at mail86.yzu.edu.tw (Ephese Yang)
Date: Mon Jun  7 17:09:40 2004
Subject: A question about XSL/IE5...
Message-ID: <36DF9B9F.6A2087A4@mail86.yzu.edu.tw>

Hi:
I am new to XSL and I have some questions about XSL and IE5.
Does IE5 beta2 support the flow objects in the XSL spec?
    ex: fo:block
How can I display a figure in an XML file using XSL?
Can somebody give me an example?

Thanks!




From santi at qsystems.es  Fri Mar  5 09:59:10 1999
From: santi at qsystems.es (Santi)
Date: Mon Jun  7 17:09:40 2004
Subject: XML Tutorial.
Message-ID: <01BE66F7.1D57C840@Pc Santi.QSYSTEMS>

Hello,

I started with XML a few days ago.
If anybody knows of an XML tutorial, or any other way to get introduced to XML, I would be grateful.

Thank you very much in advance.

	Santi Rivas
	santi@qsystems.es




From david.hitch at dial.pipex.com  Fri Mar  5 10:21:55 1999
From: david.hitch at dial.pipex.com (David Hitchcock)
Date: Mon Jun  7 17:09:40 2004
Subject: XML tutorial
Message-ID: <01be66e9$3f1d23c0$0100007f@ketlux03>

Hi Santi

We have a number of resources including links to tutorials on the El.pub
website at: http://www.pira.co.uk/IE .  The XML material is on the standards
page: http://www.pira.co.uk/IE/top011a.htm and there is also a comprehensive
list of commercial and shareware products on the products page:
http://www.pira.co.uk/IE/base09.htm#SGML

You may also wish to sign up for the free weekly information service: El.pub
Weekly which keeps you informed on a weekly basis of updated news on the
site.  You can subscribe from the welcome page at: http://www.pira.co.uk/IE

The site is run by IESERV2 which supports the advanced electronic publishing
research and development projects throughout Europe, run by the Information
Engineering sector of the European Commission's DG XIII/E under the
Telematics Applications Programme.

Best

--> David

*********************************
David Hitchcock
IESERV2
tel:   +44/ (0)181 255 7084
       +44/ (0)181 255 7085
email: david.hitch@dial.pipex.com
web:   http://www.pira.co.uk/IE
*********************************

El.pub: http://www.pira.co.uk/IE
Interactive publishing - news and resources
**Join our developing community
subscribe to the *NEW* El.pub Weekly
a *free* text email update service which includes
the week's news items and associated URLs**




From jtauber at jtauber.com  Fri Mar  5 12:44:20 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:40 2004
Subject: XML Tutorial.
Message-ID: <003b01be6705$d5e059a0$0300000a@othniel.cygnus.uwa.edu.au>

>I started with XML a few days ago.
>If anybody knows of an XML tutorial, or any other way to get introduced
>to XML, I would be grateful.


see http://www.xmlinfo.com/newcomers/ for links introducing XML.

James




From robin at isogen.com  Fri Mar  5 12:47:13 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun  7 17:09:40 2004
Subject: XML Tutorial.
In-Reply-To: <01BE66F7.1D57C840@Pc Santi.QSYSTEMS>
Message-ID: <Pine.GSO.3.96.990305064333.17041B-100000@grind>


On Fri, 5 Mar 1999, Santi wrote:

> Hello,
> 
> I started with XML a few days ago.
> If anybody knows of an XML tutorial, or any other way to get introduced to XML, I would be grateful.


IBM has a nice XML tutorial at:

  http://www.software.ibm.com/xml/education/tutorial-prog/writing.html

You may also find other useful introductions in the list at:

  http://www.oasis-open.org/cover/xmlIntro.html

-robin




From mintert at irb.informatik.uni-dortmund.de  Fri Mar  5 13:44:01 1999
From: mintert at irb.informatik.uni-dortmund.de (Stefan Mintert)
Date: Mon Jun  7 17:09:40 2004
Subject: parsing spec.dtd & XML spec with nsgmls Re: W3C spec.dtd 
In-Reply-To: Your message of Wed, 03 Mar 1999 14:30:52 -0500.
             <f9903031429400056@inu.menteith.com> 
Message-ID: <199903051343.OAA07560@brown.informatik.uni-dortmund.de>


 > Add -c/tmp/sm/sp-1.3/pubtext/xml.soc to the command line so nsgmls
 > reads the xml.soc catalog that tells it to use the SGML Declaration
 > for XML, xml.dcl.  That SGML Declaration tells nsgmls what hexadecimal
 > character references look like.  Without it, things like &#x2014; are
 > being interpreted as per ISO 8879:1986, which isn't doing you or the
 > parser any good.
 > 
 > Regards,
 > 
 > 
 > Tony Graham


Thanks to everybody who answered my question. Thanks to Tony. Yes, you're
right, with -c... it works. I was a bit confused about that because I had
'Set the SGML_CATALOG_FILES environment variable to point to the file
pubtext/xml.soc' as explained in http://www.jclark.com/sp/xml.htm.

In fact I used my own old catalog file.
Now I checked the xml.dcl that I used and the one that is part of sp:
Unfortunately I used "ISO 8879:1986 (ENR)" instead of "ISO 8879:1986 (WWW)"
:-(


Thanks for your help!


Bye,

 Stefan.

+-----------------------------------------------------------+
  Stefan Mintert
       UniDo:    mintert@irb.informatik.uni-dortmund.de
       private:  stefan@mintert.com
+-----------------------------------------------------------+

        "let the music keep our spirits high..."

                                (Jackson Browne)
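
For reference, a hexadecimal character reference consists of the characters &#x, one or more hex digits, and a closing semicolon. The small Python sketch below uses the standard library's ElementTree parser rather than nsgmls, so it says nothing about how SP itself behaves; it only shows what an XML parser is expected to do with such a reference. The helper name resolve_text is an assumption made for this example.

```python
import xml.etree.ElementTree as ET


def resolve_text(xml_text):
    """Parse a small well-formed document and return the root element's text."""
    return ET.fromstring(xml_text).text


if __name__ == "__main__":
    # &#x2014; is a hexadecimal character reference to U+2014.
    dash = resolve_text("<p>&#x2014;</p>")
    print(hex(ord(dash)))  # prints 0x2014
```

The decimal form of the same reference, &#8212;, resolves to the same character.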



From simonstl at simonstl.com  Fri Mar  5 15:53:32 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:40 2004
Subject: XML MULTI-Fragment Interchange?
In-Reply-To: <004001be66a0$5aff8bd0$2ee044c6@arcot-main>
Message-ID: <199903051526.KAA17396@hesketh.net>

At 04:37 PM 3/4/99 -0800, Don Park wrote:
>>Of course, if you seriously believe that the spec is useless unless it
>>allows multiple fragment bodies per package, then that is a comment you
>>should make and attempt to support.  We don't want to come out with a
>>spec folks think is useless, but we were trying to keep it as minimal
>>as possible while still addressing the problem we defined as our scope.
>
>
>I found the spec very useful, timely, and clear.  It was not my intention to
>delay, divert, or hamper the progress of the XML Fragment spec.  It was also
>not my intention to imply that the WG overlooked something important.
>
>I withdraw my comment since it does not fall under the intended scope of the
>spec.

While you may be withdrawing the comment because of the scope the XML
Fragment group has set itself, we still need a way to represent multiple
fragments, whether or not the W3C considers that appropriate to the scope
of this particular working group.

Sounds like we need to get the XML streaming thread going again, and start
working out ways to represent multiple documents/fragments.  It seems like
a real need.

Is anyone interested in this issue going to be at XTech next week?  It'd be
culture shock to actually talk, I know, but that might be a good place to
get a spec for these streaming XML issues kickstarted.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com



From asmith at drumbeat.com  Fri Mar  5 17:19:34 1999
From: asmith at drumbeat.com (Smith, Adrian)
Date: Mon Jun  7 17:09:41 2004
Subject: Opinions requested
Message-ID: <70B92603FC2CD21197D600609778A80D0AE64D@elemental2>

There actually is an XDBMS, and it predates XML.  It dates back to around
1965/1966.  The database was called "IMS", for Information Management
System; it was created by IBM and used a hierarchical model for the
data.  It had all the same characteristics as XML, with almost the
exact same set of constructs and shortcomings.

Thanks!
Adrian

Worthless. 
	-Sir George Biddell Airy, KCB, MA, LLD, DCL, FRS, FRAS
(Astronomer Royal of Great Britain), estimating for the Chancellor of
the Exchequer the potential value of the "analytical engine" invented by
Charles Babbage, September 15, 1842. 


> -----Original Message-----
> From:	Jeffrey E. Sussna [SMTP:jes@kuantech.com]
> Sent:	Thursday, March 04, 1999 5:57 PM
> To:	'Chad Adams'; xml-dev@ic.ac.uk
> Subject:	RE: Opinions requested
> 
> I will not comment on the advisability of using an ODBMS, because 1)
> it's out of scope for this group, and 2) it's a highly religious
> topic. However, I will comment on the question of whether to store
> your data directly as XML, and confess that I don't understand the
> question. XML is a great interchange language; i.e., a way to move
> data between systems. Generally speaking, however, each particular
> system has its own optimal internal representation. In an RDBMS, for
> example, it's tables. In a Java program it's objects, and so forth.
> There is not (AFAIK) yet any such thing as an XDBMS (though you could
> consider a file system of XML documents plus a web server to resolve
> URL's to those documents as such a thing). Anyway, my approach would
> be to store data in the most natural format for the given storage
> technology, and define translations to and from XML to move data
> between systems.
> 
> Jeff
> 
> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf
> Of
> Chad Adams
> Sent: Thursday, March 04, 1999 5:17 PM
> To: xml-dev@ic.ac.uk
> Subject: Opinions requested
> 
> 
> Forgive me for the generic question, I'm to the point of betting the
> bank on XML, and I'm looking for a pat on the back, or a voice of
> warning....
> 
> We are starting from scratch on our next generation product, from what
> I've read and seen - xml seems to fit the bill (Content Management,
> mixed with WIDL RPC functionality seems right up our alley).  I'm
> looking hard at ODBMS systems and laying out the DB via xml (storing
> xml directly).  We have a wealth of in-house Java and COM/DCOM
> experience, but none with ODBMS or XML.
> 
> Do I understand correctly that, at an item level, I can:
> 	1. name it (URI)?
> 		a. possibly supply some security to it?
> 	2. revision it?
> 	3. meta-data it?
> 		a. can meta-data have meta-data?
> 
> Would I be foolish to base my whole object system storage on xml, or
> on ODBMS for that matter?  Are they cooked, are they ready for real
> world apps?
> 
> Once again, I'm sorry for the generic question, I have read the FAQ's,
> the ODBMS webpages, several books etc.  I'm looking for the advice of
> those in the trenches - Is it safe to make XML the foundation of my
> new product?
> 
> Should I grab a shovel, and jump in the trenches with you, or is this
> a deep dark hole?
> 
> 
> Thanks in advance, for all who might reply.
> 
> 
> Chad Adams
> Payback Training Systems
> Email: cadams@cascadecc.com
> Phone: 435-654-6304
> fax:   435-654-1482
> 
> 
> 



From Daniel.Veillard at w3.org  Fri Mar  5 17:20:30 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:09:41 2004
Subject: XML MULTI-Fragment Interchange?
In-Reply-To: <199903051526.KAA17396@hesketh.net>; from Simon St.Laurent on Fri, Mar 05, 1999 at 10:29:09AM -0500
References: <004001be66a0$5aff8bd0$2ee044c6@arcot-main> <199903051526.KAA17396@hesketh.net>
Message-ID: <19990305121926.E22737@w3.org>

On Fri, Mar 05, 1999 at 10:29:09AM -0500, Simon St.Laurent wrote:
> At 04:37 PM 3/4/99 -0800, Don Park wrote:
> >I withdraw my comment since it does not fall under the intended scope of the
> >spec.
> 
> While you may be withdrawing the comment because of the scope the XML
> Fragment group has set itself, we still need a way to represent multiple
> fragments, whether or not the W3C considers that appropriate to the scope
> of this particular working group.
> 
> Sounds like we need to get the XML streaming thread going again, and start
> working out ways to represent multiple documents/fragments.  It seems like
> a real need.

  Hum, I have been following the streaming/fragment thread. However I have
the feeling that even multiple fragment body extensions would not solve
the problem you were facing. If I didn't get the discussion wrong, it seems
that you rather tried to make one very big (i.e. stream) document from
multiple sources while the scope of the fragment work was just the opposite,
i.e. how to extract and ship a piece of a very big document.

> Is anyone interested in this issue going to be at XTech next week?  It'd be
> culture shock to actually talk, I know, but that might be a good place to
> get a spec for these streaming XML issues kickstarted.

  I will be around, 

Daniel 

-- 
	    [Yes, I have moved back to France !]
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.



From crism at oreilly.com  Fri Mar  5 17:26:09 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun  7 17:09:41 2004
Subject: A question about XSL/IE5...
In-Reply-To: <36DF9B9F.6A2087A4@mail86.yzu.edu.tw> (message from Ephese Yang
	on Fri, 05 Mar 1999 16:53:52 +0800)
Message-ID: <199903051532.KAA27149@ruby.ora.com>

[Ephese Yang]
> I am new to XSL and I have some questions about XSL and IE5.
> Does IE5 beta2 support the flow objects in the XSL spec?
>     ex: fo:block

IE5 does not support XSL formatting objects.  Tell Microsoft you are
interested that it do so.

XSL questions are best discussed on the xsl-list:
<URL:http://www.mulberrytech.com/xsl/xsl-list/>.

> How can I display a figure in an XML file using XSL?
> Can somebody give me an example?

Since MSIE can only display HTML, try creating an HTML <img> element
in your stylesheet.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>
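
The mapping suggested above (turn each figure in the XML source into an HTML <img> element) can be sketched as follows. This is Python standing in for the stylesheet, not XSL syntax, and the source vocabulary is an assumption: the element name figure and its src and caption attributes are hypothetical, since the original question does not show the document.

```python
import xml.etree.ElementTree as ET


def figures_to_html(xml_text):
    """Build a tiny HTML page with one <img> per <figure> in the source.

    The <figure src="..." caption="..."/> vocabulary is hypothetical.
    """
    root = ET.fromstring(xml_text)
    html = ET.Element("html")
    body = ET.SubElement(html, "body")
    for figure in root.iter("figure"):
        img = ET.SubElement(body, "img")
        img.set("src", figure.get("src", ""))
        img.set("alt", figure.get("caption", ""))
    # method="html" writes <img> as a void element with no closing tag.
    return ET.tostring(html, encoding="unicode", method="html")


if __name__ == "__main__":
    source = '<report><figure src="chart.gif" caption="Sales"/></report>'
    print(figures_to_html(source))
```

A stylesheet would express the same rule declaratively: match the figure element and emit an img element whose src is taken from the source attribute.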



From jmcdonou at library.berkeley.edu  Fri Mar  5 17:48:34 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun  7 17:09:41 2004
Subject: Opinions requested
In-Reply-To: <36DF4CE1.7F4D3681@simdb.com>
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
Message-ID: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>

At 02:17 PM 3/5/1999 +1100, Marcelo Cantos wrote:
>>"Jeffrey E. Sussna" wrote:
>>
>> There is not (AFAIK) yet any such thing as an XDBMS (though you could
>> consider a file system of XML documents plus a web server to resolve
>> URL's to those documents as such a thing).
>
>I am continually surprised to hear remarks such as this.  SIM _is_ an
>XDBMS (it is also an SGML, MARC, RTF, etc. database with structure and
>full content query capabilities).

I think one of the reasons you hear these kinds of remarks is that the
terminology surrounding these systems is used differently by different
folks.  For instance, from what I know of SIM, I wouldn't call it a DBMS
of any kind, as I don't believe (I could be wrong) it supports
referential integrity constraints, concurrency control, recoverable
transactions, and other features I would expect out of a reasonable
DBMS.  Granted it has hooks that allow you to get it to work with a DBMS
that can provide all that, but that doesn't make SIM itself a DBMS.  I
would instead class SIM as an information retrieval system, and a pretty
damned good one at that.  However, SIM performs as well as it does in
great part because it's not doing the extra work that a DBMS should do,
and which adds greatly to retrieval time from database systems (as well
as limiting their ability to handle complex data formats gracefully).

This isn't to knock SIM; anyone who needs a flexible information retrieval
system should be taking a very serious look at it.  The Z39.50 support alone
puts it way ahead of the market as far as I'm concerned.  But I don't think
SIM is evidence that there are DBMS systems that handle SGML/XML well; I don't
think they do.  Oracle may very well be getting there with its latest release,
but I suspect there's still a lot of work to be done there.


Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000    (510) 642-5168          |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /  SGNORMPF!!!
         -- From the Famous Last Words file        |    ||||



From tbray at textuality.com  Fri Mar  5 21:20:14 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:41 2004
Subject: Tell the world about your new language
Message-ID: <3.0.32.19990305131959.00b65280@pop.intergate.bc.ca>

Check out:

 http://www.usenix.org/events/dsl99/

 -Tim



From db at eng.sun.com  Fri Mar  5 22:41:58 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:41 2004
Subject: ModSax Suggestion
References: <d32bfc76.36df36fb@aol.com>
Message-ID: <36E05C67.607F4C27@eng.sun.com>

Interesting suggestion for a big hole in the parts of
the Java API set that are more or less "standard" at
this point -- SAX and DOM.

One comment though:  I've found that it's important to
be able to have options controlling how the DOM tree is
built.  For example, whether to discard ignorable spaces,
or do namespace conformance enforcement, or try to get
CDATA sections (comments, etc).

Accordingly, I think being able to do a bit more than
this will be important.

- Dave


MikeDacon@aol.com wrote:
> 
> Hi Everyone,
> 
> While SAX does a good job as an event-based interface
> to Parsers, it would be nice to add a few methods to
> receive a DOM representation back from a reference to an org.xml.sax.Parser.
> 
> Something like:
> 
> org.w3c.dom.Document  parse(InputSource  is, boolean events) throws
> SAXException;
> org.w3c.dom.Document  parse(java.lang.String uri, boolean events) throws
> SAXException;
> /* the events boolean would be to turn on/off event calls. */
> 
> If a SAXDriver did not want to produce a DOM, it could either simply
> return a null or a method added like:
> 
> boolean isDomCapable();
> 
> The above would let me use the ParserFactory to seamlessly switch
> between Parser implementations and get a DOM tree without building
> one myself.  It is fruitless for me to build a DOM tree when almost all
> the parser implementations provide that ability.  I just want a way to get
> at that functionality in a simple and standard way (thus SAX).
> 
> Thoughts?
> 
>  - Mike
> -----------------------------------------------
> Michael C. Daconta
> Author of Java 2 and JavaScript for C/C++ Programmers
> Author of C++ Pointers and Dynamic Memory Management
> Sun Certified Java Programmer and Developer
> http://www.gosynergy.com
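
To make the two positions above concrete (a standard way to get a DOM tree from any SAX parser, plus options controlling how the tree is built), here is a minimal sketch using Python's standard xml.sax and xml.dom.minidom modules instead of the Java interfaces under discussion. The names DomBuilder, parse_to_dom and keep_whitespace are assumptions made for this example and are not part of SAX or DOM. The whitespace option only approximates "discard ignorable spaces": it drops whitespace-only character events without consulting a DTD.

```python
import xml.sax
from xml.dom import minidom


class DomBuilder(xml.sax.ContentHandler):
    """SAX content handler that assembles a minidom tree from events."""

    def __init__(self, keep_whitespace=True):
        super().__init__()
        self.keep_whitespace = keep_whitespace
        self.document = minidom.Document()
        self._stack = [self.document]          # currently open nodes

    def startElement(self, name, attrs):
        element = self.document.createElement(name)
        for key in attrs.getNames():
            element.setAttribute(key, attrs.getValue(key))
        self._stack[-1].appendChild(element)
        self._stack.append(element)

    def endElement(self, name):
        self._stack.pop()

    def characters(self, content):
        # Build option: skip whitespace-only text when asked to.
        if not self.keep_whitespace and not content.strip():
            return
        self._stack[-1].appendChild(self.document.createTextNode(content))


def parse_to_dom(xml_bytes, keep_whitespace=True):
    """Run whatever SAX parser xml.sax selects and return the built Document."""
    builder = DomBuilder(keep_whitespace)
    xml.sax.parseString(xml_bytes, builder)
    return builder.document


if __name__ == "__main__":
    doc = parse_to_dom(b'<a>\n  <b x="1">hi</b>\n</a>', keep_whitespace=False)
    print(doc.documentElement.toxml())
```

Because the builder is only an event handler, swapping the underlying parser does not change the calling code, which is the convenience the proposal asks for; the constructor argument is where further build options (comments, CDATA sections, namespace checks) would go.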
> 



From zmin at atpage.com  Sat Mar  6 01:04:43 1999
From: zmin at atpage.com (min zheng)
Date: Mon Jun  7 17:09:41 2004
Subject: Accessing DTD info. in IE5
References: <001701be65c7$febb5620$47be1990@mscardin-pc.us.oracle.com>
Message-ID: <002d01be676d$eee8e850$f66f6f0a@atpage>

Is DTD information accessible through the IE5 DOM? I took it for granted because
I could do it with the old MSXML for Java used in IE4. However, when I really
wanted to access DTD info in IE5, I couldn't find it anywhere. Is DTD
information exposed in the IE5 DOM?

Thanks,
Min




From marcelo at mds.rmit.edu.au  Sat Mar  6 06:45:35 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:41 2004
Subject: Opinions requested
In-Reply-To: <36DF864B.B458299D@fiduciary.com>; from W. E. Perry on Fri, Mar 05, 1999 at 02:22:51AM -0500
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <36DF864B.B458299D@fiduciary.com>
Message-ID: <19990306153959.A22308@io.mds.rmit.edu.au>

Thank you, Walter, for the erudite response.  I am left in a bit of a
quandary as to how or even whether to respond.  This is in large part
due to the fact that, while your post was in response to mine, it is
not immediately clear to me whether you are addressing my comments
specifically or rather the general theme of this thread.

Having the vague impression (though no firm conviction) that it is in
response to my claims that you waxed eloquent on the theme of what
defines an XML database, I will proceed to provide commentary, and
occasionally direct response/rebuttal, to a smattering of your points.

My humble apologies, Walter, if I have in any way misconstrued your
post.


On Fri, Mar 05, 1999 at 02:22:51AM -0500, W. E. Perry wrote:
> Marcelo Cantos wrote:
> 
> > "Jeffrey E. Sussna" wrote:
> >
> > > There is not (AFAIK) yet any such thing as an XDBMS
> >
> > I am continually surprised to hear remarks such as this.  SIM _is_
> > an XDBMS (it is also an SGML, MARC, RTF, etc. database with
> > structure and full content query capabilities).  As an XDBMS it
> > has weaknesses (it only supports predefined indexes and limited
> > structure querying), but in some ways provides a model that is
> > even richer than XML (it provides structure below element level,
> > and has the concept of fields
> 
> In addition to this vision of an XML database, there has been much
> discussion of XML as a front end or a query-and-response framework
> for data stores, but I would argue that such applications of XML
> markup are not an XML database. A true XML database is shaped by the
> essential characteristics of XML itself: it should be freely
> eXtensible; it should be defined and manipulated by Markup; and it
> should be cast in a Document Structure within which Elements
> identify Data Constructs, and Attributes provide Data
> Characterization.

It seems here that I may have provided an incorrect characterisation
of what we do, and hence given Walter cause to provide some qualifiers
on anyone wishing to define themselves as an XML database.

On this point, I must make it quite clear that SIM is _not_ an XML
front end to a data store.  It is an XML (etc.) document repository.

One additional, crucial point is that SIM _is_ extensible (though I
will qualify this presently).  It can be defined to accept markup to
any degree of strictness or laxity (within the bounds of
well-formedness or validity, of course).  It can be set up to accept
any and all markup and do _something_ intelligent with it.  It can
also be configured to make stringent demands (well in excess of the
DTD, both with respect to strictness and complexity of constraints) of
its inputs.

This quality of SIM renders the product amenable to both of the
major application streams of XML: data and documents.  It can provide
strict data validation as well as extensibility.

Now, by way of qualification, SIM does not provide free-form runtime
extensibility (runtime from the administrator's perspective, not
ours).  Rather it provides the application developer with the
requisite tools to define, at design time, what structures will be
supported.  For instance, you cannot, with SIM, perform queries such
as, "find me all sections containing subsections with an attribute of
security="public" and at least one paragraph with fewer than four
words in it".  The semantic complexity of such a query is beyond the
scope of our product.  However, if one were to know in advance that
queries about the minimum paragraph length in public subsections will
be commonplace in the particular application one is developing, then
SIM could, at design time, be told to create an appropriate index and
then the above query could, indeed, be performed.
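
The distinction drawn above, between an ad hoc structural query and an index defined at design time, can be sketched in a few lines of Python. This is not how SIM works internally; it is only an illustration, and the element names section, subsection and para, the id attribute on sections, and the function names are all assumptions.

```python
import xml.etree.ElementTree as ET


def min_paragraph_words(subsection):
    """Word count of the shortest paragraph in a subsection, or None."""
    counts = [len("".join(p.itertext()).split()) for p in subsection.iter("para")]
    return min(counts) if counts else None


def build_index(root):
    """'Design time': for each section, record the shortest paragraph
    length found in any of its public subsections."""
    index = {}
    for section in root.iter("section"):
        lengths = []
        for sub in section.findall(".//subsection[@security='public']"):
            shortest = min_paragraph_words(sub)
            if shortest is not None:
                lengths.append(shortest)
        if lengths:
            index[section.get("id")] = min(lengths)
    return index


def query(index, max_words=4):
    """'Run time': sections with a public subsection holding at least one
    paragraph of fewer than max_words words. Only the index is consulted."""
    return sorted(sid for sid, shortest in index.items() if shortest < max_words)


if __name__ == "__main__":
    doc = ET.fromstring(
        "<doc>"
        '<section id="s1"><subsection security="public">'
        "<para>Too short here</para><para>This one is long enough</para>"
        "</subsection></section>"
        '<section id="s2"><subsection security="private">'
        "<para>Short</para></subsection></section>"
        "</doc>"
    )
    print(query(build_index(doc)))
```

The trade-off mirrors the one described in the post: the query is cheap once the index exists, but only questions anticipated when the index was defined can be answered this way.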

In short, SIM _is_ extensible, but the extensibility is bound somewhat
earlier than runtime.  In practice, clients never complain about this
quality.  In fact, it is usually a benefit rather than a hindrance,
for the same reason that compile time type checking is a good thing to
have in a programming language.

I also take issue with Walter's remark that an XML database should be
manipulated by and defined through the medium of XML.  This sounds
analogous to suggesting that relational databases should be defined
and manipulated by markup.  Now, it is true that relational schemas
are, themselves, typically stored as relations (one will, for example,
find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc.
inside a database).  However, it seems to me patently absurd to
suggest that SQL (whether DML or DDL) be expressed in terms of tuples
and relations.  Now, while it does not seem likewise absurd to suggest
that XML queries and data definition constructs be defined as XML, the
truth of such a suggestion is anything but self-evident.  Why should
one not use an SQL-like language to define and query XML databases?
There may or may not be merit in such an approach, but it seems no
more or less appropriate than a query/data definition language cast in
XML.  Indeed, many of the query language position papers at W3C do not
use XML syntax.  Data definition and query languages are
meta-constructs.  They are not part of the data, but rather operate on
the data and structures.  This suggests that while it may be possible
to fold the system in on itself by expressing meta-structure as data,
it would be unwise to proceed down this path in _a priori_ fashion.
(Now, have I completely missed Walter's point here?  I'm not sure.)

> Like XML itself, the XML database is fundamentally mismatched to the
> familiar storage and transmission frameworks of filesystem,
> relational table, object serialization or data stream. In the first
> case, any item--document, data table, or executable--whether 'text'
> or binary--which is committed to storage in a filesystem is treated
> as a file:  that is, as unitary and indivisible within the
> perspective and capabilities of the filesystem. A word processing
> program may, by opening a document, be able to identify and to
> manipulate as individual elements the sentences, paragraphs and
> chapters of that document.  By contrast, the filesystem in which
> that document is stored reads, writes, renames, searches for or
> deletes the document as a whole. In XML terms, the filesystem sees
> the document as a single element--a root. Regardless of how many
> subelements we might mark up within that <root>, the
> filesystem--designed for a generic 'file-like' document, is capable
> of manipulating only one.

One must be careful, here, to discriminate between interfaces and
implementations.  I basically agree with all of Walter's points in the
above paragraph, but would add that many systems store conceptual XML
documents as files.  Our system uses a highly tuned variable length
record manager (unsurprisingly named the VLRM) to store documents and
fragments of any size in a highly efficient manner (both in terms of
size and speed).  Consequently, we store entire documents for the most
part.  If parsing time starts to weigh heavily due to retrieval of
excessively large documents (the entire Australian Tax Legislation,
say, or a complete Boeing Aircraft Maintenance Manual), then we
fragment the documents to a level where parsing is no longer a
bottleneck.

In all of this, however, SIM can always treat the XML as XML.  The
developer always sees trees, not files or BLOB's.  It doesn't matter
how it is stored in the background; that is an implementation issue.
The one caveat with our product is that fragmented documents cannot be
treated as a conceptual whole without physically rejoining the parts.
This is one thing which OODBMS's do better than us at present, though we
are looking at ways to provide that additional level of abstraction
(we are also considering the usefulness of doing so, since fragments
are more commonly the unit of interest, rather than the entire
document).

> In the terms of both filesystem and relational table, an XML
> document is effectively a BLOB, in that its specifically XML
> structure is outside the ability of either to discern or to make any
> use of. Just as, for example, with audio or video content more
> commonly recognized as BLOBs, the filesystem or relational database
> engine is obliged to invoke a particular, content-specific processor
> in order to understand, and then to implement, the structure
> conveyed by markup in every XML document. Yet this need for
> pre-defined, content-specific handlers obviates the benefits of XML
> as a general solution. Indeed, it is not really XML at all if the
> markup possibilities are circumscribed by the need to conform to
> what a pre-defined handler can implement.

I disagree with the last sentence above.  Not from the pedagogical
perspective (which seems quite evident in Walter's prose, and with
which I largely sympathise), but from the pragmatic perspective.  Yes,
the purist will rightly decry the notion of predefinition of structure
in an ostensibly XML-friendly environment, but the end-user comes
along and not only accepts, but vociferously demands that his
environment be constrained.  The user doesn't want flexibility to
store anything, she wants the flexibility only to store what she wants
to store.

The serious user of XML does not have a heterogeneous collection of
vaguely defined documents with a motley crew of DTD's and well-formed
markup.  Most users have a well defined data set for which they want
to define efficient structures for storage and retrieval (if they
aren't interested in efficiency then their problem isn't particularly
interesting -- any tool will do).  In the few cases where they do have
arbitrary structure to deal with, more often than not they are only
interested in the content and are likely to throw the structure away.
After all, what is the use of structure if you don't know, say,
whether the prolog element contains an abstract element, or whether
"date" attributes refer to creation time, last modification time, or
effectivity (or, worse still, whether they are in U.S., Australian or
international format)?  In the real world, I suspect that cases where
structure is arbitrary but important will be few and far between.
This is borne out by the almost complete absence of demand for
arbitrary structure querying capability from our clients or potential
clients.  It just never seems to be an issue.

A qualifier is also in order for the above remarks, lest there be a
misunderstanding.  XML tools, in general, must be extensible and
accept any and all valid and/or well-formed inputs.  My comments
specifically address the issue of repositories (DBMS's).  XML may be
extensible, but it, too, expresses the notion of constraint through
the concept of DTD's.  Databases, likewise, not only can, but should
constrain the inputs, both for simplicity and efficiency.  Perhaps
this is, after all, what Walter meant when repudiating the idea of
predefined handlers.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From marcelo at mds.rmit.edu.au  Sat Mar  6 08:46:24 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:41 2004
Subject: Opinions requested
In-Reply-To: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>; from Jerome McDonough on Fri, Mar 05, 1999 at 09:37:29AM -0800
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
Message-ID: <19990306154022.B22308@io.mds.rmit.edu.au>

On Fri, Mar 05, 1999 at 09:37:29AM -0800, Jerome McDonough wrote:
> At 02:17 PM 3/5/1999 +1100, Marcelo Cantos wrote:
> >>"Jeffrey E. Sussna" wrote:
> >>
> >> There is not (AFAIK) yet any such thing as an XDBMS (though you
> >> could consider a file system of XML documents plus a web server
> >> to resolve URL's to those documents as such a thing).
> >
> >I am continually surprised to hear remarks such as this.  SIM _is_
> >an XDBMS (it is also an SGML, MARC, RTF, etc. database with
> >structure and full content query capabilities).
> 
> I think one of the reasons you hear these kinds of remarks is that
> the terminology surrounding these systems is used differently by
> different folks.  For instance, from what I know of SIM, I wouldn't
> call it a DBMS system of any kind, as I don't believe (I could be
> wrong) it supports referential integrity constraints, concurrency
> control, recoverable transactions, and other features I would expect
> out of a reasonable DBMS.  Granted it has hooks that allow you to
> get it to work with a DBMS that can provide all that, but that
> doesn't make SIM itself a DBMS.  I would instead class SIM as an
> information retrieval system, and a pretty damned good one at that.
> However, SIM performs as well as it does in great part because it's
> not doing the extra work that a DBMS should do, and which add
> greatly to retrieval time from database systems (as well as limiting
> their ability to handle complex data formats gracefully).

Thank you, Jerome, for the candid and quite fair assessment of SIM.

On the point of referential integrity, you are quite right: there is
no built-in support.  With our new event hook mechanism (similar to
the triggers found in most relational systems), however, one will be
able to attach event handlers to various update operations, and
prevent them from completing in the event of a referential integrity
violation.  This probably wouldn't work together with concurrency
controls (though this will be moot when transaction support comes
in).
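
To make the hook idea concrete, here is a minimal sketch in Python.
It is not SIM's event hook interface; the Store class, the on_update
registration call and the "refs" field are all assumptions invented
for the example.  A handler sees each pending update and vetoes it by
raising an exception.

```python
class IntegrityViolation(Exception):
    """Raised by a hook to stop an update from completing."""

class Store:
    """Toy record store with pre-update hooks (a stand-in, not SIM's API)."""

    def __init__(self):
        self.records = {}
        self.hooks = []

    def on_update(self, hook):
        self.hooks.append(hook)

    def put(self, key, record):
        # Every hook sees the pending update and may veto it by raising.
        for hook in self.hooks:
            hook(self, key, record)
        self.records[key] = record

def check_references(store, key, record):
    """Reject a record that refers to a key the store does not hold."""
    for ref in record.get("refs", []):
        if ref not in store.records:
            raise IntegrityViolation(f"{key} refers to missing record {ref}")

store = Store()
store.on_update(check_references)
store.put("act-1", {"title": "Tax Act", "refs": []})
store.put("reg-1", {"title": "Regulation", "refs": ["act-1"]})
try:
    store.put("reg-2", {"title": "Orphan", "refs": ["act-9"]})
except IntegrityViolation as err:
    print("rejected:", err)
```

Note that the check and the write are not atomic here, which is
exactly why such a hook sits uneasily with concurrent updates.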

However, in one particular project, we have put in referential
integrity control using a single query per reference as part of the
check-in mechanism.  Another project only generates references
dynamically at query time, effectively with a single reverse-reference
index lookup.  The problem with referential integrity checking is that
sometimes you need to be able to manage broken data, and this is more
often the case with documents than with the more typical applications
of RDBMS technology (financial transactions etc).  Of
course when you store whole documents instead of unnaturally breaking
them up into millions of tiny pieces, you don't have nearly the same
referential integrity problems in the first place.

With respect to concurrency control you are mistaken.  We support
short term locks, which prevent individual records, at least, from
ever entering an undefined state under concurrent loads.  These locks
can be held as long as desired, but cannot persist beyond the lifetime
of a session.   Long term locks (which outlive the session) are in the
offing, and stand a good chance of getting into release 3.0 (scheduled
for mid-year, I think -- it could be earlier).

Transactions we most definitely do not support.  We do, however,
provide recovery through log files, which record server activity and
can be played back in a batch load operation.  It's a little crude
(you make the server read-only, back it up, and start a new log file.
When you crash, restore the last backup and replay the log) but it is
safe and effective.
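
The backup-and-replay cycle can be sketched as follows, in Python.
This is an illustration only, not SIM's log format or tooling: the
LoggedStore class, the JSON log entries and the recover function are
assumptions, and the read-only step is left out for brevity.

```python
import copy
import json

class LoggedStore:
    """Toy store that records every update in an append-only log."""

    def __init__(self):
        self.data = {}
        self.log = []  # stand-in for a log file on disk

    def put(self, key, value):
        self.log.append(json.dumps({"op": "put", "key": key, "value": value}))
        self.data[key] = value

    def delete(self, key):
        self.log.append(json.dumps({"op": "delete", "key": key}))
        self.data.pop(key, None)

    def backup(self):
        """Snapshot the data and start a new, empty log."""
        snapshot = copy.deepcopy(self.data)
        self.log = []
        return snapshot

def recover(snapshot, log):
    """Restore the last backup, then replay the log as a batch load."""
    store = LoggedStore()
    store.data = copy.deepcopy(snapshot)
    for line in log:
        entry = json.loads(line)
        if entry["op"] == "put":
            store.data[entry["key"]] = entry["value"]
        else:
            store.data.pop(entry["key"], None)
    return store

live = LoggedStore()
live.put("doc1", "<a/>")
snapshot = live.backup()
live.put("doc2", "<b/>")
live.delete("doc1")
surviving_log = list(live.log)  # pretend the server crashed at this point
restored = recover(snapshot, surviving_log)
print(restored.data)
```

Recovery is only as complete as the log that survives the crash, which
is what separates this from full transaction support.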

More important than any specifics, however, is the issue of what you
call a DBMS.  To me, a DBMS is a database management system (seems
painfully obvious, but I think it bears repeating).  You may argue
that a product is not a DBMS if it does not support feature X, and I
don't entirely disagree.  When one talks of a DBMS one is conjuring up
a certain image in the mind of the listener, and that image may well
include feature X.  To be fair to SIM, however, the essence of a DBMS
is that it manages a collection of data.  If it doesn't support
transactions, this does not entail that it does not manage data.
Rather it simply has limits on the way the data is managed (i.e. it
doesn't manage data as well as one would like).

You clearly believe that transaction support is part of the essence of
what makes a DBMS.  I disagree; indeed, I profoundly disagree.  There
is nothing in the concept of a database that mandates any such
requirement.  Rather I would say that transaction support is an
important issue for any _good_ DBMS.  Likewise for referential
integrity and concurrency (and, for that matter, support for
declarative queries, use of indexes, a rich set of fundamental data
types, etc.).  If I recall correctly, dBase III was generally
acknowledged to be a DBMS though it lacked most of these requirements,
and could barely even call itself relational!

Now, don't get me wrong here.  I am not trying to defend SIM by
deprecating the features you demand.  They are very important and
highly desirable features in a DBMS (the fact that they are amazingly
difficult to do well is of no concern to the user).  Their absence in
SIM is of ongoing concern to us.  Furthermore it is far from
satisfying to be able to insist that SIM fits into a strict,
minimalist definition of a DBMS if it lacks features that are
typically associated with DBMS's.  One of the primary reasons they are
not in at this stage is that, as you pointed out so well, the primary
focus of SIM has always been performance and scalability; and all of
the aforementioned features can have a significant impact on
performance if implemented naively (transaction support, in
particular, is an onerous requirement, though by no means untenable).

SIM is not a full featured DBMS.  But it is not a mere information
retrieval system either.  It does support recovery (though not full
transaction support), it does support concurrency, and it can be
coerced to support referential integrity.  It also bears mentioning
that you don't have to talk out to an RDBMS to do any of these things.
In fact the only use I have heard of for our ODBC capability is one
client who wanted to access a personnel database for authentication
purposes (it had nothing to do with the database server per se).

I guess this all boils down to what's in a name.  At the end of the
day, it is far more important to know what a product does and does not
do than what you call it.

> This isn't to knock SIM; anyone who needs a flexible information
> retrieval system should be taking a very serious look at it.  The
> Z39.50 support alone puts it way ahead of the market as far as I'm
> concerned.  But I don't think SIM is evidence that there are DBMS
> systems that handle SGML/XML well; I don't think they do.  Oracle
> may very well be getting there with its latest release, but I
> suspect there's still a lot of work to be done there.

I am sceptical that any RDBMS vendor can come to the party in terms of
performance.  Past attempts to try to force text into a relational,
table or object based paradigm have not reaped great success (Oracle's
ConText comes to mind as an example of how forcing a square peg into a
round hole requires sacrificing the edges of performance).  I would be
surprised if any of the major database vendors would be prepared to
venture away from their core competency (the relational model) to
address the performance issues.

But why parse XML to split it up into tables when you can store the
XML directly?  Why build thousands of index entries to system
generated element ID's so that you can do joins to build up an XML
fragment, when you can build a single index and pull the fragment in
its entirety out of the document from which it comes?  Why use
inferior content indexing technology taking 10 to 20 times the
size of the data being indexed when you can use compressed inverted
files which take between 15% (document level index) and 50%
(multi-level word position index) the size of the data?  And all this
with faster update speed than many standard text retrieval systems.
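
For readers unfamiliar with compressed inverted files, here is a
minimal document-level sketch in Python.  It is not SIM's index
format, and it makes no claim about the size figures quoted above;
the choice of gap encoding plus variable-byte codes, and all function
names, are assumptions picked because they are a common textbook way
of compressing postings lists.

```python
from collections import defaultdict

def vbyte_encode(numbers):
    """Variable-byte code: 7 data bits per byte, high bit set on the
    final byte of each number."""
    out = bytearray()
    for n in numbers:
        chunk = [n & 0x7F]
        n >>= 7
        while n:
            chunk.append(n & 0x7F)
            n >>= 7
        chunk[0] |= 0x80  # flag the least significant group, emitted last
        out.extend(reversed(chunk))
    return bytes(out)

def vbyte_decode(data):
    numbers, n = [], 0
    for byte in data:
        n = (n << 7) | (byte & 0x7F)
        if byte & 0x80:  # last byte of this number
            numbers.append(n)
            n = 0
    return numbers

def build_index(docs):
    """Map each word to its sorted document ids, stored as encoded gaps."""
    postings = defaultdict(list)
    for doc_id in sorted(docs):
        for word in set(docs[doc_id].lower().split()):
            postings[word].append(doc_id)
    compressed = {}
    for word, ids in postings.items():
        gaps = [ids[0]] + [b - a for a, b in zip(ids, ids[1:])]
        compressed[word] = vbyte_encode(gaps)
    return compressed

def lookup(index, word):
    """Decode the gaps for one word back into document ids."""
    ids, total = [], 0
    for gap in vbyte_decode(index.get(word.lower(), b"")):
        total += gap
        ids.append(total)
    return ids

docs = {1: "tax law section", 130: "aircraft maintenance manual", 400: "tax manual"}
index = build_index(docs)
print(lookup(index, "tax"))
print(lookup(index, "manual"))
```

Storing gaps rather than absolute ids keeps the numbers small, and
small numbers are what a variable-byte code stores cheaply.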

There is an additional overhead in the relational paradigm which has
nothing to do with transactions, concurrency control, or referential
integrity checking.  That cost is that relational tables do not map
cleanly onto hierarchical documents (or data collections to pick up on
another thread).  Every fragment you insert, update, or remove has to
be taken apart to map it onto some underlying representation, modified
piece by piece, and then reassembled to be delivered.

I strongly disagree that SIM doesn't handle SGML/XML well.  In the
five years of successfully selling SIM, no customer has ever replaced
SIM with another product. In fact none of them have even mentioned to
us that they ever considered replacing SIM.  This in itself is
remarkable given that, because our customers use SIM to store their
SGML/XML natively, they can get the data out of SIM much more easily
than if it were mapped onto some proprietary internal database format.
People buy SIM because it is flexible enough to do whatever they need
to do with their XML/SGML.  It doesn't force them to adopt a
non-XML/SGML approach.  It doesn't force them to translate their data
into some proprietary format in order to interact with the data.  It
deals directly with the XML.  Precisely what the original post was
asking for, in fact.


Cheers,
Marcelo

P.S.: Some thanks go to my colleague, Tim Arnold-Moore, for providing
some of the content (including the closing) for this article.

-- 
http://www.simdb.com/~marcelo/



From david at megginson.com  Sat Mar  6 11:31:29 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:42 2004
Subject: Opinions requested
In-Reply-To: <19990306154022.B22308@io.mds.rmit.edu.au>
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
	<36DF4CE1.7F4D3681@simdb.com>
	<3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
	<19990306154022.B22308@io.mds.rmit.edu.au>
Message-ID: <14049.4226.895273.99370@localhost.localdomain>

Marcelo Cantos writes:

 > More important than any specifics, however, is the issue of what you
 > call a DBMS.  To me, a DBMS is a database management system (seems
 > painfully obvious, but I think it bears repeating).  You may argue
 > that a product is not a DBMS if it does not support feature X [...]

A DBMS is something that manages data *and* passes the ACID test
(Atomicity, Consistency, Isolation and Durability).  This isn't a
question of "I want feature X" -- the ACID test is what distinguishes
a DBMS from, say, the Unix file system (which can also manage data).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From wperry at fiduciary.com  Sat Mar  6 15:09:50 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun  7 17:09:42 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
		<36DF4CE1.7F4D3681@simdb.com>
		<3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
		<19990306154022.B22308@io.mds.rmit.edu.au> <14049.4226.895273.99370@localhost.localdomain>
Message-ID: <36E14530.10423DC8@fiduciary.com>

David Megginson wrote:

> A DBMS is something that manages data *and* passes the ACID test
> (Atomicity, Consistency, Isolation and Durability).  This isn't a
> question of "I want feature X" -- the ACID test is what distinguishes
> a DBMS from, say, the Unix file system (which can also manage data).

I am going to be the old fogey here, with experience of databases going back to IMS and R:
ACID is (one possible) test of a transaction processor, not of a database. It was precisely
the misguided emphasis upon ACID qualities which bloated the relational model into the
transaction-oriented behemoths sold today. For at least ten years we have tried to undo that
direction by re-imagining the original relational concept as the data warehouse and, when that
too became too bloated, the data mart. There is an opportunity with a true XML database to
describe, and implement, transactions without surrendering to the siren song of two-phase
commit. The key is understanding that there is no obvious or natural boundary to a
transaction. Because of the inherent differences in the perspective of every participant to a
transaction, each of them will describe a different set of elements to the transaction and
different specific relationships among them. In the data world there is no omniscience which
sees the transaction whole:  to imagine it as a single, identifiably boundable unit is to
deprecate the central task of each participant--to construct a transaction which is
understandable to and processable by his own system. That is an ongoing implementational task,
not just a conceptual one. In the real world it resolves to this:  how do I get what I have to
become what you need? What I have and what you need are both structures, and the two of them
will incorporate some set of similar or analogous elements, which gives them the common terms
on which they can define and communicate the transaction which they are attempting to execute.
The definition and the maintenance of each of these structures is the role of the database.
Yet each of those structures is peculiarly unique, and both are ephemeral in the specific
terms of the transaction which they facilitate. Yes, the transaction, once executed, endures.
But the terms in which that durability is communicated--indeed the very substance as which it
is preserved--may be utterly different in the systems (and, I would hope, in the databases) of
each of the participants. Precisely what each of those systems, or databases, does not exhibit
are the ACID qualities through which some would hope to define the identity, uniqueness and
permanence of that transaction.

Respectfully,

Walter Perry




From wperry at fiduciary.com  Sat Mar  6 17:19:49 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun  7 17:09:42 2004
Subject: Opinions requested
References: <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com> <36DF4CE1.7F4D3681@simdb.com> <36DF864B.B458299D@fiduciary.com> <19990306153959.A22308@io.mds.rmit.edu.au>
Message-ID: <36E1639D.FDA85E9C@fiduciary.com>

Marcelo Cantos wrote:

> Thank you, Walter, for the erudite response.  I am left in a bit of a
> quandary as to how or even whether to respond.  This is in large part
> due to the fact that, while your post was in response to mine, it is
> not immediately clear to me whether you are addressing my comments
> specifically or rather the general theme of this thread.

Thank you for your kind words. I will confess that much of my post was addressed to the
general theme of the thread.

> On this point, I must make it quite clear that SIM is _not_ an XML
> front end to a data store.  It is an XML (etc.) document repository.

My naive reading of the SIM materials on your website leads me to this conclusion. I am glad
to have your confirmation of it. As a document repository SIM may more nearly compete with the
'grove minder' paradigm than with what I characterize as an XML database.

> One additional, crucial point is that SIM _is_ extensible (though I
> will qualify this presently).  It can be defined to accept markup to
> any degree of strictness or laxity (within the bounds of
> well-formedness or validity, of course).  It can be set up to accept
> any and all markup and do _something_ intelligent with it.  It can
> also be configured to make stringent demands (well in excess of the
> DTD, both with respect to strictness and complexity of constraints) of
> its inputs.

Granted. It is simply that I (perhaps perversely) have defined an XML database engine as one
which implements XML markup. My XML database engine is driven by the markup and must rework
the effective schema and re-cast its processing behavior in sync with changes to the document
instance markup.

> Now, by way of qualification, SIM does not provide free-form runtime
> extensibility (runtime from the administrator's perspective, not
> ours).  Rather it provides the application developer with the
> requisite tools to define, at design time, what structures will be
> supported.  For instance, you cannot, with SIM, perform queries such
> as, "find me all sections containing subsections with an attribute of
> security="public" and at least one paragraph with fewer than four
> words in it".  The semantic complexity of such a query is beyond the
> scope of our product.  However, if one were to know in advance that
> queries about the minimum paragraph length in public subsections will
> be commonplace in the particular application one is developing, then
> SIM could, at design time, be told to create an appropriate index and
> then the above query could, indeed, be performed.
>
> In short, SIM _is_ extensible, but the extensibility is bound somewhat
> earlier than runtime.  In practice, clients never complain about this
> quality.  In fact, it is usually a benefit rather than a hindrance,
> for the same reason that compile time type checking is a good thing to
> have in a programming language.

All of these are commendable design decisions. They are not, IMHO, realizations of the unique
qualities and potential of XML. On that, reasonable people may differ.

> I also take issue with Walter's remark that an XML database should be
> manipulated by and defined through the medium of XML.  This sounds
> analogous to suggesting that relational databases should be defined
> and manipulated by markup.

No, by relational schema, as you acknowledge in the next line.

>  Now, it is true that relational schema
> are, themselves, typically stored as relations (one will, for example,
> find a ".TABLES" table, a ".FIELDS" table, a ".INDEXES" table, etc.
> inside a database).  However, it seems to me patently absurd to
> suggest that SQL (whether DML or DDL) be expressed in terms of tuples
> and relations.  Now, while it does not seem likewise absurd to suggest
> that XML queries and data definition constructs be defined as XML, the
> truth of such a suggestion is anything but self-evident.  Why should
> one not use an SQL-like language to define and query XML databases?
> There may or may not be merit in such an approach, but it seems no
> more or less appropriate than a query/data definition language cast in
> XML.  Indeed, many of the query language position papers at W3C do not
> use XML syntax.  Data definition and query languages are
> meta-constructs.  They are not part of the data, but rather operate on
> the data and structures.  This suggests that while it may be possible
> to fold the system in on itself by expressing meta-structure as data,
> it would be unwise to proceed down this path in _a priori_ fashion

By following the path indicated by just such an a priori judgment I arrived at the conclusions
which I have shared with you. I am implementing the resulting design and, I suppose, the
almighty market will render the final verdict.

> The serious user of XML does not have a heterogeneous collection of
> vaguely defined documents with a motley crew of DTD's and well-formed
> markup.

That is exactly what I (and my customers, once we re-state their documents in various legacy
forms as XML) have to deal with. We process settlements of cross-border trades and the
regulatory reporting required by multiple overlapping legal jurisdictions. If I have advice of
a trade execution in the customary form used in, say, Djakarta, and the interested parties to
whom I must report it are a UK fiduciary, a Swiss depot bank, a US money manager and a Hong
Kong broker, as well as the various regulators which the involvement of each of those parties
entails, I must (in my opinion) drive the entire process off of a properly marked up document
which succinctly expresses the facts of the transaction reported. That document, received by
each of the interested parties, must be instantiated in the system--and I would hope the
database--of each in a form which may well require re-writing the schema upon which it will be
realized.

>  Most users have a well defined data set for which they want
> to define efficient structures for storage and retrieval (if they
> aren't interested in efficiency then their problem isn't particularly
> interesting -- any tool will do).  In the few cases where they do have
> arbitrary structure to deal with, more often than not they are only
> interested in the content and are likely to throw the structure away.

As I hope the use case fragment above illustrates, users may have very well defined
structures, well-suited to their specific needs. Those structures, however, may not
accommodate the instance documents which they receive as input data and which, in the
real-world examples I am familiar with, may exhibit differences of data structure on each
occasion.

Respectfully,

Walter Perry




From b.laforge at jxml.com  Sat Mar  6 20:47:45 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:42 2004
Subject: ModSax Suggestion
Message-ID: <003b01be6811$cc5974e0$c9a8a8c0@thing2>

Seems like a good fit for filters--drop what you don't
want, transform the rest as needed.
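
A minimal sketch of such a filter, using Python's xml.sax module
rather than the Java SAX interfaces under discussion.  XMLFilterBase
and XMLGenerator are real classes in xml.sax.saxutils; the
DropElementFilter class, the run_filter helper and the sample element
names are assumptions made up for the example.  It drops one element
type with its whole subtree and upper-cases the remaining text as a
stand-in for "transform the rest".

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator

class DropElementFilter(XMLFilterBase):
    """Drops one element type (and its subtree); upper-cases other text."""

    def __init__(self, parent, drop_name):
        super().__init__(parent)
        self.drop_name = drop_name
        self.depth = 0  # greater than zero while inside a dropped element

    def startElement(self, name, attrs):
        if self.depth or name == self.drop_name:
            self.depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self.depth:
            self.depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self.depth:
            super().characters(content.upper())

def run_filter(xml_text, drop_name):
    """Parse xml_text through the filter and return the serialized result."""
    out = io.StringIO()
    flt = DropElementFilter(xml.sax.make_parser(), drop_name)
    flt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    flt.parse(io.StringIO(xml_text))
    return out.getvalue()

result = run_filter(
    "<msg><debug>noise<x>y</x></debug><body>hello</body></msg>", "debug"
)
print(result)
```

Any content handler can sit downstream of the filter, including one
that builds a DOM tree, so the tree-building options stay outside the
parser itself.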

Bill

-----Original Message-----
From: David Brownell <db@eng.sun.com>
To: MikeDacon@aol.com <MikeDacon@aol.com>
Cc: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Friday, March 05, 1999 5:58 PM
Subject: Re: ModSax Suggestion


>Interesting suggestion for a big hole in the parts of
>the Java API set that are more or less "standard" at
>this point -- SAX and DOM.
>
>One comment though:  I've found that it's important to
>be able to have options controlling how the DOM tree is
>built.  For example, whether to discard ignorable spaces,
>or do namespace conformance enforcement, or try to get
>CDATA sections (comments, etc).
>
>Accordingly, I think being able to do a bit more than
>this will be important.
>
>- Dave
>
>
>
>MikeDacon@aol.com wrote:
>> 
>> Hi Everyone,
>> 
>> While SAX does a good job as an event-based interface
>> to Parsers, it would be nice to add a few methods to
>> receive a DOM representation back from a reference to an org.xml.sax.Parser.
>> 
>> Something like:
>> 
>> org.w3c.dom.Document  parse(InputSource  is, boolean events)
>>     throws SAXException;
>> org.w3c.dom.Document  parse(java.lang.String uri, boolean events)
>>     throws SAXException;
>> /* the events boolean would be to turn on/off event calls. */
>> 
>> If a SAXDriver did not want to produce a DOM, it could either simply
>> return a null or a method added like:
>> 
>> boolean isDomCapable();
>> 
>> The above would let me use the ParserFactory to seamlessly switch
>> between Parser implementations and get a DOM tree without building
>> one myself.  It is fruitless for me to build a DOM tree when almost all
>> the parser implementations provide that ability.  I just want a way to get
>> at that functionality in a simple and standard way (thus SAX).
>> 
>> Thoughts?
>> 
>>  - Mike
>> -----------------------------------------------
>> Michael C. Daconta
>> Author of Java 2 and JavaScript for C/C++ Programmers
>> Author of C++ Pointers and Dynamic Memory Management
>> Sun Certified Java Programmer and Developer
>> http://www.gosynergy.com
>> 
>> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
>> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
>> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
>> (un)subscribe xml-dev
>> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
>> subscribe xml-dev-digest
>> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)
>


From b.laforge at jxml.com  Sat Mar  6 20:57:14 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:42 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <004801be6813$22d36820$c9a8a8c0@thing2>

From: Daniel Veillard <Daniel.Veillard@w3.org>
>  Hum, I have been following the streaming/fragment thread. However I have
>the feeling that even multiple fragment body extensions would not solve
>the problem you were facing. If I didn't get the discussion wrong, it seems
>that you rather tried to make one very big (i.e. stream) document from
>multiple sources while the scope of the fragment work was just the opposite,
>i.e. how to extract and ship a piece of a very big document.


Actually, it sounds to me like the separation of physical and logical layers.

On the one hand, I have some data to move. Multiple documents, multiple fragments,
whatever. (logical)

On the other hand, I have a stream. It can pass any number of documents or fragments.
(physical)

The fragments in the stream could be all from one document or from different queries
on different documents or from one query applied to a set of documents. It shouldn't
matter. 

And how one might reassemble fragments back into a large document is another 
problem, though the stream should provide sufficient information to do so.

Bill


From jjc at jclark.com  Sun Mar  7 10:38:43 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:09:42 2004
Subject: New expat test release and FAQ
Message-ID: <36E253C5.E4301749@jclark.com>

A new expat test release is available at:

  ftp://ftp.jclark.com/pub/test/expat.zip

This adds handlers for namespace declarations; when namespace processing
is enabled these provide information about xmlns attributes.  This
release also fixes a few bugs.

I've also started an expat FAQ at:

  http://www.jclark.com/xml/expatfaq.html

Suggestions for additions are welcome.

James


From MikeDacon at aol.com  Sun Mar  7 13:02:43 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:42 2004
Subject: ModSax Suggestion
Message-ID: <d9b431a1.36e278a4@aol.com>

Hi Dave,

In a message dated 3/5/99 5:40:50 PM Eastern Standard Time, db@eng.sun.com
writes:
> Interesting suggestion for a big hole in the parts of
>  the Java API set that are more or less "standard" at
>  this point -- SAX and DOM.
>  
>  One comment though:  I've found that it's important to
>  be able to have options controlling how the DOM tree is
>  built.  For example, whether to discard ignorable spaces,
>  or do namespace conformance enforcement, or try to get
>  CDATA sections (comments, etc).
>  

I agree with that.  I think all that is possible while still retaining 
a minimalist design philosophy.  Something like:

void setDOMFeature(String feature, boolean val);
boolean getDOMFeature(String feature);

That way via an extensible common set of text properties we
can add properties as the need arises without expanding the API.
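
A minimal sketch of how such a pair of methods might behave, assuming
features are plain string keys held in a map.  The class name
DomFeatureSet and the feature names in the usage note are hypothetical;
nothing here is part of SAX or the DOM.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical holder for DOM-building options (an assumption, not a SAX/DOM type).
class DomFeatureSet {
    private final Map<String, Boolean> features = new HashMap<>();

    // Record a named option; any name is accepted and simply stored.
    void setDOMFeature(String feature, boolean val) {
        features.put(feature, val);
    }

    // Options that were never set read as false.
    boolean getDOMFeature(String feature) {
        return features.getOrDefault(feature, false);
    }
}
```

After setDOMFeature("discard-ignorable-whitespace", true), getDOMFeature
returns true for that name and false for any name that was never set,
so new options can be added without new methods.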

Looking forward to progress on the Java XML API.  BTW, Dave,
are you going to do a "Birds of a Feather" session on XML at this year's
JavaOne?  I think that could be valuable.

Best wishes,

 - Mike
-----------------------------------------------
Michael C. Daconta
Author of Java 2 and JavaScript for C/C++ Programmers
Author of C++ Pointers and Dynamic Memory Management
Sun Certified Java Programmer and Developer
http://www.gosynergy.com


From MikeDacon at aol.com  Sun Mar  7 13:15:25 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:42 2004
Subject: ModSax Suggestion
Message-ID: <4d147c5b.36e27b77@aol.com>

In a message dated 3/6/99 4:03:25 PM Eastern Standard Time, b.laforge@jxml.com
writes:
> Seems like a good fit for filters--drop what you don't
>  want, transform the rest as needed.
>  

I think Bill has brought up an excellent point.  In fact, I like that
suggestion better than my setFeature() method.  It seems to me
that the central tension of API design is whether to expand the
API or relegate functionality to be handled by a higher-level layer
of software.

As for my original suggestion, getting access to a DOM seems
appropriate as part of SAX (a low-level layer), while transforming
the resultant tree is relegated to a higher-level layer.

While I have certainly written gobs of enterprise level software, 
my experience with formal APIs is limited -- does this track with
those of you with more API building experience?

Best wishes,

 - Mike


From Mark.Birbeck at iedigital.net  Sun Mar  7 16:10:02 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun  7 17:09:42 2004
Subject: XML MULTI-Fragment Interchange?
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054A5E@SOHOS002>

Bill wrote:
> From: Daniel Veillard <Daniel.Veillard@w3.org>
> > Hum, I have been following the streaming/fragment thread. 
> > However I have the feeling that even multiple fragment body
> > extensions would not solve the problem you were facing. If
> > I didn't get the discussion wrong, it seems that you rather
> > tried to make one very big (i.e. stream) document from
> > multiple sources while the scope of the fragment work was 
> > just the opposite, i.e. how to extract and ship a piece of
> > a very big document.
>
[snip]
> And how one might reassemble fragments back into a large 
> document is another 
> problem, though the stream should provide sufficient 
> information to do so.

I think Daniel's point is simply that in many situations you may not
want to reconstruct the 'large' document. The fragment work seems to
relate to providing context to a fragment, such that reasonable work can
be done on it. That's not the same - although related - as shipping one
great big document in a number of packages.

On the theme of multi-fragments, I think the simplest increment from
where we are now is to allow for the results set of a query that spans
different levels of a tree. I was previously exporting from queries
using a simple wrapper, but when I saw the fragment group's work decided
to use it with a very slight modification. The change is an obvious one
- and I think someone else suggested it on this list the other day - but
I wonder if anyone can see any pitfalls. I've enclosed four sets of
query results for those who might be interested in approving/criticising
my approach. The queries are:

http://[server]/documents/ysArticle[author=Ruth]
http://[server]/documents/ysArticle[author=Ruth]/ArticleText
http://[server]/documents/ysArticle[author=Ruth]/ArticleText/ysText
http://[server]/documents/ysArticle[author=Ruth]/ArticleText/ysText[ID=1]

[Ignore non-quoted stuff, etc., it's still work in progress!]

Although the first few actually return pretty much the same information,
they differ in where the division between context and requested data is.
The first will return all articles by Ruth in their entirety, and so
only needs one 'fragbody' element. The second returns the same data, but
the articles themselves are now provided only as context, and the
containers of the text become the top level of the fragments. This
therefore requires two 'fragbody' elements, since there are two articles
by Ruth. (Actually it could be one, but because there's an article
between the two that is *not* by Ruth, even though it's not getting
returned it messes up my merging code!) The third query is not much
different from number two, but pushes one more level of data up into the
'context' information.

The final query is the one I'm most interested in getting feedback on,
in particular on whether I have the context information right. I think
the fragment document is a little ambiguous on what level of detail to
put in. Some examples in the doc. do what I have done - put in all
siblings of any element that is an ancestor of the ones we're interested
in - but one of them doesn't. Of course it is partly
application-dependent so I'm not that bothered.

Comments?

Regards,

Mark

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London
W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net

-------------- next part --------------
A non-text attachment was scrubbed...
Name: q1.xml
Type: application/octet-stream
Size: 1565 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q1.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: q2.xml
Type: application/octet-stream
Size: 956 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q2.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: q3.xml
Type: application/octet-stream
Size: 1277 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q3.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: q4.xml
Type: application/octet-stream
Size: 3920 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990307/b7dd589d/q4.obj
From david at megginson.com  Sun Mar  7 23:57:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:42 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <14051.3215.196642.22571@localhost.localdomain>

What:   Four proposed predefined features for ModSAX
Action: Please read and comment (especially to propose core features
        I've missed)

Last month, I posted a proposal [1] for a backwards-compatible SAX
layer called ModSAX, which will allow parser and filter writers to
extend SAX and application writers to discover what extensions exist,
all in a well-defined and predictable way.

The relevant part of that interface for this posting is the following
method in ModParser (which extends org.xml.sax.Parser):

  public abstract void setFeature (String featureID, boolean state)
    throws SAXNotSupportedException;

The value of featureID will in some way piggyback on DNS, either by
using URIs or by using names similar to Java packages.  Although
people will be allowed (and encouraged) to invent their own features,
I'd like to predefine a core set of features for the next SAX
release.  Here's what I've thought of so far:

1. http://xml.org/sax/features/validation
  True means validate, false means don't validate.

2. http://xml.org/sax/features/external-entities
  True means expand external text entities, false means don't expand
  external text entities.

3. http://xml.org/sax/features/namespaces
  True means perform namespace processing -- munge element and
  attribute names and remove namespace declaration attributes -- and
  false means don't perform namespace processing.

4. http://xml.org/sax/features/unbuffered-input
  True means ensure that the parser does not buffer input from a
  Reader or InputStream supplied by the application (actually,
  one-character look-ahead will usually be required); false means the
  parser is free to buffer input.  This feature might be useful for
  reading multiple documents from a single stream.

No SAX parsers will be *required* to support any of these -- they can
simply throw a SAXNotSupportedException for any request (as they
should for any other unrecognised feature request).  The earliest
ModSAX parser will probably be a general-purpose SAX 1.0 Parser
adapter, and that will certainly not be able to do anything useful
with these.

Unlike parsers, filters will ordinarily pass unrecognised feature
requests on up the chain of responsibility.
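
A minimal sketch of that delegation, assuming a filter holds a reference
to its parent.  Every type name below (ChainNotSupportedException,
FeatureTarget, StubParser, NameFilter) and the lowercase-names feature
ID are hypothetical stand-ins, not part of the ModSAX proposal.

```java
// Hypothetical stand-ins for the proposed ModSAX types; names are assumptions.
class ChainNotSupportedException extends Exception {
    ChainNotSupportedException(String featureID) { super(featureID); }
}

interface FeatureTarget {
    void setFeature(String featureID, boolean state) throws ChainNotSupportedException;
}

// End of the chain: a parser that knows exactly one feature.
class StubParser implements FeatureTarget {
    boolean validating;

    public void setFeature(String featureID, boolean state) throws ChainNotSupportedException {
        if (!featureID.equals("http://xml.org/sax/features/validation")) {
            throw new ChainNotSupportedException(featureID);
        }
        validating = state;
    }
}

// A filter that handles its own (made-up) feature and forwards the rest.
class NameFilter implements FeatureTarget {
    static final String LOWERCASE = "http://example.org/features/lowercase-names";
    private final FeatureTarget parent;
    boolean lowercase;

    NameFilter(FeatureTarget parent) { this.parent = parent; }

    public void setFeature(String featureID, boolean state) throws ChainNotSupportedException {
        if (featureID.equals(LOWERCASE)) {
            lowercase = state;
        } else {
            parent.setFeature(featureID, state); // not ours: pass it up the chain
        }
    }
}

class FilterChainDemo {
    // True if some link in the chain accepted the request.
    static boolean accepts(FeatureTarget target, String featureID) {
        try {
            target.setFeature(featureID, true);
            return true;
        } catch (ChainNotSupportedException e) {
            return false;
        }
    }
}
```

With this arrangement the application talks only to the outermost
filter; a request fails only when the parser at the end of the chain
rejects it too.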


Examples
--------

If an application wants to ensure that the SAX parser is performing
validation, it can use

  try {
    parser.setFeature("http://xml.org/sax/features/validation", true);
  } catch (SAXNotSupportedException e) {
    // ...
  }

The parser may throw an exception for either of two reasons:

1. it cannot perform validation; or

2. it does not recognise the property.

If the application wants to determine which of the two is the case,
then it can try the following:

  try {
    parser.setFeature("http://xml.org/sax/features/validation", false);
  } catch (SAXNotSupportedException e) {
    // ...
  }

If the parser throws an exception again, then it does not recognise
the property name (in other words, it may or may not perform
validation, and the application has no way to tell); if the parser
does not throw an exception, then it simply does not support
validation.
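
The two-step probe can be folded into one helper.  This is a sketch
under assumptions: ProbeParser and ProbeNotSupportedException are
hypothetical stand-ins for ModParser and SAXNotSupportedException, and
the Result names are invented for illustration.

```java
// Hypothetical stand-in for SAXNotSupportedException.
class ProbeNotSupportedException extends Exception {
    ProbeNotSupportedException(String featureID) { super(featureID); }
}

// Hypothetical stand-in for the setFeature part of ModParser.
interface ProbeParser {
    void setFeature(String featureID, boolean state) throws ProbeNotSupportedException;
}

class FeatureProbe {
    enum Result { SUPPORTED, RECOGNISED_BUT_UNSUPPORTED, NOT_RECOGNISED }

    // Ask for the feature to be turned on; if that fails, ask for it to be
    // turned off and use the second outcome to tell the two failures apart.
    static Result probe(ProbeParser parser, String featureID) {
        try {
            parser.setFeature(featureID, true);
            return Result.SUPPORTED;
        } catch (ProbeNotSupportedException first) {
            try {
                parser.setFeature(featureID, false);
                return Result.RECOGNISED_BUT_UNSUPPORTED;
            } catch (ProbeNotSupportedException second) {
                return Result.NOT_RECOGNISED;
            }
        }
    }
}
```

Note that the probe has a side effect: a successful call leaves the
feature switched on or off, so it is best run before parsing starts.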


[1] http://www.lists.ic.ac.uk/archives/xml-dev/9902/0627.html


From jjc at jclark.com  Mon Mar  8 02:03:56 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E32900.BBDF43C0@jclark.com>

David Megginson wrote:

> 2. http://xml.org/sax/features/external-entities
>   True means expand external text entities, false means don't expand
>   external text entities.

I would suggest distinguishing the expansion of external parameter
entities (which would include the external DTD subset) from the
expansion of external general entities.  I can easily imagine wanting to
expand external general entities declared in the internal subset, but
not wanting to read an external DTD.
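
For readers on a current JDK: SAX2, which postdates this thread, defines
separate feature IDs for external general and external parameter
entities.  A minimal sketch of setting them independently, assuming the
JDK's bundled parser recognises both; the class and method names are
illustrative only, and how a given parser then treats an external DTD
subset is parser-dependent.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.XMLReader;

class EntityFeatureDemo {
    static final String GENERAL =
        "http://xml.org/sax/features/external-general-entities";
    static final String PARAMETER =
        "http://xml.org/sax/features/external-parameter-entities";

    // Sets the two features independently and returns what the reader
    // reports back, as {general, parameter}.
    static boolean[] configure(boolean general, boolean parameter) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(GENERAL, general);
            reader.setFeature(PARAMETER, parameter);
            return new boolean[] {
                reader.getFeature(GENERAL), reader.getFeature(PARAMETER)
            };
        } catch (Exception e) {
            // Covers unrecognised or unsupported features and factory errors.
            throw new IllegalStateException("parser rejected the entity features", e);
        }
    }
}
```

Calling configure(true, false) corresponds to the case described above:
expand external general entities while declining external parameter
entities.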

James


From jjc at jclark.com  Mon Mar  8 02:04:22 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E329F5.76A50E09@jclark.com>

David Megginson wrote:

> The parser may throw an exception for either of two reasons:
> 
> 1. it cannot perform validation; or
> 
> 2. it does not recognise the property.
> 
> If the application wants to determine which of the two is the case,
> then it can try the following:
> 
>   try {
>     parser.setFeature("http://xml.org/sax/features/validation", false);
>   } catch (SAXNotSupportedException e) {
>     // ...
>   }
> 
> If the parser throws an exception again, then it does not recognise
> the property name (in other words, it may or may not perform
> validation, and the application has no way to tell); if the parser
> does not throw an exception, then it simply does not support
> validation.

Wouldn't it be simpler to throw a different type of exception in these two
cases?  You could have a SAXNotRecognizedException that extends
SAXNotSupportedException, and say that parsers should throw
SAXNotRecognizedException when the reason they don't support a feature
is that they do not recognize the feature.
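
A sketch of that hierarchy, using locally defined classes so that it
stands alone.  SketchNotSupportedException, SketchNotRecognizedException
and TwoExceptionParser are hypothetical names chosen for illustration;
the stub parser is assumed to recognise validation but be unable to
perform it.

```java
// Hypothetical base exception: the feature cannot be set as requested.
class SketchNotSupportedException extends Exception {
    SketchNotSupportedException(String featureID) { super(featureID); }
}

// Hypothetical subclass: the feature name itself is unknown to the parser.
class SketchNotRecognizedException extends SketchNotSupportedException {
    SketchNotRecognizedException(String featureID) { super(featureID); }
}

// A non-validating parser that recognises exactly one feature name.
class TwoExceptionParser {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    void setFeature(String featureID, boolean state) throws SketchNotSupportedException {
        if (!featureID.equals(VALIDATION)) {
            throw new SketchNotRecognizedException(featureID);
        }
        if (state) {
            throw new SketchNotSupportedException(featureID); // known, but cannot validate
        }
        // state == false: already not validating, nothing to do
    }
}

class FeatureCheck {
    // A single call is enough; the subclass must be caught first.
    static String describe(TwoExceptionParser parser, String featureID) {
        try {
            parser.setFeature(featureID, true);
            return "supported";
        } catch (SketchNotRecognizedException e) {
            return "not recognised";
        } catch (SketchNotSupportedException e) {
            return "recognised but not supported";
        }
    }
}
```

Because the new exception extends the old one, code that catches only
the base class keeps working unchanged.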

James


From MikeDacon at aol.com  Mon Mar  8 02:32:27 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <128b4bc2.36e3366b@aol.com>

Hi Dave,

Before responding to your specific proposal ... I do not understand
why you are creating a new interface like ModParser
instead of just evolving the Parser interface itself.  Personally,
while I know full well what it would mean to implement Parser -- 
a "ModParser" is just plain confusing.  Five years from now, 
someone should not have to know the history of SAX to understand
the terminology. 

Now to the Predefined features...

In a message dated 3/7/99 7:13:19 PM Eastern Standard Time,
david@megginson.com writes:
> What:   Four proposed predefined features for ModSAX
>  Action: Please read and comment (especially to propose core features
>          I've missed)
>  
>  Last month, I posted a proposal [1] for a backwards-compatible SAX
>  layer called ModSAX, which will allow parser and filter writers to
>  extend SAX and application writers to discover what extensions exist,
>  all in a well-defined and predictable way.

I like the idea of SAX filters but still feel that you should allow
access to a DOM Document if the implementing Parser can supply one.  
I won't restate the suggestion here as it was covered in a previous email.
However, that could greatly simplify a filter-writer's job.

>  
>  The relevant part of that interface for this posting is the following
>  method in ModParser (which extends org.xml.sax.Parser):
>  
>    public abstract void setFeature (String featureID, boolean state)
>      throws SAXNotSupportedException;
>  
>  The value of featureID will in some way piggyback on DNS, either by
>  using URIs or by using names similar to Java packages.  Although
>  people will be allowed (and encouraged) to invent their own features,
>  I'd like to predefine a core set of features for the next SAX
>  release.  Here's what I've thought of so far:

Since some finite set of SAX features will not approach a global naming
problem, I strongly urge you not to use a URI.  If a package name scheme is
to be used, I suggest something like "sax.feature.validation".  It would also
be nice to provide one-word String constants for the standard features.

Best wishes,

 - Mike Daconta (mdaconta@aol.com)


From b.laforge at jxml.com  Mon Mar  8 02:56:51 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <001e01be690e$8913fe00$c9a8a8c0@thing2>

From: MikeDacon@aol.com <MikeDacon@aol.com>
>I like the idea of SAX filters but still feel that you should allow
>access to a DOM Document if the implementing Parser can supply one.  
>I won't restate the suggestion here as it was covered in a previous email.
>However, that could greatly simplify a filter-writer's job.


Well, that might depend on the job of the filter. You may want to use a filter
to prune out the parts of the document you are not interested in BEFORE
the DOM is built.

In general, I see several places where you might want to use a filter:

    o  Transform events from a parser into something to be output.

    o  Transform events from a parser before being accessed by an application.

    o  Between a parser and the DOM.

    o Transform events from a DOM walker into something to be output.

Note that in the last case, if the DOM walker shares its internal state (position in
the DOM tree) with the filters that come after it (using something like MDSAX),
we get a lot of XSL-like capabilities.
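
The pruning case, a filter sitting between a parser and a tree builder,
can be sketched with a much-reduced event interface.  MiniHandler,
RecordingBuilder and PruneFilter are hypothetical names; the real SAX
DocumentHandler carries more events and arguments than shown here, and
the recorder only stands in for a DOM builder.

```java
// Hypothetical, much-reduced event interface (not the real SAX DocumentHandler).
interface MiniHandler {
    void startElement(String name);
    void endElement(String name);
    void characters(String text);
}

// Stands in for a DOM builder: records whatever reaches it as a flat string.
class RecordingBuilder implements MiniHandler {
    final StringBuilder out = new StringBuilder();

    public void startElement(String name) { out.append('<').append(name).append('>'); }
    public void endElement(String name) { out.append("</").append(name).append('>'); }
    public void characters(String text) { out.append(text); }
}

// Filter that drops one named element and everything inside it.
class PruneFilter implements MiniHandler {
    private final String pruned;
    private final MiniHandler next;
    private int depth = 0; // greater than zero while inside a pruned subtree

    PruneFilter(String pruned, MiniHandler next) {
        this.pruned = pruned;
        this.next = next;
    }

    public void startElement(String name) {
        if (depth > 0 || name.equals(pruned)) {
            depth++; // entering, or going deeper into, a pruned subtree
            return;
        }
        next.startElement(name);
    }

    public void endElement(String name) {
        if (depth > 0) {
            depth--; // leaving one level of the pruned subtree
            return;
        }
        next.endElement(name);
    }

    public void characters(String text) {
        if (depth == 0) {
            next.characters(text);
        }
    }
}
```

The builder never sees the pruned subtree, so no nodes are allocated for
it; this is the saving that filtering before the DOM is built aims for.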

Bill


From b.laforge at jxml.com  Mon Mar  8 04:39:57 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <004b01be691c$f348fc40$c9a8a8c0@thing2>

David,

I am very much inclined to agree with you that the conservative approach
taken in implementing SAX was necessary to its broad acceptance
at that time.

However, broad acceptance of a SAX upgrade may require a different 
approach. For one thing, the very success of SAX has itself changed things.

The primary requirement is backward compatibility for both parsers and 
applications.

A second requirement is that the upgrade not be conservative, but that it
be a significant enhancement from a wide range of perspectives.

The upgrade needs to be worth doing, but for more than one reason. Feature
negotiation alone is not quite enough.

I'm sure you know the kinds of things I'm looking for: 
    o  Event objects for one.
    o  A way to specify a filter to a DOM-building-parser is another. 
    o  Better integration with the DOM in general.
I'm sure others have their own feature list.

We need to define a collection of new capabilities that have wide appeal,
together with an implementation strategy which provides full backward
compatibility. And for this group, it needs to be something that can be
implemented cleanly. 

I still feel like a newbie here. I wasn't here when SAX was done. But I 
would hate to see the initiative lost to the traditional standards bodies.

As I see it, there are two advantages to doing the work on this list:
    1. It is open to individuals. The cost to participate is measured only
        in the time it takes.
    2. This is the world's toughest bunch of critics. The folks here plan
        to implement the proposals themselves. And any proposal that isn't
        clean is going to be revised until it can be easily implemented.
And as much as the first point is what allows me to participate, it is
the second point that is the real winner. A standards body whose participants
are largely from large companies has more to gain from a spec that
is difficult to implement--it limits the competition.

So that's why I'm butting in here. I think an open standards process is
important for individuals and small companies. We need to do what we
can to keep the ball rolling here.

Bill

From: David Megginson <david@megginson.com>

>What:   Four proposed predefined features for ModSAX
>Action: Please read and comment (especially to propose core features
>        I've missed)


From shinichiro.hamada at toshiba.co.jp  Mon Mar  8 06:57:59 1999
From: shinichiro.hamada at toshiba.co.jp (Shinichiro HAMADA)
Date: Mon Jun  7 17:09:43 2004
Subject: Accessing DTD info. in IE5
Message-ID: <007301be6930$daa8e100$85247385@pv189.ssel.toshiba.co.jp>

Hello.

>Is DTD information accessible through IE5 DOM? I took it for granted because
>I could do it with old MSXML for java used in IE4. However, when I really
>wanted to access DTD info in IE5, I couldn't find it from anywhere. Is DTD
>information exposed in IE5 DOM?

I wonder if what you want to know is IXMLDOMDocument::get_doctype:

http://www.microsoft.com/workshop/xml/xmldom/reference/DOMDocument_doctype.asp

Or have I misunderstood your question?

--
Shinichiro HAMADA


From johnh at erin.gov.au  Mon Mar  8 06:58:56 1999
From: johnh at erin.gov.au (John Hockaday)
Date: Mon Jun  7 17:09:43 2004
Subject: Mapping elements in architectural forms
Message-ID: <199903080655.RAA21026@eos.erin.gov.au>

Hi,

I am using architectural forms to map elements from a client document
instance of a client DTD to a base document of a base DTD using the SP
software by James Clark.  The problem is that the structure of the
elements and sub-elements in the client document do not exactly match
the base DTD's elements and sub-elements and I don't know how to relate
this in the mapping DTD.

For example, sub-elements "b" and "c" occur in element "a" in the
client DTD, but in the base DTD sub-element "b" occurs in element "a"
while sub-element "c" occurs in element "d".

	Client				Base
	======				====
	
	<a>				<a>
	   <b>				   <b>
	   <c>				</a>
	</a>				<d>
					   <c>
					</d>
					
If I map "a" to "a", "b" to "b" and "c" to "c" in the mapping DTD the
parser gives an error that "a" has not been finished and that "c"
should not occur here in the base document.

Does anyone know how I can map the client elements to the base elements
in the mapping DTD to fix this problem?


___________________________________________________________________________
John Hockaday - Systems Officer                                 GPO Box 787
email: johnh@erin.gov.au                                  Canberra ACT 2601
phone: +61 2 6274 1173  fax: +61 2 6274 1333                      Australia
URL:http://www.environment.gov.au/
ERIN          Environmental  Resources  Information  Network           ERIN
___________________________________________________________________________


From wendy.cameron at qr.com.au  Mon Mar  8 07:25:52 1999
From: wendy.cameron at qr.com.au (Wendy Cameron)
Date: Mon Jun  7 17:09:43 2004
Subject: XSL Problem
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <00f101be6930$a123feb0$c62b580a@qrail.com.au>

Ok I have

<nodeType1 att1='thing1'>
<nodeType2 att1='thing2'>
<nodeType3 att1='thing3'>

I am trying to select all 3 nodes and order by att1 but display different
information depending on what type of node it is.

Does anyone have any idea how I would do this?

I have tried
<xsl:choose>
   <xsl:when test='nodeType1'>
   .....
   </xsl:when>
</xsl:choose>

But this doesn't test if the current node is of type nodeType1

Help!!!

Regards Wendy


From zmin at atpage.com  Mon Mar  8 07:28:47 1999
From: zmin at atpage.com (min zheng)
Date: Mon Jun  7 17:09:43 2004
Subject: Accessing DTD info. in IE5
References: <007301be6930$daa8e100$85247385@pv189.ssel.toshiba.co.jp>
Message-ID: <002d01be6935$e73a66a0$f66f6f0a@atpage>

What I want is the DTD (or Schema) rules telling me what nodes are allowed
in an element. The get_doctype method only gives the doctype declaration.
There is no way (as far as I know) to access element rules from there.

Thanks anyway,
Min

----- Original Message -----
From: Shinichiro HAMADA <shinichiro.hamada@toshiba.co.jp>
To: <xml-dev@ic.ac.uk>
Sent: Sunday, March 07, 1999 10:56 PM
Subject: RE: Accessing DTD info. in IE5


> Hello.
>
> >Is DTD information accessible through IE5 DOM? I took it for granted
> >because I could do it with old MSXML for java used in IE4. However, when
> >I really wanted to access DTD info in IE5, I couldn't find it from
> >anywhere. Is DTD information exposed in IE5 DOM?
>
> I wonder if what you want to know is IXMLDOMDocument::get_doctype:
>
> http://www.microsoft.com/workshop/xml/xmldom/reference/DOMDocument_doctype.asp
>
> or I've misunderstood your question?
>
> --
> Shinichiro HAMADA
>
>


From larsga at ifi.uio.no  Mon Mar  8 10:29:39 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <14051.3215.196642.22571@localhost.localdomain>
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <wkzp5oe0m3.fsf@ifi.uio.no>


* David Megginson
|
| The value of featureID will in some way piggyback on DNS, either by
| using URIs or by using names similar to Java packages.

I think we should use package-like names.  Protocol prefixes seem to
me potentially confusing and slightly obfuscating, and I don't see
what they offer over a package-like scheme.

I much prefer org.xml.sax.features.validation over
http://xml.org/sax/features/validation.

| 2. http://xml.org/sax/features/external-entities

I agree with James that separating general entities and parameter
entities is a good idea.

| 4. http://xml.org/sax/features/unbuffered-input

I'm not sure I see the merit of this. Maybe we should skip this?


A suggestion of my own:

org.xml.sax.features.catalog

True means read the default catalog file, whether that is located via
an environment variable, a Java property or something else.  OpenXML,
XML Parser for Java (xml4j) and xmlproc already support catalogs, and
might find this useful.  xmlproc certainly will.
 
| No SAX parsers will be *required* to support any of these -- they
| can simply throw a SAXNotSupportedException for any request 

I also agree with James that a separate unrecognized-exception is a
good idea.

| Unlike parsers, filters will ordinarily pass unrecognised feature
| requests on up the chain of responsibility.

Good point. This implies that filters need references in both
directions, that is, both to the event source and to the event
receiver, thus resolving a question that was previously discussed
here.
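
A minimal sketch of the forwarding behaviour discussed above, assuming a
filter holds a reference to its event source and passes on any feature
request it does not handle itself.  Every name here (FeatureTarget,
ToyParser, ToyFilter, Chain, the two exception classes and the
com.example feature ID) is hypothetical and is not part of any SAX
release; only the upstream reference is shown, not the reference to the
event receiver.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the exceptions discussed in the thread.
class FeatureNotSupportedException extends Exception {
    FeatureNotSupportedException(String message) { super(message); }
}

class FeatureNotRecognizedException extends FeatureNotSupportedException {
    FeatureNotRecognizedException(String message) { super(message); }
}

// Anything that can be asked to switch a feature on or off.
interface FeatureTarget {
    void setFeature(String featureID, boolean state) throws FeatureNotSupportedException;
}

// The parser sits at the head of the chain: it knows one feature and rejects the rest.
class ToyParser implements FeatureTarget {
    final Map<String, Boolean> features = new HashMap<>();

    public void setFeature(String featureID, boolean state) throws FeatureNotSupportedException {
        if (!"org.xml.sax.features.validation".equals(featureID)) {
            throw new FeatureNotRecognizedException(featureID);
        }
        features.put(featureID, state);
    }
}

// The filter handles its own feature and forwards everything else upstream.
class ToyFilter implements FeatureTarget {
    private final FeatureTarget source; // reference towards the event source
    boolean trimming;

    ToyFilter(FeatureTarget source) { this.source = source; }

    public void setFeature(String featureID, boolean state) throws FeatureNotSupportedException {
        if ("com.example.features.trim".equals(featureID)) {
            trimming = state;
        } else {
            source.setFeature(featureID, state); // pass the request up the chain
        }
    }
}

// Helper that reports whether anything in the chain accepted the request.
class Chain {
    static boolean trySet(FeatureTarget target, String featureID, boolean state) {
        try {
            target.setFeature(featureID, state);
            return true;
        } catch (FeatureNotSupportedException e) {
            return false;
        }
    }
}
```

With this arrangement the application only ever talks to the last filter
in the chain; a request for validation reaches the parser, while a request
that nobody recognises comes back as a failure from the head of the chain.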
 
| [1] http://www.lists.ic.ac.uk/archives/xml-dev/9902/0627.html

Hmmm. Wouldn't this reference be more correct?

<URL:http://www.lists.ic.ac.uk/archives/xml-dev/9902/0645.html>

--Lars M.




From larsga at ifi.uio.no  Mon Mar  8 10:40:23 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <004b01be691c$f348fc40$c9a8a8c0@thing2>
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID: <wkyal8e048.fsf@ifi.uio.no>


* Bill la Forge
| 
| The upgrade needs to be worth doing, but for more than one reason.

I agree that it needs to be worth doing, but to me what has been
proposed here certainly sounds like it is enough. (Remember, parameter
setting, handler extensibility, filters, namespaces, lexical
information and DTD information are probably all in the pipeline.)

| I'm sure you know the kinds of things I'm looking for: 
|     o  Event objects for one.

On this point I agree with what David will probably say: this belongs
on a higher level. If you want this functionality, make a value-adding
layer on top of SAX 1.1. There's no loss in that, since you can
implement this once for all SAX-aware parsers with hardly any
performance penalties. (This is why I agree with David: this is the
kind of benefit that being ultra low-level buys us.)

|     o  A way to specify a filter to a DOM-building-parser is another. 

We certainly need this, but I don't see how this can usefully be part
of SAX.  SAX is at a lower level than the DOM and so should certainly
be designed for a DOM layer to fit nicely on top, but there should be
no dependencies, I think.

In other words, this is something that either the DOM or the parsers
will have to deal with in a sensible fashion. Taking a ModParser as an
argument to DOM building would perhaps be the best way to do this.

However, I don't see the harm in someone sitting down to write a
recommendation to DOM parser writers for how to do this and why it's
useful.
 
| So that's why I'm butting in here. I think an open standards process
| is important for individuals and small companies. We need to do what
| we can to keep the ball rolling here.

We are certainly in heartfelt agreement here. :)

--Lars M.




From david at megginson.com  Mon Mar  8 11:30:32 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <128b4bc2.36e3366b@aol.com>
References: <128b4bc2.36e3366b@aol.com>
Message-ID: <14051.45935.800104.922834@localhost.localdomain>

MikeDacon@aol.com writes:

 > I like the idea of SAX filters but still feel that you should allow
 > access to a DOM Document if the implementing Parser can supply one.
 > I won't restate the suggestion here as it was covered in a previous
 > email.  However; that could greatly simplify a filter-writer's job.

I have an idea for how we can handle that (and other, similar
problems), but I'll cover it in a separate posting (it's still brewing
a bit).

 > Since some finite set of SAX features will not approach a global naming
 > problem, I strongly urge not to use a URI.  

I disagree here -- if third parties want to be able to define feature
names, they need a way to avoid collision (i.e. we want to make
certain that both Oracle and Sun can define properties like
'normalize' without blowing up the whole system).

That said, the Java package naming scheme also provides DNS-based
uniqueness, as in 'org.xml.sax.features.validation'.  It's simply a
matter of taste:

- org.xml.sax.features.validation is more of a Java flavour.

- http://xml.org/sax/features/validation is more of an XML/Namespaces
  flavour


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar  8 11:34:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <004b01be691c$f348fc40$c9a8a8c0@thing2>
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID: <14051.46235.905949.308401@localhost.localdomain>

Bill la Forge writes:

 > The upgrade needs to be worth doing, but for more than one
 > reason. Feature negotiation alone is not quite enough.

Yes, but my original proposal was not limited to feature negotiation
-- it also included the ability to add and negotiate new handler types 
at runtime.

People will upgrade because they want to use the new handlers that are
implemented with ModSAX, not because of any elegance or inelegance in
ModSAX itself.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Michael.Kay at icl.com  Mon Mar  8 11:41:43 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>

 
> What:   Four proposed predefined features for ModSAX
> Action: Please read and comment (especially to propose core features
>         I've missed)
> 

Could I add a plea for another optional feature:
http://xml.org/sax/features/normalisePCDATA

whose effect is to ensure that successive calls to supply character data are
combined into a single call. The reason for this is that it's very common
for applications to assume the parser won't split character data, an
incorrect assumption but one that will survive most testing. 
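
A minimal sketch of what an application has to do for itself when no such
feature is available: buffer character data and only treat it as complete
when the next element boundary arrives.  It uses the javax.xml.parsers and
org.xml.sax.helpers.DefaultHandler classes that ship with current JDKs
(these postdate this thread); the class name CoalescingHandler and the
helper textRunsOf are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects each run of character data exactly once.
class CoalescingHandler extends DefaultHandler {
    private final StringBuilder buffer = new StringBuilder();
    final List<String> textRuns = new ArrayList<>();

    @Override
    public void characters(char[] ch, int start, int length) {
        // The parser may deliver one run of text in several calls, so only append here.
        buffer.append(ch, start, length);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flush();
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flush();
    }

    // An element boundary ends the current run of text.
    private void flush() {
        if (buffer.length() > 0) {
            textRuns.add(buffer.toString());
            buffer.setLength(0);
        }
    }

    // Convenience wrapper so callers need not handle checked exceptions.
    static List<String> textRunsOf(String xml) {
        try {
            CoalescingHandler handler = new CoalescingHandler();
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
            return handler.textRuns;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

An entity reference in the middle of a text run is a typical place for a
parser to split the data; with the buffering above the application still
sees a single string per run.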

Actually I think using "http://" names for things that have nothing to do
with the HTTP protocol is very bad form. (Apart from anything else, my mail
client encourages me to click on them to see what's there.)
"org.xml.sax.features.normalisePCDATA" is much more sensible. If you want a
URN, choose a protocol name other than http.

Another rather trivial convenience feature I'd like added to SAX is the
ability for InputSource to accept a File (as well as a URL, etc), though
the need for this has declined with Java 2.

Mike Kay



From david at megginson.com  Mon Mar  8 11:42:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E32900.BBDF43C0@jclark.com>
References: <14051.3215.196642.22571@localhost.localdomain>
	<36E32900.BBDF43C0@jclark.com>
Message-ID: <14051.46547.366706.485764@localhost.localdomain>

James Clark writes:

 > I would suggest distinguishing the expansion of external parameter
 > entities (which would include the external DTD subset) from the
 > expansion of external general entities.  I can easily imagine
 > wanting to expand external general entities declared in the
 > internal subset, but not wanting to read an external DTD.

I agree.  Here's the new core feature list:

http://xml.org/sax/features/validation
http://xml.org/sax/features/external-general-entities
http://xml.org/sax/features/external-parameter-entities
http://xml.org/sax/features/namespaces
http://xml.org/sax/features/unbuffered-input


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar  8 11:43:26 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:43 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E329F5.76A50E09@jclark.com>
References: <14051.3215.196642.22571@localhost.localdomain>
	<36E329F5.76A50E09@jclark.com>
Message-ID: <14051.46946.186235.431488@localhost.localdomain>

James Clark writes:

 > Wouldn't it be simpler to throw different type of exception in these two
 > cases?  You could have a SAXNotRecognizedException that extends
 > SAXNotSupportedException, and say that parsers should throw
 > SAXNotRecognizedException when the reason they don't support a feature
 > is that they do not recognize the feature.

Yes, I agree.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar  8 11:51:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <14051.46670.687235.664451@localhost.localdomain>

What: Additions to ModParser interface

I'm proposing a couple of additions to the ModParser interface:

  public interface ModParser extends Parser
  {
    public abstract void setFeature (String featureID, boolean state)
      throws SAXNotSupportedException;

    public abstract void setHandler (String handlerID, ModHandler handler)
      throws SAXNotSupportedException;

    public abstract void set (String infoID, Object prop)
      throws SAXNotSupportedException;

    public abstract Object get (String infoID)
      throws SAXNotSupportedException;
  }

These allow you to do interesting things like

  parser.set("http://www.foo.com/props/textfilter", filter);

or

  try {
    // get() returns Object, so a cast is needed
    Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
  } catch (SAXNotRecognizedException e1) {
    // doesn't know about DOM processing...
  } catch (SAXNotSupportedException e2) {
    // knows about DOM processing, but not doing it...
  }

Again, it's a little sloppy as an interface, but it's beautifully
extensible and it supports filters nicely (if there are other filters
between the DOM iterator and the application, it will still work).

Note that strictly speaking, now, setHandler() and setFeature() are no
longer primitives, since they could both be implemented in terms of
set(), but I think that the extra type checking is worthwhile in those 
cases.
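
A minimal sketch of that last point, assuming a parser that stores its
properties in a map: setFeature() is just a typed wrapper around the
general set(), so the compiler checks the boolean argument while set()
has to check types at run time.  ToyModParser and the two exception
classes are hypothetical stand-ins, not the proposed org.xml.sax types.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins for the exceptions named in the thread.
class PropertyNotSupportedException extends Exception {
    PropertyNotSupportedException(String message) { super(message); }
}

class PropertyNotRecognizedException extends PropertyNotSupportedException {
    PropertyNotRecognizedException(String message) { super(message); }
}

class ToyModParser {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    private final Map<String, Object> props = new HashMap<>();

    // General setter: any object may be passed, so the type is checked at run time.
    public void set(String infoID, Object prop) throws PropertyNotSupportedException {
        if (!VALIDATION.equals(infoID)) {
            throw new PropertyNotRecognizedException(infoID);
        }
        if (!(prop instanceof Boolean)) {
            throw new PropertyNotSupportedException(infoID + " expects a Boolean");
        }
        props.put(infoID, prop);
    }

    public Object get(String infoID) throws PropertyNotSupportedException {
        if (!VALIDATION.equals(infoID)) {
            throw new PropertyNotRecognizedException(infoID);
        }
        return props.getOrDefault(infoID, Boolean.FALSE);
    }

    // Typed wrapper: the compiler guarantees that state is a boolean.
    public void setFeature(String featureID, boolean state) throws PropertyNotSupportedException {
        set(featureID, Boolean.valueOf(state));
    }

    // The more specific exception must be caught first.
    static String trySet(ToyModParser parser, String infoID, Object prop) {
        try {
            parser.set(infoID, prop);
            return "ok";
        } catch (PropertyNotRecognizedException e1) {
            return "not recognised";
        } catch (PropertyNotSupportedException e2) {
            return "not supported";
        }
    }

    static String describe(ToyModParser parser, String infoID) {
        try {
            return "value=" + parser.get(infoID);
        } catch (PropertyNotSupportedException e) {
            return "unavailable";
        }
    }
}
```

Passing a String where a Boolean is expected only fails when set() runs,
which is the argument for keeping the typed setFeature() alongside it.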


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar  8 11:54:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>
References: <93CB64052F94D211BC5D0010A80013310EB35F@wwmessd3.bra01.icl.co.uk>
Message-ID: <14051.47534.65569.354415@localhost.localdomain>

Kay Michael writes:

 > Could I add a plea for another optional feature:
 > http://xml.org/sax/features/normalisePCDATA

Yes, this is especially useful for building a DOM as well.  I've added 
it to the list of core features:

http://xml.org/sax/features/validation
http://xml.org/sax/features/external-general-entities
http://xml.org/sax/features/external-parameter-entities
http://xml.org/sax/features/namespaces
http://xml.org/sax/features/unbuffered-input
http://xml.org/sax/features/normalize-text

Remember that parsers will not be required to support any of these.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tug at wilson.co.uk  Mon Mar  8 12:33:14 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <073f01be695f$d8dc04e0$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: David Megginson <david@megginson.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Sent: 07 March 1999 23:56
Subject: SAX RFD: ModSAX Predefined Features


>What:   Four proposed predefined features for ModSAX
>Action: Please read and comment (especially to propose core features
>        I've missed)
>
>Last month, I posted a proposal [1] for a backwards-compatible SAX
>layer called ModSAX, which will allow parser and filter writers to
>extend SAX and application writers to discover what extensions exist,
>all in a well-defined and predictable way.

It seems to me that there are two kinds of parser extensions:

1/ those that are static (i.e. must be established before the parser is
used)
2/ those that are dynamic (i.e. they can be changed on the fly)

An example of a static extension would be buffering. If the parser is
buffering input then it is infeasible to change to unbuffered input in the
middle of parsing the text. Switching from non-validating to validating
mid-parse is similarly problematic: insisting that a parser be able to do
this would probably add unacceptable overhead to the non-validating mode.

I would suggest that the bulk of the extensions should be specified to the
parserFactory and only a *very* limited number (if any at all) be specified
to the instance of Parser.

I would very much like a getFeature function which returns a value telling
me if the feature is set or not.

I'm also not very keen on the use of strings to specify the features.

How about using instances of classes:

in org.xml.sax

public abstract class Feature {
  public Feature(boolean state) {
    this.state = state;
  }

  // public so that code outside org.xml.sax can read f.state
  public final boolean state;
}

public final class Validation extends Feature {
  public Validation(boolean state) {
    super(state);
  }
}

Individual parser implementations would then be free to add their own
extensions defined by classes that subclass org.xml.sax.Feature; these
classes could also carry parameters.

setFeature would then take a single Feature parameter:

    xxx.setFeature(new org.xml.sax.Validation(true));

getFeature would take a Class parameter and return an instance of the class
or null if the feature was unrecognised.

org.xml.sax.Feature f = xxx.getFeature(org.xml.sax.Validation.class);

if (f == null) {
  // not supported
} else if (f.state) {
  // supported and switched on
}

non Java implementations would probably have to use a string instead of the
Class parameter.
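
A minimal sketch of how the class-keyed lookup described above could be
implemented, assuming the parser keeps its features in a map keyed by
class.  Feature and Validation follow the shape given in the message;
Namespaces and FeatureRegistry are hypothetical additions for
illustration, and everything lives in one package so the state field can
stay package-private here.

```java
import java.util.HashMap;
import java.util.Map;

abstract class Feature {
    final boolean state;
    Feature(boolean state) { this.state = state; }
}

final class Validation extends Feature {
    Validation(boolean state) { super(state); }
}

// Hypothetical second feature, used only to show the unrecognised case.
final class Namespaces extends Feature {
    Namespaces(boolean state) { super(state); }
}

// Hypothetical holder for a parser's feature settings.
class FeatureRegistry {
    private final Map<Class<? extends Feature>, Feature> features = new HashMap<>();

    // The feature's own class is the key, so no string names are needed.
    void setFeature(Feature feature) {
        features.put(feature.getClass(), feature);
    }

    // Returns the stored instance, or null if that feature was never set.
    <T extends Feature> T getFeature(Class<T> type) {
        return type.cast(features.get(type));
    }
}
```

Because getFeature() is generic, the caller gets back a Validation rather
than a plain Feature, so any extra parameters a subclass carries are
available without a cast.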

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From Daniel.Brickley at bristol.ac.uk  Mon Mar  8 13:13:13 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <14051.45935.800104.922834@localhost.localdomain>
Message-ID: <Pine.GHP.4.02A.9903081149260.2617-100000@mail.ilrt.bris.ac.uk>

On Mon, 8 Mar 1999, David Megginson wrote:

> MikeDacon@aol.com writes:

>  > Since some finite set of SAX features will not approach a global naming
>  > problem, I strongly urge not to use a URI.  
> 
> I disagree here -- if third parties want to be able to define feature
> names, they need a way to avoid collision (i.e. we want to make
> certain that both Oracle and Sun can define properties like
> 'normalize' without blowing up the whole system).
> 
> That said, the Java package naming scheme also provides DNS-based
> uniqueness, as in 'org.xml.sax.features.validation'.  It's simply a
> matter of taste:
> 
> - org.xml.sax.features.validation is more of a Java flavour.

Yep... but might not feel so natural for developers working with
versions of SAX translated for Perl, Python and so on.


> - http://xml.org/sax/features/validation is more of an XML/Namespaces
>   flavour

...and RDF [1]. Giving interesting entities URIs makes them more
fully a part of the Web, and means we can take advantage of
URI-oriented metadata. Eg. you might search a software database for
resources that were of type 'Perl Module' and that implemented the 
feature  known as 'http://xml.org/sax/features/validation'. (There's
already a Linux Packages Database[2] along similar lines...). I'm not
claiming that this would be impossible using the Java naming scheme,
just that a Web oriented approach might make it easier to do certain
things...


Dan


[1] http://www.w3.org/TR/REC-rdf-syntax
[2] http://rpmfind.net/linux/rpmfind/


--
Daniel.Brickley@bristol.ac.uk                
Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/
University of Bristol,  Bristol BS8 1TN, UK.   phone:+44(0)117-9288478




From b.laforge at jxml.com  Mon Mar  8 13:39:47 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:44 2004
Subject: ModSAX addition, general property query
Message-ID: <008c01be6967$da059720$c9a8a8c0@thing2>

From: David Megginson <david@megginson.com>
>    public abstract void set (String infoID, Object prop)
>      throws SAXNotSupportedException;
>
>    public abstract Object get (String infoID)
>      throws SAXNotSupportedException;


David,

OK, this is more like it!

You have now defined an interface which is broad enough to fit all
of MDSAX under. 

Remember that filters also implement the parser interface.
And so do DOMWalkers.

Bill




From MikeDacon at aol.com  Mon Mar  8 13:43:01 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <fb33b8b8.36e3d35f@aol.com>

Hi Bill,

In a message dated 3/7/99 10:03:55 PM Eastern Standard Time,
b.laforge@jxml.com writes:
> Well, that might depend on the job of the filter. You may want to use a
> filter to prune out the parts of the document you are not interested in
> BEFORE the DOM is built.

I agree with you.  I was not saying that access to the DOM was the only
way to write a filter, just that filters can also be based on walking a
DOM Document tree, as you state below.

>  
>  In general, I see several places where you might want to use a filter:
>  
>      o  Transform events from a parser into something to be output.
>  
>      o  Transform events from a parser before being accessed by an
>         application.
>  
>      o  Between a parser and the DOM.
>  
>      o Transform events from a DOM walker into something to be output.
>  

Best wishes,

 - Mike (mdaconta@aol.com)



From MikeDacon at aol.com  Mon Mar  8 14:54:21 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <18b603b2.36e3e337@aol.com>

Hi David,

In a message dated 3/8/99 9:10:40 AM Eastern Standard Time,
david@megginson.com writes:
> What: Additions to ModParser interface
>  
>  I'm proposing a couple of additions to the ModParser interface:
>  
>    public interface ModParser extends Parser
>    {
>      public abstract void setFeature (String featureID, boolean state)
>        throws SAXNotSupportedException; 
>      public abstract void setHandler (String handlerID, ModHandler handler)
>        throws SAXNotSupportedException;
>      public abstract void set (String infoID, Object prop)
>        throws SAXNotSupportedException;
>      public abstract Object get (String infoID)
>        throws SAXNotSupportedException;
>    }
>  
>  These allow you to do interesting things like
>  
>    parser.set("http://www.foo.com/props/textfilter", filter);
>  
>  or
>  
>    try {
>      Node node = parser.get("http://xml.org/sax/props/dom-node");
>    } catch (SAXNotRecognizedException e1) {
>      // doesn't know about DOM processing...
>    } catch (SAXNotSupportedException e2) {
>      // knows about DOM processing, but not doing it...
>    }
>  

I think the success of a general set() and get() capability will
be based on the creation of a good initial set of descriptors (what you
called infoID) to get or set.

So, in that vein, I have 2 comments:

1. I still strongly urge not to use a URI for a feature or infoID.  These
are not resource locations; they are just descriptive strings.  In fact, I
bet that most parsers will just implement your initial recommended set.

2. I'd recommend that constants be defined in the interface for the initial
set of standard features and infoIDs.  Something like:

public static final String VALIDATE = "sax.feature.validation";
public static final String DOCUMENT = "sax.dom.Document";

Then I can do this:
try
{
    parser.setFeature(ModParser.VALIDATE, true);
}
catch (SAXNotRecognizedException e1)
{
    // doesn't know about validation
}
catch (SAXNotSupportedException e2)
{
    // does not support validation
}

Best wishes,

 - Mike (mdaconta@aol.com)



From cowan at locke.ccil.org  Mon Mar  8 15:05:47 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E3E712.D5556233@locke.ccil.org>

David Megginson wrote:

>   public abstract void setFeature (String featureID, boolean state)
>     throws SAXNotSupportedException;

I want to propose a restriction and an extension:

1) This method cannot be called after any other parser method
has been invoked.

2) This method is allowed to throw a SAXNewParserException, which
encapsulates a replacement parser.  The application should use
the parser inside the exception in place of the original parser.
This allows parsers to push filters on top of themselves, which
complements the ability of applications to push them.
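
A minimal sketch of the second point, assuming the exception simply
carries the replacement parser and the application swaps it in.  All of
the names (ToyParserApi, NewParserException, PlainParser,
NormalizingFilter, ParserSetup) are hypothetical; the first restriction
is not modelled.

```java
// Hypothetical stand-in for the parser interface.
interface ToyParserApi {
    String NORMALIZE = "http://xml.org/sax/features/normalize-text";

    String name();

    void setFeature(String featureID, boolean state) throws NewParserException;
}

// Hypothetical exception that encapsulates a replacement parser.
class NewParserException extends Exception {
    private final ToyParserApi replacement;

    NewParserException(ToyParserApi replacement) {
        super("use the replacement parser");
        this.replacement = replacement;
    }

    ToyParserApi getParser() { return replacement; }
}

class PlainParser implements ToyParserApi {
    public String name() { return "plain"; }

    public void setFeature(String featureID, boolean state) throws NewParserException {
        if (state && NORMALIZE.equals(featureID)) {
            // Not supported natively: push a filter on top and hand that back.
            throw new NewParserException(new NormalizingFilter(this));
        }
        // Other features are silently accepted in this toy parser.
    }
}

class NormalizingFilter implements ToyParserApi {
    private final ToyParserApi inner;

    NormalizingFilter(ToyParserApi inner) { this.inner = inner; }

    public String name() { return "normalizing(" + inner.name() + ")"; }

    public void setFeature(String featureID, boolean state) throws NewParserException {
        if (!NORMALIZE.equals(featureID)) {
            inner.setFeature(featureID, state); // forward everything else
        }
        // Normalisation is this filter's own job, so there is nothing to forward.
    }
}

// Application side: keep whichever parser the call leaves us with.
class ParserSetup {
    static ToyParserApi configure(ToyParserApi parser, String featureID, boolean state) {
        try {
            parser.setFeature(featureID, state);
            return parser;
        } catch (NewParserException e) {
            return e.getParser();
        }
    }
}
```

The cost of this design is that every call site has to be prepared to
replace its parser reference, which is why the sketch funnels all
configuration through one helper.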

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Mon Mar  8 15:09:34 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E32900.BBDF43C0@jclark.com>
Message-ID: <36E3E80B.C3E55F16@locke.ccil.org>

James Clark scripsit:

> I can easily imagine wanting to
> expand external general entities declared in the internal subset, but
> not wanting to read an external DTD.

Or, indeed, the converse: I might want to get the whole DTD but
make my own decisions about loading external general entities.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Mon Mar  8 15:15:47 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <18b603b2.36e3e337@aol.com>
References: <18b603b2.36e3e337@aol.com>
Message-ID: <14051.59370.316671.640337@localhost.localdomain>

MikeDacon@aol.com writes:

 > 1. I still strongly urge not to use a URI for a feature or infoID.
 > These are not resource locations they are just a descriptive
 > string.  In fact, I bet that most parsers just implement your
 > initial recommended set.

Yes, but what about filters that perform specialised actions?  And
what about adding support (stable or experimental) for new XML-related
features like schemas, datatyping, and linking as they become
available?

The problem with SAX 1.0 is that it froze the XML status quo of about
a year ago, and many interesting things have happened since then; with 
ModSAX, I'd like to leave the API open for two reasons:

1. so that we can extend it without breaking existing implementations; 
   and

2. so that people can experiment with different ways of supporting new 
   features within the SAX framework.

As I wrote before, it doesn't much matter whether we use Java property 
names incorporating domain names (like
'org.xml.sax.features.validation') or URIs (like
'http://xml.org/sax/features/validation'), as long as we have the
ability for people to create new names without fear of collision.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From cowan at locke.ccil.org  Mon Mar  8 15:18:11 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:44 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <004b01be691c$f348fc40$c9a8a8c0@thing2>
Message-ID: <36E3E967.F0D6690B@locke.ccil.org>

Bill la Forge wrote:

>     o  Event objects for one.

But event objects are very easy to build on top of the
existing SAX.  Just do it!
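
One way this layering might look, as a rough sketch only. It assumes
the later SAX2 DefaultHandler class and the JAXP SAXParserFactory
shipped with current JDKs, neither of which is the SAX 1.0 API under
discussion here; SaxEvent and EventCollector are hypothetical names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class EventObjects {
    // Hypothetical event type: one immutable object per SAX callback.
    static final class SaxEvent {
        final String kind;   // "start", "end" or "text"
        final String value;  // element name or character data
        SaxEvent(String kind, String value) { this.kind = kind; this.value = value; }
        @Override public String toString() { return kind + ":" + value; }
    }

    // Adapter that turns each callback into an event object.
    static final class EventCollector extends DefaultHandler {
        final List<SaxEvent> events = new ArrayList<>();
        @Override public void startElement(String uri, String local, String qName, Attributes atts) {
            events.add(new SaxEvent("start", qName));
        }
        @Override public void endElement(String uri, String local, String qName) {
            events.add(new SaxEvent("end", qName));
        }
        @Override public void characters(char[] ch, int start, int length) {
            // A parser may deliver text in several chunks; each becomes its own event.
            events.add(new SaxEvent("text", new String(ch, start, length)));
        }
    }

    // Parses a string and returns the collected events, wrapping checked exceptions.
    static List<SaxEvent> parse(String xml) {
        EventCollector collector = new EventCollector();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), collector);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return collector.events;
    }

    public static void main(String[] args) {
        for (SaxEvent e : parse("<a><b>hi</b></a>")) {
            System.out.println(e);
        }
    }
}
```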

>     o  A way to specify a filter to a DOM-building-parser is another.
>     o  Better integration with the DOM in general.

The chief problem here is that SAX doesn't provide all the
information that a DOM builder needs, notably the default
value of attributes.

> I'm sure others have their own feature list.

If we can standardize feature control, then feature lists can be
implemented in parsers or parser filters.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From jtauber at jtauber.com  Mon Mar  8 15:26:24 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:44 2004
Subject: URIs for features (was Re: SAX RFD: ModSAX Predefined Features)
Message-ID: <01c501be6977$3a60ce00$0300000a@othniel.cygnus.uwa.edu.au>

>...and RDF [1]. Giving interesting entities URIs makes them more
>fully a part of the Web, and means we can take advantage of
>URI-oriented metadata. Eg. you might search a software database for
>resources that were of type 'Perl Module' and that implemented the
>feature  known as 'http://xml.org/sax/features/validation'. (There's
>already a Linux Packages Database[2] along similar lines...). I'm not
>claiming that this would be impossible using the Java naming scheme,
>just that a Web oriented approach might make it easier to do certain
>things...


I wonder if this could be extended to more general features of XML software,
not just SAX parsers. I wouldn't mind trying this out with XMLSOFTWARE.COM
(http://www.xmlsoftware.com/).

One of the problems that I have is with a canonical form of feature values
for XML software like platform. URIs might provide just the solution. A Java
2 XSL processor conforming to the WD-xsl from 16th December 1998 might be
specified in terms of http://java.sun.com/products/jdk/1.2/ and
http://www.w3.org/TR/1998/WD-xsl-19981216

Actually, now that I think of it, we already have namespaces for content.
They are called notations. There seems to be some link here.

James




From costello at mitre.org  Mon Mar  8 15:56:06 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:09:44 2004
Subject: Architectural Forms Questions
References: <Pine.GHP.4.02A.9903081149260.2617-100000@mail.ilrt.bris.ac.uk>
Message-ID: <36E3F30C.F6D6DB51@mitre.org>

Hi Folks,

We have some beginner's questions on Architectural Forms.

The motivation for this message is our interest in creation, discovery,
sharing and reuse of mappings. 

- How powerful is the correspondence that you can express with
Architectural Forms?  Is it essentially limited to renaming and
omission?

- In addition to using Architectural Forms to express correspondences
that are known a priori, could you use them to document mappings that
are discovered "on-the-fly" by modifying a document or DTD after a
mapping is discovered?

- It appears to be the case that the correspondence between A and B must
be documented in a way that keeps the mapping tightly coupled to either
A or B. Are there any plans to represent the correspondence so that it
is not tightly coupled to either A or B?

- Is it a correct interpretation to say that Architectural Forms
represent correspondence by overloading existing language constructs?

- Given that subtyping and inheritance have been part of the primary XML
"schema" proposals, is it likely that XML Architectural Forms will be
overtaken by advances in the XML schema area?

Thanks.  /Roger




From jmcdonou at library.berkeley.edu  Mon Mar  8 17:09:35 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun  7 17:09:45 2004
Subject: Opinions requested
In-Reply-To: <19990306154022.B22308@io.mds.rmit.edu.au>
References: <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
 <000801be66ab$6c0d3c00$5118a8c0@kuantech1.quokka.com>
 <36DF4CE1.7F4D3681@simdb.com>
 <3.0.5.32.19990305093729.00c6fcf0@library.berkeley.edu>
Message-ID: <3.0.5.32.19990308090238.00c74c90@library.berkeley.edu>

Thanks for the update on SIM.  It's definitely more advanced in its
development than I thought.  A few additional comments, and a clarification:

At 03:40 PM 3/6/1999 +1100, Marcelo Cantos wrote:
>More important than any specifics, however, is the issue of what you
>call a DBMS.  To me, a DBMS is a database management system (seems
>painfully obvious, but I think it bears repeating).  You may argue
>that a product is not a DBMS if it does not support feature X, and I
>don't entirely disagree.  When one talks of a DBMS one is conjuring up
>a certain image in the mind of the listener, and that image may well
>include feature X.  To be fair to SIM, however, the essence of a DBMS
>is that it manages a collection of data.  If it doesn't support
>transactions, this does not entail that it does not manage data.
>Rather it simply has limits on the way the data is managed (i.e. it
>doesn't manage data as well as one would like).
>
>You clearly believe that transaction support is part of the essence of
>what makes a DBMS.  I disagree, indeed, I profoundly disagree.  There
>is nothing in the concept of a database that mandates any such
>requirement.  Rather I would say that transaction support is an
>important issue for any _good_ DBMS.  Likewise for referential
>integrity and concurrency (and, for that matter, support for
>declarative queries, use of indexes, a rich set of fundamental data
>types, etc.).  If I recall correctly, dBase III was generally
>acknowledged to be a DBMS though it lacked most of these requirements,
>and could barely even call itself relational!

I agree with all of the above, and I didn't mean to particularly single
out transaction support.  In addition to the point you raise that a DBMS
calls to mind a particular set of features (not all of which need to
be present to qualify a system as a DBMS), I'd add that particular systems
are developed based on previous work within a particular paradigm (oh man,
referencing Kuhn before I've even had coffee -- been a grad student too long)
and I see SIM as much more following in the lineage of IR systems than
DBMS systems.  I'll grant there's overlap, and SIM is obviously moving
towards a graceful integration of the two areas, but I'd characterize
it as moving from an IR engine towards a combined IR/DBMS system.

>I guess this all boils down to what's in a name.  At the end of the
>day, it is far more important to know what a product does and does not
>do than what you call it.
>

Agreed, but as you mentioned, particular names invoke an understanding
of what a system does/what features it may be expected to support, etc.
While these understandings may overlap from one person to the next, often
they don't, and I think DBMS are an example of an area where they can mean
quite different things to different people.  Hence, the frequency of people
saying 'DBMS don't handle SGML/XML' occurring side by side with people
saying 'what, are you crazy?  Of course they do.'

>I am sceptical that any RDBMS vendor can come to the party in terms of
>performance.  Past attempts to try to force text into a relational,
>table or object based paradigm have not reaped great success (Oracle's
>ConText comes to mind as an example of how forcing a square peg into a
>round hole requires sacrificing the edges of performance).  I would be
>surprised if any of the major database vendors would be prepared to
>venture away from their core competency (the relational model) to
>address the performance issues.
>

I share your skepticism, but we can hope.  If nothing else, there appears
to be at least the dawnings of an understanding among the major DBMS
vendors that there's a huge market for text management/retrieval products.
Some of the approaches taken by the object-oriented database folks, like
Informix's data blades, struck me as having promise.

>I strongly disagree that SIM doesn't handle SGML/XML well.

Ah, now here, I'm afraid you're putting words into my mouth.  To clarify,
I think SIM handles SGML/XML very well indeed; one of the best I've seen,
in fact.  I said I don't think any DBMS handles SGML/XML well, but I also 
excluded SIM from the DBMS category.  Sorry, I should have been clearer 
about that.

From what you've said, though, SIM does appear to be shaping up as
a very interesting IR/DBMS hybrid.  The referential integrity hooks
are a very nice plus.  I have one piece of advice: promote yourselves
more! :)  I looked over the SIM web site before my post, and didn't see
any discussion of the new features you're working on.  A few words about
future directions you're exploring for your product would be a good thing.


Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000    (510) 642-5168          |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /  SGNORMPF!!!
         -- From the Famous Last Words file        |    ||||



From elharo at metalab.unc.edu  Mon Mar  8 17:17:40 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:09:45 2004
Subject: Java Specification Request for XML
In-Reply-To: <36DD9EA1.2CEE7CEA@eng.sun.com>
Message-ID: <v03102803b309a99cfca5@[168.100.203.234]>

At 12:42 PM -0800 3/3/99, David Brownell wrote:


>The Java Community Process is an open, inclusive process and we
>look forward to the active participation of all interested parties.
>

The process, and its relative openness, is a little more obvious if you
remove the passive voice. Compare this:

>The process goes forward in several steps:
>
>[1] The JSR is presented for comment (as you've seen)
>[2] The JSR is approved (we hope)
>[3] An expert group is formed to write the specification; this
>    begins with a "Call for Experts" (CAFE) to participate.
>[4] The expert group writes a first draft of the specification
>[5] The draft is circulated to all Java technology licensees and
>    Participants in the Java Community Process.
>[6] Comments are collected, read, and responded to by the expert
>    group, resulting in an improved specification.
>[7] The refined specification is then released to the public for
>    comment.
>[8] Comments from the public are collected, read, and responded
>    to by the expert group, resulting in more refinements.
>[9] The final specification is produced by the expert group, along
>    with a reference implementation and compatibility tests.
>

to this:

[1] Sun presents the JSR for comment (as you've seen)
[2] Sun's Process Management Office approves the JSR.
[3] Sun forms an expert group to write the specification; this
    begins with a "Call for Experts" (CAFE) to participate.
    [Sun chooses the leader of the group, who then chooses
     the remainder of the experts.]
[4] The expert group writes a first draft of the specification
[5] Sun circulates the draft  to all Java technology licensees and
    Participants in the Java Community Process. [that is,
    companies who have paid Sun thousands of dollars to do this]
[6] The expert group collects, reads, and responds to comments,
    resulting in an improved specification.
[7] Sun releases the refined specification to the public for comment.
[8] The expert group collects, reads, and responds to comments,
    resulting in more refinements.
[9] The expert group produces the final specification, along
    with a reference implementation and compatibility tests.

>The key point is that everyone with internet access will get a
>chance to review and comment on the emerging specification.
>

They can review and comment. There's no promise that
anyone will even listen to their comments, much less act on them.

There are a number of aspects of this "open" process that aren't mentioned
here.

1. It costs between $2,000 (educational) and $5,000 (commercial) to
participate as an expert.

2. Sun owns the copyright and other intellectual property rights related to
the spec. As owner, they will not allow derivative works they decide are
incompatible.

3. Participants in the expert group can't talk about the ongoing work with
outsiders.

4. Only company employees are allowed to be experts. Freelancers like many
of those who participated in the development of SAX and XML are excluded.
This is similar to W3C procedures, but the W3C allows exceptions for
recognized experts. Sun does not.

To me these alone make it pretty clear that this process is open in name
only. If you're still not convinced, ask yourself these questions:

1. Can anyone tell Sun No? Can anyone keep Sun from putting something into
the spec that it wants in? Or put something in that Sun wants to keep
out?

2. Can Sun's enemies (e.g. Microsoft and HP) participate in this process
on an equal footing with Sun? Can they even participate at all?

Bottom line: The openness of this process is PR, pure and simple. When you
actually read the fine print, all Sun does is agree to let other companies
contribute their time, money, and knowledge to help Sun do what it wants to
do anyway.  That may be intelligent business, but it's not an open,
community based process for developing standards.


+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|        XML: Extensible Markup Language (IDG Books 1998)            |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+




From david at megginson.com  Mon Mar  8 19:52:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E3E712.D5556233@locke.ccil.org>
References: <14051.3215.196642.22571@localhost.localdomain>
	<36E3E712.D5556233@locke.ccil.org>
Message-ID: <14052.10627.837114.651600@localhost.localdomain>

John Cowan writes:
 > David Megginson wrote:
 > 
 > >   public abstract void setFeature (String featureID, boolean state)
 > >     throws SAXNotSupportedException;
 > 
 > I want to propose a restriction and an extension:
 > 
 > 1) This method cannot be called after any other parser method
 > has been invoked.

Wouldn't it be better to allow the parser/filter to make that decision?
If the user attempts to change something during a parse that should
*not* be changed during a parse, the parser/filter can throw a
SAXNotSupportedException.
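
A minimal sketch of that behaviour, assuming a hypothetical StubParser
with invented beginParse/endParse hooks. Only SAXNotSupportedException
is a real class (org.xml.sax in current JDKs); the choice of which
features are locked during a parse is the parser's own.

```java
import org.xml.sax.SAXNotSupportedException;

class MidParseFeatures {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    // Hypothetical parser stub: only the feature-setting logic is shown.
    static final class StubParser {
        private boolean parsing = false;
        private boolean validating = false;

        void beginParse() { parsing = true; }
        void endParse() { parsing = false; }

        // The parser itself decides which changes are unsafe during a parse.
        void setFeature(String featureId, boolean state) throws SAXNotSupportedException {
            if (!VALIDATION.equals(featureId)) {
                throw new SAXNotSupportedException("Unknown feature: " + featureId);
            }
            if (parsing) {
                throw new SAXNotSupportedException("Cannot change validation during a parse");
            }
            validating = state;
        }

        boolean isValidating() { return validating; }
    }

    // Returns true if the parser accepted the change, false if it refused.
    static boolean trySet(StubParser p, String id, boolean state) {
        try {
            p.setFeature(id, state);
            return true;
        } catch (SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        StubParser p = new StubParser();
        System.out.println(trySet(p, VALIDATION, true));   // before the parse
        p.beginParse();
        System.out.println(trySet(p, VALIDATION, false));  // during the parse
    }
}
```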

 > 2) This method is allowed to throw a SAXNewParserException, which
 > encapsulates a replacement parser.  The application should use
 > the parser inside the exception in place of the original parser.
 > This allows parsers to push filters on top of themselves, which
 > complements the ability of applications to push them.

I think that this could be layered on top of SAX, simply by
subclassing SAXNotSupportedException.
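
A rough sketch of how such a subclass might be layered on top, under
stated assumptions: FeatureParser, NaiveParser, NamespaceFilter and the
enable() helper are all hypothetical, and the namespaces feature name
is used only as an example. An application that does not know about
the subclass still sees an ordinary SAXNotSupportedException.

```java
import org.xml.sax.SAXNotSupportedException;

class ReplacementParserDemo {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Minimal hypothetical parser interface for the sketch.
    interface FeatureParser {
        void setFeature(String featureId, boolean state) throws SAXNotSupportedException;
        String describe();
    }

    // The proposed exception, written as a subclass that carries a replacement parser.
    static final class SAXNewParserException extends SAXNotSupportedException {
        private final transient FeatureParser replacement;
        SAXNewParserException(String message, FeatureParser replacement) {
            super(message);
            this.replacement = replacement;
        }
        FeatureParser getReplacement() { return replacement; }
    }

    // A parser that is naive about namespaces and pushes a filter on top of itself.
    static final class NaiveParser implements FeatureParser {
        public void setFeature(String featureId, boolean state) throws SAXNotSupportedException {
            if (NAMESPACES.equals(featureId) && state) {
                throw new SAXNewParserException("Use the filtered parser", new NamespaceFilter(this));
            }
            throw new SAXNotSupportedException("Unsupported: " + featureId);
        }
        public String describe() { return "NaiveParser"; }
    }

    // Filter that handles the namespaces feature and delegates everything else.
    static final class NamespaceFilter implements FeatureParser {
        private final FeatureParser inner;
        NamespaceFilter(FeatureParser inner) { this.inner = inner; }
        public void setFeature(String featureId, boolean state) throws SAXNotSupportedException {
            if (NAMESPACES.equals(featureId)) {
                return; // handled by this filter
            }
            inner.setFeature(featureId, state);
        }
        public String describe() { return "NamespaceFilter(" + inner.describe() + ")"; }
    }

    // Application-side recovery: swap in the replacement parser if one is offered.
    static FeatureParser enable(FeatureParser parser, String featureId) {
        try {
            parser.setFeature(featureId, true);
            return parser;
        } catch (SAXNewParserException e) {
            return e.getReplacement();
        } catch (SAXNotSupportedException e) {
            throw new IllegalStateException("Feature not available: " + featureId, e);
        }
    }

    public static void main(String[] args) {
        FeatureParser p = enable(new NaiveParser(), NAMESPACES);
        System.out.println(p.describe()); // prints "NamespaceFilter(NaiveParser)"
    }
}
```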


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tomh at thinlink.com  Mon Mar  8 22:02:37 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain>
Message-ID: <36E44898.CB8E18C4@thinlink.com>

David Megginson wrote:

> As I wrote before, it doesn't much matter whether we use Java property
> names incorporating domain names (like
> 'org.xml.sax.features.validation') or URIs (like
> 'http://xml.org/sax/features/validation'), as long as we have the
> ability for people to create new names without fear of collision.

I would also urge against using an http: URI since it is not meant that a resource actually be
retrieved using the http protocol.




From cowan at locke.ccil.org  Mon Mar  8 22:12:55 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
		<36E3E712.D5556233@locke.ccil.org> <14052.10627.837114.651600@localhost.localdomain>
Message-ID: <36E44B40.4303A066@locke.ccil.org>

David Megginson wrote:

> Wouldn't it be better to allow the parser/filter to make that decision?

Yes.

>  > 2) This method is allowed to throw a SAXNewParserException, which
>  > encapsulates a replacement parser.  The application should use
>  > the parser inside the exception in place of the original parser.
>  > This allows parsers to push filters on top of themselves, which
>  > complements the ability of applications to push them.
> 
> I think that this could be layered on top of SAX, simply by
> subclassing SAXNotSupportedException.

Yes, but by making it part of the core SAX protocol for setting
features, we guarantee universal support for it.  A parser that knows
itself to be naive about namespaces can load the NamespaceFilter and
push it on top of itself, almost transparently to the application.
Otherwise, every application that wants namespace support needs
specialized knowledge about how to recover from SAXNotSupportedExn.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From MikeDacon at aol.com  Mon Mar  8 22:27:50 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <e211101b.36e44e98@aol.com>

Hi David,

In a message dated 3/8/99 12:19:24 PM Eastern Standard Time,
david@megginson.com writes:
> Yes, but what about filters that perform specialised actions?  And
>  what about adding support (stable or experimental) for new XML-related
>  features like schemas, datatyping, and linking as they become
>  available?

You are absolutely right that extensibility is important. And, as
you also stated, both naming schemes provide that ability.
  
>  As I wrote before, it doesn't much matter whether we use Java property 
>  names incorporating domain names (like
>  'org.xml.sax.features.validation') or URIs (like
>  'http://xml.org/sax/features/validation'), as long as we have the
>  ability for people to create new names without fear of collision.

Why do you need a domain name in there?  I think one Parser/Filter
implementor would be loath to implement another company's feature name
if it had sun.com or microsoft.com in it.  That was the chief problem
that developers had with Sun naming the Swing package com.sun.swing.
I thought your features would have a single root tree like:

sax.feature

So that all features would be:

sax.feature.whatever.myfeature

as well as 

sax.props   (for properties)

Now, I understand the domain name being in there is a piggyback off of
DNS.  But, I still believe that functional features (of both Parser and
Filters) are a finite domain -- whereas the web is not.  That is why I
don't see the correlation between this feature set and XML namespaces.
If you agree that features and props are a finite domain (and in the
whole scheme of things a rather small one), then a single naming tree
should suffice.

Also, Daniel Brickley mentioned a Java bias.  I can understand his
concern; heck, let's separate them with the delimiter of your choice
(hyphens, underscore, etc.).

While we are on the subject of bias: a URI has a resource/file system 
bias.  To me, that bias was just confusing (and overkill) for something that 
I felt was best expressed with one-word String constants (if you added 
the initial default set to the interface).

Lastly, I would like to say that I do like your idea for the general 
property query and am glad you proposed it.  The naming concerns I
express here I deem as minor issues.

Best wishes,

 - Mike (mdaconta@aol.com)



From david at megginson.com  Mon Mar  8 22:32:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <36E44898.CB8E18C4@thinlink.com>
References: <18b603b2.36e3e337@aol.com>
	<14051.59370.316671.640337@localhost.localdomain>
	<36E44898.CB8E18C4@thinlink.com>
Message-ID: <14052.19853.887104.987727@localhost.localdomain>

Tom Harding writes:
 > David Megginson wrote:
 > 
 > > As I wrote before, it doesn't much matter whether we use Java property
 > > names incorporating domain names (like
 > > 'org.xml.sax.features.validation') or URIs (like
 > > 'http://xml.org/sax/features/validation'), as long as we have the
 > > ability for people to create new names without fear of collision.
 > 
 > I would also urge against using an http: URI since it is not meant
 > that a resource actually be retrieved using the http protocol.

I've been thinking about this issue, and I'm fairly convinced that the 
URI is the right choice.

Think of the URI as a statement of ownership.  Assume that my ISP is
host.net, and that I've been allocated 5MB of web space at
http://host.net/foo/.

I am the only one who has the right to make a resource available at
http://host.net/foo/, so I am the one who has the (moral) right to
construct feature IDs based on http://host.net/foo/.  It is not
sufficient simply to use the domain name "host.net", because I don't
own the domain (someone else could construct the same feature ID), and
it is not sufficient to use something starting with net.host.foo,
because I *don't* have the right to make something available at, say,
ftp://host.net/foo/ -- host.net has made the foo space available to me only
through the HTTP protocol.  Perhaps Foo enterprises has a download
directory at ftp://host.net/foo/, and they might want to construct
their own property ID based on it.
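
The ownership argument can be made concrete with a small check, offered
only as a sketch: the ownedBy() helper is hypothetical, and it assumes
the owned base URI ends with a slash. A feature ID counts as "mine"
only if scheme, host, port and path prefix all match the space
allocated to me.

```java
import java.net.URI;

class FeatureOwnership {
    // Returns true if featureId lies inside the URI space identified by ownedBase.
    // Assumption: ownedBase ends with "/" so that "/foo/" does not match "/foobar/".
    static boolean ownedBy(String ownedBase, String featureId) {
        URI base = URI.create(ownedBase).normalize();
        URI id = URI.create(featureId).normalize();
        return base.getScheme() != null
            && base.getScheme().equalsIgnoreCase(id.getScheme())
            && base.getHost() != null
            && base.getHost().equalsIgnoreCase(id.getHost())
            && base.getPort() == id.getPort()
            && id.getPath() != null
            && id.getPath().startsWith(base.getPath());
    }

    public static void main(String[] args) {
        String mine = "http://host.net/foo/";
        // Same scheme, host and path prefix: within my space.
        System.out.println(ownedBy(mine, "http://host.net/foo/features/my-feature"));
        // Same host and path but a different scheme: possibly someone else's space.
        System.out.println(ownedBy(mine, "ftp://host.net/foo/features/my-feature"));
    }
}
```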

Namespaces seems to have got it right.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar  8 22:42:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <e211101b.36e44e98@aol.com>
References: <e211101b.36e44e98@aol.com>
Message-ID: <14052.20696.226477.386853@localhost.localdomain>

MikeDacon@aol.com writes:

 > Why do you need a domain name in there?  I think one Parser/Filter
 > implementor would be loath to implement another company's feature
 > name if it had sun.com or microsoft.com in it.  That was the chief
 > problem that developers had with Sun naming the Swing package
 > com.sun.swing.  

A neutral .org domain usually provides a nice way around that
problem.

 > Now, I understand the domain name being in there is a piggyback off
 > of DNS.  But, I still believe that functional features (of both
 > Parser and Filters) are a finite domain -- whereas the web is not.
 > That is why I don't see the correlation between this feature set
 > and XML namespaces.  If you agree that features and props are a
 > finite domain (and in the whole scheme of things a rather small
 > one), then a single naming tree should suffice.

I expect the number of features to grow slowly, but I do not think
that it is clearly bounded, especially not with all the XML-related
work going on right now.  A couple of years from now we could have
data-typing, digital signing, and who knows what else.

Furthermore, I do not want to have to set up my own registration
authority, and I do not want developers to have to wait for anyone to
approve their feature names before they can ship.


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Daniel.Brickley at bristol.ac.uk  Mon Mar  8 22:58:18 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:09:45 2004
Subject: Naming ModSAX features: good use for the 'java:' URI scheme?
In-Reply-To: <36E44898.CB8E18C4@thinlink.com>
Message-ID: <Pine.GHP.4.02A.9903082214540.9289-100000@mail.ilrt.bris.ac.uk>


On Mon, 8 Mar 1999, Tom Harding wrote:

> David Megginson wrote:
> 
> > As I wrote before, it doesn't much matter whether we use Java property
> > names incorporating domain names (like
> > 'org.xml.sax.features.validation') or URIs (like
> > 'http://xml.org/sax/features/validation'), as long as we have the
> > ability for people to create new names without fear of collision.
> 
> I would also urge against using an http: URI since it is not meant that a resource actually be
> retrieved using the http protocol.


I think I've found a compromise of sorts that'll let us use the Java
naming scheme (for those uncomfortable with naming conceptual entities
in the http namespace), whilst still using URIs.

From http://www.w3.org/Addressing/schemes.html

	Addressing Schemes 
	This is (an attempt at) an exhaustive list of URI schemes. I try to list
	them all, whether they're standard or not. 

Under 'J' we find a useful looking entry...

	java:  identifies java classes (@@spec?) 
	javascript: 

There's also a reference to a JavaRMI: URI scheme invented by Bill
Jansen, which would be interesting to track down. But anyway...


So... here's the proposal:

	Naming ModSAX Features

	ModSAX is intended to be easily extensible, and is designed to
	anticipate future independently developed extensions ('features').
	For ModSAX-aware software to cope with the decentralised evolution of
	new features, it is important to have a controlled mechanism for naming
	these features unambiguously. For this we adopt the Uniform Resource
	Identifier (URI) system defined in RFC 2396[URI]. Each (version of a) ModSAX
	feature should be assigned a unique URI. It should not be assumed that 
	these identifiers can always be  dereferenced to acquire further
	information about the feature they name. 
	
	For example, the 'http:' and 'java:' schemes can be used.
	'http://purl.org/net/sax/MyFeature' and 'java:org.desire.sax.MyFeature'
	are both legitimate names for SAX features. 'phone:+44-117-9287493'
	would not be an appropriate name, since the 'phone:' URI namespace can
	only be used for telephone numbers. 

This way, people who manage http: URI names and want to use them to name
SAX features are free to do so. Others can piggyback on the DNS via the
java: scheme instead. But both through the same overarching approach.
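
To make the naming rule concrete, here is a minimal sketch in Java of how
ModSAX-aware software might check a proposed feature name. It assumes the
standard java.net.URI class; the FeatureNames class, the isSuitable method
and the choice of exactly two accepted schemes are all hypothetical,
taken only from the example in the proposal text above. The name is
inspected as a string and never dereferenced.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.Locale;
import java.util.Set;

class FeatureNames {
    // Schemes treated as suitable in this sketch, taken from the example in
    // the proposal text. The set itself is an assumption, not a SAX rule.
    private static final Set<String> SUITABLE_SCHEMES = Set.of("http", "java");

    // True if the name parses as an absolute URI in one of the suitable
    // schemes. The name is only inspected, never dereferenced.
    static boolean isSuitable(String name) {
        try {
            String scheme = new URI(name).getScheme();
            return scheme != null
                && SUITABLE_SCHEMES.contains(scheme.toLowerCase(Locale.ROOT));
        } catch (URISyntaxException e) {
            return false; // not a well-formed URI at all
        }
    }

    public static void main(String[] args) {
        String[] names = {
            "http://purl.org/net/sax/MyFeature",
            "java:org.desire.sax.MyFeature",
            "phone:+44-117-9287493"
        };
        for (String name : names) {
            System.out.println(name + " -> " + isSuitable(name));
        }
    }
}
```

A bare dotted name such as 'org.xml.sax.features.validation' has no scheme,
so this particular check would reject it until it is written with the
java: prefix.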
 

So... It would be nice to have a reference to some spec defining the 'java:'
URI scheme mentioned at http://www.w3.org/Addressing/schemes.html
Maybe somebody from Sun has a pointer to this...?

BTW as a side effect of having a URI scheme for Java classes and
interfaces, we can exchange (aggregate, search, reason over) RDF
metadata about those resources. This would be handy in Sun's JINI
amongst other places.... Here's a quick and dull example of metadata
keyed off a java: URI...

	<rdf:RDF
	  xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
	  xmlns:rdfs="http://www.w3.org/TR/PR-rdf-schema#"
	  xmlns:dc="http://purl.org/metadata/dublin_core#">
	<rdf:Description rdf:about="java:org.desire.rudolf.jtree.NavApplet">
	<dc:Creator>Dan Brickley and Larry Franklin</dc:Creator>
	<dc:Description>This applet is an attempt at a metadata browsing tree control</dc:Description>
	<rdfs:seeAlso rdf:resource="../moremetadata.rdf"/>
	</rdf:Description>
	</rdf:RDF>


But I'm sidetracking again. I'm really just saying one thing: the
existence of a URI scheme for Java classes (and packages) means we don't
need to choose between Java and URI naming formalisms. We can have the
best of both worlds...

Dan


[URI] Uniform Resource Identifiers (URI): Generic Syntax; Berners-Lee,  
Fielding, Masinter, Internet Draft Standard August, 1998; RFC2396. 
http://www.isi.edu/in-notes/rfc2396.txt 




From larsga at ifi.uio.no  Mon Mar  8 23:01:23 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <Pine.GHP.4.02A.9903081149260.2617-100000@mail.ilrt.bris.ac.uk>
References: <Pine.GHP.4.02A.9903081149260.2617-100000@mail.ilrt.bris.ac.uk>
Message-ID: <wkbti3egdy.fsf@ifi.uio.no>


* David Megginson
| 
| - org.xml.sax.features.validation is more of a Java flavour.

* Dan Brickley
| 
| Yep... but might not feel so natural for developers working with
| versions of SAX translated for Perl, Python and so on.

I'll be translating this into Python and I see absolutely no problems
with this from that point of view.  It's a natural way to use the DNS
as a basis for a naming system and Java just happens to use it.
 
--Lars M.




From MikeDacon at aol.com  Mon Mar  8 23:03:34 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <8246f301.36e4560e@aol.com>

Hi David,

In a message dated 3/8/99 5:38:55 PM Eastern Standard Time,
david@megginson.com writes:
> Think of the URI as a statement of ownership.  Assume that my ISP is
>  host.net, and that I've been allocated 5MB of web space at
>  http://host.net/foo/.
>  

This is the primary reason I disagree with using a URI.  
A feature is not a resource.  Also, a standard interface to a set 
of features is not the place to invoke ownership privileges.
You can't own a feature that you expect others to implement.

Unless I am not getting your idea of a feature, your logic seems
incorrect.

Interesting discussion and process (well worth it),

 - Mike (mdaconta@aol.com)



From Daniel.Brickley at bristol.ac.uk  Mon Mar  8 23:09:36 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <14052.19853.887104.987727@localhost.localdomain>
Message-ID: <Pine.GHP.4.02A.9903082301370.9289-100000@mail.ilrt.bris.ac.uk>

On Mon, 8 Mar 1999, David Megginson wrote:
> Tom Harding writes:
>  > David Megginson wrote:
>  > 
>  > > As I wrote before, it doesn't much matter whether we use Java property
>  > > names incorporating domain names (like
>  > > 'org.xml.sax.features.validation') or URIs (like
>  > > 'http://xml.org/sax/features/validation'), as long as we have the
>  > > ability for people to create new names without fear of collision.
>  > 
>  > I would also urge against using an http: URI since it is not meant
>  > that a resource actually be retrieved using the http protocol.
> 
> I've been thinking about this issue, and I'm fairly convinced that the 
> URI is the right choice.
> 
> Think of the URI as a statement of ownership.  Assume that my ISP is
> host.net, and that I've been allocated 5MB of web space at
> http://host.net/foo/.
> 
[...]


Just to head off one possible objection... that of the persistence (or
lack thereof) of http URLs. The PURL folks (Persistent URLs) make a
credible case when they argue that URLs can be managed just as
responsibly as URNs, and that persistence of http naming is a social
issue, not a technical one. PURL servers are available to help here --
e.g. XML-DEV's own XSchema (now DDML) pages have been available from
several different http servers, but have always had the same URI:
http://purl.oclc.org/NET/xschema
The PURL server at that address sends an HTTP redirect message if you
try to dereference it. So we could, for example, use PURLs to name
software features, with reassurance that PURL.ORG have committed to do
their best to manage http://purl.org/* names responsibly.

> Namespaces seems to have got it right.

Yep. 


Dan




From b.laforge at jxml.com  Mon Mar  8 23:13:49 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:45 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <01ab01be69b8$39f1cf00$c9a8a8c0@thing2>

From: John Cowan <cowan@locke.ccil.org>
>>  > 2) This method is allowed to throw a SAXNewParserException, which
>>  > encapsulates a replacement parser.  The application should use
>>  > the parser inside the exception in place of the original parser.
>>  > This allows parsers to push filters on top of themselves, which
>>  > complements the ability of applications to push them.
>> 
>> I think that this could be layered on top of SAX, simply by
>> subclassing SAXNotSupportedException.
>
>Yes, but by making it part of the core SAX protocol for setting
>features, we guarantee universal support for it.  A parser that knows
>itself to be naive about namespaces can load the NamespaceFilter and
>push it on top of itself, almost transparently to the application.
>Otherwise, every application that wants namespace support needs
>specialized knowledge about how to recover from SAXNotSupportedExn.


There are really three approaches here:
1. An application pushes a filter "on top of" a parser. In this case, the application
    starts with a parser and chooses to augment it with a filter.
2. The application requests a feature of the parser and the parser elects to wrap
    itself in a filter. For efficiency reasons(?), it asks the application to now use
    the filter in place of itself.
3. An application works with a pseudo-parser. It asks for various features and
    the pseudo-parser selects a parser and a set of filters which together can deliver
    the requested capabilities.
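
Approach 2 can be sketched roughly as follows. This is an illustration
only: MiniParser, NaiveParser, NamespaceFilter, NewParserException and the
Features helper are hypothetical stand-ins, not the real SAX or ModSAX
classes, and the exception simply plays the role John describes for
SAXNewParserException.

```java
class ReplacementDemo {
    public static void main(String[] args) {
        MiniParser parser = new NaiveParser();
        // The application asks for a feature and keeps whatever comes back.
        parser = Features.request(parser, Features.NAMESPACES);
        System.out.println(parser.describe());
    }
}

// Hypothetical cut-down parser interface, for illustration only.
interface MiniParser {
    void setFeature(String featureID, boolean state) throws NewParserException;
    String describe();
}

// Carries the parser the application should use from now on.
class NewParserException extends Exception {
    private final MiniParser replacement;

    NewParserException(MiniParser replacement) {
        super("use the replacement parser");
        this.replacement = replacement;
    }

    MiniParser getReplacement() { return replacement; }
}

// A parser that is naive about namespaces and wraps itself in a filter.
class NaiveParser implements MiniParser {
    public void setFeature(String featureID, boolean state) throws NewParserException {
        if (state && Features.NAMESPACES.equals(featureID)) {
            throw new NewParserException(new NamespaceFilter(this));
        }
        // Every other request is silently accepted in this sketch.
    }

    public String describe() { return "NaiveParser"; }
}

// The filter answers the namespace feature itself and passes the rest down.
class NamespaceFilter implements MiniParser {
    private final MiniParser inner;

    NamespaceFilter(MiniParser inner) { this.inner = inner; }

    public void setFeature(String featureID, boolean state) throws NewParserException {
        if (Features.NAMESPACES.equals(featureID)) {
            return; // handled at this layer
        }
        inner.setFeature(featureID, state);
    }

    public String describe() { return "NamespaceFilter(" + inner.describe() + ")"; }
}

class Features {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    // Recovery logic every application would need under approach 2.
    static MiniParser request(MiniParser parser, String featureID) {
        try {
            parser.setFeature(featureID, true);
            return parser;
        } catch (NewParserException e) {
            return e.getReplacement();
        }
    }
}
```

Approach 1 would have the application construct the NamespaceFilter
itself, and approach 3 would hide both steps behind a factory; only the
place where the wrapping decision is made differs.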

I do like David's proposal--it's pretty open ended. The method get(infoID) will even
serve as a front-end for aggregation! But I see a problem in trying to go too
far on the feature selection path. The assumption seems to be that we are
dealing here with a completely orthogonal set of features which are just selected
or not as needed. There is no sense of structure or architecture here. I'm not sure 
that this is a useful model. Frankly, I much prefer Simon's layered approach:
    http://www.simonstl.com/articles/layering/layered.htm

Again, I'm happy with the interface, but this idea of creating filter structures based
on feature selection seems a bit lame.

Bill




From Daniel.Brickley at bristol.ac.uk  Mon Mar  8 23:18:18 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <8246f301.36e4560e@aol.com>
Message-ID: <Pine.GHP.4.02A.9903082310150.9289-100000@mail.ilrt.bris.ac.uk>

On Mon, 8 Mar 1999 MikeDacon@aol.com wrote:

> Hi David,
> 
> In a message dated 3/8/99 5:38:55 PM Eastern Standard Time,
> david@megginson.com writes:
> > Think of the URI as a statement of ownership.  Assume that my ISP is
> >  host.net, and that I've been allocated 5MB of web space at
> >  http://host.net/foo/.
> >  
> 
> This is the primary reason I disagree with using a URI.  

> A feature is not a resource.  

Software features aren't files, nor are they HTML pages, but they are
'resources' as defined in RFC2396 and as used in the XML Namespaces and
RDF recommendations from W3C.

I'm getting *really* boring on this topic... ;-)


From RFC2396 (online at http://www.isi.edu/in-notes/rfc2396.txt):
	
	A Uniform Resource Identifier (URI) is a compact string of
	characters for identifying an abstract or physical resource.
[...]
      Resource
         A resource can be anything that has identity.  Familiar
         examples include an electronic document, an image, a service
         (e.g., "today's weather report for Los Angeles"), and a
         collection of other resources.  Not all resources are network
         "retrievable"; e.g., human beings, corporations, and bound
         books in a library can also be considered resources.

         The resource is the conceptual mapping to an entity or set of
         entities, not necessarily the entity which corresponds to that
         mapping at any particular instance in time.


>			Also, a standard interface to a set 
> of features is not the place to invoke ownership privileges.

You can own (or manage) the name for the feature though. Javasoft own
all the URIs beginning 'java:java.lang.*'; I own the URIs beginning
'java:org.desire.rudolf.rdf.*'. These can name classes or interfaces
others might implement.

Dan

> You can't own a feature that you expect others to implement.
> 
> Unless I am not getting your idea of a feature, your logic seems
> incorrect.
> 
> Interesting discussion and process (well worth it),
> 
>  - Mike (mdaconta@aol.com)




From b.laforge at jxml.com  Tue Mar  9 00:19:02 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <021a01be69c1$a49538c0$c9a8a8c0@thing2>

From: David Megginson <david@megginson.com>
>I expect the number of features to grow slowly

I suspect otherwise. Especially since the interface would also
be used by filters and DOMWalkers.

Think of the get and set methods as ways of accessing the 
properties on filters which are part of some larger filter
structure (a stack being the simplest case).

In addition to parse events moving from parser-kernel to 
application via a series of filters and event routers, the
get and set "events" move from the application through
the filters and down to the parser-kernel.

Think of the parser and the filters together as a large aggregate
of components. The get, set, setFeature, and setHandler 
may well be intercepted by any component in that aggregate
which recognizes the featureID, handlerID, or infoID.
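
A rough sketch of that interception, in Java. The Component class and
NotRecognizedException are hypothetical and much simpler than the ModParser
interface under discussion; the exception is unchecked here only to keep
the sketch short, whereas the SAX exceptions are checked. Each component
answers for the IDs it recognizes and hands everything else toward the
parser-kernel.

```java
import java.util.HashMap;
import java.util.Map;

class ChainDemo {
    static final String VALIDATION = "http://xml.org/sax/features/validation";
    static final String NORMALIZE = "http://xml.org/sax/features/normalize-text";

    // A two-element stack: a filter on top of a parser-kernel.
    static Component buildStack() {
        Component kernel = new Component(null, VALIDATION);
        return new Component(kernel, NORMALIZE);
    }

    public static void main(String[] args) {
        Component top = buildStack();
        top.set(VALIDATION, Boolean.TRUE);   // travels down to the kernel
        top.set(NORMALIZE, Boolean.FALSE);   // intercepted by the filter
        System.out.println(top.get(VALIDATION));
        System.out.println(top.get(NORMALIZE));
    }
}

// Raised when no component in the aggregate recognizes the ID.
class NotRecognizedException extends RuntimeException {
    NotRecognizedException(String id) { super("not recognized: " + id); }
}

// One member of the aggregate: a filter, or the kernel when next is null.
class Component {
    private final Component next;
    private final Map<String, Object> owned = new HashMap<>();

    Component(Component next, String... ownedIDs) {
        this.next = next;
        for (String id : ownedIDs) {
            owned.put(id, null); // recognized, but no value set yet
        }
    }

    void set(String infoID, Object value) {
        if (owned.containsKey(infoID)) {
            owned.put(infoID, value);        // intercept here
        } else if (next != null) {
            next.set(infoID, value);         // pass toward the kernel
        } else {
            throw new NotRecognizedException(infoID);
        }
    }

    Object get(String infoID) {
        if (owned.containsKey(infoID)) {
            return owned.get(infoID);
        } else if (next != null) {
            return next.get(infoID);
        }
        throw new NotRecognizedException(infoID);
    }
}
```

The application only ever talks to the top of the stack, so adding a
filter that recognizes a new ID needs no change to the components below.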

I see the ModParser interface as currently defined as being 
very important for filters, with the number of featureIDs growing
with the popularity of such filters.

Bill




From tbray at textuality.com  Tue Mar  9 00:34:35 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:46 2004
Subject: Opinions requested
Message-ID: <3.0.32.19990308103203.00e7d2cc@pop.intergate.bc.ca>

At 09:02 AM 3/8/99 -0800, Jerome McDonough wrote:
>I share your skepticism, but we can hope.  If nothing else, there appears
>to be at least the dawnings of an understanding among the major DBMS
>vendors that there's a huge market for text management/retrieval products.
>Some of the approaches taken by the object-oriented database folks, like
>Informix's data blades, struck me as having promise.

There's the rub.  *Is* there really a huge market for text 
management/retrieval?  The history of software is littered with the
corpses of companies who tried to make a go of it in that area; I
know from personal experience that up to and through the year 1996,
there was *not* any such huge market.  Will XML change that?  It
would be nice to think so. -Tim



From elharo at metalab.unc.edu  Tue Mar  9 00:52:11 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:09:46 2004
Subject: Namespaces and DTDs
Message-ID: <36E49A4D.413D71F3@metalab.unc.edu>

Situation:

I have several DTDs with conflicting definitions of certain elements.
(e.g. one defines a HEAD as a TITLE followed by a META and another
defines a HEAD as #PCDATA). I need to use all the DTDs and associated
markup languages for a single document. 

To an extent I can disambiguate them with namespaces.  However, is there
any way I can do this while still validating against the original DTDs?
That is, without rewriting the DTDs to use the qualified names instead of
the original names that are in the DTDs? I've been trying to work with
default values for xmlns attributes, and the like; but that doesn't seem
to get me quite all the way to where I need to go. Am I going to have to
break down and just rewrite the DTDs to use the qualified names?

--
Elliotte Rusty Harold



From dent at oofile.com.au  Tue Mar  9 02:03:11 1999
From: dent at oofile.com.au (Andy Dent)
Date: Mon Jun  7 17:09:46 2004
Subject: Expat API
In-Reply-To: <v04104403b301faab26bc@[192.70.254.157]>
References: <49092BAEAC84D2119B0600805FD40F9F120DBD@MDYNYCMSX1>
Message-ID: <v04011701b30a314eceed@[203.23.215.87]>

>My question is: where is the documentation on how to use the expat
>API? I downloaded version 1.0.2 and ported the code to run the sample
>program on my Macintosh, but I'm pretty much dead in the water. I
>tried sending email to the author (James Clark) twice in the last few
>days, but I have so far failed to receive a response. The comments in
>the header files do not seem to be sufficient.

Dave

We have a c++ wrapper on expat running under CodeWarrior as part of a much
bigger project to make our report writer interchange data with XML. You're
welcome to a copy.

It makes the expat API a LOT easier to use if you are a c++ programmer as
it presents a virtual method interface to expat - you inherit from our
object and override the methods (eg: startElement) that you want to use.

When it's a bit more cleaned up with better samples I'll be submitting it
back to James.

Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia
OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows
PP2MFC - PowerPlant->MFC portability
http://www.highway1.com.au/adsoftware/crossplatform.html



From avirr at LanMinds.Com  Tue Mar  9 05:31:02 1999
From: avirr at LanMinds.Com (Avi Rappoport)
Date: Mon Jun  7 17:09:46 2004
Subject: Opinions requested
In-Reply-To: <3.0.32.19990308103203.00e7d2cc@pop.intergate.bc.ca>
Message-ID: <v04104805b30a5839771a@[207.33.50.55]>

At 4:37 PM -0800 3/8/1999, Tim Bray wrote:
> At 09:02 AM 3/8/99 -0800, Jerome McDonough wrote:
>>I share your skepticism, but we can hope.  If nothing else, there appears
>>to be at least the dawnings of an understanding among the major DBMS
>>vendors that there's a huge market for text management/retrieval products.
>>Some of the approaches taken by the object-oriented database folks, like
>>Informix's data blades, struck me as having promise.
>
> There's the rub.  *Is* there really a huge market for text
> management/retrieval?  The history of software is littered with the
> corpses of companies who tried to make a go of it in that area; I
> know from personal experience that up to and through the year 1996,
> there was *not* any such huge market.  Will XML change that?  It
> would be nice to think so. -Tim

The Web has certainly raised the profile for text retrieval, and the 
amount of text online is larger than it's ever been.  A lot of 
text-management turns out to be going on in relational databases, and 
those are pretty big business.  But the large content-management 
companies -- Verity, Open Text, Fulcrum (bought by PCDOCS, itself 
recently bought by someone else) -- seem to be going through wild stock 
price variations recently.  I've no idea what the future market will 
be: I find it all mystifying!

BTW, Lisa Rein has written a report on the Query Language '98 
workshop at W3C last year:

http://www.xml.com/xml/pub/1999/03/quest/index.html

It looks quite comprehensive to me, and all the position papers 
indicate that the topic is a hot one.

Avi

________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr@lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>



From db at eng.sun.com  Tue Mar  9 05:36:44 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:46 2004
Subject: ModSax Suggestion
References: <d9b431a1.36e278a4@aol.com>
Message-ID: <36E4B1FA.E482164@eng.sun.com>

> > 	Interesting suggestion for a big hole in the parts of
> >  the Java API set that are more or less "standard" at
> >  this point -- SAX and DOM.
> >
> >  One comment though:  I've found that it's important to
> >  be able to have options controlling how the DOM tree is
> >  built.  For example, whether to discard ignorable spaces,
> >  or do namespace conformance enforcement, or try to get
> >  CDATA sections (comments, etc).
> >
> 
> I agree with that.  I think all that is possible while still retaining
> a minimalist design philosophy. [deletia]
> 
> That way via an extensible common set of text properties we
> can add properties as the need arises without expanding the API.

I've always liked the idea of filters in the SAX event chain.
As Bill la Forge (and you) noted, that's a fine way to address that
general issue.  One can overdo layers, of course, and pay for it
in performance.  But filters are a good architectural notion, and
there's been lots of discussion about how to use them well with
SAX and DOM.

That does imply keeping DOM out of the basic parser API, which
I still think is the best way to go.  An event generator (say,
a SAX parser, or something walking a DOM tree) can have its
events filtered, and delivered to a component building a DOM tree.


> Looking forward to progress on the Java XML API.  BTW, Dave,
> are you going to do a "Birds of a Feather" session on XML at this year's
> JavaOne?  I think that could be valuable.

I may be signed up for more than that this time...

A BOF on XML -- an XML-DEV BOF! -- would be lots of fun.
Some of the folk here have never met in person.  I think
there will be lots of interesting applications to talk
about ... and probably some interesting frameworks.

- Dave



From db at eng.sun.com  Tue Mar  9 05:55:30 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
References: <14051.46670.687235.664451@localhost.localdomain>
Message-ID: <36E4B666.411E15C2@eng.sun.com>

OK, I'll pick this thread rather than the longer one to read first...
XML-DEV really generates lots of traffic lately!!

- I agree re using URIs, like Namespaces do.  Anyone can get a URI
  nowadays, for virtually no cost, but that's not true of reversed
  domain names (as used in Java properties and package names).

- There will need to be some strong policies for how the "things"
  to which an {info,handler,feature}ID map are documented.  I think
  that leadership by example can play a strong role here ... :-)

  Related point, that policy should specify the status of the "thing".
  For example, "stable", "beta", "experimental", "private", to pick
  an order where folk should be progressively less willing to use or
  implement the "thing" in a parser.

- I'd like a "getHandler" API ... or perhaps, eliminate the notion
  of 'feature' and 'handler' IDs and just use "infoID" values that
  map to the appropriate handlers.

  I've found it important to be able to do things like, say, "use
  the error handler everyone else is using".  (Where's getFeature?
  One can return a Boolean from a "get" ...)

Re that last point, I might have missed some e-mail and will try
to catch up.  It's not clear why there's a need for more than a
single general get/set API for this.

- Dave


David Megginson wrote:
> 
> What: Additions to ModParser interface
> 
> I'm proposing a couple of additions to the ModParser interface:
> 
>   public interface ModParser extends Parser
>   {
>     public abstract void setFeature (String featureID, boolean state)
>       throws SAXNotSupportedException;
> 
>     public abstract void setHandler (String handlerID, ModHandler handler)
>       throws SAXNotSupportedException;
> 
>     public abstract void set (String infoID, Object prop)
>       throws SAXNotSupportedException;
> 
>     public abstract Object get (String infoID)
>       throws SAXNotSupportedException;
>   }
> 
> These allow you to do interesting things like
> 
>   parser.set("http://www.foo.com/props/textfilter", filter);
> 
> or
> 
>   try {
>     Node node = (Node) parser.get("http://xml.org/sax/props/dom-node");
>   } catch (SAXNotRecognizedException e1) {
>     // doesn't know about DOM processing...
>   } catch (SAXNotSupportedException e2) {
>     // knows about DOM processing, but not doing it...
>   }
> 
> Again, it's a little sloppy as an interface, but it's beautifully
> extensible and it supports filters nicely (if there are other filters
> between the DOM iterator and the application, it will still work).
> 
> Note that strictly speaking, now, setHandler() and setFeature() are no
> longer primitives, since they could both be implemented in terms of
> set(), but I think that the extra type checking is worthwhile in those
> cases.
> 
> All the best,
> 
> David
> 
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 


From db at eng.sun.com  Tue Mar  9 06:27:39 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <004b01be691c$f348fc40$c9a8a8c0@thing2> <wkyal8e048.fsf@ifi.uio.no>
Message-ID: <36E4BDC9.DB06F185@eng.sun.com>

Lars Marius Garshol wrote:
> 
> * Bill la Forge
>
> | So that's why I'm butting in here. I think an open standards process
> | is important for individuals and small companies. We need to do what
> | we can to keep the ball rolling here.
> 
> We are certainly in heartfelt agreement here. :)

Gee, as a wage-slave working for a big company, I hope that I'm
not _too_ excluded from the discussions ... :-)

Seriously:  my personal model is a lot more akin to the original
IETF style "running code and working consensus" model than most
existing standards bodies.  I'm a lot happier with standards that
come from such a process than from ones that involve fat specs
that can't be implemented.  Writing code is generally more fun
than specs -- though an elegant spec is also a work of art!

- Dave



From db at eng.sun.com  Tue Mar  9 06:57:28 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain>
Message-ID: <36E4C4E6.B51DDFF3@eng.sun.com>

Again, I think that unifying these under the generic get/set
API (with Boolean.TRUE and Boolean.FALSE objects as values
for features that are really boolean) could be useful.

Documentation for each feature should specify whether it's
changeable mid-parse ... I'd suggest "no" as the default answer!
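
Roughly what I have in mind, as a sketch rather than a proposal for the
actual interface: UnifiedSettings and its method names are made up for
illustration, and IllegalStateException stands in for whatever SAX
exception would really be thrown. Boolean features are ordinary values
behind the same get/set, and changes during a parse are refused unless a
feature's documentation has said otherwise.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class UnifiedSettings {
    static final String VALIDATION = "http://xml.org/sax/features/validation";

    private final Map<String, Object> values = new HashMap<>();
    private final Set<String> midParseOK = new HashSet<>();
    private boolean parsing;

    // Opt a single ID in to mid-parse changes; the default answer is "no".
    void allowMidParseChange(String id) { midParseOK.add(id); }

    void beginParse() { parsing = true; }
    void endParse() { parsing = false; }

    // One setter for features, handlers and other properties alike.
    void set(String id, Object value) {
        if (parsing && !midParseOK.contains(id)) {
            throw new IllegalStateException("cannot change mid-parse: " + id);
        }
        values.put(id, value);
    }

    Object get(String id) { return values.get(id); }

    // Convenience for boolean features stored as Boolean.TRUE / Boolean.FALSE.
    boolean isOn(String id) { return Boolean.TRUE.equals(values.get(id)); }

    public static void main(String[] args) {
        UnifiedSettings settings = new UnifiedSettings();
        settings.set(VALIDATION, Boolean.TRUE);
        settings.beginParse();
        try {
            settings.set(VALIDATION, Boolean.FALSE);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
        settings.endParse();
        System.out.println("validation on: " + settings.isOn(VALIDATION));
    }
}
```

What this gives up is the compile-time type checking that separate
setFeature() and setHandler() methods provide.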

Mike Dacon commented about the "API archaeology" aspect of this
name; perhaps the "Parser2" style naming convention can avoid
losing technical context (i.e. this is still a parser, even
if it's parsing a DOM or a stream of SAX events :-).


> 1. http://xml.org/sax/features/validation

Good.  (I'm curious if folks prefer one parser, which can
have this feature toggled, vs two, where the parser comes
with at least an initial value.)


> 2A. http://xml.org/sax/features/external-general-entities
> 2B. http://xml.org/sax/features/external-parameter-entities

Right, two kinds of parsed entities, two control knobs.
Validating parsers must refuse to change these knobs.
(OK, _five_ kinds of parser -- validating, and four kinds
of nonvalidating parser!  ;-)


> 3. http://xml.org/sax/features/namespaces

I'd rather have this just kick in modified XML syntax rules
(e.g. entity names may never be scoped, and scoped names may
have only one interior colon).

With that, one can layer the rest of namespace processing
on top in any of several fashions.  A DOM can be built which
exposes namespace declarations; or a filter can munge names
and strip out the declarations.  The "munge" feature could
get its own namespace URI.


> 4. http://xml.org/sax/features/unbuffered-input
>   True means ensure that the parser does not buffer input from a
>   Reader or InputStream supplied by the application (actually,
>   one-character look-ahead will usually be required); false means do
>   not ensure that the parser does not buffer input.  This feature might
>   be useful for reading multiple documents from a single stream.

I'm not sure this is a common enough feature to need to be
predefined ... support for "XML Islands" within HTML may become
important, but much of this can be done (at least in Java) by
requiring pushback to be done at appropriate points.
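
As a small illustration of the pushback idea, using java.io.PushbackReader
from the standard library: the PushbackDemo class is hypothetical, and it
makes the strong (and in general false) assumption that every document in
the stream starts with an XML declaration and that "<?xml " never appears
elsewhere, e.g. inside a comment or CDATA section. The reader consumes a
few characters of the next document to find the boundary, then pushes
them back so the next read starts cleanly.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class PushbackDemo {
    // Naive boundary marker: the start of an XML declaration.
    private static final String MARK = "<?xml ";

    // Reads one document, leaving the start of the next one unread.
    static String readOneDocument(PushbackReader in) throws IOException {
        StringBuilder doc = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            doc.append((char) c);
            int cut = doc.length() - MARK.length();
            // A marker anywhere but the very start begins the next document.
            if (cut > 0 && doc.indexOf(MARK, cut) == cut) {
                in.unread(MARK.toCharArray()); // give the look-ahead back
                doc.setLength(cut);
                break;
            }
        }
        return doc.toString();
    }

    // Splits a whole stream, held in a string here for demonstration.
    static List<String> split(String stream) {
        List<String> docs = new ArrayList<>();
        // The pushback buffer must hold the whole marker.
        try (PushbackReader in =
                 new PushbackReader(new StringReader(stream), MARK.length())) {
            String doc;
            while (!(doc = readOneDocument(in)).isEmpty()) {
                docs.add(doc);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return docs;
    }

    public static void main(String[] args) {
        String stream = "<?xml version=\"1.0\"?><a/><?xml version=\"1.0\"?><b/>";
        for (String doc : split(stream)) {
            System.out.println(doc);
        }
    }
}
```

A real parser would have to do the pushback itself at the end of the root
element, which is why the look-ahead requirement matters for the feature
definition.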


> http://xml.org/sax/features/normalize-text

This is a good filter feature, I think.


Lars suggested a "Catalog" feature.  There are different sorts of
catalog, and they need configuration, so the value of this could
be a URI for the catalog, not just a boolean.  Plus, this would
seem to be up to the "EntityResolver" to handle ... yes?  It'd
perhaps suggest that one could ask the next filter in the stream
for the resolver it was using ... :-)


Good discussion, gang!

- Dave



From lucio.piccoli at one2one.co.uk  Tue Mar  9 08:58:11 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:09:46 2004
Subject: version within XML
Message-ID: <3601a91c.090299@smtpgate1.ONE2ONE.CO.UK>


Hi all,
I am seeking info on versioning XML documents. I have seen it done in a
few different ways. Specifically, what are the issues in ensuring backward
compatibility between versions?

Any help is appreciated.


adios

 -lucio

 ---------------------------------------------------------------------
 One2One              LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower      tel : +44 181 214 3847
 Elstree Way
 Borehamwood                 fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From rbourret at ito.tu-darmstadt.de  Tue Mar  9 09:43:35 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:46 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de>

Elliotte Rusty Harold wrote:

> I have several DTDs with conflicting definitions of certain elements.
> (e.g. one defines a HEAD as a TITLE followed by a META and another
> defines a HEAD as #PCDATA). I need to use all the DTDs and associated
> markup languages for a single document.
>
> To an extent I can disambiguate them with namespaces.  However, is there
> any way I can do this while still validating against the original DTDs?
> That is without rewriting the DTDs to use the qualified names instead of
> the original names that are in the DTDs? I've been trying to work with
> default values for xmlns attributes, and the like; but that doesn't seem
> to get me quite all the way to where I need to go. Am I going to have to
> break down and just rewrite the DTDs to use the qualified names?

If you want to use a namespace-unaware parser, I don't see how you can 
avoid rewriting the DTDs.  Unless the names in the DTDs are qualified, you 
will have two elements with the same name (e.g. "HEAD"), which is a 
validation error.  And even assuming that this isn't immediately flagged, I 
can see no way for a namespace-unaware parser to figure out which content 
model to validate against when it encounters one of the duplicated element 
names: If prefixes are used, the name won't match any of the DTD names; if 
prefixes are not used (due to use of defaults), the name will match 
multiple DTD names.

Note that this problem is not limited just to validation.  At the very 
least, it applies to retrieving default attribute values as well.

-- Ron Bourret




From Michael.Kay at icl.com  Tue Mar  9 10:00:50 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <93CB64052F94D211BC5D0010A80013310EB364@wwmessd3.bra01.icl.co.uk>

> I've been thinking about this issue, and I'm fairly convinced 
> that the URI is the right choice.
> 
> Think of the URI as a statement of ownership.  Assume that my ISP is
> host.net, and that I've been allocated 5MB of web space at
> http://host.net/foo/.
> 
I don't often disagree with David, but I think this is quite misguided.

If we're only after a unique identifier we could use the longitude and
latitude of the house where I live. In fact that would be better, because it
identifies a unique place, whereas the "http:" idea also says you can get
there by bus and the buses are run by the host.net bus company: in fact it
invites you to "click here" to jump on the bus. But if you get on the bus
and ask for the destination the driver will tell you "Never heard of it,
guv."

And of course it ignores the fact that you can have two buses going to the
same place from different directions.

Just because Namespaces made this mistake (and confused all newbies by doing
so) doesn't mean we have to as well.

Mike Kay



From Daniel.Brickley at bristol.ac.uk  Tue Mar  9 10:25:49 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:09:46 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <93CB64052F94D211BC5D0010A80013310EB364@wwmessd3.bra01.icl.co.uk>
Message-ID: <Pine.GHP.4.02A.9903091012210.2617-100000@mail.ilrt.bris.ac.uk>

On Tue, 9 Mar 1999, Kay Michael wrote:

> > I've been thinking about this issue, and I'm fairly convinced 
> > that the URI is the right choice.
> > 
> > Think of the URI as a statement of ownership.  Assume that my ISP is
> > host.net, and that I've been allocated 5MB of web space at
> > http://host.net/foo/.
> > 
> I don't often disagree with David, but I think this is quite misguided.
> 
> If we're only after a unique identifier we could use the longitude and
> latitude of the house where I live. 

Great. Why not propose a URI scheme for it? (although this would also
confuse people as a place is something you'd look up on a map, not a
software feature.)

> In fact that would be better, because it
> identifies a unique place, whereas the "http:" idea also says you can get
> there by bus and the buses are run by the host.net bus company: in fact it
> invites you to "click here" to jump on the bus. But if you get on the bus
> and ask for the destination the driver will tell you "Never heard of it,
> guv."
> 
> And of course it ignores the fact that you can have two buses going to the
> same place from different directions.


The URI spec very clearly does not ignore this point.

From RFC 2396 again... (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt)


	1.2. URI, URL, and URN
 
    A URI can be further classified as a locator, a name, or both.  The
   term "Uniform Resource Locator" (URL) refers to the subset of URI
   that identify resources via a representation of their primary access
   mechanism (e.g., their network "location"), rather than identifying
   the resource by name or by some other attribute(s) of that resource.
   [...]
    Although many URL schemes are named after protocols, this does not
   imply that the only way to access the URL's resource is via the named
   protocol.  Gateways, proxies, caches, and name resolution services
   might be used to access some resources, independent of the protocol
   of their origin, and the resolution of some URL may require the use
   of more than one protocol (e.g., both DNS and HTTP are typically used
   to access an "http" URL's resource when it can't be found in a local
   cache).


> Just because Namespaces made this mistake (and confused all newbies by doing
> so) doesn't mean we have to as well.

Making the same mistake as the rest of the world has its benefits
though: if we use URIs for ModSAX features, we get for free any progress
on better naming infrastructure (URNs, metadata, resolution infrastructure
layered over the Web caching network etc). If we invent another
nameless, specless naming system, we're on our own.

Dan


--
Daniel.Brickley@bristol.ac.uk               
Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/
University of Bristol,  Bristol BS8 1TN, UK.   phone:+44(0)117-9288478




From tug at wilson.co.uk  Tue Mar  9 10:43:27 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>

----- Original Message -----
From: Kay Michael <Michael.Kay@icl.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Sent: 09 March 1999 09:54
Subject: RE: SAX: ModSAX addition, general property query


>> I've been thinking about this issue, and I'm fairly convinced
>> that the URI is the right choice.
>>
>> Think of the URI as a statement of ownership.  Assume that my ISP is
>> host.net, and that I've been allocated 5MB of web space at
>> http://host.net/foo/.
>>
>I don't often disagree with David, but I think this is quite misguided.

I agree - I don't actually see the benefit of using a string identifier at
all:

I don't think that it's unreasonable to insist that objects representing a
Feature, Handler or Property should either implement a distinct interface or
subclass a distinct class. If this is so the Parser can tell what Feature,
Handler or Property is being set by enquiring of the type of the object. (I
favour insisting that they subclass distinct classes because (in Java) that
naturally imposes the restriction that a single object can only represent a
single Property.)

The get() member function could take a Class parameter.

The advantage of this approach is that it relies only on the type naming
scheme of Java and there are already well established mechanisms that
ensure that different implementers create distinct types.
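
Roughly, the idea could be sketched as below.  Everything here is
hypothetical (TypedProperties, ValidationFeature and CatalogProperty
are invented names, not part of SAX or ModSAX); it only shows the
class of an object standing in for a string identifier.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical registry: the runtime class of each value is its key. */
final class TypedProperties {
    private final Map<Class<?>, Object> values = new HashMap<>();

    /** The "parser" works out what is being set by asking for the object's type. */
    void set(Object value) {
        values.put(value.getClass(), value);
    }

    /** get() takes a Class parameter instead of a string identifier. */
    <T> T get(Class<T> kind) {
        return kind.cast(values.get(kind)); // null if nothing of that type was set
    }
}

/** Hypothetical feature type; one object can only be one kind of property. */
final class ValidationFeature {
    final boolean enabled;
    ValidationFeature(boolean enabled) { this.enabled = enabled; }
}

/** Hypothetical property type carrying a value rather than a flag. */
final class CatalogProperty {
    final String uri;
    CatalogProperty(String uri) { this.uri = uri; }
}
```

Usage would be along the lines of props.set(new ValidationFeature(true))
followed by props.get(ValidationFeature.class).  Note that keying on
getClass() means a subclass of a property type would be stored under
the subclass, so this sketch keeps the property classes final.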

I am by no means an expert in the other languages that are supported by
SAX - would this approach cause dreadful problems in other languages?

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From db at eng.sun.com  Tue Mar  9 11:09:31 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:47 2004
Subject: Java Specification Request for XML
References: <v03102803b309a99cfca5@[168.100.203.234]>
Message-ID: <36E4FF9B.8CA35A47@eng.sun.com>

Elliotte Rusty Harold wrote:
> 
> >The Java Community Process is an open, inclusive process and we
> >look forward to the active participation of all interested parties.
> 
> The process, and its relative openness, is a little more obvious if you
> remove the passive voice. Compare this:

When you change it to what you wrote, it is no longer correct.
Some key points:

	- No, Sun doesn't need to submit all JSRs.  Any
	  Participant can do so.  We did for this one, to
	  help jumpstart the process; many people want to
	  see a Java Platform API for XML.

	- Yes, Sun's Program Management Office (vs. say
	  Ken Starr) approves or rejects submitted JSRs.

	- No, the leader of the expert group doesn't need
	  to be from Sun.  The group formed by that leader,
	  from the pool of volunteer experts and from
	  external invited experts, is supposed to be a
	  diverse cross section.  This is auditable.

	- Re cost to be a "Participant", I had the same
	  comment.  The fee can be waived for invited
	  experts.  And note that the fee is less than
	  an expert's time will cost -- much less!

Sun is working with this process in good faith, though you seem
to fear otherwise.

Re other processes ... I don't think anyone's quite figured
out how to make the "open source" processes drive established
software companies.  Like many leading companies, Sun is
taking steps in that direction. But at least for this year,
that isn't a useful class of processes to measure against.


> >The key point is that everyone with internet access will get a
> >chance to review and comment on the emerging specification.
> 
> They can review and comment. There's no promise that
> anyone will even listen to their comments, much less act on them.

No, there _is_ a promise they'll be listened to; and I understand
the action will at least include a response.

Have you ever participated in the comment process for an IEEE spec?
One submits comments, and gets formal responses. (I seem to recall
it being restricted to paid-up IEEE members though.)

That's the model to keep in mind -- not the "black hole" model
you've described.  Again, this is auditable.


> There are a number of aspects of this "open" process that aren't mentioned
> here.

Paraphrasing points I didn't mention above:

- Copyright and other Intellectual Property Rights.  Hmm, wouldn't
  you just hate to base a product on a specification, and then find
  that you've got to fork over $5K/copy to use it?  Have a look at
  what any of the "Open Source" license agreements (e.g. MPL2) say
  about such issues.

- Derivative works.  Nobody wins if people are allowed to ship things
  as "compatible" that really aren't; that's what the compatibility
  test suite is there to help ensure:  "Write Once, Run Anywhere" does
  not come without effort, and it's a Big Deal.

- Pillow talk.  It's supposed to be private.

- Of course non-corporate experts exist; always have, always will.
  And they can participate too.


> To me these alone make it pretty clear, that this process is open in name
> only. If you're still not convinced, ask yourself these questions:
> 
> 1. Can anyone tell Sun No? Can anyone keep Sun from putting something into
> the spec they want to put it in? Or put something in that Sun wants to keep
> out?

If the Expert Group disagrees with Sun's representative, that
could happen.  I'd hope it wouldn't -- but it could happen.


> 2. Can Sun's enemies (i.e. Microsoft, HP, etc.) participate in this process
> on an equal footing with Sun? Can they even participate at all?

Can those companies participate?  Absolutely.  Though I don't think
that they've wanted to do so -- going purely by what the press has
been seen to report.


> Bottom line: The openness of this process is PR, pure and simple.

So is that glass half full, or half empty?  :-)

"Openness" fits on a spectrum.  I think that this process compares
favorably with most other standards processes I've seen.

- Dave



From Andy.Bradbury at syntegra.bt.co.uk  Tue Mar  9 11:18:45 1999
From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk)
Date: Mon Jun  7 17:09:47 2004
Subject: X for eXtensible DBMS?
Message-ID: <65AF45D5E535D2118AFB0008C7FA23180C3D08@FL-EXCHANGE-03>

The only IMS I ever came across was hardly what I'd call 'extensible' - not
unless you actually *like* taking a whole database down in order to create
or modify a single extra link  ;,)

Regards

Andy B.

-----Original Message-----
From: Smith, Adrian [mailto:asmith@drumbeat.com]
Sent: 05 March 1999 17:19
To: 'Jeffrey E. Sussna'; 'Chad Adams'; xml-dev@ic.ac.uk
Subject: RE: Opinions requested


There actually is an XDBMS.  It predates XML.  This dates back to around
1965/1966.  The database created was titled "IMS" for Information
Management System, it was created by IBM and used an hierarchical model
for the data.  It had all the same characteristics of XML with almost the
exact same set of constructs and shortcomings.

Thanks!
Adrian




From b.laforge at jxml.com  Tue Mar  9 12:00:06 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:47 2004
Subject: ModSax Suggestion
Message-ID: <005001be6a23$7e574240$c9a8a8c0@thing2>

From: David Brownell <db@eng.sun.com>
>> Looking forward to progress on the Java XML API.  BTW, Dave,
>> are you going to do a "Birds of a Feather" session on XML at this year's
>> JavaOne?  I think that could be valuable.
>
>I may be signed up for more than that this time...
>
>A BOF on XML -- an XML-DEV BOF! -- would be lots of fun.
>Some of the folk here have never met in person.  I think
>there will be lots of interesting applications to talk
>about ... and probably some interesting frameworks.


Simon and I proposed a Coins BOF some time back and
JavaOne accepted it. Might be a good place to meet and 
discuss ModSAX, filters, and such.

Bill




From b.laforge at jxml.com  Tue Mar  9 12:16:09 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:47 2004
Subject: ModSax Suggestion
Message-ID: <006301be6a25$c42a6700$c9a8a8c0@thing2>

From: David Brownell <db@eng.sun.com>

>I've always liked the idea of filters in the SAX event chain.
>As Bill la Forge (and you) noted, that's a fine way to address that
>general issue.  One can overdo layers, of course, and pay for it
>in performance.  But filters are a good architectural notion, and
>there's been lots of discussion about how to use them well with
>SAX and DOM.
>
>That does imply keeping DOM out of the basic parser API, which
>I still think is the best way to go.  An event generator (say,
>a SAX parser, or something walking a DOM tree) can have its
>events filtered, and delivered to a component building a DOM tree.


A filter can itself hold a stack of other filters, or even a set of filters
to which events are routed based on some pattern. Being able
to place just one filter in front of the DOM built by the parser
is all you really need.

Using the ModParser interface, you can do the following:

1. Use setFeature to turn on DOM construction.

2. Use set to insert a filter in front of the DOM.

3. Parse a document.

4. Use get to retrieve the constructed DOM.
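
For comparison, here is a sketch of the same four steps using only
standard JAXP and SAX2 classes rather than the ModParser interface
(whose exact signatures are not given here).  The filter and helper
names are hypothetical; the identity Transformer stands in for
"turn on DOM construction".

```java
import java.io.StringReader;
import java.util.Locale;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.sax.SAXSource;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

/** Hypothetical filter: upper-cases element names before they reach the DOM. */
final class UpperCaseFilter extends XMLFilterImpl {
    UpperCaseFilter(XMLReader parent) { super(parent); }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        super.startElement(uri, local.toUpperCase(Locale.ROOT),
                           qName.toUpperCase(Locale.ROOT), atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        super.endElement(uri, local.toUpperCase(Locale.ROOT),
                         qName.toUpperCase(Locale.ROOT));
    }
}

final class FilteredDom {
    static Document build(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        XMLReader parser = factory.newSAXParser().getXMLReader();
        XMLReader filter = new UpperCaseFilter(parser);   // step 2: filter in front
        DOMResult dom = new DOMResult();                  // step 1: ask for a DOM
        TransformerFactory.newInstance().newTransformer() // identity transform
            .transform(new SAXSource(filter, new InputSource(new StringReader(xml))),
                       dom);                              // step 3: parse
        return (Document) dom.getNode();                  // step 4: retrieve the DOM
    }
}
```

The single filter placed in front of the DOM builder could itself
dispatch to a stack or set of further filters.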

Bill




From b.laforge at jxml.com  Tue Mar  9 12:45:44 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <007801be6a29$d6efdd80$c9a8a8c0@thing2>

From: John Wilson <tug@wilson.co.uk>
>I don't think that it's unreasonable to insist that objects representing a
>Feature, Handler or Property should either implement a distinct interface or
>subclass a distinct class. If this is so the Parser can tell what Feature,
>Handler or Property is being set by enquiring of the type of the object. (I
>favour insisting that they subclass distinct classes because (in Java) that
>naturally imposes the restriction that a single object can only represent a
>single Property.)


Filters often implement more than one (generally all) handler interface and
then register themselves with the underlying parser/filter for the same events
requested by the overlaying application/filter.

Your proposal would require the filter to instantiate separate objects for each
set of events it needs to process, though it could simply pass-through the handlers
for those it does not.

The role and class of an object are often distinct. This was one of the things I
did not like about the aggregation scheme that was proposed by Sun a while back.
I think David got it right.

Bill




From tug at wilson.co.uk  Tue Mar  9 13:04:46 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <08b501be6a2d$3b34c820$010a0a0a@home.wilson.co.uk>


----- Original Message -----
From: Bill la Forge <b.laforge@jxml.com>
To: John Wilson <tug@wilson.co.uk>; XML Developers' List <xml-dev@ic.ac.uk>
Sent: 09 March 1999 12:39
Subject: Re: SAX: ModSAX addition, general property query


>From: John Wilson <tug@wilson.co.uk>
>>I don't think that it's unreasonable to insist that objects representing a
>>Feature, Handler or Property should either implement a distinct interface
>>or subclass a distinct class. If this is so the Parser can tell what
>>Feature, Handler or Property is being set by enquiring of the type of the
>>object. (I favour insisting that they subclass distinct classes because
>>(in Java) that naturally imposes the restriction that a single object can
>>only represent a single Property.)
>
>
>Filters often implement more than one (generally all) handler interface and
>then register themselves with the underlying parser/filter for the same
>events requested by the overlaying application/filter.
>
>Your proposal would require the filter to instantiate separate objects for
>each set of events it needs to process, though it could simply pass-through
>the handlers for those it does not.

Certainly you need to instantiate an object per handler; however, it need
not be too ugly:

public class MyFilter {
  public final DTDHandler dtdHandler = new DTDHandler() {
      ... };
  public final DocumentHandler documentHandler = new DocumentHandler() {
     ... };

....

}

would seem to me to be a reasonable way of dealing with this.
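
A filled-in version of that shape, as a sketch only.  It assumes the
SAX2 interfaces (ContentHandler in place of SAX1's DocumentHandler) and
uses DefaultHandler as the base for the anonymous classes; the
CountingFilter name and its counters are invented for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

final class CountingFilter {
    int notations; // shared state, updated by the DTD handler object
    int elements;  // shared state, updated by the content handler object

    /** One object per handler role, each an anonymous inner class. */
    final DTDHandler dtdHandler = new DefaultHandler() {
        @Override
        public void notationDecl(String name, String publicId, String systemId) {
            notations++;
        }
    };

    final ContentHandler contentHandler = new DefaultHandler() {
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            elements++;
        }
    };

    /** Registers each handler object separately and parses the given text. */
    static CountingFilter run(String xml) throws Exception {
        CountingFilter filter = new CountingFilter();
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        reader.setDTDHandler(filter.dtdHandler);
        reader.setContentHandler(filter.contentHandler);
        reader.parse(new InputSource(new StringReader(xml)));
        return filter;
    }
}
```

Each handler is a distinct object, yet both can see the enclosing
filter's fields, so the filter keeps a single place for its state.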

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From b.laforge at jxml.com  Tue Mar  9 13:15:17 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:47 2004
Subject: Java Specification Request for XML
Message-ID: <008b01be6a2e$0592cfe0$c9a8a8c0@thing2>

From: David Brownell <db@eng.sun.com>
>Re other processes ... I don't think anyone's quite figured
>out how to make the "open source" processes drive established
>software companies.  Like many leading companies, Sun is
>taking steps in that direction. But at least for this year,
>that isn't a useful class of processes to measure against.


I suspect that a change to Open Source Software will depend
on more than just vendors. Vendors need to be responsive to
their customers, many of whom are still not with the new program.

I don't think this process can be driven entirely from the top.
It would be risky for a vendor to get too far ahead of its "community".

So while open forums like XML-DEV are closer to the ideal,
given the opportunity, I will be glad to participate in Sun's
own process.

Bill




From richard at goon.stg.brown.edu  Tue Mar  9 14:39:06 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:47 2004
Subject: Namespaces and DTDs
References: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de>
Message-ID: <36E5322A.7DDADDD8@goon.stg.brown.edu>

Ronald Bourret wrote:

> > I have several DTDs with conflicting definitions of certain elements.
> > ...Am I going to have to break down and just rewrite the DTDs to use
> > the qualified names?
> 
> If you want to use a namespace-unaware parser, I don't see how you can
> avoid rewriting the DTDs.

Maybe I misunderstand, but as far as I can see, namespaces won't help
you, either.  Why?  Because even if you can refer to, say, your two TITLE
elements by different prefixes, you'll still have to declare the prefixed
elements in the DTD as if they were atomic element names.

Namespaces, in other words, don't solve your problem.  They may make it
worse, in fact, because you have to know what prefixes you are going to
declare in a given document to be able to rewrite your DTD to work with
that document.
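
To illustrate the point, here is a small sketch using the JDK's
validating SAX parser.  The DTD, prefixes and namespace URI are made up
for the example; the DTD declares "h:HEAD" as an atomic name, so a
document that binds the same namespace URI to a different prefix fails
validation.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class PrefixValidation {
    /** Internal subset that hard-codes the prefix "h" in the element name. */
    static final String DTD =
        "<!DOCTYPE doc [\n"
        + "<!ELEMENT doc (h:HEAD)>\n"
        + "<!ATTLIST doc xmlns:h CDATA #IMPLIED xmlns:x CDATA #IMPLIED>\n"
        + "<!ELEMENT h:HEAD (#PCDATA)>\n"
        + "]>\n";

    /** Counts the validity errors reported while parsing DTD + body. */
    static int validityErrors(String body) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        factory.setNamespaceAware(true);
        final int[] errors = {0};
        factory.newSAXParser().parse(
            new InputSource(new StringReader(DTD + body)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException e) {
                    errors[0]++; // validity errors are recoverable; just count them
                }
            });
        return errors[0];
    }
}
```

With the prefix "h" the body
<doc xmlns:h="urn:example:a"><h:HEAD>t</h:HEAD></doc> validates; the
namespace-equivalent
<doc xmlns:x="urn:example:a"><x:HEAD>t</x:HEAD></doc> does not, because
the DTD knows nothing about "x:HEAD".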

There was a furor two or three months ago on this list about namespaces
breaking validation.  That furor died down when the namespace spec became
an official recommendation (a done deal, in other words).

Just so you know, though:  The issue you raise is just the sort of thing
that caused the furor.  People were expecting namespaces to help in just
your situation.  When they found out that namespaces didn't help, many
were disappointed, and said so.  The most effective responses I saw were
from people who said, in effect, "Namespaces do far less than you want
or expect them to."

The question in my mind is whether they actually get in the way.

(You won't hear any gripes from me if my take on namespaces turns out to
be dead wrong.)

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From simonstl at simonstl.com  Tue Mar  9 14:55:56 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:47 2004
Subject: Java Specification Request for XML
In-Reply-To: <36E4FF9B.8CA35A47@eng.sun.com>
References: <v03102803b309a99cfca5@[168.100.203.234]>
Message-ID: <4.0.1.19990309092029.00f0f4b0@pop.hesketh.net>

David Brownell wrote:
>> >The Java Community Process is an open, inclusive process and we
>> >look forward to the active participation of all interested parties.

If I just had to take _your_ word for it, David, I'd definitely believe it.
Your continued participation on these lists and your contributions to
projects like SAX and ModSAX clearly indicate that you, at least, have an
open mind when it comes to open source/open process models.

Unfortunately, when I visit Sun's site, and read the documentation
surrounding the JCP, I'm decidedly unconvinced.  Elliotte may have put Sun
too deeply in the process in his description, but there's no getting around
the pay to play principle that is deeply enshrined in this so-called open
process.  I'm glad to hear you say that it can be waived for the expert
group, though it certainly wasn't clear from the Web site.  (It looks like
it can be waived for the first year only.)

If Sun's approach involved only royalties-after-a-product-ships, I'd be a
lot quieter.  (I don't, after all, charge for the software I produce.)
It's not, though.  There are upfront fees ($5000 for non-educational
entities, $2000 for non-profit or educational).  (See
http://developer.java.sun.com/developer/jcp/java_community_process.html for
details.  Most of the kickers are in the agreement,
http://developer.java.sun.com/developer/jcp/JSPA.pdf)

The JCP may feel like an 'open' process if you're a mammoth, or even if
you're a reasonably well-off sabre-toothed tiger, but to us small mammals,
it's the same old s***, different day, that we get from standards
organizations.  We get to run around among the mammoths and sabre-toothed
tigers wearing funny lenses that blur our vision and working with tools
that may not have been created with our needs in mind.

The price of _joining_ the process (as a partner, where it appears you do
have more influence) is even more irritating because Sun is, after all, a
vendor.  If I really wanted to give Sun Microsystems a sizable check, I'd
expect at least a Sparc 5 with a huge monitor to show up in return.  Giving
Sun $5000 so this poor company can manage a not-so-open process ('Process
Cost Sharing') is ridiculous.  

Given that $5000 pays all my expenses for a few months, the cost to small
business and self-employed folks is outrageous.  I'd love to participate in
the process as a 'full' member, contributing time (which costs me something
too), the standard currency for open source and open process participation,
rather than a large sum of money that goes nowhere.

I'll participate - as much as I'm allowed - but remember that the JCP is
_far_ less open than the current ModSAX discussion, and I think the results
of the JSR for XML are going to suffer as a result.

Enough of the populist ranting.  We now return to the extremely open ModSAX
discussion.

(p.s. It looks like David will be giving a presentation on this JSR at
XTech.  I'll be there, I assume he'll be there, and anyone else who's
around and would like to take a close look at this thing should come by at
2:45 on Wednesday.  Oh, and did I mention the price of conferences?  Never
mind, forget I said that.)

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com



From glv at vanderburg.org  Tue Mar  9 15:32:39 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun  7 17:09:47 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org>
Message-ID: <36E53DA7.BF547D80@vanderburg.org>

John Cowan wrote:
> 
> >   public abstract void setFeature (String featureID, boolean state)
> >     throws SAXNotSupportedException;
>
> 2) This method is allowed to throw a SAXNewParserException, which
> encapsulates a replacement parser.

There are two problems with this.

First: let's not use exceptions to report non-error conditions.  There
are theoretical and practical reasons to restrict the use of Java 
exceptions to reporting errors.  (On a related note, I would like to
propose an explicit "boolean featureSupported(String featureID)"
query method to make it possible to test for a feature without risking
an exception.  If anyone would like details of why it's bad to have
exceptions as a part of normal control flow, let me know.)

Second: if an application needs to implement certain features by 
pushing filters from the bottom, it can encapsulate the entire process
on its own, using a composite, and the process never needs to be 
exposed through the ModSAX API.
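
A minimal sketch of the query method proposed in the first point, assuming
a hypothetical SketchParser class and made-up feature names (neither comes
from any SAX draft). SAXNotSupportedException is the exception named in the
quoted signature; the org.xml.sax class of that name used here ships with
later SAX releases. The only point is that a caller can test for a feature
without putting a try/catch in its normal control flow:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical parser front end; only the feature bookkeeping is shown.
class SketchParser {
    private final Set<String> supported;
    private final Set<String> enabled = new HashSet<>();

    SketchParser(String... supportedFeatureIDs) {
        supported = new HashSet<>(Arrays.asList(supportedFeatureIDs));
    }

    // Query method: never throws, so callers can branch on the result.
    boolean featureSupported(String featureID) {
        return supported.contains(featureID);
    }

    // Setting an unsupported feature is still an error and still throws.
    void setFeature(String featureID, boolean state) throws SAXNotSupportedException {
        if (!featureSupported(featureID)) {
            throw new SAXNotSupportedException(featureID);
        }
        if (state) {
            enabled.add(featureID);
        } else {
            enabled.remove(featureID);
        }
    }

    boolean getFeature(String featureID) {
        return enabled.contains(featureID);
    }
}
```

A caller would write "if (parser.featureSupported(id)) parser.setFeature(id, true);"
and keep the catch block for genuine errors.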

(I'm new to this discussion, so forgive me --- but let me know ---
if I'm rehashing old debates.)

---glv



From glv at vanderburg.org  Tue Mar  9 15:53:09 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun  7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
References: <007801be6a29$d6efdd80$c9a8a8c0@thing2>
Message-ID: <36E540FA.61C3F574@vanderburg.org>

Bill la Forge wrote:
> 
> From: John Wilson <tug@wilson.co.uk>
> >I don't think that it's unreasonable to insist that objects 
> >representing a Feature, Handler or Property should either implement 
> >a distinct interface or subclass a distinct class. If this is so 
> >the Parser can tell what Feature, Handler or Property is being set 
> >by enquiring of the type of the object.
> 
> Filters often implement more than one (generally all) handler 
> interface and then register themselves with the underlying 
> parser/filter for the same events requested by the overlaying 
> application/filter.

Yes, and as written, John's proposal would require distinct handler
objects for each feature, which would be bad.  However, with a slight 
modification, it would work beautifully.  Instead of using a string 
as a feature ID, use a type descriptor (in Java, an instance of 
java.lang.Class).  Feature handlers would be registered by supplying 
the Class object that represents the feature being implemented, along 
with a handler object that is assignable to that type.
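
A rough sketch of that registration scheme, under the assumption that each
feature has its own interface. NamespaceHandler and CommentHandler here are
hypothetical stand-ins, HandlerRegistry is an invented name, and the generic
syntax is present-day Java rather than anything available on this list's
JDKs. One filter object can be registered under several feature types:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical feature interfaces, stand-ins for whatever ModSAX would define.
interface NamespaceHandler { void startNamespace(String prefix, String uri); }
interface CommentHandler { void comment(String text); }

// Registry keyed by the feature's Class object instead of a String ID.
class HandlerRegistry {
    private final Map<Class<?>, Object> handlers = new HashMap<>();

    // Rejects a handler that is not assignable to the feature type.
    void setHandler(Class<?> feature, Object handler) {
        if (!feature.isInstance(handler)) {
            throw new IllegalArgumentException(
                "handler does not implement " + feature.getName());
        }
        handlers.put(feature, handler);
    }

    // Returns the registered handler, or null if none was set.
    <T> T getHandler(Class<T> feature) {
        return feature.cast(handlers.get(feature));
    }
}

// One filter object may serve several features at once.
class PassThroughFilter implements NamespaceHandler, CommentHandler {
    public void startNamespace(String prefix, String uri) { }
    public void comment(String text) { }
}
```

Registering the same PassThroughFilter under both NamespaceHandler.class and
CommentHandler.class works, so distinct handler objects per feature are not
required.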

It seems probable to me that, whatever naming scheme is chosen for 
features, each feature will have a special interface that handlers 
must implement; if that's true, and Strings are used to identify 
features, we will effectively have two names for each feature.  And 
using classes shares one of the good aspects of the URI solution: it 
piggybacks on the DNS to provide a ready-made collision-free global 
namespace.

The only problem I see with this proposal is that it may not translate
well to other languages.  One possibility is for other languages to use
the name of the corresponding Java interface as a feature name; for
example, "org.xml.sax.NamespaceHandler".  This may not be ideal, but
does not seem too onerous.

---glv



From tug at wilson.co.uk  Tue Mar  9 16:06:34 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:47 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <08e801be6a46$bb711d40$010a0a0a@home.wilson.co.uk>


----- Original Message ----- 
From: Glenn Vanderburg <glv@vanderburg.org>
To: Bill la Forge <b.laforge@jxml.com>
Cc: John Wilson <tug@wilson.co.uk>; XML Developers' List <xml-dev@ic.ac.uk>
Sent: 09 March 1999 15:40
Subject: Re: SAX: ModSAX addition, general property query


>Bill la Forge wrote:
>> 
>> From: John Wilson <tug@wilson.co.uk>
>> >I don't think that it's unreasonable to insist that objects 
>> >representing a Feature, Handler or Property should either implement 
>> >a distinct interface or subclass a distinct class. If this is so 
>> >the Parser can tell what Feature, Handler or Property is being set 
>> >by enquiring of the type of the object.
>> 
>> Filters often implement more than one (generally all) handler 
>> interface and then register themselves with the underlying 
>> parser/filter for the same events requested by the overlaying 
>> application/filter.
>
>Yes, and as written, John's proposal would require distinct handler
>objects for each feature, which would be bad.  However, with a slight 
>modification, it would work beautifully.  Instead of using a string 
>as a feature ID, use a type descriptor (in Java, an instance of 
>java.lang.Class).  Feature handlers would be registered by supplying 
>the Class object that represents the feature being implemented, along 
>with a handler object that is assignable to that type.

This seems to me to be an excellent suggestion;)

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From db at eng.sun.com  Tue Mar  9 16:16:49 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:09:47 2004
Subject: ModSax Suggestion
References: <005001be6a23$7e574240$c9a8a8c0@thing2>
Message-ID: <36E547A3.41E513A@eng.sun.com>

> >> Looking forward to progress on the Java XML API.  BTW, Dave,
> >> are you going to do a "Birds of a Feather" session on XML at this year's
> >> JavaOne?  I think that could be valuable.
> >
> >I may be signed up for more than that this time...
> >
> >A BOF on XML -- an XML-DEV BOF! -- would be lots of fun.
> >Some of the folk here have never met in person.  I think
> >there will be lots of interesting applications to talk
> >about ... and probably some interesting frameworks.
> 
> Simon and I proposed a Coins BOF some time back and
> JavaOne accepted it. Might be a good place to meet and
> discuss ModSAX, filters, and such.

I thought they were doing BOF scheduling on a more
typical schedule -- e.g. hold off for a month or two
before the conference.  Evidently not!

I'll encourage someone else to do the legwork on setting
up an XML, or XML-DEV, BOF ... I'll gladly show up!  It's
not looking like something I'll have time to arrange.  I'm
sure contact information is available via the java.sun.com
website.

- Dave



From rbourret at ito.tu-darmstadt.de  Tue Mar  9 17:03:20 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:47 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6A56.FD03D0D0@grappa.ito.tu-darmstadt.de>

Richard L. Goerwitz wrote:

> Maybe I misunderstand, but as far as I can see, namespaces won't help
> you, either.  Why?  Because even if you can refer to, say, your two TITLE
> elements by different prefixes, you'll still have to declare the prefixed
> elements in the DTD as if they were atomic element names.
>
> Namespaces, in other words, don't solve your problem.  They may make it
> worse, in fact, because you have to know what prefixes you are going to
> declare in a given document to be able to rewrite your DTD to work with
> that document.
>
> There was a furor two or three months ago on this list about namespaces
> breaking validation.  That furor died down when the namespace spec became
> an official recommendation (a done deal, in other words).

You are correct.  In today's environment (namespace-unaware parsers and no 
way to associate prefixes and URIs in the DTD), you must use the same 
prefixes in the DTD and the document for validation to work.  I didn't 
state this because it was stated repeatedly during the aforementioned 
furor, which I sincerely hope this thread won't reignite.
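
To make the prefix constraint concrete, here is a small sketch using the
present-day JAXP SAX parser (not a parser from this thread). The DTD, the
element names and the http://example.org/a namespace name are invented for
illustration. The same content is validated twice, once with the prefix the
DTD uses and once with a different prefix bound to the same namespace name:

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class PrefixValidation {
    // Internal DTD subset that declares the elements under the prefix "a".
    static final String DTD =
        "<!DOCTYPE a:doc [\n"
      + "<!ELEMENT a:doc (a:title)>\n"
      + "<!ATTLIST a:doc xmlns:a CDATA #FIXED 'http://example.org/a'>\n"
      + "<!ELEMENT a:title (#PCDATA)>\n"
      + "]>\n";

    // Builds the same document content with whatever prefix the caller picks.
    static String document(String prefix) {
        return DTD
            + "<" + prefix + ":doc xmlns:" + prefix + "='http://example.org/a'>"
            + "<" + prefix + ":title>x</" + prefix + ":title>"
            + "</" + prefix + ":doc>";
    }

    // Parses with DTD validation on and counts reported validity errors.
    static int countValidationErrors(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setValidating(true);
        factory.setNamespaceAware(true);
        final int[] errors = {0};
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
            new DefaultHandler() {
                @Override
                public void error(SAXParseException e) { errors[0]++; }
            });
        return errors[0];
    }
}
```

document("a") validates cleanly, while document("b") is reported invalid even
though both prefixes map to the same namespace name, because DTD validation
matches qualified names character for character.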

-- Ron Bourret




From cowan at locke.ccil.org  Tue Mar  9 17:11:41 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org>
Message-ID: <36E55607.332557F1@locke.ccil.org>

Glenn Vanderburg wrote:

> First: let's not use exceptions to report non-error conditions.  There
> are theoretical and practical reasons to restrict the use of Java
> exceptions to reporting errors.

We should take this off-line.  I'll simply say: exceptions are
suitable for reporting exceptional conditions.  Having an object
request its own replacement is certainly exceptional.

> Second: if an application needs to implement certain features by
> pushing filters from the bottom,

The idea here is that an application may request a feature which
a parser does not itself support, but can be adapted to support
by pushing a filter between itself and the application.  That
of course requires that the application now talk to the filter
instead.  (In principle, the parser could act as an adapter for
the filter, but that would complicate the bejesus out of it.)

In Smalltalk, the parser could swap object ids with the
filter using the become: method, but AFAIK no other OO language
supports that.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From James.Anderson at mecomnet.de  Tue Mar  9 17:21:20 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:48 2004
Subject: Namespaces and DTDs
References: <01BE6A19.882B5360@grappa.ito.tu-darmstadt.de>
Message-ID: <36E55BFE.C5DB6816@mecomnet.de>

? which of the "namespace aware" parsers will permit you to parse and validate a
document for which portions of the dtd contain element declarations with
ambiguous names - without first modifying the dtd? i've yet to hear a solution
to the "ambiguous name" problem for xml-1.0/+ns conforming parsers.

Ronald Bourret wrote:
> 
> Elliotte Rusty Harold wrote:
> 
> > I have several DTDs with conflicting definitions of certain elements.
> > (e.g one defines a HEAD as a TITLE followed by a META and another
> > defines a HEAD as #PCDATA). I need to use all the DTDs and associated
> > markup languages for a single document.
> >
> > To an extent I can disambiguate them with namespaces.  However, is there
> > any way I can do this while still validating against the original DTDs?
> > ...
> 
> If you want to use a namespace-unaware parser, I don't see how you can
> avoid rewriting the DTDs. ...




From david at megginson.com  Tue Mar  9 17:22:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>
References: <083401be6a19$94195f50$010a0a0a@home.wilson.co.uk>
Message-ID: <14053.22490.504236.874846@localhost.localdomain>

John Wilson writes:

 > I don't think that it's unreasonable to insist that objects representing a
 > Feature, Handler or Property should either implement a distinct interface or
 > subclass a distinct class. If this is so the Parser can tell what Feature,
 > Handler or Property is being set by enquiring of the type of the object. (I
 > favour insisting that they subclass distinct classes because (in Java) that
 > naturally imposes the restriction that a single object can only represent a
 > single Property.)

We wouldn't want to have to rely on discovering the class at runtime,
so we'd have to have a method in the interface that reports a string
ID anyway -- a lot more work for the same result.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tug at wilson.co.uk  Tue Mar  9 18:41:03 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <098d01be6a5c$57d56190$010a0a0a@home.wilson.co.uk>


----- Original Message -----
From: David Megginson <david@megginson.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Sent: 08 March 1999 22:30
Subject: Re: SAX: ModSAX addition, general property query


>Tom Harding writes:
> > David Megginson wrote:
> >
> > > As I wrote before, it doesn't much matter whether we use Java property
> > > names incorporating domain names (like
> > > 'org.xml.sax.features.validation') or URIs (like
> > > 'http://xml.org/sax/features/validation'), as long as we have the
> > > ability for people to create new names without fear of collision.
> >
> > I would also urge against using an http: URI since it is not meant
> > that a resource actually be retrieved using the http protocol.
>
>I've been thinking about this issue, and I'm fairly convinced that the
>URI is the right choice.

I really have a problem with using URIs for this.

RFC2396 (http://www.ics.uci.edu/pub/ietf/uri/rfc2396.txt) section 6 talks
about URI normalisation and equivalence.

It says that URI equivalence is defined on a scheme basis. You have chosen
the http scheme so we are presumably required to apply the http definition
of URI equivalence. This does not seem to me to be a desirable criterion for
equivalence.
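
A small illustration of why the choice matters, using java.net.URI from later
JDKs purely as one example of URI-aware comparison (it is not the http
definition of equivalence, and the class name FeatureIdEquality is invented).
Two feature IDs that differ as strings can still compare equal as URIs:

```java
import java.net.URI;

// Two candidate notions of "same feature ID".
class FeatureIdEquality {
    // Plain character-for-character comparison.
    static boolean sameString(String a, String b) {
        return a.equals(b);
    }

    // Comparison under java.net.URI's rules (scheme and host ignore case).
    static boolean sameUri(String a, String b) {
        return URI.create(a).equals(URI.create(b));
    }
}
```

For "http://xml.org/sax/features/validation" and
"HTTP://XML.ORG/sax/features/validation", sameString is false and sameUri is
true. Adding an explicit ":80" makes java.net.URI report a difference, although
http itself treats an omitted port as the default port, so every library a
parser writer picks may draw the line somewhere else.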

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From tug at wilson.co.uk  Tue Mar  9 18:41:04 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>


----- Original Message -----
From: David Megginson <david@megginson.com>
To: John Wilson <tug@wilson.co.uk>
Cc: XML Developers' List <xml-dev@ic.ac.uk>
Sent: 09 March 1999 17:19
Subject: Re: SAX: ModSAX addition, general property query


>John Wilson writes:
>
> > I don't think that it's unreasonable to insist that objects representing a
> > Feature, Handler or Property should either implement a distinct interface or
> > subclass a distinct class. If this is so the Parser can tell what Feature,
> > Handler or Property is being set by enquiring of the type of the object. (I
> > favour insisting that they subclass distinct classes because (in Java) that
> > naturally imposes the restriction that a single object can only represent a
> > single Property.)
>
>We wouldn't want to have to rely on discovering the class at runtime,
>so we'd have to have a method in the interface that reports a string
>ID anyway -- a lot more work for the same result.

Testing the type at run time is a trivial operation in Java so I'm not sure
why you say that we wouldn't want to rely on discovering the class at run
time. If there was some worry about the performance hit on iterating through
all the supported interfaces (which I strongly doubt) the interface would
have a method that reported a Class rather than a String.

However, Glenn Vanderburg has suggested an amendment to my idea which seems
to me to address your concerns.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From jes at kuantech.com  Tue Mar  9 19:36:41 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:09:48 2004
Subject: RDF, ID's, XPtrs, and object orientation
Message-ID: <000001be6a63$d7c87de0$5118a8c0@kuantech1.quokka.com>

I am struggling with the following limitation caused by RDF's use of ID attributes:

I want to use RDF in a truly object-oriented fashion. It lets me get really close but not quite there. I would like to use the "subPropertyOf" element to indicate overriding. However, since property names are ID's, I can't override by name. I could use XPointer to refer to overridden names (in effect referring to "the property whose name is foo and whose class is bar"), but I can't actually define the bar version of foo and the baz version of foo in the same document. Of course, if I could specify a key composed of multiple attributes, my problems would be solved. 

I realize I can also avoid the problem by putting each "class" in a separate document, but this causes problems of its own in my particular application. 

If anyone has a hint as to how to get around this issue, that would be great, otherwise it's just food for thought. 

Jeff 

P.S. I am finding the problem of ID conflicts between "fragments" that need to be created separately and then combined into a single document to be a general one. My approach has been not to use ID attributes, but I don't have a choice if I'm using RDF. I suppose it will work as long as I don't validate, but I really want to validate.

-----------------------------------------------------------------
Kuantech, Inc.                            http://www.kuantech.com
Jeffrey E. Sussna, Principal                     jes@kuantech.com

Distributed Content Architectures for Dynamic Online Applications
-----------------------------------------------------------------




From glv at vanderburg.org  Tue Mar  9 19:49:19 1999
From: glv at vanderburg.org (Glenn Vanderburg)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> <36E3E712.D5556233@locke.ccil.org> <36E53DA7.BF547D80@vanderburg.org> <36E55607.332557F1@locke.ccil.org>
Message-ID: <36E57A78.87779A2C@vanderburg.org>

> We should take this off-line.  I'll simply say: exceptions are
> suitable for reporting exceptional conditions.  Having an object
> request its own replacement is certainly exceptional.

Well, yes and no.  But I'd prefer to go the cleaner route of not
allowing the object to request its own replacement.

> The idea here is that an application may request a feature which
> a parser does not itself support, but can be adapted to support
> by pushing a filter between itself and the application.  

Yes, I understand.

>                                                          That
> of course requires that the application now talk to the filter
> instead.  (In principle, the parser could act as an adapter for
> the filter, but that would complicate the bejesus out of it.)

It's not complicated at all --- merely a little tedious.  It would
be easy to provide a class in the helpers package that would make it
almost trivial.

My primary objection to the idea is precisely what you mentioned above:
that it is an extremely unusual thing to happen.  Programmers will be
surprised by this behavior.  Coupled with the fact that it's very easy
to make it all transparent, I think exposing the parser's internal
tricks is a bad idea.
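
To show roughly what I mean by making it transparent, here is a sketch with
much-reduced, hypothetical stand-ins for the SAX interfaces (TextHandler,
SimpleParser, TokenParser and the "join text" feature are all invented, and the
lambda syntax is present-day Java). The application keeps talking to one parser
object, and the filter is pushed in behind it:

```java
// Hypothetical, much-reduced stand-ins for the SAX handler and parser interfaces.
interface TextHandler { void text(String s); }

interface SimpleParser {
    void setHandler(TextHandler handler);
    void parse(String input);
}

// A bare parser with no optional features: one event per space-separated token.
class TokenParser implements SimpleParser {
    private TextHandler handler;
    public void setHandler(TextHandler handler) { this.handler = handler; }
    public void parse(String input) {
        for (String token : input.split(" ")) {
            handler.text(token);
        }
    }
}

// Adapter: looks like a parser to the application and hides the filter inside.
class FilteringParser implements SimpleParser {
    private final SimpleParser inner;
    private TextHandler application;
    private boolean joinText;   // hypothetical feature, off by default

    FilteringParser(SimpleParser inner) { this.inner = inner; }

    void setJoinText(boolean state) { joinText = state; }

    public void setHandler(TextHandler handler) { application = handler; }

    public void parse(String input) {
        if (!joinText) {
            // Feature off: events go straight from the inner parser to the application.
            inner.setHandler(application);
            inner.parse(input);
            return;
        }
        // Feature on: a buffering filter sits between inner parser and application.
        StringBuilder buffer = new StringBuilder();
        inner.setHandler(s -> buffer.append(s));
        inner.parse(input);
        application.text(buffer.toString());
    }
}
```

Turning the feature on never hands the application a replacement object, so
no exception carrying a new parser is needed.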

---glv



From david at megginson.com  Tue Mar  9 19:52:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <36E540FA.61C3F574@vanderburg.org>
References: <007801be6a29$d6efdd80$c9a8a8c0@thing2>
	<36E540FA.61C3F574@vanderburg.org>
Message-ID: <14053.31455.289569.926503@localhost.localdomain>

Glenn Vanderburg writes:

 > It seems probable to me that, whatever naming scheme is chosen for 
 > features, each feature will have a special interface that handlers 
 > must implement

This is not the case.  Some features will require special handlers,
some will allow special handlers, and some will simply change the way
existing handlers are used.  

For example, if you enable validation, you request that the parser
report additional error states to the existing ErrorHandler; if you
enable text-normalisation, you simply ask the parser to guarantee that
there will never be two DocumentHandler.characters events in a row.
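
As a sketch of the second case, a text-normalising layer needs no new handler
interface at all. The CoalescingFilter class below is a hypothetical name, not
part of SAX or ModSAX; it sits in front of an ordinary SAX 1.0 DocumentHandler
and merges adjacent character events before passing them on:

```java
import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;

// Hypothetical filter: buffers character data and forwards it as one event.
class CoalescingFilter implements DocumentHandler {
    private final DocumentHandler next;
    private final StringBuilder pending = new StringBuilder();

    CoalescingFilter(DocumentHandler next) { this.next = next; }

    // Emit any buffered text as a single characters() call.
    private void flush() throws SAXException {
        if (pending.length() > 0) {
            char[] text = pending.toString().toCharArray();
            pending.setLength(0);
            next.characters(text, 0, text.length);
        }
    }

    // Character data is only buffered here; it is delivered by flush().
    public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length);
    }

    public void setDocumentLocator(Locator locator) { next.setDocumentLocator(locator); }
    public void startDocument() throws SAXException { next.startDocument(); }
    public void endDocument() throws SAXException { flush(); next.endDocument(); }
    public void startElement(String name, AttributeList atts) throws SAXException {
        flush();
        next.startElement(name, atts);
    }
    public void endElement(String name) throws SAXException {
        flush();
        next.endElement(name);
    }
    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        flush();
        next.ignorableWhitespace(ch, start, length);
    }
    public void processingInstruction(String target, String data) throws SAXException {
        flush();
        next.processingInstruction(target, data);
    }
}
```

The application's DocumentHandler is unchanged; only the guarantee about the
events it receives is stronger.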


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar  9 19:53:20 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E53DA7.BF547D80@vanderburg.org>
References: <14051.3215.196642.22571@localhost.localdomain>
	<36E3E712.D5556233@locke.ccil.org>
	<36E53DA7.BF547D80@vanderburg.org>
Message-ID: <14053.31616.491923.652158@localhost.localdomain>

Glenn Vanderburg writes:

 > Second: if an application needs to implement certain features by 
 > pushing filters from the bottom, it can encapsulate the entire process
 > on its own, using a composite, and the process never needs to be 
 > exposed through the ModSAX API.

This is actually a good point.  Since the SAX driver is usually a
separate class rather than the parser itself, it would not be
difficult for it to encapsulate any needed filters.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar  9 20:18:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
In-Reply-To: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>
References: <098101be6a58$bc3b45e0$010a0a0a@home.wilson.co.uk>
Message-ID: <14053.33072.300457.335320@localhost.localdomain>

John Wilson writes:

 > Testing the type at run time is a trivial operation in Java 

... but not in other programming languages.

 > so I'm not sure why you say that we wouldn't want to rely on
 > discovering the class at run time. 

In the end, you're doing the equivalent of testing for a string anyway
-- you're just letting the Java class name serve as the unique ID.  I
don't see the advantage of forcing the users to get the unique ID
through a circuitous route.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tug at wilson.co.uk  Tue Mar  9 21:24:42 1999
From: tug at wilson.co.uk (John Wilson)
Date: Mon Jun  7 17:09:48 2004
Subject: SAX: ModSAX addition, general property query
Message-ID: <09b201be6a73$243c7dc0$010a0a0a@home.wilson.co.uk>


----- Original Message -----
From: David Megginson <david@megginson.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Sent: 09 March 1999 20:16
Subject: Re: SAX: ModSAX addition, general property query


>John Wilson writes:
>
> > Testing the type at run time is a trivial operation in Java
>
>... but not in other programming languages.
>
> > so I'm not sure why you say that we wouldn't want to rely on
> > discovering the class at run time.
>
>In the end, you're doing the equivalent of testing for a string anyway
>-- you're just letting the Java class name serve as the unique ID.  I
>don't see the advantage of forcing the users to get the unique ID
>through a circuitous route.


You are testing for a value. Testing for a String, a Class or an int are, at
that level, equivalent. The issue is: how do you choose the value? It so
happens that Java provides a natural way for us to create a unique value.
Other languages provide other ways of creating the unique value.

John Wilson
The Wilson Partnership
5 Market Hill, Whitchurch, Aylesbury, Bucks HP22 4JB, UK
+44 1296 641072, +44 976 611010(mobile), +44 1296 641874(fax)
Mailto: tug@wilson.co.uk




From nikita.ogievetsky at csfb.com  Tue Mar  9 21:56:33 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun  7 17:09:48 2004
Subject: Namespaces and DTDs
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB4424719@SNYC11309>

Richard L. Goerwitz wrote:
>Ronald Bourret wrote:
>> > I have several DTDs with conflicting definitions of certain elements.
>> > ...Am I going to have to break down and just rewrite the DTDs to use
>> > the qualified names?
>> 
>> If you want to use a namespace-unaware parser, I don't see how you can
>> avoid rewriting the DTDs.
>Maybe I misunderstand, but as far as I can see, namespaces won't help
>you, either. Why? Because even if you can refer to, say, your two TITLE
>elements by different prefixes, you'll still have to declare the prefixed
>elements in the DTD as if they were atomic element names.
>Namespaces, in other words, don't solve your problem. They may make it
>worse, in fact, because you have to know what prefixes you are going to
>declare in a given document to be able to rewrite your DTD to work with
>that document.

I have a similar problem:
On my web site, http://www.cogx.com, I am working on an XML-driven menu bar
(it could also be a tree, etc.).
The underlying XML uses reusable structures such as months, quarters of the
year, tax schedules with zillions of repeated tax lines, etc.
Instead of having just one XML document for the menu bar,
I moved the reusable fragments into a separate file and access them from
my main XML with
<group frnms:ref="fx_tax_lines"/>
or
<group frnms:ref="months_of_the_year"/>
It is also obvious that I should not keep all fragments in one reusable
collection, but rather separate them by theme. Why should I send a file of
tax schedules to a guy interested in opera performances?
So I can have as many reusable collections as I wish: tax related,
publications related, theater related, etc.
This means I should allow freedom in choosing namespace prefixes and still
know what each prefix means!
I achieve this by declaring my namespaces as follows:
xmlns:ref="groups:www.cogx.com/xmlbar/ref-menu.xml"
The leading "groups:" in the namespace name tells me that a namespace of
reusable fragments was defined.
Now I can give my prefix any name.
When parsing, I know that it is a namespace of reusable fragments!
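
The check described above can be sketched roughly as follows. This is only an
illustration, assuming the application has already collected the xmlns
declarations into a prefix-to-name map; the class and method names and the
second namespace in the demo are hypothetical and not part of my actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: names are illustrative only.
class FragmentNamespaces {
    // Marker taken from the namespace declaration shown in the message.
    static final String GROUPS_MARKER = "groups:";

    // True if the namespace name marks a collection of reusable fragments.
    static boolean isFragmentNamespace(String namespaceName) {
        return namespaceName != null && namespaceName.startsWith(GROUPS_MARKER);
    }

    // Strips the marker and returns the location of the fragment file.
    static String fragmentLocation(String namespaceName) {
        if (!isFragmentNamespace(namespaceName)) {
            throw new IllegalArgumentException("not a fragment namespace: " + namespaceName);
        }
        return namespaceName.substring(GROUPS_MARKER.length());
    }

    public static void main(String[] args) {
        // prefix -> namespace name, as collected from xmlns:* attributes
        Map<String, String> declared = new LinkedHashMap<>();
        declared.put("ref", "groups:www.cogx.com/xmlbar/ref-menu.xml");
        declared.put("other", "http://example.org/other"); // made-up, for contrast
        for (Map.Entry<String, String> e : declared.entrySet()) {
            if (isFragmentNamespace(e.getValue())) {
                System.out.println(e.getKey() + " -> " + fragmentLocation(e.getValue()));
            }
        }
    }
}
```

The point is that the decision depends only on the namespace name, never on
the prefix, so the prefix can be chosen freely in each document.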

The problem here is that the <group> element has to be defined with an open
model to allow for different namespace prefixes.

I also proposed that it would be great to reserve an "any" prefix for
this type of situation. This would save me from using an open model, which
I really do not like!

> The most effective responses I saw were
>from people who said, in effect, "Namespaces do far less than you want
>or expect them to."

Exactly! And this is why
	Namespaces let you do much more than you thought you could!

Best regards,
	Nikita O.




From Marc.McDonald at Design-Intelligence.com  Tue Mar  9 22:49:16 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:09:48 2004
Subject: Namespaces and DTDs
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990309224818Z-5223@master.design-intelligence.com>

A simple extension to namespaces could have fixed this problem:
1.	Allow a DTD to be optionally specified along with the namespace 
prefix and URI
2.	When an element is prefixed, parse it using the DTD associated with 
the namespace and the given prefix as the default.
3.	If no DTD is associated with the prefix or not validating, do what 
is done now (ensure element is well-formed).

Your DTDs would not need to be changed, you would just have to 
indicate which HEAD (for example) is desired in the content and add 
associated DTD urls to the namespace declarations.
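
A rough sketch of the lookup those three steps imply, purely as an
illustration of a proposal that does not exist in any parser: the class, its
methods, and the example.org URL are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Sketch of the proposed extension only; every name here is hypothetical.
class PrefixDtdTable {
    private final Map<String, String> dtdByPrefix = new HashMap<>();

    // Step 1: record a prefix together with an optional DTD (null if none).
    void declare(String prefix, String dtdUrl) {
        dtdByPrefix.put(prefix, dtdUrl);
    }

    // Steps 2 and 3: pick the DTD for a prefixed name, or none at all,
    // in which case only well-formedness would be checked.
    Optional<String> dtdFor(String qualifiedName, boolean validating) {
        int colon = qualifiedName.indexOf(':');
        if (!validating || colon < 0) {
            return Optional.empty();
        }
        String prefix = qualifiedName.substring(0, colon);
        return Optional.ofNullable(dtdByPrefix.get(prefix));
    }

    public static void main(String[] args) {
        PrefixDtdTable table = new PrefixDtdTable();
        table.declare("a", "http://example.org/a.dtd"); // made-up URL
        table.declare("b", null);                       // namespace without a DTD
        System.out.println(table.dtdFor("a:HEAD", true));
        System.out.println(table.dtdFor("b:HEAD", true));
        System.out.println(table.dtdFor("a:HEAD", false));
    }
}
```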

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de]
Sent:  Tuesday, March 09, 1999 9:02 AM
To:  xml-dev@ic.ac.uk
Subject:  RE: Namespaces and DTDs

Richard L. Goerwitz wrote:

> Maybe I misunderstand, but as far as I can see, namespaces won't help
> you, either.  Why?  Because even if you can refer to, say, your two TITLE
> elements by different prefixes, you'll still have to declare the prefixed
> elements in the DTD as if they were atomic element names.
>
> Namespaces, in other words, don't solve your problem.  They may make it
> worse, in fact, because you have to know what prefixes you are going to
> declare in a given document to be able to rewrite your DTD to work with
> that document.
>
> There was a furor two or three months ago on this list about namespaces
> breaking validation.  That furor died down when the namespace spec became
> an official recommendation (a done deal, in other words).

You are correct.  In today's environment (namespace-unaware parsers and no
way to associate prefixes and URIs in the DTD), you must use the same
prefixes in the DTD and the document for validation to work.  I didn't
state this because it was stated repeatedly during the aforementioned
furor, which I sincerely hope this thread won't reignite.

-- Ron Bourret




From cowan at locke.ccil.org  Tue Mar  9 23:06:12 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:48 2004
Subject: RDF, ID's, XPtrs, and object orientation
References: <000001be6a63$d7c87de0$5118a8c0@kuantech1.quokka.com>
Message-ID: <36E5A929.394204F0@locke.ccil.org>

Jeffrey E. Sussna wrote:

> My approach has been not to use ID attributes, but I don't have a
> choice if I'm using RDF. I suppose it will work as long as I don't
> validate, but I really want to validate.

Actually, the values of ID and bagID attributes have to be unique
within the document, but nothing says that either has to be an
XML "ID attribute".  (That was so in earlier drafts, but not now.)

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Wed Mar 10 01:12:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:48 2004
Subject: ModSAX: Proposed Core Handlers
Message-ID: <14053.50619.147376.869177@localhost.localdomain>

My current proposal for the ModParser interface includes the following 
method (ModHandler is an empty interface):

  public abstract void setHandler (String handlerID, ModHandler handler)
    throws SAXNotSupportedException;

I propose the following core handlers, with the understanding that SAX 
parsers are not required to support any of them (they are free to
throw a SAXNotSupportedException):


ModSAX Core Handlers
--------------------

(All handler IDs correspond to a specific interface.)

http://xml.org/sax/handlers/lexical <LexicalHandler>
  Receive callbacks for comments, CDATA sections, and (possibly)
  entity references.

http://xml.org/sax/handlers/dtd-decl <DTDDeclHandler>
  Receive callbacks for element, attribute, and (possibly) parsed
  entity declarations.

http://xml.org/sax/handlers/namespace <NamespaceHandler>
  Receive callbacks for the start and end of the scope of each
  namespace declaration.


I'm not certain, but it might make sense to replace the third one with 
a read-only parse-time property.
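
As a minimal sketch of how a client might probe for one of these handlers,
assuming the setHandler signature above: ModHandler and the handler IDs come
from the proposal, while ToyLexicalHandler (a stand-in for LexicalHandler
with a guessed callback), ToyHandlerParser, and tryRegister are hypothetical
names made up for illustration. The JDK's org.xml.sax.SAXNotSupportedException
stands in for the exception named in the proposal.

```java
import org.xml.sax.SAXNotSupportedException;

// Empty marker interface, as in the proposal.
interface ModHandler { }

// Hypothetical shape for the lexical handler; only one callback is sketched.
interface ToyLexicalHandler extends ModHandler {
    void comment(String text);
}

// Toy parser (an assumption, not a real implementation) that supports
// the lexical handler and rejects every other handler ID.
class ToyHandlerParser {
    static final String LEXICAL = "http://xml.org/sax/handlers/lexical";
    private ToyLexicalHandler lexical;

    void setHandler(String handlerID, ModHandler handler)
            throws SAXNotSupportedException {
        if (LEXICAL.equals(handlerID) && handler instanceof ToyLexicalHandler) {
            lexical = (ToyLexicalHandler) handler;
        } else {
            throw new SAXNotSupportedException(handlerID);
        }
    }

    // Stand-in for the parser meeting a comment in the input.
    void fireComment(String text) {
        if (lexical != null) {
            lexical.comment(text);
        }
    }

    // Registers a handler and reports whether the parser accepted it.
    static boolean tryRegister(ToyHandlerParser parser, String handlerID,
                               ModHandler handler) {
        try {
            parser.setHandler(handlerID, handler);
            return true;
        } catch (SAXNotSupportedException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        ToyHandlerParser parser = new ToyHandlerParser();
        ToyLexicalHandler printer = text -> System.out.println("comment: " + text);
        System.out.println(tryRegister(parser, LEXICAL, printer));
        System.out.println(tryRegister(parser,
                "http://xml.org/sax/handlers/namespace", printer));
        parser.fireComment("hello");
    }
}
```

The client never needs to know in advance which handlers a parser supports;
it simply tries to register and falls back if the call is refused.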


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 10 01:17:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:49 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <14053.50863.546824.628181@localhost.localdomain>

My current proposal for the ModParser interface includes the following 
methods:

  public abstract void set (String propID, Object value)
    throws SAXNotSupportedException;

  public abstract Object get (String propID)
    throws SAXNotSupportedException;

Properties may be read-write, read-only, or write-only; they may also
be parse-time (may be changed during parsing) or non-parse-time (may
be changed only before a parse or between parses).


ModSAX Core Properties
----------------------

(All properties are associated with a single value type.)

http://xml.org/sax/properties/namespace-sep <String> (write-only)
  Set the separator to be used between the URI part of a name and the
  local part of a name when namespace processing is being performed
  (see the http://xml.org/sax/features/namespaces feature).  By
  default, the separator is a single space.  This property may not be
  set while a parse is in progress (throws a SAXNotSupportedException).

http://xml.org/sax/properties/dom-node <Node> (read-only)
  Get the DOM node currently being visited, if the SAX parser is
  iterating over a DOM tree.  If the parser recognises and supports
  this property but is not currently visiting a DOM node, it should
  return null (this is a good way to check for availability before the 
  parse begins).

http://xml.org/sax/properties/xml-string <String> (read-only)
  Get the literal string of characters associated with the current
  event.  If the parser recognises and supports this property but is
  not currently parsing text, it should return null (this is a good
  way to check for availability before the parse begins).  I stole
  this idea from Expat.


Remember that no SAX parser will be required to support any of these
-- it simply has to throw a SAXNotSupportedException if it doesn't
know about the property.
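
To make the read-only, write-only, and non-parse-time distinctions concrete,
here is a minimal sketch of a parser that knows two of the properties above.
The property IDs and the default separator come from the proposal; the class
ToyPropertyParser and its helper methods are hypothetical, and the JDK's
org.xml.sax.SAXNotSupportedException stands in for the proposed exception.

```java
import org.xml.sax.SAXNotSupportedException;

// Toy parser, invented for illustration; it parses nothing.
class ToyPropertyParser {
    static final String NS_SEP = "http://xml.org/sax/properties/namespace-sep";
    static final String XML_STRING = "http://xml.org/sax/properties/xml-string";

    private String separator = " ";    // default from the proposal: a single space
    private boolean parsing = false;
    private String currentText = null; // literal text of the current event, if any

    // Write side: only namespace-sep is writable, and only between parses.
    void set(String propID, Object value) throws SAXNotSupportedException {
        if (NS_SEP.equals(propID) && value instanceof String && !parsing) {
            separator = (String) value;
        } else {
            throw new SAXNotSupportedException(propID);
        }
    }

    // Read side: only xml-string is readable; null means "known, no text now".
    Object get(String propID) throws SAXNotSupportedException {
        if (XML_STRING.equals(propID)) {
            return currentText;
        }
        throw new SAXNotSupportedException(propID);
    }

    void beginParse() { parsing = true; }
    void endParse() { parsing = false; currentText = null; }

    // Joins a namespace URI and a local name with the current separator.
    String qualify(String uri, String localName) {
        return uri + separator + localName;
    }

    static boolean trySet(ToyPropertyParser p, String propID, Object value) {
        try { p.set(propID, value); return true; }
        catch (SAXNotSupportedException e) { return false; }
    }

    static boolean canGet(ToyPropertyParser p, String propID) {
        try { p.get(propID); return true; }
        catch (SAXNotSupportedException e) { return false; }
    }

    public static void main(String[] args) {
        ToyPropertyParser p = new ToyPropertyParser();
        System.out.println(canGet(p, XML_STRING)); // availability check before parsing
        System.out.println(trySet(p, NS_SEP, "^"));
        p.beginParse();
        System.out.println(trySet(p, NS_SEP, "|")); // refused during a parse
        p.endParse();
    }
}
```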


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 10 01:17:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:49 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <14053.51113.676945.877507@localhost.localdomain>

Here's my revised version of the core feature list, based on recent
discussions:


ModSAX Core Features
--------------------

http://xml.org/sax/features/validation
  Validate (true) or don't validate (false).

http://xml.org/sax/features/external-general-entities
  Expand external general entities (true) or don't expand (false).

http://xml.org/sax/features/external-parameter-entities
  Expand external parameter entities (true) or don't expand (false).

http://xml.org/sax/features/namespaces
  Preprocess namespaces (true) or don't preprocess (false).  See also
  the http://xml.org/sax/properties/namespace-sep property.

http://xml.org/sax/features/normalize-text
  Ensure that all consecutive text is returned in a single callback to
  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
  (true) or explicitly do not require it (false).
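
For illustration only, here is how probing such feature IDs looks against
the SAX2-style XMLReader bundled with current JDKs. That is an assumption
made to get a runnable example; it is not the ModParser interface under
discussion here, and the FeatureProbe class and its methods are hypothetical
helper names.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;

// Hypothetical helper around the JDK's XMLReader.
class FeatureProbe {
    // Tries to set a feature; returns false if the reader rejects the ID
    // (either not recognised or not supported).
    static boolean trySet(XMLReader reader, String featureID, boolean state) {
        try {
            reader.setFeature(featureID, state);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    // Obtains whatever XMLReader the platform provides.
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        String base = "http://xml.org/sax/features/";
        for (String name : new String[] {"validation", "namespaces", "normalize-text"}) {
            System.out.println(name + ": " + trySet(reader, base + name, true));
        }
    }
}
```

Because features are identified by plain strings, a client can ask for any
of them without compile-time knowledge of what a given parser supports.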


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 10 01:21:08 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:49 2004
Subject: ModSAX: Proposed ModParser Interface
Message-ID: <14053.51158.347156.718466@localhost.localdomain>

Here's my current proposed interface for ModParser:

public interface ModParser extends Parser
{
  public abstract void setFeature (String featureID, boolean state)
    throws SAXNotSupportedException;

  public abstract void setHandler (String handlerID, ModHandler handler)
    throws SAXNotSupportedException;

  public abstract void set (String propID, Object value)
    throws SAXNotSupportedException;

  public abstract Object get (String propID)
    throws SAXNotSupportedException;
}


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From andrew at squiz.co.nz  Wed Mar 10 01:24:06 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:09:49 2004
Subject: Namespaces and DTDs 
In-Reply-To: Your message of "Tue, 09 Mar 1999 14:48:18 -0800."
             <c=US%a=_%p=Design_Intellige%l=MASTER-990309224818Z-5223@master.design-intelligence.com> 
Message-ID: <199903100123.OAA10814@aniwa.sky>


How about having the ability to say 'process the children of this element 
using that dtd'.  Attach DTD declarations to elements, not just to documents.

It feels like some way is needed to make interpretation of XML subtrees 
dependent on context, hence not requiring the rewriting of XML imported into a 
document as a subtree from the context of a different document.

(Perhaps I'm being naive.  I'm new to this.)

Andrew McNaughton


> A simple extension to namespaces could have fixed this problem:
> 1.	Allow a DTD to be optionally specified along with the namespace 
> prefix and URI
> 2.	When an element is prefixed, parse it using the DTD associated with 
> the namespace and the given prefix as the default.
> 3.	If no DTD is associated with the prefix or not validating, do what 
> is done now (ensure element is well-formed).
> 
> Your DTDs would not need to be changed, you would just have to 
> indicate which HEAD (for example) is desired in the content and add 
> associated DTD urls to the namespace declarations.
> 
> Marc B McDonald
> Principal Software Scientist
> Design Intelligence, Inc
> www.design-intelligence.com
> 
> 
> ----------
> From:  Ronald Bourret [SMTP:rbourret@ito.tu-darmstadt.de]
> Sent:  Tuesday, March 09, 1999 9:02 AM
> To:  xml-dev@ic.ac.uk
> Subject:  RE: Namespaces and DTDs
> 
> Richard L. Goerwitz wrote:
> 
> > Maybe I misunderstand, but as far as I can see, namespaces won't help
> > you, either.  Why?  Because even if you can refer to, say, your two TITLE
> > elements by different prefixes, you'll still have to declare the prefixed
> > elements in the DTD as if they were atomic element names.
> >
> > Namespaces, in other words, don't solve your problem.  They may make it
> > worse, in fact, because you have to know what prefixes you are going to
> > declare in a given document to be able to rewrite your DTD to work with
> > that document.
> >
> > There was a furor two or three months ago on this list about namespaces
> > breaking validation.  That furor died down when the namespace spec became
> > an official recommendation (a done deal, in other words).
> 
> You are correct.  In today's environment (namespace-unaware parsers and no
> way to associate prefixes and URIs in the DTD), you must use the same
> prefixes in the DTD and the document for validation to work.  I didn't
> state this because it was stated repeatedly during the aforementioned
> furor, which I sincerely hope this thread won't reignite.
> 
> -- Ron Bourret

-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From paul at prescod.net  Wed Mar 10 01:30:01 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:09:49 2004
Subject: Architectural Forms Questions
References: <Pine.GHP.4.02A.9903081149260.2617-100000@mail.ilrt.bris.ac.uk> <36E3F30C.F6D6DB51@mitre.org>
Message-ID: <36E5BA4C.7916D8@prescod.net>

"Roger L. Costello" wrote:
> 
> - How powerful is the correspondence that you can express with
> Architectural Forms?  Is it essentially limited to renaming and
> omission?

You can also map elements to attributes and attributes to elements.

> - In addition to using Architectural Forms to express correspondences
> that are known a priori, could you use them to document mappings that
> are discovered "on-the-fly" by modifying a document or DTD after a
> mapping is discovered?

Yes, you can do this by modifying DTDs. Caveat: In my experience it is
seldom the case that a subtype relationship can be "discovered" after the
fact. It works for really loose DTDs like HTML and ICADD, but not for more
complex/strict DTDs. This is very similar to the situation in software
development. It is very rarely the case that you can "adapt" an existing
class to a newly discovered supertype without radically changing the class
or breaking existing code.

> - It appears to be the case that the correspondence between A and B must
> be documented in a way that keeps the mapping tightly coupled to either
> A or B. Are there any plans to represent the correspondence so that it
> is not tightly coupled to either A or B?

You could think of this as the distinction between subtyping and
transformation. Subtyping is about an inherent relationship that is
discovered in advance. Transformation is about imposing a mapping
externally, "on the fly."

> - Is it a correct interpretation to say that Architectural Forms
> represent correspondence by overloading existing language constructs?

"Overloading" is a somewhat overloaded term. Let's say "reusing" existing
language constructs.

> - Given that subtyping and inheritance have been part of the primary XML
> "schema" proposals, is it likely that XML Architectural Forms will be
> overtaken by advances in the XML schema area?

Eventually. In what time frame, I don't know.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come
equipped with adjustable pedals to fit smaller drivers and sensor 
devices that warn the driver when he or she is about to back into a
Toyota or some other object." -- Dallas Morning News



From Marc.McDonald at Design-Intelligence.com  Wed Mar 10 01:31:13 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:09:49 2004
Subject: Namespaces and DTDs 
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990310012937Z-5348@master.design-intelligence.com>

Exactly. By using <a:HEAD> you would be saying: process the HEAD
element according to the DTD associated with the namespace prefix 'a',
and consider 'a' to be the default namespace for that DTD. If there is
no associated DTD, the parser can only check that HEAD is well-formed.


Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  Andrew McNaughton [SMTP:andrew@squiz.co.nz]
Sent:  Wednesday, March 10, 1999 6:23 AM
To:  Marc McDonald
Cc:  xml-dev@ic.ac.uk; rbourret@ito.tu-darmstadt.de
Subject:  Re: Namespaces and DTDs


How about having the ability to say 'process the children of this element
using that dtd'.  Attach DTD declarations to elements, not just to documents.

It feels like some way is needed to make interpretation of XML subtrees
dependent on context, hence not requiring the rewriting of XML imported into a
document as a subtree from the context of a different document.

(Perhaps I'm being naive.  I'm new to this.)

Andrew McNaughton



--
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From MikeDacon at aol.com  Wed Mar 10 02:40:06 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:49 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <a64969b7.36e5dab6@aol.com>

Hi David,

In a message dated 3/9/99 8:30:31 PM Eastern Standard Time,
david@megginson.com writes:
> http://xml.org/sax/properties/dom-node <Node> (read-only)
>    Get the DOM node currently being visited, if the SAX parser is
>    iterating over a DOM tree.  If the parser recognises and supports
>    this property but is not currently visiting a DOM node, it should
>    return null (this is a good way to check for availability before the 
>    parse begins).
>  

This has made me realize that I was under a misconception about
what the generic get() and set() parser properties would provide in
terms of functionality.  What I was really hoping for was:

org.w3c.dom.Document  parse(InputSource  is, boolean events)
    throws SAXException;
org.w3c.dom.Document  parse(java.lang.String uri, boolean events)
    throws SAXException;
/* the events boolean would be to turn on/off event calls. */

Which would allow me to code:
try
{
     ModParser mp = ParserFactory.makeModParser();
     boolean supported = true;
     try
     {
          // Ask for a DOM result; a parser without this feature will throw.
          mp.setFeature("http://xml.org/sax/features/dom-result", true);
     }
     catch (SAXNotSupportedException snse) { supported = false; }

     if (supported)
     {
          Document d = mp.parse("test.xml", false);
          // ... process Document
     }
}
catch (SAXException se)
{
     // handle it
}

So, what I'm saying is that I would like to be able to choose 
whether to interface to the Parser via events or via a DOM.
If you agree with this, I believe using the return type is more
appropriate than getting a resultant property (as I suggest next).

If for some reason the above is not palatable, the same could be 
accomplished under the current scheme if we added a 
property:

http://xml.org/sax/properties/dom-document <org.w3c.dom.Document> (read-only)

Then I could code:

try
{
     ModParser mp = ParserFactory.makeModParser();
     boolean supported = true;
     try
     {
          mp.setFeature("http://xml.org/sax/features/dom-capable", true);
     }
     catch (SAXNotSupportedException snse) { supported = false; }

     if (supported)
     {
          mp.parse("test.xml");
          // Fetch the finished tree through the proposed read-only property.
          Document d = (Document)
               mp.get("http://xml.org/sax/properties/dom-document");
          // ... process Document
     }
}
catch (SAXException se)
{
     // handle it
}

Note: both code examples also required an added feature to check
for the desired functionality.

I believe the above is sorely missing from the current API.  Does
anyone else see a need for this?  If not, why not?  But before you
say, "build a layer on top of SAX" -- to me that seems ridiculous when most
of the Parser implementations can produce a DOM Document.

Best wishes,

 - Mike (mdaconta@aol.com)



From b.laforge at jxml.com  Wed Mar 10 04:23:41 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:49 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <013601be6aac$f96ad580$c9a8a8c0@thing2>

From: MikeDacon@aol.com <MikeDacon@aol.com>
>This has made me realize that I was under a misconception about
>what the generic get() and set() parser properties would provide in
>terms of functionality.  What I was really hoping for was:
>
>org.w3c.dom.Document  parse(InputSource  is, boolean events) throws
>SAXException; 
>org.w3c.dom.Document  parse(java.lang.String uri, boolean events) throws
>SAXException;
>/* the events boolean would be to turn on/off event calls. */


I think you have this capability without the extra parameter, since you don't
get events unless you register a handler to receive them.

Bill




From GAjitK at dbss.com  Wed Mar 10 08:38:28 1999
From: GAjitK at dbss.com (George, Ajit Kumar (CTS))
Date: Mon Jun  7 17:09:49 2004
Subject: No subject
Message-ID: <0B9BF5AE8A3ED21196980060B0B54551870EFF@CTSINENTSXUA>

Hi,

I am new to XML and Java. I am trying to display an XML document in a
tree structure using the XML parser classes from IBM xml4j 2.0.0. I am able
to get to the elements, but how do I get the text content out of an element?

I have a NodeList and I am able to iterate through it, but I am not
able to figure out a way to get the content information out of it.

I would appreciate any help with this. I will not be using the Microsoft
parser classes.

regards

Ajit


GAjitK@dbss.com




From leich at wiwi.uni-marburg.de  Wed Mar 10 09:04:02 1999
From: leich at wiwi.uni-marburg.de (Steffen Leich)
Date: Mon Jun  7 17:09:49 2004
Subject: your mail
In-Reply-To: <0B9BF5AE8A3ED21196980060B0B54551870EFF@CTSINENTSXUA>
Message-ID: <Pine.LNX.4.10.9903100953490.13247-100000@pc02yh.wiwi.uni-marburg.de>

On Wed, 10 Mar 1999, George, Ajit Kumar (CTS) wrote:

> Hi,
> 
> I am new to the XML and Java. I am trying to display a XML document in a
> tree structure using
> XML parser classes from IBM xml4j 2.0.0. I am able to get to the elements,
> but how do I
> get the text content out of the element 
> 
> So I do have a NodeList and I am able to iterate through it, but I am not
> able to figure out a
> way to get the content information out of it.
> 
> I could appreciate any help in this. I will not be using Microsoft parser
> classes.
> 

Hi,
check out the following URLs:

http://www.software.ibm.com/xml/education/buildappl/xml_to_html.html

http://www.alphaworks.ibm.com/forum/xmlforjava.nsf/discussion_vert
(Discussion of and Links to Tutorials)

http://developerlife.com/xmljavatutorial1
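
For reference, here is a minimal sketch of one way to read the text of an
element using only the standard W3C DOM interfaces. It assumes the JAXP
parser bundled with current Java releases rather than xml4j 2.0.0, and the
names ElementText, textOf and parse are made up for illustration. The point
it tries to show is that an element's character data lives in its Text
child nodes, so you read it from the children rather than from the element
itself.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper class; not part of xml4j or any other parser.
class ElementText {
    /** Concatenates the character data of the element's direct Text and CDATA children. */
    static String textOf(Element element) {
        StringBuilder text = new StringBuilder();
        NodeList children = element.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            short type = child.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                text.append(child.getNodeValue());
            }
        }
        return text.toString();
    }

    /** Parses an XML string with the JAXP default parser (an assumption, not xml4j). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        Document doc = parse("<book><title>XML Basics</title></book>");
        NodeList titles = doc.getElementsByTagName("title");
        for (int i = 0; i < titles.getLength(); i++) {
            System.out.println(textOf((Element) titles.item(i)));
        }
    }
}
```

The same loop should work on any NodeList of elements, since it relies only
on getChildNodes(), getNodeType() and getNodeValue() from the DOM Node
interface.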


Steffen

___________________________________________________
Steffen Leich               Phone: +49-6421-283144
leich@wiwi.uni-marburg.de
Universitaet Marburg
Informations- und Kommunikationsdienste       FB 02




From rbourret at ito.tu-darmstadt.de  Wed Mar 10 09:16:04 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:49 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>

james anderson wrote:

> ? which of the "namespace aware" parsers will permit you to parse
> and validate a document for which portions of the dtd contain element
> declarations with ambiguous names - without first modifying the dtd?
> i've yet to hear a solution to the "ambiguous name" problem for
> xml-1.0/+ns conforming parsers.

Good point -- it was unfair of me to blame the parsers here.  It all seems 
rather obvious now:

Q. Why were namespaces invented?
A. To disambiguate duplicate names.

Q. I have a DTD with duplicate names.  How do I disambiguate them?
A. Use namespaces.

The only inobvious bit is that, because there is no way to declare 
namespaces in the DTD, you can't declare different default namespaces for 
different parts of the DTD, which would have solved Elliotte's problem 
rather neatly.

-- Ron Bourret




From digitome at iol.ie  Wed Mar 10 09:19:07 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:09:49 2004
Subject: Architectural Forms Questions
In-Reply-To: <36E5BA4C.7916D8@prescod.net>
References: <Pine.GHP.4.02A.9903081149260.2617-100000@mail.ilrt.bris.ac.uk>
 <36E3F30C.F6D6DB51@mitre.org>
Message-ID: <3.0.6.32.19990310090702.0097ce90@gpo.iol.ie>

>"Roger L. Costello" wrote:
> - Given that subtyping and inheritance have been part of the primary XML
> "schema" proposals, is it likely that XML Architectural Forms will be
> overtaken by advances in the XML schema area?
>

I believe and hope this is true. The mapping that AFs
enable is too limiting in my experience. Case in point:
at XML 98 in Chicago the GCA issued a DTD for paper
submissions. I wrote a paper for that conference using
XML. Along comes XML Europe 99 with a variation on the
DTD for paper submissions. Even this mapping between
two DTDs from the same broad organization in the same
ballpark of document types cannot be done with AFs.
At least not with my cerebral cortex.


<Sean uri="http://www.digitome.com/sean.htm"/>




From l-arcini at uniandes.edu.co  Wed Mar 10 10:02:51 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun  7 17:09:50 2004
Subject: Req:Music DTD(?)
Message-ID: <36E644E5.6E728D40@uniandes.edu.co>

Hello to all,
I'm currently working on an XML-based sequencer, and I would like to see
some music notation DTDs before I start to write my own. I've searched
the web high and low with no luck so far, so I was wondering if any of
you have any pointers I could use.

Thanks in advance
Fabio

--
Fabio Arciniegas A.
Ingenieria de Sistemas
Uniandes




From reschke at medicaldataservice.de  Wed Mar 10 10:23:53 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun  7 17:09:50 2004
Subject: XML query engines
Message-ID: <000d01be6ae0$8a0da080$2e00a8c0@julian>

At Sun, 31 Jan 1999 16:53:32 -0800, Tim Bray (tbray@textuality.com) wrote:

>At 08:24 PM 1/31/99 +73900, John Cowan wrote:
>>Assign a sequentially increasing number to each *tag* (start-tag or
>>end-tag) in the document, treating an empty tag as a start-tag
>>followed by an end-tag. Then e1 is a descendant of e2 iff e1.start > e2.start
>>and e1.end < e2.end. Also, e1 is a left sibling of e2 (and e2 is
>>a right sibling of e1) iff e1.end + 1 = e2.start; e1 is the leftmost
>>child of e2 iff e1.start = e2.start + 1. Modeling the child/parent
>>relationship is not so easy, and requires iteration.
>
>This structure has all sorts of advantages; that's how the
>Open Text SGML-savvy search engine of yore used to run. Fast as
>ell, equal access to any & all elements without performance
>penalty.
>
>
>But hard to update.

Is there an easy way to apply this model to an MSXML.DLL DOM object?
Microsoft's documentation (uniqueID Method, elementIndexList Method) is not
very clear about how these IDs are generated, or whether they remain the
same across two separate parser invocations on the same XML data...
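
As a rough illustration of the numbering scheme quoted above, the sketch
below assigns start and end numbers by walking a standard W3C DOM tree. It
does not use MSXML or its IDs; the class TagNumbering, the Span type and
the method names are all hypothetical, and keying the result by tag name
only works because every element name in the sample document is unique.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of the start/end tag numbering described in the quote.
class TagNumbering {
    /** Start-tag and end-tag numbers of one element. */
    static final class Span {
        final int start;
        final int end;
        Span(int start, int end) { this.start = start; this.end = end; }
    }

    private final Map<String, Span> spans = new LinkedHashMap<>();
    private int counter = 0;

    /** Numbers tags in document order; an empty element counts as start-tag plus end-tag. */
    private void number(Element e) {
        int start = ++counter;
        for (Node n = e.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n instanceof Element) {
                number((Element) n);
            }
        }
        // Keyed by tag name for the demo only (assumes unique names).
        spans.put(e.getTagName(), new Span(start, ++counter));
    }

    /** Parses the XML and returns each element's span, keyed by tag name. */
    static Map<String, Span> numberTags(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            TagNumbering numbering = new TagNumbering();
            numbering.number(root);
            return numbering.spans;
        } catch (Exception ex) {
            throw new IllegalArgumentException("could not parse XML", ex);
        }
    }

    static boolean isDescendant(Span e1, Span e2) {
        return e1.start > e2.start && e1.end < e2.end;
    }

    static boolean isLeftSibling(Span e1, Span e2) {
        return e1.end + 1 == e2.start;
    }

    static boolean isLeftmostChild(Span e1, Span e2) {
        return e1.start == e2.start + 1;
    }

    public static void main(String[] args) {
        Map<String, Span> s = numberTags("<a><b><c/></b><d/></a>");
        System.out.println("c descendant of a: " + isDescendant(s.get("c"), s.get("a")));
        System.out.println("b left sibling of d: " + isLeftSibling(s.get("b"), s.get("d")));
        System.out.println("b leftmost child of a: " + isLeftmostChild(s.get("b"), s.get("a")));
    }
}
```

Because the numbers come from document order alone, two parses of the same
XML data give the same numbers in this sketch; whether MSXML's own IDs
behave that way is exactly the open question.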

--
Julian Reschke
MedicalData Service GmbH (http://www.medicaldataservice.de)




From James.Anderson at mecomnet.de  Wed Mar 10 10:34:43 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:50 2004
Subject: Namespaces and DTDs
References: <c=US%a=_%p=Design_Intellige%l=MASTER-990309224818Z-5223@master.design-intelligence.com>
Message-ID: <36E64E4F.83649621@mecomnet.de>

That "REC-xml-names-19990114" does not provide any means to establish
prefix<->uri bindings for a DTD has long been a point of contention. A cursory
search of the archives will bear this out. The decision to eliminate the
combined prefix/uri/dtd binding (the original pi form) was, however, correct,
as the pi form, at least as proposed in "WD-xml-names-19980327", would not
have been sufficient to handle such things as a dtd which needs multiple
prefix bindings or the situation where a given prefix<->uri binding is to
apply to multiple schema sources.

While it is true that some mechanism is necessary, a form - as discussed below
- which effected a singular binding would also not have solved the problem.
"Everyone" would seem to be waiting for "schemas"....

Marc.McDonald@Design-Intelligence.com wrote:
> 
> A simple extension to namespaces could have fixed this problem:
> 1.      Allow a DTD to be optionally specified along with the namespace
> prefix and URI
> 2.      When an element is prefixed, parse it using the DTD associated with
> the namespace and the given prefix as the default.
> 3.      If no DTD is associated with the prefix or not validating, do what
> is done now (ensure element is well-formed).
>




From david at megginson.com  Wed Mar 10 11:30:32 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:50 2004
Subject: ModSAX: Proposed Core Properties
In-Reply-To: <a64969b7.36e5dab6@aol.com>
References: <a64969b7.36e5dab6@aol.com>
Message-ID: <14054.21854.948934.185758@localhost.localdomain>

MikeDacon@aol.com writes:

 > So, what I'm saying is that I would like to be able to choose
 > whether to interface to the Parser via events or via a DOM.  If you
 > agree with this, I believe using the return type is more
 > appropriate than getting a resultant property (as I suggest next).

This is easy enough to build on top of SAX, but I think that it's
probably out of scope for SAX itself.  SAX is meant to be a relatively 
simple, low-level layer that people can build on.
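
To give a feel for what "building on top of SAX" could look like, here is a
minimal sketch of a handler that assembles a DOM tree from SAX events. It
is written against the SAX2 DefaultHandler and the JAXP factories that ship
with current Java releases, not against SAX 1.0 or the ModSAX draft, and
the names SaxDomBuilder and parseToDom are invented for the example. It
ignores namespaces, comments and processing instructions.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical layer above SAX: turns the event stream into a DOM Document.
class SaxDomBuilder extends DefaultHandler {
    private final Document document;
    // Stack of currently open nodes; the Document itself sits at the bottom.
    private final Deque<Node> open = new ArrayDeque<>();

    SaxDomBuilder(Document emptyDocument) {
        this.document = emptyDocument;
        open.push(emptyDocument);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Element element = document.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        open.peek().appendChild(element);
        open.push(element);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        open.peek().appendChild(document.createTextNode(new String(ch, start, length)));
    }

    /** Runs a SAX parse and returns the DOM tree built from its events. */
    static Document parseToDom(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new SaxDomBuilder(doc));
            return doc;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

A caller that wants events as well could wrap a second handler and forward
each callback to it, which keeps the choice between events and a tree
outside the parser interface.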

 > If for some reason the above is not palatable, the same could be 
 > accomplished under the current scheme if we added a 
 > property:
 > 
 > http://xml.org/sax/properties/dom-document <org.w3c.dom.Document> (read-only)

The nice thing about ModSAX is that you're free to try this yourself
-- just define a property like

  http://www.aol.com/mdaconta/props/dom-document

(or whatever URL you can use based on your AOL account) and let the
market decide whether to support it.  Perhaps one of the people who
has written a higher-level utility package that supports both SAX and
DOM would like to use this or something like it.
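
As a sketch of how a client might probe for such a vendor-defined property,
the code below uses the getProperty call of the SAX2 XMLReader interface
that ships with current Java releases, which is an assumption on my part
and not the ModSAX draft API under discussion. The class PropertyProbe and
its methods are hypothetical, and the property URL is simply the example
from this message, so a stock parser is expected not to offer it.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical client-side check for an optional, vendor-defined property.
class PropertyProbe {
    // Example URL from the message above; not a standard SAX property.
    static final String DOM_DOCUMENT = "http://www.aol.com/mdaconta/props/dom-document";

    /** Returns the property value, or null if the reader does not offer the property. */
    static Object probe(XMLReader reader, String propertyUri) {
        try {
            return reader.getProperty(propertyUri);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return null;
        }
    }

    /** Obtains the JAXP default reader (an assumption; any XMLReader would do). */
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        Object value = probe(newReader(), DOM_DOCUMENT);
        System.out.println(value == null ? "property not offered" : "got " + value);
    }
}
```

Identifying properties by URL means a parser that has never heard of one
can simply decline it, which is what lets people experiment without
changing the core interface.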


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From larsga at ifi.uio.no  Wed Mar 10 12:05:42 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:09:50 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <36E4C4E6.B51DDFF3@eng.sun.com>
References: <14051.3215.196642.22571@localhost.localdomain> <36E4C4E6.B51DDFF3@eng.sun.com>
Message-ID: <wkiuc9v9ck.fsf@ifi.uio.no>


* David Megginson
|
| http://xml.org/sax/features/normalize-text

* David Brownell
| 
| This is a good filter feature, I think.

I agree.
 
| Lars suggested a "Catalog" feature.  There are different sorts of
| catalog, and they need configuration, so the value of this could be
| a URI for the catalog, not just a boolean.  

There should be a catalog parameter as well, but the reason I proposed
this as a feature rather than just as a parameter is that SP and
xmlproc both allow you to use environment variables to point to a
default catalog file, which is rather handy.

So it would definitely be useful to be able to tell the parser, go
read the default catalog, wherever it is. (Or don't.) Java parsers
could use a Java property to achieve the same thing.

BTW: I'm surprised that David Megginson hasn't replied to this.
     David, some kind of confirmation that you've at least seen this
     would be welcome. (I know majordomo isn't 100% trustworthy, so it
     might have disappeared on the way.)

| Plus, this would seem to be up to the "EntityResolver" to handle
| ... yes?  

Sort of. You could make a parser filter that used an entity resolver
to do this in general. xmlproc has an internal PubIdResolver interface
which it uses for this (and which is also exposed as the
EntityResolver when using SAX).

| It'd perhaps suggest that one could ask the next filter in the
| stream for the resolver it was using ... :-)

Hmmm. This is actually potentially troubling, since one would need to
specify how a catalog EntityResolver and a custom one should work
together when both are specified.
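
One possible answer, sketched here purely as an assumption and not as
anything SAX or xmlproc specifies, is a resolver that asks each delegate in
turn and takes the first non-null answer. ChainedResolver and systemIdFor
are made-up names; the only real API used is the SAX EntityResolver
interface, under which returning null means "fall back to default
behaviour".

```java
import java.io.IOException;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical policy: earlier resolvers take precedence over later ones.
class ChainedResolver implements EntityResolver {
    private final EntityResolver[] resolvers;

    ChainedResolver(EntityResolver... resolvers) {
        this.resolvers = resolvers.clone();
    }

    @Override
    public InputSource resolveEntity(String publicId, String systemId)
            throws SAXException, IOException {
        for (EntityResolver resolver : resolvers) {
            InputSource source = resolver.resolveEntity(publicId, systemId);
            if (source != null) {
                return source; // first resolver with an answer wins
            }
        }
        return null; // nobody resolved it; the parser uses the system identifier
    }

    /** Convenience for demos: the resolved system id, or null if unresolved. */
    static String systemIdFor(EntityResolver resolver, String publicId, String systemId) {
        try {
            InputSource source = resolver.resolveEntity(publicId, systemId);
            return source == null ? null : source.getSystemId();
        } catch (SAXException | IOException e) {
            throw new IllegalStateException("resolution failed", e);
        }
    }
}
```

With new ChainedResolver(custom, catalog) the application's resolver
overrides the catalog; swapping the arguments gives the catalog priority.
Which order is right is a policy decision that would still need to be
written down somewhere.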

--Lars M.




From MikeDacon at aol.com  Wed Mar 10 12:21:56 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:50 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <4449a8bc.36e66305@aol.com>

Hi Bill,

In a message dated 3/9/99 11:37:51 PM Eastern Standard Time,
b.laforge@jxml.com writes:
> >org.w3c.dom.Document  parse(InputSource  is, boolean events) throws
>  >SAXException; 
>  >org.w3c.dom.Document  parse(java.lang.String uri, boolean events) throws
>  >SAXException;
>  >/* the events boolean would be to turn on/off event calls. */
>  
>  
>  I think you have this capability without the extra parameter, since you
>  don't get events unless you register a handler to receive them.
>  

Since there is already a parse(InputSource) and parse(String) method
in the interface, in order to overload it we need a second parameter
(Java cannot overload a method on its return type alone).
The events parameter was the first one that came to mind; there may
be a better one.

Best wishes,

 - Mike  

Mike Daconta (www.gosynergy.com)



From richard at goon.stg.brown.edu  Wed Mar 10 13:16:35 1999
From: richard at goon.stg.brown.edu (Richard Goerwitz)
Date: Mon Jun  7 17:09:50 2004
Subject: Namespaces and DTDs
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E6704E.A13B3890@goon.stg.brown.edu>

Ronald Bourret wrote:

> The only inobvious bit is that, because there is no way to declare
> namespaces in the DTD, you can't declare different default namespaces
> for different parts of the DTD

Because the DTD is not namespace aware, all it can deal with are the
prefixes you declare (not the URLs associated with them).  Since these
prefixes are declared in the document content, you end up with a peculiar
situation in which the DTD has to be written according to declarations
in a given document instance, rather than the reverse.  Worse yet, there
is no way to be sure that the various documents being validated against
a particular DTD use the prefixes correctly, with the correct URLs,
unless you make extensive use of attribute defaults - which, ironically,
means we now need the DTD (probably an external one, typically with a
bunch of parameter entities; so get your validating parser ready).
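
The attribute-default workaround can be sketched as follows. This is an
illustration under assumptions, not a recommendation: the document, the
x prefix and the http://example.org/ns URL are invented, the DTD is an
internal subset to keep the example self-contained, and the parser is the
namespace-aware JAXP DocumentBuilder from current Java releases. The DTD
pins the prefix to a URL with a #FIXED xmlns:x attribute, so the instance
itself carries no namespace declaration at all.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo: the DTD, not the instance, binds prefix x to a URL.
class FixedPrefixDemo {
    static final String XML =
          "<!DOCTYPE x:doc [\n"
        + "  <!ELEMENT x:doc EMPTY>\n"
        + "  <!ATTLIST x:doc xmlns:x CDATA #FIXED 'http://example.org/ns'>\n"
        + "]>\n"
        + "<x:doc/>";

    /** Parses with namespace processing on and reports the root element's namespace. */
    static String namespaceOfRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element root = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return root.getNamespaceURI();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(namespaceOfRoot(XML));
    }
}
```

The catch is the one described above: the binding exists only for
processors that actually read the DTD, and the DTD still has to hard-code
the prefix.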

After another year or two of this, with alternate schemas floating around
besides DTDs, with architectural forms, with namespaces, and what not -
after all of this, I wonder if we'll all, in good conscience, be able to
say that anything has been simplified.

(Simplicity _was_ one of XML's primary goals back in the dark ages last
February.)

In reality, XML is functioning less like a "simplification," and more like
a political move intended to facilitate changes that could never have been
made to a mature standard like SGML.

This is actually a very old story that's been repeated many times over.
(Just look at what's happened to LDAP.  By the time we get all the PKI and
ACL extensions in place, it's really not going to be very L.)

In the end, LDAP and XML may end up serving their constituencies better
than their predecessors did.  Or they may not.  Frankly, with regard to
XML, the jury is still out.  It's not catching on nearly as fast as
predicted a year or two ago.  And it's taking considerably more work to
implement it than anybody ever envisioned.

Those of us who have done the work of writing XML processing software,
and of making it work, have a right to say this.

The emperor may or may not have clothes.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From andrew at squiz.co.nz  Wed Mar 10 13:36:47 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:09:50 2004
Subject: Req:Music DTD(?) 
In-Reply-To: Your message of "Wed, 10 Mar 1999 05:09:41 CDT."
             <36E644E5.6E728D40@uniandes.edu.co> 
Message-ID: <199903101333.CAA06775@aniwa.sky>

> Hello to all,
> I'm currently working on a xml-based sequencer, and I would like to see
> some music notation DTDs, before I start to write my own. I've searched
> the web high and low... no luck so far, so and I was wondering if any of
> you guys have any pointer I could use.

You need a new search engine.  I've recently been using www.google.com with
results an order of magnitude better than what I got from altavista (though
altavista still has its place for more complex query definitions).

Try this url:

http://www.googlebot.com/search?q=music+dtd


Andrew McNaughton

Disclaimer: I have nothing to do with google.com, I'm just impressed by their service


-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From James.Anderson at mecomnet.de  Wed Mar 10 14:24:01 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:50 2004
Subject: Namespaces and DTDs
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E683F7.429E4B25@mecomnet.de>

yes; agreement on all points.
mr. harold is not the only one who would have benefitted.

the only aspect of it which i can comprehend is the claim that being able to
bind the prefixes over a dtd would have broken the rule that namespaces should
not "change the validity of a given document". that claim is true, but i
believe it to be fundamentally misdirected.

it's an old argument.

Ronald Bourret wrote:
> 
> james anderson wrote:
> 
> > ? which of the "namespace aware" parsers will permit you to parse
> > and validate a document for which portions of the dtd contain element
> > declarations with ambiguous names - without first modifying the dtd?
> > i've yet to hear a solution to the "ambiguous name" problem for
> > xml-1.0/+ns conforming parsers.
> 
> Good point -- it was unfair of me to blame the parsers here.  It all seems
> rather obvious now:
> 
> Q. Why were namespaces invented?
> A. To disambiguate duplicate names.
> 
> Q. I have a DTD with duplicate names.  How do I disambiguate them?
> A. Use namespaces.
> 
> The only inobvious bit is that, because there is no way to declare
> namespaces in the DTD, you can't declare different default namespaces for
> different parts of the DTD, which would have solved Elliotte's problem
> rather neatly.




From david at megginson.com  Wed Mar 10 14:52:36 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:50 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <wkiuc9v9ck.fsf@ifi.uio.no>
References: <14051.3215.196642.22571@localhost.localdomain>
	<36E4C4E6.B51DDFF3@eng.sun.com>
	<wkiuc9v9ck.fsf@ifi.uio.no>
Message-ID: <14054.34184.693965.347827@localhost.localdomain>

Lars Marius Garshol writes:

 > | Lars suggested a "Catalog" feature.  There are different sorts of
 > | catalog, and they need configuration, so the value of this could be
 > | a URI for the catalog, not just a boolean.  
 > 
 > There should be a catalog parameter as well, but the reason I proposed
 > this as a feature rather than just as a parameter is that SP and
 > xmlproc both allow you to use environment variables to point to a
 > default catalog file, which is rather handy.
 > 
 > So it would definitely be useful to be able to tell the parser, go
 > read the default catalog, wherever it is. (Or don't.) Java parsers
 > could use a Java property to achieve the same thing.
 > 
 > BTW: I'm surprised that David Megginson hasn't replied to this.
 >      David, Some kind of confirmation that you've at least seen this
 >      would be welcome. (I know majordomo isn't 100% trustworthy, so it
 >      might have disappeared on the way.)

Please don't be surprised -- depending on how new a suggestion is,
sometimes I like to sit back and hear different people's opinions for
a few hours or a few days before blurting out my own.  On this topic,
I'm a little uncomfortable putting in a core feature for catalogues
when XML catalogue formats haven't settled yet (likewise, I don't
include a feature for data typing, though some kind of data typing
will undoubtedly arrive before long).

It would probably make more sense for the promoters of different
catalogue formats to define their own properties and/or features, such 
as

  http://www.oasis.org/sax/features/entity-catalog

That way, we won't have any unpleasant surprises when a user expects a
parser to use one type of catalogue and the parser finds another
instead.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tbray at textuality.com  Wed Mar 10 15:07:12 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:50 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca>

At 08:16 PM 3/9/99 -0500, David Megginson wrote:
>Here's my revised version of the core feature list, based on recent
>discussions:

This seems to be converging nicely.  Any chance of losing the
ugly "Mod" prefix? -Tim



From david at megginson.com  Wed Mar 10 15:09:58 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:50 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca>
References: <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca>
Message-ID: <14054.35485.843066.25717@localhost.localdomain>

Tim Bray writes:

 > At 08:16 PM 3/9/99 -0500, David Megginson wrote:
 > >Here's my revised version of the core feature list, based on recent
 > >discussions:
 > 
 > This seems to be converging nicely.  Any chance of losing the
 > ugly "Mod" prefix? -Tim

Yeah, no one seems to like it but me.  Any other suggestions?  I don't 
like Parser2 or things like that, because I want to emphasise that
this is an add-on to SAX 1.0 rather than an upgrade.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From mgoulde at psgroup.com  Wed Mar 10 16:24:26 1999
From: mgoulde at psgroup.com (Michael Goulde)
Date: Mon Jun  7 17:09:50 2004
Subject: Music DTD(?)
Message-ID: <71A71A050B7BD111838300805F579504926776@psgroup.com>

Check out:

http://www.tcf.nl/3.0/musicml/index.html

Michael Goulde
Executive Vice President
Research and Services
Patricia Seybold Group
85 Devonshire St., 5th Floor
Boston, MA  02109

Tel: 617 742-5200

Order "Customers.com" by Patricia Seybold with Ronni Marshak today from
Amazon.com
<http://www.amazon.com/exec/obidos/ASIN/0812930371/qid%3D912986855/002-9385176-7177833>


-----Original Message-----
From: Fabio Arciniegas A. [mailto:l-arcini@uniandes.edu.co]
Sent: Wednesday, March 10, 1999 5:10 AM
To: XML Mailing List
Subject: Req:Music DTD(?)


Hello to all,
I'm currently working on a xml-based sequencer, and I would like to see
some music notation DTDs, before I start to write my own. I've searched
the web high and low... no luck so far, so and I was wondering if any of
you guys have any pointer I could use.

Thanks in advance
Fabio

--
Fabio Arciniegas A.
Ingenieria de Sistemas
Uniandes




From James.Anderson at mecomnet.de  Wed Mar 10 16:26:54 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:50 2004
Subject: Namespaces and DTDs
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E6704E.A13B3890@goon.stg.brown.edu>
Message-ID: <36E6A0D0.4C4DA307@mecomnet.de>

all of which presumes that you've elevated prefixes to the status of uri's -
attribute defaults or not.

Richard Goerwitz wrote:
> 
> Ronald Bourret wrote:
> 
> > The only inobvious bit is that, because there is no way to declare
> > namespaces in the DTD, you can't declare different default namespaces
> > for different parts of the DTD
> 
> Because the DTD is not namespace aware, all it can deal with are the
> prefixes you declare (not the URLs associated with them).  Since these
> prefixes are declared in the document content, you end up with a peculiar
> situation in which the DTD has to be written according to declarations
> in a given document instance, rather than the reverse.  Worse yet, there
> is no way to be sure that the various documents being validated against
> a particular DTD use the prefixes correctly, with the correct URLs,
> unless you make extensive use of attribute defaults - which, ironically,
> means we now need the DTD (probably an external one, typically with a
> bunch of parameter entities; so get your validating parser ready).




From rbourret at ito.tu-darmstadt.de  Wed Mar 10 16:57:12 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:50 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6B1F.58595400@grappa.ito.tu-darmstadt.de>

David Megginson writes:

> Yeah, no one seems to like it but me.  Any other suggestions?  I don't
> like Parser2 or things like that, because I want to emphasise that
> this is an add-on to SAX 1.0 rather than an upgrade.

It's a bit long, but how about ExtendedParser?  (Actually, I'm rather fond 
of Parser2 because it gives us a clear path should this be extended in the 
future.)

-- Ron Bourret




From lloyd at digitaljam.com  Wed Mar 10 16:58:30 1999
From: lloyd at digitaljam.com (Lloyd Harding)
Date: Mon Jun  7 17:09:51 2004
Subject: SAX RFD: ModSAX Predefined Features
Message-ID: <36E68B39.DF535797@digitaljam.com>

Lars Marius Garshol wrote:
> 
> * Bill la Forge
>
> | So that's why I'm butting in here. I think an open standards process
> | is important for individuals and small companies. We need to do what
> | we can to keep the ball rolling here.
> 
> We are certainly in heartfelt agreement here. :)

David Brownell wrote:

Gee, as a wage-slave working for a big company, I hope that I'm
not _too_ excluded from the discussions ... :-)

Seriously:  my personal model is a lot more akin to the original
IETF style "running code and working consensus" model than most
existing standards bodies.  I'm a lot happier with standards that
come from such a process than from ones that involve fat specs
that can't be implemented.  Writing code is generally more fun
than specs -- though an elegant spec is also a work of art!

- - Dave


Standards processes require effort, and in all cases the effort is
primarily provided by individuals from large companies. Small companies
do not have the resources to put into standards efforts. Voting members
make the difference, and they are typically not small-company employees.


That is not to say standards bodies do not have methods for
non-voting input. They all do. 

There are as many de facto standards that have failed as there
are planned standards that have failed. There are as many
de facto standards that have succeeded as there are planned standards
that have succeeded. To claim one is better than
another without details is not sufficient.

Personal perception might be based on the differences
in methods for receiving input or differences in the scope
or differences in personal preference regarding process.

But claiming one is better than the other based on 
failure/success rate requires more detail regarding
definitions of failure/success and analysis of history
to be convincing.
 
I believe the issue is not so much which method is best
but rather WHEN method A is better than method B.

Implementation first versus specification first is similar to
deduction versus induction. Both have their places; the
question is when.

lloyd


-- 
----------------------------------------------------------------
Lloyd Harding                                 lloyd@infoauto.com
----------------------------------------------------------------
               Information Assembly Automation Inc.
                     http://www.infoauto.com 
   SGML/XML Services for the Publishing and Medical Community
Architectural Design, DTD Creation, Editorial System Development
----------------------------------------------------------------




From b.laforge at jxml.com  Wed Mar 10 17:21:14 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <002b01be6b1a$eb1fbcc0$c8a8a8c0@thing1>

OK, Dave, you asked for it.

As an add on, you have made the SAX parser much more eXtensible.
As if we didn't have enough X's...

            XParser

Bill

-----Original Message-----
From: David Megginson <david@megginson.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Date: Wednesday, March 10, 1999 12:00 PM
Subject: Re: ModSAX: Proposed Core Features


>Tim Bray writes:
>
> > At 08:16 PM 3/9/99 -0500, David Megginson wrote:
> > >Here's my revised version of the core feature list, based on recent
> > >discussions:
> > 
> > This seems to be converging nicely.  Any chance of losing the
> > ugly "Mod" prefix? -Tim
>
>Yeah, no one seems to like it but me.  Any other suggestions?  I don't 
>like Parser2 or things like that, because I want to emphasise that
>this is an add-on to SAX 1.0 rather than an upgrade.
>
>
>All the best,
>
>
>David
>
>-- 
>David Megginson                 david@megginson.com
>           http://www.megginson.com/
>




From b.laforge at jxml.com  Wed Mar 10 17:26:32 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <003001be6b1b$c4915360$c8a8a8c0@thing1>

On a more serious note,

I think we need a new ParserFactory... ModParserFactory? XParserFactory?
It should use ParserFactory to create a Parser and then check to see if the new
extension is supported. If not, it proceeds to wrap the parser so that it looks
like a ModParser.

Note that this compatibility wrapper will effectively be a filter. 
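
The wrapping idea above can be sketched roughly as follows. This is a
minimal illustration in Python rather than Java, and every name in it
(BasicParser, ExtendedParser, CompatibilityWrapper, make_extended_parser,
get_feature, the example.org feature ID) is a hypothetical stand-in, not
part of SAX or of any proposed ModSAX interface.

```python
class BasicParser:
    """Hypothetical stand-in for a SAX 1.0 parser with no extension support."""

    def parse(self, text):
        return f"parsed:{text}"


class ExtendedParser(BasicParser):
    """Hypothetical stand-in for a parser that supports the extension natively."""

    def get_feature(self, feature_id):
        # Pretend this parser implements exactly one optional feature.
        return feature_id == "http://example.org/features/validation"


class CompatibilityWrapper:
    """Filter that makes a basic parser look like an extended one."""

    def __init__(self, parser):
        self._parser = parser

    def get_feature(self, feature_id):
        # A wrapped basic parser supports no optional features.
        return False

    def parse(self, text):
        # Delegate the actual parsing to the wrapped parser.
        return self._parser.parse(text)


def make_extended_parser(make_parser):
    """Create a parser, then wrap it only if it lacks the extension."""
    parser = make_parser()
    if hasattr(parser, "get_feature"):
        return parser
    return CompatibilityWrapper(parser)
```

Callers always receive something that answers get_feature(), so they do
not need to know whether the underlying parser was wrapped.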

Bill




From Patrice.Bonhomme at loria.fr  Wed Mar 10 17:45:10 1999
From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features 
In-Reply-To: Your message of "Wed, 10 Mar 1999 07:09:59 PST."
             <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> 
Message-ID: <199903101744.SAA01077@chimay.loria.fr>


tbray@textuality.com said:
] This seems to be converging nicely.  Any chance of losing the ugly
] "Mod" prefix? -Tim 

Why not XSAX for eXtended SAX ?

Pat.

-- 
  ==============================================================
  bonhomme@loria.fr               |      Office : B.228
  http://www.loria.fr/~bonhomme   |      Phone  : 03 83 59 30 52
  --------------------------------------------------------------
   * Serveur Silfide  : http://www.loria.fr/projets/Silfide
  ==============================================================




From rja at arpsolutions.demon.co.uk  Wed Mar 10 17:50:54 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features 
Message-ID: <01b001be6b1e$938bdbc0$c5010180@p197>

>Why not XSAX for eXtended SAX ?
"E-SAX" would be less confusing.

-----Original Message-----
From: Patrice Bonhomme <Patrice.Bonhomme@loria.fr>
To: XML Developers' List <xml-dev@ic.ac.uk>
Date: 10 March 1999 17:48
Subject: Re: ModSAX: Proposed Core Features


>
>tbray@textuality.com said:
>] This seems to be converging nicely.  Any chance of losing the ugly
>] "Mod" prefix? -Tim
>
>Why not XSAX for eXtended SAX ?
>
>Pat.
>
>--
>  ==============================================================
>  bonhomme@loria.fr               |      Office : B.228
>  http://www.loria.fr/~bonhomme   |      Phone  : 03 83 59 30 52
>  --------------------------------------------------------------
>   * Serveur Silfide  : http://www.loria.fr/projets/Silfide
>  ==============================================================
>
>
>




From luke at javagroup.org  Wed Mar 10 18:41:53 1999
From: luke at javagroup.org (Luke Gorrie)
Date: Mon Jun  7 17:09:51 2004
Subject: Generating typed code from DTDs, why not?
Message-ID: <zp5lji3b.fsf@javagroup.org>

Hi all,

I'm pretty new to XML, but as I've poked around I've observed what
seem to be some strange things.  XML parsers all seem to provide
interfaces which ignore the static structure information provided by
DTDs and rely on "one fits all" interfaces to elements, in stark
contrast to the conventions of statically typed languages.

For instance, the first thing I played with in XML was SAX using
Python.  I was impressed by how easily it worked and how naturally it
fit in with a dynamically typed language like Python.  Then I had a
look at the Java interface and found that it was just the same, which
I thought very odd!  The natural mapping for SAX onto Java, to get the
(significant) benefits of static typing, would be to generate a
Visitor interface.  The Visitor interface would have a method for
"visiting" each type of element in the document, and the argument to
this method would be an object which presents the element contents
through typed accessor methods.  At least, that's how it looks to me.
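
A rough sketch of the typed Visitor idea described above, written in
Python with type hints for brevity. The element names (catalog, book,
author), their attributes, and all class names are assumptions made up
for illustration; in the scheme described, Book, Author and
CatalogVisitor would be generated from a DTD rather than written by hand.

```python
import xml.sax
from dataclasses import dataclass


@dataclass
class Book:
    """Typed view of a <book> element (hypothetical generated class)."""
    isbn: str
    pages: int


@dataclass
class Author:
    """Typed view of an <author> element (hypothetical generated class)."""
    name: str


class CatalogVisitor:
    """One visit method per element type declared in the (assumed) DTD."""

    def visit_book(self, book: Book) -> None: ...

    def visit_author(self, author: Author) -> None: ...


class VisitorAdapter(xml.sax.ContentHandler):
    """Translates generic SAX events into typed visitor calls."""

    def __init__(self, visitor):
        super().__init__()
        self._visitor = visitor

    def startElement(self, name, attrs):
        if name == "book":
            self._visitor.visit_book(Book(attrs["isbn"], int(attrs["pages"])))
        elif name == "author":
            self._visitor.visit_author(Author(attrs["name"]))


class Collector(CatalogVisitor):
    """Example visitor that records every typed element it is shown."""

    def __init__(self):
        self.seen = []

    def visit_book(self, book):
        self.seen.append(book)

    def visit_author(self, author):
        self.seen.append(author)


DOC = b'<catalog><book isbn="000-0" pages="320"><author name="A. Writer"/></book></catalog>'
collector = Collector()
xml.sax.parseString(DOC, VisitorAdapter(collector))
```

The application code (Collector) never touches element names as strings
or converts attribute values itself; that work is confined to the adapter.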

In the case of DOM, again generating typed accessor code would provide
these great benefits.  People could use a DTD (or similar) as the
definition language for their abstract data types, and generate
DOM-compliant classes which they can both use "natively" in their
language and also manipulate as part of a genuine DOM tree at the same
time.

It seems like these methods which ignore the wealth of static structure
information available will begin to show serious problems if they try
to scale to the features proposed in some specifications like SOX,
where more fine grained relationships and constraints can be
expressed.

So, my question is: are there any efforts around working towards
creating mappings from DTD or other XML type definition
languages to various programming languages (or to other IDLs like
OMG's), or is there some reason why this is considered a bad idea?

I'm excited by the possibility of using a visual modelling tool
(perhaps using an extension of the UML) to model document structure,
and from the model be able to generate a DTD, from which to generate
classes which give me access to the XML data in a natural way for my
programming language.  I'm amazed that more people don't seem to share
this enthusiasm.  What we're doing with vanilla DOM and SAX interfaces
seems analogous to using CORBA IDL as documentation, and making all
object calls using the dynamic invocation interface!

P.S. I was told today that Oracle have recently done something similar to
this, which sounds great.  I look forward to taking a look, but I
can't help but wonder if there's a reason that it took this long - and
how much the Oracle product does.  If someone could point me to some
other products which do similar things, I'd be much obliged.

Cheers,
Luke




From lucio.piccoli at one2one.co.uk  Wed Mar 10 19:16:58 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:09:51 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <3601b6f9.100299@smtpgate1.ONE2ONE.CO.UK>


Hi all,
I am using IBM's xmlj2.0.3 XML parsers. I am having the following
problem:
When I set my own document handler with a DOMParser, the handler is never
invoked. However, when I use the SAXParser, it is. Why does the
DOMParser not invoke the DocumentHandler while the SAXParser does?
The docs do not throw any light on the problem.
Is there a fundamental problem with using a DocumentHandler with a
DOMParser?

 -lucio

 ---------------------------------------------------------------------
 One2One              LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower      tel : +44 181 214 3847
 Elstree Way
 Borehamwood                 fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From cadams at cascadecc.com  Wed Mar 10 19:24:11 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun  7 17:09:51 2004
Subject: WIDL
Message-ID: <001001be6b2b$7d49d8f0$01010101@development.cascade>

Is anybody doing a B2B/WIDL type of application?

Will I be able to use regular HTML pages (and maybe CGI/Perl) to push and
pull XML from a remote server, and easily be able to parse the XML on both
sides, looking for custom request/reply types of data and then act on it
(via JavaScript or applets on the client, and maybe servlets on the server)?

Am I dreaming to think that this can give me a lightweight remoting
technology without the likes of RMI, CORBA, Weblogic, ObjectSpace etc.?


Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com




From macherius at darmstadt.gmd.de  Wed Mar 10 20:21:41 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:09:51 2004
Subject: WIDL
In-Reply-To: <001001be6b2b$7d49d8f0$01010101@development.cascade>
Message-ID: <199903102020.VAA15937@sonne.darmstadt.gmd.de>

Chad Adams <cadams@cascadecc.com> wrote at 10 Mar 99, 12:23:

> Is anybody doing a B2B/WIDL type of application?

There is a lot of research going on in the area of wrapper
generation. Some approaches prefer creating Java objects, others
directly map to XML. Implementations include:

http://db.cis.upenn.edu/W4F/
http://www.cse.ogi.edu/DISC/XWRAP/
http://www.darmstadt.gmd.de/oasys/projects/jedi/jedie.html

Just look at the bibliographies to find others.

> Am I dreaming to think that this can give me a lightweight remoting
> technology without the likes of RMI, CORBA, Weblogic, ObjectSpace etc.?

Have a look at XML query languages, they are about that (among other 
things).

http://www.w3.org/TandS/QL/QL98/

A good paper to start with is from David Maier, look at sections 2.9 
and 2.10 to see his vision of data communication via XML on the web.

http://www.w3.org/TandS/QL/QL98/pp/maier.html

And of course there is Microsoft's vision, see

http://www.oasis-open.org/cover/bosworthXML98.html

Hope that helps.

	++im
--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From tms at ansa.co.uk  Wed Mar 10 20:38:05 1999
From: tms at ansa.co.uk (Toby Speight)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX feature naming (was: SAX: ModSAX addition, general ...)
References: <18b603b2.36e3e337@aol.com> <14051.59370.316671.640337@localhost.localdomain> <36E44898.CB8E18C4@thinlink.com> <14052.19853.887104.987727@localhost.localdomain>
Message-ID: <u3e3djcuv.fsf@slc-visitor-4.ansa.co.uk>

David> David Megginson <URL:mailto:david@megginson.com>

[I accidentally mailed this to David; it was meant for the list.
Sorry, David.]

0> In article <14052.19853.887104.987727@localhost.localdomain>, David
0> wrote:

David> I've been thinking about this issue, and I'm fairly convinced
David> that the URI is the right choice.

I agree with this much.


David> Think of the URI as a statement of ownership.  Assume that my ISP
David> is host.net, and that I've been allocated 5MB of web space at
David> http://host.net/foo/.

Okay, you own that name subspace *at this moment in time*.  Who will
have the right to create names below that next March?  Five years from
now?  A hundred years from now?  Persistent uniqueness of names is the
core work of the URN group, and the consensus there is that DNS names
are a poor basis for any kind of URN (and what we want is exactly what
URNs are for: naming things).

If you are saying that the use of URLs as names is just a stopgap
until the URN registration stuff is sorted, then I'll accept that, but
be aware of the precedent you're setting with the initial "well-known"
feature names.


David> I am the only one who has the right to make a resource available at
David> http://host.net/foo/, so I am the one who has the (moral) right to
David> construct feature IDs based on http://host.net/foo/.

At this instant...

David> It is not sufficient simply to use the domain name "host.net",
David> because I don't own the domain (someone else could construct
David> the same feature ID), and it is not sufficient to use something
David> starting with net.host.foo, because I *don't* have the right to
David> make something available at, say, ftp://host.net/foo/ --

Nor do you own the host "foo.host.net".

In summary, I think URNs are a good fit, but not necessarily other
kinds of URI.




From jamesr at steptwo.com.au  Wed Mar 10 22:08:35 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features 
In-Reply-To: <199903101744.SAA01077@chimay.loria.fr>
References: <Your message of "Wed, 10 Mar 1999 07:09:59 PST."             <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca>
Message-ID: <4.1.19990311090304.00c96360@steptwo.com.au>

At 03:44 11/03/1999 , Patrice Bonhomme wrote:
  | 
  | tbray@textuality.com said:
  | ] This seems to be converging nicely.  Any chance of losing the ugly
  | ] "Mod" prefix? -Tim 
  | 
  | Why not XSAX for eXtended SAX ?

Damn, you beat me to it.

Although I was thinking SAX eXtended, ie:
	
	SAXX

This could later become

	SAXXX

or perhaps:

          3	
       SAX

J

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From Marc.McDonald at Design-Intelligence.com  Wed Mar 10 22:33:02 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:09:51 2004
Subject: Namespaces and DTDs
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990310223202Z-5687@master.design-intelligence.com>

For a more complete solution than the option (emphasize option) of a 
DTD associated with a namespace prefix and URI, I would add the means 
to declare a namespace, prefix and DTD in a DTD.


Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  james anderson [SMTP:James.Anderson@mecomnet.de]
Sent:  Wednesday, March 10, 1999 8:42 AM
To:  xml-dev@ic.ac.uk
Subject:  Re: Namespaces and DTDs

all of which presumes that you've elevated prefixes to the status of
uri's - attribute defaults or not.

Richard Goerwitz wrote:
>
> Ronald Bourret wrote:
>
> > The only inobvious bit is that, because there is no way to declare
> > namespaces in the DTD, you can't declare different default namespaces
> > for different parts of the DTD
>
> Because the DTD is not namespace aware, all it can deal with are the
> prefixes you declare (not the URLs associated with them).  Since these
> prefixes are declared in the document content, you end up with a
> peculiar situation in which the DTD has to be written according to
> declarations in a given document instance, rather than the reverse.
> Worse yet, there is no way to be sure that the various documents being
> validated against a particular DTD use the prefixes correctly, with the
> correct URLs, unless you make extensive use of attribute defaults -
> which, ironically, means we now need the DTD (probably an external one,
> typically with a bunch of parameter entities; so get your validating
> parser ready).
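
The prefix-versus-URI mismatch in the quoted text can be seen in a small
sketch. It uses Python's standard xml.etree.ElementTree as a
namespace-aware parser; the example.org namespace URI and the order/item
element names are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Two documents that bind different prefixes to the same namespace URI.
DOC_A = '<a:order xmlns:a="http://example.org/orders"><a:item/></a:order>'
DOC_B = '<b:order xmlns:b="http://example.org/orders"><b:item/></b:order>'

root_a = ET.fromstring(DOC_A)
root_b = ET.fromstring(DOC_B)

# A namespace-aware processor expands each name to {URI}local-name,
# so the choice of prefix no longer matters.
same_name = root_a.tag == root_b.tag
```

A DTD, by contrast, can only declare the literal name "a:order" or
"b:order", so at most one of these two documents could be valid against
a given DTD even though a namespace-aware application treats them alike.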




From Ed at dega.com  Wed Mar 10 22:42:35 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:09:51 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <30649320C177D111ADEC00A024E9F297169F8A@exchange-server.dega.com>


<You_Said>
Hi all,
I am using IBM's xmlj2.0.3 XML parsers. I am having the following
problem:
When I set my own document handler with a DOMParser, the handler is never
invoked. However, when I use the SAXParser, it is. Why does the
DOMParser not invoke the DocumentHandler while the SAXParser does?
The docs do not throw any light on the problem.
Is there a fundamental problem with using a DocumentHandler with a
DOMParser?

 One2One              LUCIO.PICCOLI@one2one.co.uk
</You_Said>

I don't know about the version of your XML4J, but in mine (1.1.9), the
documentation states that DocumentHandler is to be used with the SAX Parser
to be informed of parsing events. This is logical, since the main difference
is that DOM parsers parse the whole document into a resulting DOM tree, and
SAX parsers are used for event based processing. 
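
The difference between the two styles can be sketched as follows. This
uses Python's standard xml.sax and xml.dom.minidom modules as stand-ins;
it is not the XML4J API, and the order/item document and the ItemCounter
class are made up for illustration.

```python
import xml.dom.minidom
import xml.sax

DOC = b"<order><item/><item/></order>"


class ItemCounter(xml.sax.ContentHandler):
    """Event-based style: the parser calls back into the handler."""

    def __init__(self):
        super().__init__()
        self.items = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.items += 1


# SAX: the handler is invoked while the document is being parsed.
counter = ItemCounter()
xml.sax.parseString(DOC, counter)
sax_count = counter.items

# DOM: no handler is involved; parsing returns a complete tree,
# which is inspected only after parsing has finished.
tree = xml.dom.minidom.parseString(DOC)
dom_count = len(tree.getElementsByTagName("item"))
```

Both approaches see the same two item elements, but only the SAX parse
ever calls user code during parsing.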

There doesn't appear to be any way to create a DocumentHandler on class
com.ibm.xml.parser.Parser, but you can from org.xml.sax.DocumentHandler. Did
they change this in your newer version?

Ed


Ed Howland
ed@dega.com
http://www.dega.com 
"As your attorney, I advise you to take some adrenalchrome"

-----Original Message-----
From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
Sent: Wednesday, March 10, 1999 11:12 AM
To: xml-dev@ic.ac.uk
Subject: DocumentHandler with xml4j DOMParser




From donpark at quake.net  Wed Mar 10 22:44:39 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>

> > This seems to be converging nicely.  Any chance of losing the
> > ugly "Mod" prefix? -Tim
>
>Yeah, no one seems to like it but me.  Any other suggestions?  I don't
>like Parser2 or things like that, because I want to emphasise that
>this is an add-on to SAX 1.0 rather than an upgrade.


I have been tracking the progress of 'ModSAX' closely as well and it seems
the extension is maturing nicely.

BTW, it would help greatly in renaming if you could tell us what 'Mod' in
ModParser stands for.

Best,

Don Park
Docuverse




From clark.evans at manhattanproject.com  Wed Mar 10 23:41:33 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:09:51 2004
Subject: ModSAX: Proposed Core Features
References: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>
Message-ID: <36E70238.18FC6360@manhattanproject.com>

Don Park wrote:
> BTW, it would help greatly in renaming if you could tell us what 'Mod' in
> ModParser stands for.

I thought it stood for "Modular"

:) Clark



From clark.evans at manhattanproject.com  Thu Mar 11 00:06:18 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:09:51 2004
Subject: DOM Impl: Array or Linked List? 
References: <3601a91c.090299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <36E7080F.7EEF4CEB@manhattanproject.com>

I've been struggling with this slightly, and would
like your feedback.  I'm building a DOM tree.  For
the internal representation, I see two options:

A) A linked list for children

* Easy inserts in middle of list
* Slower non-sequential reads

B) An array for children

* Harder inserts in middle of list
* Faster non-sequential reads

Anyway, I was thinking of implementing
a compromise, a sparse array with 
configurable spacing, depending upon
the document.

Thoughts?

Thank you.

Clark



From david at megginson.com  Thu Mar 11 00:48:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:52 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>
References: <009b01be6b47$72b0c4f0$2ee044c6@arcot-main>
Message-ID: <14055.4677.914570.392597@localhost.localdomain>

Don Park writes:

 > BTW, it would help greatly in renaming if you could tell us what
 > 'Mod' in ModParser stands for.

It means that it's not a Rocker.

Or else it means 'modular' -- I'm not sure.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jarle.stabell at dokpro.uio.no  Thu Mar 11 00:54:36 1999
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun  7 17:09:52 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>

Richard Goerwitz  wrote:
> (Simplicity _was_ one of XML's primary goals back in the dark ages last
> February.)

It seems to me that the SGML compatibility requirement killed simplicity. 
(And gave a very confusing and hard-to-learn vocabulary)

I'm hoping that ideas like the Layered Model for XML (by Simon St.Laurent) 
will be able to influence XML in a positive direction, making it simpler to 
understand, use and implement. Today it's way too hard to "fully" 
understand XML.

Cheers,
Jarle Stabell




From david at megginson.com  Thu Mar 11 01:04:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:52 2004
Subject: ModSAX feature naming (was: SAX: ModSAX addition, general ...)
In-Reply-To: <u4sntjd6g.fsf@slc-visitor-4.ansa.co.uk>
References: <18b603b2.36e3e337@aol.com>
	<14051.59370.316671.640337@localhost.localdomain>
	<36E44898.CB8E18C4@thinlink.com>
	<14052.19853.887104.987727@localhost.localdomain>
	<u4sntjd6g.fsf@slc-visitor-4.ansa.co.uk>
Message-ID: <14054.56958.252482.1690@localhost.localdomain>

[originally sent privately to Toby]

Toby Speight writes:

 > If you are saying that the use of URLs as names is just a stopgap
 > until the URN registration stuff is sorted, then I'll accept that,
 > but be aware of the precedent you're setting with the initial
 > "well-known" feature names.

The quality of URNs will depend entirely on the quality of the
registration schemes -- URNs really have no inherent advantage over
URLs.

There are an awful lot of ways that I could construct a unique ID:
using my phone number, my latitude and longitude, my Ethernet card's
MAC address, the IP address served by Rogers Wave's DHCP server, my
driver's license number, my Canadian Social Insurance Number, the ISBN
for my book (though I think the publisher would have a moral claim to
that), a domain name, or a specific URL.  The problem is that you have
to balance four factors:

1. ease of access (not everyone can get an ISBN easily);
2. usability (who wants to memorise MAC addresses?);
3. universality (my Canadian SIN is meaningless outside the country);
   and
4. persistence (the DHCP server might change my IP address in a few
   hours when my current lease expires).

HTTP URLs win pretty close to a 10/10 on (1) and (3), about an 8/10 on
(2), and probably a 6/10 or so on (4).  A UUID might win on all but
(2), depending on how hard it is to obtain one, but that is an
inherent property of UUIDs, not of URNs -- and as I understand it,
people are actually proposing constructing URNs from domain names
among other schemes anyway.

Even if UUIDs do turn out to be the best choice, what's the advantage
of URNs?  Why not just

  uuid:123344567773634


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From donpark at quake.net  Thu Mar 11 01:21:24 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:09:52 2004
Subject: DOM Impl: Array or Linked List? 
Message-ID: <002301be6b5d$5e5434e0$2ee044c6@arcot-main>

The Docuverse DOM SDK uses the array approach, with the last accessed
index cached to improve next/prevSibling performance.  The resulting
implementation is fast for index-based access to child nodes and slightly
slower for sibling-based access (only 10% slower than the linked-list
version).  Modifying the tree is fast when appending nodes (i.e. building
a new tree) but somewhat slow when inserting nodes, since the array
contents have to be shifted.  If your XML document has a gazillion child
nodes per element, the performance of inserts will suffer quite a bit.
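
As a rough illustration of the array approach with a cached index, here
is a minimal Java sketch.  It is not the Docuverse implementation: the
class name SketchNode and all of its methods are made up for this
example, and it assumes an ArrayList-backed child list.  Sibling lookups
check the cached position and its neighbours before falling back to a
linear scan.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical node type for illustration; not the Docuverse SDK's class. */
class SketchNode {
    final String name;
    SketchNode parent;
    private final List<SketchNode> children = new ArrayList<>();
    private int lastIndex = -1; // cached index of the most recently located child

    SketchNode(String name) { this.name = name; }

    /** Appending is cheap: amortized O(1) on an array-backed list. */
    SketchNode appendChild(SketchNode child) {
        child.parent = this;
        children.add(child);
        return child;
    }

    /** Inserting shifts every later element, which is the slow case. */
    SketchNode insertChild(int index, SketchNode child) {
        child.parent = this;
        children.add(index, child);
        lastIndex = -1; // later indices changed, so drop the cache
        return child;
    }

    /** Index-based access is O(1); it also refreshes the cache. */
    SketchNode item(int index) {
        if (index < 0 || index >= children.size()) return null;
        lastIndex = index;
        return children.get(index);
    }

    /** Finds a child's index, trying the cached position and its neighbours first. */
    private int indexOf(SketchNode child) {
        for (int i = lastIndex - 1; i <= lastIndex + 1; i++) {
            if (i >= 0 && i < children.size() && children.get(i) == child) {
                return lastIndex = i;
            }
        }
        for (int i = 0; i < children.size(); i++) { // fallback: linear scan
            if (children.get(i) == child) return lastIndex = i;
        }
        return -1;
    }

    SketchNode getNextSibling() {
        if (parent == null) return null;
        int i = parent.indexOf(this);
        return i < 0 ? null : parent.item(i + 1);
    }

    SketchNode getPreviousSibling() {
        if (parent == null) return null;
        int i = parent.indexOf(this);
        return i < 0 ? null : parent.item(i - 1);
    }
}
```

Walking siblings in document order keeps hitting the cache, so each step
stays cheap; insertChild is the case where the array has to shift.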

You can get around the update problem by applying the Strategy pattern to
the child array implementation.  On insert, check whether the array is big
enough to justify switching to a different type of array implementation
(e.g. a sparse array).  One caveat is that this tends to increase the
number of child list array implementations (smart NodeLists and NodeList
implementation strategies).  There are ways to minimize this problem,
though.

So the bottom line is, you are on the right track.

Don Park
Docuverse

-----Original Message-----
From: Clark Evans <clark.evans@manhattanproject.com>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Wednesday, March 10, 1999 4:15 PM
Subject: DOM Impl: Array or Linked List?


>I've been struggling with this slightly, and would
>like your feedback.  I'm building a DOM tree.  For
>the internal representation, I see two options:
>
>A) A linked list for children
>
>* Easy inserts in middle of list
>* Slower non-sequential reads
>
>B) An array for children
>
>* Harder inserts in middle of list
>* Faster non-sequential reads
>
>Anyway, I was thinking of implementing
>a compromise, a sparse array with
>configurable spacing, depending upon
>the document.
>
>Thoughts?
>
>Thank you.
>
>Clark
>


From donpark at quake.net  Thu Mar 11 01:21:28 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:09:52 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <002401be6b5d$5f34d0e0$2ee044c6@arcot-main>

If it is 'modular' then ModularParser makes sense IMO.

I believe XParser is being used by the FBI for the technology that
auto-detects images with adult content.  <g>

Don Park
Docuverse

-----Original Message-----
From: David Megginson <david@megginson.com>
To: XML Developers' List <xml-dev@ic.ac.uk>
Date: Wednesday, March 10, 1999 4:52 PM
Subject: Re: ModSAX: Proposed Core Features


>Don Park writes:
>
> > BTW, it would help great in renaming if you could tell us what
> > 'Mod' in ModParser stands for.
>
>It means that it's not a Rocker.
>
>Or else it means 'modular' -- I'm not sure.
>
>
>All the best,
>
>
>David
>
>--
>David Megginson                 david@megginson.com
>           http://www.megginson.com/
>


From mrc at allette.com.au  Thu Mar 11 02:35:58 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:09:52 2004
Subject: Simplicity (was Re: Namespaces and DTDs)
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
Message-ID: <36E72BDE.6798F08E@allette.com.au>


Jarle Stabell wrote:

> It seems to me that the SGML compatibility requirement killed simplicity.
> (And gave a very confusing and hard-to-learn vocabulary)

Really? I think the requirement for web compatibility made XML more complex than it looked
from the outset.

This is the advent of the third catchcry for XML. First it was "XML is SGML", second was "Use
XML because SGML is too hard" and now "XML is very powerful, but can be difficult".
Remarkably, we just now seem to be coming to the realisation that it's difficult to solve
complex problems. XML seeks to do more than SGML, but it's supposed to be simpler - how can
this be so? The only immediate areas of gain would have come from trimming the fat from
SGML, but the more of X*L I see, the skinnier SGML looks. Yes, it is less powerful, yes it can
be more proprietary, yes it is harder to write tools for, no it doesn't solve ten percent of
what X*L can do before it even gets out of bed. Yes, I still use it a lot. Ponder that - SGML
for simplicity.

> I'm hoping that ideas like the Layered Model for XML (by Simon St.Laurent)
> will be able to influence XML in a positive direction, making it simpler to
> understand, use and implement. Today it's way too hard to "fully"
> understand XML.

It is unquestionably hard to fully understand - anyone who says that it isn't deserves a gold
star - they're smarter than I am.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From msharp at sybex.com  Thu Mar 11 03:04:42 1999
From: msharp at sybex.com (Molly Sharp)
Date: Mon Jun  7 17:09:52 2004
Subject: Delivery of XML
Message-ID: <88256731.000FD31A.00@sybex.com>

Hello,

I'm new to the list. I'm in the computer book publishing business, and I'm
looking for information about delivering XML content to customers in a
secure, copy-protected (encrypted) manner.

Does anyone know if there are any companies out there offering secure
encryption for XML? I imagine you'd have to create a browser based on IE or
Netscape that disabled functions such as view source, copy, and save as ---
and that would be the only browser your encrypted XML content could be
opened from.

Thanks for any information about this,

Molly Sharp




From MikeDacon at aol.com  Thu Mar 11 03:10:00 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:52 2004
Subject: A new name for ModSax
Message-ID: <146803a9.36e732d1@aol.com>

Hi Everyone,

Instead of XSAX or XParser (which rely on the overplayed
X of extensible),  how about

ExSAX
ExParser

Which stands for the same thing.  
Extensible SAX
Extensible Parser

Is shorter to type than ModSAX.
Avoids the double capital of XSAX and XParser.
And is pronounced the same way.

 - Mike
(www.gosynergy.com)




From jborden at mediaone.net  Thu Mar 11 03:34:23 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:09:52 2004
Subject: Delivery of XML
In-Reply-To: <88256731.000FD31A.00@sybex.com>
Message-ID: <000c01be6b6f$309a19e0$d3228018@jabr.ne.mediaone.net>

> 
> Does anyone know if there are any companies out there offering secure
> encryption for XML? I imagine you'd have to create a browser 
> based on IE or
> Netscape that disabled functions such as view source, copy, and 
> save as ---
> and that would be the only browser your encrypted XML content could be
> opened from.
> 

Would that be SSL with certificates to distinguish clients?


Jonathan Borden
http://jabr.ne.mediaone.net




From MikeDacon at aol.com  Thu Mar 11 03:42:30 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:52 2004
Subject: One more ModSax naming try...
Message-ID: <8179a506.36e73846@aol.com>

Hi All,

Ok, while I like ExSax for the previously mentioned 
reasons -- I don't like its connotation for all things "Ex" like
Ex-girlfriend, Ex-wife, Ex-husband... 

So, one other way to go is the "Add-on" theme that David expressed.

XtraSax
XtraParser

This is a combination of "add-on", "extra" and Xml.  

 - Mike
(www.gosynergy.com)



From landerse at du.edu  Thu Mar 11 05:29:02 1999
From: landerse at du.edu (Buzz Andersen)
Date: Mon Jun  7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <0F8F007FZ0J3BS@du.edu>

I would be eternally grateful if anyone out there who happens to be familiar
with the Docuverse DOM SDK could tell me what is wrong with the following
code.  It generates a "com.docuverse.dom.DOMExceptionImpl" exception when
the "appendChild" method of the document is attempted.

Here it is:

  DOM dom = new com.docuverse.dom.DOM();
  dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
  Document x = dom.createDocument("e1");
  Element y = x.createElement("e2");
  y.appendChild(root);

I would think this would generate:

<e1>
    <e2></e2>
</e1>

Am I mistaken?

Thanks in advance,
Buzz Andersen
www.du.edu/~landerse
landerse@du.edu



From donpark at quake.net  Thu Mar 11 05:59:23 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <004a01be6b84$379613b0$2ee044c6@arcot-main>

Buzz,

>DOM dom = new com.docuverse.dom.DOM();
>dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
>Document x = dom.createDocument("e1");
>Element y = x.createElement("e2");
>y.appendChild(root);
>
>I would think this would generate:
>
><e1>
>   <e2></e2>
></e1>

I don't know what y.appendChild(root) is supposed to be but you have to
insert your "e2" element into your document.

// creates a document with "e1" as document element
Document doc = dom.createDocument("e1");

// make sure document root exists
Node e1 = doc.getDocumentElement();
if (e1 == null)
    e1 = doc.appendChild(doc.createElement("e1"));

// create and insert e2 into e1
e1.appendChild(doc.createElement("e2"));

at this point, you will have:

<e1><e2/></e1>
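
For readers without the Docuverse SDK, roughly the same steps can be
sketched with the standard JAXP and W3C DOM classes that ship with
current Java runtimes.  This is an assumption for illustration only: it
is not the Docuverse API, and DomAppendSketch is a made-up name.  A
document created this way starts out empty, so the document element is
created explicitly.  Under the W3C DOM a Document node may hold only one
element child, so appending a second element directly to the document
is expected to fail with a DOMException.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomAppendSketch {
    /** Builds a document equivalent to <e1><e2/></e1>. */
    static Document build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .newDocument();

            // The new document is empty, so add the document element first.
            Element e1 = doc.createElement("e1");
            doc.appendChild(e1);

            // The document creates elements; the parent element receives them.
            e1.appendChild(doc.createElement("e2"));
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException("No usable DOM implementation", e);
        }
    }

    public static void main(String[] args) {
        Element root = build().getDocumentElement();
        System.out.println(root.getTagName() + " contains "
                + root.getFirstChild().getNodeName());
    }
}
```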

Best,

Don Park
Docuverse




From landerse at du.edu  Thu Mar 11 06:41:52 1999
From: landerse at du.edu (Buzz Andersen)
Date: Mon Jun  7 17:09:52 2004
Subject: Help w/Docuverse DOM SDK (Please)
Message-ID: <0F8F0004G3W74A@du.edu>

Whoa...that was a mistranslation from my original code!  It was supposed to
read:

x.appendChild(y);

Sorry about the confusion, and thanks much for the advice/code.  I've been
generating XML for a while using proprietary parser APIs, but I'm still
trying to grok the whole SAX/DOM thing.

Buzz Andersen
www.du.edu/~landerse
landerse@du.edu

----------
>From: Don Park <donpark@quake.net>
>To: xml-dev@ic.ac.uk
>Subject: Re: Help w/Docuverse DOM SDK (Please)
>Date: Wed, Mar 10, 1999, 10:58 PM
>

>Buzz,
>
>>DOM dom = new com.docuverse.dom.DOM();
>>dom.setProperty("sax.driver", "com.ibm.xml.parser.SAXDriver");
>>Document x = dom.createDocument("e1");
>>Element y = x.createElement("e2");
>>y.appendChild(root);
>>
>>I would think this would generate:
>>
>><e1>
>>   <e2></e2>
>></e1>
>
>I don't know what y.appendChild(root) is supposed to be but you have to
>insert your "e2" element into your document.
>
>// creates a document with "e1" as document element
>Document doc = dom.createDocument("e1");
>
>// make sure document root exists
>Node e1 = doc.getDocumentElement();
>if (e1 == null)
>    e1 = doc.appendChild(doc.createElement("e1"));
>
>// create and insert e2 into e1
>e1.appendChild(doc.createElement("e2"));
>
>at this point, you will have:
>
><e1><e2/></e1>
>
>Best,
>
>Don Park
>Docuverse
>
>
>


From lucio.piccoli at one2one.co.uk  Thu Mar 11 08:22:49 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:09:52 2004
Subject: DocumentHandler with xml4j DOMParser
Message-ID: <3601b76a.110299@smtpgate1.ONE2ONE.CO.UK>


> <You_Said>
> Hi all,
> I am using IBM's xmlj2.0.3 XML parsers. I am having the following
> problem:
> When I set my own document handler with a DOMParser, the handler is
> never invoked. However, when I use the SAXParser it is. Why does the
> DOMParser not invoke the DocumentHandler when the SAXParser does?
> The docs do not throw any light on the problem.
> Is there a fundamental problem with using a DocumentHandler with a
> DOMParser?
>
>  One2One              LUCIO.PICCOLI@one2one.co.uk
> </You_Said>
>
> I don't know about the version of your XML4J, but in mine (1.1.9), the
> documentation states that DocumentHandler is to be used with the SAX
> parser to be informed of parsing events. This is logical, since the
> main difference is that DOM parsers parse the whole document into a
> resulting DOM tree, and SAX parsers are used for event-based processing.
>
> There doesn't appear to be any way to create a DocumentHandler on class
> com.ibm.xml.parser.Parser, but you can from org.xml.sax.DocumentHandler.
> Did they change this in your newer version?

I am not sure what you mean here. The DocumentHandler I used was an
instance of org.xml.sax.DocumentHandler. setDocumentHandler(DocumentHandler
handler) is a method on org.xml.sax.Parser. Since all the IBM parser
classes implement this interface, why doesn't it work?

I viewed the source code of the DOMParser and noticed that its
constructor calls setDocumentHandler( this ), so it is using itself as
the document handler. Is it OK to have more than one DocumentHandler?

In fact, the bigger question is whether using a DocumentHandler on the
DOMParser is the correct thing to do when attempting to extract the
content.
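
Since a SAX 1.0 parser holds a single DocumentHandler at a time, one
general way to serve two consumers is a forwarding ("tee") handler.  The
sketch below assumes only the SAX 1.0 org.xml.sax.DocumentHandler
interface; TeeHandler is a made-up name, and whether replacing the
DOMParser's own handler is safe in xml4j is not something this example
establishes.

```java
import org.xml.sax.AttributeList;
import org.xml.sax.DocumentHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;

/** Hypothetical helper: forwards every SAX 1.0 event to two handlers, in order. */
@SuppressWarnings("deprecation") // the SAX 1.0 interfaces are deprecated in current JDKs
class TeeHandler implements DocumentHandler {
    private final DocumentHandler first;
    private final DocumentHandler second;

    TeeHandler(DocumentHandler first, DocumentHandler second) {
        this.first = first;
        this.second = second;
    }

    public void setDocumentLocator(Locator locator) {
        first.setDocumentLocator(locator);
        second.setDocumentLocator(locator);
    }

    public void startDocument() throws SAXException {
        first.startDocument();
        second.startDocument();
    }

    public void endDocument() throws SAXException {
        first.endDocument();
        second.endDocument();
    }

    public void startElement(String name, AttributeList atts) throws SAXException {
        first.startElement(name, atts);
        second.startElement(name, atts);
    }

    public void endElement(String name) throws SAXException {
        first.endElement(name);
        second.endElement(name);
    }

    public void characters(char[] ch, int start, int length) throws SAXException {
        first.characters(ch, start, length);
        second.characters(ch, start, length);
    }

    public void ignorableWhitespace(char[] ch, int start, int length) throws SAXException {
        first.ignorableWhitespace(ch, start, length);
        second.ignorableWhitespace(ch, start, length);
    }

    public void processingInstruction(String target, String data) throws SAXException {
        first.processingInstruction(target, data);
        second.processingInstruction(target, data);
    }
}
```

A parser would then be given the pair in one call, for example
parser.setDocumentHandler(new TeeHandler(a, b)).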


 -lucio

>
> Ed
>
>
> Ed Howland
> ed@dega.com
> http://www.dega.com
> "As your attorney, I advise you to take some adrenalchrome"
>
> -----Original Message-----
> From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
> Sent: Wednesday, March 10, 1999 11:12 AM
> To: xml-dev@ic.ac.uk
> Subject: DocumentHandler with xml4j DOMParser
>
>


From James.Anderson at mecomnet.de  Thu Mar 11 09:49:32 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:53 2004
Subject: Namespaces and DTDs
References: <c=US%a=_%p=Design_Intellige%l=MASTER-990310223202Z-5687@master.design-intelligence.com>
Message-ID: <36E7951F.BD56A8E4@mecomnet.de>

the third parameter (the DTD) is ill advised. one will, in any case, need to
establish scoping rules for the bindings. such rules, in combination with
xml's existing reference and sequence mechanisms, would render the third
parameter either redundant or too restrictive.

Marc.McDonald@Design-Intelligence.com wrote:
> 
> For a more complete solution than the option (emphasize option) of a
> DTD associated with a namespace prefix and URI, I would add the means
> to declare a namespace, prefix and DTD in a DTD.




From oren at capella.co.il  Thu Mar 11 10:12:24 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <024301be6ba6$dfdc1c00$5402a8c0@oren.capella.co.il>

Bill la Forge <b.laforge@jxml.com> wrote:

>On a more serious note,
>
>I think we need a new ParserFactory... ModParserFactory? XParserFactory?
>It should use ParserFactory to create a Parser and then check to see if
>the new extension is supported. If not, it proceeds to wrap the parser so
>that it looks like a ModParser.
>
>Note that this compatibility wrapper will effectively be a filter.


I think you've hit on something important here. The Mod/X/Xtra/E-Sax thread
has focused on "how to access extra functionality which is already available
within a particular SAX parser implementation". This might be the wrong
question to ask. Shouldn't it be "how do I obtain an instance of a SAX
parser which provides the features I need", instead?

This is a subtle but important shift of focus. Today one can obtain an
instance of a SAX parser by using the ParserFactory. Now suppose my
application needs an order of a namespace aware parser, character
normalization on the side, and don't spare the comments, please - how would
I go about creating such a thing?

Note that this issue contains the original one; one needs to be able to
access the extra features. But it goes beyond it. It might also help to
constrain some design choices. Take for example the issue of naming
features. Today ParserFactory uses the string "org.xml.sax.parser" as an
identifier for the feature "take an input source and convert it to SAX
events". The format of this particular string was chosen since it is usable
as a key in a properties file.

Wouldn't it be reasonable to say that whichever way
Mod/X/Xtra/E-ParserFactory works, it will use the same approach - that is,
use Java-like package names to identify features, so that it will be
possible to provide default implementations using property files? I know
this would be hard for the URI camp to swallow :-) but isn't it worth it?

As to the issue itself, the way I see it there is one major question to be
decided first. Are the extra features independent of each other?

If they aren't, we are in trouble. How do I know that pushing a filter
implementing feature X on top of a parser implementing feature Y doesn't
break that feature? What if one feature depends on another? Should there be
a way to describe the relationship between features? How?

At any rate, the goal should be some registry of "parsers" and "filters"
with an appropriate API so that it would be possible to ask for a certain
feature set and obtain a "parser" instance. IMVHO as far as this registry is
concerned, the basic SAX events interface and the input source interface
should be on equal footing with the other features. This could be a flexible
framework that allows creating processing chains, such as using DOM as the
input/output of the chain, making XSL processing a core "feature", and so
on.
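
To make the registry idea concrete, here is a minimal Java sketch.
Everything in it is hypothetical: the class FeatureRegistry, its
methods, and every feature or provider name other than
"org.xml.sax.parser" are invented for illustration.  It only resolves
feature names to provider class names and deliberately ignores the hard
question above of features that depend on or interfere with each other.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical registry: maps feature names to the component providing each one. */
class FeatureRegistry {
    // Feature name (Java-package style) -> provider class name.
    private final Map<String, String> providers = new LinkedHashMap<>();

    void register(String feature, String providerClassName) {
        providers.put(feature, providerClassName);
    }

    /**
     * Returns the providers needed for the requested features, in request
     * order and without duplicates. Throws if a feature has no provider.
     */
    List<String> resolve(List<String> requestedFeatures) {
        List<String> chain = new ArrayList<>();
        for (String feature : requestedFeatures) {
            String provider = providers.get(feature);
            if (provider == null) {
                throw new IllegalArgumentException("No provider for feature: " + feature);
            }
            if (!chain.contains(provider)) {
                chain.add(provider); // one component may cover several features
            }
        }
        return chain;
    }
}
```

Because the keys are plain dotted strings, the same table could be loaded
from a properties file, which is the point of the naming suggestion made
earlier in this message.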

Has anything similar been done in a different field, so we could reuse the
design lessons there? It seems like a pretty generic "stream processing"
problem.

Share & Enjoy,

    Oren Ben-Kiki




From cbullard at hiwaay.net  Thu Mar 11 11:54:40 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:09:53 2004
Subject: Namespaces and DTDs
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
Message-ID: <36E7AE27.7DE6@hiwaay.net>

Jarle Stabell wrote:
> 
> Richard Goerwitz  wrote:
> > (Simplicity _was_ one of XML's primary goals back in the dark ages last
> > February.)
> 
> It seems to me that the SGML compatibility requirement killed simplicity.
> (And gave a very confusing and hard-to-learn vocabulary)

Or its inventors have discovered that assuming the mission of an existing
mature standard without acknowledging the complexity of that mission leads
to the same or worse complexity in the invention.

Darn.  Maybe LISP was the right language after all and forty years 
of computer scientists just didn't "get it".

len



From cbullard at hiwaay.net  Thu Mar 11 11:56:30 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
References: <Your message of "Wed, 10 Mar 1999 07:09:59 PST."             <3.0.32.19990310070951.00eb6780@pop.intergate.bc.ca> <4.1.19990311090304.00c96360@steptwo.com.au>
Message-ID: <36E7AD18.486C@hiwaay.net>

James Robertson wrote:
> 
> Although I was thinking SAX eXtended, ie:
> 
>         SAXX
> 
> This could later become
> 
>         SAXXX
> 
> or perhaps:
> 
>           3
>        SAX

At which point the local firewall chokes again, 
tosses up the warning message about unacceptable 
sites and local policies, accounts get flagged, 
and the whole nine yards of censorial software 
and American puritanism kicks in.

Call it Sax++. Incrementally better.  ;-)

len



From costello at mitre.org  Thu Mar 11 12:20:33 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:09:53 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de>
Message-ID: <36E7B4B3.999188F5@mitre.org>

Hi Folks,

There has been a lot of discussion on this list group about namespaces
and how there is no necessary link between a namespace URI and a schema
(DTD).  Just as I was accepting that and getting comfortable with it I
read the RDF spec...

For those of you unfamiliar with RDF, its mission in life is to enable
you to express data about your data; i.e., metadata.  You can express
things like, "the creator of the BookCatalog is John Doe".  "creator" is
a piece of metadata about the "resource", BookCatalog.  In RDF "creator"
is called a "property".  Thus, the "property" <creator> has the "value"
John Doe ...  <creator>John Doe</creator>.

Okay, here's where the rub comes.  Let me give you a couple of quotes
from the RDF spec (the *'s I have put in and are my way of emphasizing
the words that I wish for you to really focus on):  "Property names
*must* be associated with a schema.  This can be done by qualifying the
element names with a namespace prefix to unambiguously *connect* the
property definition with the corresponding RDF schema ..."  Earlier in
the spec it says: "Due to RDF's incremental extensibility, agents
processing metadata will be able to trace the origins of schemata they
are unfamiliar with back to known schemata and perform meaningful
actions on metadata they weren't originally designed to process."

Let me tell you how I interpret those two sentences.  Suppose that I
have written a Web agent and it comes across a Web site that serves up
an XML document containing some metadata (expressed using the RDF
syntax).  Let's suppose that the metadata says, in XMLese, "the creator
of the BookCatalog is John Doe".  My agent has never seen the property
"creator", so it follows the namespace URI to the property schema.  From
there it finds the superclass of the creator property.  If it doesn't
recognize that class then it goes to its superclass.  It keeps doing
this until it finds a class that it understands and then it starts
unwinding (presumably by this process it will be able to gain insight
into what "creator" is all about.  I have no idea how this will happen,
but it sounds pretty cool.)
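
The walk described above can be sketched in a few lines of Java.  This
mirrors the interpretation given in this message, not a mechanism taken
from the RDF spec: the class SchemaWalkSketch, the parentOf map, and the
known set are all hypothetical, and the schema lookups that would
normally require fetching something from the namespace URI are replaced
by an in-memory map.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class SchemaWalkSketch {
    /**
     * Walks from an unfamiliar term up its parent chain until it reaches a
     * term the agent understands. Returns that term, or null if the chain
     * ends (or loops) without reaching one.
     */
    static String nearestKnown(String term, Map<String, String> parentOf, Set<String> known) {
        Set<String> visited = new HashSet<>();
        String current = term;
        // visited.add returns false on a repeat, which stops cyclic chains.
        while (current != null && visited.add(current)) {
            if (known.contains(current)) {
                return current;
            }
            current = parentOf.get(current); // null when no parent is recorded
        }
        return null;
    }
}
```

The sketch only works if every step of the chain can actually be looked
up, which is exactly the guaranteed namespace-to-schema association in
question.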

This mechanism of following references until the agent gains
"enlightenment" makes sense to me.  I like it!  ***However*** that
presupposes that there is a *guaranteed* association between a namespace
URI and a schema.  This is totally against what this list group has
worked so hard to clarify as NOT being the case.

Somebody help me to understand this.  Obviously I am misreading,
misinterpreting the RDF spec.  Thanks.  /Roger




From b.laforge at jxml.com  Thu Mar 11 12:31:05 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:53 2004
Subject: One more ModSax naming try...
Message-ID: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>

From: MikeDacon@aol.com <MikeDacon@aol.com>
>So, one other way to go is the "Add-on" theme that David expressed.
>
>XtraSax
>XtraParser
>
>This is a combination of "add-on", "extra" and Xml.  


What about open? OpenParser/OpenSAX.
With the new extensions, we are not constrained by the interface--it's quite "open".

Bill




From b.laforge at jxml.com  Thu Mar 11 12:39:59 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <002d01be6bbc$a014e460$c8a8a8c0@thing1>

From: Oren Ben-Kiki <oren@capella.co.il>
>I think you've hit on something important here. The Mod/X/Xtra/E-Sax thread
>has focused on "how to access extra functionality which is already available
>within a particular SAX parser implementation". This might be the wrong
>question to ask. Shouldn't it be "how do I obtain an instance of a SAX
>parser which provides the features I need", instead?


It is interesting how small shifts in perspective can have major design implications.
I just wanted to make it easy for new ModSAX applications to use older SAX 
parsers without requiring any extra code in the application.

If ModSAX is to remain low-level, I suspect a registry is out of scope. As for building
up a parser with filters to meet a set of requirements automagically, I'd rather give
more control to the application to specify what it needs, than try to compose
something based on a feature list.

Bill




From rbourret at ito.tu-darmstadt.de  Thu Mar 11 13:11:07 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6BC8.E71E5AB0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote:

> Has anything similar [assembling processors based on feature requests]
> been done in a different field, so we could reuse the
> design lessons there? It seems like a pretty generic "stream processing"
> problem.

I think there is an inherent assumption in this question that we are 
defining individual features that can be implemented by different parties 
and then randomly assembled to get a useful processor.  While this is 
potentially a useful thing to do -- UNIX pipes are a good example -- it is 
not necessarily an easy thing to do, nor is it clear that this is a goal of 
ExModE-XSAX.

We tried to do a similar thing in OLE DB, where database functionality 
would be broken down into individual services which could be assembled at 
will on top of a database driver.  (Generally, this would be meaningful 
only for drivers for non-database sources, as drivers for existing 
databases already exposed most/all functionality.)  The idea never really 
worked out, but here are some of the issues:

* Are there enough useful features/components to make this worthwhile?  For 
OLE DB, the answer was "probably not".  We implemented a scrollable cursor 
(basically just a result set cache), but other ideas (transactions, 
security) were not easily implementable as separate layers and were not 
really meaningful -- anybody could get around them by excluding the layer.

* What are the interfaces between components and how hard are they to 
implement?  If you want to be able to assemble components from different 
vendors at will, these need to be defined.  The success of SAX filters is a 
red herring here -- it leads one to believe that SAX can function as a 
useful interface for all XML-related processing features.  In fact, this is 
not the case -- for example, whether or not to retrieve external entities 
has nothing to do with SAX.  Thus, other interfaces would need to be 
defined to be able to assemble processors from third-party components.  (I 
think this is one thing that led us astray in OLE DB.  The usefulness of a 
scrollable cursor engine that spoke OLE DB at both ends led us to believe 
that the same could be done with other database features.  In fact, OLE DB 
was less well suited or completely unsuited for other operations.  In 
addition, it was expensive to implement.)

* How independent are the features?  Is it meaningful to ask for one thing 
but not another, such as wanting validation without namespaces (maybe) or 
parsing external entities (no)?  Again, I think the orthogonality of some 
features is a red herring leading one to believe all features are 
orthogonal.

* Are performance penalties too high to separate features into separate 
components?  For example, suppose several features need to process XML 
documents as trees.  While it might make sense to write a single processor 
for these features and toggle them within the processor, the performance 
hit of implementing them as separate, chained processors would be too high: 
each would have to build a tree, process it, and then stream it back out as 
SAX.

* Are there order dependencies between components?  For example, if you 
want validation and namespace processing as separate components, you had 
better do namespace processing first.  An open question is who knows about 
order and how is it advertised.

* Who assembles the components -- the application, the processor, or a 
third party?  The advantage of a processor or third party (such as a 
factory) assembling components is that you need the assembly logic in only 
a few places.  The disadvantage is that applications that know about a new 
feature cannot use that feature until the assembly logic in the 
processor/factory is updated.  It is probably best to have a mechanism that 
allows both processors and applications to assemble components.
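
The last two questions (ordering, and who does the assembly) can be made
concrete with a small sketch.  It assumes the SAX2 XMLFilter and
XMLFilterImpl types that ship with the JDK, which are later than the SAX
discussed in this thread.  Pipeline, TracingFilter, and the stage labels
are hypothetical names; the stages only record the order in which events
reach them and do no real namespace processing or validation.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLFilter;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical helper: the application decides the order of the stages.
final class Pipeline {
    private Pipeline() {}

    // Links the filters so events flow parser -> filters[0] -> filters[1] ...
    // and returns the last stage, which is the one the application parses with.
    static XMLReader chain(XMLReader parser, List<? extends XMLFilter> filters) {
        XMLReader upstream = parser;
        for (XMLFilter filter : filters) {
            filter.setParent(upstream);
            upstream = filter;
        }
        return upstream;
    }

    // Parses xml through two labelled stages and reports the order in which
    // each startElement event reached them.
    static List<String> stageOrder(String xml) {
        List<String> trace = new ArrayList<>();
        try {
            XMLReader parser =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            XMLReader last = chain(parser, Arrays.asList(
                    new TracingFilter("namespaces", trace),
                    new TracingFilter("validation", trace)));
            last.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(stageOrder("<doc/>"));
    }
}

// A do-nothing stage that records its label when a start tag passes through.
class TracingFilter extends XMLFilterImpl {
    private final String label;
    private final List<String> trace;

    TracingFilter(String label, List<String> trace) {
        this.label = label;
        this.trace = trace;
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        trace.add(label);
        super.startElement(uri, localName, qName, atts);
    }
}
```

Here the ordering knowledge lives entirely in the list handed to chain(),
which is the "application assembles" answer; a factory could build the
same list on the application's behalf.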

My personal feeling is that assembling XML processors completely on the fly 
is a pipe (if you will excuse the pun) dream.  The world is simply not
orthogonal enough to make this possible.  Furthermore, there are too many
performance gains to be had by tight integration of functionality to ever 
convince people to build things entirely as components with public 
interfaces.

-- Ron Bourret




From oren at capella.co.il  Thu Mar 11 13:30:48 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:09:53 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <02a501be6bc2$9da75810$5402a8c0@oren.capella.co.il>

Bill la Forge <b.laforge@jxml.com> wrote:

>If ModSAX is to remain low-level, I suspect a registry is out of scope. As
>for building up a parser with filters to meet a set of requirements
>automagically, I'd rather give more control to the application to specify
>what it needs, than try to compose something based on a feature list.


A registry might be outside the scope of ModSAX (but see below). Even if it
is, I feel that we should take care that ModSAX design choices won't make
such a registry unnecessarily difficult. It might also be that a "registry"
is the wrong way to go; John Cowan, for example, suggested a mechanism to
allow a parser to automatically push a filter between itself and the
application. I'm certain there are other reasonable approaches.

All I'm saying is that before we decide on ModSAX, some thought should be
given to this issue.

To get the ball rolling, how about the following low level solution, which
would allow smarter high level solutions later on:

class ModSAXRegistry {
    static void setClassFeatures(String className, String[] featureNames);
    static String[] getClassFeatures(String className);
    static Enumeration getFeatureClasses(String featureName);
    static Object newInstance(String className);
}

The idea being that it would be easy to get a list of classes which provide
any requested feature, and check which features are implemented by a
particular class. This should be trivial to implement; static code could do
the registration automatically, or it could be loaded from property files,
environment variables, or whatever.
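
A minimal in-memory sketch of such a registry, to show how little is
involved.  The method names follow the proposal above; everything else
(the map, the defensive copies, wrapping reflection failures in an
unchecked exception) is an assumption of this sketch rather than part of
the proposal.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical implementation of the proposed ModSAXRegistry.
final class ModSAXRegistry {
    // class name -> feature names it advertises, in registration order
    private static final Map<String, String[]> FEATURES = new LinkedHashMap<>();

    private ModSAXRegistry() {}

    static synchronized void setClassFeatures(String className,
                                              String[] featureNames) {
        FEATURES.put(className, featureNames.clone());
    }

    // Returns an empty array for a class that was never registered.
    static synchronized String[] getClassFeatures(String className) {
        String[] names = FEATURES.get(className);
        return names == null ? new String[0] : names.clone();
    }

    // Enumerates the names of all registered classes offering the feature.
    static synchronized Enumeration<String> getFeatureClasses(String featureName) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, String[]> entry : FEATURES.entrySet()) {
            for (String feature : entry.getValue()) {
                if (feature.equals(featureName)) {
                    matches.add(entry.getKey());
                    break;
                }
            }
        }
        return Collections.enumeration(matches);
    }

    // Instantiates the class through its no-argument constructor.
    static Object newInstance(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot instantiate " + className, e);
        }
    }

    public static void main(String[] args) {
        // java.util.ArrayList stands in for a real parser or filter class.
        setClassFeatures("java.util.ArrayList",
                new String[] {"org.xml.sax.filter"});
        Enumeration<String> classes = getFeatureClasses("org.xml.sax.filter");
        while (classes.hasMoreElements()) {
            System.out.println(newInstance(classes.nextElement()).getClass());
        }
    }
}
```

The application stays in charge: it asks which classes offer a feature,
picks one by whatever rule it likes, and instantiates it.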

We already have one standard feature: "org.xml.sax.parser", to which we
should probably add "org.xml.sax.filter".

The question of how to build a parser implementing a particular feature
would be left open. In general the application would query the registry, use
whatever algorithm it likes to decide on which classes to use, instantiate
them and go on as per the current ModSAX interface.

Once enough experience is gained using this, we could decide to add some
methods which implement popular algorithms.

Compatibility with the current state: It should be trivial to implement
ParserFactory above the registry. As for property files, the following
scheme is safe and upward compatible with today's practice of providing the
SAX parser name in "org.xml.sax.parser":

org.xml.sax.class.<class-name>=<feature>,<feature>,...
<feature>=<default-class-name>

The whole thing is as lightweight and low-level as you can get.
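
As a rough illustration, the property scheme above could be read with
java.util.Properties along these lines.  RegistryProperties and the
com.example class name are hypothetical; the only things taken from the
scheme are the "org.xml.sax.class." key prefix, the comma-separated
feature list, and the <feature>=<default-class-name> entries.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeMap;

// Hypothetical reader for the proposed property-file scheme.
final class RegistryProperties {
    static final String CLASS_PREFIX = "org.xml.sax.class.";

    private RegistryProperties() {}

    // Collects class name -> advertised features from the prefixed keys.
    static Map<String, List<String>> classFeatures(Properties props) {
        Map<String, List<String>> result = new TreeMap<>();
        for (String key : props.stringPropertyNames()) {
            if (key.startsWith(CLASS_PREFIX)) {
                String className = key.substring(CLASS_PREFIX.length());
                String[] features =
                        props.getProperty(key).trim().split("\\s*,\\s*");
                result.put(className, Arrays.asList(features));
            }
        }
        return result;
    }

    // Looks up the default class for a feature, or null if none is set.
    // This is the same lookup used today for "org.xml.sax.parser".
    static String defaultClass(Properties props, String featureName) {
        return props.getProperty(featureName);
    }

    public static void main(String[] args) throws IOException {
        String text =
                "org.xml.sax.class.com.example.FooParser="
                + "org.xml.sax.parser,org.xml.sax.filter\n"
                + "org.xml.sax.parser=com.example.FooParser\n";
        Properties props = new Properties();
        props.load(new StringReader(text));
        System.out.println(classFeatures(props));
        System.out.println(defaultClass(props, "org.xml.sax.parser"));
    }
}
```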

Share & Enjoy,

    Oren Ben-Kiki




From MikeDacon at aol.com  Thu Mar 11 14:21:43 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:09:53 2004
Subject: One more ModSax naming try...
Message-ID: <5f01bac8.36e7bcfd@aol.com>

Hi Bill,

In a message dated 3/11/99 7:27:22 AM Eastern Standard Time,
b.laforge@jxml.com writes:
> 
>  What about open? OpenParser/OpenSAX.
>  With the new extensions, we are not constrained by the interface--it's
>  quite "open".
>  

I like OpenParser/OpenSAX!!
Besides the open/extensible link, it gives a nod to open source
which is appealing.

- Mike
(www.gosynergy.com)



From dkirsch at quintcom.com  Thu Mar 11 14:44:15 1999
From: dkirsch at quintcom.com (dkirsch@quintcom.com)
Date: Mon Jun  7 17:09:53 2004
Subject: Delivery of XML
Message-ID: <88256731.00509FF2.00@mercury.quintcom.com>


Molly,

I understand that IBM will make a presentation at the IETF meeting next week for
just this type of support.   I'll see if I can get you a contact for that while
I'm here at the XTECH conference today.

Cheers,

David K.


"Molly Sharp" <msharp@sybex.com> on 03/10/99 07:00:57 PM

Please respond to "Molly Sharp" <msharp@sybex.com>

To:   SGML-L@RELAY.URZ.UNI-HEIDELBERG.DE, xml-dev@ic.ac.uk
cc:    (bcc: David Kirsch/QCI)
Subject:  Delivery of XML


Hello,

I'm new to the list. I'm in the computer book publishing business, and I'm
looking for information about delivering XML content to customers in a
secure, copy-protected (encrypted) manner.

Does anyone know if there are any companies out there offering secure
encryption for XML? I imagine you'd have to create a browser based on IE or
Netscape that disabled functions such as view source, copy, and save as ---
and that would be the only browser your encrypted XML content could be
opened from.

Thanks for any information about this,

Molly Sharp




From keshlam at us.ibm.com  Thu Mar 11 15:06:37 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:09:53 2004
Subject: DOM Impl: Array or Linked List?
Message-ID: <85256731.0052D629.00@D51MTA03.pok.ibm.com>

As a contrasting point, my com.ibm.domimpl operates on the linked-list
approach. I considered changing that, but decided that for the applications
I anticipated folks to be writing in Java, integer indexing was going to be
relatively rare compared to next and previous, and performing the
additional work to maintain the indices didn't feel like it was going to be
a net gain.
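
To make the trade-off concrete, here is a minimal sketch of the
linked-list layout.  LinkedNode is a hypothetical class written for this
illustration, not code from com.ibm.domimpl, and it assumes a child is
only ever appended to one parent.

```java
// Hypothetical linked-list tree node: sibling navigation is a field read,
// while indexed access has to walk the sibling chain.
final class LinkedNode {
    final String name;
    LinkedNode parent;
    LinkedNode firstChild;
    LinkedNode lastChild;
    LinkedNode previousSibling;
    LinkedNode nextSibling;

    LinkedNode(String name) {
        this.name = name;
    }

    // Appends in constant time; there are no indices to maintain.
    LinkedNode appendChild(LinkedNode child) {
        child.parent = this;
        child.previousSibling = lastChild;
        if (lastChild == null) {
            firstChild = child;
        } else {
            lastChild.nextSibling = child;
        }
        lastChild = child;
        return child;
    }

    // Indexed access costs one step per preceding sibling.
    // Returns null when the index is out of range.
    LinkedNode item(int index) {
        if (index < 0) {
            return null;
        }
        LinkedNode node = firstChild;
        for (int i = 0; node != null && i < index; i++) {
            node = node.nextSibling;
        }
        return node;
    }

    public static void main(String[] args) {
        LinkedNode root = new LinkedNode("root");
        root.appendChild(new LinkedNode("a"));
        root.appendChild(new LinkedNode("b"));
        root.appendChild(new LinkedNode("c"));
        // Typical traversal: follow nextSibling, never touch an index.
        for (LinkedNode n = root.firstChild; n != null; n = n.nextSibling) {
            System.out.println(n.name);
        }
    }
}
```

An array-backed node would reverse the costs: item(i) becomes a direct
lookup, but every insertion or removal has to shift or renumber siblings.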

I'm firmly convinced that there's no such thing as one best way to
implement the DOM. There are too many issues to trade off which will make
an implementation better at one thing than another. The fastest DOM may
need more storage space for the model; the smallest model may require more
code; the smallest code may be slower. Also, don't forget that the DOM is
strictly an API, which can be wrapped around any model that can contain a
document; there may be DOMs which are really just thin access layers for
databases, for example.

Pick, or write, the DOM that suits your intended application(s). Hammers
make poor screwdrivers, and vice versa.




From cadams at cascadecc.com  Thu Mar 11 15:22:26 1999
From: cadams at cascadecc.com (Chad Adams)
Date: Mon Jun  7 17:09:53 2004
Subject: Java DOM Parsers
Message-ID: <000001be6bd2$e0059900$01010101@development.cascade>

What companies supply Java DOM APIs and other XML API tools?  Any
suggestions on which to go with?

Thanks

Chad Adams
Payback Training Systems
Email: cadams@cascadecc.com




From richard at goon.stg.brown.edu  Thu Mar 11 15:41:18 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:53 2004
Subject: Namespaces and DTDs
References: <c=US%a=_%p=Design_Intellige%l=MASTER-990310223202Z-5687@master.design-intelligence.com> <36E7951F.BD56A8E4@mecomnet.de>
Message-ID: <36E7E379.D3BE5204@goon.stg.brown.edu>

James Anderson wrote (with regard to declaring namespaces in the DTD):

> one will, in any case, need to establish scoping rules for the bindings

That's a very insightful comment, and right on target about DTDs.

But back to an earlier point a poster made about SGML-conformance (DTDs,
etc.) being the thing that is killing XML:  If it weren't for the promise
of backwards compatibility with SGML/HTML, XML could not have gathered 
the initial following that it did.

(Don't get me wrong; our shop is still largely an SGML shop.  I'll be
very sad if XML loses these connections.  But I think that's where we
are headed.  Many people who are entering the XML community have never
heard of SGML, and resent being encumbered by it.)

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From oren at capella.co.il  Thu Mar 11 15:56:53 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:09:53 2004
Subject: Fw: ModSAX: Proposed Core Features
Message-ID: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>

I asked:
>> Has anything similar [assembling processors based on feature requests]
>> been done in a different field, so we could reuse the
>> design lessons there? It seems like a pretty generic "stream processing"
>> problem.

Ronald Bourret <rbourret@ito.tu-darmstadt.de> wrote:

>I think there is an inherent assumption in this question that we are
>defining individual features that can be implemented by different parties
>and then randomly assembled to get a useful processor.  While this is
>potentially a useful thing to do -- UNIX pipes are a good example -- it is
>not necessarily an easy thing to do, nor is it clear that this is a goal of
>ExModE-XSAX.

Well, at least the idea warrants some serious thought.

>We tried to do a similar thing in OLE DB, where database functionality
>would be broken down into individual services which could be assembled at
>will on top of a database driver.  (Generally, this would be meaningful
>only for drivers for non-database sources, as drivers for existing
>databases already exposed most/all functionality.)  The idea never really
>worked out, but here are some of the issues:
>
>* Are there enough useful features/components to make this worthwhile?

Good question. For SAX I'd say "probably yes". Here's a list of features
(courtesy of David Megginson):

> http://xml.org/sax/features/validation
>  Validate (true) or don't validate (false).
> http://xml.org/sax/features/external-general-entities
>  Expand external general entities (true) or don't expand (false).
> http://xml.org/sax/features/external-parameter-entities
>  Expand external parameter entities (true) or don't expand (false).
> http://xml.org/sax/features/namespaces
>  Preprocess namespaces (true) or don't preprocess (false).  See also
>  the http://xml.org/sax/properties/namespace-sep property.
> http://xml.org/sax/features/normalize-text
>  Ensure that all consecutive text is returned in a single callback to
>  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
>  (true) or explicitly do not require it (false).

I'd like to see "http://xml.org/sax/features/xsl-transformation" as well.
Anyway, all of the above seem to fall nicely into the pipeline framework.
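
For illustration only, this is roughly how an application could probe
for such features.  The sketch assumes the SAX2 XMLReader.setFeature
call as bundled with the JDK, which is later than the SAX discussed in
this thread; FeatureProbe is a hypothetical name, and which URIs a given
parser accepts is up to that parser.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper that asks a parser for features one at a time.
final class FeatureProbe {
    private FeatureProbe() {}

    // Returns true if the reader accepted the setting, false if it does
    // not recognize the feature or cannot honor the requested value.
    static boolean trySetFeature(XMLReader reader, String uri, boolean value) {
        try {
            reader.setFeature(uri, value);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false;
        }
    }

    // Obtains whatever XMLReader the platform's default factory provides.
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        String[] uris = {
            "http://xml.org/sax/features/validation",
            "http://xml.org/sax/features/external-general-entities",
            "http://xml.org/sax/features/external-parameter-entities",
            "http://xml.org/sax/features/namespaces",
            "http://xml.org/sax/features/normalize-text",
        };
        for (String uri : uris) {
            String outcome = trySetFeature(reader, uri, true)
                    ? "set" : "not available";
            System.out.println(uri + " -> " + outcome);
        }
    }
}
```

A feature the parser turns down is exactly where a filter from the
pipeline could be pushed in to supply it instead.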

>* What are the interfaces between components and how hard are they to
>implement?

Basically the SAX callbacks, probably extended so that the full document
data is available (comments and so on). This seems pretty much a done deal.

>* How independent are the features?
>* Are there order dependencies between components?

This is a problem, as I've already pointed out. Take "normalize-text", for
example. The effects of such a filter might be lost if it is followed by any
of the entity expansion filters (say), not to mention an XSL one. However
most of the other features seem relatively independent. I'd say this isn't
a fatal problem. It definitely doesn't affect the API I suggested.
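
A sketch of what a "normalize-text" stage could look like as a filter,
assuming the SAX2 XMLFilterImpl class that ships with the JDK (later
than the SAX discussed here).  TextNormalizingFilter is a hypothetical
name.  The sketch also shows why order matters: any stage placed after
this one that splits or injects text would undo the guarantee.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter: buffers adjacent characters() events and forwards
// them downstream as a single callback.
class TextNormalizingFilter extends XMLFilterImpl {
    private final StringBuilder pending = new StringBuilder();

    TextNormalizingFilter(XMLReader parent) {
        super(parent);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        pending.append(ch, start, length);   // hold back until text ends
    }

    // Emits the buffered text, if any, as one characters() event.
    private void flush() throws SAXException {
        if (pending.length() > 0) {
            char[] text = pending.toString().toCharArray();
            pending.setLength(0);
            super.characters(text, 0, text.length);
        }
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        flush();
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        flush();
        super.endElement(uri, localName, qName);
    }

    @Override
    public void processingInstruction(String target, String data)
            throws SAXException {
        flush();
        super.processingInstruction(target, data);
    }

    // Parses xml through the filter and returns each text callback seen.
    static List<String> textChunks(String xml) {
        final List<String> chunks = new ArrayList<>();
        try {
            XMLReader parser =
                    SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            TextNormalizingFilter filter = new TextNormalizingFilter(parser);
            filter.setContentHandler(new DefaultHandler() {
                @Override
                public void characters(char[] ch, int start, int length) {
                    chunks.add(new String(ch, start, length));
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(textChunks("<a>x&amp;y<b/>z</a>"));
    }
}
```

A parser is free to report "x&amp;y" as several characters() calls;
behind the filter the application sees it as the single chunk "x&y",
followed by "z".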

>* Are performance penalties too high to separate features into separate
>components?

Unknown; I guess this depends on the feature and the implementation. But
then, allowing one to build a system by combining filters doesn't mean one
has to do so. Even inefficient pipelines are still very useful for ad-hoc
processing, for prototyping systems, and so on. From the list of features
above, I'd say that most won't suffer a serious penalty.

>* Who assembles the components -- the application, the processor, or a
>third party?

What I'm suggesting is we currently answer "for now, the application", and
provide a simple, lightweight, low-level API which allows it to do so. More
complex solutions could evolve later on. This seems to be in the SAX spirit.

>My personal feeling is that assembling XML processors completely on the fly
>is a pipe (if you will excuse the pun) dream.  The world is simply not
>orthogonal enough to make this possible.  Furthermore, there are too many
>performance gains to be had by tight integration of functionality to ever
>convince people to build things entirely as components with public
>interfaces.


Simon St.Laurent has made a good case for layering XML functionality - see
http://www.simonstl.com/articles/layering/layered.htm. The list of features
above seems to validate his claims.

My feeling is that pipelining is a valid approach. This is because there are
quite a few features which fit this model, and each application needs its
own special subset of them. If this weren't the case, we'd be designing
SAX2.0 with a fixed set of features instead of ModSAX.

Have fun,

    Oren Ben-Kiki




From david at megginson.com  Thu Mar 11 16:06:16 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:53 2004
Subject: Oedipus XML (was Re: Namespaces and DTDs)
In-Reply-To: <36E7AE27.7DE6@hiwaay.net>
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
	<36E7AE27.7DE6@hiwaay.net>
Message-ID: <14055.59156.593634.998329@localhost.localdomain>

len bullard writes:

 > Jarle Stabell wrote:
 > > 
 > > Richard Goerwitz  wrote:
 > > > (Simplicity _was_ one of XML's primary goals back in the dark ages last
 > > > February.)
 > > 
 > > It seems to me that the SGML compatibility requirement killed simplicity.
 > > (And gave a very confusing and hard-to-learn vocabulary)
 > 
 > Or its inventors have discovered that assuming the mission of an
 > existing mature standard without acknowledging the complexity of
 > that mission leads to the same or worse complexity in the
 > invention.

XML has introduced some nasty new complexities, but many of those
relate to providing proper Unicode support, and SGML would have had to
deal with them anyway.  (There were, of course, a couple of mistakes
that added to the complexity, especially relating to entities and
external subsets.)

Speaking as both a parser writer and an application writer, I am
comfortable writing that XML is significantly simpler to support in
enterprise-level implementations than full SGML, and that I have not
actually yet really missed any of the SGML features excluded from
XML.

To be fair, I am talking only about the core specs -- I am comparing
ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral
standards on both sides.  A comparison of HyTime to XLink, XPointer,
and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an
interesting but separate exercise.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From b.laforge at jxml.com  Thu Mar 11 16:37:20 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:53 2004
Subject: Namespaces and DTDs
Message-ID: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>

From: len bullard <cbullard@hiwaay.net>
>Darn.  Maybe LISP was the right language after all and forty years 
>of computer scientists just didn't "get it".


Lisp and XML have a few things in common, like being easy to
determine if they are well formed. Frankly, I think XML will be
better in the long run because it can be validated against various
schema.

Bill




From david at megginson.com  Thu Mar 11 16:40:52 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:54 2004
Subject: One more ModSax naming try...
In-Reply-To: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>
References: <002a01be6bbb$3da914a0$c8a8a8c0@thing1>
Message-ID: <14055.61833.969345.509241@localhost.localdomain>

Bill la Forge writes:

 > What about open? OpenParser/OpenSAX.
 > With the new extensions, we are not constrained by the interface--it's quite "open".

Not bad, but we weren't really closed to begin with.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From richard at goon.stg.brown.edu  Thu Mar 11 16:41:45 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org>
Message-ID: <36E7F1DB.608598CC@goon.stg.brown.edu>

"Roger L. Costello" wrote, re elements like "creator" (which may not be
defined by a given DTD, but which must occur in a document instance that
is using RDF):

> My agent has never seen the property "creator", so it follows the
> namespace URI to the property schema...

Okay, so your agent is reading the document.  It runs into an element
in another RDF namespace.  You want to use that namespace's URI component
to read in additional schema information.

Two problems:  1) namespace URIs don't necessarily point to schemas, and
2) if they did, you'd be extending the schema mechanism in a way that's
incompatible with DTDs, as they're normally defined and understood.

I don't know if it's possible, from an implementation standpoint, to add
the DTD after you've already started parsing the document.  And if you
could, whether doing so would be reasonable.

Surely this sort of problem has been discussed in the SGML community.
Can someone who has hashed all the details out already perhaps post with
some commentary?

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From Bruce.Duffy at westgroup.com  Thu Mar 11 16:47:20 1999
From: Bruce.Duffy at westgroup.com (Duffy, Bruce)
Date: Mon Jun  7 17:09:54 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <7BA102761CAED111B27E00805FBB72333FAE4C@arrowhead.int.westgroup.com>


Hi folks,

One feature I'd really like to see is a Locator.getByteOffset()
method.  Obviously this feature would have to be optional, since not
all XML inputs are indexable files.

James Clark's non-SAX API for XP implements this method for startElement(),
but not for the characters() callback, which unfortunately is exactly what
I need it for.  I could hack XP or another parser, but I'd much rather work
within the context of SAX.

One name for such a feature is:

	http://xml.org/sax/features/locator.byteOffsets
	  
	(true) means getByteOffset() is supported for startElement,
	endElement, and character callbacks.  (false) means it is not
	supported for those callbacks.
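
One way this could look in code, purely as a sketch: instead of adding
the method to Locator itself, an optional sub-interface lets handlers
test for support at run time.  ByteOffsetLocator, FixedOffsetLocator,
and Offsets are hypothetical names, and the -1 "unknown" convention is
an assumption borrowed from how Locator reports unknown line numbers.

```java
import org.xml.sax.Locator;
import org.xml.sax.helpers.LocatorImpl;

// Hypothetical optional extension; not part of SAX.
interface ByteOffsetLocator extends Locator {
    // Byte offset of the current event in the input, or -1 if unknown.
    long getByteOffset();
}

// Trivial implementation for demonstration: reports a fixed offset.
class FixedOffsetLocator extends LocatorImpl implements ByteOffsetLocator {
    private final long offset;

    FixedOffsetLocator(long offset) {
        this.offset = offset;
    }

    @Override
    public long getByteOffset() {
        return offset;
    }
}

final class Offsets {
    private Offsets() {}

    // What a handler would call inside characters() or startElement():
    // returns -1 when the parser's locator does not offer byte offsets.
    static long byteOffsetOf(Locator locator) {
        if (locator instanceof ByteOffsetLocator) {
            return ((ByteOffsetLocator) locator).getByteOffset();
        }
        return -1L;
    }

    public static void main(String[] args) {
        System.out.println(byteOffsetOf(new FixedOffsetLocator(128L)));
        System.out.println(byteOffsetOf(new LocatorImpl()));
    }
}
```

The feature flag proposed above would still be useful alongside this:
it tells the application up front whether the offsets will be there,
before any parsing starts.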


Alternatively, if there's some reason why this feature is a Bad Idea,
I'd like to know why!

Thanks,


	Bruce Duffy
	West Group



From creitzel at mediaone.net  Thu Mar 11 16:58:56 1999
From: creitzel at mediaone.net (Charles Reitzel)
Date: Mon Jun  7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <199903111654.LAA04302@chmls06.mediaone.net>

Ah, my favorite thing to hate about XML <grin>.

Seriously, though.  I have yet to hear of a single real application that
needs element level prefix declarations.  Not one!  The PI was just fine for
99.99% of applications.  The 0.01% should simply not use XML (or may need an
additional layer, such as AF or a schema processor).  Element-declared
namespaces are a solution in search of a problem.  Unfortunately, namespaces
have effectively killed DTD validation.

My wish list for namespaces is as follows:
1) The prefix should be set by document author, *not* the DTD author.
2) The FPI should be set by the DTD author.
3) Prefixes should have document scope.
4) Namespaces should be part of XML proper and *not* an add on.
5) Element names should be resolved in the namespace of the nearest ancestor. 

Until most of these conditions are met, I predict the demise of DTDs.  It
may be too late already...

Best regards,
Charles Reitzel

>From: james anderson <James.Anderson@mecomnet.de>
>Date: Wed, 10 Mar 1999 11:49:51 +0100
>Subject: Re: Namespaces and DTDs
>
>That "REC-xml-names-19990114" does not provide any means to establish
>prefix<->uri bindings for a DTD has long been a point of contention. A cursory
>search of the archives will bear this out. The decision to eliminate the
>combined prefix/uri/dtd binding (the original pi form) was, however, correct,
>as the pi form, at least as proposed in "WD-xml-names-19980327", would not
>have been sufficient to handle such things as a dtd which needs multiple
>prefix bindings or the situation where a given prefix<->uri binding is to
>apply to multiple schema sources.
>
>While it is true that some mechanism is necessary, a form - as discussed below
>- which effected a singular binding would also not have solved the problem.
>"Everyone" would seem to be waiting for "schemas"....
>
>Marc.McDonald@Design-Intelligence.com wrote:
>> 
>> A simple extension to namespaces could have fixed this problem:
>> 1.      Allow a DTD to be optionally specified along with the namespace
>> prefix and URI
>> 2.      When an element is prefixed, parse it using the DTD associated with
>> the namespace and the given prefix as the default.
>> 3.      If no DTD is associated with the prefix or not validating, do what
>> is done now (ensure element is well-formed).


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From jborden at mediaone.net  Thu Mar 11 17:16:18 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <00dc01be6be1$c4587e20$0b2e249b@fileroom.Synapse>

Bill la Forge wrote:

>From: len bullard <cbullard@hiwaay.net>
>>Darn.  Maybe LISP was the right language after all and forty years
>>of computer scientists just didn't "get it".
>
>
>Lisp and XML have a few things in common, like being easy to
>determine if they are well formed. Frankly, I think XML will be
>better in the long run because it can be validated against various
>schema.
>
    LISP defines a serialization format for lists and atoms (s-expressions)
which employs '(' and ')' in an analogous fashion to XML being a
serialization format for trees.

    LISP also defines a set of rules by which lists are eval'd as functions
with arguments. Aside from syntactic issues, '<' and '>' could be used as
s-expression delimiters without significant change to the LISP interpreter
(aside from the parsing routine). In order to properly compare LISP with
XML, then, we would need to propose a set of rules whereby *x-expressions*
were evaluated.
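
    To make the analogy concrete, here is a minimal sketch (in Python, which
is not part of the original discussion) that writes an XML tree out as an
s-expression. The function name to_sexpr is hypothetical, and the mapping
ignores attributes and mixed content; it only illustrates the shared tree
structure and says nothing about evaluation rules.

```python
import xml.etree.ElementTree as ET

def to_sexpr(elem):
    """Render an element as (tag "text" child...), ignoring attributes."""
    parts = [elem.tag]
    text = (elem.text or "").strip()
    if text:
        parts.append('"%s"' % text)
    # Children become nested lists, mirroring element nesting.
    parts.extend(to_sexpr(child) for child in elem)
    return "(" + " ".join(parts) + ")"

doc = ET.fromstring("<a><b>1</b><c/></a>")
sexpr = to_sexpr(doc)
print(sexpr)
```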

    The closest we have today is XSL which is not currently a fair
comparison to LISP (e.g. try writing a compiler or word processor in XSL
:-))

Jonathan Borden
http://jabr.ne.mediaone.net




From rbourret at ito.tu-darmstadt.de  Thu Mar 11 18:03:04 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:54 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <01BE6BF1.B0427BB0@grappa.ito.tu-darmstadt.de>

Oren Ben-Kiki wrote:

> >* What are the interfaces between components and how hard are they to
> >implement?
>
> Basically the SAX callbacks, probably extended so that the full document
> data is available (comments and so on). This seems pretty much a done
> deal.

and also wrote:

> >* Who assembles the components -- the application, the processor, or a
> >third party?
>
> What I'm suggesting is we currently answer "for now, the application",
> and provide a simple, lightweight, low-level API which allows it to do so.
> More complex solutions could evolve later on. This seems to be in the SAX
> spirit.

If the application assembles the components and the interface between them 
is SAX, what do we need that SAX filters don't already give us?  In other 
words, does anything need to be done to OpenSAX (best name so far) to 
support this besides adding the ParserFilter interface?

The other question that occurs to me is how useful/common it is to 
dynamically assemble a processor at run time. That is, are there really 
applications (outside of test environments) that allow the user to 
designate their parser at run time (or even installation time) and 
therefore need to cover any possible deficiencies in the chosen parser? 
 What is gained by allowing the user to choose the parser?

Note that this is a very different situation from, say, using different 
ODBC drivers.  In the case of ODBC drivers, you are choosing a different 
source of data (type of database) and application writers have a strong 
incentive to support multiple databases through ODBC.  In the case of XML, 
the source of data is always the same XML document and the choice of parser 
becomes a trade-off between speed, reliability, feature-set, etc.

Since the application writer knows the feature set ahead of time, why not 
just hard-code the required parser and SAX filters and be done with it?
(Yes, I know that "hard-code" is a bad word and I shudder as I write it,
but I really am curious if anybody out there has a real-world application 
that allows users to change parsers and what the benefits of this are 
besides the ability to say, "Oh, look. I'm using a different parser.")

In this view, the utility of SAX is not the ability to change parsers at 
run time, but to change them over time as reliability, speed, size, etc. of 
the parsers change.  It also means that application writers can learn a 
single interface (SAX) and then choose parsers as they are appropriate to 
the application without having to learn different interfaces for different 
parsers.

The ability to request features in OpenSAX allows the application to 
request processor behavior, which is slightly different from assembling a 
suitable parser.  For example, if I have an application that doesn't need 
validation, but the parser I want to use does validation by default, I
would like to be able to turn that off.
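
As an illustration of requesting processor behavior, here is a minimal
sketch using Python's xml.sax module, which follows the later SAX2 design
and is therefore only an analogy to the OpenSAX proposal under discussion.
It assumes the default expat-based driver that ships with CPython, which
does not validate.

```python
import xml.sax
from xml.sax import SAXNotSupportedException
from xml.sax.handler import feature_validation

parser = xml.sax.make_parser()

# Ask for validation to be off; the bundled expat driver accepts this.
parser.setFeature(feature_validation, False)
validation_on = parser.getFeature(feature_validation)

# Asking for validation to be on is refused rather than silently ignored,
# so the application learns that this parser cannot meet the request.
try:
    parser.setFeature(feature_validation, True)
    can_validate = True
except SAXNotSupportedException:
    can_validate = False

print(bool(validation_on), can_validate)
```

The application states what it needs and the parser either complies or
refuses, which is a much smaller contract than assembling a processor from
parts.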

Just to be clear, I'm not necessarily against assembling processors based 
on a feature set.  I just believe that it is far more complex than it 
appears at first glance and am not convinced that it's worth the trouble.

-- Ron Bourret




From b.laforge at jxml.com  Thu Mar 11 18:03:35 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:54 2004
Subject: Opening SAX for better filter support
Message-ID: <007701be6be9$f9bfc840$46026982@thing1.camb.opengroup.org>

A fixed API has lots of advantages in terms of service/user.
Each can be implemented to the API without being bound to
the other. And if you do need a non-standard feature, you 
isolate the code that has such a dependency. Overall, a
very manageable situation unless you move too far out of scope
of the API.

Introduce middleware and everything changes. Now you
want an open API that permits unanticipated interactions 
between the service/user without needing to completely bypass
the middleware.

With the advent of SAX filters, we have now moved to having a
need for a more open API, and David's proposal seems to
fit that need precisely.

Consider a complex of stacked and nested filters wrapping
a parser. This composition is something which might be best
done separately from the application itself, but the application
may still need to access various parts. Indeed, a good design
would keep as much of the application as possible independent
of any particular structure, as the structure may need to
change if we change parsers or introduce more appropriate
filters.

Think of this complex of parser and filters as some kind of aggregate
that is best treated as a gray box by the application--the application
may need to identify and interact with various parts of the aggregate,
but doesn't know the overall structure.

The new get and set methods are exactly what we need. We can
present a named object to the aggregate and, by routing the 
request through the aggregate, the component which knows 
what to do with that object can process it. Conversely, we can
request a reference to a component or result by name and the
appropriate component is able to respond.
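
A minimal sketch of this routing, written with Python's xml.sax filter
classes rather than the Java interfaces under discussion. The property URI
and the ElementCounter class are made up for the example; the only library
behavior relied on is that XMLFilterBase forwards events and get/set calls
to its parent.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase

# Hypothetical property name, made up for this sketch.
COUNT_PROPERTY = "http://example.org/sax/properties/element-count"

class ElementCounter(XMLFilterBase):
    """Filter that counts start tags and answers for COUNT_PROPERTY."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.count = 0

    def startElement(self, name, attrs):
        self.count += 1
        super().startElement(name, attrs)  # pass the event along

    def getProperty(self, name):
        if name == COUNT_PROPERTY:
            return self.count
        return super().getProperty(name)   # not ours: route inward

# The aggregate: a pass-through filter wrapping the counter wrapping a parser.
aggregate = XMLFilterBase(ElementCounter(xml.sax.make_parser()))
aggregate.parse(io.BytesIO(b"<a><b/><b/></a>"))

# The application only talks to the outermost object and never learns
# which component answered.
element_count = aggregate.getProperty(COUNT_PROPERTY)
print(element_count)
```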

Now while none of this may be terribly efficient, it doesn't need to be--
these are calls that are made for configuration or to access
results. So it should work and work beautifully.

Bill




From b.laforge at jxml.com  Thu Mar 11 18:29:48 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:54 2004
Subject: Namespaces and DTDs
Message-ID: <00c501be6bed$96b33260$46026982@thing1.camb.opengroup.org>

From: Jonathan Borden <jborden@mediaone.net>
>    The closest we have today is XSL which is not currently a fair
>comparison to LISP (e.g. try writing a compiler or word processor in XSL
>:-))


I like to use XML to do compositions of components, which encompasses the
declarative rather than the procedural aspects of programming. What I
like is that a schema can then validate a composition, allowing clients
to send a composition to a server to construct an agent, but without the
security problems that you would otherwise have.

Bill




From cowan at locke.ccil.org  Thu Mar 11 18:36:24 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org>
Message-ID: <36E80E9C.FC04B18@locke.ccil.org>

Roger L. Costello wrote:

> Okay, here's where the rub comes.  Let me give you a couple of quotes
> from the RDF spec (the *'s I have put in and are my way of emphasizing
> the words that I wish for you to really focus on):  "Property names
> *must* be associated with a schema.  This can be done by qualifying the
> element names with a namespace prefix to unambigously *connect* the
> property definition with the corresponding RDF schema ..."

Watch the modal verbs!  Property names *must* be associated with a
schema, but this can (i.e. *may*) be done by making the URI to
which the namespace prefix is bound the actual URI of the schema
document.

There may be other ways to do it.  Besides, RDF is free to set
tighter limits on the URIs used to identify namespaces than
XML in general.  XML-based standards can always set extra
requirements, like the SMIL requirement (clause 5.1) that there be no
internal DTD subset in SMIL documents.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Thu Mar 11 18:53:59 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:54 2004
Subject: RDF not conforming to the Namespace spec?
References: <01BE6ADE.DE849F30@grappa.ito.tu-darmstadt.de> <36E7B4B3.999188F5@mitre.org> <36E7F1DB.608598CC@goon.stg.brown.edu>
Message-ID: <36E812D1.CE73AEE7@locke.ccil.org>

Richard L. Goerwitz wrote:

> Okay, so your agent is reading the document.  It runs into an element
> in another RDF namespace.  You want to use that namespace's URI component
> to read in additional schema information.
> 
> Two problems:  1) namespace URIs don't necessarily point to schemas, and
> 2) if they did, you'd be extending the schema mechanism in a way that's
> incompatible with DTDs, as they're normally defined and understood.

RDF namespace declarations *may* (and even perhaps should) point to
RDF schemas, which are not XML schemas at all.  They declare
RDF classes and properties, not XML elements and attributes.

Both RDF statements and RDF schemas are normally represented in XML,
but other representations (graphical) also exist.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Thu Mar 11 19:27:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:54 2004
Subject: Fw: ModSAX: Proposed Core Features
In-Reply-To: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>
References: <02ee01be6bd6$a66a24a0$5402a8c0@oren.capella.co.il>
Message-ID: <14056.6147.757421.124783@localhost.localdomain>

Oren Ben-Kiki writes:

 > I'd like to see "http://xml.org/sax/features/xsl-transformation" as
 > well.  Anyway, all of the above seem to fall nicely into the
 > pipeline framework.

How about "http://capella.co.il/~oren/sax/features/xsl-transformation" 
(or whatever is suitable for your web rights)?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Thu Mar 11 19:28:45 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:54 2004
Subject: FW: Namespaces and DTDs
Message-ID: <NBBBJPGDLPIHJGEHAKBAMEOCCOAA.martind@netfolder.com>

Hi

I am using these simple rules of thumb:

a) An XML DTD is useful for XML editors, not for XML renderers.
b) Most XML renderers (XSL, CSS or DSSSL) won't do document validation.
c) An XML interpreter (something other than rendition) does not need a DTD.

If I need a DTD at the receiving end, then I am no longer in the XML
world but in the SGML world, because the receiving end needs a validating
parser. Several SGML parsers, for instance SP, can parse XML's simplified
DTDs. The only simplification I gained is the -- or -O thing called OMITTAG.
Therefore, because I have to include a DTD for validation, I might as well
use an SGML format.

However, on the Web, to reduce complexity, I should not assume that the
receiving end has a validating parser. Thus, because my XML document has
been validated with my XML editor or by any other validation program, the
receiving end makes the reasonable assumption that if the document is an
XML document it is "well formed" and valid.

It's a lot simpler that way.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From david at megginson.com  Thu Mar 11 19:31:36 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:54 2004
Subject: Namespaces and DTDs
In-Reply-To: <199903111654.LAA04302@chmls06.mediaone.net>
References: <199903111654.LAA04302@chmls06.mediaone.net>
Message-ID: <14056.6319.57210.877490@localhost.localdomain>

Charles Reitzel writes:

 > Seriously, though.  I have yet to hear of a single real application
 > that needs element level prefix declarations.  Not one!

I'll paraphrase the use case as follows (I'll leave the source
anonymous):

  A server wants to construct a large XML document as the response to
  a client request, and it does so by handing off the work to several
  parallel processes and then concatenating the results into a single
  document.  If each of the processes can declare its own namespaces,
  then it is not necessary to establish complicated negotiation
  channels between the top-level process and the child processes to
  obtain the correct namespace declarations.

Before everyone rushes out to shoot holes in this use case, I'd like
to note that I still have callouses on my trigger finger from doing so
myself.
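
For readers who want to see the use case in miniature, here is a sketch in
Python (not part of the original discussion). The fragment contents and the
example.org URIs are invented; the point is only that each fragment carries
its own element-level declaration, so the concatenating process needs no
knowledge of the prefixes its children chose.

```python
import xml.etree.ElementTree as ET

# Output of two hypothetical worker processes. Each declares its own
# namespace locally and happens to pick the same prefix.
fragments = [
    '<p:item xmlns:p="http://example.org/orders">42</p:item>',
    '<p:item xmlns:p="http://example.org/stock">7</p:item>',
]

# The top-level process concatenates without inspecting the fragments.
document = "<response>" + "".join(fragments) + "</response>"

root = ET.fromstring(document)
names = [child.tag for child in root]   # ElementTree reports {uri}local names
print(names)
```

With document-scoped prefixes the two uses of "p" would clash, and the
top-level process would have to rename one of them or negotiate in advance.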


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Thu Mar 11 19:57:40 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:54 2004
Subject: Namespaces and DTDs
In-Reply-To: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEOFCOAA.martind@netfolder.com>

HI Bill,

<YourComment>

Lisp and XML have a few things in common, like being easy to
determine if they are well formed. Frankly, I think XML will be
better in the long run because it can be validated against various
schema.
</YourComment>

<Reply>
I am not sure of that.

a) A Lisp document could be made SGML compliant, because SGML lets you
define the begin and end tag delimiters (e.g. DSSSL).
b) If the previous proposition is true, then you can also change the
delimiters and keep the structural coherency.
c) You could also enforce that begin and end tags conform to the well
formed constraint.
d) An XML document is a hierarchy, and a hierarchy can be mapped onto list
constructs. In fact, as soon as you map Lisp to SGML and then to XML, you
immediately notice the similarities. A formal transformation is possible
from one structure to the other. In mathematical terms we could talk of a
"topological" transformation from one to the other. Their structures are
similar enough to transform one into the other.

Conclusion: we should not take what Jonathan said so lightly, and should do
some homework first.

This said, I agree that XML could potentially be more successful than Lisp or
SGML or (fill in other good but less than popular ideas here), but this is for
reasons other than technical ones. For instance, it could become very popular
because the Web is popular and XML benefits from the aura effect. Also
because important software manufacturers are behind it and put compliant
products on the market. Also because people don't want to miss the next big
Web success, etc. This has nothing to do with technical virtues but more
with marketing virtues. But surely not because XML is better than Lisp
because it could be validated against different schemas.

a) XML has the advantage, because of its strict syntax (compared to SGML
OMITTAG), that a receiver does not need to validate the structure to interpret
the XML document. In fact, there is a high probability that interpreters
would "hard code" in some way what to do for each element, without
the need for a DTD (except for style languages, which will "hard code" tree
manipulation and a formatting object model).
b) If a DTD is necessary, why not use SGML, except for a marketing advantage?
c) Another usage of XML is to separate the content from the rendition. In
this case, most browsers' style engines won't contain a validating parser,
and therefore the validation mechanism is irrelevant.

Conclusion: XML will be better simply because it has marketing momentum, not
because of its technical merits, period. The whole difference between SGML
and XML is that the receiver does not necessarily need validation to interpret
the document (because of the "well formed" constraint). But from the
marketing point of view it has a huge advantage. New domain languages could be
created, and big software manufacturers could again regain some control by
creating a domain language and letting the numbers create a de facto standard.
In fact, HTML, by being a standard domain language, is more of a threat to big
manufacturers than XML is. So, if XML is to be more popular, this is surely
for marketing reasons :-)
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From martind at netfolder.com  Thu Mar 11 20:27:45 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:55 2004
Subject: RDF not conforming to the Namespace spec?
In-Reply-To: <36E7F1DB.608598CC@goon.stg.brown.edu>
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEOFCOAA.martind@netfolder.com>

Hi

<YourComment>
Okay, so your agent is reading the document.  It runs into an element
in another RDF namespace.  You want to use that namespace's URI component
to read in additional schema information.

Two problems:  1) namespace URIs don't necessarily point to schemas, and
2) if they did, you'd be extending the schema mechanism in a way that's
incompatible with DTDs, as they're normally defined and understood.

I don't know if its possible, from an implementation standpoint, to add
the DTD after you've already started parsing the document.  And if you to
could, whether doing so would be reasonable.

Surely this sort of problem has been discussed in the SGML community.
Can someone who has hashed all the details out already perhaps post with
some commentary?
</YourComment>

<Reply>
You're right on this.

For RDF the validation mechanism is not in the namespace. The namespace
mechanism, from the receiver's point of view, is to be seen as a way to prevent
name collisions in the same document space (including documents linked to the
document). A simple parser could then process the complete markup name as a
whole word (i.e. MyNameSpace:MyMarkupName). It could occur, however, that two
namespaces collide (i.e. two namespaces have the same namespace ID
and the same markup name); in this case the parser need not take any
chances and can replace the namespace ID with the URI (if the URI is unique) to
be sure that the element name is now unique (i.e. <uri>:MyMarkupName). The
whole point is to be sure that we do not have name collisions in the document
namespace (I mean here the document's complete set of names).

For RDF the property list is defined by a schema. RDF is like a directory
service schema.

a) You have to define a record or property set with a schema. You also
define entity relationships with the schema.

The parser does not have to use a DTD as a validation mechanism, just the trick
of replacing the namespace ID with the URI if we want to reduce name collisions
to near-zero probability. However, this is not a validation mechanism; it is
a namespace collision resolution mechanism like the one used in
languages such as C++. (Practically, you replace the namespace ID with the URI to
create a unique element name, no more, no less: MyNSID:MyElementName becomes
http://www.netfolder.com/:MyElementName. There is now a very low probability
that a linked document would contain the same named element.)
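
The substitution can be sketched in a few lines of Python. The function name
expand_name and the second binding are hypothetical, and the "{uri}local"
spelling is just one common convention for writing the expanded name; the
text above writes it as <uri>:MyMarkupName.

```python
def expand_name(qname, bindings):
    """Replace the prefix of 'prefix:local' with the URI it is bound to."""
    prefix, colon, local = qname.partition(":")
    if not colon:
        return qname              # no prefix, nothing to replace
    return "{%s}%s" % (bindings[prefix], local)

# Two documents that picked the same namespace ID for different URIs.
bindings_a = {"MyNSID": "http://www.netfolder.com/"}
bindings_b = {"MyNSID": "http://example.org/other/"}   # hypothetical

name_a = expand_name("MyNSID:MyElementName", bindings_a)
name_b = expand_name("MyNSID:MyElementName", bindings_b)
print(name_a)
print(name_b)
```

The prefixed names are identical, but the expanded names differ, which is all
the collision-resolution step has to guarantee.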

This is for the parsing side. Now for the interpretation side: an RDF
interpreter (that uses an XML parser) has to know the object's property set
to do something with it. This something could be to build a "frame" for this
object. A frame, to recall, is like a record. This frame could be strongly
typed by a schema that says what the frame is allowed to contain and what
relationships it has with other frames. A schema is not a DTD, because the
validation is not at the syntax level but at the interpretation level. Let's
take an example:

We want to import data into a directory service, and to do so we use RDF. To
be sure that the XML parser won't have any name collisions we could use
namespaces; otherwise, if the document namespace is controlled, the use of
namespaces is superfluous. So let's have a directory record for a user on a
network.

<user>
	<Firstname> Albert </Firstname>
	<Lastname> Einstein </Lastname>
			etc....
</user>

The XML parser has enough to do its job, but the RDF interpreter now needs to
know the "frame" schema, or object category constraints. Thus, the
RDF interpreter can ask the XML parser to parse the XML-based schema
document to learn the "frame" constraints. Once the parsing is done, it can
compare each frame property with the schema to know whether the "frame" is valid
or not. It could also add a new schema to the directory service if the
object category is new to the directory.
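
A rough sketch of the two levels in Python. It is not RDF: the schema is a
plain dictionary invented for the example, the record is limited to the two
properties shown above, and the function names are hypothetical. It only
shows the parser producing a structure and a separate step judging it.

```python
import xml.etree.ElementTree as ET

RECORD = """<user>
    <Firstname> Albert </Firstname>
    <Lastname> Einstein </Lastname>
</user>"""

# Hypothetical schema: object category -> allowed property names.
SCHEMA = {"user": {"Firstname", "Lastname"}}

def build_frame(xml_text):
    """Syntax level: the XML parser turns markup into (category, frame)."""
    root = ET.fromstring(xml_text)
    frame = {child.tag: (child.text or "").strip() for child in root}
    return root.tag, frame

def frame_is_valid(category, frame, schema):
    """Interpretation level: every property must be allowed for the category."""
    allowed = schema.get(category)
    return allowed is not None and set(frame) <= allowed

category, frame = build_frame(RECORD)
valid = frame_is_valid(category, frame, SCHEMA)
bad = frame_is_valid("user", {"Shoesize": "44"}, SCHEMA)
print(category, frame, valid, bad)
```

Note that the well-formedness check happens inside the parser, while the
decision about which properties a "user" may carry never touches the syntax.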

Conclusion: The schema stuff is useful for the interpreter, not for the syntax
parser, which in this case is the XML parser. We have to keep in mind that
XML is for the syntax, and other mechanisms may have to be provided to the
syntax parser's client: the interpreter. An RDF interpreter then uses the XML
parser to convert into a structure that could be manipulated by the parser:
a) the RDF document
b) the schemas
and then "interprets" what to do, for instance importing data into a directory
service.

An XML document is like a sleeping beauty without an interpreter :-)
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From richard at goon.stg.brown.edu  Thu Mar 11 20:33:41 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:55 2004
Subject: Namespaces and DTDs
References: <199903111654.LAA04302@chmls06.mediaone.net> <14056.6319.57210.877490@localhost.localdomain>
Message-ID: <36E8285F.DEB5CE9@goon.stg.brown.edu>

David Megginson wrote, re why namespaces are needed:

> I'll paraphrase the use case as follows (I'll leave the source
> anonymous):
> 
>   A server wants to construct a large XML document as the response to
>   a client request, and it does so by handing off the work to several
>   parallel processes and then concatenating the results into a single
>   document.  If each of the processes can declare its own namespaces,
>   then it is not necessary to establish complicated negotiation
>   channels between the top-level process and the child processes to
>   obtain the correct namespace declarations.

Maybe it's just me, but this sort of statement would have more
credibility if there were more evidence of widespread practical
application of this technique.

Most successful standards are based, in large part, on experience and
wisdom people gain from actually doing a thing.  A lot.

Am I missing something here?

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From crism at oreilly.com  Thu Mar 11 20:34:30 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun  7 17:09:55 2004
Subject: Namespaces and DTDs
In-Reply-To: <NBBBJPGDLPIHJGEHAKBAAEOFCOAA.martind@netfolder.com>
Message-ID: <199903112033.PAA24971@ruby.ora.com>

[Didier PH Martin]
> a) a Lisp document could be made SGML compliant because SGML can let
> you define begin and end tag's delimiters (Ex: dsssl).

I think there's a little confusion here about DSSSL.  DSSSL
stylesheets are SGML documents, but they usually use angle-brackets:

<style-sheet>
<style-specification>
<style-specification-body>

(default (make sequence))

</style-specification-body>
</style-specification>
</style-sheet>

The parentheses are only character data.

I don't think that Lisp could be made SGML compliant; the delimiters
could be redefined, but as Steve DeRose notes in _The SGML FAQ Book_,
there are some limits to the flexibility of the redefinitions, since
some delimiter roles are overloaded.  Also, Lisp doesn't have the
equivalent of start-tag close, and you can only omit tagc if the next
character is stago or etago (ISO 8879:1986, clause 7.4.1.2) which it
wouldn't be when you get to the leaves of a structure.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From jtauber at jtauber.com  Thu Mar 11 22:03:18 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:55 2004
Subject: Java DOM Parsers
Message-ID: <008901be6c09$465da400$0300000a@othniel.cygnus.uwa.edu.au>

>What companies supply java DOM API's and other xml api tools?  Any
>suggestions on which to go with?


I'll leave others to suggest which to go with. But for a list, see:

http://www.xmlsoftware.com/utilities/
http://www.xmlsoftware.com/parsers/

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8 : http://www8.org/

Maintainer of : www.xmlinfo.com,  www.xmlsoftware.com and www.schema.net




From Marc.McDonald at Design-Intelligence.com  Thu Mar 11 22:20:05 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:09:55 2004
Subject: Namespaces and DTDs
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990311221906Z-6063@master.design-intelligence.com>

So make a namespace declaration a PI and add a "not using this 
namespace anymore" PI. Then use simple occurrence scoping:

Process result:
<?XMLNS prefix="foo" uri="..."?>
<A> .... </A>
<?XMLENDNS prefix="foo"?>

Process gets to define the prefixes that override any previous 
definition, old definition (if any) restored by XMLENDNS. No problem 
with concatenation.
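
A minimal sketch of the occurrence scoping described above, assuming one
stack of URIs per prefix. The class and method names are hypothetical and
are not part of any proposal or parser API; declare() stands for an XMLNS
PI and end() for an XMLENDNS PI.

```python
from collections import defaultdict


class PrefixScopes:
    """Occurrence-scoped prefix table (hypothetical illustration)."""

    def __init__(self):
        # One stack of URIs per prefix; the top of a stack is the live binding.
        self._stacks = defaultdict(list)

    def declare(self, prefix, uri):
        # Models an XMLNS PI: the new binding overrides any earlier one.
        self._stacks[prefix].append(uri)

    def end(self, prefix):
        # Models an XMLENDNS PI: drop the live binding so the previous
        # one (if any) becomes visible again.
        self._stacks[prefix].pop()

    def resolve(self, prefix):
        # Look up the live binding without creating an empty stack.
        stack = self._stacks.get(prefix)
        return stack[-1] if stack else None
```

Each concatenated fragment can push and pop its own bindings without
knowing what the surrounding document declared; an end() without a
matching declare() raises IndexError in this sketch.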


Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  David Megginson [SMTP:david@megginson.com]
Sent:  Thursday, March 11, 1999 11:30 AM
To:  XML Developers' List
Subject:  Re: Namespaces and DTDs

Charles Reitzel writes:

 > Seriously, though.  I have yet to hear of a single real application
 > that needs element level prefix declarations.  Not one!

I'll paraphrase the use case as follows (I'll leave the source
anonymous):

  A server wants to construct a large XML document as the response to
  a client request, and it does so by handing off the work to several
  parallel processes and then concatenating the results into a single
  document.  If each of the processes can declare its own namespaces,
  then it is not necessary to establish complicated negotiation
  channels between the top-level process and the child processes to
  obtain the correct namespace declarations.

Before everyone rushes out to shoot holes in this use case, I'd like
to note that I still have callouses on my trigger finger from doing so
myself.


All the best,


David

--
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jborden at mediaone.net  Fri Mar 12 00:44:34 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:09:55 2004
Subject: DocumentHandler with xml4j DOMParser
In-Reply-To: <3601b76a.110299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <000301be6c20$a6e177e0$d3228018@jabr.ne.mediaone.net>

Perhaps the confusion is this:

A DocumentHandler is a SAX concept, not a DOM concept. The DOMParser
contains a DocumentHandler that builds a DOM tree from the source document.
If you are working with the DOM, then you will parse the document and then
access its members through the DOM interfaces. If you would rather process
using an event based interface, then use SAX directly i.e. the SAXParser.
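
The two styles can be put side by side in a short sketch. It uses Python's
standard library rather than xml4j, purely as an illustration; Python's
ContentHandler plays the role that DocumentHandler plays in SAX 1, and the
sample document and function names are made up.

```python
import xml.sax
from xml.dom import minidom

DOC = b"<order><item>pen</item><item>ink</item></order>"


class ItemCounter(xml.sax.ContentHandler):
    """Event-based: the parser calls back as each element starts."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_items_sax(doc):
    # Nothing is kept in memory except what the handler chooses to keep.
    handler = ItemCounter()
    xml.sax.parseString(doc, handler)
    return handler.count


def count_items_dom(doc):
    # Tree-based: parse the whole document first, then query the tree.
    return len(minidom.parseString(doc).getElementsByTagName("item"))
```

Both functions give the same answer for the same document; the difference
is whether the application sees events as they occur or a finished tree.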

Jonathan Borden
http://jabr.ne.mediaone.net




From andrew at squiz.co.nz  Fri Mar 12 02:42:04 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:09:55 2004
Subject: Namespaces and DTDs 
In-Reply-To: Your message of "Thu, 11 Mar 1999 14:19:06 -0800."
             <c=US%a=_%p=Design_Intellige%l=MASTER-990311221906Z-6063@master.design-intelligence.com> 
Message-ID: <199903120241.PAA10692@aniwa.sky>


Documents resulting from queries run on the concatenated document would tend 
to cause problems, as query results don't generally return the context of the 
XML elements returned.  This problem also applies to queries run across 
multiple documents unless their DTD's are identical, which perhaps suggests 
that an answer to this problem has to come from the query languages.

Andrew McNaughton


 Marc.McDonald@Design-Intelligence.com wrote:
> So make a namespace declaration a PI and add a "not using this 
> namespace anymore" PI. Then use simple occurrence scoping:
> 
> Process result:
> <?XMLNS prefix="foo" uri="..."?>
> <A> .... </A>
> <?XMLENDNS prefix="foo"?>
> 
> Process gets to define the prefixes that override any previous 
> definition, old definition (if any) restored by XMLENDNS. No problem 
> with concatenation.
> 
> 
> Marc B McDonald
> Principal Software Scientist
> Design Intelligence, Inc
> www.design-intelligence.com
> 
> 
> ----------
> From:  David Megginson [SMTP:david@megginson.com]
> Sent:  Thursday, March 11, 1999 11:30 AM
> To:  XML Developers' List
> Subject:  Re: Namespaces and DTDs
> 
> Charles Reitzel writes:
> 
>  > Seriously, though.  I have yet to hear of a single real application
>  > that needs element level prefix declarations.  Not one!
> 
> I'll paraphrase the use case as follows (I'll leave the source
> anonymous):
> 
>   A server wants to construct a large XML document as the response to
>   a client request, and it does so by handing off the work to several
>   parallel processes and then concatenating the results into a single
>   document.  If each of the processes can declare its own namespaces,
>   then it is not necessary to establish complicated negotiation
>   channels between the top-level process and the child processes to
>   obtain the correct namespace declarations.
> 
> Before everyone rushes out to shoot holes in this use case, I'd like
> to note that I still have callouses on my trigger finger from doing so
> myself.
> 
> 
> All the best,
> 
> 
> David
> 
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 

-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From cbullard at hiwaay.net  Fri Mar 12 05:13:49 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:09:55 2004
Subject: Oedipus XML (TIe Your Mother Down)
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
		<36E7AE27.7DE6@hiwaay.net> <14055.59156.593634.998329@localhost.localdomain>
Message-ID: <36E8A1E8.1B@hiwaay.net>

David Megginson wrote:
> 
> Speaking as both a parser writer and an application writer, I am
> comfortable writing that XML is significantly simpler to support in
> enterprise-level implementations than full SGML, and that I have not
> actually yet really missed any of the SGML features excluded from
> XML.

I agree with this.  As an application writer who only has to parse 
parts of it and then in the context of using a relational system 
with XML editing, it looks very much the same to me as the simple 
features of SGML that I've always used.  In effect, much of the 
nastier bits of SGML I did not use before.  So, it looks much 
the same.  It is a lot of fun to tie the treeviews, browser 
objects, tables, dialogs, combo boxes, etc. together into a 
generalized knowledge management system.  Cheap too.  ;-)

> To be fair, I am talking only about the core specs -- I am comparing
> ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral
> standards on both sides.  A comparison of HyTime to XLink, XPointer,
> and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an
> interesting but separate exercise.

Here I don't disagree, but in my work, the concepts of HyTime, DSSSL, 
the RDF Dublin Core, and namespaces influence my work.  Learning to 
think beyond the DTD to the information properties of the metalanguage 
proves to be very useful and that is not something I did before.  
As activities like X3D ramp up, I find I am applying more and more 
of the wall-to-wall markup concepts from the middle years of SGML 
and they work in the XML infrastructure of tools.  This is actually 
quite delightful.

len



From cbullard at hiwaay.net  Fri Mar 12 05:19:29 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:09:55 2004
Subject: Namespaces and DTDs
References: <002001be6bdd$cf44f560$46026982@thing1.camb.opengroup.org>
Message-ID: <36E8A33A.43BB@hiwaay.net>

Bill la Forge wrote:
> 
> From: len bullard <cbullard@hiwaay.net>
> >Darn.  Maybe LISP was the right language after all and forty years
> >of computer scientists just didn't "get it".
> 
> Lisp and XML have a few things in common, like being easy to
> determine if they are well formed. Frankly, I think XML will be
> better in the long run because it can be validated against various
> schema.

As much as I resisted it in the early working groups for 
various reasons, I find myself agreeing with the position 
that it is good to have formal definitions for both 
well-formed and validated information.  I had worked in 
that mode in the IDE/AS, IADS and GE systems, but the 
notion wasn't formally expressed.  I like ISO 8879 DTDs mainly 
because they are for me, much easier to read and use 
to parse in my head.  As I implement more with relational 
systems and use the tables to store the property sets of 
both schemata, properties of schemata as well as instances, 
I think I have more insight now into why people want 
multiple schema types even without the obvious extensions 
such as inheritance.

nodes is nodes is nodes.

len



From cbullard at hiwaay.net  Fri Mar 12 05:26:40 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:09:55 2004
Subject: FW: Namespaces and DTDs
References: <NBBBJPGDLPIHJGEHAKBAMEOCCOAA.martind@netfolder.com>
Message-ID: <36E8A4EB.375F@hiwaay.net>

Didier PH Martin wrote:
> 
> Hi
> 
> I am using these simple rule of thumb:
> 
> a) a XML DTD is useful for XML editors not for XML renderers
> b) Most XML renderers (XSL, CSS or DSSSL won't do document validation)
> c) a XML interpreter do not need a DTD (something else than rendition)
> 
> If I need a DTD at the receiving end, then I am now no longer in the XML
> world but in the SGML world because the receiving end needs a validating
> parser. Several SGML parser like for instance SP can parse XML simplified
> DTD. The only simplification I gained is the -- or -0 thing called omitags.
> Therefore, because I have to include a DTD for validation, better use then a
> SGML format.
> 
> However, on the Web, to reduce complexity, I should not assume that the
> receiving end has a validating parser. Thus, because my XML document has
> been validated with my XML editor or by any other validation program. The
> receiving end makes the reasonable assumption that if the document is a
> XML document it is "well formed" and valid.

That's mostly true because web documents don't stick around.  In
cases where information is moving across multiple processes or sits
in some long term archival, it is very handy to be able to validate it
on the receiving end.  This will become more apparent to the XML
community when they get to do the sort of work the SGML community did
a decade after the first SGML applications fielded instances.  Things
change.  Finding those changes quickly is the key to cheap rehosting.
In my experience, if DTDs die, someone gets to reinvent them and it
will be painful.

Otherwise, yes, the DTD is much more useful in the editor in the initial 
part of the information lifecycle.

len



From rbourret at ito.tu-darmstadt.de  Fri Mar 12 08:29:31 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:55 2004
Subject: Namespaces and DTDs
Message-ID: <01BE6C6A.B2052F50@grappa.ito.tu-darmstadt.de>

Didier PH Martin wrote:

> a) a XML DTD is useful for XML editors not for XML renderers
> b) Most XML renderers (XSL, CSS or DSSSL won't do document validation)
> c) a XML interpreter do not need a DTD (something else than rendition)

(c) is not always true because DTDs are used for more things than just 
validation. For example, DTDs are used to define internal general entities, 
attribute defaults, and attribute types.  (The latter is important, for 
example, if a processor expects to build links based on ID/IDREF attributes 
or process according to notations.)
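
A small sketch of the point, using Python's expat binding as a stand-in
for a non-validating parser. The document, element names and entity are
invented for illustration. No validation takes place, yet the internal DTD
subset still supplies an attribute default and an internal general entity.

```python
import xml.parsers.expat

DOC = b"""<?xml version="1.0"?>
<!DOCTYPE memo [
  <!ATTLIST memo status CDATA "draft">
  <!ENTITY co "Example Corp">
]>
<memo>From &co;</memo>"""


def parse(doc):
    """Return (attributes per element, concatenated character data)."""
    attrs_seen = {}
    text = []
    parser = xml.parsers.expat.ParserCreate()

    def start(name, attrs):
        # attrs includes values defaulted from the ATTLIST declaration.
        attrs_seen[name] = dict(attrs)

    parser.StartElementHandler = start
    parser.CharacterDataHandler = text.append
    parser.Parse(doc, True)
    return attrs_seen, "".join(text)
```

The memo start-tag in the instance carries no attributes at all, so an
application that ignored the DTD would never see the status value.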

-- Ron Bourret




From lucio.piccoli at one2one.co.uk  Fri Mar 12 08:44:10 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:09:55 2004
Subject: XML DATA
Message-ID: <3601bf97.120299@smtpgate1.ONE2ONE.CO.UK>


Hi all,

I would like to know the status of the XML Data proposal and its uptake in
the XML community. Currently the only XML Data parser that I have found is
from MS. Does anyone else plan on supporting XML Data in the future?
If not, what is the alternative to XML Data?


adios

 -lucio

 ---------------------------------------------------------------------
 One2One              LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower      tel : +44 181 214 3847
 Elstree Way
 Borehamwood                 fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From rbourret at ito.tu-darmstadt.de  Fri Mar 12 09:16:47 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:09:55 2004
Subject: XML DATA
Message-ID: <01BE6C71.4DE32A70@grappa.ito.tu-darmstadt.de>

LUCIO PICOLLI wrote:

> I would like to know the status of XML Data proposal and it's take up in 
> the XML community. Currently the only XML data parser that i found is
> from MS. Does anyone else plan on supporting XML Data in the future?
> If not, what is the alternative to XML DATA ?

Outside of the Microsoft parser, XML Data is probably dead.  There are 
three other schema proposals (SOX, DCD, and DDML) and the W3C is currently 
working on their own.  XML Data is significant in that it seems to be the 
only schema language that is publicly supported by a parser.

You can find the various schema language specs at:

SOX: http://www.w3.org/TR/NOTE-SOX/
DCD: http://www.w3.org/TR/NOTE-dcd
DDML: http://www.w3.org/TR/NOTE-ddml
XML-Data: http://www.w3.org/TR/1998/NOTE-XML-data/
W3C XML Schema requirements: http://www.w3.org/TR/NOTE-xml-schema-req

and an overview of the various schema languages at:

http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/index.htm
OR
http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/XMLSchemas.ppt

-- Ron Bourret




From oren at capella.co.il  Fri Mar 12 09:19:58 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:09:55 2004
Subject: Fw: ModSAX: Proposed Core Features
Message-ID: <00f501be6c68$bd213020$5402a8c0@oren.capella.co.il>

Ronald Bourret <rbourret@ito.tu-darmstadt.de> wrote:
>If the application assembles the components and the interface between them
>is SAX, what do we need that SAX filters don't already give us?  In other
>words, does anything need to be done to OpenSAX (best name so far) to
>support this besides adding the ParserFilter interface?


Yes. One needs to _locate_ the necessary filters. Hence the registry, the
query-for-a-feature, etc.
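
To make "locating" concrete, here is a rough sketch of such a registry.
Everything in it is hypothetical: the class, its methods and the feature
name are invented for illustration and are not part of SAX or of any
ModSAX draft. Python's XMLFilterBase merely stands in for a real filter.

```python
from xml.sax.saxutils import XMLFilterBase


class FilterRegistry:
    """Maps feature names to component factories (hypothetical)."""

    def __init__(self):
        self._factories = {}

    def register(self, feature, factory):
        # Several components may offer the same feature.
        self._factories.setdefault(feature, []).append(factory)

    def locate(self, feature):
        # Return a new component providing the feature, or None.
        for factory in self._factories.get(feature, []):
            return factory()
        return None


# Made-up feature name; a real one would need to be agreed on.
PASS_THROUGH = "http://example.org/sax/features/pass-through"

registry = FilterRegistry()
registry.register(PASS_THROUGH, XMLFilterBase)
```

The application asks for a feature by name and gets back whatever
component the platform happens to have registered for it.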

>The other question that occurs to me is how useful/common it is to
>dynamically assemble a processor at run time. That is, are there really
>applications (outside of test environments) that allow the user to
>designate their parser at run time (or even installation time) and
>therefore need to cover any possible deficiencies in the chosen parser?
> What is gained by allowing the user to choose the parser?


If there aren't, why bother with ModSAX at all? If I know exactly which
class is used, I also know exactly which features it provides, right? The
whole point of ModSAX is that this isn't the case.

Think of it like this: XML support is not the same on all platforms.
Sometimes there's a built-in SAX parser. It may or may not support some
features. Sometimes there's an XSL processor. And so on. I'm talking about
platforms existing today, or "real soon now" - IE5, server packages, etc.

I want to write code which is _reasonably_ portable to such platforms. I
accept the remark that a full-scale solution is beyond the scope of ModSAX.
What I suggested is an interface in the spirit of SAX (I hope) -
lightweight, simple, low-level, which allows future layering of higher-level
solutions.

>Note that this is a very different situation from, say, using different
>ODBC drivers.  In the case of ODBC drivers, you are choosing a different
>source of data (type of database) and application writers have a strong
>incentive to support multiple databases through ODBC.  In the case of XML,
>the source of data is always the same XML document and the choice of parser
>becomes a trade-off between speed, reliability, feature-set, etc.


On the contrary, I see it as being very similar to using ODBC drivers. ODBC
drivers vary in their capabilities, and therefore have a mechanism for
querying for particular features. So do XML components. There might be any
number of ODBC drivers available in a particular system. Same for XML
components. And you typically have a pretty good idea of which ODBC driver
you are going to use. Same for XML components. The last point doesn't
invalidate the first two.

BTW, have you ever tried to write a non trivial program which would work
with any ODBC driver? I have. You have to at least negotiate its
capabilities, find a match for your needs, and then the problems start - it
doesn't like this join syntax, it can't do this particular form of query...
You end up writing an adapter class which knows the particular nastiness of
the particular driver. Of course this is due to SQL being such a weak
standard; XML should be better in this regard - if we insist on
well-defining features, that is.

>Since the application writer knows the feature set ahead of time, why not
>just hard-code the required parser and SAX filters and be done with it?
> (Yes, I know that "hard-code" is a bad word and I shudder as I write it,
>but I really am curious if anybody out there has a real-world application
>that allows users to change parsers and what the benefits of this are
>besides the ability to say, "Oh, look. I'm using a different parser.")


Mine. I run on both IE5 ("hey, look, there's a built in XSL processor") and
IE4 ("oh well, let's use XT"), not to mention some server platforms I'm
considering. I'm also tentatively considering other XML features -
namespaces and embedding. I doubt I'm unique in this regard. And as XML
support starts crawling into popular platforms (examples abound), this would
become more and more common.

At least we hope so :-)

>In this view, the utility of SAX is not the ability to change parsers at
>run time, but to change them over time as reliability, speed, size, etc. of
>the parsers change.  It also means that application writers can learn a
>single interface (SAX) and then choose parsers as they are appropriate to
>the application without having to learn different interfaces for different
>parsers.


That's one view and a valid one. It shouldn't prevent the other one.

>The ability to request features in OpenSAX allows the application to
>request processor behavior, which is slightly different from assembling a
>suitable parser.  For example, if I have an application that doesn't need
>validation, but the parser I want to use does validation by default, I
>would like to be able to turn that off.


Right. I didn't suggest that the original question ("which features are
supported") isn't important. What I suggested is that the second question
("how do I find a filter/parser which does X") is also important.

If it wasn't, why do we have a ParserFactory class in SAX?
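
For the first question, a sketch of what requesting a feature can look
like in practice. It uses the SAX binding in Python's standard library,
where the feature-request idea later landed as setFeature(); the helper
name request_feature is made up, and which requests succeed depends on
the parser actually installed.

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException
from xml.sax.handler import feature_namespaces, feature_validation


def request_feature(parser, feature, state):
    """Try to switch a feature on or off; report whether the parser agreed."""
    try:
        parser.setFeature(feature, state)
    except (SAXNotRecognizedException, SAXNotSupportedException):
        # Unknown feature, or known but not available in this state.
        return False
    return True


parser = xml.sax.make_parser()
namespaces_on = request_feature(parser, feature_namespaces, True)
validation_off = request_feature(parser, feature_validation, False)
```

This answers "can this parser do X", but not "where do I find something
that can", which is the second question.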

BTW, I'm not happy with this "parser" fixation. SAX is an interface which
allows processing an XML tree. I don't see why the special case ("input:
text; output: SAX events") is any different than "input: DOM; output: SAX
events", for example. That's why "org.xml.sax.parser" is just another
"feature" in the API I suggested. "org.xml.sax.visitor" and
"org.xml.sax.builder" would be on equal grounds. IMVHO, converting DOM to
SAX and back is something which we will have to deal with.
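
As an illustration of the "input: DOM; output: SAX events" case, a
minimal sketch in Python's standard library. The function name dom_to_sax
is made up, and the walk handles only documents, elements and text.

```python
import xml.sax
from xml.dom import Node, minidom
from xml.sax.xmlreader import AttributesImpl


def dom_to_sax(node, handler):
    """Walk a DOM subtree and replay it as SAX events on handler."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        # NamedNodeMap.items() yields (name, value) pairs.
        attrs = AttributesImpl(dict(node.attributes.items()))
        handler.startElement(node.tagName, attrs)
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        handler.characters(node.data)
    # Comments, PIs and doctype nodes are skipped in this sketch.
```

Any handler written against the event interface can consume the output,
whether the events come from parsing text or from walking a tree.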

>Just to be clear, I'm not necessarily against assembling processors based
>on a feature set.  I just believe that it is far more complex than it
>appears at first glance and am not convinced that it's worth the trouble.


I think I've answered the complexity issue - the API I've suggested is
anything but. It merely provides the basic building blocks. The application
may be as complex or as simple as you want.

Have fun,

    Oren Ben-Kiki




From oren at capella.co.il  Fri Mar 12 09:28:26 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:09:55 2004
Subject: Fw: ModSAX: Proposed Core Features
Message-ID: <00fa01be6c69$e7a44160$5402a8c0@oren.capella.co.il>

David Megginson <david@megginson.com> wrote:

>I wrote:
> > I'd like to see "http://xml.org/sax/features/xsl-transformation" as
> > well.  Anyway, all of the above seem to fall nicely into the
> > pipeline framework.
>
>How about "http://capella.co.il/~oren/sax/features/xsl-transformation"
>(or whatever is suitable for your web rights)?


I kind of doubt that any XSL processors would register themselves under this
name :-) I don't think that implementations of standard W3C features should
be under private names. Come to think of it, if we go the URI way (which I'm
not happy with since it can't be used as a property name), the "right" URI
is a pointer to the relevant W3C standard.

Have fun,

    Oren Ben-Kiki




From James.Anderson at mecomnet.de  Fri Mar 12 09:51:35 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:56 2004
Subject: FW: Namespaces and DTDs
References: <NBBBJPGDLPIHJGEHAKBAMEOCCOAA.martind@netfolder.com>
Message-ID: <36E8E733.7F84EA5E@mecomnet.de>

Didier PH Martin wrote:
> 
> I am using these simple rule of thumb:
> 
> a) a XML DTD is useful for XML editors not for XML renderers

if one presumes this, then one loses the ability to use attribute defaults
and, thereby, for example, the chance to use "architectural" techniques.

> b) Most XML renderers (XSL, CSS or DSSSL won't do document validation)
> c) a XML interpreter do not need a DTD (something else than rendition)
> 
> If I need a DTD at the receiving end, then I am now no longer in the XML
> world but in the SGML world because the receiving end needs a validating
> parser.

these techniques do not presume validation, just the availability of
attribute declarations.

>    Several SGML parser like for instance SP can parse XML simplified
> DTD. The only simplification I gained is the -- or -0 thing called omitags.
> Therefore, because I have to include a DTD for validation, better use then a
> SGML format.
>




From James.Anderson at mecomnet.de  Fri Mar 12 10:01:05 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:09:56 2004
Subject: Namespaces and DTDs
References: <199903120241.PAA10692@aniwa.sky>
Message-ID: <36E8E947.5B6EF6B7@mecomnet.de>

of all things, the "context problems" would be minimal, so long as the
decoding process maps the prefixed identifiers to universal identifiers.
(which as best i can surmise all the "standard" parsers do.) the application
wouldn't care where they came from and any reserialization would be
responsible to get its own declarations in order.

[i'm not arguing for this declaration form, just noting that it doesn't make
the problem any more complex.]
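
a small sketch of that mapping, using python's ElementTree purely as an
illustration (the fragments and the namespace uri are made up). two
fragments that pick different prefixes for the same namespace decode to
the same universal name.

```python
import xml.etree.ElementTree as ET

PART_ONE = '<a:item xmlns:a="urn:example:catalog">pen</a:item>'
PART_TWO = '<b:item xmlns:b="urn:example:catalog">ink</b:item>'


def universal_name(xml_text):
    # ElementTree reports each tag as "{namespace-uri}local-name",
    # so the prefix chosen by the producer no longer matters.
    return ET.fromstring(xml_text).tag
```

once names are in this form, the application never sees the prefixes,
and a serializer is free to choose its own when writing the data out.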

Andrew McNaughton wrote:
> 
> Documents resulting from queries run on the concatenated document would tend
> to cause problems, as query results don't generally return the context of the
> XML elements returned.  This problem also applies to queries run across
> multiple documents unless their DTD's are identical, which perhaps suggests
> that an answer to this problem has to come from the query languages.
> 
> Andrew McNaughton
> 
>  Marc.McDonald@Design-Intelligence.com wrote:
> > So make a namespace declaration a PI and add a "not using this
> > namespace anymore" PI. Then use simple occurrence scoping:
> >
> > Process result:
> > <?XMLNS prefix="foo" uri="..."?>
> > <A> .... </A>
> > <?XMLENDNS prefix="foo"?>
> >




From lucio.piccoli at one2one.co.uk  Fri Mar 12 10:53:41 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:09:56 2004
Subject: XML DATA
Message-ID: <3601c191.120299@smtpgate1.ONE2ONE.CO.UK>


Thanks for the info below, Ronald.
I am very interested in the SOX schema. I guess I might as well ask a
similar question as before: what is the uptake of SOX in the XML
community? What do other people use to map native data types in XML?

 -lucio

> > I would like to know the status of XML Data proposal and it's take up in
> > the XML community. Currently the only XML data parser that i found is
> > from MS. Does anyone else plan on supporting XML Data in the future?
> > If not, what is the alternative to XML DATA ?
>
> Outside of the Microsoft parser, XML Data is probably dead.  There are
> three other schema proposals (SOX, DCD, and DDML) and the W3C is currently
> working on their own.  XML Data is significant in that it seems to be the
> only schema language that is publicly supported by a parser.
>
> You can find the various schema language specs at:
>
> SOX: http://www.w3.org/TR/NOTE-SOX/
> DCD: http://www.w3.org/TR/NOTE-dcd
> DDML: http://www.w3.org/TR/NOTE-ddml
> XML-Data: http://www.w3.org/TR/1998/NOTE-XML-data/
> W3C XML Schema requirements: http://www.w3.org/TR/NOTE-xml-schema-req
>
> and an overview of the various schema languages at:
>
> http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/index.htm
> OR
> http://www.informatik.tu-darmstadt.de/DVS1/staff/bourret/xml/xmlschemas/XMLSchemas.ppt
>
> -- Ron Bourret




From david at megginson.com  Fri Mar 12 11:38:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:56 2004
Subject: Oedipus XML (TIe Your Mother Down)
In-Reply-To: <36E8A1E8.1B@hiwaay.net>
References: <01BE6B63.2B7477A0.jarle.stabell@dokpro.uio.no>
	<36E7AE27.7DE6@hiwaay.net>
	<14055.59156.593634.998329@localhost.localdomain>
	<36E8A1E8.1B@hiwaay.net>
Message-ID: <14056.64392.235652.11594@localhost.localdomain>

len bullard writes:

 > > To be fair, I am talking only about the core specs -- I am comparing
 > > ISO 8879 to the XML 1.0 REC, and am leaving out the peripheral
 > > standards on both sides.  A comparison of HyTime to XLink, XPointer,
 > > and Namespaces, of DSSSL to XSL, or of Topic Maps to RDF would be an
 > > interesting but separate exercise.
 > 
 > Here I don't disagree, but in my work, the concepts of HyTime, DSSSL, 
 > the RDF Dublin Core, and namespaces influence my work.  Learning to 
 > think beyond the DTD to the information properties of the metalanguage 
 > proves to be very useful and that is not something I did before.  
 > As activities like X3D ramp up, I find I am applying more and more 
 > of the wall-to-wall markup concepts from the middle years of SGML 
 > and they work in the XML infrastructure of tools.  This is actually 
 > quite delightful.

Precisely.  The important point, though, is that none of these
peripheral standards is hard-coded to SGML or XML.  Although there are
some minor lexical differences, in general you could apply Namespaces
to SGML or HyTime to XML, you could use XSL to format an SGML document
or DSSSL to format an XML document, etc.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rudman at idetix.com  Fri Mar 12 14:02:05 1999
From: rudman at idetix.com (Dan Rudman)
Date: Mon Jun  7 17:09:56 2004
Subject: Basic Question
Message-ID: <000701be6c90$8e31ee80$49e9fdce@diablo.idetix.com>

I apologize for the basic question in advance :)


With the wealth of XML libraries available, I am more and more inclined to
make use of these libraries to help me create, parse, and utilize my own tag
markup language to be embedded within an HTML document.  My understanding of
XML at this point is that it must be well-formed or a fatal error occurs.
If this is the case, how can I deal with the fact that most HTML documents
are NOT well-formed and that most HTML design tools do not enforce, require,
or even sometimes support, well-formedness in a document?

Things would be rosy if I didn't have to rely on HTML, but my application
requires it.

Thanks.

-- Dan


From david at megginson.com  Fri Mar 12 14:06:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:56 2004
Subject: Mod??SAX: Revised Proposed Core Features 1999-03-12
Message-ID: <14057.7534.612787.424789@localhost.localdomain>

Here's a revised list of the proposed core features for Mod??SAX.
I've added one new feature -- http://xml.org/sax/features/use-locator
-- which will explicitly request the parser to supply or not to supply
a Locator through the DocumentHandler.setDocumentLocator callback.

There are two possible advantages to including this feature:

1. If the application wants a locator, it can tell before beginning
   the parse whether the parser can supply one.

2. If the application does not want a locator, the SAX parser/driver
   might be able to operate more efficiently if it doesn't have to
   maintain the Locator information.

What does everyone else think?

In any case, here's the revised core feature list (I've also added
extra wording to make it clear that the external DTD subset counts as
an external parameter entity):


ModSAX Core Features
--------------------

$Id: features.txt,v 1.1 1999/03/12 13:57:54 david Exp $

http://xml.org/sax/features/validation
  Validate (true) or don't validate (false).

http://xml.org/sax/features/external-general-entities
  Expand external general entities (true) or don't expand (false).

http://xml.org/sax/features/external-parameter-entities
  Expand external parameter entities including the external DTD subset 
  (true) or don't expand (false).

http://xml.org/sax/features/namespaces
  Preprocess namespaces (true) or don't preprocess (false).  See also
  the http://xml.org/sax/properties/namespace-sep property.

http://xml.org/sax/features/normalize-text
  Ensure that all consecutive text is returned in a single callback to
  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
  (true) or explicitly do not require it (false).

http://xml.org/sax/features/use-locator
  Provide a Locator using the DocumentHandler.setDocumentLocator
  callback (true), or explicitly do not provide one (false).
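
The first advantage listed above, finding out before the parse whether a
Locator will be available, could look roughly like the following on the
application side. This is only a sketch: FeatureSource and the two
exception classes are hypothetical stand-ins invented for illustration,
not the proposed ModSAX classes, and alwaysLocating() is a toy driver.

```java
/** Hypothetical stand-in for "feature name unknown"; not an org.xml.sax class. */
class NotRecognizedException extends Exception {
    NotRecognizedException(String id) { super("unknown feature: " + id); }
}

/** Hypothetical stand-in for "feature known, requested state unavailable". */
class NotSupportedException extends Exception {
    NotSupportedException(String id, boolean state) {
        super(id + " cannot be set to " + state);
    }
}

/** Hypothetical minimal view of a ModSAX-style parser: feature negotiation only. */
interface FeatureSource {
    void setFeature(String featureId, boolean state)
        throws NotRecognizedException, NotSupportedException;
}

final class LocatorNegotiation {
    static final String USE_LOCATOR = "http://xml.org/sax/features/use-locator";

    /** Returns true if the parser accepted the requested setting before any parse began. */
    static boolean request(FeatureSource parser, boolean wantLocator) {
        try {
            parser.setFeature(USE_LOCATOR, wantLocator);
            return true;
        } catch (NotRecognizedException | NotSupportedException e) {
            // Either way the application now knows not to rely on the setting.
            return false;
        }
    }

    /** Toy driver that always maintains a Locator and cannot switch it off. */
    static FeatureSource alwaysLocating() {
        return (id, state) -> {
            if (!USE_LOCATOR.equals(id)) throw new NotRecognizedException(id);
            if (!state) throw new NotSupportedException(id, state);
        };
    }
}
```

An application that needs line numbers for error reporting would call
request(parser, true) and fall back to position-free messages if it
returns false.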


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From keshlam at us.ibm.com  Fri Mar 12 14:27:48 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:09:56 2004
Subject: One more ModSax naming try...
Message-ID: <85256732.004F61BA.00@D51MTA03.pok.ibm.com>

</lurk>
For what it's worth, the approach the DOM has been looking at is that Level
2 classes which are subclasses/refinements of things that were present in
Level 1 will be named by adding 2 as a suffix (Document2 et al). Simple,
effective, extensible, and it indicates which version of the spec it refers to.

Hence: SAX2?
<lurk>
______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From david at megginson.com  Fri Mar 12 14:45:24 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:56 2004
Subject: Mod??SAX: Feature Matrix
Message-ID: <14057.8681.213869.498655@localhost.localdomain>

It would be interesting to put together a feature matrix representing
current practice among SAX parsers/drivers, at least in the Java
world.  Assuming that I simply wrapped the existing drivers with a
Mod??Parser adapter, what features would and would not be supported?

From my fuzzy recollection, here's what AElfred supported just about a
year ago before I gave it up:

                                 true              false
------------------------------------------------------------------------
validation                       no                yes
external-general-entities        yes               no
external-parameter-entities      yes               no
namespaces                       no                yes
normalize-text                   yes               no
use-locator                      yes               no

(Wherever there's a 'no' answer, the driver should throw a
SAXNotSupportedException).  Actually, it would probably be safe to
accept false for 'normalize-text' as well.  If I were to wrap the
AElfred driver, then, I'd do something like this (there's likely some
kind of a static initialisation trap here, but it should be good
enough as an unreliable example):


import java.util.Hashtable;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

public class AElfredModParser extends com.microstar.xml.SAXDriver
                              implements org.xml.sax.ModParser
{
  // Maps each feature URI to a marker saying which states may be requested.
  private static Hashtable featureTable = new Hashtable();
  private static final Object TRUE = new Object();       // only true is supported
  private static final Object FALSE = new Object();      // only false is supported
  private static final Object TRUEFALSE = new Object();  // either state is accepted
  private static final String FEATURE_NS = "http://xml.org/sax/features/";

  static {
    featureTable.put(FEATURE_NS + "validation", FALSE);
    featureTable.put(FEATURE_NS + "external-general-entities", TRUE);
    featureTable.put(FEATURE_NS + "external-parameter-entities", TRUE);
    // The matrix above lists namespaces as unsupported, so only false is allowed.
    featureTable.put(FEATURE_NS + "namespaces", FALSE);
    featureTable.put(FEATURE_NS + "normalize-text", TRUEFALSE);
    featureTable.put(FEATURE_NS + "use-locator", TRUE);
  }

  public void setFeature (String featureID, boolean state)
    throws SAXNotRecognizedException, SAXNotSupportedException
  {
    Object allowedState = featureTable.get(featureID);
    if (allowedState == null) {
      // The feature name is not in the table at all.
      throw new SAXNotRecognizedException();
    } else if ((state && allowedState == FALSE) ||
               (!state && allowedState == TRUE)) {
      // The feature is known, but not in the requested state.
      throw new SAXNotSupportedException();
    }
  }

  // etc. for setHandler, set, and get
}
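
Given any wrapper of this kind, the matrix itself could be generated by
probing rather than from memory. The sketch below assumes a stripped-down
parser view; Configurable, Refused, and the fixed() toy driver are
hypothetical names made up for this example and are not part of SAX or
AElfred. Note that probing leaves each feature in whichever state was
accepted last, so it should be done on a throwaway parser instance.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class FeatureMatrix {
    /** Hypothetical exception covering both "not recognized" and "not supported". */
    static class Refused extends Exception {
        Refused(String message) { super(message); }
    }

    /** Hypothetical minimal parser view: feature setting only. */
    interface Configurable {
        void setFeature(String featureId, boolean state) throws Refused;
    }

    static final String FEATURE_NS = "http://xml.org/sax/features/";
    static final String[] CORE = {
        "validation", "external-general-entities", "external-parameter-entities",
        "namespaces", "normalize-text", "use-locator"
    };

    static boolean accepts(Configurable parser, String name, boolean state) {
        try {
            parser.setFeature(FEATURE_NS + name, state);
            return true;
        } catch (Refused e) {
            return false;
        }
    }

    /** One row per core feature: short name -> {accepts true, accepts false}. */
    static Map<String, boolean[]> probe(Configurable parser) {
        Map<String, boolean[]> rows = new LinkedHashMap<>();
        for (String name : CORE) {
            rows.put(name, new boolean[] {
                accepts(parser, name, true), accepts(parser, name, false) });
        }
        return rows;
    }

    /** Renders the rows in the same yes/no layout as the table above. */
    static String render(Map<String, boolean[]> rows) {
        StringBuilder out = new StringBuilder(
            String.format("%-32s %-8s %s%n", "", "true", "false"));
        for (Map.Entry<String, boolean[]> row : rows.entrySet()) {
            out.append(String.format("%-32s %-8s %s%n", row.getKey(),
                row.getValue()[0] ? "yes" : "no",
                row.getValue()[1] ? "yes" : "no"));
        }
        return out.toString();
    }

    /** Toy driver: each known feature URI is hard-wired to exactly one state. */
    static Configurable fixed(Map<String, Boolean> onlyState) {
        return (id, state) -> {
            Boolean allowed = onlyState.get(id);
            if (allowed == null) throw new Refused("not recognized: " + id);
            if (allowed.booleanValue() != state) {
                throw new Refused("not supported: " + id + "=" + state);
            }
        };
    }
}
```

Calling render(probe(parser)) for each wrapped driver would give one
column pair per parser, which could then be pasted side by side.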


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul at prescod.net  Fri Mar 12 15:51:59 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:09:56 2004
Subject: Namespaces and DTDs
References: <NBBBJPGDLPIHJGEHAKBAAEOFCOAA.martind@netfolder.com>
Message-ID: <36E94F45.46D0E403@prescod.net>

Didier PH Martin wrote:
> 
> a) a Lisp document could be made SGML compliant because SGML can let you
> define begin and end tag's delimiters (Ex: dsssl).

I let this claim pass a couple of times because I didn't consider it
important but now I feel the need to scratch that itch. DSSSL does not
actually use parentheses as tags. If you use nsgmls to look at the SGML
structure of a DSSSL document you will find that all of DSSSL's structure
is actually in omitted tags.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come
equipped with adjustable pedals to fit smaller drivers and sensor 
devices that warn the driver when he or she is about to back into a
Toyota or some other object." -- Dallas Morning News



From simonstl at simonstl.com  Fri Mar 12 16:01:22 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:56 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <14053.51113.676945.877507@localhost.localdomain>
Message-ID: <199903121543.KAA13555@hesketh.net>

After a few months of intense busy-ness (and business), plus a trip to
XTech that I'm still recovering from, I'm finally catching up to ModSAX.
Printed it all out, looked it all over, and mostly I'm very pleased.  I
think it'll make implementing my Layered Model proposal as a Layered
Parser much easier overall.

One key thing is missing at this stage, and that's a feature.

At 08:16 PM 3/9/99 -0500, David Megginson wrote:
>Here's my revised version of the core feature list, based on recent
>discussions:
>
>
>ModSAX Core Features
>--------------------
>
>http://xml.org/sax/features/validation
>  Validate (true) or don't validate (false).
>
>http://xml.org/sax/features/external-general-entities
>  Expand external general entities (true) or don't expand (false).
>
>http://xml.org/sax/features/external-parameter-entities
>  Expand external parameter entities (true) or don't expand (false).
>
>http://xml.org/sax/features/namespaces
>  Preprocess namespaces (true) or don't preprocess (false).  See also
>  the http://xml.org/sax/properties/namespace-sep property.
>
>http://xml.org/sax/features/normalize-text
>  Ensure that all consecutive text is returned in a single callback to
>  DocumentHandler.characters or DocumentHandler.ignorableWhitespace
>  (true) or explicitly do not require it (false).
>

We need:

http://xml.org/sax/features/external-subset

   Requires the parser to load the external subset of the DTD and process
it.  (External parameter entities remain a separate issue referenced by a
separate feature.)

This is critically important for attribute defaulting, which makes things
like XLink much much simpler.  At one point I switched parsers, only to
find that my attribute values in the external subset had all disappeared.
I promptly jumped back.  The spec (5.1) allows non-validating parsers to
skip the external subset; I'd very much like to have a way to tell the
parser not to skip it, or at least to know whether it is in fact being
skipped.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com



From jtauber at jtauber.com  Fri Mar 12 16:45:04 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:56 2004
Subject: Basic Question
Message-ID: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: Dan Rudman <rudman@idetix.com>
>With the wealth of XML libraries available, I am more and more inclined to
>make use of these libraries to help me create, parse, and utilize my own
>tag markup language to be embedded within an HTML document.  My
>understanding of XML at this point is that it must be well-formed or a
>fatal error occurs.


Yes, this is correct.

>If this is the case, how can I deal with the fact that most HTML documents
>are NOT well-formed and that most HTML design tools do not enforce,
>require, or even sometimes support, well-formedness in a document?


You might try Tidy as the initial step. Tidy can take bad HTML and spit out
XML that could then be parsed by any XML parser.

See http://www.w3.org/People/Raggett/tidy/

Hope this helps.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8 : http://www8.org/

Maintainer of : www.xmlinfo.com,  www.xmlsoftware.com and www.schema.net




From bckman at ix.netcom.com  Fri Mar 12 16:51:41 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:09:56 2004
Subject: Basic Question
Message-ID: <008201be6ca8$3b56c380$32acdccf@ix.netcom.com>

Dan,
Use XHTML, which is well-formed XML.

Frank
----- Original Message -----
From: Dan Rudman <rudman@idetix.com>
To: 'XML-DEV' <xml-dev@ic.ac.uk>
Sent: Friday, March 12, 1999 8:59 AM
Subject: Basic Question


>I apologize for the basic question in advance :)
>
>
>With the wealth of XML libraries available, I am more and more inclined to
>make use of these libraries to help me create, parse, and utilize my own
>tag markup language to be embedded within an HTML document.  My
>understanding of XML at this point is that it must be well-formed or a
>fatal error occurs.  If this is the case, how can I deal with the fact
>that most HTML documents are NOT well-formed and that most HTML design
>tools do not enforce, require, or even sometimes support, well-formedness
>in a document?
>
>Things would be rosy if I didn't have to rely on HTML, but my application
>requires it.
>
>Thanks.
>
>-- Dan
>
>
>




From larsga at ifi.uio.no  Fri Mar 12 16:56:18 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:09:56 2004
Subject: Mod??SAX: Revised Proposed Core Features 1999-03-12
In-Reply-To: <14057.7534.612787.424789@localhost.localdomain>
References: <14057.7534.612787.424789@localhost.localdomain>
Message-ID: <wkbthylkar.fsf@ifi.uio.no>


* David Megginson
|
| I've added one new feature -- http://xml.org/sax/features/use-locator
| [...]
| What does everyone else think?

Good one! I'm in favour.
 
--Lars M.




From cowan at locke.ccil.org  Fri Mar 12 17:13:12 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:56 2004
Subject: ModSAX: Proposed Core Features
References: <199903121543.KAA13555@hesketh.net>
Message-ID: <36E94AF4.8A154C60@locke.ccil.org>

Simon St.Laurent wrote:

>    Requires the parser to load the external subset of the DTD and process
> it.  (External parameter entities remain a separate issue referenced by a
> separate feature.)

I don't see why it should be.  I think that parsers will either
process just the internal subset, or will load all external DTD
parts, including both the external subset and the external
parameter entities.  (Ignoring the internal subset is *not* an
option, of course, except for DPH parsers.)

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From elharo at metalab.unc.edu  Fri Mar 12 17:40:28 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
Message-ID: <36E97AED.3F44F7D1@metalab.unc.edu>

From the XML spec, section 3.1:

"Empty-element tags may be used for any element which has no content,
whether or not it is declared using the keyword EMPTY. For
interoperability, the empty-element tag must be used, and can only be
used, for elements which are declared EMPTY."

1. The "can only be used" part of the second sentence seems to
contradict the first sentence.

2. "the empty-element tag must be used...for elements which are declared
EMPTY" seems to contradict the assertion that <NAME></NAME> and <NAME/>
are the same thing.

Is there any way out of this conundrum?

--
Elliotte Rusty Harold



From indiketr at churchill.co.uk  Fri Mar 12 17:48:21 1999
From: indiketr at churchill.co.uk (Rajeeva Indiketiya)
Date: Mon Jun  7 17:09:57 2004
Subject: unsubscribe xml-dev
In-Reply-To: <003b01be6705$d5e059a0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <Pine.SV4.4.02.9903101105580.15441-100000@chilli>

unsubscribe xml-dev




From david at megginson.com  Fri Mar 12 18:03:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:57 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <199903121543.KAA13555@hesketh.net>
References: <14053.51113.676945.877507@localhost.localdomain>
	<199903121543.KAA13555@hesketh.net>
Message-ID: <14057.21849.286930.749213@localhost.localdomain>

Simon St.Laurent writes:

 > We need:
 > 
 > http://xml.org/sax/features/external-subset

I agree that this functionality is required.  The question is whether
there is a strong case for making inclusion of the external DTD subset
separately configurable from the inclusion of external parameter
entities in general.

I'd suggest not.  Consider the following:

  <!DOCTYPE doc [
    <!-- other stuff -->
    <!ENTITY % dtd SYSTEM "doc.dtd">
    %dtd;
  ]>

and

  <!DOCTYPE doc SYSTEM "doc.dtd" [
    <!-- other stuff -->
  ]>

Except for the extra "%dtd" entry in the entity name table, these two
document type declarations look to me to be exactly equivalent; as a
matter of fact, I've always considered the second to be simply a
short-hand for the first.
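
One rough way to check the claimed equivalence is to feed both forms to
a parser and compare an attribute default that only doc.dtd declares.
The sketch below uses the JAXP DOM parser bundled with current JDKs,
which postdates this thread and is used here purely as a test bench, not
as a ModSAX implementation. The DTD text, the "kind" attribute, and the
class name are invented for illustration, and the first form is assumed
to reference the entity with %dtd; so that it is actually included.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

final class DoctypeForms {
    /** Made-up content for doc.dtd; it supplies a default for the "kind" attribute. */
    static final String DTD =
        "<!ELEMENT doc EMPTY>\n<!ATTLIST doc kind CDATA \"memo\">\n";

    /** Form 1: doc.dtd pulled in as an external parameter entity. */
    static final String VIA_ENTITY =
        "<!DOCTYPE doc [\n"
        + "  <!ENTITY % dtd SYSTEM \"doc.dtd\">\n"
        + "  %dtd;\n"
        + "]>\n<doc/>";

    /** Form 2: doc.dtd named as the external DTD subset. */
    static final String VIA_SUBSET =
        "<!DOCTYPE doc SYSTEM \"doc.dtd\" [\n]>\n<doc/>";

    /** Parses without validation and returns the value of doc's "kind" attribute. */
    static String defaultedKind(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Serve doc.dtd from memory so the example touches no files or network.
            builder.setEntityResolver(
                (publicId, systemId) -> new InputSource(new StringReader(DTD)));
            Document document = builder.parse(new InputSource(new StringReader(xml)));
            return document.getDocumentElement().getAttribute("kind");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

If the two declarations really are equivalent for this parser,
defaultedKind(VIA_ENTITY) and defaultedKind(VIA_SUBSET) return the same
string. A parser that skipped external parameter entities, or skipped
the external subset, would return an empty string for the affected form.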

James Clark made a convincing case for separating the inclusion of
external general entities from the inclusion of external parameter
entities.  Can anyone make a convincing case for separating the
inclusion of external parameter entities from the inclusion of the
external DTD subset?


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Fri Mar 12 18:09:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:09:57 2004
Subject: Basic Question
In-Reply-To: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>
References: <009601be6ca6$cbe05440$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <14057.22147.177213.132967@localhost.localdomain>

Dan Rudman <rudman@idetix.com> writes:

 [on XML's well-formedness constraints]

 > If this is the case, how can I deal with the fact that most HTML
 > documents are NOT well-formed and that most HTML design tools do
 > not enforce, require, or even sometimes support, well-formedness in
 > a document?

You'd best keep the two separate.  Try including the following in the
HTML:

  <link rel="whatever" type="text/xml" href="mystuff.xml">

Now, the HTML can stay as it is, and the XML can be properly
well-formed.  This approach is already best practice for including CSS
stylesheets (using <link>) and ECMA scripts (using <script>), and it
also has the twin advantages that many HTML documents can share the
same XML if necessary, and that you can update the XML information
without messing up the HTML pages.

There's nothing fancy about this approach -- even naive web designers
already use it whenever they include graphics on a web page (and they
often allow several pages to share the same JPEG logo, for example).
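
A consumer of such a page might work along these lines: scan the HTML
leniently for the one <link> tag, then hand only the linked resource to
a strict XML parser. This is a sketch under assumptions. The class and
method names are invented, the regular expression assumes the type
attribute precedes href as in the example above, and the JAXP parser
from current JDKs stands in for whatever XML library is actually used.

```java
import java.io.StringReader;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;

final class LinkedXml {
    // Deliberately loose: the HTML is never parsed as XML, only scanned for one tag.
    private static final Pattern LINK = Pattern.compile(
        "<link\\b[^>]*\\btype\\s*=\\s*\"text/xml\"[^>]*\\bhref\\s*=\\s*\"([^\"]*)\"",
        Pattern.CASE_INSENSITIVE);

    /** Returns the href of the first text/xml link, if the page has one. */
    static Optional<String> xmlHref(String html) {
        Matcher matcher = LINK.matcher(html);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    /** Strict check applied to the linked resource only, never to the HTML. */
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

The HTML can stay as sloppy as the authoring tool makes it, because
only the document behind the href ever reaches the XML parser.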


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From stark at uplanet.com  Fri Mar 12 18:11:44 1999
From: stark at uplanet.com (Peter Stark)
Date: Mon Jun  7 17:09:57 2004
Subject: Basic Question
In-Reply-To: <008201be6ca8$3b56c380$32acdccf@ix.netcom.com>
Message-ID: <002201be6cb3$c82bc4d0$07c3c6c3@sluk.uplanet.com>

Or try HTML Tidy at:
http://www.w3.org/People/Raggett/tidy/

Peter Stark

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Frank Boumphrey
> Sent: Friday, March 12, 1999 8:49 AM
> To: Dan Rudman; 'XML-DEV'
> Subject: Re: Basic Question
>
>
> Dan,
> use XHTML which is well formed xml
>
> Frank
> ----- Original Message -----
> From: Dan Rudman <rudman@idetix.com>
> To: 'XML-DEV' <xml-dev@ic.ac.uk>
> Sent: Friday, March 12, 1999 8:59 AM
> Subject: Basic Question
>
>
> >I apologize for the basic question in advance :)
> >
> >
> >With the wealth of XML libraries available, I am more and more inclined
> >to make use of these libraries to help me create, parse, and utilize my
> >own tag markup language to be embedded within an HTML document.  My
> >understanding of XML at this point is that it must be well-formed or a
> >fatal error occurs.  If this is the case, how can I deal with the fact
> >that most HTML documents are NOT well-formed and that most HTML design
> >tools do not enforce, require, or even sometimes support,
> >well-formedness in a document?
> >
> >Things would be rosy if I didn't have to rely on HTML, but my application
> >requires it.
> >
> >Thanks.
> >
> >-- Dan
> >
> >
> >




From simonstl at simonstl.com  Fri Mar 12 18:18:02 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:57 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <36E94AF4.8A154C60@locke.ccil.org>
References: <199903121543.KAA13555@hesketh.net>
Message-ID: <199903121816.NAA17031@hesketh.net>

At 12:12 PM 3/12/99 -0500, John Cowan wrote:
>Simon St.Laurent wrote:
>
>>    Requires the parser to load the external subset of the DTD and process
>> it.  (External parameter entities remain a separate issue referenced by a
>> separate feature.)
>
>I don't see why it should be.  I think that parsers will either
>process just the internal subset, or will load all external DTD
>parts, including both the external subset and the external
>parameter entities.  (Ignoring the internal subset is *not* an
>option, of course, except for DPH parsers.)

I suppose it doesn't _have_ to be, and David's message earlier this morning
combined them.  (I wrote my message before I got his message.)

Still, I can imagine that it might well be useful to privilege the external
subset's initial contents, without retrieving stacks of validation
information stored in external parameter entities.  An XLink application,
for example, might not care about retrieving and analyzing lots of element
declarations when all it really needs is the attribute declarations for
defaulting.  A mechanism like this might be useful in such a context - put
attribute declarations in the ext subset, element declarations in a file
referenced by PEs, and go.  Validating parsers would get all of it, while
non-validating parsers could pick out the parts they need. You could do the
same thing with the internal subset, but frankly I'd rather not use the
internal subset for anything I can avoid - management of an external subset
is _much_ easier.

(I don't think the spec is clear on whether a non-validating parser that
has read the external subset is then required to go get PE values; I
suspect it doesn't have to.)

The example may feel a little far-fetched.  In any case, it's an argument
for keeping separate the features that are described in different parts of
the spec, which seems to me like a good way to proceed in general.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com



From crism at oreilly.com  Fri Mar 12 18:20:16 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
In-Reply-To: <36E97AED.3F44F7D1@metalab.unc.edu> (message from Elliotte Rusty
	Harold on Fri, 12 Mar 1999 12:37:01 -0800)
Message-ID: <199903121817.NAA29485@ruby.ora.com>

[Elliotte Rusty Harold]
> "Empty-element tags may be used for any element which has no
> content, whether or not it is declared using the keyword EMPTY. For
> interoperability, the empty-element tag must be used, and can only
> be used, for elements which are declared EMPTY."
> 
> Is there any way out of this conundrum?

See the definition of "for interoperability" in §1.2.

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From Michael.Kay at icl.com  Fri Mar 12 18:21:39 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
Message-ID: <93CB64052F94D211BC5D0010A80013310EB379@WWMESS3.172.19.125.2>

> From: Elliotte Rusty Harold
> "Empty-element tags may be used for any element which has no content,
> whether or not it is declared using the keyword EMPTY. For
> interoperability, the empty-element tag must be used, and can only be
> used, for elements which are declared EMPTY."
> 
> 1. The "can only be used" part of the second sentence seems to
> contradict the first sentence.
> 
> 2. "the empty-element tag must be used...for elements which
> are declared EMPTY" seems to contradict the assertion that <NAME></NAME>
> and <NAME/> are the same thing.
> 
> Is there any way out of this conundrum?
> 
Yes. Section 1.2, terminology, says that all sentences beginning with "For
interoperability" can be safely ignored unless you are interested in getting
your document through a piece of software that wasn't written to process
XML.

Mike Kay



From richard at goon.stg.brown.edu  Fri Mar 12 18:41:54 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
References: <36E97AED.3F44F7D1@metalab.unc.edu>
Message-ID: <36E95F8D.C24E29DB@goon.stg.brown.edu>

Elliotte Rusty Harold wrote:
> 
> >From the XML spec, section 3.1:
> 
> "Empty-element tags may be used for any element which has no content,
> whether or not it is declared using the keyword EMPTY. For
> interoperability, the empty-element tag must be used, and can only be
> used, for elements which are declared EMPTY."
>
> ...Is there any way out of this conundrum?

Yes.  Note what the standard means by "for interoperability".  It's not
a mandatory constraint.  So you can use <tag/> in place of <tag></tag>
for elements that aren't declared EMPTY.  But a good validator will at
least issue a warning when it sees <tag/> used for such an element.
Typically this warning can (and should) be ignored.
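
As a small check of the equivalence, the sketch below records the SAX
events that Python's standard xml.sax module reports for each spelling.
The Recorder class and the sax_events helper are names made up for this
example; the point is only that an XML processor hands the application
the same start/end pair either way.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Collect element and character events in the order they arrive."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def sax_events(text):
    """Parse a document held in a string and return the recorded events."""
    handler = Recorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events


# Both spellings produce one start event followed by one end event.
print(sax_events("<tag/>") == sax_events("<tag></tag>"))
```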

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From paul at prescod.net  Fri Mar 12 18:57:17 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
References: <36E97AED.3F44F7D1@metalab.unc.edu>
Message-ID: <36E97A55.E8740937@prescod.net>

Elliotte Rusty Harold wrote:
> 
> >From the XML spec, section 3.1:
> 
> "Empty-element tags may be used for any element which has no content,
> whether or not it is declared using the keyword EMPTY. For
> interoperability, the empty-element tag must be used, and can only be
> used, for elements which are declared EMPTY."
> 
> ...
> 
> Is there any way out of this conundrum?

Yes, look at the definition for "interoperability." The paragraph above
says "If you want to be legal XML do this. If you want to be compatible
with old SGML tools also do this."

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"The Excursion [Sport Utility Vehicle] is so large that it will come
equipped with adjustable pedals to fit smaller drivers and sensor 
devices that warn the driver when he or she is about to back into a
Toyota or some other object." -- Dallas Morning News




From cowan at locke.ccil.org  Fri Mar 12 19:22:34 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
References: <93CB64052F94D211BC5D0010A80013310EB379@WWMESS3.172.19.125.2>
Message-ID: <36E96923.6FE96D54@locke.ccil.org>

Kay Michael wrote:

> Yes. Section 1.2, terminology, says that all sentences beginning with "For
> interoperability" can be safely ignored unless you are interested in getting
> your document through a piece of software that wasn't written to process
> XML.

I have already bitched to xml-editor@w3.org about the use of
"For interoperability" and "must" in the same sentence, suggesting
that "should" is the correct modal verb.  All other uses of "F.i."
use "should".

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From digitome at iol.ie  Fri Mar 12 19:25:08 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:09:57 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <14057.21849.286930.749213@localhost.localdomain>
References: <199903121543.KAA13555@hesketh.net>
 <14053.51113.676945.877507@localhost.localdomain>
 <199903121543.KAA13555@hesketh.net>
Message-ID: <3.0.6.32.19990312191256.0098c080@gpo.iol.ie>

At 01:01 PM 3/12/99 -0500, you wrote:
>Simon St.Laurent writes:
>
> > We need:
> > 
> > http://xml.org/sax/features/external-subset
>
>I agree that this functionality is required.  The question is whether
>there is a strong case for making inclusion of the external DTD subset
>separately configurable from the inclusion of external parameter
>entities in general.
>
[Dave Megginson]
>I'd suggest not.  Consider the following:
>
>  <!DOCTYPE doc [
>    <!-- other stuff -->
>    <!ENTITY % dtd SYSTEM "doc.dtd">
>    %dtd;
>  ]>
>
>and
>
>  <!DOCTYPE doc SYSTEM "doc.dtd" [
>    <!-- other stuff -->
>  ]>
>
>Except for the extra "%dtd" entry in the entity name table, these two
>document type declarations look to me to be exactly equivalent; as a
>matter of fact, I've always considered the second to be simply a
>short-hand for the first.
>

In the former, the doc.dtd entity is part of the internal document
type declaration subset. In the latter it is part of the
external document type declaration subset.

So, if you have conditional sections in doc.dtd then they are
not the same.

regards,

<Sean uri="http://www.digitome.com/sean.htm"/>




From jtauber at jtauber.com  Fri Mar 12 19:29:16 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:57 2004
Subject: empty tags and the XMl 1.0 spec
Message-ID: <025501be6cbe$1c018400$0300000a@othniel.cygnus.uwa.edu.au>

>"Empty-element tags may be used for any element which has no content,
>whether or not it is declared using the keyword EMPTY. For
>interoperability, the empty-element tag must be used, and can only be
>used, for elements which are declared EMPTY."
>
>1. The "can only be used" part of the second sentence seems to
>contradict the first sentence.
>
>2. "the empty-element tag must be used...for elements which are declared
>EMPTY" seems to contradict the assertion that <NAME></NAME> and <NAME/>
>are the same thing.
>
>Is there any way out of this conundrum?


Yes. The term "for interoperability" means it's not a requirement for
well-formed XML, but *if* you want to maintain interoperability with
pre-WebSGML SGML processors, then you should do it.

In other words, <NAME></NAME> is equivalent to <NAME/> *UNLESS* you want to
maintain interoperability with older SGML systems, in which case you should
use <NAME/> for elements declared EMPTY, and only for those, rather than
<NAME></NAME>.

Hope this helps.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8 : http://www8.org/

Maintainer of : www.xmlinfo.com,  www.xmlsoftware.com and www.schema.net




From cowan at locke.ccil.org  Fri Mar 12 19:34:14 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:57 2004
Subject: Partial DTDs (was: ModSAX: Proposed Core Features)
References: <199903121543.KAA13555@hesketh.net> <199903121816.NAA17031@hesketh.net>
Message-ID: <36E96BF3.380385C8@locke.ccil.org>

Simon St.Laurent wrote:

> Still, I can imagine that it might well be useful to privilege the external
> subset's initial contents, without retrieving stacks of validation
> information stored in external parameter entities.

Actually it's quite tricky to do that correctly, although the XML
spec is silent on just how.

In particular, you must be very careful about processing parts of the
DTD that follow a reference to a PE that you do not expand.  For
one thing, further entity declarations may have been overridden
and must be ignored, possibly leading to further troubles.
For another, conditional sections where the keyword is an
unknown PE reference must be treated as IGNORE.

The simplest approach is probably the approach taken in the
internal subset:  Ignore everything after the first uninterpretable
PE reference.  Is this really useful?  XHTML, for example,
loads its external PEs (lists of HTML general entities for
characters) almost the first thing.
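
The "ignore everything after the first uninterpretable PE reference"
rule can be sketched as follows. This is a toy under stated assumptions:
the DTD is taken to be already split into a list of strings, a PE
reference is recognised purely by its "%name;" shape, and the expand
callback is a hypothetical hook that returns the replacement
declarations or None when the entity was not read. Nested references
inside a replacement are not handled.

```python
def usable_declarations(items, expand):
    """Keep declarations up to the first PE reference that cannot be expanded.

    items  -- markup declarations and PE references, in document order
    expand -- callable mapping a reference such as "%name;" to a list of
              declarations, or to None if the entity could not be read
    """
    kept = []
    for item in items:
        if item.startswith("%") and item.endswith(";"):
            replacement = expand(item)
            if replacement is None:
                break              # nothing after this point is trusted
            kept.extend(replacement)
        else:
            kept.append(item)
    return kept


# Hypothetical input: one reference can be expanded, the next cannot.
SUBSET = [
    '<!ENTITY a "1">',
    "%known;",
    '<!ATTLIST x y CDATA "z">',
    "%unknown;",
    '<!ENTITY b "2">',
]
KNOWN = {"%known;": ['<!ENTITY c "3">']}

print(usable_declarations(SUBSET, KNOWN.get))
```

With a DTD that pulls in its external PEs near the top, as in the XHTML
case mentioned above, almost nothing survives the cut.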

> An XLink application,
> for example, might not care about retrieving and analyzing lots of element
> declarations when all it really needs is the attribute declarations for
> defaulting.  A mechanism like this might be useful in such a context - put
> attribute declarations in the ext subset, element declarations in a file
> referenced by PEs, and go.  Validating parsers would get all of it, while
> non-validating parsers could pick out the parts they need.

Well, yes, but what hope is there that people will structure their
DTDs in this oddball way?  To do so messily separates element
declarations from their corresponding attribute declarations for
the sake of an implementation hack.  I sure wouldn't do it if I
had any hope of keeping the ELEMENT and ATTLIST declarations
in sync.

> You could do the
> same thing with the internal subset, but frankly I'd rather not use the
> internal subset for anything I can avoid - management of an external subset
> is _much_ easier.

The internal subset is good for things like document-specific
internal general entities.
 
> (I don't think the spec is clear on whether a non-validating parser that
> has read the external subset is then required to go get PE values; I
> suspect it doesn't have to.)

No, it doesn't.  An NVP is privileged to not read any and all external
entities with the sole exception of the document entity, which it
must read.

In practice, however, I suspect that all NVPs fall into one of the
following four classes:

1) Read only the document entity.

2) Read the whole DTD but no external general entities.

3) Read all external general entities, but only process
the internal DTD subset.

4) Read all external entities (except unparsed entities).
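
For illustration, the four classes can be written down as settings of
two feature flags. The sketch uses the external-general-entities and
external-parameter-entities feature names as Python's xml.sax.handler
module defines them; treating the external subset as governed by the
parameter-entity flag is an assumption of this sketch, and it is exactly
the point under discussion. NVP_CLASSES and configure are names invented
here.

```python
from xml.sax import (
    SAXNotRecognizedException,
    SAXNotSupportedException,
    make_parser,
)
from xml.sax.handler import feature_external_ges, feature_external_pes

# One entry per class in the list above (assumed mapping, see lead-in).
NVP_CLASSES = {
    1: {feature_external_ges: False, feature_external_pes: False},
    2: {feature_external_ges: False, feature_external_pes: True},
    3: {feature_external_ges: True, feature_external_pes: False},
    4: {feature_external_ges: True, feature_external_pes: True},
}


def configure(reader, nvp_class):
    """Try to apply one class's settings to a SAX reader.

    Returns a dict mapping each feature name to True if the reader
    accepted the requested value and False if it refused.
    """
    accepted = {}
    for feature, wanted in NVP_CLASSES[nvp_class].items():
        try:
            reader.setFeature(feature, wanted)
            accepted[feature] = True
        except (SAXNotRecognizedException, SAXNotSupportedException):
            accepted[feature] = False
    return accepted
```

Calling configure(make_parser(), n) for each n shows which classes a
given reader is willing to be; a reader that refuses a setting raises
one of the two SAX exceptions caught above.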
 
-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From andrewl at microsoft.com  Fri Mar 12 19:36:03 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:09:57 2004
Subject: XML-DATA
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF107@RED-MSG-08>

Ron Bourret summarized the current state of XML schema languages, and
included the statement "Outside of the Microsoft parser, XML Data is
probably dead."

The current state of this is:

The Microsoft Internet Explorer 5.0 includes an implementation of XML-Data.
(See http://www.microsoft.com/xml .) This is a fully-functional "technology
preview" -- meaning that it is a correct, tested implementation of XML-Data
Reduced, but that it is meant to demonstrate the viability of and allow
customers to get experience with the concepts involved in an XML-based
schema language while the W3C process progresses, which is a good lead to
the next topic...

The W3C XML group currently has an activity to examine the features needed
in a new schema language for XML.  You can find this described at
http://www.w3.org/XML/Activity .




From chris at w3.org  Fri Mar 12 19:36:27 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:09:58 2004
Subject: Intro message stale
Message-ID: <36E96BBF.1B8C98B9@w3.org>

The intro message could confuse new subscribers. It says

> The XML spec is still being actively developed and the latest draft is
> at http://www.textuality.com/sgml-erb/WD-xml.html

Which is untrue, since following that link takes you to a Working Draft
dated 30-Jun-97, whereas the actual latest spec is the W3C Recommendation
dated 10-February-1998.

It also points you to an old draft of XLink, dated June-30-97.
The current latest spec is twofold: the XLink and XPointer WDs, both
dated 3-March-1998.

I know, everyone on this list already knew that. People just joining
however might be confused.

URLs for the latest documents (to aid in updating the welcome message):

http://www.w3.org/TR/REC-xml
http://www.w3.org/TR/REC-xml-names/
http://www.w3.org/TR/PR-xml-stylesheet
http://www.w3.org/TR/WD-xml-fragment
http://www.w3.org/TR/WD-xlink
http://www.w3.org/TR/WD-xptr
http://www.w3.org/XML/

--
Chris
[Attachment scrubbed by the archive: the Majordomo "Welcome to xml-dev"
message, dated Fri, 12 Mar 1999 08:26:31 +0000, 3302 bytes.
Url: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990312/28428792/attachment.eml]
From simonstl at simonstl.com  Fri Mar 12 19:42:00 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:58 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <14057.21849.286930.749213@localhost.localdomain>
References: <199903121543.KAA13555@hesketh.net>
 <14053.51113.676945.877507@localhost.localdomain>
 <199903121543.KAA13555@hesketh.net>
Message-ID: <199903121940.OAA19086@hesketh.net>

At 01:01 PM 3/12/99 -0500, David Megginson wrote:
>James Clark made a convincing case for separating the inclusion of
>external general entities from the inclusion of external parameter
>entities.  Can anyone make a convincing case for separating the
>inclusion of external parameter entities from the inclusion of the
>external DTD subset?

We appear to be leapfrogging each other's messages.  See my argument in my
previous message at
http://www.lists.ic.ac.uk/hypermail/xml-dev/9903/0336.html.

I present one case, though basically I'll admit that it comes down to:
"The spec calls these two different things.  Let's continue to call them
two different things to minimize confusion and maximize flexibility."

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com



From cowan at locke.ccil.org  Fri Mar 12 19:49:01 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:58 2004
Subject: ModSAX: Proposed Core Features
References: <199903121543.KAA13555@hesketh.net>
	 <14053.51113.676945.877507@localhost.localdomain>
	 <199903121543.KAA13555@hesketh.net> <3.0.6.32.19990312191256.0098c080@gpo.iol.ie>
Message-ID: <36E96F37.9FF4EED1@locke.ccil.org>

Sean Mc Grath wrote:

> In the former, the doc.dtd entity is part of the internal document
> type declaration subset. In the latter it is part of the
> external document type declaration subset.
> 
> So, if you have conditional sections in doc.dtd then they are
> not the same.

Not so.  The text of the WFC "PEs in Internal Subset" reads:

# In the internal DTD subset, parameter-entity references can
# occur only where markup declarations can
# occur, not within markup declarations. (This does not
# apply to references that occur in external parameter
# entities or to the external subset.)
# 
# Like the internal subset, the external subset and any
# external parameter entities referred to in the DTD
# must consist of a series of complete markup declarations
# of the types allowed by the non-terminal symbol
# markupdecl, interspersed with white space or
# parameter-entity references. However, portions of the
# contents of the external subset or of external
# parameter entities may conditionally be ignored by using the
# conditional section construct; this is not allowed in
# the internal subset.

So the internal subset's special restrictions on external PE
references (and conditional sections) apply only to the text
actually within the DOCTYPE declaration, not to external PEs
that are referred to from there.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From digitome at iol.ie  Fri Mar 12 20:08:45 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:09:58 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <3.0.6.32.19990312195838.00996930@gpo.iol.ie>

At 02:47 PM 3/12/99 -0500, you wrote:
>Sean Mc Grath wrote:
>
>> In the former, the doc.dtd entity is part of the internal document
>> type declaration subset. In the latter it is part of the
>> external document type declaration subset.
>> 
>> So, if you have conditional sections in doc.dtd then they are
>> not the same.
>
>Not so.  The text of the WFC "PEs in Internal Subset" reads:
>
>...
>So the internal subset's special restrictions on external PE
>references (and conditional sections) apply only to the text
>actually within the DOCTYPE declaration, not to external PEs
>that are referred to from there.
>
My point is that you cannot switch from:
	<!DOCTYPE foo SYSTEM "doc.dtd">
to:
	<!DOCTYPE foo [
	<!ENTITY % dtd SYSTEM "doc.dtd">
	%dtd;
	]>

and expect things to be exactly the same. If you have
conditional sections in doc.dtd, the latter is illegal.

regards,
Sean

<Sean uri="http://www.digitome.com/sean.htm"/>




From simonstl at simonstl.com  Fri Mar 12 20:23:35 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:09:58 2004
Subject: Partial DTDs (was: ModSAX: Proposed Core Features)
In-Reply-To: <36E96BF3.380385C8@locke.ccil.org>
References: <199903121543.KAA13555@hesketh.net>
 <199903121816.NAA17031@hesketh.net>
Message-ID: <199903122022.PAA19957@hesketh.net>

At 02:33 PM 3/12/99 -0500, John Cowan wrote:
>No, it doesn't.  An NVP is privileged to not read any and all external
>entities with the sole exception of the document entity, which it
>must read.
>
>In practice, however, I suspect that all NVPs fall into one of the
>following four classes:
>
>1) Read only the document entity.
>
>2) Read the whole DTD but no external general entities.
>
>3) Read all external general entities, but only process
>the internal DTD subset.
>
>4) Read all external entities (except unparsed entities).

I think 4 should read:

4) Read the document, the internal subset, the external subset, and all
external entities except unparsed entities.

(In my experience with Aelfred, that's what it does, though I'll admit to
not actually using entities in 'real' practice.)

I guess the question comes down to whether your interpretation of current
practice is reason enough to combine the external subset and external
entities in this situation.  Given the point Sean McGrath just raised about
things (INCLUDE/IGNORE, but also certain kinds of PE processing) that are
legal in the external DTD subset but not in the internal subset or external
PEs referenced from the internal subset, I'd say to keep these things
separate.

Of course, we could create a feature that simply referred to external
resources, combining both properties, on top of the external PEs/external
subset features.  Might be the best of all worlds, though it would take
some clear documentation identifying what happened if you used both the
convenience feature and its sub-parts.

Simon St.Laurent
XML: A Primer / Building XML Applications (April)
Sharing Bandwidth / Cookies
http://www.simonstl.com



From cowan at locke.ccil.org  Fri Mar 12 22:17:42 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:58 2004
Subject: ModSAX: Proposed Core Features
References: <199903121543.KAA13555@hesketh.net>
	 <14053.51113.676945.877507@localhost.localdomain>
	 <199903121543.KAA13555@hesketh.net>
	 <3.0.6.32.19990312191256.0098c080@gpo.iol.ie> <3.0.6.32.19990312195804.00995720@gpo.iol.ie>
Message-ID: <36E99267.73C16A7@locke.ccil.org>

Sean Mc Grath wrote:

> My point is that you cannot switch from:
>         <!DOCTYPE foo SYSTEM "doc.dtd">
> to:
>         <!DOCTYPE foo [
>         <!ENTITY % dtd SYSTEM "doc.dtd">
>         %dtd;
>         ]>
> 
> and expect things to be exactly the same. If you have
> conditional sections in doc.dtd, the latter is illegal.

I understand what you mean.  I believe you are not correct.
Conditional sections are allowed in external PEs referenced
from the internal subset, just not in the internal subset itself.
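
One way to see how a particular parser treats the two cases is to try
them. The sketch below drives expat through Python's xml.parsers.expat
module; the contents of doc.dtd are invented (a single conditional
section), and the resolve handler simply serves that text for any
external parameter entity. It shows what this one parser does, which is
evidence about an implementation rather than a ruling on the spec.

```python
from xml.parsers import expat

# Hypothetical content standing in for doc.dtd: one conditional section.
DOC_DTD = '<![INCLUDE[<!ATTLIST foo kind CDATA "demo">]]>'

# Conditional section reached through an external PE named in the
# internal subset.
VIA_EXTERNAL_PE = """<!DOCTYPE foo [
<!ENTITY % dtd SYSTEM "doc.dtd">
%dtd;
]>
<foo/>"""

# The same conditional section written directly in the internal subset.
DIRECTLY_IN_INTERNAL_SUBSET = """<!DOCTYPE foo [
<![INCLUDE[<!ATTLIST foo kind CDATA "demo">]]>
]>
<foo/>"""


def parse(document):
    """Parse with expat, serving DOC_DTD for any external parameter entity.

    Returns a list of (element name, attributes) pairs; raises
    expat.ExpatError if the document is rejected.
    """
    elements = []
    parser = expat.ParserCreate()
    parser.SetParamEntityParsing(expat.XML_PARAM_ENTITY_PARSING_ALWAYS)

    def resolve(context, base, system_id, public_id):
        child = parser.ExternalEntityParserCreate(context)
        child.Parse(DOC_DTD, True)
        return 1  # non-zero tells expat the entity was handled

    parser.ExternalEntityRefHandler = resolve
    parser.StartElementHandler = lambda name, attrs: elements.append((name, attrs))
    parser.Parse(document, True)
    return elements
```

Calling parse(VIA_EXTERNAL_PE) is expected to succeed, while
parse(DIRECTLY_IN_INTERNAL_SUBSET) is expected to raise
expat.ExpatError.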

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Fri Mar 12 22:20:46 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:58 2004
Subject: Partial DTDs (was: ModSAX: Proposed Core Features)
References: <199903121543.KAA13555@hesketh.net>
	 <199903121816.NAA17031@hesketh.net> <199903122022.PAA19957@hesketh.net>
Message-ID: <36E99314.9ACA605A@locke.ccil.org>

Simon St.Laurent wrote:

> I think 4 should read:
> 
> 4) Read the document, the internal subset, the external subset, and all
> external entities except unparsed entities.

That's what I meant.

> I guess the question comes down to whether your interpretation of current
> practice is reason enough to combine the external subset and external
> entities in this situation.

Not only current practice, but the fact that there isn't even 
current *theory* for interpreting partial DTDs except for the
rules for reading just-the-internal-subset.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From Marc.McDonald at Design-Intelligence.com  Fri Mar 12 22:38:30 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:09:58 2004
Subject: FW: Namespaces and DTDs
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990312223727Z-6417@master.design-intelligence.com>

Actually there is another representation of the information in the DTD 
that is present: the application that uses the document. Unfortunately 
the representation is in C++, Java or some other language. This 
introduces a synchronization problem between the two.

The DOM API for instance gives you access to the parsed document tree, 
but a sizable amount of independent code must be written to 
essentially parse the DOM tree into the form the application needs. 
The result is the structure is in 2 different forms, declarative and 
procedural, which must be kept in sync.
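
A minimal sketch of the duplication, assuming a made-up "order"
vocabulary and Python's xml.dom.minidom: the Order class and the
order_from_dom function restate, procedurally, structure that a DTD for
the same document would state declaratively, and nothing ties the two
together.

```python
from dataclasses import dataclass
from typing import List
from xml.dom.minidom import parseString


@dataclass
class Order:
    """Application-side form of a hypothetical <order> document."""

    customer: str
    items: List[str]


def text_of(node):
    """Concatenate the text nodes directly under an element."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


def order_from_dom(xml_text):
    """Walk the DOM tree and rebuild it as an Order.

    The element and attribute names are hard-coded here, so a change to
    the document structure must be repeated in this function by hand.
    """
    root = parseString(xml_text).documentElement
    items = [text_of(node) for node in root.getElementsByTagName("item")]
    return Order(customer=root.getAttribute("customer"), items=items)


SAMPLE = '<order customer="acme"><item>bolt</item><item>nut</item></order>'
print(order_from_dom(SAMPLE))
```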


Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  len bullard [SMTP:cbullard@hiwaay.net]
Sent:  Thursday, March 11, 1999 9:24 PM
To:  Didier PH Martin
Cc:  'XML Dev'
Subject:  Re: FW: Namespaces and DTDs

Didier PH Martin wrote:
>
> Hi
>
> I am using these simple rules of thumb:
>
> a) a XML DTD is useful for XML editors not for XML renderers
> b) Most XML renderers (XSL, CSS or DSSSL) won't do document validation
> c) a XML interpreter does not need a DTD (something else than rendition)
>
> If I need a DTD at the receiving end, then I am now no longer in the XML
> world but in the SGML world because the receiving end needs a validating
> parser. Several SGML parsers, like for instance SP, can parse XML simplified
> DTD. The only simplification I gained is the -- or -0 thing called omittags.
> Therefore, because I have to include a DTD for validation, better use then a
> SGML format.
>
> However, on the Web, to reduce complexity, I should not assume that the
> receiving end has a validating parser. Thus, because my XML document has
> been validated with my XML editor or by any other validation program, the
> receiving end makes the reasonable assumption that if the document is an
> XML document it is "well formed" and valid.

That's mostly true because web documents don't stick around.  In
cases where information is moving across multiple processes or sits
in some long term archival, it is very handy to be able to validate it
on the receiving end.  This will become more apparent to the XML
community when they get to do the sort of work the SGML community did
a decade after the first SGML applications fielded instances.  Things
change.  Finding those changes quickly is the key to cheap rehosting.
In my experience, if DTDs die, someone gets to reinvent them and it
will be painful.

Otherwise, yes, the DTD is much more useful in the editor in the
initial part of the information lifecycle.

len

xml-dev: A list for W3C XML Developers. To post, 
mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on 
CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following 
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)




From martind at netfolder.com  Sat Mar 13 01:15:44 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:58 2004
Subject: FW: Namespaces and DTDs
In-Reply-To: <c=US%a=_%p=Design_Intellige%l=MASTER-990312223727Z-6417@master.design-intelligence.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEANCPAA.martind@netfolder.com>

Hi Marc,

<YourComment>
Actually there is another representation of the information in the DTD
that is present: the application that uses the document. Unfortunately
the representation is in C++, Java or some other language. This
introduces a synchronization problem between the two.

The DOM api for instance gives you access to the parsed document tree,
but a sizable amount of independent code must be written to
essentially parse the DOM tree into the form the application needs.
The result is the structure is in 2 different forms, declarative and
procedural, which must be kept in sync.
</YourComment>

<Reply>
You are right, but I can construct a DOM without any validation. The whole
point here is: if I need validation at the receiving end, why not use SGML,
which is more elaborate and necessarily needs validation (because of the
possibility of omitted tags)? If, however, we do not need validation at the
receiving end, then we are better off using XML, which, because of its
structure, can be parsed without validation, and then a DOM can be created
for procedural language consumption.

But you are right to say that from the serialized format I have to construct
a model (i.e. a structure) that interpreters can access. The DOM is the XML
way to do it and the grove is the SGML way (the DOM and grove concepts are
similar enough to reduce one to the other).

To be useful, the XML life cycle could be expressed like this:

a) XML format creation: we need a DTD, so that the editor can validate the
document or simply prevent me from creating an invalid document.
b) transport
c) receiving end: interpretation. The interpreter needs a parser. A
validating parser is not necessary with XML. It seems that we have several
kinds of parsers:
	1- event driven
	2- function call within a loop
	3- DOM producer
d) The interpreter knows the semantics and does something.
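
The three kinds of parser in step (c) can be sketched as follows. This is
only an illustration in Python's standard library, not anything from this
thread: the sample document and the function names are made up for the
example, and none of the three parsers is given a DTD.

```python
import xml.sax
from xml.dom import minidom, pulldom

# Hypothetical sample document; note there is no DOCTYPE and no DTD.
DOC = '<memo to="xml-dev"><line>hello</line></memo>'


class ElementLogger(xml.sax.ContentHandler):
    """1- event driven: the parser calls back for every start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def event_driven(text):
    handler = ElementLogger()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.names


def pull_loop(text):
    """2- function call within a loop: the application asks for each event."""
    names = []
    for event, node in pulldom.parseString(text):
        if event == pulldom.START_ELEMENT:
            names.append(node.tagName)
    return names


def dom_producer(text):
    """3- DOM producer: the parser builds the whole tree and hands it over."""
    return minidom.parseString(text).documentElement
```

All three accept the document on well-formedness alone; what the element
names mean (step d) is still left entirely to the interpreter.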

In fact, XML rules do not convey semantics, only syntax. XPointer and XLink
are domain-specific languages that add a semantic layer to XML. XHTML also.

In fact, all these concepts already existed in the SGML world. What we gained
with XML compared to SGML is simpler parsing rules, so simple that
validation is no longer necessary to do a complete parsing operation. The
SGML syntax is trickier because you need to tell the parser that some
elements have no end tag; thus the need for a DTD, whose main function
is to tell the parser some parsing rules, like where an element begins and
ends. So, because of the "well formed" constraint, we gained that parsers
no longer need a DTD to accomplish their task: the rule is clear on how a
markup construct begins and ends.

My conclusion:
what we gained with XML is the fact that a parser does not need to do
validation. Otherwise we are only changing the extension of an SGML document
to XML, going from "mydocument.sgml" to "mydocument.xml" without really
changing anything except some minor modifications in the DTD declaration.
That may be good for marketing reasons but surely not for technical reasons.

</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From Marc.McDonald at Design-Intelligence.com  Sat Mar 13 02:11:47 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:09:58 2004
Subject: FW: Namespaces and DTDs
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990313021044Z-6519@master.design-intelligence.com>

It's quite true that you can have XML that does not require validation 
and that this is commonly done. An exception is the defaulting of the 
value of any attributes of elements in a DTD, which has been mentioned 
in another reply.

You can construct a DOM without validation, but the next step ends up 
being a procedural implementation of picking apart the DOM document 
tree to construct whatever structure the application using DOM 
requires to interpret the document.

I can parse:
  <book title="tale of 2 cities">
    <chapter>
      <para>..</para>
    </chapter>
    <chapter>
        ...
    </chapter>
      ...
  </book>
without a DTD.

But if my application needs to get the information out of the DOM I 
need to write code to:
  Create a representation for Book consisting of a title and chapters
    and get the book from the DOM
  Create a representation for each Chapter and get the chapters from the DOM
  Create a representation for each paragraph in a chapter and get the
    paragraphs from the DOM.
Part of this is what is expressed in the DTD. Wouldn't it be better if 
a system were created that used the DTD on the receiving end to create 
the application representation instead of serializing it back into 
elements and constructing a new tree?
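
To make the duplication concrete, here is a rough sketch of the kind of
hand-written mapping code described above, using Python's minidom. The
class and function names (Book, Chapter, book_from_dom, and so on) are
invented for the illustration, and the sample text is made up.

```python
from dataclasses import dataclass, field
from typing import List
from xml.dom import minidom


@dataclass
class Chapter:
    paragraphs: List[str] = field(default_factory=list)


@dataclass
class Book:
    title: str
    chapters: List[Chapter] = field(default_factory=list)


def child_elements(element, name):
    """Direct child elements with the given tag name."""
    return [n for n in element.childNodes
            if n.nodeType == n.ELEMENT_NODE and n.tagName == name]


def text_of(element):
    """Concatenated text directly inside an element."""
    return "".join(n.data for n in element.childNodes
                   if n.nodeType == n.TEXT_NODE)


def book_from_dom(root):
    # The names "title", "chapter" and "para" repeat what a DTD would
    # already declare; this is the second copy that must be kept in sync.
    book = Book(title=root.getAttribute("title"))
    for chapter_el in child_elements(root, "chapter"):
        paragraphs = [text_of(p) for p in child_elements(chapter_el, "para")]
        book.chapters.append(Chapter(paragraphs))
    return book


SAMPLE = """<book title="tale of 2 cities">
  <chapter><para>first</para></chapter>
  <chapter><para>second</para><para>third</para></chapter>
</book>"""

book = book_from_dom(minidom.parseString(SAMPLE).documentElement)
```

Every structural fact in book_from_dom (a book has a title and chapters,
a chapter has paragraphs) is restated by hand, which is exactly the
synchronization problem with the declarative form.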

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com




From clark.evans at manhattanproject.com  Sat Mar 13 03:06:18 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:09:58 2004
Subject: FW: Namespaces and DTDs
References: <NBBBJPGDLPIHJGEHAKBAEEANCPAA.martind@netfolder.com>
Message-ID: <36E9D50F.3C03C24E@manhattanproject.com>

XML is really nasty for humans to look at; SGML is
definitely better.  SGML is really nasty for computers
to look at; XML is definitely better.

Didier PH Martin wrote:
> 
> The whole point here is: if I need validation at the receiving end why 
> not use SGML which is more elaborate and necessarily need validation 
> (because of the possibility to have omittags). If however, we do not 
> need validation at the receiving end then, we are better to use XML that, 
> because of its structure, can be parsed without validation and then a DOM 
> could be created for procedural language consumption.

I see it this way:

	XML = { SGML - irregular grammar constructs }

Thus, in general, you can't parse an SGML document without
the DTD describing its grammar, whereas you can parse
an XML document without the DTD.

An SGML DTD has two components:

	a) Those parts used as grammar
	b) Those parts dictating structure

whereas an XML DTD has only the latter (structural declarations).

Why is this important?

I was visiting a friend of mine who was using a GUI tool (Netscape
Composer?) to write a web site.  She was very frustrated
because she could not get the page to do what she wanted.
For my next visit, I bought her a copy of the O'Reilly HTML
book.  I sat down with her and showed her how to write HTML
manually...  it was like pulling teeth.  But then, after about
15 minutes (the first 10 were spent fighting...) she said:
"Oh ya, I got it.  Is that the trick?"   I said yes.
Then she went back to Composer and continued to work. *sigh*
When she had a problem again (table alignment), I helped her
re-write the page (I refuse to edit stuff from those tools)
from scratch.  It didn't take long.   Once again, we spent
about 15 min fighting about it, but then, in about 10 minutes,
we whipped up a pretty table.  After that she went back to
Composer....

Well.  I thought that I had completely failed, so I left.
Then, about two weeks later I went over to visit (I hadn't
received any more pleas for HTML help...) and I found her
using an editor to hand-create the HTML! I was a bit
stunned.   She said writing HTML in an editor directly
was "easier".  She quickly added that Composer is good
too, but only to "find what I want".  She uses it to
'draw' what she wants, looks at the 'view source' and
then ALT-TABs over to the editor to do the 'real' work.

Anyway, seeing this, I tried another experiment.

I asked her to fill out an invoice.  She went to a web
form, filled it out, pressed enter, and it showed her
the XML.

She then went over to the EDITOR (showing off) and
put in a correct SGML rendition of it, neglecting
all of the 'obvious' end-tags.

SUMMARY:

  I don't see XML being used directly by humans, 
  however, I do see SGML in use several years
  from now replacing data entry forms.

Thus, 

  Human -> Form -> Form Processor -> XML -> XML Validation -> Business Processes

becomes...

  Human -> SGML -> SGML Expansion -> XML -> XML Validation -> Business Processes


Once it's in XML, it can be validated
for the business processes it participates in.

I see SGML being used in business processing not
as a validator, but as a way to introduce the
shortcuts necessary for productive data entry
personnel to get the business information
'the way they want it'.

Then XML can be used where it shines, in back end 
validation and processing.
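
As a toy illustration of the "SGML Expansion -> XML" step, the sketch
below restores omitted end tags for a fixed set of leaf elements. It is
not a real SGML parser: the OMISSIBLE_END set stands in for what a DTD
would say about tag omission, the invoice element names are invented, and
attributes, entities and nested omission are not handled.

```python
import re
from xml.dom import minidom

# Assumption: stands in for the DTD's knowledge of which elements are
# simple leaf fields whose end tag the person typing may leave out.
OMISSIBLE_END = {"customer", "item", "qty"}

# A start tag, an end tag, or a run of text (no attributes in this toy).
TOKEN = re.compile(r"<(/?)([A-Za-z][\w.-]*)>|([^<]+)")


def expand(sgml_like):
    """Insert the end tags that were left out for leaf elements."""
    out = []
    open_leaf = None  # name of a leaf element still waiting for its end tag
    for closing, name, text in TOKEN.findall(sgml_like):
        if text:
            out.append(text)
            continue
        if closing and name == open_leaf:
            # The end tag was typed explicitly after all.
            out.append(f"</{name}>")
            open_leaf = None
            continue
        if open_leaf is not None:
            # Any other tag implies the open leaf has ended.
            out.append(f"</{open_leaf}>")
            open_leaf = None
        out.append(f"<{closing}{name}>")
        if not closing and name in OMISSIBLE_END:
            open_leaf = name
    return "".join(out)


typed = "<invoice><customer>ACME<item>widget<qty>2</invoice>"
expanded = expand(typed)
# The expanded form is well-formed XML, so a DTD-less parser accepts it.
root = minidom.parseString(expanded).documentElement
```

The short form is what the person types; everything downstream of
expand() only ever sees the regular XML.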

? Clark



From tbray at textuality.com  Sat Mar 13 03:22:10 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:58 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <3.0.32.19990312163703.00eb86fc@pop.intergate.bc.ca>

At 10:08 AM 3/10/99 -0500, David Megginson wrote:
> > This seems to be converging nicely.  Any chance of losing the
> > ugly "Mod" prefix? -Tim
>
>Yeah, no one seems to like it but me.  Any other suggestions?  I don't 
>like Parser2 or things like that, because I want to emphasise that
>this is an add-on to SAX 1.0 rather than an upgrade.

Uh.... how about just changing the package name to be sax2.org or
something like that?  Then it's still a Parser. -Tim



From martind at netfolder.com  Sat Mar 13 04:33:03 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:58 2004
Subject: FW: Namespaces and DTDs
In-Reply-To: <c=US%a=_%p=Design_Intellige%l=MASTER-990313021044Z-6519@master.design-intelligence.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAGEBCCPAA.martind@netfolder.com>

Hi Marc

<YourComment>
It's quite true that you can have XML that does not require validation 
and that this is commonly done. An exception is the defaulting of the 
value of any attributes of elements in a DTD, which has been mentioned 
in another reply.

You can construct a DOM without validation, but the next step ends up 
being a procedural implementation of picking apart the DOM document 
tree to construct whatever structure the application using DOM 
requires to interpret the document.

I can parse:
  <book title="tale of 2 cities">
    <chapter>
      <para>..</para>
    </chapter>
    <chapter>
        ...
    </chapter>
      ...
  </book>
without a DTD.

But if my application needs to get the information out of the DOM I 
need to write code to:
  Create a representation for Book consisting of a title and chapters 
and get book from DOM
  Create a representation for each Chapter and get Chapters from DOM
  Create a representation for each paragraph in a chapter and get 
paragraphs from DOM.
Part of this is what is expressed in the DTD. Wouldn't it be better if 
a system were created that used the DTD on the receiving end to create 
the application representation instead of serializing it back into 
elements and constructing a new tree?
</YourComment>

<Reply>
a) What do you mean by "a representation"? Is it a rendition object?
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com 



From digitome at iol.ie  Sat Mar 13 10:26:24 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:09:59 2004
Subject: ModSAX: Proposed Core Features
In-Reply-To: <36E99267.73C16A7@locke.ccil.org>
References: <199903121543.KAA13555@hesketh.net>
 <14053.51113.676945.877507@localhost.localdomain>
 <199903121543.KAA13555@hesketh.net>
 <3.0.6.32.19990312191256.0098c080@gpo.iol.ie>
 <3.0.6.32.19990312195804.00995720@gpo.iol.ie>
Message-ID: <3.0.6.32.19990313101455.0092f320@gpo.iol.ie>

[Sean Mc Grath]
>
>> My point is that you cannot switch from:
>>         <!DOCTYPE foo SYSTEM "doc.dtd">
>> to:
>>         <!DOCTYPE foo [
>>         <!ENTITY % dtd SYSTEM "doc.dtd">
>>         %dtd;
>>         ]>
>> 
>> and expect things to be exactly the same. if you have
>> conditional sections in doc.dtd, the latter is illegal.
>
[John Cowan]
>I understand what you mean.  I believe you are not correct.
>Conditional sections are allowed in external PEs referenced
>from the internal subset, just not in the internal subset itself.
>

I only have one validating XML parser on this machine, IBM's
XML4J:-

doc.dtd
	<!ELEMENT foo (#PCDATA)>
	<![IGNORE[
	<!ATTLIST foo bar CDATA #REQUIRED>
	]]>

a.xml
	<?xml version="1.0"?>
	<!DOCTYPE foo [
	<!ENTITY % dtd SYSTEM "doc.dtd">
	%dtd;
	]>
	<foo></foo>

b.xml
	<?xml version="1.0"?>
	<!DOCTYPE foo SYSTEM "doc.dtd">
	<foo></foo>

b.xml validates. a.xml fails reporting:-
	Error: Conditional section only allowed in external subset. (2,3)


<Sean uri="http://www.digitome.com/sean.htm"/>




From b.laforge at jxml.com  Sat Mar 13 15:00:45 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:59 2004
Subject: Are we ready to resolve the Mod??SAX name?
Message-ID: <00ed01be6d62$fb0cdf60$c8a8a8c0@thing1>

Should we just stick with ModSAX/ModParser?
Should we change to OpenSAX/OpenParser?

I think we are down to these two possibilities. The other
possibilities were not well received (variants on ExLax,
Scream2, X-Files).

David has been a driving force here and I really like what
he has done. I have no problem with his original choice in name.

The rest of ModSAX seems to be fairly well cooked. Finalizing
the name will go a long way to wrapping the whole thing up.

Bill




From lindsey at diac.com  Sat Mar 13 15:43:53 1999
From: lindsey at diac.com (William Lindsey)
Date: Mon Jun  7 17:09:59 2004
Subject: Lisp concrete syntax -- was: Namespaces and DTDs
Message-ID: <Pine.UW2.3.95.990313083834.21486A-100000@november>

On Thursday, 11 Mar 1999, Chris Maden wrote:

> The parentheses are only character data.

Yes.

> I don't think that Lisp could be made SGML compliant; the delimiters
> could be redefined, but as Steve DeRose notes in _The SGML FAQ Book_,
> there are some limits to the flexibility of the redefinitions, since
> some delimiter roles are overloaded.  Also, Lisp doesn't have the
> equivalent of start-tag close, and you can only omit tagc if the next
> character is stago or etago (ISO 8879:1986, clause 7.4.1.2) which it
> wouldn't be when you get to the leaves of a structure.

While it is true that not all Lisp can be made SGML compliant,  it
is possible to define a concrete syntax (using parens as tag 
delimiters),  and establish a set of conventions such that you 
can create documents that are simultaneously R5RS Scheme 
programs and valid SGML documents.  

The close paren works fine as NESTC/NET if you treat the starttag 
as a procedure which, when evaluated, produces a procedure 
that will evaluate the element's contents. You can define
a procedure named "!ELEMENT" which will create those starttag
procedures when the DTD is evaluated.

I played around with this idea a bit.  You can see a working example 
with the SGML declarations, DTD (which also uses Scheme syntax), 
and Scheme bootstrap code (or the rules for evaluating *x-expressions*)
at:

http://www.diac.com/~lindsey/lml

Enjoy,

Bill




From b.laforge at jxml.com  Sat Mar 13 16:16:39 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:59 2004
Subject: ModSAX: Core Proposal for Filters
Message-ID: <011401be6d6d$6712c9e0$c8a8a8c0@thing1>

It would be strange for ModSAX not to accommodate filters. Frankly
I've been wondering if we really need a separate filter interface and
for SAX the answer was no. But for ModSAX, I would argue that
we need a filter interface.

One key issue here, one of the few that cannot be addressed by ModSAX
as it stands for filters, is the routing of the get/set/setHandler/setFeature
events. (Let's call these application events, since that is the direction they
come from.)

In a simple filter stack, it is easy enough to route these events down towards
the parser, allowing them to be intercepted by any intervening filter on the way.
(Interception is one possibility here, but it handles nicely the problem of
recognizing that the event has been processed.) But when the structure
contains an event router (for parser events), things fall apart.

Event routers let you do things like process parser events differently for different
types of elements or different types of documents. But for application events, you
may need to chain all the subordinate filters together. Unfortunately even this
doesn't work if the application event is needed by more than one filter. It also
becomes quite slow if there are a large number of different kinds of elements and
consequently a large number of filters subordinate to an event router.

The JavaBeans approach seems to be best. Provide for application event registration,
and allow more than one Parser to be registered for the same application event.
This combined with a default behavior of passing application events towards the 
parser should work nicely.

Also, because a filter may have a whole raft of properties specific to itself, it will be
helpful to provide for a very crude form of wildcarding. We can easily constrain such
properties to all have the form: common ID, '#', property sub-id. Registration of the
common ID should be sufficient to allow the routing of all such application events.

The interface is

public interface ModFilter extends ModParser
{
    public abstract void register(String ID, ModParser parser)
        throws SAXNotSupportedException;
}

Now, will a ModFilter also need to extend DocumentHandler,
ErrorHandler, DTDHandler, and EntityResolver? Strictly speaking,
these are implementation issues, so the answer should be NO.

Time for a quick recap of application event processing by filters:

1. If a filter doesn't know what to do with an application event, it
    passes the event towards the parser (to the next filter on the
    stack, or to the parser if it is the last filter on the stack). Any raised
    exception is passed back toward the application.

2. A filter may contain any number of other filters or filter containers.
    The assemblage of such a structure can also support registration,
    a la JavaBean event registration. Any number of filters may be
    registered with another filter to receive specific application events,
    depending on the event ID.

3. A filter may have its own unique ID, assigned or hard-coded (that's
    an implementation issue). It can also have a number of properties
    useful for configuring that specific filter, each with its own unique
    subID. Such properties can be accessed using a propertyID of the
    form filterID#propertySubID.

4. Application events with partitioned property IDs will be routed when
    the filterID has been registered.
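
A rough sketch of the four routing rules above, written in Python rather
than Java to keep it short. It is not the ModSAX API: Parser, Filter,
set_property and NotSupported are stand-ins invented for the example
(NotSupported plays the role of SAXNotSupportedException), and only "set"
events are modelled.

```python
class NotSupported(Exception):
    """Stand-in for SAXNotSupportedException."""


class Parser:
    """End of the chain: accepts only the properties it knows about."""

    def __init__(self, known):
        self.known = set(known)
        self.properties = {}

    def set_property(self, prop_id, value):
        if prop_id not in self.known:
            raise NotSupported(prop_id)
        self.properties[prop_id] = value


class Filter:
    def __init__(self, filter_id, downstream):
        self.filter_id = filter_id    # rule 3: the filter's own ID
        self.downstream = downstream  # next filter, or the parser
        self.properties = {}
        self.registered = {}          # rule 2: ID -> list of targets

    def register(self, common_id, target):
        self.registered.setdefault(common_id, []).append(target)

    def set_property(self, prop_id, value):
        # "filterID#propertySubID" splits into ("filterID", "#", "propertySubID");
        # an unpartitioned ID comes back whole as common_id with sep == "".
        common_id, sep, sub_id = prop_id.partition("#")
        if sep and common_id == self.filter_id:
            self.properties[sub_id] = value          # rule 3: one of our own
            return
        targets = self.registered.get(common_id)
        if targets:
            for target in targets:                   # rules 2 and 4
                target.set_property(prop_id, value)
            return
        # Rule 1: not ours and not registered, so pass it toward the
        # parser; any exception travels back toward the application.
        self.downstream.set_property(prop_id, value)


parser = Parser(known={"validation"})
router = Filter("router", parser)
para_filter = Filter("para", parser)   # subordinate to the router
router.register("para", para_filter)

router.set_property("para#indent", 2)      # routed by the registered common ID
router.set_property("validation", True)    # falls through to the parser
```

Registering only the common ID ("para") is enough to route every
"para#..." property, which is the crude wildcarding described earlier.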

(The first production release of MDSAX 1.0 is due out "real soon now".
But frankly, I can't wait to use the ModSAX interfaces. They will be the
cornerstone for the 2.0 release, I'm sure. ModSAX is going to let us
simplify a lot of the interfaces.)

Bill




From martind at netfolder.com  Sat Mar 13 17:18:31 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:09:59 2004
Subject: Namespaces and DTDs
In-Reply-To: <36E94F45.46D0E403@prescod.net>
Message-ID: <NBBBJPGDLPIHJGEHAKBAMEBECPAA.martind@netfolder.com>

Hi Paul,

<YourComment>
I let this claim pass a couple of times because I didn't consider it
important but now I feel the need to scratch that itch. DSSSL does not
actually use parentheses as tags. If you use nsgmls to look at the SGML
structure of a DSSSL document you will find that all of DSSSL's structure
is actually in omitted tags.
</YourComment>

<Reply>
Exactly, the tag that contains all DSSSL style module constructs is the
"STYLE-SPECIFICATION-BODY" element (therefore the DSSSL is content). Sorry, I
made the wrong analogy. I should have said simply that a Lisp structure
could easily be transformed into an SGML/XML compliant format. However, we
would end up with a document having mostly elements and properties
and nearly no element content. But, in a certain way, this is what XSL is
trying to do.

The question of mapping is very important for XSL vs DSSSL. The mapping
currently seems to be done by transforming a DSSSL property element into an
XML element. For instance, to express a DSSSL paragraph FO as an XSL block
(the block seems equivalent to the paragraph), we move from a construct
like:

(make paragraph font-size: 10pt)

to
  <fo:block font-size="10pt">

So the structural similarities are quite strong, except that the former
construct begins with a verb and the latter with a noun. The former
expression has some procedural meaning and the latter is strictly
declarative. In this case, it's the verb that makes the whole difference and
prevents a straightforward replacement of "(" by "<". If we had instead
(paragraph font-size: 10pt), this could easily be mapped to <paragraph
font-size="10pt">, which is a valid SGML/XML construct. The other mapping
problem is with containment. Take, for instance, a DSSSL expression like:

(make display-group
	(make paragraph.....)
	(make rule.....)
)
To transform this into SGML/XML we need to change the containment syntax
to:
<display-group>
	<paragraph>
	</paragraph>
	<rule>
	</rule>
</display-group>

that is, eliminate the procedural element "make" and then map the containment
relationship to XML containment syntax.

So, it seems that the two blocking factors for an obvious mapping are the
procedural reference "make" and the containment syntax rule. A
transformation tool would have to consider these two factors; simply
replacing the delimiters is insufficient.
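
The two steps (dropping "make" and mapping containment) can be sketched
as a small converter. This is only an illustration in Python, under
strong assumptions: it handles nothing but nested (make ...) forms whose
keyword arguments have simple literal values, it keeps the flow object
class name as the element name instead of renaming it to an XSL
equivalent, and the function names are invented.

```python
import re
from xml.sax.saxutils import quoteattr

# Parentheses, or any run of characters that is neither space nor paren.
TOKEN = re.compile(r"[()]|[^\s()]+")


def read(tokens, pos=0):
    """Read one expression at tokens[pos]; return (tree, next position)."""
    if tokens[pos] != "(":
        return tokens[pos], pos + 1
    items, pos = [], pos + 1
    while tokens[pos] != ")":
        item, pos = read(tokens, pos)
        items.append(item)
    return items, pos + 1


def to_xml(expr, indent=0):
    # Expected shape: ["make", class-name, "keyword:", value, ..., child, ...]
    if not (isinstance(expr, list) and len(expr) >= 2 and expr[0] == "make"):
        raise ValueError("only (make ...) expressions are handled")
    name, rest = expr[1], expr[2:]      # step 1: the verb "make" is dropped
    attrs, children = [], []
    i = 0
    while i < len(rest):
        item = rest[i]
        if isinstance(item, str) and item.endswith(":"):
            attrs.append((item[:-1], rest[i + 1]))  # keyword: value pair
            i += 2
        else:
            children.append(item)                   # nested (make ...) form
            i += 1
    pad = "  " * indent
    attr_text = "".join(f" {key}={quoteattr(value)}" for key, value in attrs)
    if not children:
        return f"{pad}<{name}{attr_text}/>"
    # Step 2: Lisp nesting becomes start-tag/end-tag containment.
    body = "\n".join(to_xml(child, indent + 1) for child in children)
    return f"{pad}<{name}{attr_text}>\n{body}\n{pad}</{name}>"


def dsssl_to_xml(source):
    tree, _ = read(TOKEN.findall(source))
    return to_xml(tree)


result = dsssl_to_xml(
    "(make display-group (make paragraph font-size: 10pt) (make rule))"
)
```

Anything beyond these static forms, such as an expression used as a
property value, is rejected or fails, which is where the procedural side
of DSSSL stops mapping cleanly.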

Of course, if we take the expression part of DSSSL we face more
difficulties, because now we have a full language. However, if we only
consider the formatting object part that XSL replicates, we see that the main
difference is not so much "(" versus "<" as the presence of procedural
elements like "make" and differences in containment syntax rules. Otherwise
we would have very similar constructs where the main difference would be "("
instead of "<", and an SGML/XML parser would easily be able to parse a DSSSL
style sheet simply by defining "(" as a delimiter. So if DSSSL objects had
been expressed as strict property sets, the mapping would have been obvious.
The mapping is made less obvious by the procedural nature of DSSSL FOs (i.e.
the "make" verb). However, the verb adds meaning to what the style language
"does". XSL constructs simply say what the FO "is". Two ways to express things,
each with its own merits and advantages. But it seems that XSL will hit its own
"wall" when procedural constructs are needed. I simply hope that both
languages will find their place in the XML world, each one for its own
advantages.

Sorry for not having been clear enough about what I meant. And thank you for
reminding me to be more precise :-)
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From tbray at textuality.com  Sat Mar 13 19:15:21 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:59 2004
Subject: Deep wisdom
Message-ID: <3.0.32.19990313110619.00bcb420@pop.intergate.bc.ca>

In a computer bookstore down in San Jose, picked up this free ad mag
called "Computer Currents".  It has an article on XML which pretty much
says it all.

 http://www.currents.net/magazine/national/1705/intb1705.html
 -T.



From tbray at textuality.com  Sat Mar 13 19:16:02 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:09:59 2004
Subject: empty tags and the XML 1.0 spec
Message-ID: <3.0.32.19990313111034.00bc8910@pop.intergate.bc.ca>

At 02:21 PM 3/12/99 -0500, John Cowan wrote:
>I have already bitched to xml-editor@w3.org about the use of
>"For interoperability" and "must" in the same sentence, suggesting
>that "should" is the correct modal verb.  All other uses of "F.i."
>use "should".

John's right.  If it isn't in the public errata list now, it will
be real soon. -T.



From cowan at locke.ccil.org  Sat Mar 13 20:00:44 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:09:59 2004
Subject: Are we ready to resolve the Mod??SAX name?
In-Reply-To: <00ed01be6d62$fb0cdf60$c8a8a8c0@thing1> from "Bill la Forge" at Mar 13, 99 10:05:43 am
Message-ID: <199903132102.QAA15088@locke.ccil.org>

Bill la Forge scripsit:
> 
> Should we just stick with ModSAX/ModParser?
> Should we change to OpenSAX/OpenParser?

I support OpenSAX/Parser, in other words keeping the names of the
classes and interfaces the same, and changing the package name only.

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From b.laforge at jxml.com  Sat Mar 13 21:53:48 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:09:59 2004
Subject: Are we ready to resolve the Mod??SAX name?
Message-ID: <001f01be6d9c$adc75480$c8a8a8c0@thing1>

This would force the use of fully qualified names and create 
name conflicts for programmers used to simply importing 
the classes being used.

I try hard to avoid such potential name conflicts and would prefer
XParser, ExParser, Parser2, ModParser, OpenParser, or anything else 
over Parser.

Bill
-----Original Message-----
From: John Cowan <cowan@locke.ccil.org>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Saturday, March 13, 1999 3:24 PM
Subject: Re: Are we ready to resolve the Mod??SAX name?


>Bill la Forge scripsit:
>> 
>> Should we just stick with ModSAX/ModParser?
>> Should we change to OpenSAX/OpenParser?
>
>I support OpenSAX/Parser, in other words keeping the names of the
>classes and interfaces the same, and changing the package name only.
>
>-- 
>John Cowan cowan@ccil.org
> e'osai ko sarji la lojban.




From jtauber at jtauber.com  Sun Mar 14 05:43:47 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:09:59 2004
Subject: ModSAX: Proposed Core Features
Message-ID: <006601be6ddd$63557ee0$0300000a@othniel.cygnus.uwa.edu.au>

>[John Cowan]
>>I understand what you mean.  I believe you are not correct.
>>Conditional sections are allowed in external PEs referenced
>>from the internal subset, just not in the internal subset itself.


[Sean McGrath]
>I only have one validating XML parser on this machine IBMs
>XML4J:-

[...]
> Error: Conditional section only allowed in external subset. (2,3)


My reading of 2.8 of the REC supports John Cowan against XML4J.

In particular:

"However, portions of the contents of the external subset OR OF EXTERNAL
PARAMETER ENTITIES may conditionally be ignored by using the conditional
section construct".

The constraints on the internal subset are to make life easier for
non-validating parsers that read only the document entity (see 5.1)

The significant point to note is that external parameter entities referenced
in the internal subset are not part of the internal subset.
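
The distinction can be checked with a short Java sketch. It assumes a JAXP validating parser (the one bundled with the JDK); the class name, the `ext.ent` system identifier and the two sample documents are hypothetical and exist only for illustration. The external parameter entity is served from memory by an entity resolver, so nothing is read from disk or the network.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class ConditionalSectionDemo {
    // Hypothetical external parameter entity containing conditional sections.
    static final String EXT_PE =
        "<![INCLUDE[ <!ELEMENT doc EMPTY> ]]>\n" +
        "<![IGNORE[ <!ELEMENT doc (#PCDATA)> ]]>\n";

    // The internal subset only declares and references the external PE.
    static final String VIA_EXTERNAL_PE =
        "<!DOCTYPE doc [\n" +
        "  <!ENTITY % ext SYSTEM 'ext.ent'>\n" +
        "  %ext;\n" +
        "]>\n<doc/>";

    // The conditional section sits directly in the internal subset.
    static final String IN_INTERNAL_SUBSET =
        "<!DOCTYPE doc [\n" +
        "  <![INCLUDE[ <!ELEMENT doc EMPTY> ]]>\n" +
        "]>\n<doc/>";

    /** Returns true if the document parses and validates, false on a parse error. */
    static boolean parses(String xml) {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public InputSource resolveEntity(String publicId, String systemId) {
                // Serve the external PE from memory, whatever the system identifier is.
                return new InputSource(new StringReader(EXT_PE));
            }
            @Override
            public void error(SAXParseException e) throws SAXException {
                throw e; // treat validity errors as failures too
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setValidating(true);
            SAXParser parser = factory.newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return true;
        } catch (SAXParseException e) {
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("via external PE:    " + parses(VIA_EXTERNAL_PE));
        System.out.println("in internal subset: " + parses(IN_INTERNAL_SUBSET));
    }
}
```

Under the reading of 2.8 given above, a conforming parser should accept the first document and reject the second.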

James




From avirr at LanMinds.Com  Sun Mar 14 07:42:59 1999
From: avirr at LanMinds.Com (Avi Rappoport)
Date: Mon Jun  7 17:09:59 2004
Subject: Boolean Query DTD?
In-Reply-To: <3.0.32.19990313110619.00bcb420@pop.intergate.bc.ca>
Message-ID: <v04104800b3111540ae88@[207.33.50.55]>

I have a consulting client who is shipping around Boolean queries. 
The documents are not XML, but they are using XML for data transfer 
between the search form and the back-end search engine, and in the 
results listings.  We would like to replace their proprietary query 
language (it looks like AltaVista's + and - system) with XML markup, 
because it seems cleaner and could be parsed with the rest of the 
document.

I have read all 66 position papers at the QL '98 site 
<http://www.w3.org/TandS/QL/QL98/pp.html> and think that the proposed 
languages, such as XQL or XML-Query, are overkill for our needs.   Do 
we have to go the whole query route or is there something lightweight 
that might handle it for us?

Avi
________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr@lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>



From Daniel.Brickley at bristol.ac.uk  Sun Mar 14 10:13:07 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:09:59 2004
Subject: Boolean Query DTD?
In-Reply-To: <v04104800b3111540ae88@[207.33.50.55]>
Message-ID: <Pine.GHP.4.02A.9903140958080.15299-100000@mail.ilrt.bris.ac.uk>


On Sat, 13 Mar 1999, Avi Rappoport wrote:

> I have a consulting client who is shipping around Boolean queries. 
> The documents are not XML, but they are using XML for data transfer 
> between the search form and the back-end search engine, and in the 
> results listings.  We would like to replace their proprietary query 
> language (it looks like AltaVista's + and - system) with XML markup, 
> because it seems cleaner and could be parsed with the rest of the 
> document.
> 
> I have read all 66 position papers at the QL '98 site 
> <http://www.w3.org/TandS/QL/QL98/pp.html> and think that the proposed 
> languages, such as XQL or XML-Query, are overkill for our needs.   Do 
> we have to go the whole query route or is there something lightweight 
> that might handle it for us?

There's a difference between using XML to ship queries around, and
querying XML. The majority of the XML-oriented position papers at the
Boston QL workshop were concerned with the latter, ie. finding things
within a real or hypothetical XML document or document set. For these to
be useful for your application, you'd need to 'pretend' that the
database queried by your back-end search engine was an XML document or
document collection. Which as you say might be overkill. 

I'm not aware of anything lightweight that'd do what's needed here.
Maybe there's a document out there somewhere that describes the
Altavista-esque +/- format adopted by the various engines objectively
enough for it to be useful to ship queries around marked up as being of
that type...? 
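
For the "shipping queries around" case, a very small amount of markup may be enough. The Java sketch below is only an illustration: the element names `query`, `must`, `not` and `may` are hypothetical, not taken from any DTD or proposal, and it assumes the usual reading of the +/- syntax (`+term` is required, `-term` is excluded, a bare term is optional).

```java
class BooleanQueryMarkup {
    /** Converts a "+xml -sgml parser" style query into XML (hypothetical element names). */
    static String toXml(String query) {
        StringBuilder out = new StringBuilder("<query>");
        for (String term : query.trim().split("\\s+")) {
            String element = "may";               // bare term: optional
            if (term.startsWith("+")) {           // +term: required
                element = "must";
                term = term.substring(1);
            } else if (term.startsWith("-")) {    // -term: excluded
                element = "not";
                term = term.substring(1);
            }
            if (term.isEmpty()) {
                continue;                         // skip empty input or a lone "+" / "-"
            }
            out.append('<').append(element).append('>')
               .append(escape(term))
               .append("</").append(element).append('>');
        }
        return out.append("</query>").toString();
    }

    /** Escapes the characters that are significant in XML character data. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    public static void main(String[] args) {
        // prints <query><must>xml</must><not>sgml</not><may>parser</may></query>
        System.out.println(toXml("+xml -sgml parser"));
    }
}
```

This does not handle quoted phrases or nesting; those would need a real grammar rather than a whitespace split.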

Dan




From marcelo at mds.rmit.edu.au  Mon Mar 15 00:19:03 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:59 2004
Subject: XTech Show [fwd]
Message-ID: <19990315111835.A6048@io.mds.rmit.edu.au>

Apologies for yet another brazen advert.  But this is definitely of
general interest...


----- Forwarded message from Bob Carter <BCARTER@kti.com> -----

Date: Sun, 14 Mar 1999 17:35:36 -0500
From: Bob Carter <BCARTER@kti.com>
To: sim@mds.rmit.edu.au
Subject: XTech Show

SIM won the interoperability award at the XTech conference in San
Jose, California last week.  With only 2 days to load 3 Gig of XML
data, develop an application and provide sophisticated searching, the
SIM technology beat Oracle, Object Design, Xyvision, Inso and others.
Neil Sharman upgraded the running app into SIM 2.3, in less than an
hour, showing off new features such as limited searching and the
advanced search interface.

Thanks to Trevor Clark, Tapan Shah and Neil Sharman for their efforts.

----- End forwarded message -----


(Bob Carter is the U.S. product manager for SIM, based at Kinetic
Technologies, Inc.; Neil Sharman and Trevor Clarke are core developers
for SIM; Tapan Shah is an application developer also working at
KTI)...


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From marcelo at mds.rmit.edu.au  Mon Mar 15 02:59:45 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:09:59 2004
Subject: Boolean Query DTD?
In-Reply-To: <v04104800b3111540ae88@[207.33.50.55]>; from Avi Rappoport on Sat, Mar 13, 1999 at 11:38:58PM -0800
References: <3.0.32.19990313110619.00bcb420@pop.intergate.bc.ca> <v04104800b3111540ae88@[207.33.50.55]>
Message-ID: <19990315135913.B6048@io.mds.rmit.edu.au>

On Sat, Mar 13, 1999 at 11:38:58PM -0800, Avi Rappoport wrote:
> I have a consulting client who is shipping around Boolean queries. 
> The documents are not XML, but they are using XML for data transfer 
> between the search form and the back-end search engine, and in the 
> results listings.  We would like to replace their proprietary query 
> language (it looks like AltaVista's + and - system) with XML markup, 
> because it seems cleaner and could be parsed with the rest of the 
> document.
> 
> I have read all 66 position papers at the QL '98 site 
> <http://www.w3.org/TandS/QL/QL98/pp.html> and think that the proposed 
> languages, such as XQL or XML-Query, are overkill for our needs.   Do 
> we have to go the whole query route or is there something lightweight 
> that might handle it for us?

Try ISO 8777, a.k.a. CCL (Common Command Language).  You can't get it
off the web (AFAIK), and it's ugly.  But it's simple (only about 10
pages, I'm told), it's there and it's a standard.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From rbourret at ito.tu-darmstadt.de  Mon Mar 15 10:03:53 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:00 2004
Subject: Multi-valued attributes
Message-ID: <01BE6ED3.5F558EB0@grappa.ito.tu-darmstadt.de>

XML supports three types of multi-valued attributes: IDREFS, NMTOKENS, and 
ENTITIES.  Is the order of the values in a multi-valued attribute 
significant?  I can't find anything that says one way or the other in the 
spec.

-- Ron Bourret




From Michael.Kay at icl.com  Mon Mar 15 10:53:50 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:00 2004
Subject: Are we ready to resolve the Mod??SAX name?
Message-ID: <93CB64052F94D211BC5D0010A80013310EB37E@WWMESS3.172.19.125.2>

> This would force the use of fully qualified names and create 
> name conflicts for programmers used to simply importing 
> the classes being used.

I'd vote for changing the package name rather than the class names because
package names appear in fewer places in the code, and because few
applications are likely to want to use the old classes and the new
concurrently.

And I'd vote for sax2 rather than open-, mod-, ex- etc because (a) it's
obvious to everyone what it means, and (b) it avoids further debate when it
comes to sax3.

Mike Kay



From Michael.Kay at icl.com  Mon Mar 15 10:59:14 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:00 2004
Subject: ANNOUNCING SAXON 4.1
Message-ID: <93CB64052F94D211BC5D0010A80013310EB37F@WWMESS3.172.19.125.2>

The latest release of SAXON is available on
http://home.iclweb.com/icl2/mhkay/saxon.html

This release focuses on support for XSL. SAXON now supports about 85% of the
XSL transformation language draft. The added-value features you might find
interesting are:

* Support for multiple output files. SAXON allows you to split a single
input document into lots of linked output documents.

* Close integration of Java and XSL code. You can invoke Java element
handlers from XSL, and XSL element handlers from Java. You can also use the
full XSL syntax for match patterns and select patterns from within your Java
code.

* SAXON Stylesheets produce a text file, not a tree. This means you can use
them to produce CSV files, EDI messages, SQL scripts, or any number of
formats that don't use angle-bracket syntax. Of course you can also produce
XML and HTML output.

* SAXON Stylesheets can process the source document in serial mode. This
means the document doesn't have to fit in memory, and output can start
appearing before the input is all available. The XSL constructs available
are a pure subset of those used in "navigational" mode, and sufficient to
perform a wide variety of processing tasks (notably, splitting the large
document into small pieces).

* SAXON Stylesheets are extensible (anyone remember what the X in XSL stands
for?). By writing Java element handlers you can define additional elements
that extend the standard XSL vocabulary, and then use these in any
stylesheet. I show an example where I use this to create a syntax for SQL
Stylesheets: this contains <sql:connect> and <sql:insert> elements so that a
style sheet can now be used to load XML data into any relational database.

As a free bonus the new release of SAXON includes a new version of my
DTDGenerator tool, which generates a DTD from a specimen document. The new
version attempts to detect patterns in the ordering of child elements for a
given parent, and also examines the syntax of attribute values in greater
detail. Further information about DTDGenerator is on
http://home.iclweb.com/icl2/mhkay/dtdgen.html

Have fun!

Michael Kay



From cowan at locke.ccil.org  Mon Mar 15 14:52:30 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:00 2004
Subject: Are we ready to resolve the Mod??SAX name?
References: <001f01be6d9c$adc75480$c8a8a8c0@thing1>
Message-ID: <36ED1E47.80491C@locke.ccil.org>

Bill la Forge wrote:

> This would force the use of fully qualified names and create
> name conflicts for programmers used to simply importing
> the classes being used.

I suppose if you use "import org.xml.sax.*;" then you have
a problem, yes, because you can't just add "import org.xml.opensax.*;"
to it.  But if you import individual class names it's no problem:
if you use opensax.Parser, you never need to refer to sax.Parser
any more, since it is subsumed.
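
The import behaviour can be shown with standard JDK classes. Since `org.xml.opensax` is only a proposed name, this sketch substitutes `java.util.Date` and `java.sql.Date`, two existing classes that share a simple name; the demo class name is hypothetical.

```java
import java.util.*;      // on-demand import: offers java.util.Date
import java.sql.*;       // on-demand import: offers java.sql.Date as well
import java.util.Date;   // single-type import: takes precedence over both

class ImportShadowingDemo {
    /** Reports which class the simple name "Date" resolves to in this file. */
    static String resolvedDateClass() {
        return Date.class.getName();
    }

    public static void main(String[] args) {
        System.out.println(resolvedDateClass()); // prints java.util.Date
    }
}
```

Without the single-type import, the compiler rejects the use of `Date` as ambiguous, which is the problem with two `.*` imports. With it, the simple name resolves cleanly, which is the "import individual class names" case.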

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From keshlam at us.ibm.com  Mon Mar 15 15:25:53 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:00 2004
Subject: Periodic plea
Message-ID: <85256735.00549852.00@D51MTA03.pok.ibm.com>

</lurk>
... to trim your quotes. It really isn't necessary to quote 30 lines in order to
attach a 3-line response, and it makes _finding_ your response text
significantly harder.

(Related comment: I like the idea of XML markup for delimiting quote and
response, but  indentation, or a > mark in the first column, is faster for a
human in a hurry.)
<lurk>
______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From cowan at locke.ccil.org  Mon Mar 15 15:56:50 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:00 2004
Subject: Multi-valued attributes
References: <01BE6ED3.5F558EB0@grappa.ito.tu-darmstadt.de>
Message-ID: <36ED2B19.62734CE3@locke.ccil.org>

Ronald Bourret wrote:

> XML supports three types of multi-valued attributes: IDREFS, NMTOKENS, and
> ENTITIES.  Is the order of the values in a multi-valued attribute
> significant?  I can't find anything that says one way or the other in the
> spec.

A fine question.  I think it's application-dependent, so a priori
order counts, and applications can throw it away if they want to.
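
In code, "order counts unless the application throws it away" might look like the following Java sketch. The class and method names are hypothetical; it assumes the attribute value is a whitespace-separated token list, as for IDREFS, NMTOKENS and ENTITIES.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;

class TokenOrderDemo {
    /** Splits a tokenized attribute value on whitespace, preserving document order. */
    static List<String> tokens(String attributeValue) {
        String trimmed = attributeValue.trim();
        if (trimmed.isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(trimmed.split("\\s+"));
    }

    /** Order-sensitive comparison: the default if order is presumed significant. */
    static boolean sameSequence(String a, String b) {
        return tokens(a).equals(tokens(b));
    }

    /** Order-insensitive comparison: what an application does when it discards order. */
    static boolean sameSet(String a, String b) {
        return new HashSet<>(tokens(a)).equals(new HashSet<>(tokens(b)));
    }

    public static void main(String[] args) {
        System.out.println(sameSequence("a b c", "c b a")); // prints false
        System.out.println(sameSet("a b c", "c b a"));      // prints true
    }
}
```

Keeping the list and converting to a set on demand loses nothing; starting from a set would make the order unrecoverable.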

Can SGML tribal elders report on any actual uses of IDREFS and ENTITIES?

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From rbourret at ito.tu-darmstadt.de  Mon Mar 15 16:14:20 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:00 2004
Subject: Are we ready to resolve the Mod??SAX name?
Message-ID: <01BE6F07.2AF0C930@grappa.ito.tu-darmstadt.de>

John Cowan wrote:

> Bill la Forge wrote:
>
> > This would force the use of fully qualified names and create
> > name conflicts for programmers used to simply importing
> > the classes being used.
>
> I suppose if you use "import org.xml.sax.*;" then you have
> a problem, yes, because you can't just add "import org.xml.opensax.*;"
> to it.  But if you import individual class names it's no problem:
> if you use opensax.Parser, you never need to refer to sax.Parser
> any more, since it is subsumed.

Without even considering the technical merit, this strikes me as a good way 
to encourage unsuspecting programmers to introduce errors.  I vote no.

-- Ron Bourret




From jasr at im.se  Mon Mar 15 16:21:11 1999
From: jasr at im.se (Serrat Jaime - jasr)
Date: Mon Jun  7 17:10:00 2004
Subject: What a tangled web!!! XML and related specs
Message-ID: <389DA7CB46CFD111A0D100600836AD65E66B7D@msxmar1>

As a relative newbie to XML, I've been wading through the XML 1.0 spec and
related documents (notes, recommendations, etc.), mostly at the W3C site.
Boy,  am I confused!

I'm not only confused by the specific detail contents of the respective
documents, but perhaps more importantly, by the *relationship* of the
various docs to the others.  Does RDF extend or replace DTD as specified in
XML 1.0?  Is SOX an alternative to both DTD and RDF?  What about Namespace
and DCD?  For that matter, what about the recent XML Schema Requirements
Note?  I guess I'm hoping someone will provide a spec roadmap.

-- jaime "jim" serrat

Technical Manager of PDA08 - EDI Engine	Office: 	+1 609-797-3227
Product Development				Fax:	+1 609-797-6660
Industri-Matematik				Mobile:	+1 609-315-3338
Five Greentree Centre				Web:	http://www.im.se
Marlton, NJ 08053				Email:	jasr@im.se




From paul at prescod.net  Mon Mar 15 17:29:45 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:00 2004
Subject: Multi-valued attributes
References: <01BE6ED3.5F558EB0@grappa.ito.tu-darmstadt.de>
Message-ID: <36EDF426.F5006A8F@prescod.net>

Ronald Bourret wrote:
> 
> XML supports three types of multi-valued attributes: IDREFS, NMTOKENS, and
> ENTITIES.  Is the order of the values in a multi-valued attribute
> significant?  I can't find anything that says one way or the other in the
> spec.

In general there are very few answers in the spec. about what is
"significant" or not. Where does it say that element order is significant?
That's what the infoset group is all about.

I would say that everything should be presumed significant until the
infoset group says otherwise.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"The culture we are living in becomes an ever-wider sewer."
	- Paul Weyrich, the man who gave the Moral Majority its name

"Only someone attached to an irrecoverable past, and therefore hostile 
to change as such, could react so negatively toward a culture that 
is doing all right by any reasonable measure."  
	- http://www.salonmagazine.com/col/




From simonstl at simonstl.com  Mon Mar 15 17:50:45 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:00 2004
Subject: ModSAX: Proposed Core Features (heretical?)
In-Reply-To: <006601be6ddd$63557ee0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <199903151749.MAA17211@hesketh.net>

While I was at XTech, I had an interesting discussion with a database
developer (whose name I unfortunately have forgotten) who was, well, rather
irate about the existence of the internal subset and its ability to make
schemas and validation fairly useless to application development and
interchange. 

Basically, he wanted the ability to check the document structure without
the internal subset, so he could rely on the validation process to make
certain that documents conformed to an 'official' DTD, without extra junk
some twerpy developer put in the internal subset to make his own version
valid if not official.

Without the ability to turn off the internal subset, validation is a pretty
weak gatekeeper for information management.  The worst cases involve DTDs
that use ANY to accommodate officially approved extensibility, but which can
be effectively subverted by anyone with a basic knowledge of XML DTDs.
(Even with this ability, we all know that validation doesn't check
everything.)  Yes, I know that using ANY is asking for trouble, but I also
suspect that it does have real uses.

It may not rank as a core feature, but I suspect in the near future we'll
be seeing:

http://xml.org/sax/features/no-internal-subset

though it may not appear under xml.org as a core feature, of course.
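
As a sketch of how an application might probe for such a feature: the code below uses the `XMLReader.setFeature` call from SAX2 as it later shipped in the JDK, which is an assumption relative to the ModSAX drafts under discussion. The `no-internal-subset` URI is the hypothetical one above, and a parser that does not know it is expected to signal that with `SAXNotRecognizedException`.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class FeatureProbe {
    // Hypothetical feature URI from the message above; not a standard SAX feature.
    static final String NO_INTERNAL_SUBSET =
        "http://xml.org/sax/features/no-internal-subset";

    /** Returns true if the parser accepted the feature setting, false otherwise. */
    static boolean trySetFeature(String featureUri, boolean value) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setFeature(featureUri, value);
            return true;
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return false; // the parser does not know, or cannot change, this feature
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(trySetFeature("http://xml.org/sax/features/namespaces", true));
        System.out.println(trySetFeature(NO_INTERNAL_SUBSET, true));
    }
}
```

An application that depends on the gatekeeping described here would refuse to proceed when the second call returns false, rather than silently validating with the internal subset still in play.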

(Maybe the internal subset will disappear in the next round of schemas,
maybe not...)

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From dave at userland.com  Mon Mar 15 17:57:11 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun  7 17:10:00 2004
Subject: Scripting News supports RSS
In-Reply-To: <389DA7CB46CFD111A0D100600836AD65E66B7D@msxmar1>
Message-ID: <4.2.0.25.19990315095216.00c07280@mail.userland.com>

RSS stands for RDF Site Summary; it's a new format being promoted by Netscape.

As far as I know, Scripting News is the first site to support RSS:

http://news.userland.com/scriptingNewsToRss.xml

We'll be basing several interesting new services on this format. It's not 
the ideal syndication format, but it's usable. Also, if you have a site 
that's supporting RSS, please send me a private email. We're interested in 
gathering links to sites that support RSS. Thanks!

Dave Winer
UserLand Software



From b.laforge at jxml.com  Mon Mar 15 19:15:00 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:00 2004
Subject: ModSAX: Proposed Core Features (heretical?)
Message-ID: <002201be6f18$99252840$46026982@thing1>

From: Simon St.Laurent <simonstl@simonstl.com>
>Basically, he wanted the ability to check the document structure without
>the internal subset, so he could rely on the validation process to make
>certain that documents conformed to an 'official' DTD, without extra junk
>some twerpy developer put in the internal subset to make his own version
>valid if not official.

But even given that an 'official' DTD was used, there is a question as to
WHICH official DTD was used. I see several problems with relying on
an unaugmented SAX parser for validation of data being input to an application:

1. DTD-driven validation is rarely complete enough--there will always be
    something critical that the application needs to validate. Fortunately,
    SAX supports parse exceptions in all the right places, with full information
    available on where in the document the error occurred.

2. If the application is going to depend on the parser for some of the validation
    (a real boon to the application programmer), then the application needs
    to be informed by the parser as to which DTD or other schema was used.

    Having the document specify this information in a PI or by some other means is
    not sufficient unless that information is somehow compared to the DTD 
    actually used by the parser.

3. As mentioned by Simon, allowing an author to change a DTD makes no
    sense at all in terms of providing a validation service for the application.

4. When filters are placed between the parser and the application, validation is
    best done in the last filter, rather than prior to the transformations performed
    by those filters. Validation by the parser in this case may produce clearer
    error messages, but validation of the transformed data provides the application
    with a greater assurance that its data will be in the expected form.

My belief here is that it is perhaps best to abandon validation by the
parser-kernel and instead use filters which support the validation needs of
the application. Errors so detected may be due to a poorly constructed
document, but may also be due to constraints imposed by a particular
application. This of course raises the question of how the response to these
two different types of errors should differ. I can understand the desire to
make such a distinction, but I have not yet come to appreciate the need for it.
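
A rough sketch of validating in a filter rather than in the parser kernel
might look like the following. It uses Python's xml.sax module purely as an
illustration, not ModSAX itself; the class RequireAttributeFilter, the helper
check(), and the rule that every <item> needs a "sku" attribute are all
invented for the example.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class RequireAttributeFilter(XMLFilterBase):
    """Hypothetical application-level check placed after the parser."""

    def __init__(self, parent, element, attribute):
        super().__init__(parent)
        self._element = element
        self._attribute = attribute
        self._locator = None

    def setDocumentLocator(self, locator):
        # Keep the locator so errors can report where they occurred,
        # then pass it on to whatever handler sits downstream.
        self._locator = locator
        super().setDocumentLocator(locator)

    def startElement(self, name, attrs):
        if name == self._element and self._attribute not in attrs:
            raise xml.sax.SAXParseException(
                "<%s> is missing required attribute '%s'"
                % (name, self._attribute),
                None,
                self._locator,
            )
        super().startElement(name, attrs)


def check(data):
    """Return None if the document passes, else the SAXParseException."""
    reader = RequireAttributeFilter(xml.sax.make_parser(), "item", "sku")
    try:
        reader.parse(io.BytesIO(data))
    except xml.sax.SAXParseException as exc:
        return exc
    return None
```

The exception carries the locator, so the caller can ask it for a line and
column in the usual way. Whether such an error should be treated differently
from a well-formedness error is exactly the open question above.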

Bill


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From david at megginson.com  Mon Mar 15 19:43:46 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:00 2004
Subject: SAX: Name Vote
In-Reply-To: <00ed01be6d62$fb0cdf60$c8a8a8c0@thing1>
References: <00ed01be6d62$fb0cdf60$c8a8a8c0@thing1>
Message-ID: <14061.23058.872982.792773@localhost.localdomain>

NOTE: **PLEASE** reply by private e-mail

What: Vote for the name of the next generation of SAX
When: Until midnight (EST) Wednesday 17 March 1999
Where: By private e-mail to david@megginson.com

Bill la Forge writes:

 > Should we just stick with ModSAX/ModParser?
 > Should we change to OpenSAX/OpenParser?

OK, I'm going to act like a W3C committee chair instead of a hacker,
and decree that this is the question before us:

[Q] Which of the following namesets shall the next version of SAX use?

  a. ModSAX, ModParser, ModHandler, etc.
  b. OpenSAX, OpenParser, OpenHandler, etc.
  c. SAX2, Parser2, Handler2, etc.

Please vote clearly for only one of the alternatives -- ambiguous
votes will not be counted.

I will accept all e-mail votes up to midnight (EST) Wednesday 17 March
1999, will attempt to eliminate duplicates, then will tabulate and
post the results, including the name, address, and vote of each
participant.  

If none of the alternatives wins 50% + 1 of the votes, I will remove
the alternative with the lowest number of votes and then will repeat
this process with the remaining two alternatives.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar 15 20:01:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:00 2004
Subject: What a tangled web!!! XML and related specs
In-Reply-To: <389DA7CB46CFD111A0D100600836AD65E66B7D@msxmar1>
References: <389DA7CB46CFD111A0D100600836AD65E66B7D@msxmar1>
Message-ID: <14061.25755.212595.40@localhost.localdomain>

Serrat Jaime - jasr writes:

 > As a relative newbie to XML, I've been wading through the XML 1.0
 > spec and related documents (notes, recommendations, etc.), mostly
 > at the W3C site.  Boy, am I confused!

Take it slow.  Like IP, XML is fairly simple; like IP, XML has a lot
of other stuff built on top of it that you can use if and only if you
want to.

Oh, yeah -- please ignore the notes unless you're doing cutting-edge,
speculative research.

 > I'm not only confused by the specific detail contents of the
 > respective documents, but perhaps more importantly, by the
 > *relationship* of the various docs to the others.  Does RDF extend
 > or replace DTD as specified in XML 1.0?

RDF is an XML-based format for a specific domain, metadata exchange.
It often makes sense for specific domains to have their own schema
formats, since an XML 1.0 DTD covers only basic structure and does so
in a very generic and low-level way.

 > Is SOX an alternative to both DTD and RDF?  What about Namespace
 > and DCD? 

SOX is just a note right now, and a member submission at that --
unless you're doing cutting-edge experimental research or planning to
write your own spec, it's best to ignore member submissions and wait
for actual recommendations.  Everyone is sending in XML-related
submissions these days, and most of them will die unimplemented (I
make no specific comment on SOX, positive or negative, but simply on
member submissions in general).

 > For that matter, what about the recent XML Schema Requirements
 > Note?  I guess I'm hoping someone will provide a spec roadmap.

Only for the XML Schema work itself.

Here's what you really have to know:

1. XML 1.0

2. Namespaces in XML, because it is used as a foundation by several
   other specs like RDF and XSL.

Here's what you might want to learn, depending on your requirements:

3. Document Object Model (DOM level one core), if you need a
   tree-based programming API for XML.

4. Simple API for XML (SAX 1.0, non-W3C), if you need an event-based
   programming API for XML.

5. RDF, if you need to exchange metadata.

Feel free to ignore everything else for now -- work on things like XSL
or XPointer is promising, but it's still far from the stable
recommendation stage, and in general lacks production-quality tool
support (so does RDF, mostly, but that's another sad story).  Other
specs cover specific document types, like XHTML, and you can ignore
those unless you need them.
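
To make the difference between items 3 and 4 concrete, here is a small
sketch that counts <item> elements both ways. It uses Python's standard
library (xml.dom.minidom and xml.sax) only as a convenient stand-in for the
DOM and SAX APIs; the sample document and the function names are made up.

```python
import xml.sax
from xml.dom import minidom
from xml.sax.handler import ContentHandler

SAMPLE = b"<list><item>a</item><item>b</item><note/></list>"


def count_items_dom(data):
    # Tree-based: build the whole document in memory, then query it.
    document = minidom.parseString(data)
    return len(document.getElementsByTagName("item"))


class ItemCounter(ContentHandler):
    # Event-based: react to each start tag as the parser reports it.
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "item":
            self.count += 1


def count_items_sax(data):
    counter = ItemCounter()
    xml.sax.parseString(data, counter)
    return counter.count
```

Both functions give the same answer; the tree version keeps the whole
document around for further queries, while the event version keeps only
the running count.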


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar 15 20:03:50 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:00 2004
Subject: ModSAX: Proposed Core Features (heretical?)
In-Reply-To: <199903151749.MAA17211@hesketh.net>
References: <006601be6ddd$63557ee0$0300000a@othniel.cygnus.uwa.edu.au>
	<199903151749.MAA17211@hesketh.net>
Message-ID: <14061.26423.987574.830095@localhost.localdomain>

Simon St.Laurent writes:

 > It may not rank as a core feature, but I suspect in the near future we'll
 > be seeing:
 > 
 > http://xml.org/sax/features/no-internal-subset
 > 
 > though it may not appear under xml.org as a core feature, of course.

We cannot jump the gun -- if new XML 1.1 work is chartered, I'm
certain that it will give rise to several new core features similar to
this one (if not to this one specifically).  For now, it's not
conformant to do DTD validation while ignoring the internal subset.

For now, the earth is still the centre of the universe and I'd suggest
keeping your eye away from that telescope thingy.
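
In the meantime, an application can at least probe whether a parser knows a
feature before relying on it. A minimal sketch, assuming Python's xml.sax
with its default expat-based reader; the no-internal-subset URI is only the
name floated in this thread, not a defined feature, and try_feature() is a
made-up helper:

```python
import xml.sax
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException

# Proposed name from the discussion above; no parser is known to define it.
PROPOSED = "http://xml.org/sax/features/no-internal-subset"


def try_feature(parser, name, state):
    """Attempt to set a feature and report what the parser said."""
    try:
        parser.setFeature(name, state)
    except SAXNotRecognizedException:
        return "not recognized"
    except SAXNotSupportedException:
        return "not supported"
    return "set"


outcome = try_feature(xml.sax.make_parser(), PROPOSED, True)
```

A parser that has never heard of the feature should refuse it rather than
silently ignore the request, which is what lets the application fall back
to its own checks.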


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From simonstl at simonstl.com  Mon Mar 15 20:11:10 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:01 2004
Subject: ModSAX: Proposed Core Features (heretical?)
In-Reply-To: <002201be6f18$99252840$46026982@thing1>
Message-ID: <199903152010.PAA26382@hesketh.net>

At 02:18 PM 3/15/99 -0500, Bill la Forge wrote:
>From: Simon St.Laurent <simonstl@simonstl.com>
>>Basically, he wanted the ability to check the document structure without
>>the internal subset, so he could rely on the validation process to make
>>certain that documents conformed to an 'official' DTD, without extra junk
>>some twerpy developer put in the internal subset to make his own version
>>valid if not official.
>
>But even given that an 'official' DTD was used, there is a question as to
>WHICH official DTD was used. 
>
>[...much good stuff...]
>
>My belief here is that it is perhaps best to abandon validation by the
>parser-kernel and instead use filters which support the validation needs
>of the application. Errors so detected may be because of a poorly
>constructed document, but may also be due to constraints imposed by a
>particular application.

I like the filters approach very much.  Perhaps it might be possible to
implement that approach, and control it through options set with ModSAX -
among other things, being able to order the parser to use a particular
DTD/schema to process a document would get this poor DB admin out of his
nightmare.  Hmmm....

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From simonstl at simonstl.com  Mon Mar 15 20:39:07 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:01 2004
Subject: Scripting News supports RSS
In-Reply-To: <4.2.0.25.19990315095216.00c07280@mail.userland.com>
References: <389DA7CB46CFD111A0D100600836AD65E66B7D@msxmar1>
Message-ID: <199903152038.PAA26965@hesketh.net>

At 09:55 AM 3/15/99 -0800, Dave Winer wrote:
>RSS stands for RDF Site Summary, it's a new format being promoted by
>Netscape.
>
>As far as I know, Scripting News is the first site to support RSS:
>
>http://news.userland.com/scriptingNewsToRss.xml
>
>We'll be basing several interesting new services on this format. It's not 
>the ideal syndication format, but it's usable. Also, if you have a site 
>that's supporting RSS, please send me a private email. We're interested in 
>gathering links to sites that support RSS. Thanks!

Er... where do I find more on RDF Site Summary?  Altavista came up blank,
and the Netscape and Mozilla sites brought up false matches.  The links at
www.scripting.com take me to a news story, but no connection to an RSS spec.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From dave at userland.com  Mon Mar 15 20:43:28 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun  7 17:10:01 2004
Subject: Scripting News supports RSS
In-Reply-To: <199903152038.PAA26965@hesketh.net>
References: <4.2.0.25.19990315095216.00c07280@mail.userland.com>
 <389DA7CB46CFD111A0D100600836AD65E66B7D@msxmar1>
Message-ID: <4.2.0.25.19990315124143.00c0c4a0@mail.userland.com>

Simon, this is what they gave us to work with:

http://nirvana.userland.com/misc/netscapeStuff/My%20Netscape%20Network%20Help.htm

Not really a spec, but we were able to get the job done.

Dave



From jtauber at jtauber.com  Mon Mar 15 20:47:08 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:01 2004
Subject: Multi-valued attributes
Message-ID: <00b901be6f24$894ec820$0300000a@othniel.cygnus.uwa.edu.au>

>In general there are very few answers in the spec. about what is
>"significant" or not. Where does it say that element order is significant?
>That's what the infoset group is all about.
>
>I would say that everything should be presumed significant until the
>infoset group says otherwise.


This also relates to the XML Canonicalization work. The order is significant
if it is preserved under canonicalization.
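
As a rough illustration of that test: the sketch below uses the
canonicalize() function from Python's xml.etree.ElementTree (Python 3.8 or
later). That function implements the much later C14N 2.0 rules, not the
1999 canonicalization draft, so treat it as an assumption about where the
work would end up rather than a statement of it.

```python
from xml.etree.ElementTree import canonicalize

# Same attributes in a different order: canonical forms should match,
# so attribute order is not significant under this test.
attribute_order_lost = (
    canonicalize('<a x="1" y="2"><b/><c/></a>')
    == canonicalize('<a y="2" x="1"><b/><c/></a>')
)

# Same child elements in a different order: canonical forms should differ,
# so element order is significant under this test.
element_order_kept = (
    canonicalize("<a><b/><c/></a>")
    != canonicalize("<a><c/><b/></a>")
)
```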

James
(not speaking for the canonicalization work)




From jes at kuantech.com  Mon Mar 15 20:49:37 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:10:01 2004
Subject: Scripting News supports RSS
In-Reply-To: <199903152038.PAA26965@hesketh.net>
Message-ID: <000501be6f25$0a24a780$5118a8c0@kuantech1.quokka.com>

devedge.netscape.com itself has only a skeletal "RDF = motherhood + apple pie" XML section.

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
Simon St.Laurent
Sent: Monday, March 15, 1999 12:42 PM
To: Dave Winer; xml-dev@ic.ac.uk
Subject: Re: Scripting News supports RSS


At 09:55 AM 3/15/99 -0800, Dave Winer wrote:
>RSS stands for RDF Site Summary, it's a new format being promoted by
>Netscape.
>
>As far as I know, Scripting News is the first site to support RSS:
>
>http://news.userland.com/scriptingNewsToRss.xml
>
>We'll be basing several interesting new services on this format. It's not 
>the ideal syndication format, but it's usable. Also, if you have a site 
>that's supporting RSS, please send me a private email. We're interested in 
>gathering links to sites that support RSS. Thanks!

Er... where do I find more on RDF Site Summary?  Altavista came up blank,
and the Netscape and Mozilla sites brought up false matches.  The links at
www.scripting.com take me to a news story, but no connection to an RSS spec.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From b.laforge at jxml.com  Mon Mar 15 21:07:28 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:01 2004
Subject: ModSAX: Proposed Core Features (heretical?)
Message-ID: <001c01be6f28$62370320$46026982@thing1>

>For now, the earth is still the centre of the universe and I'd suggest
>keeping your eye away from that telescope thingy.


I feel like a two-sided heretic (a really bad coin).

1. I am a big believer in document-centric persistence. Tying data to
    a specific implementation or even to a particular application is
    insane. Data should conform to a standard.

2. I also believe in application-centric validation. DTDs and the like may
    be helpful for document composition, but validation of input is the 
    domain of the application. For a given document type (universal
    element name aka namespaces), the application should dictate
    (a) if it can process that document type and (b) how that document
    type should be validated.

For me, DTDs should be applied against the output of an application,
not its input.

I live in a binary star system and have three eyes. Earth? What is earth?

B)




From b.laforge at jxml.com  Mon Mar 15 21:10:21 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:01 2004
Subject: ModSAX: Proposed Core Features (heretical?)
Message-ID: <002301be6f28$c503fb20$46026982@thing1>

From: Simon St.Laurent <simonstl@simonstl.com>
>I like the filters approach very much.  Perhaps it might be possible to
>implement that approach, and control it through options set with ModSAX -
>among other things, being able to order the parser to use a particular
>DTD/schema to process a document would get this poor DB admin out of his
>nightmare.  Hmmm....


Remember that some applications might want to process more than one
type of document. Validation and processing may vary depending on
document type, so long as the application (or its associated filter complex)
gets to decide which document types are valid and how they are to be 
validated/processed. In terms of events, this would require a document
router of some kind.
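
A document router in terms of events could be sketched along these lines.
This is an illustration in Python's xml.sax, not a ModSAX design; the class
name DocumentRouter and the idea of keying the routes on the root element
name are assumptions made for the example.

```python
import xml.sax
from xml.sax import SAXException
from xml.sax.handler import ContentHandler


class DocumentRouter(ContentHandler):
    """Forward a document's events to a handler chosen by its root element."""

    def __init__(self, routes):
        super().__init__()
        self._routes = routes  # maps root element name -> ContentHandler
        self._target = None

    def startDocument(self):
        # The document type is unknown until the root element arrives.
        self._target = None

    def startElement(self, name, attrs):
        if self._target is None:
            if name not in self._routes:
                raise SAXException(
                    "no handler registered for document type %r" % name)
            self._target = self._routes[name]
            self._target.startDocument()
        self._target.startElement(name, attrs)

    def endElement(self, name):
        self._target.endElement(name)

    def characters(self, content):
        if self._target is not None:
            self._target.characters(content)

    def endDocument(self):
        if self._target is not None:
            self._target.endDocument()
```

Each registered handler can then be the head of its own filter complex,
with whatever validation that document type calls for; a document type the
application did not register is rejected at the root element.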

Bill




From donpark at quake.net  Mon Mar 15 21:17:20 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:01 2004
Subject: Are we ready to resolve the Mod??SAX name?
Message-ID: <002a01be6f29$19f4abc0$2ee044c6@arcot-main>

I also vote no to package renaming.  While it works, it causes confusion in
practice and in discussions.

Don Park
Docuverse

-----Original Message-----
From: Ronald Bourret <rbourret@ito.tu-darmstadt.de>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Monday, March 15, 1999 8:34 AM
Subject: RE: Are we ready to resolve the Mod??SAX name?


>John Cowan wrote:
>
>> Bill la Forge wrote:
>>
>> > This would force the use of fully qualified names and create
>> > name conflicts for programmers used to simply importing
>> > the classes being used.
>>
>> I suppose if you use "import org.xml.sax.*;" then you have
>> a problem, yes, because you can't just add "import org.xml.opensax.*;"
>> to it.  But if you import individual class names it's no problem:
>> if you use opensax.Parser, you never need to refer to sax.Parser
>> any more, since it is subsumed.
>
>Without even considering the technical merit, this strikes me as a good way
>to encourage unsuspecting programmers to introduce errors.  I vote no.
>
>-- Ron Bourret
>
>




From larsga at ifi.uio.no  Mon Mar 15 21:57:36 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:01 2004
Subject: SAX RFD: ModSAX Predefined Features
In-Reply-To: <14054.34184.693965.347827@localhost.localdomain>
References: <14051.3215.196642.22571@localhost.localdomain> 	<36E4C4E6.B51DDFF3@eng.sun.com> 	<wkiuc9v9ck.fsf@ifi.uio.no> <14054.34184.693965.347827@localhost.localdomain>
Message-ID: <wkemmqs9gc.fsf@ifi.uio.no>


* David Megginson
| 
| It would probably make more sense for the promoters of different
| catalogue formats to define their own properties and/or features, such 
| as
| 
|   http://www.oasis.org/sax/features/entity-catalog

I originally envisioned leaving the separation of the different kinds
of catalogs to parsers (xmlproc supports both kinds), but on
reflection I think you're right.

John, do you plan to add a section 7 to the XCatalog proposal defining
SAX features and parameters for the Socat subset and for XCatalogs? If
the answer is yes, then I'm happy.

--Lars M.




From jasr at im.se  Mon Mar 15 22:02:21 1999
From: jasr at im.se (Serrat Jaime - jasr)
Date: Mon Jun  7 17:10:01 2004
Subject: What a tangled web!!! XML and related specs
Message-ID: <389DA7CB46CFD111A0D100600836AD65E66B84@msxmar1>

David,

Thanks for your prompt and useful reply.  At the risk of changing from
'newbie' to 'PEST', how about a follow-up question?

I will ignore Notes, as you suggest, since I am NOT doing cutting-edge stuff,
and I will skip SAX too (I don't need event-based processing).  I do,
however, want to exchange metadata, so in addition to the base XML spec, it
appears that I need to be familiar with Namespaces, DOM level 1 (which
apparently does NOT support Namespaces; coming in DOM 2, maybe?) and RDF.
If that's right, it's a reasonable enough roadmap for the time being.

But I'm still left wondering about the schema related proposals.  NOTE or
not, no pun, did the SOX submitters intend it to supersede DTD?  Was RDF
(and DCD?) meant to *extend* DTD?  I guess I'm wondering about the
*direction* of the schema proposals, without fully understanding the details
in them, nor the players involved.

-- jaime "jim" serrat

Technical Manager of PDA08 - EDI Engine	Office: 	+1 609-797-3227
Product Development				Fax:	+1 609-797-6660
Industri-Matematik				Mobile:	+1 609-315-3338
Five Greentree Centre				Web:	http://www.im.se
Marlton, NJ 08053				Email:	jasr@im.se


> -----Original Message-----
> From:	David Megginson [SMTP:david@megginson.com]
> Sent:	Monday, March 15, 1999 3:01 PM
> To:	Xml-Dev (E-mail)
> Subject:	What a tangled web!!! XML and related specs
> 
> Serrat Jaime - jasr writes:
> 
>  > As a relative newbie to XML, I've been wading through the XML 1.0
>  > spec and related documents (notes, recommendations, etc.), mostly
>  > at the W3C site.  Boy, am I confused!
> 
> Take it slow.  Like IP, XML is fairly simple; like IP, XML has a lot
> of other stuff built on top of it that you can use if and only if you
> want to.
> 
> Oh, yeah -- please ignore the notes unless you're doing cutting-edge,
> speculative research.
> 
>  > I'm not only confused by the specific detail contents of the
>  > respective documents, but perhaps more importantly, by the
>  > *relationship* of the various docs to the others.  Does RDF extend
>  > or replace DTD as specified in XML 1.0?
> 
> RDF is an XML-based format for a specific domain, metadata exchange.
> It often makes sense for specific domains to have their own schema
> formats, since an XML 1.0 DTD covers only basic structure and does so
> in a very generic and low-level way.
> 
>  > Is SOX an alternative to both DTD and RDF?  What about Namespace
>  > and DCD? 
> 
> SOX is just a note right now, and a member submission at that --
> unless you're doing cutting-edge experimental research or planning to
> write your own spec, it's best to ignore member submissions and wait
> for actual recommendations.  Everyone is sending in XML-related
> submissions these days, and most of them will die unimplemented (I
> make no specific comment on SOX, positive or negative, but simply on
> member submissions in general).
> 
>  > For that matter, what about the recent XML Schema Requirements
>  > Note?  I guess I'm hoping someone will provide a spec roadmap.
> 
> Only for the XML Schema work itself.
> 
> Here's what you really have to know:
> 
> 1. XML 1.0
> 
> 2. Namespaces in XML, because it is used as a foundation by several
>    other specs like RDF and XSL.
> 
> Here's what you might want to learn, depending on your requirements:
> 
> 3. Document Object Model (DOM level one core), if you need a
>    tree-based programming API for XML.
> 
> 4. Simple API for XML (SAX 1.0, non-W3C), if you need an event-based
>    programming API for XML.
> 
> 5. RDF, if you need to exchange metadata.
> 
> Feel free to ignore everything else for now -- work on things like XSL
> or XPointer is promising, but it's still far from the stable
> recommendation stage, and in general lacks production-quality tool
> support (so does RDF, mostly, but that's another sad story).  Other
> specs cover specific document types, like XHTML, and you can ignore
> those unless you need them.
> 
> 
> All the best,
> 
> 
> David
> 
> -- 
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 



From larsga at ifi.uio.no  Mon Mar 15 22:14:54 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:01 2004
Subject: ModSAX: Proposed Core Properties
In-Reply-To: <14053.50863.546824.628181@localhost.localdomain>
References: <14053.50863.546824.628181@localhost.localdomain>
Message-ID: <wkd82as8ng.fsf@ifi.uio.no>


* David Megginson
|
| http://xml.org/sax/properties/namespace-sep <String> (write-only)

Hmmm. Why should this be write-only? One can easily imagine situations
where it is desirable to be able to find out what the separator is
after events have passed through a stack of filters, any of which may
have modified it.

--Lars M.




From cowan at locke.ccil.org  Mon Mar 15 22:23:56 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:01 2004
Subject: SAX RFD: ModSAX Predefined Features
References: <14051.3215.196642.22571@localhost.localdomain> 	<36E4C4E6.B51DDFF3@eng.sun.com> 	<wkiuc9v9ck.fsf@ifi.uio.no> <14054.34184.693965.347827@localhost.localdomain> <wkemmqs9gc.fsf@ifi.uio.no>
Message-ID: <36ED882D.1864D1FE@locke.ccil.org>

Lars Marius Garshol wrote:

> John, do you plan to add a section 7 to the XCatalog proposal defining
> SAX features and parameters for the Socat subset and for XCatalogs? If
> the answer is yes, then I'm happy.

As of now, I no longer support (the use of) the XML syntax for
XCatalogs, because it involves parsing the XCatalog in order to
be able to parse the document, and may lead to useless recursions.

I think it's more important to get Socat support revved
up, and one of my back-burner efforts (any volunteers to take it
over?  The Java code *almost* works) is a SocatResolver that
implements org.xml.sax.EntityResolver.

My notion is that catalog support should be highly pluggable, and
if you want Socat support, you just install it using existing
SAX calls.
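
As a rough illustration of what "install it using existing SAX calls" can
look like, here is a sketch in Python's xml.sax rather than the Java
SocatResolver mentioned above. CatalogResolver is a made-up class, and it
only understands PUBLIC entries of the form PUBLIC "pubid" "sysid", one per
line; comments and the other catalog keywords are deliberately left out.

```python
import shlex
import xml.sax
from xml.sax.handler import EntityResolver


class CatalogResolver(EntityResolver):
    """Map public identifiers to local system identifiers."""

    def __init__(self, catalog_text):
        self._public = {}
        for line in catalog_text.splitlines():
            fields = shlex.split(line)  # honours the quoted identifiers
            if len(fields) == 3 and fields[0].upper() == "PUBLIC":
                self._public[fields[1]] = fields[2]

    def resolveEntity(self, publicId, systemId):
        # Fall back to the document's own system identifier if unmapped.
        return self._public.get(publicId, systemId)


CATALOG = 'PUBLIC "-//Example//DTD Order//EN" "dtds/order.dtd"\n'

resolver = CatalogResolver(CATALOG)
parser = xml.sax.make_parser()
parser.setEntityResolver(resolver)  # the existing SAX hook; nothing new
```

The parser needs no catalog-specific feature or property for this; swapping
in a different catalog format means swapping in a different resolver.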

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From cowan at locke.ccil.org  Mon Mar 15 22:26:05 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:01 2004
Subject: What a tangled web!!! XML and related specs
References: <389DA7CB46CFD111A0D100600836AD65E66B84@msxmar1>
Message-ID: <36ED88D5.8D0044D5@locke.ccil.org>

Serrat Jaime - jasr wrote:

> But I'm still left wondering about the schema related proposals.  NOTE or
> not, no pun, did the SOX submitters intend it to supersede DTD?  Was RDF
> (and DCD?) meant to *extend* DTD? 

RDF Schemas are totally separate from the other schema proposals and
have nothing to do with them.  They are schemas that describe metadata
classes and properties, and have nothing to do with elements and
attributes.  (They are *expressed* using elements and attributes,
but that's another thing.)

The current schema proposals on the table are XML-Data (in IE5,
but probably moribund), DCD, SOX, and DDML.  None of them is
official anything.  They support various subsets and supersets of
DTDs.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From Marc.McDonald at Design-Intelligence.com  Mon Mar 15 22:45:10 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:10:01 2004
Subject: FW: Namespaces and DTDs
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990315224417Z-379@master.design-intelligence.com>

I mean representation in the sense of some data structure representing 
the document. Perhaps using a book as the example gave the wrong 
connotation. If it were some kind of purchase order, the application 
that is using the parser would create some representation of the 
information - typically a tree of objects representing the elements, 
but not required to be so.

I look at it as not only using XML to represent documents that are 
rendered via a style sheet, but also information or data in the more 
general sense, which may be delivered to a specific application. Moving 
the information from an XML tree in DOM to whatever form the 
application uses for the information represents another 'parser'.
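
To make that second 'parser' step concrete, here is a minimal Python
sketch, offered only as an illustration. It assumes the book example
quoted below; the Book and Chapter classes, the book_from_dom helper
and the sample paragraph text are all hypothetical names and data
invented for this example. Only xml.dom.minidom is a real library.

```python
from dataclasses import dataclass, field
from xml.dom.minidom import parseString


@dataclass
class Chapter:
    # Hypothetical application class: one chapter holds paragraph strings.
    paragraphs: list = field(default_factory=list)


@dataclass
class Book:
    # Hypothetical application class: a title plus a list of Chapter objects.
    title: str
    chapters: list = field(default_factory=list)


def element_text(node):
    """Concatenate the text nodes directly under an element."""
    return "".join(c.data for c in node.childNodes if c.nodeType == c.TEXT_NODE)


def book_from_dom(document):
    """Pick the DOM tree apart into Book/Chapter objects."""
    root = document.documentElement
    book = Book(title=root.getAttribute("title"))
    for chapter_el in root.getElementsByTagName("chapter"):
        chapter = Chapter()
        for para_el in chapter_el.getElementsByTagName("para"):
            chapter.paragraphs.append(element_text(para_el))
        book.chapters.append(chapter)
    return book


# Made-up sample data in the shape of the quoted book example.
SAMPLE = """<book title="tale of 2 cities">
  <chapter><para>It was the best of times.</para></chapter>
  <chapter><para>A second chapter.</para><para>More text.</para></chapter>
</book>"""

book = book_from_dom(parseString(SAMPLE))
print(book.title, [len(c.paragraphs) for c in book.chapters])
```

Note that the shape of Book and Chapter repeats by hand what a DTD
for this document would already declare, which is the duplication
being discussed.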

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  Didier PH Martin [SMTP:martind@netfolder.com]
Sent:  Friday, March 12, 1999 8:38 PM
To:  Marc McDonald; cbullard@hiwaay.net
Cc:  xml-dev@ic.ac.uk
Subject:  RE: FW: Namespaces and DTDs

Hi Marc

<YourComment>
It's quite true that you can have XML that does not require validation 
and that this is commonly done. An exception is the defaulting of the 
value of any attributes of elements in a DTD, which has been mentioned 
in another reply.

You can construct a DOM without validation, but the next step ends up 
being a procedural implementation of picking apart the DOM document
tree to construct whatever structure the application using DOM
requires to interpret the document.

I can parse:
  <book title="tale of 2 cities">
    <chapter>
      <para>..</para>
    </chapter>
    <chapter>
        ...
    </chapter>
      ...
  </book>
without a DTD.

But if my application needs to get the information out of the DOM I
need to write code to:
  Create a representation for Book consisting of a title and chapters 
and get book from DOM
  Create a representation for each Chapter and get Chapters from DOM
  Create a representation for each paragraph in a chapter and get
paragraphs from DOM.
Part of this is what is expressed in the DTD. Wouldn't it be better if 
a system were created that used the DTD on the receiving end to create 
the application representation instead of serializing it back into
elements and constructing a new tree?
</YourComment>

<Reply>
a) what do you mean by "a representation"? Is it a rendition object?
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From Marc.McDonald at Design-Intelligence.com  Mon Mar 15 22:57:27 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:10:02 2004
Subject: ModSAX: Proposed Core Features (heretical?)
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990315225622Z-384@master.design-intelligence.com>

I would suggest that the application should specify the DTD that the 
document is parsed by. After all, the document is supposed to conform 
to the application so you shouldn't have the document define what it 
means to conform (i.e. the DTD to use).

Perhaps extend SAX so that there is an API to specify a DTD to 
override the document's?

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  Bill la Forge [SMTP:b.laforge@jxml.com]
Sent:  Monday, March 15, 1999 11:18 AM
To:  Simon St.Laurent; XML-Dev Mailing list
Subject:  Re: ModSAX: Proposed Core Features (heretical?)

From: Simon St.Laurent <simonstl@simonstl.com>
>Basically, he wanted the ability to check the document structure without
>the internal subset, so he could rely on the validation process to make
>certain that documents conformed to an 'official' DTD, without extra junk
>some twerpy developer put in the internal subset to make his own version
>valid if not official.

But even given that an 'official' DTD was used, there is a question as to
WHICH official DTD was used. I see several problems with relying on
an unaugmented SAX parser for validation of data being input to an
application:

1. DTD-driven validation is rarely complete enough--there will always be
    something critical that the application needs to validate. Fortunately,
    SAX supports parse exceptions in all the right places, with full
    information available on where in the document the error occurred.

2. If the application is going to depend on the parser for some of the
    validation (a real boon to the application programmer), then the
    application needs to be informed by the parser as to which DTD or
    other schema was used.

    Having the document specify this information in a PI or by some other
    means is not sufficient unless that information is somehow compared
    to the DTD actually used by the parser.

3. As mentioned by Simon, allowing an author to change a DTD makes no
    sense at all in terms of providing a validation service for the
    application.

4. When filters are placed between the parser and the application,
    validation is best done in the last filter, rather than prior to the
    transformations performed by those filters. Validation by the parser
    in this case may produce clearer error messages, but validation of
    the transformed data provides the application with a greater
    assurance that its data will be in the expected form.

My belief here is that it is perhaps best to abandon validation by the
parser-kernel and instead use filters which support the validation needs
of the application. Errors so detected may be because of a poorly
constructed document, but may also be due to constraints imposed by a
particular application. This of course raises the question of how the
response to these two different types of errors should differ. I can
understand a desire to make such a distinction, but I have not yet come
to appreciate the need to make such a distinction.
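
A rough Python sketch of validation done in a filter rather than in
the parser kernel follows. It is an illustration only, not the ModSAX
design: it uses the filter base class from Python's xml.sax package,
and the rule it enforces (every "order" element needs an "id"
attribute) along with the names RequireAttributeFilter and validate
are hypothetical, invented for this example.

```python
import io
from xml.sax import make_parser, SAXParseException
from xml.sax.saxutils import XMLFilterBase
from xml.sax.xmlreader import InputSource


class RequireAttributeFilter(XMLFilterBase):
    """Hypothetical filter: reject <order> elements that lack an 'id'."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self._locator = None

    def setDocumentLocator(self, locator):
        # Keep the locator so errors can say where in the document they are.
        self._locator = locator
        super().setDocumentLocator(locator)

    def startElement(self, name, attrs):
        if name == "order" and "id" not in attrs:
            # An application constraint reported as a parse exception,
            # carrying the position supplied by the parser.
            raise SAXParseException("order needs an id", None, self._locator)
        super().startElement(name, attrs)


def validate(xml_text):
    """Return None if the text passes the filter, else the error message."""
    reader = RequireAttributeFilter(make_parser())
    source = InputSource()
    source.setByteStream(io.BytesIO(xml_text.encode("utf-8")))
    try:
        reader.parse(source)
    except SAXParseException as exc:
        return exc.getMessage()
    return None


print(validate('<orders><order id="1"/></orders>'))
print(validate('<orders><order/></orders>'))
```

Well-formedness errors from the parser and constraint errors from the
filter both arrive here as SAXParseException, so in this sketch the
application sees no difference between the two kinds of error.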

Bill




From mrc at allette.com.au  Mon Mar 15 23:43:28 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:02 2004
Subject: Multi-valued attributes
References: <01BE6ED3.5F558EB0@grappa.ito.tu-darmstadt.de> <36ED2B19.62734CE3@locke.ccil.org>
Message-ID: <36ED9AFD.DBF40945@allette.com.au>


John Cowan wrote:

> Can SGML tribal elders report on any actual uses of IDREFS and ENTITIES?

Certainly - come closer to the fire, my son...

Actually, I can't think of a particularly good application for IDREFS, but I have used
ENTITIES quite often to emulate allowing multiple tokens from a token list. NAMES is too loose
and I may need more than one value, so a token list is out. By declaring entities and
assigning a notation that reflects their meaning, I get the best of both worlds.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From srn at techno.com  Tue Mar 16 00:00:32 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:02 2004
Subject: Generating typed code from DTDs, why not?
In-Reply-To: <zp5lji3b.fsf@javagroup.org> (message from Luke Gorrie on 11 Mar
	1999 04:49:12 +1000)
References: <zp5lji3b.fsf@javagroup.org>
Message-ID: <199903152252.QAA01076@bruno.techno.com>

[Luke Gorrie:]

> So, my question is: are there any efforts around working towards
> creating mappings from DTD or other XML type definition
> languages to various programming languages (or to other IDLs like
> OMG's), or is there some reason why this is considered a bad idea?

It's a good idea.  I suggest you read the slides
of my presentation at XTech '99, "Vocabularies: Opportunities
for Efficiency and Reliability" at
http://www.hytime.org/papers/srnXTech99/

(The text version of the slides, under the "A" button, is perfectly
adequate and much less bandwidth-intensive.)

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From jtauber at jtauber.com  Tue Mar 16 00:02:26 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:02 2004
Subject: Multi-valued attributes
Message-ID: <017701be6f3f$f0e4c5a0$0300000a@othniel.cygnus.uwa.edu.au>

John Cowan wrote:
> Can SGML tribal elders report on any actual uses of IDREFS and ENTITIES?


Marcus Carr:
>Certainly - come closer to the fire, my son...
>
>Actually, I can't think of a particularly good application for IDREFS
[...]

How about when you want to reference more than one ID from the one element?
:-)

I came close to using IDREFS for xmlsoftware.com and the only reason I
didn't was the lack of support in XSL.

xmlsoftware.com is a single XML document with two main sections, a list of
categories with descriptions and a list of software. Each category has an
ID. IDREFS could be used for associating products with one *or more*
categories:

    <Product Categories="editor browser">...</Product>
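
As a hedged illustration of what an application would do with such an
attribute, here is a small Python sketch. The sample document is
invented for this example (element names Site and Category, the Name
attribute and the descriptions are assumptions, not the real
xmlsoftware.com markup). Because no DTD is read, the sketch indexes
the ID attribute by hand instead of relying on declared ID/IDREFS
types.

```python
from xml.dom.minidom import parseString

# Hypothetical miniature of the two-section document described above.
SAMPLE = """<Site>
  <Category ID="editor">XML editors</Category>
  <Category ID="browser">XML browsers</Category>
  <Product Name="ExampleTool" Categories="editor browser"/>
</Site>"""


def categories_by_id(document):
    """Index Category elements by the value of their ID attribute."""
    index = {}
    for cat in document.getElementsByTagName("Category"):
        index[cat.getAttribute("ID")] = cat.firstChild.data
    return index


def resolve_idrefs(product, index):
    """An IDREFS value is a whitespace-separated list of IDs; look each up."""
    return [index[ref] for ref in product.getAttribute("Categories").split()]


doc = parseString(SAMPLE)
index = categories_by_id(doc)
product = doc.getElementsByTagName("Product")[0]
print(resolve_idrefs(product, index))
```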

James
(not an SGML tribal elder)




From mrc at allette.com.au  Tue Mar 16 00:48:11 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:02 2004
Subject: Multi-valued attributes
References: <017701be6f3f$f0e4c5a0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <36EDAA23.CDA721B0@allette.com.au>


James Tauber wrote:

> >Actually, I can't think of a particularly good application for IDREFS
>
> How about when you want to reference more than one ID from the one element?
> :-)

Gee, I hadn't thought of that... :-)

> I came close to using IDREFS for xmlsoftware.com and the only reason I
> didn't was the lack of support in XSL.

Come to think of it, what you say is perfectly valid. I'm a great believer in clean and
abstract structure - if a transformation was required to make the data more accessible to an
application, I would probably wear that. John was looking for actual uses though, and I still
can't provide one.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From ricko at allette.com.au  Tue Mar 16 01:22:58 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:02 2004
Subject: Lisp concrete syntax -- was: Namespaces and DTDs
Message-ID: <01cd01be6f4b$d544aa70$38f96d8c@NT.JELLIFFE.COM.AU>

 You can make LISP into SGML by short-reffing
    ( to "<sexp><car>"
    ) to "</sexp>"
allowing end-tag omission on the car element and cdr element type, 
and within a car element type, short-ref white-space to "</car><cdr>".
Depending on the kind of LISP, you may need to also handle
quotes and other delimiters, to taste.

This marks up  
    (+ 1 1)
as
    <sexp><car>+</car><cdr>1 1</cdr></sexp>
which is a nice start for parsing. (If you think car and cdr
would be confusing, use some other names.)

You could indeed continue this and get
<sexp>
    <car>+</car>
    <cdr><car>1</car><cdr>1</cdr></cdr>
</sexp>
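
The same mapping can be imitated outside an SGML parser. The Python
sketch below is only an approximation under stated assumptions: it
handles flat expressions without nested parentheses or quotes, and
the function names cons_markup and sexp_to_markup are made up for
this example. It does not use short references at all; it simply
produces the markup shown above.

```python
from xml.sax.saxutils import escape


def cons_markup(tokens):
    """Render a flat token list as nested <car>/<cdr> elements."""
    if len(tokens) == 1:
        # The last token is the content of the innermost cdr.
        return escape(tokens[0])
    head, rest = tokens[0], tokens[1:]
    return "<car>%s</car><cdr>%s</cdr>" % (escape(head), cons_markup(rest))


def sexp_to_markup(text):
    """Convert a flat s-expression such as '(+ 1 1)' to <sexp> markup."""
    text = text.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("expected a parenthesised expression")
    tokens = text[1:-1].split()
    if not tokens:
        raise ValueError("empty expressions are not handled in this sketch")
    if len(tokens) == 1:
        # A lone token is just the car; there is no cdr to open.
        return "<sexp><car>%s</car></sexp>" % escape(tokens[0])
    return "<sexp>%s</sexp>" % cons_markup(tokens)


print(sexp_to_markup("(+ 1 1)"))
```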

Rick Jelliffe




From cbullard at hiwaay.net  Tue Mar 16 01:28:41 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:02 2004
Subject: FW: Namespaces and DTDs
References: <NBBBJPGDLPIHJGEHAKBAEEANCPAA.martind@netfolder.com> <36E9D50F.3C03C24E@manhattanproject.com>
Message-ID: <36EDB31E.AFD@hiwaay.net>

Clark Evans wrote:
> 
> Well.  I thought that I had completely failed, so I left.
> Then, about two weeks later I went over to visit, (hadn't
> received any more pleas for HTML help...) and I found her
> using an editor to hand create the HTML! I was a bit
> stunned.   She said writing HTML in an editor directly
> was "easier".  She quickly added that composer is good
> too, but only to "find what I want".  She uses it to
> 'draw' what she wants, looks at the 'view source' and
> then ALT-TABBS over to the editor to do the 'real' work.

That is pretty much the way it went for SGML editors too 
until the file got very big or one dropped a right quotation 
in a literal.  Then, thank Charles for the original cheap 
SGML parser which was rewritten as SGMLS.  Some say it 
wasn't great code, but it was fast and it found the tagging 
errors.

Also observed: people think HTML is great, and for what we use it 
for, the rendering pass, it is, in much the same way its antecedents 
like DSR were.  Yes, lord love a duck, we edited DSR by hand too.  
Still, when we got to the complex content in some of the systems 
which used stylesheets instead (circa 1986 to 93), we found people 
had a much easier time with content tagging, eg, 

<part>
  <partno>

because they could look at the markup and knew precisely 
what was there. A RPSTL (parts table) was easy to 
pick out in the mass of tags. The same was true of the editors which 
were context sensitive.  That XML is replicating the SGML 
experience in the main is not surprising.  That SGML 
is slightly easier to edit isn't surprising either since 
some of the features of SGML that went away in XML, 
eg, minimization, were editing features.  Other features 
such as quantities needed in the days of precious RAM 
aren't missed as much.

For that reason, many of us writing editors for applications 
to use HTML or other markup even when using a relational system 
for storage are writing node editors instead of hardwired 
tag stackers.  Because we can use tables to store meta-properties, 
this is easy to do.

len



From cbullard at hiwaay.net  Tue Mar 16 02:06:30 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:02 2004
Subject: NON-XML:  IrishSpace On Web
References: <19990208104624.A4862@io.mds.rmit.edu.au> <370CE4A4.76D43491@prescod.net> <36BF35E7.D6FF0E5E@manhattanproject.com> <370CF891.21C0BDE9@prescod.net>
Message-ID: <36EDBC00.306D@hiwaay.net>

Normally I wouldn't break protocol by posting this to XML-DEV, 
but since it is a unique piece for the Internet, some of the 
elders of the Web are here, and since making the next generation 
of 3D languages for the Web is an XML project, I think some 
of the members might enjoy seeing the most ambitious of the 
online work done with VRML.  When this was done two years 
ago, it was impossible to put it on the Web at all.  But 
it was built with the idea that things change fast on the 
Internet and what was impossible two years ago, is just 
doable today.  Consider that this is fully audio enabled, 
animated in 3D, and runs over an hour.  It is only a hint 
of what can be done.

To those that dream, to those that build, my sincere 
thanks.  This is what we did with what you gave us.

len bullard
IrishSpace Project Coordinator


************************************************************

Hey, folks -

I've begun reshaping IrishSpace to be viewed on the Internet! I'd been
experimenting with streaming audio (Java only: no plugins or special
servers).

You can view it (if you've got a fast machine *AND* a pretty fast
Internet connection - faster than 28.8 definitely!) at:
        http://pluto.njcc.com/~paulsam/voyage/index.html
(The intro page tells you what the technical requirements are.)

For now, you can go right through from the beginning to the flybys of
Venus and Mercury.

I don't think I can get much more than that up on my ISP without paying
for more disk space. Is there a good fast server we can put it all on
over in Tralee? (I guess the whole thing will be somewhere between 10
and 15 meg.)

- Paul
***********************************



From paul at prescod.net  Tue Mar 16 02:42:20 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:02 2004
Subject: Generating typed code from DTDs, why not?
References: <zp5lji3b.fsf@javagroup.org>
Message-ID: <36EF2194.F9DA588B@prescod.net>

Luke Gorrie wrote:
> 
> I'm pretty new to XML, but as I've poked around I've observed what
> seem to be some strange things.  XML parsers all seem to provide
> interfaces which ignore the static structure information provided by
> DTDs and rely on "one fits all" interfaces to elements, in stark
> contrast to the conventions of statically typed languages.

The reason this is the case is that most document data depends on many
heterogeneous lists:

<P>This is some string data, <EMPH>This is a sub-element (new node
type)</EMPH>, this is another <STRONG>sub-element</STRONG>, and this is a
<?PROCESSING-INSTRUCTION ?>.</P>

A visitor pattern would severely complicate the flow of control in this
case.
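
To show what such a heterogeneous list looks like to a generic
handler, here is a small Python SAX sketch. It is illustrative only;
the EventLog class is a made-up name, and the sample is a shortened
version of the paragraph above.

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Record the sequence of SAX events as (kind, value) pairs."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        # A parser may deliver one run of text in several chunks,
        # so merge adjacent text events into one.
        if self.events and self.events[-1][0] == "text":
            self.events[-1] = ("text", self.events[-1][1] + content)
        else:
            self.events.append(("text", content))

    def processingInstruction(self, target, data):
        self.events.append(("pi", target))


SAMPLE = (b"<P>Some text, <EMPH>a sub-element</EMPH>, "
          b"more text <?PROCESSING-INSTRUCTION ?>.</P>")

log = EventLog()
xml.sax.parseString(SAMPLE, log)
kinds = [kind for kind, _ in log.events]
print(kinds)
```

Text, elements and a processing instruction arrive interleaved, in
document order, through one untyped interface.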

There is a subset of XML processing where document structures are
predictable enough for static type checking to be useful. I suspect that
when the W3C schema working group completes its work it will be common to
derive database schemas and IDL from the document schemas. But there are
many applications where that stuff will just get in the way.

> For instance, the first thing I played with in XML was SAX using
> Python.  I was impressed by how easily it worked and how naturally it
> fit in with a dynamically typed language like python. 

You are a wise man.

> look at the Java interface and found that it was just the same, which
> I thought very odd!  The natural mapping for SAX onto Java, to get the
> (significant) benefits of static typing, would be to generate a
> Visitor interface.  The Visitor interface would have a method for
> "visiting" each type of element in the document, and the argument to
> this method would be an object which presents the element contents
> through typed accessor methods.  At least, that's how it looks to me.

This turns out to be a fairly painful way of doing text processing. For
instance it ignores the fact that two elements can share a name but have
radically different behaviour because of their contexts. Consider the
title of a document and the title of a section within the document.

So do you need one visitor per context?
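
For comparison, a generic handler can recover the context with a
simple stack of open element names. The following Python sketch is
one possible approach under assumptions of my own: the element names
doc, section and title, the TitleCollector class and the sample
document are all invented for illustration.

```python
import xml.sax


class TitleCollector(xml.sax.ContentHandler):
    """Collect titles, keyed by the name of the enclosing element."""

    def __init__(self):
        super().__init__()
        self.stack = []      # names of the currently open elements
        self.titles = []     # (parent name, title text) pairs
        self._buffer = None  # text chunks of the title being read, if any

    def startElement(self, name, attrs):
        self.stack.append(name)
        if name == "title":
            self._buffer = []

    def characters(self, content):
        if self._buffer is not None:
            self._buffer.append(content)

    def endElement(self, name):
        self.stack.pop()
        if name == "title":
            # After the pop, the top of the stack is the title's parent.
            parent = self.stack[-1] if self.stack else None
            self.titles.append((parent, "".join(self._buffer)))
            self._buffer = None


SAMPLE = b"""<doc><title>Report</title>
<section><title>Background</title><p>Text.</p></section></doc>"""

handler = TitleCollector()
xml.sax.parseString(SAMPLE, handler)
print(handler.titles)
```

The same element name is handled by one method, and the behaviour is
chosen at run time from the parent on the stack.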

> In the case of DOM, again generating typed accessor code would provide
> these great benefits.  People could use a DTD (or similar) as the
> definition language for their abstract data types, and generate
> DOM-compliant classes which they can both use "natively" in their
> language and also manipulate as part of a genuine DOM tree at the same
> time.

On the one hand you have the goals of a language: improving on-the-wire
interoperability, minimizing redundancy, perhaps editing efficiency.

On the other hand you have the goals of an API: improving runtime
interoperability, maximizing ease of use (perhaps by providing many
redundant pointers), maximizing runtime efficiency.

These can often conflict. If HTML were designed for programming
convenience and not for authoring convenience it would be quite different.

I don't want to say your idea isn't useful: I've worked on something
similar myself in the past and it is exciting -- but it can't replace
dynamically typed processing. It can only augment it in certain
situations. I think that the reason this hasn't got more research is
because the current method works and it works for all types of XML
processing.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"The culture we are living in becomes an ever-wider sewer."
	- Paul Weyrich, of the "Moral Majority"

"Only someone attached to an irrecoverable past, and therefore hostile 
to change as such, could react so negatively toward a culture that 
is doing all right by any reasonable measure."  
	- http://www.salonmagazine.com/col/



From mookie at undef.com  Tue Mar 16 06:31:36 1999
From: mookie at undef.com (mookie@undef.com)
Date: Mon Jun  7 17:10:02 2004
Subject: managing large collections of XML docs
In-Reply-To: <36EF2194.F9DA588B@prescod.net>
Message-ID: <Pine.LNX.3.96.990315220444.18492C-100000@unagi.undef.com>


I've got around 10,000 docs (around 4k to 10k in size) to manage.  I would
like at least some basic version control.  The only thing I can think of
is CVS, but I am reluctant to use that with so many files. 

Anybody using Oracle 8i?  iFS?


-- Chris




From tbray at textuality.com  Tue Mar 16 07:44:09 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:02 2004
Subject: What a tangled web!!! XML and related specs
Message-ID: <3.0.32.19990315234139.00bde240@pop.intergate.bc.ca>

At 11:02 PM 3/15/99 +0100, Serrat Jaime - jasr wrote:
>But I'm still left wondering about the schema related proposals.  NOTE or
>not, no pun, did the SOX submitters intend it to supercede DTD?  

In answering, I take all instances of "supercede" and "extend" and 
replace them with "serve as the next generation of".  Answer: yes.

>Was RDF
>(and DCD?) meant to *extend* DTD?  

No and yes.  RDF is meant for metadata interchange.  DCD is a next-gen
DTD proposal based in part on the premise that it's useful to think of
schemas as metadata, and hence to use RDF to interchange them.

>I guess I'm wondering about the
>*direction* of the schema proposals, without fully understanding the details
>in them, nor the players involved.

Well, you left out DDML AKA XSchema.  It is fair to say that these
proposals rush madly off in all directions.  Further explication requires
understanding details and players.  But don't bother.  It is to be hoped 
that the W3C "XML Schema Working Group" process will digest this input and 
produce something usable before it's too late. -Tim




From crey at dcd.abk.nec.co.jp  Tue Mar 16 08:15:36 1999
From: crey at dcd.abk.nec.co.jp (Charlemagne L. Rey)
Date: Mon Jun  7 17:10:02 2004
Subject: some tools
Message-ID: <36EE12F8.50B74A9B@dcd.abk.nec.co.jp>

hello fellow XML listers,

I would like to know whether anyone knows of a tool, such as
a Java library or package, that helps produce an XML
document given a DTD file and table data as input.
Kindly share it with me.
I am currently trying to develop something like that,
but if it is already available on the web, why should
I recreate it? It would actually ease my job.

cheers,
--
Charlemagne L. Rey
NEC
--




From Michael.Kay at icl.com  Tue Mar 16 10:47:07 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:02 2004
Subject: managing large collections of XML docs
Message-ID: <93CB64052F94D211BC5D0010A80013310EB38A@WWMESS3.172.19.125.2>

> 
> I've got around 10,000 docs (around 4k to 10k in size) to 
> manage.  I would
> like at least some basic version control.  The only thing I 
> can think of
> is CVS, but I am reluctant to use that with so many files. 
> 
I would certainly hold these in a database. Any structure will do, just keep
each document as a "blob" with any relevant attributes (e.g. name and
version number) in other columns of the table.
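
A minimal sketch of that layout in Python follows. SQLite is my own
choice here purely because it ships with Python; no particular
database is implied by the suggestion above. The table name docs, its
column names and the helper functions are hypothetical.

```python
import sqlite3


def open_store(path=":memory:"):
    """Create (if needed) a table holding one row per document version."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS docs ("
        " name TEXT NOT NULL,"
        " version INTEGER NOT NULL,"
        " body BLOB NOT NULL,"
        " PRIMARY KEY (name, version))"
    )
    return conn


def save(conn, name, body):
    """Store a new version of a document and return its version number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM docs WHERE name = ?", (name,)
    ).fetchone()
    version = row[0] + 1
    conn.execute("INSERT INTO docs VALUES (?, ?, ?)", (name, version, body))
    conn.commit()
    return version


def load(conn, name, version=None):
    """Fetch a given version, or the latest one when version is None."""
    if version is None:
        query = "SELECT body FROM docs WHERE name = ? ORDER BY version DESC LIMIT 1"
        row = conn.execute(query, (name,)).fetchone()
    else:
        query = "SELECT body FROM docs WHERE name = ? AND version = ?"
        row = conn.execute(query, (name, version)).fetchone()
    return None if row is None else row[0]


conn = open_store()
save(conn, "order.xml", b"<order/>")
save(conn, "order.xml", b"<order id='1'/>")
print(load(conn, "order.xml"), load(conn, "order.xml", 1))
```

Every saved version is kept as a whole document, so this gives only
the most basic version control: no diffs, branches or merging.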

Mike Kay



From david at megginson.com  Tue Mar 16 11:55:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:02 2004
Subject: What a tangled web!!! XML and related specs
In-Reply-To: <389DA7CB46CFD111A0D100600836AD65E66B84@msxmar1>
References: <389DA7CB46CFD111A0D100600836AD65E66B84@msxmar1>
Message-ID: <14062.17649.213809.774504@localhost.localdomain>

Serrat Jaime - jasr writes:

 > I will ignore Notes, as you suggest since I am NOT doing cutting
 > edge stuff, and SAX (I don't need event-based processing).  I do,
 > however, want to exchange metadata, so in addition to the base XML
 > spec, it appears that I need to be familiar with Namespaces, DOM
 > level 1 (which apparently does NOT support Namespaces; coming in
 > DOM 2, maybe?) and RDF.  If that's right, it's a reasonable enough
 > roadmap for the time being.

Actually, people are doing namespaces with DOM level 1 right now, but
it's a bit controversial.  What they do is simply mangle the names
before they build the DOM tree, so that

  <a xmlns="http://www.foo.com/ns/default/">

comes through as a DOM Element with the name
"http://www.foo.com/ns/default/ a" or something like that.  Many
people on the DOM IG list believe that that's non-conformant, but it
does happen to work fairly well for basic (i.e. most) applications.
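
The mangling itself is easy to sketch. The Python below is an
illustration only and does not build a DOM tree: it uses Python's
xml.sax package in namespace mode and records each element under a
"URI, space, local name" string like the one above. The names
MangledNames and mangled_element_names are made up for this example.

```python
import io
import xml.sax
from xml.sax.handler import feature_namespaces


class MangledNames(xml.sax.ContentHandler):
    """Record each element as 'namespace-URI local-name'."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name
        # Elements in no namespace have uri None; leave those unmangled.
        self.names.append(local if uri is None else "%s %s" % (uri, local))


def mangled_element_names(xml_bytes):
    """Parse with namespace processing on and return the mangled names."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = MangledNames()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.names


print(mangled_element_names(
    b'<a xmlns="http://www.foo.com/ns/default/"><b/></a>'))
```

A space works as the separator because it can appear in neither an
XML name nor a valid URI, so the mangled string splits unambiguously.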

I put some caveats about RDF in my original message because, while it
has a fairly simple (though not-fully-specified) data model, the
number of syntactic variants permitted is so mind-numbing that we've
had only a couple of toy demo implementations so far, despite the fact 
that RDF is exactly the kind of thing my customers want and need right 
now.

 > But I'm still left wondering about the schema related proposals.
 > NOTE or not, no pun, did the SOX submitters intend it to supercede
 > DTD?  Was RDF (and DCD?) meant to *extend* DTD?  I guess I'm
 > wondering about the *direction* of the schema proposals, without
 > fully understanding the details in them, nor the players involved.

I cannot speak on behalf of any of these groups, but my understanding
is that SOX and DCD are meant to provide alternatives to DTDs, while
RDF itself is not (though it can provide the foundation for expressing 
such an alternative, as in the case of DCD).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 16 11:58:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:02 2004
Subject: ModSAX: Proposed Core Features (heretical?)
In-Reply-To: <c=US%a=_%p=Design_Intellige%l=MASTER-990315225622Z-384@master.design-intelligence.com>
References: <c=US%a=_%p=Design_Intellige%l=MASTER-990315225622Z-384@master.design-intelligence.com>
Message-ID: <14062.18150.647591.139280@localhost.localdomain>

Marc.McDonald@Design-Intelligence.com writes:

 > I would suggest that the application should specify the DTD that the 
 > document is parsed by. After all, the document is supposed to conform 
 > to the application so you shouldn't have the document define what it 
 > means to conform (i.e. the DTD to use).
 > 
 > Perhaps extend SAX so that there is an API to specify a DTD to 
 > override the documents?

If people want to experiment with this kind of thing, the API formerly
known as SAX provides the framework they need to do it.  I am not
willing, though, to create core features that trigger non-conformant
behaviour, however good the case in its favour.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From heikki at citec.fi  Tue Mar 16 12:15:09 1999
From: heikki at citec.fi (Heikki Toivonen)
Date: Mon Jun  7 17:10:03 2004
Subject: Extending Mozilla Tutorial
Message-ID: <002b01be6fa5$5fe913b0$2500a8c0@hto.citec.fi>

Extending Mozilla Tutorial

Johnny Stenback and I (Heikki Toivonen) presented a tutorial
called "Extending Mozilla or How To Do The Impossible" at
XTech'99 in San Jose, California (March 7).

This tutorial is now online at:

http://www.doczilla.com/development/index.html

We are correcting and updating the handout, which should appear
there shortly. The sample code is downloadable already.

We plan to keep this tutorial up to date so that when the APIs
change we will change our samples as well. This should be the
definitive place to find out how to write XPCOM modules and plugins
and especially how to embed and extend Mozilla in various ways.

All comments are welcome. We would be really interested to know if
there are any errors etc. in this material.

We hope you will find this material useful.


Sample Code


XPCOM Hello World

We created an XPCOM component that can automatically register itself.
The component's only functionality is that it writes "Hello World!"
to standard output, but it is the only up-to-date sample showing
how to use XPCOM. A sample test program is also provided.


Embedding MozillaControl ActiveX Component

In this sample we have created a simple HTML/XML editor with preview. 
The preview is handled by the MozillaControl. Another tool in this 
sample is a double browser, which has both MozillaControl and the
Microsoft IE ActiveX control side by side. It is easy to check that 
pages you have created will look okay with both products. We used
Microsoft Access for this sample, so you will need it to get anything
out of the sample.


Frankenbrowser with DOM Tree View

This sample includes an MFC dialog that shows the document's structure
in a tree view. It is possible to view a node's content in the tree 
view and also delete nodes in the tree view. Deletions get reflected back
in the viewer.


Embedding NGLayout Programmatically

In this sample we have embedded a web shell in an MFC dialog application. 
It shows how easy it is to add web browser support to any application 
on any platform, although we used only MFC on Windows.


XPCOM Plugin

We are working on an XPCOM plugin sample. It is not yet available.

--
  Heikki Toivonen
  http://www.doczilla.com
  http://www.citec.fi



From raju at wipinfo.soft.net  Tue Mar 16 12:25:21 1999
From: raju at wipinfo.soft.net (raju)
Date: Mon Jun  7 17:10:03 2004
Subject: How to ??
Message-ID: <01BE6FD6.59C1A520@parijatha.wipinfo.soft.net>

Hi friends,
	I'm Rajendra Kadam.
I have gone through the book 'Just XML'. It is really interesting. Currently I'm writing an XML Editor using the XML Library provided by Sun (the Java-X project), but I'm facing a problem, which I will now explain in detail:

	For example, I have an XML file "catalog.xml" with its DTD in "catalog.dtd". I'm attaching both files here.

	Now in my XML Editor, whenever an Element has an attribute, I want to display all the values that the attribute can take, along with its currently assigned value.
	When I went through the Java-X documentation, I found methods that give me the current value of an attribute of an element. But is there any way to get all the possible values for an attribute in my application?
	In the above DTD, the PRODUCT element has an attribute CATEGORY which can have any one of the following values:
	"Handtool | Table | Shop-Professional", with default value "Handtool". So my question is: how can I get all three of the above values for the attribute CATEGORY in my application?

	Sorry for asking about my problem by sending mail personally.

	Thanking you,
	Rajendra B. Kadam. 	
 
   
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 1261 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990316/77160039/attachment.obj
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/octet-stream
Size: 1561 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990316/77160039/attachment-0001.obj
From cowan at locke.ccil.org  Tue Mar 16 15:10:24 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:03 2004
Subject: Multi-valued attributes
References: <01BE6ED3.5F558EB0@grappa.ito.tu-darmstadt.de> <36ED2B19.62734CE3@locke.ccil.org> <36ED9AFD.DBF40945@allette.com.au>
Message-ID: <36EE7414.719A91C0@locke.ccil.org>

Marcus Carr wrote:

> I have used
> ENTITIES quite often to emulate allowing multiple tokens from a token list. NAMES is too loose
> and I may need more than one value, so a token list is out. By declaring entities and
> assigning a notation that reflects their meaning, I get the best of both worlds.

I don't understand what you mean by "token" and "token list".  Can
you give a more concrete example?  Thanks.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From nrao at Apptivity.com  Tue Mar 16 16:40:54 1999
From: nrao at Apptivity.com (Niranjan Rao)
Date: Mon Jun  7 17:10:03 2004
Subject: How to ??
Message-ID: <c=US%a=_%p=Apptivity%l=SWINGSET-990316163555Z-14141@swingset.newark.progress.com>

I am also facing similar problems. I don't want to write out an
attribute and its value if the value is the same as the default in the
DTD. I don't want to hard-code these values in my code, but would rather
get them from the DTD. Are there any APIs that can return information
about a DTD, such as the elements allowed at a given point and the
default and allowed values for attributes?

IBM's XML4J had some support but I could not locate methods to get
default values from a DTD.
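
As a rough sketch of what such an API can look like, here is how the declarations can be read with Python's expat binding. This is a swapped-in illustration, not Java-X or XML4J. The DTD below is a guess at the shape of the attached catalog.dtd (only the PRODUCT/CATEGORY declaration is taken from the message), and the helper names are made up.

```python
import xml.parsers.expat

# Hypothetical stand-in for catalog.xml with its DTD inlined.
CATALOG = """<?xml version="1.0"?>
<!DOCTYPE CATALOG [
<!ELEMENT CATALOG (PRODUCT*)>
<!ELEMENT PRODUCT (#PCDATA)>
<!ATTLIST PRODUCT CATEGORY (Handtool | Table | Shop-Professional) "Handtool">
]>
<CATALOG><PRODUCT>Hammer</PRODUCT></CATALOG>"""

def attribute_declarations(xml_text):
    """Map (element, attribute) -> (declared type, default) from the DTD."""
    decls = {}

    def on_attlist(elname, attname, type_, default, required):
        decls[(elname, attname)] = (type_, default)

    parser = xml.parsers.expat.ParserCreate()
    parser.AttlistDeclHandler = on_attlist
    parser.Parse(xml_text, True)
    return decls

def enumerated_values(type_):
    """Split an enumerated type such as "(a|b|c)" into its tokens."""
    if not type_.startswith("("):
        return []  # CDATA, ID, NMTOKEN, ... have no fixed value list
    return [tok.strip() for tok in type_.strip("()").split("|")]

type_, default = attribute_declarations(CATALOG)[("PRODUCT", "CATEGORY")]
print(enumerated_values(type_), default)
```

An editor could use the token list to fill a drop-down, and a writer could compare each attribute against the default to decide whether to emit it.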

Thanks for any help,

- Niranjan


The three principal virtues of a programmer are Laziness, Impatience,
and Hubris.
 	- From the perl manual.

>	But when I gone through the Java-X document, I got methods that will give me
>the current value of the attribute of an element. But is there any way, so
>that I can get the all possible values for an attribute in my Application ??
>	In above DTD, PRODUCT element has an attribute CATEGORY which can have any
>one of the following values :
>	" Handtool | Table | Shop-Professional " with default value "Handtool". So
>my question is, how can I get the all above three values for the attribute
>CATEGORY in my Application.
>	
>	Sorry for asking you problem by personally sending mail.	
>
>	Thanking you,
>	Rajendra B. Kadam. 	
>



From falk at icon.at  Tue Mar 16 16:44:33 1999
From: falk at icon.at (Falk, Alexander)
Date: Mon Jun  7 17:10:03 2004
Subject: XML Spy 2.0 Release
Message-ID: <A01C76E644CAD111B83A0000E8D8890E057C75@melange.icon.co.at>

Dear XML enthusiast,

it is my pleasure to announce the release of version 2.0 of XML Spy, our
shareware XML editor for Windows, which adds these new and exciting
features:

*	Full Unicode support (UTF-7, UTF-8, UTF-16, ISO-10646-UCS-2,
ISO-10646-UCS-4)
*	Enhanced character-set encodings (all ISO-8859-x, Shift-JIS, EUC-JP,
ISO-2022-JP, GB2312, Big5, etc.) with auto-detection and auto-correction
*	XML Namespaces support
*	XHTML 1.0 (HTML 4.0 in XML 1.0 reformulation) support

For more information please refer to our product information web-server at
http://www.xmlspy.com where you'll be able to get more detailed information
about the new features.

If you want to download the latest version directly, please use this URL:

	http://www.icon-is.com/xml/xmlspy.exe

You'll also find all the details about the new version in the included
"readme.txt" file and the updated on-line manual.

Sincerely,

Alexander Falk

... Icon Informations-Systeme GmbH
... ALEXANDER FALK
... President, CEO
... http://www.icon-is.com/falk

From roddey at us.ibm.com  Tue Mar 16 19:06:46 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:03 2004
Subject: ModSAX: Proposed Core Features (heretical?)
Message-ID: <87256736.0068C313.00@d53mta03h.boulder.ibm.com>


<Bill's Comment>
My belief here is that it is perhaps best to abandon validation by the
parser-kernel and instead use filters which support the validation needs
of the application. Errors so detected may be because of a poorly
constructed document, but may also be due to constraints imposed by a
particular application. This of course raises the question of how the
response to these two different types of errors should differ. I can
understand a desire to make such a distinction, but I have not yet come
to appreciate the need to make such a distinction.

Bill
</Bill's Comment>

That would have some pretty large performance implications. For our new
generation parsers, we can validate the event stream *very* fast as it's
going out of the parser. Doing it after the fact, way upstream, would be
much, much slower. I could imagine that this would be true of other parsers
as well: once the stuff has gone out into the 'real world', validation
becomes much more work because now it has to be in terms of text
comparisons instead of internal element ids.

I understand that the filter sequence could change the document, but
wouldn't it be just as important to know that it died because the original
document was hosed (and therefore the filters spat out junk)?

Another option, though also frighteningly bad for performance and requiring
compliance by parsers, would be a way to plug filters in 'under' the input
into the parser. That way, the filtering would happen before the parser
passed judgement on it for either well-formedness or validity. Otherwise,
even if you validate the end product of the filter sequence, how do you
know it remained well-formed? That check is usually implicit in the call
stack of the parser on the original content. But, of course, how would the
filters operate on the text that has not been parsed yet? :-)

It's almost like you'd want to parse it lightly to get it to the filters,
let them shake and bake it, then parse it for real and validate it. But
that would be really rough for performance also.

For just the scenario where an application wants normal wf/valid checks but
needs to add more to it, that's an obvious application of a validation
filter that wouldn't be a performance pig hopefully. I think that makes a
lot of sense. But if the content gets changed along the way, knowing
whether it got hosed in the process seems important.
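
A minimal sketch of that last scenario, assuming Python's xml.sax and its XMLFilterBase class in place of the parsers discussed here. The rule itself (an item element whose price attribute must be non-negative) and the names PriceFilter and check are invented for the example. The parser keeps doing its own well-formedness checking and raises on failure, while the filter only collects the application-level complaints and passes every event along unchanged.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase

class PriceFilter(XMLFilterBase):
    """Hypothetical application rule: <item price="..."> must be >= 0."""

    def __init__(self, parent):
        super().__init__(parent)
        self.errors = []

    def startElement(self, name, attrs):
        if name == "item":
            price = attrs.get("price")
            if price is None or float(price) < 0:
                self.errors.append(f"bad price on <item>: {price!r}")
        # Pass the event on unchanged so later handlers still see it.
        super().startElement(name, attrs)

def check(xml_bytes):
    """Parse through the filter and return the application-level errors."""
    flt = PriceFilter(xml.sax.make_parser())
    flt.parse(io.BytesIO(xml_bytes))
    return flt.errors

print(check(b'<order><item price="3.50"/><item price="-1"/></order>'))
```

Because the two kinds of failure surface differently (an exception from the parser, a list entry from the filter), the caller can tell a broken document from one that merely breaks the application's own rules.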

Anyway, that ramble seemed incoherent even to me, but I think there was a
point in there somewhere :-)




From joel at spooky.emcs.cornell.edu  Tue Mar 16 20:32:42 1999
From: joel at spooky.emcs.cornell.edu (Joel Bender)
Date: Mon Jun  7 17:10:03 2004
Subject: Way off topic (was: Lisp concrete syntax)
Message-ID: <v04011702b3146f40332b@[128.253.245.66]>

Rick Jelliffe wrote:

>  You could indeed continue this and get
>  <sexp>
>      <car>+</car>
>      <cdr><car>1</car><cdr>1</cdr></cdr>
>  </sexp>

Not to be a pest or anything, but shouldn't this be...

<sexp>
    <car>+</car>
    <cdr><car>1</car><cdr><car>1</car><cdr/></cdr></cdr>
</sexp>


Joel



From simonstl at simonstl.com  Tue Mar 16 21:13:39 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:03 2004
Subject: who's on the XML WG?
Message-ID: <199903162113.QAA19884@hesketh.net>

This is more a curiosity question than anything else.  Is there any way to
find out who is actually participating on the XML WG and affiliated groups?
The committee chairs are listed on the activity pages, but the rest is
pretty mysterious.

It was odd last week to bump into people, talk briefly about XML
developments, and then have to suddenly end the conversation because they
can't talk any further.  Sort of spy-vs.-spy or something.

I guess knowing who all these folks are would reduce the odds of such
uncomfortable situations cropping up.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From lists at lumdata.com  Tue Mar 16 21:47:06 1999
From: lists at lumdata.com (Scott Vanderbilt)
Date: Mon Jun  7 17:10:03 2004
Subject: [OFF] was Re: who's on the XML WG?
In-Reply-To: <199903162113.QAA19884@hesketh.net>
Message-ID: <v04020a03b31481ad3092@[24.130.21.92]>

At 4:16 PM -0500 3/16/99, Simon St.Laurent wrote:

>This is more a curiosity question than anything else.  Is there any way to
>find out who is actually participating on the XML WG and affiliated groups?
> The committee chairs are listed on the activity pages, but the rest is
>pretty mysterious.
>
>It was odd last week to bump into people, talk briefly about XML
>developments, and then have to suddenly end the conversation because they
>can't talk any further.  Sort of spy-vs.-spy or something.
>
>I guess knowing who all these folks are would reduce the odds of such
>uncomfortable situations cropping up.


Any of the WG members could tell you this information, but then they would
have to kill you. <g>

Cheers.

==========================================================================
Scott D. Vanderbilt                               mailto:scott@lumdata.com
Luminous Dataworks
Phone: (310) 253-9918                            Custom database solutions
Fax:   (310) 842-7025                               for the World Wide Web
==========================================================================



From lauren at sqwest.bc.ca  Tue Mar 16 22:04:57 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:03 2004
Subject: [OFF] was Re: who's on the XML WG?
In-Reply-To: <v04020a03b31481ad3092@[24.130.21.92]>
References: <199903162113.QAA19884@hesketh.net>
Message-ID: <199903162204.OAA18410@sqwest.bc.ca>

On 16 Mar 99, at 13:47, Scott Vanderbilt wrote:

> At 4:16 PM -0500 3/16/99, Simon St.Laurent wrote:
> 
> >This is more a curiosity question than anything else.  Is there any way
> >to find out who is actually participating on the XML WG and affiliated
> >groups?
> > The committee chairs are listed on the activity pages, but the rest is
> >pretty mysterious.
[...]
> Any of the WG members could tell you this information, but then they would
> have to kill you. <g>

A little more seriously, W3C doesn't tell people in general who the 
members of WGs and IGs are, unless you're on the WG or IG or 
work for a member organisation. Some of this, I gather, is to make 
sure people representing their companies aren't hassled by 
outsiders (e.g. journalists or lobbyists or people who disagree with 
the implementations) unless the companies want them to be. And 
W3C (and the WG or IG chairs!) don't want the questions of "why 
isn't company X on WG Y" (I've had these, and there's no tactful 
way of dealing with them). 

Nothing stops any company or individual saying "I'm on this group", 
of course.

cheers,


Lauren



From david at megginson.com  Tue Mar 16 22:07:47 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:03 2004
Subject: who's on the XML WG?
In-Reply-To: <199903162113.QAA19884@hesketh.net>
References: <199903162113.QAA19884@hesketh.net>
Message-ID: <14062.54726.681522.146163@localhost.localdomain>

Simon St.Laurent writes:

 > This is more a curiosity question than anything else.  Is there any
 > way to find out who is actually participating on the XML WG and
 > affiliated groups?  The committee chairs are listed on the activity
 > pages, but the rest is pretty mysterious.

You're unlikely to see that happen -- it would be a shopping list for
head-hunters.  People are appointed W3C committee chairs only because
they're considered so hopelessly unmarketable that no one would try to
hire them away from their current employers.

Of course, the members of the committees are listed on member-only
pages, so members can raid each other anyway.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From mrc at allette.com.au  Tue Mar 16 22:14:51 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:03 2004
Subject: Multi-valued attributes
References: <01BE6ED3.5F558EB0@grappa.ito.tu-darmstadt.de> <36ED2B19.62734CE3@locke.ccil.org> <36ED9AFD.DBF40945@allette.com.au> <36EE7414.719A91C0@locke.ccil.org>
Message-ID: <36EED7B1.15978725@allette.com.au>


John Cowan wrote:

> Marcus Carr wrote:
>
> > I have used
> > ENTITIES quite often to emulate allowing multiple tokens from a token list. NAMES is too loose
> > and I may need more than one value, so a token list is out. By declaring entities and
> > assigning a notation that reflects their meaning, I get the best of both worlds.
>
> I don't understand what you mean by "token" and "token list".  Can
> you give a more concrete example?  Thanks.

Given the following attlist declaration:

<!ATTLIST sometime days (Mon | Tues | Wed | Thur | Fri) #IMPLIED>

the token list is the collection of available values (tokens).

In the event that I wanted an occurrence of the element to contain multiple days, such as the
following:

<sometime days="Mon Wed Fri">

I could not do it because of the restriction that I can only use one day. If the declared content
was NAMES, I couldn't prevent the insertion of Monday, Wednesday, etc or Sat, Sun. If the declared
content is ENTITIES, then I can declare just the five that I want, case sensitively and get the
desired results. There is a danger that entities designed for one element will get mixed up with
another, but OmniMark (my tool of choice) provides access to the notation associated with the
entity, so you could differentiate the notations and perform a semantic check while parsing. I
believe that this is as close as you can get to allowing multiple well-defined tokens to be
available in an attribute value - nearly on par for rigidity with IDREFS. :-)
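
A rough sketch of the same check outside OmniMark, assuming Python's expat binding (which does not validate, so the check is done by hand). The notation name "weekday", the system identifiers, and the function name check_days are all made up for the example. It records each unparsed entity with its notation, then reports any token in the days attribute that is not a declared entity of that notation.

```python
import xml.parsers.expat

DOC = """<!DOCTYPE sometime [
<!NOTATION weekday SYSTEM "weekday">
<!ENTITY Mon  SYSTEM "Mon"  NDATA weekday>
<!ENTITY Tues SYSTEM "Tues" NDATA weekday>
<!ENTITY Wed  SYSTEM "Wed"  NDATA weekday>
<!ENTITY Thur SYSTEM "Thur" NDATA weekday>
<!ENTITY Fri  SYSTEM "Fri"  NDATA weekday>
<!ELEMENT sometime EMPTY>
<!ATTLIST sometime days ENTITIES #IMPLIED>
]>
<sometime days="Mon Wed Fri"/>"""

def check_days(xml_text, notation="weekday"):
    """Return tokens in days="..." that are not entities of the notation."""
    entities = {}  # entity name -> notation name
    bad = []

    def on_unparsed(name, base, system_id, public_id, notation_name):
        entities[name] = notation_name

    def on_start(name, attrs):
        for token in attrs.get("days", "").split():
            # The notation comparison is the semantic check that keeps
            # entities meant for another element from slipping in.
            if entities.get(token) != notation:
                bad.append(token)

    parser = xml.parsers.expat.ParserCreate()
    parser.UnparsedEntityDeclHandler = on_unparsed
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return bad

print(check_days(DOC))
```

A validating parser would reject undeclared names in an ENTITIES attribute by itself; only the notation comparison needs application code.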


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From cowan at locke.ccil.org  Tue Mar 16 22:40:44 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:03 2004
Subject: [OFF] was Re: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net> <199903162204.OAA18410@sqwest.bc.ca>
Message-ID: <36EEDDBE.F778270C@locke.ccil.org>

Lauren Wood spoke as one having authority:

> Nothing stops any company or individual saying "I'm on this group",
> of course.

Okay.

<breath depth="deep"/>

I'm on the Infoset WG as an invited expert.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Tue Mar 16 22:52:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:03 2004
Subject: ModSAX: Proposed Core Properties
In-Reply-To: <wkd82as8ng.fsf@ifi.uio.no>
References: <14053.50863.546824.628181@localhost.localdomain>
	<wkd82as8ng.fsf@ifi.uio.no>
Message-ID: <14062.57453.852593.212563@localhost.localdomain>

Lars Marius Garshol writes:

 > * David Megginson
 > |
 > | http://xml.org/sax/properties/namespace-sep <String> (write-only)
 > 
 > Hmmm. Why should this be write-only? One can easily imagine situations
 > where it is desirable to be able to find out what the separator is
 > after events have passed through a stack of filters, any of which may
 > have modified it.

So far, for consistency, I've made all pre-parse properties write-only 
and all parse-time properties read-only.  We'll have to think about
this one.
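
To make the convention concrete, here is a small sketch of a property table that enforces it. Nothing here is part of the proposal: the class, its method names, and the second property URI are hypothetical, and only the namespace-sep URI comes from this thread.

```python
NAMESPACE_SEP = "http://xml.org/sax/properties/namespace-sep"
# Hypothetical parse-time property, invented for this example only.
CURRENT_DEPTH = "http://example.org/properties/current-depth"

class PropertyStore:
    """Pre-parse properties are write-only, parse-time ones read-only."""

    def __init__(self, modes):
        self._modes = dict(modes)  # name -> "write-only" or "read-only"
        self._values = {}

    def set_property(self, name, value):
        if self._modes.get(name) != "write-only":
            raise PermissionError(f"cannot set {name}")
        self._values[name] = value

    def get_property(self, name):
        if self._modes.get(name) != "read-only":
            raise PermissionError(f"cannot read {name}")
        return self._values.get(name)

    def publish(self, name, value):
        """Called by the parser, not the application, during a parse."""
        self._values[name] = value

store = PropertyStore({NAMESPACE_SEP: "write-only",
                       CURRENT_DEPTH: "read-only"})
store.set_property(NAMESPACE_SEP, " ")
store.publish(CURRENT_DEPTH, 3)
print(store.get_property(CURRENT_DEPTH))
```

Under this rule an application at the end of a filter chain has no way to ask what separator is in effect, which is the gap raised above.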


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 16 22:53:03 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:04 2004
Subject: [OFF] was Re: who's on the XML WG?
In-Reply-To: <36EEDDBE.F778270C@locke.ccil.org>
References: <199903162113.QAA19884@hesketh.net>
	<199903162204.OAA18410@sqwest.bc.ca>
	<36EEDDBE.F778270C@locke.ccil.org>
Message-ID: <14062.57513.901642.587777@localhost.localdomain>

John Cowan writes:

 > I'm on the Infoset WG as an invited expert.

John was also The Walrus all along, and Paul wasn't.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul at prescod.net  Tue Mar 16 23:00:03 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net>
Message-ID: <36EEDFF1.52DE6542@prescod.net>

"Simon St.Laurent" wrote:
> 
> This is more a curiosity question than anything else.  Is there any way to
> find out who is actually participating on the XML WG and affiliated groups?
>  The committee chairs are listed on the activity pages, but the rest is
> pretty mysterious.
> 
> It was odd last week to bump into people, talk briefly about XML
> developments, and then have to suddenly end the conversation because they
> can't talk any further.  Sort of spy-vs.-spy or something.

One problem is that any employee of a member organization could "know
something" whether they are on a working group or not.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"The culture we are living in becomes an ever-wider sewer."
	- Paul Weyrich, of the "Moral Majority"

"Only someone attached to an irrecoverable past, and therefore hostile 
to change as such, could react so negatively toward a culture that 
is doing all right by any reasonable measure."  
	- http://www.salonmagazine.com/col/



From ricko at allette.com.au  Tue Mar 16 23:27:26 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:04 2004
Subject: Way off topic (was: Lisp concrete syntax)
Message-ID: <001401be7004$d2d24bd0$35f96d8c@NT.JELLIFFE.COM.AU>


>Not to be a pest or anything, shouldn't this be...
>
><sexp>
>    <car>+</car>
>    <cdr><car>1</car><cdr><car>1</car><cdr/></cdr></cdr>
></sexp>

There are three things we could be notating:
 * parsing the s-expression source code;
 * fully expanding the notional LISP lists;
 * exposing the implementation.

The form I gave parses the s-expression. The version you give shows the
fully expanded list. And if an implementation used cdr-coding, it would
look like my version; if it used explicit nulls, it would look like your
version.
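
Both forms can be produced from the same list, as this small sketch shows. The function names and the explicit_nil flag are made up for illustration, and only flat lists of atoms are handled.

```python
from xml.sax.saxutils import escape

def cons_xml(items, explicit_nil):
    """Render a non-empty list of atoms as nested <car>/<cdr> markup."""
    head, *tail = items
    if not tail:
        # Last element: either terminate the list with an empty <cdr/>,
        # or (compact form) let the enclosing <cdr> hold the atom itself.
        if explicit_nil:
            return f"<car>{escape(head)}</car><cdr/>"
        return escape(head)
    rest = cons_xml(tail, explicit_nil)
    return f"<car>{escape(head)}</car><cdr>{rest}</cdr>"

def sexp_xml(items, explicit_nil=True):
    """Wrap the cons-cell markup for one s-expression in <sexp>."""
    return f"<sexp>{cons_xml(items, explicit_nil)}</sexp>"

print(sexp_xml(["+", "1", "1"], explicit_nil=False))  # the form I gave
print(sexp_xml(["+", "1", "1"], explicit_nil=True))   # the expanded form
```

The input list is the same in both calls; only the choice of what the markup should expose differs.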

This is a good example of how XML markup is not a data modeling language,
but a data-model modeling language :-)  The lexical, semantic and
implementation structures of a programming language are all different,
and an XML document could reflect any of them (or even a mix).

Rick

P.S. For more on cdr-coding, see
http://www.landfield.com/faqs/lisp-faq/part2/section-9.html




From gtn at eps.inso.com  Tue Mar 16 23:42:57 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
In-Reply-To: <14062.54726.681522.146163@localhost.localdomain>
Message-ID: <03c001be7005$98377030$577670c6@eps.inso.com>

 
> You're unlikely to see that happen -- it would be a shopping list for
> head-hunters.  People are appointed W3C committee chairs only because
> they're considered so hopelessly unmarketable that no one would try to
> hire them away from their current employers.


God forbid that that be true. Three people I respect greatly are in the
chair/sub chair role.... and I....




From tbray at textuality.com  Tue Mar 16 23:48:37 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
Message-ID: <3.0.32.19990316154521.00bcc5c0@pop.intergate.bc.ca>

At 05:08 PM 3/16/99 -0500, David Megginson wrote:
>People are appointed W3C committee chairs only because
>they're considered so hopelessly unmarketable that no one would try to
>hire them away from their current employers.

Or because they're unemployed bums who would spurn a good job
offer even if they got one. -T.



From b.laforge at jxml.com  Wed Mar 17 00:14:27 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:04 2004
Subject: ModSAX: Proposed Core Features (heretical?)
Message-ID: <00d001be700a$3f7a0de0$c9a8a8c0@thing2>

From: roddey@us.ibm.com <roddey@us.ibm.com>
>That would have some pretty large performance implications. For our new
>generation parsers, we can validate the event stream *very* fast as it's
>going out of the parser. Doing it after the fact, way up stream, would be
>much, much slower. I could imagine that this would be true of other parsers
>as well, that once the stuff has gone out into the 'real world', validation
>becomes much more work because now it has to be in terms of text
>comparisons instead of internal element ids.


It would be nice to have a lower-level interface where such things could
be done efficiently. This is one of the reasons why I sometimes speak of
parser-kernel, as an oblique reference to Simon's layered architecture
proposal.
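
A rough Python sketch of the idea in the quoted text: the parser interns each
element name once, and validation then works on small integers rather than on
strings. NameTable, is_valid and the event tuples are hypothetical names, not
part of SAX, ModSAX or any parser mentioned here, and the check is limited to
"which children may appear under which parent". It illustrates the shape of
such an interface and makes no claim about actual speed.

```python
class NameTable:
    """Maps each distinct element name to a small integer id."""

    def __init__(self):
        self._ids = {}
        self.names = []

    def intern(self, name):
        # Assign the next free id the first time a name is seen.
        if name not in self._ids:
            self._ids[name] = len(self.names)
            self.names.append(name)
        return self._ids[name]


def is_valid(events, allowed_children):
    """Check a stream of ("start", id) / ("end", id) events.

    allowed_children maps a parent id to the set of child ids it may contain.
    """
    stack = []
    for kind, element_id in events:
        if kind == "start":
            if stack and element_id not in allowed_children.get(stack[-1], set()):
                return False
            stack.append(element_id)
        else:
            # An end event must close the most recently opened element.
            if not stack or stack.pop() != element_id:
                return False
    return not stack
```

A filter further along the chain that only sees element names as text would
have to repeat the name lookup for every event, which is the cost being
described above.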

>I understand that the filter sequence could change the document, but
>wouldn't it be just as important to know that it died because the original
>document was hosed (and therefore the filters spat out junk)?


Too true.

Bill




From jtauber at jtauber.com  Wed Mar 17 00:36:53 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:04 2004
Subject: [OFF] was Re: who's on the XML WG?
Message-ID: <027501be700e$440fece0$0300000a@othniel.cygnus.uwa.edu.au>

-----Original Message-----
From: John Cowan <cowan@locke.ccil.org>
><breath depth="deep"/>
>
>I'm on the Infoset WG as an invited expert.


This is starting to sound like Working Group Members Anonymous...

Hi, I'm James Tauber and <pause reason="nervous"/> I'm a working group
member.

James




From daniela at cnet.com  Wed Mar 17 00:50:39 1999
From: daniela at cnet.com (Daniel Austin)
Date: Mon Jun  7 17:10:04 2004
Subject: [OFF] was Re: who's on the XML WG?
Message-ID: <77A952A6B467D211855D00805F9521F1149349@cnet10.cnet.com>

Remember, the first step is to admit you have a problem...we will all need
years of therapy afterwards.

Regards,

D-

(not speaking for the HTML WG, but for my own self)

> -----Original Message-----
> From: James Tauber [mailto:jtauber@jtauber.com]
> Sent: Tuesday, March 16, 1999 4:07 PM
> To: XML Dev
> Subject: Re: [OFF] was Re: who's on the XML WG?
> 
> 
> -----Original Message-----
> From: John Cowan <cowan@locke.ccil.org>
> ><breath depth="deep"/>
> >
> >I'm on the Infoset WG as an invited expert.
> 
> 
> This is starting to sound like Working Group Members Anonymous...
> 
> Hi, I'm James Tauber and <pause reason="nervous"/> I'm a working group
> member.
> 
> James
> 
> 


From lisarein at finetuning.com  Wed Mar 17 02:16:54 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net>
Message-ID: <36EF13A7.C9C8A196@finetuning.com>

Simon St. Laurent wrote:

> It was odd last week to bump into people, talk briefly about XML
> developments, and then have to suddenly end the conversation because they
> can't talk any further.  Sort of spy-vs.-spy or something.

yes the cloak and dagger stuff is amusing, isn't it. yet necessary.

> 
> I guess knowing who all these folks are would reduce the odds of such
> uncomfortable situations cropping up.

On the contrary -- just be understanding when such situations come up
and ....politely....back off....:-)

It's also important not to repeat details that sometimes slip out from
WG members, if it is your understanding that information "still isn't
public". Such is the life of a responsible XML journalist :-)

Jeez there must be three or four of us, at least :-)

lisa

ps.tip: If you're really THAT curious, you can look at the names at the
end of specs, sometimes, to get names.


Simon St.Laurent wrote:
> 
> This is more a curiosity question than anything else.  Is there any way to
> find out who is actually participating on the XML WG and affiliated groups?
>  The committee chairs are listed on the activity pages, but the rest is
> pretty mysterious.
> 

> 
> Simon St.Laurent
> XML: A Primer
> Sharing Bandwidth / Cookies
> http://www.simonstl.com
> 


From cbullard at hiwaay.net  Wed Mar 17 02:38:06 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net> <36EF13A7.C9C8A196@finetuning.com>
Message-ID: <36EF14EA.E82@hiwaay.net>

Lisa Rein wrote:
> 
> yes the cloak and dagger stuff is amusing, isn't it. yet necessary.

Not really.  It may up the excitement to be in a secret society, but 
I don't think it does a lot to help the spec.

> > I guess knowing who all these folks are would reduce the odds of such
> > uncomfortable situations cropping up.

You mean like blame and credit?  ;-)

Wow! I can just see all of those journalists, wannabes and 
industry spies camping outside David Megginson's door.  
Almost like being a rock star without the sex.

len



From keshlam at us.ibm.com  Wed Mar 17 03:24:39 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:04 2004
Subject: How to ??
Message-ID: <85256737.00129928.00@D51MTA03.pok.ibm.com>

In the DOM world, default attribute values should show up automatically as
attributes of the elements they apply to. The Attr.isSpecified() method can be
used to distinguish between a default and an explicitly entered attribute. Note
that even if an explicit value is the same as the default, it is considered
Specified. Note too that unless you use a validating parser, the whole concept
of default attributes is moot.
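
The "specified" distinction can be sketched in Python with the standard
library's Expat binding (not a DOM; DOM Level 1 itself names the flag
`specified` on Attr, `getSpecified()` in the Java binding). The sample
document and the function name are assumptions for illustration. The parser is
asked to report only attributes written in the instance, and the declared
defaults are collected separately from the ATTLIST declarations.

```python
import xml.parsers.expat

# Hypothetical sample: "lang" defaults to "en" via the internal subset.
DOC = """<!DOCTYPE doc [
  <!ATTLIST item lang CDATA "en">
]>
<doc><item/><item lang="en"/><item lang="fr"/></doc>"""


def attributes_with_flags(doc):
    """Return [(element, {attr: (value, specified)})] for each start tag."""
    defaults = {}   # element name -> {attribute name: declared default}
    elements = []
    parser = xml.parsers.expat.ParserCreate()
    # Report only attributes that appear in the document instance.
    parser.specified_attributes = True

    def on_attlist(elname, attname, type_, default, required):
        if default is not None:
            defaults.setdefault(elname, {})[attname] = default

    def on_start(name, attrs):
        flagged = {att: (value, True) for att, value in attrs.items()}
        # Add declared defaults that the instance did not supply.
        for att, value in defaults.get(name, {}).items():
            flagged.setdefault(att, (value, False))
        elements.append((name, flagged))

    parser.AttlistDeclHandler = on_attlist
    parser.StartElementHandler = on_start
    parser.Parse(doc, True)
    return elements
```

The second <item> is the interesting case: its value equals the default, yet
it comes back flagged as specified because it was written in the document.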

Complication: DOM Level 1 defined that behavior but did not define where the DTD
information to support it should be stored. There are many parts of the DTD
behavior that were deferred in Level 1, in the hope that schemas would shape up
quickly enough that Level 2 could easily support both DTDs and schemas...
unfortunately that doesn't seem to be happening. So for now, the DOM really
doesn't provide a good API for working with DTDs, and parsers and applications
have to either accept those limits or fall back on nonstandard interfaces that
may vanish in later versions of the code.

In the case of IBM's parser, both versions of the DOM (TXDocument and
DocumentImpl) can support default attributes. DocumentImpl is pretty much a
strict Level 1 DOM for now, and they may have decided not to attempt to set
default attributes as a result. The TXDocument implementation is somewhat
heavily loaded with non-DOM behaviors, and among those is a set of custom
classes that provide a bit more DTD support (but may bear no resemblance to how
DTDs are handled in future versions of the DOM).

Welcome to the bleeding edge. Wear your crash helmet.

______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From keshlam at us.ibm.com  Wed Mar 17 03:30:19 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
Message-ID: <85256737.001316A1.00@D51MTA03.pok.ibm.com>

If your company is a W3C Member, you can access that information. If not, all
you can do is watch the public discussions and see who's willing to admit to
being involved.

"The names have been changed to protect the innocent." -- Dragnet
______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From rbourret at ito.tu-darmstadt.de  Wed Mar 17 09:00:36 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:04 2004
Subject: ModSAX: Proposed Core Properties
Message-ID: <01BE705C.DB375010@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> So far, for consistency, I've made all pre-parse properties write-only 
> and all parse-time properties read-only.  We'll have to think about
> this one.

What is the possible benefit of making any property write-only? That is, can any harm ever come from reading a property?

-- Ron Bourret




From hassan.hussein at zurich.com  Wed Mar 17 11:03:25 1999
From: hassan.hussein at zurich.com (hassan.hussein@zurich.com)
Date: Mon Jun  7 17:10:04 2004
Subject: XML Java Parsers
Message-ID: <C1256737.003C6418.00@mtach2.zurich.com>


Hi,

I read the list every day and understand half of it; the rest is basically too
much to grasp (e.g. DOM, RDF, DCD, XSchema, APIs).  I have a Windows programming
background. I understand some XML syntax, DTDs and attributes, but that is
about it.  Could anyone give me a push start with this jargon so I can
appreciate the content of the list?

Also, I am trying to compare both commercial and free XML Java parsers that are
currently available on the web.  I thought it wise to start here and get a list
of candidates then do the comparison, because otherwise I could get lost in the
web.

Thank you all.

Hassan Hussein




From david at megginson.com  Wed Mar 17 11:56:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:04 2004
Subject: who's on the XML WG?
In-Reply-To: <36EF14EA.E82@hiwaay.net>
References: <199903162113.QAA19884@hesketh.net>
	<36EF13A7.C9C8A196@finetuning.com>
	<36EF14EA.E82@hiwaay.net>
Message-ID: <14063.38650.573008.410552@localhost.localdomain>

len bullard writes:

 > Wow! I can just see all of those journalists, wannabes and industry
 > spies camping outside David Megginson's door.  Almost like being a
 > rock star without the sex.

Without the drugs, too (I can barely handle children's chewable
multivitamins).  

Actually, Infoset isn't controversial enough that anyone cares to do
more than ask for the occasional courtesy quote ("Oh yes, and what
does Infoset do again?"), but some chairs of more visible WGs do get
hounded unbelievably by both the press and the big industry players.

RFC: New mailing list, XML-Group-Therapy.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 17 12:03:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:04 2004
Subject: How to ??
In-Reply-To: <85256737.00129928.00@D51MTA03.pok.ibm.com>
References: <85256737.00129928.00@D51MTA03.pok.ibm.com>
Message-ID: <14063.38976.561058.757958@localhost.localdomain>

keshlam@us.ibm.com writes:

 > In the DOM world, default attribute values should show up
 > automatically as attributes of the elements they apply to. The
 > Attr.isSpecified() method can be used to distinguish between a
 > default and an explicitly entered attribute. Note that even if an
 > explicit value is the same as the default, it is considered
 > Specified. Note too that unless you use a validating parser, the
 > whole concept of default attributes is moot.

Would that it were so simple -- in fact, while the XML REC requires
validating parsers to read default attribute values, it does not
forbid non-validating parsers to use them.  In fact, both AElfred and
Expat (both of which are non-validating) do recognise and use default
attribute values.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From richard at goon.stg.brown.edu  Wed Mar 17 13:54:12 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:10:05 2004
Subject: who's on the XML WG?
References: <85256737.001316A1.00@D51MTA03.pok.ibm.com>
Message-ID: <36EFB3ED.27DA6B72@goon.stg.brown.edu>

keshlam@us.ibm.com wrote:
> 
> If your company is a W3C Member, you can access that information. If
> not, all you can do is watch the public discussions and see who's
> willing to admit to being involved.

How refreshingly open.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From simonstl at simonstl.com  Wed Mar 17 14:34:03 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:05 2004
Subject: XML Java Parsers
In-Reply-To: <C1256737.003C6418.00@mtach2.zurich.com>
Message-ID: <199903171426.JAA02154@hesketh.net>

At 11:02 AM 3/17/99 +0000, hassan.hussein@zurich.com wrote:
>Also, I am trying to compare both commercial and free XML Java parsers
>that are currently available on the web.  I thought it wise to start here
>and get a list of candidates then do the comparison, because otherwise I
>could get lost in the web.

For the Java end of parsers, you can check out:

http://archive.javareport.com/9902/html/products/prod_rev.shtml

It's the only (vaguely recent) comparison of parsers I'm aware of - I
haven't seen any comparisons including parsers for other environments.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From simonstl at simonstl.com  Wed Mar 17 15:11:19 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:05 2004
Subject: [OFF] who's on the XML WG?
In-Reply-To: <14063.38650.573008.410552@localhost.localdomain>
References: <36EF14EA.E82@hiwaay.net>
 <199903162113.QAA19884@hesketh.net>
 <36EF13A7.C9C8A196@finetuning.com>
 <36EF14EA.E82@hiwaay.net>
Message-ID: <199903171508.KAA02978@hesketh.net>

At 06:55 AM 3/17/99 -0500, David Megginson wrote:
>len bullard writes:
>
> > Wow! I can just see all of those journalists, wannabes and industry
> > spies camping outside David Megginson's door.  Almost like being a
> > rock star without the sex.
>
>Without the drugs, too (I can barely handle children's chewable
>multivitamins).  
>
>RFC: New mailing list, XML-Group-Therapy.

Wow.  I always thought Spy vs. Spy was the funniest part of MAD Magazine,
but I didn't expect to set off this kind of entertainment.  Can't say I
thought it would be this complex or the source of so much angst.

And as anyone who has listened to me for the last two years is aware, I'm
hardly a responsible XML journalist.  (Haven't gone to a show on a press
badge in a long, long, time!)  Since my books have such long schedules (one
is verging on infinite, it seems), a slip today would likely be old news in
six months to a year, when the book arrives.  Community member, developer,
commentator, sometimes critic, perhaps.  

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From tbray at textuality.com  Wed Mar 17 15:11:40 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
Message-ID: <3.0.32.19990317070939.00a54a90@pop.intergate.bc.ca>

At 06:59 AM 3/17/99 -0500, David Megginson wrote:
>Would that it were so simple -- in fact, while the XML REC requires
>validating parsers to read default attribute values, it does not
>forbid non-validating parsers to use them.

In fact, it *requires* that processors do attribute defaulting, when
the declarations are in the internal subset. -Tim
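
A small Python check of this behaviour, using the standard library's binding
to Expat (a non-validating parser). The sample document and the function name
are made up for illustration. The default for "lang" is declared in the
internal subset, and the element itself carries no attributes.

```python
import xml.parsers.expat

# Hypothetical sample: the attribute default lives in the internal subset.
DOC = """<!DOCTYPE note [
  <!ATTLIST note lang CDATA "en">
]>
<note/>"""


def start_tag_attributes(doc):
    """Parse doc with Expat and return [(element name, attributes)] per start tag."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append((name, dict(attrs)))
    parser.Parse(doc, True)
    return seen
```

Calling start_tag_attributes(DOC) should report the <note> element with
lang="en" even though no validation takes place.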



From tomh at thinlink.com  Wed Mar 17 15:40:39 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun  7 17:10:05 2004
Subject: [OFF] was Re: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net> <199903162204.OAA18410@sqwest.bc.ca>
Message-ID: <36EFCC90.20DAED92@thinlink.com>

Lauren Wood wrote:

> Nothing stops any company or individual saying "I'm on this group",
> of course.

And nothing stops such a claim from being unverifiable.




From b.laforge at jxml.com  Wed Mar 17 15:41:32 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
Message-ID: <004801be708d$08c6d2a0$46026982@thing1>

From: keshlam@us.ibm.com <keshlam@us.ibm.com>

>In the DOM world, default attribute values should show up automatically as
>attributes of the elements they apply to. The Attr.isSpecified() method can be
>used to distinguish between a default and an explicitly entered attribute. 

It seems then that isSpecified could be used to filter out excess attributes
when attempting to recreate a document from a DOM, at least with the above
interpretation. A useful feature that is hard to achieve when you want to
insert SAX filters between a parser and a DOM.
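
As a sketch of that filtering step in Python: given attributes that carry a
"specified" flag, a serialiser writes back only those the document actually
contained. The (name, value, specified) triples are an assumed shape for
illustration and are not taken from any of the DOM implementations discussed.

```python
from xml.sax.saxutils import quoteattr


def start_tag(name, attributes):
    """Serialise a start tag, keeping only attributes flagged as specified.

    attributes is a list of (name, value, specified) triples.
    """
    kept = "".join(
        f" {att}={quoteattr(value)}"
        for att, value, specified in attributes
        if specified
    )
    return f"<{name}{kept}>"
```

For example, start_tag("item", [("lang", "en", False), ("id", "a1", True)])
leaves out the defaulted lang attribute and keeps id.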

In looking over the Parser/DOM from JavaSoft, it looked like isSpecified is 
always true, even for default values.

In looking over the Docuverse DOM, it looked like isSpecified is also always
true, except for updates made by the application, for which it is always false.

We've been playing with an alternate helper class for AttributeList that lets us
set isSpecified, post parse. When you add a filter that is easily configured for
attribute specification (everything you can do in a DTD), things start to get
interesting. The final piece was finding a way to get the docuverse DOM to
accept an attribute at parse time that is not "specified".

It's been an uphill battle, but we did it without modifying any parsers or DOM
implementations, except for some subclassing of the DOM. Makes us keep 
wondering if we have the wrong idea here about isSpecified.

Bill




From simonstl at simonstl.com  Wed Mar 17 15:58:09 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:05 2004
Subject: Microsoft XML 2.0?
Message-ID: <199903171557.KAA04054@hesketh.net>

Anyone know anything about this? Is it an SDK 2.0?  (What was 1.0?)  Kind
of strange, if you ask me.  I hope it's not Microsoft announcing XML 2.0,
which the outraged reviewer seems to think.

-----------------------------------------
Microsoft XML 2.0 Programmer's Guide and Software Development Kit With CDROM 
- by Microsoft Corporation 

http://www.amazon.com/exec/obidos/ASIN/0735606390/qid=921685338/sr=1-34/002-2838235-4840457
-----------------------------------------

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From keshlam at us.ibm.com  Wed Mar 17 16:08:41 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
Message-ID: <85256737.005868AF.00@D51MTA03.pok.ibm.com>

>It seems then that isSpecified could be used to filter-out excess
>attributes when attempting to recreate a document from a DOM

That's the intent.

>In looking over the Parser/DOM from JavaSoft, it looked like isSpecified
>is always true, even for default values.
>
>In looking over the Docuverse DOM, it looked like isSpecified is also
>always true, except for updates made by the application, for which it
>is always false.

If so, my reading is that those are deviations from the DOM Recommendation -- in
other words, bugs.

I sorta understand why Docuverse could have gotten this wrong; the spec's
description starts off with a discussion of "original document", which is
probably an error... though the summary lays out the intent quite precisely in
terms of assigned versus default values.

I'll suggest a clarification of the document; you may want to contact your DOM
suppliers and suggest that they recheck the definition of this parameter.

______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From srn at techno.com  Wed Mar 17 16:24:23 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
In-Reply-To: <85256737.00129928.00@D51MTA03.pok.ibm.com> (keshlam@us.ibm.com)
References: <85256737.00129928.00@D51MTA03.pok.ibm.com>
Message-ID: <199903171524.JAA01148@bruno.techno.com>

[Joe Kesselman:]

> So for now, the DOM really doesn't provide a good API for working
> with DTDs, and parsers and applications have to either accept those
> limits or fall back on nonstandard interfaces that may vanish in
> later versions of the code.

Uh, "nonstandard" is not the correct adjective.  "Non-W3C-recommended"
is much more correct.  In fact, there has been an internationally
standard model for the API to DTDs for over 3 years: the SGML Property
Set, found in ISO/IEC 10744:1997 ("HyTime") and in ISO/IEC 10179:1996
("DSSSL").

An API based on this internationally standard model is available in
the "MarkMinder" SGML engine, a plug-in module of the "GroveMinder"
product (see http://www.techno.com/gminder3.htm).  A demonstration of
GroveMinder is now available to prospective licensees.  It includes a
document set that includes XML, HyTime, SGML, Word and Excel
resources, and it demonstrates extended hyperlinking using HyTime varlink
(aka XML XLink) and transclusions.  It provides a GroveMinder-based
http server in Python source form.  I demonstrated it during my talk
at XTech 99.

As for values of default attributes, the SGML Property Set allows the
API to report not only the defaulted value, but also the fact that it
was defaulted.  Indeed, one can tell exactly what was
explicitly in the document, and what was supplied at parse time.  I
hope and believe that the "XML Infoset" committee will come up with an
information set that will be equally revealing in this respect.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From cowan at locke.ccil.org  Wed Mar 17 16:40:44 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
References: <004801be708d$08c6d2a0$46026982@thing1>
Message-ID: <36EFDADE.A349AA77@locke.ccil.org>

Bill la Forge wrote:

> In looking over the Parser/DOM from JavaSoft, it looked like isSpecified is
> always true, even for default values.
> 
> In looking over the Docuverse DOM, it looked like isSpecified is also always
> true, except for updates made by the application, for which it is always false.

This is one of the deficiencies of SAX considered as a DOM builder;
it doesn't give you everything the DOM requires.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From keshlam at us.ibm.com  Wed Mar 17 16:44:40 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
Message-ID: <85256737.005BD03E.00@D51MTA03.pok.ibm.com>

>Uh, "nonstandard" is not the correct adjective.  "Non-W3C-recommended"
>is much more correct.  In fact, there has been an internationally
>standard model for the API to DTDs for over 3 years: the SGML Property
>Set, found in ISO/IEC 10744:1997 ("HyTime") and in ISO/IEC 10179:1996
>("DSSSL").

Unfortunately, that may not be what's being implemented by DOM developers, so I
stand by my statement: there is no guarantee that any given DOM has any given
API to its DTDs (outside of the simple calls included in DOM Level 1), and
whatever it does supply is not likely to follow any standard.

We're all waiting for the Schema group to report out so the DOM WG feels
comfortable implementing a real schema API. Until then, caveat usetor.

______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From simpson at polaris.net  Wed Mar 17 17:01:36 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun  7 17:10:05 2004
Subject: Microsoft XML 2.0?
Message-ID: <3.0.32.19990317115600.007e7960@polaris.net>

At 11:00 AM 3/17/99 -0500, Simon St.Laurent wrote:
>Anyone know anything about this? Is it an SDK 2.0?  (What was 1.0?)  Kind
>of strange, if you ask me.  I hope it's not Microsoft announcing XML 2.0,
>which the outraged reviewer seems to think.
>
>-----------------------------------------
>Microsoft XML 2.0 Programmer's Guide and Software Development Kit With CDROM 
>- by Microsoft Corporation 
>
>http://www.amazon.com/exec/obidos/ASIN/0735606390

Fwiw, the Microsoft Press website (http://mspress.microsoft.com) apparently
has no mention of the book. Guess this will have to be one more piece of
the conspiracy puzzle!

=============================================================
John E. Simpson          | It's no disgrace t'be poor, 
simpson@polaris.net      | but it might as well be.
                         |            -- "Kin" Hubbard



From lisarein at finetuning.com  Wed Mar 17 17:04:53 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:10:05 2004
Subject: [OFF] who's on the XML WG?
References: <36EF14EA.E82@hiwaay.net>
	 <199903162113.QAA19884@hesketh.net>
	 <36EF13A7.C9C8A196@finetuning.com>
	 <36EF14EA.E82@hiwaay.net> <199903171508.KAA02978@hesketh.net>
Message-ID: <36EFE3AF.FE9967E6@finetuning.com>

> Wow.  I always thought Spy vs. Spy was the funniest part of MAD Magazine,
> but I didn't expect to set off this kind of entertainment.  

oh it only gets like that when microsoft's involved :-)


> Can't say I
> thought it would be this complex or the source of so much angst.
> 
> And as anyone who has listened to me for the last two years is aware, I'm
> hardly a responsible XML journalist.  

how bout just a polite colleague then?

> (Haven't gone to a show on a press
> badge in a long, long, time!)  

Wow! what a waste of cash!  silly goose!  Never, i repeat! NEVER!  give
those up!  No matter what else you do!

> a slip today would likely be old news in
> six months to a year, when the book arrives.  

But once again, I wasn't talking about technical information that would
be outdatable, I'm just talking about not making working group members
uncomfortable....or really as long as YOU'RE not uncomfortable with it --
they're probably used to it.

> Since my books have such long schedules (one
> is verging on infinite, it seems),

I'm a little too sensitive to this subject right now to crack jokes.
:-)


Lisa Rein
XML Explained
(due out summer hopefully :-)
Addison-Wesley Longman
http://www.finetuning.com



From cowan at locke.ccil.org  Wed Mar 17 17:28:16 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
References: <85256737.00129928.00@D51MTA03.pok.ibm.com> <199903171524.JAA01148@bruno.techno.com>
Message-ID: <36EFE58A.E37657A3@locke.ccil.org>

Steven R. Newcomb wrote:

> As for values of default attributes, the SGML Property Set allows the
> API to report not only to report the defaulted value, but also the
> fact that it was defaulted.

Where can I lay my hands on this property set in intelligible
form?

> Indeed, one can tell exactly what was
> explicitly in the document, and what was supplied at parse time.  I
> hope and believe that the "XML Infoset" committee will come up with an
> information set that will be equally revealing in this respect.

You may well hope.  Are you by chance an invited expert?

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From lauren at sqwest.bc.ca  Wed Mar 17 17:28:52 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:05 2004
Subject: How to ??
References: <85256737.00129928.00@D51MTA03.pok.ibm.com> <199903171524.JAA01148@bruno.techno.com>
Message-ID: <36EFE612.871C5407@sqwest.bc.ca>

"Steven R. Newcomb" wrote:

> As for values of default attributes, the SGML Property Set allows the
> API to report not only to report the defaulted value, but also the
> fact that it was defaulted.  

As does the DOM. BTW, one of the major inputs to the DOM was the
SGML property set; we took out parts that were not in XML and came
up with an API to the rest, modified by existing practice in various
APIs. Though I'm not guaranteeing that what we have left is a pure
subset of the SGML property set.
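
A minimal sketch of how the flag surfaces through the DOM, assuming the
parser bundled with the JDK (JAXP) and a made-up sample document in which
attribute "b" is supplied only by a DTD default.  The class name
SpecifiedDemo and the sample document are hypothetical; Attr.getSpecified()
is the Java binding of the DOM's "specified" flag.  A DOM built from SAX 1.0
events may not be able to report the difference.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical demo class; not part of any DOM implementation.
class SpecifiedDemo {
    // "a" is written on the element; "b" exists only as a DTD default.
    static final String XML =
        "<!DOCTYPE doc [\n"
      + "  <!ATTLIST doc a CDATA #IMPLIED\n"
      + "                b CDATA \"dflt\">\n"
      + "]>\n"
      + "<doc a=\"1\"/>";

    // Returns the specified flag of the named attribute on the root element.
    static boolean specified(String attrName) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(XML)))
                                  .getDocumentElement();
            Attr attr = root.getAttributeNode(attrName);
            return attr.getSpecified();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With the JDK's own parser one would expect specified("a") to be true and
specified("b") to be false; other builders may differ.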

Lauren



From b.laforge at jxml.com  Wed Mar 17 18:04:21 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
Message-ID: <002a01be70a1$26aefb80$46026982@thing1>

>If so, my reading is that those are deviations from the DOM Recommendation -- in
>other words, bugs.


Frankly, I think the problem is more with the fact that they are SAX-based. 

Complicating matters is the lack of a method for setSpecified (DOM is somewhat
incomplete from the perspective of an application) and a particular DOM implementation
which uses a private variable for the specified variable. (I haven't looked at the
JavaSoft implementation yet. This was true for just one DOM implementation.) 

But subclassing was still viable. We just had to define a second specified variable
and override the isSpecified method to use it. This will all be included in the upcoming
(and still open source) production release of MDSAX.

Adding specified to SAX is really pretty simple--no changes to any interfaces are needed.
It's just a matter of extending the existing AttributeList interface in either the parser or
a filter which has access to default values. The DOM builder then can check for the use
of this extended interface to access the specified value.
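
A rough sketch of that idea, for illustration only: it is not the MDSAX
code.  SpecifiedAttributeList and BuilderSide are hypothetical names; the
only real API used is SAX 1.0's AttributeList and AttributeListImpl.  The
sketch does not keep the flags in step when attributes are removed or
cleared.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.AttributeList;
import org.xml.sax.helpers.AttributeListImpl;

// Hypothetical extended list: the SAX 1.0 helper plus one "specified"
// flag per attribute.  Removal and clear() are not handled here.
@SuppressWarnings("deprecation") // SAX 1.0 types are deprecated in the JDK
class SpecifiedAttributeList extends AttributeListImpl {
    private final List<Boolean> specified = new ArrayList<>();

    // Adds an attribute together with its specified flag.
    void addAttribute(String name, String type, String value,
                      boolean wasSpecified) {
        super.addAttribute(name, type, value);
        specified.add(wasSpecified);
    }

    // Plain SAX 1.0 callers get the conservative answer "specified".
    @Override
    public void addAttribute(String name, String type, String value) {
        addAttribute(name, type, value, true);
    }

    boolean getSpecified(int i) {
        return specified.get(i);
    }
}

@SuppressWarnings("deprecation")
class BuilderSide {
    // What a DOM builder could do: use the flag when the extended list is
    // present, otherwise assume "specified" because plain SAX 1.0 cannot say.
    static boolean isSpecified(AttributeList atts, int i) {
        if (atts instanceof SpecifiedAttributeList) {
            return ((SpecifiedAttributeList) atts).getSpecified(i);
        }
        return true;
    }
}
```

The instanceof test is what keeps the existing interfaces untouched: a
parser or filter that knows about defaults hands over the richer list, and
any other parser keeps working as before.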

I've appended the AttributeList and Attribute interfaces to this email. This is what we're
working with now, though your input would be welcome. In my mind, including these
interfaces (or something similar) in ModSAX/OpenSAX/SAX2 would be the best of all
worlds.

Bill

import org.xml.sax.AttributeList;

public interface MDAttributeList extends AttributeList
{
  /** Make this attribute list a copy of another. */
  public void copy(AttributeList attList);

  /** Get an attribute object (by name). */
  public MDAttribute getMDAttribute(String name);

  /** Get an attribute object (by position). */
  public MDAttribute getMDAttribute(int i);

  /** Add an attribute. */
  public void putMDAttribute(MDAttribute att);

  /** Remove all attributes. */
  public void clear();

  /** Remove an attribute. */
  public void removeMDAttribute(MDAttribute att);

  /** Remove an attribute by name. */
  public void removeMDAttribute(String name);

  /** Remove an attribute by position. */
  public void removeMDAttribute(int i);

  /** Returns the specified property for the ith attribute. */
  public boolean getSpecified(int i);

  /** Returns the specified property for the named attribute. */
  public boolean getSpecified(String name);
}


public interface MDAttribute
{
  /**
   * Make this attribute a copy of another, except
   * that the name of the attribute is not changed.
   */
  public void copy(MDAttribute att);

  /** The attribute name; it has no setter. */
  public String getName();

  public String getType();

  public void setType(String type);

  public String getValue();

  public void setValue(String value);

  /** False when the value was supplied as a default. */
  public boolean getSpecified();

  public void setSpecified(boolean specified);
}




From martind at netfolder.com  Wed Mar 17 18:20:53 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:06 2004
Subject: Need XML documents located on web servers for testing
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEHDCPAA.martind@netfolder.com>

Hi

We are doing tests on the SGML/XML kit version 2. The main difference with
version 1 is that now it can access and render documents from the Web (HTTP
protocol). We need XML documents with associated XSL, CSS or DSSSL style
sheets to do some tests. The documents have to be located on an HTTP server
and have an associated style sheet. If you have some, tell us your link so
that we can do some tests.

Thanks a lot
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From digitome at iol.ie  Wed Mar 17 19:26:18 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:10:06 2004
Subject: Unicode Font Required
Message-ID: <3.0.6.32.19990317191101.00c5d150@gpo.iol.ie>

The cyberbit Unicode font from Bitstream is no longer
available for download:-( Can anyone recommend a URL for
Unicode fonts for Win-32?


<Sean uri="http://www.digitome.com/sean.htm"/>




From david at megginson.com  Wed Mar 17 19:31:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
In-Reply-To: <36EFE58A.E37657A3@locke.ccil.org>
References: <85256737.00129928.00@D51MTA03.pok.ibm.com>
	<199903171524.JAA01148@bruno.techno.com>
	<36EFE58A.E37657A3@locke.ccil.org>
Message-ID: <14064.545.466004.80375@localhost.localdomain>

John Cowan writes:

 > Steven R. Newcomb wrote:
 > 
 > > As for values of default attributes, the SGML Property Set allows the
 > > API to report not only to report the defaulted value, but also the
 > > fact that it was defaulted.
 > 
 > Where can I lay my hands on this property set in intelligible
 > form?

<laughter categories="cynical raucous"/>

A long time ago, I made an attempt at presenting some of the simpler
parts in a more API-like format; you can find the result at my ancient
personal web site:

  http://home.sprynet.com/~dmeggins/grove.html  

I never touched the DTD-related stuff, though.  Did Paul Prescod put
together a groves tutorial once?  Something seems to be ringing in my
head...


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 17 19:39:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
In-Reply-To: <002a01be70a1$26aefb80$46026982@thing1>
References: <002a01be70a1$26aefb80$46026982@thing1>
Message-ID: <14064.1097.894096.155245@localhost.localdomain>

Bill la Forge writes:

 > >If so, my reading is that those are deviations from the DOM
 > >Recommendation -- in other words, bugs.
 > 
 > 
 > Frankly, I think the problem is more with the fact that they are
 > SAX-based.

Or more generally, the point that people constantly forget is that the
DOM is an interface, not a processor -- it provides a means for
representing the information that is made available to it, which may
or may not correspond in any obvious way with the information in the
original XML document (if there ever was an original XML document).

If SAX 1.0 made isSpecified information available, the DOM could model
it; since SAX 1.0 does not make that information available, the DOM
interface has nothing to pass on.  In other words, this is a
quality-of-implementation issue, not a conformance issue.
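
To make the limitation concrete, here is a small sketch assuming the JDK's
bundled parser driven through the SAX 1.0 HandlerBase callback.  The class
name Sax1View is hypothetical.  The handler records everything the
AttributeList exposes (name, type, value), and nothing in that record says
whether a value was written in the document or filled in from a default.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Map;
import java.util.TreeMap;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.AttributeList;
import org.xml.sax.HandlerBase;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper showing what a SAX 1.0 consumer can observe.
@SuppressWarnings("deprecation") // SAX 1.0 types are deprecated in the JDK
class Sax1View {
    // Maps attribute name to "type:value" for every attribute reported.
    static Map<String, String> attributesSeen(String xml) {
        Map<String, String> seen = new TreeMap<>();
        HandlerBase handler = new HandlerBase() {
            @Override
            public void startElement(String name, AttributeList atts) {
                for (int i = 0; i < atts.getLength(); i++) {
                    // Name, type and value are the only accessors available.
                    seen.put(atts.getName(i),
                             atts.getType(i) + ":" + atts.getValue(i));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }
}
```

Fed a document whose DTD declares a default for one attribute, the map
holds both the written and the defaulted attribute in the same shape, so a
DOM builder sitting on top has no basis for setting the flag to false.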


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From lauren at sqwest.bc.ca  Wed Mar 17 19:57:05 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
References: <002a01be70a1$26aefb80$46026982@thing1> <14064.1097.894096.155245@localhost.localdomain>
Message-ID: <36F008B4.1861E333@sqwest.bc.ca>

David Megginson wrote:

> Or more generally, the point that people constantly forget is that the
> DOM is an interface, not a processor -- it provides a means for
> representing the information that is made available to it, which may
> or may not correspond in any obvious way with the information in the
> original XML document (if there ever was an original XML document).

And in fact people could use (some of?) the DOM interfaces to access
documents that never were and never would be XML, simply by treating
objects as elements, attributes etc. The DOM makes no guarantees as
to what the original document was, simply that once you have a
structure model built, you can delete things called elements, change
the content of things called comments, etc.
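
As a sketch of that point, assuming the JAXP DocumentBuilder that ships
with the JDK: the tree below is assembled from program data and no XML text
is ever parsed.  The class name NoSourceDocument and the element names are
made up for illustration.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Comment;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical example class; the element names are invented.
class NoSourceDocument {
    // Builds a small tree directly; there is no original XML document.
    static Document build() {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance()
                                        .newDocumentBuilder()
                                        .newDocument();
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        Element root = doc.createElement("inventory");
        doc.appendChild(root);
        root.appendChild(doc.createComment("draft"));
        Element item = doc.createElement("item");
        item.setAttribute("sku", "42");
        root.appendChild(item);
        root.appendChild(doc.createElement("obsolete"));
        return doc;
    }

    // The edits described above: delete an element, change a comment.
    static void edit(Document doc) {
        Element root = doc.getDocumentElement();
        root.removeChild(root.getElementsByTagName("obsolete").item(0));
        Comment first = (Comment) root.getFirstChild();
        first.setData("final");
    }
}
```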

> If SAX 1.0 made isSpecified information available, the DOM could model
> it; since SAX 1.0 does not make that information available, the DOM
> interface has nothing to pass on.  In other words, this is a
> quality-of-implementation issue, not a conformance issue.

Non-validating parsers don't need to pass this information on to the
DOM structure model either, since they may not have it. And XML
processors may choose not to pass comments on at all, or to
completely expand entity references before building the structure
model. All of which makes those DOM interfaces not relevant to those
documents, although obviously scripts or other applications that
call them shouldn't crash and burn just because the processor
doesn't pass on comments, for example.
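
A short sketch of application code that copes either way, assuming JAXP's
DocumentBuilderFactory, whose setIgnoringComments switch stands in here for
a processor that does not pass comments on.  The class name CommentTolerant
is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical example class.
class CommentTolerant {
    // Parses xml; when ignoreComments is true the builder drops comments.
    static Document parse(String xml, boolean ignoreComments) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setIgnoringComments(ignoreComments);
            return factory.newDocumentBuilder()
                          .parse(new InputSource(new StringReader(xml)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Counts comment children of the root element.  Zero is an ordinary
    // answer, not an error, so callers need no special case.
    static int countComments(Document doc) {
        int count = 0;
        NodeList kids = doc.getDocumentElement().getChildNodes();
        for (int i = 0; i < kids.getLength(); i++) {
            if (kids.item(i).getNodeType() == Node.COMMENT_NODE) {
                count++;
            }
        }
        return count;
    }
}
```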

Lauren



From roddey at us.ibm.com  Wed Mar 17 20:13:53 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:06 2004
Subject: Why is this JC test not-wf?
Message-ID: <87256737.006A78E4.00@d53mta03h.boulder.ibm.com>


Ok, so I'm running through the James Clark tests with my new parser and I
don't exactly understand why not-wf\sa\081.xml is not well formed
necessarily. Here is the text:

   <!DOCTYPE doc [
   <!ENTITY e SYSTEM "nul">
   ]>
   <doc a="&e;"></doc>

So it defines an external entity 'e', which resolves to a file named 'nul',
which is an empty file. Then it references that entity as the value of the
attribute 'a'. To me, that seems perfectly fine. I could have legally
written manually:

     <doc a=""></doc>

and, since I think that is legal when not validating, why would the
test file not produce exactly that result? And if it does produce that
result, how is that less well formed than what I could have typed myself?
There is no prohibition against empty attribute values that I know of.




From paul at prescod.net  Wed Mar 17 20:28:47 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
References: <85256737.00129928.00@D51MTA03.pok.ibm.com>
		<199903171524.JAA01148@bruno.techno.com>
		<36EFE58A.E37657A3@locke.ccil.org> <14064.545.466004.80375@localhost.localdomain>
Message-ID: <36F00CB2.7AB2D64D@prescod.net>

David Megginson wrote:
> 
> I never touched the DTD-related stuff, though.  Did Paul Prescod put
> together a groves tutorial once?  Something seems to be ringing in my
> head...

Ding dong.

http://www.prescod.net/groves/shorttut

And here is a link to a readable (comparatively speaking!) version of the
SGML property set.

http://www.hytime.org/materials/sgml-esis/

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"If you were casting Bob Kane's character by disposition, you would 
never in a million years think Michael Keaton or George Clooney.
Good God: George Clooney? If you were casting Bob Kane's Batman, even 
the likes of Tim Roth or Christopher Walken would be much too 
lighthearted to play this demonic avenger."
	- http://www.salonmagazine.com/feature/1998/11/06feature.html



From cowan at locke.ccil.org  Wed Mar 17 20:33:02 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
References: <85256737.00129928.00@D51MTA03.pok.ibm.com>
		<199903171524.JAA01148@bruno.techno.com>
		<36EFE58A.E37657A3@locke.ccil.org> <14064.545.466004.80375@localhost.localdomain>
Message-ID: <36F01140.5BA253AB@locke.ccil.org>

David Megginson wrote:

> A long time ago, I made an attempt at presenting some of the simpler
> parts in a more API-like format; you can find the result at my ancient
> personal web site:
> 
>   http://home.sprynet.com/~dmeggins/grove.html

Very good.  The only thing I do not grok is the boolean "included?"
property of an Element.  Whazzatmean?
 
-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From jtauber at jtauber.com  Wed Mar 17 20:35:24 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:06 2004
Subject: Why is this JC test not-wf?
Message-ID: <003501be70b5$b25a33c0$0300000a@othniel.cygnus.uwa.edu.au>


>So it defines an external entity 'e', which resolves to a file named 'nul',
>which is an empty file. Then it references that entity as the value of the
>attribute 'a'. To me, that seems perfectly fine.

Attribute values cannot contain references to external entities. See the
"No External Entity References" WFC in section 3.1 of the XML 1.0 spec.

I wanted to do this on XMLSOFTWARE where I wanted to have an attribute
Updated="&date;" where date was an external entity changed daily by a cron
job. Instead I just made it an element rather than an attribute.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8 : http://www8.org/

Maintainer of : www.xmlinfo.com,  www.xmlsoftware.com and www.schema.net




From tbray at textuality.com  Wed Mar 17 20:44:07 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:06 2004
Subject: Why is this JC test not-wf?
Message-ID: <3.0.32.19990317124207.018f6100@pop.intergate.bc.ca>

At 12:22 PM 3/17/99 -0700, roddey@us.ibm.com wrote:
>
>
>
>Ok, so I'm running through the James Clark tests with my new parser and I
>don't exactly understand why not-wf\sa\081.xml is not well formed
>necessarily. Here is the text:
>
>   <!DOCTYPE doc [
>   <!ENTITY e SYSTEM "nul">
>   ]>
>   <doc a="&e;"></doc>

It's not WF because there is a specific prohibition against referring
to external entities in attribute values. -Tim
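
A quick way to see the rule in action, sketched with the JDK's bundled
JAXP parser (an assumption; any conforming processor should reject the
document).  The class name ExternalEntityInAttribute is hypothetical, and
TEST_081 simply repeats the test text quoted above.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class.
class ExternalEntityInAttribute {
    // The text of not-wf/sa/081.xml as quoted in this thread.
    static final String TEST_081 =
        "<!DOCTYPE doc [\n"
      + "<!ENTITY e SYSTEM \"nul\">\n"
      + "]>\n"
      + "<doc a=\"&e;\"></doc>";

    // Returns false when the parser reports a fatal (well-formedness) error.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing them.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The literal form with an empty attribute value, <doc a=""></doc>, passes
the same check; it is the reference to the external entity that is
rejected, not the empty value.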



From donpark at quake.net  Wed Mar 17 20:58:42 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
Message-ID: <00a901be70b8$d510bd50$2ee044c6@arcot-main>

>If so, my reading is that those are deviations from the DOM
>Recommendation -- in other words, bugs.
>
>I sorta understand why Docuverse could have gotten this wrong; the spec's
>description starts off with a discussion of "original document", which is
>probably an error... though the summary lays out the intent quite precisely
>in terms of assigned versus default values.


Joe,

Attr.isSpecified is always true in Docuverse SDK only because the SAXReader,
which builds a DOM Document from SAX events, has no way to tell whether an
attribute is specified or not.  The Attr implementation itself allows creation
of 'default' Attrs.  As far as misinterpreting the DOM spec is concerned, I
would have to be a serious idiot to do that since I am in the DOM IG just as
you are.  I am busy but I am not that busy.

Best,

Don Park
Docuverse




From gtn at eps.inso.com  Wed Mar 17 21:03:26 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
In-Reply-To: <14064.545.466004.80375@localhost.localdomain>
Message-ID: <000001be70b8$7e9e75c0$577670c6@eps.inso.com>

> A long time ago, I made an attempt at presenting some of the simpler
> parts in a more API-like format; you can find the result at my ancient
> personal web site:

I did some work on this in the early days of the DOM too. I think some
of those efforts might still be lying around the DOM WG web site.




From david at megginson.com  Wed Mar 17 21:05:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:06 2004
Subject: Why is this JC test not-wf?
In-Reply-To: <87256737.006A78E4.00@d53mta03h.boulder.ibm.com>
References: <87256737.006A78E4.00@d53mta03h.boulder.ibm.com>
Message-ID: <14064.6300.4075.636603@localhost.localdomain>

roddey@us.ibm.com writes:

 > Ok, so I'm running through the James Clark tests with my new parser and I
 > don't exactly understand why not-wf\sa\081.xml is not well formed
 > necessarily. Here is the text:
 > 
 >    <!DOCTYPE doc [
 >    <!ENTITY e SYSTEM "nul">
 >    ]>
 >    <doc a="&e;"></doc>

External entity references are not allowed in attribute values.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 17 21:07:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:06 2004
Subject: How to ??
In-Reply-To: <36F01140.5BA253AB@locke.ccil.org>
References: <85256737.00129928.00@D51MTA03.pok.ibm.com>
	<199903171524.JAA01148@bruno.techno.com>
	<36EFE58A.E37657A3@locke.ccil.org>
	<14064.545.466004.80375@localhost.localdomain>
	<36F01140.5BA253AB@locke.ccil.org>
Message-ID: <14064.6478.715444.949451@localhost.localdomain>

John Cowan writes:

 > Very good.  The only thing I do not grok is the boolean "included?"
 > property of an Element.  Whazzatmean?

I don't recall, but I'd guess that it means that the element was
allowed by an inclusion exception.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From cowan at locke.ccil.org  Wed Mar 17 21:14:30 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:06 2004
Subject: Why is this JC test not-wf?
References: <87256737.006A78E4.00@d53mta03h.boulder.ibm.com>
Message-ID: <36F01B06.56EAA5A8@locke.ccil.org>

roddey@us.ibm.com wrote:

>    <!DOCTYPE doc [
>    <!ENTITY e SYSTEM "nul">
>    ]>
>    <doc a="&e;"></doc>

References to external entities aren't allowed in attribute values,
only in content.  That keeps attribute values simple for non-validating
parsers that don't load external entities.

Or it was supposed to.  In fact, since NVPs don't have to read
external parameter entities either, we can wind up with an unresolvable
reference to an internal entity that is externally declared.
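
A minimal sketch of the rule, assuming Python's bundled expat bindings
(a non-validating parser that, with no handler installed, does not load
external entities). The helper name is_well_formed and the three sample
documents are illustrative only; the first document is the test case
quoted above.

```python
from xml.parsers import expat


def is_well_formed(document: str) -> bool:
    """Return True if expat accepts the document, False if it reports an error."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError:
        return False
    return True


# External entity referenced from an attribute value: not well-formed.
EXTERNAL_IN_ATTRIBUTE = """<!DOCTYPE doc [
<!ENTITY e SYSTEM "nul">
]>
<doc a="&e;"></doc>"""

# The same external entity referenced from content: allowed.
EXTERNAL_IN_CONTENT = """<!DOCTYPE doc [
<!ENTITY e SYSTEM "nul">
]>
<doc>&e;</doc>"""

# An internal entity referenced from an attribute value: allowed.
INTERNAL_IN_ATTRIBUTE = """<!DOCTYPE doc [
<!ENTITY e "some text">
]>
<doc a="&e;"></doc>"""

for label, text in [
    ("external entity in attribute", EXTERNAL_IN_ATTRIBUTE),
    ("external entity in content", EXTERNAL_IN_CONTENT),
    ("internal entity in attribute", INTERNAL_IN_ATTRIBUTE),
]:
    print(label, "->", "accepted" if is_well_formed(text) else "rejected")
```

Only the first document should be rejected. In the second, the parser
never reads "nul" at all; it simply passes over the reference.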
 
-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Wed Mar 17 21:15:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:07 2004
Subject: ModSAX: Proposed Core Properties
In-Reply-To: <01BE705C.DB375010@grappa.ito.tu-darmstadt.de>
References: <01BE705C.DB375010@grappa.ito.tu-darmstadt.de>
Message-ID: <14064.6789.718797.734226@localhost.localdomain>

Ronald Bourret writes:
 > David Megginson wrote:
 > 
 > > So far, for consistency, I've made all pre-parse properties
 > > write-only and all parse-time properties read-only.  We'll have
 > > to think about this one.
 > 
 > What is the possible benefit of making any property write-only?
 > That is, can any harm ever come from reading a property?

There are three benefits:

1. Keep the API absolutely as small as possible.
2. Avoid confusion.
3. Allow properties to be unknown until set.

Any attempt to access a property can generate a
SAXNotSupportedException (or the derived SAXNotRecognizedException),
but there is no guarantee that they will be symmetrical.  For example,
a driver built on top of a parser might allow you to set the namespace
separator but not to query it, or vice-versa.
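
The asymmetry can be sketched roughly as follows. This is not the ModSAX
interface itself: ToyDriver, its method names and the property URI are
invented for illustration, and only the two exception classes are real
(they ship in Python's xml.sax package, where both derive directly from
SAXException).

```python
from xml.sax import SAXNotRecognizedException, SAXNotSupportedException

# Hypothetical property identifier, invented for this sketch.
NAMESPACE_SEP = "http://example.org/modsax/properties/namespace-sep"


class ToyDriver:
    """A driver whose namespace separator can be set but not queried."""

    def __init__(self):
        self._namespace_sep = None  # unknown until set

    def set_property(self, name, value):
        if name != NAMESPACE_SEP:
            raise SAXNotRecognizedException("unknown property: " + name)
        self._namespace_sep = value

    def get_property(self, name):
        if name != NAMESPACE_SEP:
            raise SAXNotRecognizedException("unknown property: " + name)
        # The property is recognised, but this driver cannot report it back.
        raise SAXNotSupportedException("write-only property: " + name)


def describe(driver, name):
    """Report how the driver reacts to a query for the named property."""
    try:
        return driver.get_property(name)
    except SAXNotRecognizedException:
        return "not recognized"
    except SAXNotSupportedException:
        return "not supported"


driver = ToyDriver()
driver.set_property(NAMESPACE_SEP, "^")        # setting works
print(describe(driver, NAMESPACE_SEP))         # querying does not
print(describe(driver, "http://example.org/no-such-property"))
```

A caller therefore has to be prepared for either exception on every
access, in either direction.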


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From cowan at locke.ccil.org  Wed Mar 17 21:31:36 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:07 2004
Subject: XCatalog
References: <00d101be70bb$7f2a81c0$2ee044c6@arcot-main>
Message-ID: <36F01F02.2EEECD05@locke.ccil.org>

Don Park wrote:

> You wrote a few days ago that you were abandoning XCatalog.

More precisely:  I no longer think the XML syntax described in
the XCatalog paper makes much sense.  The Socat subset is
more interoperable and more sensible.

The main reason is that it's unreasonable, IMHO, to ask a
parser to recursively parse XML-XCatalog format while it's
parsing some other document.  Socat format on the other hand
is an easy hack.

I am still looking for someone to adopt my Socat and SocatResolver
Java classes, fix the remaining bugs and integrate them into
a framework.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From jes at kuantech.com  Wed Mar 17 22:04:51 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:10:07 2004
Subject: RDF considered optional
Message-ID: <000801be70c1$edfe0800$5118a8c0@kuantech1.quokka.com>

The RDF syntax spec states that "The RDF element is optional if the content can be known to be RDF from the application context." 

Can anyone clarify exactly what is meant by "application context"? 

Jeff

-----------------------------------------------------------------
Kuantech, Inc.                            http://www.kuantech.com
Jeffrey E. Sussna, Principal                     jes@kuantech.com

Distributed Content Architectures for Dynamic Online Applications
-----------------------------------------------------------------




From cowan at locke.ccil.org  Wed Mar 17 22:23:20 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:07 2004
Subject: RDF considered optional
References: <000801be70c1$edfe0800$5118a8c0@kuantech1.quokka.com>
Message-ID: <36F02B36.F5E46355@locke.ccil.org>

Jeffrey E. Sussna wrote:

> The RDF syntax spec states that "The RDF element is optional if the
> content can be known to be RDF from the application context."
> 
> Can anyone clarify exactly what is meant by "application context"?

I take that to mean that if the application already knows it is
dealing with RDF, no RDF element is required.  It's just a container
for the true RDF anyway.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Wed Mar 17 22:42:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:07 2004
Subject: XCatalog
In-Reply-To: <36F01F02.2EEECD05@locke.ccil.org>
References: <00d101be70bb$7f2a81c0$2ee044c6@arcot-main>
	<36F01F02.2EEECD05@locke.ccil.org>
Message-ID: <14064.9093.654524.171248@localhost.localdomain>

John Cowan writes:

 > The main reason is that it's unreasonable, IMHO, to ask a
 > parser to recursively parse XML-XCatalog format while it's
 > parsing some other document.  Socat format on the other hand
 > is an easy hack.

That same issue might be what keeps DTD syntax alive for simple uses
-- sure, DTDs aren't all that powerful compared to most of the new
proposals out there, but it's compact and easy to parse and it doesn't
require you to process yet another XML document (it's also supported
by lots of existing software and is backed up by approved ISO and W3C
specs).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jes at kuantech.com  Wed Mar 17 22:48:02 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:10:07 2004
Subject: RDF considered optional
In-Reply-To: <36F02B36.F5E46355@locke.ccil.org>
Message-ID: <000901be70c7$ead3bb60$5118a8c0@kuantech1.quokka.com>

I took it the same way. But doesn't that violate the principle of XML as being self-describing? The document is now tied to a specific set of applications, rather than only being tied to any applications that are capable of reading and understanding a schema. I suspect either I or the spec are being a little pedantic here; given that there is a namespace specifier, the application will always know what it's dealing with, but something feels wrong about binding a description language to an application context.

Jeff

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
John Cowan
Sent: Wednesday, March 17, 1999 2:23 PM
To: XML Dev
Subject: Re: RDF considered optional


Jeffrey E. Sussna wrote:

> The RDF syntax spec states that "The RDF element is optional if the
> content can be known to be RDF from the application context."
> 
> Can anyone clarify exactly what is meant by "application context"?

I take that to mean that if the application already knows it is
dealing with RDF, no RDF element is required.  It's just a container
for the true RDF anyway.

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From david at megginson.com  Wed Mar 17 22:56:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:07 2004
Subject: RDF considered optional
In-Reply-To: <000801be70c1$edfe0800$5118a8c0@kuantech1.quokka.com>
References: <000801be70c1$edfe0800$5118a8c0@kuantech1.quokka.com>
Message-ID: <14064.12308.540925.58995@localhost.localdomain>

Jeffrey E. Sussna writes:

 > The RDF syntax spec states that "The RDF element is optional if the
 > content can be known to be RDF from the application context."
 > 
 > Can anyone clarify exactly what is meant by "application context"? 

That means that if you know in advance what you're reading, you don't
need to have an rdf:RDF element.  For example, if I define an
RDF-based format for exchanging Flight-Simulator scenery information
(FSML), and I never intend to embed this in a web page, etc., I can
just use something like this:

  <?xml version="1.0"?>

  <!-- Yes, Virginia, this is RDF-conformant! -->

  <fsml:Building xmlns:fsml="http://flightsim.com/ns/fsml#">
   <fsml:name>Lennox Generating Station</fsml:name>
   <fsml:latitude>N44*8'33"</fsml:latitude>
   <fsml:longitude>W76*51'9"</fsml:longitude>
   <fsml:length>300m</fsml:length>
   <fsml:width>150m</fsml:width>
  </fsml:Building>

What's really entertaining is that I can do this, and be
RDF-conformant, without even declaring the RDF namespace.

The most recent version of SiRPAC that I saw still chokes on this
example, but I consider that a bug in SiRPAC (or at least, a
deliberate implementation choice).

For the curious, here's the basic serialisation syntax for the same
thing:

  <?xml version="1.0"?>

  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
           xmlns:fsml="http://flightsim.com/ns/fsml#">

  <rdf:Description>
   <rdf:type resource="http://flightsim.com/ns/fsml#Building"/>
   <fsml:name>Lennox Generating Station</fsml:name>
   <fsml:latitude>N44*8'33"</fsml:latitude>
   <fsml:longitude>W76*51'9"</fsml:longitude>
   <fsml:length>300m</fsml:length>
   <fsml:width>150m</fsml:width>
  </rdf:Description>

  </rdf:RDF>

Actually, this is *not* exactly identical because the RDF spec has a
bizarre requirement that when you use rdf:Description explicitly the
processor has to build about a zillion extra tuples for reification
purposes, whether you want them or not.  Go figure.
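
As a rough sketch of how the first (typed-node) form maps onto the second,
assuming Python's xml.etree.ElementTree: the element name supplies the
rdf:type and each child element supplies one property. This is not a
conforming RDF processor; it ignores reification, resource-valued
properties and every other abbreviation, and the function names are
illustrative only. The sample is a shortened copy of the FSML example.

```python
import xml.etree.ElementTree as ET

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

TYPED_NODE = """<fsml:Building xmlns:fsml="http://flightsim.com/ns/fsml#">
 <fsml:name>Lennox Generating Station</fsml:name>
 <fsml:length>300m</fsml:length>
 <fsml:width>150m</fsml:width>
</fsml:Building>"""


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into namespace + local."""
    namespace, local = tag[1:].split("}")
    return namespace + local


def typed_node_statements(xml_text):
    """List (predicate, object) pairs for one typed node.

    The subject is the anonymous resource described by the node.
    """
    node = ET.fromstring(xml_text)
    # The element name itself becomes the rdf:type of the resource.
    statements = [(RDF_TYPE, expand(node.tag))]
    for child in node:
        statements.append((expand(child.tag), child.text))
    return statements


for predicate, value in typed_node_statements(TYPED_NODE):
    print(predicate, "=", value)
```

Note that the RDF namespace never appears in the input document; it only
turns up in the statements derived from it.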


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From donpark at quake.net  Wed Mar 17 23:07:04 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:07 2004
Subject: XCatalog
Message-ID: <008b01be70ca$c2767380$2ee044c6@arcot-main>

It would be interesting to find out which of the XML parsers cannot handle
this sort of recursion.

Confession anyone?  Specifically, can your parser implementation allow
multiple instances of the parser to run concurrently?  For example, parse
another XML file (XCatalog) within EntityResolver.

Don

>John Cowan writes:
>
> > The main reason is that it's unreasonable, IMHO, to ask a
> > parser to recursively parse XML-XCatalog format while it's
> > parsing some other document.  Socat format on the other hand
> > is an easy hack.
>
>That same issue might be what keeps DTD syntax alive for simple uses
>-- sure, DTDs aren't all that powerful compared to most of the new
>proposals out there, but it's compact and easy to parse and it doesn't
>require you to process yet another XML document (it's also supported
>by lots of existing software and is backed up by approved ISO and W3C
>specs).




From derekdb at microsoft.com  Wed Mar 17 23:17:42 1999
From: derekdb at microsoft.com (Derek Denny-Brown)
Date: Mon Jun  7 17:10:07 2004
Subject: Microsoft XML 2.0?
Message-ID: <8B57882C41A0D1118F7100805F9F68B506F1BE8D@RED-MSG-45>

Microsoft shipped a DLL with IE4 (msxml.dll) that implemented a library
called "Microsoft XML 1.0"; that was the helpstring in the IDL that defined
the library's interface.  The version of the library interface, also defined
in the IDL, was 1.0 as well.  IE5 includes a newer version of the DLL, which
updates that version number to 2.0.  The product mentioned below is referring
to that version number.  The name conflict/confusion is unfortunate.
Version 2.0 of msxml.dll implements version 1.0 of the W3C's XML
Specification.

-----Original Message-----
From: John E. Simpson [mailto:simpson@polaris.net]
Sent: Wednesday, March 17, 1999 8:57 AM
To: XML-Dev Mailing list
Subject: Re: Microsoft XML 2.0?


At 11:00 AM 3/17/99 -0500, Simon St.Laurent wrote:
>Anyone know anything about this? Is it an SDK 2.0?  (What was 1.0?)  Kind
>of strange, if you ask me.  I hope it's not Microsoft announcing XML 2.0,
>which the outraged reviewer seems to think.
>
>-----------------------------------------
>Microsoft XML 2.0 Programmer's Guide and Software Development Kit With CDROM
>- by Microsoft Corporation 
>
>http://www.amazon.com/exec/obidos/ASIN/0735606390

Fwiw, the Microsoft Press website (http://mspress.microsoft.com) apparently
has no mention of the book. Guess this will have to be one more piece of
the conspiracy puzzle!

=============================================================
John E. Simpson          | It's no disgrace t'be poor, 
simpson@polaris.net      | but it might as well be.
                         |            -- "Kin" Hubbard



From richard at cogsci.ed.ac.uk  Wed Mar 17 23:34:05 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:10:07 2004
Subject: Why is this JC test not-wf?
In-Reply-To: James Tauber's message of Thu, 18 Mar 1999 04:35:24 +0800
Message-ID: <15490.199903172333@brodie.cogsci.ed.ac.uk>

> I wanted to do this on XMLSOFTWARE where I wanted to have an attribute
> Updated="&date;" where date was an external entity changed daily by a cron
> job. Instead I just made it an element rather than an attribute.

If you *really wanted* to do it with an attribute, you could use an
external parameter entity in the declaration of the internal entity:

 <!ENTITY % date SYSTEM "date">
 <!ENTITY date "%date;">

This would have to be in the external subset.

-- Richard




From roddey at us.ibm.com  Wed Mar 17 23:55:01 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:07 2004
Subject: Why is this JC test not-wf?
Message-ID: <87256737.00833617.00@d53mta03h.boulder.ibm.com>


Yup, I'm an idiot. I was so obsessed with the other reasons why it might
not be valid that I missed the obvious one. It's too bad he doesn't have a
listing of what he thinks is wrong with each file and why. That would save
some trouble for the mentally/chronologically challenged such as myself :-)

Thanks.


"James Tauber" <jtauber@jtauber.com> on 03/17/99 12:35:24 PM

To:   Dean Roddey/Cupertino/IBM, xml-dev@ic.ac.uk
cc:
Subject:  Re: Why is this JC test not-wf?


>So it defines an external entity 'e', which resolves to a file named 'nul',
>which is an empty file. Then it references that entity as the value of the
>attribute 'a'. To me, that seems perfectly fine.

Attribute values cannot contain references to external entities. See the
WFC in 3.1.

I wanted to do this on XMLSOFTWARE where I wanted to have an attribute
Updated="&date;" where date was an external entity changed daily by a cron
job. Instead I just made it an element rather than an attribute.

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8 : http://www8.org/

Maintainer of : www.xmlinfo.com,  www.xmlsoftware.com and www.schema.net




From ricko at allette.com.au  Thu Mar 18 00:11:10 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:07 2004
Subject: Need XML documents located on web servers for testing
Message-ID: <002001be70d4$1fca1e70$3ff96d8c@NT.JELLIFFE.COM.AU>


From: Didier PH Martin <martind@netfolder.com>

>We are doing tests on the SGML/XML kit version 2. The main difference with
>version 1 is that now it can access and render documents from the Web (HTTP
>protocol). We need XML documents with associated XSL, CSS or DSSSL style
>sheets to do some tests. The documents have to be located on an HTTP server
>and have an associated style sheet. If you have some, tell us your link so
>that we can do some tests.

We have a collection of XML test files and sample documents using CSS:
see the pages
    http://xml.ascc.net/xml/en/utf-8/resource-index.html
    http://xml.ascc.net/xml/test/index.html
The test files are delivered in multiple character sets, and in multiple
MIME content-types.

Also a simple version of the (English language) FAQ in QAML DTD:
    http://xml.ascc.net/xml/en/utf-8/index.html

Rick Jelliffe




From ricko at allette.com.au  Thu Mar 18 00:33:12 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:07 2004
Subject: XCatalog
Message-ID: <003d01be70d7$3240ce20$3ff96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>

>That same issue might be what keeps DTD syntax alive for simple uses
>-- sure, DTDs aren't all that powerful compared to most of the new
>proposals out there, but it's compact and easy to parse and it doesn't
>require you to process yet another XML document (it's also supported
>by lots of existing software and is backed up by approved ISO and W3C
>specs).

Furthermore, I think there is a human-factors element involved: when there
is a change in domain, (at least some) people expect or need this to be
flagged by using a different syntax. When the data is highly cohesive,
it is natural to couple it syntactically to distinguish it from the
markup in which it is embedded.

I think it is a really important design principle, and too easily
dismissed. It helps explain
* why are URLs not factored out into attributes? &
* why are scripting languages not in instance notation? &
* why are the patterns in XSL not in instance notation? &
* why don't people like LISP syntax (i.e., does the unified syntax
actually cause reading panic in newcomers: it has been widely commented
that computer languages with different syntaxes each for assignment,
declarations, infix maths, and prefix functions, such as C and ALGOL
family languages have been much more successful)?

Apart from that, there is an issue of the perceived cost-benefit of
metadata (i.e. schemas) compared to data. If I have a 1K Docbook
instance and a 600K Docbook Schema in instance syntax, it is a strong
dissuader against using that schema. So I don't think that terseness is
of minimal importance for schemas: appropriate balance to the size of
the instance is, however.

There are also clearly some contradictory factors at work too: some
people feel saturated with too many little languages and love a unified
syntax. You can see their personality, in this regard, reflected in the
DTDs they write.

Rick Jelliffe




From donpark at quake.net  Thu Mar 18 00:37:08 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:07 2004
Subject: Why is this JC test not-wf?
Message-ID: <002401be70d7$5562a400$2ee044c6@arcot-main>

>> I wanted to do this on XMLSOFTWARE where I wanted to have an attribute
>> Updated="&date;" where date was an external entity changed daily by a cron
>> job. Instead I just made it an element rather than an attribute.
>
>If you *really wanted* to do it with an attribute, you could use an
>external parameter entity in the definition of the internal entity
>definition:
>
> <!ENTITY % date SYSTEM "date">
> <!ENTITY date "%date;">
>
>This would have to be in the external subset.

Another way to do this is to reverse the roles of the files: store the
data as an external general parsed entity and define the referenced
entities in the document that includes it.

dated.xml:

    <?xml version="1.0"?>
    <!DOCTYPE dated [
        <!ENTITY datable SYSTEM "datable.xml">
        <!ENTITY date "17/03/99">
    ]>
    <dated>
    &datable;
    </dated>

datable.xml:

    <?xml version="1.0" encoding="UTF-8"?>
    <datable timestamp="&date;">
        blah blah
    </datable>


This kind of role reversal can be quite powerful when used properly.
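
A minimal sketch of the two files being processed, assuming Python's
bundled expat bindings and holding both files in memory rather than on
disk. The function name parse_dated is illustrative only. One assumption
to flag: the text declaration at the top of datable.xml carries an
encoding declaration here, because XML 1.0 requires one in the text
declaration of an external parsed entity and expat checks for it.

```python
from xml.parsers import expat

# dated.xml: declares both entities and pulls in the data file.
DATED = b"""<?xml version="1.0"?>
<!DOCTYPE dated [
    <!ENTITY datable SYSTEM "datable.xml">
    <!ENTITY date "17/03/99">
]>
<dated>
&datable;
</dated>"""

# datable.xml: the data, which refers to an entity it does not declare.
DATABLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<datable timestamp="&date;">
    blah blah
</datable>"""

FILES = {"datable.xml": DATABLE}


def parse_dated():
    """Parse dated.xml and return (element name, attributes) for each start tag."""
    seen = []
    parser = expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: seen.append((name, attrs))

    def load_entity(context, base, system_id, public_id):
        # The sub-parser shares the entity declarations of its parent,
        # so &date; resolves inside the external entity.
        sub = parser.ExternalEntityParserCreate(context)
        sub.Parse(FILES[system_id], True)
        return 1

    parser.ExternalEntityRefHandler = load_entity
    parser.Parse(DATED, True)
    return seen


for name, attrs in parse_dated():
    print(name, attrs)
```

The timestamp attribute on datable should come out as 17/03/99 even
though datable.xml itself never declares the date entity.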

Best,

Don Park
Docuverse




From jes at kuantech.com  Thu Mar 18 00:48:16 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:10:07 2004
Subject: XCatalog
In-Reply-To: <003d01be70d7$3240ce20$3ff96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <000b01be70d8$beace000$5118a8c0@kuantech1.quokka.com>

Your points about syntactic unification vs. "diversity" (snipped) are excellent. A key component, however, to consider, is the extent to which the language in question will be read and written by humans. Machines like unification, while (many, at least) humans like diversity. In my own use of XML, and relating it to an HTML-trained community, I am coming to question the extent to which XML is really suited to human consumption.

Jeff Sussna




From cbullard at hiwaay.net  Wed Mar 17 02:38:06 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:07 2004
Subject: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net> <36EF13A7.C9C8A196@finetuning.com>
Message-ID: <36EF14EA.E82@hiwaay.net>

Lisa Rein wrote:
> 
> yes the cloak and dagger stuff is amusing, isn't it. yet necessary.

Not really.  It may up the excitement to be in a secret society, but 
I don't think it does a lot to help the spec.

> > I guess knowing who all these folks are would reduce the odds of such
> > uncomfortable situations cropping up.

You mean like blame and credit?  ;-)

Wow! I can just see all of those journalists, wannabes and 
industry spies camping outside David Megginson's door.  
Almost like being a rock star without the sex.

len



From david at megginson.com  Thu Mar 18 01:18:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:08 2004
Subject: RDF considered optional
In-Reply-To: <000901be70c7$ead3bb60$5118a8c0@kuantech1.quokka.com>
References: <36F02B36.F5E46355@locke.ccil.org>
	<000901be70c7$ead3bb60$5118a8c0@kuantech1.quokka.com>
Message-ID: <14064.21512.146188.888832@localhost.localdomain>

Jeffrey E. Sussna writes:

[on the omissibility of rdf:RDF]

 > I took it the same way. But doesn't that violate the principle of
 > XML as being self-describing?

First, there is no such published principle for XML itself, though the
Namespaces spec provides an infrastructure for such a thing.

Second, it in no way violates it, because if you recognise the
namespaces/elements being used, you can still figure out that you're
dealing with RDF (and if not, fat lot of good they'll do you anyway).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 18 01:25:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:08 2004
Subject: XCatalog
In-Reply-To: <008b01be70ca$c2767380$2ee044c6@arcot-main>
References: <008b01be70ca$c2767380$2ee044c6@arcot-main>
Message-ID: <14064.22019.894218.252220@localhost.localdomain>

Don Park writes:

 > It would be interesting to find out which of the XML parsers cannot handle
 > this sort of recursion.

I think it's more a question of efficiency than capability.  I know of 
no XML parser that is not reentrant, but I have tried only five or six 
of them.
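
As a rough illustration of what re-entrancy means here (not a statement
about any of the parsers mentioned in this thread), the sketch below
uses Python's bundled expat bindings. While the outer parser is stopped
inside its external-entity callback, a second, independent parser
instance reads a catalog. The catalog format, the file names and the
function names are all invented for the example; this is neither
XCatalog nor Socat syntax.

```python
from xml.parsers import expat

# Hypothetical catalog mapping system identifiers to local resources.
CATALOG = b"""<catalog>
  <map system="chapter.ent" uri="local/chapter.xml"/>
</catalog>"""

FILES = {"local/chapter.xml": b"<para>resolved through the catalog</para>"}

DOCUMENT = b"""<!DOCTYPE book [
<!ENTITY chapter SYSTEM "chapter.ent">
]>
<book>&chapter;</book>"""


def lookup(system_id):
    """Parse the catalog with a fresh parser and return the mapped URI."""
    found = []

    def start(name, attrs):
        if name == "map" and attrs.get("system") == system_id:
            found.append(attrs["uri"])

    catalog_parser = expat.ParserCreate()
    catalog_parser.StartElementHandler = start
    catalog_parser.Parse(CATALOG, True)
    return found[0]


def parse_document():
    """Parse DOCUMENT, resolving its entity via the catalog; return element names."""
    elements = []
    outer = expat.ParserCreate()
    outer.StartElementHandler = lambda name, attrs: elements.append(name)

    def resolve(context, base, system_id, public_id):
        # The outer parse is paused at this point; a second parser runs.
        uri = lookup(system_id)
        inner = outer.ExternalEntityParserCreate(context)
        inner.Parse(FILES[uri], True)
        return 1

    outer.ExternalEntityRefHandler = resolve
    outer.Parse(DOCUMENT, True)
    return elements


print(parse_document())
```

The cost is visible in the sketch: the catalog is parsed again for every
entity reference unless the caller caches the result, which is the
efficiency concern rather than a capability one.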


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From richard at cogsci.ed.ac.uk  Thu Mar 18 01:30:54 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:10:08 2004
Subject: Why is this JC test not-wf?
In-Reply-To: roddey@us.ibm.com's message of Wed, 17 Mar 1999 16:53:07 -0700
Message-ID: <18386.199903180130@brodie.cogsci.ed.ac.uk>

> It's too bad he doesn't have a
> listing of what he thinks is wrong with each file and why.

How true.

So I have put up a web page showing the error messages rxp produces for
each not-well-formed test.  It's at

 http://www.cogsci.ed.ac.uk/~richard/jjc-test-errors.html

I apologise for the poor quality of the error messages in cases where
there are illegal characters in the document; this will be improved
some day.

-- Richard



From cbullard at hiwaay.net  Thu Mar 18 02:06:20 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:08 2004
Subject: who's on the XML WG?
References: <199903162113.QAA19884@hesketh.net>
		<36EF13A7.C9C8A196@finetuning.com>
		<36EF14EA.E82@hiwaay.net> <14063.38650.573008.410552@localhost.localdomain>
Message-ID: <36F05EF2.38A1@hiwaay.net>

David Megginson wrote:
> 
> len bullard writes:
> 
>  > Wow! I can just see all of those journalists, wannabes and industry
>  > spies camping outside David Megginson's door.  Almost like being a
>  > rock star without the sex.
> 
> Without the drugs, too (I can barely handle children's chewable
> multivitamins).

Sex kills and drugs get you busted; might as well pick up the axe and
rock.
 
> Actually, Infoset isn't controversial enough that anyone cares to do
> more than ask for the occasional courtesy quote ("Oh yes, and what
> does Infoset do again?"), 

weird.  it is the one wg i am hoping does a righteous job.  i can 
ignore many of the others because hey, object.navigate URL usually 
works in my world, but when you have to ask a table to tell you what a 
node is named so you can get the name of the named node, then darn, a  
standard that names the names is really useful.

> but some chairs of more visible WGs do get
> hounded unbelievably by both the press and the big industry players.

To quote my favorite rock star, "the hounds of love are hunting. I've 
always been a coward."

> RFC: New mailing list, XML-Group-Therapy.

this list works fine.  all the groupies are here.

len



From cbullard at hiwaay.net  Thu Mar 18 02:07:59 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:08 2004
Subject: How to ??
References: <85256737.00129928.00@D51MTA03.pok.ibm.com> <199903171524.JAA01148@bruno.techno.com> <36EFE58A.E37657A3@locke.ccil.org>
Message-ID: <36F05F53.76FD@hiwaay.net>

John Cowan wrote:
> 
> Steven R. Newcomb wrote:

> > Indeed, one can tell exactly what was
> > explicitly in the document, and what was supplied at parse time.  I
> > hope and believe that the "XML Infoset" committee will come up with an
> > information set that will be equally revealing in this respect.
> 
> You may well hope.  Are you by chance an invited expert?

He is THE expert.

len (original Doctor Steve "groupie")



From ricko at allette.com.au  Thu Mar 18 03:24:12 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:08 2004
Subject: who's on the XML WG?
Message-ID: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>


Lisa Rein wrote:
>>
>> yes the cloak and dagger stuff is amusing, isn't it. yet necessary.

I think there should be more secrecy, not less. Look at our impatience
at waiting 2 or more years for links, for example: the WGs are cruel
vixen toying with us.

Of course, pre-announcement helps keep up the buzz I suppose.  The RDF
spec is so good I think it should have been kept secret indefinitely.


Rick




From srn at techno.com  Thu Mar 18 05:31:11 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:08 2004
Subject: How to ??
In-Reply-To: <36EFE58A.E37657A3@locke.ccil.org> (message from John Cowan on
	Wed, 17 Mar 1999 12:25:30 -0500)
References: <85256737.00129928.00@D51MTA03.pok.ibm.com> <199903171524.JAA01148@bruno.techno.com> <36EFE58A.E37657A3@locke.ccil.org>
Message-ID: <199903180504.XAA00766@bruno.techno.com>

[John Cowan:]

> Steven R. Newcomb wrote:
> 
> > As for values of default attributes, the SGML Property Set allows the
> > API not only to report the defaulted value, but also the
> > fact that it was defaulted.
> 
> Where can I lay my hands on this property set in intelligible
> form?

At its currently-most-intelligible, it can be found at:
http://www.hytime.org/materials/sgmlpropset/index.html

It's still not as intelligible as one might like, unfortunately, but
it's a hell of a lot more human-accessible than the formal code from
which it was prepared, which can be found at
http://www.hytime.org/materials/hi2pssgm.sgm.

> > Indeed, one can tell exactly what was
> > explicitly in the document, and what was supplied at parse time.  I
> > hope and believe that the "XML Infoset" committee will come up with an
> > information set that will be equally revealing in this respect.

> You may well hope.  Are you by chance an invited expert?

I am merely a well-wisher, but that's at least as much because I'm
focused on other stuff (Topic Maps, for example) as because I wasn't
invited.  I have huge respect for David Megginson, so I'm very
optimistic about the outcome.  I gather that there is an open comment
period with respect to an Infoset draft (or perhaps a requirements
draft) that is either in progress or is about to occur, and I'm
frankly frustrated that I can't take the time to pay it the attention
that it so richly deserves.  I can't imagine anything more vital and
fundamental to the longterm future of every aspect of XML than the
work of this particular committee.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From srn at techno.com  Thu Mar 18 05:31:28 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:08 2004
Subject: How to ??
In-Reply-To: <36EFE612.871C5407@sqwest.bc.ca> (message from Lauren Wood on
	Wed, 17 Mar 1999 09:27:46 -0800)
References: <85256737.00129928.00@D51MTA03.pok.ibm.com> <199903171524.JAA01148@bruno.techno.com> <36EFE612.871C5407@sqwest.bc.ca>
Message-ID: <199903180504.XAA00773@bruno.techno.com>

[Lauren Wood:]
> > As for values of default attributes, the SGML Property Set allows the
> > API not only to report the defaulted value, but also the
> > fact that it was defaulted.  
> 
> As does the DOM. BTW, one of the major inputs to the DOM was the
> SGML property set; we took out parts that were not in XML and came
> up with an API to the rest, modified by existing practice in various
> APIs. Though I'm not guaranteeing that what we have left is a pure
> subset of the SGML property set.

I find myself warming up to the DOM.  Thanks, Lauren -- I need to
learn more details about this.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From bckman at ix.netcom.com  Thu Mar 18 06:04:19 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:08 2004
Subject: Transformation tool for windows
Message-ID: <000c01be7105$049fa8a0$a8acdccf@ix.netcom.com>

At the suggestion of several people I am making generally available a simple
tool that carries out batch transformations of XML files under Windows 95,
98, or NT. Although stable, it is very much alpha-ware and is still a 'work
in progress'. I would be glad of any feedback from members of this list.

It was written for an undergraduate class and requires no more skill to
run than basic Windows skills, but in spite of that it is quite powerful
and can easily handle documents up to 2M in size. (I haven't tested it on
anything larger.)

This tool is excerpted from a larger editing tool which uses the MSXML
parser. However, as the latter is in flux and the MSXML DLL has not been
released or licensed for general use, I have split the transformation tool
off from the editing and DOM tool.

'TransformXML' allows the following processes to be automated.

	1. Creating a list of xml files for processing.
	2. Running a list of commands on each file.
	3. Transforming one xml nametag to another.
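
[The third step can be sketched as follows. This is only an illustration
in Python using the standard `xml.etree.ElementTree` module, not how
TransformXML (a VB5 program) is implemented; the function name
`rename_tags` and the tag mapping are assumptions made up for the example.]

```python
import xml.etree.ElementTree as ET

def rename_tags(xml_text, mapping):
    """Return xml_text with element names replaced according to mapping."""
    root = ET.fromstring(xml_text)
    for elem in root.iter():
        # Elements that are not in the mapping keep their original name.
        elem.tag = mapping.get(elem.tag, elem.tag)
    return ET.tostring(root, encoding="unicode")

# Illustrative input and mapping only.
play = "<PLAY><TITLE>Hamlet</TITLE><SPEECH><LINE>To be</LINE></SPEECH></PLAY>"
html = rename_tags(play, {"PLAY": "div", "TITLE": "h1",
                          "SPEECH": "div", "LINE": "p"})
```

[Running the same function over a list of file names would cover steps 1
and 2 in the simplest possible way.]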

It has not yet been optimized for speed. For example, on a middle-of-the-road
platform it takes about 1 minute to convert an XML file marked up by Jon
Bosak into HTML. It took 20 minutes to transform the complete works of
Shakespeare from XML to XHTML.

Please go to www.hypermedic.com/style to download the zip file (20K). Look
under transform XML.

It uses the VB5 DLLs, which are also available if needed.


Frank Boumphrey

XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor:  XML applications from Wrox Press, www.wrox.com
Author: Using XML on the Web (March)




From digitome at iol.ie  Thu Mar 18 08:25:45 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:10:08 2004
Subject: XCatalog
Message-ID: <3.0.6.32.19990318081540.009caec0@gpo.iol.ie>

>Don Park wrote:
>
>> You wrote a few days ago that you were abandoning XCatalog.
>
[John Cowan]
>More precisely:  I no longer think the XML syntax described in
>the XCatalog paper makes much sense.  The Socat subset is
>more interoperable and more sensible.
>
>The main reason is that it's unreasonable, IMHO, to ask a
>parser to recursively parse XML-XCatalog format while it's
>parsing some other document.  Socat format on the other hand
>is an easy hack.
>

An alternative might be to retain the XML syntax but
not use an XML parser to parse the syntax.
What I mean is that you could add application specific
rules about how the XML is laid out to make it
easily parsed with regexp yet retain the ability
to do a full parse for syntax/structure checking.
You would also retain the ability to manipulate
these catalogs with XML tools which is a good
thing:-) In particular your catalogs get
to live with your "documents" in your
document database and can be stored, chopped,
re-used and pretty printed using XML tools.
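
[A minimal sketch of this idea in Python. It assumes a hypothetical
layout rule of exactly one `<Map/>` element per line, with attributes in
a fixed order and no entity references in the values. The element and
attribute names are invented for illustration and are not taken from the
XCatalog paper.]

```python
import re
import xml.dom.minidom

# Hypothetical catalog obeying the layout rule: one <Map/> per line.
CATALOG = '''<?xml version="1.0"?>
<Catalog>
<Map PublicId="-//EXAMPLE//DTD Memo//EN" HRef="memo.dtd"/>
<Map PublicId="-//EXAMPLE//DTD Report//EN" HRef="report.dtd"/>
</Catalog>
'''

MAP_LINE = re.compile(r'^<Map PublicId="([^"]*)" HRef="([^"]*)"/>$')

def load_with_regexp(text):
    """Fast path: rely on the layout rule; no XML parser is involved."""
    maps = {}
    for line in text.splitlines():
        m = MAP_LINE.match(line.strip())
        if m:
            maps[m.group(1)] = m.group(2)
    return maps

def load_with_parser(text):
    """Slow path: a full parse, usable for syntax/structure checking."""
    doc = xml.dom.minidom.parseString(text)
    return {e.getAttribute("PublicId"): e.getAttribute("HRef")
            for e in doc.getElementsByTagName("Map")}

fast = load_with_regexp(CATALOG)
full = load_with_parser(CATALOG)
```

[Both paths yield the same mapping as long as the catalog obeys the
layout rule. A catalog that is well-formed XML but breaks the rule would
only be read correctly by the full parse.]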

Saying that "the XML parser uses special parsing
logic for catalogs (oh, and by the way, they
are also XML files)" sounds better to me than
"the XML parser uses special parsing logic
for Socat catalogs (oh, and by the way, the
Socat parser is a one-off creation.")

<experience where="In the trenches">
Data structures have a tendency to grow organically
over time. This is especially true of successful ones:-)
They start off with a small number of
syntactic constructs which are trivially parsed
but the parsing gets more difficult as new stuff
is added. Then comes the point where you say
"Gee, I wish I had stuck with XML..."
</experience>

<PythonSpecificHack>
An alternative to the "perfectly-good-XML-with-
DTD-but-parsed-with-regexp-to-avoid-chicken-and-egg"
would be a syntax like this:-

Maps = {
"foo" : "bar",
....
}

This appeals to the Python bigot in me because
it is easily parsed in any language you like,
but parsed and loaded into an in memory data
structure in one line of Python:-

	import Catalog

A similar line in bigotry can be followed
from other interpreted languages of course:-)
</PythonSpecificHack>
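
[A runnable sketch of the Python variant. The temporary directory and the
file-writing step exist only to make the example self-contained; in
practice `Catalog.py` would already be on the module search path. Note
that the module is imported by name, without the `.py` suffix, and that
importing executes the file, so this only suits catalogs you trust.]

```python
import importlib
import os
import sys
import tempfile

# Write a hypothetical Catalog.py containing the mapping shown above.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "Catalog.py"), "w") as f:
    f.write('Maps = {\n"foo" : "bar",\n}\n')

# Make the directory importable, then load the catalog as a module.
sys.path.insert(0, module_dir)
importlib.invalidate_caches()
Catalog = importlib.import_module("Catalog")  # same effect as: import Catalog

target = Catalog.Maps["foo"]
```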

<Sean uri="http://www.digitome.com/sean.htm"/>




From larsga at ifi.uio.no  Thu Mar 18 09:06:51 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:08 2004
Subject: XCatalog
In-Reply-To: <003d01be70d7$3240ce20$3ff96d8c@NT.JELLIFFE.COM.AU>
References: <003d01be70d7$3240ce20$3ff96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <wkoglrkvyy.fsf@ifi.uio.no>


* Rick Jelliffe
| 
| * why are scripting languages not in instance notation? &

Because they would no longer be scripting languages, but instead
unreadable, awkward horrors no one would want to program in. :)

| * why are the patterns in XSL not in instance notation? &

Probably for the same reason: to avoid excessive verbosity and in the
interests of readability.

| * why don't people like LISP syntax 

Probably mainly because they're not exposed to it long enough to
discover that it's actually both readable, beautiful and
extraordinarily flexible. The macro system in Common Lisp is decades
ahead of anything else I've ever seen or heard about. (The same could
be said of the object system, although it's possible that Dylan (a CL
descendant with different syntax) has the same features.)

| (i.e., does the unified syntax actually cause reading panic in
| newcomers: 

With some people, yes.

| it has been widely commented that computer languages with different
| syntaxes each for assignment, declarations, infix maths, and prefix
| functions, such as C and ALGOL family languages have been much more
| successful)?

There are probably other reasons for that, such as that Unix was
written in C, so that all other languages were at a disadvantage.
Also, the chaotic jungle of Lisp dialects in the pre-CL (ie: 1985) era
probably didn't help either. Nor did the (now long incorrect) rumours
of Lisp as a functional, non-OO, untyped, dynamically scoped and slow
language.

If programming languages were competing on the basis of usability and
quality the world would look rather different today. 

--Lars M.




From larsga at ifi.uio.no  Thu Mar 18 09:17:16 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:08 2004
Subject: XCatalog
In-Reply-To: <36F01F02.2EEECD05@locke.ccil.org>
References: <00d101be70bb$7f2a81c0$2ee044c6@arcot-main> <36F01F02.2EEECD05@locke.ccil.org>
Message-ID: <wkn21bkvih.fsf@ifi.uio.no>


* John Cowan
| 
| The main reason is that it's unreasonable, IMHO, to ask a parser to
| recursively parse XML-XCatalog format while it's parsing some other
| document.  

I suppose there are two issues here:
 
  - reentrant parsers and
  - bottomless recursions in that parsing requires a catalog, the
  parsing of which requires another catalog...

I can't imagine that the first issue would present serious problems.
XCatalogs are no more than external entities that are not part of the
document, and it's also possible to first parse the XCatalog and only
afterwards parse the document.

The recursion issue is potentially more troublesome.  In my case I
solve it by not registering the callback resolvers that actually use
catalog information when I parse XCatalogs.  So the problem should be
easy to avoid, and users should not be surprised when they find
catalog features unavailable in XCatalog instances.

So on reflection I think XCatalogs are far less troublesome than for
example parameter entities.
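
[The recursion guard described above can be sketched with Python's
standard `xml.sax` module. This is not the code of any particular
catalog implementation: the class and function names (`CatalogLoader`,
`CatalogResolver`, `load_catalog`, `make_document_parser`) and the
`<Map/>` element are assumptions invented for the example.]

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, EntityResolver

class CatalogLoader(ContentHandler):
    """Collects PublicId -> HRef pairs from hypothetical <Map/> elements."""
    def __init__(self):
        super().__init__()
        self.maps = {}

    def startElement(self, name, attrs):
        if name == "Map":
            self.maps[attrs["PublicId"]] = attrs["HRef"]

class CatalogResolver(EntityResolver):
    """Redirects known public identifiers; passes everything else through."""
    def __init__(self, maps):
        self.maps = maps

    def resolveEntity(self, publicId, systemId):
        return self.maps.get(publicId, systemId)

def load_catalog(xml_text):
    # Deliberately no setEntityResolver() call here: the catalog is parsed
    # with the default resolver, so parsing it never needs another catalog.
    loader = CatalogLoader()
    parser = xml.sax.make_parser()
    parser.setContentHandler(loader)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return loader.maps

def make_document_parser(maps, handler):
    # Only parsers used for ordinary documents get the catalog resolver.
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    parser.setEntityResolver(CatalogResolver(maps))
    return parser

catalog = ('<Catalog><Map PublicId="-//EXAMPLE//DTD Memo//EN" '
           'HRef="memo.dtd"/></Catalog>')
maps = load_catalog(catalog)
doc_parser = make_document_parser(maps, ContentHandler())
```

[The catalog is also parsed to completion before any document parser is
created, which is the "parse the XCatalog first" ordering mentioned
above.]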

| Socat format on the other hand is an easy hack.

It's not hard to implement, no, but the fact remains that more parsers
support XCatalogs than SO Catalogs.  Worse really _is_ better, I
think, also in this case.
 
--Lars M.




From lucio.piccoli at one2one.co.uk  Thu Mar 18 09:50:30 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:10:08 2004
Subject: IDL2XML converter
Message-ID: <3601e01c.180299@smtpgate1.ONE2ONE.CO.UK>


Hi all,
I am curious about the progress of an IDL to XML converter. I have found
what seems to be a project going on in Germany, but all the docs are in
German and I can't read German. If anyone knows of such projects I am
very keen to discuss a variety of issues. In fact, if anyone has any
thoughts about an IDL2XML converter I would like to hear from them.

I suspect that the main obstacle is in the IDL data type mappings. Since
there is no W3C-endorsed schema, what do you use? I guess the DCD proposal
that Microsoft and IBM jointly submitted to the W3C will be a starting
point.
Any comments are welcome.

Thanks

adios

 -lucio

 ---------------------------------------------------------------------
 One2One              LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower      tel : +44 181 214 3847
 Elstree Way
 Borehamwood                 fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From cbullard at hiwaay.net  Thu Mar 18 10:08:34 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:08 2004
Subject: who's on the XML WG?
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <36F0CFF4.365B@hiwaay.net>

Rick Jelliffe wrote:
> 
> Lisa Rein wrote:
> >>
> >> yes the cloak and dagger stuff is amusing, isn't it. yet necessary.
> 
> I think there should be more secrecy, not less. Look at our impatience
> at waiting 2 or more years for links, for example: the WGs are cruel
> vixen toying with us.

If we are looking to the animal kingdom for examples, the W3C is  
evolving like armadillos:  timid animals that slowly go their own 
way.  When faced with predators, they roll up into a ball and depend 
on their hard shell to protect them.  It works until the predator 
is a speeding truck.

On a different thread and because it is 3:57AM here, people are 
speculating about why XML isn't taking off:

1.  For a technology to emerge quickly, there has to be a need 
for it, perceived or otherwise, not already being done well 
enough by other technologies.  In the case of text management, 
relational systems augmented by technologies such as DHTML, 
IIS, and ADO are solving most of the near term problems.  IOW, 
for the immediate needs which occupy the implementors' attention, 
they have working and relatively new solutions.

2.  Industries that could take advantage of XML technologies 
and other web technologies are wrestling with the economic 
models that drive their businesses.  The per-seat cost must 
support the bottom line.  If webTech is perceived as a source 
of lite applications which will erode that product model, they 
will be resisted.  

It isn't the complexity of XML.  Truth is, it isn't that complex. 
It is the fit.  XML technologists and sales people have to 
look to markets where the fit is good but the traditional 
vendors are resisting, then create niches in those markets 
where the force of syntax unification becomes an irresistible 
feature such that adoption becomes necessary to win new 
business and retain upgrade contracts.

len



From larsga at ifi.uio.no  Thu Mar 18 10:51:09 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:08 2004
Subject: Multi-valued attributes
In-Reply-To: <017701be6f3f$f0e4c5a0$0300000a@othniel.cygnus.uwa.edu.au>
References: <017701be6f3f$f0e4c5a0$0300000a@othniel.cygnus.uwa.edu.au>
Message-ID: <wkhfrjkr5j.fsf@ifi.uio.no>


* James Tauber
| 
| xmlsoftware.com is a single XML document with two main sections, a
| list of categories with descriptions and a list of software. Each
| category has an ID. IDREFS could be used for associating products
| with one *or more* categories:
 
My XML tools list in fact does this.  And since it has a separate
section for vendors IDREFS are used to identify vendors as well.  Thus
PyExpat can be listed as the work of both James Clark and Jack Jansen.

The matter of IDREFS support in XSL was never a problem, since I use
Python for my conversion to HTML.  I've built a module that represents
the contents using Vendor, Product and Category objects and uses
hashtables to resolve IDREFs.  

This has the added advantage that from the same in-memory API I can
easily build my search index[1].  The Python marshal module is used to
dump and load the index, and for the moment (current DB size) this
operation is lightning fast.  
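
[A minimal sketch of the hashtable approach, using plain dicts where the
module described above uses Vendor, Product and Category objects. The
document shape and the element and attribute names are invented for the
example; only the technique (resolve each IDREFS token through an
id-keyed table, then marshal the index) follows the description.]

```python
import marshal
import xml.dom.minidom

# Hypothetical document shape; names are illustrative only.
DOC = '''<tools>
<vendor id="jclark" name="James Clark"/>
<vendor id="jjansen" name="Jack Jansen"/>
<category id="parsers" name="Parsers"/>
<product name="PyExpat" vendors="jclark jjansen" categories="parsers"/>
</tools>'''

doc = xml.dom.minidom.parseString(DOC)

def table(tag):
    """Build an id -> name hashtable for every element called `tag`."""
    return {e.getAttribute("id"): e.getAttribute("name")
            for e in doc.getElementsByTagName(tag)}

vendors = table("vendor")
categories = table("category")

products = []
for p in doc.getElementsByTagName("product"):
    products.append({
        "name": p.getAttribute("name"),
        # An IDREFS value is a whitespace-separated list of IDs.
        "vendors": [vendors[i] for i in p.getAttribute("vendors").split()],
        "categories": [categories[i]
                       for i in p.getAttribute("categories").split()],
    })

# A toy search index (lower-cased word -> product names).
index = {}
for prod in products:
    for phrase in [prod["name"]] + prod["vendors"]:
        for token in phrase.lower().split():
            index.setdefault(token, []).append(prod["name"])

# Round-trip the index through marshal, as one would to dump and load it.
restored = marshal.loads(marshal.dumps(index))
```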

| James
| (not an SGML tribal elder)

--Lars M. (not an elder at all :)

[1] <URL:http://birk105.studby.uio.no/cgi-bin/searchform.py>




From larsga at ifi.uio.no  Thu Mar 18 11:54:16 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:08 2004
Subject: IDL2XML converter
In-Reply-To: <3601e01c.180299@smtpgate1.ONE2ONE.CO.UK>
References: <3601e01c.180299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <wkbthrko80.fsf@ifi.uio.no>


* LUCIO PICOLLI
|
| I am curious about the progress of a IDL to XML converter. [...] In
| fact if any one has any thoughts about a IDL2XML converter i would
| like to hear from them.

I'm currently working on an IDL parser in Common Lisp. (I chose this
language because using the META technique it's very easy to write
parsers in it. They're also fast.)
 
For the moment I parse a subset of what CORBA 3.0 IDL looks likely to
become (and also the CIDL introduced by the CORBA Component Model
Spec). I now produce equivalent 2.2-IDL and Java code from this,
and will later produce component packaging XML and possibly also
HTML/XML documentation.

| I suspect that the main obstacle is in the IDL data type mappings.

Hmmm. Now you lost me. What kind of IDL2XML converter did you have in
mind? Did you really mean a CDR2XML converter? Or IIOP2XML-RPC?

--Lars M.




From lucio.piccoli at one2one.co.uk  Thu Mar 18 14:02:06 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:10:09 2004
Subject: IDL2XML converter
Message-ID: <3601e340.180299@smtpgate1.ONE2ONE.CO.UK>


>
> | I suspect that the main obstacle is in the IDL data type mappings.
>
> Hmmm. Now you lost me. What kind of IDL2XML converter did you have in
> mind? Did you really mean a CDR2XML converter? Or IIOP2XML-RPC?

Neither. I was thinking of converting an IDL into a DTD. The DTD will need
to support data types. Any XML doc generated from that DTD could then be
used to pass data between ORBs instead of IIOP. I am sure it is not an
original idea. This XML-RPC sounds interesting though.

This XML-RPC thing looks good, but I don't know what it is, as I have only
found a few lightweight docs on it. I searched the W3C site but I can't
find it. Has it been submitted to the W3C?


 -lucio





From richard at goon.stg.brown.edu  Thu Mar 18 14:26:54 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:10:09 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net>
Message-ID: <36F10CFC.CFEB89A8@goon.stg.brown.edu>

len bullard wrote:

> It isn't the complexity of XML.  Truth is, it isn't that complex.
> It is the fit.

Speaking of complexity, I just updated STG's web-available validator
to cope with namespaces.  I'm not claiming that I got it right on the
first pass.  But the updates should help those of you experimenting
with the Jan 14 spec:

  http://www.stg.brown.edu/service/xmlvalid/

Re namespaces:

After working with them now for a few months, I can't say I'm any more
impressed with namespaces than when I started.  Why?

  --  No matter what anyone says, they screw up validation.  --

    1) because DTDs aren't namespace-aware, and therefore
      a) don't know the difference between a defaulted element and one
         that simply has no namespace
      b) have no scoping mechanism to at least allow you to kludge
         namespace defaulting by restricting elements to one or another
         part of the syntax tree

    2) because namespaces require you to parse attributes and values
       fully before finishing element name processing; this is bad
       because it
      a) makes one-pass parsing more difficult, and requires retention
         of much more information during the parse
      b) makes for unexpected interactions with the DTD (which may
         provide default attributes for a given element, including
         xmlns="" - which puts the element into a namespace)

    3) because inherited attributes are inimical to the whole DTD
       concept
      a) it was bad enough that we had to put up with xml:lang and
         such (which processing software must pass down the parse
         tree), now the XML standard itself has inherited attributes
         built in with namespaces
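
[Point 2 can be illustrated with a small sketch. The function
`resolve_start_tag` and its arguments are hypothetical and belong to no
real parser; it assumes the attributes of a start tag have already been
collected into a dict, and that `scope` holds the prefix bindings
inherited from the parent element.]

```python
def resolve_start_tag(qname, attrs, scope):
    """Return ((uri, local), new_scope) for one start tag.

    `scope` maps prefixes ('' for the default namespace) to URIs.
    """
    new_scope = dict(scope)
    # Step 1: every attribute has to be scanned first, because any of
    # them may be a declaration that changes how `qname` is interpreted.
    for name, value in attrs.items():
        if name == "xmlns":
            new_scope[""] = value
        elif name.startswith("xmlns:"):
            new_scope[name[6:]] = value
    # Step 2: only now can the element name itself be resolved.
    prefix, _, local = qname.rpartition(":")
    if prefix and prefix not in new_scope:
        raise ValueError("undeclared prefix: " + prefix)
    uri = new_scope.get(prefix) or None
    return (uri, local), new_scope

outer, scope = resolve_start_tag("doc", {"xmlns": "urn:example:a"}, {})
inner, _ = resolve_start_tag("h:p", {"xmlns:h": "urn:example:b"}, scope)
plain, _ = resolve_start_tag("p", {}, scope)
```

[The `scope` value passed from parent to child is the extra state a
one-pass parser has to carry, and it is also how a declaration on an
ancestor is inherited by every descendant, as in point 3.]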

I have no issues here.  I'm not a W3C member, and we make no significant
use of XML here in my shop.  I'm basically just an interested observer.
And my observation is that namespaces screw up validation.

This is all very bothersome because validation is one of the key points
that separate XML from HTML, and potentially make it better.  With XML,
anyone can define their own HTML, so to speak, or another markup lang they
find useful, and then simply publish a DTD with it.  There's none of the
chaos of HTML, which didn't even get a DTD until it was in wide use, and
that (despite the DTDs it now has) typically doesn't validate.  It's to
the point where the only people who can write effective HTML processing
software are outfits with armies of programmers hired to deal with error
recovery and proprietary extensions (both their own and their competitors').

With XML, we can potentially start out on the right foot, and avoid this
nonsense by using validation from the start.  Well-formedness is nice,
but it's not clearly enough defined (and anyway, many non-validating
processors find it necessary to at least grab attribute defaults, if not
also look for parameter entities and conditional sections).  Using it
alone could easily put us back into an HTML-like mess.

So the problem now is how to encourage validation despite the fact that
the W3C has apparently shot DTDs and itself in the foot with namespaces.

The answer, obviously, is to shed any pretense of DTDs being the basic
XML schema mechanism.  We could waffle for years, claiming that both the
DTD and some other mechanism are "standard".  But what's this supposed
to do to the complexity (remember complexity?) of our processing soft-
ware?

It's not like it's any harder to construct a schema mechanism that
offers a superset of what a DTD offers, and then provide simple conver-
sion tools.

Yes, SGML compatibility was an original goal.  But a lot of original
goals seem to have gone out the window.  Another one isn't going to make
any difference now.

The only problem with this scenario is that it will horrify the old SGML
community, which looks to me as if it's trying to kludge architectural
forms onto XML, maybe in efforts to save DTDs.

It's all getting rather bizarre.  Again, I say this as someone who has
gotten with the program, and implemented everything the W3C has put out
(and who works in an SGML shop).  My boss isn't leaning on me to hack
out EDI code, or to whip up an RDF engine.  I'm really just a disinter-
ested observer.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From paul at prescod.net  Thu Mar 18 14:30:31 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:09 2004
Subject: RDF considered optional
References: <36F02B36.F5E46355@locke.ccil.org>
		<000901be70c7$ead3bb60$5118a8c0@kuantech1.quokka.com> <14064.21512.146188.888832@localhost.localdomain>
Message-ID: <36F10784.E40B0123@prescod.net>

David Megginson wrote:
> 
>  > I took it the same way. But doesn't that violate the principle of
>  > XML as being self-describing?
> 
> First, there is no such published principle for XML itself, though the
> Namespaces spec provides an infrastructure for such a thing.

Debatably the principle of self-describing-ness is encoded in both the XML
declaration (unfortunately optional!) and the DOCTYPE declaration.

> Second, it in no way violates it, because if you recognise the
> namespaces/elements being used, you can still figure out that you're
> dealing with RDF (and if not, fat lot of good they'll do you anyway).

Wouldn't it be useful for a generic RDF processor (e.g. viewer, search
engine) to be able to recognize RDF in arbitrary documents?

Personally, I do feel that the optional RDF element is a bad idea. I have
had many bad experiences with "implied declarations" like <!DOCTYPE HTML
...> and <!SGML ...>. "Say what you are!" In this case the obvious way for
RDF elements to say that they are RDF elements without enforcing a
particular vocabulary would be to use attributes -- but that sounds like a
smelly old SGML-ish idea.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"If you were casting Bob Kane's character by disposition, you would 
never in a million years think Michael Keaton or George Clooney.
Good God: George Clooney? If you were casting Bob Kane's Batman, even 
the likes of Tim Roth or Christopher Walken would be much too 
lighthearted to play this demonic avenger."
	- http://www.salonmagazine.com/feature/1998/11/06feature.html



From david at megginson.com  Thu Mar 18 16:03:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:09 2004
Subject: SAX vote: "SAX2" wins first-round majority (preliminary results)
Message-ID: <14065.8220.353851.594664@localhost.localdomain>

Voting for the name of the next version of SAX closed at midnight (EST)
last night, St. Patrick's Day.

Here are the preliminary results:

  "SAX2":    19 votes
  "OpenSAX": 12 votes
  "ModSAX":  2 votes

Since the name "SAX2" has won a clear majority, there is no need for a
further round.  However, I did tabulate the votes manually, and would
like to ask everyone to check the following list to ensure that I
recorded their votes correctly (I might add this list to the SAX2
documentation, so make certain that posterity will know what your real
opinion was):


SAX2 (19 votes)
---------------

Parand Tony Darugar <tdarugar@binevolve.com>
John E. Simpson <simpson@polaris.net>	
Jonathan Borden <jborden@mediaone.net>
Don Park <donpark@quake.net>
Paul Tchistopolskii <paul@globalsight.com>
Lars Marius Garshol <larsga@ifi.uio.no>
Marcus Carr <mrc@allette.com.au>
Timothy S Balraj <tbalraj@india.dharma.com>
Steven Marcus <srnm@yahoo.com>
Moncef Mezghani <Moncef.Mezghani@wanadoo.fr>
Ronald Bourret <rbourret@ito.tu-darmstadt.de>
Andy Redhead <Andy.Redhead@ThomsonConsulting.com>
Michael Kay <Michael.Kay@icl.com>
Adrian Tivey <A.Tivey@hgmp.mrc.ac.uk>
Jason A. Buss <jabuss@cessna.textron.com>
David Brownell <David.Brownell@Eng.Sun.COM>
Rajiv Mordani <Rajiv.Mordani@Eng.Sun.COM>
Eric Armstrong <eric.armstrong@eng.sun.com>
Matthias Spycher <Matthias.Spycher@Eng.Sun.COM>


OpenSAX (12 votes)
------------------

Bill la Forge <b.laforge@jxml.com>
John Orla-bukowski <John.Orla-bukowski@Schwab.COM>
Lisa Richards (RTIS) <lisa.richards@reedtech.com>
John Cowan <cowan@locke.ccil.org>
Ramin Firoozye <ramin@wizen.com>
Dan Brickley <Daniel.Brickley@Bristol.ac.uk>
Jouni Miettunen <Jouni.Miettunen@ccc.fi>
E.L. Willighagen <egonw@sci.kun.nl>
Mike Dacon[?] <MikeDacon@aol.com>
Bob DuCharme <DuCharmR@moodys.com>
Dr. Janet Bagg <J.Bagg@ukc.ac.uk>
Simon St.Laurent <simonstl@simonstl.com>


ModSAX (2 votes)
----------------

Clark Evans <clark.evans@manhattanproject.com>
W. E. Perry <wperry@fiduciary.com>


Thanks, and all the best,


David

p.s. I would have voted for "ModSAX" if I had not been conducting the
     vote, but "SAX2" would still have had a one-vote majority.

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From hassan.hussein at zurich.com  Thu Mar 18 16:26:23 1999
From: hassan.hussein at zurich.com (hassan.hussein@zurich.com)
Date: Mon Jun  7 17:10:09 2004
Subject: 100% Java XML Parsers
Message-ID: <C1256738.0059EDC2.00@mtach2.zurich.com>


Hello,

I apologise in advance if this question is not appropriate for the list. In my
opinion the list members are the best source of XML knowledge and I am confident
that many of you have an answer to my question.

I have a product that is required to read and write XML documents.

I am looking for a comparison of the 100% Java XML parsers that are currently
available from any company.
What I would like to see is a list of such parsers (with a link to each one's
resources, web site etc.), each with the following:

   Description of parser including any relevant history and company background
   Comparison of the parser to others in relation to:
     Functionality
     Performance
     Ease of use
     Reliability
     Flexibility
   A recommendation for which parsers are best in your opinion


Thank you

Hassan




From lucio.piccoli at one2one.co.uk  Thu Mar 18 16:38:57 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:10:09 2004
Subject: 100% Java XML Parsers
Message-ID: <3601e52e.180299@smtpgate1.ONE2ONE.CO.UK>


check out

http://archive.javareport.com/9902/html/products/prod_rev.shtml

 -lucio

>
> Hello,
>
> I apologise in advance if this question is not appropriate for the list.
> In my opinion the list members are the best source of XML knowledge and
> I am confident that many of you have an answer to my question
>
> I have a product that is required to read and write XML documents.
>
> I am looking for a comparison of the 100% Java XML Parsers that are
> currently available from any company.
> What I would like to see is list of such parsers (and a link to its
> resources, web site etc.) each with the following:
>
>    Description of parser including any relevant history and company
>    background
>    comparison of the parser to others in relation to:
>      Functionality
>      Performance
>      Ease of use
>      Reliability
>      Flexibility
>    A recommendation for which parsers are best in your opinion
>
>
> Thank you
>
> Hassan
>


From simpson at polaris.net  Thu Mar 18 16:41:16 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun  7 17:10:09 2004
Subject: Tech questions: "Can XML cure world hunger?"
Message-ID: <3.0.32.19990318113609.007cf2f0@polaris.net>

I've received the below questions from a friend who works for a large
telecomm consulting firm with, in his words, "some pretty expert developers
who are familiar with Java, HTML, CORBA, etc."

Although general, his questions touch on subjects outside my own areas of
expertise. Anyone care to take a shot at answering them, or to suggest
resources where he might at least *start* finding answers?

>>1) Here's the basic problem:
>>
>>How do you build a high performance, fast loading web interface that has
>>the speed of HTML, but the screen design flexibility of Java applets?
>>Also assume that you are in an environment where you are not able to
>>pre-load plug-ins which would help the applet loading.
>>
>>Can XML help here?  If so, this seems to be almost like solving world
>>hunger for web developers.
>>
>>2) Extra credit question:
>>
>>Does XML have any possible relation to CORBA?  I assume no, but I'm really
>>beyond my level of understanding here.

Although as I say I'm not sufficiently well-versed in Java and CORBA to
answer his questions definitively, I sense -- from many of the discussions
that have taken place here in the year-and-a-half I've been on the list --
that many of xml-dev's regular contributors won't blink twice before
answering in the affirmative.

Thanks in advance for any input,
John

=============================================================
John E. Simpson          | It's no disgrace t'be poor, 
simpson@polaris.net      | but it might as well be.
                         |            -- "Kin" Hubbard



From cowan at locke.ccil.org  Thu Mar 18 17:08:28 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:09 2004
Subject: How to ??
References: <85256737.00129928.00@D51MTA03.pok.ibm.com> <199903171524.JAA01148@bruno.techno.com> <36EFE58A.E37657A3@locke.ccil.org> <199903180504.XAA00766@bruno.techno.com>
Message-ID: <36F132A4.89F959F8@locke.ccil.org>

Steven R. Newcomb wrote:

> At its currently-most-intelligible, it can be found at:
> http://www.hytime.org/materials/sgmlpropset/index.html

Thank you.  That is relatively clear.
 
> I gather that there is an open comment
> period with respect to an Infoset draft (or perhaps a requirements
> draft) that is either in progress or is about to occur,

The requirements document is publicly available (3 pages long) at:

	http://www.w3.org/TR/NOTE-xml-infoset-req

and we hope to have the first WD out Very Soon Now.  It will be
(looking into my crystal ball ...) probably about 15 pages long,
of which maybe a third comprises an RDF schema that is not normative,
as the requirements document says the prose is the normative part.

> and I'm
> frankly frustrated that I can't take the time to pay it the attention
> that it so richly deserves.

After WD publication there will be a comment period of a month
at least.  Please try to squeeze in a little bit of time, everyone,
to read the WD and make your viewpoints heard.

John Cowan, member of, but not speaking for, the Infoset WG

-- 
John Cowan	http://www.ccil.org/~cowan		cowan@ccil.org
	You tollerday donsk?  N.  You tolkatiff scowegian?  Nn.
	You spigotty anglease?  Nnn.  You phonio saxo?  Nnnn.
		Clear all so!  'Tis a Jute.... (Finnegans Wake 16.5)



From rudman at idetix.com  Thu Mar 18 17:20:20 1999
From: rudman at idetix.com (Dan Rudman)
Date: Mon Jun  7 17:10:09 2004
Subject: XML inside HTML question
Message-ID: <001501be7163$04759bb0$49e9fdce@diablo.idetix.com>

Challenge:

I have a tag-based scripting language to be used with web pages (HTML).  I
want this scripting language to be XML-based, despite the fact that the XML
tags will exist within the confines of the non-well-formed HTML.

Is there a way to write a DTD that covers all of that in a way that lets me
use an XML parser to get my stuff out and treat all the other stuff that's
not mine (HTML, text, other people's XML tags, etc.) as CDATA, or something
similar?

This would be great... then I could assume XML to deal with my own scripting
language but I examine everything else in the context of simple CDATA rather
than trying to parse it out as a full-blown tag tree.


Any ideas?
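
One way to sketch the behaviour asked for here, with the caveat that it
sidesteps the DTD question entirely: instead of an XML parser it uses a
tolerant HTML tokenizer (Python's standard html.parser), picks out only
the tags carrying an agreed prefix, and lets everything else pass by as
opaque text. The "idx" prefix, the tag names and the sample page are all
made up for illustration.

```python
from html.parser import HTMLParser


class ScriptTagExtractor(HTMLParser):
    """Collect start tags carrying our prefix; skip all other markup."""

    def __init__(self, prefix):
        super().__init__()
        self.prefix = prefix.lower() + ":"
        self.found = []    # (tag, attributes) for our own tags
        self.ignored = 0   # number of foreign start tags skipped

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this method.
        if tag.startswith(self.prefix):
            self.found.append((tag, dict(attrs)))
        else:
            self.ignored += 1


# Hypothetical page: deliberately not well-formed HTML.
PAGE = """<html><body bgcolor=white>
<p>Unclosed paragraph
<idx:include file="header.inc">
<table><tr><td>cell
<idx:if test="loggedin">Welcome back</idx:if>
</body>"""

extractor = ScriptTagExtractor("idx")
extractor.feed(PAGE)
extractor.close()

for tag, attributes in extractor.found:
    print(tag, attributes)
```

The trade-off is that the scripting tags themselves are not checked for
well-formedness either; each extracted fragment would still need to go
through a real XML parser if that guarantee matters.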

-- Dan
-- Idetix, Inc.




From chris at w3.org  Thu Mar 18 17:41:20 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:09 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu>
Message-ID: <36F13992.150D05F9@w3.org>


"Richard L. Goerwitz" wrote:
> I just updated STG's web-available validator
> to cope with namespaces.  I'm not claiming that I got it right on the
> first pass.  But the updates should help those of you experimenting
> with the Jan 14 spec:
> 
>   http://www.stg.brown.edu/service/xmlvalid/

Thanks; it's interesting to experiment with.

> After working with them now for a few months, I can't say I'm any more
> impressed with namespaces than when I started.  Why?
> 
>   --  No no matter what anyone says, they screw up validation.  --

I observe the same results, but draw different conclusions. I'm not any
more impressed with DTDs than when I started.
> 
>     1) because DTDs aren't namespace-aware, and therefore
>       a) don't know the difference between a defaulted element and one
>          that simply has no namespace
>       b) have no scoping mechanism to at least allow you to kludge
>          namespace defaulting by restricting elements to one or another
>          part of the syntax tree

These are problems with DTDs rather than with namespaces as such. 

>     2) because namespaces require you to parse attributes and values
>        fully before finishing element name processing; this is bad be-
>        cause it
>       a) makes one-pass parsing more difficult, and requires retention
>          of much more information during the parse
>       b) makes for unexpected interactions between the DTD (which may
>          provide default attributes for a given element, including
>          xmlns="" - which puts the element into a namespace)

OK, and fair comment, but it seems a reasonable power/complexity
trade-off to me.

>     3) because inherited attributes are inimical to the whole DTD
>        concept
>       a) it was bad enough that we had to put up with xml:lang and
>          such (which processing software must pass down the parse
>          tree), now the XML standard itself has inherited attributes
>          built in with namespaces

Inherited attributes are a powerful and obvious concept; again, it seems
to be DTDs which are insufficiently expressive rather than namespaces
which are broken.
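
A small sketch of the scoping behaviour under discussion, using Python's
standard xml.etree.ElementTree (a choice of tool made for this example,
not something from the thread; the element names and the
urn:example:parts URI are invented). It shows a default namespace being
inherited by a child element and switched off again with xmlns="", which
is also why an element's full name is only known once its attributes
have been read.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names and namespace URI are illustrative only.
DOC = """\
<order xmlns="urn:example:parts">
  <item>inherits the default namespace</item>
  <note xmlns="">opts out of the default namespace</note>
</order>
"""

root = ET.fromstring(DOC)

# ElementTree reports each name as {namespace-uri}local-name.
names = [root.tag] + [child.tag for child in root]
for name in names:
    print(name)
```

A DTD has no way to say that "item" means one thing inside "order" and
another elsewhere, which is the expressiveness gap being argued about.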

> And my observation is that namespaces screw up validation.

screw up validation *with DTDs using current DTD syntax*, which is not
the same thing at all. 

Unfortunately I came across EBNF long before I came across DTD syntax,
so about half an hour after meeting DTDs I was, like, what do you mean
it can't express that this attribute is a url? Why can't it express that
this attribute is an ISO standard date?

So I quickly formed the opinion that DTDs really got in the way of
validation ;-)
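
To make the gap concrete, here is a minimal sketch of the kind of
separate checking script that DTD-only validation pushes people towards.
The attribute names (issued, due, href), the sample element and both
helper functions are assumptions made for the example; a DTD could only
declare all three attributes as CDATA.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date
from urllib.parse import urlparse

ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not ISO_DATE.fullmatch(value):
        return False
    try:
        date.fromisoformat(value)
    except ValueError:
        return False
    return True


def is_absolute_url(value):
    """True if value has both a scheme and a host part."""
    parts = urlparse(value)
    return bool(parts.scheme and parts.netloc)


# Hypothetical instance; 30 February is deliberately invalid.
DOC = ('<invoice issued="1999-03-18" due="1999-02-30" '
       'href="http://www.w3.org/TR/NOTE-xml-infoset-req"/>')

CHECKS = {"issued": is_iso_date, "due": is_iso_date, "href": is_absolute_url}

elem = ET.fromstring(DOC)
report = {name: check(elem.get(name, "")) for name, check in CHECKS.items()}
print(report)
```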

> This is all very bothersome because validation is one of the key points
> that separate XML from HTML, 

that separates XML from HTML practice. HTML theory always required
validation, doctypes, all that good stuff; but the bar was massively
high and thus the spec was not really relating to the users at all. I
first saw an implementation of HTML only a couple of months ago (the
DocZilla browser from Citec).

With XML, the bar has been lowered sufficiently by throwing out the
crufty bits of SGML, that it becomes an achievable target. So, there
are lots of users stress-testing XML, which is great, and getting much
more from it than was possible with typical HTML "implementations" which
is also great. But one result of that stress testing is that DTDs (which
were just about OK in a closed, single system, single user world) are
poorly suited to an open, multi-user, Web-enabled world.

You know what they used to say about SGML; it's asymmetric. Getting the
data in just takes a text editor, but getting it out again requires a
consultant. Well, with XML, the effort to get some benefit from XML is
reduced because of economy of scale - someone somewhere will have the
dtd you want to do part of your job. Build what you want from a kit of
parts that other people wrote; add a little glue, and off you go.

That model has been spectacularly successful in programming; namespaces
give that same power for XML.

Yes, validation is important - and I mean real validation, with no
critical-path human-readable comments in the DTD and multiple utilities
to check different aspects of validity (like separate scripts to ensure
that an attribute is a valid date or customer number).

So what is critically needed is a real, namespace-aware, schema language
that can be used to do real validation.

> and potentially make it better.  With XML,
> anyone can define their own HTML, so to speak, or another markup lang they
> find useful, and then simply publish a DTD with it.

Right; in the same way, anyone can do the data modelling required to
define a database format and anyone can write a parser. In theory. But
most people choose not to, and to use ones that someone else wrote. This
works for code. It should work for data, too.

Since many people will have come across *some* aspect of a user's problem
space before, but no-one will have come across the *exact same* entire
problem, then namespaces are required so that people can build what they
want from a distributed kit of parts.
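
As a rough illustration of the "kit of parts" idea, the sketch below
takes one document that mixes two vocabularies and sorts its elements by
namespace, so that each part could be handed to whatever code understands
it. Both namespace URIs and all element names are hypothetical, and
xml.etree.ElementTree is simply a convenient namespace-aware parser.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical document combining a "report" and a "geo" vocabulary.
DOC = """\
<doc:report xmlns:doc="urn:example:report" xmlns:geo="urn:example:geo">
  <doc:title>Survey</doc:title>
  <geo:point lat="51.5" lon="-0.2"/>
  <doc:para>Taken near <geo:place>Imperial College</geo:place>.</doc:para>
</doc:report>
"""


def split_name(tag):
    """Split ElementTree's '{uri}local' form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


by_namespace = defaultdict(list)
for elem in ET.fromstring(DOC).iter():
    uri, local = split_name(elem.tag)
    by_namespace[uri].append(local)

for uri, locals_ in by_namespace.items():
    print(uri, locals_)
```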

> It's to
> the point where the only people who can write effective HTML processing
> software are outfits with armies of programmers hired to deal with error
> recovery and proprietary extensions (both their own and their competitors').

Yes. Typically 95% of the programming effort is in the reverse
engineering and undocumented trickery; implementing the actual specs is
the remaining 5%.

> With XML, we can potentially start out on the right foot, and avoid this
> nonsense 

Yes

> by using validation from the start. 

For stand-alone documents in a single namespace, that can still be done.
For combinations of particular namespaces, it can be done declaratively
and a resulting DTD auto generated, but that is fragile because it makes
assumptions about the namespace prefix and limits the use of namespace
defaulting. Given a more powerful schema language, creating a schema for
a new  XML application should be as easy as reading in a selection of
DTDs and doing drag and drop tree construction with the component parts.

> Well-formedness is nice, but it's not clearly enough defined 

It seems fairly clearly defined; it may not be sufficient.

> (and anyway, many non-validating
> processors find it necessary to at least grab attribute defaults, if not
> also look for parameter entities and conditional sections). 

You mean, ones in the external subset? If code is doing that, it might
as well do validation too.
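
For the simpler case of the internal subset, the effect is easy to see
with a non-validating parser. This is only a sketch: the memo element and
its status attribute are invented, and xml.etree.ElementTree (built on
expat) stands in for "a non-validating processor". It does not read an
external subset.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: the instance never writes a status attribute,
# but the internal DTD subset declares a default for it.
DOC = """\
<!DOCTYPE memo [
  <!ATTLIST memo status CDATA "draft">
]>
<memo>No status attribute is written here.</memo>
"""

memo = ET.fromstring(DOC)
status = memo.get("status")
print(status)
```

So the application sees an attribute that is nowhere in the instance,
without any validation having taken place.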

>  Using it
> alone could easily put us back into an HTML-like mess.

Oh no, even that gives us more than HTML-in-practice. For example,
different applications can actually be assumed to be using the same
parse tree ;-) which does make the DOM and style sheets a whole lot more
predictable.

The combination of SGML omissible start-tags and HTML extensions meant
that all parsing was error recovery and that no two browsers would have
the same parse tree. Or if they did, it was because they had five or so
different possible parse trees around, depending on what you were doing
;-)
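
A minimal sketch of the difference, assuming Python's
xml.etree.ElementTree as the XML parser: well-formed input yields one
tree, and tag soup is rejected outright instead of being "repaired" in
some implementation-specific way. The parse_or_reject helper and both
sample strings are made up for the example.

```python
import xml.etree.ElementTree as ET


def parse_or_reject(text):
    """Return element names in document order, or None if not well-formed."""
    try:
        root = ET.fromstring(text)
    except ET.ParseError:
        return None
    return [elem.tag for elem in root.iter()]


well_formed = parse_or_reject("<ul><li>one</li><li>two</li></ul>")
tag_soup = parse_or_reject("<ul><li>one<li>two</ul>")

print(well_formed)
print(tag_soup)
```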


> So the problem now is how to encourage validation despite the fact that
> the W3C has apparently shot DTDs and itself in the foot with namespaces.

Or rather, the problem is that the W3C (and the general public)
exercising XML and putting it into real practice has made painfully
obvious some shortcomings inherited from SGML.  The W3C XML Schema WG
will, however, solve these; I am confident of this.

That isn't shooting ourselves in the foot; it is more akin to
discovering that movement is much faster when your clogs aren't nailed
together, and pausing for a while to separate them and to develop
running shoes. 

> The answer, obviously, is to shed any pretense of DTDs being the basic
> XML schema mechanism. 

For declaring multi-namespace documents, yes. They still have at least
an interim role in validating single namespace documents and in defining
the building blocks from which a multi-namespace schema can be
constructed.

> We could waffle for years, claiming that both the
> DTD and some other mechanism are "standard".  But what's this supposed
> to do to the complexity (remember complexity?) of our processing soft-
> ware?
> 
> It's not like it's any harder to construct a schema mechanism that
> offers a superset of what a DTD offers, and then provide simple conver-
> sion tools.
> 
> Yes, SGML compatibility was an original goal.  But a lot of original
> goals seem to have gone out the window.  Another one isn't going to make
> any difference now.

;-)

> The only problem with this scenario is that it will horrify the old SGML
> community, which looks to me as if it's trying to kludge architectural
> forms onto XML, maybe in efforts to save DTDs.

There are significant portions of the old SGML community working to
improve XML and to help build the missing parts which are needed. I have
a lot of respect for that portion. There are, as you say, other parts
which are merely trying to save their own highly paid jobs as priests of
complex, low-powered technology. One can usually tell the difference by
noting that the former portion have their eyes open.


> It's all getting rather bizarre.  Again, I say this as someone who has
> gotten with the program, and implemented everything the W3C has put out

Cool. Implementation experience is like gold.

--
Chris



From jabuss at cessna.textron.com  Thu Mar 18 19:20:53 1999
From: jabuss at cessna.textron.com (Buss, Jason A)
Date: Mon Jun  7 17:10:09 2004
Subject: 100% Java XML Parsers
Message-ID: <F7E1775C1C27D211881F00A024B2853046A046@CESS01AMX03>

> I am looking for a comparison of the 100% Java XML Parsers that are
> currently available from any company.
> A recommendation for which parsers are best in your opinion.

My personal favorite is xml4java by IBM's Alphaworks.

IMHO, it has the closest conformity to the XML 1.0 rec.  You can find it at

http://www.ibm.com/alphaworks

> -----Original Message-----
> From:	hassan.hussein@zurich.com [SMTP:hassan.hussein@zurich.com]
> Sent:	Thursday, March 18, 1999 10:26 AM
> To:	xml-dev@ic.ac.uk
> Subject:	100% Java XML Parsers
> 
> 
> 
> 



From paul at prescod.net  Thu Mar 18 19:48:48 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:09 2004
Subject: Validation
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org>
Message-ID: <36F15372.3FF36ABB@prescod.net>

Chris Lilley wrote:
> 
> Unfortunately I came across EBNF long before I came across DTD syntax,
> so about half an hour after meeting DTDs I was, like, what do you mean
> it can't express that this attribute is a url? Why can't it express that
> this attribute is an ISO standard date?

I can guarantee you today that the XML schema effort will not allow you to
express everything that EBNF will, so if that's your standard it will fail.
But even if we use EBNF as our standard: do you know of any programming
languages expressed entirely in EBNF? Or even entirely in *any formalism*?

> Yes, validation is important - and I mean real validation, with no
> critical-path human-readable comments in the DTD and multiple utilities
> to check different aspects of validity (like separate scripts to ensure
> that an attribute is a valid date or customer number).

It will never be the case that it will be possible to write schemas that
are so tight that they remove the need for comments that describe
additional constraints to other human beings. There will always be a need
not only for multiple schema languages but also for the ultimately
flexible schema language: prose text.

Luckily, eliminating all other schema languages is not a goal of the W3C
schema language effort. 

> So what is critically needed is a real, namespace-aware, schema 
> language that can be used to do real validation.

I hear a lot of users saying that. They don't seem to realize that there
is no such thing as "real validation"; there is only "the validation I need
to do today." Ten years from now, we'll be griping that XMLSchemas don't
do "real validation" for some other arbitrarily advanced definition of
"real."

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"A year ago, when Ernest Pecounis said he wanted to bring
Linux into the state agency he works for, there was a swell of
laughter from his colleagues. Guess who's laughing now."
 - http://www.zdnet.com/pcweek/stories/news/0,4153,393443,00.html



From b.laforge at jxml.com  Thu Mar 18 20:25:01 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:10 2004
Subject: Tech questions: "Can XML cure world hunger?"
Message-ID: <002e01be717d$dd7c15a0$46026982@thing1>

From: John E. Simpson <simpson@polaris.net>

>>>How do you build a high performance, fast loading web interface that has
>>>the speed of HTML, but the screen design flexibility of Java applets?
>>>Also assume that you are in an environment where you are not able to
>>>pre-load plug-ins which would help the applet loading.


My interest in XML is largely centered around program composition. Case in
point, using an XML file to assemble Swing components into an interactive GUI.

A lot of Swing programming is the glue code for assembling components. This
code tends to be hard to read, consequently hard to maintain, and interwoven
with small smatterings of application logic. But it is inherently tree structured.

Replacing that glue code with an XML-document driven composition system
means that you have an easy-to-read (compared to the glue code) document
which naturally reflects the tree structure of the GUI. It also means that, aside
from a small XML document and a little bit of application logic, the only thing you
need to download is the composer--which is fixed for all applications. Also,
you don't need to download the Swing components, which can be used unmodified
and which should already be present.

BML, Bluestone, and MDSAX are three different implementations of this general
approach.
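
The general idea can be sketched in a few lines. This is not how BML,
Bluestone or MDSAX actually work; it is a toy composer in Python, with a
stand-in Widget class instead of Swing, and the element names, the
REGISTRY table and the compose function are all invented for the example.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass, field


@dataclass
class Widget:
    """Stand-in for a GUI component (a real system would build Swing objects)."""
    kind: str
    props: dict
    children: list = field(default_factory=list)


# Hypothetical registry: element name -> factory taking the attribute dict.
REGISTRY = {
    name: (lambda props, name=name: Widget(name, props))
    for name in ("frame", "panel", "label", "button")
}


def compose(elem):
    """Build the component for elem, then recurse into its children."""
    if elem.tag not in REGISTRY:
        raise ValueError(f"no component registered for <{elem.tag}>")
    widget = REGISTRY[elem.tag](dict(elem.attrib))
    widget.children = [compose(child) for child in elem]
    return widget


def outline(widget, depth=0):
    """Return an indented list of component kinds, one per line."""
    lines = ["  " * depth + widget.kind]
    for child in widget.children:
        lines.extend(outline(child, depth + 1))
    return lines


LAYOUT = """\
<frame title="Login">
  <panel>
    <label text="User name"/>
    <button text="OK"/>
  </panel>
</frame>
"""

gui = compose(ET.fromstring(LAYOUT))
print("\n".join(outline(gui)))
```

The composer stays the same for every application; only the XML layout
and the small registry change.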

Bill




From david at megginson.com  Thu Mar 18 20:49:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:10 2004
Subject: RDF considered optional
In-Reply-To: <36F10784.E40B0123@prescod.net>
References: <36F02B36.F5E46355@locke.ccil.org>
	<000901be70c7$ead3bb60$5118a8c0@kuantech1.quokka.com>
	<14064.21512.146188.888832@localhost.localdomain>
	<36F10784.E40B0123@prescod.net>
Message-ID: <14065.26107.308780.583545@localhost.localdomain>

Paul Prescod writes:
 > David Megginson wrote:
 > > 
 > >  > I took it the same way. But doesn't that violate the principle
 > >  > of XML as being self-describing?
 > > 
 > > First, there is no such published principle for XML itself,
 > > though the Namespaces spec provides an infrastructure for such a
 > > thing.
 > 
 > Debatably the principle of self-describing-ness is encoded in both
 > the XML declaration (unfortunately optional!) and the DOCTYPE
 > declaration.

Perhaps -- it's really a question of degree.  The XML declaration
tells you what the XML version is and may tell you about the intended
character encoding (the 'standalone' declaration is pointless and
should be ignored); the DOCTYPE declaration tells you what the root
element of the document is.  Neither of these, however, tells you
anything about what kind of XML document you're looking at.

 > > Second, it in no way violates it, because if you recognise the
 > > namespaces/elements being used, you can still figure out that you're
 > > dealing with RDF (and if not, fat lot of good they'll do you anyway).
 > 
 > Wouldn't it be useful for a generic RDF processor (e.g. viewer, search
 > engine) to be able to recognize RDF in arbitrary documents?

Of course it would be, but not everyone will want to define an
exchange format with rdf:RDF as the root element.  The presence of the 
rdf:about attribute can serve as a useful clue if it's there.
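
A minimal sketch of how a generic processor might look for that clue, in
Python. The function name and the sample document are hypothetical, and the
namespace URI is assumed to be the one from the RDF Model and Syntax
specification; a real processor would need to do considerably more than this.

```python
import xml.etree.ElementTree as ET

# Assumed RDF namespace URI (RDF Model and Syntax specification).
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def find_rdf_clues(xml_text):
    """Return the tags of elements that are in the RDF namespace
    or that carry an rdf:about attribute."""
    clues = []
    for element in ET.fromstring(xml_text).iter():
        in_rdf_ns = element.tag.startswith("{" + RDF_NS + "}")
        has_about = ("{" + RDF_NS + "}about") in element.attrib
        if in_rdf_ns or has_about:
            clues.append(element.tag)
    return clues

# Hypothetical exchange format with no rdf:RDF root element.
DOC = """
<catalog xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/terms#">
  <ex:book rdf:about="http://example.org/books/1">
    <ex:title>An Example</ex:title>
  </ex:book>
  <note>No RDF here.</note>
</catalog>
"""

print(find_rdf_clues(DOC))
```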

 > Personally, I do feel that the optional RDF element is a bad idea. I have
 > had many bad experiences with "implied declarations" like <!DOCTYPE HTML
 > ...> and <!SGML ...>. "Say what you are!" In this case the obvious way for
 > RDF elements to say that they are RDF elements without enforcing a
 > particular vocabulary would be to use attributes -- but that sounds like a
 > smelly old SGML-ish idea.

Yeah, stinky old architectural forms would be very helpful here.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Thu Mar 18 21:04:48 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:10 2004
Subject: Tech questions: "Can XML cure world hunger?"
In-Reply-To: <3.0.32.19990318113609.007cf2f0@polaris.net>
Message-ID: <NBBBJPGDLPIHJGEHAKBAKEJJCPAA.martind@netfolder.com>

Hi John,

<TheComment>
>>Does XML have any possible relation to CORBA?  I assume no, but I'm really
>>beyond my level of understanding here.
</TheComment>

<Reply>
Yes, it actually does: object meta-information can be specified in XML (see the
latest announcements from the OMG). Also, in the near future, marshalling between
objects could be done with an XML-based format. There is current work on this for
CORBA implementations, and likewise for DCOM implementations. Thus, XML will be
added to the other marshalling formats.
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From tomh at thinlink.com  Thu Mar 18 21:25:25 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun  7 17:10:10 2004
Subject: DOM Implemetation in C?
Message-ID: <36F16F0E.BE41B37D@thinlink.com>


I've checked through everything at the w3c, as well as Robin Cover's
list... does anyone know where I might find an implementation of the DOM
(any level will do) written in C?  If there isn't one, is anyone else
interested?

Many thanks.




From srn at techno.com  Thu Mar 18 21:56:25 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:10 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F10CFC.CFEB89A8@goon.stg.brown.edu>
	(richard@goon.stg.brown.edu)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu>
Message-ID: <199903182155.PAA01202@bruno.techno.com>

[Richard Goerwitz:]

> [Certain members of] ...the old SGML community... looks to
> me as if it's trying to kludge architectural forms onto
> XML, maybe in efforts to save DTDs.

I would plead "Guilty as charged" if you said basically the
same thing in a more positive way, and without the implicit
accusation of syntax-bigotry:

  "Certain members of the SGML community are demonstrating
   how to use architectural forms with XML.  Thus they are
   promoting the concept of structural and semantic modeling
   and validation in XML."

It is largely immaterial whether the old DTD syntax survives
in XML-land.  However, the idea of validatable structure is
of vital importance if we're to have efficient, reliable
information interchange via XML resources.  If an XML
resource uses more than one vocabulary (as namespaces are
designed to allow), then the use of each of those
vocabularies should be validatable according to its own
syntactic and semantic constraints.  This is what
architectural forms can do -- and are already doing -- for
XML resources.  Namespaces (at least the bulk of their
syntax and the idea of identifying a namespace via a URI)
could also be used to do architectural forms.

The alternative SGML syntax for doing architectural forms
works perfectly in XML.  I demonstrated that fact at my talk
at XTech 99 (http://www.hytime.org/papers/srnXTech99/).  I
also offered to share our version of SP (including sx and
sgmlnorm) that reads and processes the alternative PI-based
syntax for architectural form declaration that was adopted
over a year ago by the ISO, mostly for the benefit of XML
users.  My offer stands; the catch is that this small change
to the underlying SP parser hasn't been fully integrated
with SP's error-reporting mechanisms yet.  Some
public-spirited person is welcome to that task.  Takers?
(Anyway, the real issue is what XML parsers will do with
inherited information architectures, not what SGML/XML
parsers like SP will do with them.  What SGML/XML parsers do
has been decided long since, and James Clark's free SP
parser already implements the architectural forms paradigm
very effectively indeed.)

I often hear about how inheritance will be supported by
future "schema languages" for XML documents.  The
architectural forms paradigm (regardless of the syntax used
to exploit it) is, at the very least, a major part of the
solution.  If one doesn't yet understand how each inherited
architecture results in one or two distinct groves (in
addition to the primary XML grove), all cross-connected,
then one doesn't yet "get" it.  Everything in my experience
convinces me that, when one does "get" it, the simplicity
and elegance of the whole thing, taken together, makes its
adoption a no-brainer.  "The solution, once found, is
obvious."  I would strongly urge that the syntax of
vocabulary inheritance in XML be designed *after* the
architectural forms paradigm is well understood, and *after*
there is consensus about how much of it to adopt and exploit
in XML.

In any case, the importance of being able to validate every
XML resource's use of every inherited vocabulary, on
that vocabulary's own terms, cannot be overemphasized.  If
one accepts the principle that such validation is essential
for reliable information interchange, then I believe the
whole architectural forms thing follows inexorably, groves
and all.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From clark.evans at manhattanproject.com  Thu Mar 18 22:02:13 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:10:10 2004
Subject: DOM Implemetation in C?
References: <36F16F0E.BE41B37D@thinlink.com>
Message-ID: <36F176DD.D3658194@manhattanproject.com>

Tom Harding wrote:
> 
> I've checked through everything at the w3c, as well as Robin Cover's
> list... does anyone know where I might find an implementation of the DOM
> (any level will do) written in C?  If there isn't one, is anyone else
> interested?
> 

Here you go:

> 
>                   XML parser for Gnome
> 
> Documentation is available on-line at
>     http://rufus.w3.org/veillard/XML/xml.html
> 
> A mailing-list has been set-up, to subscribe:
>     echo "subscribe xml" | mail majordomo@rufus.w3.org
> 
> The list archive is at:
>     http://rufus.w3.org/veillard/XML/messages/
> 
> 
> Daniel.Veillard@w3.org
>



From mscardin at us.oracle.com  Thu Mar 18 22:26:14 1999
From: mscardin at us.oracle.com (Mark Scardina)
Date: Mon Jun  7 17:10:10 2004
Subject: ANN: Oracle XML Parser for Java v1.0.1.1 - Maintenance
Message-ID: <000201be718e$11aeb6b0$47be1990@mscardin-pc.us.oracle.com>

The first maintenance release of the Oracle XML Parser for Java
is available for download at http://technet.oracle.com/tech/xml.
This is a bug-fix release.

The following are included features and specs:
     Supports validation and non-validation modes 
     Built-in Error Recovery until fatal error. 
     Supports W3C XML 1.0 Recommendation. 
     Integrated Document Object Model (DOM) Level 1.0 API
     Integrated SAX 1.0 API
     Supports W3C Proposed Recommendation for XML Namespaces
     Supports documents in the following encodings: 

           UTF-8              BIG 5
           UTF-16             GB2312
           ISO-10646-UCS-2    EUC-JP
           ISO-10646-UCS-4    EUC-KR
           US-ASCII           KOI8-R
           EBCDIC-CP-*        ISO-2022-JP
           ISO-8859-1 to -9   ISO-2022-KR
           Shift_JIS

Support is available in the XML Forum on OTN to provide a collaborative
area for bug reporting, technical support, and discussing other Oracle/XML
issues.  This forum will be used for external as well as internal beta
testers.

Mark V. Scardina
Sr. Product Manager - Core Development
Server Technologies - Oracle Corporation
Oracle XML News http://www.oracle.com/xml



From mrc at allette.com.au  Thu Mar 18 22:46:04 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:10 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org>
Message-ID: <36F18209.8C68524@allette.com.au>


Chris Lilley wrote:

[a number of sideways kicks at SGML, then:]

> There are significant portions of the old SGML community working to
> improve XML and to help build the missing parts which are needed. I have
> a lot of respect for that portion. There are, as you say, other parts
> which are merely trying to save their own highly paid jobs as priests of
> complex, low-powered technology. One can usually tell the difference by
> noting that the former portion have their eyes open.

Spare me. The biggest driving factor behind people working in SGML is the fact that there are
clients who want work done. SGML is neither complex nor low-powered, as numerous defence,
telcos, legal publishers, stock exchanges, aircraft manufacturers, automotive companies, etc.
can attest. Generalisations of the participants such as those above create friction between
the XML and SGML camps and reveal an innate lack of understanding about the relationship
between the two. I will thank you not to categorise me as either a "good XML groupie" or a
"garden gnome".


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From pvelikho at cs.ucsd.edu  Thu Mar 18 23:13:38 1999
From: pvelikho at cs.ucsd.edu (Pavel Velikhov)
Date: Mon Jun  7 17:10:10 2004
Subject: Fragment Interchange Question
References: <36F02B36.F5E46355@locke.ccil.org>
		<000901be70c7$ead3bb60$5118a8c0@kuantech1.quokka.com>
		<14064.21512.146188.888832@localhost.localdomain>
		<36F10784.E40B0123@prescod.net> <14065.26107.308780.583545@localhost.localdomain>
Message-ID: <36F18760.C10C2BF2@cs.ucsd.edu>

Hi,
	I have tried to "parse" the Fragment Interchange spec, but failed.
So I have this question: I would like to use Fragment Interchange to
ship small pieces of XML documents. So far it fits with the spec.
But I also want these fragments to be incomplete, and to be able to
state that explicitly and to recover the missing pieces if needed.

	Here is an example, suppose my initial XML document is:

<dealers_db>
  <dealer>
	<address>
		<street> 1102 Main St. </street>
		<city> San Diego </city>
		<state> CA </state>
		<zip> 99119 </zip>
	</address>
	<ads>
		<ad>
			<make> Honda </make>
			<model> Civic </model>
			<year> 95 </year>
		</ad>
		<ad>
			<make> Jeep </make>
			<model> Wrangler </model>
			<year> 93 </year>
		</ad>
		... (lots of other ads)
	</ads>
  </dealer>
  <dealer>
	...
  </dealer>
</dealers_db>

	And suppose I want to ship the first dealer, but not the full dealer
element with all subobjects. Instead, I want only the city from the address
and the first ad. However, I need to know what information is incomplete.
Do you know if this is doable with the Fragment Interchange spec? I.e., could I
have "unresolved" fragbodies within a fragbody?

Here is (approximately) what I need to do (in_e is short for incomplete
element and in_l is short for incomplete list):


<?xml version="1.0"?>
<p:package xmlns:p="http://www.w3.org/XML/Package/1.0"
           xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           xmlns="">


  <f:fcs fragbodyref="http://dealer.com#root().child(1,dealer)">
    <dealers_db>
        <f:fragbody/>
    </dealers_db>
  </f:fcs>


  <p:body>
    <dealer>
      <address>
        <in_e at="http://dealer.com#root().child(1,dealer).child(1,address).child(1,street)"/>
        <city> San Diego </city>
        <in_l at="http://dealer.com#root().child(1,dealer).child(1,address).child(3)"/>
      </address>
      <ads>
        <ad>
          <make> Honda </make>
          <model> Civic </model>
          <year> 95 </year>
        </ad>
        <in_l at="http://dealer.com#root().child(1,dealer).child(1,ads).child(2,ad)"/>
      </ads>
    </dealer>
  </p:body>


</p:package>
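
To make the intent concrete, here is a rough Python sketch of how such an
incomplete fragment could be produced mechanically. It says nothing about
whether the Fragment Interchange spec permits this. The prune function, the
per-name child(n,name) counting, and the use of a single in_e placeholder per
omitted element (no in_l lists) are simplifying assumptions; the locator
strings only imitate the ones above and are not checked against any spec.

```python
import copy
import xml.etree.ElementTree as ET

def prune(element, wanted, locator, base):
    """Copy `element`, shipping whole every subtree whose locator is in
    `wanted` and replacing every other omitted child with <in_e at="..."/>."""
    out = ET.Element(element.tag, dict(element.attrib))
    out.text = element.text
    counts = {}  # per-name counters for the child(n,name) steps
    for child in element:
        counts[child.tag] = counts.get(child.tag, 0) + 1
        step = "%s.child(%d,%s)" % (locator, counts[child.tag], child.tag)
        if step in wanted:
            node = copy.deepcopy(child)              # ship the whole subtree
        elif any(w.startswith(step + ".") for w in wanted):
            node = prune(child, wanted, step, base)  # ancestor of a wanted subtree
        else:
            node = ET.Element("in_e", {"at": base + "#" + step})  # placeholder
        node.tail = child.tail
        out.append(node)
    return out

DOC = """<dealers_db>
  <dealer>
    <address>
      <street>1102 Main St.</street>
      <city>San Diego</city>
      <state>CA</state>
    </address>
    <ads>
      <ad><make>Honda</make></ad>
      <ad><make>Jeep</make></ad>
    </ads>
  </dealer>
</dealers_db>"""

BASE = "http://dealer.com"
DEALER = "root().child(1,dealer)"
WANTED = {
    DEALER + ".child(1,address).child(1,city)",
    DEALER + ".child(1,ads).child(1,ad)",
}

dealer = ET.fromstring(DOC).find("dealer")
fragment = prune(dealer, WANTED, DEALER, BASE)
print(ET.tostring(fragment, encoding="unicode"))
```

A receiver could later resolve each in_e placeholder by requesting the
element named in its at attribute.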


Thank you,
Pavel Velikhov
UCSD Database Laboratory
http://www.db.ucsd.edu



From richard at goon.stg.brown.edu  Fri Mar 19 00:32:29 1999
From: richard at goon.stg.brown.edu (Richard Goerwitz)
Date: Mon Jun  7 17:10:10 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au>
Message-ID: <36F19AC0.B5B40B20@goon.stg.brown.edu>

Marcus Carr wrote:

> > There are significant portions of the old SGML community working to
> > improve XML and to help build the missing parts which are needed. I have
> > a lot of respect for that portion...
> 
> Spare me. The biggest driving factor behind people working in SGML is
> the fact that there are clients who want work done. SGML is neither
> complex nor low-powered, as numerous defence, telcos, ..., etc. can
> attest.

You are right in observing that these industries have all profited from
SGML-based descriptive markup.  You are wrong if you are also asserting
that there isn't any room for dramatic improvement (e.g., in the area of
schemas).

I come from a small shop that does a lot of SGML work.  Trust me:  SGML
is complex and intractable.  Software that works with it is scarce and
often expensive, and too often doesn't work very well.  Just because a
giant telco firm can muster the personnel to deal with SGML doesn't make
it a particularly elegant solution, except by way of comparison with
approaches that use non-standard or presentation-focused languages.

As for DTDs:

The growing realization that DTDs are insufficient for XML is not a
result of mindless SGML bashing.  Nor does it represent a failure to
appreciate how great a leap SGML was in the 80s.  This realization is,
rather, just something implementors are coming to after painful
experiences trying to make DTDs work with XML.

The sooner we can all agree on another schema mechanism, the sooner we
can all stop trying to outfit XML with all the kludges that people have
already built onto SGML to make it useful in a modern, scoped,
object-oriented world.

I'm increasingly looking on the original SGML-compatibility goal for
XML as a necessary political move - but one that should be shed at the
earliest convenient opportunity.

This doesn't mean we should shed all the experience that the SGML
community can bring to the table, of course.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From andrew at squiz.co.nz  Fri Mar 19 01:23:08 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:10:10 2004
Subject: XML inside HTML question 
In-Reply-To: Your message of "Thu, 18 Mar 1999 12:16:10 CDT."
             <001501be7163$04759bb0$49e9fdce@diablo.idetix.com> 
Message-ID: <199903190122.OAA04551@aniwa.sky>


rudman@idetix.com said:
> Challenge: 
> 
> I have a tag-based scripting language to be used with web pages (HTML).  I
> want this scripting language to be XML-based, despite the fact that the XML
> tags will exist within the confines of the non-well-formed HTML.
> 
> Is there a way to write a DTD that covers all of that in a way that lets me
> use an XML parser to get my stuff out and treat all the other stuff that's
> not mine (HTML, text, other people's XML tags, etc.) as CDATA, or something
> similar?
> 
> This would be great... then I could assume XML to deal with my own scripting
> language but I examine everything else in the context of simple CDATA rather
> than trying to parse it out as a full-blown tag tree.

I'm doing this using CDATA as an interim step in development (the system is 
live).  I'm looking at moving to having some tags which my XML app is 
interested in exposed, and the rest character-entity-encoded.  To recover my 
original HTML I'd need to do a single round of entity decoding on all entities 
outside of tags.  I'm a bit concerned though that this strategy will not be 
well supported by stylesheet languages.  Perhaps I need to use lots of small 
CDATA sections.  I'm still investigating.
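
A minimal sketch of the "exposed tags plus entity-encoded remainder" strategy
described above, in Python. The my:if and my:include tag names, the
urn:example:script namespace, and the wrapping doc element are hypothetical,
and the sketch assumes the exposed tags are themselves well-formed XML.

```python
import re
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Hypothetical script-language tags to leave exposed; everything else is escaped.
EXPOSED = re.compile(r"</?(?:my:if|my:include)\b[^<>]*>")

def transform(text, convert):
    """Apply `convert` to every run of text outside the exposed tags."""
    out, pos = [], 0
    for match in EXPOSED.finditer(text):
        out.append(convert(text[pos:match.start()]))
        out.append(match.group(0))
        pos = match.end()
    out.append(convert(text[pos:]))
    return "".join(out)

def encode(html):
    """Entity-encode all markup except the exposed tags."""
    return transform(html, escape)

def decode(encoded):
    """Undo a single round of entity encoding outside the exposed tags."""
    return transform(encoded, unescape)

HTML = '<p>Hello<br><my:if test="x">A &amp; B</my:if><img src=a.gif></p>'

encoded = encode(HTML)
wrapped = '<doc xmlns:my="urn:example:script">' + encoded + "</doc>"

root = ET.fromstring(wrapped)  # now parses as well-formed XML
print(root[0].tag, repr(root[0].text))
print(decode(encoded) == HTML)
```

After encoding, the non-well-formed HTML appears to the XML parser as plain
character data, and one round of decoding gives back the original page.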

Andrew McNaughton


-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From tbray at textuality.com  Fri Mar 19 01:24:37 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:10 2004
Subject: Something of a red-letter day
Message-ID: <3.0.32.19990318172415.00c169b0@pop.intergate.bc.ca>

As of today, we have the first release of a non-beta commercial browser
that can display XML, i.e. I.E.  We decided that XML.com ought to try
actually publishing something in XML, and so there's a story there in 
XML (with a version in HTML for the wimps among you) about, of course,
writing a story in XML about writing a story in XML about writ^C.

If you can stand the (hefty!) download, get IE5 and read it in XML, 
because it's Good For You.  And the XML version is way prettier. -T.




From jamesr at steptwo.com.au  Fri Mar 19 01:56:54 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:10 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F18209.8C68524@allette.com.au>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net>
 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org>
Message-ID: <4.1.19990319124943.00bcd100@steptwo.com.au>

At 08:45 19/03/1999 , Marcus Carr wrote:

  | Chris Lilley wrote:
  | 
  | [a number of sideways kicks at SGML, then:]
  | 
  | > There are significant portions of the old SGML community working to
  | > improve XML and to help build the missing parts which are needed. I have
  | > a lot of respect for that portion. There are, as you say, other parts
  | > which are merely trying to save their own highly paid jobs as priests of
  | > complex, low-powered technology. One can usually tell the difference by
  | > noting that the former portion have their eyes open.
  | 
  | Spare me. The biggest driving factor behind people working in SGML is the
  | fact that there are clients who want work done. SGML is neither complex nor
  | low-powered, as numerous defence, telcos, legal publishers, stock exchanges,
  | aircraft manufacturers, automotive companies, etc. can attest.
  | Generalisations of the participants such as those above create friction
  | between the XML and SGML camps and reveal an innate lack of understanding
  | about the relationship between the two. I will thank you not to categorise
  | me as either a "good XML groupie" or a "garden gnome".

Hear, hear!

I'm another who is happily getting on with real work, for real
people, solving real problems, using SGML.

It's about time we stopped wasting time arguing about what
the "next generation" of schemas should be.

Let's get on with actually using XML as it stands, and 
prove to the world that it's more than just hot air.

J


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From mrc at allette.com.au  Fri Mar 19 02:02:44 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu>
Message-ID: <36F1AFF5.2DF948A2@allette.com.au>


Richard Goerwitz wrote:

> You are right in observing that these industries have all profited from
> SGML-based descriptive markup.  You are wrong if you are also asserting
> that there isn't any room for dramatic improvement (e.g., in the area of
> schemas).

I completely agree that there's room for dramatic improvement. In fact, I consider myself to
be one of those SGML people who is encouraging XML's growth and development, not one clinging
desperately to a high-paying SGML job. (I think Sasquatch Inc. is hiring, but you have to be
willing to relocate to Loch Ness.)

> I come from a small shop that does a lot of SGML work.  Trust me:  SGML
> is complex and intractable.  Software that works with it is scarce and
> often expensive, and too often doesn't work very well.  Just because a
> giant telco firm can muster the personnel to deal with SGML doesn't make
> it a particularly elegant solution, except by way of comparison with
> approaches that use non-standard or presentation-focused languages.

I also come from a small company that has been making a living principally out of SGML for
almost a decade. I understand very well the complexity associated with SGML; I have been doing
it for a very long time. In my experience, large organisations tend to come to small firms
rather than implement solutions themselves - they're looking for elegance, not muscle.

> The growing realization that DTDs are insufficient for XML is not a
> result of mindless SGML bashing.  Nor does it represent a failure to
> appreciate how great a leap SGML was in the 80s.  This realization is,
> rather, just something implementors are coming to after painful
> experiences trying to make DTDs work with XML.

So... how do we get back to SGML people not having their eyes open? I accept and agree that
not everything about SGML works in XML. That's not the issue; not even the most fervent
supporter of SGML would argue that the issues are identical for SGML and XML. That's the whole
point.

> I'm increasingly looking on the original SGML-compatibility goal for
> XML as a necessary political move - but one that should be shed at the
> earliest convenient opportunity.
>
> This doesn't mean we should shed all the experience that the SGML
> community can bring to the table, of course.

Again, I don't necessarily disagree with either of these statements, but that wasn't what I
was posting about. I just think that we should be grown-up enough not to feel that we have to
eat our young (or in this case, our parents).


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From paul at prescod.net  Fri Mar 19 03:10:51 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu>
Message-ID: <36F1BBE8.A3AB13EB@prescod.net>

Richard Goerwitz wrote:
> 
> I come from a small shop that does a lot of SGML work.  Trust me:  SGML
> is complex and intractable.  

<RANT>
This is way off topic but I must admit that these characterizations really
annoy me.

I can only speak anecdotally: I started using SGML while working for a
professor of English as an undergrad. A single programmer (not me) wrote a
pretty sophisticated application that converted SGML to HTML and RTF in a
couple of months -- almost exactly the same amount of time it would take
to do the same for XML. The process was almost identical too: you use a
parser from James Clark, pump the data into your favorite scripting
language and output it in the other language. The complexity of the input
syntax was and is irrelevant to solving that problem.
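
For what it's worth, that process can be sketched in a few lines of Python.
This is only an illustration, not the application described above: it parses
XML rather than SGML (using the standard library's SAX parser), and the
element names and the TAG_MAP table are made up.

```python
import xml.sax
from xml.sax.saxutils import escape

# Hypothetical mapping from source element names to HTML element names.
TAG_MAP = {"article": "div", "title": "h1", "para": "p", "emph": "em"}

class ToHtml(xml.sax.ContentHandler):
    """Receive parser events and emit the corresponding HTML."""
    def __init__(self):
        super().__init__()
        self.out = []

    def startElement(self, name, attrs):
        self.out.append("<%s>" % TAG_MAP.get(name, "span"))

    def endElement(self, name):
        self.out.append("</%s>" % TAG_MAP.get(name, "span"))

    def characters(self, content):
        self.out.append(escape(content))

def convert(source):
    """Parse `source` and return the converted HTML as a string."""
    handler = ToHtml()
    xml.sax.parseString(source.encode("utf-8"), handler)
    return "".join(handler.out)

SOURCE = ("<article><title>Markup</title>"
          "<para>Generic &amp; <emph>simple</emph>.</para></article>")
print(convert(SOURCE))
```

The parser deals with the input syntax; the conversion logic only ever sees
elements and character data.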

If we were doing that now it would be much, much easier because we would
use Jade. That proves that technology improves and that hard things become
easier over time, which is pretty much unrelated to the distinction
between SGML and XML.

So anyhow the professor, my friend and I branched out and did some
consulting. So anecdotally I can say that two undergrads and an English
professor can figure out SGML and sell it to some Very Large Companies.

I wasn't expensive by consulting standards but compared to the other
undergraduates I billed out at a pretty high rate (not that I saw most of
that money!).

Was that because I was doing SGML? You bet. Was it because SGML was hard?
No. Almost everything I did then I would do today with XML in roughly the
same way.

I was expensive because SGML was fundamentally uncool and smart computer
science students could not be convinced to look at it. So the industry was
dominated by technical writers, lawyers, humanists and other people who
had the vision of where they wanted to go but usually not the technical
skills to get there.

The companies we worked with would never have looked at us twice if we
were working with SQL or CORBA because those technologies are cool. If we
were working with SQL we would have got "summer job" rates instead of
consultant rates. We weren't doing anything more difficult than everybody
else, but we were getting paid more (at a cost of some pride). Now it
rather annoys me to be uncool again because I made the mistake of
ingeniously recognizing the virtue of (okay, stumbling upon) generic
markup a little too early.

Yes, many things are easier today. Part of that is the progression of
time. Jade is better than Omnimark for converting to RTF, modern SGML
editors are better than what we had a few years ago. 

Another important part is XML. It's all of a sudden cool to do markup
because the average programmer feels like they could make a parser if they
had to, even though the average programmer is generally too smart to waste
time reinventing that wheel. It's cool because it is associated with the
Internet. It's cool because Microsoft likes it.

I know what Simula's inventors must feel like. Sun repackages Simula
twenty years late and it's treated as the second coming of Kernighan. Argh.

> Software that works with it [SGML] is scarce and
> often expensive, and too often doesn't work very well.  

That's often the case with emerging technologies. Software to work with
XML doesn't work so great yet either. The most sophisticated, solid
software I have that works with XML (e.g. Jade, Excosoft Documentor) was
all SGML software first. Do you have some counter examples?

> Just because a
> giant telco firm can muster the personnel to deal with SGML doesn't make
> it a particularly elegant solution, except by way of comparison with
> approaches that use non-standard or presentation-focused languages.

Elegance is pretty subjective but according to my jaded view neither SGML
nor XML is very elegant. The angle bracket syntax alone is annoying. The
strange dichotomy between elements and attributes is also odd. SGML and
XML make it possible to get at the structural heart of documents. That makes
some things very easy. It makes some other things that were previously
impossible hard, but possible. 

The syntactic differences between them have so little to do with the
complexity of making industrial strength applications that I can only
conclude that those who think that SGML implementation is "hard" and XML
implementation is "easy" haven't actually got around to implementing
anything complex yet.

Re: Schemas -- it is 10 years later. We can probably improve on DTDs by
about 50%. We should do so. It doesn't make sense to wait for schemas in
order to implement a new system, and it is also not the case that they
will "revolutionize" the use and practice of XML, but it IS the case that
they will probably give us some nice features that will make our lives a
little easier. Great!
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"A year ago, when Ernest Pecounis said he wanted to bring
Linux into the state agency he works for, there was a swell of
laughter from his colleagues. Guess who's laughing now."
 - http://www.zdnet.com/pcweek/stories/news/0,4153,393443,00.html



From cbullard at hiwaay.net  Fri Mar 19 04:15:27 1999
From: cbullard at hiwaay.net (len bullard)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu>
Message-ID: <36F1CEAF.50B4@hiwaay.net>

Richard Goerwitz wrote:
> 
> The sooner we can all agree on another schema mechanism, the sooner we
> can all stop trying to outfit XML with all the kludges that people have
> already built onto SGML to make it useful in a modern, scoped, object-
> oriented world.

I agree for the most part.

<rant>As one who also doesn't like to see one group try to get hegemony
by taking out the prior group (sort of the imperialism the Europeans
used on the American Indians and the HTML community used on everyone),
I also admit a DTD comes up short when one starts trying to do things
with it that neither it nor SGML was designed for.  The SGML community
realized this at least a decade ago and has been intensely involved in
work to fix it.   Let's face it: XML has concentrated most of that
work in one domain, and we should be glad of it.</rant>

I think (just an opinion) the right way (morally and politically) to 
approach this is to say that as the environment has changed, and the 
demands on markup systems for applications not envisioned in the 
original designs of SGML have emerged, the requirements have changed. 
New capabilities have to be designed to meet the requirements. 

Stodgy as that may sound, it is an engineering approach to what 
is an engineering job.  We will do well to be engineers and not try 
to crusade.  Otherwise, we become like artists who also write critique:  
just politicians.

len



From tony.mcdonald at ncl.ac.uk  Fri Mar 19 06:46:38 1999
From: tony.mcdonald at ncl.ac.uk (Tony McDonald)
Date: Mon Jun  7 17:10:11 2004
Subject: Parsing XML->DOM and XSL querying optimising
Message-ID: <199903190636.GAA00207@cheviot.ncl.ac.uk>

Hi all,

I have an application that consists of 140+ XML documents, roughly 100k
bytes each that I want to be able to query (using XSL pattern matching at
present) and output to XML/HTML and RTF format. This will happen in real
time (if at all possible).

Additionally, I'd like to be able to search/query the entire repository of
documents and return a composite XML/HTML or RTF document from these.

At the moment, I'm experimenting with the DOM parser in Python and finding
that a DOM parse takes about 4 seconds, whilst an XSL query takes about 1.8
seconds.

I reckon that a user could wait the 1.8 seconds for a query, but might start
to get fidgety after almost 6 seconds (how transient we are!).

What strategies have people got for limiting the DOM parsing time?

My own thoughts are that I load up all 140 documents at server-startup time,
parse them into DOM[0]...DOM[139], store them into memory and then query
each one in turn in the case of a simple query, and query all the DOM
objects in the case of a full query across all XML documents.
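
A rough sketch of that preloading idea, assuming Python's xml.dom.minidom. The DomCache class and its method names are hypothetical, and getElementsByTagName stands in for a real XSL pattern match.

```python
from xml.dom import minidom


def text_of(node):
    """Concatenate the direct text children of a DOM element."""
    return "".join(
        child.data for child in node.childNodes
        if child.nodeType == child.TEXT_NODE
    )


class DomCache:
    """Parse every document once at startup and keep the DOM trees in memory."""

    def __init__(self):
        self._docs = {}

    def load(self, name, xml_text):
        # The expensive parse happens here, once per document.
        self._docs[name] = minidom.parseString(xml_text)

    def query_one(self, name, tag):
        # Simple query: look in a single cached document.
        nodes = self._docs[name].getElementsByTagName(tag)
        return [text_of(n) for n in nodes]

    def query_all(self, tag):
        # Full query: run the same lookup across every cached document.
        return {name: self.query_one(name, tag) for name in self._docs}


cache = DomCache()
cache.load("a", "<doc><title>One</title></doc>")
cache.load("b", "<doc><title>Two</title><title>Three</title></doc>")
print(cache.query_all("title"))
```

The trade-off is memory: all 140 trees stay resident, and a changed file has to be reloaded explicitly.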

Is this sensible? practical? stupid?

any thoughts on this would be appreciated,
cheers,
tone.
 



From larsga at ifi.uio.no  Fri Mar 19 08:29:37 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:11 2004
Subject: IDL2XML converter
In-Reply-To: <3601e340.180299@smtpgate1.ONE2ONE.CO.UK>
References: <3601e340.180299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <wkvhfxlw5u.fsf@ifi.uio.no>


* Lars Marius Garshol
|
| Hmmm. Now you lost me. What kind of IDL2XML converter did you have
| in mind? Did you really mean a CDR2XML converter? Or IIOP2XML-RPC?

* LUCIO PICOLLI
| 
| Neither. I was thinking of converting an IDL into a DTD. The DTD will
| need to support data types. 

You'll have to provide a framework for that yourself, I'm afraid. XML
has no concept of octet, long long, short, boolean, sequence and so
on. It's still possible to use XML, but I think you'd have to go with
a single DTD and do your own type/error checking. It's not as bad as
it may sound, as you'd have to do that at some level anyway.
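
A small sketch of what "do your own type/error checking" might look like, assuming a single generic document shape in which each value carries its IDL type in an attribute. The element and attribute names (record, value, name, type) are made up for illustration and are not part of any standard.

```python
import xml.etree.ElementTree as ET

# Value ranges for a few IDL integer types.
RANGES = {
    "octet": (0, 255),
    "short": (-2**15, 2**15 - 1),
    "long long": (-2**63, 2**63 - 1),
}


def check_value(idl_type, text):
    """Convert element text to a Python value, enforcing the IDL type."""
    if idl_type == "boolean":
        if text not in ("true", "false"):
            raise ValueError("not a boolean: %r" % text)
        return text == "true"
    low, high = RANGES[idl_type]
    value = int(text)  # raises ValueError for non-numeric text
    if not low <= value <= high:
        raise ValueError("%d out of range for %s" % (value, idl_type))
    return value


def decode(xml_text):
    """Decode every <value> element of a record into a checked dict."""
    root = ET.fromstring(xml_text)
    return {
        el.get("name"): check_value(el.get("type"), (el.text or "").strip())
        for el in root.iter("value")
    }


record = decode(
    '<record><value name="age" type="octet">42</value>'
    '<value name="ok" type="boolean">true</value></record>'
)
print(record)
```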

Anyway, if you want to do your own work in this area I have a flexible
IDL parser written in Common Lisp, which you can have the source to.
It's not complete, but it could easily be used for this kind of thing.
I haven't optimized it yet (or even started declaring types), but it's
already noticeably faster than VisiBroker's Java IDL parser.

| Any XML doc generated from that DTD could then be used to pass data
| between ORB's instead of IIOP.

I'm not sure this has much value, except insofar as it is easier to
develop support for than full CORBA. But XML-RPC already does that.

One thing you might look at is using the IDL to automatically produce
stubs and skeletons that marshal and demarshal XML-RPC requests. I'm
not sure if that's really necessary for the language you use, though.
(Whichever language that may be. :)
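
For illustration, this is roughly what a generated stub and skeleton would reduce to, sketched with the xmlrpc.client module from today's Python standard library. The method name account.getBalance and its arguments are hypothetical.

```python
import xmlrpc.client


def marshal_call(method, *args):
    """Stub side: turn a method name and arguments into an XML-RPC request."""
    return xmlrpc.client.dumps(args, methodname=method)


def demarshal_call(payload):
    """Skeleton side: recover the method name and arguments from a request."""
    params, method = xmlrpc.client.loads(payload)
    return method, params


payload = marshal_call("account.getBalance", 42, "GBP")
print(payload)
print(demarshal_call(payload))
```

In a dynamic language the generic functions above may be all that is needed; generated per-interface stubs matter more where the compiler wants static signatures.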
 
| This XML-RPC thing looks good, but I don't know what it is, as I have
| only found a few lightweight docs on it. I searched the W3C site but I
| can't find it.  Has it been submitted to the W3C?

I don't think so, no.  If it had been it should have been on
<URL:http://www.w3.org/TR/> as a note.  Anyway, I think this is
outside the scope of the W3C, although HTTP-NG does similar things.

The spec is at: <URL:http://www.scripting.com/frontier5/xml/code/rpc.html>

--Lars M.




From lucio.piccoli at one2one.co.uk  Fri Mar 19 10:53:11 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:10:11 2004
Subject: IDL2XML converter
Message-ID: <3601e81d.190299@smtpgate1.ONE2ONE.CO.UK>


> You'll have to provide a framework for that yourself, I'm afraid. XML
> has no concept of octet, long long, short, boolean, sequence and so
> on. It's still possible to use XML, but I think you'd have to go with
> a single DTD and do your own type/error checking. It's not as bad as
> it may sound, as you'd have to do that at some level anyway.

I was hoping that the XML-DATA/DCD schema might be useful here. I   
realised that it is still under submission to W3C.

>
> One thing you might look at is using the IDL to automatically produce
> stubs and skeletons that marshal and demarshal XML-RPC requests. I'm
> not sure if that's really necessary for the language you use, though.
> (Whichever language that may be. :)

If I understand you correctly, you are thinking of generating stubs/skeletons
from IDL as language-specific source that can then be used to parse XML
requests? If so, that is an interesting idea.

I am keen to extend the XML-RPC idea to generate a DTD from the IDL and
use the XML parser to do the marshalling. From what I have seen of XML-RPC,
there is no DTD defined for each post, hence the document cannot be validated.
Having the DTD handle most of the marshalling via validation seems
like getting something for nothing.
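
A toy sketch of the IDL-to-DTD step, assuming the IDL struct has already been parsed into a name plus a list of (field, type) pairs. The Account struct and the fixed "type" attribute convention are invented for the example.

```python
def struct_to_dtd(name, fields):
    """Emit DTD declarations for one IDL struct.

    `fields` is a list of (field_name, idl_type) pairs in declaration order.
    """
    content_model = ", ".join(field for field, _ in fields)
    lines = ["<!ELEMENT %s (%s)>" % (name, content_model)]
    for field, idl_type in fields:
        # A DTD can only say "character data" here; the IDL type is
        # recorded in a fixed attribute for the application to check.
        lines.append("<!ELEMENT %s (#PCDATA)>" % field)
        lines.append(
            '<!ATTLIST %s type CDATA #FIXED "%s">' % (field, idl_type)
        )
    return "\n".join(lines)


dtd = struct_to_dtd("Account", [("id", "long"), ("open", "boolean")])
print(dtd)
```

Validation against such a DTD checks that the right fields arrive in the right order, but #PCDATA cannot enforce that a long is numeric, so some type checking still falls to the application.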

Is there anything obviously flawed in the above thought?

 -lucio

    
> --Lars M.
>
>


From James.Anderson at mecomnet.de  Fri Mar 19 11:08:44 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu>
Message-ID: <36F233DC.1BFBD4FE@mecomnet.de>

Richard L. Goerwitz wrote:
> 
> Re namespaces:
> 
> After working with them now for a few months, I can't say I'm any more
> impressed with namespaces than when I started.  Why?
> 

Nice list.

>   --  No matter what anyone says, they screw up validation.  --

While I agree with the general sentiment and with the detailed observations, I
don't agree with the conclusion.
When the namespace spec ascended to the status of a recommendation, I changed
our mechanism for interning symbols to conform to it. Those modifications were
non-trivial, baroque, and in some cases difficult to motivate. (except that,
"well, that's what the spec says.") I did not, however, change anything in the
validation engine. They're really separable issues.

Whatever the other problems with the namespace spec may be, this one is that it
simply does not provide a complete means to encode unambiguous names in an
XML 1.0 document. This is the problem which you describe below. It's an old problem.

All aspects of it have solutions. That is, it is possible either to provide
or to infer the necessary information. Providing it is easy, but it's not
standard. Inferring it is somewhat more complex, but also readily doable. In
this case, only the scoping rules remain outside the standard.

> 
>     1) because DTDs aren't namespace-aware, and therefore
>       a) don't know the difference between a defaulted element and one
>          that simply has no namespace
>       b) have no scoping mechanism to at least allow you to kludge
>          namespace defaulting by restricting elements to one or another
>          part of the syntax tree
> 
>     2) because namespaces require you to parse attributes and values
>        fully before finishing element name processing; this is bad be-
>        cause it
>       a) makes one-pass parsing more difficult, and requires retention
>          of much more information during the parse
>       b) makes for unexpected interactions between the DTD (which may
>          provide default attributes for a given element, including
>          xmlns="" - which puts the element into a namespace)
> 
>     3) because inherited attributes are inimical to the whole DTD
>        concept
>       a) it was bad enough that we had to put up with xml:lang and
>          such (which processing software must pass down the parse
>          tree), now the XML standard itself has inherited attributes
>          built in with namespaces
>
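
Point 1a can be seen in a few lines, sketched here with Python's xml.etree.ElementTree as a stand-in for any namespace-aware parser. The namespace URI is a placeholder.

```python
import xml.etree.ElementTree as ET

NS = "http://example.org/ns"  # placeholder namespace URI

# Same namespace, once via a prefix and once via defaulting.
prefixed = ET.fromstring('<a:item xmlns:a="%s"/>' % NS)
defaulted = ET.fromstring('<item xmlns="%s"/>' % NS)
# No namespace at all.
bare = ET.fromstring("<item/>")

# A namespace-aware parser reports expanded names, so the first two match.
print(prefixed.tag, defaulted.tag, bare.tag)
```

A DTD, by contrast, matches on the literal names "a:item" and "item", so it treats the first two as different elements and cannot tell the defaulted element from the one with no namespace.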




From James.Anderson at mecomnet.de  Fri Mar 19 11:09:07 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <199903182155.PAA01202@bruno.techno.com>
Message-ID: <36F233F4.5CC629EA@mecomnet.de>

Steven R. Newcomb wrote:
> 
> ...  Namespaces (at least the bulk of their
> syntax and the idea of identifying a namespace via a URI)
> could also be used to do architectural forms.
> 

I've wondered about this.
Wouldn't they, at least in their standardized form, be restricted to the
limited family of architectures which are mutually exclusive?




From martind at netfolder.com  Fri Mar 19 12:59:23 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:11 2004
Subject: About Tim's article on XML
Message-ID: <NBBBJPGDLPIHJGEHAKBAOEKHCPAA.martind@netfolder.com>

Hi,

I read Tim's article in XML.com with interest (Ref:
http://www.xml.com/1999/03/ie5/first-x.xml). Several comments are to the
point, and the critique is well conducted and exact except for one glitch....


<ArticleExtract>
At this point in history, there is only one official, approved, stable,
production-quality standard for stylesheets, and it's named Cascading Style
Sheets, or CSS for short. CSS 1 has been around since December 1996, and CSS
2 since May 1998.
</ArticleExtract>

<Reply>
The above passage is correct except for the claim that "there is
only one official, approved, stable, production-quality standard". That
claim is inexact. DSSSL is also an official standard (ISO), approved
(internationally), stable (proven over the past two years), and production
quality (proven by several implementations). So, let's set the clock
to the right time and redo the sentence with the correct factual
information:

At this point in history, there are two official, approved, stable,
production-quality standards for style sheets:
a) Cascading Style Sheets, or CSS for short. CSS 1 has been around since
December 1996, and CSS 2 since May 1998.
b) Document Style Semantics and Specification Language or DSSSL has been
around since 1996.

IE 5.x implements the former, the latter is available as an add-on.

So, as Tim did for IE5, I'll put a bug image in front of this article's part
:-)
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From david at megginson.com  Fri Mar 19 13:53:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F1BBE8.A3AB13EB@prescod.net>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F19AC0.B5B40B20@goon.stg.brown.edu>
	<36F1BBE8.A3AB13EB@prescod.net>
Message-ID: <14066.21253.117700.991361@localhost.localdomain>

Paul Prescod writes:

 > Richard Goerwitz wrote:
 > > 
 > > I come from a small shop that does a lot of SGML work.  Trust me:
 > > SGML is complex and intractable.
 > 
 > <RANT>
 > This is way off topic but I must admit that these characterizations really
 > annoy me.
 > 
 > I can only speak anecdotally: I started using SGML while working
 > for a professor of English as an undergrad. A single programmer
 > (not me) wrote a pretty sophisticated application that converted
 > SGML to HTML and RTF in a couple of months -- almost exactly the
 > same amount of time it would take to do the same for XML.

Actually, many such applications were often written in a few days or
even a few hours.  The interesting thing about SGML is that it was
heavily used in two separate markets at extreme ends of the scale:

1. academia, for large, low-budget projects using free software (like
   Emacs, NSGMLS, Perl, and Jade) or cheap software (like WP7); and

2. government/military/heavy-industry, for large, high-budget projects 
   using extremely expensive commercial software (like ArborText and
   Omnimark).

In general, the academic projects (and there are hundreds of them)
accomplished much more using much less (often just a single PC on a
grad student's desk), but that is partly because they never had to
become too user friendly -- the researchers would work directly with
SGML markup, rather than hiding it behind $20K/seat GUI tools.  The
gov/mil/industry projects spent most of the money trying to hide the
SGML from view -- the processing itself has never been difficult, SGML
or XML.

What SGML missed was the middle part of the document market -- the
$1M-$100M/year companies who couldn't afford all of the customised
user-friendly tools, but didn't have the free time or initiative to
support and maintain their own custom installations.

 > The process was almost identical too: you use a parser from James
 > Clark, pump the data into your favorite scripting language and
 > output it in the other language. The complexity of the input syntax
 > was and is irrelevant to solving that problem.

Almost correct.  One expensive disadvantage of SGML (until WebSGML) is
that it requires full DTD conformance at every stage of production; as
a result, if your production chain consists of ten physical steps,
writing out SGML at each stage, you *must* have DTDs for all of the
intermediate steps.  This one constraint can add $100K or more to a
large enterprise SGML project, since DTD writers are expensive to hire
(and a single, configured DTD becomes heavily obfuscated so that it
can almost never be maintained in-house).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From richard at goon.stg.brown.edu  Fri Mar 19 15:08:50 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:10:11 2004
Subject: XML complexity, namespaces
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu> <36F1BBE8.A3AB13EB@prescod.net>
Message-ID: <36F2686F.EF6C9B38@goon.stg.brown.edu>

Paul Prescod responded on a number of fronts to my posting, covering
topics such as the utility of SGML, his old CS friends' attitudes to-
ward it, and the existence of good, easily accessible software to pro-
cess SGML.

A response is in order, because he's actually arguing against a posi-
tion I do not hold.

Paul says:

> Software to work with XML doesn't work so great yet either. The most
> sophisticated, solid software I have that work with XML (e.g. Jade,
> Excosoft Documentor) was all SGML software first. Do you have some
> counter examples?

XML has only been around a short while.  It's not a fair comparison.

By way of contrast, SGML has been around a long time.  If there's not
a lot of good software out there for it by now, I don't think I'm being
unreasonable in claiming that it's, at least in part, because SGML is
a mess.

Re your CS friends who belittled SGML:  If it was the concept of des-
criptive markup that they belittled, then they were just silly.  And
I think most of them would admit that now.  But if it was the formal
properties of SGML, specifically DTDs, that they were belittling, then
there's very little question that they had a point.

Now re James Clark:  SGML defenders typically hold up his amazing work
as evidence that SGML is easy to process, and quite elegant.  Within a
rather restricted domain, that's true.  But it's really not fair to use
JC as prima facie evidence of elegance or simplicity.  He's worked long
and hard, and he's done some work that's frankly amazed the rest of us -
and the industry.

In a sense, though, all of this is moot.  Your comments seem aimed at
refuting an argument I never made.  I am not saying that you couldn't
get work done with SGML.  I'm not even saying that, for its time, it 
wasn't a tremendous advance.  I'm just saying what should be obvious to
any impartial observer:  That it could stand a lot of improvement, and
that we now have a chance to make the improving easy on ourselves by
making a clean break, on the XML schema issue, with SGML.

Re XML and SGML, you say:

> The syntactic differences between them have so little to do with the
> complexity of making industrial strength applications that I can only
> conclude that those who think that SGML implementation is "hard" and XML
> implementation is "easy" haven't actually got around to implementing
> anything complex yet.

Paul, just for the record:  I have done a lot of implementation work,
some of it quite complex.  Again, though, you're refuting an argument
that I never made.  Far from characterizing the difference between SGML
and XML as hard vs. easy, I have criticized the W3C repeatedly for let-
ting XML become cluttered and disunified, and for letting the old "CS
student can implement a parser for it in a week" become a cruel joke.

> It doesn't make sense to wait for schemas in order to implement a new
> system

This is a good point you make.

As soon as possible, the W3C should make known its intentions.  The worst
possible outcome here would be for them to push DTDs and all the junk that
goes with them to make them useful (architectures, etc.) - only to replace
the whole mechanism by recommending a new or alternate schema setup later
on.

If we're going to get another schema setup, then let's just live with DTDs
as they are for now.  Skip architectures.  Then let's move on to the new
schema mechanism when it's ready.

Until then, we can live with the namespace debacle.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From richard at goon.stg.brown.edu  Fri Mar 19 15:17:00 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:10:12 2004
Subject: eyes open (was XML complexity, namespaces)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu> <36F1AFF5.2DF948A2@allette.com.au>
Message-ID: <36F26A53.E4395E9C@goon.stg.brown.edu>

Marcus Carr wrote:
> 
> So... how do we get back to SGML people not having their eyes open?

Just for the record, I never said they were closed.  I believe it was
Chris Lilley.  And when he said this, he wasn't characterizing the
entire SGML community.  In fact, he was, overall, defending SGML.

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From cerium at ibm.net  Fri Mar 19 15:41:59 1999
From: cerium at ibm.net (John Hicks)
Date: Mon Jun  7 17:10:12 2004
Subject: Parsing XML->DOM and XSL querying optimising
In-Reply-To: <199903190636.GAA00207@cheviot.ncl.ac.uk>
Message-ID: <003701be7237$7e57bec0$01010101@c31cj>

Hi Tony:

How similar are the "140+ XML documents" you mention?  How about putting them in
a database and searching there?  Might be faster to search *before* you compose
final documents, not after;  whether your database stores entire documents,
or just data for templates...
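
One way to picture this, as a sketch only: shred each document into rows at load time and search the rows with SQL. The example assumes Python's sqlite3 and ElementTree modules; the fragment table, its columns and the sample documents are invented here.

```python
import sqlite3
import xml.etree.ElementTree as ET


def build_index(docs):
    """Store the text of every element of every document as one row."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE fragment (doc TEXT, tag TEXT, content TEXT)")
    for name, xml_text in docs.items():
        for el in ET.fromstring(xml_text).iter():
            text = (el.text or "").strip()
            if text:
                db.execute(
                    "INSERT INTO fragment VALUES (?, ?, ?)",
                    (name, el.tag, text),
                )
    return db


def search(db, word):
    """Return (doc, tag, content) rows whose content contains `word`."""
    rows = db.execute(
        "SELECT doc, tag, content FROM fragment "
        "WHERE content LIKE ? ORDER BY doc",
        ("%" + word + "%",),
    )
    return rows.fetchall()


db = build_index({
    "a.xml": "<doc><title>Heart</title><p>Valves</p></doc>",
    "b.xml": "<doc><title>Lung</title></doc>",
})
print(search(db, "Lung"))
```

Only the matching rows then need to be composed into an output document, rather than querying 140 full trees.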

John Hicks

Cerium Component Software
Build Your Database Website with Our XML Team or Tools
XML Outline | XML DB | XML Servlet
212-662-3982 | 888-742-8989
http://ceriumworks.com
"Software as a conversation with a community."

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Tony McDonald
> Sent: Thursday, March 18, 1999 10:35 PM
> To: xml-dev
> Subject: Parsing XML->DOM and XSL querying optimising
>
>
> Hi all,
>
> I have an application that consists of 140+ XML documents, roughly 100k
> bytes each that I want to be able to query (using XSL pattern matching at
> present) and output to XML/HTML and RTF format. This will happen in real
> time (if at all possible).
>
> Additionally, I'd like to be able to search/query the entire repository of
> documents and return a composite XML/HTM or RTF document from these.
>
> At the moment, I'm experimenting with the DOM parser in Python and finding
> that a DOM parse takes about 4 seconds, whilst an XSL query takes
> about 1.8
> seconds.
>
> I reckon that a user could wait the 1.8 seconds for a query, but
> might start
> to get fidgety after almost 6 seconds (how transient we are!).
>
> What strategies have people got for limiting the DOM parsing time?
>
> My own thoughts are that I load up all 140 documents at server-startup
> time, parse them into DOM[0]...DOM[139], store them in memory and then
> query each one in turn in the case of a simple query, and query all the
> DOM objects in the case of a full query across all XML documents.
>
> Is this sensible? practical? stupid?
>
> any thoughts on this would be appreciated,
> cheers,
> tone.

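The load-once strategy in the quoted question can be sketched roughly as
follows. This is a minimal illustration, not Tony's code: the `DomCache`
class and `text_of` helper are hypothetical names, the documents are passed
in as strings rather than read from disk, and `getElementsByTagName` is
only a stand-in for real XSL pattern matching.

```python
from xml.dom import minidom


def text_of(element):
    """Concatenate the direct text children of a DOM element."""
    return "".join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


class DomCache:
    """Parse each document once (e.g. at server startup) and keep the DOM."""

    def __init__(self):
        self._doms = {}

    def load(self, name, xml_text):
        # The expensive parse happens here, once per document.
        self._doms[name] = minidom.parseString(xml_text)

    def query_one(self, name, tag):
        # Simple query: search a single cached DOM.
        dom = self._doms[name]
        return [text_of(el) for el in dom.getElementsByTagName(tag)]

    def query_all(self, tag):
        # Full query: search every cached DOM and group results by document.
        return {name: self.query_one(name, tag) for name in self._doms}
```

With the parse cost paid up front, each request pays only the query time.
The trade-off is memory: 140 DOM trees stay resident for the life of the
server process.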



From paul at prescod.net  Fri Mar 19 16:09:11 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:12 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
		<36F0CFF4.365B@hiwaay.net>
		<36F10CFC.CFEB89A8@goon.stg.brown.edu>
		<36F13992.150D05F9@w3.org>
		<36F18209.8C68524@allette.com.au>
		<36F19AC0.B5B40B20@goon.stg.brown.edu>
		<36F1BBE8.A3AB13EB@prescod.net> <14066.21253.117700.991361@localhost.localdomain>
Message-ID: <36F2722C.FCB403F3@prescod.net>

David Megginson wrote:
> 
>  > I can only speak anecdotally: I started using SGML while working
>  > for a professor of English as an undergrad. A single programmer
>  > (not me) wrote a pretty sophisticated application that converted
>  > SGML to HTML and RTF in a couple of months -- almost exactly the
>  > same amount of time it would take to do the same for XML.
> 
> Actually, many such applications were often written in a few days or
> even a few hours.  

In defense of my friend, this one was pretty slick, with a graphical UI
and used C++ for really high throughput. Actually, she was mostly a C++
bigot so that part isn't completely defensible. Ironically, I encouraged
her to learn Perl before I had attempted to do so myself. Imagine my
surprise.

> The interesting thing about SGML is that it was
> heavily used in two separate markets at extreme ends of the scale:
> 
> 1. academia, for large, low-budget projects using free software (like
>    Emacs, NSGMLS, Perl, and Jade) or cheap software (like WP7); and
> 
> 2. government/military/heavy-industry, for large, high-budget projects
>    using extremely expensive commercial software (like ArborText and
>    Omnimark).

True enough. I expect that there will be a certain amount of this with XML
also, however. You need a certain critical mass of problem complexity
before it makes sense to implement a generic markup-based solution to a
document processing problem. Despite what Chris Lilley says, it *still*
takes a text editor to get data into XML and a consultant (or internal
expert) to get it out. Properly structured XML requires transformations to
turn into beautiful print. XSL is easier than what we had three years ago
but it still isn't something your typical office user will learn. But
again, the difference is that XSL is cool so programmers flock to it.
Perl+SGML/Omnimark was not cool so people with the expertise were
expensive.

> In general, the academic projects (and there are hundreds of them)
> accomplished much more using much less (often just a single PC on a
> grad student's desk), but that is partly because they never had to
> become too user friendly -- the researchers would work directly with
> SGML markup, rather than hiding it behind $20K/seat GUI tools.  The
> gov/mil/industry projects spent most of the money trying to hide the
> SGML from view -- the processing itself has never been difficult, SGML
> or XML.

One of the hardest things with XML *or* SGML is making usable user
interfaces. XML doesn't make it any easier. In fact it retains some of the
SGML features that can do the most damage to an intuitive user interface
(consider internal entities in attributes).

> Almost correct.  One expensive disadvantage of SGML (until WebSGML) is
> that it requires full DTD conformance at every stage of production; as
> a result, if your production chain consists of ten physical steps,
> writing out SGML at each stage, you *must* have DTDs for all of the
> intermediate steps.  

Here's the DTD I would use:

<!ELEMENT INTERMEDIATE_P ANY>
<!ELEMENT INTERMEDIATE_HEAD ANY>
<!ELEMENT INTERMEDIATE_TITLE ANY>
<!ELEMENT INTERMEDIATE_XREF ANY>
...

Actually, I tend to write real DTDs for intermediate steps if I can.
Intermediate steps can add errors too. I agree that sometimes the
cost/benefit ratio isn't there.

> This one constraint can add $100K or more to a
> large enterprise SGML project, since DTD writers are expensive to hire
> (and a single, configured DTD becomes heavily obfuscated so that it
> can almost never be maintained in-house).

I'm surprised that you wouldn't allow the programmer who builds the
intermediate transformations to also build the intermediate DTDs. I
consider the DTDs to be part of the specification for what the program
does.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"A year ago, when Ernest Pecounis said he wanted to bring
Linux into the state agency he works for, there was a swell of
laughter from his colleagues. Guess who's laughing now."
 - http://www.zdnet.com/pcweek/stories/news/0,4153,393443,00.html



From martind at netfolder.com  Fri Mar 19 16:27:30 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:12 2004
Subject: XML complexity, namespaces
In-Reply-To: <36F2686F.EF6C9B38@goon.stg.brown.edu>
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEKNCPAA.martind@netfolder.com>

Hi Richard,

<YourComment>
XML has only been around a short while.  It's not a fair comparison.

By way of contrast, SGML has been around a long time.  If there's not
a lot of good software out there for it by now, I don't think I'm being
unreasonable in claiming that it's, at least in part, because SGML is
a mess.
</YourComment>

<Reply>
Why are you saying that? Why do you say that there is only bad SGML
software? Did you have a bad experience with an SGML software vendor, and
now the whole pond is a mess? :-)
</Reply>

<YourComment>
Re your CS friends who belittled SGML:  If it was the concept of
descriptive markup that they belittled, then they were just silly.  And
I think most of them would admit that now.  But if it was the formal
properties of SGML, specifically DTDs, that they were belittling, then
there's very little question that they had a point.

Now re James Clark:  SGML defenders typically hold up his amazing work
as evidence that SGML is easy to process, and quite elegant.  Within a
rather restricted domain, that's true.  But it's really not fair to use
JC as prima facie evidence of elegance or simplicity.  He's worked long
and hard, and he's done some work that's frankly amazed the rest of us -
and the industry.

In a sense, though, all of this is moot.  Your comments seem aimed at
refuting an argument I never made.  I am not saying that you couldn't
get work done with SGML.  I'm not even saying that, for its time, it
wasn't a tremendous advance.  I'm just saying what should be obvious to
any impartial observer:  That it could stand a lot of improvement, and
that we now have a chance to make the improving easy on ourselves by
making a clean break, on the XML schema issue, with SGML.
</YourComment>

<Reply>
OK, fair comment, but don't sell the bear's fur before killing it. Let's
first see how brilliant the new stuff is before claiming victory over the
darkness of prehistoric times :-)
</Reply>

<YourComment>
As soon as possible, the W3C should make known its intentions.  The worst
possible outcome here would be for them to push DTDs and all the junk that
goes with them to make them useful (architectures, etc.) - only to replace
the whole mechanism by recommending a new or alternate schema setup later
on.

If we're going to get another schema setup, then let's just live with DTDs
as they are for now.  Skip architectures.  Then let's move on to the new
schema mechanism when it's ready.

Until then, we can live with the namespace debacle.
</YourComment>

<Reply>
The W3C is facing a hard problem, and the funniest part is that it was the
problem SGML faced too :-). My own conclusion: as we know, this is work in
progress. Current match score:

Namespaces: do not really solve document structure validation; just a way
to reduce the probability of name collisions. Score: D
XML: yes, the job is now a lot easier for parsers, and a document can be
parsed even in the absence of a DTD. Score: A
Schema: lost in limbo, as tacit knowledge held by working group members :-)
Score: No score

Conclusion: too early to say if XML is _really_ better than SGML. If only we
could combine the good points of property sets with the intent of DTDs. But
what control do we have over this specification process, other than to burn
some candles and pray that they'll do their best? :-))))
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From Ed at dega.com  Fri Mar 19 16:49:24 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:12 2004
Subject: About Tim's article on XML
Message-ID: <30649320C177D111ADEC00A024E9F297169FA7@exchange-server.dega.com>

Everyone,

The correct link for this article should be:
http://www.xml.com/xml/pub/1999/03/ie5/first-x.html


Ed Howland
ed@dega.com
http://www.dega.com 
"As your attorney, I advise you to take some adrenalchrome"

-----Original Message-----
From: Didier PH Martin [mailto:martind@netfolder.com]
Sent: Friday, March 19, 1999 4:58 AM
To: 'XML Dev'
Subject: About Tim's article on XML


Hi,

I read Tim's article on XML.com with interest (Ref:
http://www.xml.com/1999/03/ie5/first-x.xml). Several comments are to the
point, and the critique is well conducted and accurate except for one glitch....


<ArticleExtract>
At this point in history, there is only one official, approved, stable,
production-quality standard for stylesheets, and it's named Cascading Style
Sheets, or CSS for short. CSS 1 has been around since December 1996, and CSS
2 since May 1998.
</ArticleExtract>

<Reply>
The above statement is correct except for the part beginning "there is
only one official, approved, stable, production-quality standard". That
claim is inexact. DSSSL is also an official standard (ISO), approved
(internationally), stable (proven over two years), and production quality
(proven by several implementations). So, let's set the clock to the right
time and redo the sentence with the correct factual information:

At this point in history, there are two official, approved, stable,
production-quality standards for style sheets:
a) Cascading Style Sheets, or CSS for short. CSS 1 has been around since
December 1996, and CSS 2 since May 1998.
b) Document Style Semantics and Specification Language or DSSSL has been
around since 1996.

IE 5.x implements the former, the latter is available as an add-on.

So, as Tim did for IE5, I'll put a bug image in front of this article's part
:-)
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From tony.mcdonald at ncl.ac.uk  Fri Mar 19 17:36:13 1999
From: tony.mcdonald at ncl.ac.uk (Tony McDonald)
Date: Mon Jun  7 17:10:12 2004
Subject: Parsing XML->DOM and XSL querying optimising
In-Reply-To: <003701be7237$7e57bec0$01010101@c31cj>
References: <199903190636.GAA00207@cheviot.ncl.ac.uk>
Message-ID: <v04104422b31827c12be9@[128.240.198.13]>

> Hi Tony:
>
> How similar are the "140+ XML documents" you mention?  How about putting them
> in a database and searching there?  It might be faster to search *before* you
> compose the final documents, not after, whether your database stores entire
> documents or just data for templates...
> John Hicks

John,
Thanks for the reply - the documents are very similar. They're all 
driven from the same DTD. The problem I have is that searching from a 
database involves splitting the xml documents into a tree structure 
if they're to fit into the SQL-based db that we're using at the 
moment. The database is intended to store whole documents, but if I 
can find a sensible way of using templates, I'm all ears!
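
One way to picture the "tree structure" split is sketched below. It is
only an illustration under stated assumptions: `sqlite3` stands in for the
SQL database (the message does not name the product), and the `node`
table, `store_tree` and `find_docs` are hypothetical names. Each element
becomes one row that points at its parent row.

```python
import sqlite3
import xml.etree.ElementTree as ET


def store_tree(conn, doc_name, xml_text):
    """Flatten one XML document into (doc, id, parent, tag, text) rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS node ("
        "doc TEXT, id INTEGER, parent INTEGER, tag TEXT, text TEXT)"
    )
    rows = []

    def walk(element, parent_id):
        node_id = len(rows)  # ids are assigned in document order
        # Only the leading text of each element is kept; tail text in
        # mixed content is dropped to keep the sketch short.
        text = (element.text or "").strip()
        rows.append((doc_name, node_id, parent_id, element.tag, text))
        for child in element:
            walk(child, node_id)

    walk(ET.fromstring(xml_text), None)
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?, ?)", rows)


def find_docs(conn, tag, word):
    """Return the names of documents with a <tag> whose text contains word."""
    cursor = conn.execute(
        "SELECT DISTINCT doc FROM node "
        "WHERE tag = ? AND text LIKE ? ORDER BY doc",
        (tag, f"%{word}%"),
    )
    return [row[0] for row in cursor]
```

A search across all documents then becomes a single SQL query, at the
price of having to reassemble a document from its rows if it is needed
whole again.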

As an aside, I'm looking at Zope (http://www.zope.org) as a possible 
repository for our XML documents as it's an object 
database/web-server that has some interesting features that marry up 
rather well with some other problems I have...

many thanks,
tone
------
Dr Tony McDonald,  FMCC, Networked Learning Environments Project
The Medical School, Newcastle University Tel: +44 191 222 5888
Fingerprint: 3450 876D FA41 B926 D3DD  F8C3 F2D0 C3B9 8B38 18A2



From roddey at us.ibm.com  Fri Mar 19 18:55:56 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:12 2004
Subject: XML complexity, namespaces (was WG)
Message-ID: <87256739.0067DA2D.00@d53mta03h.boulder.ibm.com>


>> The answer, obviously, is to shed any pretense of DTDs being the basic
>> XML schema mechanism.
>
>For declaring multi-namespace documents, yes. They still have at least
>an interim role in validating single namespace documents and in defining
>the building blocks from which a multi-namespace schema can be
>constructed.

And, I think that in the glorious "Schema Age" to come, it might be nice to
be able to validate instances of your schema itself using a DTD? Since so
much will depend upon the correctness of the schema built, having it be
known correct will be nice.

I figure that, in the end, there will be tools that let people build their
own schemas and then documents that deal with them. So having a cheap and
easy way to validate the schema will be nice. But, of course, if the schema
itself uses namespaces then a DTD won't be useful for validating instances
of it either  (in which case I guess you incestuously validate it with a
description of itself in itself.)

And of course there is the ever looming issue of entities, the separation
of content and structural description. I'd kind of hate personally for DTDs
(and their separate syntax) to have to hang around just to deal with this
issue. If we have to keep DTDs for those reasons, I'd prefer to fix DTDs so
that at least they could be used for simpler namespace based validation.
Otherwise, I'd argue for throwing them out and coming up with a way to deal
with entities within a single, consistent syntax.

Then again, I'm just a guy :-)




From simonstl at simonstl.com  Fri Mar 19 18:56:48 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:12 2004
Subject: XSL 'coolness' (was Re: XML complexity, namespaces (was WG))
In-Reply-To: <36F2722C.FCB403F3@prescod.net>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net>
 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org>
 <36F18209.8C68524@allette.com.au>
 <36F19AC0.B5B40B20@goon.stg.brown.edu>
 <36F1BBE8.A3AB13EB@prescod.net>
 <14066.21253.117700.991361@localhost.localdomain>
Message-ID: <199903191856.NAA11713@hesketh.net>

At 09:50 AM 3/19/99 -0600, Paul Prescod wrote:
>XSL is easier than what we had three years ago
>but it still isn't something your typical office user will learn. But
>again, the difference is that XSL is cool so programmers flock to it.

<rant subject="XSL">
Er, uh... is that why Sun and Adobe are putting up large sums of money to
inspire someone (anyone?) to implement the formatting objects end of it?

Am I the only one who gets email from people (mostly programmers) asking
what the hell is going on in XSL?  The transform end seems to bug
programmers in particular, while the FO end bugs developers who thought
they'd already learned a style sheet language for the Web - _and_ print.
</rant>

Oh well.  Just another Friday afternoon working with XML...

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From david at megginson.com  Fri Mar 19 19:21:42 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:12 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F2722C.FCB403F3@prescod.net>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F19AC0.B5B40B20@goon.stg.brown.edu>
	<36F1BBE8.A3AB13EB@prescod.net>
	<14066.21253.117700.991361@localhost.localdomain>
	<36F2722C.FCB403F3@prescod.net>
Message-ID: <14066.41480.122404.993942@localhost.localdomain>

Paul Prescod writes:

 > Despite what Chris Lilley says, it *still* takes a text editor to
 > get data into XML and a consultant (or internal expert) to get it
 > out.

Unless, of course, the XML is simply a serialisation of an existing
data structure.
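
As a small illustration of that case, here is a sketch that writes an
in-memory structure straight out as XML with no editor involved. The
`to_xml` function, the `item` element name for list members, and the
sample record are all assumptions made for the example; dictionary keys
are assumed to be valid XML names.

```python
import xml.etree.ElementTree as ET


def to_xml(tag, value):
    """Serialise nested dicts, lists and scalars into an Element."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        # Each key becomes a child element named after the key.
        for key, child in value.items():
            element.append(to_xml(key, child))
    elif isinstance(value, list):
        # List members have no name of their own, so use a fixed tag.
        for item in value:
            element.append(to_xml("item", item))
    else:
        element.text = str(value)
    return element


record = {"name": "Widget", "price": 9.5, "tags": ["small", "blue"]}
xml_text = ET.tostring(to_xml("product", record), encoding="unicode")
```

Getting the data back out is the mirror image of the same walk, which is
why no consultant is needed when the program owns both ends.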

 > Perl+SGML/Omnimark was not cool so people with the expertise were
 > expensive.

People seem to be flocking to Perl+XML and Java+XML quite rapidly, XSL
aside.  The biggest cost of Omnimark was not the purchase price
(although $12K or more per seat [per annum, I think] would give most
programmers reason to pause), but the cost of maintenance.

A lot of people know Perl and Java but almost no one knows Omnimark,
and maintaining a system that relies heavily on scripts written in an
esoteric and virtually-unknown language can be extremely difficult and
painfully expensive, when it's possible at all.

 > One of the hardest things with XML *or* SGML is making usable user
 > interfaces. XML doesn't make it any easier. In fact it retains some of the
 > SGML features that can do the most damage to an intuitive user interface
 > (consider internal entities in attributes).

Gotta be pragmatic here -- the editing software can declare that
attributes contain text, period, end of discussion.  On import, the
editor can simply expand the entities and then forget about their
boundaries.  I think that most users can live with that.
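
The "expand and forget" behaviour can be seen with an ordinary parser.
The following is a minimal sketch using Python's ElementTree; the `co`
entity and the `memo` document are invented for the example.

```python
import xml.etree.ElementTree as ET

# A document with an internal entity used in an attribute and in content.
SOURCE = """<!DOCTYPE memo [
  <!ENTITY co "Acme Corp">
]>
<memo title="&co; quarterly report">Prepared by &co;.</memo>"""

memo = ET.fromstring(SOURCE)

# The parser has already expanded the entity; the application sees text.
title = memo.get("title")
body = memo.text

# Writing the tree back out yields plain text: the entity boundaries
# (and the entity declaration itself) are gone.
round_trip = ET.tostring(memo, encoding="unicode")
```

An editor built on such a parser gets this behaviour for free; keeping the
entity references intact is what would take extra work.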

 > I'm surprised that you wouldn't allow the programmer who builds the
 > intermediate transformations to also build the intermediate DTDs. I
 > consider the DTDs to be part of the specification for what the program
 > does.

The problem is simply one of matching up skills -- it's hard to find
people who can write transformations (in any language), and it's hard
to find people who can write DTDs; it is painfully difficult to find
people who can do both.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From kurt.donath at lmco.com  Fri Mar 19 19:40:58 1999
From: kurt.donath at lmco.com (Kurt Donath)
Date: Mon Jun  7 17:10:12 2004
Subject: DTD Question: Attributes vs Elements
Message-ID: <36F2A693.BD5FECA0@lmco.com>


What are the criteria for selecting when to define data as an attribute
or element in a DTD?  

Simon says attributes are an "excellent tool for passing along extra
information about your element to an automated processor - a parser, a
browser, or a conversion tool.   They are NOT a good place to actually
store data".  Simon continues, "generally, you should use attributes to
store information that may not be useful to humans directly but may help
computers process the element properly."

Would anyone like to add or differ with this?

Kurt Donath

-- 
Kurt Donath
Lockheed Martin - Enterprise Information Systems
Systems Engineering / Webserv
315.456.6276
. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .



From tbray at textuality.com  Fri Mar 19 19:56:43 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:12 2004
Subject: DTD Question: Attributes vs Elements
Message-ID: <3.0.32.19990319115542.00c2b330@pop.intergate.bc.ca>

At 02:33 PM 3/19/99 -0500, Kurt Donath wrote:
>
>What are the criteria for selecting when to define data as an attribute
>or element in a DTD?  

Personal taste.  This is a religious issue.  Attributes differ in that
- they get white-space-normalized
- their order is not significant
- you can't have more than 1 with the same name on an element
- they can't contain any internal structure

If these things matter, your decision is made.  If not, make it yourself.
Lesson: software writers have to be able/willing to pull information
out of either elements or attributes. -Tim
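
Two of the differences listed above, and the closing lesson, can be
checked with a few lines of Python. This is a sketch only: the `field`
helper and the `book` documents are hypothetical, and ElementTree is just
one convenient parser to try it with.

```python
import xml.etree.ElementTree as ET


def field(element, name, default=None):
    """Return a value stored either as an attribute or as a child element."""
    if name in element.attrib:
        return element.attrib[name]
    child = element.find(name)
    if child is not None:
        return (child.text or "").strip()
    return default


# The same information, once as an attribute and once as an element.
as_attribute = ET.fromstring('<book title="XML: A Primer"/>')
as_element = ET.fromstring("<book><title>XML: A Primer</title></book>")

# Attribute values are white-space-normalized: the line break inside
# the quoted value reaches the application as a single space.
normalized = ET.fromstring('<book title="XML:\nA Primer"/>').get("title")

# Two attributes with the same name on one element are rejected.
try:
    ET.fromstring('<book title="a" title="b"/>')
    duplicate_allowed = True
except ET.ParseError:
    duplicate_allowed = False
```

Code that reads through a helper like `field` does not care which side of
the religious divide the DTD author was on.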



From paul at prescod.net  Fri Mar 19 20:24:34 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:12 2004
Subject: DTD Question: Attributes vs Elements
References: <36F2A693.BD5FECA0@lmco.com>
Message-ID: <36F2B150.9B7812DE@prescod.net>

Kurt Donath wrote:
> 
> What are the criteria for selecting when to define data as an attribute
> or element in a DTD?

This will probably help:
http://www.oasis-open.org/cover/topics.html#elementsAndAttrs

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"A year ago, when Ernest Pecounis said he wanted to bring
Linux into the state agency he works for, there was a swell of
laughter from his colleagues. Guess who's laughing now."
 - http://www.zdnet.com/pcweek/stories/news/0,4153,393443,00.html



From simonstl at simonstl.com  Fri Mar 19 20:34:23 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:13 2004
Subject: DTD Question: Attributes vs Elements
In-Reply-To: <36F2A693.BD5FECA0@lmco.com>
Message-ID: <199903192033.PAA15213@hesketh.net>

At 02:33 PM 3/19/99 -0500, Kurt Donath wrote:
>
>What are the criteria for selecting when to define data as an attribute
>or element in a DTD?  
>
>Simon says attributes are an "excellent tool for passing along extra
>information about your element to an automated processor - a parser, a
>browser, or a conversion tool.   They are NOT a good place to actually
>store data".  Simon continues, "generally, you should use attributes to
>store information that may not be useful to humans directly but may help
>computers process the element properly."
>
>Would anyone like to add or differ with this?

I'll just clarify a little bit.  In my (highly religious and thoroughly
debatable) viewpoint, attributes are pretty much like annotations.  In a
few cases - empty elements in particular, where the attributes are the only
'real' content - attributes do have a direct impact on the content.
Nonetheless, I'm much happier seeing attributes as a description of the
element - more a metadata role - than providing content.

If you use XML in a display environment that isn't oriented toward
transformation, say CSS, the implications are pretty simple: users will see
the element content directly, if you let them, while the attribute content
is used to generate the presentation, not the content.  From my
perspective, elements contain 'first class' textual content for humans,
while attributes contain information that people will typically access
directly only through an editor.  

In other environments, attributes may seem more or less worthy of storing
content.  From a programmer's perspective, elements and attributes are just
different implementations of about the same thing, and RDF's many syntaxes
make this quite clear. I've definitely softened my strong opinions about
this dichotomy, though I certainly know what style I prefer when I go to
write my own DTDs.


Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From martind at netfolder.com  Fri Mar 19 20:40:41 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:13 2004
Subject: DTD Question: Attributes vs Elements
In-Reply-To: <36F2A693.BD5FECA0@lmco.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAGELECPAA.martind@netfolder.com>

Hi,

<YourComment>
What are the criteria for selecting when to define data as an attribute
or element in a DTD?

Simon says attributes are an "excellent tool for passing along extra
information about your element to an automated processor - a parser, a
browser, or a conversion tool.   They are NOT a good place to actually
store data".  Simon continues, "generally, you should use attributes to
store information that may not be useful to humans directly but may help
computers process the element properly."

Would anyone like to add or differ with this?
</YourComment>

<Reply>
To bring some order to this, I created a mental model. Let's imagine that
an XML document is an object tree. At each node, the object is associated
with a set of properties (picture a tree of objects, each with a bag of
properties attached to it). One property is more useful to humans than to
machines: the data content, or what's in between the markup.

The current problem with XML is that the same hammer is used for all kinds
of work, like feeding the cat, making the toast, etc. :-). In the beginning,
markup was used to bring structure to unstructured information: to add
metadata, so to speak, or to say more about what kind of information this
is. Not for humans, because we can extract the meaning from the sentences
themselves, but for machines that are too young to speak human language :-)
So, with markup, unstructured information became more structured and
machines could extract some meaning from a text. The markup was for
machines and the text for humans. But, as you know, never leave a kid with
a hammer in a house, because everything becomes a nail :-) So, with time,
XML became the hope for the next Esperanto, the day of deliverance from
Microsoft hegemony, a release from having to learn 50 computer languages.
And we started to use it for other purposes, some of them not for humans
at all, like the exchange of data between machines. In that case, the human
vs. machine distinction is quite confusing (for machines too; they prefer
machine language and don't understand why humans are so convoluted :-).

In conclusion: if the information that you encapsulate in an XML substrate
is knowledge or information for human consumption, what Simon said stands
and is a good rule of thumb. If your application is machine information
exchange and you have to deal with firewalls and diverse platforms, then
you are probably constrained to use something like XML or an object
middleware that is "firewall proof". In this case, the markup content can
be seen as a serialized variable or property and is not necessarily for
humans. So we just did some conceptual extension. I tell you, never leave a
child with a hammer; everything becomes a nail :-))
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From paul at prescod.net  Fri Mar 19 21:01:57 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:13 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org>
Message-ID: <36F2B790.DBE75CC3@prescod.net>

Chris Lilley wrote:
> 
> You know what they used to say about SGML; it's asymmetric. Getting the
> data in just takes a text editor, but getting it out again requires a
> consultant. Well, with XML, the effort to get some benefit from XML is
> reduced because of economy of scale - someone somewhere will have the
> dtd you want to do part of your job. Build what you want from a kit of
> parts that other people wrote; add a little glue, and off you go.

It doesn't work that way.

A DTD is a reflection of an organization's business model. It varies from
organization to organization. You can't directly use someone else's UML
model or their DTD. It certainly does help to be able to use someone
else's as a starting point. It is also useful to use industry standard
DTDs for interchange (after some form of mapping or translation).

DTDs can and should be shared, but you should expect to make
customizations for every organization. That in turn requires customization
of all of the software that works with the DTD. Customization is easier
than starting from scratch but the problem is that it may touch every part
of the system: 

 * document types
 * document type documentation
 * editor customizations
 * metadata query GUIs
 * data entry GUIs
 * navigation GUIs
 * output specifications (all of them!)

That isn't glue anymore. It's a major project (though less major than
starting from scratch). And unfortunately it involves specialized
knowledge which will usually mean consulting. Most technical publications
departments have a very thin technical staff and they aren't going to
become experts in all of these areas.

In other words, XML is as asymmetric as SGML. Actually neither is really
very asymmetric because you can't (well, shouldn't) get data into them
before you have designed your document type. So input and output are both
pretty difficult if you compare them, say, to Microsoft Word which is
usually the benchmark people use to demonstrate how hard SGML systems are
to build.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From simonstl at simonstl.com  Fri Mar 19 21:24:11 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:13 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F2B790.DBE75CC3@prescod.net>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net>
 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org>
Message-ID: <199903192123.QAA16896@hesketh.net>

At 02:46 PM 3/19/99 -0600, Paul Prescod wrote:
>In other words, XML is as asymmetric as SGML. Actually neither is really
>very asymmetric because you can't (well, shouldn't) get data into them
>before you have designed your document type. So input and output are both
>pretty difficult if you compare them, say, to Microsoft Word which is
>usually the benchmark people use to demonstrate how hard SGML systems are
>to build.

Ah, but if MS Word had a simple "Save-To-XML" option that let users save
their documents using markup based on the styles they've built.  Three
times now, I've seen organizations that had done a lot of very good
informal work with Word styles, and no easy path for those structures or
the documents that use them to move to XML.  I guess the incentive just
isn't there for MS to make life easy.  There are tools to do it, but it's
still not much fun.  (Another painful case of asymmetry.)

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From paul at prescod.net  Sat Mar 20 00:00:40 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:13 2004
Subject: Save-To-XML
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	 <36F0CFF4.365B@hiwaay.net>
	 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
	 <36F13992.150D05F9@w3.org> <199903192123.QAA16896@hesketh.net>
Message-ID: <36F2E218.937C9B71@prescod.net>

"Simon St.Laurent" wrote:
> 
> Ah, but if MS Word had a simple "Save-To-XML" option that let users save
> their documents using markup based on the styles they've built.  

I was thinking about this last week. Someone could build this relatively
easily on top of the Office 2000 save as XML and the MSHTML DLL. 

> Three
> times now, I've seen organizations that had done a lot of very good
> informal work with Word styles, and no easy path for those structures or
> the documents that use them to move to XML.  I guess the incentive just
> isn't there for MS to make life easy.  There are tools to do it, but it's
> still not much fun.  (Another painful case of asymmetry.)

Even if the tool to do it was a "Save-To-XML" option it would still be not
much fun. 

After all, the goal is not to get it into any-old-XML (that's easy) but to
get it into "our vocabulary". That's the harder part. There are tricky
problems about setting up division structure, converting tables to a
particular table model, cross-references to a particular linking model and
so forth. In the end it is a transformation job no matter how you slice
it. And even then you will likely have to do many manual fix-ups unless
the writers are Zen monks.
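
To make the "transformation job" concrete, here is a minimal sketch in
Python. Everything in it is an assumption for illustration: the style
names, the target element names and the (style, text) input shape are
hypothetical and do not come from Word or from any real vocabulary. It
only shows the shape of the problem: a mapping table covers the tidy
cases, and whatever the writers styled by hand lands in a manual fix-up
pile.

```python
from xml.sax.saxutils import escape

# Hypothetical mapping from Word style names to "our vocabulary".
STYLE_TO_ELEMENT = {
    "Heading 1": "title",
    "Body Text": "para",
    "Warning": "caution",
}


def convert(paragraphs):
    """Convert (style, text) pairs to markup; collect what cannot be mapped."""
    converted, needs_fixup = [], []
    for style, text in paragraphs:
        element = STYLE_TO_ELEMENT.get(style)
        if element is None:
            # No rule for this style: leave it for a human to sort out.
            needs_fixup.append((style, text))
            continue
        converted.append(f"<{element}>{escape(text)}</{element}>")
    return converted, needs_fixup
```

Division structure, table models and cross-references are deliberately
left out; those are the parts that a flat lookup table cannot handle.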

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From srn at techno.com  Sat Mar 20 00:01:39 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:13 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F233F4.5CC629EA@mecomnet.de> (message from james anderson on
	Fri, 19 Mar 1999 12:24:39 +0100)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <199903182155.PAA01202@bruno.techno.com> <36F233F4.5CC629EA@mecomnet.de>
Message-ID: <199903192331.RAA03429@bruno.techno.com>

[James Anderson:]

> > ...  Namespaces (at least the bulk of their
> > syntax and the idea of identifying a namespace via a URI)
> > could also be used to do architectural forms.

> I've wondered about this.
> Wouldn't they, at least in their standardized form, be restricted to the
> limited family of architectures which are mutually exclusive?

I don't [yet] see why.  What makes you think so?

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From jjc at jclark.com  Sat Mar 20 02:51:31 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:10:13 2004
Subject: ModSAX: Proposed Core Features
References: <7BA102761CAED111B27E00805FBB72333FAE4C@arrowhead.int.westgroup.com>
Message-ID: <36F30239.DB7F5414@jclark.com>

"Duffy, Bruce" wrote:
> 
> Hi folks,
> 
> One feature I'd really like to see is a Locator.getByteOffset()
> method.  Obviously this feature would have to be optional, since not
> all XML inputs are indexable files.
> 
> James Clark's non-SAX API for XP implements this method for startElement(),
> but not for the characters() callback, which unfortunately is exactly what
> I need it for.  I could hack XP or another parser, but I'd much rather work
> within the context of SAX.

XP makes this available via SAX by subclassing the SAX Locator class. 
Just cast the Locator object to com.jclark.xml.sax.Locator.  If you
want to standardize this in SAX, I would suggest adding a Locator2 with
additional methods.
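
The pattern described above (an extended locator that callers opt into)
can be sketched as follows. This is an illustration only, written in
Python against the standard xml.sax Locator rather than in Java; the
class name ByteOffsetLocator is hypothetical, getByteOffset() is the
method name from Bruce's request, and none of this is XP's actual code.

```python
from xml.sax.xmlreader import Locator


class ByteOffsetLocator(Locator):
    """Hypothetical "Locator2": a Locator with one extra, optional method."""

    def __init__(self, byte_offset=-1):
        self._byte_offset = byte_offset

    def getByteOffset(self):
        # -1 means "not available", as with the base Locator's line/column.
        return self._byte_offset


def byte_offset_of(locator):
    """Return the byte offset if this locator provides one, else None."""
    # The isinstance check plays the role of the Java cast: plain SAX
    # locators still work, they just do not offer the extra method.
    if isinstance(locator, ByteOffsetLocator):
        return locator.getByteOffset()
    return None
```

A handler that needs offsets calls byte_offset_of() on whatever locator
it was given and falls back gracefully when the parser cannot supply one.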

James




From dsgraham at gte.net  Sat Mar 20 04:25:51 1999
From: dsgraham at gte.net (Doug Graham)
Date: Mon Jun  7 17:10:13 2004
Subject: Unsubscibing
Message-ID: <00cf01be7289$192da3a0$0201a8c0@pro200>

I bet you guys get a lot of messages asking how to unsubscribe.  And, I'd
bet you dislike those messages as much as I do.  So, now I have to ask.
Please accept my apology for doing this.

I've been trying to unsubscribe for a while now and I keep getting the same error message back:

>>>> This is a multi-part message in MIME format.
**** Command 'this' not recognized.
>>>> 
>>>> ------=_NextPart_000_007B_01BE723F.349C7ED0
END OF COMMANDS


I've tried five different messages in the body:

unsubscribe xml-dev Douglas Graham <dsgraham@gte.net>
unsubscribe xml-dev Douglas Graham
unsubscribe xml-dev dsgraham@gte.net
unsubscribe xml-dev
(un)subscribe xml-dev 

BTW I send the messages to majordomo@ic.ac.uk and I don't put 'this' anywhere in my message.

Any suggestions on how to get off this list?  

Any help appreciated,
Doug
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990320/369792ad/attachment.htm
From jamesr at steptwo.com.au  Sat Mar 20 06:24:27 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:13 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <14066.21253.117700.991361@localhost.localdomain>
References: <36F1BBE8.A3AB13EB@prescod.net>
 <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net>
 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org>
 <36F18209.8C68524@allette.com.au>
 <36F19AC0.B5B40B20@goon.stg.brown.edu>
 <36F1BBE8.A3AB13EB@prescod.net>
Message-ID: <4.1.19990320162057.00cd19f0@steptwo.com.au>

At 23:53 19/03/1999 , David Megginson wrote:

  | Almost correct.  One expensive disadvantage of SGML (until WebSGML) is
  | that it requires full DTD conformance at every stage of production; as
  | a result, if your production chain consists of ten physical steps,
  | writing out SGML at each stage, you *must* have DTDs for all of the
  | intermediate steps.  This one constraint can add $100K or more to a
  | large enterprise SGML project, since DTD writers are expensive to hire
  | (and a single, configured DTD becomes heavily obfuscated so that it
  | can almost never be maintained in-house).

I'd like to take this point up.

Having done a transformation with 12 steps in the chain,
I would disagree that the DTD becomes a big problem.

If you're using a tool like Omnimark, you can choose
to view the document as SGML or text. In general, a
lot of the steps use the latter, with some simple
find-and-replace regexps.

So in practice, we just ended up with about 5 DTDs
that were very close to each other.

Not a lot of work.

And it did ensure that all was correct at
each stage, catching errors in a most satisfying
way.

Cheers,

James

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From andrew at squiz.co.nz  Sat Mar 20 09:14:50 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:10:13 2004
Subject: XML complexity, namespaces (was WG) 
In-Reply-To: Your message of "Fri, 19 Mar 1999 16:26:33 CDT."
             <199903192123.QAA16896@hesketh.net> 
Message-ID: <199903200052.NAA00988@aniwa.sky>

> At 02:46 PM 3/19/99 -0600, Paul Prescod wrote:
> >In other words, XML is as asymmetric as SGML. Actually neither is really
> >very asymmetric because you can't (well, shouldn't) get data into them
> >before you have designed your document type. So input and output are both
> >pretty difficult if you compare them, say, to Microsoft Word which is
> >usually the benchmark people use to demonstrate how hard SGML systems are
> >to build.
> 
> Ah, but if MS Word had a simple "Save-To-XML" option that let users save
> their documents using markup based on the styles they've built.  Three
> times now, I've seen organizations that had done a lot of very good
> informal work with Word styles, and no easy path for those structures or
> the documents that use them to move to XML.  I guess the incentive just
> isn't there for MS to make life easy.  There are tools to do it, but it's
> still not much fun.  (Another painful case of asymmetry.)
> 
> Simon St.Laurent

Word does have a "Save to RTF" which looks like it could be useful as an intermediate step.

Andrew McNaughton


-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From andrew at squiz.co.nz  Sat Mar 20 09:15:17 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:10:13 2004
Subject: Parsing XML->DOM and XSL querying optimising 
In-Reply-To: Your message of "Fri, 19 Mar 1999 06:35:09 -0000."
             <199903190636.GAA00207@cheviot.ncl.ac.uk> 
Message-ID: <199903200028.NAA00816@aniwa.sky>

> Hi all,
> 
> I have an application that consists of 140+ XML documents, roughly 100k
> bytes each that I want to be able to query (using XSL pattern matching at
> present) and output to XML/HTML and RTF format. This will happen in real
> time (if at all possible).
> 
> Additionally, I'd like to be able to search/query the entire repository of
> documents and return a composite XML/HTM or RTF document from these.
> 
> At the moment, I'm experimenting with the DOM parser in Python and finding
> that a DOM parse takes about 4 seconds, whilst an XSL query takes about 1.8
> seconds.
> 
> I reckon that a user could wait the 1.8 seconds for a query, but might start
> to get fidgety after almost 6 seconds (how transient we are!).

I'm using sgrep.  I can run a query returning 2000 fragments out of a 25,000 
document, 100Mb collection in around 2 seconds.  Queries returning less text 
are significantly faster.

sgrep pre-indexes the document collection, which takes me around 15 minutes.  
There's no way to update indexes other than rebuilding from scratch, so it has 
limits for rapidly changing databases.  I'm getting by with concatenations of 
searches of indexes of sections of my total database, meaning that updates 
aren't as large as they might be.

sgrep's parser doesn't care about well-formedness, but if your documents are 
well formed it behaves correctly.

Its queries are based on containment, and have no facility for things like 
'get me the nth occurrence of X'.  Some workarounds can be managed with 
recursion, but if you expect to do a lot of this stuff it may not be for you. 
I noticed recently that Tim Bray was proposing a query language with these 
same limitations because it provides for more efficient processing.
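
For readers who have not met containment queries, here is a rough sketch
of the idea in Python. It is not sgrep's syntax or its implementation;
the function names are made up, and it assumes the regions of one kind do
not nest. A query is just a filter of one set of (start, end) regions by
another, which is also why "the nth occurrence" has no natural place in
the model.

```python
def regions(text, start_tag, end_tag):
    """Return (start, end) offsets of non-nested start_tag ... end_tag spans."""
    found, pos = [], 0
    while True:
        start = text.find(start_tag, pos)
        if start == -1:
            break
        end = text.find(end_tag, start)
        if end == -1:
            break
        end += len(end_tag)
        found.append((start, end))
        pos = end
    return found


def contained_in(inner, outer):
    """Keep the inner regions that lie entirely inside some outer region."""
    return [
        r for r in inner
        if any(o[0] <= r[0] and r[1] <= o[1] for o in outer)
    ]
```

For example, contained_in(regions(t, "<t>", "</t>"), regions(t, "<a>",
"</a>")) keeps only the <t> spans that sit inside an <a> span.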


> What strategies have people got for limiting the DOM parsing time?

In perl there are various tools (eg Storable.pm) for dumping perl data 
hierarchies in a binary form, which could be done with pre-parsed DOM data.  
For my application though just the time required to load the DOM module's code 
is a problem.
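
The same idea in Python, as a minimal sketch: parse once, flatten the
tree into plain tuples, and dump those in binary form with the standard
pickle module so later runs can skip the parse. The tuple layout and the
function names are assumptions chosen for illustration, not part of any
DOM API, and text that follows a child element (tail text) is ignored for
brevity.

```python
import pickle
import xml.etree.ElementTree as ET


def to_plain(elem):
    """Flatten an element into (tag, attributes, text, children) tuples."""
    return (
        elem.tag,
        dict(elem.attrib),
        elem.text or "",
        [to_plain(child) for child in elem],
    )


def dump_parsed(xml_text):
    """Parse once and return a binary blob that can be written to disk."""
    return pickle.dumps(to_plain(ET.fromstring(xml_text)))


def load_parsed(blob):
    """Rebuild the plain-tuple tree without running the XML parser."""
    return pickle.loads(blob)
```

Whether this is actually faster than re-parsing depends on the parser and
the documents, so it is worth timing both before committing to it.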


> My own thoughts are that I load up all 140 documents at server-startup time,
> parse them into DOM[0]...DOM[139], store them into memory and then query
> each one in turn in the case of a simple query, and query all the DOM
> objects in the case of a full query across all XML documents.
> 
> Is this sensible? practical? stupid?

DOM operations in perl typically involve inefficient linear searches.  I'm not
sure whether this is implicit in the DOM or is implementation dependent.  At
least in perl, the DOM is good for manipulation, but not particularly
efficient for simple extraction of data.
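
For what it is worth, the load-everything-at-startup strategy asked about
above looks roughly like this with Python's standard minidom. The two
inline documents, the element name "title" and the function names are
placeholders for illustration; a real server would read its 140 files
from disk.

```python
from xml.dom.minidom import parseString

# Placeholder documents standing in for the real files.
SOURCES = {
    "a.xml": "<doc><title>First</title></doc>",
    "b.xml": "<doc><title>Second</title><title>Extra</title></doc>",
}


def load_all(sources):
    """Parse every document once, at startup, and keep the DOMs in memory."""
    return {name: parseString(text) for name, text in sources.items()}


def titles(dom):
    """A stand-in query: the text of every <title> element in one document."""
    return [node.firstChild.data for node in dom.getElementsByTagName("title")]


def query_all(doms):
    """Run the same query across the whole in-memory repository."""
    return {name: titles(dom) for name, dom in doms.items()}


DOMS = load_all(SOURCES)
```

This pays the parse cost once, but each query is still a walk over the
tree, so the linear-search concern above applies to the query step.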

Andrew McNaughton


-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From jarle.stabell at dokpro.uio.no  Sat Mar 20 11:37:49 1999
From: jarle.stabell at dokpro.uio.no (Jarle Stabell)
Date: Mon Jun  7 17:10:13 2004
Subject: DTD Question: Attributes vs Elements
Message-ID: <01BE72CF.9E022510.jarle.stabell@dokpro.uio.no>

Tim Bray wrote:
> Attributes differ in  that 
> - they get white-space-normalized
> - their order is not significant
> - you can't have more than 1 with the same name on an element
> - they can't contain any internal structure

They also don't "pollute" the namespace.
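
Three of the differences in Tim's list are easy to check with a parser.
The sketch below uses Python's standard ElementTree (expat underneath);
the helper names attrs() and parses() are made up for the example.

```python
import xml.etree.ElementTree as ET


def attrs(xml_text):
    """Return the attributes of the root element as a dict."""
    return dict(ET.fromstring(xml_text).attrib)


def parses(xml_text):
    """True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Attribute order is not significant: both spellings give the same mapping.
same = attrs('<e a="1" b="2"/>') == attrs('<e b="2" a="1"/>')

# Two attributes with one name are an error; two child elements are fine.
dup_attr_ok = parses('<e a="1" a="2"/>')
dup_elem_ok = parses('<e><a>1</a><a>2</a></e>')

# A literal newline in an attribute value is normalized to a space,
# while the same newline in element content is kept.
attr_value = attrs('<e a="x\ny"/>')["a"]
elem_text = ET.fromstring('<e>x\ny</e>').text
```

The fourth point (no internal structure) follows from the grammar: a "<"
inside an attribute value is itself a well-formedness error.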

Cheers,
Jarle




From Mark.Birbeck at iedigital.net  Sat Mar 20 13:11:58 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun  7 17:10:13 2004
Subject: Unsubscibing
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054AA0@SOHOS002>

Maybe the fact that your email is HTML is throwing things out. I don't
know how the system works, so I'm not sure about that, but it's worth
trying just a straight text message.
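
If your mail client will not send plain text, a script can. This is a
minimal sketch using Python's standard email package; the sender address
is a placeholder, and actually sending the message (for example through
smtplib) is left out because it depends on your mail server.

```python
from email.message import EmailMessage


def build_unsubscribe(sender):
    """Build a single-part text/plain message holding one majordomo command."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "majordomo@ic.ac.uk"
    # set_content() with a str produces a text/plain body, not multipart,
    # so majordomo should see the command as the first line of the body.
    msg.set_content("unsubscribe xml-dev\n")
    return msg
```

Printing str(build_unsubscribe("you@example.org")) shows exactly what
would go over the wire, with no MIME boundary lines for majordomo to trip
over.
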
Mark

-----Original Message-----
From: Doug Graham 
Sent: 20 March 1999 04:21
To: xml-dev@ic.ac.uk
Subject: Unsubscibing


I bet you guys get a lot of messages asking how to unsubscribe. And,
I'd bet you dislike those messages as much as I do. So, now I have to
ask. Please accept my apology for doing this.

I've been trying to unsubscribe for a while now and I keep getting the
same error message back:

>>>> This is a multi-part message in MIME format.
**** Command 'this' not recognized.
>>>> 
>>>> ------=_NextPart_000_007B_01BE723F.349C7ED0
END OF COMMANDS


I've tried five different messages in the body:

unsubscribe xml-dev Douglas Graham <dsgraham@gte.net>
unsubscribe xml-dev Douglas Graham
unsubscribe xml-dev dsgraham@gte.net
unsubscribe xml-dev
(un)subscribe xml-dev 

BTW I send the messages to majordomo@ic.ac.uk and I don't put 'this'
anywhere in my message.

Any suggestions on how to get off this list?

Any help appreciated,
Doug




From david at megginson.com  Sat Mar 20 15:16:02 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:13 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <4.1.19990320162057.00cd19f0@steptwo.com.au>
References: <36F1BBE8.A3AB13EB@prescod.net>
	<002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F19AC0.B5B40B20@goon.stg.brown.edu>
	<14066.21253.117700.991361@localhost.localdomain>
	<4.1.19990320162057.00cd19f0@steptwo.com.au>
Message-ID: <14067.47347.332566.447160@localhost.localdomain>

James Robertson writes:

 > Having done a transformation with 12 steps in the chain,
 > I would disagree that the DTD becomes a big problem.
 > 
 > If you're using a tool like Omnimark, 

... or Perl, etc. ...

 > you can choose to view the document as SGML or text. In general, a
 > lot of the steps use the latter, with some simple find-and-replace
 > regexps.

That's a kludge that works when you have end-to-end lexical control of
the SGML, but not otherwise.  You have to know for certain (for
example) that you won't have to deal with pathological cases like

<!-- <em> --></em>

or

<foo
>

Fine-grained lexical control is rare in major enterprise projects,
where people might be using five different tools and several different
software systems to produce the SGML that feeds into the front end.
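
Both pathological cases are easy to reproduce. The sketch below is in
Python rather than Omnimark or Perl, and the patterns and function names
are made up for the example. Note that the "fixed" versions are still
lexical patches: real SGML comment and tag syntax has more corners than
these regexps cover, which is exactly the problem.

```python
import re

NAIVE_EM = re.compile(r"<em>")
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
NAIVE_FOO = re.compile(r"<foo>")
TOLERANT_FOO = re.compile(r"<foo\s*>")


def naive_count(text):
    """Count <em> start tags by pattern alone, comments included."""
    return len(NAIVE_EM.findall(text))


def comment_aware_count(text):
    """Strip simple comments first, then count <em> start tags."""
    return len(NAIVE_EM.findall(COMMENT.sub("", text)))


commented = "<!-- <em> --></em>"
split_tag = "<foo\n>"
```

Here naive_count(commented) reports a start tag that a parser would never
see, and NAIVE_FOO.search(split_tag) misses a tag that a parser would
see.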

 > So in practice, we just ended up with about 5 DTDs that were very
 > close to each other.
 > 
 > Not a lot of work.

It depends, again, on the complexity of the project.  If there are,
say, a project manager, three UI specialists, a sysadmin, a DBA, ten
software engineers working on the DB and transformations, five DTD
consultants (with a DTD co-ordinator), and two publishing specialists
working in the chain, the difficulties of co-ordinating even small
changes become near exponential, especially if the team is scattered
across the continent (as is common in large enterprises).

It can be done (I know from my own experience), but it's quite
different from a situation where you and a couple of associates
control all of the parts of the chain yourselves, and the original
SGML's requirement for a DTD makes the problem that much harder.

 > And it did ensure that all was correct at each stage, catching
 > errors in a most satisfying way.

Yes, that can be a great advantage, but like anything, it requires a
cost/benefit analysis.  If I ask a customer "Do you want DTD
validation at every stage", she'll say yes; if I ask her "Are you
willing to risk ending up paying an extra US$200,000 to have DTD
validation at every stage", she might hesitate (remember, everyone's
time costs money, not just the DTD designer's).  

Maybe she'll still say yes, depending on her requirements and budget,
but at least with XML (and WebSGML) the choice isn't forced on her.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From simpson at polaris.net  Sat Mar 20 16:25:49 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <14067.47347.332566.447160@localhost.localdomain>
References: <4.1.19990320162057.00cd19f0@steptwo.com.au>
 <36F1BBE8.A3AB13EB@prescod.net>
 <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net>
 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org>
 <36F18209.8C68524@allette.com.au>
 <36F19AC0.B5B40B20@goon.stg.brown.edu>
 <14066.21253.117700.991361@localhost.localdomain>
 <4.1.19990320162057.00cd19f0@steptwo.com.au>
Message-ID: <3.0.5.32.19990320112542.00866e60@nexus.polaris.net>

At 10:16 AM 3/20/99 -0500, David Megginson wrote:
>If I ask a customer "Do you want DTD
>validation at every stage", she'll say yes; if I ask her "Are you
>willing to risk ending up paying an extra US$200,000 to have DTD
>validation at every stage", she might hesitate (remember, everyone's
>time costs money, not just the DTD designer's).  

Aye. One can do anything, but some things are not worth doing.

==========================================================
John E. Simpson            | The secret of eternal youth
simpson@polaris.net        | is arrested development.
http://www.flixml.org      |  -- Alice Roosevelt Longworth



From Patrice.Bonhomme at loria.fr  Sat Mar 20 16:54:06 1999
From: Patrice.Bonhomme at loria.fr (Patrice Bonhomme)
Date: Mon Jun  7 17:10:14 2004
Subject: RDF DTD ?
Message-ID: <199903201653.RAA19628@chimay.loria.fr>

Hi,

I am looking for an RDF DTD. Why is there a "Formal Grammar" for RDF and not 
a DTD?

Thanks,

Pat.

-- 
  ==============================================================
  bonhomme@loria.fr               |      Office : B.228
  http://www.loria.fr/~bonhomme   |      Phone  : 03 83 59 30 52
  --------------------------------------------------------------
   * Serveur Silfide  : http://www.loria.fr/projets/Silfide
  ==============================================================




From Daniel.Brickley at bristol.ac.uk  Sat Mar 20 18:39:56 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:10:14 2004
Subject: RDF DTD ?
In-Reply-To: <199903201653.RAA19628@chimay.loria.fr>
Message-ID: <Pine.GHP.4.02A.9903201833140.24884-100000@mail.ilrt.bris.ac.uk>

On Sat, 20 Mar 1999, Patrice Bonhomme wrote:
> I am looking for an RDF DTD. Why is there a "Formal Grammar" for RDF and not 
> a DTD?

There is no DTD for the RDF syntax because RDF data is likely to be
embedded in a variety of different document types. Also the RDF
syntax uses a variety of element and attribute names to represent
intermingled classes, properties and resources drawing on multiple
independently defined vocabularies. This is pretty much impossible to
anticipate in a single monolithic "RDF DTD". 

Dan


--
Daniel.Brickley@bristol.ac.uk                  
Institute for Learning and Research Technology http://www.ilrt.bris.ac.uk/
University of Bristol,  Bristol BS8 1TN, UK.   phone:+44(0)117-9287096




From jamesr at steptwo.com.au  Sun Mar 21 00:26:34 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <14067.47347.332566.447160@localhost.localdomain>
References: <4.1.19990320162057.00cd19f0@steptwo.com.au>
 <36F1BBE8.A3AB13EB@prescod.net>
 <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
 <36F0CFF4.365B@hiwaay.net>
 <36F10CFC.CFEB89A8@goon.stg.brown.edu>
 <36F13992.150D05F9@w3.org>
 <36F18209.8C68524@allette.com.au>
 <36F19AC0.B5B40B20@goon.stg.brown.edu>
 <14066.21253.117700.991361@localhost.localdomain>
 <4.1.19990320162057.00cd19f0@steptwo.com.au>
Message-ID: <4.1.19990321102248.00baff00@steptwo.com.au>

At 01:16 21/03/1999 , David Megginson wrote:

  |  > So in practice, we just ended up with about 5 DTDs that were very
  |  > close to each other.
  |  > 
  |  > Not a lot of work.
  | 
  | It depends, again, on the complexity of the project.  If there are,
  | say, a project manager, three UI specialists, a sysadmin, a DBA, ten
  | software engineers working on the DB and transformations, five DTD
  | consultants (with a DTD co-ordinator), and two publishing specialists
  | working in the chain, the difficulties of co-ordinating even small
  | changes become near exponential, especially if the team is scattered
  | across the continent (as is common in large enterprises).
  | 
  | It can be done (I know from my own experience), but it's quite
  | different from a situation where you and a couple of associates
  | control all of the parts of the chain yourselves, and the original
  | SGML's requirement for a DTD makes the problem that much harder.

Agreed.

But aren't you just saying: "very big jobs are a lot of work,
and are very complex"?

Wouldn't this be true if you were implementing DB solutions,
or three-tier architectures, etc?

In other words, why is XML any different here than SGML
(which was the original question)?

J

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From bckman at ix.netcom.com  Sun Mar 21 01:35:12 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:14 2004
Subject: DTD Question: Attributes vs Elements
Message-ID: <005401be733a$d6ae5800$a3acdccf@ix.netcom.com>

>At 02:33 PM 3/19/99 -0500, Kurt Donath wrote:
>>
>>What is the criteria for selecting when to define data as an attribute
>>or element in a DTD?


It always struck me that the only valid reason for using an attribute and
not an element is that you can force the writer to add an attribute value,
whereas you can't force them to add element content.

Mind you, you can't force them to add correct content!

As everyone else has pointed out, the rest is religion!

Frank

Frank Boumphrey

XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor:  XML applications from Wrox Press, www.wrox.com
Author: Using XML on the Web (july)




From david at megginson.com  Sun Mar 21 02:38:35 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <4.1.19990321102248.00baff00@steptwo.com.au>
References: <4.1.19990320162057.00cd19f0@steptwo.com.au>
	<36F1BBE8.A3AB13EB@prescod.net>
	<002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F19AC0.B5B40B20@goon.stg.brown.edu>
	<14066.21253.117700.991361@localhost.localdomain>
	<14067.47347.332566.447160@localhost.localdomain>
	<4.1.19990321102248.00baff00@steptwo.com.au>
Message-ID: <14068.22702.384640.143448@localhost.localdomain>

James Robertson writes:

 > But aren't you just saying: "very big jobs are a lot of work,
 > and are very complex"?
 > 
 > Wouldn't this be true if you were implementing DB solutions,
 > or three-tier architectures, etc?
 > 
 > In other words, why is XML any different here than SGML
 > (which was the original question)?

SGML requires DTD validation, while XML and WebSGML do not.

With XML (or WebSGML), it is not necessary for the DTD designer(s) to
track the internals of the different steps in the transformation
chain, since DTD validation at each step is not required.  For
example, consider this relatively simple production chain:

1. Original document (internal document type)
2. Transform #1 -- add references from database
3. Transform #2 -- remove elements that are configured out for the
                   current production run (for security reasons,
                   customer preferences, etc.)
4. Transform #3 -- add boilerplate
5. Transform #4 -- convert specialised tables to CALS tables
6. Transform #5 -- add revision markup
7. Transform #6 -- transform to industry-standard document type for
                   exchange

You probably have a DTD for (1), and you almost certainly have an
industry-standard DTD for (7), but the results of intermediate
transformations #1-#5 may not be valid according to either DTD.  

In the SGML world, someone had to create one or more variant DTDs to
ensure that the document was valid at each stage (either through
elaborate and obfuscatory configuration with parameter entities and
marked sections, or by writing separate DTDs from scratch, both often
with liberal sprinklings of ANY).  If the production engineers made
even minor changes in the system design, the DTDs would (usually)
immediately break validation and shut down the whole system -- in
other words, the system was brittle and expensive to maintain.

Sometimes, the benefit of tight validation at each step makes the
extra expense worthwhile, but as I mentioned before, XML lets you make
that decision rather than making it for you.
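
As a rough sketch of what one DTD-less step in such a chain might look like
(something like Transform #2 above), the following parses an intermediate
document for well-formedness only and drops the elements that are
configured out. The "audience" attribute and its "internal" value are
assumptions made up for this example, not part of any real document type.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PipelineStep {
    // Removes every element marked audience="internal" (a hypothetical
    // convention) and returns the serialized result.
    static String dropInternal(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(false); // the default: well-formedness only
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        // Collect first: the NodeList is live and changes as nodes go away.
        NodeList all = doc.getElementsByTagName("*");
        List<Element> doomed = new ArrayList<>();
        for (int i = 0; i < all.getLength(); i++) {
            Element e = (Element) all.item(i);
            if ("internal".equals(e.getAttribute("audience"))) {
                doomed.add(e);
            }
        }
        for (Element e : doomed) {
            if (e.getParentNode() != null) {
                e.getParentNode().removeChild(e);
            }
        }

        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        serializer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(dropInternal(
            "<doc><p>keep</p><p audience='internal'>drop</p></doc>"));
    }
}
```

Nothing in this step knows or cares whether its input or output matches the
DTD at either end of the chain; validation can be switched on only where it
earns its keep.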


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From marcelo at mds.rmit.edu.au  Sun Mar 21 02:52:38 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:14 2004
Subject: DTD Question: Attributes vs Elements
In-Reply-To: <01BE72CF.9E022510.jarle.stabell@dokpro.uio.no>; from Jarle Stabell on Sat, Mar 20, 1999 at 12:46:03PM +0100
References: <01BE72CF.9E022510.jarle.stabell@dokpro.uio.no>
Message-ID: <19990321135222.E29582@io.mds.rmit.edu.au>

On Sat, Mar 20, 1999 at 12:46:03PM +0100, Jarle Stabell wrote:
> Tim Bray wrote:
> > Attributes differ in  that 
> > - they get white-space-normalized
> > - their order is not significant
> > - you can't have more than 1 with the same name on an element
> > - they can't contain any internal structure
> 
> They also don't "pollute" the namespace.

This could probably be formalised into three categories: data that
_must_ be stored in an attribute, data that _cannot_ be stored in an
attribute, and everything else.  For example:

MUST:   * Enumerated values (status can be active, inactive or
          closed).

CANNOT: * Structured information
        * White space matters
        * Order is significant
        * More than one item for any one name

OTHER:  * General string values
        * URLs

Maybe the wisdom of experience could subdivide OTHER into SHOULD,
SHOULDN'T and DOESN'T MATTER (which could address the namespace
pollution issue).
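
Two of the CANNOT entries are easy to see with a parser. This is a small
sketch using the JAXP DOM API; the "item" element and "note" attribute are
invented for the example. A line break inside an attribute value is
normalized to a space, the same line break in element content is kept, and
a repeated attribute name is a well-formedness error.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class AttrVsElement {
    // Parses without a DTD and returns the root element.
    static Element parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
    }

    // True if the text parses; the parser may also log the error to stderr.
    static boolean isWellFormed(String xml) throws Exception {
        try {
            parse(xml);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        Element item = parse(
            "<item note='line one\nline two'>line one\nline two</item>");
        // The attribute value has lost its line break; the content has not.
        System.out.println("[" + item.getAttribute("note") + "]");
        System.out.println("[" + item.getTextContent() + "]");
        // Two attributes with the same name are rejected outright.
        System.out.println(isWellFormed("<item a='1' a='2'/>"));
    }
}
```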


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From david at megginson.com  Sun Mar 21 02:58:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:14 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <14068.24150.843634.988657@localhost.localdomain>

Now that we have the name down pat and a good idea of how we'll be
setting up SAX2 (including the core features, properties, and extended 
handlers), it's time to dive into the details.

We discussed the lexical handler before, but I have lost my earlier
drafts, so I've reinvented it from memory -- sincere apologies to
anyone whose brilliant suggestions I have lost.

This handler has the handlerID http://xml.org/sax/handlers/lexical, and
you would set it like this:

  try {
    parser.setHandler("http://xml.org/sax/handlers/lexical", handler);
  } catch (SAXNotSupportedException e) {
    // do something clever ...
  }

I've included some additional details after the interface.  Note that
this interface (and all other new SAX2 interfaces) will be optional -- 
a parser will not have to implement it for SAX conformance, and may
choose to use only parts of it even if it does implement it (it might
report comments but not CDATA section boundaries, for example) as long 
as it does report matching start/end pairs.

====================8<====================8<====================
// LexicalHandler.java
// $Id: LexicalHandler.java,v 1.1 1999/03/21 02:49:41 david Exp $
// SAX2 handlerID: http://xml.org/sax/handlers/lexical

package org.xml.sax;

public interface LexicalHandler
{
    public abstract void xmlDecl (String version,
				  String encoding,
				  String standalone)
	throws SAXException;

    public abstract void startDTD (String doctype,
				   String publicID,
				   String systemID)
	throws SAXException;

    public abstract void endDTD ()
	throws SAXException;

    public abstract void startEntity (String name)
	throws SAXException;

    public abstract void endEntity (String name)
	throws SAXException;

    public abstract void comment (String text)
	throws SAXException;

    public abstract void startCDATA ()
	throws SAXException;

    public abstract void endCDATA ()
	throws SAXException;
}

// end of LexicalHandler.java
====================8<====================8<====================

Notes:

1. The startDTD() and endDTD() methods surround everything within the
   DOCTYPE declaration, including the internal and external subsets
   and any declarations, comments, PI's, etc. within.

2. The startEntity() method will be used for the external DTD subset
   (if the parser reads it), with the pseudo-entity name '[dtd]'.
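
For discussion, here is a sketch of an application-side handler for the
draft interface above that checks the "matching start/end pairs"
requirement. The interface is redeclared locally under the hypothetical
name DraftLexicalHandler only so that the example compiles on its own;
PairCheckingHandler and its bookkeeping are likewise invented for
illustration and are not part of the proposal.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import org.xml.sax.SAXException;

// Local copy of the draft interface, renamed to avoid implying it is final.
interface DraftLexicalHandler {
    void xmlDecl(String version, String encoding, String standalone) throws SAXException;
    void startDTD(String doctype, String publicID, String systemID) throws SAXException;
    void endDTD() throws SAXException;
    void startEntity(String name) throws SAXException;
    void endEntity(String name) throws SAXException;
    void comment(String text) throws SAXException;
    void startCDATA() throws SAXException;
    void endCDATA() throws SAXException;
}

final class PairCheckingHandler implements DraftLexicalHandler {
    private final Deque<String> open = new ArrayDeque<>(); // innermost first
    int comments = 0;

    // Pops the innermost open construct and complains if it is not the one
    // being closed.
    private void close(String expected) throws SAXException {
        String top = open.poll(); // null when nothing is open
        if (!expected.equals(top)) {
            throw new SAXException(
                "unmatched end: expected " + expected + " but open was " + top);
        }
    }

    public void xmlDecl(String version, String encoding, String standalone) { }
    public void startDTD(String doctype, String publicID, String systemID) { open.push("DTD"); }
    public void endDTD() throws SAXException { close("DTD"); }
    public void startEntity(String name) { open.push("entity:" + name); }
    public void endEntity(String name) throws SAXException { close("entity:" + name); }
    public void comment(String text) { comments++; }
    public void startCDATA() { open.push("CDATA"); }
    public void endCDATA() throws SAXException { close("CDATA"); }

    boolean balanced() { return open.isEmpty(); }
}
```

A parser that reads the external subset would, per note 2, produce a
sequence such as startDTD, startEntity("[dtd]"), ..., endEntity("[dtd]"),
endDTD, which this handler accepts; an endCDATA with no matching
startCDATA raises a SAXException.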

Now, is this overkill?  Do we really need to know about CDATA sections 
and the XML declaration?  Comments, please.


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From marcelo at mds.rmit.edu.au  Sun Mar 21 03:22:28 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F1BBE8.A3AB13EB@prescod.net>; from Paul Prescod on Thu, Mar 18, 1999 at 08:52:24PM -0600
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu> <36F1BBE8.A3AB13EB@prescod.net>
Message-ID: <19990321142205.F29582@io.mds.rmit.edu.au>

On Thu, Mar 18, 1999 at 08:52:24PM -0600, Paul Prescod wrote:
> Richard Goerwitz wrote:
> > 
> > I come from a small shop that does a lot of SGML work.  Trust me:  SGML
> > is complex and intractable.  
> 
> <RANT>
> This is way off topic but I must admit that these characterizations really
> annoy me.
> 
> I can only speak anecdotally: I started using SGML while working for a
> professor of English as an undergrad. A single programmer (not me) wrote a
> pretty sophisticated application that converted SGML to HTML and RTF in a
> couple of months -- almost exactly the same amount of time it would take
> to do the same for XML. The process was almost identical too: you use a
> parser from James Clark, pump the data into your favorite scripting
> language and output it in the other language. The complexity of the input
> syntax was and is irrelevant to solving that problem.
> 
> If we were doing that now it would be much, much easier because we would
> use Jade. That proves that technology improves and it becomes easier to do
> hard things over time which is pretty much unrelated to the distinction
> between SGML and XML.

There is a very well known phrase: "Nothing is worth more than what
people will pay for it."  Paul here is essentially saying the same
thing: "Nothing is more complex than the amount of effort it takes to
build and use it."

The perceived complexity of SGML is not dependent on how complex it is
to implement an SGML parser, since one already exists.  What matters
is how complex it is to use.  If one were to insist on considering the
underlying technology in determining the complexity one would be
forced to concede that a hello world program written in C is
enormously complex since it involves compilers, file systems, advanced
virtual memory architectures, windowing systems, possibly network
based windowing protocols such as X, virtual machines, OS kernels and
much, much more, in order to convert the contents of a C source file
into a pattern of light and dark phosphors on a CRT.

[The obvious caveat is if one cannot use the available technology for
whatever reasons.  In this case, implementing the subsystem is clearly
part of the cost.  In the case of SP, however, very few people will
have reason to do their own work: SP is free and usable in commercial
software.]


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From paul at prescod.net  Sun Mar 21 04:10:14 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu> <36F1BBE8.A3AB13EB@prescod.net> <19990321142205.F29582@io.mds.rmit.edu.au>
Message-ID: <36F46DF4.2C8F993E@prescod.net>

Marcelo Cantos wrote:
> 
> The perceived complexity of SGML is not dependent on how complex it is
> to implement an SGML parser, since one already exists.  What matters
> is how complex it is to use.  If one were to insist on considering the
> underlying technology in determining the complexity one would be
> forced to concede that a hello world program written in C is
> enormously complex since it involves compilers, file systems, advanced
> virtual memory architectures, windowing systems, possibly network
> based windowing protocols such as X, virtual machines, OS kernels and
> much, much more, in order to convert the contents of a C source file
> into a pattern of light and dark phosphors on a CRT.

Thank you Marcelo. You've said it wonderfully.

I also want to point out that I am not excusing SGML's syntactic
complexity nor arguing that it was a good thing. From a PR perspective
alone it was a disaster. If I hadn't had problems to solve that only SGML
could solve I would probably have run away after reading the specification
myself. After all, I'm the guy who avoids Perl because of its same
syntactic complexity and context sensitivity.

I just reject the argument that it was difficult ("intractable") to
*use*. You fired up Emacs and SP and it was about as difficult to type and
process as XML. That makes me wonder: perhaps there is an Emacs mode that
will make Perl as easy to read as Python.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From paul at prescod.net  Sun Mar 21 04:23:31 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
References: <4.1.19990320162057.00cd19f0@steptwo.com.au>
		<36F1BBE8.A3AB13EB@prescod.net>
		<002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
		<36F0CFF4.365B@hiwaay.net>
		<36F10CFC.CFEB89A8@goon.stg.brown.edu>
		<36F13992.150D05F9@w3.org>
		<36F18209.8C68524@allette.com.au>
		<36F19AC0.B5B40B20@goon.stg.brown.edu>
		<14066.21253.117700.991361@localhost.localdomain>
		<14067.47347.332566.447160@localhost.localdomain>
		<4.1.19990321102248.00baff00@steptwo.com.au> <14068.22702.384640.143448@localhost.localdomain>
Message-ID: <36F472EE.B075CDD4@prescod.net>

David Megginson wrote:
> 
> In the SGML world, someone had to create one or more variant DTDs to
> ensure that the document was valid at each stage (... often
> with liberal sprinklings of ANY).  

What's wrong with using ANY for all content models instead of going
DTD-less? Then the only thing you need to be careful of is that elements
you create have a corresponding type in the DTD. 

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From marcelo at mds.rmit.edu.au  Sun Mar 21 06:06:56 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:14 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <199903200052.NAA00988@aniwa.sky>; from Andrew McNaughton on Sat, Mar 20, 1999 at 01:52:46PM +1300
References: <199903192123.QAA16896@hesketh.net> <199903200052.NAA00988@aniwa.sky>
Message-ID: <19990321170641.G29582@io.mds.rmit.edu.au>

On Sat, Mar 20, 1999 at 01:52:46PM +1300, Andrew McNaughton wrote:
> > At 02:46 PM 3/19/99 -0600, Paul Prescod wrote:
> > >In other words, XML is as asymmetric as SGML. Actually neither is
> > >really very asymmetric because you can't (well, shouldn't) get
> > >data into them before you have designed your document type. So
> > >input and output are both pretty difficult if you compare them,
> > >say, to Microsoft Word which is usually the benchmark people use
> > >to demonstrate how hard SGML systems are to build.
> > 
> > Ah, but if MS Word had a simple "Save-To-XML" option that let
> > users save their documents using markup based on the styles
> > they've built.  Three times now, I've seen organizations that had
> > done a lot of very good informal work with Word styles, and no
> > easy path for those structures or the documents that use them to
> > move to XML.  I guess the incentive just isn't there for MS to
> > make life easy.  There are tools to do it, but it's still not much
> > fun.  (Another painful case of asymmetry.)
> > 
> > Simon St.Laurent
> 
> Word does have a "Save to RTF" which looks like it could be useful
> as an intermediate step.

It is, in part.  We have developed an extensive legislation management
package for the Tasmanian government (an Australian state, for those
who don't know).  The drafters use Word with a collection of macros to
create the appropriately constrained styles and formatting for export
to RTF.  The RTF is then translated into an SGML document using custom
code.

It is an ugly solution, and a far cry from the utopia of a mainstream
SGML editor, but it works, and it is only ugly for the implementors --
the users love it (more accurately, the users are oblivious to the
ugliness; I have no idea whether they love it or not).


Cheers,
Marcelo

P.S.: For those interested in legislation, the Tasmanian legislation
system is, AFAIK, the first body of government law in the world to go
online.  EnAct (the product we built on top of SIM) is enshrined in
law as the official source of legislation, it is not a mere copy of
some hard copy.  Better still, anyone on the web can query it.  It
also supports point-in-time queries (e.g. you can query on the law as
it was two years ago).  Go to http://www.thelaw.tas.gov.au/ to have a
go at it.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From marcelo at mds.rmit.edu.au  Sun Mar 21 06:26:31 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:15 2004
Subject: Save-To-XML
In-Reply-To: <36F2E218.937C9B71@prescod.net>; from Paul Prescod on Fri, Mar 19, 1999 at 05:47:36PM -0600
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <199903192123.QAA16896@hesketh.net> <36F2E218.937C9B71@prescod.net>
Message-ID: <19990321172616.H29582@io.mds.rmit.edu.au>

On Fri, Mar 19, 1999 at 05:47:36PM -0600, Paul Prescod wrote:
> "Simon St.Laurent" wrote:
> > 
> > Ah, but if MS Word had a simple "Save-To-XML" option that let users save
> > their documents using markup based on the styles they've built.  
> 
> I was thinking about this last week. Someone could build this relatively
> easily on top of the Office 2000 save as XML and the MSHTML DLL. 
> 
> > Three
> > times now, I've seen organizations that had done a lot of very good
> > informal work with Word styles, and no easy path for those structures or
> > the documents that use them to move to XML.  I guess the incentive just
> > isn't there for MS to make life easy.  There are tools to do it, but it's
> > still not much fun.  (Another painful case of asymmetry.)
> 
> Even if the tool to do it was a "Save-To-XML" option it would still be not
> much fun. 
> 
> After all, the goal is not to get it into any-old-XML (that's easy) but to
> get it into "our vocabulary". That's the harder part. There are tricky
> problems about setting up division structure, converting tables to a
> particular table model, cross-references to a particular linking model and
> so forth. In the end it is a transformation job no matter how you slice
> it. And even then you will likely have to do many manual fix-ups unless
> the writers are Zen monks.

The real problem lies with the flat nature of RTF.  Paragraphs are not
children of sections.  They are siblings of level 1 headings.  The
first pass always involves inferring structure from the sequence of
styles.  This stage cannot be avoided, though of course it can be
rolled together with other passes.
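
A minimal sketch of that first pass, assuming the flat input has already
been reduced to (style name, text) pairs. The style names "Heading N" and
the output element names "section", "title" and "p" are assumptions for
the example; this is not how EnAct does it.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class StructureInference {
    // Turns a flat run of styled paragraphs into nested sections.
    // Each entry is { styleName, text }. Text is not escaped here.
    static String toNested(List<String[]> paragraphs) {
        StringBuilder out = new StringBuilder();
        Deque<Integer> open = new ArrayDeque<>(); // levels of open sections
        for (String[] para : paragraphs) {
            String style = para[0];
            String text = para[1];
            if (style.startsWith("Heading ")) {
                int level = Integer.parseInt(style.substring("Heading ".length()));
                // A heading closes every open section at its level or deeper.
                while (!open.isEmpty() && open.peek() >= level) {
                    out.append("</section>");
                    open.pop();
                }
                out.append("<section><title>").append(text).append("</title>");
                open.push(level);
            } else {
                // Body text becomes a child of the innermost open section.
                out.append("<p>").append(text).append("</p>");
            }
        }
        while (!open.isEmpty()) {
            out.append("</section>");
            open.pop();
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toNested(Arrays.asList(
            new String[] { "Heading 1", "Part 1" },
            new String[] { "Normal", "Intro" },
            new String[] { "Heading 2", "Division 1" },
            new String[] { "Normal", "Text" },
            new String[] { "Heading 1", "Part 2" })));
    }
}
```

The stack is what recovers the parent/child relationships that the flat
sequence only implies; real input needs far more rules than this (tables,
lists, skipped heading levels, and writers who are not Zen monks).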

For EnAct (a legislation management system built on top of SIM) we
developed a configuration file approach that defines how to perform
such inference.  It works vaguely like a yacc grammar, though I know
almost nothing more about that particular aspect of EnAct.  We are
currently looking at a completely generic import filter to do
essentially the same thing (EnAct is targeted at legislation).


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From david at megginson.com  Sun Mar 21 11:21:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:15 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F472EE.B075CDD4@prescod.net>
References: <4.1.19990320162057.00cd19f0@steptwo.com.au>
	<36F1BBE8.A3AB13EB@prescod.net>
	<002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F19AC0.B5B40B20@goon.stg.brown.edu>
	<14066.21253.117700.991361@localhost.localdomain>
	<14067.47347.332566.447160@localhost.localdomain>
	<4.1.19990321102248.00baff00@steptwo.com.au>
	<14068.22702.384640.143448@localhost.localdomain>
	<36F472EE.B075CDD4@prescod.net>
Message-ID: <14068.54604.691791.511850@localhost.localdomain>

Paul Prescod writes:

 > What's wrong with using ANY for all content models instead of going
 > DTD-less? Then the only thing you need to be careful of is that
 > elements you create have a corresponding type in the DTD.

1. It still doesn't help with attributes.
2. You still have to define all of the element types.
3. You still have to maintain (a) variant DTD(s).

I agree that this is one partial work-around for SGML's insistence on
DTD validation, but it's hard to argue that XML and WebSGML do not
make things much easier (and less expensive) by not forcing you to
jump through these hoops -- in the case of ANY, in particular, you're
no longer doing any real validation (except for my caveats mentioned
above), but you're still incurring cost to satisfy SGML's
validation requirement (to the letter, but not to the spirit).
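
Points 1 and 2 can be checked with any validating parser. The sketch below
uses the JAXP API with an internal subset that gives every element type
the content model ANY; the element and attribute names are made up for the
example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

final class AnyContentModel {
    // A DTD in which everything declared may contain anything declared.
    static final String DTD =
        "<!DOCTYPE doc [<!ELEMENT doc ANY><!ELEMENT p ANY>]>";

    // Parses with validation on and returns the validity errors reported.
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true);
        DocumentBuilder builder = factory.newDocumentBuilder();
        final List<String> errors = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            public void warning(SAXParseException e) { }
            public void error(SAXParseException e) { errors.add(e.getMessage()); }
            public void fatalError(SAXParseException e) throws SAXException { throw e; }
        });
        builder.parse(new InputSource(new StringReader(xml)));
        return errors;
    }

    public static void main(String[] args) throws Exception {
        // Arbitrary nesting of declared elements is fine ...
        System.out.println(validate(DTD + "<doc><p>a<p>b</p></p></doc>"));
        // ... but an undeclared attribute is still a validity error ...
        System.out.println(validate(DTD + "<doc><p rev='2'>a</p></doc>"));
        // ... and so is an element type nobody added to the DTD.
        System.out.println(validate(DTD + "<doc><note>a</note></doc>"));
    }
}
```

So a transformation that adds one new attribute or element type still
sends someone back to edit the DTD, even though the DTD constrains almost
nothing.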


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From ricko at allette.com.au  Sun Mar 21 13:30:34 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:15 2004
Subject: DOM Implemetation in C?
Message-ID: <004801be739f$4b5a5c80$17f96d8c@NT.JELLIFFE.COM.AU>


>I've checked through everything at the w3c, as well as Robin Cover's
>list... does anyone know where I might find an implementation of the
>DOM (any level will do) written in C?  If there isn't one, is anyone
>else interested?

I was working on one, but we have put that project on hold. If you are
interested, I think I can send you a version of the .h file for DOM, but
not any of the implementation files unfortunately. (Our experimental DOM
was being optimised for read-only Chinese documents...in any case, we
had dropped our C requirement, and if it comes to life again we will
move to C++.)

There is a technical problem that CORBA IDL mappings do not (as far as I
can see) provide C mappings to let us know how to create objects, but it
seems that DOM (or, at least, DOM users) require object creation and
finalization. That being the case, DOM cannot have a completely portable
C interface.  (I would love to be wrong in this...any IDL->C gurus
around?)

Rick Jelliffe




From ricko at allette.com.au  Sun Mar 21 13:54:39 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:15 2004
Subject: DTD Question: Attributes vs Elements
Message-ID: <008801be73a2$a8efda20$17f96d8c@NT.JELLIFFE.COM.AU>


From: Frank Boumphrey <bckman@ix.netcom.com>

>It always struck me that the only valid reason for using an attribute
>and not an element is that you can force the writer to add an
>attribute value, whereas you can't force them to add element content.

>As everyone else has pointed out, the rest is religion!

Lou Burnard, of TEI, has said that a DTD is a theory about a document
(one of my favorite thoughts).

What a DTD-writer has decided is an attribute or element belongs to this
theory. Whether something is an element or attribute reveals, to people
downstream, the DTD-writer's concept of how that information relates to
the total element. This is very far from religion, but is part of the
information modeling.

The question shouldn't be "does an attribute node behave differently in
a parsed document to an element node?", which is what many people seem
to reduce things to. The answer is pretty much "No". The better
question is "Why does the DTD writer think that this is an attribute and
not a part?"

The question is somewhat muddied in that complex attributes (the element
has an IDREF to some other elements somewhere else which contain nested
elements, which give the "attribute" values) are not as conveniently
specified as complex element content.

Rick Jelliffe




From ricko at allette.com.au  Sun Mar 21 13:58:20 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:15 2004
Subject: XML complexity, namespaces (was WG)
Message-ID: <009901be73a3$2b40e7d0$17f96d8c@NT.JELLIFFE.COM.AU>

 From: Marcelo Cantos <marcelo@mds.rmit.edu.au>

>It is, in part.  We have developed an extensive legislation management
>package for the Tasmanian government (an Australian state, for those
>who don't know).

Many people will be familiar with Tasmania from the excellent series of
animated documentaries from the brothers Warner.

Rick Jelliffe




From b.laforge at jxml.com  Sun Mar 21 14:16:52 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:15 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <005101be73a6$325a59e0$c8a8a8c0@thing1>

From: David Megginson <david@megginson.com>

>Now, is this overkill?  

No, it's beautiful. I'm particularly delighted with access to the 
doctype and comments. An application which strips out
comments when updating a document always seemed
unforgivably rude!

>Do we really need to know about CDATA sections 


Debatable perhaps, but supported by the DOM. (Anyone know why?)
But I'd really like to see better SAX/DOM integration, so Yes!

>and the XML declaration?

I think this could be helpful when updating a document, the encoding
in particular. (An update should make the minimum necessary changes.)
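A minimal sketch of what access to comments and CDATA boundaries looks
like to an application. It assumes the LexicalHandler interface as it
was eventually published in SAX2 (org.xml.sax.ext), not the v.1.1 draft
discussed in this thread, and it uses the JAXP bootstrap that ships with
the JDK. The class name LexicalEvents is made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records the lexical events an updating tool
// would need in order to write comments and CDATA sections back out.
class LexicalEvents extends DefaultHandler2 {
    final List<String> events = new ArrayList<>();

    @Override
    public void comment(char[] ch, int start, int length) {
        events.add("comment:" + new String(ch, start, length));
    }

    @Override
    public void startCDATA() {
        events.add("startCDATA");
    }

    @Override
    public void endCDATA() {
        events.add("endCDATA");
    }

    static List<String> collect(String xml) throws Exception {
        XMLReader reader =
            SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        LexicalEvents handler = new LexicalEvents();
        reader.setContentHandler(handler);
        // Lexical events are opt-in: the handler is registered as a property.
        reader.setProperty(
            "http://xml.org/sax/properties/lexical-handler", handler);
        reader.parse(new InputSource(new StringReader(xml)));
        return handler.events;
    }
}
```

An application that ignores these callbacks loses nothing; one that
rewrites documents can use them to put comments and CDATA markers back
where it found them.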

Bill




From rja at arpsolutions.demon.co.uk  Sun Mar 21 14:43:42 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:10:15 2004
Subject: DOM Implemetation in C?
Message-ID: <000601be73a8$ea556ec0$4a5eedc1@arp01>

I think there are some DOM C mappings on my old web site
http://www.arpsolutions.demon.co.uk

Next week (check around Friday) there *will* be a DOM C/C++ implementation
at http://www.vivid-creations.com

The DOM stuff will be chargeable, but there should also be some freebie
stuff (new SAX control etc.)

Cheers,

Richard A.

-----Original Message-----
From: Rick Jelliffe <ricko@allette.com.au>
To: XML Dev <xml-dev@ic.ac.uk>
Date: Sunday, March 21, 1999 1:31 PM
Subject: Re: DOM Implemetation in C?


>
>>I've checked through everything at the w3c, as well as Robin Cover's
>>list... does anyone know where I might find an implementation of the
>>DOM (any level will do) written in C?  If there isn't one, is anyone else
>>interested?
>
>I was working on one, but we have put that project on hold. If you are
>interested, I think I can send you a version of the .h file for DOM, but
>not any of the implementation files unfortunately. (Our experimental DOM
>was being optimised for read-only Chinese documents...in any case, we
>had dropped our C requirement, and if it comes to life again we will
>move to C++.)
>
>There is a technical problem that CORBA IDL mappings do not (as far as I
>can see) provide C mappings to let us know how to create objects, but it
>seems that DOM (or, at least, DOM users) require object creation and
>finalization. That being the case, DOM cannot have a completely portable
>C interface.  (I would love to be wrong in this...any IDL->C gurus
>around?)
>
>Rick Jelliffe
>
>
>




From murata at apsdc.ksp.fujixerox.co.jp  Sun Mar 21 15:20:38 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:10:15 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp>

I believe that IE 5.0 does not conform to RFC2376 (XML Media Types), 
of which I am a co-author.

As for the XML media type "text/xml", the charset parameter in the 
MIME header is authoritative.  Encoding declarations have to be ignored 
so that transcoding is possible.

However, IE 5.0 appears to always ignore the charset parameter and use 
the BOM or encoding declaration only.  Therefore, IE 5.0 does not conform to
RFC 2376.

Proof: I made a UTF-8 XML document that also parses when it is assumed to be 
Shift_JIS.  Then I supplied the correct charset parameter "utf-8" 
in the MIME header by configuring Apache, and put an encoding declaration of 
"Shift_JIS" in the XML document.  Such a mismatch is perfectly legal and 
common when proxies perform code conversion.  I tried this document with IE 5.0.  
Incorrect characters were displayed.  Q.E.D.

FYI:  

When the charset parameter is not specified, US-ASCII is assumed.  The 
column "Setting Your Server Up for XML" (http://www.xml.com/1999/03/ie5/first-x.xml) 
considers this case only.

If you are using Apache and overriding by AddType is allowed, you only have to 
create a file named .htaccess in your directory and write a line as below:

	AddType "text/xml;  charset=utf-8"    xml
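A minimal sketch of the rule described above for "text/xml": the charset
parameter of the MIME header decides, the encoding declaration inside
the document is not consulted, and US-ASCII applies when the parameter
is absent. The class and method names are made up for illustration, and
the header parsing is deliberately simplistic (no comments, no escaped
quotes).

```java
import java.util.Locale;

// Hypothetical helper, not taken from any real HTTP or XML library.
final class XmlCharset {
    private XmlCharset() {}

    // Returns the encoding a receiver should use for a text/xml entity,
    // given only the Content-Type header value.
    static String effectiveCharset(String contentType) {
        String[] parts = contentType.split(";");
        String mediaType = parts[0].trim().toLowerCase(Locale.ROOT);
        if (!mediaType.equals("text/xml")) {
            throw new IllegalArgumentException(
                "this sketch only covers text/xml: " + mediaType);
        }
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq > 0
                    && param.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2
                        && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value;
            }
        }
        // No charset parameter: the default for text/xml is US-ASCII.
        return "us-ascii";
    }
}
```

With the header produced by the AddType line above, the function returns
"utf-8" no matter what the document's own encoding declaration says.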

Cheers,

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp



From larsga at ifi.uio.no  Sun Mar 21 17:14:27 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:15 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <14068.24150.843634.988657@localhost.localdomain>
References: <14068.24150.843634.988657@localhost.localdomain>
Message-ID: <wku2velq8q.fsf@ifi.uio.no>


* David Megginson
|
|     public abstract void xmlDecl (String version,
| 				  String encoding,
| 				  String standalone)
| 	throws SAXException;

Should we perhaps make standalone a boolean instead?  It can only have
two values anyway, and this will spare us a lot of
standalone.equals(this or that).
 
|     public abstract void startDTD (String doctype,
| 				   String publicID,
| 				   String systemID)
| 	throws SAXException;

I think naming doctype docelem or rootelem would be better. It took me
a couple of seconds to figure out what it meant.
 
|     public abstract void startEntity (String name)
| 	throws SAXException;

Is this sufficient? Now we don't even know whether it's internal or
external. I know EntityResolver can be used to get that information
(including sysid and pubid), but I'd much rather see it included here
as well, since IMHO EntityResolver fills a separate role from the data
handlers. Often one would want to plug in a separate component there.

How about this?

      public abstract void startEntity (String name, String publicID,
                                        String systemID)
        throws SAXException;
 
If systemID is null we know it is an external entity. Alternatively,
we could have a separate callback for external entities.

--Lars M.




From larsga at ifi.uio.no  Sun Mar 21 17:39:04 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:15 2004
Subject: DOM Implemetation in C?
In-Reply-To: <004801be739f$4b5a5c80$17f96d8c@NT.JELLIFFE.COM.AU>
References: <004801be739f$4b5a5c80$17f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <wksoaylp3k.fsf@ifi.uio.no>


* Rick Jelliffe
| 
| There is a technical problem that CORBA IDL mappings do not (as far
| as I can see) provide C mappings to let us know how to create
| objects, 

This is correct, but I don't see it as a technical problem. IDL only
describes interfaces, and since these may be implemented in a huge
number of ways it is in general impossible to know exactly what data
they need at creation time.

In general I think one should be very careful with declaring
constructors on classes/interfaces that may have more than one
implementation, as it often leads to painful unnecessary
cross-dependencies.

In particular, having constructors on interfaces may cause trouble
when a class implements two different interfaces at the same time.

Java interfaces can't have constructors for (I assume) exactly the
same reasons.

| but it seems that DOM (or, at least, DOM users) require object
| creation and finalization.

Well, the Document interface provides operations to deal with
creation. As for finalization, all CORBA object references have a
'release' operation which can be used to do finalization. Do you think
these are insufficient?
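A minimal sketch of the creation style described above, using the Java
binding of the DOM: nodes are never constructed directly, they are
requested from a Document, which acts as the factory. The JAXP call used
to obtain the first Document is an assumption of this sketch and is not
part of the DOM discussed here; the class name DomFactoryDemo is made up.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class DomFactoryDemo {
    static Document build() throws Exception {
        // Bootstrapping is implementation business; JAXP is one way to do it.
        Document doc = DocumentBuilderFactory.newInstance()
                                             .newDocumentBuilder()
                                             .newDocument();
        // From here on, every node comes from a factory method on Document,
        // so the code never names a concrete implementation class.
        Element root = doc.createElement("greeting");
        root.appendChild(doc.createTextNode("hello"));
        doc.appendChild(root);
        return doc;
    }
}
```

Because no constructor appears anywhere, the same code runs against any
implementation of the interfaces, which is the decoupling argued for
above.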

--Lars M. (not an IDL->C guru :)




From larsga at ifi.uio.no  Sun Mar 21 17:50:44 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:15 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp>
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp>
Message-ID: <wkr9qilok3.fsf@ifi.uio.no>


* MURATA Makoto
|
| However, IE 5.0 appears to always ignore the charset parameter and
| use the BOM or encoding declaration only.  Therefore, IE 5.0 does
| not conform to RFC 2376.

MSIE has been notorious for years for ignoring the Content-type header
field as well, instead using some kind of internal logic to figure
out the format of the resource. It may be worth your while to check
whether this applies to XML as well. (I don't have IE5 installed.)

--Lars M.




From tbray at textuality.com  Sun Mar 21 17:51:20 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:15 2004
Subject: IE5.0 does not conform to RFC2376
Message-ID: <3.0.32.19990321095409.00ea8078@pop.intergate.bc.ca>

At 12:09 AM 3/22/99 +0900, MURATA Makoto wrote:
>I believe that IE 5.0 does not conform to RFC2376 (XML Media Types), 
>of which I am a co-author.

Yeah, well, IE 5.0 also fails to conform to XML 1.0, in a variety
of quite serious ways.  I.e. CDATA sections don't work, entities
stop working if you use CSS (?!?), escaping < with &lt; only works
if the next character isn't a "?".  Oh yes, it also violates the
namespace draft by hardwiring the prefix "html".

Since they rebuilt it from scratch for this release, we could cut
them some slack as standard release 1.0 bugs, but it's hard not to
be a bit irritated, since there has been excellent free software
available since 1997 that does these things correctly.

 -Tim



From uche.ogbuji at fourthought.com  Sun Mar 21 18:16:12 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:10:15 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1 
In-Reply-To: Your message of "Sun, 21 Mar 1999 09:22:05 EST."
             <005101be73a6$325a59e0$c8a8a8c0@thing1> 
Message-ID: <199903211815.LAA02365@malatesta.local>

> >Do we really need to know about CDATA sections 
> 
> Debatable perhaps, but supported by the DOM. (Anyone know why?)
> But I'd really like to see better SAX/DOM integration, so Yes!

Heartily agreed.  No particular XML authority (certainly not the DOM WG) 
seems to give the time of day to anyone who wants a fully standardized path 
through

XML File -> DOM (including transformations) -> XML File

And the DOM WG might be right to sniff that I/O considerations are not in 
their charter, but that does no good to those of us who are trying to use 
XML/DOM on the server side in environments heterogeneous w.r.t. language, 
platform, network interface, etc.

Something that would help us greatly, even though it is likely an unfair 
burden on the thankfully open SAX development process (senatus rei publicae, 
plebis onerum), is event-based support for most aspects of the DOM.

For instance, one cannot construct the DOM DocumentType of a document parsed 
in with SAX 1 because that standard has no events to support the public and 
system IDs of the external DTD subset.  Another example is the ability to set 
isSpecified for Attributes properly, though this is not as crucial.

I've been terribly busy lately, but glancing at the various drafts of SAX 2, 
I'm pretty happy that it will serve my DOM I/O needs, and I hope to go over a 
complete SAX 2 spec more thoroughly to see if I can suggest any areas of 
increased DOM-friendliness.

> >and the XML declaration?
> 
> I think this could be helpful when updating a document, the encoding
> in particular. (An update should make the minimum necessary changes.)

Is it better to just use the processing instruction event?  I know it would be 
nice to specialize the PI parameters, but do we then extend this to other PIs 
that come out of the XML spectrum, such as style-sheet specifiers?

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From b.laforge at jxml.com  Sun Mar 21 18:20:38 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:15 2004
Subject: MDSAX 1.0 Production Release Available
Message-ID: <009d01be73c8$3f715d00$c8a8a8c0@thing1>

http://www.jxml.com/future.html




From david at megginson.com  Sun Mar 21 18:29:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:15 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <wku2velq8q.fsf@ifi.uio.no>
References: <14068.24150.843634.988657@localhost.localdomain>
	<wku2velq8q.fsf@ifi.uio.no>
Message-ID: <14069.14611.23649.780638@localhost.localdomain>

Lars Marius Garshol writes:
 > 
 > * David Megginson
 > |
 > |     public abstract void xmlDecl (String version,
 > | 				  String encoding,
 > | 				  String standalone)
 > | 	throws SAXException;
 > 
 > Should we perhaps make standalone a boolean instead?  It can only have
 > two values anyway, and this will spare us a lot of
 > standalone.equals(this or that).

Duh! <sound effect="loud slap on forehead"/> &hellip; thanks for
catching that one.

 > |     public abstract void startDTD (String doctype,
 > | 				   String publicID,
 > | 				   String systemID)
 > | 	throws SAXException;
 > 
 > I think naming doctype docelem or rootelem would be better. It took me
 > a couple of seconds to figure out what it meant.

Agreed -- I'll rename it to "root".

 > |     public abstract void startEntity (String name)
 > | 	throws SAXException;
 > 
 > Is this sufficient? Now we don't even know whether it's internal or
 > external. I know EntityResolver can be used to get that information
 > (including sysid and pubid), but I'd much rather see it included here
 > as well, since IMHO EntityResolver fills a separate role from the data
 > handlers. Often one would want to plug in a separate component there.

Actually, this is a sin of omission, but of a different kind -- I was
intending to add an extra note saying that entity declarations will be
reported through the DTDDeclHandler, which I'll be posting next.  In
other words, if the parser supports
http://xml.org/sax/handlers/dtd-decl, you'll already have obtained the
information about the entity declaration and all you need here is
the name.

 > How about this?
 > 
 >       public abstract void startEntity (String name, String publicID,
 >                                         String systemID)
 >         throws SAXException;
 >  
 > If systemID is null we know it is an external entity. Alternatively,
 > we could have a separate callback for external entities.

(You mean "If systemID is null we know it is an *internal* entity").
The problem is that a non-validating parser that does not read the
external DTD subset might not have seen the declaration for the
entity, so a null systemID is no guarantee that the entity was
declared internal.
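A minimal sketch of the bookkeeping this implies on the application
side: declarations are recorded as they are reported, and a later
startEntity(name) is answered from that table. A name with no recorded
declaration is kept as a third state rather than being treated as
internal. EntityTable and its methods are made-up names, not part of any
SAX draft.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical application-side table of entity declarations.
class EntityTable {
    enum Kind { INTERNAL, EXTERNAL, UNKNOWN }

    private final Map<String, Kind> declared = new HashMap<>();

    // Called when a declaration handler reports <!ENTITY name "value">.
    void declareInternal(String name) {
        // XML 1.0: the first declaration of an entity is the binding one.
        declared.putIfAbsent(name, Kind.INTERNAL);
    }

    // Called when a declaration handler reports an entity with a system ID.
    void declareExternal(String name, String publicId, String systemId) {
        declared.putIfAbsent(name, Kind.EXTERNAL);
    }

    // Called from startEntity(name): UNKNOWN means the declaration was
    // never seen, e.g. because the external DTD subset was not read.
    Kind kindOf(String name) {
        return declared.getOrDefault(name, Kind.UNKNOWN);
    }
}
```

The UNKNOWN case is the one a bare null systemID cannot express.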


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jjc at jclark.com  Sun Mar 21 18:36:55 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:10:15 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <14068.24150.843634.988657@localhost.localdomain>
Message-ID: <36F505C5.973FA4C3@jclark.com>

David Megginson wrote:

>     public abstract void startEntity (String name)
>         throws SAXException;
> 
>     public abstract void endEntity (String name)
>         throws SAXException;

How does this allow me to find out about entity references in attribute
values (including defaulted attribute values)?

It might be convenient to distinguish parameter entities and general
entities here.  The application can do this itself by maintaining the
appropriate state, but there doesn't seem any advantage in munging
together general and parameter entities.

How would general entity references in default attribute values in
attribute list declarations be handled?

Would all parameter entity references be reported or just those at the
top-level (ie between markup declarations)?

I don't think support for entities can be fully designed without also
considering DTD information.

James




From david at megginson.com  Sun Mar 21 18:45:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:16 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1 
In-Reply-To: <199903211815.LAA02365@malatesta.local>
References: <005101be73a6$325a59e0$c8a8a8c0@thing1>
	<199903211815.LAA02365@malatesta.local>
Message-ID: <14069.15798.52075.214694@localhost.localdomain>

uche.ogbuji@fourthought.com writes:

 > > >and the XML declaration?
 > > 
 > > I think this could be helpful when updating a document, the
 > > encoding in particular. (An update should make the minimum
 > > necessary changes.)
 > 
 > Is it better to just use the processing instruction event?  I know
 > it would be nice to specialize the PI parameters, but do we then
 > extend this to other PIs that come out of the XML spectrum, such as
 > style-sheet specifiers?

They're different beasties.  Any other standardised PI's are still
PI's, while the XML declaration is definitely not a processing
instruction according to the XML 1.0 REC (nor is the encoding
declaration at the beginning of an external parsed entity).
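A minimal sketch that shows the distinction with a present-day SAX
parser (the JDK's bundled one, reached through JAXP, which is an
assumption of this sketch): the XML declaration never reaches the
processing-instruction callback, while a stylesheet PI does. The class
name PiTargets is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: collects the target of every PI the parser reports.
class PiTargets extends DefaultHandler {
    final List<String> targets = new ArrayList<>();

    @Override
    public void processingInstruction(String target, String data) {
        targets.add(target);
    }

    static List<String> collect(String xml) throws Exception {
        PiTargets handler = new PiTargets();
        SAXParserFactory.newInstance().newSAXParser()
            .parse(new InputSource(new StringReader(xml)), handler);
        return handler.targets;
    }
}
```

Feeding it a document that starts with an XML declaration followed by an
xml-stylesheet PI yields a single target, "xml-stylesheet".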


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Sun Mar 21 18:59:04 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:16 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <36F505C5.973FA4C3@jclark.com>
References: <14068.24150.843634.988657@localhost.localdomain>
	<36F505C5.973FA4C3@jclark.com>
Message-ID: <14069.16001.132059.772335@localhost.localdomain>

James Clark writes:

 > David Megginson wrote:
 > 
 > >     public abstract void startEntity (String name)
 > >         throws SAXException;
 > > 
 > >     public abstract void endEntity (String name)
 > >         throws SAXException;
 > 
 > How does this allow me to find out about entity references in attribute
 > values (including defaulted attribute values)?

This is an interesting problem, and I was thinking about it last night 
while I was out walking my dog.  One possible solution is to extend
AttributeList to provide access to this information somehow (as we'll
be extending it, no doubt, to provide access to isSpecified
information).  Here's one alternative:

  public interface AttributeValueHandler
  {
    public abstract void startEntity (String name)
      throws SAXException;
    public abstract void endEntity (String name)
      throws SAXException;
    public abstract void characters (char ch[], int start, int length)
      throws SAXException;
  }

  public interface AttributeValue2 extends AttributeValue
  {
    public abstract boolean isSpecified (String name);
    public abstract void accept (AttributeValueHandler handler)
      throws SAXException;
  }

With this approach, the 99.9% of SAX applications that don't care
about entity boundaries within attribute values can continue to use
getValue() (which returns a literal string), while the others can use
AttributeValueHandler if the parser supports it -- of course, the
parser can throw a SAXNotSupportedException if it thinks that this
whole thing is too pathological.
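A minimal sketch of how an application might consume such a handler. The
AttributeValueHandler interface is re-declared here from the proposal
above so that the example compiles on its own; it is not part of any
released SAX API. ValueRecorder and AttributeValueDemo are made-up
names, and the replay() method merely stands in for a parser walking the
value of an attribute written as title="&copy; 1999", assuming "copy" is
declared with the replacement text "(c)".

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.SAXException;

// Re-declared from the proposal; hypothetical, not a released interface.
interface AttributeValueHandler {
    void startEntity(String name) throws SAXException;
    void endEntity(String name) throws SAXException;
    void characters(char[] ch, int start, int length) throws SAXException;
}

// Rebuilds the literal value while noting which entities contributed to it.
class ValueRecorder implements AttributeValueHandler {
    final StringBuilder value = new StringBuilder();
    final List<String> entities = new ArrayList<>();

    @Override
    public void startEntity(String name) {
        entities.add(name);
    }

    @Override
    public void endEntity(String name) {
        // Nothing to do: only the entity names are of interest here.
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        value.append(ch, start, length);
    }
}

class AttributeValueDemo {
    // Stand-in for a parser's accept(handler) on the attribute value.
    static ValueRecorder replay() throws SAXException {
        ValueRecorder recorder = new ValueRecorder();
        recorder.startEntity("copy");
        emit(recorder, "(c)");
        recorder.endEntity("copy");
        emit(recorder, " 1999");
        return recorder;
    }

    private static void emit(AttributeValueHandler h, String s)
            throws SAXException {
        h.characters(s.toCharArray(), 0, s.length());
    }
}
```

The recorder ends up with the same string getValue() would have
returned, plus the list of entity names that an application preserving
entity references would need.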

 > It might be convenient to distinguish parameter entities and general
 > entities here.  The application can do this itself by maintaining the
 > appropriate state, but there doesn't seem any advantage in munging
 > together general and parameter entities.

There are two alternatives here:

1. Declare that PEs will always have '%' prepended to their names; or
2. add a boolean parameter isParameterEntity.

I'll add (2) for now.

 > How would general entity references in default attribute values in
 > attribute list declarations be handled?

Probably using a method similar to the one I specified above.  Now I
remember why I didn't want to cover DTD-related events in SAX...

 > Would all parameter entity references be reported or just those at the
 > top-level (ie between markup declarations)?

All, I'd imagine -- would there be a good reason for not doing so?

 > I don't think support for entities can be fully designed without also
 > considering DTD information.

Correct -- I apologise for my omitted note from earlier (which Lars
also caught).  I am assuming that there is already a mechanism for
reporting entity declarations.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rja at arpsolutions.demon.co.uk  Sun Mar 21 19:46:04 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:10:16 2004
Subject: DOM Implemetation in C?
Message-ID: <001f01be73d3$26a05cd0$4a5eedc1@arp01>

Hi Tom,

The old site does have an 'old' downloadable toolkit, but it is in a hidden
area.  I can not remember the URL though.

Vivid Creations is still being constructed, hence the lack of much response.
The site should be fully running by late next week, so I'm sure your email
will be answered then.

The toolkit provides a C++ DOM/SAX implementation at a cost of something
like 200 pounds per developer.  The license currently allows for royalty-free
distribution of compiled object code using the toolkit.

Hope that helps for the time being,

Regards,

Richard.
-----Original Message-----
From: Tom Harding <tomh@thinlink.com>
To: Richard Anderson <rja@arpsolutions.demon.co.uk>
Date: Sunday, March 21, 1999 4:09 PM
Subject: Re: DOM Implemetation in C?


>Richard,
>
>Thanks.  All I could find at your old (?) site in the DOM toolkit area was
>an example, but no docs or code.  Is that what you were referring to?
>
>I also had stumbled over the Vivid Creations site.  Are you involved with
>that?  Anyway I am interested and will be watching.  I have previously sent
>an inquiry to them but received no response.
>
>Thanks again,
>
>   -Tom
>
>
>
>Richard Anderson wrote:
>
>> I think there are some DOM C mappings on my old web site
>> http://www.arpsolutions.demon.co.uk
>>
>> Next week (check around Friday) there *will* be a DOM C/C++
>> implementation at http://www.vivid-creations.com
>>
>> The DOM stuff will be chargable, but there should also be some freebie
>> stuff. ( new SAX control etc )
>>
>> Cheers,
>>
>> Richard A.
>>
>> -----Original Message-----
>> From: Rick Jelliffe <ricko@allette.com.au>
>> To: XML Dev <xml-dev@ic.ac.uk>
>> Date: Sunday, March 21, 1999 1:31 PM
>> Subject: Re: DOM Implemetation in C?
>>
>> >
>> >>I've checked through everything at the w3c, as well as Robin Cover's
>> >>list... does anyone know where I might find an implementation of the
>> >>DOM (any level will do) written in C?  If there isn't one, is anyone else
>> >>interested?
>> >
>> >I was working on one, but we have put that project on hold. If you are
>> >interested, I think I can send you a version of the .h file for DOM, but
>> >not any of the implementation files unfortunately. (Our experimental DOM
>> >was being optimised for read-only Chinese documents...in any case, we
>> >had dropped our C requirement, and if it comes to life again we will
>> >move to C++.)
>> >
>> >There is a technical problem that CORBA IDL mappings do not (as far as I
>> >can see) provide C mappings to let us know how to create objects, but it
>> >seems that DOM (or, at least, DOM users) require object creation and
>> >finalization. That being the case, DOM cannot have a completely portable
>> >C interface.  (I would love to be wrong in this...any IDL->C gurus
>> >around?)
>> >
>> >Rick Jelliffe
>> >
>> >
>> >
>




From James.Anderson at mecomnet.de  Sun Mar 21 20:34:41 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:10:16 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <199903182155.PAA01202@bruno.techno.com> <36F233F4.5CC629EA@mecomnet.de> <199903192331.RAA03429@bruno.techno.com>
Message-ID: <36F55B8A.E7045F9@mecomnet.de>

We're writing hypothetically, about a mechanism (namespaces == architecture)
which is poorly defined, but ...

I would take "used to do architectural forms" to mean that the information to
be inferred from a namespace declaration would supplant (some portion of) that
which would otherwise have been provided by existing architectural
declarations. Otherwise namespaces are just "architecture-neutral".

When I wondered about how this might work, it occurred to me that the identity
between the URI in a namespace declaration and that in the system-id of an
IS10744:arch PI might be used to infer mappings equivalent to those provided
by the individual architectural mapping attributes.

The problem with this is that it provides no means to map the local parts of
universal names. Without such a means, either a given element type maps to
exactly one architectural form (my original question), or the local parts of
the respective type names must be identical in all architectures (which would
seem an equally severe restriction).  That is, the standard architectural
declarations are still necessary and, again, namespaces are "architecture neutral".

What mechanism were you supposing?

Steven R. Newcomb wrote:
> 
> [James Anderson:]
> 
> > > ...  Namespaces (at least the bulk of their
> > > syntax and the idea of identifying a namespace via a URI)
> > > could also be used to do architectural forms.
> 
> > I've wondered about this.
> > Wouldn't they, at least in their standardized form, be restricted to the
> > limited family of architectures which are mutually exclusive?
> 
> I don't [yet] see why.  What makes you think so?
>


From moor0361 at tc.umn.edu  Sun Mar 21 21:19:00 1999
From: moor0361 at tc.umn.edu (Kimberly S Moorman)
Date: Mon Jun  7 17:10:16 2004
Subject: Final Call for Articles - Crossroads Magazine
Message-ID: <Pine.SOL.4.05.9903211516310.5071-100000@garnet.tc.umn.edu>


                           Call For Articles
           Crossroads, the Association for Computing Machinery 
                            Student Magazine
                      Markup Languages (Winter 1999)
                  DUE DATE:           April 15, 1999 
                  SUBMISSION ADDRESS:  xrds-submit@acm.org
                  INFORMATION:         crossroads@acm.org
                                       http://www.acm.org/crossroads/

SPECIAL NOTE: If you are interested in being a Student Guest Editor for
this issue, please contact us at crossroads@acm.org or fill out the online application
for student guest editors.

The Crossroads editorial staff invites authors to submit articles dealing
with topics drawn from several areas pertaining to Markup Languages.  
The following partial list of topics is provided to give prospective
authors ideas for articles and is by no means exhaustive; other relevant topics
will be considered.


History, future, and comparisons of markup languages like 
	-Hypertext Markup Language (HTML),
	-Standard Generalized Markup Language (SGML),
	-Extensible Markup Language (XML),
	-Java Speech Markup Language (JSML),
	-Synchronized Multimedia Integration Language (SMIL),
	-Handheld Device Markup Language (HDML),
	-etc.
	

Articles should include a basic description of the kinds of problems
being worked on, the state of the art of research, the state of the art
of commercial applications, open problems, or future research/commercial
development trends. Interviews with researchers; reviews of related books, 
software, videos, or conferences; and opinion columns on related issues 
are also welcome.  We especially encourage both undergraduate and graduate 
students to submit articles.  However, articles written or coauthored by 
professionals will also be considered.

Crossroads articles should be written for a broad audience.  They should
be easily understandable by someone who has had only the most basic
computer science instruction, and yet still be interesting to the
advanced computer enthusiast.  Articles longer than 6000 words will
generally not be considered for publication.  Feature articles should be
between 1500 and 6000 words; reviews should be between 800 and 2000
words; and opinion columns should be between 800 and 3000 words.
Articles should be written in a magazine style rather than a research
paper style.  In consideration of our diverse readership, authors should
try to use language that is inclusive of people regardless of their
gender, race, religion, nationality, or field of study.  Additional
writing guidelines and submission information are available online at
the Crossroads web site 
http://www.acm.org/crossroads/.

Crossroads is published both online and in print.  We have a print
circulation of about 13,000.  All back issues are available for free on
our website.  Authors who have an article printed in Crossroads can
receive complimentary copies of the issue they were published in.

All submissions should be formatted in HTML or plain text format and
emailed to xrds-submit@acm.org.  Please include your submission in the
body of your message: DO NOT include it as an attachment. Submissions are
due April 15, 1999.  They will be reviewed shortly thereafter and authors
of accepted submissions will be notified within two to three weeks of the
deadline.

Prospective authors are invited to send email to the editors of Crossroads
(crossroads@acm.org) indicating their intention to submit an article.  In
this way we can keep everyone informed of any changes in deadlines or
formats and make sure we have a good variety of articles.  General
questions should also be sent to the Crossroads editors.


From mrc at allette.com.au  Sun Mar 21 22:34:40 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:16 2004
Subject: XML complexity, namespaces (was WG)
References: <4.1.19990320162057.00cd19f0@steptwo.com.au>
		<36F1BBE8.A3AB13EB@prescod.net>
		<002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
		<36F0CFF4.365B@hiwaay.net>
		<36F10CFC.CFEB89A8@goon.stg.brown.edu>
		<36F13992.150D05F9@w3.org>
		<36F18209.8C68524@allette.com.au>
		<36F19AC0.B5B40B20@goon.stg.brown.edu>
		<14066.21253.117700.991361@localhost.localdomain>
		<14067.47347.332566.447160@localhost.localdomain>
		<4.1.19990321102248.00baff00@steptwo.com.au> <14068.22702.384640.143448@localhost.localdomain>
Message-ID: <36F573DC.32762792@allette.com.au>


David Megginson wrote:

> In the SGML world, someone had to create one or more variant DTDs to
> ensure that the document was valid at each stage (either through
> elaborate and obfuscatory configuration with parameter entities and
> marked sections, or by writing separate DTDs from scratch, both often
> with liberal sprinklings of ANY).  If the production engineers made
> even minor changes in the system design, the DTDs would (usually)
> immediately break validation and shut down the whole system -- in
> other words, the system was brittle and expensive to maintain.
>
> Sometimes, the benefit of tight validation at each step makes the
> extra expense worthwhile, but as I mentioned before, XML lets you make
> that decision rather than making it for you.

I think it becomes difficult to continue discussing this issue as we move further into
theoretical datasets and workflows, as we are probably all forming a mental picture of a
scenario that serves our own positions. While I agree with much of the above, I think it's
only fair to point out that the variant DTDs need not be created just for the sake of
validating the instances in their current state. They also provide context for the next
conversion, which can reduce the coding required to get to the next stage. In that role, they
can save money by simplifying the coding and accurately documenting the structure of the
instances. If the changes are simple, then so might be the maintenance of the DTDs - if
they're complex (such as converting specialised tables to CALS tables), the task can be made
much easier if you have access to the structure. It could be argued that changing the order of
the six transformations that you provided would be easier if you had a DTD describing the
structure at each point, making it more expensive up front, but potentially more flexible; at
least you might not have to start from scratch.

Of course, all of these scenarios depend on the dynamics of the project - I just don't think
that there is an automatic and substantial loss associated with validation at all stages. I do
agree that it's good that XML allows you to make the choice about validation, but I don't
accept that it will always be cheaper.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein


From jborden at mediaone.net  Mon Mar 22 00:40:54 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:10:16 2004
Subject: IE5 totally ignores Content-Type was RE: IE5.0 does not conform to RFC2376
In-Reply-To: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp>
Message-ID: <000001be73fb$c0b22b00$1b19da18@ne.mediaone.net>

>
> I believe that IE 5.0 does not conform to RFC2376 (XML Media Types),
> of which I am a co-author.
>
> As for the XML media type "text/xml", the charset parameter in the
> MIME header is authoritative.  Encoding declarations have to be ignored
> so that transcoding is possible.
>
> However, IE 5.0 appears to always ignore the charset parameter and use
> the BOM or encoding declaration only.  Therefore, IE 5.0 does not
> conform to RFC 2376.
>
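
A minimal Python sketch of the precedence rule quoted above, for readers
who want to see it spelled out.  It assumes, following RFC 2376, that an
explicit charset parameter always wins, and that text/xml without a
charset parameter defaults to us-ascii.  The function name and the None
return value (meaning "fall back to the BOM or encoding declaration") are
illustrative choices, not taken from the RFC or from any browser.

```python
from email.message import Message
from email.utils import collapse_rfc2231_value


def xml_encoding_from_header(content_type_header):
    """Return the encoding implied by a Content-Type header value.

    Returns None when the header does not settle the question and the
    BOM or encoding declaration should be consulted instead.
    """
    msg = Message()
    msg["Content-Type"] = content_type_header
    charset = msg.get_param("charset")
    if charset is not None:
        # An explicit charset parameter is treated as authoritative.
        return collapse_rfc2231_value(charset).lower()
    if msg.get_content_type() == "text/xml":
        # Assumed default for text/xml when charset is omitted.
        return "us-ascii"
    return None


print(xml_encoding_from_header('text/xml; charset="Shift_JIS"'))
print(xml_encoding_from_header("text/xml"))
print(xml_encoding_from_header("application/xml"))
```

A receiver behaving this way would never look inside the document when
the header carries a charset, which is what makes transcoding in transit
possible.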
	The release version of IE5 is *totally* ignoring the content-type of many
files returned via ASP for me (including text/plain).  I think it is just
broken.  For example, my XMTP app worked correctly under IE5b2; now, to my
dismay, it is broken under the IE5 release.  See http://jabr.ne.mediaone.net
and browse the XMTP board.

Jonathan Borden
http://jabr.ne.mediaone.net


From jtauber at jtauber.com  Mon Mar 22 01:02:36 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:16 2004
Subject: unifying the notion of Namespace and Notation
Message-ID: <03b501be73ff$912aab60$0300000a@cygnus.uwa.edu.au>

As many of you know, I am a big fan of the under-appreciated Notation in
XML. What some have thought of as unnecessary baggage for XML to carry
around, I've thought of as a useful device whose absence would have only led
to hacks to achieve the same thing.

Many of you also know that I have variously called Notations "Namespaces for
Content" and Namespaces "Notations for Names". The more I think about it,
the more I think there is incredible overlap between namespaces and
notations. Reading through the RDF stuff yesterday, I became even more
convinced that this is the case.

Notations and Namespaces are both identified by URI. Actually, notations can
be identified by public identifier but I suspect this was largely because
people didn't feel comfortable with the idea of a URI that didn't point to
something. In this age of Namespaces and URI as Universal Identifier, there
is, IMHO, no need for notations having public identifiers. System
identifiers (URIs) will do. They don't need to point to anything.

Notations and Namespaces both have a Name which acts as an arbitrary,
internal proxy for the URI because URIs can't occur where notations and
namespaces need to be referred to.

Notations, as described in the XML 1.0 REC, serve three purposes.

Firstly, for declaring PI targets. At least in spirit, I see no difference
between this mechanism and the namespace mechanism. In the presence of a
notation declaration, PI target names are just like Namespace prefixes. They
are proxies; it's the notation URI that really identifies the target. IMHO,
Notations are Namespaces for PIs. Wouldn't it be nice if they used the same
mechanism? (Issues: scope, PIs in DTDs.) On an RDF note, an RDF document
describing an application (not a web page about an application, but the
actual *application*) might use a URI to identify an application. Is there
any reason why this should not be the *exact* same URI used for a notation
for PIs targeted at that application?

Secondly, Notations are used for identifying unparsed entity formats. We
don't say "this is a GIF file" because "GIF" isn't guaranteed universally
unique. So we use a URI. Sound familiar? Notations are Namespaces for
unparsed entity formats. If we were describing an image file using RDF and
we wanted to say what format it was in, we might use a URI to identify the
format. Is there any reason why this should not be the *exact* same URI used
for unparsed entity notations? The current XSL WD mentions a mechanism for
indicating that the result tree is to be serialised as non-XML (IOW, an
unparsed entity). The specification of the non-XML format to use is via a
namespace. Here namespaces are used to identify a non-XML format. Exactly
the job of notations.

Third and finally (and this is my favourite use), Notations are used for
identifying the format of content. The spirit of XML is that
<Date>1999-03-22</Date> says that 1999-03-22 is a Date. Notation attributes
provide the means to say what format the Date is in ("YYYY-MM-DD" but
expressed as a URI). This is very similar to saying what format an unparsed
entity is in. In fact this is also similar to the way RDF might say John
Smith has a weight of 200 where 200 is in pounds and give a URI identifying
what "pounds" means (say http://www.nist.gov/units/pounds). Now this could
be achieved with notations, declaring a notation "pounds" with the URI just
mentioned and then using a notation attribute "unit" on a weight element
<weight unit="pounds">200</weight>. We'd probably also want to associate a
URI with the element type "weight". For this, we use Namespaces.
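
The weight example can be sketched in Python, as a rough illustration
only.  The DTD below is an assumption written for this sketch (the
element and attribute declarations are not from any published schema),
and resolve_units is a hypothetical helper name.  It uses the expat
binding from the Python standard library, which reports notation
declarations even though it does not validate.

```python
import xml.parsers.expat

# Hypothetical document: the notation "pounds" is a proxy for the URI.
DOC = """<!DOCTYPE weight [
  <!NOTATION pounds SYSTEM "http://www.nist.gov/units/pounds">
  <!ELEMENT weight (#PCDATA)>
  <!ATTLIST weight unit NOTATION (pounds) #REQUIRED>
]>
<weight unit="pounds">200</weight>"""


def resolve_units(xml_text):
    """Return (element, notation name, notation URI) for each unit attribute."""
    notations = {}
    resolved = []

    def on_notation(name, base, system_id, public_id):
        # Remember which system identifier each notation name stands for.
        notations[name] = system_id

    def on_start(name, attrs):
        if "unit" in attrs:
            unit = attrs["unit"]
            resolved.append((name, unit, notations.get(unit)))

    parser = xml.parsers.expat.ParserCreate()
    parser.NotationDeclHandler = on_notation
    parser.StartElementHandler = on_start
    parser.Parse(xml_text, True)
    return resolved


print(resolve_units(DOC))
```

The lookup from the short name to the URI is the same step a
namespace-aware parser performs for a prefix, which is the overlap being
argued for here.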

Is it just me or does this cry out for unification?

I would imagine that a schema language that uses RDF could easily unify
notations (at least the second and third type) into namespaces. A quick look
(read "search for the word notation") in DCD didn't seem to indicate that
*it* does, though.

James


From ricko at allette.com.au  Mon Mar 22 01:36:54 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:16 2004
Subject: Behaviour of IE5 (was Re: IE5.0 does not conform to RFC2376)
Message-ID: <001601be7404$c6f02270$12f96d8c@NT.JELLIFFE.COM.AU>


From: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>

>However, IE 5.0 appears to always ignore the charset parameter and use
>the BOM or encoding declaration only.  Therefore, IE 5.0 does not
>conform to RFC 2376.

In the IE5 beta, there *seemed* to be two related matters: the
first was which encoding is used to read the document into IE5, and the
second was which encoding is used to drive the fonts. In the beta, the
former seemed to work, and it was the latter which was not reliable (it
was quite sticky, and seemed to use autodetection rather than the
markup--if you had looked at a Turkish file, then GB detection didn't
work, for example). This seems to have been fixed now.

Our test files show that IE5 always looks for the encoding header when
finding which content type to select: even if you send an XML document
with an MIME content type as text/plain and an extension .txt, IE5
imports it as XML.

Tim commented that &lt;? handling is still not correct. This is a
problem they have known about for a few months. I had email with someone
from Microsoft about it, and he suggested that if I wanted to use CSS, I
should double delimit my data! Given that delimiting output strings is
the first thing everyone finds out about text processing, and given that
it typically takes a paragraph of code, and given that Microsoft have
known about this for a while, I don't think we should treat this as a
bug but an engineering decision by Microsoft; I wouldn't like to imagine
the purpose.

But Microsoft's Chinese support (in the English version on an English OS)
is great. Shame about CSS, &lt;  etc. (CSS is a bit better than before,
though.)

Rick


From marcelo at mds.rmit.edu.au  Mon Mar 22 02:12:06 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:16 2004
Subject: Validation
In-Reply-To: <36F15372.3FF36ABB@prescod.net>; from Paul Prescod on Thu, Mar 18, 1999 at 01:26:42PM -0600
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F15372.3FF36ABB@prescod.net>
Message-ID: <19990321133612.D29582@io.mds.rmit.edu.au>


On Thu, Mar 18, 1999 at 01:26:42PM -0600, Paul Prescod wrote:
> Chris Lilley wrote:
> > 
> > Unfortunately I came across EBNF long before I came across DTD syntax,
> > so about half an hour after meeting DTDs I was, like, what do you mean
> > it can't express that this attribute is a url? Why can't it express that
> > this attribute is an ISO standard date?
> 
> I can guarantee you today that the XML schema effort will not allow you to
> express everything that EBNF will so if that's your standard it will fail.
> But even if we use EBNF as our standard: do you know of any programming
> languages expressed entirely in EBNF? Or even entirely in *any formalism*?
> 
> > Yes, validation is important - and I mean real validation, with no
> > critical-path human-readable comments in the DTD and multiple utilities
> > to check different aspects of validity (like separate scripts to ensure
> > that an attribute is a valid date or customer number).
> 
> It will never be the case that it will be possible to write schemas that
> are so tight that they remove the need for comments that describe
> additional constraints to other human beings. There will always be a need
> not only for multiple schema languages but also for the ultimately
> flexible schema language: prose text.

At the risk of sounding repetitious, an analogy may be of use here:

One mark of a good programming language is strong compile-time
checking (type safety, pre- and post-conditions and invariants are
typical measures of this).  Users of such languages typically
characterise them by exclaiming that programs, once compiled, usually
work the first time.  Of course no-one would argue that this is a bad
thing.  It would be silly, however, to take this as evidence that such
languages are bug free.  There will never be a bug free programming
language.  The simple reason is that no programming language can guess
the desired semantics if you get them wrong.  No language can stop you
from trying to implement sort and ending up with reverse sort!  All
they can do is prevent you from ever exhibiting undefined behaviour
(post-conditions can make it painfully difficult to do something so
silly, but I doubt they could do so at compile time).

Languages will never be capable of fully expressing what we want to
achieve in a declarative way.  Declarative languages such as SQL do
provide a very elegant and expressive mapping between intent and code.
But they typically address a very narrow and well-defined problem
domain.  No such silver bullet has been discovered for programming in
the large.

The purpose of this analogy is to illustrate what I believe to be the
same situation in the notion of a DTD.  A DTD defines the language
used to express a certain data domain.  This language provides some
constraints on what constitutes a legal piece of data.  However, just
as a programming language can never fully express the intent of the
user (i.e. it must always include procedural elements which implicitly
rather than explicitly embody the intent), so too can a schema
language never express the full set of constraints one might wish to
impose on a document.  It is easy to come up with trivial cases that
demonstrate this: imagine a document class in which the number of
paragraphs inside the nth section must be less than or equal to the
nth Fibonacci number; or another in which the content model of a
CONTENT element is defined in the PCDATA of the preceding MODEL
element; or how about one in which the maximum depth of the element
hierarchy is defined by the ASCII value of the 100th character of the
file stored at a given URL!
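
To make the first of these concrete, here is a rough Python sketch of the
Fibonacci rule as a procedural check.  The element names "section" and
"p" and the convention fib(1) = fib(2) = 1 are assumptions made for the
illustration; nothing above fixes them.

```python
import xml.etree.ElementTree as ET


def fib(n):
    """Return the nth Fibonacci number, taking fib(1) = fib(2) = 1."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a


def sections_within_fibonacci_limit(xml_text):
    """True if the nth section holds at most fib(n) paragraphs."""
    root = ET.fromstring(xml_text)
    for n, section in enumerate(root.findall("section"), start=1):
        if len(section.findall("p")) > fib(n):
            return False
    return True


GOOD = ("<doc><section><p/></section><section><p/></section>"
        "<section><p/><p/></section></doc>")
BAD = "<doc><section><p/><p/></section></doc>"
print(sections_within_fibonacci_limit(GOOD))
print(sections_within_fibonacci_limit(BAD))
```

A dozen lines of ordinary code express the rule, yet a content model
cannot, because the bound depends on the position of the section.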

Yes, these are pathological examples, but the point of illustrating
with extremes is to make plain the fact that no matter how sophisticated
your schema language, it will never be able to handle all
contingencies.  There will always be someone, somewhere, for whom the
available schemas simply don't express the semantics they need.  If
you seriously want to address the widest possible set of schema
language requirements, EBNF doesn't come close; you would need Turing
completeness just as the starting point.

For those who would prefer to see a more concrete case-in-point, I
adduce the very example Chris Lilley provided, that of validating a
customer number.  What if a valid customer number is a composite of
the customer's priority level, her date of birth, and a sequence
number?  How would you express that the 3rd to 10th digits constitute
a date in the format yyyymmdd?  This would require a language with
built-in functions for type conversion, substring extraction and date
composition.  What if people born after 1990 couldn't have a priority
level higher than 25?  This would require branching constructs.  It is
quite common for things like customer codes, site codes, product
codes, etc. to have a composite structure, particularly in legacy
systems.  Often the parts are mutually dependent in non-trivial ways.
Sometimes validity can only be checked by consulting an external,
volatile source of information, such as a database (e.g. "Customer
code is invalid if the first two digits do not appear in the CODE
field of a record in the PRIORITY table...").
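
As a sketch of how much machinery even the simple version needs, the
Python below checks such a customer number.  The layout is an assumption
made for the example: digits 1-2 are the priority level, digits 3-10 the
date of birth as yyyymmdd, and the remaining digits the sequence number.
"Born after 1990" is read as a birth year greater than 1990.

```python
import re
from datetime import date

# Assumed layout: priority (2 digits), yyyymmdd, then a sequence number.
CUSTOMER_NUMBER = re.compile(r"([0-9]{2})([0-9]{4})([0-9]{2})([0-9]{2})([0-9]+)")


def is_valid_customer_number(code):
    match = CUSTOMER_NUMBER.fullmatch(code)
    if match is None:
        return False
    priority = int(match.group(1))
    year, month, day = (int(match.group(i)) for i in (2, 3, 4))
    try:
        # Type conversion and date composition: rejects e.g. month 13.
        date(year, month, day)
    except ValueError:
        return False
    # Branching: the priority limit depends on the date of birth.
    if year > 1990 and priority > 25:
        return False
    return True


print(is_valid_customer_number("10198505230001"))
print(is_valid_customer_number("30199505230001"))
```

Substring extraction, type conversion, date composition and a branch all
appear, and the database lookup is still missing.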

This is not, despite appearances, an exclamation of hopelessness.  It
is rather to point out that completeness is not necessarily a desirable
goal.  If you wish to insist on developing a schema language that
handles all the validation requirements of any conceivable data
domain, you are aiming not only for an impossible goal, but for one
which, if it could be attained, would be so complex, so arbitrary and
so unwieldy that no-one would want to use it.  In real life, multiple
subsystems are brought into play when validating and transforming
data.  No single component can know everything about a piece of data,
and certainly not enough to definitively ascertain validity in all its
semantic richness.  In fact, the full set of constraints may not even
exist in one environment, but may be distributed across multiple
independent subsystems, or even across multiple hosts; for instance, a
timesheet workflow system may operate as follows:

  1. Employee fills in XML timesheet and submits to accounts server.

  2. The accounts server validates the project number field against
     projects the employee is permitted to book against, and passes
     the document on to the personnel server.

  3. The personnel server knows that the employee doesn't work on
     weekends and Tuesdays and checks for this.  It then dispatches
     the document to the repository.

  4. The repository performs basic DTD validation, which the other two
     servers probably did anyway.

The DTD (or XML Schemas) can provide a basic level of validation, but
there will always be more to do.  And this is not a problem if you
accept that there may be multiple stages to the process, probably
involving multiple languages and environments.
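
The staged workflow above might be sketched in Python as follows.  All
of it is hypothetical: the dictionary layout of a timesheet, the function
names, and the stand-in for DTD validation (a check that the expected
fields are present) are assumptions chosen to keep the example short.

```python
from datetime import date


def accounts_check(timesheet, permitted_projects):
    """Stage 2: is the employee allowed to book against this project?"""
    allowed = permitted_projects.get(timesheet.get("employee"), set())
    return timesheet.get("project") in allowed


def personnel_check(timesheet, days_off):
    """Stage 3: days_off holds date.weekday() values (Monday is 0)."""
    return all(d.weekday() not in days_off for d in timesheet.get("days", []))


def repository_check(timesheet):
    """Stage 4: stand-in for structural (DTD) validation."""
    return {"employee", "project", "days"} <= timesheet.keys()


def first_rejecting_stage(timesheet, permitted_projects, days_off):
    """Return the name of the first stage that rejects, or None."""
    stages = [
        ("accounts", lambda t: accounts_check(t, permitted_projects)),
        ("personnel", lambda t: personnel_check(t, days_off)),
        ("repository", repository_check),
    ]
    for name, check in stages:
        if not check(timesheet):
            return name
    return None


PERMITTED = {"4207": {"4201"}}
DAYS_OFF = {1, 5, 6}  # Tuesday, Saturday, Sunday
sheet = {"employee": "4207", "project": "4201",
         "days": [date(1999, 3, 22), date(1999, 3, 24)]}
print(first_rejecting_stage(sheet, PERMITTED, DAYS_OFF))
```

Each stage knows only its own rule, and no single stage could state the
whole constraint.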

The complaint might be raised that the examples I have given mostly
involve table lookup and therefore belong more properly in the domain
of referential integrity maintenance, but this is not necessarily so.
For instance the accounts server may know that the project number and
employee number must begin with the same two digits (due to the
organisational structure) unless the project number begins with 99
(which represents admin codes).  Furthermore, in the case of a complex
customer code, validation may involve table lookup, but it is not with
a view to ensuring that the customer code refers to an existing
record, and hence is not a referential integrity constraint (the
constraint could even be revised to: "Customer code is invalid if the
first two digits do not appear in the CODE field ... _and_ the date of
birth is after 1990," in which case the record could be valid even
though the lookup did not find any matching records).
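
The accounts-server rule involves no lookup at all, as a short Python
sketch shows.  The function name is hypothetical, and treating numbers
shorter than two characters as invalid is an added assumption.

```python
def project_matches_employee(project_number, employee_number):
    """Apply the prefix rule: same first two digits, unless admin code 99."""
    if len(project_number) < 2 or len(employee_number) < 2:
        return False
    if project_number.startswith("99"):
        # Admin codes may be booked by anyone.
        return True
    return project_number[:2] == employee_number[:2]


print(project_matches_employee("4201", "4207"))
print(project_matches_employee("9901", "4207"))
print(project_matches_employee("1701", "4207"))
```

The rule relates two fields of the same document to each other, which is
a different thing from checking that either one refers to an existing
record.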

Quite apart from the problem of intractability, there is the equally
important issue of parsimony.  For many purposes, a fully expressive
language is more than one needs. Consequently, the user is forced
to learn a complex environment to perform a simple task.  This is why
a language like CSS is in no danger of being superseded by XSL.  It
doesn't express everything XSL (or DSSSL) can, but it is simple.  An
average hack Web master can come to grips with CSS in a matter of
minutes, and can be using it to good effect within half an hour.  Not
to mention the fact that CSS is just plain easier to read (I hear much
debate about whether it is appropriate for humans to edit XML
directly, but I haven't heard anyone suggest that XSL should be
machine generated; I wonder about this from time to time).  For that
matter, XSL and DSSSL can't express every conceivable typography
requirement either.

Another concrete example comes to mind in the domain of configuration
files.  I have played around with moving our configuration file format
(which is a little ugly at present) to XML.  I was horrified at the
result and am now looking far more seriously at something like .INI
files.  It may not be intrinsically hierarchical (and hence is less
expressive), but it is much simpler, and much easier for a human to
read and manipulate.
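
For a rough sense of the difference, the Python sketch below loads the
same two settings from an .INI-style file and from an XML file.  Both
formats and every name in them ("server", "host", "port", "section",
"setting") are invented for the illustration and have nothing to do with
our actual configuration format.

```python
import configparser
import xml.etree.ElementTree as ET

INI_TEXT = """\
[server]
host = localhost
port = 8080
"""

XML_TEXT = """\
<config>
  <section name="server">
    <setting name="host">localhost</setting>
    <setting name="port">8080</setting>
  </section>
</config>
"""


def load_ini(text):
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {name: dict(parser[name]) for name in parser.sections()}


def load_xml(text):
    root = ET.fromstring(text)
    return {
        section.get("name"): {s.get("name"): s.text
                              for s in section.findall("setting")}
        for section in root.findall("section")
    }


print(load_ini(INI_TEXT))
print(load_xml(XML_TEXT))
```

Both loaders produce the same dictionary; the difference lies in what a
person has to read and type to maintain the file.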

Likewise, DTDs and XML Schemas will offer differing levels of
constraint specification, but neither of them (nor any future
language) can express every kind of validation rule that people will
want to express.  Life is simply too complex for that to be possible
(more specifically, real life is arbitrarily complex, and hence so are
the systems that try to model it).

> Luckily, eliminating all other schema languages is not a goal of the W3C
> schema language effort. 
> 
> > So what is critically needed is a real, namespace-aware, schema 
> > language that can be used to do real validation.
> 
> I hear a lot of users saying that. They don't seem to realize that there
> is no such thing as "real validation" there is only "the validation I need
> to do today." Ten years from now, we'll be griping that XMLSchemas don't
> do "real validation" for some other arbitrarily advanced definition of
> "real."

I heartily concur.  There is no silver bullet, so it is a waste of
time looking for one.  The focus should be on developing standards
that solve today's problems today, with an eye to leaving room for
future wisdom without being prescriptive.

Of course, none of the above discourse will eliminate the need for
discussion on what, exactly, is needed and how that need is to be
satisfied.  As one colleague astutely pointed out to me, I am really
transforming the issue from "real validation" to "sufficient
validation".  It would be a mistake, however, to conclude that this is
a trivial transformation in the statement of the problem.  It diverts
the emphasis of the search markedly away from completeness and towards
practicality and usability (of course, completeness remains
desirable, it merely ceases to be a central goal).


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/


From srn at techno.com  Mon Mar 22 03:41:26 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:16 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F55B8A.E7045F9@mecomnet.de> (message from james anderson on
	Sun, 21 Mar 1999 21:50:52 +0100)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <199903182155.PAA01202@bruno.techno.com> <36F233F4.5CC629EA@mecomnet.de> <199903192331.RAA03429@bruno.techno.com> <36F55B8A.E7045F9@mecomnet.de>
Message-ID: <199903220034.SAA05178@bruno.techno.com>

[James Anderson:]

> I would take "used to do architectural forms" to mean that
> the information to be inferred from a namespace
> declaration would supplant (some portion of) that which
> would otherwise have been provided by existing
> architectural declarations. Otherwise namespaces are just
> "architecture-neutral".

> When I wondered about how this might work, it occurred
> that the identity between the URI in a namespace
> declaration and that in the system-id of an IS10744:arch PI
> might be used to infer mappings equivalent to those
> provided by the individual architectural mapping
> attributes.

> The problem with this is that it provides no means to map
> the local parts of universal names. Without such a means,
> either a given element type maps to exactly one
> architectural form (my original question), or the local
> parts of the respective type names must be identical in
> all architectures (which would seem an equally severe
> restriction).  That is, the standard architectural
> declarations are still necessary and, again namespaces are
> "architecture neutral".

Yes, I agree that you'd have to enhance the XML Namespaces
Recommendation, one way or another.  I personally feel such
enhancement would be a positive direction for the evolution
of XML Namespaces.  Since I believe that architectural forms
(or something that's functionally equivalent) are both vital
and inevitable, I believe there are really only 3 choices
here:

(1) Invent something completely new in order to support
    architectural inheritance at the document level.  If
    this means inventing a new schema language to replace
    DTDs, so be it.  (But that alone won't satisfy the
    requirement that architectural inheritance be possible
    without a DTD or something like one.  Namespaces already
    have the virtue of working without a DTD.)

(2) Use the ISO/IEC 10744:1997 way of doing it (which is
    already being done with XML in several quarters).  It
    works with or without a DTD, incidentally.  This would
    be just fine with me; I favor collaboration between ISO
    and W3C.  However, the ISO way of inheriting an
    information architecture isn't perfect.  I'd still like
    to see the future evolutions of XML and SGML remain in
    harmony with each other.  Several people have complained
    in this forum that Namespaces have various deficiencies.
    Frankly, the alleged deficiencies haven't bothered me at
    all [yet].  What bothers me is the stunning departure
    that Namespaces represent from a path of harmonious
    co-evolution with the ISO family of architected
    information interchange standards, and the fact that the
    distinct goal toward which Namespaces is driving us is
    not at all clear to me.  Was Namespaces designed as a
    merely temporary, expedient, under-high-pressure
    solution?  Or is it a step on a workable path toward a
    sensible, efficient, reliable future?

(3) The answer to the latter question can be "Yes" if we

    * Recognize that the reasons for doing architectural
      inheritance are a compelling superset of the reasons
      for inventing XML Namespaces.

    * Accept that there is great need for orderly evolution
      in XML-land, and allow/cause XML Namespaces to evolve
      in the general direction of inheritable architectures.

    I like this third option better than choice #1 because I
    favor having only a few, general, powerful syntaxes and
    syntactical features, as opposed to an endless panoply
    of special-case add-ons.  The danger that XML (or,
    rather, the usefulness and efficiency of XML) will die
    of such obesity is very real.  There are numerous
    examples of standards, both de facto and de jure, that
    sooner or later became useless in exactly the same way.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From eric at hellman.net  Mon Mar 22 06:56:44 1999
From: eric at hellman.net (Eric Hellman)
Date: Mon Jun  7 17:10:16 2004
Subject: IE5 and iso entities
In-Reply-To: <E10Ou0n-0005w5-00@bowmore.cc.ic.ac.uk>
Message-ID: <v04020a17b31b95bb7332@[192.168.1.1]>

Has anyone been able to get IE5 to load full sets of ISO entities?

I'm using slightly modified versions of Rick Jelliffe's XMLized ISO Entity
tables, and IE5 (release version) seems to always stop loading with bogus
complaints after the first entity table.

A test document (a technical article describing blue semiconductor lasers,
if anyone cares) is at http://nsr.mij.mrs.org/4/1/article.xml

Eric
Eric Hellman
Openly Informatics, Inc.
http://www.openly.com/           Tools for 21st Century Scholarly Publishing



From bckman at ix.netcom.com  Mon Mar 22 07:21:01 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:16 2004
Subject: IE5 totally ignores Content-Type was RE: IE5.0 does not conform to RFC2376
Message-ID: <009b01be7434$43b7e840$b1acdccf@ix.netcom.com>

When I create an object with the new DLL, i.e.

myDoc=server.createObject("Microsoft.XMLDOM")

my object fails to recognize external DTDs!

This is so with ASP (I reinstalled the old DLL on my server), and with C
and VB programs that use the DLL.

I haven't yet checked whether it is also a problem with an ActiveX object in
a web page.

This is a bug in the new DLL; it worked fine with the beta 2 DLL.

Frank
----- Original Message -----
From: Jonathan Borden <jborden@mediaone.net>
To: MURATA Makoto <murata@apsdc.ksp.fujixerox.co.jp>; <xml-dev@ic.ac.uk>
Sent: Sunday, March 21, 1999 7:34 PM
Subject: IE5 totally ignores Content-Type was RE: IE5.0 does not conform to RFC2376


>>
>> I believe that IE 5.0 does not conform to RFC2376 (XML Media Types),
>> of which I am a co-author.
>>
>> As for the XML media type "text/xml", the charset parameter in the
>> MIME header is authoritative.  Encoding declarations have to be ignored
>> so that transcoding is possible.
>>
>> However, IE 5.0 appears to always ignore the charset parameter and use
>> the BOM or encoding declaration only.  Therefore, IE 5.0 does not
>> conform to RFC 2376.
>>
> The release version of IE5 is *totally* ignoring the content-type of many
>files returned via ASP for me (including text/plain) ... I think it is just
>broken... for example my XMTP app worked correctly under IE5b2 ... now to my
>dismay it is broken under IE5 release see: http://jabr.ne.mediaone.net and
>browse the XMTP board.
>
>Jonathan Borden
>http://jabr.ne.mediaone.net




From SMUENCH at us.oracle.com  Mon Mar 22 07:45:31 1999
From: SMUENCH at us.oracle.com (Steve Muench)
Date: Mon Jun  7 17:10:17 2004
Subject: IE5 totally ignores Content-Type was RE: IE5.0 does not conform to RFC2376
Message-ID: <199903220745.XAA09384@usmail04>

Frank, 
 
The IE5 production parser has a couple of new methods on XMLDomDocument: 
 
   validateOnParse 
 
and 
 
   resolveExternals 
 
The doc claims that resolveExternals is "true" by default 
but maybe it's worth a try to explicitly set it to "true"... 
 
____________________________________________________________ 
Steve Muench, Consulting Product Manager & XML Evangelist 
Java Business Objects Dev't Team - http://www.oracle.com/xml
-------------- next part --------------
An embedded message was scrubbed...
From: "Frank Boumphrey" <bckman@ix.netcom.com>
Subject: Re: IE5 totally ignores Content-Type was RE: IE5.0 does not conform to RFC2376
Date: 21 Mar 99 23:18:55
Size: 4389
Url: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990322/ada6f6ac/attachment.eml
From Andrea.Marchetti at iat.cnr.it  Mon Mar 22 09:20:05 1999
From: Andrea.Marchetti at iat.cnr.it (Andrea.Marchetti@iat.cnr.it)
Date: Mon Jun  7 17:10:17 2004
Subject: IE5 without validation
Message-ID: <3.0.6.32.19990322102215.009e9350@pop.cnuce.cnr.it>

I don't understand: with the final release of IE5 it seems impossible to
validate an XML document. I have looked for an option, without success.
IE5 appears to check the syntax of the DTD and of the XML document, but it
does not check the document against the DTD.
I remember that IE5 beta 2 validated XML documents.



From martind at netfolder.com  Mon Mar 22 13:09:09 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:17 2004
Subject: Is this invalid?
Message-ID: <NBBBJPGDLPIHJGEHAKBACEMMCPAA.martind@netfolder.com>

Hi

Is this markup an invalid XML element?

<!AFDR "ISO/IEC 10744:1997">

The IE 5.x XML parser rejects this markup as an invalid element:

Declaration has an invalid name. Line 1, Position 3

<!AFDR "ISO/IEC 10744:1997">

I am trying to find something in the W3C specs about it, but with no success. By
the way, what is the official list of valid <!...> markups:

<!DOCTYPE....> ???, <!doctype....> ???, both ????
<!SGML...> ????, <!sgml..> ???, both ???
others ????

In which document is this listed? I know that these are all valid SGML
markup and that there are ISO documents on it, but where can I find W3C documents
giving that information?


Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From chris at w3.org  Mon Mar 22 13:12:37 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:17 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp>
Message-ID: <36F62D81.A623C0A2@w3.org>


MURATA Makoto wrote:
> 
> I believe that IE 5.0 does not conform to RFC2376 (XML Media Types),
> of which I am a co-author.
> 
> As for the XML media type "text/xml", the charset parameter in the
> MIME header is authoritative.  Encoding declarations have to be ignored
> so that transcoding is possible.

So, if the file is saved to some local browser cache and then re-read,
it may have no MIME header so the encoding declaration is then
authoritative.

Why can't the transcoding proxy also rewrite the encoding declaration,
since it is rewriting the file anyway? It is trivially easy to find,
process, and change.
I imagine that someone could take some generic charset-converting code
and make an XML-aware transcoding servlet that rewrote the encoding
declaration in about what, an hour? If someone does this, I will see
about getting it included in the next Jigsaw version.
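
A minimal sketch of the kind of transcoder described above, in Python rather
than as a servlet. The function name transcode_xml and the two regular
expressions are illustrative assumptions, not part of Jigsaw or of any
existing tool. It assumes the caller already knows the source encoding and
that every character can be represented in the target encoding.

```python
import re

# XML declaration at the very start of the decoded text.
_XML_DECL = re.compile(r'^<\?xml\s[^?]*\?>')
# encoding="..." pseudo-attribute inside that declaration.
_ENCODING = re.compile(r'encoding\s*=\s*(["\'])[A-Za-z][A-Za-z0-9._-]*\1')
# version="..." pseudo-attribute, used when no encoding is declared yet.
_VERSION = re.compile(r'version\s*=\s*(["\'])[^"\']*\1')


def transcode_xml(data, source, target):
    """Re-encode XML bytes from `source` to `target` and relabel them."""
    text = data.decode(source).lstrip('\ufeff')  # drop a leading BOM, if any
    decl = _XML_DECL.match(text)
    if decl is None:
        # No declaration at all: add one so the new encoding is labelled.
        text = '<?xml version="1.0" encoding="%s"?>\n%s' % (target, text)
    else:
        head = decl.group(0)
        if _ENCODING.search(head):
            # Rewrite the existing label, keeping the original quote style.
            head = _ENCODING.sub(
                lambda m: 'encoding=%s%s%s' % (m.group(1), target, m.group(1)),
                head, count=1)
        else:
            # The label must follow the version and precede standalone.
            head = _VERSION.sub(
                lambda m: '%s encoding="%s"' % (m.group(0), target),
                head, count=1)
        text = head + text[decl.end():]
    # Raises UnicodeEncodeError if a character does not exist in `target`.
    return text.encode(target)
```

Calling transcode_xml(body, 'shift_jis', 'utf-8') on a document labelled
Shift_JIS would return UTF-8 bytes whose declaration says encoding="utf-8",
so the label stays correct even when the MIME header is lost.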

> However, IE 5.0 appears to always ignore the charset parameter and use
> the BOM or encoding declaration only.  Therefore, IE 5.0 does not conform to
> RFC 2376.

Okay. But does RFC 2376 conflict with the XML 1.0 Recommendation?

> Proof: I made a UTF-8 XML document which also parses even when it is assumed as
> Shift_JIS.  Then, I provided the correct charset parameter "utf-8"
> in the MIME header by configuring Apache and provided an encoding declaration
> "Shift_JIS" in the XML document.  Such mismatch is perfectly legal and
> usual when proxies perform code conversion.  I tried this document with IE 5.0.
> Incorrect characters were displayed.  Q.E.D.

Okay, proof accepted.
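
The effect in the proof can be reproduced in a few lines of Python. This is
only an illustration with a made-up one-element document, not the test
document used above: the same bytes decode without error under both
encodings, but yield different characters.

```python
# UTF-8 bytes for a document containing one non-ASCII character (e-acute).
utf8_bytes = '<doc>caf\u00e9</doc>'.encode('utf-8')

# Decoded as the MIME charset parameter says (utf-8).
as_utf8 = utf8_bytes.decode('utf-8')

# Decoded as a mismatched encoding declaration would say (Shift_JIS).
# The two bytes of the e-acute fall in the single-byte half-width katakana
# range of Shift_JIS, so decoding succeeds but produces two wrong characters.
as_sjis = utf8_bytes.decode('shift_jis')

print(repr(as_utf8))
print(repr(as_sjis))
```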

> When the charset parameter is not specified, it is assumed as US-ASCII. 

Wow. So, what this RFC says is that, when used in email and on HTTP, the
encoding declaration is *always ignored*.

That is a pretty big change and, frankly IMHO, ill-advised.


> If you are using Apache and overriding by AddType is allowed, you only have to
> create a file named .htaccess in your directory and write a line as below:
> 
>         AddType "text/xml;  charset=utf-8"    xml

Correction: if you are the *administrator* of an Apache server. One of
the ways in which the Web has changed over the last 5 years is that the
percentage of Web authors who also administer the site that they serve
from has dropped from a substantial majority to an insignificant
minority.

What this RFC appears to do is remove author control over correctly
labelling the encoding, and ensure that most if not all XML documents
get incorrectly labelled as US-ASCII. Then, if the parser is working
correctly, they will complain about all bytes with value >127 being
"illegal characters" and halt with a fatal error[1].

So, this RFC removes at a stroke the possibility of authors correctly
labelling the encoding of their XML documents and takes us back to that
dark time (the present) when the majority of, say, Japanese Web content
was mis-labelled. And it seems to have done this simply to save a very
small part of coding effort for people writing transcoders.

I suspect that this was not the desired result.

This could have been avoided:

1) Require explicit charset for overriding the internal encoding
declaration, so if one really wants to re-label a document as US-ASCII
one actually has to send it out as text/xml; charset="US-ASCII"

2) Define the absence  of an explicit charset encoding in the MIME
header not as "US-ASCII" but as "use encoding in XML instance" in
accordance with the XML 1.0 Recommendation.

3) Encourage transcoding software to rewrite the internal encoding
declaration

4) Make suitable transcoding software freely available so that the cost
of not complying with point 3 (write your own) is higher than the cost
of complying with it (use a pre-built one).


Please consider points 1 and 2 to be a defect report on RFC2376
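
The difference between the RFC 2376 rule for text/xml and points 1 and 2
can be sketched as a small selection function. The names
effective_encoding, charset_param and declared_encoding, and the rule
labels 'rfc2376' and 'proposed', are hypothetical. The sketch ignores the
BOM and assumes an ASCII-compatible document; falling back to UTF-8 when
nothing is declared is a simplification of the XML 1.0 rules.

```python
import re
from email.message import Message

# Encoding name inside an XML declaration at the start of the entity.
_DECLARED = re.compile(
    rb'^<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']')


def charset_param(content_type):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg['Content-Type'] = content_type
    return msg.get_param('charset')


def declared_encoding(body):
    """Return the encoding named in the XML declaration, or None."""
    match = _DECLARED.match(body)
    return match.group(1).decode('ascii') if match else None


def effective_encoding(content_type, body, rule='rfc2376'):
    """Pick the encoding a text/xml receiver would use under each rule."""
    charset = charset_param(content_type)
    if charset is not None:
        return charset              # an explicit charset wins under both rules
    if rule == 'rfc2376':
        return 'us-ascii'           # RFC 2376 default for text/xml
    # Proposed rule: no charset means "use the encoding in the XML instance".
    return declared_encoding(body) or 'utf-8'
```

With a body labelled Shift_JIS and a bare "text/xml" header, the 'rfc2376'
rule selects us-ascii while the 'proposed' rule selects Shift_JIS; with an
explicit charset parameter both rules agree.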


--
Chris
[1] http://www.w3.org/TR/REC-xml.html#charencoding




From Michael.Kay at icl.com  Mon Mar 22 13:41:35 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:17 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3B6@WWMESS3.172.19.125.2>

> >Do we really need to know about CDATA sections 
> 
> 
> Debatable perhaps, but supported by the DOM. (Anyone know why?)
> But I'd really like to see better SAX/DOM integration, so Yes!
> 

It seems we are trying to provide two views of a document, the reader's view
and the writer's view. The reader's view needs to present roughly what's in
SAX1. The writer's view arguably should preserve all the arbitrary choices
made by the document author, including whether to use CDATA or entity
references or character references, where to put the line breaks, whether to
use empty element syntax, where to put optional spaces, what kind of quotes
to use round attributes, etc, etc. If we are retaining any of this for the
benefit of people who want to edit the document, then logically we should
retain all of it.
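
The two views can be illustrated with Python's xml.sax module, which
exposes a lexical-handler property. This is a sketch of the idea, not the
SAX2 draft under discussion; the class names ReaderView and WriterView are
made up for the example. The reader sees only character data; the writer
additionally learns where a CDATA section began and ended.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class ReaderView(ContentHandler):
    """Collects character data only; CDATA boundaries are invisible."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)

    @property
    def text(self):
        return ''.join(self.chunks)


class WriterView(ContentHandler):
    """Records character data plus the lexical events around it."""

    def __init__(self):
        super().__init__()
        self.events = []

    def characters(self, content):
        self.events.append(('chars', content))

    # Lexical events, delivered only when registered as lexical handler.
    def comment(self, content):
        self.events.append(('comment', content))

    def startCDATA(self):
        self.events.append(('startCDATA',))

    def endCDATA(self):
        self.events.append(('endCDATA',))

    def startDTD(self, name, public_id, system_id):
        self.events.append(('startDTD', name))

    def endDTD(self):
        self.events.append(('endDTD',))


def parse(data, handler, lexical=False):
    """Parse bytes with `handler`; optionally feed it lexical events too."""
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    if lexical:
        parser.setProperty(property_lexical_handler, handler)
    parser.parse(io.BytesIO(data))
    return handler


sample = b'<doc>x <![CDATA[a < b]]> y</doc>'
reader = parse(sample, ReaderView())
writer = parse(sample, WriterView(), lexical=True)
```

Even this writer view is partial: quote style, optional spaces and empty
element syntax are still lost, which is the point made above about
retaining all of it or none.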

Mike Kay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990322/c3c24046/attachment.htm
From LippmannJ at MICROMODELING.COM  Mon Mar 22 14:05:00 1999
From: LippmannJ at MICROMODELING.COM (Lippmann, Jens)
Date: Mon Jun  7 17:10:17 2004
Subject: Tree view from IE5
Message-ID: <1CEC4A85AB34D21181C900A0C9CFE1279A7781@NY_EXCH_01>

Is there a way to "borrow" the stylesheet that creates the XML tree in IE5
for XML files without an attached stylesheet, or is the tree hardcoded into
the msxml.dll?

Jens



From reschke at medicaldataservice.de  Mon Mar 22 15:27:57 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun  7 17:10:17 2004
Subject: SQL queries expressed in XML
Message-ID: <004d01be7478$81d7be90$2e00a8c0@julian>

Hi,

we recently had the idea to use XML to express SQL-like queries (so this is
not about querying XML -- it is about using XML to express queries). It
seems to me that we might not be the first ones; so has anybody defined an
XML document type for expressing SQL queries?


--
Julian Reschke
MedicalData Service GmbH (http://www.medicaldataservice.de)




From jabuss at cessna.textron.com  Mon Mar 22 15:53:08 1999
From: jabuss at cessna.textron.com (Buss, Jason A)
Date: Mon Jun  7 17:10:17 2004
Subject: Article on IE5 support of web standards...
Message-ID: <F7E1775C1C27D211881F00A024B2853046A047@CESS01AMX03>

I just got this in the mail, and this article seems to hit the whole problem
with bringing XML mainstream:  Vendors who pledge their allegiance to any
particular web standard, and then release products with that "close, but not
quite; eventually..." support that just turns people off (me included).  I had
hopes for IE5, tried out the beta, and hoped for the best with the final
release.  Oh, well....

http://www.computerworld.com/home/news.nsf/CWFlash/9903195web

Netscape's turn.....

Thanks to all...

Jason A. Buss
Single Engine Technical Publications
Cessna Aircraft Co.
jabuss@cessna.textron.com
"Webstandards....  eventually....  *sigh*..."




From ldodds at ingenta.com  Mon Mar 22 16:16:49 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:10:17 2004
Subject: SQL queries expressed in XML
In-Reply-To: <004d01be7478$81d7be90$2e00a8c0@julian>
Message-ID: <002001be747f$6613e100$ab20268a@pc-lrd.bath.ac.uk>

> we recently had the idea to use XML to express SQL-like queries
> (so this is
> not about querying XML -- it is about using XML to express queries). It
> seems to me that we might not be the first ones; so has anybody defined an
> XML document type for expressing SQL queries?

And just to widen this question slightly: assuming I do have an XML
representation of a language construct, what's the best way to do the
conversion from the XML representation to the 'correct' language
representation?

Could I use XSL to do this - or would this be going against the grain?

(Just to qualify this I'm relatively new to XML, and *extremely* new to
XSL).
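
For comparison with an XSL approach, here is a rough sketch of doing the
conversion in ordinary code. The query vocabulary (select, column, from,
where) is entirely hypothetical, invented for this example; it is not an
existing document type. Values are emitted as "?" placeholders so that the
XML never injects literal data into the SQL text.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical XML representation of a simple query.
QUERY = """
<select>
  <column>name</column>
  <column>price</column>
  <from>products</from>
  <where column="price" op="&lt;" param="max_price"/>
</select>
"""

ALLOWED_OPS = {'=', '<', '>', '<=', '>=', '<>'}
_IDENTIFIER = re.compile(r'^[A-Za-z_][A-Za-z0-9_]*$')


def _name(value):
    """Accept only plain identifiers as column or table names."""
    value = (value or '').strip()
    if not _IDENTIFIER.match(value):
        raise ValueError('not a plain identifier: %r' % value)
    return value


def to_sql(xml_text):
    """Turn the hypothetical <select> document into (sql, parameter names)."""
    root = ET.fromstring(xml_text)
    if root.tag != 'select':
        raise ValueError('expected a <select> root element')
    columns = [_name(c.text) for c in root.findall('column')]
    table = _name(root.findtext('from'))
    sql = 'SELECT %s FROM %s' % (', '.join(columns), table)
    params = []
    where = root.find('where')
    if where is not None:
        op = where.get('op')
        if op not in ALLOWED_OPS:
            raise ValueError('unsupported operator: %r' % op)
        sql += ' WHERE %s %s ?' % (_name(where.get('column')), op)
        params.append(where.get('param'))
    return sql, params


sql, params = to_sql(QUERY)
print(sql, params)
```

A stylesheet could produce the same text output; the trade-off is that
validation of names and operators, as done here, is easier to express in a
general-purpose language.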

L.




From ricko at allette.com.au  Mon Mar 22 16:29:51 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:17 2004
Subject: IE5 and iso entities
Message-ID: <001301be7481$719e3730$31f96d8c@NT.JELLIFFE.COM.AU>

Make sure you are using a recent version.  There are some versions
around with two bugs:
1) the thetas entity has a spurious "?"
2) another entity has a spurious space in a comment delimiter (-- >)

What are the bogus complaints you are getting?

I will be putting the recent version on http://xml.ascc.net/xml/
in the resources page, as the definitive version.

Rick Jelliffe

-----Original Message-----
From: Eric Hellman <eric@hellman.net>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Monday, 22 March 1999 18:17
Subject: IE5 and iso entities


>Has anyone been able to get IE5 to load full sets of ISO entities?
>
>I'm using slightly modified versions of Rick Jelliffe's XMLized ISO Entity
>tables, and IE5 (release version) seems to always stop loading with bogus
>complaints after the first entity table.
>
>A test document (a technical article describing blue semiconductor lasers,
>if anyone cares) is at http://nsr.mij.mrs.org/4/1/article.xml
>
>Eric
>Eric Hellman
>Openly Informatics, Inc.
>http://www.openly.com/           Tools for 21st Century Scholarly Publishing




From ricko at allette.com.au  Mon Mar 22 16:34:52 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:17 2004
Subject: Is this invalid?
Message-ID: <001d01be7482$34a37290$31f96d8c@NT.JELLIFFE.COM.AU>

 From: Didier PH Martin <martind@netfolder.com>
>Is this markup an invalid XML element?
>
><!AFDR "ISO/IEC 10744:1997">

>In which document is this listed? I know that these are all valid SGML
>markup and there is ISO documents on it, but where can I find W3C documents
>giving that information?

It is not valid XML. The AFDR markup type was invented to signify that
the file was *not* SGML or XML. (What a strange thing to do: it is the
kiss of death. )

Rick Jelliffe




From b.laforge at jxml.com  Mon Mar 22 16:35:15 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:17 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <001b01be7482$0e756e20$c8a8a8c0@thing1>

From: Kay Michael <Michael.Kay@icl.com>
>It seems we are trying to provide two views of a document, the reader's view
>and the writer's view. The reader's view needs to present roughly what's in
>SAX1. The writer's view arguably should preserve all the arbitrary choices
>made by the document author, including whether to use CDATA or entity
>references or character references, where to put the line breaks, whether to
>use empty element syntax, where to put optional spaces, what kind of quotes
>to use round attributes, etc, etc. If we are retaining any of this for the
>benefit of people who want to edit the document, then logically we should
>retain all of it.


Beautiful. I think you have identified two distinct modes of operation. I think
this applies to namespaces as well. A reader has no need for the original prefix,
and namespace processing should remove all the xmlns attributes, while a writer
may wish to preserve the information.

Perhaps we should have a writer feature that we can turn on or off, which will
give us two broadly different modes of operation. Other features may be
turned on or off individually, but the default for those features may well depend
on the use of the parser by a reader or a writer.

This also gives us a way to partition events--an interface for a set of events
should not include both reader and writer events.
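
The namespace half of this can be shown with Python's xml.sax and its
standard namespaces feature, used here as a stand-in for the proposed
reader/writer switch (which is not an existing SAX feature). The Recorder
class and the sample document are assumptions made for the example. With
namespace processing on, the handler gets (URI, local name) pairs and no
xmlns attributes; with it off, it gets the original prefixed name and the
declaration as an ordinary attribute.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class Recorder(ContentHandler):
    """Records each start tag as (element name, list of attribute names)."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def startElement(self, name, attrs):
        # Called when namespace processing is off: names are raw qnames.
        self.elements.append((name, list(attrs.getNames())))

    def startElementNS(self, name, qname, attrs):
        # Called when namespace processing is on: names are (uri, local).
        self.elements.append((name, list(attrs.getNames())))


def parse(data, namespaces):
    """Parse bytes with namespace processing switched on or off."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, namespaces)
    handler = Recorder()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(data))
    return handler.elements


SAMPLE = b'<p:doc xmlns:p="urn:example" id="1"/>'
reader_view = parse(SAMPLE, True)    # prefix and xmlns attribute removed
writer_view = parse(SAMPLE, False)   # prefix and xmlns attribute preserved
```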

Bill




From Mark.Birbeck at iedigital.net  Mon Mar 22 16:49:00 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun  7 17:10:17 2004
Subject: Article on IE5 support of web standards...
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054AB4@SOHOS002>

Of course you could just as easily present it the other way round. We
have been working with MS XML software for about eight months now, but
have yet to work with a NS implementation. We've had to change our XSL
stylesheets twice (not a great problem), and our stuff keeps getting
better (of course I would say that, but I'll let you cast your verdicts
next week). I wish I was able to get that sort of experience and
exposure with other tools and products.

No-one says you have to use non-finalised features. There's no shame in
waiting (no fun either).

Regards,

Mark

> -----Original Message-----
> From: Buss, Jason A 
> Sent: 22 March 1999 15:49
> To: 'xml-dev@ic.ac.uk'
> Subject: Article on IE5 support of web standards...
> 
> 
> I just got this in the mail, and this article seems to hit 
> the whole problem
> with bringing XML mainstream:  Vendors who pledge their 
> allegiance to any
> particular web standard, and then release products with that 
> "close, but not
> quite; eventually..." support that just turns people (me 
> included).  I had
> hopes for IE5, tried out the beta, and hoped for the best 
> with the final
> release.  Oh, well....
> 
> http://www.computerworld.com/home/news.nsf/CWFlash/9903195web
> 
> Netscape's turn.....
> 
> Thanks to all...
> 
> Jason A. Buss
> Single Engine Technical Publications
> Cessna Aircraft Co.
> jabuss@cessna.textron.com
> "Webstandards....  eventually....  *sigh*..."



From larsga at ifi.uio.no  Mon Mar 22 17:00:36 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:18 2004
Subject: Is this invalid?
In-Reply-To: <NBBBJPGDLPIHJGEHAKBACEMMCPAA.martind@netfolder.com>
References: <NBBBJPGDLPIHJGEHAKBACEMMCPAA.martind@netfolder.com>
Message-ID: <wk7ls9o3xf.fsf@ifi.uio.no>


* Didier PH Martin
| 
| Is this markup an invalid XML element?
| 
| <!AFDR "ISO/IEC 10744:1997">

No. It's not an XML element at all, it's an invalid declaration, since
XML does not have AFDR declarations.
 
| IE 5.x XML parser reject this markup as an invalid element
| 
| Declaration has an invalid name. Line 1, Position 3
| 
| <!AFDR "ISO/IEC 10744:1997">

In fact it recognizes it to be a declaration, but rightly complains
that it's not recognized.
 
| <!DOCTYPE....> ???, <!doctype....> ???, both ????

Only DOCTYPE.

| <!SGML...> ????, <!sgml..> ???, both ???

None of these are allowed. From an SGML point of view XML has a fixed
SGML declaration, from the XML point of view there is no such thing as
an SGML declaration.
 
| In which document is this listed? I know that these are all valid
| SGML markup and there is ISO documents on it, but where can I find
| W3C documents giving that information?

The XML specification would be an obvious place to look:

<URL:http://www.w3.org/TR/REC-xml>
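
The answers above can be checked against any conforming XML parser. A
small sketch using the expat bindings shipped with Python (not the IE5
parser); the helper name is_well_formed and the sample documents are made
up for the test, and the SGML declaration is an abbreviated stand-in.

```python
from xml.parsers import expat


def is_well_formed(document):
    """Return True if expat accepts the bytes as a complete XML document."""
    parser = expat.ParserCreate()
    try:
        parser.Parse(document, True)
    except expat.ExpatError:
        return False
    return True


CASES = {
    'AFDR declaration': b'<!AFDR "ISO/IEC 10744:1997"><doc/>',
    'lowercase doctype': b'<!doctype doc><doc/>',
    'SGML declaration': b'<!SGML "ISO 8879:1986"><doc/>',
    'uppercase DOCTYPE': b'<!DOCTYPE doc><doc/>',
}

results = {label: is_well_formed(doc) for label, doc in CASES.items()}
for label, ok in results.items():
    print(label, 'accepted' if ok else 'rejected')
```

Only the uppercase DOCTYPE case should be accepted, since XML names are
case-sensitive and the other declarations are not part of XML.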

--Lars M.




From bckman at ix.netcom.com  Mon Mar 22 17:06:42 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:18 2004
Subject: Tree view from IE5
Message-ID: <00dd01be7486$0bb1e480$9aacdccf@ix.netcom.com>

Hi Jens,
is this what you want?

You can get it by intercepting the stream that your server sends to your
client. This is IE5 beta 2's script and style sheet. I haven't looked at the
IE5 release yet, but I suspect that MS has somehow made it more difficult
to intercept the stream.

The document I sent was <xdoc>some simple text</xdoc>, and this is how it
comes wrapped!


Frank

<HTML><HEAD>
<STYLE>
BODY{font:10pt Verdana}
.c{cursor:hand}
.b{color:red;font-family:FixedSys}
.e{margin-left:1em;text-indent:-1em;margin-bottom:3px}
.k{margin-left:1em;text-indent:-1em;margin-bottom:3px}
.t{color:#444444}
.a{color:#444444}
.tx{font-weight:bold}

.db{text-indent:0px;margin-left:2em;margin-top:0px;margin-bottom:0px;font:9pt Courier;color:#0044BB}
.di{font:9pt Courier;color:#0044BB}
.d{color:#0044BB}
.pi{color:#0044BB}

.cb{text-indent:0px;margin-left:2em;margin-top:0px;margin-bottom:0px;font:9pt Courier;color:#0044BB}
.ci{font:9pt Courier;color:#0044BB}
PRE{margin:0px;display:inline}
</STYLE>
<SCRIPT>
<!--
function initState()
{
var pres=document.all.tags("PRE");
for (var i=0;i<pres.length;i++)
{
var cparent=pres(i).parentElement;
if (cparent.className=="ci")
{
if (cparent.children(0).innerText.indexOf("\n")>0)
{
cparent.className="cb";
cparent.style.display="block";
cparent.parentElement.children(0).className="c";
cparent.parentElement.children(0).children(0).innerText="-";
}
}

if (cparent.className=="di")
{
if (cparent.children(0).innerText.indexOf("\n")>0)
{
cparent.className="db";
cparent.style.display="block";
cparent.parentElement.children(0).className="c";
cparent.parentElement.children(0).children(0).innerText="-";
}
}
}
}

function changeState(e)
{
mark=e.children(0).children(0);

if (mark.innerText=="+")
{
for (var i=1;i<e.children.length;i++)
e.children(i).style.display="block";
mark.innerText="-";
}
else if (mark.innerText=="-")
{
for (var i=1;i<e.children.length;i++)
e.children(i).style.display="none";
mark.innerText="+";
}
}

function changeChunkState(e)
{
mark=e.children(0).children(0);
contents=e.children(1);

if (mark.innerText=="+")
{
if (contents.className=="db"||contents.className=="cb")
contents.style.display="block";
else
contents.style.display="inline";
mark.innerText="-";
}
else if (mark.innerText=="-")
{
contents.style.display="none";
mark.innerText="+";
}
}

function document_click()
{
e=window.event.srcElement;
if (e.className!="c")
{e=e.parentElement;if (e.className!="c")
{
return;
}
}

e=e.parentElement;
if (e.className=="e") changeState(e);
if (e.className=="k") changeChunkState(e);
}
document.onclick=document_click;
--></SCRIPT>
<SCRIPT FOR="window" EVENT="onload">initState();</SCRIPT>
</HEAD>
<BODY class="st"><DIV class="e">
<DIV STYLE="margin-left:1em;text-indent:-2em">
<SPAN class="b">&nbsp;</SPAN>
<SPAN class="t">&lt;xdoc</SPAN><SPAN class="t">&gt;</SPAN><SPAN
class="tx">some simple text</SPAN><SPAN class="t">&lt;/xdoc&gt;</SPAN>
</DIV></DIV>
</BODY>
</HTML>


----- Original Message -----
From: Lippmann, Jens <LippmannJ@MICROMODELING.COM>
To: <xml-dev@ic.ac.uk>
Sent: Monday, March 22, 1999 9:01 AM
Subject: Tree view from IE5


>Is there a way to "borrow" the stylesheet that creates the XML tree in IE5
>for XML files without an attached stylesheet, or is the tree hardcoded into
>the msxml.dll?
>
>Jens




From Matthew.Sergeant at eml.ericsson.se  Mon Mar 22 17:10:16 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:18 2004
Subject: Article on IE5 support of web standards...
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A16E1@EUKBANT101>

Out of interest along these lines, for those that don't already know -
mozilla.org have released M3 (milestone 3) of mozilla. I'd consider this
pre-alpha now (whereas gecko wasn't even that), and it's got some really
neat features - like the whole UI for navigator and mail/news and editor is
built in XML - and you can edit it and change the layout completely. Nice.

However the disappointment is that it appears that they aren't using expat
yet (why???). I tried editing my .xul file to contain this:

<html:h1>Hello World!!!<blah></h1></blah>

Obviously totally wrong. But it parsed it and displayed an <h1> just fine.
No error messages. I'm submitting a bug report now.
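
For comparison, here is a minimal sketch of what a conforming parser does
with that line, using the expat binding from Python's standard library. The
helper name check_well_formed is made up for this illustration and has
nothing to do with Mozilla's code.

```python
import xml.parsers.expat


def check_well_formed(text):
    """Return None if expat accepts the text, else the expat error message."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        # The second argument marks this chunk as the final one.
        parser.Parse(text, True)
    except xml.parsers.expat.ExpatError as err:
        return str(err)
    return None


bad = "<html:h1>Hello World!!!<blah></h1></blah>"
good = "<h1>Hello World!!!<blah/></h1>"

print(check_well_formed(bad))   # an error message: the end tags do not match
print(check_well_formed(good))  # None
```

The parser is created without namespace processing, so html:h1 is treated
as an ordinary element name and the failure comes purely from the
mismatched end tags.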

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V 
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

> -----Original Message-----
> From:	Buss, Jason A [SMTP:jabuss@cessna.textron.com]
> Sent:	Monday, March 22, 1999 3:49 PM
> To:	'xml-dev@ic.ac.uk'
> Subject:	Article on IE5 support of web standards...
> 
> I just got this in the mail, and this article seems to hit the whole
> problem with bringing XML mainstream:  Vendors who pledge their
> allegiance to any particular web standard, and then release products
> with that "close, but not quite; eventually..." support that just turns
> people off (me included).  I had hopes for IE5, tried out the beta, and
> hoped for the best with the final release.  Oh, well....
> 
> http://www.computerworld.com/home/news.nsf/CWFlash/9903195web
> 
> Netscape's turn.....
> 
> Thanks to all...
> 
> Jason A. Buss
> Single Engine Technical Publications
> Cessna Aircraft Co.
> jabuss@cessna.textron.com
> "Webstandards....  eventually....  *sigh*..."



From Tim.Shaw at wdr.com  Mon Mar 22 17:18:07 1999
From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com)
Date: Mon Jun  7 17:10:18 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <001b01be7482$0e756e20$c8a8a8c0@thing1>
Message-ID: <H0000586017ea457@MHS>

     
Perilously close to 'Design Time' flag (a la Beans) - neat idea, but not quite 
as simple as it at first appears. Remember I (a 'writer') may want to see what 
my 'reader' sees in my 'IDE'.

As I say, neat idea

Gluck

tim 

______________________________ Reply Separator _________________________________
Subject: Re: SAX2 RFD: LexicalHandler draft v.1.1
Author:  b.laforge (b.laforge@jxml.com) at unix,mime
Date:    22/03/99 16:35


From: Kay Michael <Michael.Kay@icl.com>
>It seems we are trying to provide two views of a document, the reader's view 
>and the writer's view. The reader's view needs to present roughly what's in 
>Snip< 

>Response snipped<
Perhaps we should have a writer feature that we can turn on or off, which will 
give us two broadly different modes of operation. Other features may be
turned on or off individually, but the default for those features may well
depend on the use of the parser by a reader or a writer.
     
This also gives us a way to partition events--an interface for a set of events 
should not include both reader and writer events.
     
Bill




From Michael.Kay at icl.com  Mon Mar 22 17:32:42 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:18 2004
Subject: SQL queries expressed in XML
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3B8@WWMESS3.172.19.125.2>

> 
> we recently had the idea to use XML to express SQL-like queries (so this
> is not about querying XML -- it is about using XML to express queries).
> It seems to me that we might not be the first ones; so has anybody
> defined an XML document type for expressing SQL queries?
> 
I've thought about the question and some of my thoughts are implemented in
SAXON's SQLStyleSheet, which is the beginnings of an XSL extension to allow
a stylesheet to update an RDBMS with data from an XML source document.

As always in this area the first problem is deciding how much of the syntax
should be "angle brackets" and how much should be rules for the content of
elements/attributes. The answer to that depends on tradeoffs between
different modes of use. So the question is, who is going to use it, and what
for?
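
To make the tradeoff concrete, here is a rough sketch of the two extremes,
read with Python's standard ElementTree module. Both vocabularies (query,
select, column, from, table, where, lt) are invented for this illustration;
they are not SAXON's SQLStyleSheet syntax or any existing DTD.

```python
import xml.etree.ElementTree as ET

# Style A: almost no markup; the SQL syntax lives in element content.
OPAQUE = "<query>SELECT title FROM books WHERE price &lt; 20</query>"

# Style B: the same query with its structure in angle brackets.
STRUCTURED = """
<query>
  <select><column name="title"/></select>
  <from><table name="books"/></from>
  <where><lt column="price" value="20"/></where>
</query>
"""

opaque = ET.fromstring(OPAQUE)
structured = ET.fromstring(STRUCTURED)

# Style A hands the consumer one string to pass straight to the database.
sql_text = opaque.text

# Style B lets a consumer inspect the query without an SQL parser.
tables = [t.get("name") for t in structured.iter("table")]

print(sql_text)  # SELECT title FROM books WHERE price < 20
print(tables)    # ['books']
```

Style A is easier to write by hand; style B is easier for software that
needs to look inside the query.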

In particular if you are interested in queries, what are you planning to do
with the results? Print them out, merge them into the DOM representation of
the document, or what? 

Mike Kay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990322/e429f26a/attachment.htm
From srn at techno.com  Mon Mar 22 18:01:39 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:18 2004
Subject: Is this invalid?
In-Reply-To: <001d01be7482$34a37290$31f96d8c@NT.JELLIFFE.COM.AU>
	(ricko@allette.com.au)
References: <001d01be7482$34a37290$31f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <199903221750.LAA06301@bruno.techno.com>


[Rick Jelliffe:]

> From: Didier PH Martin <martind@netfolder.com>
> >Is this markup an invalid XML element?
> >
> ><!AFDR "ISO/IEC 10744:1997">
> > In which document is this listed? I know that these are
> > all valid SGML markup and there are ISO documents on it,
> > but where can I find W3C documents giving that
> > information?

> It is not valid XML. The AFDR markup type was invented to
> signify that the file was *not* SGML or XML. (What a
> strange thing to do: it is the kiss of death. )

Correction: The purpose of the <!AFDR... declaration is to
alert the SGML parser that, in the subsequent DTD, certain
extensions to the syntax of SGML DTDs are used.  These
extensions are standardized in the "Architectural Forms
Definition Requirements" (which is a full-fledged ISO standard
in the SGML family of standards: A.3 of ISO/IEC 10744:1997).
Far from being "*not* SGML", it *is* SGML, except in the
technical sense that this extension hasn't yet been
incorporated, officially, in a long-awaited revision of ISO
8879.

The <!AFDR... syntax extensions are minor and it's easy (but
redundant) to do without them and still use architectural
forms at full power.  The main extension is that there can
be more than one <!ATTLIST... that contributes attribute
definitions to a single element type.  The other extension
is related to the first: there is a bogus element type
called "#ALL" that can be used in <!ATTLISTs to add
attribute definitions to all element types in the DTD.  This
is merely an easier, more maintainable, and clearer way of
adding "common" attribute definitions to every element.

As for <!AFDR... being the "kiss of death", I couldn't agree
less.  The primary SGML parser in industrial use today is
SP, and SP both recognizes <!AFDR... *and* fully supports
these minor syntax extensions.

Rick is correct in saying that <!AFDR... isn't part of any
W3C Recommendation for XML (at least as far as I know).

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From jabuss at cessna.textron.com  Mon Mar 22 18:26:05 1999
From: jabuss at cessna.textron.com (Buss, Jason A)
Date: Mon Jun  7 17:10:18 2004
Subject: XSL stylesheets (was Article on IE5 support of web standards.
	..)
Message-ID: <F7E1775C1C27D211881F00A024B2853046A049@CESS01AMX03>

XSL stylesheets?  Is there some software out there (other than notepad) for
creating XSL stylesheets that I haven't come across?

I have used Arbortext's XML styler, but they aren't supporting or revising
it anymore, so if anyone knows of a tool that aids in creating XSL
stylesheets, please point me towards it.  The transformation part of the
spec seems fairly well supported, so I'm looking for something that
handles formatting only, or something that supports both.  I had heard
something a while back about the XSL WG splitting the XSL draft, to
separate the formatting and the transformation parts of the draft.  Has
anyone else heard anything like that?

Thanks...

Jason A. Buss
Single Engine Technical Publications
Cessna Aircraft Co.
jabuss@cessna.textron.com


> -----Original Message-----
> From:	Mark Birbeck [SMTP:Mark.Birbeck@iedigital.net]
> Sent:	Monday, March 22, 1999 10:49 AM
> To:	'xml-dev@ic.ac.uk'
> Subject:	RE: Article on IE5 support of web standards...
> 
> Of course you could just as easily present it the other way round. We
> have been working with MS XML software for about eight months now, but
> have yet to work with a NS implementation. We've had to change our XSL
> stylesheets twice (not a great problem), and our stuff keeps getting
> better (of course I would say that, but I'll let you cast your verdicts
> next week). I wish I was able to get that sort of experience and
> exposure with other tools and products.
> 
> No-one says you have to use non-finalised features. There's no shame in
> waiting (no fun either).
> 
> Regards,
> 
> Mark
> 
> 



From andrew at squiz.co.nz  Mon Mar 22 18:27:39 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:10:18 2004
Subject: SQL queries expressed in XML 
In-Reply-To: Your message of "Mon, 22 Mar 1999 16:16:53 GMT."
             <002001be747f$6613e100$ab20268a@pc-lrd.bath.ac.uk> 
Message-ID: <199903221826.SAA08052@aniwa.sky>

> > we recently had the idea to use XML to express SQL-like queries
> > (so this is not about querying XML -- it is about using XML to
> > express queries). It seems to me that we might not be the first ones;
> > so has anybody defined an XML document type for expressing SQL
> > queries?
> 
> And just to widen this question slightly - assuming I do have an XML
> representation of a language construct - what's the best way to do the
> conversion from the XML representation to the 'correct' language
> representation?
> 
> Could I use XSL to do this - or would this be going against the grain?
> 
> (Just to qualify this I'm relatively new to XML, and *extremely* new to
> XSL).

XSL doesn't seem to do very well where the desired output is not well-formed
XML.  If your SQL queries have '"', '<', '>' or '&' in them, then you're going
to start getting into kludges.  Perl or DSSSL would be better suited to the task.
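
As a rough sketch of the non-XSL route (in Python rather than perl or
DSSSL, purely for illustration), a small query vocabulary can be
serialized to SQL text directly, so that '<' and '&' come out as plain
characters instead of entity references. The element names select, column,
table and where are invented here and are not an existing document type.

```python
import xml.etree.ElementTree as ET

# Hypothetical query document; the WHERE clause is kept as plain text.
QUERY = """
<select>
  <column>title</column>
  <column>price</column>
  <table>books</table>
  <where>price &lt; 20 AND publisher = 'O&amp;R'</where>
</select>
"""


def to_sql(xml_text):
    """Turn the hypothetical <select> document into one SQL string."""
    root = ET.fromstring(xml_text)
    columns = ", ".join(c.text.strip() for c in root.findall("column"))
    table = root.findtext("table").strip()
    sql = "SELECT {} FROM {}".format(columns, table)
    where = root.findtext("where")
    if where and where.strip():
        # The parser has already turned &lt; and &amp; back into < and &.
        sql += " WHERE " + where.strip()
    return sql


print(to_sql(QUERY))
# SELECT title, price FROM books WHERE price < 20 AND publisher = 'O&R'
```

The sketch does no validation or quoting of the pieces it concatenates, so
it is only suitable for query documents from a trusted source.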

*Why* do you want to put your queries into XML?  Do you need access to the 
structure of your queries?  Perhaps you just need something that can be 
embedded comfortably in your XML documents.  What you are trying to achieve is 
likely to affect how you approach the problem.


I've got a problem to tackle soon which provides an example of a reason one 
might want to have queries in an XML format, and the implications it has for 
encoding of my queries.  It may be that others are doing similar stuff - if so
I'd like to hear about it.

I have a steady flow of news material coming through my site.  I have 
subscribers who receive material filtered from this according to custom 
preferences.  Whenever a story comes through I need my system to turn around 
several thousand queries within a few minutes at worst (while not unduly 
slowing my web server). I want to offer more flexible customization than I 
have at present.

Basically what I need to do is to invert the problem and define a query based 
on the story data which can be applied to the stored queries to find the set 
of queries which the story matches.  (Did that make sense?)

XML expression of queries appeals since it facilitates interchanging of 
queries and data.  The XML query languages I'm aware of don't seem helpful 
though, as they tend to store query expressions as CDATA and don't expose the
query structure.

The sort of queries I want to do are boolean logic queries.  Primitives I need 
are literal specification of element content or attribute content, or 
containment of particular words within the element contents.  Extensions of 
this boolean model might include stemming (reasonably likely) and use of term 
weighting (probably not).  These are amply discussed in the Information 
Retrieval literature for those who don't know about them.

I figure any boolean query can be expressed as a decision tree terminating in
true or false leaf nodes, and that such a tree maps well into XML.  Existing
tools (e.g. sgrep) should then be able to search the stored trees for the
queries matching a given document.  I believe this could lead to a relatively
simple processing model, but it remains to be seen how efficient it will be.
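
A minimal sketch of the decision-tree idea, assuming a made-up encoding in
which each test element names a field and a condition and carries a yes
and a no branch ending in a true or false leaf. The element and attribute
names (test, field, contains, equals, yes, no) are hypothetical, and word
containment is reduced to a whitespace split.

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of:
#   (headline contains "xml") AND NOT (section = "sport")
QUERY = """
<test field="headline" contains="xml">
  <yes>
    <test field="section" equals="sport">
      <yes><false/></yes>
      <no><true/></no>
    </test>
  </yes>
  <no><false/></no>
</test>
"""


def matches(node, story):
    """Walk the decision tree until a <true/> or <false/> leaf is reached."""
    if node.tag == "true":
        return True
    if node.tag == "false":
        return False
    value = story.get(node.get("field"), "")
    if node.get("equals") is not None:
        outcome = value == node.get("equals")
    else:
        outcome = node.get("contains").lower() in value.lower().split()
    # Follow the branch chosen by the test; its only child is the next node.
    branch = node.find("yes" if outcome else "no")
    return matches(branch[0], story)


tree = ET.fromstring(QUERY)
story = {"headline": "W3C publishes new XML draft", "section": "technology"}
print(matches(tree, story))  # True
```

This evaluates one stored query against one story, which is the naive
direction. Running it over several thousand stored queries per story is not
the inverted search described above, only a way to pin down what each
stored tree means.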

If anyone is aware of any relevant work that is being or has been done I'd 
appreciate hearing about it.  XML or otherwise.

Andrew McNaughton


-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From chris at w3.org  Mon Mar 22 18:30:11 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:18 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au>
Message-ID: <36F68B71.5967A029@w3.org>


Marcus Carr wrote:
> 
> Chris Lilley wrote:
> 
> [a number of sideways kicks at SGML, then:]
(Generally deserved, I thought)

> > There are significant portions of the old SGML community working to
> > improve XML and to help build the missing parts which are needed. I have
> > a lot of respect for that portion. There are, as you say, other parts
> > which are merely trying to save their own highly paid jobs as priests of
> > complex, low-powered technology. One can usually tell the difference by
> > noting that the former portion have their eyes open.

> Spare me. The biggest driving factor behind people working in SGML 
> is the fact that there are clients who want work done. 

Uh, this is actually a fairly big driver for people working in XML too.

> SGML is neither complex nor low-powered, as numerous defence,
> telcos, legal publishers, stock exchanges, aircraft manufacturers,
> automotive companies, etc. can attest. 

I'm not saying that it's impossible to get value from it, or that it is
without power. But it is significantly underpowered in some ways, and
pays too big a price in parsing complexity for minor keystroke savings,
and the original design constraints don't necessarily apply to today's
applications, which is why I see XML as more powerful than SGML, not
less, in spite of being (now) a subset of SGML.


> Generalisations of the participants such as those above create friction
> between the XML and SGML camps and reveal an innate lack of understanding
> about the relationship between the two. I will thank you not to
> categorise me as either a "good XML groupie" or a "garden gnome".

;-()

Well if you are an SGML user who is not

a) involved in furthering the XML effort, or
b) involved in slowing down the XML effort

then I didn't categorise you at all, since I was speaking of only two
particular portions of the "old SGML community". There are, of course
other portions; and there are, of course, other communities.

--
Chris



From SMUENCH at us.oracle.com  Mon Mar 22 18:43:29 1999
From: SMUENCH at us.oracle.com (Steve Muench)
Date: Mon Jun  7 17:10:18 2004
Subject: Tree view from IE5
Message-ID: <199903221843.KAA00573@usmail04>

IE5 uses a built-in stylesheet for this. It's not  
something that's meant to be encrypted in any way or 
hard to "capture". :-) 
 
From the Microsoft reference material on IE5, you
find:
 
|  When using an XSL style sheet, you can access the 
|  XML source through the XML Document Object Model 
|  (DOM). Two additional properties are exposed on 
|  the document object from DHTML: 
| 
|  document.XMLDocument 
|  document.XSLDocument 
| 
|  The XMLDocument property returns the root of the 
|  XML source tree, and the XSLDocument property 
|  returns the root of the XSL style sheet. 
 
By creating a two-frame frameset with an XML file 
being browsed in the left frame and some Javascript 
in the right frame, you can write out the value of:
 
  parent.leftframe.document.XSLDocument.xml 
 
to a text file. A little cryptic, but definitely a
learning tool. See below...
 
Have fun. 
 
____________________________________________________________ 
Steve Muench, Consulting Product Manager & XML Evangelist 
Java Business Objects Dev't Team - http://www.oracle.com/xml 
 
=/ Include /= 
 
<x:stylesheet xmlns:x="http://www.w3.org/TR/WD-xsl" 
              xmlns:dt="urn:schemas-microsoft-com:datatypes" 
              xmlns:d2="uuid:C2F41010-65B3-11d1-A29F-00AA00C14882">  
<x:template match="/"> 
<HTML> 
<HEAD> 
<STYLE> 
BODY{font:x-small 'Verdana';margin-right:1.5em} 
.c{cursor:hand} 
.b{color:red;font-family:'Courier New';font-weight:bold;text-decoration:none} 
.e{margin-left:1em;text-indent:-1em;margin-right:1em} 
.k{margin-left:1em;text-indent:-1em;margin-right:1em} 
.t{color:#990000} 
.xt{color:#990099} 
.ns{color:red} 
.dt{color:green} 
.m{color:blue} 
.tx{font-weight:bold} 

.db{text-indent:0px;margin-left:1em;margin-top:0px;margin-bottom:0px;padding-left:.3em;border-left:1px solid #CCCCCC;font:small Courier} 
.di{font:small Courier} 
.d{color:blue} 
.pi{color:blue} 

.cb{text-indent:0px;margin-left:1em;margin-top:0px;margin-bottom:0px;padding-left:.3em;font:small Courier;color:#888888} 
.ci{font:small Courier;color:#888888}PRE{margin:0px;display:inline}</STYLE>    
<SCRIPT> 
<x:comment> 
function f(e){if (e.className=="ci"){if 
(e.children(0).innerText.indexOf("\n")&gt;0) fix(e,"cb");} 
if (e.className=="di"){if (e.children(0).innerText.indexOf("\n")&gt;0) 
fix(e,"db");} 
e.id="";} 
function 
fix(e,cl){e.className=cl;e.style.display="block";j=e.parentElement.children(0);
j.className="c";k=j.children(0);k.style.visibility="visible";k.href="#";} 
function ch(e){mark=e.children(0).children(0);if 
(mark.innerText=="+"){mark.innerText="-";for (var 
i=1;i&lt;e.children.length;i++)e.children(i).style.display="block";} 
else if (mark.innerText=="-"){mark.innerText="+";for (var 
i=1;i&lt;e.children.length;i++)e.children(i).style.display="none";} 
} 
function ch2(e){mark=e.children(0).children(0);contents=e.children(1);if 
(mark.innerText=="+"){mark.innerText="-";if 
(contents.className=="db"||contents.className=="cb")contents.style.display="block";else contents.style.display="inline";} 
else if 
(mark.innerText=="-"){mark.innerText="+";contents.style.display="none";} 
} 
function cl(){e=window.event.srcElement;if 
(e.className!="c"){e=e.parentElement;if (e.className!="c"){return;} 
} 
e=e.parentElement;if (e.className=="e") ch(e);if (e.className=="k") ch2(e);} 
function ex(){} 
function h(){window.status=" ";} 
document.onclick=cl;</x:comment></SCRIPT>   </HEAD>   <BODY class="st"> 
<x:apply-templates/> 
</BODY> 
</HTML>  
</x:template>  
<x:template match="node()[nodeType()=10]">  <DIV class="e"><SPAN>    <SPAN 
class="b"><x:entity-ref name="nbsp"/></SPAN>    <SPAN class="d">&lt;!DOCTYPE 
<x:node-name/><I> (View Source for full doctype...)</I>&gt;</SPAN>   
</SPAN></DIV>  
</x:template>  
<x:template match="pi()">  <DIV class="e">   <SPAN class="b"><x:entity-ref 
name="nbsp"/></SPAN>   <SPAN class="m">&lt;?</SPAN><SPAN 
class="pi"><x:node-name/>    <x:value-of/></SPAN><SPAN class="m">?&gt;</SPAN>  
</DIV>  
</x:template>  
<x:template match="pi('xml')">  <DIV class="e">   <SPAN 
class="b"><x:entity-ref name="nbsp"/></SPAN>   <SPAN 
class="m">&lt;?</SPAN><SPAN class="pi">xml <x:for-each 
select="@*"><x:node-name/>="<x:value-of/>" </x:for-each></SPAN><SPAN 
class="m">?&gt;</SPAN>  </DIV>  
</x:template>  
<x:template match="@*" xml:space="preserve"><SPAN><x:attribute 
name="class"><x:if match="x:*/@*">x</x:if>t</x:attribute> 
<x:node-name/></SPAN><SPAN class="m">="</SPAN><B><x:value-of/></B><SPAN 
class="m">"</SPAN> 
</x:template>  
<x:template match="@xmlns:*|@xmlns|@xml:*"><SPAN class="ns">   
<x:node-name/></SPAN><SPAN class="m">="</SPAN><B 
class="ns"><x:value-of/></B><SPAN class="m">"</SPAN> 
</x:template>  
<x:template match="@dt:*|@d2:*"><SPAN class="dt">   <x:node-name/></SPAN><SPAN 
class="m">="</SPAN><B class="dt"><x:value-of/></B><SPAN class="m">"</SPAN> 
</x:template>  
<x:template match="textnode()">  <DIV class="e">   <SPAN 
class="b"><x:entity-ref name="nbsp"/></SPAN>   <SPAN 
class="tx"><x:value-of/></SPAN>  </DIV>  
</x:template>  
<x:template match="comment()">  <DIV class="k">   <SPAN><A class="b" 
onclick="return false" onfocus="h()" STYLE="visibility:hidden">-</A>    <SPAN 
class="m">&lt;!--</SPAN></SPAN>   <SPAN id="clean" 
class="ci"><PRE><x:value-of/></PRE></SPAN>   <SPAN class="b"><x:entity-ref 
name="nbsp"/></SPAN>   <SPAN class="m">--&gt;</SPAN>   
<SCRIPT>f(clean);</SCRIPT></DIV>  
</x:template>  
<x:template match="cdata()">  <DIV class="k">   <SPAN><A class="b" 
onclick="return false" onfocus="h()" STYLE="visibility:hidden">-</A>    <SPAN 
class="m">&lt;![CDATA[</SPAN></SPAN>   <SPAN id="clean" 
class="di"><PRE><x:value-of/></PRE></SPAN>   <SPAN class="b"><x:entity-ref 
name="nbsp"/></SPAN>   <SPAN class="m">]]&gt;</SPAN>   
<SCRIPT>f(clean);</SCRIPT></DIV>  
</x:template>  
<x:template match="*">  <DIV class="e"><DIV 
STYLE="margin-left:1em;text-indent:-2em">    <SPAN class="b"><x:entity-ref 
name="nbsp"/></SPAN>    <SPAN class="m">&lt;</SPAN><SPAN><x:attribute 
name="class"><x:if match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN>    
<x:apply-templates select="@*"/><SPAN class="m"> /&gt;</SPAN>   </DIV></DIV>  
</x:template>  
<x:template match="*[node()]">  <DIV class="e">   <DIV class="c"><A href="#" 
onclick="return false" onfocus="h()" class="b">-</A>    <SPAN 
class="m">&lt;</SPAN><SPAN><x:attribute name="class"><x:if 
match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN><x:apply-templates 
select="@*"/>    <SPAN class="m">&gt;</SPAN></DIV>   <DIV><x:apply-templates/> 
   <DIV><SPAN class="b"><x:entity-ref name="nbsp"/></SPAN>     <SPAN 
class="m">&lt;/</SPAN><SPAN><x:attribute name="class"><x:if 
match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN><SPAN 
class="m">&gt;</SPAN></DIV>   </DIV></DIV>  
</x:template>  
<x:template match="*[textnode()$and$$not$(comment()$or$pi()$or$cdata())]">  
<DIV class="e"><DIV STYLE="margin-left:1em;text-indent:-2em">    <SPAN 
class="b"><x:entity-ref name="nbsp"/></SPAN>    <SPAN 
class="m">&lt;</SPAN><SPAN><x:attribute name="class"><x:if 
match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN><x:apply-templates 
select="@*"/>    <SPAN class="m">&gt;</SPAN><SPAN 
class="tx"><x:value-of/></SPAN><SPAN class="m">&lt;/</SPAN><SPAN><x:attribute 
name="class"><x:if 
match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN><SPAN 
class="m">&gt;</SPAN>   </DIV></DIV>  
</x:template>  
<x:template match="*[*]">  <DIV class="e">   <DIV class="c" 
STYLE="margin-left:1em;text-indent:-2em"><A href="#" onclick="return false" 
onfocus="h()" class="b">-</A>    <SPAN class="m">&lt;</SPAN><SPAN><x:attribute 
name="class"><x:if 
match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN><x:apply-templates 
select="@*"/>    <SPAN class="m">&gt;</SPAN></DIV>   <DIV><x:apply-templates/> 
   <DIV><SPAN class="b"><x:entity-ref name="nbsp"/></SPAN>     <SPAN 
class="m">&lt;/</SPAN><SPAN><x:attribute name="class"><x:if 
match="x:*">x</x:if>t</x:attribute><x:node-name/></SPAN><SPAN 
class="m">&gt;</SPAN></DIV>   </DIV></DIV>  
</x:template> 
</x:stylesheet>
-------------- next part --------------
An embedded message was scrubbed...
From: "Frank Boumphrey" <bckman@ix.netcom.com>
Subject: Re: Tree view from IE5
Date: 22 Mar 99 09:04:11
Size: 6082
Url: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990322/dce58ff7/attachment.eml
From tgraham at mulberrytech.com  Mon Mar 22 19:08:40 1999
From: tgraham at mulberrytech.com (Tony Graham)
Date: Mon Jun  7 17:10:18 2004
Subject: XSL stylesheets (was Article on IE5 support of web standards.)
In-Reply-To: <F7E1775C1C27D211881F00A024B2853046A049@CESS01AMX03>
References: <F7E1775C1C27D211881F00A024B2853046A049@CESS01AMX03>
Message-ID: <14070.20236.520000.566990@menteith.com>

At 22 Mar 1999 12:25 -0600, Buss, Jason A wrote:
 > XSL stylesheets?  Is there some software out there (other than notepad) for
 > creating XSL stylesheets that I haven't come across?

See http://www.mulberrytech.com/xsl/xslide/ for information on my XSL
mode for Emacs.  I haven't finished updating it to match the current
working draft, but it's still going to be better than using notepad.

Regards,


Tony Graham
======================================================================
Tony Graham                            mailto:tgraham@mulberrytech.com
Mulberry Technologies, Inc.                http://www.mulberrytech.com
17 West Jefferson Street                    Direct Phone: 301/315-9632
Suite 207                                          Phone: 301/315-9631
Rockville, MD  20850                                 Fax: 301/315-8285
----------------------------------------------------------------------
  Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================




From david at megginson.com  Mon Mar 22 19:22:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:19 2004
Subject: Expat and Mozilla (was RE: Article on IE5 support of web standards...)
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A800022A16E1@EUKBANT101>
References: <5F052F2A01FBD11184F00008C7A4A800022A16E1@EUKBANT101>
Message-ID: <14070.38971.971587.407210@localhost.localdomain>

Matthew Sergeant (EML) writes:

 > However the disappointment is that it appears that they aren't using expat
 > yet (why???).

If that's the case, it's probably because Expat would (correctly)
throw out half of the pseudo-XML that the other Mozilla modules
generate.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From roddey at us.ibm.com  Mon Mar 22 20:57:01 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:19 2004
Subject: SAX2: LexicalHandler draft v.1.1
Message-ID: <8725673C.0072EEF5.00@d53mta03h.boulder.ibm.com>


>public interface LexicalHandler
>{
>    public abstract void xmlDecl (String version,
>                     String encoding,
>                     String standalone)
>    throws SAXException;
>

Some of this stuff I've already dealt with in the internal event APIs of
the new IBM parser, so I'd like to throw in a couple of points here.... and
hopefully some of this is not off the actual topic, since I've been too
busy to follow this thread as closely as I should have. If some of this
really applies to another thread, then assume I really wrote it there :-)


1) The xmlDecl() needs another parameter. In addition to the encoding
string, which is the exact text of the string in the document, some
customers need to know what the actual encoding is (which might have been
auto-sensed.) They need this in some cases to get the document back to the
original encoding. So there should be an 'actualEncoding' parameter which
is either the same as encoding (if there was an encoding string in the
document) or the actual encoding used if not (probably in some canonical
format, since there are only about 6 auto-sensed encodings right?)

2) I gave the comment, PI, and whitespace callbacks on the DTD handler
different names from the ones on the document handler. This is somewhat
safer in C++, since it means not having a single method override two pure
virtuals from a mixin. It also allows the handler to be less stateful when
the same object implements the handler for both document and DTD (since it
then knows that a callback is for one or the other without having to keep
flags for that stuff, which is not really a biggie but I thought it was
worth it.)

3) I report whitespace in the DTD, so that it can also be pretty much
exactly recreated. I only report this if I'm asked to (by an 'advanced
callbacks' flag, which also controls comments and PIs being reported from
the DTD.)

4) I have events for the begin/end of the internal subset.

5) I have a callback for notation decl, attlist decls, and attdefs, which
are important.

6) I have a flag on each entity, element, etc... decl callback called
'isIgnored'. This lets the caller know that this one was ignored because it
was a subsequent instance of a previously declared decl. So they don't need
to keep it if they just care about actual content, but they do if they want
to recreate the original document (which is extremely important to some
folks.)

7) I haven't done this yet, but some customers are insisting that any event
callback that reports a quoted string indicate whether single or double
quotes were used (again for recreation of the original document.) This
seems a bit over the top to me, since they are equivalent, but I guess the
customer is always right even when he's wrong.
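
A minimal Java sketch of points 1 and 6, under stated assumptions: the
interface name ExtendedLexicalHandler, the actualEncoding and isIgnored
parameters, and the Recorder class are hypothetical illustrations. They
are not part of SAX or of the IBM parser's internal API.

```java
import java.util.ArrayList;
import java.util.List;

class LexicalSketch {
    /** Hypothetical callback set; not part of SAX or of any shipping parser. */
    interface ExtendedLexicalHandler {
        // declaredEncoding: literal text from the XML declaration, or null if absent.
        // actualEncoding: what the parser really used (possibly auto-sensed).
        void xmlDecl(String version, String declaredEncoding,
                     String actualEncoding, String standalone);

        // isIgnored: true when an earlier declaration of the same name already applies.
        void entityDecl(String name, String value, boolean isIgnored);
    }

    /** Records every event, including ignored declarations, so the source could be rebuilt. */
    static class Recorder implements ExtendedLexicalHandler {
        final List<String> log = new ArrayList<>();

        public void xmlDecl(String version, String declaredEncoding,
                            String actualEncoding, String standalone) {
            log.add("xmlDecl declared=" + declaredEncoding + " actual=" + actualEncoding);
        }

        public void entityDecl(String name, String value, boolean isIgnored) {
            log.add("entity " + name + " ignored=" + isIgnored);
        }
    }

    /** Simulates a document with no encoding declaration and a duplicate entity declaration. */
    static List<String> demo() {
        Recorder r = new Recorder();
        r.xmlDecl("1.0", null, "UTF-8", null);
        r.entityDecl("copy", "(c)", false);
        r.entityDecl("copy", "Copyright", true);
        return r.log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

A caller that only cares about content can skip events with isIgnored set;
a caller that wants to write the original document back out keeps them.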


That's all I can think of right now. It would really be nice if we could
map all of the information that we go through the trouble (and overhead) of
parsing to public APIs. Otherwise, customers end up using our internal
event API in order to get the information that they require. This locks
down our internal API more than we'd like, but there is little we can do
about it if they *have* to have this extra info to do what they do.




From roddey at us.ibm.com  Mon Mar 22 21:01:06 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:19 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <8725673C.007350D8.00@d53mta03h.boulder.ibm.com>


>From: Lars Marius Garshol <larsga@ifi.uio.no>
>Date: 21 Mar 1999 18:14:13 +0100
>Subject: Re: SAX2 RFD: LexicalHandler draft v.1.1
>
>* David Megginson
>|
>|     public abstract void xmlDecl (String version,
>|                    String encoding,
>|                    String standalone)
>|   throws SAXException;
>
>Should we perhaps make standalone a boolean instead?  It can only have
>two values anyway, and this will spare us a lot of
>standalone.equals(this or that).
>

I did that at first with my internal event APIs, but it didn't work out.
There is then no way of knowing whether the document *really* said yes or
no, or whether it was just not there at all and the default was used. This
prevents the recreation of the original document.
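
A minimal Java sketch of the difference, assuming a String parameter where
null means "not present in the source". The class and method names are
hypothetical. An absent declaration is treated here as "no", the default
mentioned above.

```java
class StandaloneSketch {
    /** Rebuilds the XML declaration exactly as reported; null parts are left out. */
    static String rebuildDecl(String version, String encoding, String standalone) {
        StringBuilder sb = new StringBuilder("<?xml version=\"").append(version).append('"');
        if (encoding != null) {
            sb.append(" encoding=\"").append(encoding).append('"');
        }
        if (standalone != null) {
            sb.append(" standalone=\"").append(standalone).append('"');
        }
        return sb.append("?>").toString();
    }

    /** Effective value: only an explicit "yes" makes the document standalone. */
    static boolean isStandalone(String standalone) {
        return "yes".equals(standalone);
    }

    public static void main(String[] args) {
        // Both inputs have the same effective value but different source text.
        System.out.println(rebuildDecl("1.0", null, null));
        System.out.println(rebuildDecl("1.0", null, "no"));
    }
}
```

With a boolean parameter the two calls in main would be indistinguishable,
so the second declaration could not be written back as it appeared.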




From donpark at quake.net  Mon Mar 22 21:49:36 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:19 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <00ff01be74ad$c71eeed0$2ee044c6@arcot-main>

>  public interface AttributeValueHandler
>  {
>    public abstract void startEntity (String name)
>      throws SAXException;
>    public abstract void endEntity (String name)
>      throws SAXException;
>    public abstract void characters (char ch[], int start, int length)
>      throws SAXException;
>  }
>
>  public interface AttributeValue2 extends AttributeValue
>  {
>    public abstract boolean isSpecified (String name);
>    public abstract void accept (AttributeValueHandler handler)
>      throws SAXException;
>  }


David,

I don't think an event-based interface is appropriate for this purpose.  Why
not introduce an iterator or an array-like interface?
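
A minimal Java sketch of what such an interface might look like. All names
here (Piece, AttributeValuePieces, flatten, unexpanded) are hypothetical and
are not part of any SAX draft. Each piece of an attribute value is either
literal text or text that came from a named entity.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class AttributePiecesSketch {
    /** One fragment of an attribute value. */
    static final class Piece {
        final String entityName; // null for literal text
        final String text;       // the (expanded) characters of this fragment

        Piece(String entityName, String text) {
            this.entityName = entityName;
            this.text = text;
        }

        boolean fromEntity() {
            return entityName != null;
        }
    }

    /** Offers both array-like access (length/item) and iteration. */
    static final class AttributeValuePieces implements Iterable<Piece> {
        private final List<Piece> pieces = new ArrayList<>();

        AttributeValuePieces add(String entityName, String text) {
            pieces.add(new Piece(entityName, text));
            return this;
        }

        int length() {
            return pieces.size();
        }

        Piece item(int i) {
            return pieces.get(i);
        }

        @Override
        public Iterator<Piece> iterator() {
            return pieces.iterator();
        }
    }

    /** The value as an application normally sees it, entities expanded. */
    static String flatten(AttributeValuePieces value) {
        StringBuilder sb = new StringBuilder();
        for (Piece p : value) {
            sb.append(p.text);
        }
        return sb.toString();
    }

    /** The value with entity references restored (no escaping; illustration only). */
    static String unexpanded(AttributeValuePieces value) {
        StringBuilder sb = new StringBuilder();
        for (Piece p : value) {
            sb.append(p.fromEntity() ? "&" + p.entityName + ";" : p.text);
        }
        return sb.toString();
    }

    static AttributeValuePieces sample() {
        return new AttributeValuePieces()
                .add(null, "Copyright ")
                .add("copy", "(c)")
                .add(null, " 1999");
    }

    public static void main(String[] args) {
        System.out.println(flatten(sample()));
        System.out.println(unexpanded(sample()));
    }
}
```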

Don




From chris at w3.org  Mon Mar 22 22:06:07 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:19 2004
Subject: About Tim's article on XML
References: <30649320C177D111ADEC00A024E9F297169FA7@exchange-server.dega.com>
Message-ID: <36F6B38F.E51161C7@w3.org>


Ed Howland wrote:
> 
> Everyone,
> 
> The correct link for this article should be:
> http://www.xml.com/xml/pub/1999/03/ie5/first-x.html


Well, it is no more correct than the other link, but it does reference a
resource variant written in HTML rather than in XML.

I guess Didier felt that, to this list in particular, it was reasonable
to point to the XML resource variant.

> From: Didier PH Martin [mailto:martind@netfolder.com]

> I read Tim's article in XML.com with interest (Ref:
> http://www.xml.com/1999/03/ie5/first-x.xml). 

--
Chris




From larsga at ifi.uio.no  Mon Mar 22 22:07:38 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:19 2004
Subject: XSA client kit released
Message-ID: <wkww09mb56.fsf@ifi.uio.no>


A kit with a Java client for automatically monitoring XSA documents
has now been released at

<URL:http://birk105.studby.uio.no/www_work/xsa/xsasdk.html>


The client that comes with the kit can be used to automatically
discover changes to a set of XSA documents (addresses, new versions,
new products etc). The kit also contains an API that can be used to
build custom clients (or other kinds of XSA-aware software).

The kit has already been used for a while by the maintainers of
<URL:http://www.stud.ifi.uio.no/~larsga/linker/XMLtools.html> and
<URL:http://www.xmlsoftware.com/>.

See <URL:http://birk105.studby.uio.no/www_work/xsa/> for information
about XSA.

--Lars M.




From chris at w3.org  Mon Mar 22 22:17:09 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:19 2004
Subject: eyes open (was XML complexity, namespaces)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F19AC0.B5B40B20@goon.stg.brown.edu> <36F1AFF5.2DF948A2@allette.com.au> <36F26A53.E4395E9C@goon.stg.brown.edu>
Message-ID: <36F6C03B.1222E5DE@w3.org>


"Richard L. Goerwitz" wrote:
> 
> Marcus Carr wrote:
> >
> > So... how do we get back to SGML people not having their eyes open?
> 
> Just for the record, I never said they were closed.  I believe it was
> Chris Lilley. 

Yes.

> And when he said this, he wasn't characterizing the en-
> tire SGML community. 

Correct

> In fact, he was, overall, defending SGML.

Glad someone noticed. It's not something I do often ;-)

We now return to our regular programming.

--
Chris



From bckman at ix.netcom.com  Mon Mar 22 22:30:40 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:19 2004
Subject: Oops! was Re: Transformation tool for windows
Message-ID: <002801be74b3$37c9e9a0$a4addccf@ix.netcom.com>

I put a bad version of the program on my site. It refuses to open because
it references a file that I am sure is not on your computers!

I have put up a corrected version.

I apologise to everyone who tried to run the old version.

Please go to www.hypermedic.com/style to download the zip file (20K). Look
under transform XML.

Frank
----- Original Message -----
From: Frank Boumphrey <bckman@ix.netcom.com>
To: xml mailing list <xml-dev@ic.ac.uk>
Sent: Thursday, March 18, 1999 1:03 AM
Subject: Transformation tool for windows


>At the suggestion of several people I am making generally available a simple
>tool that carries out batch transformations of XML files under Windows 95,
>98, or NT. Although stable, it is very much alpha-ware and is still a 'work
>in progress'. I would be glad of any feedback from members of this list.
>
>It was written for an undergraduate class and requires no more than basic
>Windows skills to run, but in spite of that it is quite
>powerful and can easily handle documents up to 2M in size. (I haven't tested
>it on anything larger.)
>
>This tool is excerpted from a larger editing tool which uses the MSXML
>parser. However, as the latter is in flux and the MSXML dll has not been
>released or licensed for general use, I have split the transformation tool
>off from the editing and DOM tool.
>
>'TransformXML' allows the following processes to be automated.
>
> 1. Creating a list of xml files for processing.
> 2. Running a list of commands on each file.
> 3. Transforming one xml nametag to another.
>
>It has not yet been optimized for speed. For example, on a middle-of-the-road
>platform it takes about 1 minute to convert an XML file marked up by Jon
>Bosak into HTML. It took 20 minutes to transform the complete works of
>Shakespeare from XML to XHTML.
>
>Please go to www.hypermedic.com/style to download the zip file (20K). Look
>under transform XML.
>
>It uses the VB5 dll's which are also available if needed.
>
>
>Frank Boumphrey
>
>XML and style sheet info at Http://www.hypermedic.com/style/index.htm
>Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
>CoAuthor:  XML applications from Wrox Press, www.wrox.com
>Author: Using XML on the Web (March)
>
>




From sawhneya at ms.com  Mon Mar 22 23:01:56 1999
From: sawhneya at ms.com (Avneet Sawhney)
Date: Mon Jun  7 17:10:19 2004
Subject: XML and Sybase
Message-ID: <36F6CBA3.3BD1734A@ms.com>


Hi,

I would like to know what other people are doing with respect to using
XML in a Sybase environment. Will Sybase have any XML support in their
SQL server? Are there other products which could be used?

Most of the products I have seen for database integration seem to be
Windows centric, or they are tied to other database servers.

I want to leverage XML in the middle tier with the current Sybase
environment, but I prefer to use products (besides parsers, etc.) that
lend themselves to this work. I'm thinking I should not have to start
from scratch, as others will have come up against this as well.

BTW, with respect to the other thread, I also thought one thing would be to
use XML to express all interaction with the database. With some more
detail, I guess this could be extended to also abstract the data model
as well. I have started working on this, but I am looking for better
ways to use XML in the middle tier. I know the "why's" of doing this,
and I would like to get info on some better "how's" in a C++/UNIX
environment.

Thanks,

-Avneet




From chris at w3.org  Mon Mar 22 23:17:04 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:19 2004
Subject: Is this invalid?
References: <NBBBJPGDLPIHJGEHAKBACEMMCPAA.martind@netfolder.com>
Message-ID: <36F6CDD6.72D1F8CE@w3.org>


Didier PH Martin wrote:

> I am trying to find something in W3C specs about it but with no success. By
> the way, what is the official list of valid <!...> markups:
> 
> <!DOCTYPE....> ???, <!doctype....> ???, both ????
> <!SGML...> ????, <!sgml..> ???, both ???
> others ????
> 
> In which document is this listed? 

Unless I am missing something here, the answer is really obvious - in
the XML specification.

http://www.w3.org/TR/REC-xml

I presume you mean something more complicated?

--
Chris



From mrc at allette.com.au  Mon Mar 22 23:39:03 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:19 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org>
Message-ID: <36F6D46A.FB33D473@allette.com.au>


Chris Lilley wrote on SGML:

> Uh, this is actually a fairly big driver for people working in XML too.

So... if I was taking potshots at it, it might be indicative that I'm missing something, right?

The reason that people have been marking data up as SGML for many years now is because there was no
XML. We were anticipating its arrival, but until it came we knew we had to mark up the data in some
way that would ensure that it was useful for many years to come. We didn't know what future
technology would hold and to some extent, still don't. What shape will the web be in ten years?
Will loose-leafing paper documents survive, or be forced out by smarter data handling and delivery?
Two valid questions ten years apart.

There is what I believe to be a misconception amongst some portions of the XML community that XML
and SGML are locked in some sort of competition, but I don't see any of the same feeling from the
SGML community. The conclusion that I draw from this is that there is some sort of insecurity,
perhaps due to the fact that XML feels that it must replace SGML in order to ensure its survival.
The SGML community sees XML as a great boon - a truly sweet way of using the data and realising the
long-term effort that they have put into their datasets.

Not only does XML address the long-term storage issues that motivated people to move to SGML
(despite the absence of a proliferation of tools), it also addresses putting the data to work. The
SGML community isn't bitter about this - it is exactly what we want. Many organisations now have
huge SGML datasets that can be XML compliant in a couple of days. XML vindicates those of us who
have been pushing SGML for years.

> I'm not saying that it's impossible to get value from it, or that it is
> without power. But it is significantly underpowered in some ways, and
> pays too big a price in parsing complexity for minor keystroke savings,
> and the original design constraints don't necessarily apply to today's
> applications, which is why I see XML as more powerful than SGML, not
> less, in spite of being (now) a subset of SGML.

One reason that SGML came about was that organisations were starting to amass large datasets that
were being locked into proprietary applications. Two issues that (I suspect) drove features like
tag omitability were the conversion of these large legacy sets and the perception that in the
absence of SGML tools, politically, markup would have to be as simple as possible. The standard may
well have been skewed toward the user and away from the application developer, but I don't think
it's fair to retrospectively bag this decision just because the methods that we now use to collect
and tag data have evolved. Any general comparison of which is "more powerful" is invalid - XML is
capable of very much more than SGML is, but someone with a huge repository of SGML documents that
can be valid XML in a week is surely in a "more powerful" position than someone just starting to
collect XML data.

> Well if you are an SGML user who is not
>
> a) involved in furthering the XML effort, or
> b) involved in slowing down the XML effort
>
> then I didn't categorise you at all, since I was speaking of only two
> particular portions of the "old SGML community". There are, of course
> other portions; and there are, of course, other communities.

Is there a "new SGML community" as well? Now I'm not even sure what suburb I live in. I've put a
down payment on a flash new house in XML, but I want to hold on to my familial house in SGML. It
would be foolish to sell it now, when it continues to provide solid, long-term gains. Besides, I
have friends there.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From andrewl at microsoft.com  Tue Mar 23 00:10:37 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:10:19 2004
Subject: IE5 Stylesheet
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF16A@RED-MSG-08>

The style sheet used by IE5 tree view is available at
http://www.microsoft.com/xml/xsl/tutorials/transform-defaultss.asp
<http://www.microsoft.com/xml/xsl/tutorials/transform-defaultss.asp> 



From incze at mail.matav.hu  Tue Mar 23 01:21:12 1999
From: incze at mail.matav.hu (Incze Lajos)
Date: Mon Jun  7 17:10:19 2004
Subject: Mozilla/milestone3
Message-ID: <36F6ECCC.97FA872@mail.matav.hu>

If anybody is interested - I just checked the new
Mozilla browser on Tim Bray's Explorer5 article in XML.
I'm running Linux, so I don't really know what it would
look like on IE5. In Mozilla it has a grey background with white
background / red bordered boxes
in it, red section headers and green anchor color. (These may be the
defaults.) The rendering is acceptable.
                                                Incze



From david at megginson.com  Tue Mar 23 01:45:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F6D46A.FB33D473@allette.com.au>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F68B71.5967A029@w3.org>
	<36F6D46A.FB33D473@allette.com.au>
Message-ID: <14070.56970.50161.169467@localhost.localdomain>

Marcus Carr writes:

 > There is what I believe to be a misconception amongst some portions
 > of the XML community that XML and SGML are locked in some sort of
 > competition, but I don't see any of the same feeling from the SGML
 > community. The conclusion that I draw from this is that there is
 > some sort of insecurity, perhaps due to the fact that XML feels
 > that it must replace SGML in order to ensure its survival.  The
 > SGML community sees XML as a great boon - a truly sweet way of
 > using the data and realising the long-term effort that they have
 > put into their datasets.

XML does nothing that SGML cannot do.

SGML does nothing that XML cannot do.

There are some differences in the ways that XML and SGML accomplish
the same thing, but those differences are trivial and unimportant from
an architectural perspective.

XML benefited from (at the time) 12 years of SGML industry experience
by eliminating a lot of original SGML features (such as the ability to
vary the delimiter set or to omit tags) that turned out to be
obfuscatory design mistakes.

SGML benefits from 13 years of industry experience in the form of a
small base of stable, production-quality COTS and OSS.

The question, however, is whether there is a real benefit to
supporting two slightly-variant standards that, in the view of a
system architect, accomplish exactly the same thing in pretty much the
same way.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From ricko at allette.com.au  Tue Mar 23 02:16:35 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:20 2004
Subject: RDF DTD ?
Message-ID: <005a01be74d3$741139c0$11f96d8c@NT.JELLIFFE.COM.AU>

I have written an XML DTD fragment for RDF and put it online at
http://xml.ascc.net/xml/en/utf-8/resource-index.html

I don't see why RDF WG didn't put XML declarations in: perhaps
they were tired--certainly it was not difficult to make. A DTD would
help many users, and also allow more informed commentary on the
comparative virtues of RDF-schema and the various schema proposals.

A combination of this DTD and an XSL-based structure validator should
be enough to check all the structural constraints in RDF.

Rick Jelliffe




From cowan at locke.ccil.org  Tue Mar 23 02:25:00 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:20 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F62D81.A623C0A2@w3.org> from "Chris Lilley" at Mar 22, 99 12:46:09 pm
Message-ID: <199903230328.WAA19702@locke.ccil.org>

Chris Lilley scripsit:

> Okay. But does RFC 2376 conflict with the XML 1.0 Recommendation?

The Recommendation basically says "We yield to the RFC once it is
published".

> > When the charset parameter is not specified, it is assumed as US-ASCII. 
> 
> Wow. So, what this RFC says is that, when used in email and on HTTP, the
> encoding declaration is *always ignored*.

Unfortunately this is a side effect of the rules for the media type
"text/*", which says that the default value of "charset" is always US-ASCII.
The alternative is to use "application/xml", which has no such
obnoxious rule.
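
A minimal Java sketch of the rule as described above. The class and method
names are hypothetical, and the Content-Type parsing is deliberately
simplified. An explicit charset parameter wins; without one, any "text/*"
type falls back to US-ASCII, while "application/xml" leaves the decision
to the document itself (returned here as null).

```java
import java.util.Locale;

class CharsetSketch {
    /** Returns the value of the charset parameter, or null if there is none. */
    static String charsetParam(String contentType) {
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            String p = parts[i].trim();
            int eq = p.indexOf('=');
            if (eq > 0 && p.substring(0, eq).trim().equalsIgnoreCase("charset")) {
                String v = p.substring(eq + 1).trim();
                // Strip surrounding double quotes if present.
                if (v.length() >= 2 && v.startsWith("\"") && v.endsWith("\"")) {
                    v = v.substring(1, v.length() - 1);
                }
                return v;
            }
        }
        return null;
    }

    /**
     * Returns the encoding dictated by the transport, or null when the
     * parser should look at the document (byte order mark, encoding declaration).
     */
    static String transportEncoding(String contentType) {
        String charset = charsetParam(contentType);
        if (charset != null) {
            return charset;
        }
        String type = contentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        if (type.startsWith("text/")) {
            return "US-ASCII"; // default for text/* when charset is omitted
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(transportEncoding("text/xml"));
        System.out.println(transportEncoding("application/xml"));
    }
}
```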

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From ricko at allette.com.au  Tue Mar 23 02:29:14 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
Message-ID: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>
>SGML does nothing that XML cannot do.

I don't know how Dave can say that.

For example, many Asian documents use user-defined characters (East
Asian character sets have a special code space reserved for these, and
East Asian word processing applications come bundled with font editors
to allow definition of user-defined characters).

In SGML I can short-reference these codepoints to an entity which points to
the appropriate glyphs and which has other data attributes to describe
character properties.

In XML, to do this I have to write a special program to simulate this
behaviour.

And if the program just inserts elements rather than entity references
(because XML has no attributes on entities, so I have to use elements),
my element structure is made more complicated.
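
A minimal Java sketch of the kind of special program meant here, assuming
the element-based workaround. The element name "udc" and its "code"
attribute are hypothetical. Each private-use code point in the character
data is replaced by an empty element carrying the code point in hex.

```java
class PuaSketch {
    /** True for code points in the Unicode private-use areas. */
    static boolean isPrivateUse(int codePoint) {
        return Character.getType(codePoint) == Character.PRIVATE_USE;
    }

    /** Escapes markup characters and wraps each private-use character in an element. */
    static String markUp(String text) {
        StringBuilder out = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (isPrivateUse(cp)) {
                out.append("<udc code=\"").append(String.format("%04X", cp)).append("\"/>");
            } else if (cp == '<') {
                out.append("&lt;");
            } else if (cp == '&') {
                out.append("&amp;");
            } else {
                out.appendCodePoint(cp);
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(markUp("name: \uE000 & co"));
    }
}
```

Because the output contains elements, it can only be used in element
content, not inside an attribute value.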

Furthermore I cannot use elements inside attribute values, while I can
use entity references. The lack of this kind of facility in XML has closed off the
obvious and simple solution to private-use area (PUA) characters: East
Asians and MathML could each have found it useful.

Rick Jelliffe




From ricko at allette.com.au  Tue Mar 23 02:38:29 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
Message-ID: <007201be74d6$87ed6ce0$11f96d8c@NT.JELLIFFE.COM.AU>

From: Chris Lilley <chris@w3.org>

>Well if you are an SGML user who is not
>
>a) involved in furthering the XML effort, or
>b) involved in slowing down the XML effort
>
>then I didn't categorise you at all, since I was speaking of only two
>particular portions of the "old SGML community". There are, of course
>other portions; and there are, of course, other communities.
 
It would be interesting if Chris would name names and give examples. 

Who are these portions of the "old SGML community"? Is it 
Steve Newcomb  or Dave Megginson or Paul Prescod or Dave 
Peterson or even me? I think it is a dishonest argument to allude to 
sinister forces without naming them or their particular views. 
Frankly, it makes it sound like Chris is inventing bogus bogeymen 
as an argument for moving XML in non-standard directions. 

Who are these people from the community formerly known as 
SGML who are involved in "slowing down the XML effort"?  
I don't believe they exist.  In fact, it sounds like "slowing down
the XML effort" is synonymous in Chris' mind with "wanting
XML to be standard", which has certainly not been demonstrated:
in fact, there are calls for greater layering, not for increased 
divergence. 

It should be plain to everyone by now that W3C specifications 
are not treated by vendors as standards which should be strictly
adhered to: they are treated as sources of APIs which can be 
embraced and extended, or partially implemented.  W3C does
not have the authority to check, demand or expect conformance:
either moral or legal. A W3C spec needs all the help it can get
to ensure complete implementation: being an ISO standard helps.

One can see the same attitude at work with the MIME RFC: it
was thoroughly debated by the XML WG and SIG, with input
from other major groups such as WebDAV, and has been out for
quite a while. But if someone doesn't agree, they are quite happy to
be non-conforming. Standards are a discipline: it is easy to
diverge and difficult to interoperate.


Rick Jelliffe




From david at megginson.com  Tue Mar 23 02:42:53 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU>
References: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14070.64712.690500.339308@localhost.localdomain>

Rick Jelliffe writes:

 > From: David Megginson <david@megginson.com>

 > >SGML does nothing that XML cannot do.
 > 
 > I don't know how Dave can say that.

From a system-architecture perspective, my statement is true -- what
we're discussing here are simply implementation details.  I agree that
having to use PUA characters rather than special entities is a mild
annoyance (in the past, I have dealt with similar problems trying to
represent specialised characters in early medieval English
manuscripts, including variant graphemes of the same graph).

 > In SGML I can short-reference these codepoints to entity which
 > points to the appropriate glyphs and which has other data
 > attributes to describe character properties.
 >
 > In XML, to do this I have to write a special program to simulate
 > this behaviour.

In SGML, you have to write a special program to act on the information
in the data attributes (nothing does this out of the box); in XML, you
have to write a special program to act on the PUA.

I'd say that SGML wins a 5.2 out of 6 on non-canonical characters
(because its approach is slightly more modular and maintainable),
while XML wins a 5.0 (because it still works).  But again, you *can*
represent non-canonical characters in both, and the difference is too
trivial to interest anyone but hard-core SGML wonks like Rick and me
-- it certainly wouldn't be worth spending time on at a large
project-management meeting.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From mrc at allette.com.au  Tue Mar 23 02:45:11 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
		<36F0CFF4.365B@hiwaay.net>
		<36F10CFC.CFEB89A8@goon.stg.brown.edu>
		<36F13992.150D05F9@w3.org>
		<36F18209.8C68524@allette.com.au>
		<36F68B71.5967A029@w3.org>
		<36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain>
Message-ID: <36F70005.3D1F76D2@allette.com.au>


David Megginson wrote:

[... some excellent points about SGML and XML that I completely agree with, then:]

> The question, however, is whether there is a real benefit to
> supporting two slightly-variant standards that, in the view of a
> system architect, accomplish exactly the same thing in pretty much the
> same way.

No question - it would be better if there were a single standard, but the demise of SGML should
be natural, driven by nothing other than attrition. If it is to go, it will go because
organisations finish mapping datasets across and start using some of the sexy new tools that
we're currently waiting for, obviating the need for SGML. It may well eventuate that SGML
ceases to be required, but until that time, we have a responsibility to ensure that discussion
of the relative positions of the two is predominantly free of passion and politics.

(Yes, that should apply to both sides and no, the previous comment was not directed at David -
I may not agree with all of his opinions, but I believe them to be well-considered.)


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From mrc at allette.com.au  Tue Mar 23 03:24:48 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:20 2004
Subject: RDF DTD ?
References: <005a01be74d3$741139c0$11f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <36F7095A.F9744939@allette.com.au>


Rick Jelliffe wrote:

> I have written an XML DTD fragment for RDF and put it online at
> http://xml.ascc.net/xml/en/utf-8/resource-index.html

I think this should be http://xml.ascc.net/xml/en/utf-8/resource_index.html.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From martind at netfolder.com  Tue Mar 23 03:28:25 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:20 2004
Subject: Is this invalid?
In-Reply-To: <36F6CDD6.72D1F8CE@w3.org>
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEOKCPAA.martind@netfolder.com>

Hi Chris,

<YourComment>
Unless I am missing something here, the answer is really obvious - in
the XML specification.

http://www.w3.org/TR/REC-xml

I presume you mean something more complicated?
</YourComment>

<Reply>
Steve Newcomb gave me a good answer about the <!AFDR "ISO/IEC 10744:1997">
markup, and I should thank him for this information. I didn't know
the origin of this markup and had often encountered it in HyTime documents. I
thought that it had been carried over to XML because XML is supposed to be a
subset of SGML. But Steve pointed out that this markup is not
even standard SGML, because it is not yet part of the new SGML spec. So,
let's put it in perspective: it is a common practice but not yet
part of a published standard.

About uppercase versus lowercase for prolog declarations: I finally found
that the productions specify uppercase by example, as in the following:
[52]  AttlistDecl ::=  '<!ATTLIST' S Name AttDef* S? '>'
[53]  AttDef ::=  S Name S AttType S DefaultDecl
I had some doubts after it was argued that these reserved keywords could be
either uppercase or lowercase; my parser accepts only uppercase. I am
reassured now: the parser is OK.
</Reply>
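
The case-sensitivity point can be checked against an existing parser. The
following is only a sketch, assuming Python's standard expat binding; the
element name "a" and attribute name "b" are made up for the illustration, and
other parsers may report the problem differently.

```python
import xml.parsers.expat


def is_well_formed(text):
    """Return True if expat parses the whole document without error."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(text, True)  # True marks this as the final chunk
    except xml.parsers.expat.ExpatError:
        return False
    return True


# Keyword in uppercase, exactly as written in production [52].
UPPER = '<!DOCTYPE a [<!ATTLIST a b CDATA #IMPLIED>]><a/>'
# Same declaration with a lowercase keyword (hypothetical test input).
LOWER = '<!DOCTYPE a [<!attlist a b CDATA #IMPLIED>]><a/>'

print(is_well_formed(UPPER))
print(is_well_formed(LOWER))
```

Only the uppercase form matches the literal string in the grammar, so a
conforming parser is expected to accept the first document and reject the
second.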

Regards




From martind at netfolder.com  Tue Mar 23 03:38:13 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEOLCPAA.martind@netfolder.com>

Hi

<YourComment>
From: David Megginson <david@megginson.com>
>SGML does nothing that XML cannot do.
</YourComment>

<Reply>
Out of simple curiosity: is it possible to declare an architectural instance
from an architectural form in XML by strictly following the XML 1.0 spec? I
do not mean simply having the architectural elements as our element
properties, but declaring in the prolog the correspondence between each
markup construct and each architectural element.
</Reply>

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From cowan at locke.ccil.org  Tue Mar 23 03:47:47 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU> from "Rick Jelliffe" at Mar 23, 99 01:31:12 pm
Message-ID: <199903230450.XAA22944@locke.ccil.org>

Rick Jelliffe scripsit:

> In SGML I can short-reference these codepoints to entity which points to
> the appropriate glyphs and which has other data attributes to describe
> character properties.
> 
> In XML, to do this I have to write a special program to simulate this
> behaviour.

At last, someone who wants the LocalMarkupFilter (level 1) I specced
out but never implemented because everybody pooh-poohed it.

This is a SAX filter that detects some PIs and processes character
data.  Each properly declared character in the content of a specified
element is transformed into an empty element.

Here are the PIs:

<?LocalMarkup mapname elementname?> says that any characters in the
content of the element "elementname" are transformed according to
"mapname".

<?LocalMarkup mapname "x" elementname?>
says that when map "mapname" is in effect, the character "x" is
changed into an empty element named "elementname", with
an attribute "char" saying what the character was.

This is not as flexible as shortrefs, assuming I understand them
correctly (can be more than one character long, and are transformed
into an arbitrary entity, not a fixed element name) but is easily
layered over SAX.
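
A minimal sketch of how such a filter might look, written against Python's
SAX classes rather than the Java SAX of the time. Everything beyond the two
PI forms described above is an assumption made for the illustration: the
class and function names, the choice to apply a map only to character data
directly inside the named element (not inside its children), the choice to
swallow the LocalMarkup PIs, and the restriction that the mapped character
is a single non-quote character.

```python
import io
import re
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator
from xml.sax.xmlreader import AttributesImpl

# <?LocalMarkup mapname "x" elementname?>  (rule: character -> empty element)
RULE = re.compile(r'^(\S+)\s+"(.)"\s+(\S+)$')
# <?LocalMarkup mapname elementname?>      (binds a map to an element type)
BIND = re.compile(r'^(\S+)\s+(\S+)$')


class LocalMarkupFilter(XMLFilterBase):
    """Hypothetical level-1 filter: mapped characters become empty elements."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.element_maps = {}   # element name -> map name
        self.char_rules = {}     # map name -> {character: element name}
        self.open_elements = []  # stack of currently open source elements

    def processingInstruction(self, target, data):
        if target != "LocalMarkup":
            super().processingInstruction(target, data)
            return
        data = data.strip()
        rule = RULE.match(data)
        bind = BIND.match(data)
        if rule:
            mapname, char, element = rule.groups()
            self.char_rules.setdefault(mapname, {})[char] = element
        elif bind:
            mapname, element = bind.groups()
            self.element_maps[element] = mapname
        # Malformed LocalMarkup PIs are silently dropped in this sketch.

    def startElement(self, name, attrs):
        self.open_elements.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        self.open_elements.pop()
        super().endElement(name)

    def characters(self, content):
        current = self.open_elements[-1] if self.open_elements else None
        rules = self.char_rules.get(self.element_maps.get(current), {})
        run = []
        for ch in content:
            if ch in rules:
                if run:  # flush ordinary text collected so far
                    super().characters("".join(run))
                    run = []
                # Emit the empty element directly, bypassing our own stack.
                super().startElement(rules[ch], AttributesImpl({"char": ch}))
                super().endElement(rules[ch])
            else:
                run.append(ch)
        if run:
            super().characters("".join(run))


def run_filter(xml_text):
    """Parse xml_text through the filter and return the serialised result."""
    out = io.StringIO()
    flt = LocalMarkupFilter(xml.sax.make_parser())
    flt.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    flt.parse(io.BytesIO(xml_text.encode("utf-8")))
    return out.getvalue()


print(run_filter('<doc><?LocalMarkup pua g?><?LocalMarkup pua "#" glyph?>'
                 '<g>a#b</g><p>a#b</p></doc>'))
```

In this example only the "#" inside "g" is turned into an empty "glyph"
element; the same character inside "p" is left alone because no map is
bound to that element type.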

Are you interested?

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From marcelo at mds.rmit.edu.au  Tue Mar 23 04:31:13 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <14070.56970.50161.169467@localhost.localdomain>; from David Megginson on Mon, Mar 22, 1999 at 08:45:53PM -0500
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain>
Message-ID: <19990323153036.A9794@io.mds.rmit.edu.au>

On Mon, Mar 22, 1999 at 08:45:53PM -0500, David Megginson wrote:
> Marcus Carr writes:
> 
>  > There is what I believe to be a misconception amongst some portions
>  > of the XML community that XML and SGML are locked in some sort of
>  > competition, but I don't see any of the same feeling from the SGML
>  > community. The conclusion that I draw from this is that there is
>  > some sort of insecurity, perhaps due to the fact that XML feels
>  > that it must replace SGML in order to ensure its survival.  The
>  > SGML community sees XML as a great boon - a truly sweet way of
>  > using the data and realising the long-term effort that they have
>  > put into their datasets.
> 
> XML does nothing that SGML cannot do.

When developing the TOC management system for our document fragmenting
toolkit, we chose XML to represent the TOC.  SGML was not an option,
because we didn't know the content model in advance and couldn't
build it automatically from the DTDs of the individual documents.

Also, we couldn't use a homogeneous element tree with attributes,
because we actually extracted structured content from the documents
for insertion into the TOC (sure, we could have serialised the content
into an SGML attribute, but that would have been a perverse and
painful alternative to simply using XML).
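
To illustrate the kind of DTD-less TOC described here, the following is a
rough sketch and not the toolkit's actual code. The element names ("toc",
"entry", "title"), the "doc" attribute and the sample documents are all
invented for the example; the only point is that structured fragments from
documents with unrelated DTDs can be grafted into one well-formed tree
without any content model being declared in advance.

```python
import copy
import xml.etree.ElementTree as ET


def build_toc(documents):
    """Graft each document's first <title> subtree into a single TOC tree.

    `documents` maps a document id to its XML text. No DTD is consulted:
    the result only has to be well-formed.
    """
    toc = ET.Element("toc")
    for doc_id, xml_text in documents.items():
        root = ET.fromstring(xml_text)
        entry = ET.SubElement(toc, "entry", {"doc": doc_id})
        title = root.find(".//title")
        if title is not None:
            # Copy the structured content itself, inline markup included.
            entry.append(copy.deepcopy(title))
    return toc


docs = {
    "r1": "<report><title>H<sub>2</sub>O levels</title><body/></report>",
    "m1": "<manual><title>Using <cmd>grep</cmd></title><sect/></manual>",
}
print(ET.tostring(build_toc(docs), encoding="unicode"))
```

The inline "sub" and "cmd" elements survive as elements inside the TOC
entries, which is what serialising the content into an attribute value
would lose.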

> SGML does nothing that XML cannot do.

On several occasions I have had to import textual information, and
have been able to treat the data as SGML with an appropriate choice of
shortrefs.

With XML I would have been forced to write an intermediate translation
layer and would have consequently lost the originals (or been forced
to store the original and transformed document, or add the extra layer
to every access).

True, they are not always adequate for the job, but I certainly would
not have happily forgone them in my project because they wouldn't have
been useful in someone else's project!

> There are some differences in the ways that XML and SGML accomplish
> the same thing, but those differences are trivial and unimportant from
> an architectural perspective.

Whether the differences are trivial is a matter for the requirements
spec. to decide, rather than something you can decree a priori.
There will often be cases where such "trivial" differences
can have a profound impact on the cost and complexity of a project.

> XML benefited from (at the time) 12 years of SGML industry experience
> by eliminating a lot of original SGML features (such as the ability to
> vary the delimiter set or to omit tags) that turned out to be
> obfuscatory design mistakes.

I couldn't disagree more, for all the above reasons.  One man's design
mistakes are another man's salvation.

> SGML benefits from 13 years of industry experience in the form of a
> small base of stable, production-quality COTS and OSS.
> 
> The question, however, is whether there is a real benefit to
> supporting two slightly-variant standards that, in the view of a
> system architect, accomplish exactly the same thing in pretty much the
> same way.

If they did, there might be an issue to resolve.  But they don't, so
there isn't.  Both will continue to be used and developed, and this is
as it should be.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From ricko at allette.com.au  Tue Mar 23 04:49:06 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:20 2004
Subject: XML complexity, namespaces (was WG)
Message-ID: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>

>In SGML, you have to write a special program to act on the information
>in the data attributes (nothing does this out of the box); in XML, you
>have to write a special program to act on the PUA.

Huh? OmniMark allows access to data attributes just as easily as element
attributes (http://www.omnimark.com/develop/om40/doc/concept/646.htm),
out of the box. Several CALS-aware tools understand the notations used
in data attributes, e.g.,  when used for graphics.

And I don't agree that elements and characters and attributes and
entities should be thought of as interconvertible: search routines look
for character codes--I don't know of any search routines which allow
grepping on data and elements.

Rick Jelliffe




From jtauber at jtauber.com  Tue Mar 23 05:33:04 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:20 2004
Subject: LocalMarkupFilter (was Re: XML complexity, namespaces (was WG))
References: <199903230450.XAA22944@locke.ccil.org>
Message-ID: <007f01be74ee$14b07660$0300000a@cygnus.uwa.edu.au>

----- Original Message -----
From: John Cowan <cowan@locke.ccil.org>
> At last, someone who wants the LocalMarkupFilter (level 1) I specced
> out but never implemented because everybody pooh-poohed it.
[...]
> Are you interested?

I like the idea of it.

And on a different problem but, I think, similar solution, could you do
local character data mapping the same way? I'd like a nice way of being able
to use some transliteration when hand-editing XML and have it mapped to the
appropriate Unicode code points (eg I'd like to say "in this element, map B
to &#x03B2;")

Mind you, I need to be able to map more than one character.

[...]
> Here are the PIs:
>
> <?LocalMarkup mapname elementname?> says that any characters in the
> content of the element "elementname" are transformed according to
> "mapname".

This sounds like a job for notations, where each mapname is a notation.

James




From cowan at locke.ccil.org  Tue Mar 23 05:38:02 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:20 2004
Subject: LocalMarkupFilter (was Re: XML complexity, namespaces (was WG))
In-Reply-To: <007f01be74ee$14b07660$0300000a@cygnus.uwa.edu.au> from "James Tauber" at Mar 23, 99 01:21:31 pm
Message-ID: <199903230642.BAA26432@locke.ccil.org>

James Tauber scripsit:

> And on a different problem but, I think, similar solution, could you do
> local character data mapping the same way? I'd like a nice way of being able
> to use some transliteration when hand-editing XML and have it mapped to the
> appropriate Unicode code points (eg I'd like to say "in this element, map B
> to &#x03B2;")
> 
> Mind you, I need to be able to map more than one character.

But still a 1-1 mapping?  That would be easy to incorporate.

> This sounds like a job for notations, where each mapname is a notation.

But XML notations don't have attributes, so what is gained?

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From jtauber at jtauber.com  Tue Mar 23 06:06:51 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:21 2004
Subject: LocalMarkupFilter (was Re: XML complexity, namespaces (was WG))
References: <199903230642.BAA26432@locke.ccil.org>
Message-ID: <00c701be74f2$ceb18280$0300000a@cygnus.uwa.edu.au>

> > Mind you, I need to be able to map more than one character.
>
> But still a 1-1 mapping?  That would be easy to incorporate.

I was initially concerned that what I wanted involved context-sensitive
mapping but I no longer think it does, so, yes, it should be easy.

> > This sounds like a job for notations, where each mapname is a notation.
>
> But XML notations don't have attributes, so what is gained?

I was thinking of just the association of map with element (ie the first
PI). If I understand correctly, your first PI associates the mapping with
all elements of a named type. I would like the flexibility of being able to
control that on an element-by-element basis.

An attribute seems a good way of doing this (if all elements of a type have
the same mapping, you can have an attribute default). So what you end up
with is an attribute that, in effect, is saying how to process the character
data in the content (sounds like a notation attribute right?)
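
As a rough illustration of the attribute-driven variant, here is a sketch
of a 1-1 character mapping selected per element. The attribute name
"charmap", the map name "beta-code" and the table contents are invented
for the example (only "B" to U+03B2 comes from the earlier message), and
the map is applied only to character data directly inside the element that
carries the attribute.

```python
import xml.etree.ElementTree as ET

# Hypothetical registry: map name -> translation table (1-1 mappings only).
CHAR_MAPS = {
    "beta-code": str.maketrans({"A": "\u03b1", "B": "\u03b2", "G": "\u03b3"}),
}


def apply_char_maps(root, attr="charmap"):
    """Transliterate character data in elements that name a map via `attr`."""
    for el in root.iter():
        name = el.get(attr)
        if name is None:
            continue
        table = CHAR_MAPS[name]
        if el.text:
            el.text = el.text.translate(table)
        # Text following a child element still belongs to this element.
        for child in el:
            if child.tail:
                child.tail = child.tail.translate(table)
    return root


root = ET.fromstring('<p><w charmap="beta-code">BAG<x>B</x>B</w> B</p>')
apply_char_maps(root)
print(ET.tostring(root, encoding="unicode"))
```

If every element of a type should share one mapping, the same attribute
could be given a default value in the DTD, as suggested above, so the
instance would not need to repeat it.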

It would be useful, I think, to make such a specification independent of the
particular usage by the LocalMarkupFilter. Other applications might want to
know about it too. So a more general solution, IMHO, would be to have the
mapping triggered by notation.

In fact, rather than *replacing* your first PI, that PI could remain but
instead map a mapname to a notation.

To me that is more in the spirit of descriptive/generic markup.

James




From grove at infotek.no  Tue Mar 23 07:48:31 1999
From: grove at infotek.no (Geir Ove Grønmo)
Date: Mon Jun  7 17:10:21 2004
Subject: ANN: tmproc 0.10, a Topic Map implementation
Message-ID: <GROVE-82bthkwss9.fsf@pc-grove.infotek.no>


Hello,

I'm pleased to announce the first release of tmproc, a Topic Map
processor. This release is meant to be a technology preview.

Enjoy!

Geir O.

--------------------------------------------------------------------------

Title: tmproc
Version: 0.10

Released: March 23rd 1999
Author: Geir O. Grønmo, grove@infotek.no

Homepage: http://www.infotek.no/~grove/software/tmproc/index.html

Requirements: 
   
    - Python 1.5.1 or newer [1]
    - An SGML/XML parser with a SAX driver
    - SAX for Python [2]
    - xmlarch 0.25, optional unless architectural processing is needed [3]

--

>>> What is tmproc?

tmproc is an implementation of the new international standard ISO/IEC
13250 Topic Maps[4]. tmproc is written in Python, and it should work
on any platform to which Python has been ported[2].

tmproc is a set of classes that represents a framework for doing topic
map processing in Python.

The current release includes the following set of classes:

 o classes for representing topic map objects like TopicMap, Topic,
 TopicName, Occurrence, Locator, Association, AssociationRole, Facet
 and FacetValue.

 o a factory class for creating topic map objects.

 o a class for importing topic maps, TMImporter. It listens to SAX
 events and uses a factory class and interfaces to build a Topic Map.

 o an export class, TMExporter, that emits SAX events in the topic map
 interchange format so that any SAX document handler may be used for
 export.

 o statistical and information printing classes, TMUtils and TMStats.

A command line utility is also included in the distribution.

The implementation is currently based on a draft released some time
before the final ballot. Some deviations from the - soon to be
released - final standard are expected.

Currently only an in-memory implementation is available. A relational
database implementation has also been written, but is not available
in the distribution because it is a bit crude at the moment.

Fortunately tmproc has been written in a way that makes it easy to do
additional implementations.

--

>>> Some of the features are:

 o Import, export, query and manipulation of topic maps.

 o Full set of extensible topic map classes with clearly defined
   interfaces. Association, AssociationRole, Facet, FacetValue, Locator,
   Occurrence, Topic, TopicMap, TopicMapFactory and TopicName.

 o Access to data in topic map objects using getter and setter methods.

 o Get types including transitive types of topics, associations and facets.

 o Get objects [e.g. topics, associations and facets] that are of
   given types or more specific types.

 o Get objects [e.g. associations] that exist in a scope or in any of
   the scope's subscopes.

 o Optional architectural processing [requires xmlarch].

 o Introduction and reference documentation.


Suggestions and bug reports should be sent to: grove@infotek.no

--

[1] http://www.python.org/

[2] http://www.stud.ifi.uio.no/~larsga/download/python/xml/saxlib.html

[3] http://www.infotek.no/~grove/software/xmlarch/index.html

[4] Final CD Text for ISO/IEC 13250, Topic Navigation Maps,
    http://www.ornl.gov/sgml/sc34/document/0008.htm

<P><A HREF="http://www.infotek.no/~grove/software/tmproc/index.html">tmproc
0.10</A> - an implementation of the new international standard ISO/IEC
13250 Topic Maps.  (22-Mar-99)

-- 
 ==================  Geir Ove Grønmo  ==================
|  STEP Infotek as, Gjerdrumsvei 12, 0486 Oslo, Norway  |
|        grove@infotek.no http://www.infotek.no/        |
 -------------------------------------------------------



From grove at infotek.no  Tue Mar 23 07:57:44 1999
From: grove at infotek.no (Geir Ove Grønmo)
Date: Mon Jun  7 17:10:21 2004
Subject: ANN: xmlarch 0.25, an XML architectural forms processor
Message-ID: <GROVE-82ww08vdsf.fsf@pc-grove.infotek.no>


xmlarch: An XML architectural forms processor written in Python

Version:  0.25
Released: March 23rd 1999
Author:   Geir Ove Grønmo
Email:    grove@infotek.no

Homepage: http://www.infotek.no/~grove/software/xmlarch/index.html

---

What is xmlarch?

The xmlarch module contains an XML architectural forms processor
written in Python. It allows you to process XML architectural forms
using any parser that uses the SAX interfaces. The module allows you to
process several architectures in one parse pass. Architectural
document events for an architecture can even be broadcast to
multiple DocumentHandlers.

The main reason for releasing this version is to be able to support
architectural processing with tmproc[1]. Topic Map processing relies
heavily on the existence of the #GI mapping.

What's new?

  - Added support for the new #GI mapping token.

  - Added a method called get_current_element_name() to the
  ArchDocHandler class, so that you can easily keep track of the
  original generic identifier.

Fixes:

  - Bug related to the mapping between attributes and content.
  - Some minor ones.

[1] http://www.infotek.no/~grove/software/tmproc/index.html

---

Enjoy!

Geir Ove Grønmo

-- 
 ==================  Geir Ove Grønmo  ==================
|  STEP Infotek as, Gjerdrumsvei 12, 0486 Oslo, Norway  |
|        grove@infotek.no http://www.infotek.no/        |
 -------------------------------------------------------



From chris at w3.org  Tue Mar 23 08:08:01 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:21 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903230328.WAA19702@locke.ccil.org>
Message-ID: <36F74B26.21CF46EC@w3.org>


John Cowan wrote:
> 
> Chris Lilley scripsit:
> 
> > Okay. But does RFC 2376 conflict with the XML 1.0 Recommendation?
> 
> The Recommendation basically says "We yield to the RFC once it is
> published".

> > > When the charset parameter is not specified, it is assumed as US-ASCII.
> >
> > Wow. So, what this RFC says is that, when used in email and on HTTP, the
> > encoding declaration is *always ignored*.
> 
> Unfortunately this is a side effect of the rules for the media type
> "text/*", which says that the default value of "charset" is always US-ASCII.

Those are the default rules, which apply if no other rule is in place for
a specific media type. The registration for text/xml can override this
behaviour if it wishes to.
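
For concreteness, the rule as described in the quoted text (charset
parameter if present, otherwise US-ASCII, with the document's own encoding
declaration not consulted) can be sketched as below. This is an
illustration of that reading only, not a statement of what any browser
does; the function name is made up, and RFC 2231-encoded parameter values
are not handled.

```python
from email.message import Message


def text_xml_charset(content_type):
    """Return the charset a text/xml entity has under the quoted rule."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "text/xml":
        raise ValueError("this sketch only covers text/xml")
    # The charset parameter wins; absent that, the text/* default applies.
    return msg.get_param("charset", "us-ascii").lower()


print(text_xml_charset("text/xml"))
print(text_xml_charset('text/xml; charset="UTF-8"'))
```

Note that nothing in the function looks inside the XML entity, which is
exactly the consequence being objected to in this thread.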

--
Chris



From Matthew.Sergeant at eml.ericsson.se  Tue Mar 23 08:48:12 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:21 2004
Subject: Mozilla/milestone3
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A16E3@EUKBANT101>

The rendering was exactly correct on Win32 M3. More than that - I could do a
"view source" and get the XML (but not the XSL - but I think that's expected
- you can't do that with css either). This 6 months is going to be a long
but fun wait...

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V 
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

> -----Original Message-----
> From:	Incze Lajos [SMTP:incze@mail.matav.hu]
> Sent:	Tuesday, March 23, 1999 1:22 AM
> To:	xml-dev@ic.ac.uk
> Subject:	Mozilla/milestone3
> 
> If anybody is interested - I just checked the new
> mozilla browser on Tim Bray's Explorer5 article in XML.
> I'm running Linux, so I don't really know what it would
> look like on IE5. In Mozilla it has a grey background with white
> background / red bordered boxes
> in it, red section headers and green anchor color. (They can be the
> defaults.) The rendering is acceptable.
>                                                 Incze



From chris at w3.org  Tue Mar 23 08:53:15 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:21 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903230328.WAA19702@locke.ccil.org>
Message-ID: <36F755CB.C996CE2D@w3.org>


John Cowan wrote:

> Chris Lilley scripsit:
> > Wow. So, what this RFC says is that, when used in email and on HTTP, the
> > encoding declaration is *always ignored*.
> 
> Unfortunately this is a side effect of the rules for the media type
> "text/*", which says that the default value of "charset" is always US-ASCII.
> The alternative is to use "application/xml", which has no such
> obnoxious rule.

So, in consequence: example files such as the Chinese XML examples at
http://xml.ascc.net/xml/test/index.html (where each example is available
in UTF-8, Big5 and GB2312, all correctly labelled in the XML encoding
declaration) are now sets of invalid XML files which are required to
produce a critical error because of the invalid byte sequences in what
is now described as a US-ASCII file?

This is deeply counterproductive, and could have been avoided. 
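
To make the consequence concrete, here is a minimal Python sketch. The
sample string is an assumption for illustration, not one of the ascc.net
test files: bytes that are perfectly good Big5 cannot be decoded once the
transport layer insists they are US-ASCII.

```python
# Hypothetical sample document; any text outside the ASCII range behaves alike.
sample = "<doc>中文</doc>"
big5_bytes = sample.encode("big5")


def decodes_as(data: bytes, charset: str) -> bool:
    """Return True if every byte sequence in data is valid in charset."""
    try:
        data.decode(charset)
    except UnicodeDecodeError:
        return False
    return True


print(decodes_as(big5_bytes, "big5"))      # the encoding the XML declaration names
print(decodes_as(big5_bytes, "us-ascii"))  # the default imposed by text/*
```

A receiver that honours the text/* default has no choice but to reject
the second case, whatever the encoding declaration inside the file says.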

--
Chris



From marcelo at mds.rmit.edu.au  Tue Mar 23 08:56:37 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:21 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>; from Rick Jelliffe on Tue, Mar 23, 1999 at 03:51:11PM +1100
References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <19990323195615.C9794@io.mds.rmit.edu.au>

On Tue, Mar 23, 1999 at 03:51:11PM +1100, Rick Jelliffe wrote:
> 
> From: David Megginson <david@megginson.com>
> 
> >In SGML, you have to write a special program to act on the information
> >in the data attributes (nothing does this out of the box); in XML, you
> >have to write a special program to act on the PUA.
> 
> Huh? OmniMark allows access to data attributes just as easily as element
> attributes (http://www.omnimark.com/develop/om40/doc/concept/646.htm),
> out of the box. Several CALS-aware tools understand the notations used
> in data attributes, e.g.,  when used for graphics.
> 
> And I don't agree that elements and characters and attributes and
> entities should be thought of as interconvertible: search routines look
> for character codes--I don't know of any search routines which allow
> grepping on data and elements.

SIM builds indexes on arbitrary expressions.  This allows you to index
content, attributes, and even processing instructions if you want.

When doing path indexing, a search engine can treat attributes as
nodes of the tree rather than special things attached to nodes (one
possibility is to treat them as child elements with an '@' in front of
the element name).
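
As a rough illustration of the '@' convention (a sketch only, not how SIM
itself is implemented; the function name and sample document are
assumptions), a path index over a small document could be built like this
in Python:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(xml_text):
    """Map each path to the values found there; attributes become '@name' steps."""
    index = defaultdict(list)

    def walk(elem, path):
        here = f"{path}/{elem.tag}"
        # Attributes are indexed as if they were children named '@attr'.
        for name, value in elem.attrib.items():
            index[f"{here}/@{name}"].append(value)
        text = (elem.text or "").strip()
        if text:
            index[here].append(text)
        for child in elem:
            walk(child, here)

    walk(ET.fromstring(xml_text), "")
    return dict(index)


# Hypothetical sample document.
sample = '<doc><photo width="300" height="200">Sunset</photo></doc>'
index = build_path_index(sample)
```

With this layout a query for "/doc/photo/@width" is answered by the same
lookup as a query for element content, which is the point of treating
attributes as ordinary tree nodes.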


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From digitome at iol.ie  Tue Mar 23 09:49:09 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:10:21 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity,
  namespaces (was WG))
In-Reply-To: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie>

This is an interesting thread.  Many non-tag-minimization
reliables can be put forth as things that SGML "can do" that
XML cannot. Things like data attributes, exclusion exceptions,
internal SDATA entities and so on.

I see SGML and XML at opposite ends of a balanced lever.
On one side we have SGML - high on declarative syntax, low
on home grown code. On the other side we have XML - low on
declarative syntax, high on home grown code.

SGML gives you declarative syntax that can obviate the
need for coding around certain types of data modelling,
content authoring problems.

XML is light on the declarative syntax, leaving more
in the realm of "application specific" implementation
in a programming language.

Ultimately, both views have their place and both
may be "correct" for a given problem domain.

For me, I favour the XML side of the lever. Any
declarative syntax has its limits. It has
been my experience that the limits of SGML's
declarative syntax are quickly reached.[1] Any SGML
system I have ever worked on has a large collection
of ancillary software to perform validation, data
aggregation, and authoring short-cuts that are not
possible with pure SGML syntax.

XML fills a nice 80/20 niche here. 20% of SGML's
declarative syntax is used 80% of the time.
XML draws a line in the sand saying "here is
the most useful 80% in an all-round cheaper package. You
will need to write processing software on top
of this but hey! You would need to do that
with SGML anyway."

Analogies abound. What does it mean to say
you have your data in third normal form
in a relational database? It means that
you have a base data model that is interchangeable
amongst relational database systems. *But* and
it is a big *but*. The rest of the stuff that
makes up the solution is in some application
specific 4GL.

<stance slant="pragmatic">
Declarative syntax does not put bread on
my table. Solutions to business problems
using the beautiful ideas of SGML put
bread on my table. XML gives me a nice
package that gives me most of what I want
in terms of a robust, simple, implementation
of the SGML philosophy.

I will build software around this package
all day long without ever once missing
an SGML feature. What's more, I'll do it
in an open, standardized, cheap programming
language that gets the job done fast.[2]

When I go under the bus, I believe my
customers are in a better state than they
would have been if I'd pulled every
obscure SGML declarative syntax trick
in the book.
</stance>

[1] Notations are for me, the classic example
of the limitations of a declarative syntax and
how a declarative syntax feature can subtly
fool you into thinking you have solved a problem
when all you have done is defer it.

You hit, say, a data validation problem that cannot
be solved with SGML syntax alone so you invent
a notation for it. Knock up the declarative
syntax for it. Lovely. It all parses. *However*
the declarative syntax does not do anything. You
still need to implement it as a processing
layer above SGML.

[2] Python http://www.python.org

<Sean uri="http://www.digitome.com/sean.htm"/>




From larsga at ifi.uio.no  Tue Mar 23 11:08:04 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:21 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity,   namespaces (was WG))
In-Reply-To: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie>
References: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie>
Message-ID: <wkiubssbub.fsf@ifi.uio.no>


* Sean Mc Grath
| 
| For me, I favour the XML side of the lever. Any declarative syntax
| has its limits. It has been my experience that the limits of SGML's
| declarative syntax are quickly reached.[1] Any SGML system I have
| ever worked on has a large collection of ancillary software to
| perform validation, data aggregation, authoring short-cuts that are
| not possible with pure SGML syntax.

I tend to favour the XML side myself (unless I have to write the
documents manually), and I think most people will do so. To me, XML
and SGML are a perfect example of what happens when the
worse-is-better and the-right-thing philosophies collide. (Even though
SGML doesn't really qualify as the-right-thing.)

The main problem with SGML is the complexity of the syntax, which
means that you need a large and complex application to get hold of
your data, and as Gabriel prophesied this means that you have few
choices of applications.

For XML we are beginning to see what we never saw with SGML: a
plethora of pluggable processing components. Much of this is due to
SAX, I think, but much is also due to the simpler nature of XML
syntax. I'm pretty sure that SAX2 will only reinforce this trend by
making it easier to develop and plug together parser filters and other
such components. 
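
A small sketch of what such a pluggable filter can look like, using the
SAX binding in Python's standard library rather than the Java API (the
class name DropElementFilter and the helper filter_document are
assumptions made for this example):

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase, XMLGenerator


class DropElementFilter(XMLFilterBase):
    """Pass SAX events through, suppressing one element and all its content."""

    def __init__(self, parent, name):
        super().__init__(parent)
        self._name = name
        self._depth = 0  # greater than zero while inside a suppressed element

    def startElement(self, name, attrs):
        if self._depth or name == self._name:
            self._depth += 1
        else:
            super().startElement(name, attrs)

    def endElement(self, name):
        if self._depth:
            self._depth -= 1
        else:
            super().endElement(name)

    def characters(self, content):
        if not self._depth:
            super().characters(content)


def filter_document(xml_bytes, drop):
    """Parse xml_bytes, drop the named element, and return the re-serialised text."""
    out = io.StringIO()
    reader = DropElementFilter(xml.sax.make_parser(), drop)
    reader.setContentHandler(XMLGenerator(out, encoding="utf-8"))
    reader.parse(io.BytesIO(xml_bytes))
    return out.getvalue()
```

Because the filter is itself a reader, several of them can be stacked
between the parser and the final handler without any of them knowing
about the others.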

Better design of XSL processors to allow the introduction of SAX
components at various points (of which 4XSL seems to be a good
example) would also help. Likewise with toolkits like SAXON.

In fact, the only downside is that most of this is happening in a
language as awkward as Java.

--Lars M.




From david at megginson.com  Tue Mar 23 11:15:16 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:21 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <36F70005.3D1F76D2@allette.com.au>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F68B71.5967A029@w3.org>
	<36F6D46A.FB33D473@allette.com.au>
	<14070.56970.50161.169467@localhost.localdomain>
	<36F70005.3D1F76D2@allette.com.au>
Message-ID: <14071.29413.900398.832442@localhost.localdomain>

Marcus Carr writes:

 > No question - it would be better if there was a single standard,
 > but the demise of SGML should be natural, driven by nothing other
 > than natural attrition.

I agree, and in fact, it's not really a question of demise at all --
XML is just another iteration of SGML, and SGML is still the
International Standard that provides its foundation.

Everything that we learned in SGML is there in XML, and all the
careful thought and person years of work from Charles Goldfarb and
the other members of the ISO subcommittees are the fundamental reason
for XML's success.  Essentially, the W3C just did what ISO was too
slow at doing, and gave SGML a proper 12-year review; without the ISO
baggage and the emotional attachment to the minutiae of ISO 8879:1986
esoterica, the W3C's SGML ERB cum XML WG was able to wield a sharp
knife and cut away a lot of fat (though still not all of it).

Very soon, I expect that ISO 8879 will pass the flag to XML and move
to a legacy position (no one will be implementing new systems that use
it), but that won't happen until the rest of the XML enterprise-level
software support stabilises.  Even then, there will be major SGML
systems running for decades -- it is a credit to both the SGML and XML
designers that cross-translation between SGML and XML for import/export
is trivially simple, and that there will be few interop problems.

As for standards bodies, I don't know.  Perhaps XML will eventually
migrate to an International Standards body of some sort -- who knows if
the W3C will even exist in five years? -- or (and this might be
preferable) the torch will pass to a new, better-constituted body that
takes over both the W3C and IETF standards.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 23 11:28:43 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:21 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <NBBBJPGDLPIHJGEHAKBAAEOLCPAA.martind@netfolder.com>
References: <007101be74d5$3ba30cb0$11f96d8c@NT.JELLIFFE.COM.AU>
	<NBBBJPGDLPIHJGEHAKBAAEOLCPAA.martind@netfolder.com>
Message-ID: <14071.30766.638574.493702@localhost.localdomain>

Didier PH Martin writes:

 > By simple curiosity: Is it possible to declare an architectural
 > instance from an architectural form in XML by strictly following
 > the XML 1.0 spec? I do not mean here to simply have the
 > architectural elements as our element properties but to declare in
 > the prolog the correspondence between each markup and each
 > architectural element.

Yes -- this works in both SGML and XML: in XML, the architectural
declarations use alternatives to data attributes.

Please, everyone, remember that my statement was that there is nothing
that SGML does that XML cannot do (and vice-versa), not that they
always do them in the same way.

Please step back and take the perspective of a system architect, who
is not concerned with the minutiae of tag omission, data attributes,
or ignorable whitespace: XML and SGML both provide a clear-text
serialisation format for a single-rooted hierarchical tree, with the
ability to impose arbitrary directed graphs on top of that tree.
Nodes are named and have named properties as well as children, and a
node's children can contain both data and other nodes.
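
The architect's view described above can be written down as a very small
data model. This is only a sketch in Python; the class and function names
are assumptions, and the directed-graph layer (ID/IDREF-style links) is
left out:

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union
from xml.sax.saxutils import escape, quoteattr


@dataclass
class Node:
    """A named node with named properties and mixed children (data or nodes)."""
    name: str
    properties: Dict[str, str] = field(default_factory=dict)
    children: List[Union["Node", str]] = field(default_factory=list)


def serialise(node):
    """Write the tree out as clear text, using XML syntax for the example."""
    attrs = "".join(f" {key}={quoteattr(value)}"
                    for key, value in node.properties.items())
    body = "".join(escape(child) if isinstance(child, str) else serialise(child)
                   for child in node.children)
    return f"<{node.name}{attrs}>{body}</{node.name}>"


# Hypothetical sample tree.
tree = Node("doc", {"id": "d1"},
            ["See ", Node("photo", {"href": "pic1.png"}), "."])
```

Nothing in the model depends on whether the serialisation happens to be
SGML or XML; only the serialise function would change.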


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 23 11:34:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:21 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>
References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14071.31214.638556.368976@localhost.localdomain>

Rick Jelliffe writes:
 > 
 > From: David Megginson <david@megginson.com>
 > 
 > >In SGML, you have to write a special program to act on the information
 > >in the data attributes (nothing does this out of the box); in XML, you
 > >have to write a special program to act on the PUA.
 > 
 > Huh? OmniMark allows access to data attributes just as easily as element
 > attributes (http://www.omnimark.com/develop/om40/doc/concept/646.htm),

Yes, so does SP.  But (with the exception you note below) you still
have to write an Omnimark or Perl or C++ program to act on the
information in the data attributes.

 > out of the box. Several CALS-aware tools understand the notations used
 > in data attributes, e.g.,  when used for graphics.

I agree that there are some tools already written that understand
specific data attributes in specific cases, but in the general case, you
still have to write a specialised program (using Omnimark, Perl, or
whatever) to do something useful with the data attributes, just as you
have to write a specialised program (using Java, Perl, or whatever) to
do something useful with PUA characters in XML.

 > And I don't agree that elements and characters and attributes and
 > entities should be thought of as interconvertible: search routines
 > look for character codes--I don't know of any search routines which
 > allow grepping on data and elements.

Perhaps I misunderstood -- I thought that you were talking about the
problem of including specialised, non-canonical characters in
attribute values (say, to represent three variant 'd' graphemes in a
10th-century English manuscript or a customised Han character).  I
think that PUA characters provide a good solution for that problem --
the only difficulty is that all of the knowledge about those
characters has to be encoded in the processing software using a lookup
table, while the SGML data-attribute solution is slightly more modular
since you can pass on extra generic information.
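
For what it is worth, the lookup-table approach might look something like
the following Python sketch. The code point assignments and descriptions
are entirely hypothetical; a real project would have to publish its own
private agreement:

```python
import unicodedata

# Hypothetical private agreement: three variant 'd' graphemes in the PUA.
PUA_TABLE = {
    "\ue000": {"base": "d", "note": "variant d, form 1"},
    "\ue001": {"base": "d", "note": "variant d, form 2"},
    "\ue002": {"base": "d", "note": "variant d, form 3"},
}


def is_private_use(ch):
    """True for characters in a Unicode private use area (category 'Co')."""
    return unicodedata.category(ch) == "Co"


def to_base_text(text):
    """Replace each known PUA character with its base letter, e.g. for searching."""
    out = []
    for ch in text:
        if is_private_use(ch):
            out.append(PUA_TABLE[ch]["base"])  # KeyError if the table lacks it
        else:
            out.append(ch)
    return "".join(out)
```

Everything the software knows about these characters lives in PUA_TABLE,
which is exactly the modularity cost mentioned above: the document itself
carries no description of them.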


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From reschke at medicaldataservice.de  Tue Mar 23 11:50:51 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun  7 17:10:21 2004
Subject: SQL queries expressed in XML
Message-ID: <001201be7523$5fb1b720$2e00a8c0@julian>

Andrew McNaughton <andrew@squiz.co.nz> wrote:

> > > we recently had the idea to use XML to express SQL-like queries
> > > (so this is not about querying XML -- it is about using XML to
> > > express queries). It seems to me that we might not be the first
> > > ones; so has anybody defined an XML document type for expressing
> > > SQL queries?
> >
> > And just to widen this question slightly - assuming I do have an XML
> > representation of a language construct - what's the best way to do
> > the conversion from the XML representation to the 'correct' language
> > representation?
> >
> > Could I use XSL to do this - or would this be going against the grain?
> >
> > (Just to qualify this I'm relatively new to XML, and *extremely* new to
> > XSL).
>
> XSL doesn't seem to do very well where the desired output is not well
> formed.  If your SQL queries have '"', '<', '>' or '&' in them, then
> you're going to start getting into kludges.  perl or DSSSL would be
> better suited to the task.
>
> *Why* do you want to put your queries into XML?  Do you need access to the
> structure of your queries?  Perhaps you just need something that can be

The idea was to reuse XML tools in a project which is XML related anyway.
Expressing a query in XML instead of using a "proprietary" representation
would allow us to use a standard parser to transform it into an object
representation (DOM), and it would also have the benefit that standard tools
could be used to actually enter or render a query string.
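
To sketch the idea (this is not an existing document type; the element
and attribute names select, column, where and eq are invented for the
example), a query expressed in XML can be read with a stock DOM parser
and turned into a parameterised SQL string in a few lines of Python:

```python
from xml.dom.minidom import parseString

# Hypothetical query vocabulary, made up for this illustration.
QUERY = """<select table="patients">
  <column name="id"/>
  <column name="surname"/>
  <where><eq column="ward" value="7"/></where>
</select>"""


def _identifier(name):
    """Accept only plain names, since identifiers cannot be bound as parameters."""
    if not name.isidentifier():
        raise ValueError(f"not a plain SQL identifier: {name!r}")
    return name


def to_sql(xml_text):
    """Return (sql, params) for a query written in the vocabulary above."""
    root = parseString(xml_text).documentElement
    table = _identifier(root.getAttribute("table"))
    columns = [_identifier(c.getAttribute("name"))
               for c in root.getElementsByTagName("column")]
    sql = "SELECT " + ", ".join(columns) + " FROM " + table
    tests = root.getElementsByTagName("eq")
    params = [t.getAttribute("value") for t in tests]
    if tests:
        clauses = [_identifier(t.getAttribute("column")) + " = ?" for t in tests]
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

Values travel as bound parameters rather than being pasted into the SQL
text, so characters such as '<' or '&' in a value only need the usual XML
escaping in the query document.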

> ...
> I figure any boolean query can be expressed as a decision tree terminating
> in true or false leaf nodes, that this maps well into XML, and that it
> should be able to be used to search for queries matching a given document
> using existing tools (eg sgrep).  I believe this could lead to a relatively
> simple processing model, but it remains to be seen how efficient it will be.

Basically this is similar to our thinking...
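
One possible reading of the decision-tree idea, as a Python sketch. The
element names test, yes, no, true and false are assumptions chosen for
the example, and a document is reduced to a plain set of terms:

```python
import xml.etree.ElementTree as ET

# Hypothetical encoding of the boolean query "xml AND NOT sgml".
TREE = """<test term="xml">
  <yes>
    <test term="sgml">
      <yes><false/></yes>
      <no><true/></no>
    </test>
  </yes>
  <no><false/></no>
</test>"""


def evaluate(node, terms):
    """Walk the tree until a true or false leaf is reached."""
    if node.tag == "true":
        return True
    if node.tag == "false":
        return False
    branch = node.find("yes" if node.get("term") in terms else "no")
    return evaluate(branch[0], terms)  # the branch holds exactly one subtree


def matches(xml_text, terms):
    return evaluate(ET.fromstring(xml_text), set(terms))
```

Each evaluation follows a single path from the root to a leaf, so the
cost per query is bounded by the depth of the tree rather than its size.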

> If anyone is aware of any relevant work that is being or has been done I'd
> appreciate hearing about it.  XML or otherwise.

This is precisely why I asked :-)

--
Julian Reschke
MedicalData Service GmbH (http://www.medicaldataservice.de)




From reschke at medicaldataservice.de  Tue Mar 23 12:17:41 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun  7 17:10:22 2004
Subject: SQL queries expressed in XML
Message-ID: <001001be7522$aa5aae40$2e00a8c0@julian>

Kay Michael <Michael.Kay@icl.com> wrote:

>> we recently had the idea to use XML to express SQL-like
>> queries (so this is
>> not about querying XML -- it is about using XML to express
>> queries). It
>> seems to me that we might not be the first ones; so has
>> anybody defined an
>> XML document type for expressing SQL queries?
>>
>I've thought about the question and some of my thoughts are implemented in
>SAXON's SQLStyleSheet, which is the beginnings of an XSL extension to allow
>a stylesheet to update an RDBMS with data from an XML source document.
>
>As always in this area the first problem is deciding how much of the syntax
>should be "angle brackets" and how much should be rules for the content of
>elements/attributes. The answer to that depends on tradeoffs between
>different modes of use. So the question is, who is going to use it, and
>what for?
>
>In particular if you are interested in queries, what are you planning to do
>with the results? Print them out, merge them into the DOM representation of
>the document, or what?

I don't think that this is really relevant, because one might want to talk
to a storage which doesn't even do anything XML related. However, to answer
the question, I would expect to get the results either in a DOM or in an XML
string.


--
Julian Reschke
MedicalData Service GmbH (http://www.medicaldataservice.de)




From bhall at merrillhall.com  Tue Mar 23 12:59:19 1999
From: bhall at merrillhall.com (Ben Hall)
Date: Mon Jun  7 17:10:22 2004
Subject: MS XML 2.0 book cancelled
Message-ID: <E10PQlv-0007Id-00@punch.ic.ac.uk>

I received the following email from Amazon.Com.

At 09:23 PM 3/18/99 -0800, you wrote:
>
>Hello from Amazon.com!  
>
>We have contacted the supplier by phone and are sorry 
>to report that  the release of the following title has been
>cancelled:
>	
>   Microsoft Corporation "Microsoft XML 2.0 Programmer's 
>	Guide and Software Development Kit With CDROM"
>
>This unavailable item has been cancelled from your order.
>	
>Your credit card will NOT BE CHARGED for this item.
>	
>Your order has been cancelled.
>
>Thanks for shopping at Amazon.com, and we hope to see you again!
>
>Sincerely,
>
>Customer Service Department
>Amazon.com
>http://www.amazon.com
>Earth's Biggest Selection
>


===================================
benjamin hall

merrill-hall new media, inc.

bhall@merrillhall.com
===================================




From kurt.donath at lmco.com  Tue Mar 23 14:08:01 1999
From: kurt.donath at lmco.com (Kurt Donath)
Date: Mon Jun  7 17:10:22 2004
Subject: Microsoft XML 2.0?
Message-ID: <36F79E80.84A9E617@lmco.com>


Simon,

You had posted a message to xml-dev about the Microsoft XML 2.0 book on
sale at Amazon.  I went and placed an order for it, then was informed
today:

"We have contacted the supplier by phone and are sorry 
to report that  the release of the following title has been
cancelled:
        
   Microsoft Corporation "Microsoft XML 2.0 Programmer's 
        Guide and Software Development Kit With CDROM"

This unavailable item has been cancelled from your order."

Hmmm.  Is this YOUR doing?


Kurt Donath


-- 
Kurt Donath
315.456.6276
Staff Systems Engineer
Lockheed Martin - Enterprise Information Systems
Systems Engineering / Webserv



From bckman at ix.netcom.com  Tue Mar 23 14:26:26 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:22 2004
Subject: small problem
Message-ID: <005a01be7538$b5d4e7c0$91acdccf@ix.netcom.com>

>Can we make single XML file which contains the data and also style of
>that data( How to display in the  browser ) with out having another XSL

Yes it is possible, use CSS

Frank

----- Original Message ----- 
From: Jayadeva Babu Gali <jayadeva@lgsi.co.in>
To: <xsl-list@mulberrytech.com>; <xml-dev@ic.ac.uk>
Sent: Tuesday, March 23, 1999 4:43 AM
Subject: small problem


>Hi,
>
>Can we make single XML file which contains the data and also style of
>that data( How to display in the  browser ) with out having another XSL
>if its possible can u please correct the attaching file with this mail.
>
>/*****  xml file with style sheet *****/
><?xml version="1.0"?>
><xsl:stylesheet
>       xmlns:xsl="http://www.w3.org/TR/WD-xsl"
>       xmlns="http://www.w3.org/TR/REC-html40"
>       result-ns="">
>
>         <xsl:template match="/">
>         <HTML>
>          <HEAD>
>           <TITLE>Test</TITLE>
>          </HEAD>
>          <BODY>
>             <xsl:apply-templates/>
>          </BODY>
>        </HTML>
>        </xsl:template>
>        <xsl:template match="*">
>        <xsl:apply-templates/>
>        </xsl:template>
>
><xsl:template match="persons">
><xsl:for-each select="person[1]">
>   <h1><xsl:value-of select="firstname"/></h1>
>   <h1><xsl:value-of select="lastname"/></h1>
></xsl:for-each>
>        </xsl:template>
>        <xsl:template match="textnode()">
>        <xsl:value-of select="."/>
>        </xsl:template>
>
></xsl:stylesheet>
>
>
><persons>
>   <person>
>     <firstname>jayadev</firstname>
>     <lastname>gali</lastname>
>   </person>
>   <person>
>       <firstname>shekar</firstname>
>       <lastname>ksirsagar</lastname>
>   </person>
></persons>
>
>
>
>
>
> XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
>




From crism at oreilly.com  Tue Mar 23 14:55:36 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun  7 17:10:22 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F74B26.21CF46EC@w3.org> (message from Chris Lilley on Tue, 23
	Mar 1999 09:04:54 +0100)
Message-ID: <199903231453.JAA00219@ruby.ora.com>

> Date: Tue, 23 Mar 1999 09:04:54 +0100
> From: Chris Lilley <chris@w3.org>
> 
> The default rules apply only if no other rule is in place for a
> specific Media type. The registration for text/xml can override this
> behaviour if it wishes to.

In theory, but not in practice.  A processor that understands
text/plain but not text/xml is allowed to use the rules for text/plain
when encountering text/xml.  So although text/xml can say, "Do X," a
processor that doesn't know text/xml from text/adam may well do Y
instead.  Mandating that people who can't hear you must listen is not
particularly effective.  This is why application/xml exists: to avoid
fallback text/* rules.

> Date: Tue, 23 Mar 1999 09:50:19 +0100
> From: Chris Lilley <chris@w3.org>
> 
> So, in consequence: example file such as the Chinese XML examples at
> http://xml.ascc.net/xml/test/index.html (where each example is
> available in UTF-8, Big5 and GB2312, all correctly labelled in the
> XML encoding declaration) are now sets of invalid XML files which
> are required to produce a critical error because of the invalid byte
> sequences in what is now described as a US-ASCII file?

Describing files in encodings other than US-ASCII or ISO 8859-1 (or
maybe other ISO 8859s) as text/anything is not a very good idea.  The
rules for text/* allow many unhealthy things; 8-bit data is not even a
safe assumption, and line-end normalization can be a killer.  The
fallback rules for MIME's two-level hierarchy are only the final straw;
for non-European encodings, I would use application/xml.
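
A simplified Python sketch of the rule under discussion, for readers who
want to see the difference between the two media types side by side. The
function name is an assumption, and byte-order-mark and UTF-16 detection
are deliberately left out:

```python
from email.message import Message
from typing import Optional


def effective_charset(content_type: str, xml_declared: Optional[str]) -> str:
    """Pick the charset a receiver would use for an XML entity.

    content_type is the raw Content-Type header value; xml_declared is the
    encoding named in the XML declaration, or None if there is none.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    if charset is not None:
        return str(charset).lower()          # an explicit parameter always wins
    if msg.get_content_type().startswith("text/"):
        return "us-ascii"                    # text/* default; declaration ignored
    return (xml_declared or "utf-8").lower() # application/xml: the XML rules apply
```

For a Big5 file served without a charset parameter, the first media type
yields "us-ascii" and the second yields "big5", which is the whole
argument for application/xml in one line.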

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From david at megginson.com  Tue Mar 23 17:41:42 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:22 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity,
  namespaces (was WG))
In-Reply-To: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie>
References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>
	<3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie>
Message-ID: <14071.33102.817935.724149@localhost.localdomain>

Sean Mc Grath writes:

 > This is an interesting thread.  Many non-tag-minimization
 > reliables can be put forth as things that SGML "can do" that
 > XML cannot. Things like data attributes, exclusion exceptions,
 > internal SDATA entities and so on.

I think that I agree with what Sean is saying here and later in the
message -- think of *what* you can represent rather than *how* you
represent it.  For instance, let's take a graphic where we want to
provide the width, height, and colour depth to the processor.  Here's
a typical, declaration-heavy (Sean's term) SGML way to do it (except
that a hard-core SGMLie would use public IDs):

  <!DOCTYPE doc SYSTEM "mydoc.dtd" [
    <!NOTATION png PUBLIC "....">
    <!ATTLIST #NOTATION png
      width NUMBER #IMPLIED
      height NUMBER #IMPLIED
      depth NUMBER #IMPLIED>
    <!ENTITY pic1 SYSTEM "pic1.png" NDATA png [
      width=300
      height=200
      depth=16
    ]>
  ]>

    <doc>
     <photo src=pic1>
    </doc>

Here's a typical XML way to do it (also works in SGML):

    <?xml version="1.0"?>

    <!DOCTYPE doc SYSTEM "mydoc.dtd">

    <doc>
     <photo href="pic1.png" width="300" height="200" depth="16"/>
    </doc>

You're modelling exactly the same information about the picture in
both -- data attributes provide an alternative mechanism for modelling 
the information, but they do not allow you to represent anything that
you could not represent without them.
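
As a rough illustration of the XML form, the sketch below reads the
picture properties with Python's standard ElementTree module.  The
function name and the dictionary keys are invented for the example,
and the DOCTYPE line is left out so that nothing depends on
"mydoc.dtd" being available.

```python
import xml.etree.ElementTree as ET

# The XML example from above, minus the XML and DOCTYPE declarations.
XML_FORM = (
    '<doc>'
    '<photo href="pic1.png" width="300" height="200" depth="16"/>'
    '</doc>'
)

def photo_properties(document):
    """Return the picture properties carried by the first <photo> element."""
    photo = ET.fromstring(document).find("photo")
    return {
        "source": photo.get("href"),
        "width": int(photo.get("width")),
        "height": int(photo.get("height")),
        "depth": int(photo.get("depth")),
    }
```

An SGML application reading the data attributes of the pic1 entity
would end up with the same four values; only the place they are
written differs.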

 > I see SGML and XML at opposite ends of a balanced lever.
 > On one side we have SGML - high on declarative syntax, low
 > on home grown code. On the other side we have XML - low on
 > declarative syntax, high on home grown code.
 > 
 > SGML gives you declarative syntax that can obviate the
 > need for coding around certain types of data modelling,
 > content authoring problems.
 > 
 > XML is light on the declarative syntax, leaving more
 > in the realm of "application specific" implementation
 > in a programming language.
 > 
 > Ultimately, both views have their place and both
 > may be "correct" for a given problem domain.

Right -- the question is not whether there is a benefit to continuing
to develop the two in parallel, but whether the benefit will outweigh
the cost.  We'll see what the market decides over the next few years.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 23 17:41:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:22 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <19990323153036.A9794@io.mds.rmit.edu.au>
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
	<36F0CFF4.365B@hiwaay.net>
	<36F10CFC.CFEB89A8@goon.stg.brown.edu>
	<36F13992.150D05F9@w3.org>
	<36F18209.8C68524@allette.com.au>
	<36F68B71.5967A029@w3.org>
	<36F6D46A.FB33D473@allette.com.au>
	<14070.56970.50161.169467@localhost.localdomain>
	<19990323153036.A9794@io.mds.rmit.edu.au>
Message-ID: <14071.31774.199352.713952@localhost.localdomain>

In his message, in a part that I'm not quoting (I do respond to
specific details below), Marcelo Cantos argues that it's not for us to
decide whether both full SGML and XML can co-exist, and I agree -- I
am simply predicting that the market might not find it worthwhile to
continue developing two standards that are architecturally identical
and differ even in the implementation details only in nit-picky ways.

Choose an arbitrary number for the cost of continuing to develop two
standards rather than one -- say, US$100M/year (if all of the big
enterprise vendors have to develop, test, debug, document, support,
and maintain both full SGML and XML versions of their software, as
well as donate employees' time to committee work) and unaccountable
additional hours of free time donated by OSS writers.

Do SGML-specific features like SHORTREFs, data attributes, and
omissible tags sometimes make life simpler for implementors?  Of
course they do.

Are the differences worth US$100M/year (or whatever number you pick)?
I don't know, and the decision is not ours to make, but the market
will figure it out soon enough.  Whatever happens, there will
certainly be money to be made from supporting the existing SGML
installations, so there will be good justification for
backwards-compatibility in some major tools.

Now, on to the specific points...


Marcelo Cantos writes:

 > > XML does nothing that SGML cannot do.
 > 
 > When developing the TOC management system for our document
 > fragmenting toolkit, we chose XML to represent the TOC.  SGML was
 > not an option, because we didn't know the content model in advance
 > and couldn't build it automatically from the DTD's of the
 > individual documents.
 >
 > Also, we couldn't use a homogeneous element tree with attributes,
 > because we actually extracted structured content from the documents
 > for insertion into the TOC (sure, we could have serialised the content
 > into an SGML attribute, but that would have been a perverse and
 > painful alternative to simply using XML).

There are work-arounds that you could have used in SGML, such as
synthesised DTDs using ANY.  Both SGML and XML *can* do this, but in
your case, XML makes it a little easier (as would WebSGML).  The
differences are important to us, as SGML/XML implementors, but would
not really concern the architect of a large system except to the point 
that they affected maintainability.

 > > SGML does nothing that XML cannot do.
 > 
 > On several occasions I have had to import textual information, and
 > have been able to treat the data as SGML with appropriate choice of
 > shortrefs.
 > 
 > With XML I would have been forced to write an intermediate
 > translation layer and would have consequently lost the originals
 > (or been forced to store the original and transformed document, or
 > add the extra layer to every access).
 >
 > True, they are not always adequate for the job, but I certainly would
 > not have happily forgone them in my project because they wouldn't have
 > been useful in someone else's project!

Or you could simply have defined a round-trip mapping -- tab-delimited 
fields map to <item> elements map back to tab-delimited fields.  You
could also, with XML or SGML, point into the original without altering 
it (HyTime provides good mechanisms for doing that in SGML or XML).
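
A minimal sketch of such a round-trip mapping, assuming fields contain
no tabs or newlines of their own and only XML-legal characters.  The
<table> and <row> wrapper elements and both function names are
invented here; only <item> comes from the description above.

```python
import xml.etree.ElementTree as ET

def tsv_to_xml(text):
    """Map tab-delimited lines to <row>/<item> elements."""
    root = ET.Element("table")
    for line in text.split("\n"):
        row = ET.SubElement(root, "row")
        for field in line.split("\t"):
            ET.SubElement(row, "item").text = field
    return ET.tostring(root, encoding="unicode")

def xml_to_tsv(xml_text):
    """Map <row>/<item> elements back to tab-delimited lines."""
    root = ET.fromstring(xml_text)
    lines = ["\t".join(item.text or "" for item in row) for row in root]
    return "\n".join(lines)
```

Whether this counts as "simply" is exactly the implementation cost
under discussion: it is a translation layer, however thin.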

Again, however, Marcelo is writing about implementation details, not
about what SGML and XML are capable of representing in the abstract.
In this case, SGML makes life a little easier for a *very* experienced
designer under highly specialised circumstances.

Lexically, SGML and XML differ in minor ways; logically, they are
essentially identical.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From eric at hellman.net  Tue Mar 23 21:03:41 1999
From: eric at hellman.net (Eric Hellman)
Date: Mon Jun  7 17:10:22 2004
Subject: IE5 and iso entities
In-Reply-To: <004201be754e$00ddd730$3af96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <v04020a2ab31dab06ce6f@[192.168.1.1]>

At 3:55 AM +1100 3/24/99, Rick Jelliffe wrote:
>>>A test document (a technical article describing blue semiconductor
>>>lasers, if anyone cares) is at http://nsr.mij.mrs.org/4/1/article.xml
>
>I have put the latest versions at  http://www.ascc.net/xml/
>under the resources page.
>
>When I looked at your DTD I got a load error too. But I notice that your
>version of ISOnum seems to be incomplete (at least, when I download it
>to here it ends with a <!--  which is not correct.)
>
>Rick

So, I installed your latest pen files, but IE5 still gives screwy error
messages.


A name was started with an invalid character. Line 100, Position 19

<!ENTITY percnt "%" ><!--=percent sign-->
------------------^

I replaced "%" with &#37; then I got:

The replacement text for a parameter entity must be properly nested with
parenthesized groups. Line 43, Position 9

%ISOnum

I removed the entities for [,],{,},(,)

my new error was :

An invalid character was found inside an entity reference. Line 191,
Position 19

<!ENTITY nbsp   "&


The full text of this entry is
<!ENTITY nbsp   "&#160;" >

I tried changing it to
<!ENTITY nbsp   "&#xA0;" > and got the same error message.

So I ask the list again: has anyone, anywhere, gotten IE5 to read ISO
entity tables, or are we going to have to do entity substitution on the
server side?
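
For reference, the server-side fallback could be as small as the
sketch below, which rewrites named entity references as numeric
character references before the document is sent.  The three-entry
table is a hypothetical stand-in for the full ISO sets, the function
name is invented, and the regular expression does not skip comments or
CDATA sections.

```python
import re

# Hypothetical excerpt; a real table would be generated from the ISO
# entity declarations (ISOnum, ISOlat1, and so on).
ISO_ENTITIES = {"nbsp": 160, "percnt": 37, "copy": 169}

# The five entities predefined by XML itself are left untouched.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9._-]*);")

def substitute_entities(text):
    """Replace known named references with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in PREDEFINED or name not in ISO_ENTITIES:
            # Leave predefined and unknown references exactly as written.
            return match.group(0)
        return "&#%d;" % ISO_ENTITIES[name]
    return ENTITY_REF.sub(replace, text)
```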

Eric
Eric Hellman
Openly Informatics, Inc.
http://www.openly.com/           Tools for 21st Century Scholarly Publishing



From mrc at allette.com.au  Tue Mar 23 22:17:10 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:22 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity,  namespaces (was WG))
References: <3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie>
Message-ID: <36F812B6.F46B5249@allette.com.au>


Sean Mc Grath wrote:

> SGML gives you declarative syntax that can obviate the
> need for coding around certain types of data modelling,
> content authoring problems.
>
> XML is light on the declarative syntax, leaving more
> in the realm of "application specific" implementation
> in a programming language.
>
> Ultimately, both views have their place and both
> may be "correct" for a given problem domain.

That was the topic of my presentation at the XML/SGML Asia Pacific conference last year (call
for papers soon to be issued). If the deliverable is simply documents that conform to a
certain structure, the most flexible approach would allow you to use an SGML or XML processor
depending on the task. Provided the cost of this isn't excessive (sometimes it's nothing), it
can be handy to use one processor or the other.

Perhaps this is partly due to the dynamics of our organisation; typically we have data
delivered to us in any format and we're expected to deliver back *ML. The clients usually want
this to be as "black box" as possible, so we're free to implement whatever methods and tools
we see fit. During conversion, we may use an SGML parser to aid with tag omitability, but
increasingly our clients want valid XML data, so it must finally be parsed with an XML parser,
as well as any stages that benefit from a well-formedness check. As David Megginson mentioned
the other day, this may be difficult across an organisation, but it's not difficult across a
conversion team. Although I know this won't work for everyone, I prefer to consider SGML and
XML as two arrows in a quiver, not two quivers.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From david at megginson.com  Tue Mar 23 22:23:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:22 2004
Subject: SAX2: LexicalHandler draft v.1.1
In-Reply-To: <8725673C.0072EEF5.00@d53mta03h.boulder.ibm.com>
References: <8725673C.0072EEF5.00@d53mta03h.boulder.ibm.com>
Message-ID: <14072.4772.577599.352783@localhost.localdomain>

roddey@us.ibm.com writes:

 > >public interface LexicalHandler
 > >{
 > >    public abstract void xmlDecl (String version,
 > >                     String encoding,
 > >                     String standalone)
 > >    throws SAXException;
 > >

 > 1) The xmlDecl() needs another parameter. In addition to the encoding
 > string, which is the exact text of the string in the document, some
 > customers need to know what the actual encoding is (which might have been
 > auto-sensed.) They need this in some cases to get the document back to the
 > original encoding. So there should be an 'actualEncoding' parameter which
 > is either the same as encoding (if there was an encoding string in the
 > document) or the actual encoding used if not (probably in some canonical
 > format, since there are only about 6 auto-sensed encodings right?)

With the new SAX2 modular setup, it will be possible for people to
create handlers that provide this level of detail if they want.  I'm
still wavering about including the XML Declaration at all.
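
For anyone who wants to build such a handler, the auto-sensing itself
is small.  The sketch below loosely follows the byte patterns listed
in the autodetection appendix of the XML 1.0 REC; the function name
and the returned labels are invented here and are not part of any SAX
draft.

```python
def sniff_encoding(data):
    """Guess an encoding family from the first bytes of an XML entity."""
    head = bytes(data[:4])
    four_byte_patterns = {
        b"\x00\x00\x00\x3c": "ucs-4-be",
        b"\x3c\x00\x00\x00": "ucs-4-le",
        b"\x00\x3c\x00\x3f": "utf-16-be",         # "<?" without a byte order mark
        b"\x3c\x00\x3f\x00": "utf-16-le",         # "<?" without a byte order mark
        b"\x3c\x3f\x78\x6d": "ascii-compatible",  # "<?xm": read the declaration
        b"\x4c\x6f\xa7\x94": "ebcdic",
    }
    if head in four_byte_patterns:
        return four_byte_patterns[head]
    if head[:2] == b"\xfe\xff":
        return "utf-16-be"   # byte order mark, big-endian
    if head[:2] == b"\xff\xfe":
        return "utf-16-le"   # byte order mark, little-endian
    # No recognisable signature: XML says to assume UTF-8.
    return "utf-8"
```

A handler could report this value next to the declared encoding string
whenever the two differ.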

 > 2) I made the names for the comment, PI, and whitespace call backs
 > on the DTD handler have different names from those of the ones on
 > the document handler. This is somewhat safer in C++ since it means
 > not having a single method override two pure virtuals from a
 > mixin. It also allows the handler to be less stateful in the
 > situation where the same object is implementing the handler for
 > both document and DTD (since they then know that it's for one or the
 > other without having to keep flags for that stuff, which is not
 > really a biggie but I thought it was worth it.)

That's an interesting suggestion -- I don't think that the state
information is too much of a burden, but we can watch closely.
There's also an interop problem, since SAX 1.0 parsers already use
DocumentHandler.processingInstruction() to report PIs in the DTD as
well.

 > 3) I report whitespace in the DTD, so that it can also be pretty
 > much exactly recreated. I only report this if I'm asked to (by an
 > 'advanced callbacks' flag, which also controls comments and PIs
 > being reported from the DTD.)

This is too far for the SAX core, but I'd encourage others to develop
handlers like this (a crowded market is a healthy market).

 > 4) I have events for the begin/end of the internal subset.

This information is available in the current lexical handler in a
slightly different form: the start/endDTD() handler gives the overall
boundaries, and the start/endEntity() call for "[dtd]" will delimit
the external subset (if any); everything inside the DTD but outside
the external subset (or other external parameter entities) is in the
internal subset by default.
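
A sketch of how a handler might derive the internal subset from those
events.  The class and its method signatures are hypothetical (they
mirror the event names above, not the exact draft), and it assumes
that every entity opened inside the DTD counts as leaving the text of
the internal subset.

```python
class SubsetTracker:
    """Track whether the parser is currently inside the internal subset."""

    def __init__(self):
        self.in_dtd = False
        self.entity_depth = 0

    def startDTD(self, name, public_id, system_id):
        self.in_dtd = True

    def endDTD(self):
        self.in_dtd = False

    def startEntity(self, name):
        # "[dtd]" is the external subset; other names are parameter
        # entities opened while reading the DTD.
        if self.in_dtd:
            self.entity_depth += 1

    def endEntity(self, name):
        if self.in_dtd:
            self.entity_depth -= 1

    def in_internal_subset(self):
        # Inside the DTD but outside every entity.
        return self.in_dtd and self.entity_depth == 0
```

The events have to come from a parser that reports them; nothing here
parses anything itself.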

 > 5) I have a callback for notation decl, attlist decls, and attdefs,
 > which are important.

Notations are already in SAX 1.0 (as required by the XML REC).  The
remainder will appear in DTDDeclHandler as soon as I have a chance to
draft a proposal for it.

 > 6) I have a flag on each entity, element, etc... decl callback
 > called 'isIgnored'. This lets the caller know that this one was
 > ignored because it was a subsequent instance of a previously
 > declared decl. So they don't need to keep it if they just care
 > about actual content, but they do if they want to recreate the
 > original document (which is extremely important to some folks.)

Yes, this is still an open question for DTDDeclHandler.

 > 7) I haven't done this yet, but some customers are insisting that
 > any event callback that reports a quoted string indicate whether
 > single or double quotes were used (again for recreation of the
 > original document.) This seems a bit over the top to me, since they
 > are equivalent, but I guess the customer is always right even when
 > he's wrong.

That's precisely why SAX2 (I almost typed "ModSAX" -- sniff, sniff) is 
designed for easy extensibility and feature discovery.  Business
requirements will demand different types of support for different
situations, and SAX2 provides a clean way to do that.  I don't imagine 
that we'd put this kind of thing in the core, though.
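
To show what feature discovery can look like from the application
side, here is a sketch using the SAX2 binding that ships with Python's
standard library, which is not the draft under discussion here.  The
helper name and the example.org feature URI are invented; getFeature()
and the two exception classes are real parts of xml.sax.

```python
import xml.sax
from xml.sax.handler import feature_namespaces

def supports(parser, feature_uri):
    """Return True if the parser recognises the given feature URI."""
    try:
        parser.getFeature(feature_uri)
        return True
    except (xml.sax.SAXNotRecognizedException,
            xml.sax.SAXNotSupportedException):
        return False

parser = xml.sax.make_parser()

# A standard feature, and a made-up one for reporting quote style.
knows_namespaces = supports(parser, feature_namespaces)
knows_quote_style = supports(
    parser, "http://example.org/sax/features/report-quote-style")
```

An application that needs the quote style can test for the feature
and fall back, or pick another parser, when it is missing.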

 > That's all I can think of right now. It would really be nice if we
 > could map all of the information that we go through the trouble
 > (and overhead) of parsing to public APIs. Otherwise, customers end
 > up using our internal event API in order to get the information
 > that they require. This locks down our internal API more than we'd
 > like, but there is little we can do about it if they *have* to have
 > this extra info to do what they do.

See my comments above on extensibility.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 23 22:33:21 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:22 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <00ff01be74ad$c71eeed0$2ee044c6@arcot-main>
References: <00ff01be74ad$c71eeed0$2ee044c6@arcot-main>
Message-ID: <14072.5617.960838.731783@localhost.localdomain>

Don Park writes:

[dpm]

 > >  public interface AttributeValueHandler
 > >  {
 > >    public abstract void startEntity (String name)
 > >      throws SAXException;
 > >    public abstract void endEntity (String name)
 > >      throws SAXException;
 > >    public abstract void characters (char ch[], int start, int length)
 > >      throws SAXException;
 > >  }
 > >
 > >  public interface AttributeValue2 extends AttributeValue
 > >  {
 > >    public abstract boolean isSpecified (String name);
 > >    public abstract void accept (AttributeValueHandler handler)
 > >      throws SAXException;
 > >  }

[Don]

 > I don't think event-based interface is appropriate for this
 > purpose.  Why not introduce an iterator or an array-like
 > interface?

Perhaps -- personally, I'm a little annoyed at having to do this at
all.  XML messed up a little here by making attribute values too
difficult to process.  

The problem is that even if you don't care about entity boundaries,
the XML 1.0 REC requires reporting of any entities that are not
expanded (in the case, for example, of a non-validating parser that
hasn't read the declaration in the external DTD subset).  As a result,
in a literal reading of the spec, a fully-conformant XML 1.0 API can
*never* treat attribute values simply as strings.  SAX 1.0 does so,
and no one has ever minded, but conformance is conformance...
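
For comparison, an array-like representation along the lines Don
suggests might look like the sketch below.  The class and function
names are invented for illustration and appear in no SAX draft: the
value is a list of text chunks and entity references, and it can be
flattened to a plain string only when every referenced entity is
known.

```python
from dataclasses import dataclass

@dataclass
class Text:
    value: str

@dataclass
class EntityRef:
    name: str

def flatten(chunks, known_entities):
    """Join chunks into a string, or return None if an entity is unexpanded."""
    parts = []
    for chunk in chunks:
        if isinstance(chunk, Text):
            parts.append(chunk.value)
        elif chunk.name in known_entities:
            parts.append(known_entities[chunk.name])
        else:
            # The declaration was never read, so no string form exists.
            return None
    return "".join(parts)
```

With value = [Text("Copyright "), EntityRef("owner")], flatten()
returns a string only if "owner" is in the table passed to it.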


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From simpson at polaris.net  Wed Mar 24 00:39:06 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun  7 17:10:22 2004
Subject: IE5 and iso entities
In-Reply-To: <v04020a2ab31dab06ce6f@[192.168.1.1]>
References: <004201be754e$00ddd730$3af96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <3.0.5.32.19990323193820.01539d60@nexus.polaris.net>

At 04:05 PM 3/23/99 -0500, Eric Hellman wrote:
>So I ask the list again: has anyone, anywhere, gotten IE5 to read ISO
>entity tables, or are we going to have to do entity substitution on the server
>side?

Yesterday on the XML-L list, John Robert Gardner (mailto:
jgardner@blue.weeg.uiowa.edu) announced a demo that does that (among other
things). His DTD is at:
	http://www.uiowa.edu/~etd/tdm.dtd
The entities declared all point to Rick's .pen files stored at James
Tauber's schema.net site. (A couple of the files had minor typos until last
week, but Rick has since cleaned them up.)

The demonstration is at:
	http://www.uiowa.edu/~etd/front.xml
and there's a discussion of concepts at:
	http://www.uiowa.edu/~etd/

As far as I know, JohnG is not actually referencing the entities anywhere
in the instance -- he included them in the DTD simply for completeness and
possible future use. Nevertheless, IE5 doesn't choke on their declarations.
==========================================================
John E. Simpson            | The secret of eternal youth
simpson@polaris.net        | is arrested development.
http://www.flixml.org      |  -- Alice Roosevelt Longworth



From simpson at polaris.net  Wed Mar 24 00:49:21 1999
From: simpson at polaris.net (John E. Simpson)
Date: Mon Jun  7 17:10:22 2004
Subject: FWD: Announcement - World Wide Web Wrapper Factory (W4F)
Message-ID: <3.0.5.32.19990323194855.01539b40@nexus.polaris.net>

I received this announcement via e-mail yesterday. It may (or may not :) be
of interest to xml-dev and xml-l subscribers. Contact information is at the
foot of the announcement.

[Disclaimer: I have no affiliation with the W4F product development group.
My correspondent, previously unknown to me, just happened on my website.
Apologies for the cross-posting to subscribers of both lists.]

>----- Looking at the Web through XML glasses, using W4F -----
>
>The World Wide Web Wrapper Factory (W4F) is a Java toolkit to
>generate wrappers for HTML data sources.
>
>Version 1.03 offers a built-in declarative mapping to XML.
>Using W4F it is now possible to easily specify the translation 
>of HTML pages into XML documents. Moreover, the specification 
>gives for free the DTD.
>
>W4F consists of a retrieval language to identify Web sources, a
>declarative extraction language (HEL: HTML Extraction Language) 
>to express robust extraction rules and a mapping interface to 
>export the extracted information into some user-defined data-
>structures (text, Java objects, XML, etc.).
>The wrappers are generated as Java classes that can be used as is 
>or integrated into higher-level applications.
>
>Version 1.03 provides some improved visual support to make the
>creation of wrappers easier and faster. In particular, the 
>extraction of HTML can be done via a wysiwyg interface.
>
>The W4F toolkit comes as a Java package and can be downloaded from 
>the W4F web site. It is free for non-commercial use.
>Various examples of running wrappers are also available for download
>from the web site.
>
>Web site:
>http://db.cis.upenn.edu/W4F
>
>Contacts:
>Arnaud Sahuguet
>Database Research Group, Univ. of Pennsylvania, PA, USA
>sahuguet@gradient.cis.upenn.edu
>http://www.cis.upenn.edu/~sahuguet
>
>Fabien Azavant
>École Nationale Supérieure des Télécommunications, Paris, France
>Fabien.Azavant@enst.fr
>http://www.stud.enst.fr/~azavant

==========================================================
John E. Simpson            | The secret of eternal youth
simpson@polaris.net        | is arrested development.
http://www.flixml.org      |  -- Alice Roosevelt Longworth



From rja at arpsolutions.demon.co.uk  Wed Mar 24 00:54:27 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:10:22 2004
Subject: Keeping space XML -> XSL -> HTML
Message-ID: <001201be7590$916c3200$4a5eedc1@arp01>

Hi.

If an element within an XML file is marked to preserve spaces, would one
expect the spaces to be lost during an XSL transformation to HTML ?

This seems to be the behaviour of IE5.

Any ideas how to maintain the spaces ?

Regards,

Richard.




From marcelo at mds.rmit.edu.au  Wed Mar 24 01:17:20 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:23 2004
Subject: XML complexity, namespaces (was WG)
In-Reply-To: <14071.31774.199352.713952@localhost.localdomain>; from David Megginson on Tue, Mar 23, 1999 at 06:53:00AM -0500
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F18209.8C68524@allette.com.au> <36F68B71.5967A029@w3.org> <36F6D46A.FB33D473@allette.com.au> <14070.56970.50161.169467@localhost.localdomain> <19990323153036.A9794@io.mds.rmit.edu.au> <14071.31774.199352.713952@localhost.localdomain>
Message-ID: <19990324121655.A29837@io.mds.rmit.edu.au>

On Tue, Mar 23, 1999 at 06:53:00AM -0500, David Megginson wrote:
> In his message, in a part that I'm not quoting (I do respond to
> specific details below), Marcelo Cantos argues that it's not for us
> to decide whether both full SGML and XML can co-exist, and I agree
> -- I am simply predicting that the market might not find it
> worthwhile to continue developing two standards that are
> architecturally identical and differ even in the implementation
> details only in nit-picky ways.

Well, I guess we are all entitled to prognosticate.  My own personal
view is that SGML is useful enough (over and above XML) in enough
serious systems that it will not go away in the foreseeable future
(which, admittedly, isn't that long in this industry).

> Choose an arbitrary number for the cost of continuing to develop two
> standards rather than one -- say, US$100M/year (if all of the big
> enterprise vendors have to develop, test, debug, document, support,
> and maintain both full SGML and XML versions of their software, as
> well as donate employees' time to committee work) and unaccountable
> additional hours of free time donated by OSS writers.

I personally doubt that the maintenance of two standards will have any
noticeable impact on implementors.  Our internal libraries are, for
the most part, built and work nicely with either format.  Furthermore,
the major implementation effort involves the commonality, not the
variability between the standards.

As for the perspective of the standards architect, I can't make any
real judgements on how much work is involved there.  I would, however,
speculate that standards are driven by demand more than by economics.

> Do SGML-specific features like SHORTREFs, data attributes, and
> omissible tags sometimes make life simpler for implementors?  Of
> course they do.
> 
> Are the differences worth US$100M/year (or whatever number you
> pick)?  I don't know, and the decision is not ours to make, but the
> market will figure it out soon enough.  Whatever happens, there will
> certainly be money to be made from supporting the existing SGML
> installations, so there will be good justification for
> backwards-compatibility in some major tools.

And we still encounter new clients with new projects that are opting
for SGML because XML doesn't satisfy their needs.  The usual reason is
having to deal with legacy data.  But then one must ask how soon do
you think legacy data will go away?

I should point out, however, that I am not arguing that SGML will
continue to dominate the market.  I believe that XML will increase
dramatically in use and will ultimately become the dominant player by
a wide margin.  What I disagree with is the notion that SGML has no
future role to play and will not be supported.

> Now, on to the specific points...
> 
> Marcelo Cantos writes:
> 
>  > > XML does nothing that SGML cannot do.
>  > 
>  > When developing the TOC management system for our document
>  > fragmenting toolkit, we chose XML to represent the TOC.  SGML was
>  > not an option, because we didn't know the content model in
>  > advance and couldn't build it automatically from the DTD's of the
>  > individual documents.
>  >
>  > Also, we couldn't use a homogeneous element tree with attributes,
>  > because we actually extracted structured content from the
>  > documents for insertion into the TOC (sure, we could have
>  > serialised the content into an SGML attribute, but that would
>  > have been a perverse and painful alternative to simply using
>  > XML).
> 
> There are work-arounds that you could have used in SGML, such as
> synthesised DTDs using ANY.  Both SGML and XML *can* do this, but in
> your case, XML makes it a little easier (as would WebSGML).  The
> differences are important to us, as SGML/XML implementors, but would
> not really concern the architect of a large system except to the
> point that they affected maintainability.

But would you then seriously suggest that maintenance is not a
significant component of a project's cost?

Of course SGML can do it, but the question boils down to whether it's
worth it.  We, as implementors, consider it far more cost effective to
maintain two standards (the cost is really quite minimal, IMHO) than
to insist on one or the other.

To say that SGML does everything XML does is ignoring the fact that
implementation details really do matter.  It is like saying that a
spreadsheet can do everything a word processor can.  Of course it can,
but that's not the point.

In any event, since the issue is whether XML will replace SGML, not
vice-versa, the "XML does nothing that SGML cannot do" comment is a
bit of a red herring.  The latter statement is far more pertinent.

>  > > SGML does nothing that XML cannot do.
>  > 
>  > On several occasions I have had to import textual information,
>  > and have been able to treat the data as SGML with appropriate choice
>  > of shortrefs.
>  > 
>  > With XML I would have been forced to write an intermediate
>  > translation layer and would have consequently lost the originals
>  > (or been forced to store the original and transformed document,
>  > or add the extra layer to every access).
>  >
>  > True, they are not always adequate for the job, but I certainly
>  > would not have happily forgone them in my project because they
>  > wouldn't have been useful in someone else's project!
> 
> Or you could simply have defined a round-trip mapping --
> tab-delimited fields map to <item> elements map back to
> tab-delimited fields.  You could also, with XML or SGML, point into
> the original without altering it (HyTime provides good mechanisms
> for doing that in SGML or XML).

So what you are saying, effectively, is, why not add an extra layer,
and use it on every access?  I guess the simple answer is, I'd rather
not.

You are suggesting complicated solutions to something that was
inherently simple to solve!  Sure, we could have done all those
things, and it would have dramatically increased the workload.  We
would have had to add that extra layer, or bring in additional
technologies.
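
As a rough illustration of the round-trip mapping suggested in the
quoted text, the Python sketch below maps tab-delimited fields to
<item> elements and back.  It is only a sketch under assumptions: the
<table> and <row> wrapper names and both function names are invented
here, and it does not handle tabs or newlines inside a field.  It also
makes the cost concrete, since this is the extra layer that every
access would have to pass through.

```python
import xml.etree.ElementTree as ET


def tsv_to_xml(text):
    """Map each tab-delimited line to a <row> of <item> elements.

    <table> and <row> are hypothetical wrapper names; only <item>
    comes from the discussion above.
    """
    root = ET.Element("table")
    for line in text.splitlines():
        row = ET.SubElement(root, "row")
        for field in line.split("\t"):
            ET.SubElement(row, "item").text = field
    return root


def xml_to_tsv(root):
    """Inverse mapping: <row>/<item> elements back to tab-delimited lines."""
    lines = []
    for row in root.findall("row"):
        # An empty field may come back as None after reparsing.
        lines.append("\t".join(item.text or "" for item in row.findall("item")))
    return "\n".join(lines)


original = "id\ttitle\tnote\n1\tIntroduction\t\n2\tScope\tdraft"
tree = tsv_to_xml(original)
# Serialise and reparse to show the mapping survives a trip through XML text.
reparsed = ET.fromstring(ET.tostring(tree))
restored = xml_to_tsv(reparsed)
```

The round trip is exact only for input that uses "\n" line endings
and has no trailing newline; anything else needs further conventions,
which is more of the same extra work.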

Even from an abstract perspective, the solutions you are offering
cannot, by any stretch of the imagination, be considered to fall under
the "SGML does nothing that XML cannot do" premise.  In reality they
involve drawing on additional tools and technologies to make up a very
real shortfall in XML's capabilities.  This merely emphasises the fact
that SGML and XML are _not_ the same thing.

> Again, however, Marcelo is writing about implementation details,
> not about what SGML and XML are capable of representing in the
> abstract.  In this case, SGML makes life a little easier for a
> *very* experienced designer under highly specialised circumstances.

Actually, the tab-delimited stuff was one of the first problems I
encountered when starting to use SGML.  But such an answer would be
something of a diversion.  The real point is that SGML _is_ for
experienced designers under highly specialised circumstances.  If they
aren't working under such circumstances, then by all means use XML
(which is what most of our clients are, in fact, doing)!

> Lexically, SGML and XML differ in minor ways; logically, they are
> essentially identical.

And I reiterate, implementation matters.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From ricko at allette.com.au  Wed Mar 24 01:23:20 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:23 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
Message-ID: <001c01be7595$36a14c70$64f96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>
>You're modelling exactly the same information about the picture in
>both -- data attributes provide an alternative mechanism for modelling
>the information, but they do not allow you to represent anything that
>you could not represent without them.

Except that the SGML example gives the attributes as belonging to the
thing pointed to (the entity) and not the particular invocation.

In the XML version, there is nothing to say that the attributes belong
to the thing pointed to rather than at the invocation. For example, take
the common case of where the entity has a size (natural size) and the
element also has size attributes (scaled size).

Surely the equivalent XML to the SGML examples given is really:

    <doc>
     <photodef id="p1" href="pic1.png" notation="text/png" width="300"
    height="200"
    depth="16"
    type="I point to some object called an NDATA entity"
    content-model="I must be empty"
    addressing="don't count me as an element when doing treeloc" />
     <photo href="#p1" />
    </doc>

(and perhaps the photo element should not be a simple link)

The information modelled includes not only the elements and
attributes but also the structure, and the fact that an entity is
labelled as an entity, which has different addressing rules. In the
absence of XML conventions for the last three attributes, I don't
think one can say that one can model everything that SGML models using
XML.
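
To make the natural-size versus scaled-size distinction concrete, here
is a small Python sketch that resolves a <photo> link to its
<photodef> and keeps the two sets of size attributes apart.  The
width and height attributes on <photo> are an assumption added for
illustration, as is the function name.  Note that the code, not XML
itself, is what decides that <photodef> plays the role of the entity,
which is the point being made above.

```python
import xml.etree.ElementTree as ET

# The width/height on <photo> are hypothetical "scaled size" attributes;
# the <photodef> attributes follow the example in the message.
DOC = """<doc>
 <photodef id="p1" href="pic1.png" notation="text/png"
           width="300" height="200" depth="16"/>
 <photo href="#p1" width="150" height="100"/>
</doc>"""


def resolve_photo(doc, photo):
    """Return (natural size, scaled size) for one <photo> invocation."""
    target_id = photo.get("href").lstrip("#")
    for definition in doc.iter("photodef"):
        if definition.get("id") == target_id:
            # Attributes of the thing pointed to (the "entity").
            natural = {k: definition.get(k) for k in ("width", "height")}
            # Attributes of this particular invocation.
            scaled = {k: photo.get(k) for k in ("width", "height")}
            return natural, scaled
    raise LookupError("no photodef with id " + target_id)


photo_doc = ET.fromstring(DOC)
natural, scaled = resolve_photo(photo_doc, photo_doc.find("photo"))
```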

Rick Jelliffe




From murata at apsdc.ksp.fujixerox.co.jp  Wed Mar 24 02:04:26 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:10:23 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F62D81.A623C0A2@w3.org>
Message-ID: <199903240153.AA00022@archlute.apsdc.ksp.fujixerox.co.jp>

XML requires a draconian approach.  100% interoperability for conformant 
implementations is most important.  Very low interoperability 
for non-conformant implementations is acceptable.

As for HTML, users see corrupted documents when the browser chooses 
an incorrect encoding.  Then, they can tell the correct encoding to 
the browser.  Thus, it might not be a bad idea to provide 80% interoperability 
for conformant implementations and 50% interoperability for non-conformant 
implementations.  The heuristics in HTML 4.0 are based on such an assumption,
as I see it.

As for XML, recipients of XML might be programs or database systems.  
In the worst case, corrupted documents will contaminate the entire 
database.  A single XML document on the WWW may destroy XML-aware 
search engines.  Hence, I believe that we need a draconian approach; we 
have to ensure 100% interoperability for conformant implementations.  
Ideally, it should be possible to point out non-conformant data and 
implementations.  expat sometimes detects incorrect charsets.

HTTP/1.1 quite clearly says that the charset parameter is authoritative.  
If RFC 2376 had said something different, interoperability for 
conformant implementations would have been destroyed.

Chris Lilley wrote:
> 
> 
> MURATA Makoto wrote:
> > 
> > I believe that IE 5.0 does not conform to RFC2376 (XML Media Types),
> > of which I am a co-author.
> > 
> > As for the XML media type "text/xml", the charset parameter in the
> > MIME header is authoritative.  Encoding declarations have to be ignored
> > so that transcoding is possible.
> 
> So, if the file is saved to some local browser cache and then re-read,
> it may have no MIME header so the encoding declaration is then
> authoritative.

The same thing applies to HTML.  The cache must have MIME headers as well.

> Why can't the transcoding proxy also rewrite the encoding declaration,
> since it is rewriting the file anyway? It is trivially easy to find,
> process, and change.

For security reasons, transcoding proxies should not rewrite documents.
Moreover, if we mandate embedded encoding signatures for HTML, XML, CSS, 
etc., I18N of flat text will become impossible.  

I had believed that there was a consensus in the W3C team, and I am quite
puzzled by your response.  You might want to speak with Martin Duerst.

> I imagine that someone could take some generic charset-converting code
> and make an XML-aware transcoding servlet that rewrote the encoding
> declaration in about what, an hour? If someone does this, I will see
> about getting it included in the next Jigsaw version.

Please don't do that.
 
> > However, IE 5.0 appears to always ignore the charset parameter and use
> > the BOM or encoding declaration only.  Therefore, IE 5.0 does not conform to
> > RFC 2376.
> 
> Okay. But does RFC 2376 conflict with the XML 1.0 Recommendation?

As John Cowan pointed out, it does not.

> > When the charset parameter is not specified, it is assumed to be US-ASCII.
> 
> Wow. So, what this RFC says is that, when used in email and on HTTP, the
> encoding declaration is *always ignored*.

If the media type is text/xml, yes.  As for application/xml, we use 
the procedure in Appendix F of XML 1.0.
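
The rules as described in this thread can be sketched in Python as
follows.  This is a simplified illustration, not a conformant
implementation: the function name is invented, the Appendix F step is
reduced to a UTF-16 byte order mark check plus a look at the encoding
declaration, and letting an explicit charset parameter win for
application/xml as well is an assumption about RFC 2376 rather than
something stated above.

```python
import re
from email.message import Message


def effective_encoding(content_type, body):
    """Choose the encoding of an XML entity from its MIME header and bytes."""
    header = Message()
    header["Content-Type"] = content_type
    media_type = header.get_content_type()
    charset = header.get_param("charset")

    if charset:
        # The charset parameter is authoritative; the encoding
        # declaration inside the document is not consulted.
        return str(charset).lower()
    if media_type == "text/xml":
        # No charset on text/xml: US-ASCII, whatever the document says.
        return "us-ascii"

    # application/xml without charset: a much reduced stand-in for the
    # autodetection procedure of XML 1.0 Appendix F.
    if body[:2] in (b"\xfe\xff", b"\xff\xfe"):
        return "utf-16"
    match = re.match(rb'<\?xml[^>]*encoding=["\']([A-Za-z0-9._-]+)["\']', body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"


big5_declared = b'<?xml version="1.0" encoding="Big5"?><a/>'
as_text_xml = effective_encoding("text/xml", big5_declared)
as_application_xml = effective_encoding("application/xml", big5_declared)
```

With the same bytes, the text/xml case ignores the declared Big5 while
the application/xml case honours it, which is the difference under
discussion.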
 
> That is a pretty big change and, frankly IMHO, ill-advised.

Frankly, I am quite surprised that a W3C team member says such a thing 
in a public place after an RFC is published.  

Chris Lilley wrote:
> 
> Correction: if you are the *administrator* of an Apache server. One of
> the ways in which the Web has changed over the last 5 years is that the
> percentage of Web authors who also administer the site that they serve
> from has dropped from a substantial majority to an insignificant
> minority.

Are you aware of the "AddCharset" patch developed by W3C Keio?  It
allows casual users to configure Apache.  Please contact Koga-san at
W3C Keio (y-koga@ccs.mt.nec.co.jp).

Chris Lilley wrote:
> Please consider points 1 and 2 to be a defect report on RFC2376

These points are clearly in conflict with HTTP 1.1.

Cheers,


Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp



From murata at apsdc.ksp.fujixerox.co.jp  Wed Mar 24 02:28:30 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:10:23 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F74B26.21CF46EC@w3.org>
Message-ID: <199903240217.AA00023@archlute.apsdc.ksp.fujixerox.co.jp>

Chris Lilley wrote:
> > Unfortunately this is a side effect of the rules for the media type
> > "text/*", which says that the default value of "charset" is always US-ASCII.
> 
> The default rules if no other rule is in place for a specific Media
> type. The registration for text/xml can override this behaviour if it
> wishes to.

HTTP/1.1 (RFC2068 and the latest "Draft Standard") quite clearly says:

   The "charset" parameter is used with some media types to define the
   character set (section 3.4) of the data. When no explicit charset
   parameter is provided by the sender, media subtypes of the "text"
   type are defined to have a default charset value of "ISO-8859-1" when
   received via HTTP. Data in character sets other than "ISO-8859-1" or
   its subsets MUST be labeled with an appropriate charset value.

Here, the default is 8859-1 ;-(

The latest I-D for RFC2376 also said that the default is 8859-1 when the XML
document is being transmitted by HTTP.  However, the IESG requested US-ASCII
as the default.

> > IESG discussed the document today that defines the text/xml media type.
> > We note that it continues the practice of text/plain where the default
> > charset is iso-8859-1 if transported over HTTP, but us-ascii if
> > transported over SMTP.
> >
> > This inconsistency was a result of a wide deployment of HTTP
> > implementations that did not properly follow the MIME spec.
> > Having one media type which is used inconsistently between HTTP
> > and SMTP is bad enough, but we don't want to continue this practice
> > for new media types.  Inconsistencies between HTTP and SMTP
> > usage make it more difficult to gateway between HTTP and email,
> > or to use HTTP to access email contents.
> >
> > We suggest to have the charset parameter default to US-ASCII regardless
> > of transport, and strongly recommend that the parameter always be
> > supplied by senders.  (If the sender is unsure whether the charset
> > is US-ASCII or ISO-8859-1, it can safely label it as ISO-8859-1,
> > since the former is a subset of the latter).


Chris Lilley wrote:
> So, in consequence: example files such as the Chinese XML examples at
> http://xml.ascc.net/xml/test/index.html (where each example is available
> in UTF-8, Big5 and GB2312, all correctly labelled in the XML encoding
> declaration) are now sets of invalid XML files which are required to
> produce a critical error because of the invalid byte sequences in what
> is now described as a US-ASCII file?

Yes.  Conformant XML parsers must report a fatal error.  This is great since 
non-conformant data can always be detected.
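
A minimal Python sketch of why such data is always detectable: strict
decoding under the labelled charset fails on the first byte that does
not belong to it.  The function name and the sample document are
assumptions made for illustration; a real XML processor would report
the problem through its own fatal-error mechanism.

```python
def decode_entity(body, charset="us-ascii"):
    """Decode strictly; any byte outside the labelled charset is fatal."""
    try:
        return body.decode(charset, errors="strict")
    except UnicodeDecodeError as exc:
        raise ValueError(
            "fatal error: byte at offset %d is not valid %s" % (exc.start, charset)
        ) from exc


# A small document containing two Chinese characters, stored as Big5.
big5_doc = '<?xml version="1.0" encoding="Big5"?><t>\u4e2d\u6587</t>'.encode("big5")

decoded_ok = decode_entity(big5_doc, "big5")   # labelled correctly: succeeds

try:
    decode_entity(big5_doc)                    # treated as US-ASCII: fails
    detected = False
except ValueError:
    detected = True
```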

Examples of conformant XML documents are available at: 
http://www.fxis.co.jp/DMS/sgml/xml/charset/

Cheers,

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp



From gtn at eps.inso.com  Wed Mar 24 02:44:24 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:23 2004
Subject: DOM Implemetation in C?
In-Reply-To: <004801be739f$4b5a5c80$17f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <000101be759e$f1c25540$0100007f@eps.inso.com>

> There is a technical problem that CORBA IDL mappings do not
> (as far as I can see) provide C mappings to let us know how to create
> objects, but it seems that DOM (or, at least, DOM users) require
> object creation and finalization.

The DOM includes factory methods...
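
For illustration only, here is how the factory methods look in one
existing binding, Python's xml.dom.minidom.  This is not a C mapping
and says nothing about how one should look; the element names are
made up.  Creation goes through DOMImplementation.createDocument and
the create* methods on Document, while finalization is left to the
binding (minidom adds its own unlink() for that).

```python
from xml.dom.minidom import getDOMImplementation

impl = getDOMImplementation()

# DOMImplementation.createDocument is the factory for the Document itself.
factory_doc = impl.createDocument(None, "doc", None)

# Every other node is created through factory methods on the Document.
item = factory_doc.createElement("item")
item.appendChild(factory_doc.createTextNode("hello"))
factory_doc.documentElement.appendChild(item)

serialized = factory_doc.documentElement.toxml()

# unlink() is minidom's explicit clean-up hook, not part of the W3C DOM.
factory_doc.unlink()
```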




From gtn at eps.inso.com  Wed Mar 24 02:44:29 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:23 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <005101be73a6$325a59e0$c8a8a8c0@thing1>
Message-ID: <000201be759e$f318bd80$0100007f@eps.inso.com>

> >Do we really need to know about CDATA sections 
> 
> Debatable perhaps, but supported by the DOM. (Anyone know why?)
> But I'd really like to see better SAX/DOM integration, so Yes!

CDATA sections *are* different from normal text, even if only 
because the author used them. Note the interface inheritance in
the DOM that tries to hide the distinction for those that need 
not see it.
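
A short sketch of that inheritance, using Python's xml.dom.minidom as
one example binding (the helper names are invented for illustration).
Code that only cares about character data can test for Text and never
notice CDATA sections, while code that does care can check the node
type.

```python
from xml.dom.minidom import CDATASection, Text, getDOMImplementation

cdata_doc = getDOMImplementation().createDocument(None, "p", None)
para = cdata_doc.documentElement
para.appendChild(cdata_doc.createTextNode("if "))
para.appendChild(cdata_doc.createCDATASection("a < b"))


def text_of(element):
    """Concatenate character data, treating CDATA like any other text."""
    return "".join(n.data for n in element.childNodes if isinstance(n, Text))


def cdata_count(element):
    """Count child nodes that the author wrote as CDATA sections."""
    return sum(
        1 for n in element.childNodes if n.nodeType == n.CDATA_SECTION_NODE
    )


combined = text_of(para)
sections = cdata_count(para)
```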




From crey at dcd.abk.nec.co.jp  Wed Mar 24 02:51:44 1999
From: crey at dcd.abk.nec.co.jp (Charlemagne L. Rey)
Date: Mon Jun  7 17:10:23 2004
Subject: null document value
Message-ID: <36F852B4.770BA4D4@dcd.abk.nec.co.jp>

I have a small problem which bothers me a lot.
I'm trying to parse an XML file and access its
objects by tag name through a DOMParser, but the
parser yields a null value for the Document.

So far I have no idea why.  I have attached the code
as well as the XML and DTD files so that you can help
me find out.

--
<signature>
  <name>Charlemagne L. Rey</name>
  <voice>
    <office local="34312">+81-0471-856713</office>
    <home>+81-0471-838227</home>
  </voice>
  <e-mail>crey@software.ntep.nec.co.jp</e-mail>
  <company>NEC Corporation</company>
</signature>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: XMLDOMParser.java
Type: application/x-unknown-content-type-java_auto_file
Size: 1441 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990324/753eaab5/XMLDOMParser.bin
-------------- next part --------------
<?xml version="1.0" ?>
<!DOCTYPE borrower SYSTEM "Borrower.dtd">
 <borrower>
  <entry>
   <logname>babeth</logname>
   <name>BABETH ABREA</name>
   <email>babeth@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>archie</logname>
   <name>ARCHIE YAP</name>
   <email>archie@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>abijay</logname>
   <name>FRED ABIJAY</name>
   <email>abijay@software</email>
   <note>null</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>nolan</logname>
   <name>NOLAN BATHAN</name>
   <email>nolan@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ana</logname>
   <name>ANACORINA M. CAVITE</name>
   <email>ana@software</email>
   <note>SW Design Clerk</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>glennj</logname>
   <name>GLENN OGAPONG</name>
   <email>glennj@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jbarba</logname>
   <name>JORGE BARBA</name>
   <email>jbarba@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>din</logname>
   <name>JAMES DIN</name>
   <email>din@hardware.ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>petchie</logname>
   <name>PETCHIE ABADINAS</name>
   <email>abadinas@ntep.nec.co.jp</email>
   <note>Ang guwapang tisay.</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>msdiez</logname>
   <name>MADONNA SALOME DIEZ</name>
   <email>msdiez@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>fecerico</logname>
   <name>ELLAMAE CERICO</name>
   <email>fecerico@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rachel</logname>
   <name>RACHEL AGNES ALFAFARA</name>
   <email>rachel@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>felix</logname>
   <name>FELIX CANTAY</name>
   <email>felix@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mike</logname>
   <name>MICHAEL CO MANABAT</name>
   <email>manabat@software</email>
   <note>The Dojo Master</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>arvin</logname>
   <name>ARVIN SAGARINO</name>
   <email>arvin@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jdala</logname>
   <name>JOEY DALA</name>
   <email>jdala@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>esolis</logname>
   <name>EDUARDO SOLIS</name>
   <email>esolis@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ariadne</logname>
   <name>ALYWIN POSTRERO</name>
   <email>ariadne@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>bautista</logname>
   <name>NEIL BAUTISTA</name>
   <email>bautista@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>nelanie</logname>
   <name>NELANIE LUTH NIMIS</name>
   <email>nelanie@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jleones</logname>
   <name>JEREMY LEONES</name>
   <email>jleones@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jlauta</logname>
   <name>JENNIFER LAUTA</name>
   <email>jlauta@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>epalen</logname>
   <name>EMILIO PALEN</name>
   <email>epalen@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>alvin</logname>
   <name>ALVIN FERNANDEZ</name>
   <email>alvin@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>noel</logname>
   <name>NOEL CHING ALLOSA</name>
   <email>noel@software.ntep.nec.co.jp</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ronnelm</logname>
   <name>RONNEL MAGLASANG</name>
   <email>ronnelm@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>baumelt</logname>
   <name>BAUMEL TANDOGON</name>
   <email>baumelt@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ali</logname>
   <name>AZALEAH TIANERO</name>
   <email>ali@software</email>
   <note>null</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>giovanni</logname>
   <name>GIOVANNI BAUTISTA</name>
   <email>giovanni@ntep.nec.co.jp</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>emeree</logname>
   <name>EMEREE CLAIRE SANCHEZ</name>
   <email>emeree@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ajuan</logname>
   <name>JUAN F. ABULENCIA, JR.</name>
   <email>ajuan@software</email>
   <note>gwapo, uyab ni Cecille</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rubie</logname>
   <name>RUBIE LIM</name>
   <email>rubie@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>joseph</logname>
   <name>JOSEPH ONG</name>
   <email>joseph@ntep.nec.co.jp</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>donna</logname>
   <name>DONNA MARIE FRADES</name>
   <email>donna@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>neil</logname>
   <name>NEIL AGUILAR</name>
   <email>neil@ntep.nec.co.jp</email>
   <note>Production Engineering Dep't.</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>vrocales</logname>
   <name>VICTOR ROCALES</name>
   <email>vrocales@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>nlongjas</logname>
   <name>NOEMI LONGJAS</name>
   <email>nlongjas@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>josephus</logname>
   <name>JOSEPHUS PESIRLA</name>
   <email>josephus@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jlumapas</logname>
   <name>JASON LUMAPAS</name>
   <email>jlumapas@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>alegaspi</logname>
   <name>ANGELITO LEGASPI</name>
   <email>alegaspi@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>chingie</logname>
   <name>CHINGIE TANCAWAN</name>
   <email>chingie@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>lotlot</logname>
   <name>LUTHGARDA PAYLADO</name>
   <email>lotlot@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>gladys</logname>
   <name>GLADYS ZALDIVAR</name>
   <email>gladys@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>andrew</logname>
   <name>ANDREW LACAYA</name>
   <email>andrew@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>fred</logname>
   <name>FRED KINTANAR</name>
   <email>fred@software</email>
   <note>Assistant Manager</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>gbaguia</logname>
   <name>GLICERIO BAGUIA</name>
   <email>gbaguia@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>may</logname>
   <name>MARIA GERMAINE GERMAN</name>
   <email>may@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>oflas</logname>
   <name>FELIPE JUN OFLAS JR.</name>
   <email>oflas@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>dodgie</logname>
   <name>DODGIE DANOSOS</name>
   <email>dans@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jopee</logname>
   <name>JOPEE CAMIGUE</name>
   <email>jopee@ntep.nec.co.jp</email>
   <note>Hardware Design Clerk</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>guday</logname>
   <name>NOEL GUDAY</name>
   <email>guday@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>dondon</logname>
   <name>DONDON MATARANAS</name>
   <email>dondon@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>tina</logname>
   <name>CRISTINA D. FAELDONEA</name>
   <email>cristy@ntep.nec.co.jp</email>
   <note>Hardware Design Clerk</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>inad</logname>
   <name>IRWIN ENAD</name>
   <email>inad@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mar</logname>
   <name>MAR ARES</name>
   <email>mar@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ars</logname>
   <name>RAYMUND ARCILLA</name>
   <email>ars@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>nelson</logname>
   <name>NELSON BRIONES</name>
   <email>nelson@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>reneb</logname>
   <name>RENE BAJARIAS</name>
   <email>reneb@ntep.nec.co.jp</email>
   <note>EDP</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mojal</logname>
   <name>LEOPOLDO MOJAL</name>
   <email>mojal@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>orson</logname>
   <name>ORSON YU</name>
   <email>orson@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>allan</logname>
   <name>ALLAN FABIANA</name>
   <email>alanmf@ntep.nec.co.jp</email>
   <note>Production Control</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>romy</logname>
   <name>ROMEO HEYRANA</name>
   <email>romy@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>maluenda</logname>
   <name>ANTONIO MALUENDA</name>
   <email>maluenda@ntep.nec.co.jp</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>john</logname>
   <name>JOHN DEXTER OMOLON</name>
   <email>john@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>robert</logname>
   <name>ROBERT DALE MONTESCLAROS</name>
   <email>robert@software</email>
   <note>Intake date: 11/11/96</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>imaceda</logname>
   <name>IVAN MACEDA</name>
   <email>imaceda@software</email>
   <note>Intake date: 11/12/96</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>pierre</logname>
   <name>PIERRE ENRIQUEZ</name>
   <email>pierre@software</email>
   <note>Wala ko'y sure sa iyang e-mail</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jeff</logname>
   <name>JEFFREY ALBARRACIN</name>
   <email>jeff@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>salazar</logname>
   <name>RICHARD SALAZAR</name>
   <email>salazar@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jerry</logname>
   <name>JERMANDO RODRIGUEZ</name>
   <email>jerry@software</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>nonon</logname>
   <name>LEANDRO FAELDONEA</name>
   <email>nonon@ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>markh</logname>
   <name>MARK AGUILAR</name>
   <email>markh@hardware.ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rsardon</logname>
   <name>RINO SARDON</name>
   <email>rsardon@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>chris</logname>
   <name>CHRISTOPHER IDO</name>
   <email>chris@ntep.nec.co.jp</email>
   <note>Production Control Dep't.</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>edward</logname>
   <name>EDWARD</name>
   <email>edward@ntep.nec.co.jp</email>
   <note>Manufacturing Office</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>bobby</logname>
   <name>ROMEO LONOY</name>
   <email>bobby@ntep.nec.co.jp</email>
   <note>Hardware Design</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jlim</logname>
   <name>JOEL LIM</name>
   <email>jlim@software</email>
   <note>Intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>idongon</logname>
   <name>IKE DONGON</name>
   <email>idongon@software</email>
   <note>Intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jgumaroy</logname>
   <name>JONATHAN GUMAROY</name>
   <email>jgumaroy@software</email>
   <note>intake date:  April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>etaladua</logname>
   <name>EARL WILLIAM KHO TALADUA</name>
   <email>etaladua@software</email>
   <note>intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>troldan</logname>
   <name>ROLDAN TORIBIO</name>
   <email>troldan@software</email>
   <note>intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>vjangus</logname>
   <name>VICTOR JESUS ANGUS</name>
   <email>vjangus@software</email>
   <note>intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>johnd</logname>
   <name>JOHN JUN DORMITORIO</name>
   <email>johnd@software</email>
   <note>intake date:  April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>cabella</logname>
   <name>CHRISTIAN ABELLA</name>
   <email>cabella@software</email>
   <note>intake date:  April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>henrison</logname>
   <name>HENRISON SIA</name>
   <email>henrison@software</email>
   <note>intake date:  April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>sdeanna</logname>
   <name>DEANNA SALMERON</name>
   <email>sdeanna@software</email>
   <note>intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>njclark</logname>
   <name>JOHN CLARK NALDOZA</name>
   <email>njclark@software</email>
   <note>intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ericp</logname>
   <name>ERIC PIZON</name>
   <email>ericp@software.ntep.nec.co.jp</email>
   <note>intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mmedina</logname>
   <name>MICHELLE MEDINA</name>
   <email>mmedina@software</email>
   <note>Intake date:  April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>josephb</logname>
   <name>JOSEPH BENAVIDES</name>
   <email>josephb@software</email>
   <note>Intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jecal</logname>
   <name>JOSE ELIAS CALDERON</name>
   <email>jecal@software</email>
   <note>Intake date: April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rvchua</logname>
   <name>R. VICTORIA CHUA</name>
   <email>rvchua@ntep.nec.co.jp</email>
   <note>QC Engineer</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>crey</logname>
   <name>CHARLEMAGNE REY</name>
   <email>crey@software</email>
   <note>Intake date:  April 7, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>gboston</logname>
   <name>GERSON BOSTON</name>
   <email>gboston@ntep.nec.co.jp</email>
   <note>Hardware Design Engineer</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>benl</logname>
   <name>BENJAMIN LAHOY</name>
   <email>benl@ntep.nec.co.jp</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ivy</logname>
   <name>IVY PASCUA</name>
   <email>ipascua@ntep.nec.co.jp</email>
   <note>null</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>russel</logname>
   <name>RUSSEL OBERIO</name>
   <email>roberio@software</email>
   <note>Intake date: November 4, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>acreer</logname>
   <name>ALLAN CREER</name>
   <email>acreer@software</email>
   <note>Intake date: Nov. 4, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ken</logname>
   <name>KEN ERICK SARMAGO</name>
   <email>ken@software</email>
   <note>Intake date: Nov. 4, 1997</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>tanj</logname>
   <name>JERALDINE TAN</name>
   <email>tanj@software.ntep.nec.co.jp</email>
   <note>Intake date:  11/11/97</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>garry</logname>
   <name>GARRY M. (PC)</name>
   <email>na</email>
   <note>PC</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>beths</logname>
   <name>MARIBETH SUICO</name>
   <email>beths@ntep.nec.co.jp</email>
   <note>Finance Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>puppy</logname>
   <name>EDMUND MOLINA</name>
   <email>emolina@ntep.nec.co.jp</email>
   <note>PE Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>vincent</logname>
   <name>VINCENT LADLAD</name>
   <email>vincentl@software</email>
   <note>Intake date: May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jun</logname>
   <name>LEONARDO ARTIAGA JR.</name>
   <email>leonardo@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>grace</logname>
   <name>GRACE CAGULANGAN</name>
   <email>gracec@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jay</logname>
   <name>JAY CAMINERO</name>
   <email>jayc@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rcamus</logname>
   <name>RIZZA CAMUS</name>
   <email>rcamus@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>cvanessa</logname>
   <name>VANESSA CASTILLO</name>
   <email>cvanessa@software</email>
   <note>Intake date: May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>ccayacap</logname>
   <name>CELESTE CAYACAP</name>
   <email>ccayacap@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mjdultra</logname>
   <name>MARK ELJUNNE DULTRA</name>
   <email>mjdultra@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>fendaya</logname>
   <name>FRANKLIN ENDAYA</name>
   <email>fendaya@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>nickson</logname>
   <name>NICKSON IAN LEGASPI</name>
   <email>nicksonl@software</email>
   <note>Intake date: May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>william</logname>
   <name>WILLIAM GEORGE GO</name>
   <email>william@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>adrianr</logname>
   <name>ADRIAN P. RESTAURO</name>
   <email>adrianr@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>charles</logname>
   <name>CHARLES TORREJOS</name>
   <email>tcharles@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jlee</logname>
   <name>JENNYLEE UY</name>
   <email>jlee@software.ntep.nec.co.jp</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>emmanuel</logname>
   <name>EMMANUEL VILLACERAN</name>
   <email>emmanv@software</email>
   <note>Intake date:  May 4, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>sherwin</logname>
   <name>SHERWIN GARCIA</name>
   <email>garcia@ntep.nec.co.jp</email>
   <note>MFG. Department</note>
   <mservice>N</mservice>
  </entry>
  <entry>
   <logname>georgec</logname>
   <name>GEORGE L. CORDERO</name>
   <email>georgec@software</email>
   <note>June 3, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jubalde</logname>
   <name>JERICO UBALDE</name>
   <email>jubalde@mailman.ntep.nec.co.jp</email>
   <note>Hardware Design Department</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>fretsie</logname>
   <name>FRETSIE SALAZAR</name>
   <email>fsalazar@ntep.nec.co.jp</email>
   <note>Production Engineering Dep't.</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>reina</logname>
   <name>HEIDI RAMAS</name>
   <email>reina@ntep.nec.co.jp</email>
   <note>EDP</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>caudie</logname>
   <name>CAUDI DISCIPULO</name>
   <email>caudie@ntep.nec.co.jp</email>
   <note>EDP Manager</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jmocol</logname>
   <name>JOSEPH MARIE OCOL</name>
   <email>jmocol@ntps10</email>
   <note>Intake date: October 1, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>llim</logname>
   <name>LEO C. LIM</name>
   <email>llim@ntps10</email>
   <note>Intake date:  October 1, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jjambata</logname>
   <name>JAY JOHN AMBATA</name>
   <email>jjambata@ntps10</email>
   <note>Intake date: November 6, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>daparici</logname>
   <name>DEXTER APARICIO</name>
   <email>daparici@ntps10</email>
   <note>Intake date: November 6, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mguibone</logname>
   <name>MIRIAM GUIBONE</name>
   <email>mguibone@ntps10</email>
   <note>Intake date: November 06, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>jpllosa</logname>
   <name>JOEL PATRICK LLOSA</name>
   <email>jpllosa@ntps10</email>
   <note>Intake date: November 6, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>lamartin</logname>
   <name>LILY ANN MARTIN</name>
   <email>lamartin@ntps0504</email>
   <note>Intake date: November 6, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>mdmejora</logname>
   <name>MARK DAVE MEJORADA</name>
   <email>mdmejora@ntps0504</email>
   <note>Intake date: November 6, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rroleda</logname>
   <name>RYAN ROLEDA</name>
   <email>rroleda@ntps10</email>
   <note>Intake date: November 6, 1998</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>christine</logname>
   <name>CHRISTINE PENA</name>
   <email>chetpena@hotmail.com</email>
   <note>null</note>
   <mservice>Y</mservice>
  </entry>
  <entry>
   <logname>rita</logname>
   <name>RITA C. DEBALUCOS</name>
   <email>rita@software.ntep.nec.co.jp</email>
   <note>Software Design Clerk</note>
   <mservice>N</mservice>
  </entry>
 </borrower>

-------------- next part --------------
<!ELEMENT borrower (entry)+>
<!ELEMENT entry (logname, name, email, note, mservice)>
<!ELEMENT logname   (#PCDATA)>
<!ELEMENT name      (#PCDATA)>
<!ELEMENT email     (#PCDATA)>
<!ELEMENT note      (#PCDATA)>
<!ELEMENT mservice  (#PCDATA)>
From cbullard at hiwaay.net  Wed Mar 24 04:08:06 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:10:24 2004
Subject: Validation
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU> <36F0CFF4.365B@hiwaay.net> <36F10CFC.CFEB89A8@goon.stg.brown.edu> <36F13992.150D05F9@w3.org> <36F15372.3FF36ABB@prescod.net> <19990321133612.D29582@io.mds.rmit.edu.au>
Message-ID: <36F86472.7D1D@hiwaay.net>

Marcelo Cantos wrote:
> 
> Of course, none of the above discourse will eliminate the need for
> discussion on what, exactly, is needed and how that need is to be
> satisfied.  As one colleague astutely pointed out to me, I am really
> transforming the issue from "real validation" to "sufficient
> validation".  It would be a mistake, however, to conclude that this is
> a trivial transformation in the statement of the problem.  It diverts
> the emphasis of the search markedly away from completeness and towards
> practicality and useability (of course, completeness remains
> desirable, it merely ceases to be a central goal).

Not in disagreement.  Still, DTDs play a role in expressing constraints 
that in some way, must be implementable and to some degree must be 
validated for a particular piece of content.  Here is a different 
kind of schema from the VRML language.  How would any/all of the 
DTDs/schemas proposed for XML be used to define this?  Which if 
any are better?

Transform {
  eventIn  MFNode addChildren
  eventIn  MFNode removeChildren
  exposedField SFVec3f  center 0 0 0
  exposedField MFNode children [ ]
  exposedField SFRotation rotation 0 0 1 0
  exposedField SFVec3f  scale 1 1 1 
  exposedField SFRotation scaleOrientation 0 0 1 0
  exposedField SFVec3f translation 0 0 0
  field SFVec3f bboxCenter 0 0 0
  field SFVec3f bboxSize -1 -1 -1

}

It isn't a trick question.  Serious people are currently evaluating 
the suitability of XML for this.  In this form, the declaration is 
quite compact.  Right now, without datatypes and without event models, 
this *simple* node may be more than XML can describe.
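
One way to make the question concrete is to push the declaration through a purely hypothetical XML encoding. The sketch below is Python; the element and attribute names (nodeType, type, name, default) are invented for illustration, and the field name "rotation" is the one the VRML97 Transform node uses. It only shows that the structure fits into elements and attributes. The types and default values stay opaque strings, which is the datatype gap described above, and the event model is not addressed at all.

```python
import xml.etree.ElementTree as ET

# The body of the Transform declaration, one interface item per line.
DECL = """\
eventIn  MFNode addChildren
eventIn  MFNode removeChildren
exposedField SFVec3f center 0 0 0
exposedField MFNode children [ ]
exposedField SFRotation rotation 0 0 1 0
exposedField SFVec3f scale 1 1 1
exposedField SFRotation scaleOrientation 0 0 1 0
exposedField SFVec3f translation 0 0 0
field SFVec3f bboxCenter 0 0 0
field SFVec3f bboxSize -1 -1 -1
"""

def declaration_to_xml(node_name, text):
    """Map each 'access type name [default...]' line to one element.

    The vocabulary (nodeType/type/name/default) is hypothetical.
    """
    root = ET.Element("nodeType", name=node_name)
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        access, vrml_type, field_name, *default = parts
        item = ET.SubElement(root, access, type=vrml_type, name=field_name)
        if default:
            # Defaults are kept as plain strings: nothing here says that
            # "0 0 1 0" is a rotation made of four floats.
            item.set("default", " ".join(default))
    return root

root = declaration_to_xml("Transform", DECL)
print(ET.tostring(root, encoding="unicode"))
```

A DTD could constrain which attributes appear on each element, but it could not say that an SFVec3f default must be three numbers.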

Any takers?

len bullard

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From cbullard at hiwaay.net  Wed Mar 24 04:29:28 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:10:24 2004
Subject: XML complexity, namespaces (was WG)
References: <002101be70ef$17ec9d70$3ff96d8c@NT.JELLIFFE.COM.AU>
		<36F0CFF4.365B@hiwaay.net>
		<36F10CFC.CFEB89A8@goon.stg.brown.edu>
		<36F13992.150D05F9@w3.org>
		<36F18209.8C68524@allette.com.au>
		<36F68B71.5967A029@w3.org>
		<36F6D46A.FB33D473@allette.com.au>
		<14070.56970.50161.169467@localhost.localdomain>
		<36F70005.3D1F76D2@allette.com.au> <14071.29413.900398.832442@localhost.localdomain>
Message-ID: <36F86974.5593@hiwaay.net>

David Megginson wrote:
> 
> As for standards bodies, I don't know.  Perhaps XML will eventually
> migrate to an International Standards body of some sort -- who knows if
> the W3C will even exist in five years? -- or (and this might be
> preferable) the torch will pass to a new, better-constituted body that
> takes over both the W3C and IETF standards.

I think also that trend is already in motion as evidenced by the 
formal working agreements between ISO and various consortia including 
the W3C and Web3D.  The ISO VRML97 standard started as a consortium 
standard which when mature enough and for which working implementations 
could be demonstrated reliably, was forwarded to ISO for international 
standardization.  That is a very healthy way to do this business.  
The W3C can stick closer to its charter of promoting technologies 
and specifications and spend less time on *standardization*.  This is
not to say the W3C work is not worthy, but the focus of standardization
often has legal tangles.  When engineers practice law, you get poor law.
When lawyers engineer, airplanes fall out of the sky.  It's a matter
of practice and focus.

The working agreements are like a wheel inside a wheel.  The inner 
wheels (the consortia) can turn fast.  The outer wheel (ISO) turns 
slower.  In concert, events are notated smoothly.

XML won't supplant SGML.  It won't have to. The same people I met 
building SGML are building XML.  The community matters.  The specs 
and standards are what we implement and agree on.  Nothing more. 

I miss Yuri.  He understood that.

len



From lucio.piccoli at one2one.co.uk  Wed Mar 24 08:34:11 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:10:24 2004
Subject: XML conference
Message-ID: <360209e3.240299@smtpgate1.ONE2ONE.CO.UK>


Hi all,
I am seeking info on the 'XML One' conference planned for May 24 in
Austin, TX. The web site is under construction, so details are very limited.
Does anyone on this list intend to do something interesting at the
conference?


adios

 -lucio

 ---------------------------------------------------------------------
 One2One              LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower      tel : +44 181 214 3847
 Elstree Way
 Borehamwood                 fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From lucio.piccoli at one2one.co.uk  Wed Mar 24 10:39:05 1999
From: lucio.piccoli at one2one.co.uk (LUCIO PICOLLI)
Date: Mon Jun  7 17:10:24 2004
Subject: XML conference
Message-ID: <36020bcc.240299@smtpgate1.ONE2ONE.CO.UK>


There seems to be confusion about my previous email. I'll attempt to   
rephrase my questions.

I am considering attending the 'XML One' conference planned for May 24 in
Austin, TX. I am searching for details so I can convince my manager to
pay for the conference fees. However the official conference web site is   
under construction. I guessed that most of the speakers would come from   
this interest group. So if anyone has info about the conference please   
let me know.

Thanks

 -lucio


> Hi all,
> I am seeking info on the 'XML One' conference planned for May 24 in
> Austin, TX. The web site is under construction, so details are
> very limited.
> Does anyone on this list intend to do something interesting at the
> conference?
>
>
> adios
>
> -lucio


From david at megginson.com  Wed Mar 24 13:59:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:24 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
In-Reply-To: <001c01be7595$36a14c70$64f96d8c@NT.JELLIFFE.COM.AU>
References: <001c01be7595$36a14c70$64f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14072.59101.373961.418429@localhost.localdomain>

Rick Jelliffe writes:

 > Surely the equivalent XML to the SGML examples given is really:
 > 
 >     <doc>
 >      <photodef id="p1" href="pic1.png" notation="text/png" width="300"
 >     height="200"
 >     depth="16"
 >     type="I point to some object called an NDATA entity"
 >     content-model="I must be empty"
 >     addressing="don't count me as an element when doing treeloc" />
 >      <photo href="#p1" />
 >     </doc>
 > 
 > (and perhaps the photo element should not be a simple link)
 >
 > The information modelled does not only include the elements and
 > attributes but also the structure, and the fact that an entity is
 > labelled as an entity, which have different addressing rules. In the
 > absence of XML having conventions for the last three attributes, I don't
 > think one can say that one can model everything that SGML models using
 > XML.

Rick, you're still pointing to implementation details rather than
abstract modelling.  Try to express the question in terms of the thing 
being modelled -- for example, at a project meeting, the system
architect might ask the following question:

  Can SGML and XML both model a reference to a photograph, providing
  the width, height, and colour depth?

The answer, of course, is 'yes'.  At this point, the system architect
drops out of the discussion and starts playing Tetris on her Palm
Pilot.

Next, someone asks whether there's a substantial difference in the
time and cost for implementation and maintenance.  Both SGML and XML
can declare the object in a single place as an external NDATA entity
or upon each reference as an HREF attribute, and both SGML and XML can
provide the type information explicitly in a single place through a
notation or upon each reference through a MIMETYPE attribute, or allow
the application to determine the type through the transfer protocol
(i.e. HTTP), file extension, magic patterns at the start, etc.
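
For illustration, the "declare once" option might look like the sketch below in XML, read here with Python's xml.sax. The sample document, the entity name pic1, and the use of a MIME type as the notation's system identifier are all assumptions made for the example, not something the specs require. The declarations arrive through the DTD handler, and each photo element only carries the entity name.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, DTDHandler

# Hypothetical document: the photograph is declared once as an unparsed
# (NDATA) entity and referenced twice through an ENTITY attribute.
DOC = b"""<?xml version="1.0"?>
<!DOCTYPE doc [
  <!NOTATION png SYSTEM "image/png">
  <!ENTITY pic1 SYSTEM "pic1.png" NDATA png>
  <!ELEMENT doc (photo*)>
  <!ELEMENT photo EMPTY>
  <!ATTLIST photo src ENTITY #REQUIRED>
]>
<doc><photo src="pic1"/><photo src="pic1"/></doc>
"""

class Declarations(DTDHandler):
    """Collects notation and unparsed-entity declarations."""
    def __init__(self):
        self.notations = {}
        self.entities = {}

    def notationDecl(self, name, publicId, systemId):
        self.notations[name] = systemId

    def unparsedEntityDecl(self, name, publicId, systemId, ndata):
        self.entities[name] = (systemId, ndata)

class PhotoRefs(ContentHandler):
    """Collects the entity name used by each photo element."""
    def __init__(self):
        super().__init__()
        self.refs = []

    def startElement(self, name, attrs):
        if name == "photo":
            self.refs.append(attrs.getValue("src"))

decls, refs = Declarations(), PhotoRefs()
parser = xml.sax.make_parser()
parser.setDTDHandler(decls)
parser.setContentHandler(refs)
parser.parse(io.BytesIO(DOC))

# Resolve every reference against the single declaration.
for ref in refs.refs:
    system_id, notation = decls.entities[ref]
    print(ref, system_id, decls.notations[notation])
```

The per-reference alternative simply drops the DOCTYPE and puts href and type attributes on every photo element.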

However, the SGML guru points out that information about the graphic's
size (if needed) can be expressed in SGML in a single place using data
attributes when the entity is declared, while in XML it needs to be
repeated in attributes for each reference.  The XML zealot mentions
that you could use a single XML element to model the photograph as an
independent object, and a short and confusing debate ensues with the
XML specialist and a couple of data-modelling specialists taking up
one side, and the SGML guru and a couple of document-modelling
specialists taking up the other, while the rest of the room falls into 
a stupor.

Suddenly, the project manager jolts himself awake and asks what the
disadvantage is to giving the size information for each reference
rather than once in the declaration, and whether doing so will delay
the project or cause serious headaches when the project migrates to V2
in the fall.

The SGML guru declares that it's always better to maintain the
information in a single place rather than repeating it, because if the
information changes, it's all in one place and can be accessed easily.

The XML zealot argues that you can do the same thing by treating the
picture as a first-class object, modelled with elements.  The SGML
zealot cuts in and says that that will mess up HyTime addressing, and
at the mention of the word 'HyTime' a sudden panic grips the room,
until the XML zealot kindly points out that the same thing might apply
to XPointer.

There is a second, brief religious war between the SGML guru and the
XML zealot about whether the photograph is a declaration that belongs
in the prolog or an object that should be modelled in the document
element (again, the data-modelling specialists and the
document-modelling specialists take sides), but the meeting is going
on too long and it's becoming obvious to everyone except the SGML and
XML people that the differences aren't really important enough to have
a measurable effect on the project.

Just to be certain, though, the project manager cuts off the debate by
asking how often the same picture will appear in a single document.
The graphic designer mentions repeated graphical elements like
specialised bullets, icons in the page headers, etc., but the SGML
guru and the XML zealot both shout her down by saying that that kind
of thing is handled by the stylesheet.  

The system architect (who has finished Tetris) then declares that
information about the photograph's size, type, etc. should be stored
in a relational database, where it can be easily maintained and
updated, and that the SGML or XML will simply contain a unique
identifier that can be used to generate a primary key for a database
lookup.

Everyone else at the meeting except for the markup specialists nods
vigorous agreement, the meeting breaks up, and they all rush for the
coffee machine or washrooms, except for the SGML guru and the XML
zealot -- they stay at the table, arguing whether the unique
identifier for the database lookup should be a formal public
identifier or a URI.....
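
A minimal sketch of the architect's proposal, assuming a hypothetical "photo" table and a "ref" attribute (neither is specified above); the sample values are borrowed from the quoted example. The markup carries only the identifier, and the size information lives in one row of the database.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical schema: one row per photograph, keyed by the identifier
# that appears in the markup.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE photo ("
    "id TEXT PRIMARY KEY, href TEXT, width INTEGER, height INTEGER, depth INTEGER)"
)
db.execute("INSERT INTO photo VALUES ('p1', 'pic1.png', 300, 200, 16)")

# The document only names the photograph; it says nothing about its size.
doc = ET.fromstring('<doc><photo ref="p1"/><photo ref="p1"/></doc>')

def describe(photo):
    """Look up the stored properties for one photo element."""
    return db.execute(
        "SELECT href, width, height, depth FROM photo WHERE id = ?",
        (photo.get("ref"),),
    ).fetchone()

for photo in doc.iter("photo"):
    print(photo.get("ref"), describe(photo))
```

Whether "p1" ought to be a formal public identifier or a URI is left to the two people still at the table.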


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 24 14:00:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:24 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <000201be759e$f318bd80$0100007f@eps.inso.com>
References: <005101be73a6$325a59e0$c8a8a8c0@thing1>
	<000201be759e$f318bd80$0100007f@eps.inso.com>
Message-ID: <14072.61385.105692.234306@localhost.localdomain>

Gavin Thomas Nicol writes:

 > > >Do we really need to know about CDATA sections 
 > > 
 > > Debatable perhaps, but supported by the DOM. (Anyone know why?)
 > > But I'd really like to see better SAX/DOM integration, so Yes!
 > 
 > CDATA sections *are* different from normal text, even if only 
 > because the author used them. Note the interface inheritance in
 > the DOM that tries to hide the distinction for those that need 
 > not see it.

By the same argument,

<p
x="1">

and 

<p x="1">

are different, because the author used them.
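
The point can be checked mechanically. A small sketch using Python's xml.etree.ElementTree (end tags are added here so that each fragment is a complete document): both forms yield the same element name and the same attributes, and nothing in the result records which whitespace separated them.

```python
import xml.etree.ElementTree as ET

# The two start tags from above, each closed so that it parses on its own.
with_newline = ET.fromstring('<p\nx="1"></p>')
with_space = ET.fromstring('<p x="1"></p>')

# Compare what the parser actually reports for each form.
same = (
    with_newline.tag == with_space.tag
    and with_newline.attrib == with_space.attrib
)
print(same)
```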


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From eriblair at mediom.qc.ca  Wed Mar 24 14:23:44 1999
From: eriblair at mediom.qc.ca (Éric Riblair)
Date: Mon Jun  7 17:10:24 2004
Subject: How to convert an XML file to an Access database ...
Message-ID: <01ba01be7602$7c3e6a70$1f9ccb84@grr.ulaval.ca>

Hello, 
I would like to know the simplest way to import the information contained in an XML file into an Access database...

Thank you for your answers, 

Regards,

Éric

From jatkins at Bluestone.com  Wed Mar 24 15:59:44 1999
From: jatkins at Bluestone.com (Atkins, Jon)
Date: Mon Jun  7 17:10:24 2004
Subject: XML conference
Message-ID: <9A4DF69E3C5ED211B86400A0C9D1776095F0E2@thor.operations.bluestone.com>

XML One is being positioned as a comprehensive XML solutions and technology
forum with 3 tracks and 30 sessions on the latest XML technology.

Bob Bickel, Sr. Vice President Products for Bluestone Software, Inc., will be
giving a presentation entitled: Developing and deploying applications with
Dynamic XML Servers.  The presentation will take place on 5/27/99 at 4:00 pm
and will look at what a dynamic XML server is and how to develop and deploy
applications.


----Original Message-----
From: LUCIO PICOLLI [mailto:lucio.piccoli@one2one.co.uk]
Sent: Wednesday, March 24, 1999 3:29 AM
To: xml-dev@ic.ac.uk
Subject: XML conference


Hi all,
I am seeking info on the 'XML One' conference planned for May 24 in
Austin, TX. The web site is under construction, so details are very limited.
Does anyone on this list intend to do something interesting at the
conference?


adios

 -lucio



From b.laforge at jxml.com  Wed Mar 24 16:46:57 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:24 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <001901be7616$8e4caba0$c8a8a8c0@thing1>

From: Gavin Thomas Nicol <gtn@eps.inso.com>
>CDATA sections *are* different from normal text, even if only 
>because the author used them.

Again, is anyone aware of why CDATA is preserved by the DOM?
What was the reasoning behind this decision? Other things, like
whitespace within an element tag or even attribute order, are not preserved.
Why then was CDATA? 
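
For concreteness, here is roughly what a lexical handler adds, sketched with Python's expat-based xml.sax rather than the Java draft under discussion. The class name Events and the helpers parse and text are made up for this example. With the lexical handler registered, the CDATA boundaries show up as extra events; without it, the application sees the same characters either way.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler

class Events(ContentHandler):
    """Records character data and CDATA boundaries in one list."""
    def __init__(self):
        super().__init__()
        self.log = []

    def characters(self, content):
        self.log.append(("characters", content))

    # Lexical callbacks; only the CDATA ones are of interest here.
    def startCDATA(self):
        self.log.append(("startCDATA",))

    def endCDATA(self):
        self.log.append(("endCDATA",))

    def comment(self, content):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass

def parse(xml_bytes, lexical=True):
    """Parse a document and return the recorded event list."""
    events = Events()
    parser = xml.sax.make_parser()
    parser.setContentHandler(events)
    if lexical:
        parser.setProperty(property_lexical_handler, events)
    parser.parse(io.BytesIO(xml_bytes))
    return events.log

def text(log):
    """Join the character chunks, however the parser split them."""
    return "".join(event[1] for event in log if event[0] == "characters")

cdata_log = parse(b"<a><![CDATA[x < y]]></a>")
escaped_log = parse(b"<a>x &lt; y</a>")
print([event[0] for event in cdata_log])
print(text(cdata_log) == text(escaped_log))
```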

Bill




From b.laforge at jxml.com  Wed Mar 24 16:51:40 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:24 2004
Subject: DOM CDATA vs Normalization
Message-ID: <002001be7617$2fc33440$c8a8a8c0@thing1>

Normalization of an element combines various text objects into a single 
text object. Does it then merge text and CDATA objects to a single object?
And what about ignorable whitespace?
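
One data point, sketched with Python's xml.dom.minidom, which is only one implementation and may not reflect what other DOMs do: normalize() merges adjacent Text nodes but leaves a CDATA section as its own node. As I read the DOM Level 1 description of normalize(), CDATA sections count as markup that separates Text nodes, so this seems to be the intended behaviour. The sketch says nothing about ignorable whitespace.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

doc = parseString("<a>one<![CDATA[two]]>three</a>")
a = doc.documentElement

# Appending a second Text node creates two adjacent Text nodes,
# the situation normalize() exists to clean up.
a.appendChild(doc.createTextNode("four"))

before = [(n.nodeType, n.data) for n in a.childNodes]
a.normalize()
after = [(n.nodeType, n.data) for n in a.childNodes]

print(before)
print(after)
```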

Bill




From BPosert at filenet.com  Wed Mar 24 17:08:15 1999
From: BPosert at filenet.com (Posert, Bob)
Date: Mon Jun  7 17:10:24 2004
Subject: SQL queries expressed in XML
Message-ID: <C3AF5E329E21D2119C4C00805F6FF58FB2CC97@hq-expo2.filenet.com>

You might want to take a look at the WebDAV-related DASL "Distributed
Searching And Locating" working group page at
http://www.ics.uci.edu/pub/ietf/dasl 

From their charter:

Working Group Scope

A generalized search mechanism is a broad problem space. It encompasses a
variety of object models, typing schemes, and media. By focusing on a subset
of this space, the problem of locating resources based on property values
and text content, the working group will leverage much of the existing work
that has been done on querying under simple property and resource models.

In-Scope items include:
- typing
- comparisons (>, >=, <, <=, !=, ==)
- internationalized content
- text content matching
- dealing with arbitrary XML values

Out-of-scope items include:
- definitions of well-known properties
- server-to-server communication protocols
- cross-language comparisons
- searching for non-text content (images, video, audio, etc.)
- client control of server administration (e.g. indexing)

--Bob



From paul.janssens at skynet.be  Wed Mar 24 17:10:28 1999
From: paul.janssens at skynet.be (JPA)
Date: Mon Jun  7 17:10:24 2004
Subject: XML convertor generator
Message-ID: <36F91BC9.4310@skynet.be>

Hello,

I'm currently working on an XML convertor-generator. When finished, the
tool will construct a convertor if you take the trouble to type in the
structure of your input format and the mappings on entities and attributes.
There's no documentation as yet, and some stuff is missing (escaping, for
one thing), but if there's enough interest I'll put it on a website as
is.


Paul Janssens - paul.janssens@skynet.be


Here's an example of what the tool actually does:

1) sample input (Your legacy data here)

ACCEPT x FROM y 
ACCEPT t FROM z END_ACCEPT
ACCEPT t FROM z AT LINE 23
ACCEPT t FROM z AT COLUMN NUMBER 5
ACCEPT t FROM z AT COLUMN NUMBER col
ACCEPT t FROM z ON EXCEPTION BUMMER
ACCEPT t FROM z NOT ON EXCEPTION OK


2) syntax file (The stuff the user has to type.)

TOKEN identifier  '[A-Za-z_][A-Za-z_0-9]*';
TOKEN number      '[+-]?[0-9]+';


acceptstatements:
(  ACCEPT (identifier % )/acceptdestination
   (FROM identifier % )/acceptsource ? 

   (  AT (!LINE|!COL|!COLUMN) # measure
      NUMBER? (identifier|number) %
   )/acceptposition ?

   <onexception>? <notonexception>? END_ACCEPT?
)/acceptstatement *;

onexception:
    (ON EXCEPTION BUMMER);

notonexception:
    (NOT ON EXCEPTION OK);


3) convertor output

<acceptstatements>
 <acceptstatement>
  <acceptdestination>x</acceptdestination>
  <acceptsource>y</acceptsource>
 </acceptstatement>
 <acceptstatement>
  <acceptdestination>t</acceptdestination>
  <acceptsource>z</acceptsource>
 </acceptstatement>
 <acceptstatement>
  <acceptdestination>t</acceptdestination>
  <acceptsource>z</acceptsource>
  <acceptposition measure="LINE">23</acceptposition>
 </acceptstatement>
 <acceptstatement>
  <acceptdestination>t</acceptdestination>
  <acceptsource>z</acceptsource>
  <acceptposition measure="COLUMN">5</acceptposition>
 </acceptstatement>
 <acceptstatement>
  <acceptdestination>t</acceptdestination>
  <acceptsource>z</acceptsource>
  <acceptposition measure="COLUMN">col</acceptposition>
 </acceptstatement>
 <acceptstatement>
  <acceptdestination>t</acceptdestination>
  <acceptsource>z</acceptsource>
  <onexception/>
 </acceptstatement>
 <acceptstatement>
  <acceptdestination>t</acceptdestination>
  <acceptsource>z</acceptsource>
  <notonexception/>
 </acceptstatement>
</acceptstatements>
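
For comparison, the same conversion can be written by hand for this one
grammar. What follows is only a sketch in Python, not the generator itself:
the regular expression, the function name convert, and the use of
ElementTree are assumptions made for illustration, and it handles nothing
beyond the ACCEPT syntax shown in the syntax file above.

```python
import re
import xml.etree.ElementTree as ET

# Token patterns copied from the TOKEN lines of the syntax file.
IDENT = r"[A-Za-z_][A-Za-z_0-9]*"
NUMBER = r"[+-]?[0-9]+"

# One ACCEPT statement; every clause after the destination is optional.
STATEMENT = re.compile(
    rf"""
    ACCEPT\s+(?P<dest>{IDENT})
    (?:\s+FROM\s+(?P<src>{IDENT}))?
    (?:\s+AT\s+(?P<measure>LINE|COLUMN|COL)
       (?:\s+NUMBER)?\s+(?P<pos>{IDENT}|{NUMBER}))?
    (?P<onexc>\s+ON\s+EXCEPTION\s+BUMMER)?
    (?P<notonexc>\s+NOT\s+ON\s+EXCEPTION\s+OK)?
    (?:\s+END_ACCEPT)?
    """,
    re.VERBOSE,
)

SAMPLE = """\
ACCEPT x FROM y
ACCEPT t FROM z END_ACCEPT
ACCEPT t FROM z AT LINE 23
ACCEPT t FROM z AT COLUMN NUMBER 5
ACCEPT t FROM z AT COLUMN NUMBER col
ACCEPT t FROM z ON EXCEPTION BUMMER
ACCEPT t FROM z NOT ON EXCEPTION OK
"""


def convert(source):
    """Build an <acceptstatements> tree from legacy ACCEPT statements."""
    root = ET.Element("acceptstatements")
    for m in STATEMENT.finditer(source):
        stmt = ET.SubElement(root, "acceptstatement")
        ET.SubElement(stmt, "acceptdestination").text = m["dest"]
        if m["src"]:
            ET.SubElement(stmt, "acceptsource").text = m["src"]
        if m["pos"]:
            # The matched keyword (LINE, COL or COLUMN) becomes the attribute.
            pos = ET.SubElement(stmt, "acceptposition", measure=m["measure"])
            pos.text = m["pos"]
        if m["onexc"]:
            ET.SubElement(stmt, "onexception")
        if m["notonexc"]:
            ET.SubElement(stmt, "notonexception")
    return root


print(ET.tostring(convert(SAMPLE), encoding="unicode"))
```

The serialized result is not indented the way the convertor output above
is, but the element structure should be the same.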



From rbourret at ito.tu-darmstadt.de  Wed Mar 24 17:32:22 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:24 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <01BE7624.7A324E50@grappa.ito.tu-darmstadt.de>

Bill la Forge wrote:

> From: Gavin Thomas Nicol <gtn@eps.inso.com>
> >CDATA sections *are* different from normal text, even if only
> >because the author used them.
>
> Again, is anyone aware of why CDATA is preserved by the DOM?
> What was the reasoning behind this decision? Other things, like
> whitespace within an element tag or even attribute order, are not
> preserved.
> Why then was CDATA?

I can't say why the DOM included CDATA, but I'll hazard a guess and agree 
with Gavin.  If I'm using a CDATA section, it means that I really, really, 
really don't want what's in the section to be parsed and it would be a 
royal pain for me if it was.  (Think about writing an HTML tutorial.)

The obvious place where preservation of CDATA is important, then, is when 
I'm co-authoring a document with a friend who uses a DOM-based editor while 
I prefer a text editor.  If every time my friend edits the document all the 
CDATA sections get wiped out, neither our friendship nor our co-authorship 
is going to last very long.

This is quite different from whitespace in element tags and attribute 
order, which are more aesthetic concerns than practical ones.  I might be a 
bit annoyed if my friend's editor rearranges these, but I am unlikely to go 
looking for new partners because of it.

-- Ron Bourret




From lauren at sqwest.bc.ca  Wed Mar 24 17:47:42 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <001901be7616$8e4caba0$c8a8a8c0@thing1>
Message-ID: <36F9250B.B4F747E0@sqwest.bc.ca>

Bill la Forge wrote:
> 
> From: Gavin Thomas Nicol <gtn@eps.inso.com>
> >CDATA sections *are* different from normal text, even if only
> >because the author used them.
> 
> Again, is anyone aware of why CDATA is preserved by the DOM?
> What was the reasoning behind this decision? 

Gavin summed it up quite well - the author used a CDATA Section and
may have attached some semantic meaning to it (I know that several
people disagree that CDATA sections can have semantic meaning;
others think they can) so the DOM doesn't throw away that
distinction, just in case.

Lauren



From lauren at sqwest.bc.ca  Wed Mar 24 17:52:01 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:25 2004
Subject: DOM CDATA vs Normalization
References: <002001be7617$2fc33440$c8a8a8c0@thing1>
Message-ID: <36F92629.1D989E9E@sqwest.bc.ca>

Bill la Forge wrote:
> 
> Normalization of an element combines various text objects into a single
> text object. Does it then merge text and CDATA objects to a single object?
> And what about ignorable whitespace?

Normalization merges only adjoining Text nodes, regardless of their
content. It does not merge Text nodes with CDATA Section nodes,
comments or PIs.
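
A small sketch of that behaviour, using Python's xml.dom.minidom as a
stand-in for a DOM implementation (an assumption; other implementations
may differ in detail). The element name and the text are made up for
illustration.

```python
from xml.dom import Node, minidom

doc = minidom.Document()
p = doc.createElement("p")
doc.appendChild(p)

# Two adjoining Text nodes, then a CDATA section, then another Text node.
p.appendChild(doc.createTextNode("one "))
p.appendChild(doc.createTextNode("two "))
p.appendChild(doc.createCDATASection("<b>three</b>"))
p.appendChild(doc.createTextNode(" four"))

p.normalize()

# After normalize(), the first two Text nodes have been merged, but the
# CDATA section is still a separate node between the remaining Text nodes.
children = [(child.nodeType, child.data) for child in p.childNodes]
print(children)
```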

You will notice in the latest draft of the DOM Level 2, at
http://www.w3.org/TR/WD-DOM-Level-2/,
that one of the items on the list of issues to be addressed in Level
2 is
Conversion of a CDATASection node to a TEXT node.

This could include merging adjacent Text and CDATA Section nodes.

Might I suggest this discussion take place on the public DOM mailing
list? www-dom@w3.org, to subscribe send email to
www-dom-request@w3.org with the subject line "subscribe".

regards,

Lauren



From lauren at sqwest.bc.ca  Wed Mar 24 17:55:09 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <005101be73a6$325a59e0$c8a8a8c0@thing1>
		<000201be759e$f318bd80$0100007f@eps.inso.com> <14072.61385.105692.234306@localhost.localdomain>
Message-ID: <36F926D5.D1CEE889@sqwest.bc.ca>

David Megginson wrote:
> 
> Gavin Thomas Nicol writes:
> 
>  > > >Do we really need to know about CDATA sections
>  > >
>  > > Debatable perhaps, but supported by the DOM. (Anyone know why?)
>  > > But I'd really like to see better SAX/DOM integration, so Yes!
>  >
>  > CDATA sections *are* different from normal text, even if only
>  > because the author used them. Note the interface inheritance in
>  > the DOM that tries to hide the distinction for those that need
>  > not see it.
> 
> By the same argument,
> 
> <p
> x="1">
> 
> and
> 
> <p x="1">
> 
> are different, because the author used them.

I haven't heard anyone argue that the whitespace can have semantic
meaning, whereas I have heard it about CDATA sections. (Note that I
do not necessarily agree that CDATA sections can have semantic
meaning, simply that some people think they do.)
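
To make the comparison concrete: an API that reports neither distinction
gives identical results for both spellings of the start tag, and also for
a CDATA section versus escaped text. A sketch using Python's
xml.etree.ElementTree, chosen here only because it happens to discard both
kinds of lexical detail (the sample documents are invented):

```python
import xml.etree.ElementTree as ET

# Whitespace inside the start tag: the two forms parse identically.
broken_tag = ET.fromstring('<p\nx="1"></p>')
plain_tag = ET.fromstring('<p x="1"></p>')
same_tag = (broken_tag.tag == plain_tag.tag
            and broken_tag.attrib == plain_tag.attrib)

# CDATA section versus escaped text: this API reports the same string.
with_cdata = ET.fromstring("<p><![CDATA[<b>]]></p>")
with_escapes = ET.fromstring("<p>&lt;b&gt;</p>")
same_text = with_cdata.text == with_escapes.text

print(same_tag, same_text)
```

An API that wants to preserve the CDATA distinction has to report it
separately, which is what the DOM does with its CDATASection node.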

Lauren



From donpark at quake.net  Wed Mar 24 18:14:34 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <00dd01be7622$13368ba0$2ee044c6@arcot-main>

>The problem is that even if you don't care about entity boundaries,
>the XML 1.0 REC requires reporting of any entities that are not
>expanded (in the case, for example, of a non-validating parser that
>hasn't read the declaration in the external DTD subset).  As a result,
>in a literal reading of the spec, a fully-conformant XML 1.0 API can
>*never* treat attribute values simply as strings.  SAX 1.0 does so,
>and no one has ever minded, but conformance is conformance...


The XML REC uses the word 'report' a lot but wisely does get into what
reporting means.  I think that as long as the information is available
on-demand through one mechanism or another, we can consider the reporting
requirement met.

Don




From paul at prescod.net  Wed Mar 24 18:21:15 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:25 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity,namespaces 
 (was WG))
References: <002201be74e8$c8ca5310$11f96d8c@NT.JELLIFFE.COM.AU>
		<3.0.6.32.19990323093744.009d6ec0@gpo.iol.ie> <14071.33102.817935.724149@localhost.localdomain>
Message-ID: <36F91A38.53F4C78@prescod.net>

David Megginson wrote:
> 
> Sean Mc Grath writes:
> 
>  > This is an interesting thread.  Many non-tag-minimization
>  > reliables can be put forth as things that SGML "can do" that
>  > XML cannot. Things like data attributes, exclusion exceptions,
>  > internal SDATA entities and so on.
> 
> I think that I agree with what Sean is saying here and later in the
> message -- think of *what* you can represent rather than *how* you
> represent it.

That representation alone isn't good enough -- standardization is also
important.

Here's what I heard Sean saying:

 * SGML favours globally standardized declarations over locally maintained
custom code.

 * XML restricts the number of globally standardized declarations in favor
of locally maintained custom code.

In other words: SGML favours standardization and XML favours one-off
system-specific ad-hocery. If I really believed that then I would drop XML
and advise my customers not to use it. XML removed certain specific
declarative features of SGML that were either not used enough or could be
added in at another level. But little by little XML is becoming more and
more declarative through other layers like XLink, XSL, RDF and XML Schemas.
The move towards declarativeness and away from ad hoc code is precisely
XML's gift to the Web. Standardized declarativeness is the real XML
revolution. XML just happens to be the syntax.

Let me demonstrate that XML is also standard declaration-focused by
turning around the "notations" example. The SGML way was to declare a
notation and have a second level validate that the data adhered to the
notation. Unfortunately, we never really standardized a decent declarative
syntax for the second level. In other words, SGML was not declarative
*enough*.

XML, on the other hand, will likely have a mechanism where notations can
be declared in the schema (under the title of "user-defined data types").
So the XML family will have a more powerful, standardized, declarative
mechanism which will reduce the need for maintaining custom code. The
declarativeness baton has been passed from SGML to XML.

Custom code is the enemy. We will always need it but we must continue to
relegate it to more and more complex or esoteric problems.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From donpark at quake.net  Wed Mar 24 18:31:15 1999
From: donpark at quake.net (Don Park)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <001401be7624$668dc5a0$2ee044c6@arcot-main>

>The XML REC uses the word 'report' a lot but wisely does get into what
>reporting means.  I think that as long as the information is available
>on-demand through one mechanism or another, we can consider the reporting
>requirement met.


OOPS.  I meant to say that REC does NOT explain what reporting means.

Don




From rbourret at ito.tu-darmstadt.de  Wed Mar 24 18:40:05 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>

Lauren Wood wrote:

> Gavin summed it up quite well - the author used a CDATA Section and
> may have attached some semantic meaning to it (I know that several
> people disagree that CDATA sections can have semantic meaning;
> others think they can) so the DOM doesn't throw away that
> distinction, just in case.

I'm having trouble imagining how a CDATA section can have semantic meaning 
in all but the most abusive ways.  (Hmmm, there's a CDATA section.  Fire up 
the pizza delivery DLL.)  Could you give an example?  Thanks.

-- Ron Bourret




From david at megginson.com  Wed Mar 24 19:27:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <001901be7616$8e4caba0$c8a8a8c0@thing1>
References: <001901be7616$8e4caba0$c8a8a8c0@thing1>
Message-ID: <14073.15473.885207.805699@localhost.localdomain>

Bill la Forge writes:

 > Again, is anyone aware of why CDATA is preserved by the DOM?
 > What was the reasoning behind this decision? Other things, like
 > whitespace within an element tag or even attribute order, are not preserved.
 > Why then was CDATA? 

I would guess that the DOM WG believed that users of XML editors and
repositories would want to see CDATA section boundaries and comments
survive a round trip in and out of the tools.  Personally, I am
extremely skeptical, but I have heard this argument many times from
the employees of the vendors themselves.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 24 19:37:28 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
In-Reply-To: <01BE7624.7A324E50@grappa.ito.tu-darmstadt.de>
References: <01BE7624.7A324E50@grappa.ito.tu-darmstadt.de>
Message-ID: <14073.15571.764196.447559@localhost.localdomain>

Ronald Bourret writes:

 > The obvious place where preservation of CDATA is important, then,
 > is when I'm co-authoring a document with a friend who uses a
 > DOM-based editor while I prefer a text editor.  If every time my
 > friend edits the document all the CDATA sections get wiped out,
 > neither our friendship nor our co-authorship are going to last very
 > long.

Yes, but there would be easier ways to handle this.  Let's say, for
example, that you consistently use the following in your text editor:

  <example><![CDATA[
  <s>This is literal XML markup used as an example</s>
  ]]></example>

Now, if CDATA boundaries were discarded, when your friend loaded this
into her DOM-based editor and then saved it again, you would see
something like the following:

  <example>
  &#60;s&#62;This is literal XML markup used as an example&#60;/s&#62;
  </example>

If this kind of thing does matter (as it probably would to you),
perhaps your friend could configure her editor to select certain
element types that would always have their content CDATA escaped on
export (nearly every document type has only one or two candidates,
such as HTML <pre>).

Even if your friend's editor didn't support that, nearly anyone on
this list could hack together a Perl or Java program in about 15
minutes that would allow you to do something like

  xml-cdata-escape mydoc.xml example > mydoc2.xml

Voila, your CDATA is back!  Of course, there are a few situations
where people use CDATA less predictably, but I hardly believe that the
requirement would survive a real cost-benefit analysis if the DOM WG
had made one.
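
A rough sketch of such a filter, in Python rather than Perl or Java. Only
the idea comes from the message above; the function name cdata_escape, the
use of xml.dom.minidom, and the decision to skip elements that contain
child elements or a literal "]]>" are all assumptions made for
illustration.

```python
from xml.dom import Node, minidom

CHARACTER_DATA = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def cdata_escape(xml_text, element_name):
    """Re-wrap the character content of every <element_name> in CDATA."""
    doc = minidom.parseString(xml_text)
    for elem in doc.getElementsByTagName(element_name):
        children = list(elem.childNodes)
        # Leave empty elements and mixed content alone.
        if not children:
            continue
        if any(c.nodeType not in CHARACTER_DATA for c in children):
            continue
        text = "".join(c.data for c in children)
        # "]]>" cannot appear inside a CDATA section, so skip such content.
        if "]]>" in text:
            continue
        for child in children:
            elem.removeChild(child)
        elem.appendChild(doc.createCDATASection(text))
    return doc.toxml()


saved_by_editor = (
    "<doc><example>&#60;s&#62;literal markup&#60;/s&#62;</example>"
    "<p>a &amp; b</p></doc>"
)
print(cdata_escape(saved_by_editor, "example"))
```

A command-line wrapper would read the file named by its first argument,
take the element type from the second, and write the result to standard
output.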


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 24 19:40:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <00dd01be7622$13368ba0$2ee044c6@arcot-main>
References: <00dd01be7622$13368ba0$2ee044c6@arcot-main>
Message-ID: <14073.16162.923732.438041@localhost.localdomain>

Don Park writes:

 > >The problem is that even if you don't care about entity boundaries,
 > >the XML 1.0 REC requires reporting of any entities that are not
 > >expanded (in the case, for example, of a non-validating parser that
 > >hasn't read the declaration in the external DTD subset).  As a result,
 > >in a literal reading of the spec, a fully-conformant XML 1.0 API can
 > >*never* treat attribute values simply as strings.  SAX 1.0 does so,
 > >and no one has ever minded, but conformance is conformance...
 > 
 > The XML REC uses the word 'report' a lot but wisely does get into what
 > reporting means.  I think that as long as the information is available
 > on-demand through one mechanism or another, we can consider the reporting
 > requirement met.

Yes, I agree -- we *can* provide the attribute value as a string, but
we also have to make the alternative representation available in case
in 10 or 20 years someone actually needs it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From pgrosso at arbortext.com  Wed Mar 24 19:48:17 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:10:25 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <3.0.32.19990324134646.00d152cc@pophost.arbortext.com>

At 14:28 1999 03 24 -0500, David Megginson wrote:
>Bill la Forge writes:
> > Again, is anyone aware of why CDATA is preserved by the DOM?
> > What was the reasoning behind this decision? Other things, like
> > whitespace within an element tag or even attribute order, are not
> > preserved.
> > Why then was CDATA? 
>
>I would guess that the DOM WG believed that users of XML editors and
>repositories would want to see CDATA section boundaries and comments
>survive a round trip in and out of the tools.  Personally, I am
>extremely skeptical, but I have heard this argument many times from
>the employees of the vendors themselves.

As such a vendor, I hear this from our customers.  

When authoring a document, the user may want to know there
is a region into which s/he can paste stuff containing < and & 
characters and know they won't be interpreted as markup.  True,
the editing application can magically escape them (e.g., &lt;)
as part of the paste operation, but what if the user is using
Notepad to copy a parsable XML example into an XML document? 
Having to escape the special characters destroys the ability
to have that data remain parsable/validatable at the same time
as embedded in the larger document, and that destroys an important 
reuse/multipurpose feature otherwise available in XML.  (Think
of a dynamic XML document that allows you to "verify as well-formed"
the content of any <sample-xml> element in your tutorial document.)
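
A sketch of that kind of check in Python. The element name sample-xml is
taken from the paragraph above; the function name check_samples, the use
of xml.dom.minidom, and the tutorial document are invented for
illustration.

```python
from xml.dom import Node, minidom
from xml.parsers.expat import ExpatError

CHARACTER_DATA = (Node.TEXT_NODE, Node.CDATA_SECTION_NODE)


def check_samples(document_text, element_name="sample-xml"):
    """Return one True/False per sample: is its content well-formed XML?"""
    doc = minidom.parseString(document_text)
    results = []
    for elem in doc.getElementsByTagName(element_name):
        # The sample is the character content of the element, whether the
        # author wrote it in a CDATA section or escaped it by hand.
        sample = "".join(
            c.data for c in elem.childNodes if c.nodeType in CHARACTER_DATA
        )
        try:
            minidom.parseString(sample.strip())
            results.append(True)
        except ExpatError:
            results.append(False)
    return results


tutorial = (
    "<tutorial>"
    "<sample-xml><![CDATA[<s>closed properly</s>]]></sample-xml>"
    "<sample-xml><![CDATA[<s>never closed]]></sample-xml>"
    "</tutorial>"
)
print(check_samples(tutorial))
```

With a CDATA section the author can paste the sample unchanged; the check
itself works the same way on hand-escaped text.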
 
The point is that the user-author inserted the CDATA section for 
a reason, and they might well want it to stay there.



From rbourret at ito.tu-darmstadt.de  Wed Mar 24 20:04:30 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
Message-ID: <01BE7639.C333D500@grappa.ito.tu-darmstadt.de>

David Megginson writes:

> If this kind of thing does matter (as it probably would to you),
> perhaps your friend could configure her editor to select certain
> element types that would always have their content CDATA escaped on
> export (nearly every document type has only one or two candidates,
> such as HTML <pre>).
>
> Even if your friend's editor didn't support that, nearly anyone on
> this list could hack together a Perl or Java program in about 15
> minutes that you allow you to do something like
>
>   xml-cdata-escape mydoc.xml example > mydoc2.xml

I buy the first argument (seems like a reasonable feature of an XML editor) 
but not the second. Most users of such systems are unlikely to be able to 
hack together such a program and they may or may not have a friendly 
programmer to whom they can turn.

-- Ron Bourret




From david at megginson.com  Wed Mar 24 20:33:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
In-Reply-To: <01BE7639.C333D500@grappa.ito.tu-darmstadt.de>
References: <01BE7639.C333D500@grappa.ito.tu-darmstadt.de>
Message-ID: <14073.19387.229257.98559@localhost.localdomain>

Ronald Bourret writes:

 [snip]

 > > Even if your friend's editor didn't support that, nearly anyone
 > > on this list could hack together a Perl or Java program in about
 > > 15 minutes that you allow you to do something like
 > >
 > >   xml-cdata-escape mydoc.xml example > mydoc2.xml
 > 
 > I buy the first argument (seems like a reasonable feature of an XML
 > editor) but not the second. Most users of such systems are unlikely
 > to be able to hack together such a program and they may or may not
 > have a friendly programmer to whom they can turn.

Ah, yes, but people who edit their XML in a text editor probably would 
be capable of starting a command-line application.  What I'm
suggesting is that *something* had to be written -- either the simple
filter or full CDATA support for all DOM applications.  Full CDATA
support won (love it or leave it), but the other might have been a
little easier.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rbourret at ito.tu-darmstadt.de  Wed Mar 24 21:14:20 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:25 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
Message-ID: <01BE7643.888FE290@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> Ah, yes, but people who edit their XML in a text editor probably would
> be capable of starting a command-line application.  What I'm
> suggesting is that *something* had to be written -- either the simple
> filter or full CDATA support for all DOM applications.  Full CDATA
> support won (love it or leave it), but the other might have been a
> little easier.

Hmmm.  If those are the choices, I vote for putting it in.  Forcing all 
companies to write a quick little application and forcing all users to run 
a quick little application seems far more onerous than forcing DOM 
programmers to work around this.

I wasn't even going to reply, but then I remembered that the real question 
here is whether SAX (not the DOM) should tell people about CDATA sections. 
 I think the answer is yes.  Unlike the DOM, where people not interested in 
CDATA sections still have to work around them, SAX applications that are 
not interested in CDATA sections simply have null implementations of 
start/endCDATA.

The only drawback I see is that applications not interested in CDATA 
sections are forced to suffer through three calls to 
DocumentHandler.characters -- before, during, and after the CDATA section. 
 The application can use a filter to solve this, of course, but it's still 
likely to be a source of application errors.  (Depending on how parsers 
implement LexicalHandler callbacks, this could happen even if the 
application doesn't register a LexicalHandler implementation.  Does the 
property requesting a single call to characters() apply in this case?  It 
ought to.)
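
As an illustration of both points, here is a sketch using the SAX binding
in Python's standard library, which is not the Java draft under discussion
but exposes the same startCDATA/endCDATA callbacks through a lexical
handler property. The class names and the sample document are made up; an
application that does not care about CDATA sections simply registers no
lexical handler and joins whatever chunks characters() delivers.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, property_lexical_handler


class TextCollector(ContentHandler):
    """Collects every characters() call so the chunks can be inspected."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        self.chunks.append(content)


class CDATAWatcher:
    """Lexical handler that only records CDATA section boundaries."""

    def __init__(self):
        self.events = []

    def startCDATA(self):
        self.events.append("startCDATA")

    def endCDATA(self):
        self.events.append("endCDATA")

    # The remaining lexical callbacks are deliberately empty.
    def comment(self, content):
        pass

    def startDTD(self, name, public_id, system_id):
        pass

    def endDTD(self):
        pass


def parse(xml_bytes, lexical=None):
    """Parse xml_bytes and return the list of character chunks reported."""
    parser = xml.sax.make_parser()
    collector = TextCollector()
    parser.setContentHandler(collector)
    if lexical is not None:
        parser.setProperty(property_lexical_handler, lexical)
    parser.parse(io.BytesIO(xml_bytes))
    return collector.chunks


document = b"<p>before <![CDATA[<inside>]]> after</p>"
watcher = CDATAWatcher()
chunks_with_watcher = parse(document, watcher)
chunks_without = parse(document)
print(chunks_with_watcher, watcher.events, chunks_without)
```

How many chunks arrive is up to the parser, so the application should
concatenate them rather than rely on a particular number of calls.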

-- Ron




From david at megginson.com  Wed Mar 24 22:02:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:25 2004
Subject: XML and (K)Office
Message-ID: <14073.23919.284210.386065@localhost.localdomain>

Now that the press is (prematurely) declaring Microsoft's imminent
demise [1], perhaps we can stop worrying about MS Office's XML support
(why hitch your cart to an allegedly dying horse?) and look at Linux.

For those of you who don't know, the current incarnation of the
emacs/vi or sh/csh religious battle (or perhaps SGML/XML) in the Linux
world is KDE vs. Gnome as the desktop manager.  I'm in the Gnome camp,
so it is with mixed feelings that I draw attention to the fact that
KOffice for KDE (see article [2]) uses XML-based save formats for
*all* of its applications (word processor, spreadsheet, formula
designer, presentation manager, etc. etc.).

To be fair, the main Gnome spreadsheet also uses a (no doubt
incompatible) XML-based save format.

There's also a hot rumour [3] that Microsoft has assigned 37
programmers to work on a Linux port of MS Office.

The place to go for this kind of stuff is slashdot.org, which is heavy 
on the hacker look-and-feel.


All the best,


David

[1] http://www.cnn.com/TECH/computing/9903/24/mslinux.html/ (and many
    others) 
[2] http://www.mieterra.com/article/koffice.html
[3] http://www.heise.de/newsticker/data/cp-19.03.99-000/ (auf Deutsch)

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jamesr at steptwo.com.au  Wed Mar 24 22:12:26 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:25 2004
Subject: How to convert an XML file to an Access database ...
In-Reply-To: <01ba01be7602$7c3e6a70$1f9ccb84@grr.ulaval.ca>
Message-ID: <4.1.19990325090956.00bda4c0@steptwo.com.au>

At 00:27 25/03/1999 , Éric Riblair wrote: 
>
> Hello, 
>
> I would like to know the simplest way to import the information contained in
> an XML file into an Access database... 
>
> Thank you for your answers, 
>
> Regards, 
>
> Éric


Omnimark and its ODBC libraries?

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From jamesr at steptwo.com.au  Wed Mar 24 22:16:02 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:25 2004
Subject: XML convertor generator
In-Reply-To: <36F91BC9.4310@skynet.be>
Message-ID: <4.1.19990325091311.00ba01c0@steptwo.com.au>

At 03:08 25/03/1999 , JPA wrote:
  | Hello,
  | 
  | I'm currently working on an xml convertor-generator. When finished, the
  | tool will, if you take the bother to type the structure of your input
  | format and mappings on entities and attributes, construct a convertor. 
  | There's no documentation as yet, and some stuff missing (escaping, for
  | one thing), but if there's enough interest I'll put it on a website as
  | is.
  | 
  | 
  | Paul Janssens - paul.janssens@skynet.be

Paul,

Not wishing to rain on your parade, but aren't
you re-inventing the wheel here?

Will your solution do anything that Perl or
Omnimark can't already do?

(Just trying to save you a lot of time.)

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From Daniel.Veillard at w3.org  Wed Mar 24 22:21:03 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
In-Reply-To: <14073.23919.284210.386065@localhost.localdomain>; from David Megginson on Wed, Mar 24, 1999 at 05:02:34PM -0500
References: <14073.23919.284210.386065@localhost.localdomain>
Message-ID: <19990324172040.I21831@w3.org>

> For those of you who don't know, the current incarnation of the
> emacs/vi or sh/csh religious battle (or perhaps SGML/XML) in the Linux
> world is KDE vs. Gnome as the desktop manager.  I'm in the Gnome camp,

  No KDE/Gnome war here, please; however, I'm one of the Gnome developers.

> so it is with mixed feelings that I draw attention to the fact that
> KOffice for KDE (see article [2]) uses XML-based save formats for
> *all* of its applications (word processor, spreadsheet, formula
> designer, presentation manager, etc. etc.).

  A large number of Gnome apps are also using XML or moving to XML formats.
A very good example is glade, the GTK application builder, which saves
its state as an XML file.
  
> To be fair, the main Gnome spreadsheet also uses a (no doubt
> incompatible) XML-based save format.

  Yep, Gnumeric uses XML (actually gzipped XML on disk) and uses namespaces.
I'm pretty sure it's incompatible, since when I coded the XML export I didn't
know that KDE would do likewise. There is a virtual cookie reward for the first
person to send me a good example of a KDE spreadsheet XML file :-) . A clean DTD
would be even better.
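
For anyone who wants to poke at such a file, reading gzipped, namespaced
XML takes only a few lines. A minimal Python sketch follows; the namespace
URI and the element and attribute names (Workbook, Cell, Row, Col) are
invented for illustration and are not Gnumeric's actual format.

```python
import gzip
import io
import xml.etree.ElementTree as ET

# Hypothetical namespace URI and element names, for illustration only.
NS = {"ss": "http://example.org/spreadsheet"}

def read_cells(gz_bytes):
    """Decompress gzipped XML and return {(row, col): text} for every cell."""
    with gzip.GzipFile(fileobj=io.BytesIO(gz_bytes)) as stream:
        root = ET.parse(stream).getroot()
    cells = {}
    for cell in root.findall(".//ss:Cell", NS):
        key = (int(cell.get("Row")), int(cell.get("Col")))
        cells[key] = cell.text
    return cells

# Build a tiny gzipped document in memory so the sketch is self-contained.
doc = (
    b'<ss:Workbook xmlns:ss="http://example.org/spreadsheet">'
    b'<ss:Cell Row="0" Col="0">42</ss:Cell>'
    b'<ss:Cell Row="0" Col="1">=A1*2</ss:Cell>'
    b'</ss:Workbook>'
)
cells = read_cells(gzip.compress(doc))
```

The namespace prefix is resolved through the NS mapping, so the lookup keeps
working even if a file declares the same URI under a different prefix.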

Daniel

-- 
	    [Yes, I have moved back to France !]
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux, WWW, rpmfind,
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | rpm2html, XML,
http://www.w3.org/People/W3Cpeople.html#Veillard | badminton, and Kaffe.



From Williams.Arumainathan at fmr.com  Wed Mar 24 22:47:47 1999
From: Williams.Arumainathan at fmr.com (Arumainathan, Williams)
Date: Mon Jun  7 17:10:26 2004
Subject: XML Documents
Message-ID: <E7DDE4E10279D21182200060B06A95C09C2DA6@msgbos644nts.fmr.com>

Hi, 
I have just started learning XML. Can you suggest some good websites,
please?

Thank you,
Williams.

> -----Original Message-----
> From:	James Robertson [SMTP:jamesr@steptwo.com.au]
> Sent:	Wednesday, March 24, 1999 6:15 PM
> To:	xml-dev@ic.ac.uk
> Subject:	Re: XML convertor generator
> 
> At 03:08 25/03/1999 , JPA wrote:
>   | Hello,
>   | 
>   | I'm currently working on an xml convertor-generator. When finished,
> the
>   | tool will, if you take the bother to type the structure of your input
>   | format and mappings on entities and attributes, construct a convertor.
> 
>   | There's no documentation as yet, and some stuff missing (escaping, for
>   | one thing), but if there's enough interest I'll put it on a website as
>   | is.
>   | 
>   | 
>   | Paul Janssens - paul.janssens@skynet.be
> 
> Paul,
> 
> Not wishing to rain on your parade, but aren't
> you re-inventing the wheel here?
> 
> Will your solution do anything that Perl or
> Omnimark can't already do?
> 
> (Just trying to save you a lot of time.)
> 
> Cheers,
> 
> James
> 
> 
> -------------------------
> James Robertson
> Step Two Designs Pty Ltd
> SGML, XML & HTML Consultancy
> http://www.steptwo.com.au/
> jamesr@steptwo.com.au
> 
> "Beyond the Idea"
>  ACN 081 019 623
> 


From mvidal at umiacs.umd.edu  Wed Mar 24 22:51:36 1999
From: mvidal at umiacs.umd.edu (Maria Esther Vidal)
Date: Mon Jun  7 17:10:26 2004
Subject: XML DTD to relational 
Message-ID: <199903242251.RAA12467@loomba.umiacs.umd.edu>

Hello,

I would like to know if there is a Java library that creates a
relational schema from an XML DTD or a Java library that parses
an XML DTD?

Many thanks,

Maria Esther Vidal



From paul at prescod.net  Wed Mar 24 22:53:19 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
References: <14073.23919.284210.386065@localhost.localdomain>
Message-ID: <36F9689A.CEDA6214@prescod.net>

David Megginson wrote:
> 
> For those of you who don't know, the current incarnation of the
> emacs/vi or sh/csh religious battle (or perhaps SGML/XML) in the Linux
> world is KDE vs. Gnome as the desktop manager.  I'm in the Gnome camp,
> so it is with mixed feelings that I draw attention to the fact that
> KOffice for KDE (see article [2]) uses XML-based save formats for
> *all* of its applications (word processor, spreadsheet, formula
> designer, presentation manager, etc. etc.).

Note that other standards in use in KOffice include CORBA, and
Linuxdoc/SGML (for KDE documentation). These guys obviously have a
standards focus.

Probably not coincidentally they use Python for scripting and formulas.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From macherius at darmstadt.gmd.de  Wed Mar 24 22:57:37 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
In-Reply-To: <14073.23919.284210.386065@localhost.localdomain>
Message-ID: <199903242256.XAA04448@sonne.darmstadt.gmd.de>

David Megginson <david@megginson.com> wrote at 24 Mar 99, 17:02:

> There's also a hot rumour [3] that Microsoft has assigned 37
> programmers to work on a Linux port of MS Office.

I was very surprised by David Megginson's note that MS is porting
Office. In the hope of not infringing any copyright, here is a partial
translation of the "Heise newsticker" article. c't can be considered
one of the leading (if not the leading) German-language computer magazines.

	++im

--- snip --
Rumours have been circulating for a while, but now for the first time
there are indications: Microsoft is porting its popular Office suite to
Linux. c't [1] was told on good authority that a project has been
formed in Redmond. According to the source, 37 developers are
working on the port of Office to Linux.

It is expected that Microsoft will announce the project during
CeBIT [2] and give a schedule for completion. [...]

[1] http://www.heise.de/ct/
[2] http://www.cebit.de/
--- snip ---


--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From david at megginson.com  Wed Mar 24 22:59:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:26 2004
Subject: XML and (K)Office
In-Reply-To: <19990324172040.I21831@w3.org>
References: <14073.23919.284210.386065@localhost.localdomain>
	<19990324172040.I21831@w3.org>
Message-ID: <14073.28055.304975.377332@localhost.localdomain>

Daniel Veillard writes:

 >   Yep, Gnumeric uses XML (actually gzipped XML on disk) and uses
 > namespaces.  I'm pretty sure it's incompatible, since when I coded
 > the XML export I didn't know that KDE would do likewise. There is a
 > virtual cookie reward for the first person to send me a good example
 > of a KDE spreadsheet XML file :-) . A clean DTD would be even better.

Doesn't the recipe for virtual cookies come with the Gnu Emacs
distribution?

Anyway, let's get this right -- I think that it's healthy for Gnumeric
and the KOffice spreadsheet program both to exist, but there is no
excuse for them to use entirely incompatible formats.  As a matter of
fact, if we could convince KDE and Gnome to use compatible XML formats
for lots of things (like interface construction), the media's
predictions of a Linux fracture would be proven to be hot air.

Do the Gnome and KDE people talk to each other much?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Mar 24 23:00:52 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:26 2004
Subject: How to convert an XML file to an Access database ...
In-Reply-To: <4.1.19990325090956.00bda4c0@steptwo.com.au>
References: <01ba01be7602$7c3e6a70$1f9ccb84@grr.ulaval.ca>
	<4.1.19990325090956.00bda4c0@steptwo.com.au>
Message-ID: <14073.28257.44465.534584@localhost.localdomain>

James Robertson writes:

 > Omnimark and its ODBC libraries?

Or Perl, or Java, or (probably) Python.  There are lots of choices,
free and commercial.
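
As a rough illustration of the Python route, here is a minimal sketch. It
uses the standard-library sqlite3 module as a stand-in for Access (which
would normally be reached through an ODBC driver), and the people/person
layout, the table name, and the column names are all made up for the
example.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Invented sample document; a real file would be read from disk.
XML = """<people>
  <person><name>Ada</name><city>London</city></person>
  <person><name>Blaise</name><city>Paris</city></person>
</people>"""

def load_people(xml_text, conn):
    """Insert one row per <person> element; return the number of rows added."""
    conn.execute("CREATE TABLE IF NOT EXISTS person (name TEXT, city TEXT)")
    rows = [
        (p.findtext("name"), p.findtext("city"))
        for p in ET.fromstring(xml_text).iter("person")
    ]
    conn.executemany("INSERT INTO person (name, city) VALUES (?, ?)", rows)
    conn.commit()
    return len(rows)

conn = sqlite3.connect(":memory:")  # stand-in for an ODBC connection to Access
count = load_people(XML, conn)
```

The same shape (parse, flatten each record to a tuple, bulk insert with
parameter placeholders) should carry over to any database driver that
follows the Python DB-API.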


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul at prescod.net  Wed Mar 24 23:12:38 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:26 2004
Subject: XML conference
References: <36020bcc.240299@smtpgate1.ONE2ONE.CO.UK>
Message-ID: <36F96F3C.4A498FBB@prescod.net>

LUCIO PICOLLI wrote:
> 
> I am considering attending the 'XML One' conference planned for May 24 at
> Austin, TX. I am searching for details so I can convince my manager to
> pay for the conference fees. However the official conference web site is
> under construction. I guessed that most of the speakers would come from
> this interest group. So if anyone has info about the conference please
> let me know.

I will be speaking at XML One on Python/XML and also on object/XML
bridging.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From sharris at primus.com  Thu Mar 25 00:31:28 1999
From: sharris at primus.com (Steve Harris)
Date: Mon Jun  7 17:10:26 2004
Subject: Architectures capability question - attribute values to element GI?
Message-ID: <228F2F40E87CD211ABA20008C7B13C5A133D95@EXCHANGE1>

Is it possible to use Architectures to map an attribute value to an
element GI in the target architecture (that is, to 'dynamically specify'
the architectural form)? This is the reverse of the common example
usage of the 'renamer-att' architecture support attribute. I have seen
this idea kicked around in various discussions, but cannot find any
documentation or examples to back up the claim that it's possible.
  The desired transformation would be from

<!-- original document instance fragment -->
<thing type="foo">bar</thing>
<thing type="baz">gorp</thing>

  to

<!-- fragment after architectural processing -->
<foo>bar</foo>
<baz>gorp</baz>

Is this really a job for something like DSSSL or XSL? I'd like to find
the limits of what Architectures can do. Please advise.
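
For comparison, outside Architectures the mapping itself is small in a
general-purpose language. A minimal Python sketch, assuming the fragment is
wrapped in a root element (called doc here, a hypothetical name) so that it
parses as a document:

```python
import xml.etree.ElementTree as ET

def promote_type(root, attr="type"):
    """Rename each element that carries `attr` to that attribute's value.

    No check is made that the value is a legal XML name; a real tool
    would have to validate it first.
    """
    for elem in root.iter():
        name = elem.attrib.pop(attr, None)
        if name is not None:
            elem.tag = name
    return root

source = '<doc><thing type="foo">bar</thing><thing type="baz">gorp</thing></doc>'
result = ET.tostring(promote_type(ET.fromstring(source)), encoding="unicode")
```

Elements without the attribute (such as the wrapper) pass through unchanged.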


Steven E. Harris
Software Engineer
PRIMUS



From jamesr at steptwo.com.au  Thu Mar 25 00:38:15 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:26 2004
Subject: XML Documents
In-Reply-To: <E7DDE4E10279D21182200060B06A95C09C2DA6@msgbos644nts.fmr.com>
Message-ID: <4.1.19990325113516.00bc69e0@steptwo.com.au>

At 08:46 25/03/1999 , Arumainathan, Williams wrote:

  | Hi, 
  | I have just started learning XML. Can you suggest some good websites,
  | please?
  | 
  | Thank you,
  | Williams.
  | 
  | > -----Original Message-----
  | > From:	James Robertson [SMTP:jamesr@steptwo.com.au]
  | > Sent:	Wednesday, March 24, 1999 6:15 PM
  | > To:	xml-dev@ic.ac.uk
  | > Subject:	Re: XML convertor generator
  | > 
  | > At 03:08 25/03/1999 , JPA wrote:
  | >   | Hello,
  | >   | 
  | >   | I'm currently working on an xml convertor-generator. When finished,
  | > the
  | >   | tool will, if you take the bother to type the structure of your input
  | >   | format and mappings on entities and attributes, construct a convertor.
  | > 
  | >   | There's no documentation as yet, and some stuff missing (escaping, for
  | >   | one thing), but if there's enough interest I'll put it on a website as
  | >   | is.
  | >   | 

Well, regarding websites for XML conversion tools:

Omnimark is easy: www.omnimark.com (have a look at OmnimarkLE in particular).

Anyone have some good sites for Perl and Python with respect
to XML conversion?

Hope this helps,

James

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From cowan at locke.ccil.org  Thu Mar 25 01:17:03 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:26 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de> from "Ronald Bourret" at Mar 24, 99 07:38:47 pm
Message-ID: <199903250221.VAA18169@locke.ccil.org>

Ronald Bourret scripsit:

> I'm having trouble imagining how a CDATA section can have semantic meaning 
> in all but the most abusive ways.  (Hmmm, there's a CDATA section.  Fire up 
> the pizza delivery DLL.)  Could you give an example?  Thanks.

For one thing, a CDATA section can contain only characters present in the
repertoire of the current encoding (no character references).  Some
people may depend on this property.

(I think this example is weak myself, but it *has* come up.)
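
To make that property concrete: inside a CDATA section a character
reference is just literal text, so a character the encoding cannot
represent directly cannot be written there at all. A minimal Python sketch
using the standard xml.dom.minidom module (the element name a is
arbitrary):

```python
from xml.dom import minidom

def text_of(xml_text):
    """Return the character data of the document element."""
    root = minidom.parseString(xml_text).documentElement
    return "".join(node.data for node in root.childNodes)

plain = text_of("<a>&#233;</a>")              # the reference is expanded
cdata = text_of("<a><![CDATA[&#233;]]></a>")  # the reference stays literal text
```

Here plain holds the single character U+00E9, while cdata holds the six
characters of the unexpanded reference.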

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From smo at jst.com.au  Thu Mar 25 01:23:06 1999
From: smo at jst.com.au (Steve Oldmeadow)
Date: Mon Jun  7 17:10:26 2004
Subject: XML convertor generator
Message-ID: <004701be765d$c2007ac0$02c809c0@stimpy>


-----Original Message-----
From: James Robertson <jamesr@steptwo.com.au>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: 25/03/1999 06:38
Subject: Re: XML convertor generator


>At 03:08 25/03/1999 , JPA wrote:
>  | Hello,
>  |
>  | I'm currently working on an xml convertor-generator. When finished, the
>  | tool will, if you take the bother to type the structure of your input
>  | format and mappings on entities and attributes, construct a convertor.
>  | There's no documentation as yet, and some stuff missing (escaping, for
>  | one thing), but if there's enough interest I'll put it on a website as
>  | is.
>  |
>  |
>  | Paul Janssens - paul.janssens@skynet.be
>
>Paul,
>
>Not wishing to rain on your parade, but aren't
>you re-inventing the wheel here?
>
>Will your solution do anything that Perl or
>Omnimark can't already do?


With that sort of attitude XML would never have gotten off the ground.  Perl
and OmniMark???  You must be a masochist.

In reply to the original post:  Paul I would be interested if you are making
the source available and it is in Java.

Steve Oldmeadow
Justice Systems Technologies




From Ed at dega.com  Thu Mar 25 03:25:38 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:27 2004
Subject: Whence XQL?
Message-ID: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>

OK, so now it is April (or thereabouts) and still no XQL. I've read all the
hypeware about this and I understand that it's just a suggestion for a
proposal for a note for a draft for a recommendation. Whatever.
I want my XQL!

Seriously, all ranting aside, I haven't seen any talk here or in XSL
listserv land about it recently. The proposal seems complete enough to
me for someone to have at least announced a beta implementation of it. I'd
be happy with mostly unfinished code if it were written in Java.

So does anybody have a clue about this? I know about XSL so please don't
send me down that path. I also know about the Datachannel attempt. 

If not, I'm tempted to write something myself. I'm not sure about a couple
of things because it's still just off the top of my head so far.

OK, it's in Java. It uses some free XML parser, probably XML4J because it's
the one I'm most familiar with. The XQL syntax parser will be generated with
ANTLR, since it outputs nice O-O Java classes. The result set of XQL is
well-formed XML. This can be handled easily because any node in any XML4J
tree (or transformed sub-tree) can print itself as XML to any stream. XML4J
has a nice getNodesByName() method that can operate at any level of the tree,
returning a NodeList of siblings with that tag name. Wrapping a result tree
in <xql:result></xql:result> and iterating the NodeList gets you the
simplest query.
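
A rough sketch of that simplest query, in Python rather than Java and using
the standard DOM call getElementsByTagName in place of the XML4J method
described above. The xql namespace URI and the sample document are
placeholders, not anything taken from the XQL proposal.

```python
from xml.dom import minidom

XQL_NS = "http://example.org/xql"  # placeholder URI, not from the XQL proposal

def query_by_name(xml_text, tag):
    """Wrap a copy of every element named `tag` in an <xql:result> element."""
    source = minidom.parseString(xml_text)
    result = minidom.getDOMImplementation().createDocument(XQL_NS, "xql:result", None)
    root = result.documentElement
    root.setAttribute("xmlns:xql", XQL_NS)  # declare the prefix explicitly
    for node in source.getElementsByTagName(tag):
        root.appendChild(result.importNode(node, True))  # deep copy
    return root.toxml()

doc = "<books><book>SGML</book><cd>Zappa</cd><book>XML</book></books>"
output = query_by_name(doc, "book")
```

Because the result is itself a DOM tree, a caller could keep the root node
instead of serializing it, which matches the idea of handing the result set
straight to an application.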

Internally the result set is just another DOM tree so you should be able to
add the .jar file to your Java app and thus satisfy that type of XQL result.
The input can be done in a variety of ways. I assume that the Perl module
XML::XQL can be used in a CGI context to extract the XQL query, execute it
and return either XML or XSL-transformed output to the calling app (browser).
Likewise, a Java servlet could do the same thing.

Cons: XML4J doesn't yet handle PIs, so it's maybe not the overall best
solution. (I may be wrong about this; IBM uploaded a new major release that
may have fixed it.) It's just the one I'm comfortable with at the moment. On
my hard drive are XML parsers from Sun, Microsoft, Oracle, James Clark and
one or two others I haven't had time to play with yet.

I don't care about efficiency or optimization. All partially created result
sets will live in memory till they are ready to be output. I also don't care
about searching multiple files, although that should be relatively easy to
add. (I'm still confused about XML repositories. Would XQL have to understand
directory paths? Does XQL need to be able to follow XLinks?)

I'm leaving out sequences but I may add them in (much) later. Return values
(analog to SQL's SELECT) are important to my application, as are
conditionals. 

Unless someone warns me that I'm clueless (which is usually the case), I'll
post a cut of the ANTLR grammar as soon as I get a working one. I'll
probably put it on my web site.

Ed


Ed Howland
ed@dega.com
http://www.dega.com 
"As your attorney, I advise you to take some adrenalchrome"




From david at megginson.com  Thu Mar 25 03:39:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:27 2004
Subject: XML and (K)Office
In-Reply-To: <36F9689A.CEDA6214@prescod.net>
References: <14073.23919.284210.386065@localhost.localdomain>
	<36F9689A.CEDA6214@prescod.net>
Message-ID: <14073.44996.847735.659024@localhost.localdomain>

Paul Prescod writes:

 > Note that other standards in use in KOffice include CORBA, and
 > Linuxdoc/SGML (for KDE documentation). These guys obviously have a
 > standards focus.

Gnome uses DocBook (!!!) and CORBA.

 > Probably not coincidentally they use Python for scripting and formulas.

I'll try not to hold that against them.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jamesr at steptwo.com.au  Thu Mar 25 03:41:28 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:27 2004
Subject: XML convertor generator
In-Reply-To: <004701be765d$c2007ac0$02c809c0@stimpy>
Message-ID: <4.1.19990325143624.00ca5d60@steptwo.com.au>

At 11:21 25/03/1999 , Steve Oldmeadow wrote:

  | >At 03:08 25/03/1999 , JPA wrote:
  | >  | Hello,
  | >  |
  | >  | I'm currently working on an xml convertor-generator. When finished, the
  | >  | tool will, if you take the bother to type the structure of your input
  | >  | format and mappings on entities and attributes, construct a convertor.
  | >  | There's no documentation as yet, and some stuff missing (escaping, for
  | >  | one thing), but if there's enough interest I'll put it on a website as
  | >  | is.
  | >  |
  | >  |
  | >  | Paul Janssens - paul.janssens@skynet.be
  | >
  | >Paul,
  | >
  | >Not wishing to rain on your parade, but aren't
  | >you re-inventing the wheel here?
  | >
  | >Will your solution do anything that Perl or
  | >Omnimark can't already do?
  | 
  | With that sort of attitude XML would never have gotten off the ground.  Perl
  | and OmniMark???  You must be a masochist.

Why?

I can think of two situations:

1. You want to develop a new conversion tool, either for
   the kudos, or for the money. If so, go for it. 

   But be warned, conversion tools need to be powerful in order
   to be useful (I should know, I spend most of my life
   converting to and from SGML/XML).

2. You have a practical problem to solve that involves converting
   files to XML. 

   If so, why on earth wouldn't you use existing off-the-shelf tools
   to do the work? Especially if they are freely available.

Now, from the original e-mail, I assumed case 2, but I
could be wrong.

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From clark.evans at manhattanproject.com  Thu Mar 25 03:53:32 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:10:27 2004
Subject: XML convertor generator
References: <4.1.19990325091311.00ba01c0@steptwo.com.au>
Message-ID: <36F9B224.CE017C45@manhattanproject.com>

James Robertson wrote:
| At 03:08 25/03/1999 , JPA wrote:
| | Hello,
| |
| | I'm currently working on an xml convertor-generator. When finished, the
| | tool will, if you take the bother to type the structure of your input
| | format and mappings on entities and attributes, construct a convertor.
| | There's no documentation as yet, and some stuff missing (escaping, for
| | one thing), but if there's enough interest I'll put it on a website as is.
| |
| | Paul Janssens - paul.janssens@skynet.be
| 
| Paul,
| 
| Not wishing to rain on your parade, but aren't
| you re-inventing the wheel here?

Actually, a program which created an efficient
program to convert XML conforming to a specific
DTD to another product would be a very cool 
invention, very different from using Perl 
and/or Omnimark.

I have Omnimark programs which take a great
deal of processing power (I'd hate to see the 
Perl equivalent).  Cutting it in half with a 
program that generated a program would be 
very cool indeed.   What kind of 'efficiencies'
do you get when you remove the interpreted layer?

I'm reading this that you are more or less
doing a YACC thing?  Is this a correct
interpretation?  Will it do SGML?
(I guess I can run it through nsgmls 
to make the XML equivalent first.)
Is it open source?  Hopefully it 
will generate C code (for speed).

Clark Evans



From jamesr at steptwo.com.au  Thu Mar 25 04:25:22 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:27 2004
Subject: XML convertor generator
In-Reply-To: <36F9B224.CE017C45@manhattanproject.com>
References: <4.1.19990325091311.00ba01c0@steptwo.com.au>
Message-ID: <4.1.19990325151909.00baac70@steptwo.com.au>

At 13:48 25/03/1999 , Clark Evans wrote:

  | James Robertson wrote:
  | | At 03:08 25/03/1999 , JPA wrote:
  | | | Hello,
  | | |
  | | | I'm currently working on an xml convertor-generator. When finished, the
  | | | tool will, if you take the bother to type the structure of your input
  | | | format and mappings on entities and attributes, construct a convertor.
  | | | There's no documentation as yet, and some stuff missing (escaping, for
  | | | one thing), but if there's enough interest I'll put it on a website as is.
  | | |
  | | | Paul Janssens - paul.janssens@skynet.be
  | | 
  | | Paul,
  | | 
  | | Not wishing to rain on your parade, but aren't
  | | you re-inventing the wheel here?
  | 
  | Actually, a program which created an efficient
  | program to convert XML conforming to a specific
  | DTD to another product would be a very cool 
  | invention, very different from using Perl 
  | and/or Omnimark.

This _would_ be useful.

However, to be useful, it would have to support:

* Regular expressions.
* Complex data types, especially things like hash
  tables.
* Some form of "reference"-like lookahead.
* Context-sensitive code based on the current
  SGML state.

These are the things that I use every day.

Converting from legacy (or as I have recently
heard it called, "heritage") data to XML is
not simple. If the source is very consistent,
you're fine. 

Otherwise, it's always a struggle, in which
you use every tool in your toolbox.

  | I have Omnimark programs which take a great
  | deal of processing power (I'd hate to see the 
  | Perl equivalent).  Cutting it in half with a 
  | program that generated a program would be 
  | very cool indeed.   What kind of 'efficiencies'
  | do you get when you remove the interpreted layer?

Omnimark is actually pretty good. On the basis
of the speeds reported on this mailing list, I
would rate it quite fast, especially on large
data sets.

But of course, if you're doing a complex
conversion, then your code is going to be
slow. Fact of life.

  | I'm reading this that you are more or less
  | doing a YACC thing?  Is this a correct
  | interpretation?  Will it do SGML?
  | (I guess I can run it through nsgmls 
  | to make the XML equivalent first.)
  | Is it open source?  Hopefully it 
  | will generate C code (for speed).

A YACC-like tool would be way cool.

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From srn at techno.com  Thu Mar 25 05:07:43 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:10:27 2004
Subject: Architectures capability question - attribute values to element GI?
In-Reply-To: <228F2F40E87CD211ABA20008C7B13C5A133D95@EXCHANGE1> (message from
	Steve Harris on Wed, 24 Mar 1999 16:24:25 -0800)
References: <228F2F40E87CD211ABA20008C7B13C5A133D95@EXCHANGE1>
Message-ID: <199903250453.WAA00944@bruno.techno.com>

[Steve Harris:]

> Is it possible to use Architectures to map an attribute value to an
> element GI in the target architecture (that is, to 'dynamically
> specify' the architectural form)? This desire is in reverse to the
> common example usage of the 'renamer-att' architecture support
> attribute. I have seen this idea kicked around in various
> discussions, but cannot find any documentation or examples to back
> up the claim that it's possible.  The desired transformation would
> be from

> <!-- original document instance fragment -->
> <thing type="foo">bar</thing>
> <thing type="baz">gorp</thing>
> 
>   to
> 
> <!-- fragment after architectural processing -->
> <foo>bar</foo>
> <baz>gorp</baz>
> 
> Is this really a job for something like DSSSL or XSL? I'd like to find
> the limits of what Architectures can do. Please advise.

This example happens to be an especially natural case for the use of
an inheritable architecture.  No renaming attribute is required.  Let
me rename your "type" attribute to "orlando", and provide an "orlando
architecture meta-DTD" to make things clearer.

The orlando architecture (a meta-DTD):

<!ELEMENT orlandoDoc ( foo | baz)* >
<!ELEMENT foo ( #PCDATA) >
<!ELEMENT baz ( #PCDATA) >


<!-- document instance fragment: -->
<!-- (The orlando attribute gives the name of the architectural form 
     in the orlando architecture.) -->

<thing orlando="foo">bar</thing>
<thing orlando="baz">gorp</thing>


<!-- the "orlando" architecture's view of the same
     fragment, as, e.g., SP would report it: -->

<foo>bar</foo>
<baz>gorp</baz>
 

The value of the orlando attribute is, in effect, "the element type
name for orlando purposes".  This is the most fundamental thing to
know about how inheritable architectures work.
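
A minimal sketch of that idea, in Python rather than with a real
architecture engine such as SP.  The function name orlando_view, the
wrapping <doc> element, and the simplification that every element
carries the form-name attribute are assumptions made for illustration;
real architectural processing has more controls than this.

```python
import xml.etree.ElementTree as ET

def orlando_view(elem, form_att="orlando"):
    """Build the architectural view of elem.

    Assumption: every element carries the form-name attribute; its value
    becomes the element type name in the derived view.
    """
    view = ET.Element(elem.get(form_att))
    view.text = elem.text
    view.tail = elem.tail
    for child in elem:
        view.append(orlando_view(child, form_att))
    return view

source = ET.fromstring(
    '<doc orlando="orlandoDoc">'
    '<thing orlando="foo">bar</thing>'
    '<thing orlando="baz">gorp</thing>'
    '</doc>'
)
print(ET.tostring(orlando_view(source), encoding="unicode"))
```

Two <thing> elements come out as a <foo> and a <baz>, which is the
mapping asked about, driven purely by the attribute value.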

It occurs to me, on account of your use of the phrase "dynamically
specify", that maybe you're asking something more subtle, which I
would rephrase as follows:

  "Does the value of the architectural form name attribute have to be
  #FIXED in the DTD?"

The answer is "No."  There doesn't even have to be a DTD.  If you can
make it be #FIXED in the DTD, or at least default it in the DTD, that
can save a lot of markup from having to be specified in the instance,
because you won't have to say, e.g. "orlando=foo" in every <thing>
tag.  Even if there is a DTD, there is no requirement that the DTD's
GIs ("generic identifiers" or "element type names") correspond
consistently with the GIs of any of the meta-DTDs that are being
inherited.  So, it's perfectly OK for a particular <thing> element to
be, in orlando terms, a <foo>, and, even in the same document, for
another <thing> to be a <baz> in orlando terms.  (It may seem a bit
odd, but it does happen.  In fact, there's one place in the Topic Maps
architecture (which is about to be an ISO standard, BTW) where it
happens: a <topic> architectural form in the Topic Maps architecture
is sometimes a <HyBrid> and other times a <varlink> in the HyTime
architecture.)

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 231 4098 (at ISOGEN: +1 214 953 0004 x137)
fax    +1 972 994 0087 (at ISOGEN: +1 214 953 3152)

3615 Tanner Lane
Richardson, Texas 75082-2618 USA



From avirr at LanMinds.Com  Thu Mar 25 05:54:29 1999
From: avirr at LanMinds.Com (Avi Rappoport)
Date: Mon Jun  7 17:10:27 2004
Subject: Whence XQL?
In-Reply-To: 
 <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>
References: 
 <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>
Message-ID: <v04204b00b31f7f6a66a6@[207.33.50.55]>

At 7:24 PM -0800 3/24/1999, Ed Howland wrote:
>Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
>hypeware about this and I understand that its just a suggestion for a
>proposal for a note for a draft for a recommendation. Whatever.
>I want my XQL!

There's a great article by Lisa Rein on the W3C Query Workshop late 
last year -- I'm sure it's still at XML.com.  There are links from 
the article to the position papers of the participants, and I found 
them fascinating and enlightening.  There are a *lot* of issues to be 
solved!

Avi

________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr@lanminds.com>
Guide to Site Indexing and Local Search Engines: <http://www.searchtools.com>



From vivi at odi.com  Thu Mar 25 06:22:09 1999
From: vivi at odi.com (Vittorio Viarengo)
Date: Mon Jun  7 17:10:27 2004
Subject: ANN: Managing XML Data
Message-ID: <000101be7687$6516ad00$fb1f03c6@durango.datachannel.com>

All,

I hope you'll forgive me for this announcement but I saw eXcelon mentioned
in a couple of messages and given that eXcelon 1.0 is now shipping, I
thought I'd send you directions for downloading it so that you can try it
yourself.

eXcelon is a high performance, highly scalable XML data server. It's used
to build enterprise XML Web applications and can be used with existing data
sources as a middle-tier application cache or in standalone mode as a
back-end data source. Key features include:

- Support for XML (eXcelon efficiently stores well-formed XML down to the
  element level without requiring prior knowledge of the document schema)
- Support for the DOM
- Support for XQL and structural and content indexes
- In-memory distributed XML database
- XML Update grammar to declaratively modify XML documents
- Comprehensive tool suite (including a visual XQL query builder and
  a DCD editor with code generator)

Unlike with the relational approach, eXcelon fully leverages XML flexibility
and extensibility by storing XML in its native format.

You can find the eXcelon evaluation version on the Object Design Web site
(http://www.objectdesign.com/excelon).

Please feel free to contact me if you need additional technical information
regarding eXcelon.

I hope you find this useful

Sorry for the intrusion

Regards

Vittorio




From jtauber at jtauber.com  Thu Mar 25 06:27:31 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:27 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>
Message-ID: <00de01be7688$622e8bc0$0300000a@cygnus.uwa.edu.au>

> I'm having trouble imagining how a CDATA section can have semantic meaning
> in all but the most abusive ways.  (Hmmm, there's a CDATA section.  Fire up
> the pizza delivery DLL.)  Could you give an example?  Thanks.

The different ways of expressing character data (literal, CDATA section,
character references) as well as other things like ignorable whitespace,
comments, even physical (ie entity) structure, etc are irrelevant for most
applications, but there is the odd application that wants to know about such
things. The standard example is an XML editor.

James




From jtauber at jtauber.com  Thu Mar 25 06:55:08 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:27 2004
Subject: XML Documents
References: <E7DDE4E10279D21182200060B06A95C09C2DA6@msgbos644nts.fmr.com>
Message-ID: <018201be768c$29c18540$0300000a@cygnus.uwa.edu.au>

> Hi,
> I have just started learning XML. Can you suggest good website names
> please

If you've just started, try http://www.xmlinfo.com/newcomers/ which has
links to the other sites I'd recommend.

James




From larsga at ifi.uio.no  Thu Mar 25 07:12:07 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:27 2004
Subject: XML Documents
In-Reply-To: <E7DDE4E10279D21182200060B06A95C09C2DA6@msgbos644nts.fmr.com>
References: <E7DDE4E10279D21182200060B06A95C09C2DA6@msgbos644nts.fmr.com>
Message-ID: <wkaex2f3gl.fsf@ifi.uio.no>


* Williams Arumainathan
|
| I have just started learning XML. Can you suggest good website names
| please

<URL: http://www.xmlinfo.com/>
<URL: http://www.oasis-open.org/cover/>

--Lars M.




From Ed at dega.com  Thu Mar 25 07:15:26 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:27 2004
Subject: Whence XQL?
Message-ID: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>

Sorry for the cross post.

I read most of those position papers as well. But the one by Jonathan Robie,
Texcel, Inc. Joe Lapp, webMethods, Inc. and David Schach, Microsoft
Corporation seemed the most complete. It even has a BNF for a parser for
XQL.

It occurred to me that someone might have taken that BNF and made it into
something by now. One assumes that MS or one of the other co-authors' companies
might be doing that in their skunk works, about to release something.

The papers described different syntactical forms. The XSLish one of XQL (MS)
seemed useful, especially in light of embedding it in CGI-like URLs.

I am just experimenting, but in the hope that this might become another 
xml-dev mini-project. 

So far I have the BNF translated to an ANTLR grammar. I had to fix one
infinitely recursive definition in the original file (filter). I also had to
decide which things were tokens and which were true productions. It is still
broken at this point because it generates many non-determinisms. Most of
these are due to the tendency to represent things like Text and NCName as
starting out with Letter and continuing through to Letters again via some
path. I'm going to have to research how to do this better. I'd like to
preserve the nomenclature of the article so everybody is operating with the
same documentation, which is at http://www.w3.org/TandS/QL/QL98/pp/xql.html
BTW.

The parser it generates works on a few items but not entirely. If anybody
shows any interest and has experience with ANTLR, I'll post it and we can
collaborate.

Ed


-----Original Message-----
From: Avi Rappoport [mailto:avirr@LanMinds.Com]
Sent: Wednesday, March 24, 1999 9:53 PM
To: Ed Howland; 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com'
Subject: Re: Whence XQL?


At 7:24 PM -0800 3/24/1999, Ed Howland wrote:
>Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
>hypeware about this and I understand that its just a suggestion for a
>proposal for a note for a draft for a recommendation. Whatever.
>I want my XQL!

There's a great article by Lisa Rein on the W3C Query Workshop late 
last year -- I'm sure it's still at XML.com.  There are links from 
the article to the position papers of the participants, and I found 
them fascinating and enlightening.  There are a *lot* of issues to be 
solved!

Avi

________________________________________________________________
Avi Rappoport, Search Tools Maven: <mailto:avirr@lanminds.com>
Guide to Site Indexing and Local Search Engines:
<http://www.searchtools.com>



From larsga at ifi.uio.no  Thu Mar 25 07:37:40 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:27 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <8725673C.007350D8.00@d53mta03h.boulder.ibm.com>
References: <8725673C.007350D8.00@d53mta03h.boulder.ibm.com>
Message-ID: <wk90cmf2ad.fsf@ifi.uio.no>


* Lars Marius Garshol
|
| Should we perhaps make standalone a boolean instead?  It can only have
| two values anyway, and this will spare us a lot of
| standalone.equals(this or that).

* roddey@us.ibm.com
| 
| I did that at first with my internal event APIs, but it didn't work
| out.  There is then no way of knowing whether the document *really*
| said yes or no, or whether it was just not there at all and the
| default was used. This prevents the recreation of the original
| document.

Given that this is supposed to be the handler for lexical information,
where this sort of thing does matter, I agree. It should be a string.
Don't know how I managed to overlook that.
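
To make the three states concrete, here is a small sketch.  It is not
SAX: it uses Python's expat binding, whose XML declaration callback
reports standalone as 1 (yes), 0 (no) or -1 (omitted).  The function
name standalone_flag is made up for illustration.

```python
import xml.parsers.expat

def standalone_flag(xml_text):
    """Return 1, 0 or -1 for the standalone declaration, or None if the
    document has no XML declaration at all."""
    seen = []
    parser = xml.parsers.expat.ParserCreate()
    # expat calls this once per XML declaration with three arguments.
    parser.XmlDeclHandler = (
        lambda version, encoding, standalone: seen.append(standalone)
    )
    parser.Parse(xml_text, True)
    return seen[0] if seen else None

print(standalone_flag('<?xml version="1.0" standalone="yes"?><a/>'))
print(standalone_flag('<?xml version="1.0" standalone="no"?><a/>'))
print(standalone_flag('<?xml version="1.0"?><a/>'))
```

A boolean would fold the second and third cases together, which is
exactly the information an editor needs to write the document back
unchanged.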

--Lars M.




From rxcharul at cs.twsu.edu  Thu Mar 25 08:05:18 1999
From: rxcharul at cs.twsu.edu (Madan Mohan Ranganath)
Date: Mon Jun  7 17:10:27 2004
Subject: XML quick question
In-Reply-To: <Pine.LNX.3.96.990324235932.10810B-100000@data.cs.twsu.edu>
Message-ID: <Pine.LNX.3.96.990325020413.11509A-100000@data.cs.twsu.edu>


Hello,
 
I am new to XML and just wrote my first program as shown below. I used
Internet Explorer 5.0 as the browser to read the XML document on
Windows-95 (I came to know that the browser supports reading XML
documents). But the problem I am facing is that the browser is printing
the entire document with the tags which it should not. Anyone please
inform me where I am going wrong. I gave ".xml" as the extension for the
XML file name.
 
<?xml version="1.0"?>
<greeting> Hello,World!</greeting>

Regards,
Madan,                                 




From tbray at textuality.com  Thu Mar 25 08:28:33 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:27 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>

At 09:00 AM 3/24/99 -0500, David Megginson wrote:
>By the same argument,
><p
>x="1">
>and 
><p x="1">
>are different...

David is right.  It's too late now, because DOM level 1 wrote 
CDATA sections into the spec so we're stuck with 'em - it's a
pity we didn't have the infoset back then. (I assume it won't
include them, right David?) -T.



From tbray at textuality.com  Thu Mar 25 08:28:57 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:27 2004
Subject: DOM CDATA vs Normalization
Message-ID: <3.0.32.19990324144714.00e6dd44@pop.intergate.bc.ca>

At 11:55 AM 3/24/99 -0500, Bill la Forge wrote:
>Normalization of an element combines various text objects into a single 
>text object. Does it then merge text and CDATA objects to a single object?
>And what about ignorable whitespace?

XML does not, repeat NOT, have anything such as "ignorable"
whitespace. -Tim



From tbray at textuality.com  Thu Mar 25 08:29:45 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:28 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandler
  draft v.1.1)
Message-ID: <3.0.32.19990324145555.00e6dd44@pop.intergate.bc.ca>

At 10:13 PM 3/24/99 +0100, Ronald Bourret wrote:
>I wasn't even going to reply, but then I remembered that the real question 
>here is whether SAX (not the DOM) should tell people about CDATA sections. 
> I think the answer is yes.  

The implication is that a parser that doesn't pass on word of CDATA
sections is a second-rate parser.  Hrummph.  Is this not a slippery-
slope that puts us on the road to reporting whether single or double
quotes were used for attribute values? -Tim



From Gareth.sbradley at adv.sonybpe.com  Thu Mar 25 09:01:47 1999
From: Gareth.sbradley at adv.sonybpe.com (Gareth Sylvester-Bradley)
Date: Mon Jun  7 17:10:28 2004
Subject: XML convertor generator
In-Reply-To: <wkzp5oe0m3.fsf@ifi.uio.no>
Message-ID: <000001be711d$7e9f4530$a32fc22b@carrion>

> >  | I'm currently working on an xml convertor-generator
> >  <snip>
> >  | Paul Janssens - paul.janssens@skynet.be
> <snip>
> In reply to the original post:  Paul I would be interested if you are
> making the source available and it is in Java.
>
> Steve Oldmeadow
> Justice Systems Technologies

Ditto if either Java or C++.

Cheers
-- Gareth SB




From Matthew.Sergeant at eml.ericsson.se  Thu Mar 25 09:06:25 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:28 2004
Subject: Whence XQL?
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1712@EUKBANT101>

> -----Original Message-----
> From:	Ed Howland [SMTP:Ed@dega.com]
> 
> Ok, so now it is April (or thereabouts) and still no XQL. I've read all
> the
> hypeware about this and I understand that its just a suggestion for a
> proposal for a note for a draft for a recommendation. Whatever.
> I want my XQL! 
> 
	I haven't followed the Java implementations very closely, but are
you saying that the perl implementation of XQL is the only one? Chalk one
up...

	Matt.



From Matthew.Sergeant at eml.ericsson.se  Thu Mar 25 09:21:30 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:28 2004
Subject: XML and (K)Office
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>

> -----Original Message-----
> From:	David Megginson [SMTP:david@megginson.com]
> 
> Daniel Veillard writes:
> 
>  >   Yep, Gnumeric uses XML (actually gzipped XML on disk) and uses
>  > namespaces.  I'm pretty sure it's incompatible, since when I coded
>  > the XML export I didn't know that KDE would do alike. There is a
>  > virtual cookie reward to the first sending me a good example of KDE
>  > spreadsheet XML file :-) . A clean DTD would be even better.
> 
> Doesn't the recipe for virtual cookies come with the Gnu Emacs
> distribution?
> 
> Anyway, let's get this right -- I think that it's healthy for both
> Gnumeric and the KOffice Spreadsheet program both to exist, but there
> is no excuse for them to use entirely incompatible formats.  As a
> matter of fact, if we could convince KDE and Gnome to use compatible
> XML formats for lots of things (like interface construction), the
> media's predictions of a Linux fracture will be proven to be hot air.
> 
	Although I agree to an extent, if they have different feature sets
it's pretty unlikely that you're going to get an entirely perfect agreement
on a spreadsheet DTD.

	However, that's the beauty of XML. Writing a converter from one
format to another is trivial in the extreme, so it's not a huge issue in my
(humble) opinion.

	Matt.




From rbourret at ito.tu-darmstadt.de  Thu Mar 25 09:49:13 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:28 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandlerdraft v.1.1)
Message-ID: <01BE76AC.EFC59B30@grappa.ito.tu-darmstadt.de>

Tim Bray wrote:

> At 10:13 PM 3/24/99 +0100, Ronald Bourret wrote:
> >I wasn't even going to reply, but then I remembered that the real question
> >here is whether SAX (not the DOM) should tell people about CDATA sections.
> > I think the answer is yes.
>
> The implication is that a parser that doesn't pass on word of CDATA
> sections is a second-rate parser.  Hrummph.  Is this not a slippery-
> slope that puts us on the road to reporting whether single or double
> quotes were used for attribute values? -Tim

Actually, the implication is that a parser that doesn't pass on word of 
CDATA sections is one that doesn't support LexicalHandler.  Maybe we should 
ask what kind of applications are likely to use LexicalHandler (mine 
certainly won't -- I just want the data).  The obvious groups are DOM 
builders and editors.  Preserving CDATA sections in editors is a nice thing 
to do -- I know that I would appreciate it as a user.  If LexicalHandler is 
aimed at a different audience, then somebody please say so.

As to single and double quotes, I'm quite happy to draw a line in the sand 
before we get to the road that leads to the brink of the slippery slope.

-- Ron Bourret




From paul.janssens at skynet.be  Thu Mar 25 09:59:50 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:28 2004
Subject: XML convertor generator
Message-ID: <36FA087E.161A@skynet.be>

In answer to questions:

Conversion is TO xml. The intention of the tool is to ease the
conversion of non-markup data to markup, be it once-and-for-all or
repeatedly (source code analysis.) Coded in C and bison for maximum
throughput. Source will be open (after some more cleanup).

As I said earlier, there's a lot of missing functionality at the moment
(Unicode, escaping, DTD generation, a DTD for the input format) but it
can already do simple stuff, like converting simple mathematical
expressions to MathML.
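
For readers who want a feel for that kind of conversion, here is a
rough hand-written sketch.  It is not the generator or its output: it
is a small recursive-descent converter in Python (the tool itself is C
and bison), it only knows "+", "*", parentheses, names and integers,
and the function name to_mathml is an assumption.

```python
import re

def to_mathml(text):
    """Convert a tiny infix expression to MathML presentation markup."""
    # Characters outside this small token set are silently skipped.
    tokens = re.findall(r"\d+|[A-Za-z]\w*|[+*()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def atom():
        tok = take()
        if tok == "(":
            inner = expr()
            take()  # consume the closing ")"
            return "<mo>(</mo>" + inner + "<mo>)</mo>"
        tag = "mn" if tok.isdigit() else "mi"
        return "<%s>%s</%s>" % (tag, tok, tag)

    def term():
        parts = [atom()]
        while peek() == "*":
            take()
            parts.append("<mo>*</mo>" + atom())
        return "".join(parts)

    def expr():
        parts = [term()]
        while peek() == "+":
            take()
            parts.append("<mo>+</mo>" + term())
        return "<mrow>" + "".join(parts) + "</mrow>"

    return "<math>" + expr() + "</math>"

print(to_mathml("a + 2*b"))
```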

It makes sense to have some semantic additions during the conversion
(if you're converting a programming language, it would be nice for
variable or procedure references to have an IDREF to the variable or
procedure declaration, because it adds validation), but a lot of this
stuff can be done as postprocessing, transforming XML to XML, which is
more elegantly done in Scheme-like languages.


Paul Janssens - paul.janssens@skynet.be



From larsga at ifi.uio.no  Thu Mar 25 10:01:51 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <14068.24150.843634.988657@localhost.localdomain>
References: <14068.24150.843634.988657@localhost.localdomain>
Message-ID: <wk677pga60.fsf@ifi.uio.no>


* David Megginson
| 
|     public abstract void startCDATA ()
| 	throws SAXException;
| 
|     public abstract void endCDATA ()
| 	throws SAXException;

This implies that the parser reports the contents of CDATA sections as
separate DocumentHandler.characters events, which is of course the
most natural way to implement things anyway.

However, the 1999-03-12 list of core features contains this:

  http://xml.org/sax/features/normalize-text
    Ensure that all consecutive text is returned in a single callback to
    DocumentHandler.characters or DocumentHandler.ignorableWhitespace
    (true) or explicitly do not require it (false).


This is potentially problematic, since it's unspecified what the
parser should do about CDATA sections in this case. (I suspect we will
see more problems of this kind when we really start using and
stacking filters.) Should they be normalized, or should they be
reported separately? (I.e., what is consecutive text, exactly?) The same
problem appears with entity boundaries and character references.

I assume most users of normalize-text will want consecutive text to be
interpreted in the logical view of the document, rather than the
lexical view. Otherwise the DocumentHandler will receive different
events in these two cases:

  <desc>
  A problematic case.
  </desc>

and

  <desc>
  A <![CDATA[problematic]]> case.
  </desc>

which is rather fragile, and this behaviour should be avoided, IMHO.
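
The difference between the two interpretations can be sketched as
follows.  This is not a SAX driver: it uses Python's expat binding, and
the function name report, the event labels, and the merging rule are
assumptions chosen to mimic the logical and lexical views.

```python
import xml.parsers.expat

def report(xml_text, lexical):
    """Collect parse events, merging adjacent character data into one event.

    With lexical=True, CDATA boundaries are reported too, so they break
    up what would otherwise be one run of consecutive text.
    """
    events = []

    def characters(data):
        if events and events[-1][0] == "characters":
            events[-1] = ("characters", events[-1][1] + data)
        else:
            events.append(("characters", data))

    parser = xml.parsers.expat.ParserCreate()
    parser.CharacterDataHandler = characters
    if lexical:
        parser.StartCdataSectionHandler = lambda: events.append(("startCDATA",))
        parser.EndCdataSectionHandler = lambda: events.append(("endCDATA",))
    parser.Parse(xml_text, True)
    return events

plain = "<desc>A problematic case.</desc>"
marked = "<desc>A <![CDATA[problematic]]> case.</desc>"

print(report(plain, lexical=False))
print(report(marked, lexical=False))
print(report(marked, lexical=True))
```

In the logical view both documents yield one identical text event; in
the lexical view the second document yields three text events separated
by the CDATA markers.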


So basically the problem is that normalize-text and LexicalHandler
don't go well together. You can have one, but not both at the same
time, unless the driver changes its behaviour. In other words, this
seems to require the driver to have explicit knowledge about
normalize-text.
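
To make the logical view concrete, here is a minimal Java sketch. It
assumes the SAX2 API as it later shipped with the JDK
(org.xml.sax.ext.DefaultHandler2 and the lexical-handler property), not
the draft quoted above, and the class and method names are hypothetical.
The handler concatenates every characters() chunk and merely counts
CDATA sections, so both <desc> examples should yield the same text while
the CDATA boundary stays visible to code that asks for it.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class LogicalTextDemo {

    /** Collects character data and counts the CDATA sections reported. */
    static final class Collector extends DefaultHandler2 {
        final StringBuilder text = new StringBuilder();
        int cdataSections = 0;

        @Override
        public void characters(char[] ch, int start, int length) {
            // A parser may split one run of text into several calls,
            // so append each chunk instead of assuming a single callback.
            text.append(ch, start, length);
        }

        @Override
        public void startCDATA() {
            // Lexical event: record it, but do not let it split the text.
            cdataSections++;
        }
    }

    /** Parses the given XML and returns the populated collector. */
    static Collector parse(String xml) {
        try {
            Collector collector = new Collector();
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(collector);
            // Standard SAX2 property id for registering a LexicalHandler.
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", collector);
            reader.parse(new InputSource(new StringReader(xml)));
            return collector;
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        Collector plain = parse("<desc>\nA problematic case.\n</desc>");
        Collector marked =
            parse("<desc>\nA <![CDATA[problematic]]> case.\n</desc>");
        System.out.println("same text: "
            + plain.text.toString().equals(marked.text.toString()));
        System.out.println("CDATA sections: "
            + plain.cdataSections + " vs " + marked.cdataSections);
    }
}
```

Buffering in the handler like this sidesteps the question of what the
driver does, which is one reason the interaction only becomes a real
problem when the driver itself is asked to normalize.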

Possible solutions:

 - reject normalize-text true if a LexicalHandler has been registered,
 and reject LexicalHandler registration if normalize-text has been set
 to true
 - make normalize-text have a logical interpretation by default, and
 switch to lexical if a LexicalHandler has been registered
 - make normalize-text always have a lexical interpretation
 - have separate normalize-text-logical and normalize-text-lexical
 events, with reject-behaviour for the first

Thoughts?

--Lars M.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From david.hitch at dial.pipex.com  Thu Mar 25 10:37:08 1999
From: david.hitch at dial.pipex.com (David Hitchcock)
Date: Mon Jun  7 17:10:28 2004
Subject: XML Documents
Message-ID: <01be76a1$e04800e0$0100007f@ketlux03>

Hi Williams

You could try El.pub - interactive publishing news and resources, at:
http://www.pira.co.uk/IE, particularly the standards section:
http://www.pira.co.uk/IE/top011.htm - follow the links -
and the products section at:
http://www.pira.co.uk/IE/base09.htm#SGML

Also available is a free Weekly newsletter update service for the site
sign-up (email only required) on the welcome page:
http://www.pira.co.uk/IE

Incidentally, list members may be interested to know that the European
Commission launched a Call for Project Proposals under the new 5th
Framework R&D Programme (FP5) on 19 March, 1999. Accepted R&D
projects can qualify for 50% funding by the EU. The message I got clearly
from attending FP5 presentations was that "cross-pond" collaboration
is seen as positive. More information, including links to official sources
available at: http://www.pira.co.uk/IE - see FP5 heading on top right.
I do, of course, only speak for myself here.

Best

---> David

*********************************
David Hitchcock
Logical Events Ltd.
tel:   +44/ (0)181 255 7084
       +44/ (0)181 255 7085
email: david.hitch@dial.pipex.com
web:   http://www.pira.co.uk/IE
*********************************

-----Original Message-----
From: Arumainathan, Williams <Williams.Arumainathan@fmr.com>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Wednesday, March 24, 1999 11:49 PM
Subject: XML Documents

>Hi,
>I have just started learning XML. Can you suggest good website names
>please
>
>Thank you,
>Williams.
>




From david at megginson.com  Thu Mar 25 11:27:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <00de01be7688$622e8bc0$0300000a@cygnus.uwa.edu.au>
References: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>
	<00de01be7688$622e8bc0$0300000a@cygnus.uwa.edu.au>
Message-ID: <14074.6976.284974.722431@localhost.localdomain>

James Tauber writes:

 > The different ways of expressing character data (literal, CDATA
 > section, character references) as well as other things like
 > ignorable whitespace, comments, even physical (ie entity)
 > structure, etc are irrelevant for most applications, but there is
 > the odd application that wants to know about such things. The
 > standard example is an XML editor.

Right, but the fact that *someone* wants something shouldn't
automatically lead to its inclusion in standards.

Standards benefit from the network effect -- their usefulness is
proportional to the square of the number of users -- so there must be
a large potential number of users to justify the extra cost of
developing, publishing, documenting, implementing, and maintaining a
standard.  If we're talking about, say, five or ten potential users,
the network effect just isn't all that exciting.

Standards also grow easily but shrink with difficulty: if in v.1 you
leave out a feature that turns out to be necessary, it is usually not
difficult to include the feature in v.2 once the need for it has been
proven in real use; if in v.1 you include a feature that turns out not
to be necessary (e.g. notations and unparsed entities in XML), then
it sticks to future versions of the spec like gum in your hair.

In any case, please remember that I am not actually proposing removing 
CDATA boundaries from LexicalHandler -- I do want to support the DOM.
I'm just whining and/or drawing lessons.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 11:33:54 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: Whence XQL?
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
Message-ID: <14074.7638.522376.583407@localhost.localdomain>

I missed the start of this thread.  Did the poster really want to know
where XQL came from (whence), or was the poster interested in where
it's going (whither)?

Since SHAKESPEARE IN LOVE swept the Oscars, I expect people to get
their 16th-century English usage right.


Pedanticly yours,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 11:38:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>
References: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>
Message-ID: <14074.8015.743294.916312@localhost.localdomain>

Tim Bray writes:

 > David is right.  It's too late now, because DOM level 1 wrote 
 > CDATA sections into the spec so we're stuck with 'em - it's a
 > pity we didn't have the infoset back then. (I assume it won't
 > include them, right David?) -T.

DOM 1.0 is a REC, and the beast must be fed.  As our published RD
mentions, we're aiming for DOM 1.0 compatibility, but the Infoset will
at least be able to distinguish what is required from what is optional
(the DOM has been deliberately silent on that point, keeping the faith
that some day there would be an Infoset).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 11:40:20 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: DOM CDATA vs Normalization
In-Reply-To: <3.0.32.19990324144714.00e6dd44@pop.intergate.bc.ca>
References: <3.0.32.19990324144714.00e6dd44@pop.intergate.bc.ca>
Message-ID: <14074.8279.343988.805611@localhost.localdomain>

Tim Bray writes:

 > At 11:55 AM 3/24/99 -0500, Bill la Forge wrote:
 > >Normalization of an element combines various text objects into a single 
 > >text object. Does it then merge text and CDATA objects to a single object?
 > >And what about ignorable whitespace?
 > 
 > XML does not repeat NOT have anything such as "ignorable" 
 > whitespace. -Tim

Tim's right -- SAX's terminology has thrown everything off.  Something
like "flagged whitespace" would have been better, but it would also
have reminded people of the Seinfeld episode where George took the art
book into the washroom...


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 11:55:35 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: XML and (K)Office
In-Reply-To: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>
References: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>
Message-ID: <14074.8435.653789.348824@localhost.localdomain>

Matthew Sergeant (EML) writes:

[David]

 > > Anyway, let's get this right -- I think that it's healthy for
 > > both Gnumeric and the KOffice Spreadsheet program both to exist,
 > > but there is no excuse for them to use entirely incompatible
 > > formats.  As a matter of fact, if we could convince KDE and Gnome
 > > to use compatible XML formats for lots of things (like interface
 > > construction), the media's predictions of a Linux fracture will
 > > be proven to be hot air.

[Matt]

 > Although I agree to an extent, if they have different feature sets
 > it's pretty unlikely that you're going to get an entirely perfect
 > agreement on a spreadsheet DTD.

I disagree *very* strongly -- with Namespaces, we can design a common
format for the 90% of functionality that the two spreadsheets actually
have in common (text cells, data cells, basic formulas, general
formatting information [font, alignment, colour, size], etc.)  and
then allow each to provide extended information
unambiguously-delimited through the use of separate namespaces.

The more material in the common spec, the better interoperability.
Linux needs to set an example here.

 > However, that's the beauty of XML. Writing a converter from one
 > format to another is trivial in the extreme, so it's not a huge
 > issue in my (humble) opinion.

For n XML-based formats, we need (n * (n - 1)) converters.  If there
are only two different XML-based spreadsheet formats, then we need
only two converters:

 a => b
 b => a

If there are three XML-based different formats, then we need six
converters:

 a => b
 a => c
 b => a
 b => c
 c => a
 c => b

If there are four different XML-based formats, then we need twelve
converters:

 a => b
 a => c
 a => d
 b => a
 b => c
 b => d
 c => a
 c => b
 c => d
 d => a
 d => b
 d => c

Add a couple more, and the problem definitely isn't easy by any
definition.  Ten different XML-based formats requires 90 converters,
and a change to only one of the formats will require changes to (2 *
(n - 1)), or 18 converters!
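
The arithmetic above can be checked with a few lines of Java. This is
only an illustration; the class and method names are made up.

```java
final class ConverterCount {

    /** One-way converters needed so each of n formats converts to every other. */
    static long convertersNeeded(int n) {
        return (long) n * (n - 1);
    }

    /**
     * Converters touched when one of n formats changes:
     * (n - 1) that read it plus (n - 1) that write it.
     */
    static long convertersAffected(int n) {
        return 2L * (n - 1);
    }

    public static void main(String[] args) {
        for (int n : new int[] {2, 3, 4, 10}) {
            System.out.println(n + " formats: "
                + convertersNeeded(n) + " converters, "
                + convertersAffected(n) + " affected by one format change");
        }
    }
}
```

With a single shared format the count drops to two converters per
application (to and from the common format), which is the case for
putting as much as possible into the common spec.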


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From costello at mitre.org  Thu Mar 25 11:58:44 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:10:28 2004
Subject: Whence XQL?
References: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.dega.com>
Message-ID: <36FA24CE.C825D232@mitre.org>

Have you looked at XML-QL?  I have been playing around with this XML
query tool for a few weeks.  It's quite nice.  It allows you to specify
the grammar of extracted data, query multiple XML documents, etc.  See:
<http://www.research.att.com/sw/tools/xmlql>  /Roger

Ed Howland wrote:
> 
> Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
> hypeware about this and I understand that it's just a suggestion for a
> proposal for a note for a draft for a recommendation. Whatever.
> I want my XQL!
> 
> Seriously, all ranting aside, I haven't seen any talk here or in XSL
> listserv land about it yet (recently). The proposal seems complete enough to
> me for someone to have at least announced a beta implementation of it. I'd
> be happy with mostly unfinished code if it were written in Java.
> 
> So does anybody have a clue about this? I know about XSL so please don't
> send me down that path. I also know about the Datachannel attempt.
> 
> If not, I'm tempted to write something myself. I'm not sure about a couple
> of things because it's still just off the top of my head so far.
> 
> OK, it's in Java. It uses some free XML parser, probably XML4J because it's
> the one I'm most familiar with. The XQL syntax parser will be written in
> ANTLR, since it outputs nice O-O Java classes. The result set of XQL is well
> formed XML. This can be handled easily by XML4J's ability of any node in any
> tree (or transformed sub-tree) to print itself in XML to any stream. XML4J
> has a nice getNodesByName() method that can operate at any level of the tree
> returning a NodeList of siblings with that tag name. Wrapping a result tree
> in <xql:result></xql:result> and iterating the NodeList gets you the
> simplest query.
> 
> Internally the result set is just another DOM tree so you should be able to
> add the .jar file to your Java app and thus satisfy that type of XQL result.
> The input can be done in a variety of ways. I assume that the Perl module
> XML::XQL can be used in a CGI context to extract the XQL query, execute it
> and return either XML or XSL transformed output to the calling app(browser.)
> Likewise, a Java servlet could do the same thing.
> 
> Cons: XML4J doesn't yet handle PIs, so it's maybe not the overall best
> solution. (I may be wrong about this, IBM uploaded a new major release that
> may have fixed it.) It's just the one I'm comfortable with, at the moment. On
> my hard drive are XML parsers from Sun, Microsoft, Oracle, James Clark and
> one or two others I haven't had time to play with yet.
> 
> I don't care about efficiency or optimization. All partially created result
> sets will live in memory till they are ready to be output. I also don't care
> about searching multiple files, although that should be relatively easy to
> add. (I'm still confused about XML repositories. Would XQL have to understand
> directory paths? Does XQL need to be able to follow XLinks?)
> 
> I'm leaving out sequences but I may add them in (much) later. Return values
> (analog to SQL's SELECT) are important to my application, as are
> conditionals.
> 
> Unless someone warns me that I'm clueless (which is usually the case,) I'll
> post a cut of the ANTLR grammar as soon as I get a working one. I'll
> probably put it on my web site.
> 
> Ed
> 
> Ed Howland
> ed@dega.com
> http://www.dega.com
> "As your attorney, I advise you to take some adrenalchrome"
> 




From david at megginson.com  Thu Mar 25 12:01:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:28 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <wk677pga60.fsf@ifi.uio.no>
References: <14068.24150.843634.988657@localhost.localdomain>
	<wk677pga60.fsf@ifi.uio.no>
Message-ID: <14074.9415.985411.394383@localhost.localdomain>

Lars Marius Garshol writes:

 >   http://xml.org/sax/features/normalize-text
 >     Ensure that all consecutive text is returned in a single callback to
 >     DocumentHandler.characters or DocumentHandler.ignorableWhitespace
 >     (true) or explicitly do not require it (false).
 > 
 > 
 > This is potentially problematic, since it's unspecified what the
 > parser should do about CDATA sections in this case. (I suspect we will
 > see more problems of this kind when we really start using and
 > stacking filters.) Should they be normalized, or should they be
 > reported separately? (I.e., what is consecutive text, exactly?) The same
 > problem appears with entity boundaries and character references.

Thanks, Lars -- this is an excellent point.  I think that the
specification belongs, not with the normalize-text feature, but with
the LexicalHandler (since people may define other types of handlers
that we cannot predict).

 > Possible solutions:
 > 
 >  - reject normalize-text true if a LexicalHandler has been registered,
 >  and reject LexicalHandler registration if normalize-text has been set
 >  to true
 >  - make normalize-text have a logical interpretation by default, and
 >  switch to lexical if a LexicalHandler has been registered
 >  - make normalize-text always have a lexical interpretation
 >  - have separate normalize-text-logical and normalize-text-lexical
 >  events, with reject-behaviour for the first

The DOM's text-normalisation feature does *not* normalise CDATA
sections, but I think that SAX's should.
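
A small Java sketch of the DOM behaviour described here. It assumes the
JAXP DOM parser bundled with the JDK, which postdates this message, and
the class and method names are hypothetical. Node.normalize() merges
adjacent Text nodes but is not expected to fold a CDATASection into its
neighbours, whereas asking the parser to coalesce CDATA at build time
should leave a single Text node.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class CdataNormalizeDemo {

    static final String XML = "<desc>A <![CDATA[problematic]]> case.</desc>";

    /**
     * Parses XML, normalizes the root element, and returns how many
     * child nodes it has afterwards.
     */
    static int childCountAfterNormalize(boolean coalescing) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // true: the parser turns CDATA sections into ordinary text.
            factory.setCoalescing(coalescing);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(XML)));
            Element root = doc.getDocumentElement();
            root.normalize(); // merges adjacent Text nodes only
            return root.getChildNodes().getLength();
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("CDATA kept:      "
            + childCountAfterNormalize(false) + " child node(s)");
        System.out.println("CDATA coalesced: "
            + childCountAfterNormalize(true) + " child node(s)");
    }
}
```

On that reading the first call sees Text, CDATASection, Text, and the
second sees one Text node, which is the difference between the lexical
and the logical interpretation of "consecutive text".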


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Tim.Shaw at wdr.com  Thu Mar 25 12:04:02 1999
From: Tim.Shaw at wdr.com (Tim.Shaw@wdr.com)
Date: Mon Jun  7 17:10:29 2004
Subject: Whence XQL?
In-Reply-To: <14074.7638.522376.583407@localhost.localdomain>
Message-ID: <H0000586018203e8@MHS>

     
Shouldn't that be 'pedantically' :^)

tim

______________________________ Reply Separator _________________________________
Subject: RE: Whence XQL?
Author:  david (david@megginson.com) at unix,mime
Date:    25/03/99 11:34


I missed the start of this thread.  Did the poster really want to know 
where XQL came from (whence), or was the poster interested in where 
it's going (whither)?
     
Since SHAKESPEARE IN LOVE swept the Oscars, I expect people to get 
their 16th-century English usage right.
     
     
Pedanticly yours,
     
     
David
     



From ricker at xmls.com  Thu Mar 25 12:57:04 1999
From: ricker at xmls.com (Jeffrey Ricker)
Date: Mon Jun  7 17:10:29 2004
Subject: XML convertor generator
In-Reply-To: <36F9B224.CE017C45@manhattanproject.com>
References: <4.1.19990325091311.00ba01c0@steptwo.com.au>
Message-ID: <199903251256.HAA27933@mail.his.com>

Is this the sort of thing you are talking about?

http://www.xmls.com/news/exeter.html

At 03:48 AM 3/25/99 +0000, Clark Evans wrote:
>James Robertson wrote:
>| At 03:08 25/03/1999 , JPA wrote:
>| | Hello,
>| |
>| | I'm currently working on an xml convertor-generator. When finished, the
>| | tool will, if you take the bother to type the structure of your input
>| | format and mappings on entities and attributes, construct a convertor.
>| | There's no documentation as yet, and some stuff missing (escaping, for
>| | one thing), but if there's enough interest I'll put it on a website as is.
>| |
>| | Paul Janssens - paul.janssens@skynet.be
>| 
>| Paul,
>| 
>| Not wishing to rain on your parade, but aren't
>| you re-inventing the wheel here?
>
>Actually, a program which created an efficient
>program to convert XML conforming to a specific
>DTD to another product would be a very cool 
>invention, very different from using Perl 
>and/or Omnimark.
>
>I have Omnimark programs which take a great
>deal of processing power (I'd hate to see the 
>Perl equivalent).  Cutting it in half with a 
>program that generated a program would be 
>very cool indeed.   What kind of 'efficiencies'
>do you get when you remove the interpreted layer?
>
>I'm reading this that you are more or less
>doing a YACC thing?  Is this a correct
>interpretation?  Will it do SGML?
>(I guess I can run it through nsgmls 
>to make the XML equivalent first.)
>Is it open source?  Hopefully it 
>will generate C code (for speed).
>
>Clark Evans
>




From Matthew.Sergeant at eml.ericsson.se  Thu Mar 25 13:01:56 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:29 2004
Subject: Whence XQL?
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1719@EUKBANT101>

My problem with XML-QL was their use of tag minimisation (their proprietary
</> syntax) means you can't parse XML-QL with an XML parser. That's foolish
IMHO - if you're practically using XML already, why not reap the benefits?

Anyway, there's an implementation of XML-QL in my directory on CPAN for perl
users, which needs fixing up a little bit, but it's quite usable (if a
little slow). It facilitates the use of perl's regexp syntax for queries as
well as the system used by XML-QL, which makes it nice and powerful...

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V 
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

> -----Original Message-----
> From:	Roger L. Costello [SMTP:costello@mitre.org]
> Sent:	Thursday, March 25, 1999 11:58 AM
> To:	Ed Howland
> Cc:	'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com'
> Subject:	Re: Whence XQL?
> 
> Have you looked at XML-QL?  I have been playing around with this XML
> query tool for a few weeks.  It's quite nice.  It allows you to specify
> the grammar of extracted data, query multiple XML documents, etc.  See:
> <http://www.research.att.com/sw/tools/xmlql>  /Roger
> 
> 



From david at megginson.com  Thu Mar 25 13:20:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:29 2004
Subject: Whence XQL?
In-Reply-To: <H0000586018203e8@MHS>
References: <14074.7638.522376.583407@localhost.localdomain>
	<H0000586018203e8@MHS>
Message-ID: <14074.14255.762607.581285@localhost.localdomain>

[offline]

Tim.Shaw@wdr.com writes:

 > Shouldn't that be 'pedantically' :^)

Sixteenth-century spelling.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 13:31:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:29 2004
Subject: Megginson's Spelling
Message-ID: <14074.14586.743793.79875@localhost.localdomain>

To all those kind people who pointed out my alleged misspelling of
'pedanticly':

1. I've never been a competent speller (I was warned as an
   undergraduate that if I studied Medieval or Renaissance English
   with the original orthography I'd never be able to spell Modern
   English again).

2. I plan to claim that I was using a sixteenth-century spelling.  I
   haven't actually found such a spelling used in the sixteenth
   century -- the earliest recorded usage of the word is from
   Brathwait in 1631, and he is already using the nouveau
   'pedantically' spelling -- but I'll keep looking.

3. As Donne wrote (cited in the OED), "Busie old foole, unruly
   sunne, ... Sawcy pedantique wretch, goe chide Late schooleboyes"
   (see what I mean about spelling?).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From gtn at eps.inso.com  Thu Mar 25 13:34:02 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:29 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <199903231453.JAA00219@ruby.ora.com>
Message-ID: <001601be76c3$e3d91e70$0100007f@eps.inso.com>

> Describing files in encodings other than US-ASCII or ISO 8859-1 (or
> maybe other ISO 8859s) as text/anything is not a very good idea.  The
> rules for text/* allow many unhealthy things; 8-bit data is not even a
> safe assumption, and line-end normalization can be a killer.  The
> fallback rules for MIME's two-level hierarchy is only the final straw;
> for non-European encodings, I would use application/xml.

HTTP specifically ignores some things required by MIME, so the
above is only an issue in mail.




From gtn at eps.inso.com  Thu Mar 25 13:34:37 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:29 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <001901be7616$8e4caba0$c8a8a8c0@thing1>
Message-ID: <001401be76c3$e0fd27a0$0100007f@eps.inso.com>

> From: Gavin Thomas Nicol <gtn@eps.inso.com>
> >CDATA sections *are* different from normal text, even if only 
> >because the author used them.
> 
> Again, is anyone aware of why CDATA is preserved by the DOM?
> What was the reasoning behind this decision? Other things, like
> whitespace within an element tag or even attribute order, are 
> not preserved. Why then was CDATA? 

Because whitespace within elements is not significant markup, nor
is attribute ordering (though we did have a number of debates over
whether attribute ordering information should be available).

Unlike these, CDATA is *explicit* markup. For many purposes, you
don't need to know about it, but you cannot simply remove it,
because you cannot know why an author put it there. Removing
CDATA would fail the test of least surprise.

Speaking of which, I am continually surprised by SAX's lack of
comment interfaces....




From gtn at eps.inso.com  Thu Mar 25 13:35:21 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:29 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <36F755CB.C996CE2D@w3.org>
Message-ID: <001501be76c3$e22d4330$0100007f@eps.inso.com>

> So, in consequence: example file such as the Chinese XML examples at
> http://xml.ascc.net/xml/test/index.html (where each example
> is available in UTF-8, Big5 and GB2312, all correctly labelled in the XML
> encoding declaration) are now sets of invalid XML files which are required to
> produce a critical error because of the invalid byte sequences in what
> is now described as a US-ASCII file?
>
> This is deeply counterproductive, and could have been avoided.

No. Servers should be configured to label the document correctly. The
HTTP 1.1 specification clearly states that any document using an encoding
other than ISO 8859-1 should have a correct, corresponding charset
parameter. As such, for the documents above, using HTTP, you would have
a protocol error.




From costello at mitre.org  Thu Mar 25 13:47:28 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:10:29 2004
Subject: XML-QL (was Re: Whence XQL?)
References: <5F052F2A01FBD11184F00008C7A4A800022A1719@EUKBANT101>
Message-ID: <36FA3E97.38E67BC1@mitre.org>

Matthew Sergeant (EML) wrote:
> 
> My problem with XML-QL was their use of tag minimisation (their proprietary
> </> syntax) means you can't parse XML-QL with an XML parser. That's foolish
> IMHO - if you're practically using XML already, why not reap the benefits?

Hi Matt,

Not sure that you could do all the things that XML-QL allows you to do
if you stick to the XML syntax.  For example, query the following XML
document for all part names:

<?xml version="1.0"?>
<!DOCTYPE Parts [
<!ELEMENT Parts (part+)>
<!ELEMENT part (name, brand, part*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT brand (#PCDATA)>
]>
<Parts>
        <part>
                <name>Green Power Juicer</name>
                <brand>Green Power</brand>
        </part>
        <part>
                <name>Toyota Tercel</name>
                <brand>Toyota</brand>
                <part>
                     <name>Sony Stereo X11-3</name>
                     <brand>Sony</brand>
                </part>
        </part>
</Parts>

Note the recursive definition of the part element.  Thus, the part name
can be at any nesting level.  Here's how to do it using XML-QL:

function AllPartNamesQuery () {

// Source: Parts.xml
// Find the names of all the parts

construct  <name>$name</name>
where      <Parts>
               <part*>
                   <name>$name</name>
               </>
           </Parts> IN "Parts.xml"
}

How would you do this using XML syntax?  /Roger
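
For comparison only (this is neither XML-QL nor an XML-syntax query
language): the same any-depth lookup can be sketched in Java with an
XPath expression. The class name and the trimmed copy of Parts.xml
(without its DTD) are assumptions made for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class PartNamesSketch {
    // Trimmed copy of the Parts document above, DTD omitted for brevity.
    static final String PARTS_XML =
        "<Parts>"
        + "<part><name>Green Power Juicer</name><brand>Green Power</brand></part>"
        + "<part><name>Toyota Tercel</name><brand>Toyota</brand>"
        + "<part><name>Sony Stereo X11-3</name><brand>Sony</brand></part>"
        + "</part></Parts>";

    /** Returns the text of every name element whose parent is a part, at any depth. */
    static List<String> allPartNames(String xml) {
        try {
            NodeList names = (NodeList) XPathFactory.newInstance().newXPath().evaluate(
                "//part/name", new InputSource(new StringReader(xml)), XPathConstants.NODESET);
            List<String> result = new ArrayList<>();
            for (int i = 0; i < names.getLength(); i++) {
                result.add(names.item(i).getTextContent());
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The "//" step plays the role of <part*> in the XML-QL query: it matches
part elements however deeply they are nested.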


> 
> Anyway, there's an implementation of XML-QL in my directory on CPAN for perl
> users, which needs fixing up a little bit, but it's quite usable (if a
> little slow). It facilitates the use of perl's regexp syntax for queries as
> well as the system used by XML-QL, which makes it nice and powerful...
> 
> Matt.
> --
> http://come.to/fastnet
> Perl on Win32, PerlScript, ASP, Database, XML
> GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
> !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++
> 
> > -----Original Message-----
> > From: Roger L. Costello [SMTP:costello@mitre.org]
> > Sent: Thursday, March 25, 1999 11:58 AM
> > To:   Ed Howland
> > Cc:   'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com'
> > Subject:      Re: Whence XQL?
> >
> > Have you looked at XML-QL?  I have been playing around with this XML
> > query tool for a few weeks.  It's quite nice.  It allows you to specify
> > the grammar of extracted data, query multiple XML documents, etc.  See:
> > <http://www.research.att.com/sw/tools/xmlql>  /Roger
> >
> >




From david at megginson.com  Thu Mar 25 14:17:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:29 2004
Subject: SAX2: AttributeList2 and EntityRefList
Message-ID: <14074.16928.163619.681099@localhost.localdomain>

While we're polishing the details of LexicalHandler (which may yet
become DocumentHandler2 -- I'm still listening to arguments both
ways), I'd like to propose two new SAX2 support interfaces.


EntityRefList
-------------

This first interface is designed to work around a *very* nasty problem 
with XML 1.0 conformance, and at the same time, to enable the tracking 
of entity references in attribute values for the few masochists who
care.  As John Cowan has pointed out, the XML 1.0 REC requires that
processors report unexpanded entity references, and presumably that
applies to references in attribute values as well as elsewhere; as a
result, it is impossible to treat an XML attribute value simply as a
string.

On the other hand, almost nobody will ever need this, so it's not
worth complicating the interface much for parser writers or for
application writers.

So, after some thought, here's what I came up with.  This is a special 
interface providing indexes to zero or more entity references in a
literal string (i.e. an attribute value).  The indices are based on
whatever array indices the programming language is using, exclusive of 
Unicode problems with combining characters, etc. (i.e. any
normalisation must already have taken place).

====================8<====================8<====================
// EntityRefList.java - list entity references in an attribute value.

package org.xml.sax;

public interface EntityRefList
{
    public int getLength ();
    public String getEntityName (int i);
    public int getEntityRefStart (int i);
    public int getEntityRefEnd (int i);
}

// end of EntityRefList.java
====================8<====================8<====================

The nice thing is that this lives outside of the String representing
the attribute value, so almost everyone can ignore it, and there
should be no performance hit.  It also provides nice
backwards-compatibility with SAX 1.0.
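
A minimal sketch of how an application might consume such a list. Two
things here are assumptions, since the draft does not pin them down: the
end index is treated as exclusive (as in String.substring), and the
attribute string is taken to contain the literal reference text. The
SimpleEntityRefList class and the referencedSlices helper are
hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class EntityRefListSketch {
    /** The draft interface from this message, copied without its package. */
    interface EntityRefList {
        int getLength();
        String getEntityName(int i);
        int getEntityRefStart(int i);
        int getEntityRefEnd(int i);
    }

    /** Hypothetical list-backed implementation a parser could fill in. */
    static final class SimpleEntityRefList implements EntityRefList {
        private final List<String> names = new ArrayList<>();
        private final List<int[]> spans = new ArrayList<>();

        void add(String name, int start, int end) {
            names.add(name);
            spans.add(new int[] {start, end});
        }

        public int getLength() { return names.size(); }
        public String getEntityName(int i) { return names.get(i); }
        public int getEntityRefStart(int i) { return spans.get(i)[0]; }
        public int getEntityRefEnd(int i) { return spans.get(i)[1]; }
    }

    /** Returns the part of the value covered by each reference (end assumed exclusive). */
    static List<String> referencedSlices(String value, EntityRefList refs) {
        List<String> slices = new ArrayList<>();
        for (int i = 0; i < refs.getLength(); i++) {
            slices.add(value.substring(refs.getEntityRefStart(i), refs.getEntityRefEnd(i)));
        }
        return slices;
    }
}
```

Code that only wants the attribute value keeps using the String and never
touches the list.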


AttributeList2
--------------

Here's what I've come up with for lexical attribute information in
SAX2:

====================8<====================8<====================
// AttributeList2.java - SAX2 extensions for an attribute list

package org.xml.sax;

public interface AttributeList2 extends AttributeList
{
    public boolean isSpecified (int index);
    public boolean isSpecified (String name);
    public EntityRefList getEntityRefList (int index);
    public EntityRefList getEntityRefList (String name);
}

// end of AttributeList2.java
====================8<====================8<====================

This, together with the DTDDeclHandler interface I'll be describing in 
a separate posting, should provide enough information for full DOM
level one core attribute support.
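
To show how isSpecified might be used, here is a sketch built on the
SAX 1.0 AttributeList and AttributeListImpl classes (both real, though
deprecated in current JDKs). Everything else is an assumption for
illustration: the SimpleAttributeList2 class, its add method, the
specifiedNames helper, and the choice to return null when an attribute
has no entity references.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.AttributeList;
import org.xml.sax.helpers.AttributeListImpl;

@SuppressWarnings("deprecation") // SAX1 AttributeList is deprecated in current JDKs
final class AttributeList2Sketch {
    /** Draft interface from the EntityRefList section, copied without its package. */
    interface EntityRefList {
        int getLength();
        String getEntityName(int i);
        int getEntityRefStart(int i);
        int getEntityRefEnd(int i);
    }

    /** Draft interface from this message, copied without its package. */
    interface AttributeList2 extends AttributeList {
        boolean isSpecified(int index);
        boolean isSpecified(String name);
        EntityRefList getEntityRefList(int index);
        EntityRefList getEntityRefList(String name);
    }

    /** Hypothetical implementation that remembers which attributes were defaulted. */
    static final class SimpleAttributeList2 extends AttributeListImpl implements AttributeList2 {
        private final List<Boolean> specified = new ArrayList<>();

        void add(String name, String type, String value, boolean wasSpecified) {
            addAttribute(name, type, value);
            specified.add(wasSpecified);
        }

        public boolean isSpecified(int index) { return specified.get(index); }

        public boolean isSpecified(String name) {
            for (int i = 0; i < getLength(); i++) {
                if (getName(i).equals(name)) {
                    return specified.get(i);
                }
            }
            return false;
        }

        // Assumption: null means "no entity references recorded".
        public EntityRefList getEntityRefList(int index) { return null; }
        public EntityRefList getEntityRefList(String name) { return null; }
    }

    /** Names of the attributes that appeared in the start tag rather than being defaulted. */
    static List<String> specifiedNames(AttributeList2 atts) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < atts.getLength(); i++) {
            if (atts.isSpecified(i)) {
                names.add(atts.getName(i));
            }
        }
        return names;
    }
}
```

A DOM builder could use the same flag to populate Attr.specified.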


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 14:21:39 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:29 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <14074.17776.784121.47587@localhost.localdomain>

Here's the second of the three new core handler types I'm proposing
for SAX2.  This handler takes a minimalist position: it provides
about enough information for DOM support, but not much more.  In
particular, I'm still shying away from reporting element-type
declarations, at least until someone shows me an easy and concise way
of doing it (in AElfred, I simply provided the content model as a
fully-normalised string).


====================8<====================8<====================
// DTDDeclHandler.java -- receive extended DTD declarations

package org.xml.sax;

public interface DTDDeclHandler
{
    public final static int ATTRIBUTE_DEFAULTED = 1;
    public final static int ATTRIBUTE_IMPLIED = 2;
    public final static int ATTRIBUTE_REQUIRED = 3;
    public final static int ATTRIBUTE_FIXED = 4;

    public abstract void attributeDecl (String element,
					String name,
					String type,
					String defaultValue,
					int defaultType,
					EntityRefList entityRefs)
	throws SAXException;

    public abstract void externalEntityDecl (String name,
					     boolean isParameterEntity,
					     String publicId,
					     String systemId)
	throws SAXException;

    public abstract void internalEntityDecl (String name,
					     boolean isParameterEntity,
					     String value)
	throws SAXException;
				     
}

// end of DTDDeclHandler.java
====================8<====================8<====================
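
As a sketch of what an application-side implementation of this draft
might look like (hypothetical: the DefaultsCollector class, its map
layout, and the "unit" attribute in the usage note are invented for
illustration, and the draft interfaces are copied locally without their
package):

```java
import java.util.LinkedHashMap;
import java.util.Map;
import org.xml.sax.SAXException;

final class DtdDeclHandlerSketch {
    /** Draft interface from the EntityRefList posting, copied without its package. */
    interface EntityRefList {
        int getLength();
        String getEntityName(int i);
        int getEntityRefStart(int i);
        int getEntityRefEnd(int i);
    }

    /** Draft interface from this message, copied without its package. */
    interface DTDDeclHandler {
        int ATTRIBUTE_DEFAULTED = 1;
        int ATTRIBUTE_IMPLIED = 2;
        int ATTRIBUTE_REQUIRED = 3;
        int ATTRIBUTE_FIXED = 4;

        void attributeDecl(String element, String name, String type, String defaultValue,
                           int defaultType, EntityRefList entityRefs) throws SAXException;

        void externalEntityDecl(String name, boolean isParameterEntity,
                                String publicId, String systemId) throws SAXException;

        void internalEntityDecl(String name, boolean isParameterEntity, String value)
            throws SAXException;
    }

    /** Hypothetical handler: remembers attribute defaults and internal general entities. */
    static final class DefaultsCollector implements DTDDeclHandler {
        final Map<String, String> defaults = new LinkedHashMap<>();
        final Map<String, String> entities = new LinkedHashMap<>();

        public void attributeDecl(String element, String name, String type, String defaultValue,
                                  int defaultType, EntityRefList entityRefs) {
            // Only defaulted and fixed attributes carry a value worth remembering.
            if (defaultType == ATTRIBUTE_DEFAULTED || defaultType == ATTRIBUTE_FIXED) {
                defaults.put(element + "/@" + name, defaultValue);
            }
        }

        public void externalEntityDecl(String name, boolean isParameterEntity,
                                       String publicId, String systemId) {
            // Not needed for this sketch.
        }

        public void internalEntityDecl(String name, boolean isParameterEntity, String value) {
            if (!isParameterEntity) {
                entities.put(name, value);
            }
        }
    }
}
```

Feeding it attributeDecl("part", "unit", "CDATA", "each",
ATTRIBUTE_DEFAULTED, null) leaves one entry, "part/@unit" mapped to
"each", in the defaults map.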


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From indiketr at churchill.co.uk  Thu Mar 25 14:35:09 1999
From: indiketr at churchill.co.uk (Rajeeva Indiketiya)
Date: Mon Jun  7 17:10:29 2004
Subject: unsubscribe xml-dev
Message-ID: <Pine.SV4.4.02.9903251431060.22797-100000@chilli>

unsubscribe xml-dev indiketr@churchill.co.uk




From crism at oreilly.com  Thu Mar 25 14:51:19 1999
From: crism at oreilly.com (Chris Maden)
Date: Mon Jun  7 17:10:30 2004
Subject: XML and (K)Office
In-Reply-To: <199903242256.XAA04448@sonne.darmstadt.gmd.de>
	(macherius@darmstadt.gmd.de)
Message-ID: <199903251448.JAA00778@ruby.ora.com>

[Ingo Macherius]
> David Megginson <david@megginson.com> wrote at 24 Mar 99, 17:02:
> 
> > There's also a hot rumour [3] that Microsoft has assigned 37
> > programmers to work on a Linux port of MS Office.

Some quick research on slashdot shows the rumor's evolution.  The
first sighting appears to be on ZDnet; they reported that Simson
Garfinkel, a _Boston Globe_ columnist and technology writer, mentioned
on a radio show that he was in correspondence with some of the
developers.  But even if that's true, I can think of a number of
reasons why Microsoft might be doing a port internally with no
intentions whatsoever of releasing it.  The ZDnet article notes that
Office relies heavily on MS's undocumented Win32 API calls, and just
porting the app to the standard API calls which could then be handled
in emulation on Linux would be a major chore.  Some URLs:

   Linkname: Slashdot:Search
        URL: http://www.slashdot.org/search.pl?topic=microsoft

   Linkname: Slashdot:MS Office for Linux
        URL: http://www.slashdot.org/articles/99/03/11/2327241.shtml

   Linkname: ZDNN: MS porting Office to Linux?
        URL:
          http://www.zdnet.com/zdnn/stories/news/0,4586,2224863,00.html

-Chris
-- 
<!NOTATION SGML.Geek PUBLIC "-//Anonymous//NOTATION SGML Geek//EN">
<!ENTITY crism PUBLIC "-//O'Reilly//NONSGML Christopher R. Maden//EN"
"<URL>http://www.oreilly.com/people/staff/crism/ <TEL>+1.617.499.7487
<USMAIL>90 Sherman Street, Cambridge, MA 02140 USA" NDATA SGML.Geek>



From jonathan at texcel.no  Thu Mar 25 15:03:48 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:30 2004
Subject: Whence XQL?
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.deg
 a.com>
Message-ID: <3.0.3.32.19990325100328.00c90e50@pop.mindspring.com>

At 07:24 PM 3/24/99 -0800, Ed Howland wrote:
>Ok, so now it is April (or thereabouts) and still no XQL. I've read all the
>hypeware about this and I understand that it's just a suggestion for a
>proposal for a note for a draft for a recommendation. Whatever.
>I want my XQL! 

XQL has been implemented by several vendors. I know of five implementations
that are commercially or publicly available:

o	Microsoft's Internet Explorer 5.0 browser
o	webMethods' B2B Integration Server
o	DataChannel's Rio
o	ObjectStore's eXcelon
o	Perl library by Eduard Derksen (Enno)
	that can be found in the CPAN archive

Software AG has been showing a subset of XQL in its Tamino product at
CeBIT. There are several other prototypes that I know of, but I don't know
whether I can mention them.

>If not, I'm tempted to write something myself. I'm not sure about a couple
>of things because it's still just off the top of my head so far.
>
>Ok, it's in Java. It uses some free XML parser, probably XML4J because it's
>the one I'm most familiar with.

That would be great, and I'd be glad to answer questions you have as you go
along. I would really like to see something like that. I have thought of
setting up a mailing list for XQL, which might be a good place to help
implementors communicate with each other.

>The XQL syntax parser will be written in
>ANTLR, since it outputs nice O-O Java classes. The result set of XQL is well
>formed XML.

Careful...there's a very fine distinction here. The result set of XQL is
actually a set of nodes in the tree. When this is returned as an ASCII
result, it is wrapped in an <xql:result> tag to make it well formed. This
distinction is important because nodes have identity, and XML text does not.
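
A small Java DOM sketch of that distinction (an illustration only; the
class name and the two-part sample document are made up). Two brand
elements with identical content compare equal as XML content, yet they
remain two distinct nodes in the tree.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class NodeIdentitySketch {
    static final String SAMPLE =
        "<Parts><part><brand>Sony</brand></part><part><brand>Sony</brand></part></Parts>";

    /** Returns { equal as content, same node } for the first two brand elements. */
    static boolean[] compareFirstTwoBrands(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            NodeList brands = doc.getElementsByTagName("brand");
            Node first = brands.item(0);
            Node second = brands.item(1);
            // isEqualNode compares structure and content; isSameNode compares identity.
            return new boolean[] {first.isEqualNode(second), first.isSameNode(second)};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Once the result is written out as text inside <xql:result>, only the
first kind of comparison is still possible.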

>This can be handled easily by XML4J's ability of any node in any
>tree (or transformed sub-tree) to print itself in XML to any stream. XML4J
>has a nice getNodesByName() method that can operate at any level of the tree
>returning a NodeList of siblings with that tag name. Wrapping a result tree
>in <xql:result></xql:result> and iterating the NodeList gets you the
>simplest query.

This is a good approach.

>I don't care about efficiency or optimization. All partially created result
>sets will live in memory till they are ready to be output. I also don't care
>about searching multiple files, although that should be relatively easy to
>add. (I'm still confused about XML repositories. Would XQL have to understand
>directory paths? Does XQL need to be able to follow XLinks?)

There are several ways of doing repositories, and that will depend somewhat
on the repository vendor.

Directory paths are useful - I think the easiest way to do this is to use a
URL to specify the resource, and put the directory path in the URL. The XQL
query can be appended to the end of the URL.

>I'm leaving out sequences but I may add them in (much) later. Return values
>(analog to SQL's SELECT) are important to my application, as are
>conditionals. 

Cool!

Jonathan
 
jonathan@texcel.no
Texcel Research
http://www.texcel.no



From DuCharmR at moodys.com  Thu Mar 25 15:09:47 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:10:30 2004
Subject: XML conference
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120EB9@MDYNYCMSX1>

>Does anyone on this list intend to do something interesting at the
>conference?

I'll be doing an overview of the four schema proposals submitted to the
W3C. 

You gotta love the conference's clever domain name:
www.xmlconference.com. Although, as you pointed out, the web page
doesn't tell us much--in fact, it asks for far more information than it
offers.

Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.




From jonathan at texcel.no  Thu Mar 25 15:24:17 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:30 2004
Subject: Whence XQL?
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBB@exchange-server.deg
 a.com>
Message-ID: <3.0.3.32.19990325102457.03222100@pop.mindspring.com>

At 07:24 PM 3/24/99 -0800, Ed Howland wrote:
>Ok, so now it is April (or thereabouts) and still no XQL. 

I just found another utility that is based on XQL:

http://www.cs.york.ac.uk/fp/Xtract/

The author describes it thus:

"Xtract is a command-line tool for searching XML documents. Just as `grep'
returns lines which match your regular expression, so Xtract returns all
those sub-trees from XML documents which match a query pattern. The query
expression language is simple but powerful, and is based loosely on XQL,
the recently proposed XML Query Language. An introduction to the Xtract
query pattern language, together with the full Xtract grammar is in this
tutorial."

"The major difference from XQL is that a query must return a sequence of
XML contents (either elements or text inside an element): it cannot for
instance return just an attribute value."

Looks useful.

Jonathan
 
jonathan@texcel.no
Texcel Research
http://www.texcel.no



From rbourret at ito.tu-darmstadt.de  Thu Mar 25 15:30:24 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:30 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <01BE76DC.A75E4B50@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> > Possible solutions:
> >
> >  - reject normalize-text true if a LexicalHandler has been registered,
> >  and reject LexicalHandler registration if normalize-text has been set
> >  to true
> >  - make normalize-text have a logical interpretation by default, and
> >  switch to lexical if a LexicalHandler has been registered
> >  - make normalize-text always have a lexical interpretation
> >  - have separate normalize-text-logical and normalize-text-lexical
> >  events, with reject-behaviour for the first
>
> The DOM's text-normalisation feature does *not* normalise CDATA
> sections, but I think that SAX's should

Do you mean always normalize CDATA or normalize CDATA in the absence of a
LexicalHandler?  I agree with the first case, but prefer Lars' second
option (lexical interpretation of normalization) in the second case. By
requesting normalize, the application has asked for a single call to
characters() between other calls. By registering a LexicalHandler, the
application has stated it is still interested in lexical events.
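
To make the lexical interpretation concrete, here is a sketch written
against the LexicalHandler that later shipped in SAX2 (org.xml.sax.ext),
not the draft under discussion. The Recorder class and its event strings
are invented for illustration. Text is coalesced into single runs, but a
run is cut wherever a CDATA boundary is reported.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class LexicalNormalizeSketch {
    /** Hypothetical handler: one "text:" entry per run, split at lexical boundaries. */
    static final class Recorder extends DefaultHandler2 {
        final List<String> events = new ArrayList<>();
        private final StringBuilder text = new StringBuilder();

        private void flush() {
            if (text.length() > 0) {
                events.add("text:" + text);
                text.setLength(0);
            }
        }

        // The parser may split text arbitrarily, so buffer it until a boundary arrives.
        @Override public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
        @Override public void startCDATA() { flush(); events.add("startCDATA"); }
        @Override public void endCDATA() { flush(); events.add("endCDATA"); }
        @Override public void startElement(String uri, String local, String qName,
                                           Attributes atts) { flush(); }
        @Override public void endElement(String uri, String local, String qName) { flush(); }
    }

    static List<String> record(String xml) {
        try {
            Recorder recorder = new Recorder();
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(recorder);
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", recorder);
            reader.parse(new InputSource(new StringReader(xml)));
            return recorder.events;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For the input <a>x<![CDATA[<y>]]>z</a> the recorded list is text:x,
startCDATA, text:<y>, endCDATA, text:z. Under the logical interpretation
the three text runs would instead arrive as one.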

-- Ron Bourret




From DuCharmR at moodys.com  Thu Mar 25 15:38:09 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:10:30 2004
Subject: XML DTD to relational 
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120EBB@MDYNYCMSX1>

>I would like to know if there is a Java library that creates a
>relational schema from an XML DTD or a Java library that parses
>an XML DTD?

The latter: http://www.javareport.com/html/products/prod_rev.shtml has a
review of the popular Java XML parsers. The article admitted to being
out-of-date as soon as it was printed, so most of the parsers have been
updated since being reviewed.

The former: part of the point of XML and SGML is their ability to
indicate structure in data that doesn't fit neatly into normalized rows
and tables. The information is structured hierarchically. I know that
Information Builders, a former employer of mine (and Chet Ensign's), has
products that map schema back and forth between relational databases and
their hierarchically-based database products, so the algorithms exist,
but it's not trivial. I haven't heard of anything that does this with
DTDs, but it would be cool.

Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.




From jabuss at cessna.textron.com  Thu Mar 25 15:46:32 1999
From: jabuss at cessna.textron.com (Buss, Jason A)
Date: Mon Jun  7 17:10:30 2004
Subject: Whence XQL?
Message-ID: <F7E1775C1C27D211881F00A024B2853046A051@CESS01AMX03>

I thought the tag minimization syntax (</>) was a part of the XML
recommendation...  Or am I wrong?

> -----Original Message-----
> From:	Matthew Sergeant (EML) [SMTP:Matthew.Sergeant@eml.ericsson.se]
> 
> My problem with XML-QL was their use of tag minimisation (their
> proprietary
> </> syntax) means you can't parse XML-QL with an XML parser. That's
> foolish
> IMHO - if you're practically using XML already, why not reap the benefits?
> 
> 



From martind at netfolder.com  Thu Mar 25 16:19:42 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:30 2004
Subject: How about changing the rules?
Message-ID: <NBBBJPGDLPIHJGEHAKBACEDLDAAA.martind@netfolder.com>

Hi,

Last night I talked with good friends who work at Netscape (but not for
long now), and I can tell you that it was not a celebration. We got to
discussing the free software movement and so on, and then came an idea...

<Actual situation>
Several people worked hard on the Linux project; then came Red Hat and big
investments, and now Red Hat is doing what all the other guys are doing
(that's business, no?): protecting its turf and making money. They are even
luckier than Sun or Microsoft, because they have cheap labor to develop
their software. Just think about it. We all know that Microsoft probably has
the lowest development costs in the industry: they let the stock market pay
their employees :-) But now think about a company with $0 development
costs. Wow, that's a VC's dream! Fellow developers, is this how you pay your
bills?
Sun still owns the Java JDK, but at least it played fair, because the code
is developed with its own money.
Microsoft played hard with all the ISVs, with its huge appetite for growth,
but at least, like Sun, it paid for its code production.
Mozilla: again, people work for free, and AOL and its stockholders harvest
the results. Just imagine: Sun and Adobe put up $60,000 to get better XML
support in Mozilla. But in the end, who will get the millions in rewards?
And how much is $60,000 compared to millions? Just a sustenance given to
developers, as a lord in the Middle Ages would give to his serfs. Just
think about it. I am not saying that Sun or Adobe are doing something
wrong, but that the rules of the game, or the odds, favor the bank, not the
developers :-) (if you allow my casino analogy).
Basically, the current free software movement seems to follow this pattern:
developers work for free (cheap labor); when testing and proof of concept
are done, someone comes in and reaps the rewards and the money. The result:
developers have fun, but a modern version of a lord reaps the financial
rewards. Do we really want to replicate medieval patterns? Next year will
be the next millennium; do you really want that kind of order in the
future? What about a world where people could get a just reward for their
efforts? All the efforts we are making with XML may end up the same way. I
am not speaking here of people already paid by the W3C or big corporations,
but of individuals making all these efforts on their own time, and
therefore with their own money.
</Actual situation>

<Solution>
Here's the solution that my friends and I came up with.
Create a company in which all participating developers hold stock. It would
work like an open software group, but each participant would have ownership.
Customers would get a share too. In this case, we do what Red Hat is doing:
package the code, make it easy to install, document it and _sell_ it. Each
customer would hold stock too. So, when they buy the software, they also
get ownership.

So, the idea is: create a company in which all participating developers
hold stock and therefore have ownership. Customers would also hold stock
and have ownership, but would have to buy the software to get it. A free
version could be downloaded for trial, but people using the free trial
version would not hold stock.

Results: this time, developers would have a chance to get a return on their
efforts. Just imagine the power of a company with 20 000 owners. As big as
Microsoft!

Years ago, a group of artists got tired of seeing someone else get all the
rewards of their work and founded United Artists. So today, what about a
new company called "United Developers"?

If the idea seems interesting to you, we can start a list server to discuss
it and create a new kind of company. Again, imagine what 20 000, 50 000 or
even millions of owners can do. Just stop for a moment and think about
it.
</Solution>

If you don't want to pollute this list with comments about this, just email
me and we'll start a list just for this. I hope we can lay the groundwork
for the next century with a new kind of business, built on the fuel of the
new economy: not capital, but knowledge and the capacity to produce
something with it. A company with owners located in all parts of the world.
Just think about it: we may have the power to build a better future, and
maybe a model for the next generation of knowledge workers.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From simonstl at simonstl.com  Thu Mar 25 16:28:03 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:30 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <14074.17776.784121.47587@localhost.localdomain>
Message-ID: <199903251626.LAA12547@hesketh.net>

At 09:21 AM 3/25/99 -0500, David Megginson wrote:
>Here's the second of the three new core handler types I'm proposing
>for SAX2.  This handler takes a minimalist position: it provides
>about enough information for DOM support, but not much more.  In
>particular, I'm still shying away from reporting element-type
>declarations, at least until someone shows me an easy and concise way
>of doing it (in AElfred, I simply provided the content model as a
>fully-normalised string).

A fully-normalized string is fine with me - I'd rather get it as a string
and parse it myself than have to deal with something freaky a parser
developer really didn't want to have to code anyway.  But this info is
NECESSARY if anyone (me in particular) wants to build a validation engine
that lives outside the core parser.

How about:

    public abstract void elementDecl (String name,
					String contentModel)
	throws SAXException;

I like it, anyway.
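
For what it's worth, SAX2 as eventually released reports element
declarations in just this shape, through
org.xml.sax.ext.DeclHandler.elementDecl(String name, String model). The
sketch below uses that released API rather than the draft in this thread;
the class and method names are made up, and the exact spelling of each
model string is left to the parser.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

final class ElementDeclSketch {
    static final String SAMPLE =
        "<!DOCTYPE Parts ["
        + "<!ELEMENT Parts (part+)>"
        + "<!ELEMENT part (name, brand, part*)>"
        + "<!ELEMENT name (#PCDATA)>"
        + "<!ELEMENT brand (#PCDATA)>"
        + "]>"
        + "<Parts><part><name>n</name><brand>b</brand></part></Parts>";

    /** Maps each declared element type to its content model string, in declaration order. */
    static Map<String, String> contentModels(String xml) {
        Map<String, String> models = new LinkedHashMap<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            @Override public void elementDecl(String name, String model) {
                models.put(name, model);
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return models;
    }
}
```

A validation engine living outside the parser would then parse each model
string itself, as suggested above.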

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From begeddov at jfinity.com  Thu Mar 25 16:32:10 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:30 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <14068.24150.843634.988657@localhost.localdomain>
		<wk677pga60.fsf@ifi.uio.no> <14074.9415.985411.394383@localhost.localdomain>
Message-ID: <36FA63C8.9C5440B7@jfinity.com>

David Megginson wrote:

> Lars Marius Garshol writes:
>
> The DOM's text-normalisation feature does *not* normalise CDATA
> sections, but I think that SAX's should.
>

Are there other cases (other than text normalization) in SAX2 that require the parser to
aggregate notifications and save state (other than that required for well-formedness
checking)? To say it a different way, are there other examples of SAX2 providing a high(er)
level service on behalf of the applications other than raw notification of lexical and
structural events?

My impression is that SAX(2) is intended to be minimalist. If a filter network can be
composed on top of SAX2 that provides the desired capabilities, then SAX2 doesn't need to
provide that capability. If there are multiple variations in how the desired capability can
be provided (as in the normalization example), then this is an even better indicator that it
should be left to a "policy" decision at a high layer.

Maybe normalization is a good candidate for an example filter network. The fact that it would
need to be configurable (concerning CDATA handling) might make it a more useful pedagogical
aid.
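
A rough sketch of such a filter, using the XMLFilterImpl helper from
SAX2 as it was eventually released (the CoalescingFilter class and the
textRuns driver are hypothetical names for this example). It buffers
character data and forwards it as a single characters() call at the next
element or processing-instruction boundary. It never sees CDATA
boundaries, so it implements only the "logical" policy.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

final class CoalescingFilterSketch {
    /** Hypothetical filter: merges adjacent characters() events into one. */
    static final class CoalescingFilter extends XMLFilterImpl {
        private final StringBuilder pending = new StringBuilder();

        private void flush() throws SAXException {
            if (pending.length() > 0) {
                char[] chars = pending.toString().toCharArray();
                pending.setLength(0);
                super.characters(chars, 0, chars.length);
            }
        }

        @Override public void characters(char[] ch, int start, int length) {
            pending.append(ch, start, length);   // hold back until a boundary
        }
        @Override public void startElement(String uri, String local, String qName,
                                           Attributes atts) throws SAXException {
            flush();
            super.startElement(uri, local, qName, atts);
        }
        @Override public void endElement(String uri, String local, String qName)
                throws SAXException {
            flush();
            super.endElement(uri, local, qName);
        }
        @Override public void processingInstruction(String target, String data)
                throws SAXException {
            flush();
            super.processingInstruction(target, data);
        }
    }

    /** Parses xml through the filter and returns the text runs the application sees. */
    static List<String> textRuns(String xml) {
        List<String> runs = new ArrayList<>();
        try {
            CoalescingFilter filter = new CoalescingFilter();
            filter.setParent(SAXParserFactory.newInstance().newSAXParser().getXMLReader());
            filter.setContentHandler(new DefaultHandler() {
                @Override public void characters(char[] ch, int start, int length) {
                    runs.add(new String(ch, start, length));
                }
            });
            filter.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return runs;
    }
}
```

Making the CDATA policy configurable would mean also registering the
filter for lexical events and flushing in startCDATA and endCDATA when
the lexical policy is selected.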


Gabe Beged-Dov
www.jfinity.com




From keshlam at us.ibm.com  Thu Mar 25 16:47:41 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:31 2004
Subject: Round-trip issues
Message-ID: <8525673F.005C0447.00@D51MTA03.pok.ibm.com>

Gathering and replying to several comments:

>By the same argument,
> <p
> x="1">
>and
> <p x="1">
> are different, because the author used them.

The concept of ignorable whitespace also permits individual applications to
_not_ ignore it, I believe.

>Again, is anyone aware of why CDATA is preserved by the DOM?

CDATA exists in the first place because some folks are working with applications
that are displeased by having to juggle character-entity references when
representing textual data that conflicts with XML syntax. Consider an XML editor
which is creating an XHTML page with embedded dynamic scripting. One can argue
that outputting a<b in that script code as a&lt;b ought to be fine, since the
browser's parser should convert it back before handing the code off to the
interpreter. On the other hand, it adversely affects human-readability of the
XML file. And part of the point of using XML rather than binary representations
is that the files should be reasonably human-readable.

If you don't like it, you can ignore it; in an OO language, the DOM makes
CDATASection a subclass of Text; you can simply treat it as Text and never know
the difference. But when you write the file back out, it will still be in the
CDATA wrapper, unless you explicitly take action to defeat that.
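
A small sketch of that behaviour, assuming the Java DOM binding and the DOM
Load/Save serializer bundled with the JDK (an assumption; no particular
implementation is named above). The class CdataRoundTrip and the sample
script element are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.CDATASection;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.Text;
import org.w3c.dom.ls.DOMImplementationLS;
import org.xml.sax.InputSource;

class CdataRoundTrip {
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Reads a node purely through the Text interface; whether it is CDATA is invisible here. */
    static String textOf(Node node) {
        return (node instanceof Text) ? ((Text) node).getData() : null;
    }

    /** Writes the root element back out with the Load/Save serializer. */
    static String serialize(Document doc) {
        DOMImplementationLS ls = (DOMImplementationLS) doc.getImplementation();
        return ls.createLSSerializer().writeToString(doc.getDocumentElement());
    }

    public static void main(String[] args) {
        Document doc = parse("<script><![CDATA[if (a<b) go();]]></script>");
        Node child = doc.getDocumentElement().getFirstChild();
        System.out.println(child instanceof CDATASection); // the node is a CDATASection...
        System.out.println(textOf(child));                 // ...but reads as ordinary Text
        System.out.println(serialize(doc));                // the CDATA wrapper is written back
    }
}
```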


Much of this is the "source-level debugging problem" applied to data. It's
generally a bad idea to unconditionally discard human-generated information
unless you _know_ it will never be meaningful to any downstream processing
stage. It's fine for applications to request that it be discarded, or discard it
themselves; they have enough information to do so. Support routines should be
able to pass data through unchanged unless configured or instructed to do
otherwise.

______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From keshlam at us.ibm.com  Thu Mar 25 16:49:28 1999
From: keshlam at us.ibm.com (keshlam@us.ibm.com)
Date: Mon Jun  7 17:10:31 2004
Subject: xml-dev Digest V1 #277
Message-ID: <8525673F.005C3D1F.00@D51MTA03.pok.ibm.com>

BTW, to answer another question: The DOM does not apply normalization to
CDATASections, either with adjacent text or between themselves. I believe that's
the only behavioral difference inside the DOM between these and standard text
nodes.
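
A short sketch of that difference, assuming the Java DOM binding shipped with
the JDK; the class name NormalizeDemo and the node contents are made up. Two
adjacent text nodes are merged by normalize(), while the CDATA section stays a
separate node.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeDemo {
    /** Builds <e> holding Text "a", Text "b", CDATASection "c", Text "d", then normalizes it. */
    static Element build() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element e = doc.createElement("e");
            doc.appendChild(e);
            e.appendChild(doc.createTextNode("a"));
            e.appendChild(doc.createTextNode("b"));      // adjacent to "a": will be merged
            e.appendChild(doc.createCDATASection("c"));  // not merged with its neighbours
            e.appendChild(doc.createTextNode("d"));
            e.normalize();
            return e;
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```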

______________________________________
Joe Kesselman  / IBM Research
Unless stated otherwise, all opinions are solely those of the author.




From dante at mstirling.gsfc.nasa.gov  Thu Mar 25 16:52:28 1999
From: dante at mstirling.gsfc.nasa.gov (Dante Lee)
Date: Mon Jun  7 17:10:31 2004
Subject: KOML Question
Message-ID: <Pine.LNX.3.96.990325123442.3545A-100000@mstirling.gsfc.nasa.gov>

Does anyone know where I can find an example of either Java KOML or XML
Serialization code?  Please reply asap. 


	          Dante M. Lee    Code 588
        	NASA/GSFC Greenbelt MD 20771
 	Voice = 301-521-1077   Bldg = 23  Rm = W415 
 	  Email = dante@mstirling.gsfc.nasa.gov
                  dante4@hotmail.com                                          
 



From simonstl at simonstl.com  Thu Mar 25 17:25:35 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:31 2004
Subject: Whence XQL?
In-Reply-To: <F7E1775C1C27D211881F00A024B2853046A051@CESS01AMX03>
Message-ID: <199903251723.MAA13808@hesketh.net>

At 09:44 AM 3/25/99 -0600, Buss, Jason A wrote:
>I thought the tag minimization syntax (</>) was a part of the XML
>recommendation...  Or am I wrong?

Well, Microsoft seemed to think so for a while - but it's definitely _not_
part of the Rec.

See http://www.lists.ic.ac.uk/hypermail/xml-dev/9711/index.html, the "</>
as end tag" thread.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From RDaniel at DATAFUSION.net  Thu Mar 25 17:27:45 1999
From: RDaniel at DATAFUSION.net (Ron Daniel)
Date: Mon Jun  7 17:10:31 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <0D611E39F997D0119F9100A0C931315C52F72D@datafusionnt1>

I agree with Simon that getting the content model
as a string is a reasonable choice. A separate interface
could be put into the helpers package for help in
parsing the content model, but let's keep that out of the
critical path.
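
A rough sketch of the kind of helper that could live outside the handler,
assuming the content model arrives as a single normalised string with no
parameter entity references. ContentModelHelper and childNames are
hypothetical names, not part of any SAX draft, and the name pattern only
covers ASCII names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ContentModelHelper {
    // An optional '#' followed by a name-like token (ASCII names only in this sketch).
    private static final Pattern TOKEN = Pattern.compile("#?[A-Za-z_:][\\w.:-]*");

    /** Returns the distinct child element names mentioned in a content model, in order. */
    static List<String> childNames(String contentModel) {
        List<String> names = new ArrayList<>();
        String model = contentModel.trim();
        if (model.equals("EMPTY") || model.equals("ANY")) {
            return names;                      // keywords, not element names
        }
        Matcher m = TOKEN.matcher(model);
        while (m.find()) {
            String token = m.group();
            // Skip #PCDATA and avoid listing the same name twice.
            if (!token.startsWith("#") && !names.contains(token)) {
                names.add(token);
            }
        }
        return names;
    }
}
```

An application that only needs the list of possible children can stop here;
one that wants the full structure (sequence, choice, occurrence) would need a
real parser for the model, which is exactly the work that can stay optional.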

Ron

> -----Original Message-----
> From:	Simon St.Laurent [SMTP:simonstl@simonstl.com]
> Sent:	Thursday, March 25, 1999 8:29 AM
> To:	David Megginson; XML Developers' List
> Subject:	Re: SAX2: DTDDeclHandler (minimalist position)
> 
> At 09:21 AM 3/25/99 -0500, David Megginson wrote:
> >Here's the second of the three new core handler types I'm proposing
> >for SAX2.  This handler takes a minimalist position: it provides
> >about enough information for DOM support, but not much more.  In
> >particular, I'm still shying away from reporting element-type
> >declarations, at least until someone shows me an easy and concise way
> >of doing it (in AElfred, I simply provided the content model as a
> >fully-normalised string).
> 
> A fully-normalized string is fine with me - I'd rather get it as a
> string
> and parse it myself than have to deal with something freaky a parser
> developer really didn't want to have to code anyway.  But this info is
> NECESSARY if anyone (me in particular) wants to build a validation
> engine
> that lives outside the core parser.
> 
> How about:
> 
>     public abstract void elementDecl (String name,
> 					String contentModel)
> 	throws SAXException;
> 
> I like it, anyway.
> 
> Simon St.Laurent
> XML: A Primer
> Sharing Bandwidth / Cookies
> http://www.simonstl.com
> 


From paul at prescod.net  Thu Mar 25 18:06:58 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:31 2004
Subject: CDATA Section Support (was RE: SAX2 RFD: LexicalHandlerdraft v.1.1)
References: <3.0.32.19990324145555.00e6dd44@pop.intergate.bc.ca>
Message-ID: <36FA759C.82E60A19@prescod.net>

Tim Bray wrote:
> 
> At 10:13 PM 3/24/99 +0100, Ronald Bourret wrote:
> >I wasn't even going to reply, but then I remembered that the real question
> >here is whether SAX (not the DOM) should tell people about CDATA sections.
> > I think the answer is yes.
> 
> The implication is that a parser that doesn't pass on word of CDATA
> sections is a second-rate parser.   Hrummph.  

It isn't second-rate; it is probably just optimized for speed instead of
fidelity.

> Is this not a slippery-
> slope that puts us on the road to reporting whether single or double
> quotes were used for attribute values? -Tim

The way to avoid the slippery slope is to define an information set. Had
the information set been defined before the DOM (or, even better, before
XML 1.0 went to REC) then the DOM creators would have known what the right
answer is. In this case they were forced to guess and IMHO they guessed
wrong.

Lesson: Information sets should follow close on the heels of syntactic
standards or should be incorporated into the syntactic standards. RDF gets
this right. Will XLink? What about future versions of CSS?

Also: Different types of applications need different amounts of
information. Therefore an information set should support different levels
of granularity. The groves model does this through "grove plans." Some
parsers provide grove plans that allow a character-for-character
round-tripping. Others provide what we used to call "ESIS."

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From larsga at ifi.uio.no  Thu Mar 25 18:15:25 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:31 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <36FA63C8.9C5440B7@jfinity.com>
References: <14068.24150.843634.988657@localhost.localdomain> 		<wk677pga60.fsf@ifi.uio.no> <14074.9415.985411.394383@localhost.localdomain> <36FA63C8.9C5440B7@jfinity.com>
Message-ID: <wkn211e8rh.fsf@ifi.uio.no>


* Gabe Beged-Dov
| 
| Are there other cases (other than text-normalization ) in SAX2 that
| require the parser to aggregate notifications and save state (other
| than that required for well-formedness checking)? 

Not right now, no.

| My impression is that SAX(2) is intended to be minimalist. If a
| filter network can be composed on top of SAX2 that provides the
| desired capabilities, then SAX2 doesn't need to provide that
| capability. 

This is true, and personally I was of the opinion that normalize-text
was better left to external filters.

| Maybe normalization is a good candidate for an example filter
| network.

Maybe, but we still need to reject invalid combinations of
LexicalHandler and normalize-text, so somehow the filter and/or parser
will need to handle this.

This whole issue just strengthens my conviction that we need to
specify filter handling within the SAX2 core. This will need to deal
very carefully with parser encapsulation for approaches like this one
to be really feasible in implementations.

It wouldn't be much fun if a filter did the normalization without
telling the parser and thus caused trouble with the LexicalHandler,
and no hint of this trouble ever reaching the application.

| The fact that it would need to be configurable (concerning CDATA
| handling) might make it a more useful pedagogical aid.

As to how filters work, you mean? Well, we should let ourselves be
affected by that. And, besides, whitespace normalization is a far
better example, I think.

--Lars M.




From andrewl at microsoft.com  Thu Mar 25 18:24:39 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:10:31 2004
Subject: Whence XQL?
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF1D9@RED-MSG-08>

End tag minimization ("</>") is not part of XML.  




From larsga at ifi.uio.no  Thu Mar 25 18:30:01 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:31 2004
Subject: ModSAX: Proposed Core Properties
In-Reply-To: <14064.6789.718797.734226@localhost.localdomain>
References: <01BE705C.DB375010@grappa.ito.tu-darmstadt.de> <14064.6789.718797.734226@localhost.localdomain>
Message-ID: <wklngle82q.fsf@ifi.uio.no>


* Ronald Bourret
|  
| What is the possible benefit of making any property write-only?
| That is, can any harm ever come from reading a property?

* David Megginson
| 
| There are three benefits:
| 
| 1. Keep the API absolutely as small as possible.
| 2. Avoid confusion.
| 3. Allow properties to be unknown until set.

These are all real benefits, but the disadvantage is rather large I'm
afraid: it makes assembling a processing solution from reusable
components much more difficult, since one component can't learn how
the others have modified the parser settings.

I think we should make all properties readable, which means we split
them into read-write/read-only properties. This should maintain
benefits 1 and 2 even better than the write-only/read-only split,
since most people probably expect read-write/read-only properties like
e.g. CORBA attributes.
 
I also think we should go even further and make all features readable,
so that a filter can see if a feature has been enabled or not. Without
knowing the exact set of features I think disabling reading is
potentially very limiting.

| Any attempt to access a property can generate a
| SAXNotSupportedException (or the derived SAXNotRecognizedException),
| but there is no guarantee that they will be symmetrical.

Maybe we should have a SAXInvalidValueException too, so that the
parser/filters can reject invalid values without risking
misinterpretation on the part of applications/filters?
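
To make the proposal concrete, here is a minimal sketch of a property table
where everything is readable and only some properties are writable. All names
(PropertyTable, NotSupportedException, NotRecognizedException,
InvalidValueException, the property IDs) are placeholders that loosely mirror
the exception names mentioned above; none of this is the ModSAX API. The
exceptions are unchecked only to keep the sketch short.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class NotSupportedException extends RuntimeException {
    NotSupportedException(String message) { super(message); }
}

class NotRecognizedException extends NotSupportedException {
    NotRecognizedException(String id) { super("unknown property: " + id); }
}

class InvalidValueException extends RuntimeException {
    InvalidValueException(String id, Object value) {
        super("bad value for " + id + ": " + value);
    }
}

class PropertyTable {
    private final Map<String, Object> values = new HashMap<>();
    private final Set<String> readOnly = new HashSet<>();

    /** Registers a property with a non-null initial value. */
    void declare(String id, Object initial, boolean writable) {
        values.put(id, initial);
        if (!writable) readOnly.add(id);
    }

    /** Every declared property is readable, so one component can inspect another's settings. */
    Object get(String id) {
        if (!values.containsKey(id)) throw new NotRecognizedException(id);
        return values.get(id);
    }

    /** Rejects unknown IDs, read-only IDs, and values of the wrong type, each distinctly. */
    void set(String id, Object value) {
        Object current = get(id);
        if (readOnly.contains(id)) throw new NotSupportedException("read-only: " + id);
        if (value == null || !current.getClass().isInstance(value)) {
            throw new InvalidValueException(id, value);
        }
        values.put(id, value);
    }
}
```

A filter sitting in front of the parser could then call get() before deciding
how to behave, and a caller can tell "no such property" apart from "that value
makes no sense".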

--Lars M.




From sdw at lig.net  Thu Mar 25 18:35:20 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:31 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <36FA89B1.D338ED41@lig.net>

I know, I know, this is anathema to what many of you feel is the essence of
XML, and I agree to a point.
I have come to feel however that there is room for a "works-as-if" binary
analogue to text based XML.  Something that is totally subservient to the
standard and has exactly equivalent features, but that is highly efficient
for processing at all levels and easily converted to and from text based
XML.

In using XML in real-world application work and designing future
infrastructure that is highly scalable and efficient while making use of
XML, I have come to the conclusion that I need a standard way to deal with
an XML analogue that is binary.  There are a multitude of performance
problems that this solves, not only in parsing and exporting, but processing
of related data inside applications.

Before I make all the details and ideas public, I would like to know if
there is any serious precedent directly dealing with XML.

My design has highly efficient Java processing in mind, but is not specific
to any particular language.
Compression is a secondary, but associated issue.

Thanks
sdw
--
OptimaLogic - Finding Optimal Solutions
Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect
http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
5Jan1999




From larsga at ifi.uio.no  Thu Mar 25 18:35:56 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:31 2004
Subject: Parser2 modification
Message-ID: <wkk8w5e7tn.fsf@ifi.uio.no>


 public abstract void setHandler (String handlerID, ModHandler handler)
    throws SAXNotSupportedException;

I think we should allow this method to throw a 
SAXInvalidConfigurationException, to be used to solve the
LexicalHandler/normalize-text problem and similar problems that may
appear with other non-core handlers.

I suggested about 10 seconds ago a SAXInvalidValueException, and I
suppose these two could be merged, since they are roughly the same
thing. 

  public abstract void setFeature (String featureID, boolean state)
    throws SAXNotSupportedException;

Also, I think this method should be allowed to throw the
SAXInvalidConfigurationException so that it can complain about things
like non-existent catalog files, catalog files with syntax errors etc.

Or maybe something even more generic would be the best.
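
A sketch of how the two setters might reject the LexicalHandler/normalize-text
combination regardless of the order in which they are called. The class
ConfigurableParser, the unchecked InvalidConfigurationException and the two ID
strings are placeholders for illustration, not the proposed Parser2 interface.

```java
import java.util.HashMap;
import java.util.Map;

class InvalidConfigurationException extends RuntimeException {
    InvalidConfigurationException(String message) { super(message); }
}

class ConfigurableParser {
    static final String NORMALIZE_TEXT = "normalize-text";    // placeholder feature ID
    static final String LEXICAL_HANDLER = "lexical-handler";  // placeholder handler ID

    private final Map<String, Boolean> features = new HashMap<>();
    private final Map<String, Object> handlers = new HashMap<>();

    boolean getFeature(String featureID) {
        return features.getOrDefault(featureID, false);
    }

    void setFeature(String featureID, boolean state) {
        // Turning normalization on while a lexical handler is registered is a conflict.
        if (featureID.equals(NORMALIZE_TEXT) && state && handlers.containsKey(LEXICAL_HANDLER)) {
            throw new InvalidConfigurationException("normalize-text conflicts with a lexical handler");
        }
        features.put(featureID, state);
    }

    void setHandler(String handlerID, Object handler) {
        // Registering a lexical handler while normalization is on is the same conflict.
        if (handlerID.equals(LEXICAL_HANDLER) && handler != null && getFeature(NORMALIZE_TEXT)) {
            throw new InvalidConfigurationException("lexical handler conflicts with normalize-text");
        }
        if (handler == null) handlers.remove(handlerID);
        else handlers.put(handlerID, handler);
    }
}
```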

--Lars M.




From Ed at dega.com  Thu Mar 25 19:02:41 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:31 2004
Subject: XML-QL (was Re: Whence XQL?) (Ok Whither XQL, Dave)
Message-ID: <30649320C177D111ADEC00A024E9F297169FC0@exchange-server.dega.com>

I took a deeper look at XML-QL and at first glance, it appears to be a
stronger syntax. I'm happy that AT&T chose Java to implement a prototype
implementation. I'm hoping I can embed the engine in my Java code.

The XQL syntax is rather cryptic, but familiar to XSL writers, and seems to
have the benefit of being embeddable. The XQL authors chose to leave source
and destination streams and formats up to implementers. These two things
appealed to me initially. I could embed an XQL engine in a Java servlet and
send it queries from ECMAScript. Internally, it makes a DOM tree which could
be transformed via XSL before being returned to the browser. Since all of
these pieces were in Java it seemed a powerful combination. (I know that
some things can be done in XSL, but I needed some of the extensions that XQL
provided.)

BTW, how do you get at the XQL part of IE5? I never saw that in MS's
writeup. Or is it just extensions to their XSL?

In the meantime, while looking at XML-QL for our short term needs, I'll
continue to work on XQL using Java and ANTLR. For now that is taking two
concurrent directions: The full XQL ANTLR grammar will proceed as usual,
eventually producing a parser that, while it recognizes valid XQL, does
nothing more than generate an abstract syntax tree. The other direction is to
generate a valid subset that actually parses, works on an internal DOM tree
generated via XML4J, and outputs XML. The first cut of this will only do
path expressions of the form:

	element/*/sub-element//leaf-element
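
To give a feel for what evaluating such a path over a DOM tree involves, here
is a stand-in sketch. It does not use XQL or the ANTLR grammar; it hands the
path to the XPath engine bundled with the JDK (javax.xml.xpath), on the
assumption that paths built only from names, "/", "*" and "//" read the same
way there. PathQuery and select are made-up names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class PathQuery {
    /** Parses xml into a DOM tree and returns the text of every node the path selects. */
    static List<String> select(String xml, String path) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // The path is evaluated relative to the document node.
            NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(path, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < hits.getLength(); i++) {
                out.add(hits.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For a cut-down version of the recursive parts document quoted below,
select(xml, "Parts//part/name") would return the name of every part at any
nesting level, while "Parts/*/name" would return only the top-level ones.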

Ed

P.S. I found it funny that Roger's example data closely matches my own. One
wonders...

Ed Howland
ed@dega.com
http://www.dega.com 
"As your attorney, I advise you to take some adrenalchrome"

-----Original Message-----
From: Roger L. Costello [mailto:costello@mitre.org]
Sent: Thursday, March 25, 1999 5:48 AM
To: Matthew Sergeant (EML)
Cc: 'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com';
mff@research.att.com
Subject: XML-QL (was Re: Whence XQL?)


Matthew Sergeant (EML) wrote:
> 
> My problem with XML-QL was their use of tag minimisation (their proprietary
> </> syntax) means you can't parse XML-QL with an XML parser. That's foolish
> IMHO - if you're practically using XML already, why not reap the benefits?

Hi Matt,

Not sure that you could do all the things that XML-QL allows you to do
if you stick to the XML syntax.  Example, query the following XML
document for all part names:

<?xml version="1.0"?>
<!DOCTYPE Parts [
<!ELEMENT Parts (part+)>
<!ELEMENT part (name, brand, part*)>
<!ELEMENT name (#PCDATA)>
<!ELEMENT brand (#PCDATA)>
]>
<Parts>
        <part>
                <name>Green Power Juicer</name>
                <brand>Green Power</brand>
        </part>
        <part>
                <name>Toyota Tercel</name>
                <brand>Toyota</brand>
                <part>
                     <name>Sony Stereo X11-3</name>
                     <brand>Sony</brand>
                </part>
        </part>
</Parts>

Note the recursive definition of the part element.  Thus, the part name
can be at any nesting level.  Here's how to do it using XML-QL:

function AllPartNamesQuery () {

// Source: Parts.xml
// Find the names of all the parts

construct  <name>$name</name>
where      <Parts>
               <part*>
                   <name>$name</name>
               </>
           </Parts> IN "Parts.xml"
}

How would you do this using XML syntax?  /Roger


> 
> Anyway, there's an implementation of XML-QL in my directory on CPAN for perl
> users, which needs fixing up a little bit, but it's quite usable (if a
> little slow). It facilitates the use of perl's regexp syntax for queries as
> well as the system used by XML-QL, which makes it nice and powerful...
> 
> Matt.
> --
> http://come.to/fastnet
> Perl on Win32, PerlScript, ASP, Database, XML
> GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
> !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++
> 
> > -----Original Message-----
> > From: Roger L. Costello [SMTP:costello@mitre.org]
> > Sent: Thursday, March 25, 1999 11:58 AM
> > To:   Ed Howland
> > Cc:   'xml-dev@ic.ac.uk'; 'xsl-list@mulberrytech.com'
> > Subject:      Re: Whence XQL?
> >
> > Have you looked at XML-QL?  I have been playing around with this XML
> > query tool for a few weeks.  It's quite nice.  It allows you to specify
> > the grammar of extracted data, query multiple XML documents, etc.  See:
> > <http://www.research.att.com/sw/tools/xmlql>  /Roger
> >
> >




From ramin at wizen.com  Thu Mar 25 19:11:24 1999
From: ramin at wizen.com (Ramin Firoozye)
Date: Mon Jun  7 17:10:31 2004
Subject: The horror...
Message-ID: <000b01be76f2$033f5d00$a432c9cf@lust.wizen.com>

Hi folks...

Sorry to clutter the list with this ...

<SELF-FLAGELLATION mode="full-on">

I was trying to be a good netizen and reply privately back to the authors of
some postings on XML-DEV. To my horror, the messages popped back into my
XML-DEV mailbox. I can't tell if this is a function of the new mail client
I've been using or I am in fact replying back to the whole list.

If the latter, please accept my abject apologies. If you haven't gotten a
barrage of mail from me then ignore this message (phew).

</SELF-FLAGELLATION>

Thanks,
Ramin

--
Ramin Firoozye - Wizen Software.
San Francisco, California.
<mailto:ramin@wizen.com>
--




From begeddov at jfinity.com  Thu Mar 25 19:34:51 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:31 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
References: <14068.24150.843634.988657@localhost.localdomain> 		<wk677pga60.fsf@ifi.uio.no> <14074.9415.985411.394383@localhost.localdomain> <36FA63C8.9C5440B7@jfinity.com> <wkn211e8rh.fsf@ifi.uio.no>
Message-ID: <36FA8EB0.69B4A18C@jfinity.com>

Lars Marius Garshol wrote:

> This whole issue just strengthens my conviction that we need to
> specify filter handling within the SAX2 core. This will need to deal
> very carefully with parser encapsulation for approaches like this one
> to be really feasible in implementations.
>
> It wouldn't be much fun if a filter did the normalization without
> telling the parser and thus caused trouble with the LexicalHandler,
> and no hint of this trouble ever reaching the application.

The SAX filter registration interfaces don't allow multiple filters to be registered for the
same feature.  I don't see how you can have more than one owner for all the registered
handlers without getting into severe trouble. This is not a bad thing as you probably want
something like MDSAX to handle the filter networks. It allows SAX(2) to be lean and mean.

If the same logic is managing all the callbacks, it can gracefully handle the variations of
CDATA notification (or lack thereof) and text-normalization on behalf of the  client of the
filter network.

> | The fact that it would need to be configurable (concerning CDATA
> | handling) might make it a more useful pedagogical aid.
>
> As to how filters work, you mean? Well, we should let ourselves be
> affected by that. And, besides, whitespace normalization is a far
> better example, I think.

Does whitespace normalization require aggregating notifications or just scrubbing the
contents of a particular notification?
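
One way to look at the question, assuming "normalization" here means
collapsing each run of whitespace to a single space (the thread does not pin
the definition down): a run can straddle two notifications, so scrubbing each
one in isolation is not quite enough, but a single flag of carried state is.
WhitespaceCollapser is a made-up name.

```java
class WhitespaceCollapser {
    // State carried across notifications: did the previous chunk end in whitespace?
    private boolean lastWasSpace = false;

    /** Collapses whitespace runs in one chunk, taking the previous chunk's ending into account. */
    String collapse(String chunk) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < chunk.length(); i++) {
            char c = chunk.charAt(i);
            boolean space = (c == ' ' || c == '\t' || c == '\n' || c == '\r');
            if (!space) {
                out.append(c);
            } else if (!lastWasSpace) {
                out.append(' ');   // first whitespace character of a run
            }
            lastWasSpace = space;
        }
        return out.toString();
    }
}
```

With one shared instance, the chunks "a " and " b" come out as "a " and "b".
With a fresh instance per chunk they come out as "a " and " b", leaving a
double space in the concatenated text.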

Gabe Beged-Dov
www.jfinity.com




From david at megginson.com  Thu Mar 25 20:02:29 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:31 2004
Subject: Elements-Attributes-Data (was RE: SAX2 RFD: LexicalHandler draft v.1.1)
In-Reply-To: <001401be76c3$e0fd27a0$0100007f@eps.inso.com>
References: <001901be7616$8e4caba0$c8a8a8c0@thing1>
	<001401be76c3$e0fd27a0$0100007f@eps.inso.com>
Message-ID: <14074.38225.755412.932105@localhost.localdomain>

Gavin Thomas Nicol writes:

 > Speaking of which, I am continually surprised by SAX's lack of
 > comment interfaces....

SAX was originally designed specifically for production use, not for
authoring (to meet the 80/20, or in this case, the 98/2 rule).

That said, comments will be there in SAX2 in the (optional)
LexicalHandler for people who want them, but the lack of comment and
CDATA interfaces has certainly not hindered the SAX application base
so far.

There are only three things that most XML applications need to know
about:

1. Elements
2. Attributes
3. Character Data

It makes a nice little litany: elements-attributes-data,
elements-attributes-data, elements-attributes-data,
elements-attributes-data.  Yes, XML really is/should be that easy.

Actually, apps need to know about error messages too, but that wrecks
the litany.  Everything else should be taken care of invisibly by the
parser.
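
For illustration, a minimal handler that listens for exactly those three
things and nothing else. It assumes the SAX interfaces as bundled with the JDK
(which use the later SAX2 method signatures); LitanyHandler and run are
made-up names.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class LitanyHandler extends DefaultHandler {
    final List<String> log = new ArrayList<>();

    // 1. Elements and 2. Attributes arrive together in startElement.
    @Override public void startElement(String uri, String local, String qName, Attributes atts) {
        log.add("element " + qName);
        for (int i = 0; i < atts.getLength(); i++) {
            log.add("attribute " + atts.getQName(i) + "=" + atts.getValue(i));
        }
    }

    // 3. Character data (a parser may deliver long text in several chunks).
    @Override public void characters(char[] ch, int start, int length) {
        log.add("data " + new String(ch, start, length));
    }

    /** Parses xml with the default JDK parser and returns what the handler saw. */
    static List<String> run(String xml) {
        LitanyHandler handler = new LitanyHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.log;
    }
}
```

Everything the handler does not override (comments, CDATA boundaries, entity
boundaries) simply never reaches the application.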


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Mar 25 20:05:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <199903251626.LAA12547@hesketh.net>
References: <14074.17776.784121.47587@localhost.localdomain>
	<199903251626.LAA12547@hesketh.net>
Message-ID: <14074.38525.185285.772426@localhost.localdomain>

Simon St.Laurent writes:

 > How about:
 > 
 >     public abstract void elementDecl (String name,
 > 					String contentModel)
 > 	throws SAXException;
 > 
 > I like it, anyway.

You know, it doesn't inhale (I was going to say 'suck', but I know
that my American cousins are still a little sensitive after the recent
impeachment trial).  It's easy enough to parse the normalised content
model if you really need to: it would be all one string, with no
parameter entity references.

Of course, people will rightly complain that the processor has already 
done the work of parsing it.  It's hard to know what to do here.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From DuCharmR at moodys.com  Thu Mar 25 20:34:05 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:10:32 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1>

>I know, I know, this is anathema to what many of you feel is the 
>essence of XML, and I agree to a point.

It's not so much about feelings, as about contradicting the XML spec.

From 1. Introduction (http://www.w3.org/TR/REC-xml#sec-intro):

"XML documents are made up of storage units called entities, which
contain either parsed or unparsed data. Parsed data is made up of
characters, some of which form character data, and some of which form
markup."

("characters" there links to http://www.w3.org/TR/REC-xml#dt-character:)

"A parsed entity contains text, a sequence of characters, which may
represent markup or character data. A character is an atomic unit of
text as specified by ISO/IEC 10646 [ISO/IEC 10646]. Legal characters are
tab, carriage return, line feed, and the legal graphic characters of
Unicode and ISO/IEC 10646."

Applying XML concepts to a binary data format sounds interesting and
potentially useful, but it wouldn't be XML.

Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.




From b.laforge at jxml.com  Thu Mar 25 20:41:24 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <013901be7700$96dfb060$c8a8a8c0@thing1>

From: Gabe Beged-Dov <begeddov@jfinity.com>
>The SAX filter registration interfaces don't allow multiple filters to be registered for the
>same feature.  I don't see how you can have more than one owner for all the registered
>handlers without getting into severe trouble. This is not a bad thing as you probably want
>something like MDSAX to handle the filter networks. It allows SAX(2) to be lean and mean.


Frankly, I would love to see the design process for MDSAX2 as open as SAX.

Bill




From Daniel.Brickley at bristol.ac.uk  Thu Mar 25 20:54:30 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:10:32 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1>
Message-ID: <Pine.GHP.4.02A.9903252043140.4008-100000@mail.ilrt.bris.ac.uk>


On Thu, 25 Mar 1999, DuCharme, Robert wrote:

> >I know, I know, this is anathema to what many of you feel is the 
> >essence of XML, and I agree to a point.
> 
> It's not so much about feelings, as about contradicting the XML spec.

Quite so. But there are still initiatives such as 

	http://www.wapforum.org/docs/technical.htm
	http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf

which attempts to define a 'compact binary representation of XML'.
(If going down that route, I'd rather have a compact binary
representation of whatever it was that I'm representing in XML, rather
than of the XML that I might've used as a textual representation of the
data... But then that really wouldn't be XML.)

Dan




From gtn at eps.inso.com  Thu Mar 25 20:59:32 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:32 2004
Subject: Whence XQL?
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
Message-ID: <000b01be7702$102babd0$0100007f@eps.inso.com>

> I read most of those position papers as well. But the one by
> Jonathan Robie, Texcel, Inc. Joe Lapp, webMethods, Inc. and David Schach,
> Microsoft Corporation seemed the most complete. It even has a BNF for a
> parser for XQL.

I wouldn't bet my farm on that proposal. Folk at QL'98, both database
and IR, had serious issues with it.




From gtn at eps.inso.com  Thu Mar 25 20:59:34 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <3.0.32.19990324144457.00e6dd44@pop.intergate.bc.ca>
Message-ID: <000c01be7702$1161e1e0$0100007f@eps.inso.com>

> >By the same argument,
> ><p
> >x="1">
> >and 
> ><p x="1">
> >are different...
> 
> David is right.  It's too late now, because DOM level 1 wrote 
> CDATA sections into the spec so we're stuck with 'em - it's a
> pity we didn't have the infoset back then. (I assume it won't
> include them, right David?) -T.

I think CDATA is a religious issue. Some people love them, some
people hate them. Deal with it.

They are currently in the InfoSet as a property of characters.




From simonstl at simonstl.com  Thu Mar 25 21:02:50 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <14074.38525.185285.772426@localhost.localdomain>
References: <199903251626.LAA12547@hesketh.net>
 <14074.17776.784121.47587@localhost.localdomain>
 <199903251626.LAA12547@hesketh.net>
Message-ID: <199903252102.QAA18653@hesketh.net>

At 03:05 PM 3/25/99 -0500, you wrote:
>Simon St.Laurent writes:
>
> > How about:
> > 
> >     public abstract void elementDecl (String name,
> > 					String contentModel)
> > 	throws SAXException;
> > 
> > I like it, anyway.
>
>You know, it doesn't inhale (I was going to say 'suck', but I know
>that my American cousins are still a little sensitive after the recent
>impeachment trial).  It's easy enough to parse the normalised content
>model if you really need to: it would be all one string, with no
>parameter entity references.

Er... actually, I'd like it with PE's unprocessed.

>Of course, people will rightly complain that the processor has already 
>done the work of parsing it.  It's hard to know what to do here.

Again, my plans for SAX involve keeping the data as uncooked as possible,
partly for round trip reasons and partly because of the layered processing
model I'd really like to demonstrate with standard parts.  Maybe we need to
add a 'cooked' or 'uncooked' option to tell the parser how we like our
information.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From simonstl at simonstl.com  Thu Mar 25 21:05:33 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:32 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1>
Message-ID: <199903252105.QAA18751@hesketh.net>

At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
>>I know, I know, this is anathema to what many of you feel is the 
>>essence of XML, and I agree to a point.
>
>It's not so much about feelings, as about contradicting the XML spec.
>
>[...]
>
>Applying XML concepts to a binary data format sounds interesting and
>potentially useful, but it wouldn't be XML.

One of these days I'd really love to stop talking about what is and isn't
XML, though I know it's fun, and start talking about what we can do with
XML and XML-like structures, whether they are SAX event flows, DOM trees,
or binary formats that build on an XML foundation.

We might even get some real work done - and it might even be fun.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From larsga at ifi.uio.no  Thu Mar 25 21:08:13 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <013901be7700$96dfb060$c8a8a8c0@thing1>
References: <013901be7700$96dfb060$c8a8a8c0@thing1>
Message-ID: <wkaex1e0rm.fsf@ifi.uio.no>


* Bill la Forge
| 
| Frankly, I would love to see the design process for MDSAX2 as open
| as SAX.

Then let's start it here once SAX2 is out the door.  For me, that
means when I've released the Python version of SAX2.  If SAX2 doesn't
provide all I want with regard to filters I'll be very interested in
working on a design that does, for implementation in Python.

--Lars M.




From gtn at eps.inso.com  Thu Mar 25 21:10:07 1999
From: gtn at eps.inso.com (Gavin Thomas Nicol)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
In-Reply-To: <01BE762D.F02C6600@grappa.ito.tu-darmstadt.de>
Message-ID: <000a01be7702$0e505c20$0100007f@eps.inso.com>

> > Gavin summed it up quite well - the author used a CDATA Section and
> > may have attached some semantic meaning to it (I know that several
> > people disagree that CDATA sections can have semantic meaning;
> > others think they can) so the DOM doesn't throw away that
> > distinction, just in case.
>
> I'm having trouble imagining how a CDATA section can have
> semantic meaning in all but the most abusive ways.  (Hmmm, there's a CDATA
> section.  Fire up the pizza delivery DLL.)  Could you give an example?
> Thanks.

Not necessarily a semantic,  but certainly an *intent*. The example given
earlier was pretty good. You're writing a tutorial on HTML or some
programming language, and as a convention, and as a convenience, you put
all examples in CDATA sections. This makes it easy to edit *and* easy
to extract your examples.

Like Lauren, I am not saying whether I think CDATA sections are necessary
or not, simply that some people really do want them.
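
To make the "easy to extract" point concrete, here is a minimal sketch
that collects the text of every CDATA section in a document. It uses
the LexicalHandler callbacks (startCDATA/endCDATA) as they eventually
shipped in SAX2, which may differ from the draft under discussion in
this thread. The class name CdataExamples and the method extract are
hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: returns the contents of each CDATA section, in
// document order, on the assumption that examples live in CDATA sections.
final class CdataExamples {
    static List<String> extract(String xml) {
        final List<String> examples = new ArrayList<>();
        DefaultHandler2 handler = new DefaultHandler2() {
            private StringBuilder current;  // non-null only inside CDATA

            @Override public void startCDATA() {
                current = new StringBuilder();
            }
            @Override public void endCDATA() {
                examples.add(current.toString());
                current = null;
            }
            @Override public void characters(char[] ch, int start, int length) {
                if (current != null) {
                    current.append(ch, start, length);
                }
            }
        };
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setContentHandler(handler);
            // Standard SAX2 property for registering a LexicalHandler.
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", handler);
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return examples;
    }
}
```

A processor that discards the CDATA boundaries would make this
extraction impossible without some other marker, such as a dedicated
element.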




From david at megginson.com  Thu Mar 25 21:35:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:32 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <199903252105.QAA18751@hesketh.net>
References: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1>
	<199903252105.QAA18751@hesketh.net>
Message-ID: <14074.43833.458137.700586@localhost.localdomain>

Simon St.Laurent writes:

 > One of these days I'd really love to stop talking about what is and isn't
 > XML, though I know it's fun, and start talking about what we can do with
 > XML and XML-like structures, whether they are SAX event flows, DOM trees,
 > or binary formats that build on an XML foundation.
 > 
 > We might even get some real work done - and it might even be fun.

Nah, we're getting work done already -- we need to goof off once in a
while.

Here's my translation of the above paragraph:

 One of these days I'd really love to stop talking about what is and
 isn't XML, though I know it's fun, and start talking about what we
 can do with structured documents, whether they're in text format (as
 XML, HTML, SGML, etc.), in binary format, in databases, or available
 through abstract interfaces like SAX, the DOM, and Groves.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From elharo at metalab.unc.edu  Thu Mar 25 21:36:08 1999
From: elharo at metalab.unc.edu (Elliotte Harold - java FAQ)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <14074.17776.784121.47587@localhost.localdomain>
Message-ID: <Pine.GSO.4.05.9903251628060.543-100000@titan.oit.unc.edu>


I haven't been paying too much attention to SAX, but today I was sitting
in my office waiting for students to drop by. They never do, except right
before and after exams, so I was a little bored and started reading
threads I'd normally filter, and I noticed something:


> 
> public interface DTDDeclHandler
> {
>     public final static int ATTRIBUTE_DEFAULTED = 1;
>     public final static int ATTRIBUTE_IMPLIED = 2;
>     public final static int ATTRIBUTE_REQUIRED = 3;
>     public final static int ATTRIBUTE_FIXED = 4;
> 

How committed are you to using integer constants? I know this is common,
but it tends to lend itself to bad code. Some people prefer a solution
like this:

public class AttributeStatus {

  public final static AttributeStatus ATTRIBUTE_DEFAULTED = 
   new AttributeStatus();
  public final static AttributeStatus ATTRIBUTE_IMPLIED =
   new AttributeStatus();
  public final static AttributeStatus ATTRIBUTE_FIXED =   
   new AttributeStatus();
  public final static AttributeStatus ATTRIBUTE_REQUIRED =   
   new AttributeStatus();

  private AttributeStatus() {}

}

This creates the four mnemonic constants you want and gives them a checkable
type.  New constants can't be created because of the private constructor.
And there's no chance that anybody's going to write code like

  if (getAttributeStatus() == 1) {
   doSomething();
  }

Programmers are more or less forced to use the constants. What do you
think?
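
A slightly fuller sketch of the same pattern, showing what the calling
side looks like. The label field, the toString override, and the
AttributeReport class are additions for illustration only; they are
not in the proposal above.

```java
// Typesafe-constant pattern: the private constructor means these four
// instances are the only AttributeStatus values that can ever exist.
final class AttributeStatus {
    static final AttributeStatus ATTRIBUTE_DEFAULTED =
        new AttributeStatus("ATTRIBUTE_DEFAULTED");
    static final AttributeStatus ATTRIBUTE_IMPLIED =
        new AttributeStatus("ATTRIBUTE_IMPLIED");
    static final AttributeStatus ATTRIBUTE_FIXED =
        new AttributeStatus("ATTRIBUTE_FIXED");
    static final AttributeStatus ATTRIBUTE_REQUIRED =
        new AttributeStatus("ATTRIBUTE_REQUIRED");

    private final String label;  // illustrative addition, used for printing

    private AttributeStatus(String label) {
        this.label = label;
    }

    @Override public String toString() {
        return label;
    }
}

// Hypothetical caller: identity comparison is safe because no other
// instances can be constructed.
final class AttributeReport {
    static String describe(AttributeStatus status) {
        if (status == AttributeStatus.ATTRIBUTE_REQUIRED) {
            return "must be specified";
        }
        if (status == AttributeStatus.ATTRIBUTE_IMPLIED) {
            return "may be omitted";
        }
        if (status == AttributeStatus.ATTRIBUTE_FIXED) {
            return "always has the declared value";
        }
        return "falls back to the declared default";
    }
}
```

A call such as AttributeReport.describe(1) is rejected by the compiler,
which is the whole point of the pattern.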

--
Elliotte Rusty Harold
elharo@metalab.unc.edu




From david at megginson.com  Thu Mar 25 21:47:36 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:32 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <Pine.GSO.4.05.9903251628060.543-100000@titan.oit.unc.edu>
References: <14074.17776.784121.47587@localhost.localdomain>
	<Pine.GSO.4.05.9903251628060.543-100000@titan.oit.unc.edu>
Message-ID: <14074.44709.409452.525331@localhost.localdomain>

Elliotte Harold - java FAQ writes:

 > How committed are you to using integer constants? I know this is common,
 > but it tends to lend itself to bad code. Some people prefer a solution
 > like this:
 > 
 > public class AttributeStatus {
 > 
 >   public final static AttributeStatus ATTRIBUTE_DEFAULTED = 
 >    new AttributeStatus();
 >   public final static AttributeStatus ATTRIBUTE_IMPLIED =
 >    new AttributeStatus();
 >   public final static AttributeStatus ATTRIBUTE_FIXED =   
 >    new AttributeStatus();
 >   public final static AttributeStatus ATTRIBUTE_REQUIRED =   
 >    new AttributeStatus();
 > 
 >   private AttributeStatus() {}
 > 
 > }

Yes, I do this all the time in my own Java code (the lack of enum in
Java is a serious design flaw, as I've been arguing for a few years
now), but I'm strongly committed to keeping SAX as small and simple as
possible.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From sdw at lig.net  Thu Mar 25 21:49:13 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:32 2004
Subject: Is there anyone working on a binary version of XML?
References: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1>
Message-ID: <36FAB727.905F4E2C@lig.net>

Let me clarify.  I want to create a new standard, very closely related to XML and tracking and
dependent on it, but using "binary" data structures.  What I really mean by "binary" is that
instead of a stream of characters that have structured meaning after parsing and
transformation to an internal datastructure, I want a data format that encodes an equivalent
data structure directly.  I have designed something that is directly usable in memory, as
loaded, with a DOM interface, or SAX, etc., only much more efficiently than starting from
XML.  The design I have in mind was controlled by an optimization process with constraints of
standard Java capabilities.

I think a reasonable name for this project would be: bXML or XMLb (probably the latter).

Thanks
sdw

"DuCharme, Robert" wrote:

> >I know, I know, this is anathema to what many of you feel is the
> >essence of XML, and I agree to a point.
>
> It's not so much about feelings, as about contradicting the XML spec.
>
> From 1. Introduction (http://www.w3.org/TR/REC-xml#sec-intro):
>
> "XML documents are made up of storage units called entities, which
> contain either parsed or unparsed data. Parsed data is made up of
> characters, some of which form character data, and some of which form
> markup."
>
> ("characters" there links to http://www.w3.org/TR/REC-xml#dt-character:)
>
> "A parsed entity contains text, a sequence of characters, which may
> represent markup or character data. A character is an atomic unit of
> text as specified by ISO/IEC 10646 [ISO/IEC 10646]. Legal characters are
> tab, carriage return, line feed, and the legal graphic characters of
> Unicode and ISO/IEC 10646."
>
> Applying XML concepts to a binary data format sounds interesting and
> potentially useful, but it wouldn't be XML.
>
> Bob DuCharme       www.snee.com/bob       <bob@
> snee.com>  see www.snee.com/bob/xmlann for "XML:
> The Annotated Specification" from Prentice Hall.
>

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From jonathan at texcel.no  Thu Mar 25 21:51:23 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:32 2004
Subject: Whence XQL?
In-Reply-To: <000b01be7702$102babd0$0100007f@eps.inso.com>
References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
Message-ID: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>

At 02:19 PM 3/25/99 -0500, Gavin Thomas Nicol wrote:
 
>I wouldn't bet my farm on that proposal. Folk at QL'98, both database
>and IR, had serious issues with it.

Frankly, I don't know of anything that has been proposed to the XML or web
communities that hasn't found its critics. XQL has found both avid fans and
strong critics. Since you make no specific technical claims here, it is
hard to answer what you say with concrete information, but perhaps I can make some
broad statements that address what you are implying here.

Both database and IR people made contact with me at QL'98, showing interest
and appreciation, and we have been in active and enthusiastic
correspondence ever since.

XQL has been more widely implemented than any other XML query language (I
just posted information on six implementations today), and it is closely
related to XSL Patterns.

The main criticism from database folks was that they wanted to see joins
and transformations in XQL. Peter Fankhauser has proposed extensions to XQL
for joins. Declarative transformations are, of course, very useful, but XSL
can also be used for transformations. One of the big reasons for leaving
joins and transformations out of the first version was to make
implementation simple - which is why there are quite a few implementations
of XQL. I suspect that there will be later versions of XQL that include at
least joins; I'm less certain about declarative transformations, since XSL
already exists and can do transformations, but I do really like declarative
transformations.

At least one IR person criticized XQL for doing too much, e.g. for having the
parent/child relationship in addition to the ancestor/descendant
relationship. This does, in fact, increase the complexity of
implementation, but offers a distinction that I find important.
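
To show why the distinction matters, here is a minimal sketch that
counts matches for a child-only path and for a descendant path over the
same document. It uses XPath from the standard Java library as a
stand-in, since no XQL implementation is assumed to be available; the
class name AxisComparison and the sample document are hypothetical.

```java
import java.io.StringReader;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper: counts the nodes selected by a path expression.
// XPath is used here only as a stand-in for XQL.
final class AxisComparison {
    static int count(String xml, String path) {
        try {
            XPath xpath = XPathFactory.newInstance().newXPath();
            NodeList hits = (NodeList) xpath.evaluate(
                path,
                new InputSource(new StringReader(xml)),
                XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new IllegalStateException("could not evaluate " + path, e);
        }
    }
}
```

Given <book><title>T</title><chapter><title>C</title></chapter></book>,
the path /book/title selects only the book's own title, while
/book//title also selects the chapter title. Without the parent/child
step there would be no direct way to ask for the first without also
getting the second.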

The number of implementations of XQL shows that there's a fair amount of
interest in it. People who have demonstrated it at trade shows send me
email telling me how impressed people are - for instance, I have been
getting email from Software AG, which is showing XQL at CeBIT this week and
getting very enthusiastic responses. When I discuss XQL at trade shows, I
get enthusiastic responses. So the fact that there are also critics doesn't
bother me.

If you want to implement a query language today, for reasonable effort, and
you want to use a language that has been implemented in other software
systems, I think XQL is a very good choice. There will be a W3C XML Query
Language Activity, and it will develop its own query language, and nobody
can say how similar or different it will be to any existing query language
for XML. I'm sure there will be a lot of interesting and creative work done
by the bright people who will be involved in that group - if you can afford
to wait a year to implement a query language, then by all means wait for
that language to be developed.

Jonathan
 
jonathan@texcel.no
Texcel Research
http://www.texcel.no



From stark at uplanet.com  Thu Mar 25 21:58:07 1999
From: stark at uplanet.com (Peter Stark)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <Pine.GHP.4.02A.9903252043140.4008-100000@mail.ilrt.bris.ac.uk>
Message-ID: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com>

Not only "attempts to define".

The "binary XML" defined by the WAP Forum is a format for tokenized XML.
It's supported by cellular phones with WAP browsers, e.g.
http://www.nokia.com/phones/7110/index.html. Element and attribute names are
replaced by binary values to make parsing cheaper in the client. It does not,
however, support all XML features. For example, XML namespaces are not
supported.
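
The general idea of replacing names with small binary values can be
sketched as a token table. This is only an illustration of the
technique and does not follow the actual WAP binary XML wire format;
the class NameTokenTable and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical token table: each distinct element or attribute name is
// assigned the next free integer code the first time it is seen.
final class NameTokenTable {
    private final Map<String, Integer> codes = new HashMap<>();
    private final List<String> names = new ArrayList<>();

    // Returns the code for a name, assigning a new one if needed.
    int encode(String name) {
        Integer code = codes.get(name);
        if (code == null) {
            code = names.size();
            codes.put(name, code);
            names.add(name);
        }
        return code;
    }

    // Maps a code back to its name; the decoder needs the same table.
    String decode(int code) {
        return names.get(code);
    }
}
```

The client then compares integers instead of scanning and matching
tag-name strings, which is where the parsing saving comes from.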

You can read more about WAP at:
http://www.uplanet.com/pub/111398_WAP_V1whitepaper.pdf

Peter Stark

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Dan Brickley
> Sent: Thursday, March 25, 1999 12:54 PM
> To: 'xml-dev@ic.ac.uk'
> Subject: RE: Is there anyone working on a binary version of XML?
>
>
>
> On Thu, 25 Mar 1999, DuCharme, Robert wrote:
>
> > >I know, I know, this is anathema to what many of you feel is the
> > >essence of XML, and I agree to a point.
> >
> > It's not so much about feelings, as about contradicting the XML spec.
>
> Quite so. But there are still initiatives such as
>
> 	http://www.wapforum.org/docs/technical.htm
> 	http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf
>
> which attempts to define a 'compact binary representation of XML'.
> (If going down that route, I'd rather have a compact binary
> representation of whatever it was that I'm representing in XML, rather
> than of the XML that I might've used as a textual representation of the
> data... But then that really wouldn't be XML.)
>
> Dan




From simonstl at simonstl.com  Thu Mar 25 22:02:12 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <14074.43833.458137.700586@localhost.localdomain>
References: <199903252105.QAA18751@hesketh.net>
 <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1>
 <199903252105.QAA18751@hesketh.net>
Message-ID: <199903252201.RAA19712@hesketh.net>

At 04:35 PM 3/25/99 -0500, David Megginson wrote:
>Simon St.Laurent writes:
>
> > One of these days I'd really love to stop talking about what is and isn't
> > XML, though I know it's fun, and start talking about what we can do with
> > XML and XML-like structures, whether they are SAX event flows, DOM trees,
> > or binary formats that build on an XML foundation.
> > 
> > We might even get some real work done - and it might even be fun.
>
>Nah, we're getting work done already -- we need to goof off once in a
>while.
>
>Here's my translation of the above paragraph:
>
> One of these days I'd really love to stop talking about what is and
> isn't XML, though I know it's fun, and start talking about what we
> can do with structured documents, whether they're in text format (as
> XML, HTML, SGML, etc.), in binary format, in databases, or available
> through abstract interfaces like SAX, the DOM, and Groves.

That translation's fine by me!  I just don't want people getting shut down
because their project "isn't XML".

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From b.laforge at jxml.com  Thu Mar 25 22:03:03 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:33 2004
Subject: SAX2 RFD: LexicalHandler draft v.1.1
Message-ID: <017c01be770b$f7f189e0$c8a8a8c0@thing1>

From: Lars Marius Garshol <larsga@ifi.uio.no>
>| Frankly, I would love to see the design process for MDSAX2 as open
>| as SAX.
>
>Then let's start it here once SAX2 is out the door.  For me, that
>means when I've released the Python version of SAX2.  If SAX2 doesn't
>provide all I want with regard to filters I'll be very interested in
>working on a design that does, for implementation in Python.


After April would be best for me. Till then I'm pretty tied up.

I see almost all of the MDSAX interfaces being replaced by SAX2.
And I assure you, I plan to drop the current requirement of having
a setParser method from MDSAX2--a bad design decision on
my part is what caused it, and that created quite a few problems
in turn.

Another problem was in the incompleteness of the AttributeList api.
Hard to make extensions to. And no way to add/update attributes
in a filter because of the incompleteness of the api. Typically you
check for the use of a known implementation and if it isn't being used,
replace the whole attribute list. And that really gets messy if you are
also trying to include extensions on the attributes!
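
The workaround described above looks roughly like this. A minimal
sketch, assuming the filter wants to set one attribute before passing
the list on: it uses the SAX1 AttributeList interface and the
AttributeListImpl helper class (both deprecated in later SAX releases).
The class name AttributeFilterSketch and the method withAttribute are
hypothetical.

```java
import org.xml.sax.AttributeList;
import org.xml.sax.helpers.AttributeListImpl;

// Hypothetical filter helper: adds or updates one attribute.
final class AttributeFilterSketch {
    static AttributeList withAttribute(AttributeList atts,
                                       String name,
                                       String type,
                                       String value) {
        AttributeListImpl mutable;
        if (atts instanceof AttributeListImpl) {
            // Known implementation: we can modify it in place.
            mutable = (AttributeListImpl) atts;
        } else {
            // Unknown implementation: the interface is read-only, so the
            // whole list has to be copied into a new one.
            mutable = new AttributeListImpl(atts);
        }
        mutable.removeAttribute(name);  // no-op if the name is absent
        mutable.addAttribute(name, type, value);
        return mutable;
    }
}
```

Note that the copy keeps only what the AttributeList interface exposes
(name, type, value), so any extension data carried by the original
implementation is lost at that point.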

One new problem for MDSAX being introduced by SAX2 is when 
parser events are being routed between subfilters. These subfilters
may all need to be aware of application events, in contrast to a filter
stack where application events are handled by successive filters.

Bill




From sdw at lig.net  Thu Mar 25 22:04:36 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
References: <199903252105.QAA18751@hesketh.net>
Message-ID: <36FABAAB.845C90DA@lig.net>


"Simon St.Laurent" wrote:

> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
> >>I know, I know, this is anathema to what many of you feel is the
> >>essence of XML, and I agree to a point.
> >
> >It's not so much about feelings, as about contradicting the XML spec.
> >
> >[...]
> >
> >Applying XML concepts to a binary data format sounds interesting and
> >potentially useful, but it wouldn't be XML.
>
> One of these days I'd really love to stop talking about what is and isn't
> XML, though I know it's fun, and start talking about what we can do with
> XML and XML-like structures, whether they are SAX event flows, DOM trees,
> or binary formats that build on an XML foundation.
>
> We might even get some real work done - and it might even be fun.

I agree with the sentiment, Simon.

I'm required (or am requiring myself) to get a lot of real work done very quickly in the next
6 months, hence my focus...

Semantically, I am talking about using XML.  After parsing and creating a DOM tree or SAX
events, you no longer have XML but a data structure semantically equivalent to an XML
document.  Another way to think about what I'm proposing is that it is a cache of the data
structures produced from processing an XML document, cast in an openly documented data
structure that is already flattened and ready for IO.

In fact, this is how I arrived at this design after following a few other design constraints
and observations.  Of course from there it is a short step to say that you can throw away the
'external' XML representation if you can recreate it from XMLb.

My scheme makes parsing of XML a non-issue.  If I only have that advantage within my closed
system, so be it; converting to and from XML for external purposes is in fact what I intend to
do.

In my case, I'm architecting a high speed clustering system, primarily targeted at Linux/Unix
and Java.  In this kind of system of course you are splitting applications into many servers.
Of course the communication between those nodes is really internal application communication,
the equivalent of that DOM tree, so it makes sense to optimize it.  Think of it this way:
you'd seldom design a large app where every method needs to parse the XML text block passed to
it to get a DOM tree (or SAX events) if the calling method has a DOM tree that it could just
pass.
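
To make the "flattened and ready for IO" idea concrete, here is a minimal
sketch.  It is not the XMLb design, whose details are not given in this
message; the length-prefixed record layout, the (tag, attrs, text, children)
tuple shape and the names flatten and unflatten are all assumptions made for
illustration.

```python
import struct


def _pack_str(text):
    """Encode a string as a 4-byte big-endian length plus UTF-8 bytes."""
    data = text.encode("utf-8")
    return struct.pack(">I", len(data)) + data


def _unpack_str(buf, pos):
    """Decode one length-prefixed string; return (string, next offset)."""
    (length,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    return buf[pos:pos + length].decode("utf-8"), pos + length


def flatten(node):
    """Serialise a (tag, attrs, text, children) tuple to bytes."""
    tag, attrs, text, children = node
    out = [_pack_str(tag), struct.pack(">I", len(attrs))]
    for name, value in attrs.items():
        out.append(_pack_str(name))
        out.append(_pack_str(value))
    out.append(_pack_str(text))
    out.append(struct.pack(">I", len(children)))
    for child in children:
        out.append(flatten(child))
    return b"".join(out)


def unflatten(buf, pos=0):
    """Rebuild the tuple from bytes; return (node, next offset)."""
    tag, pos = _unpack_str(buf, pos)
    (n_attrs,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    attrs = {}
    for _ in range(n_attrs):
        name, pos = _unpack_str(buf, pos)
        value, pos = _unpack_str(buf, pos)
        attrs[name] = value
    text, pos = _unpack_str(buf, pos)
    (n_children,) = struct.unpack_from(">I", buf, pos)
    pos += 4
    children = []
    for _ in range(n_children):
        child, pos = unflatten(buf, pos)
        children.append(child)
    return (tag, attrs, text, children), pos
```

The reader never scans for markup characters or resolves entities; it only
follows length fields.  This sketch still decodes into new objects on load,
so it shows the "no text parsing" half of the idea rather than a structure
that is usable in memory exactly as loaded.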

sdw

> Simon St.Laurent
> XML: A Primer
> Sharing Bandwidth / Cookies
> http://www.simonstl.com

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From larsga at ifi.uio.no  Thu Mar 25 22:05:18 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:33 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <14074.17776.784121.47587@localhost.localdomain>
References: <14074.17776.784121.47587@localhost.localdomain>
Message-ID: <wk7ls5dy59.fsf@ifi.uio.no>


* David Megginson
|
| I'm still shying away from reporting element-type declarations, at
| least until someone shows me an easy and concise way of doing it (in
| AElfred, I simply provided the content model as a fully-normalised
| string).

This is difficult in Java, mainly because of a gross deficiency in the
language: the difficulties of representing general nested list
structures in memory. Over-emphasis on objects has some ugly
side-effects.  I think this would be easier in C, even. (Arrays and
unions should do it.)

xmlproc uses Python lists and tuples to do this:

<URL: http://www.stud.ifi.uio.no/~larsga/download/python/xml/xmlproc-dtd-doco.html#ElementType>

and similarly easy solutions are easily imaginable in other scripting
languages, as well as industrial-strength ones like Common Lisp.

For Java I suppose the string solution is the most natural one.  I
don't think that approach will be chosen in the Python version, though.
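
A small sketch of the nested list/tuple idea, for a declaration such as
<!ELEMENT doc (title, (para | list)*)>.  The tuple layout below is
invented for this example and is not necessarily the one xmlproc uses
(see the URL above for that); to_string and element_names are
hypothetical helper names.

```python
# A leaf is (element_name, occurrence); a group is
# (separator, items, occurrence), where separator is "," or "|" and
# occurrence is "", "?", "*" or "+".
content_model = (
    ",",
    [("title", ""), ("|", [("para", ""), ("list", "")], "*")],
    "",
)


def to_string(model):
    """Render the nested structure as a normalised content-model string."""
    if len(model) == 2:  # leaf: (name, occurrence)
        name, occurrence = model
        return name + occurrence
    separator, items, occurrence = model
    joined = separator.join(to_string(item) for item in items)
    return "(" + joined + ")" + occurrence


def element_names(model):
    """Collect element names by walking the structure, no parsing needed."""
    if len(model) == 2:
        return [model[0]]
    names = []
    for item in model[1]:
        names.extend(element_names(item))
    return names
```

The string form can always be produced from the nested form, as
to_string shows, while going the other way means writing a small parser
for content models.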
 
Also, if element declarations are included, I suppose notations should
be, too. Shouldn't be very hard, and I think the benefits are great
enough that both should be included. This should be enough to present
a SAX 1.0-like view of DTDs, more or less without lexical information,
and still be simple enough to warrant the name SAX.

| public interface DTDDeclHandler
| {
|     public final static int ATTRIBUTE_DEFAULTED = 1;
|     public final static int ATTRIBUTE_IMPLIED = 2;
|     public final static int ATTRIBUTE_REQUIRED = 3;
|     public final static int ATTRIBUTE_FIXED = 4;
| 
|     public abstract void attributeDecl (String element,
| 					String name,
| 					String type,

Here we need some convention for representing enumerations.
"ENUMERATION" will probably do. :)
 
|     public abstract void externalEntityDecl (String name,
| 					     boolean isParameterEntity,
| 					     String publicId,
| 					     String systemId)
| 	throws SAXException;

I think it would be more natural to have separate callbacks for
parameter entities. It makes the interface grow, but I think it is
more intuitive to learn (the first look at the javadoc shows how it
works, you don't have to study the parameters in detail to figure it
out) and also more natural to use. 
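
As a rough sketch of the two styles (in Python rather than Java, and
with made-up names throughout), a flag-style event can be mapped onto
separate callbacks like this:

```python
class RecordingDeclHandler:
    """Hypothetical handler with one callback per kind of entity."""

    def __init__(self):
        self.calls = []

    def external_entity_decl(self, name, public_id, system_id):
        # General entity, referenced in content as &name;
        self.calls.append(("general", name, public_id, system_id))

    def external_parameter_entity_decl(self, name, public_id, system_id):
        # Parameter entity, referenced in the DTD as %name;
        self.calls.append(("parameter", name, public_id, system_id))


def dispatch_external_entity(handler, name, is_parameter_entity,
                             public_id, system_id):
    """Adapt the single flag-carrying event to the two callbacks."""
    if is_parameter_entity:
        handler.external_parameter_entity_decl(name, public_id, system_id)
    else:
        handler.external_entity_decl(name, public_id, system_id)
```

With separate callbacks the branch on the flag lives in one adapter
instead of in every handler that cares about the difference.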
 
|     public abstract void internalEntityDecl (String name,
| 					     boolean isParameterEntity,
| 					     String value)
| 	throws SAXException;

Should value be named 'replacementText', just to make it clearer what
it is?

--Lars M.




From sdw at lig.net  Thu Mar 25 22:10:08 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:33 2004
Subject: Whence XQL?
References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
Message-ID: <36FABBFE.9CB29BFA@lig.net>

Could you please recommend an Open Source project that I could use as a base and contribute
to?  Java preferably; something with indexing of multiple documents would be great.  Full-tag,
full-text would be very interesting.

I may be able to have one or more people work on this if it could have basic functionality
shortly.  I have some of my own ideas, of course....

sdw

Jonathan Robie wrote:

> At 02:19 PM 3/25/99 -0500, Gavin Thomas Nicol wrote:
>
> >I wouldn't bet my farm on that proposal. Folk at QL'98, both database
> >and IR, had serious issues with it.
>
> Frankly, I don't know of anything that has been proposed to the XML or web
> communities that hasn't found its critics. XQL has found both avid fans and
> strong critics. Since you make no specific technical claims here, it is
> hard to dismiss what you say with information, but perhaps I can make some
> broad statements that address what you are implying here.
>
> Both database and IR people made contact with me at QL'98, showing interest
> and appreciation, and we have been in active and enthusiastic
> correspondence ever since.
>
> XQL has been more widely implemented than any other XML query language (I
> just posted information on six implementations today), and it is closely
> related to XSL Patterns.
>
> The main criticism from database folks was that they wanted to see joins
> and transformations in XQL. Peter Fankhauser has proposed extensions to XQL
> for joins. Declarative transformations are, of course, very useful, but XSL
> can also be used for transformations. One of the big reasons for leaving
> joins and transformations out of the first version was to make
> implementation simple - which is why there are quite a few implementations
> of XQL. I suspect that there will be later versions of XQL that include at
> least joins; I'm less certain about declarative transformations, since XSL
> already exists and can do transformations, but I do really like declarative
> transformations.
>
> At least one IR person criticized XQL for doing too much, eg for having the
> parent/child relationship in addition to the ancestor/descendant
> relationship. This does, in fact, increase the complexity of
> implementation, but offers a distinction that I find important.
>
> The number of implementations of XQL shows that there's a fair amount of
> interest in it. People who have demonstrated it at trade shows send me
> email telling me how impressed people are - for instance, I have been
> getting email from Software AG, which is showing XQL at CeBIT this week and
> getting very enthusiastic responses. When I discuss XQL at trade shows, I
> get enthusiastic responses. So the fact that there are also critics doesn't
> bother me.
>
> If you want to implement a query language today, for reasonable effort, and
> you want to use a language that has been implemented in other software
> systems, I think XQL is a very good choice. There will be a W3C XML Query
> Language Activity, and it will develop its own query language, and nobody
> can say how similar or different it will be to any existing query language
> for XML. I'm sure there will be a lot of interesting and creative work done
> by the bright people who will be involved in that group - if you can afford
> to wait a year to implement a query language, then by all means wait for
> that language to be developed.
>
> Jonathan
>
> jonathan@texcel.no
> Texcel Research
> http://www.texcel.no

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From jamesr at steptwo.com.au  Thu Mar 25 22:22:32 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:33 2004
Subject: XML and (K)Office
In-Reply-To: <14074.8435.653789.348824@localhost.localdomain>
References: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>
 <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>
Message-ID: <4.1.19990326091444.00bac620@steptwo.com.au>

At 21:55 25/03/1999 , David Megginson wrote:

  | [David]
  | 
  |  > > Anyway, let's get this right -- I think that it's healthy for
  |  > > both Gnumeric and the KOffice Spreadsheet program both to exist,
  |  > > but there is no excuse for them to use entirely incompatible
  |  > > formats.  As a matter of fact, if we could convince KDE and Gnome
  |  > > to use compatible XML formats for lots of things (like interface
  |  > > construction), the media's predictions of a Linux fracture will
  |  > > be proven to be hot air.
  | 
  | [Matt]
  | 
  |  > Although I agree to an extent, if they have different feature sets
  |  > it's pretty unlikely that you're going to get an entirely perfect
  |  > agreement on a spreadsheet DTD.
  | 
  | I disagree *very* strongly -- with Namespaces, we can design a common
  | format for the 90% of functionality that the two spreadsheets actually
  | have in common (text cells, data cells, basic formulas, general
  | formatting information [font, alignment, colour, size], etc.)  and
  | then allow each to provide extended information
  | unambiguously-delimited through the use of separate namespaces.
  | 
  | The more material in the common spec, the better interoperability.
  | Linux needs to set an example here.

Why do namespaces help us here?

Using namespaces:

* Breaks validation. We are no longer able to ensure that the
  files we are reading/creating are correct and useful.

* Still leaves the variations between applications, so that a reader
  of a given format still needs to know 100% of what is in that
  format.
Without the rigour of a DTD, we've got nothing.

Particularly since this data may well live long, and is not
some transient "sent over the web" data.

How will future users make sense of the format without
a DTD?

  |  > However, that's the beauty of XML. Writing a converter from one
  |  > format to another is trivial in the extreme, so it's not a huge
  |  > issue in my (humble) opinion.
  | 
  | For n XML-based formats, we need (n * (n - 1)) converters.  If there
  | are only two different XML-based spreadsheet formats, then we need
  | only two converters:
  | 
  |  a => b
  |  b => a
  | 
  | If there are three XML-based different formats, then we need six
  | converters:
  | 
  |  a => b
  |  a => c
  |  b => a
  |  b => c
  |  c => a
  |  c => b

Again, having namespaces doesn't solve this problem. Regardless
of what you call it, if the formats are different, they're different.

But anyway, this reasoning isn't necessarily true. What about:

a => x
b => x
c => x
x => a
x => b
x => c

That is, an intermediate DTD that captures all the usefully
sharable data. For a successful example of this, see the
Rainbow DTDs for word documents.

With n formats this needs 2n converters rather than n * (n - 1),
which greatly reduces the number of conversions as the
number of formats increases.
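
The two counts are easy to compare with a few lines of Python. The
function names are made up for this sketch; the formulas are the
n * (n - 1) quoted above and one converter in each direction per
format for the intermediate-DTD arrangement.

```python
def pairwise_converters(n):
    """Every format converts directly to every other format."""
    return n * (n - 1)


def hub_converters(n):
    """Every format converts to and from one intermediate format x."""
    return 2 * n


def comparison(max_formats):
    """Rows of (formats, pairwise count, hub count) for 2..max_formats."""
    return [
        (n, pairwise_converters(n), hub_converters(n))
        for n in range(2, max_formats + 1)
    ]
```

For two formats the intermediate DTD costs more (4 converters against
2), for three the counts are equal at 6, and from four formats on the
intermediate DTD wins (8 against 12, then 10 against 20).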

Cheers,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From jamesr at steptwo.com.au  Thu Mar 25 22:29:22 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:10:33 2004
Subject: XML and (K)Office
In-Reply-To: <199903251448.JAA00778@ruby.ora.com>
References: <199903242256.XAA04448@sonne.darmstadt.gmd.de>
Message-ID: <4.1.19990326092458.00babc10@steptwo.com.au>

At 00:48 26/03/1999 , Chris Maden wrote:

  | [Ingo Macherius]
  | > David Megginson <david@megginson.com> wrote at 24 Mar 99, 17:02:
  | > 
  | > > There's also a hot rumour [3] that Microsoft has assigned 37
  | > > programmers to work on a Linux port of MS Office.
  | 
  | Some quick research on slashdot shows the rumor's evolution.  The
  | first sighting appears to be on ZDnet; they reported that Simson
  | Garfinkel, a _Boston Globe_ columnist and technology writer, mentioned
  | on a radio show that he was in correspondence with some of the
  | developers.  But even if that's true, I can think of a number of
  | reasons why Microsoft might be doing a port internally with no
  | intentions whatsoever of releasing it.  The ZDnet article notes that
  | Office relies heavily on MS's undocumented Win32 API calls, and just
  | porting the app to the standard API calls which could then be handled
  | in emulation on Linux would be a major chore.  Some URLs:

Isn't this the standard strategy of MS at work?

That is, they leak "rumours" that they are about to release
some wonderful new technology.

Everyone, on the basis of this, holds back on purchasing
or obtaining competing (currently available) technology, on
the understanding that MS will be releasing something soon, which
will become the de facto standard.

Also known as "vapourware".

Just look at their attempts to derail NDS (Novell Directory
Services) using the same strategies.

Just some paranoia in the morning,

James


-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
http://www.steptwo.com.au/
jamesr@steptwo.com.au

"Beyond the Idea"
 ACN 081 019 623



From jborden at mediaone.net  Thu Mar 25 22:30:56 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse>


>Let me clarify.  I want to create a new standard, very closely related to
>XML and tracking and dependent on it, but using "binary" data structures.

    Do you mean like MIME? MIME itself isn't related to XML but does deal
with binary data in a standard fashion, and has the advantage of very
widespread implementation. It is possible to make MIME work nicely with XML
e.g. XMTP (see http://jabr.ne.mediaone.net/documents/xmtp.htm )

>What I really mean by "binary" is that instead of a stream of characters
>that have structured meaning after parsing and transformation to an
>internal datastructure, I want a data format that encodes an equivalent
>data structure directly.  I have designed something that is directly usable
>in memory, as loaded, with a DOM interface, or SAX, etc., only much more
>efficiently than starting from XML.  The design I have in mind was
>controlled by an optimization process with constraints of standard Java
>capabilities.

    One option would be to define standard interfaces for MIME data. Using
property sets and groves it might be possible to define a generic DOM for
MIME. Specific property sets can be developed for arbitrary binary notation
types and such groves would be accessible via interfaces a la the DOM.

Jonathan Borden
http://jabr.ne.mediaone.net




From sblackbu at erols.com  Thu Mar 25 22:48:18 1999
From: sblackbu at erols.com (Samuel R. Blackburn)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <002801be7711$6f8f3c40$0100a8c0@sammy>

XML in its current form cannot handle "binary" data at all.
At best, you would have to convert non-text data to text.
This is usually done via base64.

You could create your own version of XML that could easily
handle non-text data. All you need do is add one attribute to
any XML element that provides the length (in bytes) of the
non-text data. For example:

<GIF bin:length="4096">GIF89a[4090 non-text bytes]</GIF>

The "bin:length" attribute could tell your parser to stop parsing
and store the 4096 bytes following the closing > of the start tag.
After the 4096 bytes have been stored, start parsing again.

The downsides of this approach are:

1) bin:length would have to be agreed on by all binary parsers out there
2) binary XML files cannot be parsed by non-binary aware parsers
(in other words, every parser in the world today)
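
A minimal sketch of how a bin:length-aware reader might handle one such
element follows. It is not a real parser: it assumes the start tag
carries only the bin:length attribute, and the function name
read_binary_element is made up for the example.

```python
import re

# Start tag whose only attribute is bin:length="N" (an assumption of
# this sketch; a real reader would have to cope with other attributes).
START_TAG = re.compile(
    rb'<(?P<name>[A-Za-z_][\w.-]*)\s+bin:length="(?P<length>\d+)"\s*>'
)


def read_binary_element(data, pos=0):
    """Return (element name, raw payload, offset just past the end tag)."""
    match = START_TAG.match(data, pos)
    if match is None:
        raise ValueError("no start tag with bin:length at offset %d" % pos)
    name = match.group("name")
    length = int(match.group("length"))
    start = match.end()
    # Take exactly `length` bytes without looking at them, so the
    # payload may contain '<', '>' or '&' without confusing the reader.
    payload = data[start:start + length]
    if len(payload) != length:
        raise ValueError("payload is shorter than bin:length says")
    end_tag = b"</" + name + b">"
    if not data.startswith(end_tag, start + length):
        raise ValueError("expected end tag after the payload")
    return name.decode("ascii"), payload, start + length + len(end_tag)
```

The length has to be trusted completely: if it is wrong, the reader
resumes parsing in the middle of the binary data.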

HTH,

Sam

-----Original Message-----
From: Stephen D. Williams <sdw@lig.net>
To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
Date: Thursday, March 25, 1999 2:04 PM
Subject: Is there anyone working on a binary version of XML?


>I know, I know, this is anathema to what many of you feel is the essence of
>XML, and I agree to a point.
>I have come to feel however that there is room for a "works-as-if" binary
>analogue to text based XML.  Something that is totally subservient to the
>standard and has exactly equivalent features, but that is highly efficient
>for processing at all levels and easily converted to and from text based
>XML.
>
>In using XML in real-world application work and designing future
>infrastructure that is highly scalable and efficient while making use of
>XML, I have come to the conclusion that I need a standard way to deal with
>an XML analogue that is binary.  There are a multitude of performance
>problems that this solves, not only in parsing and exporting, but
>processing of related data inside applications.
>
>Before I make all the details and ideas public, I would like to know if
>there is any serious precedent directly dealing with XML.
>
>My design has highly efficient Java processing in mind, but is not specific
>to any particular language.
>Compression is a secondary, but associated issue.
>
>Thanks
>sdw
>--
>OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
>sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
>43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From sdw at lig.net  Thu Mar 25 23:00:11 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <36FAC7C1.63406FAD@lig.net>

"Simon St.Laurent" wrote:

> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
> >>I know, I know, this is anathema to what many of you feel is the
> >>essence of XML, and I agree to a point.
> >
> >It's not so much about feelings, as about contradicting the XML spec.
> >
> >[...]
> >
> >Applying XML concepts to a binary data format sounds interesting and
> >potentially useful, but it wouldn't be XML.
>
> One of these days I'd really love to stop talking about what is and isn't
> XML, though I know it's fun, and start talking about what we can do with
> XML and XML-like structures, whether they are SAX event flows, DOM trees,
> or binary formats that build on an XML foundation.
>
> We might even get some real work done - and it might even be fun.

I agree with the sentiment, Simon.

I'm required (or am requiring myself) to get a lot of real work done very quickly in the next
6 months, hence my focus...

Semantically, I am talking about using XML.  After parsing and creating a DOM tree or SAX
events, you no longer have XML but a data structure semantically equivalent to an XML
document.  Another way to think about what I'm proposing is that it is a cache of the data
structures produced from processing an XML document, cast in an openly documented data
structure that is already flattened and ready for IO.

In fact, this is how I arrived at this design after following a few other design constraints
and observations.  Of course from there it is a short step to say that you can throw away the
'external' XML representation if you can recreate it from XMLb.

My scheme makes parsing of XML a non-issue.  If I only have that advantage within my closed
system, so be it; converting to and from XML for external purposes is in fact what I intend to
do.

In my case, I'm architecting a high speed clustering system, primarily targeted at Linux/Unix
and Java.  In this kind of system of course you are splitting applications into many servers.
Of course the communication between those nodes is really internal application communication,
the equivalent of that DOM tree, so it makes sense to optimize it.  Think of it this way:
you'd seldom design a large app where every method needs to parse the XML text block passed to
it to get a DOM tree (or SAX events) if the calling method has a DOM tree that it could just
pass.

sdw

> Simon St.Laurent
> XML: A Primer
> Sharing Bandwidth / Cookies
> http://www.simonstl.com


--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999



From sdw at lig.net  Thu Mar 25 23:06:57 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:33 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse>
Message-ID: <36FAC962.D9F47DB6@lig.net>

No, not really a solution to the problem set I'm trying to solve.

Keep reading...

I'm not trying to create a new way to recognize XML, but a more efficient way to
do all kinds of computer processing and communication with it.  Inordinate
amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
between programs and other programs, databases, file systems, networks, servers,
etc.  XML is a good general solution, but some situations require optimization,
which is what I'm working on.

sdw

Jonathan Borden wrote:

> >Let me clarify.  I want to create a new standard, very closely related to
> >XML and tracking and dependent on it, but using "binary" data structures.
>
>     Do you mean like MIME? MIME itself isn't related to XML but does deal
> with binary data in a standard fashion, and has the advantage of very
> widespread implementation. It is possible to make MIME work nicely with XML
> e.g. XMTP (see http://jabr.ne.mediaone.net/documents/xmtp.htm )
>
> >What I really mean by "binary" is that instead of a stream of characters
> >that have structured meaning after parsing and transformation to an
> >internal datastructure, I want a data format that encodes an equivalent
> >data structure directly.  I have designed something that is directly usable
> >in memory, as loaded, with a DOM interface, or SAX, etc., only much more
> >efficiently than starting from XML.  The design I have in mind was
> >controlled by an optimization process with constraints of standard Java
> >capabilities.
>
>     One option would be to define standard interfaces for MIME data. Using
> property sets and groves it might be possible to define a generic DOM for
> MIME. Specific property sets can be developed for arbitrary binary notation
> types and such groves would be accessible via interfaces a la the DOM.
>
> Jonathan Borden
> http://jabr.ne.mediaone.net

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From sdw at lig.net  Thu Mar 25 23:08:03 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com>
Message-ID: <36FAC986.7E2A7AF8@lig.net>

I'm already taking a look at it, but it doesn't completely address what I'm getting at.  I may
be able to branch from it.

sdw

Peter Stark wrote:

> Not only "attempts to define".
>
> The "binary XML" defined by the WAP Forum is a format for tokenized XML.
> It's supported by cellular phones with WAP browsers, e.g.
> http://www.nokia.com/phones/7110/index.html. Element and attribute names are
> replaced by binary values to make parsing cheaper in the client. It does,
> however, not support all XML features. For example, XML namespaces are not
> supported.
>
> You can read more about WAP at:
> http://www.uplanet.com/pub/111398_WAP_V1whitepaper.pdf
>
> Peter Stark
>
> > -----Original Message-----
> > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> > Dan Brickley
> > Sent: Thursday, March 25, 1999 12:54 PM
> > To: 'xml-dev@ic.ac.uk'
> > Subject: RE: Is there anyone working on a binary version of XML?
> >
> >
> >
> > On Thu, 25 Mar 1999, DuCharme, Robert wrote:
> >
> > > >I know, I know, this is anathema to what many of you feel is the
> > > >essence of XML, and I agree to a point.
> > >
> > > It's not so much about feelings, as about contradicting the XML spec.
> >
> > Quite so. But there are still initiatives such as
> >
> >         http://www.wapforum.org/docs/technical.htm
> >         http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf
> >
> > which attempts to define a 'compact binary representation of XML'.
> > (If going down that route, I'd rather have a compact binary
> > representation of whatever it was that I'm representing in XML, rather
> > than of the XML that I might've used as a textual representation of the
> > data... But then that really wouldn't be XML.)
> >
> > Dan

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999


From sdw at lig.net  Thu Mar 25 23:12:54 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <002801be7711$6f8f3c40$0100a8c0@sammy>
Message-ID: <36FACA9B.4D7CD9B2@lig.net>

This is not what I meant.

XML has mechanisms to store binary data as characters using all the standard
methods.

What I'm talking about is using data that is structured in a directly
addressable way (think pointers, arrays, indexes, offsets) to represent the
structure and content of an XML tree.  My actual proposal is a bit more
complicated than that because I want other types of optimizations for in-memory
processing, but that is one of the roots of the idea.  In other words, after
loading, the tree would be directly addressable (SAX or DOM) without any parsing
(or very limited steps).  A typical server might in fact support both XML and
XMLb queries and responses.
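
As a rough illustration of the "directly addressable" idea (not the actual
proposal, whose details are not given here), a tree can be held in parallel
arrays where nodes refer to each other by integer index instead of by parsed
text.  The names FlatTree, first_child, next_sibling and last_child are
hypothetical and chosen only for this sketch.

```python
from dataclasses import dataclass, field
from typing import Iterator, List

NONE = -1  # sentinel index meaning "no such node"


@dataclass
class FlatTree:
    """Tree stored as parallel arrays; a node is just an integer index."""
    names: List[str] = field(default_factory=list)         # element name per node
    text: List[str] = field(default_factory=list)          # character data per node
    first_child: List[int] = field(default_factory=list)   # index of first child
    next_sibling: List[int] = field(default_factory=list)  # index of next sibling
    last_child: List[int] = field(default_factory=list)    # build-time helper only

    def add(self, name: str, text: str = "", parent: int = NONE) -> int:
        """Append a node and link it as the last child of `parent`."""
        node = len(self.names)
        self.names.append(name)
        self.text.append(text)
        self.first_child.append(NONE)
        self.next_sibling.append(NONE)
        self.last_child.append(NONE)
        if parent != NONE:
            if self.first_child[parent] == NONE:
                self.first_child[parent] = node
            else:
                self.next_sibling[self.last_child[parent]] = node
            self.last_child[parent] = node
        return node

    def children(self, node: int) -> Iterator[int]:
        """Walk the children of `node` by following indexes only."""
        child = self.first_child[node]
        while child != NONE:
            yield child
            child = self.next_sibling[child]


tree = FlatTree()
root = tree.add("order")
tree.add("item", "widget", parent=root)
tree.add("item", "gadget", parent=root)
item_texts = [tree.text[c] for c in tree.children(root)]
```

Because every link is an index, such arrays could in principle be written out
and read back as they are, which is the property the message is after; a
DOM-like or SAX-like interface could then be layered over the indexes.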

sdw

"Samuel R. Blackburn" wrote:

> XML in its current form cannot handle "binary" data at all.
> At best, you would have to convert non-text data to text.
> This is usually done via base64.
>
> You could create your own version of XML that could easily
> handle non-text data. All you need do is add one attribute to
> any XML element that provides the length (in bytes) of the
> non-text data. For example:
>
> <GIF bin:length="4096">GIF89a[4090 non-text bytes]</GIF>
>
> The "bin:length" attribute could tell your parser to stop parsing
> and store the 4096 bytes following the closing > of the element.
> After the 4096 bytes have been stored, start parsing again.
>
> The down side of this approach is:
>
> 1) bin:length would have to be agreed on by all binary parsers out there
> 2) binary XML files cannot be parsed by non-binary aware parsers
> (in other words, every parser in the world today)
>
> HTH,
>
> Sam
>
> -----Original Message-----
> From: Stephen D. Williams <sdw@lig.net>
> To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
> Date: Thursday, March 25, 1999 2:04 PM
> Subject: Is there anyone working on a binary version of XML?
>
> >I know, I know, this is anathema to what many of you feel is the essence of
> >XML, and I agree to a point.
> >I have come to feel however that there is room for a "works-as-if" binary
> >analogue to text based XML.  Something that is totally subservient to the
> >standard and has exactly equivalent features, but that is highly efficient
> >for processing at all levels and easily converted to and from text based
> >XML.
> >
> >In using XML in real-world application work and designing future
> >infrastructure that is highly scalable and efficient while making use of
> >XML, I have come to the conclusion that I need a standard way to deal with
> >an XML analogue that is binary.  There are a multitude of performance
> >problems that this solves, not only in parsing and exporting, but processing
> >of related data inside applications.
> >
> >Before I make all the details and ideas public, I would like to know if
> >there is any serious precedent directly dealing with XML.
> >
> >My design has highly efficient Java processing in mind, but is not specific
> >to any particular language.
> >Compression is a secondary, but associated issue.
> >
> >Thanks
> >sdw
> >--
> >OptimaLogic - Finding Optimal Solutions
> >Web/Crypto/OO/Unix/Comm/Video/DBMS
> >sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect
> >http://sdw.st
> >43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
> >5Jan1999
> >

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999


From paul at prescod.net  Thu Mar 25 23:41:36 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net>
Message-ID: <36FAC72C.2B911970@prescod.net>

"Stephen D. Williams" wrote:
> I'm not trying to create a new way to recognize XML, but a more efficient way to
> do all kinds of computer processing and communication with it.  Inordinate
> amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
> between programs and other programs, databases, file systems, networks, servers,
> etc.  XML is a good general solution, but some situations require optimization
> which is what I'm working on.

I can see many ways that a typical XML document could be optimized for
size if XML compatibility was not a concern. Call it "compressed ML." I am
not clear, however, why CompressedML would need to be binary. There are
many languages where working with binary data is more expensive than
working with text.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

From jborden at mediaone.net  Fri Mar 26 00:05:45 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <002101be771b$583233e0$0b2e249b@fileroom.Synapse>

Stephen D. Williams wrote:

>I'm not trying to create a new way to recognize XML, but a more efficient way to
>do all kinds of computer processing and communication with it.  Inordinate
>amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
>between programs and other programs, databases, file systems, networks, servers,
>etc.  XML is a good general solution, but some situations require optimization
>which is what I'm working on.
>

    Are you talking about using XML, which is text, or binary data, which
isn't? XML itself isn't an interface; rather, it can be used as a data format
to develop interfaces. The DOM is an interface onto XML documents which is
modelled after ... hmmm .... what the XML property set *would* generate if
it existed. This is the XML 'grove'. If you are talking about generating an
API or interface onto binary data which is similar to the DOM, I suggest
that the grove representation would be the most reasonable. The binary data
format's property set (if such exists) would be used to generate a DOM-like
interface onto the binary data.

    This deals with the interface issues on binary data formats. MIME deals
with standardized serialization issues on binary (and other) data formats.

    Perhaps you are proposing a binary data format which is in some fashion
similar to XML? Such a data format would have its own property set, grove
and set of interfaces. It would not be XML. I am suggesting that the use of
property sets, the grove formalism and generated interfaces would be the
most logical mechanism to develop a system designed to be similar to XML yet
deal with binary data formats.


Jonathan Borden
http://jabr.ne.mediaone.net


From Curt.Arnold at hyprotech.com  Fri Mar 26 00:06:37 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <61DAD58E8F4ED211AC8400A0C9B468731AAC87@THOR>

I guess the key thing is what you are trying to communicate.  

If you are primarily dealing with textual information, then the only
transform that would seem to make sense is compression or encryption
(depending on whether you are trying to reduce required bandwidth/disk space
at the expense of processing or trying to hide information).  The event-based
parsers (such as expat) can chew through large files at blinding speed.  The
relative slowness of the DOM-based parsers is primarily due to the expense
of string allocation, and that would not be eliminated if you simply changed
the media.  Neither of those requires anything new from the XML world.

If you were trying to communicate something for which a textual representation
cannot be comprehensible (say, a JPEG image), then trying to use XML at all
is just a poor decision.

The one domain where a binary XML seems useful is where the bulk of the
content is numeric (especially floating point).  In those cases, you would
like to be able (in some circumstances) to transmit floating point numbers
without the loss of precision that comes with a conversion to and from text.


For this to work, you would need a persistence framework that took typed
information and, depending on the archive object that you passed, would create
either a textual XML file or a binary analogue.

My approach to storage was to expand the Microsoft Property Storage
mechanism by generating CRCs for the tag and attribute names to generate
the Property Identifier (32-bit int) and then representing the content in the
appropriate variant (numerics as IEEE format, text in Unicode).
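
A minimal sketch of that idea, under stated assumptions: it does not use the
Microsoft property storage API at all, only the two ingredients named above
(a CRC of the name as a 32-bit identifier, and the value as an IEEE double).
The function names and the 12-byte record layout are hypothetical.

```python
import struct
import zlib


def property_id(name: str) -> int:
    """Map a tag or attribute name to an unsigned 32-bit identifier via CRC-32."""
    return zlib.crc32(name.encode("utf-8")) & 0xFFFFFFFF


def pack_number(name: str, value: float) -> bytes:
    """Encode (id, value) as 12 bytes: big-endian uint32 id, then IEEE 754 double."""
    return struct.pack(">Id", property_id(name), value)


def unpack_number(record: bytes):
    """Decode a 12-byte record back into (id, value)."""
    return struct.unpack(">Id", record)


value = 0.1 + 0.2
record = pack_number("pressure", value)
ident, restored = unpack_number(record)

# For comparison: the value after a trip through a 6-significant-digit text form.
short_text = float("%.6g" % value)
```

The binary record restores the double bit for bit, while the shortened text
form does not.  Two caveats: a text form with enough digits (17 significant
digits for a double) also round-trips, and distinct names can collide under
CRC-32, so a real design would need some way to detect or rule that out.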


From skshirsa at nortelnetworks.com  Fri Mar 26 00:11:45 1999
From: skshirsa at nortelnetworks.com (Shekhar Kshirsager)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net>
Message-ID: <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com>

There are two places where use of XML can be optimized: first, when it is
transferred on the wire, and second, when the program tries to interpret the
XML data using SAX, DOM, etc.
My interpretation is that Stephen is talking about optimizing the process of
interpreting the XML document at the client.
But I'm still not sure what the in-memory representation of bXML will buy us
over DOM.

Thanks,
Shekhar Kshirsagar

----- Original Message -----
From: Paul Prescod <paul@prescod.net>
To: <xml-dev@ic.ac.uk>
Sent: Thursday, March 25, 1999 6:30 PM
Subject: Re: Is there anyone working on a binary version of XML?


> "Stephen D. Williams" wrote:
> > I'm not trying to create a new way to recognize XML, but a more efficient way to
> > do all kinds of computer processing and communication with it.  Inordinate
> > amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
> > between programs and other programs, databases, file systems, networks, servers,
> > etc.  XML is a good general solution, but some situations require optimization
> > which is what I'm working on.
>
> I can see many ways that a typical XML document could be optimized for
> size if XML compatibility was not a concern. Call it "compressed ML." I am
> not clear, however, why CompressedML would need to be binary. There are
> many languages where working with binary data is more expensive than
> working with text.
>
> --
>  Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
>  http://itrc.uwaterloo.ca/~papresco
>
> "Perpetually obsolescing and thus losing all data and programs every 10
> years (the current pattern) is no way to run an information economy or
> a civilization." - Stewart Brand, founder of the Whole Earth Catalog
> http://www.wired.com/news/news/culture/story/10124.html


From jtauber at jtauber.com  Fri Mar 26 01:33:41 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <49092BAEAC84D2119B0600805FD40F9F120EC4@MDYNYCMSX1> <36FAB727.905F4E2C@lig.net>
Message-ID: <006f01be7717$4c809180$0300000a@cygnus.uwa.edu.au>

Have a look at http://www.wapforum.org/ which includes a draft document
specifying a compact binary representation of XML documents.

James


From andrewl at microsoft.com  Fri Mar 26 01:57:36 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:10:34 2004
Subject: XML-QL (was Re: Whence XQL?) (Ok Whither XQL, Dave)
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF1DE@RED-MSG-08>

Information on Microsoft's support of various XML technologies is available
at http://www.microsoft.com/xml

From sdw at lig.net  Fri Mar 26 02:16:37 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:34 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net> <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com>
Message-ID: <36FAEE03.E435EB6D@lig.net>


Imagine that you have all the features of XML: structure, flexibility, common format for
interchange, but that you perform zero processing steps to import or export the 'document'
from a program.  (Actually, I'm thinking this would be done in chunks, but essentially very
few reads and writes.)

Also imagine that when taking an XML 'document' into a program you could search, modify, or
copy the object without generating thousands of object creates and deletes/garbage collection
hits.  I call this last problem a 'malloc storm' and it appears to be one of the worst
problems with a lot of large Java systems.  (I've experienced this problem in C++ programs for
years and Java for the last year.)  Among other things, I'm directly addressing this issue.

Then imagine you can write or communicate the object to other systems simply with IO
operations with no processing involved.  Then imagine that the IO is async and very cheap and
that you are processing thousands of transactions per second, most of which generate
fundamentally few processing steps.

I am and will be, necessarily, revealing some very hard won lessons in optimizing very large
systems as part of my design for this.  I just feel strongly that this step is inevitable at
some point and I want to have the most useful form of it become standard.  As I mentioned, it
should work particularly well with Java's available capabilities, but should be easily usable
by C/C++, etc.

There are several things that could be optimized here.  CPU in application processing, CPU in
overhead, CPU in preparation for IO, size of data in memory, size of data in storage, size of
data in transit, etc.

This method would primarily allow a drastic decrease in CPU for most situations and a slight
decrease in storage with an easy path to more comprehensive compression levels.

I'm going to be studying the existing binary effort and then releasing a few notes on details
of what I'm thinking.  I'll try to get a Java prototype working soon.  It appears that the
best path is to use SAX to generate bXML that will have either a SAX or DOM interface.

Note that the payload data in bXML would still be the same character data that would be in
character areas of a normal XML document (possibly without canonicalizing translations).  When
mentioning 'binary', I simply meant that the structure would be represented by 'binary' data
structures of where to find elements, etc.  In fact it's possible to do this all in
ASCII/Unicode if one desired.  The point is that bXML is not designed to be editable by a text
editor since it has more of a 'structured' layout, sort of like a filesystem.
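
To make the "structured layout" point concrete, here is a small sketch under
my own assumptions (it is not the bXML format, which is not specified here):
the character data stays ordinary UTF-8 text, and only the structure is a
fixed-size binary index of offsets and lengths.  The layout and the names
build and read are hypothetical, and the example is a flat list of elements
rather than a full tree.

```python
import struct

ENTRY = struct.Struct(">IIII")   # name offset, name length, text offset, text length
COUNT = struct.Struct(">I")      # number of entries in the index


def build(elements):
    """Pack (name, text) pairs into one buffer: count, fixed-size index, payload."""
    payload = bytearray()
    index = bytearray()
    for name, text in elements:
        n, t = name.encode("utf-8"), text.encode("utf-8")
        # Offsets are relative to the start of the payload area.
        index += ENTRY.pack(len(payload), len(n), len(payload) + len(n), len(t))
        payload += n + t
    return COUNT.pack(len(elements)) + bytes(index) + bytes(payload)


def read(buf, i):
    """Fetch element i by offset arithmetic alone; earlier elements are not scanned."""
    (count,) = COUNT.unpack_from(buf, 0)
    if not 0 <= i < count:
        raise IndexError(i)
    name_off, name_len, text_off, text_len = ENTRY.unpack_from(
        buf, COUNT.size + i * ENTRY.size)
    base = COUNT.size + count * ENTRY.size   # payload starts right after the index
    name = buf[base + name_off: base + name_off + name_len].decode("utf-8")
    text = buf[base + text_off: base + text_off + text_len].decode("utf-8")
    return name, text


buf = build([("title", "Binary XML"), ("author", "sdw"), ("note", "café")])
third = read(buf, 2)
```

The buffer can be written to a file or socket as it is and read back with the
same offset arithmetic, which is roughly the filesystem-like behaviour
described above.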


One other subject that I haven't mentioned, but need for another architecture that I designed
a while ago, is a mechanism for 'parallel inheritance' overlay tree processing.  Has anyone
else worked on this?  The idea is to have one or more base trees and work with a delta tree
which represents changes from the underlying trees.  This last part is a basic data structure
for a rule engine and metadata application environment I designed last year.
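
One possible reading of the base-tree/delta-tree idea, sketched under heavy
simplification: each tree is modelled as a dictionary keyed by path, lookups
consult the delta first and then fall through to the base trees, and the base
trees are never modified.  The class OverlayTree and the deletion marker are
hypothetical, not part of any design described here.

```python
_DELETED = object()  # marker stored in the delta to hide a value from the bases


class OverlayTree:
    """Path-keyed view in which a delta layer overrides one or more base layers."""

    def __init__(self, *bases):
        self.bases = list(bases)   # read-only base trees, highest priority first
        self.delta = {}            # changes only; the bases are never mutated

    def set(self, path, value):
        """Record a new or changed value in the delta."""
        self.delta[path] = value

    def delete(self, path):
        """Hide a path without touching the base trees."""
        self.delta[path] = _DELETED

    def get(self, path, default=None):
        """Look in the delta first, then in each base tree in order."""
        if path in self.delta:
            value = self.delta[path]
            return default if value is _DELETED else value
        for base in self.bases:
            if path in base:
                return base[path]
        return default


base = {"/config/host": "example.org", "/config/port": "80"}
view = OverlayTree(base)
view.set("/config/port", "8080")
view.delete("/config/host")
```

Here view.get("/config/port") sees the overridden value while base is left
untouched, so several views could share the same base trees.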

I don't mean to be distracting from external XML issues and standards; however, XML is close to
being perfect for use in protocols, APIs, message systems, RPC, etc. vs. DCOM, CORBA
(hopefully this can be resolved), etc.  Web-XML was a good example of this.  It turns out that
for message passing systems in a cluster, you really need to externalize the kinds of
optimizations I'm talking about, vs. something normally internal to a particular SAX/DOM
parser.


sdw

Shekhar Kshirsager wrote:

> There are two places where use of XML can be optimized - one when it is
> transfered on the wire and second when the program tries to interpret the
> XML data using SAX,DOM etc.
> My interpretation is that Stephen is talking about optimizing the process of
> interpreting the XML document at the client.
> But I'm still not sure, what will the in-memory presentation of bXML buy us
> above DOM.
>
> Thanks,
> Shekhar Kshirsagar
>
> ----- Original Message -----
> From: Paul Prescod <paul@prescod.net>
> To: <xml-dev@ic.ac.uk>
> Sent: Thursday, March 25, 1999 6:30 PM
> Subject: Re: Is there anyone working on a binary version of XML?
>
> > "Stephen D. Williams" wrote:
> > > I'm not trying to create a new way to recognize XML, but a more efficient way to
> > > do all kinds of computer processing and communication with it.  Inordinate
> > > amounts of time, money, effort, CPU, and bandwidth are spent at the interfaces
> > > between programs and other programs, databases, file systems, networks, servers,
> > > etc.  XML is a good general solution, but some situations require optimization
> > > which is what I'm working on.
> >
> > I can see many ways that a typical XML document could be optimized for
> > size if XML compatibility was not a concern. Call it "compressed ML." I am
> > not clear, however, why CompressedML would need to be binary. There are
> > many languages where working with binary data is more expensive than
> > working with text.
> >
> > --
> >  Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
> >  http://itrc.uwaterloo.ca/~papresco
> >
> > "Perpetually obsolescing and thus losing all data and programs every 10
> > years (the current pattern) is no way to run an information economy or
> > a civilization." - Stewart Brand, founder of the Whole Earth Catalog
> > http://www.wired.com/news/news/culture/story/10124.html


From jonathan at texcel.no  Fri Mar 26 02:20:39 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:34 2004
Subject: XML-QL (was Re: Whence XQL?) (Ok Whither XQL, Dave)
In-Reply-To: <30649320C177D111ADEC00A024E9F297169FC0@exchange-server.dega.com>
Message-ID: <3.0.3.32.19990325212134.00c48100@pop.mindspring.com>

At 11:01 AM 3/25/99 -0800, Ed Howland wrote:
 
>BTW, how do you get at the XQL part of IE5? I never saw that in MS's
>writeup. Or is it just extensions to their XSL?

See the following URL for details:

http://www.microsoft.com/workshop/xml/xmldom/scriptref/XMLDOMNode_selectNodes.asp

Their documentation calls it "XSL Patterns", but it supports the XQL from
the paper I wrote jointly with webMethods and Microsoft. Here's their
documentation of the patterns they support, with a reference to the XQL paper:

http://www.microsoft.com/workshop/xml/xsl/reference/XSLPatternSyntax.asp

Hope this helps!

Jonathan
 
jonathan@texcel.no
Texcel Research
http://www.texcel.no

From sdw at lig.net  Fri Mar 26 02:29:13 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:34 2004
Subject: Storing Lots of Fiddly Bits (was Re: What is XML for?)
References: <001501be532c$ffeed060$d3228018@jabr.ne.mediaone.net> <370CF210.759B26AB@prescod.net>
Message-ID: <36FAF0FA.123C1F40@lig.net>

I really believe, at the moment, that my ultimate database would have XML-based tables and
allow relational (possibly even SQL) queries along with XQL/XML-QL structured queries.

Normalization could take several forms and allow non-normalized or threshold-normalized forms.

Of course, the SQL-XML view on things would not have full XML structuring capabilities, but it
would provide a great bridge and in fact solve quite a few annoying problems that RDBMSs and
OODBMSs haven't really solved satisfactorily: schema migration and flexible schemas.

I have some ideas, but I'm going through some references to see what I'm missing right now.

sdw

Paul Prescod wrote:

> "Borden, Jonathan" wrote:
> >
> > > 3. Therefore we should pretend that relational databases are really DOM
> > >    trees.
> >
> >         no. if the data is tabular then use a recordset. in the specific cases when
> > 1) we are storing data which is naturally hierarchical. 2) when the data
> > needs to interface with systems which for other reasons employ DOM
> > interfaces
>
> Okay. We can probably all agree with this. If you have software that is
> expecting a DOM and you need to connect it to data that is not XML, you
> need to build a DOM interface. This is a different point of view from
> those who say: "let's build new client software using only the DOM served
> by data with only a DOM interface. The fact that the DOM is standardized
> will just make all of my interoperability problems go away." No way. If
> your client software and your server software had an impedance mismatch,
> slapping a DOM interface on both sides makes it *worse* not better.
>
> > e.g. my XSL processor is built on a DOM interface and I wish to
> > query the database using XQL (which happens to be built into my XSL
> > processor in this example), it is more convenient to interface to the data
> > using DOM interfaces than it is using recordsets (i.e. tabular data).
>
> It's more convenient but it's probably going to run as slow as hell.
> Nobody implements SQL or OQL on top of an industry-standard interface.
> They put it right in the core engine of their database.
>
> >         Arguably, when using an ODBMS this example would be more straightforward
> > (but you picked RDBMS). The problem is that there is no standard, language
> > independent interface onto ODBMS's.
>
> ********** Yes there is! *************
>
> It isn't as widely hyped as XML/DOM. I haven't written a book about it
> (and hardly has anyone else). But the standards *do* exist. Check
> http://www.odmg.org. There are well defined APIs, bindings in a few
> languages, a solid object model and a query language. It's all in there.
>
> My fear is that these technologies will get lost in the XML hype.
>
> > The DOM, while not the perfect interface
> > *is* standard, and this is the big utility.
>
> The DOM is a standard for accessing XML, HTML and CSS information. It
> isn't for modelling arbitrary business objects. It wasn't designed for
> that and it isn't good at that.
>
> >         For example, I get to say (using 'extended DOM'):
> >
> >         NodeList anotherSet = airplanes.selectNodes("airplane[@color='red' and
> > .//screw/thread/@pitch = 64]");
> >
> > to select all red airplanes with screws having a pitch=64...
>
> The DOM is doing essentially nothing here. This imaginary XML query
> language is doing all of the work. But even the XML query language is
> going to make solving your problem harder than OQL would. For instance OQL
> can be statically type checked. XQL cannot, in general, for many subtle
> reasons. OQL can handle mathematical range constraints. OQL has a concept
> of a "stored query" that allows some level of abstraction. OQL has "local
> variables" also for abstraction.
>
> I don't completely follow your examples:
>
> >         XMOP for example (http://jabr.ne.mediaone.net/documents/xmop.htm) is a way
> > to serialize arbitrary COM objects using their typeinfo metadata. XMOP is a
> > layer that can persist objects into either a) a stream (serialization) b)
> > direct-to-DOM. When I attempted to design a direct-to-Recordset persistence
> > interface on XMOP I found that I had to essentially develop a
> > DOM<->Relational mapping. This is because arbitrary objects can be modelled
> > in a hierarchical fashion (e.g. serialized to XML).
>
> This seems like a serialization problem. We all agree that XML is great
> for serialization. If your only goal was to get the data into a "database
> of some kind" then an OO database would have been easier than an XML
> database.
>
> >         In another example, using the medical imaging DICOM protocol (a complex
> > property based protocol) I have developed a mapping to the Microsoft
> > PropertySet format (used with Index Server). This mapping is not clean (at
> > all given the inability to represent certain DICOM structures as
> > PROPVARIANTs). This causes similar problems in mapping the protocol to a
> > relational database (the workaround is to use binary data). Using XML and
> > the DOM was a piece of cake to solve this difficult problem.
>
> I'm not at all clear on how the DOM solved this impedance mismatch.
>
> --
>  Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
>  http://itrc.uwaterloo.ca/~papresco
>
> "Remember, Ginger Rogers did everything that Fred Astaire did,
> but she did it backwards and in high heels."
>                                                --Faith Whittlesey


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From marcelo at mds.rmit.edu.au  Fri Mar 26 02:32:45 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:34 2004
Subject: Whence XQL?
In-Reply-To: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>; from Jonathan Robie on Thu, Mar 25, 1999 at 04:52:17PM -0500
References: <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <000b01be7702$102babd0$0100007f@eps.inso.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
Message-ID: <19990326133124.B7318@io.mds.rmit.edu.au>

On Thu, Mar 25, 1999 at 04:52:17PM -0500, Jonathan Robie wrote:
> At least one IR person criticized XQL for doing too much, eg for
> having the parent/child relationship in addition to the
> ancestor/descendant relationship. This does, in fact, increase the
> complexity of implementation, but offers a distinction that I find
> important.

This is particularly so given the broadening of XML's focus from
documents to documents and data.
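
The distinction shows up as soon as the same element name can occur at more than one depth. A minimal sketch, assuming Python's xml.etree.ElementTree and its limited path syntax as a stand-in for XQL (this is not XQL itself, and the document and element names are invented for illustration):

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <title> occurs both as a direct child of <book>
# and one level further down, inside each <chapter>.
DOC = """
<book>
  <title>Whence XQL?</title>
  <chapter><title>Parents</title></chapter>
  <chapter><title>Ancestors</title></chapter>
</book>
"""

book = ET.fromstring(DOC)

# Parent/child: only <title> elements that are direct children of <book>.
child_titles = [t.text for t in book.findall("title")]

# Ancestor/descendant: every <title> anywhere below <book>, in document order.
descendant_titles = [t.text for t in book.findall(".//title")]

print(child_titles)
print(descendant_titles)
```

For data-style documents the first query is usually the one you want; the second is the classic document-retrieval query.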

> The number of implementations of XQL shows that there's a fair
> amount of interest in it. People who have demonstrated it at trade
> shows send me email telling me how impressed people are - for
> instance, I have been getting email from Software AG, which is
> showing XQL at CeBIT this week and getting very enthusiastic
> responses. When I discuss XQL at trade shows, I get enthusiastic
> responses. So the fact that there are also critics doesn't bother
> me.

I could be disingenuous ( :-) ) and suggest that the attachment to
Microsoft has more than a little to do with its success to date, but I
certainly don't want to disparage the effort in its own right.  It
offers a good compromise between expressivity and simplicity, which is
a far more practicable goal than completeness.

I am concerned (am I right on this?) at the lack of proximity
operators.  But that's just an implementor's perspective, looking at
doing things we already support.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From jonathan at texcel.no  Fri Mar 26 03:08:01 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:35 2004
Subject: Whence XQL?
In-Reply-To: <19990326133124.B7318@io.mds.rmit.edu.au>
References: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
 <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
 <000b01be7702$102babd0$0100007f@eps.inso.com>
 <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
Message-ID: <3.0.3.32.19990325220915.032a1480@pop.mindspring.com>

At 01:31 PM 3/26/99 +1100, Marcelo Cantos wrote:
 
>I could be disingenuous ( :-) ) and suggest that the attachment to
>Microsoft has more than a little to do with its success to date, but I
>certainly don't want to disparage the effort in its own right.  It
>offers a good compromise between expressivity and simplicity, which is
>a far more practicable goal than completeness.

Well, Microsoft was one of the first companies I got interested in XQL ;->

>I am concerned (am I right on this?) at the lack of proximity
>operators.  But that's just an implementor's perspective, looking at
>doing things we already support.

Cool, you work on SIM? (Does that make you a SIMian?) I really enjoyed
talking to Timothy Arnold-Moore at Markup Technologies '98 - Makoto
Murata-san and I managed to snag him after his presentation and grill him
with questions for a while.

I've gone back and forth on proximity operators. Several people who have
implemented full-text search systems have told me that users don't really
use proximity operators, that they are useful in the implementation, but
need not be exposed to the user. Others vehemently disagree. I took the
pragmatic approach of leaving it out to see who would complain. Frankly,
you are the first to do so.

I have discussed proximity searching as a possibility in the following paper:

http://www.w3.org/TandS/QL/QL98/pp/murata-san.html

Here's an excerpt:

<excerpt>

In addition, functions for proximity searching might be useful. The
following returns <LINE> elements in which "rose*" and "sweet*" occur
within 10 words of each other:

LINE[near("rose*", "sweet*", 10)]

This would match lines like these:

<LINE>A rose by any other name would smell as sweet.</LINE>
<LINE>Sweet roses grew along the south side of the fence.</LINE>
<LINE>She rose and smiled sweetly at the purple dwarf under the bucket.</LINE>
<LINE>Say, has anybody seen my Sweet Gypsy Rose?</LINE>

Proximity searching requires some way to indicate how close the strings
must be in order to match. This causes a difficulty when choosing the units
in which proximity is measured. In existing full-text systems, distance is
frequently measured in terms of words, which raises a number of significant
questions regarding internationalization, but is probably an intuitive way
to measure distance for most users.

</excerpt>

I'm not sure whether this is the best approach or not. Do you like this
approach? If not, what approach would you prefer?
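
For anyone who wants to experiment with the semantics, here is a rough sketch of what such a near() function might do. It is only an illustration under stated assumptions: words are runs of ASCII letters and apostrophes, matching is case-insensitive, "*" is a shell-style wildcard, and distance is the difference between word positions. None of these choices come from the XQL proposal, and the word-splitting rule in particular sidesteps the internationalization questions raised in the excerpt.

```python
import re
from fnmatch import fnmatchcase


def near(text, pattern_a, pattern_b, max_words):
    """Return True if a word matching pattern_a and a word matching
    pattern_b occur within max_words word positions of each other."""
    # Assumption: a "word" is a run of ASCII letters or apostrophes.
    words = [w.lower() for w in re.findall(r"[A-Za-z']+", text)]
    hits_a = [i for i, w in enumerate(words) if fnmatchcase(w, pattern_a.lower())]
    hits_b = [i for i, w in enumerate(words) if fnmatchcase(w, pattern_b.lower())]
    # Compare every pair of hit positions; fine for a single line of text.
    return any(abs(i - j) <= max_words for i in hits_a for j in hits_b)


LINES = [
    "A rose by any other name would smell as sweet.",
    "Sweet roses grew along the south side of the fence.",
    "She rose and smiled sweetly at the purple dwarf under the bucket.",
    "Say, has anybody seen my Sweet Gypsy Rose?",
]

matches = [line for line in LINES if near(line, "rose*", "sweet*", 10)]
print(len(matches), "of", len(LINES), "lines match")
```

Measuring in words keeps the function easy to explain to users; measuring in characters or elements would need a different tokenizer but the same pairwise comparison.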

Jonathan


jonathan@texcel.no
Texcel Research
http://www.texcel.no



From david at megginson.com  Fri Mar 26 03:52:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:35 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <000c01be770a$87141090$76c3c6c3@sluk.uplanet.com>
References: <Pine.GHP.4.02A.9903252043140.4008-100000@mail.ilrt.bris.ac.uk>
	<000c01be770a$87141090$76c3c6c3@sluk.uplanet.com>
Message-ID: <14075.1092.778836.678620@localhost.localdomain>

Peter Stark writes:

 > The "binary XML" defined by the WAP Forum is a format for tokenized
 > XML.  It's supported by cellular phones with WAP browsers, e.g.
 > http://www.nokia.com/phones/7110/index.html. Element and attribute
 > names are replaced by binary values to make parsing cheaper in the
 > client. It does, however, not support all XML features. For
 > example, XML namespaces are not supported.

If named elements and attributes are supported, then so are
namespaces; the client just has to do a little more work to find them.
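
A sketch of that "little more work", assuming a parser that reports raw names and hands xmlns declarations through as ordinary attributes (Python's xml.sax does this when namespace processing is left off, which is its default). The class name and the sample document are made up for illustration; a tokenized format would feed the same logic from its own name tables.

```python
import xml.sax


class PrefixResolver(xml.sax.ContentHandler):
    """Resolve element prefixes by hand from xmlns attributes."""

    def __init__(self):
        super().__init__()
        self.scopes = [{}]      # stack of prefix -> URI maps, one per open element
        self.resolved = []      # (namespace URI or None, local name) per element

    def startElement(self, name, attrs):
        scope = dict(self.scopes[-1])          # inherit declarations in scope
        for attr_name in attrs.getNames():
            if attr_name == "xmlns":
                scope[""] = attrs.getValue(attr_name)           # default namespace
            elif attr_name.startswith("xmlns:"):
                scope[attr_name[6:]] = attrs.getValue(attr_name)
        self.scopes.append(scope)
        prefix, _, local = name.rpartition(":")   # "" when there is no prefix
        self.resolved.append((scope.get(prefix), local))

    def endElement(self, name):
        self.scopes.pop()


# Hypothetical document with one prefixed extension element.
DOC = b'<deck xmlns:x="http://example.org/ext"><card/><x:note/></deck>'

resolver = PrefixResolver()
xml.sax.parseString(DOC, resolver)
print(resolver.resolved)
```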


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Fri Mar 26 03:54:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:35 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <wk7ls5dy59.fsf@ifi.uio.no>
References: <14074.17776.784121.47587@localhost.localdomain>
	<wk7ls5dy59.fsf@ifi.uio.no>
Message-ID: <14075.1169.878710.98496@localhost.localdomain>

Lars Marius Garshol writes:

 > Also, if element declarations are included, I suppose notations
 > should be, too.

They're there already in the SAX 1.0 DTDHandler, since XML 1.0
requires processors to report notations.
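
For reference, this is roughly what receiving a notation looks like from the application side. The sketch uses Python's xml.sax, whose DTDHandler mirrors the SAX 1.0 interface (notationDecl with name, public ID and system ID); the Java signature differs only in typing. The sample DTD is invented for illustration.

```python
import io
import xml.sax
from xml.sax.handler import DTDHandler


class NotationCollector(DTDHandler):
    """Record every notation declaration the parser reports."""

    def __init__(self):
        self.notations = []

    def notationDecl(self, name, publicId, systemId):
        self.notations.append((name, publicId, systemId))


# Hypothetical document with a single notation in the internal subset.
DOC = b"""<!DOCTYPE doc [
<!ELEMENT doc EMPTY>
<!NOTATION gif SYSTEM "image/gif">
]>
<doc/>"""

collector = NotationCollector()
parser = xml.sax.make_parser()
parser.setDTDHandler(collector)
parser.parse(io.BytesIO(DOC))
print(collector.notations)
```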


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Fri Mar 26 04:07:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:35 2004
Subject: XML and (K)Office
In-Reply-To: <4.1.19990326091444.00bac620@steptwo.com.au>
References: <5F052F2A01FBD11184F00008C7A4A800022A1714@EUKBANT101>
	<14074.8435.653789.348824@localhost.localdomain>
	<4.1.19990326091444.00bac620@steptwo.com.au>
Message-ID: <14075.1332.510399.82540@localhost.localdomain>

James Robertson writes:

[on using Namespaces in spreadsheet formats]

 > It:
 > 
 > * Breaks validation. We are no longer able to ensure that the
 >   files we are reading/creating are correct and useful.

DTD validation cannot guarantee that a file is correct or useful; it
can only guarantee that it matches a few BNF-like productions (that's
helpful in itself because it allows some code simplification, but not as
much as some people let on).  DTDs are great for guided authoring, but
that's a different area.

Furthermore, Namespaces itself doesn't break DTD validation -- it's a
different layer.  The possibility of receiving unexpected information
does break validation, but it does so with or without namespaces; with 
namespaces, at least, you can clearly distinguish what's been added.

 > * Still has the variations between applications, so that a reader
 >   of a given format still needs to know 100% about what is that
 >   format.

Not at all -- it can use what it understands and apply simple rules to
the rest (ignore it as in RDF, skip to the top level and process the
children, etc.).
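
A minimal sketch of the "use what it understands" rule, assuming a made-up spreadsheet namespace and a vendor extension namespace (both URIs and all element names are hypothetical), and using Python's xml.etree.ElementTree, which exposes qualified names as "{uri}local":

```python
import xml.etree.ElementTree as ET

SHEET_NS = "http://example.org/sheet"   # hypothetical namespace the reader knows

DOC = """
<s:sheet xmlns:s="http://example.org/sheet" xmlns:v="http://example.org/vendor">
  <s:cell ref="A1">42</s:cell>
  <v:macro>recalc()</v:macro>
  <s:cell ref="A2">7</s:cell>
</s:sheet>
"""


def known_cells(xml_text):
    """Collect the cells this reader understands; skip everything else."""
    root = ET.fromstring(xml_text)
    cells = {}
    for child in root:
        if child.tag == "{%s}cell" % SHEET_NS:
            cells[child.get("ref")] = child.text
        # Elements from any other namespace are ignored, not treated as errors.
    return cells


cells = known_cells(DOC)
print(cells)
```

The reader never needs to know what v:macro means; the namespace tells it that the element is somebody else's.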

 > Without the rigour of a DTD, we've got nothing.

DTDs may be rigorous or lax, depending on the designer.  Here's a DTD
for spreadsheets:

  <!ELEMENT spreadsheet (#PCDATA)>

Just dump in the comma-delimited file, and escape any XML delimiters.
Now you have a DTD, and you still have nothing.
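
To underline how little that DTD buys you, here is the whole "converter" as a sketch in Python (the function name is made up; escape() is the standard helper from xml.sax.saxutils):

```python
from xml.sax.saxutils import escape


def csv_to_spreadsheet(csv_text):
    """Wrap raw comma-delimited text in <spreadsheet>, escaping &, < and >."""
    return "<spreadsheet>" + escape(csv_text) + "</spreadsheet>"


doc = csv_to_spreadsheet("item,note\nwidget,5 < 6 & 7 > 3\n")
print(doc)
```

The result is valid against the DTD above, yet a reader still has to parse the commas and guess what the columns mean.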

 > Particularly since this data may well live long, and is not
 > some transient "sent over the web" data.

That means that the format should be well-documented and validatable;
DTDs can help (and it's nice that they work with off-the-shelf tools),
but they're not worth much by themselves.

 > How will future users make sense of the format without
 > a DTD?

I've written dozens (hundreds?) of DTDs and a book on them, so I'm
quite comfortable saying that a DTD does not guarantee that users can
make sense of a format.  It is helpful in many ways, but good
documentation, examples, sample code, etc. are at least as important.

Would you like to code in C++ based only on the BNF for the language?
Of course not.

Is it possible to code in C++ without ever having seen the BNF (or
whatever they use) in the ANSI spec?  Thousands do, some well and some
poorly.

That said, I think that DTDs are wonderfully useful and will be around
for a long time -- I doubt that any other schema standards that come
out will be nearly so light-weight.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jjc at jclark.com  Fri Mar 26 04:26:51 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:10:35 2004
Subject: SAX2: AttributeList2 and EntityRefList
References: <14074.16928.163619.681099@localhost.localdomain>
Message-ID: <36FB08E5.DA7CEDC2@jclark.com>

David Megginson wrote:

> As John Cowan has pointed out, the XML 1.0 REC requires that
> processors report unexpanded entity references, and presumably that
> applies to references in attribute values as well as elsewhere; as a
> result, it is impossible to treat an XML attribute value simply as a
> string.

I'm not seeing this.  All I can find is:

> 4.4.3 Included If Validating
> 
> When an XML processor recognizes a reference to a parsed entity, in order to validate the document, the
> processor must include its replacement text. If the entity is external, and the processor is not attempting to
> validate the XML document, the processor may, but need not, include the entity's replacement text. If a
> non-validating parser does not include the replacement text, it must inform the application that it recognized,
> but did not read, the entity.

4.4.3 applies only to external parsed general entities.  External parsed
entities are not allowed in attribute values.

James




From paul at prescod.net  Fri Mar 26 04:54:09 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:35 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net> <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com> <36FAEE03.E435EB6D@lig.net>
Message-ID: <36FB0F47.FE758A01@prescod.net>

"Stephen D. Williams" wrote:
> 
> Also imagine that when taking an XML 'document' into a program you 
> could search, modify, or copy the object without generating thousands 
> of object creates and deletes/garbage collection hits.  

I guess this is the part I don't understand. I can see how in C++ I could
just load a chunk of binary gunk and use casts to convince the computer
that it is really objects but I don't see how to do that in Java, Python,
Perl or other high level languages. And even if you get it working really
fast in Java will those same binary objects load quickly in any other
language?

Are you going to lazily build objects as the application walks the tree? 

> The point is that bXML is not designed to be editable by a text
> editor since it has more of a 'structured' layout, sort of like a 
> filesystem.

But note that a filesystem is not meant to be interpreted by more than one
program, especially not by programs written in multiple languages. You
call into the kernel (probably written in C) and it interprets the bits
for you.

Anyhow, if your "bXML" can be ASCII or Unicode then please make it so. 
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From cowan at locke.ccil.org  Fri Mar 26 05:35:03 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:35 2004
Subject: SAX2: AttributeList2 and EntityRefList
In-Reply-To: <14074.16928.163619.681099@localhost.localdomain> from "David Megginson" at Mar 25, 99 09:17:19 am
Message-ID: <199903260640.BAA13403@locke.ccil.org>

David Megginson scripsit:

> So, after some thought, here's what I came up with.  This is a special 
> interface providing indexes to zero or more entity references in a
> literal string (i.e. an attribute value).  The indices are based on
> whatever array indices the programming language is using, exclusive of 
> Unicode problems with combining characters, etc. (i.e. any
> normalisation must already have taken place).

What about references to unknown entities, though?  They don't contribute
any characters at all, and so don't fit your model.

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From cowan at locke.ccil.org  Fri Mar 26 05:39:31 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:35 2004
Subject: Proposed new kind of SAX2 thing, with example
Message-ID: <199903260644.BAA13616@locke.ccil.org>

I believe there should be some way within SAX2 to ask for
parser properties (in the JavaBeans sense).  One example is the
architectural DTD public ID, which XAF provides access to
but can't report because it doesn't fit the SAX event model.

Another case is the current element stack.  Every parser (or almost
every parser) has to keep one of these around, and it would be
useful to have "currentStackDepth" and "stackedElementType[n]"
properties.

What's needed is to have some means of discovery.  Perhaps it's just
enough to use the JavaBeans mechanism.
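
Until something like that exists, an application has to keep its own copy of the stack. A sketch of the two properties as an application-side handler, using Python's xml.sax; the names current_stack_depth and stacked_element_type are just Python renderings of the property names above, not part of any SAX release:

```python
import xml.sax


class StackTracker(xml.sax.ContentHandler):
    """Maintain the open-element stack that the parser already has internally."""

    def __init__(self):
        super().__init__()
        self._stack = []
        self.snapshots = []     # (element name, depth) seen at each start tag

    @property
    def current_stack_depth(self):
        return len(self._stack)

    def stacked_element_type(self, n):
        # n = 0 is the document element, n = depth - 1 the innermost element.
        return self._stack[n]

    def startElement(self, name, attrs):
        self._stack.append(name)
        self.snapshots.append((name, self.current_stack_depth))

    def endElement(self, name):
        self._stack.pop()


tracker = StackTracker()
xml.sax.parseString(b"<a><b><c/></b><d/></a>", tracker)
print(tracker.snapshots)
```

Every handler that cares about context ends up duplicating this bookkeeping, which is the argument for exposing it as a parser property.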

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From shyutz at ms1.hinet.net  Fri Mar 26 07:38:34 1999
From: shyutz at ms1.hinet.net (Kevin Hsu)
Date: Mon Jun  7 17:10:35 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <002401be7757$f99675c0$15cd4acb@flag.com.tw>

Can anyone tell me how to print the XML document as I see on the screen in IE 5.0, thanks in advance.

Kevin
From paul.janssens at skynet.be  Fri Mar 26 08:45:20 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:35 2004
Subject: Is there anyone working on a binary version of XML?
References: <002801be7711$6f8f3c40$0100a8c0@sammy> <36FACA9B.4D7CD9B2@lig.net>
Message-ID: <36FB488F.33B7@skynet.be>

A simple solution would be to serialize using a PN-like format for your
file and store the arity of each node in the file. It's slower than
having an offset table per node, but faster when inserting or deleting
as the data is almost utterly context free.

If this is too slow, you might add offset tables per entity-node,
but you'll have to update these when inserting or deleting, working up
the parent chain. 

The first is better for authoring, and the second for querying I'd say.
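
A small sketch of the first variant, reading "PN" as prefix (Polish) notation, which is an assumption, and representing a node as a (name, children) pair purely for illustration. Each node is written as its name plus its arity, in preorder; no offsets are stored, so the reader recovers the structure by counting children.

```python
def to_prefix(node):
    """Flatten (name, [children]) into a preorder list of (name, arity)."""
    name, children = node
    flat = [(name, len(children))]
    for child in children:
        flat.extend(to_prefix(child))
    return flat


def from_prefix(flat, pos=0):
    """Rebuild one node starting at flat[pos]; return (node, next position)."""
    name, arity = flat[pos]
    pos += 1
    children = []
    for _ in range(arity):
        child, pos = from_prefix(flat, pos)   # each child consumes its own subtree
        children.append(child)
    return (name, children), pos


tree = ("doc", [("head", [("title", [])]),
                ("body", [("p", []), ("p", [])])])

flat = to_prefix(tree)
rebuilt, end = from_prefix(flat)
print(flat)
```

Inserting a leaf only touches one arity value and one new entry, but reaching the second child of a node means walking over the whole first subtree, which is where the offset tables would help.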

Paul Janssens - paul.janssens@skynet.be


Stephen D. Williams wrote:
> 
> This is not what I meant.
> 
> XML has mechanisms to store binary data as characters using all the standard
> methods.
> 
> What I'm talking about is using data that is structured in a directly
> addressable way (think pointers, arrays, indexes, offsets) to represent the
> structure and content of an XML tree.  My actual proposal is a bit more
> complicated than that because I want other types of optimizations for in-memory
> processing, but that is one of the roots of the idea.  In other words, after
> loading the tree would be directly addressable (SAX or DOM) without any parsing
> (or very limited steps).  A typical server might in fact support both XML and
> XMLb queries and responses.
> 
> sdw



From l-arcini at uniandes.edu.co  Fri Mar 26 09:00:17 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun  7 17:10:35 2004
Subject: Expat using something other than Visual C++
Message-ID: <00b401be773d$eb6afbc0$0100000a@phoebe>

Hi everyone,

Has anyone tried to use expat in an environment different than Visual C++? 
Any successful attempts using C++ Builder?

Thanks for your help,

FAA




From Matthew.Sergeant at eml.ericsson.se  Fri Mar 26 09:23:53 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:35 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1722@EUKBANT101>

It appears that IE5 converts internally to HTML (with the XSL style sheet),
so the answer is that you can't. Even a save to disk saves the HTML AFAIK.
Try using Mozilla - it does things right, and displays XML+XSL remarkably
well considering it's at least 6 months away from release.

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V 
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++

> -----Original Message-----
> From:	Kevin Hsu [SMTP:shyutz@ms1.hinet.net]
> Sent:	Friday, March 26, 1999 6:55 AM
> To:	XML Developers' List
> Subject:	how to print the XML document in IE 5.0
> 
> Can anyone tell me how to print the XML document as I see on the screen in
> IE 5.0, thanks in advance.
> Kevin



From rbourret at ito.tu-darmstadt.de  Fri Mar 26 09:24:22 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:35 2004
Subject: DTDDeclHandler and DTDLexicalHandler
Message-ID: <01BE7772.9F3329A0@grappa.ito.tu-darmstadt.de>

This may have already been answered, but how do DTDDeclHandler and 
DTDLexicalHandler work together?  That is, if I have the following:

<!ENTITY % foo "foo CDATA #REQUIRED">
<!ATTLIST bar %foo;>

what is the sequence of callbacks?  And even if this is well-defined, what 
good is the lexical information in this case anyway, since I can't 
determine what characters in the DTD came before and after the entity 
usage.

-- Ron Bourret




From paul at qub.com  Fri Mar 26 09:40:55 1999
From: paul at qub.com (Paul at Sunnyvale)
Date: Mon Jun  7 17:10:35 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <002e01be776d$2114bb60$c0d4d6cf@g0f2n0>


>It appears that IE5 converts internally to HTML (with the XSL style sheet),
>so the answer is that you can't. Even a save to disk saves the HTML AFAIK.
>Try using Mozilla - it does things right, and displays XML+XSL remarkably
>well considering it's at least 6 months away from release.


Could you please provide the url that will show Mozilla's capability to
display XML + _XSL_ ? Or do you mean XML + CSS ?

Rgds.Paul.




From ldodds at ingenta.com  Fri Mar 26 09:49:39 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:10:35 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <36FAEE03.E435EB6D@lig.net>
Message-ID: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>

> Then imagine you can write or communicate the object to other
> systems simply with IO
> operations with no processing involved.  Then imagine that the IO
> is async and very cheap and
> that you are processing thousands of transactions per second,
> most of which generate
> fundamentally little processing steps.

I just want to clarify my understanding of this thread: you're discussing
a binary format which is analogous to the internal representation of an
XML document (a DOM tree), and which can be stored, used and manipulated
without revisiting the original XML text?

Wouldn't a (undoubtedly naive) implementation of this be simply serialising
the object graph to disk, or through an I/O stream? This is obviously easy
in Java, and again is only obviously beneficial if the serialised object
graph is more 'compact' (which I believe is at least partly behind your
desire) than the original textual version?

Just a brain check on my part ;)

L.




From Matthew.Sergeant at eml.ericsson.se  Fri Mar 26 10:13:50 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:36 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1723@EUKBANT101>

> -----Original Message-----
> From:	Paul at Sunnyvale [SMTP:paul@qub.com]
> 
> >It appears that IE5 converts internally to HTML (with the XSL style sheet),
> >so the answer is that you can't. Even a save to disk saves the HTML AFAIK.
> >Try using Mozilla - it does things right, and displays XML+XSL remarkably
> >well considering it's at least 6 months away from release.
> 
> 
> Could you please provide the url that will show Mozilla's capability to
> display XML + _XSL_ ? Or do you mean XML + CSS ?
> 
	Oops. Mozilla on its own only displays XML+CSS. I think they hope
to have full XSL support on release.

	Matt.



From alberto.reggiori at jrc.it  Fri Mar 26 10:45:25 1999
From: alberto.reggiori at jrc.it (Alberto Reggiori)
Date: Mon Jun  7 17:10:36 2004
Subject: Is there anyone working on a binary version of XML?
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <36FB6534.FDF0192D@jrc.it>

Leigh Dodds wrote:
> Wouldn't a (undoubtedly naive) implementation of this be simply serialising
> the object graph to disk, or through an I/O stream? This is obviously easy
> in Java, and again is only obviously beneficial if the serialised object
> graph is more 'compact' (which I believe is at least partly behind your
> desire) than the original textual version?
> 

I am writing a Web application that provides an Open Web space for
secondary schools in Europe, where users can interact with an oodbms
through a treeview-like cut/paste/rename/edit paradigm using normal 16MB
Pentium PCs and ISDN connections.
One of the big issues of this application is to provide quick
generation and rendering of those treeviews inside normal browsers
using javascript. The initial
idea was to use a bare-bones javascript xml parser on the client
(jeremie.com like) to parse and create the in-memory data structure
(DOMish) of those views, but that solution turned out not to scale when the
user requests some 200/300 folders.
The current solution to those problems is to use a little hack on the
server that directly generates html docs with the parsed js structure in
them as nested arrays and hashes that do _not_ need parsing anymore. The
files with the "serialised" trees are a bit larger but the rendering
performance is a _lot_ better. The code is still able to display
textual xml treeviews.
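
A sketch of the kind of server-side step described here, written in Python for illustration only (the real hack is not shown in this message, so the folder format, attribute names and variable name are all assumptions). The XML is parsed once on the server and emitted as a nested literal that a browser's javascript engine can evaluate directly:

```python
import json
import xml.etree.ElementTree as ET


def to_nested(elem):
    """Turn a <folder name="..."> element into nested hashes and arrays."""
    return {"name": elem.get("name"),
            "children": [to_nested(child) for child in elem]}


# Hypothetical folder tree as the server might hold it.
XML = ('<folder name="root">'
       '<folder name="maths"/>'
       '<folder name="history"><folder name="1999"/></folder>'
       '</folder>')

tree_literal = json.dumps(to_nested(ET.fromstring(XML)), separators=(",", ":"))

# The literal is dropped into the page; no client-side XML parsing is needed.
# A real page would also have to guard against "</script>" inside folder names.
page = "<script>var tree = %s;</script>" % tree_literal
print(page)
```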

I think it would be really useful to have a standard and more compact way
to serialise (dump binary groves/structures) to some specific format
(java, javascript, C, C++) or as a stream of "events" instead of pure text
_only_.

I am not saying that XML should be binary, but that the parsing
business sometimes is an issue.


Just another brain dump.


Alberto
From costello at mitre.org  Fri Mar 26 12:02:12 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk> <36FB6534.FDF0192D@jrc.it>
Message-ID: <36FB76BA.EBE5172D@mitre.org>

Why doesn't XML support the notion of an unordered list of elements,
i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
DTDs do not support Bags, but XML has no such inherent limitation?  Does
DCD support Bags?  /Roger


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From ldodds at ingenta.com  Fri Mar 26 12:40:50 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
In-Reply-To: <36FB76BA.EBE5172D@mitre.org>
Message-ID: <002f01be7785$e507c540$ab20268a@pc-lrd.bath.ac.uk>

> Why doesn't XML support the notion of an unordered list of elements,
> i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
> DTDs do not support Bags, but XML has no such inherent limitation?  Does
> DCD support Bags?  /Roger

An XML newbie writes....

Isn't this an unordered list of elements?

<!ELEMENT BAG	(A?,B?,C?,D?)+>

Which appears to have order, but as elements are optional 
and the group is repeatable the ordering isn't enforced. Although 
the bag can't be empty. So, you could have...


<!ELEMENT BAG	(A?,B?,C?,D?)*>

Which allows an empty BAG

Or am I completely wrong?
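
One way to check the two declarations above is to read each content model
as a regular expression over the sequence of child element names. The
sketch below does that in Python. The helper name and the space-terminated
encoding of child names are assumptions made for illustration, not part of
any XML tool.

```python
import re

# (A?,B?,C?,D?)+ and (A?,B?,C?,D?)* rewritten as regular expressions.
# Each child name is encoded as the name followed by one space.
PLUS_MODEL = r"(?:(?:A )?(?:B )?(?:C )?(?:D )?)+"
STAR_MODEL = r"(?:(?:A )?(?:B )?(?:C )?(?:D )?)*"


def matches(model_regex, children):
    """True if the list of child element names satisfies the model."""
    text = "".join(name + " " for name in children)
    return re.fullmatch(model_regex, text) is not None
```

If this reading is right, any order and any number of A, B, C and D is
accepted, e.g. matches(PLUS_MODEL, ["D", "A", "A", "C"]). It also suggests
that the "+" form accepts an empty BAG as well: every member of the group
is optional, so a single pass through the group can match nothing at all.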

Cheers,

L.



From Matthew.Sergeant at eml.ericsson.se  Fri Mar 26 13:03:33 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101>

> -----Original Message-----
> From:	Roger L. Costello [SMTP:costello@mitre.org]
> 
> Why doesn't XML support the notion of an unordered list of elements,
> i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
> DTDs do not support Bags, but XML has no such inherent limitation?  Does
> DCD support Bags?  /Roger
> 
	Unordered list in XML:

	<ul>
		<li>An Item</li>
		<li>Another Item</li>
	</ul>

	Ordered list in XML:

	<ol>
		<li>Item 1</li>
		<li>Item 2</li>
	</ol>

	The point is an unordered list is an application level issue, not an
XML level issue - it's easy to implement one at your application level. Nay
- I would go as far as to say it's trivial.
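
As a rough illustration of handling this at the application level, the
Python sketch below treats the children of an element as a bag: it counts
(name, text) pairs and ignores document order. The function name is made
up for this example, and comparing only the tag and text of each child is
a simplifying assumption.

```python
from collections import Counter
import xml.etree.ElementTree as ET


def same_bag(xml_a, xml_b):
    """True if both root elements hold the same children, ignoring order."""
    def bag(xml_text):
        root = ET.fromstring(xml_text)
        # A Counter is a multiset: duplicates count, position does not.
        return Counter(
            (child.tag, (child.text or "").strip()) for child in root
        )
    return bag(xml_a) == bag(xml_b)
```

With this, two lists holding "An Item" and "Another Item" in different
orders compare equal, while a list with one of the items repeated does not.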

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V 
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++




From david at megginson.com  Fri Mar 26 13:27:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:36 2004
Subject: SAX2: AttributeList2 and EntityRefList
In-Reply-To: <199903260640.BAA13403@locke.ccil.org>
References: <14074.16928.163619.681099@localhost.localdomain>
	<199903260640.BAA13403@locke.ccil.org>
Message-ID: <14075.35593.947500.623901@localhost.localdomain>

John Cowan writes:
 > David Megginson scripsit:
 > 
 > > So, after some thought, here's what I came up with.  This is a special 
 > > interface providing indexes to zero or more entity references in a
 > > literal string (i.e. an attribute value).  The indices are based on
 > > whatever array indices the programming language is using, exclusive of 
 > > Unicode problems with combining characters, etc. (i.e. any
 > > normalisation must already have taken place).
 > 
 > What about references to unknown entities, though?  They don't contribute
 > any characters at all, and so don't fit your model.

Actually, they fit in fine -- the start and end positions will simply
be the same.
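
To make the index model concrete, here is a small Python sketch. It is not
the proposed SAX2 interface; the function name and the tuple layout are
assumptions for illustration only. It expands the entity references in a
literal and records, for each reference, its start and end position in the
expanded string. A reference to an unknown entity contributes no
characters, so its start and end come out equal.

```python
import re

ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.:-]*);")


def expand_with_refs(raw, entities):
    """Return (expanded value, [(entity name, start, end), ...])."""
    pieces = []   # parts of the expanded value
    refs = []     # positions are indices into the expanded value
    pos = 0       # length of the expanded value built so far
    last = 0      # end of the previous reference in the raw literal
    for match in ENTITY_REF.finditer(raw):
        literal = raw[last:match.start()]
        pieces.append(literal)
        pos += len(literal)
        # Unknown entities expand to nothing, so start == end for them.
        replacement = entities.get(match.group(1), "")
        refs.append((match.group(1), pos, pos + len(replacement)))
        pieces.append(replacement)
        pos += len(replacement)
        last = match.end()
    pieces.append(raw[last:])
    return "".join(pieces), refs
```

For the literal "x&known;y&unknown;z" with only "known" defined as "AB",
the reference to "known" spans positions 1 to 3 of the expanded value and
the reference to "unknown" is reported at position 4 with zero length.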


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From harvey at eccnet.eccnet.com  Fri Mar 26 13:28:10 1999
From: harvey at eccnet.eccnet.com (Betty L. Harvey)
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
In-Reply-To: <36FB76BA.EBE5172D@mitre.org>
Message-ID: <Pine.LNX.4.04.9903260817170.29487-100000@eccnet.eccnet.com>


Roger:

	I am not sure what you mean by "Bags" but XML supports
any type of list.  It also supports content tables which are
pretty cool:

As an example:

<!ELEMENT list (item+)>
<!ELEMENT item (#PCDATA)>

<list>
        <item>Item1</item>
        <item>Item2</item>
</list>

Example Content Tagged Table

<!ELEMENT part-table (part+)>
<!ELEMENT part       (partno, nomen, price,
                      quantity)>
<!ELEMENT partno     (#PCDATA)>
<!ELEMENT nomen      (#PCDATA)>
<!ELEMENT price      (#PCDATA)>
<!ELEMENT quantity   (#PCDATA)>

<part-table>
        <part id='1'>
                <partno>1</partno>
                <nomen>My Part</nomen>
                <price>$1.00</price>
                <quantity>10</quantity>
        </part>
        <part id='2'>
                <partno>2</partno>
                <nomen>My Part 2</nomen>
                <price>$2.00</price>
                <quantity>20</quantity>
        </part>
</part-table>

Depending on your application you can do some
pretty interesting things with the parts
list.  You can do the same thing with 
a list if required.

I am not sure if this is what you were looking
for.

Betty

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Betty Harvey                         | Phone: 301-540-8251 FAX: 4268
Electronic Commerce Connection, Inc. | 
13017 Wisteria Drive, P.O. Box 333   | 
Germantown, Md.  20874               |
harvey@eccnet.com                    | Washington,DC SGML/XML Users Grp
URL:  http://www.eccnet.com          | http://www.eccnet.com/sgmlug/
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/  


On Fri, 26 Mar 1999, Roger L. Costello wrote:

> Why doesn't XML support the notion of an unordered list of elements,
> i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
> DTDs do not support Bags, but XML has no such inherent limitation?  Does
> DCD support Bags?  /Roger
> 
> 




From skshirsa at nortelnetworks.com  Fri Mar 26 13:29:15 1999
From: skshirsa at nortelnetworks.com (Shekhar Kshirsagar)
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
Message-ID: <3.0.32.19990326082427.009082c0@bl-mail2.corpeast.baynetworks.com>

Well, what about <!ELEMENT BAG ANY>. That gives a bag of anything.

Or am I misinterpreting the spec?


Thanks,
Shekhar Kshirsagar


At 12:40 PM 3/26/99 -0000, Leigh Dodds wrote:
>> Why doesn't XML support the notion of an unordered list of elements,
>> i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
>> DTDs do not support Bags, but XML has no such inherent limitation?  Does
>> DCD support Bags?  /Roger
>
>An XML newbie writes....
>
>Isn't this an unordered list of elements?
>
><!ELEMENT BAG	(A?,B?,C?,D?)+>
>
>Which appears to have order, but as elements are optional 
>and the group is repeatable the ordering isn't enforced. Although 
>the bag can't be empty. So, you could have...
>
>
><!ELEMENT BAG	(A?,B?,C?,D?)*>
>
>Which allows an empty BAG
>
>Or am I completely wrong?
>
>Cheers,
>
>L.
>
>
>



From david at megginson.com  Fri Mar 26 13:29:20 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:36 2004
Subject: Proposed new kind of SAX2 thing, with example
In-Reply-To: <199903260644.BAA13616@locke.ccil.org>
References: <199903260644.BAA13616@locke.ccil.org>
Message-ID: <14075.35687.586960.200728@localhost.localdomain>

John Cowan writes:

 > I believe there should be some way within SAX2 to ask for parser
 > properties (in the JavaBeans sense).  One example is the
 > architectural DTD public ID, which XAF provides access to but can't
 > report because it doesn't fit the SAX event model.

Use the following from Parser2 (née ModParser):

    public abstract Object get (String prop)
	throws SAXNotSupportedException;


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Fri Mar 26 13:34:28 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:36 2004
Subject: DTDDeclHandler and DTDLexicalHandler
In-Reply-To: <01BE7772.9F3329A0@grappa.ito.tu-darmstadt.de>
References: <01BE7772.9F3329A0@grappa.ito.tu-darmstadt.de>
Message-ID: <14075.35813.308202.771903@localhost.localdomain>

Ronald Bourret writes:

 > This may have already been answered, but how do DTDDeclHandler and 
 > DTDLexicalHandler work together?  That is, if I have the following:
 > 
 > <!ENTITY % foo "foo CDATA #REQUIRED">
 > <!ATTLIST bar %foo;>
 > 
 > what is the sequence of callbacks?

You'll lose the entity-boundary information in this case.  What you'd
get back is

  internalEntityDecl("foo", true, "foo CDATA #REQUIRED");
  attributeDecl("bar", "foo", "CDATA", null, ATTRIBUTE_REQUIRED, refs);

 > And even if this is well-defined, what good is the lexical
 > information in this case anyway, since I can't determine what
 > characters in the DTD came before and after the entity usage.

You're right, but I think that we're taking this too far for the SAX
core.  SAX2 is specifically set up so that people can define new
handler types, so it is possible to come up with something that
provides this level of reporting, but it will have to be outside of
the SAX core.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From costello at mitre.org  Fri Mar 26 13:36:13 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101>
Message-ID: <36FB8C85.BEFC8103@mitre.org>

Matt,

Then let me ask another question - why do DTDs not allow me to specify
an unordered list of elements?  For example, 

<!ELEMENT Kitchen RDF:Bag(Sink, Stove, Refrigerator)>

With this notation I am trying to indicate that an XML document that
conforms to this DTD must have a <Kitchen> element which has three child
elements - <Sink>, <Stove>, and <Refrigerator>, and these child elements
can be in any order.  Isn't this a useful thing?  I have had a number of
times where I wish that I could do this.

I gather from your message that you are saying that it is not a
limitation of XML, but rather a limitation of DTDs?  How about DCDs? 
Thanks.  /Roger


Matthew Sergeant (EML) wrote:
> 
> > -----Original Message-----
> > From: Roger L. Costello [SMTP:costello@mitre.org]
> >
> > Why doesn't XML support the notion of an unordered list of elements,
> > i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
> > DTDs do not support Bags, but XML has no such inherent limitation?  Does
> > DCD support Bags?  /Roger
> >
>         Unordered list in XML:
> 
>         <ul>
>                 <li>An Item</li>
>                 <li>Another Item</li>
>         </ul>
> 
>         Ordered list in XML:
> 
>         <ol>
>                 <li>Item 1</li>
>                 <li>Item 2</li>
>         </ol>
> 
>         The point is an unordered list is an application level issue, not an
> XML level issue - it's easy to implement one at your application level. Nay
> - I would go as far as to say it's trivial.
> 
> Matt.
> --
> http://come.to/fastnet
> Perl on Win32, PerlScript, ASP, Database, XML
> GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V
> !PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++




From david at megginson.com  Fri Mar 26 13:42:19 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:36 2004
Subject: Why doesn't XML have Bag?
In-Reply-To: <36FB76BA.EBE5172D@mitre.org>
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>
	<36FB6534.FDF0192D@jrc.it>
	<36FB76BA.EBE5172D@mitre.org>
Message-ID: <14075.36327.509783.485757@localhost.localdomain>

Roger L. Costello writes:

 > Why doesn't XML support the notion of an unordered list of elements,
 > i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
 > DTDs do not support Bags, but XML has no such inherent limitation?  Does
 > DCD support Bags?  /Roger

XML DTDs can constrain the content of a bag just fine:

  (a|b|c|d|e|f)*

XML DTDs cannot constrain the content of a set (where each element may 
appear exactly once, in any order).  This is not an SGML DTD
limitation, since in SGML you can use

  (a&b&c&d&e&f)

You can simulate this in XML DTDs, but the content models become
absurdly large.

This is not to say that you cannot have a set in XML even *with* DTD
validation; it's just that DTD validation will not catch the errors.
For example, either

  (a|b|c|d|e|f)*

or even

  ANY

will allow a set, but they will not catch the error where the same
element appears twice.
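
To get a feel for how large the simulated content model becomes, the
Python sketch below writes out an XML content model that accepts each of
the given names exactly once, in any order. The function names are
hypothetical. The expansion is factored so that every alternative at a
given level starts with a different element name, on the assumption that
the model should stay deterministic.

```python
import re


def all_orders(names):
    """Content model accepting each name exactly once, in any order."""
    if len(names) == 1:
        return names[0]
    alternatives = []
    for i, first in enumerate(names):
        # Pick one name to come first, then recurse on the remainder.
        rest = names[:i] + names[i + 1:]
        alternatives.append("(%s,%s)" % (first, all_orders(rest)))
    return "(" + "|".join(alternatives) + ")"


def name_count(model):
    """Number of element-name tokens that appear in a content model."""
    return len(re.findall(r"[A-Za-z_][\w.-]*", model))
```

For two names the result is ((a,b)|(b,a)), with 4 name tokens. For the six
names a to f that SGML covers with (a&b&c&d&e&f), this expansion contains
1956 name tokens.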


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From sdw at lig.net  Fri Mar 26 13:43:04 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:36 2004
Subject: A Simple Thought
References: <005a01be7787$02189b90$0100a8c0@sammy>
Message-ID: <36FB8EE7.115C77B1@lig.net>

This is in fact exactly the kind of thing that I am thinking, with at least a
couple other optimizations thrown in to make processing in-place in Java fast.

sdw

"Samuel R. Blackburn" wrote:

> You know, if you parse the XML into a carefully designed data structure,
> you could write that structure to a file. To re-read the data, you would
> simply memory map the file (or put the structure into a shared memory
> segment). If the structure is designed so offsets are used instead of
> pointers, you could navigate it quickly and not have to worry about
> memory addresses involved. The OS will only page in those portions
> of the file that are really used.
>
> Just a thought,
>
> Sam
>
> -----Original Message-----
> From: Stephen D. Williams <sdw@lig.net>
> To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
> Date: Thursday, March 25, 1999 10:08 PM
> Subject: Re: Is there anyone working on a binary version of XML?
>
> >"Simon St.Laurent" wrote:
> >
> >> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
> >> >>I know, I know, this is anathema to what many of you feel is the
> >> >>essence of XML, and I agree to a point.
> >> >
> >> >It's not so much about feelings, as about contradicting the XML spec.
> >> >
> >> >[...]
> >> >
> >> >Applying XML concepts to a binary data format sounds interesting and
> >> >potentially useful, but it wouldn't be XML.
> >>
> >> One of these days I'd really love to stop talking about what is and isn't
> >> XML, though I know it's fun, and start talking about what we can do with
> >> XML and XML-like structures, whether they are SAX event flows, DOM trees,
> >> or binary formats that build on an XML foundation.
> >>
> >> We might even get some real work done - and it might even be fun.
> >
> >I agree with the sentiment Simon.
> >
> >I'm required (or am requiring myself) to get a lot of real work done very
> >quickly in the next 6 months, hence my focus...
> >
> >Semantically, I am talking about using XML.  After parsing and creating a
> >DOM tree or SAX events, you no longer have XML but a data structure
> >semantically equivalent to an XML document.  Another way to think about
> >what I'm proposing is that it is a cache of the data structures produced
> >from processing an XML document, cast in an openly documented data
> >structure that is already flattened and ready for IO.
> >
> >In fact, this is how I arrived at this design after following a few other
> >design constraints and observations.  Of course from there it is a short
> >stop to say that you can throw away the 'external' XML representation if
> >you can recreate it from XMLb.
> >
> >My scheme makes parsing of XML a non-issue.  If I only have that advantage
> >within my closed system, so be it, converting to and from XML for external
> >purposes is in fact what I intend to do.
> >
> >In my case, I'm architecting a high speed clustering system, primarily
> >targeted at Linux/Unix and Java.  In this kind of system of course you are
> >splitting applications into many servers.  Of course the communication
> >between those nodes is really internal application communication, the
> >equivalent of that DOM tree, so it makes sense to optimize it.  Think of
> >it this way, you'd seldom design a large app where every method needs to
> >parse the XML text block passed to it to get a DOM tree (or SAX events) if
> >the calling method has a DOM tree that it could just pass.
> >
> >sdw
> >
> >> Simon St.Laurent
> >> XML: A Primer
> >> Sharing Bandwidth / Cookies
> >> http://www.simonstl.com
> >
> >
> >--
> >OptimaLogic - Finding Optimal Solutions
> >Web/Crypto/OO/Unix/Comm/Video/DBMS
> >sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect
> >http://sdw.st
> >43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
> >5Jan1999
> >




From b.laforge at jxml.com  Fri Mar 26 13:47:12 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:37 2004
Subject: Proposed new kind of SAX2 thing, with example
Message-ID: <008101be778f$d4ddffe0$c8a8a8c0@thing1>

From: John Cowan <cowan@locke.ccil.org>

>I believe there should be some way within SAX2 to ask for
>parser properties (in the JavaBeans sense).  One example is the
>architectural DTD public ID, which XAF provides access to
>but can't report because it doesn't fit the SAX event model.


Why not use the get(featureID)?

>Another case is the current element stack.  Every parser (or almost
>every parser) has to keep one of these around, and it would be
>useful to have "currentStackDepth" and "stackedElementType[n]"
>properties.
>
>What's needed is to have some means of discovery.  Perhaps it's just
>enough to use the JavaBeans mechanism.


One of the ideas behind MDSAX was to have a shared element stack.
But if SAX2 developed such a concept, then:

    1. A parser has the option of sharing its element stack and 
    2. When a parser doesn't share its element stack, a filter could be used.
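
A sketch of option 2, using Python's xml.sax module rather than the Java
interfaces under discussion. The class and function names are hypothetical.
A filter sits between the parser and the application, keeps its own element
stack as events pass through, and lets downstream handlers look at it.

```python
import io
import xml.sax
from xml.sax.saxutils import XMLFilterBase


class ElementStackFilter(XMLFilterBase):
    """Tracks the open elements for parsers that do not expose a stack."""

    def __init__(self, parent=None):
        super().__init__(parent)
        self.stack = []

    def startElement(self, name, attrs):
        self.stack.append(name)
        super().startElement(name, attrs)

    def endElement(self, name):
        # Forward first, so the element is still on the stack downstream.
        super().endElement(name)
        self.stack.pop()


class PathRecorder(xml.sax.ContentHandler):
    """Example consumer: records the stack contents at each start tag."""

    def __init__(self, stack_filter):
        super().__init__()
        self.stack_filter = stack_filter
        self.paths = []

    def startElement(self, name, attrs):
        self.paths.append("/".join(self.stack_filter.stack))


def element_paths(xml_text):
    stack_filter = ElementStackFilter(xml.sax.make_parser())
    recorder = PathRecorder(stack_filter)
    stack_filter.setContentHandler(recorder)
    stack_filter.parse(io.BytesIO(xml_text.encode("utf-8")))
    return recorder.paths
```

The application handler never touches the parser itself, so the same code
would keep working if the stack came from the parser (option 1) instead.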

Bill




From sdw at lig.net  Fri Mar 26 13:52:28 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:37 2004
Subject: Is there anyone working on a binary version of XML?
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <36FB9100.A163ABE9@lig.net>


Leigh Dodds wrote:

> > Then imagine you can write or communicate the object to other systems
> > simply with IO operations with no processing involved.  Then imagine
> > that the IO is async and very cheap and that you are processing
> > thousands of transactions per second, most of which generate
> > fundamentally little processing steps.
>
> I just want to clarify my understanding of this thread: you're discussing
> a binary format which is analogous to the internal representation of an
> XML document (a DOM tree), and which can be stored, used and manipulated
> without revisiting the original XML text?
>
> Wouldn't a (undoubtedly naive) implementation of this be simply serialising
> the object graph to disk, or through an I/O stream? This is obviously easy
> in Java, and again is only obviously beneficial if the serialised object
> graph is more 'compact' (which I believe is at least partly behind your
> desire) than the original textual version?

Yes, that would achieve part of what I'm getting at, but not nearly enough.
You see, I am addressing several different performance problems with
processing in Java at the same time, so the solution is a bit more holistic.

In concept, what I'm getting at is close to using a serialization of a DOM
tree; however, the point is to avoid any transformations (even
deserialization/serialization) when possible but still have DOM/SAX or even
JGL-like access to the tree.
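
One way to picture tree access without a deserialization step is the
offset-based layout Samuel Blackburn suggested in the "A Simple Thought"
thread: nodes live in one flat buffer and point at each other by offset,
so the buffer can be written, sent or memory-mapped as is. The Python
sketch below is only an illustration of that idea. The record layout,
the "XMLB" header and all names are assumptions, not the actual design.

```python
import struct
import xml.etree.ElementTree as ET

# One node record: name offset, first-child offset, next-sibling offset.
RECORD = struct.Struct("<III")
NONE = 0  # offset 0 holds the header, so it can double as "no node"


def flatten(root):
    """Pack an element tree into one buffer; return (buffer, root offset)."""
    buf = bytearray(b"XMLB")  # 4-byte header keeps records off offset 0

    def put_node(elem):
        name = elem.tag.encode("utf-8")
        name_off = len(buf)
        buf.extend(struct.pack("<H", len(name)) + name)
        node_off = len(buf)
        buf.extend(RECORD.pack(name_off, NONE, NONE))  # links patched below
        first = prev = NONE
        for child in elem:
            child_off = put_node(child)
            if prev == NONE:
                first = child_off
            else:
                struct.pack_into("<I", buf, prev + 8, child_off)
            prev = child_off
        struct.pack_into("<I", buf, node_off + 4, first)
        return node_off

    root_off = put_node(root)
    return bytes(buf), root_off


def name_at(buf, node_off):
    """Read a node's name straight out of the buffer."""
    name_off, _, _ = RECORD.unpack_from(buf, node_off)
    (length,) = struct.unpack_from("<H", buf, name_off)
    return bytes(buf[name_off + 2:name_off + 2 + length]).decode("utf-8")


def children(buf, node_off):
    """Yield child node offsets by following the sibling links."""
    _, child, _ = RECORD.unpack_from(buf, node_off)
    while child != NONE:
        yield child
        _, _, child = RECORD.unpack_from(buf, child)
```

The readers only call unpack_from on the buffer, so they should work the
same way on a memory-mapped file, and nothing is rebuilt into objects
before the tree can be walked.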

sdw

>
>
> Just a brain check on my part ;)
>
> L.
>




From b.laforge at jxml.com  Fri Mar 26 13:53:43 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:37 2004
Subject: Fast filter support in SAX2
Message-ID: <009201be7790$c0a6b7a0$c8a8a8c0@thing1>

I'd like to suggest another method in Parser2:

    public String unique(String);

as well as a featureID for requesting unique element and attribute names.

The thought is to bring the speed of filters closer to the speed of doing 
things within a parser. 

If a parser supports both the unique feature and provides access to its
element stack, then we are well on the way to being able to implement
Simon's layered parser.

Bill




From costello at mitre.org  Fri Mar 26 14:04:01 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag?  Uh, "set"
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>
		<36FB6534.FDF0192D@jrc.it>
		<36FB76BA.EBE5172D@mitre.org> <14075.36327.509783.485757@localhost.localdomain>
Message-ID: <36FB93E5.F662B56A@mitre.org>

Thanks Dave for clarifying terminology.  It is "set" that I meant, not
"bag".  Just to make certain that I understand, an XML DTD cannot
express the following:

"A <Kitchen> element contains exactly three child elements: one instance
of <Sink>, one instance of <Stove>, and one instance of <Refrigerator>,
and these child elements can appear in any order."

Correct?  /Roger
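
For what it is worth, the constraint is easy to check in application code
once the document has been parsed. The Python sketch below is one way to
do it. The function name is hypothetical and the check looks only at the
names of the direct children.

```python
import xml.etree.ElementTree as ET


def is_exact_set(xml_text, required):
    """True if the root has each required child exactly once, in any order."""
    root = ET.fromstring(xml_text)
    tags = [child.tag for child in root]
    # Sorting removes the effect of order; duplicates and omissions remain.
    return sorted(tags) == sorted(required)
```

A call such as is_exact_set(doc, ["Sink", "Stove", "Refrigerator"])
accepts the three children in any order and rejects a Kitchen with two
Sinks or with no Stove.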

P.S. Attributes can be listed in any order in an XML document,
regardless of the order that they are listed in the DTD.  Right?

David Megginson wrote:
> 
> Roger L. Costello writes:
> 
>  > Why doesn't XML support the notion of an unordered list of elements,
>  > i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
>  > DTDs do not support Bags, but XML has no such inherent limitation?  Does
>  > DCD support Bags?  /Roger
> 
> XML DTDs can constrain the content of a bag just fine:
> 
>   (a|b|c|d|e|f)*
> 
> XML DTDs cannot constrain the content of a set (where each element may
> appear exactly once, in any order).  This is not an SGML DTD
> limitation, since in SGML you can use
> 
>   (a&b&c&d&e&f)
> 
> You can simulate this in XML DTDs, but the content models become
> absurdly large.
> 
> This is not to say that you cannot have a set in XML even *with* DTD
> validation; it's just that DTD validation will not catch the errors.
> For example, either
> 
>   (a|b|c|d|e|f)*
> 
> or even
> 
>   ANY
> 
> will allow a set, but they will not catch the error where the same
> element appears twice.
> 
> All the best,
> 
> David
> 
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/




From ricko at allette.com.au  Fri Mar 26 14:10:42 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag?
Message-ID: <005c01be7792$776f8380$4df96d8c@NT.JELLIFFE.COM.AU>

 From: Roger L. Costello <costello@mitre.org>

>Why doesn't XML support the notion of an unordered list of elements,
>i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
>DTDs do not support Bags, but XML has no such inherent limitation?  Does
>DCD support Bags?  /Roger

Answer A: XML does have a way to support Bags: it's called RDF.

Answer B: SGML DTDs could support bags, because they had an operator "&"
to mean required, but in any order. XML DTDs do not have this because
everyone said it was so difficult to implement. But then many people
said oops, because it would have been nice for database data.

Answer C: XML elements have order as a property. However, whether that
property is significant in the context of a document type depends on the
document type, and sometimes just on the kind of processing being
performed at that stage in the document's life. So you could just as
easily say that XML has bags but no sets.

Answer D: XML is not a data modeling language. It is a data-model
modeling language. So you decide what semantics to attach. This
takes place entirely outside the area of what DTDs attempt to do, which
is just to provide a simple grammar for the data-model modeling.

Answer E: XML does have a way to support Bags: it's called architectures.
On any element you attach an attribute that ties it to some other
element with known properties. For example, you tie your parent element
to html:ul for a bag and html:ol for a set.

Answer F: There are whole areas of fundamental semantic ways to slice
things: you want sets and bags, I want rhetorical relationships (I would
love it if I could point at any element and know what the appropriate
heading for it was; I would love it if that heading was carted around
during cutting and pasting.)  If you think bags and sets are really
important, then encourage the schema working group to include that
information.


Take your pick!

Rick Jelliffe

Author: The XML & SGML Cookbook: Recipes for Structured Information
Prentice Hall, ISBN 0-13-614223-0


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From andrew at squiz.co.nz  Fri Mar 26 14:12:35 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag? 
In-Reply-To: Your message of "Fri, 26 Mar 1999 08:32:53 EST."
             <36FB8C85.BEFC8103@mitre.org> 
Message-ID: <199903261411.CAA11212@aniwa.sky>

> Matt,
> 
> Then let me ask another question - why do DTDs not allow me to specify
> an unordered list of elements?  For example, 
> 
> <!ELEMENT Kitchen RDF:Bag(Sink, Stove, Refrigerator)>
> 
> With this notation I am trying to indicate that an XML document that
> conforms to this DTD must have a <Kitchen> element which has three child
> elements - <Sink>, <Stove>, and <Refrigerator>, and these child elements
> can be in any order.  Isn't this a useful thing?  I have had a number of
> times where I wish that I could do this.

Is it a useful thing?  It might be nice for humans entering the data to be unconstrained in the order in which they enter it, but if you know exactly which elements must exist within Kitchen, does it really limit you to predefine the order in which they appear in the document?

I suppose if you want the order to denote something about the position of the elements within your physical kitchen, then you've lost something, but attributes are probably a better solution for storing this information about the elements within your kitchen.

Andrew McNaughton 


-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From larsga at ifi.uio.no  Fri Mar 26 14:38:02 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag?
In-Reply-To: <36FB8C85.BEFC8103@mitre.org>
References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org>
Message-ID: <wk90ckl3jy.fsf@ifi.uio.no>


* Roger L. Costello
| 
| Then let me ask another question - why do DTDs not allow me to specify
| an unordered list of elements?  For example, 
| 
| <!ELEMENT Kitchen RDF:Bag(Sink, Stove, Refrigerator)>
| 
| With this notation I am trying to indicate that an XML document that
| conforms to this DTD must have a <Kitchen> element which has three child
| elements - <Sink>, <Stove>, and <Refrigerator>, and these child elements
| can be in any order.  Isn't this a useful thing? 

Sure it is, and SGML has it already:

<!ELEMENT Kitchen (Sink & Stove & Refrigerator)>

| I gather from your message that you are saying that it is not a
| limitation of XML, but rather a limitation of DTDs?  

It is a limitation of DTDs and was introduced because without this
operator element content models are easily mapped to finite state
automatons, but the introduction of the '&' separator makes automaton
generation much more difficult.

Existing SGML parsers already do this, and there are some research
papers giving algorithms for this, but the designers felt that this
was one of the things that would have to go in the simplification from
SGML to XML.

| How about DCDs?

DCDs have no official standing, they're just a proposal to the W3C.
XML Schemas, when they are defined, may (or may not) have this for all
I know. If they do we might as well add it to DTDs too.

--Lars M.




From david at megginson.com  Fri Mar 26 15:01:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:37 2004
Subject: SAX: Modified DTDDeclHandler
Message-ID: <14075.41132.504650.207777@localhost.localdomain>

Here's another attempt at the SAX2 DTDDeclHandler, adding element type
declarations (the handlerID is http://xml.org/sax/handlers/dtd-decl):

====================8<====================8<====================
// DTDDeclHandler.java -- receive extended DTD declarations
// $Id: DTDDeclHandler.java,v 1.1 1999/03/26 14:58:47 david Exp david $

package org.xml.sax;

public interface DTDDeclHandler extends SAX2Handler
{
    public final static int MODEL_ELEMENTS = 1;
    public final static int MODEL_MIXED = 2;
    public final static int MODEL_ANY = 3;
    public final static int MODEL_EMPTY = 4;

    public final static int ATTRIBUTE_DEFAULTED = 1;
    public final static int ATTRIBUTE_IMPLIED = 2;
    public final static int ATTRIBUTE_REQUIRED = 3;
    public final static int ATTRIBUTE_FIXED = 4;

    public abstract void elementDecl (String name,
				      int modelType,
				      String model)
	throws SAXException;

    public abstract void attributeDecl (String element,
					String name,
					String type,
					String defaultValue,
					int defaultType,
					EntityRefList entityRefs)
	throws SAXException;

    public abstract void externalEntityDecl (String name,
					     boolean isParameterEntity,
					     String publicId,
					     String systemId)
	throws SAXException;

    public abstract void internalEntityDecl (String name,
					     boolean isParameterEntity,
					     String replacementText)
	throws SAXException;
				     
}

// end of DTDDeclHandler.java
====================8<====================8<====================

In this take, I've added the elementDecl() callback.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul.janssens at skynet.be  Fri Mar 26 15:15:34 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag?
References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org> <wk90ckl3jy.fsf@ifi.uio.no>
Message-ID: <36FBA3DC.59DA@skynet.be>

Lars Marius Garshol wrote:
> 
...
> It is a limitation of DTDs and was introduced because without this
> operator element content models are easily mapped to finite state
> automatons, but the introduction of the '&' separator makes automaton
> generation much more difficult.
> 
> Existing SGML parsers already do this, and there are some research
> papers giving algorithms for this, but the designers felt that this
> was one of the things that would have to go in the simplification from
> SGML to XML.
> 

Please correct me if I am wrong here but isn't that trivial?
(you may get a BIG automaton, but it's not difficult to generate)

X -> A & B & C ;

can be expressed as

X -> A X_A | B X_B | C X_C ;

X_A -> B X_AB | C X_AC ;
X_B -> A X_AB | C X_BC ;
X_C -> A X_AC | B X_BC ;

X_AB -> C X_ABC ;
X_BC -> A X_ABC ;
X_AC -> B X_ABC ;

X_ABC -> ;

and it's 'easily' visualised by the number of possible shortest paths
between two opposing points on a hypercube.
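
The construction above can be sketched in Java by treating each
nonterminal X_S as the set S of members already seen, stored as a
bitmask. This is only an illustration under stated assumptions: the
class name AllGroupMatcher and its methods are hypothetical, every
member of the group is required exactly once, and no existing SGML
parser is claimed to work this way. It suggests that the 2^n states
can be tracked without ever building the automaton explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: checks the children of one element against an
// SGML-style "&" group whose members must each occur exactly once.
final class AllGroupMatcher {
    private final List<String> members;

    AllGroupMatcher(List<String> members) {
        if (members.size() > 62) {
            throw new IllegalArgumentException("too many members for a long bitmask");
        }
        this.members = new ArrayList<>(members);
    }

    // The grammar state X_S corresponds to the bitmask of members seen so far.
    boolean matches(List<String> children) {
        long seen = 0L;
        for (String child : children) {
            int index = members.indexOf(child);
            if (index < 0) {
                return false;                 // name is not part of the group
            }
            long bit = 1L << index;
            if ((seen & bit) != 0) {
                return false;                 // member occurs a second time
            }
            seen |= bit;                      // move from X_S to X_(S + child)
        }
        // Accept only in the state where every member has been seen (X_ABC).
        return seen == (1L << members.size()) - 1;
    }

    // Size of the explicit automaton: one state per subset of the members.
    long stateCount() {
        return 1L << members.size();
    }
}
```

For the three-member group above, stateCount() gives the eight
nonterminals X, X_A, X_B, X_C, X_AB, X_BC, X_AC and X_ABC.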



From david at megginson.com  Fri Mar 26 15:19:57 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag?  Uh, "set"
In-Reply-To: <36FB93E5.F662B56A@mitre.org>
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>
	<36FB6534.FDF0192D@jrc.it>
	<36FB76BA.EBE5172D@mitre.org>
	<14075.36327.509783.485757@localhost.localdomain>
	<36FB93E5.F662B56A@mitre.org>
Message-ID: <14075.41969.677946.202551@localhost.localdomain>

Roger L. Costello writes:

 > Thanks Dave for clarifying terminology.  It is "set" that I meant, not
 > "bag".  Just to make certain that I understand, an XML DTD cannot
 > express the following:
 > 
 > "A <Kitchen> element contains exactly three child elements: one instance
 > of <Sink>, one instance of <Stove>, and one instance of <Refrigerator>,
 > and these child elements can appear in any order."
 > 
 > Correct?  /Roger

More or less.  Technically, you *can* express this constraint with an
XML DTD:

  ((sink, ((stove, refrigerator) | (refrigerator, stove))) |
   (stove, ((sink, refrigerator) | (refrigerator, sink))) |
   (refrigerator, ((sink, stove) | (stove, sink))))

Obviously, things get unmanageable if the set grows a little bigger.
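
To put a number on "unmanageable", here is a small Java sketch that
generates a content model of the shape shown above for any list of
names and counts how many name tokens it contains. The class and
method names (PermutationModel, expand, nameTokens) are hypothetical,
and the output is only the content-model text, not a full element
declaration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: expands "each of these once, in any order" into
// nested choices of sequences that an XML DTD can express.
final class PermutationModel {
    static String expand(List<String> names) {
        if (names.isEmpty()) {
            throw new IllegalArgumentException("need at least one name");
        }
        if (names.size() == 1) {
            return names.get(0);
        }
        List<String> branches = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            List<String> rest = new ArrayList<>(names);
            String first = rest.remove(i);    // pick which name comes first
            branches.add("(" + first + ", " + expand(rest) + ")");
        }
        return "(" + String.join(" | ", branches) + ")";
    }

    // Number of name tokens in expand() for n names: T(1) = 1, T(n) = n * (1 + T(n - 1)).
    static long nameTokens(int n) {
        long tokens = 1;
        for (int k = 2; k <= n; k++) {
            tokens = k * (1 + tokens);
        }
        return tokens;
    }
}
```

For sink, stove and refrigerator this gives the 15-token model above;
by the same recurrence, five names already need 325 tokens.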

In an SGML DTD, you would use

  (sink & stove & refrigerator)

but in practical use, this never worked that well for documents except
in the special case of legacy-data conversion (it confused people
using authoring tools and generally made processing unnecessarily
difficult), and most SGML gurus strongly deprecated it.  XML is
hitting a slightly different usage domain (less emphasis on documents,
more on data), so perhaps it might be worthwhile including this in the
new schema standard.

 > P.S. Attributes can be listed in any order in an XML document,
 > regardless of the order that they are listed in the DTD.  Right?

Right -- order and repetition are properties of elements but not of
attributes.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From bckman at ix.netcom.com  Fri Mar 26 15:22:23 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:37 2004
Subject: Why doesn't XML have Bag?
Message-ID: <008c01be779c$17c5e000$38afdccf@ix.netcom.com>

In SGML you can put
(a&b&c) which means that each element must appear exactly once but in any
order.

The best you can do in XML (without getting ridiculously complicated) is

(a|b|c)* which means that they can appear in any order, but there can be any
number of them.

My understanding was that it was omitted because of the requirement

"XML software shall be easy to write"

It takes only a few lines of C code to validate the second requirement but a
LOT more to validate the first.
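
To make the comparison concrete, here is a hedged sketch of both
checks, in Java rather than C, over a list of child element names. The
names ChildChecks, matchesStar and matchesAll are hypothetical. It
covers only a flat group of required names and does not attempt nested
groups or occurrence indicators, which is where a general validator
gets harder.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch comparing the two content-model checks.
final class ChildChecks {
    // (a|b|c)*: every child must be an allowed name, any number of times.
    static boolean matchesStar(Set<String> allowed, List<String> children) {
        return allowed.containsAll(children);
    }

    // (a&b&c): every required name exactly once, in any order.
    static boolean matchesAll(Set<String> required, List<String> children) {
        Set<String> seen = new HashSet<>();
        for (String child : children) {
            // Reject unknown names and second occurrences.
            if (!required.contains(child) || !seen.add(child)) {
                return false;
            }
        }
        return seen.size() == required.size();
    }
}
```

One way to use this is to declare (a|b|c)* in the XML DTD and apply
the stricter matchesAll() check in application code after parsing.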

Frank
----- Original Message -----
From: Roger L. Costello <costello@mitre.org>
To: <xml-dev@ic.ac.uk>
Cc: <alk@mitre.org>; Roger Costello <costello@mitre.org>
Sent: Friday, March 26, 1999 6:59 AM
Subject: Why doesn't XML have Bag?


>Why doesn't XML support the notion of an unordered list of elements,
>i.e., a Bag?  Perhaps this is a limitation of DTD, not XML?  That is,
>DTDs do not support Bags, but XML has no such inherent limitation?  Does
>DCD support Bags?  /Roger




From sdw at lig.net  Fri Mar 26 15:32:55 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:37 2004
Subject: A Simple Thought
References: <003c01be7792$eb14ebe0$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <36FBA892.43A462FB@lig.net>

Leigh Dodds wrote:

> Hmm, I guess with Java you'd use an in-memory buffer and a class
> to wrap that buffer so that your accesses to the data would appear
> to be ordinary method calls accessing member variables, but actually
> just altered/read data at byte offsets in the buffer?

Exactly!  The class interface would be SAX/DOM/JGL-like but operate on a very
efficient representation.  The realization that I had is that I typically build
very meta-data driven applications and systems and that I seldom have business
data models represented by actual classes (in C++, where I learned my lesson,
and Java).  Since the data is accessed via collection interfaces anyway, the
storage can be completely opaque and optimized.
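
As a rough illustration of the idea, the Java sketch below wraps a
ByteBuffer so that ordinary method calls read fields at byte offsets.
The record layout (a 4-byte offset to the next sibling, a 4-byte name
length, then the UTF-8 name) and the class name FlatNodes are
assumptions made for this example only, not the format being designed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: nodes live in one flat buffer and refer to each
// other by byte offset instead of by object reference.
final class FlatNodes {
    static final int NO_SIBLING = -1;
    private final ByteBuffer buf;

    FlatNodes(ByteBuffer buf) {
        this.buf = buf;
    }

    // Writes one record per name and chains the records as siblings.
    static FlatNodes ofSiblings(String... names) {
        int size = 0;
        for (String name : names) {
            size += 8 + name.getBytes(StandardCharsets.UTF_8).length;
        }
        ByteBuffer b = ByteBuffer.allocate(size);
        for (int i = 0; i < names.length; i++) {
            byte[] utf8 = names[i].getBytes(StandardCharsets.UTF_8);
            int next = b.position() + 8 + utf8.length;
            b.putInt(i == names.length - 1 ? NO_SIBLING : next); // sibling offset
            b.putInt(utf8.length);                               // name length
            b.put(utf8);                                         // name bytes
        }
        return new FlatNodes(b);
    }

    // Reads the name of the record starting at the given offset.
    String name(int offset) {
        int length = buf.getInt(offset + 4);
        byte[] utf8 = new byte[length];
        for (int i = 0; i < length; i++) {
            utf8[i] = buf.get(offset + 8 + i);
        }
        return new String(utf8, StandardCharsets.UTF_8);
    }

    // Returns the offset of the next sibling record, or NO_SIBLING.
    int nextSibling(int offset) {
        return buf.getInt(offset);
    }
}
```

Because only absolute offsets are stored, the same bytes could be
written to a file and later passed back in as a memory-mapped buffer
(for instance one obtained from FileChannel.map) without any fix-up.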

> Originally though I thought you were talking about a 'standard'
> representation.  Shouldn't you then be avoiding 'other optimizations...to
> make processing in-place in Java fast'? Otherwise you're targeting a
> particular implementation language?

Ahh, there's the trick.  I believe I have most of a design for a data structure
that is fast in memory yet is 'flat' and can have its chunks just written out or
read in at any point.  It builds on some very old ideas I came up with for a
language I designed.  When viewed as an interchange format, it may not be the
most space-efficient (although it should be better than XML text) but it trades
a small amount of space for nearly zero processing overhead.  There will
probably also be a procedure for 'compacting' an object for storage into a
database or sending over a slow link vs. the 'fast' format usable between
servers in a cluster.

I'll be implementing the rest of this shortly and we can have another round of
discussion.
I'd really like a reference to the one Java project doing something similar.

> I'm interested in this (at least in part) as I've been toying with an
> application idea which could potentially have a lot of (small) XML documents
> built into a complex in-memory object graph. I'm concerned about the size
> of the object graph (and managing interconnections amongst nodes) and its
> later storage (don't want to have to reparse every time the application
> starts).  Serialisation was originally what I was considering.

This is exactly the kind of problem I'm thinking of.  Since most people use
class interfaces to get at the data anyway, there's no need to chew up all the
processing time manipulating it behind the scenes in expensive ways.

Unfortunately the simple, obvious, traditional ways of building things
(especially in C++ and Java) cause massive storms of activity in large
programs.  (Object creation, initialization, building links, indexing, etc.
etc.)

sdw

> L.
>
> > -----Original Message-----
> > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> > Stephen D. Williams
> > Sent: 26 March 1999 13:43
> > To: Samuel R. Blackburn
> > Subject: Re: A Simple Thought
> >
> >
> > This is in fact exactly the kind of thing that I am thinking, with at least
> > a couple other optimizations thrown in to make processing in-place in Java
> > fast.
> >
> > sdw
> >
> > "Samuel R. Blackburn" wrote:
> >
> > > You know, if you parse the XML into a carefully designed data structure,
> > > you could write that structure to a file. To re-read the data, you would
> > > simply memory map the file (or put the structure into a shared memory
> > > segment). If the structure is designed so offsets are used instead of
> > > pointers, you could navigate it quickly and not have to worry about
> > > memory addresses involved. The OS will only page in those portions
> > > of the file that are really used.
> > >
> > > Just a thought,
> > >
> > > Sam
> > >
> > > -----Original Message-----
> > > From: Stephen D. Williams <sdw@lig.net>
> > > To: xml-dev@ic.ac.uk <xml-dev@ic.ac.uk>
> > > Date: Thursday, March 25, 1999 10:08 PM
> > > Subject: Re: Is there anyone working on a binary version of XML?
> > >
> > > >"Simon St.Laurent" wrote:
> > > >
> > > >> At 03:36 PM 3/25/99 -0500, DuCharme, Robert wrote:
> > > >> >>I know, I know, this is anathema to what many of you feel is the
> > > >> >>essence of XML, and I agree to a point.
> > > >> >
> > > >> >It's not so much about feelings, as about contradicting the XML spec.
> > > >> >
> > > >> >[...]
> > > >> >
> > > >> >Applying XML concepts to a binary data format sounds interesting and
> > > >> >potentially useful, but it wouldn't be XML.
> > > >>
> > > >> One of these days I'd really love to stop talking about what is and isn't
> > > >> XML, though I know it's fun, and start talking about what we can do with
> > > >> XML and XML-like structures, whether they are SAX event flows, DOM trees,
> > > >> or binary formats that build on an XML foundation.
> > > >>
> > > >> We might even get some real work done - and it might even be fun.
> > > >
> > > >I agree with the sentiment Simon.
> > > >
> > > >I'm required (or am requiring myself) to get a lot of real work done very
> > > >quickly in the next 6 months hence my focus...
> > > >
> > > >Semantically, I am talking about using XML.  After parsing and creating a
> > > >DOM tree or SAX events, you no longer have XML but a data structure
> > > >semantically equivalent to an XML document.  Another way to think about
> > > >what I'm proposing is that it is a cache of the data structures produced
> > > >from processing an XML document, cast in a openly documented data
> > > >structure that is already flattened and ready for IO.
> > > >
> > > >In fact, this is how I arrived at this design after following a few other
> > > >design constraints and observations.  Of course from there it is a short
> > > >stop to say that you can throw away the 'external' XML representation if
> > > >you can recreate it from XMLb.
> > > >
> > > >My scheme makes parsing of XML a non-issue.  If I only have that advantage
> > > >within my closed system, so be it, converting to and from XML for external
> > > >purposes is in fact what I intend to do.
> > > >
> > > >In my case, I'm architecting a high speed clustering system, primarily
> > > >targeted at Linux/Unix and Java.  In this kind of system of course you are
> > > >splitting applications into many servers.  Of course the communication
> > > >between those nodes is really internal application communication, the
> > > >equivalent of that DOM tree, so it makes sense to optimize it.  Think of
> > > >it this way, you'd seldom design a large app where every method needs to
> > > >parse the XML text block passed to it to get a DOM tree (or SAX events) if
> > > >the calling method has a DOM tree that it could just pass.
> > > >
> > > >sdw
> > > >
> > > >> Simon St.Laurent
> > > >> XML: A Primer
> > > >> Sharing Bandwidth / Cookies
> > > >> http://www.simonstl.com
> > > >
> > > >
> > > >--
> > > >OptimaLogic - Finding Optimal Solutions
> > > >Web/Crypto/OO/Unix/Comm/Video/DBMS
> > > >sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect
> > > >http://sdw.st
> > > >43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax
> > > >5Jan1999
> > > >




From tyler at infinet.com  Fri Mar 26 15:38:06 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:10:37 2004
Subject: How about changing the rules?
References: <NBBBJPGDLPIHJGEHAKBACEDLDAAA.martind@netfolder.com>
Message-ID: <36FBA7FF.F61B67D@infinet.com>

Didier PH Martin wrote:

> Hi,
>
> Yesterday night I talked to good friends that work at Netscape (but not for
> long now) and I can tell you that this was not about celebrating. We came to
> discuss about the free software movement on so on, then came an idea...
>
> <Actual situation>
> Several people worked hard on the Linux project, then came Red Hat and big
> investments, and now Red Hat is doing what all the other guys are doing
> (that's business, no?): protecting their turf and making money. They are even
> luckier than Sun or Microsoft, because they have cheap labor to develop their
> software. Just think about it. We all know that Microsoft probably has the
> lowest development cost in the industry. They let the stock market pay
> their employees :-) but now think about a company having $0 development
> costs. Wow, that's a VC's dream! Fellow developers, is this how you pay your
> bills?
> Sun still owns the Java JDK but at least played fair, because the code was
> developed with their own money.
> Microsoft played hard with all ISVs, with their huge appetite for growth, but
> at least, like Sun, paid for their code production.
> Mozilla: again, people work for free and AOL and its stockholders
> harvest the results. Just imagine that Sun and Adobe put in $60,000 to have
> better XML support for Mozilla. But in the end, who will get the millions in
> rewards? And how much is $60,000 compared to millions? Just a sustenance
> given to developers, as a lord would give to his serfs in the Middle Ages.
> Just think about it. I am not saying that Sun or Adobe are doing something
> wrong, but that the rules of the game or the odds favor the bank, not
> the developers :-) (if you allow my casino analogy).
> Basically the current free software movement seems to follow this pattern:
> developers work for free (cheap labor), and when testing and proof of concept
> are done, someone comes in and reaps the rewards and the money. Result:
> developers have fun, but a modern version of a lord reaps the financial
> rewards. Do we really want to replicate Middle Ages patterns? Next year will
> be the next millennium; do you really want that kind of order in the future?
> What about a world where people could get a just reward for their efforts?
> All the effort we are putting into XML may end up the same way. I do not
> speak here for people already paid by the W3C or big corporations but about
> individuals making all the effort on their own time, and therefore with their
> own money.
> </Actual situation>

I agree wholeheartedly with this.  Many Linux developers are so dedicated to the Linux
platform mostly because they long for the day they see the demise of Microsoft, disheartened
as they are by how Microsoft exploits the rest of the software industry.  However, while
doing so they forget that they are just being exploited by someone else.

> <Solution>
> Here's the solution that my friends and I came up with.
> Create a company where all participating developers would have stock. It would
> work like an open software group, but each participant would have ownership.
> Customers would get a share too. In this case, we do what Red Hat is doing:
> package the code, make it easy to install, document it and _sell_ it. Each
> customer would have stock too. So, when they buy the software, they also
> have ownership.

I don't know about this.  How would you sell shares to customers when there are millions of
them?  How would you efficiently disburse dividends to these customers?  When you go to your
local software store and buy a copy of Red Hat, would the store have to issue stock to the
customer?  Even if you had a mail-in program this would still be a logistical nightmare.

> So, the idea is: create a company where all participating developers would
> have stocks and therefore ownership. Customers would also have stocks and
> ownership but would have to buy the software to get ownership. A free
> version could be downloaded for free trial. But people using the free trial
> version would not have stocks.

Perhaps all the developers having stock would not be so bad.  You would be following a
Waffle-House style of employee ownership of the company (100% of the stock is owned by the
employees) or even, arguably, something more like the Goldman Sachs model, where if you get to
be partner you get a certain percentage of the total company profits.  Those most dedicated
would earn the most rewards.

> Results: This time, developers could get a chance to get a return on their
> efforts. Just imagine the power of a company having 20 000 owners. As big as
> Microsoft!

Try managing it and resolving disagreements.  In this sense Microsoft or any other behemoth
has the efficiency of dictatorship on their side.

> Years ago, a group of artists grew tired of seeing someone else get
> all the rewards of their work and founded United Artists. So now,
> today, what about a new company called "United Developers"?

Not a bad idea, but most developers, I think, hold the political view that unionization
is evil and that laissez-faire economics is the best way to go.

> If the idea seems interesting to you, we can start a list server to discuss
> it and create a new kind of company. Again, imagine what 20,000, 50,000
> or even millions of owners can do. Just stop for a moment and think about
> it.
> </Solution>

Well, how much money would I be getting?  And what if I work harder than the first 30,000
in the lot of 50,000?  I would expect to get more compensation than some guy who has no clue
what he is doing.

The idea has some potential, but someone with some money is going to have to foot the bill
initially.  Any takers?

Tyler




From rbourret at ito.tu-darmstadt.de  Fri Mar 26 16:02:06 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:38 2004
Subject: Modified DTDDeclHandler
Message-ID: <01BE77AA.3EFA4F90@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> Here's another attempt at the SAX2 DTDDeclHandler, adding element type
> declarations (the handlerID is http://xml.org/sax/handlers/dtd-decl):

[snip]

>     public final static int MODEL_ELEMENTS = 1;
>     public final static int MODEL_MIXED = 2;
>     public final static int MODEL_ANY = 3;
>     public final static int MODEL_EMPTY = 4;

Is it worth distinguishing between elements that can only contain PCDATA 
and elements that can contain both PCDATA and subelements?  I realize that 
the XML spec doesn't have separate terms for these, but in real life they 
are very different.  A PCDATA-only element is very close to an attribute, 
while an element containing PCDATA and elements is a very different beast 
altogether.

-- Ron Bourret




From simonstl at simonstl.com  Fri Mar 26 16:41:00 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:38 2004
Subject: MDServlet
Message-ID: <199903261639.LAA05643@hesketh.net>

I've built a small servlet front-end for MDSAX that's highly (perhaps too)
configurable.  My favorite feature is that you need to know pretty much
nothing about SAX or MDSAX to make it work beyond a basic understanding of
ContextML, which isn't nearly as difficult.  Using this tool, you can use
pretty much all the tools (filters) provided with MDSAX to manipulate
documents before transmission, and you can add your own filters to MDSAX
and control them from MDServlet.

Details are available at
http://www.simonstl.com/projects/mdservlet/index.html.

For my next project, I'm hoping to build a factory class that'll make James
Clark's XT easy to fit in the framework, so XSL transformations will be
possible.  (They aren't at present.)  If anyone would like to contribute to
that (especially if you've figured out how XT fits in a SAX-based
environment), I'd love to hear from you.

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From david at megginson.com  Fri Mar 26 17:18:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:38 2004
Subject: Modified DTDDeclHandler
In-Reply-To: <01BE77AA.3EFA4F90@grappa.ito.tu-darmstadt.de>
References: <01BE77AA.3EFA4F90@grappa.ito.tu-darmstadt.de>
Message-ID: <14075.49450.220994.285269@localhost.localdomain>

Ronald Bourret writes:

 > Is it worth distinguishing between elements that can only contain PCDATA 
 > and elements that can contain both PCDATA and subelements?  I realize that 
 > the XML spec doesn't have separate terms for these, but in real life they 
 > are very different.  A PCDATA-only element is very close to an attribute, 
 > while an element containing PCDATA and elements is a very different beast 
 > altogether.

You can distinguish that by looking at the normalised content model.
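
As a minimal sketch of that check, assuming the content model arrives as the
raw string from the ELEMENT declaration (the function name and the category
labels below are illustrative, not part of SAX or any proposed handler):

```python
import re

def classify_content_model(spec: str) -> str:
    """Classify a DTD content specification string.

    Returns one of: "EMPTY", "ANY", "pcdata-only", "mixed", "element".
    """
    # Normalise by dropping whitespace: "( #PCDATA | b )*" -> "(#PCDATA|b)*"
    s = re.sub(r"\s+", "", spec)
    if s in ("EMPTY", "ANY"):
        return s
    # PCDATA-only: "(#PCDATA)", optionally followed by "*"
    if s in ("(#PCDATA)", "(#PCDATA)*"):
        return "pcdata-only"
    # Mixed content: "(#PCDATA|name|name...)*"
    if s.startswith("(#PCDATA|") and s.endswith(")*"):
        return "mixed"
    # Anything else is treated as element-only content
    return "element"
```

An application that cares about the distinction can branch on the returned
label; one that does not can ignore it.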


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Pzingg at imsisoft.com  Fri Mar 26 17:26:01 1999
From: Pzingg at imsisoft.com (Peter Zingg)
Date: Mon Jun  7 17:10:38 2004
Subject: Maybe a naive question about XML Data
Message-ID: <4D0C1E192CE9D1119A6C00805FC1F8FA0120F800@EXCHANGE>

Let's say I develop software primarily for the Windows platform, in the
consumer space.  Let's say I'd like to get away from my products'
proprietary file formats and use XML to allow the transfer of data between
my applications, across the web, and into and out of databases.  Why
wouldn't I want to use the XML Data-derived schema language, data typing,
etc., that Microsoft is using in Office 2000 and Internet Explorer 5?  

I can think of a few reasons why not to use it:

No published specification (that I can find, anyway).  Microsoft's XML pages
refer you to W3C activity on XML Data that's at least 15 months old, and
that does not match up closely to the XML published by Office 2000.

Using DTD instead of the Microsoft XML schema would allow my data to be
validated by more parsers and tools than just the MS/DataChannel parser.

Someone else's schema definition might be better (but from what I can see,
there is only a request for comments by the competing factions, dated
2/15/99).

Then again, there are a few arguments in favor of using it:

Microsoft and global domination.  You can bet that all of the MS data access
and programming tools (ADO, OLE DB, VB, VC++) will be built around it.

Already in some kind of production today, even if it's not well documented.

What would you do if you wanted to commit to a company-wide XML strategy
today?

Peter Zingg
IMSI
 



From rbourret at ito.tu-darmstadt.de  Fri Mar 26 17:26:29 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:38 2004
Subject: Modified DTDDeclHandler
Message-ID: <01BE77B5.F8ACC250@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> > Is it worth distinguishing between elements that can only contain PCDATA
> > and elements that can contain both PCDATA and subelements?  I realize that
> > the XML spec doesn't have separate terms for these, but in real life they
> > are very different.  A PCDATA-only element is very close to an attribute,
> > while an element containing PCDATA and elements is a very different beast
> > altogether.
>
> You can distinguish that by looking at the normalised content model.

I knew you were going to say that :)  The same is also true of the other 
content models.  The parser can determine this very easily and I find it a 
worthwhile distinction. Any other takers?

-- Ron Bourret




From b.laforge at jxml.com  Fri Mar 26 17:33:55 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:38 2004
Subject: How about changing the rules?
Message-ID: <001a01be77af$8a3a20c0$c8a8a8c0@thing1>

I prefer a model which works like a magazine:

1. You have a central theme, say SAX2 and the MDSAX2 component model.

2. There is an annual subscription fee, as well as charges for back issues and collections.

3. Authors/programmers can have any number of arrangements: 
        regular columns,
        work-for-hire contributions,
        royalties based on circulation of a given issue, reprints, and inclusion in collections.

I've always thought authors had a better deal than programmers.
But with things like PCs, Java, XML, and component-based programming,
there is no real reason not to make the transition.

Of course, to add real value, we would want to include branding and testing into
the model. Perhaps some kind of rating system.

Right now, JXML, Inc., a Delaware Corporation, is "between business models". 
I'm doing some work for The Open Group right now, but that's it. This might be 
an interesting vision. We'd need to grow JXML quite a bit to do it, but I'm open to suggestions.

Is this a reasonable model? How could it be improved? Any ideas on how we might best
proceed? (Open Source, Open Standards, Open Business Models???)

Bill
    



From Paul_Tihansky at vanguard.com  Fri Mar 26 17:36:45 1999
From: Paul_Tihansky at vanguard.com (Paul_Tihansky@vanguard.com)
Date: Mon Jun  7 17:10:38 2004
Subject: DTD Catalogs
Message-ID: <85256740.0060BBCE.00@vgi4mail.vanguard.com>

     Does anybody know if any of the Java XML Parsers support catalog
files?  For instance, if I put a Public Identifier in my DTD declaration
without a URL, how would a parser such as XP find the DTD?  How do I
specify where the parser can find the catalog file?

Thanks,
Paul Tihansky




From tbray at textuality.com  Fri Mar 26 17:48:51 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:38 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <3.0.32.19990326092254.00e4a604@pop.intergate.bc.ca>

At 08:54 PM 3/25/99 +0000, Dan Brickley wrote:
>Quite so. But there are still initiatives such as 
>
>	http://www.wapforum.org/docs/technical.htm
>	http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf

I read some of it, and if you buy the idea that a binary form of XML
is useful, it seems quite sensible.  I'm agnostic; if they think they
need it who are we to tell them they don't?  Obviously it has to
round-trip with plain ole XML. -T.




From tbray at textuality.com  Fri Mar 26 17:48:53 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:38 2004
Subject: XML and (K)Office
Message-ID: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca>

At 09:20 AM 3/26/99 +1000, James Robertson wrote:
>Without the rigour of a DTD, we've got nothing.

This sentiment is not universally shared.  While DTDs are extremely
useful and should be constructed as (a small) part of any serious
language-design effort, they are in some cases unnecessary (for 
validation, full-text indexing, and lots of other things) and in
other cases insufficient - DTD validation never comes close
to real business-logic validation.  I am near-schizophrenic these days,
running around telling people that yes, they should use DTDs, and
simultaneously warning them that there are situations where they
fail to be either necessary or sufficient; the kind of mystico-
religious attitude above does not help.

>How will future users make sense of the format without
>a DTD?

And what, pray tell, part of a DTD helps you "make sense" of a
format? -Tim




From b.laforge at jxml.com  Fri Mar 26 17:50:15 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:38 2004
Subject: Modified DTDDeclHandler
Message-ID: <002301be77b0$586954c0$c8a8a8c0@thing1>

From: Ronald Bourret <rbourret@ito.tu-darmstadt.de>
>Is it worth distinguishing between elements that can only contain PCDATA 
>and elements that can contain both PCDATA and subelements?  I realize that 
>the XML spec doesn't have separate terms for these, but in real life they 
>are very different.  A PCDATA-only element is very close to an attribute, 
>while an element containing PCDATA and elements is a very different beast 
>altogether.


One advantage of not making the distinction is that you subsequently have a greater
freedom to qualify the data held by an element by adding child elements--one of the
advantages of content over attributes.

As a programmer, I agree with you. I'd like the distinction. But when I think about how
an application might mature with time, I'd rather the implementation not make that
distinction!

Bill




From richard at goon.stg.brown.edu  Fri Mar 26 17:54:19 1999
From: richard at goon.stg.brown.edu (Richard L. Goerwitz)
Date: Mon Jun  7 17:10:38 2004
Subject: Why doesn't XML have Bag?
References: <002f01be7785$e507c540$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <36FBC99C.28D97D7A@goon.stg.brown.edu>

Leigh Dodds wrote:

> Isn't this an unordered list of elements?
> 
> <!ELEMENT BAG   (A?,B?,C?,D?)*>
> <!ELEMENT BAG   (A?,B?,C?,D?)+>

These are equivalent (although the bottom one might be considered
'ambiguous' in SGML terms).  I suspect you'll convey your intentions
a lot better if you use:

> <!ELEMENT BAG (A | B | C | D)*>
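
The equivalence can be checked mechanically. The sketch below is only an
approximation: it assumes each child element name is a single letter and
translates the three content models into regular expressions by hand. A regex
matcher ignores SGML/XML determinism rules, so it says nothing about the
ambiguity question.

```python
import re
from itertools import product

# Hand-translated content models; commas become concatenation.
MODELS = {
    "(A?,B?,C?,D?)*": re.compile(r"(?:A?B?C?D?)*"),
    "(A?,B?,C?,D?)+": re.compile(r"(?:A?B?C?D?)+"),
    "(A|B|C|D)*": re.compile(r"(?:A|B|C|D)*"),
}

def accepted(model: str, children: str) -> bool:
    """True if the child sequence (e.g. "DCBA") matches the whole model."""
    return MODELS[model].fullmatch(children) is not None

def models_agree(max_len: int = 4) -> bool:
    """Compare all three models on every child sequence up to max_len."""
    for n in range(max_len + 1):
        for seq in product("ABCD", repeat=n):
            children = "".join(seq)
            verdicts = {accepted(m, children) for m in MODELS}
            if len(verdicts) != 1:
                return False
    return True
```

Calling models_agree() compares the three declarations over every sequence of
up to four children, including the empty one.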

-- 

Richard Goerwitz
PGP key fingerprint:    C1 3E F4 23 7C 33 51 8D  3B 88 53 57 56 0D 38 A0
For more info (mail, phone, fax no.):  finger richard@goon.stg.brown.edu



From tbray at textuality.com  Fri Mar 26 17:55:54 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:38 2004
Subject: Maybe a naive question about XML Data
Message-ID: <3.0.32.19990326095710.00e79c08@pop.intergate.bc.ca>

At 09:20 AM 3/26/99 -0800, Peter Zingg wrote:
>Let's say I develop software primarily for the Windows platform...
>  Why
>wouldn't I want to use the XML Data-derived schema language, data typing,
>etc., that Microsoft is using in Office 2000 and Internet Explorer 5?  
...
>What would you do if you wanted to commit to a company-wide XML strategy
>today?

Use DTDs; they will help with a small percentage of your business-logic
validation and you'll have to write code to do the rest.  When next-gen
schemas come along, they'll cover a somewhat larger portion of your
business-logic validation in a nice declarative way, and you'll be
able to retire some of your code.  But not all.  -Tim
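
A small illustration of the kind of rule that stays in code: a DTD can require
that an order has items and a total, but it cannot state that the total equals
the sum of the items. The document shape and attribute names below are
hypothetical, chosen only for the sketch.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal

# Hypothetical order document; a DTD could describe its structure only.
ORDER = """<order total="25.00">
  <item price="10.00" qty="2"/>
  <item price="5.00" qty="1"/>
</order>"""

def total_is_consistent(order_xml: str) -> bool:
    """Business rule: the declared total must equal sum(price * qty)."""
    order = ET.fromstring(order_xml)
    computed = sum(
        (Decimal(item.get("price")) * int(item.get("qty"))
         for item in order.findall("item")),
        Decimal("0"),
    )
    return computed == Decimal(order.get("total"))
```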




From paul at prescod.net  Fri Mar 26 18:05:09 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:38 2004
Subject: Why doesn't XML have Set?
References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org>
Message-ID: <36FBC3C5.AC693F16@prescod.net>

"Roger L. Costello" wrote:
> 
> Then let me ask another question - why do DTDs not allow me to specify
> an unordered list of elements?  For example,
> 
> <!ELEMENT Kitchen RDF:Bag(Sink, Stove, Refrigerator)>
> 
> With this notation I am trying to indicate that an XML document that
> conforms to this DTD must have a <Kitchen> element which has three child
> elements - <Sink>, <Stove>, and <Refrigerator>, and these child elements
> can be in any order.  Isn't this a useful thing?  

Is it useful? The author or text generator has been given no new
flexibility about *what* to write, only the order. What would they
indicate through the order, that the sink is "more important" than the
stove? That's a stretch. Allowing things in any order may be a convenience
for the generator but it will very seldom allow anything interesting to be
expressed. And it is an inconvenience for the consumer because now the
processing app has to walk around the tree to find out where the Stove is
rather than just going to the second child element.
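
A rough sketch of the difference for the consumer, using Python's standard
xml.etree.ElementTree and the Kitchen example from the quoted message (the two
helper names are made up for illustration):

```python
import xml.etree.ElementTree as ET

ordered = ET.fromstring("<Kitchen><Sink/><Stove/><Refrigerator/></Kitchen>")
shuffled = ET.fromstring("<Kitchen><Refrigerator/><Sink/><Stove/></Kitchen>")

def stove_by_position(kitchen):
    # Only reliable when the DTD fixes the order (Sink, Stove, Refrigerator).
    return kitchen[1]

def stove_by_search(kitchen):
    # Needed when children may come in any order: scan for the element name.
    return kitchen.find("Stove")
```

With the shuffled document, stove_by_position returns the Sink element, so
the consumer has to fall back to the search.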

> I have had a number of times where I wish that I could do this.

That wish usually goes away after a while. You start to wonder if there is
really any benefit in complicating your document type and creating more
work for yourself without making the language more expressive.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From jborden at mediaone.net  Fri Mar 26 18:12:14 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:10:38 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <3.0.32.19990326092254.00e4a604@pop.intergate.bc.ca>
Message-ID: <002201be77b3$39760600$1b19da18@ne.mediaone.net>

Tim Bray wrote:
>
> At 08:54 PM 3/25/99 +0000, Dan Brickley wrote:
> >Quite so. But there are still initiatives such as
> >
> >	http://www.wapforum.org/docs/technical.htm
> >	http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf
>
> I read some of it, and if you buy the idea that a binary form of XML
> is useful, it seems quite sensible.  I'm agnostic; if they think they
> need it who are we to tell them they don't?  Obviously it has to
> round-trip with plain ole XML. -T.
>

	I think what this really is, when you strip out the concept of binary XML,
is a suggestion for a compression format tuned for markup streams.

	There are two distinct issues 1) efficiency of parsing  2) compactness. A
standard compression format for XML (ala zip,gzip etc) would be for
bandwidth limited applications.

Jonathan Borden
http://jabr.ne.mediaone.net




From david at megginson.com  Fri Mar 26 18:14:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:38 2004
Subject: Maybe a naive question about XML Data
In-Reply-To: <4D0C1E192CE9D1119A6C00805FC1F8FA0120F800@EXCHANGE>
References: <4D0C1E192CE9D1119A6C00805FC1F8FA0120F800@EXCHANGE>
Message-ID: <14075.51958.839689.378413@localhost.localdomain>

Peter Zingg writes:

 > Microsoft and global domination.  You can bet that all of the MS
 > data access and programming tools (ADO, OLE DB, VB, VC++) will be
 > built around it.

I wouldn't make any such bet.  I'm not a Windows developer myself, but 
I've heard a lot of grumbling about MS abandoning its own technologies 
frequently and with little or no notice.

 > What would you do if you wanted to commit to a company-wide XML
 > strategy today?

No competent system architect should ever design a system architecture
around vendor-specific interfaces and specs except in the direst need
(and even then, she's probably better to quit and try to salvage
what's left of her reputation).

If you use vendor-specific stuff, move it to behind generic interfaces
where it can easily be changed without damaging the rest of the
system; otherwise, it will be Microsoft (or Sun or IBM or Adobe or
Texcel or what have you) who will be deciding the future evolution,
maintenance schedule, and lifespan of your system for you, and you'll
just be a helpless spectator.
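
One way to picture "behind generic interfaces", as a sketch only: the rest of
the system depends on a small interface, and any vendor-specific call lives in
a single adapter. All class and function names here are hypothetical, and the
vendor call is stood in for by an injected callable.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Protocol

class DocumentValidator(Protocol):
    """Generic interface the rest of the system depends on."""
    def validate(self, document: str) -> bool: ...

class WellFormednessValidator:
    """Portable implementation built only on the standard library."""
    def validate(self, document: str) -> bool:
        try:
            ET.fromstring(document)
            return True
        except ET.ParseError:
            return False

class VendorSchemaValidator:
    """Hypothetical adapter: the vendor-specific call is confined to here."""
    def __init__(self, vendor_check: Callable[[str], bool]):
        self._vendor_check = vendor_check  # stand-in for a vendor API

    def validate(self, document: str) -> bool:
        return bool(self._vendor_check(document))

def accept_document(document: str, validator: DocumentValidator) -> str:
    # Callers see only the generic interface, so validators can be swapped.
    return "accepted" if validator.validate(document) else "rejected"
```

Replacing the vendor component then means writing one new adapter rather than
touching every caller.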

So far, that's all system-architecture motherhood and apple pie (or
social welfare and poutine, up here in Central Canada).  The less
obvious point is that open standards like XML, CORBA, etc. also really
don't belong in the high-level system design: they should have nothing
to do with *what* your system does, only with *how* your system does
it, and that's an implementation detail.

If there are parts of a planned or existing system that could benefit
from using XML in their implementations, then by all means, introduce
some XML.  Start small to see if and how you're getting a real benefit
from the XML, then gradually introduce XML into other parts of the
system until you feel confident that you're getting the most benefit
from it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul.janssens at skynet.be  Fri Mar 26 18:15:53 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:38 2004
Subject: Why doesn't XML have Bag?
References: <008c01be779c$17c5e000$38afdccf@ix.netcom.com>
Message-ID: <36FBCE39.4D49@skynet.be>

Frank Boumphrey wrote:
> 
> In SGML you can put
> (a&b&c) which means that each element must appear only once but in any
> order.
...
> 
> My understanding was that it was omitted because of the requirement
> 
> "XML software shall be easy to write"
> 
> It takes only a few lines of C code to validate the second requirement but a
> LOT more to validate the first.
> 

How about expanding it as you parse the DTD (bottom-up coding)?

/* (a & b) becomes ((a,b) | (b,a)); the operands are cloned for the second branch */
node ampersand(node a, node b) {
   return or(concat(a,b),concat(clone(b),clone(a)));
}

That's just adding three lines to your code.
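
For more than two operands, the same expansion can be sketched over strings
instead of parse-tree nodes. This is an assumption-laden illustration (the
function names are made up), and it shows the cost: a group of n names expands
to n! alternatives, and for three or more names several alternatives begin
with the same element, so the result may not satisfy XML 1.0's deterministic
content model rule.

```python
from itertools import permutations
from math import factorial

def expand_and_group(names):
    """Rewrite the SGML group (a & b & ...) as a choice of sequences."""
    alternatives = ["(" + ",".join(p) + ")" for p in permutations(names)]
    return "(" + "|".join(alternatives) + ")"

def alternative_count(n):
    """Number of sequences produced for an &-group of n distinct names."""
    return factorial(n)
```

For two names this yields ((a,b)|(b,a)), the same shape the three-line
function builds.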



From roddey at us.ibm.com  Fri Mar 26 18:24:49 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:38 2004
Subject: Megginson's Spelling
Message-ID: <87256740.0064FE57.00@d53mta03h.boulder.ibm.com>


>3. As Donne wrote (cited in the OED), "Busie old foole, unruly
>  sunne, ... Sawcy pedantique wretch, goe chide Late schooleboyes"
>   (see what I mean about spelling?).
>

Wasn't "Pedantique" a movie where Sharon Stone walked around lightly
clothed and killed her husband because of his horrible spelling?




From paul at prescod.net  Fri Mar 26 18:25:52 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:38 2004
Subject: Why doesn't XML have Bag?
References: <002f01be7785$e507c540$ab20268a@pc-lrd.bath.ac.uk> <36FBC99C.28D97D7A@goon.stg.brown.edu>
Message-ID: <36FBCFF0.B6A60D79@prescod.net>

"Richard L. Goerwitz" wrote:
> 
> Leigh Dodds wrote:
> 
> > Isn't this an unordered list of elements?
> >
> > <!ELEMENT BAG   (A?,B?,C?,D?)*>
> > <!ELEMENT BAG   (A?,B?,C?,D?)+>
> 
> These are equivalent (although the bottom one might be considered
> 'ambiguous' in SGML terms).  

Actually, it would not. Any particular content node list can only present
a single path through the content model.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From sdw at lig.net  Fri Mar 26 18:37:04 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:39 2004
Subject: Is there anyone working on a binary version of XML?
References: <002201be77b3$39760600$1b19da18@ne.mediaone.net>
Message-ID: <36FBDB92.56336820@lig.net>


Jonathan Borden wrote:

> Tim Bray wrote:
> >
> > At 08:54 PM 3/25/99 +0000, Dan Brickley wrote:
> > >Quite so. But there are still initiatives such as
> > >
> > >     http://www.wapforum.org/docs/technical.htm
> > >     http://www.wapforum.org/docs/technical1.1/WBXML-03-Feb-1999.pdf
> >
> > I read some of it, and if you buy the idea that a binary form of XML
> > is useful, it seems quite sensible.  I'm agnostic; if they think they
> > need it who are we to tell them they don't?  Obviously it has to
> > round-trip with plain ole XML. -T.
> >
>
>         I think what this really is, when you strip out the concept of binary XML,
> is a suggestion for a compression format tuned for markup streams.
>
>         There are two distinct issues 1) efficiency of parsing  2) compactness. A
> standard compression format for XML (ala zip,gzip etc) would be for
> bandwidth limited applications.

I agree.  I feel they can be solved with a similar solution in at least some circumstances.
Rather there are some straightforward ways to achieve compression that actually make
efficiency worse while some solutions for efficiency also make compression easier.

In fact there are a number of levels you could go with compression:

optional gzip/bzip2 possibly preceded by:

Dictionary compression (various forms of building a list of commonly used terms or all terms
in the current document/stream or some combination)

'Priming' for certain circumstances.  For instance, I've long thought that an ideal design for
super high bandwidth circuits (TCP connection, message queue, special purpose) is to
essentially start out with a raw state where you send, once per connection/conversation, all
of the XML or other full self describing data (a DTD is an expression of this) and possibly
even a dictionary built from past experience and then highly compress the rest of the stream
based on the defined base.  In some circumstances you could even have a base 'dictionary'
stored on each receiver to improve short messages.

Each further transaction could use all of the known information to compress in a layered way.
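
The 'priming' idea can be approximated today with zlib's preset-dictionary
support. The sketch below is not the design described above, only a small
stand-in: the base dictionary and the message are invented, and both ends are
assumed to hold the same dictionary bytes already.

```python
import zlib

# Hypothetical shared base, agreed once per connection or stored on each receiver.
BASE_DICTIONARY = (
    b'<order><customer></customer><item sku="" qty=""/>'
    b'<total currency="USD"></total></order>'
)

def compress_primed(message: bytes, base: bytes = BASE_DICTIONARY) -> bytes:
    """Compress one short message against the shared base dictionary."""
    compressor = zlib.compressobj(level=9, zdict=base)
    return compressor.compress(message) + compressor.flush()

def decompress_primed(data: bytes, base: bytes = BASE_DICTIONARY) -> bytes:
    """Reverse compress_primed; the receiver must hold the same dictionary."""
    decompressor = zlib.decompressobj(zdict=base)
    return decompressor.decompress(data) + decompressor.flush()

message = (
    b'<order><customer>Ann</customer><item sku="42" qty="2"/>'
    b'<total currency="USD">19.90</total></order>'
)
primed = compress_primed(message)   # uses the shared base
plain = zlib.compress(message, 9)   # no shared base, for comparison
```

Comparing len(primed) with len(plain) shows how much the shared base helps on
a short message, where ordinary compression has little repetition to exploit.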

There are plenty of circumstances where a connection is made and many messages are sent,
sometimes millions per connection.  I've had servers that normally handled 30-50 million
messages/day.

Both careful structuring of the data (a la bXML) and things like parallel inheritance deltas
play into this kind of optimization.

sdw

> Jonathan Borden
> http://jabr.ne.mediaone.net
>

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From jonathan at texcel.no  Fri Mar 26 19:00:19 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:39 2004
Subject: ANNOUNCE: XQL Mailing List, XQL FAQ
Message-ID: <3.0.3.32.19990326140131.030c5100@pop.mindspring.com>

I have just set up a mailing list for XQL (XML Query Language). This list
is intended to answer questions about the definition of the language, how
to implement it, who has implemented it in what products, and whatever else
seems to be of interest.

I will also use this list to try to reach consensus in the XQL community if
decisions need to be made, eg to add new extensions.

The XQL FAQ may be found here:

  http://metalab.unc.edu/xql/

It contains a link to the mailing list, but you can also access the mailing
list directly here:

  http://franklin.oit.unc.edu/cgi-bin/lyris.pl?enter=xql

Hope this is helpful!

Jonathan

Jonathan Robie
R&D Fellow, Software AG
jonathan.robie@sagus.com <- this address will be active Monday



From simonstl at simonstl.com  Fri Mar 26 19:06:10 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:39 2004
Subject: XML and (K)Office
In-Reply-To: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca>
Message-ID: <199903261905.OAA10928@hesketh.net>

>And what, pray tell, part of a DTD helps you "make sense" of a
>format? -Tim

The comments, of course!  (Which is a large part of why DDML provided
explicit space for documentation.)

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From derekdb at microsoft.com  Fri Mar 26 19:19:01 1999
From: derekdb at microsoft.com (Derek Denny-Brown)
Date: Mon Jun  7 17:10:39 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <8B57882C41A0D1118F7100805F9F68B506F1BF00@RED-MSG-45>

Not to be picky, but... The "Save-As" option in IE5 for XML documents _does_
save the XML.

-derek

-----Original Message-----
From: Matthew Sergeant (EML) [mailto:Matthew.Sergeant@eml.ericsson.se]

It appears that IE5 converts internally to HTML (with the XSL style sheet),
so the answer is that you can't. Even a save to disk saves the HTML AFAIK.
Try using Mozilla - it does things right, and displays XML+XSL remarkably
well considering it's at least 6 months away from release.

> -----Original Message-----
> From:	Kevin Hsu [SMTP:shyutz@ms1.hinet.net]
> 
> Can anyone tell me how to print the XML document as I see on the screen in
> IE 5.0, thanks in advance.



From andrewl at microsoft.com  Fri Mar 26 19:30:31 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:10:39 2004
Subject: Maybe a naive question about XML Data
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF1FD@RED-MSG-08>

Peter Zingg asked whether or not to use the XML-Data schema notation shipped
with IE5.  That depends on your needs and timeframes.  The IE5 MSXML parser
supports both DTD and XML-Data.  DTDs are supported by a wider range of
parsers, so you have a greater degree of interop, if replacing parsers is
important to you.  XML-Data uses XML syntax and supports namespaces and
datatypes, if that is important to you.  I expect that future MSXML parsers
will continue to support both notations.

But that brings me to mention the work going on in the W3C: The XML schemas
activity is working on defining the next generation of schema notation, and
the shopping list of features includes all the features presently available
from XML-Data in MS IE5, and more.  (I expect that future MSXML parsers will
support the future schema notation.)

So a lot depends on the exact features you need and where (MSIE or other
parsers) and when (now or later) you need them.  



From jes at kuantech.com  Fri Mar 26 19:33:36 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:10:39 2004
Subject: how to print the XML document in IE 5.0
In-Reply-To: <002401be7757$f99675c0$15cd4acb@flag.com.tw>
Message-ID: <000901be77bf$31695c80$5118a8c0@kuantech1.quokka.com>

I am confused by the responses to this question. I selected the Print command from the File menu in IE5 final and it printed just fine. I was looking at a raw XML file with no formatting commands of any kind.

Jeff
  -----Original Message-----
  From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of Kevin Hsu
  Sent: Thursday, March 25, 1999 10:55 PM
  To: XML Developers' List
  Subject: how to print the XML document in IE 5.0


  Can anyone tell me how to print the XML document as I see on the screen in IE 5.0, thanks in advance.

  Kevin
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/45950717/attachment.htm
From Mark.Birbeck at iedigital.net  Fri Mar 26 20:16:58 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun  7 17:10:39 2004
Subject: Maybe a naive question about XML Data
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054ACB@SOHOS002>

I also think that, regardless of which standard is adopted, namespaces
and 'open' definitions are well worth getting the hang of now. I'm using
the IE5 stuff knowing full well that it may change, because these
concepts are not present in DTDs, and so it's the only way to experiment
with them.

BTW, since everything that can be represented in a DTD can be
represented in XML-Data, you could just transform your XML-Data schemas
to DTDs when they leave your system. Then, as another writer said, you
hide the specifics behind a general interface.

Regards,

Mark

> -----Original Message-----
> From: Andrew Layman 
> Sent: 26 March 1999 19:29
> To: 'xml-dev@ic.ac.uk'
> Subject: RE: Maybe a naive question about XML Data
> 
> 
> Peter Zinqq asked whether or not to use the XML-Data schema 
> notation shipped
> with IE5.  That depends on your needs and timeframes.  The 
> IE5 MSXML parser
> supports both DTD and XML-Data.  DTDs are supported by a 
> wider range of
> parsers, so you have a greater degree of interop, if 
> replacing parsers is
> important to you.  XML-Data uses XML syntax and supports 
> namespaces and
> datatypes, if that is important to you.  I expect that future 
> MSXML parsers
> will continue to support notations.
> 
> But that brings me to mention the work going on in the W3C: 
> The XML schemas
> activity is working on defining the next generation of schema 
> notation, and
> the shopping list of features includes all the features 
> presently available
> from XML-Data in MS IE5, and more.  (I expect that future 
> MSXML parsers will
> support the future schema notation.)
> 
> So a lot depends on the exact features you need and where 
> (MSIE or other
> parsers) and when (now or later) you need them.  
> 


From martind at netfolder.com  Fri Mar 26 20:45:08 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:39 2004
Subject: How about changing the rules?
In-Reply-To: <001a01be77af$8a3a20c0$c8a8a8c0@thing1>
Message-ID: <NBBBJPGDLPIHJGEHAKBAKEHHDAAA.martind@netfolder.com>

Hi Bill,

This is a very interesting model. I'll give it some thoughts. This is fresh
air Bill. Thanks again

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
Bill la Forge
Sent: Friday, March 26, 1999 12:39 PM
To: Tyler Baker; Didier PH Martin
Cc: 'XML Dev'
Subject: Re: How about changing the rules?


I prefer a model which works like a magazine:

1. You have a central theme, say SAX2 and the MDSAX2 component model.

2. There is an annual subscription fee, as well as charges for back issues
and collections.

3. Authors/programmers can have any number of arrangements:
        regular columns,
        work-for-hire contributions,
        royalties based on circulation of a given issue, reprints, and
        inclusion in collections.

I've always thought authors had a better deal than programmers.
But with things like PCs, Java, XML, and component-based programming,
there is no real reason not to make the transition.

Of course, to add real value, we would want to include branding and
testing in the model. Perhaps some kind of rating system.

Right now, JXML, Inc., a Delaware Corporation, is "between business models".
I'm doing some work for The Open Group right now, but that's it. This might
be an interesting vision. We'd need to grow JXML quite a bit to do it, but
I'm open to suggestions.

Is this a reasonable model? How could it be improved? Any ideas on how we
might best proceed? (Open Source, Open Standards, Open Business Models???)

Bill




From david at megginson.com  Fri Mar 26 20:54:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:39 2004
Subject: Document Schemas and Documentation (was: RE: XML and (K)Office)
In-Reply-To: <199903261905.OAA10928@hesketh.net>
References: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca>
	<199903261905.OAA10928@hesketh.net>
Message-ID: <14075.57965.905043.828920@localhost.localdomain>

Simon St.Laurent writes:

 > >And what, pray tell, part of a DTD helps you "make sense" of a
 > >format? -Tim
 > 
 > The comments, of course!  (Which is a large part of why DDML provided
 > explicit space for documentation.)

Ain't that the truth.  As I pointed out to the DDML designers a while
back, though, every XML element and attribute needs three types of
documentation:

1. an XML 1.0 element or attribute name (e.g. "a");
2. a human-readable title (e.g. "Hypertext Anchor"); and
3. a proper description (probably including paragraphs, examples,
   tables, etc.).

Some people might also add a brief, one-sentence description
in-between (2) and (3).  Items (2) and (3) also need to be
localizable, possibly by allowing repetition coupled with xml:lang.
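
As a rough illustration of the three levels listed above, here is a
minimal Java sketch of a holder that pairs an XML name with titles and
descriptions keyed by their xml:lang value. The class and method names
(ComponentDoc, getTitle, and so on) are hypothetical and are not part of
DDML or SAX; the fallback rule (first registered language, then the raw
name) is likewise only an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical holder for the three kinds of documentation listed above.
// None of these names come from DDML or SAX; they are illustrative only.
final class ComponentDoc {
    // (1) the XML 1.0 element or attribute name, e.g. "a"
    private final String xmlName;
    // (2) human-readable titles, keyed by xml:lang value
    private final Map<String, String> titles = new LinkedHashMap<>();
    // (3) full descriptions, keyed by xml:lang value
    private final Map<String, String> descriptions = new LinkedHashMap<>();

    ComponentDoc(String xmlName) {
        this.xmlName = xmlName;
    }

    ComponentDoc title(String lang, String text) {
        titles.put(lang, text);
        return this;
    }

    ComponentDoc description(String lang, String text) {
        descriptions.put(lang, text);
        return this;
    }

    String getXmlName() {
        return xmlName;
    }

    // Assumed fallback: requested language, else first registered, else the name.
    String getTitle(String lang) {
        String title = titles.get(lang);
        if (title != null) {
            return title;
        }
        return titles.isEmpty() ? xmlName : titles.values().iterator().next();
    }

    // Assumed fallback: requested language, else first registered, else empty.
    String getDescription(String lang) {
        String description = descriptions.get(lang);
        if (description != null) {
            return description;
        }
        return descriptions.isEmpty() ? "" : descriptions.values().iterator().next();
    }
}
```

Keying (2) and (3) by language mirrors the "repetition coupled with
xml:lang" idea: each repeated title or description becomes one map entry.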


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Mark.Unak at Level3.com  Fri Mar 26 21:07:44 1999
From: Mark.Unak at Level3.com (Mark.Unak@Level3.com)
Date: Mon Jun  7 17:10:39 2004
Subject: unsubscribe xml-dev 
Message-ID: <6DD3824BDF75D211930E0008C71EC92001B2998D@l3lsvlmail02.l3.com>

unsubscribe xml-dev indiketr@churchill.co.uk
<mailto:indiketr@churchill.co.uk> 



From Mark.Unak at Level3.com  Fri Mar 26 21:09:19 1999
From: Mark.Unak at Level3.com (Unak, Mark)
Date: Mon Jun  7 17:10:39 2004
Subject: unsubscribe xml-dev 
Message-ID: <6DD3824BDF75D211930E0008C71EC92001B2998E@l3lsvlmail02.l3.com>

unsubscribe xml-dev Mark.Unak@Level3.com <mailto:indiketr@churchill.co.uk> 



From Michael.S.Brothers at EMCIns.Com  Fri Mar 26 21:26:35 1999
From: Michael.S.Brothers at EMCIns.Com (Michael S. Brothers)
Date: Mon Jun  7 17:10:39 2004
Subject: Megginson's Spelling
In-Reply-To: <87256740.0064FE57.00@d53mta03h.boulder.ibm.com>
Message-ID: <SIMEON.9903261522.E@PC7155.emcins.com>

On Fri, 26 Mar 1999 11:23:04 -0700 roddey@us.ibm.com wrote:

> 
> 
> 
> >3. As Donne wrote (cited in the OED), "Busie old foole, unruly
> >  sunne, ... Sawcy pedantique wretch, goe chide Late schooleboyes"
> >   (see what I mean about spelling?).
> >
> 
> Wasn't "Pedantique" a movie where Sharon Stone walked around lightly
> clothed and killed her husband because of his horrible spelling?
> 
And, I believe she also killed her husband's best friend because of 
his making obscure movie references. Diabolical wrench.

----------------------
Michael S. Brothers
Michael.S.Brothers@EMCIns.com
515-362-7473
At this point, I don't think that's the best
option.




From david at megginson.com  Fri Mar 26 22:23:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:39 2004
Subject: SAX2: Proposed alternative DTD interface
Message-ID: <14076.1733.365295.427943@localhost.localdomain>

Here's another alternative for SAX2: forget about trying to report DTD 
declarations as events, and simply make the whole DTD available
through an interface with a Parser2.get() call.

I threw together a quick (read-only) DTD interface this morning, and
uploaded it to the following location:

  http://www.megginson.com/SAX/sax2dtd-19990326.zip

The package consists of the following interfaces (and exception class)
in the org.xml.sax.dtd package:

  Attribute extends DTDComponent
  ContentGroup extends ContentParticle
  ContentParticle
  ContentParticleIterator
  ContentToken extends ContentParticle
  DTD
  DTDComponent
  DTDComponentIterator
  DTDException extends java.lang.Exception
  Element extends DTDComponent
  Entity extends DTDComponent
  Notation extends DTDComponent

The interface itself is pretty small -- the compiled class files add
up to just over 4K -- and a SAX application would get the information
like this:

  DTD dtd = null;  // declared outside the try block so later code can use it
  try {
    dtd = (DTD)parser.get("http://xml.org/sax/props/dtd");
  } catch (SAXNotSupportedException e) {
    // ...
  }

This would print out the names of all of the declared elements:

  DTDComponentIterator it = dtd.getElements();
  while (it.hasMoreMembers()) {
    System.out.println(((Element)(it.getNextMember())).getName());
  }

etc., etc.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Fri Mar 26 22:44:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:40 2004
Subject: SAX2: Proposed alternative DTD interface
In-Reply-To: <14076.1733.365295.427943@localhost.localdomain>
References: <14076.1733.365295.427943@localhost.localdomain>
Message-ID: <14076.3465.903408.98435@localhost.localdomain>

David Megginson writes:

 > This would print out the names of all of the declared elements:
 > 
 >   DTDComponentIterator it = dtd.getElements();
 >   while (it.hasMoreMembers()) {
 >     System.out.println(((Element)(it.getNextMember())).getName());
 >   }
 > 
 > etc., etc.

If people find this interesting, we might want to rewrite it to use
the Java 2 collection classes (and the C++ STL in a C++ port, etc).  I 
am a little wary of forcing all users to have upgraded to JDK 1.2, but 
that's a separate discussion (most DTD-related work would be
server-side anyway).
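
For anyone who wants to see what a collections-based variant might look
like, here is a minimal sketch of an adapter that exposes the proposed
iterator as a java.util.Iterator. The three small interfaces are local
stand-ins that assume only the method names visible in the example above
(hasMoreMembers, getNextMember, getName); the real signatures in the
sax2dtd package may differ. ComponentIteratorAdapter, DtdCollections and
fixedIterator are hypothetical names, and the code uses generics, which
go beyond JDK 1.2.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Stand-ins for the proposed org.xml.sax.dtd interfaces. Only the method names
// visible in the posted example are assumed; the real signatures may differ.
interface DTDComponent { }

interface Element extends DTDComponent {
    String getName();
}

interface DTDComponentIterator {
    boolean hasMoreMembers();
    DTDComponent getNextMember();
}

// Hypothetical adapter exposing a DTDComponentIterator as a java.util.Iterator.
final class ComponentIteratorAdapter implements Iterator<DTDComponent> {
    private final DTDComponentIterator source;

    ComponentIteratorAdapter(DTDComponentIterator source) {
        this.source = source;
    }

    @Override
    public boolean hasNext() {
        return source.hasMoreMembers();
    }

    @Override
    public DTDComponent next() {
        if (!source.hasMoreMembers()) {
            throw new NoSuchElementException();
        }
        return source.getNextMember();
    }

    @Override
    public void remove() {
        // The proposed DTD interface is read-only.
        throw new UnsupportedOperationException();
    }
}

final class DtdCollections {
    // Collects the names of all elements reachable through the iterator.
    static List<String> elementNames(DTDComponentIterator it) {
        List<String> names = new ArrayList<>();
        Iterator<DTDComponent> adapted = new ComponentIteratorAdapter(it);
        while (adapted.hasNext()) {
            names.add(((Element) adapted.next()).getName());
        }
        return names;
    }

    // Tiny in-memory iterator, used only to exercise the adapter.
    static DTDComponentIterator fixedIterator(final String... declared) {
        return new DTDComponentIterator() {
            private int index = 0;

            @Override
            public boolean hasMoreMembers() {
                return index < declared.length;
            }

            @Override
            public DTDComponent getNextMember() {
                final String name = declared[index++];
                return new Element() {
                    @Override
                    public String getName() {
                        return name;
                    }
                };
            }
        };
    }
}
```

An adapter like this would let the custom iterator stay in the interface
for older JDKs while giving JDK 1.2 users the standard collection idioms.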


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From fmclain at cdgpd.com  Fri Mar 26 23:04:45 1999
From: fmclain at cdgpd.com (Fred McLain)
Date: Mon Jun  7 17:10:40 2004
Subject: Important Message From Fred McLain
Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3CA@rainier.cdgpd.com>

Here is that document you asked for ... don't show anyone else ;-)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: list1.doc
Type: application/msword
Size: 40960 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/8e7376c4/list1.doc
From fmclain at cdgpd.com  Fri Mar 26 23:04:50 1999
From: fmclain at cdgpd.com (Fred McLain)
Date: Mon Jun  7 17:10:40 2004
Subject: Important Message From Fred McLain
Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com>

Here is that document you asked for ... don't show anyone else ;-)

-------------- next part --------------
A non-text attachment was scrubbed...
Name: list1.doc
Type: application/msword
Size: 40960 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/b7e08a5f/list1.doc
From jonathan at texcel.no  Fri Mar 26 23:30:49 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:40 2004
Subject: Important Message From Fred McLain
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com>
Message-ID: <3.0.3.32.19990326183147.0338ec50@pop.mindspring.com>

At 03:03 PM 3/26/99 -0800, Fred McLain wrote:
>Here is that document you asked for ... don't show anyone else ;-)
>
>Attachment Converted: "D:\pipeplus\DOWNLOAD\list1.doc"

This document has macros in it - it could well contain a virus.

Jonathan

Jonathan Robie
R&D Fellow, Software AG
jonathan.robie@sagus.com <- this address will be active Monday



From jmg at trivida.com  Fri Mar 26 23:45:14 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun  7 17:10:40 2004
Subject: virus alert!!!  Re: Important Message From Fred McLain 
References: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com>
Message-ID: <075401be77e2$6a5bfc00$a24630d1@trivida.com>

Fred,

This message that you just sent to the recipients below contains an
attachment with an MS Word Macro virus.  Earlier today I got the same thing
from someone else.  Apparently, the macros installed by the virus when you
open the attachment send the 'Important message' to everyone in your
address book or contact list, thus spreading it pretty fast.  It would be
good if you warned your correspondents not to open the attachment.
Apparently the virus tries to run MS Outlook to re-distribute itself; I was
lucky (I hope) and all attempts to send it onward failed (with error message
box) since I don't have Outlook properly installed and don't use it.

Jeff


----- Original Message -----
From: Fred McLain <fmclain@cdgpd.com>
To: <dkrylov@cgxpress.com>; <xml-dev@ic.ac.uk>
Sent: Friday, March 26, 1999 3:03 PM
Subject: Important Message From Fred McLain


> Here is that document you asked for ... don't show anyone else ;-)
>
>
>




From jmg at trivida.com  Fri Mar 26 23:54:43 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun  7 17:10:40 2004
Subject: What McAfee says about new Word Macro virus (I've received it twice today already!!)
Message-ID: <076a01be77e3$b399b870$a24630d1@trivida.com>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: image/gif
Size: 43 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990326/4f6b4f9d/attachment.gif
From richard at cogsci.ed.ac.uk  Sat Mar 27 00:02:23 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:10:40 2004
Subject: virus alert!!!  Re: Important Message From Fred McLain 
In-Reply-To: Jeff Greif's message of Fri, 26 Mar 1999 15:43:14 -0800
Message-ID: <15359.199903270001@doyle.cogsci.ed.ac.uk>

> It would be
> good if you warned your correspondents not to open the attachment.

This suggests that the message was sent in good faith, something I
find hard to believe.  People sending genuine messages to mailing
lists don't say "Here is that document you asked for ... don't show
anyone else".  I've mailed abuse@cdgpd.com but for all I know they
are spammers themselves.

-- Richard



From fmclain at cdgpd.com  Sat Mar 27 00:10:26 1999
From: fmclain at cdgpd.com (Fred McLain)
Date: Mon Jun  7 17:10:40 2004
Subject: Virus in my last e-mail
Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3D3@rainier.cdgpd.com>

Folks,

The last e-mail I sent had a virus in the attached word document.  PLEASE
don't open the document.  In our office it caused Outlook 98 to autosend
itself to everyone on our address lists, turned off virus checking in word
(tools/options/general/macro virus protection), and modified the default
template normal.dot.

Sorry!


 <<Fred McLain.vcf>> 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: Fred McLain.vcf
Type: application/octet-stream
Size: 420 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990327/ac21d9d8/FredMcLain.obj
From jgarrett at navix.net  Sat Mar 27 00:17:43 1999
From: jgarrett at navix.net (Jim Garrett)
Date: Mon Jun  7 17:10:40 2004
Subject: Important Message From Fred McLain
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3CF@rainier.cdgpd.com>
Message-ID: <000601be77e4$fe30c350$58c8c8c8@jgp400>

Fred:
Please convert your Word Doc file w/ Macros
into HTML so we can view what you don't
want us to see...

Thanks

jg

|-----Original Message-----
|From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
|Fred McLain
|Sent: Friday, March 26, 1999 5:04 PM
|To: 'dkrylov@cgxpress.com'; 'xml-dev@ic.ac.uk'
|Subject: Important Message From Fred McLain
|
|
|Here is that document you asked for ... don't show anyone else ;-)
|
|



From jeremy at omsys.com  Sat Mar 27 00:37:07 1999
From: jeremy at omsys.com (Jeremy H. Griffith)
Date: Mon Jun  7 17:10:40 2004
Subject: virus alert!!!  Re: Important Message From Fred McLain 
In-Reply-To: <15359.199903270001@doyle.cogsci.ed.ac.uk>
References: <15359.199903270001@doyle.cogsci.ed.ac.uk>
Message-ID: <373926a9.446264444@smtp.omsys.com>

On Sat, 27 Mar 1999 00:01:43 GMT, Richard Tobin <richard@cogsci.ed.ac.uk> wrote:

>> It would be
>> good if you warned your correspondents not to open the attachment.
>
>This suggests that the message was sent in good faith, something I
>find hard to believe.  People sending genuine messages to mailing
>lists don't say "Here is that document you asked for ... don't show
>anyone else".  I've mailed abuse@cdgpd.com but for all I know they
>are spammers themselves.

Yeesh.  The *worm* sent the message, just like with happy99.  All
poor Fred did was open it, which in some mailers is automatic...
That is the nature of a worm; it sends itself on.  Get used to it.
And make sure you never, ever, load a Word/PowerPoint/Excel doc
in such a way that the Auto macros can run... which is real hard 
to avoid...


--Jeremy H. Griffith     <jeremy@omsys.com>
  http://www.omsys.com/jeremy/



From richard at cogsci.ed.ac.uk  Sat Mar 27 00:47:12 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:10:40 2004
Subject: virus alert!!!  Re: Important Message From Fred McLain 
In-Reply-To: Jeremy H. Griffith's message of Sat, 27 Mar 1999 00:37:34 GMT
Message-ID: <15392.199903270046@doyle.cogsci.ed.ac.uk>

> Yeesh.  The *worm* sent the message, just like with happy99.  All
> poor Fred did was open it, which in some mailers is automatic...
> That is the nature of a worm; it sends itself on.  Get used to it.

Fortunately I don't have to, I don't use MS Windows :-)

I hadn't realised that there was a way for Windows viruses to find
what mailing lists you used.

-- Richard



From Marc.McDonald at Design-Intelligence.com  Sat Mar 27 01:16:55 1999
From: Marc.McDonald at Design-Intelligence.com (Marc.McDonald@Design-Intelligence.com)
Date: Mon Jun  7 17:10:40 2004
Subject: XML and (K)Office
Message-ID: <c=US%a=_%p=Design_Intellige%l=MASTER-990327011633Z-3883@master.design-intelligence.com>

I think the conflict is caused by the concept of a valid document as 
opposed to parsing according to a DTD. Validity introduced a useful 
concept, but perhaps we should divorce it from parsing.

A 'valid' document meets the requirements of a DTD. When we talk about 
not needing to validate a document, we are assuming that it has 
already been validated when it was created so why waste the time doing 
it again. Perhaps another way to view it is to say that a document has 
been certified against a particular specification. Currently, this 
specification is a DTD.

But what if the specification were more abstract, say a URI? As with 
namespaces, there may be an agreed DTD associated with the URI (the 
agreement is human convention) or the specification could be non-DTD 
based (this document conforms to IRS/1999/ScheduleD). Applications 
that produce or consume documents may use DTDs or any other form to 
describe agreed structure.

This would separate validity from parsing according to a DTD - 
validity is certification of conformance to whatever a URI has been 
agreed to represent. Validity is then not a method of parsing but a 
certificate of conformance.
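
To make the idea concrete, here is a minimal Java sketch, under the
assumption that each agreed specification is keyed by a URI string and
that the agreed check (DTD-based or not) can be expressed as a predicate
over the document text. ConformanceRegistry, register and certify are
hypothetical names, and any URI used with it (for instance
"urn:example:IRS/1999/ScheduleD") is made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch: a specification is identified by a URI, and whatever
// check the parties agreed on (DTD-based or not) is registered under it.
final class ConformanceRegistry {
    private final Map<String, Predicate<String>> checks = new HashMap<>();

    // Associates an agreed conformance check with a specification URI.
    void register(String specUri, Predicate<String> check) {
        checks.put(specUri, check);
    }

    // Returns true only if a check is registered for the URI and the
    // document passes it; an unknown URI never certifies anything.
    boolean certify(String specUri, String document) {
        Predicate<String> check = checks.get(specUri);
        return check != null && check.test(document);
    }
}
```

The parser never appears in this sketch: certification is a lookup by
URI, and how the registered check works is left to whoever agreed on it.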

Marc B McDonald
Principal Software Scientist
Design Intelligence, Inc
www.design-intelligence.com


----------
From:  Tim Bray [SMTP:tbray@textuality.com]
Sent:  Friday, March 26, 1999 9:52 AM
To:  James Robertson; XML Developers' List
Subject:  RE: XML and (K)Office

At 09:20 AM 3/26/99 +1000, James Robertson wrote:
>Without the rigour of a DTD, we've got nothing.

This sentiment is not universally shared.  While DTDs are extremely
useful and should be constructed as (a small) part of any serious
language-design effort, they are in some cases unnecessary (for
validation, full-text indexing, and lots of other things) and in
other cases insufficient - DTD validation never comes close
to real business-logic validation.  I am near-schizophrenic these days,
running around telling people that yes, they should use DTDs, and
simultaneously warning them that there are situations where they
fail to be either necessary or sufficient; the kind of mystico-
religious attitude above does not help.

>How will future users make sense of the format without
>a DTD?

And what, pray tell, part of a DTD helps you "make sense" of a
format? -Tim




From jgarrett at navix.net  Sat Mar 27 01:45:07 1999
From: jgarrett at navix.net (Jim Garrett)
Date: Mon Jun  7 17:10:40 2004
Subject: Virus in my last e-mail - Fred's attached Outlook profile ??
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3D3@rainier.cdgpd.com>
Message-ID: <000001be77f2$afe6ebd0$58c8c8c8@jgp400>

Can your attached Outlook profile execute virus macros?
How do we know that it doesn't also contain a virus?

|-----Original Message-----
|From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
|Fred McLain
|Sent: Friday, March 26, 1999 6:09 PM
|To: 'xml-dev@ic.ac.uk'; 'dkrylov@cgxpress.com'
|Subject: Virus in my last e-mail
|
|
|Folks,
|
|The last e-mail I sent had a virus in the attached word document.  PLEASE
|don't open the document.  In our office it caused Outlook 98 to autosend
|itself to everyone on our address lists, turned off virus checking in word
|(tools/options/general/macro virus protection), and modified the default
|template normal.dot.
|
|Sorry!
|
|
| <<Fred McLain.vcf>> 
|



From sdw at lig.net  Sat Mar 27 01:52:22 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:41 2004
Subject: virus alert!!!  Re: Important Message From Fred McLain
References: <15392.199903270046@doyle.cogsci.ed.ac.uk>
Message-ID: <36FC41B9.D2BB120A@lig.net>

I would outlaw the sending and receipt of all MS document formats if a company really wanted
security.

Even a .txt file, if it contains "rich text" and you use Word as your default viewer, can
contain a windows binary that can be executed.

I've received zillions of these in the last several years and many other trojan horses.  I
don't even look at Word/Excel, etc. documents from people I don't know.

It really is unbelievable how little security MS designed software has.

Viva XML....  And Java.

sdw

Richard Tobin wrote:

> > Yeesh.  The *worm* sent the message, just like with happy99.  All
> > poor Fred did was open it, which in some mailers is automatic...
> > That is the nature of a worm; it sends itself on.  Get used to it.
>
> Fortunately I don't have to, I don't use MS Windows :-)
>
> I hadn't realised that there was a way for Windows viruses to find
> what mailing lists you used.
>
> -- Richard

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From Ed at dega.com  Sat Mar 27 02:27:26 1999
From: Ed at dega.com (Ed Howland)
Date: Mon Jun  7 17:10:41 2004
Subject: Whither XQL again.
Message-ID: <30649320C177D111ADEC00A024E9F297169FC7@exchange-server.dega.com>

All,

Thanks for your help in my understanding the nature of XML query languages
in general and XQL in particular. Thanks a lot to Jonathan for his insights
and help.

I've put up a web site with what I have done so far. It's not much, but over
the weekend, I hope to accomplish a lot. The parser compiles but only
recognizes path expressions so far. Statements like 'novel/front' and
'novel//title' compile and generate the correct ASTs. But this is just a
small start.

Anyway, please help if you can or have the time. The site is:
http://ed.dega.com/pub/xml/xql/index.html

Thanks.

Ed

Ed Howland
ed@dega.com
http://www.dega.com 
Alpha Geek and XML TV Evangelist. "Seek to be well formed, lest you incur
the wrath of the W3C!"




From uche.ogbuji at fourthought.com  Sat Mar 27 02:39:33 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:10:41 2004
Subject: SAX2: DTDDeclHandler (minimalist position) 
In-Reply-To: Your message of "Thu, 25 Mar 1999 16:35:43 EST."
             <Pine.GSO.4.05.9903251628060.543-100000@titan.oit.unc.edu> 
Message-ID: <199903270239.TAA08007@malatesta.local>

> > public interface DTDDeclHandler
> > {
> >     public final static int ATTRIBUTE_DEFAULTED = 1;
> >     public final static int ATTRIBUTE_IMPLIED = 2;
> >     public final static int ATTRIBUTE_REQUIRED = 3;
> >     public final static int ATTRIBUTE_FIXED = 4;
> > 
> 
> How committed are you to using integer constants? I know this is common,
> but it tends to lend itself to bad code. Some people prefer a solution
> like this:
> 
> public class AttributeStatus {
> 
>   public final static AttributeStatus ATTRIBUTE_DEFAULTED = 
>    new AttributeStatus();
>   public final static AttributeStatus ATTRIBUTE_IMPLIED =
>    new AttributeStatus();
>   public final static AttributeStatus ATTRIBUTE_FIXED =   
>    new AttributeStatus();
>   public final static AttributeStatus ATTRIBUTE_REQUIRED =   
>    new AttributeStatus();
> 
>   private AttributeStatus() {}
> 
> }
> 
> This creates the four mnemonic constants you want and gives them a checkable
> type.  New constants can't be created because of the private constructor.
> And there's no chance that anybody's going to write code like
> 
>   if (getAttributeStatus() == 1) {
>    doSomething();
>   }
> 
> Programmers are more or less forced to use the constants. What do you
> think?

I personally take a very dim view of systems trying to "force" programmers 
into intrinsically good practices.  Programmers can abuse any system you 
present, and at some point you have to accept that they are adults, and must 
be free to cut off their own noses if they wish.

The good programming practice of replacing "magic numbers" with descriptive 
constants is even older than the structured programming movement, and any 
programmer who writes

if (getAttributeStatus() == 1)
{
    doSomething();
}

when he could have written

if (getAttributeStatus() == ATTRIBUTE_DEFAULTED)
{
    doSomething();
}

fully deserves his own bugs, or a roasting at the next code review.

Furthermore, I've been thinking of proposing that the SAX2 interfaces be 
specified in IDL rather than Java (or at least publishing an IDL translation 
when the interfaces are stabilized), and your proposal wouldn't wash in IDL.
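
The two styles under discussion can be put side by side in a short sketch.
It is written in Python rather than Java and is only an illustration, not
part of any SAX2 proposal: Python's `enum.Enum` stands in for the
private-constructor class quoted above, and the helper names are made up
for the example.

```python
from enum import Enum

# Style 1: a plain integer constant, as in the DTDDeclHandler draft.
ATTRIBUTE_DEFAULTED = 1


# Style 2: a distinct type whose only instances are the named members,
# roughly analogous to the private-constructor class in the quoted proposal.
class AttributeStatus(Enum):
    DEFAULTED = 1
    IMPLIED = 2
    REQUIRED = 3
    FIXED = 4


def is_defaulted_int(status):
    # Works with the named constant, but a bare 1 is accepted just as well.
    return status == ATTRIBUTE_DEFAULTED


def is_defaulted_enum(status):
    # Only the enum member matches; a bare 1 never does.
    return status is AttributeStatus.DEFAULTED
```

With the first style a magic number still works and discipline is left to
the programmer; with the second, `is_defaulted_enum(1)` is simply false.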

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From ricko at allette.com.au  Sat Mar 27 04:38:07 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>

 From: Stephen D. Williams <sdw@lig.net>

>> Tim Bray wrote:
>> There are two distinct issues 1) efficiency of parsing  2) compactness. A
>> standard compression format for XML (ala zip,gzip etc) would be for
>> bandwidth limited applications.

Someone at the ITU (International Telecommunication Union) was working on an
ASN.1 compression of XML markup. I think they may have opted for the WAP
method, for compatibility. (I think the use of ASN.1 means fixed DTDs.)

I have done a few tests on how much more compact forms of XML (e.g.
shortrefs) affect the arrival characteristics of document packet-groups
under TCP/IP compared to compression.  If your packet size is small, and
you really need to get at data in the first packet (so that you can
piggyback requests for auto-linked resources in with the ACK for the
first packet group), then more compact forms of markup may make a
difference. But in general, compression is more effective. (It also
depends on where the bottlenecks are in your data path.)

One trivial way to minimise file sizes for transmission is to collapse
white-space inside markup (e.g. [ \t\n\r]+ becomes \n), to make
sure that newlines are not CR LF pairs, and to minimize whitespace in
data (removing trailing spaces, so that [ \t]+\n becomes \n, is a safe
transformation, for example). And select your element and attribute
names so that their length is inverse to their frequency, as much as
possible: so use "a:s" not "abracadabra:shazamarama" (you may even make
two versions of your DTD: an authoring one and a transmission one). One
of the main bottlenecks on many SOHO systems is the modem speed:
reducing the end-to-end character count means fewer packets, and more
data arrives earlier, so more auto-links are followed earlier.
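
The first two steps (collapsing white-space inside markup and removing CR LF
pairs) can be sketched in a few lines of Python. This is a rough
illustration only: `compact_markup` is a made-up name, the regular
expressions assume the input contains nothing but ordinary start, end and
empty-element tags (no comments, CDATA sections, processing instructions or
DOCTYPE), and white-space in character data and inside quoted attribute
values is deliberately left alone.

```python
import re

# One start, end or empty-element tag; '>' may appear inside quoted values.
TAG = re.compile(r"""<(?:[^<>"']|"[^"]*"|'[^']*')*>""")
# Inside a tag: either a quoted attribute value or a run of white-space.
PIECE = re.compile(r""""[^"]*"|'[^']*'|[ \t\r\n]+""")


def _squeeze_tag(tag, sep):
    def piece(match):
        text = match.group(0)
        # Quoted attribute values are kept verbatim; white-space runs collapse.
        return text if text[0] in "\"'" else sep
    return PIECE.sub(piece, tag)


def compact_markup(xml_text, sep="\n"):
    """Turn CR LF into LF and collapse white-space runs inside tags to `sep`."""
    xml_text = xml_text.replace("\r\n", "\n")
    return TAG.sub(lambda m: _squeeze_tag(m.group(0), sep), xml_text)
```

The separator defaults to a newline, as suggested above; passing `sep=" "`
gives the same byte count with more readable output.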


Rick Jelliffe




From tbray at textuality.com  Sat Mar 27 08:20:03 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:41 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <3.0.32.19990326114522.00e927e8@pop.intergate.bc.ca>

At 11:31 AM 3/26/99 -0800, Jeffrey E. Sussna wrote: 
> I am confused by the responses to this question. I selected the Print
> command from the File menu in IE5 final and it printed just fine. I was
> looking at a raw XML file with no formatting commands of any kind.

Maybe there's a way to do it, but once you bring a CSS stylesheet into
play, you apparently lose the ability to print.  So far everyone I
know who's tried it reports this.  Which makes XML+CSS essentially 
unusable in IE5, but maybe that's just because I have an old-fashioned
regard for the printed word. -T.




From h.rzepa at ic.ac.uk  Sat Mar 27 10:04:45 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:10:41 2004
Subject: LISTADMIN: No attachments to list messages PLEASE
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3CA@rainier.cdgpd.com>
Message-ID: <v04104801b3225c22aea9@[155.198.8.15]>

> This message is in MIME format. Since your mail reader does not understand
> this format, some or all of this message may not be legible.
> 
> ------_=_NextPart_000_01BE77DC.DA4DA186
> Content-Type: text/plain
> 
> Here is that document you asked for ... don't show anyone else ;-)
> 
> 
> ------_=_NextPart_000_01BE77DC.DA4DA186
> Content-Type: application/msword;


Regarding the above message,  I must say most strongly that attaching
enclosures to list postings is HIGHLY discouraged (not to mention
asking them not to show it to anyone else!).  Apart from the risk
of a virus, it also means everyone on the list has to suffer the inconvenience
of downloading a document they might not want, and in many cases might
not be able to read (Unix etc). 

I must say here that in future, anyone attaching a document to a posting
may  be unsubscribed from the list without warning. This includes the
visiting card (vcf) attachments, which are also considered bad list etiquette.

If you do wish to bring a document to the attention of the list, then place
it on an ftp/http server somewhere so that anyone interested can download
it themselves (the pull rather than the push option).  


Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/



From h.rzepa at ic.ac.uk  Sat Mar 27 10:25:10 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:10:41 2004
Subject: LISTADMIN: PLEASE read the unsubscribe instructions!!
Message-ID: <v04104805b322608db898@[155.198.8.104]>

Too many subscribers to the list are NOT READING the 
instructions in the signature, and posting to the list itself 
to unsubscribe. I have made these instructions as clear as
possible, and there is no excuse for not following
them!

I will continue to "name and shame", since such list pollution
is in no-one's interests.

I might also add that requests of the type 
unsubscribe xml-dev indiketr@churchill.co.uk
have to be individually moderated by me, and I 
do not guarantee that this will be done immediately
(especially when  I am away at a conference as 
I have been just recently).  Such requests can take up
to a week to process since I do them in batches.


> From: Mark.Unak@Level3.com
> To: xml-dev@ic.ac.uk
> Subject: unsubscribe xml-dev 
> Date: Fri, 26 Mar 1999 14:00:52 -0700
> MIME-Version: 1.0
> Sender: owner-xml-dev@ic.ac.uk
> Precedence: bulk
> Reply-To: Mark.Unak@Level3.com
> Status: U
> 
> unsubscribe xml-dev indiketr@churchill.co.uk
> <mailto:indiketr@churchill.co.uk> 
> 
> xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
> Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
> To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
> (un)subscribe xml-dev
> To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
> subscribe xml-dev-digest
> List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/



From andrewl at microsoft.com  Sat Mar 27 10:41:34 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:10:41 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF204@RED-MSG-08>

Kevin Hsu asked how to print XML from MS IE5.  To print what is displayed on
the screen, select File/Print.  To print the underlying XML (before
application of style sheets) select View/Source and then print that using
File/Print.  
 
I hope this is helpful,
Andrew

 


From anderst at toolsmiths.se  Sat Mar 27 12:14:32 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <36FCCBAE.6AF9636B@toolsmiths.se>

Rick Jelliffe wrote:

> I have done a few tests on how much compacter forms of XML (e.g.
> shortrefs) impact arrival characteristics of document packet-groups
> under TCP/IP compared to compression.  If your packet size is small, and
> you really need to get at data in the first packet (so that you can
> piggy back request for auto-linked resources in with the ACK for the
> first packet group), then more compact forms of markup may make a
> difference. But in general, compression is more effective. (It also
> depends on where the bottlenecks are in your data path.)

It seems that there are more use-cases which should benefit from having a
compressed or a binary format.

I made some tests using the following XML data.
<xmltest>
   <xi4 value="0" name="VALUE"/>
   <xi4 value="32768" name="VALUE"/>
    ...
</xmltest>

The resulting sizes were:
XML       602830     (Standard XML text)
FML       131143     (Fast ML, a binary ML that I'm working on)
XML.gz     75528     (gzip'ed XML text using -9 as compression level)
FML.gz     20886     (gzip'ed Fast ML using -9 as compression level)

The fascinating result here is the dramatic reduction in size obtained by first
converting to FML and then gzipping the markup stream.
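
To put those figures in proportion, the following Python sketch expresses
each size as a fraction of the plain XML size. It uses only the four numbers
reported above; the function name is made up for the example.

```python
# Sizes as reported in the table above (the units were not stated there).
SIZES = {"XML": 602830, "FML": 131143, "XML.gz": 75528, "FML.gz": 20886}


def relative_sizes(sizes, baseline="XML"):
    """Return each size as a fraction of the baseline entry."""
    base = sizes[baseline]
    return {name: size / base for name, size in sizes.items()}


for name, fraction in relative_sizes(SIZES).items():
    print(f"{name:7s} {fraction:6.1%}")
```

On these numbers FML alone is a little under 22% of the XML size, gzipped
XML about 12.5%, and gzipped FML about 3.5%.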

> And select your element and attribute
> names so that their length is inverse to their frequency, as much as
> possible: so use "a:s" not "abracadabra:shazamarama" (you may even make
> two versions of your DTD: an authoring one and a transmission one.) One
> of the main bottlenecks on many SOHO systems is the modem speed:
> reducing the end-to-end character count means fewer packets, and more
> data arrives earlier, so more auto-links are followed earlier.

On the other hand, "manual tag compression" has a big drawback, which is
readability.

/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From anderst at toolsmiths.se  Sat Mar 27 12:28:07 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
References: <002201be77b3$39760600$1b19da18@ne.mediaone.net>
Message-ID: <36FCCEDB.FCD11117@toolsmiths.se>

Jonathan Borden wrote:

> I think what this really is, when you strip out the concept of binary XML,
> is a suggestion for a compression format tuned for markup streams.
>
>         There are two distinct issues 1) efficiency of parsing  2) compactness. A
> standard compression format for XML (ala zip,gzip etc) would be for
> bandwidth limited applications.

I would like to add one more issue, which is 3) complexity.

Writing efficient and compact parsers is considerably simpler for a binary ML.
The primary reason for this is that the parser does not have to "look" for and
interpret tokens in a stream. All tokens/parts in a binary ML are well known and
their sizes are easily derived from the stream.

A "normally" trained and educated programmer can easily write a "fairly complete"
parser in less than a week. I haven't written an XML text parser myself yet, but
it seems that it's not a task for the faint-hearted.
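
As a rough illustration of why such a parser stays small, here is a Python
sketch of a length-prefixed token stream. The layout (a one-byte token type
followed by a two-byte payload length) is invented for this example and is
not the FML format, whose details are not given here.

```python
import struct

# Hypothetical token types for the example.
START, END, TEXT = 1, 2, 3

# Header: 1-byte token type, 2-byte big-endian payload length (3 bytes total).
HEADER = struct.Struct(">BH")


def encode(tokens):
    """Serialise (type, text) pairs; each payload must fit in 65535 bytes."""
    out = bytearray()
    for kind, text in tokens:
        payload = text.encode("utf-8")
        out += HEADER.pack(kind, len(payload)) + payload
    return bytes(out)


def decode(buf):
    """Yield (type, text) pairs without ever scanning for delimiters."""
    pos = 0
    while pos < len(buf):
        kind, size = HEADER.unpack_from(buf, pos)
        pos += HEADER.size
        yield kind, buf[pos:pos + size].decode("utf-8")
        pos += size
```

The decoder never inspects the payload to find where a token ends; every
size comes straight from the header, which is the property described above.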

/anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From larsga at ifi.uio.no  Sat Mar 27 12:30:09 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:41 2004
Subject: DTD Catalogs
In-Reply-To: <85256740.0060BBCE.00@vgi4mail.vanguard.com>
References: <85256740.0060BBCE.00@vgi4mail.vanguard.com>
Message-ID: <wku2v76rq6.fsf@ifi.uio.no>


* Paul Tihansky
|
| Does anybody know if any of the Java XML Parsers support catalog
| files? 

See:

<URL: http://www.stud.ifi.uio.no/~larsga/linker/xmltools/by-standard.html#XCatalog01>
<URL: http://www.stud.ifi.uio.no/~larsga/linker/xmltools/by-standard.html#SGMLOpencatalogs>

| For instance, if I put a Public Identifier in my DTD declaration
| without a URL, how would a parser such as XP find the DTD?

You could use SAX and register an EntityResolver. Parsers that support
SAX can be tailored through the EntityResolver. 

| How do I specify where the parser can find the catalog file?

That varies.
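
A minimal sketch of the EntityResolver approach, written against the Python
SAX binding (`xml.sax`) rather than the Java one. The catalog here is just a
dictionary from public identifier to local path; the public identifier and
path in the usage line are invented, and whether and when a given parser
actually consults the resolver depends on the parser and its settings.

```python
import xml.sax
from xml.sax.handler import EntityResolver


class CatalogResolver(EntityResolver):
    """Map public identifiers to local system identifiers via a dictionary."""

    def __init__(self, catalog):
        self.catalog = dict(catalog)

    def resolveEntity(self, publicId, systemId):
        # Fall back to the system identifier given in the document, if any.
        return self.catalog.get(publicId, systemId)


def make_catalog_parser(catalog):
    """Create a SAX parser with the catalog-backed resolver registered."""
    parser = xml.sax.make_parser()
    parser.setEntityResolver(CatalogResolver(catalog))
    return parser
```

For example, `make_catalog_parser({"-//EXAMPLE//DTD Memo 1.0//EN":
"dtds/memo.dtd"})` would hand back `dtds/memo.dtd` whenever the parser asks
about that (hypothetical) public identifier.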

--Lars M.




From anderst at toolsmiths.se  Sat Mar 27 12:41:09 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:41 2004
Subject: Is there anyone working on a binary version of XML?
References: <002201be77b3$39760600$1b19da18@ne.mediaone.net> <36FBDB92.56336820@lig.net>
Message-ID: <36FCD1E2.8727F1C@toolsmiths.se>

"Stephen D. Williams" wrote:

> I agree.  I feel they can be solved with a similar solution in at least some circumstances.
> Rather there are some straightforward ways to achieve compression that actually make
> efficiency worse while some solutions for efficiency also make compression easier.
>
> In fact there are a number of levels you could go with compression:
>
> optional gzip/bzip2 possibly preceded by:

For small to medium size streams, the gzip/bzip2 step will probably take longer
to complete than the time it saves in communication. Of course, this also depends on
the network speed.
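
One way to check this for a particular stream and link is simply to measure
it. The Python sketch below is a rough illustration: the function name and
the bandwidth figure are assumptions, and it ignores decompression time on
the receiving side.

```python
import gzip
import time


def compression_tradeoff(data, bytes_per_second):
    """Return (seconds spent compressing, transfer seconds saved)."""
    start = time.perf_counter()
    packed = gzip.compress(data, compresslevel=9)
    spent = time.perf_counter() - start
    saved = (len(data) - len(packed)) / bytes_per_second
    return spent, saved


if __name__ == "__main__":
    sample = b'<xi4 value="32768" name="VALUE"/>\n' * 1000
    spent, saved = compression_tradeoff(sample, bytes_per_second=3000)
    print(f"compressing took {spent:.4f}s, saved {saved:.2f}s of transfer")
```

Compression pays off only when the second number exceeds the first (plus
the decompression cost, which is not measured here).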

>
> Dictionary compression (various forms of building a list of commonly used terms or all terms
> in the current document/stream or some combination)

This is probably the best first action to take when you need to compress an ML stream.

It's also possible to combine Dictionaries with "Sessions", i.e. two communication
nodes could establish a Session which contains pre-negotiated Dictionaries, which
means that Dictionary content has to be sent over the wire only once. All "Packets"
thereafter reference the dictionaries.
This is what I do in FML; however, I have no estimates of how much space is actually saved.
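
The session idea can be sketched in Python as follows. This is an invented
illustration, not the FML mechanism: the class names are hypothetical, and a
"packet" is just a pair of (names not sent before, list of dictionary
indices).

```python
class DictionarySender:
    """Sending side of a session: numbers each name the first time it is used."""

    def __init__(self):
        self.index = {}  # name -> number, grows over the life of the session

    def pack(self, names):
        new_names, refs = [], []
        for name in names:
            if name not in self.index:
                self.index[name] = len(self.index)
                new_names.append(name)  # travels over the wire only this once
            refs.append(self.index[name])
        return new_names, refs


class DictionaryReceiver:
    """Receiving side: rebuilds the same numbering from the new names it sees."""

    def __init__(self):
        self.names = []  # number -> name

    def unpack(self, packet):
        new_names, refs = packet
        self.names.extend(new_names)
        return [self.names[i] for i in refs]
```

After the first packet, a later packet that reuses the same names carries
only small integers.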

/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From larsga at ifi.uio.no  Sat Mar 27 12:52:12 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:41 2004
Subject: Why doesn't XML have Bag?
In-Reply-To: <36FBA3DC.59DA@skynet.be>
References: <5F052F2A01FBD11184F00008C7A4A800022A1729@EUKBANT101> <36FB8C85.BEFC8103@mitre.org> <wk90ckl3jy.fsf@ifi.uio.no> <36FBA3DC.59DA@skynet.be>
Message-ID: <wksoar6qod.fsf@ifi.uio.no>


* Lars Marius Garshol 
|
| It is a limitation of DTDs and was introduced because without this
| operator element content models are easily mapped to finite state
| automatons, but the introduction of the '&' separator makes automaton
| generation much more difficult.

* Paul Janssens
| 
| Please correct me if I am wrong here but isn't that trivial?
| (you may get a BIG automaton, but it's not difficult to generate)

This is correct. However, the number of states required for n elements
is n! with this approach (ie: worse than exponential), which means
that the automaton doesn't just get BIG, for reasonably sized content
models it can get ABSOLUTELY MIND-BOGGLINGLY AWFULLY STUNNINGLY
HUGE. :)

To wit:

[24]> (! 10)
3628800
[25]> (! 30)
265252859812191058636308480000000
 
In other words, this approach doesn't work at all.
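
The same figures can be reproduced in Python, together with a small check
that listing every ordering of an unordered group really does give n!
sequences. The helper name is made up for the example.

```python
from itertools import permutations
from math import factorial


def orderings(names):
    """Every sequence in which an unordered ('&') group of names may appear."""
    return list(permutations(names))


if __name__ == "__main__":
    print(len(orderings(["a", "b", "c"])))  # one sequence per ordering
    print(factorial(10))
    print(factorial(30))
```

Enumerating the orderings is feasible for three or four names and hopeless
well before thirty.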

| and it's 'easily' visualised by the number of possible shortest
| paths between two opposing points on a hypercube.

I can think of easier ways of visualising it, though possibly not
'easier' ones. :)

--Lars M.




From shyutz at ms1.hinet.net  Sat Mar 27 13:21:03 1999
From: shyutz at ms1.hinet.net (Kevin Hsu)
Date: Mon Jun  7 17:10:41 2004
Subject: how to print the XML document in IE 5.0
References: <000901be77bf$31695c80$5118a8c0@kuantech1.quokka.com>
Message-ID: <005201be7851$157f4920$15cd4acb@flag.com.tw>


>I am confused by the responses to this question. I selected the Print
>command from the File menu in IE5 final and it printed just fine. I was
>looking at a raw XML file with no formatting commands of any kind.
>
>Jeff

If you have an XML document with no XSL, IE 5.0 prints the raw XML document
with its default stylesheet, and it looks like a tree view of the XML
document.

I can print XML with an XSL style sheet, but it never prints well with CSS
stylesheets. Try printing the document at the URL below:

http://www.xml.com/1999/03/ie5/first-x.xml

Kevin




From b.laforge at jxml.com  Sat Mar 27 13:59:15 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <007a01be785a$baf28bc0$c8a8a8c0@thing1>

From: Anders W. Tell <anderst@toolsmiths.se>
>Its also possible to combine Dictionaries with "Sessions". ie: two communication
>nodes could establish a Session which contains pre negotiated Dictionaries, which
>means that Dictionary content have to be sent over the wire only once. All "Packets"
>thereafter references the dictionaries.
>This is what I do in FML , however I have no estimates of how much space is actually saved.


If there is a DTD referenced by a document, then that could be used as the dictionary.
Just assign a number for the first occurrence of each element or attribute name found in
the DTD. Yes, this is incomplete, but it's sure better than using short names.
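
A minimal Python sketch of that numbering, assuming the DTD text is already
in hand. It only looks at element declarations; attribute names from ATTLIST
declarations would need a little more parsing and are left out. The function
name is made up for the example.

```python
import re

# Captures the element name from each <!ELEMENT name ...> declaration.
ELEMENT_DECL = re.compile(r"<!ELEMENT\s+([^\s>]+)")


def dictionary_from_dtd(dtd_text):
    """Number element names in order of first occurrence in the DTD."""
    numbers = {}
    for name in ELEMENT_DECL.findall(dtd_text):
        numbers.setdefault(name, len(numbers))
    return numbers
```

Both ends can derive the same table from the same DTD, so nothing extra has
to be transmitted.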

Bill




From larsga at ifi.uio.no  Sat Mar 27 14:04:46 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:42 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <199903270239.TAA08007@malatesta.local>
References: <199903270239.TAA08007@malatesta.local>
Message-ID: <wkn20z6nbf.fsf@ifi.uio.no>


* uche ogbuji
|
| Furthermore, I've been thinking of proposing that the SAX2
| interfaces be specified in IDL rather than Java (or at least
| publishing an IDL translation when the interfaces are stabilized),
| and your proposal wouldn't wash in IDL.

Many things in SAX won't wash in IDL, such as the use of the
Java-specific InputStream, Reader and Locale objects. 

Also, IDL has a problem in that it's sort of a least common
denominator, and thus leaves out many useful language-specific things.
So you'd probably want to do a manual translation anyway.

If there ever is a published SAX spec I think it should use IDL to be
politically correct and point out potential language-mapping problems.
However, the actual utility of IDL I think is low in this particular
case.

--Lars M.




From sdw at lig.net  Sat Mar 27 15:28:24 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <36FCCBAE.6AF9636B@toolsmiths.se>
Message-ID: <36FD00FB.2C6283E@lig.net>

Can you make available what you are working on?  That was the reason for my starting
this thread in the first place.

I'm happy to build upon or learn from existing designs...  It appears you have reached many of the
same conclusions that I have.  Let's pool our design features and make something that can be
used as a common solution.

Thanks
sdw

"Anders W. Tell" wrote:

> Rick Jelliffe wrote:
>
> > I have done a few tests on how much compacter forms of XML (e.g.
> > shortrefs) impact arrival characteristics of document packet-groups
> > under TCP/IP compared to compression.  If your packet size is small, and
> > you really need to get at data in the first packet (so that you can
> > piggy back request for auto-linked resources in with the ACK for the
> > first packet group), then more compact forms of markup may make a
> > difference. But in general, compression is more effective. (It also
> > depends on where the bottlenecks are in your data path.)
>
> It seems that there are more use-cases which should benefit from having a
> compressed or a binary format.
>
> I made some tests using following XML data.
> <xmltest>
>    <xi4 value="0" name="VALUE"/>
>    <xi4 value="32768" name="VALUE"/>
>     ...
> </xmltest>
>
> The resulting sizes was:
> XML       602830     (Standard XML text)
> FML       131143     (Fast ML, a binary ml that Im working on)
> XML.gz    75528      (gzip'ed XML text using -9 as compression rate)
> FML.gz     20886     (gzip'ed Fast ML using -9 as  compression rate)
>
> The facinating result here is the dramatic reduction in size obtained by first
> converting to FML and the GZIP the markup stream.
>
> > And select your element and attribute
> > names so that their length is inverse to their frequency, as much as
> > possible: so use "a:s" not "abracadabra:shazamarama" (you may even make
> > two versions of your DTD: an authoring one and a transmission one.) One
> > of the main bottlenecks on many SOHO systems is the modem speed:
> > reducing the end-to-end character count means fewer packets, and more
> > data arrives earlier, so more auto-links are followed earlier.
>
> On the other hand there is a big drawback  using "manual tag compression"
> which is Readability.
>
> /Anders
> --
> /_/_/_/_/_/_/_/_/_/_/_/_/_/_/
> /  Financial Toolsmiths AB  /
> /  Anders W. Tell           /
> /_/_/_/_/_/_/_/_/_/_/_/_/_/_/

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From david at megginson.com  Sat Mar 27 15:56:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14076.63486.939606.141332@localhost.localdomain>

Rick Jelliffe writes:

 > One trivial way to minimise file sizes for transmission is to
 > collapse white-space inside markup (e.g. [\ \t\n\r]+ becomes
 > [\n]),

Yes, that might be helpful (but only minimally in most cases).

 > sure that newlines are not CR LF pairs, 

Yes, that will make a small difference.  You might get a bigger bang
by doing some quick analysis to determine which character encoding
will provide the smallest object size: UTF-8, ISO-8859-1, UTF-16,
etc. (mileage will vary depending on the languages used in the text).

 > and to minimize whitespace in data: (removing trailing spaces,
 > [\ \t]+\n) becomes [\n], is a safe transformation, for example.)

No.  It might be a safe transformation for specific XML formats, but
not for XML in general, because you don't know what people might be
using that whitespace for.

In general, though, what we need is a transport layer that takes care
of things like this for us.  Document type designers should optimise
for readability and usability, and let protocol designers worry about
the optimisations.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Sat Mar 27 16:08:46 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:42 2004
Subject: LISTADMIN: No attachments to list messages PLEASE
In-Reply-To: <v04104801b3225c22aea9@[155.198.8.15]>
References: <5FFEC1B73A7BD1119D56006008C369F30ED3CA@rainier.cdgpd.com>
	<v04104801b3225c22aea9@[155.198.8.15]>
Message-ID: <14077.67.346287.353886@localhost.localdomain>

Rzepa, Henry writes:

 > Regarding the above message, I must say most strongly that
 > attaching enclosures to list postings is HIGHLY discouraged (not to
 > mention asking them not to show it to anyone else!).  Apart from
 > the risk of a virus, it also means everyone on the list has to
 > suffer the inconvenience of downloading a document they might not
 > want, and in many cases might not be able to read (Unix etc).

As became clear in the follow-ups, the posting was done by a worm that
hides in Word macros (the Internet's equivalent of animal dung,
apparently) and exploits gaping security holes in Outlook to mail itself
out to everyone in a person's address list.

In other words, the original poster did *not* post the attachment to
xml-dev, the worm did.  His only mistakes were (a) using Microsoft
Windows, (b) opening a file in MS Word, and (c) not uninstalling
Outlook from his computer the first time he booted up.  If you had
summarily unsubscribed him, then you would simply have added an unjust
punishment to the embarrassment he was already suffering.

In fact, all three of the mistakes were probably mandated by company
policy; if so the true blame belongs in three places, in diminishing
order of culpability:

1. The poster's company, for ignoring the importance of technical
   diversity and mandating the same operating system and software for
   everyone (it's much easier to write a worm or virus when everyone's 
   using exactly the same software).

2. Redmond, for ignoring security whenever possible.

3. The creator of the worm.

If I'm right about corporate policy, then most of the blame goes to
the company -- Redmond just wants to sell software, and the worm
creator just wants attention, but the company failed to act in its own
self-interest.  Technical diversity is critical for good operation:
I'd no more want to see an all-Linux shop than I'd want to see an
all-Windows or an all-Mac shop.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From anderst at toolsmiths.se  Sat Mar 27 17:34:49 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <14076.63486.939606.141332@localhost.localdomain>
Message-ID: <36FD16C2.B386B0F1@toolsmiths.se>

David Megginson wrote:

> In general, though, what we need is a transport layer that takes care
> of things like this for us.  Document type designers should optimise
> for readability and usability, and let protocol designers worry about
> the optimisations.

In general I agree, especially that document type creators should not
be doing "manual optimizations". However, in the case of DOM-to-DOM
communication it's possible to do much better than using XML text as a content carrier.

In this case the question arises: where is the protocol interface?
Is it an interface that accepts an arbitrary byte stream and transports opaque data
to a receiver (A), or is it an interface that accepts a DOM tree and sends it
to a receiver (B)?

(A)
DOM --> [streamifyXML] --> XML text --> [protocolInterface] --> something smaller, faster (gzip, FastML, ...)
==communicate==>
something smaller, faster --> [protocolInterface] --> XML text --> [SAX] --> DOM

(B)
DOM  --> [protocolInterface] -->something smaller (gzip)
==communicate==>
something smaller  --> [protocolInterface] -->DOM
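
For illustration only, a minimal Java sketch of path (A), assuming gzip as
the "something smaller" step and the JDK's built-in SAX parser on the
receiving side. The class and method names (GzipXmlRoundTrip, compress,
countElements) are hypothetical and not part of any interface proposed in
this thread; the receiver merely counts elements instead of building a DOM.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: XML text -> gzip bytes -> SAX events.
final class GzipXmlRoundTrip {

    // Sender side: serialize the XML text as UTF-8 and gzip it.
    static byte[] compress(String xml) throws IOException {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(buf)) {
            gz.write(xml.getBytes(StandardCharsets.UTF_8));
        }
        return buf.toByteArray();
    }

    // Receiver side: gunzip on the fly and feed the stream to a SAX parser.
    // A real receiver would build its tree here; this sketch counts elements.
    static int countElements(byte[] compressed) throws Exception {
        final int[] count = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                count[0]++;
            }
        };
        try (GZIPInputStream in =
                 new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
        return count[0];
    }
}
```

Path (B) would replace the text serialization and the SAX step with a
direct tree encoder and decoder, which is exactly the part that has no
standard interface yet.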


/anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From jborden at mediaone.net  Sat Mar 27 17:58:42 1999
From: jborden at mediaone.net (Jonathan Borden)
Date: Mon Jun  7 17:10:42 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <36FD16C2.B386B0F1@toolsmiths.se>
Message-ID: <003201be787a$8d4e1340$1b19da18@ne.mediaone.net>

Anders W. Tell wrote:
> ...in the case of DOM to DOM
> communication it's possible to do much better than using XML text
> as a content carrier.
>
> In this case the question arises, where is the protocol interface?
> Is it an interface that accepts an arbitrary byte stream and
> transports opaque data
> to a receiver (A) or is it an interface that accepts a DOM tree
> and sends it
> to a receiver (B)?
>

The protocol layer would be HTTP, SMTP, etc. These protocols employ MIME. The
process of content negotiation might provide something like:

Content-type: application/xml; encoding="compressed-xml"

I'm not sure what DOM-to-DOM communication means; the DOM doesn't
currently have any standard methods to even create XML documents, let alone
provide communications support. Most parsers accept an href. You could
propose a new protocol such as "x-xml:..." but better might be to request
your encoding type in the HTTP request Accept: header and then check the
response content-type.
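
As a rough Java sketch of the "check the response content-type" step: the
helper below looks for the encoding="compressed-xml" parameter used in the
example header above. That parameter is only this message's example, not a
registered MIME parameter, and the class and method names are hypothetical.
(In practice HTTP usually negotiates compression through the
Accept-Encoding and Content-Encoding headers instead.)

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for inspecting a Content-Type header value.
final class ContentTypeCheck {

    // Split "type/subtype; k=v; k2=\"v2\"" into its parameters.
    // Naive on purpose: it does not handle ';' inside quoted values.
    static Map<String, String> parameters(String contentType) {
        Map<String, String> params = new LinkedHashMap<>();
        String[] parts = contentType.split(";");
        for (int i = 1; i < parts.length; i++) {
            int eq = parts[i].indexOf('=');
            if (eq < 0) {
                continue; // not a key=value pair, ignore it
            }
            String key = parts[i].substring(0, eq).trim().toLowerCase(Locale.ROOT);
            String value = parts[i].substring(eq + 1).trim();
            if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                value = value.substring(1, value.length() - 1); // strip quotes
            }
            params.put(key, value);
        }
        return params;
    }

    // True only for application/xml carrying the assumed encoding parameter.
    static boolean isCompressedXml(String contentType) {
        String mediaType = contentType.split(";")[0].trim().toLowerCase(Locale.ROOT);
        return mediaType.equals("application/xml")
                && "compressed-xml".equals(parameters(contentType).get("encoding"));
    }
}
```

A client would send its preference in the request, then call something
like isCompressedXml on the response header before choosing a decoder.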


Jonathan Borden
http://jabr.ne.mediaone.net




From larsga at ifi.uio.no  Sat Mar 27 18:02:58 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:42 2004
Subject: SAX2: Proposed alternative DTD interface
In-Reply-To: <14076.1733.365295.427943@localhost.localdomain>
References: <14076.1733.365295.427943@localhost.localdomain>
Message-ID: <wkk8w27quw.fsf@ifi.uio.no>


* David Megginson
|
| Here's another alternative for SAX2: forget about trying to report
| DTD declarations as events, and simply make the whole DTD available
| through an interface with a Parser2.get() call.
 
I'm against this. Having an event-based/object-based dichotomy makes
sense for DTDs just as it does for document instances. Also, this
breaks with the rest of SAX, is relatively complex and will at some
point probably be in direct competition with the DOM Level X.

Parsers that already have an internal object representation of the DTD
will need to wrap that with this interface, which probably won't be a
very nice job, while an adapter for the event-based interface should
be simple.

Furthermore, this can be built on top of a 100% event-based SAX2.

And, finally, I dislike the iterators. They are just a nuisance in
higher-level languages, and a plain array would probably be better.
It would also free us from all this casting.

I say do it, but on top of an event-based interface, outside of the
SAX2 core and preferably without the iterators. A nice addition might
be support for content model automatons. (Three methods are needed:
get_start_state, get_next_state and is_final_state.)
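
To make the three automaton methods concrete, here is a small Java sketch.
It assumes integer states and a plain transition table; everything except
the three method names (rendered in Java style as getStartState,
getNextState and isFinalState) is invented for illustration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical content model automaton with the three suggested methods.
final class ContentModelAutomaton {
    private final int startState;
    private final Set<Integer> finalStates = new HashSet<>();
    private final Map<Integer, Map<String, Integer>> transitions = new HashMap<>();

    ContentModelAutomaton(int startState) {
        this.startState = startState;
    }

    void addTransition(int from, String elementName, int to) {
        transitions.computeIfAbsent(from, k -> new HashMap<>()).put(elementName, to);
    }

    void markFinal(int state) {
        finalStates.add(state);
    }

    int getStartState() {
        return startState;
    }

    // Returns the next state, or -1 if the element is not allowed here.
    int getNextState(int state, String elementName) {
        Map<String, Integer> out = transitions.get(state);
        Integer next = (out == null) ? null : out.get(elementName);
        return (next == null) ? -1 : next;
    }

    boolean isFinalState(int state) {
        return finalStates.contains(state);
    }

    // Convenience: does this sequence of child element names match the model?
    boolean accepts(String... children) {
        int state = getStartState();
        for (String child : children) {
            state = getNextState(state, child);
            if (state < 0) {
                return false;
            }
        }
        return isFinalState(state);
    }
}
```

For a content model such as (title, para+), one would add the transitions
0 -title-> 1, 1 -para-> 2 and 2 -para-> 2, and mark state 2 as final.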

--Lars M.




From MikeDacon at aol.com  Sat Mar 27 18:07:47 1999
From: MikeDacon at aol.com (MikeDacon@aol.com)
Date: Mon Jun  7 17:10:42 2004
Subject: SAX2: Proposed alternative DTD interface
Message-ID: <45ea762.36fd1e0e@aol.com>

Hi David,

In a message dated 3/26/99 5:31:10 PM Eastern Standard Time,
david@megginson.com writes:
> Here's another alternative for SAX2: forget about trying to report DTD 
>  declarations as events, and simply make the whole DTD available
>  through an interface with a Parser2.get() call.
>  

Although most DTDs will be short, it seems that the event-based
interface will still be beneficial for large DTDs and small-footprint
applications that cannot afford the memory of receiving the entire DTD
implementation object.

I think the best alternative is to allow both options, and you just
don't set a handler if you want to ignore the events.

Which leads me back to my wish list for...

  try {
    Document doc = (Document)parser.get("http://xml.org/sax/props/dom");
  } catch (SAXNotSupportedException e) {
    // ...
  }

Which follows from the same logic.  Sometimes you want an 
event-based interface and sometimes you just want the resulting
object -- a Simple API for XML should cover both cases.

Best wishes,

 - Mike  { www.gosynergy.com }



From larsga at ifi.uio.no  Sat Mar 27 18:27:56 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:42 2004
Subject: Fast filter support in SAX2
In-Reply-To: <009201be7790$c0a6b7a0$c8a8a8c0@thing1>
References: <009201be7790$c0a6b7a0$c8a8a8c0@thing1>
Message-ID: <wkemma7pp7.fsf@ifi.uio.no>


* Bill la Forge
|
| I'd like to suggest another method in Parser2:
| 
|     public String unique(String);
| 
| as well as a featureID for requesting unique element and attribute
| names.

Bill, is this meant to be an interface to the string interning scheme
of the parser? If so, maybe we should call it intern?

Anyway, if that's what it is I support it. I'm a bit unsure why you
think the unique method is needed, though. What kinds of uses do you
have in mind for it?

| If a parser supports both the unique feature and provides access to
| its element stack,

Hmmm. I think this should be skipped. We'll need a special interface
to represent the stack, and parsers will probably have to do some
internal juggling to weed out information from the internal stack
that's only for internal use (and to adapt it to the SAX2 interface).

I think the result will be lower performance than if the application
maintained its own element stack.

--Lars M.




From rbourret at ito.tu-darmstadt.de  Sat Mar 27 18:45:44 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:42 2004
Subject: DOM: notations and unparsed entities
Message-ID: <01BE788A.49DEA2E0@grappa.ito.tu-darmstadt.de>

1) How do I determine if an attribute's value is a notation or unparsed 
entity?  In the case of an unparsed entity, I'm guessing that the Attr node 
has an EntityReference child (is this true?). Notations have me stumped.

2) Is there a general DOM mailing list?  The only one I could find was 
www-dom@w3.org, which I assumed was for spec comments, not questions like 
this.

Thanks,

-- Ron Bourret




From b.laforge at jxml.com  Sat Mar 27 19:16:02 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:42 2004
Subject: Fast filter support in SAX2
Message-ID: <001601be7886$f9db4440$c8a8a8c0@thing1>

From: Lars Marius Garshol <larsga@ifi.uio.no>
>| I'd like to suggest another method in Parser2:
>| 
>|     public String unique(String);
>| 
>| as well as a featureID for requesting unique element and attribute
>| names.
>
>Bill, is this meant to be an interface to the string interning scheme
>of the parser? If so, maybe we should call it intern?
>
>Anyway, if that's what it is I support it. I'm a bit unsure why you
>think the unique method is needed, though. What kinds of uses do you
>have in mind for it?


It would be great if filters had the same advantages as parsers in being able
to simply test for equality (x==y) rather than having to do a string comparison
(x.equals(y)) when checking for a specific element or attribute name. 

From previous discussion on this list, I gathered that many parsers did the 
equivalent of String.intern(), but avoided the JavaSoft implementation for
extra speed. If this is the case, then a filter needs to use the parser's own
intern function to preprocess its constants before testing for matches in the
startElement events.

So the short answer is yes, intern is better than unique. I should have checked 
the lang package first. 
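
A small Java illustration of the point, using the standard String.intern()
as a stand-in for a parser's private interning table (an assumption: a
parser with its own pool would need to expose its own method, which is
what is being asked for here). The class and method names are made up.

```java
// Hypothetical demo: identity comparison only works on interned names.
final class InternDemo {

    // The constant a filter wants to match, interned once up front.
    static final String PARA = "para".intern();

    // Fast path: valid only if 'name' came from the same intern pool.
    static boolean isParaFast(String name) {
        return name == PARA;
    }

    // Always-correct fallback: full string comparison.
    static boolean isParaSafe(String name) {
        return "para".equals(name);
    }
}
```

If the names delivered in startElement are not interned in the same pool
as the filter's constants, the fast path silently fails to match, so the
filter has to know which guarantee the parser gives.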


>| If a parser supports both the unique feature and provides access to
>| its element stack,
>
>Hmmm. I think this should be skipped. We'll need a special interface
>to represent the stack, and parsers will probably have to do some
>internal juggling to weed out information from the internal stack
>that's only for internal use (and to adapt it to the SAX2 interface).
>
>I think the result will be lower performance than if the application
>maintained its own element stack.


When you are working with filter structures, it is difficult to say where the
parser ends and the application begins. You raise an implementation
issue that there should be a separate stack that is accessible, distinct
from the one used by the parser. 

My interest here is, instead, to define a means for sharing the element 
stack across independently developed filters. Just about every filter
which does anything interesting ends up implementing its own element
stack. Why not have one filter that does that, and let the rest get it from
their "parser"? (Think of parser as a role, a source of events relative to a
particular event consumer, not an implementation. The confusion here
comes from giving the interface the name Parser or Parser2, when it
can be either the actual parser or just another filter.)
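
A sketch of what such a stack-keeping filter could look like in Java. It is
written against org.xml.sax.helpers.XMLFilterImpl from the SAX2 API as it
was eventually released, not against the Parser2 draft discussed here, and
the class name and the path() accessor are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

// Hypothetical filter that maintains the element stack once, so that
// downstream filters can ask for it instead of tracking their own.
final class ElementStackFilter extends XMLFilterImpl {
    private final Deque<String> stack = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) throws SAXException {
        stack.push(qName); // push first, so handlers see the new element
        super.startElement(uri, localName, qName, atts);
    }

    @Override
    public void endElement(String uri, String localName, String qName)
            throws SAXException {
        super.endElement(uri, localName, qName);
        stack.pop(); // pop last, so handlers still see the closing element
    }

    // Snapshot of the currently open elements, outermost first.
    List<String> path() {
        List<String> names = new ArrayList<>(stack);
        Collections.reverse(names);
        return names;
    }
}
```

What the stack should carry beyond element names (attributes, sibling
numbers, source entity) is left open in this sketch.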

Bill




From larsga at ifi.uio.no  Sat Mar 27 19:32:43 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:42 2004
Subject: Fast filter support in SAX2
In-Reply-To: <001601be7886$f9db4440$c8a8a8c0@thing1>
References: <001601be7886$f9db4440$c8a8a8c0@thing1>
Message-ID: <wkd81u7mpj.fsf@ifi.uio.no>


* Lars Marius Garshol
| 
| I'm a bit unsure why you think the unique method is needed,
| though. What kinds of uses do you have in mind for it?

* Bill la Forge
| 
| From previous discussion on this list, I gathered that many parsers
| did the equivalent of String.intern(), but avoided the JavaSoft
| implementation for extra speed. If this is the case, then a filter
| needs to use the parser's own intern function to preprocess its
| constants before testing for matches in the startElement events.

Ah, I didn't think of that. Now that I've had some more time
to think about this I realize that this would also be useful for
filters that create new names, such as XAF.
 
| So the short answer is yes, intern is better than unique. I should
| have checked the lang package first.

This terminology is also used in Common Lisp and Python, and probably
many other places as well.
 
| My interest here is, instead, to define a means for sharing the
| element stack across independently developed filters. Just about
| every filter which does anything interesting ends up implementing
| its own element stack. Why not have one filter that does that, and
| let the rest get it from their "parser". 

Another good point, and a very good idea, too.  

However, then we need to define the element stack interface and what
should be included there. Just the elements? Elements and attributes?
Elements, attributes and sibling number? Which entity each element
comes from?

Maybe this should be done outside the SAX core? On the other hand, if
filters are included I think this should be too.

--Lars M.




From david at megginson.com  Sat Mar 27 20:11:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:42 2004
Subject: SAX2: Proposed alternative DTD interface
In-Reply-To: <45ea762.36fd1e0e@aol.com>
References: <45ea762.36fd1e0e@aol.com>
Message-ID: <14077.14778.566156.825039@localhost.localdomain>

MikeDacon@aol.com writes:

 > Although most DTDs will be short, it seems that the event-based
 > interface will still be beneficial for large DTDs and
 > small-footprint applications that cannot afford the memory of
 > receiving the entire DTD implementation object.

It's worthwhile, perhaps, to ask whether there will be many XML
applications that

a) require a small footprint;
b) need DTD information; and
c) can use the information in a streaming format.

Any kind of DTD-driven editing tool needs to store the DTD in some
kind of a persistent structure, and I imagine that most XML processing 
on small clients will not worry much about DTDs at all.

To continue playing devil's advocate (since I don't really know which
alternative I prefer), I'll also point out that even the largest DTDs, 
like TEI or DocBook, would measure their memory requirements in
kilobytes rather than megabytes; an event-based API makes sense for
the document itself because there is no known limit to an XML
document's size.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Sat Mar 27 20:13:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:42 2004
Subject: Fast filter support in SAX2
In-Reply-To: <001601be7886$f9db4440$c8a8a8c0@thing1>
References: <001601be7886$f9db4440$c8a8a8c0@thing1>
Message-ID: <14077.15323.327681.132673@localhost.localdomain>

Bill la Forge writes:

 > It would be great if filters had the same advantages as parsers in
 > being able to simply test for equality (x==y) rather than having to
 > do a string comparison (x.equals(y)) when checking for a specific
 > element or attribute name.

Yes, but as someone (James Clark?) pointed out during the last round,
with most serious applications you're going to end up doing hash
lookups anyway, so the == doesn't buy you much.
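
A hedged Java sketch of the kind of hash dispatch meant here. The
ElementDispatcher class is hypothetical; the point is only that a HashMap
lookup goes through hashCode() and equals(), so it finds the handler
whether or not the element name was interned.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical dispatcher: element name -> handler, via a hash table.
final class ElementDispatcher {
    private final Map<String, Consumer<String>> handlers = new HashMap<>();

    void on(String elementName, Consumer<String> handler) {
        handlers.put(elementName, handler);
    }

    // Looks up the handler by value equality; returns false if none is set.
    boolean dispatch(String elementName) {
        Consumer<String> handler = handlers.get(elementName);
        if (handler == null) {
            return false;
        }
        handler.accept(elementName);
        return true;
    }
}
```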


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From h.rzepa at ic.ac.uk  Sat Mar 27 20:29:45 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:10:42 2004
Subject: LISTADMIN: The "Melissa" Virus
Message-ID: <v04104805b322ef6f5ad4@[155.198.8.81]>

This list was hit earlier by the "Melissa" virus:

http://www.news.com/News/Item/0,4,34334,00.html

Apparently, not many anti-viral programs detect it yet.
Please take great care with Word/Outlook combinations.
If anyone knows of anti-viral tools that detect this, please
let me know and I will alert this list.

Many thanks.  


Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/



From b.laforge at jxml.com  Sat Mar 27 20:41:33 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:43 2004
Subject: Fast filter support in SAX2
Message-ID: <00a601be7892$f1d9e560$c8a8a8c0@thing1>

From: Lars Marius Garshol <larsga@ifi.uio.no>
>However, then we need to define the element stack interface and what
>should be included there. Just the elements? Elements and attributes?
>Elements, attributes and sibling number? Which entity each element
>comes from?
>
>Maybe this should be done outside the SAX core? On the other hand, if
>filters are included I think this should be too.


I'm all for delaying things which are independent of the SAX2 core. It will be
good to be able to focus on filter considerations, aka MDSAX2. The
complication is when filter considerations impact SAX2.

For example, where would be the best place for the intern method? I would
hate to see it on Parser2, as that creates added overhead for each filter.
(Yes, and I was the one who suggested it. :-)

So far, I have a pretty short list of things we might need for filter structures:

1. An intern interface.

2. Request that element and attribute names be intern'ed. (Might be combined
    with a successful get on the intern interface.)

3. Element stack interface.

4. Application event routing. Necessary for non-linear filter structures where more
    than one filter needs access to the events coming from the application, like handler
    registration.

In addition, I also see a need for a DOMWalker interface:

public interface DOMWalkerContext
{
    public Element getCurrentElement();
}

A filter could ask this of its parser and then be able to process "parse" events
based on their source in the DOM. A good start for a SAX-based XSL, I suspect.

But like I said, this should wait.

On the other hand, I would like to suggest that Parser2 NOT be derived from
Parser. We could then have a pure SAX2 implementation, where things like 
document handler would be registered just like any other SAX2 event handler.

This would make for much cleaner filter2s. And there's going to be a whole lot
more filters than parsers, mm?

Bill




From rbourret at ito.tu-darmstadt.de  Sat Mar 27 20:43:31 1999
From: rbourret at ito.tu-darmstadt.de (Ronald Bourret)
Date: Mon Jun  7 17:10:43 2004
Subject: SAX2: Proposed alternative DTD interface
Message-ID: <01BE789A.BF03B320@grappa.ito.tu-darmstadt.de>

David Megginson wrote:

> It's worthwhile, perhaps, to ask whether there will be many XML
> applications that
>
> a) require a small footprint;
> b) need DTD information; and
> c) can use the information in a streaming format.

Point (c) is the one that gets me.  All the DTD-based applications I can 
think of eventually need a set of objects over the DTD because they are 
either analyzing the DTD or continually checking against it.

The only exception I can think of to this is Simon's validation routine in 
his layered parser, and he needs so much lexical information he's likely to 
be unhappy with an event-based DTD parser anyway.  (A quick and dirty fix 
would be to redefine validation to mean logical validation, not physical 
validation.)

(By the way, can we change ContentParticle.isOmissible to isOptional?  I 
had to think a bit before I realized what isOmissible meant.)

-- Ron Bourret




From b.laforge at jxml.com  Sat Mar 27 20:44:56 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:43 2004
Subject: SAX2: Proposed alternative DTD interface
Message-ID: <00ab01be7893$66902180$c8a8a8c0@thing1>

What about sequential reuse of a parser? 
If it's going to process the same DTD again, couldn't it
have cached the DTD?

And wouldn't a DTD event stream preclude this important optimization?

B




From begeddov at jfinity.com  Sat Mar 27 21:41:18 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:43 2004
Subject: half-baked parsers vs binary XML
Message-ID: <36FD4FA4.26BCB466@jfinity.com>

I have been thinking about optimal XML parsing, partly as a result of
the binary XML discussion. Right now the world of XML parsers is divided
into well-formedness-checking and validating parsers. Another type being
discussed is binary.

I'd like to propose another, the half-baked parser. This parser is
mentioned in the notes for section 5.1 of the annotated XML spec (not in
a positive light :-).

The half-baked parser can only process XML documents that don't have a
prologue. This makes its memory footprint and execution path much
smaller and faster respectively. Unfortunately, it isn't a legal XML
parser anymore.

This can be addressed by having a modular parser architecture that would
be optimistic and try the half-baked parser first. If it encountered a
prologue, it could load either a  WF parsing module or a validating
parsing module.
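
The optimistic dispatch could be sketched in Java roughly as follows. The
PrologSniffer class and its Route enum are hypothetical, and the check is
deliberately crude: it treats anything other than a root start tag at the
head of the document (XML declaration, DOCTYPE, comment or processing
instruction) as a prologue, and it tolerates leading whitespace.

```java
// Hypothetical front end that decides which parsing module to try first.
final class PrologSniffer {

    enum Route { HALF_BAKED, FULL_PARSER }

    // 'head' is the first chunk of the document, already decoded to text.
    static Route choose(String head) {
        int i = 0;
        while (i < head.length() && Character.isWhitespace(head.charAt(i))) {
            i++;
        }
        // No prologue: the document starts directly with '<' followed by a
        // name, rather than '?' (XML declaration, PI) or '!' (DOCTYPE, comment).
        if (i + 1 < head.length() && head.charAt(i) == '<') {
            char c = head.charAt(i + 1);
            if (c != '?' && c != '!') {
                return Route.HALF_BAKED;
            }
        }
        return Route.FULL_PARSER;
    }
}
```

Choosing between a well-formedness module and a validating module once
FULL_PARSER is selected is outside the scope of this sketch.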

I think that a highly tuned half-baked parser in combination with an
optional stream-oriented compression scheme would address many of the
concerns that something like binary XML is intended to deal with in
the transmission, storage and execution speed dimensions.

A great discussion of modular layered parsing can be found on Simon St.
Laurent's web site (www.simonstl.com).

Gabe Beged-Dov
www.jfinity.com




From uche.ogbuji at fourthought.com  Sun Mar 28 00:35:55 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:10:43 2004
Subject: IDL for SAX2
In-Reply-To: Your message of "27 Mar 1999 16:04:36 +0200."
             <wkn20z6nbf.fsf@ifi.uio.no> 
Message-ID: <199903280035.RAA09535@malatesta.local>

> * uche ogbuji
> |
> | Furthermore, I've been thinking of proposing that the SAX2
> | interfaces be specified in IDL rather than Java (or at least
> | publishing an IDL translation when the interfaces are stabilized),
> | and your proposal wouldn't wash in IDL.
> 
> Many things in SAX won't wash in IDL, such as the use of the
> Java-specific InputStream, Reader and Locale objects. 

Huh?  Sounds like orthogonal matters to me.

module spam {
	interface InputStream;	// forward declaration

	interface eggs {
		string foobar(in InputStream input);
	};
};

Is perfectly legal IDL.  IDL does not concern itself with the implementation 
of any object: strictly interface, as its name promises.  You can use 
InputStream, Reader, etc. etc. to your heart's content.  In fact, you can even 
improve on the current Java approach by actually _defining_ the interface for 
those classes, saving non-Java users a spurious trip through the Javadoc.

> Also, IDL has a problem in that it's sort of a least common
> denominator, and thus leaves out many useful language-specific things.

Examples, please.  I don't think your above example of Java-specific objects 
really minimizes the usefulness of IDL.

> So you'd probably want to do a manual translation anyway.

> If there ever is a published SAX spec I think it should use IDL to be
> politically correct and point out potential language-mapping problems.
> However, the actual utility of IDL I think is low in this particular
> case.

It's not a matter of "politically correct".  IDL is an excellent _engineering_ 
tool whenever you need to define interface.  I have used it time and time in 
my career, and I find that the ability to generate stubs in any native 
language directly from the IDL, thus ensuring adherence to the interface, 
saves much development time.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From cowan at locke.ccil.org  Sun Mar 28 00:38:21 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:43 2004
Subject: Proposed new kind of SAX2 thing, with example
In-Reply-To: <14075.35687.586960.200728@localhost.localdomain> from "David Megginson" at Mar 26, 99 08:29:04 am
Message-ID: <199903280144.UAA03387@locke.ccil.org>

David Megginson scripsit:

> Use the following from Parser2 (née ModParser):
> 
>     public abstract Object get (String prop)
> 	throws SAXNotSupportedException;
> 

Ah.  In that case, please add another get method with an index
value, and ditto for set.  This way we can have indexed properties.

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From david at megginson.com  Sun Mar 28 03:17:35 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:43 2004
Subject: half-baked parsers vs binary XML
In-Reply-To: <36FD4FA4.26BCB466@jfinity.com>
References: <36FD4FA4.26BCB466@jfinity.com>
Message-ID: <14077.33404.430088.361367@localhost.localdomain>

Gabe Beged-Dov writes:

 > The half-baked parser can only process XML documents that don't have a
 > prologue. This makes its memory footprint and execution path much
 > smaller and faster respectively. Unfortunately, it isn't a legal XML
 > parser anymore.

No, you'll probably find that there's no speed difference at all (why
would there be?).  There will be a small size difference, but it will
be less exciting than you think -- the code to detect the prologue and 
load the module will make up much of the difference.  DTD validation
really doesn't require much extra code, and the code, of course, isn't 
triggered unless you're validating in the first place; doing the
well-formedness checks for legal characters can take up a lot of code, 
but you're supposed to do that anyway (I cheated with AElfred).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From begeddov at jfinity.com  Sun Mar 28 04:41:22 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:43 2004
Subject: half-baked parsers vs binary XML
References: <36FD4FA4.26BCB466@jfinity.com> <14077.33404.430088.361367@localhost.localdomain>
Message-ID: <36FD95F9.7E93A231@jfinity.com>

David Megginson wrote:

> No, you'll probably find that there's no speed difference at all (why
> would there be?).

There would be a little speed difference from not having to check for defaulted attributes.
The half-baked parser might also be able to directly point to the xml input without having to
copy it, i.e. use start-length pointers for the tags and attrs.  This would be more
cumbersome if there was less of a one to one correspondence between the raw xml and what you
got after expansion and defaulting.
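
A rough sketch of the start-length idea, assuming the input is held in
one buffer and that tag names can be located with a simple pattern. The
helper names tag_spans and span_text are made up for illustration, and
the pattern ignores comments, CDATA sections and processing
instructions, so it is not a parser.

```python
import re

# Matches the name of a start tag; end tags ("</...") are skipped because
# "/" cannot begin a name.
_TAG = re.compile(r"<([A-Za-z_][\w.\-]*)")


def tag_spans(buf):
    """Yield (start, length) for each start-tag name in buf, without copying it."""
    for m in _TAG.finditer(buf):
        yield m.start(1), m.end(1) - m.start(1)


def span_text(buf, span):
    """Materialise a span as a string only when the caller actually needs it."""
    start, length = span
    return buf[start:start + length]
```

Spans stay valid only while the raw text and the reported text agree,
which is why expansion and defaulting make this approach more awkward.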

> There will be a small size difference, but it will
> be less exciting than you think -- the code to detect the prologue and
> load the module will make up much of the difference.

Detecting the prologue and loading an alternate module takes a few lines of Java code.
Prologue processing, entity expansion  and attribute defaulting take up a little more than
that in the parsers that I've looked at.

> doing the
> well-formedness checks for legal characters can take up a lot of code,
> but you're supposed to do that anyway (I cheated with AElfred).

I'm not sure I understand. Could you elaborate on how you cheated :-?

Thanks,

Gabe Beged-Dov
www.jfinity.com




From david at megginson.com  Sun Mar 28 04:54:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:43 2004
Subject: half-baked parsers vs binary XML
In-Reply-To: <36FD95F9.7E93A231@jfinity.com>
References: <36FD4FA4.26BCB466@jfinity.com>
	<14077.33404.430088.361367@localhost.localdomain>
	<36FD95F9.7E93A231@jfinity.com>
Message-ID: <14077.38830.801250.747754@localhost.localdomain>

Gabe Beged-Dov writes:

[on a validating parser]

 > There would be a little speed difference from not having to check
 > for defaulted attributes.

Not a measurable one -- the parser just needs to set a boolean flag
when there are no default values available, then it doesn't have to
check each time.
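
A small sketch of that flag, assuming attribute defaults are collected
into a table while the DTD is read. The class name AttributeDefaulter
and its layout are hypothetical; the only point is that the per-element
path tests one boolean when no defaults were declared.

```python
class AttributeDefaulter:
    """Hypothetical helper: applies DTD attribute defaults only when some exist."""

    def __init__(self, defaults=None):
        # defaults maps element name -> {attribute name: default value}
        self.defaults = defaults or {}
        # Computed once; the per-element path tests only this flag.
        self.has_defaults = bool(self.defaults)

    def apply(self, element, attrs):
        if not self.has_defaults:
            return attrs  # fast path: no lookup, no copy
        merged = dict(self.defaults.get(element, {}))
        merged.update(attrs)  # explicitly specified values override defaults
        return merged
```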

 > The half-baked parser might also be able to directly point to the
 > xml input without having to copy it, i.e. use start-length pointers
 > for the tags and attrs.  This would be more cumbersome if there was
 > less of a one to one correspondence between the raw xml and what
 > you got after expansion and defaulting.

I think that James Clark does something like that with Expat, which
does read the prolog properly, though it doesn't expand external
entities by default.  At least, Expat can always return the exact
string where an event originated.

Most efficient XML parsers play pretty clever tricks with their input
buffers, even with entity expansion.

 > > There will be a small size difference, but it will be less
 > > exciting than you think -- the code to detect the prologue and
 > > load the module will make up much of the difference.
 > 
 > Detecting the prologue and loading an alternate module takes a few
 > lines of Java code.  

Well, a little more than that, because you'll have to pass the current
state on to the new module.

 > Prologue processing, entity expansion and attribute defaulting take
 > up a little more than that in the parsers that I've looked at.

The version of AElfred that I wrote was around 27K (uncompressed)
including full parsing of element, attribute, and entity declarations,
and expansion of external entities (including the external DTD
subset); even then, AElfred would have been about 7K smaller if I
hadn't written my own hashing, interning, buffer-handling etc. for
speed's sake.

I still believe that a 10K XML non-validating parser class in Java is
not out of reach, *including* parsing the prolog, if people are
willing to use the standard Java classes.

 > > doing the well-formedness checks for legal characters can take up
 > > a lot of code, but you're supposed to do that anyway (I cheated
 > > with AElfred).
 > 
 > I'm not sure I understand. Could you elaborate on how you cheated :-?

At least when I was maintaining it, AElfred didn't perform all of the
required well-formedness checks for different ranges of Unicode
characters allowed and not allowed in names, attribute values,
character data, etc.  I tried adding it, but it bloated the code by
about 7-8K (much more than parsing the prolog and DTD).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From cowan at locke.ccil.org  Sun Mar 28 05:18:55 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:43 2004
Subject: XML and (K)Office
In-Reply-To: <3.0.32.19990326092935.00e4a604@pop.intergate.bc.ca> from "Tim Bray" at Mar 26, 99 09:52:11 am
Message-ID: <199903280424.XAA08855@locke.ccil.org>

Tim Bray scripsit:

> >How will future users make sense of the format without
> >a DTD?
> 
> And what, pray tell, part of a DTD helps you "make sense" of a
> format? -Tim

Hear, hear.  I spent far too much time poring over the XMLspec DTD
and some examples (which didn't quite match the published DTD any more),
without understanding any too much of it.  Reading the documentation
(in English) made all the difference.

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From begeddov at jfinity.com  Sun Mar 28 05:30:53 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:43 2004
Subject: half-baked parsers vs binary XML
References: <36FD4FA4.26BCB466@jfinity.com>
		<14077.33404.430088.361367@localhost.localdomain>
		<36FD95F9.7E93A231@jfinity.com> <14077.38830.801250.747754@localhost.localdomain>
Message-ID: <36FDA19A.27DA7B45@jfinity.com>

Another reason (other than the binary XML thread) that I brought this up was discussion on
the perl-xml mailing list of whether XML::Parser was usable for soft real-time server side
processing. The consensus there seems to be no.

XML::Parser is layered on expat. Anecdotal evidence seems to be that there is an order of
magnitude performance advantage to "parsing" something other than XML. The two alternatives
are a textual format that Perl can eval directly (Data::Dumper) and a binary format
(Storable).

In both cases (Data::Dumper and Storable) there is conversion from the on-disk format to the
in-memory format. Why is XML so much slower according to developer feedback? That is what I
was trying to understand from other people's experience rather than doing a hands-on analysis
myself.

I may have jumped to the conclusion that it was the extra work that a well-formedness
processor has to do over what a half-baked processor would do. That still leaves the question
of where the slowdown is and whether it is an implementation issue or inherent in some aspect
of XML parsing.
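
One way to start locating the slowdown is to time the three kinds of
loader on the same record. The sketch below uses Python stand-ins rather
than the Perl modules named above: ast.literal_eval for an eval-able text
dump, pickle for a binary format, and xml.etree.ElementTree (which sits
on expat) for XML. The record and the iteration count are arbitrary
assumptions, and no particular result is implied.

```python
import ast
import pickle
import timeit
import xml.etree.ElementTree as ET

# Arbitrary sample record; all values are strings so the three forms agree.
record = {"id": "42", "name": "widget", "price": "9.99"}

as_xml = "<record>" + "".join(f"<{k}>{v}</{k}>" for k, v in record.items()) + "</record>"
as_text = repr(record)          # textual form that can be evaluated directly
as_binary = pickle.dumps(record)  # binary form


def from_xml(s):
    """Rebuild the record from its XML form."""
    return {child.tag: child.text for child in ET.fromstring(s)}


loaders = {
    "xml": lambda: from_xml(as_xml),
    "literal": lambda: ast.literal_eval(as_text),
    "pickle": lambda: pickle.loads(as_binary),
}

# Seconds taken for 2000 loads of each form.
timings = {name: timeit.timeit(fn, number=2000) for name, fn in loaders.items()}
```

Comparing the numbers for small and large records would help separate
fixed per-document overhead from per-character parsing cost.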

Thanks,

Gabe Beged-Dov
www.jfinity.com




From cowan at locke.ccil.org  Sun Mar 28 06:34:51 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:43 2004
Subject: DOM: notations and unparsed entities
In-Reply-To: <01BE788A.49DEA2E0@grappa.ito.tu-darmstadt.de> from "Ronald Bourret" at Mar 27, 99 07:44:54 pm
Message-ID: <199903280536.AAA11704@locke.ccil.org>

Ronald Bourret scripsit:

> 1) How do I determine if an attribute's value is a notation or unparsed 
> entity?  In the case of an unparsed entity, I'm guessing that the Attr node 
> has an EntityReference child (is this true?). Notations have me stumped.

You can't tell.  And no, the value of a NOTATION attribute is a string,
not an entity reference (which is used only when you have actual &...;
markup).  The DOM conceals the XML type of attributes, at least at
level 1.

> 2) Is there a general DOM mailing list?  The only one I could find was 
> www-dom@w3.org, which I assumed was for spec comments, not questions like 
> this.

It's for all DOM talk, including spec comments.

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From cowan at locke.ccil.org  Sun Mar 28 06:35:48 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:43 2004
Subject: Fast filter support in SAX2
In-Reply-To: <wkd81u7mpj.fsf@ifi.uio.no> from "Lars Marius Garshol" at Mar 27, 99 09:32:24 pm
Message-ID: <199903280541.AAA11787@locke.ccil.org>

Lars Marius Garshol scripsit:

> This terminology is also used in Common Lisp and Python, and probably
> many other places as well.

It was born in the LISP environment, where atoms were "interned on the
OBLIST" to make them unique.

> However, then we need to define the element stack interface and what
> should be included there. Just the elements? Elements and attributes?
> Elements, attributes and sibling number? Which entity each element
> comes from?

Just the element types, IMHO.  This is very easy to expose as Strings
for almost any kind of parser.  If you want more, do it yourself.

> Maybe this should be done outside the SAX core?

It certainly can be, but the parser is already doing it, and why
reinvent the wheel?

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From mrc at allette.com.au  Sun Mar 28 06:42:26 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:10:43 2004
Subject: Why doesn't XML have Bag?  Uh, "set"
References: <001401be776d$f9af2be0$ab20268a@pc-lrd.bath.ac.uk>
			<36FB6534.FDF0192D@jrc.it>
			<36FB76BA.EBE5172D@mitre.org> <14075.36327.509783.485757@localhost.localdomain> <36FB93E5.F662B56A@mitre.org>
Message-ID: <36FDB30C.93E227EC@allette.com.au>


Roger L. Costello wrote:

> Thanks Dave for clarifying terminology.  It is "set" that I meant, not
> "bag".  Just to make certain that I understand, an XML DTD cannot
> express the following:
>
> "A <Kitchen> element contains exactly three child elements: one instance
> of <Sink>, one instance of <Stove>, and one instance of <Refrigerator>,
> and these child elements can appear in any order."
>
> Correct?

XML cannot, but as has been pointed out, SGML can. This is a classic situation where the use
of an SGML validation stage may be cheap and useful. You can check the structure with the
rigid model, then make the appropriate assumptions when using the XML. The XML content model
might not reflect your strict requirements of the data, but the overall process does.

The role of semantic checking may at some stage be taken over by a schema, but until then an
SGML parse can provide the rigidity that you need. This might be as simple as just remapping a
single parameter entity and applying a different parser.
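
Where a separate SGML validation stage is not convenient, the same
"one of each, in any order" requirement can also be checked in
application code after an ordinary XML parse. This is a sketch under
that assumption, not a replacement for the SGML approach; the function
name kitchen_is_valid is hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Exactly one of each child is required; order is deliberately not checked.
REQUIRED = Counter({"Sink": 1, "Stove": 1, "Refrigerator": 1})


def kitchen_is_valid(kitchen):
    """True if <Kitchen> has exactly one of each required child, in any order."""
    return Counter(child.tag for child in kitchen) == REQUIRED


doc = ET.fromstring("<Kitchen><Stove/><Sink/><Refrigerator/></Kitchen>")
ok = kitchen_is_valid(doc)
```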


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From cowan at locke.ccil.org  Sun Mar 28 07:10:57 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:43 2004
Subject: half-baked parsers vs binary XML
In-Reply-To: <14077.38830.801250.747754@localhost.localdomain> from "David Megginson" at Mar 27, 99 09:54:58 pm
Message-ID: <199903280616.BAA12772@locke.ccil.org>

David Megginson scripsit:

> At least when I was maintaining it, AElfred didn't perform all of the
> required well-formedness checks for different ranges of Unicode
> characters allowed and not allowed in names, attribute values,
> character data, etc.  I tried adding it, but it bloated the code by
> about 7-8K (much more than parsing the prolog and DTD).

According to the corrigenda, attribute values and character data
can now contain anything except (hex) 0000-0008, 000B-000C, 000E-001F,
(ASCII controls), D800-DFFF (surrogates), and FFFE-FFFF (non-characters).
Everything else should be allowed.
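
The exclusion list above is short enough to check with a handful of
range comparisons. A sketch, taking the ranges exactly as listed in this
message; the function names are made up for illustration.

```python
# Ranges excluded from attribute values and character data, as listed above.
_ILLEGAL_RANGES = (
    (0x0000, 0x0008),
    (0x000B, 0x000C),
    (0x000E, 0x001F),
    (0xD800, 0xDFFF),  # surrogates
    (0xFFFE, 0xFFFF),  # non-characters
)


def is_legal_xml_char(ch):
    """True if ch falls outside every excluded range."""
    code = ord(ch)
    return not any(lo <= code <= hi for lo, hi in _ILLEGAL_RANGES)


def first_illegal_offset(text):
    """Return the index of the first excluded character, or -1 if there is none."""
    for i, ch in enumerate(text):
        if not is_legal_xml_char(ch):
            return i
    return -1
```

Tab, line feed and carriage return (0009, 000A, 000D) fall in the gaps
between the excluded ranges and so remain legal.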

There are some rules in Appendix B of XML that allow you to leverage
the methods in Character.  When I get a chance, I'll write some Java
code that correctly recognizes XML name and name-start characters.
The big tables are already in the java.lang.Character class.

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From alank at iol.ie  Sun Mar 28 08:23:43 1999
From: alank at iol.ie (Alan Kennedy)
Date: Mon Jun  7 17:10:43 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <14076.63486.939606.141332@localhost.localdomain> <36FD16C2.B386B0F1@toolsmiths.se>
Message-ID: <36FDA44B.C48CA611@iol.ie>

"Anders W. Tell" wrote:

> However in the case of DOM to DOM
> communication it's possible to do much better than using XML text as a content carrier.
>

Don't forget about IIOP, the CORBA "RPC" protocol. While not the absolutely optimal transport
protocol, it has been optimised by a group of experts (I think :-) for platform, transport,
endian, etc, independence.

This is why the DOM interfaces are defined in OMG IDL. You can take the DOM interface definitions
and feed them through an IDL compiler, which will generate client (local) and server (remote)
transport stubs. These stubs, which can be in any language supported with an IDL compiler, take
care of all parameter marshalling, etc, for transport across a network, between address spaces,
etc.

If you want to experiment with an IDL compiler for JAVA, OrbixWeb is pretty good, and you can get
a 60 day evaluation from the IONA web site, at

http://www.iona.com/info/products/orbixweb/index.html

There are free IDL compilers and ORBs available too.

This takes care of eliminating tags from the communication stream (although these would be
replaced by a wire representation of the method name and parameters), since a parsed DOM structure
could communicate directly with another (possibly remote) parsed DOM structure.

However, the actual element content would still be transported in full representation, with no
compression.

To deal with situations such as this, OrbixWeb has a non-CORBA standard facility called
"transformers". This is basically a filter callback where you can process data being marshalled as
it goes outside an object's address space, to transform it in whatever way you wish, including
changing its representation. In this case, the obvious requirement is to compress the data in some
way. Note however that the remote DOM would have to have a compatible "un-transformer" to
reconstitute the encoded element content.
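
The transformer and un-transformer pairing can be sketched as two
matching functions. This is not the OrbixWeb API; the names transform and
untransform are hypothetical, and zlib is just one compression scheme
that both ends could agree on.

```python
import zlib


def transform(content: str) -> bytes:
    """Sender side: compress element content before it is marshalled."""
    return zlib.compress(content.encode("utf-8"))


def untransform(payload: bytes) -> str:
    """Receiver side: the matching step that restores the element content."""
    return zlib.decompress(payload).decode("utf-8")
```

Both ends have to agree on the scheme in advance, which is the
compatibility requirement noted above.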

As for what DOM to DOM communication actually means, I think that XLink is a prime use for such
communication, particularly the transclusion stuff.

But that's a whole other subject.

I think there are some very blurred boundaries here between an HTTP-like client/server facility and
direct communication between persistent objects on separate machines.

If you take the view that XML documents are files that are to be transferred in whole or in part
between machines, then the HTTP style approach is the right one.

The alternative is the OMG-CORBA approach: "Objects exist; you (the client) don't need to know
where or how they are stored. Simply refer to their object reference and they will be instantiated
for you and made available as if they were in your local address space."

I think the latter is going to be the way that things will go. Picture every XML document as being
available, fully parsed, and available to you (permissions excepted) as if it was in your local
address space. In this paradigm, HTTP style servers would no longer exist. Web servers would
simply be replaced by ORBs.

And CORBA is an *open* standard.

Alan.




From oren at capella.co.il  Sun Mar 28 10:25:29 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:44 2004
Subject: Fw: Is there anyone working on a binary version of XML?
Message-ID: <010401be78f3$e35001d0$5402a8c0@oren.capella.co.il>

Stephen D. Williams <sdw@lig.net> wrote:
>One other subject that I haven't mentioned, but need for another architecture
>that I designed a while ago is a mechanism for 'parallel inheritance' overlay
>tree processing.  Has anyone else worked on this?  The idea is to have one or
>more base trees and work with a delta tree which represents changes from the
>underlying trees.  This last part is a basic data structure for a rule engine
>and metadata application environment I designed last year.


For general XML trees, I think you'll find that the only way to describe a
'delta' on a tree is using an XSL stylesheet, or something as complex, so
you might as well stick with XSL. We use "delta trees" very heavily, but in
a somewhat specialized form suitable for our application - the input trees
have to be in a very strict format and the set of operations is much
narrower than allowed in XSL.

Have fun,

    Oren Ben-Kiki




From larsga at ifi.uio.no  Sun Mar 28 11:21:37 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:44 2004
Subject: Fast filter support in SAX2
In-Reply-To: <199903280541.AAA11787@locke.ccil.org>
References: <199903280541.AAA11787@locke.ccil.org>
Message-ID: <wkn20yufza.fsf@ifi.uio.no>


* Lars Marius Garshol
|
| However, then we need to define the element stack interface and what
| should be included there. Just the elements? Elements and attributes?
| Elements, attributes and sibling number? Which entity each element
| comes from?

* John Cowan
| 
| Just the element types, IMHO.  This is very easy to expose as Strings
| for almost any kind of parser.  If you want more, do it yourself.

In that case you'll need to make your own stack in addition to the
element stack.  Maybe we should consider providing some means of
annotating the element stack? Some kind of property/value scheme?
 
* Lars Marius Garshol
|
| Maybe this should be done outside the SAX core?
 
* John Cowan
|
| It certainly can be, but the parser is already doing it, and why
| reinvent the wheel?

Because, like I pointed out earlier, the parser probably has more than
just element names in its stack, like the entity it appeared in, and
in some cases possibly location information as well. That needs to be
stripped out when presenting this to the application, which may well
be slower than letting the application do this for itself.

And for many applications more information in the stack is needed, and
then what?
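
For comparison, an application-side stack that carries more than element
names takes only a few lines on top of a SAX-style handler. This sketch
uses Python's xml.sax; the class StackHandler and the choice of what to
record (name, attributes, sibling number) are illustrative assumptions.

```python
import xml.sax


class StackHandler(xml.sax.ContentHandler):
    """Keeps an application-side stack of (name, attributes, sibling number)."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = []            # recorded for demonstration only
        self._child_counts = [0]   # children seen so far at each open level

    def startElement(self, name, attrs):
        self._child_counts[-1] += 1
        sibling = self._child_counts[-1]
        self.stack.append((name, dict(attrs), sibling))
        self._child_counts.append(0)
        self.paths.append("/".join(f"{n}[{s}]" for n, _, s in self.stack))

    def endElement(self, name):
        self.stack.pop()
        self._child_counts.pop()


handler = StackHandler()
xml.sax.parseString(b"<a><b/><b><c/></b></a>", handler)
# handler.paths now holds one path per start tag, e.g. "a[1]/b[2]/c[1]"
```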

--Lars M.




From larsga at ifi.uio.no  Sun Mar 28 11:45:06 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:44 2004
Subject: IDL for SAX2
In-Reply-To: <199903280035.RAA09535@malatesta.local>
References: <199903280035.RAA09535@malatesta.local>
Message-ID: <wkk8w2uew5.fsf@ifi.uio.no>


* Lars Marius Garshol
|
| Many things in SAX won't wash in IDL, such as the use of the
| Java-specific InputStream, Reader and Locale objects.

* uche ogbuji
| 
| IDL does not concern itself with the implementation of any object:
| strictly interface, as its name promises.  You can use InputStream,
| Reader, etc. etc. to your heart's content.  In fact, you can even
| improve on the current Java approach by actually _defining_ the
| interface for those classes, saving non-Java users a spurious trip
| through the Javadoc.

Maybe, but you don't want these to be InputStream, Reader and Locale
in Python, C++ or Common Lisp. You want them to be Python file objects,
C++ streams and Common Lisp streams. So although it may work for Java
it won't work as well everywhere else.
 
* Lars Marius Garshol
|
| Also, IDL has a problem in that it's sort of a least common
| denominator, and thus leaves out many useful language-specific
| things.
 
* uche ogbuji
|
| Examples, please. 

Python 'magic' methods such as __getitem__, Common Lisp generic
methods, Eiffel/Sather invariants and post-/preconditions, Sather
iterators, Python/Common Lisp keyword arguments, Java/C++ overloading
and so on.

This is a problem that isn't really avoidable when you want to
seamlessly cross language boundaries, but it's not clear to me that
that is what we want to do in this particular case.

| I don't think your above example of Java-specific objects really
| minimizes the usefulness of IDL.

Not in general, but I think the fact that you want to map those
objects to different kinds of objects in different languages does mean
that IDL can't be used directly anyway, and then you may as well do
the whole translation manually.

To take one example we've now introduced AttributeList2, a subclass of
AttributeList, which is passed to the usual startElement method. In
Java you need to cast this object to get at the new methods, which is
awkward, means a run-time type-check and sort of defeats the point of
having typing in the first place.

In Common Lisp you'd rather have

(defmethod start-element((dh my-document-handler) (name string)
                         (al attribute-list))
  (error "Dang, we need a SAX 2.0 parser!"))

(defmethod start-element((dh my-document-handler) (name string)
                         (al attribute-list2))
  ; safely use attribute-list2 with no casting, no performance penalty
  ; and no typing problems
  )

This IDL can't do for you, because IDL doesn't have the concept of
generic methods.
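
The same dispatch-on-argument-type idea can be approximated in Python
with functools.singledispatch, which selects an implementation from the
class of the first argument. In this sketch the two classes are
stand-ins, is_specified is a made-up method, and the attribute list is
moved to the first position because that is the argument singledispatch
inspects.

```python
from functools import singledispatch


class AttributeList:
    """Stand-in for the SAX 1.0 attribute list."""


class AttributeList2(AttributeList):
    """Stand-in for the proposed extension; is_specified is a made-up method."""

    def is_specified(self, index):
        return True


@singledispatch
def start_element(attrs, name):
    raise TypeError("unsupported attribute list type")


@start_element.register
def _(attrs: AttributeList, name):
    # Only the SAX 1.0 interface is available on this path.
    return "SAX 1.0 path"


@start_element.register
def _(attrs: AttributeList2, name):
    # The extended interface can be used here with no cast.
    return "SAX 2.0 path"
```

The most specific registered class wins, so an AttributeList2 argument
never reaches the SAX 1.0 implementation.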
 
* Lars Marius Garshol
|
| If there ever is a published SAX spec I think it should use IDL to be
| politically correct and point out potential language-mapping problems.
| However, the actual utility of IDL I think is low in this particular
| case.
 
* uche ogbuji
|
| It's not a matter of "politically correct".  IDL is an excellent
| _engineering_ tool whenever you need to define interface.  I have
| used it time and time in my career, and I find that the ability to
| generate stubs in any native language directly from the IDL, thus
| ensuring adherence to the interface, saves much development time.

When you want to talk to implementations in other processes, on other
computers and in other languages, yes. But I don't think that's really
what we want in this case, to the cost of having less natural
translations to the various languages.

--Lars M.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From larsga at ifi.uio.no  Sun Mar 28 12:07:38 1999
From: larsga at ifi.uio.no (Lars Marius Garshol)
Date: Mon Jun  7 17:10:44 2004
Subject: Proposed new kind of SAX2 thing, with example
In-Reply-To: <199903280144.UAA03387@locke.ccil.org>
References: <199903280144.UAA03387@locke.ccil.org>
Message-ID: <wkg16qudum.fsf@ifi.uio.no>


* David Megginson
| 
| Use the following from Parser2 (née ModParser):
| 
|     public abstract Object get (String prop)
| 	throws SAXNotSupportedException;

* John Cowan
| 
| Ah.  In that case, please add another get method with an index
| value, and ditto for set.  This way we can have indexed properties.

You can anyway, if you just use a Vector or some equivalent as the
property value.
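
A minimal sketch of what I mean, written in Python rather than Java
for brevity. The Parser2 class here is a toy property bag, the
property name "filters" is made up, and set() is assumed to mirror
get(); none of this is the actual proposed interface.

```python
class SAXNotSupportedException(Exception):
    """Raised for properties the parser does not know about."""


class Parser2:
    """Toy property bag; only get/set are modelled (hypothetical)."""

    def __init__(self):
        self._props = {}

    def get(self, prop):
        try:
            return self._props[prop]
        except KeyError:
            raise SAXNotSupportedException(prop) from None

    def set(self, prop, value):
        self._props[prop] = value


parser = Parser2()
# The property value is itself a sequence, so no indexed get/set is needed.
parser.set("filters", ["namespace-filter", "validation-filter"])
second_filter = parser.get("filters")[1]
```

The index is applied to the returned value, so the interface keeps a
single get and a single set.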

--Lars M.




From ricko at allette.com.au  Sun Mar 28 12:39:57 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:44 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <003a01be78ff$438759d0$35f96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>

>Rick Jelliffe writes:
>
> > One trivial way to minimise file sizes for transmission is to
> > collapse white-space inside markup (e.g. [\ \t\n\r]+ becomes
> > [\n]),
>
>Yes, that might be helpful (but only minimally in most cases).

The reason I suggest it is this: at several stages in a network there is
liable to be some point-to-point compression. In particular, of course,
at the modem of the receiver (well, most receiving ends). XML's
verboseness can be partially justified by the existence of this
compression.

Attempting to compress already-compressed data does not always lead to
increased benefits: in fact, compressing already-compressed data can
easily lead to larger files, which is why many compression systems first
check that they have made any gains before writing out the compressed
block. (And if you are going through 7-bit mail systems, then you can
increase your transmission size by compressing data, if the data is
ASCII.)

When judging an XML compression, it is important to judge its effect
after being recompressed by the kind of compression that is found in
modems (i.e., at the bottleneck): the simple, fastest deflate found in
gzip can be useful.  Furthermore, it is important to recognise that,
because of the slow-start algorithm in TCP/IP and the WWW having quite
long ACK delays, a compression of 2:1 is not the same thing as a
doubling in arrival speed: more data will arrive earlier in each packet
group, but the number of packet groups may be the same.  In the case of
the binary version of XML being mentioned, it would be interesting to
see the four-way comparison (raw XML, binary "XML", compressed XML,
compressed binary).

One interesting result of my tests on the interaction of
short-referencing and compression was that collapsing white-space was
(for my independently-produced RDF test files) just as effective as
short-referencing. (One reason might be that many compression algorithms
only have a certain dictionary size, and a certain match-string size:
reducing unneeded white-space may free up dictionary entries and allow
more useful match-strings. Especially for on-the-fly compression, such
as in modems.)
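
For anyone who wants to repeat this kind of measurement, here is a
rough sketch in Python. It is not the test I ran. It assumes a naive
tag pattern (no '>' inside attribute values, and it does not treat
comments, PIs or CDATA sections specially), it collapses runs to a
single space rather than a newline, and it uses zlib at level 1 as a
stand-in for the fast deflate found in modems and gzip. The function
names are invented for the sketch.

```python
import re
import zlib

# Naive tag matcher: assumes '<' and '>' never occur inside attribute values.
TAG = re.compile(r"<[^<>]*>")
# Inside a tag: a quoted attribute value, or a run of white-space.
PIECE = re.compile(r'''"[^"]*"|'[^']*'|\s+''')


def _collapse_tag(match):
    def fix(piece):
        text = piece.group(0)
        # Quoted attribute values are kept verbatim; other runs shrink.
        return text if text[0] in "\"'" else " "
    return PIECE.sub(fix, match.group(0))


def collapse_markup_whitespace(xml_text):
    """Collapse white-space runs inside tags only, never in character data."""
    return TAG.sub(_collapse_tag, xml_text)


def compare_sizes(xml_text):
    """Byte counts before and after collapsing, with and without deflate."""
    raw = xml_text.encode("utf-8")
    collapsed = collapse_markup_whitespace(xml_text).encode("utf-8")
    return {
        "raw": len(raw),
        "collapsed": len(collapsed),
        "raw+deflate": len(zlib.compress(raw, 1)),
        "collapsed+deflate": len(zlib.compress(collapsed, 1)),
    }
```

Whether the deflated sizes differ much depends entirely on the data,
so the numbers are worth reading only for your own files.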

I was surprised, because I thought that white-space was fairly
insignificant: but I was wrong, for the data I was using (some data
would fare better, I would hope, but some may be worse).  So developers
should pay attention to letting users keep their file sizes down: a 10
percent reduction in file size may not seem much, but if, at an extreme,
all the packets are just over the size of the first packet group and the
ACK latency is greater than the packet transmission time, it can result
in the files completing in half the time.  At the smaller file sizes of
XML, and the trends to linking to external stylesheets and so on,
reducing the crap in headers is quite important. In fact, I would think
that it was good policy to have no unnecessary whitespace in header
data in XML documents.
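
The packet-group argument can be made concrete with an idealised
model of slow start, sketched here in Python. The assumptions are
mine: a congestion window that starts at one segment and doubles
every round trip, a 1460-byte segment size, no loss, no delayed ACKs,
and no connection set-up cost. Real stacks differ, so treat the
figures as illustrative only.

```python
def round_trips(size_bytes, mss=1460, initial_window=1):
    """Round trips needed to deliver size_bytes under idealised slow start."""
    segments = -(-size_bytes // mss)  # ceiling division without floats
    window, sent, rounds = initial_window, 0, 0
    while sent < segments:
        sent += window   # one packet group per round trip
        window *= 2      # the window doubles after each ACKed group
        rounds += 1
    return rounds


just_over = round_trips(1600)   # two segments, so a second round trip
trimmed = round_trips(1440)     # 10 percent smaller, fits in the first group
no_gain = (round_trips(14600), round_trips(13140))  # same number of groups
```

In this model the 1600-byte file needs two round trips and the
1440-byte one needs one, while trimming 14600 bytes to 13140 changes
nothing: the saving only shows up when the file crosses a packet-group
boundary.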

>> and to minimize whitespace in data: (removing trailing spaces,
>> [\ \t]+\n becomes [\n], is a safe transformation, for example.)

>No.  It might be a safe transformation for specific XML formats, but
>not for XML in general, because you don't know what people might be
>using that whitespace for.

Of course. But in practice text editors and some kinds of processing
systems will often strip out trailing whitespace on opening or closing.
So I should have said something like "It is not prudent to generate
'[\t\s]+\n' where the whitespace is significant unless you are sure how
software which uses that data treats trailing white-space."  In any
case, I was trying to say that one good way to reduce file sizes is to
not generate unneeded characters in the first-place: I was not proposing
an external compression mechanism based on white-space collapsing.

Rick




From harvey at eccnet.eccnet.com  Sun Mar 28 13:23:44 1999
From: harvey at eccnet.eccnet.com (Betty L. Harvey)
Date: Mon Jun  7 17:10:44 2004
Subject: Melissa Virus Article
In-Reply-To: <003a01be78ff$438759d0$35f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <Pine.LNX.4.04.9903280620190.3573-100000@eccnet.eccnet.com>


Since this listserve was hit with the Melissa Virus Friday
night, I thought some of you might be interested in an
article in yesterday's Washington Post concerning the
virus:

http://www.washingtonpost.com/wp-srv/business/daily/march99/virus27.htm

Betty

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
Betty Harvey                         | Phone: 301-540-8251 FAX: 4268
Electronic Commerce Connection, Inc. | 
13017 Wisteria Drive, P.O. Box 333   | 
Germantown, Md.  20874               |
harvey@eccnet.com                    | Washington,DC SGML/XML Users Grp
URL:  http://www.eccnet.com          | http://www.eccnet.com/sgmlug/
/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/  




From ricko at allette.com.au  Sun Mar 28 13:31:26 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:44 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
Message-ID: <005301be7906$745450c0$35f96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>

>Rick, you're still pointing to implementation details rather than
>abstract modelling.  Try to express the question in terms of the thing
>being modelled -- for example, at a project meeting, the system
>architect might ask the following question:
>
>  Can SGML and XML both model a reference to a photograph, providing
>  the width, height, and colour depth?
>
>The answer, of course, is 'yes'
...
(Sad and somewhat mysterious story deleted: what is the point? That
anyone who discusses what information is implied by XML markup is a
boofhead? In any case, the forum here is XML-DEV, not a company
design meeting.)

Huh? At some level of abstraction all distinctions disappear: XML
becomes the same as ethernet when the abstraction is "things that can
transport characters".

Dave seems to be saying that
    ( X , X, X )
is the same as
     ( X, what-he-said, what-he-said)

I agree that (to bend LISP out of shape)
       eval( X, X, X)
is the same as
       eval(X, what-he-said, what-he-said)
but Dave seems to be saying that the fact that two things (pointed to)
are the same is not "information". That seems an extraordinary claim.

<eg>
    <owner id="j1">john</owner>
    <dog owner="j1">rover</dog>
    <dog owner="j1">rex</dog>
</eg>

encodes more information than

<eg>
    <dog owner="john">rover</dog>
    <dog owner="john">rex</dog>
</eg>

unless there is a schema definition in effect somehow somewhere saying that
the strings in owner attributes follow the rule one-name=one-owner. In the
first version, that information is part of the model. In the second, it
is not.
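
To make the difference visible in code, here is a small Python sketch
using the standard xml.etree.ElementTree module. The resolve_owners
helper is made up for the illustration. It follows each dog's owner
attribute to an owner element when one exists.

```python
import xml.etree.ElementTree as ET

WITH_IDS = """<eg>
    <owner id="j1">john</owner>
    <dog owner="j1">rover</dog>
    <dog owner="j1">rex</dog>
</eg>"""

NAMES_ONLY = """<eg>
    <dog owner="john">rover</dog>
    <dog owner="john">rex</dog>
</eg>"""


def resolve_owners(xml_text):
    """Map each dog's name to the owner element its attribute points at."""
    root = ET.fromstring(xml_text)
    owners = {o.get("id"): o for o in root.findall("owner")}
    return {d.text: owners.get(d.get("owner")) for d in root.findall("dog")}


linked = resolve_owners(WITH_IDS)
unlinked = resolve_owners(NAMES_ONLY)
# In the first document both dogs resolve to the very same owner element.
same_owner = linked["rover"] is linked["rex"]
# In the second there is nothing to resolve to, only two equal strings.
nothing_to_resolve = all(owner is None for owner in unlinked.values())
```

In the first case the shared owner is a fact the document states; in
the second an application can only guess it from two equal strings.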

Rick Jelliffe




From bmhughes at ozemail.com.au  Sun Mar 28 17:08:04 1999
From: bmhughes at ozemail.com.au (Baden Hughes)
Date: Mon Jun  7 17:10:44 2004
Subject: OT: Melissa virus fix from NAI
Message-ID: <199903281507.HAA02264@bmhughes.com>

There's a fix for the Melissa virus from NAI:

http://www.avertlabs.com/public/datafiles/valerts/vinfo/melissa.asp

W97M/Melissa 3/27/99

W97M/Melissa
Melissa is a Word 97 Class Module Macro virus that can also be upconverted
to a Word 2000 Macro Virus. It was first discovered by NAI's Dr Solomon's
VirusPatrol today on the alt.sex newsgroup. The virus has spread rapidly
around the world, and has infected thousands
Symptom
The virus can infect a system by being received from another infected user
via Outlook. This appears to be the most common method of infection. Users
will not know they have been infected, nor will the sender know the
document has been sent. A user may become alerted to the infected document
if the Macro Security settings are enabled. This warning will be displayed
to the user when the document is opened.
Pathology
When the infected document is opened, the virus checks for a setting in the
registry to test if the system has already been infected. 
If the system hasn't been infected, the virus creates an entry in the
registry: HKEY_CURRENT_USER\Software\Microsoft\Office\"Melissa?" = "... by
Kwyjibo"
(If this key exists the email process will not execute, but the virus will
still infect. AVERT advises that it not be removed.)
(As a preventive measure you can create this registry key to prevent the
virus from launching.)
This virus also creates an Outlook object using Visual Basic instructions
and reads the list of members from the Outlook Global Address Book. An email
message is created and sent programmatically to the first 50 recipients in
each of the address books, one at a time. The message is created with the subject 
"Important Message From ? <User Name>" 
The message body of text reads 
"Here is that document you asked for ... don't show anyone else ;-)". 
The active infected document is attached and the email is sent. The most
prevalent document being seen is one called List.DOC, however this is NOT
the only document that can be sent or received. Once the system is infected
all documents that are opened are infected. As any document can be sent, a
user that receives the infected document, who hasn't been infected, can
become infected with this document, and the process will continue.
The virus does have a payload. If the day equals the minute value, and the
infected document is opened this text is inserted at the current cursor
position: 
" Twenty-two points, plus triple-word-score, plus fifty points for using
all my letters. Game's over. I'm outta here."
This virus checks for low security in Office2000 by checking the value from
the registry; if the value
HKEY_CURRENT_USER\Software\Microsoft\Office\9.0\Word\Security\"Level" is
not null,
the virus will disable the "MACRO/SECURITY" menu option. Otherwise Word97
menu option "TOOLS/MACRO" is disabled.
Comments inside the macro virus include:
'WORD/Melissa written by Kwyjibo
'Works in both Word 2000 and Word 97
'Worm? Macro Virus? Word 97 Virus? Word 2000 Virus? You Decide!
'Word -> Email | Word 97 <--> Word 2000 ... it's a new age!




From sdw at lig.net  Sun Mar 28 17:57:26 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:44 2004
Subject: Fw: Is there anyone working on a binary version of XML?
References: <010401be78f3$e35001d0$5402a8c0@oren.capella.co.il>
Message-ID: <36FE594D.C43A4F2C@lig.net>


Oren Ben-Kiki wrote:

> Stephen D. Williams <sdw@lig.net> wrote:
> >One other subject that I haven't mentioned, but need for another architecture that I designed
> >a while ago is a mechanism for 'parallel inheritance' overlay tree processing.  Has anyone
> >else worked on this?  The idea is to have one or more base trees and work with a delta tree
> >which represents changes from the underlying trees.  This last part is a basic data structure
> >for a rule engine and metadata application environment I designed last year.
>
> For general XML trees, I think you'll find that the only way to describe a
> 'delta' on a tree is using an XSL stylesheet, or something as complex, so
> you might as well stick with XSL. We use "delta trees" very heavily, but in
> a somewhat specialized form suitable for our application - the input trees
> have to be in a very strict format and the set of operations is much
> narrower than allowed in XSL.

I don't understand how to use XSL in a general way to achieve a 'delta tree' architecture.  I
have a vague idea, but nothing that I could see being automated sufficiently.  Can you
elaborate?

In my case I'm really talking about a specialization also.  Certain processing or data
interpretation rules would have to be used, although these could be specified with attributes
to allow a full range of possibilities.

The situation that I am solving is where you have a base XML document and want to treat it as
a read-only base where changes are made to an overlayed read-write layer (or layers).
'Lookups' would traverse a series of trees to determine the current state.

The problems are related to ambiguous situations such as whether a read-write entity replaces
or adds to an underlying layer, how to handle deletes, etc.  There are a number of possible
partial solutions, but it's difficult to find a completely general solution.  For instance,
using unique ID's creates a problem of managing and assigning unique ID's.

This kind of thing really does have real-world application.  A year ago I designed a rule
engine for business rule processing in a web application that used this kind of data
structure.  The rulebase could have thousands of entries for structure and metadata where the
session state for each user would only consist of a few fields that were modified or had
values.  Obviously a great optimization.  Actually developing this is still on my short list.
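
As a very rough illustration of the lookup side, here is a Python sketch built on
collections.ChainMap.  It sidesteps the hard parts by making assumptions I have not settled
on: nodes are addressed by flat path strings, an overlay entry always replaces the underlying
one, and deletes are recorded as tombstones.  The Overlay class and the paths are invented
for the example.

```python
from collections import ChainMap

_DELETED = object()  # tombstone: marks a path as removed in the overlay


class Overlay:
    """Read-write delta layer over one or more read-only base layers."""

    def __init__(self, *bases):
        self.delta = {}
        # Lookups search the delta first, then each base in order.
        self._chain = ChainMap(self.delta, *bases)

    def get(self, path, default=None):
        value = self._chain.get(path, default)
        return default if value is _DELETED else value

    def set(self, path, value):
        self.delta[path] = value      # the bases are never modified

    def delete(self, path):
        self.delta[path] = _DELETED   # hides any value in the bases


rulebase = {"/form/name/label": "Name", "/form/name/value": ""}
session = Overlay(rulebase)
session.set("/form/name/value", "Ada")
session.delete("/form/name/label")
```

Each session then holds only the entries it changed, while the shared rulebase is never
written to.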

I don't know whether this 'delta tree' aspect has solid prospects for becoming commonly used,
but I need it.

sdw

> Have fun,
>
>     Oren Ben-Kiki
>

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From martind at netfolder.com  Sun Mar 28 18:01:48 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:10:44 2004
Subject: Is there anyone working on a binary version of XML?
In-Reply-To: <003a01be78ff$438759d0$35f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <NBBBJPGDLPIHJGEHAKBAKEJJDAAA.martind@netfolder.com>

Hi,

We also have to consider that the transport mechanism can provide
compression. HTTP already has this feature, though it is largely unused. I
made some tests with HTTP 1.1 and compressed transport and got some
improvement for any text transport. Because the compression is taken care of
in the transport layer, the XML parser receives a document as usual.

When I probed the mechanism further, here is what I found happening in IE
(we are currently working on a paper on a new architecture for Mozilla
Netlib).

In IE 5.x:
The server sends the data in HTTP 1.1 compressed mode.
The client receives the data in compressed mode, then for each 8K chunk
calls a MIME filter that does the decompression.
After the decompression is done by the MIME filter, the transaction handler
gives the chunks to the document handler and therefore to the parser.
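
The same flow can be imitated in a few lines of Python, which may help
to see why the parser never notices the compression. This is only a
sketch of the idea, not of IE's code: zlib plays the role of the MIME
filter, xml.sax plays the document handler, gzip framing is assumed,
and the function and class names are mine.

```python
import gzip
import io
import xml.sax
import zlib


class ElementNames(xml.sax.ContentHandler):
    """Document handler that just records the element names it sees."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElement(self, name, attrs):
        self.names.append(name)


def parse_compressed_stream(stream, handler, chunk_size=8192):
    """Inflate a gzip stream chunk by chunk and feed the parser as we go."""
    inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)  # 16: expect gzip framing
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        data = inflater.decompress(chunk)   # the "MIME filter" step
        if data:
            parser.feed(data)               # the parser only sees plain XML
    tail = inflater.flush()
    if tail:
        parser.feed(tail)
    parser.close()


handler = ElementNames()
wire_data = io.BytesIO(gzip.compress(b"<doc><a/><b/></doc>"))
parse_compressed_stream(wire_data, handler, chunk_size=4)
```

The tiny chunk size in the usage lines is only there to show that the
parser copes with data arriving in arbitrary pieces.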

In Mozilla we do not have this intermediary step. The transaction handler
deals only with a protocol handler and a MIME filter is missing. To be more
versatile and be able to process the stream adequately we have to add the
notion of filters.

So, as already mentioned in Simon's XML architecture, we can say that
browsers implement some of the layers, such as:

a) transaction processing
b) MIME filtering (compression, decryption, etc...)
c) Document handling routing

So in a browser's architecture we have:

Transaction handler ----> Protocol handler -----> MIME filter------>
Document manager ------> document handler (here you find the XML parser)

In IE, these elements are already independent of the browser and are
encapsulated in a module called UrlMon. Basically, you provide a display
name like "http://www.netfolder.com/" and UrlMon acts as a transaction
handler, calls the MIME filters and gives you the data.

In Mozilla's new Gecko architecture this is also a separate XPCOM module named
Netlib. You also provide a display name; the module acts as a transaction
handler, calls the right protocol handler and gives you the data (encrypted,
uncompressed, etc.). There is work in progress to add the notion of MIME
filters.

Both modules share a similar kind of interface based on COM. This simply
means that an object can have multiple interfaces and that a QueryInterface
mechanism is available to obtain the right one. So the Mozilla XPCOM and
Microsoft COM implementations are very similar. Both kinds of objects have
binary signatures in the form of a C++ pure virtual interface. Both
interfaces have mandatory members like AddRef, Release and QueryInterface.
The difference lies in the way you instantiate objects and how they are
registered. But, in both cases, the transaction manager and its helper
modules (protocol handlers and MIME filters) are accessible to
applications other than browsers.
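
For those who have never used COM, the three mandatory members can be
modelled in a few lines of Python. This toy says nothing about the
binary layout, registration or real interface identifiers. The class,
the interface names and the "destroyed" flag are invented for the
illustration.

```python
class ComLikeObject:
    """Toy model of an object exposing several named interfaces."""

    def __init__(self, interfaces):
        self._interfaces = dict(interfaces)  # interface name -> implementation
        self._refs = 1                       # the creator holds one reference
        self.destroyed = False

    def add_ref(self):
        self._refs += 1
        return self._refs

    def release(self):
        self._refs -= 1
        if self._refs == 0:
            self.destroyed = True            # the last owner let go
        return self._refs

    def query_interface(self, name):
        implementation = self._interfaces.get(name)
        if implementation is None:
            raise LookupError("no such interface: " + name)
        self.add_ref()                       # the caller now owns a reference
        return implementation


loader = ComLikeObject({"IProtocolHandler": "http-handler",
                        "IMimeFilter": "gzip-filter"})
mime_filter = loader.query_interface("IMimeFilter")
loader.release()             # give back the reference obtained by the query
remaining = loader.release() # give back the creator's reference
```

An application asks the object for the interface it needs and
releases it when done; it never has to know the concrete class.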


There is also another thread of evolution named HTTP NG.

As you know, HTTP is like a remote object with GET, PUT, POST and DELETE
methods. WebDAV (already implemented in IE 5.x Web Folders) recently added
PROPFIND, PROPPATCH, etc. The intent of HTTP NG is to be able to create any
kind of object, with the current HTTP 1.1 (with WebDAV) being one type of
object: a Document object. So, with HTTP NG, you'll be able to create your
own objects.

This means that the evolution of HTTP on one side and XML on the other side
is creating a competitor to OMG. Here's why:

OMG's CORBA, as already mentioned on this list, is middleware with:

a) an interface language that could be mapped to different languages for
concrete implementation
b) a marshalling format for object communication.

So, here is what's happening in the Web world:

a) the interface definition language is still absent, and it could be OMG
IDL. But today, no choice has been made on this. We'll need some more work
on the HTTP NG front before this happens. But when this happens, the
web will be a web of diverse distributed objects and not solely distributed
documents.
b) the marshalling format is more and more becoming XML.

However, the drawback of XML is that it wastes a lot of transport
bandwidth compared to other more efficient formats. To mitigate that, the
transport layer tries to use compression to reduce the packet payloads.
HTTP 1.1 already provides this, but it is not used very often (if at
all). As you know, things are moving slowly. In the next 3 to 4 years, most
browsers in the field will support HTTP 1.1 and compression (about 65% do
today). The worst case is the server side, where a lot of improvised servers
are not well configured to support adaptive compression negotiation even if
they can. This is mainly a knowledge barrier and not a technical issue. But
some years from now, compressed transport will be the de facto way.
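
The negotiation itself is not much work on the server side. Here is a
simplified Python sketch of the decision: it looks only at gzip and
the "*" wildcard in the client's Accept-Encoding header, and ignores
the other codings and the finer points of the HTTP 1.1 rules. The
function names are mine.

```python
import gzip


def accepts_gzip(accept_encoding):
    """True if the Accept-Encoding header value allows a gzip response."""
    wildcard = None
    for item in accept_encoding.split(","):
        parts = [part.strip() for part in item.split(";")]
        coding, quality = parts[0].lower(), 1.0
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    quality = float(param[2:])
                except ValueError:
                    quality = 0.0          # unreadable q-value: treat as refused
        if coding == "gzip":
            return quality > 0             # an explicit entry decides
        if coding == "*":
            wildcard = quality > 0
    return bool(wildcard)


def encode_body(body, accept_encoding):
    """Return (extra response headers, payload) for the given request header."""
    if accepts_gzip(accept_encoding):
        headers = {"Content-Encoding": "gzip", "Vary": "Accept-Encoding"}
        return headers, gzip.compress(body)
    return {"Vary": "Accept-Encoding"}, body
```

A client that sends no usable Accept-Encoding simply gets the
uncompressed body, so old browsers keep working.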

This was my Sunday morning .2 cents.

Regards
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From b.laforge at jxml.com  Sun Mar 28 18:04:46 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:44 2004
Subject: Fast filter support in SAX2
Message-ID: <006501be7935$70e78040$c8a8a8c0@thing1>


From: David Megginson <david@megginson.com>
>Yes, but as someone (James Clark?) pointed out during the last round,
>with most serious applications you're going to end up doing hash
>lookups anyway, so the == doesn't buy you much.


At first blush, I had to agree with you. But consider the more interesting
pattern matching scenarios. It's not always reasonable to have to map
all processing into a hash lookup. 

I'm really just suggesting a capability here. Just another way to tune an
application. If interned strings are used by the parser, why not share
that capability with filters/applications?

Suppose we have a parser-kernel that we want to use with some new 
wonderful schema that has been implemented in a filter? Something that
allows content validation based on ancestor patterns? Unless you are
willing to write some pretty convoluted code, interned strings would be helpful.
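
Here is the sort of thing I have in mind, sketched in Python, where
sys.intern plays the part of Java's String.intern() and "is" the part
of ==. It assumes the parser interns every element name it reports.
The matches_ancestors helper and the element names are invented for
the example.

```python
import sys

# The filter interns the names it cares about once, up front.
CHAPTER = sys.intern("chapter")
TITLE = sys.intern("title")


def matches_ancestors(stack, pattern):
    """True if the innermost open elements are exactly the given pattern."""
    if len(pattern) > len(stack):
        return False
    tail = stack[len(stack) - len(pattern):]
    # Identity comparison only: valid because both sides are interned.
    return all(seen is wanted for seen, wanted in zip(tail, pattern))


# Stand-in for the parser: names built at run time, then interned.
reported = ["book", "".join(["chap", "ter"]), "".join(["ti", "tle"])]
open_elements = [sys.intern(name) for name in reported]

title_in_chapter = matches_ancestors(open_elements, (CHAPTER, TITLE))
chapter_in_title = matches_ancestors(open_elements, (TITLE, CHAPTER))
```

No hashing and no character-by-character comparison is needed to
walk the ancestor stack.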

Bill




From anderst at toolsmiths.se  Sun Mar 28 18:18:55 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:45 2004
Subject: Is there anyone working on a binary version of XML?
References: <004201be780b$e9bf6300$60f96d8c@NT.JELLIFFE.COM.AU> <14076.63486.939606.141332@localhost.localdomain> <36FD16C2.B386B0F1@toolsmiths.se> <36FDA44B.C48CA611@iol.ie>
Message-ID: <36FE566D.B413A140@toolsmiths.se>

Alan Kennedy wrote:

> Don't forget about IIOP, the CORBA "RPC" protocol. While not the absolutely optimal transport
> protocol, it has been optimised by a group of experts (I think :-) for platform, transport,
> endian, etc, independence.

And don't forget Microsoft DCOM and DCE :)


> If you want to experiment with an IDL compiler for JAVA, OrbixWeb is pretty good, and you can get
> a 60 day evaluation from the IONA web site, at
>
> http://www.iona.com/info/products/orbixweb/index.html
>
> There are free IDL compilers and ORBs available too.

I use TAO for my experiments since I mainly implement in C/C++
<http://www.cs.wustl.edu/~schmidt/TAO.html>

It's one of the few CORBA ORBs with real-time extensions.

>
> But if you take the OMG-CORBA approach of  "Objects exist; you (the client) don't need to know
> where or how they are stored. Simply refer to their object reference and they will be instantiated
> for you and made available as if they were in your local address space."

One problem with the current design of the DOM IDL is that each element in an XML document is
a CORBA object, and for large documents this means a *lot* of object references. However
the newer POA has a design which handles use cases like this better than the old BOA.
Unfortunately not many ORBs have implemented it.

My view on remote DOM trees is that in most cases it's more efficient to transfer the whole document
to the client side and access it there in a local DOM tree. The reason for this is that traversing
a remote DOM tree using the current IDL would generate an enormous amount of network traffic,
and the document transfer plus DOM building time on the client side would be significantly smaller.
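
A toy Python model shows the order of magnitude.  It is not CORBA: the RemoteDom class simply
counts one "network call" per method invocation, which is the assumption the whole comparison
rests on, and all the names are invented.

```python
import xml.etree.ElementTree as ET


class RemoteDom:
    """Stand-in for a DOM behind an ORB: every method costs one remote call."""

    def __init__(self, xml_text):
        self._root = ET.fromstring(xml_text)
        self.calls = 0

    def root(self):
        self.calls += 1
        return self._root

    def node_name(self, node):
        self.calls += 1
        return node.tag

    def child_nodes(self, node):
        self.calls += 1
        return list(node)

    def fetch_document(self):
        self.calls += 1
        return ET.tostring(self._root, encoding="unicode")


def names_via_remote_traversal(dom):
    names, pending = [], [dom.root()]
    while pending:
        node = pending.pop()
        names.append(dom.node_name(node))        # one call per node
        pending.extend(dom.child_nodes(node))    # and another per node
    return sorted(names)


def names_via_local_copy(dom):
    local_root = ET.fromstring(dom.fetch_document())  # a single transfer
    return sorted(element.tag for element in local_root.iter())
```

For a document with n elements the traversal costs 2n + 1 calls in this model, against one
call for the transfer.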


/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From anderst at toolsmiths.se  Sun Mar 28 18:27:50 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:45 2004
Subject: Fw: Is there anyone working on a binary version of XML?
References: <010401be78f3$e35001d0$5402a8c0@oren.capella.co.il>
Message-ID: <36FE5885.2EAFCB88@toolsmiths.se>

Oren Ben-Kiki wrote:

> For general XML trees, I think you'll find that the only way to describe a
> 'delta' on a tree is using an XSL stylesheet, or something as complex, so
> you might as well stick with XSL.

Interesting!
I'm about to go into this area very soon and would be very interested in any
pointers on how to represent XML/DOM tree deltas.

Best
/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From lisarein at finetuning.com  Sun Mar 28 18:38:45 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:10:45 2004
Subject: OFF: Re: Melissa Virus Article
References: <Pine.LNX.4.04.9903280620190.3573-100000@eccnet.eccnet.com>
Message-ID: <36FE5E93.27DEF3C4@finetuning.com>

so then, would a fix be to not read email when you have a word doc
open?  And this only affects outlook users?

Fred, were you using outlook?

thanks,

lisa

Betty L. Harvey wrote:
> 
> Since this listserve was hit with the Melissa Virus Friday
> night, I thought some of you might be interested in an
> article in yesterday's Washington Post concerning the
> virus:
> 
> http://www.washingtonpost.com/wp-srv/business/daily/march99/virus27.htm
> 
> Betty
> 
> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/
> Betty Harvey                         | Phone: 301-540-8251 FAX: 4268
> Electronic Commerce Connection, Inc. |
> 13017 Wisteria Drive, P.O. Box 333   |
> Germantown, Md.  20874               |
> harvey@eccnet.com                    | Washington,DC SGML/XML Users Grp
> URL:  http://www.eccnet.com          | http://www.eccnet.com/sgmlug/
> /\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\\/\/
> 



From lisarein at finetuning.com  Sun Mar 28 20:27:39 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:10:45 2004
Subject: OFF: Re: Melissa Virus Article
References: <Pine.LNX.4.04.9903280620190.3573-100000@eccnet.eccnet.com> <36FE5E93.27DEF3C4@finetuning.com>
Message-ID: <36FE7810.94AAA89F@finetuning.com>

never mind and sorry everybody.  just killing this thread myself    
...(scream)



From uche.ogbuji at fourthought.com  Sun Mar 28 20:31:43 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:10:45 2004
Subject: IDL for SAX2 
In-Reply-To: Your message of "28 Mar 1999 11:44:58 +0200."
             <wkk8w2uew5.fsf@ifi.uio.no> 
Message-ID: <199903281831.LAA10836@malatesta.local>

> 
> * Lars Marius Garshol
> |
> | Many things in SAX won't wash in IDL, such as the use of the
> | Java-specific InputStream, Reader and Locale objects.

<SNIP-MY-REPLIES-AND-LARS-FOLLOWUPS />

I understand your point much better than after your first post, thanks.  I had 
the impression that you were saying that certain interfaces that happen to be 
implemented in Java could not be implemented in IDL.

So you say that IDL is more useful if one desires direct language and platform 
transparency, rather than as a general protocol-definition language.  I agree 
with that assessment, but I'll also point out that it's no worse in that 
regard than Java.

All of the non-Java, language-specific elements in your litany still need 
to be translated from Java, just as they would from IDL, so I still don't see 
how that acts as an argument against IDL.  Java doesn't support Python's 
__getitem__ semantics, for instance.
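
To make the __getitem__ point concrete, here is a minimal Python sketch. The
AttributeList class and its method names are hypothetical, loosely modelled on
a SAX-style attribute list; they are not taken from any published binding. It
only illustrates a portable accessor with a language-specific spelling layered
on top.

```python
class AttributeList:
    """Hypothetical attribute list; not an actual SAX binding."""

    def __init__(self, pairs):
        # Keep (name, value) pairs in document order.
        self._pairs = list(pairs)

    def getValue(self, name):
        # Java/IDL-style accessor: the portable "common core".
        for key, value in self._pairs:
            if key == name:
                return value
        return None

    def __getitem__(self, name):
        # Python-specific sugar layered on the same operation.
        value = self.getValue(name)
        if value is None:
            raise KeyError(name)
        return value

    def __len__(self):
        return len(self._pairs)


attrs = AttributeList([("id", "sfreud"), ("class", "important")])
print(attrs.getValue("id"))  # portable spelling
print(attrs["class"])        # Python-only spelling
```

Either way the Python spelling is an addition to the shared interface, whether
that interface was first written down in Java or in IDL.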

When using IDL purely for design presentation, you can add all the comments 
you like to motivate language-specific features.  At least, then you have a 
common core, and the language-specific elements are a clear departure, rather 
than something one has to puzzle out from the behavior of Java.

If there were another language that supports defining the interface with more 
flexibility for language-specific constructs, I wouldn't mind using that 
rather than IDL.  Do you have any to suggest?

As it is, however, parsers come in C, C++, Python, Java, Perl, etc., and I 
don't see why we shouldn't use the most widely recognized middle-ground 
language for sharing interface between these languages (maybe recognition is 
the politics you were referring to earlier, but I choose to believe that IDL 
has real merits).

But as I type, I realize that the great majority of contributors to SAX2 seem 
to have a Java bent, so maybe it's just best for Dave Megginson to publish 
Java-SAX2, and to have it translated to IDL (I guess I'll volunteer to do so, 
as I'm the lone advocate so far).  I do think that this will help others 
outside this list as they have to implement SAX2 in their work.  After all, we 
want more standardization around SAX, right?

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From tbray at textuality.com  Sun Mar 28 20:45:22 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:45 2004
Subject: half-baked parsers vs binary XML
Message-ID: <3.0.32.19990328104316.00bb89d0@pop.intergate.bc.ca>

At 07:27 PM 3/27/99 -0800, Gabe Beged-Dov wrote:
>XML::Parser is layered on expat. Anecdotal evidence seems to be that there is an order of
>magnitude performance advantage to "parsing" something other than XML. The two alternatives
>are a textual format that Perl can eval directly (Data::Dumper) and a binary format
>(Storable).

That's because there is some breakage in the design of the expat/perl linkage.
In my tests, given a file of  a couple of meg, perl can read it in under
a second, xmlwf (i.e. raw expat) in an almost unmeasurably-short time, and
XML::Parser takes 10 seconds plus.  This is just a bug and will be fixed.
 -Tim



From jgarrett at navix.net  Sun Mar 28 23:45:35 1999
From: jgarrett at navix.net (Jim Garrett)
Date: Mon Jun  7 17:10:45 2004
Subject: Melissa virus fix - (in case you haven't already been there)
In-Reply-To: <v04104805b322ef6f5ad4@[155.198.8.81]>
Message-ID: <000701be7962$3beebc50$58c8c8c8@jgp400>

Melissa Virus fix...FYI (in case you haven't already been there)
 
http://www.microsoft.com/security/bulletins/ms99-002.asp
http://officeupdate.microsoft.com/downloaddetails/wd97sp.htm
 

|-----Original Message-----
|From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
|Rzepa, Henry
|Sent: Saturday, March 27, 1999 2:31 PM
|To: xml-dev@ic.ac.uk
|Subject: LISTADMIN: The "Melissa" Virus
|
|
|This list was hit earlier by the "Melissa" virus;
|
|http://www.news.com/News/Item/0,4,34334,00.html
|
|Apparently, not many anti-viral programs detect it yet.
|Please take great care with Word/Outlook combinations.
|If anyone knows of anti-viral tools that detect this, please
|let me know and I will alert this list.
|
|Many thanks.  
|
|
|Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
|http://www.ch.ic.ac.uk/rzepa/
|



From jackpark at thinkalong.com  Sun Mar 28 23:57:45 1999
From: jackpark at thinkalong.com (Jack Park)
Date: Mon Jun  7 17:10:45 2004
Subject: Virus in my last e-mail
In-Reply-To: <5FFEC1B73A7BD1119D56006008C369F30ED3D3@rainier.cdgpd.com>
Message-ID: <E10RNYl-00075L-00@punch.ic.ac.uk>

It's bad enough that you send viruses.  Worse yet, you force your vcf card
on all of us.

At 04:09 PM 3/26/99 -0800, you wrote:
>Folks,
>
>The last e-mail I sent had a virus in the attached word document.  PLEASE
>don't open the document.  In our office it caused Outlook 98 to autosend
>itself to everyone on our address lists, turned off virus checking in word
>(tools/options/general/macro virus protection), and modified the default
>template normal.dot.
>
>Sorry!
>
>
> <<Fred McLain.vcf>> 
>
> 




From mrys at microsoft.com  Mon Mar 29 00:51:50 1999
From: mrys at microsoft.com (Michael Rys)
Date: Mon Jun  7 17:10:45 2004
Subject: Important Message From Fred McLain (READ FIRST!!!)
Message-ID: <25983782061AD111B0800000F86310FE14282F76@RED-MSG-42>

This mail contained the MELISSA Word macro virus.

> ****** Message from InterScan E-Mail VirusWall NT ******
> 
> ** WARNING! Attached file list1.doc contains:
> 
>      W97M_MELISSA.A virus
> 
>    The infected file has been cleaned.
>    You will be sent a separate e-mail with the cleaned file.
> 
> Please go to [internal web page] and install Inoculan.  If 
> already installed please ensure you have the latest signature 
> file. 
> 
> *****************     End of message     ***************
> 
> 



From mda at discerning.com  Mon Mar 29 01:48:50 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:10:45 2004
Subject: xhtml and the p tag
Message-ID: <02fa01be7975$b1bf6e80$0200a8c0@mdaxke.mediacity.com>

(not sure where xhtml discussion should go; all i see on the www-html@w3.org
list are stultifying discussions about tag case-sensitivity.)

in the strict dtd from http://www.w3.org/TR/WD-html-in-xml/ ,
the p element is %Inline, which means it can't include any
block level elements such as ul. So now we have a quandary.

in practical terms, what i usually want is something 
like a non-existent <parabreak/>. That doesn't exist, because
in browsers a <br/> breaks the line; it doesn't end the
paragraph, and particularly now with xhtml, <p/> is
deprecated. But even if I *did* the extra work to wrap
<p> ... </p> around my paragraphs, that still wouldn't
work, because a p can't enclose any block level elements
such as a ul.
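
A rough way to see the constraint in practice is to scan a document for
block-level elements sitting directly inside a p. This is only a sketch: the
BLOCK_LEVEL set is a shortened, assumed list for illustration, not the DTD's
full definition, and the helper name is made up.

```python
import xml.etree.ElementTree as ET

# Assumption: an abbreviated list of block-level element names.
BLOCK_LEVEL = {"ul", "ol", "dl", "table", "pre", "div", "blockquote", "p"}

def block_children_of_p(xhtml_text):
    """Return the tag of every block-level element found directly inside a <p>."""
    root = ET.fromstring(xhtml_text)
    offenders = []
    for p in root.iter("p"):
        for child in p:
            if child.tag in BLOCK_LEVEL:
                offenders.append(child.tag)
    return offenders

sample = ("<body><p>Shopping list:<ul><li>eggs</li></ul></p>"
          "<p>Plain <em>text</em>.</p></body>")
print(block_children_of_p(sample))
```

The first paragraph is well-formed XML but would not validate against the
strict DTD, which is exactly the quandary.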

-mda




From marcelo at mds.rmit.edu.au  Mon Mar 29 02:47:39 1999
From: marcelo at mds.rmit.edu.au (Marcelo Cantos)
Date: Mon Jun  7 17:10:45 2004
Subject: Whence XQL?
In-Reply-To: <3.0.3.32.19990325220915.032a1480@pop.mindspring.com>; from Jonathan Robie on Thu, Mar 25, 1999 at 10:09:15PM -0500
References: <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com> <000b01be7702$102babd0$0100007f@eps.inso.com> <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com> <19990326133124.B7318@io.mds.rmit.edu.au> <3.0.3.32.19990325220915.032a1480@pop.mindspring.com>
Message-ID: <19990329104719.A13271@io.mds.rmit.edu.au>

On Thu, Mar 25, 1999 at 10:09:15PM -0500, Jonathan Robie wrote:
> At 01:31 PM 3/26/99 +1100, Marcelo Cantos wrote:
>  
> >I could be disingenuous ( :-) ) and suggest that the attachment to
> >Microsoft has more than a little to do with its success to date, but I
> >certainly don't want to disparage the effort in its own right.  It
> >offers a good compromise between expressivity and simplicity, which is
> >a far more practicable goal than completeness.
> 
> Well, Microsoft was one of the first companies I got interested in XQL ;->
> 
> >I am concerned (am I right on this?) at the lack of proximity
> >operators.  But that's just an implementor's perspective, looking at
> >doing things we already support.
> 
> Cool, you work on SIM? (Does that make you a SIMian?)

Cute!  It might just take off around here. :-)

> I really enjoyed
> talking to Timothy Arnold-Moore at Markup Technologies '98 - Makoto
> Murata-san and I managed to snag him after his presentation and grill him
> with questions for a while.
> 
> I've gone back and forth on proximity operators. Several people who have
> implemented full-text search systems have told me that users don't really
> use proximity operators, that they are useful in the implementation, but
> need not be exposed to the user. Others vehemently disagree. I took the
> pragmatic approach of leaving it out to see who would complain. Frankly,
> you are the first to do so.

I do wonder what proportion of people looking seriously at XQL are
into text.  We find WITHIN N to be exceedingly useful.  It is also
interesting to note that we only offer proximity at the word level and
that this is all clients ever really want.  We do also offer same
sentence/paragraph queries, but virtually no-one uses them.

> I have discussed proximity searching as a possibility in the
> following paper:
> 
> http://www.w3.org/TandS/QL/QL98/pp/murata-san.html
> 
> Here's an excerpt:
> 
> <excerpt>
> 
> In addition, functions for proximity searching might be useful. The
> following returns <LINE> elements in which "rose*" and "sweet*"
> occur within 10 words of each other:
> 
> LINE[near("rose*", "sweet", 10)]
> 
> This would match lines like these:
> 
> <LINE>A rose by any other name would smell as sweet.</LINE>
> <LINE>Sweet roses grew along the south side of the fence.</LINE>
> <LINE>She rose and smiled sweetly at the purple dwarf under the
> bucket.</LINE>
> <LINE>Say, has anybody seen my Sweet Gypsy Rose?</LINE>
> 
> Proximity searching requires some way to indicate how close the
> strings must be in order to match. This causes a difficulty when
> choosing the units in which proximity is measured. In existing
> full-text systems, distance is frequently measured in terms of
> words, which raises a number of significant questions regarding
> internationalization, but is probably an intuitive way to measure
> distance for most users.
> 
> </excerpt>
> 
> I'm not sure whether this is the best approach or not. Do you like
> this approach? If not, what approach would you prefer?

It's an interesting angle, though not one I had considered (not that I
have considered many angles :-).  I had understood, perhaps
incorrectly, that the only way to perform word-level boolean queries
was to treat words abstractly as leaf nodes of the document tree
rather than clumps of opaque string data.  Under this conception, to
find "other name", one would say:

  LINE[WORD="other"; WORD="name"]

It could possibly be made legal to abbreviate the above to:

  LINE["other"; "name"]

Which would be interpreted as, "a LINE element which is the parent of
a leaf node equal to 'other' immediately preceding a leaf node equal
to 'name'".  Now, support for proximity ("rose*" within 10 words of
"sweet") would simply be a matter of:

  LINE["rose*" %10 "sweet"]

(The %N syntax is borrowed from our query language.)  Higher level
proximities could be done like this:

  LINE["name"] %10 LINE["purple"]

The operator simply adopts the level of its operands; mismatched
operands constitute an error.
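
For anyone who wants to play with word-level proximity, here is a small
Python sketch of the idea. It is neither XQL nor our query language: the
near() helper, the naive tokenizer, and the case-folding are all assumptions
made for illustration, and the wildcard handling simply borrows shell-style
matching from the standard library.

```python
import re
from fnmatch import fnmatchcase

def words(text):
    # Naive tokenizer: lower-cased runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())

def near(text, pattern_a, pattern_b, distance):
    """True if a word matching pattern_a lies within `distance` words
    of a word matching pattern_b (in either order)."""
    tokens = words(text)
    pos_a = [i for i, w in enumerate(tokens) if fnmatchcase(w, pattern_a)]
    pos_b = [i for i, w in enumerate(tokens) if fnmatchcase(w, pattern_b)]
    return any(abs(i - j) <= distance for i in pos_a for j in pos_b)

lines = [
    "A rose by any other name would smell as sweet.",
    "Sweet roses grew along the south side of the fence.",
    "The fence was painted purple.",
]
for line in lines:
    print(near(line, "rose*", "sweet*", 10), line)
```

Measuring distance in words keeps the operator easy to explain, at the cost
of the tokenization questions raised in the excerpt above.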

Caveat: I confess that I don't know XQL very well at all, so I may be
saying something completely different to what I intended with the
above examples.  Corrections are most welcome.


Cheers,
Marcelo

-- 
http://www.simdb.com/~marcelo/



From cowan at locke.ccil.org  Mon Mar 29 02:51:17 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:10:45 2004
Subject: Proposed new kind of SAX2 thing, with example
In-Reply-To: <wkg16qudum.fsf@ifi.uio.no> from "Lars Marius Garshol" at Mar 28, 99 12:07:29 pm
Message-ID: <199903290049.TAA06821@locke.ccil.org>

Lars Marius Garshol scripsit:

> You can anyway, if you just use a Vector or some equivalent as the
> property value.

Vectors are no real substitute for indexed properties, because they
require exposing the collection rather than just its elements,
and the bean can't get control when an element is changed.
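
A small sketch of the difference, written in Python rather than Java for
brevity. The FeatureBean class and its method names are hypothetical; the
point is only that an indexed setter passes through the bean, while a
handed-out collection does not.

```python
class FeatureBean:
    """Hypothetical bean with an indexed property called 'handler'."""

    def __init__(self, handlers):
        self._handlers = list(handlers)
        self.changes = []  # (index, old, new) for every change the bean saw

    def get_handler(self, index):
        return self._handlers[index]

    def set_handler(self, index, value):
        # Indexed property: the bean gets control and can validate or notify.
        old = self._handlers[index]
        self._handlers[index] = value
        self.changes.append((index, old, value))

    def get_handlers_vector(self):
        # Vector-style alternative: exposes the live collection itself.
        return self._handlers


bean = FeatureBean(["a", "b"])
bean.set_handler(0, "x")              # the bean sees this change
bean.get_handlers_vector()[1] = "y"   # the bean never hears about this one
print(bean.changes)
print(bean.get_handler(1))
```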

-- 
John Cowan					cowan@ccil.org
		e'osai ko sarji la lojban.



From bckman at ix.netcom.com  Mon Mar 29 04:45:31 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:45 2004
Subject: xhtml and the p tag
Message-ID: <004401be798d$ff8357e0$a2aedccf@ix.netcom.com>

Hi,
The following is an extract from the HTML 4.0 strict DTD.

In HTML 4.0 a paragraph can only contain an inline element, just the same as
in XHTML

<!ELEMENT P - O (%inline;)*            -- paragraph -->
<!ATTLIST P
  %attrs;                              -- %coreattrs, %i18n, %events --
  >

Frank (speaking for myself)

Frank Boumphrey

XML and style sheet info at Http://www.hypermedic.com/style/index.htm
Author: - Professional Style Sheets for HTML and XML http://www.wrox.com
CoAuthor:  XML applications from Wrox Press, www.wrox.com
Author: Using XML on the Web (Aug)
----- Original Message -----
From: Mark D. Anderson <mda@discerning.com>
To: XML List <xml-dev@ic.ac.uk>
Cc: <dsr@w3.org>
Sent: Sunday, March 28, 1999 6:49 PM
Subject: xhtml and the p tag


>(not sure where xhtml discussion should go; all i see on the
>www-html@w3.org list are stultifying discussions about tag
>case-sensitivity.)
>
>in the strict dtd from http://www.w3.org/TR/WD-html-in-xml/ ,
>the p element is %Inline, which means it can't include any
>block level elements such as ul. So now we have a quandary.
>
>in practical terms, what i usually want is something
>like a non-existent <parabreak/>. That doesn't exist, because
>in browsers a <br/> breaks the line; it doesn't end the
>paragraph, and particularly now with xhtml, <p/> is
>deprecated. But even if I *did* the extra work to wrap
><p> ... </p> around my paragraphs, that still wouldn't
>work, because a p can't enclose any block level elements
>such as a ul.
>
>-mda




From mda at discerning.com  Mon Mar 29 04:54:57 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:10:45 2004
Subject: xhtml and the p tag
Message-ID: <033c01be798f$c5570010$0200a8c0@mdaxke.mediacity.com>

>In HTML 4.0 a paragraph can only contain an inline element, just the same as
>in XHTML

Right; html is broken too. But I've already given up on html :).

I'm still curious what a better content model for paragraphs
would be. (Sorry, i suppose this topic belongs somewhere else
even if xhtml-related; suggestions are welcome.)

-mda




From begeddov at jfinity.com  Mon Mar 29 05:24:04 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:46 2004
Subject: XHTML and character entities
Message-ID: <36FEF173.F52E1751@jfinity.com>

I mention tidy below but am asking about html->xhtml conversion in
general.

I use tidy to convert html to xhtml using the -asxml switch. The
result of many conversions is still not accepted as well-formed because
entities like agrave and friends aren't defined unless you process the
DTD.

Wouldn't it be reasonable to convert these to character entities as part
of the html->xhtml process?
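
As a sketch of what such a conversion step could look like (this is not how
tidy works, just an illustration in Python), named HTML entities can be
rewritten as numeric character references, leaving alone the five entities
that XML predefines. The function name numeric_entities is made up; the
lookup table comes from Python's standard html.entities module.

```python
import re
from html.entities import name2codepoint

# The five entities predefined by XML itself must be left alone.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}

def numeric_entities(text):
    """Replace HTML named entities (e.g. &agrave;) with numeric character references."""
    def swap(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave untouched
        return "&#%d;" % name2codepoint[name]
    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, text)

print(numeric_entities("<p>d&eacute;j&agrave; vu &amp; more</p>"))
```

The result can then be checked for well-formedness without reading the DTD.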




From Matthew.Sergeant at eml.ericsson.se  Mon Mar 29 10:46:23 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:46 2004
Subject: how to print the XML document in IE 5.0
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1730@EUKBANT101>

> -----Original Message-----
> From:	Derek Denny-Brown [SMTP:derekdb@microsoft.com]
> 
> Not to be picky, but... The "Save-As" option in IE5 for XML documents
> _does_ save the XML.
> 
You can be even pickier if you like - the entire email below was incorrect,
and I apologise. I think this was the case for the betas though (that's
probably wrong too!).

Still, Mozilla's view source is nicer 'cos it's syntax highlighted...

Matt.
--
http://come.to/fastnet
Perl on Win32, PerlScript, ASP, Database, XML
GCS(GAT) d+ s:+ a-- C++ UL++>UL+++$ P++++$ E- W+++ N++ w--@$ O- M-- !V 
!PS !PE Y+ PGP- t+ 5 R tv+ X++ b+ DI++ D G-- e++ h--->z+++ R+++


> -----Original Message-----
> From: Matthew Sergeant (EML) [mailto:Matthew.Sergeant@eml.ericsson.se]
> 
> It appears that IE5 converts internally to HTML (with the XSL style
> sheet), so the answer is that you can't. Even a save to disk saves the
> HTML AFAIK. Try using Mozilla - it does things right, and displays
> XML+XSL remarkably well considering it's at least 6 months away from
> release.
> 
> > -----Original Message-----
> > From:	Kevin Hsu [SMTP:shyutz@ms1.hinet.net]
> > 
> > Can anyone tell me how to print the XML document as I see on the screen
> > in IE 5.0, thanks in advance.



From Matthew.Sergeant at eml.ericsson.se  Mon Mar 29 10:55:09 1999
From: Matthew.Sergeant at eml.ericsson.se (Matthew Sergeant (EML))
Date: Mon Jun  7 17:10:46 2004
Subject: half-baked parsers vs binary XML
Message-ID: <5F052F2A01FBD11184F00008C7A4A800022A1731@EUKBANT101>

> -----Original Message-----
> From:	Gabe Beged-Dov [SMTP:begeddov@jfinity.com]
> 
> Another reason (other than the binary XML thread) that I brought this
> up was discussion on the perl-xml mailing list of whether XML::Parser
> was usable for soft real-time server side processing. The consensus
> there seems to be no.
> 
	I think it's "Yes" - if you do it right.

> XML::Parser is layered on expat. Anecdotal evidence seems to be that
> there is an order of magnitude performance advantage to "parsing"
> something other than XML. The two alternatives are a textual format
> that Perl can eval directly (Data::Dumper) and a binary format
> (Storable).
> 
> In both cases (Data::Dumper and Storable) there is conversion from the
> on-disk format to the in-memory format. Why is XML so much slower
> according to developer feedback? That is what I was trying to
> understand from other people's experience rather than doing a hands-on
> analysis myself.
> 
> I may have jumped to the conclusion that it was the extra work that a
> well-formedness processor has to do over what a half-baked processor
> would do. That still leaves the question of where the slowdown is and
> whether it is an implementation issue or inherent in some aspect of
> XML parsing.
> 
	I think the real problem is that you're doing 2 stages of work with
XML::Parser, as opposed to using Storable or Data::Dumper. With XML::Parser
I'm reading the XML and searching (querying) for specific nodes within the
XML. There's work there that has to be done in finding the nodes. If I could
just call parsefile() without any extra work I think it would be fast
enough. What I'm really doing, by using Storable is caching the parse+query
phase. That should really be considered standard practice for any high
performance system.
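
	To show what caching the parse+query phase might look like, here is
a minimal sketch in Python, with pickle standing in for Storable. The file
names, the titles() helper, and the choice of "every <title> element" as the
query are assumptions for illustration only.

```python
import os
import pickle
import tempfile
import xml.etree.ElementTree as ET

def titles(xml_path, cache_path):
    """Return the text of every <title> element, reusing a cached result
    when the cache is at least as new as the XML file."""
    if (os.path.exists(cache_path)
            and os.path.getmtime(cache_path) >= os.path.getmtime(xml_path)):
        with open(cache_path, "rb") as fh:
            return pickle.load(fh)  # fast path: no parsing, no querying
    # Slow path: parse the document, then query it.
    result = [el.text for el in ET.parse(xml_path).iter("title")]
    with open(cache_path, "wb") as fh:
        pickle.dump(result, fh)
    return result

workdir = tempfile.mkdtemp()
xml_path = os.path.join(workdir, "books.xml")
cache_path = os.path.join(workdir, "books.cache")
with open(xml_path, "w") as fh:
    fh.write("<books><book><title>One</title></book>"
             "<book><title>Two</title></book></books>")

print(titles(xml_path, cache_path))  # parses, queries, writes the cache
print(titles(xml_path, cache_path))  # served from the cache
```

	The cache holds only the query result, so what gets reloaded is
usually far smaller than the source document.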

	Matt.



From oren at capella.co.il  Mon Mar 29 11:52:01 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:46 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <03d001be79c9$25673220$5402a8c0@oren.capella.co.il>

Stephen D. Williams <sdw@lig.net> wrote:
>I don't understand how to use XSL in a general way to achieve a 'delta
>tree' architecture.  I have a vague idea, but nothing that I could see
>being automated sufficiently.  Can you elaborate?


The following (from section 2.7.12 of the current XSL draft):

<xsl:template match="*|@*|comment()|pi()|text()">
    <xsl:copy>
        <xsl:apply-templates select="*|@*|comment()|pi()|text()"/>
    </xsl:copy>
</xsl:template>

Will copy all input to the output without modification. You can then add
templates to do specific modifications. For example:

<xsl:template match="TAG/@ATTR[.='OldValue']">
    <xsl:attribute name="ATTR">NewValue</xsl:attribute>
</xsl:template>

Will take all 'TAG' elements in the input document which have an 'ATTR'
attribute whose value is 'OldValue' and change its value to 'NewValue'.
Given the power of XSL match patterns and the power of the construction
elements, I think you can express any reasonable 'delta' on the input XML
tree.
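
For readers who think more easily in code than in templates, the same
"copy everything, then override one thing" idea can be sketched in Python
with the standard ElementTree module. This is an analogy, not an XSL
processor; the apply_delta() helper and the sample document are made up.

```python
import copy
import xml.etree.ElementTree as ET

def apply_delta(root, tag, attr, old, new):
    """Copy the tree unchanged, except that tag/@attr == old becomes new."""
    result = copy.deepcopy(root)       # the "copy everything" rule
    for element in result.iter(tag):   # the one overriding rule
        if element.get(attr) == old:
            element.set(attr, new)
    return result

source = ET.fromstring(
    '<doc><TAG ATTR="OldValue"/><TAG ATTR="Other"/><x ATTR="OldValue"/></doc>'
)
patched = apply_delta(source, "TAG", "ATTR", "OldValue", "NewValue")
print(ET.tostring(patched, encoding="unicode"))
```

Only the first TAG changes; the other TAG, the x element, and the original
tree are left as they were.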

Of course, this is outside the scope of the XSL intent as it stands today.

<Rant-and-Rave>
The transformation part of XSL is just what we need for:

- An XML query language. Think about it - an XML query language should (i)
be XML; (ii) allow selecting arbitrary parts of the input XML document(s);
(iii) allow constructing result XML document(s). The transformational part
of XSL already does 80% of that. Does anyone consider making XQL a proper
superset of XSL? Not a chance. Everyone is intent on creating a new
language. XQL at least reuses the match pattern syntax, while inventing a
new incompatible way of creating the results tree; XML-QL goes for broke and
reinvents the whole thing.

- A standard way to convert XML documents to legacy non-XML languages. Oops,
I just said non-XML languages. Excuse me.

- New and unexpected uses, such as the one above: expressing differences
between XML trees (which by itself has a lot of interesting applications).

But no, due to historical reasons XSL was created as part of a style
language, so we'll just have to use a different language for each of the
above uses and any new one which comes along (making sure they are
incompatible, of course).

Never mind that CSS is alive and kicking and supported by the very same W3C
as another way of specifying style. Never mind that CSS is staying away from
anything which might look like XML syntax, and is well along the way of
inventing a new match pattern language of its own, whose only advantage over
the XSL one is that it is incompatible with it.

I'm sure it all makes sense for _someone_. Whatever the reasons are, what I
see is "Job security for XML professionals for the next millennium".
</Rant-and-Rave>

Sorry, I just had to get it off my chest :-)

Have fun,
    Oren Ben-Kiki




From tony.mcdonald at ncl.ac.uk  Mon Mar 29 12:29:49 1999
From: tony.mcdonald at ncl.ac.uk (Tony McDonald)
Date: Mon Jun  7 17:10:46 2004
Subject: SQL database table structure for encoding XML documents?
Message-ID: <v04104408b32503c8973a@[128.240.198.13]>

Well, the subject says it all really.

Does anyone have a structure that works for them that they're willing 
to share? ie

...
CREATE TABLE ...
...

Any pointers to other resources etc.  would be gratefully received.

TIA
tone
------
Dr Tony McDonald,  FMCC, Networked Learning Environments Project
The Medical School, Newcastle University Tel: +44 191 222 5888
Fingerprint: 3450 876D FA41 B926 D3DD  F8C3 F2D0 C3B9 8B38 18A2



From hb at ix.heise.de  Mon Mar 29 13:25:55 1999
From: hb at ix.heise.de (hb@ix.heise.de)
Date: Mon Jun  7 17:10:46 2004
Subject: Namespace Question
References: <v04104408b32503c8973a@[128.240.198.13]>
Message-ID: <36FF62DA.82924B33@ix.heise.de>

Hi,

For a short example regarding namespaces I have used a variant of Tim's
example in his XML.com article. 

Is it necessary (as I presume) to give every single attribute a namespace
prefix, as long as it is not from HTML?

<html xmlns="http://www.w3.org/html4"
     xmlns:b="http://www.my.server.de/book"
     xmlns:p="http://www.my.server.de/person">
 <head><title>My Booklist</title></head>
 <body>
  <table>
    <tr><td>
<!-- these are the two lines where the attributes in question are: -->
          <b:title b:read="yup" 
             class="important">Dream a little dream of me</b:title></td>
      <td><b:author id="sfreud">
          <p:title>Dr.</p:title>
          <b:firstname>Sigmund</b:firstname>
          <b:surname>Freud</b:surname></b:author></td></tr>
  </table>
 </body>
</html>
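
One way to see how a namespace-aware processor reports those two attributes
is to run a cut-down version of the document through Python's ElementTree,
which writes qualified names as {uri}local. This is only a sketch with a
shortened copy of the example. As far as I understand the Namespaces
recommendation, the default namespace does not apply to attributes, so the
unprefixed class attribute comes back with no namespace at all.

```python
import xml.etree.ElementTree as ET

doc = """<html xmlns="http://www.w3.org/html4"
      xmlns:b="http://www.my.server.de/book">
  <b:title b:read="yup" class="important">Dream a little dream of me</b:title>
</html>"""

root = ET.fromstring(doc)
title = root.find("{http://www.my.server.de/book}title")
for name, value in sorted(title.attrib.items()):
    # Prefixed attributes carry a {uri}; unprefixed ones do not.
    print(repr(name), "=", repr(value))
```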

Best regards,

Henning Behme

iX - Magazin fuer professionelle Informationstechnik
Helstorfer Str. 7 * 30625 Hannover * Germany  
http://www.heise.de/ix/ * +49 511 5352-374 * f: -361
------ White, adj. and n. Black  (Ambrose Bierce) ------



From david at megginson.com  Mon Mar 29 13:27:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:46 2004
Subject: half-baked parsers vs binary XML
In-Reply-To: <36FDA19A.27DA7B45@jfinity.com>
References: <36FD4FA4.26BCB466@jfinity.com>
	<14077.33404.430088.361367@localhost.localdomain>
	<36FD95F9.7E93A231@jfinity.com>
	<14077.38830.801250.747754@localhost.localdomain>
	<36FDA19A.27DA7B45@jfinity.com>
Message-ID: <14079.25158.877601.734891@localhost.localdomain>

Gabe Beged-Dov writes:

 > Another reason (other than the binary XML thread) that I brought
 > this up was discussion on the perl-xml mailing list of whether
 > XML::Parser was usable for soft real-time server side
 > processing. The consensus there seems to be no.

The speed bottleneck, however, is Perl, not Expat: if you were acting
off a different kind of input, it would still take just as long to
execute the Perl handlers for the start and end of each element, etc.

In other words, it's not the XML *input* that you need to optimize,
but the *output* -- for example, if you have a Perl script that
renders XML in HTML, the best speed optimization is to cache the
result and re-serve it for any request with the same parameters.
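
A minimal sketch of that caching idea, in Python rather than Perl purely
for illustration.  The function render_html and its parameters are
hypothetical stand-ins for whatever script does the XML-to-HTML
rendering; the only point is that a repeated request with the same
parameters skips the handler-heavy path entirely.

```python
from functools import lru_cache

render_calls = 0  # counts how often the expensive path actually runs


@lru_cache(maxsize=256)
def render_html(doc_id: str, style: str) -> str:
    """Hypothetical stand-in for an expensive XML-to-HTML rendering step."""
    global render_calls
    render_calls += 1
    return f"<html><body class='{style}'>document {doc_id}</body></html>"


first = render_html("booklist", "plain")   # rendered once
second = render_html("booklist", "plain")  # same parameters: served from cache
third = render_html("booklist", "fancy")   # different parameters: rendered again
```

Whether this helps depends on how often requests repeat and on being
able to invalidate the cache when the source document changes, which
the sketch does not attempt.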

The XML/SGML processing model is generally to walk through a document
(as a collection of events or as a tree) and fire off handlers for
different types of things.  Even a short to medium-length XML document 
can cause the handlers to be fired off many thousands of times, and if 
you're trying to handle hundreds of requests per second, that's going
to cause problems with or without XML.

In some cases, the query processing model might help things,
especially if the query code is moved into C or C++.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar 29 13:41:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:46 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
In-Reply-To: <005301be7906$745450c0$35f96d8c@NT.JELLIFFE.COM.AU>
References: <005301be7906$745450c0$35f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14079.26243.658297.559636@localhost.localdomain>

Rick Jelliffe writes:

[...]

 > but Dave seems to be  saying that the fact that two things (pointed to)
 > are the same is not "information". That seems an extraordinary claim.
 > 
 > <eg>
 >     <owner id="j1">john</owner>
 >     <dog owner="j1">rover</dog>
 >     <dog owner="j1">rex</dog>
 > </eg>
 > 
 > encodes more information than
 > 
 > <eg>
 >     <dog owner="john">rover</dog>
 >     <dog owner="john">rex</dog>
 > </eg>

I don't think that I said that, though I certainly typed a lot.  What
I did say is that there's not a practical difference among the
different alternatives in XML and SGML for expressing this
information, and probably not enough to justify the parallel
maintenance of the two as discrete standards.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Mar 29 13:55:03 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:46 2004
Subject: Proposed new kind of SAX2 thing, with example
In-Reply-To: <199903290049.TAA06821@locke.ccil.org>
References: <wkg16qudum.fsf@ifi.uio.no>
	<199903290049.TAA06821@locke.ccil.org>
Message-ID: <14079.27126.38040.640445@localhost.localdomain>

John Cowan writes:

 > Lars Marius Garshol scripsit:
 > 
 > > You can anyway, if you just use a Vector or some equivalent as the
 > > property value.
 > 
 > Vectors are no real substitute for indexed properties, because they
 > require exposing the collection rather than just its elements,
 > and the bean can't get control when an element is changed.

Either John is misinterpreting Lars or I am.  I thought that Lars
meant using a vector containing the index and the value, not a vector
of all the possible values.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From msabin at cromwellmedia.co.uk  Mon Mar 29 15:40:09 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:10:46 2004
Subject: Interface name quandry again ...
Message-ID: <c=US%a=_%p=Cromwell_Media%l=ODIN-990329133849Z-34603@odin.cromwellmedia.co.uk>

A while ago I posted to this list asking for a
suggestion for a Java package name for interfaces and
classes that deal with stuff in the intersection of xml 
and html, but which aren't sufficiently general to cover 
all of sgml.

Someone (John Cowan, I think) suggested xhtml. That
seemed like quite a good idea at the time, but since
then voyager has been renamed. I'm not a big fan of 
overloading, so I've been scratching my head trying to 
think of something else. So far I've come up with 
precisely zilch.

If anyone can help me out I'd be very grateful.

Cheers,


Miles

-- 
Miles Sabin                          Cromwell Media
Internet Systems Architect           5/6 Glenthorne Mews
+44 (0)181 410 2230                  London, W6 0LJ
msabin@cromwellmedia.co.uk           England



From jonathan at texcel.no  Mon Mar 29 15:42:00 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:46 2004
Subject: Java(TM)/XML(TM) Open Source Development Session
Message-ID: <3.0.3.32.19990329084134.03898920@pop.mindspring.com>

ExoLab is hosting open source software development sessions for Java and
XML technologies. This description arrived in my inbox, and I thought it
might be interesting to some people here.

Jonathan


Sender: root@hermes.oceanet.fr
Date: Mon, 29 Mar 1999 14:26:28 +0200
From: "Ismaël Ghalimi" <ghalimi@exoffice.com>
Organization: ExOffice, Inc.
Subject: Java(TM)/XML(TM) Open Source Development Session

Hi,

We are pleased to announce the ExoLab, the first Open Source Development
Session dedicated to Java(TM) & XML(TM) technologies. The ExoLab will
host Open Source software development sessions lasting from one week
to one month. These sessions will be open to consulting and software
companies that want to work collaboratively on the development of Open
Source software. ExoLab Sessions will help these companies share their
knowledge of cutting-edge Open Source technologies, allowing powerful
technology transfers between companies that have an Open Source business
model. The first ExoLab Session will take place in Nantes
(Loire-Atlantique, FRANCE) during May 1999 and will be mainly targeted
at the development of the ExoGen Framework, an Open Source
Java(TM)-based application & document server. More information about
this initiative can be found on the ExoLab Home Page:

http://www.exoffice.com/exolab.html

The following developers will be present:

* One engineer from Lutris Technologies
  http://www.lutris.com

* Two engineers from SMB, the author of the Open Source Ozone OODBMS
  http://www.softwarebuero.de

* The author of SPFC
  http://java.apache.org/spfc/index.html

* The author of OpenXML
  http://www.openxml.org

* Three core developers from the Java Apache Project
  http://java.apache.org

* The author of XSL:P
  http://www.clc-marketing.com/xslp/

* The architect of ejboss
  http://www.ejboss.org

All development done during these sessions will be licensed under
the LGPL, the ALL, or a BSD-like license. This will allow the work to
be integrated into commercial binaries without any obligation to
redistribute modifications to it under an Open Source license.

Please contact me at ghalimi@exoffice.com for more information.

Best regards

Ismaël Ghalimi, CEO
ExOffice, Inc.
ghalimi@exoffice.com




From paul.janssens at skynet.be  Mon Mar 29 15:50:32 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:46 2004
Subject: convertor generator available
Message-ID: <36FF848C.75E4@skynet.be>

Version 0.0.1 of masterplan, a convertor generator, is now available from

http://users.skynet.be/mp/mp001.tar.Z

some examples are included.

You'll need gcc, bison and flex or equivalents to build both the
executable and the convertors.

Masterplan comes as a 'kit' of GPL-ed core code and less restricted
library code, so the convertors themselves aren't infected by the GPL.

Have fun.


Paul Janssens - paul.janssens@skynet.be



From paul.janssens at skynet.be  Mon Mar 29 16:02:31 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:46 2004
Subject: SQL database table structure for encoding XML documents?
References: <v04104408b32503c8973a@[128.240.198.13]>
Message-ID: <36FF8768.4F16@skynet.be>

Tony McDonald wrote:
> 
> Well, the subject says it all really.
> 
> Does anyone have a structure that works for them that they're willing
> to share? ie

No, but off the top of my head:

ENTITYNAMES
tag, name

ENTITY
uniquekey, tag, parententitykey, index

ATTRIBUTE
ownerentitykey, name, value

CONTENT
ownerentitykey, index, data

You don't have validation, but you do have referential integrity, and
you can render back to XML with a simple transitive closure.
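
A rough sketch of the four tables above, using Python's sqlite3 module
only because it needs no server.  The column types, the foreign keys,
the sample rows and the render() helper are all assumptions made for
illustration; "index" is renamed to "pos" because INDEX is an SQL
keyword, and I assume child elements and text chunks share one position
sequence under their parent.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces references when asked
db.executescript("""
    CREATE TABLE entitynames (tag INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE entity (
        uniquekey INTEGER PRIMARY KEY,
        tag INTEGER NOT NULL REFERENCES entitynames(tag),
        parententitykey INTEGER REFERENCES entity(uniquekey),
        pos INTEGER NOT NULL
    );
    CREATE TABLE attribute (
        ownerentitykey INTEGER NOT NULL REFERENCES entity(uniquekey),
        name TEXT NOT NULL, value TEXT NOT NULL
    );
    CREATE TABLE content (
        ownerentitykey INTEGER NOT NULL REFERENCES entity(uniquekey),
        pos INTEGER NOT NULL, data TEXT NOT NULL
    );
""")

# Sample document: <eg><dog owner="john">rover</dog><dog owner="john">rex</dog></eg>
db.executemany("INSERT INTO entitynames VALUES (?, ?)", [(1, "eg"), (2, "dog")])
db.executemany("INSERT INTO entity VALUES (?, ?, ?, ?)",
               [(1, 1, None, 0), (2, 2, 1, 0), (3, 2, 1, 1)])
db.executemany("INSERT INTO attribute VALUES (?, ?, ?)",
               [(2, "owner", "john"), (3, "owner", "john")])
db.executemany("INSERT INTO content VALUES (?, ?, ?)",
               [(2, 0, "rover"), (3, 0, "rex")])


def render(key: int) -> str:
    """Rebuild the XML for one element by walking its children recursively."""
    (name,) = db.execute(
        "SELECT n.name FROM entity e JOIN entitynames n ON n.tag = e.tag "
        "WHERE e.uniquekey = ?", (key,)).fetchone()
    attrs = "".join(
        f" {attr}={quoteattr(value)}" for attr, value in db.execute(
            "SELECT name, value FROM attribute WHERE ownerentitykey = ? "
            "ORDER BY name", (key,)))
    # Collect child elements and text chunks, then order them by position.
    children = [(pos, render(child)) for child, pos in db.execute(
        "SELECT uniquekey, pos FROM entity WHERE parententitykey = ?",
        (key,)).fetchall()]
    children += [(pos, escape(data)) for pos, data in db.execute(
        "SELECT pos, data FROM content WHERE ownerentitykey = ?", (key,))]
    body = "".join(text for _, text in sorted(children))
    return f"<{name}{attrs}>{body}</{name}>"


xml = render(1)
```

The recursion in render() plays the role of the transitive closure: one
query per element, which is simple but would be slow for large documents.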



From tyler at infinet.com  Mon Mar 29 17:28:51 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:10:46 2004
Subject: Fast filter support in SAX2
References: <001601be7886$f9db4440$c8a8a8c0@thing1> <14077.15323.327681.132673@localhost.localdomain>
Message-ID: <36FF9B91.A0632786@infinet.com>

David Megginson wrote:

> Bill la Forge writes:
>
>  > It would be great if filters had the same advantages as parsers in
>  > being able to simply test for equality (x==y) rather than having to
>  > do a string comparison (x.equals(y)) when checking for a specific
>  > element or attribute name.
>
> Yes, but as someone (James Clark?) pointed out during the last round,
> with most serious applications you're going to end up doing hash
> lookups anyway, so the == doesn't buy you much.

That depends on your implementation of a hash table.  Also, as of JDK 1.1.6 the equals() method
for strings first tests whether the two string objects are identical, then whether their lengths
are the same, and only then compares the strings character by character.  Names in XML are
uniformly nothing more than symbols, so when the parser hands them to the application as interned
strings, being able to do something like this in application code:

if (x == "foo")

is generally much faster than:

if (x.equals("foo"))

as you do not incur the overhead of a dynamic method call.  Really it depends on your
code.  In an XML-related technology I worked on I had lots of if-else statements that did
exactly this.  The parser I used presented the strings to the application as interned strings,
and this significantly improved performance over the equals() approach.

Another thing that I used for speeding up my applications is a special hash table for
interned strings.  Basically all that this table did was use System.identityHashCode() instead
of String.hashCode() to get a hash for the string.  In effect you use the Object.hashCode()
implementation.

It also depends a lot on your VM.  Some VMs are good enough with dynamic method invocation
that the difference between testing for string identity and string equality is negligible.
The so-called HotSpot VM may even inline String.equals() into your code.

I suggest using the identity approach if possible, as it is easier to read and maintain IMHO,
and in the general case you may get significant speedups if your application does many string
comparisons.  If you need a faster hash table for strings, build one yourself.
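
For readers who want to try the idea outside Java, here is a loosely analogous sketch in
Python, where sys.intern() plays the role of String.intern() and "is" plays the role of ==.
It is not the Java code discussed above; the names is_title and handlers are made up for the
example, and it says nothing about how large the speedup is on any particular VM.

```python
import sys

# Element names as a parser might hand them over: built at run time,
# so two equal names are not necessarily the same object.
raw_a = "".join(["b:", "title"])
raw_b = "".join(["b:", "title"])

# Interning maps every equal string to one shared object.
name_a = sys.intern(raw_a)
name_b = sys.intern(raw_b)
TITLE = sys.intern("b:title")


def is_title(name: str) -> bool:
    """Identity test; only valid if every name passed in has been interned."""
    return name is TITLE


# A table keyed by object identity rather than by string hash, in the spirit
# of the identity-hash table described above.  TITLE must stay alive for its
# id() to remain a valid key.
handlers = {id(TITLE): "handle title"}
found = handlers.get(id(name_a))
```

The catch is the same in both languages: a single name that reaches the comparison without
having been interned makes the identity test silently fail.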

Tyler




From oren at capella.co.il  Mon Mar 29 17:45:53 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:46 2004
Subject: Fw: Is there anyone working on a binary version of XML?
Message-ID: <045d01be79fa$91f833e0$5402a8c0@oren.capella.co.il>

Stephen D. Williams <sdw@lig.net> wrote:
>I don't understand how to use XSL in a general way to achieve a 'delta
>tree' architecture.  I have a vague idea, but nothing that I could see
>being automated sufficiently.  Can you elaborate?


The following (from section 2.7.12 of the current XSL draft):

<xsl:template match="*|@*|comment()|pi()|text()">
    <xsl:copy>
        <xsl:apply-templates select="*|@*|comment()|pi()|text()"/>
    </xsl:copy>
</xsl:template>

Will copy all input to the output without modification. You can then add
templates to do specific modifications. For example:

<xsl:template match="TAG/@ATTR[.='OldValue']">
    <xsl:attribute name="ATTR">NewValue</xsl:attribute>
</xsl:template>

Will take all 'TAG' elements in the input document which have an 'ATTR'
attribute whose value is 'OldValue' and change its value to 'NewValue'.
Given the power of XSL match patterns and the power of the construction
elements, I think you can express any reasonable 'delta' on the input XML
tree.
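
To make the 'copy everything, then override one rule' shape concrete without an XSL
processor, here is a small sketch of the same delta applied to an in-memory tree with
Python's xml.etree.ElementTree. It is not XSL and not what the draft specifies; apply_delta
and the sample document are invented for the example.

```python
import copy
import xml.etree.ElementTree as ET


def apply_delta(root: ET.Element) -> ET.Element:
    """Return a copy of the tree with ATTR='OldValue' on TAG elements replaced."""
    result = copy.deepcopy(root)          # the "copy all input" rule
    for elem in result.iter("TAG"):       # the one overriding rule
        if elem.get("ATTR") == "OldValue":
            elem.set("ATTR", "NewValue")
    return result


source = ET.fromstring(
    '<doc><TAG ATTR="OldValue">a</TAG><TAG ATTR="Other">b</TAG>'
    '<X ATTR="OldValue"/></doc>'
)
patched = apply_delta(source)
output = ET.tostring(patched, encoding="unicode")
```

Only the first TAG element changes; the X element keeps its 'OldValue' because the
override is scoped to TAG, and the source tree is left untouched.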

Of course, this is outside the scope of the XSL intent as it stands today.

<Rant-and-Rave>
The transformation part of XSL is just what we need for:

- An XML query language. Think about it - an XML query language should (i)
be XML; (ii) allow selecting arbitrary parts of the input XML document(s);
(iii) allow constructing result XML document(s). The transformational part
of XSL already does 80% of that. Does anyone consider making XQL a proper
superset of XSL? Not a chance. Everyone is intent on creating a new
language. XQL at least reuses the match pattern syntax, while inventing a
new incompatible way of creating the results tree; XML-QL goes for broke and
reinvents the whole thing.

- A standard way to convert XML documents to legacy non-XML languages. Oops,
I just said non-XML languages. Excuse me.

- New and unexpected uses, such as the one above: expressing differences
between XML trees (which by itself has a lot of interesting applications).

But no, due to historical reasons XSL was created as part of a style
language, so we'll just have to use a different language for each of the
above uses and any new one which comes along (making sure they are
incompatible, of course).

Never mind that CSS is alive and kicking, and supported by the very same W3C
as another way of specifying style. Never mind that CSS is staying away from
anything which might look like XML syntax, and is well along the way to
inventing a new match pattern language of its own, whose only advantage over
the XSL one is that it is incompatible with it.

I'm sure it all makes sense to _someone_. Whatever the reasons are, what I
see is "Job security for XML professionals for the next millennium".
</Rant-and-Rave>

Sorry, I just had to get it off my chest :-)

Have fun,
    Oren Ben-Kiki




From timm at channelpoint.com  Mon Mar 29 18:20:20 1999
From: timm at channelpoint.com (Tim McCune)
Date: Mon Jun  7 17:10:46 2004
Subject: OFF: (waaay off topic) RE: LISTADMIN: No attachments to list mess
	ages PLEASE
Message-ID: <8A24EC12044FD21195E200600895E0B3016363B4@goat.channelpoint.com>

Damned eloquent David.  But I'd put the poster at #1 on that list for being
ignorant enough to open a Word document that was attached to an e-mail
message.  Your comment about technical diversity indicates to me that you've
never been a system administrator. ;)

-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
David Megginson

As became clear in the follow-ups, the posting was done by a worm that
hides in Word macros (the Internet's equivalent of animal dung,
apparently) and exploits gaping security holes in Outlook to mail itself
out to everyone in a person's address list.

In other words, the original poster did *not* post the attachment to
xml-dev, the worm did.  His only mistakes were (a) using Microsoft
Windows, (b) opening a file in MS Word, and (c) not uninstalling
Outlook from his computer the first time he booted up.  If you had
summarily unsubscribed him, then you would simply have added an unjust
punishment to the embarrassment he was already suffering.

In fact, all three of the mistakes were probably mandated by company
policy; if so the true blame belongs in three places, in diminishing
order of culpability:

1. The poster's company, for ignoring the importance of technical
   diversity and mandating the same operating system and software for
   everyone (it's much easier to write a worm or virus when everyone's 
   using exactly the same software).

2. Redmond, for ignoring security whenever possible.

3. The creator of the worm.

If I'm right about corporate policy, then most of the blame goes to
the company -- Redmond just wants to sell software, and the worm
creator just wants attention, but the company failed to act in its own
self-interest.  Technical diversity is critical for good operation:
I'd no more want to see an all-Linux shop than I'd want to see an
all-Windows or an all-Mac shop.



From bckman at ix.netcom.com  Mon Mar 29 18:29:27 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:10:46 2004
Subject: XHTML and character entities
Message-ID: <005c01be7a01$0141d480$1eacdccf@ix.netcom.com>

Actually the best thing would be to convert them all to numeric entities,
and then the problem wouldn't arise

frank
----- Original Message -----
From: Gabe Beged-Dov <begeddov@jfinity.com>
To: XML List <xml-dev@ic.ac.uk>
Sent: Sunday, March 28, 1999 10:20 PM
Subject: XHTML and character entities


>I mention tidy below but am asking about html->xhtml conversion in
>general.
>
>I use tidy to convert html to xhtml using the -asxml switch. The
>result of many conversions is still not accepted as well-formed because
>entities like agrave and friends aren't defined unless you process the
>DTD.
>
>Wouldn't it be reasonable to convert these to character entities as part
>of the html->xhtml process?
>
>




From paul.janssens at skynet.be  Mon Mar 29 18:48:29 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:46 2004
Subject: XML query language
References: <045d01be79fa$91f833e0$5402a8c0@oren.capella.co.il>
Message-ID: <36FFAE41.2780@skynet.be>

Oren Ben-Kiki wrote:
> - An XML query language. Think about it - an XML query language should (i)
> be XML; (ii) allow selecting arbitrary parts of the input XML document(s);
> (iii) allow constructing result XML document(s). 

I think (iii) should not be a requirement of an XML query language. The
result of a query  could be a vector of tuples of pointers to the
individual matches. Whatever needs to be done with that output can be
done in a layer above that. Just because SQL mixes content with style
doesn't mean an XML query language should.
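
A small sketch of what I mean, in Python with xml.etree.ElementTree:
the query returns tuples of references into the existing tree and
builds no new XML, and a separate layer decides what to produce from
them. The function name and the sample document are only illustrative.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring(
    "<eg>"
    '<owner id="j1">john</owner>'
    '<dog owner="j1">rover</dog>'
    '<dog owner="j1">rex</dog>'
    "</eg>"
)


def query_dogs_with_owner(root):
    """Return (dog, owner) pairs of references into the tree; builds no new XML."""
    owners = {o.get("id"): o for o in root.findall("owner")}
    return [(d, owners[d.get("owner")]) for d in root.findall("dog")
            if d.get("owner") in owners]


matches = query_dogs_with_owner(doc)

# A separate layer decides what to do with the matches, e.g. plain text lines.
lines = [f"{dog.text} belongs to {owner.text}" for dog, owner in matches]
```

The same list of matches could just as well feed an XML constructor,
a report, or an update, without the query layer knowing which.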

Paul Janssens - paul.janssens@skynet.be



From nhs at llnl.gov  Mon Mar 29 19:02:44 1999
From: nhs at llnl.gov (Norman H. Samuelson)
Date: Mon Jun  7 17:10:47 2004
Subject: XML to Text questions
Message-ID: <4.1.19990329084312.00aae3f0@popeye.llnl.gov>

What tools are available for translation of XML into text?

We are working on a GUI that will write the information needed as input to
a physics simulation code in XML, and we need to translate that into the
grammar required by the physics code.

Our goal is a GUI that will work for many different physics codes.  The
translators are necessary because we do not want to change the physics
simulation at this time to read XML directly.


- Norm -

Norman H. Samuelson                  nhs@llnl.gov
Lawrence Livermore National Lab      925-422-0661
P.O. Box 808, L-98
Livermore, CA 94551 



From begeddov at jfinity.com  Mon Mar 29 19:30:04 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:47 2004
Subject: XHTML and character entities
References: <005c01be7a01$0141d480$1eacdccf@ix.netcom.com>
Message-ID: <36FFB7A8.5D81864@jfinity.com>

Frank Boumphrey wrote:

> Actually the best thing would be to convert them all to numeric entities,
> and then the problem wouldn't arise

That is what I meant by character entity. I should have said character
reference. Converting the general entity references to character references
is what I was trying to ask about, i.e. is it reasonable for an html->xhtml
converter to do this automagically, or should it be an option, etc.?
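
A rough sketch of the conversion I have in mind, in Python. It uses the
standard html.entities.name2codepoint table; the function name and the
regular expression are my own, and it is deliberately naive: it leaves
the five XML predefined entities and any unknown names alone, and it
does not try to skip CDATA sections or comments.

```python
import re
from html.entities import name2codepoint

XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")


def to_char_refs(text: str) -> str:
    """Replace HTML named entity references with numeric character references."""
    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave predefined and unknown names as-is
        return f"&#{name2codepoint[name]};"
    return ENTITY_REF.sub(replace, text)


converted = to_char_refs("<p>voil&agrave; &amp; caf&eacute; &nosuch;</p>")
```

The result parses as well-formed XML without reading the DTD, as long
as no unknown entity names are left over.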

A further question would be when Tidy would start doing it :-?

Gabe Beged-Dov
www.jfinity.com




From James.Thompson at dresdnerkb.com  Mon Mar 29 19:46:34 1999
From: James.Thompson at dresdnerkb.com (James.Thompson@dresdnerkb.com)
Date: Mon Jun  7 17:10:47 2004
Subject: XSL Transformation
Message-ID: <199903291742.SAA19675@harpo.dresdnerkb.com>

Hi,

I have an XML doc that looks like this

<STOCKITEM>
	<CATEGORY>A</CATEGORY>
	<STOCKCODE>123</STOCKCODE>
</STOCKITEM>
<STOCKITEM>
	<CATEGORY>A</CATEGORY>
	<STOCKCODE>456</STOCKCODE>
</STOCKITEM>
<STOCKITEM>
	<CATEGORY>B</CATEGORY>
	<STOCKCODE>789</STOCKCODE>
</STOCKITEM>

I would like to transform it into this kind of structure using XSL:

<CATEGORY name="A">
	<STOCKITEM>
		<CODE>123</CODE>
	</STOCKITEM>
	<STOCKITEM>
		<CODE>456</CODE>
	</STOCKITEM>
</CATEGORY>
<CATEGORY name="B">
	<STOCKITEM>
		<CODE>789</CODE>
	</STOCKITEM>
</CATEGORY>

I don't know the categories in advance, and there are also subcategories that
will nest within the categories. I could do it using scripts and some kind
of fudge based on the SQL SELECT DISTINCT category idea. However, I think
this is somewhat against the spirit of XSL. Any ideas on how this might be
done? I can't be the first person to want to use this kind of idiom.

Many Thanks

James Thompson





From begeddov at jfinity.com  Mon Mar 29 19:51:23 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:47 2004
Subject: half-baked parsers vs binary XML
References: <5F052F2A01FBD11184F00008C7A4A800022A1731@EUKBANT101>
Message-ID: <36FFBCB4.15CA1B0B@jfinity.com>

Matthew Sergeant (EML) wrote:

> If I could just call parsefile() without any extra work I think it would be
> fast enough.

As Nathan Kurz's posting to the perl-xml list shows, there is a bottleneck in just
the parsing of the XML, without bringing callback firing, let alone query
processing, into the picture.

> What I'm really doing, by using Storable is caching the parse+query phase.

This is great if your use-case supports it. It is not a general-purpose
approach to providing scalable performance for soft real-time systems that
want to incorporate XML parsing.

> That should really be considered standard practice for any high
> performance system.

Once again, I would say that if your "high performance" system can be architected
using a "cache the parse+query" approach and the complexity and storage
overheads are acceptable, go for it.  There are a lot of "high performance"
systems that won't be amenable to this approach.

Gabe Beged-Dov
www.jfinity.com






From begeddov at jfinity.com  Mon Mar 29 20:02:32 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:10:47 2004
Subject: half-baked parsers vs binary XML
References: <36FD4FA4.26BCB466@jfinity.com>
		<14077.33404.430088.361367@localhost.localdomain>
		<36FD95F9.7E93A231@jfinity.com>
		<14077.38830.801250.747754@localhost.localdomain>
		<36FDA19A.27DA7B45@jfinity.com> <14079.25158.877601.734891@localhost.localdomain>
Message-ID: <36FFBF61.5307A689@jfinity.com>

David Megginson wrote:

> In other words, it's not the XML *input* that you need to optimize,
> but the *output* -- for example, if you have a Perl script that
> renders XML in HTML, the best speed optimization is to cache the
> result and reserve it for any request with the same parameters.

Assume that caching isn't an option, i.e. you have to make all your processing reasonably
fast. It's not acceptable to make only 80% of your processing really fast.

> The XML/SGML processing model is generally to walk through a document
> (as a collection of events or as a tree) and fire off handlers for
> different types of things.  Even a short to medium-length XML document
> can cause the handlers to be fired off many thousands of times, and if
> you're trying to handle hundreds of requests per second, that's going
> to cause problems with or without XML.

Are we talking about throughput or responsiveness?  It would be useful  to bring up some
use-cases where XML processing can't be employed using the default handler firing model and
try to understand what the alternatives are.

Matt Sergeant has brought up one that he might be able to flesh out involving large scale
usage. I'm sure there are others.

Gabe Beged-Dov
www.jfinity.com




From david at megginson.com  Mon Mar 29 20:03:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:47 2004
Subject: Why Technical Diversity Matters (was OFF: (waaay off topic))
In-Reply-To: <8A24EC12044FD21195E200600895E0B3016363B4@goat.channelpoint.com>
References: <8A24EC12044FD21195E200600895E0B3016363B4@goat.channelpoint.com>
Message-ID: <14079.47860.97820.196660@localhost.localdomain>

Tim McCune writes:

 > Damned eloquent David.  But I'd put the poster at #1 on that list
 > for being ignorant enough to open a Word document that was attached
 > to an e-mail message.

You cannot expect typical users to make an informed decision about
software security risks (some can certainly do so, but it is not a
reasonable expectation in general).

 > Your comment about technical diversity indicates to me that you've
 > never been a system administrator. ;)

I've had budgetary responsibility for system administrators, and have
hired and supervised them, so I do understand why it is so tempting to
go for technical homogeneity rather than technical diversity.  In the
end, however, it's actually just bad business.

This is not a problem that is specific to computers: it's a general
business cost/risk tradeoff.  To get away from the anti-Windows hype,
imagine that you run a mid-sized, regional air carrier with all your
routes and passenger loads about the same: you will save an *enormous*
amount of money in training, maintenance, staff, facilities, etc. if
you buy all of your planes from the same manufacturer (and preferably,
if you buy the same model).

Now, let's say that you bought a fleet of 15 A320's from Airbus, and
they run beautifully for seven years.  Suddenly, there's a major crash
involving an A320 from another airline a month before Christmas, and
the FAA grounds all planes of that model until their investigation is
finished.  The investigation finishes in mid-January and your A320's
get a clean bill of health, but now you've not only missed the
Christmas rush (which accounts for a large part of your annual
revenue) and destroyed employee morale (by laying most of them off just
before Christmas), but you've upset your customers, who had to switch
to other airlines and wait at the back of the line.

MORAL
-----

When you decided to save money by buying all of your planes from the
same manufacturer, you were actually doing the opposite of buying
insurance: with insurance, you trade a fixed cost (your insurance
premiums) for a non-fixed benefit (avoiding a large, unexpected
liability); with technical homogeneity, you trade a non-fixed cost (the
possibility of a complete operations shutdown of indeterminate length)
for a fixed benefit (a known reduction in the cost of ownership).

It isn't hard to see how the same point applies to computing, no
matter how good or competent a specific manufacturer is.  In the end,
some businesses may decide to take this risk, but they should at least
do it in an informed way (i.e. realise that it's a risk) and protect
themselves with some sort of derivatives or supplementary insurance.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From simonstl at simonstl.com  Mon Mar 29 20:03:47 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:47 2004
Subject: XML to Text questions
In-Reply-To: <4.1.19990329084312.00aae3f0@popeye.llnl.gov>
Message-ID: <199903291803.NAA15557@hesketh.net>

At 08:58 AM 3/29/99 -0800, Norman H. Samuelson wrote:
>What tools are available for translation of XML into text?
>
>We are working on a GUI that will write the information needed as input to
>a physics simulation code in XML, and we need to translate that into the
>grammar required by the physics code.
>
>Our goal is a GUI that will work for many different physics codes.  The
>translators are necessary because we do not want to change the physics
>simulation at this time to read XML directly.

You'd have to do some 'roll-your-own' work right now, but it probably
wouldn't be very difficult to write a SAX application (in Java) that does
what you need, taking the events generated by parsing XML and converting
that information into the text format you need.  The MDSAX library has some
display (really output generation) tools that might simplify managing that
process, but you (or another lucky programmer) could probably write a
fairly simple application as long as your XML and your final text output
have similar structures.

More on SAX - http://www.megginson.com/SAX/
More on MDSAX - http://www.jxml.com/mdsax 
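
A minimal sketch of the kind of SAX application described above, written
with Python's xml.sax module rather than Java for brevity.  The element
names and the "name = value" output format are assumptions for
illustration, not the grammar of any real physics code, and the handler
only copes with leaf elements that hold plain text:

```python
import io
import xml.sax


class KeywordWriter(xml.sax.ContentHandler):
    """Write one 'name = value' line for every element that holds text."""

    def __init__(self, out):
        super().__init__()
        self.out = out
        self.text = []  # character data seen since the last start/end tag

    def startElement(self, name, attrs):
        self.text = []

    def characters(self, content):
        # The parser may deliver text in several chunks, so collect them.
        self.text.append(content)

    def endElement(self, name):
        value = "".join(self.text).strip()
        if value:
            self.out.write(f"{name} = {value}\n")
        self.text = []


def xml_to_text(xml_string):
    """Parse the XML and return the generated plain-text input deck."""
    out = io.StringIO()
    xml.sax.parseString(xml_string.encode("utf-8"), KeywordWriter(out))
    return out.getvalue()
```

Supporting a second physics code would then mostly mean writing a second
handler class, which is why this works best when the XML and the target
text have similar structures.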

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com



From david at megginson.com  Mon Mar 29 20:11:56 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:47 2004
Subject: XML to Text questions
In-Reply-To: <4.1.19990329084312.00aae3f0@popeye.llnl.gov>
References: <4.1.19990329084312.00aae3f0@popeye.llnl.gov>
Message-ID: <14079.49440.827275.773899@localhost.localdomain>

Norman H. Samuelson writes:

 > What tools are available for translation of XML into text?

 > We are working on a GUI that will write the information needed as input to
 > a physics simulation code in XML, and we need to translate that into the
 > grammar required by the physics code.

You can write quick one-off 10-line scripts with Perl, Python, and
probably many other scripting languages.  A Java app takes a little
more work, but the clean object-oriented structure allows you to do
harder things without going completely insane, and there are an awful
lot of good, higher-level XML libraries for Java (though the Python
and Perl collections are growing fast).
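
As a rough illustration of how short such a one-off script can be, here
is a sketch in Python using the standard xml.etree.ElementTree module.
The input layout (one child element per "card", with values held in
attributes) and the upper-case keyword output are assumptions made up
for the example:

```python
import xml.etree.ElementTree as ET


def deck_lines(xml_string):
    """Return one text line per child of the root: 'KEYWORD attr=value ...'."""
    root = ET.fromstring(xml_string)
    lines = []
    for card in root:
        # Sort the attributes so the output order is predictable.
        values = " ".join(f"{k}={v}" for k, v in sorted(card.attrib.items()))
        lines.append(f"{card.tag.upper()} {values}".rstrip())
    return lines
```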


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jonathan at texcel.no  Mon Mar 29 20:12:40 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:47 2004
Subject: Whence XQL?
In-Reply-To: <19990329104719.A13271@io.mds.rmit.edu.au>
References: <3.0.3.32.19990325220915.032a1480@pop.mindspring.com>
 <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
 <30649320C177D111ADEC00A024E9F297169FBC@exchange-server.dega.com>
 <000b01be7702$102babd0$0100007f@eps.inso.com>
 <3.0.3.32.19990325165217.00a6b550@pop.mindspring.com>
 <19990326133124.B7318@io.mds.rmit.edu.au>
 <3.0.3.32.19990325220915.032a1480@pop.mindspring.com>
Message-ID: <3.0.3.32.19990329131150.00d0b710@pop.mindspring.com>

At 10:47 AM 3/29/99 +1000, Marcelo Cantos wrote:
>On Thu, Mar 25, 1999 at 10:09:15PM -0500, Jonathan Robie wrote:
 
>> Cool, you work on SIM? (Does that make you a SIMian?)
>
>Cute!  It might just take off around here. :-)

I haven't been able to come up with a similar nickname for people who work
on XQL...

>I do wonder what proportion of people looking seriously at XQL are
>into text.  We find WITHIN N to be exceedingly useful.  It is also
>interesting to note that we only offer proximity at the word level and
>that this is all clients ever really want.  We do also offer same
>sentence/paragraph queries, but virtually no-one uses them.

One full-text search engine vendor told me that their users did not use
proximity searching. This surprised me, but it was what convinced me that I
might be able to leave proximity out of even full-text extensions to XQL.

Most of what I have done with XML until fairly recently was with structured
documents rather than data, or with documents that also contain what has
classically been considered data. I am now starting to do more with XML for
data. I think that both Microsoft and Joe Lapp of webMethods have worked
more with data than with documents.

>It's an interesting angle, though not one I had considered (not that I
>have considered many angles :-).  I had understood, perhaps
>incorrectly, that the only way to perform word-level boolean queries
>was to treat words abstractly as leaf nodes of the document tree
>rather than clumps of opaque string data.  Under this conception, to
>find "other name", one would say:
>
>  LINE[WORD="other"; WORD="name"]
>
>It could possibly be made legal to abbreviate the above to:
>
>  LINE["other"; "name"]

XQL as-is does not allow this, but I have discussed this as a possible
extension in the section on "Integrating structured and full-text queries",
in http://www.w3.org/TandS/QL/QL98/pp/murata-san.html, a paper written
together with Makoto Murata-san. It makes the above syntax legal.

The other approach, which you have used above, is to pretend that there is
markup identifying the individual words - that's a perfectly valid approach
too.

>Which would be interpreted as, "a Line element which is the parent of
>a leaf node equal to "other" immediately preceding a leaf node equal
>to "name".  Now, support for proximity ("rose*" within 10 words of
>"sweet") would simply be a matter of:
>
>  LINE["rose*" %10 "sweet"]
>
>(The %N syntax is borrowed from our query language.)  Higher level
>proximities could be done like this:
>
>  LINE["name"] %10 LINE["purple"]
>
>The operator simply adopts the level of its operands; mismatched
>operands constitute an error.
 
I would have to think about how to fit that into the XQL grammar. Does it
have advantages over the function-based approach I suggested earlier?

	near("name", "purple", 10)

This fits into the XQL grammar without modification, it's just a matter of
introducing another function.
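
To make the function-based form concrete, here is a sketch in Python of
what a word-level near() might compute over the text of a single node.
The semantics are an assumption (unordered, case-insensitive, distance
counted in word positions), not something XQL defines, and wildcards
such as "rose*" are not handled:

```python
import re


def near(text, first, second, max_distance):
    """True if `first` and `second` occur within max_distance words of each other."""
    words = re.findall(r"\w+", text.lower())
    first_at = [i for i, w in enumerate(words) if w == first.lower()]
    second_at = [i for i, w in enumerate(words) if w == second.lower()]
    return any(abs(i - j) <= max_distance for i in first_at for j in second_at)
```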

Jonathan
 
jonathan@texcel.no
Texcel Research
http://www.texcel.no



From macherius at darmstadt.gmd.de  Mon Mar 29 20:18:12 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:10:47 2004
Subject: ANNOUNCE: XQL processor in Java
Message-ID: <199903291816.UAA24037@sonne.darmstadt.gmd.de>

GMD-IPSI is pleased to announce Java based implementations
of the XQL language and a persistent W3C-DOM.

The GMD-IPSI XQL engine [1] is a Java based storage and query
application for large XML documents. The functionality may
be accessed via command line invocation or the Java API.
The engine consists of two main parts:

1. A persistent implementation of the W3C-DOM
2. A full implementation of the XQL language

The XQL engine implements the W3C-QL '98 workshop paper syntax
of XQL. It uses a novel indexing algorithm for XML (publication
pending), which indexes the document while processing the
first query. Subsequent queries to the same document are
considerably accelerated.

The persistent DOM implements the W3C-DOM interfaces on
indexed, binary XML files. Documents are parsed once and
are stored in this form, accessible to DOM calls without
the overhead of parsing them first. A cache architecture
additionally increases performance. At this time only read
access is possible, support of the full W3C-DOM API is work
in progress.

The GMD-IPSI XQL engine was developed as a research project
in GMD's XML competence center by Gerald Huck [2], with
contributions by Ingo Macherius [3]. It is free for non-commercial
use and evaluation, see the download page for details.
For commercial requests contact the main author.

[1] http://xml.darmstadt.gmd.de/xql/
[2] mailto:huck@gmd.de
[3] mailto:macherius@gmd.de

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From david at megginson.com  Mon Mar 29 20:35:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:47 2004
Subject: half-baked parsers vs binary XML
In-Reply-To: <36FFBF61.5307A689@jfinity.com>
References: <36FD4FA4.26BCB466@jfinity.com>
	<14077.33404.430088.361367@localhost.localdomain>
	<36FD95F9.7E93A231@jfinity.com>
	<14077.38830.801250.747754@localhost.localdomain>
	<36FDA19A.27DA7B45@jfinity.com>
	<14079.25158.877601.734891@localhost.localdomain>
	<36FFBF61.5307A689@jfinity.com>
Message-ID: <14079.51018.128742.97025@localhost.localdomain>

Gabe Beged-Dov writes:

 > > The XML/SGML processing model is generally to walk through a document
 > > (as a collection of events or as a tree) and fire off handlers for
 > > different types of things.  Even a short to medium-length XML document
 > > can cause the handlers to be fired off many thousands of times, and if
 > > you're trying to handle hundreds of requests per second, that's going
 > > to cause problems with or without XML.
 > 
 > Are we talking about throughput or responsiveness?  It would be
 > useful to bring up some use-cases where XML processing can't be
 > employed using the default handler firing model and try to
 > understand what the alternatives are.

I'm talking about throughput -- using a persistent interpreter (like
mod_perl) rather than a CGI can solve most of the responsiveness
problems.

The difficulty is just that firing off so much Perl code is (in Perl's
current design) slow.  The original posting suggested using a binary
format because parsing XML with Expat is slow, but in fact, Expat and
the actual XML parsing turn out not to be a bottleneck.
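
One way to see how many handler calls a document generates is simply to
count them.  The sketch below uses Python's xml.sax as a stand-in for
the Perl setup discussed here, so it says nothing about Perl's actual
timings; EventCounter and count_handler_calls are names invented for
the example:

```python
import xml.sax


class EventCounter(xml.sax.ContentHandler):
    """Count every start-tag, end-tag and character-data callback."""

    def __init__(self):
        super().__init__()
        self.calls = 0

    def startElement(self, name, attrs):
        self.calls += 1

    def endElement(self, name):
        self.calls += 1

    def characters(self, content):
        self.calls += 1


def count_handler_calls(xml_bytes):
    """Parse the document and return how many callbacks were fired."""
    counter = EventCounter()
    xml.sax.parseString(xml_bytes, counter)
    return counter.calls
```

A document with a thousand small elements fires at least three thousand
callbacks, each of which runs interpreted code.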


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From fmclain at cdgpd.com  Mon Mar 29 20:38:19 1999
From: fmclain at cdgpd.com (Fred McLain)
Date: Mon Jun  7 17:10:47 2004
Subject: (waaay off topic) RE: LISTADMIN: No attachments to list mess 
	ages PLEASE
Message-ID: <5FFEC1B73A7BD1119D56006008C369F30ED3DE@rainier.cdgpd.com>

Tim,

I can understand the desire to point fingers over this.  I'm fairly well
versed in security matters and I don't see why I would have been alerted to
this virus.  The mail I read came from a trusted source - our product
manager, and was from internal e-mail, not the internet.  Under those
circumstances it seemed appropriate for me to open the e-mail attachment.
As I'm sure you are aware, once the macro was running I had no control over
it resending itself to this list.

Personally I feel the fault was with MS Word and MS Outlook.  If these
programs did not allow a macro program over e-mail to control both Outlook
and Word then this could not have happened.  Furthermore the only alert you
get when a potentially dangerous macro is being run by Word is the macro
warning message; Outlook doesn't even bother to warn about embedded macros
in Word documents.  The macro warning message is one I see every time I
create a new document and a great many times when I read one.  If you cry
wolf often enough, you get ignored.

	-Fred-

-------------------------------------
Fred McLain, Senior Technical Advisor
Continental DataGraphics, Bellevue WA
-------------------------------------


-----Original Message-----
From: Tim McCune [mailto:timm@channelpoint.com]
Sent: Monday, March 29, 1999 8:20 AM
To: 'David Megginson'; 'XML Developers' List'
Subject: OFF: (waaay off topic) RE: LISTADMIN: No attachments to list
mess ages PLEASE


Damned eloquent David.  But I'd put the poster at #1 on that list for being
ignorant enough to open a Word document that was attached to an e-mail
message.  Your comment about technical diversity indicates to me that you've
never been a system administrator. ;)



From roddey at us.ibm.com  Mon Mar 29 20:53:49 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:47 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <87256743.0062192B.00@d53mta03h.boulder.ibm.com>


>Imagine that you have all the features of XML: structure, flexibility,
>common format for interchange, but that you perform zero processing steps
>to import or export the 'document' from a program.  (Actually, I'm thinking
>this would be done in chunks, but essentially very few reads and writes.)
>

Actually, to be fair, there would be a somewhat non-trivial amount of bit
fiddlin' to get it out of whatever canonical binary format you put it in,
into the local byte order, floating point representation, byte boundary
alignment, etc... Though hopefully that couldn't be any worse than parsing
:-)
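
For a sense of what that bit fiddling looks like, here is a sketch
using Python's struct module.  The record layout (a big-endian 32-bit
id followed by a 64-bit float, with no padding) is an assumption
invented for the example, not any proposed binary XML format:

```python
import struct

# Hypothetical canonical layout: ">" means big-endian with no alignment
# padding, "I" is an unsigned 32-bit integer, "d" is a 64-bit float.
CANONICAL = ">Id"


def encode_record(record_id, value):
    """Pack a record into the canonical wire layout (always 12 bytes)."""
    return struct.pack(CANONICAL, record_id, value)


def decode_record(buf):
    """Unpack a canonical record into native Python values."""
    return struct.unpack(CANONICAL, buf)
```

The native layout for the same two fields (format "@Id") may have a
different byte order and padding on a given machine, which is why the
buffer cannot simply be mapped into memory and used as-is.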




From daniela at cnet.com  Mon Mar 29 21:10:31 1999
From: daniela at cnet.com (Daniel Austin)
Date: Mon Jun  7 17:10:47 2004
Subject: xhtml and the p tag
Message-ID: <77A952A6B467D211855D00805F9521F114938A@cnet10.cnet.com>

Mark,

	This has not changed from HTML 4.0. All of your paragraphs in XHTML
documents should be enclosed with <p>...</p> element delimiters. Since the
<p> element is itself a block level element, it cannot itself contain any
block level elements. The construction <p/> has to my knowledge never been
acceptable markup in either HTML or XHTML documents.

Regards,

D-

(speaking for myself, rather than any working group or corporation)

> -----Original Message-----
> From: Mark D. Anderson [mailto:mda@discerning.com]
> Sent: Sunday, March 28, 1999 3:50 PM
> To: XML List
> Cc: dsr@w3.org
> Subject: xhtml and the p tag
> 
> 
> (not sure where xhtml discussion should go; all i see on the 
> www-html@w3.org
> list are stultifying discussions about tag case-sensitivity.)
> 
> in the strict dtd from http://www.w3.org/TR/WD-html-in-xml/ ,
> the p element is %Inline, which means it can't include any
> block level elements such as ul. So now we have a quandary.
> 
> in practical terms, what i usually want is something 
> like a non-existent <parabreak/>. That doesn't exist, because
> in browsers a <br/> breaks the line; it doesn't end the
> paragraph, and particularly now with xhtml, <p/> is
> deprecated. But even if I *did* the extra work to wrap
> <p> ... </p> around my paragraphs, that still wouldn't
> work, because a p can't enclose any block level elements
> such as a ul.
> 
> -mda
> 
> 
> 
> 



From jonathan at texcel.no  Mon Mar 29 21:11:55 1999
From: jonathan at texcel.no (Jonathan Robie)
Date: Mon Jun  7 17:10:47 2004
Subject: ANNOUNCE: XQL processor in Java
In-Reply-To: <199903291816.UAA24037@sonne.darmstadt.gmd.de>
Message-ID: <3.0.3.32.19990329141022.00cec100@pop.mindspring.com>

At 08:21 PM 3/29/99 +0200, Ingo Macherius wrote:

>GMD-IPSI is pleased to announce Java based implementations
>of the XQL language and a persistent W3C-DOM.
 
Cool! I'm delighted.

Jonathan
--
Jonathan Robie
R&D Fellow, Software AG
jonathan.robie@sagus.com <- this address will be active Monday



From daniela at cnet.com  Mon Mar 29 21:22:20 1999
From: daniela at cnet.com (Daniel Austin)
Date: Mon Jun  7 17:10:48 2004
Subject: Namespace Question
Message-ID: <77A952A6B467D211855D00805F9521F114938B@cnet10.cnet.com>

Hi,

	Your question and example are seemingly incomplete; what kind of
document is this with which you are attempting to use XML Namespaces? If it
is intended to be an XHTML document, it needs an XML PI like so:
<?xml version="1.0" ?>. I'm asking not to be a smartass but because it is
hard to answer your question otherwise.

If your document is intended to be HTML 4.0, as one of your xmlns attribute
values suggests, then using XML Namespaces is totally inappropriate; HTML
4.0 documents are not XML documents, and XML Namespaces cannot be used in
any way. If the example is an XHTML 1.0 document (I'm assuming that it is)
then the answer to your question is this: an element with an appropriate
Namespaces prefix does not need to have the prefix attached to each of its
attributes, because the scope for that element is defined by the element
name prefix. If you use an attribute from a Namespace that differs from the
Namespace of the element on which the attribute appears, then you must
prefix it properly.
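
To see how a namespace-aware parser reports the two attributes from the
example, here is a small sketch using Python's xml.etree.ElementTree,
which writes qualified names as "{namespace-uri}localname".  The
document is cut down from the one in the question; attribute_names is
a name made up for the illustration.  Note that under the Namespaces
recommendation an unprefixed attribute is in no namespace at all, so
the parser reports it by its bare name:

```python
import xml.etree.ElementTree as ET

DOC = """<html xmlns="http://www.w3.org/html4"
      xmlns:b="http://www.my.server.de/book">
  <b:title b:read="yup" class="important">Dream a little dream of me</b:title>
</html>"""


def attribute_names(xml_string):
    """Return the first child's qualified tag and its sorted attribute names."""
    root = ET.fromstring(xml_string)
    title = root[0]
    return title.tag, sorted(title.attrib)
```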

Hope this helps,

Regards,

D-

> -----Original Message-----
> From: hb@ix.heise.de [mailto:hb@ix.heise.de]
> Sent: Monday, March 29, 1999 3:24 AM
> To: xml-dev@ic.ac.uk; hb@ix.heise.de
> Subject: Namespace Question
> 
> 
> Hi,
> 
> For a short example regarding namespaces I have used a 
> variant of Tim's
> example in his XML.com article. 
> 
> Is it necessary (as I presume) to assign every single 
> attribute as long
> as it is not from HTML?
> 
> <html xmlns="http://www.w3.org/html4"
>      xmlns:b="http://www.my.server.de/book"
>      xmlns:p="http://www.my.server.de/person">
>  <head><title>My Booklist</title></head>
>  <body>
>   <table>
>     <tr><td>
> <!-- these are the two lines where the attributes in question are: -->
>           <b:title b:read="yup" 
>              class="important">Dream a little dream of 
> me</b:title></td>
>       <td><b:author id="sfreud">
>           <p:title>Dr.</p:title>
>           <b:firstname>Sigmund</b:firstname>
>           <b:surname>Freud</b:surname></b:author></td></tr>
>   </table>
>  </body>
> </html>
> 
> Best regards,
> 
> Henning Behme
> 
> iX - Magazin fuer professionelle Informationstechnik
> Helstorfer Str. 7 * 30625 Hannover * Germany  
> http://www.heise.de/ix/ * +49 511 5352-374 * f: -361
> ------ White, adj. and n. Black  (Ambrose Bierce) ------
> 


From roddey at us.ibm.com  Mon Mar 29 22:31:28 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:48 2004
Subject: A Simple Thought
Message-ID: <87256743.006AF60F.00@d53mta03h.boulder.ibm.com>


>Ahh, there's the trick.  I believe I have most of a design for a data
>structure that is fast in memory yet is 'flat' and can have its chunks just
>written out or read in at any point.  It builds on some very old ideas I
>came up with for a language I designed.  When viewed as an interchange
>format, it may not be the most optimal space wise (although it should be
>better than XML text) but trades a small amount of space for nearly zero
>processing overhead.  There will probably also be a procedure for
>'compacting' an object for storage into a database or sending over a slow
>link vs. the 'fast' format usable between servers in a cluster.
>

I think though that this would only hold up as long as you are looking at
XML data as a read-only data source. Once you started doing significant
editing of the data, having a flat structure like that would be more of a
hindrance than a help, would it not? What if I have a 10MB flat buffer and
want to add another child to the second element? This kind of gets into the
quandary that you've nailed in one nail, but now it's even harder to nail a
whole raft of others as well as you could with the more general purpose mechanisms.

I dunno, if I were thinking along these lines, to keep it reasonably
portable, I'd look at the binary format as a fast serialization mechanism
and at least create native language objects for each one. By the time you
put enough stream format markers and whatnot into the stream to know where
things are, and interpret those during runtime, it might be just as fast to
pay the cost for creating a much more flexible, native object format for in
memory manipulation.
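
The editing cost can be sketched in a few lines of Python.  This treats
the flat buffer as a plain byte string, which is an assumption for
illustration only and not the layout described in the quoted design;
the point is just that an insert near the front moves almost the whole
buffer:

```python
def insert_child(flat, offset, child):
    """Insert `child` at `offset`; return the new buffer and the bytes moved."""
    moved = len(flat) - offset  # everything after the insert point shifts
    return flat[:offset] + child + flat[offset:], moved
```

With a 10MB buffer and an insert point inside the second element, the
number of bytes moved is close to 10MB, whereas a tree of native objects
only needs one new node and a pointer update.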




From tomh at thinlink.com  Mon Mar 29 23:17:54 1999
From: tomh at thinlink.com (Tom Harding)
Date: Mon Jun  7 17:10:48 2004
Subject: Extensible Protocol implementation in Java
Message-ID: <36FFEDC5.64D49FB4@thinlink.com>


I have written a free Java implementation of Extensible Protocol, a
pure-XML protocol for sending and receiving XML documents on a
persistent connection.  Interested folks can find out more at

http://www.thinlink.com/xp

Comments are most welcome.

Tom Harding




From chris at w3.org  Mon Mar 29 23:50:44 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:48 2004
Subject: xhtml and the p tag
References: <77A952A6B467D211855D00805F9521F114938A@cnet10.cnet.com>
Message-ID: <36FFF4B5.D850C401@w3.org>


Daniel Austin wrote:
> 
> Mark,
> 
>         This has not changed from HTML 4.0.

Or 3.2 or 2.0

> All of your paragraphs in XHTML
> documents should be enclosed with <p>...</p>
> element delimiters. 

Yes. Explicitly.

In theory, they were enclosed in them implicitly with HTML <=4.0
through the magic of SGML omissible end tags. In practice, though,
browsers did not correctly infer missing end tags (or indeed omitted
start tags) thus leading to the well known disparities in HTML "parsing"
which became abundantly obvious with the rise in use of CSS and DOM
(both of which require a parse tree, preferably the correct parse tree).

> Since the <p> element is itself a block level element,
> it cannot itself contain any block level elements.

Like body, ul, ol, dl and div ? These are block level elements and can
contain other block level elements.

> The construction <p/> has to my knowledge never been acceptable markup in
> either HTML or XHTML documents.

True. 

In an XML document instance which used HTML element names, <p/> would be
quite fine for an empty paragraph, in a well formed document.

XHTML warns against using it only if you are trying to fool existing HTML
browsers into accepting your XML as if it were HTML.
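
The well-formedness point is easy to check with any XML parser.  A
small sketch in Python follows; is_well_formed is a helper name made up
here, and it tests well-formedness only, not validity against the XHTML
DTD:

```python
import xml.etree.ElementTree as ET


def is_well_formed(markup):
    """True if the markup parses as a well-formed XML document."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True
```

An empty <p/> passes this check, while HTML-style paragraphs with
omitted end tags do not.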

--
Chris



From roddey at us.ibm.com  Tue Mar 30 00:54:53 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:48 2004
Subject: SAX2: Proposed alternative DTD interface
Message-ID: <87256743.0078341A.00@d53mta03h.boulder.ibm.com>


>Here's another alternative for SAX2: forget about trying to report DTD
>declarations as events, and simply make the whole DTD available
>through an interface with a Parser2.get() call.
>
>I threw together a quick (read-only) DTD interface this morning, and
>uploaded it to the following location
>

But what would you use for the form of the DTD? It's almost certainly not
going to be stored in that way internally in the parser's pools, i.e. it
would most likely be much more optimized (or even just different for
whatever reasons.) So you would either have to totally translate all of
that into some instance of your DTD class, or you would have to make the
DTD object just a call through to get the data from the parser. However,
the latter scheme has problems if you want to reuse the parser instance
because now you've tied an instance of the DTD access object to an instance
of the parser and you cannot reuse the parser without frying the DTD access
object (and you have no idea how long people might want to hang onto
that info.)

The same issue kind of happens with any DOM DTD access that might happen
down the road. If the DOM stores the element/entity/etc... stuff in its own
form it's going to be redundant since that data is already in the parser.
However, the DOM implementation doesn't want to be tied to any particular
parser implementation really so you kind of have to store it redundantly to
avoid other issues.

If you are going to store it in some other format, and that is done at a
SAX-like level, then you still need event APIs to come out of the parser to
fill in the SAX DTD object that you are going to give back, right?
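
The trade-off described above can be sketched in a few lines. Everything here
is hypothetical: ElementDeclListener, DtdSnapshot and ToyParser are invented
names standing in for a declaration event callback, a parser-independent copy
of the DTD, and a parser's internal pools. It is a sketch of the idea, not a
proposal for the actual SAX2 interface.

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical event callback: invoked once per element declaration.
interface ElementDeclListener {
    void elementDecl(String name, String contentModel);
}

// Parser-independent copy of the declarations. It owns its data, so it
// stays valid after the parser that produced the events is reset.
class DtdSnapshot implements ElementDeclListener {
    private final Map<String, String> models = new LinkedHashMap<>();

    public void elementDecl(String name, String contentModel) {
        models.put(name, contentModel);
    }

    String contentModel(String name) {
        return models.get(name);
    }

    Map<String, String> elements() {
        return Collections.unmodifiableMap(models);
    }
}

// Stand-in for a real parser and its internal declaration pool.
class ToyParser {
    private final Map<String, String> pool = new LinkedHashMap<>();

    // Stores the declaration internally and reports it as an event.
    void declare(String name, String model, ElementDeclListener listener) {
        pool.put(name, model);
        listener.elementDecl(name, model);
    }

    // "Call-through" access: only meaningful until the next reset().
    String lookup(String name) {
        return pool.get(name);
    }

    // Reusing the parser discards its pools.
    void reset() {
        pool.clear();
    }
}
```

After declare("p", "(#PCDATA)", snapshot) followed by reset(), lookup("p")
returns null while snapshot.contentModel("p") still returns "(#PCDATA)". The
cost is the redundant copy, and the copy is filled by exactly the kind of
event API the message says is still needed.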

Hopefully this is a coherent response. I got multiply deeply nested
interrupts while trying to write it.




From roddey at us.ibm.com  Tue Mar 30 01:13:54 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:10:48 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <87256743.0079FE91.00@d53mta03h.boulder.ibm.com>


>> This creates four mnemonic constants you want and gives them a checkable
>> type.  New constants can't be created because of the private constructor.
>> And there's no chance that anybody's going to write code like
>>
>>   if (getAttributeStatus() == 1) {
>>    doSomething();
>>   }
>>
>> Programmers are more or less forced to use the constants. What do you
>> think?
>
>I personally take a very dim view of systems trying to "force" programmers
>into intrinsically good practices.  Programmers can abuse any system you
>present, and at some point you have to accept that they are adults, and must
>be free to cut off their own noses if they wish.
>
>The good programming practice of replacing "magic numbers" with descriptive
>constants is even older than the structured programming movement, and any
>programmer who writes
>
>

But that's not really the point, I don't think. The point isn't "if you are
as macho a programmer as me you don't need any help". The point is that we
work in a commercial environment and every single semantic that can be
expressed in the code itself, so that the compiler can tell you when you
break it, is a Very Goode Thinge.

It does no good at all to have a named constant if you can accidentally
pass that named constant to 150 other things for which it's not intended and
the compiler cannot catch it. It's a fundamental lack in Java that makes
me shudder to think that people actually want to do serious work in it.
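
For readers who have not seen the pattern under discussion, here is a minimal
sketch of constants with a checkable type and a private constructor. The
class names and the four constant names are assumptions made for
illustration; only getAttributeStatus() appears in the quoted message.

```java
// Hypothetical type-safe constant class; the four names are illustrative.
final class AttributeStatus {
    static final AttributeStatus IMPLIED   = new AttributeStatus("IMPLIED");
    static final AttributeStatus REQUIRED  = new AttributeStatus("REQUIRED");
    static final AttributeStatus FIXED     = new AttributeStatus("FIXED");
    static final AttributeStatus DEFAULTED = new AttributeStatus("DEFAULTED");

    private final String name;

    // Private constructor: no code outside this class can mint new values.
    private AttributeStatus(String name) {
        this.name = name;
    }

    public String toString() {
        return name;
    }
}

// Hypothetical holder whose accessor returns the typed constant.
class AttributeDecl {
    private final AttributeStatus status;

    AttributeDecl(AttributeStatus status) {
        this.status = status;
    }

    AttributeStatus getAttributeStatus() {
        return status;
    }

    boolean mustAppear() {
        // Identity comparison is safe because the instances are unique.
        return status == AttributeStatus.REQUIRED;
    }
}
```

With this shape, both "new AttributeDecl(1)" and
"decl.getAttributeStatus() == 1" are rejected by the compiler, which is the
checking the paragraph above is asking for.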




From sdw at lig.net  Tue Mar 30 01:40:50 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:48 2004
Subject: Is there anyone working on a binary version of XML?
References: <87256743.0062192B.00@d53mta03h.boulder.ibm.com>
Message-ID: <3700177B.AC5BFE07@lig.net>

roddey@us.ibm.com wrote:

> >Imagine that you have all the features of XML: structure, flexibility,
> >common format for interchange, but that you perform zero processing steps
> >to import or export the 'document' from a program.  (Actually, I'm thinking
> >this would be done in chunks, but essentially very few reads and writes.)
> >
>
> Actually, to be fair, there would be a somewhat non-trivial amount of bit
> fiddlin' to get it out of whatever canonical binary format you put it in,
> into the local byte order, floating point representation, byte boundary
> alignment, etc... Though hopefully that couldn't be any worse than parsing
> :-)

Not true, especially for Java....

If you read all my 'binary' related comments, I'm not talking about storing binary data (such
as IEEE doubles), but rather normal XML style text elements, attributes, and body in a
'binary' structure that gives container-like access and speed.  There might be some reason to
allow real binary data, but that's not really my priority.  You can flash convert real binary
to hex for instance very easily.
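
As a small illustration of the hex remark, the sketch below turns raw bytes
(including the eight bytes of an IEEE double) into hex text and back. The
class and method names are invented for the example; this is not the format
being designed in this thread.

```java
import java.nio.ByteBuffer;

// Hypothetical helper for carrying binary values as plain hex text.
class HexText {
    private static final char[] DIGITS = "0123456789abcdef".toCharArray();

    // Two hex digits per byte, high nibble first.
    static String toHex(byte[] data) {
        StringBuilder sb = new StringBuilder(data.length * 2);
        for (byte b : data) {
            sb.append(DIGITS[(b >> 4) & 0x0f]).append(DIGITS[b & 0x0f]);
        }
        return sb.toString();
    }

    // Inverse of toHex; assumes an even number of valid hex digits.
    static byte[] fromHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }

    // ByteBuffer writes in big-endian order by default.
    static String doubleToHex(double value) {
        return toHex(ByteBuffer.allocate(8).putDouble(value).array());
    }
}
```

For example, doubleToHex(1.0) yields "3ff0000000000000", which any text-based
container can carry without byte-order or alignment concerns.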

The byte order, etc. will be Java standard.  Shouldn't be too tough for C/C++, etc.

sdw


--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From sdw at lig.net  Tue Mar 30 01:55:46 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:48 2004
Subject: A Simple Thought
References: <87256743.006AF60F.00@d53mta03h.boulder.ibm.com>
Message-ID: <37001AF4.5C9428C9@lig.net>


roddey@us.ibm.com wrote:

> >Ahh, there's the trick.  I believe I have most of a design for a data
> >structure that is fast in memory yet is 'flat' and can have its chunks just
> >written out or read in at any point.  It builds on some very old ideas I came
> >up with for a language I designed.  When viewed as an interchange format, it
> >may not be the most optimal space wise (although it should be better than XML
> >text) but trades a small amount of space for nearly zero processing overhead.
> >There will probably also be a procedure for 'compacting' an object for
> >storage into a database or sending over a slow link vs. the 'fast' format
> >usable between servers in a cluster.
> >
>
> I think though that this would only hold up as long as you are looking at
> XML data as a read-only data source. Once you started doing significant
> editing of the data, having a flat structure like that would be more of a
> hindrance than a help, would it not? What if I have a 10MB flat buffer and
> want to add another child to the second element? This kind of gets into the
> quandary that you've nailed in one nail, but now it's even harder to nail a
> whole raft of others as well as with the more general purpose mechanisms.
>
> I dunno, if I were thinking along these lines, to keep it reasonably
> portable, I'd look at the binary format as a fast serialization mechanism
> and at least create native language objects for each one. By the time you
> put enough stream format markers and whatnot into the stream to know where
> things are, and interpret those during runtime, it might be just as fast to
> pay the cost for creating a much more flexible, native object format for in
> memory manipulation.

I have a way around the modification issue.  It's a data structure I call 'elastic memory'.
It's really the main reason that I'm going to have to start mostly from scratch.
I AM trying to hit a number of nails at once and it won't be easy and I'm not sure I can make
it perfect, however I believe I can get close.  I'm only worrying about Java at the moment
with some allowances for certain restrictions that come into play and typical usage in network
protocols.

There are a number of situations where serialization just doesn't cut it.  As I mention,
imagine serializing/deserializing on every method call in a program.

I designed some of these mechanisms MANY years ago (about 8 I think) while designing a
language after I'd already built a language based on Postscript syntax for a project.  In
testing the first language, I learned the horrors of 'malloc storms' that happen when you
follow a typical design paradigm.  My system which allowed a complex application to be
represented by meta data would do about 25000 mallocs in a standard run through the app.  A
Java web server app for a very complex app I just completed with a team does about 150,000
object creations (measured by forcing a garbage collection) in one run through the app.  It
works amazingly well, but still blows most of its processing on things that could be
avoided.

The cool thing is that I found a way to implement it in Java.

Thanks for sparring with me! ;-)
sdw


--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From alison at research.canon.com.au  Tue Mar 30 04:25:04 1999
From: alison at research.canon.com.au (Alison Lennon)
Date: Mon Jun  7 17:10:48 2004
Subject: Ampersand connector in XML
Message-ID: <370035F2.D604B132@research.canon.com.au>

Could someone please explain to me why the ampersand group connector
of SGML was not included in XML. 

It seems to me that the absence of this connector results in
significant problems for many applications based on XML that want to
use unordered lists of elements.

Cheers,
Alison
-- 
Alison Lennon, Senior Research Engineer
Canon Information Systems Research Australia Pty Ltd (CISRA),
1 Thomas Holt Drive,North Ryde,Sydney, NSW 2113.
Ph +61-2-9805-2931, Fax +61-2-9805-2929



From andrewl at microsoft.com  Tue Mar 30 04:26:46 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:10:48 2004
Subject: XML to Text questions
Message-ID: <5BF896CAFE8DD111812400805F1991F708AAF212@RED-MSG-08>

Q: "What tools are available for translation of XML into text?"

A:  Take a look at XSL.  Information on this and other XML-related
activities can be found at http://www.w3.org/XML/Activity.html.




From tbray at textuality.com  Tue Mar 30 04:32:48 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:48 2004
Subject: Ampersand connector in XML
Message-ID: <3.0.32.19990329183236.00c2bb80@pop.intergate.bc.ca>

At 12:24 PM 3/30/99 +1000, Alison Lennon wrote:
>Could someone please explain to me why the ampersand group connector
>of SGML was not included in XML. 
>
>It seems to me that the absence of this connector results in
>significant problems for many applications based on XML that want to
>use unordered lists of elements.

Simply because it's a lot harder to implement than all the other
content model apparatus.  In fact, back in SGML days, it was well-known
to be buggy in several rather good and successful SGML products.  Yes,
its absence does represent a loss in expressive power.  At the time,
it seemed like a good trade-off.  To me it still does, although it
has particularly irked the (large and growing number of) people who
want to use XML to model relational semantics. -Tim
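
To make the trade-off concrete: without the ampersand connector, an "all of
these, in any order" group has to be spelled out as alternatives, and the
number of orderings grows as n factorial. The sketch below generates such a
model in a factored form, so that each alternative starts with a different
element name. UnorderedModel and its methods are invented for illustration,
and the element names are assumed to be distinct.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical generator for an XML content model equivalent to "a & b & ...".
class UnorderedModel {
    // Returns a model that accepts every name exactly once, in any order.
    static String anyOrder(List<String> names) {
        if (names.size() == 1) {
            return names.get(0);
        }
        List<String> branches = new ArrayList<>();
        for (int i = 0; i < names.size(); i++) {
            // Pick one name to come first, then recurse on the remainder.
            List<String> rest = new ArrayList<>(names);
            String first = rest.remove(i);
            branches.add("(" + first + "," + anyOrder(rest) + ")");
        }
        return "(" + String.join("|", branches) + ")";
    }

    // Wraps the model in an element declaration.
    static String elementDecl(String element, List<String> children) {
        String model = anyOrder(children);
        if (children.size() == 1) {
            model = "(" + model + ")"; // a lone name still needs parentheses
        }
        return "<!ELEMENT " + element + " " + model + ">";
    }
}
```

For two children, elementDecl("address", Arrays.asList("street", "city"))
gives "<!ELEMENT address ((street,city)|(city,street))>". Three children
already need six orderings, which is why this only helps for small groups.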



From ricko at allette.com.au  Tue Mar 30 04:42:21 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:48 2004
Subject: xhtml and the p tag
Message-ID: <001601be7a4e$df448500$1bf96d8c@NT.JELLIFFE.COM.AU>

 From: Mark D. Anderson <mda@discerning.com>

>>In HTML 4.0 a paragraph can only contain an inline element, just the same
>>as in XHTML
>
>Right; html is broken too. But I've already given up on html :).
>
>I'm still curious what a better content model for paragraphs
>would be. (Sorry, i suppose this topic belongs somewhere else
>even if xhtml-related; suggestions are welcome.)

The distinction between text-blocks and rhetorical paragraphs is the
oldest problem in markup.

It is sad that HTML calls visual text blocks paragraphs, but should not
be surprising or particularly troubling (it will just make it impossible
to detect rhetorical paragraphs programmatically from HTML documents.)

The best solution is to wrap the real paragraphs in a div element:
 <div class="para">
    <p>...</p>
    <ul>...</ul>
    <p class="paracont">...</p>
 </div>
where paracont means "paragraph continuation".

Now that we have XSL, it is probably a good thing if HTML errs on the
side of being display-structure oriented. That is why the ruby draft
(which is all wrong if we want HTML to support logical markup) is
probably appropriate for HTML now.

Actually, there is even a higher level of paragraph, the "paragraph
group", which is found in some kinds of documents (military and
technical), which is where a paragraph grows a number, a heading (often
inline), footnotes and even metadata. (The reasons for this, you can
find in my book, The XML and SGML Cookbook: basically the idea is that
when an information block is self-contained or extractable, it naturally
becomes a microdocument, getting all the accoutrements of a document--a
head and body, title, etc.)

Rick Jelliffe




From david at megginson.com  Tue Mar 30 04:45:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:48 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <87256743.0079FE91.00@d53mta03h.boulder.ibm.com>
References: <87256743.0079FE91.00@d53mta03h.boulder.ibm.com>
Message-ID: <14080.14974.747531.703294@localhost.localdomain>

roddey@us.ibm.com writes:

[on using type-safe objects rather than integers as Java constants]

 > It does no good at all to have a named constant if you can
 > accidentally pass that named constant to 150 other things for which
 > it's not intended and the compiler cannot catch it. It's a
 > fundamental lack in Java that makes me shudder to think that
 > people actually want to do serious work in it.

Yes, but it's also no good having a named constant that you cannot use
in a switch statement.  Unfortunately, Java is broken here, and you
have to choose one side or another.
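
The two sides of that choice can be put next to each other. This is a sketch
with invented names (Kind, Dispatch), not code from any SAX draft: integer
constants work as case labels but accept any int, while object constants are
checked by the compiler but need an if/else chain. Later Java releases added
an enum construct that can be used in switch, which postdates this
discussion.

```java
// Hypothetical type-safe constants: checked by the compiler.
final class Kind {
    static final Kind ELEMENT = new Kind("element");
    static final Kind ATTLIST = new Kind("attlist");

    private final String label;

    private Kind(String label) {
        this.label = label;
    }

    public String toString() {
        return label;
    }
}

class Dispatch {
    // Plain integer constants: legal as case labels.
    static final int ELEMENT = 1;
    static final int ATTLIST = 2;

    // Works in a switch, but nothing stops a caller from passing 150.
    static String describe(int kind) {
        switch (kind) {
            case ELEMENT: return "element declaration";
            case ATTLIST: return "attribute-list declaration";
            default:      return "unknown";
        }
    }

    // Only Kind values (or null) get past the compiler, but object
    // references cannot be case labels, so this is an if/else chain.
    static String describe(Kind kind) {
        if (kind == Kind.ELEMENT) return "element declaration";
        if (kind == Kind.ATTLIST) return "attribute-list declaration";
        return "unknown";
    }
}
```

Dispatch.describe(150) compiles and quietly returns "unknown", whereas there
is no way to construct a stray Kind to pass to the second overload.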


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From ricko at allette.com.au  Tue Mar 30 04:45:54 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:48 2004
Subject: XHTML and character entities
Message-ID: <001901be7a4f$50461d90$1bf96d8c@NT.JELLIFFE.COM.AU>


From: Gabe Beged-Dov <begeddov@jfinity.com>

>I mention tidy below but am asking about html->xhtml conversion in
>general.
>
>I use tidy to convert html to xhtml using the -asxml switch. The
>result of many conversions is still not accepted as well-formed because
>entities like agrave and friends aren't defined unless you process the
>DTD.
>
>Wouldn't it be reasonable to convert these to character entities as part
>of the html->xhtml process?

With tidy, you have to be a little creative with the switches. For
example, to process Big5 text, we have to use "-latin1".

Certainly it is the expectation of some people that the entities for
special characters will disappear with XML, that people will use NCRs.
I am not sure about it.
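
For what such a conversion step might look like, here is a minimal sketch
that rewrites a handful of named HTML entities as numeric character
references. It is not how tidy works; EntityToNcr is an invented name, and
the table holds only four entries where a real converter would need the full
HTML entity sets.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical post-processing step: named entity -> numeric reference.
class EntityToNcr {
    // Deliberately tiny table; the code points are the standard ones.
    private static final Map<String, Integer> CODE_POINTS = new HashMap<>();
    static {
        CODE_POINTS.put("nbsp", 160);
        CODE_POINTS.put("copy", 169);
        CODE_POINTS.put("agrave", 224);
        CODE_POINTS.put("eacute", 233);
    }

    private static final Pattern ENTITY =
        Pattern.compile("&([A-Za-z][A-Za-z0-9]*);");

    static String convert(String text) {
        Matcher m = ENTITY.matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            Integer codePoint = CODE_POINTS.get(m.group(1));
            // Names not in the table (including XML's own amp, lt, gt,
            // quot and apos) are left exactly as they were.
            String replacement =
                (codePoint == null) ? m.group() : "&#" + codePoint + ";";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }
}
```

With this table, convert("voil&agrave; &amp; caf&eacute;") produces
"voil&#224; &amp; caf&#233;", which a non-validating parser can read without
fetching the DTD.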

Rick Jelliffe




From alison at research.canon.com.au  Tue Mar 30 04:47:05 1999
From: alison at research.canon.com.au (Alison Lennon)
Date: Mon Jun  7 17:10:48 2004
Subject: Ampersand connector in XML
References: <3.0.32.19990329183236.00c2bb80@pop.intergate.bc.ca>
Message-ID: <37003AE2.7B13728D@research.canon.com.au>

Tim Bray wrote:
> 
> At 12:24 PM 3/30/99 +1000, Alison Lennon wrote:
> >Could someone please explain to me why the ampersand group connector
> >of SGML was not included in XML.
> >
> >It seems to me that the absence of this connector results in
> >significant problems for many applications based on XML that want to
> >use unordered lists of elements.
> 
> Simply because it's a lot harder to implement than all the other
> content model apparatus.  In fact, back in SGML days, it was well-known
> to be buggy in several rather good and successful SGML products.  Yes,
> its absence does represent a loss in expressive power.  At the time,
> it seemed like a good trade-off.  To me it still does, although it
> has particularly irked the (large and growing number of) people who
> want to use XML to model relational semantics. -Tim

Is it likely to be included in later versions of XML? In other words,
what are the options for applications which need to use unordered
lists - SGML?

Alison
-- 
Alison Lennon, Senior Research Engineer
Canon Information Systems Research Australia Pty Ltd (CISRA),
1 Thomas Holt Drive,North Ryde,Sydney, NSW 2113.
Ph +61-2-9805-2931, Fax +61-2-9805-2929



From tbray at textuality.com  Tue Mar 30 04:54:43 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:10:48 2004
Subject: Ampersand connector in XML
Message-ID: <3.0.32.19990329185409.00c2f630@pop.intergate.bc.ca>

At 12:45 PM 3/30/99 +1000, Alison Lennon wrote:
>Is it likely to be included in later versions of XML? 

Not impossible.  There are some people on the schema group who'd like
to bring it back.  But by no means a sure thing.

>In other words,
>what are the options for applications which need to use unordered
>lists - SGML?

Yep.  Or write your own code to validate the unordered-list elements;
since any nontrivial business application is going to need some extra
validation logic past what the DTD can do anyhow, this is probably
not too burdensome.
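
As a rough idea of how little code that extra validation takes, here is a
sketch using the DOM. The DTD would declare the children loosely, for example
as (item|qty)*, and the application would then check that each required child
occurs exactly once in any order. UnorderedCheck and its method names are
invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical application-level check for "each child once, any order".
class UnorderedCheck {
    static boolean hasEachOnce(Element parent, String... required) {
        // Collect the names of the element children, ignoring text nodes.
        List<String> seen = new ArrayList<>();
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                seen.add(n.getNodeName());
            }
        }
        // Comparing sorted lists ignores order but catches extras,
        // duplicates and omissions.
        List<String> expected = new ArrayList<>(Arrays.asList(required));
        Collections.sort(seen);
        Collections.sort(expected);
        return seen.equals(expected);
    }

    // Convenience for the example: parse a string and return its root.
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("cannot parse: " + e.getMessage(), e);
        }
    }
}
```

hasEachOnce(parseRoot("<order><qty/><item/></order>"), "item", "qty") is
true, while a missing or repeated child makes it false.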

Another approach is to generate your documents in such a way that
you sort the unordered-list elements by any old criterion at all,
so that they become ordered-list elements; then use a simpler
content model. -Tim
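
The sorting approach can be sketched on the generating side. The example
below emits children in alphabetical order of element name, so that an
ordinary sequence content model describes the output. SortedWriter is an
invented name, and alphabetical order is just one "any old criterion".

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical writer that always emits children in a canonical order.
class SortedWriter {
    // Serializes name/value pairs as child elements, sorted by name.
    static String write(String element, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder("<" + element + ">");
        // TreeMap iterates its keys in sorted order.
        for (Map.Entry<String, String> e
                : new TreeMap<String, String>(fields).entrySet()) {
            sb.append('<').append(e.getKey()).append('>')
              .append(escape(e.getValue()))
              .append("</").append(e.getKey()).append('>');
        }
        return sb.append("</").append(element).append('>').toString();
    }

    // The matching sequence content model for the same set of names.
    static String contentModel(Map<String, String> fields) {
        return "(" + String.join(",", new TreeMap<String, String>(fields).keySet()) + ")";
    }

    // Minimal escaping for element content.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;");
    }
}
```

Whatever order the application holds its street and city fields in, the
output is always <address><city>...</city><street>...</street></address>,
which the plain model (city,street) validates.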



From ricko at allette.com.au  Tue Mar 30 05:01:57 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:49 2004
Subject: A Line in the Declarative Syntax Sand(Was: XML complexity, namespaces (was WG))
Message-ID: <003201be7a51$8f8c0350$1bf96d8c@NT.JELLIFFE.COM.AU>

From: David Megginson <david@megginson.com>

>What I did say is that there's not a practical difference among the
>different alternatives in XML and SGML for expressing this
>information, and probably not enough to justify the parallel
>maintenance of the two as discrete standards.

I don't agree because

1) XML is not a standard, because W3C is not an open process
but a friendly conspiracy of vendors and boffins who must kowtow
to Microsoft and TBL (not to say that these are not excellent
activities).

2) XML and SGML have fundamentally different application areas driving
them:

* SGML is a compiler compiler where the central technical question is
"people want markup in lots of different formats; how can we make a parser
to detect the structure in as many of their formats as possible?" If you
have shortrefs you must have maps and you must have entities and you must
have minimization: they are justifiable because SGML is a parser technology,
not an information-modeling technology.

* XML just expands the butt of SGML: the fact that there are tree/graph
structures in marked-up data. Now, I admit that butt-expansion is a natural
function of time: SGML's default delimiters (as used in HTML and SGML at many
companies) are now familiar enough that there is also a question "people want
markup in SGML-delimiter format: how can we make a (simple) parser that
detects the structure in just that?"

Rick Jelliffe




From clark.evans at manhattanproject.com  Tue Mar 30 05:27:17 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:10:49 2004
Subject: Is there anyone working on a binary version of XML?
References: <01a301be770e$00c1c920$0b2e249b@fileroom.Synapse> <36FAC962.D9F47DB6@lig.net> <36FAC72C.2B911970@prescod.net> <005501be771c$c8afb2e0$a6ab20c0@engeast.baynetworks.com> <36FAEE03.E435EB6D@lig.net>
Message-ID: <36FB56EF.7E3E87CE@manhattanproject.com>

"Stephen D. Williams" wrote:
> Imagine that you have all the features of XML: structure, flexibility, common format for
> interchange, but that you perform zero processing steps to import or export the 'document'
> from a program.  (Actually, I'm thinking this would be done in chunks, but essentially very
> few reads and writes.)

I had an idea to accomplish something similar to this using notations.  
First use a fixed width encoding, and then provide an index to the 
information contained within the XML document in a notation.  This way
you get many of the advantages above, but your information is still XML, 
so that it can be read by a parser who may not understand the indexing notation.

Anyway, I haven't had time to work on it more, but here is a
crude first pass at explaining the idea, which I posted to the list
a while back.  I hope it helps.

Clark Evans


-------- Original Message --------
Subject: Fractal XML Index Notation
Date: Wed, 03 Feb 1999 01:32:34 +0000
From: Clark Evans <clark.evans@manhattanproject.com>
To: xml-dev@ic.ac.uk
References: <958E41703996D21197A200A0C9D4C65672B7@AUS-SERVER4>

Abstract:

        By fixing the content of an XML file, a
        position-based index mechanism can be added
        to XML files, allowing fractal parsing.

Introduction:

In a thin-client/server environment, especially one
implemented in an interpreted language like Java,
it is important to minimise client-side processing by
doing server-side pre-processing.

For example, suppose that an on-line shopping web
site has a thin-client ordering Java applet.  It could
quickly download and start accepting customer
information and other input.  Simultaneously,
it could be downloading a 250K+ file (or files) containing
the package and product list, authorized shipping
agents, tax calculation tables, etc.  Advanced
versions of the applet would "cache" a copy of the
catalog locally, and only download deltas.

Several pre-processing items could occur, the most
obvious being a translation of the normalized schema:

 PRODUCT_CATEGORY   (CATEGORY_ID, CATEGORY_NAME)
 BUNDLE_OF_PRODUCTS (BUNDLE_ID, BUNDLE_NAME, BUNDLE_PRICE)
 VENDOR             (VENDOR_ID, VENDOR_NAME)
 BUNDLE-PRODUCT     (BUNDLE_ID,PRODUCT_ID)
 PRODUCT            (PRODUCT_ID, PRODUCT_NAME, 
                     CATEGORY_ID, INDIVIDUAL_SALE_FLAG,
                     PRICE_IF_SOLD_INDIVIDUALLY )
 PRODUCT-VENDOR     (PRODUCT_ID,VENDOR_ID)
 BUNDLE-VENDOR      (BUNDLE_ID,VENDOR_ID)
 
into a hierarchical drill-down that better meets
the particular needs of the order-entry client:

<catalog>
   <product-category>
      <product-bundle>
         <product>
         <vendor>
      <individual-product>
         <vendor>

In this example, several joins are interwoven into
a single hierarchical "snapshot" to support
the drill-down requirements of the order-entry client.

Notice that product-bundles, products, and vendors
*will* be duplicated with this scheme.  This de-normalization
is exactly what is required, since it makes the processing
on the client simpler.  Here XML complements the
relational database by providing a de-normalized 
stream of data instead of a normalized repository.
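
A minimal sketch of that server-side join, assuming the PRODUCT, VENDOR and
PRODUCT-VENDOR tables have already been read into memory. The class name
CatalogSnapshot and the in-memory representation (maps keyed by id, and an
array of id pairs for the join table) are assumptions made for the example.

```java
import java.util.Map;

// Hypothetical server-side step: normalized rows in, nested XML out.
class CatalogSnapshot {
    // products and vendors map id -> name; productVendor holds
    // {productId, vendorId} pairs, as in the PRODUCT-VENDOR table.
    static String toXml(Map<Integer, String> products,
                        Map<Integer, String> vendors,
                        int[][] productVendor) {
        StringBuilder xml = new StringBuilder("<catalog>\n");
        for (Map.Entry<Integer, String> p : products.entrySet()) {
            xml.append("  <individual-product name=\"")
               .append(escape(p.getValue())).append("\">\n");
            // The join: a vendor is written out again under every
            // product it supplies. That duplication is the point.
            for (int[] link : productVendor) {
                if (link[0] == p.getKey()) {
                    xml.append("    <vendor name=\"")
                       .append(escape(vendors.get(link[1]))).append("\"/>\n");
                }
            }
            xml.append("  </individual-product>\n");
        }
        return xml.append("</catalog>\n").toString();
    }

    // Minimal escaping for attribute values.
    private static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }
}
```

If one vendor supplies two products, its <vendor> element appears twice in
the output, once under each product, so the client can drill down without
performing any join of its own.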

For another example, suppose a roaming salesperson
receives an update every morning in his e-mail with
new products, discontinued products, changes in pricing, 
packaging, etc.  Then, during the day, the salesperson
goes "door-to-door" selling the products and taking orders.  
The orders are collected on his/her hard drive until
the evening, when they are uploaded to the server for 
approval.

I see XML as a great move forward in a standard transport
layer for this form of communication.  Each order could
be a simple e-mail message, leveraging existing POP3/SMTP
standards.  The messages would be queued during the day,
and sent after the salesperson is connected to the
network.  In a similar way, the updates to the product
list could be sent via e-mail (xml-mail anyone?) as well.

THUS, we have moved the join from the client to the
server, but now we have *increased* the parsing
requirements of the client... also, with a _large_
catalog file (3+MB?), it is unreasonable to think
that a collection of objects in memory would 
be the result of the parsing.

THEREFORE, some form of storage/retrieval is necessary
on the client.  This can be in a local database,
but that just increases the footprint and processing.

Instead of making a client-side database, and 
re-normalizing the information, I suggest that 
indexing the XML file may be a better alternative.
A way to do this is to "fix" the XML file's binary
representation, and build a physical index detailing
the "exact" location of an element within the file.
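
The idea of a physical index can be sketched as a table of byte offsets. The
example below assumes a single-byte encoding (ISO-8859-1), so that character
positions and byte positions coincide, and uses a regular expression as a
stand-in for a real parser. OffsetIndex is an invented name, and the index is
kept in memory rather than embedded in the file as the proposal intends.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical position index over a fixed (unchanging) XML file.
class OffsetIndex {
    private static final Pattern PRODUCT = Pattern.compile(
        "<individual-product\\s+name=\"([^\"]*)\"[^>]*/>");

    private final byte[] file;
    // name attribute -> {byte offset, byte length} of the element
    private final Map<String, int[]> entries = new LinkedHashMap<>();

    OffsetIndex(byte[] file) {
        this.file = file;
        // One byte per character, so match positions are byte offsets.
        String text = new String(file, StandardCharsets.ISO_8859_1);
        Matcher m = PRODUCT.matcher(text);
        while (m.find()) {
            // Keep the first occurrence when a name is repeated.
            entries.putIfAbsent(m.group(1),
                new int[] { m.start(), m.end() - m.start() });
        }
    }

    // Byte offset of the element, or -1 if the name is not indexed.
    int offsetOf(String name) {
        int[] e = entries.get(name);
        return (e == null) ? -1 : e[0];
    }

    // Reads only the indexed slice; the rest of the file is not scanned.
    String read(String name) {
        int[] e = entries.get(name);
        return (e == null) ? null
            : new String(file, e[0], e[1], StandardCharsets.ISO_8859_1);
    }
}
```

Once the index exists, read("Hammer") pulls one element out of the buffer by
position alone. With a variable-width encoding the character and byte offsets
would diverge, which is why the proposal fixes the representation first.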

Requirement for such an index:

a) It should be embeddable inside XML, and should follow
XML if possible (perhaps it is a notation?)

b) It should allow indexing on arbitrary element attributes.

c) It should be created so that a change in one part of the
file has minimal impact on the rest of the XML file.  Thus,
although a change to a child may require a re-adjustment
of information about its parent, it shouldn't require
re-adjustment of information about each sibling.

d) It should take advantage of the "hierarchy" built
into the XML file, since the thin-client usage will
directly correspond to the "hierarchy".

e) It should support typed entities and attributes
("Architectures"), so that different attribute names
of sub-types can be indexed together.

f) Indexing an element based upon its child elements
may not be required. If an index like this is needed,
perhaps a re-write adds an attribute with the 
computed value and then this is indexed instead.

g) Working with linking is purely optional, and may
not be important to support. <opinion> If you are 
using linking with transaction-oriented documents, 
you should be using a relational database instead. 
I see XML as bringing back the Hierarchical database 
to *complement* relational technology, not to 
*replace* it.</opinion>


================================================

What I propose is a "fractal" index inter-woven 
into the XML data.  First, here is the file to 
be indexed:

<catalog date="03-FEB-1999" company="Acme Tools" >
   <product-category name="Household" type="Domestic">
      <individual-product name="Hammer" price="13.95"/>
      <individual-product name="Screw-Driver, 1/4 inch" price="6.95"/>
      <individual-product name="Screw-Driver, 1/8 inch" price="7.95"/>
      <individual-product name="Allen-Wrench Set"       price="11.55"/>
      <product-bundle name="Household-Starter" price="23.99">
         <bundled-product name="Hammer"/>
         <bundled-product name="Screw-Driver, 1/4 inch"/>
         <bundled-product name="Screw-Driver, 1/8 inch"/>
         ...
      </product-bundle>
      ...
   </product-category>
   <product-category type="Commercial" name="Light-Industry" >
      <individual-product name="Hammer" price="13.95"/>
      <individual-product name="Versa Screw(tm)" price="66.95"/>
      ...
   </product-category>
   ...
</catalog>

Here is the "indexed" example.  I use line numbers for 
the demonstration since they are easier to show in e-mail
form; in practice I would see it being done by position instead.
I also use <!-- ... --> to comment things.

0001 <!-- other-information-before-the-catalog -->
...
0009 <catalog date="03-FEB-1999" company="Acme Tools" >
0010    <product-category name="Household" type="Domestic">
0011       <individual-product name="Hammer" price="13.95"/>
0012       <individual-product name="Screw-Driver, 1/4 inch" price="6.95"/>
0013       <individual-product name="Screw-Driver, 1/8 inch" price="7.95"/>
0014       <individual-product name="Allen-Wrench Set" price="11.55"/>
0015       <product-bundle name="Household-Starter" price="23.99">
0016          <bundled-product name="Hammer"/>
0017          <bundled-product name="Screw-Driver, 1/4 inch"/>
0018          <bundled-product name="Screw-Driver, 1/8 inch"/>
...
0033       </product-bundle>
...
0533       <index               <!-- an index for "Household" category     -->
0534          name="Name"       <!-- the listing is ascending by name      -->
0535          index-start=525   <!-- (535-10), relative beginning of index -->
0536          delimiter="|"     <!-- Hmm, possibly for readability         -->
0536          position-width=4  <!-- Length for each position, lpad="0"    -->
0537          length=100        <!-- Length of index                       -->
0538       >
0539       <index-column name="name" width=30 align="left" rpad=" ">
0540          <index-element element="individual-product" attribute="name" />
0541          <index-element element="product-bundle" attribute="name" />
0542       </index-column>
0543       0004|Allen-Wrench Set    | <!-- First item...                -->
...
05??       0005|Household-Starter   | <!-- First item...                -->
...
05??       0008|Allen-Wrench Set    | <!-- First item...                -->
...
0632       </index>
0633       <index               
0634          name="Price"      <!-- the index is ascending by price       -->
0635          index-start=625   <!-- (635-10), relative beginning of index -->
0636          delimiter="|"     
0636          position-width=4 
0637          length=100
0638       >
0639       <index-column name="price" width=5 align="right" lpad="0">
0640          <index-element element="individual-product" attribute="price" />
0641          <index-element element="product-bundle" attribute="price" />
0642       </index-column>
0643       0433|01.23         <!-- Cheapest item...                     -->
...
06??       0002|06.95         <!-- Refers to line 10+2=12               -->
...
06??       0005|23.99         <!-- Refers to line 10+5=15               -->
...
0732       </index>
....
????    </product-category>
????    <product-category type="Commercial" name="Light-Industry" >
????       <individual-product name="Hammer" price="13.95"/>
????       <individual-product name="Versa Screw(tm)" price="66.95"/>
...
????       <index 
	      name="Price"
...
????       <index 
	      name=""
...
????    </product-category>
...
????    
0000 </catalog>
0000 


==============================


....


<INDEX

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From clark.evans at manhattanproject.com  Tue Mar 30 05:51:16 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:10:49 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <3700493E.D141CAE4@manhattanproject.com>

"Stephen D. Williams" wrote:
> Imagine that you have all the features of XML: structure, flexibility, common format for
> interchange, but that you perform zero processing steps to import or export the 'document'
> from a program.  (Actually, I'm thinking this would be done in chunks, but essentially very
> few reads and writes.)

I had an idea to accomplish something similar to this using notations.  
First use a fixed-width encoding, and then provide an index to the 
information contained within the XML document in a notation.  This way
you get many of the advantages above, but your information is still XML, 
so it can be read by a parser that may not understand the indexing notation.

Anyway, I haven't had time to work on it further, but here is a 
crude first pass at explaining the idea, which I posted to the list
a while back.  I hope it helps.

Clark Evans


-------- Original Message --------
Subject: Fractal XML Index Notation
Date: Wed, 03 Feb 1999 01:32:34 +0000
From: Clark Evans <clark.evans@manhattanproject.com>
To: xml-dev@ic.ac.uk
References: <958E41703996D21197A200A0C9D4C65672B7@AUS-SERVER4>

Abstract:

        By fixing the content of an XML file, a 
        position-based index mechanism can be added 
        to XML files, allowing fractal parsing.

Introduction:

In a thin-client/server environment, especially one 
implemented in an interpreted language like Java, it
is important to minimise client-side processing by 
doing server-side pre-processing.

For example, suppose that an on-line shopping web 
site has a thin-client ordering Java applet.  It could
download quickly and start accepting customer
information and other input.  Simultaneously,
it could be downloading one or more 250K+ files containing
the package and product list, authorized shipping 
agents, tax calculation tables, etc.  Advanced
versions of the applet would "cache" a copy of the
catalog locally, and only download deltas.  

Several pre-processing items could occur, the most
obvious being a translation of the normalized schema:

 PRODUCT_CATEGORY   (CATEGORY_ID, CATEGORY_NAME)
 BUNDLE_OF_PRODUCTS (BUNDLE_ID, BUNDLE_NAME, BUNDLE_PRICE)
 VENDOR             (VENDOR_ID, VENDOR_NAME)
 BUNDLE-PRODUCT     (BUNDLE_ID,PRODUCT_ID)
 PRODUCT            (PRODUCT_ID, PRODUCT_NAME, 
                     CATEGORY_ID, INDIVIDUAL_SALE_FLAG,
                     PRICE_IF_SOLD_INDIVIDUALLY )
 PRODUCT-VENDOR     (PRODUCT_ID,VENDOR_ID)
 BUNDLE-VENDOR      (BUNDLE_ID,VENDOR_ID)
 
into a hierarchical drill-down that better meets
the particular needs of the order-entry client:

<catalog>
   <product-category>
      <product-bundle>
         <product>
         <vendor>
      <individual-product>
         <vendor>

In this example, several joins are interwoven into 
a single hierarchical "snapshot" to support the
drill-down requirements of the order-entry client.

Notice that product-bundles, products, and vendors
*will* be duplicated with this scheme.  This de-normalization
is exactly what is required, since it makes the processing
on the client simpler.  Here XML complements the
relational database by providing a de-normalized 
stream of data instead of a normalized repository.
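
As a rough illustration of this server-side de-normalization, here is a
minimal Python sketch.  The sample rows, the function name build_catalog,
and the "name" attribute on <vendor> are all assumptions made up for the
example; only the table and element names come from the schema above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample rows standing in for the normalized tables.
CATEGORIES = [(1, "Household")]              # (CATEGORY_ID, CATEGORY_NAME)
PRODUCTS = [                                 # (PRODUCT_ID, PRODUCT_NAME, CATEGORY_ID, PRICE)
    (10, "Hammer", 1, "13.95"),
    (11, "Screw-Driver, 1/4 inch", 1, "6.95"),
]
VENDORS = {100: "Acme Tools"}                # VENDOR_ID -> VENDOR_NAME
PRODUCT_VENDOR = [(10, 100), (11, 100)]      # (PRODUCT_ID, VENDOR_ID)


def build_catalog(categories, products, vendors, product_vendor):
    """Perform the joins once, on the server, and emit one hierarchy."""
    catalog = ET.Element("catalog")
    for category_id, category_name in categories:
        category = ET.SubElement(catalog, "product-category", name=category_name)
        for product_id, product_name, product_category, price in products:
            if product_category != category_id:
                continue
            product = ET.SubElement(category, "individual-product",
                                    name=product_name, price=price)
            # The vendor is repeated under every product it supplies;
            # this duplication is the de-normalization described above.
            for linked_product, vendor_id in product_vendor:
                if linked_product == product_id:
                    ET.SubElement(product, "vendor", name=vendors[vendor_id])
    return catalog


catalog = build_catalog(CATEGORIES, PRODUCTS, VENDORS, PRODUCT_VENDOR)
xml_text = ET.tostring(catalog, encoding="unicode")
print(xml_text)
```

The same vendor appears twice in the output, once per product, so the
client never has to perform a join of its own.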

For another example, suppose a roaming salesperson 
receives an update every morning by e-mail with
new products, discontinued products, changes in pricing, 
packaging, etc.  Then, during the day, the salesperson 
goes "door-to-door" selling the products and taking orders.  
The orders are collected on his/her hard drive until 
the evening, when they are uploaded to the server for 
approval.

I see XML as a great move forward as a standard transport
layer for this form of communication.  Each order could
be a simple e-mail message, leveraging existing POP3/SMTP
standards.  The messages would be queued during the day,
and sent once the salesperson is connected to the 
network.  In a similar way, the updates to the product
list could be sent via e-mail (xml-mail anyone?) as well.  

THUS, we have moved the join from the client to the
server, but we have now *increased* the parsing 
requirements of the client.  Also, with a _large_
catalog file (3+MB?), it is unreasonable to expect 
the result of the parsing to be held as a collection 
of objects in memory.

THEREFORE, some form of storage/retrieval is necessary
on the client.  This can be in a local database,
but that just increases the footprint and processing.

Instead of making a client-side database, and 
re-normalizing the information, I suggest that 
indexing the XML file may be a better alternative.
One way to do this is to "fix" the XML file's binary
representation, and build a physical index detailing
the "exact" location of an element within the file.
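
To make the idea of a physical index concrete, here is a small Python
sketch that records the byte offset and length of each element in a
fixed byte string and then fetches one element without re-parsing the
rest.  The regular expression is only a stand-in for a real XML parser,
and the names DOC, build_offset_index and fetch are hypothetical.

```python
import re

# A tiny "fixed" document; the bytes must not change once it is indexed.
DOC = (b'<catalog>\n'
       b'   <individual-product name="Hammer" price="13.95"/>\n'
       b'   <individual-product name="Allen-Wrench Set" price="11.55"/>\n'
       b'</catalog>\n')

# Stand-in for a parser: matches one empty individual-product element.
TAG = re.compile(rb'<individual-product\s+name="([^"]*)"[^>]*/>')


def build_offset_index(data):
    """Map each product name to the (offset, length) of its element."""
    return {m.group(1).decode("ascii"): (m.start(), m.end() - m.start())
            for m in TAG.finditer(data)}


def fetch(data, index, name):
    """Slice the element straight out of the data using the index."""
    offset, length = index[name]
    return data[offset:offset + length]


index = build_offset_index(DOC)
print(fetch(DOC, index, "Hammer"))
```

With a file on disk, the slice would become a seek followed by a read of
the recorded length, which is why the representation has to stay fixed.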

Requirements for such an index:

a) It should be embeddable inside XML, and should follow
XML if possible (perhaps it is a notation?)

b) It should allow indexing on arbitrary element attributes.

c) It should be created so that a change in one part of the
file has minimal impact on the rest of the XML file.  Thus,
although a change to a child may require a re-adjustment
of information about its parent, it shouldn't require 
re-adjustment of information about each sibling.

d) It should take advantage of the "hierarchy" built
into the XML file, since the thin-client usage will
directly correspond to the "hierarchy".

e) It should support typed entities and attributes
"Architectures", so that different attribute names
of sub-types can be indexed together.

f) Indexing an element based upon its child elements
may not be required. If an index like this is needed,
perhaps a re-write adds an attribute with the 
computed value and then this is indexed instead.

g) Working with linking is purely optional, and may
not be important to support. <opinion> If you are 
using linking with transaction-oriented documents, 
you should be using a relational database instead. 
I see XML as bringing back the Hierarchical database 
to *complement* relational technology, not to 
*replace* it.</opinion>


================================================

What I propose is a "fractal" index inter-woven 
into the XML data.  First, here is the file to 
be indexed:

<catalog date="03-FEB-1999" company="Acme Tools" >
   <product-category name="Household" type="Domestic">
      <individual-product name="Hammer" price="13.95"/>
      <individual-product name="Screw-Driver, 1/4 inch" price="6.95"/>
      <individual-product name="Screw-Driver, 1/8 inch" price="7.95"/>
      <individual-product name="Allen-Wrench Set"       price="11.55"/>
      <product-bundle name="Household-Starter" price="23.99">
         <bundled-product name="Hammer"/>
         <bundled-product name="Screw-Driver, 1/4 inch"/>
         <bundled-product name="Screw-Driver, 1/8 inch"/>
         ...
      </product-bundle>
      ...
   </product-category>
   <product-category type="Commercial" name="Light-Industry" >
      <individual-product name="Hammer" price="13.95"/>
      <individual-product name="Versa Screw(tm)" price="66.95"/>
      ...
   </product-category>
   ...
</catalog>

Here is the "indexed" example.  I use line numbers for 
the demonstration since they are easier to show in e-mail
form; in practice I would see it being done by position instead.
I also use <!-- ... --> to comment things.

0001 <!-- other-information-before-the-catalog -->
...
0009 <catalog date="03-FEB-1999" company="Acme Tools" >
0010    <product-category name="Household" type="Domestic">
0011       <individual-product name="Hammer" price="13.95"/>
0012       <individual-product name="Screw-Driver, 1/4 inch" price="6.95"/>
0013       <individual-product name="Screw-Driver, 1/8 inch" price="7.95"/>
0014       <individual-product name="Allen-Wrench Set" price="11.55"/>
0015       <product-bundle name="Household-Starter" price="23.99">
0016          <bundled-product name="Hammer"/>
0017          <bundled-product name="Screw-Driver, 1/4 inch"/>
0018          <bundled-product name="Screw-Driver, 1/8 inch"/>
...
0033       </product-bundle>
...
0533       <index               <!-- an index for "Household" category     -->
0534          name="Name"       <!-- the listing is ascending by name      -->
0535          index-start=525   <!-- (535-10), relative beginning of index -->
0536          delimiter="|"     <!-- Hmm, possibly for readability         -->
0536          position-width=4  <!-- Length for each position, lpad="0"    -->
0537          length=100        <!-- Length of index                       -->
0538       >
0539       <index-column name="name" width=30 align="left" rpad=" ">
0540          <index-element element="individual-product" attribute="name" />
0541          <index-element element="product-bundle" attribute="name" />
0542       </index-column>
0543       0004|Allen-Wrench Set    | <!-- First item...                -->
...
05??       0005|Household-Starter   | <!-- First item...                -->
...
05??       0008|Allen-Wrench Set    | <!-- First item...                -->
...
0632       </index>
0633       <index               
0634          name="Price"      <!-- the index is ascending by price       -->
0635          index-start=625   <!-- (635-10), relative beginning of index -->
0636          delimiter="|"     
0636          position-width=4 
0637          length=100
0638       >
0639       <index-column name="price" width=5 align="right" lpad="0">
0640          <index-element element="individual-product" attribute="price" />
0641          <index-element element="product-bundle" attribute="price" />
0642       </index-column>
0643       0433|01.23         <!-- Cheapest item...                     -->
...
06??       0002|06.95         <!-- Refers to line 10+2=12               -->
...
06??       0005|23.99         <!-- Refers to line 10+5=15               -->
...
0732       </index>
....
????    </product-category>
????    <product-category type="Commercial" name="Light-Industry" >
????       <individual-product name="Hammer" price="13.95"/>
????       <individual-product name="Versa Screw(tm)" price="66.95"/>
...
????       <index 
              name="Price"
...
????       <index 
              name=""
...
????    </product-category>
...
????    
0000 </catalog>
0000 
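
For what it is worth, reading such an index might look like the Python
sketch below.  It assumes the entry layout shown in the price index
(a 4-character zero-padded relative position, a "|" delimiter, then the
key) and that positions are relative to line 10, where the
product-category starts; the function names are hypothetical.

```python
import bisect


def parse_entry(entry, position_width=4, delimiter="|"):
    """Split one fixed-width index entry into (relative position, key)."""
    position = int(entry[:position_width])
    key = entry[position_width + len(delimiter):]
    # Drop the right padding and any trailing delimiter.
    return position, key.rstrip(delimiter + " ")


def lookup(entries, key, base_line):
    """Binary-search entries sorted by key; return the absolute line or None.

    The key list is rebuilt here only to keep the sketch short; with truly
    fixed-width entries one would seek to entry i directly instead.
    """
    keys = [parse_entry(e)[1] for e in entries]
    i = bisect.bisect_left(keys, key)
    if i < len(keys) and keys[i] == key:
        return base_line + parse_entry(entries[i])[0]
    return None


# The three price entries visible in the example above.
PRICE_INDEX = ["0433|01.23", "0002|06.95", "0005|23.99"]

print(lookup(PRICE_INDEX, "06.95", base_line=10))   # 12
print(lookup(PRICE_INDEX, "23.99", base_line=10))   # 15
```

Because the keys are zero-padded to a fixed width, plain string
comparison gives the same order as numeric comparison, which is what
lets a binary search work on the raw text.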

==============================

....


<INDEX



From sdw at lig.net  Tue Mar 30 06:08:18 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:49 2004
Subject: Is there anyone working on a binary version of XML?
References: <3700493E.D141CAE4@manhattanproject.com>
Message-ID: <37005634.58EBEE1B@lig.net>

Excellent.  I've had similar ideas.  My current plan is to produce something without the
requirement that the result be pure text; however, I once toyed with the idea of a database where
all indexing information was stored as part of the text in fixed width fields.  The file could be
edited with any text editor and then 'reindexed' and be ready for fast use.

Your design is pretty handy, but I really want something that can be loaded, have a minor
modification made with minimal data shuffling, and then 'saved' out very quickly.  Having to
rebuild a complete index probably isn't the best way to do this.
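
A minimal Python sketch of the kind of minimal-shuffling update I mean,
under the assumption that every value sits in a fixed-width field: the
new value overwrites the old one in place, so no other byte moves and
any position-based index stays valid.  The function name and the sample
element are made up for illustration.

```python
def patch_fixed_width(buf, offset, width, value, pad=b"0"):
    """Overwrite a fixed-width field in place, left-padding the new value."""
    encoded = value.encode("ascii")
    if len(encoded) > width:
        raise ValueError("value does not fit in the fixed-width field")
    # Same-length slice assignment: the buffer size never changes.
    buf[offset:offset + width] = encoded.rjust(width, pad)


buf = bytearray(b'<individual-product name="Hammer" price="13.95"/>')
size_before = len(buf)

offset = buf.index(b"13.95")          # in practice this comes from the index
patch_fixed_width(buf, offset, 5, "9.95")

print(bytes(buf))
print(len(buf) == size_before)
```

A value wider than its field would still force a rewrite, so this only
covers the "minor modification" case.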

sdw

Clark Evans wrote:

> "Stephen D. Williams" wrote:
> > Imagine that you have all the features of XML: structure, flexibility, common format for
> > interchange, but that you perform zero processing steps to import or export the 'document'
> > from a program.  (Actually, I'm thinking this would be done in chunks, but essentially very
> > few reads and writes.)
>
> I had an idea to accomplish something similar to this using notations.
> First use a fixed width encoding, and then provide an index to the
> information contained within the XML document in a notation.  This way
> you get many of the advantages above, but your information is still XML,
> so it can be read by a parser that may not understand the indexing notation.
>
> Anyway, I haven't had time to work on it further, but here is a
> crude first pass at explaining the idea, which I posted to the list
> a while back.  I hope it helps.
>
> Clark Evans

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From jtauber at jtauber.com  Tue Mar 30 07:11:51 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:49 2004
Subject: XML <-> non-XML filter project
Message-ID: <00cd01be7a6c$0d879ac0$0300000a@cygnus.uwa.edu.au>

Earlier this month, I posted the following to XSL-LIST. With apologies to
those who received it there, I'm posting it (modified) here to see if anyone
is interested in some co-operative effort in this area.

What I would like to see is people taking existing non-XML formats and
developing:

     a) a URI for the non-XML format (for notations and for the namespace of
the XML format)
     b) a DTD representing the existing non-XML format
     c) an output filter to convert documents conforming to the DTD into the
non-XML format
     d) (possibly) an input filter to convert the non-XML format into XML

There are individual cases of this sort of thing[1] but I would like to see
some sort of co-operative effort to produce a large number of these things.
I'm not envisaging complex filters, just a simple XML representation of the
non-XML format so that purely XML tools like editors, query engines, XSL
engines can operate on non-XML formats. There are plenty of applications
including generation of these files on the basis of other XML documents (I
need this for Makefiles on my websites) and literate programming.

I would personally find great value in this being done for Makefiles,
procmail files, simple shell scripts and PalmPilot databases. Others of
value I can think of include Windows INI files, Unix mailboxes, your
favourite programming language...
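
To give a feel for how small such a filter pair can be, here is a Python
sketch for Windows INI files.  The XML vocabulary (<ini>, <section
name="...">, <entry key="..." value="..."/>) is invented for this
example and is not an agreed DTD; xml_to_ini plays the role of the
output filter in (c) and ini_to_xml the input filter in (d).

```python
import configparser
import xml.etree.ElementTree as ET

SAMPLE = """<ini>
  <section name="display">
    <entry key="width" value="800"/>
    <entry key="height" value="600"/>
  </section>
</ini>"""


def xml_to_ini(xml_text):
    """Output filter: hypothetical <ini> document -> INI text."""
    root = ET.fromstring(xml_text)
    lines = []
    for section in root.findall("section"):
        lines.append("[%s]" % section.get("name"))
        for entry in section.findall("entry"):
            lines.append("%s=%s" % (entry.get("key"), entry.get("value")))
        lines.append("")
    return "\n".join(lines)


def ini_to_xml(ini_text):
    """Input filter: INI text -> hypothetical <ini> document.

    Note: configparser lower-cases keys by default.
    """
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    root = ET.Element("ini")
    for name in parser.sections():
        section = ET.SubElement(root, "section", name=name)
        for key, value in parser[name].items():
            ET.SubElement(section, "entry", key=key, value=value)
    return ET.tostring(root, encoding="unicode")


ini_text = xml_to_ini(SAMPLE)
print(ini_text)
print(ini_to_xml(ini_text))
```

Once the INI file is available as XML, any XML editor, query engine or
XSL engine can work on it, and the output filter turns the result back
into the native format.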

If there is enough interest I am more than willing to coordinate these
efforts. Just let me know.

James

[1] http://www.xmlsoftware.com/convert/




From heikki at citec.fi  Tue Mar 30 08:03:12 1999
From: heikki at citec.fi (Heikki Toivonen)
Date: Mon Jun  7 17:10:49 2004
Subject: OFF: Attachments to list
Message-ID: <000c01be7a72$f13b2c40$2500a8c0@hto.citec.fi>

Instead of flaming people who send attachments to this list, why not use an
automated tool that refuses to send messages with attachments to this list?

For example, the Frame Users list (see http://www.FrameUsers.com) has a
system that checks whether a message contains attachments or other
disallowed content. If the screening program thinks something is wrong, it
replies to the original sender with the message, explaining what was wrong.

One has to be careful with vCards, though, because many people do not
realize they are considered attachments (it happened even to me: I used a
vCard at one time as my .sig and could not understand why the FrameUsers
bot was saying I had an attachment in my email :).
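
A screening step of this kind could be sketched in Python roughly as
follows.  This is not how the FrameUsers bot works (I have not seen its
code); the policy used here, flagging any part that is marked as an
attachment or is not text/plain, is just one assumed definition, and the
function name find_attachments is hypothetical.

```python
from email import message_from_string

SAMPLE = """\
From: someone@example.org
Subject: test
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XX"

--XX
Content-Type: text/plain

Hello list.
--XX
Content-Type: text/x-vcard; name="card.vcf"
Content-Disposition: attachment; filename="card.vcf"

BEGIN:VCARD
END:VCARD
--XX--
"""


def find_attachments(raw_message):
    """Return the content types of all parts the assumed policy rejects."""
    msg = message_from_string(raw_message)
    flagged = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers themselves carry no content
        disposition = part.get_content_disposition()  # "attachment", "inline" or None
        if disposition == "attachment" or part.get_content_type() != "text/plain":
            flagged.append(part.get_content_type())
    return flagged


print(find_attachments(SAMPLE))
```

A list server would bounce the message back to the sender whenever the
returned list is non-empty, quoting the offending content types.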

--
  Heikki Toivonen
  http://www.doczilla.com
  http://www.citec.fi




From anderst at toolsmiths.se  Tue Mar 30 09:45:27 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:10:49 2004
Subject: Is there anyone working on a binary version of XML?
References: <87256743.0062192B.00@d53mta03h.boulder.ibm.com> <3700177B.AC5BFE07@lig.net>
Message-ID: <370080F7.DBC8C96D@toolsmiths.se>

"Stephen D. Williams" wrote:

> roddey@us.ibm.com wrote:
>
> > Actually, to be fair, there would be a somewhat non-trivial amount of bit
> > fiddlin' to get it out of whatever canonical binary format you put it in,
> > into the local byte order, floating point representation, byte boundary
> > alignment, etc... Though hopefully that couldn't be any worse than parsing
> > :-)
>
> Not true, especially for Java....
>
> ...

> The byte order, etc. will be Java standard.  Shouldn't be too tough for C/C++, etc.

It's also very easy for C/C++.  There exist a number of standards and open-source
packages that support them.
After this "bit fiddling" has been taken care of, the road is clear of obstacles.
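
To give a feel for how small the "bit fiddling" step can be, here is a minimal sketch in Python (not a language anyone in this thread proposed). The record layout, a 32-bit integer followed by a 64-bit IEEE double, is invented for the example; the point is only that a fixed canonical byte order can be read on any host without hand-written shifting.

```python
import struct

# Hypothetical canonical layout: 32-bit signed int, then 64-bit IEEE double,
# both big-endian ("network order"). The ">" prefix also disables padding,
# so the record is the same 12 bytes on every platform.
RECORD = struct.Struct(">id")

def encode(count, value):
    """Pack native values into the canonical binary form."""
    return RECORD.pack(count, value)

def decode(data):
    """Unpack the canonical form into native values, whatever the host byte order."""
    return RECORD.unpack(data)
```

Whether this beats parsing text is a separate question; the sketch only shows that byte order and alignment are handled by a library call.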


/anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From chris at w3.org  Tue Mar 30 10:03:46 1999
From: chris at w3.org (Chris Lilley)
Date: Mon Jun  7 17:10:50 2004
Subject: OFF: Attachments to list
References: <000c01be7a72$f13b2c40$2500a8c0@hto.citec.fi>
Message-ID: <3700847B.84F584E4@w3.org>


Heikki Toivonen wrote:
> 
> Instead of flaming people who send attachments to this list, why not use an
> automated tool that refuses to send messages with attachments to this list?

That needs a better definition of "attachment". Is a text/plain MIME
bodypart an attachment or not? What if it has two of them?

> One has to be careful with vCards, though, because many people do not
> realize they are considered attachments (happened even to me, I used vCard
> at one time as my .sig and I could not understand why FrameUsers bot was
> saying I had an attachment in my email:).

Similarly, many people do not realise that they are using HTML mail or
that their (plain text) signature file is being included as a separate
bodypart.

--
Chris



From om at lgsi.co.in  Tue Mar 30 10:34:07 1999
From: om at lgsi.co.in (Om Band)
Date: Mon Jun  7 17:10:50 2004
Subject: HELP Me : Throwing XML page through Servlet
Message-ID: <37007F73.59ADFCB9@lgsi.co.in>

Hi,
    Can you help me, please?

    I am developing a search engine which will have an XML
    search form linked to a Servlet. The Servlet should take input
    from the text field of the XML form, scan the database for matches, and
    dynamically generate an XML page with the results found.

    Ideally it should not write that XML to a file but should send
    it directly to the client machine. In this case I am not able to link it
    with the already created XSL stylesheet, which is a static file already
    on the server.

    Even when I do make a separate XML file, I am not able to display
    the XML page through the Servlet, though it displays fine when the
    same address is typed directly into the Address field of the browser!

    The code I am using for Servlet is............
    (This makes a separate file)

        doPost(HttpServletRequest, HttpServletResponse response)
        {-------
            --------
            String file = "c:\\xml\\file.xml";

            fw = new FileWriter(file);
            pw = new PrintWriter(fw);

            pw.println("<?xml version=.....--------");        // Making an
            pw.println("-------------------------");           // XML file.
            ---------------------------------------------

            response.sendRedirect(file);
         }

    This code gives an error: xsl not found.
    (The stylesheet is there, in the same directory.)

    Also, how can I do this without making a separate .xml file?

    THANKS !!

-Om
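
One way to approach the "no separate file" part of the question is to build the document in memory and send it straight back as the response body. The sketch below is in Python rather than as a servlet, and the element names `results` and `match` as well as the stylesheet URL are made up for illustration. The detail it is meant to show is that the stylesheet is named in an `xml-stylesheet` processing instruction whose `href` the browser resolves, so it has to be a URL the client can fetch rather than a path on the server's disk.

```python
from xml.sax.saxutils import escape, quoteattr

def build_results_xml(matches, stylesheet_url):
    """Build the result document in memory; nothing is written to disk."""
    lines = [
        '<?xml version="1.0"?>',
        # The browser fetches this href itself, so it must be a client-visible URL.
        f'<?xml-stylesheet type="text/xsl" href={quoteattr(stylesheet_url)}?>',
        "<results>",
    ]
    for match in matches:
        # escape() protects "&", "<" and ">" in the database text.
        lines.append(f"  <match>{escape(match)}</match>")
    lines.append("</results>")
    return "\n".join(lines)
```

In a servlet the equivalent would presumably be to call `response.setContentType("text/xml")` and print the same lines to `response.getWriter()` instead of to a `FileWriter`. Note also that `sendRedirect` expects a URL, so handing it `c:\xml\file.xml` may well be the reason the stylesheet cannot be found, though that is a guess from the fragment shown.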


-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990330/63a9fbe6/attachment.htm
From Michael.Kay at icl.com  Tue Mar 30 11:10:44 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:50 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3C6@WWMESS3.172.19.125.2>

> I'm still shying away from reporting element-type
> declarations, at least until someone shows me an easy and concise way
> of doing it

I would think the best way is to present it in its XSchema form, to some
kind of secondary DocumentHandler.

Mike Kay
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990330/b0770eee/attachment.htm
From Michael.Kay at icl.com  Tue Mar 30 11:20:21 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:50 2004
Subject: Is there anyone working on a binary version of XML?
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3C7@WWMESS3.172.19.125.2>

> I have come to feel however that there is room for a 
> "works-as-if" binary analogue to text based XML.

I did various experiments with this a while back. I tried a serialised SAX
event stream, a simple canonicalisation and transcoding in which the special
characters like "<" were replaced with octet values <x20, a Java
serialisation of the DOM, etc. None gave a worthwhile saving over a straight
reparsing of the original XML text stream, in fact most performed worse. So
I gave up.

Mike Kay 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19990330/bb3e3427/attachment.htm
From paul at prescod.net  Tue Mar 30 14:16:09 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:50 2004
Subject: Another WP sighting (citing?)
Message-ID: <3700BC1C.B30C3F1A@prescod.net>

AbiWord is another (the third?) open source word processor that will use
an XML document type for its native format. One interesting thing about
this one is that it is intended to be portable between Unix and Windows.
That means that it is at least theoretically possible that a large,
heterogeneous corporation could standardize on it.

http://www.abisource.com
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From oren at capella.co.il  Tue Mar 30 15:02:17 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:50 2004
Subject: Fw: XML query language
Message-ID: <009901be7aac$e9ad60d0$5402a8c0@oren.capella.co.il>

Paul Janssens <paul.janssens@skynet.be> wrote:
>I think (iii)
(results should be XML)
>should not be a requirement of an XML query language. The
>result of a query  could be a vector of tuples of pointers to the
>individual matches. Whatever needs to be done with that output can be
>done in a layer above that.

I fail to see the benefit in inventing a new format for query results. First,
a set of tuples with pointers, or whatever else, can be easily expressed in
XML. Second, if one wants to obtain 'pointers to the output', then it should
be a simple matter of constructing in the result a pointer to the matched
tree (<A href="..."> or something) instead of the matched tree itself.

AFAIK all XML QL proposals produce XML as output.

>Just because SQL mixes content with style
>doesn't mean an XML query language should.

You lost me here; this is the first time I've heard that SQL has anything to
do with style. The result of an SQL query is a table and is typically
accessed via some programming API which has nothing to do with presentation.
I agree that an XML query should do the same thing - that is, create an XML
tree as a result without worrying about presentation. The fact that I think
that _the transformational part_ of XSL should do this is perfectly
consistent, since I see this part as being a general independent mechanism
and not just a "style" language.

Share & Enjoy,

    Oren Ben-Kiki




From Mark.Birbeck at iedigital.net  Tue Mar 30 15:52:05 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun  7 17:10:50 2004
Subject: XML query language
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054AE5@SOHOS002>

Has anyone looked at the fragment proposals for this? I think it is
ideal because it allows you to return the context of the nodes that make
up your results, as well as the results. If you add functionality for
multiple fragments (which the initial suggestions don't have) then it
works very well, allowing nodes to be returned that come from very
different contexts.

Regards,

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London
W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net


> -----Original Message-----
> From: Oren Ben-Kiki 
> Sent: 30 March 1999 13:58
> To: XML List
> Subject: Fw: XML query language
> 
> 
> Paul Janssens <paul.janssens@skynet.be> wrote:
> >I think (iii)
> (results should be XML)
> >should not be a requirement of an XML query language. The
> >result of a query  could be a vector of tuples of pointers to the
> >individual matches. Whatever needs to be done with that output can be
> >done in a layer above that.
> 
> I fail to see the benefit in inventing a new format for query results. First,
> a set of tuples with pointers, or whatever else, can be easily expressed in
> XML. Second, if one wants to obtain 'pointers to the output', then it should
> be a simple matter of constructing in the result a pointer to the matched
> tree (<A href="..."> or something) instead of the matched tree itself.
> 
> AFAIK all XML QL proposals produce XML as output.
> 
> >Just because SQL mixes content with style
> >doesn't mean an XML query language should.
> 
> You lost me here; this is the first time I've heard that SQL has anything to
> do with style. The result of an SQL query is a table and is typically
> accessed via some programming API which has nothing to do with presentation.
> I agree that an XML query should do the same thing - that is, create an XML
> tree as a result without worrying about presentation. The fact that I think
> that _the transformational part_ of XSL should do this is perfectly
> consistent, since I see this part as being a general independent mechanism
> and not just a "style" language.
> 
> Share & Enjoy,
> 
>     Oren Ben-Kiki
> 
> 
> 


From andrew at squiz.co.nz  Tue Mar 30 16:06:38 1999
From: andrew at squiz.co.nz (Andrew McNaughton)
Date: Mon Jun  7 17:10:50 2004
Subject: XML to Text questions 
In-Reply-To: Your message of "Mon, 29 Mar 1999 09:48:00 PST."
             <5BF896CAFE8DD111812400805F1991F708AAF212@RED-MSG-08> 
Message-ID: <199903301403.CAA03415@aniwa.sky>

> Q: "What tools are available for translation of XML into text?"
> 
> A:  Take a look at XSL.  Information on this and other XML-related
> activities can be found at http://www.w3.org/XML/Activity.html.

This is potentially misleading.  XSL produces as output an XML document.  It cannot be made to produce text which is not well-formed.  Specific XSL processors may provide extensions to handle this.  I believe SAXON does.

Other alternatives worth considering might include DSSSL and perl.
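
For the simplest case, where "text" just means the character data with the markup thrown away, a general-purpose scripting language is enough. The following is a minimal sketch in Python using only its standard library; the function name is hypothetical, and anything beyond flat text (layout, numbering, per-element rules) is outside what it attempts.

```python
import xml.etree.ElementTree as ET

def xml_to_text(xml_source):
    """Return the document's character data with runs of whitespace collapsed."""
    root = ET.fromstring(xml_source)
    # itertext() yields every text node in document order, tags dropped.
    raw = "".join(root.itertext())
    return " ".join(raw.split())
```

For example, `xml_to_text("<doc><title>Hello</title> <p>plain <b>text</b> out</p></doc>")` gives `Hello plain text out`.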

Andrew McNaughton
-- 
-----------
Andrew McNaughton
andrew@squiz.co.nz
http://www.newsroom.co.nz/




From david at megginson.com  Tue Mar 30 16:11:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:50 2004
Subject: OFF: Attachments to list
In-Reply-To: <3700847B.84F584E4@w3.org>
References: <000c01be7a72$f13b2c40$2500a8c0@hto.citec.fi>
	<3700847B.84F584E4@w3.org>
Message-ID: <14080.45092.316461.553925@localhost.localdomain>

Chris Lilley writes:

 > Similarly, many people do not realise that they are using HTML mail or
 > that their (plain text) signature file is being included as a separate
 > bodypart.

Bounce every posting containing a bodypart with a MIME type other than
text/plain.  People will figure out about vcards and HTML mail quite
fast that way, especially if it's possible to give a reasonably
informative message (with a note about vcards).
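
The rule is simple enough to sketch. The following Python fragment uses the standard `email` package and assumes the list software can hand the filter the raw message text; the function name is hypothetical, and it does not reflect how the xml-dev list software actually works. It lists the MIME type of every leaf bodypart that is not text/plain, which is also the information an informative bounce would need.

```python
from email import message_from_string

def offending_types(raw_message):
    """Return the MIME types of all leaf bodyparts that are not text/plain."""
    msg = message_from_string(raw_message)
    bad = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # multipart/* containers only group other bodyparts
        content_type = part.get_content_type()
        if content_type != "text/plain":
            bad.append(content_type)
    return bad
```

An empty list means the posting goes through. Under this rule a message with two text/plain bodyparts is accepted, while a vCard (text/x-vcard) or an HTML alternative is reported by type.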


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From paul.janssens at skynet.be  Tue Mar 30 16:29:30 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:50 2004
Subject: XML query language
References: <009901be7aac$e9ad60d0$5402a8c0@oren.capella.co.il>
Message-ID: <3700DF27.302D@skynet.be>

Oren Ben-Kiki wrote:
> 
> Paul Janssens <paul.janssens@skynet.be> wrote:
> >I think (iii)
> (results should be XML)
> >should not be a requirement of an XML query language. The
> >result of a query  could be a vector of tuples of pointers to the
> >individual matches. Whatever needs to be done with that output can be
> >done in a layer above that.
> 
> I fail to see the benefit in inventing a new format for query results. First,
> a set of tuples with pointers, or whatever else, can be easily expressed in
> XML

No problem there, my point was that ONLY this information should be the
output of a query, preferably in an XML format :-)

> Second, if one wants to obtain 'pointers to the output', then it should
> be a simple matter of constructing in the result a pointer to the matched
> tree (<A href="..."> or something) instead of the matched tree itself.
> 
> AFAIK all XML QL proposals produce XML as output.
> 
> >Just because SQL mixes content with style
> >doesn't mean an XML query language should.
> 
> You lost me here; this is the first time I've heard that SQL has anything to
> do with style. The result of an SQL query is a table and is typically
> accessed via some programming API which has nothing to do with presentation.
> I agree that an XML query should do the same thing - that is, create an XML
> tree as a result without worrying about presentation. The fact that I think
> that _the transformational part_ of XSL should do this is perfectly
> consistent, since I see this part as being a general independent mechanism
> and not just a "style" language.

OK, SQL ALLOWS you to mix style (or semantics) with content, as in

SELECT '<A href='||col1||'>'||col2||'</A>' FROM table1

For the same reason, if an xml query language allows you to arbitrarily
construct result trees, lazy users will abuse that feature to put style
or semantics in the output so they will not have to postprocess it with
XSL.
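
The SQL case is easy to try out. The sketch below runs the query above against an in-memory SQLite database from Python; the table contents are invented, and SQLite merely stands in for whatever database is at hand.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 TEXT)")
conn.execute("INSERT INTO table1 VALUES ('http://www.w3.org/', 'W3C')")

# Plain query: the result is data only, and presentation is left to a later layer.
data = conn.execute("SELECT col1, col2 FROM table1").fetchall()

# The query from the message: markup is baked into the result itself.
styled = conn.execute(
    "SELECT '<A href='||col1||'>'||col2||'</A>' FROM table1"
).fetchall()
```

Here `data` holds the pair of column values, while `styled` holds a single string, `<A href=http://www.w3.org/>W3C</A>`, that is of little use to any consumer except an HTML page.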

If on the other hand, only pointers to the resulting matches are
returned by the query language, anyone that wants an output is FORCED to
use XSL.

In my opinion, an xml query language should only describe a set of
equations, an xml query language implementation should only solve these
equations, and whatever is done with the result is NO business of the
query language.



From david at megginson.com  Tue Mar 30 16:59:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:50 2004
Subject: Ampersand connector in XML
In-Reply-To: <37003AE2.7B13728D@research.canon.com.au>
References: <3.0.32.19990329183236.00c2bb80@pop.intergate.bc.ca>
	<37003AE2.7B13728D@research.canon.com.au>
Message-ID: <14080.44628.857723.958297@localhost.localdomain>

Alison Lennon writes:

 [on '&' in content models]

 > Is it likely to be included in later versions of XML? In other
 > words, what are the options for applications which need to use
 > unordered lists - SGML?

Actually, the options are somewhat broader than that.  There are two
reasons that people have traditionally wanted to use '&' in content
models:

1. to help with legacy data conversion, where the elements may be out
   of order during an intermediate stage; or

2. because there is no obvious reason to order the content.

You don't need (1), because XML allows you simply to process the
document without a DTD until it's cleaned up.  

Through more than a decade of industry experience, nearly everyone in
the SGML world ended up agreeing that (2) was a lousy idea -- the '&'
connector makes it very difficult for authors to create documents in
SGML editing tools, and as Tim Bray pointed out, the tools often got
it wrong anyway.
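
For completeness: where an application really does insist on "each of these once, in any order", a known workaround (not one recommended in this message) is to spell out the orderings as ordinary choices and sequences. The Python sketch below generates such a content model; the function name is hypothetical. It factors out the first element at each level so that no two alternatives begin with the same name, which keeps the model deterministic.

```python
def expand_and_group(names):
    """Rewrite an SGML (a & b & ...) group using only ',' and '|' connectors."""
    if len(names) == 1:
        return names[0]
    alternatives = []
    for i, first in enumerate(names):
        rest = names[:i] + names[i + 1:]
        # Choose which element comes first, then order the remainder recursively.
        alternatives.append(f"({first}, {expand_and_group(rest)})")
    return "(" + " | ".join(alternatives) + ")"
```

`expand_and_group(["a", "b"])` gives `((a, b) | (b, a))`. The output grows factorially with the number of elements, which is one more practical argument for simply fixing an order.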


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Mar 30 16:59:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:10:50 2004
Subject: XHTML and character entities
In-Reply-To: <001901be7a4f$50461d90$1bf96d8c@NT.JELLIFFE.COM.AU>
References: <001901be7a4f$50461d90$1bf96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <14080.44181.839780.143963@localhost.localdomain>

Rick Jelliffe writes:

 > Certainly it is the expectation of some people that the entities
 > for special characters will disappear with XML, that people will
 > use NCRs.  I am not sure about it.

I think that Rick makes a good point here (we touched on this point
earlier in a different context).  There are two problems:

1. some XML documents will *always* need characters not available
   through Unicode either directly or through composition, no matter
   how large Unicode grows; and

2. representing new characters through numeric references in the
   private-use area is unintuitive.
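
The three spellings under discussion can be compared side by side. This is a small sketch using the XML parser in Python's standard library; the `eacute` declaration and the choice of U+E000 as the private-use code point are only examples.

```python
import xml.etree.ElementTree as ET

# A numeric character reference needs no declaration.
ncr = ET.fromstring("<p>caf&#xE9;</p>").text

# A named entity must be declared, here in the internal DTD subset.
named = ET.fromstring(
    '<!DOCTYPE p [<!ENTITY eacute "&#xE9;">]><p>caf&eacute;</p>'
).text

# A private-use character (U+E000 to U+F8FF) parses fine, but nothing in the
# document says what it is supposed to mean.
private = ET.fromstring("<p>&#xE000;</p>").text
```

The first two produce the same string. The third parses without complaint, which is the problem in point 2: a reader of the source sees only a number, with no name to hint at the intended character.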

Internal SDATA entities were (and are) the bane of people trying to
write generic SGML processing software, but they were very useful for
small utilities tied closely to a specific SGML application (such as
an academic project for transcribing manuscripts, where you knew in
advance what SDATA entities you were going to see).  

On the other hand, there were actually proposals back in th'old days
to use Unicode values for SDATA strings rather than the (in)famous
"[eacute]" type strings.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From sdw at lig.net  Tue Mar 30 17:04:59 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:50 2004
Subject: XML <-> non-XML filter project
References: <00cd01be7a6c$0d879ac0$0300000a@cygnus.uwa.edu.au>
Message-ID: <3700F006.45A98468@lig.net>

I like this idea and a few weeks ago was evangelizing a similar idea:
<diatribe id="OOXMLSysAdmin">
<note>Bear with me, there is a good XML tie in at the end...</note>
<problem>
I was considering what was wrong with the way that OS and application configuration is handled
typically.
Of course NT can be a nightmare because of registry problems and centrality.  Unix/Linux is
somewhat easier to manage but still needs changes distributed throughout a filesystem tree,
often with minor variations between Unix vendors, Linux, BSD, etc.
</problem>

Furthermore, there are a few obvious goals that come to mind in designing a perfect system
administration environment:

<goal>
Applications and OS modules should be 'object oriented' in the sense that as much as possible
all programs, data, files, temp space, logging, and especially configuration are stored in an
area partitioned from everything else in a predictable way.  This could mean that you have one
directory that contains everything and is referenced one way or another from all of the
appropriate subsystems.

</goal><goal>

Configuration should be straightforward, portable (between OS's, hardware, etc.) and easily
editable in most circumstances.

</goal><goal>

Upgrading the OS to a new version, distribution, etc. should be trivial and not require
reinstalling all applications.  Conversely, it should be trivial to copy an application to
another system with the same OS.  This includes easy backups and restores.
</goal>

I can see two major ways to solve these problems:
<solution>
Modification to standard subsystems and/or OS initialization sequences to expect modular
installations of applications.  For instance, Oracle requires userid's in /etc/passwd, OS
parameter changes, daemon startup in /etc/rc.d/init.d, environment variables for all or most
users, etc. etc.  Normally you change all kinds of things, add it to your path, add the
libraries to your path, include directories, Java library to your CLASSPATH, etc.

All of this (everything except possibly allocation of data space, although that's feasible
also in simple or default cases) should go into /opt/oracle (for instance) in ways that are
automatically picked up by boot up and/or user login actions.  For instance, I typically
modify /etc/profile once to add /opt/*/bin to PATH and /opt/*/bin/lib/*.jar to CLASSPATH,
etc.  It would be fairly easy to add users virtually to /etc/passwd with PAM modifications.
System parameters could be computed by the max of all mentions in /opt/*/config/osparam.
Environments could have all /opt/*/config/profile contents 'sourced'.  Etc.

</solution><solution>

In fact, the base OS installation (say a Linux distribution) should be read-only and all
changes made in a logical 'overlay' tree.
</solution>

Because all of this requires cooperation with people defining the 'correct' way to do things
and those putting together distributions or OS versions, I came up with another way that is
almost equivalent:

<solution>
For many things, especially standard OS parameters, configuration can be indicated in a nearly
generic way by creating logical XML files in, say, /config.  These files could easily handle
most common configuration and be operated on by installers with a standard feature set but
which are built specifically for the operating system they run on.

As an example, /config/network.xml could contain system name, domain name, network IP
addresses, masks, routes, etc.  Services to start at boot and/or login could be listed and
controlled.  Users to add to the box could be configured.  Filesystems to export, etc.  These
files could be used on any OS and a local installer would know how to install the equivalent
configuration into native config files, along with restarting daemons or reloading
configuration.

This would completely eliminate, for many users and purposes, any problems with fluctuations
with how a particular Unix stores system name (which varies) or network configuration (which
varies), etc.  I have worked with something like 10-15 different Unix OS which all vary more
from a system administration standpoint than anything.  Oddly enough, this would work just as
well for Win98 or WinNT since the installer could update the registry appropriately.
</solution>
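
As a rough illustration of the installer idea, here is a minimal sketch in Python. Everything specific in it is an assumption: the element and attribute names in the sample network.xml, the addresses, and the choice of ifconfig-style commands as the "native" output. A real installer would have one such back end per operating system.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of /config/network.xml.
NETWORK_XML = """\
<network>
  <hostname>io</hostname>
  <domain>example.org</domain>
  <interface name="eth0" address="192.0.2.10" netmask="255.255.255.0"/>
</network>
"""

def render_ifconfig(xml_source):
    """One possible back end: turn the generic description into native commands."""
    root = ET.fromstring(xml_source)
    lines = [f"hostname {root.findtext('hostname')}.{root.findtext('domain')}"]
    for nic in root.findall("interface"):
        lines.append(
            f"ifconfig {nic.get('name')} {nic.get('address')} netmask {nic.get('netmask')}"
        )
    return lines
```

The same XML could be fed to a different back end that writes registry keys or an rc file, which is the portability argument made above.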

Is there some reason we haven't done this already?
</diatribe>

sdw

James Tauber wrote:

> Earlier this month, I posted the following to XSL-LIST. With apologies to
> those who received it there, I'm posting it (modified) here to see if anyone
> is interested in some co-operative effort in this area.
>
> What I would like to see is people taking existing non-XML formats and
> developing:
>
>      a) a URI for the non-XML format (for notations and for the namespace of
> the XML format)
>      b) a DTD representing the existing non-XML format
>      c) an output filter to convert documents conforming to the DTD into the
> non-XML format
>      d) (possibly) an input filter to convert the non-XML format into XML
>
> There are individual cases of this sort of thing[1] but I would like to see
> some sort of co-operative effort to produce a large number of these things.
> I'm not envisaging complex filters, just a simple XML representation of the
> non-XML format so that purely XML tools like editors, query engines, XSL
> engines can operate on non-XML formats. There are plenty of applications
> including generation of these files on the basis of other XML documents (I
> need this for Makefiles on my websites) and literate programming.
>
> I would personally find great value in this being done for Makefiles,
> procmail files, simple shell scripts and PalmPilot databases. Others of
> value I can think of include Windows INI files, Unix mailboxes, your
> favourite programming language...
>
> If there is enough interest I am more than willing to coordinate these
> efforts. Just let me know.
>
> James
>
> [1] http://www.xmlsoftware.com/convert/
>

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From oren at capella.co.il  Tue Mar 30 17:18:45 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:51 2004
Subject: Fw: XML query language
Message-ID: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il>

Paul Janssens <paul.janssens@skynet.be> wrote:
>In my opinion, an xml query language should only describe a set of
>equations, an xml query language implementation should only solve these
>equations, and whatever is done with the result is NO business of the
>query language.


Just to make sure I follow: you'd prefer that there would be a standard
<xql:result> DTD, so that results would always be created in an XML format
containing references to the matched XML elements (XLink/XPointer?). The
user would then filter this through XSL or whatever to display the results.

Nice separation of concerns, but I see several objections:

- Efficiency. Suppose I'm querying a very large DB, and I'm getting a list
of matches scattered all over the place. In the current approach, the DB
would both resolve the matches and extract the necessary data, potentially
at the same pass using a lot of locality-of-reference optimizations. In your
method a second tool would re-fetch the references in a second phase, which
would probably double the cost of doing the query.

- Power. Assume that I hypnotize all the W3C members to adopt the XSL
transformational part as XQL version 1.0 :-) This is more powerful than
current ?QL proposals because it allows for an <xsl:template> to call
<xsl:apply-templates> - that is, to perform nested queries (and therefore,
BTW, offers a natural way to do joins without variables, and solves other
?QL problems). All this works because XSL has a rich language for
constructing the results. In your approach, you won't be able to do a lot of
that; you'd end up adding special constructs for them, duplicating XSL's
capabilities in an incompatible language. Of course you'd be in good
company - that is what all the other ?QL language proposals do :-)

- Convenience. It is easier to specify a query as just "one thing" instead
of two. Note that even if ?QL == XSL transformation, it still makes a lot of
sense to filter its results through another XSL stylesheet for presentation
in most cases. Even lazy users will do so - if, for example, they had
already available XSL sheets for displaying certain types of results.

So all in all I prefer my approach: XQL = XSL - FO.

Share & Enjoy,

    Oren Ben-Kiki




From paul at prescod.net  Tue Mar 30 18:14:58 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:51 2004
Subject: Fw: XML query language
References: <009901be7aac$e9ad60d0$5402a8c0@oren.capella.co.il>
Message-ID: <3700F24E.3296C03F@prescod.net>

Oren Ben-Kiki wrote:
> 
> Paul Janssens <paul.janssens@skynet.be> wrote:
> >I think (iii)
> (results should be XML)
> >should not be a requirement of an XML query language. The
> >result of a query  could be a vector of tuples of pointers to the
> >individual matches. Whatever needs to be done with that output can be
> >done in a layer above that.
> 
> I fail to see the benefit in inventing a new format for query results.

It isn't about a format. Query languages do not typically work on formats.
They have an input data model (e.g. a relational database) and they have
an output model (e.g. a set of records). An XML Query Language should also
work in terms of the XML data model (the information set).

> First,
> a set of tuples with pointers, or whatever else, can be easily expressed in
> XML. Second, if one wants to obtain 'pointers to the output', then it should
> be a simple matter of constructing in the result a pointer to the matched
> tree (<A href="..."> or something) instead of the matched tree itself.

The IDL for an XML QL should be something like:

NodeList XMLQuery( DOC doc, String query )

Your alternative is:

String XMLQuery( String inputdoc, String query )
or
DOM XMLQuery( DOM inputdoc, String query )

That's just forcing the query engine to do more work -- much of it
unnecessary in most cases.

Let's put it this way: you are saying that the query engine should build a
list of pointers, build a tree, generate XPointer attributes just so that
an application can get back the original list of pointers!

If the application wants to turn the list of pointers into a tree, it can
do so. That's what XSL does.
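
A minimal sketch of the node-list signature above, in Python rather than
IDL. It assumes the standard library's xml.etree.ElementTree as a stand-in
for the DOM and its limited XPath subset as a stand-in for the query
language; xml_query and the sample document are hypothetical names, not
part of any proposal discussed in this thread.

```python
import xml.etree.ElementTree as ET
from typing import List


def xml_query(doc: ET.Element, query: str) -> List[ET.Element]:
    """Return the matching nodes themselves, not a serialized copy."""
    return doc.findall(query)


doc = ET.fromstring(
    "<library>"
    "<book year='1998'><title>XML Basics</title></book>"
    "<book year='1999'><title>Query Languages</title></book>"
    "</library>"
)

matches = xml_query(doc, ".//book[@year='1999']")

# Each result is a reference into the original tree, so the caller can
# read it, change it in place, or serialize it, whichever it needs.
for node in matches:
    node.set("checked", "yes")

titles = [node.findtext("title") for node in matches]
print(titles)

# Building XML output is a separate step the application opts into.
print(ET.tostring(matches[0], encoding="unicode"))
```

Because the list holds references rather than copies, the attribute set in
the loop is visible in the original document, and nothing is serialized
unless the caller asks for it.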

> AFAIK all XML QL proposals produce XML as output.

No, XQL goes out of its way to NOT require that the output be XML. "The
specification does not indicate the output format. The result of a query
could be a node, a list of nodes, an XML document, an array, or some other
structure. That is, XQL does not dictate the binary format of the returns,
but rather the logical returns." The same is true of XSL patterns.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From b.laforge at jxml.com  Tue Mar 30 18:18:06 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:51 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <003001be7ac9$7561a020$46026982@thing1>

From: David Megginson <david@megginson.com>
>Yes, but it's also no good having a named constant that you cannot use
>in a switch statement.  Unfortunately, Java is broken here, and you
>have to choose one side or another


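Using objects for constants can also cause problems with persistent
data, if you were depending on each constant being a singleton and
testing with ==: a copy read back from storage is generally a different
object, so the identity test fails.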

Bill




From Michael.Kay at icl.com  Tue Mar 30 18:25:15 1999
From: Michael.Kay at icl.com (Kay Michael)
Date: Mon Jun  7 17:10:51 2004
Subject: XML to Text questions
Message-ID: <93CB64052F94D211BC5D0010A80013310EB3CB@WWMESS3.172.19.125.2>

> What tools are available for translation of XML into text?

You could try SAXON; if its XSL can't produce your "physics syntax", you can
augment it with a few Java element handlers.

Mike Kay

From paul.janssens at skynet.be  Tue Mar 30 18:40:12 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:51 2004
Subject: XML query language
References: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il>
Message-ID: <3700FDB6.626E@skynet.be>

Oren Ben-Kiki wrote:
> 
> Paul Janssens <paul.janssens@skynet.be> wrote:
> >In my opinion, an xml query language should only describe a set of
> >equations, an xml query language implementation should only solve these
> >equations, and whatever is done with the result is NO business of the
> >query language.
> 
> Just to make sure I follow: you'd prefer that there would be a standard
> <xql:result> DTD, so that results would always be created in an XML format
> containing references to the matched XML elements (XLink/XPointer?). The
> user would then filter this through XSL or whatever to display the results.

correct

> Nice separation of concerns, but I see several objections:
> 
> - Efficiency. Suppose I'm querying a very large DB, and I'm getting a list
> of matches scattered all over the place. In the current approach, the DB
> would both resolve the matches and extract the necessary data, potentially
> at the same pass using a lot of locality-of-reference optimizations. In your
> method a second tool would re-fetch the references in a second phase, which
> would probably double the cost of doing the query.

That's an implementation issue. You can build a tool that takes both the
query and the style description as input, and optimizes the DB access.
In other words

xml report syntax = xml query syntax + xml style syntax

does NOT imply

xml report implementation = xml query implementation + xml style implementation


> - Power. Assume that I hypnotize all the W3C members to adopt the XSL
> transformational part as XQL version 1.0 :-) This is more powerful then
> current ?QL proposals because it allows for an <xsl:template> to call
> <xsl:apply-templates> - that is, to perform nested queries (and therefore,
> BTW, offers a natural way to do joins without variables, and solves other
> ?QL problems). All this works because XSL has a rich language for
> constructing the results. In your approach, you won't be able to do a lot of
> that; you'd end up adding special constructs for them, duplicating XSL's
> capabilities in an incompatible language. Of course you'd be in good
> company - that is what all the other ?QL language proposals do :-)

I have no problem with recycling some XSL syntax into ?QL where
applicable, in fact it would be a good idea. Just as you could recycle
XPointer syntax where applicable.


> - Convenience. It is easier to specify a query as just "one thing" instead
> of two. Note that even if ?QL == XSL transformation, it still makes a lot of
> sense to filter its results through another XSL stylesheet for presentation
> in most cases. Even lazy users will do so - if, for example, they had
> already available XSL sheets for displaying certain types of results.

The report syntax will allow you to either link to a query and style, or
describe them inline, e.g.

<report>
   <query>...
   </query>
   <style>...
   </style>
</report>


Paul Janssens - paul.janssens@skynet.be



From oren at capella.co.il  Tue Mar 30 18:45:47 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:51 2004
Subject: XML query language
Message-ID: <00fe01be7acc$22045960$5402a8c0@oren.capella.co.il>

Paul Prescod <paul@prescod.net> wrote:

>Oren Ben-Kiki wrote:
>> I fail to see the benefit in inventing a new format for query results.
>
>It isn't about a format. Query languages do not typically work on formats.
>They have an input data model (i.e. a relational data base) and they have
>an output model (i.e. a set of records). An XML Query Language should also
>work in terms of the XML data model (the information set).


Agreed.

>Let's put it this way: you are saying that the query engine should build a
>list of pointers, build a tree, generate XPointer attributes just so that
>an application can get back the original list of pointers!


I don't follow. You yourself have said:

>The IDL for an XML QL should be something like:
>
>NodeList XMLQuery( DOC doc, String query )

Well, then, what other way is there to return a list of XPointers than to
store each in an "element"? This is assuming that you prefer the query engine
to return a list of pointers as a result, which I don't. The one nice thing
about this scheme is that you can add extra data per XPointer - a relevancy
score, for example.

I did mistakenly say:
>> AFAIK all XML QL proposals produce XML as output.
>
>No, XQL goes out of its way to NOT require that the output be XML. "The
>specification does not indicate the output format. The result of a query
>could be a node, a list of nodes, an XML document, an array, or some other
>structure. That is, XQL does not dictate the binary format of the returns,
>but rather the logical returns." The same is true of XSL patterns.


I stand corrected. I think you've phrased it perfectly above - the output
should be defined in the terms of the XML data model.

Have fun,

    Oren Ben-Kiki




From shyutz at ms1.hinet.net  Tue Mar 30 19:38:05 1999
From: shyutz at ms1.hinet.net (Kevin Hsu)
Date: Mon Jun  7 17:10:51 2004
Subject: SGML and XML
Message-ID: <NCBBIOJLOLHDEPAKDDCMEEADCAAA.shyutz@ms1.hinet.net>

Hi,

I know that XML is a subset of SGML, and that SGML is more complex and
detailed, but I must write a paper explaining the difference.
Can anyone tell me the major differences between SGML and XML, or where
I can find information? Thanks in advance!

Kevin




From oren at capella.co.il  Tue Mar 30 19:38:41 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:51 2004
Subject: Fw: XML query language
Message-ID: <010f01be7ad3$85632cf0$5402a8c0@oren.capella.co.il>

Paul Janssens <paul.janssens@skynet.be> wrote:
>I wrote:
>> Nice separation of concerns, but I see several objections:
>>
>> - Efficiency. Suppose I'm querying a very large DB, and I'm getting a list
>> of matches scattered all over the place. In the current approach, the DB
>> would both resolve the matches and extract the necessary data, potentially
>> at the same pass using a lot of locality-of-reference optimizations. In your
>> method a second tool would re-fetch the references in a second phase, which
>> would probably double the cost of doing the query.
>
>That's an implementation issue. You can build a tool that has an input
>of both the query and the style description, and optimizes the DB acces.
>In other words
>
>xml report syntax = xml query syntax + xml style syntax
>
>does NOT imply
>
>xml report implementation = xml query implementation + xml style
>implementation


We are not talking about a "report tool". I think it would be a very rare
application which would do an XML query and would only be interested in
pointers to the result, without requiring any data from that pointer. If
what you call a "report tool" is always integrated into the query tool, it
hardly makes sense to make the distinction; if it isn't, then "non-report"
applications will take the performance hit.

>> - Power. Assume that I hypnotize all the W3C members to adopt the XSL
>> transformational part as XQL version 1.0 :-) This is more powerful then
>> current ?QL proposals because it allows for an <xsl:template> to call
>> <xsl:apply-templates> - that is, to perform nested queries (and therefore,
>> BTW, offers a natural way to do joins without variables, and solves other
>> ?QL problems). All this works because XSL has a rich language for
>> constructing the results. In your approach, you won't be able to do a lot of
>> that; you'd end up adding special constructs for them, duplicating XSL's
>> capabilities in an incompatible language. Of course you'd be in good
>> company - that is what all the other ?QL language proposals do :-)
>
>I have no problem with recycling some XSL syntax into ?QL where
>applicable, in fact it would be a good idea. Just as you could recycle
>XPointer syntax where applicable.


If we agree that an XQL match pattern should be used to select elements in
the DB and that XSL syntax should be used to specify what the XML result
data should be, don't we end up with XSL? Think of it another way. Suppose
we agree to use:

<xql:query match="XQL query pattern">
Other <xql:*> tags for constructing the results...

Then what is the difference between <xql:query> and <xsl:template> and
<xql:*> and <xsl:*>? Why bother having both?

Maybe it would be clearer if we thought about it this way: what feature of
XQL isn't useful in the transformational part of XSL, or vice versa? I can't
think of any. IMVHO both are _applications_ of the general XML -> XML
conversion problem, and any feature relevant for this problem will be
relevant for both.

>> - Convenience. It is easier to specify a query as just "one thing" instead
>> of two. Note that even if ?QL == XSL transformation, it still makes a lot of
>> sense to filter its results through another XSL stylesheet for presentation
>> in most cases. Even lazy users will do so - if, for example, they had
>> already available XSL sheets for displaying certain types of results.
>
>The report syntax will allow you to either link to a query and style, or
>describe them inline, e.g.
>
><report>
>   <query>...
>   </query>
>   <style>...
>   </style>
></report>


Not nearly as convenient. In the query part you'd specify match patterns for
the DB, which automatically generate a list of pointers. You'd then specify
in the style section match patterns for entries in this list, which somehow
dereference them, and then proceed to match on the resulting trees to
generate FO objects (or whatever). There's extra complexity both for the
query writer and for the implementation, which needs to figure out how to
do this in one pass for efficiency.

Does this really have any benefit over matching elements in the DB and
directly specifying which "near-by" elements are of interest using normal
XSL syntax? You would have the option of integrating the transformation to
FOs (or CSS) into this XSL (useful for ad-hoc queries and specialized
applications) or feeding the results to another XSL stylesheet for display
(probably one independent of the query, and fitting a particular display
medium or format).

Share & Enjoy,

    Oren Ben-Kiki




From sdw at lig.net  Tue Mar 30 20:22:55 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:51 2004
Subject: Fw: XML query language  and another OS/XML suggestion
References: <00e101be7abf$f9bb5aa0$5402a8c0@oren.capella.co.il>
Message-ID: <37011E5F.A22DF23B@lig.net>

I don't have a strong opinion yet on xml query languages and results, however:

Maybe the results could be the XLink/XPointer AND the contents that it points to.  That way
you have a canonical reference but also get the contents for efficiency.

Offhand, for many situations, especially database queries, I think that it will be difficult
to always generate a reasonable XLink/XPointer.  With SQL for instance, it is quite common to
create result strings from multiple fields and transformations.  Additionally, results will
often be ephemeral snapshots of something  (stats, processes, connections from /dev/proc/xml
for instance) that have no future reference.  Maybe this can be solved by just having a
'dead-end' link value to communicate these situations as meta-data.

A note on the /dev/proc/xml mention: I've been thinking for a while that EVERY data/meta-data
interface to a typical OS (such as Linux/Unix) should have an XML form.  Maybe add or override
-X  or --XML to all commands where it could possibly make sense.  ps, netstat, lsof, ifconfig,
df, egrep, ls, etc. are all good candidates.  Add simple tree/value extraction to bash and
you'd have more portability for a lot of things.

sdw

Oren Ben-Kiki wrote:

> Paul Janssens <paul.janssens@skynet.be> wrote:
> >In my opinion, an xml query language should only describe a set of
> >equations, an xml query language implementation should only solve these
> >equations, and whatever is done with the result is NO business of the
> >query language.
>
> Just to make sure I follow: you'd prefer that there would be a standard
> <xql:result> DTD, so that results would always be created in an XML format
> containing references to the matched XML elements (XLink/XPointer?). The
> user would then filter this through XSL or whatever to display the results.
>
> Nice separation of concerns, but I see several objections:
>
> - Efficiency. Suppose I'm querying a very large DB, and I'm getting a list
> of matches scattered all over the place. In the current approach, the DB
> would both resolve the matches and extract the necessary data, potentially
> at the same pass using a lot of locality-of-reference optimizations. In your
> method a second tool would re-fetch the references in a second phase, which
> would probably double the cost of doing the query.
>
> - Power. Assume that I hypnotize all the W3C members to adopt the XSL
> transformational part as XQL version 1.0 :-) This is more powerful then
> current ?QL proposals because it allows for an <xsl:template> to call
> <xsl:apply-templates> - that is, to perform nested queries (and therefore,
> BTW, offers a natural way to do joins without variables, and solves other
> ?QL problems). All this works because XSL has a rich language for
> constructing the results. In your approach, you won't be able to do a lot of
> that; you'd end up adding special constructs for them, duplicating XSL's
> capabilities in an incompatible language. Of course you'd be in good
> company - that is what all the other ?QL language proposals do :-)
>
> - Convenience. It is easier to specify a query as just "one thing" instead
> of two. Note that even if ?QL == XSL transformation, it still makes a lot of
> sense to filter its results through another XSL stylesheet for presentation
> in most cases. Even lazy users will do so - if, for example, they had
> already available XSL sheets for displaying certain types of results.
>
> So all in all I prefer my approach: XQL = XSL - FO.
>
> Share & Enjoy,
>
>     Oren Ben-Kiki
>

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From oren at capella.co.il  Tue Mar 30 20:29:02 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:51 2004
Subject: XML query language  and another OS/XML suggestion
Message-ID: <012801be7ada$8574dc00$5402a8c0@oren.capella.co.il>

Stephen D. Williams <sdw@lig.net> wrote:
>A note on the /dev/proc/xml mention: I've been thinking for a while that EVERY data/meta-data
>interface to a typical OS (such as Linux/Unix) should have an XML form.  Maybe add or override
>-X  or --XML to all commands where it could possibly make sense.  ps, netstat, lsof, ifconfig,
>df, egrep, ls, etc. are all good candidates.  Add simple tree/value extraction to bash and
>you'd have more portability for a lot of things.


Wouldn't that be great? The UNIX pipe model has suffered from not having a
standard structured format, as has the /proc file system.  Not to mention
what this could do to an OS like Plan9 where "everything is a file" and
textual formats abound...

However this would be a major undertaking. Maybe someone in the GNU project
would consider it, though.

Have fun,

    Oren Ben-Kiki




From clark.evans at manhattanproject.com  Tue Mar 30 21:00:46 1999
From: clark.evans at manhattanproject.com (Clark Evans)
Date: Mon Jun  7 17:10:51 2004
Subject: SGML and XML
References: <NCBBIOJLOLHDEPAKDDCMEEADCAAA.shyutz@ms1.hinet.net>
Message-ID: <37011E5D.6CE993A1@manhattanproject.com>

Kevin Hsu wrote:
> 
> Hi,
> 
> I know the XML is the subset of SGML , and SGML is more complex and detail ,
> but I must write a paper to tell the difference,
> who can tell me the major difference between the SGML and XML, or where can
> I find information , thanks in advance!

I hope this will help:

-------- Original Message --------
Subject: Advantages of XML and SGML
Date: Fri, 12 Feb 1999 20:28:33 +0000
From: Clark Evans <clark.evans@manhattanproject.com>
Reply-To: Clark Evans <clark.evans@manhattanproject.com>
To: xml-dev@ic.ac.uk
Cc: Susan Barron <susan.d.barron@lmco.com>

Susan Barron wrote:
> 
> We have been using SGML for several years and are closely watching the
> trend towards XML.  Could someone please give me some examples of why
> you would use XML over SGML.  I know that XML is a subset of SGML.  I
> believe there must be some things that can be done in SGML that are not
> possible in XML.  Conversely, there must be somethings that XML does
> better than SGML. Thank you.


Since minimization is allowed in SGML, there are situations
where a document can have multiple syntactic interpretations.
For instance:

<parent>
   <child>

can have two syntactic interpretations:

<parent>
   <child>
   </child>
</parent>

OR

<parent>
</parent>
<child>
</child>

The DTD is required for the parser to figure out which one is
the correct interpretation of the input.  As such, an SGML
document must have one_and_only_one DTD to resolve these
syntactic ambiguities.

XML restricts the syntax by eliminating these minimizations.
Thus, every document has one and only one syntactic interpretation.
This dramatically reduces the complexity of the parser: it
can be simpler to implement, and a DTD is _not_ required
for parsing.
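
A small sketch of the point about parsing without a DTD, assuming Python's
standard xml.etree.ElementTree (a non-validating XML parser that is given
no DTD here). The helper name parses_without_dtd is hypothetical. The
minimized fragment from the example above is rejected outright, while the
fully tagged form parses to exactly one tree.

```python
import xml.etree.ElementTree as ET


def parses_without_dtd(text: str) -> bool:
    """Report whether text is well-formed XML, consulting no DTD."""
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True


# The SGML-style minimized form: end tags omitted.
minimized = "<parent><child>"
# The first of the two interpretations, written out in full.
explicit = "<parent><child></child></parent>"

print(parses_without_dtd(minimized))
print(parses_without_dtd(explicit))

# With every tag explicit, the structure is fixed by the syntax alone.
root = ET.fromstring(explicit)
print(root.tag, [child.tag for child in root])
```

The first call prints False and the second True: an XML parser does not
guess where the missing end tags belong, so there is nothing left for a
DTD to disambiguate.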

This lets the DTD be used for a 100% semantic role, which
is much more interesting for describing data!  This is great
because it allows a document to conform to more than
one DTD at the same time, _without_ requiring a "mother"
DTD that merges all of the DTDs together.  This is
called "Architectures".  It allows multiple meanings
for the same document, depending upon the observer,
without requiring all of the possible observers
to get together and specify a "united" DTD.

However, this added flexibility comes at a price:
the syntax becomes much more restrictive.

Therefore,

  For computer program <=> computer program
  communication, XML is the ideal structure to use,
  since it allows multiple subscribers to have
  their own interpretation of a data stream without
  changing the publishers.

  For human => computer communication, SGML will
  probably remain the preferred structure.
  The minimization features are very valuable
  when a human is the author of the document.
  
  Also, there is nothing saying you can't use both!

  If a human is going to write it by hand, perhaps
  SGML is better; then you can have JClark's SP
  use the DTD to resolve the ambiguities and produce
  the XML document that can be introduced into the
  corporate "xml bus".

Hope this helps!

Clark Evans



From sdw at lig.net  Tue Mar 30 21:01:52 1999
From: sdw at lig.net (Stephen D. Williams)
Date: Mon Jun  7 17:10:51 2004
Subject: XML query language  and another OS/XML suggestion
References: <012801be7ada$8574dc00$5402a8c0@oren.capella.co.il>
Message-ID: <37012794.E1134B00@lig.net>

Yes, the pipe mechanism takes on whole new meaning with XML.

It wouldn't be all that large of a job to do a lot of it.  Obviously adding --XML capabilities
wouldn't be that tough since it's simply adding labeling tags to output that is already
formatted.
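
A rough sketch of what "adding labeling tags to output that is already
formatted" might look like, in Python. Everything here is an assumption
made for illustration: the sample text only imitates df-style columns, and
the element names (df, entry, and the lower-cased column headers) are
invented, since no DTD for system data exists yet.

```python
import xml.etree.ElementTree as ET

# Made-up, df-like output: a header row, then one line per filesystem.
SAMPLE_OUTPUT = """\
Filesystem Blocks Used Available Mounted
/dev/sda1 1000000 250000 750000 /
/dev/sda2 5000000 100000 4900000 /home
"""


def table_to_xml(text: str, root_tag: str, row_tag: str) -> ET.Element:
    """Wrap whitespace-separated columns in elements named after the header."""
    lines = text.strip().splitlines()
    headers = [name.lower() for name in lines[0].split()]
    root = ET.Element(root_tag)
    for line in lines[1:]:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(headers, line.split()):
            ET.SubElement(row, name).text = value
    return root


report = table_to_xml(SAMPLE_OUTPUT, "df", "entry")
print(ET.tostring(report, encoding="unicode"))

# "Simple tree/value extraction": pick one value by structure, not by column.
home_free = report.findtext("entry[mounted='/home']/available")
print(home_free)
```

The labeling step really is mechanical; the choice of names is the part
that needs agreement between tools.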

Even /proc/xml wouldn't be that hard for output, a little tougher for input, but allowed input
would be so restrictive that a simple regex parser would suffice for most things.

Hacking bash in an appropriate way is more difficult; however, there is already an sgrep (SGML
grep), and external tools such as Perl, Java, and Tcl/Tk can handle all of this.

I suppose what we need is a group to start standardizing a DTD that settles what to call
everything in a system (ports, network address, process/thread, user, etc.).  That's probably
the biggest job.

sdw

Oren Ben-Kiki wrote:

> Stephen D. Williams <sdw@lig.net> wrote:
> >A note on the /dev/proc/xml mention: I've been thinking for a while that EVERY data/meta-data
> >interface to a typical OS (such as Linux/Unix) should have an XML form.  Maybe add or override
> >-X  or --XML to all commands where it could possibly make sense.  ps, netstat, lsof, ifconfig,
> >df, egrep, ls, etc. are all good candidates.  Add simple tree/value extraction to bash and
> >you'd have more portability for a lot of things.
>
> Wouldn't that be great? The UNIX pipe model has suffered from not having a
> standard structured format, as has the /proc file system.  Not to mention
> what this could do to an OS like Plan9 where "everything is a file" and
> textual formats abound...
>
> However this would be a major undertaking. Maybe someone in the GNU project
> would consider it, though.
>
> Have fun,
>
>     Oren Ben-Kiki
>

--
OptimaLogic - Finding Optimal Solutions     Web/Crypto/OO/Unix/Comm/Video/DBMS
sdw@lig.net   Stephen D. Williams  Senior Consultant/Architect   http://sdw.st
43392 Wayside Cir,Ashburn,VA 20147-4622 703-724-0118W 703-995-0407Fax 5Jan1999




From paul at prescod.net  Tue Mar 30 21:03:01 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:52 2004
Subject: XML query language
References: <00fe01be7acc$22045960$5402a8c0@oren.capella.co.il>
Message-ID: <370114EA.2525C481@prescod.net>

Oren Ben-Kiki wrote:
> 
> I don't follow. You yourself have said:
> 
> >The IDL for an XML QL should be something like:
> >
> >NodeList XMLQuery( DOC doc, String query )
> 
> Well, them, what other way is to return a list of XPointers then to store
> each in an "element"? 

You don't need an element. You just need a nodelist. Look at the DOM's
brutally named "getElementsByTagName" method. 

Also consider the XSL specification:

> A select pattern must match the production for SelectExpr; it returns 
> the list of nodes that results from evaluating the SelectExpr with the 
> current node as context; the nodes are in the list are in document order.

XPointer is interesting because it doesn't support either interpretation:
"The result of a spanning selection cannot generally be expressed as a
well-formed XML document, nor as a node or list of nodes from an element
tree."

--

If you are asking me what is the syntax for a nodelist then I'll say it
has no syntax. It is an abstraction like the record set returned by a
database. If you have to move the query result between machines then you
can choose an encoding (quite likely XML) but that's outside of the realm
of the query language itself -- it is akin to report writing. 

If you aren't moving data between processes then you shouldn't be forced
to encode it in XML (even a DOM). This is just a general principle that
applies here.

> This is assuming that you prefer the query engine to
> return a list of pointers as a result, which I don't. The one nice thing
> about this scheme is that you can add extra data per XPointer - a relevancy
> score, for example.

I'm not convinced that this is the domain of the query language, but even
if it is then you are asking for an annotated nodelist, not a DOM. If we
do decide to go ahead with annotated nodelists then we would have to add
that to the XML data model.

That still doesn't have anything to do with generating XML elements:
*unless the application wants to do so*.

> I stand corrected. I think you've phrased it perfectly above - the output
> should be defined in the terms of the XML data model.

And that model has a concept of nodelist -- this is the most appropriate
return value for query results.
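
A minimal sketch of that idea, in Python with the standard xml.dom.minidom
module. The sample document and the xml_query helper are invented for
illustration; only getElementsByTagName is the DOM method named above. The
query hands back the nodes themselves, and no result markup is generated
unless the caller chooses to serialize.

```python
from xml.dom.minidom import parseString

# Hypothetical sample document, invented for this illustration.
DOC = parseString(
    "<library>"
    "<shelf><book>A</book></shelf>"
    "<book>B</book>"
    "</library>"
)


def xml_query(doc, tag_name):
    """Stand-in for XMLQuery(doc, query): returns a node list, not markup."""
    return doc.getElementsByTagName(tag_name)


nodes = xml_query(DOC, "book")

# The results are live nodes, so the caller can read them directly...
titles = [node.firstChild.data for node in nodes]

# ...or walk from each result into the surrounding tree.
parents = [node.parentNode.tagName for node in nodes]
```

The two matches sit at different depths of the tree, yet the list carries
them without any encoding of paths.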
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To (un)subscribe, mailto:majordomo@ic.ac.uk the following message;
(un)subscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From paul at prescod.net  Tue Mar 30 21:15:37 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:52 2004
Subject: Fw: XML query language
References: <010f01be7ad3$85632cf0$5402a8c0@oren.capella.co.il>
Message-ID: <370117AE.B5E009AE@prescod.net>

Oren Ben-Kiki wrote:
> 
> We are not talking about a "report tool". I think it would be a very rare
> application which would do an XML query and would only be interested in
> pointers to the result, without requiring any data from that pointer. 

How about deletion? How about changes to the nodes? How about reports of
nodes that have changed?

> If
> what you call a "report tool" is integrated into the query tool, always, it
> hardly makes sense to make the distinction; if it isn't, then "non-report"
> application will get the performance penalty hit.

You are conflating implementation with language specification. XPointer is
a query language that can be used separately from XLink. Does that mean
that XLink implementations have taken a performance hit? No, because you
can choose to integrate XPointer and XLink in a loose way (xptr_filter |
xlink_filter ) or you can choose to implement them tightly. Your choice.

> If we agree that an XQL match pattern should be used to select elements in
> the DB and that XSL syntax should be used to specify what the XML result
> data should be, don't we end up with XSL? 

When you combine the query language with the report generation language
you end up with something very like XSL, yes. But you could use the two
separately. You could embed another query language into XSL (in a perfect
world) and use the query language in another style language or non-style
application.

> Think of it another way. Suppose
> we agree to use:
> 
> <xql:query match="XQL query pattern">
> Other <xql:*> tags for constructing the results...

Right. That's why XQL doesn't have tags for constructing the results. It
leaves that up to XSL, or Python or whatever it is embedded in.

> Maybe it would be clearer if we thought about it this way: what feature of
> XQL isn't useful in the transformational part of XSL, or vice versa? I can't
> think of any. IMVHO both are _applications_ of the general XML -> XML
> conversion problem, and any feature relevant for this problem will be
> relevant for both.

No, XQL has nothing to do with conversion. If I use it to locate nodes in
the tree before deleting them, where is the conversion? Imagine a command
line:

XQL_locate database '/foo/bar["baz"]' | Node_Delete

The language passed between those two commands might be XML. It also might
not. Maybe it is just a list of UUIDs. Maybe it is the offset of the node
into the database store.
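
The same separation can be sketched inside one Python process, using the
standard xml.etree.ElementTree module as a stand-in for the query engine.
This is an analogy under stated assumptions: the function names locate and
delete_nodes are hypothetical, the sample document is invented, and the
ElementTree path "./bar[baz]" is only an approximation of the XQL pattern
above. What passes between the two steps is a plain list of node objects.

```python
import xml.etree.ElementTree as ET


def locate(root, path):
    """Query step: return the matching element objects themselves."""
    return root.findall(path)


def delete_nodes(root, nodes):
    """Consumer step: remove each located node from its parent."""
    # ElementTree keeps no parent pointers, so build a child -> parent map.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    for node in nodes:
        parent_of[node].remove(node)


# Hypothetical sample document.
root = ET.fromstring("<foo><bar><baz/></bar><bar/><qux/></foo>")

hits = locate(root, "./bar[baz]")  # 'bar' children that have a 'baz' child
delete_nodes(root, hits)

remaining = [child.tag for child in root]
```

Nothing in delete_nodes depends on how the nodes were found, so a different
query language could feed it without change.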
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From db at eng.sun.com  Tue Mar 30 23:34:09 1999
From: db at eng.sun.com (David Brownell)
Date: Mon Jun  7 17:10:52 2004
Subject: IE5.0 does not conform to RFC2376
References: <199903211509.AA00016@archlute.apsdc.ksp.fujixerox.co.jp> <36F62D81.A623C0A2@w3.org>
Message-ID: <3701411C.784A1D4A@eng.sun.com>

One lesson:  most web servers should default to using the
"application/xml" MIME content type, not "text/xml"!


Chris Lilley wrote:
> 
> What this RFC appears to do is remove author control over correctly
> labelling the encoding, and ensure that most if not all XML documents
> get incorrectly labelled as US-ASCII.

Not at all.  The best default MIME content type for all web
servers is "application/xml".  Without a "charset=Big5" or
similar declaration, the XML processor's autodetection
kicks in ... minimally handling UTF-8 and UTF-16, and quite
commonly handling a variety of additional encodings.

For example, Sun's XML processor handles about 140 encodings
at last count ... and _does_ conform to RFC 2376.


> So, this RFC removes at a stroke the possibility of authors correctly
> labelling the encoding of their XML documents and takes us back to that
> dark time (the present) when the majority of, say, Japanese Web content
> was mis-labelled. And it seems to have done this simply to save a very
> small part of coding effort for people writing transcoders.

Again, no it doesn't.  The idea is to get the web server to
attach the correct MIME content type, which is NOT "text/xml"
in many/most cases.  Authors must rely on the administrator
not breaking their content, and this is part of it.
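
A rough Python sketch of the decision being argued over. It assumes the
RFC 2376 rules as discussed in this thread: an explicit charset parameter
wins, "text/xml" without one defaults to US-ASCII, and "application/xml"
without one falls back to the XML processor's own detection. The function
names are hypothetical, and the detection shown covers only the UTF-8 and
UTF-16 minimum; a real processor would also read the encoding declaration.

```python
import codecs


def sniff_xml_encoding(head):
    """Guess the encoding from the first bytes of an XML entity."""
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if head.startswith(codecs.BOM_UTF16_BE) or head.startswith(b"\x00<\x00?"):
        return "utf-16-be"
    if head.startswith(codecs.BOM_UTF16_LE) or head.startswith(b"<\x00?\x00"):
        return "utf-16-le"
    # No byte order mark: UTF-8 unless an encoding declaration says
    # otherwise (reading the declaration is omitted from this sketch).
    return "utf-8"


def effective_encoding(media_type, charset, head):
    """Pick the encoding a receiver should use for an XML response."""
    if charset:
        return charset.lower()        # explicit charset parameter wins
    if media_type == "text/xml":
        return "us-ascii"             # the default being objected to
    return sniff_xml_encoding(head)   # application/xml: autodetect


text_default = effective_encoding("text/xml", None, b"<?xml")
app_default = effective_encoding("application/xml", None, b"\xff\xfe<\x00?\x00")
labelled = effective_encoding("application/xml", "Big5", b"")
```

With these rules an unlabelled UTF-16 document survives under
"application/xml" and is misread as US-ASCII under "text/xml".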

- Dave



From leventer at uol.com.br  Wed Mar 31 03:34:39 1999
From: leventer at uol.com.br (=?iso-8859-1?Q?Maur=EDcio_Leventer?=)
Date: Mon Jun  7 17:10:52 2004
Subject: unsubscribe leventer@uol.com.br
Message-ID: <001701be6d48$fe3658c0$b299d3c8@leventeruol.com.br>

unsubscribe leventer@uol.com.br




From ricko at allette.com.au  Wed Mar 31 05:07:58 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:10:52 2004
Subject: XML <-> non-XML filter project
Message-ID: <002a01be7b1b$977e1ab0$44f96d8c@NT.JELLIFFE.COM.AU>


From: James Tauber <jtauber@jtauber.com>

>What I would like to see is people taking existing non-XML formats and
>developing:
>
>     a) a URI for the non-XML format (for notations and for the
>        namespace of the XML format)
>     b) a DTD representing the existing non-XML format
>     c) an output filter to convert documents conforming to the DTD
>        into the non-XML format
>     d) (possibly) an input filter to convert the non-XML format into
>        XML

There is a project somewhat like this through FSF: the "GNU Filters".
They have an Excel to XML filter now, from memory. I think this is a
good project to support.

http://www.fsf.org/

Rick Jelliffe




From murata at apsdc.ksp.fujixerox.co.jp  Wed Mar 31 06:54:49 1999
From: murata at apsdc.ksp.fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:10:52 2004
Subject: IE5.0 does not conform to RFC2376
In-Reply-To: <3701411C.784A1D4A@eng.sun.com>
Message-ID: <199903310453.AA00111@archlute.apsdc.ksp.fujixerox.co.jp>

David Brownell wrote:
> Again, no it doesn't.  The idea is to get the web server to
> attach the correct MIME content type, which is NOT "text/xml"
> in many/most cases.  Authors must rely on the administrator
> not breaking their content, and this is part of it.

"application/xml" is appropriate for some XML data.  On the other
hand, if you do not want to lose the fallback to text/plain, "text/xml"
is the right choice.

Cheers,

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata@apsdc.ksp.fujixerox.co.jp



From jtauber at jtauber.com  Wed Mar 31 10:17:57 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:52 2004
Subject: XML query language  and another OS/XML suggestion
References: <012801be7ada$8574dc00$5402a8c0@oren.capella.co.il>
Message-ID: <00fe01be7b4e$a509c8e0$0300000a@cygnus.uwa.edu.au>

> Wouldn't that be great? The UNIX pipe model has suffered from not having a
> standard structured format, as has the /proc file system. Not to mention
> what this could do to an OS like Plan9 where "everything is a file" and
> textual formats abound...

This is pretty much what I was suggesting a little while ago on this list
(in the same breath as Überdocument). I'm still trying to find the time to
work a bit more on it. I certainly have a lot of ideas about it so if others
are interested in helping with implementation, I'd love them to drop me an
email. My idea involves a layer on top of the operating system that treats
the operating system as one big XML document (hence the phrase "Überdocument
shell" which I used at the time).

I'm thinking of calling it "Plan X" which both includes the mandatory "X"
for association with XML and suggests, via roman numeral, a continuation of
the thinking of Plan 9.

James




From branjan at wipinfo.soft.net  Wed Mar 31 10:32:31 1999
From: branjan at wipinfo.soft.net (Balaji Ranjan)
Date: Mon Jun  7 17:10:52 2004
Subject: snmp in XML
Message-ID: <Pine.LNX.3.96.990331135959.6270A-100000@hardy.wipinfo.soft.net>

Hi,
  Is there an SNMP representation in XML that does not use the CIM standard
but represents a MIB in an XML way?

regards
Balaji Ranjan




From Mark.Birbeck at iedigital.net  Wed Mar 31 10:40:54 1999
From: Mark.Birbeck at iedigital.net (Mark Birbeck)
Date: Mon Jun  7 17:10:52 2004
Subject: XML query language
Message-ID: <A26F84C9D8EDD111A102006097C4CD0D054AE8@SOHOS002>

Paul Prescod wrote:
> And that model has a concept of nodelist -- this is the most 
> appropriate return value for query results.

What do you mean by nodelist? Does it take into account that result
nodes may be returned from different parts of the tree, or even at
different depths? It would be quite inefficient to encode the entire
path of each node and just list each result.

We use a variation on the fragment spec that allows both of these
conditions to be met, for example:

<p:package xmlns:p="http://www.w3.org/XML/Package/1.0">
    <f:fcs xmlns:f="http://www.w3.org/XML/Fragment/1.0"
           fragbodyref="http://test.ied-ied.ied-support.net/documents/article[author='Mark']">
        <People/>
        <Documents>
            <f:fragbody IDREF="#1"/>
            <article/>
            <f:fragbody IDREF="#2"/>
            <article/>
            <article/>
        </Documents>
    </f:fcs>
    <p:page ID="1">
        <article>
            <author>Mark</author>
            <text>
                ...
            </text>
        </article>
    </p:page>
    <p:page ID="2">
        <article>
            <author>Mark</author>
            <text>
                ...
            </text>
        </article>
    </p:page>
</p:package>

[Note that the ID/IDREF part is not in the fragment spec. Only one
fragbody/page pair is allowed.]

I think the useful things about the fragment spec are:
- the initial query is encoded in the container of the results
  (fragbodyref)
- you get the context of your results set. An application could now
  modify these results - say add a paragraph of text - and have enough
  info to do the work
- nodes could be returned from anywhere in the hierarchy
- a remote application could keep its own DOM model of the hierarchy,
  and only request nodes it needs as and when it needs them

In fact, we love it so much that we use it for everything that is
returned from our server! Even one article is returned as a fragment.

Interested to know what people think of this approach.

Regards,

Mark

Mark Birbeck
Managing Director
Intra Extra Digital Ltd.
39 Whitfield Street
London
W1P 5RE
w: http://www.iedigital.net/
t: 0171 681 4135
e: Mark.Birbeck@iedigital.net




From oren at capella.co.il  Wed Mar 31 12:17:16 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:10:52 2004
Subject: Fw: XML query language
Message-ID: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il>

Paul Prescod <paul@prescod.net> wrote:

>I wrote:
>> Well, then, what other way is there to return a list of XPointers than to store
>> each in an "element"?
>
>You don't need an element. You just need a nodelist. Look at the DOM's
>brutally named "getElementsByTagName" method.


You mean the NodeList contains the matched nodes directly, and not XPointers
which point to them. Presumably these nodes can be used either to access the
tree in the vicinity of the match, or to obtain other data regarding the
node (such as a fast ID for direct access to a DB record or whatever). That
does make more sense than Paul Janssens' proposal (returning XPointers).

>If you are asking me what is the syntax for a nodelist then I'll say it
>has no syntax. It is an abstraction like the record set returned by a
>database. If you have to move the query result between machines then you
>can choose an encoding (quite likely XML) but that's outside of the realm
>of the query language itself -- it is akin to report writing.


No standard way to represent a query result as text? I find this strange.
But if the result is a node list, wouldn't fragments somehow resolve this?
After all, each node is a fragment...

>If you aren't moving data between processes then you shouldn't be forced
>to encode it in XML (even a DOM). This is just a general principle that
>applies here.


The output of an XSL processor is also not forced to be encoded in XML (it
might be a DOM or even a display on the screen), but it is very helpful to
have a standard XML encoding for it (witness the current XSL
implementations). Shouldn't the same hold for XQL?

And in a separate message:


>> Think of it another way. Suppose
>> we agree to use:
>>
>> <xql:query match="XQL query pattern">
>> Other <xql:*> tags for constructing the results...
>
>Right. That's why XQL doesn't have tags for constructing the results. It
>leaves that up to XSL, or Python or whatever it is embedded in.

Both XML-QL and XQL have ways to construct results (CONSTRUCT and
<xql:result>). I feel that _if_ XML is to be constructed as a result of an
XML query then XSL is the language to do so; there's no need to invent a new
construction language. Can we agree on this?

>No, XQL has nothing to do with conversion. If I use it to locate nodes in
>the tree before deleting them, where is the conversion? Imagine a command
>line:
>
>XQL_locate database '/foo/bar["baz"]' | Node_Delete
>
>The language passed between those two commands might be XML. It also might
>not. Maybe it is just a list of UUIDs. Maybe it is the offset of the node
>into the database store.

OK, if what you are saying is:

- We have two languages:
  (i) matching of XML elements, which we'll call XQL for the moment, and is
basically the XSL match pattern language;
  (ii) constructing XML trees from other XML trees which we'll call XTL for
the moment and is basically the <xsl:*> tags.
- XSL is the combination of both (plus FO objects).
- XQL is usable in other contexts than XTL.
- There's no standard XML construction syntax other than XTL.

Then we agree. I'd also add:

- We should have separate specs for XQL, XTL, and FOs. The XTL spec should
simply reference the XQL spec. The FO spec should be independent.
- XQL should be used wherever a set of XML elements needs to be selected
from an XML tree.
- So therefore CSS should allow using XQL in its selectors. For that matter,
CSS should allow an XML syntax :-)
- And also XPointers?

Actually, what is the difference between XPointer syntax and XQL (as defined
above)? Both allow matching elements according to the structure of the XML
tree and/or the value of attributes. The syntax is different and the set of
capabilities doesn't exactly match. Is it just due to historical reasons
that XSL isn't using (possibly enhanced) XPointers in its match patterns?

Share & Enjoy,

    Oren Ben-Kiki




From macherius at darmstadt.gmd.de  Wed Mar 31 13:01:03 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:10:52 2004
Subject: Fw: XML query language
In-Reply-To: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il>
Message-ID: <199903311059.MAA19602@sonne.darmstadt.gmd.de>

Oren Ben-Kiki <oren@capella.co.il> wrote at 31 Mar 99, 12:12:

> Paul Prescod <paul@prescod.net> wrote:
> >I wrote:
> >> Well, then, what other way is there to return a list of XPointers than to store
> >> each in an "element"?
> >
> >You don't need an element. You just need a nodelist. Look at the DOM's
> >brutally named "getElementsByTagName" method.

What an XML query should return depends on what the results are needed
for. There is no such thing as "the right way" to use an XML query
language. Look at who was at the W3C-QL workshop '98 and what they asked
for:

1. Information Retrieval
XML seen as: Collection of text documents
Formalisms offered: Z39.50, RDF, WebSQL, PAT, ...
Query result needed: References to relevant documents

2. WWW information systems
XML seen as: Abstraction of heterogeneous data sources and services
Formalisms offered: HTTP, CGI, URI
Query result: Integrated data sources and services

3. Database community (both relational and OO):
XML seen as: Set of structured facts (order doesn't matter)
Formalisms offered: SQL, OQL
Query result: Set of (re)structured facts (order doesn't matter)

4. Document processors
XML seen as: Structured text (order matters)
Formalisms offered: XSL selectors
Query result: Pointers to selected text fragments (order matters) for 
further processing (e.g. by XSL templates or programming languages)

5. Document transformation
XML seen as: Syntax tree
Formalisms offered: hedge automata
Query result: Transformed syntax tree

6. Hypertext community
XML seen as: Graph of structured nodes connected by Hyperlinks
Formalisms offered: XLink, XPointer
Query result: Locations within a structured node

All of those need a QL. But all have different constraints (e.g. 
Hypertext needs a QL to fit in URL) and want different results 
(pointers to documents vs. documents vs. restructured documents).

David Maier identified five fundamental operations in XML queries:
1. Selection of elements 
	depending on content, structure or attributes
2. Extraction of elements
3. Reduction of elements
4. Restructuring of documents
5. Combination of elements

Looking at the user groups, e.g. neither Hypertext nor information 
retrieval will need restructuring or combination. Document processing 
will need all 5 operations. 

Right now XQL offers operations 1-3, XSL offers operations 1-4 and
XML-QL offers operations 1-5 (at the cost of losing order).

You suggest using XPointers as the result of XML queries. XPointers,
from my point of view, are queries by themselves. Being from the
database community, I want restructured XML as a result. Who is right?
No one. It just depends on the way you look at it.

	++im


--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From paul.janssens at skynet.be  Wed Mar 31 13:45:21 1999
From: paul.janssens at skynet.be (Paul Janssens)
Date: Mon Jun  7 17:10:52 2004
Subject: Fw: XML query language
References: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il>
Message-ID: <37020A2D.47DF@skynet.be>

Oren Ben-Kiki wrote:
>...
> 
> You mean the NodeList contains the matched nodes directly, and not XPointers
> which point to them. Presumably these nodes can be used either to access the
> tree in the vicinity of the match, or to obtain other data regarding the
> node (such as a fast ID for direct access to a DB record or whatever). That
> does make more sense than Paul Janssens' proposal (returning XPointers).

The XPointer proposal was for a text-based result of the query (a
standard DTD if you like), but at the API level you just need references
to the live node(s); that's obvious.


Paul Janssens - paul.janssens@skynet.be



From branjan at wipinfo.soft.net  Wed Mar 31 16:30:00 1999
From: branjan at wipinfo.soft.net (Balaji Ranjan)
Date: Mon Jun  7 17:10:52 2004
Subject: hi
Message-ID: <Pine.LNX.3.96.990331195555.11309B-100000@hardy.wipinfo.soft.net>

Hi all,
 Has anybody got a consolidated archive of XML examples, from the list or
outside? Kindly pass it on to me so that I can learn more about using XML.

thanks and regards
Balaji Ranjan
Wipro infotech
B'lore
Web Biz Card: http://eCode.com/?brn




From elharo at metalab.unc.edu  Wed Mar 31 16:31:56 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:10:52 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
In-Reply-To: <003001be7ac9$7561a020$46026982@thing1>
Message-ID: <v03102801b327d5f63b07@[168.100.203.234]>

At 11:22 AM -0500 3/30/99, Bill la Forge wrote:
>
>
>Using objects for constants can also cause problems with persistent
>data, if you were depending on a singularity and testing with ==.
>

This isn't a problem with the syntax I've described because there is only a
fixed set of objects in which identity comparisons are the same as equality
comparisons.

The issue of switch statements is a little more serious. However, you can
always use if-else.


+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|        XML: Extensible Markup Language (IDG Books 1998)            |
|   http://www.amazon.com/exec/obidos/ISBN=0764531999/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://sunsite.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://sunsite.unc.edu/xml/     |
+----------------------------------+---------------------------------+




From b.laforge at jxml.com  Wed Mar 31 17:52:18 1999
From: b.laforge at jxml.com (Bill la Forge)
Date: Mon Jun  7 17:10:52 2004
Subject: SAX2: DTDDeclHandler (minimalist position)
Message-ID: <000b01be7b8f$22edc460$c8a8a8c0@thing1>

From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
>>Using objects for constants can also cause problems with persistent
>>data, if you were depending on a singularity and testing with ==.
>>
>
>This isn't a problem with the syntax I've described because there is only a
>fixed set of objects in which identity comparisons are the same as equality
>comparisons.


How do you maintain singularities when deserializing a JavaBean which
contains a reference to one of these objects? 

That is to say, you have a constant which references an object. No problem.

Now you have a bean with a variable which has been assigned the constant
value. No problem.

Now you save the bean. No problem.

Now you deserialize the bean. No problem.

Now you test the value of the variable in the bean with ==. Whoops. The test
always returns false.

Conclusion: using objects for constants is great unless you are using Java
Serialization or almost any other kind of persistence.
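
The sequence above can be reproduced in a few lines of Python. This is an
analogy, not Java: the Kind class, its registry and the save/load helpers
are all hypothetical, and the round trip is hand-written rather than done
with Java Serialization. A load that builds a fresh object breaks identity
comparison, while a load that resolves the stored name back to the
canonical instance (the role readResolve() plays for a Serializable Java
class) keeps it.

```python
class Kind:
    """A constant represented by a unique object (hypothetical example)."""

    _registry = {}

    def __init__(self, name):
        self.name = name
        Kind._registry[name] = self

    @classmethod
    def resolve(cls, name):
        """Map a stored name back to the canonical instance."""
        return cls._registry[name]


CDATA = Kind("CDATA")  # the one and only instance for this constant


def save(kind):
    """Persist the constant by value."""
    return {"name": kind.name}


def load_naive(state):
    """Rebuild a brand-new object, as a default deserializer would."""
    obj = Kind.__new__(Kind)  # bypass __init__ so the registry is untouched
    obj.name = state["name"]
    return obj


def load_resolved(state):
    """Look the constant up instead of rebuilding it."""
    return Kind.resolve(state["name"])


naive_is_same = load_naive(save(CDATA)) is CDATA
resolved_is_same = load_resolved(save(CDATA)) is CDATA
```

Here naive_is_same is False and resolved_is_same is True, which is the
difference between the failing test and a deserializer that restores the
singularity.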

Bill




From mik at owl.co.uk  Wed Mar 31 18:19:22 1999
From: mik at owl.co.uk (Michael Ewins)
Date: Mon Jun  7 17:10:52 2004
Subject: Attribute-Value Normalisation
Message-ID: <041201be7b91$7fd0b050$2096c9c2@mik-ppro.owl.co.uk>


Can someone help me or point me toward an appropriate FAQ?

In the XML specification it says attribute values will be normalised. However,
the explanation isn't clear to me so I'll try to clear up my understanding
through an example.

<!ATTLIST   MYDOC   file   CDATA   #IMPLIED >

I have an element MYDOC and this has an attribute that references a filename
which is CDATA.

An example document might read

<MYDOC file="c:\temp\hello.txt"/>

Will this attribute be normalised if it contains any whitespace? For example,
"space   morespace.txt" is a valid filename on Windows but I need to know if XML
will attempt to normalise the multiple whitespace to
"space morespace.txt" or something equally wrong.

Essentially my question is, if I have a valid filename (on Windows or Mac) that
I use as an attribute value in XML can I be sure the parser will pass this on
unchanged?

thanks in advance for any help...

--
Michael Ewins
Panasonic OWL -- mik@owl.co.uk -- http://www.owl.co.uk
home -- michael_ewins@hotmail.com




From jtauber at jtauber.com  Wed Mar 31 19:45:41 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:10:53 2004
Subject: Attribute-Value Normalisation
References: <041201be7b91$7fd0b050$2096c9c2@mik-ppro.owl.co.uk>
Message-ID: <00d101be7b9d$e63ea660$0300000a@cygnus.uwa.edu.au>

> <!ATTLIST   MYDOC   file   CDATA   #IMPLIED >
[...]
> <MYDOC file="c:\temp\hello.txt"/>
>
> Will this attribute be normalised if it contains any whitespace?

If it is declared CDATA, the only whitespace normalization is that
carriage-returns, line-feeds and tabs are normalized to spaces (with a CR+LF
being normalized to only a single space).

Only if it were *not* CDATA would multiple spaces be normalized to one.

[...]
> Essentially my question is, if I have a valid filename (on Windows or Mac)
> that I use as an attribute value in XML can I be sure the parser will pass
> this on unchanged?

As long as the filename didn't contain a < or & that you include literally.
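
A quick way to check this, sketched in Python with the standard library's
expat-based ElementTree parser. No DTD is supplied here, so the attribute is
treated as CDATA; the helper name read_file_attr and the sample filenames
are made up for illustration. The last two lines show one way to handle a
name containing & or <, by escaping it with xml.sax.saxutils.quoteattr
before writing it into the document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def read_file_attr(xml_text):
    """Parse a one-element document and return its 'file' attribute."""
    return ET.fromstring(xml_text).get("file")


# Runs of ordinary spaces in a CDATA attribute are passed through unchanged.
spaces = read_file_attr('<MYDOC file="space   morespace.txt"/>')

# A literal tab inside the attribute value is normalized to a single space.
tabbed = read_file_attr('<MYDOC file="a\tb.txt"/>')

# A name containing & or < must be escaped when it is written out;
# quoteattr returns the escaped value already wrapped in quotes.
escaped = quoteattr("R&D <draft>.txt")
round_trip = read_file_attr("<MYDOC file=%s/>" % escaped)
```

After escaping, the parser hands back the original name, so the round trip
is lossless for names that contain markup characters.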

James
--
James Tauber / jtauber@jtauber.com / www.jtauber.com
Associate Researcher, Electronic Commerce Network
Curtin University of Technology, Perth, Western Australia

Full-day XML Tutorial @ WWW8 : http://www8.org/

Maintainer of : www.xmlinfo.com,  www.xmlsoftware.com and www.schema.net




From paul at prescod.net  Wed Mar 31 20:07:19 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:53 2004
Subject: XML query language
References: <A26F84C9D8EDD111A102006097C4CD0D054AE8@SOHOS002>
Message-ID: <37025CF2.D186E248@prescod.net>

Mark Birbeck wrote:
> 
> Paul Prescod wrote:
> > And that model has a concept of nodelist -- this is the most
> > appropriate return value for query results.
> 
> What do you mean by nodelist? Does it take into account that result
> nodes may be returned from different parts of the tree, or even at
> different depths? 

Sure. A node list is a list of nodes. No more, no less.

> It would be quite inefficient to encode the entire
> path of each node and just list each result.

Query languages have nothing to do with encodings. That's the point I'm
trying to make. If you want to make a "query results encoding language" --
great. Ideally it would work with the results returned by *any query
language*. But you *must* be able to use the query language without the
query encoding language -- i.e. in the middle of a Python or Java program,
in a stylesheet, in a GUI.

> In fact, we love it so much that we use it for everything that is
> returned from our server! Even one article is returned as a fragment.
> 
> Interested to know what people think of this approach.

It looks good for the special case where the query results must be
communicated between processes. It isn't useful for the other cases. In
the middle of my Python or Java program I'm certainly not going to do a
query and then re-parse the results. The results should be returned as a
list of PyObject or java.lang.Object references.

Summary: If the query language is going to have maximum usefulness it must
not specify that the results must be encoded in any special syntax or that
they must be encoded at all. Encoding results is another important but
separate issue (just as it is in SQL).
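
A rough sketch of that separation in Python. The standard
xml.etree.ElementTree module stands in for a query engine here (its
findall() path syntax is not XQL), and the sample document and the encode()
helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    "<site><article><title>One</title></article>"
    "<section><article><title>Two</title></article></section></site>"
)

# The "query" returns a plain list of object references; the two hits
# come from different parts of the tree and sit at different depths.
hits = root.findall(".//title")

# An in-process caller uses the objects directly, with no re-parsing.
titles = [hit.text for hit in hits]

def encode(nodes):
    """Hypothetical, separate encoding step for crossing a process boundary."""
    body = "".join(ET.tostring(node, encoding="unicode") for node in nodes)
    return "<results>" + body + "</results>"

wire = encode(hits)
print(titles)
print(wire)
```

Only the last two lines of work involve an encoding at all; a caller that
stays in-process never needs encode().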

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From paul at prescod.net  Wed Mar 31 20:35:20 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:53 2004
Subject: XQL and XPointer
Message-ID: <37026166.77965C31@prescod.net>

Oren Ben-Kiki asks:
> Actually, what is the difference between XPointer syntax and XQL (as defined
> above)? Both allow matching elements according to the structure of the XML
> tree and/or the value of attributes. The syntax is different and the set of
> capabilities doesn't exactly match. Is it just due to historical reasons
> that XSL isn't using (possibly enhanced) XPointers in its match patterns?

Yes, the reasons are mostly historical, but there are technical issues too.
The XPointer "model" is to select a contiguous, perhaps non-well-formed
range of data. The XSL model is to select a list of well-formed, perhaps
non-contiguous nodes.

I don't think that there is anything wrong from a hypertext-theoretic
point of view with having pointers return non-contiguous nodes. And you
can simulate non-well-formedness:

This is text <EMPH>and this is some</EMPH> more text.
     ^                      ^

In this case we could simulate a link from the first occurrence of "is" to
the second by selecting the character nodes "i","s"," ","t","e","x","t"," ",
and so on.
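
A sketch of that simulation in Python, assuming the standard
xml.etree.ElementTree and re modules. The <P> wrapper element and the
char_nodes() helper are invented for illustration; each "character node" is
modelled as a (character, owning element) pair.

```python
import re
import xml.etree.ElementTree as ET

def char_nodes(elem):
    """Yield (character, owning element tag) pairs in document order."""
    for ch in elem.text or "":
        yield ch, elem.tag
    for child in elem:
        yield from char_nodes(child)
        for ch in child.tail or "":
            yield ch, elem.tag

root = ET.fromstring(
    "<P>This is text <EMPH>and this is some</EMPH> more text.</P>"
)
nodes = list(char_nodes(root))
text = "".join(ch for ch, _ in nodes)

# Offsets of the two standalone words "is" (the "is" inside "This" is skipped).
first, second = [m.start() for m in re.finditer(r"\bis\b", text)]

# A contiguous run of character nodes that crosses the <EMPH> start tag.
selection = nodes[first:second + len("is")]
selected_text = "".join(ch for ch, _ in selection)
owners = [tag for _, tag in selection]
print(selected_text)
```

The selection is a list of well-formed nodes, yet the span it covers starts
in P and ends inside EMPH, which no single well-formed subtree could express.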
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From paul at prescod.net  Wed Mar 31 20:46:38 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:10:53 2004
Subject: Fw: XML query language
References: <017501be7b5f$0835dd90$5402a8c0@oren.capella.co.il>
Message-ID: <3702600B.75477042@prescod.net>

Oren Ben-Kiki wrote:
> 
> >You don't need an element. You just need a nodelist. Look at the DOM's
> >brutally named "getElementsByTagName" method.
> 
> You mean the NodeList contains the matched nodes directly, and not XPointers
> which point to them. 

Right. Pointers to, not copies of, the nodes. And the pointers should be
in the most efficient "syntax" allowed by the system. In a Python program
it is a PyObject reference. In C++ it is a DOMNode *. In a
process-portable XML encoding it is an XPointer. Everybody is focused on
this last case but it is only a special case.
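
For the in-process case, a minimal sketch with Python's standard
xml.dom.minidom, whose getElementsByTagName follows the DOM method of the
same name. The sample document is made up.

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<doc><item>a</item><group><item>b</item></group></doc>"
)

# The list holds references to nodes already in the tree, not copies.
items = doc.getElementsByTagName("item")
same_object = items[0] is doc.documentElement.firstChild

# Changing a node through the list changes the document itself.
items[1].setAttribute("seen", "yes")
serialized = doc.documentElement.toxml()

print(same_object)
print(serialized)
```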

> >If you are asking me what is the syntax for a nodelist then I'll say it
> >has no syntax. It is an abstraction like the record set returned by a
> >database. If you have to move the query result between machines then you
> >can choose an encoding (quite likely XML) but that's outside of the realm
> >of the query language itself -- it is akin to report writing.
> 
> No standard way to represent a query result as text? I find this strange.

I didn't say that there should be no standard way. I said that the
standard way is not something that the query language should specify. If
there are 6 query languages (some standardized and some proprietary) and 6
result encoding syntaxes (some standardized and some proprietary) then you
should be able to use any query language with any encoding syntax.

> Both XML-QL and XQL have ways to construct results (CONSTRUCT and
> <xql:result>). 

There is no such element type described in
http://www.w3.org/TandS/QL/QL98/pp/xql.html

> OK, if what you are saying is:
> 
> - We have two languages:
>   (i) matching of XML elements, which we'll call XQL for the moment, and is
> basically the XSL match pattern language;
>   (ii) constructing XML trees from other XML trees which we'll call XTL for
> the moment and is basically the <xsl:*> tags.
> - XSL is the combination of both (plus FO objects).
> - XQL is usable in other contexts than XTL.
> - There's no other standard XML construction syntax other than XTL.
> 
> Then we agree. 

Yes!

> I'd also add:
> 
> - We should have separate specs for XQL, XTL, and FOs. The XTL spec should
> simply reference the XQL spec. The FO spec should be independent.

Technically a good idea but I think that it is politically impossible to
separate XSL and its matching language at this point. Maybe XSL 2.0 will
depend on whatever XML QL is eventually standardized.

> - XQL should be used wherever a set of XML elements needs to be selected
> from an XML tree.
> - So therefore CSS should allow using XQL in its selectors. For that matter,
> CSS should allow an XML syntax :-)
> - And also XPointers?

I agree with all of this but changes to CSS are unlikely in the
short to medium term.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for only himself
 http://itrc.uwaterloo.ca/~papresco

"Perpetually obsolescing and thus losing all data and programs every 10
years (the current pattern) is no way to run an information economy or
a civilization." - Stewart Brand, founder of the Whole Earth Catalog
http://www.wired.com/news/news/culture/story/10124.html



From rhavaldar at str.com  Wed Mar 31 22:37:40 1999
From: rhavaldar at str.com (Raghunandan Havaldar)
Date: Mon Jun  7 17:10:53 2004
Subject: XML-to-Java, and Java-to-XML
Message-ID: <002b01be7bb6$47a03aa0$612a96d0@raghu.STR_MILW>

Hi,

I am experimenting with mapping an XML document to
Java object model, and vice-versa. Have a couple of
things on mind - using Java Beans, Reflection mechanism
and (mapper, lookup classes). 

I was wondering if somebody out there has worked and
developed some kind of a model to do this transformation.
(am sure someone has given it a try). 

If not, does anybody have ideas on how to go about doing it?

Currently, the XML documents are purely in a flat file
format. If the 'mapper', 'lookup' and related utility classes
can be defined, the XML documents could possibly be
stored in databases instead.

Definitions:

'mapper' - maps an XML model to a Java object graph (using JavaBeans
patterns and Reflection to achieve this). It should also be able to
go the other way, from a Java object model to an XML model.

'lookup' - provides lookup of XML nodes in a DOM-based tree.

I have just started scratching the surface today. Any ideas,
suggestions or comments are welcome.

thanks,
raghu


Raghu Havaldar
Consultant
rhavaldar@str.com




From macherius at darmstadt.gmd.de  Wed Mar 31 22:52:05 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:10:53 2004
Subject: XML query language
In-Reply-To: <37025CF2.D186E248@prescod.net>
Message-ID: <199903312050.WAA14507@sonne.darmstadt.gmd.de>

Paul Prescod <paul@prescod.net> wrote at 31 Mar 99, 11:35:

> Mark Birbeck wrote:
> > 
> > Paul Prescod wrote:
> > > And that model has a concept of nodelist -- this is the most
> > > appropriate return value for query results.
> > 
> > What do you mean by nodelist? Does it take into account that result
> > nodes may be returned from different parts of the tree, or even at
> > different depths? 
> 
> Sure. A node list is a list of nodes. No more, no less.

An XQL query may return numbers, strings, Date objects or even 
user-defined data types, which are not nodes in the DOM sense, but 
objects. If you wrap the results in tags like <xql:date>, ... you 
will run into problems with the user-defined types and lose type 
information.
To me the return values of a query are just a vector of objects in 
document order, of which some happen to be DOM nodes.
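
A small sketch of such a mixed result vector in Python. This is not an XQL
implementation; the sample document and the as_text() helper are invented,
and the values are picked by hand to show what flattening to text does to
the types.

```python
import datetime
import xml.etree.ElementTree as ET

root = ET.fromstring('<log><entry date="1999-03-31">XQL thread</entry></log>')
entry = root.find("entry")

# One result vector holding a node, a string, a number and a date object.
results = [
    entry,
    entry.text,
    len(root),
    datetime.date.fromisoformat(entry.get("date")),
]

def as_text(value):
    """Flatten a result to text, which is what tag-wrapping amounts to."""
    if isinstance(value, ET.Element):
        return ET.tostring(value, encoding="unicode")
    return str(value)

kinds = [type(r).__name__ for r in results]
flattened = [as_text(r) for r in results]
print(kinds)
print(flattened)
```

After flattening, every item is a string, and the receiver has to guess
which ones were numbers or dates.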

	++im
--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From simonstl at simonstl.com  Wed Mar 31 23:18:47 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:10:53 2004
Subject: XML-to-Java, and Java-to-XML
In-Reply-To: <002b01be7bb6$47a03aa0$612a96d0@raghu.STR_MILW>
Message-ID: <199903312118.QAA31121@hesketh.net>

At 02:37 PM 3/31/99 -0600, Raghunandan Havaldar wrote:
>Hi,
>
>I am experimenting with mapping an XML document to
>Java object model, and vice-versa. Have a couple of
>things on mind - using Java Beans, Reflection mechanism
>and (mapper, lookup classes). 
>
>I was wondering if somebody out there has worked and
>developed some kind of a model to do this transformation.
>(am sure someone has given it a try). 

Take a look at MDSAX and Coins on the JXML.com site - www.jxml.com.  
It sounds like it's pretty much exactly what you're looking for.  
You specify the mapping from elements to classes in a ContextML 
document, itself XML, and it builds a processing structure into 
which you can feed your documents to build your classes.  

The best part is that MDSAX/Coins handles all the weird work for you, 
including reflection.  You can even feed documents with different 
vocabularies into the same structure by specifying a different ContextML
document. With MDSAX it's just document->Beans, while with Coins you 
can go back from Beans->document.
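
To give a feel for the general idea, here is a generic sketch in Python. It
is not MDSAX or Coins code and does not use ContextML; setattr() and vars()
stand in for Java reflection, and the Person class, the MAPPING table and
the sample document are all invented for illustration.

```python
import xml.etree.ElementTree as ET

class Person:
    """Plain bean-style class with a no-argument constructor."""
    def __init__(self):
        self.name = None
        self.email = None

# Hypothetical mapping table: element name -> class to instantiate.
MAPPING = {"person": Person}

def to_object(elem):
    """Build an object from an element, one property per child element."""
    obj = MAPPING[elem.tag]()
    for child in elem:
        if not hasattr(obj, child.tag):
            raise AttributeError("no property named %r" % child.tag)
        setattr(obj, child.tag, child.text)
    return obj

def to_element(obj):
    """Reverse direction: emit one child element per instance attribute."""
    tag = next(t for t, cls in MAPPING.items() if isinstance(obj, cls))
    elem = ET.Element(tag)
    for prop, value in vars(obj).items():
        ET.SubElement(elem, prop).text = value
    return elem

source = "<person><name>Ada</name><email>ada@example.org</email></person>"
person = to_object(ET.fromstring(source))
round_trip = ET.tostring(to_element(person), encoding="unicode")
print(person.name, person.email)
print(round_trip)
```

Swapping in a different MAPPING table is the rough analogue of feeding a
different vocabulary into the same structure.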

Simon St.Laurent
XML: A Primer
Sharing Bandwidth / Cookies
http://www.simonstl.com
