From jlam at iunknown.com  Wed Dec  1 00:50:24 1999
From: jlam at iunknown.com (John Lam)
Date: Mon Jun  7 17:18:07 2004
Subject: XML4J EA2 --> Xerces-J 1.0
Message-ID: <1B79E83E7849174A813044A2E56F78040C09@AROD.iunknown.com>

Will IBM continue development of XML4J independently of Xerces-J? Or
will Xerces-J be the "official" version of that source code base?

-John


-----Original Message-----
From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
Mike Pogue
Sent: Tuesday, November 30, 1999 12:38 PM
To: xml-dev@ic.ac.uk
Subject: Re: XML4J EA2 --> Xerces-J 1.0


Eric Ulevik wrote:
>From: Mike Pogue <mpogue@apache.org>
>> The Xerces-J parser (the Apache name for what IBM calls XML4J EA2) is
>> both compliant (including passing one test that we disagree with your
>> interpretation of the spec on), and is freely available.  Both source
>> code and binaries for Xerces-J version 1 are available at
>> http://xml.apache.org, with updates done frequently.

>I haven't seen any updates. Just the original release. Am I mistaken?

The *very* latest source (including new functionality, and some late
breaking performance and memory enhancements) is available via anonymous
cvs (see http://xml.apache.org for details).  

We'll be bundling the latest source code up into a more formal release
(zip file, tarball) shortly (it's being tested right now).

Mike

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on
CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following
message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)




From lindsey at diac.com  Wed Dec  1 00:58:34 1999
From: lindsey at diac.com (William Lindsey)
Date: Mon Jun  7 17:18:07 2004
Subject: XFM (or something similar)
In-Reply-To: <3.0.32.19991130114807.01475710@pop.intergate.ca>
Message-ID: <Pine.LNX.4.10.9911301755390.6618-100000@cobra.diac.com>

Sean McGrath wrote:
> >Do xml-dev'ers think XFM is a good idea?

Tim Bray replied:
> I think having a way for an instance to promise it references no
> external entities is a no-brainer.
[ ... snip ... ]

Should we invent yet another way for the instance to tell us about
itself?  We already have the BOM, the XML declaration, the Document
Type declaration, and the XML-Stylesheet PI.  I guess it hasn't
been decided how an instance is associated with a W3C Schema.

Maybe we should investigate a more general way to specify all
this stuff externally. It seems to fit within the scope of
the problem Tim outlines in "Related-Resource Discovery for XML" [1].
Is there a W3C XML packaging activity?

Best,

Bill

[1] http://www.textuality.com/xml/why-pkg.html




From tbray at textuality.com  Wed Dec  1 01:17:31 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:07 2004
Subject: XFM (or something similar)
Message-ID: <3.0.32.19991130171720.014ac9e0@pop.intergate.ca>

At 05:57 PM 11/30/99 -0700, William Lindsey wrote:
>Maybe we should investigate a more general way to specify all
>this stuff externally. It seems to fit within the scope of
>the problem Tim outlines in "Related-Resource Discovery for XML" [1].
>Is there a W3C XML packaging activity?

Working on it.  I think it's important.  Will know more at the end of 
next week. -Tim



From jtauber at jtauber.com  Wed Dec  1 01:55:06 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:07 2004
Subject: XML processing instruction survey
References: <000b01bf3b87$b13c5c50$0f36a8c0@quokka.com>
Message-ID: <016c01bf3b9f$19a5fa50$eb020a0a@bowstreet.com>

> I'm interested in the extent to which people are actually using the XML
> processing instruction ( <?xml ) in their XML files, and the extent to
> which they find it useful.

You mean the XML Declaration?

It is clearly useful if you use a different character encoding to UTF-8
(hence also US-ASCII) or UTF-16.

James
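
The encoding point can be illustrated with a minimal sketch in Python (chosen only for brevity; nothing in this thread uses Python). It assumes the standard library's expat-based `xml.dom.minidom`, and the sample document is made up. The same ISO-8859-1 bytes are parsed once with an XML Declaration naming the encoding and once without.

```python
from xml.dom.minidom import parseString
from xml.parsers.expat import ExpatError

# The same Latin-1 bytes, with and without an XML Declaration.
# 0xE9 is "e acute" in ISO-8859-1 but is not a valid UTF-8 sequence here.
body = b"<name>Ren\xe9</name>"
declared = b'<?xml version="1.0" encoding="ISO-8859-1"?>' + body


def text_of(data):
    """Parse the bytes and return the text content of the root element."""
    doc = parseString(data)
    return doc.documentElement.firstChild.data


def parses(data):
    """Return True if the parser accepts the bytes as well-formed XML."""
    try:
        parseString(data)
        return True
    except ExpatError:
        return False


# With the declaration, the parser knows how to decode byte 0xE9.
print(text_of(declared) == "Ren\u00e9")
# Without it, the parser assumes UTF-8 and rejects the document.
print(parses(body))
```

Without the declaration the processor has only the byte-order mark and the UTF-8 default to go on, which is why the declaration matters for any other encoding.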




From david at megginson.com  Wed Dec  1 02:41:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:07 2004
Subject: XML processing instruction survey
In-Reply-To: "James Tauber"'s message of "Tue, 30 Nov 1999 20:55:07 -0500"
References: <000b01bf3b87$b13c5c50$0f36a8c0@quokka.com> <016c01bf3b9f$19a5fa50$eb020a0a@bowstreet.com>
Message-ID: <m3hfi3jsfc.fsf@localhost.localdomain>

"James Tauber" <jtauber@jtauber.com> writes:

> > I'm interested in the extent to which people are actually using the XML
> > processing instruction ( <?xml ) in their XML files, and the extent to
> > which they find it useful.
> 
> You mean the XML Declaration?
> 
> It is clearly useful if you use a different character encoding to UTF-8
> (hence also US-ASCII) or UTF-16.

The XML Declaration (not a PI) is also very useful for
forwards-compatibility, so that in the future XML 1.1 (etc.) parsers
will know that they are dealing with XML 1.0 and can either [a] apply
the appropriate rules or [b] die gracefully, depending on the
requirements of future XML specs, if any.

The bad news is that level-three browsers do ugly things when they see 
the XML declaration, but they do ugly things with lots of XML syntax
(since they're not actually XML browsers).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jtauber at jtauber.com  Wed Dec  1 02:42:56 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:08 2004
Subject: joe stephenson
Message-ID: <026501bf3ba5$c7551db0$eb020a0a@bowstreet.com>

I thought Joe Stephenson was supposed to be back today.

James :-)




From tbray at textuality.com  Wed Dec  1 02:55:57 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:08 2004
Subject: XML processing instruction survey
Message-ID: <3.0.32.19991130185652.01516ec0@pop.intergate.ca>

At 03:07 PM 11/30/99 -0800, Jeffrey E. Sussna wrote:
>I'm interested in the extent to which people are actually using the XML
>processing instruction ( <?xml ) in their XML files, and the extent to which
>they find it useful.

It's not really designed for people.  It's mostly designed for use
by the XML processor to help figure out the encoding and make sure that
this is really XML.

I'd think that using it at the application level would be not only
uncommon but probably unwise.  I'd be interested to hear any positive
responses to the query. -T.



From martind at netfolder.com  Wed Dec  1 03:24:58 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:08 2004
Subject: SML - a vote against
In-Reply-To: <3.0.32.19991130100957.01cb0330@nexus.webmethods.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEALEIAA.martind@netfolder.com>

Hi Joe,

Joe said:
Instead, what I'd like to see is the codification of subsets and
recommendations for domains of use of these subsets.  For example, in the
domain of XML for business messaging, if not for all of XML-for-data, I'd
like to see a formal recommendation to avoid both entity declarations and
mixed content.  I'd like to make it easy for someone who knows their domain
of use to identify exactly what they need to learn about XML and to learn
just those pieces.  Applications could advertise conformance with various
recommendations to ease both learning to use the application and
integrating with the application.

Didier replies:
Independently of SML, there is a need for a messaging convention; otherwise
the convention will be defined by a single manufacturer, as Microsoft is
trying to do with BizTalk.

The whole thread I tried to start about meta data and messages is about
this. BizTalk, after all, only adds some meta data to an XML document: for
instance, for what/whom the message is intended, from what/whom it comes,
which process it is part of, etc. All these things are in fact
meta information about the document transported from A to B. However, to
realize this, what is missing now in the XML framework is:

a) the ability to validate a document fragment or the whole document as an
aggregation of fragments.

If we get that, then it will be possible to build a message that would
include the meta data about a document and the document itself in the same
text "package".

Of course, this is a need for e-commerce transactions and probably not
something that could be applied to other kinds of documents.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From jlapp at webmethods.com  Wed Dec  1 05:49:02 1999
From: jlapp at webmethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:08 2004
Subject: XML processing instruction survey
In-Reply-To: <3.0.32.19991130185652.01516ec0@pop.intergate.ca>
Message-ID: <199912010548.VAA24908@hawk.prod.itd.earthlink.net>

We parse both XML and HTML, and you can configure whether to use the presence of the declaration to make the distinction.

I've always dreamt of having an indicator in this declaration that tells me whether the document includes any GE references besides refs to the predefined GEs.  I can get better throughput when I know they aren't there, and right now you have to configure the behavior up front.  I'd really like to autodetect on a per document basis.

... always pushing to squeeze through a few more docs per sec.
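
Absent such an indicator, per-document detection means looking at the text itself. The following is a rough sketch of that kind of pre-scan, in Python for brevity; it is not a description of any product's parser, and the function name `has_other_entity_refs` is made up for illustration. It only recognises ASCII entity names.

```python
import re

# The five general entities that every XML processor predefines.
PREDEFINED = {"lt", "gt", "amp", "apos", "quot"}

# A named reference: "&", a name, ";". Character references such as
# "&#38;" start with "#", so this pattern does not match them.
ENTITY_REF = re.compile(r"&([A-Za-z_:][A-Za-z0-9_.:-]*);")


def has_other_entity_refs(text):
    """Return True if text references a general entity that is not predefined.

    This is a rough textual scan: it does not skip CDATA sections or
    comments, so it can report a reference that a real parser would ignore.
    """
    return any(m.group(1) not in PREDEFINED for m in ENTITY_REF.finditer(text))


print(has_other_entity_refs("<p>a &lt; b &amp;&#38; c</p>"))   # False
print(has_other_entity_refs("<p>&copyright; 1999</p>"))        # True
```

The scan costs a full pass over the document, which is exactly the overhead a flag in the declaration would avoid.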

At 06:56 PM 11/30/1999 -0800, Tim Bray wrote:
>At 03:07 PM 11/30/99 -0800, Jeffrey E. Sussna wrote:
>>I'm interested in the extent to which people are actually using the XML
>>processing instruction ( <?xml ) in their XML files, and the extent to which
>>they find it useful.
>
>It's not really designed for people.  It's mostly designed for use
>by the XML processor to help figure out the encoding and make sure that
>this is really XML.
>
>I'd think that using it at the application level would be not only
>uncommon but probably unwise.  I'd be interested to hear any positive
>responses to the query. -T.
--
Joe Lapp              (Looking for some good people to help design
Senior Engineer        and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From varun at chennai.tcs.co.in  Wed Dec  1 09:37:13 1999
From: varun at chennai.tcs.co.in (V Arun Kumar)
Date: Mon Jun  7 17:18:08 2004
Subject: creating a DOM tree from streams of tagged data?????????
Message-ID: <6525683A.0034B3B9.00@MAILSERVER2.chennai.tcs.co.in>


I'm using Sun's parser.
I am able to form a DOM tree provided I read from an XML file using the parser.

My problem goes like this:
I have an HTML page in the front end.
Upon submitting, a stream of XML data is sent to a servlet.
Now, how can I construct a DOM tree from this string of XML data?
Any help would be appreciated.




From philipnye at freenet.co.uk  Wed Dec  1 11:13:15 1999
From: philipnye at freenet.co.uk (Philip Nye)
Date: Mon Jun  7 17:18:08 2004
Subject: SML - an alternative
References: <013601bf3b5c$d5b1e9e0$e9d9f2cc@omicron.com>
Message-ID: <384502F0.5D7EA2E0@freenet.co.uk>

"Stephen T. Mohr" wrote:
> 
> People who have built parsers claim the alleged complexity of XML isn't a
> problem for them *as XML stands*.  Not having built one of my own, I'll take
> their word for it.

Then why do most XML parsers:
a. choose to exclude an arbitrary set of features e.g. PIs, external
entities etc.?
b. consistently fail conformity tests?

This is the foundation on which a huge superstructure is rapidly being
built willy-nilly.

Philip

-- 
Philip Nye
Engineering Arts
72 Herberton Road  ~  Bournemouth  BH6 5HZ  ~  UK
tel +44 (0)1202 418236  ~  fax +44 (0)1202 418676
mailto:philipnye@freenet.co.uk



From IndrajitC at catsglobal.co.in  Wed Dec  1 12:46:14 1999
From: IndrajitC at catsglobal.co.in (Indrajit Chaudhuri)
Date: Mon Jun  7 17:18:08 2004
Subject: creating a DOM tree from streams of tagged data?????????
References: <6525683A.0034B3B9.00@MAILSERVER2.chennai.tcs.co.in>
Message-ID: <384518C0.9DCA0F5E@catsglobal.co.in>

First create an InputSource object from the byte/character stream and
then pass it to the parser. There are examples in the API docs that
come with the parser.

Thanks,
Indrajit
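
In Java the advice above usually amounts to wrapping the data in a reader, as in `new InputSource(new StringReader(xml))`, and handing that to the parser. The same idea can be sketched with Python's standard library, used here only as a compact illustration and not as the Sun parser's API. The payload is a made-up stand-in for what the servlet receives.

```python
import io
from xml.dom.minidom import parse

# Hypothetical payload, standing in for the XML text a servlet would
# receive in the request body.
posted = "<order><item qty='2'>widget</item></order>"

# Wrap the in-memory data in a stream object; the parser reads from it
# exactly as it would read from a file.
stream = io.BytesIO(posted.encode("utf-8"))
doc = parse(stream)

item = doc.getElementsByTagName("item")[0]
print(item.getAttribute("qty"), item.firstChild.data)
```

The parser never needs a file on disk; any object it can read bytes or characters from will do.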

V Arun Kumar wrote:
> 
> I'm using Sun's parser.
> I am able to form a DOM tree provided I read from an XML file using the parser.
> 
> My problem goes like this:
> I have an HTML page in the front end.
> Upon submitting, a stream of XML data is sent to a servlet.
> Now, how can I construct a DOM tree from this string of XML data?
> Any help would be appreciated.



From tshw at capitalmarketscompany.com  Wed Dec  1 14:23:51 1999
From: tshw at capitalmarketscompany.com (Shaw Tim)
Date: Mon Jun  7 17:18:08 2004
Subject: NOT SML but XML-for-data and business
Message-ID: <FDFFD5C2748BD211BC500008C71E933CBDF912@uklonts01.uklo.capitalmarketscompany.com>

I am an Application tool/framework developer/integrator. I sit (sort of)
half-way between XML Parser-writers and XML tool-vendors.

I developed a pseudo-DOM system, using the DocumentHandler mechanism of any
available SAX parser, which creates 'lightweight' DOM objects specifically
designed for data-processing. The DOM objects implement the required
interface, but not all of the methods 'do' things. Another set of interfaces
provides a 'data-oriented' view on the DOM structures (and it is this which
the lwDOM is optimised for). This allows me to use DOM-based tools (XSLT
etc), but removes the necessity for programmers to learn/use the DOM
directly.
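
The general shape of building lightweight tree objects from SAX events can be sketched as follows. This is only an illustration under assumptions: it uses Python's `xml.sax.ContentHandler` (the SAX2 counterpart of the Java `DocumentHandler` interface mentioned above), and the `Node` and `TreeBuilder` classes and the sample trade document are invented here, not taken from the system described.

```python
import xml.sax


class Node:
    """A deliberately small element node: name, attributes, children, text."""

    def __init__(self, name, attrs):
        self.name = name
        self.attrs = dict(attrs)
        self.children = []
        self.text = ""


class TreeBuilder(xml.sax.ContentHandler):
    """Builds Node objects from SAX events using a stack of open elements."""

    def __init__(self):
        super().__init__()
        self.root = None
        self.stack = []

    def startElement(self, name, attrs):
        node = Node(name, attrs)
        if self.stack:
            self.stack[-1].children.append(node)
        else:
            self.root = node
        self.stack.append(node)

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self.stack:
            self.stack[-1].text += content

    def endElement(self, name):
        self.stack.pop()


def build(data):
    """Parse XML bytes and return the root Node."""
    handler = TreeBuilder()
    xml.sax.parseString(data, handler)
    return handler.root


trade = build(b"<trade id='t1'><price>101.5</price><qty>300</qty></trade>")
# A "data-oriented" view: child element names mapped to their text.
print({child.name: child.text for child in trade.children})
```

Because the tree is built from events, any SAX parser can feed it, and the node class carries only what the data-processing code needs.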

I see XML Schema (among other things) as providing great opportunities in
this domain - data-types/constraints/ranges etc., but I don't have a clear
view of how this will be integrated.

I have 3 immediate questions (not all of which need to be answered at once
:-) :
1) how are the (real) XML developers approaching XML Schema? Will it be
transparent to applications, and just constrain the data as per DTD's or
will the meta-data be available - if so, in what form?
2) given the fragmentation of DTD definitions in any given market (mine
particularly with FpML, FIXML, FinML, BizTalk etc.) is anyone addressing
general tools for mapping between these - and what are the issues that are
being tackled?
3) are there any discussions going on about how to map between different
formats programmatically? I can imagine RDF being used, but then there needs
to be agreement on meta-meta-data(!) - and I can't imagine (see 2 above)
people agreeing on anything much when it's a core business differentiator
(read tie-in/revenue opportunity) to have a proprietary format.
 
Thanks

tim



From martind at netfolder.com  Wed Dec  1 15:13:50 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:08 2004
Subject: NOT SML but XML-for-data and business
In-Reply-To: <FDFFD5C2748BD211BC500008C71E933CBDF912@uklonts01.uklo.capitalmarketscompany.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAMECDEIAA.martind@netfolder.com>

Hi Shaw,

Shaw said:
1) how are the (real) XML developers approaching XML Schema? Will it be
transparent to applications, and just constrain the data as per DTD's or
will the meta-data be available - if so, in what form?

Didier replies:
Actually, we have a problem. There is no recommendation about how to link a
document/fragment with its schema, nor any recommendation about the
validation rules for documents aggregating fragments (each fragment having
its own schema). I am anxious to see what the XML Schema group will present
at XML 99 as an answer to these questions or to these unfulfilled needs.

Shaw said:
2) given the fragmentation of DTD definitions in any given market (mine
particularly with FpML, FIXML, FinML, BizTalk etc.) is anyone addressing
general tools for mapping between these - and what are the issues that are
being tackled?

Didier replies:
We are using XSLT to translate from one to the other. We are currently
working on a meta model that could be mapped to these different languages.
However, this is a lot of work even for a specific domain (the finance
domain). But if we reach our goal, it will be easier to translate from this
meta model (or meta language, which is XML-based) to any other particular
XML-based language with XSLT.


Shaw said:
3) are there any discussions going on about how to map between different
formats programmatically? I can imagine RDF being used, but then there needs
to be agreement on meta-meta-data(!) - and I can't imagine (see 2 above)
people agreeing on anything much when it's a core business differentiator
(read tie-in/revenue opportunity) to have a proprietary format.

Didier replies:
We tried RDF, but RDF is a completely different data model, and we discovered
that creating a meta data model in RDF is simply not practical. We
discovered that it is a lot easier if the meta domain language is simply an
element hierarchy: not necessarily easier for machines, but definitely
easier for humans. And believe me, reducing errors is quite important. The
problem is that errors are introduced by humans. If the system is too
complex, we get errors. So we rediscovered what Ben Shneiderman discovered
several years ago. Thus, from the software psychology point of view, we
found that using RDF for translation is error-prone and that using a
master schema translated into other schemas is less error-prone. However,
this experience is limited to our group, and it would be interesting to see
what results others have obtained. Of course, this is possible only if they
keep track of their process and have in place a learning mechanism to
fine-tune these processes.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From rick at activated.com  Wed Dec  1 15:37:25 1999
From: rick at activated.com (Rick Ross)
Date: Mon Jun  7 17:18:08 2004
Subject: [ANNOUNCE] EZ/X - XML/XSL Processor Preview Now Available
Message-ID: <384540A5.7486C26C@activated.com>

***********************************************************
***********************************************************
EZ/X - XML/XSL Processor Preview Now Available

Download EZ/X now:
  web =======> http://www.activated.com/download/ezx.zip
               http://www.activated.com/download/ezx.tar.gz
  ftp =======> ftp://ftp.activated.com/ezx.zip
               ftp://ftp.activated.com/ezx.tar.gz
  feedback ==> mailto:ezx-feedback@activated.com
***********************************************************
***********************************************************

Activated Intelligence (http://activated.com) invites you to preview our
EZ/X suite of core XML tools for Java. EZ/X combines world-class XML
parsing and XSL processing in a compact, pure Java package.

We're seeking a major industry partner who can leverage EZ/X as part of
its XML leadership strategy (and hopefully make it FREE to you!) We
welcome your insights about how Activated should move forward with this
product.

EZ/X has been in production use for over a year at the JavaLobby
(http://javalobby.org) - which was probably the world's first 100%
dynamically generated XML/XSL site. EZ/X has delivered consistently
there under grueling circumstances and extremely heavy loads.

We've worked hard to make EZ/X fast, reliable, and conformant to
prevailing standards. Preliminary testing by Activated and third-parties
suggests that EZ/X should give a great performance boost to your
mission-critical XML projects. XSL processing with EZ/X is usually 2-3
times faster than Lotus/IBM/Apache or Oracle, and even faster than that
when dealing with complex XML/XSL.

We hope EZ/X works as well for you as it has for us, and we hope it
helps propel your success with XML. We look forward to your
comments, ideas and suggestions, and we urge you to send them to
ezx-feedback@activated.com (mailto:ezx-feedback@activated.com).
If you like what you see, then please let us know - your support
makes all the difference.

Best regards,
The Activated EZ/X Team

------------------------
Activated Intelligence
http://www.activated.com
(919) 678-0300



From Joel_Sherriff at compuware.com  Wed Dec  1 15:33:39 1999
From: Joel_Sherriff at compuware.com (Sherriff, Joel)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
Message-ID: <A58643BEDEF7D211BABB0008C75D853F48A5CB@fhpri01.compuware.com>

I've posted a variation of this to comp.text.xml also, so if it looks
familiar, that's where you saw it...

I'm writing an XML document analysis tool and would appreciate any links to
XML documents that contain binary entities, complicated stylesheets (CSS or
XSL) with external dependencies, or any other use of external dependencies.
I've poked around www.xmltree.com and, though there are quite a few XML
links, I've yet to find any that are more than text. The purpose of the
analysis tool is to list any links within the XML to outside documents and
any URLs that must be read to fully process the XML (e.g., external DTD,
stylesheet, GIF images, etc.).
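
A text-level scan of this kind might look like the sketch below. It is only an illustration in Python, not the lex-based tool described in this message: the three patterns, the `external_refs` function, and the sample document are all made up, and the patterns are far from exhaustive (for example, they ignore `PUBLIC` identifiers and entity declarations).

```python
import re

# Patterns for three common kinds of external reference. These are
# illustrative, not exhaustive, and work on raw text as a lexer would.
PATTERNS = {
    "dtd": re.compile(r'<!DOCTYPE\s+[^>\[]*?\bSYSTEM\s+["\']([^"\']+)["\']'),
    "stylesheet": re.compile(r'<\?xml-stylesheet\b[^?]*?\bhref=["\']([^"\']+)["\']'),
    "link": re.compile(r'<[A-Za-z][^>]*?\b(?:href|src)=["\']([^"\']+)["\']'),
}


def external_refs(text):
    """Return (kind, url) pairs for references found in the document text."""
    found = []
    for kind, pattern in PATTERNS.items():
        found.extend((kind, m.group(1)) for m in pattern.finditer(text))
    return found


sample = """<?xml version="1.0"?>
<?xml-stylesheet type="text/css" href="style.css"?>
<!DOCTYPE page SYSTEM "page.dtd">
<page><img src="logo.gif"/></page>"""

for kind, url in external_refs(sample):
    print(kind, url)
```

A scan like this finds only references that are spelled out literally in the text; anything assembled from parts at processing time is invisible to it.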

Can somebody explain, in a nutshell, the purpose of RDF?  After reading the
spec, I don't feel any more educated than before.

Because I need the analysis tool to be lightweight and fast (it'll be a
component in a load-testing tool), and need to support as many different
standards as possible, I've implemented it using lex, as opposed to one of
the available parsers.  However, after looking at a few RDF examples it
appears that RDF can be used to "construct" URLs, which looks to be a
trouble spot for me.

Any of the experts on the list think of any other potential trouble spots?



From beavis at proteometrics.com  Wed Dec  1 16:11:02 1999
From: beavis at proteometrics.com (Ronald Beavis)
Date: Mon Jun  7 17:18:09 2004
Subject: No subject
Message-ID: <004901bf3c17$93d64230$8c3770cc@pmc2>

unsubscribe

From tbray at textuality.com  Wed Dec  1 16:18:28 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
Message-ID: <3.0.32.19991201080509.0132f800@pop.intergate.ca>

At 10:34 AM 12/1/99 -0500, Sherriff, Joel wrote:
>Can somebody explain, in a nutshell, the purpose of RDF.  After reading the
>spec, I don't feel any more educated than before.

My attempt to explain RDF is at http://www.xml.com/xml/pub/98/06/rdf.html
 -Tim



From RDaniel at DATAFUSION.net  Wed Dec  1 20:23:04 1999
From: RDaniel at DATAFUSION.net (Ron Daniel)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
Message-ID: <0D611E39F997D0119F9100A0C931315C52FB4A@datafusionnt1>

Tim Bray said:
>At 10:34 AM 12/1/99 -0500, Sherriff, Joel wrote:
>>Can somebody explain, in a nutshell, the purpose of RDF.  After reading the
>>spec, I don't feel any more educated than before.
>
>My attempt to explain RDF is at http://www.xml.com/xml/pub/98/06/rdf.html
> -Tim

You may also want to take a look at Tim Berners-Lee's
document on "Describing and Exchanging Data":

http://www.w3.org/1999/04/WebData

but here is my attempt to explain RDF's purpose in a 
nutshell:

   The purpose of RDF is to provide metadata (data about
   other data) in a manner that is very easy to process
   by machines.

and here is my attempt to give an example of why RDF
is useful:


Assume you have a bunch of XML documents from a variety of
sources, many of which contain an <author> element, and your
job is to build a simple card-catalog style database so that
you can search by author and get the documents written by
that person. Also assume you have an XML-aware version of a
tool like grep that lets you search the documents for the
<author> element. Like grep, this tool prints the filename
where a match was found. Unlike grep, it prints the content of
the matched element rather than a line. (This seems like a
reasonable minimum functionality for an XML-aware grep-like
tool.)

That tool should make the job easy. Search for <author> elements,
pipe the output to 'cut', and you can make a text file
ready for import into your database. But there is one little
hitch - you can't assume that the person identified in the <author>
element of file X is the author of file X. Maybe file X is
saying that they are really the author of document Y. Without
knowledge of the convention followed in each file you can't tell.
And since the files came from a lot of sources, you are talking
about a lot of work to see what conventions are being followed.
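
To make the hitch concrete, here is a minimal Python sketch of such an
XML-aware grep, built on the standard library's xml.etree.ElementTree.
The two sample documents, the file names, and the function name xml_grep
are all made up for illustration; they are assumptions, not part of any
real tool.

```python
import xml.etree.ElementTree as ET

# Two hypothetical documents. Both contain <author>, but only x.xml uses it
# to name the author of the file itself; y.xml is a review of another work.
DOCS = {
    "x.xml": "<article><author>A. Smith</author><title>On Widgets</title></article>",
    "y.xml": "<review><of><title>On Widgets</title><author>A. Smith</author></of>"
             "<reviewer>B. Jones</reviewer></review>",
}

def xml_grep(docs, tag):
    """Yield (filename, element text) for every element named `tag`."""
    for name, text in sorted(docs.items()):
        root = ET.fromstring(text)
        for elem in root.iter(tag):
            yield name, (elem.text or "").strip()

matches = list(xml_grep(DOCS, "author"))
for filename, author in matches:
    print(f"{filename}:{author}")
```

Both files produce a match for the same name, and nothing in the output
says that y.xml was in fact written by somebody else. That knowledge lives
only in each source's conventions.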

RDF does not leave this important information implicit. Each
RDF statement has exactly three parts:
   Subject - the thing being talked about (the documents in
     the example above).
   Predicate - the type of statement being made about the subject
     (author in this example)
   Object - the value portion of the statement (the author name
     in this example).

If the data were expressed in RDF, an RDF-aware grep-like
tool would let you select all the RDF properties labeled
"author", get the URI of each resource and the name of its
author, and plop that info into the database. There would
be no ambiguity about the thing which was authored.

This regularity in the form of expression is key to making
the metadata easy to process by machines.
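
As a rough illustration of the three-part statement, here is a small
Python sketch. The Statement type, the select function, and the
example.org URIs are hypothetical names chosen for this sketch; it models
the subject/predicate/object idea only and does not parse real RDF.

```python
from typing import NamedTuple

class Statement(NamedTuple):
    subject: str    # URI of the thing being talked about
    predicate: str  # the type of statement, e.g. "author"
    obj: str        # the value portion of the statement

# Made-up statements for illustration.
STATEMENTS = [
    Statement("http://example.org/docs/x.xml", "author", "A. Smith"),
    Statement("http://example.org/docs/x.xml", "title", "On Widgets"),
    Statement("http://example.org/docs/y.xml", "author", "B. Jones"),
]

def select(statements, predicate):
    """Return (subject, object) pairs for every statement with this predicate."""
    return [(s.subject, s.obj) for s in statements if s.predicate == predicate]

# Each row says explicitly which resource was authored by whom.
catalog = select(STATEMENTS, "author")
for uri, name in catalog:
    print(f"{name}\t{uri}")
```

Because the subject is always stated, the rows can go into the
card-catalog database without guessing what each author value refers to.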

Regards,
Ron
 



From tbray at textuality.com  Wed Dec  1 20:39:26 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
Message-ID: <3.0.32.19991201124035.0153b920@pop.intergate.ca>

At 12:22 PM 12/1/99 -0800, Ron Daniel wrote:
>and here is my attempt to give an example of why RDF
>is useful:

Very good, Ron.  Well said.

And here is my attempt to explain why RDF hasn't been more successful:

  The syntax is hideously ugly and hard to understand, and the spec worries
  so hard about being correct and complete that it is pretty well 100%
  incomprehensible to ordinary people.

I probably just hurt some feelings, but I've already shouted this in private 
enough times that it won't be a surprise.

In my opinion RDF needs some serious sugar-coating and tutorializing
if it is ever going to achieve its potential.

I think its potential is huge, dwarfing that of XML.   -Tim




From LWatanab at JetForm.com  Wed Dec  1 21:15:34 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
Message-ID: <111CF63B7D2ED211830000805F65A2FF01804961@OTTMAIL2>


Ron Daniel writes:
>RDF does not leave this important information implicit. Each
>RDF statement has exactly three parts:
>   Subject - the thing being talked about (the documents in
>     the example above).
>   Predicate - the type of statement being made about the subject
>     (author in this example)
>  Object - the value portion of the statement (the author name
>     in this example).

>If the data was expressed in RDF, an RDF-aware grep-like
>tool would let you select all the RDF properties labeled
>"author", get the URIs of the resource and the name of the
>author, and plop that info into the database. There would
>be no ambiguity about the thing which was authored.

This works fine for inherently binary relations, but for n-ary relations you
end up reifying them by introducing a dummy node. Matching against that
dummy node will yield no matches, or only incorrect ones, since the names of
the nodes are supposed to be new constants (or existentially quantified
variables). 

To make that dummy node meaningful, you would have to match a wildcard
against it and other relations. But then you're back to your original
situation of not knowing what the relation means unless you have further
knowledge of the semantics of the relations.
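
One possible reading of this point, sketched in Python. The "sale"
relation, the node label _:n1, and the predicate names are invented for
the example; the match function is a hypothetical helper, not an RDF API.

```python
# A ternary relation (seller, buyer, item) reified through a dummy node.
# The label "_:n1" is local to this data set and carries no meaning.
TRIPLES = [
    ("_:n1", "type", "Sale"),
    ("_:n1", "seller", "Smith"),
    ("_:n1", "buyer", "Jones"),
    ("_:n1", "item", "car"),
]

def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# Matching on a fixed node name finds nothing, because another source
# would have picked a different label for its dummy node.
by_name = match(TRIPLES, s="_:sale42")

def sales(triples):
    """Join the triples hanging off each dummy node of type Sale."""
    result = []
    for node, _, _ in match(triples, p="type", o="Sale"):
        roles = {p: o for _, p, o in match(triples, s=node)}
        result.append((roles.get("seller"), roles.get("buyer"), roles.get("item")))
    return result

joined = sales(TRIPLES)
```

The wildcard join works here only because sales() already knows that
"seller", "buyer", and "item" belong together, which is exactly the extra
knowledge of the semantics mentioned above.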

-Larry Watanabe
Jetform Corporation
lwatanab@jetform.com




From david at megginson.com  Wed Dec  1 21:44:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
In-Reply-To: Tim Bray's message of "Wed, 01 Dec 1999 12:40:37 -0800"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID: <m3bt9hp6g4.fsf@localhost.localdomain>

Tim Bray <tbray@textuality.com> writes:

> And here is my attempt to explain why RDF hasn't been more successful:
> 
>   The syntax is hideously ugly and hard to understand, and the spec worries
>   so hard about being correct and complete that it is pretty well 100%
>   incomprehensible to ordinary people.
> 
> I probably just hurt some feelings, but I've already shouted this in private 
> enough times that it won't be a surprise.

I think Tim has shouted it in public as well.

It's a shame, because RDF is very nice for exchanging object-oriented
information among loosely-coupled systems, and there's some good Perl
and Java support for it already available (I'm sure the Python people
will get in there quickly).  

The problem is that the RDF-Syntax spec confounds even its bravest
readers by trying to do two things at once:

a) define a model and syntax for exchanging object-oriented
   information in XML; and

b) apply the model and syntax to the problem domain of representing
   knowledge about Web pages.

Neither of those two things is brain-dead simple, but either alone
could have been presented clearly and straightforwardly to an
intelligent reader who knew the domain.  Let this be a warning to us
all to write our specs in clean, simple layers.

> In my opinion RDF needs some serious sugar-coating and tutorializing
> if it is ever going to achieve its potential.

And lots of software.

> I think its potential is huge, dwarfing that of XML.   -Tim

Agreed.  XML is just syntax, and as Tim (I think) has said, syntax is
boring: XML simply represents a low-level syntactic layer that we all
had to agree on and get out of the way so that we could move on to the
tasty stuff.  XML was never supposed to be the point of the whole
exercise, any more than IP or TCP was supposed to be the point of the
Internet or the Web.

RDF is much closer to that tasty stuff.  The ability to exchange
object-oriented information seamlessly among heterogeneous systems is
very, very exciting -- it's something that CORBA promised and failed
to deliver outside the enterprise, and now RDF (and XML) can take a
shot at it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From wunder at infoseek.com  Wed Dec  1 22:04:32 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com>

At 12:40 PM 12/1/99 -0800, Tim Bray wrote:
>
>And here is my attempt to explain why RDF hasn't been more successful:
>
>  The syntax is hideously ugly and hard to understand, and the spec worries
>  so hard about being correct and complete that it is pretty well 100%
>  incomprehensible to ordinary people.

Agree. I've written a product that used MCF (RDF's predecessor)
and written schemas for OODBs, and I can't make much sense of
the RDF spec. Maybe it is semi-obvious to anyone with a background
in knowledge representation, but it needs to be explained differently
for the other 99.99% of us.

>I think its potential is huge, dwarfing that of XML.   -Tim

I disagree on this one. It's rare that metacontent is more
valuable than content, long-term. I'll bet on the books over
the card catalog, every time.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946



From matthew at praxis.cz  Wed Dec  1 22:20:07 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain>
Message-ID: <38459FA4.DAA00E35@praxis.cz>

David Megginson wrote:
> It's a shame, because RDF is very nice for exchanging object-oriented
> information among loosely-coupled systems, and there's some good Perl
> and Java support for it already available (I'm sure the Python people
> will get in there quickly).

I can see the enormous interest in having a text-based format for
exchanging object-oriented data. But can't this be done with a good
object-oriented XML schema language, of which the current W3C draft
seems to be a very good start? I read Tim's paper on xml.com, and he
argues against the use of XML (without an additional syntax) for
representing metadata with two arguments, both based on scalability.
But if metadata are attached in a separate document, as RDF metadata
would presumably be, they could be expressed just as concisely (and
probably more so) by using an XML instance based on a simple (but
extensible) XML schema. The latter would be preferable because an
extensible schema language is an important tool in its own right.

Matthew



From jes at kuantech.com  Wed Dec  1 22:37:34 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
Message-ID: <000301bf3c4c$85bde920$0f36a8c0@quokka.com>

As an RDF user, I agree with all comments about the complexity of its syntax
and the specification itself. I spent about a month reading and rereading
the RDF spec before I concluded it really was as conceptually simple as it
had appeared on first reading.

On the subject of its potential, I partly agree and partly disagree. Yes,
RDF does move things up the semantic food chain. Yes, XML is kind of like a
good orthogonal machine instruction set, which needs 3G, 4G, and 5G
languages on top of it. But I still see RDF as being useful for metadata,
not for every kind of object-oriented conversation you'd want to have. I
wouldn't consider RDF at the same level as CORBA, but perhaps part of an
overall solution.

Jeff

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Tim Bray
> Sent: Wednesday, December 01, 1999 12:41 PM
> To: 'xml-dev@ic.ac.uk'
> Subject: RE: Some questions
>
>
> At 12:22 PM 12/1/99 -0800, Ron Daniel wrote:
> >and here is my attempt to give an example of why RDF
> >is useful:
>
> Very good, Ron.  Well said.
>
> And here is my attempt to explain why RDF hasn't been more successful:
>
>   The syntax is hideously ugly and hard to understand, and
> the spec worries
>   so hard about being correct and complete that it is pretty well 100%
>   incomprehensible to ordinary people.
>
> I probably just hurt some feelings, but I've already shouted
> this in private
> enough times that it won't be a surprise.
>
> In my opinion RDF needs some serious sugar-coating and tutorializing
> if it is ever going to achieve its potential.
>
> I think its potential is huge, dwarfing that of XML.   -Tim
>
>




From jes at kuantech.com  Wed Dec  1 22:39:55 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:18:09 2004
Subject: Some Questions
Message-ID: <000401bf3c4c$e033dea0$0f36a8c0@quokka.com>

I think the important thing to remember about RDF is that it is not XML. It
is fundamentally an abstract model for expressing metadata. It happens to be
representable using XML, but it is different from XML. Unfortunately, this
distinction is part of what makes the spec hard to read, but it's important.

Jeff Sussna




From david at megginson.com  Wed Dec  1 22:40:39 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:09 2004
Subject: Some questions
In-Reply-To: Walter Underwood's message of "Wed, 01 Dec 1999 14:02:47 -0800"
References: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com>
Message-ID: <m3904lp3v7.fsf@localhost.localdomain>

Walter Underwood <wunder@infoseek.com> writes:

> >I think its potential is huge, dwarfing that of XML.   -Tim
> 
> I disagree on this one. It's rare that metacontent is more
> valuable than content, long-term. I'll bet on the books over
> the card catalog, every time.

That's just the problem with the spec -- if you forget the word/prefix
"meta" completely, RDF is just an XML format for object exchange; it
just happens that one possible application of those objects is
metadata, and the RDF-Syntax spec mixes the two together.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Wed Dec  1 22:40:44 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: <m3bt9hp6g4.fsf@localhost.localdomain>
Message-ID: <NBBBJPGDLPIHJGEHAKBACEDEEIAA.martind@netfolder.com>

Hi David,

David said:
> The problem is that the RDF-Syntax spec confounds even its bravest
> readers by trying to do two things at once:
>
> a) define a model and syntax for exchanging object-oriented
>    information in XML; and
>
> b) apply the model and syntax to the problem domain of representing
>    knowledge about Web pages.


Didier replies:
I guess that what causes the confusion right at the beginning is the
triad stuff. Instead, it would have been more useful to present the concept
or the atomic unit as a record, or an object without the methods. But,
contrary to RDB records, there is an inheritance relationship between the RDF
entities.

So, instead of a model based on the triad "object property value" as an
atom, it would have been a lot easier to say "object as a collection of
properties/values". A schema is like a template; an object is just this
template with slots filled (values added to properties). A template can
inherit from another template.

But, from the RDF document point of view, what we always see is the objects
and their associated collection of properties/values.
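
The "object as a collection of properties/values" view can be sketched in
a few lines of Python. Everything here (the sample triples, the template
table, the function names) is hypothetical and only illustrates the idea;
it is not how any RDF schema processor is defined to work.

```python
from collections import defaultdict

# Made-up triples in "object property value" form.
TRIPLES = [
    ("doc1", "location", "http://www.netfolder.com"),
    ("doc1", "author", "A. Smith"),
    ("doc2", "author", "B. Jones"),
]

def group_by_object(triples):
    """Fold triples into one record per object: {object: {property: [values]}}."""
    objects = defaultdict(dict)
    for obj, prop, value in triples:
        objects[obj].setdefault(prop, []).append(value)
    return dict(objects)

# A schema as a template of slots; a template may inherit from another.
TEMPLATES = {
    "Resource": {"parent": None, "slots": ["location"]},
    "Document": {"parent": "Resource", "slots": ["author", "title"]},
}

def slots_for(name, templates):
    """Collect slot names from a template and all of its ancestors."""
    slots = []
    while name is not None:
        slots = templates[name]["slots"] + slots
        name = templates[name]["parent"]
    return slots

records = group_by_object(TRIPLES)
document_slots = slots_for("Document", TEMPLATES)
```

The same data is there either way; grouping by object just presents it as
filled-in records instead of isolated triples.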

It's funny: one of the ancestors of RDF is MCF (not from Netscape but from
Apple research/Talva ref - http://www.netfolder.com/SDK/MCF.htm and
http://www.netfolder.com/SDK/MCF11.htm). This ancestor language was designed
as a simple set of units, each unit having a collection of
property/value pairs. It seems that instead of being simplified it just became
more obscure. It's sad; it is so easy to use when well explained and
understood.

Obviously the choice of words like "about" and "description" leads one to think of
data about something instead of the data being _the_ something. This is why
I use a structure like this:

<rdf:Description ID="MyID">
<location>http://www.netfolder.com</location>
... etc...
</rdf:Description>

What are the gains?
a) The object is location independent.
b) Its location is just another property (and in fact it is a property).
c) It is then simply an object without any reference; what gives it
references is its properties.
d) I can relate the object to others with properties.
e) It's easier to remember and understand.
f) This is the object, not an object about another object.

I discovered that in some cases I want to express certain relationships
between objects, for instance a hierarchy. Let's say that I want to
transfer the content of a directory service from one place to another; then
in that case:

<rdf:Description about="context1/context2/context3">
<location>http://www.netfolder.com</location>
... etc...
</rdf:Description>

That way, all objects are transported as a small independent hierarchy. The
hierarchical relationship is expressed with the string in the about
attribute. And because we are used to expressing relative position in a
hierarchy with "/", I use it.

I haven't used it for other kinds of data structures.

I do not know what went wrong. Probably OCCAM was on vacation :-))
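
A receiving system could recover the hierarchy along these lines. This is
only a sketch in Python using xml.etree.ElementTree: the wrapping rdf:RDF
element and namespace declaration are added so that the sample parses, and
the function name load_entries is an assumption made for the example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Sample input modeled on the structure above, wrapped so it is well-formed.
SAMPLE = f"""
<rdf:RDF xmlns:rdf="{RDF_NS}">
  <rdf:Description about="context1/context2/context3">
    <location>http://www.netfolder.com</location>
  </rdf:Description>
</rdf:RDF>
"""

def load_entries(xml_text):
    """Return (path components, properties) for each description element."""
    root = ET.fromstring(xml_text)
    entries = []
    for desc in root.findall(f"{{{RDF_NS}}}Description"):
        # The "/"-separated about string encodes the position in the hierarchy.
        path = desc.get("about", "").split("/")
        props = {child.tag: (child.text or "").strip() for child in desc}
        entries.append((path, props))
    return entries

entries = load_entries(SAMPLE)
for path, props in entries:
    print(" > ".join(path), props)
```

Splitting on "/" is a convention of this particular usage, not something
RDF itself prescribes for the about attribute.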

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web Boston (http://www.mfweb.com)
Markup 99 (http://www.gca.com)
Book to come soon: XML Pro published by Wrox Press
Products http://www.netfolder.com




From david at megginson.com  Wed Dec  1 22:45:57 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: Matthew Gertner's message of "Wed, 01 Dec 1999 23:22:28 +0100"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz>
Message-ID: <m366zpp3m8.fsf@localhost.localdomain>

Matthew Gertner <matthew@praxis.cz> writes:

> I can see the enormous interest in having a text-based format for
> exchanging object-oriented data. But can't this be done with a good
> object-oriented XML schema language, of which the current W3C draft
> seems to be a very good start?

Perhaps I'm a little confused, but I cannot see how the fact that a
schema language itself happens to be object oriented allows you to do
object exchange in XML (it doesn't hurt, but how can it help?).

I've been confused before, so there's no need for panic.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From elm at east.sun.com  Wed Dec  1 22:58:47 1999
From: elm at east.sun.com (Eve L. Maler)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: <m3904lp3v7.fsf@localhost.localdomain>
References: <Walter Underwood's message of "Wed, 01 Dec 1999 14:02:47 -0800">
 <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com>
Message-ID: <4.2.0.58.19991201175459.00baea50@abnaki>

At 05:58 AM 10/30/99 -0400, David Megginson wrote:
>That's just the problem with the spec -- if you forget the word/prefix
>"meta" completely, RDF is just an XML format for object exchange; it
>just happens that one possible application of those objects is
>metadata, and the RDF-Syntax spec mixes the two together.

When people talk about RDF, the "meta" part is what I have trouble with in 
general.  In what way is markup not metadata?  In what way are element 
content and attribute values not also metadata (depending on what you do 
with them)?  It feels weird for one particular data model to claim to have 
cornered the metadata market.

         Eve



From jlapp at webMethods.com  Wed Dec  1 23:21:52 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201182252.0283eec0@nexus.webmethods.com>

Hi Eve!  Actually, I've brought a similar issue up with RDF group members
on a few occasions.  I've asked for help understanding why I'd choose the
RDF syntax instead of inventing my own XML document type to represent the
desired metadata.

Maybe someone on this list can help me with that.

At 05:59 PM 12/1/99 -0500, Eve L. Maler wrote:
>When people talk about RDF, the "meta" part is what I have trouble with in 
>general.  In what way is markup not metadata?  In what way are element 
>content and attribute values not also metadata (depending on what you do 
>with them)?  It feels weird for one particular data model to claim to have 
>cornered the metadata market.
>
>         Eve

--
Joe Lapp                     (Looking for some good people to
Senior Engineer               help create XML technologies that
http://www.webMethods.com     connect businesses to businesses
jlapp@webMethods.com          over the web.)



From tbray at textuality.com  Wed Dec  1 23:45:32 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201154526.014c3710@pop.intergate.ca>

At 02:02 PM 12/1/99 -0800, Walter Underwood wrote:
>I disagree on this one. It's rare that metacontent is more
>valuable than content, long-term. I'll bet on the books over
>the card catalog, every time.

Wow, that's a profoundly deep and strong statement, and I think at the
core of the argument that *should* be happening about how to make the
Web a better place.  In fairness, it should be said that Walter works for
a company whose search engine does the equivalent of reading all the
pages of all the books on all the shelves, and trying to guess what the
books mean.  I used to be in that business myself.

But I think metadata wins.  If you count hits on Internet search engines,
the Yahoo and ODP directories, which are both human-constructed metadata, 
absolutely wipe out any fulltext search engine you can name, even though
they have orders of magnitude fewer sites and a lower volume of information
about each.  Because human-constructed metadata wins on the net just like
it did in the library.  

RDF is important because it can facilitate the interchange of, and 
a certain number of the common uses of, this kind of metadata.

Anyhow, just because you have a card catalogue doesn't mean you throw
the library books away. -Tim



From tbray at textuality.com  Wed Dec  1 23:45:29 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201154632.0152f350@pop.intergate.ca>

At 06:22 PM 12/1/99 -0500, Joe Lapp wrote:
>Hi Eve!  Actually, I've brought a similar issue up with RDF group members
>on a few occasions.  I've asked for help understanding why I'd choose the
>RDF syntax instead of inventing my own XML document type to represent the
>desired metadata.
>
>Maybe someone on this list can help me with that.

Because the same data structures and usage patterns keep coming back across
wide ranges of metadata applications, even though the world isn't about
to agree on common vocabularies.  So there are huge gains to be had from
a common data model and transfer syntax. -Tim



From jlapp at webMethods.com  Wed Dec  1 23:52:16 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
Message-ID: <3.0.32.19991201185319.0191d680@nexus.webmethods.com>

At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
>Because the same data structures and usage patterns keep coming back across
>wide ranges of metadata applications, even though the world isn't about
>to agree on common vocabularies.  So there are huge gains to be had from
>a common data model and transfer syntax. -Tim

That's a very strong motivation.  But we have to balance that with another
very strong motivation: making the documents easy to understand by the
people who need to work with them.  By designing your own doctype you can
tailor the structure and the language to suit the target audience.

RDF may be simple at heart, but is it reasonable to ask the average user to
figure it out, to expect that the average user of metadata will even be
able to grok the abstractions?  I may be reiterating your earlier
sentiment, but I worry that the abstractions are as much an impediment as
the spec and the syntax.

--
Joe Lapp                     (Looking for some good people to
Senior Engineer               help create XML technologies that
http://www.webMethods.com     connect businesses to businesses
jlapp@webMethods.com          over the web.)



From elm at east.sun.com  Thu Dec  2 00:00:26 1999
From: elm at east.sun.com (Eve L. Maler)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201154632.0152f350@pop.intergate.ca>
Message-ID: <4.2.0.58.19991201185138.009f5a90@abnaki>

At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
>Because the same data structures and usage patterns keep coming back across
>wide ranges of metadata applications, even though the world isn't about
>to agree on common vocabularies.  So there are huge gains to be had from
>a common data model and transfer syntax. -Tim

Not that I don't respect RDF's power, but personally, I think the key *is* 
common vocabularies.  We may have to start small, and they may just be hub 
formats that get mapped to/from a lot, but agreeing on semantics is the 
pill that has to be swallowed.  Even RDF depends on this, particularly on 
an open system such as the Web where you can't really control or influence 
the habits of content creators.  If you want to indicate that you are the 
author of a certain page, at the very least you have to refer to a widely 
understood "author" semantic in order for author-criterion searching to be 
of any use to your audience.  Whether it's an RDF property or a well-known 
namespace or whatever doesn't seem to matter as much.
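
The hub-format idea can be illustrated with a short Python sketch. The
"hub:" terms, the source property names, and the example.org URIs are all
invented here; the point is only that author searching works once each
source's term is mapped to one widely understood semantic.

```python
# Hypothetical mapping from source-specific property names to hub terms.
TO_HUB = {
    "author": "hub:author",
    "creator": "hub:author",
    "byline": "hub:author",
    "title": "hub:title",
}

def normalize(statements, mapping):
    """Rewrite predicates to hub terms; drop statements with unmapped predicates."""
    return [(s, mapping[p], o) for s, p, o in statements if p in mapping]

def search_by_author(statements, name):
    """Return the sorted subjects whose hub:author value equals `name`."""
    return sorted(s for s, p, o in statements if p == "hub:author" and o == name)

# Made-up statements from three sources with three habits.
RAW = [
    ("http://example.org/a", "creator", "A. Smith"),
    ("http://example.org/b", "byline", "A. Smith"),
    ("http://example.org/c", "author", "B. Jones"),
    ("http://example.org/d", "writer", "A. Smith"),  # no agreed mapping
]

hits = search_by_author(normalize(RAW, TO_HUB), "A. Smith")
```

The page described with "writer" is never found, whatever syntax carried
the statement, because nobody agreed on what that term means.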

         Eve
--
Eve Maler            Sun Microsystems
elm @ east.sun.com    +1 781 442 3190



From jes at kuantech.com  Thu Dec  2 00:14:46 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201185319.0191d680@nexus.webmethods.com>
Message-ID: <000501bf3c5a$26970d60$0f36a8c0@quokka.com>

It is not reasonable to ask the user to figure it out. RDF, along with much
of XML, is not really suited (or at least in RDF's case, intended) for
direct human access. Remember that the goal of RDF is to make it easy for
MACHINES to process metadata. Users should be able to use tools that hide
the details of RDF.

Jeff

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Joe Lapp
> Sent: Wednesday, December 01, 1999 3:53 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: Some questions
>
>
> At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
> >Because the same data structures and usage patterns keep coming back across
> >wide ranges of metadata applications, even though the world isn't about
> >to agree on common vocabularies.  So there are huge gains to be had from
> >a common data model and transfer syntax. -Tim
>
> That's a very strong motivation.  But we have to balance that with another
> very strong motivation: making the documents easy to understand by the
> people who need to work with them.  By designing your own doctype you can
> tailor the structure and the language to suit the target audience.
>
> RDF may be simple at heart, but is it reasonable to ask the average user to
> figure it out, to expect that the average user of metadata will even be
> able to grok the abstractions?  I may be reiterating your earlier
> sentiment, but I worry that the abstractions are as much an impediment as
> the spec and the syntax.
>
> --
> Joe Lapp                     (Looking for some good people to
> Senior Engineer               help create XML technologies that
> http://www.webMethods.com     connect businesses to businesses
> jlapp@webMethods.com          over the web.)
>


From jes at kuantech.com  Thu Dec  2 00:18:39 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: <4.2.0.58.19991201175459.00baea50@abnaki>
Message-ID: <000601bf3c5a$ab80c750$0f36a8c0@quokka.com>

It's funny: when you talk about RDF, it seems very complex, but when you use
it, it seems very simple. I am using RDF as an interchange format for
metadata about assets in a distributed publishing environment. I have, for
example, a photo of a car racer. I need to know who the racer in the picture
is, who took the picture, what format it was taken in (e.g., JPEG), what
date it was taken on, and so forth. RDF works just gorgeously for this. As
to whether it's suitable for all conceivable object interchange, I don't
know and I'm not sure I care since I only try to use RDF for its intended
purpose (metadata). I actually think the "meta" part is what makes the spec
comprehensible, because I can always return to a specific purpose. I believe
that RDF went too far with its syntax to try to make virtually every XML
document valid RDF. If you want to define a metadata language, define one.
If you want to define a general object interchange language, define that
instead.

Jeff

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Eve L. Maler
> Sent: Wednesday, December 01, 1999 2:59 PM
> To: 'xml-dev@ic.ac.uk'
> Subject: Re: Some questions
>
>
> At 05:58 AM 10/30/99 -0400, David Megginson wrote:
> >That's just the problem with the spec -- if you forget the word/prefix
> >"meta" completely, RDF is just an XML format for object exchange; it
> >just happens that one possible application of those objects is
> >metadata, and the RDF-Syntax spec mixes the two together.
>
> When people talk about RDF, the "meta" part is what I have trouble with in
> general.  In what way is markup not metadata?  In what way are element
> content and attribute values not also metadata (depending on what you do
> with them)?  It feels weird for one particular data model to claim to have
> cornered the metadata market.
>
>          Eve
>


From jes at kuantech.com  Thu Dec  2 00:21:12 1999
From: jes at kuantech.com (Jeffrey E. Sussna)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: <4.2.0.58.19991201185138.009f5a90@abnaki>
Message-ID: <000801bf3c5b$06a06a50$0f36a8c0@quokka.com>

Well, you need both. You need the shared concept of "author" and the shared
representation of an instance of that concept. XML specs of various kinds
are trying to define shared representations at various semantic layers. Both
vertical and horizontal vocabulary efforts (Dublin Core, BizTalk, etc.) are
required to complete the equation.

Jeff

P.S. Please don't bash me for mentioning BizTalk. It was an arbitrary
example.

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Eve L. Maler
> Sent: Wednesday, December 01, 1999 4:01 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: Some questions
>
>
> At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
> >Because the same data structures and usage patterns keep coming back across
> >wide ranges of metadata applications, even though the world isn't about
> >to agree on common vocabularies.  So there are huge gains to be had from
> >a common data model and transfer syntax. -Tim
>
> Not that I don't respect RDF's power, but personally, I think the key *is*
> common vocabularies.  We may have to start small, and they may just be hub
> formats that get mapped to/from a lot, but agreeing on semantics is the
> pill that has to be swallowed.  Even RDF depends on this, particularly on
> an open system such as the Web where you can't really control or influence
> the habits of content creators.  If you want to indicate that you are the
> author of a certain page, at the very least you have to refer to a widely
> understood "author" semantic in order for author-criterion searching to be
> of any use to your audience.  Whether it's an RDF property or a well-known
> namespace or whatever doesn't seem to matter as much.
>
>          Eve
> --
> Eve Maler            Sun Microsystems
> elm @ east.sun.com    +1 781 442 3190
>


From david at megginson.com  Thu Dec  2 00:28:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: "Jeffrey E. Sussna"'s message of "Wed, 1 Dec 1999 14:36:33 -0800"
References: <000301bf3c4c$85bde920$0f36a8c0@quokka.com>
Message-ID: <m33dutoyw3.fsf@localhost.localdomain>

"Jeffrey E. Sussna" <jes@kuantech.com> writes:

> I wouldn't consider RDF at the same level as CORBA, but perhaps part
> of an overall solution.

Though I'm the one that brought CORBA into the discussion, I think
that a better comparison would probably be XMI, since CORBA is a
protocol rather than a format.

RDF is simpler and less rigidly defined than XMI, and it is nicely
extensible -- that makes it much more suitable, say, for distributing
data in the decentralized, undisciplined environment of the Web, and
much less suitable, say, for direct Java-to-C++ communication in a
well-defined system with a single architecture.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 00:35:58 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:10 2004
Subject: Some questions
In-Reply-To: Joe Lapp's message of "Wed, 01 Dec 1999 18:22:52 -0500"
References: <3.0.32.19991201182252.0283eec0@nexus.webmethods.com>
Message-ID: <m3wvs5njyf.fsf@localhost.localdomain>

Joe Lapp <jlapp@webMethods.com> writes:

> Hi Eve!  Actually, I've brought a similar issue up with RDF group members
> on a few occasions.  I've asked for help understanding why I'd choose the
> RDF syntax instead of inventing my own XML document type to represent the
> desired metadata.

The answer is the same as the answer to why you wouldn't invent your
own markup language rather than use XML -- there is a network effect
to using the same format as other people.  In the case of RDF, it is a
lot easier to do something like

  RDFCollection coll = new RDFCollection("http://www.foo.com/data.rdf");
  RDFResource res = coll.getResource("http://www.foo.com/ids/00001");
  System.out.println("The name is " + 
                     res.getProperty("http://www.foo.com/ns#name"));

than it is to set up a SAX handler or walk through a DOM tree to try
to get the information -- if RDF (or something like it) catches on,
presumably we'll also get visual modelling tools, SQL-mapping tools,
forms-generators, and lots of other nice COTS stuff that it's hard to
write for XML in the general case.  There are two catches, however:

a) a lot of people have to use the same standard; and
b) there has to be a good software base.

RDF hasn't fully met either criterion yet, though there's some
improvement.
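
For comparison, the "walk through a DOM tree" route mentioned above might
look roughly like the sketch below. It assumes the standard JAXP parser
that ships with Java; the class name, the method name, and the sample
document are made up for illustration, and only the foo.com URIs are taken
from the snippet above. It also handles just one of the several spellings
that RDF/XML allows for the same statement, which is part of why the
generic route is more work.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class DomPropertyLookup {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    // Hypothetical sample data, written in one particular RDF/XML spelling.
    static final String SAMPLE =
        "<rdf:RDF xmlns:rdf='" + RDF_NS + "' xmlns:foo='http://www.foo.com/ns#'>"
      + "<rdf:Description rdf:about='http://www.foo.com/ids/00001'>"
      + "<foo:name>Example Name</foo:name>"
      + "</rdf:Description>"
      + "</rdf:RDF>";

    // Returns the text of the first matching property element of the
    // resource, or null if the resource or the property is not found.
    static String findProperty(String xml, String resourceUri,
                               String propertyNs, String propertyLocalName) {
        Document doc;
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for the *NS lookups below
            DocumentBuilder builder = factory.newDocumentBuilder();
            doc = builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
        NodeList descriptions = doc.getElementsByTagNameNS(RDF_NS, "Description");
        for (int i = 0; i < descriptions.getLength(); i++) {
            Element description = (Element) descriptions.item(i);
            // Skip descriptions of other resources.
            if (!resourceUri.equals(description.getAttributeNS(RDF_NS, "about"))) {
                continue;
            }
            NodeList props =
                description.getElementsByTagNameNS(propertyNs, propertyLocalName);
            if (props.getLength() > 0) {
                return props.item(0).getTextContent();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println("The name is " + findProperty(
            SAMPLE, "http://www.foo.com/ids/00001",
            "http://www.foo.com/ns#", "name"));
    }
}
```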


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 00:31:43 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:11 2004
Subject: Some questions
In-Reply-To: "Didier PH Martin"'s message of "Wed, 1 Dec 1999 17:35:52 -0500"
References: <NBBBJPGDLPIHJGEHAKBACEDEEIAA.martind@netfolder.com>
Message-ID: <m3zox1nk6u.fsf@localhost.localdomain>

"Didier PH Martin" <martind@netfolder.com> writes:

> Obviously the choice of words like "about" and "description" leads one to
> think of data about something instead of the data being _the_ something.
> This is why I use a structure like this:
> 
> <rdf:Description ID="MyID">
>   <location>http://www.netfolder.com</location>
>   ... etc...
> </rdf:Description>
> 
> What are the gains?
> a) the object is location independent.

Or, in programming terms, the ID is local rather than global, or in
Web terms, it is relative rather than absolute (note that RDF allows
ID as well).  That's suitable for some applications, but entirely
useless for others (it's often important to have single global
identifiers for well-known people, places, and things).
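
A small sketch of the local-versus-global point, using java.net.URI: a
document-local ID only becomes a full identifier once it is resolved
against the address the document was loaded from, so the same ID in a
copied or moved document names a different thing. The document addresses
and the class and method names here are hypothetical; only "MyID" comes
from the quoted example.

```java
import java.net.URI;

class LocalVersusGlobalId {
    // Resolve a document-local ID, treated as a fragment, against the
    // URI of the document that contains it.
    static URI absoluteId(String documentUri, String localId) {
        return URI.create(documentUri).resolve("#" + localId);
    }

    public static void main(String[] args) {
        // Same local ID, two hypothetical document locations.
        System.out.println(absoluteId("http://www.netfolder.com/data.rdf", "MyID"));
        System.out.println(absoluteId("http://mirror.example.org/copy.rdf", "MyID"));
    }
}
```

The two printed identifiers differ even though the local ID is the same,
which is fine for data that travels with its document and a problem when
several sources need to refer to one well-known thing.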


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 00:38:37 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:11 2004
Subject: Some questions
In-Reply-To: Joe Lapp's message of "Wed, 01 Dec 1999 18:53:20 -0500"
References: <3.0.32.19991201185319.0191d680@nexus.webmethods.com>
Message-ID: <m3u2n9nju1.fsf@localhost.localdomain>

Joe Lapp <jlapp@webMethods.com> writes:

> RDF may be simple at heart, but is it reasonable to ask the average
> user to figure it out, to expect that the average user of metadata
> will even be able to grok the abstractions?  I may be reiterating
> your earlier sentiment, but I worry that the abstractions are as
> much an impediment as the spec and the syntax.

I don't think so -- the average user never even caught on to the HTML
<meta> element.  Personally, I'm much more interested in RDF for B2B
data exchange than I am in convincing Jane User to stick RDF in her
Web pages.  Besides, B2B is where the money and excitement are right
now.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 00:52:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:11 2004
Subject: Some Questions
In-Reply-To: "Jeffrey E. Sussna"'s message of "Wed, 1 Dec 1999 14:39:05 -0800"
References: <000401bf3c4c$e033dea0$0f36a8c0@quokka.com>
Message-ID: <m3r9idnj70.fsf@localhost.localdomain>

"Jeffrey E. Sussna" <jes@kuantech.com> writes:

> I think the important thing to remember about RDF is that it is not XML. It
> is fundamentally an abstract model for expressing metadata. It happens to be
> representable using XML, but it is different from XML. Unfortunately, this
> distinction is part of what makes the spec hard to read, but it's important.

That's a good point, and it's important to remember that it applies to
almost *everything* that can be represented in XML.  Even traditional
document types like HTML or DocBook really have their own model (hence
the HTML DOM).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Daniel.Brickley at bristol.ac.uk  Thu Dec  2 01:10:09 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:11 2004
Subject: Some questions
In-Reply-To: <4.2.0.58.19991201185138.009f5a90@abnaki>
Message-ID: <Pine.GHP.4.21.9912020006330.18108-100000@mail.ilrt.bris.ac.uk>

On Wed, 1 Dec 1999, Eve L. Maler wrote:

> At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
> >Because the same data structures and usage patterns keep coming back across
> >wide ranges of metadata applications, even though the world isn't about
> >to agree on common vocabularies.  So there are huge gains to be had from
> >a common data model and transfer syntax. -Tim
> 
> Not that I don't respect RDF's power, but personally, I think the key *is* 
> common vocabularies.  We may have to start small, and they may just be hub 
> formats that get mapped to/from a lot, but agreeing on semantics is the 
> pill that has to be swallowed.  Even RDF depends on this, particularly on 
> an open system such as the Web where you can't really control or influence 
> the habits of content creators.  If you want to indicate that you are the 
> author of a certain page, at the very least you have to refer to a widely 
> understood "author" semantic in order for author-criterion searching to be 
> of any use to your audience.  Whether it's an RDF property or a well-known 
> namespace or whatever doesn't seem to matter as much.

I don't disagree with any of this except the last claim; both matter IMHO.

What really matters above all is the use of unique identifiers (in Web 
context, URIs) both for the concepts/objects defined in a vocabulary and
those named in our instance data. There is very little to RDF apart from this
idea, ie. that simple stilted 3-part statements of the form:

	{peter, likes, mary}
	{peter, age, 7}
	{mary, livesIn, London}
	{peter, faveColor, red}

...are more useful when disambiguated with unique identifiers. Which
'peter', which 'London' and so forth.

We pay the price in verbosity, but when we move to URIs
(eg. urn:xmeta:cities:canada:London or http://xmlns.com/cities/LondonUK)
for these silly stilted sentences, there's another huge pay off: data
aggregation. Since the RDF information model is just stilted 3-part
sentences mostly built from URIs, we can aggregate two RDF data graphs
by joining nodes that share common identifiers.

If one piece of data tells us:

(I'm switching to an ascii-art labelled graph representation here)

	[mary] --livesIn--> [London]
	[mary] --age--> "9"
	[peter] --livesNextDoorTo-->[mary]

and something else (say the CIA world fact book or X500) 
informs us that...

	[London] --numCommunists--> "10,000"
	[London] --situatedIn--> [Canada]

we can simply[*] join these two graphs on the common node London (or,
rather, the unambiguous version, ie. [urn:xmeta:cities:canada:London]).


Whether this is 'data' or 'metadata' is of no interest to me
whatsoever. Using URIs for Web data is just downright handy. 
We can take heaps of silly 3-part sentences from anywhere (that we
trust...) on the Web, pour them into a common database and get
something mostly intelligible.  
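
A minimal sketch of that kind of aggregation, in Java: each source is a
set of 3-part statements, merging is a set union, and a question that
neither source can answer alone becomes answerable once the shared London
identifier joins them. Everything here is an illustration under stated
assumptions, not an RDF toolkit: the class and method names and the
example.org URIs are invented, and only the London URN is taken from the
text above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class TripleAggregation {
    // Hypothetical identifiers, except LONDON, which is quoted from the message.
    static final String MARY = "http://example.org/people/mary";
    static final String PETER = "http://example.org/people/peter";
    static final String LONDON = "urn:xmeta:cities:canada:London";
    static final String CANADA = "http://example.org/countries/Canada";
    static final String LIVES_IN = "http://example.org/terms#livesIn";
    static final String NEXT_DOOR_TO = "http://example.org/terms#livesNextDoorTo";
    static final String SITUATED_IN = "http://example.org/terms#situatedIn";

    // A statement is a 3-element list: subject, predicate, object.
    static List<String> triple(String subject, String predicate, String object) {
        return Arrays.asList(subject, predicate, object);
    }

    // First source: facts about people.
    static Set<List<String>> firstSource() {
        Set<List<String>> graph = new LinkedHashSet<>();
        graph.add(triple(MARY, LIVES_IN, LONDON));
        graph.add(triple(PETER, NEXT_DOOR_TO, MARY));
        return graph;
    }

    // Second source: facts about places.
    static Set<List<String>> secondSource() {
        Set<List<String>> graph = new LinkedHashSet<>();
        graph.add(triple(LONDON, SITUATED_IN, CANADA));
        return graph;
    }

    // Aggregation is a set union; nodes join because their identifiers are equal.
    static Set<List<String>> merge(Set<List<String>> a, Set<List<String>> b) {
        Set<List<String>> merged = new LinkedHashSet<>(a);
        merged.addAll(b);
        return merged;
    }

    // All objects of statements with the given subject and predicate.
    static List<String> objects(Set<List<String>> graph, String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (List<String> t : graph) {
            if (t.get(0).equals(subject) && t.get(1).equals(predicate)) {
                result.add(t.get(2));
            }
        }
        return result;
    }

    // Two-hop query: person --livesIn--> city --situatedIn--> country.
    static List<String> countriesOfResidence(Set<List<String>> graph, String person) {
        List<String> result = new ArrayList<>();
        for (String city : objects(graph, person, LIVES_IN)) {
            result.addAll(objects(graph, city, SITUATED_IN));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<List<String>> merged = merge(firstSource(), secondSource());
        System.out.println(countriesOfResidence(merged, MARY));
    }
}
```

Run against the first source alone, the query returns nothing; run against
the merged set, it finds the country. The sketch ignores trust and
cardinality questions entirely.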

Here's a bald claim:

	Aggregating unanticipated RDF data graphs into a useful common 
	data structure is a feasible task; doing the same with unanticipated
	non-RDF XML data is, in the general case, much harder.

Maybe I'm wrong; perhaps someone has an algorithm for general 
purpose DOM-merging or SAX-stream aggregation that doesn't mangle
data. If anyone has seen such a thing please post the URL...

(BTW I'm making loose use of undefined terms here. By 'unanticipated'
I'm talking about a processor encountering instance data in a previously
unseen vocabulary. By 'aggregation' I mean joining together relevant
facts (or would-be facts) scattered across various XML documents and
document-parts such that applications can make use of the pooled
information.)

Let me emphasise that I'm not focussing on the use of RDF syntax
here; that doesn't matter. The key thing IMHO to support Web 
data aggregation is for interchanged data to have a common URI-based
graph interpretation. We can do that with XSL or (hopefully) 
using annotations in XML Schemata or annotations on good old fashioned
DTDs. RDF is URIs URIs URIs and not a lot else. I'm willing to be
persuaded that the syntax needs more thought, but the value of using
unique identifiers in Web data interchange seems pretty
uncontroversial...

Dan
 

[*] I'm glossing over some issues here (eg. relating to
knowledge of cardinality/occurrence constraints to aid data
aggregation apps); aggregation in RDF is still hard to do right, but is
vastly easier than for arbitrary XML content.


--
daniel.brickley@bristol.ac.uk




From jwtodd at pacbell.net  Thu Dec  2 03:42:09 1999
From: jwtodd at pacbell.net (James Todd)
Date: Mon Jun  7 17:18:11 2004
Subject: q: i'd like to merge two docs ...
Message-ID: <3845EAFA.341C3122@pacbell.net>


hi -

    i could use a pointer or two, a recipe if you will, on how best to
    "modify and merge" two xml docs. the scenario:

        an inbound xml "fragment", a complete xml doc in its own
        right, is amended (eg. one new attribute is added)

        the result of which is appended, as a child node, to a
        "hosting" xml tree

    i've got most of this working using the ProjectX [? Mr. Brownell ?]
    parser yet it fails during the appendChild() stating that the child
    node

        "That node doesn't belong in this document"

    due to the fact, i believe, that it has a distinct OwnerDocument.

    my methodology to date is to create dom's for both the inbound
    "fragment" and the destination xml docs, after which i'd like to
    modify the fragment (hence going the dom route) and finally add the
    results to the destination doc via appendChild().

    i had hoped to bypass walking the tree in order to create a
    "document ownerless" copy to work with. is there
    a better/preferred means by which to accomplish this task?

    any/all comments and suggestions welcomed.

    thx much,

- james




From orchard at pacificspirit.com  Thu Dec  2 04:02:28 1999
From: orchard at pacificspirit.com (David Orchard)
Date: Mon Jun  7 17:18:11 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201154526.014c3710@pop.intergate.ca>
Message-ID: <000401bf3c7a$064c3ec0$e930e620@n54wntw.vancouver.can.ibm.com>

As well, I fall into the "context is king" category, not the "content is king" one.

The best metric we have for that is company market caps and revenues.  TV
Guide makes more than CBS, NBC, ABC and Fox put together.

Yahoo wins because the context is human created rather than generated.
Human context or Point of View is always more usable to humans than machine
POV.

If you argue that Point of View and Context are actually content not
metadata, then there's no such thing as metadata.  It's all just data.
Which is what I actually believe.

Cheers,
Dave Orchard


> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Tim Bray
> Sent: Wednesday, December 01, 1999 3:47 PM
> To: Walter Underwood; 'xml-dev@ic.ac.uk'
> Subject: RE: Some questions
>
>
> At 02:02 PM 12/1/99 -0800, Walter Underwood wrote:
> >I disagree on this one. It's rare that metacontent is more
> >valuable than content, long-term. I'll bet on the books over
> >the card catalog, every time.
>
> Wow, that's a profoundly deep and strong statement, and I think at the
> core of the argument that *should* be happening about how to make the
> Web a better place.  In fairness, it should be said that Walter works for
> a company whose search engine does the equivalent of reading all the
> pages of all the books on all the shelves, and trying to guess what the
> books mean.  I used to be in that business myself.
>
> But I think metadata wins.  If you count hits on Internet search engines,
> the Yahoo and ODP directories, which are both human-constructed metadata,
> absolutely wipe out any fulltext search engine you can name, even though
> they have orders of magnitude fewer sites and a lower volume of information
> about each.  Because human-constructed metadata wins on the net just like
> it did in the library.
>
> RDF is important because it can facilitate the interchange of, and
> a certain number of the common uses of, this kind of metadata.
>
> Anyhow, just because you have a card catalogue doesn't mean you throw
> the library books away. -Tim
>


From smuench at us.oracle.com  Thu Dec  2 04:44:15 1999
From: smuench at us.oracle.com (Steve Muench)
Date: Mon Jun  7 17:18:11 2004
Subject: i'd like to merge two docs ...
References: <3845EAFA.341C3122@pacbell.net>
Message-ID: <005001bf3c6e$ef8d57b0$5a672382@us.oracle.com>

Assuming you have XML DOM Documents "one" and "two"
and that "oneElement" is the element in doc "one"
to which you'd like to append the entire content
of "two"...

You should be able to do:

   Element twoDocElt = two.getDocumentElement();
   two.removeChild(twoDocElt);
   oneElement.appendChild(twoDocElt);
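
If the parser still objects after the removeChild() (in DOM the owner
document of a node is fixed, so a strict implementation may keep raising
the wrong-document error), another route is to copy the node into the
target document first. The sketch below assumes a parser that implements
the DOM Level 2 Document.importNode() method, which the parser under
discussion may not; the class name, method names, and attribute are
hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class MergeDocs {
    // Parse an XML string into a DOM Document using the default JAXP parser.
    static Document parse(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            return builder.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    // Amend the fragment's root element, then append a deep copy of it
    // under hostElement. Returns the copy that now lives in the host document.
    static Element amendAndMerge(Element hostElement, Document fragment,
                                 String attrName, String attrValue) {
        Element fragmentRoot = fragment.getDocumentElement();
        fragmentRoot.setAttribute(attrName, attrValue);

        // importNode creates a copy owned by the host document; the
        // fragment document itself is left as it was.
        Document hostDoc = hostElement.getOwnerDocument();
        Node copy = hostDoc.importNode(fragmentRoot, true); // true = deep copy
        return (Element) hostElement.appendChild(copy);
    }

    public static void main(String[] args) {
        Document one = parse("<host><slot/></host>");
        Document two = parse("<fragment><item>x</item></fragment>");
        Element oneElement = (Element) one.getDocumentElement().getFirstChild();
        Element added = amendAndMerge(oneElement, two, "status", "merged");
        System.out.println(added.getOwnerDocument() == one);
    }
}
```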

_________________________________________________________
Steve Muench, Consulting Product Manager & XML Evangelist
Business Components for Java Development Team
http://technet.oracle.com/tech/java
http://technet.oracle.com/tech/xml
----- Original Message -----
From: James Todd <jwtodd@pacbell.net>
To: <xml-dev@ic.ac.uk>
Sent: Wednesday, December 01, 1999 9:43 PM
Subject: q: i'd like to merge two docs ...


|
| hi -
|
|     i could use a pointer or two, a recipe if you will, on how best to
|     "modify and merge" two xml docs. the scenario:
|
|         an inbound xml "fragment", a complete xml doc in its own
|         right, is amended (eg. one new attribute is added)
|
|         the result of which is appended, as a child node, to a
|         "hosting" xml tree
|
|     i've got most of this working using the ProjectX [? Mr. Brownell ?]
|     parser yet it fails during the appendChild() stating that the child
|     node
|
|         "That node doesn't belong in this document"
|
|     due to the fact, i believe, that it has a distinct OwnerDocument.
|
|     my methodology to date is to create dom's for both the inbound
|     "fragment" and the destination xml docs, after which i'd like to
|     modify the fragment (hence going the dom route) and finally add the
|     results to the destination doc via appendChild().
|
|     i had hoped to bypass walking the tree in order to create a
|     "document ownerless" copy to work with. is there
|     a better/preferred means by which to accomplish this task?
|
|     any/all comments and suggestions welcomed.
|
|     thx much,
|
| - james
|
|


From jwtodd at pacbell.net  Thu Dec  2 06:46:19 1999
From: jwtodd at pacbell.net (James Todd)
Date: Mon Jun  7 17:18:11 2004
Subject: i'd like to merge two docs ...
References: <3845EAFA.341C3122@pacbell.net>
 <005001bf3c6e$ef8d57b0$5a672382@us.oracle.com>
Message-ID: <38461554.61917EBD@pacbell.net>


hmmm ... this is pretty much what i did with the exception of
the "removeChild()" step. my interpretation of this is that the
removeChild step will disassociate/null the OwnerDocument
so that it is effectively orphaned and can be added into the
new hosting doc.

i'll give it a whirl.

thx much,

- james

Steve Muench wrote:

> Assuming you have XML DOM Documents "one" and "two"
> and that "oneElement" is the element in doc "one"
> to which you'd like to append the entire content
> of "two"...
>
> You should be able to do:
>
>    Element twoDocElt = two.getDocumentElement();
>    two.removeChild(twoDocElt);
>    oneElement.appendChild(twoDocElt);
>
> _________________________________________________________
> Steve Muench, Consulting Product Manager & XML Evangelist
> Business Components for Java Development Team
> http://technet.oracle.com/tech/java
> http://technet.oracle.com/tech/xml
> ----- Original Message -----
> From: James Todd <jwtodd@pacbell.net>
> To: <xml-dev@ic.ac.uk>
> Sent: Wednesday, December 01, 1999 9:43 PM
> Subject: q: i'd like to merge two docs ...
>
> |
> | hi -
> |
> |     i could use a pointer or two, a recipe if you will, on how best to
> |     "modify and merge" two xml docs. the scenario:
> |
> |         an inbound xml "fragment", a complete xml doc in its own
> |         right, is amended (eg. one new attribute is added)
> |
> |         the result of which is appended, as a child node, to a
> |         "hosting" xml tree
> |
> |     i've got most of this working using the ProjectX [? Mr. Brownell ?]
> |     parser yet it fails during the appendChild() stating that the child
> |     node
> |
> |         "That node doesn't belong in this document"
> |
> |     due to the fact, i believe, that it has a distinct OwnerDocument.
> |
> |     my methodology to date is to create dom's for both the inbound
> |     "fragment" and the destination xml docs, after which i'd like to
> |     modify the fragment (hence going the dom route) and finally add the
> |     results to the destination doc via appendChild().
> |
> |     i had hoped to bypass walking the tree in order to create a
> |     "document ownerless" copy to work with. is there
> |     a better/preferred means by which to accomplish this task?
> |
> |     any/all comments and suggestions welcomed.
> |
> |     thx much,
> |
> | - james




From mikew at o3.co.uk  Thu Dec  2 08:08:51 1999
From: mikew at o3.co.uk (Mike Williams)
Date: Mon Jun  7 17:18:11 2004
Subject: Storing SAX Locator information in DOM tree
Message-ID: <m3vh6h7ofh.fsf@picasso.o3.co.uk>

I'm pre-parsing some XML-based web-page templates, and building them into
DOM Documents.  My template-processor takes Document+data as input, and
generates SAX events.

My problem is this: if I detect an error while processing the template
(expected tags are missing, etc.), I have no way of relating this to a
position in the original template-file.  This would obviously be useful for 
my template-authors, so they don't have to re-check entire templates.

I'd really like to store information against each node in the Document,
recording what file it was built from, and where (line/column) the node
started; the SAX Locator information, basically.  Reasonable?  

Is there any way to implement this?

-- 
Mike Williams
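
One possible way to do what this message asks, sketched under assumptions:
build the DOM from SAX events and attach the Locator's position to each
element as it is created. The sketch relies on Node.setUserData() from DOM
Level 3, which is later than this thread; with an older DOM a side table
keyed by node would play the same role. The class name and the user-data
keys are hypothetical. Note that a SAX Locator reports where the current
event ends, so the recorded position is typically just after the start tag
rather than at its first character.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.SAXParserFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.Locator;
import org.xml.sax.helpers.DefaultHandler;

final class LocatingDomBuilder extends DefaultHandler {
    // Hypothetical keys under which the position is stored on each element.
    static final String LINE_KEY = "lineNumber";
    static final String COLUMN_KEY = "columnNumber";
    static final String SYSTEM_ID_KEY = "systemId";

    private final Document doc;
    private final Deque<Node> stack = new ArrayDeque<>();
    private Locator locator;

    private LocatingDomBuilder(Document doc) {
        this.doc = doc;
        stack.push(doc); // the document is the parent of the root element
    }

    @Override
    public void setDocumentLocator(Locator locator) {
        this.locator = locator; // supplied by the parser before any other event
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        Element element = doc.createElement(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            element.setAttribute(atts.getQName(i), atts.getValue(i));
        }
        if (locator != null) {
            element.setUserData(LINE_KEY, locator.getLineNumber(), null);
            element.setUserData(COLUMN_KEY, locator.getColumnNumber(), null);
            element.setUserData(SYSTEM_ID_KEY, locator.getSystemId(), null);
        }
        stack.peek().appendChild(element);
        stack.push(element);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        stack.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        stack.peek().appendChild(doc.createTextNode(new String(ch, start, length)));
    }

    /** Parses xml into a DOM whose elements carry their source position. */
    static Document parse(String xml, String systemId) {
        try {
            Document doc =
                DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            InputSource source = new InputSource(new StringReader(xml));
            source.setSystemId(systemId);
            SAXParserFactory.newInstance().newSAXParser()
                .parse(source, new LocatingDomBuilder(doc));
            return doc;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A template processor could then read element.getUserData("lineNumber") when
it reports an error against a node.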



From matthew at praxis.cz  Thu Dec  2 09:06:12 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:11 2004
Subject: RDF vs. standard vocabularies (Was Re: Some questions)
References: <3.0.32.19991201154632.0152f350@pop.intergate.ca>
Message-ID: <384635F2.7E975E13@praxis.cz>

Tim Bray wrote:
> Because the same data structures and usage patterns keep coming back across
> wide ranges of metadata applications, even though the world isn't about
> to agree on common vocabularies.  So there are huge gains to be had from
> a common data model and transfer syntax. -Tim

But aren't common vocabularies needed for RDF as well?

Matthew



From matthew at praxis.cz  Thu Dec  2 09:26:30 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:11 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain>
Message-ID: <38463AB4.36C5292B@praxis.cz>

David Megginson wrote:
> Perhaps I'm a little confused, but I cannot see how the fact that a
> schema language itself happens to be object oriented allows you to do
> object exchange in XML (it doesn't hurt, but how can it help?).
> 
> I've been confused before, so there's no need for panic.

I'm not entirely sure whether or not I am confused myself. Let me give
this a crack, and I'm sure someone will be happy to tell me why I am
wrong. :-)

Let's say I have an arbitrary object structure that I want to serialize
and send down the pipe. Serializing a bunch of object attributes in XML
is a no-brainer, and representing arbitrary references between objects
is also fairly trivial if something like XLink is used (and we need
XLink, there's surely no controversy about this). The aspects of
object-oriented design that are missing are then inheritance and
polymorphism. This is why an object-oriented schema language is needed:
to do this properly I should be mapping each of my object classes to a
specific element type, and I need to be able to say that a given element
type extends a base type if this type of relationship is present in my
original object schema. Rich data types are also needed although this
doesn't have to do with object-orientation per se. Polymorphism is about
behavior and should be implemented in schema-aware tools.

I honestly feel that XML provides all the tools to do what RDF is trying
to do, without an additional syntactic layer. What is missing from the
picture is a mechanism for modelling object structures according to
object-oriented principles, and this is why an OO schema language is
necessary. The only other thing that RDF brings to the game is that it
turns relationships into first-class objects that can be referenced as
well, but I don't know any OO language that enables this without
modelling it specifically (i.e. creating an object to represent a
reference), and this can be done in an analogous way in XML as well.

If I may, let me turn your question on its head: what about an XML
Schema approach doesn't let you do object exchange in a satisfactory
manner?

Matthew



From david at megginson.com  Thu Dec  2 11:31:25 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:11 2004
Subject: Some questions
In-Reply-To: "David Orchard"'s message of "Wed, 1 Dec 1999 20:02:17 -0800"
References: <000401bf3c7a$064c3ec0$e930e620@n54wntw.vancouver.can.ibm.com>
Message-ID: <m3emd5a8c9.fsf@localhost.localdomain>

"David Orchard" <orchard@pacificspirit.com> writes:

> As well, I fall into the context is king category, not content.
> 
> The best metric we have for that is company market caps and
> revenues.  TV Guide makes more than CBS, NBC, ABC and Fox put
> together.
> 
> Yahoo wins because the context is human created rather than
> generated.  Human context or Point of View is always more usable to
> humans than machine POV.
> 
> If you argue that Point of View and Context are actually content not
> metadata, then there's no such thing as metadata.  It's all just
> data.  Which is what I actually believe.

Yup, me too -- I'm not a big fan of the "meta" word.  After all, many
companies have information about me in their databases, but they don't 
have me myself in them -- does that mean that all of their data are
"metadata" as well?  If so, then what are just plain data?

BTW, RDF itself is being very heavily and successfully used in the
Linux world right now -- it's the basis of the database for rpmfind, a
utility that allows users to find new packages or upgrade existing
ones, including dependencies.  Of course, end users never have to see
the RDF (they can look if they want to), but that's the way it should
be.


All the best,


David

p.s. The RDF used by rpmfind is not strictly conformant, since it uses 
     a pre-REC version of the RDF Namespace URI, but otherwise it's
     fully correct.

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 11:44:29 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:12 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: Matthew Gertner's message of "Thu, 02 Dec 1999 10:24:04 +0100"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz>
Message-ID: <m3aenta7qn.fsf@localhost.localdomain>

Matthew Gertner <matthew@praxis.cz> writes:

> I honestly feel that XML provides all the tools to do what RDF is trying
> to do, without an additional syntactic layer. What is missing from the
> picture is a mechanism for modelling object structures according to
> object-oriented principles, and this is why an OO schema language is
> necessary. 

If you have a function loadXML(), you get a DOM tree or a bunch of SAX
events or something similar; if you have a function loadRDF(), you get
a collection of objects with attributes and relationships.  In either
case, a schema can tell you things like "element type/class B is a
kind of element type/class A", but that's secondary information; the
primary information is "element X is an object of class Y with
identifier Z, while element A represents a relationship between this
object and object C".

If you're interested in a collection of objects in the first place,
why should you have to see or know about XML elements and attributes
at all?  Or to put it a different way, why should people constantly
have to redo the work of extracting objects from XML, when they're all
trying to do the same thing?

I think that reasonable people can argue that RDF is not the best
solution to the problem of object exchange in XML, but I am somewhat
surprised to hear people deny that the problem even exists: there is
an enormous demand for exchanging objects in XML (businesses exchange
a lot of structured data), and it's hard work to have to figure out
over and over how to construct objects from a SAX stream or a DOM tree
especially when programmers with XML knowledge are scarce and
expensive.  

I have no doubt that we need an abstract object layer on top of XML.
Right now, RDF is the best solution currently available (XMI also has
its advocates), but I'm ready to listen about anything better.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From steven.livingstone at scotent.co.uk  Thu Dec  2 11:54:43 1999
From: steven.livingstone at scotent.co.uk (Steven Livingstone, ITS, SENM)
Date: Mon Jun  7 17:18:12 2004
Subject: Schema Question
Message-ID: <8DCB90532FF7D211B34400805FD48853B363F5@SENMAIL3>

Hi all - 
I've got a question on X-Schema for anyone who may be of help.

I am creating an XML Schema for an XML document which is to be dynamically
generated.
I am ok with most of it, but there is one particular part where I may have
any number of elements, but with the same property.

So I may have

<element_properties>
<color>red</color>
<mixed_with_na>green</mixed_with_na>
..
..
..
</element_properties>


The element properties could be called anything, but have the same type of
value.

Is there a way to specify a variable for the element name, but set its
type, say, to string so that any element created under <element_properties>
could be called anything but follow predetermined validation ?

Beyond that, I will stick to the normal <property name="x" value="y"/>
technique with validation, which isn't really a problem, but the other
would/could be useful !?

Cheers
Steven
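
Whether the schema dialect can express this is the open question in this
message. As a hedged illustration of the constraint itself (any element
name, text-only value), the check can be written directly against a DOM
tree. This is a sketch, not a schema-language answer; the class and method
names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

final class PropertyCheck {
    /** Parses a string into a DOM Document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Returns true if every child element of parent, whatever its name,
     * contains no element children of its own (i.e. holds a plain string).
     */
    static boolean allChildrenAreSimpleStrings(Element parent) {
        for (Node child = parent.getFirstChild(); child != null;
                child = child.getNextSibling()) {
            if (child.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace and comments between properties
            }
            for (Node inner = child.getFirstChild(); inner != null;
                    inner = inner.getNextSibling()) {
                if (inner.getNodeType() == Node.ELEMENT_NODE) {
                    return false; // nested markup, so not a simple string value
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Document doc = parse("<element_properties><color>red</color>"
            + "<mixed_with_na>green</mixed_with_na></element_properties>");
        System.out.println(allChildrenAreSimpleStrings(doc.getDocumentElement()));
    }
}
```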

Steven Livingstone - http://www.citix.com
07771 957 280 or +447771957280

Professional Site Server 3, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696
Professional Site Server 3.0 Commerce Edition, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505




From matthew at praxis.cz  Thu Dec  2 12:24:41 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:12 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain>
Message-ID: <38466487.328D1CFA@praxis.cz>

David Megginson wrote:
> If you have a function loadXML(), you get a DOM tree or a bunch of SAX
> events or something similar; if you have a function loadRDF(), you get
> a collection of objects with attributes and relationships.  In either
> case, a schema can tell you things like "element type/class B is a
> kind of element type/class A", but that's secondary information; the
> primary information is "element X is an object of class Y with
> identifier Z, while element A represents a relationship between this
> object and object C".

A schema gives you this information too. The problem of how to attach a
schema to an instance is not yet resolved, but it is a purely syntactic
consideration and a satisfactory solution will be found. This then tells
you what class a given instance belongs to. The identity can be
specified using an ID attribute; this is exactly the way it is done in
RDF. That an element represents a relationship is implicit in the
content model of the element.

SAX is great as far as it goes, but we seem to be agreeing that an
additional layer is needed on top. This layer is not the DOM. One of the
lessons that I learned from my time at POET Software is that, although
we had an excellent generic API, the vast majority of our customers
wanted to work with real C++ (and later Java) classes in their problem
domain. But there is nothing to say that a loadXML() function must
return a DOM tree. There are a variety of efforts to create
domain-specific objects automatically from XML objects. I don't have a
list at the tips of my fingers, but if anyone does it would be a great
resource. They are out there because I keep bumping into them.

> If you're interested in a collection of objects in the first place,
> why should you have to see or know about XML elements and attributes
> at all?  Or to put it a different way, why should people constantly
> have to redo the work of extracting objects from XML, when they're all
> trying to do the same thing?

Once again, there are already tools that provide this functionality
across applications (i.e. they can be plugged in and used without
additional development). The interest of XML is essentially as a way to
serialize objects and send them across a network, as you also stated.

> I think that reasonable people can argue that RDF is not the best
> solution to the problem of object exchange in XML, but I am somewhat
> surprised to hear people deny that the problem even exists: there is
> an enormous demand for exchanging objects in XML (businesses exchange
> a lot of structured data), and it's hard work to have to figure out
> over and over how to construct objects from a SAX stream or a DOM tree
> especially when programmers with XML knowledge are scarce and
> expensive.
> 
> I have no doubt that we need an abstract object layer on top of XML.
> Right now, RDF is the best solution currently available (XMI also has
> its advocates), but I'm ready to listen about anything better.

In no way do I doubt the importance of being able to exchange objects in
XML, but I do have serious reservations about RDF as the way to do this,
and they have nothing to do with the hairy syntax or hard-to-understand
spec. What is lacking right now is an overarching approach to using XML
in real-world applications. To be quite blunt it seems a shame that a lot
of really great work is being put into the RDF effort (including a very
valuable vocabulary for collection classes, just to name one) instead of
being integrated more tightly into the overall XML architecture. This is
especially so because there isn't an overall XML architecture yet, and
the effort and thought that are being put into RDF could bring us a long
way towards this. I don't agree with the conclusions of the Cambridge
Communique. I think that if the work being done on RDF were refocused on
making sure that XML Schemas do everything that the RDF advocates are
rightly claiming is necessary, we would see a clear win in terms of
pushing the whole XML effort from a theoretical effort into a major
paradigm shift with extensive real-world implications. As things stand,
this work is being diluted because we are asking people to read
about, grasp and implement two things instead of just one.

Cheers,
Matthew



From martind at netfolder.com  Thu Dec  2 12:40:02 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:12 2004
Subject: Some questions
In-Reply-To: <m3zox1nk6u.fsf@localhost.localdomain>
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEECEIAA.martind@netfolder.com>

Hi David,

David said:
Or, in programming terms, the ID is local rather than global, or in
Web terms, it is relative rather than absolute (note that RDF allows
ID as well).  That's suitable for some applications, but entirely
useless for others (it's often important to have single global
identifiers for well-known people, places, and things).

Didier replies:
But most RDF users set the description element's "about" attribute
to a URL. In fact, this is OK because the spec indicates that you
are providing a description _about_ something, and the about value may be its
location.

I discovered that using this form is, most of the time, bogus. Instead, I do
what librarians discovered: keep the classification card (i.e. the description
element) independent of any properties. So, instead of using a
location in the description element, I use an ID. This is mainly
because the object's location _is_ a property. So, the object's location is
indicated by a "location" property. If there is no location I do not include
a "location" property.

See, this is very different. The object, this time, is a collection of
properties. The description itself is uniquely identified in a description
collection by an ID (so that, if this is needed, we can relate a description
to another). I do not use a URL as a value for the about and I tend not to
use the about attribute but instead use the "id" attribute and include the
location as a property in the description.

So, now, the real challenge for data interchange is to agree on a particular
schema or property set. Otherwise we only exchange data with our own tools
:-)

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web Boston (http://www.mfweb.com)
Markup 99 (http://www.gca.com)
Book to come soon: XML Pro published by Wrox Press
Products http://www.netfolder.com




From martind at netfolder.com  Thu Dec  2 12:50:12 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:12 2004
Subject: Some questions
In-Reply-To: <000801bf3c5b$06a06a50$0f36a8c0@quokka.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEECEIAA.martind@netfolder.com>

Hi Jeff,

Jeff said:
Well, you need both. You need the shared concept of "author" and the shared
representation of an instance of that concept. XML specs of various kinds
are trying to define shared representations at various semantic layers. Both
vertical and horizontal vocabulary efforts (Dublin Core, BizTalk, etc.) are
required to complete the equation.

Jeff

P.S. Please don't bash me for mentioning BizTalk. It was an arbitrary
example.

Didier replies:
I won't bash you, but I will point out that the biztalk framework is more of
an envelope used to transport your document. In that sense, your document is
transformed into a fragment of a biztalk document.

If we look closely enough, a biztalk framework is a collection of meta data
about an XML document. Meta data like:
a) from whom/what is this document coming?
b) to whom/what is this document sent?
c) what is the purpose of this document?
d) of which process is this document a part?

So, a biztalk document is a set of meta data properties and of course
includes your document, now transformed into a fragment of the biztalk
document. A biztalk document has about the same structure as an HTML document.
<biztalk>
<route>
header part or meta data part
</route>
<body>
body part - this is where you insert your document
</body>
</biztalk>

This is roughly equivalent to an HTML document structure:

<html>
<head>
... your headers here including meta data
</head>
<body>
...the HTML document body
</body>
</html>

Something interesting to note here: the meta data are mainly "routing" meta
data, as you would find in workflow engines. Is a biztalk server a workflow
engine? Does Microsoft now want to enter the workflow business? I'll let you
make your own conclusions.
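
The envelope idea described above can be sketched with the DOM. The element
names biztalk, route and body are taken from the outline in this message
and are not checked against the actual BizTalk schema; the "from" and "to"
attributes and the class name are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class Envelope {
    /** Parses a string into a DOM Document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Wraps payload in an envelope: routing meta data first, document second. */
    static Document wrap(Document payload, String from, String to) {
        Document envelope;
        try {
            envelope = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        Element root = envelope.createElement("biztalk");
        Element route = envelope.createElement("route");
        route.setAttribute("from", from); // hypothetical routing attributes
        route.setAttribute("to", to);
        Element body = envelope.createElement("body");
        // The payload becomes a fragment of the envelope document.
        body.appendChild(envelope.importNode(payload.getDocumentElement(), true));
        envelope.appendChild(root);
        root.appendChild(route);
        root.appendChild(body);
        return envelope;
    }

    public static void main(String[] args) {
        Document wrapped = wrap(parse("<order id=\"7\"/>"), "buyer", "seller");
        System.out.println(wrapped.getDocumentElement().getTagName());
    }
}
```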

PS: my outlook spell checker still wants to replace the "biztalk" word by
the "bestial" word :-)))

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web Boston (http://www.mfweb.com)
Markup 99 (http://www.gca.com)
Book to come soon: XML Pro published by Wrox Press
Products http://www.netfolder.com




From ricko at allette.com.au  Thu Dec  2 13:55:20 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:12 2004
Subject: XML processing instruction survey
Message-ID: <013401bf3cd0$660d6040$3ff96d8c@NT.JELLIFFE.COM.AU>


From: Jeffrey E. Sussna <jes@kuantech.com>

>I'm interested in the extent to which people are actually using the XML
>processing instruction ( <?xml ) in their XML files, and the extent to
>which they find it useful.

You probably should post this question (in Japanese) to a Japanese XML
mailgroup, or (in Chinese) to a Chinese XML mail group (such as the one
running from University of Milan), and so on.

Asking an English-language list will only give you a survey of how many
people work outside their own language: I am interested in this, but a
lack of response would not provide evidence of anything much.  Also,
when dealing with CJK societies, with their strong Buddhist aversion to
self-promotion coupled with a strong Confucian deference to authority
(let alone the strong reluctance to embarrass themselves or others, or
put themselves into conflict), you might be hard-pressed to get much
response even there.

For me, I use it every day.  See  http://www.ascc.net/xml for a
bilingual website in UTF-8, Big5, and GB2312.  Logfiles reveal that
Chinese is accessed primarily through Big5 or GB.
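
A small sketch of why the declaration matters for non-UTF-8 files: without
external encoding information, a parser that sees no encoding declaration
has to assume UTF-8 or UTF-16. The example uses ISO-8859-1 rather than Big5
or GB2312 only because that charset is guaranteed on every Java platform;
that substitution, the class name and the sample text are assumptions of
the sketch. Document.getXmlEncoding() is a DOM Level 3 method.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

final class EncodingDeclaration {
    /** Parses raw bytes, letting the parser pick the encoding from the declaration. */
    static Document parse(byte[] bytes) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(bytes));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** A document whose bytes are ISO-8859-1 and whose declaration says so. */
    static byte[] latin1Document() {
        String xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>"
            + "<word>caf\u00e9</word>";
        return xml.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        Document doc = parse(latin1Document());
        // The declared encoding tells the parser how to decode the byte 0xE9.
        System.out.println(doc.getXmlEncoding());
        System.out.println(doc.getDocumentElement().getTextContent());
    }
}
```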

Rick Jelliffe




From david at megginson.com  Thu Dec  2 14:27:38 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:12 2004
Subject: Some questions
In-Reply-To: <NBBBJPGDLPIHJGEHAKBAEEECEIAA.martind@netfolder.com>
References: <m3zox1nk6u.fsf@localhost.localdomain>
	<NBBBJPGDLPIHJGEHAKBAEEECEIAA.martind@netfolder.com>
Message-ID: <14406.33170.34794.500249@localhost.localdomain>

Didier PH Martin writes:

 > Didier replies:

 > But most RDF users set the description element's "about"
 > attribute to a URL. In fact, this is OK because the spec
 > indicates that you are providing a description _about_ something,
 > and the about value may be its location.

The advantage is that the URL is an absolute identifier (whether it
actually points to anything or not).  For example, imagine that
Amazon.com uses id p0809764 to refer to the person David Bowie, while
Reuters uses the id p0809764 to refer to the person Bill Clinton.  If
I get some RDF

  <foo:Person rdf:ID="p0809764">
    <foo:customer-rating>80%</foo:customer-rating>
  </foo:Person>

how do I know who I'm talking about?  On the other hand, if I have

  <foo:Person rdf:about="http://www.reuters.com/ids#p0809764">
    <foo:customer-rating>61%</foo:customer-rating>
  </foo:Person>

  <foo:Person rdf:about="http://www.amazon.com/performers/p0809764">
    <foo:customer-rating>80%</foo:customer-rating>
  </foo:Person>

then there's no room for confusion.  Certainly, local IDs have their
uses, but we're building a new environment where information has to be
useful across systems, and to accomplish that, we need to use some
kind of global identifiers, such as URLs or URNs (once the latter are
ready for Prime Time); local IDs are of little value outside of closed
systems.
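
The collision described above can be shown with a plain map, as a rough
sketch: statements keyed by a local ID overwrite each other when data from
two sources is merged, while statements keyed by an absolute URI stay
apart. The class and method names are hypothetical; the identifiers and
ratings are the ones used in this message.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

final class IdentifierScope {
    /**
     * Merges {identifier, rating} pairs into one map. A later pair with the
     * same identifier silently replaces the earlier one.
     */
    static Map<String, String> merge(String[][] statements) {
        Map<String, String> ratings = new HashMap<>();
        for (String[] statement : statements) {
            ratings.put(statement[0], statement[1]);
        }
        return ratings;
    }

    /** Turns a local ID into an absolute identifier relative to a base URI. */
    static String absolute(String base, String localId) {
        return URI.create(base).resolve("#" + localId).toString();
    }

    public static void main(String[] args) {
        // Same local ID from two sources: one entry survives.
        String[][] local = {
            {"p0809764", "61%"},
            {"p0809764", "80%"},
        };
        // Absolute identifiers: both entries survive.
        String[][] global = {
            {absolute("http://www.reuters.com/ids", "p0809764"), "61%"},
            {"http://www.amazon.com/performers/p0809764", "80%"},
        };
        System.out.println(merge(local).size());
        System.out.println(merge(global).size());
    }
}
```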

 > I discovered that using this form is, most of the time,
 > bogus. Instead, I do what librarians discovered: keep the
 > classification card (i.e. the description element) independent of
 > any properties.

Yes, but with the Web, a better analogy would be that you're in
Robarts Library in Toronto and have a card from the Bodleian in Oxford
that says to get the third book on the fifth shelf in the eighteenth
row.  It would have been better to have given you the ISBN so that you 
could find it in any library.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 14:40:46 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:12 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: Matthew Gertner's message of "Thu, 02 Dec 1999 13:22:31 +0100"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz>
Message-ID: <m34se19zkv.fsf@localhost.localdomain>

Matthew Gertner <matthew@praxis.cz> writes:

> A schema gives you this information too. The problem of how to attach a
> schema to an instance is not yet resolved, but it is a purely syntactic
> consideration and a satisfactory solution will be found. This then tells
> you what class a given instance belongs to. The identity can be
> specified using an ID attribute; this is exactly the way it is done in
> RDF. That an element represents a relationship is implicit in the
> content model of the element.

I still don't follow.  Perhaps I need to reread the XML Schema spec,
but given

  <foo>
   <bar id="xxx">
    <hack>David</hack>
    <flurb>Megginson</flurb>
   </bar>
  </foo>

How does the schema tell me that foo represents a container for a
collection of objects, bar represents an object, and hack and flurb
represent the object's properties?
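
To make the question concrete, here is a hedged sketch of the hand-written
mapping that the example above would need today. Nothing in the markup says
that bar is an object and that hack and flurb are its properties; the
caller has to supply that knowledge (here as the objectElement argument).
The class and method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ObjectExtraction {
    /** Parses a string into a DOM Document, wrapping checked exceptions. */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /**
     * Treats every element named objectElement as an object, its id attribute
     * as the identifier, and its child elements as string-valued properties.
     */
    static List<Map<String, String>> extract(Document doc, String objectElement) {
        List<Map<String, String>> objects = new ArrayList<>();
        NodeList candidates = doc.getElementsByTagName(objectElement);
        for (int i = 0; i < candidates.getLength(); i++) {
            Element element = (Element) candidates.item(i);
            Map<String, String> object = new LinkedHashMap<>();
            object.put("id", element.getAttribute("id"));
            for (Node child = element.getFirstChild(); child != null;
                    child = child.getNextSibling()) {
                if (child.getNodeType() == Node.ELEMENT_NODE) {
                    object.put(child.getNodeName(), child.getTextContent());
                }
            }
            objects.add(object);
        }
        return objects;
    }

    public static void main(String[] args) {
        Document doc = parse("<foo><bar id=\"xxx\"><hack>David</hack>"
            + "<flurb>Megginson</flurb></bar></foo>");
        System.out.println(extract(doc, "bar"));
    }
}
```

Every application that reads this vocabulary has to repeat a mapping of
this kind, which is the duplicated work a shared object layer aims to
remove.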

> SAX is great as far as it goes, but we seem to be agreeing that an
> additional layer is needed on top. This layer is not the DOM. 

It can be.  The DOM represents a domain-specific object layer that is
useful for a wide subset of XML operations (especially document- and
browser-oriented work).  There need to be many layers on top of XML,
one for each domain -- it happens that many of those layers will share 
the need to encode objects, so a standard object layer sandwiched
between XML and the domain-specific layers can save a lot of work.

> There are a variety of efforts to create
> domain-specific objects automatically from XML objects. I don't have a
> list at the tips of my fingers, but if anyone does it would be a great
> resource. They are out there because I keep bumping into them.

One example is RDF.

> To be quite blunt it seems a shame that a lot of really great work is
> being put into the RDF effort (including a very valuable vocabulary
> for collection classes, just to name one) instead of being
> integrated more tightly into the overall XML architecture. 

I disagree strongly with the last part of that statement.  I'd argue
the opposite -- higher-level layers should be as independent of XML as
possible.  That's the only way to build good, layered architectures.
XML does one thing (represent a tree structure in a character stream)
very well: it's an excellent layer to build other layers on top of,
but XML itself should stay as simple as possible so that it's
applicable widely to many different fields.

> I think that if the work being done on RDF were refocused to making
> sure that XML Schemas do everything that the RDF advocates are
> rightly claiming is necessary, that we will see a clear win in terms
> of pushing the whole XML effort from a theoretical effort into a
> major paradigm shift with extensive real-world implications.

That would be another serious mistake.  Object exchange, while
important, represents only one of many layers that can be built on top
of XML, and if XML Schemas start trying to solve high-level problems
for every specific domain, it will become an unimplementable mess.
RDF already made a similar mistake by mixing together a spec for
object encoding in XML with a spec for representing knowledge about
Web pages.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From LWatanab at JetForm.com  Thu Dec  2 14:51:31 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun  7 17:18:12 2004
Subject: Some questions
Message-ID: <111CF63B7D2ED211830000805F65A2FF01804962@OTTMAIL2>


Eve Maler wrote
> Not that I don't respect RDF's power, but personally, I think the key *is*
> common vocabularies.  We may have to start small, and they may just be hub
> formats that get mapped to/from a lot, but agreeing on semantics is the
> pill that has to be swallowed.  Even RDF depends on this, particularly on
> an open system such as the Web where you can't really control or influence
> the habits of content creators.  If you want to indicate that you are the
> author of a certain page, at the very least you have to refer to a widely
> understood "author" semantic in order for author-criterion searching to be
> of any use to your audience.  Whether it's an RDF property or a well-known
> namespace or whatever doesn't seem to matter as much.

	I agree; if someone chooses to define "author" to be what someone else
	uses for "garage mechanic" then there is no advantage to common syntax.
	Even assuming we rely on common English usage, there are multiple
	representations and arbitrary decisions in mapping English to logic
	(of which RDF is a disguised form).
	For example, suppose we want to represent "John loves Mary".
	We could represent this as the triple

		{John, loves, Mary}

	or it could be represented as

		{person001, loves, person002}
		{person001, name, John}
		{person002, name, Mary}

	both correctly represent the statement in RDF triples. 
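
	A rough Python sketch of why the choice matters, assuming triples
	are modelled as plain (subject, predicate, object) tuples; the
	function names who_loves and who_loves_by_name are hypothetical.
	A query written for the first encoding finds nothing in the second.

```python
# Two encodings of "John loves Mary" as (subject, predicate, object) triples.
direct = {("John", "loves", "Mary")}

indirect = {
    ("person001", "loves", "person002"),
    ("person001", "name", "John"),
    ("person002", "name", "Mary"),
}

def who_loves(triples, beloved):
    """Naive query: subjects of a 'loves' triple whose object is literally `beloved`."""
    return {s for (s, p, o) in triples if p == "loves" and o == beloved}

def who_loves_by_name(triples, beloved_name):
    """Query that knows about the 'name' indirection used by the second encoding."""
    names = {s: o for (s, p, o) in triples if p == "name"}
    return {
        names[s]
        for (s, p, o) in triples
        if p == "loves" and names.get(o) == beloved_name and s in names
    }

print(who_loves(direct, "Mary"))            # finds John
print(who_loves(indirect, "Mary"))          # finds nothing: "Mary" is only a name value
print(who_loves_by_name(indirect, "Mary"))  # finds John via the name triples
```

	Both data sets say the same thing, but consumers must agree on the
	encoding before they can ask the same question of each.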

	It would be advantageous to have a common repository of vocabularies,
	so that people would agree on meanings and syntax (i.e. do we use love
	or Loves or LUV, and is the first person the lover or the lovee, etc.).
	This would serve a similar function to a namespace declaration, but
	would deal with the semantics of the expressions.




From ht at cogsci.ed.ac.uk  Thu Dec  2 15:01:54 1999
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:18:12 2004
Subject: CFP: W3C XML Activity chat before XML '99
Message-ID: <f5bso1lqtcz.fsf@cogsci.ed.ac.uk>

You may have seen:

"Upcoming Events: 

    XML '99, 5-9 Dec '99 in Philadelphia (a GCA Conference)
    meet Judy Brewer, Bert Bos, Dan Connolly, Dave Raggett,
    Joseph Reagle, Chris Lilley, Michael Sperberg-McQueen
    from the W3C Team "
	-- http://www.w3.org/XML/

and, meanwhile, a lot of discussion in xml-dev and elsewhere about
what W3C is doing with XML (and HTML and ...) and how
and why it does all this stuff.

I propose we get together over an IRC channel and chat:

Who: everybody's welcome
	In addition to myself, Bert Bos, Ian Jacobs,
	Henry Thompson, and Daniel Veillard
	from the W3C Team plan to be there.

When: Friday, 3 December at 1500Z (9am U.S. Central time)
	for about an hour.

	(Apologies to the parts of the world where that's
	inconvenient. The log will go online, and hopefully
	we'll have more chats at different times of day in the future.)


Where: irc://irc.openprojects.net/#w3c
	i.e. channel #w3c on irc.openprojects.net

        about this IRC network, see
	Open Projects Network - New User?
        http://openprojects.nu/about.html

	stay tuned to the XML home page http://www.w3.org/XML/
	for other details.

What: The W3C XML Activity: Who, What, How, and Why


Recommended reading:

W3C Extensible Markup Language (XML) Activity 
	http://www.w3.org/XML/Activity

HTML Working Group Roadmap 
    18 November 1999, Shane McCarron, Dave Raggett
http://www.w3.org/TR/xhtml-roadmap

Schemas coming of age: use them
Tim Berners-Lee (timbl@w3.org)
Tue, 9 Nov 1999 15:31:59 -0500 
http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/0249.html

Web Architecture from 50,000 feet
http://www.w3.org/DesignIssues/Architecture

Web Architecture: Describing and Exchanging Data
W3C Note 7 June 1999
http://www.w3.org/1999/04/WebData

Web Architecture: Extensible Languages 
10 February 1998, Tim Berners-Lee, Dan Connolly
http://www.w3.org/TR/NOTE-webarch-extlang

xml-dev archives
http://www.lists.ic.ac.uk/hypermail/xml-dev/

and news:comp.text.xml

ht, on behalf of
  Dan Connolly, W3C
  http://www.w3.org/People/Connolly/
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/



From Sophie.Mabilat at apitech.fr  Thu Dec  2 15:35:06 1999
From: Sophie.Mabilat at apitech.fr (Sophie MABILAT)
Date: Mon Jun  7 17:18:12 2004
Subject: DTDs and Schemas...
Message-ID: <B1C8643B3AB0D21180250000C0B179CD05B68F@JUPITER>

Does anyone know a tool which converts DTDs into Schemas and Schemas into
DTDs ?

-------------------------------------------------------------------
Sophie MABILAT
Sophie.Mabilat@apitech.fr
-------------------------------------------------------------------
APITECH 113, rue Marietton 69009 Lyon FRANCE
Tél. : 04 78 43 49 30  Fax : 04 78 83 47 86
-------------------------------------------------------------------
www.zipbee.com
-------------------------------------------------------------------




From dhunter at Mobility.com  Thu Dec  2 15:56:04 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:12 2004
Subject: XML processing instruction survey
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC0145@cc20exch2.mobility.com>

From: Tim Bray [mailto:tbray@textuality.com]
Sent: Tuesday, November 30, 1999 9:57 PM
> 
> It's not really designed for people.  It's mostly designed for use
> by the XML processor to help figure out the encoding and make
> sure that this is really XML.
> 
> I'd think that using it at the application level would be not only
> uncommon but probably unwise.  I'd be interested to hear any positive
> responses to the query. -T.

As would I.  I'm currently writing YAXB (Yet Another XML Book), and I'm
finding myself hard-pressed to come up with intelligent examples of where
PIs might be useful.

The XML Declaration I have no problem with, but PIs...

David Hunter
david.hunter@mobileq.com
http://www.MobileQ.com



From smohr at voicenet.com  Thu Dec  2 16:02:28 1999
From: smohr at voicenet.com (Stephen T. Mohr)
Date: Mon Jun  7 17:18:12 2004
Subject: DTDs and Schemas...
References: <B1C8643B3AB0D21180250000C0B179CD05B68F@JUPITER>
Message-ID: <01f901bf3cde$582c5590$e9d9f2cc@omicron.com>

Extensibility's XML Authority will convert a DTD to a schema, but its
compliance with the W3C XML Schema draft is necessarily a bit dated.  It
will also export a DTD to an XML-DR (i.e., Microsoft schema preview) schema.

----- Original Message -----
From: Sophie MABILAT <Sophie.Mabilat@apitech.fr>
To: <xml-dev@ic.ac.uk>; <XML-L@listserv.heanet.ie>
Sent: Thursday, 2 December 1999 10:22
Subject: DTDs and Schemas...


> Does anyone know a tool which converts DTDs into Schemas and Schemas into
> DTDs ?
>
> -------------------------------------------------------------------
> Sophie MABILAT
> Sophie.Mabilat@apitech.fr
> -------------------------------------------------------------------
> APITECH 113, rue Marietton 69009 Lyon FRANCE
> Tél. : 04 78 43 49 30  Fax : 04 78 83 47 86
> -------------------------------------------------------------------
> www.zipbee.com
> -------------------------------------------------------------------




From cox_andy at bah.com  Thu Dec  2 16:13:43 1999
From: cox_andy at bah.com (Cox Andy)
Date: Mon Jun  7 17:18:13 2004
Subject: XML processing instruction survey
In-Reply-To: <805C62F55FFAD1118D0800805FBB428D02BC0145@cc20exch2.mobility.com>
Message-ID: <001a01bf3ce0$7dae9ec0$20aa509c@bah.com>

One example of "real-world" PI usage can be found in the W3C Recommendation
"Associating Style Sheets with XML documents" [1].

I have also seen them used in the Apache Cocoon project [2].

Andy

[1] http://www.w3.org/TR/xml-stylesheet/
[2] http://java.apache.org/cocoon/ (soon http://xml.apache.org/cocoon)
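
For illustration, a minimal Python sketch of how an application might read such a PI, assuming the standard-library xml.dom.minidom parser. The sample document, the href value style.xsl, and the helper name processing_instructions are made up for this example.

```python
from xml.dom import minidom

DOC = """<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="style.xsl"?>
<doc>Hello</doc>"""

def processing_instructions(xml_text):
    """Return (target, data) for each PI that is a direct child of the document."""
    document = minidom.parseString(xml_text)
    return [
        (node.target, node.data)
        for node in document.childNodes
        if node.nodeType == node.PROCESSING_INSTRUCTION_NODE
    ]

pis = processing_instructions(DOC)
print(pis)
```

Note that the XML declaration on the first line is not reported as a PI; only the xml-stylesheet instruction is. The pseudo-attributes inside the PI data (type, href) arrive as one unparsed string, so the application has to split them itself.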

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Hunter, David
> Sent: Thursday, 02 December 1999 10:56 AM
> To: 'XML Dev'
> Subject: RE: XML processing instruction survey
>
>
> From: Tim Bray [mailto:tbray@textuality.com]
> Sent: Tuesday, November 30, 1999 9:57 PM
> >
> > It's not really designed for people.  It's mostly designed for use
> > by the XML processor to help figure out the encoding and make
> > sure that this is really XML.
> >
> > I'd think that using it at the application level would be not only
> > uncommon but probably unwise.  I'd be interested to hear any positive
> > responses to the query. -T.
>
> As would I.  I'm currently writing YAXB (Yet Another XML Book), and I'm
> finding myself hard-pressed to come up with intelligent examples of where
> PIs might be useful.
>
> The XML Declaration I have no problem with, but PIs...
>
> David Hunter
> david.hunter@mobileq.com
> http://www.MobileQ.com




From Curt.Arnold at hyprotech.com  Thu Dec  2 16:36:39 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:13 2004
Subject: Schema Question
Message-ID: <61DAD58E8F4ED211AC8400A0C9B4687341553B@THOR>

Steven Livingstone wrote
>><element_properties>
>><color>red</color>
>><mixed_with_na>green</mixed_with_na>
>>Is there a way to specify a variable for the element name, but set its
>>type, say, to string so that any element created under <element_properties>
>>could be called anything but follow predetermined validation ?


The concept of archetypes in the W3C Schema draft was motivated (at least
in my interpretation) by the desire to do something like what you suggested.
The classic example would be to create an Address archetype and use it to
define ShipTo and Billing elements that have the same content model.

However, this does not allow a document author to make up a new element name
and have the parser mystically figure out it should be an address (or
whatever) and validate it.  The list of all the acceptable elements must be
enumerated in the schema. (I could be wrong in my interpretation of this,
however.)

I guess if you consider an archetype as being an element without a name, you
could allow an archetype to appear in a content model and then any child
element could be validated against the content model.  However, choosing
between two potential archetypes (say in a choice of archetypes or a
sequence with optional archetypes) may require you to look at their content
to determine what archetype applies.  I think this adds too much complexity
to schema validation for its value.

If you really want to do this (and to validate), I think you would let
"element_properties" have any content and then use XSLT (or something
else) to determine whether the content of element_properties matches your
pattern.
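
As a sketch of the "something else" option, here is a small Python post-validation check using the standard-library xml.etree.ElementTree. The rule it enforces (any child name is allowed, but each child must be text-only) is an assumption about the pattern wanted, and the function name invalid_properties is hypothetical.

```python
import xml.etree.ElementTree as ET

def invalid_properties(xml_text):
    """Return tags of children of the root element that are not text-only.

    Assumed rule for illustration: a child may have any name, but it must
    contain non-empty text, no attributes, and no child elements.
    """
    root = ET.fromstring(xml_text)
    bad = []
    for child in root:
        has_children = len(child) > 0
        has_text = bool((child.text or "").strip())
        if has_children or child.attrib or not has_text:
            bad.append(child.tag)
    return bad

GOOD = """<element_properties>
<color>red</color>
<mixed_with_na>green</mixed_with_na>
</element_properties>"""

BAD = """<element_properties>
<color>red</color>
<shape><round/></shape>
</element_properties>"""

print(invalid_properties(GOOD))  # no offending children
print(invalid_properties(BAD))   # reports the <shape> element
```

The schema would then only need to declare element_properties as open content, with this check run as a second pass.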



From paul at prescod.net  Thu Dec  2 16:39:19 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:18:13 2004
Subject: Some questions
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz>
Message-ID: <3846A0A9.5D8AB94D@prescod.net>

Matthew Gertner wrote:
> 
> I can see the enormous interest in having a text-based format for
> exchanging object-oriented data. But can't this be done with a good
> object-oriented XML schema language, of which the current W3C seems
> to be a very good start? 

One of XML's few innovations was making the schema optional. People
thought that was really important. If object representation requires a
schema then we're back where we started -- at least in that domain.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"I always wanted to be somebody, but I should have been more
specific." --Lily Tomlin



From paul at prescod.net  Thu Dec  2 16:44:44 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:18:13 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain>
Message-ID: <3846A1E9.AFB43B7A@prescod.net>

David Megginson wrote:
> 
> How does the schema tell me that foo represents a container for a
> collection of objects, bar represents an object, and hack and flurb
> represent the object's properties?

It probably doesn't, but Matthew is right that you could imagine a
schema language that DOES.

> Object exchange, while
> important, represents only one of many layers that can be built on top
> of XML, and if XML Schemas start trying to solve high-level problems
> for every specific domain, it will become an unimplementable mess.

I would argue that every domain, including documents, has a concept of
"objects" and a concept of "properties." XML's inability to represent
this is, in my opinion, a major flaw. It would be nice if schemas could
work around that flaw but I still think that there is a place in the
world for an instance-only syntax for objects and properties.

> RDF already made a similar mistake by mixing together a spec for
> object encoding in XML with a spec for representing knowledge about
> Web pages.

I agree that this was a mistake and it befuddled me for a while. I see
it as a different situation, however, because I can't imagine a problem
domain that does NOT need to know about structured objects and their
properties.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"I always wanted to be somebody, but I should have been more
specific." --Lily Tomlin



From paul at prescod.net  Thu Dec  2 16:47:47 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:18:13 2004
Subject: Some questions
References: <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com> <m3904lp3v7.fsf@localhost.localdomain>
Message-ID: <3846A2A4.5B795FA4@prescod.net>

David Megginson wrote:
> 
> That's just the problem with the spec -- if you forget the word/prefix
> "meta" completely, RDF is just an XML format for object exchange; it
> just happens that one possible application of those objects is
> metadata, and the RDF-Syntax spec mixes the two together.

Agreed. The RDF spec also mixes syntax and data model (while claiming
that the latter is independent of the former). I think that the data
model is useful enough on its own (especially in light of the problems
with RDF syntax) to deserve a separate spec.

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"I always wanted to be somebody, but I should have been more
specific." --Lily Tomlin



From begeddov at jfinity.com  Thu Dec  2 16:47:58 1999
From: begeddov at jfinity.com (Gabe Beged-Dov)
Date: Mon Jun  7 17:18:13 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz>
Message-ID: <38469260.E7D18FA3@jfinity.com>

Matthew Gertner wrote:

> Let's say I have an arbitrary object structure that I want to serialize
> and send down the pipe. Serializing a bunch of object attributes in XML
> is a no-brainer, and representing arbitrary references between objects
> is also fairly trivial if something like XLink is used (and we need
> XLink, there's surely no controversy about this).

XLink is explicitly intended to support hyperlinking rather than linking, i.e. you have an
instance level title on each object reference :-!  RDF is explicitly intended to support
linking rather than hyperlinking. You can specify a title for your object reference but you do
it at the class level rather than the instance level.

Even if you try to use XLink for  OO linking you will find that you end up with the
equivalent of void* pointers.  Let's call these kinds of links properties of the source
object. RDF allows you to specify the type of the property using a URI and (using RDF Schema)
specify the base type of the property value.  This is what you would expect to be able to do
for strongly typed pointers in OO interchange.

XLink doesn't even allow you to use a namespace qualified name for the "role" (this may have
been fixed but it will be done as a new attribute value type like qname).  It certainly
doesn't touch being able to specify a type for the property value.  The XML Schema group may
end up supporting strongly typed references but I wouldn't be surprised if this fell off the
plate.

In short, RDF (and RDF Schema) support OO interchange in a pretty straightforward manner
TODAY. David Megginson's work on the DATAX toolkit shows how straightforward it can be to use
RDF.  As part of my work at Rogue Wave, I participated in the development of several
alternative C++/XML frameworks. We didn't use RDF because of the lack of tools. If I had to
do it over again and choose between RDF + RDFSchema today and XML + XLink + XMLSchema
tomorrow for OO interchange I know which way I would go.

Cordially from Corvallis,

Gabe Beged-Dov
http://www.jfinity.com/gabe




From paul at prescod.net  Thu Dec  2 16:55:36 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:18:13 2004
Subject: Some questions
References: <Walter Underwood's message of "Wed, 01 Dec 1999 14:02:47 -0800">
	 <3.0.5.32.19991201140247.00c7bae0@corp.infoseek.com> <4.2.0.58.19991201175459.00baea50@abnaki>
Message-ID: <3846A477.FEC9503C@prescod.net>

"Eve L. Maler" wrote:
> 
> When people talk about RDF, the "meta" part is what I have trouble with in
> general.  In what way is markup not metadata?  In what way are element
> content and attribute values not also metadata (depending on what you do
> with them)?  It feels weird for one particular data model to claim to have
> cornered the metadata market.

Here are definitions I use that are mostly free of the ambiguity people
typically associate with the words content and metadata. Metadata is
property/value oriented so that you can ask questions in terms of "what
is the value of this property". Content is list within list oriented so
that you can ask: "what comes before this item, and what comes after
it."

RDF data is content if you look at the XML level (because the XML data
model doesn't make the <TITLE> element addressable as a property) but it
is metadata if you look at the RDF level (because RDF really WOULD make
the <TITLE> element addressable as a TITLE property).
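
A small Python sketch of the two views, using the standard-library xml.etree.ElementTree. The BOOK document and its values are made-up sample data, and the dictionary is only a stand-in for a property-oriented model such as RDF, not RDF itself.

```python
import xml.etree.ElementTree as ET

DOC = "<BOOK><TITLE>Emma</TITLE><AUTHOR>Jane Austen</AUTHOR></BOOK>"
book = ET.fromstring(DOC)

# Content view: an ordered list of children; questions are positional
# ("what comes before this item, and what comes after it?").
children = list(book)
title_index = [c.tag for c in children].index("TITLE")
after_title = children[title_index + 1].tag

# Metadata view: property/value pairs; questions are of the form
# "what is the value of this property?".
properties = {c.tag: c.text for c in children}
title_value = properties["TITLE"]

print(after_title)
print(title_value)
```

The same bytes support both readings; the difference is in which questions the data model lets you ask directly.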

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"I always wanted to be somebody, but I should have been more
specific." --Lily Tomlin



From anderst at toolsmiths.se  Thu Dec  2 17:16:55 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:18:13 2004
Subject: Any XML Schemas validators out yet ?
Message-ID: <3846A9E7.B14067F1@toolsmiths.se>

Hi All

I have just started to write a new RPC using XML for content transfer and
want to use the new XML Schema proposal instead of DTDs.

So I'm wondering if there are any tools that can validate XML Schemas
themselves and maybe also validate XML documents using XML Schemas?

Regards Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From steven.livingstone at scotent.co.uk  Thu Dec  2 17:26:08 1999
From: steven.livingstone at scotent.co.uk (Steven Livingstone, ITS, SENM)
Date: Mon Jun  7 17:18:13 2004
Subject: Any XML Schemas validators out yet ?
Message-ID: <8DCB90532FF7D211B34400805FD48853B56DBA@SENMAIL3>

Yep,
XML Authority from extensibility.com

cheers
Steven

Steven Livingstone - http://www.deltabiz.com
07771 957 280 or +447771957280

Professional Site Server 3, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696
Professional Site Server 3.0 Commerce Edition, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505


> -----Original Message-----
> From:	Anders W. Tell [SMTP:anderst@toolsmiths.se]
> Sent:	2 December 1999 17:19
> To:	xml-dev@ic.ac.uk
> Subject:	Any XML Schemas validators out yet ?
> 
> Hi All
> 
> I have just started to write a new RPC using XML for content transfer and
> want to use the new XML Schema proposal instead of DTDs.
> 
> So I'm wondering if there are any tools that can validate XML Schemas
> themselves and maybe also validate XML documents using XML Schemas?
> 
> Regards Anders
> --
> /_/_/_/_/_/_/_/_/_/_/_/_/_/_/
> /  Financial Toolsmiths AB  /
> /  Anders W. Tell           /
> /_/_/_/_/_/_/_/_/_/_/_/_/_/_/



From wunder at infoseek.com  Thu Dec  2 17:27:08 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:13 2004
Subject: Some questions
In-Reply-To: <3.0.32.19991201154526.014c3710@pop.intergate.ca>
Message-ID: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com>

At 03:46 PM 12/1/99 -0800, Tim Bray wrote:
>
>But I think metadata wins.  If you count hits on Internet search engines,
>the Yahoo and ODP directories, which are both human-constructed metadata, 
>absolutely wipe out any fulltext search engine you can name, ...

This is a subtle issue -- in aggregate, the metacontent is used 
more, but each user spends more time with content than with metacontent. 
The better the directory or search engine, the less time you need to 
spend with it (an interesting conflict when you are ad-supported).

Here are stages of having the info (content) that you want,
ordered in increasing amounts of wasted time.

1. I have the information.
2. I know where the information is.
3. I know it exists, but I don't know where it is.
4. I don't know if it exists.

Only the last two need some sort of metacontent or finding aid.

Organizing and indexing content is a time-saver, and sometimes that
is essential. Sometimes, the metacontent has the whole answer (which
companies sell rhinestone tiaras), but most people really want to 
buy the tiara.

So I still put my money on Jane Austen or the OED over the 
card catalog. Heck, I'll put my money on Fanny Burney or 
40,000 Words over the card catalog.

On the other hand, I strongly agree that metacontent should be 
interchangeable, both in syntax (RDF) and in data (e.g., AACR2
for author names). I just wish that the RDF spec was as clear
as AACR2.

wunder
PS: My wife did need to buy a rhinestone tiara, and I was really
impressed by the results for that search. Who would have guessed
that the web has dozens of places to buy those?
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946



From fscheng at netzero.net  Thu Dec  2 17:27:38 1999
From: fscheng at netzero.net (Frank Biz)
Date: Mon Jun  7 17:18:13 2004
Subject: How do embed carriage return/new line into the data?
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain>
Message-ID: <009601bf3cea$40e58260$644b8fcd@intervoice.com>

I'm fairly new to this community. Please help me answer this very simple
question.

Thanks,
Frank.

----- Original Message -----
From: "David Megginson" <david@megginson.com>
To: <xml-dev@ic.ac.uk>
Sent: Thursday, December 02, 1999 8:39 AM
Subject: Re: Object-oriented serialization (Was Re: Some questions)


> Matthew Gertner <matthew@praxis.cz> writes:
>
> > A schema gives you this information too. The problem of how to attach a
> > schema to an instance is not yet resolved, but it is a purely syntactic
> > consideration and a satisfactory solution will be found. This then tells
> > you what class a given instance belongs to. The identity can be
> > specified using an ID attribute; this is exactly the way it is done in
> > RDF. That an element represents a relationship is implicit in the
> > content model of the element.
>
> I still don't follow.  Perhaps I need to reread the XML Schema spec,
> but given
>
>   <foo>
>    <bar id="xxx">
>     <hack>David</hack>
>     <flurb>Megginson</flurb>
>    </bar>
>   </foo>
>
> How does the schema tell me that foo represents a container for a
> collection of objects, bar represents an object, and hack and flurb
> represent the object's properties?
>
> > SAX is great as far as it goes, but we seem to be agreeing that an
> > additional layer is needed on top. This layer is not the DOM.
>
> It can be.  The DOM represents a domain-specific object layer that is
> useful for a wide subset of XML operations (especially document- and
> browser-oriented work).  There need to be many layers on top of XML,
> one for each domain -- it happens that many of those layers will share
> the need to encode objects, so a standard object layer sandwiched
> between XML and the domain-specific layers can save a lot of work.
>
> > There are a variety of efforts to create
> > domain-specific objects automatically from XML objects. I don't have a
> > list at the tips of my fingers, but if anyone does it would be a great
> > resource. They are out there because I keep bumping into them.
>
> One example is RDF.
>
> > To be quite blunt it seems a shame that a lot of really great work is
> > being put into the RDF effort (including a very valuable vocabulary
> > for collection classes, just to name one) instead of being
> > integrated more tightly into the overall XML architecture.
>
> I disagree strongly with the last part of that statement.  I'd argue
> the opposite -- higher-level layers should be as independent of XML as
> possible.  That's the only way to build good, layered architectures.
> XML does one thing (represent a tree structure in a character stream)
> very well: it's an excellent layer to build other layers on top of,
> but XML itself should stay as simple as possible so that it's
> applicable widely to many different fields.
>
> > I think that if the work being done on RDF were refocused to making
> > sure that XML Schemas do everything that the RDF advocates are
> > rightly claiming is necessary, that we will see a clear win in terms
> > of pushing the whole XML effort from a theoretical effort into a
> > major paradigm shift with extensive real-world implications.
>
> That would be another serious mistake.  Object exchange, while
> important, represents only one of many layers that can be built on top
> of XML, and if XML Schemas start trying to solve high-level problems
> for every specific domain, it will become an unimplementable mess.
> RDF already made a similar mistake by mixing together a spec for
> object encoding in XML with a spec for representing knowledge about
> Web pages.
>
>
> All the best,
>
>
> David
>
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
>




From fscheng at netzero.net  Thu Dec  2 17:32:01 1999
From: fscheng at netzero.net (Franklin Cheng)
Date: Mon Jun  7 17:18:13 2004
Subject: How to embed special characters (such as '<' , carriage return) into the data
References: <3.0.32.19991130114807.01475710@pop.intergate.ca>
Message-ID: <00b701bf3cea$f9efcf40$644b8fcd@intervoice.com>

Thanks in advance.
Frank.




From david at megginson.com  Thu Dec  2 18:15:44 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:13 2004
Subject: Content or Metadata?
In-Reply-To: Walter Underwood's message of "Thu, 02 Dec 1999 09:24:03 -0800"
References: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com>
Message-ID: <m3bt898b1w.fsf@localhost.localdomain>

Walter Underwood <wunder@infoseek.com> writes:

> So I still put my money on Jane Austen or the OED over the 
> card catalog. Heck, I'll put my money on Fanny Burney or 
> 40,000 Words over the card catalog.

The second example is an interesting choice.  After all, the full OED
would probably count as metadata to people who bother to make the
distinction: it contains headwords and subheadwords, grammatical
information, and definitions, but the bulk of the dictionary is made
up of references to other printed works (word in context citations),
just as the bulk of Yahoo! is made up of references to other Web
sites.

So, is the OED content or metadata?  I dunno -- that's why I try to
avoid the terms whenever I can.  

This is a long-standing problem though.  In my former field, Medieval
studies, there are numerous examples of originally marginal glosses
and commentary (metadata?) becoming independently-distributed texts
(content?).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From HZhou at HNTB.com  Thu Dec  2 19:07:43 1999
From: HZhou at HNTB.com (Hao Zhou)
Date: Mon Jun  7 17:18:13 2004
Subject: No subject
Message-ID: <C623B85D4158D311A67D00805FEA6C794314CA@CBSEX1>

unsubscribe xml-dev



From tbray at textuality.com  Thu Dec  2 19:14:59 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:13 2004
Subject: Content or Metadata?
Message-ID: <3.0.32.19991202111153.0150e870@pop.intergate.ca>

At 01:14 PM 12/2/99 -0500, David Megginson wrote:
>The second example is an interesting choice.  After all, the full OED
>would probably count as metadata to people who bother to make the
>distinction: 

These are murky waters.  But there are a couple of things that are
incontrovertibly true:

1. All metadata is data.  Given an aggregation of data items, each 
   application can and will make its own decisions as to which is "data"
   and which "meta".  Thus a common syntax for both, to the extent
   possible, is a good thing.
2. Not all data is metadata.  Examples: this email message; Chopin's
   Nocturnes; Tuxedo.gif.  

Operationally, my experience suggests that in stuff that is
not metadata, ordering matters.  The converse is true; if ordering matters, 
it's probably not metadata.   There are exceptions but you have to
work pretty hard. -Tim



From pandeng at telepath.com  Thu Dec  2 19:20:42 1999
From: pandeng at telepath.com (Steve Schafer)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <m3bt898b1w.fsf@localhost.localdomain>
References: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com> <m3bt898b1w.fsf@localhost.localdomain>
Message-ID: <3854c62c.84062872@90.0.0.40>

On 02 Dec 1999 13:14:35 -0500, David Megginson <david@megginson.com>
wrote:

>So, is the OED content or metadata?  I dunno -- that's why I try to
>avoid the terms whenever I can.  

It's all a matter of context, no? There exist an infinite number of
levels, each one "meta" to the one immediately below it.

-Steve Schafer




From spreitze at parc.xerox.com  Thu Dec  2 19:42:27 1999
From: spreitze at parc.xerox.com (Mike Spreitzer)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
Message-ID: <NCBBJANJAENGCPMNOIOCKEFHFLAA.spreitze@parc.xerox.com>

> Operationally, my experience suggests that in stuff that is
> not metadata, ordering matters.  The converse is true; if ordering matters,
> it's probably not metadata.   There are exceptions but you have to
> work pretty hard. -Tim

What about the list of authors of a scholarly paper?  Isn't that
metadata for which order matters?

Not sweating yet,
Mike




From david at megginson.com  Thu Dec  2 19:52:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
References: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
Message-ID: <14406.52640.332965.449524@localhost.localdomain>

Tim Bray writes:
 > At 01:14 PM 12/2/99 -0500, David Megginson wrote:
 > >The second example is an interesting choice.  After all, the full OED
 > >would probably count as metadata to people who bother to make the
 > >distinction: 
 > 
 > These are murky waters.  But there are a couple of things that are
 > incontrovertibly true:
 > 
 > 1. All metadata is data.  Given an aggregation of data items, each 
 >    application can and will make its own decisions as to which is "data"
 >    and which "meta".  Thus a common syntax for both, to the extent
 >    possible, is a good thing.
 > 2. Not all data is metadata.  Examples: this email message; Chopin's
 >    Nocturnes; Tuxedo.gif.  

Hmm -- see below.

 > Operationally, my experience suggests that in stuff that is
 > not metadata, ordering matters.  The converse is true; if ordering matters, 
 > it's probably not metadata.   There are exceptions but you have to
 > work pretty hard. -Tim

How about ranked search results, or the top ten Web sites?  I didn't
really have to work that hard -- that's why RDF has the horrible
kludge where the rdf:li property automatically changes into rdf:_1,
rdf:_2, etc.
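
For anyone who hasn't run into that renumbering, here is a minimal
sketch of it in C++.  It is not taken from any RDF implementation: the
function name numberMembers and the flat list of property names are
assumptions made only for illustration.  It rewrites each rdf:li inside
one container to rdf:_1, rdf:_2, and so on, in document order, and
leaves every other property name alone.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: rewrites each "rdf:li" member property of one
// container into the numbered form "rdf:_1", "rdf:_2", ... in document
// order.  All other property names pass through unchanged.
std::vector<std::string> numberMembers(const std::vector<std::string>& props)
{
    std::vector<std::string> out;
    out.reserve(props.size());
    std::size_t next = 1;  // ordinals start at 1, not 0
    for (const std::string& name : props) {
        if (name == "rdf:li") {
            out.push_back("rdf:_" + std::to_string(next++));
        } else {
            out.push_back(name);
        }
    }
    return out;
}
```

The ordering therefore lives in the property names themselves, which is
why ordered metadata such as ranked results can be carried at all.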

Here's a trickier example: is a film review metadata or data?  It's
prose and it's ordered, but I'm reading it only because I'm interested
in something else.  I could even extend that to a picture of a tuxedo
and beyond, but I'll spare the readers for now.

The point is that the content/metadata distinction is not a property
of the data but a property of the data's actual use.  If I use
something for its own sake, it's content; if I use something for
something else's sake, it's metadata.

Tim is right that Chopin's Nocturnes are much more likely to be used
for their own sake in most familiar contexts, but consider a
collection of metadata about influences behind a musical piece: even
there, there is no crisp line.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From andrewl at microsoft.com  Thu Dec  2 19:56:57 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:14 2004
Subject: XML RPC
Message-ID: <33D189919E89D311814C00805F1991F7F4A958@RED-MSG-08>

RE: "I have just started to write a new RPC using XML as content
transfer..."

See also http://XMLRPC.com and
http://news.cnet.com/news/0-1003-200-1474298.html .


Best wishes,
Andrew Layman



From greynolds at datalogics.com  Thu Dec  2 19:57:15 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B1700B@martinique>

That would be "paradata".
(http://www.amazon.com/exec/obidos/ASIN/0521424062/qid=944164377/sr=1-1/102-1972469-2427226)

-gregg


> -----Original Message-----
> From: Mike Spreitzer [mailto:spreitze@parc.xerox.com]
> Sent: Thursday, December 02, 1999 1:42 PM

 
> > Operationally, my experience suggests that in stuff that is
> > not metadata, ordering matters.  The converse is true; if ordering
> > matters, it's probably not metadata.   There are exceptions but you
> > have to work pretty hard. -Tim
> 
> What about the list of authors of a scholarly paper?  Isn't that
> metadata for which order matters?



From rev-bob at gotc.com  Thu Dec  2 20:05:05 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
Message-ID: <199912021504116.SM01084@Unknown.>

> > Operationally, my experience suggests that in stuff that is
> > not metadata, ordering matters.  The converse is true; if ordering matters,
> > it's probably not metadata.   There are exceptions but you have to
> > work pretty hard. -Tim
> 
> What about the list of authors of a scholarly paper?  Isn't that metadata for which
> order matters?

Maybe it matters to them, but not to me.  :)

Look, it's like I say on my site (if you catch the randomizer just right) - reality is 
holographic.  If you delve deep enough, any data you find will eventually serve as 
metadata for something else.  For instance (one of my favorites), digging into the roots of 
the word "testify" will eventually indicate that Greco-Roman society was pretty 
patriarchal in nature, even to the point of codifying this bias in their legal structure.  (The 
full chain of connections?  "Testify" comes from the same lexical root as "testicle" - 
because in Greco-Roman courts, you swore your oath on the family jewels.  Women not 
having testicles, this at least implies that a woman could not give testimony - which is an 
anti-woman bias in the legal structure.  Since you don't have such a thing in the court 
system without some social impetus, the natural conclusion is that the society regarded 
women as "less" than men - meaning that men ran things.)  Of course, this is far from 
relevant to XML, so I'll shut up about that now.  <g>

Perhaps this will get back to the thread at hand - has anyone yet figured out a decent 
way to attach accurate PICS ratings (esp. RSACi) to dynamic documents?  I've got a 
hack going right now that prevents subordinate data (random ads) from conflicting with 
the rating assigned to the primary data (the article on which the ad spot appears) - but 
that requires the use of an eight-field SQL query to select a conforming set of eligible ads 
(each of which is labeled with a minimum and maximum rating), and a random record is 
chosen from that set.  While this works, it is somewhat less than elegant....
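
The selection step described above can be sketched outside SQL as
follows.  This is an illustration under stated assumptions, not the
site's actual code: the names Rating, Ad, conforms, eligibleAds and
pickRandom are hypothetical, and it assumes the eight fields are a
minimum and a maximum level for each of the four RSACi categories
(violence, nudity, sex, language), with an ad eligible when the page's
level falls inside the ad's range in every category.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Assumed layout: four RSACi-style category levels, in the order
// violence, nudity, sex, language.
using Rating = std::array<int, 4>;

struct Ad {
    Rating min;  // lowest page rating the ad may appear on
    Rating max;  // highest page rating the ad may appear on
};

// True when the page rating lies inside the ad's range in every category.
bool conforms(const Ad& ad, const Rating& page)
{
    for (std::size_t i = 0; i < page.size(); ++i) {
        if (page[i] < ad.min[i] || page[i] > ad.max[i]) return false;
    }
    return true;
}

// Indices of all ads eligible for the given page rating.
std::vector<std::size_t> eligibleAds(const std::vector<Ad>& ads,
                                     const Rating& page)
{
    std::vector<std::size_t> out;
    for (std::size_t i = 0; i < ads.size(); ++i) {
        if (conforms(ads[i], page)) out.push_back(i);
    }
    return out;
}

// Picks one eligible index uniformly at random.  The caller must pass a
// non-empty list.
template <class Rng>
std::size_t pickRandom(const std::vector<std::size_t>& eligible, Rng& rng)
{
    assert(!eligible.empty());
    std::uniform_int_distribution<std::size_t> dist(0, eligible.size() - 1);
    return eligible[dist(rng)];
}
```

A caller would compute eligibleAds once per page and hand the result to
pickRandom with any standard generator such as std::mt19937.  The
filter is the same range test the query performs, so it is no more
elegant, only easier to read.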


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/

Download NeoPlanet at http://www.neoplanet.com




From andrewl at microsoft.com  Thu Dec  2 20:07:24 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:14 2004
Subject: How do embed carriage return/new line into the data?
Message-ID: <33D189919E89D311814C00805F1991F7F4A959@RED-MSG-08>

Regarding putting RDF features into XML, David Megginson wrote:

> I disagree strongly with the last part of that statement.  I'd argue
> the opposite -- higher-level layers should be as independent of XML as
> possible.  That's the only way to build good, layered architectures.
> XML does one thing (represent a tree structure in a character stream)
> very well: it's an excellent layer to build other layers on top of,
> but XML itself should stay as simple as possible so that it's
> applicable widely to many different fields.
>
> > I think that if the work being done on RDF were refocused to making
> > sure that XML Schemas do everything that the RDF advocates are
> > rightly claiming is necessary, that we will see a clear win in terms
> > of pushing the whole XML effort from a theoretical effort into a
> > major paradigm shift with extensive real-world implications.
>
> That would be another serious mistake.  Object exchange, while
> important, represents only one of many layers that can be built on top
> of XML, and if XML Schemas start trying to solve high-level problems
> for every specific domain, it will become an unimplementable mess.
> RDF already made a similar mistake by mixing together a spec for
> object encoding in XML with a spec for representing knowledge about
> Web pages.

I agree with David on every one of the points he makes above.  

I am at least as keen as the next person to use XML for transferring
structured data often originated or consumed by object systems, but it would
be bad design to make this the only use of XML schemas. 

Best wishes,
Andrew Layman  



From robin at isogen.com  Thu Dec  2 20:16:27 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
Message-ID: <Pine.GSO.3.96.991202140219.20104C-100000@grind>

WRT (especially):

> in stuff that is not metadata, ordering matters.  The converse is 
> true; if ordering matters, it's probably not metadata.

I don't think I agree, and it's not at all hard to find exceptions,
if I understand the question.  I think the distinction is indeed
POV (point of view), and in some cases, as simple as "view"
(projection).  Imagine an entire book, encoded character by
character, from beginning to end.  Which characters are "metadata"
but not "data"?  Any?  The book subunits (parts, chapters, sections,
subsections) have titles, which like the volume title, may be
regarded as "metadata" for the respective units, but they are
also "data."  For some purposes (an analytical bibliographer),
not only "order" is significant - so are many other matters of
spatial geometry with respect to the "characters" (and other
non-character properties); for other analysts (e.g.,
enumerative bibliography, descriptive cataloging), the "order" of
some character strings (in relation to others) is unimportant.

The distinction is rather like "content" (vs.) "not-content" --
fairly bogus, distracting, and confusing -- not to mention
problematic because it lies at the base of some bad markup language
designs.

My 2 cents.

-robin

------------------------------------------------------------------

On Thu, 2 Dec 1999, Tim Bray wrote:

> At 01:14 PM 12/2/99 -0500, David Megginson wrote:
> >The second example is an interesting choice.  After all, the full OED
> >would probably count as metadata to people who bother to make the
> >distinction: 
> 
> These are murky waters.  But there are a couple of things that are
> incontrovertibly true:
> 
> 1. All metadata is data.  Given an aggregation of data items, each 
>    application can and will make its own decisions as to which is "data"
>    and which "meta".  Thus a common syntax for both, to the extent
>    possible, is a good thing.
> 2. Not all data is metadata.  Examples: this email message; Chopin's
>    Nocturnes; Tuxedo.gif.  
> 
> Operationally, my experience suggests that in stuff that is
> not metadata, ordering matters.  The converse is true; if ordering matters, 
> it's probably not metadata.   There are exceptions but you have to
> work pretty hard. -Tim
> 




From clark.evans at manhattanproject.com  Thu Dec  2 20:50:04 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3854c62c.84062872@90.0.0.40>
Message-ID: <Pine.LNX.4.10.9912020350400.15285-100000@cauchy.clarkevans.com>


On Thu, 2 Dec 1999, Steve Schafer wrote:
> It's all a matter of context, no? There exist an infinite number of
> levels, each one "meta" to the one immediately below it.

Yes! 




From clark.evans at manhattanproject.com  Thu Dec  2 21:06:13 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <3854c62c.84062872@90.0.0.40>
Message-ID: <Pine.LNX.4.10.9912020355570.15285-100000@cauchy.clarkevans.com>

On Thu, 2 Dec 1999, Steve Schafer wrote:
> It's all a matter of context, no? There exist an infinite number of
> levels, each one "meta" to the one immediately below it.

I believe that it is a binary recursive pattern:


                                     meta-data  ...
                                  /
                         meta-data
                      /           \
             meta-data               data   ...
          /           \
         /               data  ...
  context
         \               meta-data ...
          \          /
             data                   meta-data ...
                     \           /
                         data
                                 \
                                    data  ...


Thus, you are right on about it being 
a "matter of context".

Hope this perspective helps,

Clark




From clark.evans at manhattanproject.com  Thu Dec  2 21:20:02 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:14 2004
Subject: Content or Metadata?
In-Reply-To: <Pine.GSO.3.96.991202140219.20104C-100000@grind>
Message-ID: <Pine.LNX.4.10.9912020407571.15285-100000@cauchy.clarkevans.com>

On Thu, 2 Dec 1999, Robin Cover wrote:
> The distinction is rather like "content" (vs.) "not-content" --
> fairly bogus, distracting, and confusing -- not to mention
> problematic because it lies at the base of some bad markup
> language designs.

It's confusing and problematic when the context
of the document is not taken into consideration
or when the document is used in more than one context 
without the proper (isomorphic) transformations to 
preserve meaning.  Furthermore, to add insult to injury, 
there is no such thing as context independence...

Perhaps explicit user perspectives / use cases 
are needed when doing document modeling.  Maybe 
inserting a few transformation steps between 
contexts would help?

;) Clark




From david at megginson.com  Thu Dec  2 21:29:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:15 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <14406.58446.675568.388482@localhost.localdomain>

I think that there is a growing need for a common C++ SAX 1.0
interface as XML moves more and more into high-performance
environments.  I have kept pointers that people sent to quite a few
existing attempts, but before I look those over, I'd like to try my
own off the top of my head.

I'll be posting three follow-up messages on SAX/C++ to stimulate
discussion:

1. Some C++-specific SAX design principles.
2. Implementation changes required or possible in C++.
3. My first stab at a core SAX 1.0 C++ interface.

I know that SAX2 is still being neglected, and I apologize.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 21:33:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:15 2004
Subject: SAX/C++: C++-specific design principles
Message-ID: <14406.58740.871829.541816@localhost.localdomain>

Here are the principles that I applied to creating my first draft
SAX/C++ interface:

1. Use references when there can never be a null value, pointers
   otherwise.

2. Pointers never change ownership -- if a Parser (for example) wants
   to own an InputSource, it needs to make its own copy.  The app has
   to free everything that it allocates, and the SAX driver, likewise.

3. Callbacks cannot be const, since they often change the state of the 
   client app.

4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
   with most existing C++ code.

5. Use char * rather than string, to avoid forcing a lot of allocation 
   overhead on the SAX driver.
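
To make the five principles concrete, here is a minimal sketch of what
an interface following them might look like.  It is an illustration
only and not the draft interface itself: the names DocumentHandler and
InputSource are borrowed from the Java SAX 1.0 API, while ToyDriver,
TextCollector and all signatures are assumptions.  The toy driver does
no parsing at all; it reports the source's system identifier as the
text of a single element so that the callbacks can be exercised.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Principles 3-5: callbacks are non-const and receive UTF-8 char* data
// that belongs to the driver and is valid only during the call.
class DocumentHandler {
public:
    virtual ~DocumentHandler() {}
    virtual void startElement(const char* name) = 0;
    virtual void endElement(const char* name) = 0;
    virtual void characters(const char* text, std::size_t length) = 0;
};

// Hypothetical, stripped-down input description.
struct InputSource {
    const char* systemId;  // UTF-8, owned by the application
};

// Stand-in for a real SAX driver (no actual XML parsing happens here).
class ToyDriver {
public:
    ToyDriver() : handler_(nullptr) {}

    // Principle 1: a pointer, because "no handler" is a legal state.
    // Principle 2: the driver stores the pointer but never owns it.
    void setDocumentHandler(DocumentHandler* handler) { handler_ = handler; }

    // Principle 1: a reference, because a source is always required.
    void parse(const InputSource& source)
    {
        if (!handler_) return;
        handler_->startElement("source");
        handler_->characters(source.systemId, std::strlen(source.systemId));
        handler_->endElement("source");
    }

private:
    DocumentHandler* handler_;
};

// Application-side handler.  It changes its own state in every callback
// (hence non-const) and copies any text it wants to keep.
class TextCollector : public DocumentHandler {
public:
    void startElement(const char*) override { ++depth; }
    void endElement(const char*) override { --depth; }
    void characters(const char* text, std::size_t length) override
    {
        this->text.append(text, length);  // copy: the buffer is the driver's
    }

    std::string text;
    int depth = 0;
};
```

An application would create a TextCollector on its own stack or heap,
pass its address to setDocumentHandler, call parse, and remain
responsible for destroying both objects afterwards.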


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From robertl1 at home.com  Thu Dec  2 21:37:46 1999
From: robertl1 at home.com (Robert La Quey)
Date: Mon Jun  7 17:18:15 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <38466487.328D1CFA@praxis.cz>
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
 <m3bt9hp6g4.fsf@localhost.localdomain>
 <38459FA4.DAA00E35@praxis.cz>
 <m366zpp3m8.fsf@localhost.localdomain>
 <38463AB4.36C5292B@praxis.cz>
 <m3aenta7qn.fsf@localhost.localdomain>
Message-ID: <3.0.6.32.19991202133035.04315e60@mail.dt1.sdca.home.com>

At 01:22 PM 12/2/99 +0100, you wrote:
>David Megginson wrote:
>> If you have a function loadXML(), you get a DOM tree or a bunch of SAX
>> events or something similar; if you have a function loadRDF(), you get
>> a collection of objects with attributes and relationships.  In either
>> case, a schema can tell you things like "element type/class B is a
>> kind of element type/class A", but that's secondary information; the
>> primary information is "element X is an object of class Y with
>> identifier Z, while element A represents a relationship between this
>> object and object C".
>
>A schema gives you this information too. The problem of how to attach a
>schema to an instance is not yet resolved, but it is a purely syntactic
>consideration and a satisfactory solution will be found. This then tells
>you what class a given instance belongs to. The identity can be
>specified using an ID attribute; this is exactly the way it is done in
>RDF. That an element represents a relationship is implicit in the
>content model of the element.
>
>SAX is great as far as it goes, but we seem to be agreeing that an
>additional layer is needed on top. This layer is not the DOM. One of the
>lessons that I learned from my time at POET Software is that, although
>we had an excellent generic API, the vast majority of our customers
>wanted to work with real C++ (and later Java) classes in their problem
>domain. But there is nothing to say that a loadXML() function must
>return a DOM tree. There are a variety of efforts to create
>domain-specific objects automatically from XML objects. I don't have a
>list at the tips of my fingers, but if anyone does it would be a great
>resource. They are out there because I keep bumping into them.
>
>> If you're interested in a collection of objects in the first place,
>> why should you have to see or know about XML elements and attributes
>> at all?  Or to put it a different way, why should people constantly
>> have to redo the work of extracting objects from XML, when they're all
>> trying to do the same thing?
>
>Once again, there are already tools that provide this functionality
>across applications (i.e. they can be plugged in and used without
>additional development). The interest of XML is essentially as a way to
>serialize objects and send them across a network, as you also stated.
>
>> I think that reasonable people can argue that RDF is not the best
>> solution to the problem of object exchange in XML, but I am somewhat
>> surprised to hear people deny that the problem even exists: there is
>> an enormous demand for exchanging objects in XML (businesses exchange
>> a lot of structured data), and it's hard work to have to figure out
>> over and over how to construct objects from a SAX stream or a DOM tree
>> especially when programmers with XML knowledge are scarce and
>> expensive.
>> 
>> I have no doubt that we need an abstract object layer on top of XML.
>> Right now, RDF is the best solution currently available (XMI also has
>> its advocates), but I'm ready to listen about anything better.
>
>In no way do I doubt the importance of being able to exchange objects in
>XML, but I do have serious reservations about RDF as the way to do this,
>and they have nothing to do with the hairy syntax or hard-to-understand
>spec. What is lacking right now is an overarching approach to using XML
>in real-world applications ... 

uhh guys, the thread on Web Architecture, to which essentially no one replied,
was addressed to exactly these issues. Oh well, it is good to see the issues 
raised ...

A small rewrite to fit this thread. 

<synopsis>
Layer Purpose                           Example/Description
3)  Application                         e.g. [PICS], [OCS], [RSS]

2a) Resource Description Framework      Dublin Core
                                        Describes a particular choice of
                                        data structures (property lists)
                                        to be used by applications
2b) Other Application Oriented Data Structures (or objects)

2)  Object Definition                   Standard way to represent objects
                                        in ML

1)  ML                                  ML used for data serialization
                                        and transport and IDL
</synopsis>

I left out namespaces for the moment. 

The basic problem remains a lack of a clearly articulated vision of what 
the web of the future could/should be. 


Bob La Quey



From david at megginson.com  Thu Dec  2 21:39:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:15 2004
Subject: SAX/C++: Changes for C++
Message-ID: <14406.59075.218048.437305@localhost.localdomain>

Here are some of the differences between the SAX/Java interfaces and the 
SAX/C++ interfaces:

- lots of const
- C++ const char * for Java String throughout (and, thus, UTF-8
  instead of UTF-16)
- InputSource doesn't have an equivalent of Java Reader (no getReader
  method)
- SAXException does not allow an embedded exception, because there's
  no need to tunnel exceptions in C++ (you can always throw any
  exception)
- DocumentHandler::characters and DocumentHandler::ignorableWhitespace 
  don't need the 'start' argument, since they can be passed a pointer
  to the start position in an existing array (that's not possible in
  Java)
- HandlerBase omitted, since the classes can contain their own default 
  implementations
- I haven't figured out what to do with Parser::setLocale yet
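
A small sketch of two of the differences listed above: passing a pointer
to the start position instead of a 'start' argument, and letting any
exception propagate without tunnelling. TextCollector, RejectText,
reportRun and demo are hypothetical names, not part of the draft.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical handler: collects character data passed as (pointer, length).
class TextCollector
{
public:
  virtual ~TextCollector () {}
  virtual void characters (const char * ch, std::size_t length)
  {
    _text.append(ch, length);
  }
  const std::string & text () const { return _text; }
private:
  std::string _text;
};

// Hypothetical handler that throws an ordinary C++ exception.
class RejectText : public TextCollector
{
public:
  virtual void characters (const char *, std::size_t)
  {
    throw std::runtime_error("text not allowed");
  }
};

// Hypothetical driver fragment: reports buffer[start, start + length)
// without copying, by pointing into the existing array.
void reportRun (TextCollector & handler, const char * buffer,
                std::size_t start, std::size_t length)
{
  handler.characters(buffer + start, length);
}

void demo ()
{
  const char buffer[] = "<p>hello</p>";

  TextCollector collector;
  reportRun(collector, buffer, 3, 5);
  assert(collector.text() == "hello");

  // The exception passes straight through the driver code; no wrapper
  // exception is needed to carry it.
  RejectText strict;
  bool caught = false;
  try {
    reportRun(strict, buffer, 3, 5);
  } catch (const std::runtime_error & e) {
    caught = (std::strcmp(e.what(), "text not allowed") == 0);
  }
  assert(caught);
}
```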


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec  2 21:41:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:15 2004
Subject: SAX/C++: First interface draft
Message-ID: <14406.59198.949047.2487@localhost.localdomain>

I have just drafted this interface, and haven't even run it through a
C++ compiler yet.  For clarity, I've omitted constructors and
destructors, as well as most of what will be inline implementations.

Notes: I haven't looked at other C++ efforts yet, but I will try to do
so now.  Eventually, this should be in a special C++ namespace.

sax.h
====================8<====================8<====================
#ifndef __SAX_HXX
#define __SAX_HXX

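#include <stddef.h>   // size_t
#include <istream>

class Locator;        // not defined in this draft; declared so the header compiles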

class InputSource
{
public:
  virtual const char * getPublicId (void) const;
  virtual void setPublicId (const char * publicId);

  virtual const char * getSystemId (void) const;
  virtual void setSystemId (const char * systemId);

  virtual std::istream * getInputStream (void) const;
  virtual void setInputStream (std::istream * in);

protected:
  const char * _publicId;
  const char * _systemId;
  std::istream * _in;
};


class AttributeList
{
public:
  virtual size_t getLength (void) const = 0;

  virtual const char * getName (size_t pos) const = 0;
  virtual const char * getType (size_t pos) const = 0;
  virtual const char * getValue (size_t pos) const = 0;

  virtual const char * getType (const char * name) const;
  virtual const char * getValue (const char * name) const;
};


class SAXException
{
public:
  virtual const char * getMessage (void) const;
protected:
  const char * _message;
};


class SAXParseException : public SAXException
{
public:
  virtual const char * getPublicId (void) const;
  virtual const char * getSystemId (void) const;
  virtual const size_t getLineNumber (void) const;
  virtual const size_t getColumnNumber (void) const;

protected:
  const char * _publicId;
  const char * _systemId;
  const size_t _lineNumber;
  const size_t _columnNumber;
};


class EntityResolver
{
public:
  virtual const InputSource * resolveEntity (const char * publicId,
					     const char * systemId);
};


class DTDHandler
{
public:
  virtual void notationDecl (const char * name,
			     const char * publicId,
			     const char * systemId) {}
  virtual void unparsedEntityDecl (const char * name,
				   const char * publicId,
				   const char * systemId,
				   const char * notationName) {}
};


class DocumentHandler
{
public:
  virtual void setDocumentLocator (const Locator &locator);
  virtual void startDocument (void) {}
  virtual void endDocument (void) {}
  virtual void startElement (const char * name, const AttributeList &atts) {}
  virtual void endElement (const char * name) {}
  virtual void characters (const char * ch, size_t length) {}
  virtual void ignorableWhitespace (const char * ch, size_t length) {}
  virtual void 
  processingInstruction (const char * target, const char * data) {}

protected:
  Locator * _locator;
};


class ErrorHandler
{
public:
  virtual void warning (const SAXParseException &e) {}
  virtual void error (const SAXParseException &e) {}
  virtual void fatalError (const SAXParseException &e) {}
};


class Parser
{
public:
  // setLocale??

  virtual void setEntityResolver (EntityResolver &resolver);
  virtual void setDTDHandler (DTDHandler &handler);
  virtual void setDocumentHandler (DocumentHandler &handler);
  virtual void setErrorHandler (ErrorHandler &handler);

  virtual void parse (const char * systemId);
  virtual void parse (const InputSource &input) = 0;

protected:
  EntityResolver * _resolver;
  DTDHandler * _dtdHandler;
  DocumentHandler * _documentHandler;
  ErrorHandler * _errorHandler;
};

#endif
====================8<====================8<====================
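
As an illustration of how an application might use handlers of this
shape, here is a self-contained sketch. AttributeList and DocumentHandler
are cut-down copies of the draft classes (virtual destructors added);
NoAttributes, ElementCounter, fireSampleEvents and demo are hypothetical,
and fireSampleEvents merely stands in for a real driver.

```cpp
#include <cassert>
#include <cstddef>

// Cut-down copies of two draft classes, enough to compile on their own.
class AttributeList
{
public:
  virtual ~AttributeList () {}
  virtual std::size_t getLength () const = 0;
  virtual const char * getName (std::size_t pos) const = 0;
  virtual const char * getValue (std::size_t pos) const = 0;
};

class DocumentHandler
{
public:
  virtual ~DocumentHandler () {}
  virtual void startElement (const char *, const AttributeList &) {}
  virtual void endElement (const char *) {}
  virtual void characters (const char *, std::size_t) {}
};

// Hypothetical empty attribute list used by the stand-in driver below.
class NoAttributes : public AttributeList
{
public:
  virtual std::size_t getLength () const { return 0; }
  virtual const char * getName (std::size_t) const { return nullptr; }
  virtual const char * getValue (std::size_t) const { return nullptr; }
};

// Application handler: overrides only the callbacks it cares about and
// inherits the empty defaults for the rest.
class ElementCounter : public DocumentHandler
{
public:
  ElementCounter () : _count(0), _depth(0), _maxDepth(0) {}
  virtual void startElement (const char *, const AttributeList &)
  {
    ++_count;
    if (++_depth > _maxDepth) _maxDepth = _depth;
  }
  virtual void endElement (const char *) { --_depth; }
  std::size_t count () const { return _count; }
  std::size_t maxDepth () const { return _maxDepth; }
private:
  std::size_t _count, _depth, _maxDepth;
};

// Stand-in for a real driver: fires the events for <a><b/><b/></a>.
void fireSampleEvents (DocumentHandler & handler)
{
  NoAttributes none;
  handler.startElement("a", none);
  handler.startElement("b", none);
  handler.endElement("b");
  handler.startElement("b", none);
  handler.endElement("b");
  handler.endElement("a");
}

void demo ()
{
  ElementCounter counter;
  fireSampleEvents(counter);
  assert(counter.count() == 3);
  assert(counter.maxDepth() == 2);
}
```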

Comments?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From LWatanab at JetForm.com  Thu Dec  2 21:56:03 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun  7 17:18:15 2004
Subject: SAX/C++: First interface draft
Message-ID: <111CF63B7D2ED211830000805F65A2FF0180496C@OTTMAIL2>


	I would suggest making the String class external, with a
well-defined minimal interface. Then the user could implement the interface
in their own string classes, typedef the String (or DOMString or whatever)
as their class, and compile it together. 

	This would allow an application to use its own String
implementation, which would save the trouble of a lot of conversions. 
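
	One possible reading of this suggestion, as a rough sketch only.
The typedef name SAXString, the cut-down DocumentHandler, NameRecorder,
reportElement and demo are all hypothetical; std::string is used here
merely as an example of a class that meets the minimal interface.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// The application picks the string type by editing this one typedef.
// Minimal interface assumed of the chosen class in this sketch:
//   SAXString(const char *, std::size_t)   construct from a character run
//   copy assignment
typedef std::string SAXString;

// Cut-down handler whose signature uses the application's string type.
class DocumentHandler
{
public:
  virtual ~DocumentHandler () {}
  virtual void startElement (const SAXString &) {}
};

// Application side: stores the name with no conversion step.
class NameRecorder : public DocumentHandler
{
public:
  virtual void startElement (const SAXString & name) { _last = name; }
  const SAXString & last () const { return _last; }
private:
  SAXString _last;
};

// Driver side: builds a SAXString once from its own buffer.
void reportElement (DocumentHandler & handler,
                    const char * buffer, std::size_t length)
{
  assert(buffer != nullptr);
  handler.startElement(SAXString(buffer, length));
}

void demo ()
{
  NameRecorder recorder;
  reportElement(recorder, "title>", 5);   // name is the first 5 chars
  assert(recorder.last() == "title");
}
```

	The cost of this approach is that the driver and the application
have to be compiled against the same typedef.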

> -----Original Message-----
> From:	David Megginson [SMTP:david@megginson.com]
> Sent:	Thursday, December 02, 1999 4:40 PM
> To:	XMLDev list
> Subject:	SAX/C++: First interface draft
> 
> I have just drafted this interface, and haven't even run it through a
> C++ compiler yet.  For clarity, I've omitted constructors and
> destructors, as well as most of what will be inline implementations.
> 
> Notes: I haven't looked at other C++ efforts yet, but I will try to do
> so now.  Eventually, this should be in a special C++ namespace.
> 
> 



From wunder at infoseek.com  Thu Dec  2 21:59:40 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:15 2004
Subject: A processing instruction for robots
Message-ID: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>

HTML has a robots meta tag. XML has no standard way to
declare the same information. Here is a proposal, with
an implementation:

  http://homepages.go.com/~wunder0/robots-pi.html

Comments are welcome. This is also posted to the robots
list.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946



From tbray at textuality.com  Thu Dec  2 22:11:18 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:15 2004
Subject: Content or Metadata?
Message-ID: <3.0.32.19991202140921.01490100@pop.intergate.ca>

At 11:41 AM 12/2/99 PST, Mike Spreitzer wrote:
>What about the list of authors of a scholarly paper?  Isn't that metadata for which order
>matters?

Yep, in fact that's the one use-case that kept coming up during the early 
stage of RDF design.  Here's another one for free: content models.  But
the notion that there is some ordering on a document's author, title, 
and date-of-publication is surprising and unnatural. -T.




From tbray at textuality.com  Thu Dec  2 22:11:15 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:15 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <3.0.32.19991202141224.0148fc60@pop.intergate.ca>

At 04:27 PM 12/2/99 -0500, David Megginson wrote:
>I'll be posting three follow-up messages on SAX/C++ to stimulate
>discussion:

Good idea, one question.  Any way to do C at the same time? -Tim




From DuCharmR at moodys.com  Thu Dec  2 22:12:29 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:18:15 2004
Subject: Any XML Schemas validators out yet ?
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2781B7@MDYNYCMSX1>

>So I'm wondering if there are any tools that can validate XML Schemas
>themselves
>and maybe also validate XML documents using XML Schemas?

At least for W3C Schemas:

Since schemas are XML documents themselves, you can take the DTD in Appendix B
of the schema proposal and validate your schema against that using any
validating parser.

To validate XML documents against these schemas, the only thing I know of
out there is the Xerces parser at xml.apache.org.

Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.



From robin at isogen.com  Thu Dec  2 22:33:29 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun  7 17:18:15 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202140921.01490100@pop.intergate.ca>
Message-ID: <Pine.GSO.3.96.991202162508.20104E-100000@grind>


> the notion that there is some ordering on a document's author, title,
> and date-of-publication is surprising and unnatural. -T.

Depends...

He said "list of authors."  As in, multiple authors, where the
principal author is listed first (regardless of the spelling of
surname and Western-style collation sequence), the "next-most-
principal-author" is listed second in the order(-ed, -able)
author list, reflecting the contract...  blah blah.

Of course, such notions reflect perspective, which may or may
not be implicit/explicit in the style rules and underlying
assumptions of the house.  

Hence: "views, perspectives, projections, purposes."  No one
of them is fixed.  The poem escapes the intent of the author,
and becomes the property of the collective consciousness of the
community.

-rcc

-----------------------------------------------------------------

On Thu, 2 Dec 1999, Tim Bray wrote:

> At 11:41 AM 12/2/99 PST, Mike Spreitzer wrote:
> >What about the list of authors of a scholarly paper?  Isn't that metadata for which order
> >matters?
> 
> Yep, in fact that's the one use-case that kept coming up during the early 
> stage of RDF design.  Here's another one for free: content models.  But
> the notion that there is some ordering on a document's author, title, 
> and date-of-publication is surprising and unnatural. -T.
> 
> 




From jwtodd at pacbell.net  Thu Dec  2 23:01:30 1999
From: jwtodd at pacbell.net (James Todd)
Date: Mon Jun  7 17:18:15 2004
Subject: i'd like to merge two docs ...
References: <3845EAFA.341C3122@pacbell.net>
 <005001bf3c6e$ef8d57b0$5a672382@us.oracle.com> <38461554.61917EBD@pacbell.net>
 <000d01bf3c93$6febd9d0$3b652382@us.oracle.com> <3846F703.DF8CED3@pacbell.net>
Message-ID: <3846FC10.D1B6E795@pacbell.net>


quick recap:

ahhh ... i just figured it out. with ProjectX there is a
com.sun.xml.tree.DocumentEx.changeNodeOwner(Node) that does the
trick. so, if i exchange the removeChild() call with a changeNodeOwner()
call i can quite readily rehost a doc fragment.

any ideas as how to do this, if possible, with a standard dom api?
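
a toy model of the situation, for illustration only. it is written in
C++ and is not the ProjectX or W3C DOM API: Node, Document, changeOwner
and demo are hypothetical. it mirrors the DOM Level 1 rule that
appendChild raises WRONG_DOCUMENT_ERR when the new child was created by
a different document, and shows why re-owning the subtree first makes
the append succeed.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

struct Document;   // toy stand-in, only used as an identity

struct Node
{
  std::string name;
  const Document * owner;
  std::vector<std::unique_ptr<Node>> children;

  Node (const std::string & n, const Document * d) : name(n), owner(d) {}

  // Refuses a child that belongs to another document. On failure the
  // caller's pointer is left untouched.
  void appendChild (std::unique_ptr<Node> && child)
  {
    if (child->owner != owner)
      throw std::logic_error("WRONG_DOCUMENT_ERR");
    children.push_back(std::move(child));
  }
};

struct Document {};

// Recursively re-own a detached subtree (toy analogue of the
// changeNodeOwner call described above).
void changeOwner (Node & node, const Document * newOwner)
{
  node.owner = newOwner;
  for (auto & child : node.children)
    changeOwner(*child, newOwner);
}

void demo ()
{
  Document one, two;
  Node oneElement("target", &one);
  std::unique_ptr<Node> twoRoot(new Node("root", &two));
  twoRoot->appendChild(std::unique_ptr<Node>(new Node("item", &two)));

  // Appending across documents is refused.
  bool refused = false;
  try {
    oneElement.appendChild(std::move(twoRoot));
  } catch (const std::logic_error &) {
    refused = true;
  }
  assert(refused && twoRoot);

  // After re-owning the whole subtree, the append succeeds.
  changeOwner(*twoRoot, &one);
  oneElement.appendChild(std::move(twoRoot));
  assert(oneElement.children.size() == 1);
  assert(oneElement.children[0]->children[0]->owner == &one);
}
```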

thx,

- james

James Todd wrote:


> > |
> > | Steve Muench wrote:
> > |
> > | > Assuming you have XML DOM Documents "one" and "two"
> > | > and that "oneElement" is the element in doc "one"
> > | > to which you'd like to append the entire content
> > | > of "two"...
> > | >
> > | > You should be able to do:
> > | >
> > | >    Element twoDocElt = two.getDocumentElement();
> > | >    two.removeChild(twoDocElt);
> > | >    oneElement.appendChild(twoDocElt);
> > | >
> > | > _________________________________________________________
> > | > Steve Muench, Consulting Product Manager & XML Evangelist
> > | > Business Components for Java Development Team
> > | > http://technet.oracle.com/tech/java
> > | > http://technet.oracle.com/tech/xml
>




From robin at isogen.com  Thu Dec  2 23:02:14 1999
From: robin at isogen.com (Robin Cover)
Date: Mon Jun  7 17:18:16 2004
Subject: Content or Metadata?
In-Reply-To: <3.0.32.19991202140921.01490100@pop.intergate.ca>
Message-ID: <Pine.GSO.3.96.991202164214.20104G-100000@grind>

Postscriptum:

In some OO theory, I think it's believed favorable to create
distinct attributes for things that are ordered (since 
[Boyce-] Codd believed that attributes are intrinsically
unordered): this, for Mike Spreitzer's example of
"list of authors": firstAuthor, secondAuthor, thirdAuthor,
etc.  Well, suppose there are in fact three groups of
authors, with different principles of sub-ordering, which
are masked in the typical presentation... it may then be
more economical (80:20 rule, which I detest) to say that
we allow an attribute value which is an orderable list
of (sub-)tokens.  I have seen -- indeed, documented -- some
works which enumerated over 30 authors for the piece.
Volumes/analytical works from the French academies.
(And why not?  Only the aesthetics of print books and the
supposed cost of printer's ink have led to style rules that
say "truncate with 'etc' after N authors...".)  In such
cases: I suspect the order (-edness, -ability) has nothing
to do with whether the factoids are (meta-)data or not.

Nothing is simple, despite what could appear to be
incontrovertible facts.

-r

--------------------------------------------------------------

On Thu, 2 Dec 1999, Tim Bray wrote:

> At 11:41 AM 12/2/99 PST, Mike Spreitzer wrote:
> >What about the list of authors of a scholarly paper?  Isn't that metadata for which order
> >matters?
> 
> Yep, in fact that's the one use-case that kept coming up during the early 
> stage of RDF design.  Here's another one for free: content models.  But
> the notion that there is some ordering on a document's author, title, 
> and date-of-publication is surprising and unnatural. -T.




From david at megginson.com  Thu Dec  2 23:50:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:16 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <3.0.32.19991202141224.0148fc60@pop.intergate.ca>
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca>
Message-ID: <14407.1389.659881.147338@localhost.localdomain>

Tim Bray writes:

 > At 04:27 PM 12/2/99 -0500, David Megginson wrote:
 > >I'll be posting three follow-up messages on SAX/C++ to stimulate
 > >discussion:
 > 
 > Good idea, one question.  Any way to do C at the same time? -Tim

Sure -- is there a strong need for a common C interface, though?  We
already have Expat's C interface, and I don't know of anyone else in
that space yet.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From anderst at toolsmiths.se  Fri Dec  3 00:31:29 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:18:16 2004
Subject: Any XML Schemas validators out yet ?
References: <01BA10F0CD20D3119B2400805FD40F9F2781B7@MDYNYCMSX1>
Message-ID: <3846F6CB.86276410@toolsmiths.se>

"DuCharme, Robert" wrote:

> >So I'm wondering if there are any tools that can validate XML Schemas
> >themselves
> >and maybe also validate XML documents using XML Schemas?
>
> At least for W3C Schemas:
>
> Being XML documents themselves, you can take the DTD in Appendix B of the
> schema proposal and validate your schema against that using any validating
> parser.

I tried this but MS Explorer 5.0.2919 reports this error in the XML Schema proposal:

Attribute 'xmlns:' must be a #FIXED attribute. Line 17, Position 18

                 model      (open|refinable|closed) 'closed' >
-----------------^

Maybe I'm using the wrong schema,
"http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.dtd"?

>
> To validate XML documents against these schemas, the only thing I know of
> out there is the Xerces parser at xml.apache.org.

Thanks, I'll have a look.

Best
/Anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From jtauber at jtauber.com  Fri Dec  3 01:19:56 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:16 2004
Subject: Content or Metadata?
References: <3.0.32.19991202111153.0150e870@pop.intergate.ca>
Message-ID: <011f01bf3d15$bbac9300$eb020a0a@bowstreet.com>

> 2. Not all data is metadata.  Examples: this email message; Chopin's
>    Nocturnes; Tuxedo.gif.

While I'm not arguing against your point, it is interesting to note that
each of these examples has headers which could be thought of as metadata.

James Tauber

"Metadata is data you forgot to put in in the first place" - Ted Nelson




From ricko at allette.com.au  Fri Dec  3 02:51:38 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:16 2004
Subject: Content or Metadata?
Message-ID: <006e01bf3d3c$d46a6da0$5ef96d8c@NT.JELLIFFE.COM.AU>


From: Robin Cover <robin@isogen.com>

>Of course, such notions reflect perspective, which may or may
>not be implicit/explicit in the style rules and underlying
>assumptions of the house.

For all its sins, RDF showed up a major area that is currently missing
in Schemas: the need to make the generic relationships between elements
explicit.

In particular RDF used "bag", "seq" and "alt".  But there are many more
such relationships:
    * is one element an annotation of another?

    * is that annotation superior (e.g., a title, a summary) or
subsidiary (e.g., an explanation, a digression, an alternative, a
role)?

    * does one element/attribute have any meaning without some other
element/attribute (e.g., does a particular number also require a units
element/attribute/default)?

    * which roles do elements and attributes play in the particular
taxonomic/ontological methodology of their creator (e.g., what is data,
what is metadata)?

Some of these things, RDF Schemas could make possible, and XLink could
have made possible.  I think RDF is a continual reminder that GIs and
containment may make relationships obvious to humans, but in the absence
of other conventions, they may hide these relationships from the
computer.

B.t.w, the sins of RDF were all commented on at the time:

  * the spec is clearly two or three different
documents cobbled together with little cohesion between them;

  * having a syntax like the _n attribute names which made validation
impossible except by special-purpose validators;

   * not having the discipline of a DTD fragment, so that some elements
mentioned are never explicitly given in the EBNF productions;

   * RDF is a framework but it should have been an architecture which is
framework-neutral. The test of whether it is useful as a framework is
whether generic tools are useful for RDF data; if, in fact, it is being
mainly used for specific applications, then RDF markup would be better
formulated as conventions that sit on top of DTDs/schemas that allow as
natural modeling of the data as possible.

I think RDF should have concentrated on how to fit on top of
regular markup, including markup of inline elements interspersed
through paragraphs. Atomic data is just the simplest case of that.


Rick Jelliffe




From donpark at docuverse.com  Fri Dec  3 03:08:11 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:16 2004
Subject: Content or Metadata?
In-Reply-To: <NCBBJANJAENGCPMNOIOCKEFHFLAA.spreitze@parc.xerox.com>
Message-ID: <001201bf3d3b$9c469760$099918d1@docuverse1>

'meta-' just means 'beyond' or 'transcending' and requires
a context.  Engineers typically apply the 'instance' role
to the context and 'definition' role to the 'meta-whatever'
because 'Category' is a powerful meme.  There are other
memes that retain the 'meta' relationship between roles.

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From tpassin at idsonline.com  Fri Dec  3 03:25:56 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:16 2004
Subject: Some questions
References: <3.0.5.32.19991202092403.00cc56f0@corp.infoseek.com>
Message-ID: <003901bf3d3e$b4fe61e0$a82a08d1@tomshp>


Walter Underwood wrote:
>...
> Here are stages of having the info (content) that you want,
> ordered in increasing amounts of wasted time.
>
> 1. I have the information.
> 2. I know where the information is.
> 3. I know it exists, but I don't know where it is.
> 4. I don't know if it exists.
>
> Only the last two need some sort of metacontent or finding aid.
>
I'd add one more:
5. I'm not sure exactly what I'm looking for, but I'll probably know it when
I find it.

This could be analogous to browsing in a store looking for a gift, which you
vaguely thought might be a toaster, and discovering a bread machine.

With the size and complexity of the web, making (5) work better would be a
great boon.

Tom Passin

> Organizing and indexing content is a time-saver, and sometimes that
> is essential. Sometimes, the metacontent has the whole answer (which
> companies sell rhinestone tiaras), but most people really want to
> buy the tiara.
>




From Rajiv.Mordani at eng.sun.com  Fri Dec  3 03:25:25 1999
From: Rajiv.Mordani at eng.sun.com (Rajiv Mordani)
Date: Mon Jun  7 17:18:16 2004
Subject: q: i'd like to merge two docs ...
In-Reply-To: <3845EAFA.341C3122@pacbell.net>
Message-ID: <Pine.SOL.3.96.991202192213.6718E-100000@milhouse>

XmlDocument has a changeNodeOwner API for exactly this. Call it on the
node before appending and you won't get the error shown below.

- Rajiv

XML is to the 90s what ASCII was to the 70s

On Wed, 1 Dec 1999, James Todd wrote:

> 
> hi -
> 
>     i could use a pointer or two, a recipe if you will, on how best to
>     "modify and merge" two xml docs. the scenario:
> 
>         an inbound xml "fragment", a complete xml doc in its own
>         right, is amended (eg. one new attribute is added)
> 
>         the result of which is appended, as a child node, to a
>         "hosting" xml tree
> 
>     i've got most of this working using the ProjectX [? Mr. Brownell ?]
>     parser yet it fails during the appendChild() stating that the child
>     node
> 
>         "That node doesn't belong in this document"
> 
>     due to the fact, i believe, that it has a distinct OwnerDocument.
> 
>     my methodology to date is to create dom's for both the inbound
>     "fragment" and the destination xml docs, after which i'd like to
>     modify the fragment (hence going the dom route) and finally add the
>     results to the destination doc via appendChild().
> 
>     i had hoped to bypass walking the tree in order to create a
>     "document ownerless" copy to work with. is there
>     a better/preferred means by which to accomplish this task?
> 
>     any/all comments and suggestions welcomed.
> 
>     thx much,
> 
> - james
> 
> 




From tpassin at idsonline.com  Fri Dec  3 03:48:36 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:16 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain>
Message-ID: <008301bf3d41$de0c9b80$a82a08d1@tomshp>


David Megginson wrote
> Tim Bray writes:
>
>  > At 04:27 PM 12/2/99 -0500, David Megginson wrote:
>  > >I'll be posting three follow-up messages on SAX/C++ to stimulate
>  > >discussion:
>  >
>  > Good idea, one question.  Any way to do C at the same time? -Tim
>
> Sure -- is there a strong need for a common C interface, though?  We
> already have Expat's C interface, and I don't know of anyone else in
> that space yet.
>
But C is available on most _any_ platform - often for free.  So almost
anyone could compile in C but not necessarily in C++.  Isn't rxp done in C?

Tom Passin




From vlashua at RSGsystems.com  Fri Dec  3 04:45:49 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun  7 17:18:16 2004
Subject: RDF, again
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D72@MAILPROD>

Asking out of ignorance: 
Is there thought being devoted to a universally accessible catalog of id's,
names (lists), classes, datatypes -- maybe even using the MARC system and
the LC index -- existing as a universal repository of components describing
data structures?

It would be a "soft" resource, like a library catalog, but with "hard" data
points: the LC system is not a standard; it is a registry and a reference
maintained by an authority. A publisher may suggest the cataloguing
classification of an individual object, but any given library may catalog
its instance-object differently. Meanwhile, because publishers and libraries
are interested in keeping in touch with information, a library patron from
virtually anywhere can find most objects in a given class and select from
them.

The difficulty with the definitions below, for instance, is that "name" is a
collection of characters whose context is not clear without a reference.
Namespaces, it seems to me, are absolutely necessary, but they tend to
encourage diversity where convergence would be a more enlightened tendency.

Vane


-----Original Message-----
From: Mark Birbeck [mailto:Mark.Birbeck@iedigital.net]
Sent: Tuesday, November 23, 1999 6:46 PM
To: 'Paul Prescod'; 'xml-dev@ic.ac.uk'
Subject: RE: RDF, again


Paul Prescod wrote:
> The thing I find confusing about the RDF syntax is that the 
> element type
> name can be either an RDF type name or an RDF property. XML makes no
> distinction and that's why I think that it is difficult to use for
> object oriented interchange.

I got the impression from the spec that this is intentional, so that a
straightforward XML document - that might not contain *any* RDF - can
still be interpreted as a set of RDF statements. In other words,
different XML layouts (elements for attributes, e.g.) of the same data
would result in the same RDF statements.

The XML would still need to be well thought out though. For example:

	<person name="Paul">
		<food>trifle</food>
	</person>

might mean trifle is your favourite food, the main food you're allergic
to, or your pudding preference for the office Xmas party. All of these
are acceptable in XML, but the RDF interpretation of this may well be
incorrect - or at least not as rich in meaning as we would like:

	Person has a name "Paul" and a food "trifle"

So, to make the first statement - trifle is Paul's favourite food - we
could use the following RDF:

<rdf:RDF>
	<rdf:Description ID="1">
		<rdf:Type rdf:resource="person" />
		<x:name>Paul</x:name>
	</rdf:Description>

	<rdf:Description ID="2">
		<rdf:Type rdf:resource="food" />
		<x:name>trifle</x:name>
	</rdf:Description>

	<rdf:Description about="#1">
		<x:favourite rdf:resource="#2" />
	</rdf:Description>
</rdf:RDF>

Using the abbreviated forms allowed to us, this is the same 'RDF':

	<x:person x:name="Paul">
		<x:favourite>
			<x:food>trifle</x:food>
		</x:favourite>
	</x:person>

or:

	<x:person>
		<x:name>Paul</x:name>
		<x:favourite>
			<x:food>trifle</x:food>
		</x:favourite>
	</x:person>

or:

	<x:person>
		<x:name>Paul</x:name>
		<x:favourite x:food="trifle" />
	</x:person>

So - to turn this round - any of the previous three XML documents can be
interpreted as the same set of RDF statements - a person with the name
"Paul" has a favourite food, and that food is called trifle - even
without any explicit RDF present.

As to whether it is any good for object interchange, I think it is. Of
course, if the relationships between elements contained within other
elements can be inferred then straight XML is fine. But as soon as you
need something more complex then RDF is very good (not to mention when
the objects being referred to are outside of the XML document you're
working with, so you can't use ID/IDREF).

Best regards,

Mark



From vlashua at RSGsystems.com  Fri Dec  3 04:46:33 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun  7 17:18:16 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D71@MAILPROD>

There is no rationale for interface as a topic in an XML discussion group,
but while it's passing:

The newest of the eyeglass interfaces, with the small addition of a
thumb-ball, earphones, and mic, is getting near to "better-than-TRS-80". The
most significant impediment to a good interface is the QWERTY keyboard and
our collective investment in having learned to use it (combined with the
need for relative silence while we're using it).

Around the same era that the mouse emerged, there was on the market a
single-handed(?) encoding device whose speed was about the same as QWERTY. I
think I saw it in Byte. Anybody seen one lately?

Vane

-----Original Message-----
From: Tyler Baker [mailto:tyler@infinet.com]
Sent: Monday, November 22, 1999 11:13 PM
To: rev-bob@gotc.com
Cc: xml-dev@ic.ac.uk
Subject: Re: SGML, XML and SML


rev-bob@gotc.com wrote:

> > ** Original Sender: David Megginson <david@megginson.com>
> >
<snip!>
> really seems to be looking at - text-to-speech conversion for small
> devices.  That is, instead of working on making the tiny screens a little
> bit bigger or a little bit clearer,
<snip!>

no one has created a display device or
user interface that is significantly better than the old TRS-80.
<snip!>

Tyler




From jjc at jclark.com  Fri Dec  3 04:49:44 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:16 2004
Subject: SAX/C++: UTF-8 v UTF-16
References: <14406.58740.871829.541816@localhost.localdomain>
Message-ID: <38472FE3.D3BB22BC@jclark.com>

David Megginson wrote:

> 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
>    with most existing C++ code.

I would say there was at least as much C++ code using UTF-16 as using
UTF-8. On Windows at least, UTF-16 is much more common. The DOM mandates
UTF-16, so if SAX mandated UTF-8 there would be an unfortunate mismatch.
This is a tough one, because there's a lot more diversity in the C++
world.  My preference would be not to mandate either UTF-8 or UTF-16
exclusively.  There are lots of apps using UTF-8 and there are lots of
apps using UTF-16; if you exclude either, then a lot of apps will take a
major performance/convenience hit. Expat allows a choice at compile-time
between UTF-8 and UTF-16, and there are big projects using both (eg Perl
uses UTF-8 and Mozilla uses UTF-16).

There are a couple of possible solutions:

1. A lo-tech solution.  Provide a SAXChar typedef, and define everything
in terms of SAXChar.  SAXChar gets typedefed to either char or unsigned
short depending on whether SAX_UNICODE is defined or not.  It's up to
implementations to decide whether to support both or just one, and up to
clients to decide whether to work with both or to require one.

A variation on this is to allow both UTF-8 and UTF-16 variants to exist
in a single library.  To do this, you can do something along the lines
of

class AttributeList16 {
public:
  virtual const unsigned short *getName(int pos) = 0;
};

class AttributeList8 {
public:
  virtual const char *getName(int pos) = 0;
};

#ifdef SAX_UNICODE
typedef AttributeList16 AttributeList;
#else
typedef AttributeList8 AttributeList;
#endif

2. A hi-tech solution.  Do what the Standard C++ library does and make
the interface a template in the character type.  This is the cleanest
solution, but lots of C++ projects eschew templates on portability
grounds.
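
A minimal sketch of what the template approach might look like, assuming
an interface parameterised on the character type. The names
BasicAttributeList, VectorAttributeList and nameLength are hypothetical and
not part of any SAX draft; storage uses std::vector rather than
std::basic_string so that no character traits are needed for unsigned short.

```cpp
#include <cassert>
#include <vector>

// Interface templated on the character type (hypothetical name).
template <class CharT>
class BasicAttributeList {
public:
  virtual int getLength() const = 0;
  virtual const CharT *getName(int pos) const = 0;
protected:
  ~BasicAttributeList() {}  // never destroyed through the interface
};

// The two encodings become two instantiations of one template.
typedef BasicAttributeList<char> AttributeList8;             // UTF-8
typedef BasicAttributeList<unsigned short> AttributeList16;  // UTF-16

// Hypothetical concrete implementation, for illustration only.
template <class CharT>
class VectorAttributeList : public BasicAttributeList<CharT> {
public:
  // Copies a 0-terminated name, including the terminator.
  void add(const CharT *name) {
    std::vector<CharT> copy;
    while (*name != 0) copy.push_back(*name++);
    copy.push_back(0);
    names_.push_back(copy);
  }
  int getLength() const { return static_cast<int>(names_.size()); }
  const CharT *getName(int pos) const {
    assert(pos >= 0 && pos < getLength());
    return &names_[pos][0];
  }
private:
  std::vector<std::vector<CharT> > names_;
};

// Client code written once works with either encoding.
template <class CharT>
int nameLength(const BasicAttributeList<CharT> &atts, int pos) {
  int n = 0;
  for (const CharT *p = atts.getName(pos); *p != 0; ++p) ++n;
  return n;
}
```

The cost is the one already noted: every client that wants to stay
encoding-neutral has to be a template itself.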

If you feel that one needs to be mandated, I would pick UTF-16.

James




From jjc at jclark.com  Fri Dec  3 04:49:46 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:16 2004
Subject: SAX/C++: C++-specific design principles
References: <14406.58740.871829.541816@localhost.localdomain>
Message-ID: <384741C0.50ABA536@jclark.com>

David Megginson wrote:

> 2. Pointers never change ownership -- if a Parser (for example) wants
>    to own an InputSource, it needs to make its own copy.  The app has
>    to free everything that it allocates, and the SAX driver, likewise.

That's problematic for EntityResolver::resolveEntity; that requires that
ownership of an InputSource be transferred to the caller from the
callee.

This could be avoided by doing:

virtual void
resolveEntity(const char *publicId,
              const char *systemId,
              InputSource &inputSource);

instead of:

virtual const InputSource *
resolveEntity(const char *publicId,
              const char *systemId);
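
A minimal sketch of the reference-parameter form of resolveEntity, in which
the caller creates and owns the InputSource and the resolver only fills it
in, so no pointer changes ownership. The InputSource here is a cut-down
stand-in, and LocalCatalogResolver, resolveForParser, the public id and the
URLs are all hypothetical, for illustration only.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Cut-down stand-in for the SAX InputSource; only the system id is modelled.
class InputSource {
public:
  void setSystemId(const char *systemId) { systemId_ = systemId; }
  const char *getSystemId() const { return systemId_.c_str(); }
private:
  std::string systemId_;
};

class EntityResolver {
public:
  // The caller owns inputSource; the resolver only modifies it.
  virtual void resolveEntity(const char *publicId,
                             const char *systemId,
                             InputSource &inputSource) = 0;
protected:
  ~EntityResolver() {}
};

// Hypothetical resolver that redirects one public id to a local copy.
class LocalCatalogResolver : public EntityResolver {
public:
  void resolveEntity(const char *publicId,
                     const char *systemId,
                     InputSource &inputSource) {
    if (publicId != 0 &&
        std::strcmp(publicId, "-//EXAMPLE//DTD Demo//EN") == 0)
      inputSource.setSystemId("file:///local/demo.dtd");
    else
      inputSource.setSystemId(systemId);
  }
};

// What a parser might do: the InputSource lives on the parser's stack,
// so nothing allocated by the resolver has to be freed by the parser.
std::string resolveForParser(EntityResolver &resolver,
                             const char *publicId,
                             const char *systemId) {
  assert(systemId != 0);
  InputSource source;
  resolver.resolveEntity(publicId, systemId, source);
  return source.getSystemId();
}
```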

James




From jjc at jclark.com  Fri Dec  3 04:49:49 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:16 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain>
Message-ID: <38474BAF.AF4CFF2D@jclark.com>

In Java, everything in SAX is an interface. The way to do an interface
in C++ is to use a class where all members (except possibly a virtual
destructor) are abstract (ie defined as = 0).  This provides the maximum
flexibility and insulation. The only good reason not to do an interface
is if it were necessary and possible to inline some method calls for
performance.  I don't think this applies here: certainly there's no
performance need to inline method calls to something like InputSource.

One interesting issue is whether to provide a virtual destructor.  I
think the safest solution is not to provide a virtual destructor but
instead to declare but not define a private operator delete.  This makes
it a compile time error to do:

  DTDHandler *p;
  // ...
  delete p;

Given the policy on object ownership there's never any need to do that:
only the creator of an object can delete it and the creator always has a
pointer to the concrete subclass which will provide a way to release the
object.
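
A minimal sketch of the private operator delete idea, assuming a
one-method DTDHandler for brevity. RecordingDTDHandler and
driverReportsNotation are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <string>

class DTDHandler {
public:
  virtual void notationDecl(const char *name) = 0;
private:
  // Declared but never defined: 'delete p' on a DTDHandler* will not compile.
  void operator delete(void *);
};

// Hypothetical concrete handler; its creator holds it by concrete type.
class RecordingDTDHandler : public DTDHandler {
public:
  void notationDecl(const char *name) {
    assert(name != 0);
    last_ = name;
  }
  const std::string &last() const { return last_; }
private:
  std::string last_;
};

// A SAX driver only ever sees the interface and cannot delete it.
void driverReportsNotation(DTDHandler &handler) {
  handler.notationDecl("gif");
  // DTDHandler *p = &handler;
  // delete p;   // rejected at compile time: operator delete is private
}
```

Here the creator would simply declare a RecordingDTDHandler with automatic
storage and pass it to the driver. One caveat, as far as I can tell: a
concrete class that is created with new would need to declare its own
accessible operator delete, since a new-expression also checks access to
the deallocation function.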

It also has the nice property that there is no .cpp file associated with
the SAX interface and no SAX library that has to be compiled or linked
with.  It would be a completely pure interface.

Here's another draft, with this change and a few other minor changes:

- use int not size_t (Lakos has a whole section on why unsigned in
interfaces is usually a bad idea)
- use a SAXString typedef for zero-terminated arrays
- don't use (void) for empty argument lists
- use iosfwd not istream as the header file
- use characters not SAXCharacters as the method name on DocumentHandler
- use a const char * arg for Parser::setLocale; I think that's the best
you can do portably; Standard C++ allows locales to be identified by
name
- add Locator
- change resolveEntity to avoid transfer of ownership as suggested in my
previous message
- solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
and a SAX_UTF16 namespace (since you're using std::istream, you are
assuming compiler support for namespaces); this will work nicely with
namespace aliases (eg namespace SAX = SAX_UTF8).
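
A minimal sketch of how the two-namespace scheme and a namespace alias
might be used from application code. To stay self-contained it stamps the
shared declarations out with a macro (SAX_DECLARATIONS, a hypothetical
stand-in for textually including a shared header), and it declares only a
one-method DocumentHandler. CharCounter is likewise hypothetical.

```cpp
#include <cassert>

// Stand-in for the shared declarations that both namespaces include.
#define SAX_DECLARATIONS                                          \
  class DocumentHandler {                                         \
  public:                                                         \
    virtual void characters(const SAXChar *ch, int length) = 0;   \
  protected:                                                      \
    ~DocumentHandler() {}                                         \
  };

namespace SAX_UTF8 {
  typedef char SAXChar;
  SAX_DECLARATIONS
}

namespace SAX_UTF16 {
  typedef unsigned short SAXChar;
  SAX_DECLARATIONS
}

// An application picks one encoding with a namespace alias...
namespace SAX = SAX_UTF8;

// ...and is then written against SAX:: only.
class CharCounter : public SAX::DocumentHandler {
public:
  CharCounter() : total_(0) {}
  void characters(const SAX::SAXChar *, int length) {
    assert(length >= 0);
    total_ += length;
  }
  int total() const { return total_; }
private:
  int total_;
};
```

Switching the application to UTF-16 would then mean changing the alias
and any string literals, not every declaration.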

Discussion points:

- Would it be better to typedef SAXString to the Standard C++ string
class (ie std::basic_string<SAXChar>)?

James

Here's SAX.h:

#ifndef SAX_H_INCLUDED
#define SAX_H_INCLUDED

// Forward declarations of std::istream
#include <iosfwd>

namespace SAX_UTF8 {

  typedef char SAXChar;
  // A 0 terminated array of SAXChars.
  typedef const char *SAXString;
#include "SAXDecl.h"

}

namespace SAX_UTF16 {

  typedef unsigned short SAXChar;
  // A 0 terminated array of SAXChars.
  typedef const unsigned short *SAXString;
#include "SAXDecl.h"

}

#endif

And here's SAXDecl.h:

class InputSource
{
public:
  virtual SAXString getPublicId () const = 0;
  virtual void setPublicId (SAXString publicId) = 0;

  virtual SAXString getSystemId () const = 0;
  virtual void setSystemId (SAXString systemId) = 0;

  virtual std::istream * getInputStream () const = 0;
  virtual void setInputStream (std::istream * in) = 0;
private:
  void operator delete (void *);
};


class AttributeList
{
public:
  virtual int getLength () const = 0;

  virtual SAXString getName (int pos) const = 0;
  virtual SAXString getType (int pos) const = 0;
  virtual SAXString getValue (int pos) const = 0;

  virtual SAXString getType (SAXString name) const = 0;
  virtual SAXString getValue (SAXString name) const = 0;
private:
  void operator delete (void *);
};


class SAXException
{
public:
  virtual SAXString getMessage () const = 0;
private:
  void operator delete (void *);
};


class SAXParseException : public SAXException
{
public:
  virtual SAXString getPublicId () const = 0;
  virtual SAXString getSystemId () const = 0;
  virtual int getLineNumber () const = 0;
  virtual int getColumnNumber () const = 0;
private:
  void operator delete (void *);
};


class EntityResolver
{
public:
  virtual void resolveEntity (SAXString publicId,
			      SAXString systemId,
			      InputSource &) = 0;
private:
  void operator delete (void *);
};


class DTDHandler
{
public:
  virtual void notationDecl (SAXString name,
			     SAXString publicId,
			     SAXString systemId) = 0;
  virtual void unparsedEntityDecl (SAXString name,
				   SAXString publicId,
				   SAXString systemId,
				   SAXString notationName) = 0;
private:
  void operator delete (void *);
};


class Locator
{
public:
  virtual SAXString getPublicId () const = 0;
  virtual SAXString getSystemId () const = 0;
  virtual int getLineNumber() const = 0;
  virtual int getColumnNumber() const = 0;
private:
  void operator delete (void *);
};

class DocumentHandler
{
public:
  virtual void setDocumentLocator (const Locator &locator) = 0;
  virtual void startDocument () = 0;
  virtual void endDocument () = 0;
  virtual void startElement (SAXString name,
                             const AttributeList &atts) = 0;
  virtual void endElement (SAXString name) = 0;
  virtual void characters (const SAXChar * ch, int length) = 0;
  virtual void ignorableWhitespace (const SAXChar * ch, int length) = 0;
  virtual void processingInstruction (SAXString target,
                                      SAXString data) = 0;
private:
  void operator delete (void *);
};


class ErrorHandler
{
public:
  virtual void warning (const SAXParseException &e) = 0;
  virtual void error (const SAXParseException &e) = 0;
  virtual void fatalError (const SAXParseException &e) = 0;
private:
  void operator delete (void *);
};


class Parser
{
public:
  virtual void setLocale (const char *) = 0;
  virtual void setEntityResolver (EntityResolver &resolver) = 0;
  virtual void setDTDHandler (DTDHandler &handler) = 0;
  virtual void setDocumentHandler (DocumentHandler &handler) = 0;
  virtual void setErrorHandler (ErrorHandler &handler) = 0;

  virtual void parse (SAXString systemId) = 0;
  virtual void parse (const InputSource &input) = 0;
private:
  void operator delete (void *);
};

This also extends easily to doing a templated version:

template<class SAXChar, class SAXString>
class BASIC_SAX {
#include "SAXDecl.h"
};

James




From paul at qub.com  Fri Dec  3 05:08:13 1999
From: paul at qub.com (Paul Tchistopolskii)
Date: Mon Jun  7 17:18:17 2004
Subject: XML. SAX. Streaming processing with Groves.
Message-ID: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>


The advantage of SAX (and attributes in XML)
is that we have attributes in place when startElement
is invoked. We know some properties of this
element right when the element begins.

At that point could we get the information about
the other properties this element has
(the child elements)?

No. We can not. If we decide to read the entire
element before invoking startElement(), we'll have
to read the entire (root) document to know.
DOM does it.

How can we workaround this limitation?

Right now I'm writing yet another wrapper
around SAX, accumulating the element contents
in a 'microDOM' and then making a decision in
endElement() about what to do with the element
itself depending on the values of its children.
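
A minimal sketch of such a wrapper, assuming simplified SAX-style events
that carry only element names and text. RecordBuffer, the "status" and
"id" child names, and the acceptance rule are all hypothetical; the point
is only that one element's children are buffered and the decision is
deferred to endElement().

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::string> Children;

// Buffers the direct children of one "record" element at a time.
class RecordBuffer {
public:
  explicit RecordBuffer(const std::string &recordName)
    : recordName_(recordName), depth_(0) {}

  void startElement(const std::string &name) {
    if (depth_ > 0) {
      if (++depth_ == 2) current_ = name;  // a direct child of the record
    } else if (name == recordName_) {
      depth_ = 1;                          // entering a new record
      children_.clear();
    }
  }

  void characters(const std::string &text) {
    if (depth_ == 2) children_[current_] += text;
  }

  void endElement(const std::string &name) {
    if (depth_ == 0) return;               // outside any record
    if (--depth_ == 0) {
      assert(name == recordName_);
      // The whole record has been seen; only now decide what to do with it.
      Children::const_iterator it = children_.find("status");
      if (it != children_.end() && it->second == "active")
        accepted_.push_back(children_["id"]);
    }
  }

  const std::vector<std::string> &accepted() const { return accepted_; }

private:
  std::string recordName_;
  int depth_;               // 0 = outside a record, 1 = in it, 2 = in a child
  std::string current_;     // name of the direct child being read
  Children children_;       // the 'microDOM' for the current record
  std::vector<std::string> accepted_;
};
```

Memory use is bounded by the largest single record, not by the document.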

The 'correct' approach to avoid such a hell 
is to use DOM - but I can't. Documents could 
be big. 

I also can not require the client to turn all their 
elements into attributes.

It's actually very interesting. If one wants his
XML documents to be easy to process
without DOM he should have as many attributes
as possible!

<aside>
Isn't it the end of the long discussion of Elements vs Attributes?
Now when I see the question: "Should I use attributes or
elements?" - I know the answer:

"If you want it to be processed by current APIs without keeping
the entire document in memory - use attributes everywhere
you can."
</aside>

What if for *some* elements the parser would invoke
startElement (or endElement) *after* reading the
entire element and placing all the children into ... (Grove)?

I think I could specify those 'return-as-grove' elements
at runtime, or when I'm initializing the parser.

How easy is it to do with the current SAX design?

Rgds.Paul.




From Mike.Champion at softwareag-usa.com  Fri Dec  3 06:03:30 1999
From: Mike.Champion at softwareag-usa.com (Michael Champion)
Date: Mon Jun  7 17:18:17 2004
Subject: XML. SAX. Streaming processing with Groves.
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>
Message-ID: <011101bf3d53$60df73a0$e5d88dce@WORKGROUP>


----- Original Message -----
From: Paul Tchistopolskii <paul@qub.com>
To: <xml-dev@ic.ac.uk>
Sent: Friday, December 03, 1999 12:05 AM
Subject: XML. SAX. Streaming processing with Groves.


> The 'correct' approach to avoid such a hell
> is to use DOM - but I can't. Documents could
> be big.
>

The DOM WG will be defining the requirements for Level 3 over the next 6
weeks or so.  Standard APIs for loading, saving, parsing, and serializing
XML text are "must have" items for Level 3, and this issue (that an
application may want access to the elements of a document before it is fully
parsed) has come up. For example, a programmer might choose not to continue
parsing some huge document after the necessary data were found.

Concrete suggestions for actual APIs or pointers to APIs that allow this
would be appreciated.




From donpark at docuverse.com  Fri Dec  3 06:13:52 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:17 2004
Subject: XML. SAX. Streaming processing with Groves.
In-Reply-To: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>
Message-ID: <001b01bf3d55$a451c500$099918d1@docuverse1>

>The advantage of SAX ( and attributes in XML) 
>is that we have attributes  in place when startElement 
>is invoked. We know what are some properties of this 
>element right when element begins.

While there are indeed practical benefits to having
attributes readily available, event-based APIs like
SAX unintentionally encourage novice XML programmers,
who are not fully aware of attribute-vs-element issues,
toward designing data formats that favor attributes
over child elements.

>At that point could we get the information about 
>the another properties this element has
>( the child elements ) ?

At the expense of requiring multithread support,
most of this problem goes away if the parser runs
in a separate thread so that by the time the attribute
stored as a child element is requested, it is already
available.  If not, the requester's thread simply blocks
until it is.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From howardk at fatdog.com  Fri Dec  3 06:25:14 1999
From: howardk at fatdog.com (Howard Katz)
Date: Mon Jun  7 17:18:17 2004
Subject: ANNOUNCE: XML Query Engine
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <384761B5.5CBF5851@fatdog.com>

[from the website]
XML Query Engine is a full-text search-and-retrieval engine for XML documents. A JavaBean
component, XML Query Engine can index single or multiple well-formed XML documents using any
SAX-based parser. The query engine builds an in-memory representation of the content and
structure of the indexed documents. Users can then pose queries against the indexed data using
XQL, a de facto standard for searching XML that is [very nearly] a proper subset of XPath, an
official W3C recommendation.

The version of XQL used by XML Query Engine has been extended slightly to provide a facility
for making full-text queries against the data set. This capability is similar to that found in
most current web-based search engines.
[end]

The software is currently in alpha and I'm interested in getting feedback. I'll be demoing in
the "New Technology Nursery" area on the exhibit floor at XML'99 next week. I don't know my
schedule for the show yet. If you want to reach me in Philadelphia, call me at the cell number
below or leave a message at the Marriott.

I'll be shipping copies of the software once I'm back in Vancouver the week of December 13th.
I'd like to hear initially what people want to do with it. If you want a copy, send me an
email and tell me in one or two lines whether your intentions are honourable and what they
are. :-) I'll be happy to send you a zipped copy in return.

More information is available at www.fatdog.com. If you're emailing me, please copy me at
howardckatz@yahoo.com since I'm experiencing some email difficulties due to a domain-name
move.

Regards,

Howard Katz, Fatdog Software

email: howardk@fatdog.com
web: www.fatdog.com
cell: (604) 725-3434




From mdash at techbooks.com  Fri Dec  3 06:30:11 1999
From: mdash at techbooks.com (Manoranjan Dash)
Date: Mon Jun  7 17:18:17 2004
Subject: unsubscribe
In-Reply-To: <011101bf3d53$60df73a0$e5d88dce@WORKGROUP>
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii>
Message-ID: <3.0.6.32.19991203115728.008a0100@pinnacle.techbooks.com>

unsubscribe




From donpark at docuverse.com  Fri Dec  3 06:30:32 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:17 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
In-Reply-To: <A51F7543E295D2118D6600A024CDB2F71B9D71@MAILPROD>
Message-ID: <001d01bf3d57$f8bdca60$099918d1@docuverse1>

The other day, I had this idea about solving the
display problem for mobile computing.  While I
do not think it is implementable right now, I do
not think it is impossible.

The problem: large displays for mobile devices.
The solution: public display walls that show
different views to different people simultaneously.
Multithreaded display of sorts. <g>

Technology wise, I think the pixels will have to
protrude like a small pyramid to show multiple
views and be multiplexed to coincide with the viewer's
eyeglasses, which include LCD shutters.  Interesting
stuff to muse about.  Not much privacy barrier but
I think it might make sense in certain applications.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From liamquin at interlog.com  Fri Dec  3 07:27:41 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun  7 17:18:17 2004
Subject: ANNOUNCE: XML Query Engine
In-Reply-To: <384761B5.5CBF5851@fatdog.com>
Message-ID: <Pine.BSI.3.96r.991203021510.2564A-100000@shell1.interlog.com>

On Thu, 2 Dec 1999, Howard Katz wrote:

> XML Query Engine is a full-text search-and-retrieval engine for
> XML documents. A JavaBean
> component, XML Query Engine can index single or multiple well-formed
> XML documents using any
> SAX-based parser.

I think this is interesting, but I wonder about the performance.
There are two main reasons for using text retrieval, as I see it.
(1) for searching a large body of text significantly more quickly than
    with grep

(2) for kinds of search not otherwise possible, such as searches that
    span words, or that include stemming, synonyms or other morphological
    and linguistic analysis, or that include document structure or other
    "fielded" searches.

> The query engine builds an in-memory representation of the content and
> structure of the indexed documents.
This sounds like an interesting proof of concept...
but if I am searching, say, five gigabytes of text, what
will happen?

Indexing speed is also an issue.

So I assume the main purpose of this tool is to experiment
with XPath/XQL, is that fair??

Lee

and yes, I'd like a copy!  thanks :-)

-- 
Liam Quin, Barefoot Computing, Toronto;  The barefoot agitator
l i a m    at    h o l o w e b    dot    n e t
Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/




From jjc at jclark.com  Fri Dec  3 07:34:28 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:17 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp>
Message-ID: <38475BB6.662858CF@jclark.com>

"Thomas B. Passin" wrote:
> 
> David Megginson wrote
> > Tim Bray writes:
> >
> >  > At 04:27 PM 12/2/99 -0500, David Megginson wrote:
> >  > >I'll be posting three follow-up messages on SAX/C++ to stimulate
> >  > >discussion:
> >  >
> >  > Good idea, one question.  Any way to do C at the same time? -Tim
> >
> > Sure -- is there a strong need for a common C interface, though?  We
> > already have Expat's C interface, and I don't know of anyone else in
> > that space yet.
> >
> But C is available on most _any_ platform - often for free.

So is C++ these days.

> So almost
> anyone could compile in C but not necessarily in C++.

The bigger problem is that the SAX style of interface goes over quite
naturally into C++, but would be rather awkward in C.

James
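
To make the contrast concrete, here is a minimal sketch. It is an
illustration only: it is not Expat's C API and not the SAX/C++ draft, and
all names in it (CDocumentHandler, DocumentHandler, ElementCounter, driveC,
driveCxx) are hypothetical. In the C form every callback has to carry an
untyped user-data pointer and cast it back; in the C++ form the handler
object simply holds its own state.

```cpp
#include <cstddef>

// C style: a table of function pointers plus an untyped user-data pointer.
struct CDocumentHandler {
    void* userData;
    void (*startElement)(void* userData, const char* name);
    void (*endElement)(void* userData, const char* name);
};

// C++ style: an abstract class; state lives in the derived object.
class DocumentHandler {
public:
    virtual void startElement(const char* name) = 0;
    virtual void endElement(const char* name) = 0;
    virtual ~DocumentHandler() = default;
};

// Stand-ins for a parser (hypothetical): each reports a single <doc> element.
inline void driveC(const CDocumentHandler& h) {
    h.startElement(h.userData, "doc");
    h.endElement(h.userData, "doc");
}
inline void driveCxx(DocumentHandler& h) {
    h.startElement("doc");
    h.endElement("doc");
}

// C client: state must be recovered from void* in every callback.
struct CounterState {
    std::size_t starts;
    std::size_t ends;
};
inline void countStart(void* ud, const char*) {
    static_cast<CounterState*>(ud)->starts++;
}
inline void countEnd(void* ud, const char*) {
    static_cast<CounterState*>(ud)->ends++;
}

// C++ client: state is ordinary member data.
class ElementCounter : public DocumentHandler {
public:
    std::size_t starts = 0;
    std::size_t ends = 0;
    void startElement(const char*) override { ++starts; }
    void endElement(const char*) override { ++ends; }
};
```

The C version works, but the cast in each callback is unchecked, which is
roughly the awkwardness referred to above.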




From sb at metis.no  Fri Dec  3 09:09:41 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:17 2004
Subject: Some questions
In-Reply-To: David Megginson's message of "30 Oct 1999 06:45:32 -0500"
References: <000301bf3c4c$85bde920$0f36a8c0@quokka.com> <m33dutoyw3.fsf@localhost.localdomain>
Message-ID: <whogc8tmpg.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> "Jeffrey E. Sussna" <jes@kuantech.com> writes:
>> I wouldn't consider RDF at the same level as CORBA, but perhaps part
>> of an overall solution.

> Though I'm the one that brought CORBA into the discussion, I think
> that a better comparison would probably be XMI, since CORBA is a
> protocol rather than a format.

<pedantic mode>
CORBA is a standard (or set of standards), for creating and using
distributed objects, and consisting of formats and protocols.
IIOP/GIOP would be a protocol, and so would the IDL interfaces
of the CORBA Services, if you stretch the concept.  IDL is definitely
a format, and I don't know how to categorize the different
IDL language bindings as either.
</pedantic mode>



From matthew at praxis.cz  Fri Dec  3 09:32:56 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:17 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain>
Message-ID: <38478DB2.FACA4633@praxis.cz>

David Megginson wrote:
> How does the schema tell me that foo represents a container for a
> collection of objects, bar represents an object, and hack and flurb
> represent the object's properties?

The point is not what the current schema draft allows, it is whether it
would be feasible and appropriate to represent this information in XML
schemas, as Paul rightly stated. My opinion is that it would be fairly
trivial and extremely useful.

> It can be.  The DOM represents a domain-specific object layer that is
> useful for a wide subset of XML operations (especially document- and
> browser-oriented work).  There need to be many layers on top of XML,
> one for each domain -- it happens that many of those layers will share
> the need to encode objects, so a standard object layer sandwiched
> between XML and the domain-specific layers can save a lot of work.

Sure, the DOM has value. My point is that maybe 95% of applications want
a domain-specific rather than a generic interface. My other point is
that a domain-specific interface can be implemented generically; i.e.
programmatic interfaces for accessing XML data can be generated
automatically from XML schemas. This isn't *that* far from what MDSAX is
doing. IBM's XML BeanMaker (http://alphaworks.ibm.com/tech/xmlbeanmaker)
is a good example of this concept.

> > There are a variety of efforts to create
> > domain-specific objects automatically from XML objects. I don't have a
> > list at the tips of my fingers, but if anyone does it would be a great
> > resource. They are out there because I keep bumping into them.
> 
> One example is RDF.

So we are talking about different things. RDF is a formalism but it
doesn't provide you with any code (although I'm sure that tools for this
could be written, and perhaps already have been). I am talking about
something that will take my schema with Customer and Invoice element
types and turn it into, say, Java classes called Customer and Invoice.

> I disagree strongly with the last part of that statement.  I'd argue
> the opposite -- higher-level layers should be as independent of XML as
> possible.  That's the only way to build good, layered architectures.
> XML does one thing (represent a tree structure in a character stream)
> very well: it's an excellent layer to build other layers on top of,
> but XML itself should stay as simple as possible so that it's
> applicable widely to many different fields.

I agree with the layering approach. But well-formed XML should be viewed
as the lowest level (representing tree structures); when bound to an XML
schema it then becomes a serialized object representation.

> That would be another serious mistake.  Object exchange, while
> important, represents only one of many layers that can be built on top
> of XML, and if XML Schemas start trying to solve high-level problems
> for every specific domain, it will become an unimplementable mess.
> RDF already made a similar mistake by mixing together a spec for
> object encoding in XML with a spec for representing knowledge about
> Web pages.

Maybe this is the crux of our disagreement. I see object exchange as
*the* application for valid XML. I'd be interested to hear some examples
of applications that cannot be cast effectively in this light. In this
view, RDF and XML Schemas are coming at the same problem from different
angles. RDF is saying essentially "how do we build an XML application
that represents object structures", while XML Schemas are saying "how do
we enhance DTDs by adding some object-oriented facilities". My fear is
that these two approaches are going to meet somewhere in the middle and
turn out to be the same thing. If so, I vastly prefer the use of XML
schemas. Why? Because this results in a vast simplification of the whole
XML picture. Isn't it better to take a normal XML instance, using base
XML syntax, and "turn" it into an object by adding the appropriate
information in a separate schema, rather than having to recast the whole
thing in a different syntax?

(I wonder if I am expressing this idea clearly. I'll happily post an
example of how this could be done if I'm not.)

Cheers,
Matthew



From matthew at praxis.cz  Fri Dec  3 09:38:10 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:17 2004
Subject: Schemas and strongly typed links (Was Re: Object-oriented serialization)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <38469260.E7D18FA3@jfinity.com>
Message-ID: <38478EF9.ED93CA0A@praxis.cz>

Gabe Beged-Dov wrote:
<snip>
> XLink doesn't even allow you to use a namespace qualified name for the "role" (this may have
> been fixed but it will be done as a new attribute value type like qname).  It certainly
> doesn't touch being able to specify a type for the property value.  The XML Schema group may
> end up supporting strongly typed references but I wouldn't be surprised if this fell off the
> plate.

This is really exactly what I am trying to get across. A tremendous
amount of effort is being invested in RDF, on various levels (specing,
implementation, evangelism, etc.). This shouldn't cause XML schemas to
be poorer! I may be standing alone here (am I?), but to me it would be a
minor tragedy if XML schemas did not support strongly typed links at the
schema level.

Matthew



From sb at metis.no  Fri Dec  3 10:55:37 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:17 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: David Megginson's message of "Thu, 2 Dec 1999 16:32:36 -0500 (EST)"
References: <14406.58740.871829.541816@localhost.localdomain>
Message-ID: <whbt88tht4.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> 1. Use references when there can never be a null value, pointers
>    otherwise.

Sounds reasonable.

> 2. Pointers never change ownership -- if a Parser (for example) wants
>    to own an InputSource, it needs to make its own copy.  The app has
>    to free everything that it allocates, and the SAX driver, likewise.

A good basic practice.

> 3. Callbacks cannot be const, since they often change the state of the 
>    client app.

Agree.

> 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
>    with most existing C++ code.

Disagree.  This just defers the task of decoding from UTF-8 to UTF-16,
which every forward-looking XML application will eventually have to
do.  For Asian languages this will also incur extra overhead, since
I'm led to believe they will mostly store documents as UTF-16, so that
we will have a UTF-16 to UTF-8 to UTF-16 transformation through the
SAX interface.

(I currently have a SAX (or "SAXoid") C++ wrapper around expat, where
I currently use plain std::string& to transfer text.  But this is just 
a transitional stage until I manage to get full wide char support in
the underlying system.  (What I send through SAX isn't UTF-8, but
ISO8859-1 with all unknown characters changed into ".", since this is
all the underlying system understands))

> 5. Use char * rather than string, to avoid forcing a lot of allocation 
>    overhead on the SAX driver.

Hm... when I wrote my expat wrapper, I didn't even stop to think about 
this, since strings are so easy to use, and it would become a string
in the first map<> lookup anyways.

But I guess late evaluation is always a good thing (I'm using this
heavily on the AttributeList, where no C++ objects will be created
until someone asks for the first attribute).

But I would rather see "const wchar_t*" (which I believe at least
Xerces-C uses) than "const char*".



From sb at metis.no  Fri Dec  3 10:58:37 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:17 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: David Megginson's message of "Thu, 2 Dec 1999 16:38:11 -0500 (EST)"
References: <14406.59075.218048.437305@localhost.localdomain>
Message-ID: <wh7liwthnr.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> Here are some of the differences between the SAX/Java interfaces and the 
> SAX/C++ interfaces:

> - InputSource doesn't have an equivalent of Java Reader (no getReader
>   method)

I would like to be able to create a "push" stream, ie. something
similar to a libwww stream, where data that arrives asynchronously
will just be "pushed" to the parser as they arrive.

expat already supports this, and I use it.
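
In expat this is done by calling XML_Parse repeatedly with each buffer as it
arrives, plus a flag marking the final one. The sketch below shows the same
"push" shape in a self-contained form. It is a toy, not expat and not a
proposed SAX/C++ interface: PushTagScanner, putBlock and incomplete are
hypothetical names, and the scanner only recognises "<...>" markup. The point
is that state is kept between calls, so a tag may be split across blocks.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Hypothetical push-style scanner: the caller feeds blocks as they arrive.
class PushTagScanner {
public:
    using TagCallback = std::function<void(const std::string&)>;

    explicit PushTagScanner(TagCallback onTag) : onTag_(std::move(onTag)) {}

    // Feed whatever bytes have arrived so far; a tag may span several calls.
    void putBlock(const char* buf, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) {
            const char c = buf[i];
            if (!inTag_) {
                if (c == '<') {          // start collecting a new tag
                    inTag_ = true;
                    pending_.clear();
                }
            } else if (c == '>') {       // tag complete: report it
                inTag_ = false;
                onTag_(pending_);
            } else {
                pending_ += c;           // still inside the tag
            }
        }
    }

    // True if the input received so far stops in the middle of a tag.
    bool incomplete() const { return inTag_; }

private:
    TagCallback onTag_;
    std::string pending_;
    bool inTag_ = false;
};
```

A network callback would call putBlock once per received chunk, with no need
for the parser to block on a read.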



From sb at metis.no  Fri Dec  3 11:38:22 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whso1ks19h.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> One interesting issue is whether to provide a virtual destructor.  I
> think the safest solution is not to provide a virtual destructor but
> instead to declare but not define a private operator delete.  This
> makes it a compile time error to do:

>   DTDHandler *p;
>   // ...
>   delete p;

Hm... not defining a virtual destructor for a class with virtual
functions gives me warnings in "gcc -Wall".  Will a private operator
delete do anything about these warnings, I wonder...?



From rja at arpsolutions.demon.co.uk  Fri Dec  3 11:53:56 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: UTF-8 v UTF-16
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com>
Message-ID: <008b01bf3d84$b037d650$c5010180@p197>

> 2. A hi-tech solution.  Do what the Standard C++ library does and make
> the interface a template in the character type.  This is the cleanest
> solution, but lots of C++ projects eschew templates on portability
> grounds.

The Vivid C/C++ toolkit uses templates internally and so far has been
compiled under Windows, Solaris and HPUX and a few others so I think the
problem with templates is not so much of an issue these days (although it
took us time to find the LCD for template support), but, I'd still probably
avoid them in the SAX C/C++ definitions just in case.

> If you feel that one needs to be mandated, I would pick UTF-16.

I second that.  The Vivid Creation SAX interfaces
( http://www.vivid-creations.com/free/sax.h ) have been UTF-16 from day 1
( around 16 months ago ) and to date they've had nothing but positive
feedback.  I'd therefore make everything wchar_t and not char.

I'd forget C as most platforms do have C/C++ and STL these days.

Regards,

Richard.




From Toby.Speight at streapadair.freeserve.co.uk  Fri Dec  3 12:09:09 1999
From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight)
Date: Mon Jun  7 17:18:18 2004
Subject: A processing instruction for robots
In-Reply-To: Walter Underwood's message of "Thu, 02 Dec 1999 13:58:58 -0800"
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <u1z941b1f.fsf@lanber.cam.citrix.com>

Walter> Walter Underwood <URL:mailto:wunder@infoseek.com>

0> In article <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>,
0> Walter wrote:

Walter> HTML has a robots meta tag.  XML has no standard way to
Walter> declare the same information.  Here is a proposal, with an
Walter> implementation:
Walter>
Walter>   <URL:http://homepages.go.com/%7Ewunder0/robots-pi.html>
Walter>
Walter> Comments are welcome.

It may be an idea to provide a NOTATION identifier for the processing
instruction, rather than binding it to the specific word "robots".  It
depends on the trade-off you want to make between implementor convenience
and author generality.  If you've thought about it and decided against,
it's probably worth a comment in your proposal explaining your rationale.

-- 




From jjc at jclark.com  Fri Dec  3 12:36:08 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: UTF-8 v UTF-16
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197>
Message-ID: <3847B8F3.D81B8286@jclark.com>

Richard Anderson wrote:

> > If you feel that one needs to be mandated, I would pick UTF-16.
> 
> I second that.  The Vivid Creation SAX interfaces
> ( http://www.vivid-creations.com/free/sax.h ) have been UTF-16 from day 1
> ( around 16 months ago ) and to date they've had nothing but positive
> feedback.  I'd therefore make everything wchar_t and not char.

Unfortunately wchar_t isn't guaranteed to be UTF-16.  Some platforms
make it 32-bits.  However, I agree it's a good idea for SAXChar to be
typedefed to wchar_t on platforms where wchar_t is UTF-16.

James




From jjc at jclark.com  Fri Dec  3 12:37:13 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whso1ks19h.fsf@viffer.oslo.metis.no>
Message-ID: <3847B801.588A8797@jclark.com>

Steinar Bang wrote:
> 
> >>>>> James Clark <jjc@jclark.com>:
> 
> > One interesting issue is whether to provide a virtual destructor.  I
> > think the safest solution is not to provide a virtual destructor but
> > instead to declare but not define a private operator delete.  This
> > makes it a compile time error to do:
> 
> >   DTDHandler *p;
> >   // ...
> >   delete p;
> 
> Hm... not defining a virtual destructor for a class with virtual
> functions gives me warnings in "gcc -Wall".  Will a private operator
> delete do anything about these warnings, I wonder...?

If not, gcc should be fixed, because there's no legitimate reason to
give a warning. I got this technique from

  http://www.develop.com/dbox/cxx/SmartPtr.htm#Obvious

I've also verified that this complies with the C++ standard (the
relevant clause is 12.5p4).  gcc 2.95.1 correctly gives a compile-error
if you try to delete such a class. Visual C++ 6 doesn't catch this at
compile, but you'll still get a link-time error.

The other possible technique is a protected virtual destructor with an
empty implementation.

James
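
A minimal sketch of the second technique mentioned above (a protected
virtual destructor with an empty body), not of the private operator delete
approach. It assumes C++11 for the static_assert checks, and the one-method
DTDHandler, CountingDTDHandler and reportOneNotation are hypothetical
stand-ins rather than the draft interface.

```cpp
#include <type_traits>

// Hypothetical one-method handler interface, for illustration only.
class DTDHandler {
public:
    virtual void notationDecl(const char* name) = 0;
protected:
    // Protected: "delete p" on a DTDHandler* outside the class hierarchy
    // does not compile. Virtual, with an empty implementation.
    virtual ~DTDHandler() {}
};

// The application owns its handler and destroys it as the concrete type.
class CountingDTDHandler : public DTDHandler {
public:
    int notations = 0;
    void notationDecl(const char*) override { ++notations; }
};

// Stand-in for a parser that only borrows the handler and never frees it.
inline void reportOneNotation(DTDHandler& handler) {
    handler.notationDecl("gif");
}

// Client code cannot destroy an object through the interface type...
static_assert(!std::is_destructible<DTDHandler>::value,
              "DTDHandler must not be destructible through the interface");
// ...but the concrete handler is destroyed normally by its owner.
static_assert(std::is_destructible<CountingDTDHandler>::value,
              "concrete handlers remain destructible");
```

Either way the effect matches the ownership rule in the draft: the driver
cannot free a handler that the application allocated.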




From sb at metis.no  Fri Dec  3 13:07:48 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 19:34:59 +0700"
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197> <3847B8F3.D81B8286@jclark.com>
Message-ID: <whyabcqijv.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> Richard Anderson wrote:
>> > If you feel that one needs to be mandated, I would pick UTF-16.
>> 
>> I second that.  The Vivid Creation SAX interfaces
>> ( http://www.vivid-creations.com/free/sax.h ) have been UTF-16 from day 1
>> ( around 16 months ago ) and to date they've had nothing but positive
>> feedback.  I'd therefore make everything wchar_t and not char.

> Unfortunately wchar_t isn't guaranteed to be UTF-16.  Some platforms
> make it 32-bits.

Yep!  So I've heard.

Do you have a list of the ones that do this?

> However, I agree it's a good idea for SAXChar to be typedefed to
> wchar_t on platforms where wchar_t is UTF-16.

Hm... should we also do a
	typedef basic_string<SAXChar> SAXstring;
(needs a better name, I lowercased the "s" in "string" to distinguish it
from SAXString)?

(of course, then we would probably need SAXChar char_traits<> of some
sort as well...)
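
One way the two typedefs could look, as a sketch under stated assumptions
rather than a proposal. It assumes a C++11 compiler, which provides
char16_t (not available to compilers at the time of this thread), and it
uses the width of wchar_t as a stand-in for "wchar_t is UTF-16", which the
language itself does not promise. SAXStdString is a hypothetical name.
Because both candidate character types are built in, the standard library
already supplies char_traits for them.

```cpp
#include <climits>
#include <string>
#include <type_traits>

// Use wchar_t where it is a 16-bit type; otherwise fall back to char16_t
// rather than accepting a 32-bit wchar_t.
using SAXChar = std::conditional<sizeof(wchar_t) * CHAR_BIT == 16,
                                 wchar_t, char16_t>::type;

// std::char_traits is already specialised for wchar_t and char16_t, so no
// hand-written traits class is needed for this particular choice.
using SAXStdString = std::basic_string<SAXChar>;

static_assert(sizeof(SAXChar) * CHAR_BIT >= 16,
              "SAXChar must be able to hold one UTF-16 code unit");
```

A custom char_traits would only become necessary if SAXChar were a plain
integer typedef such as unsigned short.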



From sb at metis.no  Fri Dec  3 13:15:00 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whu2m0qi7l.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> - Would it be better to typedef SAXString to the Standard C++ string
> class (ie std::basic_string<SAXChar>)?

An argument for using 
        typedef const SAXChar* SAXString;
is that you get late construction of the basic_string<>, ie. you don't 
create it until you have to (eg. when using it to do a lookup in an
STL map<>).

But even if it's not used directly on the SAX interface, such a typedef
would be useful when handling the data (eg. for creating the above
mentioned map<>).
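
A small sketch of the late-construction point, assuming an 8-bit SAXChar for
brevity. ElementTally and SAXStdString are hypothetical names; only the
SAXString typedef comes from the discussion. The callback receives a
borrowed pointer, and a string object is built only at the moment a map<>
lookup actually needs one.

```cpp
#include <map>
#include <string>

typedef char SAXChar;                             // assumption: 8-bit for brevity
typedef const SAXChar* SAXString;                 // the typedef argued for above
typedef std::basic_string<SAXChar> SAXStdString;  // hypothetical name

// Hypothetical client: counts occurrences of the element names it watches.
class ElementTally {
public:
    ElementTally() : skipped_(0) {}

    void watch(SAXString name) { counts_[SAXStdString(name)] = 0; }

    // Callback-style entry point: receives a borrowed pointer.
    void startElement(SAXString name) {
        if (counts_.empty()) {   // nothing watched: no string is constructed
            ++skipped_;
            return;
        }
        // The string object is constructed only here, for the lookup.
        std::map<SAXStdString, int>::iterator it =
            counts_.find(SAXStdString(name));
        if (it != counts_.end()) {
            ++it->second;
        } else {
            ++skipped_;
        }
    }

    int count(SAXString name) const {
        std::map<SAXStdString, int>::const_iterator it =
            counts_.find(SAXStdString(name));
        return it == counts_.end() ? 0 : it->second;
    }

    int skipped() const { return skipped_; }

private:
    std::map<SAXStdString, int> counts_;
    int skipped_;
};
```

Had the interface passed basic_string<> by value, the driver would have paid
for a string per event even in the early-return case.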



From Daniel.Veillard at w3.org  Fri Dec  3 13:15:28 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <14406.58740.871829.541816@localhost.localdomain>
References: <14406.58740.871829.541816@localhost.localdomain>
Message-ID: <19991203081521.O2478@w3.org>

On Thu, Dec 02, 1999 at 04:32:36PM -0500, David Megginson wrote:
> Here are the principles that I applied to creating my first draft
> SAX/C++ interface:

  I'm afraid I won't be able to provide this interface in libxml
(the Gnome XML library http://xmlsoft.org/) due to the focus on C++,
though a C++ wrapper on top should be able to provide it.


> 2. Pointers never change ownership -- if a Parser (for example) wants
>    to own an InputSource, it needs to make its own copy.  The app has
>    to free everything that it allocates, and the SAX driver, likewise.

  Very good idea,

> 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
>    with most existing C++ code.

  As James pointed out, it's hard to segregate a class of users.
UTF-8 compactness will be appreciated by people wanting low memory
overhead when building transaction processing. UTF-16 will simplify
interfacing to DOM or using XML in UI oriented apps.

> 5. Use char * rather than string, to avoid forcing a lot of allocation 
>    overhead on the SAX driver.

I opted for the simple approach: an xmlChar type used everywhere
except for non-XML content (filenames, error messages ...). Whether it
is 8 or 16 bits should be a compile-time (or run-time, but that's more
risky) option.

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



From paul at prescod.net  Fri Dec  3 13:15:13 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:18:18 2004
Subject: Content or Metadata?
References: <NCBBJANJAENGCPMNOIOCKEFHFLAA.spreitze@parc.xerox.com>
Message-ID: <3847C259.7E4B4653@prescod.net>

Mike Spreitzer wrote:
> 
> What about the list of authors of a scholarly paper?  Isn't that metadata for which order
> matters?

Think of it from a programming language perspective:

class doc:
	title: string
	published: date
	authors: list of string
	text: list of (para|list|img)

The authors property is unordered with respect to the other properties
but its domain is ordered. The *list of authors* is metadata for the doc
object.

In grove land we allow a single, particular property to be labeled as
the content property. In this case it would be the "text:" property. In
a language like Python, you would navigate "regular (metadata)"
properties like this:

doc.publisher.address.street

and content properties like this:

doc[5][3][2][4]

In the former, the name is significant. In the latter the position is
significant. All of this is explained at:

http://www.prescod.net/groves/shorttut
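
The same distinction sketched in C++ (the language of the SAX threads on
this list), as an illustration rather than a grove implementation. Node is a
hypothetical type whose member names follow the "class doc" outline above;
a std::vector of the enclosing type as a member assumes C++17 or a standard
library that tolerates it.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node type for illustration.
struct Node {
    // "Regular (metadata)" properties: reached by name, and unordered with
    // respect to each other.
    std::string title;
    std::string published;
    std::vector<std::string> authors;  // a named property whose value is ordered

    // The single content property: reached by position.
    std::vector<Node> text;

    // Positional navigation, so that doc[5][3] walks the content tree.
    const Node& operator[](std::size_t i) const { return text.at(i); }
};
```

With this shape, doc.authors[1] uses a name first and then a position within
that property's value, while doc[0][1] uses positions only.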

-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
"I always wanted to be somebody, but I should have been more
specific." --Lily Tomlin



From tpassin at idsonline.com  Fri Dec  3 13:20:58 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:18 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp> <38475BB6.662858CF@jclark.com>
Message-ID: <002f01bf3d91$d254fa80$0ffbb1cd@tomshp>


From: James Clark <jjc@jclark.com>
>
> The bigger problem is that the SAX style of interface goes over quite
> naturally into C++, but would be rather awkward in C.
>
Well, that's for sure.  Maybe it's not worth the effort.  I was mainly
thinking about portability to smaller or rarer platforms (Amiga, Palm, etc).

Tom Passin



From sb at metis.no  Fri Dec  3 13:20:44 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:18 2004
Subject: parser asynch input (Was: SAX/C++: First interface draft)
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whpuwoqhy4.fsf_-_@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> class Parser
> {
> public:
>   virtual void setLocale (const char *) = 0;
>   virtual void setEntityResolver (EntityResolver &resolver) = 0;
>   virtual void setDTDHandler (DTDHandler &handler) = 0;
>   virtual void setDocumentHandler (DocumentHandler &handler) = 0;
>   virtual void setErrorHandler (ErrorHandler &handler) = 0;

>   virtual void parse (SAXString systemId) = 0;
>   virtual void parse (const InputSource &input) = 0;
> private:
>   void operator delete (void *);
> };

I would like to add operations that can be used to "push" data to the
parser asynchronously:

class Parser
{
public:
  virtual void setLocale (const char *) = 0;
  virtual void setEntityResolver (EntityResolver &resolver) = 0;
  virtual void setDTDHandler (DTDHandler &handler) = 0;
  virtual void setDocumentHandler (DocumentHandler &handler) = 0;
  virtual void setErrorHandler (ErrorHandler &handler) = 0;

  virtual void parse (SAXString systemId) = 0;
  virtual void parse (const InputSource &input) = 0;
  virtual void asynchId(SAXString systemId) = 0; //for error reporting
  virtual void asynchPutBlock(const char* buf, int len) = 0;
private:
  void operator delete (void *);
};
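
To show how a caller might use the two push operations, here is a minimal
sketch. It is an assumption-laden toy: AsynchInput and TagCounter are
hypothetical names, SAXString is assumed to be a zero-terminated char array,
and the "parser" only counts start tags rather than parsing XML. The point
is that state must survive between blocks, because a block boundary can fall
anywhere, even in the middle of a tag.

```cpp
#include <string>

// Assumption: SAXString is a zero-terminated char array.
typedef const char *SAXString;

// Only the two proposed push operations, split out for brevity.
class AsynchInput {
public:
    virtual void asynchId(SAXString systemId) = 0;             // for error reporting
    virtual void asynchPutBlock(const char *buf, int len) = 0;
protected:
    virtual ~AsynchInput() {}
};

// Toy stand-in for a parser: counts start tags. It remembers whether the
// previous character was '<', so a tag split across blocks is counted once.
class TagCounter : public AsynchInput {
public:
    TagCounter() : startTags_(0), sawLessThan_(false) {}

    void asynchId(SAXString systemId) { systemId_ = systemId; }

    void asynchPutBlock(const char *buf, int len) {
        for (int i = 0; i < len; ++i) {
            const char c = buf[i];
            if (sawLessThan_) {
                // In this toy model, '<' followed by a letter starts an element.
                if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))
                    ++startTags_;
                sawLessThan_ = false;
            }
            if (c == '<')
                sawLessThan_ = true;
        }
    }

    int startTags() const { return startTags_; }
    const std::string &systemId() const { return systemId_; }

private:
    std::string systemId_;
    int startTags_;
    bool sawLessThan_;
};
```

A network callback would then call asynchPutBlock once per received buffer,
without ever blocking inside the parser.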


From Wdehora at cromwellmedia.co.uk  Fri Dec  3 13:23:55 1999
From: Wdehora at cromwellmedia.co.uk (Bill dehOra)
Date: Mon Jun  7 17:18:18 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A5CA3CD@odin.cromwellmedia.co.uk>


    :  The other day, I had this idea about solving the
    :  display problem for mobile computing.  While I
    :  do not think it is implementable right now, I do
    :  not think it is impossible.
    :  
    :  The problem: large displays for mobile devices.
    :  The solution: public display walls that show
    :  different views to different people
    :  simultaneously.  A multithreaded display of sorts. <g>
    :  
    :  Technology-wise, I think the pixels will have to
    :  protrude like small pyramids to show multiple
    :  views, multiplexed to coincide with the viewer's
    :  eyeglasses, which include LCD shutters.  Interesting
    :  stuff to muse about.  Not much of a privacy barrier, but
    :  I think it might make sense in certain applications.
    :  


That's a cool idea, but unnecessary, given that we will have HUDs *within*
spectacles in a decade or so, making the wall displays redundant. Combine
spectacles with headphones and voice recognition, maybe with a small keypad,
and you have a complete mobile computing environment. You can get
context-driven information about the environment (a la GSM), giving you an
augmented reality, which IMHO will dwarf what is happening with the internet
now. All we really need are some advances in battery technology, and reality
is in for a comeback. Anyone for RML?

regards,

Bill de hOra


From sb at metis.no  Fri Dec  3 13:52:34 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:18 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: "Thomas B. Passin"'s message of "Fri, 3 Dec 1999 08:25:02 -0500"
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp> <38475BB6.662858CF@jclark.com> <002f01bf3d91$d254fa80$0ffbb1cd@tomshp>
Message-ID: <wh903cqgha.fsf@viffer.oslo.metis.no>

>>>>> "Thomas B. Passin" <tpassin@idsonline.com>:

> From: James Clark <jjc@jclark.com>

>> The bigger problem is that the SAX style of interface goes over
>> quite naturally into C++, but would be rather awkward in C.

> Well, that's for sure.  Maybe it's not worth the effort.  I was
> mainly thinking about portability to smaller or rarer platforms
> (Amiga, Palm, etc).

The W3C libwww structured stream is pretty close to the
DocumentHandler, at least:
	http://www.w3.org/Library/src/HTStruct.html


From steven.livingstone at scotent.co.uk  Fri Dec  3 13:52:24 1999
From: steven.livingstone at scotent.co.uk (Steven Livingstone, ITS, SENM)
Date: Mon Jun  7 17:18:18 2004
Subject: XML processing instruction survey
Message-ID: <8DCB90532FF7D211B34400805FD48853B56DFB@SENMAIL3>

Microsoft make an interesting use of it in their technology preview for SQL
Server XML.

The output of a transform is cached and the PI "servercache" is used to
determine how long the output should be cached for.

Interesting.

Cheers,
Steven

Steven Livingstone - http://www.deltabiz.com
07771 957 280 or +447771957280

Professional Site Server 3, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696
Professional Site Server 3.0 Commerce Edition, Wrox Press
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505


> -----Original Message-----
> From:	Jeffrey E. Sussna [SMTP:jes@kuantech.com]
> Sent:	30 November 1999 23:08
> To:	'XML Dev'
> Subject:	XML processing instruction survey
> 
> I'm interested in the extent to which people are actually using the XML
> processing instruction ( <?xml ) in their XML files, and the extent to
> which they find it useful.
> 
> Jeff Sussna

From david at megginson.com  Fri Dec  3 14:00:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:06:24 +0700"
References: <14406.58740.871829.541816@localhost.localdomain> <384741C0.50ABA536@jclark.com>
Message-ID: <m3zovsi0rr.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> That's problematic for EntityResolver::resolveEntity; that requires that
> ownership of an InputSource be transferred to the caller from the
> callee.
> 
> This could be avoided by doing:
> 
> virtual const InputSource *
> resolveEntity(const char *publicId,
>               const char *systemId);
> 
> instead of:
> 
> virtual void
> resolveEntity(const char *publicId,
>               const char *systemId,
>               InputSource &inputSource);

(I'll assume that James accidentally reversed the two).  The second
one is a very good idea -- the only modification I'd make is to add a
bool return value, so that the parser knows whether the resolver
actually wants to override:

virtual bool
 resolveEntity(const char *publicId,
               const char *systemId,
               InputSource &inputSource);
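
A minimal sketch of how the bool-returning form might be used, under stated
assumptions: InputSource is reduced here to a single system identifier, and
LocalCatalogResolver, chooseSystemId, the public identifier, and the file
URL are all hypothetical. The InputSource is created and owned by the caller
(the parser), so no ownership changes hands.

```cpp
#include <cstring>
#include <string>

// Hypothetical, simplified InputSource: just a system identifier.
class InputSource {
public:
    void setSystemId(const std::string &id) { systemId_ = id; }
    const std::string &getSystemId() const { return systemId_; }
private:
    std::string systemId_;
};

class EntityResolver {
public:
    // Returns true if inputSource was filled in and should override the default.
    virtual bool resolveEntity(const char *publicId,
                               const char *systemId,
                               InputSource &inputSource) = 0;
protected:
    virtual ~EntityResolver() {}
};

// Redirects one public identifier to a local copy; leaves all others alone.
class LocalCatalogResolver : public EntityResolver {
public:
    bool resolveEntity(const char *publicId,
                       const char * /* systemId */,
                       InputSource &inputSource) {
        if (publicId != 0 &&
            std::strcmp(publicId, "-//EXAMPLE//DTD Sample//EN") == 0) {
            inputSource.setSystemId("file:///usr/local/dtd/sample.dtd");
            return true;
        }
        return false;  // no override: the parser uses systemId as given
    }
};

// What a parser might do internally before opening an external entity.
inline std::string chooseSystemId(EntityResolver &resolver,
                                  const char *publicId,
                                  const char *systemId) {
    InputSource source;  // owned by the parser
    if (resolver.resolveEntity(publicId, systemId, source))
        return source.getSystemId();
    return systemId;
}
```

When the resolver returns false, the parser never has to look inside the
InputSource at all.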


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Fri Dec  3 14:05:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: Daniel Veillard's message of "Fri, 3 Dec 1999 08:15:21 -0500"
References: <14406.58740.871829.541816@localhost.localdomain> <19991203081521.O2478@w3.org>
Message-ID: <m3wvqwi0j7.fsf@localhost.localdomain>

Daniel Veillard <Daniel.Veillard@w3.org> writes:

> On Thu, Dec 02, 1999 at 04:32:36PM -0500, David Megginson wrote:
> > Here are the principles that I applied to creating my first draft
> > SAX/C++ interface:
> 
>   I'm afraid I won't be able to provide this interface in libxml
> (the Gnome XML library http://xmlsoft.org/) due to the focus on C++,
> though a C++ wrapper on top should be able to provide it.

Yes, that would be the best approach -- both libXML and Expat should
be easy to wrap in SAX/C++ adapters (XML4C++ will probably have its
own SAX/C++ support).  It would be interesting for apps to be able to
switch between libXML and Expat and compare performance, features,
etc.
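
The general shape of such an adapter might look like the sketch below. It
deliberately does not use the real libXML or Expat APIs: CParser and
cParserRun are a hypothetical C-style callback interface standing in for a C
parser library, and the DocumentHandler shown is cut down to two callbacks.

```cpp
#include <string>
#include <vector>

// --- Hypothetical C-style callback API (stand-in for a C parser library) ---
typedef void (*StartFn)(void *userData, const char *name);
typedef void (*EndFn)(void *userData, const char *name);

struct CParser {
    void *userData;
    StartFn onStart;
    EndFn onEnd;
};

// Toy driver: reports the events for <a><b/></a> instead of parsing input.
inline void cParserRun(CParser &p) {
    p.onStart(p.userData, "a");
    p.onStart(p.userData, "b");
    p.onEnd(p.userData, "b");
    p.onEnd(p.userData, "a");
}

// --- SAX-style C++ side (reduced to two callbacks) ---
class DocumentHandler {
public:
    virtual void startElement(const char *name) = 0;
    virtual void endElement(const char *name) = 0;
protected:
    virtual ~DocumentHandler() {}
};

// Adapter: forwards the C callbacks to a DocumentHandler via userData.
class CParserAdapter {
public:
    explicit CParserAdapter(DocumentHandler &handler) {
        parser_.userData = &handler;
        parser_.onStart = &CParserAdapter::start;
        parser_.onEnd = &CParserAdapter::end;
    }
    void parse() { cParserRun(parser_); }

private:
    static void start(void *userData, const char *name) {
        static_cast<DocumentHandler *>(userData)->startElement(name);
    }
    static void end(void *userData, const char *name) {
        static_cast<DocumentHandler *>(userData)->endElement(name);
    }
    CParser parser_;
};

// Example handler that records the events it receives.
class Recorder : public DocumentHandler {
public:
    void startElement(const char *name) { events.push_back(std::string("start:") + name); }
    void endElement(const char *name) { events.push_back(std::string("end:") + name); }
    std::vector<std::string> events;
};
```

An application written against DocumentHandler would not need to change if
a different adapter, over a different C library, were swapped in.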


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Fri Dec  3 14:09:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:18 2004
Subject: RDF, again
In-Reply-To: Vane Lashua's message of "Thu, 2 Dec 1999 17:50:33 -0500"
References: <A51F7543E295D2118D6600A024CDB2F71B9D72@MAILPROD>
Message-ID: <m3u2m0i0cj.fsf@localhost.localdomain>

Vane Lashua <vlashua@RSGsystems.com> writes:

> The difficulty with the definitions below, for instance, is that "name" is a
> collection of characters whose context is not clear without a reference.
> Namespaces, it seems to me, are absolutely necessary, but they tend to
> encourage diversity where convergence would be a more enlightened tendency.

Namespaces encourage innovation.  Innovation is the first stage in
development, and it needs to be followed by standardization where
demand warrants.

In ordinary language, Namespaces let people invent stuff, but it's our 
responsibility to look at what's being invented and standardize the
things that are being done over and over again.  It's a good idea to
let the market have a say first; if you skip the innovation stage and
try to standardize in advance, your standards will often miss the mark.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Fri Dec  3 14:19:21 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 19:30:57 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whso1ks19h.fsf@viffer.oslo.metis.no> <3847B801.588A8797@jclark.com>
Message-ID: <m3ogc8hzvm.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> The other possible technique is a protected virtual destructor with an
> empty implementation.

That might be a little clearer to less experienced C++ programmers
(like me).
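
For readers in the same position, here is a minimal sketch of the
protected-destructor technique. The notationDecl signature follows SAX 1.0
for Java and is only assumed for the C++ draft; MyDTDHandler is a
hypothetical concrete class, and the static_assert checks need C++11.

```cpp
#include <type_traits>

class DTDHandler {
public:
    virtual void notationDecl(const char *name,
                              const char *publicId,
                              const char *systemId) = 0;
protected:
    // Protected, so "delete p" on a DTDHandler* does not compile
    // outside the class hierarchy.
    virtual ~DTDHandler() {}
};

// The creator works with the concrete class and can destroy it normally.
class MyDTDHandler : public DTDHandler {
public:
    MyDTDHandler() : notations(0) {}
    void notationDecl(const char *, const char *, const char *) { ++notations; }
    int notations;
};

// Compile-time confirmation of who may destroy what.
static_assert(!std::is_destructible<DTDHandler>::value,
              "clients cannot destroy through the interface");
static_assert(std::is_destructible<MyDTDHandler>::value,
              "the creator can destroy the concrete object");
```

The effect matches the ownership policy: only code that knows the concrete
type can delete the object.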


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Fri Dec  3 14:17:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:18 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <m3r9h4hzyb.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> In Java, everything in SAX is an interface. The way to do an interface
> in C++ is to use a class where all members (except possibly a virtual
> destructor) are abstract (ie defined as = 0).  This provides the maximum
> flexibility and insulation. The only good reason not to do an interface
> is if it were necessary and possible to inline some method calls for
> performance.  I think this applies here: certainly there's no
> performance need to inline method calls to something like InputSource.

Actually, I don't see any strong argument not to provide empty inline
implementations for the handler callbacks:

    virtual void startDocument (void) {}
    virtual void endDocument (void) {}
    virtual void startElement (const char * name, const AttributeList &atts) {}
    virtual void endElement (const char * name) {}

(etc.)

That way, subclasses can implement only the callbacks they need, and
there's no need for a HandlerBase class.
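
A small sketch of what that buys a subclass. ElementCounter and
fireSampleEvents are hypothetical, AttributeList is reduced to an empty
placeholder, and the public virtual destructor is an assumption made only to
keep the example short.

```cpp
// Placeholder: the real AttributeList interface is not needed here.
class AttributeList {};

class DocumentHandler {
public:
    virtual void startDocument() {}
    virtual void endDocument() {}
    virtual void startElement(const char *, const AttributeList &) {}
    virtual void endElement(const char *) {}
    virtual ~DocumentHandler() {}
};

// Overrides only the one callback it cares about.
class ElementCounter : public DocumentHandler {
public:
    ElementCounter() : count(0) {}
    void startElement(const char *, const AttributeList &) { ++count; }
    int count;
};

// Stand-in for a parser: fires the events for <doc><para/></doc>.
inline void fireSampleEvents(DocumentHandler &handler) {
    AttributeList none;
    handler.startDocument();
    handler.startElement("doc", none);
    handler.startElement("para", none);
    handler.endElement("para");
    handler.endElement("doc");
    handler.endDocument();
}
```

The cost is that DocumentHandler is no longer a pure interface, which is the
trade-off against James's all-abstract approach.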

> One interesting issue is whether to provide a virtual destructor.  I
> think the safest solution is not to provide a virtual destructor but
> instead to declare but not define a private operator delete.  This makes
> it a compile time error to do:
> 
>   DTDHandler *p;
>   // ...
>   delete p;
> 
> Given the policy on object ownership there's never any need to do that:
> only the creator of an object can delete it and the creator always has a
> pointer to the concrete subclass which will provide a way to release the
> object.

I appreciate James sharing some of his C++ experience here.  This
sounds like a good idea to me, but I'm at best a C++ journeyman, so
I'd be happy to hear from other masters on the list.

> It also has the nice property that there is no .cpp file associated with
> the SAX interface and no SAX library that has to be compiled or linked
> with.  It would be a completely pure interface.

Yes, I'd like this as well.

> Here's another draft, with this change and a few other minor changes;
> 
> - use int not size_t (Lakos has a whole section on why unsigned in
> interfaces is usually a bad idea)

OK -- I thought that I was being well-behaved using size_t.  Oh, well.

> - use a SAXString typedef for zero-terminated arrays

Sounds good, if slightly obfuscatory.

> - don't use (void) for empty argument lists

What are the arguments for and against?

> - use iosfwd not istream as the header file

I have no idea why, but I'll take James's word on this.

> - use characters not SAXCharacters as the method name on DocumentHandler
> - use a const char * arg for Parser::setLocale; I think that's the best
> you can do portably; Standard C++ allows locales to be identified by
> name

Thanks.

> - add Locator

Yes, I forgot it.

> - change resolveEntity to avoid transfer of ownership as suggested in my
> previous message

Perfect, except that it might be nice to use bool as the return value, 
as I suggested, so that the parser isn't forced to examine the
InputSource if the app hasn't made any changes.

> - solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
> and a SAX_UTF16 namespace (since you're using std::istream, you are
> assuming compiler support for namespaces); this will work nicely with
> namespace aliases (eg namespace SAX = SAX_UTF8).

Ouch!  We might be getting a little hairy here.  How is Namespace
support out there, by the way?  I know that EGCS is pretty good on
Linux, though the std:: Namespace still isn't properly supported.
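
For anyone who hasn't used namespace aliases, the idea can be sketched as
below. This is only an illustration: the length and nameLength helpers are
hypothetical, and char16_t is a C++11 type that is assumed here for the
UTF-16 code unit (an older compiler would need a 16-bit integer typedef
instead).

```cpp
namespace SAX_UTF8 {
    typedef char SAXChar;
    typedef const SAXChar *SAXString;

    // Counts code units up to the terminating zero.
    inline int length(SAXString s) {
        int n = 0;
        while (s[n] != 0) ++n;
        return n;
    }
}

namespace SAX_UTF16 {
    typedef char16_t SAXChar;  // assumption: C++11 16-bit code unit type
    typedef const SAXChar *SAXString;

    inline int length(SAXString s) {
        int n = 0;
        while (s[n] != 0) ++n;
        return n;
    }
}

// The application picks one encoding in a single place...
namespace SAX = SAX_UTF8;

// ...and the rest of its code is written against SAX:: only.
inline int nameLength(SAX::SAXString name) {
    return SAX::length(name);
}
```

Switching the alias to SAX_UTF16 would change the character type everywhere
the application says SAX::, without touching the interface headers.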

> Discussion points:
> 
> - Would it be better to typedef SAXString to the Standard C++ string
> class (ie std::basic_string<SAXChar>)?

Do we want to force that overhead on an app?  I need to understand
better if there will be a high cost to calling c_str() over and over
again, if the app needs a regular zero-terminated character array.
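
For reference, this is roughly what the string-class variant would look like
from the app's side. The typedefs follow the discussion point above;
legacyLength and viaCStr are hypothetical helpers. As far as I know, C++11
and later require c_str() to run in constant time, but older library
implementations were not bound by that.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

typedef char SAXChar;
typedef std::basic_string<SAXChar> SAXString;

// Existing code that expects a zero-terminated array.
inline std::size_t legacyLength(const SAXChar *s) {
    return std::strlen(s);
}

// Each crossing into such code goes through c_str().
inline std::size_t viaCStr(const SAXString &s) {
    return legacyLength(s.c_str());
}
```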


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From david at megginson.com  Fri Dec  3 14:22:02 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:18 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: Matthew Gertner's message of "Fri, 03 Dec 1999 10:30:27 +0100"
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz>
Message-ID: <m3ln7chzrd.fsf@localhost.localdomain>

Matthew Gertner <matthew@praxis.cz> writes:

> David Megginson wrote:

> > How does the schema tell me that foo represents a container for a
> > collection of objects, bar represents an object, and hack and flurb
> > represent the object's properties?
> 
> The point is not what the current schema draft allows, it is whether it
> would be feasible and appropriate to represent this information in XML
> schemas, as Paul rightly stated. My opinion is that it would be fairly
> trivial and extremely useful.

Would you require schema processing, then, for object exchange (or in
other words, would there be no equivalent of the DTD-less XML
document)?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/


From Colas.Nahaboo at sophia.inria.fr  Fri Dec  3 14:22:41 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun  7 17:18:18 2004
Subject: Object-oriented serialization (Was Re: Some questions) 
In-Reply-To: Your message of "Thu, 02 Dec 1999 10:24:04 +0100."
             <38463AB4.36C5292B@praxis.cz> 
Message-ID: <199912031422.PAA22478@aye.inria.fr>


Matthew Gertner writes:
> The aspects of
> object-oriented design that are missing are then inheritance and
> polymorphism. 

In my opinion, things may be simpler. If SGML/XML had not been designed by
people living in a typeless world (text documents), XML could have provided a
much better medium to express object instances, with a design as simple as
getting rid of element content and allowing attribute values to be XML,
e.g.:

<Point
  x=<Length unit="inches" value="12"/>
  y=<Length unit="cm" value="2"/>
  color=<RGB R=<Number base="16" value="FF"/> G="0" B="0"/>
/>

matching a C/C++/Java... declaration of Point as:

Point {
  Length x;
  Length y;
  Color color;
};

As you can see, this would be a very elegant and natural way to express object
instances (aka serialization). One would of course need a schema language on
top of that (to express what I wrote in a C-like declaration), but having to
tweak the "low-level" serialisation to fit the current XML 1.0 Recommendation
is, I think, the original sin of XML, which pollutes a lot of the discussions
I see here. For instance, the current drive for removing attributes results
from this.
 
But, just like RDF, we could standardise on this non-XML-1.0-compatible (let's
call it GXML for Generalized XML :-) representation, and devise a canonical
way to express it in XML.  For instance:

<Point>
  <_x><Length unit="inches" value="12"/></_x>
  <_y><Length unit="cm" value="2"/></_y>
  <_color><RGB G="0" B="0">
       <_R><Number base="16" value="FF"/></_R>
  </RGB></_color>
</Point>

but we could devise others...
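
A minimal C++ sketch of the mapping rule used above (simple values stay as
attributes, object-valued members become child elements named "_" plus the
member name). The struct layout and the toXml and wrapMember functions are
hypothetical, the color member is left out for brevity, and no escaping of
attribute values is attempted.

```cpp
#include <string>

struct Length {
    std::string unit;
    std::string value;
};

struct Point {
    Length x;
    Length y;
};

// Simple-valued members are written as ordinary attributes.
inline std::string toXml(const Length &length) {
    return "<Length unit=\"" + length.unit + "\" value=\"" + length.value + "\"/>";
}

// An object-valued member becomes a child element named "_" + member name.
inline std::string wrapMember(const std::string &member, const std::string &xml) {
    return "<_" + member + ">" + xml + "</_" + member + ">";
}

inline std::string toXml(const Point &point) {
    return "<Point>" + wrapMember("x", toXml(point.x)) +
           wrapMember("y", toXml(point.y)) + "</Point>";
}
```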

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas



From jmcdonou at library.berkeley.edu  Fri Dec  3 14:23:18 1999
From: jmcdonou at library.berkeley.edu (Jerome McDonough)
Date: Mon Jun  7 17:18:19 2004
Subject: INTERFACE {was SGML, XML and SML, ugh!}
In-Reply-To: <A51F7543E295D2118D6600A024CDB2F71B9D71@MAILPROD>
Message-ID: <199912031422.JAA01701@westnet.com>

At 02:43 PM 12/2/99 -0500, Vane Lashua wrote:
>Around the same era that the mouse emerged, there was on the market a
>single-handed(?) encoding device whose speed was about the same as QWERTY. I
>think I saw it in Byte. Anybody seen one lately?
>

Yes, they're alive and well.  The Twiddler (gotta love the name) is probably
the best known, as it has been written up a few times as the keyboard of
choice for the people at MIT developing wearable computing gear.  See
www.handykey.com for more info.  Other one-handed chording keyboards
available include CyKey from Bellaire Electronics (www.bellaire.demon.co.uk),
the MonoManus from ElmEntry Enterprises (www.hankes.com/eee/index.htm) and the
Bat Personal Keyboard (www.infogrip.com).  Prices for these are typically
in the $150 - $200 U.S. range.

With practice, people using one-handed chording keyboards can usually get
typing speeds approaching 60 wpm, which is about the low end of professional
touch-typing speeds on QWERTY boards.


Jerome McDonough -- jmcdonou@library.Berkeley.EDU  |  (......)
Library Systems Office, 386 Doe, U.C. Berkeley     |  \ *  * /
Berkeley, CA 94720-6000    (510) 642-5168          |  \  <>  /
"Well, it looks easy enough...."                   |   \ -- /  SGNORMPF!!!
         -- From the Famous Last Words file        |    ||||


From matthew at praxis.cz  Fri Dec  3 14:49:20 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:19 2004
Subject: Web Vision (Was Re: Object-oriented serialization)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca>
	 <m3bt9hp6g4.fsf@localhost.localdomain>
	 <38459FA4.DAA00E35@praxis.cz>
	 <m366zpp3m8.fsf@localhost.localdomain>
	 <38463AB4.36C5292B@praxis.cz>
	 <m3aenta7qn.fsf@localhost.localdomain> <3.0.6.32.19991202133035.04315e60@mail.dt1.sdca.home.com>
Message-ID: <3847D7E2.A4AE2B48@praxis.cz>

Robert La Quey wrote:
> The basic problem remains a lack of a clearly articulated vision of what
> the web of the future could/should be.

Okay, Bob, let me take a crack at this. As will no doubt be obvious to
anyone who has read my last few posts, I am quite sceptical as to the
real value of RDF. My hope is that XML schemas will come to be viewed as
the right mechanism for specifying object-oriented structures, turning
XML instances into object serializations with all this implies. I said
some nasty things about RDF in my "Pleas for Schemas"
(www.praxisxml.com/praxis_xml.html) and was frankly quite disappointed
that no RDF proponents stepped up to defend their case. The current
discussion in this context is therefore very interesting and edifying
for me.

Anyway, my personal vision for the "new" (i.e. XML-enhanced) Web is
quite simple. Let's look at the pieces one by one:

* Namespaces -- are there to specify unique names. No more, no less.
They should be orthogonal to schemas, so a single schema can use several
namespaces and vice versa. The choice of URIs to uniquely scope names is
clever and elegant, but there needs to be a specification of what is at
the end of the URI. Apparently people get really confused otherwise.
Something like Simon St. Laurent's XPDL would be a great choice, but
oriented towards namespaces and not schemas and pointing to further
resources that might be of interest (such as schemas that use the
namespace and human-readable documentation).

* XML -- is for representing tree structures in text format.

* XML schemas -- are for turning XML instances into objects, adding
information about the semantic relationships between element types. They
are also repositories for business logic related to these element types. I
strongly contest the view that business logic can only be represented in
a messy combination of human-readable documentation and running code.
Schemas can provide a huge amount of semantic information in various
ways:
- Semantics of element type relationships (e.g. one element type is an
attribute of another, and not just contained in it)
- Plausibility constraints (e.g. allowed data ranges or regexes for
strings; this is already in the spec)
- XPath constraints. I think Rick Jelliffe's Schematron is a brilliant
idea that would bring heaps of benefits when embedded inside an XML
schema. Many types of nontrivial semantics can be expressed using XPath
and linked directly to a given element type.
- Opt-outs, for example links to Java classes that do further processing
that cannot be expressed descriptively. At this point we have "lost" in
terms of representing full application semantics inside a schema, but at
least we have a central location for binding logic in a descriptive way
to the application's data structures.

* Stylesheets -- transform XML documents or render them in various
human-readable formats.

In this view, schemas provide the application logic to let XML actually
do something. Websites become aggregators of XML documents (many of
which will be generated dynamically from database content), and the
schemas tell the processing application (which might be on the server or
on the client) what to do with the document. For example, I could write a
servlet that, based on an arbitrary XML schema, turns an XML document
into an abstract form definition (also XML) that can be processed by a
generic "form" stylesheet and turned into an HTML form, a WAP form or
whatever. This means I can turn any schema into an input form instantly
with no programming whatsoever (and this works just as well for reports,
query forms and what-have-you).

So what we have are a bunch of objects in the form of XML documents
zipping around the Web. Some are rendered for human viewing, some are
consumed by other applications. Schemas are the glue that ties this
all together, doing two things:
1) Providing object and application semantics to the XML instances and
2) Serving as the unit for distributing, reusing and extending these
semantics.

I'd be interested to hear how the RDFers see things in relation to this
vision. I know it's a little half-baked, but I don't want to write a
20-page mail and I'm sure no one wants to read it. I am working on
something more detailed to be posted on our website; we have tons of
ideas and plans in this direction.

Cheers,
Matthew


From tshw at capitalmarketscompany.com  Fri Dec  3 14:50:30 1999
From: tshw at capitalmarketscompany.com (Shaw Tim)
Date: Mon Jun  7 17:18:19 2004
Subject: RDF, again
Message-ID: <FDFFD5C2748BD211BC500008C71E933CBDF938@uklonts01.uklo.capitalmarketscompany.com>


Given a piece of data, described by a name within a namespace, I would like
to be able to determine the 'equivalent' data (and its name) within another
namespace. I don't mind 'finding' the data, but I need to be able to
determine its 'new' name within the new namespace.
I think that, the way things are going (and have always been), it's
necessary to have such a mechanism. From my (limited) understanding of RDF,
it's not able to give me the 'hook' to do this.
This, to my mind, is the equivalent of automated natural language
translation, where XML is the alphabet, the Namespace is the Dictionary and
DTDs or Schemata are the Grammar (and the data is the ...? - analogy breaks
down here in my overloaded braincell!).

We (humans) have always had a problem communicating across language
boundaries - can we not define a mechanism such that we do not propagate
this problem to XML?
People will continue to define their own grammars for specific purposes -
what about asking them to 'translate' their Dictionary into Esperanto so
that others can use their ... whatever words would map to (meaning?).
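
To make the 'Esperanto' idea a little more concrete, here is a minimal
sketch of the kind of lookup I have in mind. Everything in it is
hypothetical: the Interlingua class, the neutral concept identifiers, and
the example namespace URIs are invented for illustration and do not
correspond to RDF or to any existing specification.

```cpp
#include <map>
#include <string>
#include <utility>

// A qualified name: (namespace URI, local name).
typedef std::pair<std::string, std::string> QName;

class Interlingua {
public:
    // Declares that a qualified name denotes a shared, neutral concept.
    void declare(const QName &name, const std::string &conceptId) {
        toConcept_[name] = conceptId;
        fromConcept_[std::make_pair(conceptId, name.first)] = name.second;
    }

    // Finds the equivalent local name in targetNs.
    // Returns false if no equivalent has been declared.
    bool translate(const QName &name,
                   const std::string &targetNs,
                   std::string &localNameOut) const {
        std::map<QName, std::string>::const_iterator c = toConcept_.find(name);
        if (c == toConcept_.end())
            return false;
        std::map<std::pair<std::string, std::string>, std::string>::const_iterator t =
            fromConcept_.find(std::make_pair(c->second, targetNs));
        if (t == fromConcept_.end())
            return false;
        localNameOut = t->second;
        return true;
    }

private:
    std::map<QName, std::string> toConcept_;  // name -> concept
    std::map<std::pair<std::string, std::string>, std::string> fromConcept_;  // (concept, ns) -> local name
};
```

Each vocabulary owner would only have to declare their names against the
neutral concepts once, rather than against every other vocabulary.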

I'm all for encouraging diversity, but if people want to share something
they need to be able to define it in such a way that it can be understood by
others - without having to worry exactly _which_ others. We are in danger of
splitting it all up into dialects understandable only to small groups - and
avoiding that, it seems to me, is exactly why we went for XML in the first place!

tim

> -----Original Message-----
> From: David Megginson [mailto:david@megginson.com]
> Sent: 03 December 1999 15:08
> To: 'xml-dev@ic.ac.uk'
> Subject: Re: RDF, again
> 
> 
> Vane Lashua <vlashua@RSGsystems.com> writes:
> 
> > The difficulty with the definitions below, for instance, is that
> > "name" is a collection of characters whose context is not clear
> > without a reference. Namespaces, it seems to me, are absolutely
> > necessary, but they tend to encourage diversity where convergence
> > would be a more enlightened tendency.
> 
> Namespaces encourage innovation.  Innovation is the first stage in
> development, and it needs to be followed by standardization where
> demand warrants.
> 
> In ordinary language, Namespaces let people invent stuff, but it's
> our responsibility to look at what's being invented and standardize
> the things that are being done over and over again.  It's a good
> idea to let the market have a say first; if you skip the innovation
> stage and try to standardize in advance, your standards will often
> miss the mark.
> 
> 
> All the best,
> 
> 
> David
> 
> -- 
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 
> 



From david at megginson.com  Fri Dec  3 14:54:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:19 2004
Subject: RDF, again
In-Reply-To: <FDFFD5C2748BD211BC500008C71E933CBDF938@uklonts01.uklo.capitalmarketscompany.com>
References: <FDFFD5C2748BD211BC500008C71E933CBDF938@uklonts01.uklo.capitalmarketscompany.com>
Message-ID: <14407.55626.211719.951880@localhost.localdomain>

Shaw Tim writes:

 > Given a piece of data, described by a name within a namespace, I would like
 > to be able to determine the 'equivalent' data (and its name) within another
 > namespace. I don't mind 'finding' the data, but I need to be able to
 > determine its 'new' name within the new namespace.
 > I think that, the way things are going (and have always been), it's
 > necessary to have such a mechanism. From my (limited) understanding of RDF,
 > it's not able to give me the 'hook' to do this.

See

  http://www.w3.org/TR/PR-rdf-schema

This has long been delayed from going to REC for political rather than 
technical reasons, but I think that things are looking a little
sunnier now.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rja at arpsolutions.demon.co.uk  Fri Dec  3 15:19:09 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:19 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <m3r9h4hzyb.fsf@localhost.localdomain>
Message-ID: <008601bf3da1$5b6e5690$c5010180@p197>

> Actually, I don't see any strong argument not to provide empty inline
> implementations for the handler callbacks:
>
>     virtual void startDocument (void) {}
>     virtual void endDocument (void) {}
>     virtual void startElement (const char * name, const AttributeList &atts) {}
>     virtual void endElement (const char * name) {}

I think the "= 0" is far more normal in C/C++.  I personally prefer that
approach because if I derive a class from the parser class and make a typo
in a method name in the derived class, I get a compile error rather than
spending time figuring out that the empty base class implementation is being
called.  Having a handler base is fine by me.

If we do go for the "= 0" approach I suggest the class be prefixed with an
"I" (for interface).  Those who work with COM will understand the rationale
behind that ;)
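
The trade-off can be sketched in a few lines. Python is used here only to
keep the example short, and it differs from C++ in one respect: the abstract
version fails when the object is created rather than at compile time.
Python's own xml.sax.ContentHandler happens to follow the empty-default
style; TypoHandler, IDocumentHandler, StrictTypoHandler and parse_with are
hypothetical names made up for this sketch.

```python
import xml.sax
from abc import ABC, abstractmethod


class TypoHandler(xml.sax.ContentHandler):
    """Empty-default style: the base class supplies no-op callbacks."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElemnt(self, name, attrs):  # typo: the parser never calls this
        self.seen.append(name)


class IDocumentHandler(ABC):
    """'= 0' style: every callback must be supplied by the subclass."""

    @abstractmethod
    def startElement(self, name, attrs):
        ...

    @abstractmethod
    def endElement(self, name):
        ...


class StrictTypoHandler(IDocumentHandler):
    def startElemnt(self, name, attrs):  # same typo as above
        pass

    def endElement(self, name):
        pass


def parse_with(handler, text):
    """Run the standard-library SAX parser over text with the given handler."""
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler
```

Parsing with TypoHandler succeeds silently and records nothing, which is the
debugging cost described above. Creating a StrictTypoHandler raises
TypeError immediately because startElement is still abstract.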


> I appreciate James sharing some of his C++ experience here.  This
> sounds like a good idea to me, but I'm at best a C++ journeyman, so
> I'd be happy to hear from other masters on the list.

The other option is :

protected:
   virtual ~CSAXParser() {}

Of course, it is nice to be able to delete things, so the class should
probably include a release method ("delete" itself is a reserved word in
C++, so it cannot be used as the method name):

    virtual void release() = 0;

That can then be implemented using reference counting if so desired or a
straight delete.

> > - use iosfwd not istream as the header file

I've not used iosfwd, but why not just define a SAX input stream class :

class CInputStream
{
public:

    virtual int ReadChar( unsigned char& ch ) = 0;
    ...
};

I for one have implemented my SAX support this way so people using the
toolkit can implement streams however they see fit, esp. if STL support on
their platform is a bit shaky.
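
As a rough sketch of that design (in Python for brevity, not the C++ class
above): the parser depends only on a tiny abstract stream, and each user
supplies an implementation. The names InputStream, BytesInputStream and
read_all are hypothetical, and returning None at end of input is an
assumption, since the return convention of ReadChar is not spelled out here.

```python
from abc import ABC, abstractmethod


class InputStream(ABC):
    """Parser-facing byte source, independent of any stream library."""

    @abstractmethod
    def read_char(self):
        """Return the next byte as an int, or None at end of input."""


class BytesInputStream(InputStream):
    """One possible implementation, backed by an in-memory bytes object."""

    def __init__(self, data):
        self._data = data
        self._pos = 0

    def read_char(self):
        if self._pos >= len(self._data):
            return None
        byte = self._data[self._pos]
        self._pos += 1
        return byte


def read_all(stream):
    """Drain any InputStream, the way a parser's input loop might."""
    out = bytearray()
    while True:
        byte = stream.read_char()
        if byte is None:
            break
        out.append(byte)
    return bytes(out)
```

A file-backed or socket-backed stream would subclass InputStream in the same
way, and the code that consumes the stream would not need to change.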

> > - solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
> > and a SAX_UTF16 namespace (since you're using std::istream, you are
> > assuming compiler support for namespaces); this will work nicely with
> > namespace aliases (eg namespace SAX = SAX_UTF8).

We've had problems with namespaces not being supported by some compilers,
so it is best to avoid them.  That is the reason why I suggest all
interface/class names are prefixed with SAX, CSAX or better still ISAX.

Regards,

Richard.




From Colas.Nahaboo at sophia.inria.fr  Fri Dec  3 15:32:32 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun  7 17:18:19 2004
Subject: [SML] Whether to support Attribute or not? 
In-Reply-To: Your message of "Mon, 29 Nov 1999 02:59:23 EST."
             <Pine.LNX.4.10.9911290225570.2129-100000@cauchy.clarkevans.com> 
Message-ID: <199912031531.QAA22952@aye.inria.fr>


"Clark C. Evans" writes:
> Take the following HTML fragment:

>   <table border="2" cellpadding="50">
>     <tr><td>One</td><td>Two</td></tr>
>     <tr><td colspan="2">Three</td></tr>
>   </table>

> I clearly see the different role that content plays
> as opposed to attributes. The border and cellpadding
> attributes *modify* the state of the table; where
> the tr element content is *part-of* the table.

mmm, if you really look at it, things are muddied because you forget that
<table> is an object having a field "rows", which is a list of elements of
"type" <tr>, and that <tr> is an object having a field named "cells" having a
list of <td>s as values.

Now, replace the word "field" with "attribute" and you see that content is
actually a kind of attribute, with its real name omitted and implicit
(actually, made somewhat explicit in the DTD).

Everything is confused in XML because everything is of type "text", which you
mix and match everywhere.

For me, your example *should* be written in an ideal XML 2:

<table border="2" cellpadding="50"
  rows=  <tr cells=<td contents="One"/><td contents="Two"/>/>
         <tr cells=<td colspan="2" contents="Three"/>/>
/>

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas




From matthew at praxis.cz  Fri Dec  3 15:54:21 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain>
Message-ID: <3847E71D.3F32FFA9@praxis.cz>

David Megginson wrote:
> Would you require schema processing, then, for object exchange (or in
> other words, would there be no equivalent of the DTD-less XML
> document)?

Yes and no. A schema-less XML document is absolutely fine for most
things. If you need object semantics then you need a schema. The good
news is that, for things like metadata attached to documents, the number
of schemas will most likely be small and well-known. If I have the
schema for the Dublin Core on my machine already, for example, then I
can interpret any instance based on the Dublin Core. Obviously the
knowledge about the schema can be compiled into specific classes for
processing instances based on this schema, as I already mentioned. This
will no doubt be necessary for efficiency; it would be unrealistic to
reparse the schema every time an instance needs to be processed.

Cheers,
Matthew



From vlashua at RSGsystems.com  Fri Dec  3 16:06:44 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun  7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions) 
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D76@MAILPROD>

I think you're mixing apples and oranges.

An even simpler declaration of your example below -- and correct in XML --
would be:

<Point value="12in,2cm;RFFx,G0,B0" />

or:

<processingsegment lang="Java" content="Point {Length x; Length y; Color
color;};" />

or:

<?Java Point {Length x; Length y; Color color;}; ?>

XML is a storage medium. Java source code is a storage medium. XML may
contain Java source code syntax, as Java source code may contain XML syntax,
but both need processors to do more.

And by the way, SGML grew out of a world of extremely limited and narrowly
typed data processing  _and_ fixed length records. Data typing in SGML is as
simple as adding an attribute to an element declaration. It is up to a
processor to know how to use it. Just as it is in Java.

Vane

-----Original Message-----
From: Colas Nahaboo [mailto:Colas.Nahaboo@sophia.inria.fr]

Matthew Gertner writes:
> The aspects of
> object-oriented design that are missing are then inheritance and
> polymorphism. 

In my opinion, things may be more simple. If SGML/XML had not been designed
by people living in a typeless world (text documents), XML could have
provided a much better medium to express object instances, with such a
simple design as getting rid of element contents, and allowing attributes
contents to be XML, e.g:

<Point
  x=<Length unit="inches" value="12"/>
  y=<Length unit="cm" value="2"/>
  color=<RGB R=<Number base="16" value="FF"/> G="0" B="0"/>
/>

matching a C/C++/Java... declaration of Point as:

Point {
  Length x;
  Length y;
  Color color;
};

As you can see, this would be a very elegant and natural way to express
object instances (aka serialization). One would of course need a schema
language on top of that (to express what I wrote in a C-like declaration),
but having to tweak the "low-level" serialisation to fit in the current
XML 1.0 recommendation is I think the original sin of XML, which pollutes a
lot of the discussions I see here. For instance the current drive for
removing attributes results from this.
 
But, just like RDF, we could standardise on this non-XML-1.0-compatible
(let's call it GXML for Generalized XML :-) representation, and devise a
canonical way to express it in XML.  For instance:

<Point>
  <_x><Length unit="inches" value="12"/></_x>
  <_y><Length unit="cm" value="2"/></_y>
  <_color><RGB G="0" B="0">
       <_R><Number base="16" value="FF"/></_R>
  </RGB></_color>
</Point>

but we could devise others...
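
To make the canonical form concrete, here is a minimal sketch that reads it
back into nested fields using only a stock XML parser. It assumes the
convention shown above, where a child element whose name starts with "_"
wraps exactly one value element; the to_object helper and the dict layout it
returns are hypothetical choices for this example, not part of any proposal.

```python
import xml.etree.ElementTree as ET

# The canonical form from the message, with the closing </RGB> tag in place
# so that the document is well-formed.
DOC = """
<Point>
  <_x><Length unit="inches" value="12"/></_x>
  <_y><Length unit="cm" value="2"/></_y>
  <_color><RGB G="0" B="0">
       <_R><Number base="16" value="FF"/></_R>
  </RGB></_color>
</Point>
"""


def to_object(elem):
    """Fold plain attributes and '_name' wrapper children into one field dict."""
    fields = dict(elem.attrib)
    for wrapper in elem:
        if not wrapper.tag.startswith("_"):
            raise ValueError("expected a '_field' wrapper, got %r" % wrapper.tag)
        (value,) = list(wrapper)  # each wrapper holds exactly one value element
        fields[wrapper.tag[1:]] = to_object(value)
    return {"type": elem.tag, "fields": fields}


point = to_object(ET.fromstring(DOC))
```

After this runs, point["fields"]["x"] holds the Length element's unit and
value, and the R channel sits next to G and B in the RGB fields, whether it
was written as an attribute or as a wrapped element.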

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia,
http://www.inria.fr/koala/colas




From DuCharmR at moodys.com  Fri Dec  3 16:26:26 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:18:19 2004
Subject: Any XML Schemas validators out yet ?
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2781C0@MDYNYCMSX1>

>I tried this but MS Explorer 5.0.2919 reports this error in the XML Schema
>proposal:
>Attribute 'xmlns:' must be a #FIXED attribute. Line 17, Position 18
>Maybe I'm using the wrong Schema,
>"http://www.w3.org/TR/1999/WD-xmlschema-1-19991105/structures.dtd" ?

Using (X/NT)emacs+PSGML and then the IBM (and now Xerces) validating Java
parser, all the WD-xmlschema* drafts worked for me. I've only dabbled in the
most simplistic levels of XML support in IE so far, so I don't know what the
problem would be there.

Bob DuCharme          www.snee.com/bob           <bob@  
snee.com>  "The elements be kind to thee, and make thy
spirits all of comfort!" Anthony and Cleopatra, III ii



From KenNorth at email.msn.com  Fri Dec  3 16:41:47 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun  7 17:18:19 2004
Subject: Rocket framework for creating Web sites
Message-ID: <000001bf3dad$a0c0f520$0b00a8c0@grissom>

Michael Floyd just released the Rocket framework:

"In a nutshell, Rocket is a collection of skeleton XML documents, XSL style
sheets, and DTD's that you can use as a basis for creating your own
XML-based Web site. Using Rocket, you can transform XML documents and serve
them to any browser, regardless of its capabilities. Rocket also allows you
exchange XML streams between XML-capable browsers and HTTP servers."

Check his BeyondHTML site:
http://www.beyondhtml.com/rocket/rocket.xml




From wunder at infoseek.com  Fri Dec  3 16:55:54 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:19 2004
Subject: A processing instruction for robots
In-Reply-To: <u1z941b1f.fsf@lanber.cam.citrix.com>
References: <Walter Underwood's message of "Thu, 02 Dec 1999 13:58:58 -0800">
 <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>

At 12:09 PM 12/3/99 +0000, Toby Speight wrote:
>
>It may be an idea to provide a NOTATION identifier for the processing
>instruction, rather than binding it to the specific word "robots".  It
>depends on the trade-off you want to make between implementor convenience
>and author generality.  If you've thought about it and decided against,
>it's probably worth a comment in your proposal explaining your rationale.

Good point. Since the target of the PI is "any robot that cares",
the notation would need to point to something other than a 
particular robot, probably the spec. That is a namespace-like
use of the notation.

In that case, should the spec require that processors check for
the correct notation before interpreting the PI?

I'm a bit wary of making things more complex. The robot world is 
the natural home of the Desperate Perl Hacker, so I'd like the
spec to be understandable in 30 seconds or less.
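
For a sense of how little code the simple variant needs, here is a sketch of
a robot that just looks for processing instructions whose target is the
literal word "robots". The PI content used in the usage note (index="no") is
invented for illustration and is not taken from the proposal; RobotsPIHandler
and robots_instructions are hypothetical names.

```python
import xml.sax


class RobotsPIHandler(xml.sax.ContentHandler):
    """Collect the data of every PI whose target matches a fixed word."""

    def __init__(self, target="robots"):
        super().__init__()
        self.target = target
        self.instructions = []

    def processingInstruction(self, target, data):
        # No notation lookup: the target name alone decides.
        if target == self.target:
            self.instructions.append(data)


def robots_instructions(xml_bytes):
    """Return the data strings of all matching PIs in the document."""
    handler = RobotsPIHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.instructions
```

Given a document that starts with <?robots index="no"?>, the function
returns a list containing that PI's data and ignores PIs with other targets.
Checking a NOTATION declaration first would mean reading the DTD as well,
which is the extra complexity being weighed here.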

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946



From clark.evans at manhattanproject.com  Fri Dec  3 17:01:39 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:19 2004
Subject: [SML] Whether to support Attribute or not? 
In-Reply-To: <199912031531.QAA22952@aye.inria.fr>
Message-ID: <Pine.LNX.4.10.9912022359110.18460-100000@cauchy.clarkevans.com>


On Fri, 3 Dec 1999, Colas Nahaboo wrote:
> For me, your exemple *should* be written in an ideal XML 2:
> 
> <table border="2" cellpadding="50"
>   rows=	 <tr cells=<td contents="One"/><td contents="Two"/>/>
>          <tr cells=<td colspan="2"contents="Three"/>/>
> />

First John's syntax mutation, 
    <element <att <nested>val</nested> > />
and now your syntax mutation,
    <element att=<nested>val</nested> />

Pretty.  Yum.

Clark




From Colas.Nahaboo at sophia.inria.fr  Fri Dec  3 17:11:09 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun  7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions) 
In-Reply-To: Your message of "Fri, 03 Dec 1999 11:03:27 EST."
             <A51F7543E295D2118D6600A024CDB2F71B9D76@MAILPROD> 
Message-ID: <199912031710.SAA22137@koala.inria.fr>


Vane Lashua writes:
> I think you're mixing apples and oranges.

I see it the other way: I try to make people realize that they are the same,
and the current artificial limits in the XML syntax make people stumble on
artificial syntax problems.

> An even simpler declaration of your example below -- and correct in XML --
> would be:
> <Point value="12in,2cm;RFFx,G0,B0" />

This is not XML. You invent a sub-language to describe the contents of the
value attribute. You will then need XML and an XML parser to understand the
outer XML, and you will have to invent and specify the inner language, and
design and implement the parser, which is *more* complex than a unified "XML
2" language. (Note that SVG did just that with the contents of the path
element :-). People tend to invent plenty of these sub-languages and mentally
hide them under the rug, failing to see that they did not simplify anything,
just made things more complex at more places in many different - and often
unspecified - ways.
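
The extra work is easy to see in code. In the sketch below the XML parser
hands back the value attribute as one opaque string, and a second,
hand-written parser is needed for the inner language. The grammar is a guess
made for illustration (lengths as number plus unit, colour channels as a
letter plus digits with a trailing "x" meaning hexadecimal); nothing in the
thread actually specifies it, which is rather the point.

```python
import re
import xml.etree.ElementTree as ET

# Assumed grammar of the inner language (not specified anywhere):
#   value    := lengths ";" channels
#   lengths  := length ("," length)*     e.g. 12in
#   channels := channel ("," channel)*   e.g. RFFx (hex) or G0 (decimal)
LENGTH = re.compile(r"(\d+)([a-z]+)")
CHANNEL = re.compile(r"([RGB])([0-9A-Fa-f]+)(x?)")


def parse_point_value(value):
    """Second-level parser for the sub-language inside the attribute."""
    lengths_part, colour_part = value.split(";")
    lengths = []
    for item in lengths_part.split(","):
        match = LENGTH.fullmatch(item)
        if match is None:
            raise ValueError("bad length: %r" % item)
        lengths.append((int(match.group(1)), match.group(2)))
    colour = {}
    for item in colour_part.split(","):
        match = CHANNEL.fullmatch(item)
        if match is None:
            raise ValueError("bad colour channel: %r" % item)
        base = 16 if match.group(3) else 10
        colour[match.group(1)] = int(match.group(2), base)
    return lengths, colour


def read_point(xml_text):
    """The XML parser only gets as far as the raw attribute string."""
    element = ET.fromstring(xml_text)
    return parse_point_value(element.get("value"))
```

Every attribute that carries its own mini-language needs a pair of regular
expressions and a function like parse_point_value, each with its own
unwritten rules.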

> XML is a storage medium. Java source code is a storage medium. XML may
> contain Java source code syntax, as Java source code may contain XML syntax,
> but both need processors to do more.

Yep, but if you look at my example, you could see that I got rid of any
sub-language!!! I only need an XML parser, nothing else at the parsing level.
I still need the upper semantic level, of course, but at least I don't have to
have plenty of different lexical parsers (and specs) to describe my data.
The SVG example is striking. To implement an SVG viewer, you need to have an
XML parser, plenty of other parsers to parse the sub-languages invented in the
different attributes and contents of the SVG XML, including a full CSS and
HTML parser...

Note that I described only the object instances, NOT the class structures
(this belongs to schemas, not to the XML level), and they are not Java; they
could represent C++, Common Lisp, Python,... objects!

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas




From jtauber at jtauber.com  Fri Dec  3 17:14:24 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz>
Message-ID: <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com>

> Yes and no. A schema-less XML document is absolutely fine for most
> things. If you need object semantics then you need a schema.

But a schema doesn't tell you the semantics (although certain schema
languages might tell you how certain element types relate to others).

When a human devises FooML, they generally come up with a vocabulary of
labels (perhaps made universally unique via namespaces), a bunch of
syntactic constraints (a schema), some human prose describing what the
labels mean, and maybe some code for doing cool stuff with FooML documents.
(Ultimately the human prose would probably be used by other people to write
code for doing cool stuff, too)

The semantics are in the human prose and the actions the code performs. Not
the schema.

James Tauber




From Toby.Speight at streapadair.freeserve.co.uk  Fri Dec  3 17:45:10 1999
From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight)
Date: Mon Jun  7 17:18:19 2004
Subject: A processing instruction for robots
In-Reply-To: Walter Underwood's message of "Fri, 03 Dec 1999 08:55:16 -0800"
References: <Walter Underwood's message of "Thu, 02 Dec 1999 13:58:58 -0800"> <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <u1z941b1f.fsf@lanber.cam.citrix.com> <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>
Message-ID: <uogc7ncky.fsf@lanber.cam.citrix.com>

Walter> Walter Underwood <URL:mailto:wunder@infoseek.com>

0> In article <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>,
0> Walter wrote:

Walter> At 12:09 PM 12/3/99 +0000, Toby Speight wrote:

>> It may be an idea to provide a NOTATION identifier for the
>> processing instruction, rather than binding it to the specific
>> word "robots".  It depends on the trade-off you want to make
>> between implementor convenience and author generality.  If
>> you've thought about it and decided against, it's probably
>> worth a comment in your proposal explaining your rationale.

Walter> Good point.  Since the target of the PI is "any robot that
Walter> cares", the notation would need to point to something other
Walter> than a particular robot, probably the spec.  That is a
Walter> namespace-like use of the notation.

That's exactly how I see it, too (though I don't want to use the
phrase "point to" as I consider myself strictly on the fence of the
great "Namespace As Locator" debate - "identify" may be a better
word).


Walter> In that case, should the spec require that processors check
Walter> for the correct notation before interpreting the PI?

Not only that, you may choose to specify that a PI with that notation
should be honoured, *no matter what the local name is in the document*.
I don't think this would fly with the DPH, though.


Walter> I'm a bit wary of making things more complex.  The robot world
Walter> is the natural home of the Desperate Perl Hacker, so I'd like
Walter> the spec to be understandable in 30 seconds or less.

This is the argument against doing it the completely generalised,
indirect way (the SGML Way).  It may mean that you have to decide that
it's asking too much to expect the indexers to process PIs according
to their notations.

What I'm saying is that this has to be an informed decision, and the
spec should clearly report the alternatives considered - but I know
*I'm* not qualified to actually decide.

-- 




From matthew at praxis.cz  Fri Dec  3 17:43:47 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com>
Message-ID: <38480038.6028FEF2@praxis.cz>

James Tauber wrote:
> 
> > Yes and no. A schema-less XML document is absolutely fine for most
> > things. If you need object semantics then you need a schema.
> 
> But a schema doesn't tell you the semantics (although certain schema
> languages might tell you how certain element types relate to others).

I posted a mail earlier today entitled "Web Vision". It explains more
clearly what I have in mind. It is a relatively new idea but I feel that
it is possible to pack a lot of useful semantics into a schema.

Matthew



From Erlend.Overby at usit.uio.no  Fri Dec  3 18:04:18 1999
From: Erlend.Overby at usit.uio.no (Erlend Øverby)
Date: Mon Jun  7 17:18:19 2004
Subject: SGML the next big thing?
Message-ID: <NABBLDAMACKBNFGHOFJCEEGKENAA.Erlend.Overby@usit.uio.no>


I think it is time to take a short break, raise the head and have
a look at the XML-landscape.

To me it seems that we are trying to reinvent SGML, but in
a much more complex way.

From SGML we need the following features.
 - Groves
 - HyTime
 - TopicMaps
 - Subdoc
 - Architectural Forms
 - #CONREF
 - #NOTATIONS
 - #CURRENT
 - Public Identifiers

What we don't need from the SGML standard:
 - SGML Declaration
 - Character entities
 - Minimisation
 - The "&" construct

Getting rid of these "features" will make it much easier to
process and implement systems based on SGML.

From XML we need the following features:
 - Unicode
 - /> for empty elements
 - The concept of well formed
 - XSL
 - XSLT

I would like to try to show how some of the features from SGML could
be used:

Architectural forms:
To be able to exchange information in a proper manner, the
industry has to agree on an Architecture (Common DTD). This will
avoid the need for transformations between different DTDs, since
the information is based on the same architecture.

#NOTATIONS:
This could be used to inform the processor about what kind of
information this is. It could be of type MathML, Chemical Markup,
HTML, TIFF etc. Or the information should be processed after
the ISO8601 DATE specifications etc.

Just to give an idea.


It is time to combine the best from SGML and XML.

Today XML is too simple; it lacks several important features needed
in a structured information environment. By combining the best from
SGML and XML we will have a new working standard. This new
combination should be the preferred platform for everyone who works
with structured information, documents or data.

The best XML has done for the information community is the
awareness of structured information, and how important this is for
the business case. Now it is time to sell SSGML
(Simplified Standard Generalized Markup Language) :-)


Btw: Charles Goldfarb did not invent SGML, he discovered it :-)


Best
Erlend Øverby

--
Thinking is the hardest work there is, which
is probably the reason why so few engage
in it.
                                  (Henry Ford)




From uche.ogbuji at fourthought.com  Fri Dec  3 18:08:58 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:19 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
In-Reply-To: Your message of "Thu, 02 Dec 1999 16:27:42 EST."
             <14406.58446.675568.388482@localhost.localdomain> 
Message-ID: <199912031808.LAA14313@localhost.localdomain>

> I think that there is a growing need for a common C++ SAX 1.0
> interface as XML moves more and more into high-performance
> environments.  I have kept pointers that people sent to quite a few
> existing attempts, but before I look those over, I'd like to try my
> own off the top of my head.
> 
> I'll be posting three follow-up messages on SAX/C++ to stimulate
> discussion:
> 
> 1. Some C++-specific SAX design principles.
> 2. Implementation changes required or possible in C++.
> 3. My first stab at a core SAX 1.0 C++ interface.
> 
> I know that SAX2 is still being neglected, and I apologize.

I would really like you to reconsider this ordering of priorities.  SAX2 is 
urgently needed for DOM implementors and developers of XSLT engines with 
streaming output.  You have done an admirable job of leading the SAX and SAX2 
development, and it is dying without your output.  Now if you were simply 
buried with 9-5 work, and couldn't lend your efforts, it would be 
understandable.  But to detract from SAX2 in order to focus on SAX/C++, I 
think is muddling the priorities.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From david at megginson.com  Fri Dec  3 18:14:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:19 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
In-Reply-To: <199912031808.LAA14313@localhost.localdomain>
References: <14406.58446.675568.388482@localhost.localdomain>
	<199912031808.LAA14313@localhost.localdomain>
Message-ID: <14408.2087.817611.250771@localhost.localdomain>

uche.ogbuji@fourthought.com writes:

 > I would really like you to reconsider this ordering of priorities.
 > SAX2 is urgently needed for DOM implementors and developers of XSLT
 > engines with streaming output.  You have done an admirable job of
 > leading the SAX and SAX2 development, and it is dying without your
 > output.  Now if you were simply buried with 9-5 work, and couldn't
 > lend your efforts, it would be understandable.  But to detract from
 > SAX2 in order to focus on SAX/C++, I think is muddling the
 > priorities.

What in SAX2 is most urgently needed for DOM and XSLT?  I know that
DOM level one *can* support some things that SAX doesn't report (such
as comments and CDATA section boundaries), but there is nothing in DOM
level one that says those have to be included, and I've heard of
relatively few real-world applications that need that information.

I'm not as familiar with the situation in XSLT, and information would
be helpful.


All the best,


David



From david at megginson.com  Fri Dec  3 18:14:27 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:19 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
In-Reply-To: <199912031808.LAA14313@localhost.localdomain>
References: <14406.58446.675568.388482@localhost.localdomain>
	<199912031808.LAA14313@localhost.localdomain>
Message-ID: <14408.2102.136185.50050@localhost.localdomain>

uche.ogbuji@fourthought.com writes:

 > I would really like you to reconsider this ordering of priorities.
 > SAX2 is urgently needed for DOM implementors and developers of XSLT
 > engines with streaming output.  You have done an admirable job of
 > leading the SAX and SAX2 development, and it is dying without your
 > output.  Now if you were simply buried with 9-5 work, and couldn't
 > lend your efforts, it would be understandable.  But to detract from
 > SAX2 in order to focus on SAX/C++, I think is muddling the
 > priorities.

What in SAX2 is most urgently needed for DOM and XSLT?  I know that
DOM level one *can* support some things that SAX doesn't report (such
as comments and CDATA section boundaries), but there is nothing in DOM
level one that says those have to be included, and I've heard of
relatively few real-world applications that need that information.

I'm not as familiar with the situation in XSLT, and information would
be helpful.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jtauber at jtauber.com  Fri Dec  3 18:18:25 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:19 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com> <38480038.6028FEF2@praxis.cz>
Message-ID: <014a01bf3dba$dfff38c0$eb020a0a@bowstreet.com>

> > But a schema doesn't tell you the semantics (although certain schema
> > languages might tell you how certain element types relate to others).
>
> I posted a mail early today entitled "Web Vision". It explains more
> clearly what I have in mind. It is a relatively new idea but I feel that
> it is possible to pack a lot of useful semantics into a schema.

Are you achieving this by expressing how certain element types relate to
other element types and to concepts? A semantic network?

If so, you are still ultimately relating the elements to concepts you are
probably going to define by human prose or running code.

I'm not arguing with this idea. I think it probably has some promise. But
the real semantics are ultimately introduced into the system by agreed to
concepts that aren't expressed via schemata. A schema is part of the
picture, but not the whole.

I'll go back and read your Web Vision post.

James




From jtauber at jtauber.com  Fri Dec  3 18:20:08 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:20 2004
Subject: SGML the next big thing?
References: <NABBLDAMACKBNFGHOFJCEEGKENAA.Erlend.Overby@usit.uio.no>
Message-ID: <015501bf3dbb$1de0be70$eb020a0a@bowstreet.com>


> Btw: Charles Goldfarb did not invent SGML, he discovered it :-)

Sounds like another Ted Nelson quote.

James




From david at megginson.com  Fri Dec  3 18:22:56 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <14408.2610.245842.199581@localhost.localdomain>

I'd like to hear what others think on this issue.  There was some
interest in SAX2 when I posted my alpha interfaces a few months back
(most notably, but not exclusively, from David Brownell), but it was
hardly a tidal wave.  On the other hand, I am noticing a building
pressure from implementors to get something out in C++.

I can think of a few reasons that the world might desperately be
waiting for SAX2:

1. To get some kind of standard Namespace support (or at least a way
   to tell whether a parser has Namespace support built in).

2. To query parser features in general.

3. To get at the stuff that SAX 1.0 doesn't report, like comments,
   CDATA boundaries, and DTD declarations.
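
For later readers, items 1 and 2 can be illustrated with Python's
standard xml.sax module, which is documented as a SAX2 binding.  This is
only a sketch of what feature querying and Namespace reporting look like
in one binding, not the alpha interfaces discussed in this thread; the
NamespaceLog class and the sample document are made up for illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NamespaceLog(ContentHandler):
    """Hypothetical handler: records (namespace URI, local name) pairs."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # With Namespace processing on, name is a (uri, localname) tuple.
        self.names.append(name)


parser = xml.sax.make_parser()

# Item 2: ask the parser whether a feature is switched on.
before = bool(parser.getFeature(feature_namespaces))

# Item 1: request Namespace processing before parsing starts.
parser.setFeature(feature_namespaces, True)
after = bool(parser.getFeature(feature_namespaces))

handler = NamespaceLog()
parser.setContentHandler(handler)
doc = b'<doc xmlns="urn:example:a" xmlns:b="urn:example:b"><b:item/></doc>'
parser.parse(io.BytesIO(doc))
```

After the parse, handler.names holds one (URI, local name) pair per
element, so an application never has to resolve prefixes itself.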

I think that there is a real need for #1, since many other specs (XSL,
XML Schema, RDF, XHTML, etc.) are built on top of Namespaces.  I think
that #2 would make life a fair bit easier for library developers, but
it's not as critical (Simon St-Laurent will be grateful, though).

I have a lot of trouble with #3, though.  There are a few specialized
fields where this stuff isn't just syntactic fluff (repositories and
editing tools spring immediately to mind), but in general, very, very,
very few real-world XML applications need to know about anything but
elements, attributes, and character data -- witness the recent SML
discussion.

I'm very interested in hearing other opinions.  Having a standard
streaming interface stimulated a lot of development of reusable Java
XML processing components, and I'd like to see the same thing happen
in C++, but I need to hear what other people think the priorities
should be.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From clark.evans at manhattanproject.com  Fri Dec  3 18:38:11 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:20 2004
Subject: SGML the next big thing?
In-Reply-To: <NABBLDAMACKBNFGHOFJCEEGKENAA.Erlend.Overby@usit.uio.no>
Message-ID: <Pine.LNX.4.10.9912030138330.18460-100000@cauchy.clarkevans.com>


On Fri, 3 Dec 1999, Erlend Øverby wrote:
> Btw: Charles Goldfarb did not invent SGML, he discovered it :-)

I believe in this whole-heartedly.

Clark




From clark.evans at manhattanproject.com  Fri Dec  3 19:15:13 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:20 2004
Subject: FYI: YML: A Grand Unification of SAX and DOM? (fwd)
Message-ID: <Pine.LNX.4.10.9912030214230.18460-100000@cauchy.clarkevans.com>


Hello everyone,

I'd like to carry on a sub-discussion of SML called
YML on the SML list.  It is of particular importance
to the interaction between XML and XSL.

Clark

---------- Forwarded message ----------
Date: Fri, 3 Dec 1999 02:13:25 -0500 (EST)
From: Clark C. Evans <clark.evans@manhattanproject.com>
To: sml-dev@eGroups.com
Subject: YML: A Grand Unification of SAX and DOM?

Paul,

I didn't want to speak up a few days ago -- claiming that I was 
going to do a grand unification of SAX and DOM (even though this 
was exactly what I was thinking may be the case).

On Thu, 2 Dec 1999, Paul Tchistopolskii wrote:
> > Isn't it  the end of long discussion of Elements vs Attributes?
> > Now when I see the question: "Should I use attributes or
> > elements?" - I know the answer:
> >
> > "If you want it to be processed by current APIs not keeping
> > the entire document in the memory  - use attributes everywhere
> > you can."

Exactly...  It is a matter of what type of access you would 
like to have when processing the information -- you have two 
choices:   Random Access  (DOM) or Sequential Access (SAX)

The trick, however, is subtle.  You don't want _all_ random 
access nor do you want _all_ sequential access.  You want a 
balance.  And this is where the binary doubly recursive
pattern comes in. 

On Thu, 2 Dec 1999, G. Ken Holman wrote:
> I posit that expressing properties of hierarchical components as
> sub-elements of ancestry does not work well in information design 
> for the following reasons:
>
> (1) - I claim that the information in <b> has no (and should have no)
> direct relation to the information in <e>, but that the information in
> attr1= may or may not have direct relation to the information in <e>
> because of descendent scope (<e> is a descendant of <a> but not of <b>
> so how could <b>'s "influence" be construed as impacting on <e>?)
>
> (2) - when I am processing <e> (say with XSLT and XPath) I can very
> easily determine properties of <e> by examining ascendent places in 
> the hierarchy (<a> is an ascendant of <e> therefore <a> and its
> attributes are easily addressed via the ancestor:: axis)
>   - if I didn't have attributes and I was obliged to use sub-elements,
> the extra processing involved to examine all child elements of all
> ancestors for possible applicable properties would be both unwieldy 
> and ambiguous    

This is a great summary.  Over the last day I jotted down a rough
sketch of the idea I've had running through my head for the last
few days.  It is posted at

   http://clarkevans.com/yml.html

With a text version below.  It's *far* from perfect, but I wanted 
to get the idea out as a cohesive unit so we can work on it
as a community.

I'd like to hear what you all think...

Clark


--------------------------------- 
YML - The Why Markup Language
---------------------------------- 
Authors:  Clark Evans 
History:  Version .1, 03-DEC-1999 

Summary

  YML is currently an assembly of thoughts regarding
  the creation of a doubly recursive markup language
  and parser description.  YML is an extension of
  the simple markup language ("SML"), which is a
  strict subset of the extensible markup language
  ("XML").  Further, YML is a unification of the XML
  document object model ("DOM") and the simple
  application programming interface for XML ("SAX"). 

Motivation

  YML was motivated from two reoccurring debates on the
  XML list, under the titles "SAX vs DOM" and "element
  vs attribute".  It is interesting how they are interwoven.
  The SAX vs DOM debate often centers around which is
  better for processing information: random access
  method (RAM) or a sequential access method (SAM).
  Those from the DOM camp state that having the entire
  document in memory makes things easy to program; while
  those from the SAX camp point to efficiencies of
  stream processing.

  The element vs attribute debate is concerned with
  the distinction between an element's content and
  its attributes.  One camp believes that the difference
  reflects a clear contextual binary decomposition, while
  the other camp views the distinction as syntax sugar.
  These debates are subtly linked, since SAX provides
  an accepted, de facto interpretation: each element is
  delivered sequentially, and with it comes a random
  access collection of its attributes.  This
  interpretation is of huge value, as it ties the
  element vs attribute debate to a more tangible
  processing concern, sequential access or
  random access.

  SAX is of interest for one other reason.  It does
  not notify the processor individually for each
  attribute -- instead, it waits until it has
  collected each and every attribute before providing
  them as a single collection.  This is in sharp
  contrast to its treatment of elements, which are
  handled individually, one by one. 
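
  The asymmetry described above can be observed with any SAX-style
  parser.  A minimal sketch using Python's standard xml.sax module
  (the EventLog class and the sample document are illustrative
  assumptions, not part of the original proposal):

```python
import xml.sax


class EventLog(xml.sax.ContentHandler):
    """Records every callback so the delivery order can be inspected."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        # attrs arrives complete: every attribute of this element at once,
        # addressable by name in any order.
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))


def sax_events(xml_text):
    handler = EventLog()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.events


events = sax_events('<root r="x" q="y"><s1/><s2/></root>')
```

  The first event carries both attributes of root in a single
  collection, while s1 and s2 each produce their own start and end
  events in document order.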

Another Motivation: Transformation Languages

  The real value in XML isn't just data representation
  or ease of parsing, it is the promise of a
  transformation language expressed in XML itself.
  The XML style sheet language ("XSL") is one approach
  to markup language transformations.  XSL is the composite
  of many wonderful constructs, lessened by a few
  particularly bad restrictions.

  The delightful recursive template matching system is XSL's
  claim to fame.  XSL is a collection of such templates,
  where a match clause identifies an expression which
  will trigger particular elements (and not attributes) to
  be processed according to the rules provided. These 'xpath'
  expressions define multiple axes.  The ancestor axis is the
  most important; it is the current element stack.  Of secondary
  importance is the attribute axis, which allows access
  to an element's attributes.  Together, these axes allow for
  a very powerful way to identify and process elements.
  One disturbing aspect of XSL is the inclusion of the forward
  and previous axes in this xpath expression syntax.  Furthermore,
  loop constructs and the ability to re-visit elements were also
  included in the language.  For an XSL processor to reasonably
  support these features, random access to the information is
  a requirement. This is a problem for large-size information
  sets or low-memory processing devices.

  There are a few individuals who are contemplating a stream
  based alternative to XSL which will work without these large
  memory restrictions.  Assuming that SAX was the underlying
  basis for such a processor; the only items available in
  random-access memory at any given time would be an element
  stack, including each element's attributes.  This is hardly
  good enough to be efficient.  An extension to a minimal
  processor built on top of SAX, could enable those elements
  on the "previous" axis -- as long as they are mentioned
  somewhere in the stylesheet.  A smart collector could identify
  an element which must be used later, pinning it for random
  access in the future.  Unfortunately, this method would not
  work well with dynamically generated xpath expressions.
  There are many other concerns as well, such as how to
  accomplish sorting, repeat performances, and other
  clear benefits that the random access brings.  However,
  so far, there has not been a clear approach. 

Direction

  The goal of YML is to be a building block upon which
  an alternative to XSL can be built.  One which is more
  space efficient than XSL, yet one that does not sacrifice
  time as a pure stream based alternative appears to do.
  To accomplish this, memory must be managed differently
  at the parser level; thus a new parser description ("PD")
  must be provided -- one that balances the constraints of
  SAX with the power of DOM.  And, to accomplish this,
  the syntax of the markup language ("ML") itself must be
  substantially altered.  Strictly speaking, the ML could
  easily be an XML extension, however, the data model
  presented here would be too hard to grok with all of XML's
  subtleties.

  These are serious changes, however, if it is possible to
  unify SAX and DOM, and perhaps enable the generation of a
  better transformation processor, it may be worth it. 

Background

  Consider these included by reference:
  http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/1120.html
  http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/1136.html
  http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Nov-1999/1205.html
  http://www.egroups.com/group/sml-dev/31.html
  http://www.egroups.com/group/sml-dev/89.html 

Development

  Consider an enhanced SAX parser with an element stack enabling
  the new XSL processor random access to the entire ancestor
  and attribute axes, with sequential access otherwise.

  Consider further:

    <root r="x" >
      <s1/>
      <s2/>
    </root>

  Here, both sequential access nodes s1 and s2 have random access
  to the node r.  However, these nodes cannot access each other
  since they are provided sequentially: When the s1 is visited,
  s2 has not yet been provided.  Also, when s2 is visited, s1
  has already been dropped from memory.
  Note, that this is recursive:

    <root r="x">
      <s sr="a">
        <ss/>
      </s>
    </root>

  Here it is clear that the node ss can access node
  s, sr, and r.  So far so good.  Let's enumerate the
  possible node types:

       s, r, sr, ss, ssr, sss, sssr, ...
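
  The "enhanced SAX parser with an element stack" can be approximated
  on top of an ordinary SAX parser.  The following is a minimal sketch
  using Python's standard xml.sax module; the StackHandler class and
  the visible_on_arrival helper are hypothetical names introduced
  here.  It records what each element can reach at the moment it is
  visited.

```python
import xml.sax


class StackHandler(xml.sax.ContentHandler):
    """Keeps the open elements (with attributes) on a stack."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements: (name, attributes)
        self.visible = {}  # element name -> stack contents on arrival

    def startElement(self, name, attrs):
        self.stack.append((name, dict(attrs.items())))
        # Everything on the stack is randomly accessible right now.
        self.visible[name] = list(self.stack)

    def endElement(self, name):
        # A finished element is dropped; later siblings cannot see it.
        self.stack.pop()


def visible_on_arrival(xml_text):
    handler = StackHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.visible


flat = visible_on_arrival('<root r="x"><s1/><s2/></root>')
nested = visible_on_arrival('<root r="x"><s sr="a"><ss/></s></root>')
```

  In the first document s2 sees root and its r attribute but not s1;
  in the second, ss sees s, sr, and r, as described above.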

  Notice that given the current XML syntax, and this
  processing model, random access nodes with children
  are not allowed.  In other words, a given xpath
  may only consist of sequentially accessed nodes
  followed by an optional, random access tail node.
  It is shown below how a change in XML's syntax to
  permit recursion on the attribute axis would allow
  a parser to be built having random access nodes
  allowing children.

  It is hypothesized that this syntax change would allow
  a construction of a parser that could be used in lieu
  of both DOM and SAX, giving random access or sequential
  access in a context sensitive manner, as a function
  of the source information.

  It is further hypothesized that this parser could be
  used to build a processor that has most of XSL's
  advantages without sacrificing performance. 

The Change

  Consider the following syntax (due to John Cowan):

  <root
    <r
      <rr/>
    >
      <rs/>
    </r>
  >
    <s
      <sr/>
    >
      <ss/>
    </s>
  </root>

  With this change, it is possible to generate all of
  the possible node types:

    r s rr rs sr ss rrr rrs rsr rss srr srs ssr sss ...

  This may not be the prettiest syntax; however,
  XML becomes a sub-set of this new syntax -- where
  the following definition is used for backwards
  compatibility.

    <el att="val" /> <=> <el <att>val</att> />

  And perhaps the following syntax sugar could be
  allowed for nested attributes (due to Colas Nahaboo):

    <el att=<ch>val</ch> /> <=> <el <att <ch>val</ch>/>/>

  Further, there should be no problem for XML parsers to
  enable the recognition of the new syntax, since neither
  of the new forms above is well-formed XML.
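
  The backwards-compatibility mapping can be sketched as a small
  converter from ordinary XML to the nested notation.  This is an
  illustration only: the to_nested function is a hypothetical helper,
  it does not escape attribute values, and it ignores text that
  follows a child element.

```python
import xml.etree.ElementTree as ET


def to_nested(elem):
    """Serialise an element, writing each attribute as a child inside the tag."""
    attrs = "".join(f" <{k}>{v}</{k}>" for k, v in elem.attrib.items())
    children = "".join(to_nested(child) for child in elem)
    text = (elem.text or "").strip()
    if not children and not text:
        # Empty element: keep the empty-tag form used in the post.
        return f"<{elem.tag}{attrs} />"
    return f"<{elem.tag}{attrs}>{text}{children}</{elem.tag}>"


simple = to_nested(ET.fromstring('<el att="val" />'))
nested = to_nested(ET.fromstring('<root r="x"><s sr="a"><ss/></s></root>'))
```

  For the first input the result is the right-hand side of the
  equivalence given above.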

  Thus, this is the basis for a completely different
  parser behavior that alternates between random access
  or serial access depending upon the type of node
  which is encountered, according to something like
  the following:

    interface yml-node {
      boolean is-random();
    }
    interface yml-branch extends yml-node {
      String name();
    }
    interface yml-leaf extends yml-node {
      String text();
    }
    interface yml-stack {
      yml-node current();
      yml-stack parent();
      // list of random children
      int count();
      yml-stack child(int i);
      // private
      yml-stack(yml-stack parent, yml-node current);
      add( yml-stack child );
      complete();
    }
    interface yml-output {
      void push( yml-stack element );
      void leaf( yml-stack element );
      void pop( yml-stack element );
    }
    interface yml-input {
      // to be defined
    }
    void yml-process( yml-input in, yml-output out,
                      yml-stack stack, boolean is-random )
    {
      if(in.peek-is-leaf()) {
        yml-stack top =
          new yml-stack( stack ,
            new yml-leaf(is-random,in.next()));
        if(is-random)
          stack.add(top);
        else
          out.leaf(top);
        return;
      }
      // it's a branch
      yml-stack top =
        new yml-stack( stack,
          new yml-branch(is-random, in.next()));
      while(in.inside-the-tag())
         yml-process(in,out,top,true);
      top.complete();
      if(!is-random)
        out.push(top);
      while(in.outside-the-tag())
         yml-process(in,out,top,false);
      if(is-random)
        stack.add(top);
      else
        out.pop(top);
      return;
    }

  Thus, if the entire output uses the "random recursion"
  extreme, with only a single node (the top one) being
  sequential, then this method looks very much like DOM.
  In the other extreme, if "sequential recursion" is used,
  with an occasional attribute, then this method looks
  very much like SAX.  However, if the input stream is
  a unique mixture then the result is surprising:
  the parser configures its memory usage subject
  to the structure of the information being processed.
  Thus, a unified parser is created.
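
  A runnable approximation of the procedure above, written in Python.
  It is a sketch under stated assumptions, not a reference
  implementation: since yml-input is left undefined, the input here is
  a hypothetical pre-tokenised tree in which a leaf is a string and a
  branch is a (name, inside, outside) tuple, where "inside" lists the
  children written inside the tag and "outside" the ordinary content.
  The Frame class and the event tuples are likewise made up for
  illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Frame:
    """One stack entry, loosely the post's yml-stack plus its node."""
    name: str
    parent: Optional["Frame"] = field(default=None, repr=False, compare=False)
    random: List["Frame"] = field(default_factory=list)  # kept in memory


def process(node, parent, is_random, out):
    """Walk one node, following the control flow of yml-process."""
    if isinstance(node, str):                      # leaf
        top = Frame(node, parent)
        if is_random:
            parent.random.append(top)              # retained for random access
        else:
            out.append(("leaf", node))             # reported once, then dropped
        return top
    name, inside, outside = node                   # branch
    top = Frame(name, parent)
    for child in inside:                           # children inside the tag
        process(child, top, True, out)
    if not is_random:
        out.append(("push", name, [c.name for c in top.random]))
    for child in outside:                          # ordinary content
        process(child, top, False, out)
    if is_random:
        parent.random.append(top)
    else:
        out.append(("pop", name))
    return top


# The example document from "The Change", as a pre-tokenised tree.
doc = ("root",
       [("r", [("rr", [], [])], [("rs", [], [])])],
       [("s", [("sr", [], [])], [("ss", [], [])])])

events = []
root = process(doc, None, False, events)
```

  Note one consequence of following the pseudocode literally: rs is
  reported before root is pushed, because r (and its content) is read
  while the root tag is still open.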

  For a transformation processor built on top of this
  type of parser, it motivates an additional 'random'
  axis.  Define a sequential node as one that is visited by
  the processor's interface and is dropped from
  memory afterwards.  Define a random node as
  one not visited by the processor's interface, but
  made available through a random access method.
  Access of random nodes by sequential nodes is
  provided by the following rules: (a) A sequential
  node may reference its own random siblings, or those
  of any of its sequential ancestors.  (b) If a sequential
  node may reference a random node, then it may also
  reference any random children of the random node.
 
  Furthermore, if XML syntax compliance is absolutely
  needed -- the attribution notion could be used to
  mark random nodes:
 
  <root>
    <r random-access="yes">
      <rr random-access="yes"/>
      <rs/>
    </r>
    <s>
      <sr random-access="yes" />
      <ss/>
    </s>
  </root>
 
  Alternatively, the distinction between sequential
  or random nodes could be detailed in a DTD or some
  other schema document.  However, I feel all of these
  are kluges and that John Cowan's syntax is the best
  expression of the idea.  So the syntax becomes a bit
  more complicated... maybe.  I say make the lower level
  a bit more complicated... to simplify everything else.
 
  It may be a ways off, however, I believe that this
  binary recursive method provides a novel and
  unexpected approach to making information processors
  more efficient.
 
 
Best Wishes,
 
Clark Evans
 
 
Credits
 
Too many to mention.  First, the xml-dev, sgml-dev, and xsl lists, filled
with smart people.  Second, Dan Palanza, for introducing me to binary
recursive models.  Further, the huge amount of philosophical and
technical literature out there regarding programming and systems theory
that has shaped the manner in which I approach problems.  Thanks!




From simonstl at simonstl.com  Fri Dec  3 19:12:37 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <199912031912.OAA10962@hesketh.net>

At 01:21 PM 12/3/99 -0500, David Megginson wrote:
>I'd like to hear what others think on this issue.  There was some
>interest in SAX2 when I posted my alpha interfaces a few months back
>(most notably, but not exclusively, from David Brownell), but it was
>hardly a tidal wave.  On the other hand, I am noticing a building
>pressure from implementors to get something out in C++.

As interested as I am in seeing SAX2 emerge (see below), I'll admit that
getting SAX out in C++ is probably more immediately important.  I avoid C++
and C completely, but I get lots of queries about XML for C/C++/COM other
than IE 5.

>I can think of a few reasons that the world might desperately be
>waiting for SAX2:
>
>1. To get some kind of standard Namespace support (or at least a way
>   to tell whether a parser has Namespace support built in).
>
>2. To query parser features in general.
>
>3. To get at the stuff that SAX 1.0 doesn't report, like comments,
>   CDATA boundaries, and DTD declarations.

I think #1 is very important, but #2 makes both #1 and #3 much easier.
Those who neither need nor want namespaces may still want other features,
and need to make the queries, and once that query process is in place it's
easy to define numbers 1 and 3 as 'optional parser features'.

I was pretty pleased with the SAX2 Alpha, and think it may provide enough
of #2 that maybe #1 and #3 could be carried out as separate (but
affiliated) efforts.

Now that I'm nearly done refinishing my floors, I'm hoping to have more
time to devote to proposals like this again.  But first I have to go to
Philadelphia for that XML '99 thing.  I'd love to talk with anyone who's
interested about SAX futures there.  Meet at the bar, any bar I guess.

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From Curt.Arnold at hyprotech.com  Fri Dec  3 19:17:24 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:20 2004
Subject: SGML the next big thing?
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415549@THOR>

Erlend Øverby [Erlend.Overby@usit.uio.no] wrote

>>What we don't need from the SGML standard:
>> - SGML Declaration
>> - Character entities
>> - Minimisation
>> - The "&" construct

It looks like the XML Schema group is trying to add back the & construct.
If you have a compelling justification for continued suppression, please
rant long and loud.



From tbray at textuality.com  Fri Dec  3 19:31:13 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <3.0.32.19991203112817.014cd7b0@pop.intergate.ca>

At 01:21 PM 12/3/99 -0500, David Megginson wrote:
>1. To get some kind of standard Namespace support (or at least a way
>   to tell whether a parser has Namespace support built in).
>2. To query parser features in general.
>3. To get at the stuff that SAX 1.0 doesn't report, like comments,
>   CDATA boundaries, and DTD declarations.
>
>I think that there is a real need for #1
>I think that #2 would make life a fair bit easier for library developers
>I have a lot of trouble with #3

Agreed, on all points.  The unavailability of namespaces threatens
to make SAX unusable before too long. -Tim



From lauren at sqwest.bc.ca  Fri Dec  3 19:34:29 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:18:20 2004
Subject: SGML the next big thing?
In-Reply-To: <61DAD58E8F4ED211AC8400A0C9B46873415549@THOR>
Message-ID: <199912031930.LAA00641@mail.sqwest.bc.ca>

On 3 Dec 99, at 12:14, Arnold, Curt wrote:

> It looks like the XML Schema group is trying to add back the & construct.
> If you have a compelling justification for continued suppression, please
> rant long and loud.

How about every SGML parser author I've talked to says the & 
construct was the biggest, hardest part (which means probably the 
buggiest) of the entire parser? I think the XML WG was right in 
throwing it out of XML in the first place.


Lauren



From lauren at sqwest.bc.ca  Fri Dec  3 19:57:27 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <3.0.32.19991203112817.014cd7b0@pop.intergate.ca>
Message-ID: <199912031953.LAA21889@mail.sqwest.bc.ca>

On 3 Dec 99, at 11:32, Tim Bray wrote:

> At 01:21 PM 12/3/99 -0500, David Megginson wrote:
> >1. To get some kind of standard Namespace support (or at least a way
> >   to tell whether a parser has Namespace support built in).
> >2. To query parser features in general.
> >3. To get at the stuff that SAX 1.0 doesn't report, like comments,
> >   CDATA boundaries, and DTD declarations.
> >
> >I think that there is a real need for #1
> >I think that #2 would make life a fair bit easier for library developers
> >I have a lot of trouble with #3
> 
> Agreed, on all points.  The unavailability of namespaces threatens
> to make SAX unusable before too long. -Tim

I think SAX availability of namespaces would be useful; the DOM 
Level 2 (soon to be a Candidate Recommendation, which means 
"please implement and tell us whether it's possible") has 
namespace support and the proliferation of SAX to DOM builders 
means it would be good if SAX and DOM could support more of 
what the other needs.

I have mixed feelings about CDATA sections; they're useful for 
things like writing scripts that are embedded in XML documents, so 
I'd rather have them available, but I can see that not every 
application needs them.


Lauren



From rja at arpsolutions.demon.co.uk  Fri Dec  3 20:13:55 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:20 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
References: <14406.58446.675568.388482@localhost.localdomain><199912031808.LAA14313@localhost.localdomain> <14408.2087.817611.250771@localhost.localdomain>
Message-ID: <006501bf3dca$a4767ce0$4a5eedc1@arp01>

> What in SAX2 is most urgently needed for DOM and XSLT?  I know that
> DOM level one *can* support some things that SAX doesn't report (such
> as comments and CDATA section boundaries), but there is nothing in DOM
> level one that says those have to be included, and I've heard of
> relatively few real-world applications that need that information.

How about focusing on SAX2, and making the first C/C++ SAX interface
actually SAX2 so we kill two birds with one stone?

Regards,

Richard.




From vlashua at RSGsystems.com  Fri Dec  3 20:13:42 1999
From: vlashua at RSGsystems.com (Vane Lashua)
Date: Mon Jun  7 17:18:20 2004
Subject: Object-oriented serialization (Was Re: Some questions) 
Message-ID: <A51F7543E295D2118D6600A024CDB2F71B9D77@MAILPROD>

What's the point of defining "Point"? You are putting it in a context for a
processing engine to process. "Point" is meaningless by itself -- even
though it may be syntactically correct, in a context, with normalized attributes
and values -- without a specific processor that understands what a "Point"
is.

Let's say you want to type a value in XML. Easy. ...type="int" value="1"/> .
Or ...type="2Dpoint" value="2,2"/> (whether I say 2Dxvalue="2" 2Dyvalue="2"
is of no consequence). Both "int" and "2Dpoint" are defined somewhere else.

A Java or COBOL processor may be made to understand the type "2Dpoint", but
XML never will. It is not a processor any more than valid Java source code
is.
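
To make that concrete, here is a minimal C++ sketch of what such a processor has to do with type="2Dpoint" value="2,2". Everything in it is an assumption for illustration: the Point2D struct, the function names, and the "x,y" convention are hypothetical, chosen only to match the example attributes above. The XML parser hands over two strings; the meaning comes from code like this.

```cpp
#include <cassert>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical application-side type; XML itself knows nothing about it.
struct Point2D {
    double x;
    double y;
};

// Parses the assumed "x,y" convention used in value="2,2".
// The format is the processor's own agreement, not part of XML.
inline Point2D parsePoint2D(const std::string &text) {
    std::istringstream in(text);
    Point2D p{0.0, 0.0};
    char comma = '\0';
    if (!(in >> p.x >> comma >> p.y) || comma != ',') {
        throw std::invalid_argument("expected \"x,y\" but got: " + text);
    }
    return p;
}

// Dispatches on the type attribute and returns a printable description
// of the typed value; unknown type names are rejected.
inline std::string describeTypedValue(const std::string &type,
                                      const std::string &value) {
    std::ostringstream out;
    if (type == "int") {
        out << "int " << std::stoi(value);
    } else if (type == "2Dpoint") {
        const Point2D p = parsePoint2D(value);
        out << "point (" << p.x << ", " << p.y << ")";
    } else {
        throw std::invalid_argument("no processor registered for type: " + type);
    }
    return out.str();
}
```

Note that parsePoint2D is exactly the kind of small secondary parser that the quoted reply below objects to; the sketch takes no side, it only shows where the interpretation lives.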

Vane

-----Original Message-----
From: Colas Nahaboo [mailto:Colas.Nahaboo@sophia.inria.fr]
Sent: Friday, December 03, 1999 12:11 PM

Vane Lashua writes:
> I think you're mixing apples and oranges.

I see it the other way: I try to make people realize that they are the same,
and the current artificial limits in the XML syntax make people stumble on
artificial syntax problems.

> An even simpler declaration of your example below -- and correct in XML --
> would be:
> <Point value="12in,2cm;RFFx,G0,B0" />

This is not XML. You invent a sub-language to describe the contents of the
value attribute. You will then need XML and an XML parser to understand the
outer XML, and you will have to invent and specify the inner language, and
design and implement the parser, which is *more* complex than a unified
"XML 2" language. (Note that SVG did just that with the contents of the path
element :-). People tend to invent plenty of these sub-languages and mentally
hide them under the rug, failing to see that they did not simplify anything,
just made things more complex at more places in many different - and often
unspecified - ways.

> XML is a storage medium. Java source code is a storage medium. XML may
> contain Java source code syntax, as Java source code may contain XML syntax,
> but both need processors to do more.

Yep, but if you look at my example, you could see that I got rid of any
sub-language!!! I only need an XML parser, nothing else at the parsing level.
I still need the upper semantic level, of course, but at least I don't have to
have plenty of different lexical parsers (and specs) to describe my data.
The SVG example is striking. To implement an SVG viewer, you need to have an
XML parser, plenty of other parsers to parse the sub-languages invented in the
different attributes and contents of the SVG XML, including a full CSS and
HTML parser...

Note that I described only the object instances, NOT the class structures
(this belongs to schemas, not to the XML level), and they are not Java; they
could represent C++, Common Lisp, Python,... objects!

--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia,
http://www.inria.fr/koala/colas




From xml-dev at teleo.net  Fri Dec  3 20:20:28 1999
From: xml-dev at teleo.net (Patrick Phalen)
Date: Mon Jun  7 17:18:20 2004
Subject: Rocket framework for creating Web sites
In-Reply-To: <000001bf3dad$a0c0f520$0b00a8c0@grissom>
References: <000001bf3dad$a0c0f520$0b00a8c0@grissom>
Message-ID: <99120312225003.00844@quadra.teleo.net>

[KenNorth, on Fri, 03 Dec 1999]
:: Michael Floyd just released the Rocket framework:
:: 
:: "In a nutshell, Rocket is a collection of skeleton XML documents, XSL style
:: sheets, and DTD's that you can use as a basis for creating your own
:: XML-based Web site. Using Rocket, you can transform XML documents and serve
:: them to any browser, regardless of its capabilities. Rocket also allows you
:: exchange XML streams between XML-capable browsers and HTTP servers."
:: 
:: Check his BeyondHTML site:
:: http://www.beyondhtml.com/rocket/rocket.xml

Can someone explain how this would be an improvement over Cocoon?
http://xml.apache.org/cocoon/



From stele at fxtech.com  Fri Dec  3 20:41:38 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:20 2004
Subject: expat meant to be restartable?
Message-ID: <38482AC7.95973198@fxtech.com>

I don't think it is, but I want to check. I'd like to be able to reuse
an XML_Parser after I've called XML_Parse with isFinal set to 1.
Basically I want to go back and parse a subset of the original file,
using modified starting buffer pointer and length, but it doesn't seem
to work (I get a JUNK_AFTER_DOC_ELEMENT error). I would like to avoid
creating a new parser for each element subtree I scan.

--
Paul Miller - stele@fxtech.com



From stele at fxtech.com  Fri Dec  3 20:50:07 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:20 2004
Subject: simple DOM-style XML parser, in C++?
Message-ID: <38482CCC.756549B3@fxtech.com>

I just "discovered" this list, and I wanted to check into the existence
of a simple parser that provides DOM-like access, before I implemented
my own (on top of expat). I've heard of SAX, and wonder if the C++
interface does anything like what I want to do.

I'm looking for a simple query interface that provides generic access to
elements and attributes, possibly with iterator-style access.

Here is an example XML file, and how I would like to go about parsing
it:

<Container name="foo" type="bar">
	<Foo name="element" length="42"/>
	<Object name="object1">
		<SubObject/>
	</Object>
	<Object name="object2">
		<SubObject/>
	</Object>
</Container>

// open file (somehow)
XML::File file(filename);
// search for a top-level element
XML::Element element = file.GetElement("Container");
// query attributes
XML::Attribute nameAttr = element.GetAttribute("name");
XML::Attribute typeAttr = element.GetAttribute("type");
// get attribute values
std::string name, type;
nameAttr >> name;
typeAttr >> type;

// look for specific sub-element
XML::Element fooElem = element.GetElement("Foo");
// read attributes directly
fooElem.GetAttribute("name") >> name;
int length;
fooElem.GetAttribute("length") >> length;

// loop over elements by iterator
XML::element_iterator it = element.begin("Object");
while (it != element.end())
{
	XML::Element &objElem = (*it);
	objElem.GetAttribute("name") >> name;
	// etc ...
	++it;	// advance, otherwise the loop never terminates
}

I think something like this is pretty straightforward without all that
DOM complexity. I think this interface can be layered on top of expat.

Does the C++ SAX interface already do something like this? *should* I be
using DOM instead for something this simple?
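
For illustration, a minimal self-contained sketch of the kind of in-memory tree such an interface could sit on. This is an assumption-laden sketch, not an existing library: Element, Attribute, SetAttribute, AddChild, Children and BuildExample are hypothetical names, and the tree is built by hand here where a real version would presumably fill it from expat's start/end element callbacks. A vector of matching children stands in for the element_iterator.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

namespace XML {

// Hypothetical attribute wrapper: keeps the raw text, converts on extraction.
class Attribute {
public:
    explicit Attribute(std::string value = "") : value_(std::move(value)) {}
    const std::string &str() const { return value_; }
private:
    std::string value_;
};

// Extract into any stream-readable type (int, double, ...).
template <typename T>
const Attribute &operator>>(const Attribute &attr, T &out) {
    std::istringstream in(attr.str());
    in >> out;
    return attr;
}

// Strings are copied whole so embedded spaces survive.
inline const Attribute &operator>>(const Attribute &attr, std::string &out) {
    out = attr.str();
    return attr;
}

class Element {
public:
    explicit Element(std::string name) : name_(std::move(name)) {}

    const std::string &Name() const { return name_; }

    void SetAttribute(const std::string &key, const std::string &value) {
        attrs_[key] = value;
    }

    // Returns an empty Attribute when the key is absent.
    Attribute GetAttribute(const std::string &key) const {
        const auto it = attrs_.find(key);
        return Attribute(it == attrs_.end() ? std::string() : it->second);
    }

    // Appends a child and returns a reference that stays valid as more are added.
    Element &AddChild(const std::string &name) {
        children_.push_back(std::make_unique<Element>(name));
        return *children_.back();
    }

    // First direct child with the given name, or nullptr if there is none.
    const Element *GetElement(const std::string &name) const {
        for (const auto &child : children_)
            if (child->Name() == name) return child.get();
        return nullptr;
    }

    // All direct children with the given name, in document order.
    std::vector<const Element *> Children(const std::string &name) const {
        std::vector<const Element *> out;
        for (const auto &child : children_)
            if (child->Name() == name) out.push_back(child.get());
        return out;
    }

private:
    std::string name_;
    std::map<std::string, std::string> attrs_;
    std::vector<std::unique_ptr<Element>> children_;
};

}  // namespace XML

// Builds the Container example by hand (SubObject children omitted for brevity).
inline std::unique_ptr<XML::Element> BuildExample() {
    auto root = std::make_unique<XML::Element>("Container");
    root->SetAttribute("name", "foo");
    root->SetAttribute("type", "bar");
    XML::Element &foo = root->AddChild("Foo");
    foo.SetAttribute("name", "element");
    foo.SetAttribute("length", "42");
    root->AddChild("Object").SetAttribute("name", "object1");
    root->AddChild("Object").SetAttribute("name", "object2");
    return root;
}
```

With this, root->GetElement("Foo")->GetAttribute("length") >> length reads the attribute as an int, and a range-for over root->Children("Object") visits both Object elements. The design choice is to own children through std::unique_ptr so references handed out by AddChild are not invalidated when the vector grows.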

--
Paul Miller - stele@fxtech.com



From mbrady at nist.gov  Fri Dec  3 21:07:27 1999
From: mbrady at nist.gov (Mary Brady)
Date: Mon Jun  7 17:18:20 2004
Subject: DOM ECMAScript Test Suite
References: <199912031953.LAA21889@mail.sqwest.bc.ca>
Message-ID: <015501bf3dd2$ec275390$293b0681@ncsl.nist.gov>

Hi Everyone,

I've just updated our DOM ECMAScript test suite, available from

        http://www.nist.gov/xml/

Click on DOM Test Suite.  This suite includes ~900 ECMAScript 
tests that exercise the DOM Level 1 Fundamental, Extended, and
HTML interfaces.  You can view the results using IE5 by clicking
on first the category, and then the particular interface.  Options are
available for displaying the source code, semantic requirements 
(which are simply axioms we glean from the spec to organize our
thoughts), and the actual specification.

Please let me know if you find this useful.  We are in the process of 
generating equivalent functionality for the Java binding.  We are just
about finished with the fundamental interfaces, and expect to have a
first set, including fundamental and extended, available in early January.

As always, comments/suggestions are greatly appreciated.

Mary Brady
NIST, Conformance Testing
mbrady@nist.gov




From donpark at docuverse.com  Fri Dec  3 21:11:59 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <000e01bf3dd3$15c21d20$099918d1@docuverse1>

David,

I believe language specific SAX bindings can and should be
delegated to others with you and rest of us keeping an eye
on its progress.  [btw, I am busy :]  Just form a small
group (2-3 people) out of the people who are applying
pressure on you about SAX/C++, throw in a C++ guru for
flavor, and stir.  XML-DEV can serve as the sounding board
for this and other languages (Smalltalk?).

SAX2 work, on the other hand, is far more important IMHO.
Let's get the namespace support added and revisit the
parser features and missing callback issues.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From Bruce.Duffy at westgroup.com  Fri Dec  3 21:53:34 1999
From: Bruce.Duffy at westgroup.com (Duffy, Bruce)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <27CD34D68C7DD211A68A0004AC38272A023ED475@elizabeth.int.westgroup.com>

I agree with Tim Bray.

I'll have to walk away from SAX if it doesn't
support namespaces in the near future.  

I'm concerned that SAX will lose its relevance
(vendors will cease to support it) if it doesn't
track the relevant xml standards.


	Bruce Duffy

-----Original Message-----
From: David Megginson [mailto:david@megginson.com]
Sent: Friday, December 03, 1999 12:22 PM
To: XMLDev list
Subject: SAX/C++ vs. SAX2


I'd like to hear what others think on this issue.  There was some
interest in SAX2 when I posted my alpha interfaces a few months back
(most notably, but not exclusively, from David Brownell), but it was
hardly a tidal wave.  On the other hand, I am noticing a building
pressure from implementors to get something out in C++.

I can think of a few reasons that the world might desperately be
waiting for SAX2:

1. To get some kind of standard Namespace support (or at least a way
   to tell whether a parser has Namespace support built in).

2. To query parser features in general.

3. To get at the stuff that SAX 1.0 doesn't report, like comments,
   CDATA boundaries, and DTD declarations.

I think that there is a real need for #1, since many other specs (XSL,
XML Schema, RDF, XHTML, etc.) are built on top of Namespaces.  I think
that #2 would make life a fair bit easier for library developers, but
it's not as critical (Simon St-Laurent will be grateful, though).

I have a lot of trouble with #3, though.  There are a few specialized
fields where this stuff isn't just syntactic fluff (repositories and
editing tools spring immediately to mind), but in general, very, very,
very few real-world XML applications need to know about anything but
elements, attributes, and character data -- witness the recent SML
discussion.

I'm very interested in hearing other opinions.  Having a standard
streaming interface stimulated a lot of development of reusable Java
XML processing components, and I'd like to see the same thing happen
in C++, but I need to hear what other people think the priorities
should be.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From KenNorth at email.msn.com  Fri Dec  3 21:54:36 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun  7 17:18:20 2004
Subject: Rocket framework for creating Web sites
References: <000001bf3dad$a0c0f520$0b00a8c0@grissom> <99120312225003.00844@quadra.teleo.net>
Message-ID: <022701bf3e3d$7ba1ed40$0b00a8c0@grissom>

From: Patrick Phalen <xml-dev@teleo.net>
> :: Michael Floyd just released the Rocket framework:
>
> Can someone explain how this would be an improvement over Cocoon?
> http://xml.apache.org/cocoon/

Patrick,

When did choice become a dirty word?

There appears to be an obvious difference. Rocket currently works with ASP,
with plans for supporting Java servlets and Perl/CGI.




From david at megginson.com  Fri Dec  3 22:31:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:20 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <27CD34D68C7DD211A68A0004AC38272A023ED475@elizabeth.int.westgroup.com>
References: <27CD34D68C7DD211A68A0004AC38272A023ED475@elizabeth.int.westgroup.com>
Message-ID: <14408.17522.150496.586573@localhost.localdomain>

Duffy, Bruce writes:

 > I agree with Tim Bray.
 > 
 > I'll have to walk away from SAX if it doesn't
 > support namespaces in the near future.  
 > 
 > I'm concerned that SAX will lose its relevance
 > (vendors will cease to support it) if it doesn't
 > track the relevant xml standards.

There's no problem adding Namespace support to SAX -- after all, you
can stack SAX filters on top of each other.  John Cowan wrote a
Namespace filter for SAX about a year ago, and I have a fairly
high-performance one that I just haven't had time to package and
release yet.

The problem is the lack of a standard way to tell whether a SAX driver
already supports Namespace processing natively (and, thus, doesn't
need a filter), together with a standard way to turn that processing
on or off.
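
As a rough illustration of the stacking idea, here is a C++ sketch of a namespace filter sitting between a driver and an application handler. It is not the SAX API and not either of the filters mentioned above: ElementHandler, NamespaceFilter, Recorder, selectEntryPoint and the "{uri}local" naming are all hypothetical simplifications, and attribute names are passed through unresolved to keep it short.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

using Attributes = std::map<std::string, std::string>;

// Hypothetical, heavily simplified handler interface (not the real SAX API).
struct ElementHandler {
    virtual ~ElementHandler() = default;
    virtual void startElement(const std::string &name, const Attributes &attrs) = 0;
    virtual void endElement(const std::string &name) = 0;
};

// Filter stage: consumes xmlns declarations, rewrites element names to
// "{uri}local", and forwards everything else to the next handler.
class NamespaceFilter : public ElementHandler {
public:
    explicit NamespaceFilter(ElementHandler &next) : next_(next) {}

    void startElement(const std::string &name, const Attributes &attrs) override {
        // Each element opens a scope that starts as a copy of its parent's bindings.
        scopes_.push_back(scopes_.empty() ? Bindings() : scopes_.back());
        Bindings &scope = scopes_.back();
        Attributes plain;
        for (const auto &attr : attrs) {
            if (attr.first == "xmlns") {
                scope[""] = attr.second;                    // default namespace
            } else if (attr.first.compare(0, 6, "xmlns:") == 0) {
                scope[attr.first.substr(6)] = attr.second;  // prefixed namespace
            } else {
                plain.insert(attr);                         // ordinary attribute
            }
        }
        next_.startElement(resolve(name), plain);
    }

    void endElement(const std::string &name) override {
        assert(!scopes_.empty());
        next_.endElement(resolve(name));
        scopes_.pop_back();
    }

private:
    using Bindings = std::map<std::string, std::string>;

    // Unbound prefixes (and unprefixed names with no default) pass through unchanged.
    std::string resolve(const std::string &qname) const {
        const auto colon = qname.find(':');
        const std::string prefix =
            colon == std::string::npos ? std::string() : qname.substr(0, colon);
        const std::string local =
            colon == std::string::npos ? qname : qname.substr(colon + 1);
        const Bindings &scope = scopes_.back();
        const auto it = scope.find(prefix);
        return it == scope.end() ? qname : "{" + it->second + "}" + local;
    }

    ElementHandler &next_;
    std::vector<Bindings> scopes_;
};

// Final consumer that just records the names it sees.
struct Recorder : ElementHandler {
    std::vector<std::string> events;
    void startElement(const std::string &name, const Attributes &) override {
        events.push_back("start " + name);
    }
    void endElement(const std::string &name) override {
        events.push_back("end " + name);
    }
};

// Hypothetical wiring: skip the filter when the driver already resolves names.
inline ElementHandler &selectEntryPoint(bool driverResolvesNamespaces,
                                        ElementHandler &app,
                                        NamespaceFilter &filter) {
    return driverResolvesNamespaces ? app : static_cast<ElementHandler &>(filter);
}
```

The boolean passed to selectEntryPoint is exactly the piece that has no standard source today: without a common way to ask the driver, the caller has to know the answer out of band.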


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Daniel.Brickley at bristol.ac.uk  Fri Dec  3 22:54:36 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:20 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <38478DB2.FACA4633@praxis.cz>
Message-ID: <Pine.GHP.4.21.9912032153220.11243-100000@mail.ilrt.bris.ac.uk>

On Fri, 3 Dec 1999, Matthew Gertner wrote:

> David Megginson wrote:
> > How does the schema tell me that foo represents a container for a
> > collection of objects, bar represents an object, and hack and flurb
> > represent the object's properties?
> 
> The point is not what the current schema draft allows, it is whether it
> would be feasible and appropriate to represent this information in XML
> schemas, as Paul rightly stated. My opinion is that it would be fairly
> trivial and extremely useful.

I believe it will be possible to annotate XML schemas with information
for mapping into (generic or domain specific) application datamodels
such as RDF. I don't think it is right to expect the hard-pressed XML
Schema group to define all these mappings within that working group.
But that doesn't matter; all we need is a placeholder for such
information.

My understanding of the Cambridge Communique meeting was that we reached
agreement on just this. See points 1-6 under '3. Observations and
Recommendations' in http://www.w3.org/TR/1999/NOTE-schema-arch-19991007

If it _is_ really trivial to define a mapping from XML Schema
information to a classes/objects/properties RDFesque model, I for one would like to
see this documented and implemented. XML-DEV seems as good a place as
any to play around with such a thing...

Excerpt from the Cambridge Communique:
(I've no idea where the XML Schema WG's work is up to in relation to
these ideas; the basic principles outlined here seem enough to get
discussion going on XML-DEV though)

	[from http://www.w3.org/TR/1999/NOTE-schema-arch-19991007]
	3. Observations and Recommendations 
	This group reached consensus on the following observations and recommendations: 

	The XML data model is the XML Information Set being specified by the XML
	Information Set Working Group. Other data models exist, both generic and
	application-specific. RDF is an example of one such generic data
	model. The XML Schema and RDF Schema languages are separate languages
	based on different data models and do not need to be merged into a
	single comprehensive language. 

	An XML Schema schema document will be able to hold declarations for
	validating instance documents. It should also be able to hold
	declarations for mapping from instance document XML infosets to
	application-oriented data structures. 

	For evolvability and interoperability, the XML Schema specification
	should provide an extension mechanism allowing for the augmentation of
	XML Schema schemas with additional material. At a minimum, XML Schema
	should permit elements from other namespaces to be included in schema
	documents. This extension mechanism should also permit individual
	extensions to be marked 'mandatory', meaning that a document instance
	cannot be deemed 'schema valid' if the processing required by a marked
	extension cannot be performed. 

	The extension mechanism should be appropriate for use to incorporate
	declarations ("mapping declarations") to aid the construction of
	application-oriented data structures (e.g. ones implementing the RDF
	model) as part of the schema-validation and XML infoset construction
	process. This facility should not be exclusive to RDF, but should also
	be useable to guide the construction of data structures conforming to
	other data models, e.g. UML. 
	[...]

> > It can be.  The DOM represents a domain-specific object layer that is
> > useful for a wide subset of XML operations (especially document- and
> > browser-oriented work).  There need to be many layers on top of XML,
> > one for each domain -- it happens that many of those layers will share
> > the need to encode objects, so a standard object layer sandwiched
> > between XML and the domain-specific layers can save a lot of work.
> 
> Sure, the DOM has value. My point is that maybe 95% of applications want
> a domain-specific rather than a generic interface. My other point is
> that a domain-specific interface can be implemented generically; i.e.
> programmatic interfaces for accessing XML data can be generated
> automatically from XML schemas. This isn't *that* far from what MDSAX is
> doing. IBM's XML BeanMaker (http://alphaworks.ibm.com/tech/xmlbeanmaker)
> is a good example of this concept.
> 
> > > There are a variety of efforts to create
> > > domain-specific objects automatically from XML objects. I don't have a
> > > list at the tips of my fingers, but if anyone does it would be a great
> > > resource. They are out there because I keep bumping into them.
> > 
> > One example is RDF.
> 
> So we are talking about different things. RDF is a formalism but it
> doesn't provide you with any code (although I'm sure that tools for this
> could be written, and perhaps already have been). I am talking about
> something that will take my schema with Customer and Invoice element
> types and turn it into, say, Java classes called Customer and Invoice.

Sure, you could do this. My hunch is that the urge to do this won't be
as strong when we have more abstract (objects and properties) interfaces
to XML content, rather than our current APIs that obsess on detail of
particular serialisations rather than on what those serialisations have
told us about the objects. If we could get to a world where generic
rather than domain interfaces were useful to even 10% instead of 5% of
applications (to borrow your figure), that'd be a huge win.

> > I disagree strongly with the last part of that statement.  I'd argue
> > the opposite -- higher-level layers should be as independent of XML as
> > possible.  That's the only way to build good, layered architectures.
> > XML does one thing (represent a tree structure in a character stream)
> > very well: it's an excellent layer to build other layers on top of,
> > but XML itself should stay as simple as possible so that it's
> > applicable widely to many different fields.
> 
> I agree with the layering approach. But well-formed XML should be viewed
> as the lowest level (representing tree structures); when bound to an XML
> schema it then becomes a serialized object representation.

There is also a need to know the objects'n'properties view of the data
without going to fetch (or having advance knowledge of) the
syntactic schema or serialisation policy. RDF's
initial syntax was one approach; there have been and will be
others. The Microsoft folks were for a while throwing around some
interesting ideas on mapping more 'colloquial' XML syntax into directed
labelled graphs. There's a version at http://www.biztalk.org/Resources/canonical.asp
for example. 


> > That would be another serious mistake.  Object exchange, while
> > important, represents only one of many layers that can be build on top
> > of XML, and if XML Schemas start trying to solve high-level problems
> > for every specific domain, it will become an unimplementable mess.
> > RDF already made a similar mistake by mixing together a spec for
> > object encoding in XML with a spec for representing knowledge about
> > Web pages.
> 
> Maybe this is the crux of our disagreement. I see object exchange as
> *the* application for valid XML. 

I've also heard that some folks want to use it for structured hypertext 
documents...

(One consequence of XML's document heritage is that document order is
generally treated as meaningful and in need of preservation. This can be
a pain in the butt for data-centric apps.)

> I'd be interested to hear some examples
> of applications that cannot be cast effectively in this light. In this
> view, RDF and XML Schemas are coming at the same problem from different
> angles. RDF is saying essentially "how do we build an XML application
> that represents object structures", 

This is one aspect of what RDF attempts, i.e. the syntax component.
The initial RDF Syntax is saying "how do we build an XML application to
represent a particularly Webbish flavour of object structures?" (i.e.
directed labelled graphs with web identifiers for nodes, node types,
relation/property types).

RDF in general doesn't look for one way of stuffing RDF data graphs into
XML; there are bound to be many ways of shipping these kinds of object
structures around in angle brackets.  So... the upper levels of RDF
(model and schema) *don't* care about the way in which we "build an
XML application that represents object structures".

> while XML Schemas are saying "how do
> we enhance DTDs by adding some object-oriented facilities". My fear is
> that these two approaches are going to meet somewhere in the middle and
> turn out to be the same thing. If so, I vastly prefer the use of XML
> schemas. Why? Because this results in a vast simplification of the whole
> XML picture. Isn't it better to take a normal XML instance, using base
> XML syntax, and "turn" it into an object by adding the appropriate
> information in a separate schema, rather than having to recast the whole
> thing in a different syntax?

I don't see a conflict here. RDF is happy with multiple ways of shipping
data around; what it cares about is having a unified model for this
heterogenous data. Nobody I've met ever expected all interesting RDF
applications to use RDF 1.0 Syntax.

> (I wonder if I am expressing this idea clearly. I'll happily post an
> example of how this could be done if I'm not.)

I'd love to see examples of an annotated XML Schema that shows how to
derive an objects'n'properties view of instance data.

Dan


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From Curt.Arnold at hyprotech.com  Fri Dec  3 23:20:20 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:20 2004
Subject: dateTime Schema counter proposal
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415555@THOR>

This message is my starting suggestion for an overhaul of the date and time
related data types in the Sept working draft of the XML Schema Datatypes
document (http://www.w3.org/TR/xmlschema-2/).  I hope this can engender some
discussion on xml-dev and that we can then donate something that can be
incorporated quickly into the XML Schema working group's process (hopefully
in time for the next draft).  I am not a W3C member, so this is not an
officially endorsed W3C effort.

My problems with the current draft:

1. Acceptance of ISO 8601 truncated forms without a mechanism to disallow
them makes mapping to existing programming language date and time types
problematic and engenders a lot of complexity.  The use of truncated forms
seems to oppose the XML design criteria of not adding complexity to gain
terseness.

For example, --04-15 is supposed to represent April 15th of every year (or
of an unspecified year).  There is no mechanism to fit that into a datatype
that represents an absolute date (such as VT_VARIANT or COleDateTime or
corresponding Java datatypes/classes).

2. Use of hh:mm:ss form for time durations makes it difficult to represent
intervals like 120 days.  Also, there is no provision for disallowing
non-exact durations (such as durations including years or months terms that
cannot be unambiguously converted to seconds) 

3. There is no mechanism for requiring or disallowing time zone qualifiers.

Okay, here goes

Note: in all the patterns below when I used the + sign, it indicates either
a + or - appearing in its position.

Remove timeInstant and recurringTimeInstance.

x.x.x. date

Represents a particular day of a particular calendar year at an explicitly
stated or implied time zone.

x.x.x.x lexicalRepresentation

A single lexical representation, which is a subset of the lexical
representations allowed by ISO 8601, is allowed for date. This lexical
representation is the ISO 8601 extended format with optional time zone
specifier: CCYY-MM-DD[ Z | +hh:mm].  The presence of the time zone qualifier
is controlled using the timeZone facet.
Examples:

1999-12-04Z
1999-12-04-06:00
1999-12-04

x.x.x time

Represents a specific instant in an unspecified day.

x.x.x.x Lexical Representation

A single lexical representation, which is a subset of the lexical
representations allowed by ISO 8601, is allowed for time. This
lexical representation is the ISO 8601 extended format
Thh:mm[:ss[.sss]][ Z | +hh:mm].

Examples:

T00:15Z
T12:30:00+05:00
T13:00

x.x.x dateTime

A particular instant on a particular date in a particular calendar year.

x.x.x.x Lexical Representation

CCYY-MM-DDThh:mm[:ss[.sss]][ Z | +hh:mm]

Examples:

1999-12-04T15:03
1999-12-05T15:03+05:00
1999-12-05T15:03:15.123+05:00

x.x.x timeDuration

A time duration is a defined length of time, such as 12 hours.

x.x.x.x Lexical Representation

There are two allowable lexical representations of timeDuration: one is
consistent with ISO 8601 section 5.5.3.2, P[nY][nM][nD][nH][nM]; the second
is the lexical representation of a real datatype, interpreted as a duration
in seconds.

Examples:

P6W : Six weeks (that is, 42 days)
P12H30M : Twelve hours 30 minutes
45000 : 45000 seconds
1e-6 : One microsecond

Note: This ISO form was chosen since the alternative representation has
difficulty representing durations such as 120 days.


x.x.x timeZone

Indicates a specific offset from Coordinated Universal Time (UTC).

x.x.x.x Lexical Representation

Two forms: the first, Z, indicates no offset from UTC; the second is +hh:mm.

Constraining facets

x.x.x timeZone facet

The time zone facet constrains the appearance of a time zone specifier and
qualifies date, time and dateTime datatypes.

Schema for timeZone facet:  I expressed default as an attribute so it could
be distinguished from the default element and could be typed as timeZone.

<element name='timeZone'>
	<archetype>
		<attribute name='minOccurs'>
			<datatype name="integer"/>
			<default>0</default>
		</attribute>
		<attribute name='maxOccurs'>
			<datatype name="integer"/>
			<default>1</default>
		</attribute>
		<attribute name="default">
			<datatype name="timeZone"/>
		</attribute>
		<attribute name="fixed">
			<datatype name="boolean"/>
		</attribute>
	</archetype>
</element>

Examples of use

<datatype name="zonedDateTime">
	<basetype name="dateTime"/>
	<!-- time zone must appear -->
	<timeZone minOccurs="1"/>
</datatype>

<datatype name="CSTdefaultTime">
	<basetype name="time"/>
	<!-- if time zone is not specified, it is implied to be CST  -->
	<timeZone default="-06:00"/>
</datatype>

<datatype name="localTime">
	<basetype name="time"/>
	<!-- time zone may not appear  -->
	<timeZone maxOccurs="0"/>
</datatype>


<datatype name="CSTDate">
	<basetype name="date"/>
	<!-- date can either be 1999-12-04 or 1999-12-04-06:00, but no other
time zone -->
	<timeZone default="-06:00" fixed="true"/>
</datatype>
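
A reader could sketch how the timeZone facet might be enforced against the
lexical forms above. The following is only an illustration under stated
assumptions: the function name check_time_zone and its keyword arguments are
hypothetical (they mirror the facet attributes minOccurs, maxOccurs, default
and fixed), and the fractional-seconds part is assumed to allow any number of
digits.

```python
import re

# Optional time zone suffix: "Z" or a signed hh:mm offset.
TZ = r"(?P<tz>Z|[+-]\d{2}:\d{2})?"

# Lexical forms from the proposal (hypothetical regex rendering).
PATTERNS = {
    "date": re.compile(r"\d{4}-\d{2}-\d{2}" + TZ),
    "time": re.compile(r"T\d{2}:\d{2}(:\d{2}(\.\d+)?)?" + TZ),
    "dateTime": re.compile(
        r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?" + TZ
    ),
}


def check_time_zone(value, base, min_occurs=0, max_occurs=1,
                    default=None, fixed=False):
    """Return the effective time zone of value, or raise ValueError.

    The keyword arguments stand in for the attributes of the proposed
    timeZone facet; the function itself is not part of any specification.
    """
    match = PATTERNS[base].fullmatch(value)
    if match is None:
        raise ValueError("not a valid %s: %r" % (base, value))
    tz = match.group("tz")
    if tz is None:
        if min_occurs >= 1:
            raise ValueError("time zone must appear")
        return default  # implied time zone, possibly None
    if max_occurs == 0:
        raise ValueError("time zone may not appear")
    if fixed and tz != default:
        raise ValueError("time zone must be %s" % default)
    return tz
```

With these assumptions, check_time_zone("T13:00", "time", default="-06:00")
yields the implied zone "-06:00", while a value carrying "Z" is rejected when
max_occurs is 0.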

x.x.x precise facet

The precise facet qualifies timeDuration and disallows the use of year and
month terms (which cannot be unambiguously converted to a duration in
seconds).

Example:

<datatype name="preciseDuration">
	<basetype name="timeDuration"/>
	<precise/>
</datatype>
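
To make the intent of the precise facet concrete, here is a rough sketch of
converting a timeDuration to seconds. It is an assumption-laden illustration,
not part of the proposal: the name exact_seconds is hypothetical, weeks (W)
are accepted because the P6W example uses them, and an optional "T" is
tolerated before the hour term. Note that without a "T" the form P30M is read
as 30 months, since the month and minute designators are both "M".

```python
import re

# P[nY][nM][nW][nD][T][nH][nM]; group names Mo/Mi separate months from minutes.
DURATION = re.compile(
    r"P(?:(?P<Y>\d+)Y)?(?:(?P<Mo>\d+)M)?(?:(?P<W>\d+)W)?(?:(?P<D>\d+)D)?"
    r"T?(?:(?P<H>\d+)H)?(?:(?P<Mi>\d+)M)?"
)

# Seconds per unit for the terms that convert unambiguously.
SECONDS = {"W": 7 * 86400, "D": 86400, "H": 3600, "Mi": 60}


def exact_seconds(text):
    """Return the duration in seconds, or raise ValueError if it is not precise."""
    match = DURATION.fullmatch(text)
    if match is None:
        # Second lexical form: a real number of seconds.
        return float(text)
    parts = {k: int(v) for k, v in match.groupdict().items() if v is not None}
    if not parts:
        raise ValueError("empty duration: %r" % text)
    if "Y" in parts or "Mo" in parts:
        raise ValueError("year and month terms are not precise: %r" % text)
    return float(sum(SECONDS[unit] * n for unit, n in parts.items()))
```

Under this reading P12H30M and 45000 denote the same duration, and P1Y is
rejected because a year has no fixed length in seconds.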

Note: comparisons of time, dateTime or date values should not be done
unless the time zone is explicit or implied by the timeZone facet.

Note: here are some ways of representing some of the ISO 8601 functionality
that we lost, by using multiple attributes (or elements) to hold multiple
pieces of information.

Recurring date:

<Session>
	<!-- recurring day (one specific Saturday) and a recurrence frequency -->
	<RecurringDay day="1999-12-04" repeats="P1W"/>
</Session>

<TaxFilingDeadline dayTime="1999-04-15T24:00" repeats="P1Y"/>

Time Period

<TimePeriod start="1999-12-06T09:00-04:00" end="1999-12-06T16:00-04:00"/>
<TimePeriod start="1999-12-06T09:00-04:00" duration="25200"/>
<TimePeriod start="1999-12-06T09:00-04:00" duration="P7H"/>
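
The three TimePeriod forms are meant to describe the same interval. A small
sketch of checking that, assuming the duration has already been reduced to
seconds (the helper name period_end is hypothetical):

```python
from datetime import datetime, timedelta


def period_end(start, duration_seconds):
    """Return the end instant for a start dateTime plus a duration in seconds."""
    return datetime.fromisoformat(start) + timedelta(seconds=duration_seconds)


# Seven hours after 09:00 at offset -04:00 should equal the explicit end value.
end = datetime.fromisoformat("1999-12-06T16:00-04:00")
same = period_end("1999-12-06T09:00-04:00", 25200) == end
```

Because both values carry an explicit offset, the comparison is well defined,
which is the point of the note on comparisons above.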




From yiminz at timberline.com  Fri Dec  3 23:20:33 1999
From: yiminz at timberline.com (yimin zhu)
Date: Mon Jun  7 17:18:20 2004
Subject: schema mapping tool
Message-ID: <2D722CFF0999D111AB860001FA375F1004353D21@laposte.timberline.com>

Does anybody know if there is a tool for mapping two XML schemas?

Yimin Zhu
Research & Development
Timberline Software Corp.




From liamquin at interlog.com  Sat Dec  4 04:23:29 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun  7 17:18:20 2004
Subject: SGML the next big thing?
In-Reply-To: <199912031930.LAA00641@mail.sqwest.bc.ca>
Message-ID: <Pine.BSI.3.96r.991203230938.7036B-100000@shell1.interlog.com>

On Fri, 3 Dec 1999, Lauren Wood wrote:
> On 3 Dec 99, at 12:14, Arnold, Curt wrote:
>> It looks like the XML Schema group is trying to add back the & construct.
>> If you have a compelling justification for continued suppression, please
>> rant long and loud.
> 
> How about every SGML parser author I've talked to says the & 
> construct was the biggest, hardest part (which means probably the 
> buggiest) of the entire parser? I think the XML WG was right in 
> throwing it out of XML in the first place.

If this is as per content models, I think
(1) Lauren is right, because as SGML specified them, they were very
    hard to get right.

    This & thing is so far outside the way most other computer languages
    work that standard off-the-shelf parser generators roll on their
    backs and wave their paws in the air and admit defeat.

(2) The idea of saying, "this element must contain at least one of each of
    the following elements" is a useful one, and is very different from
    the & construct.

    A simplified, regularised form of & might be possible.

(3) The & connector interacts with #PCDATA to form pernicious content
    models (see below).  The XML WG went to great lengths to make sure
    that no valid XML document suffers from this SGML bogosity.  Similar
    lengths are needed for "&".

Note:
    For those who're not familiar with it, & is the content model connector
    in SGML that says that in order to match a & b & c ..., every content
    fragment a, b, etc., must be satisfied, and nothing must be left over.
    Furthermore, there must be exactly one way to satisfy the expression,
    as otherwise it is "ambiguous" and illegal, just as
	(a, b?) | a
    is illegal in SGML, even though it is a perfectly sensible and valid
    regular expression for the rest of the world of computing :-)


    Consider the following SGML declaration (with OMITTAG NO):
	<!ELEMENT boy
	    (noise & (dirt,mud)+ & (mud,shoes,trouble)* & #PCDATA) +smell
	>
    This is a "pernicious" mixed content model, and can only have
    white space in it between elements once, since that uses up the
    #PCDATA content model fragment.

    The following is (let's say for the sake of argument) a valid boy:
	mud,smell,shoes,trouble,dirt,mud,dirt,mud,noise,smell

    If you try and match this against the content model I gave, you'll
    see that you can't do it with LL(1) or LALR(1) directly unless
    you build a DFA with a rather large number of states.  I added the
    inclusion +smell, but you could change the content model to be
	(boy-model | smell)*
    to have an even more interesting time of it.
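
    A rough way to see why & is costly, sketched under a strong
    simplification: each operand is a single element name rather than a
    sub-model such as (dirt,mud)+. The function names are hypothetical.

```python
from itertools import permutations


def and_group_sequences(operands):
    """List every child sequence that a & b & c ... would accept."""
    return [",".join(order) for order in permutations(operands)]


def matches_and_group(children, operands):
    """True if children uses each operand exactly once, in any order."""
    return sorted(children) == sorted(operands)
```

    Rewriting the group as an ordinary alternation of sequences needs n!
    branches for n operands (6 for three, 24 for four), which hints at the
    state growth a grammar-based parser faces.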


-- 
Liam Quin, Barefoot Computing, Toronto;  The barefoot agitator
l i a m    at    h o l o w e b    dot    n e t <-- NEW ADDRESS
Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/
Please remove your shoes and socks before replying in anger.




From tyler at infinet.com  Sat Dec  4 05:06:03 1999
From: tyler at infinet.com (Tyler Baker)
Date: Mon Jun  7 17:18:20 2004
Subject: XML processing instruction survey
References: <3.0.32.19991130185652.01516ec0@pop.intergate.ca>
Message-ID: <3848A063.863A20A3@infinet.com>

Tim Bray wrote:

> At 03:07 PM 11/30/99 -0800, Jeffrey E. Sussna wrote:
> >I'm interested in the extent to which people are actually using the XML
> >processing instruction ( <?xml ) in their XML files, and the extent to which
> >they find it useful.
>
> It's not really designed for people.  It's mostly designed for use
> by the XML processor to help figure out the encoding and make sure that
> this is really XML.
>
> I'd think that using it at the application level would be not only
> uncommon but probably unwise.  I'd be interested to hear any positive
> responses to the query. -T.

I use processing instructions for the class name (Java specific here) of the application
object that is used to handle a particular document type. In this sense the PI acts as a
stream header. The PI in a sense is not document content but is only used as an identifier as
to what module should be dynamically loaded to handle the document data.

I like to think of PI's as things which are useful for commanding your particular application
as to what it should do with the data and not something inherent within the document itself.
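
A minimal sketch of this kind of dispatch, written in Python rather than Java
and resting on assumptions: the PI target "handler", the registry HANDLERS
and the class name com.example.InvoiceHandler are all hypothetical stand-ins
for whatever the real application uses.

```python
from xml.dom import minidom


def handle_invoice(root):
    """Hypothetical handler; a real one would process the document data."""
    return "invoice:" + root.tagName


# Maps the name carried by the PI to the code that handles the document.
HANDLERS = {"com.example.InvoiceHandler": handle_invoice}


def dispatch(xml_text, target="handler"):
    """Find the PI in the prolog and pass the root element to its handler."""
    doc = minidom.parseString(xml_text)
    for node in doc.childNodes:
        if (node.nodeType == node.PROCESSING_INSTRUCTION_NODE
                and node.target == target):
            return HANDLERS[node.data.strip()](doc.documentElement)
    raise LookupError("no <?%s ...?> processing instruction found" % target)
```

The PI is read before any element content is touched, so it acts like the
stream header described above and stays out of the document data itself.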

Tyler




From twleung at sauria.com  Sat Dec  4 05:17:40 1999
From: twleung at sauria.com (twleung@sauria.com)
Date: Mon Jun  7 17:18:20 2004
Subject: XML4J EA2 --> Xerces-J 1.0
References: <1B79E83E7849174A813044A2E56F78040C09@AROD.iunknown.com>
Message-ID: <008301bf3e16$f2d4d3e0$0a00a8c0@orconet.com>

John,

All of the XML4J development team is now working on Xerces.  There may
be some additional IBM only features that get done and put in an XML4J
release on alphaWorks, but any feature that is of general interest will go into
the Xerces-J code base. 

Ted
IBM XML Technology Group

----- Original Message ----- 
From: John Lam <jlam@iunknown.com>
To: <xml-dev@ic.ac.uk>
Sent: Tuesday, November 30, 1999 4:53 PM
Subject: RE: XML4J EA2 --> Xerces-J 1.0


Will IBM continue development of XML4J independently of Xerces-J? Or
will Xerces-J be the "official" version of that source code base?

-John




From donpark at docuverse.com  Sat Dec  4 05:48:05 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:21 2004
Subject: DOM ECMAScript Test Suite
In-Reply-To: <015501bf3dd2$ec275390$293b0681@ncsl.nist.gov>
Message-ID: <000b01bf3e1b$317d5640$099918d1@docuverse1>

>I've just updated our DOM ECMAScript test suite, available from
>
>        http://www.nist.gov/xml/
>
>Click on DOM Test Suite.  This suite includes ~900 ECMAScript 
>tests that exercise the DOM Level 1 Fundamental, Extended, and

Very very useful, Mary.  Java-binding will be nice as well.

Any plan on DOM Level 2 conformance testing?

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From donpark at docuverse.com  Sat Dec  4 06:15:22 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:21 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>
Message-ID: <001201bf3e1f$074601c0$099918d1@docuverse1>

Walter,

Could you elaborate on your decision to use a PI rather than
element(s)?

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From ricko at allette.com.au  Sat Dec  4 07:38:13 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:21 2004
Subject: SGML the next big thing?
Message-ID: <00fc01bf3e2d$f508d370$55f96d8c@NT.JELLIFFE.COM.AU>

>  From: Arnold, Curt <Curt.Arnold@hyprotech.com>

>Erlend Øverby [Erlend.Overby@usit.uio.no] wrote
>
>>>What we don't need from the SGML standard:
>>> - SGML Declaration
>>> - Character entities
>>> - Minimisation
>>> - The "&" construct
>
>It looks like the XML Schema group is trying to add back the &
>construct.

XFM looks to me like a kind of SGML Declaration, in that it says which
features are needed to process a document.  Similarly there is a
well-known need in XML to allow non-standard characters/glyphs (for
mathematics, advertising and Chinese especially) so it is not impossible
that there may be increased development (whether looking like entities
or numeric character references) towards better character entities too.

That only leaves minimization.  That is perhaps the nicest thing in
HTML, and perhaps will be the only SGML thing that won't be reintroduced
in the fullness of time.

Rick Jelliffe




From donpark at docuverse.com  Sat Dec  4 07:41:30 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <Pine.LNX.4.10.9912030214230.18460-100000@cauchy.clarkevans.com>
Message-ID: <001301bf3e2b$085ea740$099918d1@docuverse1>

Clark,

What is the advantage of YML over these
solutions?

1. Pockets

<element>
  <pocket:attributes>
    <att>
      <ch>val</ch>
    </att>
  </pocket:attributes>
  <pocket:children>
    <foo>
      <pocket:text>bar</pocket:text>
    </foo>
  </pocket:children>
</element>

2. Parental Guidance

<element>
  <sax:cache>
    <foo>bar</foo>
  </sax:cache>
</element>

3. Road Signs

<element>
  <sax:cache>true</sax:cache>
  <foo>bar</foo>
  <sax:cache>false</sax:cache>
</element>

These are preliminary XML design patterns
so pattern names are weird to say the least.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From mtbryan at sgml.u-net.com  Sat Dec  4 08:06:00 1999
From: mtbryan at sgml.u-net.com (Martin Bryan)
Date: Mon Jun  7 17:18:21 2004
Subject: Problem for mathematically minded XML experts
Message-ID: <043f01bf3e2e$90bd3540$5fc466c3@unet.com>

I would like to pose the following question to those of you with a
mathematical bent who have some spare time (I'm not expecting the answer
quickly - this is a holiday teaser I suspect!).

Given I have two DTDs, in which there are two elements whose models can be
described as follows:

Element DTD1 consists of a sequence of E1 elements and G1 OR groups, where
the total number of elements in the OR groups is N1

and

Element DTD2 consists of a sequence of E2 elements and G2 OR groups, where
the total number of elements in the OR groups is N2

is there a formula that can be used to determine whether the same pair of
elements are valid in both DTD1 and DTD2? If there is, is there a way to
determine the difference caused by the following conditions being added:

a) there need be no constraint on the order of the elements
b) the elements must be in a particular order
c) the elements must be adjacent, in any order
d) the elements must be adjacent, in a particular order.

Does the split of the number of elements in each OR group affect the
calculation significantly?
Does the fact that one or other, or both of the elements is a member of a
group significantly affect the calculation?
How would the calculation change if three matches were
required from the model?
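
For anyone wanting to experiment, one possible reading of the question can be
brute-forced rather than solved by formula. This sketch assumes a model is a
sequence of positions, each holding the set of elements allowed there (a
plain element is a one-member set, an OR group a larger set), and it ignores
occurrence indicators. All names are hypothetical.

```python
def pair_valid(model, x, y, ordered=False, adjacent=False):
    """True if x and y can occupy two distinct positions of the model.

    ordered:  x must come before y (conditions b and d).
    adjacent: the two positions must be neighbours (conditions c and d).
    """
    for i, first in enumerate(model):
        if x not in first:
            continue
        for j, second in enumerate(model):
            if i == j or y not in second:
                continue
            if ordered and not i < j:
                continue
            if adjacent and abs(i - j) != 1:
                continue
            return True
    return False


def valid_in_both(model1, model2, x, y, **conditions):
    """True if the pair satisfies the same conditions in both models."""
    return (pair_valid(model1, x, y, **conditions)
            and pair_valid(model2, x, y, **conditions))


# Two made-up models: E1=2 plain elements and one OR group, E2=3 plain elements.
dtd1 = [{"a"}, {"b", "c"}, {"d"}]
dtd2 = [{"d"}, {"a"}, {"b"}]
```

Each check costs on the order of the square of the model length, so it only
answers the question for one pair at a time and gives no closed formula.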

If you can help me understand any part of the problem, or point me to a
paper/book on the subject, I would be grateful.

Martin Bryan, 29 Oldbury Orchard, Churchdown, Glos GL3 2PU, UK
Phone/Fax: +44 1452 714029 E-mail: mtbryan@sgml.u-net.com




From ricko at allette.com.au  Sat Dec  4 09:10:42 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:21 2004
Subject: Problem for mathematically minded XML experts
Message-ID: <012101bf3e3a$ef24b9d0$55f96d8c@NT.JELLIFFE.COM.AU>


Murata Makoto of FujiXerox  has been working on the question of set
operations on grammars for several years.

Rick Jelliffe




From ricko at allette.com.au  Sat Dec  4 09:09:36 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:21 2004
Subject: Content models considered bad...errr sometimes (was:  Re: SGML the next big thing?)
Message-ID: <011c01bf3e3a$8bc87a20$55f96d8c@NT.JELLIFFE.COM.AU>

From: Liam R. E. Quin <liamquin@interlog.com>

>    This & thing is so far outside the way most other computer languages
>    work that standard off-the-shelf parser generators roll on their
>    backs and wave their paws in the air and admit defeat.

I am interested if you think this also reveals anything about the
persistent claims that SGML is bad because it doesn't conform to
the expectations of computer science (as influenced by an early
generation of tools such as YACC).   I would tend towards the
view that uncritical acceptance of academic paradigms has held
SGML/XML development up.  In the case of XML (and SGML,
which is really a compiler compiler, though with a different
target to YACC and Lex) I think the view that a schema should
be viewed as a language definition is holding things back
(which is *not* to say that there is no benefit in being able
to implement a schema as a language, or that there is no
benefit in being able to reason about a schema using
formal language theory).

No-one says "Windows, Icons, Menus
and Popups are not easy to implement in YACC, so we should not
have them": in fact, in the 90s, the trend for specifying GUIs has
been solidly away from formal grammatical descriptions of the
total interface language, even if just for flexibility.

>(3) The & connector interacts with #PCDATA to form pernicious content
>    models (see below).  The XML WG went to great lengths to make sure
>    that no valid XML document suffers from this SGML bogosity.  Similar
>    lengths are needed for "&".

Paul Prescod had an excellent idea a while back for adding a #WS particle
that explicitly modelled whitespace. That would get rid of most problems,
but I presume there would still be an ambiguity possible with
    (#PCDATA | #WS )

But outside all this there is the basic issue of whether content models
are actually good as the only direct mechanism for implementing
data models in XML: if the idea of namespaces is
to allow ad hoc inclusion of elements from different domains at
the user's discretion, the idea that a schema should be a language
description becomes less and less convincing. How useful is
"," when we might want to interpose elements from any other
namespace anywhere, for example?

For example, here is your content model, followed by a Schematron
schema.  I would say that the Schematron schema captures much
more directly what the content model might be modeling: in fact, the content
model establishes relationships but fails to provide what they mean.

> <!ELEMENT boy
>     (noise & (dirt,mud)+ & (mud,shoes,trouble)* & #PCDATA) +smell

<schema>
<pattern name="A Boy">
 <rule context="boy">
    <assert test="count(noise)=1">Boys need noise</assert>
    <assert test="dirt">Boys need dirt</assert>
    <assert test="mud">Boys need mud</assert>
    <assert test="count(mud)=count(dirt) + count(shoes)"
    >Some mud comes from dirt and some mud comes from shoes.</assert>
    <assert test="count(shoes)= count(trouble)"
    >A boy will have as much trouble as he has muddy shoes.</assert>
 </rule>
 <rule context="smell">
    <assert  test="ancestor::boy">Boys can smell</assert>
 </rule>
 <rule context="boy/trouble">
    <assert test="preceding-sibling::shoes">Muddy shoes lead to trouble</assert>
    <assert test="count(../mud)=count(../dirt) + count(../trouble)"
    >The mud that comes from dirt is independent of the
    mud that causes trouble</assert>
 </rule>
 <rule context="boy/shoes">
    <assert test="preceding-sibling::mud">A boy's shoes must be muddy</assert>
 </rule>
 <rule context="boy/dirt">
    <assert test="following-sibling::mud">All dirt leads to mud</assert>
    <assert test="name(following-sibling::*[position()=1])='smell'
                or name(following-sibling::*[position()=1])='mud'"
    >Dirt must be followed by mud or smells</assert>
 </rule>
</pattern>
</schema>
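
To show what such assertions amount to procedurally, here is a sketch of the
count-based tests from the first rule only, written against Python's
ElementTree. It is not a Schematron implementation; the function name
check_boy is hypothetical and the sibling-order rules are left out.

```python
import xml.etree.ElementTree as ET


def check_boy(boy):
    """Return the messages of the count-based assertions that fail."""
    names = [child.tag for child in boy]
    count = names.count
    failures = []
    if count("noise") != 1:
        failures.append("Boys need noise")
    if count("dirt") == 0:
        failures.append("Boys need dirt")
    if count("mud") == 0:
        failures.append("Boys need mud")
    if count("mud") != count("dirt") + count("shoes"):
        failures.append("Some mud comes from dirt and some mud comes from shoes.")
    if count("shoes") != count("trouble"):
        failures.append("A boy will have as much trouble as he has muddy shoes.")
    return failures


good = ET.fromstring("<boy><noise/><dirt/><mud/><mud/><shoes/><trouble/></boy>")
quiet = ET.fromstring("<boy><dirt/><mud/></boy>")
```

Each assertion is checked on its own and reports its own message, which is
the contrast with a content model that can only accept or reject the whole
sequence.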

Other rules could be added to capture the intricacies of the inclusion, but
the question should be asked whether the content model captures the
intent of the schema developer more than the Schematron schema does:
to what extent does the elegance of regular expressions force decisions
to be made that are extraneous to modeling requirements, i.e. that
are merely artifacts of the notation/paradigm?

I think that a good number of the people who claimed dislike for DTDs will
find that really their problem is with regular grammars. Of course, the people
who need to convert from class-based data into XML will find XML Schema's
provisions of inheritance or class mechanisms very useful, but that still
won't help matters if the relationship between elements is important.

Rick Jelliffe




From paul at qub.com  Sat Dec  4 09:09:45 1999
From: paul at qub.com (Paul Tchistopolskii)
Date: Mon Jun  7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
References: <001301bf3e2b$085ea740$099918d1@docuverse1>
Message-ID: <028b01bf3e37$8af1f7a0$5df5c13f@PaulTchistopolskii>


There is also a *very* elegant
'reverse-polish-notation'  approach
proposed by Robert ( process
element  when Grove is in place,
providing the execution stack ).

Not sure he was talking about the
execution stack, it was my attempt to
understand how could it work.

The only drawback of such a view
is that the  execution stack constantly
grows and we need to clean it up
sometimes.

However.

Because the multithreading approach should
have the same drawback, I think that a
workaround should already exist in the
source code of SP ( thanks to Sean for
pointing out that SP is an existing implementation
of the multithreading approach ).

No namespaces, no extra markup  -
just smart cleanup ( which could be easier
than look-ahead, because the information
needed to make a decision is already 'in place',
right?  )

Rgds.Paul.

> Clark,
>
> What is the advantage of YML over these
> solutions?
>
> 1. Pockets
>
> <element>
>   <pocket:attributes>
>     <att>
>       <ch>val</ch>
>     </att>
>   </pocket:attributes>
>   <pocket:children>
>     <foo>
>       <pocket:text>bar</pocket:text>
>     </foo>
>   </pocket:children>
> </element>
>
> 2. Parental Guidance
>
> <element>
>   <sax:cache>
>     <foo>bar</foo>
>   </sax:cache>
> </element>
>
> 3. Road Signs
>
> <element>
>   <sax:cache>true</sax:cache>
>   <foo>bar</foo>
>   <sax:cache>false</sax:cache>
> </element>
>
> These are preliminary XML design patterns
> so pattern names are weird to say the least.
>
> Best,
>
> Don Park    -   mailto:donpark@docuverse.com
> Docuverse   -   http://www.docuverse.com




From ricko at allette.com.au  Sat Dec  4 09:43:09 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
Message-ID: <013201bf3e3f$5eea7a30$55f96d8c@NT.JELLIFFE.COM.AU>


From: Don Park <donpark@docuverse.com> 

 
> <element>
>  <pocket:attributes>
>    <att>
>      <ch>val</ch>
>    </att>
>  </pocket:attributes>
>  <pocket:children>
>    <foo>
>      <pocket:text>bar</pocket:text>
>    </foo>
>  </pocket:children>
></element>

Compare to the less RSI-inducing XML:
    <pocket ch="val"><foo>bar</foo></pocket>

Note that the XSL pattern to find the attribute ch of element
pocket is  "pocket/@ch"  for the XML but 
"element/pocket:children/../pocket:attributes/att/ch"
for the alleged SML. It could be said that one could use
"element/pocket:attributes/att/ch"  but then validation must
cope with the possibility that pocket:attributes elements
are made part of some other element. 
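
The difference between the two lookups can be tried out with a minimal
Python sketch. It uses the standard library's ElementTree, whose path
support is only a subset of XPath, so the attribute is read with get()
rather than "@ch", and the paths are written relative to the root element.
The namespace URI bound to the pocket prefix and the function names are
hypothetical.

```python
import xml.etree.ElementTree as ET

POCKET_NS = "urn:example:pocket"  # hypothetical URI for the pocket: prefix

xml_doc = '<pocket ch="val"><foo>bar</foo></pocket>'

sml_doc = f"""<element xmlns:pocket="{POCKET_NS}">
  <pocket:attributes><att><ch>val</ch></att></pocket:attributes>
  <pocket:children><foo><pocket:text>bar</pocket:text></foo></pocket:children>
</element>"""


def ch_from_xml(text):
    # Plain XML: the value is an attribute of the root element.
    return ET.fromstring(text).get("ch")


def ch_from_sml(text, path):
    # Attribute-free form: the value is the text of a nested element.
    root = ET.fromstring(text)
    node = root.find(path, {"pocket": POCKET_NS})
    return node.text if node is not None else None


long_path = "pocket:children/../pocket:attributes/att/ch"
short_path = "pocket:attributes/att/ch"

print(ch_from_xml(xml_doc))
print(ch_from_sml(sml_doc, long_path))
print(ch_from_sml(sml_doc, short_path))
```

Both element paths reach the same node in this document; the longer one
additionally requires that a pocket:children sibling is present.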

Of course, it would be possible to make an implementation
of an XSL processor that interprets "pocket/@ch" as 
"element/pocket:children/../pocket:attributes/att/ch"
and hides this from the user.  It means that instead of looking
at a type field in the parse tree, the name is used (presumably
a better implementation method would be to translate
the alleged SML into standard XML DOM on import).

But then the user would have to have in mind the XML markup
when reading the alleged SML, which is an additional
mental burden.  But I look forward to the development
of SPaths, SXSL, SDOM, SML Schemas, SPointers,
SLink, SInclusions, etc.  At best, SML will make it easier
to get exactly where we are today.

><element>
>  <sax:cache>
>    <foo>bar</foo>
>  </sax:cache>
></element>

Compare to XML:
    <foo sax:cache="active">bar</foo>
When an effect follows element scope, it is better practice
to use elements than PIs. Otherwise
    <?sax cache="on"?><foo>bar</foo><?sax cache="off"?>
or perhaps
    <sax:cache><foo>bar</foo></sax:cache>

><element>
>  <sax:cache>true</sax:cache>
>  <foo>bar</foo>
>  <sax:cache>false</sax:cache>
></element>

Compare to XML:
    <foo sax:cache="active">bar</foo>
When an effect follows element scope, it is better practice
to use elements than PIs. Otherwise
    <?sax cache="true"?><foo>bar</foo><?sax cache="false"?>
but of course we cannot comment too deeply on a snapshot.

Rick Jelliffe

    


From rev-bob at gotc.com  Sat Dec  4 09:58:04 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:21 2004
Subject: [OT] Apologies for the inconvenience....
Message-ID: <199912040457971.SM01128@Unknown.>

Sorry about the off-topic post, but I'll make it short.  To the West Coast DBer who 
contacted me Friday evening - you have my attention, but I initially dismissed your email 
as buckshot and hence trashed it prematurely.  Please re-send said message for mutual 
contact information.

(For obvious reasons, I could not send this privately, and since this was the forum of 
initial contact, this is the lowest-bandwidth way of re-establishing contact.  Again, I 
apologize to the rest of the list members.)


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/





From clark.evans at manhattanproject.com  Sat Dec  4 19:01:35 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:21 2004
Subject: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <013201bf3e3f$5eea7a30$55f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <Pine.LNX.4.10.9912040200070.24207-100000@cauchy.clarkevans.com>


Don's examples didn't demonstrate recursion, 
and this is the meat of the proposal.

On Sat, 4 Dec 1999, Rick Jelliffe wrote:
> From: Don Park <donpark@docuverse.com> 
> > <element>
> >  <pocket:attributes>
> >    <att>
> >      <ch>val</ch>
> >    </att>
> >  </pocket:attributes>
> >  <pocket:children>
> >    <foo>
> >      <pocket:text>bar</pocket:text>
> >    </foo>
> >  </pocket:children>
> ></element>
> 
> Compare to the less RSI-inducing XML:
>     <pocket ch="val"><foo>bar</foo></pocket>
> 
> Note that the XSL pattern to find the attribute ch of element
> pocket is  "pocket/@ch"  for the XML but 
> "element/pocket:children/../pocket:attributes/att/ch"
> for the alleged SML. It could be said that one could use
> "element/pocket:attributes/att/ch"  but then there is the
> validation possibility where the pocket:attributes elements
> are made part of some other element. 

Yet another reason for the distinction being set
at the syntax level, as it currently is with XML.

;) Clark




From clark.evans at manhattanproject.com  Sat Dec  4 18:52:10 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:21 2004
Subject: [YML] RE: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <001301bf3e2b$085ea740$099918d1@docuverse1>
Message-ID: <Pine.LNX.4.10.9912040110310.24207-100000@cauchy.clarkevans.com>

On Fri, 3 Dec 1999, Don Park wrote:
> What is the advantage of YML over these
> solutions?
 
> 3. Road Signs
> 
> <element>
>   <sax:cache>true</sax:cache>
>   <foo>bar</foo>
>   <sax:cache>false</sax:cache>
> </element>

This structure does not seem to be 
recursive, given that I'm interpreting
it as I would a processing instruction.

> 2. Parental Guidance
>
> <element>
>   <sax:cache>
>     <foo>bar</foo>
>   </sax:cache>
> </element>
 
This *could* be logically equivalent 
depending upon your interpretation.
Consider this instead:
 
<element>
  <sax:cache>
    <foo>
       <bar/>
    </foo>
  </sax:cache> 
</element>
 
If the processor would provide random access
to <bar/>, then the answer is no -- this is
not the same as YML.
 
If, however, you use a keyword like this 
to denote that the _immediate children_ 
are placed into random access storage, 
then the answer is almost.  In addition, a 
mechanism is required to guarantee that 
all of the random access children occur 
before the first sequential child.

 
> 1. Pockets
>
> <element>
>   <pocket:attributes>
>     <att>
>       <ch>val</ch>
>     </att>
>   </pocket:attributes>
>   <pocket:children>
>     <foo>
>       <pocket:text>bar</pocket:text>
>     </foo>
>   </pocket:children>
> </element>

First, I don't get what you had intended
with <pocket:text>bar</pocket:text>, so
let's consider this replaced with
<text>bar</text> to proceed.

If attributes/children means random/sequential
access, then, of the three, this is the
closest since there seems to be an implicit
requirement that all random access children
occur before sequential access children.

However, you did not use this construct
recursively -- <ch> and <text> were
not marked with the sequential/random
access distinction.  If this distinction 
were embedded into a binary syntax, then 
this type of problem would not occur.
So, here is the YML version of the above,
assuming sequential access for <ch>
and random access for <text>.
 
  <element
    <att>
       <ch>val</ch>
    </att>   
  >
    <foo 
      <text>bar</text>
    />
  </element>  

Or, using syntax sugar:

  <element att=<ch>val</ch> >
    <foo text="bar" />
  </element>

The major advantage here is that the doubly 
recursive syntax drives the processing 
choice: whether to collect attributes together 
and provide them via random access (DOM), 
or to provide them individually via 
sequential access (SAX).
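
For readers less familiar with the two access styles, here is a minimal
Python sketch of the distinction. It uses ordinary XML and the standard
library's xml.sax and xml.dom.minidom modules, not a YML parser (none is
assumed to exist); the Recorder class and the function names are
hypothetical.

```python
import xml.sax
from xml.dom import minidom

DOC = '<element><foo text="bar"/></element>'


class Recorder(xml.sax.ContentHandler):
    """Collects SAX events in the order the parser reports them."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        values = {key: attrs.getValue(key) for key in attrs.getNames()}
        self.events.append(("start", name, values))

    def endElement(self, name):
        self.events.append(("end", name))


def sequential(text):
    # SAX: each piece of the document is seen once, in document order.
    handler = Recorder()
    xml.sax.parseString(text.encode("utf-8"), handler)
    return handler.events


def random_access(text):
    # DOM: the whole tree is built first, then any node can be visited.
    dom = minidom.parseString(text)
    return dom.getElementsByTagName("foo")[0].getAttribute("text")


print(sequential(DOC))
print(random_access(DOC))
```

In standard XML the application picks one of these styles for the whole
document; the proposal above is to let the syntax make that choice per
element.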
 
I hope this moves things along!

Best Wishes,

Clark




From clark.evans at manhattanproject.com  Sat Dec  4 18:58:16 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:21 2004
Subject: [YML] Re: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <028b01bf3e37$8af1f7a0$5df5c13f@PaulTchistopolskii>
Message-ID: <Pine.LNX.4.10.9912040155210.24207-100000@cauchy.clarkevans.com>


Paul, I didn't get this at all.  Sorry.

On Sat, 4 Dec 1999, Paul Tchistopolskii wrote:
> There is also the *very* elegant
> 'reverse-polish-notation' approach
> proposed by Robert ( process an
> element when the Grove is in place,
> providing the execution stack ).
> 
> I am not sure he was talking about an
> execution stack; that was my attempt to
> understand how it could work.
> 
> The only drawback of such a view
> is that the  execution stack constantly
> grows and we need to clean it up
> sometimes.
> 
> However.
> 
> Because the multithreading approach should
> have the same drawback, I think that a
> workaround should already exist in the
> source code of SP ( thanks to Sean for
> pointing out that SP is an existing implementation
> of the multithreading approach ).
> 
> No namespaces, no extra markup  -
> just smart cleanup ( which could be easier
> than look-ahead, because the information
> needed to make a decision is already 'in place',
> right?  )

I'm talking about using a low-level recursive 
binary distinction in syntax to unify the 
behavior of SAX and DOM -- without the parser 
author needing *any* schema knowledge of the 
input stream, and without requiring any 
external processing guidelines.  

Clark




From uche.ogbuji at fourthought.com  Sat Dec  4 21:09:19 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:21 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
In-Reply-To: Your message of "Fri, 03 Dec 1999 13:13:10 EST."
             <14408.2102.136185.50050@localhost.localdomain> 
Message-ID: <199912042109.OAA19150@localhost.localdomain>

>  > I would really like you to reconsider this ordering of priorities.
>  > SAX2 is urgently needed for DOM implementors and developers of XSLT
>  > engines with streaming output.  You have done an admirable job of
>  > leading the SAX and SAX2 development, and it is dying without your
>  > output.  Now if you were simply buried with 9-5 work, and couldn't
>  > lend your efforts, it would be understandable.  But to detract from
>  > SAX2 in order to focus on SAX/C++, I think is muddling the
>  > priorities.
> 
> What in SAX2 is most urgently needed for DOM and XSLT?  I know that
> DOM level one *can* support some things that SAX doesn't report (such
> as comments and CDATA section boundaries), but there is nothing in DOM
> level one that says those have to be included, and I've heard of
> relatively few real-world applications that need that information.
> 
> I'm not as familiar with the situation in XSLT, and information would
> be helpful.

When I think about it more clearly, it is really the XSLT needs that stick 
out.  4XSLT uses DOM to process XSLT and uses SAX2 Alpha for output.  The 
sorts of things that need to be addressed in SAX for XSLT output include 
namespaces, Doctype declarations, and comments.  Note that 4DOM implements DOM 
Level 2 core, and thus we can support all the advanced features we need in the 
DOM, but SAX 1.0 falls short for input and output.
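
As an illustration of the namespace point only, here is a minimal sketch
using the SAX2-style binding in today's Python standard library (xml.sax),
which is not the 1999 SAX2 Alpha binding that 4XSLT used. With namespace
processing switched on, the parser reports each element as a
(namespace URI, local name) pair, which SAX 1.0's startElement could not
do. The NSRecorder class and the element_names function are hypothetical
names.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NSRecorder(ContentHandler):
    """Records the namespace-qualified name of every element start."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        # name is a (namespace URI, local name) pair.
        self.names.append(name)


def element_names(text):
    parser = xml.sax.make_parser()
    # Ask the parser for namespace-aware events.
    parser.setFeature(feature_namespaces, True)
    handler = NSRecorder()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(text.encode("utf-8")))
    return handler.names


print(element_names('<a xmlns="urn:x"><p:b xmlns:p="urn:y"/></a>'))
```

Doctype declarations and comments need further handlers beyond this and
are not shown here.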

If we can agree on general interfaces for SAX2, the Python community can take 
care of itself and develop the Python binding.  I don't see why the much 
larger C++ community can't do its own work as well.  We should be working on 
general, language-independent interfaces for XML development and let the 
language-specific needs catch up as needed.  Reference implementations and all 
that are fine, and so I have no problem with SAX being specified in Java.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sat Dec  4 21:12:42 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:21 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
In-Reply-To: Your message of "Fri, 03 Dec 1999 20:11:53 GMT."
             <006501bf3dca$a4767ce0$4a5eedc1@arp01> 
Message-ID: <199912042112.OAA19167@localhost.localdomain>

> > What in SAX2 is most urgently needed for DOM and XSLT?  I know that
> > DOM level one *can* support some things that SAX doesn't report (such
> > as comments and CDATA section boundaries), but there is nothing in DOM
> > level one that says those have to be included, and I've heard of
> > relatively few real-world applications that need that information.
> 
> How about focusing on SAX/2, and making the first C/C++ SAX interface
> actually SAX 2 so we kill two birds with one stone ?

This to me seems the most sensible approach.  Especially when SAX2 has had so 
much discussion and is potentially so close to completion.  The C++/SAX 
discussion is just starting and could go on for months.  It might as well be 
built around an up-to-date standard.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From KenNorth at email.msn.com  Sat Dec  4 21:09:10 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun  7 17:18:21 2004
Subject: Security alerts: XML redirect in IE 5.0, MiniZip worm
References: <199912040457971.SM01128@Unknown.>
Message-ID: <004e01bf3e9b$ac8df260$0b00a8c0@grissom>

Earlier this week I sent a security alert to each xml-dev member whose
e-mail address was in my IN basket.

I received an e-mail worm earlier in the week and sent a warning not to open
an attachment named ZIPPED_FILES.EXE. The MiniZip worm propagates by mailing
itself so I thought it best to err on the side of caution and send a
warning. (I don't know whether MiniZip leaves a copy of sent messages in the
Sent folder.)

Members should also be aware of an Internet Explorer 5.0 security problem
related to XML redirects:
http://www.ntsecurity.net/go/load.asp?iD=/security/IE54.htm




From uche.ogbuji at fourthought.com  Sat Dec  4 21:16:09 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++ vs. SAX2 
In-Reply-To: Your message of "Fri, 03 Dec 1999 13:21:38 EST."
             <14408.2610.245842.199581@localhost.localdomain> 
Message-ID: <199912042116.OAA19183@localhost.localdomain>

> I'd like to hear what others think on this issue.  There was some
> interest in SAX2 when I posted my alpha interfaces a few months back
> (most notably, but not exclusively, from David Brownell), but it was
> hardly a tidal wave.  On the other hand, I am noticing a building
> pressure from implementors to get something out in C++.

Have you considered that the lack of heavy discussion after you posted the 
SAX2 alpha was because of its high quality?  There is little to argue with.  
The Python/XML group was able to hammer out a Python binding based on your 
alpha in short order.  We have put this to practical use in 4XSLT, and it 
works very well.  I don't think there needs to be a lot more discussion on the 
way to SAX 2.0, but I hardly think that minimizes its importance.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From paul at qub.com  Sun Dec  5 00:08:33 1999
From: paul at qub.com (Paul Tchistopolskii)
Date: Mon Jun  7 17:18:22 2004
Subject: [YML] Re: YML: A Grand Unification of SAX and DOM? (fwd)
References: <Pine.LNX.4.10.9912040155210.24207-100000@cauchy.clarkevans.com>
Message-ID: <006f01bf3eb4$a0c71420$5df5c13f@PaulTchistopolskii>


> Paul, I didn't get this at all.  Sorry.

I think it's because you are concentrating on 
a different task than I am. I'm thinking about 
mixing streaming and Groves for processing 
XML ( SML ) documents.
 
> On Sat, 4 Dec 1999, Paul Tchistopolskii wrote:
> > There is also the *very* elegant
> > 'reverse-polish-notation' approach
> > proposed by Robert ( process an
> > element when the Grove is in place,
> > providing the execution stack ).
> > 
> > I am not sure he was talking about an
> > execution stack; that was my attempt to
> > understand how it could work.
> > 
> > The only drawback of such a view
> > is that the  execution stack constantly
> > grows and we need to clean it up
> > sometimes.
> > 
> > However.
> > 
> > Because the multithreading approach should
> > have the same drawback, I think that a
> > workaround should already exist in the
> > source code of SP ( thanks to Sean for
> > pointing out that SP is an existing implementation
> > of the multithreading approach ).
> > 
> > No namespaces, no extra markup  -
> > just smart cleanup ( which could be easier
> > than look-ahead, because the information
> > needed to make a decision is already 'in place',
> > right?  )
> 
> I'm talking about using a low-level recursive 
> binary distinction in syntax to unify the 
> behavior of SAX and DOM -- without the parser 
> author needing *any* schema knowledge of the 
> input stream, and without requiring any 
> external processing guidelines.  

Your approach is: "if we write our document
providing some extra information, it will be easier 
for the processing API to make a decision about how 
to process it".

Even though I found your proposal to be very elegant, 
I don't like that idea in principle. It's the attributish 
way, where one is marking 'road-signs' or 'pockets' 
in the document. The document should be about the content, 
not about the 'road-signs', PIs, and other such stuff.

The stylesheet is about processing ;-)

I'd prefer to attach the 'road-signs' at runtime.

I see 2 ways for now to change the processing of 
XML ( SML ) documents without changing the documents 
themselves:

    a simple SAX-based 'switcher' 
    a reverse-polish-notation view 

Once I understand which way makes life easier for 
streaming XSLT I may write more. It is all getting 
hard.

Rgds.Paul.




From clark.evans at manhattanproject.com  Sun Dec  5 00:23:00 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:22 2004
Subject: [YML] Re: YML: A Grand Unification of SAX and DOM? (fwd)
In-Reply-To: <006f01bf3eb4$a0c71420$5df5c13f@PaulTchistopolskii>
Message-ID: <Pine.LNX.4.10.9912040713090.24951-100000@cauchy.clarkevans.com>


On Sat, 4 Dec 1999, Paul Tchistopolskii wrote:
> I think it's because you are concentrating on 
> a different task than I am. I'm thinking about 
> mixing streaming and Groves for processing 
> XML ( SML ) documents.

Sounds similar enough... and  our end goal is the
same, a more efficient XSL processor.

> Your approach is: "if we write our document
> providing some extra information, it will be easier 
> for the processing API to make a decision about how 
> to process it".

Let me rephrase:  "if we design our documents
in such a way that the information dependencies
are identified, and we use a syntax to demarcate these
dependencies, then a parser can better support
the processor by providing either sequential or
random access depending upon the context."

> Even though I found your proposal to be very elegant, 
> I don't like that idea in principle. It's the attributish 
> way, where one is marking 'road-signs' or 'pockets' 
> in the document. The document should be about the content, 
> not about the 'road-signs', PIs, and other such stuff.
>
> The stylesheet is about processing ;-)

Fair enough.. but I would say that stylesheets are
about transforming, not about providing the information
in an accessible way that supports the dependencies.

> I'd prefer to attach the 'road-signs' at runtime.

Well, you will have to do this based on some distinction.
I'd be interested to see what you pick in the end.  I'm 
putting it at the syntax level so that the designers
of the content can have control over it.
 
> I see 2 ways for now to change the processing of 
> XML ( SML ) documents without changing the documents 
> themselves:
> 
>     a simple SAX-based 'switcher' 
>     a reverse-polish-notation view 
> 
> Once I understand which way makes life easier for 
> streaming XSLT I may write more. It is all getting 
> hard.

Cool.  I'd like to hear more about the RPN view.
This sounds interesting.

;) Clark




From tpassin at idsonline.com  Sun Dec  5 01:29:51 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:22 2004
Subject: Request for Discussion: SAX 1.0 in C++ 
References: <199912042112.OAA19167@localhost.localdomain>
Message-ID: <001301bf3ec0$d47d7d20$5afbb1cd@tomshp>


From: <uche.ogbuji@fourthought.com>


> > > What in SAX2 is most urgently needed for DOM and XSLT?  I know that
> > > DOM level one *can* support some things that SAX doesn't report (such
> > > as comments and CDATA section boundaries), but there is nothing in DOM
> > > level one that says those have to be included, and I've heard of
> > > relatively few real-world applications that need that information.
> >
> > How about focusing on SAX/2, and making the first C/C++ SAX interface
> > actually SAX 2 so we kill two birds with one stone ?
>
> This to me seems the most sensible approach.  Especially when SAX2 has had
> so much discussion and is potentially so close to completion.  The C++/SAX
> discussion is just starting and could go on for months.  It might as well
> be built around an up-to-date standard.
>
> --
> Uche Ogbuji

I'd second this, with the thought that most of the effort would concentrate
on finishing the SAX2 interface itself before spending the potential
"months" on the C++/SAX implementation.

Tom Passin




From matthew at praxis.cz  Sun Dec  5 16:28:02 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <Pine.GHP.4.21.9912032153220.11243-100000@mail.ilrt.bris.ac.uk>
Message-ID: <384A9281.B395AAA@praxis.cz>

Dan Brickley wrote:
> I believe it will be possible to annotate XML schemas with information
> for mapping into (generic or domain specific) application datamodels
> such as RDF. I don't think it is right to expect the hard-pressed XML
> Schema group to define all these mappings within that working group.
> But that doesn't matter; all we need is a placeholder for such
> information.

I totally agree. As long as these considerations are being taken into
account, I'm sure there will be plenty of people experimenting with
various approaches. This will certainly lead to a better understanding
of how to address these issues than simply mandating something that was
worked out by a committee.

> My understanding of the Cambridge Communique meeting was that we reached
> agreement on just this. See points 1-6 under '3. Observations and
> Recommendations' in http://www.w3.org/TR/1999/NOTE-schema-arch-19991007
<snip>

The need to develop an abstract schema for representing objects and
properties is very clear; one of the problems people have with
understanding RDF is that this need seems so obvious that they assume
XML already fills it. The real question is whether a separate RDF syntax
is the appropriate way to do this. I see a lot of value in seeing this
information as an extension of the information currently provided in an
XML schema (i.e. basically a serialization of the XML infoset). The
overlap is great because both RDF schemas and XML schemas are working
with the same basic informational units (assuming you accept the mapping
of class -> element type).

The RDF model also seems awfully complex for normal mortals. If the
stated target were knowledge management specialists, then there is
clearly an important niche market for a very complete mechanism for
specifying semantic relationships between resources. If we are talking
about the standard mechanism for object interchange on the Web, a
simpler mechanism (adding, for example, only the notion of properties
and strong datatyping) implemented inside an XML schema has a much
greater chance of being widely accepted. Of course, I'm only guessing
that XML schemas will be very widespread anyway, so there's plenty of
room for disagreement.

> Sure, you could do this. My hunch is that the urge to do this won't be
> as strong when we have more abstract (objects and properties) interfaces
> to XML content, rather than our current APIs that obsess on detail of
> particular serialisations rather than on what those serialisations have
> told us about the objects. If we could get to a world where generic
> rather than domain interfaces being useful to even 10% instead of 5% of
> applications (to borrow your figure), that'd be a huge win.

Interesting insight. I see your point, but I also see this as
supporting the argument for making any XML instance a potential object
by using the associated schema to convey information about object
properties. This would mean that there would be a "new DOM" that only
works on valid instances. If you do have a schema, it should be possible
to exploit this directly by having better generic interfaces, rather than
trying to treat well-formed and valid instances in the same way.

> There is also a need to know the objects'n'properties view of the data
> without going to fetch (or having advance knowledge of) the
> syntactic schema or serialisation policy. RDF's
> initial syntax was one approach; there have been and will be
> others. The Microsoft folks were for a while throwing around some
> interesting ideas on mapping more 'colloquial' XML syntax into directed
> labelled graphs. There's a version at http://www.biztalk.org/Resources/canonical.asp
> for example.

Cool, I will have to take a much closer look at that. It seems to be
very close to what I am talking about. Thanks for the tip.

> I've also heard that some folks want to use it for structured hypertext
> documents...
> 
> (One consequence of XML's document heritage is that document order is
> generally treated as meaningful and in need of preservation. This can be
> a pain in the butt for data-centric apps.)

Fair enough, but the potential for object interchange is what is getting
people really excited. Nothing about having object facilities in the XML
schema language precludes the use of XML for structured documents. But
if we are talking about a web "vision", the potential for easy
interchange of data between applications is more likely to have a
revolutionary impact than the use of structured documents and
stylesheets. I also think that many of these object facilities will
actually turn out to be very useful for what are normally considered to
be documents.

> I don't see a conflict here. RDF is happy with multiple ways of shipping
> I'd love to see examples of an annotated XML Schema that shows how to
> derive an objects'n'properties view of instance data.

I am going to have to write up something about this. Stay tuned...

Matthew

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From matthew at praxis.cz  Sun Dec  5 16:32:19 1999
From: matthew at praxis.cz (Matthew Gertner)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com> <38480038.6028FEF2@praxis.cz> <014a01bf3dba$dfff38c0$eb020a0a@bowstreet.com>
Message-ID: <384A9384.678B366C@praxis.cz>

James Tauber wrote:
> Are you achieving this by expressing how certain element types relate to
> other element types and to concepts? A semantic network?
> 
> If so, you are still ultimately relating the elements to concepts you are
> probably going to define by human prose or running code.
> 
> I'm not arguing with this idea. I think it probably has some promise. But
> the real semantics are ultimately introduced into the system by agreed to
> concepts that aren't expressed via schemata. A schema is part of the
> picture, but not the whole.
> 
> I'll go back and read your Web Vision post.

As long as human beings are the only plausible "end consumers" of these
documents, their semantics will always be determined ultimately by fuzzy
things like intentions and expectations. The semantic constraints I am
talking about are one step away from these "ultimate" semantics; they
tell you that an integer contained in a given element cannot be greater
than 100, but they don't tell you why. These are still semantics to me
and they provide tremendous value when you want to process a broad range
of documents generically (which might, for example, involve generating
an application-specific interface for any arbitrary schema).

Matthew



From ricko at allette.com.au  Sun Dec  5 17:45:48 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <003101bf3f4c$1b467550$09f96d8c@NT.JELLIFFE.COM.AU>

From: Matthew Gertner <matthew@praxis.cz>

>Dan Brickley wrote:
>> I believe it will be possible to annotate XML schemas with information
>> for mapping into (generic or domain specific) application datamodels
>> such as RDF. I don't think it is right to expect the hard-pressed XML
>> Schema group to define all these mappings within that working group.
...
>
>I totally agree. As long as these considerations are being taken into
>account, I'm sure there will be plenty of people experimenting with
>various approaches. This will certainly lead to a better understanding
>of how to address these issues than simply mandating something that was
>worked out by a committee.

In this vein, schematron-rdf at
http://www.ascc.net/xml/resource/schematron/schematron.html
generates RDF documents (currently with bogus XLinks, but you can
customize it easily) based on Schematron schemas. In this case, the
schema is not converted to RDF; rather, the RDF shows which assertions in
the schema apply to each element in the instance. This is a rather
different use for schemas: as programs for automated annotation.

The thing that became immediately clear from working on it was that RDF
is good for arcs (relationships) but grammar-based schemas largely hide
these relationships (between elements, attributes, data) behind a few
generic but superficial types: containment, sequence, repetition.
Schematron assertions now allow a "role" attribute, for labelling
classes of arcs.

I think developers of other schema languages might consider this
kind of thing too: the connectors between particles of patterns
(e.g., compositors in the content models in a grammar-based schema
language) should have some role attribute (and documentation?) for
labelling their significance: for example, if element A must be followed
by element B, to say why. The nodes that conventional schemas define
(e.g. elements and attributes) are interesting, but the arcs between
them can also be very interesting for automatic annotation using RDF.

Rick Jelliffe




From ricko at allette.com.au  Sun Dec  5 17:56:31 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <005001bf3f4d$9c82e940$09f96d8c@NT.JELLIFFE.COM.AU>


From: Dan Brickley <Daniel.Brickley@bristol.ac.uk>

>(One consequence of XML's document heritage is that document order is
>generally treated as meaningful and in need of preservation. This can be
>a pain in the butt for data-centric apps.)

Perhaps a major part of the problem is that sometimes the document order
is meaningful and other times just an artifact of there being no "&"
connector in XML content models, and there is no way to decide.  And
when the order is important, there is no way to label what its
significance is; indeed, the same thing is true of every axis including
the children and parent axes.

Rick Jelliffe




From wperry at fiduciary.com  Sun Dec  5 18:59:45 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <005001bf3f4d$9c82e940$09f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <384AB61C.D947D256@fiduciary.com>

Is this not precisely the reason that 'behaviour' (or whatever we are eventually to call it)
of XLinks is indispensable? Not as a replacement for the document-centric assumptions that
text order is meaningful or that the implicit parent-child relationship of element containment
is significant, but as the mechanism for specifying (granted, to a perhaps more data-oriented
audience) either where these relationships should be explicit, or where they are replaced by
explicitly presented alternatives.

Respectfully,

Walter Perry


Rick Jelliffe wrote:

> Perhaps a major part of the problem is that sometimes the document order
> is meaningful and other times just an artifact of there being no "&"
> connector in XML content models, and there is no way to decide.  And
> when the order is important, there is no way to label what its
> significance is; indeed, the same thing is true of every axis including
> the children and parent axes.




From ricko at allette.com.au  Sun Dec  5 21:13:14 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <000601bf3f69$16947fd0$3bf96d8c@NT.JELLIFFE.COM.AU>

Yes, except XLinks are specified in instances, not as a schema per se.
I hope the XML Schema will have some extension mechanism to
allow these kinds of thing, but who knows.

It is true that sequence and containment relations between elements
in a content model could be treated as some kind of extended link.

<xlink:extended>
    <xlink:locator href="http://xxx?xpointer=//boy/dirt" role="cause" />
    <xlink:locator href="http://xxx?xpointer=//boy/dirt/followingSibling::mud"
                   role="effect" />
</xlink:extended>

(You could modify schematron-rdf to generate these kinds of
XLinks pretty easily.)

But the trouble with attempting to use XLinks to directly declare
some part of an XML Schema is that there is no way to interact nicely
with content models using maxOccurs: if a dog has two eyes and
we want to link between them, and the two eyes are declared using
a single <type name="eye" maxOccurs="2" />, then we are sunk.
We really want to link in the instance, not in the schema.

And we cannot use hrefs to the instances because we don't know
what the instance document URI is: a URI identifies a particular
resource, not a class of resources.  XLinks are not designed for
use as schema declarations.

So I think there needs to be first-class support for this in the
schema language itself: in the case of XML Schemas, probably
the most feasible thing would be a role attribute (or some equivalent)
on groups.  There is not much there to hook onto. Any ideas on this
would be welcome, even if just to help me think through the
issues for Schematron.

Rick Jelliffe

From: W. E. Perry <wperry@fiduciary.com>

>Is this not precisely the reason that 'behaviour' (or whatever we are eventually to call it)
>of XLinks is indispensable? Not as a replacement for the document-centric assumptions that
>text order is meaningful or that the implicit parent-child relationship of element containment
>is significant, but as the mechanism for specifying (granted, to a perhaps more data-oriented
>audience) either where these relationships should be explicit, or where they are replaced by
>explicitly presented alternatives.
>
>Rick Jelliffe wrote:
>
>> Perhaps a major part of the problem is that sometimes the document order
>> is meaningful and other times just an artifact of there being no "&"
>> connector in XML content models, and there is no way to decide.  And
>> when the order is important, there is no way to label what its
>> significance is; indeed, the same thing is true of every axis including
>> the children and parent axes.




From Colas.Nahaboo at sophia.inria.fr  Mon Dec  6 00:08:28 1999
From: Colas.Nahaboo at sophia.inria.fr (Colas Nahaboo)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions) 
In-Reply-To: Your message of "Fri, 03 Dec 1999 15:10:14 EST."
             <A51F7543E295D2118D6600A024CDB2F71B9D77@MAILPROD> 
Message-ID: <199912060005.BAA25394@koala.inria.fr>


Vane Lashua writes:
> What's the point of defining "Point"?

I do not want to define it in XML (or rather, the language to occupy the
ecological niche of XML). I just want to transport its data (instance
contents).

> "Point" is meaningless by itself -- even
> though it may be syntactically correct, in a context, with normalized attributes
> and values -- without a specific processor that understands what a "Point"
> is.

Of course!!! You want to separate unrelated problems if you want to stay clear
of combinatorial complexity explosion... Understanding the semantics of data
(classes, the typing system, and more...) belongs to the application (helped
by a Schema language if a good one exists), *NOT* to the "transport" XML layer.
If you want to invent a language to express the semantics of objects, well,
good luck, but I don't want to wait for your effort to succeed before being
able to transport my data contents :-)

I want XML *NOT* to try to understand what it doesn't know about.


--
Colas Nahaboo, Koala/Dyade/Bull @ INRIA Sophia, http://www.inria.fr/koala/colas




From stele at fxtech.com  Mon Dec  6 00:36:34 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
Message-ID: <384B04DA.DCD6BAED@fxtech.com>

I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
solutions I've seen are very simple or straightforward for generic
application data I/O (ie. non web, e-commerce, Java-type stuff). In
other words, I'm about to roll my own, and would like to gauge interest
in a small callback-based API for simple XML I/O.

--
Paul Miller - stele@fxtech.com



From xml-dev at teleo.net  Mon Dec  6 00:58:33 1999
From: xml-dev at teleo.net (Patrick Phalen)
Date: Mon Jun  7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: <384B04DA.DCD6BAED@fxtech.com>
References: <384B04DA.DCD6BAED@fxtech.com>
Message-ID: <9912051700340G.00844@quadra.teleo.net>

[Paul Miller, on Sun, 05 Dec 1999]
:: I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
:: solutions I've seen are very simple or straightforward for generic
:: application data I/O (ie. non web, e-commerce, Java-type stuff). In
:: other words, I'm about to roll my own, and would like to gauge interest
:: in a small callback-based API for simple XML I/O.

Not sure what you mean. Are you talking about IPC, RPC?
Have you looked at XML-RPC and SOAP?

<Heh. Nine acronyms embedded in one brief msg.>



From stele at fxtech.com  Mon Dec  6 01:08:39 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net>
Message-ID: <384B0C65.2A6710C0@fxtech.com>

> :: I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
> :: solutions I've seen are very simple or straightforward for generic
> :: application data I/O (ie. non web, e-commerce, Java-type stuff). In
> :: other words, I'm about to roll my own, and would like to gauge interest
> :: in a small callback-based API for simple XML I/O.

> Not sure what you mean. Are you talking about IPC, RPC?
> Have you looked at XML-RPC and SOAP?

I should have been more clear. I just want to use XML for simple
non-web-bound application data files (document files). I need a
non-validating parser that I can use to efficiently parse my application
data, without all the complexity (and overhead) of something like DOM,
but not as general-purpose as expat.
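
To make the idea concrete, here is a rough sketch of what a small
callback-based reader could look like, assuming a C++11 compiler. The names
XMLCallbacks and ParseSimpleXML are purely hypothetical, and the scanner is
deliberately naive: it only recognises start tags, end tags, empty-element
tags and character data, skips attributes, and does not handle comments,
CDATA sections, processing instructions or entities.

```cpp
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical callback set; any callback may be left empty.
struct XMLCallbacks
{
    std::function<void(const std::string &name)> startElement;
    std::function<void(const std::string &name)> endElement;
    std::function<void(const std::string &text)> characters;
};

// Scan a document held in memory and fire the callbacks in document order.
void ParseSimpleXML(const std::string &doc, const XMLCallbacks &cb)
{
    std::size_t pos = 0;
    while (pos < doc.size())
    {
        if (doc[pos] != '<')
        {
            // Character data runs up to the next tag (or the end of input).
            std::size_t next = doc.find('<', pos);
            if (next == std::string::npos)
                next = doc.size();
            if (cb.characters)
                cb.characters(doc.substr(pos, next - pos));
            pos = next;
            continue;
        }

        std::size_t close = doc.find('>', pos);
        if (close == std::string::npos)
            throw std::runtime_error("unterminated tag");
        std::string tag = doc.substr(pos + 1, close - pos - 1);
        pos = close + 1;

        if (!tag.empty() && tag[0] == '/')
        {
            // End tag: everything after the slash is taken as the name.
            if (cb.endElement)
                cb.endElement(tag.substr(1));
            continue;
        }

        // An empty-element tag ends with a slash, e.g. <br/>.
        bool isEmpty = !tag.empty() && tag[tag.size() - 1] == '/';
        if (isEmpty)
            tag.erase(tag.size() - 1);

        // The name ends at the first whitespace; attributes are ignored here.
        std::string name = tag.substr(0, tag.find_first_of(" \t\r\n"));
        if (cb.startElement)
            cb.startElement(name);
        if (isEmpty && cb.endElement)
            cb.endElement(name);
    }
}
```

An application would fill in only the callbacks it cares about and pass the
whole file contents as a string. A real reader would still need attribute
handling and well-formedness checks, which is where something like expat
earns its keep.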

> <Heh. Nine acronyms embedded in one brief msg.>

Yeah, XML has definitely helped spawn plenty of new TLAs.

--
Paul Miller - stele@fxtech.com



From jtauber at jtauber.com  Mon Dec  6 01:24:29 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <3.0.32.19991201124035.0153b920@pop.intergate.ca> <m3bt9hp6g4.fsf@localhost.localdomain> <38459FA4.DAA00E35@praxis.cz> <m366zpp3m8.fsf@localhost.localdomain> <38463AB4.36C5292B@praxis.cz> <m3aenta7qn.fsf@localhost.localdomain> <38466487.328D1CFA@praxis.cz> <m34se19zkv.fsf@localhost.localdomain> <38478DB2.FACA4633@praxis.cz> <m3ln7chzrd.fsf@localhost.localdomain> <3847E71D.3F32FFA9@praxis.cz> <00a001bf3db1$ed45e5f0$eb020a0a@bowstreet.com> <38480038.6028FEF2@praxis.cz> <014a01bf3dba$dfff38c0$eb020a0a@bowstreet.com> <384A9384.678B366C@praxis.cz>
Message-ID: <00c101bf3f88$bc0cc980$eb020a0a@bowstreet.com>

> The semantic constraints I am
> talking about are one step away from these "ultimate" semantics; they
> tell you that an integer contained in a given element cannot be greater
> than 100, but they don't tell you why. These are still semantics to me

Ah. This is why I have some difficulty understanding some of what you
are saying. To me, the constraint that an integer cannot be greater than 100
is not semantics. It's syntax.

MyInteger ::= ( '100' | digit{1,2} | '-' digit+ )

or in some more perspicuous grammar:

MyInteger = Integer x : x <= 100
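
As a minimal sketch of that point, assuming C++11 and <regex>, the first
production can be checked as pure syntax with no arithmetic at all. The
function name is illustrative only. The pattern accepts exactly what the
production says, so a string such as "007" is rejected even though its
numeric value is below 100.

```cpp
#include <regex>
#include <string>

// Syntactic check for: MyInteger ::= ( '100' | digit{1,2} | '-' digit+ )
// The whole string must match one of the three alternatives.
bool matchesMyInteger(const std::string &text)
{
    static const std::regex production("100|[0-9]{1,2}|-[0-9]+");
    return std::regex_match(text, production);
}
```

Calling matchesMyInteger("42") or matchesMyInteger("-7") succeeds, while
matchesMyInteger("101") fails without the value ever being converted to a
number.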

James




From murata.makoto at fujixerox.co.jp  Mon Dec  6 02:18:23 1999
From: murata.makoto at fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:18:22 2004
Subject: Problem for mathematically minded XML experts
Message-ID: <199912060220.AA03501@archlute.fujixerox.co.jp>

>I would like to pose the following question to those of you with a
>mathematical bent who have some spare time (I'm not expecting the answer
>quickly - this is a holiday teaser I suspect!).

I gave a talk at XTech'99 on automatic construction of 
intersection/union/difference of schemata (or DTDs).  I even demonstrated 
my prototype implementation.

>is there a formula that can be used to determine whether the same pair of
>elements are valid in both DTD1 and DTD2? If there is, is there a way to
>determine the difference caused by the following conditions being added:

If the intersection of two schemata (or DTDs) is not empty, such
elements exist.

My slides and annotated log file of my demonstration are available at:

	http://www.geocities.com/ResearchTriangle/Lab/6259/XTech99/index.htm

You might want to retrieve this single file which contains all HTML pages 
and image files.

	http://www.geocities.com/ResearchTriangle/Lab/6259/XTech99/xtech99.zip

An introduction to hedge automata is available at:

	http://www.geocities.com/ResearchTriangle/Lab/6259/hedge_nice.pdf

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata.makoto@fujixerox.co.jp



From sdr at camsoft.com  Mon Dec  6 02:31:48 1999
From: sdr at camsoft.com (Stewart Rubenstein)
Date: Mon Jun  7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net> <384B0C65.2A6710C0@fxtech.com>
Message-ID: <384B2008.A8E62B7D@camsoft.com>

I was easily able to use XML for exactly this sort of thing.  For
reading, I used James Clark's expat parser with Andy Dent's expatpp
wrapper for C++, and it dropped in quite easily.  You can get them both
from <http://www.highway1.com.au/adsoftware/expatpp.htm>.

Writing XML is almost too easy to bother getting help for.  You do have
to take some care if you're going to be dealing with text beyond
US-ASCII.  Fortunately, both of my OSes, MacOS and Windows, have fairly
decent Unicode support now.
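
One piece of care that applies even to plain ASCII is escaping markup
characters in text and attribute values. A minimal sketch follows, assuming
C++11; XMLEscape is a hypothetical helper, not something from expat or
expatpp. Bytes outside US-ASCII are passed through untouched, which is only
correct if the string is already in the encoding the document declares
(UTF-8, for example).

```cpp
#include <string>

// Replace the five characters that have predefined entities in XML, so the
// result is safe in element content and in quoted attribute values.
std::string XMLEscape(const std::string &raw)
{
    std::string out;
    out.reserve(raw.size());
    for (char c : raw)
    {
        switch (c)
        {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += c;        break;
        }
    }
    return out;
}
```

A writer would then emit something like sink << XMLEscape(value) wherever
application text ends up inside the markup.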

My application already has an object tree, so I just wrote the following
in the base class, and implemented the obvious virtual functions in the
subclasses that can exist in the tree:

void CDXObject::XMLWrite(std::ostream &sink) const
{
	// First write the opening tag and the attributes.
	sink << "<" << XMLObjectName() << std::endl;
	
	// The id is the only totally generic attribute
	if (m_objectID != 0)
		sink << " " << kCDXML_id << "=\"" << m_objectID << "\"" << std::endl;

	// This is overridden by subclasses to write any object-specific attributes
	XMLWriteAttributes(sink);
	
	// If there's nothing to write inside the element, emit an empty-element tag
	if (!XMLNeedToWriteContent() && m_contents.empty())
		sink << "/>";
	else
	{
		sink << ">";
		XMLWriteContent(sink);
		for (CDXObjectContentMap::const_iterator i = m_contents.begin();
			i != m_contents.end();  ++i)
			GetObject(i)->XMLWrite(sink);	// write each of the contained objects
		sink << "</" << XMLObjectName() << ">";
	}
}

Paul Miller wrote:
> 
> > :: I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
> > :: solutions I've seen are very simple or straightforward for generic
> > :: application data I/O (ie. non web, e-commerce, Java-type stuff). In
> > :: other words, I'm about to roll my own, and would like to gauge interest
> > :: in a small callback-based API for simple XML I/O.
> 
> > Not sure what you mean. Are you talking about IPC, RPC?
> > Have you looked at XML-RPC and SOAP?
> 
> I should have been more clear. I just want to use XML for simple
> non-web-bound application data files (document files). I need a
> non-validating parser that I can use to efficiently parse my application
> data, without all the complexity (and overhead) of something like DOM,
> but not as general-purpose as expat.
> 
> > <Heh. Nine acronyms embedded in one brief msg.>
> 
> Yeah, XML has definitely helped spawn plenty of new TLAs.
> 
> --
> Paul Miller - stele@fxtech.com
> 



From jjc at jclark.com  Mon Dec  6 03:40:20 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:22 2004
Subject: expat meant to be restartable?
References: <38482AC7.95973198@fxtech.com>
Message-ID: <38488A77.C4ACEDA@jclark.com>

Paul Miller wrote:
> 
> I don't think it is, but I want to check. I'd like to be able to reuse
> an XML_Parser after I've called XML_Parse with isFinal set to 1.
> Basically I want to go back and parse a subset of the original file,
> using modified starting buffer pointer and length, but it doesn't seem
> to work (I get a JUNK_AFTER_DOC_ELEMENT error). I would like to avoid
> creating a new parser for each element subtree I scan.

Expat doesn't support that.

James




From msf at mds.rmit.edu.au  Mon Dec  6 03:47:55 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>; from David Megginson on Fri, Dec 03, 1999 at 01:21:38PM -0500
References: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <19991206144718.D3543@io.mds.rmit.edu.au>

On Fri, Dec 03, 1999 at 01:21:38PM -0500, David Megginson wrote:
[Re: whether to work on SAX2 or on SAX/C++]
> I can think of a few reasons that the world might desperately be
> waiting for SAX2:
> 
> 1. To get some kind of standard Namespace support (or at least a way
>    to tell whether a parser has Namespace support built in).
> 
> 2. To query parser features in general.
> 
> 3. To get at the stuff that SAX 1.0 doesn't report, like comments,
>    CDATA boundaries, and DTD declarations.
> 
> I'm very interested in hearing other opinions.  Having a standard
> streaming interface stimulated a lot of development of reusable Java
> XML processing components, and I'd like to see the same thing happen
> in C++, but I need to hear what other people think the priorities
> should be.

#1 clearly is important, if only to ensure that SAX remains a
desirable and viable interface. If application developers or
parser writers start to walk away from SAX due to a lack
of namespace support, then SAX will rapidly die.

#3 is vital for many XML *processing* applications.

If you want to provide a SAX interface to an XML database server
that must be able to round-trip documents, SAX 1.0 isn't enough.
If you're writing an editor, or an XSLT engine, or a compound
document manager, or a transport protocol like WBXML, you want
or need to know about things that are in the SAX2 LexicalHandler
(e.g. CDATA sections, comments), NamespaceHandler, and DeclHandler.

For other applications, #3 isn't relevant. But that's the value of #2:
parser writers can implement the features and support the properties
they wish to, and application writers can selectively invoke that
functionality.
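
A rough sketch of how #2 might look from C++, using a made-up class. The
names FeatureAwareParser, recognises, getFeature and setFeature are
assumptions for illustration and do not come from any SAX/C++ draft; the two
feature URIs follow the naming used by the Java SAX2 work as I understand it.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical parser that exposes named on/off features.
class FeatureAwareParser
{
public:
    FeatureAwareParser()
    {
        // Features this imaginary parser knows about, with their defaults.
        m_features["http://xml.org/sax/features/namespaces"] = true;
        m_features["http://xml.org/sax/features/validation"] = false;
    }

    // True if the parser recognises the feature name at all.
    bool recognises(const std::string &name) const
    {
        return m_features.count(name) != 0;
    }

    // Current state; throws for a name the parser does not recognise.
    bool getFeature(const std::string &name) const
    {
        std::map<std::string, bool>::const_iterator it = m_features.find(name);
        if (it == m_features.end())
            throw std::invalid_argument("unrecognised feature: " + name);
        return it->second;
    }

    // Change a feature; throws for a name the parser does not recognise.
    void setFeature(const std::string &name, bool value)
    {
        std::map<std::string, bool>::iterator it = m_features.find(name);
        if (it == m_features.end())
            throw std::invalid_argument("unrecognised feature: " + name);
        it->second = value;
    }

private:
    std::map<std::string, bool> m_features;
};
```

An application that needs namespaces can ask up front and refuse to run with
a parser that does not recognise the feature, instead of finding out halfway
through a document.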

As it happens, I'm in the process of implementing a SAX interface for
a couple of SIM-related projects. We need, I think, the functionality
that SAX2 provides. Given that our code base is in C++, I guess my
vote is for both: a stable SAX2 and a standard C++ definition.

But having taken a look at SAX2, not much seems to be wrong with it.
Whereas there are already close to a dozen SAX/C++ variants, and climbing.
*That* trend needs to be stepped on, and quickly, before it gets out of hand.

Michael
____________________________________________
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.



From msf at mds.rmit.edu.au  Mon Dec  6 04:04:47 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: <14406.59075.218048.437305@localhost.localdomain>; from David Megginson on Thu, Dec 02, 1999 at 04:38:11PM -0500
References: <14406.59075.218048.437305@localhost.localdomain>
Message-ID: <19991206150418.E3543@io.mds.rmit.edu.au>

On Thu, Dec 02, 1999 at 04:38:11PM -0500, David Megginson wrote:
> Here are some of the differences between the SAX/Java interfaces and the 
> SAX/C++ interfaces:
> 
> - lots of const
> - C++ const char * for Java String throughout (and, thus, UTF-8
>   instead of UTF-16)
> - InputSource doesn't have an equivalent of Java Reader (no getReader
>   method)

I don't mind if the character container is unsigned short or wchar_t
(it doesn't really matter if wchar_t is 32 bits on some platforms as
it's easy enough to convert to/from where required), but put me down
as another vote for UTF-16 rather than UTF-8.

Given that the point of Unicode is to support I18N, why choose as a default
a format that typically has a 50% size overhead for non-European languages?
Many parsers and applications happily work internally using UTF-16;
why not standardize that as the default SAX character encoding?
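
The 50% figure can be worked out per character. A small sketch, assuming
C++11 for char32_t (the function names are illustrative): a character such
as U+65E5 needs three bytes in UTF-8 but two in UTF-16, whereas plain ASCII
goes the other way, one byte against two.

```cpp
#include <cstddef>

// Bytes needed to encode one Unicode code point in UTF-8.
std::size_t utf8Bytes(char32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

// Bytes needed to encode one Unicode code point in UTF-16.
std::size_t utf16Bytes(char32_t cp)
{
    return cp < 0x10000 ? 2 : 4;   // one 16-bit unit, or a surrogate pair
}
```

So which encoding is smaller depends on the text: markup-heavy documents
with ASCII element names pull towards UTF-8, while long runs of CJK content
pull towards UTF-16.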

Suggestion:
    Do what the Java SAX interface did: optionally provide *both*
    ByteStream and CharacterStream components in an InputSource object

Applications can treat the ByteStream as a stream of bytes whose encoding
can either be auto-detected, or is explicitly indicated by the Encoding.
However, a CharacterStream would always be a sequence of UTF-16 characters.
    
> - SAXException does not allow an embedded exception, because there's
>   no need to tunnel exceptions in C++ (you can always throw any
>   exception)

Unless you use throw() lists (exception specifications) in function
declarations, as the Java spec did. In which case, you need to be able
to embed exceptions...
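
A minimal sketch of what an embedded exception could look like, under some
loud assumptions: it uses std::exception_ptr (a C++11 facility that did not
exist when this thread was written) instead of exception specifications,
and the helper names userCallback, invokeHandler and describeCause are
invented for the illustration. Only SAXException and getException() come
from the discussion.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>
#include <utility>

// A SAX-style exception that can carry the exception that caused it.
class SAXException : public std::exception {
public:
    explicit SAXException(std::string message,
                          std::exception_ptr cause = nullptr)
        : message_(std::move(message)), cause_(cause) {}

    const char* what() const noexcept override { return message_.c_str(); }

    // The embedded exception, or a null pointer if there is none.
    std::exception_ptr getException() const { return cause_; }

private:
    std::string message_;
    std::exception_ptr cause_;
};

// Hypothetical application callback that fails with a non-SAX exception.
void userCallback() { throw std::runtime_error("disk full"); }

// Parser-side wrapper: whatever the callback throws, only a SAXException
// escapes, with the original exception embedded in it.
void invokeHandler() {
    try {
        userCallback();
    } catch (const SAXException&) {
        throw;                                   // already the right type
    } catch (...) {
        throw SAXException("handler failed", std::current_exception());
    }
}

// Recover the message of the embedded exception, if any.
std::string describeCause(const SAXException& e) {
    if (!e.getException()) return "";
    try {
        std::rethrow_exception(e.getException());
    } catch (const std::exception& inner) {
        return inner.what();
    } catch (...) {
        return "unknown";
    }
}
```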

> - DocumentHandler::characters and DocumentHandler::ignorableWhitespace 
>   don't need the 'start' argument, since they can be passed a pointer
>   to the start position in an existing array (that's not possible in
>   Java)

Yup.
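
For illustration, a small sketch of the pointer-based call. The names
TextCollector and reportRange are hypothetical, and std::size_t is used for
the length where the drafts use int.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical handler: collects every chunk of character data it is given.
class TextCollector {
public:
    // SAX/C++ style: a pointer to the first character plus a length,
    // with no separate 'start' index.
    void characters(const char* ch, std::size_t length) {
        text_.append(ch, length);
    }
    const std::string& text() const { return text_; }

private:
    std::string text_;
};

// Parser side: report buffer[start, start + length) without copying,
// by passing a pointer into the existing buffer.
void reportRange(TextCollector& handler, const char* buffer,
                 std::size_t start, std::size_t length) {
    handler.characters(buffer + start, length);
}
```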

> - HandlerBase omitted, since the classes can contain their own default 
>   implementations

I think this has been covered by others; if we define SAX/C++ using
abstract classes, then we need HandlerBase and the Impl classes back
for convenience.
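
A sketch of that arrangement, heavily simplified: the real DocumentHandler
has more callbacks and startElement takes an attribute list, and the drafts
in this thread use a private operator delete where this sketch uses a
virtual destructor. ElementCounter is a made-up client class.

```cpp
#include <cassert>
#include <string>

// Abstract interface: every callback is pure virtual.
class DocumentHandler {
public:
    virtual ~DocumentHandler() {}
    virtual void startDocument() = 0;
    virtual void endDocument() = 0;
    virtual void startElement(const std::string& name) = 0;
    virtual void endElement(const std::string& name) = 0;
};

// Convenience base class: do-nothing defaults for every callback.
class HandlerBase : public DocumentHandler {
public:
    void startDocument() override {}
    void endDocument() override {}
    void startElement(const std::string&) override {}
    void endElement(const std::string&) override {}
};

// A client only overrides the callbacks it cares about.
class ElementCounter : public HandlerBase {
public:
    void startElement(const std::string&) override { ++count_; }
    int count() const { return count_; }

private:
    int count_ = 0;
};
```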

> - I haven't figured out what to do with Parser::setLocale yet

Michael
____________________________________________
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.



From wperry at fiduciary.com  Mon Dec  6 05:58:20 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun  7 17:18:22 2004
Subject: Object-oriented serialization (Was Re: Some questions)
References: <000601bf3f69$16947fd0$3bf96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <384B5077.47DA8C84@fiduciary.com>

Ah, but the nuts and bolts of specific *behaviour* over links will have to be defined,
ultimately as procedural code of some sort, in some generally invocable form. The logic (case,
sequential, conditional) leading to the code which implements that behaviour might well--and
probably ought to be--expressed as XML text, but at some point, from a leaf node of an XML
document expressing a decision tree, it will be necessary to invoke procedural code to
implement a defined behaviour. That procedural code might--and probably ought to
be--parameterized by XML and designed to return XML, but of itself, as procedural code, it is
opaque to the XML which invokes it. That procedural code, implementing a specific behaviour,
is the 'class of resource' which you are looking for (as its opacity to the instance
demonstrates). The behaviour expressed in that procedural code is invoked via a particular
URI, but a unique process is instantiated only in the scope and context of the current
document, and presumably only through parameterization specific to the instance, passed to
generally available code. In fact, anything which might reasonably be described as behaviour
is generally available to XML processing only as a 'class of resource':  that generally
available class must not be confused with its particular instantiation, for an instance
invocation of that behaviour via a particular URI does not impair the availability of that
URI--and of the behaviour it addresses--to other invocations from different contexts.

Rick Jelliffe wrote:

> Yes, except XLinks are specified in instances, not as a schema per se.
> I hope the XML Schema will have some extension mechanism to
> allow these kinds of thing, but who knows.
>
> It is true that sequence and containment relations between elements
> in a content model could be treated as some kind of extended link.

[snip]

> And we cannot use hrefs to the instances because we don't know
> what the instance document URI is: a URI identifies a particular
> resource not a class of resources.  XLinks are not designed for
> use as schema declarations.
>
> So I think there needs to be first-class support for this in the
> schema language itself: in the case of XML Schemas, probably
> the most possible thing would be a role attribute (or some equivalent)
> on groups.




From msf at mds.rmit.edu.au  Mon Dec  6 07:19:04 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: <38472FE3.D3BB22BC@jclark.com>; from James Clark on Fri, Dec 03, 1999 at 09:50:11AM +0700
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com>
Message-ID: <19991206181830.A11576@io.mds.rmit.edu.au>

James Clark wrote:
> David Megginson wrote:
> > 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
> >    with most existing C++ code.
> I would say there was at least as much C++ code using UTF-16 as using UTF-8.
[...]
> There are a couple of possible solutions:
> 
> 1. A lo-tech solution.  Provide a SAXChar typedef [...]
> 
> 2. A hi-tech solution.  [use templates]

3. Use a similar solution to the Java spec: provide both a ByteStream
   and a CharacterStream in InputSource, which has two benefits.

One, it is consistent with the Java interface, which can't be a *bad* thing.

Two, it frees us to define the CharacterStream explicitly as a conduit
for UTF-16 encoded data, whilst allowing parsers/applications the freedom
to use the ByteStream for data that is encoded in whatever format desired.

The encoding can either be auto-detected, or can be explicitly identified
using the InputSource setEncoding()/getEncoding() member function.
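
As a rough sketch of the auto-detection half, here is byte-order-mark
sniffing along the lines of the non-normative appendix on encoding
autodetection in the XML 1.0 Recommendation. The function names
sniffEncoding and effectiveEncoding are made up, UCS-4 and EBCDIC are left
out for brevity, and a real parser would go on to read the encoding
declaration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Guess the encoding of a byte stream from its first few bytes.
std::string sniffEncoding(const unsigned char* p, std::size_t n) {
    // Byte order marks.
    if (n >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF) return "UTF-8";
    if (n >= 2 && p[0] == 0xFE && p[1] == 0xFF) return "UTF-16BE";
    if (n >= 2 && p[0] == 0xFF && p[1] == 0xFE) return "UTF-16LE";
    // No BOM: see how the leading "<?" is laid out in memory.
    if (n >= 4 && p[0] == 0x00 && p[1] == 0x3C && p[2] == 0x00 && p[3] == 0x3F)
        return "UTF-16BE";
    if (n >= 4 && p[0] == 0x3C && p[1] == 0x00 && p[2] == 0x3F && p[3] == 0x00)
        return "UTF-16LE";
    // Default assumed by XML when nothing else is known.
    return "UTF-8";
}

// An explicitly set encoding (as from setEncoding()) wins over sniffing.
std::string effectiveEncoding(const std::string& declared,
                              const unsigned char* p, std::size_t n) {
    return declared.empty() ? sniffEncoding(p, n) : declared;
}
```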

This means going back to the two streams and the getEncoding()/setEncoding()
methods of the original Java spec.

This really seems like a Good Thing; I liked the look of it in the
Java interface; why not use it here also?

> If you feel that one needs to be mandated, I would pick UTF-16.

Agreed.

Michael
____________________________________________
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.



From msf at mds.rmit.edu.au  Mon Dec  6 07:29:26 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <38474BAF.AF4CFF2D@jclark.com>; from James Clark on Fri, Dec 03, 1999 at 11:48:47AM +0700
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <19991206182900.F3543@io.mds.rmit.edu.au>

On Fri, Dec 03, 1999 at 11:48:47AM +0700, James Clark wrote:
> Here's another draft, with this change and a few other minor changes;
[...]
> - solve the UTF-8/UTF-16 problem by having two namespaces:

As I've suggested elsewhere, this can also be (partially) addressed
by providing both a CharacterStream and a ByteStream. That would change
the InputSource definition to:

 class InputSource
 {
 public:
   virtual SAXString getPublicId () const = 0;
   virtual void setPublicId (SAXString publicId) = 0;

   virtual SAXString getSystemId () const = 0;
   virtual void setSystemId (SAXString systemId) = 0;

   virtual SAXString getEncoding () const = 0;
   virtual void setEncoding(SAXString encoding) = 0;
   
   virtual std::istream * getByteStream () const = 0;
   virtual void setByteStream (std::istream * in) = 0;

   // Issue: is wistream the best C++ choice for a Unicode "character" stream,
   // given that sizeof(wchar_t) need not be 2 (e.g., under Sun/Solaris CC)?
   virtual std::wistream * getCharacterStream () const = 0;
   virtual void setCharacterStream (std::wistream * in) = 0;

 private:
   void operator delete (void *);
 };


> Discussion points:
>
> - Would it be better to typedef SAXString to the Standard C++ string
> class (ie std::basic_string<SAXChar>)?

Also:
    The Java definition explicitly indicates what exceptions may be thrown
throughout the interface. Should C++ exception specifiers be used to
mirror those semantics?  If that's the case, we probably also need to add
embedded exceptions back into the SAXException class, a la:

	virtual std::exception& getException() const = 0;
	virtual SAXString toString() const = 0;

General query: should there be heavier use of const and "&" amongst
the various functions' parameter declarations and return values?

Michael
____________________________________________
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.



From sb at metis.no  Mon Dec  6 08:35:08 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++: First interface draft
In-Reply-To: David Megginson's message of "03 Dec 1999 09:16:28 -0500"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <m3r9h4hzyb.fsf@localhost.localdomain>
Message-ID: <whaenoo4b2.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> Actually, I don't see any strong argument not to provide empty inline
> implementations for the handler callbacks:

Inlined virtuals will cause an instantiation of the vtable and the
function bodies in _every_ compilation unit the header file is
included into (ref. Scott Meyers "More Effective C++", Item 24 pp
118).

This is a size cost that can be easily avoided.

I'm also coming more and more to the conclusion that even trivial
non-virtual function bodies should not be inlined, _unless_ there is a 
clear performance reason to do so.

This is because even trivial inlined functions occasionally need to be
changed, and changing something in a header file causes recompilation.
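
A sketch of the alternative, shown as one listing for brevity; the comments
mark which part would go in the header and which in a single .cpp file. The
class is a cut-down DocumentHandler and StartCounter is a made-up client.
With many compilers, defining the virtual functions out of line lets the
vtable be emitted once, in the translation unit that holds those
definitions.

```cpp
#include <cassert>

// ---- would live in the header (declarations only) ----
class DocumentHandler {
public:
    virtual ~DocumentHandler();
    virtual void startDocument();
    virtual void endDocument();
};

// ---- would live in exactly one .cpp file ----
// Empty default implementations, defined out of line, so changing them
// does not touch the header or force clients to recompile.
DocumentHandler::~DocumentHandler() {}
void DocumentHandler::startDocument() {}
void DocumentHandler::endDocument() {}

// ---- client code: overrides only what it needs ----
class StartCounter : public DocumentHandler {
public:
    StartCounter() : starts_(0) {}
    void startDocument() override { ++starts_; }
    int starts() const { return starts_; }

private:
    int starts_;
};
```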



From sb at metis.no  Mon Dec  6 08:46:24 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:22 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: Paul Miller's message of "Sun, 05 Dec 1999 20:07:49 -0500"
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net> <384B0C65.2A6710C0@fxtech.com>
Message-ID: <wh66yco3s5.fsf@viffer.oslo.metis.no>

>>>>> Paul Miller <stele@fxtech.com>:

> I should have been more clear. I just want to use XML for simple
> non-web-bound application data files (document files). I need a
> non-validating parser that I can use to efficiently parse my
> application data, without all the complexity (and overhead) of
> something like DOM, but not as general-purpose as expat.

What I did in a similar situation, was to take James Clark's expat
	http://www.jclark.com/xml/expat.html
and wrap it in a SAXoid interface.  Today I would have done the same
thing with James Clark's modified version of David Megginson's proposal
instead of my own reinterpretation of the Java SAX into C++ (which is
what I now have).

Alternatively you can take a look at Xerces-C from the Apache
consortium: 
	http://xml.apache.org/xerces-c/index.html
It has its own SAX interface.



From ssahuc at imediation.com  Mon Dec  6 09:03:42 1999
From: ssahuc at imediation.com (Sebastien Sahuc)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++ vs. SAX2
Message-ID: <C10B7E3A3AC3D211804E0000B45EDA84404FC5@mail.imediation.com>

> Michael Fuller wrote :
> But having taken a look at SAX2, not much seems to be wrong with it.
> *That* trend needs to be stepped on, and quickly, before it 
> gets out of hand.

I completely agree. And why not focus on SAX2 for both Java and
C++, as someone has already pointed out?

Sebastien




From sb at metis.no  Mon Dec  6 09:06:35 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:22 2004
Subject: SAX/C++: First interface draft
In-Reply-To: Steinar Bang's message of "03 Dec 1999 14:14:54 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whu2m0qi7l.fsf@viffer.oslo.metis.no>
Message-ID: <wh1z90o2ui.fsf@viffer.oslo.metis.no>

>>>>> Steinar Bang <sb@metis.no>:

>>>>> James Clark <jjc@jclark.com>:
>> - Would it be better to typedef SAXString to the Standard C++ string
>> class (ie std::basic_string<SAXChar>)?

> An argument for using 
>         typedef const SAXChar* SAXString;
> is that you get late construction of the basic_string<>, ie. you don't 
> create it until you have to (eg. when using it to do a lookup in an
> STL map<>).

After thinking over the weekend, I'm changing my vote on this issue.
I think the convenience of using basic_string<> way outweighs the
cost advantages of lazy conversion, since in most cases the first
thing that would be done in the DocumentHandler (or wherever) would
be to create a basic_string<> for the appropriate character size
anyway. 

But the UTF-16 string should not be a straight typedef.  We should
derive from basic_string<SAXChar> to get a char* constructor that
would take a UTF-8-encoded string.  This is for ease of use with
character constants.

Hm... we may also need an operator<<() for byte streams, that would do 
UTF-8 encoding...? (fewer implementations have templated streams than
have basic_string<>, and we may want to use a byte stream rather than
a wide stream for I/O anyway.)

That would make the SAX.h file something like this:

#ifndef __SAX_HXX
#define __SAX_HXX

// Forward declarations of std::istream and std::ostream
#include <iosfwd>
// std::string and std::basic_string
#include <string>

namespace SAX_UTF8 {

  typedef char SAXChar;
  typedef std::string SAXString;
#include "SAXDecl.h"

}

namespace SAX_UTF16 {

  typedef unsigned short SAXChar;
  class SAXString : public std::basic_string<SAXChar> {
  public:
    SAXString(const char* utf8);
  };

  std::ostream& operator<<(std::ostream&, const SAXString&);

#include "SAXDecl.h"

}

#endif

..or something...
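
The UTF-8 constructor and the byte-stream operator<<() could be sketched
roughly as below. Assumptions, stated loudly: SAXChar is char16_t here
rather than unsigned short (std::basic_string needs character traits for
its element type, and the standard library provides them for char16_t),
the input is taken to be well-formed UTF-8 with no validation, and the
conversion code is an illustration rather than anything agreed in this
thread.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>   // std::ostream, std::ostringstream
#include <string>

namespace SAX_UTF16 {

typedef char16_t SAXChar;

class SAXString : public std::basic_string<SAXChar> {
public:
    SAXString() {}

    // Decode a NUL-terminated, well-formed UTF-8 string into UTF-16.
    SAXString(const char* utf8) {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(utf8);
        while (*p) {
            char32_t cp;
            int extra;                      // continuation bytes to read
            if (*p < 0x80)      { cp = *p;        extra = 0; }
            else if (*p < 0xE0) { cp = *p & 0x1F; extra = 1; }
            else if (*p < 0xF0) { cp = *p & 0x0F; extra = 2; }
            else                { cp = *p & 0x07; extra = 3; }
            ++p;
            for (; extra > 0 && *p; --extra, ++p)
                cp = (cp << 6) | (*p & 0x3F);
            if (cp < 0x10000) {
                push_back(static_cast<SAXChar>(cp));
            } else {                        // needs a surrogate pair
                cp -= 0x10000;
                push_back(static_cast<SAXChar>(0xD800 + (cp >> 10)));
                push_back(static_cast<SAXChar>(0xDC00 + (cp & 0x3FF)));
            }
        }
    }
};

// Write a SAXString to a byte stream as UTF-8.
inline std::ostream& operator<<(std::ostream& out, const SAXString& s) {
    for (std::size_t i = 0; i < s.size(); ++i) {
        char32_t cp = s[i];
        // Combine a high surrogate with the low surrogate that follows it.
        if (cp >= 0xD800 && cp < 0xDC00 && i + 1 < s.size() &&
            s[i + 1] >= 0xDC00 && s[i + 1] < 0xE000) {
            cp = 0x10000 + ((cp - 0xD800) << 10) + (s[i + 1] - 0xDC00);
            ++i;
        }
        if (cp < 0x80) {
            out.put(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.put(static_cast<char>(0xC0 | (cp >> 6)));
            out.put(static_cast<char>(0x80 | (cp & 0x3F)));
        } else if (cp < 0x10000) {
            out.put(static_cast<char>(0xE0 | (cp >> 12)));
            out.put(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.put(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.put(static_cast<char>(0xF0 | (cp >> 18)));
            out.put(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
            out.put(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.put(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

}  // namespace SAX_UTF16
```

With this in place, a character constant such as "caf\xC3\xA9" converts
implicitly wherever a SAXString is expected, which is the ease of use
argued for above.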



From sb at metis.no  Mon Dec  6 09:09:46 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <whwvqsmo4v.fsf@viffer.oslo.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> - solve the UTF-8/UTF-16 problem by having two namespaces: a SAX_UTF8
> and a SAX_UTF16 namespace (since you're using std::istream, you are
> assuming compiler support for namespaces); this will work nicely with
> namespace aliases (eg namespace SAX = SAX_UTF8).

I have a practical problem with using std::istream on the MSVC++
platform.  Since the Standard C++ Library as delivered with MSVC++ 5
and 6 is broken, we're using Standards<ToolKit> from ObjectSpace to
provide us with the parts of the Standard C++ Library we're using.

And ObjectSpace Standards<ToolKit> is not compatible with the Standard 
C++ Library iostreams of MSVC++.



From msf at mds.rmit.edu.au  Mon Dec  6 09:14:21 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX2/C++ interface draft [Was: Re: Request for Discussion: SAX 1.0 in C++]
In-Reply-To: <006501bf3dca$a4767ce0$4a5eedc1@arp01>; from Richard Anderson on Fri, Dec 03, 1999 at 08:11:53PM -0000
References: <14406.58446.675568.388482@localhost.localdomain><199912031808.LAA14313@localhost.localdomain> <14408.2087.817611.250771@localhost.localdomain> <006501bf3dca$a4767ce0$4a5eedc1@arp01>
Message-ID: <19991206201359.G3543@io.mds.rmit.edu.au>

On Fri, Dec 03, 1999 at 08:11:53PM -0000, Richard Anderson wrote:
> How about focusing on SAX/2, and making the first C/C++ SAX interface
> actually SAX 2 so we kill two birds with one stone ?

Good idea!

Here's a [hasty] conversion of the SAX2 Java classes into C++, 
along the lines of David and James' posted SAX 1.0/C++ headers.
I've used exception specifiers for consistency with the original Java.

// SAX2Decl.h

namespace SAX
{
    class Configurable;
    class DTDHandler;
    class DocumentHandler;
    class EntityResolver;
    class ErrorHandler;
    class InputSource;
    class Parser;
    class SAXException;

    class Configurable
    {
    public:
	virtual void setFeature(SAXString featureId, bool state)
					    throw(SAXException) = 0;
	virtual bool getFeature(SAXString featureId)
					    throw(SAXException) = 0;
	virtual void setProperty(SAXString propertyId, void * value)
					    throw(SAXException) = 0;
	virtual void * getProperty(SAXString propertyId)
					    throw(SAXException) = 0;
    private:
	void operator delete (void *);
    };

    class ConfigurableParserAdapter : public Parser, public Configurable
    {
    public:
	ConfigurableParserAdapter(Parser& parser);

	// SAX 1.0 methods
	void setLocale(const char * locale)		throw(SAXException);
	void setEntityResolver(EntityResolver& resolver);
	void setDTDHandler(DTDHandler& handler);
	void setDocumentHandler(DocumentHandler& handler);
	void setErrorHandler(ErrorHandler& handler);
	void parse(const InputSource& source)		throw(SAXException);
	void parse(SAXString systemId)			throw(SAXException);
    
	// SAX2 methods
	void setFeature(SAXString featureId, bool state) throw(SAXException);
	bool getFeature(SAXString featureId)		throw(SAXException);
	void setProperty(SAXString propertyId, void * value)
							throw(SAXException);
	void * getProperty(SAXString propertyId)	throw(SAXException);
    private:
	void operator delete (void *);
    };

    class DeclHandler
    {
    public:
	virtual void elementDecl(SAXString name, SAXString model)
						throw(SAXException) = 0;

	virtual void attributeDecl(SAXString eName,
				   SAXString aName,
				   SAXString type,
				   SAXString valueDefault,
				   SAXString value)
						throw(SAXException) = 0;

	virtual void internalEntityDecl(SAXString name, SAXString value)
						throw(SAXException) = 0;

	virtual void externalEntityDecl(SAXString name,
					SAXString publicId,
					SAXString systemId)
						throw(SAXException) = 0;
    private:
	void operator delete (void *);
    };

    class LexicalHandler
    {
    public:
	virtual void startDTD(SAXString name,
			      SAXString publicId,
			      SAXString systemId)
					throw(SAXException) = 0;
				    
	virtual void endDTD()		throw(SAXException) = 0;

	virtual void startEntity(SAXString name)
					throw(SAXException) = 0;
	virtual void endEntity(SAXString name)
					throw(SAXException) = 0;

	virtual void startCDATA()	throw(SAXException) = 0;
	virtual void endCDATA()		throw(SAXException) = 0;

	virtual void comment(const SAXChar * ch, int length)
					throw(SAXException) = 0;

    private:
	void operator delete (void *);
    };

    class NamespaceHandler
    {
    public:
	virtual void startNamespaceDeclScope(SAXString prefix, SAXString uri)
					throw(SAXException) = 0;
	virtual void endNamespaceDeclScope(SAXString prefix)
					throw(SAXException) = 0;
    private:
	void operator delete (void *);
    };

    class SAXNotRecognizedException : public SAXException
    {
    public:
	SAXNotRecognizedException(SAXString message);
    private:
	void operator delete (void *);
    };

    class SAXNotSupportedException : public SAXException
    {
    public:
	SAXNotSupportedException(SAXString message);
    private:
	void operator delete (void *);
    };
}
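
As a rough idea of how the Configurable feature calls might behave in an
implementation, here is a small self-contained sketch. It makes several
assumptions: std::string stands in for SAXString, the exception classes are
reduced to bare minimums, FeatureTable and declareFeature are invented
names, and no exception specifiers are used. An unknown feature id raises
SAXNotRecognizedException.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Minimal stand-ins for the SAX exception classes.
struct SAXException : std::runtime_error {
    explicit SAXException(const std::string& m) : std::runtime_error(m) {}
};
struct SAXNotRecognizedException : SAXException {
    explicit SAXNotRecognizedException(const std::string& m)
        : SAXException(m) {}
};

// Hypothetical helper a parser could use to back setFeature/getFeature.
class FeatureTable {
public:
    // Register a feature id the parser knows about, with its initial state.
    void declareFeature(const std::string& id, bool initial) {
        features_[id] = initial;
    }
    void setFeature(const std::string& id, bool state) { lookup(id) = state; }
    bool getFeature(const std::string& id) { return lookup(id); }

private:
    // Find a declared feature or report that the id is not recognized.
    bool& lookup(const std::string& id) {
        std::map<std::string, bool>::iterator it = features_.find(id);
        if (it == features_.end())
            throw SAXNotRecognizedException("unrecognized feature: " + id);
        return it->second;
    }
    std::map<std::string, bool> features_;
};
```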


Michael
-- 
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.



From larsga at garshol.priv.no  Mon Dec  6 09:25:51 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <m3yab8qv35.fsf@ifi.uio.no>


* Walter Underwood
|
| Comments are welcome.

First thought: this is fine for very simple uses, but for more complex
uses something along the lines of the robots.txt file would be very
nice. How about a variant PI that can point to a robots.rdf resource?


Second thought: "and the index attribute must be first". This is nice
for implementors, but is likely to clash with the expectations of
users and the cost of more generality is very low for implementors.

Why not follow the <URL: http://www.w3.org/TR/xml-stylesheet/ > style
of specifying PI pseudo-attributes?
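
To show how little the extra generality costs, here is a sketch of an
order-independent pseudo-attribute parser. It is an illustration only: the
function name parsePseudoAttributes is made up, the sample values are
guesses at what the robots PI data might look like, and character
references and predefined entities (which the xml-stylesheet style permits
in values) are not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Advance i past any whitespace in data.
static void skipSpace(const std::string& data, std::size_t& i) {
    while (i < data.size() &&
           std::isspace(static_cast<unsigned char>(data[i])))
        ++i;
}

// Parse PI data of the form: name="value" name='value' ...
// Attributes may appear in any order. Parsing stops at the first
// malformed pseudo-attribute and returns what was read so far.
std::map<std::string, std::string>
parsePseudoAttributes(const std::string& data) {
    std::map<std::string, std::string> attrs;
    std::size_t i = 0;
    const std::size_t n = data.size();
    while (true) {
        skipSpace(data, i);
        if (i >= n) break;
        const std::size_t nameStart = i;
        while (i < n && data[i] != '=' &&
               !std::isspace(static_cast<unsigned char>(data[i])))
            ++i;
        const std::string name = data.substr(nameStart, i - nameStart);
        skipSpace(data, i);
        if (i >= n || data[i] != '=') break;        // missing '='
        ++i;
        skipSpace(data, i);
        if (i >= n || (data[i] != '"' && data[i] != '\'')) break;
        const char quote = data[i++];
        const std::size_t valueStart = i;
        while (i < n && data[i] != quote) ++i;
        if (i >= n) break;                          // unterminated value
        attrs[name] = data.substr(valueStart, i - valueStart);
        ++i;                                        // skip closing quote
    }
    return attrs;
}
```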


Also: The robot PI, says the spec, "should be in the internal subset
(not in an external DTD or parameter entity). Since robots may be
non-validating, a robots PI in the external subset might not be seen
by the robot."

I think this is misleading, since "the internal subset" is usually
short for "the internal DTD subset". A better way of putting it might
be "It should be in the document entity (not in an external entity,
including the external DTD subset and external parameter entities).
Since robots may skip external entities, PIs in external entities
might not be seen by the robot."

However, I don't think this will do either. Entities are what the
storage structure of SGML/XML documents is composed of, and I think
this spec needs to take some sort of stand as to how entities map to
WWW resources, and which entities the PI is really talking about.

One way is to say that every resource is an entity, and every
web-accessible entity is a resource. Then one might say that the
robots PI refers to

 a) the entity in which it is found

 b) the entity in which it is found and all entities included by this
 entity via entity references, regardless of any robots PIs in these
 included entities

 c) the entity in which it is found, and if "follow" is set to yes,
 all entities included by this entity via entity references,
 regardless of any robots PIs in these included entities

 d) the entity in which it is found, and if "sub-entities" is set to
 yes, all entities included by this entity via entity references,
 regardless of any robots PIs in these included entities

Once one agrees on a policy I think this is worth a subsection in the
spec, regardless of the choice made. b) is probably the easiest to
implement, since many APIs do not expose entity structure. It might
not be the best choice, though.

--Lars M.




From larsga at garshol.priv.no  Mon Dec  6 09:31:29 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: A processing instruction for robots
In-Reply-To: <001201bf3e1f$074601c0$099918d1@docuverse1>
References: <001201bf3e1f$074601c0$099918d1@docuverse1>
Message-ID: <m3wvqsqutu.fsf@ifi.uio.no>


* Don Park
| 
| Walter,
| Could you elaborate your decision to use PI rather than element(s)?

I'm not Walter, but to me this has the obvious advantage that it can
be used completely orthogonally to the document contents and the
software used to process the document for non-indexing purposes.

Of course, it works poorly with SML, and IMHO this (and the
"Associating stylesheets with XML documents" recommendation) are good
arguments for including PIs in SML, even if only before the document
element. 

No doubt there will be other proposals of this sort, and if these are
all specified in terms of elements, writing application-specific
processing software will be hell unless we either start using
architectures or mandate the use of namespaces in processing. And even
then it might still be hell for various reasons, especially the
namespace solution.

So IMHO PIs are the right choice for this.

--Lars M.




From larsga at garshol.priv.no  Mon Dec  6 09:33:50 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <008301bf3d41$de0c9b80$a82a08d1@tomshp>
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca> <14407.1389.659881.147338@localhost.localdomain> <008301bf3d41$de0c9b80$a82a08d1@tomshp>
Message-ID: <m3vh6cqupx.fsf@ifi.uio.no>


* David Megginson
| 
| Sure -- is there a strong need for a common C interface, though?  We
| already have Expat's C interface, and I don't know of anyone else in
| that space yet.

* Thomas B. Passin
|
| But C is available on most _any_ platform - often for free.  So
| almost anyone could compile in C but not necessarily in C++.  Isn't
| rxp done in C?

It is, and I think libxml is also.

--Lars M.




From larsga at garshol.priv.no  Mon Dec  6 09:36:46 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <001301bf3ec0$d47d7d20$5afbb1cd@tomshp>
References: <199912042112.OAA19167@localhost.localdomain> <001301bf3ec0$d47d7d20$5afbb1cd@tomshp>
Message-ID: <m3u2lwqukx.fsf@ifi.uio.no>


* Thomas B. Passin
| 
| I'd second this, with the thought that most of the effort would
| concentrate on finishing the SAX2 interface itself before spending
| the potential "months" on the C++/SAX implementation.

I agree with this as well. If SAX2 is as close to finished as I think
it is, it really should be finished off now, to be followed by a C++
translation of SAX 2.0.

--Lars M.




From larsga at garshol.priv.no  Mon Dec  6 09:43:53 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <14408.2610.245842.199581@localhost.localdomain>
References: <14408.2610.245842.199581@localhost.localdomain>
Message-ID: <m3so1gqu92.fsf@ifi.uio.no>


* David Megginson
|
| I'd like to hear what others think on this issue.

My feeling is that SAX2 is very important (namespaces, generalized
querying and extensibility and lexical information are all very
important to some kinds of applications), and furthermore so close to
completion that it should go before a C++ binding, which might take a
long time to complete.

The added benefit is of course that the C++ bindings for SAX 1.0 and
2.0 can be done simultaneously.

--Lars M.

 


From nisse at lysator.liu.se  Mon Dec  6 09:54:38 1999
From: nisse at lysator.liu.se (Niels Möller)
Date: Mon Jun  7 17:18:23 2004
Subject: parser asynch input (Was: SAX/C++: First interface draft)
In-Reply-To: Steinar Bang's message of "03 Dec 1999 14:20:35 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whpuwoqhy4.fsf_-_@viffer.oslo.metis.no>
Message-ID: <nnyab8cs38.fsf@sanna.lysator.liu.se>

Steinar Bang <sb@metis.no> writes:

> I would like to add operations that can be used to "push" data to the
> parser asynchronously:

I also think this is important (and it's sadly missing from IBM's
xml4c parser, which provides another SAX-like C++ API).

But is there any reason not to use the same InputSource abstraction
for the fragment blocks? Say, something like

  class Parser
  {
  public:
    virtual void setLocale (const char *) = 0;
    virtual void setEntityResolver (EntityResolver &resolver) = 0;
    virtual void setDTDHandler (DTDHandler &handler) = 0;
    virtual void setDocumentHandler (DocumentHandler &handler) = 0;
    virtual void setErrorHandler (ErrorHandler &handler) = 0;
  
    virtual void parse (SAXString systemId) = 0;
    virtual void parse (const InputSource &input) = 0;
!   virtual void parseFragment (const InputSource &input) = 0;
!   virtual void parseEnd() = 0;
  private:
    void operator delete (void *);
  };

The idea is that a document is the catenation of one or more fragments
(e.g. blocks of data that are read from a socket).

parse(source) would be equivalent to parseFragment(source);
parseEnd().
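
A minimal sketch of the fragment idea, under loud assumptions: blocks are
plain std::string values rather than InputSource objects, and the
FragmentParser and StartTagCounter names are made up for illustration. The
toy parser only counts start tags (it ignores comments and CDATA), which is
enough to show that parser state has to survive a fragment boundary and that
parse() can be defined as parseFragment() followed by parseEnd().

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for the proposed interface. Fragments are plain
// std::string blocks here (an assumption), not InputSource objects.
class FragmentParser {
public:
    virtual ~FragmentParser() {}
    virtual void parseFragment(const std::string& block) = 0;
    virtual void parseEnd() = 0;

    // Whole-document parse: one fragment followed by the end mark.
    void parse(const std::string& document) {
        parseFragment(document);
        parseEnd();
    }
};

// Toy parser that counts start tags. It ignores comments, CDATA sections
// and well-formedness; it only shows state carried across fragments.
class StartTagCounter : public FragmentParser {
public:
    StartTagCounter() : count_(0), sawLessThan_(false), ended_(false) {}

    void parseFragment(const std::string& block) override {
        assert(!ended_);  // no fragment may follow parseEnd()
        for (std::string::size_type i = 0; i < block.size(); ++i) {
            const char c = block[i];
            if (sawLessThan_) {
                // '<' followed by '/', '!' or '?' is not a start tag.
                if (c != '/' && c != '!' && c != '?') ++count_;
                sawLessThan_ = false;
            } else if (c == '<') {
                sawLessThan_ = true;  // may be the last char of this block
            }
        }
    }

    void parseEnd() override { ended_ = true; }

    int count() const { return count_; }
    bool ended() const { return ended_; }

private:
    int count_;
    bool sawLessThan_;
    bool ended_;
};
```

Feeding "<a><" and then "b/></a>" gives the same count as a single parse()
call on "<a><b/></a>", because the pending '<' is remembered between calls.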

/Niels



From vilya at nag.co.uk  Mon Dec  6 10:12:25 1999
From: vilya at nag.co.uk (Vilya Harvey)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no>
Message-ID: <384B8C32.22805488@nag.co.uk>

Sebastien Sahuc wrote:
> 
> > Michael Fuller wrote :
> > But having taken a look at SAX2, not much seems to be wrong with it.
> > *That* trend needs to be stepped on, and quickly, before it
> > gets out of hand.
> 
> Completely agree with it. And why not focus on SAX2 for both Java and
> C++ as someone has already pointed out ?

Just a thought: why not take a leaf out of the DOM's book and write the
canonical version of the SAX interfaces in a language-neutral format like
IDL? That way, bindings to a number of languages (including, but not
limited to, C++ and Java) can be trivially derived by using the
appropriate IDL-to-whatever converter.

Vil.
-- 
Vilya Harvey  <vilya@nag.co.uk>    Wilkinson House  Mob: +44  961 106 505
Computational Mathematics Group   Jordan Hill Road   Wk: +44 1865 511 245
NAG Limited                    Oxford  UK  OX2 8DR  Fax: +44 1865 311 205



From sb at metis.no  Mon Dec  6 10:25:44 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:23 2004
Subject: parser asynch input (Was: SAX/C++: First interface draft)
In-Reply-To: nisse@lysator.liu.se's message of "06 Dec 1999 10:54:19 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whpuwoqhy4.fsf_-_@viffer.oslo.metis.no> <nnyab8cs38.fsf@sanna.lysator.liu.se>
Message-ID: <wh7lismkmb.fsf@viffer.oslo.metis.no>

>>>>> nisse@lysator.liu.se (Niels Möller):

> Steinar Bang <sb@metis.no> writes:
>> I would like to add operations that can be used to "push" data to the
>> parser asynchronously:

> I also think this is important (and it's sadly missing from IBM's
> xml4c parser, which provides another SAX-like C++ API).

> But is there any reason not to use the same InputSource abstraction
> for the fragment blocks? Say, something like

>   class Parser
>   {
>   public:
[snip!]
> !   virtual void parseFragment (const InputSource &input) = 0;
> !   virtual void parseEnd() = 0;
>   private:
>     void operator delete (void *);
>   };

Hm... I think for me at least, this will cause an extra copy of the
fragment before parsing.

If I have a buffer, and wrap an strstream around it, I would still need
to read the entire fragment from the istream into another buffer
before feeding it to expat.

Or would it be more efficient to do a loop on the stream and put the
buffer's contents char by char into expat...?

Being able to put a buffer directly into the parser is the most
efficient way of doing things, from the way we currently handle
different file formats in our application.  We have a map from MIME
types to pointers to instances of a class called NetStreamFactory:

class NetStreamFactory {
  public:
    virtual ~NetStreamFactory();
    virtual NetStream* newStream(const Url* url = 0) = 0;
};

not surprisingly, these factories are used to create instances of
subclasses of NetStream (subclasses handling XML, and our old file
format, as well as decoding image formats like PNG and JPEG):

class NetStream {
  public:
    virtual ~NetStream();
    virtual void setReadOnly(bool readOnly = true);
    virtual void putBlock(const char* buf, unsigned long len,
                          bool entireFile = false) = 0;
    virtual void eof();
};

(The idea with the "entireFile" argument to putBlock is that I can avoid
buffering the data for the NetStream classes that need the entire file
(our old format, which uses a recursive descent parser, and our current
JPEG decoder) when I'm reading the file from the local file system.
Also, for data arriving on the net, I'm delivering the buffer read from
the network as is to the XML parser, without doing an extra copy.)
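
To make the "entireFile" point concrete, here is a rough sketch and not the
actual application code. NetStream is reduced to the two calls that matter
(setReadOnly() is left out, and the destructor and eof() get empty bodies so
the example compiles on its own). WholeFileStream and its decode() placeholder
are hypothetical.

```cpp
#include <cassert>
#include <string>

// Reduced copy of the NetStream interface shown above.
class NetStream {
public:
    virtual ~NetStream() {}
    virtual void putBlock(const char* buf, unsigned long len,
                          bool entireFile = false) = 0;
    virtual void eof() {}
};

// Hypothetical subclass for a decoder that needs the whole file at once.
class WholeFileStream : public NetStream {
public:
    WholeFileStream() : copiedBytes_(0), decodedBytes_(0) {}

    void putBlock(const char* buf, unsigned long len,
                  bool entireFile = false) override {
        assert(buf != 0 || len == 0);
        if (entireFile) {
            decode(buf, len);        // local file: use the caller's buffer
            return;
        }
        pending_.append(buf, len);   // network blocks: buffer them (one copy)
        copiedBytes_ += len;
    }

    void eof() override {
        if (!pending_.empty()) {
            decode(pending_.data(),
                   static_cast<unsigned long>(pending_.size()));
            pending_.clear();
        }
    }

    unsigned long copiedBytes() const { return copiedBytes_; }
    unsigned long decodedBytes() const { return decodedBytes_; }

private:
    // Placeholder for a real decoder; it only records how much it was given.
    void decode(const char*, unsigned long len) { decodedBytes_ += len; }

    std::string pending_;
    unsigned long copiedBytes_;
    unsigned long decodedBytes_;
};
```

With entireFile set, nothing is copied; without it, every block is copied
once into the pending buffer and decoded at eof(). A push-capable XML stream
would skip the buffer in both cases and hand each block straight to the
parser.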



From john.aldridge at informatix.co.uk  Mon Dec  6 11:30:43 1999
From: john.aldridge at informatix.co.uk (John Aldridge)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <wh1z90o2ui.fsf@viffer.oslo.metis.no>
References: <Steinar Bang's message of "03 Dec 1999 14:14:54 +0100">
 <14406.59198.949047.2487@localhost.localdomain>
 <38474BAF.AF4CFF2D@jclark.com>
 <whu2m0qi7l.fsf@viffer.oslo.metis.no>
Message-ID: <3.0.6.32.19991206113002.00ad23f0@mailhost>

At 10:06 06/12/99 +0100, Steinar Bang <sb@metis.no> wrote:
>After thinking over the weekend, I'm changing my vote on this issue.
>I think the convenience of using basic_string<> way outweighs the
>cost advantages of lazy conversion,

Agreed.

>But the UTF-16 string should not be a straight typedef.  We should
>derive from basic_string<SAXChar> to get a char* constructor that
>would take a UTF-8-encoded string.  This is for ease of use with
>character constants.

I disagree with this, though.  It's not that much of a hardship to add an
"L" before your character constants; certainly not enough to warrant
subclassing a class without a virtual dtor (and duplicating all the
constructors and all the functions which return a string as a function
result).

I think I'd go for using straight std::wstring, and define the characters
in those wstrings to be UTF-16 encoded (whatever the current C locale
says).  Leave it up to the application either to set a locale in which
wchar_t is UTF-16 (in which case the RTL functions will behave sensibly),
or not (in which case the application will have to hand crank some things).
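
As an illustration of what "hand cranking" could look like, here is a small
sketch that stores UTF-16 code units in a std::wstring regardless of how wide
wchar_t is. The function names appendUtf16 and countCodePoints are invented
for this example and are not part of any SAX draft.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Appends one Unicode code point to a wide string as UTF-16 code units,
// whatever the width of wchar_t on this platform.
void appendUtf16(std::wstring& out, unsigned long codePoint) {
    assert(codePoint <= 0x10FFFFul);                       // Unicode range
    assert(codePoint < 0xD800ul || codePoint > 0xDFFFul);  // not a surrogate
    if (codePoint < 0x10000ul) {
        out += static_cast<wchar_t>(codePoint);
    } else {
        const unsigned long v = codePoint - 0x10000ul;
        out += static_cast<wchar_t>(0xD800ul + (v >> 10));      // high surrogate
        out += static_cast<wchar_t>(0xDC00ul + (v & 0x3FFul));  // low surrogate
    }
}

// Counts code points in a UTF-16 encoded wide string by skipping low
// surrogates, so each surrogate pair is counted once.
std::size_t countCodePoints(const std::wstring& s) {
    std::size_t n = 0;
    for (std::wstring::size_type i = 0; i < s.size(); ++i) {
        const unsigned long u = static_cast<unsigned long>(s[i]);
        if (u < 0xDC00ul || u > 0xDFFFul) ++n;
    }
    return n;
}
```

Character constants stay simple (L"clef: "), but size() reports code units,
not characters, so an application that cares about the difference has to do
this kind of counting itself.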
-- 
Cheers,
John



From david at megginson.com  Mon Dec  6 11:40:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: <19991206181830.A11576@io.mds.rmit.edu.au>
References: <14406.58740.871829.541816@localhost.localdomain>
	<38472FE3.D3BB22BC@jclark.com>
	<19991206181830.A11576@io.mds.rmit.edu.au>
Message-ID: <14411.41098.673804.538770@localhost.localdomain>

Michael Fuller writes:
 > James Clark wrote:
 > > David Megginson wrote:
 > > > 4. Hold my nose and use UTF-8 rather than UTF-16, for compatibility
 > > >    with most existing C++ code.
 > > I would say there was at least as much C++ code using UTF-16 as using UTF-8.
 > [...]
 > > There are a couple of possible solutions:
 > > 
 > > 1. A lo-tech solution.  Provide a SAXChar typedef [...]
 > > 
 > > 2. A hi-tech solution.  [use templates]
 > 
 > 3. Use a similar solution to the Java spec: provide both a ByteStream
 >    and a CharacterStream in InputSource, which has two benefits.

Unfortunately, the problem is not input from the document but output
to the client.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tpassin at idsonline.com  Mon Dec  6 13:09:39 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:23 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <9912051700340G.00844@quadra.teleo.net> <384B0C65.2A6710C0@fxtech.com>
Message-ID: <001f01bf3feb$b919bf40$a62a08d1@tomshp>


Original :
From: Paul Miller <stele@fxtech.com>
> ...
> I should have been more clear. I just want to use XML for simple
> non-web-bound application data files (document files). I need a
> non-validating parser that I can use to efficiently parse my application
> data, without all the complexity (and overhead) of something like DOM,
> but not as general-purpose as expat.
> ...
I had a task to convert simple spreadsheet data - each row was complete in
itself - to XML.  I saved the spreadsheet as a tab-delimited file, converted
it to XML using awk, and did the simple processing I needed to do using
regular expressions in Python. It was quick and easy.
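
The conversion step is small enough to sketch. The version below is written
in C++ rather than awk, to match the rest of the thread, and it is not the
script that was actually used. The element names "row" and "cell" are
assumptions made up for the example.

```cpp
#include <cassert>
#include <string>

// Escapes the characters that must not appear literally in element content.
std::string escapeXml(const std::string& s) {
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        switch (s[i]) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;";  break;
            case '>': out += "&gt;";  break;
            default:  out += s[i];
        }
    }
    return out;
}

// Turns one tab-delimited line into a <row> of <cell> elements.
std::string rowToXml(const std::string& line) {
    assert(line.find('\n') == std::string::npos);  // one row per call
    std::string out = "<row>";
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type tab = line.find('\t', start);
        const std::string field =
            line.substr(start, tab == std::string::npos
                                   ? std::string::npos
                                   : tab - start);
        out += "<cell>" + escapeXml(field) + "</cell>";
        if (tab == std::string::npos) break;
        start = tab + 1;
    }
    out += "</row>";
    return out;
}
```

Each row is self-contained, so the rows can be written one after another
inside whatever root element the application prefers.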

Tom Passin




From larsga at garshol.priv.no  Mon Dec  6 13:33:29 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
In-Reply-To: <384B8C32.22805488@nag.co.uk>
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no> <384B8C32.22805488@nag.co.uk>
Message-ID: <m31z90p554.fsf@ifi.uio.no>


* Vilya Harvey
| 
| Just a thought: why not take a leaf out of the DOM's book and write
| the canonical version of the SAX interfaces in a language-neutral
| format like IDL? 

This may sound like a good idea, but it has its drawbacks in that one
is immediately forced into a lowest common denominator design where it
is impossible to make use of the features that really make each
language what it is.

Also, IDL does not have convenient ways of mapping to C++ streams,
Java InputStream, Python dictionary-like objects and file-like objects,
etc.

Another problem is that exceptions are first-class objects in SAX
(which is exploited by the Java and Python mappings), but not in IDL.

Nor are language naming conventions respected. (startElement should
really be startElement (in Java), start_element (in C++, Python, IDL)
and start-element (in Common Lisp/Scheme), and there may even be more
variations.)

As a general reference and statement of intent it might have some
value, but I really think translation should be done by humans. The
main advantage of IDL, cross-process and cross-language
interoperability, is not really all that valuable for SAX anyway.

--Lars M.




From david at megginson.com  Mon Dec  6 13:41:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: Steinar Bang's message of "06 Dec 1999 09:34:57 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <m3r9h4hzyb.fsf@localhost.localdomain> <whaenoo4b2.fsf@viffer.oslo.metis.no>
Message-ID: <m33dtgp4qa.fsf@localhost.localdomain>

Steinar Bang <sb@metis.no> writes:

> >>>>> David Megginson <david@megginson.com>:
> 
> > Actually, I don't see any strong argument not to provide empty inline
> > implementations for the handler callbacks:
> 
> Inlined virtuals will cause an instantiation of the vtable and the
> function bodies in _every_ compilation unit the header file is
> included into (ref. Scott Meyers "More Effective C++", Item 24 pp
> 118).
> 
> This is a size cost that can be easily avoided.

Thanks -- this is the kind of information I was looking for.
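
For reference, a sketch of the out-of-line alternative. The handler below is
a cut-down, hypothetical DocumentHandler with std::string names (not the
draft interface): the empty defaults are declared in the header and defined
once in a source file. On many toolchains this lets the compiler emit the
vtable in that one translation unit only, though the exact behaviour is
compiler-dependent.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// --- would live in the header ---
class DocumentHandler {
public:
    virtual ~DocumentHandler();
    virtual void startElement(const std::string& name);
    virtual void endElement(const std::string& name);
    virtual void characters(const char* ch, std::size_t length);
};

// --- would live in exactly one .cpp file ---
DocumentHandler::~DocumentHandler() {}
void DocumentHandler::startElement(const std::string&) {}
void DocumentHandler::endElement(const std::string&) {}
void DocumentHandler::characters(const char*, std::size_t) {}

// Client code still overrides only the callbacks it cares about.
class DepthHandler : public DocumentHandler {
public:
    DepthHandler() : depth(0), maxDepth(0) {}

    void startElement(const std::string&) override {
        if (++depth > maxDepth) maxDepth = depth;
    }

    void endElement(const std::string&) override {
        assert(depth > 0);  // the parser reports properly nested elements
        --depth;
    }

    int depth;
    int maxDepth;
};
```

The convenience of empty defaults is kept; only their definitions move out of
the header.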


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec  6 13:43:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: Steinar Bang's message of "06 Dec 1999 10:09:36 +0100"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whwvqsmo4v.fsf@viffer.oslo.metis.no>
Message-ID: <m3zovonq1i.fsf@localhost.localdomain>

Steinar Bang <sb@metis.no> writes:

> And Objectspace Standards<ToolKit> is not compatible with the Standard 
> C++ Library iostreams of MSVC++.

I think that this goes beyond the scope of SAX -- we have to be able
to assume at least a basic level of ANSI-C++ conformance, or else
we'll end up rewriting the whole standard library.  I'm willing not to 
beat up on the hairier features (like templates), but we have to be
able to count on the basics.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec  6 13:51:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:23 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: Paul Miller's message of "Sun, 05 Dec 1999 19:35:38 -0500"
References: <384B04DA.DCD6BAED@fxtech.com>
Message-ID: <m3wvqsnppb.fsf@localhost.localdomain>

Paul Miller <stele@fxtech.com> writes:

> I've seen a lot of discussion about DOM, SAX, RDF, etc. but none of the
> solutions I've seen are very simple or straightforward for generic
> application data I/O (ie. non web, e-commerce, Java-type stuff). In
> other words, I'm about to roll my own, and would like to gauge interest
> in a small callback-based API for simple XML I/O.

We tried to keep SAX 1.0 as simple as possible -- how would you
simplify the following further?

  public static void main (String[] args)
  {
    Parser parser = new SomeSAXDriver();
    parser.setDocumentHandler(new MyHandler());
    try {
      parser.parse("http://www.foo.com/foo.xml");
    } catch (SAXException e) {
      // do something!!
    } catch (java.io.IOException e) {
      // do something!!
    }
  }

and

  public class MyHandler extends HandlerBase
  {
    public void startElement (String name, AttributeList atts)
    {
      // do something!!
    }

    public void endElement (String name)
    {
      // do something!!
    }

    public void characters (char ch[], int start, int length)
    {
      // do something!!
    }
  }


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From larsga at garshol.priv.no  Mon Dec  6 13:59:49 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: <whyabcqijv.fsf@viffer.oslo.metis.no>
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197> <3847B8F3.D81B8286@jclark.com> <whyabcqijv.fsf@viffer.oslo.metis.no>
Message-ID: <m3emd0npa6.fsf@ifi.uio.no>


* James Clark
| 
| Unfortunately wchar_t isn't guaranteed to be UTF-16.  Some platforms
| make it 32-bits.

* Steinar Bang
| 
| Yep!  So I've heard.
| 
| Do you have a list of the ones that do this?

gcc 2.95 on Linux does, at least. I don't know what it does on other
platforms. 
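
A quick way to check a given compiler, plus one possible way around the
problem. This is only a sketch: it uses char16_t, which comes from C++11 and
so is much newer than this thread, and the SAXChar/SAXString typedefs and the
toWide helper are assumptions for illustration, not an agreed binding.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Width of wchar_t in bits on the compiling platform.
inline unsigned wcharBits() {
    return static_cast<unsigned>(sizeof(wchar_t) * CHAR_BIT);
}

// Hypothetical SAXChar: a code unit type that does not depend on wchar_t.
typedef char16_t SAXChar;
typedef std::basic_string<SAXChar> SAXString;

static_assert(sizeof(SAXChar) * CHAR_BIT >= 16,
              "SAXChar must be able to hold a UTF-16 code unit");

// Widens a SAXString to std::wstring one code unit at a time. No decoding
// is done, so surrogate pairs stay as two separate wchar_t values.
std::wstring toWide(const SAXString& in) {
    assert(wcharBits() >= 16);
    std::wstring out;
    out.reserve(in.size());
    for (SAXString::size_type i = 0; i < in.size(); ++i)
        out += static_cast<wchar_t>(in[i]);
    return out;
}
```

Printing wcharBits() on each target platform would give the list asked for
above.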
 
--Lars M.




From larsga at garshol.priv.no  Mon Dec  6 14:06:36 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: <wh7liwthnr.fsf@viffer.oslo.metis.no>
References: <14406.59075.218048.437305@localhost.localdomain> <wh7liwthnr.fsf@viffer.oslo.metis.no>
Message-ID: <m3d7sknoym.fsf@ifi.uio.no>


* Steinar Bang
| 
| I would like to be able to create a "push" stream, ie. something
| similar to a libwww stream, where data that arrives asynchronously
| will just be "pushed" to the parser as they arrive.
| 
| expat already supports this, and I use it.

We added support for this as an extension in the Python version of
SAX, since several of the Python parsers support this (xmllib, xmlproc
and pyexpat). This was simply done by adding three methods on the
extended parser interface: reset, feed and close.

For C++ SAX2 this might be done through a property
(http://.../push-stream) which returns a PushStream implementation
with these three methods to allow you to push data into parsers which
support this. 

Some means of specifying the URL of the document entity is probably
also a good idea, for resolution of relative URLs.
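
A rough sketch of what such a PushStream could look like in C++. Only the
three method names (reset, feed, close) come from the Python extension
described above; the class names, the systemId constructor argument and the
rule that feed() after close() is an error are assumptions made for
illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical interface with the three methods named above.
class PushStream {
public:
    virtual ~PushStream() {}
    virtual void reset() = 0;                                  // start a new document
    virtual void feed(const char* data, std::size_t len) = 0;  // push one block
    virtual void close() = 0;                                  // no more data
};

// Toy implementation that only tracks state; a real one would hand each
// block to the underlying parser.
class RecordingPushStream : public PushStream {
public:
    // systemId is the URL of the document entity, kept so that relative
    // URLs could be resolved against it.
    explicit RecordingPushStream(const std::string& systemId)
        : systemId_(systemId), bytes_(0), closed_(false) {}

    void reset() override { bytes_ = 0; closed_ = false; }

    void feed(const char* data, std::size_t len) override {
        assert(data != 0 || len == 0);
        if (closed_) throw std::logic_error("feed() after close()");
        bytes_ += len;
    }

    void close() override { closed_ = true; }

    const std::string& systemId() const { return systemId_; }
    std::size_t bytes() const { return bytes_; }
    bool closed() const { return closed_; }

private:
    std::string systemId_;
    std::size_t bytes_;
    bool closed_;
};
```

A caller would feed() each block as it arrives, close() at end of input, and
reset() before reusing the same object for another document.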

--Lars M.




From stele at fxtech.com  Mon Dec  6 14:11:43 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:23 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain>
Message-ID: <384BC3E2.E526B322@fxtech.com>

> We tried to keep SAX 1.0 as simple as possible -- how would you
> simplify the following further?
> 
>     public void startElement (String name, AttributeList atts)
>     {
>       // do something!!
>     }

Here is where I have the problem. This leaves an awful lot up to the
application, still, including handling the proper nesting. I would like
to make the actual parsing of elements more "automatic", so when a
certain element is hit, it calls a function with my object pointer where
I can pick up the parsing from there, then drop back out to the
enclosing XML scope and keep going.

Perhaps what I want to do should be built on SAX instead of expat,
though.
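
Roughly what I have in mind, as a sketch only. ElementDispatcher and its
methods are hypothetical names, and element names are plain std::string
values. It sits on top of SAX-style startElement/endElement events, keeps the
nesting stack itself, and calls a registered function when a named element is
hit.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>
#include <vector>

class ElementDispatcher {
public:
    // The callback receives the path of open elements, e.g. "scene/layer".
    typedef std::function<void(const std::string& path)> Callback;

    // Registers the function to call when an element with this name starts.
    void on(const std::string& name, Callback cb) { handlers_[name] = cb; }

    // SAX-style events, forwarded here by whatever parser is in use.
    void startElement(const std::string& name) {
        stack_.push_back(name);
        std::map<std::string, Callback>::const_iterator it =
            handlers_.find(name);
        if (it != handlers_.end()) it->second(path());
    }

    void endElement(const std::string& name) {
        assert(!stack_.empty() && stack_.back() == name);
        (void)name;  // only inspected by the assert above
        stack_.pop_back();
    }

    // Names of the currently open elements joined with '/'.
    std::string path() const {
        std::string p;
        for (std::size_t i = 0; i < stack_.size(); ++i) {
            if (i != 0) p += '/';
            p += stack_[i];
        }
        return p;
    }

    std::size_t depth() const { return stack_.size(); }

private:
    std::map<std::string, Callback> handlers_;
    std::vector<std::string> stack_;
};
```

The application registers one function per element type it cares about and
never has to track nesting itself; elements without a registered function
are simply skipped.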

--
Paul Miller - stele@fxtech.com



From eoin_lane at esatclear.ie  Mon Dec  6 14:34:22 1999
From: eoin_lane at esatclear.ie (Eoin Lane)
Date: Mon Jun  7 17:18:23 2004
Subject: psgml 1.2.1 problem
Message-ID: <384BC90B.8BA224A4@esatclear.ie>

I'm trying to write an XML doc with emacs configured to use psgml-1.2.1
but am having some problems. I have checked that psgml works with a
simple dtd. However, when I use the dtd (document-v10.dtd) below I get
the following error.

~/character.ent line 2 col 12 entity common.att
~/document-v10.dtd line 218 col 29 entity DOCUMENT
~/installing.xml line 3 col 51
Name expected; at: :lang

Could anyone tell me what I am doing wrong? I know the dtd is
correct because I checked it with IBM's XML4J parser and it validated. It
would be of great benefit to me if I could use the dtd in emacs, so any
help would be greatly appreciated.

Eoin.


--

Dr. Eoin Lane
InConn Technologies Ltd.
17 Washington St.
Cork.
Tel. (021) 271855 Fax (021) 272419
http://inconn.ucc.ie
mailto:eoinlane@esatclear.ie


-------------- next part --------------
<!-- ===================================================================

     Apache Documentation DTD (Version 1.0)

PURPOSE:
  This DTD was developed to create a simple yet powerful document
  type for software documentation for use with the Apache projects.
  It is an XML-compliant DTD and it's maintained by the Apache XML
  project.

TYPICAL INVOCATION:

  <!DOCTYPE document PUBLIC
       "-//APACHE//DTD Documentation Vx.yz//EN"
       "http://xml.apache.org/DTD/document-vxyz.dtd">

  where

    x := major version
    y := minor version
    z := status identifier (optional)

NOTES:
  Many of the design patterns used in this DTD were taken from the
  W3C XML Specification DTD edited by Eve Maler <elm@arbortext.com>.

  Where possible, great care has been used to reutilize HTML tag
  names to reduce learning efforts and to allow HTML editors to be
  used for complex authorings like tables and lists.

AUTHORS:
  Stefano Mazzocchi <stefano@apache.org>

FIXME:
  - how can we include char entities without hardwiring them?
  - should "form" tags be included?
  - should all style-free HTML 4.0 markup tags be included?
  - how do we handle the idea of "soft" xlinks?
  - should we add "soft" links to images?

CHANGE HISTORY:
  19991121 Initial version. (SM)
  19991123 Replaced "res" with more standard "strong" for emphasis. (SM)
  19991124 Added "fork" element for window forking behavior. (SM)
  19991124 Added "img-inline" element to separate from "img". (SM)
  19991129 Removed "affiliation" from "author". (SM)
  19991129 Made "author" empty and moved "name|email" as attributes (SM)

COPYRIGHT:
  Copyright (c) 1999 The Apache Software Foundation.

  Permission to copy in any form is granted provided this notice is
  included in all copies. Permission to redistribute is granted
  provided this file is distributed untouched in all its parts and
  included files.

==================================================================== -->


<!-- =============================================================== -->
<!-- Common character entities (included from external file) -->
<!-- =============================================================== -->

<!-- FIXME (SM): this is hardcoding. Find a better way of doing this
     possibly using public identifiers of ISO latin char sets -->
<!ENTITY % charEntity SYSTEM "characters.ent">
%charEntity;


<!-- =============================================================== -->
<!-- Useful entities for increased DTD readability -->
<!-- =============================================================== -->

<!ENTITY % text "#PCDATA">


<!-- =============================================================== -->
<!-- Entities for general XML compliance -->
<!-- =============================================================== -->

<!-- Common attributes
        Every element has an ID attribute (sometimes required,
        but usually optional) for links, and a Role attribute
        for extending the useful life of the DTD by allowing
        authors to make subclasses for any element. %common.att;
        is for common attributes where the ID is optional, and
        %common-idreq.att; is for common attributes where the
        ID is required.
-->
<!ENTITY % common.att
        'id                     ID              #IMPLIED
         xml:lang               NMTOKEN         #IMPLIED
         role                   NMTOKEN         #IMPLIED'>
<!ENTITY % common-idreq.att
        'id                     ID              #REQUIRED
         xml:lang               NMTOKEN         #IMPLIED
         role                   NMTOKEN         #IMPLIED'>


<!-- xml:space attribute ===============================================
        Indicates that the element contains white space
        that the formatter or other application should retain,
        as appropriate to its function.
==================================================================== -->
<!ENTITY % xmlspace.att
        'xml:space (default|preserve) #FIXED "preserve"'>


<!-- def attribute =====================================================
        Points to the element where the relevant definition can be
        found, using the IDREF mechanism.  %def.att; is for optional
        def attributes, and %def-req.att; is for required def
        attributes.
==================================================================== -->
<!ENTITY % def.att
        'def                    IDREF           #IMPLIED'>
<!ENTITY % def-req.att
        'def                    IDREF           #REQUIRED'>


<!-- ref attribute =====================================================
        Points to the element where more information can be found,
        using the IDREF mechanism.  %ref.att; is for optional
        ref attributes, and %ref-req.att; is for required ref
        attributes.
================================================================== -->
<!ENTITY % ref.att
        'ref                    IDREF           #IMPLIED'>
<!ENTITY % ref-req.att
        'ref                    IDREF           #REQUIRED'>


<!-- =============================================================== -->
<!-- Entities for XLink compliance -->
<!-- =============================================================== -->

<!ENTITY % xlink-simple.att
        'type      (simple|extended|locator|arc) #FIXED "simple"
         href      CDATA                         #IMPLIED
         role      CDATA                         #IMPLIED
         title     CDATA                         #IMPLIED '>
<!--    'xmlns     CDATA                         #FIXED "http://www.w3.org/XML/XLink/0.9" -->
<!-- FIXME: brain-dead IE5 has broken support for
     namespace validation and since I use it for editing
     I remove this for now -->

<!ENTITY % xlink-user-replace.att
        'show      (new|parsed|replace)   #FIXED "replace"
         actuate   (user|auto)            #FIXED "user" '>

<!ENTITY % xlink-user-new.att
        'show      (new|parsed|replace)   #FIXED "new"
         actuate   (user|auto)            #FIXED "user" '>

<!ENTITY % xlink-auto-parsed.att
        'show      (new|parsed|replace)   #FIXED "parsed"
         actuate   (user|auto)            #FIXED "auto" '>

<!-- FIXME (SM): XLink doesn't yet cover the idea of soft links so
     introducing it here using the same namespace is _somewhat_
     illegal. Should we create its own namespace?
-->
<!ENTITY % xlink-soft.att
        'mode      (hard|soft)            #FIXED "soft" '>


<!-- =============================================================== -->
<!-- Entities for general usage -->
<!-- =============================================================== -->


<!-- Key attribute =====================================================
        Optionally provides a sorting or indexing key, for cases when
        the element content is inappropriate for this purpose.
==================================================================== -->
<!ENTITY % key.att
        'key                    CDATA           #IMPLIED'>


<!-- Title attributes ==================================================
        Indicates that the element requires to have a title.
==================================================================== -->
<!ENTITY % title.att
        'title                  CDATA           #REQUIRED'>


<!-- Name attributes ==================================================
        Indicates that the element requires to have a name.
==================================================================== -->
<!ENTITY % name.att
        'name                   CDATA           #REQUIRED'>


<!-- Email attributes ==================================================
        Indicates that the element requires to have an email.
==================================================================== -->
<!ENTITY % email.att
        'email                  CDATA           #REQUIRED'>


<!-- =============================================================== -->
<!-- General definitions -->
<!-- =============================================================== -->

<!-- A person is a general human entity -->
<!ELEMENT person EMPTY>
<!ATTLIST person %common.att;
                 %name.att;
                 %email.att;>


<!-- =============================================================== -->
<!-- Content definitions -->
<!-- =============================================================== -->

<!ENTITY % local.content.mix "">

<!ENTITY % markup "strong|em|code|sub|sup">

<!ENTITY % links "link|connect|jump|fork|anchor">

<!ENTITY % special "br|img">

<!ENTITY % link-content.mix "%text;|%markup;|%special;%local.content.mix;">

<!ENTITY % content.mix "%link-content.mix;|%links;">

    <!-- ==================================================== -->
    <!-- Phrase Markup -->
    <!-- ==================================================== -->

    <!-- Strong (typically bold) -->
    <!ELEMENT strong (%text;)>
    <!ATTLIST strong %common.att;>

    <!-- Emphasis (typically italic) -->
    <!ELEMENT em (%text;)>
    <!ATTLIST em %common.att;>

    <!-- Code (typically monospaced) -->
    <!ELEMENT code (%text;)>
    <!ATTLIST code %common.att;>

    <!-- Superscript (typically smaller and higher) -->
    <!ELEMENT sup (%text;)>
    <!ATTLIST sup %common.att;>

    <!-- Subscript (typically smaller and lower) -->
    <!ELEMENT sub (%text;)>
    <!ATTLIST sub %common.att;>

    <!-- FIXME (SM): should we add these HTML 4.0 markups
         which are style-free?

          -dfn
          -samp
          -kbd
          -var
          -cite
          -abbr
          -acronym

     -->

    <!-- ==================================================== -->
    <!-- Hypertextual Links -->
    <!-- ==================================================== -->

    <!-- Hard replacing link (equivalent of <a ...>) -->
    <!ELEMENT link (%link-content.mix;)*>
    <!ATTLIST link %common.att;
                   %xlink-simple.att;
                   %xlink-user-replace.att;>

    <!-- Hard window replacing link (equivalent of <a ... target="_top">) -->
    <!ELEMENT jump (%link-content.mix;)*>
    <!ATTLIST jump %common.att;
                   %xlink-simple.att;
                   %xlink-user-new.att;>

    <!-- Hard window forking link (equivalent of <a ... target="_new">) -->
    <!ELEMENT fork (%link-content.mix;)*>
    <!ATTLIST fork %common.att;
                   %xlink-simple.att;
                   %xlink-user-new.att;>

    <!-- Anchor point (equivalent of <a name="...">) -->
    <!ELEMENT anchor EMPTY>
    <!ATTLIST anchor %common-idreq.att;>

    <!-- Soft link between processed pages (no equivalent in HTML) -->
    <!ELEMENT connect (%link-content.mix;)*>
    <!ATTLIST connect %common.att;
                      %xlink-simple.att;
                      %xlink-user-replace.att;
                      %xlink-soft.att;>

    <!-- ==================================================== -->
    <!-- Specials -->
    <!-- ==================================================== -->

    <!-- Breakline Object (typically forces line break) -->
    <!ELEMENT br EMPTY>
    <!ATTLIST br %common.att;>

    <!-- Image Object (typically an inlined image) -->
    <!-- FIXME (SM): should we have the notion of soft links even here
         for inlined objects? -->
    <!ELEMENT img EMPTY>
    <!ATTLIST img src    CDATA  #REQUIRED
                  alt    CDATA  #REQUIRED
                  height CDATA  #IMPLIED
                  width  CDATA  #IMPLIED
                  usemap CDATA  #IMPLIED
                  ismap  (ismap) #IMPLIED
                  %common.att;>


<!-- =============================================================== -->
<!-- Block definitions -->
<!-- =============================================================== -->

<!ENTITY % local.blocks "">

<!ENTITY % local.lists "">

<!ENTITY % paragraphs "p|source|note|fixme|img-block">

<!ENTITY % tables "table">

<!ENTITY % lists "ol|ul|sl|dl %local.lists;">

<!ENTITY % blocks "%paragraphs;|%tables;|%lists; %local.blocks;">

    <!-- ==================================================== -->
    <!-- Paragraphs -->
    <!-- ==================================================== -->

    <!-- Text Paragraph (normally vertically space delimited) -->
    <!ELEMENT p (%content.mix;)*>
    <!ATTLIST p %common.att;>

    <!-- Source Paragraph (normally space is preserved) -->
    <!ELEMENT source (%content.mix;)*>
    <!ATTLIST source %common.att;
                     %xmlspace.att;>

    <!-- Note Paragraph (normally shown encapsulated) -->
    <!ELEMENT note (%content.mix;)*>
    <!ATTLIST note %common.att;>

    <!-- Fixme Paragraph (normally not shown) -->
    <!ELEMENT fixme (%content.mix;)*>
    <!-- the "author" attribute should match the "key" attribute of the
         <author> element -->
    <!ATTLIST fixme author CDATA #REQUIRED
                    %common.att;>

    <!-- ==================================================== -->
    <!-- Tables -->
    <!-- ==================================================== -->

    <!ENTITY % cellhalign.att
            'align          (left|center
                            |right|justify
                            |char)          #IMPLIED
            char            CDATA           #IMPLIED
            charoff         CDATA           #IMPLIED'>

    <!ENTITY % cellvalign.att
            'valign         (top|middle
                            |bottom
                            |baseline)      #IMPLIED'>

    <!ENTITY % thtd.att
            'abbr           CDATA           #IMPLIED
            axis            CDATA           #IMPLIED
            headers         IDREFS          #IMPLIED
            scope           (row
                            |col
                            |rowgroup
                            |colgroup)      #IMPLIED
            rowspan         NMTOKEN         "1"
            colspan         NMTOKEN         "1"'>

    <!ENTITY % width.att
            'width          CDATA           #IMPLIED'>

    <!ENTITY % span.att
            'span           NMTOKEN         "1"'>


    <!-- Table (based on the IETF HTML table standard [RFC1942]) -->
    <!ELEMENT table
            (caption?, (col*|colgroup*), thead?, tfoot?, tbody+)>
    <!ATTLIST table
            %common.att;
            %width.att;
            summary         CDATA           #IMPLIED
            border          CDATA           #IMPLIED
            frame           (void|above
                            |below|hsides
                            |lhs|rhs
                            |vsides|box
                            |border)        #IMPLIED
            rules           (none|groups
                            |rows|cols
                            |all)           #IMPLIED
            cellspacing     CDATA           #IMPLIED
            cellpadding     CDATA           #IMPLIED>

        <!ELEMENT caption (%content.mix;)*>
        <!ATTLIST caption %common.att;>

        <!ELEMENT colgroup (col)*>
        <!ATTLIST colgroup
                %common.att;
                %span.att;
                %width.att;
                %cellhalign.att;
                %cellvalign.att;>

            <!ELEMENT col EMPTY>
            <!ATTLIST col
                    %common.att;
                    %span.att;
                    %width.att;
                    %cellhalign.att;
                    %cellvalign.att;>

        <!ELEMENT thead (tr)+>
        <!ATTLIST thead
                %common.att;
                %cellhalign.att;
                %cellvalign.att;>

        <!ELEMENT tfoot (tr)+>
        <!ATTLIST tfoot
                %common.att;
                %cellhalign.att;
                %cellvalign.att;>

        <!ELEMENT tbody (tr)+>
        <!ATTLIST tbody
                %common.att;
                %cellhalign.att;
                %cellvalign.att;>

            <!ELEMENT tr (th|td)+>
            <!ATTLIST tr
                    %common.att;
                    %cellhalign.att;
                    %cellvalign.att;>

                <!ELEMENT th (%content.mix;)*>
                <!ATTLIST th
                        %common.att;
                        %thtd.att;
                        %cellhalign.att;
                        %cellvalign.att;>

                <!ELEMENT td (%content.mix;)*>
                <!ATTLIST td
                        %common.att;
                        %thtd.att;
                        %cellhalign.att;
                        %cellvalign.att;>
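
    <!-- Illustrative sketch (not part of the original DTD): a small table
         that should fit the content model above, assuming %text; expands
         to #PCDATA and %common.att; adds no required attributes (both are
         defined outside this excerpt). The cell values are invented.

         <table>
           <caption>Parsers</caption>
           <thead>
             <tr><th>Name</th><th>Language</th></tr>
           </thead>
           <tbody>
             <tr><td>Xerces</td><td>Java</td></tr>
           </tbody>
         </table>

         The (col*|colgroup*) group may be empty, so column declarations
         are optional; at least one tbody is required.
    -->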

    <!-- ==================================================== -->
    <!-- Lists -->
    <!-- ==================================================== -->

    <!-- Unordered list (typically bulleted) -->
    <!ELEMENT ul (li|%lists;)+>
    <!--    spacing attribute:
            Use "normal" to get normal vertical spacing for items;
            use "compact" to get less spacing.  The default is dependent
            on the stylesheet. -->
    <!ATTLIST ul
            %common.att;
            spacing         (normal|compact)        #IMPLIED>

    <!-- Ordered list (typically numbered) -->
    <!ELEMENT ol (li|%lists;)+>
    <!--    spacing attribute:
            Use "normal" to get normal vertical spacing for items;
            use "compact" to get less spacing.  The default is dependent
            on the stylesheet. -->
    <!ATTLIST ol
            %common.att;
            spacing         (normal|compact)        #IMPLIED>

    <!-- Simple list (typically with no mark) -->
    <!ELEMENT sl (li|%lists;)+>
    <!ATTLIST sl %common.att;>

        <!-- List item -->
        <!ELEMENT li (%content.mix;|%lists;)*>
        <!ATTLIST li %common.att;>

    <!-- Definition list (typically two-column) -->
    <!ELEMENT dl (dt,dd)+>
    <!ATTLIST dl %common.att;>

        <!-- Definition term -->
        <!ELEMENT dt (%content.mix;)*>
        <!ATTLIST dt %common.att;>

        <!-- Definition description -->
        <!ELEMENT dd (%content.mix;)*>
        <!ATTLIST dd %common.att;>
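
    <!-- Illustrative sketch (not part of the original DTD): nested lists
         and a definition list that should fit the declarations above,
         assuming %text; expands to #PCDATA and %common.att; adds no
         required attributes (both are defined outside this excerpt).
         The item text is invented.

         <ul spacing="compact">
           <li>First item</li>
           <li>Second item
             <ol>
               <li>Nested item</li>
             </ol>
           </li>
         </ul>

         <dl>
           <dt>Term</dt>
           <dd>Its description</dd>
         </dl>

         Each dt must be followed by exactly one dd, because the content
         model of dl is (dt,dd)+.
    -->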

    <!-- ==================================================== -->
    <!-- Special Blocks -->
    <!-- ==================================================== -->

    <!-- Image Block (typically a separated and centered image) -->
    <!-- FIXME (SM): should we have the notion of soft links even here
         for inlined objects? -->
    <!ELEMENT img-block EMPTY>
    <!ATTLIST img-block src    CDATA  #REQUIRED
                        alt    CDATA  #REQUIRED
                        height CDATA  #IMPLIED
                        width  CDATA  #IMPLIED
                        usemap CDATA  #IMPLIED
                        ismap  (ismap) #IMPLIED
                        %common.att;>


<!-- =============================================================== -->
<!-- Document -->
<!-- =============================================================== -->

<!ELEMENT document (header?, body, footer?)>
<!ATTLIST document %common.att;>

    <!-- ==================================================== -->
    <!-- Header -->
    <!-- ==================================================== -->

    <!ENTITY % local.headers "">

    <!ELEMENT header (title, subtitle?, version?, type?, authors,
                      notice*, abstract? %local.headers;)>
    <!ATTLIST header %common.att;>

    <!ELEMENT title (%text;)>
    <!ATTLIST title %common.att;>

    <!ELEMENT subtitle (%text;)>
    <!ATTLIST subtitle %common.att;>

    <!ELEMENT version (%text;)>
    <!ATTLIST version %common.att;>

    <!ELEMENT type (%text;)>
    <!ATTLIST type %common.att;>

    <!ELEMENT authors (person+)>
    <!ATTLIST authors %common.att;>

    <!ELEMENT notice (%content.mix;)*>
    <!ATTLIST notice %common.att;>

    <!ELEMENT abstract (%content.mix;)*>
    <!ATTLIST abstract %common.att;>

    <!-- ==================================================== -->
    <!-- Body -->
    <!-- ==================================================== -->

    <!ENTITY % local.sections "">

    <!ENTITY % sections "s1 %local.sections;">

    <!ELEMENT body (%sections;)+>
    <!ATTLIST body %common.att;>

        <!ELEMENT s1 (s2|%blocks;)*>
        <!ATTLIST s1 %title.att; %common.att;>

            <!ELEMENT s2 (s3|%blocks;)*>
            <!ATTLIST s2 %title.att; %common.att;>

                <!ELEMENT s3 (s4|%blocks;)*>
                <!ATTLIST s3 %title.att; %common.att;>

                    <!ELEMENT s4 (%blocks;)*>
                    <!ATTLIST s4 %title.att; %common.att;>

    <!-- ==================================================== -->
    <!-- Footer -->
    <!-- ==================================================== -->

    <!ENTITY % local.footers "">

    <!ELEMENT footer (legal %local.footers;)>

        <!ELEMENT legal (%content.mix;)*>
        <!ATTLIST legal %common.att;>
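
<!-- Illustrative sketch (not part of the original DTD): a minimal
     instance that should validate against the declarations above,
     assuming %text; expands to #PCDATA and %common.att; adds no
     required attributes (both are defined outside this excerpt).
     The names and values are invented for illustration.

     <document>
       <header>
         <title>Sample</title>
         <authors>
           <person name="A. Author" email="author@example.org"/>
         </authors>
       </header>
       <body>
         <s1 title="Introduction">
           <p>Some <em>emphasized</em> text.</p>
         </s1>
       </body>
     </document>

     The header and footer are optional, but a header, when present,
     needs a title and at least one person inside authors, and every
     section needs a title attribute.
-->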

<!-- =============================================================== -->
<!-- End of DTD -->
<!-- =============================================================== -->
-------------- next part --------------
<!-- 
     Portions (C) International Organization for Standardization 1986
     Permission to copy in any form is granted for use with
     conforming SGML systems and applications as defined in
     ISO 8879, provided this notice is included in all copies.
-->

<!-- 
     Character entity set.
-->

<!-- Latin-1 Supplement -->
<!ENTITY nbsp     "&#160;">  <!-- U+00A0 ISOnum    - no-break space = non-breaking space                                   -->
<!ENTITY iexcl    "&#161;">  <!-- U+00A1 ISOnum    - inverted exclamation mark                                             -->
<!ENTITY cent     "&#162;">  <!-- U+00A2 ISOnum    - cent sign                                                             -->
<!ENTITY pound    "&#163;">  <!-- U+00A3 ISOnum    - pound sign                                                            -->
<!ENTITY curren   "&#164;">  <!-- U+00A4 ISOnum    - currency sign                                                         -->
<!ENTITY yen      "&#165;">  <!-- U+00A5 ISOnum    - yen sign = yuan sign                                                  -->
<!ENTITY brvbar   "&#166;">  <!-- U+00A6 ISOnum    - broken bar = broken vertical bar                                      -->
<!ENTITY sect     "&#167;">  <!-- U+00A7 ISOnum    - section sign                                                          -->
<!ENTITY uml      "&#168;">  <!-- U+00A8 ISOdia    - diaeresis = spacing diaeresis                                         -->
<!ENTITY copy     "&#169;">  <!-- U+00A9 ISOnum    - copyright sign                                                        -->
<!ENTITY ordf     "&#170;">  <!-- U+00AA ISOnum    - feminine ordinal indicator                                            -->
<!ENTITY laquo    "&#171;">  <!-- U+00AB ISOnum    - left-pointing double angle quotation mark = left pointing guillemet   -->
<!ENTITY not      "&#172;">  <!-- U+00AC ISOnum    - not sign                                                              -->
<!ENTITY shy      "&#173;">  <!-- U+00AD ISOnum    - soft hyphen = discretionary hyphen                                    -->
<!ENTITY reg      "&#174;">  <!-- U+00AE ISOnum    - registered sign = registered trade mark sign                          -->
<!ENTITY macr     "&#175;">  <!-- U+00AF ISOdia    - macron = spacing macron = overline = APL overbar                      -->
<!ENTITY deg      "&#176;">  <!-- U+00B0 ISOnum    - degree sign                                                           -->
<!ENTITY plusmn   "&#177;">  <!-- U+00B1 ISOnum    - plus-minus sign = plus-or-minus sign                                  -->
<!ENTITY sup2     "&#178;">  <!-- U+00B2 ISOnum    - superscript two = superscript digit two = squared                     -->
<!ENTITY sup3     "&#179;">  <!-- U+00B3 ISOnum    - superscript three = superscript digit three = cubed                   -->
<!ENTITY acute    "&#180;">  <!-- U+00B4 ISOdia    - acute accent = spacing acute                                          -->
<!ENTITY micro    "&#181;">  <!-- U+00B5 ISOnum    - micro sign                                                            -->
<!ENTITY para     "&#182;">  <!-- U+00B6 ISOnum    - pilcrow sign = paragraph sign                                         -->
<!ENTITY middot   "&#183;">  <!-- U+00B7 ISOnum    - middle dot = Georgian comma = Greek middle dot                        -->
<!ENTITY cedil    "&#184;">  <!-- U+00B8 ISOdia    - cedilla = spacing cedilla                                             -->
<!ENTITY sup1     "&#185;">  <!-- U+00B9 ISOnum    - superscript one = superscript digit one                               -->
<!ENTITY ordm     "&#186;">  <!-- U+00BA ISOnum    - masculine ordinal indicator                                           -->
<!ENTITY raquo    "&#187;">  <!-- U+00BB ISOnum    - right-pointing double angle quotation mark = right pointing guillemet -->
<!ENTITY frac14   "&#188;">  <!-- U+00BC ISOnum    - vulgar fraction one quarter = fraction one quarter                    -->
<!ENTITY frac12   "&#189;">  <!-- U+00BD ISOnum    - vulgar fraction one half = fraction one half                          -->
<!ENTITY frac34   "&#190;">  <!-- U+00BE ISOnum    - vulgar fraction three quarters = fraction three quarters              -->
<!ENTITY iquest   "&#191;">  <!-- U+00BF ISOnum    - inverted question mark = turned question mark                         -->
<!ENTITY Agrave   "&#192;">  <!-- U+00C0 ISOlat1   - latin capital letter A with grave = latin capital letter A grave      -->
<!ENTITY Aacute   "&#193;">  <!-- U+00C1 ISOlat1   - latin capital letter A with acute                                     -->
<!ENTITY Acirc    "&#194;">  <!-- U+00C2 ISOlat1   - latin capital letter A with circumflex                                -->
<!ENTITY Atilde   "&#195;">  <!-- U+00C3 ISOlat1   - latin capital letter A with tilde                                     -->
<!ENTITY Auml     "&#196;">  <!-- U+00C4 ISOlat1   - latin capital letter A with diaeresis                                 -->
<!ENTITY Aring    "&#197;">  <!-- U+00C5 ISOlat1   - latin capital letter A with ring above = latin capital letter A ring  -->
<!ENTITY AElig    "&#198;">  <!-- U+00C6 ISOlat1   - latin capital letter AE = latin capital ligature AE                   -->
<!ENTITY Ccedil   "&#199;">  <!-- U+00C7 ISOlat1   - latin capital letter C with cedilla                                   -->
<!ENTITY Egrave   "&#200;">  <!-- U+00C8 ISOlat1   - latin capital letter E with grave                                     -->
<!ENTITY Eacute   "&#201;">  <!-- U+00C9 ISOlat1   - latin capital letter E with acute                                     -->
<!ENTITY Ecirc    "&#202;">  <!-- U+00CA ISOlat1   - latin capital letter E with circumflex                                -->
<!ENTITY Euml     "&#203;">  <!-- U+00CB ISOlat1   - latin capital letter E with diaeresis                                 -->
<!ENTITY Igrave   "&#204;">  <!-- U+00CC ISOlat1   - latin capital letter I with grave                                     -->
<!ENTITY Iacute   "&#205;">  <!-- U+00CD ISOlat1   - latin capital letter I with acute                                     -->
<!ENTITY Icirc    "&#206;">  <!-- U+00CE ISOlat1   - latin capital letter I with circumflex                                -->
<!ENTITY Iuml     "&#207;">  <!-- U+00CF ISOlat1   - latin capital letter I with diaeresis                                 -->
<!ENTITY ETH      "&#208;">  <!-- U+00D0 ISOlat1   - latin capital letter ETH                                              -->
<!ENTITY Ntilde   "&#209;">  <!-- U+00D1 ISOlat1   - latin capital letter N with tilde                                     -->
<!ENTITY Ograve   "&#210;">  <!-- U+00D2 ISOlat1   - latin capital letter O with grave                                     -->
<!ENTITY Oacute   "&#211;">  <!-- U+00D3 ISOlat1   - latin capital letter O with acute                                     -->
<!ENTITY Ocirc    "&#212;">  <!-- U+00D4 ISOlat1   - latin capital letter O with circumflex                                -->
<!ENTITY Otilde   "&#213;">  <!-- U+00D5 ISOlat1   - latin capital letter O with tilde                                     -->
<!ENTITY Ouml     "&#214;">  <!-- U+00D6 ISOlat1   - latin capital letter O with diaeresis                                 -->
<!ENTITY times    "&#215;">  <!-- U+00D7 ISOnum    - multiplication sign                                                   -->
<!ENTITY Oslash   "&#216;">  <!-- U+00D8 ISOlat1   - latin capital letter O with stroke = latin capital letter O slash     -->
<!ENTITY Ugrave   "&#217;">  <!-- U+00D9 ISOlat1   - latin capital letter U with grave                                     -->
<!ENTITY Uacute   "&#218;">  <!-- U+00DA ISOlat1   - latin capital letter U with acute                                     -->
<!ENTITY Ucirc    "&#219;">  <!-- U+00DB ISOlat1   - latin capital letter U with circumflex                                -->
<!ENTITY Uuml     "&#220;">  <!-- U+00DC ISOlat1   - latin capital letter U with diaeresis                                 -->
<!ENTITY Yacute   "&#221;">  <!-- U+00DD ISOlat1   - latin capital letter Y with acute                                     -->
<!ENTITY THORN    "&#222;">  <!-- U+00DE ISOlat1   - latin capital letter THORN                                            -->
<!ENTITY szlig    "&#223;">  <!-- U+00DF ISOlat1   - latin small letter sharp s = ess-zed                                  -->
<!ENTITY agrave   "&#224;">  <!-- U+00E0 ISOlat1   - latin small letter a with grave = latin small letter a grave          -->
<!ENTITY aacute   "&#225;">  <!-- U+00E1 ISOlat1   - latin small letter a with acute                                       -->
<!ENTITY acirc    "&#226;">  <!-- U+00E2 ISOlat1   - latin small letter a with circumflex                                  -->
<!ENTITY atilde   "&#227;">  <!-- U+00E3 ISOlat1   - latin small letter a with tilde                                       -->
<!ENTITY auml     "&#228;">  <!-- U+00E4 ISOlat1   - latin small letter a with diaeresis                                   -->
<!ENTITY aring    "&#229;">  <!-- U+00E5 ISOlat1   - latin small letter a with ring above = latin small letter a ring      -->
<!ENTITY aelig    "&#230;">  <!-- U+00E6 ISOlat1   - latin small letter ae = latin small ligature ae                       -->
<!ENTITY ccedil   "&#231;">  <!-- U+00E7 ISOlat1   - latin small letter c with cedilla                                     -->
<!ENTITY egrave   "&#232;">  <!-- U+00E8 ISOlat1   - latin small letter e with grave                                       -->
<!ENTITY eacute   "&#233;">  <!-- U+00E9 ISOlat1   - latin small letter e with acute                                       -->
<!ENTITY ecirc    "&#234;">  <!-- U+00EA ISOlat1   - latin small letter e with circumflex                                  -->
<!ENTITY euml     "&#235;">  <!-- U+00EB ISOlat1   - latin small letter e with diaeresis                                   -->
<!ENTITY igrave   "&#236;">  <!-- U+00EC ISOlat1   - latin small letter i with grave                                       -->
<!ENTITY iacute   "&#237;">  <!-- U+00ED ISOlat1   - latin small letter i with acute                                       -->
<!ENTITY icirc    "&#238;">  <!-- U+00EE ISOlat1   - latin small letter i with circumflex                                  -->
<!ENTITY iuml     "&#239;">  <!-- U+00EF ISOlat1   - latin small letter i with diaeresis                                   -->
<!ENTITY eth      "&#240;">  <!-- U+00F0 ISOlat1   - latin small letter eth                                                -->
<!ENTITY ntilde   "&#241;">  <!-- U+00F1 ISOlat1   - latin small letter n with tilde                                       -->
<!ENTITY ograve   "&#242;">  <!-- U+00F2 ISOlat1   - latin small letter o with grave                                       -->
<!ENTITY oacute   "&#243;">  <!-- U+00F3 ISOlat1   - latin small letter o with acute                                       -->
<!ENTITY ocirc    "&#244;">  <!-- U+00F4 ISOlat1   - latin small letter o with circumflex                                  -->
<!ENTITY otilde   "&#245;">  <!-- U+00F5 ISOlat1   - latin small letter o with tilde                                       -->
<!ENTITY ouml     "&#246;">  <!-- U+00F6 ISOlat1   - latin small letter o with diaeresis                                   -->
<!ENTITY divide   "&#247;">  <!-- U+00F7 ISOnum    - division sign                                                         -->
<!ENTITY oslash   "&#248;">  <!-- U+00F8 ISOlat1   - latin small letter o with stroke = latin small letter o slash         -->
<!ENTITY ugrave   "&#249;">  <!-- U+00F9 ISOlat1   - latin small letter u with grave                                       -->
<!ENTITY uacute   "&#250;">  <!-- U+00FA ISOlat1   - latin small letter u with acute                                       -->
<!ENTITY ucirc    "&#251;">  <!-- U+00FB ISOlat1   - latin small letter u with circumflex                                  -->
<!ENTITY uuml     "&#252;">  <!-- U+00FC ISOlat1   - latin small letter u with diaeresis                                   -->
<!ENTITY yacute   "&#253;">  <!-- U+00FD ISOlat1   - latin small letter y with acute                                       -->
<!ENTITY thorn    "&#254;">  <!-- U+00FE ISOlat1   - latin small letter thorn                                              -->
<!ENTITY yuml     "&#255;">  <!-- U+00FF ISOlat1   - latin small letter y with diaeresis                                   -->

<!-- Latin Extended-A -->
<!ENTITY OElig    "&#338;">  <!-- U+0152 ISOlat2   - latin capital ligature OE                                             -->
<!ENTITY oelig    "&#339;">  <!-- U+0153 ISOlat2   - latin small ligature oe                                               -->
<!-- For OElig and oelig, "ligature" is a misnomer: this is a separate character in some languages -->

<!ENTITY Scaron   "&#352;">  <!-- U+0160 ISOlat2   - latin capital letter S with caron                                     -->
<!ENTITY scaron   "&#353;">  <!-- U+0161 ISOlat2   - latin small letter s with caron                                       -->
<!ENTITY Yuml     "&#376;">  <!-- U+0178 ISOlat2   - latin capital letter Y with diaeresis                                 -->

<!-- Spacing Modifier Letters -->
<!ENTITY circ     "&#710;">  <!-- U+02C6 ISOpub    - modifier letter circumflex accent                                     -->
<!ENTITY tilde    "&#732;">  <!-- U+02DC ISOdia    - small tilde                                                           -->

<!-- General Punctuation -->
<!ENTITY ensp     "&#8194;"> <!-- U+2002 ISOpub    - en space                                                              -->
<!ENTITY emsp     "&#8195;"> <!-- U+2003 ISOpub    - em space                                                              -->
<!ENTITY thinsp   "&#8201;"> <!-- U+2009 ISOpub    - thin space                                                            -->
<!ENTITY zwnj     "&#8204;"> <!-- U+200C RFC 2070  - zero width non-joiner                                                 -->
<!ENTITY zwj      "&#8205;"> <!-- U+200D RFC 2070  - zero width joiner                                                     -->
<!ENTITY lrm      "&#8206;"> <!-- U+200E RFC 2070  - left-to-right mark                                                    -->
<!ENTITY rlm      "&#8207;"> <!-- U+200F RFC 2070  - right-to-left mark                                                    -->
<!ENTITY ndash    "&#8211;"> <!-- U+2013 ISOpub    - en dash                                                               -->
<!ENTITY mdash    "&#8212;"> <!-- U+2014 ISOpub    - em dash                                                               -->
<!ENTITY lsquo    "&#8216;"> <!-- U+2018 ISOnum    - left single quotation mark                                            -->
<!ENTITY rsquo    "&#8217;"> <!-- U+2019 ISOnum    - right single quotation mark                                           -->
<!ENTITY sbquo    "&#8218;"> <!-- U+201A NEW       - single low-9 quotation mark                                           -->
<!ENTITY ldquo    "&#8220;"> <!-- U+201C ISOnum    - left double quotation mark                                            -->
<!ENTITY rdquo    "&#8221;"> <!-- U+201D ISOnum    - right double quotation mark                                           -->
<!ENTITY bdquo    "&#8222;"> <!-- U+201E NEW       - double low-9 quotation mark                                           -->
<!ENTITY dagger   "&#8224;"> <!-- U+2020 ISOpub    - dagger                                                                -->
<!ENTITY Dagger   "&#8225;"> <!-- U+2021 ISOpub    - double dagger                                                         -->
<!ENTITY permil   "&#8240;"> <!-- U+2030 ISOtech   - per mille sign                                                        -->
<!ENTITY lsaquo   "&#8249;"> <!-- U+2039 ISO prop. - single left-pointing angle quotation mark                             -->
<!-- lsaquo is proposed but not yet ISO standardized -->
<!ENTITY rsaquo   "&#8250;"> <!-- U+203A ISO prop. - single right-pointing angle quotation mark                            -->
<!-- rsaquo is proposed but not yet ISO standardized -->

<!ENTITY euro     "&#8364;"> <!-- U+20AC NEW       - euro sign                                                             -->

<!-- Latin Extended-B -->
<!ENTITY fnof     "&#402;">  <!-- U+0192 ISOtech   - latin small f with hook = function = florin                           -->

<!-- Greek -->
<!ENTITY Alpha    "&#913;">  <!-- U+0391           - greek capital letter alpha                                            -->
<!ENTITY Beta     "&#914;">  <!-- U+0392           - greek capital letter beta                                             -->
<!ENTITY Gamma    "&#915;">  <!-- U+0393 ISOgrk3   - greek capital letter gamma                                            -->
<!ENTITY Delta    "&#916;">  <!-- U+0394 ISOgrk3   - greek capital letter delta                                            -->
<!ENTITY Epsilon  "&#917;">  <!-- U+0395           - greek capital letter epsilon                                          -->
<!ENTITY Zeta     "&#918;">  <!-- U+0396           - greek capital letter zeta                                             -->
<!ENTITY Eta      "&#919;">  <!-- U+0397           - greek capital letter eta                                              -->
<!ENTITY Theta    "&#920;">  <!-- U+0398 ISOgrk3   - greek capital letter theta                                            -->
<!ENTITY Iota     "&#921;">  <!-- U+0399           - greek capital letter iota                                             -->
<!ENTITY Kappa    "&#922;">  <!-- U+039A           - greek capital letter kappa                                            -->
<!ENTITY Lambda   "&#923;">  <!-- U+039B ISOgrk3   - greek capital letter lambda                                           -->
<!ENTITY Mu       "&#924;">  <!-- U+039C           - greek capital letter mu                                               -->
<!ENTITY Nu       "&#925;">  <!-- U+039D           - greek capital letter nu                                               -->
<!ENTITY Xi       "&#926;">  <!-- U+039E ISOgrk3   - greek capital letter xi                                               -->
<!ENTITY Omicron  "&#927;">  <!-- U+039F           - greek capital letter omicron                                          -->
<!ENTITY Pi       "&#928;">  <!-- U+03A0 ISOgrk3   - greek capital letter pi                                               -->
<!ENTITY Rho      "&#929;">  <!-- U+03A1           - greek capital letter rho                                              -->
<!ENTITY Sigma    "&#931;">  <!-- U+03A3 ISOgrk3   - greek capital letter sigma                                            -->
<!ENTITY Tau      "&#932;">  <!-- U+03A4           - greek capital letter tau                                              -->
<!ENTITY Upsilon  "&#933;">  <!-- U+03A5 ISOgrk3   - greek capital letter upsilon                                          -->
<!ENTITY Phi      "&#934;">  <!-- U+03A6 ISOgrk3   - greek capital letter phi                                              -->
<!ENTITY Chi      "&#935;">  <!-- U+03A7           - greek capital letter chi                                              -->
<!ENTITY Psi      "&#936;">  <!-- U+03A8 ISOgrk3   - greek capital letter psi                                              -->
<!ENTITY Omega    "&#937;">  <!-- U+03A9 ISOgrk3   - greek capital letter omega                                            -->
<!ENTITY alpha    "&#945;">  <!-- U+03B1 ISOgrk3   - greek small letter alpha                                              -->
<!ENTITY beta     "&#946;">  <!-- U+03B2 ISOgrk3   - greek small letter beta                                               -->
<!ENTITY gamma    "&#947;">  <!-- U+03B3 ISOgrk3   - greek small letter gamma                                              -->
<!ENTITY delta    "&#948;">  <!-- U+03B4 ISOgrk3   - greek small letter delta                                              -->
<!ENTITY epsilon  "&#949;">  <!-- U+03B5 ISOgrk3   - greek small letter epsilon                                            -->
<!ENTITY zeta     "&#950;">  <!-- U+03B6 ISOgrk3   - greek small letter zeta                                               -->
<!ENTITY eta      "&#951;">  <!-- U+03B7 ISOgrk3   - greek small letter eta                                                -->
<!ENTITY theta    "&#952;">  <!-- U+03B8 ISOgrk3   - greek small letter theta                                              -->
<!ENTITY iota     "&#953;">  <!-- U+03B9 ISOgrk3   - greek small letter iota                                               -->
<!ENTITY kappa    "&#954;">  <!-- U+03BA ISOgrk3   - greek small letter kappa                                              -->
<!ENTITY lambda   "&#955;">  <!-- U+03BB ISOgrk3   - greek small letter lambda                                             -->
<!ENTITY mu       "&#956;">  <!-- U+03BC ISOgrk3   - greek small letter mu                                                 -->
<!ENTITY nu       "&#957;">  <!-- U+03BD ISOgrk3   - greek small letter nu                                                 -->
<!ENTITY xi       "&#958;">  <!-- U+03BE ISOgrk3   - greek small letter xi                                                 -->
<!ENTITY omicron  "&#959;">  <!-- U+03BF NEW       - greek small letter omicron                                            -->
<!ENTITY pi       "&#960;">  <!-- U+03C0 ISOgrk3   - greek small letter pi                                                 -->
<!ENTITY rho      "&#961;">  <!-- U+03C1 ISOgrk3   - greek small letter rho                                                -->
<!ENTITY sigmaf   "&#962;">  <!-- U+03C2 ISOgrk3   - greek small letter final sigma                                        -->
<!ENTITY sigma    "&#963;">  <!-- U+03C3 ISOgrk3   - greek small letter sigma                                              -->
<!ENTITY tau      "&#964;">  <!-- U+03C4 ISOgrk3   - greek small letter tau                                                -->
<!ENTITY upsilon  "&#965;">  <!-- U+03C5 ISOgrk3   - greek small letter upsilon                                            -->
<!ENTITY phi      "&#966;">  <!-- U+03C6 ISOgrk3   - greek small letter phi                                                -->
<!ENTITY chi      "&#967;">  <!-- U+03C7 ISOgrk3   - greek small letter chi                                                -->
<!ENTITY psi      "&#968;">  <!-- U+03C8 ISOgrk3   - greek small letter psi                                                -->
<!ENTITY omega    "&#969;">  <!-- U+03C9 ISOgrk3   - greek small letter omega                                              -->
<!ENTITY thetasym "&#977;">  <!-- U+03D1 NEW       - greek small letter theta symbol                                       -->
<!ENTITY upsih    "&#978;">  <!-- U+03D2 NEW       - greek upsilon with hook symbol                                        -->
<!ENTITY piv      "&#982;">  <!-- U+03D6 ISOgrk3   - greek pi symbol                                                       -->

<!-- General Punctuation -->
<!ENTITY bull     "&#8226;"> <!-- U+2022 ISOpub    - bullet = black small circle                                           -->
<!ENTITY hellip   "&#8230;"> <!-- U+2026 ISOpub    - horizontal ellipsis = three dot leader                                -->
<!ENTITY prime    "&#8242;"> <!-- U+2032 ISOtech   - prime = minutes = feet                                                -->
<!ENTITY Prime    "&#8243;"> <!-- U+2033 ISOtech   - double prime = seconds = inches                                       -->
<!ENTITY oline    "&#8254;"> <!-- U+203E NEW       - overline = spacing overscore                                          -->
<!ENTITY frasl    "&#8260;"> <!-- U+2044 NEW       - fraction slash                                                        -->

<!-- Letterlike Symbols -->
<!ENTITY weierp   "&#8472;"> <!-- U+2118 ISOamso   - script capital P = power set = Weierstrass p                          -->
<!ENTITY image    "&#8465;"> <!-- U+2111 ISOamso   - blackletter capital I = imaginary part                                -->
<!ENTITY real     "&#8476;"> <!-- U+211C ISOamso   - blackletter capital R = real part symbol                              -->
<!ENTITY trade    "&#8482;"> <!-- U+2122 ISOnum    - trade mark sign                                                       -->
<!ENTITY alefsym  "&#8501;"> <!-- U+2135 NEW       - alef symbol = first transfinite cardinal                              -->

<!-- Arrows -->
<!ENTITY larr     "&#8592;"> <!-- U+2190 ISOnum    - leftwards arrow                                                       -->
<!ENTITY uarr     "&#8593;"> <!-- U+2191 ISOnum    - upwards arrow                                                         -->
<!ENTITY rarr     "&#8594;"> <!-- U+2192 ISOnum    - rightwards arrow                                                      -->
<!ENTITY darr     "&#8595;"> <!-- U+2193 ISOnum    - downwards arrow                                                       -->
<!ENTITY harr     "&#8596;"> <!-- U+2194 ISOamsa   - left right arrow                                                      -->
<!ENTITY crarr    "&#8629;"> <!-- U+21B5 NEW       - downwards arrow with corner leftwards = carriage return               -->
<!ENTITY lArr     "&#8656;"> <!-- U+21D0 ISOtech   - leftwards double arrow                                                -->
<!ENTITY uArr     "&#8657;"> <!-- U+21D1 ISOamsa   - upwards double arrow                                                  -->
<!ENTITY rArr     "&#8658;"> <!-- U+21D2 ISOtech   - rightwards double arrow                                               -->
<!ENTITY dArr     "&#8659;"> <!-- U+21D3 ISOamsa   - downwards double arrow                                                -->
<!ENTITY hArr     "&#8660;"> <!-- U+21D4 ISOamsa   - left right double arrow                                               -->

<!-- Mathematical Operators -->
<!ENTITY forall   "&#8704;"> <!-- U+2200 ISOtech   - for all                                                               -->
<!ENTITY part     "&#8706;"> <!-- U+2202 ISOtech   - partial differential                                                  -->
<!ENTITY exist    "&#8707;"> <!-- U+2203 ISOtech   - there exists                                                          -->
<!ENTITY empty    "&#8709;"> <!-- U+2205 ISOamso   - empty set = null set = diameter                                       -->
<!ENTITY nabla    "&#8711;"> <!-- U+2207 ISOtech   - nabla = backward difference                                           -->
<!ENTITY isin     "&#8712;"> <!-- U+2208 ISOtech   - element of                                                            -->
<!ENTITY notin    "&#8713;"> <!-- U+2209 ISOtech   - not an element of                                                     -->
<!ENTITY ni       "&#8715;"> <!-- U+220B ISOtech   - contains as member                                                    -->
<!ENTITY prod     "&#8719;"> <!-- U+220F ISOamsb   - n-ary product = product sign                                          -->
<!ENTITY sum      "&#8721;"> <!-- U+2211 ISOamsb   - n-ary summation                                                       -->
<!ENTITY minus    "&#8722;"> <!-- U+2212 ISOtech   - minus sign                                                            -->
<!ENTITY lowast   "&#8727;"> <!-- U+2217 ISOtech   - asterisk operator                                                     -->
<!ENTITY radic    "&#8730;"> <!-- U+221A ISOtech   - square root = radical sign                                            -->
<!ENTITY prop     "&#8733;"> <!-- U+221D ISOtech   - proportional to                                                       -->
<!ENTITY infin    "&#8734;"> <!-- U+221E ISOtech   - infinity                                                              -->
<!ENTITY ang      "&#8736;"> <!-- U+2220 ISOamso   - angle                                                                 -->
<!ENTITY and      "&#8743;"> <!-- U+2227 ISOtech   - logical and = wedge                                                   -->
<!ENTITY or       "&#8744;"> <!-- U+2228 ISOtech   - logical or = vee                                                      -->
<!ENTITY cap      "&#8745;"> <!-- U+2229 ISOtech   - intersection = cap                                                    -->
<!ENTITY cup      "&#8746;"> <!-- U+222A ISOtech   - union = cup                                                           -->
<!ENTITY int      "&#8747;"> <!-- U+222B ISOtech   - integral                                                              -->
<!ENTITY there4   "&#8756;"> <!-- U+2234 ISOtech   - therefore                                                             -->
<!ENTITY sim      "&#8764;"> <!-- U+223C ISOtech   - tilde operator = varies with = similar to                             -->
<!ENTITY cong     "&#8773;"> <!-- U+2245 ISOtech   - approximately equal to                                                -->
<!ENTITY asymp    "&#8776;"> <!-- U+2248 ISOamsr   - almost equal to = asymptotic to                                       -->
<!ENTITY ne       "&#8800;"> <!-- U+2260 ISOtech   - not equal to                                                          -->
<!ENTITY equiv    "&#8801;"> <!-- U+2261 ISOtech   - identical to                                                          -->
<!ENTITY le       "&#8804;"> <!-- U+2264 ISOtech   - less-than or equal to                                                 -->
<!ENTITY ge       "&#8805;"> <!-- U+2265 ISOtech   - greater-than or equal to                                              -->
<!ENTITY sub      "&#8834;"> <!-- U+2282 ISOtech   - subset of                                                             -->
<!ENTITY sup      "&#8835;"> <!-- U+2283 ISOtech   - superset of                                                           -->
<!ENTITY nsub     "&#8836;"> <!-- U+2284 ISOamsn   - not a subset of                                                       -->
<!ENTITY sube     "&#8838;"> <!-- U+2286 ISOtech   - subset of or equal to                                                 -->
<!ENTITY supe     "&#8839;"> <!-- U+2287 ISOtech   - superset of or equal to                                               -->
<!ENTITY oplus    "&#8853;"> <!-- U+2295 ISOamsb   - circled plus = direct sum                                             -->
<!ENTITY otimes   "&#8855;"> <!-- U+2297 ISOamsb   - circled times = vector product                                        -->
<!ENTITY perp     "&#8869;"> <!-- U+22A5 ISOtech   - up tack = orthogonal to = perpendicular                               -->
<!ENTITY sdot     "&#8901;"> <!-- U+22C5 ISOamsb   - dot operator                                                          -->

<!-- Miscellaneous Technical -->
<!ENTITY lceil    "&#8968;"> <!-- U+2308 ISOamsc   - left ceiling = apl upstile                                            -->
<!ENTITY rceil    "&#8969;"> <!-- U+2309 ISOamsc   - right ceiling                                                         -->
<!ENTITY lfloor   "&#8970;"> <!-- U+230A ISOamsc   - left floor = apl downstile                                            -->
<!ENTITY rfloor   "&#8971;"> <!-- U+230B ISOamsc   - right floor                                                           -->
<!ENTITY lang     "&#9001;"> <!-- U+2329 ISOtech   - left-pointing angle bracket = bra                                     -->
<!ENTITY rang     "&#9002;"> <!-- U+232A ISOtech   - right-pointing angle bracket = ket                                    -->

<!-- Geometric Shapes -->
<!ENTITY loz      "&#9674;"> <!-- U+25CA ISOpub    - lozenge                                                               -->

<!-- Miscellaneous Symbols -->
<!ENTITY spades   "&#9824;"> <!-- U+2660 ISOpub    - black spade suit                                                      -->
<!ENTITY clubs    "&#9827;"> <!-- U+2663 ISOpub    - black club suit = shamrock                                            -->
<!ENTITY hearts   "&#9829;"> <!-- U+2665 ISOpub    - black heart suit = valentine                                          -->
<!ENTITY diams    "&#9830;"> <!-- U+2666 ISOpub    - black diamond suit                                                    -->
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991206/ade7393b/installing.htm
From sb at metis.no  Mon Dec  6 14:34:04 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: First interface draft
In-Reply-To: David Megginson's message of "06 Dec 1999 08:43:05 -0500"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whwvqsmo4v.fsf@viffer.oslo.metis.no> <m3zovonq1i.fsf@localhost.localdomain>
Message-ID: <whd7skjfzk.fsf@viffer.oslo.metis.no>

>>>>> David Megginson <david@megginson.com>:

> Steinar Bang <sb@metis.no> writes:
>> And Objectspace Standards<ToolKit> is not compatible with the Standard 
>> C++ Library iostreams of MSVC++.

> I think that this goes beyond the scope of SAX -- we have to be able
> to assume at least a basic level of ANSI-C++ conformance, or else
> we'll end up rewriting the whole standard library.  I'm willing not
> to beat up on the hairier features (like templates), but we have to
> be able to count on the basics.

Then you have to define what this basic level of ANSI-C++ conformance
consists of: templates (at what level), namespaces, standard library
components inside the std:: namespace, the existence of the parts of
the standard C++ library, and which parts.

We decided to go for a basic conformance of standard C++, when we
started the project I'm working on, in the summer of 1995.  We have to
use Standards<ToolKit> precisely because the standard C++ things we
use (such as STL and basic_string<>) that are delivered with MSVC++ are
broken.

Our "basic conformance" consists of using STL (extensively) and
string.  We've stayed away from namespaces, and exotic iostream
features (codecvt<>, templated streams with wide streams).  We've
used templates in our own code, without much incident (the compilers
are Sunpro C++, gcc/egcs and MSVC++), but we've moved away from that
because they caused code bloat.

A minor annoyance with the C++ mapping of CORBA IDL is that it used
an exotic and low-priority (among implementers) feature like
namespaces, and disregarded useful standard C++ things like vector<>
(for sequence<>) and std::basic_string<>.

I would hope to avoid the same reasoning for SAX.


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From sb at metis.no  Mon Dec  6 14:38:20 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++: UTF-8 v UTF-16
In-Reply-To: Lars Marius Garshol's message of "06 Dec 1999 14:59:29 +0100"
References: <14406.58740.871829.541816@localhost.localdomain> <38472FE3.D3BB22BC@jclark.com> <008b01bf3d84$b037d650$c5010180@p197> <3847B8F3.D81B8286@jclark.com> <whyabcqijv.fsf@viffer.oslo.metis.no> <m3emd0npa6.fsf@ifi.uio.no>
Message-ID: <wh9038jfsd.fsf@viffer.oslo.metis.no>

>>>>> Lars Marius Garshol <larsga@garshol.priv.no>:

> * James Clark
>> 
>> Unfortunately wchar_t isn't guaranteed to be UTF-16.  Some platforms
>> make it 32-bits.

> gcc 2.95 on Linux does, at least. I don't know what it does on other
> platforms.

Ugh... that would be one of my target compilers eventually(*). :-/

I saw some discussion on the libstdc++-v3 mailing list about having to
make wchar_t 32 bits, to make it able to hold UCS-4.  I didn't know it
ended up being the case.


(*) I'm currently using 1.2-pre2, because it was the first one that
    worked for me since egcs-1.0.3 and I probably won't upgrade again
    until gcc is reasonably stable.  And I'm currently not using
    wstring and wchar_t
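
A minimal sketch of how code can avoid depending on the width of
wchar_t, assuming a compiler much newer than the ones discussed in this
thread (char16_t, char32_t and constexpr arrived with C++11). The names
wchar_bits, wchar_holds_ucs4 and to_utf16 are made up for illustration
and are not part of any SAX/C++ draft.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <string>

// Number of bits in wchar_t for the compiler building this file
// (typically 16 with MSVC++, 32 with gcc on Linux).
constexpr std::size_t wchar_bits() { return sizeof(wchar_t) * CHAR_BIT; }

// True when a single wchar_t can hold any code point up to U+10FFFF,
// which needs at least 21 bits.
constexpr bool wchar_holds_ucs4() { return wchar_bits() >= 21; }

// Encode one code point as UTF-16 code units without relying on wchar_t.
// Assumes cp is a valid scalar value (not itself a surrogate).
std::u16string to_utf16(char32_t cp) {
    assert(cp <= 0x10FFFF);
    std::u16string out;
    if (cp < 0x10000) {
        out.push_back(static_cast<char16_t>(cp));   // one code unit
    } else {
        cp -= 0x10000;                              // 20 bits remain
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));   // high surrogate
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
    }
    return out;
}
```

An interface that fixes its own 16-bit unit type in this way behaves the
same whether the platform's wchar_t is 16 or 32 bits wide.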



From ray at xmission.com  Mon Dec  6 14:40:21 1999
From: ray at xmission.com (Ray Whitmer)
Date: Mon Jun  7 17:18:23 2004
Subject: SAX/C++ vs. SAX2
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no> <384B8C32.22805488@nag.co.uk> <m31z90p554.fsf@ifi.uio.no>
Message-ID: <384BCE3F.168A7CB@xmission.com>

Lars Marius Garshol wrote:

> * Vilya Harvey
> |
> | Just a thought: why not take a leaf out of the DOM's book and write
> | the canonical version of the SAX interfaces in a language-neutral
> | format like IDL?
>
> This may sound like a good idea, but it has its drawbacks in that one
> is immediately forced into a lowest common denominator design where it
> is impossible to make use of the features that really make each
> language what they are.

Just to clarify, if IDL stub generators were being used with the DOM spec, this would be
true, which is the normal way to use IDL.  This is not how IDL is being used by the DOM
specification.  It simply forms a neutral starting point.

[...]

> Nor are language naming conventions respected. (startElement should
> really be startElement (in Java), start_element (in C++, Python, IDL)
> and start-element (in Common Lisp/Scheme) and there may even be more
> variations.

I don't understand your need to promote arbitrary style differences which have nothing to do
with the language, which your example here seems to demonstrate.

I find the statement that startElement should be start_element in C++ and IDL far from
obvious, although it may need to be true now for legacy reasons.  The mixed casing that Java
uses was borrowed from C++ specs, and is common there.

> As a general reference and statement of intent it might have some
> value, but I really think translation should be done by humans. The
> main advantage feature of IDL, cross-process and cross-language
> interoperability, is not really all that valuable for SAX anyway.

I agree, and this is the philosophy behind the DOM's use of IDL -- let each binding adapt it
as necessary (into a single spec for that binding, not in as many different ways as desired).

The disadvantage of using IDL is that people will try to use it with an IDL compiler, and/or
neglect to publish single human-derived bindings for specific languages, so a variety of
mutations could spring up for a particular language, as has happened in certain cases with
DOM.

Ray Whitmer
ray@xmission.com




From sb at metis.no  Mon Dec  6 14:41:03 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++: Changes for C++
In-Reply-To: Lars Marius Garshol's message of "06 Dec 1999 15:06:25 +0100"
References: <14406.59075.218048.437305@localhost.localdomain> <wh7liwthnr.fsf@viffer.oslo.metis.no> <m3d7sknoym.fsf@ifi.uio.no>
Message-ID: <wh4sdwjfon.fsf@viffer.oslo.metis.no>

>>>>> Lars Marius Garshol <larsga@garshol.priv.no>:

> We added support for this as an extension in the Python version of
> SAX, since several of the Python parsers support this (xmllib,
> xmlproc and pyexpat). This was simply done by adding three methods
> on the extended parser interface: reset, feed and close.

> For C++ SAX2 this might be done through a property
> (http://.../push-stream) which returns a PushStream implementation
> with these three methods to allow you to push data into parsers
> which support this.

Either one of these would be fine for me.

> Some means of specifying the URL of the document entity is probably
> also a good idea, for resolution of relative URLs.

...as well as for error message output.
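
A rough sketch of what such a push interface might look like in C++,
assuming only the three method names mentioned above (reset, feed,
close) and a document URL supplied at construction. PushStream and
BufferingPushStream are hypothetical names, and the buffering class is a
stand-in for a parser: a real implementation would fire SAX events from
feed() instead of storing the bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical push interface; not part of any SAX draft.
class PushStream {
public:
    virtual ~PushStream() {}
    virtual void reset() = 0;                                    // drop any partial document
    virtual void feed(const char* data, std::size_t length) = 0; // push the next chunk
    virtual void close() = 0;                                    // signal end of input
};

// Toy implementation that only accumulates what it is fed.
class BufferingPushStream : public PushStream {
public:
    // systemId is the URL of the document entity, kept for resolving
    // relative URLs and for error messages.
    explicit BufferingPushStream(const std::string& systemId)
        : systemId_(systemId), closed_(false) {}

    void reset() override { buffer_.clear(); closed_ = false; }

    void feed(const char* data, std::size_t length) override {
        assert(data != nullptr || length == 0);
        if (closed_)
            throw std::logic_error(systemId_ + ": feed() after close()");
        buffer_.append(data, length);
    }

    void close() override { closed_ = true; }

    const std::string& systemId() const { return systemId_; }
    const std::string& buffered() const { return buffer_; }
    bool closed() const { return closed_; }

private:
    std::string systemId_;
    std::string buffer_;
    bool closed_;
};
```

A caller would feed chunks as they arrive from a socket or file and call
close() once the input ends; reset() lets the same object be reused for
another document.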



From stele at fxtech.com  Mon Dec  6 14:47:41 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain>
Message-ID: <384BCC51.936AA275@fxtech.com>

This is what I had in mind. Consider this (contrived) XML data-file,
that consists of a Title, Author, and one or more Paragraph elements:

<Document name="mydoc.doc">
	<Title>Sample XML Document</Title>
	<Author email="paul@fxtech.com">Paul Miller</Author>
	<Paragraph>
		This is the first paragraph.
	</Paragraph>
	<Paragraph>
		This is the second paragraph.
	</Paragraph>
</Document>

Now, expat and SAX only give you the elements, so you have to keep track
of where you are in the document in the element handler yourself. What I
have in mind is a nestable set of registered element handlers,
implemented as callbacks. The callbacks are static function pointers,
since I want a non-intrusive design.
With this example, I assume two primary classes (Document and
Paragraph). Although Title and Author are represented as elements here,
they are really attributes of the Document object. Now consider this
code to parse it:

void ParseDocument(XML::InputStream &in)
{
	XML::ElementHandler handlers[] = {
		XML::ElementHandler("Document", sParseDocument),
		XML::ElementHandler::END
	};
	in.Parse(handlers, NULL);	// NULL is optional user-data
}

static void 
sParseDocument(XML::InputStream &in, XML::Element &elem, void *userData)
{
	// query the name attribute
	std::string docName;
	elem.GetAttribute("name", docName);
	// create a new document with this name
	Document *doc = new Document(docName);

	XML::ElementHandler handlers[] = {
		XML::ElementHandler("Title", sParseTitle),
		XML::ElementHandler("Author", sParseAuthor),
		XML::ElementHandler("Paragraph", sParseParagraph),
		XML::ElementHandler::END
	};

	// parse the document elements
	in.Parse(handlers, doc);
}

// static member of Document ("static" goes on the in-class declaration only)
void 
Document::sParseTitle(XML::InputStream &in, XML::Element &elem, void *userData)
{
	Document *doc = (Document *)userData;
	doc->SetTitle(elem.GetData());
}

// static member of Document
void 
Document::sParseAuthor(XML::InputStream &in, XML::Element &elem, void *userData)
{
	Document *doc = (Document *)userData;
	// query the email attribute, as done for "name" in sParseDocument
	std::string email;
	elem.GetAttribute("email", email);
	doc->SetAuthor(elem.GetData(), email);
}

// static member of Document
void 
Document::sParseParagraph(XML::InputStream &in, XML::Element &elem, void *userData)
{
	Document *doc = (Document *)userData;
	// each Paragraph parses its own element
	Paragraph *para = new Paragraph;
	para->Parse(in, elem);
	doc->AddParagraph(para);
}

void Paragraph::Parse(XML::InputStream &in, XML::Element &elem)
{
	SetText(elem.GetData());
}

The major idea here is you register everything up-front, and
element-specific callbacks get called to deal with specific elements.
You can start up parsing inside an element, so you can nest parsing at
the object level.
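
To make the dispatch idea concrete, here is a small self-contained
sketch. It is not the XML:: API proposed above: there is no parser or
input stream, the element tree is built by hand, and the names Element,
HandlerTable, parseChildren and loadSample are invented for the
illustration. It only shows the nesting of handler tables keyed by
element name, with function-pointer callbacks and a user-data pointer.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Stand-in for a parsed element (assumption: a pre-built tree replaces the stream).
struct Element {
    std::string name;
    std::map<std::string, std::string> attrs;
    std::string data;
    std::vector<Element> children;
};

// Callback shape modeled on the message: plain function pointer plus user data.
typedef void (*Handler)(const Element& elem, void* userData);
typedef std::map<std::string, Handler> HandlerTable;

// Call the registered handler for each child; unregistered names are skipped.
void parseChildren(const Element& parent, const HandlerTable& handlers, void* userData) {
    for (const Element& child : parent.children) {
        HandlerTable::const_iterator it = handlers.find(child.name);
        if (it != handlers.end()) it->second(child, userData);
    }
}

struct Document {
    std::string name, title, author, email;
    std::vector<std::string> paragraphs;
};

static void parseTitle(const Element& e, void* ud) {
    static_cast<Document*>(ud)->title = e.data;
}

static void parseAuthor(const Element& e, void* ud) {
    Document* doc = static_cast<Document*>(ud);
    doc->author = e.data;
    std::map<std::string, std::string>::const_iterator it = e.attrs.find("email");
    if (it != e.attrs.end()) doc->email = it->second;
}

static void parseParagraph(const Element& e, void* ud) {
    static_cast<Document*>(ud)->paragraphs.push_back(e.data);
}

// Handler for <Document>: registers a nested table for its own children.
static void parseDocument(const Element& e, void* ud) {
    assert(ud != nullptr);
    Document* doc = static_cast<Document*>(ud);
    std::map<std::string, std::string>::const_iterator it = e.attrs.find("name");
    if (it != e.attrs.end()) doc->name = it->second;

    HandlerTable nested;
    nested["Title"] = parseTitle;
    nested["Author"] = parseAuthor;
    nested["Paragraph"] = parseParagraph;
    parseChildren(e, nested, doc);
}

static Element makeElement(const std::string& name, const std::string& data) {
    Element e;
    e.name = name;
    e.data = data;
    return e;
}

// Builds the sample document from the message by hand and runs the handlers.
Document loadSample() {
    Element doc = makeElement("Document", "");
    doc.attrs["name"] = "mydoc.doc";
    doc.children.push_back(makeElement("Title", "Sample XML Document"));
    Element author = makeElement("Author", "Paul Miller");
    author.attrs["email"] = "paul@fxtech.com";
    doc.children.push_back(author);
    doc.children.push_back(makeElement("Paragraph", "This is the first paragraph."));
    doc.children.push_back(makeElement("Paragraph", "This is the second paragraph."));

    Element root;                    // synthetic parent of the top-level element
    root.children.push_back(doc);

    Document result;
    HandlerTable top;
    top["Document"] = parseDocument;
    parseChildren(root, top, &result);
    return result;
}
```

Calling loadSample() fills a Document with the name, title, author,
email and two paragraphs of the sample file. A streaming version would
replace parseChildren with a loop over parser events.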

Comments?
	
--
Paul Miller - stele@fxtech.com



From john.aldridge at informatix.co.uk  Mon Dec  6 15:01:44 1999
From: john.aldridge at informatix.co.uk (John Aldridge)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <whd7skjfzk.fsf@viffer.oslo.metis.no>
References: <David Megginson's message of "06 Dec 1999 08:43:05 -0500">
 <14406.59198.949047.2487@localhost.localdomain>
 <38474BAF.AF4CFF2D@jclark.com>
 <whwvqsmo4v.fsf@viffer.oslo.metis.no>
 <m3zovonq1i.fsf@localhost.localdomain>
Message-ID: <3.0.6.32.19991206150103.009a1c10@mailhost>

At 15:33 06/12/99 +0100, Steinar Bang <sb@metis.no> wrote:
>Our "basic conformance" consists of using STL (extensively) and
>string.  We've stayed away from namespaces, and exotic iostream
>features (codecvt<>, templated streams with wide streams).  We've
>used templates in our own code, without much incident (the compilers
>are Sunpro C++, gcc/egcs and MSVC++), but we've moved away from that
>because they caused code bloat.

We're using MSVC 6 here, and basic_string<> seems fine.  We use templates
extensively (both the STL and our own), and they too give little trouble
_except_ when it comes to exporting template instantiations across DLL
boundaries, which takes considerable care (but can usually be managed).
Namespaces are fine too.

Member templates only half work, and are probably worth avoiding; and I've
no detailed experience with iostreams, so cannot comment on how safe it is
to dig into the murky corners there.

I've got a reasonable amount of experience with DEC C++, and it's also in
good shape in these regards.  I've no recent Unix/g++ experience, though.

I think the days of having to avoid large chunks of the C++ standard are
largely over, thank heavens.
-- 
Cheers,
John



From eoin_lane at esatclear.ie  Mon Dec  6 15:02:08 1999
From: eoin_lane at esatclear.ie (Eoin Lane)
Date: Mon Jun  7 17:18:24 2004
Subject: PSGML-1.2.1 problems
Message-ID: <384BCFC8.25F7721E@esatclear.ie>

I'm trying to write an XML doc with Emacs configured to use psgml-1.2.1
but am having some problems. I have checked that psgml works with a
simple DTD. However, when I use the DTD (document-v10.dtd) below I get
the following error.

~/character.ent line 2 col 12 entity common.att
~/document-v10.dtd line 218 col 29 entity DOCUMENT
~/installing.xml line 3 col 51
Name expected; at: :lang

Could anyone tell me what I am doing wrong? I know the DTD is
correct because I checked it with IBM's XML4J parser and it validated. It
would be of great benefit to me if I could use the DTD in Emacs, so any
help would be greatly appreciated.

Eoin.


--

Dr. Eoin Lane
InConn Technologies Ltd.
17 Washington St.
Cork.
Tel. (021) 271855 Fax (021) 272419
http://www.inconn.ie
mailto:eoinlane@esatclear.ie


-------------- next part --------------
<!-- ===================================================================

     Apache Documentation DTD (Version 1.0)

PURPOSE:
  This DTD was developed to create a simple yet powerful document
  type for software documentation for use with the Apache projects.
  It is an XML-compliant DTD and it's maintained by the Apache XML
  project.

TYPICAL INVOCATION:

  <!DOCTYPE document PUBLIC
       "-//APACHE//DTD Documentation Vx.yz//EN"
       "http://xml.apache.org/DTD/document-vxyz.dtd">

  where

    x := major version
    y := minor version
    z := status identifier (optional)

NOTES:
  Many of the design patterns used in this DTD were taken from the
  W3C XML Specification DTD edited by Eve Maler <elm@arbortext.com>.

  Where possible, great care has been used to reutilize HTML tag
  names to reduce learning efforts and to allow HTML editors to be
  used for complex authorings like tables and lists.

AUTHORS:
  Stefano Mazzocchi <stefano@apache.org>

FIXME:
  - how can we include char entities without hardwiring them?
  - should "form" tags be included?
  - should all style-free HTML 4.0 markup tags be included?
  - how do we handle the idea of "soft" xlinks?
  - should we add "soft" links to images?

CHANGE HISTORY:
  19991121 Initial version. (SM)
  19991123 Replaced "res" with more standard "strong" for emphasis. (SM)
  19991124 Added "fork" element for window forking behavior. (SM)
  19991124 Added "img-inline" element to separate from "img". (SM)
  19991129 Removed "affiliation" from "author". (SM)
  19991129 Made "author" empty and moved "name|email" as attributes (SM)

COPYRIGHT:
  Copyright (c) 1999 The Apache Software Foundation.

  Permission to copy in any form is granted provided this notice is
  included in all copies. Permission to redistribute is granted
  provided this file is distributed untouched in all its parts and
  included files.

==================================================================== -->


<!-- =============================================================== -->
<!-- Common character entities (included from external file) -->
<!-- =============================================================== -->

<!-- FIXME (SM): this is hardcoding. Find a better way of doing this
     possibly using public identifiers of ISO latin char sets -->
<!ENTITY % charEntity SYSTEM "characters.ent">
%charEntity;


<!-- =============================================================== -->
<!-- Useful entities for increased DTD readability -->
<!-- =============================================================== -->

<!ENTITY % text "#PCDATA">


<!-- =============================================================== -->
<!-- Entities for general XML compliance -->
<!-- =============================================================== -->

<!-- Common attributes
        Every element has an ID attribute (sometimes required,
        but usually optional) for links, and a Role attribute
        for extending the useful life of the DTD by allowing
        authors to make subclasses for any element. %common.att;
        is for common attributes where the ID is optional, and
        %common-idreq.att; is for common attributes where the
        ID is required.
-->
<!ENTITY % common.att
        'id                     ID              #IMPLIED
         xml:lang               NMTOKEN         #IMPLIED
         role                   NMTOKEN         #IMPLIED'>
<!ENTITY % common-idreq.att
        'id                     ID              #REQUIRED
         xml:lang               NMTOKEN         #IMPLIED
         role                   NMTOKEN         #IMPLIED'>


<!-- xml:space attribute ===============================================
        Indicates that the element contains white space
        that the formatter or other application should retain,
        as appropriate to its function.
==================================================================== -->
<!ENTITY % xmlspace.att
        'xml:space (default|preserve) #FIXED "preserve"'>


<!-- def attribute =====================================================
        Points to the element where the relevant definition can be
        found, using the IDREF mechanism.  %def.att; is for optional
        def attributes, and %def-req.att; is for required def
        attributes.
==================================================================== -->
<!ENTITY % def.att
        'def                    IDREF           #IMPLIED'>
<!ENTITY % def-req.att
        'def                    IDREF           #REQUIRED'>


<!-- ref attribute =====================================================
        Points to the element where more information can be found,
        using the IDREF mechanism.  %ref.att; is for optional
        ref attributes, and %ref-req.att; is for required ref
        attributes.
================================================================== -->
<!ENTITY % ref.att
        'ref                    IDREF           #IMPLIED'>
<!ENTITY % ref-req.att
        'ref                    IDREF           #REQUIRED'>


<!-- =============================================================== -->
<!-- Entities for XLink compliance -->
<!-- =============================================================== -->

<!ENTITY % xlink-simple.att
        'type      (simple|extended|locator|arc) #FIXED "simple"
         href      CDATA                         #IMPLIED
         role      CDATA                         #IMPLIED
         title     CDATA                         #IMPLIED '>
<!--    'xmlns     CDATA                         #FIXED "http://www.w3.org/XML/XLink/0.9" -->
<!-- FIXME: brain-dead IE5 has broken support for namespace
     validation; since I use it for editing, the xmlns declaration
     above is removed for now -->

<!ENTITY % xlink-user-replace.att
        'show      (new|parsed|replace)   #FIXED "replace"
         actuate   (user|auto)            #FIXED "user" '>

<!ENTITY % xlink-user-new.att
        'show      (new|parsed|replace)   #FIXED "new"
         actuate   (user|auto)            #FIXED "user" '>

<!ENTITY % xlink-auto-parsed.att
        'show      (new|parsed|replace)   #FIXED "parsed"
         actuate   (user|auto)            #FIXED "auto" '>

<!-- FIXME (SM): XLink doesn't yet cover the idea of soft links, so
     introducing it here using the same namespace is _somewhat_
     illegal. Should we create its own namespace?
-->
<!ENTITY % xlink-soft.att
        'mode      (hard|soft)            #FIXED "soft" '>


<!-- =============================================================== -->
<!-- Entities for general usage -->
<!-- =============================================================== -->


<!-- Key attribute =====================================================
        Optionally provides a sorting or indexing key, for cases when
        the element content is inappropriate for this purpose.
==================================================================== -->
<!ENTITY % key.att
        'key                    CDATA           #IMPLIED'>


<!-- Title attributes ==================================================
        Indicates that the element must have a title.
==================================================================== -->
<!ENTITY % title.att
        'title                  CDATA           #REQUIRED'>


<!-- Name attributes ==================================================
        Indicates that the element must have a name.
==================================================================== -->
<!ENTITY % name.att
        'name                   CDATA           #REQUIRED'>


<!-- Email attributes ==================================================
        Indicates that the element must have an email.
==================================================================== -->
<!ENTITY % email.att
        'email                  CDATA           #REQUIRED'>


<!-- =============================================================== -->
<!-- General definitions -->
<!-- =============================================================== -->

<!-- A person is a general human entity -->
<!ELEMENT person EMPTY>
<!ATTLIST person %common.att;
                 %name.att;
                 %email.att;>


<!-- =============================================================== -->
<!-- Content definitions -->
<!-- =============================================================== -->

<!ENTITY % local.content.mix "">

<!ENTITY % markup "strong|em|code|sub|sup">

<!ENTITY % links "link|connect|jump|fork|anchor">

<!ENTITY % special "br|img">

<!ENTITY % link-content.mix "%text;|%markup;|%special;%local.content.mix;">

<!ENTITY % content.mix "%link-content.mix;|%links;">

    <!-- ==================================================== -->
    <!-- Phrase Markup -->
    <!-- ==================================================== -->

    <!-- Strong (typically bold) -->
    <!ELEMENT strong (%text;)>
    <!ATTLIST strong %common.att;>

    <!-- Emphasis (typically italic) -->
    <!ELEMENT em (%text;)>
    <!ATTLIST em %common.att;>

    <!-- Code (typically monospaced) -->
    <!ELEMENT code (%text;)>
    <!ATTLIST code %common.att;>

    <!-- Superscript (typically smaller and higher) -->
    <!ELEMENT sup (%text;)>
    <!ATTLIST sup %common.att;>

    <!-- Subscript (typically smaller and lower) -->
    <!ELEMENT sub (%text;)>
    <!ATTLIST sub %common.att;>

    <!-- FIXME (SM): should we add these HTML 4.0 phrase elements,
         which are style-free?

          -dfn
          -samp
          -kbd
          -var
          -cite
          -abbr
          -acronym

     -->

    <!-- ==================================================== -->
    <!-- Hypertextual Links -->
    <!-- ==================================================== -->

    <!-- Hard replacing link (equivalent of <a ...>) -->
    <!ELEMENT link (%link-content.mix;)*>
    <!ATTLIST link %common.att;
                   %xlink-simple.att;
                   %xlink-user-replace.att;>

    <!-- Hard window replacing link (equivalent of <a ... target="_top">) -->
    <!ELEMENT jump (%link-content.mix;)*>
    <!ATTLIST jump %common.att;
                   %xlink-simple.att;
                   %xlink-user-new.att;>

    <!-- Hard window forking link (equivalent of <a ... target="_new">) -->
    <!ELEMENT fork (%link-content.mix;)*>
    <!ATTLIST fork %common.att;
                   %xlink-simple.att;
                   %xlink-user-new.att;>

    <!-- Anchor point (equivalent of <a name="...">) -->
    <!ELEMENT anchor EMPTY>
    <!ATTLIST anchor %common-idreq.att;>

    <!-- Soft link between processed pages (no equivalent in HTML) -->
    <!ELEMENT connect (%link-content.mix;)*>
    <!ATTLIST connect %common.att;
                      %xlink-simple.att;
                      %xlink-user-replace.att;
                      %xlink-soft.att;>
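
    <!-- Usage sketch (illustrative only; the href and id values are
         made up). The #FIXED XLink attributes (type, show, actuate,
         mode) are defaulted by a processor that reads the DTD, so an
         instance normally only writes href:

           <p>
             <anchor id="top"/>
             See the <link href="faq.xml">FAQ</link>, open the
             <fork href="http://xml.apache.org/">project site</fork>
             in a new window, or continue to the
             <connect href="install.xml">installation notes</connect>.
           </p>
    -->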

    <!-- ==================================================== -->
    <!-- Specials -->
    <!-- ==================================================== -->

    <!-- Line Break Object (typically forces a line break) -->
    <!ELEMENT br EMPTY>
    <!ATTLIST br %common.att;>

    <!-- Image Object (typically an inlined image) -->
    <!-- FIXME (SM): should we have the notion of soft links even here
         for inlined objects? -->
    <!ELEMENT img EMPTY>
    <!ATTLIST img src    CDATA  #REQUIRED
                  alt    CDATA  #REQUIRED
                  height CDATA  #IMPLIED
                  width  CDATA  #IMPLIED
                  usemap CDATA  #IMPLIED
                  ismap  (ismap) #IMPLIED
                  %common.att;>


<!-- =============================================================== -->
<!-- Block definitions -->
<!-- =============================================================== -->

<!ENTITY % local.blocks "">

<!ENTITY % local.lists "">

<!ENTITY % paragraphs "p|source|note|fixme|img-block">

<!ENTITY % tables "table">

<!ENTITY % lists "ol|ul|sl|dl %local.lists;">

<!ENTITY % blocks "%paragraphs;|%tables;|%lists; %local.blocks;">
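
<!-- Extension sketch (an assumption about intended use; "figure" and
     the "document.dtd" system identifier are hypothetical). The empty
     local.* entities above look like extension hooks: because the
     internal subset is read before the external one, a document can
     redeclare a hook, and the first declaration wins. The value must
     start with "|" so that it continues the choice list:

       <!DOCTYPE document SYSTEM "document.dtd" [
         <!ENTITY % local.blocks "|figure">
         <!ELEMENT figure EMPTY>
         <!ATTLIST figure src CDATA #REQUIRED>
       ]>

     With this, %blocks; also admits figure wherever a block is
     allowed, for example inside s1.
-->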

    <!-- ==================================================== -->
    <!-- Paragraphs -->
    <!-- ==================================================== -->

    <!-- Text Paragraph (normally delimited by vertical space) -->
    <!ELEMENT p (%content.mix;)*>
    <!ATTLIST p %common.att;>

    <!-- Source Paragraph (normally space is preserved) -->
    <!ELEMENT source (%content.mix;)*>
    <!ATTLIST source %common.att;
                     %xmlspace.att;>

    <!-- Note Paragraph (normally shown encapsulated) -->
    <!ELEMENT note (%content.mix;)*>
    <!ATTLIST note %common.att;>

    <!-- Fixme Paragraph (normally not shown) -->
    <!ELEMENT fixme (%content.mix;)*>
    <!-- the "author" attribute should match the "key" attribute of the
         <author> element -->
    <!ATTLIST fixme author CDATA #REQUIRED
                    %common.att;>

    <!-- ==================================================== -->
    <!-- Tables -->
    <!-- ==================================================== -->

    <!ENTITY % cellhalign.att
            'align          (left|center
                            |right|justify
                            |char)          #IMPLIED
            char            CDATA           #IMPLIED
            charoff         CDATA           #IMPLIED'>

    <!ENTITY % cellvalign.att
            'valign         (top|middle
                            |bottom
                            |baseline)      #IMPLIED'>

    <!ENTITY % thtd.att
            'abbr           CDATA           #IMPLIED
            axis            CDATA           #IMPLIED
            headers         IDREFS          #IMPLIED
            scope           (row
                            |col
                            |rowgroup
                            |colgroup)      #IMPLIED
            rowspan         NMTOKEN         "1"
            colspan         NMTOKEN         "1"'>

    <!ENTITY % width.att
            'width          CDATA           #IMPLIED'>

    <!ENTITY % span.att
            'span           NMTOKEN         "1"'>


    <!-- Table (based on the IETF HTML table standard [RFC1942]) -->
    <!ELEMENT table
            (caption?, (col*|colgroup*), thead?, tfoot?, tbody+)>
    <!ATTLIST table
            %common.att;
            %width.att;
            summary         CDATA           #IMPLIED
            border          CDATA           #IMPLIED
            frame           (void|above
                            |below|hsides
                            |lhs|rhs
                            |vsides|box
                            |border)        #IMPLIED
            rules           (none|groups
                            |rows|cols
                            |all)           #IMPLIED
            cellspacing     CDATA           #IMPLIED
            cellpadding     CDATA           #IMPLIED>

        <!ELEMENT caption (%content.mix;)*>
        <!ATTLIST caption %common.att;>

        <!ELEMENT colgroup (col)*>
        <!ATTLIST colgroup
                %common.att;
                %span.att;
                %width.att;
                %cellhalign.att;
                %cellvalign.att;>

            <!ELEMENT col EMPTY>
            <!ATTLIST col
                    %common.att;
                    %span.att;
                    %width.att;
                    %cellhalign.att;
                    %cellvalign.att;>

        <!ELEMENT thead (tr)+>
        <!ATTLIST thead
                %common.att;
                %cellhalign.att;
                %cellvalign.att;>

        <!ELEMENT tfoot (tr)+>
        <!ATTLIST tfoot
                %common.att;
                %cellhalign.att;
                %cellvalign.att;>

        <!ELEMENT tbody (tr)+>
        <!ATTLIST tbody
                %common.att;
                %cellhalign.att;
                %cellvalign.att;>

            <!ELEMENT tr (th|td)+>
            <!ATTLIST tr
                    %common.att;
                    %cellhalign.att;
                    %cellvalign.att;>

                <!ELEMENT th (%content.mix;)*>
                <!ATTLIST th
                        %common.att;
                        %thtd.att;
                        %cellhalign.att;
                        %cellvalign.att;>

                <!ELEMENT td (%content.mix;)*>
                <!ATTLIST td
                        %common.att;
                        %thtd.att;
                        %cellhalign.att;
                        %cellvalign.att;>
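
    <!-- Usage sketch (illustrative only; the cell text is placeholder
         data). The content model requires at least one tbody, and a
         tfoot, when present, comes before the tbody elements:

           <table>
             <caption>Sample table</caption>
             <thead>
               <tr><th>Name</th><th>Value</th></tr>
             </thead>
             <tbody>
               <tr><td>first</td><td align="right">1</td></tr>
               <tr><td colspan="2">spans both columns</td></tr>
             </tbody>
           </table>
    -->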

    <!-- ==================================================== -->
    <!-- Lists -->
    <!-- ==================================================== -->

    <!-- Unordered list (typically bulleted) -->
    <!ELEMENT ul (li|%lists;)+>
    <!--    spacing attribute:
            Use "normal" to get normal vertical spacing for items;
            use "compact" to get less spacing.  The default is dependent
            on the stylesheet. -->
    <!ATTLIST ul
            %common.att;
            spacing         (normal|compact)        #IMPLIED>

    <!-- Ordered list (typically numbered) -->
    <!ELEMENT ol (li|%lists;)+>
    <!--    spacing attribute:
            Use "normal" to get normal vertical spacing for items;
            use "compact" to get less spacing.  The default is dependent
            on the stylesheet. -->
    <!ATTLIST ol
            %common.att;
            spacing         (normal|compact)        #IMPLIED>

    <!-- Simple list (typically with no mark) -->
    <!ELEMENT sl (li|%lists;)+>
    <!ATTLIST sl %common.att;>

        <!-- List item -->
        <!ELEMENT li (%content.mix;|%lists;)*>
        <!ATTLIST li %common.att;>

    <!-- Definition list (typically two-column) -->
    <!ELEMENT dl (dt,dd)+>
    <!ATTLIST dl %common.att;>

        <!-- Definition term -->
        <!ELEMENT dt (%content.mix;)*>
        <!ATTLIST dt %common.att;>

        <!-- Definition description -->
        <!ELEMENT dd (%content.mix;)*>
        <!ATTLIST dd %common.att;>
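
    <!-- Usage sketch (illustrative only). A nested list may appear
         either inside an li or directly as a child of the outer list,
         and every dt in a dl must be followed by exactly one dd:

           <ul spacing="compact">
             <li>first item</li>
             <li>second item
               <ol>
                 <li>nested step</li>
               </ol>
             </li>
           </ul>
           <dl>
             <dt>DTD</dt>
             <dd>Document Type Definition</dd>
           </dl>
    -->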

    <!-- ==================================================== -->
    <!-- Special Blocks -->
    <!-- ==================================================== -->

    <!-- Image Block (typically a separated and centered image) -->
    <!-- FIXME (SM): should we have the notion of soft links even here
         for inlined objects? -->
    <!ELEMENT img-block EMPTY>
    <!ATTLIST img-block src    CDATA  #REQUIRED
                        alt    CDATA  #REQUIRED
                        height CDATA  #IMPLIED
                        width  CDATA  #IMPLIED
                        usemap CDATA  #IMPLIED
                        ismap  (ismap) #IMPLIED
                        %common.att;>


<!-- =============================================================== -->
<!-- Document -->
<!-- =============================================================== -->

<!ELEMENT document (header?, body, footer?)>
<!ATTLIST document %common.att;>

    <!-- ==================================================== -->
    <!-- Header -->
    <!-- ==================================================== -->

    <!ENTITY % local.headers "">

    <!ELEMENT header (title, subtitle?, version?, type?, authors,
                      notice*, abstract? %local.headers;)>
    <!ATTLIST header %common.att;>

    <!ELEMENT title (%text;)>
    <!ATTLIST title %common.att;>

    <!ELEMENT subtitle (%text;)>
    <!ATTLIST subtitle %common.att;>

    <!ELEMENT version (%text;)>
    <!ATTLIST version %common.att;>

    <!ELEMENT type (%text;)>
    <!ATTLIST type %common.att;>

    <!ELEMENT authors (person+)>
    <!ATTLIST authors %common.att;>

    <!ELEMENT notice (%content.mix;)*>
    <!ATTLIST notice %common.att;>

    <!ELEMENT abstract (%content.mix;)*>
    <!ATTLIST abstract %common.att;>

    <!-- ==================================================== -->
    <!-- Body -->
    <!-- ==================================================== -->

    <!ENTITY % local.sections "">

    <!ENTITY % sections "s1 %local.sections;">

    <!ELEMENT body (%sections;)+>
    <!ATTLIST body %common.att;>

        <!ELEMENT s1 (s2|%blocks;)*>
        <!ATTLIST s1 %title.att; %common.att;>

            <!ELEMENT s2 (s3|%blocks;)*>
            <!ATTLIST s2 %title.att; %common.att;>

                <!ELEMENT s3 (s4|%blocks;)*>
                <!ATTLIST s3 %title.att; %common.att;>

                    <!ELEMENT s4 (%blocks;)*>
                    <!ATTLIST s4 %title.att; %common.att;>

    <!-- ==================================================== -->
    <!-- Footer -->
    <!-- ==================================================== -->

    <!ENTITY % local.footers "">

    <!ELEMENT footer (legal %local.footers;)>
    <!ATTLIST footer %common.att;>

        <!ELEMENT legal (%content.mix;)*>
        <!ATTLIST legal %common.att;>
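
<!-- Minimal instance sketch (illustrative only; the "document.dtd"
     system identifier and all names and text are made up). The header
     is optional, but when present it needs a title and authors; the
     body needs at least one s1, and every section needs a title
     attribute:

       <?xml version="1.0"?>
       <!DOCTYPE document SYSTEM "document.dtd">
       <document>
         <header>
           <title>Sample document</title>
           <authors>
             <person name="Jane Doe" email="jane@example.org"/>
           </authors>
         </header>
         <body>
           <s1 title="Introduction">
             <p>Some <em>emphasized</em> text.</p>
           </s1>
         </body>
       </document>
-->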

<!-- =============================================================== -->
<!-- End of DTD -->
<!-- =============================================================== -->
-------------- next part --------------
<!-- 
     Portions (C) International Organization for Standardization 1986
     Permission to copy in any form is granted for use with
     conforming SGML systems and applications as defined in
     ISO 8879, provided this notice is included in all copies.
-->

<!-- 
     Character entity set.
-->

<!-- Latin A -->
<!ENTITY nbsp     "&#160;">  <!-- U+00A0 ISOnum    - no-break space = non-breaking space                                   -->
<!ENTITY iexcl    "&#161;">  <!-- U+00A1 ISOnum    - inverted exclamation mark                                             -->
<!ENTITY cent     "&#162;">  <!-- U+00A2 ISOnum    - cent sign                                                             -->
<!ENTITY pound    "&#163;">  <!-- U+00A3 ISOnum    - pound sign                                                            -->
<!ENTITY curren   "&#164;">  <!-- U+00A4 ISOnum    - currency sign                                                         -->
<!ENTITY yen      "&#165;">  <!-- U+00A5 ISOnum    - yen sign = yuan sign                                                  -->
<!ENTITY brvbar   "&#166;">  <!-- U+00A6 ISOnum    - broken bar = broken vertical bar                                      -->
<!ENTITY sect     "&#167;">  <!-- U+00A7 ISOnum    - section sign                                                          -->
<!ENTITY uml      "&#168;">  <!-- U+00A8 ISOdia    - diaeresis = spacing diaeresis                                         -->
<!ENTITY copy     "&#169;">  <!-- U+00A9 ISOnum    - copyright sign                                                        -->
<!ENTITY ordf     "&#170;">  <!-- U+00AA ISOnum    - feminine ordinal indicator                                            -->
<!ENTITY laquo    "&#171;">  <!-- U+00AB ISOnum    - left-pointing double angle quotation mark = left pointing guillemet   -->
<!ENTITY not      "&#172;">  <!-- U+00AC ISOnum    - not sign                                                              -->
<!ENTITY shy      "&#173;">  <!-- U+00AD ISOnum    - soft hyphen = discretionary hyphen                                    -->
<!ENTITY reg      "&#174;">  <!-- U+00AE ISOnum    - registered sign = registered trade mark sign                          -->
<!ENTITY macr     "&#175;">  <!-- U+00AF ISOdia    - macron = spacing macron = overline = APL overbar                      -->
<!ENTITY deg      "&#176;">  <!-- U+00B0 ISOnum    - degree sign                                                           -->
<!ENTITY plusmn   "&#177;">  <!-- U+00B1 ISOnum    - plus-minus sign = plus-or-minus sign                                  -->
<!ENTITY sup2     "&#178;">  <!-- U+00B2 ISOnum    - superscript two = superscript digit two = squared                     -->
<!ENTITY sup3     "&#179;">  <!-- U+00B3 ISOnum    - superscript three = superscript digit three = cubed                   -->
<!ENTITY acute    "&#180;">  <!-- U+00B4 ISOdia    - acute accent = spacing acute                                          -->
<!ENTITY micro    "&#181;">  <!-- U+00B5 ISOnum    - micro sign                                                            -->
<!ENTITY para     "&#182;">  <!-- U+00B6 ISOnum    - pilcrow sign = paragraph sign                                         -->
<!ENTITY middot   "&#183;">  <!-- U+00B7 ISOnum    - middle dot = Georgian comma = Greek middle dot                        -->
<!ENTITY cedil    "&#184;">  <!-- U+00B8 ISOdia    - cedilla = spacing cedilla                                             -->
<!ENTITY sup1     "&#185;">  <!-- U+00B9 ISOnum    - superscript one = superscript digit one                               -->
<!ENTITY ordm     "&#186;">  <!-- U+00BA ISOnum    - masculine ordinal indicator                                           -->
<!ENTITY raquo    "&#187;">  <!-- U+00BB ISOnum    - right-pointing double angle quotation mark = right pointing guillemet -->
<!ENTITY frac14   "&#188;">  <!-- U+00BC ISOnum    - vulgar fraction one quarter = fraction one quarter                    -->
<!ENTITY frac12   "&#189;">  <!-- U+00BD ISOnum    - vulgar fraction one half = fraction one half                          -->
<!ENTITY frac34   "&#190;">  <!-- U+00BE ISOnum    - vulgar fraction three quarters = fraction three quarters              -->
<!ENTITY iquest   "&#191;">  <!-- U+00BF ISOnum    - inverted question mark = turned question mark                         -->
<!ENTITY Agrave   "&#192;">  <!-- U+00C0 ISOlat1   - latin capital letter A with grave = latin capital letter A grave      -->
<!ENTITY Aacute   "&#193;">  <!-- U+00C1 ISOlat1   - latin capital letter A with acute                                     -->
<!ENTITY Acirc    "&#194;">  <!-- U+00C2 ISOlat1   - latin capital letter A with circumflex                                -->
<!ENTITY Atilde   "&#195;">  <!-- U+00C3 ISOlat1   - latin capital letter A with tilde                                     -->
<!ENTITY Auml     "&#196;">  <!-- U+00C4 ISOlat1   - latin capital letter A with diaeresis                                 -->
<!ENTITY Aring    "&#197;">  <!-- U+00C5 ISOlat1   - latin capital letter A with ring above = latin capital letter A ring  -->
<!ENTITY AElig    "&#198;">  <!-- U+00C6 ISOlat1   - latin capital letter AE = latin capital ligature AE                   -->
<!ENTITY Ccedil   "&#199;">  <!-- U+00C7 ISOlat1   - latin capital letter C with cedilla                                   -->
<!ENTITY Egrave   "&#200;">  <!-- U+00C8 ISOlat1   - latin capital letter E with grave                                     -->
<!ENTITY Eacute   "&#201;">  <!-- U+00C9 ISOlat1   - latin capital letter E with acute                                     -->
<!ENTITY Ecirc    "&#202;">  <!-- U+00CA ISOlat1   - latin capital letter E with circumflex                                -->
<!ENTITY Euml     "&#203;">  <!-- U+00CB ISOlat1   - latin capital letter E with diaeresis                                 -->
<!ENTITY Igrave   "&#204;">  <!-- U+00CC ISOlat1   - latin capital letter I with grave                                     -->
<!ENTITY Iacute   "&#205;">  <!-- U+00CD ISOlat1   - latin capital letter I with acute                                     -->
<!ENTITY Icirc    "&#206;">  <!-- U+00CE ISOlat1   - latin capital letter I with circumflex                                -->
<!ENTITY Iuml     "&#207;">  <!-- U+00CF ISOlat1   - latin capital letter I with diaeresis                                 -->
<!ENTITY ETH      "&#208;">  <!-- U+00D0 ISOlat1   - latin capital letter ETH                                              -->
<!ENTITY Ntilde   "&#209;">  <!-- U+00D1 ISOlat1   - latin capital letter N with tilde                                     -->
<!ENTITY Ograve   "&#210;">  <!-- U+00D2 ISOlat1   - latin capital letter O with grave                                     -->
<!ENTITY Oacute   "&#211;">  <!-- U+00D3 ISOlat1   - latin capital letter O with acute                                     -->
<!ENTITY Ocirc    "&#212;">  <!-- U+00D4 ISOlat1   - latin capital letter O with circumflex                                -->
<!ENTITY Otilde   "&#213;">  <!-- U+00D5 ISOlat1   - latin capital letter O with tilde                                     -->
<!ENTITY Ouml     "&#214;">  <!-- U+00D6 ISOlat1   - latin capital letter O with diaeresis                                 -->
<!ENTITY times    "&#215;">  <!-- U+00D7 ISOnum    - multiplication sign                                                   -->
<!ENTITY Oslash   "&#216;">  <!-- U+00D8 ISOlat1   - latin capital letter O with stroke = latin capital letter O slash     -->
<!ENTITY Ugrave   "&#217;">  <!-- U+00D9 ISOlat1   - latin capital letter U with grave                                     -->
<!ENTITY Uacute   "&#218;">  <!-- U+00DA ISOlat1   - latin capital letter U with acute                                     -->
<!ENTITY Ucirc    "&#219;">  <!-- U+00DB ISOlat1   - latin capital letter U with circumflex                                -->
<!ENTITY Uuml     "&#220;">  <!-- U+00DC ISOlat1   - latin capital letter U with diaeresis                                 -->
<!ENTITY Yacute   "&#221;">  <!-- U+00DD ISOlat1   - latin capital letter Y with acute                                     -->
<!ENTITY THORN    "&#222;">  <!-- U+00DE ISOlat1   - latin capital letter THORN                                            -->
<!ENTITY szlig    "&#223;">  <!-- U+00DF ISOlat1   - latin small letter sharp s = ess-zed                                  -->
<!ENTITY agrave   "&#224;">  <!-- U+00E0 ISOlat1   - latin small letter a with grave = latin small letter a grave          -->
<!ENTITY aacute   "&#225;">  <!-- U+00E1 ISOlat1   - latin small letter a with acute                                       -->
<!ENTITY acirc    "&#226;">  <!-- U+00E2 ISOlat1   - latin small letter a with circumflex                                  -->
<!ENTITY atilde   "&#227;">  <!-- U+00E3 ISOlat1   - latin small letter a with tilde                                       -->
<!ENTITY auml     "&#228;">  <!-- U+00E4 ISOlat1   - latin small letter a with diaeresis                                   -->
<!ENTITY aring    "&#229;">  <!-- U+00E5 ISOlat1   - latin small letter a with ring above = latin small letter a ring      -->
<!ENTITY aelig    "&#230;">  <!-- U+00E6 ISOlat1   - latin small letter ae = latin small ligature ae                       -->
<!ENTITY ccedil   "&#231;">  <!-- U+00E7 ISOlat1   - latin small letter c with cedilla                                     -->
<!ENTITY egrave   "&#232;">  <!-- U+00E8 ISOlat1   - latin small letter e with grave                                       -->
<!ENTITY eacute   "&#233;">  <!-- U+00E9 ISOlat1   - latin small letter e with acute                                       -->
<!ENTITY ecirc    "&#234;">  <!-- U+00EA ISOlat1   - latin small letter e with circumflex                                  -->
<!ENTITY euml     "&#235;">  <!-- U+00EB ISOlat1   - latin small letter e with diaeresis                                   -->
<!ENTITY igrave   "&#236;">  <!-- U+00EC ISOlat1   - latin small letter i with grave                                       -->
<!ENTITY iacute   "&#237;">  <!-- U+00ED ISOlat1   - latin small letter i with acute                                       -->
<!ENTITY icirc    "&#238;">  <!-- U+00EE ISOlat1   - latin small letter i with circumflex                                  -->
<!ENTITY iuml     "&#239;">  <!-- U+00EF ISOlat1   - latin small letter i with diaeresis                                   -->
<!ENTITY eth      "&#240;">  <!-- U+00F0 ISOlat1   - latin small letter eth                                                -->
<!ENTITY ntilde   "&#241;">  <!-- U+00F1 ISOlat1   - latin small letter n with tilde                                       -->
<!ENTITY ograve   "&#242;">  <!-- U+00F2 ISOlat1   - latin small letter o with grave                                       -->
<!ENTITY oacute   "&#243;">  <!-- U+00F3 ISOlat1   - latin small letter o with acute                                       -->
<!ENTITY ocirc    "&#244;">  <!-- U+00F4 ISOlat1   - latin small letter o with circumflex                                  -->
<!ENTITY otilde   "&#245;">  <!-- U+00F5 ISOlat1   - latin small letter o with tilde                                       -->
<!ENTITY ouml     "&#246;">  <!-- U+00F6 ISOlat1   - latin small letter o with diaeresis                                   -->
<!ENTITY divide   "&#247;">  <!-- U+00F7 ISOnum    - division sign                                                         -->
<!ENTITY oslash   "&#248;">  <!-- U+00F8 ISOlat1   - latin small letter o with stroke = latin small letter o slash         -->
<!ENTITY ugrave   "&#249;">  <!-- U+00F9 ISOlat1   - latin small letter u with grave                                       -->
<!ENTITY uacute   "&#250;">  <!-- U+00FA ISOlat1   - latin small letter u with acute                                       -->
<!ENTITY ucirc    "&#251;">  <!-- U+00FB ISOlat1   - latin small letter u with circumflex                                  -->
<!ENTITY uuml     "&#252;">  <!-- U+00FC ISOlat1   - latin small letter u with diaeresis                                   -->
<!ENTITY yacute   "&#253;">  <!-- U+00FD ISOlat1   - latin small letter y with acute                                       -->
<!ENTITY thorn    "&#254;">  <!-- U+00FE ISOlat1   - latin small letter thorn                                              -->
<!ENTITY yuml     "&#255;">  <!-- U+00FF ISOlat1   - latin small letter y with diaeresis                                   -->

<!-- Latin Extended-A -->
<!ENTITY OElig    "&#338;">  <!-- U+0152 ISOlat2   - latin capital ligature OE                                             -->
<!ENTITY oelig    "&#339;">  <!-- U+0153 ISOlat2   - latin small ligature oe                                               -->

<!-- "ligature" is a misnomer for OElig/oelig: this is a separate character in some languages -->
<!ENTITY Scaron   "&#352;">  <!-- U+0160 ISOlat2   - latin capital letter S with caron                                     -->
<!ENTITY scaron   "&#353;">  <!-- U+0161 ISOlat2   - latin small letter s with caron                                       -->
<!ENTITY Yuml     "&#376;">  <!-- U+0178 ISOlat2   - latin capital letter Y with diaeresis                                 -->

<!-- Spacing Modifier Letters -->
<!ENTITY circ     "&#710;">  <!-- U+02C6 ISOpub    - modifier letter circumflex accent                                     -->
<!ENTITY tilde    "&#732;">  <!-- U+02DC ISOdia    - small tilde                                                           -->

<!-- General Punctuation -->
<!ENTITY ensp     "&#8194;"> <!-- U+2002 ISOpub    - en space                                                              -->
<!ENTITY emsp     "&#8195;"> <!-- U+2003 ISOpub    - em space                                                              -->
<!ENTITY thinsp   "&#8201;"> <!-- U+2009 ISOpub    - thin space                                                            -->
<!ENTITY zwnj     "&#8204;"> <!-- U+200C RFC 2070  - zero width non-joiner                                                 -->
<!ENTITY zwj      "&#8205;"> <!-- U+200D RFC 2070  - zero width joiner                                                     -->
<!ENTITY lrm      "&#8206;"> <!-- U+200E RFC 2070  - left-to-right mark                                                    -->
<!ENTITY rlm      "&#8207;"> <!-- U+200F RFC 2070  - right-to-left mark                                                    -->
<!ENTITY ndash    "&#8211;"> <!-- U+2013 ISOpub    - en dash                                                               -->
<!ENTITY mdash    "&#8212;"> <!-- U+2014 ISOpub    - em dash                                                               -->
<!ENTITY lsquo    "&#8216;"> <!-- U+2018 ISOnum    - left single quotation mark                                            -->
<!ENTITY rsquo    "&#8217;"> <!-- U+2019 ISOnum    - right single quotation mark                                           -->
<!ENTITY sbquo    "&#8218;"> <!-- U+201A NEW       - single low-9 quotation mark                                           -->
<!ENTITY ldquo    "&#8220;"> <!-- U+201C ISOnum    - left double quotation mark                                            -->
<!ENTITY rdquo    "&#8221;"> <!-- U+201D ISOnum    - right double quotation mark                                           -->
<!ENTITY bdquo    "&#8222;"> <!-- U+201E NEW       - double low-9 quotation mark                                           -->
<!ENTITY dagger   "&#8224;"> <!-- U+2020 ISOpub    - dagger                                                                -->
<!ENTITY Dagger   "&#8225;"> <!-- U+2021 ISOpub    - double dagger                                                         -->
<!ENTITY permil   "&#8240;"> <!-- U+2030 ISOtech   - per mille sign                                                        -->
<!ENTITY lsaquo   "&#8249;"> <!-- U+2039 ISO prop. - single left-pointing angle quotation mark                             -->

<!-- lsaquo is proposed but not yet ISO standardized -->
<!ENTITY rsaquo   "&#8250;"> <!-- U+203A ISO prop. - single right-pointing angle quotation mark                            -->

<!-- rsaquo is proposed but not yet ISO standardized -->
<!ENTITY euro     "&#8364;"> <!-- U+20AC NEW       - euro sign                                                             -->

<!-- Latin Extended-B -->
<!ENTITY fnof     "&#402;">  <!-- U+0192 ISOtech   - latin small f with hook = function = florin                           -->

<!-- Greek -->
<!ENTITY Alpha    "&#913;">  <!-- U+0391           - greek capital letter alpha                                            -->
<!ENTITY Beta     "&#914;">  <!-- U+0392           - greek capital letter beta                                             -->
<!ENTITY Gamma    "&#915;">  <!-- U+0393 ISOgrk3   - greek capital letter gamma                                            -->
<!ENTITY Delta    "&#916;">  <!-- U+0394 ISOgrk3   - greek capital letter delta                                            -->
<!ENTITY Epsilon  "&#917;">  <!-- U+0395           - greek capital letter epsilon                                          -->
<!ENTITY Zeta     "&#918;">  <!-- U+0396           - greek capital letter zeta                                             -->
<!ENTITY Eta      "&#919;">  <!-- U+0397           - greek capital letter eta                                              -->
<!ENTITY Theta    "&#920;">  <!-- U+0398 ISOgrk3   - greek capital letter theta                                            -->
<!ENTITY Iota     "&#921;">  <!-- U+0399           - greek capital letter iota                                             -->
<!ENTITY Kappa    "&#922;">  <!-- U+039A           - greek capital letter kappa                                            -->
<!ENTITY Lambda   "&#923;">  <!-- U+039B ISOgrk3   - greek capital letter lambda                                           -->
<!ENTITY Mu       "&#924;">  <!-- U+039C           - greek capital letter mu                                               -->
<!ENTITY Nu       "&#925;">  <!-- U+039D           - greek capital letter nu                                               -->
<!ENTITY Xi       "&#926;">  <!-- U+039E ISOgrk3   - greek capital letter xi                                               -->
<!ENTITY Omicron  "&#927;">  <!-- U+039F           - greek capital letter omicron                                          -->
<!ENTITY Pi       "&#928;">  <!-- U+03A0 ISOgrk3   - greek capital letter pi                                               -->
<!ENTITY Rho      "&#929;">  <!-- U+03A1           - greek capital letter rho                                              -->
<!ENTITY Sigma    "&#931;">  <!-- U+03A3 ISOgrk3   - greek capital letter sigma                                            -->
<!ENTITY Tau      "&#932;">  <!-- U+03A4           - greek capital letter tau                                              -->
<!ENTITY Upsilon  "&#933;">  <!-- U+03A5 ISOgrk3   - greek capital letter upsilon                                          -->
<!ENTITY Phi      "&#934;">  <!-- U+03A6 ISOgrk3   - greek capital letter phi                                              -->
<!ENTITY Chi      "&#935;">  <!-- U+03A7           - greek capital letter chi                                              -->
<!ENTITY Psi      "&#936;">  <!-- U+03A8 ISOgrk3   - greek capital letter psi                                              -->
<!ENTITY Omega    "&#937;">  <!-- U+03A9 ISOgrk3   - greek capital letter omega                                            -->
<!ENTITY alpha    "&#945;">  <!-- U+03B1 ISOgrk3   - greek small letter alpha                                              -->
<!ENTITY beta     "&#946;">  <!-- U+03B2 ISOgrk3   - greek small letter beta                                               -->
<!ENTITY gamma    "&#947;">  <!-- U+03B3 ISOgrk3   - greek small letter gamma                                              -->
<!ENTITY delta    "&#948;">  <!-- U+03B4 ISOgrk3   - greek small letter delta                                              -->
<!ENTITY epsilon  "&#949;">  <!-- U+03B5 ISOgrk3   - greek small letter epsilon                                            -->
<!ENTITY zeta     "&#950;">  <!-- U+03B6 ISOgrk3   - greek small letter zeta                                               -->
<!ENTITY eta      "&#951;">  <!-- U+03B7 ISOgrk3   - greek small letter eta                                                -->
<!ENTITY theta    "&#952;">  <!-- U+03B8 ISOgrk3   - greek small letter theta                                              -->
<!ENTITY iota     "&#953;">  <!-- U+03B9 ISOgrk3   - greek small letter iota                                               -->
<!ENTITY kappa    "&#954;">  <!-- U+03BA ISOgrk3   - greek small letter kappa                                              -->
<!ENTITY lambda   "&#955;">  <!-- U+03BB ISOgrk3   - greek small letter lambda                                             -->
<!ENTITY mu       "&#956;">  <!-- U+03BC ISOgrk3   - greek small letter mu                                                 -->
<!ENTITY nu       "&#957;">  <!-- U+03BD ISOgrk3   - greek small letter nu                                                 -->
<!ENTITY xi       "&#958;">  <!-- U+03BE ISOgrk3   - greek small letter xi                                                 -->
<!ENTITY omicron  "&#959;">  <!-- U+03BF NEW       - greek small letter omicron                                            -->
<!ENTITY pi       "&#960;">  <!-- U+03C0 ISOgrk3   - greek small letter pi                                                 -->
<!ENTITY rho      "&#961;">  <!-- U+03C1 ISOgrk3   - greek small letter rho                                                -->
<!ENTITY sigmaf   "&#962;">  <!-- U+03C2 ISOgrk3   - greek small letter final sigma                                        -->
<!ENTITY sigma    "&#963;">  <!-- U+03C3 ISOgrk3   - greek small letter sigma                                              -->
<!ENTITY tau      "&#964;">  <!-- U+03C4 ISOgrk3   - greek small letter tau                                                -->
<!ENTITY upsilon  "&#965;">  <!-- U+03C5 ISOgrk3   - greek small letter upsilon                                            -->
<!ENTITY phi      "&#966;">  <!-- U+03C6 ISOgrk3   - greek small letter phi                                                -->
<!ENTITY chi      "&#967;">  <!-- U+03C7 ISOgrk3   - greek small letter chi                                                -->
<!ENTITY psi      "&#968;">  <!-- U+03C8 ISOgrk3   - greek small letter psi                                                -->
<!ENTITY omega    "&#969;">  <!-- U+03C9 ISOgrk3   - greek small letter omega                                              -->
<!ENTITY thetasym "&#977;">  <!-- U+03D1 NEW       - greek small letter theta symbol                                       -->
<!ENTITY upsih    "&#978;">  <!-- U+03D2 NEW       - greek upsilon with hook symbol                                        -->
<!ENTITY piv      "&#982;">  <!-- U+03D6 ISOgrk3   - greek pi symbol                                                       -->

<!-- General Punctuation -->
<!ENTITY bull     "&#8226;"> <!-- U+2022 ISOpub    - bullet = black small circle                                           -->
<!ENTITY hellip   "&#8230;"> <!-- U+2026 ISOpub    - horizontal ellipsis = three dot leader                                -->
<!ENTITY prime    "&#8242;"> <!-- U+2032 ISOtech   - prime = minutes = feet                                                -->
<!ENTITY Prime    "&#8243;"> <!-- U+2033 ISOtech   - double prime = seconds = inches                                       -->
<!ENTITY oline    "&#8254;"> <!-- U+203E NEW       - overline = spacing overscore                                          -->
<!ENTITY frasl    "&#8260;"> <!-- U+2044 NEW       - fraction slash                                                        -->

<!-- Letterlike Symbols -->
<!ENTITY weierp   "&#8472;"> <!-- U+2118 ISOamso   - script capital P = power set = Weierstrass p                          -->
<!ENTITY image    "&#8465;"> <!-- U+2111 ISOamso   - blackletter capital I = imaginary part                                -->
<!ENTITY real     "&#8476;"> <!-- U+211C ISOamso   - blackletter capital R = real part symbol                              -->
<!ENTITY trade    "&#8482;"> <!-- U+2122 ISOnum    - trade mark sign                                                       -->
<!ENTITY alefsym  "&#8501;"> <!-- U+2135 NEW       - alef symbol = first transfinite cardinal                              -->

<!-- Arrows -->
<!ENTITY larr     "&#8592;"> <!-- U+2190 ISOnum    - leftwards arrow                                                       -->
<!ENTITY uarr     "&#8593;"> <!-- U+2191 ISOnum    - upwards arrow                                                         -->
<!ENTITY rarr     "&#8594;"> <!-- U+2192 ISOnum    - rightwards arrow                                                      -->
<!ENTITY darr     "&#8595;"> <!-- U+2193 ISOnum    - downwards arrow                                                       -->
<!ENTITY harr     "&#8596;"> <!-- U+2194 ISOamsa   - left right arrow                                                      -->
<!ENTITY crarr    "&#8629;"> <!-- U+21B5 NEW       - downwards arrow with corner leftwards = carriage return               -->
<!ENTITY lArr     "&#8656;"> <!-- U+21D0 ISOtech   - leftwards double arrow                                                -->
<!ENTITY uArr     "&#8657;"> <!-- U+21D1 ISOamsa   - upwards double arrow                                                  -->
<!ENTITY rArr     "&#8658;"> <!-- U+21D2 ISOtech   - rightwards double arrow                                               -->
<!ENTITY dArr     "&#8659;"> <!-- U+21D3 ISOamsa   - downwards double arrow                                                -->
<!ENTITY hArr     "&#8660;"> <!-- U+21D4 ISOamsa   - left right double arrow                                               -->

<!-- Mathematical Operators -->
<!ENTITY forall   "&#8704;"> <!-- U+2200 ISOtech   - for all                                                               -->
<!ENTITY part     "&#8706;"> <!-- U+2202 ISOtech   - partial differential                                                  -->
<!ENTITY exist    "&#8707;"> <!-- U+2203 ISOtech   - there exists                                                          -->
<!ENTITY empty    "&#8709;"> <!-- U+2205 ISOamso   - empty set = null set = diameter                                       -->
<!ENTITY nabla    "&#8711;"> <!-- U+2207 ISOtech   - nabla = backward difference                                           -->
<!ENTITY isin     "&#8712;"> <!-- U+2208 ISOtech   - element of                                                            -->
<!ENTITY notin    "&#8713;"> <!-- U+2209 ISOtech   - not an element of                                                     -->
<!ENTITY ni       "&#8715;"> <!-- U+220B ISOtech   - contains as member                                                    -->
<!ENTITY prod     "&#8719;"> <!-- U+220F ISOamsb   - n-ary product = product sign                                          -->
<!ENTITY sum      "&#8721;"> <!-- U+2211 ISOamsb   - n-ary summation                                                       -->
<!ENTITY minus    "&#8722;"> <!-- U+2212 ISOtech   - minus sign                                                            -->
<!ENTITY lowast   "&#8727;"> <!-- U+2217 ISOtech   - asterisk operator                                                     -->
<!ENTITY radic    "&#8730;"> <!-- U+221A ISOtech   - square root = radical sign                                            -->
<!ENTITY prop     "&#8733;"> <!-- U+221D ISOtech   - proportional to                                                       -->
<!ENTITY infin    "&#8734;"> <!-- U+221E ISOtech   - infinity                                                              -->
<!ENTITY ang      "&#8736;"> <!-- U+2220 ISOamso   - angle                                                                 -->
<!ENTITY and      "&#8743;"> <!-- U+2227 ISOtech   - logical and = wedge                                                   -->
<!ENTITY or       "&#8744;"> <!-- U+2228 ISOtech   - logical or = vee                                                      -->
<!ENTITY cap      "&#8745;"> <!-- U+2229 ISOtech   - intersection = cap                                                    -->
<!ENTITY cup      "&#8746;"> <!-- U+222A ISOtech   - union = cup                                                           -->
<!ENTITY int      "&#8747;"> <!-- U+222B ISOtech   - integral                                                              -->
<!ENTITY there4   "&#8756;"> <!-- U+2234 ISOtech   - therefore                                                             -->
<!ENTITY sim      "&#8764;"> <!-- U+223C ISOtech   - tilde operator = varies with = similar to                             -->
<!ENTITY cong     "&#8773;"> <!-- U+2245 ISOtech   - approximately equal to                                                -->
<!ENTITY asymp    "&#8776;"> <!-- U+2248 ISOamsr   - almost equal to = asymptotic to                                       -->
<!ENTITY ne       "&#8800;"> <!-- U+2260 ISOtech   - not equal to                                                          -->
<!ENTITY equiv    "&#8801;"> <!-- U+2261 ISOtech   - identical to                                                          -->
<!ENTITY le       "&#8804;"> <!-- U+2264 ISOtech   - less-than or equal to                                                 -->
<!ENTITY ge       "&#8805;"> <!-- U+2265 ISOtech   - greater-than or equal to                                              -->
<!ENTITY sub      "&#8834;"> <!-- U+2282 ISOtech   - subset of                                                             -->
<!ENTITY sup      "&#8835;"> <!-- U+2283 ISOtech   - superset of                                                           -->
<!ENTITY nsub     "&#8836;"> <!-- U+2284 ISOamsn   - not a subset of                                                       -->
<!ENTITY sube     "&#8838;"> <!-- U+2286 ISOtech   - subset of or equal to                                                 -->
<!ENTITY supe     "&#8839;"> <!-- U+2287 ISOtech   - superset of or equal to                                               -->
<!ENTITY oplus    "&#8853;"> <!-- U+2295 ISOamsb   - circled plus = direct sum                                             -->
<!ENTITY otimes   "&#8855;"> <!-- U+2297 ISOamsb   - circled times = vector product                                        -->
<!ENTITY perp     "&#8869;"> <!-- U+22A5 ISOtech   - up tack = orthogonal to = perpendicular                               -->
<!ENTITY sdot     "&#8901;"> <!-- U+22C5 ISOamsb   - dot operator                                                          -->

<!-- Miscellaneous Technical -->
<!ENTITY lceil    "&#8968;"> <!-- U+2308 ISOamsc   - left ceiling = apl upstile                                            -->
<!ENTITY rceil    "&#8969;"> <!-- U+2309 ISOamsc   - right ceiling                                                         -->
<!ENTITY lfloor   "&#8970;"> <!-- U+230A ISOamsc   - left floor = apl downstile                                            -->
<!ENTITY rfloor   "&#8971;"> <!-- U+230B ISOamsc   - right floor                                                           -->
<!ENTITY lang     "&#9001;"> <!-- U+2329 ISOtech   - left-pointing angle bracket = bra                                     -->
<!ENTITY rang     "&#9002;"> <!-- U+232A ISOtech   - right-pointing angle bracket = ket                                    -->

<!-- Geometric Shapes -->
<!ENTITY loz      "&#9674;"> <!-- U+25CA ISOpub    - lozenge                                                               -->

<!-- Miscellaneous Symbols -->
<!ENTITY spades   "&#9824;"> <!-- U+2660 ISOpub    - black spade suit                                                      -->
<!ENTITY clubs    "&#9827;"> <!-- U+2663 ISOpub    - black club suit = shamrock                                            -->
<!ENTITY hearts   "&#9829;"> <!-- U+2665 ISOpub    - black heart suit = valentine                                          -->
<!ENTITY diams    "&#9830;"> <!-- U+2666 ISOpub    - black diamond suit                                                    -->
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991206/06c15776/installing.htm
From tshw at capitalmarketscompany.com  Mon Dec  6 15:08:37 1999
From: tshw at capitalmarketscompany.com (Shaw Tim)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
Message-ID: <FDFFD5C2748BD211BC500008C71E933CBDF946@uklonts01.uklo.capitalmarketscompany.com>

FWIW, I ended up having different (sub)DocHandlers for the different nesting
levels and implementing a Handler stack to push/pop them according to the
tags they handled. At least this way you can handle sub-trees fairly simply,
and reduce the bulk of code required for situations where you have many tags
identifying different sub-trees (and hence semantics).
A (minor) problem I had with this was that I looked up the Handlers based on
the tag-name - so there's a problem when the same tag is used in different
'contexts'.
It would be useful to associate a Handler with a given tag at the parser
initialisation level, using some XPath notation to identify the appropriate
tag(s).
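
A minimal sketch of the handler stack described above, written in Python
on top of the standard xml.sax module rather than the C++ code in
question. The names SubHandler, StackDispatcher and parse are
hypothetical, and a plain "a/b/c" element path stands in for the XPath
notation suggested; it is enough to tell the same tag apart in different
contexts, but it is not XPath.

```python
import xml.sax


class SubHandler:
    """Hypothetical per-subtree handler: collects the text inside its element."""

    def __init__(self, label):
        self.label = label
        self.text = []

    def characters(self, content):
        self.text.append(content)


class StackDispatcher(xml.sax.ContentHandler):
    """Pushes a sub-handler when a registered path opens, pops it when it closes.

    Handlers are keyed on the full element path ("order/customer/name"), not
    on the bare tag name, so one tag can be handled differently per context.
    """

    def __init__(self, factories):
        super().__init__()
        self.factories = factories   # path -> callable returning a SubHandler
        self.path = []               # names of the currently open elements
        self.stack = []              # (depth, handler) pairs, innermost last
        self.results = []            # (label, text) for each finished subtree

    def startElement(self, name, attrs):
        self.path.append(name)
        factory = self.factories.get("/".join(self.path))
        if factory is not None:
            self.stack.append((len(self.path), factory()))

    def characters(self, content):
        # Text always goes to the innermost active sub-handler, if any.
        if self.stack:
            self.stack[-1][1].characters(content)

    def endElement(self, name):
        # Pop only when the element that pushed the handler is closing.
        if self.stack and self.stack[-1][0] == len(self.path):
            _, handler = self.stack.pop()
            self.results.append((handler.label, "".join(handler.text).strip()))
        self.path.pop()


def parse(document, factories):
    """Parse a document string and return the collected (label, text) pairs."""
    dispatcher = StackDispatcher(factories)
    xml.sax.parseString(document.encode("utf-8"), dispatcher)
    return dispatcher.results
```

Registering "order/customer/name" and "order/item/name" separately gives
each <name> its own handler, which is the case a lookup by tag name alone
cannot distinguish.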

tim

> -----Original Message-----
> From: Paul Miller [mailto:stele@fxtech.com]
> Sent: 06 December 1999 15:11
> To: xml-dev
> Subject: Re: simple XML for C++ application data-file I/O
> 
> 
> > We tried to keep SAX 1.0 as simple as possible -- how would you
> > simplify the following further?
> > 
> >     public void startElement (String name, AttributeList atts)
> >     {
> >       // do something!!
> >     }
> 
> Here is where I have the problem. This leaves an awful lot up to the
> application, still, including handling the proper nesting. I would like
> to make the actual parsing of elements more "automatic", so when a
> certain element is hit, it calls a function with my object pointer where
> I can pick up the parsing from there, then drop back out to the
> enclosing XML scope and keep going.
> 
> Perhaps what I want to do should be built on SAX instead of expat,
> though.
> 
> --
> Paul Miller - stele@fxtech.com
> 

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From larsga at garshol.priv.no  Mon Dec  6 15:09:39 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: <384BCC51.936AA275@fxtech.com>
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain> <384BCC51.936AA275@fxtech.com>
Message-ID: <m3ln78kswe.fsf@ifi.uio.no>


* Paul Miller
| 
| Comments?

It looks good, and in fact this was exactly the sort of thing SAX was
designed to allow you to do as a layer above SAX. In Java we already
have SAXON and MDSAX which both do this kind of thing. Python already
has one interface of this sort and ezsax is likely to become another.

--Lars M.




From uche.ogbuji at fourthought.com  Mon Dec  6 15:10:42 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++ vs. SAX2 
In-Reply-To: Your message of "Mon, 06 Dec 1999 10:13:06 GMT."
             <384B8C32.22805488@nag.co.uk> 
Message-ID: <199912061510.IAA03966@localhost.localdomain>

> Just a thought: why not take a leaf out of the DOM's book and write the
> canonical version of the SAX interfaces in a language-neutral format like
> IDL? That way, bindings to a number of languages (including, but not
> limited to, C++ and Java) can be trivially derived by using the
> appropriate IDL-to-whatever converter.

Shh!  That's unwelcome talk around here.

I advocated using IDL for the official SAX definition a while back, but no-one 
seemed to deem it worth considering.  Of course, we've fallen into exactly the 
sort of trap that language-specific interface definition causes: people 
translating to another language all do it differently, and the whole set of 
discussions must reiterate for language Y.

The Python/XML group recently hashed out details of a Python/DOM binding.  
Because there is a developed Python/CORBA binding, we knew exactly how to 
model several key components of the interface.  Note that this does not 
involve taking up _any_ of CORBA's baggage except for interface definition, 
for which IDL does a brilliant job.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From donpark at docuverse.com  Mon Dec  6 15:11:06 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:24 2004
Subject: A processing instruction for robots
In-Reply-To: <m3wvqsqutu.fsf@ifi.uio.no>
Message-ID: <000b01bf3ffc$2fc659e0$099918d1@docuverse1>

>* Don Park
>| 
>| Walter,
>| Could you elaborate your decision to use PI rather than element(s)?
>
>I'm not Walter, but to me this has the obvious advantage that it can
>be used completely orthogonally to the document contents and the
>software used to process the document for non-indexing purposes.

IMHO, this line of thinking (aka 'sacred content')
forces us to use PI or special attributes for
extension of document instances.  Poor use of
the letter 'X' in XML.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From uche.ogbuji at fourthought.com  Mon Dec  6 15:19:42 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++ vs. SAX2 
In-Reply-To: Your message of "06 Dec 1999 14:31:35 +0100."
             <m31z90p554.fsf@ifi.uio.no> 
Message-ID: <199912061519.IAA03998@localhost.localdomain>

> | Just a thought: why not take a leaf out of the DOM's book and write
> | the canonical version of the SAX interfaces in a language-neutral
> | format like IDL? 
> 
> This may sound like a good idea, but it has its drawbacks in that one
> is immediately forced into a lowest common denominator design where it
> is impossible to make use of the features that really make each
> language what they are.
> 
> Also, IDL does not have convenient ways of mapping to C++ streams,
> Java InputStream, Python dictionary-like objects and file-like objects
> etc etc  
> 
> Another problem is that exceptions are first-class objects in SAX
> (which is exploited by the Java and Python mappings), but not in IDL.
> 
> Nor are language naming conventions respected. (startElement should
> really be startElement (in Java), start_element (in C++, Python, IDL)
> and start-element (in Common Lisp/Scheme), and there may even be more
> variations.)
> 
> As a general reference and statement of intent it might have some
> value, but I really think translation should be done by humans. The
> main advantage of IDL, cross-process and cross-language
> interoperability, is not really all that valuable for SAX anyway.

All these problems you bring up are already being addressed by most language 
groups in the process of developing a CORBA binding.  Do you really see such 
evil in the C++, Java and Python bindings for native construction from IDL?

I should repeat that not all aspects of CORBA bindings are useful.  For 
instance, the Java binding for actual distributed components requires an ugly 
explosion of packages to cope with CORBA's semantics (largely that language's 
own fault for not supporting multiple implementation sharing).  But I don't 
think these problems plague the simple task of mapping IDL to native 
signatures.
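
On the naming point quoted above, a small sketch of how one canonical
operation name could be turned into each language's convention
mechanically. This is an illustration only: split_words and bind_name are
hypothetical helpers, the rule table is an assumption taken from the
examples in the quoted message, and nothing here is how any CORBA
language mapping is actually specified.

```python
import re


def split_words(canonical):
    """Split a camelCase name such as 'startElement' into lowercase words.

    Runs of capitals ('XML' in 'parseXML') are kept together as one word,
    so their original casing is not recoverable afterwards.
    """
    pattern = r"[A-Z]?[a-z0-9]+|[A-Z]+(?![a-z])"
    return [word.lower() for word in re.findall(pattern, canonical)]


def bind_name(canonical, language):
    """Return the canonical name spelled in the given language's convention."""
    words = split_words(canonical)
    if language == "java":
        return words[0] + "".join(w.capitalize() for w in words[1:])
    if language in ("c++", "python", "idl"):
        return "_".join(words)
    if language in ("lisp", "scheme"):
        return "-".join(words)
    raise ValueError("no naming rule for " + language)
```

Whether a generated spelling is acceptable to users of each language is a
separate question from whether it can be generated.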


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From stele at fxtech.com  Mon Dec  6 15:31:57 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain> <384BCC51.936AA275@fxtech.com>
Message-ID: <384BD6A5.24A60C72@fxtech.com>

> The major idea here is you register everything up-front, and
> element-specific callbacks get called to deal with specific elements.
> You can start up parsing inside an element, so you can nest parsing at
> the object level.

I should point out that I'm interested in a C++ implementation only. It
seems the Java people already have anything XML they could ever want.
:-)

If there are others interested in what I proposed I'm prepared to whip
up an implementation this week.

--
Paul Miller - stele@fxtech.com



From sb at metis.no  Mon Dec  6 15:44:39 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++: First interface draft
In-Reply-To: John Aldridge's message of "Mon, 06 Dec 1999 15:01:03 +0000"
References: <3.0.6.32.19991206150103.009a1c10@mailhost>
Message-ID: <whso1ghy64.fsf@viffer.oslo.metis.no>

>>>>> John Aldridge <john.aldridge@informatix.co.uk>:

> We're using MSVC 6 here, and basic_string<> seems fine. 

It's not.  See eg.
	http://msdn.microsoft.com/visualc/stl/faq.htm#Q4

There are patches to this and other problems and bugs with the
Standard Library, to be found at
	http://www.dinkumware.com/vc_fixes.html
but these fixes won't help with templates that are explicitly
instantiated in the C++ runtime DLL.

I spent two weeks before last Christmas trying to lose
Standards<ToolKit> when using MSVC++, and I got to the stage where I
was able to compile the program and run it a little bit before it
crashed, before we decided to cut our losses and went back to
Standards<ToolKit>.  This is a program that runs without incident on
Sunpro 4.2+Standards<ToolKit>, gcc/egcs on Linux and MSVC++ with
Standards<ToolKit>.

Complaints about the state of the Standard C++ library are met with
responses along the lines of "MSVC++ is not a standard C++ compiler.
It's a Windows compiler".

Quite amazing, really.

However, MS has indicated that MSVC++ 7 may come out with a fixed
version of the Standard C++ Library (but I'm not holding my breath
waiting for this).

> We use templates extensively (both the STL and our own), and they
> too give little trouble _except_ when it comes to exporting template
> instantiations across DLL boundaries, which takes considerable care
> (but can usually be managed).

It's OK if the instantiated classes don't have any static members.
Then you run into having to do this:
	http://msdn.microsoft.com/visualc/stl/faq.htm#Q5

> Namespaces are fine too.

Yes.  That wasn't my problem.  My problem was that std::iostreams are
incompatible with Standards<ToolKit> (a failing of Standards<ToolKit>, 
I agree).  I could make a stab at replacing Standards<ToolKit> with
stuff from SGI:
	http://www.stlport.org/doc/README.VC++.html

But then it's a question of replacing stuff that works with stuff that 
maybe works.
[snip!]

> I think the days of having to avoid large chunks of the C++ standard
> are largely over, thank heavens.

In half a year, to a year, I expect I'll agree with you.



From vilya at nag.co.uk  Mon Dec  6 15:48:32 1999
From: vilya at nag.co.uk (Vilya Harvey)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++ vs. SAX2
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no> <384B8C32.22805488@nag.co.uk> <m31z90p554.fsf@ifi.uio.no>
Message-ID: <384BDACE.44C13FED@nag.co.uk>

Lars Marius Garshol wrote:
> 
> * Vilya Harvey
> |
> | Just a thought: why not take a leaf out of the DOM's book and write
> | the canonical version of the SAX interfaces in a language-neutral
> | format like IDL?
> 
> This may sound like a good idea, but it has its drawbacks in that one
> is immediately forced into a lowest common denominator design where it
> is impossible to make use of the features that really make each
> language what they are.

Ray Whitmer basically said what I wanted to say in his response to your
post (thanks Ray!) so I won't repeat that; just consider me in agreement
with the points he made. :-)

> Also, IDL does not have convenient ways of mapping to C++ streams,
> Java InputStream, Python dictionary-like objects and file-like objects
> etc etc

In theory though, you would simply define the functionality you required
from a stream (to use your example) in an interface then make use of the
appropriate "native" stream class in your implementation. That's not
terribly inconvenient, and it needn't be inefficient either in a
reasonable implementation.
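
A minimal sketch of what I mean, assuming a parser that only needs to pull
bytes from its input. The names ByteSource, IstreamSource and readAll are
made up for illustration; they are not part of SAX or of any IDL mapping.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical language-neutral interface: only what a parser needs
// from a stream, with nothing tied to a particular library.
class ByteSource {
public:
    virtual ~ByteSource() {}
    // Reads up to `max` bytes into `buf`; returns the number of bytes
    // actually read, or 0 at end of input.
    virtual std::size_t read(char *buf, std::size_t max) = 0;
};

// C++ binding: the interface implemented on top of the native std::istream.
class IstreamSource : public ByteSource {
public:
    explicit IstreamSource(std::istream &in) : in_(in) {}
    std::size_t read(char *buf, std::size_t max) {
        in_.read(buf, static_cast<std::streamsize>(max));
        return static_cast<std::size_t>(in_.gcount());
    }
private:
    std::istream &in_;
};

// Drains a ByteSource into a string, the way a parser front end might.
std::string readAll(ByteSource &src) {
    std::string out;
    char buf[4];  // deliberately small so several reads are needed
    std::size_t n;
    while ((n = src.read(buf, sizeof buf)) > 0) out.append(buf, n);
    return out;
}
```

A Java or Python binding would implement the same small interface over
InputStream or a file-like object, so the cost is one thin adapter per
language.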

> Another problem is that exceptions are first-class objects in SAX
> (which is exploited by the Java and Python mappings), but not in IDL.

The only problem I see is that IDL doesn't allow exceptions to have
inheritance. That would mean some slight changes to the API (although the
implementations could still inherit from one another), but nothing really
serious. Other than that, IDL only allows exceptions to declare member
data (no methods); I don't see that as a real limitation though, since I
have yet to see a *useful* example of an exception object with any methods
other than getters/setters (which IDL member data gets mapped to) and
printStackTrace().

> Nor are language naming conventions respected. (startElement should
> really be startElement (in Java), start_element (in C++, Python, IDL)
> and start-element (in Common Lisp/Scheme) and there may even be more
> variations.)

Not everyone using a particular language follows the same naming
conventions anyway, so I really don't think that should be a factor. As an
aside, I disagree with you about the C++ name: I think it should be
startElement not start_element. :-) Also as an aside, I haven't seen any
IDL to Lisp or Scheme converters - does such a tool exist?

> As a general reference and statement of intent it might have some
> value, but I really think translation should be done by humans.

I agree with you about this.

> The main advantage of IDL, cross-process and cross-language
> interoperability, is not really all that valuable for SAX anyway.

I would argue that since SAX appears to be intended as a cross-language
API, the main advantage of IDL would be its language neutrality. It would
mean that the API would not be developed with the capabilities of one
particular language in mind, as has happened with SAX 1.0. Of course
whether or not that would be a real advantage is debatable but it would
help avoid situations such as we currently have, where several
incompatible C++ bindings have sprung up. Surely that's a good thing?

Vil.
(Not speaking for my employer.)
-- 
Vilya Harvey  <vilya@nag.co.uk>    Wilkinson House  Mob: +44  961 106 505
Computational Mathematics Group   Jordan Hill Road   Wk: +44 1865 511 245
NAG Limited                    Oxford  UK  OX2 8DR  Fax: +44 1865 311 205



From czinkos at mail.matav.hu  Mon Dec  6 15:51:35 1999
From: czinkos at mail.matav.hu (Zsolt Czinkos)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
Message-ID: <384BDF42.9BCA2C23@mail.matav.hu>

Paul Miller wrote:

> The major idea here is you register everything up-front, and
> element-specific callbacks get called to deal with specific elements.

Hello,

With SAXON JAVA API you can define your own element-specific handlers.
(Last version I had a look at was 4.5.)


Best,

Zsolt Czinkos



From david at megginson.com  Mon Dec  6 15:55:26 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: Paul Miller's message of "Mon, 06 Dec 1999 09:10:42 -0500"
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain> <384BC3E2.E526B322@fxtech.com>
Message-ID: <m3so1gnjyg.fsf@localhost.localdomain>

Paul Miller <stele@fxtech.com> writes:

> Here is where I have the problem. This leaves an awful lot up to the
> application, still, including handling the proper nesting. I would like
> to make the actual parsing of elements more "automatic", so when a
> certain element is hit, it calls a function with my object pointer where
> I can pick up the parsing from there, then drop back out to the
> enclosing XML scope and keep going.

If you're using Java, then there are already some higher-level
toolkits for this sort of thing -- you might want to take a look at
SAXON, which is built on top of SAX.

> Perhaps what I want to do should be built on SAX instead of expat,
> though.

That will make sense once we have a common SAX C++ interface for
expat, libxml, rxp, xml4c++, and any other C-/C++-based XML parsers.
We're not there yet, though.
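
As a rough sketch of the kind of layer Paul describes, assuming some
SAX-like parser that delivers start and end element events: the names
ElementDispatcher and onElement are invented for illustration and do not
belong to any existing SAX C++ binding.

```cpp
#include <functional>
#include <map>
#include <string>
#include <vector>

// Hypothetical layer over SAX-style events: callbacks are registered per
// element name up front, and the dispatcher tracks nesting on their behalf.
class ElementDispatcher {
public:
    // A callback receives the current element path, outermost element first.
    typedef std::function<void(const std::vector<std::string> &)> Callback;

    void onElement(const std::string &name, Callback cb) {
        handlers_[name] = cb;
    }

    // These two would be driven by the parser's start/end element events.
    void startElement(const std::string &name) {
        path_.push_back(name);
        std::map<std::string, Callback>::iterator it = handlers_.find(name);
        if (it != handlers_.end()) it->second(path_);
    }
    void endElement(const std::string &) {
        if (!path_.empty()) path_.pop_back();
    }

private:
    std::map<std::string, Callback> handlers_;
    std::vector<std::string> path_;
};
```

The application registers its handlers once, and the dispatcher deals with
the nesting bookkeeping that would otherwise be repeated in every
application.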


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From stele at fxtech.com  Mon Dec  6 16:07:35 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++: First interface draft
Message-ID: <384BDEFC.FD92DBE0@fxtech.com>

> It's not.  See eg.
>         http://msdn.microsoft.com/visualc/stl/faq.htm#Q4

I believe this problem is due to Microsoft's basic_string being a
reference-counting implementation. I've seen problems with this myself.

> Complaints about this state of the Standard C++ library, are met with
> responses on the line of "MSVC++ is not a standard C++ compiler.  It's
> a Windows compiler".

> Quite amazing, really.

Indeed, but it *is* possible to write portable code with MSVC++. It just
depends on how much Microsoft stuff you drag into your build. One thing
you might try is SGI's implementation of the STL (which includes their
own version of std::string). I've been using this for years with much
success. Download it at http://www.sgi.com/Technology/STL. Alex Stepanov
works on it at SGI, so you know it's good.

> Yes.  That wasn't my problem.  My problem was that std::iostreams are
> incompatible with Standards<ToolKit> (a failing of Standards<ToolKit>,

I personally don't use MS's iostreams. They are about 6-10 times slower
than good ole' stdio, so I wrote my own basic IO streams classes that
simply wrap around a FILE *. Much better.
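
Roughly along these lines. This is a cut-down sketch rather than my actual
classes, and the name FileInput is made up for the example.

```cpp
#include <cstdio>
#include <string>

// Hypothetical minimal input stream that simply wraps a FILE *.
class FileInput {
public:
    explicit FileInput(const char *path) : fp_(std::fopen(path, "rb")) {}
    ~FileInput() { if (fp_) std::fclose(fp_); }

    bool ok() const { return fp_ != 0; }

    // Reads one line without its trailing newline.
    // Returns false once the end of the input has been reached.
    bool getLine(std::string &line) {
        line.clear();
        if (!fp_) return false;
        int c;
        while ((c = std::fgetc(fp_)) != EOF) {
            if (c == '\n') return true;
            line += static_cast<char>(c);
        }
        return !line.empty();
    }

private:
    FileInput(const FileInput &);             // not copyable: owns the FILE *
    FileInput &operator=(const FileInput &);
    std::FILE *fp_;
};
```

The destructor closes the file, so early returns and exceptions in the
calling code do not leak the handle.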

-Paul

--
Paul Miller - stele@fxtech.com



From bhall at merrillhall.com  Mon Dec  6 16:10:36 1999
From: bhall at merrillhall.com (Ben Hall)
Date: Mon Jun  7 17:18:24 2004
Subject: PSGML-1.2.1 problems
In-Reply-To: <384BCFC8.25F7721E@esatclear.ie>
Message-ID: <NDBBKJDPILEOPLHPFFEJIEGOCPAA.bhall@merrillhall.com>

It appears that you define the role attribute in common.att,
xlink-simple.att, and common-idreq.att, and that some elements reference
more than one of these.

--Ben

====================================
benjamin hall
merrill-hall new media, inc
bhall@merrillhall.com
404.827.9883
====================================

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Eoin Lane
> Sent: Monday, December 06, 1999 10:01 AM
> To: xml-dev@ic.ac.uk
> Subject: PSGML-1.2.1 problems
>
>
> I'm trying to write an XML doc with emacs configured to use psgml-1.2.1
> but am having some problems. I have checked that psgml works with a
> simple dtd. However when I use the dtd (document-v10.dtd) below I get
> the following error.
>
> ~/character.ent line 2 col 12 entity common.att
> ~/document-v10.dtd line 218 col 29 entity DOCUMENT
> ~/installing.xml line 3 col 51
> Name expected; at: :lang
>
> I wonder if anyone could tell me what I am doing wrong. I know the dtd is
> correct because I checked it with the IBM XML4J parser and it validated. It
> would be of great benefit to me if I could use the dtd in emacs so any
> help would be greatly appreciated.
>
> Eoin.
>
>
> --
>
> Dr. Eoin Lane
> InConn Technologies Ltd.
> 17 Washington St.
> Cork.
> Tel. (021) 271855 Fax (021) 272419
> http://www.inconn.ie
> mailto:eoinlane@esatclear.ie
>
>
>




From sb at metis.no  Mon Dec  6 16:21:28 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:24 2004
Subject: simple XML for C++ application data-file I/O
In-Reply-To: Paul Miller's message of "Mon, 06 Dec 1999 10:30:45 -0500"
References: <384B04DA.DCD6BAED@fxtech.com> <m3wvqsnppb.fsf@localhost.localdomain> <384BCC51.936AA275@fxtech.com> <384BD6A5.24A60C72@fxtech.com>
Message-ID: <whk8mshwgj.fsf@viffer.oslo.metis.no>

>>>>> Paul Miller <stele@fxtech.com>:

> If there are others interested in what I proposed I'm prepared to
> whip up an implementation this week.

I am interested in something like this, but for me it would have to be
something that would take its input as a SAX DocumentHandler.



From jmg at trivida.com  Mon Dec  6 16:36:10 1999
From: jmg at trivida.com (Jeff Greif)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++: First interface draft
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whwvqsmo4v.fsf@viffer.oslo.metis.no>
Message-ID: <01e701bf4007$da1a2b00$a20010ac@trivida.com>

I think this problem usually can be managed by choice of include path,
assuming you have source code for the SAX library.  If the include directory
for the ObjectSpace headers is found before the include directory for
MSVC++ standard library when you compile both the SAX library and
application code, the problem with broken parts of the MSVC++ library
version of iostream should be avoided.  But this approach does not work if
you want to use someone else's binary of the library.

Jeff

----- Original Message -----
From: Steinar Bang <sb@metis.no>
To: <xml-dev@ic.ac.uk>
Sent: Monday, December 06, 1999 1:09 AM
Subject: Re: SAX/C++: First interface draft


> I have a practical problem with using std::istream on the MSVC++
> platform.  Since the Standard C++ Library as delivered with MSVC++ 5
> and 6 is broken, we're using Standards<ToolKit> from ObjectSpace to
> provide us with the parts of the Standard C++ Library we're using.
>
> And Objectspace Standards<ToolKit> is not compatible with the Standard
> C++ Library iostreams of MSVC++.




From sb at metis.no  Mon Dec  6 16:49:41 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:24 2004
Subject: SAX/C++: First interface draft
In-Reply-To: "Jeff Greif"'s message of "Mon, 6 Dec 1999 08:35:05 -0800"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <whwvqsmo4v.fsf@viffer.oslo.metis.no> <01e701bf4007$da1a2b00$a20010ac@trivida.com>
Message-ID: <wh3dtghv53.fsf@viffer.oslo.metis.no>

>>>>> "Jeff Greif" <jmg@trivida.com>:

> I think this problem usually can be managed by choice of include
> path, assuming you have source code for the SAX library.  If the
> include directory for the ObjectSpace headers is found before the
> include directory for MSVC++ standard library when you compile both
> the SAX library and application code, the problem with broken parts
> of the MSVC++ library version of iostream should be avoided.  But
> this approach does not work if you want to use someone else's binary
> of the library.

No, this is not the problem.  The problem is that Standards<ToolKit>
does not work with the standard C++ library iostream implementation of 
MSVC++.  This implementation is in the std:: namespace.

Instead it works with the old iostreams which are _not_ in the std::
namespace.

I.e., if I use std::istream, I get the incompatible iostreams.

The problem can _maybe_ be solved by ditching Standards<ToolKit> in
favour of the STLport version of SGI STL.

But suggesting to my coworkers that I spend time on this, when the
current Standards<ToolKit> setup is working, would be met with
incomprehension.



From andrewl at microsoft.com  Mon Dec  6 18:09:12 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:24 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <33D189919E89D311814C00805F1991F7F4A98C@RED-MSG-08>

Dan Brickley wrote:
> I believe it will be possible to annotate XML schemas with
> information
> for mapping into (generic or domain specific) application datamodels
> such as RDF. I don't think it is right to expect the hard-pressed XML
> Schema group to define all these mappings within that working group.

I agree.  There are probably many ways to express mappings. One candidate is
shown at the end of the "Schemas NG" paper.  See
http://www.lindamann.com/xml/XML%20Schemas%20NG%20Guide%20HTML.htm, and look
for the section titled "Mapping to Other Data Models."



From Daniel.Brickley at bristol.ac.uk  Mon Dec  6 19:41:45 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:25 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <33D189919E89D311814C00805F1991F7F4A98C@RED-MSG-08>
Message-ID: <Pine.GHP.4.21.9912061836240.26620-100000@mail.ilrt.bris.ac.uk>

On Mon, 6 Dec 1999, Andrew Layman wrote:

> Dan Brickley wrote:
> > I believe it will be possible to annotate XML schemas with
> > information
> > for mapping into (generic or domain specific) application datamodels
> > such as RDF. I don't think it is right to expect the hard-pressed XML
> > Schema group to define all these mappings within that working group.
> 
> I agree.  There are probably many ways to express mappings. One candidate is
> shown at the end of the "Schemas NG" paper.  See
> http://www.lindamann.com/xml/XML%20Schemas%20NG%20Guide%20HTML.htm, and look
> for the section titled "Mapping to Other Data Models."

This is interesting work, though it's unclear quite how it fits in with
the Canonical Format / Serializing Graphs paper. The 'Mapping to Other
Data Models' section of 'Schemas NG' shows one strategy for annotating
schemas to support directed labelled graph interchange in XML. It would
be good to see these two strategies drawn together in a single document
describing objects'n'properties DLG serialization strategies for XML
applications. By drawn together I mean having a common documented
model for the DLG representation rather than informal prose.

It is clear by now that the RDF 1.0 Syntax doesn't cut it as the One
True Graph Serialization for all XML applications. I don't think anybody
expected otherwise, but we now have general consensus [eg. 1] that a
more broadly usable DLG exchange syntax is needed by RDF apps. 
We have two proposals already floated on the RDF Interest
Group for alternate DLG-interchange syntaxes [2, 3] and their aims seem
to be basically the same as [4,5]: DLG interchange in XML.
It is also clear that a lot of (RDF-agnostic) XML data interchange apps
want to ship directed labelled graphs around using non-stilted XML
syntaxes. I've argued elsewhere [7] that these graphs will often want to
use URIs for edge types, node identifiers and node types in all but
tightly-coupled closed environments. 

My hope is that XML-DEV and the RDF Interest Group[6] will come up with
implementation-led proposals for XML DLG-interchange that both 
complement the XML Schema work (for mapping-based proposals) and fit
with colloquial (ie. mainstream) XML conventions (for serialisation syntaxes).
  
There's a bunch of interest in an improved syntax for RDF graph
serialization, and growing interest in more general XML DLG interchange
strategies layered on top of XML + XML Schemas. I have a hard time
thinking of these as different problems, hence my wish that the DLG
model mentioned in the schemas NG and canonical papers be documented a
bit more formally to aid comparison with similar proposals for a better 
RDF syntax...

Dan


Refs:
[1] http://www.w3.org/TR/1999/NOTE-schema-arch-19991007 (s3.8)
[2] http://lists.w3.org/Archives/Public/www-rdf-interest/1999Nov/0066.html
[3] http://lists.w3.org/Archives/Public/www-rdf-interest/1999Nov/0100.html
[4] http://www.biztalk.org/Resources/canonical.asp
[5] http://www.lindamann.com/xml/XML%20Schemas%20NG%20Guide%20HTML.htm#_ftn4
[6] http://www.w3.org/RDF/Interest/
[7] http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Dec-1999/0121.html




From clefebvre at advance-groupe.com  Mon Dec  6 19:58:30 1999
From: clefebvre at advance-groupe.com (Christophe Lefebvre)
Date: Mon Jun  7 17:18:25 2004
Subject: No subject
References: <14408.2610.245842.199581@localhost.localdomain> <m3so1gqu92.fsf@ifi.uio.no> <384B8C32.22805488@nag.co.uk>
Message-ID: <384C14CE.57E8A74C@advance-groupe.com>

unsubscribe




From andrewl at microsoft.com  Mon Dec  6 20:10:17 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:25 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <33D189919E89D311814C00805F1991F7F4A9A5@RED-MSG-08>

Thanks.  As a recap: There are, broadly, two approaches to serializing a
graph in XML. 

One is to invent a meta-grammar, a set of canonicalization rules.  That is
what RDF syntax did, what the attribute-centric and element-centric
canonical format papers do, and what SOAP section eight does. I think of
this as "tunnelling the graph through XML."

The other is to allow XML documents to follow any pattern described in a
schema, and to augment the schema with a set of mapping rules.

There appears to be significant value to each approach. (In particular,
however, I disagree with the sometimes-asserted claim that graphs capture
the semantics of a communication while grammars do not.  Graphs are just
another grammar.  This makes me reluctant to deprecate grammars.)

I agree that formal approaches to mapping would be helpful. I look forward
to reading your papers.



From jefftr at bellsouth.net  Mon Dec  6 20:29:10 1999
From: jefftr at bellsouth.net (Jeff Russell)
Date: Mon Jun  7 17:18:25 2004
Subject: GUI XML doc authoring tools
Message-ID: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>

Anybody know of any Windows GUI (or Linux, as a last resort) XML document
authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a
DTD.

Jeff Russell
jefftr@bellsouth.net




From mda at discerning.com  Mon Dec  6 20:45:53 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <m3zovsi0rr.fsf@localhost.localdomain>
Message-ID: <3886141114.944483978@MDAXKE>

A danger with adopting the convention that 1 of the 0 or more output
parameters is a return value is that it may interfere with a later
convention on error handling.

I haven't seen that discussed yet in your design principles.
Unlike java or perl, exceptions in C++ are a bit of a land mine,
and could also risk destroying any simple interop with a straight
C library, either above or below.

Not to mention the fact that there is no standard for cross-language
exception raising.

choices seem to be:
- return an error code
- return a boolean success/failure
- use C++ exceptions
- call an error handler and return 0 (which may not get run if the error handler aborts)
- some combination of the above, configurable by the programmer
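
To make the first four choices concrete, here is a toy sketch. Every name in
it is hypothetical, and the "parser" is just a stand-in that accepts any
non-empty string starting with '<'.

```cpp
#include <stdexcept>
#include <string>

// All names here are hypothetical; they only illustrate the choices above.
enum ParseStatus { PARSE_OK = 0, PARSE_SYNTAX_ERROR = 1 };

// Toy stand-in for a parser: accepts a non-empty string starting with '<'.
static bool wellFormed(const std::string &doc) {
    return !doc.empty() && doc[0] == '<';
}

// Choice 1: return an error code.
ParseStatus parseWithCode(const std::string &doc) {
    return wellFormed(doc) ? PARSE_OK : PARSE_SYNTAX_ERROR;
}

// Choice 2: return a boolean success/failure.
bool parseWithBool(const std::string &doc) {
    return wellFormed(doc);
}

// Choice 3: throw a C++ exception.
void parseOrThrow(const std::string &doc) {
    if (!wellFormed(doc)) throw std::runtime_error("syntax error");
}

// Choice 4: call an error handler, then return 0.
// The return is only reached if the handler itself returns.
typedef void (*ErrorHandler)(const char *message);
int parseWithHandler(const std::string &doc, ErrorHandler onError) {
    if (!wellFormed(doc)) {
        onError("syntax error");
        return 0;
    }
    return 1;
}
```

Only choices 1, 2 and 4 can be exposed unchanged through a plain C
interface, which is the interop concern raised above.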

btw, i'd like to register an objection to reference args. they make
code reading a bit of a pain because you cannot tell from the call whether
a copy constructor is going to be used or not -- you always have to
go hunt up the .h. with a pointer arg, it is always clear.

and as regards the character type question, that is a bit awkward because
a key goal for many programmers will be to use the "native" string type
used by the parser, which may be just linked in binary -- not recompiled.
of course, if we all just use expat, that is solved -- we have to have a
SAX/C++ type which directly points to expat's strings.

-mda

--On Friday, December 03, 1999 8:58 AM -0500 David Megginson <david@megginson.com> wrote:

> James Clark <jjc@jclark.com> writes:
> 
>> That's problematic for EntityResolver::resolveEntity; that requires that
>> ownership of an InputSource be transferred to the caller from the
>> callee.
>> 
>> This could be avoided by doing:
>> 
>> virtual const InputSource *
>> resolveEntity(const char *publicId,
>>               const char *systemId);
>> 
>> instead of:
>> 
>> virtual void
>> resolveEntity(const char *publicId,
>>               const char *systemId,
>>               InputSource &inputSource);
> 
> (I'll assume that James accidentally reversed the two).  The second
> one is a very good idea -- the only modification I'd make is to add a
> bool return value, so that the parser knows whether the resolver
> actually wants to override:
> 
> virtual bool
>  resolveEntity(const char *publicId,
>                const char *systemId,
>                InputSource &inputSource);
> 
> 
> All the best,
> 
> 
> David
> 
> -- 
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
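
For reference, the bool-returning variant quoted above could be filled in
roughly as follows. This is only a sketch: InputSource here is a stand-in
that keeps nothing but a system id, and LocalCatalog, the public id and the
file URL are all invented for the example.

```cpp
#include <cstring>
#include <string>

// Hypothetical stand-in for the draft's InputSource; only a system id is kept.
class InputSource {
public:
    void setSystemId(const std::string &id) { systemId_ = id; }
    const std::string &getSystemId() const { return systemId_; }
private:
    std::string systemId_;
};

class EntityResolver {
public:
    virtual ~EntityResolver() {}
    // Returns true if inputSource was filled in and should override the
    // parser's default. The caller owns inputSource throughout, so no
    // ownership changes hands.
    virtual bool resolveEntity(const char *publicId,
                               const char *systemId,
                               InputSource &inputSource) = 0;
};

// Example resolver: redirects one known public id to a local copy.
class LocalCatalog : public EntityResolver {
public:
    bool resolveEntity(const char *publicId,
                       const char *,
                       InputSource &inputSource) {
        if (publicId && std::strcmp(publicId, "-//EXAMPLE//DTD Doc//EN") == 0) {
            inputSource.setSystemId("file:///dtds/doc.dtd");
            return true;
        }
        return false;  // let the parser use its default resolution
    }
};
```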




From stele at fxtech.com  Mon Dec  6 21:07:43 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
References: <3886141114.944483978@MDAXKE>
Message-ID: <384C255C.6419E17@fxtech.com>

> - use C++ exceptions

I vote for C++ exceptions. That is why they are there.

> btw, i'd like to register an objection to reference args. they make
> code reading a bit of a pain because you cannot tell from the call whether
> a copy constructor is going to be used or not -- you always have to
> go hunt up the .h. with a pointer arg, it is always clear.

If you always pass by reference, this isn't a problem. In C++, there is
almost never a compelling reason to pass objects by value. Always using
references is nice because you can tell by looking at the prototype
whether the argument is optional (if it's a pointer, it's optional). If
you always use pointers you have to read the documentation to find out.
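
A small sketch of the convention, with a made-up Locator type and describe
function:

```cpp
#include <string>

// Hypothetical type, only to illustrate the convention.
struct Locator {
    int line;
};

// `name` is a const reference: it is required, and the caller's string is
// not copied. `where` is a pointer: it is optional and may be null.
std::string describe(const std::string &name, const Locator *where = 0) {
    std::string out = "<" + name + ">";
    if (where) out += " at line " + std::to_string(where->line);
    return out;
}
```

Reading the prototype alone tells you that the name must be supplied and the
location may be left out.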

--
Paul Miller - stele@fxtech.com



From dhunter at Mobility.com  Mon Dec  6 22:25:56 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:25 2004
Subject: A question on nomenclature
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC0170@cc20exch2.mobility.com>

<name>
  <first/>
  <middle/>
  <last/>
</name>

A simple question.  What is that?

My choices so far:
-an "application of XML", or possibly just "application", although this
would cause confusion with "application" as defined in the spec.
-a "vocabulary" (the one I personally use, although I may change after this
thread...)
-a "grammar"

Keep in mind I'm talking about the "structure" there, not the "instance" of
that "structure".  (I want to describe the "class", not the "object".)  I
have a feeling that there isn't a real consensus anywhere, and that
different people are using different names.  (Are there any others?  Do
people use "format", or something along those lines?  Or "class"?)

It's not something that I would ever have to worry about when using XML in
my applications, but if I were to, oh, I don't know, write a book about XML,
I'd want to create as little confusion as possible, so would I be safe in
calling the structure I created a "vocabulary"?  Do things get hairier if we
get into formats documented in DTDs/Schemas, and documents with no DTD or
Schema, or does the nomenclature stay the same?

Any thoughts or opinions would be appreciated.  Any documentation that I've
missed which states emphatically "this is what you would call it" would be
even more appreciated, but I don't think it's out there...

David Hunter
MobileQ 
david.hunter@mobileq.com 
http://www.MobileQ.com 



From mda at discerning.com  Mon Dec  6 22:19:11 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <384C255C.6419E17@fxtech.com>
Message-ID: <3891840459.944489678@MDAXKE>


>> - use C++ exceptions
> 
> I vote for C++ exceptions. That is why they are there.

Someone should dictate whether the exception objects are raised,
or pointers to them. Regardless, it is impossible for mere mortals
to use them without having leaks when they occur below constructors
and destructors. But I guess anyone using C/C++ already knows they
are taking such risks.
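
One common way to limit the risk, sketched under the assumption that
exceptions are thrown by value and caught by const reference, and that every
resource is held by a member object with its own destructor. ParseFailure,
Buffer and Parser are hypothetical names, not taken from the draft header.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical exception type: thrown by value, caught by const reference.
class ParseFailure : public std::runtime_error {
public:
    explicit ParseFailure(const std::string &msg) : std::runtime_error(msg) {}
};

struct Buffer {
    static int live;  // counts live buffers so that a leak is observable
    Buffer() { ++live; }
    ~Buffer() { --live; }
};
int Buffer::live = 0;

class Parser {
public:
    // buffer_ is fully constructed before the body throws, so its destructor
    // runs during unwinding even though ~Parser() itself never does.
    explicit Parser(bool fail) : buffer_(new Buffer) {
        if (fail) throw ParseFailure("cannot open input");
    }
private:
    std::unique_ptr<Buffer> buffer_;
};
```

Holding the buffer through a raw pointer instead would leak it whenever the
constructor throws, which is the trap described above.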

Don't get me wrong; i like exceptions in programming languages that
support them well.

I'm a little confused by the intent of the draft header, where
there is a SAXParseException class which is an argument to a handler.
Seems like if it is a native C++ exception, then the caller takes
care of catching it, not registering a handler.

I also wonder whether a handler (error handler or any other, like
document) is supposed to be able to call back into the Parser
and tell it to clean up.

I also might note that the current exception class appears to have
no member data indicating which parser or inputsource object is in use,
which would be an issue with a multi-threaded implementation, or
even a single-threaded one with multiple top-level instances.

> 
>> btw, i'd like to register an objection to reference args. they make
>> code reading a bit of pain because you cannot tell from the call whether
>> a copy constructor is going to be used or not -- you always have to
>> go hunt up the .h. with a pointer arg, it is always clear.
> 
> If you always pass by reference, this isn't a problem. In C++, there is
> almost never a compelling reason to pass objects by value.

Agreed. I guess it comes down to how much you trust other programmers.
If you trust them, then using pointerhood to encode optionality might
be useful. I guess I'm just too often forced to deal with C++ aficionados
who love nothing more than hiding several automatic class methods and
casts in every argument value.

-mda




From mlepage at antimeta.com  Mon Dec  6 23:01:16 1999
From: mlepage at antimeta.com (mlepage@antimeta.com)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <3891840459.944489678@MDAXKE>; from mda@discerning.com on Mon, Dec 06, 1999 at 02:14:38PM -0800
References: <384C255C.6419E17@fxtech.com> <3891840459.944489678@MDAXKE>
Message-ID: <19991206180059.A24592@antimeta.com>

On Mon, Dec 06, 1999 at 02:14:38PM -0800, Mark D. Anderson wrote:
> 
> >> - use C++ exceptions
> > 
> > I vote for C++ exceptions. That is why they are there.
> 
> Someone should dictate whether the exception objects are raised,
> or pointers to them. Regardless, it is impossible for mere mortals
> to use them without having leaks when they occur below constructors
> and destructors. But I guess anyone using C/C++ already knows they
> are taking such risks.
> 
> Don't get me wrong; i like exceptions in programming languages that
> support them well.

In C++, you throw exceptions by value and catch them by reference (see Meyers for details). So the exceptions themselves don't leak.

Since fully constructed objects are destructed during stack unwinding, there are no leaks there.

If you are doing things using pointers, etc., where allocated resources are not automatically freed (i.e. the pointer is freed but not the pointee), then yes you risk memory leaks. However, you should be using auto_ptr and other helpers to avoid that problem. This technique is discussed at length in Stroustrup and Meyers. So assuming you are using the helpers made available for you, properly, there are no memory leaks there.

Sutter's new book Exceptional C++ is just out, and details even more regarding exception safety, I presume.
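
A minimal sketch of the points above, under stated assumptions: ParseError, Buffer, parseChunk and tryParse are hypothetical names invented for illustration, not anything from the draft SAX/C++ header, and std::unique_ptr stands in for auto_ptr (which later C++ standards removed). The exception is thrown by value and caught by reference, and the heap object is released during stack unwinding because a smart pointer owns it.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical exception type; not the class from the draft SAX/C++ header.
class ParseError : public std::runtime_error {
public:
    explicit ParseError(const std::string& msg) : std::runtime_error(msg) {}
};

// Counts live Buffer objects so the absence of a leak can be observed.
static int liveBuffers = 0;

struct Buffer {
    Buffer() { ++liveBuffers; }
    ~Buffer() { --liveBuffers; }
};

// Owns a heap object through a smart pointer, then optionally fails.
// If it throws, stack unwinding destroys `buf`, which deletes the Buffer.
void parseChunk(bool fail) {
    std::unique_ptr<Buffer> buf(new Buffer);
    if (fail) {
        throw ParseError("unexpected end of input");  // thrown by value
    }
}

// Returns the error message, or an empty string on success.
std::string tryParse(bool fail) {
    try {
        parseChunk(fail);
    } catch (const ParseError& e) {  // caught by reference
        return e.what();
    }
    return "";
}
```

With a raw `new Buffer` and no owner, the throwing path would skip the matching delete; that is the leak the helpers are meant to prevent.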

-- 
Marc Lepage  (aka SEGV)
http://www.antimeta.com/
RTS game programming info, Minion open source game, etc.



From mda at discerning.com  Tue Dec  7 00:08:11 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: <19991206180059.A24592@antimeta.com>
Message-ID: <3898269654.944496107@MDAXKE>

I'm familiar with the solutions; it is just that i'd rather not have to
trust that every other C++ programmer has memorized Meyers.
Nor would I want to impose smart pointers, which are imho right up there
with vector and string classes in ways-to-learn-to-hate-c++.

if in fact SAX decides to support C++ exceptions rather than an error
handler, it would probably help to just give some examples that 
clarify correct usage of the exception classes, for the non-cognoscenti.

-mda

--On Monday, December 06, 1999 6:00 PM -0500 mlepage@antimeta.com wrote:

> On Mon, Dec 06, 1999 at 02:14:38PM -0800, Mark D. Anderson wrote:
>> 
>> >> - use C++ exceptions
>> > 
>> > I vote for C++ exceptions. That is why they are there.
>> 
>> Someone should dictate whether the exception objects are raised,
>> or pointers to them. Regardless, it is impossible for mere mortals
>> to use them without having leaks when they occur below constructors
>> and destructors. But I guess anyone using C/C++ already knows they
>> are taking such risks.
>> 
>> Don't get me wrong; i like exceptions in programming languages that
>> support them well.
> 
> In C++, you throw exceptions by value and catch them by reference (see Meyers for details). So the exceptions themselves don't leak.
> 
> Since fully constructed objects are destructed during stack unwinding, there are no leaks there.
> 
> If you are doing things using pointers, etc., where allocated resources are not automatically freed (i.e. the pointer is freed but not the pointee), then yes you risk memory leaks. However, you should be using auto_ptr and other helpers to avoid that problem. This technique is discussed at length in Stroustrup and Meyers. So assuming you are using the helpers made available for you, properly, there are no memory leaks there.
> 
> Sutter's new book Exceptional C++ is just out, and details even more regarding exception safety, I presume.




From jefftr at bellsouth.net  Tue Dec  7 01:45:17 1999
From: jefftr at bellsouth.net (Jeff Russell)
Date: Mon Jun  7 17:18:25 2004
Subject: A question on nomenclature
In-Reply-To: <805C62F55FFAD1118D0800805FBB428D02BC0170@cc20exch2.mobility.com>
Message-ID: <000a01bf4054$57347770$90fc4dd8@bhm.bellsouth.net>

Different people are describing it different ways. An "application of XML"
would be generic enough. "Application" would be one step higher, and so also
technically correct. "Format" would describe a particular document or specific
DTD/schema. Grammar or syntax is what the XML spec describes. A vocabulary
might be the specific "proprietary" set of tags used in a given document, DTD,
or schema.

"Class" is a technical word from XSL and CSS.

Jeff Russell

|-----Original Message-----
|From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
|Hunter, David
|Sent: Monday, December 06, 1999 4:26 PM
|To: 'XML-dev'
|Subject: A question on nomenclature
|
|
|<name>
|  <first/>
|  <middle/>
|  <last/>
|</name>
|
|A simple question.  What is that?
|
|My choices so far:
|-an "application of XML", or possibly just "application", although this
|would cause confusion with "application" as defined in the spec.
|-a "vocabulary" (the one I personally use, although I may change after this
|thread...)
|-a "grammar"
|
|Keep in mind I'm talking about the "structure" there, not the "instance" of
|that "structure".  (I want to describe the "class", not the "object".)  I
|have a feeling that there isn't a real consensus anywhere, and that
|different people are using different names.  (Are there any others?  Do
|people use "format", or something along those lines?  Or "class"?)
|
|It's not something that I would ever have to worry about when using XML in
|my applications, but if I were to, oh, I don't know, write a book about XML,
|I'd want to create as little confusion as possible, so would I be safe in
|calling the structure I created a "vocabulary"?  Do things get hairier if we
|get into formats documented in DTDs/Schemas, and documents with no DTD or
|Schema, or does the nomenclature stay the same?
|
|Any thoughts or opinions would be appreciated.  Any documentation that I've
|missed which states emphatically "this is what you would call it" would be
|even more appreciated, but I don't think it's out there...
|
|David Hunter
|MobileQ
|david.hunter@mobileq.com
|http://www.MobileQ.com
|




From jtauber at jtauber.com  Tue Dec  7 05:03:14 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:25 2004
Subject: A question on nomenclature
References: <805C62F55FFAD1118D0800805FBB428D02BC0170@cc20exch2.mobility.com>
Message-ID: <016501bf4070$415d1f80$0300000a@cygnus.uwa.edu.au>

> <name>
>   <first/>
>   <middle/>
>   <last/>
> </name>
>
> A simple question.  What is that?
>
> My choices so far:
> -an "application of XML", or possibly just "application", although this
> would cause confusion with "application" as defined in the spec.
> -a "vocabulary" (the one I personally use, although I may change after this
> thread...)
> -a "grammar"

It's a "schema". It happens to be in a particular schema language that is
kind of "schema-by-example" but it is a schema nevertheless. Schemas are
grammars so it's a "grammar" too.

I tend to use the term "vocabulary" to mean a set of element type (and their
attribute) names not necessarily with defined content specifications.
However, many people use "vocabulary" to mean the same as "schema" and
"grammar".

My AUD0.02

James Tauber




From sb at metis.no  Tue Dec  7 07:30:01 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: Paul Miller's message of "Mon, 06 Dec 1999 16:06:36 -0500"
References: <3886141114.944483978@MDAXKE> <384C255C.6419E17@fxtech.com>
Message-ID: <whd7sjgqdp.fsf@viffer.oslo.metis.no>

>>>>> Paul Miller <stele@fxtech.com>:

>> - use C++ exceptions

> I vote for C++ exceptions. That is why they are there.

Personally I think that C++ exceptions should only be used to signal
critical situations for the future execution of the program, not as a
normal matter of program flow control.

However, I think the ErrorHandler is a good idea, and have no problems
with exceptions being thrown for the cases where one hasn't defined and
registered an ErrorHandler.
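
A minimal sketch of that combination, assuming a hypothetical Parser with a report() method and a one-callback ErrorHandler; the class shapes here are invented for illustration and are not taken from the draft header. Errors go to the handler when one is registered, and are thrown only when none is.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the exception class in the draft header.
class SAXParseException : public std::runtime_error {
public:
    explicit SAXParseException(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical handler interface with a single callback.
class ErrorHandler {
public:
    virtual ~ErrorHandler() {}
    virtual void error(const SAXParseException& e) = 0;
};

class Parser {
public:
    Parser() : handler_(nullptr) {}
    void setErrorHandler(ErrorHandler* h) { handler_ = h; }

    // Reports a problem to the handler if one is registered, otherwise throws.
    void report(const std::string& msg) {
        SAXParseException e(msg);
        if (handler_) {
            handler_->error(e);
        } else {
            throw e;
        }
    }

private:
    ErrorHandler* handler_;  // not owned by the parser
};

// Example handler that only counts the errors it receives.
class CountingHandler : public ErrorHandler {
public:
    CountingHandler() : count(0) {}
    void error(const SAXParseException&) { ++count; }
    int count;
};
```

The caller chooses the style: register a handler to keep errors out of the normal control flow, or register nothing and wrap the parse call in try/catch.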



From sb at metis.no  Tue Dec  7 07:27:43 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX/C++: C++-specific design principles
In-Reply-To: "Mark D. Anderson"'s message of "Mon, 06 Dec 1999 12:39:38 -0800"
References: <3886141114.944483978@MDAXKE>
Message-ID: <whhfhvgqhw.fsf@viffer.oslo.metis.no>

>>>>> "Mark D. Anderson" <mda@discerning.com>:

> Unlike java or perl, exceptions in C++ are a bit of a land mine, 
[snip!]

See Items 9 through 15, and in particular Item 15 "Understand the costs
of exception handling", in Scott Meyers' "More Effective C++"
	http://www.awl.com/cseng/titles/0-201-63371-X/
for more detail on this.

> Not to mention the fact that there is no standard for cross-language
> exception raising.

> choices seem to be:
> - return an error code
> - return a boolean success/failure
> - use C++ exceptions
> - call an error handler and return 0 (which may not get run if the
>   error handler aborts)
> - some combination of the above, configurable by the programmer

Personally I'm partial to allocate the "return value" in the caller,
and give a reference argument to this value and return a status code,
rather than returning the value itself, e.g.
	bool getValue(int index, string& value);
rather than
	const string& getValue(int index);

The syntax is more clumsy, but the memory management is easier (I'm
also partial to allocate objects on the stack in the caller, rather
than doing new).
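
A small sketch of the first signature, assuming a hypothetical AttributeList class that exists only to show the calling convention. The caller owns the string, the callee fills it in, and the return value says whether it did.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical container, used only to illustrate the calling convention.
class AttributeList {
public:
    void add(const std::string& v) { values_.push_back(v); }

    // Writes the value into caller-owned storage and returns a status.
    // On failure `value` is left untouched.
    bool getValue(int index, std::string& value) const {
        if (index < 0 || index >= static_cast<int>(values_.size())) {
            return false;
        }
        value = values_[index];
        return true;
    }

private:
    std::vector<std::string> values_;
};
```

A typical call site would declare `std::string v;` on the stack and test `if (attrs.getValue(0, v))` before using `v`, so no reference into the container outlives the call.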



From larsga at garshol.priv.no  Tue Dec  7 08:23:12 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:25 2004
Subject: GUI XML doc authoring tools
In-Reply-To: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>
References: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>
Message-ID: <m31z8zgnwx.fsf@ifi.uio.no>


* Jeff Russell
|
| Anybody know of any Windows GUI (or Linux, as a last resort) XML
| document authoring tools? Something like SoftQuad's XMeTaL, but that
| doesn't require a DTD.

You can find a list of free XML editors for all platforms at

  <URL: http://www.stud.ifi.uio.no/~lmariusg/linker/XMLtools.html#SC_XMLEditors >

--Lars M.




From sb at metis.no  Tue Dec  7 08:41:50 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX and <!DOCTYPE>
Message-ID: <whln77f8hq.fsf@viffer.oslo.metis.no>

Is there something in SAX that gives the information in the <!DOCTYPE> 
declaration to the application?

I've scratched my head over DTDHandler
	http://www.megginson.com/SAX/javadoc/org.xml.sax.DTDHandler.html
but I haven't found anything that looks like it there.  I've looked at 
the source code for XMLNorm/XMLWriter, and it doesn't look like it is
anywhere there.  The top level element is used as the name.

But when looking at the original announcement for XMLNorm:
	http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Jul-1999/0346.html
there's a link to an XML document that has a DOCTYPE with both a
PUBLIC identifier and a SYSTEM identifier
	http://home.sprynet.com/~dmeggins/texts/darkness/darkness.xml

Or maybe the above document is _before_ processing with XMLNorm...?

Hm... searches on the net also gave me some discussions from January
1998 that seem to indicate that a <!DOCTYPE> declaration isn't good
enough as a document type identifier, but there didn't seem to be any
conclusions (at least I didn't find them).

What I need the document type for, is to set the appropriate
DocumentHandler in the parser.  I'm assuming that this is something
others would like to do.  If there is no DOCTYPE information
available, how is it done?



From larsga at garshol.priv.no  Tue Dec  7 09:46:48 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:25 2004
Subject: SAX and <!DOCTYPE>
In-Reply-To: <whln77f8hq.fsf@viffer.oslo.metis.no>
References: <whln77f8hq.fsf@viffer.oslo.metis.no>
Message-ID: <m3r9gzdqx0.fsf@ifi.uio.no>


* Steinar Bang
|
| Is there something in SAX that gives the information in the
| <!DOCTYPE> declaration to the application?

Not in 1.0, as this was considered lexical rather than logical
information. (It's optional in the infoset WD.)

| I've scratched my head over DTDHandler
| 	http://www.megginson.com/SAX/javadoc/org.xml.sax.DTDHandler.html
| but I haven't found anything that looks like it there.  

DTDHandler serves a very narrow purpose: to pass on to the application
exactly what the XML rec requires processors to pass on.

SAX 2.0, however, does have this in the LexicalHandler.startDTD callback.

| What I need the document type for, is to set the appropriate
| DocumentHandler in the parser.  I'm assuming that this is something
| others would like to do.  If there is no DOCTYPE information
| available, how is it done?

It depends on the situation. The XSA client, which needs to accept
both XSA and OSD documents but can't tell them apart before parsing
begins, uses a DispatchingDocHandler, which has a hash of
DocumentHandlers keyed on the name of the document element. In this
very restricted case that worked just fine.

In other cases one might perhaps key on the namespace of the document
element, and with SAX 2 one could use the public identifier of the
DOCTYPE declaration.
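
A rough C++ sketch of the dispatching idea, not the XSA client's actual code. It assumes a hypothetical DocumentHandler interface reduced to a single startElement callback, and registerHandler and RecordingHandler are names invented here. The first element seen picks the target handler, and every event from then on is forwarded to it.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical C++ rendering of a SAX 1.0-style handler; one callback only.
class DocumentHandler {
public:
    virtual ~DocumentHandler() {}
    virtual void startElement(const std::string& name) = 0;
};

// Forwards events to a handler chosen by the name of the document element.
class DispatchingDocHandler : public DocumentHandler {
public:
    DispatchingDocHandler() : current_(nullptr), seenRoot_(false) {}

    void registerHandler(const std::string& rootName, DocumentHandler* h) {
        handlers_[rootName] = h;
    }

    void startElement(const std::string& name) {
        if (!seenRoot_) {
            // The first start tag is the document element: choose a handler.
            seenRoot_ = true;
            auto it = handlers_.find(name);
            if (it != handlers_.end()) {
                current_ = it->second;
            }
        }
        // Events for an unrecognized document element are silently dropped.
        if (current_) {
            current_->startElement(name);
        }
    }

private:
    std::map<std::string, DocumentHandler*> handlers_;  // handlers not owned
    DocumentHandler* current_;
    bool seenRoot_;
};

// Example target handler that records the element names it receives.
class RecordingHandler : public DocumentHandler {
public:
    void startElement(const std::string& name) { seen += name + ";"; }
    std::string seen;
};
```

Keying on a namespace or a public identifier instead would only change what is used as the map key and which event triggers the lookup.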

--Lars M.




From ht at cogsci.ed.ac.uk  Tue Dec  7 13:37:22 1999
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:18:25 2004
Subject: GUI XML doc authoring tools
In-Reply-To: "Jeff Russell"'s message of "Mon, 6 Dec 1999 14:26:48 -0600"
References: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>
Message-ID: <f5b66yag9dx.fsf@cogsci.ed.ac.uk>

"Jeff Russell" <jefftr@bellsouth.net> writes:

> Anybody know of any Windows GUI (or Linux, as a last resort) XML document
> authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a
> DTD.

I like XED [1], but that's not too surprising:  I wrote it :-).

ht

[1] http://www.ltg.ed.ac.uk/~ht/xed.html
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/



From stele at fxtech.com  Tue Dec  7 16:38:39 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:25 2004
Subject: nestable C/C++ XML parser?
Message-ID: <384D3843.F3F7AFB9@fxtech.com>

I'm trying to develop a tag-based front-end to expat and having no luck.
I'd like to be able to parse an XML document in nestable chunks, by
calling into a nestable parser. In other words, I'd like to start
parsing, then branch to a function to handle a specific element, parsing
in there until that element is closed, then fall back out of the
function to continue parsing the rest of the document.

Something like this:

ParseDocument (call HandleFoo when Foo element is found)

	HandleFoo()
		ParseFoo
		// do something with Foo stuff here

FinishParseDocument

Has anyone seen such a beast?

--
Paul Miller - stele@fxtech.com



From larsga at garshol.priv.no  Tue Dec  7 17:11:17 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:25 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: <384D3843.F3F7AFB9@fxtech.com>
References: <384D3843.F3F7AFB9@fxtech.com>
Message-ID: <m37liqad77.fsf@ifi.uio.no>


* Paul Miller
|
| In other words, I'd like to start parsing, then branch to a function
| to handle a specific element, parsing in there until that element is
| closed, then fall back out of the function to continue parsing the
| rest of the document.

More people than you have been asking for this, but this is quite
simply not the way XML is meant to work. XML is a standardized syntax,
and because of that it makes no sense to let application developers do
part of the parsing, since they are likely to get parts of it wrong
and since the syntax is standardized there is no reason not to let the
parser handle it for you. (You would in any case only duplicate its
standard-decreed way of parsing.)

The only application I see for this sort of thing is to be able to
work around XML syntax rules, but once you do that your document is no
longer an XML document and you shouldn't pretend that it is, not even
to yourself. (Imagine what happens when an XML repository, XML editor,
XML browser or an XSLT engine tries to work with your "XML" document.)

In other words, when you find yourself doing this you should very
likely explain why to experienced XML developers and then ask them how
one usually handles this sort of thing _within_ XML, or else abandon
any pretense of using XML entirely.

--Lars M.




From schen at falconwing.com  Tue Dec  7 17:19:32 1999
From: schen at falconwing.com (Sean Chen)
Date: Mon Jun  7 17:18:25 2004
Subject: GUI XML doc authoring tools
In-Reply-To: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>
Message-ID: <Pine.LNX.3.96.991207114025.21461A-100000@www.falconwing.com>

Hi Jeff, everyone,

On Mon, 6 Dec 1999, Jeff Russell wrote:

> Anybody know of any Windows GUI (or Linux, as a last resort) XML document
> authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require a
> DTD.

You can try my Java-based Athame XML editor, which is currently in early
stages of development.  Its main feature is XSLT support using James
Clark's XT.  I've used it to write a couple hundred pages of courseware
but it's rough around the edges.

http://falconwing.com/~schen/

It comes bundled with the DocBk XML and XSLT stylesheets.

. . . Sean.




From Toby.Speight at streapadair.freeserve.co.uk  Tue Dec  7 17:25:35 1999
From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight)
Date: Mon Jun  7 17:18:25 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: Lars Marius Garshol's message of "07 Dec 1999 18:11:08 +0100"
References: <384D3843.F3F7AFB9@fxtech.com> <m37liqad77.fsf@ifi.uio.no>
Message-ID: <uk8mqek8g.fsf@lanber.cam.citrix.com>

Lars> Lars Marius Garshol <URL:mailto:larsga@garshol.priv.no>

0> In article <m37liqad77.fsf@ifi.uio.no>, Lars wrote:

Lars> The only application I see for this sort of thing is to be able
Lars> to work around XML syntax rules,

I see a demand for parsing a document with SAX, but using some
start-tags to switch to building DOM (or DOM-like) objects, returning
to stream-oriented processing afterwards.  Perhaps you have a large
"set" or "list", and you know that the members of that collection can
be processed independently - why waste memory on a complete DOM for
that?
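
A minimal sketch of that hybrid, under loud assumptions: Node and SubtreeBuilder are hypothetical, the events are plain method calls rather than callbacks from a real parser, and only element names are tracked. Outside the chosen element nothing is stored; inside it a small tree is built, handed off when the element closes, and then discarded.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Minimal DOM-like node; hypothetical, not a real DOM API.
struct Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Node(const std::string& n) : name(n) {}
};

// Receives SAX-like events. When an element named `target` starts, it builds
// a tree for that element only, then returns to plain streaming.
class SubtreeBuilder {
public:
    explicit SubtreeBuilder(const std::string& target)
        : completed(0), lastChildCount(0), target_(target) {}

    void startElement(const std::string& name) {
        if (stack_.empty()) {
            if (name != target_) return;  // streaming mode: nothing is kept
            root_.reset(new Node(name));
            stack_.push_back(root_.get());
        } else {
            Node* parent = stack_.back();
            parent->children.push_back(std::unique_ptr<Node>(new Node(name)));
            stack_.push_back(parent->children.back().get());
        }
    }

    void endElement(const std::string&) {
        if (stack_.empty()) return;  // streaming mode
        stack_.pop_back();
        if (stack_.empty()) {
            // Subtree complete: a real application would process it here.
            lastChildCount = root_->children.size();
            ++completed;
            root_.reset();  // free the subtree before streaming continues
        }
    }

    int completed;               // number of subtrees built so far
    std::size_t lastChildCount;  // direct children of the last subtree

private:
    std::string target_;
    std::unique_ptr<Node> root_;
    std::vector<Node*> stack_;  // open elements inside the current subtree
};
```

Memory use is then bounded by the largest single member of the collection rather than by the whole document.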

Lars> but once you do that your document is no longer an XML document
Lars> and you shouldn't pretend that it is, not even to yourself.

This bit I agree with.

-- 




From greynolds at datalogics.com  Tue Dec  7 17:26:48 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun  7 17:18:25 2004
Subject: A question on nomenclature
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B1702F@martinique>

> -----Original Message-----
> From: Hunter, David [mailto:dhunter@Mobility.com]
> Sent: Monday, December 06, 1999 4:26 PM
> 
> <name>
>   <first/>
>   <middle/>
>   <last/>
> </name>
> 
> A simple question.  What is that?
> 

It's a string.  If you view it from the perspective of formal grammar, then
it's a sentence in any language whose grammar defines it as such; an
infinite number of such grammars are definable using XML DTDs.  But it is also
a sentence in a language whose grammar stipulates that all sentences
sandwich "dd" between any two strings.  Plus an infinite number of other
languages, including the one whose only legal sentence is just that string.

> My choices so far:
> -an "application of XML", or possibly just "application", although this
> would cause confusion with "application" as defined in the spec.

Right.  "XML Application" is marketing weaselspeak.  There are no XML
applications, only languages (grammars) defined using XML.  (Ever hear of an
"SQL application"?)

> -a "vocabulary" (the one I personally use, although I may change after this
> thread...)

Makes a certain intuitive sense, but I'd say vocab is better left to mean
the words instead of the sentences - i.e., it's tied up with the concepts
being modeled, in this case various kinds of names.

> -a "grammar"
> 

Nope.  Grammars is rules.  What you've written doesn't express any rules;
you've got to have a metalanguage to have a grammar, too.

> Keep in mind I'm talking about the "structure" there, not the "instance" of
> that "structure".  (I want to describe the "class", not the "object".)  I

Not sure what you mean.  I take it you're after the structural
"interpretation", as it were, of the instance.  

> I'd want to create as little confusion as possible, so would I be safe in
> calling the structure I created a "vocabulary"?  Do things

I think you'd run into trouble eventually, since one generally uses tagnames
with a recognizable meaning in ordinary discourse.  So you'd end up with
"register confusion": uncertainty about when "vocab" means formal
grammatical structures, and when it means the semantic realities being
modeled by those structures.

> 
> Any thoughts or opinions would be appreciated.  Any documentation that I've
> missed which states emphatically "this is what you would call it" would be
> even more appreciated, but I don't think it's out there...

Assuming you're interested in Truth and Clarity, I'd look in the section on
formal languages and mathematical logic, and avoid industry-generated stuff,
which tends to be rather solipsistic.  Stoy's classic "Denotational
Semantics" (you can get it through Amazon etc.) is very helpful in
clarifying the relationship between syntax and "meaning".  Also try Spivey's
ZRM (http://spivey.oriel.ox.ac.uk/~mike/zrm/).  Neither of these directly
deals with XML, but XML is a specific case of a more general phenomenon;
reading those two works in particular was a huge help for me at least in
understanding the foundations of XMLdom.  Caveat:  when you hear the word
"architecture", reach for your revolver.

-gregg



From rhelton at rhythms.net  Tue Dec  7 17:43:48 1999
From: rhelton at rhythms.net (rhelton@rhythms.net)
Date: Mon Jun  7 17:18:25 2004
Subject: No subject
Message-ID: <916BA3451A99D2118FCC0090272ABD2F031073C7@CAXIXI>

unsubscribe

--Rich Helton--
Rhythms EAI Architecture
ext 2913




From rhelton at rhythms.net  Tue Dec  7 17:46:00 1999
From: rhelton at rhythms.net (rhelton@rhythms.net)
Date: Mon Jun  7 17:18:25 2004
Subject: No subject
Message-ID: <916BA3451A99D2118FCC0090272ABD2F031073C9@CAXIXI>

unsubscribe

--Rich Helton--
Rhythms EAI Architecture
ext 2913




From larsga at garshol.priv.no  Tue Dec  7 17:53:26 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:25 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: <uk8mqek8g.fsf@lanber.cam.citrix.com>
References: <384D3843.F3F7AFB9@fxtech.com> <m37liqad77.fsf@ifi.uio.no> <uk8mqek8g.fsf@lanber.cam.citrix.com>
Message-ID: <m390368wo3.fsf@ifi.uio.no>


* Lars Marius Garshol
| 
| The only application I see for this sort of thing is to be able to
| work around XML syntax rules,

* Toby Speight
| 
| I see a demand for parsing a document with SAX, but using some
| start-tags to switch to building DOM (or DOM-like) objects, returning
| to stream-oriented processing afterwards.  

Sure, I too see a need for this, and I've even implemented it.
However, this is something completely different from doing parsing on
behalf of the parser. Parsing is turning a stream of bytes (or
characters) into something higher-level, but this is not what you are
talking about.

As far as I understood him, the original poster wanted to do the
parsing (that is, the reading and interpretation of bytes/chars) on
behalf of expat. 

--Lars M.




From wunder at infoseek.com  Tue Dec  7 17:55:12 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:25 2004
Subject: A processing instruction for robots
In-Reply-To: <001201bf3e1f$074601c0$099918d1@docuverse1>
References: <3.0.5.32.19991203085516.03ce3de0@corp.infoseek.com>
Message-ID: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>

At 10:12 PM 12/3/99 -0800, Don Park wrote:
>Walter,
>
>Could you elaborate your decision to use PI rather than
>element(s)?

Lars did a pretty good job, but I'll elaborate anyway.

This is information for a specific kind of XML processor
(an indexing robot), but it is not specific to the document
type. So we need a mechanism that applies to any XML document
and can be automatically ignored by non-robot processors.
A PI is an exact fit. Even the name is right -- it is an
instruction to the robot about how to process it.

The alternative, adding an element to every DTD in the 
universe, with the corresponding breakage to every processor
that reads those DTDs, is just too awful to contemplate.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946



From wunder at infoseek.com  Tue Dec  7 18:10:49 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:25 2004
Subject: A processing instruction for robots
In-Reply-To: <m3yab8qv35.fsf@ifi.uio.no>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
 <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <3.0.5.32.19991207101003.00bfc100@corp.infoseek.com>

At 10:25 AM 12/6/99 +0100, Lars Marius Garshol wrote:
>
>First thought: this is fine for very simple uses, but for more complex
>uses something along the lines of the robots.txt file would be very
>nice. How about a variant PI that can point to a robots.rdf resource?

Two reasons, one based on keeping it very simple for authors,
and one on keeping it simple for robot implementors.

In our experience, the simple form covers almost all needs.
We have 1000+ customers, and only three or four of them use
our selective indexing support. So, I think of the robots
meta tag as a proven solution that doesn't need major improvement.

Secondly, fetching two or more entities for one document makes
the robot code much more complex. If the robots.rdf file gets
a 404, what happens? What about a 401 or a timeout? The robot
may need separate last-modified dates and revisit times for
each entity. And after it is implemented and tested, how do you
explain all that to customers who just want search results?

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946



From vilya at nag.co.uk  Tue Dec  7 18:17:00 1999
From: vilya at nag.co.uk (Vilya Harvey)
Date: Mon Jun  7 17:18:25 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com> <m37liqad77.fsf@ifi.uio.no> <uk8mqek8g.fsf@lanber.cam.citrix.com> <m390368wo3.fsf@ifi.uio.no>
Message-ID: <384D4F32.F1CC0BC0@nag.co.uk>

Lars Marius Garshol wrote:
> 
> * Lars Marius Garshol
> |
> | The only application I see for this sort of thing is to be able to
> | work around XML syntax rules,
> 
> * Toby Speight
> |
> | I see a demand for parsing a document with SAX, but using some
> | start-tags to switch to building DOM (or DOM-like) objects, returning
> | to stream-oriented processing afterwards.
> 
> Sure, I too see a need for this, and I've even implemented it.
> However, this is something completely different from doing parsing on
> behalf of the parser. Parsing is turning a stream of bytes (or
> characters) into something higher-level, but this is not what you are
> talking about.

Not exactly right. Parsing deals with a sequence of *tokens*; in the
programming world these tokens are usually the result of lexical analysis
of a sequence of characters, but they don't *have* to be. The tokens in
question could be XML entities, for example...

> As far as I understood him, the original poster wanted to do the
> parsing (that is, the reading and interpretation of bytes/chars) on
> behalf of expat.

I think there has been some miscommunication due to the fact that there
are really two distinct levels of parsing that can take place with XML.
There is the parsing which turns a sequence of characters in some encoding
into a particular XML entity or sequence of entities; and then there is
the parsing which interprets a sequence of XML tokens to derive some
application- or domain-specific meaning. I suspect it may have been the
second type of parsing that the original poster was referring to.

Bye,
Vil.
(Not speaking for my employer.)
-- 
Vilya Harvey  <vilya@nag.co.uk>    Wilkinson House  Mob: +44  961 106 505
Computational Mathematics Group   Jordan Hill Road   Wk: +44 1865 511 245
NAG Limited                    Oxford  UK  OX2 8DR  Fax: +44 1865 311 205



From wunder at infoseek.com  Tue Dec  7 18:15:53 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:25 2004
Subject: A processing instruction for robots
In-Reply-To: <m3yab8qv35.fsf@ifi.uio.no>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
 <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>
Message-ID: <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>

At 10:25 AM 12/6/99 +0100, Lars Marius Garshol wrote:
>
>Second thought: "and the index attribute must be first". This is nice
>for implementors, but is likely to clash with the expectations of
>users and the cost of more generality is very low for implementors.

I'm open to changing this, but I thought I would start
with the most strict version. The advantage of the strict
version is that it doesn't need to be parsed. The Desperate
Perl Hacker can do four regex compares for the four variants
and get back to work.

Maybe folks who've worked with authors on SGML systems have
some relevant experience. Is this too strict for folks that
aren't tamed by computers?

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946



From stele at fxtech.com  Tue Dec  7 18:19:21 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com> <m37liqad77.fsf@ifi.uio.no> <uk8mqek8g.fsf@lanber.cam.citrix.com> <m390368wo3.fsf@ifi.uio.no>
Message-ID: <384D4FE7.C81F2CCE@fxtech.com>

> Sure, I too see a need for this, and I've even implemented it.
> However, this is something completely different from doing parsing on
> behalf of the parser. Parsing is turning a stream of bytes (or
> characters) into something higher-level, but this is not what you are
> talking about.
> 
> As far as I understood him, the original poster wanted to do the
> parsing (that is, the reading and interpretation of bytes/chars) on
> behalf of expat.
 
This is more or less correct. I want to use XML as an application data
file format. Why? Two primary reasons:
1. I don't need/want to invent a new syntax - I like XML just fine and
it handles object-oriented nesting of data quite nicely
2. I can publish a DTD and make it easier for my end-users to use my
application data in their own applications (I work in special effects
applications, and certain high-end customers like to use my data in
their own custom tools) without doing a lot of hand-holding

Whether this constitutes a "good enough" reason to use XML I don't know.
The primary use of XML seems to be web-oriented e-commerce stuff, which
I don't give a hoot about (I'll leave that stuff to the web
experts).

Given my needs, I know the data in the XML file, and I know what to do
with it once I get to it. But I *do not* want to go with the huge
complexity of DOM. I've indicated in a previous thread the kind of API
I'd like to access the data. I was hoping expat would let me do nested
parsing, but it doesn't.

Frankly, for this kind of application file format stuff, validation and
namespaces probably aren't really necessary, but I want to use the XML
syntax mostly because it's well defined. This means I'll probably have
to implement my own restartable low-level "parser" which just deals with
the syntax and the basics. I was hoping to layer on expat, for no other
reason than to gain free validation once expat gets it, but considering
my needs this probably isn't necessary, and it's just a bit more work
for me.

From this (and other discussions) it looks like this type of XML parser
for application data would be generally useful (in the C/C++ community),
so I'll be sure to make my efforts available.

--
Paul Miller - stele@fxtech.com



From dhunter at Mobility.com  Tue Dec  7 18:22:29 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:26 2004
Subject: A question on nomenclature
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC017D@cc20exch2.mobility.com>

From: Reynolds, Gregg [mailto:greynolds@datalogics.com]
Sent: Tuesday, December 07, 1999 12:26 PM
> > Keep in mind I'm talking about the "structure" there, not the 
> > "instance" of
> > that "structure".  (I want to describe the "class", not the 
> > "object".)
> 
> Not sure what you mean.  I take it you're after the structural
> "interpretation", as it were, of the instance.  

Sorry, I'll try to state my case better.  :-)  I mean that I want to
describe a "class" of XML documents, in which the root element will be
<name>, and <name> will have three child elements, <first>, <middle>, and
<last>.  I'm describing that class by writing this:

<name>
  <first/>
  <middle/>
  <last/>
</name>

(James Tauber described this as a "schema-by-example") but what I really
want is the name that I would call that "class" of XML documents.  I may end
up with an "instance" of that class, which happens to look exactly like that
thing above, because it has no information for <first>, <middle>, or <last>,
but I don't care about that.  I'm still leaning toward "vocabulary", because
that still seems to describe it best, but I'm still open to alternatives.  (I think
"schema" is probably correct for what I'm trying to do as well, but that
would confuse readers with XML Schemas, which are just one type of "schema
description language"...)



From ebohlman at netcom.com  Tue Dec  7 18:45:55 1999
From: ebohlman at netcom.com (Eric Bohlman)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: <m390368wo3.fsf@ifi.uio.no>
Message-ID: <Pine.GSU.4.10.9912071036540.9271-100000@netcom2.netcom.com>

On 7 Dec 1999, Lars Marius Garshol wrote:
> Sure, I too see a need for this, and I've even implemented it.
> However, this is something completely different from doing parsing on
> behalf of the parser. Parsing is turning a stream of bytes (or
> characters) into something higher-level, but this is not what you are
> talking about.
> 
> As far as I understood him, the original poster wanted to do the
> parsing (that is, the reading and interpretation of bytes/chars) on
> behalf of expat. 

I've got a hunch that what he really wanted to do was "pull" the
higher-level somethings rather than have them "pushed" at him, i.e. call a
function to get the next something rather than have the parser make a
callback, presumably because he needs to maintain some state and he'd like
to do it via flow-of-control rather than setting and testing state
variables.

If that's the case, he'd be better off building a wrapper that would feed
the input incrementally to expat and buffer up events, with the whole
thing driven by a "get next token" function that would return something
similar to a line of ESIS data.
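
A minimal C++ sketch of that kind of wrapper, under stated assumptions: it
is not tied to expat, and the Event and PullAdapter names are invented for
illustration. The `feed` callback is a hypothetical stand-in for "hand the
next chunk of input to the push parser"; with expat that would presumably
be a call to XML_Parse whose handlers call `emit`.

```cpp
#include <deque>
#include <functional>
#include <string>
#include <utility>

// One buffered parse event, loosely modelled on a line of ESIS output.
struct Event {
    enum Kind { Start, End, Text } kind;
    std::string data;  // element name or character data
};

// Hypothetical pull adapter over a push parser.
class PullAdapter {
public:
    // Called by the push side once per event.
    typedef std::function<void(const Event&)> Emit;
    // Feeds one more chunk to the push parser, calling emit for each event
    // that chunk produces. Returns false once the input is exhausted.
    typedef std::function<bool(const Emit&)> Feed;

    explicit PullAdapter(Feed feed) : feed_(std::move(feed)), done_(false) {}

    // "Get next token": fills out and returns true, or returns false at
    // end of input. Input is fed only when the buffer has run dry.
    bool next(Event& out) {
        while (queue_.empty() && !done_) {
            Emit emit = [this](const Event& e) { queue_.push_back(e); };
            done_ = !feed_(emit);
        }
        if (queue_.empty()) return false;
        out = queue_.front();
        queue_.pop_front();
        return true;
    }

private:
    Feed feed_;
    std::deque<Event> queue_;
    bool done_;
};
```

The caller then drives everything with a loop around next() and can keep
its state in ordinary flow of control instead of in state variables.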




From stele at fxtech.com  Tue Dec  7 18:56:42 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References: <Pine.GSU.4.10.9912071036540.9271-100000@netcom2.netcom.com>
Message-ID: <384D58A7.2AA6493F@fxtech.com>

> I've got a hunch that what he really wanted to do was "pull" the
> higher-level somethings rather than have them "pushed" at him, i.e. call a
> function to get the next something rather than have the parser make a
> callback, presumably because he needs to maintain some state and he'd like
> to do it via flow-of-control rather than setting and testing state
> variables.

No, I did want things pushed at me (via callbacks), but I want the
opportunity to do some object-specific processing "inside" one of the
callbacks, after the next set of *nested* elements has been processed. This
requires a nestable parser, where I can pick up the parsing inside a
different scope. One of the examples I presented yesterday sort of
illustrates what I want to accomplish.

--
Paul Miller - stele@fxtech.com



From rhanson at blast.net  Tue Dec  7 19:03:24 1999
From: rhanson at blast.net (Robert Hanson)
Date: Mon Jun  7 17:18:26 2004
Subject: A processing instruction for robots
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com><3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>
Message-ID: <013b01bf40e4$c25d7890$0cb919ce@INTERNETDEPT>

> I'm open to changing this, but I thought I would start
> with the most strict version. The advantage of the strict
> version is that it doesn't need to be parsed.

> The Desperate Perl Hacker can do four regex compares
> for the four variants and get back to work.

I guess that depends on how "desperate" they are.  It is relatively easy to do
no matter what order the attributes are in.  I would suggest not specifying an
order unless you can think up a better reason for keeping it.

Below is sample code with the output...  (notice the 8 examples with varying
attribute orders and values... but only 2 regexes).

----------------------------------
# Tested Perl code
my @examples = (
 '<?robots index="yes" follow="yes"?>',
 '<?robots index="no"  follow="yes"?>',
 '<?robots index="yes" follow="no" ?>',
 '<?robots index="no"  follow="no" ?>',
 '<?robots follow="yes" index="yes"?>',
 '<?robots follow="no"  index="yes"?>',
 '<?robots follow="yes" index="no" ?>',
 '<?robots follow="no"  index="no" ?>'
 );

foreach my $ex1 ( @examples )
 {
 if ( $ex1 =~ /<\?robots((?:\s+(?:index|follow)="(?:yes|no)"){2})\s*\?>/ )
  {
  my %tmp;
  while ( $ex1 =~ /\s+(index|follow)="(yes|no)"/g )
   {
   $tmp{$1} = $2;
   }
  print "Follow: $tmp{follow}  Index: $tmp{index}\n";
  }
 }

==== OUTPUT ====
Follow: yes  Index: yes
Follow: yes  Index: no
Follow: no  Index: yes
Follow: no  Index: no
Follow: yes  Index: yes
Follow: no  Index: yes
Follow: yes  Index: no
Follow: no  Index: no
===============




From roddey at us.ibm.com  Tue Dec  7 20:42:31 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:26 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com>


Here's my take on it... Note that what I'm saying here reflects the
necessities of supporting really bad C++ implementations, not my personal
feelings. If it were up to me, I'd say use every modern service of C++ and
those who don't have a compliant C++ implementation can have a good reason to
get one. But, by an unfortunate decision, I was not made the ruler of the
world... Go figure!

1) I don't mind that we just start off with SAX2 I guess. It makes sense
this late in the game perhaps to just concentrate on SAX2.

2) We would prefer that all data come out of the SAX interfaces as raw
wchar_t strings. This is the most flexible mechanism and does not lock
people into using any particular implementation of a string object. It also
has the highest potential performance for those folks who never need to put
it into anything more formal than a raw array.

3) We agree with the basic desire to avoid object ownership issues, but
wouldn't worry about them if they are well documented. Object ownership is
just a fundamental issue in C++ and if you don't understand it you
probably are going to blow your own foot off no matter what.

4) We would be concerned about some of the SAX2 stuff wrt setting features
(I think it's features) via an abstracted object interface because it's a
little bit sticky. It can be done, but the point still arises of where does
the desirability of being the same as the Java interface end and the
desirability of having a very natural interface for your own language
begin? I.e. just don't make it so Java-esque that it requires a lot of
trickery to make work on C++. Don't require some common base class.

5) If you wanted to templatize the interface over the character type, we
wouldn't mind particularly. But, considering that any implementation of the
interface would *always* use the same instantiation, why bother? Just
typedef the character type and let each implementation drive it. It's not
likely that a particular build of a particular implementation would need to
change this on the fly, right?

6) The issue of handler ownership is something we punted on. As far as we
are concerned, handlers installed on the SAXParser belong to the caller
because in most cases one object implements a number of handlers.

7) The names of methods of the handlers need to be non-ambiguous to avoid
problems. So DocType handlers should use DocTypeCharacters() or
DTDCharacters() or whatever, and Document handlers should use
DocCharacters() or some such thing. It's just not worth the paranoia of how
implementations would deal with multiple mixed in interfaces having the
same named methods. If the processing should be common, the class
implementing both handlers can delegate to a private method.

8) I disagree with the contention that unsigned shouldn't be used in
interfaces. If the thing being modeled is unsigned, use unsigned because
you are modelling the type desired. I would personally typedef (by logical
usage) all of the fundamental types used by the interfaces and let the
implementation drive them.

9) APIs such as getType() or getValue() should return a "const wchar_t*" so
that the caller uses the returned value directly. The overhead of copying
the return (and having to clean it up) would probably be unacceptable.
(Actually, wchar_t would be some defined type that is driven by the
implementation.) Yes, this involves ownership issues, but as I said, this is
fundamental to C++, so people should probably just 'get over it' :-)

10) I believe that it's better to have the interfaces remain pure virtual
and provide a HandlerBase. This lets people who want to be sure that
they've overridden everything be told so by the compiler, and it allows
selective overriding by using HandlerBase where desired.

11) The class names (since we can't afford to use C++ namespaces) should be
expanded to include a SAX prefix to avoid clashes. So SAXParser and
SAXLocator and SAXAttributeList and so on.

12) We added reset() methods to all the handlers. The reason being that, on
the start of a new parse operation, each handler might need to reset its
internal state. We assume that the handlers might be completely unknown to
the code that kicks off the parse event and we didn't want them to have to
assume that the order of events wouldn't change over time (i.e. we didn't
want them to just pick what they think will be the first event and reset
from that.)
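
As a rough illustration of how several of these points could fit together
(typedef'd character and size types from 5 and 8, unambiguous method names
from 7, raw parser-owned strings from 2 and 9, pure virtual interfaces
plus a HandlerBase from 10, the SAX prefix from 11 and reset() from 12),
here is a small sketch. Every name in it is hypothetical; it is not taken
from any shipping SAX binding.

```cpp
#include <cstddef>

// Character and size types are typedef'd once so that each implementation
// can drive them. These particular choices are assumptions.
typedef wchar_t SAXChar;
typedef unsigned int SAXSize;

// Pure virtual interface: a class deriving from it directly will not
// compile as a concrete type until every callback is overridden.
class SAXDocumentHandler {
public:
    virtual ~SAXDocumentHandler() {}
    // Called at the start of each new parse so the handler can clear state.
    virtual void reset() = 0;
    // Strings stay owned by the parser; callers copy only if they need to.
    virtual void startElement(const SAXChar* name) = 0;
    virtual void endElement(const SAXChar* name) = 0;
    // Named docCharacters rather than characters to avoid clashing with a
    // DTD handler mixed into the same class.
    virtual void docCharacters(const SAXChar* chars, SAXSize length) = 0;
};

// Do-nothing defaults for callers who want selective overriding.
class SAXHandlerBase : public SAXDocumentHandler {
public:
    virtual void reset() {}
    virtual void startElement(const SAXChar*) {}
    virtual void endElement(const SAXChar*) {}
    virtual void docCharacters(const SAXChar*, SAXSize) {}
};

// Example handler: overrides only what it needs.
class ElementCounter : public SAXHandlerBase {
public:
    ElementCounter() : count_(0) {}
    virtual void reset() { count_ = 0; }
    virtual void startElement(const SAXChar*) { ++count_; }
    SAXSize count() const { return count_; }

private:
    SAXSize count_;
};
```

Because the handler object belongs to the caller (point 6), the parser in
such a design would only ever hold a SAXDocumentHandler pointer or
reference and never delete it.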


That's all I can think of at the moment. I haven't had enough time to look
at SAX2 closely so I don't know what there might be problematic for us in
the C++ world. But, I still think that it's good enough to just pick up at
SAX2 as long as SAX2 can be reconciled with the needs of the C++ world.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From stele at fxtech.com  Tue Dec  7 20:45:29 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com> <82jqbu$pns$1@eve.enteract.com>
Message-ID: <384D7222.23FAAF41@fxtech.com>

> Ignore my response on xml-dev: I incorrectly guessed what you wanted to
> do.  Let me try again.  This time it looks like what you want to do is
> process Foo and its contents with a different set of handlers than the
> rest of the document.  If that's the case, have your "standard"
> StartElement handler set new handlers when it encounters a Foo and have
> the new EndElement handler set the handlers back to the "standard" ones
> when it encounters the end of a Foo.  If necessary, maintain a stack of
> pointers to handlers.

This is a good idea on the surface, and it is where I started in my
implementation before I hit a snag. It requires too much housekeeping,
and too many functions if you want to do something special when the
element is finished (such as add the just-parsed object to a list). It
would be nicer to be able to treat parsing of an element as an atomic
operation, so you can write code like this:

void Document::ParseDocument(XML_Input &in)
{
	// table of element names and their handlers, ended by a NULL entry
	XML_ElementHandler handlers[] = {
		{ "Object", ParseObject },
		{ NULL }
	};
	in.Parse(handlers, this);
}

// presumably a static member, since the Document arrives via userData
void Document::ParseObject(XML_Element &element, void *userData)
{
	Document *doc = (Document *)userData;
	Object *obj = new Object;
	obj->Parse(element);
	doc->AddObject(obj);	// runs only after the whole subtree is parsed
}

void Object::Parse(XML_Element &element)
{
	XML_ElementHandler handlers[] = {
		... object-specific element handlers ...
	};
	// parse just the object subtree to the </Object> token
	element.Parse(handlers, this);
}

You see in ParseObject() that I can do everything I need to create a new
object, parse it, and do something with it after I've parsed it. I can
only do this if the parser lets me parse just a subtree and then stop
(ie. it returns control back to me when it finds the </Object> token).
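
The "parse just a subtree and then stop" behaviour can be sketched
independently of any real parser. The version below is only an
illustration under loud assumptions: it walks an already tokenized
document through a hypothetical TokenCursor rather than using the
XML_Input / XML_Element callback tables above, and Token, TokenCursor,
Node and parseElement are all invented names.

```cpp
#include <cstddef>
#include <string>
#include <vector>

struct Token {
    enum Kind { Start, End, Text } kind;
    std::string data;  // element name or character data
};

// Hypothetical cursor over an already tokenized document. The vector must
// outlive the cursor.
class TokenCursor {
public:
    explicit TokenCursor(const std::vector<Token>& tokens)
        : tokens_(tokens), pos_(0) {}
    bool atEnd() const { return pos_ >= tokens_.size(); }
    const Token& peek() const { return tokens_[pos_]; }
    const Token& take() { return tokens_[pos_++]; }

private:
    const std::vector<Token>& tokens_;
    std::size_t pos_;
};

struct Node {
    std::string name;
    std::string text;
    std::vector<Node> children;
};

// Precondition: the cursor sits on a Start token. Consumes exactly that
// element's subtree, including its end tag, then returns to the caller.
Node parseElement(TokenCursor& in) {
    Node node;
    node.name = in.take().data;  // the start tag
    while (!in.atEnd() && in.peek().kind != Token::End) {
        if (in.peek().kind == Token::Start) {
            // Nested parse: control comes back here after the child's end
            // tag, so per-object work can follow the call directly.
            node.children.push_back(parseElement(in));
        } else {
            node.text += in.take().data;
        }
    }
    if (!in.atEnd()) in.take();  // the matching end tag
    return node;
}
```

The line after the recursive call plays the role of doc->AddObject(obj):
it runs once the nested element has been consumed and not before.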

--
Paul Miller - stele@fxtech.com



From donpark at docuverse.com  Tue Dec  7 20:51:42 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: <384D58A7.2AA6493F@fxtech.com>
Message-ID: <000501bf40f4$f6348c20$099918d1@docuverse1>

There is clearly a need for this although IMHO the demand
for it will not be large until the complexity of XML data grows
significantly.

An event-based API like SAX is reactive, as opposed to active
APIs like Java's StringTokenizer.  I have found that reactive
systems dealing with complex data/event streams tend to
get bogged down with state management, which increases
maintenance cost significantly.  The extensibility of XML
will work against you unless you know what you are doing.

Active/reactive designs, reactive at high level and active
at low level, are more suited to handling complex XML data.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From richard at cogsci.ed.ac.uk  Tue Dec  7 22:49:39 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: Lars Marius Garshol's message of 07 Dec 1999 18:53:32 +0100
Message-ID: <7390.199912072249@doyle.cogsci.ed.ac.uk>

> | I see a demand for parsing a document with SAX, but using some
> | start-tags to switch to building DOM (or DOM-like) objects, returning
> | to stream-oriented processing afterwards.  

As Lars pointed out, it seems like the original poster wanted
something else.  But if this *is* what you want, LT XML provides it -
after reading a start tag you can call a function that "fills in" the
tree starting there.

-- Richard



From rossb at wrq.com  Tue Dec  7 23:08:38 1999
From: rossb at wrq.com (Ross Bleakney)
Date: Mon Jun  7 17:18:26 2004
Subject: Appending to an XML document
Message-ID: <1654BC972546D31189DA00508B318AC82CB832@charmander.wrq.com>

My apologies if you have already read this in the XSL list. I (and
apparently several other people) have a need to append an element onto an
existing XML file. I would like to avoid reading in the whole document and
then writing it back out again. My original plan was to open the file, find
the end of it, back up a bit to find the last tag, write the new element and
then rewrite the closing tag. I am looking for a generic solution.

I know of no XML API that allows for modifying a document. They make it easy
to create new documents out of old ones, but they don't allow you to modify
an existing file. Doing so would mean the possibility of optimization that
would greatly reduce disk I/O. For example, if you had XML like this:

<Events>
   <Event>...</Event>
   <Event>...</Event>
</Events>

It would be really nice to write code like this:

	ModifyXML modXML = new ModifyXML("MyDoc.XML");	
	Element event = modXML.createElement("Event");
	event.appendChild(modXML.createTextNode("A big event happened"));
	modXML.appendChild("Events", event);
	modXML.update();

An implementor of this interface could take advantage of the fact that
<Events> is the main tag and perform the same sort of work I suggested
(backing up from the end and then writing). The routines for this interface
would be very limited since this would only be used when you want to modify
a document and you know that using SAX (or DOM) is inefficient. Thus there
would be no reason to have an "insertBefore". The API could be limited to
appending and deleting. 
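
For what it is worth, the "back up from the end" plan can be sketched in a
few lines of C++. This is only an illustration: appendBeforeRootEnd is a
made-up name, not an existing API, and it assumes the file ends with the
root element's end tag (optionally followed by whitespace) and that this
tag starts within the last `window` bytes. It does no XML parsing, so a
trailing comment or processing instruction containing "</" would fool it.

```cpp
#include <fstream>
#include <string>

// Inserts `element` just before the last "</" in the file at `path`.
// Returns false if the file cannot be opened or no "</" is found in the
// tail. Only the tail of the file is read and rewritten.
bool appendBeforeRootEnd(const std::string& path, const std::string& element,
                         std::streamoff window = 256) {
    std::fstream f(path.c_str(),
                   std::ios::in | std::ios::out | std::ios::binary);
    if (!f) return false;

    f.seekg(0, std::ios::end);
    const std::streamoff size = f.tellg();
    const std::streamoff start = size > window ? size - window : 0;

    // Read the last `window` bytes (or the whole file if it is shorter).
    std::string tail(static_cast<std::string::size_type>(size - start), '\0');
    f.seekg(start);
    if (!tail.empty()) {
        f.read(&tail[0], static_cast<std::streamsize>(tail.size()));
    }
    if (!f) return false;

    const std::string::size_type pos = tail.rfind("</");
    if (pos == std::string::npos) return false;

    // Overwrite from the closing tag onward: the new element first, then
    // the old closing tag and whatever followed it.
    f.seekp(start + static_cast<std::streamoff>(pos));
    f << element << tail.substr(pos);
    return static_cast<bool>(f);
}
```

Because the text written is always longer than the text it overwrites, the
file only grows and no truncation step is needed. Deleting an element
would not be this simple.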

Is there something like this already?

Thanks,
Ross



From anupama at quintessent.net  Tue Dec  7 23:10:36 1999
From: anupama at quintessent.net (anupama@quintessent.net)
Date: Mon Jun  7 17:18:26 2004
Subject: GUI XML doc authoring tools
Message-ID: <OF12746C47.C0676A84-ON88256840.007ED220@quintessent.net>


There was a listing on XML-Industry news
(http://www.oasis-open.org/cover/xmlNews.html) about an alpha release of
an editor:
http://architag.com/xray/

I haven't tried it, but it might be close to what you are looking for.


"Jeff Russell" <jefftr@bellsouth.net>@ic.ac.uk on 12/06/99 12:26:48 PM

Please respond to "Jeff Russell" <jefftr@bellsouth.net>

Sent by:  owner-xml-dev@ic.ac.uk


To:   "Xml-Dev@Ic. Ac. Uk" <xml-dev@ic.ac.uk>
cc:

Subject:  GUI XML doc authoring tools


Anybody know of any Windows GUI (or Linux, as a last resort) XML document
authoring tools? Something like SoftQuad's XMeTaL, but that doesn't require
a DTD.

Jeff Russell
jefftr@bellsouth.net




From macherius at darmstadt.gmd.de  Tue Dec  7 23:38:20 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:18:26 2004
Subject: Appending to an XML document
In-Reply-To: <1654BC972546D31189DA00508B318AC82CB832@charmander.wrq.com>
Message-ID: <199912072337.AAA10151@sonne.darmstadt.gmd.de>

Ross,

I'm currently busy designing an XML-based log format myself. In
contrast to "classic line-based logging", appending is indeed
prohibitively costly in XML. So I decided not to log into a
well-formed XML document, but to stick with a sequence of <Event>-type
document fragments, each well-formed on its own.
Of course one cannot parse the result immediately, but at the time
of log analysis (or whatever you do with your event data), it's
trivial to prepend and append the tags needed to enclose the
fragments.

XML was just not designed to fit the demands of concatenation. But
I found structuring single events in a "semi-structured"
(read: well-formed) way valuable enough to choose XML. The "missing
enclosing tag" is not really a serious problem if you delay its
insertion until REALLY necessary.
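
As a rough sketch of this scheme in C++ (the function names logEvent, wrapLog
and escapeText are made up for illustration, and one fragment per line is an
assumption of the sketch, not part of the design described above): events are
appended as stand-alone fragments, and the enclosing tags are only added when
the log is read back.

```cpp
#include <cassert>
#include <fstream>
#include <string>

// Escape the characters that would break well-formedness of text content.
std::string escapeText(const std::string &s) {
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i) {
        switch (s[i]) {
        case '&': out += "&amp;"; break;
        case '<': out += "&lt;";  break;
        case '>': out += "&gt;";  break;
        default:  out += s[i];
        }
    }
    return out;
}

// Append one well-formed <Event> fragment: a cheap, append-only write.
void logEvent(const std::string &path, const std::string &text) {
    std::ofstream out(path.c_str(), std::ios::app);
    out << "<Event>" << escapeText(text) << "</Event>\n";
}

// At analysis time, enclose the fragments in a root element so that an
// ordinary XML parser can read the result.
std::string wrapLog(const std::string &path, const std::string &root) {
    assert(!root.empty());
    std::ifstream in(path.c_str());
    std::string doc = "<" + root + ">\n";
    std::string line;
    while (std::getline(in, line)) doc += line + "\n";
    return doc + "</" + root + ">\n";
}
```

The log file itself is never a well-formed document; only the string returned
by wrapLog is.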

	++im

Ross Bleakney <rossb@wrq.com> wrote at 7 Dec 99, 15:08:

> I know of no XML API that allows for modifying a document. They make it easy
> to create new documents out of old ones, but they don't allow you to modify
> an existing file. Doing so would mean the possibility of optimization that
> would greatly reduce disk I/O. For example, if you had XML like this:
> 
> <Events>
>    <Event>...</Event>
>    <Event>...</Event>
> </Events>
> 
> It would be really nice to write code like this:
> 
> 	ModifyXML modXML = new ModifyXML("MyDoc.XML");	
> 	Element event = modXML.createElement("Event");
> 	event.appendChild(modXML.createTextNode("A big event happened"));
> 	modXML.appendChild("Events", event);
> 	modXML.update();
> 

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From beth at planet7tech.com  Tue Dec  7 23:46:09 1999
From: beth at planet7tech.com (Beth Penland)
Date: Mon Jun  7 17:18:26 2004
Subject: ANNOUNCE: XML Advisory Council
Message-ID: <NDBBKHFJKLJBJBCHDEJEMEBMCAAA.beth@planet7tech.com>

Planet 7 Technologies is looking for experienced XML developers to
participate in our P7 Advisory Council. We are currently developing software
to fundamentally improve the way eCommerce networks use XML information,
allowing for the real-time distribution of XML across existing networks and
applications. Please contact beth@planet7tech.com if you are interested.

Beth Penland
Planet 7 Technologies
2787 152nd Avenue NE
Building 7
Redmond, WA 98052




From roddey at us.ibm.com  Tue Dec  7 20:42:31 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:26 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com>


Here's my take on it... Note that what I'm saying here reflects the
necessities of supporting really bad C++ implementations, not my personal
feelings. If it were up to me, I'd say use every modern service of C++ and
those who don't have a compliant C++ implementation can have a good reason to
get one. But, by an unfortunate decision, I was not made the ruler of the
world... Go figure!

1) I don't mind that we just start off with SAX2, I guess. It perhaps makes
sense this late in the game to just concentrate on SAX2.

2) We would prefer that all data come out of the SAX interfaces as raw
wchar_t strings. This is the most flexible mechanism and does not lock
people into using any particular implementation of a string object. It also
has the highest potential performance for those folks who never need to put
it into anything more formal than a raw array.

3) We agree with the basic desire to avoid object ownership issues, but
wouldn't worry about them if they are well documented. Object ownership is
just a fundamental issue in C++ and if you don't understand it you are
probably going to blow your own foot off no matter what.

4) We would be concerned about some of the SAX2 stuff wrt setting features
(I think it's features) via an abstracted object interface because it's a
little bit sticky. It can be done, but the question still arises: where does
the desirability of being the same as the Java interface end and the
desirability of having a very natural interface for your own language
begin? I.e. just don't make it so Java-esque that it requires a lot of
trickery to make it work in C++. Don't require some common base class.

5) If you wanted to templatize the interface over the character type, we
wouldn't mind particularly. But, considering that any implementation of the
interface would *always* use the same instantiation, why bother? Just
typedef the character type and let each implementation drive it. It's not
likely that a particular build of a particular implementation would need to
change this on the fly, right?

6) The issue of handler ownership is something we punted on. As far as we
are concerned, handlers installed on the SAXParser belong to the caller
because in most cases one object implements a number of handlers.

7) The names of methods of the handlers need to be non-ambiguous to avoid
problems. So DocType handlers should use DocTypeCharacters() or
DTDCharacters() or whatever, and Document handlers should use
DocCharacters() or some such thing. It's just not worth the paranoia of how
implementations would deal with multiple mixed in interfaces having the
same named methods. If the processing should be common, the class
implementing both handlers can delegate to a private method.

8) I disagree with the contention that unsigned shouldn't be used in
interfaces. If the thing being modeled is unsigned, use unsigned because
you are modelling the type desired. I would personally typedef (by logical
usage) all of the fundamental types used by the interfaces and let the
implementation drive them.

9) APIs such as getType() or getValue() should return a "const wchar_t*" so
that the caller uses the returned value directly. The overhead of copying
the return value (and having to clean it up) would probably be unacceptable.
(Actually, wchar_t would be some defined type that is driven by the
implementation.) Yes, this involves ownership issues, but as I said, this is
fundamental to C++, so people should probably just 'get over it' :-)

10) I believe that it's better to have the interfaces remain pure virtual
and provide a HandlerBase. This lets people who want to be sure that
they've overridden everything be told so by the compiler, and it allows
selective overriding by using HandlerBase where desired.

11) The class names (since we can't afford to use C++ namespaces) should be
expanded to include a SAX prefix to avoid clashes. So SAXParser and
SAXLocator and SAXAttributeList and so on.

12) We added reset() methods to all the handlers. The reason being that, on
the start of a new parse operation, each handler might need to reset its
internal state. We assume that the handlers might be completely unknown to
the code that kicks off the parse event and we didn't want them to have to
assume that the order of events wouldn't change over time (i.e. we didn't
want them to just pick what they think will be the first event and reset
from that.)
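
To make several of the points above concrete, here is a minimal C++ sketch of
what such an interface might look like. It is not the actual IBM or SAX C++
API: the names SAXChar, SAXLength, SAXDocumentHandler, SAXHandlerBase and
CountingHandler, and the exact method signatures, are assumptions made for
illustration only.

```cpp
#include <cassert>
#include <cstddef>

typedef wchar_t SAXChar;          // point 5: a typedef rather than a template
typedef unsigned int SAXLength;   // point 8: unsigned quantities stay unsigned

// Points 10 and 11: a pure virtual interface with a SAX name prefix.
class SAXDocumentHandler {
public:
    virtual ~SAXDocumentHandler() {}
    // Point 12: called at the start of every parse to clear old state.
    virtual void reset() = 0;
    // Points 2 and 9: data is handed out as raw strings owned by the parser.
    virtual void startElement(const SAXChar *name) = 0;
    virtual void endElement(const SAXChar *name) = 0;
    // Point 7: an unambiguous name, so one class can mix in several
    // handler interfaces without clashing method names.
    virtual void docCharacters(const SAXChar *chars, SAXLength length) = 0;
};

// Point 10: a do-nothing base class for selective overriding.
class SAXHandlerBase : public SAXDocumentHandler {
public:
    virtual void reset() {}
    virtual void startElement(const SAXChar *) {}
    virtual void endElement(const SAXChar *) {}
    virtual void docCharacters(const SAXChar *, SAXLength) {}
};

// Example handler: counts elements and characters of text content.
class CountingHandler : public SAXHandlerBase {
public:
    CountingHandler() : elements(0), chars(0) {}
    virtual void reset() { elements = 0; chars = 0; }
    virtual void startElement(const SAXChar *name) {
        assert(name != 0);
        ++elements;
    }
    virtual void docCharacters(const SAXChar *, SAXLength length) {
        chars += length;
    }
    unsigned int elements;
    SAXLength chars;
};
```

A class that derives directly from SAXDocumentHandler gets a compile error
for every method it forgot to implement, while one that derives from
SAXHandlerBase overrides only what it needs.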


That's all I can think of at the moment. I haven't had enough time to look
at SAX2 closely so I don't know what in it might be problematic for us in
the C++ world. But I still think that it's good enough to just pick up at
SAX2 as long as SAX2 can be reconciled with the needs of the C++ world.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From donpark at docuverse.com  Tue Dec  7 20:51:42 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: <384D58A7.2AA6493F@fxtech.com>
Message-ID: <000501bf40f4$f6348c20$099918d1@docuverse1>

There is clearly a need for this although IMHO the demand
for it will not be large until complexity of XML data grows
significantly.

An event-based API like SAX is reactive, as opposed to active
APIs like Java's StringTokenizer.  I found that reactive
systems dealing with complex data/event streams tend to
get bogged down in state management, which increases
maintenance cost significantly.  The extensibility of XML
will work against you unless you know what you are doing.

Active/reactive designs, reactive at high level and active
at low level, are more suited to handling complex XML data.
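
To illustrate the "active" half of that distinction, here is a small C++
sketch of a pull-style tokenizer in the spirit of StringTokenizer. The class
PullTokenizer and the function readTitle are invented for this example, and
the tokenizer only understands plain start tags, end tags and text (no
attributes, comments, CDATA sections or entities).

```cpp
#include <cassert>
#include <string>

// Hypothetical pull ("active") tokenizer: the caller asks for the next
// token instead of being called back with it.
class PullTokenizer {
public:
    enum Kind { START, END, TEXT, DONE };
    explicit PullTokenizer(const std::string &src) : src_(src), pos_(0) {}

    // Stores the tag name or text in `value` and returns the token kind.
    Kind next(std::string &value) {
        if (pos_ >= src_.size()) return DONE;
        if (src_[pos_] == '<') {
            const std::size_t close = src_.find('>', pos_);
            assert(close != std::string::npos); // sketch: well-formed input
            const bool isEnd = src_[pos_ + 1] == '/';
            const std::size_t nameStart = pos_ + (isEnd ? 2 : 1);
            value = src_.substr(nameStart, close - nameStart);
            pos_ = close + 1;
            return isEnd ? END : START;
        }
        std::size_t lt = src_.find('<', pos_);
        if (lt == std::string::npos) lt = src_.size();
        value = src_.substr(pos_, lt - pos_);
        pos_ = lt;
        return TEXT;
    }
private:
    std::string src_;
    std::size_t pos_;
};

// Active code keeps its state in local variables and control flow rather
// than in member fields of a handler object.
std::string readTitle(PullTokenizer &in) {
    std::string v;
    std::string title;
    for (PullTokenizer::Kind k = in.next(v);
         k != PullTokenizer::DONE; k = in.next(v)) {
        if (k == PullTokenizer::START && v == "title") {
            if (in.next(v) == PullTokenizer::TEXT) title = v;
            return title;
        }
    }
    return title;
}
```

In a mixed design, a reactive top level would hand control to a function like
readTitle once it has recognised which kind of content is coming next.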

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From donpark at docuverse.com  Wed Dec  8 00:26:06 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:26 2004
Subject: Appending to an XML document
In-Reply-To: <199912072337.AAA10151@sonne.darmstadt.gmd.de>
Message-ID: <000101bf4112$e677f060$099918d1@docuverse1>

I ran into this as well when working on XLF (eXtensible Log
Format).  At that time, I was not into SML, so the solution
was to store entries in an external parsed entity and have
a wrapper XML document that just defined the entity inside
the document element.

Looking back at what I did with a new perspective/attitude,
namely SML, I now have a different solution.  All you need
to do is redefine your idea of an XML document and refrain
from using certain features of XML.

If you do not use any part of the 'prolog' and 'Misc'
production rules in the XML 1.0 spec, and if you detach
the notion of an XML document being a file, you can send
or store multiple XML documents in a single stream or file.

Appending is now just a matter of appending an XML document
to the end of a file or a stream.
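
A reader of such a stream has to find the document boundaries itself. A
minimal C++ sketch, assuming the restrictions above (no prolog, no Misc) plus
some extra simplifications that are mine rather than part of the proposal (no
comments, no CDATA sections, no '>' inside attribute values), could track
element depth like this; splitDocuments is a hypothetical name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Splits a stream of concatenated documents into one string per document.
// A document ends when the element depth returns to zero.
std::vector<std::string> splitDocuments(const std::string &stream) {
    std::vector<std::string> docs;
    std::size_t start = std::string::npos;  // offset of the current root tag
    int depth = 0;
    for (std::size_t i = 0; i < stream.size(); ++i) {
        if (stream[i] != '<') continue;
        const std::size_t close = stream.find('>', i);
        if (close == std::string::npos) break;   // incomplete trailing tag
        const bool isEnd = stream[i + 1] == '/';
        const bool isEmpty = stream[close - 1] == '/';
        if (isEnd) {
            assert(depth > 0);   // sketch: input must be well-formed
            --depth;
        } else {
            if (depth == 0) start = i;
            if (!isEmpty) ++depth;
        }
        if (depth == 0 && start != std::string::npos) {
            docs.push_back(stream.substr(start, close - start + 1));
            start = std::string::npos;
        }
        i = close;   // continue scanning after this tag
    }
    return docs;
}
```

Whitespace between documents is simply skipped, and an unfinished document at
the end of the stream is not returned.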

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From stele at fxtech.com  Wed Dec  8 00:37:46 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:26 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
Message-ID: <384DA88A.BFC736C3@fxtech.com>

Thanks to all who have given feedback on my desires for a relatively
atypical parsing idiom for XML. Some of my interest is based on a
proprietary parser I wrote a few years ago, that I've used for
everything since. It's tag-based and object-oriented, and each block of
a document can be parsed as a complete unit. When used to parse
object-oriented data, it lets each object easily handle its own parsing.

Now I'd like to apply the same concepts to an XML parser, used primarily
when object-oriented program data is stored as XML syntax.

I believe the best way to describe what I want to do (and why) is to
show a concrete example. Suppose I have a program that generates images
composed of layers with multiple objects in each layer. Each layer has a
size associated with it as well.

The classes I have are:
	Document (contains one or more layers)
	Layer (contains one or more objects and a Size)
	Object (some type of object)
	Size (an object which represents a width and height)
	Point (x,y value)
	Rect (x1,y1 to x2,y2)
	Circle (type/subclass of Object)
	Square (type/subclass of Object)

Ideally, each object would be able to write out its data in XML form,
and parse its own data (along with a list of attributes if it uses
them).

Here is an example xml file:

<Document name="mydocument">
	<Layer name="background">
		<Size>640x480</Size>
		<Object type="circle">
			<Center>320,240</Center>
			<Radius>25.0</Radius>
		</Object>
		<Object type="square">
			<Rect>10,10-40,40</Rect>
		</Object>
	</Layer>
</Document>

If you think about the object hierarchy associated with this document,
you have something like this:

Document
	contains Layer ("background")
		contains Size (640x480)
		contains Circle (Object)
			Contains Point (320,240)
			Contains float (25)
		contains Square (Object)
			Contains Rect (10,10 - 40,40)

I tend to design APIs from the point of view of the programmer. As
the number of classes in my application grows, I want to minimize the
amount of extra code I have to write, and I'd like to simplify the
parsing down to the minimum amount of necessary boilerplate code. So
let's assume that each object has its own Parse() method. This method
gets called with an XML::Element object which has the name and
attributes for that object. Parsing of the entire object should be an
atomic operation.

I use static function pointers as callbacks to avoid having to subclass
from any XML-specific classes. User-data is passed along in the parsing
so we can cast it back to the necessary type in one of the element
handlers. The code is presented in C++ but the parsing operations can
easily have a "C" interface. Exceptions are thrown if anything goes
wrong, so there are no error codes.

Here is the code needed to open the XML file and find the top-level XML
element:

void App::LoadDocument(const char *path)
{
	// specify a handler to look for "Document" elements
	XML::ElementHandler handlers[] = {
		XML::ElementHandler("Document", sParseDocument),
		XML::ElementHandler::END
	};
	XML::Input file(path);
	// each Document found is added to this App by sParseDocument,
	// so nothing is returned here
	file.Parse(handlers, this);
}

From here on out each object is responsible for parsing itself, based on
an XML::Element object that is passed to it. Please examine the code
closely to see the intended design and flow.

// when a Document element is found, it is passed to the sParseDocument
// handler
void App::sParseDocument(const XML::Element &elem, void *userData)
{
	// userData is the App * from the file.Parse() call above
	App *app = (App *)userData;
	// we found a document element, so make one using the attributes
	Document *doc = new Document(elem.GetAttribute("name"));
	// now parse the document
	doc->Parse(elem);
	// if we get here without a thrown exception, the Document parsed
	// okay and we can add it
	app->AddDocument(doc);
}

void Document::Parse(const XML::Element &elem)
{
	// specify handlers to look for "Layer" elements
	XML::ElementHandler handlers[] = {
		XML::ElementHandler("Layer", sParseLayer),
		XML::ElementHandler::END
	};
	elem.Parse(handlers, this);
	// if we needed to do something special, like validating the
	// document, we could do it right here
}

void Document::sParseLayer(const XML::Element &elem, void *userData)
{
	// again, userData is the Document * passed in elem.Parse() above
	Document *doc = (Document *)userData;
	// make a new layer
	Layer *layer = new Layer(elem.GetAttribute("name"));
	// parse the layer
	layer->Parse(elem);
	doc->AddLayer(layer);
}

void Layer::Parse(const XML::Element &elem)
{
	// specify handlers to look for "Size" and "Object" elements
	// note that for the Size element we call the Size object's static
	// parse function directly, and we're passing the address of our
	// contained Size member as its user-data, so we do not need to
	// provide an additional static Size handler to forward to the Size
	// object's member Parse() method
	XML::ElementHandler handlers[] = {
		XML::ElementHandler("Size", Size::sParse, &mSize),
		XML::ElementHandler("Object", sParseObject),
		XML::ElementHandler::END
	};
	elem.Parse(handlers, this);
}

void Size::sParse(const XML::Element &elem, void *userData)
{
	Size *size = (Size *)userData;
	// size has no attributes, just data, so read it directly
	// note that elem.ReadData() reads character data up to the
	// ending element tag and returns the number of characters read
	char tmp[40];
	// leave room for the terminating NUL
	size_t len = elem.ReadData(tmp, sizeof(tmp) - 1);
	tmp[len] = '\0';
	sscanf(tmp, "%dx%d", &size->width, &size->height);
}

void Layer::sParseObject(const XML::Element &elem, void *userData)
{
	// again, userData is the Layer * passed in elem.Parse() above
	Layer *layer = (Layer *)userData;
	// make a new object from the object type
	std::string type = elem.GetAttribute("type");
	// I would normally use a factory here but this illustrates the 
	// point better
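	Object *obj = NULL;
	if (type == "circle")
		obj = new Circle();
	else if (type == "square")
		obj = new Square();
	else
		// unknown type: fail here rather than dereference NULL below
		// (the exception type is only a placeholder)
		throw std::runtime_error("unknown Object type: " + type);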

	// now let the object (whatever type it is) parse itself
	obj->Parse(elem);
	layer->AddObject(obj);
}
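
The factory mentioned in the comment above is not shown in the post. One
possible shape for it, as a self-contained C++ sketch: ObjectFactory, Create
and TypeName are hypothetical names, and the Object, Circle and Square classes
here are bare stand-ins for the real ones.

```cpp
#include <cassert>
#include <map>
#include <stdexcept>
#include <string>

// Stand-ins for the application's classes.
class Object {
public:
    virtual ~Object() {}
    virtual const char *TypeName() const = 0;
};
class Circle : public Object {
public:
    const char *TypeName() const { return "circle"; }
};
class Square : public Object {
public:
    const char *TypeName() const { return "square"; }
};

// Generic creator function; &Create<Circle> has type Object *(*)().
template <class T> Object *Create() { return new T(); }

// Maps the value of the "type" attribute to a creator function.
class ObjectFactory {
public:
    typedef Object *(*Creator)();
    void Register(const std::string &type, Creator creator) {
        assert(creator != 0);
        creators_[type] = creator;
    }
    // Throws for an unknown type instead of returning NULL.
    Object *Make(const std::string &type) const {
        std::map<std::string, Creator>::const_iterator it =
            creators_.find(type);
        if (it == creators_.end())
            throw std::runtime_error("unknown Object type: " + type);
        return it->second();
    }
private:
    std::map<std::string, Creator> creators_;
};
```

With something like this in place, the if/else chain in sParseObject would
shrink to a single Make(type) call, and new Object subclasses would only need
to be registered once.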

So I hope this gets the idea across. I'd be interested in feedback.

--
Paul Miller - stele@fxtech.com



From roddey at us.ibm.com  Wed Dec  8 00:52:15 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:26 2004
Subject: SGML the next big thing?
Message-ID: <87256841.0004BF66.00@d53mta03h.boulder.ibm.com>


>This & thing is so far outside the way most other computer languages
>work that standard off-the-shelf parser generators roll on their
>backs and wave their paws in the air and admit defeat.

Personally, I think & should be limited to 'leaf' nodes only. This
would keep it sane, and allow it to be implemented as a special-case
content model (as we already do for Mixed anyway). This would provide a lot
of usefulness without forcing everyone to throw out the very fast and
compact DFA-type representations in wide use now, or to implement another
(much more complex) one in addition.
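
As a rough illustration of how small such a special case could be, here is a
C++ sketch. It assumes that "leaf" means the members of the & group are plain
element names rather than nested groups, and that each member must occur
exactly once; matchesAllGroup is a hypothetical name, not part of any parser.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Special-case check for an "all" group (SGML's & connector) whose members
// are plain element names: every name must appear exactly once, in any
// order. No DFA is needed; a set of the names still owed is enough.
bool matchesAllGroup(const std::vector<std::string> &required,
                     const std::vector<std::string> &children) {
    std::set<std::string> owed(required.begin(), required.end());
    assert(owed.size() == required.size());  // members must be distinct
    for (std::size_t i = 0; i < children.size(); ++i) {
        // erase() returns 0 for a name that is unknown or already seen
        if (owed.erase(children[i]) == 0) return false;
    }
    return owed.empty();   // false if some member never appeared
}
```

The check is linear in the number of children apart from the set lookups,
which is why it can sit next to a DFA-based content model without replacing
it.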

People should accept the fact that XML is not going to solve all problems
and still stay light enough to remain what it was intended to be. Schema
will already be very bloated with all the other stuff in it now. I
guesstimate that a Schema implementation will probably be at least twice
the size of the existing parser code in most implementations, if not more
so. Anyone think differently?

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From jtauber at jtauber.com  Wed Dec  8 01:27:31 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:26 2004
Subject: A question on nomenclature
References: <805C62F55FFAD1118D0800805FBB428D02BC017D@cc20exch2.mobility.com>
Message-ID: <00fb01bf411b$0db5bd80$0300000a@cygnus.uwa.edu.au>

> <name>
>   <first/>
>   <middle/>
>   <last/>
> </name>
>
> (James Tauber described this as a "schema-by-example") but what I really
> want is the name that I would call that "class" of XML documents.

In linguistic terms, you have a "grammar" defining a "language" which is
really just a set of "utterances".

In XML, a "grammar" is generally called a "schema" and an utterance is
called an "instance". So what you are asking, if I understand correctly, is
what is the term corresponding to "language".

The term most consistent with the XML 1.0 REC would probably be "document
type".

So you would say you have a "schema" defining a "document type" which is
really just a set of "instances".

> but I don't care about that.  I'm still leaning toward "vocabulary", because
> that still seems to describe it best, but I'm still open too.  (I think
> "schema" is probably correct for what I'm trying to do as well, but that
> would confuse readers with XML Schemas, which are just one type of "schema
> description language"...)

1. Yes, people get confused between a schema and a schema language and use
"schema" to mean both.

2. There is a distinction between a schema and the set of valid documents
for that schema (i.e. a "document type"). It is the distinction between a
grammar and the language it defines. So you could use the term "schema" for
the *definition* of the set of valid documents (whether it's a DTD, a W3C XML
Schema or a schema-by-example), but the actual set of valid documents is
best called something else (like "document type").

Hope this helps

James Tauber




From steve at rsv.ricoh.com  Wed Dec  8 01:31:27 1999
From: steve at rsv.ricoh.com (Stephen R. Savitzky)
Date: Mon Jun  7 17:18:26 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
In-Reply-To: Paul Miller's message of Tue, 07 Dec 1999 19:38:34 -0500
References: <384DA88A.BFC736C3@fxtech.com>
Message-ID: <qcaenmjk2e.fsf@congo.crc.ricoh.com>

This is basically a traditional top-down, recursive-descent parser.
Unfortunately, it's completely different from the way most XML parsers I've
seen work, although I believe there's a lexical layer underneath expat that
can be made to work this way.

But there's a better way of looking at the situation, namely that what you
really want to do is make a top-down traversal of the document's parse tree.
In other words, at any given position in the tree, you want to do pseudocode
like

// process a <foo> element.
void Foo::process(const XML::Element &elem) {
   // do the setup
   for (XML::Node *node = elem.getFirstChild();
	node != NULL;
	node = node->getNextSibling())
   {
	processChild(node); // dispatch on node's type & tag
   }
   // do the cleanup
}
 
This works as-is if the result of your parse is a DOM tree or some
equivalent parse-tree representation of the document, but trees take memory.
So the next step is to use a parser that looks like a tree traverser:

// process a <foo> element.
void Foo::process(TreeTraverser &it) {
   // do the setup, using it.getAttrList(), etc. on the current node
   if (it.hasChildren()) {
       for (it.toFirstChild(); !it.atEnd(); it.toNextSibling()) {
	   processChild(it); // dispatch on new current node's type & tag
       }
       it.toParent();  // go back up the tree
   }
   // do the cleanup
}
 
Note that if your parser has this interface, you may never have to actually
build the whole tree.  Similarly, you can output to a ``tree constructor''
that merely appends characters to a string.
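
For readers who want to experiment with the traverser idea, here is a
self-contained C++ sketch. It is not the PIA code (which is in Java): the
Node struct, the path-based TreeTraverser and the free function process are
assumptions made for illustration, and the traverser walks an in-memory tree
only because that is the simplest thing to test.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Node {
    std::string tag;
    // A std::vector of the enclosing type is guaranteed from C++17;
    // common compilers accept it in earlier modes too.
    std::vector<Node> children;
    explicit Node(const std::string &t) : tag(t) {}
};

// Cursor over a tree. A streaming parser could offer the same interface
// without ever materializing the whole tree.
class TreeTraverser {
public:
    explicit TreeTraverser(const Node &root) : root_(&root) {}
    const Node &current() const {
        assert(!atEnd());
        return *nodeAt(path_.size());
    }
    bool hasChildren() const { return !current().children.empty(); }
    // True once toNextSibling() has moved past the last sibling.
    bool atEnd() const {
        if (path_.empty()) return false;
        return path_.back() >= nodeAt(path_.size() - 1)->children.size();
    }
    void toFirstChild()  { path_.push_back(0); }
    void toNextSibling() { assert(!path_.empty()); ++path_.back(); }
    void toParent()      { assert(!path_.empty()); path_.pop_back(); }
private:
    // Node reached by following the first `depth` child indices.
    const Node *nodeAt(std::size_t depth) const {
        const Node *n = root_;
        for (std::size_t i = 0; i < depth; ++i) n = &n->children[path_[i]];
        return n;
    }
    const Node *root_;
    std::vector<std::size_t> path_;   // child index taken at each level
};

// Same shape as Foo::process above; the "tree constructor" here merely
// appends tags to a string.
void process(TreeTraverser &it, std::string &out) {
    out += "<" + it.current().tag + ">";
    if (it.hasChildren()) {
        for (it.toFirstChild(); !it.atEnd(); it.toNextSibling())
            process(it, out);
        it.toParent();
    }
    out += "</" + it.current().tag + ">";
}
```

Because process only ever talks to the traverser, swapping the in-memory tree
for a parser-backed traverser would not change the processing code.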

We've built a document-processing system (currently in Java) using this kind
of interface; you can find it at  <http://RiSource.org/PIA/>.

-- 
Stephen R. Savitzky  <steve@rsv.ricoh.com>  <http://rsv.ricoh.com/~steve/>
Platform for Information Applications:      <http://RiSource.org/PIA/>
Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center
 voice: 650.496.5710  front desk: 650.496.5700  fax: 650.854.8740 
  home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/



From murata.makoto at fujixerox.co.jp  Wed Dec  8 04:55:05 1999
From: murata.makoto at fujixerox.co.jp (MURATA Makoto)
Date: Mon Jun  7 17:18:26 2004
Subject: SML and I18N
Message-ID: <199912080457.AA03578@archlute.fujixerox.co.jp>

Rick Jelliffe wrote:
>Where I am, here in Taiwan, the main question people ask is "how do I 
>represent a document in Big5 in XML?". So moving to ASCII or even to 
>only UTF-8 will make SML into a US-only or Western-only language. The 
>simplifications proposed so far seem a gigantic step backwards away from 
>a "World Wide Web" and back 20 (or even 5?) years to a world where rich 
>white countries developed technology which created a technological poverty 
>in non-Western countries. 

I am totally against weakening I18N of XML 1.0.  Even 1% trim down is 
absolutely completely unacceptable to me.  Legacy encodings, natural language 
markup, the xml:lang attribute, encoding declarations, the charset parameter, 
and numeric character references must be preserved.  If SML omits any of them, 
SML is not for the World Wide Web.  

Makoto
 
Fuji Xerox Information Systems
 
Tel: +81-44-812-7230   Fax: +81-44-812-7231
E-mail: murata.makoto@fujixerox.co.jp



From sb at metis.no  Wed Dec  8 08:12:51 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:26 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: roddey@us.ibm.com's message of "Tue, 7 Dec 1999 13:40:43 -0700"
References: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com>
Message-ID: <wh4sdt97gh.fsf@viffer.oslo.metis.no>

>>>>> roddey@us.ibm.com:

This statement:

> ... If it were up to me, I'd say use every modern service of C++ and
> those who don't have compliant C++ implementation can have a good reason to
> get one.
[...]

conflicts with this statement:

> 2) We would prefer that all data come out of the SAX interfaces as
> raw wchar_t strings. This is the most flexible mechanism and does
> not lock people into using any particular implementation of a string
> object. It also has the highest potential performance for those
> folks who never need to put it into anything more formal than a raw
> array.

std::basic_string<> _is_ a modern service of C++, and a pretty good
one from an API point of view.

Personally I say: use std::basic_string<> and death to all other
string representations in C++.
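
A minimal C++ sketch of what that could look like in an interface. The names
SAXChar, SAXString, SAXAttribute and fromRaw are made up for illustration and
are not from any SAX draft.

```cpp
#include <cassert>
#include <string>

typedef wchar_t SAXChar;                       // implementation-chosen type
typedef std::basic_string<SAXChar> SAXString;  // owns its own storage

// Bridge for code that still produces raw, NUL-terminated strings.
SAXString fromRaw(const SAXChar *raw) {
    assert(raw != 0);
    return SAXString(raw);
}

class SAXAttribute {
public:
    SAXAttribute(const SAXString &type, const SAXString &value)
        : type_(type), value_(value) {}
    // A const reference avoids a copy while the attribute is alive; a
    // caller that needs the value for longer simply copies it.
    const SAXString &getType() const { return type_; }
    const SAXString &getValue() const { return value_; }
private:
    SAXString type_;
    SAXString value_;
};
```

The ownership question does not disappear, but it is reduced to the ordinary
lifetime of the returned reference.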



From mikew at o3.co.uk  Wed Dec  8 09:37:41 1999
From: mikew at o3.co.uk (Mike Williams)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: Paul Miller's message of "Tue, 07 Dec 1999 13:57:43 -0500"
References: <Pine.GSU.4.10.9912071036540.9271-100000@netcom2.netcom.com> <384D58A7.2AA6493F@fxtech.com>
Message-ID: <m3so1d93f1.fsf@picasso.o3.co.uk>

  >>> On Tue, 07 Dec 1999 13:57:43 -0500,
  >>> "Paul" == Paul Miller <stele@fxtech.com> wrote:

  Paul> No, I did want things pushed at me (via callbacks), but I want the
  Paul> opportunity to do some object-specific processing "inside" one of the
  Paul> callbacks, after the next set of *nested* elements were processed. This
  Paul> requires a nestable parser, where I can pick up the parsing inside a
  Paul> different scope. 

What about using nestable *handlers*?  Say you're parsing something like
this:

    ...
    <foo>
        <bar>xxx</bar>
    </foo>
    ...

When your main handler sees the <foo> tag, create a new "FooHandler"
object.  Your main handler would then need to delegate all events to the
FooHandler, until the corresponding </foo> is seen.

Not that this is particularly easy to implement.  In fact, I started to
implement something similar (in Java), but got fed up ... I've reverted to
using a DOM as input, for the time being.  

The main complication is that the delegating handler has to maintain a
context-stack while it's delegating, in order to match the correct
end-tag.  One way around this might be to get the FooHandler to notify the
main handler when it's finished.
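
The delegation idea above can be sketched roughly as follows. This is only an
illustration: the Handler interface, FooHandler and MainHandler are
hypothetical names, not part of SAX or of any particular parser, and a plain
depth counter stands in for the context stack.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical SAX-like handler interface (not taken from a real library).
struct Handler {
    virtual ~Handler() = default;
    virtual void startElement(const std::string& name) = 0;
    virtual void endElement(const std::string& name) = 0;
    virtual void characters(const std::string& text) = 0;
};

// Handles everything nested inside one <foo>: collects the text of each <bar>.
struct FooHandler : Handler {
    std::vector<std::string> bars;
    bool inBar = false;

    void startElement(const std::string& name) override {
        if (name == "bar") {
            inBar = true;
            bars.emplace_back();
        }
    }
    void endElement(const std::string& name) override {
        if (name == "bar") inBar = false;
    }
    void characters(const std::string& text) override {
        if (inBar) bars.back() += text;
    }
};

// Main handler: delegates all events between <foo> and its matching </foo>.
struct MainHandler : Handler {
    std::unique_ptr<FooHandler> delegate;  // non-null only while inside <foo>
    int depth = 0;                         // elements open below the delegating <foo>
    std::vector<std::string> collected;    // results taken over from finished delegates

    void startElement(const std::string& name) override {
        if (delegate) {
            ++depth;
            delegate->startElement(name);
        } else if (name == "foo") {
            delegate = std::make_unique<FooHandler>();
            depth = 0;
        }
    }
    void endElement(const std::string& name) override {
        if (!delegate) return;
        if (depth == 0) {
            // This end tag closes the <foo> that started the delegation.
            collected.insert(collected.end(),
                             delegate->bars.begin(), delegate->bars.end());
            delegate.reset();
        } else {
            --depth;
            delegate->endElement(name);
        }
    }
    void characters(const std::string& text) override {
        if (delegate) delegate->characters(text);
    }
};
```

Counting depth rather than comparing names means a <foo> nested inside
another <foo> does not end the delegation early.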

Just an idea ...

-- 
Mike Williams



From nicmila at vscht.cz  Wed Dec  8 10:33:06 1999
From: nicmila at vscht.cz (Miloslav Nic)
Date: Mon Jun  7 17:18:26 2004
Subject: New tutorial at Zvon
Message-ID: <384E33BA.23FB283C@vscht.cz>

WML (Wireless Markup Language), an XML-based language for mobile (WAP)
devices, has recently been getting some attention.
At Zvon you will find a new tutorial:
http://zvon.vscht.cz/HTMLonly/WMLTutorial/Examples/Example1/index.html

which demonstrates some features of this language through several examples.
The tutorial contains a simple emulator of a PDA device (in HTML, so do
not worry: there is nothing to download apart from the actual pages).

-- 
***************************************************************
Dr. Miloslav Nic                        e-mail: nicmila@vscht.cz
Department of Organic Chemistry         TEL: +420 2 2435 5012  
ICT Prague (VSCHT Praha)                     +420 2 2435 4118
    				        FAX: +420 2 2435 4288  
****************************************************************
Support free information exchange: http://zvon.vscht.cz
****************************************************************



From Andy.Bradbury at syntegra.bt.co.uk  Wed Dec  8 11:00:45 1999
From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk)
Date: Mon Jun  7 17:18:26 2004
Subject: Check this
Message-ID: <65AF45D5E535D2118AFB0008C7FA2318035A9AFF@FL-EXCHANGE-03>

Warning

The following e-mail was received on the 'junior' XML list.
The attachment - LINKS.VBS (11K) - contained a VBScript virus: VBS_FREELINK.

Regards

Andy B.


-----Original Message-----
From: Conrad Meier [mailto:conradm@SOFTWAREFUTURES.COM]
Sent: 08 December 1999 09:00
To: XML-L@LISTSERV.HEANET.IE
Subject: Check this


Have fun with these links.
Bye.



From d.d.barnes at ic.ac.uk  Wed Dec  8 11:06:33 1999
From: d.d.barnes at ic.ac.uk (ic\ddsb/d.d.barnes)
Date: Mon Jun  7 17:18:26 2004
Subject: using xt in a browser
Message-ID: <384E3B7A.1D62697F@ic.ac.uk>

Hello,

I am sorry to ask this (again - long story),
but can anyone tell me/ show me an example of how you can use xt
to transform xml into html from within a piece of javascript/java in a
page?

I apologise for my painful ignorance . . .

and thanks in advance,

David




From tpassin at idsonline.com  Wed Dec  8 13:24:23 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:26 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com>
Message-ID: <004a01bf4180$2b6ac4a0$c22a08d1@tomshp>

----- Original Message -----
From: Paul Miller <stele@fxtech.com>

> I'm trying to develop a tag-based front-end to expat and having no luck.
> I'd like to be able to parse an XML document in nestable chunks, by
> calling into a nestable parser. In other words, I'd like to start
> parsing, then branch to a function to handle a specific element, parsing
> in there until that element is closed, then fall back out of the
> function to continue parsing the rest of the document.
>
I take it that you want to be able to ignore part of the document, and only
process the pieces you are interested in.  Is that right?  Then each piece
would be valid XML if it were enclosed in a root element.  You don't need to
literally do what you have suggested, that is, "parse in there...".  You do
need to handle the elements of different pieces differently.  Three
approaches come to mind.

1) Preprocess to extract just the pieces you want, wrap them in root
elements so they are complete documents, then run expat (or whatever)
separately on them using SAX. The preprocess should be fast and easy, and
perhaps could be done using regular expressions, or SAX.  Alternatively, if
the xml is relatively simple, don't wrap the fragments, and process them
using regular expressions instead. (Search the archives of this group for
the last few months to find a reference to "shallow parsing using regular
expressions").

2)  You really are talking about a state machine, I think.  That is, if you
have reached the right piece of the document, you go to a different manner
of handling the elements (they will still parse the same, it's just the
handling that would be different).  So you could explicitly maintain a state
variable and have the SAX (or whatever) callbacks behave differently
according to the state.  This would be conceptually simple but might be a
pain to implement depending on how many different element handlers you will
use.

3) Again as a state machine, you could use a function pointer to specify the
callbacks, and when you change state you change the function pointers to
point to different handlers.  I don't know whether you would have to modify
expat to do this or not, but changes should be minor if needed.
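
A minimal sketch of approach 3, under stated assumptions: the Context struct
and the handler names below are hypothetical, and the events are fed in by
hand rather than by expat. Whether a given parser allows its handlers to be
replaced in the middle of a parse should be checked against that parser's
documentation.

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Context;
using ElementFn = void (*)(Context&, const std::string&);

// The "state" is simply which pair of function pointers is installed.
struct Context {
    ElementFn onStart = nullptr;
    ElementFn onEnd = nullptr;
    int depth = 0;                  // elements open below the current <Object>
    std::vector<std::string> seen;  // element names found inside <Object>
};

void skipStart(Context& ctx, const std::string& name);
void skipEnd(Context& ctx, const std::string& name);
void objectStart(Context& ctx, const std::string& name);
void objectEnd(Context& ctx, const std::string& name);

// State 1: outside the part of interest; only watch for <Object>.
void skipStart(Context& ctx, const std::string& name) {
    if (name == "Object") {
        ctx.depth = 0;
        ctx.onStart = objectStart;  // change state by swapping the handlers
        ctx.onEnd = objectEnd;
    }
}
void skipEnd(Context&, const std::string&) {}

// State 2: inside <Object>; record what is seen until the matching end tag.
void objectStart(Context& ctx, const std::string& name) {
    ++ctx.depth;
    ctx.seen.push_back(name);
}
void objectEnd(Context& ctx, const std::string&) {
    if (ctx.depth == 0) {
        ctx.onStart = skipStart;    // </Object> reached: back to state 1
        ctx.onEnd = skipEnd;
    } else {
        --ctx.depth;
    }
}

Context makeContext() {
    Context ctx;
    ctx.onStart = skipStart;
    ctx.onEnd = skipEnd;
    return ctx;
}
```

Approach 2 would look the same except that a single pair of callbacks would
switch on an explicit state variable instead of being swapped out.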

Regards,

Tom Passin




From tpassin at idsonline.com  Wed Dec  8 13:47:34 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:27 2004
Subject: nestable C/C++ XML parser?
References: <384D3843.F3F7AFB9@fxtech.com> <82jqbu$pns$1@eve.enteract.com> <384D7222.23FAAF41@fxtech.com>
Message-ID: <007001bf4183$6388e3a0$c22a08d1@tomshp>


----- Original Message -----
From: Paul Miller <stele@fxtech.com>

> ...It [...] would be nicer to be able to treat parsing of an element as an
> atomic operation, so you can write code like this:
>
> void Document::ParseDocument(XML_Input &in)
> {
>     // table mapping element names to handlers, terminated by a NULL entry
>     XML_ElementHandler handlers[] = {
>         { "Object", ParseObject },
>         { NULL }
>     };
>     in.Parse(handlers, this);
> }
>
> void Document::ParseObject(XML_Element &element, void *userData)
> {
>     Document *doc = (Document *)userData;
>     Object *obj = new Object;
>     obj->Parse(element);   // consumes everything up to </Object>
>     doc->AddObject(obj);
> }
>
...

> You see in ParseObject() that I can do everything I need to create a new
> object, parse it, and do something with it after I've parsed it. I can
> only do this if the parser lets me parse just a subtree and then stop
> (ie. it returns control back to me when it finds the </Object> token).
>
> --
> Paul Miller - stele@fxtech.com

You can see the difficulty - if you send a fragment to a parser it's not a
valid xml document (so the parser can't work with it).  You could start
building a subtree when you get to the point of interest, using DOM calls,
but you keep saying you don't want to deal with DOM.

Where you are being unclear is when you say "parse just a subtree".  It is
unclear whether you think you need to get (or build) an actual tree
structure, or whether the expression is just a shorthand for indicating a
place in the document.  It is also unclear when you say that, because how do
you know that you are at the right starting place in the document?  I assume
that you have been parsing from the start of the document to get to the
point of interest.  Then you say you want to start parsing at that point.
See why it's confusing?

If you just want to know the names of the elements in the fragment, just
keep a state variable.  I know you said it's too much machinery, but maybe
there is a way it wouldn't be.

Alternatively, there are other tree builders that are simpler than DOM.
Look at Sean McGrath's xml tree code in (I think) "XML By Example" for one
example.  Of course it depends on the complexity of what you are doing.

All in all, I still think that a preprocessing pass to extract the fragments
you want to look at, as I mentioned in my previous post, is the way to go.

Tom Passin




From sb at metis.no  Wed Dec  8 14:19:26 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:27 2004
Subject: nestable C/C++ XML parser?
In-Reply-To: Paul Miller's message of "Tue, 07 Dec 1999 13:20:23 -0500"
References: <384D3843.F3F7AFB9@fxtech.com> <m37liqad77.fsf@ifi.uio.no> <uk8mqek8g.fsf@lanber.cam.citrix.com> <m390368wo3.fsf@ifi.uio.no> <384D4FE7.C81F2CCE@fxtech.com>
Message-ID: <whso1d347q.fsf@viffer.oslo.metis.no>

>>>>> Paul Miller <stele@fxtech.com>:

> ... I want to use XML as an application data file format. Why? Two
> primary reasons:
> 1. I don't need/want to invent a new syntax - I like XML just fine and
> it handles object-oriented nesting of data quite nicely
> 2. I can publish a DTD and make it easier for my end-users to use my
> application data in their own applications 
[snip!]

I have a similar situation, but went for a very different solution:
 1. wrapped a SAXoid interface around expat
 2. wrote a callback class with virtual functions for all
    elements in the DTD
 3. wrote a DocumentHandler that contains a pointer to an instance of
    the callback class, and a table of tag names and pointers to
    member functions of the callback class.  This class also
    does some rudimentary element content checking, but this will be
    dropped when a validating parser is available

Then I have two implementations of the callback class:
 - a simple one for debugging of the expat/sax chain, that just prints 
   out what it receives
 - a complicated one that unpacks attributes, keeps context between
   SAX events on a stack, and builds data structures in the system

The gain here is that since I'm relying on SAX (and plan to track the
standard that David Megginson and James Clark et al. settle on) I will
in the future have a choice of parsers, and can use one that supports
namespaces and/or validation.

It also lets me have the same basic infrastructure for all XML based
formats (I currently have two: our native format and SVG).

The biggest and clumsiest code here, is the recognition and decoding
of element attributes in the callback class.

Good guidelines for efficiency and simplicity are highly desired.
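
For illustration, the table of tag names and member-function pointers
described in step 3 might look something like the sketch below. All class and
element names here are hypothetical stand-ins, not the actual code, and
attribute handling is left out entirely.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical callback class: one virtual function per element in the DTD.
class Callbacks {
public:
    virtual ~Callbacks() = default;
    virtual void onDocument() {}
    virtual void onLayer() {}
    virtual void onObject() {}
};

// Debugging implementation: just records what it receives.
class TraceCallbacks : public Callbacks {
public:
    std::vector<std::string> log;
    void onDocument() override { log.push_back("Document"); }
    void onLayer() override { log.push_back("Layer"); }
    void onObject() override { log.push_back("Object"); }
};

// DocumentHandler: maps tag names to member functions of the callback class.
class DocumentHandler {
public:
    explicit DocumentHandler(Callbacks* callbacks) : callbacks_(callbacks) {
        table_["Document"] = &Callbacks::onDocument;
        table_["Layer"] = &Callbacks::onLayer;
        table_["Object"] = &Callbacks::onObject;
    }

    // Returns false for a tag that is not in the table.
    bool startElement(const std::string& name) {
        auto it = table_.find(name);
        if (it == table_.end()) return false;
        (callbacks_->*(it->second))();  // virtual dispatch through the pointer
        return true;
    }

private:
    using Member = void (Callbacks::*)();
    Callbacks* callbacks_;
    std::map<std::string, Member> table_;
};
```

Because the table stores pointers to the base-class virtuals, the same
DocumentHandler works unchanged with either implementation of the callback
class.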



From jesmith at kaon.com  Wed Dec  8 14:38:04 1999
From: jesmith at kaon.com (Joshua E. Smith)
Date: Mon Jun  7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
In-Reply-To: <384DA88A.BFC736C3@fxtech.com>
Message-ID: <3.0.1.32.19991208093643.00697d64@tiac.net>

>Now I'd like to apply the same concepts to an XML parser, used primarily
>when object-oriented program data is stored as XML syntax.

Is holding the document in memory not an option?  Because if you can hold
it all in memory, it's simple enough to build a tree representation of the
thing, then walk the tree with your handlers, then throw away the tree.

If that isn't an option, then I agree that you have a combination of:
1) A nice interface paradigm (you could even generate a stub of the program
from a DTD!); and,
2) Quite a challenge getting an event-based parser to work with it because
of control-flow issues.

Here's a nutty idea -- Try threads.  Run the parser and expat as two
separate threads and cross-synchronize them.  Each expat handler would
signal the parser thread to go (and then block until it hears back), and
the ::Parse method in the parser thread would signal the expat handler to
continue (and then block until the end handler signals it back).  The two
threads never actually overlap, but you get two processing stacks to handle
the control flow issues.
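
A rough sketch of the hand-off, assuming standard C++ threads rather than any
particular platform API. The Handoff class and the event strings are
hypothetical, and a hard-coded event list stands in for expat. Note that in
this simplified version the producer can get one event ahead of the consumer
instead of alternating strictly. Some toolchains need a threading flag such
as -pthread to build it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <string>
#include <thread>
#include <vector>

// One-slot rendezvous: put() blocks until the consumer has taken the event.
class Handoff {
public:
    void put(const std::string& event) {
        std::unique_lock<std::mutex> lock(mutex_);
        slot_ = event;
        full_ = true;
        changed_.notify_all();
        changed_.wait(lock, [this] { return !full_; });
    }
    std::string take() {
        std::unique_lock<std::mutex> lock(mutex_);
        changed_.wait(lock, [this] { return full_; });
        full_ = false;
        changed_.notify_all();
        return slot_;
    }

private:
    std::mutex mutex_;
    std::condition_variable changed_;
    std::string slot_;
    bool full_ = false;
};

// Pull-style code: reads the children of one <Object> and returns when the
// matching end event arrives. End events are spelled with a leading '/'.
std::vector<std::string> parseObject(Handoff& in) {
    std::vector<std::string> children;
    for (std::string ev = in.take(); ev != "/Object"; ev = in.take())
        if (ev[0] != '/') children.push_back(ev);
    return children;
}

std::vector<std::string> runDemo() {
    Handoff channel;
    const std::vector<std::string> events = {
        "Document", "Object", "Size", "/Size",
        "Color", "/Color", "/Object", "/Document"};

    // "expat" thread: pushes events, blocking on each until it is consumed.
    std::thread parser([&] {
        for (const auto& ev : events) channel.put(ev);
    });

    // Application thread: ordinary nested control flow over the events.
    std::vector<std::string> result;
    for (std::string ev = channel.take(); ev != "/Document"; ev = channel.take())
        if (ev == "Object") result = parseObject(channel);

    parser.join();
    return result;
}
```

The point of the second stack is visible in parseObject(): it reads like a
recursive-descent routine even though the events are being pushed.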

-Joshua Smith




From obecker at informatik.hu-berlin.de  Wed Dec  8 14:50:50 1999
From: obecker at informatik.hu-berlin.de (Oliver Becker)
Date: Mon Jun  7 17:18:27 2004
Subject: VoiceXML
Message-ID: <199912081450.PAA02482@mail.informatik.hu-berlin.de>

Hi there,

I'm exploring the potential of VoiceXML [1] for the development of 
new voice applications. The latest specification is version 0.9 
dated 17 August 1999. Does anybody know further links or resources 
which might be helpful? Currently I'm aware of the IBM/alphaWorks 
VoiceXML tool [2].

In addition Motorola announces VoxML [3], but it's not clear to me
which relationship exists between VoiceXML and VoxML. Motorola
is a member of the VoiceXML Forum.

I would be happy if anyone has more information on this subject
to share with me.

Thanks in advance and best regards,
Oliver

[1] http://www.voicexml.org/
[2] http://www.alphaworks.ibm.com/tech/voicexml/
[3] http://www.voxml.com/voxml.html


/-------------------------------------------------------------------\
|  ob|do        Dipl.Inf. Oliver Becker                             |
|  --+--        E-Mail: obecker@informatik.hu-berlin.de             |
|  op|qo        WWW:    http://www.informatik.hu-berlin.de/~obecker |
\-------------------------------------------------------------------/




From jesmith at kaon.com  Wed Dec  8 15:00:22 1999
From: jesmith at kaon.com (Joshua E. Smith)
Date: Mon Jun  7 17:18:27 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <whso1ghy64.fsf@viffer.oslo.metis.no>
References: <John Aldridge's message of "Mon, 06 Dec 1999 15:01:03 +0000">
 <3.0.6.32.19991206150103.009a1c10@mailhost>
Message-ID: <3.0.1.32.19991208095604.0105cb40@tiac.net>


>> We're using MSVC 6 here, and basic_string<> seems fine. 
>
>It's not.  See eg.
>	http://msdn.microsoft.com/visualc/stl/faq.htm#Q4

Actually, it is fine.  I ran the test program from that faq in the debugger
(stepping thru all the template code), and they clearly fixed this problem
in the transition from VC5 to VC6.

-Joshua Smith




From stele at fxtech.com  Wed Dec  8 14:59:57 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
References: <3.0.1.32.19991208093643.00697d64@tiac.net>
Message-ID: <384E72AC.8B02D25F@fxtech.com>

> Is holding the document in memory not an option?  Because if you can hold
> it all in memory, it's simple enough to build a tree representation of the
> thing, then walk the tree with your handlers, then throw away the tree.

I thought about that at first, and I could do it easily over expat (or
SAX). In fact, I could even keep the same API I came up with, because
then I'd just be dealing with a DOM-like representation.

However, for "large" data-files (several megabytes worth), this could be
a problem, because the in-memory representation could be many megabytes
larger (consider a 3D model with 10,000 vertices, each one with a
<Vertex></Vertex> pair). I suppose I could minimize the amount of extra
memory usage by using hash tables, but I tend to prefer a streaming
solution.

I was going under the assumption that for this type of use, namespaces
and validation probably aren't necessary, so there aren't that many
advantages to layering over expat.

> If that isn't an option, then I agree that you have a combination of:
> 1) A nice interface paradigm (you could even generate a stub of the program
> from a DTD!); and,
> 2) Quite a challenge getting an event-based parser to work with it because
> of control-flow issues.

Thanks, and yeah, it would be/is a real effort to match this idiom to
expat.

> Here's a nutty idea -- Try threads.  Run the parser and expat as two

Ouch! :-)  You get an 'A' for creativity!

-Paul

--
Paul Miller - stele@fxtech.com



From stele at fxtech.com  Wed Dec  8 15:23:44 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
References: <3.0.1.32.19991208093643.00697d64@tiac.net>
Message-ID: <384E783F.38000C5E@fxtech.com>

> Is holding the document in memory not an option?  Because if you can hold
> it all in memory, it's simple enough to build a tree representation of the
> thing, then walk the tree with your handlers, then throw away the tree.

I started thinking about this some more. If I could build a light-weight
in-memory representation of the XML file, then I could build an "even
simpler" DOM-like interface as an option. So if you didn't want to use
the callback-based "discovery" API that I outlined previously, you can
use an alternative iterator-based API that lets you avoid the callbacks
and just iterate over elements you are interested in.

Here is my previous code, rewritten for a similar iterator-based API. As
you can see, it avoids the static callback functions and a lot of the
extra boiler-plate code, but the API is very similar.

Thoughts?

Document *App::LoadDocument(const char *path)
{	
	XML::Input file(path);
	XML::Element elem = file.GetElement("Document");
	Document *doc = new Document(elem.GetAttribute("name"));
	// now iterate over 'Layer' elements
	XML::Element::iterator it;
	for (it = elem.begin("Layer"); it != elem.end(); ++it)
	{
		Layer *layer = new Layer((*it).GetAttribute("name"));
		layer->Parse(*it);
		doc->AddLayer(layer);
	}
	AddDocument(doc);
	return doc;
}

void Layer::Parse(XML::Element &elem)
{
	// look for (required) size element
	mSize.Parse(elem.GetElement("size"));
	// look for object elements
	XML::Element::iterator it;
	for (it = elem.begin("Object"); it != elem.end(); ++it)
	{
		Object *obj = ObjectFactory::Create((*it).GetAttribute("type"));
		obj->Parse(*it);
		AddObject(obj);
	}
}

void Size::Parse(XML::Element &elem)
{
	sscanf(elem.GetData(), "%dx%d", &width, &height);
}

--
Paul Miller - stele@fxtech.com



From greynolds at datalogics.com  Wed Dec  8 15:40:27 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun  7 17:18:27 2004
Subject: A question on nomenclature
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17034@martinique>

> -----Original Message-----
> From: James Tauber [mailto:jtauber@jtauber.com]
> Sent: Tuesday, December 07, 1999 7:21 PM
> 
> > <name>
> >   <first/>
> >   <middle/>
> >   <last/>
> > </name>
> >
> > (James Tauber described this as a "schema-by-example") but what I
> > really want is the name that I would call that "class" of XML documents.
> 
> In linguistic terms, you have a "grammar" defining a "language" which is
> really just a set of "utterances".
> 
> In XML, a "grammar" is generally called a "schema" and an utterance is
> called an "instance". So what you are asking, if I understand correctly,
> is what is the term corresponding to "language".
> 

I'm not familiar with this definition of "schema", but then I haven't been
able to follow the discussion on XML Schema Stuff very closely.  "Grammar"
to me suggests syntax, although probably it should mean the whole ball of
wax - syntax, semantics, lexis, etc.  But "schema" to me means (roughly)
"typed", and thus a mapping from syntactic structures to values, which is
extra-syntactic.  In fact I'd argue that XML _syntax_, strictly speaking,
determines only which sentences are legal in the language, and doesn't even
map (concrete) syntactic structures to abstract ones, which is a kind of
semantics.  Well, it does, but very informally and with some ambiguities.

> The term most consistent with the XML 1.0 REC would probably be
> "document type".
> 
> So you would say you have a "schema" defining a "document type" which is
> really just a set of "instances".
> 

I'm confused by David's example - it clearly can only be construed as an
instance in XML terminology.  One can infer any number of DocTypes
(=languages, grammars) from it, but there is nothing in the example to
support choosing one such language over any other.  Also, based on his post
from yesterday, it sounds like he's thinking of a set with only one member.

> 
> 1. Yes, people get confused between a schema and a schema language and
> use "schema" to mean both.
> 

The whole complex of schema-related terms looks terribly ill-defined to me.
Naturally I've got my own little set of definitions, but can you point me to
what you would consider the clearest and most authoritative?  (Remember I'm
often unable to follow xml-dev closely, so please copy me if you respond.)

> 2. There is a distinction between a schema and the set of valid documents
> for that schema (ie a "document type"). It is the distinction between a
> grammar and the language it defines. So you could use the term "schema"
> for the *definition* of the set of valid documents (whether it's a DTD, a
> W3C XML Schema or a schema-by-example), but the actual set of valid
> documents is best called something else (like "document type").
> 

I'd suggest good old ZF set terminology.  An expression that explicitly
enumerates the members of a set is called an extension expression, and an
expression that logically describes the set is called a set comprehension.
So "{1, 2, 3}" is an extension expr., and "{ i : Z | 0 < i < 4 }" is a
comprehension expression denoting the same set.  (I believe there are some
other terms in use, such as intension, but these two terms are common, and
both are used in Z.)  So the set of all documents that conform to a
particular DTD can be considered the extension of the set defined by that
DTD, which itself is analogous to a set comprehension expression - call it a
Doc. or Lang. comprehension expression.
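
To make the distinction concrete, here is a small sketch of the set
{1, 2, 3} written both ways. The function names are made up for the example,
and the finite search window passed to byComprehension() is an assumption
needed only to make the predicate form computable.

```cpp
#include <cassert>
#include <set>

// Extension: the members are listed explicitly.
std::set<int> byExtension() {
    return {1, 2, 3};
}

// Comprehension: the members are whatever satisfies the predicate 0 < i < 4.
// Only the integers in [lo, hi] are examined.
std::set<int> byComprehension(int lo, int hi) {
    std::set<int> result;
    for (int i = lo; i <= hi; ++i) {
        if (0 < i && i < 4) result.insert(i);
    }
    return result;
}
```

In the analogy, a DTD plays the role of the predicate and the set of
conforming documents plays the role of the explicit list, except that the
latter is usually far too large to write out.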

> Hope this helps

Ditto.

-gregg



From greynolds at datalogics.com  Wed Dec  8 16:32:04 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun  7 17:18:27 2004
Subject: A question on nomenclature
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17038@martinique>

Sorry, just remembered the term

> 
> I'd suggest good old ZF set terminology.  An expression that explicitly
> enumerates the members of a set is called an extension expression, and an
> expression that logically describes the set is called a set comprehension.
> So "{1, 2, 3}" is an extension expr., and "{ i : Z | 0 < i < 4 }" is a
> comprehension expression denoting the same set.  (I believe there are some
> other terms in use, such as intension, but these two terms are common, and
> both are used in Z.)  

"Construction" is the other term I should have mentioned.

You might find Z's usage illuminating.  In Z, a schema is rigorously defined
as a named set of bindings, where a binding is a function (a set of ordered
pairs, not an algorithm) from names to values.  (They're also typed, so each
schema has a signature, defined as a function from names to types; the
values in the bindings must be of the appropriate type.)  There are several
ways to express a schema, but basically you can either write a construction
expression or an extension.  A schema construction expression looks
something like:

	+--[ FOO ]----
	|  i : Z
	+-------------
	|  0 < i < 4
	+-------------

meaning the name "FOO" is bound to the set of bindings of the name "i" to
integral (because of the type declaration using "Z") values satisfying the
predicate 0<i<4.

The same thing can be written using an extension expression, something like:

	FOO == { <| i == 1 |>, <| i == 2 |>, <| i == 3 |> }

"FOO" itself can be used as a type, as in the expression "f : FOO"; dot
notation is used to access the "components" of a schema:  "f.i".

What does this have to do with XML, you ask?  Well, nuttin' right now, but
it's possible to use Z's rigorous semantics to define other languages, e.g.
XML languages; some day in the next millennium my pet project of expressing
a typed semantics for XML stuff using Z will bear fruit.  Maybe.

On the other hand, if "schema" is properly construed in terms of semantic
mappings, then Z provides a very handy, very carefully defined meta-language
for that right now.

-gregg



From Curt.Arnold at hyprotech.com  Wed Dec  8 17:10:44 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:27 2004
Subject: Schema validation of XSLT, SVG, XPath : Part 1 Proposal for lists
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415561@THOR>

A few weeks ago, Tim Berners-Lee strongly suggested that other XML
technologies start using XML Schema.  

I have reviewed several of the other XML technologies and believe with some
minor enhancements, XML Schema can do effective validation of these
technologies.  I've broken up the necessary modifications into several
different messages, so that each can be independently considered and
reviewed, however I see all of them as necessary, reasonable and easy to
implement with minimal additional effort.  I would appreciate any comments:

The following note was written after reviewing XSLT but before reviewing
SVG.  SVG makes much more extensive use of lists, so I believe it adds
even more compelling justification for the proposal.

In XSLT, there are numerous uses of space-separated lists, two of which
cannot be addressed with the DTD compatibility NMTOKENS list type.  This
message identifies them, proposes an additional element for XML Schema
Datatypes that would address delimited lists in a minimally disruptive
manner that would be generally useful, and then presents schema fragments for
the XSLT elements.

I believe this is a compelling (even demanding) argument for inclusion of
list support in the initial version of XML Schema.


1. List usage in XSLT:

<xsl:stylesheet
   extension-element-prefixes = tokens
   exclude-result-prefixes = tokens

<xsl:strip-space elements = nameTests
<xsl:preserve-space elements = nameTests


<xsl:element use-attribute-sets = qnames     //  qname could be done with NMTOKENS

<xsl:attribute-set use-attribute-sets = qnames

<xsl:copy use-attribute-sets = qnames

<xsl:output cdata-section-elements = qnames

Only the strip-space and preserve-space elements could not be done with
NMTOKENS, since a nameTest can have '*' and other non-name characters.
However, there would also be value in capturing the fact that qnames are
qualified names and that extension-element-prefixes should be unqualified
names.

Pattern is a list of LocationPathPatterns. However, since
LocationPathPattern is not used separately, the value of having a
LocationPathPattern datatype and Pattern as a list of LocationPAthPattern is
minimal.

RelativePathPattern would appear to be a list of StepPatterns, however since
the delimiter used ("/" or "//") is significant they would not be
appropriate to treat as a generic list.

                                              
------------

2. Proposed solution:

a) Add list element to schema (uses char datatype defined later) -->

<element name="list">
     <archetype>
	 	<attribute name="minOccurs" datatype="non-negative-integer"
default="0"/>
		<attribute name="maxOccurs"
datatype="non-negative-integer"/>
		<!--  absent of separator attribute means no separator
appears    -->
		<attribute name="separator" basetype="char"/>
		<!--  default value (false) means that items can be
separated by only the separator (if any)
		       true would be useful for comma deliminated lists that
have non-significant white space -->
		<attribute name="ignoreExcessWhitespace" datatype="boolean"
default="true"/>
	 </archetype>
</element>

b) add to datatype element and dataQual archetype

  <element name='datatype'>
     <archetype order='all'>
		<element ref="list"/>
        <element ref='basetype'/>
		....
     </archetype>
  </element>


  <element name='datatype'>
     <archetype order='all'>
	    <element ref="list"/>
        <element ref='basetype'/>
		...
     </archetype>
  </element>

c) Add a couple of new built-in datatypes (though not essential, but
generally useful).   (These are also replicated in a following comment on
additional datatypes.)

<datatype name="char">
     <basetype name="string"/>
    <minLength>1</minLength>
    <maxLength>1</maxLength>
</datatype>

<!--  use of qname could result in namespace expansion in type aware
processors  -->
<datatype name="qname">
     <basetype name="nmtoken"/>
</datatype>

<datatype name="ncname">
	<basetype name="nmtoken"/>
	<!--  disallow : character, basetype takes care of assuring nmtoken
production  -->
	<lexicalRepresentation>[^:]*</lexicalRepresentation>
</datatype>

d) remove special narrative about NMTOKENS and IDREFS and redefine NMTOKENS
and IDREFS as:

<datatype name="idrefs">
	<basetype name="id"/>
	<list/>
</datatype>

<datatype name="nmtokens">
	<basetype name="nmtoken"/>
	<list/>
</datatype>

3. Use of list element in XSLT schema

<datatype name="nameTest">
	<basetype name="string"/>
	<literalRepresentation>\*</literalRepresentation>
	<!--  I'm going to make a separate note on multiple literal Reps
	         basically it means that as long as I match one of the
productions
			 I'm acceptible   -->
	<literalRepresentation datatype="nmtoken"/>
</datatype>

<datatype name="nameTests">
	<basetype name="nameTest"/>
	<list minOccur="1"/>
</datatype>
    
<element name="strip-space">
	<archetype>
	    ....
		<attribute name="elements" datatype="nameTests"/>
		....
	</archetype>
</element>

<attribute name="use-attribute-sets">
	<datatype name="qname">
		<list minOccur="1"/>
	</datatype>
</attribute>

4. Processing

The following seems a reasonable processing mechanisms for list (when
separator="," for clarity)

do
   complete production pattern for basetype
   if ignoreWhitespace is true
        match the following regex [&x0A&0x09&0x0D ]*,[&x0A&0x09&0x0D ]*
   else
        match ,
   end if
loop while there is a match

5. Examples of processing

Example a:

<datatype name="strings">
	<basetype name="string"/>
	<list separator=","/>
</datatype>

<element name="nonsense" datatype="strings"/>

Processing any fragment (including the following):

<nonsense>This, is, only, has, one, item, since, nothing, terminates, the,
string, production</nonsense>

will return a one item list since nothing terminates the string production.

Example b:

<datatype name="quotedString">
	<basetype name="string"/>
	<lexicalRepresentation>"[^"]*"</lexicalRepresentation>
</datatype>

<datatype name="quotedStrings">
	<basetype name="quotedString">
	<list separator=","/>
</datatype>

<element name="nonsense" datatype="quotedStrings"/>

Processing the following fragment will result in two items

<nonsense>"I can have my seperator (,) in here since","nothing had
terminated my production"</nonsense>

The comma in parenthesis is not processed as an item seperator since it was
encountered in the scope of the production pattern for quoted string.

Example c:

<datatype name="floats">
	<basetype name="float"/>
	<list seperator=","/>
</separator>

<element name="nonsense" datatype="floats"/>

Processing the following fragment:

<nonsense>3.1415926, 2.718, 1.414</nonsense>

Would result in a validation error, since the space between the first comma
and second number does not match the float production.  If the list element
had been  <list separator="," ignoreExcessWhitespace="true"/>, then it would
return 3 items.

<nonsense>3.1415926,,1.414</nonsense>

Would also be a validation error, since the null string between the two
comma's does not match the float production.

6. Accessing lists through a type-aware DOM

I definitely think that trying to define how a type-aware DOM would access
provide access to list data is outside the scope of the schema work.
However, it would not appear that adding generic lists would add any new
issues to that work project since
they would have to address how to provide access to the compatibility lists
of NMTOKENS and IDREFS.  There solution to that problem could be as easy as
saying that their is no native type support for lists and you can only get
the entire string back.  However you will have been assured that the string
meets your production requirements.


7. Additional burden on schema validation code

I believe the additional burden on validation authors would be minimal since
the generic list validation code can replace any IDREFS or NMTOKENS
validation code.  I would appreciate any comments from the Xerces or other
schema parser initiative team on their accessment of the additional
development burden.

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From runnable at hotmail.com  Wed Dec  8 17:20:56 1999
From: runnable at hotmail.com (Bent Rasmussen)
Date: Mon Jun  7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
Message-ID: <19991208172022.4168.qmail@hotmail.com>


>Thoughts?

yes - one

>		Layer *layer = new Layer((*it).GetAttribute("name"));
>		layer->Parse(*it);
>		doc->AddLayer(layer);

Why do you call a parse method outside of Layer? The parse method might be 
there but it seems to me that giving the constructor the whole DOM node will 
reduce complexity and since it is implied that the object should use the 
information during construction to build its internal state - it might as 
well just start off by parsing the node during the actual construction. If 
you had a method that returned the object state (e.g. a wrapper object with a
DOM node containing state information), you could then easily throw in a
history mechanism for your program (by letting the document object that holds
the shapes capture and reset the states of objects in a sequential manner).
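
A minimal Python sketch of that idea, offered only as an illustration and not
as anyone's actual code: the constructor takes the DOM node and parses it
itself, and a state method hands back a DOM node that a document object can
keep for history. The element names (drawing, layer, rect, circle) and the
method names get_state, checkpoint and undo are assumptions made up for the
example; only Layer and the name attribute come from the quoted code.

```python
from xml.dom import minidom


class Layer:
    def __init__(self, node):
        # The constructor receives the whole DOM element and builds its
        # internal state from it; no separate Parse() call is needed.
        self.name = node.getAttribute("name")
        self.shapes = [child.tagName for child in node.childNodes
                       if child.nodeType == child.ELEMENT_NODE]

    def get_state(self):
        # Return the current state as a fresh DOM element (hypothetical
        # method, used here for the history mechanism).
        doc = minidom.Document()
        node = doc.createElement("layer")
        node.setAttribute("name", self.name)
        for shape in self.shapes:
            node.appendChild(doc.createElement(shape))
        return node


class Document:
    def __init__(self, xml_text):
        dom = minidom.parseString(xml_text)
        self.layers = [Layer(n) for n in dom.getElementsByTagName("layer")]
        self.history = []

    def checkpoint(self):
        # Capture the state of every layer as DOM nodes.
        self.history.append([layer.get_state() for layer in self.layers])

    def undo(self):
        # Rebuild the layers from the most recent snapshot.
        self.layers = [Layer(node) for node in self.history.pop()]


doc = Document('<drawing><layer name="bg"><rect/><circle/></layer></drawing>')
doc.checkpoint()
doc.layers[0].name = "changed"
doc.undo()
```

Because a snapshot is just a list of DOM nodes, the same constructor serves
both for loading a file and for restoring an earlier state.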

I'm a rookie (I only know Java), but I think it makes sense, and I hope it
does, since I intend to redesign my own Java-based drawing application this
way: using XML as the serialization syntax and feeding it directly to, and
outputting it directly from, the objects that use it.



From Curt.Arnold at hyprotech.com  Wed Dec  8 17:20:17 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:27 2004
Subject: Schema validation of XSLT, SVG, XPath : Part 1 Proposal for lists
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415563@THOR>

A few weeks ago, Tim Berners-Lee strongly suggested that other XML
technologies start using XML Schema.  

I have reviewed several of the other XML technologies and believe that, with
some minor enhancements, XML Schema can validate them effectively.  I've
broken up the necessary modifications into several messages so that each can
be considered and reviewed independently; however, I see all of them as
necessary, reasonable and easy to implement with minimal additional effort.
I would appreciate any comments.

The following note was written after reviewing XSLT but before reviewing
SVG.  SVG makes much more extensive use of lists, so I believe it adds an
even more compelling justification for the proposal.

The datatype draft explicitly defers addressing compound types to the next
revision of Schema.  However, I believe that lists are so essential to
validating these significant XML technologies, and so generally useful, that
they should be addressed in the initial recommendation.

In XSLT, there are numerous uses of space-separated lists, two of which
cannot be addressed with the DTD-compatibility NMTOKENS list type.  This
message identifies them, proposes an additional element for XML Schema
Datatypes that would address delimited lists in a minimally disruptive and
generally useful manner, and then presents schema fragments for the XSLT
elements.

I believe this is a compelling (even demanding) argument for inclusion of
list support in the initial version of XML Schema.


1. List usage in XSLT:

<xsl:stylesheet
   extension-element-prefixes = tokens
   exclude-result-prefixes = tokens

<xsl:strip-space elements = nameTests
<xsl:preserve-space elements = nameTests

<xsl:element use-attribute-sets = qnames     //  qnames could be done with NMTOKENS

<xsl:attribute-set use-attribute-sets = qnames

<xsl:copy use-attribute-sets = qnames

<xsl:output cdata-section-elements = qnames

Only the elements attribute of strip-space and preserve-space could not be
done with NMTOKENS, since a nameTest can have '*' and other non-name
characters.  However, there would also be value in capturing the fact that
the qnames values are qualified names and that the
extension-element-prefixes values should be unqualified names.
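
To make the NMTOKENS limitation concrete, here is a small Python sketch.  It
is only an approximation: the regular expression covers the ASCII subset of
the XML Nmtoken production, and the function name is_nmtokens is an
assumption made up for the example.

```python
import re

# ASCII-only approximation of the XML Nmtoken production (assumption: the
# real production also admits many non-ASCII name characters).
NMTOKEN = re.compile(r"[A-Za-z0-9._:-]+")


def is_nmtokens(value):
    """Return True if value is a whitespace-separated list of Nmtokens."""
    tokens = value.split()
    return bool(tokens) and all(NMTOKEN.fullmatch(t) for t in tokens)


print(is_nmtokens("xsl:template fo:block"))  # a list of qnames passes
print(is_nmtokens("fo:* para"))              # a nameTest with '*' does not
```

A list of qnames is accepted because ':' is a name character, while any
nameTest containing '*' is rejected, which is why the elements attribute
needs something beyond NMTOKENS.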

Pattern is a list of LocationPathPatterns.  However, since
LocationPathPattern is not used separately, the value of having a
LocationPathPattern datatype and Pattern as a list of LocationPathPatterns is
minimal.

RelativePathPattern would appear to be a list of StepPatterns.  However,
since the delimiter used ("/" or "//") is significant, it would not be
appropriate to treat it as a generic list.

                                              
------------

2. Proposed solution:

a) Add a list element to the schema (uses the char datatype defined later):

<element name="list">
     <archetype>
          <attribute name="minOccurs" datatype="non-negative-integer"
                     default="0"/>
          <attribute name="maxOccurs" datatype="non-negative-integer"/>
          <!--  absence of the separator attribute means no separator
                appears  -->
          <attribute name="separator" datatype="char"/>
          <!--  false means that items can be separated by only the
                separator (if any); true (the default) is useful for
                comma-delimited lists that have non-significant white
                space  -->
          <attribute name="ignoreExcessWhitespace" datatype="boolean"
                     default="true"/>
     </archetype>
</element>

b) Add it to the datatype element and the dataQual archetype:

  <element name='datatype'>
     <archetype order='all'>
        <element ref='list'/>
        <element ref='basetype'/>
        ....
     </archetype>
  </element>


  <element name='datatype'>
     <archetype order='all'>
        <element ref='list'/>
        <element ref='basetype'/>
        ...
     </archetype>
  </element>

c) Add a couple of new built-in datatypes (not essential, but generally
useful).  (These are also replicated in a following comment on additional
datatypes.)

<datatype name="char">
     <basetype name="string"/>
     <minLength>1</minLength>
     <maxLength>1</maxLength>
</datatype>

<!--  use of qname could result in namespace expansion in type-aware
      processors  -->
<datatype name="qname">
     <basetype name="nmtoken"/>
</datatype>

<datatype name="ncname">
     <basetype name="nmtoken"/>
     <!--  disallow the ':' character; the basetype takes care of assuring
           the nmtoken production  -->
     <lexicalRepresentation>[^:]*</lexicalRepresentation>
</datatype>

d) Remove the special narrative about NMTOKENS and IDREFS and redefine
NMTOKENS and IDREFS as:

<datatype name="idrefs">
     <basetype name="idref"/>
     <list/>
</datatype>

<datatype name="nmtokens">
	<basetype name="nmtoken"/>
	<list/>
</datatype>

3. Use of list element in XSLT schema

<datatype name="nameTest">
     <basetype name="string"/>
     <lexicalRepresentation>\*</lexicalRepresentation>
     <!--  I'm going to make a separate note on multiple lexical
           representations.  Basically it means that a value is
           acceptable as long as it matches one of the productions.  -->
     <lexicalRepresentation datatype="nmtoken"/>
</datatype>

<datatype name="nameTests">
     <basetype name="nameTest"/>
     <list minOccurs="1"/>
</datatype>
    
<element name="strip-space">
	<archetype>
	    ....
		<attribute name="elements" datatype="nameTests"/>
		....
	</archetype>
</element>

<attribute name="use-attribute-sets">
	<datatype name="qname">
		<list minOccurs="1"/>
	</datatype>
</attribute>

4. Processing

The following seems a reasonable processing mechanism for list (with
separator="," for clarity):

do
   complete production pattern for basetype
   if ignoreExcessWhitespace is true
        match the following regex [\x0A\x09\x0D ]*,[\x0A\x09\x0D ]*
   else
        match ,
   end if
loop while there is a match
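
The loop above can be sketched in Python roughly as follows.  This is an
illustration under stated assumptions, not a reference implementation: the
basetype production is stood in for by an ordinary regular expression, the
names split_list and FLOAT_PATTERN are made up for the example, and
minOccurs/maxOccurs checking is left out.

```python
import re

WHITESPACE = r"[\x0A\x09\x0D ]*"

# Stand-in for the float production (assumption: a simple decimal form).
FLOAT_PATTERN = r"[+-]?\d+(?:\.\d+)?"


def split_list(text, item_pattern, separator=",",
               ignore_excess_whitespace=True):
    """Split text into list items, raising ValueError if it is not valid."""
    item_re = re.compile(item_pattern)
    if ignore_excess_whitespace:
        sep_re = re.compile(WHITESPACE + re.escape(separator) + WHITESPACE)
    else:
        sep_re = re.compile(re.escape(separator))

    items = []
    pos = 0
    while True:
        # "complete production pattern for basetype"
        item = item_re.match(text, pos)
        if item is None:
            raise ValueError("no item matches at offset %d" % pos)
        items.append(item.group(0))
        pos = item.end()
        # "loop while there is a match" of the separator
        sep = sep_re.match(text, pos)
        if sep is None:
            break
        pos = sep.end()

    if pos != len(text):
        raise ValueError("unexpected text at offset %d" % pos)
    return items


print(split_list("3.1415926, 2.718, 1.414", FLOAT_PATTERN))
```

Because the item pattern is always completed before the separator is looked
for, a separator character inside an item (for example inside a quoted
string) is never treated as a delimiter.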

5. Examples of processing

Example a:

<datatype name="strings">
	<basetype name="string"/>
	<list separator=","/>
</datatype>

<element name="nonsense" datatype="strings"/>

Processing any fragment (including the following):

<nonsense>This, is, only, has, one, item, since, nothing, terminates, the,
string, production</nonsense>

will return a one item list since nothing terminates the string production.

Example b:

<datatype name="quotedString">
	<basetype name="string"/>
	<lexicalRepresentation>"[^"]*"</lexicalRepresentation>
</datatype>

<datatype name="quotedStrings">
	<basetype name="quotedString"/>
	<list separator=","/>
</datatype>

<element name="nonsense" datatype="quotedStrings"/>

Processing the following fragment will result in two items:

<nonsense>"I can have my separator (,) in here since","nothing had
terminated my production"</nonsense>

The comma in parentheses is not processed as an item separator since it was
encountered in the scope of the production pattern for quoted string.

Example c:

<datatype name="floats">
	<basetype name="float"/>
	<list separator="," ignoreExcessWhitespace="false"/>
</datatype>

<element name="nonsense" datatype="floats"/>

Processing the following fragment:

<nonsense>3.1415926, 2.718, 1.414</nonsense>

would result in a validation error, since the space between the first comma
and the second number does not match the float production.  If the list
element had been <list separator=","/>, then it would return 3 items.

<nonsense>3.1415926,,1.414</nonsense>

would also be a validation error, since the null string between the two
commas does not match the float production.

6. Accessing lists through a type-aware DOM

I definitely think that trying to define how a type-aware DOM would provide
access to list data is outside the scope of the schema work.  However, it
would not appear that adding generic lists would add any new issues to that
work, since it would already have to address how to provide access to the
compatibility lists of NMTOKENS and IDREFS.  The solution to that problem
could be as easy as saying that there is no native type support for lists and
you can only get the entire string back.  However, you will have been assured
that the string meets your production requirements.


7. Additional burden on schema validation code

I believe the additional burden on authors of validation code would be
minimal, since the generic list validation code can replace any IDREFS or
NMTOKENS validation code.  I would appreciate any comments from the Xerces or
other schema parser teams on their assessment of the additional development
burden.



From donpark at docuverse.com  Wed Dec  8 17:24:07 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
Message-ID: <000001bf41a1$2242b380$099918d1@docuverse1>

>This is information for a specific kind of XML processor
>(an indexing robot), but it is not specific to the document
>type. So we need a mechanism that applies to any XML document
>and can be automatically ignored by non-robot processors.
>A PI is an exact fit. Even the name is right -- it is an
>instruction to the robot about how to process it.
>
>The alternative, adding an element to every DTD in the 
>universe, with the corresponding breakage to every processor
>that reads those DTDs, is just too awful to contemplate.

If you intend the indexing PI to be used by document
creators, it is not unreasonable to expect them to
include it in their DTD.

Frankly, this is one of the reasons I do not like to
use DTD.  I am in favor of adopting the policy of
ignoring foreign elements and attributes for extensibility.
Absolute ordering of elements also detracts from
extensibility.  Relative ordering of elements is fine
though.

My ideal solution for this problem is a small set of
elements that can be embedded into documents.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From stele at fxtech.com  Wed Dec  8 17:37:14 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:27 2004
Subject: RFC: "even simpler" C++ XML parser for object hierarchies
References: <19991208172022.4168.qmail@hotmail.com>
Message-ID: <384E9784.24037450@fxtech.com>

> >               Layer *layer = new Layer((*it).GetAttribute("name"));
> >               layer->Parse(*it);
> >               doc->AddLayer(layer);

> Why do you call a parse method outside of Layer? The parse method might be
> there but it seems to me that giving the constructor the whole DOM node will
> reduce complexity and since it is implied that the object should use the
> information during construction to build its internal state - it might as
> well just start off by parsing the node during the actual construction. If

That's a good point. One advantage to doing it this way is your objects
do not *necessarily* need to know anything about the actual parsing
mechanism. You can call an object parser that parses out the required
attributes and entities on behalf of the object, and then call normal
object methods to get it into the state you want.

Both ways could be utilized trivially.
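
As a rough Python sketch of the first way (an external object parser), under
assumptions of my own: the opacity attribute, the set_opacity method and the
parse_layer function are hypothetical names for illustration, and only Layer
and the name attribute come from the code quoted above.

```python
from xml.dom import minidom


class Layer:
    """Knows nothing about XML; it is configured through normal methods."""

    def __init__(self, name):
        self.name = name
        self.opacity = 1.0

    def set_opacity(self, value):
        self.opacity = value


def parse_layer(node):
    # The object parser pulls the required attributes out of the DOM node
    # on behalf of the object, then calls ordinary object methods.
    layer = Layer(node.getAttribute("name"))
    if node.hasAttribute("opacity"):
        layer.set_opacity(float(node.getAttribute("opacity")))
    return layer


node = minidom.parseString('<layer name="bg" opacity="0.5"/>').documentElement
layer = parse_layer(node)
```

Swapping the parsing mechanism then only touches parse_layer, not Layer.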

--
Paul Miller - stele@fxtech.com



From Curt.Arnold at hyprotech.com  Wed Dec  8 18:38:02 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:27 2004
Subject: Schema validation of XSLT, SVG, XPath: Part 2: Multiple Lexical Representation
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415565@THOR>

Sorry about the duplicate (or near duplicate) Part 1 messages.  I'm not
really sure how that happened.

This part makes some simple modifications that greatly simplify specifying
the lexical representation of datatypes that have several forms.

Here are some productions that would be difficult to enforce without the
suggested modifications:

NameTest from XPath:

NameTest ::= '*'
           | NCName ':' '*'
           | QName

NCName from XML Namespaces:
[4] NCName ::= (Letter | '_') (NCNameChar)*
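
The "match any one of several patterns" idea behind the NameTest production
can be sketched in Python as below.  This is only an approximation: the
NCNAME expression covers ASCII letters, digits, '.', '-' and '_' rather than
the full Letter and NCNameChar productions, and the names used are
assumptions for the example.

```python
import re

# ASCII-only approximation of NCName (assumption: the real production admits
# many more Unicode letters and name characters).
NCNAME = r"[A-Za-z_][A-Za-z0-9._-]*"
QNAME = rf"(?:{NCNAME}:)?{NCNAME}"

# One compiled pattern per alternative of the NameTest production.
NAME_TEST_ALTERNATIVES = [
    re.compile(r"\*"),            # '*'
    re.compile(rf"{NCNAME}:\*"),  # NCName ':' '*'
    re.compile(QNAME),            # QName
]


def is_name_test(text):
    """A value is acceptable as long as it matches one of the alternatives."""
    return any(p.fullmatch(text) for p in NAME_TEST_ALTERNATIVES)


for candidate in ["*", "xsl:*", "xsl:template", "*:para"]:
    print(candidate, is_name_test(candidate))
```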

The SVG path data datatype (the datatype of the d attribute):

<path d="M 100 100 L 140 100 L 120 140 z"/>

Proposal:

After reviewing the current datatypes doc, I'm a little confused about what
happened to the previous lexicalRepresentation element.  The interpretation
of pattern and lexical is not adequately discussed.  I'm moving more things
around than I thought I would need to, but here goes.  Here are what I think
would be reasonable renderings of the previous production patterns:

<datatype name="nameTest">
	<basetype name="string"/>
	<!--  this uses the lexical element to represent all legal string
	      encodings of nameTest; for a string to be a valid nameTest,
	      one of the enclosed patterns must match and the string must
	      conform to the lexical representation of the base type  -->
	<lexical>
		<!--  could be just an asterisk  -->
		<pattern>\*</pattern>
		<!--  matching this pattern means that it matches the
		      namespaceWildcard datatype and the default pattern
		      of ".*"  -->
		<pattern datatype="namespaceWildcard"/>
		<pattern datatype="qname"/>
	</lexical>
</datatype>
<datatype name="namespaceWildcardFragments">
	<basetype name="string"/>
	<lexical>
		<pattern datatype="ncname"/>
		<pattern>:</pattern>
		<pattern>\*</pattern>
	</lexical>
</datatype>
<datatype name="namespaceWildcard">
	<basetype name="namespaceWildcardFragments"/>
	<list minOccurs="3" maxOccurs="3"/>
	<!--  regex constraint on entire list, making sure that the last two
	      characters are :*  -->
	<lexical><pattern>.*:\*</pattern></lexical>
</datatype>

<datatype name="qnameFragments">
	<lexical>
		<pattern datatype="ncname"/>
		<pattern>:</pattern>
	</lexical>
</datatype>
<datatype name="qname">
	<basetype name="qnameFragments"/>
	<list minOccurs="1" maxOccurs="3"/>
	<lexical>
		<!--  matches any string of two or more characters that
		      doesn't have an initial or final colon  -->
		<pattern>[^:].*[^:]</pattern>
		<!--  matches any non-colon single character string  -->
		<pattern>[^:]</pattern>
	</lexical>
</datatype>
<datatype name="ncname">
	<basetype name="nmtoken"/>
	<!--  disallow colon from nmtoken  -->
	<lexical><pattern>[^:]*</pattern></lexical>
</datatype>

<!--  SVG example    -->
<datatype name="svgcoord">
	<basetype name="real"/>
	<list minOccurs="2" maxOccurs="2"/>
</datatype>
<datatype name="moveCommandFragment">
	<basetype name="string"/>
	<lexical>
		<pattern>[Mm]</pattern>
		<pattern datatype="svgcoord"/>
	</lexical>
</datatype>
<datatype name="moveCommand">
	<basetype name="moveCommandFragment"/>
	<list minOccurs="2"/>
	<lexical>
		<pattern>[Mm][^Mm]*</pattern>
	</lexical>
</datatype>
...   omitted for other SVG productions

<datatype name="pathdataItem">
	<basetype name="string"/>
	<lexical>
		<pattern datatype="moveCommand"/>
		<pattern datatype="curvetoCommand"/>
		<pattern datatype="smoothCommand"/>
		<pattern datatype="arcCommand"/>
		...
	</lexical>
</datatype>
<datatype name="pathdata">
	<basetype name="pathdataItem"/>
	<list/>
</datatype>
<attribute name="d" datatype="pathdata"/>
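
To illustrate how a validator might treat path data as a list whose items
each match one of several command patterns, here is a deliberately reduced
Python sketch.  It is an assumption-laden illustration: only moveto (M/m),
lineto (L/l) and closepath (Z/z) are handled, the number syntax is
simplified, and the function name split_path_data is made up.

```python
import re

NUMBER = r"[+-]?(?:\d+\.?\d*|\.\d+)"
COORD = rf"{NUMBER}[\s,]+{NUMBER}"

# One pattern per command kind; the curve, smooth and arc commands are
# omitted from this sketch.
COMMANDS = [
    re.compile(rf"[Mm]\s*{COORD}(?:[\s,]+{COORD})*"),
    re.compile(rf"[Ll]\s*{COORD}(?:[\s,]+{COORD})*"),
    re.compile(r"[Zz]"),
]
INTER_COMMAND_SPACE = re.compile(r"\s*")


def split_path_data(d):
    """Split a path data string into commands, or raise ValueError."""
    d = d.strip()
    items = []
    pos = 0
    while pos < len(d):
        for pattern in COMMANDS:
            match = pattern.match(d, pos)
            if match:
                break
        else:
            raise ValueError("no command matches at offset %d" % pos)
        items.append(match.group(0))
        # Skip optional whitespace before the next command.
        pos = INTER_COMMAND_SPACE.match(d, match.end()).end()
    return items


print(split_path_data("M 100 100 L 140 100 L 120 140 z"))
```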




From wunder at infoseek.com  Wed Dec  8 21:38:01 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
In-Reply-To: <000001bf41a1$2242b380$099918d1@docuverse1>
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
Message-ID: <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>

At 09:24 AM 12/8/99 -0800, Don Park wrote:
>
>If you intend the indexing PI to be used by document
>creators, it is not unreasonable to expect them to
>include it in their DTD.
>
>Frankly, this is one of the reasons I do not like to
>use DTD.  I am in favor of adopting the policy of
>ignoring foreign elements and attributes for extensibility.
>Absolute ordering of elements also detracts from
>extensibility.  Relative ordering of elements is fine
>though.
>
>My ideal solution for this problem is a small set of
>elements that can be embedded into documents.

Adding the robots info to every DTD in the world requires
unanimous agreement. Adding a PI requires non-interference
with other PIs, a vastly simpler task. Waiting for XML
to support mixin vocabularies and for those to be widely
used, could take a few years. So the element-based approach
just doesn't fit my definition of "ideal". That was the
approach I originally thought about, but there were just
too many obstacles for it to succeed.

So, though the element-based approach might be more 
comfortable for authors, the robots PI fits both the letter 
and the intent of PIs in the XML spec and it does the job.
If it is blessed as a standard, it sure would be easy for
an XML editor to add a dialog box to generate it. That 
would be even easier for authors.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946



From donpark at docuverse.com  Wed Dec  8 22:27:16 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
Message-ID: <000201bf41cb$7ab65740$099918d1@docuverse1>

>Adding the robots info to every DTD in the world requires
>unanimous agreement. Adding a PI requires non-interference
>with other PIs, a vastly simpler task. Waiting for XML
>to support mixin vocabularies and for those to be widely
>used, could take a few years. So the element-based approach
>just doesn't fit my definition of "ideal". That was the
>approach I originally thought about, but there were just
>too many obstacles for it to succeed.

It is the 'interference' notion that I am against because
it causes loss of extensibility.  This notion creates an
imbalance between document creator and subsequent document
processors because any change to the document would have
to use arcane XML features like PI to avoid 'interfering'
with the original document.

Perhaps a more general placeholder standard for tags such
as your indexing tag/PI is what we need.  For example:

<meta:info xmlns:meta="http://www.xml.org/metainfo">
</meta:info>

In the DTD, the content of meta:info is declared to be ANY so
that any meta-info tags such as your index tag can be
dropped in.

Such a general proposal would have a far better chance of
being adopted than a more specific proposal.  Once it is
adopted, everyone can use it to drop in tags for their
own purpose.
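
A rough Python sketch of how a processor might pick up whatever has been
dropped into such a placeholder.  The namespace URI is the one from the
example above; the robots element, its urn:example:robots namespace and the
function name are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET

META_NS = "http://www.xml.org/metainfo"  # namespace URI from the example


def meta_info_children(xml_text):
    """Return every element found inside any meta:info placeholder."""
    root = ET.fromstring(xml_text)
    found = []
    for info in root.iter("{%s}info" % META_NS):
        found.extend(list(info))
    return found


sample = (
    '<doc xmlns:meta="http://www.xml.org/metainfo" '
    'xmlns:r="urn:example:robots">'
    '<meta:info><r:robots index="yes" follow="no"/></meta:info>'
    '<body>text</body>'
    '</doc>'
)
children = meta_info_children(sample)
```

A robot would then look only for the element names it understands among the
children and ignore the rest.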

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From francis at redrice.com  Thu Dec  9 00:29:46 1999
From: francis at redrice.com (Francis Norton)
Date: Mon Jun  7 17:18:27 2004
Subject: XML. SAX. Streaming processing with Groves.
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii> <011101bf3d53$60df73a0$e5d88dce@WORKGROUP>
Message-ID: <384ADEEF.74C1E4F1@redrice.com>

XPath seems to have missed the boat for DOM level 2. Is there any chance
that XPath will be included in level 3? I can see that it doesn't appear
to fit in to the roadmap, but as someone who does commercial
program-to-program programming I would find not only the basic
functionality but access to the neater data model a real aid to
productivity.

Francis.

Michael Champion wrote:
> 
...
> 
> The DOM WG will be defining the requirements for Level 3 over the next 6
> weeks or so.  Standard APIs for loading, saving, parsing, and serializing
> XML text are "must have" items for Level 3, and this issue (that an
> application may want access to the elements of a document before it is fully
> parsed) has come up. For example, a programmer might choose not to continue
> parsing some huge document after the necessary data were found.
> 
> Concrete suggestions for actual APIs or pointers to APIs that allow this
> would be appreciated.
>




From Mike.Champion at softwareag-usa.com  Thu Dec  9 01:05:16 1999
From: Mike.Champion at softwareag-usa.com (Michael Champion)
Date: Mon Jun  7 17:18:27 2004
Subject: XML. SAX. Streaming processing with Groves.
References: <033701bf3d4c$0b846ca0$5df5c13f@PaulTchistopolskii> <011101bf3d53$60df73a0$e5d88dce@WORKGROUP> <384ADEEF.74C1E4F1@redrice.com>
Message-ID: <009801bf41e1$2e0ffcf0$5dbdb3c7@WORKGROUP>


----- Original Message ----- 
From: Francis Norton <francis@redrice.com>
To: Michael Champion <Mike.Champion@softwareag-usa.com>
Cc: <xml-dev@ic.ac.uk>
Sent: Sunday, December 05, 1999 4:53 PM
Subject: Re: XML. SAX. Streaming processing with Groves.


> XPath seems to have missed the boat for DOM level 2. Is there any chance
> that XPath will be included in level 3? I can see that it doesn't appear
> to fit in to the roadmap, but as someone who does commercial
> program-to-program programming I would find not only the basic
> functionality but access to the neater data model a real aid to
> productivity.

I agree ... and will forward this suggestion to the DOM 
working group.




From sunker at telkom.net  Thu Dec  9 02:46:53 1999
From: sunker at telkom.net (sunker@telkom.net)
Date: Mon Jun  7 17:18:27 2004
Subject: XSLT with DOM
Message-ID: <61D3A6AB14FED211856500001C055D9633E042@FS01>

Hi all,

I have a problem transforming a DOM document with XSL.

My XML includes a DTD with an entity reference to a server-side page such as
test.asp (I process it with GenXMLToHTML(xml,xsl)). When the result is displayed
in the web browser, the page is blank. But when I reference the stylesheet
directly in the XML file, it works properly. Why? Is it possible that the XML
parser cannot load the output of one process from within another, or is the
XSL processor at fault?

For more information, here is the example:

DOM TRANS XML TO HTML:
function GenXMLToHTML(xmlf,xslf)
{
   // Load the source document synchronously, without DTD validation.
   var xmlfile = new ActiveXObject("Microsoft.XMLDOM");
   xmlfile.async = false;
   xmlfile.validateOnParse = false;
   xmlfile.load(Server.MapPath(xmlf));

   // Load the stylesheet the same way.
   var xslfile = new ActiveXObject("Microsoft.XMLDOM");
   xslfile.async = false;
   xslfile.validateOnParse = false;
   xslfile.load(Server.MapPath(xslf));

   // Apply the stylesheet and send the result to the browser.
   Response.Write(xmlfile.transformNode(xslfile));
}

=======================================
XML FILENAME = TEST.XML
<?xml version="1.0"?>
<!DOCTYPE TRADING[
<!ELEMENT TRADING (Roots?)>
<!ELEMENT Roots (Pale*)>
<!ELEMENT Pale (#PCDATA)>
<!ENTITY port SYSTEM "test.asp">
]> 
<TRADING>
   &port;
</TRADING> 
=======================================

XSL FILENAME = TEST.XSL
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl"
xmlns="http://www.w3.org/TR/REC-html40" result-ns="HTML">

<xsl:template match="/">
      <HTML>
         <HEAD>
            <TITLE>Account</TITLE>
            <META NAME="description" CONTENT="Member Account"/>
            <META NAME="keywords" CONTENT="Account"/>
            <META NAME="Author" CONTENT="Sunker"/>
            <META NAME="generator" CONTENT="XHTML 5.00.2314.1000"/>
            <META HTTP-EQUIV="no-cache"/>
            <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252"/>
            <META HTTP-EQUIV="Content-Script-Type" CONTENT="text/javascript"/>
            <META HTTP-EQUIV="Content-Style-Type" CONTENT="text/css"/>
            <META HTTP-EQUIV="Window-target" CONTENT="main"/>
         </HEAD>
         <BODY>
         <xsl:apply-templates select="TRADING/Roots/Pale"/>
         </BODY>
      </HTML>  
</xsl:template>

<xsl:template match="TRADING/Roots/Pale">
   <DIV STYLE="COLOR:BLUE;FONT-WEIGHT:BOLD">
     <xsl:eval>childNumber(this)</xsl:eval>-<Span><xsl:value-of select="."/></Span>
   </DIV>
</xsl:template>

</xsl:stylesheet>

=======================================
ASP FILENAME = TEST.ASP
<%@  Language=JScript %>
<%Response.ContentType="text/xml"
msg = '<Roots>'
 for (var i=0;i<100;i++){
   msg+='<Pale>Pxc'+i+'</Pale>';
 }
 msg+='</Roots>';
Response.Write (msg);
%>

Thanks,
Sunker
(This is an XML page generated by GenXMLToHTML: http://www.geocities.com/researchtriangle/campus/7211)
From tbray at textuality.com  Thu Dec  9 06:04:18 1999
From: tbray at textuality.com (tbray@textuality.com)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com> <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
Message-ID: <0ab601bf420b$71f94be0$0500a8c0@ned>

From: Walter Underwood <wunder@infoseek.com>
> Adding the robots info to every DTD in the world requires
> unanimous agreement. Adding a PI requires non-interference
> with other PIs, a vastly simpler task. Waiting for XML
> to support mixin vocabularies and for those to be widely
> used, could take a few years.

Walter is right on both counts, but I'm having trouble getting comfortable
with his PI idea.  Not violently against it, but two things make me
uncomfortable.  First of all, PIs basically suck.  Having said that, if you
gotta use them, this is the kind of thing to use them for.

But my big problem is with the idea that individual resources ought to embed
robot-steering information. It just feels like the wrong level of
granularity.  Either this ought to be done externally in something like
robots.txt but smarter, at the webmaster/administrator level, or, with a
namespaced vocabulary at the individual element level.  Note that the
external file and the embedded element-level stuff could have the same
namespaced vocabulary.  The PI has the characteristic that it *has* to be in
the document and can modify *only* the whole document.  Also I question the
ability of authors to do the right thing with this kind of a macro-level
control.  Also I question the ability of robot authors to do the right thing
at the individual document level.

In any case, there really should be a namespace with a bunch of predeclared
attributes for this purpose; then for those who want to do fancy things,
they can do so in a clean way at the individual element level.  For those
who *don't* want to wire robot stuff into their document structure, but *do*
want individual resource-level control and *don't* want to do it in a
centralized way, I guess the PI is a tolerable kludge; but it doesn't seem
like much more than that.

Anyhow, is there enough XML on the web to make this interesting?  Serious
question, I don't know the answer. -T.




From ricko at allette.com.au  Thu Dec  9 06:11:05 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
Message-ID: <007601bf420f$a8914aa0$2cf96d8c@NT.JELLIFFE.COM.AU>


From: Don Park <donpark@docuverse.com>

 Not Walter wrote:
>>I'm not Walter, but to me this has the obvious advantage that it can
>>be used completely orthogonally to the document contents and the
>>software used to process the document for non-indexing purposes.
>
>IMHO, this line of thinking (aka 'sacred content')
>forces us to use PI or special attributes for
>extension of document instances.  Poor use of
>the letter 'X' in XML.

But thinking (methodology) forces us to have a need; a markup language
either supports that need or not.  Having PIs does not force anyone to
use them.

At www.apache.org, the first design of Cocoon uses PIs; the second design
does not. The comments there are interesting and useful in explaining why,
but they also make the mistake of saying that because they have moved to
a level of system complexity where PIs are not needed, PIs are therefore bad;
this is despite their having used them in their first system.  So PIs, at
least, provide an alternative that many designers find natural.

Rick Jelliffe




From ricko at allette.com.au  Thu Dec  9 06:40:13 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
Message-ID: <00cd01bf4213$c9c4e840$2cf96d8c@NT.JELLIFFE.COM.AU>


From: tbray@textuality.com <tbray@textuality.com>

>Walter is right on both counts, but I'm having trouble getting comfortable
>with his PI idea.  Not violently against it, but two things make me
>uncomfortable.  First of all, PIs basically suck.  Having said that, if you
>gotta use them, this is the kind of thing to use them for.

If PIs suck, then perhaps they suck in the same way that using #defines in
C++ does or the SQLJ preprocessor does:  it can be a sign of insufficient
analysis in the whole system (perhaps for legitimate reasons: the need may
have emerged over time) or because of habit or to clearly demarcate different
processing inputs to simplify subsequent phases or because of a deficiency
in the underlying language.

But this is not to allow that PIs suck in the first place.

Actually, to use the C++ analogy, I think PIs correspond to pragmas more than
anything.

Rick Jelliffe




From efren at banesto.es  Thu Dec  9 09:33:24 1999
From: efren at banesto.es (Efren)
Date: Mon Jun  7 17:18:27 2004
Subject: Manual XML ?
Message-ID: <384F76E0.265E88C8@banesto.es>

Hello everyone,
can anyone tell me where I can find a good XML manual?

Thanks




From rev-bob at gotc.com  Thu Dec  9 10:28:32 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:27 2004
Subject: A processing instruction for robots
Message-ID: <199912090527787.SM01128@Unknown.>

I suppose this is where I'm supposed to come in....  ;)

> > Adding the robots info to every DTD in the world requires
> > unanimous agreement. Adding a PI requires non-interference
> > with other PIs, a vastly simpler task. Waiting for XML
> > to support mixin vocabularies and for those to be widely
> > used, could take a few years.
> 
> Walter is right on both counts, but I'm having trouble getting comfortable
> with his PI idea.  Not violently against it, but two things make me
> uncomfortable.  First of all, PIs basically suck.  Having said that, if you
> gotta use them, this is the kind of thing to use them for.

Agreed.  This is an instruction to a specific class of processor, hence it's a good fit as a 
PI from that angle.

> But my big problem is with the idea that individual resources ought to embed
> robot-steering information. It just feels like the wrong level of
> granularity.  Either this ought to be done externally in something like
> robots.txt but smarter, at the webmaster/administrator level, or, with a
> namespaced vocabulary at the individual element level.

This has been tried.  The problem is that the current robots.txt idea just doesn't work for 
everybody - robots.txt is supposed to reside in the domain's root [1], and not everybody 
has that access.  (Big examples: Geocities, Angelfire, Tripod, AOL....)  Granted, a tweak 
to that specification that would allow local copies of robots.txt to affect their subdirectory 
tree would be *most* helpful in that regard, but that just doesn't exist.

[1] - See http://info.webcrawler.com/mak/projects/robots/norobots.html under "The 
Method" header.  The filename is "/robots.txt" - which forces the file into the root.
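
To illustrate the point about the root, here is a small Python sketch of how a robot would locate the file for any page. The function name robots_txt_url is invented for this example; the fixed "/robots.txt" path is the one quoted from the norobots document above.

```python
from urllib.parse import urljoin


def robots_txt_url(page_url):
    """Return the one place a robot looks for exclusion rules for this page.

    The leading slash makes the path absolute, so every page on a host maps
    to the same file regardless of which user directory it lives in.
    """
    return urljoin(page_url, "/robots.txt")
```

A page several directories deep on a shared host therefore resolves to the host's own file, which the page author cannot edit.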

Because of this overwhelming gap, there's a hack in HTML that uses META to 
granularize this at a per-document level, and a few bots are good about obeying that 
syntax.

> The PI has the characteristic that it *has* to be in the document and can modify
> *only* the whole document.  Also I question the ability of authors to do the right
> thing with this kind of a macro-level control.

I do it all the time.  In fact, I have a default value I can specify in my templates.  (Yes, I 
could use a robots.txt file - the current method is a holdover from before I had a domain 
name for my site.)

> Also I question the ability of robot authors to do the right thing at the individual
> document level.

That's already a current issue.  Bot authors who are conscientious enough to obey the 
META hack will have no problem modifying their source to obey the XML PI as well; 
it's a trivial transformation.  (Especially if the PI uses syntax that's as close to the META 
version as possible!)
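
As a rough sketch of how small that change could be, the Python below reads a robots PI with the standard xml.sax module and turns it into the same kind of string a META-aware bot already handles. Everything about the PI itself is an assumption made for illustration: the target name "robots", the index/follow pseudo-attributes and their yes/no values are not taken from any agreed syntax. The names RobotsPIHandler, robots_directives and meta_content are likewise invented.

```python
import re
import xml.sax

# Assumed PI form, for illustration only: <?robots index="no" follow="yes"?>
PSEUDO_ATTR = re.compile(r'(\w+)\s*=\s*"([^"]*)"')


class RobotsPIHandler(xml.sax.ContentHandler):
    """Collects index/follow settings from a document-level robots PI."""

    def __init__(self):
        super().__init__()
        # With no PI present, behave as if indexing and following are allowed.
        self.directives = {"index": True, "follow": True}

    def processingInstruction(self, target, data):
        if target != "robots":
            return  # PIs meant for other processors are ignored
        for name, value in PSEUDO_ATTR.findall(data):
            if name in self.directives:
                self.directives[name] = value.lower() != "no"


def robots_directives(xml_bytes):
    handler = RobotsPIHandler()
    xml.sax.parseString(xml_bytes, handler)
    return handler.directives


def meta_content(directives):
    """Render the directives as the CONTENT value used by the META hack."""
    return ", ".join([
        "index" if directives["index"] else "noindex",
        "follow" if directives["follow"] else "nofollow",
    ])
```

Once the directives are in this form, whatever code a bot already has for the META values can be reused unchanged.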

> In any case, there really should be a namespace with a bunch of predeclared
> attributes for this purpose; then for those who want to do fancy things,
> they can do so in a clean way at the individual element level.

Fine - swipe the existing values and go from there.  The fewer changes made, the better - 
from all viewpoints.  Not only will there be fewer deltas for page authors to learn, but 
bot authors will be better able to just reuse existing META code to accommodate the PI.

Note that I'm not saying that a local robots file wouldn't be a wonderful idea - just that 
since you currently have only the choices of "global" and "per document" with HTML, 
you ought to have *at least* those same choices with XML.  A local robots.txt would be 
tasty gravy indeed.

> Anyhow, is there enough XML on the web to make this interesting?  Serious
> question, I don't know the answer. -T.

I have enough X(HT)ML up to be very interested in this matter - and there's only going 
to be more online as the spec progresses.  Why not address the issue *before* there's a 
huge amount of X(HT)ML online, instead of waiting until a few assorted hacks come 
up?


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/





From Andy.Bradbury at syntegra.bt.co.uk  Thu Dec  9 11:04:57 1999
From: Andy.Bradbury at syntegra.bt.co.uk (Andy.Bradbury@syntegra.bt.co.uk)
Date: Mon Jun  7 17:18:27 2004
Subject: LINK.VBS
Message-ID: <65AF45D5E535D2118AFB0008C7FA2318035A9B01@FL-EXCHANGE-03>

With regard to the LINK.VBS virus, it seems it is another "send yourself to
everyone on the current victim's mailing list"-type virus - which means it
*could* come from a source that is normally quite unimpeachable.
 
The message below is a useful follow-up to the original warning:

------------------------------------------------------------------------
 
If you were to double click on the attachment, I suspect
that the VB Script would access files on your hard drive
and mail copies of itself to addresses in your Contacts folder.
 
I did the wrong thing and got a porno site and a desktop icon.
The virus detector detected the virus and deleted 
c:\WINDOWS\TEMP\LINK.VBS
c:\WINDOWS\SYSTEM\RUNDLL.VBS
being the only two contaminated files. If you got this virus then delete the
email, delete the desktop icon and delete the above two files and hope it is
not anywhere else.
 
It came from somewhere I would have normally trusted. So be warned
 
Regards Trevor Croll 
 
------------------------------------------------------------------------
 
Regards
 
Andy B.
 
 


From h.rzepa at ic.ac.uk  Thu Dec  9 11:52:24 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:18:27 2004
Subject: LISTADMIN: Archive of XML-DEV, 1997-1999
Message-ID: <v04220820b4754741d953@[155.198.224.86]>

Several people have reported that the index of XML-DEV found on the
page

http://www.lists.ic.ac.uk/hypermail/xml-dev/

is broken. Whilst we endeavour to get this fixed, please note that
another index of the forum is at http://www.xml-cml.org/search.html

Also, I have created a "sherlock" plug-in for this latter search at
http://www.ch.ic.ac.uk/chemime/chemdig/xmldev.src.hqx

On this latter theme, can anyone remind me whether a "channel" of
sites with indexed content has been created by anyone?
That is, it would be useful to search the dozen or so sites related to
XML in parallel using a "channel". The above plug-in
would then be one component of such a channel.

The downside is that the above plug-in is not cross-platform, i.e.
it only works with MacOS. Cross-platform suggestions for the
above (based on XML, which is an obvious way of doing it)
are most welcome.

Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7  2AY, UK. 
http://www.ch.ic.ac.uk/rzepa/



From fujisawa at the.canon.co.jp  Thu Dec  9 12:06:43 1999
From: fujisawa at the.canon.co.jp (Jun Fujisawa)
Date: Mon Jun  7 17:18:28 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <14407.1389.659881.147338@localhost.localdomain>
References: <3.0.32.19991202141224.0148fc60@pop.intergate.ca>
 <3.0.32.19991202141224.0148fc60@pop.intergate.ca>
Message-ID: <v04010101b47548c1aab8@the.canon.co.jp>

At 6:49 PM -0500 99.12.2, David Megginson wrote:
>  > At 04:27 PM 12/2/99 -0500, David Megginson wrote:
>  > Good idea, one question.  Any way to do C at the same time? -Tim
>
> Sure -- is there a strong need for a common C interface, though?  We
> already have Expat's C interface, and I don't know of anyone else in
> that space yet.

Gnome libxml and Oracle XML Parser for C do have SAX
interfaces in C.

<http://xmlsoft.org/>
<http://technet.oracle.com/tech/xml/parser_c/>

Another interesting work is the Simple API for CSS (SAC).
The SAC interface is defined both in Java and C.

<http://www.w3.org/Style/CSS/SAC/>

I think the combination of SAX and SAC might be very
attractive (especially in a C binding) when developing
XML software for resource-constrained environments,
such as XHTML Basic user agents.

--
Jun Fujisawa
<mailto:fujisawa@the.canon.co.jp>



From donpark at docuverse.com  Thu Dec  9 12:46:56 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <0ab601bf420b$71f94be0$0500a8c0@ned>
Message-ID: <001201bf4243$8f7959c0$099918d1@docuverse1>

Tim Bray wrote:
>But my big problem is with the idea that individual resources 
>ought to embed robot-steering information. It just feels like
>the wrong level of granularity.  Either this ought to be done
>externally in something like robots.txt but smarter, at the
>webmaster/administrator level, or, with a namespaced vocabulary
>at the individual element level.

You are assuming that these resources exists somewhere
where an external resource like robots.txt can coexist
relative to the target resources.

I would much more prefer an arrangement where I have
the option of embedding, linking, or sequencing.  By
sequencing, I mean document transmission order specifies
the relationships between documents.  If only a single
one-way communication channel is available, then meta-info
document can be sent just ahead of the target document.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From david at megginson.com  Thu Dec  9 13:33:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:28 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: roddey@us.ibm.com's message of "Tue, 7 Dec 1999 13:40:43 -0700"
References: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com>
Message-ID: <m34sdsck90.fsf@localhost.localdomain>

roddey@us.ibm.com writes:

> 11) The class names (since we can't afford to use C++ namespaces) should be
> expanded to include a SAX prefix to avoid clashes. So SAXParser and
> SAXLocator and SAXAttributeList and so on.

Is it true that C++ namespaces are still a problem on any platform?  I
know that they actually do work under Windows, and the newer EGCS/GCC
have supported them for a while for all *nix variants (including
Linux) -- is it the Mac that doesn't have a proper C++ compiler yet?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From sdr at camsoft.com  Thu Dec  9 14:02:26 1999
From: sdr at camsoft.com (Stewart Rubenstein)
Date: Mon Jun  7 17:18:28 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <m34sdsck90.fsf@localhost.localdomain>
Message-ID: <003101bf424e$9ed143a0$a66d70c6@camsoft.com>

David Megginson writes:
> is it the Mac that doesn't have a proper C++ compiler yet?

No.  The Mac has a great C++ compiler from Metrowerks (now a subsidiary of
Motorola).

-Stew




From Curt.Arnold at hyprotech.com  Thu Dec  9 15:56:29 1999
From: Curt.Arnold at hyprotech.com (Arnold, Curt)
Date: Mon Jun  7 17:18:28 2004
Subject: nestable C/C++ parser
Message-ID: <61DAD58E8F4ED211AC8400A0C9B46873415572@THOR>

I think your original request got lost in a side track.  It is very possible
to do what you want with expat.  The trick is the use of XML_SetUserData
and the userdata argument.  Basically, you create a base class with virtual
methods for the behavior you want to change (typically, StartElement and
EndElement), and a derived class for each different behavior that you want.
Call XML_SetUserData with the initial handler object.
In your StartElement callback, cast the userdata argument to a pointer to
your base class and call its startElement virtual method.  If you want to
change the handler, make another call to XML_SetUserData.
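
The same pattern can be sketched in Python, whose standard xml.parsers.expat module wraps expat. This is an analogue, not the C API: the Python binding has no XML_SetUserData, so a small Context object (invented here) stands in for the user data, and assigning ctx.handler plays the part of calling XML_SetUserData again. The class names and the "table" element that triggers the switch are made up for the example.

```python
import xml.parsers.expat


class Handler:
    """Base class: the behaviour that derived handlers override."""

    def start_element(self, ctx, name, attrs):
        pass

    def end_element(self, ctx, name):
        pass


class Context:
    """Stands in for expat's user data: it holds the current handler."""

    def __init__(self, initial):
        self.handler = initial
        self.log = []


class TableHandler(Handler):
    """Takes over inside <table> and hands control back when it closes."""

    def __init__(self, parent):
        self.parent = parent
        self.depth = 0  # open elements below the one that switched to us

    def start_element(self, ctx, name, attrs):
        self.depth += 1
        ctx.log.append(("table", name))

    def end_element(self, ctx, name):
        if self.depth == 0:
            ctx.handler = self.parent  # restore the previous handler
        else:
            self.depth -= 1


class TopLevel(Handler):
    def start_element(self, ctx, name, attrs):
        ctx.log.append(("top", name))
        if name == "table":
            ctx.handler = TableHandler(parent=self)  # switch behaviour


def parse(xml_text):
    ctx = Context(TopLevel())
    parser = xml.parsers.expat.ParserCreate()
    # The registered callbacks never change; they dispatch to whichever
    # handler object is current, as the C callback does via user data.
    parser.StartElementHandler = (
        lambda name, attrs: ctx.handler.start_element(ctx, name, attrs))
    parser.EndElementHandler = (
        lambda name: ctx.handler.end_element(ctx, name))
    parser.Parse(xml_text, True)
    return ctx.log
```

Keeping a reference to the parent handler is one way to get nesting: each handler restores its predecessor when its own element closes.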



From sb at metis.no  Thu Dec  9 16:12:56 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:28 2004
Subject: SAX and <!DOCTYPE>
In-Reply-To: Lars Marius Garshol's message of "07 Dec 1999 10:46:35 +0100"
References: <whln77f8hq.fsf@viffer.oslo.metis.no> <m3r9gzdqx0.fsf@ifi.uio.no>
Message-ID: <whk8mort34.fsf@viffer.oslo.metis.no>

>>>>> Lars Marius Garshol <larsga@garshol.priv.no>:

> It depends on the situation. In the XSA client, which needs to
> accept both XSA and OSD documents, but can't tell them apart before
> parsing begins, uses a DispatchingDocHandler, which has a hash of
> DocumentHandlers keyed on the name of the document element. In this
> very restricted case that worked just fine.

> In other cases one might perhaps key on the namespace of the
> document element, and with SAX 2 one could use the public identifier
> of the DOCTYPE declaration.

I thought of something like this, but I couldn't decide on whether to
use a DocumentHandler or use a buffering handler that would just
buffer up the text and try matching in the buffered text until it
found something to dispatch on, sending all the buffered text as the
initial input to the XML parser.

It looks like dispatching from a DocumentHandler is the best idea,
but then I need to be able to queue up SAX DocumentHandler events to
send to the actual DocumentHandler when I start it.

Hm... maybe a clone() function and a virtual destructor are in order
for the C++ AttributeList class...?
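
For what it is worth, the queue-then-dispatch idea can be sketched in a few lines of Python with the standard xml.sax module. This is an illustration of the approach discussed above, not the XSA client's actual code: DispatchingHandler, Recorder and dispatch are invented names, and the "xsa"/"osd" keys in the usage note are placeholders for whatever the real document element names are.

```python
import xml.sax


class Recorder(xml.sax.ContentHandler):
    """Stand-in for a real per-format handler; it logs what it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startDocument(self):
        self.events.append("startDocument")

    def endDocument(self):
        self.events.append("endDocument")

    def processingInstruction(self, target, data):
        self.events.append("pi:" + target)

    def startElement(self, name, attrs):
        self.events.append("start:" + name)

    def endElement(self, name):
        self.events.append("end:" + name)


class DispatchingHandler(xml.sax.ContentHandler):
    """Chooses the real handler from the name of the document element."""

    def __init__(self, handlers):
        super().__init__()
        self.handlers = handlers  # document element name -> handler
        self.target = None        # chosen when the first element arrives
        self.queued = []          # (method name, args) seen before that

    def _forward(self, method, *args):
        if self.target is None:
            self.queued.append((method, args))
        else:
            getattr(self.target, method)(*args)

    def startDocument(self):
        self._forward("startDocument")

    def processingInstruction(self, target, data):
        self._forward("processingInstruction", target, data)

    def startElement(self, name, attrs):
        if self.target is None:
            if name not in self.handlers:
                raise xml.sax.SAXException("no handler for <%s>" % name)
            self.target = self.handlers[name]
            # Replay everything that arrived before the decision was made.
            for method, args in self.queued:
                getattr(self.target, method)(*args)
            self.queued = []
        self.target.startElement(name, attrs)

    def endElement(self, name):
        self._forward("endElement", name)

    def characters(self, content):
        self._forward("characters", content)

    def endDocument(self):
        self._forward("endDocument")


def dispatch(xml_bytes, handlers):
    xml.sax.parseString(xml_bytes, DispatchingHandler(handlers))
```

Calling dispatch(data, {"xsa": Recorder(), "osd": Recorder()}) sends every event, including those queued before the document element, to whichever handler matches. Because the decision is made at the first startElement, no attribute lists ever sit in the queue here; a design that dispatches later would have to copy them, which is where a clone() would come in.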



From xml at waena.edu  Thu Dec  9 16:28:43 1999
From: xml at waena.edu (xml)
Date: Mon Jun  7 17:18:28 2004
Subject: NDATA,XPointer, XLink Confusion
Message-ID: <000701bf4262$603864c0$1d5afea9@adtech.internet.ibm.com>

Hi all,
This is a newbie-ish question.

My servlet accepts XML files which have in them CDATA of base64 encoded
binary data (images, sounds, movies, etc).

I take that CDATA, extract it and save it as a separate file in the
filesystem, but now I need to add an NDATA statement that points to that
file.  Perhaps I should be using XPointers or XLinks?  In any case, the
element that holds the CDATA, called <CONTENT>, is of CDATA type.  I can
change that to NDATA and plug in the reference, but do I have to declare the
actual NDATA in the DTD?  I can't do this (one DTD for many files, all of
which contain different binary data).  If I use XPointers or XLinks, do XML
parsers automatically insert the binary data they point to?  If so, why
would anyone use NDATA to point to external binary files?
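
For readers following along, the extract-and-save step described above might look like the Python sketch below. It does not settle the NDATA versus XLink question: it leaves a plain attribute reference behind, and that attribute name ("href"), the output file naming, and the function name externalize_content are all assumptions made for the example. Only the <CONTENT> element name comes from the message.

```python
import base64
import os
import xml.etree.ElementTree as ET


def externalize_content(xml_text, out_dir):
    """Move base64 payloads out of <CONTENT> elements into separate files.

    Returns the rewritten XML and the list of files written. Each emptied
    element gets an (assumed) href attribute naming its file.
    """
    root = ET.fromstring(xml_text)
    written = []
    for index, element in enumerate(root.iter("CONTENT")):
        # CDATA sections reach us as ordinary text; b64decode ignores
        # the line breaks that usually wrap base64 data.
        payload = base64.b64decode(element.text or "")
        path = os.path.join(out_dir, "content-%d.bin" % index)
        with open(path, "wb") as out:
            out.write(payload)
        element.text = None
        element.set("href", os.path.basename(path))
        written.append(path)
    return ET.tostring(root, encoding="unicode"), written
```

An application reading the rewritten document still has to fetch the file itself; nothing in this sketch makes a parser do so.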

Thanks




From wunder at infoseek.com  Thu Dec  9 17:48:32 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <0ab601bf420b$71f94be0$0500a8c0@ned>
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
 <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
Message-ID: <3.0.5.32.19991209094748.00aad3b0@corp.infoseek.com>

At 04:54 PM 12/8/99 -0800, tbray@textuality.com wrote:
>
>But my big problem is with the idea that individual resources ought 
>to embed robot-steering information. It just feels like the wrong 
>level of granularity.

For really picky indexing and searching, it is wrong. 
Structural markup opens up some really nice possibilities.
An indexer might weight the bibliography less and the
abstract more, for example.

But that sort of tweakiness changes for each search engine.
So I'd implement that as a DTD-specific configuration in
each engine, rather than trying to add processor-specific
markup to each document. In fact, I already implemented it
that way. Use the structure, Luke.
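
As a toy illustration of what such an engine-side configuration could look like (this is a sketch, not Infoseek's implementation), the Python below keeps the weights in a table outside the documents. The weight values, the element names used as keys, and the function name weighted_term_score are assumptions chosen only to mirror the abstract/bibliography example.

```python
import xml.etree.ElementTree as ET

# Engine-side, per-DTD configuration; the numbers are arbitrary examples.
WEIGHTS = {"abstract": 3.0, "bibliography": 0.5}
DEFAULT_WEIGHT = 1.0


def weighted_term_score(xml_text, term):
    """Sum occurrences of `term`, scaled by the weight of the enclosing element.

    An element without its own entry inherits the weight of its parent, so
    everything inside <bibliography> counts for less.
    """
    term = term.lower()
    score = 0.0

    def count(text):
        return (text or "").lower().split().count(term)

    def walk(element, inherited):
        nonlocal score
        weight = WEIGHTS.get(element.tag, inherited)
        score += weight * count(element.text)
        for child in element:
            walk(child, weight)
            # Text after a child still belongs to this element.
            score += weight * count(child.tail)

    walk(ET.fromstring(xml_text), DEFAULT_WEIGHT)
    return score
```

The documents carry no indexing markup at all; changing the ranking means editing the table, not the content.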

On the plus side, XML tends to be content-rich, without
navbars and decoration. This means that you get better quality
results without resorting to tweaks. For example, you can 
actually search for "Home", "Copyright", or "Help" and get 
relevant results.

>...  The PI has the characteristic that it *has* to be in
>the document and can modify *only* the whole document.  Also I 
>question the ability of authors to do the right thing with this 
>kind of a macro-level control.  Also I question the ability of robot 
>authors to do the right thing at the individual document level.

I'm willing to trust the authors and webmasters. There are a
lot of professionals out there. As for robot authors, if the
robots PI semantics are the same as the HTML robots meta tag
semantics, it should be pretty easy to get right. If they are
different, all bets are off.

>Anyhow, is there enough XML on the web to make this interesting?  
>Serious question, I don't know the answer. -T.

We're seeing XML-backed websites where they want to index
the XML, but serve URLs pointing to the formatted HTML.

wunder
--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946



From wunder at infoseek.com  Thu Dec  9 17:54:46 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <00cd01bf4213$c9c4e840$2cf96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <3.0.5.32.19991209095252.00b32350@corp.infoseek.com>

At 03:05 PM 12/9/99 +0800, Rick Jelliffe wrote:
>
>But this is not to allow that PIs suck in the first place.
>
>Actually, to use the C++, I think PIs correspond to pragmas more than
>anything.

Exactly. Obviously, we need to add "#notation" to the preprocessor.

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946


From dwshin at nlm.nih.gov  Thu Dec  9 18:59:32 1999
From: dwshin at nlm.nih.gov (Dongwook Shin)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
	 <3.0.5.32.19991208133726.00b28610@corp.infoseek.com> <3.0.5.32.19991209094748.00aad3b0@corp.infoseek.com>
Message-ID: <384FF7F3.1E32EDB@nlm.nih.gov>


Walter Underwood wrote:

> At 04:54 PM 12/8/99 -0800, tbray@textuality.com wrote:
> >
> >But my big problem is with the idea that individual resources ought
> >to embed robot-steering information. It just feels like the wrong
> >level of granularity.
>
> For really picky indexing and searching, it is wrong.
> Structural markup opens up some really nice possibilities.
> An indexer might weight the bibliography less and the
> abstract more, for example.
>
> But that sort of tweakiness changes for each search engine.
> So I'd implement that as a DTD-specific configuration in
> each engine, rather than trying to add processor-specific
> markup to each document. In fact, I already implemented it
> that way. Use the structure, Luke.
>

If you look at XRS (XML Retrieval System), you will find that a user
can give one element a bigger weight than another. This
kind of weighting is more flexible than weighting done by the indexer.
Check XRS Web demonstration system:
http://dlb2.nlm.nih.gov/~dwshin/xrs.html

Dongwook
--
Dongwook Shin
Visiting Scholar
Lister Hill National Center for Biomedical Communications
National Library of Medicine,
8600 Rockville Pike Bethesda 20894, MD
E-mail: dwshin@nlm.nih.gov
Tel: (301) 435-3257
FAX: (301) 480-3035
URL: http://dlb2.nlm.nih.gov/~dwshin


From wunder at infoseek.com  Thu Dec  9 20:05:26 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <384FF7F3.1E32EDB@nlm.nih.gov>
References: <3.0.5.32.19991207095428.00bfd520@corp.infoseek.com>
 <3.0.5.32.19991208133726.00b28610@corp.infoseek.com>
 <3.0.5.32.19991209094748.00aad3b0@corp.infoseek.com>
Message-ID: <3.0.5.32.19991209120409.00b364b0@corp.infoseek.com>

At 01:41 PM 12/9/99 -0500, Dongwook Shin wrote:
>Walter Underwood wrote:
>> Structural markup opens up some really nice possibilities.
>> An indexer might weight the bibliography less and the
>> abstract more, for example.
>
>If you look at XRS (XML Retrieval System), you will find that a user
>can give one element a bigger weight than another. This
>kind of weighting is more flexible than weighting done by the indexer.
>Check XRS Web demonstration system:
>http://dlb2.nlm.nih.gov/~dwshin/xrs.html

I think you are suggesting that weighting and selection
should be done at query time instead of at index time.
That is a design tradeoff for the search engine. But the
detailed weighting and selection belong *somewhere* in
the search engine rather than in every single document.
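
To make the index-time versus query-time tradeoff concrete, here is a
minimal Python sketch. It is only an illustration under stated
assumptions: the sample document, the element names, the weight values
and the function names are all made up, and it does not describe how
any real engine (or XRS) works. Per-element term counts are kept in the
index, so the same score() call can apply either a fixed engine-side
configuration or weights supplied with a query.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample document; element names are assumptions.
DOC = """<article>
  <abstract>xml search engines</abstract>
  <body>search engines index documents</body>
  <bibliography>xml xml xml</bibliography>
</article>"""

# Hypothetical engine-side configuration for this document type:
# weight the abstract more and the bibliography less.
ENGINE_WEIGHTS = {"abstract": 3.0, "bibliography": 0.5}


def term_counts_by_element(xml_text):
    """Index step: count terms separately for each child of the root."""
    root = ET.fromstring(xml_text)
    return {
        child.tag: Counter((child.text or "").lower().split())
        for child in root
    }


def score(counts, term, weights, default=1.0):
    """Weighted term frequency; elements not listed get the default weight."""
    return sum(
        weights.get(tag, default) * counter[term]
        for tag, counter in counts.items()
    )


counts = term_counts_by_element(DOC)
print(score(counts, "xml", ENGINE_WEIGHTS))         # engine-chosen weights
print(score(counts, "xml", {"bibliography": 0.0}))  # weights given at query time
```

Either way the weights live in the engine or the query, and nothing has
to be added to the documents themselves.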

I can imagine a system where each document had indexing
hints scattered throughout the structure, but I can't
imagine anyone having the time or knowledge to do a good
job with all that markup. We have enough trouble getting
people to replace "Untitled Document" in the <title> element
in HTML.

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946


From costello at mitre.org  Thu Dec  9 20:59:29 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:28 2004
Subject: XSLT Question: Inserting a DOCTYPE decl
Message-ID: <38501883.479847CE@mitre.org>

Hi Folks,

I have a situation where I have many XML documents that do not contain a
DOCTYPE declaration, and would like to write a stylesheet that inserts a
declaration within the documents.  The interesting aspect of this
problem is that each XML document contains within it an element which
gives the name of the DTD file.  So, the declaration should use the
value of that element as the name for the DTD file.

For example, here's a sample XML document into which I need to insert a
DOCTYPE declaration:

<?xml version="1.0"?>
<Numbers>
        <DoctypeFile>Number.dtd</DoctypeFile>
        <Number>27</Number>
        <Number>34</Number>
        <Number>18</Number>
        <Number>67</Number>
        <Number>99</Number>
        <Number>16</Number>
</Numbers>

Note the DoctypeFile element, which indicates the name of the DTD file.

The stylesheet should insert the declaration, thus resulting in an XML
document as such:

<?xml version="1.0"?>
<!DOCTYPE Numbers SYSTEM "Number.dtd">
<Numbers>
        <DoctypeFile>Number.dtd</DoctypeFile>
        <Number>27</Number>
        <Number>34</Number>
        <Number>18</Number>
        <Number>67</Number>
        <Number>99</Number>
        <Number>16</Number>
</Numbers>

Here's the stylesheet that I wrote to do this task:

<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
 
    <xsl:variable name="doctype">
        <xsl:value-of select="//DoctypeFile"/>
    </xsl:variable>

    <xsl:output method="xml" doctype-system="string($doctype)"/>

    <xsl:template match="*|@*|comment()|
                         processing-instruction()|text()">
        <xsl:copy>
            <xsl:apply-templates select="*|@*|comment()|
                                     processing-instruction()|text()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet> 

A pretty simple stylesheet - create a variable which gets the value of
the DoctypeFile element, and instruct the xsl:output element to output a
DOCTYPE declaration, using the value of the variable as the name of the
DTD file, and then do a copy operation on the input XML document.

Here is the XML file that I get when this example is run through XT
(Lotus XSL gives the same results):

<?xml version="1.0"?>
<!DOCTYPE Numbers SYSTEM "string($doctype)">
<Numbers>
        <DoctypeFile>Number.dtd</DoctypeFile>
        <Number>27</Number>
        <Number>34</Number>
        <Number>18</Number>
        <Number>67</Number>
        <Number>99</Number>
        <Number>16</Number>
</Numbers>

Note that the XSL processor did not evaluate the expression that I used
in the xsl:output's doctype-system attribute.  Instead, it used the
expression literally.

Thus, here are my questions:

(1)  Is this a bug in XT and Lotus XSL?
(2)  I suspect it isn't a bug, in which case can someone think of
another way to solve this problem?

/Roger


From stele at fxtech.com  Thu Dec  9 21:26:45 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:28 2004
Subject: nestable C/C++ parser
References: <61DAD58E8F4ED211AC8400A0C9B46873415572@THOR>
Message-ID: <38501ED0.B8B4F9FD@fxtech.com>

> I think your original request got lost in a side track.  It is very possible
> to do what you want with expat.  The trick is the use of the XML_SetUserData
> and the userdata argument.  Basically, the trick is to create a base class
> that has methods that you want to change the behavior of (typically,
> StartElement and EndElement).  Create derived classes for each different
> behavior that you want.  Call XML_SetUserData to the initial handler object.
> In your StartElement callback, cast the userdata argument up to a pointer to
> your base class and call its startElement virtual method.  If you want to
> change the handler, make another call to XML_SetUserData.

The problem here is it requires 3 callbacks to parse a single element
and restore the state. I'd like a cleaner solution.
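
For readers following along, the handler-swapping idea in the quoted
advice can be sketched roughly as below. This is a Python analogue
using the standard xml.parsers.expat binding, not the C/C++ code under
discussion: a stack held in a closure stands in for the userdata
pointer, and the class and function names (Handler, NameCollector,
RootHandler, parse) are hypothetical. The dispatcher pushes a new
handler when the current one asks for it and pops it automatically when
the element that installed it closes.

```python
import xml.parsers.expat


class Handler:
    """Base class; subclasses override the behaviour they care about."""

    def start(self, name, attrs):
        # Return a Handler to take over the new element, or None to stay.
        return None

    def text(self, data):
        pass

    def finish(self):
        # Called once, when the element that installed this handler closes.
        pass


class NameCollector(Handler):
    """Hypothetical handler: gathers the character data of one element."""

    def __init__(self, sink):
        self.sink = sink
        self.parts = []

    def text(self, data):
        self.parts.append(data)  # expat may deliver text in several chunks

    def finish(self):
        self.sink.append("".join(self.parts))


class RootHandler(Handler):
    def __init__(self):
        self.names = []

    def start(self, name, attrs):
        if name == "name":
            return NameCollector(self.names)
        return None


def parse(xml_text, root):
    stack = [(root, 0)]  # (handler, element depth at which it was installed)
    depth = 0

    def on_start(name, attrs):
        nonlocal depth
        depth += 1
        child = stack[-1][0].start(name, attrs)
        if child is not None:
            stack.append((child, depth))

    def on_end(name):
        nonlocal depth
        handler, installed_at = stack[-1]
        if installed_at == depth:
            handler.finish()
            stack.pop()  # restore the previous handler
        depth -= 1

    def on_text(data):
        stack[-1][0].text(data)

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.CharacterDataHandler = on_text
    parser.Parse(xml_text, True)
    return root
```

With this shape the per-element handlers never restore state
themselves; whether that counts as cleaner is a matter of taste.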

--
Paul Miller - stele@fxtech.com


From Veeraraghavan.Srinivasan at iac.honeywell.com  Thu Dec  9 21:49:10 1999
From: Veeraraghavan.Srinivasan at iac.honeywell.com (Srinivasan, Veeraraghavan (AZ15))
Date: Mon Jun  7 17:18:28 2004
Subject: VBScript error
Message-ID: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com>

Hi all,
	I know this is off-topic. I am using Microsoft Script control to
execute VB and Java scripts. This is to provide programmatic interface to
execute scripts in application programs. 

	When I load the following script using the AddCode method (a method on
the script control) in Java (VJ++ environment), I get an error that says
"Expected end of statement". When I investigated the cause of the
error, I found that the line containing "CreateObject" is causing the
problem. When I removed the line, I did not get any errors/exceptions. Also,
I observed that when I use the CreateObject method inside a Subroutine I do not
get any errors/exceptions. Does anybody have any idea what I'm missing, or can
anyone point me to appropriate resources (is there a mailing list for the
Microsoft Script control?).

Environment : Windows NT 4.0 SP5, IE 5.0 , VJ++ 6.0

Code snippet:
function getNames(folder,subfolders)
			set fso = CreateObject("Scripting.FileSystemObject")
			set fld = fso.GetFolder(folder)

			i = 0
			if subfolders then
				ReDim filenames(fld.subfolders.count-1,1)
				for each f in fld.subfolders
					filenames(i,0) = f.name
					filenames(i,1) = f.datecreated
					i = i + 1
				next
			else
				ReDim filenames(fld.files.count-1,1)
				i = 0
				for each f in fld.files
					filenames(i,0) = f.name
					filenames(i,1) = f.size
					i = i + 1
				next
			end if
			getNames = filenames
end function


              Honeywell
Veeraraghavan Srinivasan
Senior Principal Engineer
Honeywell Hi-Spec Solutions
1280, Kemper Meadow Drive,
Cincinnati, OH 45240
Phone: (513) 595-8913


From AXu at epnet.com  Thu Dec  9 21:59:49 1999
From: AXu at epnet.com (Amanda Xu)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
Message-ID: <E11wBbG-0005Yl-00@romeo.ic.ac.uk>

Do you expect the end-user to understand 
term weighting techniques as well as the 
structure of an XML document?

Elephant


From KenNorth at email.msn.com  Thu Dec  9 22:11:00 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun  7 17:18:28 2004
Subject: VBScript error
References: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com>
Message-ID: <000601bf4291$ff60fa60$0b00a8c0@grissom>

From: Srinivasan, Veeraraghavan (AZ15)

> I know this is off-topic. 
> Is there a mailing list for Microsoft Script control?.

For a discussion of scripting issues, try these newsgroups:

microsoft.public.inetexplorer.ie4.scripting
microsoft.public.inetexplorer.scripting 

Server: msnews.microsoft.com


From dwshin at nlm.nih.gov  Thu Dec  9 22:17:12 1999
From: dwshin at nlm.nih.gov (Dongwook Shin)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
References: <199912092159.QAA22301@nes.nlm.nih.gov>
Message-ID: <38502645.F62127BD@nlm.nih.gov>

Amanda Xu wrote:

> Do you expect the end-user to understand
> term weighting techniques as well as the
> structure of an XML document?
>
> Elephant
>

It depends. If a user is somewhat aware of the
document structure, then he may be able to assign
the weights. If not, he ends up depending entirely on
the search strategy of the search system.

So, the term weighting on the fly is an option that
an expert is able to use for better precision.
At the same time, it does not make it harder for
novice users.

Dongwook

--
Dongwook Shin
Visiting Scholar
Lister Hill National Center for Biomedical Communications
National Library of Medicine,
8600 Rockville Pike Bethesda 20894, MD
E-mail: dwshin@nlm.nih.gov
Tel: (301) 435-3257
FAX: (301) 480-3035
URL: http://dlb2.nlm.nih.gov/~dwshin


From Daniel.Brickley at bristol.ac.uk  Thu Dec  9 22:25:43 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:28 2004
Subject: XML parser in Javascript for RDF app? (feasible?)
In-Reply-To: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com>
Message-ID: <Pine.GHP.4.21.9912092200150.29718-100000@mail.ilrt.bris.ac.uk>


XMLists,

Some time ago I remember a posting about there being an XML parser
(presumably an SMLish subset?) available that was written in
Javascript. Perhaps I hallucinated this! If not, I'd love to know where
this work is up to, whether available opensource etc.

Context:
We have some RDF query / logic demos now in Javascript, that work in
many pre-XML Javascript environments. The current implementation uses a
simple text representation of RDF data graphs instead of an XML
serialisation - I'm thinking it *might* be possible to actually parse
serialised graphs from XML in Javascript, using one of the various XML
graph serialisation syntaxes (RDF, BizTalk etc etc). Hence the interest in
an XML parser in Javascript...

I'm confident we can show simple RDF query and inference stuff
clientside in Javascript, eg. for decision support apps. What I'm
worried about is syntax, ie. prospects for parsing data graphs from XML
clientside in 100% Javascript.

(For the curious, this is based on Jan Grant's cute Javascript/Prolog
hack, http://rdf.desire.org/~cmjg/test/prolog.html -- I just glued it
together for rdf and made up the examples.)

There's an installation running as a part of a discussion doc I put
together as background context on RDF's origins... see:

js rdf query demo:  	http://www.w3.org/1999/11/11-WWWProposal/rdfqdemo.html
which is part of:	http://www.w3.org/1999/11/11-WWWProposal/

thanks for any tips on the XML/js front,

cheers,

Dan

ps. bug reports offlist please! this stuff doesn't run everywhere yet...

pps. non-XML data fragment follows. Clearly I'm using the wrong kinds of
brackets; suggestions welcomed... curly braces indicate a URI in the
current hack..., ie we have: 
	{relation-type-URI} ({objectURI}, {value} ).

We can parse this stuff in Javascript. I'd rather use XML instead but am
not sure if this is feasible... (bait for the SMLers... ;-)


Excerpts from: 	http://www.w3.org/1999/11/11-WWWProposal/rdfqdemo.html

{http://www.w3.org/1999/02/22-rdf-syntax-ns#type}
({http://www.w3.org/History/1989/proposal.html}
,{http://www.w3.org/1999/11/11-WWWProposal/vocab.rdf#Document}).

{http://purl.org/dc/elements/1.0/Title}
({http://www.w3.org/History/1989/proposal.html} , 
"Information Management: A Proposal").

{http://www.w3.org/1999/02/22-rdf-syntax-ns#type}
({http://www.w3.org/People/all#timbl%40w3.org}
,{http://www.w3.org/1999/11/11-WWWProposal/vocab.rdf#Person}).

{http://purl.org/dc/elements/1.0/Creator}
({http://www.w3.org/History/1989/proposal.html}
,{http://www.w3.org/People/all#timbl%40w3.org}).

{http://purl.org/dc/elements/1.0/Description}
({http://www.w3.org/History/1989/proposal.html} , 
"This proposal concerns the [...etc] ").
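
As a rough illustration of how little machinery the brace notation
above needs, here is a minimal Python sketch of a parser for it. The
regular expression and the function name parse_triples are assumptions
made for this example; this is not the Javascript code used in the
demo, and it only handles the simple forms shown in the excerpt (a URI
in curly braces or a double-quoted string as the value).

```python
import re

# {relation} ({subject}, {uri-value} | "literal value").
TRIPLE = re.compile(
    r'\{(?P<relation>[^{}]*)\}\s*'
    r'\(\s*\{(?P<subject>[^{}]*)\}\s*,\s*'
    r'(?:\{(?P<uri>[^{}]*)\}|"(?P<literal>[^"]*)")\s*\)\s*\.'
)


def parse_triples(text):
    """Return a list of (relation, subject, (kind, value)) tuples."""
    triples = []
    for match in TRIPLE.finditer(text):
        if match.group("uri") is not None:
            value = ("uri", match.group("uri"))
        else:
            value = ("literal", match.group("literal"))
        triples.append((match.group("relation"), match.group("subject"), value))
    return triples


sample = '''
{http://purl.org/dc/elements/1.0/Title}
({http://www.w3.org/History/1989/proposal.html} ,
"Information Management: A Proposal").

{http://purl.org/dc/elements/1.0/Creator}
({http://www.w3.org/History/1989/proposal.html}
,{http://www.w3.org/People/all#timbl%40w3.org}).
'''

for relation, subject, value in parse_triples(sample):
    print(relation, subject, value)
```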


From wunder at infoseek.com  Thu Dec  9 22:47:53 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <199912092159.NAA18412@postman.infoseek.com>
Message-ID: <3.0.5.32.19991209144458.00c168c0@corp.infoseek.com>

At 04:51 PM 12/9/99 -0500, Amanda Xu wrote:
>Do you expect the end-user to understand 
>term weighting techniques as well as the 
>structure of an XML document?
>
>Elephant

Certainly not. We're lucky if search engine users
type two-word queries.

wunder

--
Walter R. Underwood
wunder@infoseek.com
wunder@best.com (home)
http://software.infoseek.com/
http://www.best.com/~wunder/
1-408-543-6946


From macherius at darmstadt.gmd.de  Thu Dec  9 22:53:44 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:18:28 2004
Subject: XML parser in Javascript for RDF app? (feasible?)
In-Reply-To: <Pine.GHP.4.21.9912092200150.29718-100000@mail.ilrt.bris.ac.uk>
References: <5D0478DD31B2D21194E90090273C41D3AD176E@az15m06.iac.honeywell.com>
Message-ID: <199912092253.XAA13844@sonne.darmstadt.gmd.de>

Dan Brickley <Daniel.Brickley@bristol.ac.uk> wrote at 9 Dec 99, 22:24:

> Some time ago I remember a posting about there being an XML parser
> (presumably an SMLish subset?) available that was written in
> Javascript. Perhaps I hallucinated this! If not, I'd love to know where
> this work is up to, whether available opensource etc.

http://www.jeremie.com/Dev/XML/index.jer

	++im
--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)


From Daniel.Brickley at bristol.ac.uk  Thu Dec  9 23:27:36 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:28 2004
Subject: XML parser in Javascript for RDF app? (feasible?)
In-Reply-To: <199912092253.XAA13844@sonne.darmstadt.gmd.de>
Message-ID: <Pine.GHP.4.21.9912092315530.29718-100000@mail.ilrt.bris.ac.uk>

On Thu, 9 Dec 1999, Ingo Macherius wrote:

> Dan Brickley <Daniel.Brickley@bristol.ac.uk> wrote at 9 Dec 99, 22:24:
> 
> > Some time ago I remember a posting about there being an XML parser
> > (presumably an SMLish subset?) available that was written in
> > Javascript. Perhaps I hallucinated this! If not, I'd love to know where
> > this work is up to, whether available opensource etc.
> 
> http://www.jeremie.com/Dev/XML/index.jer

Aha, thanks. Claims to be fairly complete, and doesn't barf on my sample
XML/RDF files. Am now wondering whether the companion XSL engine
http://www.jeremie.com/Dev/XSL/ (if updated) would work as a way of
transforming (a certain style of) xml serialised graph back into
rdfesque queryable structures. I'll have a play around...
 
cheers,

Dan


From donpark at docuverse.com  Thu Dec  9 23:54:39 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:28 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991209144458.00c168c0@corp.infoseek.com>
Message-ID: <000301bf42a0$de2301e0$d1940e18@smateo1.sfba.home.com>

>Certainly not. We're lucky if search engine users
>type two-word queries.

Have you tried putting up two input boxes instead
of one long one?  A bit of GUI trickery can sometimes
do wonders.

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com


From costello at mitre.org  Fri Dec 10 12:36:40 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:28 2004
Subject: Answer: XSLT Question: Inserting a DOCTYPE decl
References: <93CB64052F94D211BC5D0010A800133101FDE876@wwmess3.bra01.icl.co.uk>
Message-ID: <3850F405.7B491F7D@mitre.org>

Hi Folks,

The solution to the problem that I posed of writing a stylesheet which
inserts a DOCTYPE declaration into the input XML document, where the DTD
file is found as the text value of an element in the input XML file, is
listed below.  Thanks to all those who made suggestions, particularly
Michael Kay who created the below solution.

                     DoctypeInserter.xsl
------------------------------------------------------------------
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="1.0">
 
    <xsl:variable name="doctype">
        <xsl:value-of select="//DoctypeFile"/>
    </xsl:variable>

    <xsl:variable name="rootnode">
        <xsl:value-of select="name(*)"/>
    </xsl:variable>

    <xsl:template match="/">
        <xsl:variable name="quote">"</xsl:variable>  
        <xsl:value-of disable-output-escaping="yes" 
             select="concat('&lt;!DOCTYPE ', $rootnode, ' SYSTEM ', 
                            $quote, $doctype, $quote, '&gt;')"/>
        <xsl:apply-templates/>
    </xsl:template>

    <xsl:template match="*|@*|comment()|
                         processing-instruction()|text()">
        <xsl:copy>
            <xsl:apply-templates select="*|@*|comment()| 
                                    processing-instruction()|text()"/>
        </xsl:copy>
    </xsl:template>

</xsl:stylesheet>
------------------------------------------------------------------

As you can see, the trick was not to use xsl:output at all. Instead, the
DOCTYPE declaration must be built "by hand".  After the DTD file is
found (and stored in doctype) the root node is found (and stored in
rootnode).  In the template rule for the document, before any of the XML
elements are copied, the DOCTYPE declaration is constructed by
concatenating together the various components.
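
The same build-it-by-hand idea can be sketched outside XSLT. The Python
below is a minimal illustration, not a replacement for the stylesheet:
the function name insert_doctype is made up, it uses only the standard
xml.etree.ElementTree module, and (unlike the stylesheet) it drops any
comments or processing instructions, since ElementTree discards them
by default when parsing.

```python
import xml.etree.ElementTree as ET


def insert_doctype(xml_text):
    """Return the document with a DOCTYPE built from its DoctypeFile element."""
    root = ET.fromstring(xml_text)
    dtd_file = root.findtext(".//DoctypeFile")
    if dtd_file is None:
        raise ValueError("no DoctypeFile element found")
    # Concatenate the pieces by hand, as the stylesheet does with concat().
    doctype = '<!DOCTYPE {} SYSTEM "{}">'.format(root.tag, dtd_file.strip())
    body = ET.tostring(root, encoding="unicode")
    return '<?xml version="1.0"?>\n' + doctype + "\n" + body


sample = """<?xml version="1.0"?>
<Numbers>
        <DoctypeFile>Number.dtd</DoctypeFile>
        <Number>27</Number>
        <Number>34</Number>
</Numbers>"""

print(insert_doctype(sample))
```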

With this XML document as input:

<?xml version="1.0"?>
<Numbers>
        <DoctypeFile>Number.dtd</DoctypeFile>
        <Number>27</Number>
        <Number>34</Number>
        <Number>18</Number>
        <Number>67</Number>
        <Number>99</Number>
        <Number>16</Number>
</Numbers>

The result of running it through the above stylesheet is:

<?xml version="1.0"?>
<!DOCTYPE Numbers SYSTEM "Number.dtd">
<Numbers>
        <DoctypeFile>Number.dtd</DoctypeFile>
        <Number>27</Number>
        <Number>34</Number>
        <Number>18</Number>
        <Number>67</Number>
        <Number>99</Number>
        <Number>16</Number>
</Numbers>

Which is, of course, exactly what I wanted!  

Thanks again.  /Roger


From allan.kelly at reuters.com  Fri Dec 10 13:38:27 1999
From: allan.kelly at reuters.com (Allan Kelly 59048)
Date: Mon Jun  7 17:18:28 2004
Subject: nestable C/C++ parser
In-Reply-To: <38501ED0.B8B4F9FD@fxtech.com>
Message-ID: <E11wQFb-0001Mk-00@romeo.ic.ac.uk>

I find this debate quite interesting and would like to share my experience;
maybe this should be in an "object serialisation" thread, but here goes....

I've got some code (sorry, company's not mine, can't publish) which started
life as a generic container. When we came to serialise the container, XML was the
obvious candidate because, as has been said before, why invent a new format?

Anyway, what we've currently got I refer to as XML-like because
- I'm not confident of my DTD writing
- Each serialisation forms a message; the message is really an XML fragment, so
  it is missing the prolog and epilog
- root elements must have an attribute "name"

I have a pluggable factory class.  Each object knows how to stream itself in and
out.  The input stream is piped into the factory; as long as the
messages/XML-fragments are for classes which have been plugged into the factory,
everything is fine: the factory produces a container which holds items from the
stream and can be accessed using operator[].

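A minimal sketch of that arrangement, in Python rather than the original
C++ (which is not published). The names Factory, register, load and the
Quote class are hypothetical; the only details taken from the description
are that each root element carries a "name" attribute, that classes are
plugged into the factory, and that the result can be indexed.

```python
import xml.etree.ElementTree as ET


class Quote:
    """Example pluggable class that knows how to stream itself in and out."""
    tag = "quote"

    def __init__(self, name, price):
        self.name, self.price = name, price

    @classmethod
    def from_xml(cls, elem):
        return cls(elem.get("name"), float(elem.findtext("price")))

    def to_xml(self):
        return f'<quote name="{self.name}"><price>{self.price}</price></quote>'


class Factory:
    def __init__(self):
        self._classes = {}

    def register(self, cls):
        # Plug a class in, keyed by the root tag it can read.
        self._classes[cls.tag] = cls

    def load(self, fragments):
        # Build a container indexed by each root element's "name" attribute.
        container = {}
        for text in fragments:
            elem = ET.fromstring(text)
            cls = self._classes[elem.tag]  # KeyError if the class is not plugged in
            container[elem.get("name")] = cls.from_xml(elem)
        return container
```

Each fragment is parsed on its own, so the stream as a whole never has to
be a single well-formed document.
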
This works quite well for passing messages between co-operating processes.

The code is actually quite small and efficient.  Which makes me wonder why I
need to bother with expat, SAX and DOM?  The short answer is I don't, because we
have a tailored solution to our problem.

Allan

>> I think your original request got lost in a side track.  It is very possible
>> to do what you want with expat.  The trick is the use of the XML_SetUserData
>> and the userdata argument.  Basically, the trick is to create a base class
>> that has methods that you want to change the behavior of (typically,
>> StartElement and EndElement).  Create derived classes for each different
>> behavior that you want.  Call XML_SetUserData to the initial handler object.
>> In your StartElement callback, cast the userdata argument up to a pointer to
>> your base class and call its startElement virtual method.  If you want to
>> change the handler, make another call to XML_SetUserData.
>
>The problem here is it requires 3 callbacks to parse a single element
>and restore the state. I'd like a cleaner solution.
>
>--
>Paul Miller - stele@fxtech.com
>

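The handler-switching trick quoted above can be sketched with Python's
binding to expat. This is an illustration under assumptions, not the code
discussed in the thread: the Python binding has no XML_SetUserData, so a
hypothetical Context object holding a stack of handlers plays the role of
the userdata pointer, and the Handler, TopHandler and ItemHandler classes
are invented for the example.

```python
import xml.parsers.expat


class Handler:
    """Base class; the default behaviour is to ignore every event."""
    def start_element(self, ctx, name, attrs):
        pass

    def end_element(self, ctx, name):
        pass


class ItemHandler(Handler):
    def start_element(self, ctx, name, attrs):
        ctx.seen.append(("item-child", name))

    def end_element(self, ctx, name):
        if name == "item":
            ctx.pop()  # restore the previous handler


class TopHandler(Handler):
    def start_element(self, ctx, name, attrs):
        ctx.seen.append(("top", name))
        if name == "item":
            ctx.push(ItemHandler())  # switch behaviour for the subtree


class Context:
    """Stands in for the userdata pointer: it knows the current handler."""
    def __init__(self):
        self.stack = [TopHandler()]
        self.seen = []

    def push(self, handler):
        self.stack.append(handler)

    def pop(self):
        self.stack.pop()


def parse(text):
    ctx = Context()
    p = xml.parsers.expat.ParserCreate()
    # The two callbacks only forward to whichever handler is current.
    p.StartElementHandler = lambda n, a: ctx.stack[-1].start_element(ctx, n, a)
    p.EndElementHandler = lambda n: ctx.stack[-1].end_element(ctx, n)
    p.Parse(text, True)
    return ctx.seen
```

The sketch does not handle an item nested inside another item; a depth
counter in ItemHandler would be needed for that.
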
-----------------------------------------------------------------
        Visit our Internet site at http://www.reuters.com

Any views expressed in this message are those of  the  individual
sender,  except  where  the sender specifically states them to be
the views of Reuters Ltd.



From uche.ogbuji at fourthought.com  Fri Dec 10 14:39:29 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:28 2004
Subject: Appending to an XML document 
In-Reply-To: Your message of "Wed, 08 Dec 1999 00:40:54 +0100."
             <199912072337.AAA10151@sonne.darmstadt.gmd.de> 
Message-ID: <199912101439.HAA03296@localhost.localdomain>

> currently I'm busy designing an XML based log format myself. In 
> contrast to "classic line based logging", appending indeed is 
> prohibitively costly in XML. Thus I decided not to log into a 
> well-formed XML document, but to stick with a sequence of <Event> type 
> doc-fragments, just being well-formed per event.
> Of course one can not parse the result immediately, but at the time 
> of log analysis (or whatever you do with your event data), it's 
> trivial to pre- and append the necessary tags to enclose the doc-
> fragments.
> 
> XML was just not designed to fit the demands of concatenation. But 
> I found structuring single events in a "semi-structured" 
> (read: well-formed) way valuable enough to choose XML. The "missing 
> enclosing tag" is not really a serious problem if you delay its 
> insertion until REALLY necessary.

I don't really see this as a problem with XML.  Why must you consider your log 
a well-formed XML document?  If you instead treat it as a well-formed XML 
external parsed entity, then you are freed of the append problem, while being 
fully XML compliant.  And, of course, you already hit upon this solution 
yourself, by enclosing your log in a simple wrapper to create an XML document 
for processing.  Despite the recent controversy about EPEs, most XML tools 
should support them, and so you shouldn't even be too constrained in your XML 
tool-set.

But again, I don't see a problem with XML here.

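The approach described in the quoted message (append well-formed fragments,
add the enclosing tags only at analysis time) might look like the following
sketch. The function names and the <Log> wrapper element are assumptions
made for the example; only the <Event> element comes from the quoted text.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def append_event(path, message):
    # Appending is a plain file append; nothing already written is rewritten.
    with open(path, "a", encoding="utf-8") as f:
        f.write(f"<Event>{escape(message)}</Event>\n")


def read_events(path):
    # Enclose the fragments in a root element only when the log is analysed.
    with open(path, encoding="utf-8") as f:
        root = ET.fromstring("<Log>" + f.read() + "</Log>")
    return [e.text for e in root.iter("Event")]
```

Reading the whole file into memory is fine for a sketch; a large log would
call for incremental parsing instead.
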
-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From LWatanab at JetForm.com  Fri Dec 10 15:19:26 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
Message-ID: <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2>


The decision to make an XML document contain just a single element is the
root of the problem. Perhaps the specification for future versions could
accept multiple elements in a single document. This would make concatenation
simple and has worked well in the lisp world which operates on similar
structures (s-expressions) both for representation of program and data.

-Larry Watanabe

> -----Original Message-----
> From:	uche.ogbuji@fourthought.com [SMTP:uche.ogbuji@fourthought.com]
> Sent:	Friday, December 10, 1999 9:39 AM
> To:	Ingo Macherius
> Cc:	Ross Bleakney; xml-dev@ic.ac.uk
> Subject:	Re: Appending to an XML document 
> 
> > currently I'm busy designing an XML based log format myself. In 
> > contrast to "classic line based logging", appending indeed is 
> > prohibitively costly in XML. Thus I decided not to log into a 
> > well-formed XML document, but to stick with a sequence of <Event> type 
> > doc-fragments, just being well-formed per event.
> > Of course one can not parse the result immediately, but at the time 
> > of log analysis (or whatever you do with your event data), it's 
> > trivial to pre- and append the necessary tags to enclose the doc-
> > fragments.
> > 
> > XML was just not designed to fit the demands of concatenation. But 
> > I found structuring single events in a "semi-structured" 
> > (read: well-formed) way valuable enough to choose XML. The "missing 
> > enclosing tag" is not really a serious problem if you delay its 
> > insertion until REALLY necessary.
> 
> I don't really see this as a problem with XML.  Why must you consider your
> log a well-formed XML document?  If you instead treat it as a well-formed
> XML external parsed entity, then you are freed of the append problem, while
> being fully XML compliant.  And, of course, you already hit upon this
> solution yourself, by enclosing your log in a simple wrapper to create an
> XML document for processing.  Despite the recent controversy about EPEs,
> most XML tools should support them, and so you shouldn't even be too
> constrained in your XML tool-set.
> 
> But again, I don't see a problem with XML here.
> 
> -- 
> Uche Ogbuji
> FourThought LLC, IT Consultants
> uche.ogbuji@fourthought.com	(970)481-0805
> Software engineering, project management, Intranets and Extranets
> http://FourThought.com		http://OpenTechnology.org
> 
> 
> 



From uche.ogbuji at fourthought.com  Fri Dec 10 15:41:10 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: Your message of "Fri, 10 Dec 1999 10:16:19 EST."
             <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2> 
Message-ID: <199912101541.IAA03486@localhost.localdomain>

> The decision to make an XML document contain just a single element is the
> root of the problem. Perhaps the specification for future versions could
> accept multiple elements in a single document. This would make concatenation
> simple and has worked well in the lisp world which operates on similar
> structures (s-expressions) both for representation of program and data.

While this may be the root of other problems, and I do not claim to vouch for 
all such problems, it is _not_ the root of the particular problem in question. 
 I see no reason why the log file must be a well-formed XML document.  Can you 
tell me what is wrong with just treating it as an external parsed entity?


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From Jurg.Wullschleger at mb.luth.se  Fri Dec 10 15:44:10 1999
From: Jurg.Wullschleger at mb.luth.se (Jurg Wullschleger)
Date: Mon Jun  7 17:18:29 2004
Subject: a simpler document type definition language?
Message-ID: <1D1624C992C5D111B67C00A0C99695012B3737@mailserv.mb.luth.se>

hi everybody.

i like the idea of SML. but i think it is not that important for
"normal" programmers: if they don't like attributes, they just don't use them.

but a really important thing for every user of XML is how to specify your
file format. both DTDs and Schemas open up a lot of possibilities for
specifying your file format. but they are quite complicated. and it's not easy
to write a program that validates an xml document (i think).

so, what do you guys think about a simplified document type definition
language?

the simplest form i can think of would look something like this: (examples
in DTD syntax)
there are only 4 types of elements:

- empty elements
<!ELEMENT name1 EMPTY >

- elements that contain data

<!ELEMENT name2 (#PCDATA) >

- list elements

<!ELEMENT name3 (name1|name2|name3|name4)* >

- structural elements of a fixed length

<!ELEMENT name4 ((name1|name2),name3,name4,(name5|name6|name7)) >

maybe that's a bit too restrictive, but i think it is useful for a lot of
applications. and it is really easy to "validate" a document. If the user
only uses these constructs, he can be sure that the format can easily be
handled by a program. 

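To illustrate how little code such a restricted language needs, here is a
sketch of a validator for the four element kinds. It is not the program
offered for download below; the schema representation (a dict of tuples)
and the element names br, title, items and pair are assumptions made for
the example, and whitespace-only text is ignored for simplicity.

```python
import xml.etree.ElementTree as ET

# One rule per element name: empty, data, list (set of allowed children),
# or struct (one set of alternatives per fixed position).
SCHEMA = {
    "br": ("empty",),
    "title": ("data",),
    "items": ("list", {"br", "title"}),
    "pair": ("struct", [{"title"}, {"items"}]),
}


def validate(elem, schema):
    rule = schema.get(elem.tag)
    if rule is None:
        return False
    kind = rule[0]
    children = list(elem)
    has_text = bool((elem.text or "").strip()) or any(
        (c.tail or "").strip() for c in children
    )
    if kind == "empty":
        return not children and not has_text
    if kind == "data":
        return not children
    if has_text:
        return False  # list and struct elements hold elements only
    if kind == "list":
        ok = all(c.tag in rule[1] for c in children)
    elif kind == "struct":
        slots = rule[1]
        ok = len(children) == len(slots) and all(
            c.tag in allowed for c, allowed in zip(children, slots)
        )
    else:
        return False
    return ok and all(validate(c, schema) for c in children)
```

Because every rule is checked against the direct children only, validation
is a single recursive walk with no backtracking.
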
i defined a simple document definition language, based on the 4 basic
element types. and wrote a small xml editor that can edit xml files which
are defined in this language. at the moment, there are two formats defined:
one for the rules themselves, with a DTD export filter, and one for a subset
of the functionality of CSS, with a CSS export filter.

download the source at http://www.netmen.ch/wullschleger/xml/Simple.zip !

And let me know what you think.

Thanks.

Juerg Wullschleger

email: jurg@mb.luth.se



From donpark at docuverse.com  Fri Dec 10 15:50:23 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2>
Message-ID: <000e01bf4326$639cc560$d1940e18@smateo1.sfba.home.com>

IMHO, this is a parser implementation problem.
I do not know of a single XML parser that expects
more than one XML document in a file or a stream
input.

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From rgl at decisionsoft.com  Fri Dec 10 15:54:10 1999
From: rgl at decisionsoft.com (Richard Lanyon)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <199912101541.IAA03486@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912101555430.10391-100000@localhost.localdomain>


> > The decision to make an XML document contain just a single element is the
> > root of the problem. Perhaps the specification for future versions could
> > accept multiple elements in a single document.

I think the idea is that the single element /is/ the XML document.

-- 
Richard Lanyon (Software Engineer) |     "The medium is the message"
XML Script development,            |             - Marshall McLuhan
DecisionSoft Ltd.                  |




From LWatanab at JetForm.com  Fri Dec 10 16:22:37 1999
From: LWatanab at JetForm.com (Larry Watanabe)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
Message-ID: <111CF63B7D2ED211830000805F65A2FF018049A2@OTTMAIL2>


If a log file can't easily be represented as a well-formed XML document,
then that does indicate a problem with the spec. The XML spec does say that
it should be straightforwardly usable; I don't think external parsed
entities are a straightforward way of doing a simple operation such as
append.

For someone who has to write a log file now with the current spec, either of
the proposed solutions should be fine (not maintaining the logfile as a
well-formed XML document or using external parsed entities).

-Larry Watanabe

> -----Original Message-----
> From:	uche.ogbuji@fourthought.com [SMTP:uche.ogbuji@fourthought.com]
> Sent:	Friday, December 10, 1999 10:41 AM
> To:	Larry Watanabe
> Cc:	'uche.ogbuji@fourthought.com'; Ingo Macherius; Ross Bleakney;
> xml-dev@ic.ac.uk
> Subject:	Re: Appending to an XML document 
> 
> > The decision to make an XML document contain just a single element is the
> > root of the problem. Perhaps the specification for future versions could
> > accept multiple elements in a single document. This would make
> > concatenation simple and has worked well in the lisp world which operates
> > on similar structures (s-expressions) both for representation of program
> > and data.
> 
> While this may be the root of other problems, and I do not claim to vouch
> for all such problems, it is _not_ the root of the particular problem in
> question.  I see no reason why the log file must be a well-formed XML
> document.  Can you tell me what is wrong with just treating it as an
> external parsed entity?
> 



From macherius at darmstadt.gmd.de  Fri Dec 10 16:26:55 1999
From: macherius at darmstadt.gmd.de (Ingo Macherius)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <199912101541.IAA03486@localhost.localdomain>
References: Your message of "Fri, 10 Dec 1999 10:16:19 EST."             <111CF63B7D2ED211830000805F65A2FF018049A0@OTTMAIL2> 
Message-ID: <199912101626.RAA05881@sonne.darmstadt.gmd.de>

Uche,

using entities of any kind does not change the underlying data model.
It's syntactic sugar, nothing more. The root of the problem (and a
great help in other cases) is indeed the fact that any XML 1.0
document must have a single root.

There are several fields where this is assumed to be a problem; just think
e.g. of the return values of XPath or XQL expressions, which rarely
are single-rooted. Think how often you have heard the term "virtual
root node" recently. Think of Murata's "forest automata". Forest, not
tree. Guess why.

However, plainly dropping the tree structure of XML is too hasty. The
main problem here is the fact that there is not much between a tree
and a general DAG (graph). Oodles of folks are in search of a
convenient data model just in the middle of tree and graph. This
would help to truly merge hyperlinks, RDF and XML. Check
http://www.w3.org/TR/schema-arch

Thus: entities are truly no solution. However, so far nobody has
succeeded in suggesting a "true" non-tree data model for XML.

	++im

uche.ogbuji@fourthought.com <uche.ogbuji@fourthought.com> wrote at 10 Dec 99, 8:41:

> > The decision to make an XML document contain just a single element is the
> > root of the problem.
> 
> While this may be the root of other problems, and I do not claim to vouch for 
> all such problems, it is _not_ the root of the particular problem in question. 
>  I see no reason why the log file must be a well-formed XML document.  Can you 

--
Ingo Macherius//Dolivostrasse 15//D-64293 Darmstadt//+49-6151-869-882
GMD-IPSI German National Research Center for Information Technology
mailto:macherius@gmd.de http://www.darmstadt.gmd.de/~inim/
Information!=Knowledge!=Wisdom!=Truth!=Beauty!=Love!=Music==BEST (Zappa)



From yiminz at timberline.com  Fri Dec 10 16:39:44 1999
From: yiminz at timberline.com (yimin zhu)
Date: Mon Jun  7 17:18:29 2004
Subject: BizTalk Mapper
Message-ID: <2D722CFF0999D111AB860001FA375F1004353D2D@laposte.timberline.com>

Does anyone know whether the BizTalk Mapper is available now?

Yimin Zhu



From tbray at textuality.com  Fri Dec 10 16:49:28 1999
From: tbray at textuality.com (tbray@textuality.com)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
References: <111CF63B7D2ED211830000805F65A2FF018049A2@OTTMAIL2>
Message-ID: <0dcd01bf432e$bcb999e0$0500a8c0@ned>

From: Larry Watanabe <LWatanab@JetForm.com>
> If a log file can't easily be represented as a well-formed XML document,
> then that does indicate a problem with the spec. The XML spec does say that
> it should be straightforwardly usable; I don't think external parsed
> entities are a straightforward way of doing a simple operation such as
> append.

Seems to me the obvious solution is that for streaming apps like logfiles,
the smart thing to do is to represent them as a sequence of small XML docs.
That way, you can also load each one into memory, validate it, do all sorts
of clever things you wouldn't be able to do if you were trying to pretend
the stream was a single document. -Tim

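A sketch of the sequence-of-small-documents idea, under one assumption the
message does not make: that each document sits on its own line, so the
newline acts as the record separator. The function name iter_documents is
hypothetical.

```python
import io
import xml.etree.ElementTree as ET


def iter_documents(stream):
    """Yield one parsed document per non-empty line of the stream."""
    for line in stream:
        if line.strip():
            # Each record is parsed (and could be validated) on its own;
            # a malformed record raises ParseError without affecting others
            # that were already yielded.
            yield ET.fromstring(line)
```

For example, iter_documents(io.StringIO(...)) works on an in-memory
stream, and the same function accepts an open log file.
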



From uche.ogbuji at fourthought.com  Fri Dec 10 16:57:15 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: Your message of "Fri, 10 Dec 1999 17:29:40 +0100."
             <199912101626.RAA05881@sonne.darmstadt.gmd.de> 
Message-ID: <199912101657.JAA03681@localhost.localdomain>

> using entities of any kind does not change the underlying data model. 
> It's syntactic sugar, nothing more. The root of the problem (and a 
> great help in other cases) is indeed the fact that any XML 1.0 
> document must have a single root.

>From what I know of your problem, it seems as if you are the one who is 
confusing implementation issues with the underlying data model.

If I were faced with the same problem, my solution would be very simple.

The schema (your "underlying data model") for my XML logging document would be 
as follows:

<!ELEMENT log (entry*)>
<!ELEMENT entry (#PCDATA)>

My low-level logging code (where efficiency counts more than schematics) would 
manage a disk file in the form

<entry>Nam Sybillam quidem Cumis ego oculis meis vidi in ampulla 
pendere</entry>
<entry>Pueris respondebat "Volo perire"</entry>

And appending is as efficient as you please.  Let us say this disk file was 
"/var/log/classic.log"

The rest of the world (which is expecting an XML document) would access the 
logs through the following

<?xml version="1.0"?>
<!DOCTYPE log [<!ENTITY lf SYSTEM "file:/var/log/classic.log">]>
<log>&lf;</log>

And ta-da!  We've satisfied both our efficiency and semantic concerns using 
XML 1.0.

So where is the problem?

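For readers who want to try the wrapper, here is a sketch using Python's
expat binding, which leaves external entity resolution to the application.
Assumptions: the system identifier is shortened to "classic.log", and
load_entity is a hypothetical callback that returns the bytes of the
referenced file, so the example can run without touching /var/log.

```python
import xml.parsers.expat

WRAPPER = b"""<?xml version="1.0"?>
<!DOCTYPE log [<!ENTITY lf SYSTEM "classic.log">]>
<log>&lf;</log>"""


def read_entries(wrapper, load_entity):
    entries, text = [], []

    def start(name, attrs):
        if name == "entry":
            text.clear()

    def chars(data):
        text.append(data)

    def end(name):
        if name == "entry":
            entries.append("".join(text))

    def install(p):
        p.StartElementHandler = start
        p.CharacterDataHandler = chars
        p.EndElementHandler = end

    def external(context, base, system_id, public_id):
        # Parse the append-only file with a sub-parser for the entity.
        sub = parser.ExternalEntityParserCreate(context)
        install(sub)
        sub.Parse(load_entity(system_id), True)
        return 1  # non-zero tells expat to carry on

    parser = xml.parsers.expat.ParserCreate()
    install(parser)
    parser.ExternalEntityRefHandler = external
    parser.Parse(wrapper, True)
    return entries
```

The log file itself is never rewritten; only the three-line wrapper is a
complete document.
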

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Fri Dec 10 17:25:28 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:29 2004
Subject: Oops!  Attribution error
In-Reply-To: Your message of "Fri, 10 Dec 1999 09:57:00 MST."
             <199912101657.JAA03681@localhost.localdomain> 
Message-ID: <199912101725.KAA03768@localhost.localdomain>

> > using entities of any kind does not change the underlying data model. 
> > It's syntactic sugar, nothing more. The root of the problem (and a 
> > great help in other cases) is indeed the fact that any XML 1.0 
> > document must have a single root.
> 
> >From what I know of your problem, it seems as if you are the one who is 
> confusing implementation issues with the underlying data model.

This last paragraph was not meant to be a quote of Larry but part of my 
response.  Looks as if an &gt; nipped in there somehow.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From abisheks at india.hp.com  Fri Dec 10 17:32:54 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun  7 17:18:29 2004
Subject: Parsing a DTD for information
Message-ID: <002301bf4334$7144adf0$252f0a0f@india.hp.com>

Hi,

Is there an XML parser that will allow me to parse just a DTD?

Suppose the following is my DTD
<!ELEMENT (name+,lastname+)>

My application needs to know that it can have a list of names and a list of lastnames.

Most parsers give me the data inside the elements/attributes; however, they do
not allow access to the grammar associated with the elements/attributes in the DTD.

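One way to get at the content models, sketched with Python's expat binding
and its ElementDeclHandler callback. Assumptions: the declarations are
wrapped in the internal subset of a dummy document (which only works for
DTDs that are legal there), and the element name "person" is invented,
since the declaration above does not name its element.

```python
import xml.parsers.expat
from xml.parsers.expat import model


def element_models(dtd_text, root):
    """Return {element name: content model tuple} for the declarations."""
    models = {}
    parser = xml.parsers.expat.ParserCreate()
    # Called once per <!ELEMENT ...> with (type, quantifier, name, children).
    parser.ElementDeclHandler = lambda name, m: models.__setitem__(name, m)
    doc = f"<!DOCTYPE {root} [\n{dtd_text}\n]>\n<{root}/>"
    parser.Parse(doc, True)
    return models


def describe(m):
    """List (child name, repeatable?) pairs for a sequence model."""
    _ctype, _quant, _name, children = m
    repeat = (model.XML_CQUANT_PLUS, model.XML_CQUANT_REP)
    return [(c[2], c[1] in repeat) for c in children]
```

expat does not validate, so the empty dummy root element is accepted even
though it would not satisfy the declared content model.
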
regards,
Abhishek.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    _/               Abhishek Srivastava
   _/                Hewlett Packard ISO       
  _/_/_/   _/_/_/    -------------------   
 _/    /   _/   _/     (Work)   +91-80-2251554 x1190
_/  _/   _/_/_/      (Ip)     15.10.47.37            
        _/           (Url)    http://sites.netscape.net/abhishes/index.html                        
       _/            
                     Work like you don't need the money.
                     Dance like no one is watching.
                     And love like you've never been hurt.
                     --Mark Twain                       
                     
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From lisarein at finetuning.com  Fri Dec 10 17:34:17 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:18:29 2004
Subject: BizTalk Mapper
References: <2D722CFF0999D111AB860001FA375F1004353D2D@laposte.timberline.com>
Message-ID: <38513A39.A3D9E2E1@finetuning.com>

Nope!  The BizTalk schema mapper server thinggy won't be out till MAYBE
second quarter 2000 (per microsoft evangelists)

lisa rein
http://www.finetuning.com/collect.html

yimin zhu wrote:
> 
> Does anyone know whether the BizTalk Mapper is available now?
> 
> Yimin Zhu
> 



From klamerus at pobox.com  Fri Dec 10 18:16:32 1999
From: klamerus at pobox.com (Mark & Eileen Klamerus)
Date: Mon Jun  7 17:18:29 2004
Subject: Lists of Schema
Message-ID: <000001bf433a$928ad740$5a67fea9@hydrox>

All,

I'm researching various XML schema initiatives, in particular those
which would be applicable to the chemicals industry.  I know that most
schemas are oriented toward work processes (customer data, billing
information, etc.), but even for those it's hard to find a good reference
list.

Are there any sites or references which identify schemas?  Are there any
organizations (besides OASIS) which might provide information on initiatives
to define schemas that are under way?

Thanks, especially for e-mail.

Mark Klamerus




From clark.evans at manhattanproject.com  Fri Dec 10 18:22:37 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <000e01bf4326$639cc560$d1940e18@smateo1.sfba.home.com>
Message-ID: <Pine.LNX.4.10.9912100122010.14597-100000@cauchy.clarkevans.com>


On Fri, 10 Dec 1999, Don Park wrote:
> IMHO, this is a parser implementation problem.
> I do not know of a single XML parser that expects
> more than one XML document in a file or a stream
> input.

I tend to agree here.  If a DOM parser encounters
more than one root element, it could easily
create a root element, say by grabbing the 
name of the file.  If a SAX parser encounters
more than one root element, it should just
proceed by ending the first 'root' element,
and then starting the next one.  The only 
alternative is to have your log file 
open a root element and then never 
terminate it -- I think the parser should
handle this as well.

Why would this be a problem?

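The last alternative (a root element that is opened and never terminated)
can already be read incrementally with an event-based parser, as this
sketch with Python's XMLPullParser suggests. It does not make a parser
accept several root elements, which XML 1.0 treats as a well-formedness
error; the <log> and <entry> names and the tail_entries function are
assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def tail_entries(chunks):
    """Yield the text of each completed <entry> as data arrives."""
    parser = ET.XMLPullParser(events=("end",))
    for chunk in chunks:
        parser.feed(chunk)
        for _event, elem in parser.read_events():
            if elem.tag == "entry":
                yield elem.text
    # close() is deliberately never called: the root stays open.
```

An entry is reported only once its end tag has been seen, so a record that
is still being written is simply held back.
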
Clark




From clark.evans at manhattanproject.com  Fri Dec 10 18:25:59 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <199912101657.JAA03681@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912100126250.14597-100000@cauchy.clarkevans.com>


On Fri, 10 Dec 1999 uche.ogbuji@fourthought.com wrote:
> And ta-da!  We've satisfied both our efficiency and semantic 
> concerns using XML 1.0.   So where is the problem?

Great solution!  Now let's just implement this
at the parser level so that the average user doesn't 
need to be concerned with this tedious detail.

Clark




From donpark at docuverse.com  Fri Dec 10 18:32:21 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <Pine.LNX.4.10.9912100122010.14597-100000@cauchy.clarkevans.com>
Message-ID: <000a01bf433c$fe78eda0$d1940e18@smateo1.sfba.home.com>

>Why would this be a problem?

No problem as far as I can see.  But then I am kind-a short.

Seriously, this is a meme-effect.  The 'Document' meme is so
strong that I suspect people actually visualize the color
and the texture of paper when they read the word 'document'.

Not so seriously, note that 'meme-effect' is different from
'mama-effect' which is like the effect W3C has on the XML
community. <g>

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From david at megginson.com  Fri Dec 10 19:20:32 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: "Clark C. Evans"'s message of "Fri, 10 Dec 1999 01:24:43 -0500 (EST)"
References: <Pine.LNX.4.10.9912100122010.14597-100000@cauchy.clarkevans.com>
Message-ID: <m3iu26mwmb.fsf@localhost.localdomain>

"Clark C. Evans" <clark.evans@manhattanproject.com> writes:

> On Fri, 10 Dec 1999, Don Park wrote:

> > IMHO, this is a parser implementation problem.  I do not know of a
> > single XML parser that expects more than one XML document in a
> > file or a stream input.
> 
> I tend to agree here.  If a DOM parser encounters more than one root
> element, it could easily create a root element, say by grabbing the
> name of the file.  If a SAX parser encounters more than one root
> element, it should just proceed by ending the first 'root' element,
> and then starting the next one.

No, these would both be non-conformant -- the XML spec defines a
document as the main production, and a parser that encounters a second 
root element in what is being given to it as a document simply has to
stop processing, except for error reporting.

You have to distinguish the document boundaries in a single stream
before you pass it on to the parser.  For example, you could use ^L as
the document separator, and start a new parse each time you see it.
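
A minimal sketch of that pre-splitting step in Python, assuming the whole
stream fits in memory and that the producer writes ^L (U+000C) after each
document.  The name parse_stream is hypothetical, and ElementTree merely
stands in for whatever conformant parser is actually in use.

```python
import xml.etree.ElementTree as ET

FORM_FEED = "\x0c"  # ^L, a character XML 1.0 does not allow inside a document


def parse_stream(text):
    """Split a stream on form feeds and parse each piece as its own document."""
    roots = []
    for chunk in text.split(FORM_FEED):
        if chunk.strip():  # skip empty pieces, e.g. after a trailing ^L
            roots.append(ET.fromstring(chunk))  # one fresh parse per document
    return roots


# Two one-element documents in a single stream (made-up sample data).
stream = "<entry>first</entry>\x0c<entry>second</entry>\x0c"
entries = [root.text for root in parse_stream(stream)]
```

Each piece is handed to the parser as a complete document, so the parser
itself never sees a second root element.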


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From clark.evans at manhattanproject.com  Fri Dec 10 20:01:10 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document
In-Reply-To: <m3iu26mwmb.fsf@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912100301400.14862-100000@cauchy.clarkevans.com>


On 10 Dec 1999, David Megginson wrote:
> No, these would both be non-conformant -- the XML spec defines a
> document as the main production, and a parser that encounters a second 
> root element in what is being given to it as a document simply has to
> stop processing, except for error reporting.

Yes.  But I thought the computational model for XML was
a hedge automaton (not a tree automaton...)?

Clark




From donpark at docuverse.com  Fri Dec 10 20:47:58 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:29 2004
Subject: DOM Level 2 bumped up to Candidate Recommendation
Message-ID: <000001bf434f$ef27bee0$d1940e18@smateo1.sfba.home.com>

DOM Level 2 bumped up to Candidate Recommendation

http://www.w3.org/TR/1999/CR-DOM-Level-2-19991210/

A lot of work ahead for us imps at the bleeding edge.
Don't forget to send back the bloody bandages to:

http://lists.w3.org/Archives/Public/www-dom/

Cheers,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From lauren at sqwest.bc.ca  Fri Dec 10 21:10:14 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:18:29 2004
Subject: DOM Level 2 bumped up to Candidate Recommendation
In-Reply-To: <000001bf434f$ef27bee0$d1940e18@smateo1.sfba.home.com>
Message-ID: <199912102106.NAA06308@mail.sqwest.bc.ca>

On 10 Dec 99, at 12:48, Don Park wrote:

> DOM Level 2 bumped up to Candidate Recommendation
> 
> http://www.w3.org/TR/1999/CR-DOM-Level-2-19991210/
> 
> A lot of work ahead for us imps at the bleeding edge.
> Don't forget to send back the bloody bandages to:
> 
> http://lists.w3.org/Archives/Public/www-dom/

Yes, please do. Candidate Recommendation is a new phase for 
W3C specs; the idea is to see if it's implementable before sending 
it off to PR. The only changes that will be made from now on are if 
something is seriously broken and it's very difficult or impossible to 
implement. (Apart from clarifications, of course!). So if the spec 
isn't clear enough to implement from, or implementations are nearly 
impossible, please send email. 

Also, if you do implement some part of Level 2, I'd like to hear 
about it, so we can be sure that the spec has been implemented 
often enough that we probably have the bugs out of it. If you want 
your email to be confidential, just send it to me 
(lauren@softquad.com), marking it as confidential, otherwise 
please send email to the public DOM mailing list.

thanks,


Lauren



From orchard at pacificspirit.com  Fri Dec 10 22:57:25 1999
From: orchard at pacificspirit.com (David Orchard)
Date: Mon Jun  7 17:18:29 2004
Subject: Object-oriented serialization (Was Re: Some questions)
In-Reply-To: <33D189919E89D311814C00805F1991F7F4A9A5@RED-MSG-08>
Message-ID: <001601bf4361$e8bd1de0$63511c09@n54wntw.vancouver.can.ibm.com>

I believe there is a third way:

Between system wide (meta-grammar) and mapping rules associated with a
schema, there is the third option of a graph-specific set of rules for
associating schemas.  The graph specification language allows arbitrary
graphs and mapping rules for subgraphs of the universe of elements described
by a schema.  Thus I could take the same large graph of Java objects and
serialize different subgraphs to different XML documents.  The XML documents
follow any pattern, but the mapping rules are not necessarily 1:1.  In
another case, I could take 2 distinct Java graphs and serialize to the same
XML schema with different "denormalization" in mapping or transforming
between the lhs and rhs.  There can be many mappings or bindings for
arbitrary graph traversals, potentially selected at runtime.

I personally am very interested in graph grammars and would love to hear
about papers on the topic.  One example I am interested in is a graph grammar
that can be used to specify a graph to traverse from a given starting point.
Typical examples of this are a graph of COM+ or EJB objects to instantiate
for a request, a graph of XLink extended links to retrieve for a given
document, or a graph of XInclude elements to traverse.

Cheers,
Dave Orchard
XLink co-editor

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Andrew Layman
> Sent: Monday, December 06, 1999 12:09 PM
> To: XML Developers List
> Subject: RE: Object-oriented serialization (Was Re: Some questions)
>
>
> Thanks.  As a recap: There are, broadly, two approaches to serializing a
> graph in XML.
>
> One is to invent a meta-grammar, a set of canonicalization rules.  That is
> what RDF syntax did, and what the attribute-centric and element-centric
> canonical format papers do, what SOAP section eight does. I think
> of this as
> "tunnelling the graph through XML."
>
> The other is to allow XML documents to follow any pattern described in a
> schema, and augmenting the schema with a set of mapping rules.
>
> There appears to be significant value to each approach. (In particular,
> however, I disagree with the sometimes-asserted claim that graphs capture
> the semantics of a communication while grammars do not.  Graphs are just
> another grammar.  This makes me reluctant to deprecate grammars.)
>
> I agree that formal approaches to mapping would be helpful. I look forward
> to reading your papers.
>




From wunder at infoseek.com  Fri Dec 10 22:57:28 1999
From: wunder at infoseek.com (Walter Underwood)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <111CF63B7D2ED211830000805F65A2FF018049A2@OTTMAIL2>
Message-ID: <3.0.5.32.19991210145224.00c08100@corp.infoseek.com>

At 11:09 AM 12/10/99 -0500, Larry Watanabe wrote:
>
>If a log file can't easily be represented as a well-formed XML document,
>then that does indicate a problem with the spec.

Well, time series data (a logfile) isn't normalizable in the 
relational model, so there's a problem with the SQL spec, too. 
And I'm having a lot of trouble accessing my meatloaf with my 
SCSI bus, so let's fix that while we're at it.

Seriously, I think it is good that XML documents have a
definite end. But the logfile problem can be solved in
two (near-)standard ways.

1) Treat the logfile as an XML Fragment (see proposal at W3C).
   An XML Fragment needs to be "well-balanced", but doesn't have
   to have a single root element.

2) Treat the logfile as a series of XML documents, each of 
   which is a log record. They could be separated by formfeed
   (a character illegal inside an XML document). Since the
   XML declaration is technically optional, it could be
   omitted.
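
Option 1 can be approximated today without fragment support in the parser.
The following is only a sketch in Python: it encloses the fragment in a
synthetic root before parsing.  The helper name parse_fragment and the
wrapper element name "log" are assumptions for illustration, and the
fragment is assumed to carry no XML declaration or DOCTYPE of its own.

```python
import xml.etree.ElementTree as ET


def parse_fragment(fragment, wrapper="log"):
    """Parse a well-balanced fragment by enclosing it in a synthetic root.

    Returns the top-level elements of the fragment, in document order.
    """
    root = ET.fromstring(f"<{wrapper}>{fragment}</{wrapper}>")
    return list(root)


# A log file holding two complete records and no root element (sample data).
records = parse_fragment("<rec>a</rec>\n<rec>b</rec>\n")
texts = [r.text for r in records]
```

A record that was only partly written leaves the fragment unbalanced, so
the parse raises ET.ParseError and no records are returned at all.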

But nobody ever made representing logfiles a requirement
for a markup language. It is usually far more important that
logfiles are compact and are readable if the program crashes
(or the disk fills) when a record is partly-written.

wunder
--
Walter R. Underwood
Senior Staff Engineer
Infoseek Software
GO Network, part of The Walt Disney Company
wunder@infoseek.com
http://software.infoseek.com/cce/ (my product)
http://www.best.com/~wunder/
1-408-543-6946



From donpark at docuverse.com  Fri Dec 10 23:48:38 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document 
In-Reply-To: <3.0.5.32.19991210145224.00c08100@corp.infoseek.com>
Message-ID: <000001bf4369$3186ad00$d1940e18@smateo1.sfba.home.com>

Here is another solution that most people overlook:

You can use XML APIs without using XML and still get
most of the benefits.

Store your log in any format you want but write a
SAX parser for it so that log processing applications
can be XML applications.  This allows you to move up
to XML storage format when and if you need to without
rewriting any software.
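
A minimal sketch of the reading side, written against Python's xml.sax
rather than the Java SAX 1.0 interfaces.  The "timestamp message" line
format, the log/entry element names, the time attribute and the function
name fire_log_events are all assumptions made up for illustration.

```python
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def fire_log_events(lines, handler):
    """Report plain 'timestamp message' lines to a SAX handler as XML events."""
    handler.startDocument()
    handler.startElement("log", AttributesImpl({}))
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # ignore blank lines in the custom format
        stamp, _, message = line.partition(" ")
        handler.startElement("entry", AttributesImpl({"time": stamp}))
        handler.characters(message)
        handler.endElement("entry")
    handler.endElement("log")
    handler.endDocument()


class Collector(ContentHandler):
    """An ordinary SAX application; it cannot tell the input was not XML."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._time = None
        self._text = []

    def startElement(self, name, attrs):
        if name == "entry":
            self._time = attrs["time"]
            self._text = []

    def characters(self, content):
        self._text.append(content)

    def endElement(self, name):
        if name == "entry":
            self.entries.append((self._time, "".join(self._text)))


collector = Collector()
fire_log_events(["09:00 started\n", "09:05 disk full\n"], collector)
```

If the storage format later becomes real XML, only fire_log_events is
replaced by a real parser; Collector stays as it is.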

Log producers work in reverse, meaning that they fire
SAX events and a special SAX application writes out the
information into a file in custom format or send it to
a log server using whatever wire format you want.

You can do something similar with DOM as well.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From greynolds at datalogics.com  Sat Dec 11 00:08:04 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun  7 17:18:29 2004
Subject: Object-oriented serialization (Was Re: Some questions)
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17056@martinique>

Sorry I'm a bit late with this...

> -----Original Message-----
> From: James Tauber [mailto:jtauber@jtauber.com]
> Sent: Sunday, December 05, 1999 7:25 PM
> To: xml-dev@ic.ac.uk
> 
> > The semantic constraints I am
> > talking about are one step away from these "ultimate" semantics; they
> > tell you that an integer contained in a given element cannot be greater
> > than 100, but they don't tell you why. These are still semantics to me
> 
> Ah. This is why I have had some difficulty understanding some of what you
> are saying. To me, the constraint that an integer cannot be greater than 100
> is not semantics. It's syntax.
> 
> MyInteger ::= ( '100' | digit{1,2} | '-' digit+ )
> 
> or in some more perspicuous grammar:
> 
> MyInteger = Integer x : x <= 100
> 

But this won't work outside of Europe.  You have to have a clean distinction
between syntax and semantics, and an explicit, rigorous mapping from one to
the other.  The symbols used in your example have no intrinsic meaning, just
as numbers have no intrinsic form available to our perception, so the syntax
can only constrain the formal properties of expressions using those
symbols.  Sure, we have conventional meanings for constant symbols like '1'
and '0'; but they're still symbols pointing to something else, and as soon
as you start writing expressions with a different symbol set - in Sanskrit,
say, or Ethiopic, or you name it - then you're out of luck without a formal
semantics.

On the other hand, the cultural interpretation of the denotatum is beyond
the scope of language definition.  Doesn't matter what the user intends to
model using integers; be it widgets, fingers, or planets, the best the
language designer can do is provide a consistent language that accurately
models the integers.

-gregg



From Min.Zhong at penske.com  Sat Dec 11 04:35:11 1999
From: Min.Zhong at penske.com (Min Zhong)
Date: Mon Jun  7 17:18:29 2004
Subject: BizTalk JumpStart Kit - File missing
Message-ID: <s8518e1a.055@penske.com>

Hi, 

Has anyone tried to install the BizTalk JumpStart Kit before?  I tried to install it but failed.  I had to debug the setup script and finally found that "mtsPkgMgs.dll" is missing.  Could someone help me find out where I can download that file?

Thank you very much!

Min 




From ricko at allette.com.au  Sat Dec 11 08:26:01 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:29 2004
Subject: Appending to an XML document
Message-ID: <005801bf43b4$d99db970$06f96d8c@NT.JELLIFFE.COM.AU>


From: David Megginson <david@megginson.com>
>You have to distinguish the document boundaries in a single stream
>before you pass it on to the parser.  For example, you could use ^L as
>the document separator, and start a new parse each time you see it.

This is at least the third time this has come up (from memory, it was
discussed during the XML development, then a year ago on XML-DEV).

So it would be good to make a QnA for the SGML FAQ on this subject.
Can anyone suggest an improvement on the following?
-----
Q.  How can I have an unending stream of data in XML?

A.  You must use a stream of XML documents. The simplest way
to do this is to separate each document with ^L (form feed), which
is not an allowed character in XML and which is not used for in-band
signalling by common streaming systems.

If the incoming stream terminates unexpectedly during
a document, then that document is not well-formed.  You
should consider how to handle such fragments.

Note that "document" is a technical term meaning a
"collection of  information that is processed as a unit"
(ISO 8879:1986) and represents a distinct layer between
storage/transport (e.g., entities, streams, archives) and
publication. An open-ended stream must be partitioned
into distinct XML documents, for example, one per
entry.  Consequently, you cannot use ID/IDREF for
references between documents in a stream, but rather
you should use some more general reference mechanism,
such as W3C XPointers.

Another alternative, suggested by Uche Ogbuji, is suitable
when the incoming log data is to be sent to a file rather than
processed:

The schema (your "underlying data model") for my XML logging document
would be as follows:

<!ELEMENT log (entry*)>
<!ELEMENT entry (#PCDATA)>

My low-level logging code (where efficiency counts more than schematics)
would manage a disk file in the form

<entry>Nam Sybillam quidem Cumis ego oculis meis vidi in ampulla
pendere</entry>
<entry>Pueris respondebat "Volo perire"</entry>

And appending is as efficient as you please.  Let us say this disk file
was "/var/log/classic.log".

The rest of the world (which is expecting an XML document) would access
the logs through the following

<?xml version="1.0"?>
<!DOCTYPE log [<!ENTITY lf SYSTEM "file:/var/log/classic.log">]>
<log>&lf;</log>

And ta-da!  We've satisfied both our efficiency and semantic concerns
using XML 1.0.
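
The external-entity approach can be tried out with a short Python sketch.
It assumes a parser that resolves external general entities; Python's
xml.sax leaves that feature off by default because it lets a document pull
in arbitrary files, so switch it on only for trusted input.  The temporary
directory, the file name log.xml, the relative system identifier and the
names read_log and EntryCollector are assumptions for illustration, not
part of the original suggestion.

```python
import os
import tempfile
import xml.sax
from xml.sax.handler import ContentHandler, feature_external_ges


class EntryCollector(ContentHandler):
    """Collect the text of every <entry> element."""

    def __init__(self):
        super().__init__()
        self.entries = []
        self._buf = None

    def startElement(self, name, attrs):
        if name == "entry":
            self._buf = []

    def characters(self, content):
        if self._buf is not None:
            self._buf.append(content)

    def endElement(self, name):
        if name == "entry":
            self.entries.append("".join(self._buf))
            self._buf = None


def read_log(wrapper_path):
    """Parse the wrapper document, letting the parser pull in the raw log."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_external_ges, True)  # trusted input only
    collector = EntryCollector()
    parser.setContentHandler(collector)
    parser.parse(wrapper_path)
    return collector.entries


workdir = tempfile.mkdtemp()
log_path = os.path.join(workdir, "classic.log")
wrapper_path = os.path.join(workdir, "log.xml")

# The logger only ever appends rootless <entry> records to the raw file.
for text in ("first", "second"):
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(f"<entry>{text}</entry>\n")

# The wrapper is written once and never changes.
with open(wrapper_path, "w", encoding="utf-8") as f:
    f.write('<?xml version="1.0"?>\n'
            '<!DOCTYPE log [<!ENTITY lf SYSTEM "classic.log">]>\n'
            '<log>&lf;</log>\n')

entries = read_log(wrapper_path)
```

Readers see one well-formed document with a single log root, while the
writer never has to rewrite a closing tag.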

------

Why is this not in the XML Spec?
    1) Simplicity and layering
    2) It is not the W3C's business to make specs for streams of
entities:  IETF is the forum for that.


Rick Jelliffe




From ceo at citix.com  Sat Dec 11 12:24:58 1999
From: ceo at citix.com (Steven Livingstone)
Date: Mon Jun  7 17:18:29 2004
Subject: BizTalk JumpStart Kit - File missing
References: <s8518e1a.055@penske.com>
Message-ID: <003501bf43d3$3bda35a0$0a0a0a0a@deltabiz>

Min, I have the BizTalk kit installed and that file is not on my disk !?

What setup file tries to install this ?

cheers
steven

Steven Livingstone
Glasgow, Scotland.
+44 (0) 7771 957 280

Professional Site Server 3
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002696
Professional Site Server 3.0 Commerce Edition
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002505
----- Original Message -----
From: Min Zhong <Min.Zhong@penske.com>
To: <xml-dev@ic.ac.uk>
Sent: Friday, December 10, 1999 10:51 PM
Subject: BizTalk JumpStart Kit - File missing


Hi,

Has anyone tried to install the BizTalk JumpStart Kit before?  I tried to
install it but failed.  I had to debug the setup script and finally found
that "mtsPkgMgs.dll" is missing.  Could someone help me find out where I can
download that file?

Thank you very much!

Min






From larsga at garshol.priv.no  Sat Dec 11 13:41:36 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:30 2004
Subject: SAX and <!DOCTYPE>
In-Reply-To: <whk8mort34.fsf@viffer.oslo.metis.no>
References: <whln77f8hq.fsf@viffer.oslo.metis.no> <m3r9gzdqx0.fsf@ifi.uio.no> <whk8mort34.fsf@viffer.oslo.metis.no>
Message-ID: <m3emctd27e.fsf@ifi.uio.no>


* Lars Marius Garshol
|
| The XSA client, which needs to accept both XSA and OSD documents
| but can't tell them apart before parsing begins, uses a
| DispatchingDocHandler, which has a hash of DocumentHandlers keyed on
| the name of the document element. In this very restricted case that
| worked just fine.

* Steinar Bang
| 
| [...]
| It looks like a dispatching from a DocumentHandler is the best idea,
| but then I need to be able to queue up SAX DocumentHandler events to
| send to the actual DocumentHandler when I start it.

If you dispatch on the document element this is easy, since the only
events you can get before it (in SAX 1.0, that is) are PI events. In
my handler I simply stuffed those into a Vector and replayed them when
the correct DocumentHandler had been selected.
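
A rough sketch of the same idea in Python's xml.sax, where ContentHandler
plays the role of the SAX 1.0 DocumentHandler.  This is not the XSA
client's actual code: the class names and the "xsa" and "softpkg" keys are
assumptions for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class DispatchingHandler(ContentHandler):
    """Pick a handler by document element name, replaying earlier PIs to it."""

    def __init__(self, handlers):
        super().__init__()
        self._handlers = handlers  # document element name -> ContentHandler
        self._target = None        # chosen once the document element is seen
        self._pending = []         # PIs seen before the document element

    def processingInstruction(self, target, data):
        if self._target is None:
            self._pending.append((target, data))  # queue until we can dispatch
        else:
            self._target.processingInstruction(target, data)

    def startElement(self, name, attrs):
        if self._target is None:
            self._target = self._handlers[name]  # KeyError for unknown types
            self._target.startDocument()
            for target, data in self._pending:   # replay the queued PIs
                self._target.processingInstruction(target, data)
            self._pending = []
        self._target.startElement(name, attrs)

    def endElement(self, name):
        self._target.endElement(name)

    def characters(self, content):
        self._target.characters(content)

    def endDocument(self):
        if self._target is not None:
            self._target.endDocument()


class Recorder(ContentHandler):
    """Stand-in for a real XSA or OSD handler; it just records events."""

    def __init__(self):
        super().__init__()
        self.events = []

    def processingInstruction(self, target, data):
        self.events.append(("pi", target))

    def startElement(self, name, attrs):
        self.events.append(("start", name))


xsa, osd = Recorder(), Recorder()
dispatcher = DispatchingHandler({"xsa": xsa, "softpkg": osd})
xml.sax.parseString(b"<?note early?><xsa><vendor/></xsa>", dispatcher)
```

Only the handler registered for the document element receives events, and
it sees the queued processing instruction before its first startElement.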

| Hm... maybe a clone() function and a virtual destructor are in order
| for the C++ AttributeList class...?

There is an equivalent to a cloning function in the AttributeListImpl
class already:

  <URL: http://www.megginson.com/SAX/javadoc/org.xml.sax.helpers.AttributeListImpl.html >

--Lars M.




From larsga at garshol.priv.no  Sat Dec 11 14:14:14 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:30 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991207101003.00bfc100@corp.infoseek.com>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>  <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991207101003.00bfc100@corp.infoseek.com>
Message-ID: <m3d7sdd0oz.fsf@ifi.uio.no>


* Lars Marius Garshol
| 
| First thought: this is fine for very simple uses, but for more
| complex uses something along the lines of the robots.txt file would
| be very nice. How about a variant PI that can point to a robots.rdf
| resource?

* Walter Underwood
| 
| In our experience, the simple form covers almost all needs.  We have
| 1000+ customers, and only three or four of them use our selective
| indexing support. So, I think of the robots meta tag as a proven
| solution that doesn't need major improvement.

I agree with you that probably the majority of web authoring
individuals prefer and are happy with the "meta tag" solution,
however, lots of people (such as me, for example) are not going to be
happy with it, since it requires indexing information to be added to
each and every document. My gut reaction to that is that it's plain
wrong, because it leads to so much hassle in content maintenance.

Also, using an RDF file to describe the site structure opens up for
new possibilities such as being able to group resources in a sensible
way to enable search engines to respond with more meaningful search
results.
 
Ever since RDF appeared I've been waiting for some application that
would enable me to say:

  - all these resources are small pieces of this larger split-up
    resource, which is represented by _this_ resource

  - this group of resources belongs together, and they are represented
    by _this_ resource

  - this group contains this other group

  - these are the groups of resources that make up this site, and this
    is the home page of the site

  - these groups are authored by this person, who is represented by
    this resource

  - this resource is of this kind


In an ideal world, this would lead to search engine responses like the
following:

  http://www.infotek.no/foredrag/lmg-xml.no-99/slide34.html

  xml-dev, part of a slide presentation by Lars Marius Garshol.
  Part of the STEP Infotek web pages.
  [top slide] [site top page] [author]

This doesn't really seem all that hard, but optimist that I am I may
of course be seriously underestimating the difficulties involved.

| Secondly, fetching two or more entities for one document makes the
| robot code much more complex. If the robots.rdf file gets a 404,
| what happens? What about a 401 or a timeout? The robot may need
| separate last-modified dates and revisit times for each entity. And
| after it is implemented and tested, how do you explain all that to
| customers who just want search results?

Personally, if I were a search engine vendor, I would see this as a
great chance to really stand out from the competition and deliver
something beyond what the others do, at least until they catch on.

Yes, it requires more from the users, yes, it requires more from the
implementation, but this has to be weighed against the benefits, which
are presumably large. 

Also, seeing the amount of interest for "meta tags" and optimizing for
various search engines among various content providers I assume that
if this facility really did help providers get more hits for their
sites then that would be all the motivation they need.

But in any case this was only meant as a loose suggestion, so if
you're not interested, then that's the end of that.

--Lars M.




From larsga at garshol.priv.no  Sat Dec 11 14:28:01 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:30 2004
Subject: A processing instruction for robots
In-Reply-To: <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>
References: <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com>  <3.0.5.32.19991202135858.00ac6100@corp.infoseek.com> <3.0.5.32.19991207101517.00b528f0@corp.infoseek.com>
Message-ID: <m3bt7xd021.fsf@ifi.uio.no>


* Lars Marius Garshol
| 
| Second thought: "and the index attribute must be first". This is
| nice for implementors, but is likely to clash with the expectations
| of users and the cost of more generality is very low for
| implementors.

* Walter Underwood
| 
| I'm open to changing this, but I thought I would start with the most
| strict version. The advantage of the strict version is that it
| doesn't need to be parsed. The Desperate Perl Hacker can do four
| regex compares for the four variants and get back to work.

I think I agree with Robert Hanson here. It's so easy to parse even if
the order is optional that I don't really see any point in fixing the
order. (Note for example that "Associating stylesheets with ..." does
not fix the order.)

| Maybe folks who've worked with authors on SGML systems have
| some relevant experience. Is this too strict for folks that
| aren't tamed by computers?

I'm not really qualified to answer this, so I'll pass.

--Lars M.




From larsga at garshol.priv.no  Sat Dec 11 14:46:31 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:30 2004
Subject: RSS and WAP - another RSS question
In-Reply-To: <384018F6.182C7EFC@finetuning.com>
References: <3.0.6.32.19991127171313.00920c80@gpo.iol.ie> <384018F6.182C7EFC@finetuning.com>
Message-ID: <m366y5cz76.fsf@ifi.uio.no>


* Lisa Rein
| 
| I'm having trouble understanding why or if RSS even matters 

For me that's very simple. I like to keep track of news from lots of
different web sites, but I _really_ dislike having to keep revisiting
each site all the time to see if there is anything new. (Especially
with sites that are rarely updated, or dog-slow, like slashdot.)

The nice thing about RSS is that for the sites that support it it
completely removes that need. Instead I can just tell my RSS viewer to
get new stories when I feel like an update and it will list the new
stories nicely with the source etc in the client window.

Then I remove the ones that don't seem interesting and make the client
open the interesting ones in my browser. In fact, this is so nice that
I notice that I tend to just skip the non-RSS sites, and I also note
that it allows me to keep track of more sites than the old model.

The next feature for the client is probably going to be filters, so
that I don't have to see stuff that I know a priori won't be of
interest.

| (despite its apparent level of widespread adoption) and

Well, that's IMHO the other reason why it matters. It's a first
example of an application like the global XML applications that were
envisioned when this whole XML thing started that has succeeded in
terms of adoption. Personally I think that is very important.

| am trying to determine its level of inclusion in my books/classes.

I use it extensively in talks and classes, and am finding that it
seems to go down well, probably because:

 - anyone can understand what it's about
 - the documents are so simple
 - when presented in the right way the utility becomes obvious

--Lars M.




From tlainevool at yahoo.com  Sat Dec 11 22:57:04 1999
From: tlainevool at yahoo.com (Toivo Lainevool)
Date: Mon Jun  7 17:18:30 2004
Subject: XML Design Patterns
Message-ID: <19991211225649.24570.qmail@web2103.mail.yahoo.com>

--- Don Park <donpark@docuverse.com> wrote:
> These are preliminary XML design patterns
> so pattern names are weird to say the least.

Does anyone have any references to XML design patterns? It seems like a great
idea.

Thanks,

Toivo Lainevool



From Daniel.Veillard at w3.org  Sun Dec 12 00:10:24 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:18:30 2004
Subject: XPointer Working Draft entered Last Call
Message-ID: <19991211191014.D23680@w3.org>

  Since nobody seems to have posted about it yet, I am forwarding this
information here. People should review this Last Call draft (and start
implementing it, if interested) and report problems or implementations
to the public list www-xml-linking-comments@w3.org

  http://lists.w3.org/Archives/Public/www-xml-linking-comments/

  thanks,
   
------------- Excerpts ---------------
Status of this document

The XML Linking Working Group, with this 1999 December 6 XPointer Last
Call working draft, invites comment on this specification. The Last Call
period begins 6 December 1999 and ends 27 December 1999.

The W3C Membership and other interested parties are invited to review
the specification and report implementation experience. Please send
comments to www-xml-linking-comments@w3.org
[...]

Abstract

This specification defines the XML Pointer Language (XPointer), the
language to be used as a fragment identifier for any URI-reference that
locates a resource of Internet media type text/xml or application/xml.

XPointer, which is based on the XML Path Language (XPath), supports
addressing into the internal structures of XML documents. It allows for
traversals of a document tree and choice of its internal parts based on
various properties, such as element types, attribute values, character
content, and relative position.
---------------------------------------

Daniel

W3C Staff contact for the XML Linking WG

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



From tlainevool at yahoo.com  Sun Dec 12 01:40:04 1999
From: tlainevool at yahoo.com (Toivo Lainevool)
Date: Mon Jun  7 17:18:30 2004
Subject: a simpler document type definition language?
Message-ID: <19991212014000.25736.qmail@web2101.mail.yahoo.com>

--- Jurg Wullschleger <Jurg.Wullschleger@mb.luth.se> wrote:
> the simplest form i can think of would look something like this: (examples
> in DTD syntax)
> there are only 4 types of elements:
> 
> - empty elements
> <!ELEMENT name1 EMPTY >
> 
> - elements that contain data
> 
> <!ELEMENT name2 (#PCDATA) >
> 
> - list elements
> 
> <!ELEMENT name3 (name1|name2|name3|name4)* >
> 
> - structural elements of a fixed length
> 
> <!ELEMENT name4 ((name1|name2),name3,name4,(name5|name6|name7)) >

I would go even simpler than that: don't allow nested brackets. #4 could be
represented like this:

<!ELEMENT name4 (nameA,name3,name4,nameB)>
<!ELEMENT nameA (name1|name2)>
<!ELEMENT nameB (name5|name6|name7)>

I recently wrote a quick and dirty DTD processor that generated Java classes
for parsing valid XML documents. Using this simplification in my DTDs made
things a whole lot easier.
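
A minimal sketch of the simplification described above (it is not the DTD
processor mentioned in this post): each nested group of a content model is
lifted into a generated helper element. The names flattenModel and Flattened
and the "groupN" naming scheme are assumptions made for illustration.
Generated names could collide with real element names, and the helpers become
real wrapper elements in documents, just as nameA and nameB do above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Result of flattening one content model.
struct Flattened {
    std::string model;                 // rewritten model without nested groups
    std::vector<std::string> helpers;  // generated <!ELEMENT ...> declarations
};

// Replace every group nested inside the outermost parentheses of `model`
// with a generated element name, recursing for deeper nesting.
// `prefix` is the (hypothetical) stem used for generated names.
Flattened flattenModel(const std::string& model, const std::string& prefix) {
    Flattened out;
    int depth = 0;
    int counter = 0;
    std::size_t groupStart = 0;
    for (std::size_t i = 0; i < model.size(); ++i) {
        const char c = model[i];
        if (c == '(') {
            ++depth;
            if (depth == 1) out.model += c;        // outermost group stays
            else if (depth == 2) groupStart = i;   // start of a nested group
        } else if (c == ')') {
            if (depth == 2) {
                const std::string name = prefix + std::to_string(++counter);
                const Flattened sub = flattenModel(
                    model.substr(groupStart, i - groupStart + 1), name + "_");
                out.helpers.push_back("<!ELEMENT " + name + " " + sub.model + ">");
                out.helpers.insert(out.helpers.end(),
                                   sub.helpers.begin(), sub.helpers.end());
                out.model += name;                 // reference replaces the group
            } else if (depth == 1) {
                out.model += c;
            }
            --depth;
        } else if (depth <= 1) {
            out.model += c;  // names, separators and occurrence indicators
        }
    }
    return out;
}
```

Called with the content model of name4 and the prefix "group", this yields the
model (group1,name3,name4,group2) plus one helper declaration for each of the
two nested groups.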

Toivo Lainevool


From donpark at docuverse.com  Sun Dec 12 02:31:38 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:30 2004
Subject: XML Design Patterns
In-Reply-To: <19991211225649.24570.qmail@web2103.mail.yahoo.com>
Message-ID: <000701bf4449$1f7f48c0$d1940e18@smateo1.sfba.home.com>

>--- Don Park <donpark@docuverse.com> wrote:
>> These are preliminary XML design patterns
>> so pattern names are weird to say the least.
>
>Does anyone have any references to XML design patterns, seems 
>like a great
>idea.

Great ideas are usually obvious ideas.  I have been a 
'pattern-head' ever since Christopher Alexander's books
were mentioned in context of software engineering.  Patterns
in software design lead naturally to patterns in information
design.  XML design patterns are subpatterns of general
information design patterns just like database schema design
patterns.

IMHO, we have greater opportunity to exploit the design
patterns in information engineering than software engineering
because automated application of design patterns is easier
with structured information such as XML than with programming
languages.

XML design pattern activities are just starting to appear.
I have heard that an article will appear on XML.com soon.
As soon as I can find some spare time, I intend to build a
repository for XML Design Patterns which will allow the XML
community to pool design knowledge, to evolve it as the
community evolves, and hopefully, to apply the patterns
automatically using tools that use the repository as a
knowledge-base.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From ricko at allette.com.au  Sun Dec 12 08:41:29 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:30 2004
Subject: XML Design Patterns
Message-ID: <004e01bf4480$39caadf0$5df96d8c@NT.JELLIFFE.COM.AU>


From: Toivo Lainevool <tlainevool@yahoo.com>
>Does anyone have any references to XML design patterns, seems like a
>great idea.

There are three main references.

1) The first is Ian Graham and Liam Quin's web pages "Introduction to
XML Design Patterns" at
http://www.groveware.com/xmlbook/patterns.html

They give:
  * Running Text Pattern
  * Generated Text Pattern
  * Footnote Pattern
  * Text Blocks Pattern

This is just a teaser site: I expect (or at least, I would love to see)
a full book or website along those lines.

2) The second is my book:

  The XML & SGML Cookbook: Recipes for Structured Information
  Charles F. Goldfarb Series on Open Information Management
  Prentice Hall, 1998, ISBN 0-13-614223-0, 650 pages

In researching that book I spent 1 year looking through as many DTDs
as I could to try to discover the patterns they contained. I began
trying to use Alexander's pattern approach (which is just as much a
rhetorical form as it is a methodology) but ultimately I abandoned it
because:

    * Many entries were trivial (e.g., Q. "When should you use a list
pattern?" A. "Whenever you need a list")

    * Many entries only made sense after an analytical framework was
first established:  in fact, the first third of the book ("Systems of
Documents") was spent establishing such a framework/vocabulary
for the second third ("Document Patterns"). The last third of the
book ("Characters & Glyphs") similarly had to cover a lot of
analytical ground (e.g. the ISO Character/Glyph model) before
moving on to patterns.

    * It was clear that the SGML literature had not even begun to cover
this kind of area: I am not smart enough to single-handedly establish
a pattern vocabulary--indeed, it is only possible as a community
effort--so the best I could do was to try to set things up.

    * There was some opposition to the idea that you could usefully
construct DTDs from prefabricated components, rather than by doing
extensive document analysis.  So the pattern approach was dismissed by
some: in particular, the idea that one could appropriate elements from
a "toy" DTD like HTML; of course, with the advent of namespaces, the
idea of reusable vocabularies is now utterly accepted: I don't know if
my book contributed to that.

    * Many patterns made sense only in distinction from some other type:
so I moved towards a more "X versus Y" style.

    * Alexander gave an interview (in "Computer Languages" magazine?
perhaps with Michael Swaine?) in which he said how disappointing the
results of the early uses of his pattern language in architecture had
been. He said that rather than creating functional and innovative
buildings, people applied patterns and made buildings that looked the
same. So pattern languages seemed good for QA but not for excellence.
Because I view DTDs as a tool for software engineering rather than data
modeling, it made sense to try to integrate patterns into a software
engineering framework, from my point of view.

Here is most of the "Index of Patterns, Structures & Forms" from my book:
    * active versus passive (DTD style)
    * address
    * analytical domain
    * architectural form
    * attribute v. element (DTD style)
    * base (element set)
    * building block versus paragon (DTD style)
    * calendar
    * catalog
    * character versus glyph
    * character set
    * citation
    * class (attribute)
    * color
    * continuation paragraph
    * core (element set)
    * country code
    * cross-reference
    * data attribute for element
    * database versus literature (DTD style)
    * date
    * default value list
    * definition list
    * derived (element set)
    * description table
    * document
    * editorial structure (view)
    * element reference (reflection)
    * embedded data
    * encoding
    * fielded names
    * fielded text
    * floating elements
    * font
    * fragment
    * generic versus specific (DTD style)
    * gestural domain
    * hyperlink
    * identifier
    * IETM
    * information unit
    * inline versus interlaced
    * internal markup versus external markup (DTD style)
    * language
    * language codes
    * lexical type
    * linear versus nested (DTD style)
    * line
    * logical domain
    * loose versus tight (DTD style)
    * marketplace versus hierarchy (DTD style)
    * metadata
    * microdocument
    * name (ID)
    * nested paragraph
    * note (endnote, footnote, annotation, warning, caution)
    * occurrence
    * page object (View)
    * page layout (View)
    * paragraph
    * paragraph group (aka formal paragraph)
    * pool
    * prototypes
    * reusable components
    * ruby annotation
    * running text
    * self-labelling versus external labelling (DTD style)
    * semi-graphical text
    * sequence (generic)
    * sequence (list)
    * schema
    * stylesheet
    * subparagraph
    * table
    * text block
    * time and space
    * type extension
    * unspecified
    * visual domain
    * word segment

(You can imagine my surprise when a review said that most of my book
was found elsewhere, when in fact, it is still the only thing available
pursuing the pattern idea--though not the literature form :-(   )

One very easy way to gather patterns is to look in DTDs for the things
that parameter entities name. These groupings often correspond to
what people may think of as the XML equivalent of the OO people's
appropriation of Alexander's patterns.
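
As a rough illustration of that gathering step, the sketch below lists the
names of the parameter entities declared in a DTD held in a string. The
function name parameterEntityNames is an assumption made for illustration, and
the regular expression is deliberately naive: it does not skip declarations
inside comments or conditional sections, and it reads no external subsets.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Collect the names of parameter entity declarations (<!ENTITY % name ...>)
// in document order. General entities (no '%') are ignored.
std::vector<std::string> parameterEntityNames(const std::string& dtd) {
    static const std::regex decl(R"(<!ENTITY\s+%\s+(\S+)\s)");
    std::vector<std::string> names;
    for (std::sregex_iterator it(dtd.begin(), dtd.end(), decl), end;
         it != end; ++it) {
        names.push_back((*it)[1].str());  // capture group 1 is the entity name
    }
    return names;
}
```

Each returned name is a candidate grouping to inspect by hand; whether it
really names a pattern is still a judgment call.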

3) The third is my schema language and tools: Schematron, which was
designed to support patterns. The current design only allows labelling
of found structures as patterns rather than specification of patterns
in the abstract (that is possible to implement, but a long way off: as
long as we are all fixated on grammars or classes or other
implementation/modeling paradigms there is little chance of stepping
back for a more general view).

Available at: http://www.ascc.net/xml/schematron/schematron.html
There is an interview on schematron with XML-DEV regular Simon
St.Laurent at: http://www.xmlhack.com/read.php?item=121


If anyone is interested in pursuing patterns further, please do any of
the following:
    * email me or this list
    * read the HTML page
    * buy or read my book!
    * read "the Gang of Four" pattern books from Addison Wesley, and
    also the excellent "Anti-Patterns" book from John Wiley
    * you can find patterns tacitly lurking in most good SGML/XML books
    (that are not just introductions) such as Eve Maler's or Dave
    Megginson's books.

Rick Jelliffe

P.S. In case this post is lost, Toivo Lainevool is following on from a
post of Don Park on XML-DEV which mentions three "preliminary XML design
patterns" to aid thinking about XML:
    * pockets (elements that could provide an equivalent information
    set as the element/attribute distinction provides in normal XML)
    * parental guidance (elements that could provide an equivalent
    information set as provided by attributes that apply to child
    elements)
    * road signs (elements that could provide an equivalent information
    set to attributes or PIs (unclear, sorry))




From t.haraguti at computer.org  Sun Dec 12 12:07:34 1999
From: t.haraguti at computer.org (Tetsuharu Haraguchi)
Date: Mon Jun  7 17:18:30 2004
Subject: Scope entity
Message-ID: <38538F6D.9B26E9DF@computer.org>

Hi, everybody!

  I think it is useful to add 'the scope entity' to the linked
document.

Examples :

<story>
  <public>
    <abstruct>Show me!</abstruct>
  </public>
  <private>
    <detail>Do not show me!</detail>
  </private>
</story>

or

<story>
   <abstruct scope="public">Show me!</abstruct>
   <detail scope="private">Do not show me!</detail>
</story>

Cheers,
--
Tetsuharu Haraguchi
mailto:t.haraguti@computer.org




From digitome at iol.ie  Sun Dec 12 18:09:07 1999
From: digitome at iol.ie (Sean McGrath)
Date: Mon Jun  7 17:18:30 2004
Subject: Announce: Pyxie - an Open Source XML Processing Library for
  Python
Message-ID: <3.0.6.32.19991212175738.009a52f0@gpo.iol.ie>

All,

I have finally got around to putting the Pyxie library up
on the Web at http://www.pyxie.org.

I hope some of you find it useful and help me to develop
it further - either by submitting problem reports or
contributing to the development effort.

regards,

http://www.pyxie.org - an Open Source XML Processing library for Python




From david at megginson.com  Sun Dec 12 18:26:34 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:30 2004
Subject: A new XHTML PR! (1999-12-10)
Message-ID: <m3aengaudc.fsf@localhost.localdomain>

At the W3C site, I just noticed a new XHTML PR available at

  http://www.w3.org/TR/xhtml1/

I have recently left the W3C's XML activity (for personal and business
reasons only -- it just took way too much time), so I am not privy to
any internal discussions, but it looks like they've got it right this
time around.  Here are some highlights after a very, very cursory
first skim (I may have gripes later after a more careful reading):

1. A single XHTML Namespace, http://www.w3.org/1999/xhtml

2. Examples of using elements from other Namespaces inside an XHTML
   document, and of using the XHTML Namespace inside other document
   types (though there are no strict conformance criteria defined for
   either yet).

3. All element and attribute names in lower case.

4. The DOCTYPE declaration is still required for strict XHTML
   conformance (annoying, but I can live with that), and there are
   still three different DTDs.

I am particularly impressed with the HTML WG and with Tim Berners-Lee
for taking the discussion (and debate) out into the open on XML-Dev
rather than keeping it locked up inside the W3C cone of silence.  A
bit of credit should also go to us, the XML-Dev membership, who took a
lot of time debating the different sides of several difficult
questions.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Sun Dec 12 21:53:06 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:30 2004
Subject: SAX/C++: First interface draft
In-Reply-To: <14406.59198.949047.2487@localhost.localdomain>
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEEKEJAA.martind@netfolder.com>

Hi David,

Why not implement the document handler as an interface as well? Then we
would have:

#include <cstddef>  // size_t

class Locator;        // declared elsewhere in the SAX/C++ draft
class AttributeList;  // declared elsewhere in the SAX/C++ draft

// Pure interface: the event generator only ever sees this class.
class IDocumentHandler
{
public:
  virtual ~IDocumentHandler () {}  // allows deletion through the interface
  virtual void setDocumentLocator (const Locator &locator) = 0;
  virtual void startDocument (void) = 0;
  virtual void endDocument (void) = 0;
  virtual void startElement (const char * name, const AttributeList &atts) = 0;
  virtual void endElement (const char * name) = 0;
  virtual void characters (const char * ch, size_t length) = 0;
  virtual void ignorableWhitespace (const char * ch, size_t length) = 0;
  virtual void processingInstruction (const char * target, const char * data) = 0;
};

// One implementation of the interface.
class MyDocHandlerImp : public IDocumentHandler
{
public:
  MyDocHandlerImp();
  ~MyDocHandlerImp();
  void setDocumentLocator (const Locator &locator);
  void startDocument (void);
  void endDocument (void);
  void startElement (const char * name, const AttributeList &atts);
  void endElement (const char * name);
  void characters (const char * ch, size_t length);
  void ignorableWhitespace (const char * ch, size_t length);
  void processingInstruction (const char * target, const char * data);
protected:
  Locator * _locator;
};

PRO and CON:
------------
The event generator talks to a generic interface, not to a particular
implementation. However, this requires that interface.h be included
and that implementations inherit from the interfaces.

Other point of view:
--------------------
SP uses an event record, which tends to reduce the number of interface
members. In the OpenJade project, we are thinking of remaking the C++
interface of OpenSP as follows:

class IDocumentHandler
{
public:
  virtual void startElement(const StartElementEvent &event) = 0;
  virtual void endElement(const EndElementEvent &) = 0;
};

class parser: public IDocumentHandler, public SGML_XML_Application
{
public:
  void startElement(const StartElementEvent &event);
  void endElement(const EndElementEvent &);
};
Note: using an interface also allows the class implementation to use
multiple inheritance and still be interfaced with the client without
problems; this may not be the case for an ordinary class.

It is the event record which provides the description of the event type. I
understand that SAX saw its origins as a set of Java classes. Would it be too
alien to SAX to use structs instead of methods? (Just asking out of curiosity
and trying to make these worlds compatible and useful to developers.)
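
One way the two styles could coexist is an adapter that receives
method-style callbacks and forwards them as event records. The sketch below
is only an illustration under simplifying assumptions: the event structs carry
nothing but a name, both interfaces are cut down to two members, and the names
IEventHandler, RecordAdapter and TraceHandler are hypothetical, not part of
SAX or OpenSP.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, heavily reduced event records.
struct StartElementEvent { std::string name; };
struct EndElementEvent   { std::string name; };

// Event-record style: one struct argument per callback.
class IEventHandler {
public:
    virtual ~IEventHandler() {}
    virtual void startElement(const StartElementEvent& event) = 0;
    virtual void endElement(const EndElementEvent& event) = 0;
};

// Method style, in the spirit of the SAX draft (attributes omitted).
class IDocumentHandler {
public:
    virtual ~IDocumentHandler() {}
    virtual void startElement(const char* name) = 0;
    virtual void endElement(const char* name) = 0;
};

// Adapter: looks like a method-style handler to the parser and
// repackages each call as an event record for the real handler.
class RecordAdapter : public IDocumentHandler {
public:
    explicit RecordAdapter(IEventHandler& target) : target_(target) {}
    void startElement(const char* name) override {
        target_.startElement(StartElementEvent{name});
    }
    void endElement(const char* name) override {
        target_.endElement(EndElementEvent{name});
    }
private:
    IEventHandler& target_;
};

// Example record-style handler that logs what it receives.
class TraceHandler : public IEventHandler {
public:
    std::vector<std::string> log;
    void startElement(const StartElementEvent& e) override {
        log.push_back("<" + e.name + ">");
    }
    void endElement(const EndElementEvent& e) override {
        log.push_back("</" + e.name + ">");
    }
};
```

A parser written against the method-style interface can then drive a
record-style handler without either side knowing about the other.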

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From martind at netfolder.com  Sun Dec 12 22:26:04 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:30 2004
Subject: About a C++ interface for event handling
Message-ID: <NBBBJPGDLPIHJGEHAKBAMEEKEJAA.martind@netfolder.com>

Hi,

In order to get a good C++ mapping for XML parsing, and also to take the
occasion to merge with the SGML world so that one parser could do both,
here is what we already have as an interface in OpenSP (version 1.14,
soon to be released):

class DocumentHandler
{
public:
  virtual void appinfo(const AppinfoEvent &) = 0;
  virtual void startDtd(const StartDtdEvent &) = 0;
  virtual void endDtd(const EndDtdEvent &) = 0;
  virtual void endProlog(const EndPrologEvent &) = 0;
  virtual void startElement(const StartElementEvent &) = 0;
  virtual void endElement(const EndElementEvent &) = 0;
  virtual void data(const DataEvent &) = 0;
  virtual void sdata(const SdataEvent &) = 0;
  virtual void pi(const PiEvent &) = 0;
  virtual void externalDataEntityRef(const ExternalDataEntityRefEvent &) = 0;
  virtual void subdocEntityRef(const SubdocEntityRefEvent &) = 0;
  virtual void nonSgmlChar(const NonSgmlCharEvent &) = 0;
  virtual void commentDecl(const CommentDeclEvent &) = 0;
  virtual void markedSectionStart(const MarkedSectionStartEvent &) = 0;
  virtual void markedSectionEnd(const MarkedSectionEndEvent &) = 0;
  virtual void ignoredChars(const IgnoredCharsEvent &) = 0;
  virtual void generalEntity(const GeneralEntityEvent &) = 0;
  virtual void error(const ErrorEvent &) = 0;
  virtual void openEntityChange(const OpenEntityPtr &) = 0;
};

Because it is more convenient in C++ to work with either classes or structs,
each member provides an event structure to the interface implementation.
Obviously, for XML some of these members are useless, so what we could have
now is an XML interface with fewer members and an SGML interface with more
members which inherits from the XML interface. This way the subset/superset
relationship between XML and SGML would also be expressed through class
inheritance.

So, for XML what would be useful is:

class IXMLDocumentHandler
{
public:
  virtual void startDtd(const StartDtdEvent &) = 0;
  virtual void endDtd(const EndDtdEvent &) = 0;
  virtual void startElement(const StartElementEvent &) = 0;
  virtual void endElement(const EndElementEvent &) = 0;
  virtual void pi(const PiEvent &) = 0;
  virtual void externalDataEntityRef(const ExternalDataEntityRefEvent &) = 0;
  virtual void commentDecl(const CommentDeclEvent &) = 0;
  virtual void error(const ErrorEvent &) = 0;
};


Did I forget anything for XML? So the ISGMLDocumentHandler interface would
inherit from the IXMLDocumentHandler interface and add what is necessary for
SGML:

class ISGMLDocumentHandler: public IXMLDocumentHandler
{
public:
  virtual void endProlog(const EndPrologEvent &) = 0;
  virtual void data(const DataEvent &) = 0;
  virtual void sdata(const SdataEvent &) = 0;
  virtual void subdocEntityRef(const SubdocEntityRefEvent &) = 0;
  virtual void nonSgmlChar(const NonSgmlCharEvent &) = 0;
  virtual void markedSectionStart(const MarkedSectionStartEvent &) = 0;
  virtual void markedSectionEnd(const MarkedSectionEndEvent &) = 0;
  virtual void ignoredChars(const IgnoredCharsEvent &) = 0;
  virtual void generalEntity(const GeneralEntityEvent &) = 0;
  virtual void openEntityChange(const OpenEntityPtr &) = 0;
};
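
A practical consequence of that inheritance, sketched under simplifying
assumptions: code written against the XML interface also accepts any SGML
handler. The interfaces below are reduced to one member each, and the event
structs, emitXmlEvents and CountingHandler are hypothetical names used only
for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical, reduced event records.
struct StartElementEvent       { std::string name; };
struct MarkedSectionStartEvent { std::string status; };

// XML subset of the callbacks (one member kept for brevity).
class IXMLDocumentHandler {
public:
    virtual ~IXMLDocumentHandler() {}
    virtual void startElement(const StartElementEvent&) = 0;
};

// SGML superset: inherits the XML callbacks and adds its own.
class ISGMLDocumentHandler : public IXMLDocumentHandler {
public:
    virtual void markedSectionStart(const MarkedSectionStartEvent&) = 0;
};

// Stand-in for a parser core that only knows about XML events.
void emitXmlEvents(IXMLDocumentHandler& handler) {
    handler.startElement(StartElementEvent{"doc"});
}

// An SGML handler; usable wherever an XML handler is expected.
class CountingHandler : public ISGMLDocumentHandler {
public:
    int elements = 0;
    int markedSections = 0;
    void startElement(const StartElementEvent&) override { ++elements; }
    void markedSectionStart(const MarkedSectionStartEvent&) override {
        ++markedSections;
    }
};
```

The reverse does not hold: a handler that implements only the XML interface
cannot be handed to code that needs to deliver the SGML-only events.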

Comments?

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web Boston (http://www.mfweb.com)
Markup 99 (http://www.gca.com)
Book:
To come soon: XML Pro published by Wrox Press
Products:
http://www.netfolder.com




From liamquin at interlog.com  Mon Dec 13 00:29:40 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun  7 17:18:30 2004
Subject: XML Design Patterns
In-Reply-To: <004e01bf4480$39caadf0$5df96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <Pine.BSI.3.96r.991212185627.21987D-100000@shell1.interlog.com>

On Sun, 12 Dec 1999, Rick Jelliffe wrote:

> 1) The first is Ian Graham and Liam Quin's web pages "Introduction to
> XML Design Patterns" at
> http://www.groveware.com/xmlbook/patterns.html

Thanks for mentioning this, Rick.

> This is just a teaser site: I expect (or at least, I would love to see)
> a full book or website along those lines.
I'd like to do more, but I'm too busy writing the xml database book
right now :|  And Ian is busy with his HTML books.

I'd like to write more about computer typography & xml, too.

[...]
>     * There was some opposition to the idea that you could usefully
> construct DTDs from prefabricated components, rather than by doing
> extensive document analysis.

There was opposition to programming languages too, since assembly
languages are "more efficient".  Object Oriented programming
has introduced "inefficiencies" too, at the hardware level, but
the extra efficiencies at the software level more than pay for
the difference.

I do agree that design patterns can be misused, both in programming
and elsewhere.  You have to do the analysis and then apply the
patterns.

I think they are most useful to learn from, not to apply blindly.

Lee

-- 
Liam Quin, Barefoot Computing, Toronto;  The barefoot wizard
l i a m    at    h o l o w e b    dot    n e t
Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/




From Steve.Ball at zveno.com  Mon Dec 13 01:09:17 1999
From: Steve.Ball at zveno.com (Steve Ball)
Date: Mon Jun  7 17:18:30 2004
Subject: GUI XML doc authoring tools
References: <000b01bf4028$38c669a0$90fc4dd8@bhm.bellsouth.net>
Message-ID: <3854478C.F9BA8E07@zveno.com>

Jeff Russell wrote:
> 
> Anybody know of any Windows GUI (or Linux, as a last resort) XML document
> authoring tools? Something like SoftQuad's XMeTaL, but that doesn't
> require a DTD.

I have just uploaded a new version of our XML Editor, Swish, to the
FTP site: ftp://ftp.zveno.com/swish/Swish-1.0b5.tar.gz

Swish is a non-validating XML editor.  
It is cross-platform, Unix, Windows and Macintosh, and is now
Open Source.

Unfortunately, we're running behind on updating the website, but if
you're interested then please drop me an email for installation and
usage instructions.

Regards,
Steve Ball

-- 
Steve Ball            |   Swish XML Editor    | Training & Seminars
Zveno Pty Ltd         |   Web Tcl Complete    |      XML XSL
http://www.zveno.com/ |    TclXML TclDOM      | Tcl, Web Development
Steve.Ball@zveno.com  +-----------------------+---------------------
Ph. +61 2 6242 4099   | Mobile (0413) 594 462 | Fax +61 2 6242 4099



From l-arcini at uniandes.edu.co  Mon Dec 13 04:41:33 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun  7 17:18:30 2004
Subject: XML Design Patterns
In-Reply-To: <Pine.BSI.3.96r.991212185627.21987D-100000@shell1.interlog.com>
Message-ID: <Pine.GSO.3.96.991212232400.29612C-100000@isis>

Hi!

I also wrote something about XML and design patterns (I've been working
on the subject for some months now). It is focused on
structural patterns in applications using XML for their persistence.
I'm preparing an introductory article on some of those patterns for
XML.com, so I hope you can see that soon. In the meantime you can see a
slightly outdated (April 99) version of some of the ideas at:

http://wwwest.uniandes.edu.co/~l-arcini/xmlablePattern.ps 

Best,
	Fabio

Fabio Arciniegas A.				fabio@viaduct.com
Viaduct Technologies Inc.
Interests: XML, Wittgenstein, and just about everything in between...

> On Sun, 12 Dec 1999, Rick Jelliffe wrote:
> 
> > 1) The first is Ian Graham and Liam Quin's web pages "Introduction to
> > XML Design Patterns" at
> > http://www.groveware.com/xmlbook/patterns.html
> 
> Thanks for mentioning this, Rick.
> 
> > This is just a teaser site: I expect (or at least, I would love to see)
> > a full book or website along those lines.
> I'd like to do more, but I'm too busy writing the xml database book
> right now :|  And Ian is busy with his HTML books.
> 
> I'd like to write more about computer typography & xml, too.
> 
> [...]
> >     * There was some opposition to the idea that you could usefully
> > construct DTDs from prefabricated components, rather than by doing
> > extensive document analysis.
> 
> There was opposition to programming languages too, since assembly
> languages are "more efficient".  Object Oriented programming
> has introduced "inefficiencies" too, at the hardware level, but
> the extra efficiencies at the software level more than pays for
> the difference.
> 
> I do agree that design patterns can be misused, both in programming
> and elsewhere.  You have to do the analysis and then apply the
> patterns.
> 
> I think they are most useful to learn from, not to apply blindly.
> 
> Lee
> 
> -- 
> Liam Quin, Barefoot Computing, Toronto;  The barefoot wizard
> l i a m    at    h o l o w e b    dot    n e t
> Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/
> 
> 

--
Fabio Arciniegas Arjona              
l-arcini@uniandes.edu.co            
                                



From ricko at allette.com.au  Mon Dec 13 06:52:20 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:30 2004
Subject: XML Design Patterns
Message-ID: <002501bf453a$1b06e680$36f96d8c@NT.JELLIFFE.COM.AU>


From: Fabio Arciniegas A. <l-arcini@uniandes.edu.co>
>I also wrote something about XML and design patterns (I've been working
>on the subject for some months now), it is focused on
>structural patterns in applications using XML for their persistence.
>I'm preparing an introductory article for some of those patterns for
>XML.com, so I hope you can see that soon. In the meantime you can see a
>little outdated(april 99) version of some of the ideas at:
>
>http://wwwest.uniandes.edu.co/~l-arcini/xmlablePattern.ps

This is a good paper. Toivo Lainevool has also privately sent me an
interesting pattern.

Fabio's XMLable pattern is more concerned with using Alexander Patterns
for OO program design.  Toivo's pattern is more concerned with using
Alexander Patterns for DTD design.  Liam's patterns are more concerned
with using Alexander Patterns for DTD implementation. My book is about
design and implementation and not at all about programs.

I look forward to Fabio releasing his work: it is important for fitting
XML into mainstream OO discourse.  But the patterns I would call for are
more in the area of what goes on inside a document: apart from the
promise of namespaces, XML has resulted in no advances in analytical
methodologies, as far as I can see.  I hope that XML Schemas, Schematron,
Express, AT&T's DSD, etc., will provide a base technology which people
can use to explore new paradigms. I think the two most useful areas are
"patterns/assertion grammars" and "cohesion and coupling analysis"; it
is quite probable that the XML Schema language will allow class or
inheritance in some form, which again will provide a more direct path
from a high-level analysis to an implementation.

Rick Jelliffe




From Sajeev_M1 at verifone.com  Mon Dec 13 07:01:00 1999
From: Sajeev_M1 at verifone.com (Sajeev M.)
Date: Mon Jun  7 17:18:30 2004
Subject: problem with included dtds
Message-ID: <F9FBA0D1187BD11188B200A0C9979DF902DB5687@blrmail.india.hp.com>


	Hi,
	    I have a problem with DTDs. I have three DTDs, all of which have
a few fields in common. If I try to make a DTD of those fields and include it in
all three, the definitions will clash and the DTD becomes invalid.
	Can I use namespaces in DTDs? If so, how?

	thanks
	Sajeev




From sb at metis.no  Mon Dec 13 08:26:40 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:30 2004
Subject: SAX and <!DOCTYPE>
In-Reply-To: Lars Marius Garshol's message of "11 Dec 1999 14:41:41 +0100"
References: <whln77f8hq.fsf@viffer.oslo.metis.no> <m3r9gzdqx0.fsf@ifi.uio.no> <whk8mort34.fsf@viffer.oslo.metis.no> <m3emctd27e.fsf@ifi.uio.no>
Message-ID: <wh66y3456s.fsf@viffer.oslo.metis.no>

>>>>> Lars Marius Garshol <larsga@garshol.priv.no>:

> There is an equivalent to a cloning function in the AttributeListImpl
> class already:

>   <URL: http://www.megginson.com/SAX/javadoc/org.xml.sax.helpers.AttributeListImpl.html >

Yes, but unless you implement a cloning function in the AttributeList
interface, it has to iterate through the AttributeList, using the
public interface to make a deep copy, and this means we can't optimize
by using a refcounting shallow copy.
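
A minimal sketch of the trade-off described above, written in Python
rather than Java. The class and method names are hypothetical and only
loosely modelled on the SAX 1.0 AttributeList accessors; they are not
the SAX API itself. A copy made through the public interface has to
visit every attribute, while a clone method on the class can share
immutable storage and let the runtime's reference counting manage it.

```python
class AttributeList:
    """Hypothetical read-only attribute list with SAX-like accessors."""

    def __init__(self, pairs):
        self._pairs = list(pairs)

    def get_length(self):
        return len(self._pairs)

    def get_name(self, i):
        return self._pairs[i][0]

    def get_value(self, i):
        return self._pairs[i][1]


def deep_copy(atts):
    """Copy using only the public accessors: one pass over every attribute."""
    return AttributeList(
        (atts.get_name(i), atts.get_value(i)) for i in range(atts.get_length())
    )


class SharedAttributeList(AttributeList):
    """Hypothetical variant whose clone() shares storage instead of copying."""

    def __init__(self, pairs):
        self._pairs = tuple(pairs)  # immutable, so it is safe to share

    def clone(self):
        other = SharedAttributeList.__new__(SharedAttributeList)
        other._pairs = self._pairs  # same tuple object; Python refcounts it
        return other
```

The second variant is only possible because clone() is part of the
class itself; an outside helper such as deep_copy() cannot reach the
internal storage.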



From mah at oxiasoft.com  Mon Dec 13 14:56:11 1999
From: mah at oxiasoft.com (Hemissi Maher)
Date: Mon Jun  7 17:18:30 2004
Subject: MicroSoft XML DOM
Message-ID: <011601bf457a$67b36540$100a0a0a@mah>

Hi all,

How can I distribute XML DOM to people who don't have IE4 or later?

Thanks 
Hemissi Maher
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991213/858169dc/attachment.htm
From rja at arpsolutions.demon.co.uk  Mon Dec 13 15:49:58 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:30 2004
Subject: MicroSoft XML DOM
References: <011601bf457a$67b36540$100a0a0a@mah>
Message-ID: <001101bf4581$41962580$c5010180@p197>

Make them install IE4 ;)

or

change over to ActiveDOM, which is fairly compatible:
http://www.vivid-creations.com/dom/index.htm

----- Original Message -----
From: Hemissi Maher <mah@oxiasoft.com>
To: <xml-dev@ic.ac.uk>
Sent: 13 December 1999 14:56
Subject: MicroSoft XML DOM


Hi all,

How can i distribute XML DOM for people that don't have IE4 or later ?

Thanks
Hemissi Maher




From francis at redrice.com  Mon Dec 13 17:05:17 1999
From: francis at redrice.com (Francis Norton)
Date: Mon Jun  7 17:18:30 2004
Subject: MicroSoft XML DOM
References: <011601bf457a$67b36540$100a0a0a@mah> <001101bf4581$41962580$c5010180@p197>
Message-ID: <38552667.CFC6BA1A@redrice.com>

Or download the redistributable MSXML parser from
http://msdn.microsoft.com/downloads/tools/xmlparser/xmlparser.asp - if
you want a COM interface for Windows users. There are cross-platform
options too.

Francis.

Richard Anderson wrote:
> 
> Make them install IE4 ;)
> 
> or
> 
> change over to ActiveDOM which is fairly compatible :
> http://www.vivid-creations.com/dom/index.htm
> 
> ----- Original Message -----
> From: Hemissi Maher <mah@oxiasoft.com>
> To: <xml-dev@ic.ac.uk>
> Sent: 13 December 1999 14:56
> Subject: MicroSoft XML DOM
> 
> Hi all,
> 
> How can i distribute XML DOM for people that don't have IE4 or later ?
> 
> Thanks
> Hemissi Maher
>



From john.aldridge at informatix.co.uk  Mon Dec 13 17:25:19 1999
From: john.aldridge at informatix.co.uk (John Aldridge)
Date: Mon Jun  7 17:18:30 2004
Subject: MicroSoft XML DOM
In-Reply-To: <38552667.CFC6BA1A@redrice.com>
References: <011601bf457a$67b36540$100a0a0a@mah>
 <001101bf4581$41962580$c5010180@p197>
Message-ID: <3.0.6.32.19991213172359.00b0ec70@mailhost>

At 17:01 13/12/99 +0000, Francis Norton <francis@redrice.com> wrote:
>Or download the redistributable MSXML parser from
>http://msdn.microsoft.com/downloads/tools/xmlparser/xmlparser.asp

Unfortunately, this page says:

1. GRANT OF LICENSE. This EULA grants you the following rights:

Installation and Use. You may install and use an unlimited number of copies
of the SOFTWARE PRODUCT only in conjunction with validly licensed copies of
Microsoft Internet Explorer version 4.0 Service Pack 1 or greater...

-- 
Cheers,
John



From dhunter at Mobility.com  Mon Dec 13 17:29:09 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:30 2004
Subject: MicroSoft XML DOM
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC01A7@cc20exch2.mobility.com>

This still requires IE4 or later.  (It's not immediately apparent, but one
of the download pages on Microsoft's site mentions it.)  MSXML uses IE for
things like loading an XML document from a URL.

-----Original Message-----
From: Francis Norton [mailto:francis@redrice.com]
Sent: Monday, December 13, 1999 12:01 PM
To: Richard Anderson
Cc: xml-dev@ic.ac.uk
Subject: Re: MicroSoft XML DOM


Or download the redistributable MSXML parser from
http://msdn.microsoft.com/downloads/tools/xmlparser/xmlparser.asp - if
you want a COM interface for windows users. There are cross-platform
options too.

Francis.



From rja at arpsolutions.demon.co.uk  Mon Dec 13 17:34:30 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:30 2004
Subject: MicroSoft XML DOM
References: <011601bf457a$67b36540$100a0a0a@mah> <001101bf4581$41962580$c5010180@p197> <38552667.CFC6BA1A@redrice.com>
Message-ID: <000901bf458f$da14a850$c5010180@p197>

Except that STILL NEEDS IE4 to be installed.


----- Original Message ----- 
From: Francis Norton <francis@redrice.com>
To: Richard Anderson <rja@arpsolutions.demon.co.uk>
Cc: <xml-dev@ic.ac.uk>
Sent: 13 December 1999 17:01
Subject: Re: MicroSoft XML DOM


> Or download the redistributable MSXML parser from
> http://msdn.microsoft.com/downloads/tools/xmlparser/xmlparser.asp - if
> you want a COM interface for windows users. There are cross-platform
> options too.
> 
> Francis.
> 
> Richard Anderson wrote:
> > 
> > Make them install IE4 ;)
> > 
> > or
> > 
> > change over to ActiveDOM which is fairly compatible :
> > http://www.vivid-creations.com/dom/index.htm
> > 
> > ----- Original Message -----
> > From: Hemissi Maher <mah@oxiasoft.com>
> > To: <xml-dev@ic.ac.uk>
> > Sent: 13 December 1999 14:56
> > Subject: MicroSoft XML DOM
> > 
> > Hi all,
> > 
> > How can i distribute XML DOM for people that don't have IE4 or later ?
> > 
> > Thanks
> > Hemissi Maher
> >




From martind at netfolder.com  Mon Dec 13 17:51:34 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:30 2004
Subject: need clarifications on XPath vs pattern match
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEFMEJAA.martind@netfolder.com>

Hi

I got a comment on my transformation chapter that raised some doubts.
The comment concerns the following sentence:

"XPath is not based on XML notation but more as a string notation which
resemble a file system path. It is intended, for instance, to be used as a
string appended to a URL used to access an XML document or as a value for
the XSLT template match attribute"

"That's a bad example because the match attribute is a pattern and patterns
are defined in XSLT, not in XPath."

I always thought that the match attribute took an XPath expression as its
value. Am I wrong to think that? Am I also wrong to think that XPointer
uses XPath expressions?

Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From lisarein at finetuning.com  Mon Dec 13 18:32:58 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:18:30 2004
Subject: need clarifications on XPath vs pattern match
References: <NBBBJPGDLPIHJGEHAKBAEEFMEJAA.martind@netfolder.com>
Message-ID: <38553C23.7FDD21B2@finetuning.com>

Didier PH Martin wrote:
 
> I always thought that the match attribute was taking XPath expression has
> value. I am wrong to think that? 

1) No, you are not wrong to think that it CAN be an XPath expression
value. But it does not HAVE to be -- it can simply be "matching" an
element name, or attribute name, etc.  An XPath expression can also be
used to make that "match".

Then Didier asked:
> Am I wrong also to think that XPointer is
> using XPath expressions?

2) As far as XPointer using XPath... that IS something we can all agree on
(I hope :-)  By definition, XPointer extends XPath, yes?  But upon
reading the spec, XPointer appears to operate on the information set,
rather than strictly as an extension of an XPath expression... hmmmmm?
Does that mean that XPointer can be used to extend OTHER path languages?
Or at least, the path functions of other path/query languages (such as
XQL)?

So... can XPointer be used to extend XQL, for example?  (Or am I just
really confusing the issues at this point???)

I look forward to having both of these assumptions (assumptions on my
part anyway) confirmed or corrected :-)

thanks everybody,

lisa 


http://www.finetuning.com



From roddey at us.ibm.com  Mon Dec 13 19:38:50 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:30 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <87256846.006BCA94.00@d53mta03h.boulder.ibm.com>


>> 11) The class names (since we can't afford to use C++ namespaces) should
>> be expanded to include a SAX prefix to avoid clashes. So SAXParser and
>> SAXLocator and SAXAttributeList and so on.
>
>Is it true that C++ namespaces are still a problem on any platform?  I
>know that they actually do work under Windows, and the newer EGCS/GCC
>have supported them for a while for all *nix variants (including
>Linux) -- is it the Mac that doesn't have a proper C++ compiler yet?

Unfortunately, a number of the compilers supported by XML4C have no
namespace support, no bool support, limited template support, no mutable
member support, etc...

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From roddey at us.ibm.com  Mon Dec 13 19:36:16 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:31 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <87256846.006B8850.00@d53mta03h.boulder.ibm.com>


>> 2) We would prefer that all data come out of the SAX interfaces as
>> raw wchar_t strings. This is the most flexible mechanism and does
>> not lock people into using any particular implementation of a string
>> object. It also has the highest potential performance for those
>> folks who never need to put it into anything more formal than a raw
>> array.
>
>std::basic_string<> _is_ a modern service of C++, and a pretty good
>one from an API point of view.
>
>Personally I say: use std::basic_string<> and death to all other
>string representations in C++.
>

You are never going to win this argument. If you do try to force this, this
'standard' will die on the vine. As an example, and I'm speaking for me
personally here, not IBM... I'm adding an XML parser to my CIDLib C++
libraries. There is zero chance that I'll use any standard library
functionality in it, because the whole point of it is to not use the
standard library, since it's intended to (among many other things) replace
the standard library with a much more powerful and integrated system. It
gets high portability by having zero system or runtime headers show up
outside of a very small core virtual kernel. Any standard which required me
to use standard library stuff would be a non-starter, and I'd have no
choice but to ignore it.

So either it has to be wchar_t or it's left to the implementation how it
will spit the stuff out. Of those two, wchar_t is the only one that will
make this standard remotely standard. If everyone has to ignore the
standard because it forces the use of stuff that they can't make use of,
then it's not much of a standard really and will just be a waste of time.

Just my opinion of course...

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From martind at netfolder.com  Mon Dec 13 21:23:13 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:31 2004
Subject: Java a standard? not necessarily...
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEGCEJAA.martind@netfolder.com>

Hi

I thought this may be interesting for those who think that Java is a
language normalized as a standard (and interesting for the others too).

http://www.zdnet.com/zdnn/stories/news/0,4586,2407597,00.html

Conclusion: it is now clear that Java is controlled by Sun. No more ambiguities.
Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From martind at netfolder.com  Mon Dec 13 21:26:29 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:31 2004
Subject: About Java
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEGDEJAA.martind@netfolder.com>

Hi,

In the last message I forgot to mention that Sun said that they will "give
away all source code, run times and binaries of J2SE, eliminating any
subsequent royalty fees and requiring only the passage of compatibility test
suites."

Thanks to Sun, they are doing the right thing. Some ISVs and OpenSource
groups are really anxious to get the code ;-)

Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web Boston (http://www.mfweb.com)
Markup 99 (http://www.gca.com)
Book:
To come soon: XML Pro published by Wrox Press
Products:
http://www.netfolder.com




From Alex_Garrett at coregis.com  Mon Dec 13 22:09:47 1999
From: Alex_Garrett at coregis.com (Alex Garrett)
Date: Mon Jun  7 17:18:31 2004
Subject: XQL question
Message-ID: <85256846.0079F503.00@smtp.apprise.com>


The XQL doc at the W3 Consortium (http://www.w3.org/TandS/QL/QL98/pp/xql.html)
states:

book[index() $le$ 2] "Find[s] the first two books" in its sample data set
(S. 3.3). Assuming there are more than two books in the document, wouldn't
this find the first three (books 0, 1, & 2)? As a [foo]ML newbie, I feel
certain I've misunderstood something down the line. Please advise.

Cheers,
     Alex




From martind at netfolder.com  Mon Dec 13 22:18:37 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:31 2004
Subject: need clarifications on XPath vs pattern match
In-Reply-To: <38553C23.7FDD21B2@finetuning.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAMEGDEJAA.martind@netfolder.com>

Hi Lisa

Lisa said:
1) No, you are not wrong to think that it CAN be an XPath expression
value.
But it does not HAVE to be -- it can simply be "matching" an element
name, or attribute name, etc.  An XPath expression can also be used to
make that "match".

Didier replies:
So, you say that a pattern expression can potentially have a different
syntax than an XPath expression, or that a pattern match expression might not
be contained in the XPath specs (otherwise it is an XPath expression). Does
this mean that I can create my own expression and be compliant with the
specs, or that only the pattern match expressions mentioned in the specs are
valid? In the latter case, the pattern match is quite limited.
Gee, I am having trouble interpreting the specs. Do we need jurisprudence?
If yes, who establishes that jurisprudence? Why did I wake up this morning?
Life was easier in my bed :-))

Since reading your message I have read the specs again and found these text
fragments, which lead me to conclude that a pattern match expression is an XPath
expression (I cannot create my own expression - it has to be part of the
XPath specifications - more precisely a subset, or any expression that may
lead to a "node-set").

"XSLT uses the expression language defined by XPath [XPath]. Expressions are
used in XSLT for a variety of purposes including:

 - selecting nodes for processing;
 - specifying conditions for different ways of processing a node;
 - generating text to be inserted in the result tree."
.....
.....
"The syntax for patterns is a subset of the syntax for expressions. In
particular, location paths that meet certain restrictions can be used as
patterns. An expression that is also a pattern always evaluates to an object
of type node-set. A node matches a pattern if the node is a member of the
result of evaluating the pattern as an expression with respect to some
possible context; the possible contexts are those whose context node is the
node being matched or one of its ancestors."

So, it seems that it is not the full set but a subset of XPath, as long
as the expression points to an object of type "node-set". I guess that
node-set will be defined in the information set specification. Is it?
Will it be? OK, now I have a new problem: where is a node-set defined? Yep, I
was feeling better in bed this morning :-))

David, where are you? Have you defined, or will you define, what a
"node-set" is in the information set specifications?

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From msf at mds.rmit.edu.au  Mon Dec 13 22:57:22 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:31 2004
Subject: need clarifications on XPath vs pattern match
In-Reply-To: <NBBBJPGDLPIHJGEHAKBAMEGDEJAA.martind@netfolder.com>; from Didier PH Martin on Mon, Dec 13, 1999 at 05:15:47PM -0500
References: <38553C23.7FDD21B2@finetuning.com> <NBBBJPGDLPIHJGEHAKBAMEGDEJAA.martind@netfolder.com>
Message-ID: <19991214095241.A12486@io.mds.rmit.edu.au>

Didier PH Martin wrote [re. XSLT patterns vs XPath expressions]:
> so, it seems that it is not the full set but it is a subset of XPath as
> long as the expression points to an object of type "node-set".

Not quite. See XSLT, section 5.2:
    "A pattern must match the grammar for Pattern. A Pattern is a set of
    location path patterns separated by |. A location path pattern is a
    location path whose steps all use only the child or attribute axes.
    Although patterns must not use the descendant-or-self axis, patterns
    may use the // operator as well as the / operator. Location path
    patterns can also start with an id or key function call with a literal
    argument.  Predicates in a pattern can use arbitrary expressions just
    like predicates in a location path."

This paragraph is followed by a grammar spelling out the above.
See also XPath, section 2 "Location Paths".
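
As a rough illustration of the restriction quoted above, here is a
small Python sketch. It is an assumption-laden simplification, not the
XSLT grammar: the function name is made up, and it ignores predicates
and id()/key() calls entirely. It only checks that every step of every
alternative stays on the child or attribute axis.

```python
ALLOWED_AXES = {"child", "attribute"}


def looks_like_pattern(expr):
    """Rough check that a location path could be an XSLT pattern.

    Simplifying assumptions: no predicates, no id()/key() calls.
    """
    for alternative in expr.split("|"):
        if not alternative.strip():
            return False  # an empty alternative is not a location path
        for step in alternative.strip().split("/"):
            step = step.strip()
            if step == "":
                continue  # empty pieces come from a leading "/" or from "//"
            if step in (".", ".."):
                return False  # abbreviations for the self and parent axes
            if "::" in step:
                axis = step.split("::", 1)[0]
                if axis not in ALLOWED_AXES:
                    return False
    return True
```

With this sketch, "chapter/title | @id" and "//para" pass, while
"ancestor::chapter" and "../title" are rejected.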

> I guess that node-set will be defined in the information set specification.
> Are they, Will they? OK now I have a new problem: where a node-set is defined?

"Node-sets" are defined in XPath, section 3.3:
    "A location path can be used as an expression. The expression returns
    the set of nodes selected by the path."
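
A small Python sketch of what "the set of nodes selected by the path"
looks like in practice. It uses the standard library's ElementTree,
which supports only a limited subset of XPath and returns a list in
document order rather than a true set; the sample document is made up
for illustration.

```python
import xml.etree.ElementTree as ET

# Made-up sample document, not taken from any spec.
doc = ET.fromstring(
    "<library>"
    "<book><title>A</title></book>"
    "<book><title>B</title></book>"
    "<magazine><title>C</title></magazine>"
    "</library>"
)

# The location path selects every title that is a child of a book child.
titles = doc.findall("./book/title")
print([t.text for t in titles])  # prints ['A', 'B']

# A path that matches nothing yields an empty collection, not an error.
print(doc.findall("./journal/title"))  # prints []
```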

> Yop, I was feeling better in bed this morning :-))

Me too.

Bottom line: you can't fully understand the XSLT spec. w/o first
understanding XPath. (Arguably, the reverse is true also. ;-)

Michael
-- 
http://www.mds.rmit.edu.au/~msf/
Multimedia Databases Group, RMIT, Australia.



From jlapp at webMethods.com  Mon Dec 13 23:19:42 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:31 2004
Subject: XQL question
Message-ID: <3.0.32.19991213180743.009614d0@nexus.webmethods.com>

You are right... 0 through 2.  FYI, XQL dropped the $$ notation due to
rampant nose wrinkling.  Also, XPath supersedes XQL as the standard
mechanism for locating data in a document, while XQL, XML-QL, Lorel and
other languages serve as input for a more general XML document/repository
query language that the W3C is designing.
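
To make the counting concrete, a small Python sketch. The list is
made-up sample data, not the data set from the XQL document. A
zero-based index tested with "<= 2" keeps three items, whereas a
one-based position, as XPath's position() uses, keeps two.

```python
books = ["book0", "book1", "book2", "book3"]

# Zero-based index, as in the XQL example: indexes 0, 1 and 2 qualify.
zero_based = [b for i, b in enumerate(books) if i <= 2]

# One-based position, as in XPath: positions 1 and 2 qualify.
one_based = [b for pos, b in enumerate(books, start=1) if pos <= 2]

print(zero_based)  # prints ['book0', 'book1', 'book2']
print(one_based)   # prints ['book0', 'book1']
```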

At 04:10 PM 12/13/99 -0600, Alex Garrett wrote:
>
>
>The XQL doc at the W3 Consortium (http://www.w3.org/TandS/QL/QL98/pp/xql.html)
>states:
>
>book[index() $le$ 2] "Find[s] the first two books" in its sample data set. (S.
>3.3)
>Assuming there are more than two books in the document, wouldn't this find the
>first three
>(books 0, 1, & 2)? as a [foo]ML newbie, I feel certain I've misunderstood
>something down
>the line. Please advise.
>
>Cheers,
>     Alex
>
>
>
--
Joe Lapp                     (Looking for some good people to
Principal Architect           help create XML technologies that
http://www.webMethods.com     connect businesses to businesses
jlapp@webMethods.com          over the web.)



From martind at netfolder.com  Tue Dec 14 00:42:26 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:31 2004
Subject: need clarifications on XPath vs pattern match
In-Reply-To: <19991214095241.A12486@io.mds.rmit.edu.au>
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEGOEJAA.martind@netfolder.com>

Hi Michael,

Michael said:
Not quite. See XSLT, section 5.2:
    "A pattern must match the grammar for Pattern. A Pattern is a set of
    location path patterns separated by |. A location path pattern is a
    location path whose steps all use only the child or attribute axes.
    Although patterns must not use the descendant-or-self axis, patterns
    may use the // operator as well as the / operator. Location path
    patterns can also start with an id or key function call with a literal
    argument.  Predicates in a pattern can use arbitrary expressions just
    like predicates in a location path."

This paragraph is followed by a grammar spelling out the above.
See also XPath, section 2 "Location Paths".
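
The axis restriction in the quoted paragraph can be sketched in C++ roughly as follows. This is only an illustration under simplifying assumptions: the path is handed over already split into steps, the // operator and the id()/key() forms are not handled, and the names axisOf and stepsFitPattern are invented here, not taken from either spec.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Name of the axis used by one location step, e.g. "child::para" -> "child".
// Abbreviated steps are expanded as in XPath: "@id" is the attribute axis,
// "." is self, ".." is parent, and a bare name test is the child axis.
std::string axisOf(const std::string& step) {
    // Ignore predicates: "para[ancestor::x]" still steps along the child axis.
    const std::string head = step.substr(0, step.find('['));
    assert(!head.empty());
    const std::string::size_type sep = head.find("::");
    if (sep != std::string::npos) return head.substr(0, sep);
    if (head == ".") return "self";
    if (head == "..") return "parent";
    if (head[0] == '@') return "attribute";
    return "child";
}

// True if every step uses only the child or attribute axis, which is the
// restriction the quoted XSLT text places on location path patterns.
bool stepsFitPattern(const std::vector<std::string>& steps) {
    for (const std::string& step : steps) {
        const std::string axis = axisOf(step);
        if (axis != "child" && axis != "attribute") return false;
    }
    return true;
}
```

So the steps of chapter/para/@id pass, while a path containing ancestor::book or .. does not, even though all of them are legal XPath location paths. Predicates are deliberately skipped, since the quoted text lets them use arbitrary expressions.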

Didier replies:
so, basically a pattern match could be a collection of XPath location paths.
Can we say that?

Michael said:
"Node-sets" are defined in XPath, section 3.3:
    "A location path can be used as an expression. The expression returns
    the set of nodes selected by the path."

Didier says:
I expected a better definition. Don't you? Obviously the name node-set
implicitly conveys the meanings of node and set. A set being a collection,
we can say that a node-set is a collection of nodes. However, I expected
something more formal for something this fundamental. Again, David, do we
have, or will we have, a more formal definition of node-set in information
sets?

Michael said:
Bottom line: you can't fully understand the XSLT spec. w/o first
understanding XPath. (Arguably, the reverse is true also. ;-)

Didier replies:
Yes I noticed :-)

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From msf at mds.rmit.edu.au  Tue Dec 14 01:07:51 1999
From: msf at mds.rmit.edu.au (Michael Fuller)
Date: Mon Jun  7 17:18:31 2004
Subject: need clarifications on XPath vs pattern match
In-Reply-To: <NBBBJPGDLPIHJGEHAKBAEEGOEJAA.martind@netfolder.com>; from Didier PH Martin on Mon, Dec 13, 1999 at 07:39:35PM -0500
References: <19991214095241.A12486@io.mds.rmit.edu.au> <NBBBJPGDLPIHJGEHAKBAEEGOEJAA.martind@netfolder.com>
Message-ID: <19991214120329.A16980@io.mds.rmit.edu.au>

I wrote [re. def. of XSLT "patterns"]:
>> Not quite. See XSLT, section 5.2:
[...]

Didier asked:
> so, basically a pattern match could be a collection of XPath location paths.
> Can we say that?

:-) I guess so, as long as you're clear that's only an approximate definition.

>> "Node-sets" are defined in XPath, section 3.3:
>>     "A location path can be used as an expression. The expression returns
>>     the set of nodes selected by the path."

> I expected a better definition. don't you?

Ok; the introduction to XPath also states:
    "The primary syntactic construct in XPath is the expression. An
    expression matches the production Expr. An expression is evaluated to
    yield an object, which has one of the following four basic types:

	 * node-set (an unordered collection of nodes without duplicates) 
	 [...other basic types...] 

That enough?
    
Michael



From lisarein at finetuning.com  Tue Dec 14 01:09:23 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:18:31 2004
Subject: need clarifications on XPath vs pattern match
References: <38553C23.7FDD21B2@finetuning.com> <NBBBJPGDLPIHJGEHAKBAMEGDEJAA.martind@netfolder.com> <19991214095241.A12486@io.mds.rmit.edu.au>
Message-ID: <38559927.768B65A9@finetuning.com>

Okay, so is the conclusion of all of this that:

1) XSLT's patterns MUST be XPath expressions  

and also

2) I was WRONG when I said that was not the case earlier because:

Even when the value of a simple match pattern is nothing more than an
element name, that in itself is an XPath expression (being evaluated
from the root element, by default -- unless another context is
explicitly stated somewhere in the stylesheet).

I just want to make sure that this is a done deal.  It's a pretty big
issue for there to be confusion about, if ya think about it. When I
realized I couldn't say "yes" or "no" without thinking for a minute, I
was a bit surprised.  It should be a simple question with a simple yes
or no answer. This isn't one of those gray areas, is it?

With all that in mind, I ask the group again:
Are we all in agreement here?  (with statements 1 & 2 above)

Just checking :-)

thanks,

lisa

Michael Fuller wrote:
> 
> Didier PH Martin wrote [re. XSLT patterns vs XPath expressions]:
> > so, it seems that it is not the full set but it is a subset of XPath as
> > long as the expression points to an object of type "node-set".
> 
> Not quite. See XSLT, section 5.2:
>     "A pattern must match the grammar for Pattern. A Pattern is a set of
>     location path patterns separated by |. A location path pattern is a
>     location path whose steps all use only the child or attribute axes.
>     Although patterns must not use the descendant-or-self axis, patterns
>     may use the // operator as well as the / operator. Location path
>     patterns can also start with an id or key function call with a literal
>     argument.  Predicates in a pattern can use arbitrary expressions just
>     like predicates in a location path."
> 
> This paragraph is followed by a grammar spelling out the above.
> See also XPath, section 2 "Location Paths".
> 
> > I guess that node-set will be defined in the information set specification.
> > Is it? Will it be? OK, now I have a new problem: where is a node-set defined?
> 
> "Node-sets" are defined in XPath, section 3.3:
>     "A location path can be used as an expression. The expression returns
>     the set of nodes selected by the path."
> 
> > Yop, I was feeling better in bed this morning :-))
> 
> Me too.
> 
> Bottom line: you can't fully understand the XSLT spec. w/o first
> understanding XPath. (Arguably, the reverse is true also. ;-)
> 
> Michael
> --
> http://www.mds.rmit.edu.au/~msf/
> Multimedia Databases Group, RMIT, Australia.



From martind at netfolder.com  Tue Dec 14 03:10:47 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:31 2004
Subject: need clarifications on XPath vs pattern match
In-Reply-To: <19991214120329.A16980@io.mds.rmit.edu.au>
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEHBEJAA.martind@netfolder.com>

Hi Michael,

Michael said:
Ok; the introduction to XPath also states:
    "The primary syntactic construct in XPath is the expression. An
    expression matches the production Expr. An expression is evaluated to
    yield an object, which has one of the following four basic types:

	 * node-set (an unordered collection of nodes without duplicates)
	 [...other basic types...]

That enough?

Didier replies:
I guess so. And I guess that we should interpret it to mean that a node
can be of any node type (since there are no explicit restrictions on the
node types). Thus, a node-set is an unordered collection of nodes without
duplicates, and the collection can contain any node type. Now, I guess
that the possible sequence of nodes depends on the particular information
set used. David, what are you doing? We need some precision from you here,
or at least some indication of whether or not we are on the right track.
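
To check my own reading, here is a small C++ sketch of "an unordered collection of nodes without duplicates" that may hold any node type. The Node struct and the name unionOf are hypothetical and stand in for whatever a real processor uses; std::set keeps its own internal order, which callers are assumed not to rely on.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical, minimal node model: just enough to show set semantics.
enum class NodeKind { Element, Attribute, Text, Comment };

struct Node {
    NodeKind kind;
    std::string name;  // element/attribute name, or the text content
};

// A node-set modelled as a set of node identities (addresses): the same node
// cannot appear twice, and nodes of different kinds can sit side by side.
typedef std::set<const Node*> NodeSet;

// Union of two node-sets, in the spirit of the XPath | operator.
NodeSet unionOf(const NodeSet& a, const NodeSet& b) {
    NodeSet result(a);
    result.insert(b.begin(), b.end());
    return result;
}
```

If one set holds an element and an attribute, and another holds that same attribute plus a text node, the union has three members: the shared attribute is counted once.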


Many thanks to Lisa and Michael for helping me clarify this stuff.

Cheers




From IndrajitC at catsglobal.co.in  Tue Dec 14 06:50:42 1999
From: IndrajitC at catsglobal.co.in (Indrajit Chaudhuri)
Date: Mon Jun  7 17:18:31 2004
Subject: Parsing a DTD for information
References: <002301bf4334$7144adf0$252f0a0f@india.hp.com>
Message-ID: <3855E8E6.C23AB89F@catsglobal.co.in>

Hi Abhishek,

If you are using Java, then you can try using the XMLDTDScanner class 
provided in the Xerces 3.0.0EA3 Java Parser available from IBM
Alphaworks.

In case you are using C++, I think there is no C++ parser which provides
the facility to retrieve element information from a DTD (if anyone knows
about it please let me know!). In this case, one approach may be to use
some converter to convert the DTD to DDML/XSchema and retrieve the
schema information from the DDML/XSchema file.
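
For very simple DTDs, a stopgap in C++ could be to scan the text for element declarations directly. The sketch below is only that: it assumes no comments, parameter entities or conditional sections, it ignores ATTLIST declarations, and the function name elementContentModels is invented for this example. It uses std::regex from the standard library and is not a real DTD parser.

```cpp
#include <cassert>
#include <map>
#include <regex>
#include <string>

// Map each declared element name to the raw text of its content model,
// e.g. "person" -> "(name+,lastname+)".
std::map<std::string, std::string> elementContentModels(const std::string& dtd) {
    // <!ELEMENT, whitespace, a name, whitespace, then everything up to '>'
    // with trailing whitespace trimmed.
    static const std::regex decl(R"(<!ELEMENT\s+([^\s>]+)\s+([^>]*[^\s>])\s*>)");
    std::map<std::string, std::string> models;
    for (std::sregex_iterator it(dtd.begin(), dtd.end(), decl), end; it != end; ++it) {
        models[(*it)[1].str()] = (*it)[2].str();
    }
    return models;
}
```

The application would still have to interpret the content model string itself (the + and * occurrence markers, for instance), but at least the grammar is reachable.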

Thanks,
Indrajit


Abhishek Srivastava wrote:
> 
> Hi,
> 
> Is there an XML parser that will allow me to parse just a DTD.
> 
> Suppose the following is my DTD
> <!ELEMENT (name+,lastname+)>
> 
> My application needs to know that it can have a list of names and a
> list of lastnames.
> 
> Most parsers give me the data inside the elements/attributes;
> however, they do not allow access to the grammar associated with the
> elements/attributes in the DTD.
> 
> regards,
> Abhishek.
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>     _/               Abhishek Srivastava
>    _/                Hewlett Packard ISO
>   _/_/_/   _/_/_/    -------------------
>  _/    /   _/   _/     (Work)   +91-80-2251554 x1190
> _/  _/   _/_/_/      (Ip)     15.10.47.37
>         _/           (Url)
> http://sites.netscape.net/abhishes/index.html
>        _/
>                      Work like you don't need the money.
>                      Dance like no one is watching.
>                      And love like you've never been hurt.
>                      --Mark Twain
> 
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~



From Jurg.Wullschleger at mb.luth.se  Tue Dec 14 08:31:54 1999
From: Jurg.Wullschleger at mb.luth.se (Jurg Wullschleger)
Date: Mon Jun  7 17:18:31 2004
Subject: a simpler document type definition language?
Message-ID: <1D1624C992C5D111B67C00A0C99695012B374A@mailserv.mb.luth.se>

> > the simplest form i can think of would look something like this:
> > (examples in DTD syntax)
> > there are only 4 types of elements:
> > 
> > - empty elements
> > <!ELEMENT name1 EMPTY >
> > 
> > - elements that contain data
> > 
> > <!ELEMENT name2 (#PCDATA) >
> > 
> > - list elements
> > 
> > <!ELEMENT name3 (name1|name2|name3|name4)* >
> > 
> > - structural elements of a fixed length
> > 
> > <!ELEMENT name4 ((name1|name2),name3,name4,(name5|name6|name7)) >
> 
> I would go even simpler than that.  Don't allow nested brackets, #4 could
> be represented like this:
> 
> <!ELEMENT name4 (nameA,name3,name4,nameB)>
> <!ELEMENT nameA (name1|name2)>
> <!ELEMENT nameB (name5|name6|name7)>
> 
	yes, that's maybe better.

	i was afraid of defining it this way, because there are a lot of
	extra elements needed. but these elements help a lot in a better
	structuring, so it's maybe good to be forced to use them.

	for example, it also solves the problem the "running text pattern"
	solves, but in a more natural way, i think.

	juerg wullschleger



From sb at metis.no  Tue Dec 14 08:54:41 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:31 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: roddey@us.ibm.com's message of "Mon, 13 Dec 1999 12:33:13 -0700"
References: <wh4sdt97gh.fsf@viffer.oslo.metis.no> <87256846.006B8850.00@d53mta03h.boulder.ibm.com>
Message-ID: <whaendx5vb.fsf@viffer.oslo.metis.no>

>>>>> roddey@us.ibm.com:

> You are never going to win this argument. If you do try to force
> this, this 'standard' will die on the vine. As an example, and I'm
> speaking for me personally here, not IBM... I'm adding an XML parser
> to my CIDLib C++ libraries. There is zero chance that I'll use any
> standard library functionality in it, because the whole point of it
> is to not use the standard library, since it's intended to (among
> many other things) replace the standard library with a much more
> powerful and integrated system. It gets high portability by having
> zero system or runtime headers show up outside of a very small core
> virtual kernel. Any standard which required me to use standard
> library stuff would be a non-starter, and I'd have no choice but to
> ignore it.

Then ignore it.

This seems a bit like the SML discussion.

Personally I think the Standard C++ Library is the way to go for all
C++ standardization in the future.  I'll stretch to adapt to SAX's use
of the standard library, where my own build environments are lacking
today. 

If you want low overhead, there's always C, or your own SAX-like
interface. 

Besides, in the particular environment you cite above, I don't think
you'd be able to achieve parser pluggability anyway, because
alternative parsers might need library features you don't support.

> So either it has to be wchar_t

It has to be SAXChar, actually, since wchar_t will be 32bit on some
platforms. 
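
To illustrate the point about widths: the sketch below pins a character type to 16 bits at compile time. The typedef is purely hypothetical (this thread does not show how SAXChar is actually defined), and char16_t is a later C++11 type used here only because it makes the example short.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Hypothetical stand-in for SAXChar: a type that is 16 bits on every platform.
typedef char16_t SAXChar;

static_assert(sizeof(SAXChar) * CHAR_BIT == 16,
              "SAXChar is expected to be exactly 16 bits wide");

// Number of bits in one code unit of CharT on the platform being compiled for.
template <typename CharT>
constexpr std::size_t bitsPerUnit() {
    return sizeof(CharT) * CHAR_BIT;
}
```

bitsPerUnit<SAXChar>() is 16 everywhere the assertion compiles, whereas bitsPerUnit<wchar_t>() differs between compilers, typically 16 on some and 32 on others.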



From john.aldridge at informatix.co.uk  Tue Dec 14 11:22:29 1999
From: john.aldridge at informatix.co.uk (John Aldridge)
Date: Mon Jun  7 17:18:31 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <whaendx5vb.fsf@viffer.oslo.metis.no>
References: <roddey@us.ibm.com's message of "Mon, 13 Dec 1999 12:33:13 -0700">
 <wh4sdt97gh.fsf@viffer.oslo.metis.no>
 <87256846.006B8850.00@d53mta03h.boulder.ibm.com>
Message-ID: <3.0.6.32.19991214112109.00ae5b10@mailhost>

At 09:51 14/12/99 +0100, Steinar Bang <sb@metis.no> wrote:
>> So either it has to be wchar_t
>
>It has to be SAXChar, actually, since wchar_t will be 32bit on some
>platforms. 

Why?  What's wrong with storing UTF-16 encoded data in a 32 bit wchar_t?  I
know it uses more storage space; but there won't typically be that much
data around in this format at once.

I'd much rather have the format defined to be wstring (or wchar_t*, if you
must, but that's another debate), because of the compatibility with wide
string literals.
-- 
Cheers,
John



From nicmila at vscht.cz  Tue Dec 14 11:54:06 1999
From: nicmila at vscht.cz (Miloslav Nic)
Date: Mon Jun  7 17:18:31 2004
Subject: New Zvon tutorial - RDF I
Message-ID: <38562F57.B22C12@vscht.cz>

In a new Zvon tutorial you will find results of my weekend work :) :

http://zvon.vscht.cz/HTMLonly/RDFTutorial/General/book.html

It covers the basic syntax of RDF and containers.

If time permits, I would like to finish the rest of the RDF
recommendation by the end
of January.


-- 
***************************************************************
Dr. Miloslav Nic                        e-mail: nicmila@vscht.cz
Department of Organic Chemistry         TEL: +420 2 2435 5012  
ICT Prague (VSCHT Praha)                     +420 2 2435 4118
    				        FAX: +420 2 2435 4288  
****************************************************************
Support free information exchange: http://zvon.vscht.cz
****************************************************************



From sb at metis.no  Tue Dec 14 12:13:41 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:31 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: John Aldridge's message of "Tue, 14 Dec 1999 11:21:09 +0000"
References: <3.0.6.32.19991214112109.00ae5b10@mailhost>
Message-ID: <whso15u3ik.fsf@viffer.oslo.metis.no>

>>>>> John Aldridge <john.aldridge@informatix.co.uk>:

> Why?  What's wrong with storing UTF-16 encoded data in a 32 bit
> wchar_t?  I know it uses more storage space; but there won't
> typically be that much data around in this format at once.

We store a lot of strings, so I think a quadrupling of the storage
space compared with what we do today, or doubling wrt. UTF-16, will
be significant.

Another thing is that if we actually have 32 bit available, I would
have liked to use UCS-4, rather than UTF-16... using UCS-4 would fit
better with the 

(But of course to do that, I would need to have platform specific code
depending on sizeof(wchar_t)...:-/)
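
For what it's worth, the conversion from UTF-16 to UCS-4 is small. Here is a rough C++ sketch, assuming fixed-width string types (std::u16string and std::u32string, which are later C++11 additions) rather than wchar_t, so nothing depends on sizeof(wchar_t). The name utf16ToUcs4 is made up, and unpaired surrogates are simply copied through rather than reported.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Decode UTF-16 code units into one 32-bit value per character.
std::u32string utf16ToUcs4(const std::u16string& in) {
    std::u32string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t c = in[i];
        const bool isHigh = c >= 0xD800 && c <= 0xDBFF;
        if (isHigh && i + 1 < in.size()) {
            const char32_t low = in[i + 1];
            if (low >= 0xDC00 && low <= 0xDFFF) {
                // Combine the surrogate pair into a single code point.
                c = 0x10000 + ((c - 0xD800) << 10) + (low - 0xDC00);
                ++i;  // the low surrogate has been consumed
            }
        }
        out.push_back(c);
    }
    return out;
}
```

Characters in the basic plane come out unchanged; only surrogate pairs shrink from two units to one, so the result is never longer (in units) than the input.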

> I'd much rather have the format defined to be wstring (or wchar_t*, if you
> must, but that's another debate), because of the compatibility with wide
> string literals.

Hm... I don't know anything about wide string literals and their
behaviour wrt. wstring, text editors and debuggers.  Could you
elaborate, maybe...?



From mah at oxiasoft.com  Tue Dec 14 12:40:56 1999
From: mah at oxiasoft.com (Hemissi Maher)
Date: Mon Jun  7 17:18:31 2004
Subject: About SAX and ActiveXml
Message-ID: <006c01bf4631$1f9ba8a0$100a0a0a@mah>

Hi all,

I am trying to replace the MSDom I used in my project with ActiveXml Com. Is there
any equivalent of the Microsoft DOM transformNode method in SAX or in
ActiveXml?

Thanks
Maher




From john.aldridge at informatix.co.uk  Tue Dec 14 12:56:45 1999
From: john.aldridge at informatix.co.uk (John Aldridge)
Date: Mon Jun  7 17:18:31 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: <whso15u3ik.fsf@viffer.oslo.metis.no>
References: <John Aldridge's message of "Tue, 14 Dec 1999 11:21:09 +0000">
 <3.0.6.32.19991214112109.00ae5b10@mailhost>
Message-ID: <3.0.6.32.19991214125527.00940150@mailhost>

At 13:10 14/12/99 +0100, Steinar Bang <sb@metis.no> wrote:
>>>>>> John Aldridge <john.aldridge@informatix.co.uk>:
>
>> Why?  What's wrong with storing UTF-16 encoded data in a 32 bit
>> wchar_t?  I know it uses more storage space; but there won't
>> typically be that much data around in this format at once.
>
>We store a lot of strings, so I think a quadrupling of the storage
>space compared with what we do today, or doubling wrt. UTF-16, will
>be significant.

I'm guessing that this will be fairly unusual, though.  I suspect that most
clients of such a streaming interface will be processing the data on the
fly, and not hanging on to large chunks of it for the duration of the
program run.

Of course, you don't have to store the strings in your data structures in
the same format as they are passed to you from SAX.

>> I'd much rather have the format defined to be wstring (or wchar_t*, if you
>> must, but that's another debate), because of the compatibility with wide
>> string literals.
>
>Hm... I don't know anything about wide string literals and their
>behaviour wrt. wstring, text editors and debuggers.  Could you
>elaborate, maybe...?

Brief summary:

    L'a'   is a wchar_t containing the character 'a'
    L"abc" is a wchar_t[] containing the characters 'a', 'b', 'c', '\0'

basic_string<wchar_t> (aka wstring) has constructors and comparison
operators and the like which take wchar_t* arguments.

It seems to me that code like:

void DocumentHandler::startElement (
    const std::wstring &name, const AttributeList &atts)
{
    if (name == L"Paragraph") ...
}

is going to be a whole lot neater than

void DocumentHandler::startElement (
    const std::basic_string<SAXChar> &name, const AttributeList &atts)
{
    static const SAXChar paraString[] =
        {'P','a','r','a','g','r','a','p','h','\0'};
    if (name == paraString) ...
}
-- 
Cheers,
John



From sb at metis.no  Tue Dec 14 13:23:22 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:31 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: John Aldridge's message of "Tue, 14 Dec 1999 12:55:27 +0000"
References: <3.0.6.32.19991214125527.00940150@mailhost>
Message-ID: <whbt7tu06y.fsf@viffer.oslo.metis.no>

>>>>> John Aldridge <john.aldridge@informatix.co.uk>:

> Of course, you don't have to store the strings in your data
> structures in the same format as they are passed to you from SAX.

Well, the whole idea about using SAX was that we build the data
structures we'll finally use...



From ldodds at ingenta.com  Tue Dec 14 13:31:58 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:18:31 2004
Subject: Xpath and DOM
Message-ID: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk>

At present the DOM spec only allows one to traverse the 
tree 'manually' using getChild, etc. Or jump into the 
tree at some point using getElementsByTagName.

There's nothing in there to allow me to do getElementsByExpression
(accepting an XPath search expression), or similarly pull out 
sections of the DOM tree using XPath expressions.

I've written basic utilities to do this, as have others I'm sure 
(XSLT engines must use something similar), but I'm curious as to when, or 
even whether, this type of feature is going to be added to the 
DOM API itself. 

It would seem to be pretty useful. In the applications I've built 
so far, I've not wanted to traverse or walk the tree, just pick 
out bits of it (and sure I could use SAX but I want the tree 
in memory because I'm manipulating it multiple times).

Unless I'm asking the wrong question - is there a tool that will 
search a DOM tree for me, assuming I supply it with an XPath 
expression?
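
To show the sort of utility I mean, here is a stripped-down C++ sketch. It is not DOM and not XPath: Element is a hypothetical stand-in for a DOM element, getElementsByPath is an invented name, and the only syntax handled is child steps separated by "/" with "*" as a wildcard.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM element: a name plus owned children.
struct Element {
    std::string name;
    std::vector<std::unique_ptr<Element>> children;

    explicit Element(std::string n) : name(std::move(n)) {}

    Element* add(const std::string& childName) {
        children.push_back(std::unique_ptr<Element>(new Element(childName)));
        return children.back().get();
    }
};

// Select descendants of `context` reached by following child steps such as
// "shelf/book". Results are returned in document order.
std::vector<const Element*> getElementsByPath(const Element& context,
                                              const std::string& path) {
    std::vector<const Element*> current(1, &context);
    std::istringstream steps(path);
    std::string step;
    while (std::getline(steps, step, '/')) {
        assert(!step.empty() && "empty steps such as '//' are not supported");
        std::vector<const Element*> next;
        for (const Element* node : current) {
            for (const auto& child : node->children) {
                if (step == "*" || child->name == step) next.push_back(child.get());
            }
        }
        current.swap(next);
    }
    return current;
}
```

Predicates, attributes and the other axes are exactly the parts that make this tedious to write by hand, which is why it would be nice to have it in the API.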

Cheers,

L.

==================================================================
    "Never Do With More, What Can Be Achieved With Less"
				---William of Occam
==================================================================
Leigh Dodds                             Eml:  ldodds@ingenta.com
ingenta ltd                             Tel:  +44 1225 826619
BUCS Building, University of Bath       Fax:  +44 1225 826283

eclectic				http://weblogs.userland.com/eclectic
homepage				http://www.bath.ac.uk/~ccslrd
==================================================================




From alexis at overtheweb.com  Tue Dec 14 13:35:52 1999
From: alexis at overtheweb.com (Alexis D. Gutzman)
Date: Mon Jun  7 17:18:31 2004
Subject: How "final" is the new proposed recommendation for XHTML
References: <John Aldridge's message of "Tue, 14 Dec 1999 11:21:09 +0000"><3.0.6.32.19991214112109.00ae5b10@mailhost> <3.0.6.32.19991214125527.00940150@mailhost>
Message-ID: <002201bf4638$04a68cc0$8c8f75cc@rpc579>

Hello.

I've been asked to submit a proposal to IDG Books for the XHTML Bible, but
because of the controversy surrounding the XHTML spec, we're not sure
whether to write against this spec, or wait.  I'd appreciate your thoughts
on how "final" this proposed recommendation is.

What is your sense about how well the new proposed recommendation for XHTML
1.0 released yesterday will be received?  Are you generally satisfied with
the changes to the namespace definition?  Any guesses on when/whether it
will be approved?

Thanks.
Alexis D. Gutzman
E-commerce Technology Author and Consultant
Author, _The HTML 4 Bible_, _ColdFusion for Dummies_




From pv400 at yahoo.com  Tue Dec 14 13:59:58 1999
From: pv400 at yahoo.com (P.V. NARASIMHA RAO)
Date: Mon Jun  7 17:18:31 2004
Subject: Not able to locate resource specified 
Message-ID: <19991214135954.7915.qmail@web1905.mail.yahoo.com>

Hi everybody,
          I was trying to develop an application with
an external DTD. I use IE5.0.
In my XML document I referred to the DTD in both of the
following ways.

<!DOCTYPE Books SYSTEM "c:/xml/xappli/Books.dtd">
<!DOCTYPE Books PUBLIC "-//ECC, Inc.//DTD Books//EN"
"http://c:/xml/xappli/Books.dtd"> 

The DTD was stored on my C drive, in the xappli folder inside
xml.

When I tried to browse the document in IE5.0, I
got an error saying that it is not able to
locate the resource specified.
How can I rectify this error?
Please help me.
Thank you.
Narasimha 



From rja at arpsolutions.demon.co.uk  Tue Dec 14 14:36:54 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:31 2004
Subject: About SAX and ActiveXml
References: <006c01bf4631$1f9ba8a0$100a0a0a@mah>
Message-ID: <006e01bf4640$4203cda0$c5010180@p197>

> I am trying to replace the MSDom I used in my project with ActiveXml Com. Is there
> any equivalent of the Microsoft DOM transformNode method in SAX or in
> ActiveXml?

That method is not supported currently, but if you check out the Wrox Press
book ProXML (http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861003110)
you'll see a case study that implements transformations using native code +
xpaths + sax in a generic fashion that might be useful to you.

Another option is of course to write your own mini XSL engine using the
selectNodes and selectSingleNode functions although only basic XPath syntax
is supported.  Let me know what you need in the way of XSL
support/patterns and maybe I can help.




From david at megginson.com  Tue Dec 14 15:44:21 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:31 2004
Subject: How "final" is the new proposed recommendation for XHTML
In-Reply-To: alexis@overtheweb.com's message of "Tue, 14 Dec 1999 08:34:48 -0500"
Message-ID: <m3aendbnlz.fsf@localhost.localdomain>

alexis@overtheweb.com (Alexis D. Gutzman) writes:

> I've been asked to submit a proposal to IDG Books for the XHTML
> Bible, but because of the controversy surrounding the XHTML spec,
> we're not sure whether to write against this spec, or wait.  I'd
> appreciate your thoughts on how "final" this proposed recommendation
> is.
> 
> What is your sense about how well the new proposed recommendation
> for XHTML 1.0 released yesterday will be received?  Are you
> generally satisfied with the changes to the namespace definition?
> Any guesses on when/whether it will be approved?

Since the question arrived publicly, I'll answer it publicly as well.
From my quick initial skim (as an outsider), I'll hazard a guess that
this spec will gain acceptance -- it's clear that the HTML WG listened
very carefully to comments, not only from W3C members but from the user
community at large.

As for timing, with Christmas coming not much more is likely to happen 
until January, and then the whole approval process can still take a
couple of months.  I very much hope that the W3C will choose to
fast-track this one, though, since it is fairly important.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From larsga at garshol.priv.no  Tue Dec 14 17:25:30 1999
From: larsga at garshol.priv.no (Lars Marius Garshol)
Date: Mon Jun  7 17:18:31 2004
Subject: Xpath and DOM
In-Reply-To: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk>
References: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <m3puw95t9q.fsf@ifi.uio.no>


* Leigh Dodds
|
| Unless I'm asking the wrong question - is there a tool that will
| search a DOM tree for me, assuming I supply it with an XPath
| expression.

There are some, yes, depending on the programming language you use.
See 

<URL: http://www.stud.ifi.uio.no/~lmariusg/linker/xmltools/by-standard.html#S_XPath >

for a list.

--Lars M.




From donpark at docuverse.com  Tue Dec 14 19:36:13 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:31 2004
Subject: Musing over Namespaces
Message-ID: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>

I am in the habit of musing over what seems obvious to
most people.  My latest musing was over XML Namespaces.

I started with the question: "Why aren't there namespaces
in normal languages like English?"  A boring and divergent
question.

I moved on to a more interesting question: "What if the English
language used namespaces?"  There are several ways to
interpret this question but the most interesting is this:
"What is the social impact of using namespaces in English
or any other spoken/written language?"

My answer was: massive fragmentation of society.

I have a feeling that the answer to the above question is
important to understanding the impact of namespaces in XML.

Here are some of the side questions I asked myself:

Is it really a 'good thing' to have namespaces in XML?  What
ill effect will it have on XML's future?  Why can't the
semantics of '<name>' be determined purely by context?  What
is wrong with using just <html> to distinguish HTML's use of
the 'a' tag?  Is the ability to inject attributes from other
namespaces really useful?  What is the positive effect of
having just one namespace?  Why can't we have a central
registry of XML names?

What do you all think?  I would be very much interested in
your answers.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From martind at netfolder.com  Tue Dec 14 19:43:33 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:31 2004
Subject: FW: IE5.0 security warning
Message-ID: <NBBBJPGDLPIHJGEHAKBAKEIJEJAA.martind@netfolder.com>

Hi,

I thought this might be interesting.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com 

-----Original Message-----
From: owner-xmledi-list@lists.bizserve.com
[mailto:owner-xmledi-list@lists.bizserve.com]On Behalf Of David RR
Webber
Sent: Friday, December 03, 1999 1:55 AM
To: The XML/EDI Group
Subject: FWD: IE5.0 security warning


dave thought you might like to read this one - first time I've seen xml
implicated in a security threat...

* INTERNET EXPLORER 5.0 XML REDIRECTS
Georgi Guninski reported a problem with Internet Explorer (IE) 5.0
under Windows NT 4.0 and Windows 95. According to the report, IE 5.0
has a problem with the way it handles HTTP redirects in Extensible
Markup Language (XML) objects. The problem unnecessarily exposes a
user's local file.
   When a user embeds an XML document within an HTML document, IE 5.0
doesn't handle the HTTP redirects properly, thereby allowing access to
the domain of the embedded XML document.
   http://www.ntsecurity.net/go/load.asp?iD=/security/IE54.htm

==========================================
XML/EDI Group members-only discussion list
Homepage =  http://www.xmledi.com

Brought to you by: Online Technologies Corporation
                  Home of BizServe - www.bizserve.com

TO UNSUBSCRIBE: Send email to <xmledi-list-request@lists.bizserve.com>
               Leave the subject blank, and
               In the body of the message, enter ONLY: unsubscribe

Questions/requests should be sent to: owner-xmledi-list@bizserve.com
To join the XML/EDI Group complete the form located at:
http://www.geocities.com/WallStreet/Floor/5815/mail1.htm




From clark.evans at manhattanproject.com  Tue Dec 14 20:35:05 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:31 2004
Subject: Musing over Namespaces
In-Reply-To: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <Pine.LNX.4.10.9912140316360.28872-100000@cauchy.clarkevans.com>


On Tue, 14 Dec 1999, Don Park wrote:
> I started with the question: "Why aren't there namespaces
> in normal languages like English?"  A boring and divergent
> question.

I would assert that there are; they're called jargon.
Languages form a hierarchy (described below).

We recognize its existence and often try to
make meaning of a message by first discarding
jargon we don't grok.  If we can't do this, then
we educate ourselves by finding out what the word
means in the given context.

A general dictionary is used for "common" everyday
words... however, industry-specific dictionaries
(like the New Hacker's Dictionary) are essential
for words overloaded for a particular domain.

> I moved on to a more interesting question: "What if the English
> language used namespaces?"  There are several ways to
> interpret this question but the most interesting is this:
> "What is the social impact of using namespaces in English
> or any other spoken/written language?"
> 
> My answer was: massive fragmentation of society.

Actually, I'd say that fragmentation based on
context is very natural.   And without overloaded
meanings of words, like "abstract", "static", "virtual",
etc., it would be hard to keep the lexicon small.

In general, I'd say that fragmentation occurs already.
There is a *huge* difference between the way a developer
speaks and the way a non-developer talks.  We (software
developers) overload meanings of many words, like "abstract",
"virtual", "iterator", "composite", etc. to fit our context.
Perhaps if we were more cognizant of the context of others we
would use the words "correctly" in different ways in other
contexts, or we would prefix them when in mixed company.

Instead we often mix the contexts and confuse half of
the people in a conversation.  This is horribly clear
in a biz/tech meeting where on one side of the table
you have the suits and on the other side you have
hard core t-shirts.  The context switching that 
occurs is amazing.  At best, the contexts are very
obvious and a mediator does a great job at dividing
the conversation along the appropriate name spaces,
providing mappings as needed.  At worst, misunderstanding
and ill feelings result.

> I have a feeling that the answer to the above question is
> important to understanding the impact of namespaces in XML.
> 
> Here are some of the side questions I asked myself:
> 
> Is it really a 'good thing' to have namespaces in XML?  What
> ill effect will it have on XML's future?  Why can't the
> semantics of '<name>' be determined purely by context?  What
> is wrong with using just <html> to distinguish HTML's use of
> the 'a' tag?  Is the ability to inject attributes from other
> namespaces really useful?  What is the positive effect of
> having just one namespace?  Why can't we have a central
> registry of XML names?
> 
> What do you all think?  I would be very much interested in
> your answers.

Well, I personally believe that namespaces form a
hierarchy of specialization, and that any namespace
specification must take this into account for it
to be successful.  In other words,

  Unabridged Dictionary
    |
    +-- Conversational Dictionary
    |   (only common words, with 
    |    tightly defined meanings)
    |
    +-- Medical Dictionary
    |    |
    |    +-- Internal Medicine
    |    |     ...
    |    +-- Ophthalmology
    |          ...
    |
    +-- Information Technology Dictionary
    |    |
    |    +-- Object Oriented Programming
    |    |    |
    |    |    +-- Java
    |    |    | 
    |    |    +-- C++
    |    |    |
    |    |     ...     
    |    |  
    |    +-- Network Administration
    |     ...


This type of arrangement allows for the smallest
number of strings in the language, but
gives them definite meaning depending upon
the context.   As you move to the left
the meaning of the word becomes more
vague, as you move to the right it
becomes more specific.  Of course,
domain specific words are introduced
to the right, but more often than
not, the closest general purpose 
word is borrowed rather than 
inventing a whole new name....

Hope this helps!

Clark




From david at megginson.com  Tue Dec 14 21:26:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: "Don Park"'s message of "Tue, 14 Dec 1999 11:37:10 -0800"
References: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <m37lihb4f2.fsf@localhost.localdomain>

"Don Park" <donpark@docuverse.com> writes:

> I started with the question: "Why aren't there namespaces
> in normal languages like English?"  A boring and divergent
> question.

Yes, that's an interesting difference between natural and formal
languages.  Formal languages used for programming, like Java, Perl,
and (more recently) C++, all use their own versions of namespaces.
The challenge is deciding where to put XML in that spectrum.

> Is it really a 'good thing' to have namespaces in XML?  

Probably -- at least, my instincts tell me that it is.  I think that
at least some of the benefits that have accrued to Java and Perl from
having packages can also accrue to XML.  Can you imagine any
significant number of independently-developed modules at CPAN if Perl
put all function and variable names into the same package?

> What ill effect will it have on XML's future?

I don't see a lot of ill coming from putting names into packages.  In
the C and pre-ANSI C++ world, people tried to avoid collisions by
using ad-hoc prefixes, so that every function in a library might start
with "bt_" or "mm_" -- that worked about 95% of the time, and blew up
the other 5% (I know, because I still have scars from the shrapnel).

> Why can't the semantics of '<name>' be determined purely by context?

That's fine if the document itself is the atomic unit, as it was in
the SGML world, and if there's some standard way to determine the
document type.

> What is wrong with using just <html> to distinguish HTML's use of
> 'a' tag?  

The idea is that we can reuse HTML <a> in other document types, like
(say) NITF or an XML schema language.  If you rely on the <html> root
element, that won't work.

What about something like <doc> -- would you allow only one element
type to use that?

> Is the ability to inject attributes from other namespaces really
> useful?  

I think so -- at least, it's very useful in RDF, because I can build
generic RDF engines that can do something useful with

  <megg:Thingy rdf:about="http://www.foo.com/ids/111">
    <megg:someProperty rdf:resource="http://www.foo.com/ids/222"/>
  </megg:Thingy>

even if I don't know anything about the class.  Some people also like
the lang and space attributes from the XML Namespace, though I haven't 
found much use for them myself yet.
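
To make the "generic engine" point concrete, here is a minimal sketch
in Java using the standard JAXP DOM parser with namespace awareness
turned on. The class name RdfAboutReader is made up for the example,
and the caller is assumed to supply a document that declares its
prefixes (the fragment above omits the xmlns declarations). The code
looks the attribute up by namespace URI, so it works whatever prefix
the document uses and whatever the element itself is called.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

/** Hypothetical helper: reads rdf:about from the root element of any vocabulary. */
class RdfAboutReader {
    // The RDF syntax namespace URI; the prefix bound to it is irrelevant.
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";

    static String aboutOfRoot(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // required for getAttributeNS to match
            Document doc = factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            Element root = doc.getDocumentElement();
            // Returns "" when the attribute is absent.
            return root.getAttributeNS(RDF_NS, "about");
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
    }
}
```

The engine never inspects the element's own namespace, which is the
sense in which it can "do something useful" without knowing the class.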

> What is the positive effect of having just one namespace?

Hard to say, really -- I programmed for over a decade without package
names (namespaces), and I would never want to go back.  In SGML, we
were already experimenting with Architectural Forms, so there was a
demonstrated need for some kind of Namespace partitioning even before
XML came along.

> Why can't we have a central registry of XML names?

I don't think that the XML world needs its own Network Solutions.
Hmmm .. actually, maybe I could squat on element names like "task" and
"work-order" until GM pays me $2M for them.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From pandeng at telepath.com  Tue Dec 14 21:25:19 1999
From: pandeng at telepath.com (Steve Schafer)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
References: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <3881b555.370737567@90.0.0.40>

On Tue, 14 Dec 1999 11:37:10 -0800, "Don Park" <donpark@docuverse.com>
wrote:

>"Why aren't there namespaces in normal languages like English?"

There are most certainly namespaces in English. _Every_ trade and
profession has its own terminology and jargon, which uses the same
words used elsewhere but which have very context-specific
interpretations.

Even when you restrict yourself to casual conversation, a milkshake in
Boston is a very different thing from a milkshake in Los Angeles. In a
London pub, a man might ask a woman who was getting up to leave, "Mind
if I pinch your seat?" Saying the same thing in New York might result
in him ending up on the floor.

-Steve Schafer




From elm at east.sun.com  Tue Dec 14 21:38:39 1999
From: elm at east.sun.com (Eve L. Maler)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <3881b555.370737567@90.0.0.40>
References: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
 <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <4.2.0.58.19991214163440.00a46150@abnaki>

There are cases where the same word/name has multiple meanings in different 
lexicons, but natural languages generally don't have a namespace-like 
mechanism for disambiguation.  The closest thing we probably have is the 
"in the XXX sense" construction, like "Are there any standards at this 
parade, in the 'banner' sense?"

If we did have namespace declarations and references in the course of a 
conversation, how would we ever be able to pun? :-)

         Eve

At 09:24 PM 12/14/99 +0000, Steve Schafer wrote:
>On Tue, 14 Dec 1999 11:37:10 -0800, "Don Park" <donpark@docuverse.com>
>wrote:
>
> >"Why aren't there namespaces in normal languages like English?"
>
>There are most certainly namespaces in English. _Every_ trade and
>profession has its own terminology and jargon, which uses the same
>words used elsewhere but which have very context-specific
>interpretations.
>
>Even when you restrict yourself to casual conversation, a milkshake in
>Boston is a very different thing from a milkshake in Los Angeles. In a
>London pub, a man might ask a woman who was getting up to leave, "Mind
>if I pinch your seat?" Saying the same thing in New York might result
>in him ending up on the floor.
>
>-Steve Schafer

--
Eve Maler            Sun Microsystems
elm @ east.sun.com    +1 781 442 3190



From stefan.haustein at trantor.de  Tue Dec 14 21:40:04 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:32 2004
Subject: SAX for WAP binary XML
Message-ID: <3856B3FF.A47F6C5C@trantor.de>

Hi!

I have implemented a SAX-based Java parser for WAP binary encoded
XML (http://www.trantor.de/wbxml). However, there are some problems
with SAX for WBXML:

- WBXML adds some proprietary extensions. For that reason, I added an
  interface "WapExtensionHandler". Can that interface or a similar one
  be included in org.xml.sax.wap? Who is responsible for extensions of
  org.xml.sax? Unfortunately, the ugly extensions cannot be ignored
  since they are already used in the WML definitions.

- The handler for the WAP extensions needs to be registered with the
  parser. WML depends on fixed "tag and attribute tables"; other WBXML
  languages may also use similar tables. A mechanism to register the
  tables with the parser is needed. Thus, an extended Parser interface
  is needed (e.g. org.xml.sax.wap.WbxmlParser).

- WBXML is designed for small devices like a Palm Pilot. Deriving the
  WbxmlParser from org.xml.sax.Parser includes a lot of unneeded
  overhead. It could be a better solution to start a new hierarchy
  (WbxmlParser -> WapParser). A wrapper could implement the full
  org.xml.sax.Parser interface for compatibility.
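
The registration points in the list above might look roughly like the
following Java sketch. Everything in it is hypothetical: the method
names, the TableRegistry class, and the convention that tag tokens
within a code page start at 0x05 are assumptions for illustration, not
the actual interfaces of the parser at http://www.trantor.de/wbxml or
anything in org.xml.sax.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical callback for WBXML extension tokens; not part of SAX. */
interface WapExtensionHandler {
    void extension(int token, String payload);
}

/** Hypothetical lightweight parser interface, independent of org.xml.sax.Parser. */
interface WbxmlParser {
    void setWapExtensionHandler(WapExtensionHandler handler);
    void setTagTable(int codePage, String[] tagNames);
}

/** Only the registration side of a parser; decoding the byte stream is omitted. */
class TableRegistry implements WbxmlParser {
    private static final int FIRST_TAG_TOKEN = 0x05; // assumption: lower codes are reserved
    private final Map<Integer, String[]> tagTables = new HashMap<>();
    private WapExtensionHandler extensionHandler;

    @Override
    public void setWapExtensionHandler(WapExtensionHandler handler) {
        this.extensionHandler = handler;
    }

    @Override
    public void setTagTable(int codePage, String[] tagNames) {
        tagTables.put(codePage, tagNames.clone());
    }

    /** Maps a tag token to its name, or returns null if no table entry exists. */
    String resolveTag(int codePage, int token) {
        String[] table = tagTables.get(codePage);
        int index = token - FIRST_TAG_TOKEN;
        if (table == null || index < 0 || index >= table.length) {
            return null;
        }
        return table[index];
    }

    /** Forwards an extension token to the registered handler, if any. */
    void fireExtension(int token, String payload) {
        if (extensionHandler != null) {
            extensionHandler.extension(token, payload);
        }
    }
}
```

Keeping WbxmlParser free of any org.xml.sax dependency is what would
let a small device ship it without the rest of SAX, with a wrapper
supplying the full Parser interface where compatibility matters.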

Most issues are solved "somehow" in my parser. However, extending the
standard (e.g. by adding an org.xml.sax.wap package) would be a better
solution since WBXML parsers then could be exchanged without changing
the application.

Best regards

Stefan




From clark.evans at manhattanproject.com  Tue Dec 14 21:40:20 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:32 2004
Subject: SAX and DOM
In-Reply-To: <m3puw95t9q.fsf@ifi.uio.no>
Message-ID: <Pine.LNX.4.10.9912140343460.28872-100000@cauchy.clarkevans.com>


Just some thoughts....

As both of these standards become more and more
solid (i.e. lots of code depending upon them),
perhaps it is now high time to look closely
at their integration to make it as painless
as possible for the application developer...
before _more_ code is developed.

Item #0
~~~~~~~
SAX is fundamentally event-driven,
sequential-access, read-only.  DOM 
is fundamentally object-based, 
random-access, read-write. 

No problem here, the stark differences 
make for a joyful interaction!

Item #1
~~~~~~~
AttributeList and NamedNodeMap both do
essentially the same thing, except that
NamedNodeMap allows for mutation, and
AttributeList is more specific.

A. In DOM, drop all of the mutators from
   NamedNodeMap.  The replaceChild,
   appendChild, and removeChild methods
   could just as easily be used for
   attributes as well as elements.

   In DOM level 2, those methods
   could be deprecated, giving
   applications a chance to catch up.

B. In SAX, replace AttributeList with
   NamedNodeMap.

   In SAX2, AttributeList could be
   deprecated, making it an interface
   supported by the particular
   NamedNodeMap used.

Item #2
~~~~~~~
This one is a bit more brutal.  A DOM
node is heavyweight, whereas a SAX startElement
only has the element name and attribute list.
A few things to note here:

 (a) oftentimes a read/write DOM
     node is not needed, so it is often
     pure overhead.

 (b) with a query on a DOM tree, a list
     of nodes fitting the criteria must
     be returned.  It will be tempting to
     return them as an array; however,
     if relational database design has
     anything to say, it will be realized
     that a stream of Nodes will be a far
     better candidate.  And what is SAX
     but a stream of nodes?

 (c) When using SAX, access to the ancestor
     element stack would be horribly
     valuable.

Thus,

A.  Introduce a BaseNode interface that
    includes the node's name and value,
    parent node, attribute list, and 
    (possibly) a child list...

B.  In DOM Level 2, make Node inherit
    from BaseNode

C.  For SAX 2, introduce an alternative
    DocumentHandler interface, called
    NodeHandler, with beginElement and
    handleNode(BaseNode node) methods.

    Notes:  I'm not sure how to handle
    the characters() method; perhaps the
    light-weight Node interface needs
    quite a bit more modification...

    Perhaps, instead of returning a String
    (for Java), it returns another object,
    something like this:
      class CharBuff {
        char[] array;   // shared character buffer
        int    begin;   // index of the first character
        int    length;  // number of characters
      }
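
One way to flesh out the CharBuff idea, as a sketch only: the version
below adds a constructor and implements java.lang.CharSequence so the
buffer can be handed to code expecting text without copying until
toString() is called. The CharSequence choice and the bounds checks are
assumptions of this example, not part of the proposal above.

```java
/** A view onto a slice of a shared char array; no copy is made until toString(). */
final class CharBuff implements CharSequence {
    final char[] array;
    final int begin;
    final int length;

    CharBuff(char[] array, int begin, int length) {
        if (begin < 0 || length < 0 || begin + length > array.length) {
            throw new IndexOutOfBoundsException("slice outside array");
        }
        this.array = array;
        this.begin = begin;
        this.length = length;
    }

    @Override
    public int length() {
        return length;
    }

    @Override
    public char charAt(int index) {
        if (index < 0 || index >= length) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        return array[begin + index];
    }

    @Override
    public CharSequence subSequence(int start, int end) {
        if (start < 0 || end > length || start > end) {
            throw new IndexOutOfBoundsException("bad range");
        }
        // Still a view: the new CharBuff shares the same backing array.
        return new CharBuff(array, begin + start, end - start);
    }

    @Override
    public String toString() {
        return new String(array, begin, length); // the only place a copy happens
    }
}
```

The usual caveat applies: a parser that reuses its buffer invalidates
any CharBuff still pointing into it, so a handler that wants to keep
the text must call toString() before returning.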

Item #3
~~~~~~~
SAX is a stream interface, but unfortunately,
an event/listener pattern was not used.  So,
perhaps for SAX2, an XPath-based dispatch
system could be used, to pick a particular
NodeHandler based on particular criteria.
This would also work *wonderfully* for a
DOM query handler.  A system like this, BTW,
really drives the need for the ancestor
stack (at a minimum) to be made available
through a SAX2 interface.
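
A minimal sketch of such a dispatcher in Java follows. It keeps the
ancestor stack itself and matches only exact absolute paths such as
/order/item, which is far short of real XPath. The NodeHandler
signature here and the PathDispatcher class are assumptions for the
example, layered on the standard org.xml.sax.helpers.DefaultHandler
rather than on any proposed SAX2 interface.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical per-path callback. */
interface NodeHandler {
    void handleNode(String path, Attributes attributes);
}

/** Routes start-element events to handlers registered for an exact element path. */
class PathDispatcher extends DefaultHandler {
    private final Deque<String> ancestors = new ArrayDeque<>(); // root first
    private final Map<String, NodeHandler> handlers = new HashMap<>();

    void register(String path, NodeHandler handler) {
        handlers.put(path, handler);
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        ancestors.addLast(qName);
        String path = "/" + String.join("/", ancestors);
        NodeHandler handler = handlers.get(path);
        if (handler != null) {
            handler.handleNode(path, atts);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        ancestors.removeLast();
    }

    /** Convenience: run the dispatcher over an in-memory document. */
    static void parse(String xml, PathDispatcher dispatcher) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), dispatcher);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
    }
}
```

Because the stack lives in the dispatcher, individual handlers stay
stateless; exposing the same stack to handlers would give them the
ancestor access argued for in Item #2 (c).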

Item #4
~~~~~~~
DOM is a random-access interface, but unfortunately,
it does not currently allow user-defined containers
for sub-sets of children.  This would, IMHO, be
a great boon for a modular grammar.... in some
cases a linked list might be perfect.  In other
cases, a balanced red/black tree might be
the ticket, etc.   By delegating this to the implementation,
a huge amount of choice is stripped from the
application developer.


Anyway... just thinking out loud here.

...

BTW, the experimental YML syntax has really
cleared up my thinking with regard to 
sequential vs. random access.   

On the SML list is a possible starting proposal
for a better SAX/DOM integration based on
the endEvent/handleNode returning a boolean,
"true" if the node is to be added to its parent's
child list, or "false" if it and all of its
children are to be garbage collected.  The result
is surprising...  if the answer to this question
is "no" recursively, then the unified interface
is logically equivalent to SAX.  If the answer
is "yes" recursively, then the unified interface
is logically equivalent to a DOM builder with
SAX calls.  If the answer is "no" for many
top level nodes, but "yes" for an entire sub-tree,
then the result is similar to Pyxie's hybrid
approach.  However, what this interface allows
is far more granularity of choice than either
of these models... thus with a small amount of
added complexity (a boolean decision), great
flexibility is granted.
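
The boolean decision can be sketched in Java as follows. This is an
illustration under assumptions, not the SML-list proposal itself:
SimpleNode, KeepPolicy and SelectiveBuilder are invented names, the
node carries only a name, and the builder sits on the standard
DefaultHandler.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Minimal tree node; a stand-in for the BaseNode idea, not a DOM Node. */
class SimpleNode {
    final String name;
    final SimpleNode parent;
    final List<SimpleNode> children = new ArrayList<>();

    SimpleNode(String name, SimpleNode parent) {
        this.name = name;
        this.parent = parent;
    }
}

/** Hypothetical policy: return true to attach a finished node to its parent. */
interface KeepPolicy {
    boolean keep(SimpleNode node);
}

class SelectiveBuilder extends DefaultHandler {
    private final KeepPolicy policy;
    private SimpleNode current; // innermost open element
    private SimpleNode root;    // the document element, if it was kept

    SelectiveBuilder(KeepPolicy policy) {
        this.policy = policy;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        current = new SimpleNode(qName, current);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        SimpleNode done = current;
        current = done.parent;
        if (policy.keep(done)) {
            if (current != null) {
                current.children.add(done);
            } else {
                root = done;
            }
        }
        // A rejected node is never linked from its parent, so it and its
        // already-attached children become unreachable and can be collected.
    }

    /** Returns the kept document element, or null if the policy rejected it. */
    static SimpleNode build(String xml, KeepPolicy policy) {
        SelectiveBuilder builder = new SelectiveBuilder(policy);
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), builder);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return builder.root;
    }
}
```

A policy that always answers false leaves nothing behind and behaves
like plain event handling; one that always answers true yields the full
tree; anything in between keeps selected sub-trees only.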

It is in my attempt to unify DOM/SAX using this
type of "SAX->DOM" binary-recursive builder
that the above concerns popped up.  It would
be cool to have a debate about them, or
perhaps better, pointers as to where the
debates on these points were carried out.

Best Wishes,

Clark




From martind at netfolder.com  Tue Dec 14 21:40:41 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAIEIMEJAA.martind@netfolder.com>

Hi Don,

Don said:
"What is the social impact of using namespaces in English
or any other spoken/written language?"

Didier replies:
This is what Pokemon are doing! They append a namespace identifier to each
sound they make. Now you know the secrets of Pokemon :-)). Concerning the
social impact, just ask any parent what they think of Pokemon :-))

Seriously, allowing each one to create his own language obviously has
certain consequences. One of them is that only the namespace vocabulary
creator can understand the meaning. Have you ever understood what a Pokemon
said? Me neither :-))

PS: my spell checker wants to replace Pokemon by Poke on!

Have a good day
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From clark.evans at manhattanproject.com  Tue Dec 14 21:45:37 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <m37lihb4f2.fsf@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912140447070.28872-100000@cauchy.clarkevans.com>


On 14 Dec 1999, David Megginson wrote:
> > Why can't we have a central registry of XML names?
> 
> I don't think that the XML world needs its own Network Solutions.
> Hmmm .. actually, maybe I could squat on element names like "task" and
> "work-order" until GM pays me $2M for them.

I think it can be done very effectively informally.
The Oxford English Dictionary is not _the_ authoritative
reference for English, but it is certainly one of the
top ones... and other dictionaries will have a hard
time if they are too "incompatible".

;) Clark




From greynolds at datalogics.com  Tue Dec 14 22:11:22 1999
From: greynolds at datalogics.com (Reynolds, Gregg)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
Message-ID: <51ED3F5356D8D011A0B1006097C3073401B17075@martinique>

> -----Original Message-----
> From: Don Park [mailto:donpark@docuverse.com]
> Sent: Tuesday, December 14, 1999 1:37 PM
> 
> I started with the question: "Why aren't there namespaces
> in normal languages like English?"  A boring and divergent
> question.

Who told you English was normal?

Of course natural languages have namespaces: they're innumerable and
flexible, and also form the ground of articulation, not the surface, and we
get to decide which to use when, both sending and receiving.  If you're
really good, you can play on the available field of namespaces to create
things like jokes and irony.  Think "semiotic field".
 
> I moved on to a more interesting question: "What if the English
> language used namespaces?"  There are several ways to
> interpret this question but the most interesting is this:
> "What is the social impact of using namespaces in English
> or any other spoken/written language?"
> 
> My answer was: massive fragmentation of society.
> 

Who told you society wasn't massively fragmented?  

Explicit namespacing (aside from being impossible) would only render the
illusion of communication even more untenable than it already is.  I believe
logicians call this the problem of indexicals:  in "this sucks!", how do you
know what "this" references?  Very important in determining the attitude of
the speaker.

> I have a feeling that the answer to the above question is
> important to understanding the impact of namespaces in XML.
> 
> Here are some of the side questions I asked myself:
> 
> Is it really a 'good thing' to have namespaces in XML?  What

Do you mean "a means of scoping names, so that a local, apparently atomic
name can be mapped to a universal name"?  Yes.  Or do you mean "namespaces
as designed in the current rec"?  Dunno.

> ill effect will it have on XML's future?  Why can't the
> semantics of '<name>' be determined purely by context?  What

Efficiency rears its ugly head.  If you could derive the semantics from the
context, then you could just tell the computer what to do, in English.

> is wrong with using just <html> to distinguish HTML's use of
> 'a' tag?  Is the ability to inject attributes from other
> namespaces really useful?  What is the positive effect of
> having just one namespace?

Well, it makes life easier for language designers and implementers.  Makes
life really hard for everybody else.

>  Why can't we have central
> registry of XML names?

Who's gonna run it?  Network Solutions?  What happens when somebody sets
up a competing registry?  Who's going to settle disputes over the semantics
of FOO?

-gregg


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From david at megginson.com  Tue Dec 14 23:24:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <Pine.LNX.4.10.9912140447070.28872-100000@cauchy.clarkevans.com>
References: <m37lihb4f2.fsf@localhost.localdomain>
	<Pine.LNX.4.10.9912140447070.28872-100000@cauchy.clarkevans.com>
Message-ID: <14422.53643.708848.3956@localhost.localdomain>

Clark C. Evans writes:
 > 
 > 
 > On 14 Dec 1999, David Megginson wrote:
 > > > Why can't we have central registry of XML names?
 > > 
 > > I don't think that the XML world needs its own Network Solutions.
 > > Hmmm .. actually, maybe I could squat on element names like "task" and
 > > "work-order" until GM pays me $2M for them.
 > 
 > I think it can be done very effectively informally.  The Oxford
 > English Dictionary is not _the_ authoritative reference for
 > english, but it is certainly one of the top ones... and other
 > dictionaries will have a hard time if they are too "incompatible"

Let's say that we have an element- and attribute-name registry
instead of Namespaces, and I register the element name "purchase";
now, presumably, no one else can use that in a document type without
my authorization (since a registry is pointless otherwise).

Personally, I'd rather see a world with

  {http://www.sun.com/ns/}purchase
  {http://www.ibm.com/ns/}purchase
  {http://www.amazon.com/ns/}purchase

than a world with

  XML Name Registry Search results
  ---------------------------------

  XML name: "purchase"

  Sorry, this element name has already been registered.
  
  Owner: Microsoft Ltd.
  Technical contact: xmlnames@microsoft.com
  Contact info: ...


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Wed Dec 15 00:04:01 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <51ED3F5356D8D011A0B1006097C3073401B17075@martinique>
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEJDEJAA.martind@netfolder.com>

Hi Don,

Don said:
> I started with the question: "Why aren't there namespaces
> in normal languages like English?"  A boring and divergent
> question.

Didier replies:
Yes, we have namespaces for English or any language. This is the nonverbal
annotation that comes with the word, or it is the tone, or a combination of
both.

When we write, we simply suppress the namespace, and this is why we sometimes
have to add more qualifiers or simply more explanations, or add symbols like
;-) to serve as the namespace qualifier. So I can say Pokemon ;-) (this is
because we are both parents and you know what I mean by Pokemon or can imagine
the tone), or Pokemon :-) (this is because I am under 12), or Pokemon
:-( (I just came from the store and got had by a Pokemon store). So, you can
imagine symbols like ;-), :-) or :-( after a word as namespace qualifiers.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From donpark at docuverse.com  Wed Dec 15 00:17:40 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <14422.53643.708848.3956@localhost.localdomain>
Message-ID: <000201bf4691$ec1eb1a0$d1940e18@smateo1.sfba.home.com>

David Megginson wrote:
>Let's say that we have an element- and attribute-name registry
>instead of Namespaces, and I register the element name "purchase";
>now, presumably, no one else can use that in a document type without
>my authorization (since a registry is pointless otherwise).

I was thinking more along the lines of an English dictionary, which
allows multiple entries per word.  Each entry is qualified with
some XPath-subset.  Company-specific entries should be possible
if qualified with some proprietary information.

David Megginson wrote:
>Personally, I'd rather see a world with
>
>  {http://www.sun.com/ns/}purchase
>  {http://www.ibm.com/ns/}purchase
>  {http://www.amazon.com/ns/}purchase

My concern is that the XML namespace standard seems to encourage
proliferation of proprietary tags such as the three 'purchase'
tags in your example.  With just three companies, we have three
different tags that basically mean the same thing.

Of course, this sort of problem is usually 'fixed' retroactively
through the standardization process but such an approach has some
disadvantages such as:

1. we get caught up in the 'mess-up/clean-up/repeat' cycle.
2. resulting standards are likely to be much larger than
   the per-word granularity that spoken languages use.

BTW, thanks to everyone for contributing to this thread.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From steve at redsquare.com.au  Wed Dec 15 01:02:47 1999
From: steve at redsquare.com.au (Steve Baty)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
Message-ID: <042e01bf4697$633cacb0$0f044ccb@kit.redsquare.com.au>

>I was thinking more along the lines of an English dictionary, which
>allows multiple entries per word.  Each entry is qualified with
>some XPath-subset.  Company-specific entries should be possible
>if qualified with some proprietary information.

Don,

Although I agree to a certain extent that xml tags could conform to a
dictionary-style approach to registration, is there any real benefit to this
approach when each definition for a particular element requires additional
information such as "some XPath-subset" and "some proprietary information"?

Why not just reference the namespace in the registered definition?

On a slightly different note, if I were to register an element such as
"purchase" which had child elements "card_type", "product_id" and "price"
how would I differentiate these child elements from somebody else's elements
of the same name? If each element in the registry had a namespace
declaration I could follow the thread of the element def'ns fairly easily.

The main problem I see in the registry approach, albeit
registry-as-dictionary, is that there would need to be some way to
differentiate between "my" "purchase" element, and the one that, say, a
stockbroker might use to define a request for shares "purchase".

The way I see it, what you are proposing would look something like:
purchase: i) element, sun microsystems, attributes:approved (true|false),
child elements (payment_type,delivery_address,...)
ii) attribute, foo_bar corp, element:transaction;

etc

Now, how might I, looking at this defn, determine whose defn of
"payment_type" I should go and look up? Perhaps instead of a dictionary
we're really talking thesaurus!?

I suppose the point is: if you need to include company-specific and XPath
information into the definition in the registry in order to identify which
element we're "really" talking about, why not use namespaces?

=========================
Steve Baty
Technical Designer
Red Square Productions
http://www.redsquare.com.au
steve@redsquare.com.au
Ph: +612 9519 4599
Fax: +612 9519 4699




From kamiya at rp.open.cs.fujitsu.co.jp  Wed Dec 15 02:44:38 1999
From: kamiya at rp.open.cs.fujitsu.co.jp (Takuki Kamiya)
Date: Mon Jun  7 17:18:32 2004
Subject: ANNOUNCE: XPath interface for XT Version 0.90
References: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <021301bf46a6$3fc58d10$866e230a@sysrap.cs.fujitsu.co.jp>

Leigh Dodds writes:
> 
> Unless I'm asking the wrong question - is there a tool that will 
> search a DOM tree for me, assuming I supply it with an XPath 
> expression?
> 

If the language being used is Java, there is a tool, "XPath interface for XT",
which performs XPath queries on top of the DOM, accessible at:

http://www.246.ne.jp/~kamiya/pub/XPath4XT.html

The tool comes with source. Free for any purpose, but with no warranty. 

I recently enhanced it to support variables so that now you can reference
variables in your query string.
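
For readers who want to see the general idea of "hand a tree an XPath
expression and get nodes back", here is a minimal sketch. It is not the
XPath interface for XT and does not use its API: it assumes Python's
standard xml.etree.ElementTree, which is not a W3C DOM and supports only
a limited XPath subset (no variables). The catalog document and the
select() helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, invented for this sketch.
catalog = ET.fromstring("""<catalog>
  <book id="b1"><title>XML Basics</title><price>30</price></book>
  <book id="b2"><title>XPath in Practice</title><price>45</price></book>
  <magazine id="m1"><title>Markup Monthly</title></magazine>
</catalog>""")


def select(tree, expression):
    """Return every element under `tree` matching a path expression.

    Thin wrapper over ElementTree.findall(), which understands only a
    subset of XPath (child/descendant steps and simple predicates).
    """
    return tree.findall(expression)


# Pick out pieces of the tree without walking it by hand.
titles = [t.text for t in select(catalog, ".//book/title")]
by_id = select(catalog, ".//book[@id='b2']")

print(titles)
print(by_id[0].find("title").text)
```

The tree stays in memory, so it can be queried as many times as needed.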

= Takuki Kamiya  Phone: (045)476-4586 Fax: (045)476-4749   =
= FUJITSU LIMITED (COINS:7128-4217 NIFTY:HHA01731)         =




From w.hedley at auckland.ac.nz  Wed Dec 15 02:57:55 1999
From: w.hedley at auckland.ac.nz (Warren Hedley)
Date: Mon Jun  7 17:18:32 2004
Subject: dom.Document -> sax.DocumentHandler
Message-ID: <385703A9.5BF36C59@auckland.ac.nz>

With all of the XML parsing packages I've downloaded in 1999,
I've completely forgotten which one had a class which could
produce SAX events from a DOM Document tree. I'm sure I've
seen one.

Can anyone tell me which package has this?
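
For reference, the conversion itself is a short recursive walk. The
following is only a sketch of the technique, assuming Python's standard
xml.dom.minidom and xml.sax modules rather than any of the Java packages
in question; dom_to_sax() and EventRecorder are hypothetical names, and
comments and processing instructions are simply skipped.

```python
from xml.dom import minidom, Node
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def dom_to_sax(node, handler):
    """Walk a DOM node and replay it as SAX events on `handler`."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        # NamedNodeMap.items() yields (name, value) pairs.
        attrs = AttributesImpl(dict(node.attributes.items()))
        handler.startElement(node.tagName, attrs)
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType in (Node.TEXT_NODE, Node.CDATA_SECTION_NODE):
        handler.characters(node.data)
    # Other node types (comments, PIs, ...) are ignored in this sketch.


class EventRecorder(ContentHandler):
    """Hypothetical handler that just records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


document = minidom.parseString('<order id="7"><item>widget</item></order>')
recorder = EventRecorder()
dom_to_sax(document, recorder)
for event in recorder.events:
    print(event)
```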

Thanks in advance.


-- 
Warren Hedley
Department of Engineering Science
Auckland University
New Zealand



From david at megginson.com  Wed Dec 15 03:17:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: "Don Park"'s message of "Tue, 14 Dec 1999 16:18:32 -0800"
References: <000201bf4691$ec1eb1a0$d1940e18@smateo1.sfba.home.com>
Message-ID: <m34sdkc2qo.fsf@localhost.localdomain>

"Don Park" <donpark@docuverse.com> writes:

> My concern is that the XML namespace standard seems to encourage
> proliferation of proprietary tags such as the three 'purchase'
> tags in your example.  With just three companies, we have three
> different tags that basically mean the same thing.

Yes, and that's the natural *second* stage of standards development:
first you allow innovation, then you figure out what to standardize.
So you start with

  {http://www.sun.com/ns/}purchase
  {http://www.ibm.com/ns/}purchase
  {http://www.amazon.com/ns/}purchase

and then you bring everyone together for a while, bash heads, and hope 
that you end up with

  {http://www.ecommerce-coop.org/ns/}purchase

It's messy, but it's the only standards path that really seems to
work.  At least with Namespaces we can remove 50% of the messiness
(there's no chance of confusing different parties' extensions) on the
way to standards Nirvana.
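
To make the "no chance of confusing" point concrete, here is a minimal
sketch, assuming Python's standard xml.etree.ElementTree, which happens
to expose element names in the same {URI}local form used above. The
<orders> wrapper, the prefixes, and the element content are invented for
illustration; only the three namespace URIs come from the example.

```python
import xml.etree.ElementTree as ET

# Three vendors each use a local name "purchase" in one document.
doc = """<orders xmlns:sun="http://www.sun.com/ns/"
        xmlns:ibm="http://www.ibm.com/ns/"
        xmlns:amz="http://www.amazon.com/ns/">
  <sun:purchase>workstation</sun:purchase>
  <ibm:purchase>mainframe</ibm:purchase>
  <amz:purchase>book</amz:purchase>
</orders>"""

root = ET.fromstring(doc)

# The parser resolves each prefix, so every element carries its
# universal name: the namespace URI plus the local name.
universal_names = [child.tag for child in root]
for name in universal_names:
    print(name)

# Asking for one party's "purchase" never matches the others.
ibm_only = root.findall("{http://www.ibm.com/ns/}purchase")
print([element.text for element in ibm_only])
```

The prefixes themselves (sun, ibm, amz) are local shorthand and drop out
after parsing; only the URI and the local name take part in comparison.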

> Of course, this sort of problem is usually 'fixed' retroactively
> through the standardization process but such an approach has some
> disadvantages such as:
> 
> 1. we get caught up in the 'mess-up/clean-up/repeat' cycle.
> 2. resulting standards are likely to be much larger than
>    the per-word granularity that spoken languages use.

As for #1, that's not a problem, that's The Right Way To Do It.  We
have to let the market mess up first so that we can figure out where
to invest our effort; otherwise, we waste our time and everyone
else's, because we're not smart enough to standardize in advance.
It's not pretty, but that's inevitable when we're dealing with humans
rather than machines, and the market brings its own, different kind of 
efficiency to the process.

I'm going to write a paper with a catchy title on this topic some day,
so that I can make US$45M like Eric Raymond just did from "The
Cathedral and the Bazaar."


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From clark.evans at manhattanproject.com  Wed Dec 15 03:21:24 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <14422.53643.708848.3956@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912141015120.28872-100000@cauchy.clarkevans.com>


On Tue, 14 Dec 1999, David Megginson wrote:
> Clark C. Evans writes:
>  > On 14 Dec 1999, David Megginson wrote:
>  > > > Why can't we have central registry of XML names?
>  > > 
>  > > I don't think that the XML world needs its own Network Solutions.
>  > > Hmmm .. actually, maybe I could squat on element names like "task" and
>  > > "work-order" until GM pays me $2M for them.
>  > 
>  > I think it can be done very effectively informally.  The Oxford
>  > English Dictionary is not _the_ authoritative reference for
>  > english, but it is certainly one of the top ones... and other
>  > dictionaries will have a hard time if they are too "incompatible"
> 
> Let's say that we have an element- and attribute-name registry
> instead of Namespaces, and I register the element name "purchase";
> now, presumably, no one else can use that in a document type without
> my authorization (since a registry is pointless otherwise).

Definitions in the OED are not pointers to a company
that has defined the word.  The OED is valuable not
because it "defines" but more so because the organization
continually scans the language for new meanings and
twists in usage; thus it is much more of a de facto
than a de jure standard -- it changes by usage.

What you assumed below is more of a de jure system
where a given entity would "own" the name...

> Personally, I'd rather see a world with
> 
>   {http://www.sun.com/ns/}purchase
>   {http://www.ibm.com/ns/}purchase
>   {http://www.amazon.com/ns/}purchase
> 

This is flexible, but it could be tedious.

I'd rather see an XML DTD which describes 
a particular usage; and then have a central
repository for people to store their suggested
usage of a particular name.

From a database like this, one could attempt to 
define "best practices" -- scanning what "other" 
purchases are out there; perhaps even using an 
existing definition rather than each company 
creating their own.

A database of usage could also be used for 
independent groups to identify common patterns
and submit more or less generic versions for
a particular industry.  It could also be used
to generate mappings from one system to 
another, etc.

I don't think anyone would be proposing this:

> than a world with
> 
>   XML Name Registry Search results
>   ---------------------------------
> 
>   XML name: "purchase"
> 
>   Sorry, this element name has already been registered.
>   
>   Owner: Microsoft Ltd.
>   Technical contact: xmlnames@microsoft.com
>   Contact info: ...
> 

However, if it was, the registry should be a government
function, and the names should be auctioned to the
highest bidder.... *smirk*  *evil grin*

Clark




From robertl1 at home.com  Wed Dec 15 03:48:45 1999
From: robertl1 at home.com (Robert La Quey)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces 
Message-ID: <3.0.6.32.19991214194054.0430e5d0@mail.dt1.sdca.home.com>

<boblq>
 <quote>
   <author>
     <lastname>Park</lastname>
     <firstname>Don</firstname>
   </author>
   Is it really a 'good thing' to have namespaces in XML?  
 </quote>

Probably not as defined by W3 in the current rec, but ...

 <quote>
   <author>
     <lastname>Reynolds</lastname>
     <firstname>Gregg</firstname>
   </author>
   Do you mean "a means of scoping names, so that a local, apparently atomic
   name can be mapped to a universal name"?  Yes.  
 </quote>

 <quote>
   <author>
     <lastname>Park</lastname>
     <firstname>Don</firstname>
   </author>
   What ill effect will it have on XML's future?  
 </quote>

Like all unnecessary complexity it gets in the way of thinking 
clearly about the problem one is trying to solve. 

 <quote>
   <author>
     <lastname>Park</lastname>
     <firstname>Don</firstname>
   </author>
   Why can't the semantic of '<name>' be determined purely by context?  
  </quote>

It can. An "elements only" argument was given earlier.

 <quote>
   <author>
     <lastname>Seivers</lastname>
     <firstname>Kent</firstname>
   </author>

	As evidence of this I give 

	1) almost every other object oriented language in existence.  
	   author.name.firstname = 'joe' is easy to understand, and, 
	   behind the scenes, since even an INT is an object 
	   and even an "=" is a function, is entirely done in the 
	   spirit of "elements only" and 

	2) the obvious nature of everyone's first XML tutorial in which 
	   they are typically shown something like 
	   <author><firstname>joe</firstname></author> and 
	   understand it completely.
 </quote>

 <quote>
   <author>
     <lastname>Park</lastname>
     <firstname>Don</firstname>
   </author>
   What is wrong with using just <html> to distinguish 
   HTML's use of 'a' tag?  
 </quote>

Nothing in a fully qualified (hierarchical, see Clark Evans's remarks and 
example above) context. The <boblq> tag would disambiguate the <html>.

>Is the ability to inject attributes from other namespaces really useful?  

No. 

<sigh>
Yes ... but for legacy reasons 
</sigh> 

 <quote>
   <author>
     <lastname>Park</lastname>
     <firstname>Don</firstname>
   </author>
   What is the positive effect of having just one namespace?  
 </quote>

The systematic development of a taxonomic hierarchy closely related
to semantics would follow from having a consistent syntax. As it is,
folks are hiding like ostriches from this problem, which syntax will
never solve. 

 <quote>
   <author>
     <lastname>Park</lastname>
     <firstname>Don</firstname>
   </author>
   Why can't we have central registry of XML names?
 </quote>

Good question. Better question: why can't we have a distributed
registry? Actually I think, in some sense, as David Megginson has pointed
out, such a registration mechanism is the essence of the W3 name identification 
mechanism, which is one part of the rec that I do like. 

So I suggest following Don's suggestion:

purchase {http://www.w3.org/ns/} default definition 
         {http://www.sun.com/ns/} variant
         {http://www.ibm.com/ns/} variant
         {http://www.amazon.com/ns/} variant 

where the variants are diffs from the default. I would hope the W3 is
a decent place to get the "default definition" adjudicated. Certainly 
W3 is not Network Solutions ;( but then it is not the IETF either. 

I believe the Jabber guys are working on a distributed namespace management 
system for their own purposes but I have not reviewed it. A Jabberite 
might care to comment.

</boblq>




From clark.evans at manhattanproject.com  Wed Dec 15 04:01:28 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
In-Reply-To: <m34sdkc2qo.fsf@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912141055440.28872-100000@cauchy.clarkevans.com>

David,

Are namespaces just a modern way to handle 
inter-woven document type definitions?

BTW, what ever happened to architectural forms
and how is this related? 

...

> Yes, and that's the natural *second* stage of standards development:
> first you allow innovation, then you figure out what to standardize.
> So you start with
> 
>   {http://www.sun.com/ns/}purchase
>   {http://www.ibm.com/ns/}purchase
>   {http://www.amazon.com/ns/}purchase
> 
> and then you bring everyone together for a while, bash heads, and hope 
> that you end up with
> 
>   {http://www.ecommerce-coop.org/ns/}purchase
> 
> It's messy, but it's the only standards path that really seems to
> work.  At least with Namespaces we can remove 50% of the messiness
> (there's no chance of confusing different parties' extensions) on the
> way to standards Nirvana.

This sounds reasonable; where the ecommerce-coop.org 
acts like the "OED" previously mentioned...

It'd still be nice to have a single database with 
everyone's namespace definitions in one place though...
perhaps even a DTD to help describe them.  I'm sure 
there are organizations doing this... are there?

> I'm going to write a paper with a catchy title on this topic some day,
> so that I can make US$45M like Eric Raymond just did from "The
> Cathedral and the Bazaar."

Good luck. Eric is a great writer, and a pretty 
good speaker to boot. More than that, he is 
definitely _not_ slick...  if he were, the nerd 
population would have cast him aside.  And to
put icing on the cake, if you corner him for 
a personal chat, he is an extremely likeable 
fella.  Not to mention his wonderful 
stories (regardless of how true they are).


;) Clark




From tbray at textuality.com  Wed Dec 15 04:22:21 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
Message-ID: <3.0.32.19991214202152.02074c10@pop.intergate.ca>

At 11:04 AM 12/14/99 -0500, Clark C. Evans wrote:
>It'd still be nice to have a single database with 
>everyone's namespace definitions in one place though...
>perhaps even a DTD to help describe them.  I'm sure 
>there are organizations doing this... are there?

Yes, lots.  That's the problem. -Tim



From dhunter at Mobility.com  Wed Dec 15 04:43:38 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC01BF@cc20exch2.mobility.com>

From: Don Park [mailto:donpark@docuverse.com]
Sent: Tuesday, December 14, 1999 7:19 PM
> 
> My concern is that the XML namespace standard seems to encourage
> proliferation of proprietary tags such as the three 'purchase'
> tags in your example.  With just three companies, we have three
> different tags that basically mean the same thing.

In my mind, this is not a bad thing.  As some people may or may not be
saying on this thread, the word "purchase" means different things to
different companies, so they are naturally going to write XML to define a
"purchase" differently.  So the key word I see in that last sentence is
"basically".  "Purchase" may mean <em>basically</em> the same thing to two
different companies, but they won't mean <em>exactly</em> the same thing.

<aside>
As has been mentioned before, I think this is one of the areas where XSLT is
really going to shine over the Internet.  Even when we start coming up with
<em>standardized</em> ways of expressing things in XML, people don't have to
use those standards for their own applications.  They can write XML in
whatever way they want, in a way that makes sense for them, and transform it
to another form of XML when they need to communicate with someone else.
</aside>
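
A rough sketch of that "write it your way, transform it to talk to
someone else" idea follows. It is not XSLT: Python's standard library has
no XSLT processor, so a plain ElementTree rewrite stands in for the
stylesheet here. The RENAME table, the element names on both sides, and
translate() are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one company's local vocabulary to an
# agreed-upon shared vocabulary (names invented for this sketch).
RENAME = {
    "purchase": "order",
    "product_id": "sku",
    "price": "amount",
}


def translate(element):
    """Return a copy of `element` with tag names mapped via RENAME.

    Tags without an entry in RENAME are copied through unchanged, much
    like an identity rule in a stylesheet.
    """
    out = ET.Element(RENAME.get(element.tag, element.tag), dict(element.attrib))
    out.text = element.text
    out.tail = element.tail
    for child in element:
        out.append(translate(child))
    return out


local = ET.fromstring(
    "<purchase><product_id>42</product_id><price>9.99</price></purchase>"
)
shared = translate(local)
result = ET.tostring(shared, encoding="unicode")
print(result)
```

A real stylesheet can also restructure and compute content, not just
rename elements, which is where XSLT earns its keep.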



From dhunter at Mobility.com  Wed Dec 15 04:48:03 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:32 2004
Subject: Xpath and DOM
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC01C0@cc20exch2.mobility.com>

This may or may not be what you're looking for, but the DOM included with
Internet Explorer 5, MSXML, allows you to enter an XPath expression as a
parameter to the selectNodes() and selectSingleNode() methods.

(Actually, it's not *real* XPath.  It's based on a note submitted to the W3C
for XQL, as memory serves.  [It usually doesn't.]  But it's pretty durned
close, and when the next MSXML is released which is slated to have full 100%
compliant XSL support, I'm assuming it will have full 100% compliant XPath
support as well.)

-----Original Message-----
From: Leigh Dodds [mailto:ldodds@ingenta.com]
Sent: Tuesday, December 14, 1999 8:32 AM
To: xml-dev
Subject: Xpath and DOM


At present the DOM spec only allows one to traverse the 
tree 'manually' using getChild, etc. Or jump into the 
tree at some point using getElementsByTagName.

There's nothing in there to allow me to do getElementsByExpression
(accepting an XPath search expression), or similarly pull out 
sections of the DOM tree using XPath expressions.

I've written basic utilities to do this, as have others I'm sure 
(XSLT engines must use something similar), but I'm curious as to when, or 
even whether, this type of feature is going to be added to the 
DOM API itself. 

It would seem to be pretty useful. In the applications I've built 
so far, I've not wanted to traverse or walk the tree, just pick 
out bits of it (and sure I could use SAX but I want the tree 
in memory because I'm manipulating it multiple times).

Unless I'm asking the wrong question - is there a tool that will 
search a DOM tree for me, assuming I supply it with an XPath 
expression?

Cheers,

L.

==================================================================
    "Never Do With More, What Can Be Achieved With Less"
				---William of Occam
==================================================================
Leigh Dodds                             Eml:  ldodds@ingenta.com
ingenta ltd                             Tel:  +44 1225 826619
BUCS Building, University of Bath       Fax:  +44 1225 826283

eclectic				http://weblogs.userland.com/eclectic
homepage				http://www.bath.ac.uk/~ccslrd
==================================================================




From kent.fitch at its.csiro.au  Wed Dec 15 05:45:06 1999
From: kent.fitch at its.csiro.au (Kent Fitch)
Date: Mon Jun  7 17:18:32 2004
Subject: Topic Maps and RDF
Message-ID: <00ff01bf46bf$859733c0$420a5398@cbr.its.csiro.au>

Topic Maps and RDF seem to my inexpert view to be
technologies providing alternative, maybe
overlapping paths to the "semantic web". 

Is RDF a candidate for representing Topic Map
type assertions about resources? 

Could someone across this area explain how Topic
Maps and RDF fit together, if at all?

Kent Fitch                           Ph: +61 2 6276 6711
ITS  CSIRO  Canberra  Australia      kent.fitch@its.csiro.au




From srn at techno.com  Wed Dec 15 08:30:16 1999
From: srn at techno.com (Steven R. Newcomb)
Date: Mon Jun  7 17:18:32 2004
Subject: Topic Maps and RDF
In-Reply-To: <00ff01bf46bf$859733c0$420a5398@cbr.its.csiro.au>
	(kent.fitch@its.csiro.au)
References: <00ff01bf46bf$859733c0$420a5398@cbr.its.csiro.au>
Message-ID: <199912150818.CAA00892@bruno.techno.com>

[Kent Fitch:]
> Is RDF a candidate for representing Topic Map
> type assertions about resources? 

This is a good question.  Like you, I have a hunch that it's possible.
If we assume it's true that topic maps can be
interchanged as RDF documents, it's also important to ask whether or
under what circumstances it would be worth the effort to use RDF.  I'd
very much like to know good technical and/or economic arguments in
favor of preferring an existing or proposed RDF-based syntax over the
hyperlinks ("extended" xlinks) already being used to interchange topic
maps.  Does anyone know any such arguments?

I attended Ora Lassila's excellent talk on RDF at XML 99, which made
the RDF phenomenon a lot clearer for me.  During the question period,
I asked from the audience whether the RDF model would be supportable
by using extended xlinks to express tuples and their arcs.  Ora said
that RDF is a set of abstract notions that exist at a higher level
than specific syntaxes.  He said that he would welcome an alternative
syntax based on xlink.

It's also interesting to note that one completely separable and
optional part of the Topic Maps architecture, called "facet link", is
an xlink whose semantic appears to be at least in some respects
indistinguishable from the tuple-and-arc semantic carried by RDF
tuples.  If facet links can be used to express RDF-based assertions
(and I don't yet know any reason why they can't), then there already
is an xlink-based syntax for RDF, defined by the Topic Maps standard.

-Steve

--
Steven R. Newcomb, President, TechnoTeacher, Inc.
srn@techno.com  http://www.techno.com  ftp.techno.com

voice: +1 972 517 7954  <<-- new phone number
fax    +1 972 517 4571  <<-- new fax number
pager (150 characters max): srn-page@techno.com

Suite 211               <<-- new address
7101 Chase Oaks Boulevard 
Plano, Texas 75025 USA



From wperry at fiduciary.com  Wed Dec 15 09:00:34 1999
From: wperry at fiduciary.com (W. E. Perry)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
References: <805C62F55FFAD1118D0800805FBB428D02BC01BF@cc20exch2.mobility.com>
Message-ID: <38574352.E4D0CEB8@fiduciary.com>

"Hunter, David" wrote:

> In my mind, this is not a bad thing.  As some people may or may not be
> saying on this thread, the word "purchase" means different things to
> different companies, so they are naturally going to write XML to define a
> "purchase" differently.  So the key word I see in that last sentence is
> "basically".  "Purchase" may mean <em>basically</em> the same thing to two
> different companies, but they won't mean <em>exactly</em> the same thing.

This quality is fundamental to the nature--and to the appeal--of XML. The great advantage of
XML as a data interchange mechanism is that it abstracts an understanding of data items and of
data structures--e.g. '<purchase>'--from their specific realization on either the originating
or the target system. The same is true of process, and with process that abstraction is even
more valuable. Ultimately, process is the reason for exchanging these XML data structures. You
are sending me an element tagged <purchase> because of the processing I will perform--the
commercial transaction I will execute--on that element. Not only might I define a different
structure than you within an element of that name, but I am most likely to perform a different
process on it than you can. That is, in fact, why you send me an instance of that element:
you have a commercial need for the processing which I perform upon such elements. That
processing is the expression of the role which I perform in the transaction. You send that
element to me (or I pull it from some neutral place where you have posted it) because I have
the ability to do something useful--some process--with it.

> <aside>
> As has been mentioned before, I think this is one of the areas where XSLT is
> really going to shine over the Internet.  Even when we start coming up with
> <em>standardized</em> ways of expressing things in XML, people don't have to
> use those standards for their own applications.  They can write XML in
> whatever way they want, in a way that makes sense for them, and transform it
> to another form of XML when they need to communicate with someone else.
> </aside>

Yes, this sort of transformation is necessary, but it should be handled by software, relying
on database tools to derive and to manipulate the different implicit schemata which can be
mined from the past correspondence with each other node. And, the transformation should be
done on the *receiving* node: only the receiving node knows exactly how it would like to
instantiate an element tagged <purchase> by a particular correspondent. Only the receiving
node defines the particular processing which it would like to initiate upon instantiation of
that element. And if in time the nature of that processing changes, that node's correspondents
will have no way to know that, nor to know how their <purchase> elements are now being
differently instantiated at that node.
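
A minimal sketch of transformation done on the receiving node, assuming the
standard javax.xml.transform API.  The stylesheet, the <order>/<sku> target
vocabulary, and the class and method names are hypothetical, chosen only to
show that the receiver, not the sender, decides how an incoming <purchase>
is instantiated:

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ReceiverTransformSketch {

    // Hypothetical stylesheet owned by the receiving node. It maps the
    // sender's <purchase> onto the receiver's own <order> structure and
    // ignores anything the receiver has no use for.
    static final String RECEIVER_XSLT =
          "<xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "<xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "<xsl:template match='/purchase'>"
        + "<order><sku><xsl:value-of select='item'/></sku></order>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    // Transforms one incoming document into the receiver's local form.
    static String toLocalForm(String incomingXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(RECEIVER_XSLT)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(incomingXml)),
                new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String incoming = "<purchase><item>widget</item><qty>2</qty></purchase>";
        System.out.println(toLocalForm(incoming));
    }
}
```

If the receiver's processing changes later, only RECEIVER_XSLT changes; the
sender keeps emitting the same <purchase> element.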

Respectfully,

Walter Perry




From Wdehora at cromwellmedia.co.uk  Wed Dec 15 09:17:11 1999
From: Wdehora at cromwellmedia.co.uk (Bill dehOra)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A5CA3E2@odin.cromwellmedia.co.uk>

 
   :  I started with the question: "Why isn't there namespaces
   :  in normal languages like English?"  A boring and divergent
   :  question.

I assume you mean 'natural languages'? Namespaces in natural languages would
be something like jargon, dialects or even idiolects.

   :  My answer was: massive fragmentation of society.
  
Like it's not fragged already.

   :  Is it really a 'good thing' to have namespaces in XML?  What
   :  ill effect will it have on XML's future?  Why can't the
   :  semantic of '<name>' be determined purely by context?  What
   :  is wrong with using just <html> to distinguish HTML's use of
   :  'a' tag?  Is the ability to inject attributes from other
   :  namespaces really useful?  What is the positive effect of
   :  having just one namespace?  Why can't we have central
   :  registry of XML names? 

The semantics of <name> can't be determined by context because at the
moment, computers aren't good at context (that's why we do it for them by
giving them artificial namespaces). You can of course get a person to read
the mark-up and apply context to it.

As for a central registry, things like namespaces are perhaps best left to
evolve openly.


Regards,

Bill de hOra 



From xmlstat at yahoo.com  Wed Dec 15 09:38:35 1999
From: xmlstat at yahoo.com (=?iso-8859-1?q?dubolc=20jean?=)
Date: Mon Jun  7 17:18:32 2004
Subject: No subject
Message-ID: <19991215093801.16350.qmail@web3103.mail.yahoo.com>


hi,

I am translating a UML chart into XML with the IBM
parser XML4J and I've got a problem:

How can a child element have two parents?

To make an element inherit from another, is the
solution to put an attribute "inheritance" in the
child with the value "yes" or "no"?

Thanks for your help.

Jean D


From xmlstat at yahoo.com  Wed Dec 15 09:48:58 1999
From: xmlstat at yahoo.com (=?iso-8859-1?q?dubolc=20jean?=)
Date: Mon Jun  7 17:18:32 2004
Subject: Inheritance in Xml, (Xpointer??)
Message-ID: <19991215094824.24656.qmail@web3105.mail.yahoo.com>


 hi,
 
 I am translating a UML chart into XML with the IBM
 parser XML4J and I've got a problem:
 
 How can a child element have two parents?
 
 To make an element inherit from another, is the
 solution to put an attribute "inheritance" in the
 child with the value "yes" or "no"?
 
 Thanks for your help.
 
 Jean D
 


From msabin at cromwellmedia.co.uk  Wed Dec 15 10:04:31 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:32 2004
Subject: Musing over Namespaces
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A67518D@odin.cromwellmedia.co.uk>

Don Park wrote,
> I moved on to more interesting question: "What if English
> language used namespaces?"  There are several ways to
> interpret this question but the most interesting is this:
> "What is the social impact of using namespaces in English
> or any other spoken/written language?"
>
> My answer was: massive fragmentation of society.

But we don't have to _imagine_ different natural language
'namespaces' ... in the sense relevant to a comparison with
XML namespaces we've had that since year dot. Not everybody 
speaks English as a first (or even second) language. Within
the thousands of distinct natural languages there are many
different national, regional and cultural variations. There are 
also innumerable specialized technical and discipline specific 
vocabularies that get mixed in with ordinary language in various 
contexts.

And your gloomy conclusions don't hold (or, at least, don't hold 
in a way that is directly attributable to language differences).
Granted, language difference might contribute in a small way to
social fragmentation ... but I'd be very surprised if it was
a particularly significant factor in many cases.

Admittedly this situation means that there's plenty of work
for language schools, translators, dictionary compilers and the
like ... but is that really so bad? 

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From rwaldin at pacbell.net  Wed Dec 15 10:07:33 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:32 2004
Subject: Xpath and DOM
References: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk> <m3puw95t9q.fsf@ifi.uio.no>
Message-ID: <385768FD.CC678E4C@pacbell.net>

I believe there are a few gotchas with any XPath/XSLT to "pure" DOM L1
mapping, no matter what language or implementation.  Here are a few I've run
into:

1) DOM L1 lacks "real" support for namespace-to-prefix mapping, so you have
to know the exact prefixes used in the documents you process, whereas with
XPath via XSLT namespace prefixes are automagically mapped for you.  
2) no support for the XPath namespace axis so you cannot query or select
based on namespace
3) no support for the XSLT document() function.  This would be a useful DOM
extension to XPath, but DOM L1 (or L2 for that matter) has no mechanism for
opening a document.
4) DOM does not support ID/IDREFs as it has no concept of attribute types
(this is a function of the parser, not the DOM) which means there's no way
to map the XPath id() function to a pure DOM implementation.  Another
WIBN!  The Sun parser (ProjectX TR2) has an extension function which
supports ID/IDREFs (XmlDocument.getElementsById() I think), but even this
support is limited to elements which are parsed as opposed to elements
which you construct using the DOM Document.createElement() API. 
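
To illustrate point 1, here is a minimal sketch of prefix-independent
selection.  It assumes a namespace-aware JAXP parser and the javax.xml.xpath
API, both of which go beyond the DOM L1 interfaces discussed above; the
namespace URI, the prefix "o" and the class name are made up for the example:

```java
import java.io.StringReader;
import java.util.Collections;
import java.util.Iterator;
import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathPrefixSketch {

    // Hypothetical namespace URI, used only for this illustration.
    static final String NS = "http://example.org/ns/order";

    // Counts <item> children of <order> in the NS namespace, whatever
    // prefix (if any) the document itself uses for that namespace.
    static int countItems(String xml) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true); // otherwise nodes carry no namespace URI
            Document doc = dbf.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            XPath xpath = XPathFactory.newInstance().newXPath();
            // Bind the prefix used in the *expression* ("o") to the URI.
            xpath.setNamespaceContext(new NamespaceContext() {
                public String getNamespaceURI(String prefix) {
                    return "o".equals(prefix) ? NS : XMLConstants.NULL_NS_URI;
                }
                public String getPrefix(String uri) {
                    return NS.equals(uri) ? "o" : null;
                }
                public Iterator<String> getPrefixes(String uri) {
                    return NS.equals(uri)
                        ? Collections.singletonList("o").iterator()
                        : Collections.<String>emptyList().iterator();
                }
            });

            NodeList hits = (NodeList) xpath.evaluate(
                "/o:order/o:item", doc, XPathConstants.NODESET);
            return hits.getLength();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        String prefixed = "<a:order xmlns:a='" + NS + "'><a:item/><a:item/></a:order>";
        String defaulted = "<order xmlns='" + NS + "'><item/><item/></order>";
        System.out.println(countItems(prefixed));
        System.out.println(countItems(defaulted));
    }
}
```

The same expression matches both documents because the comparison is made on
the namespace URI, not on the prefix each document happens to use.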

Anyone know of any others?  

-Ray



From rwaldin at pacbell.net  Wed Dec 15 10:58:58 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:32 2004
Subject: multiple parents
References: <19991215093801.16350.qmail@web3103.mail.yahoo.com>
Message-ID: <38577509.CAD2CC0E@pacbell.net>

dubolc jean wrote:
> I am translating a UML chart into XML with the IBM
> parser XML4J and I've got a problem:
> 
> How can a child element have two parents?
> 
> To make an element inherit from another, is the
> solution to put an attribute "inheritance" in the
> child with the value "yes" or "no"?

If you're looking for XML Element type inheritance, there is not yet any
such mechanism, except for archetype refinement as (not yet really) defined
in XML Schemas (http://www.w3.org/TR/xmlschema-1).  

If you're just trying to represent abstract parent-child relationships in
your xml you can use the NMTOKENS (or IDREFS) attribute type as in:

<uml.chart>
  <element name="element1">...</element>
  <element name="element2">...</element>
  <element name="element3" parents="element1 element2">...</element>
</uml.chart>

or you can use separate elements, as in:

<uml.chart>
  <element name="element1">...</element>
  <element name="element2">...</element>
  <element name="element3">
    <parents>
      <parent name="element1"/>
      <parent name="element2"/>
    </parents>
    ...
  </element>
</uml.chart>
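
On the reading side, the attribute-based form can be resolved with a few
lines of DOM code.  This is only a sketch, assuming the <element> and
"parents" names from the first example above and a standard JAXP parser;
the class and method names are hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class MultipleParentsSketch {

    // Returns element name -> list of parent names, read from the
    // whitespace-separated "parents" attribute of each <element>.
    static Map<String, List<String>> readParents(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

            Map<String, List<String>> result = new LinkedHashMap<String, List<String>>();
            NodeList elements = doc.getElementsByTagName("element");
            for (int i = 0; i < elements.getLength(); i++) {
                Element e = (Element) elements.item(i);
                // getAttribute returns "" when the attribute is absent.
                String raw = e.getAttribute("parents").trim();
                List<String> parents = raw.isEmpty()
                    ? new ArrayList<String>()
                    : Arrays.asList(raw.split("\\s+"));
                result.put(e.getAttribute("name"), parents);
            }
            return result;
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        String xml =
              "<uml.chart>"
            + "<element name='element1'/>"
            + "<element name='element2'/>"
            + "<element name='element3' parents='element1 element2'/>"
            + "</uml.chart>";
        System.out.println(readParents(xml));
    }
}
```

Nothing here checks that a listed parent actually exists; declaring the
attribute as IDREFS in a DTD and validating would add that check.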

Good luck...

-Ray



From James.Anderson at mecomnet.de  Wed Dec 15 11:34:29 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
References: <m37lihb4f2.fsf@localhost.localdomain>
		<Pine.LNX.4.10.9912140447070.28872-100000@cauchy.clarkevans.com> <14422.53643.708848.3956@localhost.localdomain>
Message-ID: <38577DC2.11E09C68@mecomnet.de>

There are two sides to "identity". One is "what the symbol 'means'". This
requirement is apparent in Mr. Park's initial musings and in follow-ups. An
additional, though somewhat neglected, requirement is that of "dynamic
uniqueness": that is, that two parties can guarantee the uniqueness of a name
without negotiation with others. Namespaces fulfill this second requirement as well.

While it is true that a "fixup cycle" is entailed by the present namespace
standard in order to reconcile names, this is an omission of the present
formulation and is not inherent in namespaces. 

David Megginson wrote:
> ...
> Personally, I'd rather see a world with
> 
>   {http://www.sun.com/ns/}purchase
>   {http://www.ibm.com/ns/}purchase
>   {http://www.amazon.com/ns/}purchase

while i appreciate the presence of

    {http://www.net/ns/<media-access-code>/<process-id>/}purchase
> 
> than a world with
> ...




From david at megginson.com  Wed Dec 15 11:42:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
In-Reply-To: <Pine.LNX.4.10.9912141055440.28872-100000@cauchy.clarkevans.com>
References: <m34sdkc2qo.fsf@localhost.localdomain>
	<Pine.LNX.4.10.9912141055440.28872-100000@cauchy.clarkevans.com>
Message-ID: <14423.32395.140071.750744@localhost.localdomain>

Clark C. Evans writes:

 > Are namespaces just a modern way to handle inter-woven document
 > type definitions?

In themselves, Namespaces are pretty much exactly equivalent to Perl
or Java packages and nothing more; they simply allow two things:

1. the ability to identify a name unambiguously in any context; and
2. the ability to avoid naming collisions.
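
Both properties can be seen in a few lines of Java.  This is a sketch only,
using the javax.xml.namespace.QName class from later JDKs (it did not exist
when this was written) and the {URI}local notation used elsewhere in this
thread; the class and method names are made up:

```java
import javax.xml.namespace.QName;

class QualifiedNameSketch {

    // Two names are the same only if both the namespace URI and the
    // local part match; the input strings use {URI}local notation.
    static boolean sameName(String a, String b) {
        return QName.valueOf(a).equals(QName.valueOf(b));
    }

    // Returns the local part of a name written in {URI}local notation.
    static String localPart(String name) {
        return QName.valueOf(name).getLocalPart();
    }

    public static void main(String[] args) {
        String sun = "{http://www.sun.com/ns/}purchase";
        String ibm = "{http://www.ibm.com/ns/}purchase";

        // Same local part, different namespaces: no collision.
        System.out.println(localPart(sun).equals(localPart(ibm)));
        System.out.println(sameName(sun, ibm));

        // The qualified form identifies the name in any context.
        System.out.println(sameName(sun, "{http://www.sun.com/ns/}purchase"));
    }
}
```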

They promise very little, then, but they certainly can form a solid
foundation for higher-order standardization efforts, like
document-type component reuse.

 > BTW, what ever happened to architectural forms
 > and how is this related? 

They can be used together.  AFs are pretty much dead now, but there's 
much of value in them, and it may turn out that we need to resurrect
them in a year or two.

 > > Yes, and that's the natural *second* stage of standards development:
 > > first you allow innovation, then you figure out what to standardize.
 > > So you start with
 > > 
 > >   {http://www.sun.com/ns/}purchase
 > >   {http://www.ibm.com/ns/}purchase
 > >   {http://www.amazon.com/ns/}purchase
 > > 
 > > and then you bring everyone together for a while, bash heads, and hope 
 > > that you end up with
 > > 
 > >   {http://www.ecommerce-coop.org/ns/}purchase
 > > 
 > > It's messy, but it's the only standards path that really seems to
 > > work.  At least with Namespaces we can remove 50% of the messiness
 > > (there's no chance of confusing different parties' extensions) on the
 > > way to standards Nirvana.
 > 
 > This sounds reasonable; where the ecommerce-coop.org 
 > acts like the "OED" previously mentioned...

Except that there's not a single one for everything -- it's
unreasonably difficult to maintain something like that.

 > It'd still be nice to have a single database with 
 > everyone's namespace definitions in one place though...
 > perhaps even a DTD to help describe them.  I'm sure 
 > there are organizations doing this... are there?

OASIS and Biztalk are both pushing their own schema repositories, and
there are other, less visible ones, but in the end that's not really
the way things work -- a single, central registry would be
unreasonably difficult to maintain and inflexible.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rwaldin at pacbell.net  Wed Dec 15 12:39:17 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
References: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <38578C94.3DC75071@pacbell.net>

<my:rambling.thoughts.on.the.subject
xmlns:my="http://www.waldin.net/silly-example">

Don Park wrote:
> I started with the question: "Why isn't there namespaces
> in normal languages like English?"  A boring and divergent
> question.

I think your analogy is inaccurate: that XML is equivalent to one specific
human language and that XML namespaces are equivalent to variations of that
one specific human language.  After all, isn't the English language just
one amongst all human languages?  Aren't there hundreds of human languages,
or human namespaces?  And is human language at risk of becoming useless
because of the massive fragmentation?  My interpretation of a namespace is
that a namespace *is* a language.  

With or without a namespace specification or namespace identifiers or
namespace prefixes, XML namespaces exist.  By the very nature of
application development, there will always be an XML application that
defines elements that are not understood by other XML applications developed
in a different time, at a different place, for a different purpose. 
Formally defined namespaces are only a manifestation of this and allow
software to *at least* not misunderstand something, and *at most*
understand it.

My ability (very limited compared to others) to understand sentences
composed of multiple human languages is only possible given the
capabilities (advanced compared to software, yet very limited compared to
other humans) of my human language processor (my brain).  If someone were
to say to me "XML is a de facto standard", or "Yo quiero Taco Bell", I
would have no trouble understanding them.  However, with "Ceci ist non
Ingleses", I would first have to determine what languages the words belong
to.  If someone were to say "french:Ceci german:ist italian:non
portuguese:Ingleses" I would at least have a chance of understanding this
sentence a little better (with the appropriate translators, anyway)?  

> I moved on to more interesting question: "What if English
> language used namespaces?"  There are several ways to
> interpret this question but the most interesting is this:
> "What is the social impact of using namespaces in English
> or any other spoken/written language?"
> 
> My answer was: massive fragmentation of society.

You mean, massive fragmentation of English speaking society.  There already
is a massive fragmentation of society at large, and I don't think anyone is
rushing to fix that problem (except the Esperanto hangers-on, that is :)). 
In fact many people are rushing to accommodate it by regionalizing their
products, learning foreign languages, etc.  

There is even some fragmentation of English speaking society, to the extent
that you should be careful whom you ask for a "fag" or a "shag".  The
result could be quite dramatic if not downright painful.  :)  Again,
namespaces can help us avoid such pain in the world of XML.

> I have a feeling that the answer to above question is
> important to understanding the impact of namespaces in XML.
>
> Here are some of the side questions I asked myself:
> 
> Is it really a 'good thing' to have namespaces in XML?  What
> ill effect will it have on XML's future?  Why can't the
> semantic of '<name>' be determined purely by context?  

I think people have a hard enough time with this, let alone software.  What
do I mean by book-review/name?  Is this the name of the book or the name of
the book review or the name of the book author or the name of the book
reviewer?  

What is variable/name?  Is it the name of the variable or does it signify
that the variable holds a name?  Maybe this is a name which varies?  Even
if *we* can agree on this, I doubt that we can do so for most meaningful
contextual combinations, and even then it would be near impossible to get
every XML application author to agree to our definition.

> What
> is wrong with using just <html> to distinguish HTML's use of
> 'a' tag?  

The real question is: is an html/body/dl/dt/em/a the same as an html/a? 
Does html's use of these element names preclude their use as "context"
identifiers in other languages?  This is way too limiting.  I can no longer
use any of html's element names in my language to specify contexts that are
outside of html within html!  

> Is the ability to inject attributes from other
> namespaces really useful?  

To the extent that your xml application looks for and understands (or
avoids misunderstanding) them, yes.

> What is the positive effect of
> having just one namespace?  Why can't we have central
> registry of XML names?

Like networksolutions.com... another website where I would have to go to
find unused "context" element names...  Xperanto.com!

> What do you all think?  I would be very much interested in
> your answers.

I think there is something to be said for common languages, just not one
universal language.  In order to be useful, XML must not move to either
extreme.  On the one hand you have the Tower of Babble where everyone is
eternally isolated in their own little namespace and on the other you have
Esperanto which requires universal cooperation and has no chance of ever
succeeding.  In the middle you have useful, but not universal,
cooperation.  Namespaces are needed, and many will use them, some will even
abuse them, but to their own detriment...  Those that see the advantage in
not reinventing <wheel/> just for the sake of reinventing it will benefit
most.

-Ray  
</my:rambling.thoughts.on.the.subject>



From david at megginson.com  Wed Dec 15 14:56:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: NSUtils.java
Message-ID: <14423.44017.257984.862867@localhost.localdomain>

I'm attaching a copy of the Java source code for the (short) NSUtils
class that I described in the last message.  I'd be very grateful if
the Java specialists on the list could look this over, paying special
attention to synchronization problems.
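
For comparison while reviewing, here is a minimal sketch of one way the
split cache could be made thread-safe.  It assumes a much later JDK
(java.util.concurrent.ConcurrentHashMap and its computeIfAbsent method did
not exist in 1999), it is not the attached class, and the class name is
hypothetical.  It also leaves out the counter-based flush:

```java
import java.util.concurrent.ConcurrentHashMap;

class SplitCacheSketch {

    // Shared cache: qualified name -> {URI part or null, local part}.
    // ConcurrentHashMap makes lookup-or-compute atomic per key, so no
    // explicit locking is needed around the get/put pair.
    private static final ConcurrentHashMap<String, String[]> CACHE =
        new ConcurrentHashMap<String, String[]>();

    // Splits a name in {URI}local format, caching the result.
    static String[] splitName(String name) {
        return CACHE.computeIfAbsent(name, SplitCacheSketch::split);
    }

    // The uncached split; called at most once per distinct name.
    private static String[] split(String name) {
        if (name.length() > 0 && name.charAt(0) == '{') {
            int end = name.indexOf('}');
            if (end == -1) {
                throw new IllegalArgumentException(
                    "Malformed Namespace name: " + name);
            }
            return new String[] { name.substring(1, end), name.substring(end + 1) };
        }
        return new String[] { null, name };
    }

    public static void main(String[] args) {
        String[] parts = splitName("{http://www.sun.com/ns/}purchase");
        System.out.println(parts[0] + " / " + parts[1]);
    }
}
```

The cached arrays are shared between callers, so they must be treated as
read-only, and the cache grows without bound unless a flush is added.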

-------------- next part --------------
// NSUtils.java - utilities for dealing with Namespaces
// Copyright (c) 1999 by Megginson Technologies Ltd.
// Free redistribution permitted.

// $Id: NSUtils.java,v 1.1 1999/12/15 14:44:06 david Exp $

package org.xml.sax.helpers;

import java.util.Hashtable;


/**
 * Utilities for splitting and joining Namespace-qualified names.
 *
 * This class contains only static methods, and may not be instantiated.
 * The methods in this class allow applications to split and
 * rejoin Namespace-qualified names in the format {URI}localpart.  The
 * static methods use tables to cache the result of each split or
 * join, so that applications can avoid repeating expensive Java
 * String operations.
 *
 * @see org.xml.sax.DocumentHandler
 * @see org.xml.sax.NamespaceHandler
 */
public final class NSUtils
{


    ////////////////////////////////////////////////////////////////////
    // Private constructor to avoid instantiation.
    ////////////////////////////////////////////////////////////////////

    private NSUtils ()
    {
    }


    ////////////////////////////////////////////////////////////////////
    // Static tables.
    ////////////////////////////////////////////////////////////////////

				// Table counter.  Once this hits
				// COUNTER_MAX, flush all of the tables
				// and start over.
    private static int counter = 0;
    private static final int COUNTER_MAX = 1024;

				// Singleton instance of this class.
    private static NSUtils nsUtils;

				// A reusable qname for searching.
    private static QName qName;

				// Internal hash tables for caching
				// expensive string operations.
    private static Hashtable splitNameTable;
    private static Hashtable joinNameTable;


				// Initialize the tables when the
				// class is loaded.
    static {
	nsUtils = new NSUtils();
	qName = nsUtils.makeQName(null, null);
	resetTables();
    }


    ////////////////////////////////////////////////////////////////////
    // Public static utility methods.
    ////////////////////////////////////////////////////////////////////

    
    /**
     * Test whether a name is qualified with a Namespace URI.
     *
     * The name must appear in {URI}local format.
     *
     * @param name The name to test.
     * @return true if the name has a URI part, false otherwise.
     */
    public static boolean isQualified (String name)
    {
				// Guard against the empty string, for
				// which charAt(0) would throw.
	return (name.length() > 0 && name.charAt(0) == '{');
    }


    /**
     * Split a name into its URI part and its local part.
     *
     * <p>Both the URI and local parts will be internalized.  This
     * function is extremely efficient: it uses a hash
     * table to keep track of names it has already split, and
     * won't do the same work twice.  Since the same element and
     * attribute names tend to appear over and over in an XML
     * document, this method will not usually incur the
     * overhead of Java String processing.</p>
     *
     * @param name The qualified name in {URI}local format.
     * @return An array containing the URI part as the first
     *         member (or null if there is no URI part) and the
     *         local part as the second member.
     * @see java.lang.String#intern
     */
    public static synchronized String [] splitName (String name)
    {
				// The class lock covers the lookup, the
				// counter update and any table reset, so
				// another thread cannot swap the table
				// between the get and the put.
	String parts[] = (String [])splitNameTable.get(name);

	if (parts == null) {
	    parts = new String[2];
	    if (name.length() > 0 && name.charAt(0) == '{') {
		int endPos = name.indexOf('}');
		if (endPos == -1) {
		    throw new
			RuntimeException("Malformed Namespace name: " + name);
		}
		parts[0] = (name.substring(1, endPos)).intern();
		parts[1] = (name.substring(endPos+1)).intern();
	    } else {
		parts[0] = null;
		parts[1] = name.intern();
	    }
	    incrCounter();
	    splitNameTable.put(name, parts);
	}
	return parts;
    }


    /**
     * Join a URI part and a local part into a single qualified name.
     *
     * <p>The name will be merged into a single string in
     * "{URI}local" format, and the merged string will be
     * internalized.</p>
     *
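     * @param uriPart The URI part of the name, or null if the
     *        name is unqualified.
     * @param localPart The local part of the name.
     * @return The joined name in {URI}local format, or the
     *         internalized local part alone if uriPart is null.
     * @see java.lang.String#intern
     */
    public static synchronized String joinName (String uriPart, String localPart)
    {
				// An unqualified name has no URI part
				// (see splitName), so there is nothing
				// to join.
	if (uriPart == null) {
	    return localPart.intern();
	}

				// qName is shared static state; the class
				// lock keeps other threads from
				// overwriting it during the lookup.
	qName.uri = uriPart;
	qName.local = localPart;
	String name = (String)joinNameTable.get(qName);

	if (name == null) {
	    name = ("{" + uriPart + '}' + localPart).intern();
	    incrCounter();
	    joinNameTable.put(nsUtils.makeQName(uriPart.intern(),
						localPart.intern()),
			      name);
	}
	return name;
    }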


    ////////////////////////////////////////////////////////////////////
    // Internal methods.
    ////////////////////////////////////////////////////////////////////

    private QName makeQName (String uri, String local)
    {
	return new QName(uri, local);
    }


    ////////////////////////////////////////////////////////////////////
    // Internal static methods.
    ////////////////////////////////////////////////////////////////////

    /**
     * Reset the internal cache.
     */
    private static void resetTables ()
    {
	splitNameTable = new Hashtable();
	joinNameTable = new Hashtable();
    }


    /**
     * Increment the counter, and reset tables at COUNTER_MAX.
     */
    private static void incrCounter ()
    {
	if (++counter == COUNTER_MAX) {
	    counter = 0;
	    resetTables();
	}
    }


    /**
     * Inner class for a split, qualified name.
     *
     * This class represents a two-part, Namespace-qualified
     * name for the sake of hashing.
     */
    class QName
    {
	QName (String uri, String local)
	{
	    this.uri = uri;
	    this.local = local;
	}

	public boolean equals (Object o)
	{
	    if (o instanceof QName) {
		String uri2 = ((QName)o).uri;
		String local2 = ((QName)o).local;
		if (uri == uri2 && local == local2) {
		    return true;
		} else if (uri == null && uri2 == null) {
		    return local.equals(local2);
		} else if (uri == null) {
		    return false;
		} else {
		    return uri.equals(uri2) && local.equals(local2);
		}
	    } else {
		return false;
	    }
	}

	public int hashCode ()
	{
	    int hash = 0;
	    if (uri != null) {
		hash += uri.hashCode();
	    }
	    if (local != null) {
		hash += local.hashCode();
	    }
	    return hash;
	}

	String uri;
	String local;
    }

}

// end of NamespaceUtils.java
-------------- next part --------------


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/
From david at megginson.com  Wed Dec 15 14:55:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <14423.43924.116647.613307@localhost.localdomain>

OK, back to SAX2 for now.  I'm doing some serious projects with RDF
and Namespaces right now, so I've done a lot of thinking about how we
can make SAX2 Namespace processing both efficient and
backwards-compatible.

I'm pretty sure that the best choice is to use James Clark's
{URI}localpart notation for Namespace-qualified names, so that an
XHTML <p> element (for example) will be reported as
"{http://www.w3.org/1999/xhtml}p".

Unfortunately, that creates some potential inefficiencies, especially
for Java, which is painfully slow at string processing (compared to
C/C++).  To work around this problem, I've designed a new SAX2 helper
class, NSUtils, with the following static methods:

  public static boolean isQualified (String name)
  public static String [] splitName (String name)
  public static String joinName (String uri, String local)

The first of these is very simple -- it just checks whether the first
character is '{' (as it always must be for a qualified name).  The
other two, however, use static hashtables to cache their work, so that 
they're pretty efficient to call over and over again.

For example, the first time you call

  splitName("{http://www.w3.org/1999/xhtml}p")

the method will use java.lang.String.indexOf and
java.lang.String.substring to pick out the URI part
"http://www.w3.org/1999/xhtml" and the local part "p" and will return
them as a two-member String array, which it will also store in a
Hashtable.

The next time (or 1,000 times) you call

  splitName("{http://www.w3.org/1999/xhtml}p")

the method will find the string already in the hash table and will
return the same two-member array that it returned last time (or should
it be a copy?  I wish Java had const) without repeating any of the
expensive string operations.

I use a similar approach for joinName(), which makes writing a
NamespaceFilter extremely efficient.
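
A minimal sketch of the caching idea described above, not the NSUtils
source itself. It assumes a JDK that ships java.util.concurrent (Java 8 or
later); the class name SplitNameCache and the choice to hand back a copy of
the cached array are assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: memoizes the split of "{URI}local" names.
final class SplitNameCache {
    // Maps a qualified name to its [uri, local] parts.
    private static final Map<String, String[]> CACHE = new ConcurrentHashMap<>();

    static String[] splitName(String name) {
        // The expensive parse runs only the first time a name is seen.
        String[] parts = CACHE.computeIfAbsent(name, SplitNameCache::parse);
        // Return a copy so a caller cannot corrupt the cached array.
        return parts.clone();
    }

    private static String[] parse(String name) {
        if (!name.isEmpty() && name.charAt(0) == '{') {
            int end = name.indexOf('}');
            if (end == -1) {
                throw new IllegalArgumentException("Malformed Namespace name: " + name);
            }
            return new String[] { name.substring(1, end), name.substring(end + 1) };
        }
        // Unqualified name: no URI part.
        return new String[] { null, name };
    }
}
```

Returning a clone costs one small allocation per call; returning the cached
array itself avoids that but lets any caller overwrite the shared parts.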

Does this sound like a reasonable approach to the Java-heads out
there?  I'll send the source out in a separate message, since it's
only three screens or so.


Thanks, and all the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From martind at netfolder.com  Wed Dec 15 14:57:04 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:33 2004
Subject: Topic Maps and RDF
In-Reply-To: <00ff01bf46bf$859733c0$420a5398@cbr.its.csiro.au>
Message-ID: <NBBBJPGDLPIHJGEHAKBAMEKAEJAA.martind@netfolder.com>

Hi Kent,

Kent said:
Could someone across this area explain how Topic
Maps and RDF fit together, if at all?

Didier replies:
This is something we have been trying to resolve for a year ;-). Here is
what we discovered.

a) topics are links (links governed by the HyTime architecture). We learned
that these links can easily be mapped to xlink:type="extended" links. So
far, so good.

b) RDF descriptions are, simply said, records about a resource. An RDF
record can also be perceived as a property collection or as a property set
(not to be confused with SGML property sets).

c) both xlink and rdf behave like an architecture (I said, behave like). We
can have our own tags and include the "architecture" attributes in it, so
that, theoretically (or practically in the case of an SGML processor) an
Xlink or RDF processor can recognize and process the elements as xlink or
rdf elements. So far, so good.

The example below is a folder object. This object is a topic and, as such,
it contains occurrences or links. It would obviously be useful to have not
just links but also information about the resource each link points to,
wouldn't it? For example, who the author is, which language the resource is
written in, and so on and so forth.

That way, in addition to the object's location, we also have some more
information about the object, and this information is "at the right place",
that is, contained near the "location" reference. This is the principle of
locality, which, as we all know, improves the readability of the content.
(Have you ever used a listing where you have to move back and forth between
the beginning of the document and the end of it, particularly when the
document has more than 50 pages? Remember how funny it was :-)).

So, the example below is XMLized and uses name spaces. Why? Simply because
we talked about name spaces this week :-). No, just kidding, it also helps
the interpreter to retrieve its eggs.

Example:
--------
<tm:topic scope="public">
<tm:topname>
  <tm:basename>Name spaces</tm:basename>
  <tm:dispname>Name spaces</tm:dispname>
</tm:topname>
<tm:occurs xlink:type="extended">
  <tm:loc xlink:type="locator"
rdf:about="http://mydomain.com/HotTopic/name_space.htm"
xlink:href="http://mydomain.com/HotTopic/name_space.htm">
   <tns:name>Name_space</tns:name>
   <tns:creator>code runner, bip bip!</tns:creator>
   <tns:date>today</tns:date>
  </tm:loc>
</tm:occurs>
</tm:topic>

Let's explain a bit what it is, and the actual limits of merging this
"poutine" (for those who do not know what this is, it is a strange mixture
of French fries and cheese :-)).

a) the master element is the tm:topic element. When I said that the topic
element is a link, I should have said instead that it "contains an extended
link" once we adapt the topic map element to the XML world. Thus, the scope
attribute is a topic attribute. An XML parser can now hand the tm:topic
element to a topic map interpreter, which recognizes it as a topic. Because
the XML world does not have any architectural form recommendation, a W3C
compliant parser cannot make the right substitution, so the interpreter
needs a clear way to know that this is a topic. Thus, we have to help this
poor guy by being explicit and telling it that this is a topic element (if
we want it to do its job). So far, so good. We discovered that architectural
forms cannot be applied when we deal with XML parsers that have never heard
about this feature (I have been told by a good friend of mine that these XML
parsers consider architectural forms snobbish, but this is between you and
me; do not repeat this to "SP", he does not feel well these days :-))

b) We then have a name element which contains as sub elements several
flavors of naming. Enough to make Ben and Jerry jealous :-)

c) This is followed by the occurs element, itself the real stuff: the link
(with background music of "Also sprach Zarathustra"). It is a one-to-many
link, i.e. a link pointing to multiple resources. Here comes the xlink
"extended" kind of link, which contains locations. Again, to help our
friend, the topic map interpreter, we used a keyword which is part of the
tm vocabulary (tm:occurs). Now the topic map parser knows that this is a
topic occurrence set. The topic map interpreter expects to retrieve location
elements as content.

Some sophisticated interpreters may do the following:
 1 - transform what the parser gave into an internal structure or use the
structure provided by the parser (i.e. the DOM)

 2 - Then, have different "element handlers" recognize what an element is.
So, in our case, tm:occurs helps our friend (the topic map interpreter) to
recognize a topic map occurs element here, and it will then expect
locations to be contained in it. The element could also be given as food to
the xlink interpreter. The "xlink:type" attribute is a keyword recognized
by the xlink interpreter and thus indicates to it that this is a
one-to-many kind of link that will contain one or more locators. Thus, both
the topic map and xlink interpreters have expectations about what should be
coming next.

 3 - Then, we encounter a loc element. Hummm, we have some guys here that
have expectations about what this element should be. The xlink interpreter
is anxiously waiting for a "locator", the topic map interpreter for a
"location"; the latter may simply decide that linkage is after all a matter
to be treated by the xlink interpreter and will simply let it do the job.
With these two guys already satisfied, the element is given as food to
other interpreters, just in case... Bang! (a big gong sound) the rdf
interpreter recognizes the "rdf:about" attribute and prepares itself to
build a property set for the resource identified by the "about" value. The
rdf interpreter expects properties to be contained in this element.
Finally, the "xlink:href" attribute is what the xlink interpreter expected:
a resource reference. The xlink interpreter can do something with it, like
store it, or store information about the kind of behavior we expect from it
if some more attributes are included in the locator element. But wait a
minute here! We have the same value twice: (a) for the rdf:about attribute
and (b) for the xlink:href attribute. Hummm, I guess these guys will go to
meet the judge to know how to resolve this dispute about the resource
location. One of these guys says that to be an rdf description "about" this
resource, you need to provide the resource URI as a value. And the other
party repeats that to be able to resolve the link, it needs the URI too.
And I have been summoned to be part of the jury. Both have a good story to
tell. Hummm, hard to decide.

And here I am, sitting on my jury chair, trying to figure out why these two
guys do not talk to each other and share the same resource pointer.

So here we are: yes, we all have the intuition that these two worlds could
be merged and that it makes sense to merge them, but when you are seated on
a jury's chair, things are not so easy.

Have a good day (And please do not tell SP what the other parsers think
about architectural forms; I heard that he feels lonely these days since all
his SGML friends moved to XML, be kind to him :-))

Note: All resemblance between characters and real persons is pure
coincidence and the producers may not be held as responsible for such
resemblance :-))

PS: I am writing about this topic and issues and will try to do my jury's
job as well as possible.

Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Next conference: Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products: http://www.netfolder.com




From ksievers at novell.com  Wed Dec 15 16:15:46 1999
From: ksievers at novell.com (Kent Sievers)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
Message-ID: <s8575a8e.043@orm-mail20.orem.novell.com>

In my opinion, namespaces are not only important, they are critical to XML.

In my opinion, we do not need a central registry, I am perfectly happy riding the coat-tails of URIs or any other existing dis-ambiguating mechanism.

But I still puzzle over how namespaces got implemented in XML.  It seems so absurd to me.

>>> "Don Park" <donpark@docuverse.com> 12/14/99 12:37PM >>>
I am in the habit of musing over what seems obvious to
most people.  My latest musing was over XML Namespaces.

I started with the question: "Why isn't there namespaces
in normal languages like English?"  A boring and divergent
question.

I moved on to a more interesting question: "What if the English
language used namespaces?"  There are several ways to
interpret this question but the most interesting is this:
"What is the social impact of using namespaces in English
or any other spoken/written language?"

My answer was: massive fragmentation of society.

I have a feeling that the answer to the above question is
important to understanding the impact of namespaces in XML.

Here are some of the side questions I asked myself:

Is it really a 'good thing' to have namespaces in XML?  What
ill effect will it have on XML's future?  Why can't the
semantic of '<name>' be determined purely by context?  What
is wrong with using just <html> to distinguish HTML's use of
'a' tag?  Is the ability to inject attributes from other
namespaces really useful?  What is the positive effect of
having just one namespace?  Why can't we have a central
registry of XML names?

What do you all think?  I would be very much interested in
your answers.

Best,

Don Park    -   mailto:donpark@docuverse.com 
Docuverse   -   http://www.docuverse.com




From donpark at docuverse.com  Wed Dec 15 16:49:19 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
In-Reply-To: <38578C94.3DC75071@pacbell.net>
Message-ID: <001801bf471c$7940a7c0$d1940e18@smateo1.sfba.home.com>

My thanks to everyone who contributed to this thread.

Some of you thought I was proposing the removal of namespaces.
I am not.  What I am doing is trying to see what lies
beyond our technical concerns.  I do not accept that
the current standardization practice is the best one,
just better than before, and I believe we focus too
much on logic and not enough on the big picture, the
one that astronauts see when they look at Earth from
outer space.

I think the real question I am asking is: What if we
were coerced into proactive standardization?  What
do we lose and what do we gain?

The registry/dictionary I mentioned is a form of
coercion.  An appropriate amount of anarchy can be
introduced with a distributed architecture similar
to DNS.  XML-DEV old timers might recognize that
this distributed name registry network is similar
to the TagNet idea I wrote briefly about two years
ago.  Looks like I am going around in circles. <g>

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From david at megginson.com  Wed Dec 15 17:05:03 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX for WAP binary XML
In-Reply-To: Stefan Haustein's message of "Tue, 14 Dec 1999 22:17:51 +0100"
References: <3856B3FF.A47F6C5C@trantor.de>
Message-ID: <m3u2lki19w.fsf@localhost.localdomain>

Stefan Haustein <stefan.haustein@trantor.de> writes:

> - WBXML adds some proprietary extensions. For that reason, I added an
>   interface "WapExtensionHandler". Can that interface or a similar one
>   be included in org.xml.sax.wap? Who cares about extensions of
>   org.xml.sax? Unfortunately, the ugly extensions cannot be ignored
>   since they are already used in the WML definitions.

The extensions should go in a different package, such as org.wap.sax.

> - The handler for the WAP extensions needs to be registered at the
>   parser. WML depends on fixed "tag and attribute tables", other WBXML
>   languages also may use similar tables. A mechanism to register the
>   tables with the parser is needed. Thus, an extended Parser interface
>   is needed (e.g. org.xml.sax.wap.WbxmlParser)

Again, that's not a problem at all, though SAX2 will have a more
general mechanism for this kind of thing.

> - WBXML is designed for small devices like a Palm Pilot. Deriving the
>   WbxmlParser from org.xml.sax.Parser includes a lot of unneeded
>   overhead. It could be a better solution to start a new hierarchy
>   (WbxmlParser -> WapParser). A wrapper could implement the full
>   org.xml.sax.Parser interface for compatibility.

Certainly that could make a lot of sense.  We kept the core SAX
interfaces very small so that they would be suitable for environments
with strict space constraints, but they still take up a few KB.
Building a SAX wrapper on top might make sense if every kilobyte
counts.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From msabin at cromwellmedia.co.uk  Wed Dec 15 17:19:56 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: NSUtils.java
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A67518E@odin.cromwellmedia.co.uk>

David Megginson wrote,
> I'm attaching a copy of the Java source code for the (short) 
> NSUtils class that I described in the last message.  I'd be 
> very grateful if the Java specialists on the list could look 
> this over, paying special attention to synchronization 
> problems.

I fear there are some big problems here. In particular,

  public final class NSUtils
  {
    private static QName qName;

    public static String joinName
      (String uriPart, String localPart)
    {
	qName.uri = uriPart;
	qName.local = localPart;
	String name = (String)joinNameTable.get(qName);
      
      // ... etc ...
    }
  }

Is rather nastily thread-unsafe: the shared qName could be
read/written by multiple threads in joinName(). You should
either synchronize this method, or create a new QName locally.
There are other less important problems here too ... this one 
leapt out at me immediately.

To be honest tho', I don't think it's likely to be worth the 
effort of finding and fixing them all. Despite the rumours about 
Java's poor string handling, it's really not all that bad, and 
it's quite likely that your attempts at optimization will do more 
harm than good, bearing in mind that any shared caches will 
involve either synchronization or replication (with object 
creation as a consequence). At the very least you should do some 
benchmarks to see whether there's any gain from going down this 
route ... what was it Knuth said about optimization? ;-)

Why not just do something like this,

  public class QName
  {
    private String itsURIPart;
    private String itsLocalPart;

    public QName(String uName)
    {
      if(uName.charAt(0) == '{')
      {
        int endPos = uName.lastIndexOf('}');
        if(endPos == -1)
          throw new IllegalArgumentException
            ("Malformed Namespace name: " + uName);
        itsURIPart = uName.substring(1, endPos);
        itsLocalPart = uName.substring(endPos+1);
      }
      else
      {
        itsURIPart = null;
        itsLocalPart = uName;
      }
    }

    public QName(String uriPart, String localPart)
    {
      if(localPart == null)
        throw new IllegalArgumentException();

      itsURIPart = uriPart;
      itsLocalPart = localPart;
    }

    public String getURIPart()
    {
      return itsURIPart;
    }

    public void setURIPart(String uriPart)
    {
      itsURIPart = uriPart;
    }

    public String getLocalPart()
    {
      return itsLocalPart;
    }

    public void setLocalPart(String localPart)
    {
      if(localPart == null)
        throw new IllegalArgumentException();

      itsLocalPart = localPart;
    }

    public boolean equals(Object other)
    {
      if(other instanceof QName)
        return equals((QName)other);
      else if(other instanceof String)
        return equals(new QName((String)other));
      else
        return false;
    }

    public boolean equals(QName other)
    {
      // Equal when the local parts match and the URI parts are
      // either both null or equal strings.
      return
        itsLocalPart.equals(other.itsLocalPart) &&
        (itsURIPart == other.itsURIPart ||
         (itsURIPart != null && itsURIPart.equals(other.itsURIPart)));
    }

    public int hashCode()
    {
      return
        (itsURIPart != null ? itsURIPart.hashCode() : 0) ^
        itsLocalPart.hashCode();
    }

    public String toString()
    {
      return toString(itsURIPart, itsLocalPart);
    }

    // Static helper: works only on its arguments, not on instance fields.
    public static String toString(String uriPart, String localPart)
    {
      if(uriPart != null)
      {
        StringBuffer buffer = new StringBuffer();
        buffer.append('{');
        buffer.append(uriPart);
        buffer.append('}');
        buffer.append(localPart);

        return buffer.toString();
      }
      else
        return localPart;
    }
  }

As is traditional, this is untried and untested ;-)

If someone wants to split a name they simply construct a
new QName ... no more expensive than constructing a new String[]
to hold the 2 parts, and rather more convenient.

Here are some ideas for possible optimizations _if_ benchmarking
suggests that they're needed,

1. The universal name built in toString() or passed in via the
   one arg constructor could be cached (and invalidated if
   either of the set methods are called).

2. The hashcode could be cached (and invalidated if either of
   the set methods are called).

3. The StringBuffer.append()'s in toString() could be wrapped in
   a synchronized block on the buffer, ie.,

   synchronized(buffer)
   {
     buffer.append(...);
     buffer.append(...);
     //etc.
   }

   This gives a marginal speedup over repeated grabbing and
   releasing of the buffer's monitor (which is done internally to
   StringBuffer.append()).

4. Selected methods or the whole class could be marked as final.

5. Strings could be intern'ed (note that intern'ing a String is
   a comparatively expensive operation involving a JVM internal
   hash lookup, so shouldn't be done willy nilly).

6. Some class-scope caching could be used (ie. something like
   the caching your NSUtils does, only mapping between Strings
   and QNames rather than between Strings and String[]s).

I'm very doubtful that it'd be worth going down below (4).
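
As a rough illustration of ideas (2) and (4) together, here is a sketch of
an immutable variant with the hash code computed once. It drops the set
methods, so nothing ever needs invalidating; that is a design choice of this
sketch rather than part of the class above, and the name CachedQName is
hypothetical. Like the class above, it is a sketch rather than benchmarked
code.

```java
// Hypothetical immutable two-part name with a precomputed hash code.
final class CachedQName {
    private final String uriPart;   // may be null for unqualified names
    private final String localPart; // never null
    private final int hash;         // computed once in the constructor

    CachedQName(String uriPart, String localPart) {
        if (localPart == null) {
            throw new IllegalArgumentException("localPart must not be null");
        }
        this.uriPart = uriPart;
        this.localPart = localPart;
        this.hash = (uriPart != null ? uriPart.hashCode() : 0) ^ localPart.hashCode();
    }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof CachedQName)) return false;
        CachedQName q = (CachedQName) other;
        // Cheap hash comparison first, then the string comparisons.
        return hash == q.hash
            && localPart.equals(q.localPart)
            && (uriPart == null ? q.uriPart == null : uriPart.equals(q.uriPart));
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public String toString() {
        return uriPart == null ? localPart : "{" + uriPart + "}" + localPart;
    }
}
```

Because the fields are final, instances can be shared between threads and
used as Hashtable keys without any locking.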

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From tbray at textuality.com  Wed Dec 15 17:30:47 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <3.0.32.19991215092815.01443430@pop.intergate.ca>

At 09:54 AM 12/15/99 -0500, David Megginson wrote:
>I've done a lot of thinking about how we
>can make SAX2 Namespace processing both efficient and
>backwards-compatible.

I'm not sure backwards-compatible is really a good idea.  The 
namespace-sensitive and namespace-oblivious views of a chunk of XML
are just deeply, totally, massively incompatible, and it seems wrong
to try to paper that over.  I also don't think it's worth investing
any effort at all in accommodating XML documents that use colons in names
but aren't namespace-aware, given the explicit warnings against doing
this in the XML 1.0 spec.

So I think it would be cleaner to deal with the fact that names can have
two parts, and not kludge them together with {} marks.  -Tim



From tlainevool at yahoo.com  Wed Dec 15 17:48:47 1999
From: tlainevool at yahoo.com (Toivo Lainevool)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: NSUtils.java
Message-ID: <19991215174843.19821.qmail@web2104.mail.yahoo.com>

--- Miles Sabin <msabin@cromwellmedia.co.uk> wrote:
> David Megginson wrote,
> > I'm attaching a copy of the Java source code for the (short) 
> > NSUtils class that I described in the last message.  I'd be 
> > very grateful if the Java specialists on the list could look 
> > this over, paying special attention to synchronization 
> > problems.
> 
> I fear there are some big problems here. In particular,
... snip ...
> 	String name = (String)joinNameTable.get(qName);
... snip ...
> Is rather nastily thread-unsafe: the shared qName could be
> read/written by multiple threads in joinName(). You should
> either synchronize this method, or create a new QName locally.

Hashtable get() and put() are synchronized, so the read/write operations are
thread safe.  No need to have separate synchronized blocks.

The only problem I saw was the incrCounter method.  It increments and resets
the counter field.  This needs to be synchronized.  The easiest way would be to
make the incrCounter() method synchronized.
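
A small sketch of what that could look like. The class name, the
COUNTER_MAX value of 1000, and the remember/count/cachedNames helpers are
assumptions for illustration only; the sketch also clears a single final
Hashtable instead of replacing it, which is a substitution and not what
NSUtils does.

```java
import java.util.Hashtable;

// Hypothetical cut-down cache showing a synchronized counter reset.
final class CounterSketch {
    static final int COUNTER_MAX = 1000; // assumed limit
    private static int counter = 0;
    private static final Hashtable<String, String[]> splitNameTable = new Hashtable<>();

    // Record a newly split name, counting it first as NSUtils does.
    static void remember(String name, String[] parts) {
        incrCounter();
        splitNameTable.put(name, parts);
    }

    // Increment and reset happen under one lock, so two threads can
    // never both step over COUNTER_MAX or lose an update.
    private static synchronized void incrCounter() {
        if (++counter == COUNTER_MAX) {
            counter = 0;
            splitNameTable.clear(); // Hashtable.clear() is itself synchronized
        }
    }

    static synchronized int count() {
        return counter;
    }

    static int cachedNames() {
        return splitNameTable.size();
    }
}
```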

Toivo Lainevool


From reschke at medicaldataservice.de  Wed Dec 15 17:58:17 1999
From: reschke at medicaldataservice.de (Julian Reschke)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
In-Reply-To: <51ED3F5356D8D011A0B1006097C3073401B17075@martinique>
Message-ID: <NCBBIPMOPKLLGKJPBINCMELEDHAA.reschke@medicaldataservice.de>


> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Reynolds, Gregg
> Sent: Tuesday, December 14, 1999 11:11 PM
> To: xml-dev@ic.ac.uk
> Subject: RE: Musing over Namespaces
>
>
> > -----Original Message-----
> > From: Don Park [mailto:donpark@docuverse.com]
> > Sent: Tuesday, December 14, 1999 1:37 PM
> >
> > I started with the question: "Why isn't there namespaces
> > in normal languages like English?"  A boring and divergent
> > question.
>
> Who told you English was normal?

Actually: shouldn't the question be about languages in general? In that case
I'd say that the different national languages *do* form a kind of namespace.
It shouldn't be too hard to find an example where a word "x" has the same
spelling but completely different meanings in for instance German and
English...




From msabin at cromwellmedia.co.uk  Wed Dec 15 18:04:35 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: NSUtils.java
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A67518F@odin.cromwellmedia.co.uk>

Toivo Lainevool wrote,
> Miles Sabin wrote,
> > I fear there are some big problems here. In particular,
> ... snip ...
> > 	String name = (String)joinNameTable.get(qName);
> ... snip ...
> > Is rather nastily thread-unsafe: the shared qName could be
> > read/written by multiple threads in joinName(). You should
> > either synchronize this method, or create a new QName locally.
>
> Hashtable get() and put() are synchronized, so the read/write 
> operations are thread safe.  No need to have separate 
> synchronized blocks.

True but that wasn't the problem I was pointing to. You snipped
out the important bits and left the bit that was OK ;-)

    private static QName qName;

    public static String joinName
      (String uriPart, String localPart)
    {
        qName.uri = uriPart;      // Unsynchronized write of
        qName.local = localPart;  // shared qName
        String name = (String)joinNameTable.get(qName);
                                         //     ^^^^^
                                         // Unsynchronized read of
                                         // shared qName
        // ... etc ...
    }

Hashtable.get() being synchronized doesn't help here.
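
A minimal sketch of the second fix suggested above (create the key
locally instead of mutating a shared one). The QName class, its fields,
the SafeNames wrapper and the "{uri}local" joined form are assumptions
made for illustration; the real NSUtils source is not reproduced in this
thread. It also assumes both parts are non-null.

```java
import java.util.Hashtable;

// Hypothetical immutable key class; NSUtils' real QName is not shown here.
final class QName {
    final String uri;
    final String local;

    QName(String uri, String local) {
        this.uri = uri;
        this.local = local;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof QName)) return false;
        QName other = (QName) o;
        return uri.equals(other.uri) && local.equals(other.local);
    }

    @Override
    public int hashCode() {
        return 31 * uri.hashCode() + local.hashCode();
    }
}

final class SafeNames {
    private static final Hashtable<QName, String> joinNameTable = new Hashtable<>();

    // Each call builds its own key, so no thread can see another
    // thread's half-written uri/local pair.
    static String joinName(String uriPart, String localPart) {
        QName key = new QName(uriPart, localPart);
        String name = joinNameTable.get(key);
        if (name == null) {
            name = "{" + uriPart + "}" + localPart;
            joinNameTable.put(key, name);
        }
        return name;
    }
}
```

Two threads may still both miss the table and both store an equal
string; that costs a duplicate allocation but never a wrong answer.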

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From david at megginson.com  Wed Dec 15 18:07:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: Tim Bray's message of "Wed, 15 Dec 1999 09:29:18 -0800"
References: <3.0.32.19991215092815.01443430@pop.intergate.ca>
Message-ID: <m3r9gohyea.fsf@localhost.localdomain>

Tim Bray <tbray@textuality.com> writes:

> So I think it would be cleaner to deal with the fact that names can have
> two parts, and not kludge them together with {} marks.  -Tim

So, in other words, we'd have something like this:

  public interface DocumentHandler2 extends DocumentHandler {
    public void startElement (String ns, String name, AttributeList2 atts);
    public void endElement (String ns, String name);
  }

  public interface AttributeList2 extends AttributeList {
    public String [] getName (int i);
    public String getType (int i);
    public String getValue (int i);
    public String getType (String ns, String name);
    public String getValue (String ns, String name);
  }

We talked about this a few months ago, but I'd be happy to hear what
people think now.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From Toby.Speight at streapadair.freeserve.co.uk  Wed Dec 15 18:12:27 1999
From: Toby.Speight at streapadair.freeserve.co.uk (Toby Speight)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
In-Reply-To: "Julian Reschke"'s message of "Wed, 15 Dec 1999 18:54:50 +0100"
References: <NCBBIPMOPKLLGKJPBINCMELEDHAA.reschke@medicaldataservice.de>
Message-ID: <u4sdkgjjv.fsf@lanber.cam.eu.citrix.com>

Julian> Julian Reschke <URL:mailto:reschke@medicaldataservice.de>

0> In article
0> <NCBBIPMOPKLLGKJPBINCMELEDHAA.reschke@medicaldataservice.de>,
0> Julian wrote:

Julian> ...  I'd say that the different national languages *do* form
Julian> a kind of namespace.  It shouldn't be too hard to find an
Julian> example where a word "x" has the same spelling but completely
Julian> different meanings in for instance German and English...

Like "hat", perhaps?  (An example off the top of my head...)

English and American is a particularly problematic pair, as often the
words are the same part of speech: a "gas-powered" vehicle is much
more unusual on one side than the other.  In neither country would I
start an argument with someone who is "pissed", though for different
reasons.

-- 




From tbray at textuality.com  Wed Dec 15 18:21:46 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <3.0.32.19991215102002.0146ecf0@pop.intergate.ca>

At 01:06 PM 12/15/99 -0500, David Megginson wrote:
>So, in other words, we'd have something like this:

I like it, except for

>  public interface AttributeList2 extends AttributeList {
>    public String [] getName (int i);

I don't think you can do that, because you'll have two methods that differ
only in return type, right?  Blecch. -Tim
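
A small sketch of the clash and one way around it, assuming the
two-part accessor is simply given a different name. The name
getNameParts and the OneAttribute class are hypothetical, and the
AttributeList below is a trimmed stand-in for the SAX 1.0 interface,
not the real thing.

```java
// Trimmed stand-in for the SAX 1.0 AttributeList; only getName is shown.
interface AttributeList {
    String getName(int i);
}

// Declaring "String[] getName(int i)" here would not compile: it has the
// same signature as the inherited method but an incompatible return type.
// A differently named accessor (hypothetical name) avoids the clash.
interface AttributeList2 extends AttributeList {
    String[] getNameParts(int i); // {namespace URI, local name}
}

// Toy implementation holding a single attribute name.
final class OneAttribute implements AttributeList2 {
    private final String ns;
    private final String local;

    OneAttribute(String ns, String local) {
        this.ns = ns;
        this.local = local;
    }

    public String getName(int i) {
        return local;
    }

    public String[] getNameParts(int i) {
        return new String[] { ns, local };
    }
}
```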




From simonstl at simonstl.com  Wed Dec 15 19:06:21 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <m3r9gohyea.fsf@localhost.localdomain>
References: <Tim Bray's message of "Wed, 15 Dec 1999 09:29:18 -0800">
 <3.0.32.19991215092815.01443430@pop.intergate.ca>
Message-ID: <199912151906.OAA24894@hesketh.net>

At 01:06 PM 12/15/99 -0500, David Megginson wrote:
>> So I think it would be cleaner to deal with the fact that names can have
>> two parts, and not kludge them together with {} marks.  -Tim
>....
>We talked about this a few months ago, but I'd be happy to hear what
>people think now.

I'm afraid I think this is a hideous idea, and that's an understatement.
Even apart from code bloat - having to deal with two parts of a name means
twice as much code every place the name matters - I'm really not sure I
like the logic behind this approach.  

There are two cases where I think this approach could be useful. The first
is separating components from different namespaces for different
processing, which could be a good idea but isn't worth the cost so far as
I'm concerned. The second case is downright funny to me at least,
situations where you want to discard the namespace entirely and focus only
on the local part.  This latter approach might have been useful, indeed
necessary, if we'd been stuck with a 3-namespace XHTML, but we're not, at
least for today.

In just about every other situation, I think I'd rather work with a single
easily-kludged and un-kludged name than have to make two calls and kludge
them together myself.  While I think the two part approach might be
acceptable _if_ there's an option (heck, SAX2 supports that!) to choose
between them, I don't think two parts as the only approach in SAX2 itself
is even worth considering.


Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From dave.pawson at virgin.net  Wed Dec 15 19:44:54 1999
From: dave.pawson at virgin.net (Dave Pawson)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
References: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <021301bf4734$e0920280$0100a8c0@dave>


> I am in the habit of musing over what seems obvious to
> most people.  My latest musing was over XML Namespaces.
> 
> What do you all think?  I would be very much interested in
> your answers.

You wouldn't be the first :-)

Rumour has it that it's flawed at best,
redundant at worst.

The justification would appear real,
the solution to that 'real' problem would
seem flawed. 'just that no one has bettered it...
yet.

Longer time spent in the problem domain perhaps?

What we appear to have is a 'how' solution, without
time being spent addressing the root cause and
a 'what is needed to resolve it' period.


Regards, DaveP




From tbray at textuality.com  Wed Dec 15 19:58:08 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <3.0.32.19991215115634.01439650@pop.intergate.ca>

At 02:06 PM 12/15/99 -0500, Simon St.Laurent wrote:
>>We talked about this a few months ago, but I'd be happy to hear what
>>people think now.
>
>I'm afraid I think this is a hideous idea, and that's an understatement.
>Even apart from code bloat - having to deal with two parts of a name means
>twice as much code every place the name matters - I'm really not sure I
>like the logic behind this approach.  

Hmm, I perceive the opposite.  I anticipate that patterns such as the 
following will be very common - not sax primitives, but you'll see the idea.

 while (iterator.hasNext())
 {
    whatever = (Whatever) iterator.next();
    if (whatever.ns().equals(myNamespace))
        doMyProcessing(whatever.name());
 }

i.e. the namespace processing is highly decoupled from the name
processing.  Another way to say it is that much name processing will
be written to deal with one particular vocabulary, and want to just
deal with names, assuming the NS to have been checked already.

Given this, then if the parser insisted on giving you these things 
glued together, you'd actually have to do extra work to pick them
apart before doing your real work.  Since the low level parsing
code is going to have them in two places anyhow, it seems real awkward
to parse them apart, glue them together with curly braces, and then
pick them apart again to do the real work.  The namespace spec de jure
and de facto contemplates qualified names as being 2-part things, and
modern programming techniques can deal with multi-part data objects,
so why all this concatenation side-stepping?

The whole {ns}name notion has a smell to me of pretending that we're
still living in the pre-namespace era, and we're not.  I mean, you 
could take all the structs in your C programs and concatenate the
string representations of all the members together and pick them apart
to do work with them, but that would be perverse.  So is {ns}name.
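
To make the round trip concrete, here is a rough sketch of the gluing
and un-gluing that a {ns}name consumer would need. The CurlyNames class
and its methods are hypothetical, not part of any SAX draft, and
treating a name with no leading '{' as having an empty namespace is an
assumption.

```java
final class CurlyNames {
    // Glue the two parts together in the {ns}name form.
    static String join(String ns, String local) {
        return "{" + ns + "}" + local;
    }

    // Pick them apart again; returns {ns, local}.
    static String[] split(String joined) {
        if (!joined.startsWith("{")) {
            // Assumption: no braces means no namespace.
            return new String[] { "", joined };
        }
        int close = joined.indexOf('}');
        if (close < 0) {
            throw new IllegalArgumentException("unterminated namespace: " + joined);
        }
        return new String[] { joined.substring(1, close), joined.substring(close + 1) };
    }
}
```

Every handler that wants only the local part pays for the indexOf and
two substring calls, which is the extra work described above.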

>There are two cases where I think this approach could be useful. The first
>is separating components from different namespaces for different
>processing, which could be a good idea but isn't worth the cost so far as
>I'm concerned. 

Hmm... I have the notion that this is probably the most common use case.
Of course, we're both prognosticating, i.e. guessing.

>The second case is downright funny to me at least,
>situations where you want to discard the namespace entirely and focus only
>on the local part.

I think this is going to happen all the time, once you've decided that
this is the namespace you know about; then you just focus on elements &
attributes in the old-fashioned way.  -Tim




From david at megginson.com  Wed Dec 15 20:19:56 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:33 2004
Subject: Musing over Namespaces
In-Reply-To: "Dave Pawson"'s message of "Wed, 15 Dec 1999 18:27:08 -0000"
References: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com> <021301bf4734$e0920280$0100a8c0@dave>
Message-ID: <m3hfhkares.fsf@localhost.localdomain>

"Dave Pawson" <dave.pawson@virgin.net> writes:

> The justification would appear real,
> the solution to that 'real' problem would
> seem flawed. 'just that no one has bettered it...
> yet.

Well, we have years of very positive experience with namespaces in
programming languages like Java and Perl.  It's possible that our
experience in programming languages doesn't apply to XML, but I'd like
to hear a clear argument of why it doesn't.  

As I mentioned in a previous posting, I'd hate to have to write Perl
modules without packages.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From simonstl at simonstl.com  Wed Dec 15 20:37:47 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <3.0.32.19991215115634.01439650@pop.intergate.ca>
Message-ID: <199912152037.PAA30480@hesketh.net>

At 11:57 AM 12/15/99 -0800, Tim Bray wrote:
>i.e. the namespace processing is highly decoupled from the name
>processing.  Another way to say it is that much name processing will
>be written to deal with one particular vocabulary, and want to just
>deal with names, assuming the NS to have been checked already.

I'm afraid my experience is rather different - that in building XML
applications, people are reading the namespaces spec as providing a new and
more sophisticated name, not a multi-level architecture.  While the
multi-level architecture is intriguing architecturally, I'm not sure that
requiring every application to support it is even worth contemplating.

It's fine as an option, but for many many use cases - especially smaller
use cases where SAX is being used for its quick-and-dirty nature, I think
I'd much rather have the big kludged string.  If I as a programmer have to
deal with this every time I write a handler, or even have to track down
filters, I'll waste a lot of time complaining on XML-dev about what an
utterly idiotic notion namespaces in XML were to begin with.  If I can just
tell the parser my preference, and not be forced into extra work, I'll be a
lot more productive.

>The whole {ns}name notion has a smell to me of pretending that we're
>still living in the pre-namespace era, and we're not.  I mean, you 
>could take all the structs in your C programs and concatenate the
>string representations of all the members together and pick them apart
>to do work with them, but that would be perverse.  So is {ns}name.

Until schemas/DTDs are capable of doing real work with namespaces, we are
living in the pre-namespace era.  The W3C dropped the ball on validation
and namespaces, and we've been living with the consequences - life between
'eras' - ever since.

Pretty ugly, sad to say.  But that's another fight.

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From donpark at docuverse.com  Wed Dec 15 20:50:14 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <3.0.32.19991215115634.01439650@pop.intergate.ca>
Message-ID: <002201bf473e$2007f4c0$d1940e18@smateo1.sfba.home.com>

Tim Bray wrote:
> while (iterator.hasNext())
> {
>    whatever = (Whatever) iterator.next();
>    if (whatever.ns().equals(myNamespace))
>    doMyProcessing(whatever.name());
> }

A composite name doesn't have to be taken apart.

String nss = "{" + myNamespace + "}";
while (iterator.hasNext()) {
  whatever = (Whatever) iterator.next();
  // Match on the braced prefix, not the bare namespace string.
  if (whatever.name().startsWith(nss))
    doMyProcessing(whatever);
}

Not very efficient but workable.

David, how about introducing a new class: Name.

static final String FOO_NS = "http://www.foo.com/ns";
private Name foobar = new Name(FOO_NS, "bar");
private Name fooid = new Name(FOO_NS, "id");
...
void startElement(Name name, AttributeList atts) {
  if (name.equals(foobar)) {
    String id = atts.getValue(fooid);
  }
}
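
A possible shape for such a Name class, as a sketch only. The accessor
names and the toString form are assumptions; the one essential point is
that equals and hashCode cover both parts, so that Name objects work as
constants and as lookup keys.

```java
import java.util.Objects;

// Hypothetical two-part name: namespace URI plus local part.
final class Name {
    private final String ns;
    private final String local;

    Name(String ns, String local) {
        this.ns = ns;
        this.local = local;
    }

    String ns() {
        return ns;
    }

    String local() {
        return local;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Name)) return false;
        Name other = (Name) o;
        return Objects.equals(ns, other.ns) && Objects.equals(local, other.local);
    }

    @Override
    public int hashCode() {
        return Objects.hash(ns, local);
    }

    @Override
    public String toString() {
        return "{" + ns + "}" + local;
    }
}
```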

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From martind at netfolder.com  Wed Dec 15 21:25:17 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <3.0.32.19991215115634.01439650@pop.intergate.ca>
Message-ID: <NBBBJPGDLPIHJGEHAKBACEKOEJAA.martind@netfolder.com>

Hi Tim,

Tim said:
Hmm, I perceive the opposite.  I anticipate that patterns such as the
following will be very common - not sax primitives, but you'll see the idea.

 while (iterator.hasNext())
 {
    whatever = (Whatever) iterator.next();
    if (whatever.ns().equals(myNamespace))
    doMyProcessing(whatever.name());
 }

Didier reply:
I noticed, in our work within the OpenJade group (which also involves parser
development), that new needs are emerging. And for the gentleman who asked if
there is some Open Source development in the markup techs: yes, there is,
and OpenJade is one example (but probably the only group dedicated to markup
techs ;-)

So, we noticed these new needs. In fact, these needs are created by the
different ways and maybe new ways to look at and process an XML document. So
let's speak of parsing pattern.

a) A pattern we discovered (or re-discovered) is the multi-event handler
pattern. For a single document, we may have multiple event handlers. For
instance, the same document may be perceived as a topic map document, an
xlink document and an rdf document. These three domains have their respective
domain handlers. So, the first problem to resolve is how to dispatch the
right event to the right event handler. This is the reverse of other
event based systems, for instance VB, where we have multiple event
sources and a single event handler sink. Here, instead, we have
a single event source but multiple event sinks. No current parsers can
do that. Hmm, can they? Maybe with (b) as an answer.

b) Another pattern, this one better known, is the multi-pass parsing
pattern, where each domain is processed by a separate document handler (i.e.
event handler). So, if we keep the same example, the document is first
parsed and handled for the topic map domain, then for the xlink domain and
then finally for the rdf domain. Obviously, this pattern won't win the speed
contest.

c) Another pattern is to build an internal model (i.e. a grove) and have
this grove accessed through an API (i.e. a DOM). In this case, we have a
single pass for parsing. The processing or interpretation phase could be
either (a) a single pass interpretation or (b) a multiple pass
interpretation. As you know, this last pattern is not event driven.

As you noticed, I didn't use the term pattern in the sense of Alexander (not
Alexander the Great but the architect :-)

From an observation of both RDF and XLink, dispatching to the right handler
requires that the event source fire the event, and that the event sink receive
the current element and attributes, if:
a) the element is tagged with a name space identifier and an event
handler is associated with this name space.
b) an attribute is tagged with a name space identifier and an event
handler is associated with this name space.

Of course, each handler (or in this case name space handler) has to keep a
certain memory of the processing or interpretation context.
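
The dispatch rule above can be sketched roughly as follows. This is not
OpenJade code: NamespaceHandler and NamespaceDispatcher are hypothetical
names, attributes are reduced to a list of their namespace URIs, and
conditions (a) and (b) are read as alternatives, which is an
interpretation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical per-namespace event sink; not a SAX interface.
interface NamespaceHandler {
    void startElement(String elementNs, String localName);
}

// Single event source fanning out to several namespace-specific sinks.
final class NamespaceDispatcher {
    private final Map<String, List<NamespaceHandler>> sinks = new HashMap<>();

    void register(String ns, NamespaceHandler handler) {
        sinks.computeIfAbsent(ns, k -> new ArrayList<>()).add(handler);
    }

    // Notify every handler registered for the element's namespace (rule a)
    // or for the namespace of any of its attributes (rule b).
    // Returns the number of handlers notified.
    int startElement(String elementNs, String localName, List<String> attributeNamespaces) {
        Set<String> interested = new LinkedHashSet<>();
        interested.add(elementNs);
        interested.addAll(attributeNamespaces);

        int notified = 0;
        for (String ns : interested) {
            for (NamespaceHandler h : sinks.getOrDefault(ns, Collections.emptyList())) {
                h.startElement(elementNs, localName);
                notified++;
            }
        }
        return notified;
    }
}
```

Each registered handler would keep its own context between calls, as
noted above; the dispatcher itself stays stateless apart from the
registration table.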

We are now seriously studying pattern (a) because it allows _real_ reuse
and combination of domain handlers and thus (here is the plug for your
managers, dear Dilbert(s) of this world) it reduces development cost (and has
been prescribed to reduce headaches by the federal health department :-))

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products: http://www.netfolder.com




From clark.evans at manhattanproject.com  Wed Dec 15 21:21:49 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:33 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <002201bf473e$2007f4c0$d1940e18@smateo1.sfba.home.com>
Message-ID: <Pine.LNX.4.10.9912150408070.32050-100000@cauchy.clarkevans.com>


For SAX2, I'd prefer taking the opportunity to converge
more with DOM2 where appropriate.  Thus,

  void startElement(SaxNode node);

interface SaxNode {
  readonly attribute String nodeName;
  readonly attribute String nodeValue;    // could raise Exception... 
  readonly attribute unsigned short nodeType;
  readonly attribute Node parentNode;
  readonly attribute NamedNodeMap attributes;
  readonly attribute String prefix;
  readonly attribute String namespaceURI;
  readonly attribute String localName; 
};

such that

interface DomNode extends SaxNode {
 ... adds child access, mutable methods, etc ...
};

This would make everyone's life *far* easier.

;)  Clark




From clark.evans at manhattanproject.com  Wed Dec 15 21:33:24 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <Pine.LNX.4.10.9912150408070.32050-100000@cauchy.clarkevans.com>
Message-ID: <Pine.LNX.4.10.9912150430230.32050-100000@cauchy.clarkevans.com>


On Wed, 15 Dec 1999, Clark C. Evans wrote:
>   void startElement(SaxNode node);

Just thinking...

Or better yet, rather than using the flat Node 
model DOM uses, have mix-ins.

interface Node {
        unsigned short nodeType;
}

interface AttributeAxis {
        NamedNodeMap attributes;
}

interface Name {
        String nodeName;
        String prefix;
        String namespaceURI;
        String localName;
}

interface Value {
        String nodeValue;
}

interface AncestorAxis {
        Node parentNode();
}

interface ElementNode extends Node, Name, AttributeAxis, AncestorAxis { }
interface AttributeNode extends Node, Name, Value { }

// for DOM...

interface DomNode extends Node, Name, AttributeAxis, AncestorAxis, etc...
{
        // add child Axis, mutators, etc.
}
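
For comparison, a rough Java rendering of the mix-in idea. The
interfaces above are IDL-style pseudocode; everything below (NodeKind,
Named, Valued, SimpleAttribute, NameUtil) is renamed and invented for
illustration, with attributes turned into accessor methods.

```java
// Small capability interfaces, combined per node type.
interface NodeKind {
    short nodeType();
}

interface Named {
    String namespaceURI();
    String localName();
}

interface Valued {
    String nodeValue();
}

interface ElementNode extends NodeKind, Named { }
interface AttributeNode extends NodeKind, Named, Valued { }

// Toy attribute implementation.
final class SimpleAttribute implements AttributeNode {
    private final String ns;
    private final String local;
    private final String value;

    SimpleAttribute(String ns, String local, String value) {
        this.ns = ns;
        this.local = local;
        this.value = value;
    }

    public short nodeType() { return 2; } // DOM's ATTRIBUTE_NODE constant
    public String namespaceURI() { return ns; }
    public String localName() { return local; }
    public String nodeValue() { return value; }
}

final class NameUtil {
    // Needs only the Named mix-in, so it accepts elements and attributes alike.
    static String describe(Named n) {
        return "{" + n.namespaceURI() + "}" + n.localName();
    }
}
```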




From h.rzepa at ic.ac.uk  Wed Dec 15 22:13:43 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:18:34 2004
Subject: LISTADMIN: IMPORTANT. Transfer of this list to  OASIS.
Message-ID: <v04220800b47dbec6584d@[155.198.8.91]>

During the last three years, this list has had a home at Imperial College,
and more than 18,000 postings have been made, the cumulative effect
of which we feel has made a major impact on the development of
the Internet and of Open Standards.

The time has come to move the list to an organisation which can offer 
extensive additional resources to improve the features available to 
subscribers.

We are delighted to be able to announce that OASIS have most generously
offered to host the list from January  1st, 2000. A press release announcing
this has been made available on the  OASIS site;

http://www.oasis-open.org/html/oasis_xml-dev.html

Further announcements at the above site, and to this list  will be made
shortly to indicate the arrangements for transfer of the list.

Here at Imperial College, we will be closing down the list re-distribution
on  Tuesday 21 December, since no administrators will be available after
that date because of the holiday shutdowns.
Please continue to post up to that date, and please watch for
further announcements.

The list archive will continue to be available for some time from 
http://www.lists.ic.ac.uk/hypermail/xml-dev/ for postings
made up to the 21 December 1999, with the possibility of a duplicate at OASIS.

Meanwhile, I would like to thank every one of the 1400 subscribers
and contributors to the list for an extraordinary and quite
possibly unique experience of watching the birth
and development of XML over the last three years. Many thanks folks,
and see you all, as they say, in the new home shortly!

Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/



From jamesr at steptwo.com.au  Wed Dec 15 22:13:17 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
Message-ID: <4.2.0.58.19991216090343.00c55d00@203.41.126.17>

Hi all,

I must say, it has been nice not
being swamped with the hundreds of
messages relating to SML.

I am, however, a little fearful of
what has been cooked up, now that the
SML list most likely only consists of
advocates ...

Can anyone give a (brief) status update?

Thanks,

J

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
Illumination: an out-of-the-box Intranet solution

http://www.steptwo.com.au/
jamesr@steptwo.com.au



From stefan.haustein at trantor.de  Wed Dec 15 22:22:35 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX for WAP binary XML
References: <3856B3FF.A47F6C5C@trantor.de> <m3u2lki19w.fsf@localhost.localdomain>
Message-ID: <38581497.E3D4BB83@trantor.de>

> > - WBXML adds some proprietary extensions. For that reason, I added an
> >   interface "WapExtensionHandler". Can that interface or a similar one
> >   be included in org.xml.sax.wap? Who cares about extensions of
> >   org.xml.sax? Unfortunately, the ugly extensions cannot be ignored
> >   since they are already used in the WML definitions.
> 
> The extensions should go in a different package, such as org.wap.sax.

What about org.xml.wap? Or org.xml.swx (simple access to WAP binary
xml)?
org.wap is owned by an Apple user group, "Washington Apple Pi". I am also
not willing to spend 27000 USD in order to become a member of the WAP
Forum (www.wapforum.org, similar to org.xml.sax vs. org.w3.sax? :-)

All the best,

Stefan

-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From peter at ursus.demon.co.uk  Wed Dec 15 22:51:25 1999
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:18:34 2004
Subject: Transfer of this list to  OASIS and future developments
In-Reply-To: <v04220800b47dbec6584d@[155.198.8.91]>
Message-ID: <3.0.1.32.19991216225432.009b7510@pop3.demon.co.uk>

At 10:12 PM 12/15/99 +0000, Rzepa, Henry wrote:
[...]
>
>We are delighted to be able to announce that OASIS have most generously
>offered to host the list from January  1st, 2000. A press release announcing
>this has been made available on the  OASIS site;
>
I am delighted to add my delight to this message.  Henry and I set this
list up nearly three years ago and we have been overwhelmed with the way
that you have contributed.

I used to be very active as "moderator" of this list but for nearly a year
have taken a self-imposed vow of silence. [More than one person at XML
Europe was pleased to see I was still alive!]. That was only partly
deliberate but I have been extremely pleased to see that the list has
really moderated itself as far as content goes.  It has had the spin-off
that we feel no problem in losing control because that has effectively
happened.

I may soon indulge myself slightly in posting a few forward- and backward-
looks at XML in the context of this list. I have already had the privilege
of being asked to talk at XML99 and I chose to highlight the role of this
list and announce its future. [You can see a very accurate report on
http://www.xml.com]. I have also handed in my text to the GCA (to whom I am
very grateful for making my visit possible) - it should appear in the
proceedings.

Probably few of you appreciate how much work Henry has had to put in.
Every week he gets e-mail bounces - many of which have to be replied to. He
also doesn't have admin control over the server on which the hypermail
sits, so there have been many things he simply couldn't do. And he has had
his share of spam, listsniffing, and viruses.

Henry and I have had a not very hidden agenda - to do our bit to help XML
succeed and in that way CML would have the support of a worldwide community
and set of tools. We are really delighted how universal XML is and it is
now possible to go back to the chemical community and show that markup
languages really offer a solution that no other approach does. You will
appreciate that now that CML is starting to take off we are concentrating
some of our activities there. We are delighted that there are several *.org
and *.gov who are taking it up (regulatory, patents, safety, informatics
etc.) and also know that it is being adopted in *.com. There is a growing
range of freeware/open tools for CML and there are at least two editors
written independently of our efforts. [Writing a good chemical editor is
extremely difficult and I will probably be asking for help on the generic
editor problem shortly.]

We shall not of course be leaving XML-DEV and will continue to read and
post. But we are going to try a new venture, very much in the spirit of the
virtual community built here. The way in which members have communally
contributed to discussion and construction has been priceless. I believe
that the creation of SAX was certainly a highlight in my life. 

We have created a virtual learning environment for XML - VirtualXML. This
is based on the values created on XML-DEV - mutual help, hard work,
attention to detail and respect for others. In this environment - which
will be created in XML as much as possible - we shall help newcomers to XML
to learn both the technology and the philosophy of XML. And we expect to
learn a great deal as well!

We shall be announcing more details in the next day or so.

	Peter Murray-Rust ["moderator" of XML-DEV]




From david at megginson.com  Wed Dec 15 23:16:30 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX for WAP binary XML
In-Reply-To: <38581497.E3D4BB83@trantor.de>
References: <3856B3FF.A47F6C5C@trantor.de>
	<m3u2lki19w.fsf@localhost.localdomain>
	<38581497.E3D4BB83@trantor.de>
Message-ID: <14424.8476.751219.528483@localhost.localdomain>

Stefan Haustein writes:

 > > The extensions should go in a different package, such as
 > > org.wap.sax.
 > 
 > what about org.xml.wap? Or org.xml.swx (simple access to WAP binary
 > xml)?  org.wap is owned by an apple user group "washington apple
 > pi". I am also not willing to spend 27000 USD in order to become a
 > member of the wap forum (www.wapforum.org, similar to org.xml.sax
 > vs. org.w3.sax? :-)

The xml.org domain now belongs to OASIS, so you'll need their
permission to use it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From aray at q2.net  Wed Dec 15 23:18:37 1999
From: aray at q2.net (Arjun Ray)
Date: Mon Jun  7 17:18:34 2004
Subject: Musing over Namespaces
In-Reply-To: <000201bf466a$9dd31f80$d1940e18@smateo1.sfba.home.com>
Message-ID: <Pine.LNX.4.10.9912151840460.2731-100000@mail.q2.net>


On Tue, 14 Dec 1999, Don Park wrote:

> Is it really a 'good thing' to have namespaces in XML?  

Depends on whether a 'solution in search of a problem' can be a 'good
thing'.

> What ill effect will it have on XML's future?

Wheels spinning madly in place?

> Why can't the semantic of '<name>' be determined purely by context?

Because names are instrumental.  The real issue is whether the context is
understood.

> What is wrong with using just <html> to distinguish HTML's use of
> 'a' tag?  

Because it's not about tags.  It's about notations.

> Is the ability to inject attributes from other namespaces really useful?

Injection is not useful.  Reference (or annotation) is useful.

> What is the positive effect of having just one namespace?  

None.  Each document is a name space.

> Why can't we have central registry of XML names?

I don't know if that's a good idea.  We could be headed that way, though.
(i.e. "ownership" of names, as in, you gotta download my plugin to do
anything with this markup...)


Arjun




From lisarein at finetuning.com  Thu Dec 16 00:09:41 1999
From: lisarein at finetuning.com (Lisa Rein)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
References: <4.2.0.58.19991216090343.00c55d00@203.41.126.17>
Message-ID: <38582E67.70BBA415@finetuning.com>

James Robertson wrote:
> 
> 
> I am, however, a little fearful of
> what has been cooked up, now that the
> SML list most likely only consists of
> advocates ...
> 

yep, and this XML list consists of XML ADVOCATES :-)

please tell me this is just a cruel, cruel little prank james...
let us go on with our overburdened, attribute-laden lives.....:-|

lisa


> Can anyone give a (brief) status update?
> 
> Thanks,
> 
> J
> 
> -------------------------
> James Robertson
> Step Two Designs Pty Ltd
> SGML, XML & HTML Consultancy
> Illumination: an out-of-the-box Intranet solution
> 
> http://www.steptwo.com.au/
> jamesr@steptwo.com.au
> 



From jtauber at jtauber.com  Thu Dec 16 01:41:23 1999
From: jtauber at jtauber.com (James Tauber)
Date: Mon Jun  7 17:18:34 2004
Subject: ANNOUNCE: XPath interface for XT Version 0.90
References: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk> <021301bf46a6$3fc58d10$866e230a@sysrap.cs.fujitsu.co.jp>
Message-ID: <01f601bf4766$c9b7de90$eb020a0a@bowstreet.com>

> > Unless I'm asking the wrong question - is there a tool that will
> > search a DOM tree for me, assuming I supply it with an XPath
> > expression.
> >
>
> If the language being used is Java, there is a tool "XPath interface for XT"
> which performs XPath query on top of DOM, accessible at:
>
> http://www.246.ne.jp/~kamiya/pub/XPath4XT.html

This is cool. Exactly what I wanted way back when I was musing about a
"shell" that enabled you to navigate a DOM the same way you'd navigate a
file system in a command-line interface.
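As a rough illustration of searching a DOM tree with an XPath expression, here is a minimal sketch. It does not use the XPath4XT tool mentioned above; it assumes a current JDK, where the javax.xml.xpath package is available. The class name DomXPathSketch and the method select are made up for this example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of XT or XPath4XT.
final class DomXPathSketch {

    /** Parses xml into a DOM tree and returns the text of every node matching expression. */
    static List<String> select(String xml, String expression) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            // Evaluate the expression with the document node as the context node.
            NodeList nodes = (NodeList) XPathFactory.newInstance()
                    .newXPath()
                    .evaluate(expression, doc, XPathConstants.NODESET);
            List<String> out = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                out.add(nodes.item(i).getTextContent());
            }
            return out;
        } catch (Exception e) {
            // Parsing and XPath errors are wrapped so callers need no checked exceptions.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml = "<books><book><title>A</title></book><book><title>B</title></book></books>";
        System.out.println(select(xml, "/books/book/title"));
    }
}
```

A "shell" of the kind described would presumably keep a current node and evaluate each expression relative to it, rather than always starting from the document node as this sketch does.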

James




From donpark at docuverse.com  Thu Dec 16 01:59:11 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
In-Reply-To: <4.2.0.58.19991216090343.00c55d00@203.41.126.17>
Message-ID: <001301bf4769$484e6a60$d1940e18@smateo1.sfba.home.com>

SML is still cooking, currently in use case
enumeration phase.  You can find more bits of
information at the SML-DEV eGroup pages:

http://www.egroups.com/group/sml-dev/info.html

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From mlepage at antimeta.com  Thu Dec 16 02:36:59 1999
From: mlepage at antimeta.com (Marc Lepage)
Date: Mon Jun  7 17:18:34 2004
Subject: Request for Discussion: SAX 1.0 in C++
References: <87256840.0071AFFD.00@d53mta03h.boulder.ibm.com> <wh4sdt97gh.fsf@viffer.oslo.metis.no>
Message-ID: <38585041.AEB285EF@antimeta.com>

Steinar Bang wrote:
> 
> >>>>> roddey@us.ibm.com:
> 
> This statement:
> 
> > ... If it were up to me, I'd say use every modern service of C++ and
> > those who don't have compliant C++ implementation can have a good reason to
> > get one.
> [...]
> 
> conflicts with this statement:
> 
> > 2) We would prefer that all data come out of the SAX interfaces as
> > raw wchar_t strings. This is the most flexible mechanism and does
> > not lock people into using any particular implementation of a string
> > object. It also has the highest potential performance for those
> > folks who never need to put it into anything more formal than a raw
> > array.
> 
> std::basic_string<> _is_ a modern service of C++, and a pretty good
> one from an API point of view.
> 
> Personally I say: use std::basic_string<> and death to all other
> string representations in C++.

Agreed. I don't see why you need to obviate the C++ standard library
string. If it's that bad, upgrade your compiler environment (e.g.
Windows) or install an entirely new one (e.g. STLport and the like).

-- 
Marc Lepage
http://www.antimeta.com/
Minion open source game, RTS game programming, etc.



From tpassin at idsonline.com  Thu Dec 16 04:58:51 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <3.0.32.19991215115634.01439650@pop.intergate.ca>
Message-ID: <006901bf4782$e2c75ca0$0101a8c0@tomshp>


Tim Bray wrote:
> At 02:06 PM 12/15/99 -0500, Simon St.Laurent wrote:
> >>We talked about this a few months ago, but I'd be happy to hear what
> >>people think now.
> >
> >I'm afraid I think this is a hideous idea, and that's an understatement.
> >Even apart from code bloat - having to deal with two parts of a name means
> >twice as much code every place the name matters - I'm really not sure I
> >like the logic behind this approach.
>
> Hmm, I perceive the opposite.  I anticipate that patterns such as the
> following will be very common - not sax primitives, but you'll see the idea.
>
>  while (iterator.hasNext())
>  {
>     whatever = (Whatever) iterator.next();
>     if (whatever.ns().equals(myNamespace))
>        doMyProcessing(whatever.name());
>  }
>
> i.e. the namespace processing is highly decoupled from the name
> processing.  Another way to say it is that much name processing will
> be written to deal with one particular vocabulary, and want to just
> deal with names, assuming the NS to have been checked already.
>
> Given this, then if the parser insisted on giving you these things
> glued together, you'd actually have to do extra work to pick them
> apart before doing your real work.  Since the low level parsing
> code is going to have them in two places anyhow, it seems real awkward
> to parse them apart, glue them together with curly braces, and then
> pick them apart again to do the real work.  The namespace spec de jure
> and de facto contemplates qualified names as being 2-part things, and
> modern programming techniques can deal with multi-part data objects,
> so why all this concatenation side-stepping?
>
> The whole {ns}name notion has a smell to me of pretending that we're
> still living in the pre-namespace era, and we're not.  I mean, you
> could take all the structs in your C programs and concatenate the
> string representations of all the members together and pick them apart
> to do work with them, but that would be perverse.  So is {ns}name.
>
> >There are two cases where I think this approach could be useful. The first
> >is separating components from different namespaces for different
> >processing, which could be a good idea but isn't worth the cost so far as
> >I'm concerned.
>
> Hmm... I have the notion that this is probably the most common use case.
> Of course, we're both prognosticating, i.e. guessing.
>
> >The second case is downright funny to me at least,
> >situations where you want to discard the namespace entirely and focus only
> >on the local part.
>

Both cases will and do happen.  Like in XSLT, you don't have to call the
namespace "xsl", but whatever you do call it, your xslt tag names have to
have the same "local" names (like "apply-templates").  This fits the second
case, above.

On the other hand, there are plenty of explanations of using namespaces out
there where two different namespaces use the same local names - the authors
presumably like to come up with extreme examples.

So we need to be able to work either way, which Tim's way would do.  As for
code bloat, if you use the same method calls to handle the two name parts,
there would only be one actual copy of the code in memory, wouldn't there?
(I suppose that's somewhat language dependent - Java people will set me
straight).
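To make the comparison concrete, here is a minimal sketch of the two styles under discussion: names delivered as two parts, and names delivered as a single {ns}name string that has to be picked apart first. Everything in it (TwoPartNameSketch, QName, localNames, localNamesFromCurly) is hypothetical and not part of SAX; the XSLT namespace URI is used only as sample data.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; none of these types belong to SAX.
final class TwoPartNameSketch {

    /** A qualified name kept as two separate parts. */
    static final class QName {
        final String ns;
        final String name;
        QName(String ns, String name) { this.ns = ns; this.name = name; }
    }

    /** Two-part style: compare the namespace, then work with the local name directly. */
    static List<String> localNames(String myNamespace, List<QName> names) {
        List<String> out = new ArrayList<>();
        for (QName q : names) {
            if (q.ns.equals(myNamespace)) {
                out.add(q.name);
            }
        }
        return out;
    }

    /** Concatenated style: each "{ns}name" string must be split before the same check. */
    static List<String> localNamesFromCurly(String myNamespace, List<String> curlyNames) {
        List<String> out = new ArrayList<>();
        for (String curly : curlyNames) {
            String[] parts = split(curly);
            if (parts[0].equals(myNamespace)) {
                out.add(parts[1]);
            }
        }
        return out;
    }

    /** Splits "{ns}name" into { ns, name }; a name without braces gets an empty namespace. */
    static String[] split(String curly) {
        int close = curly.indexOf('}');
        if (curly.startsWith("{") && close > 0) {
            return new String[] { curly.substring(1, close), curly.substring(close + 1) };
        }
        return new String[] { "", curly };
    }

    public static void main(String[] args) {
        String xslt = "http://www.w3.org/1999/XSL/Transform";
        List<QName> names = Arrays.asList(
                new QName(xslt, "apply-templates"),
                new QName("urn:example:other", "apply-templates"));
        System.out.println(localNames(xslt, names));
    }
}
```

Both methods return the same local names; the second simply does the extra splitting step on every name, which is the overhead Tim describes.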

> I think this is going to happen all the time, once you've decided that
> this is the namespace you know about; then you just focus on elements &
> attributes in the old-fashioned way.  -Tim
>
>
Tom Passin




From ricko at allette.com.au  Thu Dec 16 06:16:03 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
Message-ID: <001d01bf4790$9414ff50$49f96d8c@NT.JELLIFFE.COM.AU>

 From: Lisa Rein <lisarein@finetuning.com> 

>please tell me this is just a cruel, cruel little prank james...
>let us go on with our overburdened, attribute-laden lives.....:-|

But some of the SMLies want more attributes not less: i.e., "YML"!

YML is interesting: if attributes v. elements can be justified because
they both are used differently in many programs (i.e., pull v. push),
why not have a syntax that allows hierarchical attributes: the API
provides these new attributes in a tree and elements in a stream:
a self-pruning tree.  (Of course, the trouble with this idea is that
prunability is more a processing issue than a data issue.
And it could be done in XML by adding an attribute to elements
or element type declarations, such as   yml:prunable="yes".)
(there is no need for a PI, since it follows element boundaries).

XML.COM asked me to write an article on SML: it is now up in
the  current issue  www.xml.com

I hope it is more conciliatory. During the XML development,
many people who were antagonistic to SGML got a grudging
respect for it, and many SGML people who doubted toy languages
would work shifted their positions too. I expect the same thing
can happen with SML: it may go in an unexpected direction.  

Rick Jelliffe




From jjc at jclark.com  Thu Dec 16 06:45:11 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <3.0.32.19991215092815.01443430@pop.intergate.ca> <m3r9gohyea.fsf@localhost.localdomain>
Message-ID: <385870DA.2CC2FF9E@jclark.com>

David Megginson wrote:
> 
> Tim Bray <tbray@textuality.com> writes:
> 
> > So I think it would be cleaner to deal with the fact that names can have
> > two parts, and not kludge them together with {} marks.  -Tim

I tend to agree: pasting the namespace URI and local name together is a
hack. Perhaps it's justified for backwards compatibility. (I did it
myself in expat for this reason, so I can't really complain if SAX2 does
it.)

> So, in other words, we'd have something like this:
> 
>   public interface DocumentHandler2 extends DocumentHandler {
>     public void startElement (String ns, String name, AttributeList2 atts);
>     public void endElement (String ns, String name);
>   }
> 
>   public interface AttributeList2 extends AttributeList {
>     public String [] getName (int i);
>     public String getType (int i);
>     public String getValue (int i);
>     public String getType (String ns, String name);
>     public String getValue (String ns, String name);
>   }
> 
> We talked about this a few months ago, but I'd be happy to hear what
> people think now.

For some applications (for example, layering DOM2 on top of SAX2), it's
really useful to have prefixes as well.  So I would rather see:

final class Name {
  public String getNamespaceURI();
  public String getLocalName();
  public String getPrefix();
}

public interface DocumentHandler2 extends DocumentHandler {
  void startElement(Name name, AttributeList2 atts);
  void endElement(Name name);
}
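The Name class above is given only as method signatures. A minimal sketch of how it might be filled in follows, assuming an immutable value object and an empty string for "no prefix"; those choices, the qualifiedName helper, the ElementHandler stand-in (which leaves out AttributeList2) and the Recorder class are assumptions made for illustration, not part of the proposal.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fleshing-out of the proposal; not an actual SAX2 API.
final class NameProposalSketch {

    /** Immutable three-part name: namespace URI, local name and prefix. */
    static final class Name {
        private final String namespaceURI;
        private final String localName;
        private final String prefix;

        Name(String namespaceURI, String localName, String prefix) {
            this.namespaceURI = namespaceURI;
            this.localName = localName;
            this.prefix = prefix;
        }

        public String getNamespaceURI() { return namespaceURI; }
        public String getLocalName() { return localName; }
        public String getPrefix() { return prefix; }

        /** Rebuilds prefix:localName, the form a DOM builder layered on top would want. */
        String qualifiedName() {
            return prefix.isEmpty() ? localName : prefix + ":" + localName;
        }
    }

    /** Stand-in for the proposed DocumentHandler2, without the attribute list. */
    interface ElementHandler {
        void startElement(Name name);
        void endElement(Name name);
    }

    /** Records the events it receives so they can be inspected afterwards. */
    static final class Recorder implements ElementHandler {
        final List<String> log = new ArrayList<>();

        public void startElement(Name name) {
            log.add("start " + name.qualifiedName() + " {" + name.getNamespaceURI() + "}");
        }

        public void endElement(Name name) {
            log.add("end " + name.qualifiedName());
        }
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        Name name = new Name("uri2", "element2", "p2");
        recorder.startElement(name);
        recorder.endElement(name);
        System.out.println(recorder.log);
    }
}
```

Keeping the prefix alongside the URI and local name is what lets the handler rebuild the original qualified name without a separate lookup.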

James




From pv400 at yahoo.com  Thu Dec 16 08:03:33 1999
From: pv400 at yahoo.com (P.V. NARASIMHA RAO)
Date: Mon Jun  7 17:18:34 2004
Subject: Please Help Me
Message-ID: <19991216080329.23457.qmail@web1906.mail.yahoo.com>

Hi,
       Someone please help me. I am trying to use an
external DTD. This is located on my C drive.
When I try to view it in the IE 5.0 browser I get
the following error.

The system cannot locate the resource specified. Error
processing resource 'C:/xml/xappli/DTDs/Books.dtd'.  


I tried with  the following statements.
<!DOCTYPE Books SYSTEM "/C|/xml/xappli/Books.dtd">
<!DOCTYPE Books PUBLIC "-//ECC, Inc.//DTD Books//EN"
"http://c:/xml/xappli/Books.dtd"> 
<!DOCTYPE Books SYSTEM "C:/xml/xappli/DTDs/Books.dtd">

I got the same error every time.
I tried to change the settings in my Explorer, but it
was of no use.
Please help me.
Thank u.
Narasimha Rao 



From clark.evans at manhattanproject.com  Thu Dec 16 08:05:56 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
In-Reply-To: <001d01bf4790$9414ff50$49f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <Pine.LNX.4.10.9912151431240.10177-100000@cauchy.clarkevans.com>

On Thu, 16 Dec 1999, Rick Jelliffe wrote:
> >please tell me this is just a cruel, cruel little prank james...
> >let us go on with our overburdened, attribute-laden lives.....:-|
> 
> But some of the SMLies want more attributes not less: i.e., "YML"!
> 
> YML is interesting: if attributes v. elements can be justified because
> they both are used differently in many programs (i.e., pull v. push),
> why not have a syntax that allows hierarchical attributes: the API
> provides these new attributes in a tree and elements in a stream:
> a self-pruning tree.

I *love* this description of it.  Cool.  It's a fun experiment,
I think it might bear fruit, but the proof will be in the puddin.

> (Of course, the trouble with this idea is that
> prunability is more a processing issue than a data issue.
> And it could be done in XML by adding an attribute to elements
> or element type declarations, such as   yml:prunable="yes".)
> (there is no need for a PI, since it follows element boundaries).

Certainly!  However... the only tangible meaning I
can ascertain for a given grammar is defined by an
actor's resulting behavior over the language generated.

So, in a certain sense, the "processing issues" and the
"syntax" are one and the same ... In the YML case, the
clarity of the syntax allows for a clear insight into
the analogous "pruning" processing phenomena -- which
seems to me far less tangible.

Once this processing style is examined and understood 
well, then the syntax used to bootstrap the understanding
isn't all that important anymore -- XML might do just fine.
As you point out, there are other equivalent ways to draw 
the recursive binary distinction; either as an attribute, 
or, perhaps defined by a pre-compiled version of an XSL
stylesheet over a class of XML documents...

For now, there is a slowly developing thread discussing
this resulting processing model and how it could be
used to unify sequential vs random accessors (DOM&SAX).
Anyway, if you are interested, I'd love comments. 

Also, this attempt at unification is one of the reasons
why I'd like to see SAX2 become a bit more "DOM2
friendly" _only_ when it doesn't hinder the underlying 
need for pure sequential access to a XML data source.  
XML is still *young* and 99.99% of programs using it 
have yet to be imagined... so, to Dave Megginson (for whom
I have great respect), I'd like to see a more "low level"
review of SAX... to put it in-line with DOM.  I feel any 
pain up front will more than pay for itself down stream.
Actually, I like AttributeList far better than NamedNodeMap,
so I think DOM should do the changing, but we all know
what kind of chance that snowball has.

> XML.COM asked me to write an article on SML: it is now up in
> the  current issue  www.xml.com

Yes, I read it this evening.  I really liked it.

> I hope it is more conciliatory. During the XML development,
> many people who were antagonistic to SGML got a grudging
> respect for it, and many SGML people who doubted toy languages
> would work shifted their positions too. I expect the same thing
> can happen with SML: it may go in an unexpected direction.  

At the very worst, we will learn as a community from 
the process of both stripping down XML to its minimalistic
form, and from re-introducing XML extensions that have
clearly defined use-cases.

;) clark

P.S.   I'm going to have to drop out for a few days to get 
a product out the door.... so if I'm curiously silent, you
know why -- I have yet to earn January's rent. 




From donpark at docuverse.com  Thu Dec 16 08:09:58 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
In-Reply-To: <001d01bf4790$9414ff50$49f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <000201bf479d$166ddec0$d1940e18@smateo1.sfba.home.com>

>XML.COM asked me to write an article on SML: it is now up in
>the  current issue  www.xml.com

Nice article.  I just love good analogies. <g>

>I hope it is more conciliatory. During the XML development,
>many people who were antagonistic to SGML got a grudging
>respect for it, and many SGML people who doubted toy languages
>would work shifted their positions too. I expect the same thing
>can happen with SML: it may go in an unexpected direction.

It was.  Although SML is still heading in the same direction,
SML-DEV is going in unexpected directions.

First, SML-DEV is taking a more open, experimental attitude
toward new ideas.  While XML-DEV is a great place, I have
always felt that it was more of a dining room than a kitchen.
People could use SML-DEV to cook half-baked ideas into a more
presentable form before bringing it to the [dining] table.

Second, SML-DEV is going to be expanding our focus to other XML
related standards and release 'usage' specs.  SML itself will
probably be a 'usage' spec as well.  The idea is to formalize
common practices and to discourage unhealthy practices (i.e.
Safe-SAX).

For those who don't know where SML-DEV is located, here is a
pointer:

  http://www.egroups.com/group/sml-dev/info.html

Don't forget to bring your Ginsu knives. <g>

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From rwaldin at pacbell.net  Thu Dec 16 08:23:12 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <14423.43924.116647.613307@localhost.localdomain>
Message-ID: <3858A207.43D73E9D@pacbell.net>

I'm only a user of SAX and not an implementor, but I've always expected SAX 2 to include
namespace scoping events.  It seems a bit restrictive to completely hide these events from
an application.  For example:

<element1 xmlns="uri1" xmlns:p2="uri2">
  <ns2:element2 foo="bar" xmlns:p3="uri3" p3:whiz="bang"/>
</element1>

would produce the following stream of events:

  startDocument();
  startNamespace( prefix=null, uri="uri1" );
  startNamespace( prefix="p2", uri="uri2" );
  startElement( name="element1", attrList=null );
  startNamespace( prefix="p3", uri="uri3" );
  startElement( name="p2:element2", attrList={ foo="bar", p3:whiz="bang" });
  endElement( name="p2:element2" )
  endNamespace( prefix="p3" );
  endElement( name="element1" );
  endNamespace( prefix="p2" );
  endNamespace( prefix="p1" );
  endDocument();

Simple management of these namespaces can be provided for in some helper class like the
NSUtils example, and can even be handled automatically for the most part in a new
HandlerBase2 class, but this decision of how (or whether) to handle namespaces could be
left to the application author.  Anyone else feel this way?
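As a sketch of how an application might manage scopes from such events, the class below keeps a stack of URIs per prefix and resolves element names against it. The startNamespace and endNamespace method names follow the event names used above; the class itself, the uriFor method and the treatment of a null prefix as the default namespace are assumptions made for illustration, not a proposed SAX interface.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; not part of SAX or of the NSUtils example.
final class NamespaceScopeSketch {

    // For each prefix, a stack of URIs; the top entry is the binding currently in scope.
    private final Map<String, Deque<String>> bindings = new HashMap<>();

    /** Called when a namespace declaration comes into scope; null means the default namespace. */
    void startNamespace(String prefix, String uri) {
        bindings.computeIfAbsent(key(prefix), k -> new ArrayDeque<>()).push(uri);
    }

    /** Called when the most recent declaration for this prefix goes out of scope. */
    void endNamespace(String prefix) {
        Deque<String> stack = bindings.get(key(prefix));
        if (stack == null || stack.isEmpty()) {
            throw new IllegalStateException("prefix not in scope: " + prefix);
        }
        stack.pop();
    }

    /**
     * Returns the namespace URI for an element name such as "p2:element2",
     * or null if its prefix is not bound. Unprefixed names use the default
     * namespace, which is right for elements but not for attributes.
     */
    String uriFor(String elementName) {
        int colon = elementName.indexOf(':');
        String prefix = colon < 0 ? "" : elementName.substring(0, colon);
        Deque<String> stack = bindings.get(prefix);
        return (stack == null || stack.isEmpty()) ? null : stack.peek();
    }

    private static String key(String prefix) {
        return prefix == null ? "" : prefix;
    }

    public static void main(String[] args) {
        NamespaceScopeSketch scopes = new NamespaceScopeSketch();
        scopes.startNamespace(null, "uri1");
        scopes.startNamespace("p2", "uri2");
        System.out.println(scopes.uriFor("p2:element2"));
    }
}
```

Because each prefix has its own stack, an inner element can rebind a prefix and the outer binding comes back when the matching endNamespace arrives.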

-Ray



From Teun.Duynstee at macaw.nl  Thu Dec 16 08:30:14 1999
From: Teun.Duynstee at macaw.nl (Teun Duynstee)
Date: Mon Jun  7 17:18:34 2004
Subject: Please Help Me
Message-ID: <E77DE7E35B22D3118D3C00A0C942EC5222D09D@EXCHSRV01>


Your declaration should refer to a valid URL. A file system URL looks like
this:
file:///C:/xml/xappli/DTDs/Books.dtd
If you are working locally on both your XML and DTD, you can also use a
relative reference. If you want to use an HTTP URL, it must be possible to
retrieve the resource. http://c:/xml/xappli/Books.dtd doesn't look like a
valid HTTP URL to me.
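For illustration, here is a small sketch that turns a Windows path into a file URL and builds the matching DOCTYPE line. It uses the three-slash form (an empty host followed by an absolute path). The class and method names are made up, and the sketch assumes the path contains no characters that need percent-encoding, such as spaces.

```java
import java.net.URI;

// Hypothetical helper for illustration only.
final class FileUrlSketch {

    /** Turns a Windows path such as C:\dir\file.dtd into a file URL string. */
    static String toFileUrl(String windowsPath) {
        String forward = windowsPath.replace('\\', '/');
        // "file://" + empty host + "/" + drive-letter path.
        return "file:///" + forward;
    }

    /** Builds a DOCTYPE declaration whose system identifier is the file URL. */
    static String doctype(String rootElement, String windowsPath) {
        return "<!DOCTYPE " + rootElement + " SYSTEM \"" + toFileUrl(windowsPath) + "\">";
    }

    public static void main(String[] args) {
        String url = toFileUrl("C:\\xml\\xappli\\DTDs\\Books.dtd");
        // URI.create throws IllegalArgumentException if the string is not a valid URI.
        URI parsed = URI.create(url);
        System.out.println(parsed.getScheme() + " " + parsed.getPath());
        System.out.println(doctype("Books", "C:\\xml\\xappli\\DTDs\\Books.dtd"));
    }
}
```

If the XML document and the DTD sit in the same directory, a relative system identifier such as "Books.dtd" avoids the question entirely.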

Good luck,
Teun

PS. There is a fairly good tutorial on XML, DTDs and such at
http://wdvl.internet.com/Authoring/Languages/XML/Tutorials/Intro/toc.html



From clark.evans at manhattanproject.com  Thu Dec 16 09:05:40 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <3858A207.43D73E9D@pacbell.net>
Message-ID: <Pine.LNX.4.10.9912151548500.17129-100000@cauchy.clarkevans.com>


This is very interesting....

On Thu, 16 Dec 1999, Ray Waldin wrote:
> I'm only a user of SAX and not an implementor, but I've always 
> expected SAX 2 to include namespace scoping events.  It seems 
> a bit restrictive to completely hide these events from
> an application.  For example:
> 
> <element1 xmlns="uri1" xmlns:p2="uri2">
>   <ns2:element2 foo="bar" xmlns:p3="uri3" p3:whiz="bang"/>
> </element1>

I assume you mean "p2" instead of "ns2" ?

> would produce the following stream of events:
> 
>   startDocument();
>   startNamespace( prefix=null, uri="uri1" );
>   startNamespace( prefix="p2", uri="uri2" );
>   startElement( name="element1", attrList=null );
>   startNamespace( prefix="p3", uri="uri3" );
>   startElement( name="p2:element2", attrList={ foo="bar", p3:whiz="bang" });
>   endElement( name="p2:element2" );
>   endNamespace( prefix="p3" );
>   endElement( name="element1" );
>   endNamespace( prefix="p2" );
>   endNamespace( prefix=null );
>   endDocument();
> 

Slight problem.

The namespaces are defined *after* the element
tag begins.  So, startNamespace would logically
be *inside* of the element like:

startDocument()
  startElement(name="element1" attributes="{...}" )
    defineNamespace(prefix=null, uri="uri1");
     ...

However, attributes (since they are random
and not sequential access), occur within
the event scope of the element.  Yet, the
namespace applies to the attribute!

I realise you put the events immediately
preceding the begin tag and immediately 
following the end tag to get around this
problem, but it just doesn't look pretty.

However, your suggestion would work perfectly
for the 100% pure sequential access interface
which JClark has for native expat !

Best,

Clark 




From rwaldin at pacbell.net  Thu Dec 16 10:23:58 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <Pine.LNX.4.10.9912151548500.17129-100000@cauchy.clarkevans.com>
Message-ID: <3858BE52.E65D050D@pacbell.net>

> I assume you mean "p2" instead of "ns2" ?

Yes.  My mistake.

> Slight problem.
> 
> The namespaces are defined *after* the element
> tag begins.  So, startNamespace would logically
> be *inside* of the element like:

I disagree.  I would argue that namespaces are declared before the element start tag ends.
And it is only *after* the start tag ends that a parser can trigger the startElement event
as it must parse the entire start tag to gather all attributes.  So it makes sense to send
startNamespace events before the startElement event for the element in which the namespace
is declared.
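
A minimal Java sketch of this ordering, assuming a namespace-aware JAXP SAX
parser is available. The callback names startPrefixMapping and
endPrefixMapping are those of org.xml.sax.ContentHandler, not the
startNamespace/endNamespace names used in this thread, and the class name
PrefixEventTrace is hypothetical:

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class PrefixEventTrace {
    /** Parses the XML text and records namespace and element events in order. */
    public static List<String> trace(String xml) {
        final List<String> events = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startPrefixMapping(String prefix, String uri) {
                // The default namespace is reported with an empty prefix.
                events.add("startPrefixMapping(" + prefix + "," + uri + ")");
            }
            @Override
            public void endPrefixMapping(String prefix) {
                events.add("endPrefixMapping(" + prefix + ")");
            }
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("startElement({" + uri + "}" + local + ")");
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("endElement({" + uri + "}" + local + ")");
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return events;
    }

    public static void main(String[] args) {
        String xml = "<element1 xmlns=\"uri1\" xmlns:p2=\"uri2\">"
                   + "<p2:element2 foo=\"bar\" xmlns:p3=\"uri3\" p3:whiz=\"bang\"/>"
                   + "</element1>";
        for (String event : trace(xml)) {
            System.out.println(event);
        }
    }
}
```

The prefix events for an element arrive before its startElement and after
its endElement; the relative order of the prefix events themselves should
not be relied on.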

> startDocument()
>   startElement(name="element1" attributes="{...}" )
>     defineNamespace(prefix=null, uri="uri1");
>      ...
> 
> However, attributes (since they are random
> and not sequential access), occur within
> the event scope of the element.  Yet, the
> namespace applies to the attribute!

This is an area that's always been a little unclear to me.  Given the following XML:

<a b="c" xmlns="u1">
  <d:e f="g" xmlns:d="u2"/>
</a>

What is the namespace of the f attribute?  From my understanding of the spec, I think it's
"u2", but I'm not sure how that differs from an attribute called d:f on the same element. 
Or is d:f not allowed because its expanded name is identical to f's expanded name?
 
What's even more confusing is...what about b?  According to the spec, "the default
namespace does not apply to attribute names".  So then everything in the above example
belongs to a namespace except b?  That doesn't make sense...
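
One way to probe this is to ask a namespace-aware DOM parser what it reports
for the example above.  This is only a sketch (the class name
AttributeNamespaceCheck is hypothetical).  Under the rule quoted from the
spec, unprefixed attributes such as b and f should come back with no
namespace at all, while a and d:e report "u1" and "u2":

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

public class AttributeNamespaceCheck {
    /**
     * Returns the namespace URIs of: element a, attribute b,
     * element d:e, attribute f (null means "no namespace").
     */
    public static String[] namespaces(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                                  .parse(new InputSource(new StringReader(xml)));
            Element a = doc.getDocumentElement();
            Element e = (Element) a.getElementsByTagNameNS("u2", "e").item(0);
            return new String[] {
                a.getNamespaceURI(),
                a.getAttributeNode("b").getNamespaceURI(),
                e.getNamespaceURI(),
                e.getAttributeNode("f").getNamespaceURI()
            };
        } catch (Exception ex) {
            throw new RuntimeException(ex);
        }
    }

    public static void main(String[] args) {
        String xml = "<a b=\"c\" xmlns=\"u1\"><d:e f=\"g\" xmlns:d=\"u2\"/></a>";
        String[] labels = { "a", "@b", "d:e", "@f" };
        String[] uris = namespaces(xml);
        for (int i = 0; i < uris.length; i++) {
            System.out.println(labels[i] + " -> " + uris[i]);
        }
    }
}
```

On this reading, f and d:f have different expanded names, so both could
appear on the same element.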

> I realise you put the events immediately
> preceding the begin tag and immediately
> following the end tag to get around this
> problem, but it just doesn't look pretty.

Yet I think this is what was intended by http://www.w3.org/TR/REC-xml-names#scoping: 

"The namespace declaration is considered to apply to the element where it is specified and
to all elements within the content of that element, unless overridden by another namespace
declaration with the same NSAttName part"

-Ray



From oren at capella.co.il  Thu Dec 16 11:14:48 1999
From: oren at capella.co.il (Oren Ben-Kiki)
Date: Mon Jun  7 17:18:34 2004
Subject: Status of SML?
References: <001d01bf4790$9414ff50$49f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <077d01bf47b4$de1ea960$4602a8c0@capella.co.il>

Rick Jelliffe <ricko@allette.com.au> wrote: (to the XmlDev mailing list. I'm
redirecting this to the SML mailing list - I think this is more
appropriate.)

> XML.COM asked me to write an article on SML: it is now up in
> the  current issue  www.xml.com
>
> I hope it is more conciliatory. During the XML development,
> many people who were antagonistic to SGML got a grudging
> respect for it, and many SGML people who doubted toy languages
> would work shifted their positions too. I expect the same thing
> can happen with SML: it may go in an unexpected direction.

I'd like to respond to the points you raised in the article. They touch on
the issues which have been debated in the SML mailing list in these past few
weeks (not that anything has been settled :-)

> So what areas is an SML suited for?
> Perhaps reverse engineering will give some clue:
>
> * If it is UTF-8 only, it is not practical for
> local use for much non-Western data.

SML should support UTF-16 as well as UTF-8.

> * If it allows other sets but does not allow this
> to be unambiguously labeled, is it not suitable
> for transnational use.

UTF-16 is unambiguously labeled. The file starts with the byte order mark
U+FEFF (bytes FE FF or FF FE). Otherwise the file is assumed to be UTF-8.
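
A rough sketch of that detection rule in Java. The class and method names
are hypothetical, and it assumes the only two cases are "UTF-16 with a byte
order mark" and "everything else is UTF-8". The byte order mark U+FEFF
shows up as the bytes FE FF in big-endian files and FF FE in little-endian
ones:

```java
public class EncodingSniffer {
    /** Guesses the encoding from the first two bytes of the file. */
    public static String sniff(byte[] head) {
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return "UTF-16BE"; // byte order mark, big-endian
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return "UTF-16LE"; // byte order mark, little-endian
            }
        }
        return "UTF-8"; // no mark: fall back to the assumed default
    }

    public static void main(String[] args) {
        byte[] utf16 = { (byte) 0xFE, (byte) 0xFF, 0x00, 0x3C };
        byte[] utf8 = { 0x3C, 0x61, 0x2F, 0x3E };
        System.out.println(sniff(utf16));
        System.out.println(sniff(utf8));
    }
}
```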

> * If it does not include PIs, it is not suitable for
> server use (on the evidence that most server-side
> includes use special delimiters for what are in fact PIs).

I could never figure out why <?...?> is better than <pi:.../> - that is, why
PIs are inherently better than simple elements. Perhaps you can give an
example?

> * If it does not include some mechanism for literal text,
> it is not suitable for direct data entry.

The use case you are referring to is writing an SML document by hand - say,
in Notepad? I agree that SML isn't well-suited for that. Then again, XML
isn't, either, since it has done away with SGML CDATA elements (good
riddance!).

As for whether using CDATA sections is easier than escaping characters, this
depends on the editor. Any decent one will automate either form of escaping
for you. And users who do such things presumably use a decent editor.

> * If it does not include syntactic distinction for the
> most common targets of tags (i.e., comments, elements,
> processing instructions, entity references), then people
> must introduce another layer straight away.

I don't see why an element-based solution is worse than a special mechanism
solution. For example, why is an XLink based approach worse than an entity
based one? Likewise for comments (<sml:comment>...</sml:comment>), PIs
(<pi:do-something/>), etc.

> * If it does not have basic attribute defaulting, it must be
> bundled with some transformation language; so it is best
> for recipient systems that know the defaults.

First, this assumes SML supports attributes in the first place. I'm for
that - what you call a "soft reductionist" :-) At any rate, it seems XML
defines "default attributes" as:

If an element does not specify this attribute, the parser will _add_ one
with this value; the "rest of the system" is not aware of the defaults.

On the other hand, I thought a major advantage of XML over SGML was that a
document was processable without a DTD. And yet, DTDs as specified are a
watered-down transformation language. It isn't clear to me how this is
reconciled.

SML simply goes "all the way". No DTDs required, period. Defaults, if any,
reside in "the rest of the system" (e.g., using XSLT) and not in the parser
itself. This is a reasonable choice for a simplified usage profile.
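
To make "defaults reside in the rest of the system" concrete, here is a
small Java sketch. It is only an illustration: the class name, the method,
and the idea of keeping defaults in a plain map are all assumptions, not
anything SML specifies. The parser hands over exactly the attributes that
were written, and the application layers its own defaults underneath:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AttributeDefaults {
    /**
     * Returns the attributes the application should see: every default,
     * overridden by whatever the document actually specified.
     */
    public static Map<String, String> withDefaults(Map<String, String> parsed,
                                                   Map<String, String> defaults) {
        Map<String, String> result = new LinkedHashMap<>(defaults);
        result.putAll(parsed); // explicit values win over defaults
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new LinkedHashMap<>();
        defaults.put("align", "left");
        defaults.put("border", "0");

        Map<String, String> parsed = new LinkedHashMap<>();
        parsed.put("border", "1");

        System.out.println(withDefaults(parsed, defaults));
    }
}
```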

Share & Enjoy,

    Oren Ben-Kiki




From sb at metis.no  Thu Dec 16 11:39:40 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: "Simon St.Laurent"'s message of "Wed, 15 Dec 1999 14:06:07 -0500"
References: <199912151906.OAA24894@hesketh.net>
Message-ID: <whvh5z9kr5.fsf@yggdrasil.metis.no>

>>>>> "Simon St.Laurent" <simonstl@simonstl.com>:

> At 01:06 PM 12/15/99 -0500, David Megginson wrote:
>>> So I think it would be cleaner to deal with the fact that names can have
>>> two parts, and not kludge them together with {} marks.  -Tim
>> ....
>> We talked about this a few months ago, but I'd be happy to hear what
>> people think now.

> I'm afraid I think this is a hideous idea, and that's an understatement.
> Even apart from code bloat - having to deal with two parts of a name means
> twice as much code every place the name matters - I'm really not sure I
> like the logic behind this approach.  

It would fit my model well, I think.  I would dispatch to the
appropriate DocumentHandler, based on the namespace.

But what if the handling of stuff from different namespaces is
intermingled...?  Hm...
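
A sketch of the dispatch model I have in mind, in Java.  The class name
NamespaceDispatcher is made up, and a simple callback taking the local
name stands in for a real DocumentHandler:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

public class NamespaceDispatcher {
    // One handler per namespace URI; the handler receives the local name.
    private final Map<String, Consumer<String>> handlers = new HashMap<>();

    public void register(String namespaceUri, Consumer<String> handler) {
        handlers.put(namespaceUri, handler);
    }

    /** Routes an element to the handler for its namespace, if there is one. */
    public boolean dispatch(String namespaceUri, String localName) {
        Consumer<String> handler = handlers.get(namespaceUri);
        if (handler == null) {
            return false; // nobody claimed this namespace
        }
        handler.accept(localName);
        return true;
    }

    public static void main(String[] args) {
        NamespaceDispatcher dispatcher = new NamespaceDispatcher();
        dispatcher.register("http://www.w3.org/1999/xhtml",
                name -> System.out.println("XHTML element: " + name));
        dispatcher.dispatch("http://www.w3.org/1999/xhtml", "a");
        System.out.println(dispatcher.dispatch("urn:example:unknown", "thing"));
    }
}
```

This works when each element belongs to exactly one handler.  It says
nothing yet about an element from one namespace that carries attributes
from another.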



From martind at netfolder.com  Thu Dec 16 12:21:45 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <whvh5z9kr5.fsf@yggdrasil.metis.no>
Message-ID: <NBBBJPGDLPIHJGEHAKBAIELMEJAA.martind@netfolder.com>

Hi Steinar,

Steinar said:
But what if the handling of stuff from different namespaces is
intermingled...?  Hm...

Didier reply:
Can you be more explicit? Where precisely do you see the problem?

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com 



From sb at metis.no  Thu Dec 16 13:02:30 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: "Didier PH Martin"'s message of "Thu, 16 Dec 1999 07:03:47 -0500"
References: <NBBBJPGDLPIHJGEHAKBAIELMEJAA.martind@netfolder.com>
Message-ID: <whln6v82e9.fsf@yggdrasil.metis.no>

>>>>> "Didier PH Martin" <martind@netfolder.com>:

> Hi Steinar,
> Steinar said:
> But what if the handling of stuff from different namespaces is
> intermingled...?  Hm...

> Didier reply:
> Can you be more explicit? Where precisely do you see the problem?

My problem is that I don't see clearly how people are going to use
namespaces.  That's why I put a "Hm..." there.

I suspect that people are going to intermingle stuff from different
namespaces, in a way that'll make processing hard.

Will stuff from different namespaces always have the same semantics?
Or should I expect separate processing depending on the context?



From david at megginson.com  Thu Dec 16 14:24:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX and DOM
In-Reply-To: "Clark C. Evans"'s message of "Tue, 14 Dec 1999 04:43:07 -0500 (EST)"
References: <Pine.LNX.4.10.9912140343460.28872-100000@cauchy.clarkevans.com>
Message-ID: <m37lif55hh.fsf@localhost.localdomain>

"Clark C. Evans" <clark.evans@manhattanproject.com> writes:

> SAX is a stream interface, but unfortunately,
> an event/listener pattern was not used.

SAX doesn't use the Java version of the event/listener stuff because
an XML document will generate several thousand events in a second or
two, and you'd really, really start to notice the overhead (unlike a
GUI, where you'll see at most a couple of dozen events in a second,
and usually much less).  It's very easy to implement the Java event
stuff on top of SAX if you want it, but there wasn't a good reason to
impose it on everyone.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From pete.kirkham at minster.york.ac.uk  Thu Dec 16 14:33:31 1999
From: pete.kirkham at minster.york.ac.uk (pete.kirkham@minster.york.ac.uk)
Date: Mon Jun  7 17:18:34 2004
Subject: Inheritance in Xml, (Xpointer??)
Message-ID: <3858F796.863AB527@minster.york.ac.uk>

> dubolc jean (xmlstat@yahoo.com)
>  hi,
> 
>  I am translating a UML chart into Xml with the IBM
>  parser XML4j and I've got a problem:
> 
>  How can a Child Element have two Parents?

XML files are structured as trees.  Thus you cannot represent this in the
structure of XML.
 
>  To make an element inherit from another, is the
>  solution to put an attribute "inheritance" in the
>  Child with the value "yes" or "no"?

You are better off having some form of element representing each of your
relations in a many-to-many scenario.
 
You might want to look at
http://www.yy.cs.keio.ac.jp/~suzuki/project/uxf/uxf.html
which contains some DTDs for representing UML, though these use some text
matching or element contents for relations (such as source or target of state
machine transitions) which might better be represented as an XLink attribute.

Pete.




From david at megginson.com  Thu Dec 16 14:38:25 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:34 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: James Clark's message of "Thu, 16 Dec 1999 11:55:54 +0700"
References: <3.0.32.19991215092815.01443430@pop.intergate.ca> <m3r9gohyea.fsf@localhost.localdomain> <385870DA.2CC2FF9E@jclark.com>
Message-ID: <m34sdj54v2.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> > Tim Bray <tbray@textuality.com> writes:
> > 
> > > So I think it would be cleaner to deal with the fact that names
> > > can have two parts, and not kludge them together with {} marks.
> > > -Tim
> 
> I tend to agree: pasting the namespace URI and local name together
> is a hack. Perhaps it's justified for backwards compatibility. (I
> did it myself in expat for this reason, so I can't really complain
> if SAX2 does it.)

It turns out that it's a very difficult decision, and in the end,
someone's life has to end up being harder.

> For some applications (for example, layering DOM2 on top of SAX2), it's
> really useful to have prefixes as well.  So I would rather see:
> 
> final class Name {
>   public String getNamespaceURI();
>   public String getLocalName();
>   public String getPrefix();
> }

To make this really useful, however, we should add equals(), intern(),
and hashCode() methods, and that leads to a different (and trickier)
question: should equals() and hashCode() consider the prefix, or not?
People will get really surprising results if

  {"http://www.w3.org/1999/xhtml", "a", ""}

equals()

  {"http://www.w3.org/1999/xhtml", "a", "html"}

but it is counterintuitive that the two are not equal from a normal
processing perspective.  Nasty business, really.
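
To make the trade-off concrete, here is a minimal sketch of one of the two
choices: a Name whose equals() and hashCode() ignore the prefix.  This is
an illustration only, not a proposed SAX2 class.  The constructor and the
choice to ignore the prefix are assumptions layered on the three getters
quoted above:

```java
import java.util.Objects;

public final class Name {
    private final String namespaceURI;
    private final String localName;
    private final String prefix;

    public Name(String namespaceURI, String localName, String prefix) {
        this.namespaceURI = Objects.requireNonNull(namespaceURI);
        this.localName = Objects.requireNonNull(localName);
        this.prefix = Objects.requireNonNull(prefix);
    }

    public String getNamespaceURI() { return namespaceURI; }
    public String getLocalName() { return localName; }
    public String getPrefix() { return prefix; }

    /** Two names are equal when URI and local part match; the prefix is ignored. */
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof Name)) {
            return false;
        }
        Name that = (Name) other;
        return namespaceURI.equals(that.namespaceURI)
            && localName.equals(that.localName);
    }

    /** Must agree with equals(), so the prefix is left out here as well. */
    @Override
    public int hashCode() {
        return Objects.hash(namespaceURI, localName);
    }

    public static void main(String[] args) {
        Name plain = new Name("http://www.w3.org/1999/xhtml", "a", "");
        Name prefixed = new Name("http://www.w3.org/1999/xhtml", "a", "html");
        System.out.println(plain.equals(prefixed));
    }
}
```

With this choice the two names above compare equal and land in the same
hash bucket, which is what namespace-aware processing wants and exactly
what will surprise anyone who still thinks in terms of "html:a" versus "a".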


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec 16 14:42:13 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: "Simon St.Laurent"'s message of "Wed, 15 Dec 1999 14:06:07 -0500"
References: <Tim Bray's message of "Wed, 15 Dec 1999 09:29:18 -0800"> <3.0.32.19991215092815.01443430@pop.intergate.ca> <199912151906.OAA24894@hesketh.net>
Message-ID: <m31z8n54oj.fsf@localhost.localdomain>

"Simon St.Laurent" <simonstl@simonstl.com> writes:

> I'm afraid I think this is a hideous idea, and that's an understatement.
> Even apart from code bloat - having to deal with two parts of a name means
> twice as much code every place the name matters - I'm really not sure I
> like the logic behind this approach.  

Well, the code bloat isn't strictly true if the Name class implements
equals(), since you will usually just be testing for equality; still,
it does add an extra level of indirection, and every indirection in an
API is an open flame in a powder magazine (it won't cause any problems
as long as you're careful and follow the proper procedures, but...).


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From msabin at cromwellmedia.co.uk  Thu Dec 16 14:55:43 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A675194@odin.cromwellmedia.co.uk>

David Megginson wrote,
> still, it does add an extra level of indirection, and every 
> indirection in an API is an open flame in a powder magazine
> (it won't cause any problems as long as you're careful and 
> follow the proper procedures, but...).

Bear in mind that a decent JIT should be able to eliminate a
lot of the overhead in this case.

I really think you're getting too hung up on the performance
issue too early. Sure, tune it if it proves to be a bottleneck,
but, please, do some benchmarks before twisting the design in the
name of optimization.

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From david at megginson.com  Thu Dec 16 15:01:25 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <AA4C152BA2F9D211B9DD0008C79F760A675194@odin.cromwellmedia.co.uk>
References: <AA4C152BA2F9D211B9DD0008C79F760A675194@odin.cromwellmedia.co.uk>
Message-ID: <14424.65166.312499.405767@localhost.localdomain>

Miles Sabin writes:

 > David Megginson wrote,

 > > still, it does add an extra level of indirection, and every 
 > > indirection in an API is an open flame in a powder magazine
 > > (it won't cause any problems as long as you're careful and 
 > > follow the proper procedures, but...).
 > 
 > Bear in mind that a decent JIT should be able to eliminate a
 > lot of the overhead in this case.
 > 
 > I really think you're getting too hung up on the performance
 > issue too early. Sure, tune it if it proves to be a bottleneck,
 > but, please, do some benchmarks before twisting the design in the
 > name of optimization.

I think there's a bit of confusion here -- at least, my comments that
you quote above have nothing to do with optimization or performance
questions.  The problem is that it's not self-evident which is the
better design.

Every level of indirection is an open flame because it increases the
difficulty (and cost) of learning and implementing an API, which
leads to several problems:

a) the API is less likely to gain acceptance;
b) the API is less likely to be implemented correctly; and
c) the costs of learning, teaching, and documenting the API are higher.

I don't doubt that we can optimize for either case when the time comes 
-- my NSUtils class was one example of how we could optimize for the
single-string case (though obviously, it needs a little tweaking).
Right now, however, this is strictly a design question.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tbray at textuality.com  Thu Dec 16 15:45:31 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <3.0.32.19991216074341.013812d0@pop.intergate.ca>

At 09:37 AM 12/16/99 -0500, David Megginson wrote:
>  People
>will get really surprising results if 
>
>  {"http://www.w3.org/1999/xhtml", "a", ""}
>
>equals()
>
>  {"http://www.w3.org/1999/xhtml", "a", "html"}

This leads back to the main point.  In a namespace-oblivious world, it
is nuts to claim that these are the same thing.  In a namespace-sensitive
world, it is nuts to claim that they're not.  The two world-views are 
deeply incompatible.

And the namespace-oblivious world is just no longer interesting.

-Tim



From martind at netfolder.com  Thu Dec 16 15:59:58 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <whln6v82e9.fsf@yggdrasil.metis.no>
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEMCEJAA.martind@netfolder.com>

Hi Steinar,

Steinar said:
My problem is that I don't see clearly how people are going to use
namespaces.  That's why I put a "Hm..." there.

I suspect that people are going to intermingle stuff from different
namespaces, in a way that'll make processing hard.

Will stuff from different namespaces always have the same semantics?
Or should I expect separate processing depending on the context?

Didier reply:
So, if I understand you well, you think that there is a high probability
that people will intermingle elements from different name spaces without
mentioning that they are owned by a name space? Is that what you mean?

In that case, it will be harder to attach the processing of a particular
element with a document or event handler.

One possible algorithm, though:
a) All event handlers are structured as a list.
b) Each event handler receives the element for interpretation. If the
element is recognized then it is processed.

Here we may have a variant and a possible source of conflict:
a) If a handler recognizes the element and processes it, then the element
processing is ended. CON: the element could be part of one name space and
be processed there, but it may also contain an attribute from another name
space (ex: an xlink:type attribute); the element is then never processed by
the event handler that this attribute attaches to it.

b) The element always goes through the whole list of handlers. CON: if two
name spaces have the same attribute, then it may be improperly processed by
the wrong event handler.
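
The two variants can be sketched in Java as follows. This is only an
illustration: the class name HandlerChain is hypothetical, and each event
handler is reduced to a predicate that returns true when it recognized and
processed the element:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

public class HandlerChain {
    // Each handler returns true if it recognized (and processed) the element.
    private final List<Predicate<String>> handlers = new ArrayList<>();

    public void add(Predicate<String> handler) {
        handlers.add(handler);
    }

    /** Variant a): stop at the first handler that recognizes the element. */
    public int dispatchFirst(String elementName) {
        for (Predicate<String> handler : handlers) {
            if (handler.test(elementName)) {
                return 1; // later handlers never see the element
            }
        }
        return 0;
    }

    /** Variant b): offer the element to every handler in the list. */
    public int dispatchAll(String elementName) {
        int processed = 0;
        for (Predicate<String> handler : handlers) {
            if (handler.test(elementName)) {
                processed++; // more than one handler may claim the same name
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        HandlerChain chain = new HandlerChain();
        chain.add(name -> name.equals("title")); // e.g. a book vocabulary
        chain.add(name -> name.equals("title")); // e.g. a linking vocabulary
        System.out.println(chain.dispatchFirst("title"));
        System.out.println(chain.dispatchAll("title"));
    }
}
```

Both handlers claim "title": variant a) hides the element from the second
handler, and variant b) lets both process it, which is the conflict
described above.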

Conclusion: yes, in the absence of architectural forms in the XML world, if
people do not get into the habit of marking their elements with the name
space identifier, it may bring some problems. People used to HTML will have
problems here. For them we increased the level of complexity. Sounds like
XML is slowly becoming as complex as SGML :-)

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From nisse at lysator.liu.se  Thu Dec 16 16:09:51 1999
From: nisse at lysator.liu.se (Niels Möller)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: David Megginson's message of "Thu, 16 Dec 1999 10:00:30 -0500 (EST)"
References: <AA4C152BA2F9D211B9DD0008C79F760A675194@odin.cromwellmedia.co.uk> <14424.65166.312499.405767@localhost.localdomain>
Message-ID: <nn3dt2dg06.fsf@sanna.lysator.liu.se>

David Megginson <david@megginson.com> writes:

> Every level of indirection is an open flame because it increases the
> difficulty (and cost) of learning and implementing an API, which
> leads to several problems:

In my experience, extra indirection should be safe and with no
surprises as long as there are no destructive mutations involved. If
you create the name instances (and appropriate hash and equality
methods that say that names are equal iff both of their parts are
equal), with both parts filled in when the object is instantiated, and
if you disallow all further destructive changes, I think you should be
pretty safe.

If you do this, you could also use some kind of (weak) hash table from
strings "{ns}local" to name instances, in order to share name
instances where possible.
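
Roughly what I have in mind, sketched in Java (take it with a grain of
salt; the class name InternedName is made up).  Instead of keying the weak
table on the "{ns}local" string, this version keys it on the name object
itself and stores a weak reference back to it, so unused names can still be
collected:

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.Objects;
import java.util.WeakHashMap;

public final class InternedName {
    // Weak keys and weak values: an entry vanishes once no caller holds the name.
    private static final Map<InternedName, WeakReference<InternedName>> POOL =
            new WeakHashMap<>();

    private final String namespaceURI;
    private final String localName;

    // Both parts are fixed at construction; there are no setters.
    private InternedName(String namespaceURI, String localName) {
        this.namespaceURI = Objects.requireNonNull(namespaceURI);
        this.localName = Objects.requireNonNull(localName);
    }

    /** Returns the shared instance for this (namespace, local name) pair. */
    public static synchronized InternedName of(String namespaceURI, String localName) {
        InternedName candidate = new InternedName(namespaceURI, localName);
        WeakReference<InternedName> ref = POOL.get(candidate);
        InternedName existing = (ref == null) ? null : ref.get();
        if (existing != null) {
            return existing;
        }
        POOL.put(candidate, new WeakReference<>(candidate));
        return candidate;
    }

    public String getNamespaceURI() { return namespaceURI; }
    public String getLocalName() { return localName; }

    @Override
    public boolean equals(Object other) {
        if (!(other instanceof InternedName)) {
            return false;
        }
        InternedName that = (InternedName) other;
        return namespaceURI.equals(that.namespaceURI)
            && localName.equals(that.localName);
    }

    @Override
    public int hashCode() {
        return Objects.hash(namespaceURI, localName);
    }

    public static void main(String[] args) {
        InternedName first = InternedName.of("http://www.w3.org/1999/xhtml", "a");
        InternedName second = InternedName.of("http://www.w3.org/1999/xhtml", "a");
        System.out.println(first == second);
    }
}
```

As long as somebody holds on to a name, asking for the same pair again
gives back the same object, so identity comparison is enough.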

I'm no java expert, but in general I think it sounds very reasonable
to use some object oriented model for the particular
"name"-abstraction defined by XML and the namespaces spec.

/Niels



From dhunter at Mobility.com  Thu Dec 16 16:30:12 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC01C7@cc20exch2.mobility.com>

From: Tim Bray [mailto:tbray@textuality.com]
Sent: Thursday, December 16, 1999 10:46 AM
> 
> And the namespace-oblivious world is just no longer interesting.
> 
> -Tim

Interesting or not, and that depends on point of view :-), I'm not convinced
that the namespace-oblivious world is going to go away.  Ever.  If one wants
to use XML only in the context of one's own application, namespaces aren't
useful or needed.  One of the reasons XML is so great is that you don't
*need* a DTD to process XML documents - are we just going to replace the DTD
with namespaces, and require that all XML documents use namespaces of some
sort?

David Hunter
david.hunter@mobileq.com
http://www.MobileQ.com



From costello at mitre.org  Thu Dec 16 16:32:34 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:35 2004
Subject: xpointer - just a location mechanism?
Message-ID: <3859147A.D86C58FD@mitre.org>

Hi Folks,

I am making my way through the latest xpointer spec.  I may be missing
something (I am not finished reading it) but it appears that the spec
does not support the xLink capability of embedding[1] a portion of an
XML document into the currently active document.  Consider this simple
xLink with an xpointer:

<xlink:simple 
    href="http://www.somewhere.com/BookCatalogue.xml#xpointer(/Book[1])"
    show="parsed"/>

This hyperlink should result in extracting out of BookCatalogue.xml the
subtree referenced by the first Book element, and embedding it into the
current active document.  (This is my understanding of how this should
work.  Please correct me if I am in error.)

The xpointer spec states that an XPointer is simply a location
mechanism.  In section 2.4 it states:

"XPointers are not a general query mechanism, they are a specification
of document locations."

I read this as saying "an xpointer defines how to move a cursor around
in an XML document.  It does not describe what a location means in terms
of nodes (or strings) being referenced."  

If the spec doesn't describe what the movement of a cursor to a location
means in terms of node lists (or strings) then how is it going to
support the xLink embed mechanism?  /Roger


[1] From the xLink spec: "The parsed option (of the show attribute),
relating directly to the XML concept of a parsed entity, indicates that
the content should be integrated into the document from which the link
was actuated."




From bckman at ix.netcom.com  Thu Dec 16 16:52:50 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:35 2004
Subject: xpointer - just a location mechanism?
References: <3859147A.D86C58FD@mitre.org>
Message-ID: <01fd01bf47e7$d64496e0$a5addccf@prioritynetworks.net>

XPointer uses the 'to' keyword to isolate a section of the document, e.g.
someurl.xml#id1 to id2.

Frank




From ray at xmission.com  Thu Dec 16 16:58:55 1999
From: ray at xmission.com (Ray Whitmer)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <whln6v82e9.fsf@yggdrasil.metis.no>
Message-ID: <Pine.GSO.4.10.9912160946020.12160-100000@xmission.xmission.com>

On 16 Dec 1999, Steinar Bang wrote:

> >>>>> "Didier PH Martin" <martind@netfolder.com>:
> 
> > HI Steinar,
> > Steinar said:
> > But what if the handling of stuff from different namespaces is
> > intermingled...?  Hm...
> 
> > Didier reply:
> > Can you be more explicit? Where precisely do you see the problem?
> 
> My problem is that I don't see clearly how people are going to use
> namespaces.  That's why I put a "Hm..." there.

I think of it as borrowing a few words of German into my English 
vocabulary to describe things that are already well-said.  American
English is already a great mixture of things from other languages.

If I see that some credible company has a good definition people
are using, I will use their definition.

From a programming perspective, as with Java classes, I think of
the namespace together with the local name as the only good identifier
of the element.  Separately, neither the namespace nor the name is
very useful to me once I live in a mixed environment.  I always
want to be certain, which is the way I program Java, as well.  In
Java, I frequently get multiple classes in the same environment with
the same name, but with the namespaces, I always know exactly which
one I am using.

The DOM spec chose to represent them as separate strings rather than
joining them in a single universal name, which for my use cases would
have been simpler to use.  So I am about 50-50 on whether to do it
the same way as DOM did, or to do it the way that is more convenient
for my applications.  But others obviously don't
find this more convenient.
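
To make the "single universal name" idea concrete, here is a minimal
Java sketch that joins the namespace URI and local name into one key.
It uses javax.xml.namespace.QName from the JDK as the combined
identifier; the class name, URIs and element names are made up for
illustration and are not taken from DOM or SAX.

```java
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;

// Illustration only: the URIs and element names below are made up.
final class UniversalNameSketch {

    // Builds a single identifier from the two strings DOM keeps separate.
    static QName universalName(String namespaceUri, String localName) {
        return new QName(namespaceUri, localName);
    }

    public static void main(String[] args) {
        QName htmlTitle = universalName("http://example.org/html", "title");
        QName bookTitle = universalName("http://example.org/book", "title");

        // Same local name, different namespaces: the combined names differ.
        System.out.println(htmlTitle.equals(bookTitle));

        // The combined name works directly as a lookup key.
        Map<QName, String> handlers = new HashMap<>();
        handlers.put(htmlTitle, "render as heading");
        handlers.put(bookTitle, "index for search");
        System.out.println(handlers.get(universalName("http://example.org/book", "title")));

        // QName.toString() renders the "{uri}local" form.
        System.out.println(bookTitle);
    }
}
```

The trade-off is the one described above: a joined name is convenient
as a map key, while separate strings avoid building an extra object
for every element.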

> I suspect that people are going to intermingle stuff from different
> namespaces, in a way that'll make processing hard.
> 
> Will stuff from different namespaces always have the same semantics?
> Or should I expect separate processing depending on the context?
> 

Remember, you still have a DTD or some kind of content model telling
how the objects go together, which might be unique for your use,
even if some of the elements are standardized elsewhere.  Any particular
application or DTD still chooses what is legal (or if there are sections
where anything is allowed as some sort of blind inclusion, it is passed
to another processor that presumably will understand it).

That's how I think they will be used, anyway.

Ray Whitmer
ray@xmission.com




From rev-bob at gotc.com  Thu Dec 16 17:25:03 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:35 2004
Subject: Please Help Me
Message-ID: <199912161225217.SM01076@Unknown.>

> Your declaration should refer to a valid URL. A file system URL looks like
> this:
> file://C:/xml/xappli/DTDs/Books.dtd

Actually, last time I checked, it looks like this:

file:///C|/xml/xappli/DTDs/Books.dtd

(Note the third introductory slash and the vertical bar as substitute for the colon.)

Granted, IE is rather notorious for allowing sloppy file:// URL constructs, but that's no 
reason to encourage the practice.  :)
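
A small Java sketch of why the third slash matters, using java.net.URI
to show how a generic URI parser splits the two forms.  The class and
method names are hypothetical.  The drive is written with a colon in
both inputs so that only the slash count varies.

```java
import java.net.URI;

// Illustration only; java.net.URI follows the generic URI syntax.
final class FileUrlSketch {

    // Describes how a generic URI parser splits the URL into authority and path.
    static String describe(String url) {
        URI uri = URI.create(url);
        return "authority=" + uri.getAuthority() + " path=" + uri.getPath();
    }

    public static void main(String[] args) {
        // Two slashes: the drive letter lands in the authority (host) slot.
        System.out.println(describe("file://C:/xml/xappli/DTDs/Books.dtd"));
        // Three slashes: the authority is empty and the drive stays in the path.
        System.out.println(describe("file:///C:/xml/xappli/DTDs/Books.dtd"));
    }
}
```

With only two slashes the parser treats the drive as a host name, so
the path it reports no longer names the intended file.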


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/





From costello at mitre.org  Thu Dec 16 18:31:58 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:35 2004
Subject: xpointer - just a location mechanism?
References: <3859147A.D86C58FD@mitre.org> <01fd01bf47e7$d64496e0$a5addccf@prioritynetworks.net>
Message-ID: <3859301C.643BCA91@mitre.org>

Frank Boumphrey wrote:
> 
> XPointer uses the 'to' keyword to isolate out a section of the document eg
> someurl.xml#id1 to id2.
> 

So, does that mean that if I want to use the XLink embed capability then
I must necessarily use the 'to' facility in my XPointer?  Is it an error
if I don't use the 'to' facility?  Is the XPointer 'to' facility tied to
the XLink embed capability?  /Roger





From jean-marc_vanel at effix.fr  Thu Dec 16 18:59:56 1999
From: jean-marc_vanel at effix.fr (Jean Marc VANEL)
Date: Mon Jun  7 17:18:35 2004
Subject: complete database in XML/Javascript?
Message-ID: <C1256849.00660338.00@smtpnotes.effix.fr>


It should be possible to write more or less the equivalent of MS Access
on a browser platform with XML, XSLT, XPath, XML Schema, and
JavaScript.

Either use IE5's native XSL/XPath processor, alas somewhat obsolete, or
an applet (which one?).

There are also possibilities with Mozilla (www.mozilla.org), with XML
and CSS, and an applet.

Has anyone tried in this direction?
Cheers
JMV




From klhi at dcc.uchile.cl  Thu Dec 16 19:24:56 1999
From: klhi at dcc.uchile.cl (Hekwang Lhi)
Date: Mon Jun  7 17:18:35 2004
Subject: unsubscribe
Message-ID: <Pine.GSO.4.10.9912161617020.4856-100000@anakena>

unsubscribe




From orchard at pacificspirit.com  Thu Dec 16 19:50:43 1999
From: orchard at pacificspirit.com (David Orchard)
Date: Mon Jun  7 17:18:35 2004
Subject: Xpath and DOM
In-Reply-To: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <001001bf47fe$cec9f330$63511c09@n54wntw.vancouver.can.ibm.com>

Leigh, I completely agree with you.  I have been in some discussions about
this already so I'll try to relay what I've heard.

About 6 weeks ago, I asked Lauren Wood about DOM implementing XPath.  My
version of her answer is "Nobody asked for it for Level 2 or Level 3, and
Level 2 is too late now.  Nobody volunteered to write it for the DOM Spec".
There's no technical reason why getElementByXPathExpr couldn't be added.

I asked some of the other IBM XML standards reps about this and my version
of their answer is "XPath is a query language, and we've got a better query
language coming.  Why support an inferior query language now when we'll have
to support the better one soon.  Additionally, why should the DOM be the
bucket for all API gorp?  Query should be built on top of the DOM so we can
have layered parsers".

On one hand, I want an interoperable getElementByXPathExpr, but I understand
the political and technical reasons why the DOM group isn't rushing to
implement it.
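
For illustration, a minimal Java sketch of such a query layered on top
of the DOM rather than built into it.  It assumes the JDK's
javax.xml.xpath package as the XPath engine; the class name and the
plural method name getElementsByXPathExpr are hypothetical and are not
part of any DOM Level.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical utility layered on top of the DOM; not a DOM API.
final class XPathOnDom {

    // Returns the elements that expr selects relative to context.
    static List<Element> getElementsByXPathExpr(Node context, String expr)
            throws XPathExpressionException {
        NodeList hits = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(expr, context, XPathConstants.NODESET);
        List<Element> result = new ArrayList<>();
        for (int i = 0; i < hits.getLength(); i++) {
            // Skip attributes, text nodes and anything else that is not an element.
            if (hits.item(i) instanceof Element) {
                result.add((Element) hits.item(i));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalogue><book year='1999'/><book year='1998'/></catalogue>";
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
        for (Element e : getElementsByXPathExpr(doc, "//book[@year='1999']")) {
            System.out.println(e.getTagName() + " " + e.getAttribute("year"));
        }
    }
}
```

Because the helper only reads the tree through standard DOM
interfaces, it works against any DOM implementation, which is the
layering argument made above.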

Cheers,
Dave Orchard
IBM Architect
XLink co-editor

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Leigh Dodds
> Sent: Tuesday, December 14, 1999 5:32 AM
> To: xml-dev
> Subject: Xpath and DOM
>
>
> At present the DOM spec only allows one to traverse the
> tree 'manually' using getChild, etc. Or jump into the
> tree at some point using getElementsByTagName.
>
> There's nothing in there to allow me to do getElementsByExpression
> (accepting an XPath search expression), or similarly pull out
> sections of the DOM tree using XPath expressions.
>
> I've written basic utilities to do this, as have others I'm sure
> (XSLT engines must use something similar), but I'm curious as to when, or
> even whether, this type of feature is going to be added to the
> DOM API itself.
>
> It would seem to be pretty useful. In the applications I've built
> so far, I've not wanted to traverse or walk the tree, just pick
> out bits of it (and sure I could use SAX but I want the tree
> in memory because I'm manipulating it multiple times).
>
> Unless I'm asking the wrong question - is there a tool that will
> search a DOM tree for me, assuming I supply it with an XPath
> expression.
>
> Cheers,
>
> L.
>
> ==================================================================
>     "Never Do With More, What Can Be Achieved With Less"
> 				---William of Occam
> ==================================================================
> Leigh Dodds                             Eml:  ldodds@ingenta.com
> ingenta ltd                             Tel:  +44 1225 826619
> BUCS Building, University of Bath       Fax:  +44 1225 826283
>
> eclectic				http://weblogs.userland.com/eclectic
> homepage				http://www.bath.ac.uk/~ccslrd
> ==================================================================




From lauren at sqwest.bc.ca  Thu Dec 16 20:38:45 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:18:35 2004
Subject: Xpath and DOM
In-Reply-To: <001001bf47fe$cec9f330$63511c09@n54wntw.vancouver.can.ibm.com>
References: <000f01bf4637$9bccd060$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <199912162034.MAA15933@mail.sqwest.bc.ca>

On 16 Dec 99, at 11:50, David Orchard wrote:

> About 6 weeks ago, I asked Lauren Wood about DOM implementing XPath.  My
> version of her answer is "Nobody asked for it for Level 2 or Level 3, and
> Level 2 is too late now.  Nobody volunteered to write it for the DOM Spec".
> There's no technical reason why getElementByXPathExpr couldn't be added.
> 
> I asked some of the other IBM XML standards reps about this and my version
> of their answer is "XPath is a query language, and we've got a better query
> language coming.  Why support an inferior query language now when we'll have
> to support the better one soon.  Additionally, why should the DOM be the
> bucket for all API gorp?  Query should be built on top of the DOM so we can
> have layered parsers".
> 
> On one hand, I want an interoperable getElementByXPathExpr, but I understand
> the political and technical reasons why the DOM group isn't rushing to
> implement it.

I think the DOM group would be happy if some other interested 
group were to write an XPath API module that sits on top of the 
DOM. I also think the DOM WG would be happy to review such an 
API to make sure it does work with the DOM Level 2. And if 
something else is needed in the Core DOM to make it work (can't 
think of anything right now, but maybe there is something) we can 
add it to the Level 3 wishlist that's being discussed now. 


Lauren



From bckman at ix.netcom.com  Thu Dec 16 20:42:55 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:35 2004
Subject: complete database in XML/Javascript?
References: <C1256849.00660338.00@smtpnotes.effix.fr>
Message-ID: <008e01bf4807$1bcaa6e0$9caddccf@oemcomputer>

The MSXML COM object allows access to the document tree using the W3C DOM
JavaScript bindings. This works very well with XML Data Stores and ASP, or
any other program that can use either ActiveX or COM objects.

Frank




From bckman at ix.netcom.com  Thu Dec 16 20:38:31 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:35 2004
Subject: xpointer - just a location mechanism?
References: <3859147A.D86C58FD@mitre.org> <01fd01bf47e7$d64496e0$a5addccf@prioritynetworks.net> <3859301C.643BCA91@mitre.org>
Message-ID: <008701bf4806$7d3e45e0$9caddccf@oemcomputer>

> So, does that mean that if I want to use the xLink embed capability then
> I must necessarily use the 'to' facility in my xpointer?

Well, actually the XPath language returns not the starting point of the
insert, but the node object described by the expression itself, so if you do
not use the 'to' keyword, just the node described by the XPath expression
should be embedded. The 'to' keyword gives us a way to embed a document
fragment extending across several nodes.
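
A rough Java sketch of the difference between "one node" and "a
fragment extending across several nodes".  It is not an XPointer
processor: the class and method names are hypothetical, elements are
located by a plain id attribute through the JDK's XPath engine, and the
span is simplified to the case where both ends are siblings.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical illustration only; real XPointer ranges are more general.
final class SpanSketch {

    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
            .parse(new InputSource(new StringReader(xml)));
    }

    // Finds the first element whose id attribute has the given value, or null.
    static Node byId(Document doc, String id) throws Exception {
        return (Node) XPathFactory.newInstance().newXPath()
            .evaluate("//*[@id='" + id + "']", doc, XPathConstants.NODE);
    }

    // Collects start, end and every sibling between them.
    // Assumes end is start itself or one of its following siblings.
    static List<Node> span(Node start, Node end) {
        List<Node> nodes = new ArrayList<>();
        for (Node n = start; n != null; n = n.getNextSibling()) {
            nodes.add(n);
            if (n.isSameNode(end)) {
                return nodes;
            }
        }
        throw new IllegalArgumentException("end is not a following sibling of start");
    }

    public static void main(String[] args) throws Exception {
        Document doc = parse(
            "<doc><p id='id1'/><p id='mid'/><p id='id2'/><p id='after'/></doc>");
        // Without 'to': a single node.
        System.out.println(byId(doc, "id1").getNodeName());
        // With 'to': the fragment from id1 through id2.
        System.out.println(span(byId(doc, "id1"), byId(doc, "id2")).size());
    }
}
```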

Frank

Certainly the to key word is the way




From lhill at excelergy.com  Thu Dec 16 20:47:04 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:35 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <776DC00B49ECD21189750090273F729130B356@EROS>

On a related topic:
I like what Ray is saying because it lays the foundation for a missing
higher level abstraction to SAX.  What we and just about everyone else using
SAX have developed is an in-house HandlerBase dispatcher class that acts as
a giant switch, doling out events based on element names (a common pattern
in many different types of processes).  Now with namespaces, we effectively
have a two-layer switch that will continue to be developed independently
throughout the XML community!

To address this, I propose adding a higher level utility provided as part of
SAX2 named ElementDispatcherBase.  For those who have philosophical
differences with this approach, I am proposing that we complement the
current monolithic handler feature of SAX.  The interfaces required would be
along the lines of:

public interface SAX2ElementHandler
{
	public void startElement(Attribute[] attrs) throws SAXException;
	public void endElement() throws SAXException;
}

public interface SAX2DefaultElementHandler
{
	public void startElement(String nameSpace, String elementName,
			Attribute[] attrs) throws SAXException;
	public void endElement(String nameSpace, String elementName)
			throws SAXException;
}

public interface SAX2ElementDispatcher extends SAX2DocumentHandler
{
	/**
	 * Default element handler, effectively the SAX status quo
	 */
	public void addDefaultElementHandler(SAX2DefaultElementHandler hndlr);

	/**
	 * Handle element in all namespaces
	 */
	public void addElementHandler(String elementName,
			SAX2ElementHandler hndlr);

	/**
	 * Handle element in this namespace only
	 */
	public void addElementHandler(String nameSpace, String elementName,
			SAX2ElementHandler hndlr);

	public void removeDefaultElementHandler(SAX2DefaultElementHandler hndlr);
	public void removeElementHandler(String elementName,
			SAX2ElementHandler hndlr);
	public void removeElementHandler(String nameSpace, String elementName,
			SAX2ElementHandler hndlr);
	/* remove by name only? */
}

The normal use of the interfaces should be straightforward.  There may be
some concern over elements that are closely related and have almost
identical processing that is differentiated by name.  The standard Java
technique of using inner classes to register unique, one-liner,
mini-handlers would work well in most of those cases; or the default handler
could be invoked -- the equivalent of today's big switch (or hopefully,
home-grown dispatcher :).

Note that handlers are removed by name as well as object-identity, allowing
for the handling of multiple differently identified elements by the same
handler [if you removed by object-identity only, and you had registered for
multiple elements, removal would probably not do the right thing].
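
To show the dispatching idea in running form, here is a minimal sketch
written against the SAX 2.0 classes as they ship in the JDK
(org.xml.sax.helpers.DefaultHandler), not against the interfaces
proposed above.  The class name, the nested ElementHandler interface
and the sample namespace URI are all hypothetical, and only start
events are routed to keep it short.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.namespace.QName;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical dispatcher; a simplification of the proposal, start events only.
final class ElementDispatcherSketch extends DefaultHandler {

    interface ElementHandler {
        void startElement(Attributes attrs) throws SAXException;
    }

    private final Map<QName, ElementHandler> byQualifiedName = new HashMap<>();
    private final Map<String, ElementHandler> byLocalName = new HashMap<>();

    // Handle the element in this namespace only.
    void addElementHandler(String nameSpace, String elementName, ElementHandler h) {
        byQualifiedName.put(new QName(nameSpace, elementName), h);
    }

    // Handle the element in all namespaces.
    void addElementHandler(String elementName, ElementHandler h) {
        byLocalName.put(elementName, h);
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) throws SAXException {
        // The most specific registration wins; unregistered elements are ignored.
        ElementHandler h = byQualifiedName.get(new QName(uri, localName));
        if (h == null) {
            h = byLocalName.get(localName);
        }
        if (h != null) {
            h.startElement(attrs);
        }
    }

    // Runs a namespace-aware SAX parse of xml, sending events to handler.
    static void parse(String xml, DefaultHandler handler) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
    }

    public static void main(String[] args) throws Exception {
        ElementDispatcherSketch dispatcher = new ElementDispatcherSketch();
        dispatcher.addElementHandler("urn:example:orders", "item",
            attrs -> System.out.println("order item " + attrs.getValue("sku")));
        dispatcher.addElementHandler("note",
            attrs -> System.out.println("note in any namespace"));
        parse("<o:order xmlns:o='urn:example:orders'>"
            + "<o:item sku='42'/><note/></o:order>", dispatcher);
    }
}
```

Keying one map on the (namespace, name) pair and a second on the bare
name collapses the two-layer switch into two lookups; a default
handler would be the fallback when both miss.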

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================


Ray Waldin writes:
>For example:
>
><element1 xmlns="uri1" xmlns:p2="uri2">
>  <p2:element2 foo="bar" xmlns:p3="uri3" p3:whiz="bang"/>
></element1>
>
>would produce the following stream of events:
>
>  startDocument();
>  startNamespace( prefix=null, uri="uri1" );
>  startNamespace( prefix="p2", uri="uri2" );
>  startElement( name="element1", attrList=null );
>  startNamespace( prefix="p3", uri="uri3" );
>  startElement( name="p2:element2", attrList={ foo="bar", p3:whiz="bang" });
>  endElement( name="p2:element2" );
>  endNamespace( prefix="p3" );
>  endElement( name="element1" );
>  endNamespace( prefix="p2" );
>  endNamespace( prefix=null );
>  endDocument();
>
>Simple management of these namespaces can be provided for in some helper
>class like the NSUtils example, and can even be handled automatically for
>the most part in a new HandlerBase2 class, but this decision of how (or
>whether) to handle namespaces could be left to the application author.
>Anyone else feel this way?
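
For comparison, a small sketch that logs the namespace scoping events
reported by the SAX 2.0 API as it ships in the JDK, where the callbacks
are named startPrefixMapping and endPrefixMapping rather than the
startNamespace and endNamespace of the proposal quoted above.  The
class name is hypothetical, and the input uses the p2 prefix on
element2 so that the document is namespace-well-formed.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Records namespace scoping and element events as the parser reports them.
final class NamespaceEventLog extends DefaultHandler {

    final List<String> events = new ArrayList<>();

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        // The default namespace is reported with an empty prefix.
        events.add("startPrefixMapping(" + prefix + ", " + uri + ")");
    }

    @Override
    public void endPrefixMapping(String prefix) {
        events.add("endPrefixMapping(" + prefix + ")");
    }

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes attrs) {
        events.add("startElement({" + uri + "}" + localName + ")");
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("endElement({" + uri + "}" + localName + ")");
    }

    // Parses xml with namespace processing on and returns the recorded events.
    static List<String> log(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        NamespaceEventLog handler = new NamespaceEventLog();
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<element1 xmlns='uri1' xmlns:p2='uri2'>"
            + "<p2:element2 foo='bar' xmlns:p3='uri3' p3:whiz='bang'/>"
            + "</element1>";
        log(xml).forEach(System.out::println);
    }
}
```

Each mapping is reported before the start of the element that declares
it and ended after that element closes; the relative order of several
mappings declared on the same element is not guaranteed.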



From bckman at ix.netcom.com  Thu Dec 16 20:51:44 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:35 2004
Subject: Xpath and DOM
References: <001001bf47fe$cec9f330$63511c09@n54wntw.vancouver.can.ibm.com>
Message-ID: <00a601bf4808$58e4ef80$9caddccf@oemcomputer>

<david>On one hand, I want an interoperable getElementByXPathExpr, but I
understand the political and technical reasons why the DOM group isn't
rushing to implement it.</david>

It should be a simple enough matter to write such a function oneself and
just call it! Maybe a week's project for a student! (Hint!)

Frank

----- Original Message -----
From: David Orchard <orchard@pacificspirit.com>
To: xml-dev <xml-dev@ic.ac.uk>
Sent: Thursday, December 16, 1999 2:50 PM
Subject: RE: Xpath and DOM


> Leigh, I completely agree with you.  I have been in some discussions about
> this already so I'll try to relay what I've heard.
>
> About 6 weeks ago, I asked Lauren Wood about DOM implementing XPath.  My
> version of her answer is "Nobody asked for it for Level 2 or Level 3, and
> Level 2 is too late now.  Nobody volunteered to write it for the DOM Spec".
> There's no technical reason why getElementByXPathExpr couldn't be added.
>
> I asked some of the other IBM XML standards reps about this and my version
> of their answer is "XPath is a query language, and we've got a better query
> language coming.  Why support an inferior query language now when we'll have
> to support the better one soon.  Additionally, why should the DOM be the
> bucket for all API gorp?  Query should be built on top of the DOM so we can
> have layered parsers".
>
> On one hand, I want an interoperable getElementByXPathExpr, but I understand
> the political and technical reasons why the DOM group isn't rushing to
> implement it.
>
> Cheers,
> Dave Orchard
> IBM Architect
> XLink co-editor
>
> > -----Original Message-----
> > From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> > Leigh Dodds
> > Sent: Tuesday, December 14, 1999 5:32 AM
> > To: xml-dev
> > Subject: Xpath and DOM
> >
> >
> > At present the DOM spec only allows one to traverse the
> > tree 'manually' using getChild, etc. Or jump into the
> > tree at some point using getElementsByTagName.
> >
> > There's nothing in there to allow me to do getElementsByExpression
> > (accepting an XPath search expression), or similarly pull out
> > sections of the DOM tree using XPath expressions.
> >
> > I've written basic utilities to do this, as have others I'm sure
> > (XSLT engines must use something similar), but I'm curious as to when,
> > or even whether, this type of feature is going to be added to the
> > DOM API itself.
> >
> > It would seem to be pretty useful. In the applications I've built
> > so far, I've not wanted to traverse or walk the tree, just pick
> > out bits of it (and sure I could use SAX but I want the tree
> > in memory because I'm manipulating it multiple times).
> >
> > Unless I'm asking the wrong question - is there a tool that will
> > search a DOM tree for me, assuming I supply it with an XPath
> > expression?
> >
> > Cheers,
> >
> > L.
> >
> > ==================================================================
> >     "Never Do With More, What Can Be Achieved With Less"
> > ---William of Occam
> > ==================================================================
> > Leigh Dodds                             Eml:  ldodds@ingenta.com
> > ingenta ltd                             Tel:  +44 1225 826619
> > BUCS Building, University of Bath       Fax:  +44 1225 826283
> >
> > eclectic http://weblogs.userland.com/eclectic
> > homepage http://www.bath.ac.uk/~ccslrd
> > ==================================================================




From cbullard at hiwaay.net  Thu Dec 16 21:27:15 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:35 2004
Subject: Musing over Namespaces
References: <000201bf4691$ec1eb1a0$d1940e18@smateo1.sfba.home.com> <m34sdkc2qo.fsf@localhost.localdomain>
Message-ID: <385839DD.67FB@hiwaay.net>

David Megginson wrote:
>
> It's messy, but it's the only standards path that really seems to
> work.  At least with Namespaces we can remove 50% of the messiness
> (there's no chance of confusing different parties' extensions) on the
> way to standards Nirvana.

I think not.  The standards path is complete once the means to create 
and identify the namespace is possible.  What now needs to happen is 
for the technologists to create means by which company registries of
namespaces can be accessed and negotiated with such that for any
contract, the ROA namespaces can be declared.  OASIS and BIZTALK may be good
libraries, may provide examples, but trying even at the UN level to 
standardize everyone's local names or names of exchange is a foolish 
if goodhearted errand.  It is not the way to do it right.  It 
is not the path to success.  It is the way that standards wonks 
think, but I doubt it works as well as registry negotiation.  
If Microsoft wants to be a good partner, provide technology and 
some startup templates, then get out of the way.  If OASIS wants 
to thrive, teach the means, do not arbitrate.  Powertrips are 
not attractive.  Education is seductive.  Influence is more powerful 
than rules.

> I'm going to write a paper with a catchy title on this topic some day,
> so that I can make US$45M like Eric Raymond just did from "The
> Cathedral and the Bazaar."

"Green green it's green they say on the far side of the hill.
Green green I'm going away to where the grass is greener still."

len




From cbullard at hiwaay.net  Thu Dec 16 21:28:00 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:35 2004
Subject: Musing over Namespaces
References: <805C62F55FFAD1118D0800805FBB428D02BC01BF@cc20exch2.mobility.com> <38574352.E4D0CEB8@fiduciary.com>
Message-ID: <38583C9D.67FF@hiwaay.net>

W. E. Perry wrote:
> 
> Only the receiving
> node defines the particular processing which it would like to initiate upon instantiation of
> that element. And if in time the nature of that processing changes, that node's correspondents
> will have no way to know that, nor to know how their <purchase> elements are now being
> differently instantiated at that node.

Yes!  And the receiving node gets to change its mind when it needs to.  
It gets to renegotiate and it gets to advertise the interfaces by 
which it does both to configure the means and expressions of the 
transactions.

len




From cbullard at hiwaay.net  Thu Dec 16 21:27:36 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:35 2004
Subject: Musing over Namespaces
References: <3.0.32.19991214202152.02074c10@pop.intergate.ca>
Message-ID: <38583BCE.509E@hiwaay.net>

Tim Bray wrote:
> 
> At 11:04 AM 12/14/99 -0500, Clark C. Evans wrote:
> >It'd still be nice to have a single database with
> >everyone's namespace definitions in one place though...
> >perhaps even a DTD to help describe them.  I'm sure
> >there are organizations doing this... are there?
> 
> Yes, lots.  That's the problem. -Tim

No Tim, that's the answer.  Common goals achieved by 
common means, not common control.  Put the definitions in 
place by performance, not uberNamespace.  It has to 
breathe, my friends, like a good well-performed piece 
of music, it has to breathe.

The standards wonks really need to study management 
and contract negotiation.  OASIS and BIZTALK are 
the absolute wrong way to do this.  This means a 
marvelous lucrative opportunity is sitting there being 
ignored.  Someone or some company will get rich 
off this if they go to these companies and demonstrate 
the right way to do this.

len




From david at megginson.com  Thu Dec 16 21:58:30 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:35 2004
Subject: Musing over Namespaces
In-Reply-To: <385839DD.67FB@hiwaay.net>
References: <000201bf4691$ec1eb1a0$d1940e18@smateo1.sfba.home.com>
	<m34sdkc2qo.fsf@localhost.localdomain>
	<385839DD.67FB@hiwaay.net>
Message-ID: <14425.24666.22699.410005@localhost.localdomain>

Len Bullard writes:

 > David Megginson wrote:
 > >
 > > It's messy, but it's the only standards path that really seems to
 > > work.  At least with Namespaces we can remove 50% of the messiness
 > > (there's no chance of confusing different parties' extensions) on the
 > > way to standards Nirvana.
 > 
 > I think not.  The standards path is complete once the means to
 > create and identify the namespace is possible.  What now needs to
 > happen is for the technologists to create means by which company
 > registries of namespaces can be accessed and negotiated with such
 > that for any contract, the ROA namespaces can be declared.

I've noticed this assumption behind a lot of the Namespace-related
postings (not just Len's) -- I guess it would be a good thing to have
a global resolution mechanism of some kind, but I'm surprised that
people consider Namespaces incomplete without this.  

After all, there's nothing similar for (say) Java or Perl package
names, and while having something like that would be convenient (and
CPAN is very nice as far as it goes), the absence of a global
resolution mechanism for figuring out what org.xml.sax or XML::Parser
means doesn't seem to inhibit useful work in Java or Perl.

Why is XML different?  Is it just that we come from the SGML
background, where we consider structural validation to be part of a
document rather than a process applied to it, or is there some kind of
a fundamental difference between naming code and naming document
nodes that no one has articulated yet?

 > > I'm going to write a paper with a catchy title on this topic some
 > > day, so that I can make US$45M like Eric Raymond just did from
 > > "The Cathedral and the Bazaar."
 > 
 > "Green green it's green they say on the far side of the hill.
 > Green green I'm going away to where the grass is greener still."

No, that's too long -- I need something shorter and catchier.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From jamesr at steptwo.com.au  Thu Dec 16 22:16:33 1999
From: jamesr at steptwo.com.au (James Robertson)
Date: Mon Jun  7 17:18:35 2004
Subject: Namespace oblivion (was Re: SAX2: Namespace Processing and
  NSUtils helper class)
In-Reply-To: <3.0.32.19991216074341.013812d0@pop.intergate.ca>
Message-ID: <4.2.0.58.19991217090051.00a34f00@203.41.126.17>

At 02:45 17/12/1999 , Tim Bray wrote:

>And the namespace-oblivious world is just no longer interesting.
>
>-Tim

Keeping tongue-in-cheek, I would suggest:

Enjoy your world Tim.

Many of us still enjoy the namespace-oblivious
world, called the "real world".

Some of us even still find DTDs interesting ...

J

-------------------------
James Robertson
Step Two Designs Pty Ltd
SGML, XML & HTML Consultancy
Illumination: an out-of-the-box Intranet solution

http://www.steptwo.com.au/
jamesr@steptwo.com.au



From stele at fxtech.com  Thu Dec 16 23:00:00 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:35 2004
Subject: XML parsing memory overhead concerns
Message-ID: <38596F2F.C98B95CC@fxtech.com>

As I've written before, I've been working on a callback-based streaming
XML parser that is sort of DOM-like, specifically for reading
application data from XML files where you know what the object hierarchy
is. At first I tried layering my work over expat to no avail, since
expat uses a push model and I needed a pull model (so a subelement could
parse its subtree by itself).

Now that I am almost finished with the first cut and about to release
it, I was explaining the solution to a colleague who said I should have
tried to build it on expat anyway. The only way I could have done that
and kept the sub-element parsing model that I want is to have expat
parse the entire document into one big internal memory buffer. One of the
advantages of my solution is you only need a small file buffer, since
it's streaming. If I have a data file with 100,000 elements in it, I
would need to store the entire file in memory, along with a couple of
extra megabytes of housekeeping data. My colleague said "so what?".

So, here is my plea for feedback about memory usage concerns. My current
solution works as designed and streams into a small buffer, but it only
supports ASCII and doesn't validate, and relies on C-style callback
functions which can require slightly more code. If I wasn't worried
about the memory I could rewrite my design on top of expat (gaining all
of its benefits, and presumably validation in the future), and provide
an optional DOM-like interface (without all the extra DOM mumbo-jumbo).
It would be possible to combine the streaming and the parsing and throw
away the housekeeping data when the elements are no longer needed (such
as when we've moved on to the next subelement tree).

What do people think? Spare the memory and provide a simpler (and
slightly less capable) solution or store the entire thing in memory and
use the nice stuff in expat and give more features?

--
Paul Miller - stele@fxtech.com



From donpark at docuverse.com  Thu Dec 16 23:25:53 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:35 2004
Subject: Xpath and DOM
In-Reply-To: <001001bf47fe$cec9f330$63511c09@n54wntw.vancouver.can.ibm.com>
Message-ID: <000701bf481d$0b23a4c0$d1940e18@smateo1.sfba.home.com>

Actually, I would have preferred to have a small,
query-mechanism-independent API in the DOM API.

Here is a string-based version:

interface DocumentQuery {
  NodeIterator getQueryResult(String query);
}

where query is prefixed with query type information such
as "xpath:" or "xsql:".  An API for plugging in new query
engines is needed to support new query types.
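
One way the prefix dispatch and the plug-in point might look, sketched in
Python rather than Java or IDL. Everything beyond the DocumentQuery name
is an assumption for illustration: register_engine and etree_path_engine
are hypothetical, a plain Python iterator stands in for NodeIterator, and
the "xpath" engine just delegates to ElementTree's findall, which covers
only a limited subset of XPath:

```python
import xml.etree.ElementTree as ET


class DocumentQuery:
    """Dispatches 'type:expression' query strings to registered engines."""

    def __init__(self, root):
        self.root = root
        self.engines = {}

    def register_engine(self, query_type, engine):
        # An engine is any callable taking (root, expression) and
        # returning an iterable of matching nodes.
        self.engines[query_type] = engine

    def get_query_result(self, query):
        # Split "xpath:./book" into the type prefix and the expression.
        query_type, sep, expression = query.partition(":")
        if not sep or query_type not in self.engines:
            raise ValueError("no query engine registered for %r" % query)
        return iter(self.engines[query_type](self.root, expression))


def etree_path_engine(root, expression):
    # Stand-in engine: ElementTree's limited XPath support.
    return root.findall(expression)


root = ET.fromstring("<catalog><book year='1999'/><book year='2000'/></catalog>")
dq = DocumentQuery(root)
dq.register_engine("xpath", etree_path_engine)
years = [book.get("year") for book in dq.get_query_result("xpath:./book")]
print(years)  # prints ['1999', '2000']
```

Keeping the engines in a table keyed by prefix means a new query type can
be added without touching the interface itself, which seems to be the
point of making the API independent of the query mechanism.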

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From cbullard at hiwaay.net  Fri Dec 17 00:27:01 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:35 2004
Subject: Musing over Namespaces
References: <000201bf4691$ec1eb1a0$d1940e18@smateo1.sfba.home.com>
		<m34sdkc2qo.fsf@localhost.localdomain>
		<385839DD.67FB@hiwaay.net> <14425.24666.22699.410005@localhost.localdomain>
Message-ID: <38598275.1E25@hiwaay.net>

David Megginson wrote:
> 
> I've noticed this assumption behind a lot of the Namespace-related
> postings (not just Len's) -- I guess it would be a good thing to have
> a global resolution mechanism of some kind, but I'm surprised that
> people consider Namespaces incomplete without this.

My point is not that namespaces are incomplete but that they
are complete.  The next step is not necessarily to provide uber
registration agencies such as BizTalk or OASIS, although nothing
prevents such things.  I think the better solution is local registries
which the
businesses create and administer.  It assuredly takes longer for such 
things to be created, but by interfaces to such, the noise levels 
are encapsulated and localized.  It is by process and definition of how 
to open and close and structure processes, what I once called views 
in the older days, that businesses execute and perform.  It is necessary 
to tune these.  What we have with wall-to-wall markup systems is 
the loose coupling by which such message based systems can both 
operate openly but control visibility.  This is necessary because 
all business processes and instruments have a degree of slop which has 
to be there to cope with what the Boeing trainers called the 
unknown-unknowns.

> No, that's too long -- I need something shorter and catchier.

Ok, but that was a quote from the early sixties folk.  Something 
from the millennial folk:

"Only money matters.  Only money's gonna pay your bills.
Only money matters.  You got no money honey, you get no thrills."

Not shorter but newer. :-)

len




From steve at rsv.ricoh.com  Fri Dec 17 01:32:51 1999
From: steve at rsv.ricoh.com (Stephen R. Savitzky)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: Paul Miller's message of Thu, 16 Dec 1999 18:01:03 -0500
References: <38596F2F.C98B95CC@fxtech.com>
Message-ID: <qcg0x2fj42.fsf@congo.crc.ricoh.com>

Paul Miller <stele@fxtech.com> writes:

> As I've written before I've been working on a callback-based streaming
> XML parser that is sort of DOM-like, specifically for reading
> application data from XML files where you know what the object hierarchy
> is.

This is probably similar to what I'm using: the parser's API is basically 
a tree traverser, with methods like:

  toNextSibling
  toFirstChild
  toParent
  ... and a bunch of methods to examine the current node.

Though it's DOM-like, you never actually have to build the whole tree.

It does have the feature (some might call it a bug) that you're processing
large chunks of the document before you know that it doesn't validate.
Since I deal with human-written documents I consider it a feature, since it
permits graceful error recovery.

> What do people think? Spare the memory and provide a simpler (and
> slightly less capable) solution or store the entire thing in memory and
> use the nice stuff in expat and give more features?

I believe there's a low-level module in expat that can handle the lexical
part of the parsing but still be called in pull mode.  Or you could use
"full" expat if you have threads (i.e. run expat as a coroutine).

But like you, I'd really like to see a traversal-oriented (pull) parser API
to go along with the current event-oriented (push) ones.  There are times
when it's just the right thing to do.

-- 
Stephen R. Savitzky  <steve@rsv.ricoh.com>  <http://rsv.ricoh.com/~steve/>
Platform for Information Applications:      <http://RiSource.org/PIA/>
Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center
 voice: 650.496.5710  front desk: 650.496.5700  fax: 650.854.8740 
  home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/



From stele at fxtech.com  Fri Dec 17 03:04:19 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
References: <38596F2F.C98B95CC@fxtech.com> <qcg0x2fj42.fsf@congo.crc.ricoh.com>
Message-ID: <3859A86A.AD7E478C@fxtech.com>

> But like you, I'd really like to see a traversal-oriented (pull) parser API
> to go along with the current event-oriented (push) ones.  There are times
> when it's just the right thing to do.

I'm glad others see the value in this. I'll post my code tomorrow after
a few more tweaks.

--
Paul Miller - stele@fxtech.com



From rev-bob at gotc.com  Fri Dec 17 04:35:10 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:36 2004
Subject: Musing over Namespaces
Message-ID: <199912162335454.SM00700@Unknown.>

>  > I think not.  The standards path is complete once the means to
>  > create and identify the namespace is possible.  What now needs to
>  > happen is for the technologists to create means by which company
>  > registries of namespaces can be accessed and negotiated with such
>  > that for any contract, the ROA namespaces can be declared.
> 
> I've noticed this assumption behind a lot of the Namespace-related
> postings (not just Len's) -- I guess it would be a good thing to have
> a global resolution mechanism of some kind, but I'm surprised that
> people consider Namespaces incomplete without this.  
> 
> After all, there's nothing similar for (say) Java or Perl package
> names, and while having something like that would be convenient (and
> CPAN is very nice as far as it goes), the absence of a global
> resolution mechanism for figuring out what org.xml.sax or XML::Parser
> means doesn't seem to inhibit useful work in Java or Perl.
> 
> Why is XML different?  Is it just that we come from the SGML
> background, where we consider structural validation to be part of a
> document rather than a process applied to it, or is there some kind of
> a fundamental difference between naming code and naming document
> nodes that no one has articulated yet?

Just to take a stab in the dark here, but wouldn't this fundamental difference be that Perl 
and Java are almost unilaterally self-contained parcels (hence, it doesn't matter what the 
package is named, because you're writing everything that deals with it anyway), but 
XML documents are designed for interchange - where the names don't just have to make 
sense to you, but also to an unknown client?

In other words, my only concern when naming a function or a class in a program is that I 
need to know what it is; I can name a variable "Fred" or a 50-char string class "Bubba" if 
I want to, and it doesn't matter - because nobody else needs to understand what those 
names mean.  However, if I'm writing a document that I'm going to send somewhere 
else for Joe to deal with, I'd better use names that Joe can understand and easily map.  
(For an HTML example, I like "BQ" much more than "BLOCKQUOTE" as an element 
name - but if I use BQ in my code, no UA will know what I'm talking about because BQ 
isn't defined anywhere.  If, OTOH, I slip a transformative preprocessor that morphs BQ 
into the defined BLOCKQUOTE somewhere between authoring and the public UA, that 
works fine.)

The difference you're looking for is one of scope.  Internal names don't matter to the 
outside world, because nobody outside has to do anything with them...but external 
names MUST be defined in some way, else nobody outside CAN do anything with 
them.  Am I expressing it clearly?


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/





From clark.evans at manhattanproject.com  Fri Dec 17 05:12:25 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: <qcg0x2fj42.fsf@congo.crc.ricoh.com>
Message-ID: <Pine.LNX.4.10.9912161213270.20127-100000@cauchy.clarkevans.com>


On 16 Dec 1999, Stephen R. Savitzky wrote:
> This is probably similar to what I'm using: the parser's API is basically 
> a tree traverser:  it has methods like:
> 
>   toNextSibling
>   toFirstChild
>   toParent
>   ... and a bunch of methods to examine the current node.
> 
> Though it's DOM-like, you never actually have to build the whole tree.

So, some random access storage is needed with this technique?
 
> It does have the feature (some might call it a bug) that you're processing
> large chunks of the document before you know that it doesn't validate.
> Since I deal with human-written documents I consider it a feature, since it
> permits graceful error recovery.

Not a problem.

> I believe there's a low-level module in expat that can handle the lexical
> part of the parsing but still be called in pull mode.  Or you could use
> "full" expat if you have threads (i.e. run expat as a coroutine).

Why not have expat build the subtree you are interested in
and put it in memory?  You can then "pull" from this.

> But like you, I'd really like to see a traversal-oriented (pull) parser API
> to go along with the current event-oriented (push) ones.  There are times
> when it's just the right thing to do.

I'd like to hear *much* more about this.

Clark




From jjc at jclark.com  Fri Dec 17 05:36:55 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
References: <38596F2F.C98B95CC@fxtech.com>
Message-ID: <3859CBC8.FFD00E81@jclark.com>

Paul Miller wrote:

> The only way I could have done that
> and keep the sub-element parsing model that I want is to have expat
> parse entire document into one big internal memory buffer.

How so?  You can layer a next event style interface on top of expat by
maintaining a queue of events.  A request for the next event returns the
head of the queue; if the queue is empty, it fills the queue by reading
another chunk of input and passing it to XML_Parse() with event handlers
that append to the queue of events.
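
A minimal sketch of that layering, written against Python's standard
xml.parsers.expat binding rather than the C API (its Parse(data, isfinal)
method corresponds to XML_Parse()). The class name PullParser, the
next_event method, and the tuple shapes used for events are assumptions
made for illustration:

```python
from collections import deque
import io
import xml.parsers.expat


class PullParser:
    """Next-event interface layered on expat through a queue of events."""

    def __init__(self, stream, chunk_size=64):
        self.stream = stream
        self.chunk_size = chunk_size
        self.queue = deque()
        self.finished = False
        self.parser = xml.parsers.expat.ParserCreate()
        # The handlers do no real work; they only append to the queue.
        self.parser.StartElementHandler = (
            lambda name, attrs: self.queue.append(("start", name, attrs)))
        self.parser.EndElementHandler = (
            lambda name: self.queue.append(("end", name)))
        self.parser.CharacterDataHandler = (
            lambda data: self.queue.append(("text", data)))

    def next_event(self):
        # Refill only when the queue is empty: read one more chunk of
        # input and hand it to expat, which runs the handlers above.
        while not self.queue and not self.finished:
            chunk = self.stream.read(self.chunk_size)
            if chunk:
                self.parser.Parse(chunk, False)
            else:
                self.parser.Parse(b"", True)  # tell expat the input has ended
                self.finished = True
        return self.queue.popleft() if self.queue else None


source = io.BytesIO(b"<scene><object name='cube'/><object name='lamp'/></scene>")
pull = PullParser(source, chunk_size=16)
names = []
event = pull.next_event()
while event is not None:
    if event[0] == "start" and event[1] == "object":
        names.append(event[2]["name"])
    event = pull.next_event()
print(names)  # prints ['cube', 'lamp']
```

Memory use is bounded by one input chunk plus the events produced from
it, so the whole document never has to sit in a buffer. Note that
character data may arrive as several "text" events for one run of text.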

James




From clark.evans at manhattanproject.com  Fri Dec 17 05:50:52 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: <3859CBC8.FFD00E81@jclark.com>
Message-ID: <Pine.LNX.4.10.9912161253010.20127-100000@cauchy.clarkevans.com>


On Fri, 17 Dec 1999, James Clark wrote:
> Paul Miller wrote:
> 
> > The only way I could have done that
> > and keep the sub-element parsing model that I want is to have expat
> > parse entire document into one big internal memory buffer.
> 
> How so?  You can layer a next event style interface on top of expat by
> maintaining a queue of events.  A request for the next event returns the
> head of the queue; if the queue is empty, it fills the queue by reading
> another chunk of input and passing it to XML_Parse() with event handlers
> that append to the queue of events.

James, 

This would require a multi-threaded approach, 
is this correct?

Clark




From jjc at jclark.com  Fri Dec 17 06:21:51 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
References: <Pine.LNX.4.10.9912161253010.20127-100000@cauchy.clarkevans.com>
Message-ID: <3859D663.A29154F8@jclark.com>

"Clark C. Evans" wrote:
> 
> On Fri, 17 Dec 1999, James Clark wrote:
> > Paul Miller wrote:
> >
> > > The only way I could have done that
> > > and keep the sub-element parsing model that I want is to have expat
> > > parse entire document into one big internal memory buffer.
> >
> > How so?  You can layer a next event style interface on top of expat by
> > maintaining a queue of events.  A request for the next event returns the
> > head of the queue; if the queue is empty, it fills the queue by reading
> > another chunk of input and passing it to XML_Parse() with event handlers
> > that append to the queue of events.
> 
> James,
> 
> This would require a multi-threaded approach,
> is this correct?

No. Unlike most parsers, expat has a "push" input model: instead of the
parser making calls to get each block of input bytes, the application
calls the parser passing it each block in sequence whenever it is
convenient for the application.
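
The push input model can be seen in a few lines, again assuming Python's
standard xml.parsers.expat binding in place of the C API. The log list
and the two hard-coded chunks are purely illustrative; the point is that
one ordinary loop hands blocks to the parser, and each call returns
before the next block is supplied:

```python
import xml.parsers.expat

log = []
parser = xml.parsers.expat.ParserCreate()
parser.StartElementHandler = lambda name, attrs: log.append("start " + name)
parser.EndElementHandler = lambda name: log.append("end " + name)

# The application decides when each block is passed in; the parser never
# calls back to ask for more input, so no second thread is involved.
chunks = ["<doc><a/>", "<b/></doc>"]
for index, chunk in enumerate(chunks):
    log.append("feeding chunk %d" % index)
    is_final = index == len(chunks) - 1
    parser.Parse(chunk, is_final)

print(log)
```

Any handlers triggered by a block run inside that Parse call, on the
caller's own stack, and control then comes back to the loop.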

James




From ebohlman at netcom.com  Fri Dec 17 06:49:50 1999
From: ebohlman at netcom.com (Eric Bohlman)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: <Pine.LNX.4.10.9912161253010.20127-100000@cauchy.clarkevans.com>
Message-ID: <Pine.GSU.4.10.9912162246320.17781-100000@netcom9.netcom.com>

On Thu, 16 Dec 1999, Clark C. Evans wrote:
> On Fri, 17 Dec 1999, James Clark wrote:
> > How so?  You can layer a next event style interface on top of expat by
> > maintaining a queue of events.  A request for the next event returns the
> > head of the queue; if the queue is empty, it fills the queue by reading
> > another chunk of input and passing it to XML_Parse() with event handlers
> > that append to the queue of events.
> 
> James, 
> 
> This would require a multi-threaded approach, 
> is this correct?

Nope.  You just check if there are any events in the queue; if not, you
read in a chunk of input and call XML_Parse on it (setting the "final"
parameter as appropriate) to fill up the queue.  XML_Parse doesn't require
the *entire* document as an argument.
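
The queue approach can be sketched as follows, again with Python's xml.parsers.expat standing in for the C library. EventReader, next_event and the tuple layout are hypothetical names chosen for illustration; only ParserCreate, the handler attributes and Parse come from the actual bindings.

```python
import io
import xml.parsers.expat
from collections import deque


class EventReader:
    """Pull-style 'next event' interface layered over expat's push API."""

    def __init__(self, stream, chunk_size=16):
        self._stream = stream
        self._chunk_size = chunk_size
        self._queue = deque()
        self._done = False
        self._parser = xml.parsers.expat.ParserCreate()
        # Handlers only append to the queue; they never process anything.
        self._parser.StartElementHandler = (
            lambda name, attrs: self._queue.append(("start", name)))
        self._parser.EndElementHandler = (
            lambda name: self._queue.append(("end", name)))

    def next_event(self):
        """Return the next event, or None once the document is exhausted."""
        while not self._queue and not self._done:
            chunk = self._stream.read(self._chunk_size)
            self._done = not chunk                 # empty read: end of input
            self._parser.Parse(chunk, self._done)  # "final" set as appropriate
        return self._queue.popleft() if self._queue else None


reader = EventReader(io.BytesIO(b"<parent><child/></parent>"), chunk_size=8)
events = list(iter(reader.next_event, None))
print(events)
```

Everything runs in one thread: the handlers return immediately, and the caller consumes events at its own pace.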




From clark.evans at manhattanproject.com  Fri Dec 17 07:19:44 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: <3859D663.A29154F8@jclark.com>
Message-ID: <Pine.LNX.4.10.9912161344010.20127-100000@cauchy.clarkevans.com>


James,  

Could you comment once more?  *confused look*

On Fri, 17 Dec 1999, James Clark wrote:
> "Clark C. Evans" wrote:
> > On Fri, 17 Dec 1999, James Clark wrote:
> > > Paul Miller wrote:
> > > > The only way I could have done that
> > > > and keep the sub-element parsing model that I want is to have expat
> > > > parse entire document into one big internal memory buffer.
> > >
> > > How so?  You can layer a next event style interface on top of expat by
> > > maintaining a queue of events.  A request for the next event returns the
> > > head of the queue; if the queue is empty, it fills the queue by reading
> > > another chunk of input and passing it to XML_Parse() with event handlers
> > > that append to the queue of events.
> > 
> > This would require a multi-threaded approach,
> > is this correct?
> 
> No. Unlike most parsers, expat has a "push" input model: instead of the
> parser making calls to get each block of input bytes, the application
> calls the parser passing it each block in sequence whenever it is
> convenient for the application.

Paul (and correct me if I'm wrong) is attempting to 
develop a "pull" model for his C++ program, similar 
to the following XSL...

  <xsl:template match="my-criteria" >
     <!-- pre-process -->
     <xsl:apply-templates />
     <!-- post-process -->
  </xsl:template>

Kinda like:

   process(String elementName) {
      ;  // pre-process
      process-children();
      ;  // post-process
   }

As opposed to the following "equivalent" SAX:

  beginEvent(String name, AttributeList att)
  {
       ; // pre-process
  }

  endEvent(String name)
  {
       ; // post-process
  }


Anyway, given a SAX event source, pushing
the entire document his way, I don't see
how a single threaded solution is possible.

And, from the expat declaration of setElementHandler, 
which requires both a StartElementHandler and an 
EndElementHandler, I assumed that expat works in 
a similar (if not identical) manner.  

Assume the XML source is "<parent><child/></parent>".
From the StartElementHandler, Paul's process()
would be called on parent.  The pre-processing
would occur, and then process-children() would
be the next item on the execution stack.
However, since StartElementHandler has not
gotten its return, expat cannot move on
to the child... thus the "push" model is
incompatible with Paul's "pull" mechanism.

Thus, two options are left: the XML source
is stored in random-access memory, or, 
the system is broken into two threads,
communicating through an element queue.

Is there something I'm missing here?

If I'm not going crazy above, as a consequence
of this "push" model (even if expat is not used
for the XML source), Paul's processor proposal
requires a thread for each stage of processing
in a pipe-line.

More generally, this has implications for
multi-stage XSL processing.  It would require
one of three things: (a) random access
to the source document, (b) each stage
in a separate thread, or (c) a pre-compiler which
rewrites expressions of the first form (single
function with a call-back in the middle) to
expressions of the second form (two functions,
one for the begin and one for the end).

Am I completely on a different planet?

Thanks James!

Clark

P.S.  Paul's processor proposal also uses a
nested matching system -- which I believe is
a tangential issue.  This would be something
like this in XSL:


  <xsl:template name="x" />
 
  <xsl:template match="my-criteria">
    <paul:register match="my-sub-criteria" to-template-named="x" />
    <!-- pre-child processing -->
    <xsl:apply-templates />
    <!-- post-child processing -->
  </xsl:template>
                   



From jjc at jclark.com  Fri Dec 17 07:49:37 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
References: <Pine.LNX.4.10.9912161344010.20127-100000@cauchy.clarkevans.com>
Message-ID: <3859EAB6.4CE3599C@jclark.com>

"Clark C. Evans" wrote:

> Anyway, given a SAX event source, pushing
> the entire document his way, I don't see
> how a single threaded solution is possible.
> 
> And, from the expat declaration of setElementHandler,
> which requires both a StartElementHandler and an
> EndElementHandler, I assumed that expat works in
> a similar (if not identical) manner.

Expat doesn't work like SAX.  Clark Cooper has written a nice article
explaining expat's API:

  http://www.xml.com/pub/1999/09/expat/index.html

With SAX, the application calls parse once per document; the parser
makes a call on an InputStream to get each chunk of input.  With expat,
the parser doesn't make any calls to get input; rather the application
calls XML_Parse() arbitrarily many times for a single document, each
time passing it another chunk of the input.

James




From clark.evans at manhattanproject.com  Fri Dec 17 08:13:17 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: <3859EAB6.4CE3599C@jclark.com>
Message-ID: <Pine.LNX.4.10.9912161512310.20127-100000@cauchy.clarkevans.com>


Thanks James.  I think I got it.  You use the
expat events to build the queue, and then use
the call-backs to process the queue.  When
the queue is empty, you ask expat to load
the queue with the next chunk of the source.
This should work like a charm -- and in a 
single thread.   
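
A sketch of how the recursive "pre-process, process children, post-process" style can sit on top of such a queue in a single thread. It uses Python's xml.parsers.expat; the names events, process and run are hypothetical, and the log list merely records the order of the calls.

```python
import io
import xml.parsers.expat
from collections import deque


def events(stream, chunk_size=16):
    """Yield ("start"|"end", name), reading input only when the queue is empty."""
    queue = deque()
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = lambda name, attrs: queue.append(("start", name))
    parser.EndElementHandler = lambda name: queue.append(("end", name))
    done = False
    while not done:
        chunk = stream.read(chunk_size)
        done = not chunk                  # empty read: end of input
        parser.Parse(chunk, done)
        while queue:
            yield queue.popleft()


def process(name, source, log, depth=0):
    """Handle one element whose start event has already been consumed."""
    log.append("  " * depth + "pre  " + name)     # pre-process
    for kind, child in source:                    # process-children
        if kind == "end":
            break                                 # our own end tag
        process(child, source, log, depth + 1)
    log.append("  " * depth + "post " + name)     # post-process


def run(data):
    source = events(io.BytesIO(data))
    log = []
    _, root = next(source)                        # start event of the root
    process(root, source, log)
    return log


print("\n".join(run(b"<parent><child/></parent>")))
```

The expat handlers have long since returned by the time process() asks for the next event, so nothing blocks inside a callback.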

Oh How Pretty! 

Clark

On Fri, 17 Dec 1999, James Clark wrote:
> "Clark C. Evans" wrote:
> > Anyway, given a SAX event source, pushing
> > the entire document his way, I don't see
> > how a single threaded solution is possible.
> > 
> > And, from the expat declaration of setElementHandler,
> > which requires both a StartElementHandler and an
> > EndElementHandler, I assumed that expat works in
> > a similar (if not identical) manner.
> 
> Expat doesn't work like SAX.  Clark Cooper has written a nice article
> explaining expat's API:
> 
>   http://www.xml.com/pub/1999/09/expat/index.html
> 
> With SAX, the application calls parse once per document; the parser
> makes a call on an InputStream to get each chunk of input.  With expat,
> the parser doesn't make any calls to get input; rather the application
> calls XML_Parse() arbitrarily many times for a single document, each
> time passing it another chunk of the input.
> 
> James
> 




From ldodds at ingenta.com  Fri Dec 17 09:50:22 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:18:36 2004
Subject: Xpath and DOM
In-Reply-To: <000701bf481d$0b23a4c0$d1940e18@smateo1.sfba.home.com>
Message-ID: <000901bf4874$257e3ea0$ab20268a@pc-lrd.bath.ac.uk>

> Actually, I would have preferred to have a small set of
> query mechanism independent API in the DOM API.

You're right it would make sense to have it separate to 
the DOM.
 
> Here is a string-based version:
> 
> interface DocumentQuery {
>   NodeIterator getQueryResult(String query);
> }
> 
> where query is prefixed with query type information such
> as "xpath:" or "xsql:".  API for plugging in new query
> engine is needed to support new query types.

Hmm, I'd suggest:

interface QueryFactory {

  DocumentQuery getDocumentQuery(String queryType);
}

And have the query type, "xpath" or "xsql" passed in to the 
factory method. Solves the plugging in of new query types, and 
avoids having to prepend the query string with "xpath:" or whatever.
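
A rough sketch of the factory idea, written in Python rather than as Java interfaces. QueryFactory, register, get_document_query and ElementTreePathQuery are hypothetical names; the toy "xpath" engine simply delegates to ElementTree's findall(), which supports only a limited XPath subset, and binding the factory to a document is an assumption the proposal leaves open.

```python
import xml.etree.ElementTree as ET


class ElementTreePathQuery:
    """Toy 'xpath' engine: delegates to ElementTree's limited XPath subset."""

    def __init__(self, root):
        self._root = root

    def get_query_result(self, query):
        return iter(self._root.findall(query))


class QueryFactory:
    """Maps a query type name to an engine, so query strings need no prefix."""

    def __init__(self, root):
        self._root = root
        self._engines = {}

    def register(self, query_type, engine_class):
        self._engines[query_type] = engine_class

    def get_document_query(self, query_type):
        if query_type not in self._engines:
            raise ValueError("unsupported query type: " + query_type)
        return self._engines[query_type](self._root)


root = ET.fromstring("<doc><item id='1'/><group><item id='2'/></group></doc>")
factory = QueryFactory(root)
factory.register("xpath", ElementTreePathQuery)
query = factory.get_document_query("xpath")
ids = [node.get("id") for node in query.get_query_result(".//item")]
print(ids)
```

Plugging in a new query type is then a matter of one register() call, and the query string itself stays free of "xpath:"-style prefixes.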

L.



From ray at xmission.com  Fri Dec 17 12:43:37 1999
From: ray at xmission.com (Ray Whitmer)
Date: Mon Jun  7 17:18:36 2004
Subject: Musing over Namespaces
References: <199912162335454.SM00700@Unknown.>
Message-ID: <385A33A2.53E144FA@xmission.com>

rev-bob@gotc.com wrote:

> Just to take a stab in the dark here, but wouldn't this fundamental difference be that Perl
> and Java are almost unilaterally self-contained parcels (hence, it doesn't matter what the
> package is named, because you're writing everything that deals with it anyway), but
> XML documents are designed for interchange - where the names don't just have to make
> sense to you, but also to an unknown client?

This is not true.  We combine Java and Perl from a variety of sources and refer to these
classes from other packages from other sources -- significantly more diverse than our current
XML declarations in terms of one set of classes referring to another.

> In other words, my only concern when naming a function or a class in a program is that I
> need to know what it is; I can name a variable "Fred" or a 50-char string class "Bubba" if
> I want to, and it doesn't matter - because nobody else needs to understand what those
> names mean.  However, if I'm writing a document that I'm going to send somewhere
> else for Joe to deal with, I'd better use names that Joe can understand and easily map.
> (For an HTML example, I like "BQ" much more than "BLOCKQUOTE" as an element
> name - but if I use BQ in my code, no UA will know what I'm talking about because BQ
> isn't defined anywhere.  If, OTOH, I slip a transformative preprocessor that morphs BQ
> into the defined BLOCKQUOTE somewhere between authoring and the public UA, that
> works fine.)

Packages are heavily reused and refer to the public declarations of other packages.

In Java, methods only to be used within a package are not given scope beyond that package and
can be renamed as desired.  But software packages would be quite useless without a significant
number of public declarations.

For example, take the W3C DOM Java bindings, which consist almost purely of public
declarations.  They declare such common public names as "Document", "Comment", "Attr".
There are numerous conflicts between these compiled interface class names and classes in the
applications I use them in.  Without package name qualification of the classes, the situation
would be quite difficult not only distinguishing between ambiguous names, but also just trying
to keep track of which standard each class belonged to.

IMO, this is exactly what happens with XML, mixing different standard elements and
architectural forms from different specs in a single DTD or content model to produce a desired
result.

> The difference you're looking for is one of scope.  Internal names don't matter to the
> outside world, because nobody outside has to do anything with them...but external
> names MUST be defined in some way, else nobody outside CAN do anything with
> them.  Am I expressing it clearly?

No.  Java packages typically have lots of external names that matter to the outside.

Ray Whitmer
ray@xmission.com




From David.L.Eiland at fritolay.com  Fri Dec 17 14:07:22 1999
From: David.L.Eiland at fritolay.com (Eiland, David)
Date: Mon Jun  7 17:18:36 2004
Subject: Musing over Namespaces
Message-ID: <E11yy2R-0006CX-00@romeo.ic.ac.uk>

	While the conversation has primarily focused on the company
(organization) level, it is my experience working with the Fortune 500 that
the same name is often used differently between departments/functions.  Given
the buying and selling of divisions and companies that periodically
occurs, the Namespace mechanism must be fluid.

	Dave


	Len Bullard <cbullard@hiwaay.net>
	12/15/99 07:09 PM
	To:	Tim Bray <tbray@textuality.com>@SMTP@Exchange
	cc:	"Clark C. Evans"
<clark.evans@manhattanproject.com>@SMTP@Exchange, David Megginson
<david@megginson.com>@SMTP@Exchange, xml-dev@ic.ac.uk@SMTP@Exchange 
	Subject:	Re: Musing over Namespaces

	Tim Bray wrote:
	> 
	> At 11:04 AM 12/14/99 -0500, Clark C. Evans wrote:
	> >It'd still be nice to have a single database with
	> >everyone's namespace definitions in one place though...
	> >perhaps even a DTD to help describe them.  I'm sure
	> >there are organizations doing this... are there?
	> 
	> Yes, lots.  That's the problem. -Tim

	No Tim, that's the answer.  Common goals achieved by 
	common means, not common control.  Put the definitions in 
	place by performance, not uberNamespace.  It has to 
	breathe, my friends, like a good well performed piece 
	of music, it has to breathe.

	The standards wonks really need to study management 
	and contract negotiation.  OASIS and BIZTALK are 
	the absolute wrong way to do this.  This means a 
	marvelous lucrative opportunity is sitting there being 
	ignored.  Someone or some company will get rich 
	off this if they go to these companies and demonstrate 
	the right way to do this.

	len






From stele at fxtech.com  Fri Dec 17 14:23:01 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
References: <Pine.LNX.4.10.9912161512310.20127-100000@cauchy.clarkevans.com>
Message-ID: <385A4785.93FDC75E@fxtech.com>

> Thanks James.  I think I got it.  You use the
> expat events to build the queue, and the use
> the call-backs to process the queue.  When
> the queue is empty, you ask expat to load
> the queue with the next chunk of the source.
> This should work like a charm -- and in a
> single thread.
> 
> Oh How Pretty!

Yes, I think so. This sounds exactly like what I was trying to do.
Perhaps my implementation was flawed. The problem occurred when I got to
some element data and wanted to "pull" that from within an element
handler.

Say I get called for an element "Point" which has this syntax:

	<Point>(x,y)</Point>

When I get to the handler, I'd like to write this:

void PointHandler(XML::Element &elem, Point *point)
{
	// "pull" a chunk of element data, but it'll automatically
	// stop when it gets to the end of the element (it stops
	// when it sees </Point>)
	char buf[40];
	elem.GetData(buf, sizeof(buf));
	sscanf(buf, "(%d,%d)", &point->x, &point->y);
}

For some reason I couldn't figure out how to make this work, because
with expat I might have hit the element but not necessarily gotten to the
character data handler.
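
One way this could work with an event queue, sketched with Python's xml.parsers.expat. PullParser, get_data and read_point are hypothetical names loosely modelled on the GetData() call above. Character data is queued like any other event, and get_data() keeps pulling (and feeding the parser more input when needed) until it reaches a non-text event. Expat may deliver one run of text in several pieces, so the pieces are joined.

```python
import io
import re
import xml.parsers.expat
from collections import deque


class PullParser:
    """Queue-backed pull interface over expat, including character data."""

    def __init__(self, stream, chunk_size=4):
        self._stream = stream
        self._chunk_size = chunk_size
        self._queue = deque()
        self._done = False
        p = self._parser = xml.parsers.expat.ParserCreate()
        p.StartElementHandler = lambda name, attrs: self._queue.append(("start", name))
        p.EndElementHandler = lambda name: self._queue.append(("end", name))
        p.CharacterDataHandler = lambda text: self._queue.append(("text", text))

    def next_event(self):
        while not self._queue and not self._done:
            chunk = self._stream.read(self._chunk_size)
            self._done = not chunk                 # empty read: end of input
            self._parser.Parse(chunk, self._done)
        return self._queue.popleft() if self._queue else None

    def get_data(self):
        """Join text events up to the next non-text event (here: the end tag)."""
        parts = []
        event = self.next_event()
        while event is not None and event[0] == "text":
            parts.append(event[1])
            event = self.next_event()
        return "".join(parts)


def read_point(data):
    parser = PullParser(io.BytesIO(data))
    if parser.next_event() != ("start", "Point"):
        raise ValueError("expected a <Point> element")
    match = re.fullmatch(r"\((-?\d+),(-?\d+)\)", parser.get_data().strip())
    if match is None:
        raise ValueError("expected text of the form (x,y)")
    return int(match.group(1)), int(match.group(2))


print(read_point(b"<Point>(3,4)</Point>"))
```

This simplified get_data() consumes the event that ends the text run, which is fine for a leaf element such as Point; mixed content would need a way to push that event back.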

I think this will make a lot more sense once I release the code. And
fortunately, I could rewrite the internals (once I figure out how) to
once again layer over expat, without changing the interface.

--
Paul Miller - stele@fxtech.com



From Mike.Champion at softwareag-usa.com  Fri Dec 17 14:26:43 1999
From: Mike.Champion at softwareag-usa.com (Michael Champion)
Date: Mon Jun  7 17:18:36 2004
Subject: Xpath and DOM
References: <000901bf4874$257e3ea0$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <040801bf489a$238586a0$f9bdb3c7@WORKGROUP>


----- Original Message -----
From: "Leigh Dodds" <ldodds@ingenta.com>
To: "xml-dev" <xml-dev@ic.ac.uk>
Sent: Friday, December 17, 1999 4:50 AM
Subject: RE: Xpath and DOM


> > Actually, I would have preferred to have a small set of
> > query mechanism independent API in the DOM API.
>
> You're right it would make sense to have it separate to
> the DOM.

I completely agree.  The DOM is not "the" API for all XML standards; it is
meant to be a foundation upon which other W3C APIs can be built.  The DOM
extensions for individual specs might be trivial, e.g. the XSLT people might
want to simply define a method such as [off the top of my head]
transformNode(Node input, Node output, Node stylesheet) so that everyone
implementing XSLT in a DOM-compatible system uses the same names.  At the
other extreme is SVG, which is defining a whole set of DOM-compatible
interfaces to expose the higher-level semantics of SVG.  XPath (either the
WG or the user community) is free to do likewise.

The DOM working group would much rather be a source of advice on how to make
APIs that are DOM-compatible, platform neutral, vendor neutral, etc. than to
be the ultimate standardization authority for XML APIs.

On the other hand, if there are problems in DOM Level 2 that impede an
implementation of XPath or a definition of simple XPath extensions, they
should be reported to www-dom@w3.org RIGHT AWAY.




From Daniel.Brickley at bristol.ac.uk  Fri Dec 17 14:31:20 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:36 2004
Subject: Musing over Namespaces
In-Reply-To: <E11yy2R-0006CX-00@romeo.ic.ac.uk>
Message-ID: <Pine.GHP.4.21.9912171419380.16086-100000@mail.ilrt.bris.ac.uk>

Len Bullard <cbullard@hiwaay.net>
 	12/15/99 07:09 PM
 	To:	Tim Bray <tbray@textuality.com>@SMTP@Exchange
 	Tim Bray wrote:
 	> 
 	> At 11:04 AM 12/14/99 -0500, Clark C. Evans wrote:
 	> >It'd still be nice to have a single database with
 	> >everyone's namespace definitions in one place though...
 	> >perhaps even a DTD to help describe them.  I'm sure
 	> >there are organizations doing this... are there?
 	> 
 	> Yes, lots.  That's the problem. -Tim
 
> 	No Tim, that's the answer.  Common goals achieved by 
> 	common means, not common control.  Put the definitions in 

I didn't take Tim to be arguing against decentralisation, but against
the multitude of companies/organisations who both promote the 
centralised monolithic (meta)data registry approach, and
(coincidentally) attempt to promote themselves as providing that
service.

One of the supposed big wins of using the same syntax/model for our
meta-languages and our instance data is re-use, synergy. 
Eg. if RDF and XML vocabularies are themselves described using RDF and
XML, then generic discovery/indexing/trust systems applicable to _all_
XML/RDF content should be equally applicable to schemas. Why then
promote centralised registries for all schemas? Surely there will be
search engines and 'trusted third parties' for schema data, as there
will be for other applications of XML and RDF. By defining schema
languages in instance syntax, we implicitly promote the idea that there
will be some big payoff for doing so (otherwise, let's stick with
DTDs). Some synergy that means generic tools will be applicable to
schemata. I find this impossible to reconcile with the
www.really-important-trusted-metaregistry.[com|org] approach that seems
popular in the industry. I'm banking on doing schema searches at the
mainstream search engines in 2-3 years time...

Dan

--
daniel.brickley@bristol.ac.uk




From mikew at o3.co.uk  Fri Dec 17 14:49:45 1999
From: mikew at o3.co.uk (Mike Williams)
Date: Mon Jun  7 17:18:36 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: David Megginson's message of "16 Dec 1999 09:41:16 -0500"
References: <Tim Bray's message of "Wed, 15 Dec 1999 09:29:18 -0800"> <3.0.32.19991215092815.01443430@pop.intergate.ca> <199912151906.OAA24894@hesketh.net> <m31z8n54oj.fsf@localhost.localdomain>
Message-ID: <m3zov9sxns.fsf@picasso.o3.co.uk>

  >>> On 16 Dec 1999 09:41:16 -0500,
  >>> "David" == David Megginson <david@megginson.com> wrote:

  David> Well, the code bloat isn't strictly true if the Name class implements
  David> equals(), since you will usually just be testing for equality; still,
  David> it does add an extra level of indirection, and every indirection in an
  David> API is an open flame in a powder magazine (it won't cause any problems
  David> as long as you're careful and follow the proper procedures, but...).

If Name extended String, could you retain backward compatibility, but still
provide support for namespaces?  The default string value would be the same
as it is currently, ie. optional prefix + local name.  Additional methods
would give access to the un-prefixed local name, prefix, and namespace URI.
No need for NSUtils.

Dumb idea?

-- 
Mike Williams



From msabin at cromwellmedia.co.uk  Fri Dec 17 14:56:23 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:36 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A675195@odin.cromwellmedia.co.uk>

Mike Williams wrote,
> If Name extended String, could you retain backward
> compatibility, but still provide support for namespaces?

Small problem: String is final.

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/



From david at megginson.com  Fri Dec 17 15:05:07 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:36 2004
Subject: Musing over Namespaces
In-Reply-To: rev-bob@gotc.com's message of "16 Dec 99 23:39:18 -0500"
References: <199912162335454.SM00700@Unknown.>
Message-ID: <m3vh5x4nio.fsf@localhost.localdomain>

rev-bob@gotc.com writes:

> > Why is XML different?  Is it just that we come from the SGML
> > background, where we consider structural validation to be part of
> > a document rather than a process applied to it, or is there some
> > kind of a fundamental difference between naming code and naming
> > document nodes that no one has articulated yet?
> 
> Just to take a stab in the dark here, but wouldn't this fundamental
> difference be that Perl and Java are almost unilaterally
> self-contained parcels (hence, it doesn't matter what the package is
> named, because you're writing everything that deals with it anyway),
> but XML documents are designed for interchange - where the names
> don't just have to make sense to you, but also to an unknown client?

I'd considered that possibility, but it's not true.  Consider the
following, very typical case in the Java world:

1. I write a compiled Java app that references org.xml.sax.Parser.
2. I send you the app.
3. You run the app, and the JVM complains that org.xml.sax.Parser is
   not found.

I find it very hard to see any difference between that case and the
following one:

1. I compile a data record in XML format that includes an element
   named {http://www.foo.com/ns/}alpha.
2. I send you the data record.
3. You process the record, and your processing code complains that
   {http://www.foo.com/ns/}alpha is not recognized.

In neither case is there any automatic resolution mechanism to help
you -- the JVM won't go out on the Web and find org.xml.sax.Parser for 
you if it's not already on your classpath.  I'm not disputing that
something like this could be helpful (both for Java and Namespaces),
but I am disputing that Namespaces (or Java packages) are
fundamentally incomplete without it.
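
The second case can be sketched with Python's xml.parsers.expat, which reports namespace-qualified names when a namespace separator is supplied. The function name unrecognized_names, the sample record and the set of known names are illustrative assumptions; the {uri}local notation follows the one used above.

```python
import xml.parsers.expat


def unrecognized_names(document, known):
    """Return the namespace-qualified element names the receiver does not know."""
    unknown = []
    # With a separator, expat reports element names as "<uri><sep><local>".
    parser = xml.parsers.expat.ParserCreate(namespace_separator=" ")

    def start(name, attrs):
        uri, _, local = name.rpartition(" ")
        qualified = "{%s}%s" % (uri, local) if uri else local
        if qualified not in known:
            unknown.append(qualified)

    parser.StartElementHandler = start
    parser.Parse(document, True)
    return unknown


record = (b'<r:record xmlns:r="http://www.foo.com/ns/">'
          b'<r:alpha/><r:beta/></r:record>')
known = {"{http://www.foo.com/ns/}record", "{http://www.foo.com/ns/}beta"}
print(unrecognized_names(record, known))
```

As with the missing class, the receiver can only report the name it does not recognize; nothing in the name itself says where to find a definition or what the element means.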

> In other words, my only concern when naming a function or a class in
> a program is that I need to know what it is; I can name a variable
> "Fred" or a 50-char string class "Bubba" if I want to, and it
> doesn't matter - because nobody else needs to understand what those
> names mean.  

But they do need to know -- see my example above.  The difference is
that once you have the class files for org.xml.sax.Parser, you're
fine, while it's far from obvious what you *could* retrieve that would 
help with {http://www.foo.com/ns/}alpha.

> However, if I'm writing a document that I'm going to send somewhere
> else for Joe to deal with, I'd better use names that Joe can
> understand and easily map.  

That Joe can understand, or that his software can understand?  There's 
a significant difference there.  We could set up some mechanism so
that Joe's software could automatically retrieve a schema or schema
fragment that covers {http://www.foo.com/ns/}alpha, but that's not
much help -- Joe's software could tell that it's valid or invalid
(whoopee!), but it still couldn't tell what it means.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Fri Dec 17 15:13:54 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:36 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: Mike Williams's message of "17 Dec 1999 09:51:19 +0000"
References: <Tim Bray's message of "Wed, 15 Dec 1999 09:29:18 -0800"> <3.0.32.19991215092815.01443430@pop.intergate.ca> <199912151906.OAA24894@hesketh.net> <m31z8n54oj.fsf@localhost.localdomain> <m3zov9sxns.fsf@picasso.o3.co.uk>
Message-ID: <m3so114n3z.fsf@localhost.localdomain>

Mike Williams <mikew@o3.co.uk> writes:

> If Name extended String, could you retain backward compatibility,
> but still provide support for namespaces?

Great idea, but java.lang.String is final, so it cannot be extended
(that supposedly allows many optimizations in the JVM).  To help with
this problem, java.lang.Object does contain a toString() method for
printing, etc., and different classes can override that.
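
A sketch of the composition alternative, in Python for consistency with the other examples in this archive; the Java counterpart would be a Name class that overrides toString() and equals(). The field names uri, local and prefix are hypothetical, and ignoring the prefix when comparing is an assumption in line with how namespaces identify names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Name:
    """A namespace-aware name that still prints as prefix:local."""

    uri: str
    local: str
    # The prefix is a spelling detail, so it takes no part in equality or hashing.
    prefix: str = field(default="", compare=False)

    def __str__(self):
        return self.prefix + ":" + self.local if self.prefix else self.local


a = Name("http://www.foo.com/ns/", "alpha", prefix="foo")
b = Name("http://www.foo.com/ns/", "alpha", prefix="f")
print(str(a), a == b)
```

Code that only prints or logs names keeps working through the string conversion, while namespace-aware code compares Name objects directly.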


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From martind at netfolder.com  Fri Dec 17 15:26:16 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:36 2004
Subject: A bit of synergy this morning?
Message-ID: <NBBBJPGDLPIHJGEHAKBACEOEEJAA.martind@netfolder.com>

Hello,

The question about topic maps and RDF brought me a wave of
reflections.

Question:
Could an RDF "record" be a link?
or could an Xlink be an RDF record?

At first sight this may not seem too significant, but if you think more about
it, it brings tremendous advantages. So let's imagine for a second that an
XLink extended locator were also an RDF "record". The link would then also
contain meta-information about the linked resource. For instance, take
the following expression:
<specifications xlink:type="extended">
<rdf:description xlink:type="locator" xlink:href="http://www.w3c.org/xlink">
  <release_date>12/24/99</release_date>
  <type>christmas gift</type>
  <description>W3C would be a santa claus for us poor XML
developers</description>
</rdf:description>
</specifications>

OK, a real resource description would hold more significant information :-))
but the point here is the following: if an RDF description also recognized the
XLink attribute for linkage (so that we can replace
rdf:about="http://www.w3c.org/xlink" with
xlink:href="http://www.w3c.org/xlink"), then the resource description can also
be a link, or vice versa.
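
To make the idea concrete, here is a minimal Python sketch of a consumer that
treats each locator as both a link and a resource description. It is only an
illustration: the namespace declarations are not in the expression above and
are assumed here (the XLink URI is the one from the later Recommendation), and
locators_with_metadata is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URIs; the original expression declares none.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XLINK = "http://www.w3.org/1999/xlink"

DOC = """\
<specifications xmlns:rdf="{rdf}" xmlns:xlink="{xlink}" xlink:type="extended">
  <rdf:description xlink:type="locator" xlink:href="http://www.w3c.org/xlink">
    <release_date>12/24/99</release_date>
    <type>christmas gift</type>
  </rdf:description>
</specifications>
""".format(rdf=RDF, xlink=XLINK)


def locators_with_metadata(xml_text):
    """Return (href, metadata) pairs, one per element marked as an XLink locator."""
    root = ET.fromstring(xml_text)
    results = []
    for elem in root.iter():
        if elem.get("{%s}type" % XLINK) != "locator":
            continue
        # The link target comes from xlink:href ...
        href = elem.get("{%s}href" % XLINK)
        # ... and the child elements are read as properties of that target.
        metadata = {child.tag: (child.text or "").strip() for child in elem}
        results.append((href, metadata))
    return results


links = locators_with_metadata(DOC)
```

Each entry in links carries the target URL together with its description, which
is the information a browser would need for the tool tip in impact (c).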

The impacts are:
a) more significant links (links that also include meta-information about
the linked resource)
b) Resource descriptions could be used as links (the converse reasoning)
c) a browser can display a one to many link as a two level context menu as
below

   rdf specifications ----------------------
			    | W3C documents       |
			    | Didier's suggestion |
			    | examples            |----------------------
			    ----------------------| author: Will johnson |
							  | date: 12/24/99       |
							  | description: bla..   |
							  ------------------------
The first menu is...a menu, then when a particular locator is highlighted, a
tool tip kind of window is displayed to provide additional information about
the link (the meta information about the resource).
d) probably a lot more that I haven't envisioned.

Observation:
I discovered something while observing the W3C's output. It seems that each
workgroup creates its own workspace... er, sorry, its own name space, and does
not re-use the work of others (have you found much name space element
re-use among the WGs?). For example, it would be beneficial if the RDF WG
used the XLink workgroup's reference attribute for the resource reference,
and vice versa: the XLink group could just as well take the rdf:about
attribute as a resource reference. Anyway, if these groups were to talk to
each other, or just exercise their synthetic minds, it would become more
obvious that if they mixed their mind spaces a bit... er, sorry, their name
spaces, they would provide us with synergistic constructs. Or maybe it never
occurred to anybody that a resource could be a link or that a link could also
be a resource description. Oops, did I invent something here, or did I
discover that workgroups do not talk to each other?

Cheers

Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products: http://www.netfolder.com




From richard at cogsci.ed.ac.uk  Fri Dec 17 15:50:52 1999
From: richard at cogsci.ed.ac.uk (Richard Tobin)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: Paul Miller's message of Thu, 16 Dec 1999 18:01:03 -0500
Message-ID: <199912171550.PAA01341@rhymer.cogsci.ed.ac.uk>

Though James Clark has explained how you can do what you want with
expat, you might be interested in LT XML.  LT XML was originally
written to handle large natural-language corpora, some of which are
several gigabytes.  It allows you to read "bits" sequentially, and
when you find a start tag that you are interested in, easily fill in
the tree rooted there.  It can also validate if desired (validation
requires memory to store a table of IDs and IDREFs, but still works
without building a tree).

-- Richard



From lhill at excelergy.com  Fri Dec 17 16:39:50 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:36 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <776DC00B49ECD21189750090273F729130B35A@EROS>

Mike Williams writes:
> If Name extended String

Mike,

String is final and cannot be extended.

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================



From ht at cogsci.ed.ac.uk  Fri Dec 17 17:11:13 1999
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:18:36 2004
Subject: XML Schema: New Public Working drafts available
Message-ID: <f5b902txzki.fsf@cogsci.ed.ac.uk>

On behalf of the XML Schema Working Group we are pleased to announce
the release of new Public Working Drafts for XML Schema: Structures [1]
and XML Schema: Datatypes [2].

These working drafts incorporate substantial new material, and
represent an important step forward.  The following is an extract from 
the 'Status' section of both drafts:

  The WG believes this draft to be `feature-complete': the
  functionality included here is substantially complete and is
  expected to be stable. We do not expect to add major new
  functionality, or to make major changes to the functionality
  described in this draft. Some sections of the draft (in particular
  those on conformance), and some aspects of the design (in particular
  details of the transfer syntax for schemas), on the other hand, are
  still rough and are expected to be revised.

  The WG expects to spend January, 2000, working out details,
  clarifying points of uncertainty that arise in the review of this
  draft, cleaning up inconsistencies, reviewing the design of the
  concrete transfer syntax, and making editorial improvements.

  Following that period of review and polishing, it is the WG's intent
  to issue a Last Call for Review by other W3C working groups sometime
  during February, 2000, and to submit this specification in March,
  2000, for publication as a Candidate Recommendation.  This schedule
  may vary, depending on the comments of the public and of other W3C
  working groups on this draft. Such comments are instrumental in the
  WG's deliberations, and we encourage readers to review the draft and
  send comments to www-xml-schema-comments@w3.org.

  Although the Working Group does not anticipate further substantial
  changes to the functionality described here, this is still a working
  draft, subject to change based on experience and on comment by the
  public and other W3C working groups. The present version should be
  implemented only by those interested in providing a check on its
  design or by those preparing for an implementation of the Candidate
  Recommendation. _The Schema WG will not allow early implementation to_
  _constrain its ability to make changes to this specification prior to_
  _final release_.

Henry S. Thompson
Paul V. Biron

[1] http://www.w3.org/TR/xmlschema-1/
[2] http://www.w3.org/TR/xmlschema-2/
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/



From jrgardn at emory.edu  Fri Dec 17 17:29:10 1999
From: jrgardn at emory.edu (John Robert Gardner)
Date: Mon Jun  7 17:18:36 2004
Subject: XHTML Parable
In-Reply-To: <040801bf489a$238586a0$f9bdb3c7@WORKGROUP>
Message-ID: <Pine.GSO.4.05.9912171228040.11145-100000@jet.cc.emory.edu>


Just for fun, and to commemorate the re-birth of XHTML, if you're bored,
have a look at:

http://vedavid.org/xml/luke/




From James.Anderson at mecomnet.de  Fri Dec 17 17:51:08 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:36 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <AA4C152BA2F9D211B9DD0008C79F760A675194@odin.cromwellmedia.co.uk> <14424.65166.312499.405767@localhost.localdomain>
Message-ID: <385A763D.1DDA393F@mecomnet.de>

That the argument cited below appears in the context of
names/namespaces/xml-encoding is an artifact of xml's genesis. If one leaves
text strings alone for a moment, and permits that, once the document is
decoded, the processing can occur on a symbolic level, then the argument has
no grounds, as the worrisome interface is relegated to the javadoc archives
and never appears in code.

Why are people so worried about the "parts of a name" anyway? In lisp code for
processing stuff decoded from / encoded into xml there is not a single
instance of either (SYMBOL-NAME ...) or (PACKAGE-NAME (SYMBOL-PACKAGE ...)).
The only test is (EQ ...), and even those tests arise primarily by virtue of
classification for method dispatch.

So long as one stays on the "strings" level, however, the need for centralized
registration and rectification will prevail... Move up a level: let namespaces
themselves be first class objects. Parsers appear in any event to be moving
towards interning parsed strings. The processing infrastructure needs
operations to manipulate namespaces. A name should not bind a URI; a
namespace comprises a collection of names. If implemented in a reasonable
fashion, this permits the processor to associate a name with more than one
namespace. Given that, a process can guarantee the correct mapping from names
to symbols and thus to processes without need for a centralized authority.
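
A rough sketch of that arrangement, written in Python rather than Lisp and
offered only as one possible reading: Namespace and Symbol are hypothetical
classes, the example.org URIs are placeholders, and Python's identity test
(`is`) stands in for EQ.

```python
class Symbol:
    """An interned name; identity comparison (`is`) plays the role of Lisp's EQ."""

    __slots__ = ("local_name", "namespaces")

    def __init__(self, local_name):
        self.local_name = local_name
        self.namespaces = []  # a symbol may belong to more than one namespace

    def __repr__(self):
        return "Symbol(%r)" % self.local_name


class Namespace:
    """A first-class namespace: a collection of interned symbols."""

    def __init__(self, uri):
        self.uri = uri
        self._symbols = {}

    def intern(self, local_name):
        """Return the one symbol for local_name, creating it on first use."""
        symbol = self._symbols.get(local_name)
        if symbol is None:
            symbol = self.adopt(Symbol(local_name))
        return symbol

    def adopt(self, symbol):
        """Make an existing symbol reachable through this namespace as well."""
        self._symbols[symbol.local_name] = symbol
        if self not in symbol.namespaces:
            symbol.namespaces.append(self)
        return symbol


v1 = Namespace("http://example.org/ns/v1")
v2 = Namespace("http://example.org/ns/v2")

title_a = v1.intern("title")
title_b = v1.intern("title")   # same object as title_a
v2.adopt(title_a)              # one name, two namespaces
title_c = v2.intern("title")   # still the same object

# Dispatch is keyed on the symbol itself, never on its string parts.
handlers = {title_a: "render-title"}
```

Once the parser has interned a name, downstream code compares and dispatches on
the symbol object and never takes the name apart again.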

David Megginson wrote:
> 
> Every level of indirection is an open flame because it increases the
> difficulty (and cost) of learning and implementing an API, which
> leads to several problems:
> 
> a) the API is less likely to gain acceptance;
> b) the API is less likely to be implemented correctly; and
> c) the costs of learning, teaching, and documenting the API are higher.
>




From James.Anderson at mecomnet.de  Fri Dec 17 17:51:09 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:36 2004
Subject: namespace-oblivion [Re: SAX2: Namespace Processing and NSUtils helper class]
References: <3.0.32.19991216074341.013812d0@pop.intergate.ca>
Message-ID: <385A6F2C.48317CDD@mecomnet.de>

Tim Bray wrote:
> 
> ..
> 
> And the namespace-oblivious world is just no longer interesting.
> 
> -Tim
> 

?
was it ever?




From simonstl at simonstl.com  Fri Dec 17 18:17:56 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:36 2004
Subject: REBOL and XML
Message-ID: <199912171817.NAA21438@hesketh.net>

Is anyone doing work using REBOL[1] with XML?  The 'everything is data'
approach of REBOL seems both like a good fit and perhaps a conflict with
the XML approach.  I'm just getting started with this, but if anyone has
opinions or stories, I'd love to hear them.

[1] - http://www.rebol.com  (and yes, it's pronounced like rebel)

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From steve at rsv.ricoh.com  Fri Dec 17 18:29:12 1999
From: steve at rsv.ricoh.com (Stephen R. Savitzky)
Date: Mon Jun  7 17:18:36 2004
Subject: XML parsing memory overhead concerns
In-Reply-To: "Clark C. Evans"'s message of Thu, 16 Dec 1999 12:15:20 -0500 (EST)
References: <Pine.LNX.4.10.9912161213270.20127-100000@cauchy.clarkevans.com>
Message-ID: <qcbt7pe820.fsf@congo.crc.ricoh.com>

"Clark C. Evans" <clark.evans@manhattanproject.com> writes:

> On 16 Dec 1999, Stephen R. Savitzky wrote:
> > This is probably similar to what I'm using: the parser's API is basically 
> > a tree traverser:  it has methods like:
> > 
> >   toNextSibling
> >   toFirstChild
> >   toParent
> >   ... and a bunch of methods to examine the current node.
> > 
> > Though it's DOM-like, you never actually have to build the whole tree.
> 
> So, some random access storage is needed with this technique?

Yes -- you need a stack to match start tags and end tags, and you need to
build a parse tree for any element that needs to be processed as a whole.
This parser is part of a system that does only local tree transformations,
so any tokens that don't have to be transformed can just be sent on to the
output, which has the interface of a tree _constructor_ (or SAX application,
if you want to look at it that way).

> Why not have expat build the sub tree you are interested,
> put it in memory.  You can then "pull" from this?

You can, of course, use SAX or some other "push" parser on the input and
either copy events (tokens) or build subtrees as appropriate.  Doesn't work
as well if you have to process an entire document, but could do it in
streaming mode.  Works in the more usual case of a handful of active tags
embedded in a mostly-passive document, because most of the subtrees that
require processing in that case are small.
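
For illustration only, here is how the "handful of active tags" case might look
in Python with the standard library's iterparse. This is not the PIA code: the
active tag name "include" and the function process_stream are hypothetical, and
iterparse is simply a convenient event source layered on expat.

```python
import io
import xml.etree.ElementTree as ET

ACTIVE_TAGS = {"include"}  # hypothetical "active" tag


def process_stream(source, active_tags=ACTIVE_TAGS):
    """Yield a fully built subtree for each outermost active element.

    Passive elements are cleared as soon as they end, so their content is not
    retained (empty shells stay attached to their parent until it ends).
    Each yielded subtree is cleared when the generator resumes, so callers
    must use it before asking for the next one.
    """
    depth_in_active = 0
    for event, elem in ET.iterparse(source, events=("start", "end")):
        if event == "start":
            if depth_in_active or elem.tag in active_tags:
                depth_in_active += 1
        elif depth_in_active:
            depth_in_active -= 1
            if depth_in_active == 0:
                yield elem      # the whole active subtree is available here
                elem.clear()
        else:
            elem.clear()        # passive content: drop it immediately


DOC = b"""<doc>
  <p>passive text</p>
  <include href="a.xml"><title>First</title></include>
  <section><p>more passive text</p>
    <include href="b.xml"><title>Second</title></include>
  </section>
</doc>"""

found = [(e.get("href"), e.findtext("title"))
         for e in process_stream(io.BytesIO(DOC))]
```

Only the two small include subtrees are ever held complete in memory; a real
transformer would copy the passive events to its output instead of dropping
them.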

> I'd like to hear *much* more about this.

Our system is open source, at http://RiSource.org/PIA/

-- 
Stephen R. Savitzky  <steve@rsv.ricoh.com>  <http://rsv.ricoh.com/~steve/>
Platform for Information Applications:      <http://RiSource.org/PIA/>
Chief Software Scientist, Ricoh Silicon Valley, Inc. Calif. Research Center
 voice: 650.496.5710  front desk: 650.496.5700  fax: 650.854.8740 
  home: <steve@theStarport.org> URL: http://theStarport.org/people/steve/



From dave at userland.com  Fri Dec 17 18:48:28 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun  7 17:18:37 2004
Subject: More real-world XML
Message-ID: <094f01bf48bf$d3bee640$1918ccce@murphy>

We've found another pragmatic use for XML.

On our Weblog Monitor site we allow people who have registered sites to specify how they'd like the sites to be categorized. This results in a hierarchy of categories and nodes. This hierarchy is browsable thru the web, and it's also mirrored in XML. The XML file is here:

http://static.userland.com/weblogMonitor/categories.xml

You can browse the structure thru HTML here:

http://subhonker2.userland.com/weblogMonitor/categories/

If you register a weblog, you enter the Categories string on your Prefs page:

http://subhonker2.userland.com/weblogMonitor/prefs/

The levels are slash-delimited. You can specify as many paths as you want; they're comma-separated.

Here's the categories string I entered for Scripting News:

/computers/software/scripting,/culture/web,/humor,/Geography/USA/California
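
As a sketch of how such a string could be turned into a hierarchy (this is not the Weblog Monitor's actual code, and parse_categories, build_hierarchy, and the node layout are hypothetical), in Python:

```python
def parse_categories(categories):
    """Split a comma-separated list of slash-delimited paths into lists of levels."""
    paths = []
    for raw in categories.split(","):
        levels = [level for level in raw.strip().split("/") if level]
        if levels:
            paths.append(levels)
    return paths


def build_hierarchy(site_categories):
    """Merge {site name: categories string} into one nested dict.

    Each node maps a level name to {"sites": [...], "children": {...}}.
    """
    root = {}
    for site, categories in site_categories.items():
        for levels in parse_categories(categories):
            children = root
            for level in levels:
                node = children.setdefault(level, {"sites": [], "children": {}})
                children = node["children"]
            node["sites"].append(site)  # the site is listed at the last level
    return root


scripting_news = "/computers/software/scripting,/culture/web,/humor,/Geography/USA/California"
paths = parse_categories(scripting_news)
tree = build_hierarchy({"Scripting News": scripting_news})
```

Rebuilding the tree from the current registrations each time means a category with no remaining sites simply never appears.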

The structure evolves in real time: if a member changes his or her listing so that a category is emptied, it's no longer in the structure.

Here's a Frontier script that reads the XML into an outline:

http://samples.userland.com/discuss/enclosures/view$64

A screen shot of the outline:

http://www.outliners.com/images/frontier6/weblogMonitorCategoriesOutline.gif

Dave Winer


From gmckenzi at JetForm.com  Fri Dec 17 19:05:36 1999
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:18:37 2004
Subject: CR/LF in XML?
Message-ID: <A03B4F4BCAAFD311A76A00805F6512CA277C95@OTTMAIL1>

Folks,

After working with XML for the better part of the last two years I am
somewhat embarrassed about this...and I'm hoping that someone can give me an
opinion based on experience...

I thought that I understood the way the XML spec states that parsers are
required to convert all CR/LF pairs to simply LF.  But I've had that
understanding shaken when I was told by a co-worker that placing the CR/LF
pair in an XML file as character-entity references &#x000D;&#x000A; results
in the parser *not* translating the pair to a LF.

So in summary, a literal CR/LF pair appears to be translated, while
&#x000D;&#x000A; is not...right?

I've re-read the spec (especially sec 2.11), and the annotated
version...I've brushed up on the definition of a 'parsed-entity', and I'm
still unable to understand whether the reported behaviour of using character
entities to defeat the translation is correct; and if so, which magic words
in the spec (is it the repeated use of 'literal' in 2.11) cause the
behaviour of not translating the character-entity references to be correct.
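
One way to observe the reported behaviour is a small check in Python using the
standard library's expat-based ElementTree parser (the choice of tool is an
assumption for illustration, not something from this thread). Section 2.11
describes normalizing line ends in the literal input before parsing, and
character references are only expanded afterwards, so they are not touched.

```python
import xml.etree.ElementTree as ET

# A literal CR/LF pair in the input is normalized to a single LF.
literal = ET.fromstring(b"<a>one\r\ntwo</a>").text

# The same two characters written as character references survive as CR + LF.
referenced = ET.fromstring(b"<a>one&#x000D;&#x000A;two</a>").text

print(repr(literal))
print(repr(referenced))
```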

Gavin.
 



From donpark at docuverse.com  Fri Dec 17 19:05:42 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces
In-Reply-To: <m3vh5x4nio.fsf@localhost.localdomain>
Message-ID: <000601bf48c1$dd784d00$d1940e18@smateo1.sfba.home.com>

>That Joe can understand, or that his software can understand?  There's 
>a significant difference there.  We could set up some mechanism so
>that Joe's software could automatically retrieve a schema or schema
>fragment that covers {http://www.foo.com/ns/}alpha, but that's not
>much help -- Joe's software could tell that it's valid or invalid
>(whoopee!), but it still couldn't tell what it means.

What if the registry was a distributed semantic network?
If the problem could be solved, wouldn't it be worth
building it?  Note that this is not the way BizTalk and
XML.org schema repositories work.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From martind at netfolder.com  Fri Dec 17 20:03:39 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces
In-Reply-To: <000601bf48c1$dd784d00$d1940e18@smateo1.sfba.home.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBACEPFEJAA.martind@netfolder.com>

Hi Don
Don Said:
What if the registry was a distributed semantic network?
If the problem could be solved, wouldn't it be worth
building it?  Note that this is not the way BizTalk and
XML.org schema repositories work.

Didier replies:
Right ON!!

I think this could be an effective way to have both name spaces and
schemas documented.

Let's look at a practical example, since I am working on precisely that this
week (among other things :-)

I created a name space for dynamic DNS. All dynamic DNS requests and
responses are transformed into URLs and XML documents, respectively. For
instance, to add a new subdomain to a particular domain you send an HTTP GET
or HTTP POST with the request URL and receive an XML document about the new
registration, and so on for all the requests and responses.

The name space is called DDNS. Today, if I followed the specs, I would
associate the name space with an empty URI or, as the W3C does for some of
its own name space URIs, have the URI point to an HTML page.

But, not satisfied with such a half solution, I tried something: topic maps.
As we all know (do we?), topic maps are composed of topic elements, and a topic
element contains both names and an extended link (if you map the HyTime link
to XLink). So the solution was simple: the name space URI points to a
topic map document. This document has the DDNS name space as its main topic.
The extended link itself points to the different specifications used to create
this name space. These could also include schemas.

So, if the name space URI were to be used for something useful, it could point
to a topic map document, which would provide information about the name
space seen as a topic.

But Didier, what did we gain? he said with some incredulity.

Simple: a machine can decode a topic map document but not necessarily an
HTML page. Moreover, if we all agree on a convention for the document
structure, then a processor could easily retrieve the schema and humans the
written documentation. The name space could even be associated with other ones
by relationships like "is_derived_from" or "is-part_of", or more simply with
set relations and tags like "finance", "system", "network", etc.

Are others interested in having meaningful name spaces? If yes, we can
publish a paper providing a recommendation on how to document a name space.
I'll start the first draft and post it.

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products: http://www.netfolder.com




From donpark at docuverse.com  Fri Dec 17 20:20:33 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces
In-Reply-To: <NBBBJPGDLPIHJGEHAKBACEPFEJAA.martind@netfolder.com>
Message-ID: <000301bf48cc$4f634fa0$d1940e18@smateo1.sfba.home.com>

Didier wrote:
>Are others interested in having meaningful name spaces? If yes, we can
>publish a paper providing a recommendation on how to document
>a name space.
>I'll start the first draft and post it.

Sounds like a candidate for SML-DEV 'usage' spec. <g>

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From martind at netfolder.com  Fri Dec 17 21:11:09 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces
In-Reply-To: <000301bf48cc$4f634fa0$d1940e18@smateo1.sfba.home.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBACEPIEJAA.martind@netfolder.com>

Hi Don,

Don Said:
Sounds like a candidate for SML-DEV 'usage' spec. <g>

Didier replies:
I just posted a document about semantic name spaces to this list. Should
I also post it to the SML list? Maybe we can create a variant for SML too.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From cyang at mdsintl.com  Fri Dec 17 21:11:22 1999
From: cyang at mdsintl.com (Yang, Chol)
Date: Mon Jun  7 17:18:37 2004
Subject: No subject
Message-ID: <C6672FD5E16BD3118C3A00805FFE91D91B37EF@tornt4.mdshealth.com>

subscribe xml-dev cyang@mdsintl.com




From martind at netfolder.com  Fri Dec 17 21:11:00 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:37 2004
Subject: Very very preliminary doc about semantic name spaces
Message-ID: <NBBBJPGDLPIHJGEHAKBAAEPIEJAA.martind@netfolder.com>


Hi,

As I said earlier to Don, I would write a preliminary draft on
semantic name spaces. No sooner said than done: I just wrote a very, very
preliminary draft document about semantic name spaces. If there are any
comments, I ask nothing more than to add your names to the list of
contributors, and to harvest some wisdom and good suggestions from the
collective mind. I didn't wait for a more elaborate document before
requesting comments. I will update the document regularly.

PS: the attached document in the zipped file is in PDF format.


Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:
Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products:
http://www.netfolder.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: NameSpaceTopicMap.zip
Type: application/x-zip-compressed
Size: 81525 bytes
Desc: not available
Url : http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991217/e56f9ed6/NameSpaceTopicMap.bin
From gmckenzi at JetForm.com  Fri Dec 17 22:19:36 1999
From: gmckenzi at JetForm.com (Gavin McKenzie)
Date: Mon Jun  7 17:18:37 2004
Subject: REBOL and XML
Message-ID: <A03B4F4BCAAFD311A76A00805F6512CA277C99@OTTMAIL1>


Yes...I've been hooked on REBOL for a couple of months now.

It truly is a different (in the good sense) and very powerful scripting
language.

That, plus the rebol mailing list and developers have been very responsive.

However...and this is a biggie...it doesn't do XML like you would expect, or
at least like I expected.

The people at REBOL will tell you that REBOL has a built-in XML parser.
True enough, it does have the capability to parse XML -- and it has some
nifty features for composing HTML or XML, and a truly great type system
where things like URIs and XML tags are first class datatypes built into the
language.

But, the result of parsing an XML file is that it is loaded into a tree
structure in memory.  No, it isn't loaded into a DOM, it is loaded into a
REBOL 'block structure' which I've not found very easy to use.  Blocks are
just simple nested lists, and they're easy enough to deal with on their
own...but, trying to work on an XML document that has been put into a block
is very non-intuitive and tedious.
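
To give a feel for the problem, here is a Python sketch of a document held as
plain nested lists. The [tag, attributes, content] layout is an assumption made
for illustration and is not claimed to be REBOL's actual block layout; to_blocks
and find_all are hypothetical helpers.

```python
import xml.etree.ElementTree as ET


def to_blocks(elem):
    """Convert an element into a nested list: [tag, attributes, content]."""
    content = []
    if elem.text and elem.text.strip():
        content.append(elem.text.strip())
    for child in elem:
        content.append(to_blocks(child))
        if child.tail and child.tail.strip():
            content.append(child.tail.strip())
    return [elem.tag, dict(elem.attrib), content]


def find_all(block, tag):
    """Depth-first search for blocks with the given tag name."""
    matches = []
    if block[0] == tag:
        matches.append(block)
    for item in block[2]:
        if isinstance(item, list):
            matches.extend(find_all(item, tag))
    return matches


doc = to_blocks(ET.fromstring('<order id="7"><item>pen</item><item>ink</item></order>'))

# Positional access: "the text of the second child of the root".
second_item_text = doc[2][1][2][0]

# A helper has to be written by hand to get anything like a DOM query.
item_texts = [block[2][0] for block in find_all(doc, "item")]
```

Every access is by position, so the code has to know the shape of the document
in advance, which is where the tedium comes from.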

I've asked the REBOL folks whether they are considering an add-on or another
flavour of REBOL that exposes either a real SAX-style callback interface or
a real DOM to the scripter.  They have said that they are aware of the
requirement, and do plan to build it...alas I expect they have *a lot* of
other work on their plate.

Gavin.

> -----Original Message-----
> From: Simon St.Laurent [mailto:simonstl@simonstl.com]
> Sent: Friday, December 17, 1999 1:18 PM
> To: XML-Dev Mailing list
> Subject: REBOL and XML
> 
> 
> Is anyone doing work using REBOL[1] with XML?  The 
> 'everything is data'
> approach of REBOL seems both like a good fit and perhaps a 
> conflict with
> the XML approach.  I'm just getting started with this, but if 
> anyone has
> opinions or stories, I'd love to hear them.
> 
> [1] - http://www.rebol.com  (and yes, it's pronounced like rebel)
> 
> Simon St.Laurent
> XML: A Primer, 2nd Ed.
> Building XML Applications
> Inside XML DTDs: Scientific and Technical
> Sharing Bandwidth / Cookies
> http://www.simonstl.com
> 
> 



From DuCharmR at moodys.com  Fri Dec 17 23:09:17 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:18:37 2004
Subject: dtddiff of new schema proposal 
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F27828F@MDYNYCMSX1>

Below is the result of running Earl Hood's dtddiff script to compare the
DTDs from appendix B of the November and December drafts of the W3C Schema
Proposal. There are obviously a lot more differences than there were between
the September and November drafts. Looking through it is still a lot quicker
than reading the entire new draft. It will look better displayed in a
monospaced font than in a proportionally spaced font.
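
The first two sections of such a report amount to a set difference over the
declared names. The sketch below only illustrates that idea; it is not how
dtddiff itself is implemented. The helper name namesOnlyIn is made up here,
and the two arrays are small samples taken from this message's listing, not
the full DTDs.

```javascript
// Hypothetical helper: names present in `a` but absent from `b`, sorted.
function namesOnlyIn(a, b) {
  const other = new Set(b);
  return a.filter((name) => !other.has(name)).sort();
}

// Small samples drawn from the listing in this message, not the full DTDs.
const novemberNames = ["archetype", "attrgroup", "datatype", "element", "group"];
const decemberNames = ["type", "attributegroup", "datatype", "element", "group", "any"];

// Names that disappeared between drafts, and names that are new.
const removedNames = namesOnlyIn(novemberNames, decemberNames);
const addedNames = namesOnlyIn(decemberNames, novemberNames);

console.log("Old/removed: " + removedNames.join(", "));
console.log("New: " + addedNames.join(", "));
```

Names that appear in both drafts (datatype, element, group in this sample)
drop out of both lists; those are the ones whose content rules would then be
compared.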

Bob DuCharme          www.snee.com/bob           <bob@  
snee.com>  "The elements be kind to thee, and make thy
spirits all of comfort!" Anthony and Cleopatra, III ii

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

    ----------------------------------------------------------------------
                 New Elements/Attributes (xmlschema12-99.dtd)
    ----------------------------------------------------------------------
	<annotation>                     <any>
	<any maxoccurs>                  <any minoccurs>
	<any namespace>                  <anyattribute>
	<anyattribute namespace>         <appinfo>
	<appinfo source>                 <attributegroup>
	<attributegroup name>            <attributegroup ref>
	<datatype abstract>              <datatype final>
	<datatype source>                <element abstract>
	<element equivclass>             <element exact>
	<element final>                  <element nullable>
	<encoding value>                 <enumeration value>
	<field>                          <import namespace>
	<import schemalocation>          <include schemalocation>
	<info>                           <info source>
	<info xml:lang>                  <key>
	<key name>                       <keyref>
	<keyref name>                    <keyref refer>
	<length value>                   <maxexclusive value>
	<maxinclusive value>             <maxlength value>
	<minexclusive value>             <mininclusive value>
	<minlength>                      <minlength value>
	<pattern value>                  <period value>
	<precision value>                <restrictions>
	<scale value>                    <schema exactdefault>
	<schema finaldefault>            <schema targetnamespace>
	<selector>                       <sic>
	<type>                           <type abstract>
	<type content>                   <type derivedby>
	<type exact>                     <type final>
	<type name>                      <type source>
	<unique>                         <unique name>
    ----------------------------------------------------------------------
             Old/removed Elements/Attributes (xmlschema11-99.dtd)
    ----------------------------------------------------------------------
	<archetype>                      <archetype content>
	<archetype default>              <archetype fixed>
	<archetype model>                <archetype name>
	<archetype order>                <archetype schemaabbrev>
	<archetype schemaname>           <archetype type>
	<attrgroup>                      <attrgroup export>
	<attrgroup name>                 <attrgroupref>
	<attrgroupref name>              <attrgroupref schemaabbrev>
	<attrgroupref schemaname>        <attribute schemaabbrev>
	<attribute schemaname>           <basetype>
	<basetype name>                  <basetype schemaabbrev>
	<basetype schemaname>            <comment>
	<component>                      <component name>
	<component type>                 <datatype export>
	<datatypequal>                   <element archref>
	<element export>                 <element schemaabbrev>
	<element schemaname>             <export>
	<export archetypes>              <export attrgroups>
	<export datatypes>               <export elements>
	<export entities>                <export groups>
	<export notations>               <externalentity>
	<externalentity export>          <externalentity name>
	<externalentity notation>        <externalentity public>
	<externalentity system>          <group collection>
	<group export>                   <group schemaabbrev>
	<group schemaname>               <import archetypes>
	<import attrgroups>              <import datatypes>
	<import elements>                <import entities>
	<import groups>                  <import notations>
	<import schemaabbrev>            <import schemaname>
	<include archetypes>             <include attrgroups>
	<include datatypes>              <include elements>
	<include entities>               <include groups>
	<include notations>              <include schemaname>
	<lexical>                        <literal>
	<maxabsolutevalue>               <minabsolutevalue>
	<notation export>                <refines>
	<refines name>                   <refines schemaabbrev>
	<refines schemaname>             <schema model>
	<schema targetns>                <textentity>
	<textentity export>              <textentity name>
	<unparsedentity>                 <unparsedentity export>
	<unparsedentity name>            <unparsedentity notation>
	<unparsedentity public>          <unparsedentity system>
    ----------------------------------------------------------------------
                           Content Rule Differences
    ----------------------------------------------------------------------
         ------------------------------------------------------------
	                         <ATTRIBUTE>

  << old content rule <<
  ((mininclusive|minexclusive)|
   (maxinclusive|maxexclusive)|
   (MAXABSOLUTEVALUE,MINABSOLUTEVALUE)?|precision|scale|pattern|
   enumeration|length|maxlength|encoding|period)*

  >> new content rule >>
  ((annotation)?,
   (datatype)?)

         ------------------------------------------------------------
	                          <DATATYPE>

  << old content rule <<
  (BASETYPE,
   ((mininclusive|minexclusive)|
    (maxinclusive|maxexclusive)|
    (MAXABSOLUTEVALUE,MINABSOLUTEVALUE)?|precision|scale|pattern|
    enumeration|length|maxlength|encoding|period)*)

  >> new content rule >>
  ((annotation)?,
   ((mininclusive|minexclusive)|
    (maxinclusive|maxexclusive)|precision|scale|pattern|enumeration|
    length|maxlength|minlength|encoding|period)*)

         ------------------------------------------------------------
	                          <ELEMENT>

  << old content rule <<
  ((ARCHETYPE|datatype)?)

  >> new content rule >>
  ((annotation)?,
   (type|datatype)?,
   (unique|key|keyref)*)

         ------------------------------------------------------------
	                          <ENCODING>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                        <ENUMERATION>

  << old content rule <<
  (LITERAL)+

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                           <GROUP>

  << old content rule <<
  (element|group)*

  >> new content rule >>
  ((annotation)?,
   (element|group|any)*)

         ------------------------------------------------------------
	                           <IMPORT>

  << old content rule <<
  (COMPONENT*)

  >> new content rule >>
  EMPTY

         ------------------------------------------------------------
	                          <INCLUDE>

  << old content rule <<
  (COMPONENT*)

  >> new content rule >>
  EMPTY

         ------------------------------------------------------------
	                           <LENGTH>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                        <MAXEXCLUSIVE>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                        <MAXINCLUSIVE>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                         <MAXLENGTH>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                        <MINEXCLUSIVE>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                        <MININCLUSIVE>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                          <PATTERN>

  << old content rule <<
  (LEXICAL)+

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                           <PERIOD>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                         <PRECISION>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                           <SCALE>

  << old content rule <<
  (#PCDATA)

  >> new content rule >>
  (annotation)?

         ------------------------------------------------------------
	                           <SCHEMA>

  << old content rule <<
  ((import*,include*,EXPORT?,
    (COMMENT|datatype|ARCHETYPE|element|ATTRGROUP|group|notation|
     TEXTENTITY|EXTERNALENTITY|UNPARSEDENTITY)*))

  >> new content rule >>
  ((include|import|annotation)*,
   (datatype|type|element|attributegroup|group|notation),
   (annotation|datatype|type|element|attributegroup|group|notation|unique|
    key|keyref)*)



From ssdhanoa at us.ibm.com  Sat Dec 18 01:12:12 1999
From: ssdhanoa at us.ibm.com (ssdhanoa@us.ibm.com)
Date: Mon Jun  7 17:18:37 2004
Subject: XML Parser for Java Update
Message-ID: <8725684B.000586F6.00@d53mta02h.boulder.ibm.com>


New XML Tutorial from developerWorks
Doug Tidwell of IBM has just released another excellent XML Tutorial
on "Transforming XML Documents".
http://www.ibm.com/developer/xml/?open&loc=138,t=g,p=xml000


Updated Technology from alphaWorks

XML Parser for Java
A validating XML parser written in 100% pure Java.

Update / New Features
XML4J 3.0.0EA3 is based on the Apache Xerces XML Parser. New features
include DOM Level 2, SAX2 (alpha), and parts of W3C Schema.
Download
Http://www.alphaworks.ibm.com/tech/xml4j?open&l=xml-dev,t=xm4j1


New XML Tools from alphaWorks

Visual XML Tools Package
The XML Tools Package is an early technology release for providing XML
tooling in the Application Framework for e-Business.
Download
http://www.alphaworks.ibm.com/tech/visualxmltools?open&l=xml-dev,t=xmtp1




From rev-bob at gotc.com  Sat Dec 18 01:23:20 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces [long]
Message-ID: <199912172023341.SM00700@Unknown.>

Reinserting the original question for reference purposes....

> > > Why is XML different? Is it just that we come from the SGML 
> > > background, where we consider structural validation to be part of 
> > > a document rather than a process applied to it, or is there some 
> > > kind of a fundamental difference between naming code and naming 
> > > document nodes that no one has articulated yet? 
> >
> > Just to take a stab in the dark here, but wouldn't this fundamental difference be that
> > Perl and Java are almost unilaterally self-contained parcels (hence, it doesn't matter
> > what the package is named, because you're writing everything that deals with it
> > anyway), but XML documents are designed for interchange - where the names don't
> > just have to make sense to you, but also to an unknown client?
> 
> This is not true.  We combine Java and Perl from a variety of sources and refer to
> these classes from other packages from other sources -- significantly more diverse than
> our current XML declarations in terms of one set of classes referring to another.

Okay, granted - I don't pretend to be a Java/Perl guru (hence, "stab in the dark"), so 
perhaps I misunderstood exactly what is meant by "naming code" in this context.  Now, 
here's the key point I'm trying to get a handle on - what level of code are we talking 
about?

Currently, I see the following levels corresponding to each other between J/P and XML:

1. Content == content ==  non-code.  Irrelevant for this discussion; no names involved.
2. Program-specific objects (private class?) == local DTD fragments == stuff that 
pertains to this file and only this file.  Names don't matter, since all the definitions/code 
required to deal with them are sent in the file itself.  (This seems from where I sit to be 
common in programs but rare in documents; all programs need to have a few routines 
that are unique to them, but very few *ML documents need to have tags which are 
defined for only that document.  This is part of the distinction I was drawing above.)
3. Public class == Public DTD == definitions for shared objects.  Names matter; other 
programs have to know how to access/handle/deal with these.
4. Inside class (pub or private) == Inside DTD (pub or private) == internal code for the 
definitions.  Names do not matter; the very purpose of a class/DTD is to let people deal 
with these objects without having to know about the guts of those manipulation methods 
- we don't need to know the protocols for manipulating magnetic particles on a videotape 
to hit Record on a VCR, as we trust the VCR (class/DTD definition/etc.) to handle that 
stuff for us.  Those protocols are only important to the author, so he can refer to them as 
Clyde and Billy if he so desires.  The key point here is "unexposed" code - you don't 
care if I use the variable I, R, or GatesIsEvil as a loop iterator in a function, because 
you're just calling the function on a black-box level.  As long as you feed the box the 
right inputs and it spits out the right outputs, you don't care what goes on in the box - 
and that's why you have the box in the first place.

My analysis is as follows, and please tell me if I'm wrong.  With XML, you have very 
little private code in terms of document elements - after all, the whole idea is one of 
interchange, which means sharing, which means some level of publicness, which in turn 
requires some way to resolve the public parts that appear in a document.  (Cascading 
style sheets come to mind as a framework model - look on the client's system for the 
highest-priority definitions, then look at the document and where it points.  Just a 
tangent.)  Your private code is going to be in your software, in how you handle 
documents that come in from wherever.  OTOH, with programs, you actually have quite 
a bit of private code - even if that private code is built with public-class bricks, the 
building itself is private.  Responsibility demands that, if you're going to use public-class 
stuff, you have to bundle it with the program in some way - but, if I get the classpath 
concept right, the client can say "oh, I already have that module".

To translate that last bit a little more clearly, I am saying that programs either need to be 
sent as complete entities - with all the public modules required to handle all the public 
classes used in the program included with the distribution - or there needs to be some 
way for the client to obtain any public modules that were not sent.  From what I can tell, 
C++ takes the former approach; if you reference a standard class, the tools required to 
manipulate that class are built into the object file, and thus the distinction between public 
and private classes vanishes where the end user is concerned.  If Java and Perl do things 
differently in this regard, then I would expect there to be some way to take a reference to 
an unknown yet defined-public class, and retrieve the missing class definitions from 
somewhere.

Isn't that what namespaces are supposed to be for in XML?

As I understand the XML namespace concept, the namespace is included in a document 
to tell the client software where to get instructions on how to handle some set of 
elements.  If you want to use HTML elements, you give directions to the HTML 
definitions.  If you want to stick some MathML in there, tell the client where the 
MathML definitions are.  In short, assume the client software doesn't know what ANY 
elements mean and thus provide namespaces that cover everything you use...just as a 
responsible programmer includes all the libraries that his program needs.  However, just 
as a Java VM can apparently say "oh, I have that class already" and default to a local 
version, the key to meaningful XML namespaces seems to be giving the XML client a 
way to say "oh, I need to transform that element into *this*".
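
The "use my local copy if I have one, otherwise go and get it" behaviour
described above can be sketched roughly as follows. This is only an
illustration of the idea in this message: the Namespaces in XML
recommendation does not define any such lookup, and the function name, the
example.org URIs, and the fallback source are all assumptions made up for
the example.

```javascript
// Hypothetical resolver: prefer a local definition, else ask a fallback source.
function resolveDefinitions(namespaceUri, localDefinitions, fetchFallback) {
  if (localDefinitions.has(namespaceUri)) {
    return { source: "local", definitions: localDefinitions.get(namespaceUri) };
  }
  const fetched = fetchFallback(namespaceUri);
  if (fetched === undefined) {
    // Nothing local and nothing retrievable: the name stays meaningless.
    throw new Error("Cannot resolve namespace " + namespaceUri);
  }
  return { source: "fetched", definitions: fetched };
}

// Made-up data standing in for a client's installed definitions.
const localDefinitions = new Map([
  ["http://example.org/ns/html", "local HTML handling rules"],
]);

// Made-up stand-in for "go and retrieve the missing definitions".
function fetchFallback(uri) {
  return uri === "http://example.org/ns/math" ? "fetched math handling rules" : undefined;
}

const htmlResult = resolveDefinitions("http://example.org/ns/html", localDefinitions, fetchFallback);
const mathResult = resolveDefinitions("http://example.org/ns/math", localDefinitions, fetchFallback);
console.log(htmlResult.source + ", " + mathResult.source);
```

A namespace that is neither installed locally nor retrievable makes the
function throw, which is the failure case this message is complaining about.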

I think that's right, anyway.  (Gimme a break; I've had a long day.)

Markup languages are, when you get down to brass tacks, all about the facilitation of a 
transformation of data.  (Okay, so technically ALL computer tasks are merely data 
transformation in some form, but that's a tangent.)  Transformation requires input, 
instructions, and output.  Well-formed XML gives us a coherent form of input, but that 
means nothing without meaningful instructions...and those instructions have to come 
from somewhere.  That's where the DTDs, schemas, and namespaces come in - making 
the pretty document mean something to a client.  (And yes, I'm being deliberately 
ambiguous with the word "client".  Think about it.)

With that in mind, consider my other comments:

> > In other words, my only concern when naming a function or a class in a program is
> > that I need to know what it is; I can name a variable "Fred" or a 50-char string class
> > "Bubba" if I want to, and it doesn't matter - because nobody else needs to
> > understand what those names mean.

I'm talking about private/local code here.  If I write my own funky sort routine, nobody 
else needs to know what I call my variables.  If I make the routine private and keep it in 
my own software, nobody else even needs to know what its name is.  If I make the 
routine public, I need to document the calling protocols, the inputs and the outputs - but 
the interior code is still "mine" to name as I please.

> >  However, if I'm writing a document that I'm going to send somewhere else for Joe
> > to deal with, I'd better use names that Joe can understand and easily map.

With XML, this mapping can be an XSLT transformation or something else that Joe can 
handle - but I cannot simply leave names undefined.  If I use something funky, I have to 
tell Joe what to do with it.

> Packages are heavily reused and refer to the public declarations of other packages.

Great.  I like code reuse.

> In Java, methods only to be used within a package are not given scope beyond that
> package and can be renamed as desired.  But software packages would be quite
> useless without a significant number of public declarations.

In other words, private code is yours to deal with as you please, but for certain 
functions, you shouldn't reinvent the wheel every time.  Fine; I accept that and even 
encourage it.

> For example, take the W3C DOM Java bindings, which consist almost purely of public
> declarations.  It declares such common public names for use as "Document",
> "Comment", "Attr". There are numerous conflicts between these compiled interface
> class names and classes in the applications I use them in.  Without package name
> qualification of the classes, the situation would be quite difficult not only distinguishing
> between ambiguous names, but also just trying to keep track of which standard each
> class belonged to.

In other words, you need a way to take a given element/object and find out for certain 
exactly what set of instructions you need to follow when processing it.  Sounds awfully 
similar to XML namespaces from here.  Context is critical, yet we can only assume a 
minimal context (the declared XML spec/the defined Java syntax) - everything else has 
to be unambiguously identified in some way.
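
As a rough illustration of why qualification matters, the sketch below keeps
two declarations that share the short name "Document" apart by keying them
on a qualifier plus the name. org.w3c.dom is the package the W3C DOM Java
bindings use; com.example.app, the function names, and the description
strings are assumptions made up for the example.

```javascript
// Registry keyed by qualified name, so equal short names cannot collide.
const registry = new Map();

function declare(qualifier, localName, description) {
  registry.set(qualifier + "." + localName, description);
}

function lookup(qualifier, localName) {
  const key = qualifier + "." + localName;
  if (!registry.has(key)) {
    throw new Error("No declaration found for " + key);
  }
  return registry.get(key);
}

// Both entries use the short name "Document"; only the qualifier differs.
declare("org.w3c.dom", "Document", "W3C DOM document interface");
declare("com.example.app", "Document", "hypothetical application class");

console.log(lookup("org.w3c.dom", "Document"));
console.log(lookup("com.example.app", "Document"));
```

Keyed on the short name alone, the second declare() call would silently
overwrite the first.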
 
> IMO, this is exactly what happens with XML, mixing different standard elements and
> architectural forms from different specs in a single DTD or content model to produce a
> desired result.

I'm not so sure a single DTD is the answer, but I may be reading you a bit too closely.  
Rather, I'd say references to all required name sources are needed - why copy someone 
else's definition into your single DTD when you can simply say "for this element, go by 
this other DTD"?

> > The difference you're looking for is one of scope.  Internal names don't matter to the
> > outside world, because nobody outside has to do anything with them...but external
> > names MUST be defined in some way, else nobody outside CAN do anything with
> > them.  Am I expressing it clearly?
> 
> No.  Java packages typically have lots of external names that matter to the outside.

Rupture alert!  As I read you, you're talking about "external names that matter to the 
outside" in the contexts of (a) public classes that will be referred to by other software, 
and/or (b) calls to such public classes.  Yes, both of those DO matter to the outside, and 
I say as much above ("external names MUST be defined in some way") - but neither of 
them is an "internal name" as I discussed, precisely because I mean by "internal names" 
things which are not exposed to third parties; they are completely internal to your code.  
You're talking about one thing, I'm talking about another.  External names have to be 
referenced in a meaningful way, otherwise the program falls apart - "I'm supposed to 
make an object named Bubba of class public-hick - but what's "hick" mean?  It's not 
defined anywhere, and I don't know where else to look!  (crash)".  This is wholly 
different from "I'm supposed to make an object named Fred of class public-foo - okay, I 
know what foo is, so I can do that."  The class names hick and foo matter because they 
are external references.  The variable names Bubba and Fred only matter if they are 
being exposed for specific reuse.  You're talking about hick and foo; I'm talking about 
Bubba and Fred.  (I'm also talking about both sides of "object Jake of class private-
schmuck".)


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/





From martind at netfolder.com  Sat Dec 18 01:30:25 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:37 2004
Subject: Very preliminary doc on semantic name spaces
Message-ID: <NBBBJPGDLPIHJGEHAKBACEPNEJAA.martind@netfolder.com>

Hello,


Sorry, Peter reminded me not to include the doc with email. So I posted it
at the following URL:

ftp://www.netfolder.com/NameSpaceTopicMap.zip

The document itself, contained in the zip, is in PDF format.

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From tpassin at idsonline.com  Sat Dec 18 02:32:06 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:37 2004
Subject: REBOL and XML
References: <199912171817.NAA21438@hesketh.net>
Message-ID: <004e01bf4900$b32160c0$1efbb1cd@tomshp>


----- Original Message -----
From: Simon St.Laurent <simonstl@simonstl.com>


> Is anyone doing work using REBOL[1] with XML?  The 'everything is data'
> approach of REBOL seems both like a good fit and perhaps a conflict with
> the XML approach.  I'm just getting started with this, but if anyone has
> opinions or stories, I'd love to hear them.
>
> [1] - http://www.rebol.com  (and yes, it's pronounced like rebel)
>

I've copied this from a post I sent to the Python XML-SIG.  It's a small
example of getting an html file and parsing it lightly to extract some
specific data.

<extract>
Actually, REBOL looks very interesting.  There isn't enough documentation as
yet so the learning curve is on the steep side.  At the risk of being
off-topic (and off-Python), I'm including a REBOL script - my only one - that
retrieves a URL, and extracts a particular section from the html.  The
section REBOL[...] is essentially a comment.

-----------------------------------------
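REBOL [
 Title: "Zone Forecast Extractor"
 File: %zone.r
 Purpose: {Extract the Virginia Zone Forecast and display it.}
]

; Fetch the page; read returns its contents as a string.
zone: read http://iwin.nws.noaa.gov/iwin/va/zone.html

print ""
; The braces below make an unused string, which disables the title extraction.
{parse zone [thru <title> copy result to </title>]
print result}

print "Current Fairfax County Zone Forecast"

; Jump to the first "Fairfax", then copy the text between "..." and "$$".
fairfax: find zone "Fairfax"
parse fairfax [thru "..." copy forecast to "$$"]

print forecast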

</extract>

Notice how easy it is to acquire the html file - it is regarded as a
first-class data type that has content.  No includes or imports needed, all
this machinery is built-in.  Also notice how easy it was to extract the
content of a tag.  This is about all I've done with it, though.  It seems
promising.
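
For readers who don't know REBOL, the parse rule [thru A copy X to B] reads
roughly as "skip past A, then copy everything up to, but not including, B
into X". A minimal JavaScript sketch of the same extraction follows; the
helper name copyBetween and the sample page text are made up for
illustration, since the real page content is not shown in this message.

```javascript
// Return the text after the first `startMarker` and before the next
// `endMarker`, or null when either marker is missing.
function copyBetween(text, startMarker, endMarker) {
  const start = text.indexOf(startMarker);
  if (start < 0) return null;
  const from = start + startMarker.length;
  const end = text.indexOf(endMarker, from);
  if (end < 0) return null;
  return text.slice(from, end);
}

// Made-up sample standing in for the downloaded page.
const samplePage =
  "<title>Zone Forecast</title> Fairfax... Sunny and cold. $$ Loudoun... Cloudy. $$";

// Equivalent of: parse zone [thru <title> copy result to </title>]
const title = copyBetween(samplePage, "<title>", "</title>");

// Equivalent of: find zone "Fairfax", then parse from that position.
const fairfaxAt = samplePage.indexOf("Fairfax");
const forecast = copyBetween(samplePage.slice(fairfaxAt), "...", "$$");

console.log(title);
console.log(forecast.trim());
```

Unlike the REBOL version, this sketch does not fetch anything; retrieving
the page would be a separate step.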

Tom Passin




From rev-bob at gotc.com  Sat Dec 18 02:51:15 1999
From: rev-bob at gotc.com (rev-bob@gotc.com)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces
Message-ID: <19991217211349.SM00700@Unknown.>

> > Just to take a stab in the dark here, but wouldn't this fundamental
> > difference be that Perl and Java are almost unilaterally
> > self-contained parcels (hence, it doesn't matter what the package is
> > named, because you're writing everything that deals with it anyway),
> > but XML documents are designed for interchange - where the names
> > don't just have to make sense to you, but also to an unknown client?
> 
> I'd considered that possibility, but it's not true.  Consider the
> following, very typical case in the Java world:
> 
> 1. I write a compiled Java app that references org.xml.sax.Parser.
> 2. I send you the app.
> 3. You run the app, and the JVM complains that org.xml.sax.Parser is
>    not found.

Not to offend, but I consider that hideously sloppy.  If you're going to use a module, you 
have to make sure the client has that module.  Otherwise, to be blunt, you're not doing 
your job as a programmer.  You wouldn't define a variable as being of type Jimmy 
without defining what "Jimmy" means (either by explicit code or through reference); why 
would you try to use method Frank on a document without defining where method 
"Frank" can be found?

If you're going to use oxsP in your application, you need to bundle it.  Say, here we go 
with namespaces in XML again - "if you're going to use <foo/>, you need to indicate 
what <foo/> means".

> I find it very hard to see any difference between that case and the
> following one:
> 
> 1. I compile a data record in XML format that includes an element
>    named {http://www.foo.com/ns/}alpha.
> 2. I send you the data record.
> 3. You process the record, and your processing code complains that
>    {http://www.foo.com/ns/}alpha is not recognized.

Agreed; the cases seem precisely analogous - and IMO, both are utterly inexcusable.
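
The second case can be sketched as follows, using the {uri}local notation
from the quoted example. The handler table, the element name "beta", and
the function names are assumptions made up for illustration; the point is
only that a receiver with no entry for a name can do nothing except
complain.

```javascript
// Split "{uri}local" into its parts; names without braces get an empty URI.
function splitExpandedName(name) {
  const match = /^\{([^}]*)\}(.+)$/.exec(name);
  return match ? { uri: match[1], local: match[2] } : { uri: "", local: name };
}

// Hypothetical table of the names this receiver knows how to process.
const knownHandlers = new Map([
  ["{http://www.foo.com/ns/}beta", (text) => text.toUpperCase()],
]);

function processElement(name, text) {
  const handler = knownHandlers.get(name);
  if (!handler) {
    const parts = splitExpandedName(name);
    throw new Error(parts.local + " in namespace " + parts.uri + " is not recognized");
  }
  return handler(text);
}

console.log(processElement("{http://www.foo.com/ns/}beta", "known element"));
try {
  processElement("{http://www.foo.com/ns/}alpha", "unknown element");
} catch (err) {
  console.log(err.message);
}
```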

> In neither case is there any automatic resolution mechanism to help
> you -- the JVM won't go out on the Web and find org.xml.sax.Parser for 
> you if it's not already on your classpath.  I'm not disputing that
> something like this could be helpful (both for Java and Namespaces),
> but I am disputing that Namespaces (or Java packages) are
> fundamentally incomplete without it.

Here's where I don't get you.  Here we have a big, gaping hole in usability - and you're 
saying that the hole doesn't need to be plugged?!?  I un-comprehend that point of view to 
such a staggering degree that I can't even frame the objection; it's like going outside in 
the middle of a thunderstorm and being surprised because water's falling from the sky.  
The words "blindingly obvious" come to mind - you provide what is needed to run your 
program (or parse your document), because you cannot count on someone else to 
magically know what (classes|elements) you used and take the initiative to download the 
definitions manually.  Providing the required modules is YOUR job as author, not the 
USER'S job!

In case there is any doubt, let me spell my position out in no uncertain terms: 
Namespaces are utterly pointless - even counterproductive! - if there is no standard way 
to resolve them.  This goes beyond "helpful" into the core realm of "essential".

> > In other words, my only concern when naming a function or a class in
> > a program is that I need to know what it is; I can name a variable
> > "Fred" or a 50-char string class "Bubba" if I want to, and it
> > doesn't matter - because nobody else needs to understand what those
> > names mean.  
> 
> But they do need to know -- see my example above.  The difference is
> that once you have the class files for org.xml.sax.Parser, you're
> fine, while it's far from obvious what you *could* retrieve that would 
> help with {http://www.foo.com/ns/}alpha.

Uh-uh.  You misunderstand (and being the second person to do so, I blame myself) - I'm 
saying that if I write, say, the method foo.bar(x), I can name the variables inside the 
bar() handler however I please.  I can call the lone parameter Bubba, manipulate it using 
variables Joe, Lily, Susan, and Ethel, and return the value of the resulting Martha - and it 
doesn't matter.  I have a JavaScript example handy; observe:

function parse(z) {
  // var keeps x and isthere local to parse(); without it they become globals
  var x = 0, isthere = false;
  // stop at the first entry of ua_Arr that occurs in z
  while ((x <= ua_max) && (!isthere))
  { isthere = (z.indexOf(ua_Arr[x],0) >= 0); x++; }
  return(!isthere);
}

I could just as easily rewrite this as the following:

function parse(Martha) {
  var Wilma = 0, Cathy = false;
  while ((Wilma <= ua_max) && (!Cathy))
  { Cathy = (Martha.indexOf(ua_Arr[Wilma],0) >= 0); Wilma++; }
  return(!Cathy);
}

That function will work exactly like the first one - no difference at all.  Why?  Because 
Martha, Wilma, and Cathy are all purely INTERNAL.  Nothing outside this function 
sees them.  At all.  Period.  Conversely, the parse(), ua_max, and ua_Arr names are 
external, and so they can't be changed.  Your oxsP class reference is analogous to 
discussing parse(), but I'm talking about Martha.

> > However, if I'm writing a document that I'm going to send somewhere
> > else for Joe to deal with, I'd better use names that Joe can
> > understand and easily map.  
> 
> That Joe can understand, or that his software can understand?  There's 
> a significant difference there.

Who says "Joe" identifies a person?  Come on; I used human names for software 
references all through that message - why do you assume that this use of Joe is any 
different from my prior uses of Bubba and Frank?  Think outside the box for a minute.

> We could set up some mechanism so that Joe's software could automatically retrieve a
> schema or schema fragment that covers {http://www.foo.com/ns/}alpha, but that's not
> much help -- Joe's software could tell that it's valid or invalid (whoopee!), but it still
> couldn't tell what it means.

Exactly what do you mean by "couldn't tell what it means"?  Do you mean that the 
retrieved information is semantic and hence has no bearing on "How do I handle 
foo:alpha?", or do you mean that the retrieved information is relevant to the processing, 
but simply insufficient to handle the task?

Either way, the goal is not being addressed.  The whole idea behind XML is that you can 
write your own tags.  A natural followup to this is that you'll want to share tags with 
other people.  From there, it's a short step to the realization that sharing by reference is 
better than sharing by cut-and-paste.  And yet, at this point, there seems to be a lot of 
hand-wringing over how to handle sharing by reference, or even if there's a need to 
handle it at all.

Am I the only one who sees "Duh!" written all over this issue?  We're going to have 
numerous XML tagsets out there - fact.  (Hello - "eXtensible" ring any bells?)  The 
people developing documents are going to want to share tags - fact.  Therefore, there 
needs to be a way to identify where a tag comes from and where an engine needs to look 
for instructions on how to handle it - "Duh!"  That requires a standard form of resource 
location and retrieval (URI, XPointer, whatever).  Some uses of this technology will 
require a local definition to take the helm - thus, we need a way to handle that.

Is this really such a contentious and complicated issue?  The only reason to have XML is 
to facilitate data exchange - and if some form of transformation wasn't involved, people 
could simply send each other MS Office files all day long.  "Should we handle this?" isn't 
the question.  The question is "How do we handle this?" - because we've got to do it.  
There's simply no alternative...other than shutting XML down and going home, of 
course.


 Rev. Robert L. Hood  | http://rev-bob.gotc.com/
  Get Off The Cross!  | http://www.gotc.com/



xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From tmmet at hotmail.com  Sat Dec 18 03:05:36 1999
From: tmmet at hotmail.com (tmmet tvp)
Date: Mon Jun  7 17:18:37 2004
Subject: Urgent: XSL/XML
Message-ID: <19991218030502.28136.qmail@hotmail.com>

Hi,
Can anyone help me out? I want to create a tree view from my XML file
using XSL. Is it possible? If so, can anyone mail me how this can be done? I'm
new to this area, and as this is an urgent requirement, I'm mailing the list.
Any ideas will be greatly helpful to me.
Thanks in advance.




From tmmet at hotmail.com  Sat Dec 18 03:14:45 1999
From: tmmet at hotmail.com (tmmet tvp)
Date: Mon Jun  7 17:18:37 2004
Subject: XSL/XML
Message-ID: <19991218031411.42311.qmail@hotmail.com>

Hi,
I want to create a tree view from my XML file using XSL. How can this be
done? Any ideas/suggestions will be greatly helpful to me. I'm totally new
to this area. Thanks in advance.




From david at megginson.com  Sat Dec 18 12:41:59 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:37 2004
Subject: Musing over Namespaces
In-Reply-To: rev-bob@gotc.com's message of "17 Dec 99 21:17:06 -0500"
References: <19991217211349.SM00700@Unknown.>
Message-ID: <m3ogboo200.fsf@localhost.localdomain>

rev-bob@gotc.com writes:

> > I'd considered that possibility, but it's not true.  Consider the
> > following, very typical case in the Java world:
> > 
> > 1. I write a compiled Java app that references org.xml.sax.Parser.
> > 2. I send you the app.
> > 3. You run the app, and the JVM complains that org.xml.sax.Parser is
> >    not found.
> 
> Not to offend, but I consider that hideously sloppy.  If you're
> going to use a module, you have to make sure the client has that
> module.  Otherwise, to be blunt, you're not doing your job as a
> programmer.

That is exactly my point.

> You wouldn't define a variable as being of type Jimmy without
> defining what "Jimmy" means (either by explicit code or through
> reference); why would you try to use method Frank on a document
> without defining where method "Frank" can be found?

But Java and Perl have global as well as local names, and they both
have a high degree of code reuse.  There is (to my knowledge) only one
org.xml.sax Java package in the world -- people who reference
org.xml.sax are not defining it the way that you define your local
"jimmy" variable, and in an ideal world, there should be no reason to
bundle sax.jar with every app that happens to use it.

> If you're going to use oxsP in your application, you need to bundle
> it.  Say, here we go with namespaces in XML again - "if you're going
> to use <foo/>, you need to indicate what <foo/> means".

I'd rephrase slightly:

- If you're going to use oxsP in your application, you need to bundle
  it or tell the users where they can find it.

- If you're going to use {http://www.foo.com/ns}foo in your XML
  document, you need to indicate what {http://www.foo.com/ns}foo
  means, tell the users where they can find out, or at least provide
  default rules for dealing with unrecognized elements and attributes.
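
The second rule above can be sketched in code. This is only an illustration, written in JavaScript like the example earlier in the thread: the handler table, the function names, and the "keep the content, ignore the element" default policy are all assumptions made for the sketch, not part of any Namespaces or SAX specification.

```javascript
// Hypothetical handler table keyed by expanded name "{namespace-URI}local-name".
function expandedName(uri, local) {
  return "{" + uri + "}" + local;
}

function makeDispatcher(handlers, defaultRule) {
  return function dispatch(uri, local, content) {
    var key = expandedName(uri, local);
    // Own-property check so inherited keys such as "toString" never match.
    if (Object.prototype.hasOwnProperty.call(handlers, key)) {
      return handlers[key](content);
    }
    // Unrecognized element type: apply the document's default rule.
    return defaultRule(key, content);
  };
}

var handlers = {};
handlers[expandedName("http://www.foo.com/ns", "foo")] = function (content) {
  return "foo handled: " + content;
};

// Assumed default rule: ignore the unknown element but keep its content.
var dispatch = makeDispatcher(handlers, function (key, content) {
  return content;
});

console.log(dispatch("http://www.foo.com/ns", "foo", "hello"));    // recognized name
console.log(dispatch("http://example.org/other", "bar", "hello")); // default rule
```

Nothing here resolves the namespace URI over the network; the URI is used purely as a globally unique key, which is the situation described above.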

Neither of these is The Way It Should Be, but it's interesting that
the problem of global resolution, which some people claim is trivially
easy and should have been solved ages ago for XML Namespaces, has not
yet been solved even for the older, enormous, and lucrative Java and
Perl markets.

[snip]

> In case there is any doubt, let me spell my position out in no
> uncertain terms: Namespaces are utterly pointless - even
> counterproductive! - if there is no standard way to resolve them.
> This goes beyond "helpful" into the core realm of "essential".

An interesting, if contradictory argument.  Why are Java package names
(which are global like XML Namespaces) not also pointless and
counterproductive, then, since there's no standard way to resolve
them?  Above, you argue that it's OK (in fact, essential) to have to
bundle shared Java class libraries with every app that uses them, but
then you state that it's not OK to have to bundle documentation for
shared global XML element or attribute types with every document type
that uses them.

It may be that both statements are true, but I still see no crisp
distinction.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From stele at fxtech.com  Sat Dec 18 13:01:36 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:38 2004
Subject: XML parsing memory overhead concerns
References: <Pine.LNX.4.10.9912161512310.20127-100000@cauchy.clarkevans.com> <385A4785.93FDC75E@fxtech.com>
Message-ID: <385B85EC.4BD33B82@fxtech.com>

> When I get to the handler, I'd like to write this:
> 
> void PointHandler(XML::Element &elem, Point *point)
> {
>         // "pull" a chunk of element data, but it'll automatically
>         // stop when it gets to the end of the element (it stops
>         // when it sees </Point>)
>         char buf[40];
>         elem.GetData(buf, sizeof(buf));
>         sscanf(buf, "%dx%d", &point->x, &point->y);
> }
> 
> For some reason I couldn't figure out how to make this work, because
> with expat I might have hit the element but not necessary gotten to the
> character data handler.

I remembered why I couldn't get this part to work. Suppose (using expat)
I get to the <Point> element. So far I can build the tree down to this
point, so when I see <Point> I look in the handler list for that subtree
and call the callback for the Point element. Now in PointHandler, the
code wants to pull the element data. I can't go back into expat and
parse some more at this point, getting called back with expat's
CharacterData handler, fill my buffer, and fall back out to
PointHandler.

But this brings up a question: does expat buffer the entire character
data itself? If it does, perhaps I could delay the call to PointHandler
until after the expat character data handler has been called. I can save
a copy of that pointer away in my Element list, and only call the user
code when I see the end element tag.

Does this sound workable?

--
Paul Miller - stele@fxtech.com



From Daniel.Veillard at w3.org  Sat Dec 18 14:32:05 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:18:38 2004
Subject: A bit of synergy this morning?
In-Reply-To: <NBBBJPGDLPIHJGEHAKBACEOEEJAA.martind@netfolder.com>
References: <NBBBJPGDLPIHJGEHAKBACEOEEJAA.martind@netfolder.com>
Message-ID: <19991218093145.B22432@w3.org>

  That's a very good point.
  You should definitely forward your suggestion to the RDF interest list!

  The question can be reversed:
   - how well can we map XLink to the RDF model?

There will be an attempt to answer the second question in the next XLink
draft (to be made public real soon).

Daniel

> Question:
> Could an RDF "record" be a link?
> or could an Xlink be an RDF record?
> 
> At first sight this may not seem too significant, but if you think more about
> it, it brings tremendous advantages. So, let's imagine for a second that an
> XLink extended locator were also an RDF "record". The link would then also
> contain meta-information about the linked resource. For instance, suppose I
> have the following expression:
> 
> <specifications xlink:type="extended">
> <rdf:description xlink:type="locator" xlink:href="http://www.w3c.org/xlink">
>   <release_date>12/24/99</release_date>
>   <type>christmas gift</type>
>   <description>W3C would be a santa claus for us poor XML
> developers</description>
> </rdf:description>
> </specifications>
> 
> OK, put more significant information in the resource description :-)) but the
> point here is the following: if an RDF description also recognized the
> XLink attribute for linkage (so that we could replace
> rdf:about="http://www.w3c.org/xlink" with
> xlink:href="http://www.w3c.org/xlink"), then the resource description could
> also be a link, or vice versa.
> 
> The impacts are:
> a) more significant links (links that also include meta-information about
> the linked resource)
> b) Resources descriptions could be used as links (the commutative reasoning)
> c) a browser can display a one to many link as a two level context menu as
> below
> 
>    rdf specifications -----------------------
>                       | W3C documents       |
>                       | Didier's suggestion |
>                       | examples            |------------------------
>                       -----------------------| author: Will johnson |
>                                              | date: 12/24/99       |
>                                              | description: bla..   |
>                                              ------------------------
> The first menu is...a menu, then when a particular locator is highlighted, a
> tool tip kind of window is displayed to provide additional information about
> the link (the meta information about the resource).
> d) probably a lot more that I didn't envision.
> 
> Observation:
> I discovered something while observing the W3C output. It seems that each
> workgroup creates its own workspace... er, sorry, its own namespace and does
> not re-use the work of others (have you found much namespace element
> re-use among the WGs?). For example, it would be beneficial if the RDF WG
> used the XLink workgroup's reference attribute for the resource reference,
> and vice versa: the XLink group could just as well take the rdf:about
> attribute as a resource reference. Anyway, if these groups were to talk to
> each other, or just exercise their synthetic minds, it would become more
> obvious that if they mixed their mind space a bit... er, sorry, their
> namespace, they would provide us with synergistic constructs. Or maybe it
> never occurred to anybody that a resource could be a link or that a link
> could also be a resource description. Oops, did I invent something here, or
> did I discover that workgroups do not talk to each other?
> 
> Cheers
> 
> Didier PH Martin
> ----------------------------------------------
> Email: martind@netfolder.com
> Conferences:
> Web New York (http://www.mfweb.com)
> Book to come soon: XML Pro published by Wrox Press
> Products: http://www.netfolder.com
> 
> 

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



From digitome at iol.ie  Sat Dec 18 14:29:56 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:18:38 2004
Subject: XML parsing memory overhead concerns
Message-ID: <3.0.6.32.19991218141143.009ca2a0@gpo.iol.ie>

Paul,

I would suggest that what you need here is a layer
above Expat that queues up the
"events" generated from Expat's callbacks for consumption
by your application.

Off the top of my head you could have a top level
GetExpatEvent() entry point that works like this:-

1) If there is an Expat event queued up, it returns it.
2) Otherwise, it feeds more data into Expat until
such time as an event appears in the queue and
then returns that event.

Basically, you would create callbacks for Expat
whose function in life is to add the event information
to a queue which you then consume from the
GetExpatEvent() entry point.
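
A minimal sketch of that queueing layer, in JavaScript to match the example earlier in the thread. ToyPushParser is a hypothetical stand-in for Expat that understands only <name>, </name> and text; EventReader and getEvent() are likewise illustrative names, not Expat API.

```javascript
// Toy push parser standing in for Expat: handles only <name>, </name> and text.
function ToyPushParser(onEvent) {
  this.onEvent = onEvent;
  this.pending = "";   // unfinished tag carried over between feed() calls
}
ToyPushParser.prototype.feed = function (chunk) {
  var data = this.pending + chunk;
  this.pending = "";
  var i = 0;
  while (i < data.length) {
    if (data.charAt(i) === "<") {
      var end = data.indexOf(">", i);
      if (end < 0) { this.pending = data.slice(i); return; } // tag split across chunks
      var tag = data.slice(i + 1, end);
      if (tag.charAt(0) === "/") this.onEvent({ type: "end", name: tag.slice(1) });
      else this.onEvent({ type: "start", name: tag });
      i = end + 1;
    } else {
      var next = data.indexOf("<", i);
      if (next < 0) next = data.length;
      this.onEvent({ type: "text", data: data.slice(i, next) });
      i = next;
    }
  }
};

// Pull layer: the callback only enqueues; getEvent() feeds more input
// into the parser whenever the queue runs dry.
function EventReader(chunks) {
  var queue = [];
  this.queue = queue;
  this.chunks = chunks.slice();
  this.parser = new ToyPushParser(function (ev) { queue.push(ev); });
}
EventReader.prototype.getEvent = function () {
  while (this.queue.length === 0 && this.chunks.length > 0) {
    this.parser.feed(this.chunks.shift());
  }
  return this.queue.length > 0 ? this.queue.shift() : null; // null means end of input
};

var reader = new EventReader(["<Poi", "nt>100x", "100</Point>"]);
var ev;
while ((ev = reader.getEvent()) !== null) {
  console.log(ev.type, ev.name || ev.data);
}
```

Note that the character data can arrive as more than one text event when it straddles a chunk boundary, so a consumer that wants the element content as a single unit still has to join the pieces.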


>> When I get to the handler, I'd like to write this:
>> 
>> void PointHandler(XML::Element &elem, Point *point)
>> {
>>         // "pull" a chunk of element data, but it'll automatically
>>         // stop when it gets to the end of the element (it stops
>>         // when it sees </Point>)
>>         char buf[40];
>>         elem.GetData(buf, sizeof(buf));
>>         sscanf(buf, "%dx%d", &point->x, &point->y);
>> }
>> 
>> For some reason I couldn't figure out how to make this work, because
>> with expat I might have hit the element but not necessary gotten to the
>> character data handler.
>
>I remembered why I couldn't get this part to work. Suppose (using expat)
>I get to the <Point> element. So far I can build the tree down to this
>point, so when I see <Point> I look in the handler list for that subtree
>and call the callback for the Point element. Now in PointHandler, the
>code wants to pull the element data. I can't go back into expat and
>parse some more at this point, getting called back with expat's
>CharacterData handler, fill my buffer, and fall back out to
>PointHandler.
>
>But this brings up a question: does expat buffer the entire character
>data itself? If it does, perhaps I could delay the call to PointHandler
>until after the expat character data handler has been called. I can save
>a copy of that pointer away in my Element list, and only call the user
>code when I see the end element tag.
>
>Does this sound workable?
>
>--
>Paul Miller - stele@fxtech.com
>

Sean,




From stele at fxtech.com  Sat Dec 18 14:56:05 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:38 2004
Subject: XML parsing memory overhead concerns
References: <3.0.6.32.19991218141143.009ca2a0@gpo.iol.ie>
Message-ID: <385BA0C8.46C52668@fxtech.com>

> I would suggest that what you need here is a layer
> above Expat that queues up the
> "events" generated from expats callbacks for consumption
> by your application.

I've tried this. The problem comes when you get to character data
between element tags. I'd have to buffer all of the data inside the
event queue. For my needs, this wouldn't be a problem, since my custom
element data is generally pretty small. But for a more general-purpose
solution, an implementation that did not have to copy all of the data
would be preferable. I can't make my design work over expat without
copying all of this data (so I can provide it to my user handlers as a
single unit). In other words, because of expat's (and most other
parsers') push model, I can't possibly implement the pull model I need
for element data, without also buffering this data somewhere. It's
*possible*, but is this desired?

I asked around a few days ago about general memory-usage requirements,
and got a lot of feedback about requiring a low-memory-overhead
solution. My current implementation, with its several limitations,
provides almost zero-memory-overhead, which is a huge advantage in some
situations.

To recap, for those just joining us. I want to be able to handle a
single element all at once, and write code like this inside an element
handler (this is C++):

For this XML fragment:
	<Point>100x100</Point>

void Point::Parse(XML::Element &elem)
{
	// this gets called when the parser sees <Point>
	// I'll ask for the element data, and the parser will
	// pull this to me until it sees </Point>
	char buf[40];
	elem.GetData(buf, sizeof(buf));
	sscanf(buf, "%dx%d", &x, &y);
}

The only way to implement this over expat is to queue up the Element
tags as they are found as you suggest. When the characterDataHandler is
called, it would have to buffer the contents of the character data
(until the ending element tag is found) with that element, storing a
complete copy of the data. When I see the </Point> end element, I can
then call my Point parser and allow it to "pull" the data from the
buffered element data. To implement this, I would be storing THREE copies
of the element data in memory: expat's buffered chunk of the document,
my copy of the data in the element queue, and then the element handler's
copy while it is interpreting the data (buf[40] above). With my
implementation, there is only one "copy" of the data in memory, and
that's the element handler's private buffer (buf[40] above), since the
implementation pulls a character at a time.
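
The buffering approach described above can be sketched as follows. This is a hedged illustration in JavaScript (the language of the example earlier in the thread) rather than C++, and makeBufferingAdapter, its three callback names, and the simulated calls at the bottom are assumptions made for the sketch; they are not expat's API.

```javascript
// Hypothetical adapter: collects character data per open element and
// calls the user handler only when the matching end tag is seen.
function makeBufferingAdapter(userHandlers) {
  var stack = [];   // one text buffer per open element
  return {
    startElement: function (name) {
      stack.push({ name: name, parts: [] });
    },
    characterData: function (text) {
      // This push is the extra copy of the element data discussed above.
      if (stack.length > 0) stack[stack.length - 1].parts.push(text);
    },
    endElement: function (name) {
      var top = stack.pop();
      if (!top) return;                       // stray end tag, nothing buffered
      var handler = userHandlers[name];
      if (handler) handler(top.parts.join(""));
    }
  };
}

var points = [];
var adapter = makeBufferingAdapter({
  Point: function (data) {
    var m = /^(\d+)x(\d+)$/.exec(data);       // roughly the "%dx%d" shape from the sscanf call
    if (m) points.push({ x: Number(m[1]), y: Number(m[2]) });
  }
});

// Simulated parser callbacks; the character data arrives in two pieces.
adapter.startElement("Point");
adapter.characterData("100x");
adapter.characterData("100");
adapter.endElement("Point");
console.log(points);
```

The user handler sees the element content as one unit, at the cost of holding every piece in the adapter until the end tag arrives.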

--
Paul Miller - stele@fxtech.com



From martind at netfolder.com  Sat Dec 18 16:22:42 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:38 2004
Subject: A bit of synergy this morning?
In-Reply-To: <19991218093145.B22432@w3.org>
Message-ID: <NBBBJPGDLPIHJGEHAKBAEEACEKAA.martind@netfolder.com>

Hi Daniel,

Daniel said:
 the question can be reversed:
   - how well can we map Xlink to RDF model ?

There will be an attempt to answer the second question in the next XLink
draft (to be made public real soon),

Didier replies:
I think you're right, XLink should be mapped to RDF. I look forward to reading
the XLink draft to see how this can be done.

I will post the question to the RDF mailing list to see if this can feed RDF
2 to bring us such facilities. Let's cross our fingers that it will ;-).

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences:Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products: http://www.netfolder.com




From clark.evans at manhattanproject.com  Sat Dec 18 17:19:21 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:38 2004
Subject: XML parsing memory overhead concerns (fwd)
Message-ID: <Pine.LNX.4.10.9912180019550.22278-100000@cauchy.clarkevans.com>


Paul,

James posted the answer to your concern a while
back... (it took me a while to grok what he
was saying).   Sean's description is correct,
only he didn't mention "how" you do it with expat.
James describes this below.  You only send "part" 
of the XML stream at any given time, it fires 
callbacks filling up your event buffer, as
Sean describes.  Evidently expat handles
"restarting" mid element..

Clark

---------- Forwarded message ----------
Date: Fri, 17 Dec 1999 14:48:06 +0700
From: James Clark <jjc@jclark.com>
To: Clark C. Evans <clark.evans@manhattanproject.com>
Cc: Paul Miller <stele@fxtech.com>, xml-dev <xml-dev@ic.ac.uk>
Subject: Re: XML parsing memory overhead concerns

"Clark C. Evans" wrote:

> Anyway, given a SAX event source, pushing
> the entire document his way, I don't see
> how a single threaded solution is possible.
> 
> And, from the expat declaration of setElementHandler,
> which requires both a StartElementHandler and an
> EndElementHandler, I assumed that expat works in
> a similar (if not identical) manner.

Expat doesn't work like SAX.  Clark Cooper has written a nice article
explaining expat's API:

  http://www.xml.com/pub/1999/09/expat/index.html

With SAX, the application calls parse once per document; the parser
makes a call on an InputStream to get each chunk of input.  With expat,
the parser doesn't make any calls to get input; rather the application
calls XML_Parse() arbitrarily many times for a single document, each
time passing it another chunk of the input.
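
The difference in who drives the input can be sketched with a toy tag scanner. This is an illustration in JavaScript, not SAX or expat code: makeCore, parseDocument and makePushParser are hypothetical names, and the toy parse() takes only a chunk, whereas expat's real XML_Parse() also takes a length and an is-final flag.

```javascript
// Toy scanner core: reports each complete tag and keeps any partial
// tag between calls, so input may be split at arbitrary points.
function makeCore(onTag) {
  var pending = "";
  return function consume(chunk) {
    pending += chunk;
    var m;
    while ((m = /<([^>]*)>/.exec(pending)) !== null) {
      onTag(m[1]);
      pending = pending.slice(m.index + m[0].length);
    }
  };
}

// SAX-like: one call per document; the parser asks the input source
// for each chunk until the source reports end of input (null).
function parseDocument(readChunk, onTag) {
  var consume = makeCore(onTag);
  var chunk;
  while ((chunk = readChunk()) !== null) consume(chunk);
}

// Expat-like: the application owns the loop and hands over one chunk
// per call, as many times as it likes for a single document.
function makePushParser(onTag) {
  var consume = makeCore(onTag);
  return { parse: function (chunk) { consume(chunk); } };
}

var chunks = ["<Poi", "nt>100x100</Po", "int>"];

var fromPull = [];
var nextChunk = 0;
parseDocument(
  function () { return nextChunk < chunks.length ? chunks[nextChunk++] : null; },
  function (tag) { fromPull.push(tag); }
);

var fromPush = [];
var pushParser = makePushParser(function (tag) { fromPush.push(tag); });
chunks.forEach(function (c) { pushParser.parse(c); });

console.log(fromPull, fromPush);
```

Both drivers report the same tags; in the second form the application can stop between chunks and do other work, which is what makes a queue-and-pull layer possible in a single thread.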

James




From stele at fxtech.com  Sat Dec 18 17:34:27 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:38 2004
Subject: XML parsing memory overhead concerns (fwd)
References: <Pine.LNX.4.10.9912180019550.22278-100000@cauchy.clarkevans.com>
Message-ID: <385BC5DC.EA31C48B@fxtech.com>

> James posted the answer to your concern a while
> back... (it took me a while to grok what he
> was saying).   Sean's description is correct,
> only he didn't mention "how" you do it with expat.
> James describes this below.  You only send "part"
> of the XML stream at any given time, it fires
> callbacks filling up your event buffer, as
> Sean describes.  Evidently expat handles
> "restarting" mid element..

I read through Clark's writeup and I'm willing to give it another go. I
think the main problem will be handling the reentrancy caused by the
parser parsing a startElement, calling a handler in my interface, which
calls back into the parser to get some element data, which causes
another call to the parser to read more data, which causes the
characterData handler to be called, which fills in the buffer passed by
my code, and have it all unwind correctly. Perhaps this would be easier
to think about if I had a few beers first...

Another issue is expat's license. I'm not very familiar with the Mozilla
license, but presumably anything I build on top would also require the
Mozilla (or GPL) license. I personally feel the GPL license is too
restrictive. Does Mozilla let code be used in commercial products where
source code is not made available? That's my main requirement.

--
Paul Miller - stele@fxtech.com



From clark.evans at manhattanproject.com  Sat Dec 18 17:41:41 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:38 2004
Subject: XML parsing memory overhead concerns (fwd)
In-Reply-To: <385BC5DC.EA31C48B@fxtech.com>
Message-ID: <Pine.LNX.4.10.9912180041070.22278-100000@cauchy.clarkevans.com>


On Sat, 18 Dec 1999, Paul Miller wrote:
> Another issue is expat's license. I'm not very familiar with the Mozilla
> license, but presumably anything I build on top would also require the
> Mozilla (or GPL) license. I personally feel the GPL license is too
> restrictive. Does Mozilla let code be used in commercial products where
> source code is not made available? That's my main requirement.

I'm not sure; you may be able to link to an expat 
shared library from a commercial product using 
Mozilla... also the GPL is slightly blurry here,
XML is on its way to becoming a "core" / operating 
system level service.  Perhaps James Clark would
release it under the LGPL instead?

This issue aside, you could layer your approach,
providing a much simpler (SML?) parser instead
of a full blown XML one. 

Clark




From liefke at seas.upenn.edu  Sat Dec 18 20:36:46 1999
From: liefke at seas.upenn.edu (Hartmut Liefke)
Date: Mon Jun  7 17:18:38 2004
Subject: Free Tool for Efficient XML Data Compression
Message-ID: <199912182036.PAA26308@red.seas.upenn.edu>

We would like to announce the release of XMill, a compressor for XML
data.  XMill typically compresses twice as well as gzip, at about the
same speed.

XMill is written in C++, and its distribution site is:

   http://www.research.att.com/sw/tools/xmill


TECHNICAL SUMMARY. The idea in XMill is that XML data items are
grouped according to their tags, then each group is compressed
separately with gzip.  This is XMill's default behavior, and already
results in better compression than gzip.
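
The grouping step alone can be illustrated with a short sketch. It is written in JavaScript and is not XMill's code or file format: the item shape, the groupByTag and ungroup names, and the in-memory layout are assumptions, and the per-container gzip pass is left out.

```javascript
// Illustrative only: split (tag, text) items into a structure stream
// plus one text container per tag.
function groupByTag(items) {
  var structure = [];                   // tag sequence, needed to restore document order
  var containers = Object.create(null); // tag -> array of text values
  items.forEach(function (item) {
    structure.push(item.tag);
    if (!(item.tag in containers)) containers[item.tag] = [];
    containers[item.tag].push(item.text);
  });
  return { structure: structure, containers: containers };
}

// Inverse step: walk the structure stream and read each container in order.
function ungroup(grouped) {
  var cursor = Object.create(null);     // next unread position in each container
  return grouped.structure.map(function (tag) {
    var i = cursor[tag] || 0;
    cursor[tag] = i + 1;
    return { tag: tag, text: grouped.containers[tag][i] };
  });
}

var items = [
  { tag: "ip", text: "10.0.0.1" },
  { tag: "state", text: "PA" },
  { tag: "ip", text: "10.0.0.2" },
  { tag: "state", text: "NJ" }
];
var grouped = groupByTag(items);
console.log(grouped.containers.ip, grouped.containers.state);
console.log(JSON.stringify(ungroup(grouped)) === JSON.stringify(items));
```

Each container now holds values of one kind side by side, which is the property the separate compression of each group relies on.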

Furthermore, users can select particular semantic compressors (such as
for integers, IP addresses, state names, ...) to be applied to certain
data items, or can override the way data items are grouped: this
further improves the compression rate.

In complex applications where XML files contain highly specialized
datatypes, such as images or DNA sequences, XMill can be extended with
user-defined compressors for those data types.  XMill defines a C++
Semantic Compressor API (SCAPI) for such extensions.


PAPER.  A technical paper describing XMill is included in the
distribution package, and is also available directly from:

   http://www.cis.upenn.edu/~liefke/papers/xmill.ps.gz


HIGHLIGHT. It is known that XML files tend to be larger than files in
application specific data formats.  With XMill however, it becomes
more economical to choose XML over application specific data formats,
because the XML file is compressed with XMill better (up to half the
size) than the application specific file compressed with gzip.  We
discovered this surprising fact by converting several application
specific data formats to XML (including weblog data, protein meta
data, linguistic data), then compressing the original file with gzip,
and the XML file with XMill: the compressed XML file was always
smaller, up to a factor of two. Details are included in the
distribution at the Web site above.

MAILING LIST.  For updates on XMill, please subscribe to
xmill@research.att.com, by sending an email to
majordomo@research.att.com, with the body "subscribe xmill".


Regards,

   Hartmut Liefke  -- University of Pennsylvania, liefke@seas.upenn.edu
   Dan Suciu       -- AT&T Labs, suciu@research.att.com



From david at megginson.com  Sat Dec 18 22:57:10 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:38 2004
Subject: SAX2: Namespace proposal
Message-ID: <14427.62086.25937.792412@localhost.localdomain>

In my last message, I summarized some of the major positions that have
emerged on Namespace support in SAX2.  I've done a lot of thinking
about this recently, partly because Namespace support is a technical
requirement for two of my customers.

With SAX2 alpha 1, the idea was to preserve SAX 1.0 pretty much intact 
and add extra features to it.  Upon reflection, I think that it will
make more sense to create a whole new package, org.xml.sax2 (if OASIS
gives permission), which is similar to org.xml.sax in many but not all 
respects -- that will avoid nightmarish problems with classpaths,
etc., and it will be quite easy to write adapters.

That said, I think that the best solution in SAX2 will be to allow
either or both Namespace-qualified and raw XML names.  People seem to
want both, and although I *strongly* disagree, DOM2 has decided to
provide what should be irrelevant information about the original
prefix used.

Anyway, here's what I'm suggesting.  Please take a couple of Rolaids
and maybe even a Gravol before you look at this, because while it's
functional, it's not at all pretty...


1. From org.xml.sax2.DocumentHandler

  public void startElement (String name, String namespaceURI,
                            String localName, AttributeList atts)
    throws SAXException;

  public void endElement (String name, String namespaceURI,
                          String localName)
    throws SAXException;


2. From org.xml.sax2.AttributeList

  public String getName (int index);
  public String getNamespaceURI (int index);
  public String getLocalName (int index);
  public String getType (int index);
  public String getValue (int index);

  public String getType (String name);
  public String getType (String namespaceURI, String localName);
  public String getValue (String name);
  public String getValue (String namespaceURI, String localName);
    

Notes:

a. A parser has the right to supply only cooked Namespace information, 
   only raw XML 1.0 information, or both, as long as it does so
   consistently.  It will be possible to query and set the parser's
   features.

b. The Namespace URI will be null when there is no Namespace declared
   or none available.

c. If the Parser is delivering raw XML 1.0 names, the Namespace
   declarations (xmlns*) will be included in the AttributeList but
   both the Namespace URI and the local name will be set to null, and
   getType(String, String) and getValue(String, String) will always
   fail (even with null arguments).

d. There will probably also be a NamespaceHandler to report the scope
   of Namespace declarations, but it may be possible to roll that into 
   LexicalHandler.


Here's a short example:

  <note xmlns:html="http://www.w3.org/1999/xhtml">
   <html:p>Hello, world!</html:p>
  </note>

If the parser does only raw XML 1.0 processing, it will report the
following events (omitting a little whitespace for clarity):

  startDocument()
  startElement("note", null, null, [ATTS]);
  startElement("html:p", null, null, [ATTS]);
  characters("Hello, world!")
  endElement("html:p", null, null)
  endElement("note", null, null)
  endDocument()

If the parser does only Namespace processing it will report the
following events (again, omitting a little whitespace):

  startDocument()
  startElement(null, null, "note", [ATTS]);
  startElement(null, "http://www.w3.org/1999/xhtml", "p", [ATTS]);
  characters("Hello, world!")
  endElement(null, "http://www.w3.org/1999/xhtml", "p")
  endElement(null, null, "note")
  endDocument()

If the parser does both kinds of processing, it will report the
following:

  startDocument()
  startElement("note", null, "note", [ATTS]);
  startElement("html:p", "http://www.w3.org/1999/xhtml", "p", [ATTS]);
  characters("Hello, world!")
  endElement("html:p", "http://www.w3.org/1999/xhtml", "p")
  endElement("note", null, "note")
  endDocument()
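
The relationship between the three name arguments in the "both" case
can be sketched as below.  This is a rough illustration, not code from
any SAX release: NameSplitSketch, cook(), and the inScope map are
hypothetical names, and an undeclared prefix simply yields a null URI
here rather than being reported as an error.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: derives the three name arguments of the proposed
// startElement(name, namespaceURI, localName, atts) from a raw XML 1.0 name.
class NameSplitSketch {

    // Returns {name, namespaceURI, localName}.  inScope maps each declared
    // prefix to its URI; the key "" would hold a default Namespace, if any.
    static String[] cook(String rawName, Map<String, String> inScope) {
        int colon = rawName.indexOf(':');
        String prefix = colon < 0 ? "" : rawName.substring(0, colon);
        String local = colon < 0 ? rawName : rawName.substring(colon + 1);
        String uri = inScope.get(prefix);   // null when nothing is declared
        return new String[] { rawName, uri, local };
    }

    public static void main(String[] args) {
        Map<String, String> inScope = new HashMap<>();
        inScope.put("html", "http://www.w3.org/1999/xhtml");
        for (String raw : new String[] { "note", "html:p" }) {
            String[] n = cook(raw, inScope);
            System.out.println("startElement(" + n[0] + ", " + n[1] + ", " + n[2] + ", [ATTS])");
        }
    }
}
```

A raw-only parser would pass just the first element of the result, and a
Namespace-only parser just the last two.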

This is kind-of painful, I know, but please remember that SAX is meant
to be the equivalent of a low-level device driver, not an elegant,
high-level API (though those can be and have been built on top of it).
A higher-level API would be free to ignore the raw XML 1.0 names, and,
in my opinion, would probably be wise to do so.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Sat Dec 18 22:57:09 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:39 2004
Subject: SAX2: summary of Namespace-support arguments
Message-ID: <14427.49858.421668.821690@localhost.localdomain>

Thanks to everyone for their input on Namespace support in SAX2.  A
few identifiable positions have come up:

1. SAX2 should provide no special Namespace support at all, because
   applications can use the xmlns* attributes to get at information
   they need.

2. SAX2 should provide Namespace-qualified names as single strings.

3. SAX2 should provide the Namespace URI and the local name
   separately.

4. SAX2 should provide the Namespace URI and local name separately,
   and should also provide information about the original prefix used
   for each element or attribute name.

I will arbitrarily and high-handedly dismiss #1 out of hand -- since
Namespace processing is a prerequisite for XSL, RDF, XML Schemas,
XHTML, DOM2, and just about everything else interesting happening in
XML right now, there is an obvious benefit in not forcing every
application writer to write custom code for the same task -- SAX2
*must* expose cooked Namespace information if it's going to be useful
with the next generation of XML specs and apps.

That leaves us with three positions, which I will assign to their most 
outspoken (though not sole) advocates and will attempt to summarize,
probably incorrectly:

#2 (Simon St-Laurent)
  SAX should be as simple and transparent as possible: it's easy
  for applications to work with a single string rather than a new
  class of object, and using a single string keeps SAX2
  backwards-compatible.

#3 (James Clark)
  James sympathizes with the backwards-compatible argument because
  he's done the same with Expat, but he believes that SAX2 should
  present what are in effect compound names as compound objects.
  James would like to create a new Name class that includes the
  original prefix (if any) as well as the Namespace URI and local
  name.

#4 (Tim Bray)
  Tim agrees that a compound object makes the most sense, because it
  makes no sense for the parser to put the two parts together into a
  string only so that the app can take them apart again.  Tim argues
  that the original prefix shouldn't matter at all, and that apps
  should see only the cooked Namespace info: the Namespace URI and the 
  local part.

(I should note that James and Tim have been conducting this friendly
debate -- prefix does/doesn't matter -- for quite a while and in many
contexts other than SAX2.)
  
To rephrase, then, there are actually two different decisions that we
need to make:

1. maximal vs. minimal Namespace information; and
2. String vs. compound-object representation.

I'm going to post my suggestion in a separate message, and as with
most of the rest of SAX, while no one will like it, I hope that
everyone will be able to live with it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tbray at textuality.com  Sun Dec 19 00:13:15 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:39 2004
Subject: SAX2: summary of Namespace-support arguments
Message-ID: <3.0.32.19991218161206.014eb310@pop.intergate.ca>

At 12:22 PM 12/18/99 -0500, David Megginson wrote:
>#3 (James Clark)...  James would like to create a new Name class that 
>  includes the
>  original prefix (if any) as well as the Namespace URI and local name.
>#4 (Tim Bray)...  Tim argues
>  that the original prefix shouldn't matter at all, and that apps
>  should see only the cooked Namespace info: the Namespace URI and the 
>  local part.
>(I should note that James and Tim have been conducting this friendly
>debate -- prefix does/doesn't matter -- for quite a while and in many
>contexts other than SAX2.)

Actually, you are slightly mis-characterizing James' position.  I don't
think he cares about which prefix was actually used on any particular
name, but he wants APIs to manifest the prefix/namespace mapping so that
they can be used in things like XPath expressions.

Yes, I have argued repeatedly against this, but have resoundingly lost
that argument every time.  Since I'm obviously right, this can only 
be ascribed to James' devilish political cunning and awesome powers of
salesmanship and persuasion.  Hrumph.  Anyhow, since XPath and some
other things now rely on access to prefix-ns mappings, it seems obvious 
that SAX has to provide this info one way or another. -Tim



From tbray at textuality.com  Sun Dec 19 00:35:15 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:39 2004
Subject: SAX2: Namespace proposal
Message-ID: <3.0.32.19991218163458.01500100@pop.intergate.ca>

At 03:45 PM 12/18/99 -0500, David Megginson wrote:
>That said, I think that the best solution in SAX2 will be to allow
>either or both Namespace-qualified and raw XML names.  People seem to
>want both, and although I *strongly* disagree, DOM2 has decided to
>provide what should be irrelevant information about the original
>prefix used.

Actually, as I just finished arguing in response to your other note, 
I think what people care about is having the prefix/uri mapping available,
not knowing what prefix was actually used.  Have I missed something? Is
there an application scenario where you get

<a xmlns:a="http://x.com" xmlns:b="http://x.com">
  <a:foo /><b:bar />
  </a>

...and you actually care that the foo and bar had different prefixes?  
I'd find that really hard to believe.

In fact, the *only* scenario I can see in which you want the 
raw un-namespace-processed full name is where you are going to be doing
namespace-oblivious processing, and specifically the case in which you
are going to disregard the explicit warnings in XML 1.0 section 2.3
and use element names like a:foo and b:bar and would be upset if
intervening machinery picks them apart for you.

Anyone who wants to do that is IMHO nuts, but if they do, then since you 
thoughtfully provide a sax2 interface, then all they have to do is use
sax1 instead and they'll never have to worry about that terrible misguided 
namespace stuff, and they can mis-use colons to their hearts' content.

>Anyway, here's what I'm suggesting.  Please take a couple of Rolaids
>and maybe even a Gravol before you look at this, because while it's
>functional, it's not at all pretty...

I think that's often a symptom of having gone off the rails.  Sax1 is
not perfect but it's not aesthetically repellent in the slightest.

>1. From org.xml.sax2.DocumentHandler
>  public void startElement (String name, String namespaceURI,
>                            String localName, AttributeList atts)
>    throws SAXException;

Just lose the "name" argument to all of these, and add a facility
to query in-scope prefix-namespace mappings, and I think you've got
a winner.  

>Notes:
>a. A parser has the right to supply only cooked Namespace information, 
>   only raw XML 1.0 information, or both, as long as it does so
>   consistently.

Bleccch.  Gag.  Choke.  If you want raw XML 1.0 (and remember, this only
matters if you're misusing colons) why would you want to go near SAX2?

>b. The Namespace URI will be null when there is no Namespace declared
>   or none available.

Note that since the namespace URI can't be an empty string, it's worth
thinking about using "" rather than null.  It would allow you to do 
things like

 if (namespace.equals("http://my.namespace.com"))
   whatever();

instead of

 if ((namespace != null) && namespace.equals("http://my.namespace.com"))
   whatever();

but I haven't thought about this a lot.
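
For comparison, a small self-contained sketch of the two conventions,
plus a third idiom that works under either.  The class and method names
are made up for the illustration; only String.equals is real API.

```java
// Illustration only: three ways to test a Namespace URI that may be absent.
class NamespaceCompareSketch {
    static final String MY_NS = "http://my.namespace.com";

    // Absent Namespace reported as null: the guard is needed to avoid
    // a NullPointerException.
    static boolean matchesGuarded(String namespace) {
        return namespace != null && namespace.equals(MY_NS);
    }

    // Absent Namespace reported as "": the receiver is never null,
    // so the plain call is safe.
    static boolean matchesEmptyConvention(String namespace) {
        return namespace.equals(MY_NS);
    }

    // Constant-first comparison: safe under either convention, because
    // String.equals(null) returns false.
    static boolean matchesConstantFirst(String namespace) {
        return MY_NS.equals(namespace);
    }
}
```

The constant-first form removes the need for the null check in this
particular test, though it does not help code that calls other methods
on the URI.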

>c. If the Parser is delivering raw XML 1.0 names, the Namespace
>   declarations (xmlns*) will be included in the AttributeList but
>   both the Namespace URI and the local name will be set to null, and
>   getType(String, String) and getValue(String, String) will always
>   fail (even with null arguments).

What!?!?!?  This seems baroque and totally wrong.  Why?

>d. There will probably also be a NamespaceHandler to report the scope
>   of Namespace declarations, but it may be possible to roll that into 
>   LexicalHandler.

Perhaps.  But why not just have a couple of new methods on the parser:

 String getURIForPrefix(String prefix)
 String[] getPrefixesForURI(String uri)

yeah, you'll have to stash a bit of state, but not very much.
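
A minimal sketch of the state involved, assuming one map of
declarations per open element.  Only the two query method names come
from the suggestion above; the class PrefixScopes and its
startElement/endElement/declare methods are hypothetical, and declare()
must be called after startElement().

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical class: a stack of prefix declarations, one map per open element.
class PrefixScopes {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void startElement() { scopes.push(new HashMap<>()); }
    void endElement() { scopes.pop(); }
    void declare(String prefix, String uri) { scopes.peek().put(prefix, uri); }

    // Innermost declaration wins; null if the prefix is not in scope.
    String getURIForPrefix(String prefix) {
        for (Map<String, String> scope : scopes) {   // iterates innermost first
            String uri = scope.get(prefix);
            if (uri != null) return uri;
        }
        return null;
    }

    // All prefixes currently bound to the URI, in alphabetical order.
    String[] getPrefixesForURI(String uri) {
        Map<String, String> effective = new TreeMap<>();
        Iterator<Map<String, String>> it = scopes.descendingIterator(); // outermost first
        while (it.hasNext()) {
            effective.putAll(it.next());             // inner scopes override outer ones
        }
        List<String> found = new ArrayList<>();
        for (Map.Entry<String, String> e : effective.entrySet()) {
            if (e.getValue().equals(uri)) found.add(e.getKey());
        }
        return found.toArray(new String[0]);
    }
}
```

With the two declarations from the earlier a/b example in scope,
getPrefixesForURI("http://x.com") would return both "a" and "b".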

James, that would give you what you need for XPath and so on, right?

>This is kind-of painful, I know, but please remember that SAX is meant
>to be the equivalent of a low-level device driver, not an elegant,
>high-level API 

Not good enough. Sax2 is going to be with us for a loooooooooooong
time.  Let's get it right.  -Tim




From elharo at metalab.unc.edu  Sun Dec 19 00:38:36 1999
From: elharo at metalab.unc.edu (Elliotte Rusty Harold)
Date: Mon Jun  7 17:18:39 2004
Subject: Free Tool for Efficient XML Data Compression
In-Reply-To: <199912182036.PAA26308@red.seas.upenn.edu>
References: <199912182036.PAA26308@red.seas.upenn.edu>
Message-ID: <v04210102b481d61885dc@[192.168.1.254]>

>
>
>HIGHLIGHT. It is known that XML files tend to be larger than files in
>application specific data formats.


While XMill sounds interesting, I really have to take issue with this 
statement.  I've seen no evidence that XML based, uncompressed file 
formats are larger than the corresponding binary file formats. This 
is a common fear about XML but I have not seen it borne out in my 
tests. For instance, my 700K, very verbose baseball statistics 
example is more than two megabytes in both FileMaker 3 and Microsoft 
Excel.


+-----------------------+------------------------+-------------------+
| Elliotte Rusty Harold | elharo@metalab.unc.edu | Writer/Programmer |
+-----------------------+------------------------+-------------------+
|                  The XML Bible (IDG Books, 1999)                   |
|              http://metalab.unc.edu/xml/books/bible/               |
|   http://www.amazon.com/exec/obidos/ISBN=0764532367/cafeaulaitA/   |
+----------------------------------+---------------------------------+
|  Read Cafe au Lait for Java News:  http://metalab.unc.edu/javafaq/ |
|  Read Cafe con Leche for XML News: http://metalab.unc.edu/xml/     |
+----------------------------------+---------------------------------+



From cbullard at hiwaay.net  Sun Dec 19 00:45:03 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:39 2004
Subject: Musing over Namespaces
References: <E11yy2R-0006CX-00@romeo.ic.ac.uk>
Message-ID: <385C258F.3591@hiwaay.net>

Eiland, David wrote:
> 
>         While the conversation has primarily focused on the Company
> (Organization) level, it is my experience working with the Fortune 500 that
> the same name is often used differently between departments/functions.  Then
> considering buying/selling of other divisions/companies that periodically
> occurs, the Namespace mechanism must be fluid.

Precisely.  Views were proposed originally with that in mind.  Consider
a project as a performance in which processes are controlled via the
visibility of nested views.  A WBS (Work Breakdown Structure) is that.
Treat it dynamically.  Each level of that organization (not really a
pure nesting but for the sake of discussion) has responsibilities for
opening and closing processes and proving a valid performance by
proving a valid deliverable.  Any systems engineer is familiar with the
models needed.  XML is just another means of enabling a loose coupling
among the systems.

In the days when this was proposed using SGML (late eighties), there
were not only concerns about the buying and selling, but organizing
performances across cultures as well.

The model writ large has aspects of real time systems design.

len




From cbullard at hiwaay.net  Sun Dec 19 01:26:18 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:39 2004
Subject: Musing over Namespaces
References: <Pine.GHP.4.21.9912171419380.16086-100000@mail.ilrt.bris.ac.uk>
Message-ID: <385C3330.1B68@hiwaay.net>

Dan Brickley wrote:
> 
> I didn't take Tim to be arguing against decentralisation, but against
> the multitude of companies/organisations who both promote the
> centralised monolithic (meta)data registry approach, and
> (coincidentally) attempt to promote themselves as providing that
> service.

I guess that includes OASIS.  Too bad.  It has a chance of 
being a significant organization in this.  I've really no 
objection to ANY company selling such a service.  I can 
think of many small companies for which it may be useful 
in the same sense as bookies are to horse races but when 
you look at the models for contracting used for very large 
transactions, it is much more useful if each B2B interface 
is established dynamically.  Otherwise, faster simply means 
less quality.  Review the decline of the airline industry 
services as they reach saturation.  But my point is, if there 
is going to be one, we are better served if there are lots 
of them.  Caveat Emptor.  If OASIS wants to do it, I certainly
hope MS does, and Sun does, and MaAndPaRegistry do too.

The real product by which ecom-ecologies will be judged is the 
quality of the experience both in getting the product 
and in using the product.  Quality is coupled to profit.  
As the profits become razor thin, the quality declines. 
Faster, better, cheaper:  pick any two.
 
> One of the supposed big wins of using the same syntax/model for our
> meta-languages and our instance data is re-use, synergy.

Yes.  I am aware.  See the experience of the CALS community with 
this and in fact, the entire history of markup since type definitions 
were added to the original generic coding designs.  Most attempts at
centralization failed.

I don't think it was the fault of the registry designers 
or anything sinister.  It has to do with defining and maintaining 
visible and hidden definitions of processes which preserve profit.  

So, reuse is cool but not the goal, just an optimization.  If the 
effect of B2B and B2Customer is to suck out the local profit (think 
for example, sales tax), the result is the destabilization of the 
local ecologies just as super malls killed downtowns.  Except in this 
case, China Kills California.   It happens fast.  I suspect some 
weird and typically badly informed debates are going on at high 
levels of governments worldwide.  I predict higher property taxes 
but anyway, the point is, the local system declares its own 
namespace, declares what is visible, and contracts for tests 
of quality performance and deliverable. 

It's called Logistics.  Originally CALS was Computer-Aided Logistics 
before it became Commerce At Light Speed.  The WebSters are about 
to repeat the same mistake made by the CALS groups when they chose to 
take a bridge too far.

> Eg. if RDF and XML vocabularies are themselves described using RDF and
> XML, then generic discovery/indexing/trust systems applicable to _all_
> XML/RDF content should be equally applicable to schemas.

The fate of RDF is irrelevant.  It is a means.  I think the topic 
map folks already had most of this ground covered but that is a means 
too.  It does have some interesting running code to back it up.

> Why then
> promote centralised registries for all schemas? Surely there will be
> search engines and 'trusted third parties' for schema data, as there
> will be for other applications of XML and RDF.

Surely.  Now what will they do with that namespace?  What is it useful
for other than, as David is pointing out in the sibling thread,
packaging?

> By defining schema
> languages in instance syntax, we implicitly promote the idea that there
> will be some big payoff for doing so (otherwise, lets stick with
> DTDs). 

There is some payoff:

1.  Politically.  XML can finally dissolve the XML-to-SGML parentage.
2.  Technically.  It is a stronger schema model.

There are some downsides too:

1.  Politically.  It is absurdly complicated to explain.  It needs 
so many names to name the names, it vindicates the Hytime work 
completely.

2.  It is limited unnecessarily.  I have to go back and look, but 
unless arrays have been added it is incomplete.  Even if one 
can simulate that in the application language, the experience has 
been when the implementors have to do an unreasonable amount of 
work to get the same results as familiar means, they opt for 
familiar means or they adapt and create similar means which 
they make more useful in their own application. 

In other words, it may be the case that schemas aren't the 
best or only means and may not become the preferred means. 
A lot depends on how other language communities perceive them 
and other economic ecologies provide alternatives.

> Some synergy that means generic tools will be applicable to
> schemata. I find this impossible to reconcile with the
> www.really-important-trusted-metaregistry.[com|org] approach that seems
> popular in the industry. I'm banking on doing schema searches at the
> mainstream search engines in 2-3 years time...

Yes.  It concerns me if the community of XML developers is not yet 
recognizing that.  OTOH, if you do, you are ahead so Get It While 
The Getting Is Good.

We need a discussion of common models of contracting.  How is an 
ROA declared and maintained by an organization-to-organization 
process?  The models we used a decade ago used the 
process/control design familiar to systems analysts.  There 
were interesting parallels with real time systems but those 
may not be as important as I once thought they were.  

len




From tpassin at idsonline.com  Sun Dec 19 01:41:24 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:39 2004
Subject: Free Tool for Efficient XML Data Compression
References: <199912182036.PAA26308@red.seas.upenn.edu> <v04210102b481d61885dc@[192.168.1.254]>
Message-ID: <001501bf49c2$9fad96e0$73fbb1cd@tomshp>


Elliotte Rusty Harold wrote:
> >
> >
> >HIGHLIGHT. It is known that XML files tend to be larger than files in
> >application specific data formats.
>
>
> While XMill sounds interesting, I really have to take issue with this
> statement.  I've seen no evidence that XML based, uncompressed file
> formats are larger than the corresponding binary file formats. This
> is a common fear about XML but I have not seen it borne out in my
> tests. For instance, my 700K, very verbose baseball statistics
> example is more than two megabytes in both FileMaker 3 and Microsoft
> Excel.
>
>
I took an SVG  XML picture (it actually was a painting in SVG, not just
lines and simple shapes) - 100K, and converted it to a GIF - 80K.  Not much
difference, really!  And if the XML file is compressed in transmission, it
would probably be even smaller to send than the GIF (since the GIF
probably won't compress much more).  I also took a 2-column Word97
document - 63.5K - and opened it with Abiword (www.abisource.com) then saved
it.  Abiword uses an XML file format as its native format.  Abiword XML file
size: 48.8K.

These results support Elliotte's post.

Tom Passin




From ebohlman at netcom.com  Sun Dec 19 02:51:11 1999
From: ebohlman at netcom.com (Eric Bohlman)
Date: Mon Jun  7 17:18:39 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <3.0.32.19991218163458.01500100@pop.intergate.ca>
Message-ID: <Pine.GSU.4.10.9912181831560.24621-100000@netcom9.netcom.com>

On Sat, 18 Dec 1999, Tim Bray wrote:
> Actually, as I just finished arguing in response to your other note, 
> I think what people care about is having the prefix/uri mapping available,
> not knowing what prefix was actually used.  Have I missed something? Is
> there an application scenario where you get
> 
> <a xmlns:a="http://x.com" xmlns:b="http://x.com">
>   <a:foo /><b:bar />
>   </a>
> 
> ...and you actually care that the foo and bar had different prefixes?  
> I'd find that really hard to believe.

I can think of one such class of applications: editors or other tools that
transform human-generated XML documents into other human-edited and read
XML documents.  A near-perennial question on some of the Usenet XML
newsgroups is how to keep a tool from transforming <element></element>
into <element/>.  Admittedly, such applications are going to be a
minority, and support for such requirements should probably lurk in the
corners of the interface rather than being the first thing anybody sees
(we do *not* want to encourage developers to think of prefix names as the
"official" identifiers for namespaces!; I have a depressing hunch that a
majority of people who use XSLT believe that "<xsl:for-each>" is *the*
proper name for one of that language's elements, rather than one of
several), but a complete lack of support for such requirements will, IMHO,
encourage people to hack-parse with regexes or the like in such
applications, which is, again IMHO, not a Good Thing (and, YAIMHO, a Worse
Thing than indulging people's religious beliefs about how a document's
"source code" should "look").




From ebohlman at netcom.com  Sun Dec 19 03:02:32 1999
From: ebohlman at netcom.com (Eric Bohlman)
Date: Mon Jun  7 17:18:39 2004
Subject: Free Tool for Efficient XML Data Compression
In-Reply-To: <v04210102b481d61885dc@[192.168.1.254]>
Message-ID: <Pine.GSU.4.10.9912181847530.24621-100000@netcom9.netcom.com>

On Sat, 18 Dec 1999, Elliotte Rusty Harold wrote:

> While XMill sounds interesting, I really have to take issue with this 
> statement.  I've seen no evidence that XML based, uncompressed file 
> formats are larger than the corresponding binary file formats. This 
> is a common fear about XML but I have not seen it borne out in my 
> tests. For instance, my 700K, very verbose baseball statistics 
> example is more than two megabytes in both FileMaker 3 and Microsoft 
> Excel.

Slightly playing devil's advocate here, I rather strongly suspect that
what you're seeing is the size advantage of a variable-length-field data
format over a fixed-length-field data format; your database files probably
have 1.3+ megabytes of blank padding.  What does the comparison look like
if you export the files to CSV?

While I do believe there's a lot of FUD running around regarding the size
of XML files, and that shouldn't be the primary selling point of something
like XMill, it does appear that the XMill developers are on to something:
XML allows specifying the structure of data, and XMill appears to be able
to use that structural information to get better compression than would be
available by treating the data as an unstructured blob of text.  It's very
often the case that the average "entropy" of the contents of each "field"
of a set of "records" is much lower than the "entropy" of the set as a
whole, and thus the sum of the gains from compressing each "column"
individually can be greater than the gain from compressing the entire set
as if it were homogeneous.
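
A rough way to try the per-column idea is sketched below. It is only an illustration, not how XMill works internally: it uses java.util.zip.Deflater, the class name ColumnCompressionSketch and the synthetic records are made up, and which layout comes out smaller depends entirely on the data, so no result is claimed here.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Sketch only: compares deflating a record set as one blob with deflating
// each "column" separately. Data and names are invented for illustration.
class ColumnCompressionSketch {

    // Returns the number of bytes Deflater produces for the given text.
    static int deflatedSize(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[4096];
        int total = 0;
        while (!deflater.finished()) {
            total += deflater.deflate(buffer);
        }
        deflater.end();
        return total;
    }

    public static void main(String[] args) {
        StringBuilder rows = new StringBuilder();   // record-at-a-time layout
        StringBuilder names = new StringBuilder();  // one container per column
        StringBuilder years = new StringBuilder();
        StringBuilder averages = new StringBuilder();
        for (int i = 0; i < 2000; i++) {
            String name = "Player" + (i % 300);
            String year = String.valueOf(1990 + (i % 10));
            String average = "0." + (200 + (i * 37) % 150);
            rows.append(name).append(',').append(year).append(',')
                .append(average).append('\n');
            names.append(name).append('\n');
            years.append(year).append('\n');
            averages.append(average).append('\n');
        }
        int whole = deflatedSize(rows.toString());
        int byColumn = deflatedSize(names.toString())
                + deflatedSize(years.toString())
                + deflatedSize(averages.toString());
        System.out.println("whole set:  " + whole + " bytes");
        System.out.println("per column: " + byColumn + " bytes");
    }
}
```

Running it on real exported data rather than the synthetic rows would be the more interesting comparison.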




From jjc at jclark.com  Sun Dec 19 06:25:54 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:39 2004
Subject: XML parsing memory overhead concerns (fwd)
References: <Pine.LNX.4.10.9912180019550.22278-100000@cauchy.clarkevans.com> <385BC5DC.EA31C48B@fxtech.com>
Message-ID: <385C7731.61A34363@jclark.com>

Paul Miller wrote:

> I read through Clark's writeup and I'm willing to give it another go. I
> think the main problem will be handling the reentrancy caused by the
> parser parsing a startElement, calling a handler in my interface, which
> calls back into the parser to get some element data, which causes
> another call to the parser to read more data, which causes the
> characterData handler to be called, which fills in the buffer passed by
> my code, and have it all unwind correctly. Perhaps this would be easier
> to think about if I had a few beers first...

In the scheme I described, the expat handlers simply append events to a
queue, so you have no reentrancy problems.
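
The queueing idea can be sketched roughly as follows. This is an illustration under assumptions, not the expat code: it uses the Java SAX API bundled with the JDK instead of expat, the class name QueueingHandler and the string encoding of events are made up, and it parses the whole input before draining the queue rather than feeding the parser more data on demand.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Queue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: the handlers never call back into the parser, they just
// append an event to a queue that the application reads later.
class QueueingHandler extends DefaultHandler {
    final Queue<String> events = new ArrayDeque<>();

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        events.add("start:" + qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        events.add("end:" + qName);
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        events.add("text:" + new String(ch, start, length));
    }

    // Parses the document and returns the queued events in document order.
    static Queue<String> parse(String xml) {
        QueueingHandler handler = new QueueingHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.events;
    }

    public static void main(String[] args) {
        Queue<String> queue = parse("<a><b>hi</b></a>");
        while (!queue.isEmpty()) {
            System.out.println(queue.remove()); // application pulls events
        }
    }
}
```

Because the application only ever reads from the queue, nothing it does can re-enter a handler that is still on the stack.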

> Another issue is expat's license. I'm not very familiar with the Mozilla
> license, but presumably anything I build on top would also require the
> Mozilla (or GPL) license.

No.  The Mozilla license isn't like that.

> I personally feel the GPL license is too
> restrictive. Does Mozilla let code be used in commercial products where
> source code is not made available?

Yes. That's the main point of the Mozilla license.  Its requirements
are not at all onerous.

James




From jjc at jclark.com  Sun Dec 19 06:27:23 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:39 2004
Subject: SAX2: summary of Namespace-support arguments
References: <3.0.32.19991218161206.014eb310@pop.intergate.ca>
Message-ID: <385C6A26.7B4E6AA6@jclark.com>

Tim Bray wrote:
> 
> At 12:22 PM 12/18/99 -0500, David Megginson wrote:
> >#3 (James Clark)...  James would like to create a new Name class that
> >  includes the
> >  original prefix (if any) as well as the Namespace URI and local name.

> Actually, you are slightly mis-characterizing James' position.  I don't
> think he cares about which prefix was actually used on any particular
> name, but he wants APIs to manifest the prefix/namespace mapping so that
> they can be used in things like XPath expressions.

I do indeed want that, and in the past I've argued against *requiring*
processors to provide information about the prefix used. I've become a
lot less negative about this prefix information recently and I think
it's better for an API to provide it.  My reasons are:

1. I view namespaces as much more core to XML than DTDs.  I want to be
able to build namespace processing into the parser at a very low level
(so it has negligible overhead); but I don't want to build XML 1.0
validation in.  This means I need an API that is both namespace aware
and allows XML 1.0 validation (which of course requires prefix) to be
layered on top of it.

2. DOM Level 2 needs prefixes.  It would be very unfortunate if SAX 2
was such that DOM Level 2 could not be layered on top of it.

3. I want to use DocumentHandler not just as an interface between a
parser and an application but between an application and a serializer.
Serialization can in fact be as performance critical as parsing.  A
serializer can do its job much more efficiently and easily if it has
the prefix available rather than having to figure it out from the
prefix/namespace bindings in effect. Although the combination of XML 1.0
DTDs and namespaces is problematic, many users want to use namespaces
and still have their documents be XML 1.0 valid; this may apply to the
documents they are creating with a DocumentHandler.  For a serializer
that uses a DocumentHandler as its interface to be able to do this, it
has to have prefix information.

James




From jjc at jclark.com  Sun Dec 19 06:25:53 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain>
Message-ID: <385C7716.6F394D37@jclark.com>

David Megginson wrote:

> c. If the Parser is delivering raw XML 1.0 names, the Namespace
>    declarations (xmlns*) will be included in the AttributeList but
>    both the Namespace URI and the local name will be set to null, and
>    getType(String, String) and getValue(String, String) will always
>    fail (even with null arguments).

The one scenario that bothers me with this is:

- the parser always supplies both XML 1.0 names and cooked namespace
declarations

- the application is namespace-aware and only interested in the cooked
namespace information

- the application iterates over all attributes using getLocalName(int).

In this scenario, the fact that the parser is supplying XML 1.0 raw
names in addition to cooked namespace declarations will force the
application to check whether getLocalName is returning null (indicating
a namespace declaration) and, if so, ignore the attribute.  This seems
to be the one case where a namespace-aware application that is being
driven by a parser that provides cooked namespace information is
affected by whether or not the parser also provides XML 1.0 raw names.
This seems a bad thing, but so far I haven't been able to think of a
fully satisfactory solution.

> This is kind-of painful

Indeed.  Although I think your proposal is quite reasonable, and
although I don't have anything better to propose right now, I can't say
I feel altogether comfortable with it.  I think we should continue to
explore other possible solutions.  If it's possible to have a completely
new org.xml.sax2 package, then more radical solutions are possible.  For
example, I think there are good arguments for moving to a 

interface DocumentHandler {
  void startElement(StartElementEvent event);
  void endElement(EndElementEvent event);
  // ...
}

style interface.  This makes it much easier to evolve the interface in
the future.  For example, a SAX3 that added support for XML Schemas
could just add additional methods to StartElementEvent and
EndElementEvent and thus remain compatible with SAX2.
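
To make the evolution argument concrete, here is a minimal sketch. Only the three interface names come from the message; the accessor names, the hypothetical getSchemaType() addition, and the NameCollector and EventStyleDemo classes are assumptions made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

interface StartElementEvent {
    String getNamespaceURI();
    String getLocalName();

    // Hypothetical later addition: handlers written earlier keep compiling
    // because the DocumentHandler method signatures do not change.
    default String getSchemaType() {
        return null;
    }
}

interface EndElementEvent {
    String getNamespaceURI();
    String getLocalName();
}

interface DocumentHandler {
    void startElement(StartElementEvent event);
    void endElement(EndElementEvent event);
}

// A handler written before getSchemaType() existed; it needs no changes.
class NameCollector implements DocumentHandler {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(StartElementEvent event) {
        seen.add("{" + event.getNamespaceURI() + "}" + event.getLocalName());
    }

    @Override
    public void endElement(EndElementEvent event) {
        // nothing to record in this sketch
    }
}

class EventStyleDemo {
    // Builds a minimal start event for the demo.
    static StartElementEvent start(final String uri, final String local) {
        return new StartElementEvent() {
            @Override
            public String getNamespaceURI() { return uri; }

            @Override
            public String getLocalName() { return local; }
        };
    }

    public static void main(String[] args) {
        NameCollector handler = new NameCollector();
        handler.startElement(start("http://example.org/ns", "molecule"));
        System.out.println(handler.seen);
    }
}
```

The cost is one extra object type per event kind.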

James




From aray at q2.net  Sun Dec 19 06:35:10 1999
From: aray at q2.net (Arjun Ray)
Date: Mon Jun  7 17:18:40 2004
Subject: Musing over Namespaces
In-Reply-To: <385C3330.1B68@hiwaay.net>
Message-ID: <Pine.LNX.4.10.9912190130280.6950-100000@mail.q2.net>


On Sat, 18 Dec 1999, Len Bullard wrote:
> Dan Brickley wrote:
> 
> > By defining schema languages in instance syntax, we implicitly
> > promote the idea that there will be some big payoff for doing so
> > (otherwise, lets stick with DTDs).

How about improving DTDs? (Just a thought.)

> There is some payoff:
> 
> 1.  Political.  XML can finally dissolve the XML to SGML parentage.

Done deal already: XML is a W3C trademark, and AFAICT, the XML 1.0
Recommendation does not reference SGML (ISO8879+TCs) normatively.

> 2.  Technically.  It is a stronger schema model. 

That's the hope.  The latest XSchema stuff has plenty of changes.

> There are some downsides too:
> 
> 1.  Politically.  It is absurdly complicated to explain.  It needs 
> so many names to name the names, it vindicates the Hytime work 
> completely.

Unfortunately, it's now politically correct to break out the garlic and
crosses at any mention of HyTime - saves having to ask how much of it is
being reinvented.

> 2.  It is limited unnecessarily.  I have to go back an look, but 
> unless arrays have been added it is incomplete.  

Not there as a "primitive": but it seems possible to have a user-defined
type with list-like characteristics using "facets".  The data-typing in
XSchema seems heavily influenced by (R)DBMS-think.

> In other words, it may be the case that schemas aren't the best or
> only means and may not become the preferred means.  A lot depends on
> how other language communities perceive them and other economic
> ecologies provide alternatives.

Isn't *some* (form of) schema indispensable?  The real issue seems to be
the content of schemas (i.e. what they provide for in the way of asserted
and verifiable constraints) rather than the syntax, although sheer
clumsiness of instance-syntax could become an issue too.

> > Some synergy that means generic tools will be applicable to 
> > schemata. 

I'd prefer a useful syntax to a convenient one, where convenience is seen
through the prism of tools at hand.  Not everything need have to be a nail
just because we have a hammer.


Arjun




From ricko at allette.com.au  Sun Dec 19 07:13:26 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:40 2004
Subject: Free Tool for Efficient XML Data Compression
Message-ID: <001401bf49f3$dd314ee0$08f96d8c@NT.JELLIFFE.COM.AU>

 From: Elliotte Rusty Harold <elharo@metalab.unc.edu>
 
>>HIGHLIGHT. It is known that XML files tend to be larger than files in
>>application specific data formats.
>
>
>While XMill sounds interesting, I really have to take issue with this 
>statement.  I've seen no evidence that XML based, uncompressed file 
>formats are larger than the corresponding binary file formats. This 
>is a common fear about XML but I have not seen it borne out in my 
>tests. For instance, my 700K, very verbose baseball statistics 
>example is more than two megabytes in both FileMaker 3 and Microsoft 
>Excel.
 
I think he means files with various kinds of minimization
(tag omission, short-tags, eg </>, NET tags eg <xxx/.../ ) and 
home-made markup languages optimised for specific information
types.

Rick Jelliffe




From ricko at allette.com.au  Sun Dec 19 07:28:10 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX2: summary of Namespace-support arguments
Message-ID: <002501bf49f5$deb5ba10$08f96d8c@NT.JELLIFFE.COM.AU>

 From: James Clark <jjc@jclark.com>

 >I do indeed want that, and in the past I've argued against *requiring*
>processors to provide information about the prefix used. I've become a
>lot less negative about this prefix information recently and I think
>it's better for an API to provide it

Isn't it a requirement for XSL?  In XSL, if two elements have the same
namespace but different prefixes, they are processed differently (at
least, for the XSL namespace itself).

This is essential for XSL scripts that generate XSL.  This cannot be
done if either the prefix is lost or unless some other arrangement
(e.g. adding some attribute giving an instance number) is used.

Also, prefixes form part of the markup.  They should have documentation
meaning.   I think it is bad if the "rdf:" prefix were to be replaced by
"xyz" or, worse, by some prefix generated at random like "dom".   So,
like document-entity encoding, prefixes do not form part of the
canonical information set of the document but they do form part of the
pragmatic information set required for round-tripping.

Different instances of the same document are required for programming
languages that use XML syntax and which allow namespaced elements
as data.  Prefix-preservation is required for this. Prefix-preservation
is also required for round-tripping.

Rick Jelliffe




From abisheks at india.hp.com  Sun Dec 19 07:39:13 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX Question
Message-ID: <004201bf49f3$d74e3290$252f0a0f@india.hp.com>

Hi,

Can someone explain why SAX has two callback functions, one for the tag
name (startElement) and one for the tag value (characters)? Why not have
a single function that calls back the processing application with a
name-value pair (for elements as well as attributes)?

That would make it easier to process the data, since you would get the
name of the tag and its value in one shot.

Currently, when my application encounters a tag name it is interested
in, it sets a global Boolean flag to true so that when the callback
happens on the characters function the value is stored in a variable and
not ignored.

The parser could deliver a name-value pair for both elements and
attributes, since my application just needs the data and is not
concerned with whether the name-value pair comes from an element or an
attribute.

Can I modify the source code of any parser so that it gives me
name-value pairs in one callback function rather than two? How
complicated would that be?

regards,
Abhishek.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    _/               Abhishek Srivastava
   _/                Hewlett Packard ISO       
  _/_/_/   _/_/_/    -------------------   
 _/    /   _/   _/     (Work)   +91-80-2251554 x1190
_/  _/   _/_/_/      (Ip)     15.10.47.37            
        _/           (Url)    http://sites.netscape.net/abhishes/index.html                        
       _/            
                     Work like you don't need the money.
                     Dance like no one is watching.
                     And love like you've never been hurt.
                     --Mark Twain                       
                     
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From jjc at jclark.com  Sun Dec 19 09:43:40 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX2: summary of Namespace-support arguments
References: <002501bf49f5$deb5ba10$08f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <385C9A6A.11215A65@jclark.com>


Rick Jelliffe wrote:
> 
>  From: James Clark <jjc@jclark.com>
> 
>  >I do indeed want that, and in the past I've argued against *requiring*
> >processors to provide information about the prefix used. I've become a
> >lot less negative about this prefix information recently and I think
> >it's better for an API to provide it
> 
> Isn't it a requirement for XSL?

No.  XSL requires only information about what namespace declarations are
in scope, not what prefix was actually used.

James




From jjc at jclark.com  Sun Dec 19 09:43:16 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX2: Namespace proposal
References: <3.0.32.19991218163458.01500100@pop.intergate.ca>
Message-ID: <385C9B42.ACE9D1B2@jclark.com>

Tim Bray wrote:

> >d. There will probably also be a NamespaceHandler to report the scope
> >   of Namespace declarations, but it may be possible to roll that into
> >   LexicalHandler.
> 
> Perhaps.  But why not just have a couple of new methods on the parser:
> 
>  String getURIForPrefix(String prefix)
>  String[] getPrefixesForURI(String uri)
> 
> yeah, you'll have to stash a bit of state, but not very much.
> 
> James, that would give you what you need for XPath and so on, right?

No.  I need to be able to build a tree that includes all the
namespace scoping information.  Something like:

interface NamespaceDeclHandler {
  void startNamespaceDecl(String prefix, String namespace);
  void endNamespaceDecl();
}

with calls to start/endNamespaceDecl outside the calls to the
corresponding start/endElement would do it.
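
As a sketch of how an application might consume such calls: the NamespaceDeclHandler interface is the one given above, while the NamespaceScopes class and its lookup method are hypothetical names introduced here. Since endNamespaceDecl() carries no prefix, the sketch assumes declarations end in the reverse of the order in which they started.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Interface as given in the message.
interface NamespaceDeclHandler {
    void startNamespaceDecl(String prefix, String namespace);
    void endNamespaceDecl();
}

// Sketch: keeps bindings on a stack so the innermost declaration wins.
class NamespaceScopes implements NamespaceDeclHandler {
    private final Deque<String[]> bindings = new ArrayDeque<>();

    @Override
    public void startNamespaceDecl(String prefix, String namespace) {
        bindings.push(new String[] { prefix, namespace });
    }

    @Override
    public void endNamespaceDecl() {
        // No prefix argument, so this assumes strictly nested start/end calls.
        bindings.pop();
    }

    // Returns the URI currently bound to the prefix, or null if unbound.
    String lookup(String prefix) {
        for (String[] binding : bindings) { // most recent binding first
            if (binding[0].equals(prefix)) {
                return binding[1];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        NamespaceScopes scopes = new NamespaceScopes();
        scopes.startNamespaceDecl("x", "http://example.org/outer");
        scopes.startNamespaceDecl("x", "http://example.org/inner");
        System.out.println(scopes.lookup("x")); // inner binding is in scope
        scopes.endNamespaceDecl();
        System.out.println(scopes.lookup("x")); // outer binding is back
    }
}
```

A tree builder could snapshot these bindings at each startElement to record what is in scope for that node.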

James




From jjc at jclark.com  Sun Dec 19 09:43:12 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain>
Message-ID: <385CA792.A218D9E5@jclark.com>

Here's an alternative proposal:

interface DocumentHandler {
  void startElement(String namespaceURI, String localName,
                    AttributeList attList);
  void endElement(String namespaceURI, String localName);
  // ..
}

interface NamespaceDeclHandler {
  void startNamespaceDecl(String prefix, String namespaceURI);
  void endNamespaceDecl(); // prefix argument?
}

interface AttributeList {
  String getValue(int i);
  String getLocalName(int i);
  String getNamespaceURI(int i);
  String getType(int i);
  String getType(String namespaceURI, String localName);
  String getValue(String namespaceURI, String localName);
}

I propose that there is a NamespaceProcessingLevel feature that
applications can set on the parser with 4 possible values:

Cooked (the default)
IncludePrefixes
NamespaceDeclarationsAsAttributes
Raw

When the level is Cooked, you get a fully cooked
post-namespace-processing only view: namespace declarations don't
appear in the AttributeList; localName contains the local name and
namespace URI contains the namespace URI.

When the level is IncludePrefixes, the parser instead of reporting
local names, will report QNames.  Specifically the second argument in
startElement and the value returned by AttributeList.getLocalName(int)
will include any prefix as well as the localName.
AttributeList.getValue(String, String) and
AttributeList.getType(String, String) are affected as follows: the
localName may have a prefix; if the namespaceURI is non-null, then the
prefix is ignored; if the namespaceURI is null, then any prefix is
used to identify the attribute.

When the level is NamespaceDeclarationsAsAttributes, as well as the
IncludePrefixes changes, namespace declarations are included in the
AttributeList. The namespaceURI for namespace declarations is always
treated as null.  (This level is intended for applications like the
DOM Level 2 that want to support simultaneously both a XML 1.0 view
and an XML Namespaces view of a document; applications like XSLT and
XPath that just need namespace scoping information would instead use
the NamespaceDeclHandler.)

When the level is Raw, then no namespace processing
is done at all.  This is the same as the
NamespaceDeclarationsAsAttributes level, except that namespace URIs
will always be null.

The thinking behind this proposal is that there are three classes of
use:

1. Those who want a fully cooked namespace view

2. Those who want a fully cooked namespace view, but also want to know
   the actual prefix used.

3. Those who don't want any namespace processing at all (perhaps
   because they want to roll their own)

My view is that class 1 is in the long-term the most important class
and the interface should be optimized for them; therefore it's
acceptable to subject class 2 and 3 users to a little inconvenience in
order to buy convenience for class 1.

With this proposal, class 1 users get an ideal interface. Class 2
users have the inconvenience that they need to split up a QName into a
local name and a prefix whenever they have a non-null namespace URI;
this is easy and cheap to do, and SAX can provide convenience
functions; they also have to deal with a slightly misleading name for
AttributeList.getLocalName. Class 3 users have the inconvenience that
they have to pass a null namespaceURI argument to some methods on
AttributeList, and they have to ignore a null namespaceURI argument on
startElement and endElement.
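
The QName splitting that class 2 users would need might look like the sketch below. The class name QNames and both method names are hypothetical; nothing here is part of any SAX release.

```java
// Sketch of possible convenience functions for splitting a QName.
class QNames {

    // Returns the prefix of a QName, or "" when it has none.
    static String prefixOf(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? "" : qName.substring(0, colon);
    }

    // Returns the local part of a QName (the whole name if unprefixed).
    static String localNameOf(String qName) {
        int colon = qName.indexOf(':');
        return colon < 0 ? qName : qName.substring(colon + 1);
    }

    public static void main(String[] args) {
        System.out.println(prefixOf("xsl:template"));
        System.out.println(localNameOf("xsl:template"));
    }
}
```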

SAX processors would be required to support at least one of the
Cooked level and the Raw level, and could
optionally support as many others as they wished.

James




From rwaldin at pacbell.net  Sun Dec 19 10:50:24 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:40 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain> <385CA792.A218D9E5@jclark.com>
Message-ID: <385CB8C4.6B2B539E@pacbell.net>

James Clark wrote:
> Here's an alternative proposal:

My comments:

1) Nice!  

2) Please add the prefix param to endNamespaceDecl().  This makes application
management of prefix stacks somewhat simpler.

3) Since the localName param and getLocalName methods play different roles for
different NamespaceProcessingLevels, you may want to rename them using the more
generic name "name", as in:

DocumentHandler.startElement(String namespaceURI, String name, AttributeList attList);

and

AttributeList.getName(int i);

-Ray



From rwaldin at pacbell.net  Sun Dec 19 10:52:38 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:40 2004
Subject: Attributes, namespace partitions, and schemas
Message-ID: <385CB961.D5DB56B4@pacbell.net>

Given the following XML:

<a xmlns="uri1" xmlns:b="uri1" c="d" b:c="e"/>

This seems to be legal according to the namespaces rec (see section 5.3) but I
don't see how it can make any sense to a schema aware application.  

Is it possible for a schema to indicate that the above is allowed or disallowed? 

Should SAX2 report two attributes with the same localName and namespaceURI?

Which value for c should an application honor if it encounters such a case?  
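
For reference, the Namespaces recommendation says a default namespace declaration does not apply to attribute names, so the unprefixed c is in no namespace while b:c is in "uri1". The sketch below shows one way to inspect what a namespace-aware parser reports. It assumes the SAX2 API as later bundled with the JDK, not any of the proposals in this thread, and the class name AttributeNamespaces is made up.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: lists each attribute as {namespaceURI}localName=value.
class AttributeNamespaces {

    static List<String> describe(String xml) {
        final List<String> out = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        for (int i = 0; i < atts.getLength(); i++) {
                            // Skip namespace declarations if they are reported.
                            if (atts.getQName(i).startsWith("xmlns")) {
                                continue;
                            }
                            out.add("{" + atts.getURI(i) + "}"
                                    + atts.getLocalName(i) + "="
                                    + atts.getValue(i));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String xml = "<a xmlns=\"uri1\" xmlns:b=\"uri1\" c=\"d\" b:c=\"e\"/>";
        for (String line : describe(xml)) {
            System.out.println(line);
        }
    }
}
```

So the two attributes share a local name but not a namespace URI, and an application can tell them apart by the (URI, local name) pair.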

-Ray



From peter at ursus.demon.co.uk  Sun Dec 19 11:16:31 1999
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <14427.62086.25937.792412@localhost.localdomain>
Message-ID: <3.0.1.32.19991220111837.00961ac0@pop3.demon.co.uk>

At 03:45 PM 12/18/99 -0500, David Megginson wrote:
>In my last message, I summarized some of the major positions that have
>emerged on Namespace support in SAX2.  

I am deliberately not taking a position in this.

[...]
>
>Anyway, here's what I'm suggesting.
>
[... spec and example deleted ...]
>
>This is kind-of painful, I know, but please remember that SAX is meant
>to be the equivalent of a low-level device driver, not an elegant,
>high-level API (though those can be and have been built on top of it).

I would like to thank David for coming up with this proposal and I hope
that the community is able to move forward rapidly with it. Namespace
support is an all-or-nothing thing - if you have one critical tool in the
set that doesn't support namespaces, then it is likely that you won't use
namespaces.

To take CML as an example, a typical document *ought to* look something like:
<foo:bar xmlns:foo="http://foo.org">
  <xyzzy:molecule xmlns:xyzzy="http://www.xml-cml.org/cml.dtd">
    <atom ...>
    </atom>
  </xyzzy:molecule>
</foo:bar>

People are now starting to author CML documents, using the DTD and examples
that Henry and I have published. NONE of the authors, including myself, uses
namespaces because there is not a consistent toolset. So we actually see:
<html>
  ...
  <molecule>... </molecule>
</html>

and processing programs rely on the likelihood that there will be no
semantic collision with some other document.

My pressing need is to get a workable namespace-processing mechanism before
we get overtaken by well-meant but incorrect attempts to create exemplar
documents *** and code ***. I would expect almost all CML authors to
actually be using code that looks like:
	startElement(name, atts);
	if (name.equals("molecule")) ...

I expect that some may even be writing:
	if (name.equals("molecule") || name.equals("cml:molecule")) ...
which we all wish to avoid.
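
For contrast, a namespace-aware check might look like the sketch below. It assumes the SAX2 callback signature as later bundled with the JDK, which is not the proposal under discussion here; the namespace URI is the one from the example document above, and the class name MoleculeCounter is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: matches on (namespace URI, local name), never on the prefix.
class MoleculeCounter extends DefaultHandler {
    // Namespace URI taken from the example document in this message.
    static final String CML_NS = "http://www.xml-cml.org/cml.dtd";

    int molecules;

    @Override
    public void startElement(String uri, String localName, String qName,
                             Attributes atts) {
        if (CML_NS.equals(uri) && "molecule".equals(localName)) {
            molecules++;
        }
    }

    // Counts CML molecule elements in the given document.
    static int count(String xml) {
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            MoleculeCounter counter = new MoleculeCounter();
            factory.newSAXParser().parse(
                    new InputSource(new StringReader(xml)), counter);
            return counter.molecules;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String doc = "<foo:bar xmlns:foo='http://foo.org'>"
                + "<xyzzy:molecule xmlns:xyzzy='" + CML_NS + "'/>"
                + "</foo:bar>";
        System.out.println(count(doc));
    }
}
```

The same check works whether the author wrote cml:molecule, xyzzy:molecule or used a default namespace, and it ignores an unqualified molecule element that merely happens to share the name.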

One thing that I continue to urge is that things be kept simple. I have
consistently done this because only simple things can be easily implemented
in a community-wide approach. It may sound irresponsible but I think it is
more important that SAX2 gets implemented than that we get every bit
exactly right.

It is worth noting that Namespaces - which apparently involve very little
programming to implement - have taken a surprisingly long time to get
universally implemented.

What David has suggested is something that I can use and will use if it
gets taken forward. 

	P.




From david at megginson.com  Sun Dec 19 12:20:05 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <385C7716.6F394D37@jclark.com>
References: <14427.62086.25937.792412@localhost.localdomain>
	<385C7716.6F394D37@jclark.com>
Message-ID: <14428.52513.463513.610128@localhost.localdomain>

James Clark writes:

 > The one scenario that bothers me with this is:
 > 
 > - the parser always supplies both XML 1.0 names and cooked namespace
 > declarations
 > 
 > - the application is namespace-aware and only interested in the cooked
 > namespace information
 > 
 > - the application iterates over all attributes using getLocalName(int).

Yes, this is my biggest misgiving as well -- it's the only place where 
a Namespace-aware application is at a disadvantage.

 > Indeed.  Although I think your proposal is quite reasonable, and
 > although I don't have anything better to propose right now, I can't say
 > I feel altogether comfortable with it.  I think we should continue to
 > explore other possible solutions.  If it's possible to have a completely
 > new org.xml.sax2 package, then more radical solutions are possible.  For
 > example, I think there are good arguments for moving to a 
 > 
 > interface DocumentHandler {
 >   void startElement(StartElementEvent event)
 >   void endElement(EndElementEvent event)
 >   ...
 > }
 > 
 > style interface.  This makes it much easier to evolve the interface in
 > the future.  For example, a SAX3 that added support for XML Schemas
 > could just add additional methods to StartElementEvent and
 > EndElementEvent and thus remain compatible with SAX2.

It's worth considering this approach, but it does create an awful lot
of classes.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Sun Dec 19 12:40:33 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX Question
In-Reply-To: "Abhishek Srivastava"'s message of "Sun, 19 Dec 1999 13:07:00 +0530"
References: <004201bf49f3$d74e3290$252f0a0f@india.hp.com>
Message-ID: <m3k8mbnm6l.fsf@localhost.localdomain>

"Abhishek Srivastava" <abisheks@india.hp.com> writes:

> Can someone explain that why does SAX have two call back functions
> one for Tag Name (startElement ) and one for the tag Value (
> Characters ) Instead, why not have one function that calls back
> the processing application with a name value pair. ( for elements
> as well as attributes )
> 
> This makes it easier to process the data as u get the name of the
> tag and it's value in one shot.

Consider the following document:

  <div>
  <h1>Section</h1>

  <p>This is the <em>first</em> section.</p>
  </div>

Without the start/end callbacks, you wouldn't be able to figure out
the nesting of the document, or even the real order of the character
data:

  element("div", "\n\n\n\n")
  element("h1", "Section")
  element("p", "This is the  section.")
  element("em", "first")
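
For comparison, here is a rough sketch of what the separate callbacks report for the same document.  It uses the SAX handler API as bundled with current Java (org.xml.sax.helpers.DefaultHandler via JAXP), which is later than the SAX 1.0 interface discussed here; EventOrderDemo is a hypothetical class name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records SAX callbacks in the order they arrive.
class EventOrderDemo {
    static List<String> events(String xml) {
        final List<String> log = new ArrayList<String>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local,
                                     String qName, Attributes atts) {
                log.add("start " + qName);
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                log.add("end " + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                // whitespace-only chunks are dropped to keep the log short
                String text = new String(ch, start, length).trim();
                if (text.length() > 0) log.add("chars " + text);
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<div>\n<h1>Section</h1>\n\n"
                   + "<p>This is the <em>first</em> section.</p>\n</div>";
        for (String e : events(xml)) System.out.println(e);
    }
}
```

Because "start em" and "end em" arrive between the two pieces of text belonging to the paragraph, the application can rebuild both the nesting and the order of the character data.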


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From simonstl at simonstl.com  Sun Dec 19 13:50:48 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <14427.62086.25937.792412@localhost.localdomain>
Message-ID: <199912191350.IAA09542@hesketh.net>

At 03:45 PM 12/18/99 -0500, David Megginson wrote:
>That said, I think that the best solution in SAX2 will be to allow
>either or both Namespace-qualified and raw XML names.  People seem to
>want both, and although I *strongly* disagree, DOM2 has decided to
>provide what should be irrelevant information about the original
>prefix used.

Despite David's irrational prejudice against prefixes (I'd like to be able
to round-trip documents through SAX), I think the proposal here does the
job nicely.

It'll satisfy the bizarre demands of the XLink folk that we use their
namespace on attributes any place we create links in our own documents,
without placing undue stress on programmers.  I think allowing for both
styles will make it much easier to handle the transition between eras that
we've been facing for nearly a year now, and might actually encourage more
people to use namespaces who hadn't been thus far.

Elegant?  No, not completely.  But remember that "worse is better" is important!

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From jjc at jclark.com  Sun Dec 19 14:15:58 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain> <385CA792.A218D9E5@jclark.com>
Message-ID: <385CE80B.CCCA027A@jclark.com>

James Clark wrote:

> I propose that there is a NamespaceProcessingLevel feature that
> applications can set on the parser with 4 possible values:
> 
> Cooked (the default)
> IncludePrefixes
> NamespaceDeclarationsAsAttributes
> Raw

A variation on this would be to drop the second level, so that there
would be three levels corresponding to:

- a pure namespaces view
- a simultaneous namespaces and XML 1.0 view
- a pure XML 1.0 view
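
As a rough illustration of how such views differ in practice, the sketch below uses the two feature flags that SAX2 as eventually published defines (http://xml.org/sax/features/namespaces and http://xml.org/sax/features/namespace-prefixes), not the NamespaceProcessingLevel values proposed above.  How those flags line up with the proposed levels is an assumption of this example, and NamespaceViewsDemo is a hypothetical class name.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class NamespaceViewsDemo {
    /**
     * Parses xml with namespace processing on and returns how many
     * attributes startElement reports for the root element.
     * reportPrefixes=false: declarations are hidden (namespaces-only view).
     * reportPrefixes=true:  xmlns declarations also show up as attributes.
     */
    static int attributeCount(String xml, boolean reportPrefixes) {
        final int[] count = { -1 };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature(
                "http://xml.org/sax/features/namespace-prefixes",
                reportPrefixes);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local,
                                             String qName, Attributes atts) {
                        if (count[0] < 0) count[0] = atts.getLength();
                    }
                });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return count[0];
    }

    public static void main(String[] args) {
        String xml = "<a xmlns:p='u' p:x='1'/>";
        System.out.println("namespaces only:    " + attributeCount(xml, false));
        System.out.println("with declarations:  " + attributeCount(xml, true));
    }
}
```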

James




From ricko at allette.com.au  Sun Dec 19 16:24:15 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:41 2004
Subject: Beta schematron validator of XML Schema schemas available (was Re: XML Schema: New Public Working drafts available)
Message-ID: <001e01bf4a40$d83ad580$19f96d8c@NT.JELLIFFE.COM.AU>

 From: Henry S. Thompson <ht@cogsci.ed.ac.uk>

>On behalf of the XML Schema Working Group we are pleased to announce
>the release of new Public Working Drafts for XML Schema: Structures [1]
>and XML Schema: Datatypes [2].

I have made a Schematron that validates XML Schemas.  The beta is
available under
    http://www.ascc.net/xml/resource/schematron/schematron.html

Note that this just validates XML Schemas, it does not use XML schemas
to validate other documents.

The XML Schemas language has a lot of structural
constraints that a grammar-based system cannot check: for example,
there are many instances where the value of an attribute restricts
the content model, and cases such as checking that the top-level
element of an included schema (i.e., an external file) has the correct
element type. (The fact that Schematron can handle these
better than XML Schemas is no big deal: it does show that XML
Schemas is targeted at databases and Java data rather than
being particularly optimised for validating W3C specifications.)

To use it, you will need James Clark's XT and a Schematron
implementation: I suggest schematron-report from the same
site, which gives a nice display of messages linked to the
context element where the error is in effect.

A few provisos: namespaces are not supported and some sub-subtyping
errors will not be caught.

Rick Jelliffe




From martind at netfolder.com  Sun Dec 19 16:32:17 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:41 2004
Subject: Free Tool for Efficient XML Data Compression
In-Reply-To: <v04210102b481d61885dc@[192.168.1.254]>
Message-ID: <NBBBJPGDLPIHJGEHAKBACEAMEKAA.martind@netfolder.com>

Hi Elliotte

Elliotte said:
While XMill sounds interesting, I really have to take issue with this
statement.  I've seen no evidence that XML based, uncompressed file
formats are larger than the corresponding binary file formats. This
is a common fear about XML but I have not seen it borne out in my
tests. For instance, my 700K, very verbose baseball statistics
example is more than two megabytes in both FileMaker 3 and Microsoft
Excel.

Didier replies:
What??? No evidence???? Come on Elliotte, this is a joke I presume, you're
not serious ;-) so there is a :-) missing at the end of the statement.

In fact, their compression algorithm is something that the HTTP group should
look at very closely as an addition to simple zip compression (as you know, HTTP
1.1 can transport content in a compressed format) for certain MIME types
such as XML-derived domain languages.
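
To get a rough feel for how far plain gzip alone shrinks tag-heavy XML, a small sketch follows.  It uses java.util.zip, not the XMill algorithm, and the generated sample data, the element names and the class name XmlGzipDemo are made up for illustration; real documents will compress differently.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPOutputStream;

class XmlGzipDemo {
    /** Compresses data with gzip and returns the compressed bytes. */
    static byte[] gzip(byte[] data) {
        try {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            GZIPOutputStream gz = new GZIPOutputStream(out);
            gz.write(data);
            gz.close();                  // flushes the gzip trailer
            return out.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    /** Builds a verbose, repetitive sample document (invented data). */
    static String sampleXml(int players) {
        StringBuilder sb = new StringBuilder("<season>\n");
        for (int i = 0; i < players; i++) {
            sb.append("  <player><name>Player ").append(i)
              .append("</name><games>").append(100 + i % 60)
              .append("</games><homeRuns>").append(i % 40)
              .append("</homeRuns></player>\n");
        }
        return sb.append("</season>\n").toString();
    }

    public static void main(String[] args) {
        byte[] raw = sampleXml(1000).getBytes(StandardCharsets.UTF_8);
        byte[] packed = gzip(raw);
        System.out.println("raw bytes:     " + raw.length);
        System.out.println("gzipped bytes: " + packed.length);
    }
}
```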

Cheers
Didier PH Martin
----------------------------------------------
Email: martind@netfolder.com
Conferences: Web New York (http://www.mfweb.com)
Book to come soon: XML Pro published by Wrox Press
Products: http://www.netfolder.com




From rwaldin at pacbell.net  Sun Dec 19 17:02:22 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:41 2004
Subject: Attributes, namespace partitions, and schemas
References: <385CB961.D5DB56B4@pacbell.net>
Message-ID: <385D0FFB.924750E4@pacbell.net>

OK, I think I've answered my own questions here by re-reading the spec a few
times and digging through a ton of posts on this subject from earlier this year.
A little confirmation would help tho...

> Given the following XML:
> 
> <a xmlns="uri1" xmlns:b="uri1" c="d" b:c="e"/>
...
> Is it possible for a schema to indicate that the above is allowed or
> disallowed?

Yes, the unconstrained wildcard attribute specification <anyAttribute/> allows
this, but there is no way to explicitly allow a particular global attribute yet. 

> Should SAX2 report two attributes with the same localName and namespaceURI?

No. The per-element attribute (c) should have a null namespaceUri and the global
attribute (b:c) should have a namespaceUri of "uri1".

> Which value for c should an application honor if it encounters such a case?

Depends on whether the application is looking for a per-element attribute or a
global attribute.  Maybe both.

So the bottom line is, there are two "classes" of attribute: global and
per-element.  Global attributes and per-element attributes may belong to the
same namespace indirectly, but never share namespaceUris directly as per-element
attributes don't really have namespaceUris of their own.  

This also explains why "default namespaces do not apply directly to attributes".
Unqualified attributes *always* belong *indirectly* to whatever namespace their
parent element belongs to, so they can never *directly* belong to the default
(unqualified) namespace as global attributes.  
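
One way to check this reading against a parser is sketched below.  It uses the SAX2 Attributes interface as it was later published and as bundled with current Java; that API reports "no namespace" as an empty string rather than null.  AttributeNamespaceDemo is a hypothetical class name.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeNamespaceDemo {
    /** Maps each attribute value of the root element to its namespace URI. */
    static Map<String, String> urisByValue(String xml) {
        final Map<String, String> result = new LinkedHashMap<String, String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local,
                                             String qName, Attributes atts) {
                        for (int i = 0; i < atts.getLength(); i++) {
                            result.put(atts.getValue(i), atts.getURI(i));
                        }
                    }
                });
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<a xmlns=\"uri1\" xmlns:b=\"uri1\" c=\"d\" b:c=\"e\"/>";
        // the unprefixed attribute (value "d") does not pick up the
        // default namespace; the prefixed one (value "e") is in uri1
        System.out.println(urisByValue(xml));
    }
}
```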

Is this even close to correct?  TIA...

-Ray



From clark.evans at manhattanproject.com  Sun Dec 19 17:33:22 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <385C9B42.ACE9D1B2@jclark.com>
Message-ID: <Pine.LNX.4.10.9912190016000.11945-100000@cauchy.clarkevans.com>

This looks very similar to what Ray had 
proposed a few days ago. 

On Sun, 19 Dec 1999, James Clark wrote:
> No.  I need to be able to build a tree that includes all the
> namespace scoping information.  Something like:
> 
> interface NamespaceDeclHandler {
>   void startNamespaceDecl(String prefix, String namespace);
>   void endNamespaceDecl();
> }
> 
> with calls to start/endNamespaceDecl outside the calls to the
> corresponding start/endElement would do it.

In this case, perhaps leave startElement alone:

  void startElement(String name, AttributeList atts);

This is more in the spirit of SAX, a simple low-level
interface where the application programmer has to 
do much of the work.  To make things easier for the
application programmer, a helper could be supplied:

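package org.sax.helper;

import java.util.Stack;

class NamespaceDeclHandlerImpl implements NamespaceDeclHandler
{
  private Stack stack;
  private static class Pair {
    String prefix;
    String namespace;
    Pair(String prefix, String namespace) {
      this.prefix=prefix; this.namespace=namespace;
    }
  }
  public NamespaceDeclHandlerImpl() {
    stack = new Stack();
  }
  public void startNamespaceDecl(String prefix, String namespace) {
     stack.push(new Pair(prefix,namespace));
  }
  public void endNamespaceDecl() {
     stack.pop();
  }
  // returns the namespace bound to the prefix of name, or null if none;
  // assumes the default namespace is declared with the empty prefix ""
  public String getNamespace(String name) {
    int colon = name.indexOf(':');
    String prefix = (colon == -1) ? "" : name.substring(0, colon);
    // search from the innermost declaration outwards
    for (int i = stack.size() - 1; i >= 0; i--) {
      Pair p = (Pair) stack.elementAt(i);
      if (p.prefix.equals(prefix)) return p.namespace;
    }
    return null;
  }
}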




From cbullard at hiwaay.net  Sun Dec 19 17:40:50 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:41 2004
Subject: Musing over Namespaces
References: <Pine.LNX.4.10.9912190130280.6950-100000@mail.q2.net>
Message-ID: <385D176B.4C46@hiwaay.net>

Arjun Ray wrote:
> 
> On Sat, 18 Dec 1999, Len Bullard wrote:
> > Dan Brickley wrote:
> >
> > > By defining schema languages in instance syntax, we implicitly
> > > promote the idea that there will be some big payoff for doing so
> > > (otherwise, lets stick with DTDs).
> 
> How about improving DTDs? (Just a thought.)

I guess it could be done but it seems to me this should come 
from the ISO WG responsible for SGML.  However, the issue 
of having to have separate facilities for DTDs is still there 
and being able to use the document framework to get values 
from the schema is useful.  IOW, as I note below, they have 
to be equally powerful in what they express and enable.

> > There is some payoff:
> >
> > 1.  Political.  XML can finally dissolve the XML to SGML parentage.
> 
> Done deal already: XML is a W3C trademark, and AFAICT, the XML 1.0
> Recommendation does not reference SGML (ISO8879+TCs) normatively.
 
Spilt milk.  Wish it weren't.

> > 2.  Technically.  It is a stronger schema model.

> Unfortunately, it's now politically correct to break out the garlic and
> crosses at any mention of HyTime - saves having to ask how much of it is
> being reinvented.

For sure, but hey, some of us know who did what when and which names 
go on what inventions.  When someone raises that garlic and cross I 
have to laugh.  Superstition and hero worship... so much for scholarship 
and sound technical judgement.  Let it go.  The point is, they haven't 
delivered on their promises and human memory is not as short as 
Internet time.

> Not there as a "primitive": but it seems possible to have a user-defined
> type with list-like characteristics using "facets".  The data-typing in
> XSchema seems heavily influenced by (R)DBMS-think.

Yes.  It has been noted elsewhere which is why the suggestion that we 
may want to do some things differently in VRML has come up from Sony. 
They suggest we recast PROTOs as VRML schemas.  There is merit to 
the suggestion.  Having done a lot of relational work the last three 
years, I can see where that direction makes sense to RDBMS thinkers.  
To those doing object models, it is a bit less sensible.

> Isn't *some* (form of) schema indispensable?  The real issue seems to be
> the content of schemas (i.e. what they provide for in the way of asserted
> and verifiable constraints) rather than the syntax, although sheer
> clumsiness of instance-syntax could become an issue too.

I agree on both points.  I can live with the clumsiness of the instance 
syntax to enjoy the benefits of the DOM and event-oriented parsers, but 
extensibility was the promise of XML and that is what has to be there
for it to be useful.  If that is clumsy, we have a bigger problem.  Again, 
think Record of Authority.  Is it the schema or the runtime engine?

> I'd prefer a useful syntax to a convenient one, where convenience is seen
> through the prism of tools at hand.  Not everything need have to be a nail
> just because we have a hammer.

That would suggest that the DTD HAS to be as powerful as the schema. 
For that to happen, action from ISO is needed.  

Good to hear from you, Arjun!

len




From tbray at textuality.com  Sun Dec 19 17:55:01 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:41 2004
Subject: SAX2: Namespace proposal
Message-ID: <3.0.32.19991219095330.014f9a50@pop.intergate.ca>

At 08:50 AM 12/19/99 -0500, Simon St.Laurent wrote:
>It'll satisfy the bizarre demands of the XLink folk that we use their
>namespace on attributes any place we create links in our own documents,

What's bizarre? -Tim



From tpassin at idsonline.com  Sun Dec 19 18:54:21 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:41 2004
Subject: Free Tool for Efficient XML Data Compression
References: <8525684C.002A524E.00@aardvark.inso.com>
Message-ID: <004701bf4a52$fd4fc3a0$172a08d1@tomshp>


Philip Boutros
>
>
> Thomas B. Passin wrote:
>
> > I also took a 2-column Word97 document - 63.5K -
> > and opened it with Abiword (www.abisource.com)
> > then saved it.  Abiword uses an XML file format as
> > its native format.  Abiword XML file size: 48.8K.
>
> While I would love to chime in on the whole compressed XML discussion (my
> guess is word dictionaries and skip lists should slaughter gzip in terms of
> compression size) I would like to address this statement in particular.
>
> 1.
> Comparing Word97's file size to that of the current version of Abiword is a
> ludicrous exercise. I am very familiar with the Microsoft Word file format
> (no, I don't work for Microsoft and never have) and while it contains a
> number of inefficiencies and chunks of legacy garbage, given how much it
> encodes it is reasonably efficient for large documents. I can name at least
> a hundred features (styles, page layout, frames, borders, backgrounds,
> graphics, fields, properties, etc.) that Word97 must deal with in its file
> format that Abiword's format does not address. In fact, given how little
> Abiword encodes, I was surprised that Abiword wasn't 10 times as efficient.
> See #2 for a tirade about that.
>
<snip/>
Well, of course I know Word documents include a ton of stuff that, say,
Abiword files don't.  And this particular file doesn't even have any VBA
macros of my own in it. And I'm not arguing for Abiword's file format,
either.  In this case, though, my document doesn't need the rest of that
stuff Word includes.  So doing this conversion gave me some rough way to
compare sizes when the two documents were typical of the type I often use.
For an XML document, you could have argued that some other format doesn't
need end tags so of course XML would be bigger.  That's not the point.  The
point is that - I think it will turn out this way - for many actual cases,
the supposed size disadvantage of an XML document  will be relatively small
or non-existent.

Of course, the XML standard was developed under the guideline that
"terseness ... is of minimal importance", so if file size alone is going to
be the driver, XML might not be a favorable candidate.

Tom Passin




From ricko at allette.com.au  Sun Dec 19 20:30:57 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:41 2004
Subject: Musing over Namespaces
Message-ID: <001201bf4a63$5f723080$60f96d8c@NT.JELLIFFE.COM.AU>

 From: Arjun Ray <aray@q2.net>


>On Sat, 18 Dec 1999, Len Bullard wrote:
>> Dan Brickley wrote:
>>
>> > By defining schema languages in instance syntax, we implicitly
>> > promote the idea that there will be some big payoff for doing so
>> > (otherwise, lets stick with DTDs).
>
>How about improving DTDs? (Just a thought.)

Also, there are mixed strategies that are worth considering too:
in particular, regex syntax (i.e., (),|+* ) is very convenient.  I think
it would be great for XML Schema to allow a simplified syntax where
we could go
    <element name="boy" type="(frog+, snail+, puppyDogTail+)">
XSchemas and XLink and RDF include simplified syntax, so there
is precedent.

ISO has a correction to SGML in the wings to allow alternative
schema syntaxes to DTDs: I think it has been waiting for 2 years.
I think the problem with extending DTDs is that you have to create
new declarations (unless you use PIs): also, DTDs  are much more
aimed at parsing while what is needed is at a different level:
structures, datatypes, semantics.

>> There is some payoff:
>>
>> 1.  Political.  XML can finally dissolve the XML to SGML parentage.
>
>Done deal already: XML is a W3C trademark, and AFAICT, the XML 1.0
>Recommendation does not reference SGML (ISO8879+TCs) normatively.

But that only means that you don't need to read ISO 8879 in order to use
the XML spec. SGML is referred to in the abstract and at the start.
XML 1.0 is also referred to by ISO 8879 Annex L.

>> 2.  Technically.  It is a stronger schema model.
>
>That's the hope.  The latest XSchema stuff has plenty of changes.

The new draft needs a lot of scrutiny to test that it *is* a stronger
model.  Certainly, it approaches schemas from a more abstract perspective
rather than the nuts and bolts approach of DTDs with PEs.

>Isn't *some* (form of) schema indispensable?  The real issue seems to be
>the content of schemas (i.e. what they provide for in the way of asserted
>and verifiable constraints) rather than the syntax, although sheer
>clumsiness of instance-syntax could become an issue too.

There is also the issue that different classes of languages have different
families of constraints.  XML Schemas look like being especially good
for relational database interaction, forms, fun with keys, commerce, and
serializing (single inheritance?) objects.

But we need to test it.  I think we may have XML Schemas with us for
quite a few years, so it is in every developer's interest to go through it
and objectively think "does this solve my problems" or at least "does this
prevent solution of my problems".  For example, are you happy that
XML Schemas make infoset contributions?  That means that processing
software must be "schema-aware" before using the data.  Should there
be mechanisms in place by which a document can say "I use infoset
contributions: if you don't have a full XML processor, don't accept me!"?

Or, what should the criteria for validity be: structures, structures +
datatypes, structures + datatypes + encoding-checking?  Or should it implement
an ANY like XML DTDs (any element that is defined) or like WF XML
(accept anything)?

There are hundreds of these questions that XML Schemas 1.0 is
having to face, and hundreds more that must be deferred till later.
The new drafts are said to be "feature complete", so this is a good time
to start reading and thinking "could I actually use this thing?"

Rick Jelliffe




From stele at fxtech.com  Sun Dec 19 21:30:08 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:42 2004
Subject: pulling with expat
References: <Pine.LNX.4.10.9912161512310.20127-100000@cauchy.clarkevans.com> <385A4785.93FDC75E@fxtech.com> <385B85EC.4BD33B82@fxtech.com>
Message-ID: <385D4E8B.CA668281@fxtech.com>

As promised I decided to take another stab at implementing my streaming
pull-style parser over expat. Either I haven't had enough strong drink
yet for it to make sense, or it's not possible. If you'll recall my
design for the hierarchical nature of the element handlers, I want to be
able to nest my parsing, just like an object hierarchy:

(in XML)

	<Parent>
		<Child>
		</Child>
	</Parent>

(in class form)

	class Child
	{
	};

	class Parent
	{
		Child *child;
	};

(in parsing form)

	encounter Parent element: call Parent::Parse
	Parent::Parse - install new handlers and look for Child element
		(here is the key!) DO NOT RETURN UNTIL CHILD HAS BEEN PARSED!
	encounter Child element: call Child::Parse
	encounter </Parent>
		------> RETURN BACK TO Parent::Parse <--------
	clean up

(in code)

void Parent::Parse(Element &elem)
{
	Handler handlers[] = {
		Handler("Child", Child::Parse),
		Handler::END
	};
	elem.Parse(handlers, this);
	// now do some stuff before returning control
}

The major design point is that you enter an element handler for a class, that
class does all of its parsing for any sub-elements, and control is
returned back to the handler, allowing it to do some cleanup or validation
or whatever the code wants to do.
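
For readers who want to see the control flow concretely, here is a minimal sketch of the same nesting written against a pull API.  It uses Java's javax.xml.stream (StAX), which is a later API and not expat, and it assumes that Child has no element children; PullParseDemo and its method names are hypothetical.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullParseDemo {
    // Called with the reader positioned on <Parent>; does not return
    // until the matching </Parent> has been consumed.
    static void parseParent(XMLStreamReader in, List<String> trace)
            throws XMLStreamException {
        trace.add("enter Parent");
        while (in.next() != XMLStreamConstants.END_ELEMENT) {
            if (in.isStartElement() && in.getLocalName().equals("Child")) {
                parseChild(in, trace);   // returns positioned on </Child>
            }
        }
        // back in the parent: cleanup or validation can happen here
        trace.add("leave Parent");
    }

    // Assumes Child contains only text, so the first end tag is </Child>.
    static void parseChild(XMLStreamReader in, List<String> trace)
            throws XMLStreamException {
        trace.add("enter Child");
        while (in.next() != XMLStreamConstants.END_ELEMENT) {
            // skip character data and comments
        }
        trace.add("leave Child");
    }

    static List<String> run(String xml) {
        List<String> trace = new ArrayList<String>();
        try {
            XMLStreamReader in = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (in.hasNext()) {
                if (in.next() == XMLStreamConstants.START_ELEMENT
                        && in.getLocalName().equals("Parent")) {
                    parseParent(in, trace);
                }
            }
            in.close();
        } catch (XMLStreamException e) {
            throw new RuntimeException(e);
        }
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(run("<Parent>\n  <Child>\n  </Child>\n</Parent>"));
    }
}
```

The point of the sketch is only that the application, not the parser, owns the loop, which is what lets parseParent keep its own local state across the whole element.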

So far, I just haven't been able to find a way to make expat work this
way. The only way I can see doing it would be to have expat build a tree
of all encountered elements, then process the tree using my interface,
but that goes down the DOM road of excessive memory usage.

I'm hoping someone can point out the obvious thing I'm missing here... 

--
Paul Miller - stele@fxtech.com



From clark.evans at manhattanproject.com  Sun Dec 19 22:43:08 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:42 2004
Subject: pulling with expat
In-Reply-To: <385D4E8B.CA668281@fxtech.com>
Message-ID: <Pine.LNX.4.10.9912190518040.15596-100000@cauchy.clarkevans.com>


Does this code fragment help out?  It is far from
perfect, however, the "pull" based code would call 
getNextEvent() to return the next XML event...

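    #include <cstdio>
    #include <queue>
    #include <stdexcept>
    #include <string>
    #include <expat.h>

    struct Event { std::string name; /* .... */ };

    XML_Parser        parser  = NULL;
    FILE             *fileptr = NULL;
    std::queue<Event> events;  // FIFO, so events come back in document order

    void StartElementHandler(void *userData,
                             const XML_Char *name,
                             const XML_Char **atts)
    {
        // queue the event for getNextEvent() to hand out
        // (assumes the default build, where XML_Char is char)
        Event e;
        e.name = name;
        events.push(e);
    }

    void initialize(const char *filename) {
        fileptr = fopen(filename, "rb");
        if (NULL == fileptr)
            throw std::runtime_error("Cannot open file");
        parser = XML_ParserCreate(NULL);
        XML_SetStartElementHandler(parser, StartElementHandler);
    }

    // Returns true with the next event in `out`, or false at EOF.
    bool getNextEvent(Event &out) {
        if (NULL == parser)
            throw std::runtime_error("Invalid Function Sequence");
        while (events.empty() && NULL != fileptr)
        {
            char buff[4096];
            // feed expat the next chunk of the file; the handlers
            // queue any events found in it
            size_t len = fread(buff, 1, sizeof buff, fileptr);
            int isFinal = (len < sizeof buff);
            XML_Parse(parser, buff, (int) len, isFinal);
            if (isFinal) {
                fclose(fileptr);
                fileptr = NULL;
            }
        }
        if (!events.empty()) {
            out = events.front();
            events.pop();
            return true;
        }
        // if you get this far, we are at EOF
        XML_ParserFree(parser);
        parser = NULL;
        return false;
    }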




From rwaldin at pacbell.net  Sun Dec 19 23:32:30 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace proposal
References: <Pine.LNX.4.10.9912190016000.11945-100000@cauchy.clarkevans.com>
Message-ID: <385D6B9E.82971483@pacbell.net>

"Clark C. Evans" wrote:
> 
> This looks very similar to what Ray had
> proposed a few days ago.

A quick visit back to the xml-dev archives shows that this idea was around
*long* before I ever had it!  :)

John Cowan (cowan@locke.ccil.org) Tue, 11 Aug 1998 14:12:38 -0400:
>> 1. A new optional application-side interface to allow applications to
>> intercept, record, and decode namespace scope events:
>>
>>public interface NamespaceHandler {
>>public void startNamespaceScope (String prefix, String URI,
>>NamespaceResolver resolver);
>>public void endNamespaceScope (String prefix);
>>}

> In this case, perhaps leave startElement alone:
> 
>   void startElement(String name, AttributeList atts);

Yes, this seems to be what James' proposal gives us with the  
NamespaceDeclarationsAsAttributes level, with the difference that there will be
an extra namespaceUri param to some methods which should always be null, and the
AttributeList parameter to startElement will contain namespace declarations as
well as normal attributes.  Minor inconveniences, really.

Here's a helper class to record namespace uri to prefix mappings, query those
mappings, and split qnames.  

-Ray

NamespacePrefixMap.java
-----------------------
import java.util.Hashtable;
import java.util.Enumeration;
import java.util.Stack;
import java.util.EmptyStackException;

public class NamespacePrefixMap {
    private Hashtable prefixStacks;

    /** constructs a new NamespacePrefixMap and initializes it with the
        built-in mapping of the prefix "xml" according to REC-xml-names */
    public NamespacePrefixMap()
    {
        prefixStacks = new Hashtable();
        add("xml", "http://www.w3.org/XML/1998/namespace");
    }

    /** adds a new prefix to namespace uri mapping */
    public void add(String prefix, String nsUri) {
        Stack prefixStack;
        if((prefixStack = (Stack) prefixStacks.get(prefix)) == null) {
            // create a prefixStack for first use of each prefix
            prefixStack = new Stack();
            prefixStacks.put(prefix, prefixStack);
        }
        prefixStack.push(nsUri);
    }

    /** removes a prefix to namespace uri mapping */
    public void remove(String prefix) {
        Stack prefixStack;
        if((prefixStack = (Stack) prefixStacks.get(prefix)) != null)
            prefixStack.pop();
    }

    /** returns the namespace uri currently bound to a given prefix or null
        if there is none */
    public String getNamespaceUri(String prefix) {
        Stack prefixStack;
        try {
            if((prefixStack = (Stack) prefixStacks.get(prefix)) != null) {
                String nsUri = (String) prefixStack.peek();

                // check to see if the default namespace is "undeclared"
                if(prefix.length() == 0 && nsUri.length() == 0) return null;

                return nsUri;
            }
        } catch( EmptyStackException ese ) {}
        return null;
    }

    /** returns some prefix currently bound to a given namespace uri or null
        if there is none */
    public String getPrefix(String nsUri) {
        String prefix;
        Stack prefixStack;

        // for each prefix
        for(Enumeration p = getPrefixes(); p.hasMoreElements(); ) {
            prefix = (String) p.nextElement();
            prefixStack = (Stack) prefixStacks.get(prefix);

            // is prefix currently bound to nsUri?
            if(prefixStack.search(nsUri) == 1) return prefix;
        }
        return null;
    }

    /** returns an Enumeration of the prefixes currently declared */
    public Enumeration getPrefixes() {
        return prefixStacks.keys();
    }

    /** indices into String array returned by splitQName() */
    public static final int LOCAL_NAME = 0;
    public static final int PREFIX = 1;
    public static final int NAMESPACE_URI = 2;

    /** Splits qname into {"localName", "prefix", "namespaceUri"} */
    public String[] splitQName(String qname) {
        String[] results = new String[3];
        int colon = qname.indexOf(':');
        if(colon == -1) {
            results[PREFIX] = "";
            results[LOCAL_NAME] = qname;
        } else {
            results[PREFIX] = qname.substring(0, colon);
            results[LOCAL_NAME] = qname.substring(colon + 1);
        }
        results[NAMESPACE_URI] = getNamespaceUri(results[PREFIX]);
        return results;
    }

    /** determines if name is a QName */
    public boolean isQName(String name) {
        return (name.indexOf(':') != -1);
    }

    /** determines if name equals "xmlns" or begins with "xmlns:" */
    public boolean isNamespaceDeclaration(String name) {
        return (name.equals("xmlns") || name.startsWith("xmlns:"));
    }
}



From stefan.haustein at trantor.de  Mon Dec 20 00:52:41 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain> <385C7716.6F394D37@jclark.com>
Message-ID: <385D7DB8.3EFDC6DE@trantor.de>

> For example, I think there are good arguments for moving to a
> 
> interface DocumentHandler {
>   void startElement(StartElementEvent event)
>   void endElement(EndElementEvent event)
>   ...
> }

I would also prefer this kind of interface. Besides the improved
extensibility, further advantages might be:

- building a new object looks like overhead at first sight,
  but in Java a new String is also a new object...

- some computation could be performed on demand only(?)

- I think it is easier to remember accessor method names
  than a more or less arbitrary order of many parameters

- the ElementEvent access methods could be a subset of the DOM access
  methods (!)

Please do not forget to keep the "old" SAX 1 methods in HandlerBase
and call them from the new methods as the default behaviour. That would
preserve compatibility at least with applications that extend
HandlerBase instead of implementing DocumentHandler.
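
A minimal sketch of that compatibility idea, in Java. Everything here is hypothetical: StartElementEvent, EventDocumentHandler and CompatHandlerBase are illustrative names, and the one-argument startElement(String) is a simplified stand-in for the SAX 1 method, not its real signature.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical event object; the accessor names are assumptions, not SAX API.
interface StartElementEvent {
    String getName();
    String getAttribute(String name);   // loosely modelled on DOM's Element.getAttribute
}

// Hypothetical event-style handler, following the quoted proposal.
interface EventDocumentHandler {
    void startElement(StartElementEvent event);
}

// Sketch of a base class whose new-style method forwards to the old-style one,
// so subclasses written against the old signature keep working.
class CompatHandlerBase implements EventDocumentHandler {
    public void startElement(StartElementEvent event) {
        startElement(event.getName());   // default behaviour: delegate to the old method
    }

    // Simplified stand-in for the SAX 1 startElement callback.
    public void startElement(String name) { }
}

// An "old" application that only overrides the old-style method.
class OldStyleHandler extends CompatHandlerBase {
    final List<String> seen = new ArrayList<>();

    @Override
    public void startElement(String name) {
        seen.add(name);
    }
}

class EventDemo {
    static List<String> run() {
        OldStyleHandler handler = new OldStyleHandler();
        StartElementEvent event = new StartElementEvent() {
            public String getName() { return "para"; }
            public String getAttribute(String name) { return null; }
        };
        handler.startElement(event);     // parser calls the new method only
        return handler.seen;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

The old-style subclass still receives the element name even though the parser side only ever calls the event-based method.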

All the best,

Stefan


-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From cowan at locke.ccil.org  Mon Dec 20 02:07:38 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
In-Reply-To: <m3so114n3z.fsf@localhost.localdomain> from "David Megginson" at Dec 17, 99 10:13:04 am
Message-ID: <199912200207.VAA29342@locke.ccil.org>

David Megginson scripsit:

> Great idea, but java.lang.String is final, so it cannot be extended
> (that supposedly allows many optimizations in the JVM).  

Specifically, Strings are immutable, and allowing an immutable class
to be extended potentially throws away the immutability.  Things like
threads and Hashtables rely on knowing what is immutable and what is not.
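
A small Java sketch of the Hashtable half of that point. MutableName is a hypothetical class standing in for what a mutable subclass of a non-final String could do; it is an illustration, not anything from the JDK.

```java
import java.util.Hashtable;

// Hypothetical mutable, string-like key.
class MutableName {
    private String value;

    MutableName(String value) { this.value = value; }

    void set(String value) { this.value = value; }

    @Override
    public int hashCode() { return value.hashCode(); }

    @Override
    public boolean equals(Object o) {
        return o instanceof MutableName && ((MutableName) o).value.equals(value);
    }
}

class ImmutabilityDemo {
    /** Returns true if the entry can no longer be found under its original name. */
    static boolean lostAfterMutation() {
        Hashtable<MutableName, String> table = new Hashtable<>();
        MutableName key = new MutableName("xmlns");
        table.put(key, "declared");

        key.set("xmlns:foo");   // the key changes while it sits in the table

        // The entry is still stored (size is 1), but a lookup by the original
        // name fails because the stored key no longer equals it.
        return table.size() == 1 && table.get(new MutableName("xmlns")) == null;
    }

    public static void main(String[] args) {
        System.out.println(lostAfterMutation());
    }
}
```

With an immutable key class this situation cannot arise, which is the guarantee a final String gives.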

-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin



From simonstl at simonstl.com  Mon Dec 20 02:41:35 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:42 2004
Subject: Bizarre XLinks (was Re: Namespace proposal)
In-Reply-To: <3.0.32.19991219095330.014f9a50@pop.intergate.ca>
Message-ID: <4.0.1.19991219205919.04503100@216.27.10.33>

At 09:54 AM 12/19/99 -0800, Tim Bray wrote:
>At 08:50 AM 12/19/99 -0500, Simon St.Laurent wrote:
>>It'll satisfy the bizarre demands of the XLink folk that we use their
>>namespace on attributes any place we create links in our own documents,
>
>What's bizarre? -Tim


The last draft had a nifty if sort of crusty mechanism for specifying
XLinks using any attribute names you wanted (see
http://www.w3.org/TR/1998/WD-xlink-19980303#remapping).  The most recent
draft lost that mechanism and everything requires the use of xlink:href
instead of myown:src.

To me, that's bizarre, excessively demanding, and highly irritating
behavior.  XLink right now is what I call an 'inconsiderate spec', one
which requires everything else built on it to look like it, without the
kind of openness that XML provided in the first place.

I don't know what anyone else thinks of it, but I've given considerable
thought to a short proposal rebuilding the remapping mechanism, starting
with xlink:attributes instead of xml:attributes.  (And yes, I know that
I'll be changing what that prefix maps to, and not just the @#X! prefix.)


Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From clark.evans at manhattanproject.com  Mon Dec 20 03:16:06 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:42 2004
Subject: What is the reasoning why a well-formed XML text cannot have multiple
 top level elements?
In-Reply-To: <4.0.1.19991219205919.04503100@216.27.10.33>
Message-ID: <Pine.LNX.4.10.9912191008430.15596-100000@cauchy.clarkevans.com>

I guess the title says it all.  

It just seems that two domains, log files and continuous 
broadcasts are left out in the cold with this seemingly
arbitrary choice.  So, I'm looking for answers *other*
than the following:
  
  0. Because the spec says so.

  1. To stay compatible with SGML, so XML documents
     can be used in SGML processors.

  2. Because XML is a syntax for *documents*; it was
     never intended to be a general *information* syntax.

Thanks!

Clark

P.S. Yes, I'm familiar with the Document Fragment NOTE; this
is exactly why I'm asking the question.




From tbray at textuality.com  Mon Dec 20 04:30:33 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:42 2004
Subject: What is the reasoning why a well-formed XML text cannot
  have multiple top level elements?
Message-ID: <3.0.32.19991219202900.014c3a20@pop.intergate.ca>

At 10:15 AM 12/19/99 -0500, Clark C. Evans wrote:
>It just seems that two domains, log files and continuous 
>broadcasts are left out in the cold with this seemingly
>arbitrary choice.  So, I'm looking for answers *other*
>than the following:

Both logfiles and continuous broadcasts work just fine: simply send
a series of small XML documents rather than trying to pretend the whole
logfile is one parsable object.  Which is arguably better design anyhow;
to start with, among other things you can validate each record; in 
principle you can't validate a doc until you've read it all.
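
A rough Java sketch of handling a log as a series of small documents. It assumes records are separated by a form feed, which is only one possible convention, and it checks well-formedness rather than validity, using the JAXP DocumentBuilder that ships with current JDKs. RecordStream is a hypothetical name.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.SAXException;

class RecordStream {
    /** Counts the records that parse as well-formed XML documents. */
    static int countWellFormed(String stream) {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        int ok = 0;
        for (String record : stream.split("\f")) {     // assumed separator: form feed
            if (record.trim().isEmpty()) continue;      // skip empty chunks
            try {
                // A fresh builder per record keeps one bad record from affecting the next.
                factory.newDocumentBuilder().parse(
                        new ByteArrayInputStream(record.getBytes(StandardCharsets.UTF_8)));
                ok++;
            } catch (SAXException e) {
                // This record is not well-formed; the other records are unaffected.
            } catch (IOException | ParserConfigurationException e) {
                throw new IllegalStateException(e);
            }
        }
        return ok;
    }

    public static void main(String[] args) {
        // The middle record is deliberately left unclosed.
        String stream = "<event id='1'/>\f<event id='2'>\f<event id='3'/>";
        System.out.println(countWellFormed(stream));
    }
}
```

Each record is accepted or rejected on its own, so one damaged record does not make the rest of the log unreadable.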

I've always thought the single-root-element idea was a good one for
networked apps simply because it allows you to know unambiguously when
you're done receiving useful data, and don't have to rely on the programmer
at the other end closing the socket, and the connection teardown time,
and so on.

But the real reason XML was done that way was because it was one of
the things inherited from SGML that nobody ever asked to have changed.
Really, I don't recall a single word on that subject at the time. -Tim



From clark.evans at manhattanproject.com  Mon Dec 20 05:41:29 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:42 2004
Subject: What is the reasoning why a well-formed XML text cannot  have
 multiple top level elements?
In-Reply-To: <3.0.32.19991219202900.014c3a20@pop.intergate.ca>
Message-ID: <Pine.LNX.4.10.9912191222020.20628-100000@cauchy.clarkevans.com>


Thanks Tim!  This is *really* good, I think I just 
changed my perspective on things.  I had been wondering
if I had missed something... 

On Sun, 19 Dec 1999, Tim Bray wrote:
> Both logfiles and continuous broadcasts work just fine: simply send
> a series of small XML documents rather than trying to pretend the whole
> logfile is one parsable object.

Yes, using "^L" as the document separator.

> Which is arguably better design anyhow; to start with, among other 
> things you can validate each record; in principle you can't validate 
> a doc until you've read it all.

Well, if you consider "is-valid()" to be a property of the
XML source as it is encountered, then is-valid() is itself
not a value, but a sequence of values.  The last value having
particular relevance in the "document" domain.

> I've always thought the single-root-element idea was a good one for
> networked apps simply because it allows you to know unambiguously when
> you're done receiving useful data, and don't have to rely on the programmer
> at the other end closing the socket, and the connection teardown time,
> and so on.

Yes.  Ok.  This one clicks.  Assume a broadcast, let's say from
a programmable controller watching 4 gauges and reporting their
differentials.  The broadcast has a definite beginning.  I always
thought of it as "infinite" because in practice, the controller
can be left on for years...  However, there may be cases where the
controller needs to be shut down for maintenance... thus, switching
the power switch to "OFF" would first send the end tag, and then
power down.  Ok.  I can grok this one.  Thanks tons!

Perhaps the log file might be similar?  Let's say I start up
a-daemon-with-xml-logging.  It would start by creating a new
log file by picking the date/time of the run, and then
write out the <log> start tag.  As events occur, they are
appended as <event/> elements to the file.  If the daemon
is shut down gracefully (99.9% of the time), then it can end
the log file with a </log> tag.  Nice.  So, there are only
two cases where a log file would not have an end-tag:
(a) if the file is read while the daemon is running, or
(b) if the daemon crashes.

In both of these cases, SAX can be used to report
on the stream as it arrives anyway...  This leads me to
a question:  How do SAX implementations (like XT) treat
files that are opened for writing by another process?
Is it possible for them to behave like "tail -f"?
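
A small Java sketch of the first half of that, using the SAX API bundled with current JDKs (DefaultHandler and Attributes, which are later than the SAX 1 interfaces discussed in this thread). TruncatedLog is a hypothetical name, and the log is an in-memory string with the </log> end tag missing. It does not address the "tail -f" question.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TruncatedLog {
    /** Returns the ids of all <event> elements reported before parsing stopped. */
    static List<String> eventsSeen(String xml) {
        final List<String> seen = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                if (qName.equals("event")) {
                    seen.add(atts.getValue("id"));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (SAXException e) {
            // Expected for a log without </log>: the parser reports a fatal
            // error at end of input, after the events were already delivered.
        } catch (IOException | ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        System.out.println(eventsSeen("<log><event id='1'/><event id='2'/>"));
    }
}
```

The handler sees both events; the missing end tag only shows up as an error once the input runs out.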

> But the real reason XML was done that way was because it was one of
> the things inherited from SGML that nobody ever asked to have changed.
> Really, I don't recall a single word on that subject at the time. -Tim

This is cool.   Thank you for humoring my question!

;) Clark




From dent at oofile.com.au  Mon Dec 20 07:04:57 1999
From: dent at oofile.com.au (Andy Dent)
Date: Mon Jun  7 17:18:42 2004
Subject: pulling with expat
In-Reply-To: <385D4E8B.CA668281@fxtech.com>
References: 
 <Pine.LNX.4.10.9912161512310.20127-100000@cauchy.clarkevans.com>
 <385A4785.93FDC75E@fxtech.com> <385B85EC.4BD33B82@fxtech.com>
 <385D4E8B.CA668281@fxtech.com>
Message-ID: <v04205500b483839f5842@[192.168.0.52]>

At 16:30 -0500 19/12/99, Paul Miller wrote:
>As promised I decided to take another stab at implementing my streaming
>pull-style parser over expat. Either I haven't had enough strong drink
>yet for it to make sense, or it's not possible. If you'll recall my
>design for the hierarchical nature of the element handlers, I want to be
>able to nest my parsing, just like an object hierarchy:
...
>The major design point is you enter an element handler for a class, that
>class does all of its parsing for any sub-elements, and control is
>returned back to the handler, allowing it do some cleanup or validation
>or whatever the code wants to do.
Umm, have you looked at expatpp?

As far as I can see, I've already done exactly what you want.

There's substantial working code available as open source - the OOFILE
report-writer uses expatpp to parse saved report files which include
layouts, database schemae etc. This project would have been a damn sight
harder to debug without the sub-parsers.

See http://www.oofile.com.au/ for links to expatpp (including a copy of
expat) and the report writer.

I have personally validated this code on Mac and Windows, and others have
used expatpp on Unix.

Andy Dent BSc MACS AACM, Software Designer, A.D. Software, Western Australia
OOFILE - Database, Reports, Graphs, GUI for c++ on Mac, Unix & Windows
PP2MFC - PowerPlant->MFC portability
http://www.oofile.com.au/



From uche.ogbuji at fourthought.com  Mon Dec 20 07:59:51 1999
From: uche.ogbuji at fourthought.com (Uche Ogbuji)
Date: Mon Jun  7 17:18:42 2004
Subject: ANN: 4DOM 0.9.0
Message-ID: <385DE1F7.E6751961@fourthought.com>

FourThought LLC (http://FourThought.com) announces the release of

                             4DOM 0.9.0
                      -----------------------
                An XML/HTML Python library using the
                  Document Object Model interface

4DOM is a Python library for XML and HTML processing and manipulation
using the W3C's Document Object Model as its interface.  4DOM implements
DOM Core level 2, HTML level 2 and Level 2 Document Traversal.

4DOM should work on all platforms supported by Python.  If you have
any problems with a particular platform, please e-mail the authors.

4DOM is designed to allow developers to rapidly design applications
that read, write or manipulate HTML and XML.

News
----

Changes:

- Major re-write to match the general consensus DOM binding for
  Python.  Code formerly in the form "node.getChildNodes()"
  is now to be used in the form "node._get_childNodes()" or
  simply "node.childNodes".  Similarly "text.setData("spam")"
  becomes "text._set_data("spam")" or text.data = "spam".

- Update to full Level 2 support in core and HTML, including
  namespace-support.

- Many bug-fixes


More info and Obtaining 4DOM
----------------------------

Please see

        http://FourThought.com/4Suite/4DOM

Or you can download 4DOM from

        ftp://FourThought.com/pub/4Suite/4DOM

4DOM is distributed under a license similar to that of Python.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org



From KenNorth at email.msn.com  Mon Dec 20 08:21:25 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun  7 17:18:42 2004
Subject: Free Tool for Efficient XML Data Compression
References: <199912182036.PAA26308@red.seas.upenn.edu> <v04210102b481d61885dc@[192.168.1.254]>
Message-ID: <015501bf4ac3$221d7fe0$0b00a8c0@grissom>

> >HIGHLIGHT. It is known that XML files tend to be larger than files in
> >application specific data formats.
>
>
> While XMill sounds interesting, I really have to take issue with this
> statement.  I've seen no evidence that XML based, uncompressed file
> formats are larger than the corresponding binary file formats. This
> is a common fear about XML but I have not seen it borne out in my
> tests.

The difference in size between application data, and its XML counterpart,
seems to be application-dependent (and schema-dependent). Message processing
applications, for example, use techniques such as piggybacking to reduce the
size of a data stream.

If the "application specific data format" is a normalized SQL database,
there are likely to be differences because the normalization process
eliminates redundant data. (A billing address would be stored only once, not
once per order.)

This Note to the W3C documents a sizeable increase when moving from EDI
(EDIFACT) messages to XML/EDI:

http://www.xml-edifact.org/TR/NOTE-xml-edifact.html

Check out section 7 (the TeleOrdering UK example).




From david at megginson.com  Mon Dec 20 11:49:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <Pine.LNX.4.10.9912190016000.11945-100000@cauchy.clarkevans.com>
References: <385C9B42.ACE9D1B2@jclark.com>
	<Pine.LNX.4.10.9912190016000.11945-100000@cauchy.clarkevans.com>
Message-ID: <14430.6019.345198.934176@localhost.localdomain>

Clark C. Evans writes:

 > This looks very similar to what Ray had proposed a few days ago.

Actually, it's part of what we all came up with for SAX2 last
spring...

 > On Sun, 19 Dec 1999, James Clark wrote:
 > > No.  I need to be able to build a tree that includes all the
 > > namespace scoping information.  Something like:
 > > 
 > > interface NamespaceDeclHandler {
 > >   void startNamespaceDecl(String prefix, String namespace);
 > >   void endNamespaceDecl();
 > > }
 > > 
 > > with calls to start/endNamespaceDecl outside the calls to the
 > > corresponding start/endElement would do it.
 > 
 > In this case, perhaps leave startElement alone:
 > 
 >   void startElement(String name, AttributeList atts);

Unfortunately, using only the NamespaceDeclHandler makes more work for
applications that care just about Namespace-qualified names, and makes
those applications less efficient (since the parser has probably
already done the work that the application will have to do over): I
agree with James and Tim that such applications will be the most
common, and we should optimize for them.
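
To make that extra work concrete, here is a rough Java sketch of the bookkeeping an application would need if it received only the declaration events. The NamespaceDeclHandler interface is copied from the proposal quoted above and is not part of any released SAX; ScopeTracker and resolve() are hypothetical names, and the sketch only resolves element-style names (unprefixed names use the default namespace).

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Interface as quoted from the proposal above.
interface NamespaceDeclHandler {
    void startNamespaceDecl(String prefix, String namespace);
    void endNamespaceDecl();
}

// Hypothetical application-side tracker of the declarations in scope.
class ScopeTracker implements NamespaceDeclHandler {
    // Each entry is {prefix, uri}; the innermost declaration is first.
    private final Deque<String[]> decls = new ArrayDeque<>();

    public void startNamespaceDecl(String prefix, String namespace) {
        decls.push(new String[] { prefix, namespace });
    }

    public void endNamespaceDecl() {
        // endNamespaceDecl() takes no argument, so it must close the most
        // recently started declaration.
        decls.pop();
    }

    /** Returns the namespace URI for a qualified name, or null if unbound. */
    String resolve(String qname) {
        int colon = qname.indexOf(':');
        String prefix = (colon == -1) ? "" : qname.substring(0, colon);
        for (String[] decl : decls) {            // innermost to outermost
            if (decl[0].equals(prefix)) {
                return decl[1].isEmpty() ? null : decl[1];
            }
        }
        return null;
    }
}

class ScopeDemo {
    public static void main(String[] args) {
        ScopeTracker tracker = new ScopeTracker();
        tracker.startNamespaceDecl("x", "urn:outer");
        tracker.startNamespaceDecl("x", "urn:inner");   // shadows the outer binding
        System.out.println(tracker.resolve("x:foo"));
        tracker.endNamespaceDecl();
        System.out.println(tracker.resolve("x:foo"));
    }
}
```

Every application that only wants qualified names would have to repeat this lookup for each element, which is work the parser has usually done already.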


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From mdash at techbooks.com  Mon Dec 20 12:13:10 1999
From: mdash at techbooks.com (Manoranjan Dash)
Date: Mon Jun  7 17:18:42 2004
Subject: unsubscribe
In-Reply-To: <14430.6019.345198.934176@localhost.localdomain>
References: <Pine.LNX.4.10.9912190016000.11945-100000@cauchy.clarkevans.com>
 <385C9B42.ACE9D1B2@jclark.com>
 <Pine.LNX.4.10.9912190016000.11945-100000@cauchy.clarkevans.com>
Message-ID: <3.0.6.32.19991220173801.008b8c20@pinnacle.techbooks.com>

unsubscribe




From Nigel.Robbins at macro4.com  Mon Dec 20 12:16:10 1999
From: Nigel.Robbins at macro4.com (Nigel Robbins)
Date: Mon Jun  7 17:18:42 2004
Subject: Expat and whitespace handling
Message-ID: <99Dec20.121746gmt.25987@gateway.macro4.com>

I want to suppress all white space that immediately follows a start tag and
precedes an end tag.
Does anyone know how this can be done with expat?
Many thanks, Nigel
Nigel.Robbins@macro4.com


From Daniel.Brickley at bristol.ac.uk  Mon Dec 20 12:28:47 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:42 2004
Subject: Bizarre XLinks (was Re: Namespace proposal)
In-Reply-To: <4.0.1.19991219205919.04503100@216.27.10.33>
Message-ID: <Pine.GHP.4.21.9912201044320.28884-100000@mail.ilrt.bris.ac.uk>


[I've added www-XML-Linking-comments to the CC: list.
XML-DEVers please drop this from replies unless it seems appropriate]


On Sun, 19 Dec 1999, Simon St.Laurent wrote:

> At 09:54 AM 12/19/99 -0800, Tim Bray wrote:
> >At 08:50 AM 12/19/99 -0500, Simon St.Laurent wrote:
> >>It'll satisfy the bizarre demands of the XLink folk that we use their
> >>namespace on attributes any place we create links in our own documents,
> >
> >What's bizarre? -Tim
> 
> 
> The last draft had a nifty if sort of crusty mechanism for specifying
> XLinks using any attribute names you wanted (see
> http://www.w3.org/TR/1998/WD-xlink-19980303#remapping).  The most recent
> draft lost that mechanism and everything requires the use of xlink:href
> instead of myown:src.
> 
> To me, that's bizarre, excessively demanding, and highly irritating
> behavior.  XLink right now is what I call an 'inconsiderate spec', one
> which requires everything else built on it to look like it, without the
> kind of openness that XML provided in the first place.
> 
> I don't know what anyone else thinks of it, but I've given considerable
> thought to a short proposal rebuilding the remapping mechanism, starting
> with xlink:attributes instead of xml:attributes.  (And yes, I know that
> I'll be changing what that prefix maps to, and not just the @#X! prefix.)

> Simon St.Laurent


I'm not sure I get what you're saying here, Simon... Does this mean I
couldn't define inter-document relations with URIs such as
'http://x-sitemap.org/xlink-ns/isPartOf' or '...subsection' or
'...copyrightStatement' and make namespaced use of them as
'SITEMAP:isPartOf' or 'RIGHTS:copyrightStatement' to express the type of
the link? From minimal knowledge of the spec, I thought the use of the
XLink namespace was merely for link _detection_, and that link typing
was what mattered w.r.t. community-defined constructs.

[Dan revisits XLink draft... finds...]

 http://www.w3.org/TR/xlink
          4.4 Semantic Attributes

        There are two attributes associated with semantics, role and
        title. The role attribute is a generic string used to describe
        the function of the link's content. For example, a poem might
        have a link with a role="essay".

        The title attribute is designed to provide human-readable text
        describing the link. It is very useful for those who have
        text-based applications, whether that be due to a constricted
        device that cannot display the link's content, or if it's being
        read by an application to a visually-impaired user, or if it's
        being used to create a table of links. The title attribute
        should contain a simple, descriptive string.


(example snippet)
<A xmlns:xlink="http://www.w3.org/XML/XLink/0.9" xlink:type="simple"
   xlink:href="students.xml" xlink:role="student list"
   xlink:title="Student List" xlink:show="new" xlink:actuate="user">
   Current List of Students
</A>


Oh. That's kind of retro. Free text? From my re-reading of the spec, it
looks as if the only way to use community-defined constructs is through
the 'role' and 'title' facilities above, which are both free
text. Although these _could_ contain URIs or (perhaps controversially?)
namespace-prefixed names, the spec doesn't appear to provide any
mechanism for making such practice syntactically evident. Which is odd,
since my understanding of the role of XLink was that it was all about
delivering a mechanism which made typed links syntactically evident.


This seems to miss the mark on three of XLink's design goals, and
therefore to constitute something of a bug (hence the CC to the WG):

In terms of the XLink goals:

	8. XLink must represent the abstract structure and 
	significance of links. 

Unless a processor can tell whether it has encountered a URI reference in
the 'role', or a namespace-qualified name, or free text, Web automation
is going to be severely hampered. If I link to a copyright statement, or a
contents page, or a subsection, or a critique, I want to be able to give
a Web name (URI) to that type of link so that the significance of what
I've done is mechanically evident. The string "student list" doesn't do
this. Trivial example: "student:list" could be free text, a qualified
name, or some new URI scheme. (Another e.g.: news:foo looks like a
qname and/or a URI.)
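
A quick Java sketch of that ambiguity. RoleAmbiguity and both helper methods are hypothetical; the URI test relies on java.net.URI from the JDK, and the qualified-name test is a deliberately simplified pattern that ignores the full XML name character rules.

```java
import java.net.URI;
import java.net.URISyntaxException;

class RoleAmbiguity {
    /** True if the string parses as a URI with a scheme. */
    static boolean looksLikeAbsoluteUri(String s) {
        try {
            return new URI(s).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /** Rough test: prefix, one colon, local part (simplified name rules). */
    static boolean looksLikeQName(String s) {
        return s.matches("[A-Za-z_][\\w.-]*:[A-Za-z_][\\w.-]*");
    }

    public static void main(String[] args) {
        for (String role : new String[] { "student list", "student:list", "news:foo" }) {
            System.out.println(role
                    + "  uri=" + looksLikeAbsoluteUri(role)
                    + "  qname=" + looksLikeQName(role));
        }
    }
}
```

Both tests accept "student:list" and "news:foo", so a processor cannot tell from the string alone which reading was intended; "student list" fails both and can only be free text.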

	9. XLink must be feasible to implement regardless of the 
	media used for the presentation of links. 

Unless a URI for the role/type of a link is evident, user agents will be
incapable of making adequately informed decisions regarding
presentation. Example: 'copyrightInfo', 'healthWarning',
'legalDisclaimer', 'critique' etc are inter-document relations that
might be presented differently depending on presentation media. If these
are represented in free text, they won't be easily recognised.

	10. XLink must be informed by knowledge of established hypermedia 
	systems and standards. 

It might be argued that the significance of unique identifiers in
distributed, multilingual information environments is pretty widely
appreciated these days...


Here's a quick counter-proposal (spot the difference) for 'role':

(amended example from above)
<A xmlns:xlink="http://www.w3.org/XML/XLink/0.9" xlink:type="simple"
   xlink:href="students.xml"
   xlink:role_webID="http://edustuff.org/xlns/studentList"
   xlink:title="Student List" xlink:show="new" xlink:actuate="user">
   Current List of Students
</A>

...so at http://edustuff.org or elsewhere we might find (but XLink
needn't specify this in the spec...):

	<LinkRole webID="http://edustuff.org/xmlns/studentList">
	<label xml:lang="en">Student List</label>
	<label xml:lang="fr">List De Students (or whatever...)</label>

	<!-- using URIs would allow additional information/annotations
	 etc, for example... -->
	<generalisationOf webID="http://edustuff.org/xmlns/cleverStudentList"/>
	<sameAs webID="http://someotherlinkvocab.org/education/listOfStudents"/>
	</LinkRole>

I don't want to get into details of alternate mechanisms here; I just
wanted to check whether I've misread the spec and, if not, to register
my puzzlement at the chosen design.

cheers,

Dan

--
daniel.brickley@bristol.ac.uk




From ht at cogsci.ed.ac.uk  Mon Dec 20 12:31:47 1999
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:18:42 2004
Subject: Tutorial material available
Message-ID: <f5bso0xvln6.fsf@cogsci.ed.ac.uk>

The slides [1] and additional material [2] for my 1-day intensive
XML Schema tutorial are available for interested parties.  Originally
presented at XML '99 in Philadelphia, they have been updated to the
PWD of 17 December (I think :-).

Please note this material is copyright -- you are welcome to use it
for personal study, but if you want to distribute it further/use it in 
public yourself, please get in touch with me first.

ht

[1] ftp://ftp.cogsci.ed.ac.uk/pub/ht/tutorials/docs/Schema XML99.ppt
[2] ftp://ftp.cogsci.ed.ac.uk/pub/ht/tutorials/docs/XML Additional 1999.doc
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/



From Daniel.Veillard at w3.org  Mon Dec 20 12:49:16 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:18:42 2004
Subject: Tutorial material available
In-Reply-To: <f5bso0xvln6.fsf@cogsci.ed.ac.uk>
References: <f5bso0xvln6.fsf@cogsci.ed.ac.uk>
Message-ID: <19991220074912.D3165@w3.org>

On Mon, Dec 20, 1999 at 12:31:41PM +0000, Henry S. Thompson wrote:
> The slides [1] and additional material [2] for my 1-day intensive
> XML Schema tutorial are available for interested parties.  Originally
> presented at XML '99 in Philadelphia, they have been updated to the
> PWD of 17 December (I think :-).
> 
> Please note this material is copyright -- you are welcome to use it
> for personal study, but if you want to distribute it further/use it in 
> public yourself, please get in touch with me first.
> 
> ht
> 
> [1] ftp://ftp.cogsci.ed.ac.uk/pub/ht/tutorials/docs/Schema XML99.ppt
> [2] ftp://ftp.cogsci.ed.ac.uk/pub/ht/tutorials/docs/XML Additional 1999.doc

Can we get something in a non-proprietary format? Like HTML with images?
I would like to read those, but I have neither PowerPoint nor Word.

  thanks,

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



From James.Anderson at mecomnet.de  Mon Dec 20 13:15:38 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: summary of Namespace-support arguments
References: <3.0.32.19991218161206.014eb310@pop.intergate.ca> <385C6A26.7B4E6AA6@jclark.com>
Message-ID: <385E2CFC.E4A3D83@mecomnet.de>

there may well be reasons to bind the original literal prefix to the "name
instance". this is not one of them. serialization prettiness and speed can
well be viewed as diametrically opposed. where speed is of the utmost
importance, all that matters is that the serializer guarantee requisite
prefix uniqueness and the correctness of prefix/uri bindings. to this end, a
storage model which binds a unique token to a namespace instance and binds
the namespace instance to the "name instance" suffices.

where the namespace instance also statically binds its uri, this form of
"name" obviates the need for directly binding the uri to the name.

where prettiness is important, an optional interface to the serializer can
well specify the prefixes to be used for the respective namespaces. in either
case, no "figuring out" is required.
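
a minimal java sketch of that storage model, under stated assumptions: the
class names NamespaceInstance, NameInstance and PrefixAssigner, the generated
"ns0", "ns1", ... tokens and the prefer() hook are all hypothetical and
belong to no existing serializer or to SAX. it only illustrates that a name
bound to a namespace instance needs neither its own prefix nor its own uri.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** A namespace instance: statically binds a URI and carries a unique token. */
final class NamespaceInstance {
    final String uri;
    final String token; // unique per PrefixAssigner; used as the prefix on output

    NamespaceInstance(String uri, String token) {
        this.uri = uri;
        this.token = token;
    }
}

/** A "name instance": a local part bound to a namespace instance only. */
final class NameInstance {
    final NamespaceInstance namespace;
    final String localPart;

    NameInstance(NamespaceInstance namespace, String localPart) {
        this.namespace = namespace;
        this.localPart = localPart;
    }
}

/** Hands out one namespace instance per URI and keeps tokens unique. */
final class PrefixAssigner {
    private final Map<String, NamespaceInstance> byUri = new HashMap<>();
    private final Map<String, String> preferred = new HashMap<>();
    private final Set<String> usedTokens = new HashSet<>();
    private int counter;

    /** Optional "prettiness" hook: request a prefix for a namespace URI. */
    void prefer(String uri, String prefix) {
        preferred.put(uri, prefix);
    }

    NamespaceInstance namespace(String uri) {
        NamespaceInstance ns = byUri.get(uri);
        if (ns == null) {
            String token = preferred.get(uri);
            if (token == null || usedTokens.contains(token)) {
                // No usable preference: generate a token that is not taken yet.
                do {
                    token = "ns" + counter++;
                } while (usedTokens.contains(token));
            }
            usedTokens.add(token);
            ns = new NamespaceInstance(uri, token);
            byUri.put(uri, ns);
        }
        return ns;
    }

    /** Serialization needs no lookup: the token comes straight from the name. */
    String qName(NameInstance name) {
        return name.namespace.token + ":" + name.localPart;
    }
}
```

in this sketch the speed-oriented path never calls prefer(); the
prettiness-oriented path calls it before the first name is created.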

James Clark wrote:
> 
> 3. I want to use DocumentHandler not just as an interface between a parser
> and an application but between an application and a serializer.
> Serialization can in fact be as performance critical as parsing.  A
> serializer can do its job much more efficiently and easily if it has
> the prefix available rather than having to figure it out from the
> prefix/namespace bindings in effect. Although the combination of XML 1.0
> DTDs and namespaces is problematic, many users want to use namespaces
> and still have their documents be XML 1.0 valid; this may apply to the
> documents they are creating with a DocumentHandler.  For a serializer
> that uses a DocumentHandler as its interface to be able to do this, it
> has to have prefix information.
> 
> James
> 




From david at megginson.com  Mon Dec 20 13:22:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace proposal
In-Reply-To: James Clark's message of "Sun, 19 Dec 1999 21:13:31 +0700"
References: <14427.62086.25937.792412@localhost.localdomain> <385CA792.A218D9E5@jclark.com> <385CE80B.CCCA027A@jclark.com>
Message-ID: <m3zov5iw81.fsf@localhost.localdomain>

James Clark <jjc@jclark.com> writes:

> James Clark wrote:
> 
> > I propose that there is a NamespaceProcessingLevel feature that
> > applications can set on the parser with 4 possible values:
> > 
> > Cooked (the default)
> > IncludePrefixes
> > NamespaceDeclarationsAsAttributes
> > Raw
> 
> A variation on this would be to drop the second level, so that there
> would be three levels corresponding to:
> 
> - a pure namespaces view
> - a simultaneous namespaces and XML 1.0 view
> - a pure XML 1.0 view

I agree -- I think that this is the cleanest approach.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From stele at fxtech.com  Mon Dec 20 13:38:48 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:42 2004
Subject: pulling with expat - success!
References: <Pine.LNX.4.10.9912190518040.15596-100000@cauchy.clarkevans.com>
Message-ID: <385E31A6.A9DCCEC9@fxtech.com>

"Clark C. Evans" wrote:
> Does this code fragment help out?  It is far from
> perfect, however, the "pull" based code would call
> getNextEvent() to return the next XML event...

Thanks Clark! That was what I needed. I had my mind set on recursion and
failed to think about an event queue. I was able to slide this under my
API and it seems to work well. Although memory usage will be higher than my
solution, the design is taking advantage of expat for the lower-level
details.

The basic design goes like this:

1. XML::Parser::Parse() starts pulling events in a loop

<loop>
2. expat calls startElement
	I create an Element (with the attributes) and push it on a stack
	I create a StartElement Event and push it on the event queue

3. expat calls endElement
	I create an EndElement Event and push it on the event queue. At this
point I take the top Element off the element stack (which should match
this EndElement), and store a pointer to it inside the Event. The
element stack is there only to synchronize the events with the elements.

4. expat calls characterData:
	I buffer the data into a string inside the top-most element

</loop>

5. When there are some events in the queue, an event is pulled off and
handled
	If there is a handler at the current scope, it is called
	Otherwise, a skip-to flag is set and all events that don't get us back
to the previous level are thrown away

6. When a handler is called, it pulls events from the queue and
processes them until the ending-element event that matches this element
is found

7. When a handler's Element::Parse() returns, the element is pushed onto
a delete queue (because we can still use the Element pointer in the
remainder of the handler)

8. When a handler returns, the delete queue is emptied
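
Steps 2 to 5 can be sketched roughly as follows. This is a Java analogue
rather than the C++/expat code described above, and it rests on assumptions:
the names PullSketch, Element, Event and Kind are made up for illustration,
the JDK's SAX parser stands in for expat, and because that parser runs to
completion in one call, every event is queued before the first one is pulled
(the chunked loop of step 1, the skip-to flag and the delete queue are left
out).

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class PullSketch extends DefaultHandler {
    enum Kind { START, END }

    static final class Element {
        final String name;
        final Map<String, String> attributes = new LinkedHashMap<>();
        final StringBuilder text = new StringBuilder();
        Element(String name) { this.name = name; }
    }

    static final class Event {
        final Kind kind;
        final Element element;
        Event(Kind kind, Element element) { this.kind = kind; this.element = element; }
    }

    private final Deque<Element> elementStack = new ArrayDeque<>();
    private final Deque<Event> eventQueue = new ArrayDeque<>();

    // Step 2: create the Element, push it on the stack, queue a start event.
    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        Element e = new Element(qName);
        for (int i = 0; i < atts.getLength(); i++) {
            e.attributes.put(atts.getQName(i), atts.getValue(i));
        }
        elementStack.push(e);
        eventQueue.add(new Event(Kind.START, e));
    }

    // Step 3: pop the matching Element and attach it to the end event.
    @Override
    public void endElement(String uri, String localName, String qName) {
        eventQueue.add(new Event(Kind.END, elementStack.pop()));
    }

    // Step 4: buffer character data inside the top-most element.
    @Override
    public void characters(char[] ch, int start, int length) {
        if (!elementStack.isEmpty()) {
            elementStack.peek().text.append(ch, start, length);
        }
    }

    // Step 5: the "pull" side; returns null once the queue is empty.
    Event getNextEvent() {
        return eventQueue.poll();
    }

    static PullSketch parse(String xml) {
        PullSketch handler = new PullSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("parse failed", e);
        }
        return handler;
    }
}
```

A caller would loop on getNextEvent() until it returns null, dispatching on
the event kind and the element name.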

I've only tested with some small files, but I think the design is sound.
There is a lot more memory overhead (creating events and elements as
they are encountered) than my previous implementation, but I'm hoping to
use some memory pools for the Elements, Events, and Attributes, so the
overall run-time overhead should be minimal when it's finished.

Thanks for all the suggestions!

--
Paul Miller - stele@fxtech.com



From david at megginson.com  Mon Dec 20 13:50:18 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:42 2004
Subject: Bizarre XLinks (was Re: Namespace proposal)
In-Reply-To: "Simon St.Laurent"'s message of "Sun, 19 Dec 1999 21:41:39 -0500"
References: <4.0.1.19991219205919.04503100@216.27.10.33>
Message-ID: <m3vh5tiuxt.fsf@localhost.localdomain>

"Simon St.Laurent" <simonstl@simonstl.com> writes:

> The last draft had a nifty if sort of crusty mechanism for specifying
> XLinks using any attribute names you wanted (see
> http://www.w3.org/TR/1998/WD-xlink-19980303#remapping).  The most recent
> draft lost that mechanism and everything requires the use of xlink:href
> instead of myown:src.

Yes, I argued strongly in favour of this on the old XML WG, at one of
the few meetings when we actually had the opportunity to spend time on
XLink.  Though I admit to not having read the latest draft, I have now
changed my position, for reasons that I will explain below.

> To me, that's bizarre, excessively demanding, and highly irritating
> behavior.  XLink right now is what I call an 'inconsiderate spec', one
> which requires everything else built on it to look like it, without the
> kind of openness that XML provided in the first place.

Sure, it is annoying, but this is not a problem that's confined to
XLink -- I have changed my position because I think that any remapping 
mechanism needs to be implemented across all of XML, and not confined
to XLink, or we'll end up with a nasty hodgepodge of mapping
mechanisms.  There are two possibilities:

1. We use some kind of schema mechanism to say that myown:src
   is a type of xlink:href.

2. We use some kind of mapping mechanism in the document instance
   itself to say that a specific myown:src attribute is a type of
   xlink:href.

The former mechanism is well in line with the work already taking
place on XML Schemas, but I'm reluctant to endorse it because it will
make schema processing a required part of XML processing.  

The second mechanism can already be accomplished using Architectural
Forms, and I'm sure that eventually we'll either adapt AFs in XML or,
more likely, develop something a little simpler with a catchier name
and a freely-redistributable spec that works more-or-less the same

<rant>
Have you ever tried getting permission to include an ISO spec on a CD
in a book or to mirror it on a Web site, even if the spec is already
available free online (actually, I'm pretty sure that Simon has)?
It's sad that an organization that considers itself the Guardian of
Open Standards just doesn't get it when it comes to open documents.
That closed mentality probably harmed SGML more than any other single
problem, despite the good intentions of the volunteers who actually
did the ISO SGML work.
</rant>

> I don't know what anyone else thinks of it, but I've given considerable
> thought to a short proposal rebuilding the remapping mechanism, starting
> with xlink:attributes instead of xml:attributes.  (And yes, I know that
> I'll be changing what that prefix maps to, and not just the @#X! prefix.)

I think that would be a waste of time -- give us a general remapping
mechanism instead, please.  You can take a look at the documentation
for my now-ancient XAF (XML Architectural Forms) package for a few
ideas:

  http://www.megginson.com/XAF


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec 20 14:06:35 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace proposal
In-Reply-To: Stefan Haustein's message of "Mon, 20 Dec 1999 01:52:08 +0100"
References: <14427.62086.25937.792412@localhost.localdomain> <385C7716.6F394D37@jclark.com> <385D7DB8.3EFDC6DE@trantor.de>
Message-ID: <m3so0xiu6k.fsf@localhost.localdomain>

Stefan Haustein <stefan.haustein@trantor.de> writes:

> > For example, I think there are good arguments for moving to a
> > 
> > interface DocumentHandler {
> >   void startElement(StartElementEvent event)
> >   void endElement(EndElementEvent event)
> >   ...
> > }
> 
> I also would prefer this kind of interface. Further advantage besides
> the improved extensibility might be that 
> 
> - building a new object seems some overhead at the first sight, 
>   but in JAVA also a new String is a new object...

And that is why most parsers internalize strings rather than creating
new ones, and that's why the SAX characters() and
ignorableWhiteSpace() methods use character arrays rather than
strings.  XML parsing shows up a lot of problems that Java programmers 
aren't used to, because it generates so many events (often tens of
thousands) in only a few seconds.

In this case, however, the real solution is not internalizing but reusing
-- the SAX driver would have only one copy of each kind of event
object, and would simply change its values each time it makes a
callback.  The problem with this approach (it showed up before in C++
with a simple SP API that James Clark made) is that programmers will
try -- despite documentation warning against it -- to keep the event
objects around and reference them outside the scope of the callback,
where strange things will happen.
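
A small sketch of that hazard, under stated assumptions: StartElementEvent
is borrowed from the interface quoted above, but its reset()/getName()
methods, the EventHandler interface and the ReusingDriver class are
hypothetical and exist in no SAX release. The driver keeps a single event
object and overwrites it before each callback, so a handler that stores the
object sees only the last value.

```java
import java.util.ArrayList;
import java.util.List;

/** One mutable event object, owned and rewritten by the driver. */
final class StartElementEvent {
    private String name;
    void reset(String name) { this.name = name; } // driver-side only
    String getName() { return name; }
}

interface EventHandler {
    void startElement(StartElementEvent event);
}

/** Stand-in for a SAX driver that reuses its event object. */
final class ReusingDriver {
    private final StartElementEvent shared = new StartElementEvent();

    void fire(EventHandler handler, String... names) {
        for (String n : names) {
            shared.reset(n);              // same object, new values
            handler.startElement(shared);
        }
    }
}

final class ReuseHazardDemo {
    /** Wrong: keeps the event objects beyond the scope of the callback. */
    static List<String> retainedNames(String... names) {
        List<StartElementEvent> kept = new ArrayList<>();
        new ReusingDriver().fire(kept::add, names);
        List<String> out = new ArrayList<>();
        for (StartElementEvent e : kept) {
            out.add(e.getName());
        }
        return out;
    }

    /** Right: copies the values it needs while still inside the callback. */
    static List<String> copiedNames(String... names) {
        List<String> out = new ArrayList<>();
        new ReusingDriver().fire(e -> out.add(e.getName()), names);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(retainedNames("a", "b", "c")); // [c, c, c]
        System.out.println(copiedNames("a", "b", "c"));   // [a, b, c]
    }
}
```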

> - some computation could be performed on demand only(?) 

This might be an advantage, but probably not -- the parser will
probably have done all of the work anyway, because of basic
constraints for checking well-formedness, etc.

> - I think it is less difficult to remember the access method names 
>   than a more or less unmotivated order of a lot of parameters

Perhaps, but you have to remember a lot of class and method names.
I've programmed to interfaces that use both approaches, and I did not
find either harder or easier -- on balance, I prefer to avoid bloating 
a low-level interface like SAX with a lot of extra classes, even if
the performance would be the same.

> - the ElementEvent access methods could be a subset of the DOM access
> methods (!)

No, I don't think we should go there.  SAX has its warts, and the DOM
has its warts, but any combination might give us the product rather
than the sum of their warts, and the world doesn't need that much
ugliness.

> please do not forget to include the "old" SAX 1 methods in HandlerBase
> and call them from the new methods as default behaviour preserving
> compatibility at least with applications extending
> HandlerBase instead of implementing DocumentHandler. 

Actually, if we create a new package, there will be no compatibility
at all, except by using adapters (aka filters).  There will certainly
be a SAX1Filter class to wrap around SAX 1.0 parsers, and there may
also be a SAX2Filter class to make SAX2 parsers act like SAX 1.0
parsers.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From msabin at cromwellmedia.co.uk  Mon Dec 20 14:46:28 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:42 2004
Subject: SAX2: Namespace proposal
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A675198@odin.cromwellmedia.co.uk>

David Megginson wrote,
> Stefan Haustein wrote,
> > - building a new object seems some overhead at the first 
> > sight, but in JAVA also a new String is a new object...
>
> And that is why most parsers internalize strings rather than 
> creating new ones,

This isn't necessarily the best approach. Intern'ing a string
involves a lookup in a JVM-internal hash table. This table is
shared across all threads, and consequently has to be locked
against simultaneous reads and updates. That means we've got
two potential sources of overhead: the lookup itself; and lock
contention between multiple threads trying to access the
table. The former probably isn't a big deal, but the latter
can make for a serious performance hit in heavily threaded
systems, especially on SMP machines. Unless you know there's
not going to be contention (eg., because you know you're
running single threaded) it's probably wisest *not* to intern.

It's also worth remembering that you've got to _already_ have
a String before you can intern it! If you've just created one
(eg. from a portion of a char array) then you're only going to
add overhead by doing an intern in addition.

The only possible benefits are,

1. If you've got a pair of Strings that are both *known* to be
   intern'ed you can use == for equality comparisons rather
   than equals. 'known' is the crucial qualifier here: in my
   experience it's most common that only one of a pair of
   Strings will be known to be interned, which means that
   before we can use == the other has to be intern'ed first ...
   which more than wipes out any speedup.

2. Intern'ed Strings share storage. I can imagine situations
   where this _might_ be significant, but they're likely to
   be edge cases. Unless you're actually hanging on to
   references to large numbers of equal Strings then garbage
   collection _should_ recycle the storage allocated to old
   ones. Some JVMs might have trouble doing this nicely, but
   then the best bet would be to get hold of a better JVM
   rather than trying to hack around the problem. Bear in mind
   that troublesome JVMs will also cause problems even with
   intern'ing ... because, as mentioned in (1), we'll have had
   to create a String before we can intern it, and typically
   the pre-intern String will be discarded: if gc is slack then
   these will pile up even tho' unreferenced.
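
Point (1) can be shown in a few lines. This is only an illustration: the
class name InternComparison and the helper fastEquals are invented for the
example, and it relies on nothing beyond the documented behaviour of string
literals and java.lang.String.intern().

```java
final class InternComparison {
    /** Identity comparison: only a valid equality test if BOTH arguments are interned. */
    static boolean fastEquals(String internedA, String internedB) {
        return internedA == internedB;
    }

    public static void main(String[] args) {
        String literal = "div";                   // literals are interned by the JVM
        char[] buffer = {'<', 'd', 'i', 'v', '>'};
        String fresh = new String(buffer, 1, 3);  // a brand-new object, not interned

        System.out.println(fresh.equals(literal));               // true
        System.out.println(fastEquals(fresh, literal));          // false
        System.out.println(fastEquals(fresh.intern(), literal)); // true
    }
}
```

The third line is the case described above: getting to == costs an extra
intern() call on a String that had to be created first.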

> and that's why the SAX characters() and ignorableWhiteSpace() 
> methods use character arrays rather than strings.

This, on the other hand, can bring genuine gains, at the cost
of considerably uglifying the API.

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From david at megginson.com  Mon Dec 20 14:59:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <AA4C152BA2F9D211B9DD0008C79F760A675198@odin.cromwellmedia.co.uk>
References: <AA4C152BA2F9D211B9DD0008C79F760A675198@odin.cromwellmedia.co.uk>
Message-ID: <14430.17453.592853.432083@localhost.localdomain>

Miles Sabin writes:
 > David Megginson wrote,
 > > Stefan Haustein wrote,
 > > > - building a new object seems some overhead at the first 
 > > > sight, but in JAVA also a new String is a new object...
 > >
 > > And that is why most parsers internalize strings rather than 
 > > creating new ones,
 > 
 > This isn't necessarily the best approach. Intern'ing a string
 > involves a lookup in a JVM-internal hash table. This table is
 > shared across all threads, and consequently has to be locked
 > against simultaneous reads and updates.

That's probably why most parsers have their own intern routines rather 
than using java.lang.String.intern -- at least, I wrote a custom
hashing and interning routine for AElfred that sped it up quite
significantly over (a) using java.lang.String.intern() or (b)
allocating a new string for every element name.  If I were doing it
over, though, I would actually call java.lang.String.intern once for
each of the strings in the intern table so that they were == to the
regular intern'ed versions.

 > It's also worth remembering that you've got to _already_ have
 > a String before you can intern it! If you've just created one
 > (eg. from a portion of a char array) then you're only going to
 > add overhead by doing an intern in addition.

Again, I avoided this problem by building my own intern and hashing
methods in AElfred.  There's no reason that a hash table should not be
able to use an array as a key -- Java's Hashtable just doesn't happen
to be designed that way.
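
A minimal sketch of such a table, with the caveat that it is not AElfred's
actual code: the class NameTable, its open-addressing layout and the
jvmInternCalls counter are assumptions made for illustration. Lookups work
directly on a slice of the parser's character buffer, so no String is
created on a hit, and java.lang.String.intern() is called only when a new
name enters the table (the "doing it over" variant described above).

```java
final class NameTable {
    private String[] slots = new String[64]; // length is always a power of two
    private int size;
    int jvmInternCalls;                      // for illustration only

    /** Returns the canonical String for buf[start .. start+length-1]. */
    String intern(char[] buf, int start, int length) {
        int hash = 0;
        for (int i = 0; i < length; i++) {
            hash = 31 * hash + buf[start + i]; // same formula as String.hashCode()
        }
        int mask = slots.length - 1;
        int i = hash & mask;
        while (slots[i] != null) {
            if (matches(slots[i], buf, start, length)) {
                return slots[i];               // hit: nothing allocated
            }
            i = (i + 1) & mask;                // linear probing
        }
        // Miss: allocate once, and align with the JVM's own intern pool.
        String s = new String(buf, start, length).intern();
        jvmInternCalls++;
        slots[i] = s;
        if (++size * 2 > slots.length) {
            grow();
        }
        return s;
    }

    private static boolean matches(String s, char[] buf, int start, int length) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (s.charAt(i) != buf[start + i]) {
                return false;
            }
        }
        return true;
    }

    private void grow() {
        String[] old = slots;
        slots = new String[old.length * 2];
        int mask = slots.length - 1;
        for (String s : old) {
            if (s == null) {
                continue;
            }
            int i = s.hashCode() & mask;
            while (slots[i] != null) {
                i = (i + 1) & mask;
            }
            slots[i] = s;
        }
    }
}
```

Because the stored strings come from String.intern(), they can be compared
with == against literals such as "div" as well as against each other.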


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From msabin at cromwellmedia.co.uk  Mon Dec 20 15:14:52 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A675199@odin.cromwellmedia.co.uk>

David Megginson wrote,
> Miles Sabin wrote,
> > This isn't necessarily the best approach. Intern'ing a 
> > string involves a lookup in a JVM-internal hash table. This 
> > table is shared across all threads, and consequently has to 
> > be locked against simultaneous reads and updates.
>
> That's probably why most parsers have their own intern 
> routines rather than using java.lang.String.intern -- at 
> least, I wrote a custom hashing and interning routine for 
> AElfred that sped it up quite significantly over (a) using 
> java.lang.String.intern() or (b) allocating a new string for 
> every element name.

Ditto, using a very nice data structure that John Cowan brought
to my attention, see,

http://www.ddj.com/articles/1998/9804/9804a/9804a.htm

which has turned out to be considerably faster than hash tables
(nb. hash tables _generally_ not just java.util.Hashtable or
java.util.HashMap) for our applications.

> If I were doing it over, though, I would actually call 
> java.lang.String.intern once for each of the strings in the 
> intern table so that they were == to the regular intern'ed 
> versions.

Try it, but I think you'll be more likely to lose than gain.

> > It's also worth remembering that you've got to _already_ 
> > have a String before you can intern it! If you've just 
> > created one (eg. from a portion of a char array) then you're 
> > only going to add overhead by doing an intern in addition.
>
> Again, I avoided this problem by building my own intern and 
> hashing methods in AElfred.  There's no reason that a hash 
> table should not be able to use an array as a key -- Java's 
> Hashtable just doesn't happen to be designed that way.

Agreed.

But all this seems to suggest that the use of java.lang.String.
intern() and java.util.Hashtable isn't that good an idea.
Insofar as SAX defines interfaces you can leave the choice to
implementors. But wiring them into the implementation of SAX 
utility classes (eg. your NSUtils) would mean that we don't
get the option.

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From stefan.haustein at trantor.de  Mon Dec 20 15:45:10 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain> <385C7716.6F394D37@jclark.com> <385D7DB8.3EFDC6DE@trantor.de> <m3so0xiu6k.fsf@localhost.localdomain>
Message-ID: <385E4EF9.A9D45CA8@trantor.de>

> > - building a new object seems some overhead at the first sight,
> >   but in JAVA also a new String is a new object...
> 
> And that is why most parsers internalize strings rather than creating
> new ones, and that's why the SAX characters() and
> ignorableWhiteSpace() methods use character arrays rather than
> strings.  XML parsing shows up a lot of problems that Java programmers
> aren't used to, because it generates so many events (often tens of
> thousands) in only a few seconds.

OK, what about

 void startElement(String localName, AttributeList attr,
                   NameSpaceContext nsc)

NameSpaceContext could be immutable and thus be reused while unchanged.
I could still remember all parameters since there is only one more at
the end :-) 

The parser would only need to check if the NSC has changed and could
reuse the same object otherwise. 
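
A rough sketch of such a context, with everything beyond the name
NameSpaceContext assumed for illustration (the EMPTY constant, the with()
and getUri() methods and the parent chain are not part of any SAX draft):

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Immutable prefix-to-URI bindings; a child shares its parent's bindings. */
final class NameSpaceContext {
    static final NameSpaceContext EMPTY =
            new NameSpaceContext(null, Collections.<String, String>emptyMap());

    private final NameSpaceContext parent;
    private final Map<String, String> declared;

    private NameSpaceContext(NameSpaceContext parent, Map<String, String> declared) {
        this.parent = parent;
        this.declared = declared;
    }

    /**
     * Called by the parser for each start tag. Returns this very object when
     * the element declares no namespaces, so no allocation happens then.
     */
    NameSpaceContext with(Map<String, String> newDeclarations) {
        if (newDeclarations.isEmpty()) {
            return this;
        }
        return new NameSpaceContext(
                this, Collections.unmodifiableMap(new HashMap<>(newDeclarations)));
    }

    /** Looks up a prefix, innermost declaration first; null if unbound. */
    String getUri(String prefix) {
        for (NameSpaceContext c = this; c != null; c = c.parent) {
            String uri = c.declared.get(prefix);
            if (uri != null) {
                return uri;
            }
        }
        return null;
    }
}
```

Since an instance never changes, an application may keep a reference to it
after the callback returns.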

Best regards

Stefan



From david at megginson.com  Mon Dec 20 16:03:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
In-Reply-To: Miles Sabin's message of "Mon, 20 Dec 1999 15:13:55 -0000"
References: <AA4C152BA2F9D211B9DD0008C79F760A675199@odin.cromwellmedia.co.uk>
Message-ID: <m3k8m9iorz.fsf@localhost.localdomain>

Miles Sabin <msabin@cromwellmedia.co.uk> writes:

> > If I were doing it over, though, I would actually call 
> > java.lang.String.intern once for each of the strings in the 
> > intern table so that they were == to the regular intern'ed 
> > versions.
> 
> Try it, but I think you'll be more likely to lose than gain.

I'd appreciate more information here -- if I call
java.lang.String.intern the first time I add a string to the intern
table, then the cost is proportional to the number of entries in the
table, not the number of accesses.

Or, in plainer English, if the XML name "div" appears 20,000 times in
my document, and I have my own custom intern() routine that calls
java.lang.String.intern only when a new name is added to the table, I
will still incur the cost of java.lang.String.intern only once, not
20,000 times.  The benefit is that my interned strings will still be
== to those from java.lang.String.intern.  I wish that I had done it
this way for AElfred.

> But all this seems to suggest that the use of java.lang.String.
> intern() and java.util.Hashtable isn't that good an idea.
> Insofar as SAX defines interfaces you can leave the choice to
> implementors. But wiring them into the implementation of SAX 
> utility classes (eg. your NSUtils) would mean that we don't
> get the option.

I think that NSUtils is dead -- it was a trial balloon, and a pretty
successful one as far as it went.  I am convinced that the
single-string approach is not suitable for Namespace-qualified names
in SAX2.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec 20 16:05:03 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
In-Reply-To: Stefan Haustein's message of "Mon, 20 Dec 1999 16:44:57 +0100"
References: <14427.62086.25937.792412@localhost.localdomain> <385C7716.6F394D37@jclark.com> <385D7DB8.3EFDC6DE@trantor.de> <m3so0xiu6k.fsf@localhost.localdomain> <385E4EF9.A9D45CA8@trantor.de>
Message-ID: <m3hfhdiop6.fsf@localhost.localdomain>

Stefan Haustein <stefan.haustein@trantor.de> writes:

> OK, what about
> 
>  void startElement(String localName, AttributeList attr,
> NameSpaceContext nsc)
> 
> NameSpaceContext could be unmutable and thus be reused while unchanged.
> I could still remember all parameters since there is only one more at
> the end :-) 
> 
> The parser would only need to check if the NSC has changed and could
> reuse the same object otherwise. 

How would that help with attribute lists?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From tbray at textuality.com  Mon Dec 20 16:46:28 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
Message-ID: <3.0.32.19991220084504.0152c700@pop.intergate.ca>

At 03:13 PM 12/20/99 -0000, Miles Sabin wrote:
>> That's probably why most parsers have their own intern 
>> routines rather than using java.lang.String.intern -- at 
>> least, I wrote a custom hashing and interning routine for 
>> AElfred that sped it up quite significantly over (a) using 
>> java.lang.String.intern() or (b) allocating a new string for 
>> every element name.
>
>Ditto, using a very nice data structure that John Cowan brought
>to my attention, see,

Heh-heh, Lark's is just an array that was binary-searched.  We don't
need no steeenkin' advanced data structures.  It is fast enough to
vanish in the static in the profiler output. -T.
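
For comparison, a binary-searched name array might look something like the
following. This is a guess at the general technique, not Lark's source: the
class SortedNameArray and its methods are hypothetical.

```java
import java.util.Arrays;

/** Canonical name strings kept in one sorted array and found by binary search. */
final class SortedNameArray {
    private String[] names = new String[16];
    private int count;

    /** Returns the canonical String for buf[start .. start+length-1]. */
    String intern(char[] buf, int start, int length) {
        int lo = 0;
        int hi = count - 1;
        while (lo <= hi) {
            int mid = (lo + hi) >>> 1;
            int cmp = compare(names[mid], buf, start, length);
            if (cmp == 0) {
                return names[mid];      // hit: no allocation
            }
            if (cmp < 0) {
                lo = mid + 1;           // stored name sorts before the key
            } else {
                hi = mid - 1;
            }
        }
        // Miss: lo is the insertion point that keeps the array sorted.
        if (count == names.length) {
            names = Arrays.copyOf(names, count * 2);
        }
        System.arraycopy(names, lo, names, lo + 1, count - lo);
        String s = new String(buf, start, length);
        names[lo] = s;
        count++;
        return s;
    }

    int size() {
        return count;
    }

    /** Compares a stored name with a character slice, char by char. */
    private static int compare(String s, char[] buf, int start, int length) {
        int n = Math.min(s.length(), length);
        for (int i = 0; i < n; i++) {
            int d = s.charAt(i) - buf[start + i];
            if (d != 0) {
                return d;
            }
        }
        return s.length() - length;
    }
}
```

Insertion shifts the tail of the array, which is cheap as long as the set of
distinct names stays small, as it usually does for element and attribute
names.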



From tbray at textuality.com  Mon Dec 20 16:45:47 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
Message-ID: <3.0.32.19991220084131.014e93c0@pop.intergate.ca>

At 08:21 AM 12/20/99 -0500, David Megginson wrote:
>> - a pure namespaces view
>> - a simultaneous namespaces and XML 1.0 view
>> - a pure XML 1.0 view
>
>I agree -- I think that this is the cleanest approach.

I have a great deal of trouble imagining a situation in which the 
"simultaneous" view is desirable or even safe.  Could someone help out
with a use-case please?

If I'm right, then given that SAX1 already does the pure XML1.0 view, why do 
we need more than one view? -Tim



From david at megginson.com  Mon Dec 20 16:58:01 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <3.0.32.19991220084131.014e93c0@pop.intergate.ca>
References: <3.0.32.19991220084131.014e93c0@pop.intergate.ca>
Message-ID: <14430.24542.458680.917600@localhost.localdomain>

Tim Bray writes:
 > At 08:21 AM 12/20/99 -0500, David Megginson wrote:
 > >> - a pure namespaces view
 > >> - a simultaneous namespaces and XML 1.0 view
 > >> - a pure XML 1.0 view
 > >
 > >I agree -- I think that this is the cleanest approach.
 > 
 > I have a great deal of trouble imagining a situation in which the 
 > "simultaneous" view is desirable or even safe.  Could someone help out
 > with a use-case please?

The classic use case is a transformation where the result will be
still used by an author, sort of an XML sed.  I don't find this case
particularly persuasive, but clearly the DOM WG did, and as a result,
the Infoset was constrained to follow.

What I like about James's approach is that it's transparent for proper
Namespace processing -- you get what you expect -- and the (very
slight) extra difficulty is offloaded onto those who want the original
prefix.

 > If I'm right, then given that SAX1 already does the pure XML1.0
 > view, why do we need more than one view?

When you need both simultaneously.  I have never encountered such a
case, and I imagine that it's mostly imaginary, but I would like to be 
able to manage a little more DOM2 compatibility in SAX2.
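
For readers who want to see the "simultaneous" view concretely, here is a
sketch using the SAX2 API as it was later finalized and shipped with the
JDK (org.xml.sax), not the proposal as it stood in this thread. The class
name, method name, and the namespace URI urn:example:html are assumptions
for illustration. With Namespace processing on and the
http://xml.org/sax/features/namespace-prefixes feature enabled, each
startElement call carries both the Namespace URI and local name and the
original qualified name.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class; only the org.xml.sax and javax.xml.parsers calls are real API.
final class ViewsDemo {
    /** Returns "uri|localName|qName" for every element start in the document. */
    static List<String> elementNames(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);              // Namespace view: URI + local name
        XMLReader reader = factory.newSAXParser().getXMLReader();
        // Keep the XML 1.0 view as well: qualified names are reported too.
        reader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

        List<String> seen = new ArrayList<>();
        reader.setContentHandler(new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add(uri + "|" + localName + "|" + qName);
            }
        });
        reader.parse(new InputSource(new StringReader(xml)));
        return seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(elementNames("<h:a xmlns:h='urn:example:html' href='foo'/>"));
    }
}
```

An application that only wants proper Namespace processing can ignore the
qName argument; one that needs the original prefix reads it from there.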


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From h.rzepa at ic.ac.uk  Mon Dec 20 16:58:19 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:18:43 2004
Subject: LISTADMIN: URGENT. Delay on Transfer of this list to  OASIS.
Message-ID: <v04220807b4840df6e825@[155.198.224.86]>

I announced last Thursday that we would be transferring the administration
of  the XML-DEV list to  OASIS, and that the list would be shutting down
21 December 1999.

Because the logistics of the transfer are taking longer to finalise than 
expected,  the expected shutdown of the list has been POSTPONED.

Please note therefore that the list will continue to operate as normal
for postings during the remainder of  December and into early January
(Y2K permitting!).  Details of the list transfer arrangements will appear
here within the next two weeks.

Note also however that all requests to the list that need moderation,
ie ones of the type

subscribe xml-dev email@address  or 
unsubscribe xml-dev email@address

will NOT be processed during this period (those of the type 
subscribe  xml-dev are processed automatically). Please don't send requests to
me (and especially not to the list itself) seeking such actions during this period.

May I wish everyone on the list an especially festive holiday, and
a particularly fruitful new Millennium. 

Henry Rzepa. +44 (0)20 7594 5774 (Office) +44 (0)20 7594 5804 (Fax)
Dept. Chemistry, Imperial College, London, SW7  2AY, UK. 
http://www.ch.ic.ac.uk/rzepa/



From George.Zhao at kla-tencor.com  Mon Dec 20 17:09:58 1999
From: George.Zhao at kla-tencor.com (Zhao, George)
Date: Mon Jun  7 17:18:43 2004
Subject: unsubscribe
Message-ID: <B5EAB589884BD311827A00A0C95C3FD0673C06@milxpr03.kla-tencor.com>


unsubscribe




From jlapp at webMethods.com  Mon Dec 20 17:15:25 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:43 2004
Subject: The per-element-type namespace partition
Message-ID: <3.0.32.19991220121633.01d544a0@nexus.webmethods.com>

I spend a lot of time creating specialized tool APIs on top of our XML parser.  This often involves taking a piece of information out of a document and running with it.  For the most part I can represent any piece with just the following:

- type of piece (eg. element, attribute, PI, comment, text, etc.)
- namespace URI (when applicable)
- local name or PI target name (when applicable)
- text (when applicable)

It seems that the following additional pieces are needed to represent an attribute belonging to the per-element-type namespace partition:

- namespace URI of the associated element type
- local name of the associated element type

Since an attribute of this partition has only one associated namespace URI, you could replace the second occurrence of a namespace URI with a flag indicating that the first namespace URI actually applies to the element type local name.

Four pieces of information is already more complex than I desire.  Six pieces of information is really really distressing.  I have to find ways to keep this complexity out of my tool APIs and am looking for suggestions.

Are there any applications out there that require per-element-type namespace information?  What will I break if I effectively lump all per-element-type attributes into a single universal "default" namespace and require applications that need this information to retain information about the elements to which such attributes belong?  Why didn't the W3C just allow an unqualified attribute to assume the namespace of its owning element?  And what am I to infer from the fact that the "per-element-type partition" is defined in a "non-normative" section of the Namespaces in XML specification?

Thanks in advance for your help!
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From msabin at cromwellmedia.co.uk  Mon Dec 20 17:38:34 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A67519A@odin.cromwellmedia.co.uk>

David Megginson wrote,
> if I call java.lang.String.intern the first time I add a 
> string to the intern table, then the cost is proportional to 
> the number of entries in the table, not the number of 
> accesses.
>
> Or, in plainer English, if the XML name "div" appears 20,000 
> times in my document, and I have my own custom intern() 
> routine that calls java.lang.String.intern only when a new 
> name is added to the table, I will still incur the cost of 
> java.lang.String.intern only once, not 20,000 times.  The 
> benefit is that my interned strings will still be == to those 
> from java.lang.String.intern.

That's right.

But my question is: what's the benefit? Sure you have all the
Strings in your table intern'ed. But how do you exploit that?

From what you've said before, I presume that you'll do a
lookup with a part of a char array as a key, and get a
corresponding intern'ed String back. Now you can compare that 
String using == with another String (so long as that String is 
known to be intern'ed too). Unless you're doing lots of == 
comparisons with the same String (I'm not quite sure why you 
would be), then you won't gain unless the cost of a lookup +
the cost of == is less than the cost of String.equals().

Since you're using a hash table you'll have to examine each
character in the array portion once to compute the hash code,
then you'll have to examine them all over again (at least once,
possibly more than once, depending on your hash table 
implementation) to do the final key comparison which resolves
hash collisions. Add to that all the other overhead involved
in a hash lookup, and I very much doubt that you're faster
than String.equals() ... which just has to examine each
character in the two Strings once (don't forget that the String 
class has direct access to the internal representation char 
arrays, and doesn't have to use charAt()). In the case of a 
mismatch String.equals() might not have to do much work at all 
... "foo".equals("bar") can return false after just comparing 
the 'f' and the 'b'.

The only likely reason I can think of for wanting to do more
than one == comparison is a bad one. If you're doing things 
like,

  // elemChars is the array 'd', 'i', 'v'; len == 3
  String interned = table.lookup(elemChars, 0, len);
  if(interned == DIV)
    // do div processing
  else if(interned == SPAN)
    // do span processing
  else if
    // ... etc ...

then you'd be better off taking a different approach altogether.
Instead of mapping to an intern'ed String you could map to a
handler (a Design Patterns Strategy),

  ElementHandler handler = table.lookup(elemChars, 0, 3);
  if(handler != null)
    handler.doProcessing();

You can even eliminate the test against null if you can arrange 
for your table implementation to return a do-nothing or error-
generating handler on a mismatch.

From a design point of view this is a win, because you no
longer have to hard wire the list of elements to test for into a 
long sequence of conditionals (which is, in effect, a great big
switch on type ... ugh). And with even a not so good JIT, this 
is likely to be the best bet performance-wise.
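
A minimal sketch of the handler-table idea, under stated assumptions: the
names ElementHandler, HandlerTable, and register are hypothetical, a
String-keyed HashMap stands in for a table keyed directly on char arrays,
and doProcessing returns a String only so that the result is easy to
inspect. Unknown names map to a do-nothing handler, so callers never test
against null.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical Strategy interface; one implementation per element type.
interface ElementHandler {
    String doProcessing();
}

// Hypothetical table; a real one could hash the char slice directly
// instead of building a String key for each lookup.
final class HandlerTable {
    private static final ElementHandler IGNORE = () -> "ignored";
    private final Map<String, ElementHandler> handlers = new HashMap<>();

    void register(String name, ElementHandler handler) {
        handlers.put(name, handler);
    }

    /** Never returns null: unknown names map to the do-nothing handler. */
    ElementHandler lookup(char[] buf, int off, int len) {
        return handlers.getOrDefault(new String(buf, off, len), IGNORE);
    }
}
```

Adding support for a new element type then means registering one more
handler, with no change to the dispatch code.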

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From paul at prescod.net  Mon Dec 20 17:49:23 1999
From: paul at prescod.net (Paul Prescod)
Date: Mon Jun  7 17:18:43 2004
Subject: SAX2: Namespace proposal
References: <3.0.32.19991220084131.014e93c0@pop.intergate.ca> <14430.24542.458680.917600@localhost.localdomain>
Message-ID: <385E6B61.255666D3@prescod.net>

David Megginson wrote:
> 
> The classic use case is a transformation where the result will be
> still used by an author, sort of an XML sed.  I don't find this case
> particularly persuasive, but clearly the DOM WG did, and as a result,
> the Infoset was constrained to follow.

James also mentioned the case where the output had to be DTD-compatible.
In that case the prefixes used are meaningful because they are what the
DTD expects.
-- 
 Paul Prescod  - ISOGEN Consulting Engineer speaking for himself
The occasional act of disrespect for the American flag creates but a 
flickering insult to the values of democracy -- unless it provokes 
America into limiting the freedoms that are its hallmark.
           -- Paul Tash, executive editor of the St. Petersburg Times




From tbray at textuality.com  Mon Dec 20 18:05:04 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:43 2004
Subject: The per-element-type namespace partition
Message-ID: <3.0.32.19991220100008.0132e3a0@pop.intergate.ca>

At 12:16 PM 12/20/99 -0500, Joe Lapp wrote:
> What will I break if I effectively lump all per-element-type attributes 
>into a single universal "default" namespace and require applications that 
>need this information to retain information about the elements to which such 
>attributes belong?  Why didn't the W3C just allow an unqualified attribute 
>to assume the namespace of its owning element?  And what am I to infer from 
>the fact that the "per-element-type partition" is defined in a "non-
>normative" section of the Namespaces in XML specification?

The key question that took up literally months of the WG's time was, what
about the following two elements:

<html:a href="foo">
<html:a html:href="foo">

The question was, are these two

(a) the same in all cases
(b) never the same thing
(c) sometimes the same

(a) is probably the most common case, and what you're proposing doing.
But at the end of the day, the WG simply couldn't bring itself to decree
that they must always be the same.  Another way to ask the same question
is: in <html:a href="foo">, what namespace is href in?  The fuzzy, 
non-precise answer that kept coming up was "it's in the namespace of
the <html:a> element".  That sounded good to everyone but it was hard
to pin down in programmer-level terms just what it meant.

So the language in the spec allows the pair of examples to be treated
as non-equivalent by an app that wants to do so.  The non-normative
appendix tries to make the notion of "the namespace of an element"
useful.

To get back to your question, I'm unconvinced that you have a problem.
If you're going to extract an attribute from a doc and do things
with it, you probably want to remember what element type it came
from anyhow.  Element types have two parts and that's all there is to it.
At some point you're going to have to deal with
<html:a xlink:role="foo"> anyhow; that attribute has potentially 4
pieces of identifying information and there's no getting away from it.
 -Tim
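
One way to picture the four pieces of identifying information is a small
value type. This is only a sketch: AttributeKey, its field names, and
withElementNamespace are hypothetical, and the empty string is assumed to
stand for "no Namespace" on an unprefixed attribute. Plain equality treats
<html:a href="foo"> and <html:a html:href="foo"> as different, which the
spec permits; an application that prefers reading (a) can normalize first.

```java
import java.util.Objects;

// Hypothetical value type: element type (two parts) plus attribute name (two parts).
final class AttributeKey {
    final String elementUri;
    final String elementLocal;
    final String attrUri;       // "" for an unprefixed attribute (assumption)
    final String attrLocal;

    AttributeKey(String elementUri, String elementLocal,
                 String attrUri, String attrLocal) {
        this.elementUri = elementUri;
        this.elementLocal = elementLocal;
        this.attrUri = attrUri;
        this.attrLocal = attrLocal;
    }

    /** Reading (a): an unprefixed attribute takes its element's Namespace. */
    AttributeKey withElementNamespace() {
        return attrUri.isEmpty()
                ? new AttributeKey(elementUri, elementLocal, elementUri, attrLocal)
                : this;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof AttributeKey)) {
            return false;
        }
        AttributeKey k = (AttributeKey) o;
        return elementUri.equals(k.elementUri)
                && elementLocal.equals(k.elementLocal)
                && attrUri.equals(k.attrUri)
                && attrLocal.equals(k.attrLocal);
    }

    @Override
    public int hashCode() {
        return Objects.hash(elementUri, elementLocal, attrUri, attrLocal);
    }
}
```

Whether to call withElementNamespace before comparing is exactly the choice
the spec leaves to the application.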



From lauren at sqwest.bc.ca  Mon Dec 20 18:09:49 1999
From: lauren at sqwest.bc.ca (Lauren Wood)
Date: Mon Jun  7 17:18:44 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <14430.24542.458680.917600@localhost.localdomain>
References: <3.0.32.19991220084131.014e93c0@pop.intergate.ca>
Message-ID: <199912201805.KAA00477@mail.sqwest.bc.ca>


The DOM WG put some amount of time into designing the DOM 
Level 2 namespace support to cover the case where a DOM Level 1 
client application (e.g., script) runs in a DOM Level 2 
implementation. So the script doesn't know about namespaces, 
but the implementation does. It should still be possible for these 
scripts to run properly. We didn't cover the converse case; scripts 
that need Level 2 should ask whether the implementation supports 
what they need first.

Another use case was to allow processes that don't use prefixes at 
all (e.g., the XML is never serialized) to use the namespace 
support in the DOM. The prefix is, in the DOM, considered to be 
syntactic sugar, but the Level 1 constraints meant we couldn't lose 
prefixes altogether.

I think it would be good for SAX2 to supply (perhaps as an 
option) the information the DOM implementations need; I've had 
quite a few questions about CDATA sections in SAX for example, 
since the DOM exposes these and SAX1 doesn't.


Lauren



From tbray at textuality.com  Mon Dec 20 18:11:41 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:44 2004
Subject: SAX2: Namespace proposal
Message-ID: <3.0.32.19991220101201.029b04f0@pop.intergate.ca>

At 05:37 PM 12/20/99 -0000, Miles Sabin wrote:
>But my question is: what's the benefit? Sure you have all the
>Strings in your table intern'ed. But how do you exploit that?

One way is like this.  In Lark, my own private intern table was on
char arrays, not strings.  So when you parse a name out of the doc,
you look it up before you even (expensively) make the String.  The effect
is that you end up not only intern()ing once per unique name, but
only make that many String objects.   The benefit of intern()ing was
for the application not the parser (of course you have to document that
the intern()ing is happening - I forget, was this advertised in SAX1?
Should it be in SAX2?)

I measured this as a huge saving.  But I've heard that the String object 
has gotten a bit smarter in recent Javas, so maybe it's overkill now?
 -Tim
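
A sketch of an intern table keyed on char arrays, as described above. It is
not Lark's implementation: the class name, the open-addressing layout, and
the initial capacity are assumptions. The point it illustrates is that a
String is allocated, and java.lang.String.intern called, only the first
time a name is seen; later occurrences are resolved from the buffer slice
alone.

```java
// Hypothetical sketch; linear-probing table holding one String per unique name.
final class CharArrayInterner {
    private String[] slots = new String[64];    // length is always a power of two
    private int size;

    /** Returns the canonical String for buf[off .. off+len). */
    String intern(char[] buf, int off, int len) {
        int h = 0;
        for (int i = 0; i < len; i++) {
            h = 31 * h + buf[off + i];           // same formula as String.hashCode
        }
        int mask = slots.length - 1;
        int i = h & mask;
        while (slots[i] != null) {
            if (matches(slots[i], buf, off, len)) {
                return slots[i];                 // hit: nothing allocated
            }
            i = (i + 1) & mask;
        }
        String s = new String(buf, off, len).intern();   // miss: once per unique name
        slots[i] = s;
        if (++size * 2 > slots.length) {
            grow();
        }
        return s;
    }

    int size() {
        return size;
    }

    private static boolean matches(String s, char[] buf, int off, int len) {
        if (s.length() != len) {
            return false;
        }
        for (int i = 0; i < len; i++) {
            if (s.charAt(i) != buf[off + i]) {
                return false;
            }
        }
        return true;
    }

    // Doubles the table and reinserts every entry; keeps the load factor at or below 1/2.
    private void grow() {
        String[] old = slots;
        slots = new String[old.length * 2];
        int mask = slots.length - 1;
        for (String s : old) {
            if (s == null) {
                continue;
            }
            int i = s.hashCode() & mask;
            while (slots[i] != null) {
                i = (i + 1) & mask;
            }
            slots[i] = s;
        }
    }
}
```

Because the stored Strings come from String.intern, an application can
compare them with == against its own interned constants, provided the
parser documents that guarantee.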



From jlapp at webMethods.com  Mon Dec 20 18:24:33 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:44 2004
Subject: The per-element-type namespace partition
Message-ID: <3.0.32.19991220132543.01f68b60@nexus.webmethods.com>

At 10:05 AM 12/20/99 -0800, Tim Bray wrote:
>[...] I'm unconvinced that you have a problem.
>If you're going to extract an attribute from a doc and do things
>with it, you probably want to remember what element type it came
>from anyhow.  [...]

Here's a use case.  Assume for purposes of this discussion that both attributes and elements are nodes (among other kinds of nodes).  I want an API that inputs a node that represents a date and outputs the date in a standard format.  To identify the format of the input node, assuming that no crazy AI heuristics can be defined to do the job, the API must look up the schema definition for this node.  The input node need not belong to any document at this point (eg. the document might not exist yet).

Unless I am missing something, to make this API work for attributes in the per-element-type namespace, I either have to expand my definition of node to include informational items 5 and 6 mentioned previously, or have the node take an additional parent node parameter.  Or I have to require that the node belong to a document -- but I also have use cases for why nodes should not have to belong to documents.
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From cyang at mdsintl.com  Mon Dec 20 18:32:43 1999
From: cyang at mdsintl.com (Yang, Chol)
Date: Mon Jun  7 17:18:44 2004
Subject: No subject
Message-ID: <C6672FD5E16BD3118C3A00805FFE91D91B37F2@tornt4.mdshealth.com>

unsubscribe



From jlapp at webMethods.com  Mon Dec 20 18:33:21 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:44 2004
Subject: SAX2: Namespace proposal
Message-ID: <3.0.32.19991220133428.01f6ac80@nexus.webmethods.com>

Using Java String interning, how do you guys guarantee performance in any of the DOM Element get*() methods that take Strings?  Do you require that the app intern the string before passing it in?  Do you try to make the methods smart so that if they're interned, you get performance, and if they aren't, you get a bit more of a penalty (for having done the intern check first)?  Do you make them dumb so that if you forgot to intern, you don't get anything?  Or would one always intern these externally provided Strings within the method?
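
To make the "smart" option concrete, here is a sketch of a lookup that
tries reference equality first and falls back to equals(). AttrStore and
getAttribute are hypothetical names for this example and do not belong to
any real DOM implementation; the stored names are assumed to have been
interned by the parser.

```java
// Hypothetical attribute store; stands in for the inside of an Element.
final class AttrStore {
    private final String[] names;     // interned when stored
    private final String[] values;

    AttrStore(String[] names, String[] values) {
        this.names = new String[names.length];
        for (int i = 0; i < names.length; i++) {
            this.names[i] = names[i].intern();
        }
        this.values = values.clone();
    }

    String getAttribute(String name) {
        // Fast path: succeeds when the caller passed an interned String.
        for (int i = 0; i < names.length; i++) {
            if (names[i] == name) {
                return values[i];
            }
        }
        // Slow path: callers that did not intern still get the right answer.
        for (int i = 0; i < names.length; i++) {
            if (names[i].equals(name)) {
                return values[i];
            }
        }
        return null;
    }
}
```

The cost of this design is that a lookup for a name that is not present
pays for both passes, whether or not the caller interned the argument.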
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From Daniel.Veillard at w3.org  Mon Dec 20 18:47:41 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:18:44 2004
Subject: XLink and XML Base W3C working drafts
Message-ID: <19991220134733.J3165@w3.org>

 The W3C XML Linking Working Group just published the following drafts:

 - A new version of XLink:

    http://www.w3.org/TR/1999/WD-xlink-19991220/

-----
This specification defines the XML Linking Language (XLink), which allows
elements to be inserted into XML documents in order to create and describe
links between resources. It uses XML syntax to create structures that
can describe the simple unidirectional hyperlinks of today's HTML as
well as more sophisticated links.
-----
  
    This is considered near completion; we are seeking feedback on a few
remaining open items, and would also welcome feedback from implementors.

 - A first version of XBase:

    http://www.w3.org/TR/1999/WD-xmlbase-19991220

-----
This document proposes syntax for providing the equivalent of HTML BASE
functionality generically in XML documents by defining an XML attribute
named xml:base.
-----

    Although this is a first draft, it is simple enough that we hope
it will be ready for general consumption quickly.

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



From liamquin at interlog.com  Mon Dec 20 18:48:56 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun  7 17:18:44 2004
Subject: Tutorial material available
In-Reply-To: <f5bso0xvln6.fsf@cogsci.ed.ac.uk>
Message-ID: <Pine.BSI.3.96r.991220134901.8957A-100000@shell1.interlog.com>

On 20 Dec 1999, Henry S. Thompson wrote:
> The slides [1] and additional material [2] for my 1-day intensive
> XML Schema tutorial are available for interested parties.

Neat, Henry!

Hmm, I could make the Typography Tutorial slides and (non-printed) material
available if anyone wants, under similar terms.

However, you need the "Mrs Eaves" fonts (including ligatures) to
be able to read the slides -- you can buy them from www.emigre.com,
but they are not free.

If anyone is still interested, feel free to contact me.

Lee

-- 
Liam Quin, Barefoot Computing, Toronto;
SGML/XML/XSL/Unix/Perl/Java/Intercal Consulting & Development
l i a m    at    h o l o w e b    dot    n e t
IRC: Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/




From Daniel.Veillard at w3.org  Mon Dec 20 18:51:25 1999
From: Daniel.Veillard at w3.org (Daniel Veillard)
Date: Mon Jun  7 17:18:44 2004
Subject: Last Call version of the W3C XML Infoset draft
Message-ID: <19991220135118.K3165@w3.org>

  The W3C XML Core Working Group just released the Last Call version
of the XML Infoset draft.

   http://www.w3.org/TR/1999/WD-xml-infoset-19991220

-----
This specification describes an abstract data set containing the
information available from an XML document.
-----

  The Last Call review ends on 31 January 2000, and the working group
seeks feedback on the specification and implementation reports.

Daniel

-- 
Daniel.Veillard@w3.org | W3C, INRIA Rhone-Alpes  | Today's Bookmarks :
Tel : +33 476 615 257  | 655, avenue de l'Europe | Linux XML libxml WWW
Fax : +33 476 615 207  | 38330 Montbonnot FRANCE | Gnome rpm2html rpmfind
 http://www.w3.org/People/all#veillard%40w3.org  | RPM badminton Kaffe



From clark.evans at manhattanproject.com  Mon Dec 20 18:51:55 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:44 2004
Subject: MIME like CDATA sections?
In-Reply-To: <385BC5DC.EA31C48B@fxtech.com>
Message-ID: <Pine.LNX.4.10.9912200146220.28897-100000@cauchy.clarkevans.com>

I'm sure this has been brought up before
on this list; but I was wondering if in 
a future version of XML MIME style boundaries
could be used to delimit non-XML content.

For example:

  <root xmlns:cdata="http://w3.org/XML/MIME" >
    <cdata:my-boundary>
      data that is not compliant with XML and
      does not contain my boundary. 
    </cdata:my-boundary>
  </root>




From roddey at us.ibm.com  Mon Dec 20 19:17:18 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:44 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <8725684D.0069ECEC.00@d53mta03h.boulder.ibm.com>


>It seems to me that code like:
>
>void DocumentHandler::startElement (
>    const std::wstring &name, const AttributeList &atts)
>{
>    if (name == L"Paragraph") ...
>}
>
>is going to be a whole lot neater than
>
>void DocumentHandler::startElement (
>    const std::basic_string<SAXChar> &name, const AttributeList &atts)
>{
>    static const SAXChar paraString[] =
>        {'P','a','r','a','g','r','a','p','h','\0'};
>    if (name == paraString) ...
>

John is absolutely correct. It *must* be wchar_t if it's going to be a fixed
thing. The massive convenience this provides to people who actually want to
do something with the data and for the ability to use constants (look at
XML4C if you want to see what a pain in the butt it is to do the latter
scheme) is paramount. For those folks who need to store text, they can
certainly strip off unwanted bytes before storing it. It is much more
reasonable to require transcoding of people storing text than to require
everyone using the data on the fly to transcode.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From James.Anderson at mecomnet.de  Mon Dec 20 19:22:47 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:44 2004
Subject: Attributes, namespace partitions, and schemas
References: <385CB961.D5DB56B4@pacbell.net> <385D0FFB.924750E4@pacbell.net>
Message-ID: <385E8305.C068C58F@mecomnet.de>

Ray Waldin wrote:
>  ...
> 
> > Given the following XML:
> >
> > <a xmlns="uri1" xmlns:b="uri1" c="d" b:c="e"/>
> ...
> > Should SAX2 report two attributes with the same localName and namespaceURI?
> 
> No. The per-element attribute (c) should have a null namespaceUri and the global

This conforms to the spec, but still does not suffice.
The spec is weak on this point. Given, for example,

  <a xmlns="uri1" xmlns:b="uri1" c="d" b:c="e"/>
  <!-- ... -->
  <x xmlns="uri1" xmlns:b="uri1" c="d" b:c="e"/>

The two unqualified 'c' attributes could, given the assertion that they both
"have a null namespaceUri", be misconstrued as being equal. It would depend on
how the implementation construes a "null namespaceUri". Or is it that the
"names" are, in fact, equal, but their binding spaces are disjoint? The spec
leaves this matter unclear. It would appear that SAX2 stands to become the
de facto spec on this matter. This is unfortunate.
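
To make the concern concrete, here is a minimal sketch using the SAX2 API as
it was later shipped with Java (javax.xml.parsers and org.xml.sax), which
postdates this thread. The class name AttributeNames and the wrapper element
<r> are assumptions made only for illustration. It prints each attribute as
"element: {namespaceURI}localName", so the two unqualified 'c' attributes show
up with the same empty Namespace name and differ only in their owning element.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeNames {
    /** Lists every reported attribute as "element: {namespaceURI}localName". */
    static List<String> expandedAttributeNames(String xml) {
        final List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true); // enable Namespace processing
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        for (int i = 0; i < atts.getLength(); i++) {
                            // getURI returns "" when the attribute has no Namespace.
                            names.add(localName + ": {" + atts.getURI(i) + "}"
                                      + atts.getLocalName(i));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        String xml = "<r>"
            + "<a xmlns='uri1' xmlns:b='uri1' c='d' b:c='e'/>"
            + "<x xmlns='uri1' xmlns:b='uri1' c='d' b:c='e'/>"
            + "</r>";
        for (String name : expandedAttributeNames(xml)) {
            System.out.println(name);
        }
    }
}
```

An application that keys on the (URI, local name) pair alone would treat the
'c' on <a> and the 'c' on <x> as the same name; whether that is right is
exactly the question the spec leaves open.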

> ...
> 
> This also explains why "default namespaces do not apply directly to attributes".
> Unqualified attributes *always* belong *indirectly* to whatever namespace their
> parent element belongs to, so they can never *directly* belong to the default
> (unqualified) namespace as global attributes.

The specifics of this relationship are undefined. This is also unfortunate.




From roddey at us.ibm.com  Mon Dec 20 19:25:37 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:44 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
Message-ID: <8725684D.006AB02B.00@d53mta03h.boulder.ibm.com>


>>I've done a lot of thinking about how we
>>can make SAX2 Namespace processing both efficient and
>>backwards-compatible.
>
>I'm not sure backwards-compatible is really a good idea.  The
>namespace-sensitive and namespace-oblivious views of a chunk of XML
>are just deeply, totally, massively incompatible, and it seems wrong
>to try to paper that over.  I also don't think it's worth investing
>any effort at all in accommodating XML documents that use colons in names
>but aren't namespace-aware, given the explicit warnings against doing
>this in the XML 1.0 spec.
>
>So I think it would be cleaner to deal with the fact that names can have
>two parts, and not kludge them together with {} marks.  -Tim
>

But it's also a code issue. The event callbacks from a parser have to be able
to pass out data from both namespace-aware and non-namespace-aware documents
using the same APIs. For some folks, this type of syntax might make that
much easier to do. Personally I didn't go this route, and have event
API parameters for prefix and URI that just aren't used if namespaces are
not enabled, but some folks might prefer the {uri}name thing so as to use the
same API for both styles. You have to pick one or the other though, right?
Either you use some such kludge, or your event handlers have to always know
whether namespaces are enabled and look at some event parameters sometimes
but not others (though I guess they also have to be aware enough to strip
out the parts of the {uri}name form anyway, since it's not a legal name as
is.)
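
As a rough sketch of what "stripping out the parts" of the {uri}name form
would involve on the handler side (the class ExpandedName and its parse
method are hypothetical names, not part of any SAX draft):

```java
class ExpandedName {
    final String uri;        // "" when the name carries no Namespace part
    final String localName;

    private ExpandedName(String uri, String localName) {
        this.uri = uri;
        this.localName = localName;
    }

    /** Splits "{uri}local" into its two parts; a plain name gets an empty URI. */
    static ExpandedName parse(String name) {
        if (name.startsWith("{")) {
            int close = name.indexOf('}');
            if (close < 0) {
                throw new IllegalArgumentException("unterminated URI part: " + name);
            }
            return new ExpandedName(name.substring(1, close),
                                    name.substring(close + 1));
        }
        return new ExpandedName("", name);
    }

    public static void main(String[] args) {
        ExpandedName n = parse("{http://x.com}foo");
        System.out.println(n.uri + " / " + n.localName); // http://x.com / foo
    }
}
```

Every handler that wants the parts has to do this split itself, which is the
cost of carrying both pieces in a single string parameter.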

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From roddey at us.ibm.com  Mon Dec 20 19:31:21 1999
From: roddey at us.ibm.com (roddey@us.ibm.com)
Date: Mon Jun  7 17:18:44 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <8725684D.006B35D9.00@d53mta03h.boulder.ibm.com>


>> > 2) We would prefer that all data come out of the SAX interfaces as
>> > raw wchar_t strings. This is the most flexible mechanism and does
>> > not lock people into using any particular implementation of a string
>> > object. It also has the highest potential performance for those
>> > folks who never need to put it into anything more formal than a raw
>> > array.
>>
>> std::basic_string<> _is_ a modern service of C++, and a pretty good
>> one from an API point of view.
>>
>> Personally I say: use std::basic_string<> and death to all other
>> string representations in C++.
>
>Agreed. I don't see why you need to obviate the C++ standard library
>string. If it's that bad, upgrade your compiler environment (e.g.
>Windows) or install an entirely new one (e.g. STLport and the like).
>

Well, if you want to take on the job of convincing our very large and
important customers that they have to do this, then I'll go along with your
proposal. Otherwise, I can't. And I don't think that they will. They have
some crufty old compilers and we have to support them, unfortunately. Many
of them do not have any namespace support at all.

And, as I said before, for those of us who are building an XML parser on
top of a comprehensive framework, this is a non-starter because the whole
point of those frameworks is to replace the standard C++ library, which is
pretty poor as a comprehensive development framework.

----------------------------------------
Dean Roddey
Software Weenie
IBM Center for Java Technology - Silicon Valley
roddey@us.ibm.com




From James.Anderson at mecomnet.de  Mon Dec 20 19:42:05 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:44 2004
Subject: roundtripping is just that [Re: SAX2: Namespace proposal]
References: <199912191350.IAA09542@hesketh.net>
Message-ID: <385E8796.7CF9DB1C@mecomnet.de>

What are parties who are interested in roundtripping planning to do when the
document tree is modified between parsing and serializing? For example, when
an element is moved from one part of the tree to another, say to a portion of
the tree where the prefix is captured? As soon as a tree is modified, the
prefix can no longer be assumed valid and may need to be regenerated for
serialization.

Either the prefixes are unique within the scope of a document and static
relations to the respective uri's suffice, or, they are not unique to a
document and only a lookup based on a lexically apparent uri binding can be
guaranteed to produce a correct prefix. In the former case binding the prefix
to the name is redundant. In the latter case it is incorrect.

I like the idea of the instance-based API. It will make it easier to ignore
the prefixes.
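
A minimal sketch of the lookup described for the second case, where a prefix
is only usable if the binding lexically in scope at the point of
serialization still maps it to the wanted URI. PrefixScopes and its methods
are hypothetical names, and "uri:A" / "uri:B" are placeholder URIs.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

class PrefixScopes {
    // One map of prefix -> URI per open element; innermost scope at the head.
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    void push(Map<String, String> declarations) {
        scopes.push(new HashMap<>(declarations));
    }

    void pop() {
        scopes.pop();
    }

    /** Resolves a prefix against the innermost declaration that mentions it. */
    String uriFor(String prefix) {
        for (Map<String, String> scope : scopes) {
            String uri = scope.get(prefix);
            if (uri != null) {
                return uri;
            }
        }
        return null;
    }

    /** Finds a prefix bound to uri that is not shadowed by an inner scope. */
    String prefixFor(String uri) {
        for (Map<String, String> scope : scopes) {
            for (Map.Entry<String, String> e : scope.entrySet()) {
                if (e.getValue().equals(uri) && uri.equals(uriFor(e.getKey()))) {
                    return e.getKey();
                }
            }
        }
        return null; // no usable prefix: a new declaration must be generated
    }

    public static void main(String[] args) {
        PrefixScopes ctx = new PrefixScopes();
        ctx.push(Collections.singletonMap("p", "uri:A"));
        System.out.println(ctx.prefixFor("uri:A")); // p
        ctx.push(Collections.singletonMap("p", "uri:B")); // "p" is captured here
        System.out.println(ctx.prefixFor("uri:A")); // null
    }
}
```

An element in "uri:A" that is moved under the inner scope cannot keep its
original prefix "p", so a prefix stored with the name would serialize wrongly.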




From stefan.haustein at trantor.de  Mon Dec 20 19:53:37 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain> <385C7716.6F394D37@jclark.com> <385D7DB8.3EFDC6DE@trantor.de> <m3so0xiu6k.fsf@localhost.localdomain> <385E4EF9.A9D45CA8@trantor.de> <m3hfhdiop6.fsf@localhost.localdomain>
Message-ID: <385E8946.9C76437E@trantor.de>

David Megginson wrote:
> Stefan Haustein <stefan.haustein@trantor.de> writes:
> > OK, what about
> >
> >  void startElement(String localName, AttributeList attr,
> > NameSpaceContext nsc)
> >
> > NameSpaceContext could be immutable and thus be reused while unchanged.
> > I could still remember all parameters since there is only one more at
> > the end :-)
> >
> > The parser would only need to check if the NSC has changed and could
> > reuse the same object otherwise.
> 
> How would that help with attribute lists?

Where is the problem in adding something similar to the attribute lists,
e.g.: 

- getLocalName
- getType
- getValue
- getNameSpaceContext

NameSpaceContext could provide access methods for the namespace URI and
the prefix. It could also provide a method "getName (String localName)"
that builds the full name if needed.

Best regards

Stefan



From david at megginson.com  Mon Dec 20 20:02:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <385E8946.9C76437E@trantor.de>
References: <14427.62086.25937.792412@localhost.localdomain>
	<385C7716.6F394D37@jclark.com>
	<385D7DB8.3EFDC6DE@trantor.de>
	<m3so0xiu6k.fsf@localhost.localdomain>
	<385E4EF9.A9D45CA8@trantor.de>
	<m3hfhdiop6.fsf@localhost.localdomain>
	<385E8946.9C76437E@trantor.de>
Message-ID: <14430.35598.995518.75644@localhost.localdomain>

Stefan Haustein writes:

 > David Megginson wrote:
 > > Stefan Haustein <stefan.haustein@trantor.de> writes:
 > > > OK, what about
 > > >
 > > >  void startElement(String localName, AttributeList attr,
 > > > NameSpaceContext nsc)
 > > >
 > > > NameSpaceContext could be immutable and thus be reused while unchanged.
 > > > I could still remember all parameters since there is only one more at
 > > > the end :-)
 > > >
 > > > The parser would only need to check if the NSC has changed and could
 > > > reuse the same object otherwise.
 > > 
 > > How would that help with attribute lists?
 > 
 > Where is the problem in adding something similar to the attribute lists,
 > e.g.: 
 > 
 > - getLocalName
 > - getType
 > - getValue
 > - getNameSpaceContext
 > 
 > NameSpaceContext could provide access methods for the namespace URI and
 > the prefix. It also
 > could provide a method "getName (String localName)" that builds the full
 > name if needed. 

I'm not sure that this approach buys us much, in the end.  You'd need
something like this:

  public class AttributeList
  {
    public int getLength ();

    public NamespaceContext getNamespaceContext (int i);
    public String getLocalName (int i);
    public String getType (int i);
    public String getValue (int i);

    public String getType (NamespaceContext c, String localName);
    public String getValue (NamespaceContext c, String localName);
  }

That does not seem substantially different than James Clark's
proposal:

  public class AttributeList
  {
    public int getLength ();

    public String getNamespaceURI (int i);
    public String getLocalName (int i);
    public String getType (int i);
    public String getValue (int i);

    public String getType (String namespaceURI, String localName);
    public String getValue (String namespaceURI, String localName);
  }

The only difference (other than being able to add methods into the
NamespaceContext class) is that you want the application to keep track 
of what the current Namespace URI is rather than having the parser
keep track of it.  

The main choice, then, is between the following (imagine that "foo" is 
a full URI):

  startNamespace("foo")
  startElement("bar", atts)
  characters("Hello, world!")
  endElement("bar")
  endNamespace("foo")

and

  startElement("foo", "bar", atts)
  characters("Hello, world!")
  endElement("foo", "bar")

I cannot say that I have an extremely strong preference either way,
but if (as some people suspect) documents will consist of elements
from many Namespaces mixed together, the latter might be a little more 
efficient.
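
A rough sketch of the difference between the two styles, with attributes and
character data left out for brevity. All interface and class names here
(ScopedHandler, QualifiedHandler, ScopedToQualified) are hypothetical and not
part of any SAX draft. The adapter shows the bookkeeping a consumer of the
first style has to do to recover what the second style hands over directly.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

class EventStyles {
    /** First style: the Namespace is announced by separate events. */
    interface ScopedHandler {
        void startNamespace(String uri);
        void startElement(String localName);
        void endElement(String localName);
        void endNamespace(String uri);
    }

    /** Second style: every element event carries its Namespace URI. */
    interface QualifiedHandler {
        void startElement(String uri, String localName);
        void endElement(String uri, String localName);
    }

    /** Tracks the current Namespace so first-style events can be re-qualified. */
    static final class ScopedToQualified implements ScopedHandler {
        private final Deque<String> namespaces = new ArrayDeque<>();
        private final QualifiedHandler target;

        ScopedToQualified(QualifiedHandler target) {
            this.target = target;
        }

        private String current() {
            String uri = namespaces.peek();
            return uri == null ? "" : uri;
        }

        public void startNamespace(String uri) { namespaces.push(uri); }
        public void endNamespace(String uri) { namespaces.pop(); }
        public void startElement(String localName) {
            target.startElement(current(), localName);
        }
        public void endElement(String localName) {
            target.endElement(current(), localName);
        }
    }

    /** Replays the first event stream and records what the adapter emits. */
    static List<String> demo() {
        final List<String> log = new ArrayList<>();
        ScopedHandler handler = new ScopedToQualified(new QualifiedHandler() {
            public void startElement(String uri, String localName) {
                log.add("start {" + uri + "}" + localName);
            }
            public void endElement(String uri, String localName) {
                log.add("end {" + uri + "}" + localName);
            }
        });
        handler.startNamespace("foo");
        handler.startElement("bar");
        handler.endElement("bar");
        handler.endNamespace("foo");
        return log;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [start {foo}bar, end {foo}bar]
    }
}
```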

What does everyone else think?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec 20 20:15:02 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <3.0.32.19991218163458.01500100@pop.intergate.ca>
References: <3.0.32.19991218163458.01500100@pop.intergate.ca>
Message-ID: <14430.36362.589577.199567@localhost.localdomain>

Tim Bray writes:
 > At 03:45 PM 12/18/99 -0500, David Megginson wrote:

 > >That said, I think that the best solution in SAX2 will be to allow
 > >either or both Namespace-qualified and raw XML names.  People seem to
 > >want both, and although I *strongly* disagree, DOM2 has decided to
 > >provide what should be irrelevant information about the original
 > >prefix used.
 > 
 > Actually, as I just finished arguing in response to your other note, 
 > I think what people care about is having the prefix/uri mapping available,
 > not knowing what prefix was actually used.  Have I missed something? Is
 > there an application scenario where you get
 > 
 > <a xmlns:a="http://x.com" xmlns:b="http://x.com">
 >   <a:foo /><b:bar />
 >   </a>
 > 
 > ...and you actually care that the foo and bar had different prefixes?  
 > I'd find that really hard to believe.

Yeah, I do too, but so many people (some from big companies) have
sworn up and down on every holy book in Creation that they need that
kind of thing -- and managed to convince the DOM WG to include it in
DOM2 -- that I simply have to assume myself (and Tim) wrong.  At least 
with James's proposal, this silliness won't have to be in people's
faces unless they explicitly request it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From James.Anderson at mecomnet.de  Mon Dec 20 20:18:48 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
References: <3.0.32.19991220121633.01d544a0@nexus.webmethods.com>
Message-ID: <385E9038.2F428B86@mecomnet.de>

If your goal is a simple interface, use symbols. That is, reify the names as
first class objects. One of the properties of the respective symbol is its
package. One of the properties of the package is its static name. In the
simple case, this name is the URI.
If you go this route, you need only
> 
> - type of piece (eg. element, attribute, PI, comment, text, etc.)
> - the name-symbol
> - text (when applicable)
> 

The per-element-type case does not change the interfaces, as the name of the
package is generated to accomplish the desired partition. (once you figure out
what that is...)

> It seems that the following additional pieces are needed to represent attribute belonging to the per-element-type namespace partition:
> 
> - namespace URI of the associated element type (subsumed in package name)
> - local name of the associated element type (subsumed in package name)
> 

There's an XML processor which comes with the cl-http server. It passes
symbols around, nothing else.
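
For readers outside Lisp, a rough Java analogue of such symbols is an
interned name object, so that two names are equal exactly when they are the
same object. NameSymbol is a hypothetical class, and the "uri1#a" package
naming used for the per-element-type partition is only an assumed
convention for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

final class NameSymbol {
    private static final Map<String, NameSymbol> TABLE = new ConcurrentHashMap<>();

    final String packageName; // in the simple case, the Namespace URI
    final String localName;

    private NameSymbol(String packageName, String localName) {
        this.packageName = packageName;
        this.localName = localName;
    }

    /** Returns the unique symbol for this package and local name. */
    static NameSymbol intern(String packageName, String localName) {
        // Assumes package names never contain '}', so the key is unambiguous.
        String key = "{" + packageName + "}" + localName;
        return TABLE.computeIfAbsent(key, k -> new NameSymbol(packageName, localName));
    }

    public static void main(String[] args) {
        NameSymbol a = intern("http://x.com", "foo");
        NameSymbol b = intern("http://x.com", "foo");
        System.out.println(a == b); // true

        // Per-element-type partition: the package name is generated per element type.
        NameSymbol cOfA = intern("uri1#a", "c");
        NameSymbol cOfX = intern("uri1#x", "c");
        System.out.println(cOfA == cOfX); // false
    }
}
```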




From rja at arpsolutions.demon.co.uk  Mon Dec 20 20:29:07 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain><385C7716.6F394D37@jclark.com><385D7DB8.3EFDC6DE@trantor.de><m3so0xiu6k.fsf@localhost.localdomain><385E4EF9.A9D45CA8@trantor.de><m3hfhdiop6.fsf@localhost.localdomain><385E8946.9C76437E@trantor.de> <14430.35598.995518.75644@localhost.localdomain>
Message-ID: <004001bf4b28$6faf07a0$c5010180@p197>

My vote goes for this:

>   NamespaceDecl("foo","http://www.foo.com")
>   startElement("foo", "bar", atts)
>   characters("Hello, world!")
>   endElement("foo", "bar")

I've really not had time to follow the thread though, so I'd be grateful if
somebody could summarize the pros and cons of the approaches suggested, in
some type of article maybe?

Certainly the above event stream would generally be faster to process, as 1)
there would be fewer method calls, 2) 9 times out of 10 the namespace prefix
used will be shorter than the full URL and therefore result in quicker
string comparisons etc., and 3) it would reduce the tendency of software to be
hard-wired to given URIs.

 > ...and you actually care that the foo and bar had different prefixes?
 > I'd find that really hard to believe.

I must have missed the point on this, but if foo and bar exist in both
namespaces then the prefixes are very very important.  (Well, at least the
namespaces they imply are important for reasons of semantics etc )

Cheers,

Rich.

Professional XML -
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861003110
Professional ASP 3.0 -
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002610
Beginning ASP Components -
http://www.wrox.com/Consumer/Store/Details.asp?ISBN=1861002882




From costello at mitre.org  Mon Dec 20 20:34:24 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:45 2004
Subject: Representing IP addresses in XML Schema
Message-ID: <385E931C.6862D724@mitre.org>

Hi Folks,

I am trying to create an IP address datatype using the latest xml schema
spec, and am having difficulty seeing the solution.  My initial attempt
was:

<datatype name="IP" source="string">
   <pattern value="\d{3}.\d{3}.\d{3}.\d{3}"/>
</datatype>

However, this is not satisfactory - it allows each field in the IP to
have values from 000-999.  I want to restrict the possible values to:
[0-255].[0-255].[0-255].[0-255]

Any thoughts on how to do this?  /Roger
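
One possible approach is to spell out the 0-255 range as alternatives inside
the regular expression. The sketch below checks the expression with
java.util.regex; the class name Ipv4Pattern is made up for illustration, and
whether the schema draft's pattern language accepts the same alternation and
character-class syntax is an assumption that would need checking against the
spec. The dots are escaped because an unescaped "." matches any character.

```java
import java.util.regex.Pattern;

class Ipv4Pattern {
    // One field: 250-255, 200-249, 100-199, or 0-99 without a leading zero.
    static final String OCTET = "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";

    // Four fields separated by literal dots.
    static final String DOTTED_QUAD = OCTET + "(\\." + OCTET + "){3}";

    private static final Pattern PATTERN = Pattern.compile(DOTTED_QUAD);

    /** True when the whole string is a dotted quad with each field in 0-255. */
    static boolean isValid(String s) {
        return PATTERN.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("192.168.0.1"));     // true
        System.out.println(isValid("256.1.1.1"));       // false
        System.out.println(isValid("999.999.999.999")); // false
    }
}
```

Note that this form rejects fields padded with leading zeros (such as "001");
if those should be allowed, the last alternative would need to be widened.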




From Vlashua at rsgsystems.com  Mon Dec 20 20:42:02 1999
From: Vlashua at rsgsystems.com (Vane Lashua)
Date: Mon Jun  7 17:18:45 2004
Subject: XML in the toaster (was Why I Hate Palmtops)
Message-ID: <E9EB65078F9BD31193FE009027B100E10CBF71@RSGMAIL01>

Thank god for American football and commercials. Yesterday a TV ad showed
cell phone users in Finland using cell phones as terminals/"credit cards"
for ATMs, dispensing machines, ticketing, etc. On the radio news this
morning, a toy store is reported to be giving in-store scanners to kids who
would scan barcodes of desired items then download the list to the register,
where it would be emailed? printed? for delivery to the parent. The phone
could as easily do this work. Whether the phone is a thick or thin client,
you don't "need" XML for this kind of application, but "specific" turns
generic when you let it. There's a jini(tm) in your future, and it already
speaks XML.
Vane   

> -----Original Message-----
> From: Gavin Thomas Nicol [mailto:gtn@ebt.com]
> Sent: Saturday, November 27, 1999 5:41 PM
> To: xml-dev@ic.ac.uk
> Subject: RE: RE: Why I Hate Palmtops (was: Re: SGML, XML and SML)
> 
> 
> > Take a phone for example.  Your front office could send you
> > some notes in XML format so that if it contains a phone number,
> > the phone can display a small phone icon at the corner of the
> > message and make the call for you when you press a button.  Your
> > business card can be stored on your phone so that you can give
> > it to another person by simply pointing it at his phone.  Another
> > use might be to receive address information so that you can find it
> > using cellular tri-angulation or satellite technology.
> 
> This is such a specific application that I fail to see the necessity
> of XML... unless of course we imagine *all* phones
> being able to deal with such messages.
> 
> 


From stefan.haustein at trantor.de  Mon Dec 20 20:48:36 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
References: <14427.62086.25937.792412@localhost.localdomain>
		<385C7716.6F394D37@jclark.com>
		<385D7DB8.3EFDC6DE@trantor.de>
		<m3so0xiu6k.fsf@localhost.localdomain>
		<385E4EF9.A9D45CA8@trantor.de>
		<m3hfhdiop6.fsf@localhost.localdomain>
		<385E8946.9C76437E@trantor.de> <14430.35598.995518.75644@localhost.localdomain>
Message-ID: <385E9627.A3A20987@trantor.de>

> I'm not sure that this approach buys us much, in the end.  You'd need
> something like this:
> 
>   public class AttributeList
>   {
>     public int getLength ();
> 
>     public NamespaceContext getNamespaceContext (int i);
>     public String getLocalName (int i);
>     public String getType (int i);
>     public String getValue (int i);
> 
>     public String getType (NamespaceContext c, String localName);
>     public String getValue (NamespaceContext c, String localName);
>   }
> 

I would further propose that the AttributeList "knows" the
NamespaceContext of the corresponding element and provides 
two additional methods defaulting to the NamespaceContext
of the element:

   public String getType (String localName);
   public String getValue (String localName);

Best regards

Stefan
-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From donpark at docuverse.com  Mon Dec 20 20:58:58 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
In-Reply-To: <3.0.32.19991220100008.0132e3a0@pop.intergate.ca>
Message-ID: <001701bf4b2d$315db6e0$d1940e18@smateo1.sfba.home.com>

Tim Bray wrote:
>Another way to ask the same question is: in
><html:a href="foo">, what namespace is href in?
>The fuzzy, non-precise answer that kept coming
>up was "it's in the namespace of the <html:a>
>element".  That sounded good to everyone but
>it was hard to understand at a programmer-level
>term just what it meant.

How about "attributes inherit the element's namespace?"

Frankly, I think some parts of the XML Namespace spec
suffer from paranoia over uncommon use cases.
I have suffered such paranoia myself, usually after
a bone-jarring series of arguments.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From rzepa at tesco.net  Mon Dec 20 21:07:46 1999
From: rzepa at tesco.net (Rzepa, Henry)
Date: Mon Jun  7 17:18:45 2004
Subject: ANNOUNCE: VirtualXML.  An on-line Course on  XML
Message-ID: <v04220800b4844aa45006@[155.198.8.81]>

Through a collaboration between Graphnet and CMLConsulting, we are
pleased to announce a Web-based Virtual course on XML (VirtualXML).

The start date will be mid-January 2000, and the course will comprise three
modules, each of approximately one month's duration.

All course material is prepared in XML (XHTML) and managed as XML entities. 
We shall use this to demonstrate many features of XML including:
- navigation strategies depending on experience and interests of course members
- indexing free text *in context*
- normalisation of effort through hyperlinking to central resources
- restructuring of material
- rendering course material in different forms (XHTML, FOs, RTF/PDF, etc.)
- communal contributions to course material through agreed addressing schemes

VirtualXML will be a unique development in learning and communal activity. 
Because all the course material will be created in XML, we have a remarkable store
of real XML documents and activities to work with. We benefit greatly from the 
wide range of XML tools available, which we can help you to install and use, 
whatever your environment and experience.

If you or your company is interested in participating, please complete the on-line
questionnaire, which is available at the course home page on 
  
http://www.cmlconsulting.com/ or via the link  
http://www.cmlconsulting.com/admin/registration.html

=====================================
Henry Rzepa.
==========
Henry Rzepa. Domestic Net. Home +44 181 575 1839.



From lhill at excelergy.com  Mon Dec 20 21:09:14 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
Message-ID: <776DC00B49ECD21189750090273F729130B369@EROS>

David Megginson writes:
> The main choice, then, is between the following (imagine that 
> "foo" is 
> a full URI):
> 
>   startNamespace("foo")
>   startElement("bar", atts)
>   characters("Hello, world!")
>   endElement("bar")
>   endNamespace("foo")
> 
> and
> 
>   startElement("foo", "bar", atts)
>   characters("Hello, world!")
>   endElement("foo", "bar")

How could the first work without the expanded signature of the second?
Imagine the following:

<f:bar xmlns:f="foo" xmlns:b="baz" b:y="z">
</f:bar>

resulting in:

	startNamespace("foo")
	startNamespace("baz")
	startElement("bar", atts)

which namespace is "bar" in?

Where is the "efficiency" in the second without the explicit namespace
events?  Imagine the following:

<f:bar xmlns:f="foo" xmlns:b="baz" b:y="z">
	<f:moo b:x="z">
	</f:moo>
</f:bar>

resulting in:

	startElement("foo", "bar", atts) <== must parse and remember the namespaces
	startElement("foo", "moo", atts) <== refer to above

note that the parsing and storing of the namespaces is ADDED work as the
parser must have already done so itself!

Perhaps the two should be combined...

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================



From david at megginson.com  Mon Dec 20 21:15:26 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <776DC00B49ECD21189750090273F729130B369@EROS>
References: <776DC00B49ECD21189750090273F729130B369@EROS>
Message-ID: <14430.39984.255611.434325@localhost.localdomain>

Hill, Les writes:

 > How could the first work without the expanded signature of the second?
 > Imagine the following:
 > 
 > <f:bar xmlns:f="foo" xmlns:b="baz" b:y="z">
 > </f:bar>
 > 
 > resulting in:
 > 
 > 	startNamespace("foo")
 > 	startNamespace("baz")
 > 	startElement("bar", atts)
 > 
 > which namespace is "bar" in?

That's a separate question -- the idea of the original suggestion is
that NamespaceContext would not be nested -- in the worst case, there
would have to be start/endNamespaceContext events for every element.
This is separate from the question of reporting the scope of Namespace 
declarations.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec 20 21:28:02 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: Burn, Flag, Burn! (was Re: SAX2: Namespace proposal)
In-Reply-To: Paul Prescod's message of "Mon, 20 Dec 1999 11:46:09 -0600"
References: <3.0.32.19991220084131.014e93c0@pop.intergate.ca> <14430.24542.458680.917600@localhost.localdomain> <385E6B61.255666D3@prescod.net>
Message-ID: <m3emchi9qz.fsf_-_@localhost.localdomain>

[offline]

Paul Prescod <paul@prescod.net> writes:

> The occasional act of disrespect for the American flag creates but a 
> flickering insult to the values of democracy -- unless it provokes 
> America into limiting the freedoms that are its hallmark.
>            -- Paul Tash, executive editor of the St. Petersburg Times

Hmm -- a Canadian living in Texas publicly advocating the freedom to
burn the American flag (and quoting a Florida paper that sounds like
it's Russian)?  You'd better buy a gun.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Mon Dec 20 21:30:15 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
In-Reply-To: Joe Lapp's message of "Mon, 20 Dec 1999 13:34:28 -0500"
References: <3.0.32.19991220133428.01f6ac80@nexus.webmethods.com>
Message-ID: <m3bt7li9n7.fsf@localhost.localdomain>

Joe Lapp <jlapp@webMethods.com> writes:

> Using Java String interning, how do you guys guarantee performance
> in any of the DOM Element get*() methods that take Strings?  Do you
> require that the app intern the string before passing it in?  Do you
> try to make the methods smart so that if they're interned, you get
> performance, and if they aren't, you get a bit more of a penalty
> (for having done the intern check first)?  Do you make them dumb so
> that if you forgot to intern, you don't get anything?  Or would one
> always intern these externally provided Strings within the method?

With the DOM, I think, the biggest issue is not performance but memory
usage -- you do not want 500 separate "div2" strings floating around
in the same tree.  Interning is much more obviously essential for a
tree API than it is for a streaming API.
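
The memory point can be illustrated with a minimal sketch. It assumes a parser that copies each element name out of a character buffer; the class and method names (`InternDemo`, `names`, `distinctInstances`, `sameName`) are made up for illustration, and only `String.intern()` is standard Java.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Sketch only: shows that interned element names share one String instance.
class InternDemo {
    // Builds `count` element names the way a parser might: each one is a
    // fresh String copied out of a character buffer.
    static List<String> names(int count, boolean intern) {
        char[] buffer = {'d', 'i', 'v', '2'};
        List<String> result = new ArrayList<String>();
        for (int i = 0; i < count; i++) {
            String name = new String(buffer);          // a distinct object each time
            result.add(intern ? name.intern() : name); // intern() returns the canonical copy
        }
        return result;
    }

    // Counts distinct String objects by identity, not by equals().
    static int distinctInstances(List<String> names) {
        Set<String> seen =
            Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        seen.addAll(names);
        return seen.size();
    }

    // One possible "smart" lookup comparison: cheap identity test first,
    // falling back to equals() when the caller did not intern.
    static boolean sameName(String a, String b) {
        return a == b || a.equals(b);
    }
}
```

With 500 names, the non-interned list holds 500 separate objects while the interned list holds one shared object, which is the saving that matters for a tree that stays in memory.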


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From lhill at excelergy.com  Mon Dec 20 21:38:41 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: Namespace proposal
Message-ID: <776DC00B49ECD21189750090273F729130B36B@EROS>

David Megginson writes:
> Hill, Les writes:
>  > 	startNamespace("foo")
>  > 	startNamespace("baz")
>  > 	startElement("bar", atts)
>  > 
>  > which namespace is "bar" in?
> 
> That's a separate question -- the idea of the original suggestion is
> that NamespaceContext would not be nested -- in the worst case, there
> would have to be start/endNamespaceContext events for every element.
> This is separate from the question of reporting the scope of 
> Namespace 
> declarations.

I must have missed that nuance.  In fact I am still missing it :)

If the element name has its own contextual information (i.e. using compound
names+), then yes, the namespace scoping is orthogonal.  If the element name
does not have the contextual information, then namespace scoping is crucial.
Unfortunately, startNamespace()/endNamespace() isn't good enough (see above
:)

What am I missing?

+I read the original event sequence as using simple Strings and NOT
compound names, mostly because the second sequence in the message explicitly
identified the namespace.

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================



From tbray at textuality.com  Mon Dec 20 21:45:48 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
Message-ID: <3.0.32.19991220134521.0133fbf0@pop.intergate.ca>

At 01:00 PM 12/20/99 -0800, Don Park wrote:
>Tim Bray wrote:
>>Another way to ask the same question is: in
>><html:a href="foo">, what namespace is href in?
>
>How about "attributes inherit the element's namespace?"

That is equivalent to asserting that

<html:a html:href="foo"> and
<html:a href="foo">

must always in all languages and in all applications be considered
identical.  Which is not an unreasonable viewpoint.  But the WG at
the time, after lengthy consideration, couldn't swallow it. -Tim
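
The distinction can be observed with a namespace-aware parser. The sketch below uses the SAX2 API as it was later finalized in `org.xml.sax` (not any draft discussed in this thread) together with JAXP; the class name `AttributeNamespaceDemo`, the helper `hrefNamespace`, and the XHTML namespace URI bound to `html` are assumptions made for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: reports the Namespace URI a parser assigns to an href attribute.
class AttributeNamespaceDemo {
    static final String XHTML = "http://www.w3.org/1999/xhtml"; // assumed binding for "html"

    // Parses the document and returns the Namespace URI reported for the
    // attribute whose local name is "href" (null if there is none).
    static String hrefNamespace(String xml) {
        final String[] uri = new String[1];
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String ns, String local,
                                             String qName, Attributes atts) {
                        for (int i = 0; i < atts.getLength(); i++) {
                            if ("href".equals(atts.getLocalName(i))) {
                                uri[0] = atts.getURI(i);
                            }
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return uri[0];
    }
}
```

For `<html:a xmlns:html="..." href="foo"/>` the attribute is reported with an empty Namespace URI, while `html:href` is reported with the URI bound to `html`, so the two forms are distinguishable to an application.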




From david at megginson.com  Mon Dec 20 21:48:42 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: Burn, Flag, Burn! (was Re: SAX2: Namespace proposal)
In-Reply-To: David Megginson's message of "20 Dec 1999 16:27:00 -0500"
References: <3.0.32.19991220084131.014e93c0@pop.intergate.ca> <14430.24542.458680.917600@localhost.localdomain> <385E6B61.255666D3@prescod.net> <m3emchi9qz.fsf_-_@localhost.localdomain>
Message-ID: <m34sddi8sg.fsf@localhost.localdomain>

Someone writes:

> [offline]

Or not, depending on whether people are smart enough to change the
headers before hitting Send.

> Paul Prescod <paul@prescod.net> writes:
> 
> > The occasional act of disrespect for the American flag creates but a 
> > flickering insult to the values of democracy -- unless it provokes 
> > America into limiting the freedoms that are its hallmark.
> >            -- Paul Tash, executive editor of the St. Petersburg Times
> 
> Hmm -- a Canadian living in Texas publicly advocating the freedom to
> burn the American flag (and quoting a Florida paper that sounds like
> it's Russian)?  You'd better buy a gun.

Apologies to all flag-lovin' Texan gun owners out there.  This message
was actually posted by someone else -- almost certainly not a
Canadian, and certainly not me... I mean, not David.  Oh well.


All the best,


Someone else

-- 
not David Megginson                 not david@megginson.com
           and definitely not http://www.megginson.com/



From donpark at docuverse.com  Mon Dec 20 21:51:38 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2: a new namespace proposal
In-Reply-To: <385E9627.A3A20987@trantor.de>
Message-ID: <001901bf4b34$8df835e0$d1940e18@smateo1.sfba.home.com>

Here is a proposal that has very little impact on SAX
interfaces:

  // NamespaceHandler is a callback interface for
  // handling namespace declarations
public interface NamespaceHandler {
  void startNamespace(String uri, String prefix);
  void endNamespace(String uri, String prefix);
}

  // New interface that provides namespace support
  // to DocumentHandlers similarly to the way Locator
  // provides source information.
  // NOTE: With the exception of getNameObject,
  // Namespaces methods must be invoked within
  // startElement method.
public interface Namespaces
{
  Object getNameObject(String uri,
           String localPart);
    // get a unique non-mutable opaque object
    // representing a fully qualified name

  boolean equals(String rawName, Object no);
    // see if given name string and name object
    // are equivalent.
  boolean equalsNS(String rawName, Object no);
    // see if given name string belongs to the
    // namespace of given name object.

    // same as above except uses components
    // of a name object.
  boolean equals(String rawName,
            String uri, String localName);
  boolean equalsNS(String rawName,
            String uri);

    // name information methods
  String getNameURI(String rawName);
  String getNamePrefix(String rawName);
  String getNameLocalPart(String rawName);
}

  // DocumentHandlerNS extends DocumentHandler
  // with namespace support
public interface DocumentHandlerNS extends
  DocumentHandler
{
  void setDocumentNamespaces(Namespaces nss);
}

  // ParserNS extends Parser with namespace
  // support
public interface ParserNS extends Parser
{
  void setNamespaceHandler(NamespaceHandler h);
}

With the above changes, SAX implementations can
detect whether an XML application is namespace-aware
by checking whether DocumentHandlerNS is being used.

Here is an example of a DocumentHandler based on
this proposal:

public class MySAXApp implements DocumentHandlerNS {
  // MY_NS_URI is not defined in the proposal; this value is a placeholder.
  static final String MY_NS_URI = "urn:example:my-namespace";
  Namespaces _nss;
  Object     _foo, _bar;
  public void setDocumentNamespaces(Namespaces nss) {
    _nss = nss;
    _foo = nss.getNameObject(MY_NS_URI, "foo");
    _bar = nss.getNameObject(MY_NS_URI, "bar");
  }
  public void startElement(String name, AttributeList atts) {
    if (_nss.equals(name, _foo)) {
      // it's our 'foo'
      String prefix = _nss.getNamePrefix(name);
        // what prefix was used, if any.  If the default NS
        // is in effect, this returns null.
    }
    else if (_nss.equalsNS(name, MY_NS_URI)) {
      // it's not foo but one of ours
    }
  }
  // The remaining DocumentHandler methods are omitted here.
}

There might be some holes in this proposal, so please
constrain your comments to the design approach rather
than design specifics.  The key point is the use of a
Locator-like mechanism.
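
To make the name-information methods concrete, here is a rough sketch of how a parser might back them. It is not part of the proposal: `FlatNamespaces` and `declare` are hypothetical, and the single flat prefix table ignores scoping, which a real implementation would have to track per element.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: a flat (unscoped) prefix table behind the proposed
// name-information methods. The class name and declare() are hypothetical.
class FlatNamespaces {
    private final Map<String, String> prefixToUri = new HashMap<String, String>();

    // Records a declaration; the empty prefix stands for the default namespace.
    void declare(String prefix, String uri) {
        prefixToUri.put(prefix, uri);
    }

    // Returns the prefix of a raw name, or null when there is none
    // (matching the proposal's note about the default namespace).
    String getNamePrefix(String rawName) {
        int colon = rawName.indexOf(':');
        return colon < 0 ? null : rawName.substring(0, colon);
    }

    String getNameLocalPart(String rawName) {
        return rawName.substring(rawName.indexOf(':') + 1);
    }

    // Returns the URI bound to the raw name's prefix, or null if undeclared.
    String getNameURI(String rawName) {
        String prefix = getNamePrefix(rawName);
        return prefixToUri.get(prefix == null ? "" : prefix);
    }

    boolean equalsNS(String rawName, String uri) {
        return uri.equals(getNameURI(rawName));
    }

    boolean equals(String rawName, String uri, String localName) {
        return equalsNS(rawName, uri)
            && localName.equals(getNameLocalPart(rawName));
    }
}
```

The application keeps receiving raw names in startElement and asks the helper to interpret them, which is the same division of labour Locator uses for source positions.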

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From donpark at docuverse.com  Mon Dec 20 22:00:55 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
In-Reply-To: <3.0.32.19991220134521.0133fbf0@pop.intergate.ca>
Message-ID: <001e01bf4b35$d7ef5740$d1940e18@smateo1.sfba.home.com>

>must always in all languages and in all applications be considered
>identical.  Which is not an unreasonable viewpoint.  But the WG at
>the time, after lengthy consideration, couldn't swallow it. -Tim

I guess being a big-mouth, like me, has its advantages. <g>

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From dhunter at Mobility.com  Mon Dec 20 22:03:08 1999
From: dhunter at Mobility.com (Hunter, David)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
Message-ID: <805C62F55FFAD1118D0800805FBB428D02BC01E5@cc20exch2.mobility.com>

From: Tim Bray [mailto:tbray@textuality.com]
Sent: Monday, December 20, 1999 4:45 PM
> 
> That is equivalent to asserting that
> 
> <html:a html:href="foo"> and
> <html:a href="foo">
> 
> must always in all languages and in all applications be considered
> identical.  Which is not an unreasonable viewpoint.  But the WG at
> the time, after lengthy consideration, couldn't swallow it. -Tim

But it seems to be a fundamental point of view of the specs coming out of
the W3C, implied or otherwise.  Take XSLT, for example.  If I declare a
stylesheet, I do it like this:

<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!--other stuff-->
</xsl:stylesheet>

Notice that there is no explicit namespace declared for the version
attribute, it's just assumed to be the same as the <stylesheet> element.
But, there is also a shorthand we can use for stylesheets with one template
that matches against "/", so I could do something like this:

<html xsl:version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<!--other stuff-->
</html>

in which case I *have* to declare the namespace of version explicitly.

Are there any specs coming out of the W3C that *don't* assume that

<whatever:element version="1.0" xmlns:whatever="whatever">
is identical to
<whatever:element whatever:version="1.0" xmlns:whatever="whatever">

?



From donpark at docuverse.com  Mon Dec 20 22:04:12 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:45 2004
Subject: Musing over Namespaces
In-Reply-To: <81A48CA1E953D211B11600A0C9E1C1B975F148@mail.bowstreet.com>
Message-ID: <001f01bf4b36$4df544e0$d1940e18@smateo1.sfba.home.com>

>> What if the registry was a distributed semantic network?
>> If the problem could be solved, wouldn't it be worth
>> building it?  Note that this is not the way BizTalk and
>> XML.org schema repositories work.
>
>I would be willing to explore this on schema.net. Any takers?

I think a semantic tag network is a considerably larger piece of work
than an XML pattern repository.  If we could limit ourselves to
discussions for now, you can count me in.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From h.rzepa at ic.ac.uk  Mon Dec 20 22:18:23 1999
From: h.rzepa at ic.ac.uk (Rzepa, Henry)
Date: Mon Jun  7 17:18:45 2004
Subject: LISTADMIN: IMPORTANT. Transfer of this list to  OASIS.postponed.
Message-ID: <v04220801b4845968c8db@[155.198.8.81]>

Last Wednesday, I indicated that administration of the XML-DEV list
would be transferred to  the OASIS organisation, and that 
accordingly xml-dev@ic.ac.uk would stop operation on  21 December.

Because the arrangements for this transfer are not yet complete, I
would like to announce that the transfer to  OASIS is POSTPONED.

The list will continue to operate as usual during the period  21 December
- 31 December, and possibly beyond.

Please note however that all subscribe/unsubscribe requests of the type

subscribe xml-dev youremail@youraddress 

which require direct intervention and moderation, will NOT be processed during the above
period. Please do not send such requests to majordomo@ic.ac.uk 
(or worse, to the list itself). Requests of the type 

(un)subscribe xml-dev  

require no moderation and will proceed as usual.

May I wish everyone on the list a happy Christmas, and a productive
new Year. 

Henry Rzepa. +44 171 594 5774 (Office) +44 171 594 5804 (Fax)
http://www.ch.ic.ac.uk/rzepa/



From david at megginson.com  Mon Dec 20 23:03:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2 Namespace Support
Message-ID: <14430.46481.974707.922192@localhost.localdomain>

OK, here goes...

Background
----------

I accept Tim Bray's argument that Namespace-qualified names should be
passed on in two parts and not one, and that Namespace-aware
processing is the future of XML.

I accept James Clark's argument that there must be a mechanism for
passing on the original prefix if the parser supports doing so and the 
application desires it, if only for DOM2 support.


General Rules
-------------

1. By default, SAX2 parsers shall perform Namespace processing unless
   explicitly requested not to do so.  There's no point having them
   start in an indeterminate state, or in allowing SAX2 parsers not to
   do Namespace processing.  I will provide a filter that can be
   embedded in the driver for parsers that don't do NS processing
   natively.

2. There will be features for (i) requesting that the original prefix
   be prepended to each local name and (ii) turning off Namespace
   processing altogether.  The default value for each of these
   features will be false, and no parser is required to support either 
   of them.

3. If a LexicalHandler is set, the parser may use it to report the
   scope of Namespace declarations.  Note that this is a little
   brittle, and probably less useful than people think, but that it is
   essential for XSLT.  Not all parsers will support LexicalHandler.

4. Namespace-qualified names are always reported as two separate
   strings: the Namespace URI and the local name.  The local name may
   have the original prefix prepended at client request, but the
   prefix will not be there by default.

The idea of all of this is that fully-cooked Namespace processing is
the default behaviour and the normal, transparent operating mode for
SAX2 -- most application writers need never know that other modes are
available.

However, there are optional, non-obtrusive mechanisms for passing on
extra information (such as the original prefixes and the scope of NS
declarations).  The presence or absence of support for these optional
features can be determined by feature queries.
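
For readers who know the API that was eventually published, the rules above correspond roughly to two standard feature flags. The sketch below uses the finalized `org.xml.sax` interfaces and JAXP rather than the draft signatures in this message; `NamespaceFeatureDemo` and `elementNames` are illustrative names, and the `urn:example:` URIs are placeholders.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: Namespace processing on, prefix reporting off, and element
// names delivered as two parts (Namespace URI plus local name).
class NamespaceFeatureDemo {
    static final String NAMESPACES =
        "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES =
        "http://xml.org/sax/features/namespace-prefixes";

    // Returns each element name in document order as "{uri}local".
    static List<String> elementNames(String xml) {
        final List<String> names = new ArrayList<String>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();
            reader.setFeature(NAMESPACES, true);          // do Namespace processing
            reader.setFeature(NAMESPACE_PREFIXES, false); // no xmlns* attributes reported
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String local,
                                         String qName, Attributes atts) {
                    names.add("{" + uri + "}" + local);
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return names;
    }
}
```

Note that the finalized startElement passes the prefixed name as a separate third argument instead of prepending the prefix to the local name.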


Implementation
--------------

This implementation is based largely on suggestions from James Clark.


[from org.xml.sax2.DocumentHandler]

  public void startElement (String ns, String name, 
                            AttributeList atts)
    throws SAXException;

  public void endElement (String ns, String name)
    throws SAXException;


[org.xml.sax2.AttributeList]

  public interface AttributeList
  {
    public int getLength ();

    public String getNamespaceURI (int i);
    public String getName (int i);
    public String getType (int i);
    public String getValue (int i);

    public String getType (String ns, String localName);
    public String getValue (String ns, String localName);

                    // For searching on prefixed names
    public String getType (String name);
    public String getValue (String name);
  }


[from org.xml.sax2.LexicalHandler]

  public void startNamespaceDeclScope (String prefix, String uri)
    throws SAXException;

  public void endNamespaceDeclScope (String prefix)
    throws SAXException;


Unless someone shows a catastrophic problem with all this (not a
purely aesthetic one), I plan to go ahead to other SAX2 problems now.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From andrewl at microsoft.com  Mon Dec 20 23:23:54 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
Message-ID: <33D189919E89D311814C00805F1991F7F4AA6E@RED-MSG-08>

I support Tim on this point.  If the discussion is about what the
specification actually says (as distinct from a debate over what you would
rather it said) then, as Tim says very correctly, the following two are not
generally equivalent:

<foo:a foo:href="bar"> and
<foo:a href="foo">

Best wishes,
Andrew Layman



From rja at arpsolutions.demon.co.uk  Mon Dec 20 23:46:17 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2 Namespace Support
References: <14430.46481.974707.922192@localhost.localdomain>
Message-ID: <012701bf4b44$0c68b030$4a5eedc1@arp01>

Why are you trying to complicate our lives :)

Please change these from:

>   public void startElement (String ns, String name,
>                             AttributeList atts)
>     throws SAXException;
>
>   public void endElement (String ns, String name)
>     throws SAXException;

to:

>   public void startElement (String nsPrefix, String ns, String name,
>                             AttributeList atts)
>     throws SAXException;
>
>   public void endElement (String nsPrefix,String ns, String name)
>     throws SAXException;

We can build our DOM more easily this way, since we don't have to buffer the
other namespace events.  I would also be surprised if fewer than 80% of SAX2
users a) wouldn't mind this being present and b) would probably use it.

Anybody agree with me or am I standing in the dark all alone on this ?

Thanks,

Rich
----- Original Message -----
From: David Megginson <david@megginson.com>
To: XMLDev list <xml-dev-digest@ic.ac.uk>
Sent: Monday, December 20, 1999 11:02 PM
Subject: SAX2 Namespace Support


> OK, here goes...
>
> Background
> ----------
>
> I accept Tim Bray's argument that Namespace-qualified names should be
> passed on in two parts and not one, and that Namespace-aware
> processing is the future of XML.
>
> I accept James Clark's argument that there must be a mechanism for
> passing on the original prefix if the parser supports doing so and the
> application desires it, if only for DOM2 support.
>
>
> General Rules
> -------------
>
> 1. By default, SAX2 parsers shall perform Namespace processing unless
>    explicitly requested not to do so.  There's no point having them
>    start in an indeterminate state, or in allowing SAX2 parsers not to
>    do Namespace processing.  I will provide a filter that can be
>    embedded in the driver for parsers that don't do NS processing
>    natively.
>
> 2. There will be features for (i) requesting that the original prefix
>    be prepended to each local name and (ii) turning off Namespace
>    processing altogether.  The default value for each of these
>    features will be false, and no parser is required to support either
>    of them.
>
> 3. If a LexicalHandler is set, the parser may use it to report the
>    scope of Namespace declarations.  Note that this is a little
>    brittle, and probably less useful than people think, but that it is
>    essential for XSLT.  Not all parsers will support LexicalHandler.
>
> 4. Namespace-qualified names are always reported as two separate
>    strings: the Namespace URI and the local name.  The local name may
>    have the original prefix prepended at client request, but the
>    prefix will not be there by default.
>
> The idea of all of this is that fully-cooked Namespace processing is
> the default behaviour and the normal, transparent operating mode for
> SAX2 -- most application writers need never know that other modes are
> available.
>
> However, there are optional, non-obtrusive mechanisms for passing on
> extra information (such as the original prefixes and the scope of NS
> declarations).  The presence or absence of support for these optional
> features can be determined by feature queries.
>
>
> Implementation
> --------------
>
> This implementation is based largely on suggestions from James Clark.
>
>
> [from org.xml.sax2.DocumentHandler]
>
>   public void startElement (String ns, String name,
>                             AttributeList atts)
>     throws SAXException;
>
>   public void endElement (String ns, String name)
>     throws SAXException;
>
>
> [org.xml.sax2.AttributeList]
>
>   public class AttributeList
>   {
>     public int getLength ();
>
>     public String getNamespaceURI (int i);
>     public String getName (int i);
>     public String getType (int i);
>     public String getValue (int i);
>
>     public String getType (String ns, String localName);
>     public String getValue (String ns, String localName);
>
>                     // For searching on prefixed names
>     public String getType (String name);
>     public String getValue (String name);
>   }
>
>
> [from org.xml.sax2.LexicalHandler]
>
>   public void startNamespaceDeclScope (String prefix, String uri)
>     throws SAXException;
>
>   public void endNamespaceDeclScope (String prefix)
>     throws SAXException;
>
>
> Unless someone shows a catastrophic problem with all this (not a
> purely aesthetic one), I plan to go ahead to other SAX2 problems now.
>
>
> All the best,
>
>
> David
>
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
>
>




From stefan.haustein at trantor.de  Mon Dec 20 23:50:42 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2 Namespace Support
References: <14430.46481.974707.922192@localhost.localdomain>
Message-ID: <385EC0DF.3F28F1D8@trantor.de>


>   public void startElement (String ns, String name,
>                             AttributeList atts)

Nice! Actually, I like it more than my own suggestion: I just did not
like the tons of parameters in the previous suggestion, and removing the
redundant one is probably a better solution than introducing a new
class....

Best regards

Stefan



From rwaldin at pacbell.net  Tue Dec 21 00:57:19 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:45 2004
Subject: SAX2 Namespace Support
References: <14430.46481.974707.922192@localhost.localdomain>
Message-ID: <385ED118.76BD2756@pacbell.net>

Overall, very nice!

My only concern is that LexicalHandler should be required by all parsers.
Otherwise you will get "fully compliant" SAX2 parsers which cannot be used to
resolve QNames found in attribute values against in-scope namespace
declarations. There are many examples where that's critical:

- evaluate XPath expressions (XSLT, XPointer, etc.)
- resolve XLink locator role (see http://www.w3.org/TR/xlink/#link-semantics)
- follow XML Schema references (see
http://www.w3.org/TR/xmlschema-1/#refSchemaConstructs)

and probably more to come... 

IMHO, LexicalHandler must be supported by all SAX2 parsers.
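
A sketch of the kind of resolution meant here, written against the finalized SAX2 API, where declaration scope is reported through `ContentHandler.startPrefixMapping` and `endPrefixMapping` rather than the `LexicalHandler` methods proposed in this thread. The class `QNameResolver`, the attribute name `select`, and the `urn:example:` URIs are assumptions for illustration.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Sketch only: resolves QNames found in "select" attribute values against
// the Namespace declarations in scope at that element.
class QNameResolver extends DefaultHandler {
    private final Map<String, Deque<String>> scopes =
        new HashMap<String, Deque<String>>();
    final List<String> resolved = new ArrayList<String>();

    @Override
    public void startPrefixMapping(String prefix, String uri) {
        Deque<String> stack = scopes.get(prefix);
        if (stack == null) {
            stack = new ArrayDeque<String>();
            scopes.put(prefix, stack);
        }
        stack.push(uri); // innermost declaration wins
    }

    @Override
    public void endPrefixMapping(String prefix) {
        Deque<String> stack = scopes.get(prefix);
        if (stack != null && !stack.isEmpty()) {
            stack.pop(); // restore the enclosing declaration, if any
        }
    }

    @Override
    public void startElement(String ns, String local, String qName, Attributes atts) {
        String value = atts.getValue("", "select"); // unqualified attribute
        if (value != null) {
            resolved.add(expand(value));
        }
    }

    // Expands "prefix:local" to "{uri}local". Whether an unprefixed name
    // should pick up the default namespace depends on the vocabulary.
    String expand(String qname) {
        int colon = qname.indexOf(':');
        String prefix = colon < 0 ? "" : qname.substring(0, colon);
        Deque<String> stack = scopes.get(prefix);
        String uri = (stack == null || stack.isEmpty()) ? "" : stack.peek();
        return "{" + uri + "}" + qname.substring(colon + 1);
    }

    // Convenience wrapper: parse a document and return the resolved names.
    static List<String> resolveSelectValues(String xml) {
        QNameResolver handler = new QNameResolver();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return handler.resolved;
    }
}
```

Without the scope events the handler would have no way to tell which URI `x:` refers to inside an attribute value, which is the gap described above.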

-Ray



From donpark at docuverse.com  Tue Dec 21 01:14:06 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:45 2004
Subject: The per-element-type namespace partition
In-Reply-To: <33D189919E89D311814C00805F1991F7F4AA6E@RED-MSG-08>
Message-ID: <000001bf4b50$d7844520$d1940e18@smateo1.sfba.home.com>

>the following two are not generally equivalent:
>
><foo:a foo:href="bar"> and
><foo:a href="foo">

I think you meant to say:

<foo:a foo:href="bar"> and
<foo:a href="bar"> are not generally equivalent.

Well, I find it rather counter-intuitive.

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From rwaldin at pacbell.net  Tue Dec 21 01:17:52 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:46 2004
Subject: Representing IP addresses in XML Schema
References: <385E931C.6862D724@mitre.org>
Message-ID: <385ED5E9.2C6FF554@pacbell.net>

"Roger L. Costello" wrote:
> ...
> <datatype name="IP" source="string">
>    <pattern value="\d{3}.\d{3}.\d{3}.\d{3}"/>
> </datatype>
>
> However, this is not satisfactory - it allows each field in the IP to
> have values from 000-999.  I want to restrict the possible values to:
> [0-255].[0-255].[0-255].[0-255]

You can avoid the problem by storing the IP as a hex address, as in:

ff.ff.ff.ff

<pattern    
value="[0-9a-f][0-9a-f].[0-9a-f][0-9a-f].[0-9a-f][0-9a-f].[0-9a-f][0-9a-f]"/>

-Ray



From tbray at textuality.com  Tue Dec 21 01:26:55 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2 Namespace Support
Message-ID: <3.0.32.19991220172413.014e5a80@pop.intergate.ca>

At 06:02 PM 12/20/99 -0500, David Megginson wrote:
>2. There will be features for (i) requesting that the original prefix
>   be prepended to each local name and (ii) turning off Namespace
>   processing altogether.  The default value for each of these
>   features will be false, and no parser is required to support either 
>   of them.

Urgh... if you want the prefix, don't you want it as a separate datum, same
as the URI?  Same arguments as before.

For the rest, I think you've found the compromise space. -Tim



From stele at fxtech.com  Tue Dec 21 01:40:50 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:46 2004
Subject: ANNOUNCE: XMLIO ver 0.3 - nestable C++ XML parser/writer
Message-ID: <385EDAE2.31985B91@fxtech.com>

This is my simple nestable, streaming, XML parser for C++ application
data, now layered over expat (thanks to all who provided prods in the
right direction). Version 0.3 also adds chained element handlers and
support for parsing lists.

	http://www.fxtech.com/xmlio/index.html

I've decided to release this under the MIT(X11) license, which I feel is
the least restrictive of the popular licenses. How this will conflict
with James Clark's expat license, I don't know yet (I'm not a lawyer nor
do I try to play one on the internet, and I didn't want to bother trying
to interpret the Mozilla license). If there is a problem, someone please
let me know!

Unicode isn't supported yet, and I'm sure there are still bugs, but I've
tried this on data-files with thousands of elements and it seems fairly
quick. I still have some optimizations to do in the memory department,
to avoid unnecessary allocation/deallocation overhead.

An ANSI C++ compiler with namespaces, exceptions, and the standard
library is required. A vanilla "C" version of this API could be built,
if there is desire.

Please check the sample object implementation (sample.cpp and sample.h)
in the distribution for an example of how this API should be used.

Here is a quick example of how the list-parsing feature mentioned above works.
Let's say I have this class:

class Date
{
public:
	enum Day
	{
		Sunday, Monday, Tuesday, 
		Wednesday, Thursday, Friday, Saturday, Days
	};

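	// declared here, defined after the class (a static array member
	// cannot be initialized inside the class body)
	static const char *labels[Days];

	void Write(XML::Output &out) const;
	void Read(XML::Parser &in);

private:
	int m_day;
};

const char *Date::labels[Date::Days] = 
{ "Sunday", "Monday", "Tuesday", 
"Wednesday", "Thursday", "Friday", "Saturday" };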

void Date::Write(XML::Output &out) const
{
	// write out an element with the day as text
	out.WriteElement("Day", labels[m_day]);
}

void Date::Read(XML::Parser &in)
{
	// set up a handler for the date as a list
	XML::Handler handlers[] = {
		XML::Handler("Day", &m_day, labels, Days),
		XML::Handler::END
	};
	in.Parse(handlers, this);
}

Note the handler for the "Day" element above takes a pointer to an int
and an array of character strings and a total. When the "Day" element is
encountered it will automatically compare the element data with the
provided list, and write the index of the day found in the m_day
variable. It will throw an exception if an invalid Day is provided.
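
The lookup described above can be sketched independently of XMLIO. This is
only an illustration of the idea, not XMLIO's actual implementation:
indexOfLabel and kDayLabels are hypothetical names, and the exception type
is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for what a list handler might do internally:
// find `text` in labels[0..count) and return its index.
int indexOfLabel(const char *text, const char *const labels[], std::size_t count)
{
    assert(text != 0 && labels != 0);
    for (std::size_t i = 0; i < count; ++i) {
        if (std::strcmp(text, labels[i]) == 0)
            return static_cast<int>(i);
    }
    // No label matched: report the offending element text.
    throw std::invalid_argument(std::string("unknown list value: ") + text);
}

// The same labels as in the Date example.
static const char *const kDayLabels[] = {
    "Sunday", "Monday", "Tuesday",
    "Wednesday", "Thursday", "Friday", "Saturday"
};
```

With this helper, element text "Wednesday" would yield index 3, which is the
value a handler of this kind would store in m_day.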

--
Paul Miller - stele@fxtech.com



From david at megginson.com  Tue Dec 21 03:14:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2 Namespace Support
In-Reply-To: Ray Waldin's message of "Mon, 20 Dec 1999 17:00:08 -0800"
References: <14430.46481.974707.922192@localhost.localdomain> <385ED118.76BD2756@pacbell.net>
Message-ID: <m3yaapgjjf.fsf@localhost.localdomain>

Ray Waldin <rwaldin@pacbell.net> writes:

> Overall, very nice!
> 
> My only concern is that LexicalHandler should be required by all parsers.
> Otherwise you will get "fully compliant" SAX2 parsers which cannot be used to
> resolve QNames found in attribute values against in-scope namespace
> declarations. There are many examples where that's critical:
> 
> - evaluate XPath expression (XSLT, XPointer, etc.)
> - resolve XLink locator role (see http://www.w3.org/TR/xlink/#link-semantics)
> - follow XML Schema references (see
> http://www.w3.org/TR/xmlschema-1/#refSchemaConstructs)
> 
> and probably more to come... 
> 
> IMHO, LexicalHandler must be supported by all SAX2 parsers.

Or else we can put the callbacks back into a separate
NamespaceHandler, so that parsers are not forced to report comments,
CDATA section boundaries, and other noise as well.

Do others agree that the scope of NS declarations is essential
(i.e. shouldn't be optional)?  I knew that XSLT needed it, but I
hadn't realized that so many other apps were now relying on resolving
prefixes in attribute values and character data -- I need to keep more 
up to date on the specs.
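
To make the requirement concrete, here is a rough sketch of what an
application has to do with namespace-declaration scope events in order to
resolve a prefix found in an attribute value. NamespaceScopes and its method
names are hypothetical, the "{uri}local" output form is just a convenient
notation, and default-namespace handling is deliberately left out.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: tracks in-scope prefix -> URI bindings as scope
// start/end events arrive. The innermost binding is stored last.
class NamespaceScopes {
public:
    void startScope(const std::string &prefix, const std::string &uri)
    {
        bindings_.push_back(std::make_pair(prefix, uri));
    }

    // Ends the innermost open scope for `prefix`.
    void endScope(const std::string &prefix)
    {
        for (std::size_t i = bindings_.size(); i > 0; --i) {
            if (bindings_[i - 1].first == prefix) {
                bindings_.erase(bindings_.begin() +
                                static_cast<std::ptrdiff_t>(i - 1));
                return;
            }
        }
        assert(false && "endScope without matching startScope");
    }

    // Resolves "prefix:local" to "{uri}local". Returns false when the
    // prefix has no in-scope declaration. Unprefixed names are copied
    // unchanged in this sketch.
    bool resolve(const std::string &qname, std::string &expanded) const
    {
        const std::string::size_type colon = qname.find(':');
        if (colon == std::string::npos) {
            expanded = qname;
            return true;
        }
        const std::string prefix = qname.substr(0, colon);
        for (std::size_t i = bindings_.size(); i > 0; --i) {
            if (bindings_[i - 1].first == prefix) {
                expanded = "{" + bindings_[i - 1].second + "}" +
                           qname.substr(colon + 1);
                return true;
            }
        }
        return false;
    }

private:
    std::vector<std::pair<std::string, std::string> > bindings_;
};
```

Without the scope events the application has nothing to feed startScope and
endScope, so a QName inside an attribute value cannot be resolved at all.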


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Dec 21 03:14:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2 Namespace Support
In-Reply-To: <012701bf4b44$0c68b030$4a5eedc1@arp01>
References: <14430.46481.974707.922192@localhost.localdomain>
	<012701bf4b44$0c68b030$4a5eedc1@arp01>
Message-ID: <14430.55483.433692.943811@localhost.localdomain>

Richard Anderson writes:

 > Why are you trying to complicate our lives :)
 > 
 > Please change these from:
 > 
 > >   public void startElement (String ns, String name,
 > >                             AttributeList atts)
 > >     throws SAXException;
 > >
 > >   public void endElement (String ns, String name)
 > >     throws SAXException;
 > 
 > to:
 > 
 > >   public void startElement (String nsPrefix, String ns, String name,
 > >                             AttributeList atts)
 > >     throws SAXException;
 > >
 > >   public void endElement (String nsPrefix,String ns, String name)
 > >     throws SAXException;
 >
 > We can build our DOM more easily this way, since we don't have to buffer the
 > other namespace events.  I also would be surprised if at least 80% of SAX2
 > users a) wouldn't mind this being present b) would probably use it

That was my original suggestion, but James Clark wisely pointed out
that it was possible to drop one of the arguments.  Consider the
following document:

 <html:p xmlns:html="http://www.w3.org/1999/xhtml">Hello.</html:p>

With my proposal, what you will get by default is

  startElement("http://www.w3.org/1999/xhtml", "p", atts);
  characters("Hello.");
  endElement("http://www.w3.org/1999/xhtml", "p");

Nothing too tricky there, and that's all that most apps will ever
need.  If you pass in a LexicalHandler and the parser supports it, you
might get a little more:

  startNamespaceDeclScope("html", "http://www.w3.org/1999/xhtml");
  startElement("http://www.w3.org/1999/xhtml", "p", atts);
  characters("Hello.");
  endElement("http://www.w3.org/1999/xhtml", "p");
  endNamespaceDeclScope("html");

I assume that's what you were thinking you'd have to cache, but that
would be wrong, since you could not with certainty map the Namespace
URI back to the original prefix used.  James's suggestion was that, at 
user option, the parser leave the original prefix on the name:

  startElement("http://www.w3.org/1999/xhtml", "html:p", atts);
  characters("Hello.");
  endElement("http://www.w3.org/1999/xhtml", "html:p");

This would never be enabled by default, but for the relatively small
class of apps that needed to know the original prefix, the prefix
would be available simply by splitting the name argument.

I like this approach because it doesn't throw the prefix in the face
of apps that don't need it -- to paraphrase Larry Wall, it makes common
tasks easy and uncommon tasks possible.
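
Splitting the name argument is a one-liner for the apps that need it. A
minimal sketch in C++ (splitQName is a hypothetical helper, not part of any
SAX binding):

```cpp
#include <string>
#include <utility>

// Hypothetical helper: splits "html:p" into ("html", "p").
// A name without a colon yields an empty prefix.
std::pair<std::string, std::string> splitQName(const std::string &name)
{
    const std::string::size_type colon = name.find(':');
    if (colon == std::string::npos)
        return std::make_pair(std::string(), name);
    return std::make_pair(name.substr(0, colon), name.substr(colon + 1));
}
```

An app that never enables the prefix feature never sees a colon and never
needs to call this.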


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From klamerus at pobox.com  Tue Dec 21 03:46:21 1999
From: klamerus at pobox.com (Mark & Eileen Klamerus)
Date: Mon Jun  7 17:18:46 2004
Subject: Looking for Schema development initiatives
Message-ID: <NDBBJFGHMLALPMKIKAGLOEPDCCAA.klamerus@pobox.com>

All,

I'm researching various XML schema initiatives, in particular those
which would be applicable to the chemicals industry.  I know that most
schemas are oriented toward work processes (customer data, billing
information, etc.), but even for those it's hard to find a good reference
list.

Are there any sites or references which identify schemas?  Are there any
organizations (besides OASIS) which might provide information on
schema-definition initiatives that are under way?

I'm already aware of BizTalk (which is very limited), so please provide any
other sources you know of (such as industry initiatives).

Thanks, especially for e-mail.

Mark Klamerus




From jlapp at webmethods.com  Tue Dec 21 04:19:45 1999
From: jlapp at webmethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:46 2004
Subject: Looking for Schema development initiatives
In-Reply-To: <NDBBJFGHMLALPMKIKAGLOEPDCCAA.klamerus@pobox.com>
Message-ID: <199912210419.UAA02496@penguin.prod.itd.earthlink.net>

The materials science group at NIST (National Institute of Standards
and Technology) has recently taken a keen interest in XML.  Might be
something useful for you there.  I don't have the contact info, tho.

At 10:44 PM 12/20/1999 -0500, Mark & Eileen Klamerus wrote:
>All,
>
>I'm researching various XML schema initiatives, in particular those
>which would be applicable to the chemicals industry.  I know that most
>schemas are oriented toward work processes (customer data, billing
>information, etc.), but even for those it's hard to find a good reference
>list.
>
>Are there any sites or references which identify schemas?  Are there any
>organizations (besides OASIS) which might provide information on
>schema-definition initiatives that are under way?
>
>I'm already aware of BizTalk (which is very limited), so please provide any
>other sources you know of (such as industry initiatives).
>
>Thanks, especially for e-mail.
>
>Mark Klamerus
>
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From aray at q2.net  Tue Dec 21 04:18:23 1999
From: aray at q2.net (Arjun Ray)
Date: Mon Jun  7 17:18:46 2004
Subject: Musing over Namespaces
In-Reply-To: <001201bf4a63$5f723080$60f96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <Pine.LNX.4.10.9912202302550.6950-100000@mail.q2.net>


On Mon, 20 Dec 1999, Rick Jelliffe wrote:
>  From: Arjun Ray <aray@q2.net>
> >On Sat, 18 Dec 1999, Len Bullard wrote:
> >> Dan Brickley wrote:

> >How about improving DTDs? (Just a thought.)
> 
> ISO has a correction to SGML in the wings to allow alternative
> schema syntaxes to DTDs: I think it has been waiting for 2 years.

Is there anything on this at the WG4 site?  (And/Or is this another
implication of SEEALSO?)

> I think the problem with extending DTDs is that you have to create
> new declarations (unless you use PIs): 

That was the thought:)  HyTime jumped through a gazillion hoops and still
wound up using "new" declaration-style syntax in the LTDR (not to mention
the pre-WebSGML stuff in the AFDRmeta notation.)  

> also, DTDs are much more aimed at parsing while what is needed is at a
> different level: structures, datatypes, semantics.

Structures, yes; datatypes, maybe (I'm not convinced that regexes aren't
enough for a syntactic formalism); semantics, no.  A machine processable
meta description (itself a "semantic"!) for any imaginable semantic system
or requirement sounds like pinning a tail on the DWIM.  Here I think SGML
got the theory exactly right: there is a point at which one really
shouldn't try to do better than declare a notation.  We need more and
better ways to refer to notations, as well as new declarative types to
take the pressure off having to tuck something away in an opaque notation
declaration just because ISO8879 doesn't provide (syntactic) machinery.

An extensible syntax for declarations...:)

> There is also the issue that different classes of languages have
> different families of constraints.  

And "one syntax fits all" can be limiting...?

[ re testing utility of the new schema drafts:]
> For example, are you happy that XML Schemas make infoset
> contributions?

No.  The Infoset is the Universe of Discourse ("all the names to name the
names"), and schemas should be layered on top.  Schemas contributing to
infosets is like moving the goalposts.

> Should there be mechanisms in place by which a document can say "I use
> infoset contributions: if you don't have a full XML processor, don't
> accept me!"

Doesn't this come down to notation declarations?  ("I use *these*
thingummies: bail out if you don't grok")

> Or, what should the criteria for validity be: structures, structures +
> datatypes, structure+ datatypes + encoding-checking?

Schemas should provide for all three; the receiver decides what it needs.

> Or should it implement an ANY like XML DTDs (any element that is
> defined) or like WF XML (accept anything)?

"Archetype" seems to have been lost in the shuffle.  IMHO, that was the
way to go.

> The new drafts are said to be "feature complete", so this is a good time
> to start reading and thinking "could I actually use this thing?"

Yep.


Arjun




From rwaldin at pacbell.net  Tue Dec 21 04:49:16 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2 Namespace Support
References: <14430.46481.974707.922192@localhost.localdomain>
	 <385ED118.76BD2756@pacbell.net> <m3yaapgjjf.fsf@localhost.localdomain>
Message-ID: <385F076A.B151CCEA@pacbell.net>

David Megginson wrote:
>
> Or else we can put the callbacks back into a separate
> NamespaceHandler, so that parsers are not forced to report comments,
> CDATA section boundaries, and other noise as well.

Sure, as long as namespace declaration events end up being a required feature.
 
> I hadn't realized that so many other apps were now relying on resolving
> prefixes in attribute values and character data -- I need to keep more
> up to date on the specs.

In fact, the latest XML Schema Datatypes WD defines QName as a built-in
datatype, so I expect that the trend will continue...

-Ray



From ricko at allette.com.au  Tue Dec 21 06:17:12 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: summary of Namespace-support arguments
Message-ID: <011601bf4b7e$97409da0$2af96d8c@NT.JELLIFFE.COM.AU>


From: James Clark <jjc@jclark.com>

>Rick Jelliffe wrote:
>>
>> From: James Clark <jjc@jclark.com>
>>
>> >I do indeed want that, and in the past I've argued against *requiring*
>> >processors to provide information about the prefix used. I've become a
>> >lot less negative about this prefix information recently and I think
>> >it's better for an API to provide it
>>
>> Isn't it a requirement for XSL?
>
>No.  XSL requires only information about what namespace declarations are
>in scope, not what prefix was actually used.

Sorry for being thick:  are you using "namespace declaration" here to mean
that each use of a namespace should point to its declaration rather than just
to the simple namespace URI?   I strongly agree.

It seems like we need to get away from the idea that two namespace objects
are equal if their URI strings are equal.  Either we need to know the
address of the definition of the object (in languages that support this) or
the object also needs to have some definition-occurrence indicator:
this could be an identifier or a position on a stack. (If it is an identifier
then I think the prefix would do fine.)

For other people, the issue comes up in XSL, where one can have
<xsl:stylesheet
  version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:axsl="http://www.w3.org/1999/XSL/TransformAlias"
>
    ...
    <xsl:template match="/">
        <axsl:value-of select="xxx"/>
    </xsl:template>
</xsl:stylesheet>

In this document, the axsl: element will not be interpreted
as an XSLT function by an XSL processor: it will be output
unchanged as
    <axsl:value-of select="xxx"/>

So an API that resolves namespaces cannot merely replace
the prefix with the URI and the suffix:

<(http://www.w3.org/1999/XSL/Transform)stylesheet
  version="1.0"
>
    ...
    <(http://www.w3.org/1999/XSL/Transform)template match="/">
        <(http://www.w3.org/1999/XSL/Transform)value-of select="xxx"/>
    </(http://www.w3.org/1999/XSL/Transform)template>
</(http://www.w3.org/1999/XSL/Transform)stylesheet>

because that loses the scope.

B.T.W., I cannot see how this can be reconciled with the current
information set Last Call draft (which seems to drop the prefix).
I think it means that any API that simply drops the prefix (or
scoping information in lieu of the prefix) is incorrect.


Rick Jelliffe




From ricko at allette.com.au  Tue Dec 21 06:51:10 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:46 2004
Subject: XML Schemas to be allowed in SGML? Re: Musing over Namespaces
Message-ID: <012701bf4b83$589daa70$2af96d8c@NT.JELLIFFE.COM.AU>

 
From: Arjun Ray <aray@q2.net>
 >> ISO has a correction to SGML in the wings to allow alternative
>> schema syntaxes to DTDs: I think it has been waiting for 2 years.
>
>Is there anything on this at the WG4 site?  (And/Or is this another
>implication of SEEALSO?)

http://www.ornl.gov/sgml/wg8/document/1963.htm

The draft proposal is for "DTD Notations". The syntax proposed is:

 <!DOCTYPE xxx SYSTEM "xxx.xmlschema" 
        NDATA http://..xmlschemas >

I doubt ISO will move on this unless XML or some other SGML
community requires it, and I doubt that XML would require it. 
So I take it to mean that ISO JTC1/SC34 is happy to allow 
non-DTD schema languages providing they are well-defined 
(in the same way that SGML allows many DTDs,
many syntaxes, many encodings).

Rick Jelliffe




From nicmila at vscht.cz  Tue Dec 21 07:03:51 1999
From: nicmila at vscht.cz (Miloslav Nic)
Date: Mon Jun  7 17:18:46 2004
Subject: A new Zvon tutorial - XML - basic syntax
Message-ID: <385F2623.701AB356@vscht.cz>

I found a few spare hours so there is a new tutorial:
http://zvon.vscht.cz/HTMLonly/XMLTutorial/General/book.html

It covers the basic features of XML. While rather useless for experienced
users such as yourselves, I hope it will be useful when trying to explain
XML to beginners.
-- 
***************************************************************
Dr. Miloslav Nic                        e-mail: nicmila@vscht.cz
Department of Organic Chemistry         TEL: +420 2 2435 5012  
ICT Prague (VSCHT Praha)                     +420 2 2435 4118
    				        FAX: +420 2 2435 4288  
****************************************************************
Support free information exchange: http://zvon.vscht.cz
****************************************************************



From sb at metis.no  Tue Dec 21 07:40:32 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:46 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: roddey@us.ibm.com's message of "Mon, 20 Dec 1999 12:15:22 -0700"
References: <3.0.6.32.19991214125527.00940150@mailhost> <8725684D.0069ECEC.00@d53mta03h.boulder.ibm.com>
Message-ID: <wh1z8g4u8f.fsf@viffer.metis.no>

>>>>> roddey@us.ibm.com:

> John is absolutely correct. It *must* be wchar_t if it's going to be
> a fixed thing.

Please note that John was advocating std::wstring, rather than
wchar_t*.

I'll live with it, if that's what the standard SAX for C++ ends up
with, but unhappily, since this will make for inefficient handling on
gcc. :-/

> The massive convenience this provides to people who actually want to
> do something with the data and for the ability to use constants
> (look at XML4C if you want to see what a pain in the butt it is to
> do the latter scheme) is paramount.

Actually I think it would be much easier on debuggers and editors and
me, if I could store the text constants in UTF-8.

> For those folks who need to store text, they can certainly strip off
> unwanted bytes before storing it. It is much more reasonable to
> require transcoding of people storing text than to require everyone
> using the data on the fly to transcode.

Using data on the fly is what I want to do.  That's why I prefer to
get it in the form I will use it.



From sb at metis.no  Tue Dec 21 08:04:49 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:46 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: John Aldridge's message of "Tue, 14 Dec 1999 12:55:27 +0000"
References: <3.0.6.32.19991214125527.00940150@mailhost>
Message-ID: <whpuw03ejo.fsf@viffer.metis.no>

>>>>> John Aldridge <john.aldridge@informatix.co.uk>:

> It seems to me that code like:

> void DocumentHandler::startElement (
>     const std::wstring &name, const AttributeList &atts)
> {
>     if (name == L"Paragraph") ...
> }

> is going to be a whole lot neater than

> void DocumentHandler::startElement (
>     const std::basic_string<SAXChar> &name, const AttributeList &atts)
> {
>     static const SAXChar paraString[] =
>         {'P','a','r','a','g','r','a','p','h','\0'};
>     if (name == paraString) ...
> }

Sure, but how about this, then...?

typedef unsigned short SAXChar;
class SAXString : public basic_string<SAXChar> {
public:
    SAXString(const SAXString& s, size_type pos = 0, size_type n = npos)
	: basic_string<SAXChar>(s,pos,n) {}   // pass n through, not npos
    SAXString(const SAXChar* p, size_type n)
	: basic_string<SAXChar>(p,n) {}
    SAXString(const SAXChar* p)
	: basic_string<SAXChar>(p) {}
    SAXString(size_type n, SAXChar c)
	: basic_string<SAXChar>(n,c) {}
    SAXString(const char* p)
	: basic_string<SAXChar>()   // start empty; a char* cannot initialize the base
    {
	// do UTF-8 to UTF-16 decoding of p here, appending to *this
    }
    
};

and then:

void DocumentHandler::startElement (
     const SAXString& name, const AttributeList& atts)
{
     if (name == SAXString("Paragraph")) ...
}

or maybe just

void DocumentHandler::startElement (
     const SAXString& name, const AttributeList& atts)
{
     if (name == "Paragraph") ...
}

since the appropriate converting constructor is available.
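
For reference, the decoding step left as a comment in the const char*
constructor could look roughly like the sketch below. It is an assumption
about how such a constructor might be filled in, not part of any SAX
proposal: utf8ToUtf16 is a hypothetical name, and it returns std::u16string
(char16_t units) rather than the unsigned short typedef so that it compiles
on current compilers without a custom char_traits.

```cpp
#include <stdexcept>
#include <string>

// Minimal UTF-8 -> UTF-16 decoder (sketch). Throws on truncated or
// malformed sequences, overlong forms, surrogates and values > U+10FFFF.
std::u16string utf8ToUtf16(const std::string &in)
{
    std::u16string out;
    std::string::size_type i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        unsigned long cp = 0;              // code point being assembled
        unsigned long minimum = 0;         // smallest value legal for this length
        std::string::size_type extra = 0;  // number of continuation bytes
        if (lead < 0x80)                { cp = lead;        extra = 0; minimum = 0; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; minimum = 0x80; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; minimum = 0x800; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; minimum = 0x10000; }
        else throw std::invalid_argument("invalid UTF-8 lead byte");

        if (i + extra >= in.size())
            throw std::invalid_argument("truncated UTF-8 sequence");
        for (std::string::size_type k = 1; k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80)
                throw std::invalid_argument("invalid UTF-8 continuation byte");
            cp = (cp << 6) | (c & 0x3F);
        }
        if (cp < minimum || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            throw std::invalid_argument("invalid code point in UTF-8 input");

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Encode as a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += extra + 1;
    }
    return out;
}
```

Note that this conversion runs on every comparison against a narrow literal,
which is the cost the UTF-8-constants approach trades for readability.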

Does it matter if SAXString doesn't have a virtual destructor if it
doesn't add any extra state to basic_string<SAXChar> and has no
virtual functions?



From davidc at nag.co.uk  Tue Dec 21 10:56:33 1999
From: davidc at nag.co.uk (David Carlisle)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: summary of Namespace-support arguments
Message-ID: <199912211052.KAA14750@nag.co.uk>


Rick,

Perhaps I missed something but I fail to see why your xslt example
shows that prefixes are required.

> In this document, the axsl: element will not be interpreted
> as an XSLT function by an XSL processor: it will be output
> unchanged as
>     <axsl:value-of select="xxx"/>

That is true, but not because it is a different prefix, but because it
is a different namespace.

The expanded form is not

   <(http://www.w3.org/1999/XSL/Transform)value-of select="xxx"/>

that you suggest but rather

   <(http://www.w3.org/1999/XSL/TransformAlias)value-of select="xxx"/>

which is (to the namespace processor parsing the stylesheet) just some
unknown namespace unrelated to XSLT which is why it is a literal result
element not an xsl instruction.

The `magic' aliasing comes later, as result elements in that namespace
are switched to the XSLT namespace as the result tree is written out.
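
The point can be put in code. In the sketch below (ExpandedName, sameName
and isXsltInstruction are hypothetical names, written in C++ only for
illustration), a name is the pair of namespace URI and local part, and the
prefix never enters the comparison:

```cpp
#include <string>

// Hypothetical representation of a namespace-expanded name.
struct ExpandedName {
    std::string uri;    // namespace URI the prefix was bound to
    std::string local;  // local part after the colon
};

// Two names are the same only if both URI and local part match;
// the prefix used in the source document plays no role.
bool sameName(const ExpandedName &a, const ExpandedName &b)
{
    return a.uri == b.uri && a.local == b.local;
}

const char *const kXsltNs = "http://www.w3.org/1999/XSL/Transform";

// True for elements in the XSLT namespace; anything else in a template
// body is treated as a literal result element.
bool isXsltInstruction(const ExpandedName &name)
{
    return name.uri == kXsltNs;
}
```

Here axsl:value-of expands to the TransformAlias URI, so isXsltInstruction
is false for it without any reference to the prefix.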


David




From costello at mitre.org  Tue Dec 21 11:43:47 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:46 2004
Subject: Representing IP addresses in XML Schema
References: <385E931C.6862D724@mitre.org> <385ED5E9.2C6FF554@pacbell.net>
Message-ID: <385F684F.4F8A122D@mitre.org>

Ray Waldin wrote:
> 
> "Roger L. Costello" wrote:
> > ...
> > <datatype name="IP" source="string">
> >    <pattern value="\d{3}.\d{3}.\d{3}.\d{3}"/>
> > </datatype>
> >
> > However, this is not satisfactory - it allows each field in the IP to
> > have values from 000-999.  I want to restrict the possible values to:
> > [0-255].[0-255].[0-255].[0-255]
> 
> You can avoid the problem by storing the IP as a hex address, as in:
> 
> ff.ff.ff.ff
> 
> <pattern
> value="[0-9a-f][0-9a-f].[0-9a-f][0-9a-f].[0-9a-f][0-9a-f].[0-9a-f][0-9a-f]"/>
> 
> -Ray

Thanks Ray.  An interesting choice of words - "You can avoid the
'problem' ...".  I think that it is a "problem" if such a common, simple
datatype cannot be expressed directly, and one is forced to resort to
indirect means of expressing what is desired.  Yes?  /Roger
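
For what it is worth, the decimal range can be expressed as a regular
expression by enumerating the digit shapes of 0-255. The sketch below tests
the expression with C++ std::regex (ECMAScript grammar), which is an
assumption made for illustration: the regex dialect of the XML Schema draft
may differ in detail. kOctet and isDottedQuad are hypothetical names. Note
that the dots are escaped; an unescaped "." matches any character.

```cpp
#include <regex>
#include <string>

// One octet, 0-255, without leading zeros:
//   250-255 | 200-249 | 100-199 | 0-99
static const char kOctet[] = "(25[0-5]|2[0-4][0-9]|1[0-9][0-9]|[1-9]?[0-9])";

// True when `text` is four such octets separated by literal dots.
bool isDottedQuad(const std::string &text)
{
    static const std::regex pattern(
        std::string(kOctet) + "\\." + kOctet + "\\." + kOctet + "\\." + kOctet);
    // regex_match requires the whole string to match, like a schema pattern.
    return std::regex_match(text, pattern);
}
```

The resulting pattern value is long but direct, at the cost of readability
compared with a numeric range facet.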




From msabin at cromwellmedia.co.uk  Tue Dec 21 12:02:52 1999
From: msabin at cromwellmedia.co.uk (Miles Sabin)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: Namespace proposal
Message-ID: <AA4C152BA2F9D211B9DD0008C79F760A67519B@odin.cromwellmedia.co.uk>

Joe Lapp wrote,
> Using Java String interning, how do you guys guarantee 
> performance in any of the DOM Element get*() methods that take 
> Strings?  Do you require that the app intern the string before 
> passing it in?  Do you try to make the methods smart so that 
> if they're interned, you get performance, and if they aren't, 
> you get a bit more of a penalty (for having done the intern 
> check first)?  Do you make them dumb so that if you forgot to 
> intern, you don't get anything?  Or would one always intern 
> these externally provided Strings within the method?

This is one of the big problems with intern'ing IMO. The cost
of intern'ing those args (either done externally by the client 
app, or internally by a library) wipes out the potential
benefits.

Cheers,


Miles

-- 
Miles Sabin                       Cromwell Media
Internet Systems Architect        5/6 Glenthorne Mews
+44 (0)20 8817 4030               London, W6 0LJ, England
msabin@cromwellmedia.com          http://www.cromwellmedia.com/




From ldodds at ingenta.com  Tue Dec 21 12:28:46 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:18:46 2004
Subject: Representing IP addresses in XML Schema
In-Reply-To: <385F684F.4F8A122D@mitre.org>
Message-ID: <000801bf4bae$ee397980$ab20268a@pc-lrd.bath.ac.uk>

> > > However, this is not satisfactory - it allows each field in the IP to
> > > have values from 000-999.  I want to restrict the possible values to:
> > > [0-255].[0-255].[0-255].[0-255]
> > 
> > You can avoid the problem by storing the IP as a hex address, as in:
> > 
> > ff.ff.ff.ff
> > 
> Thanks Ray.  An interesting choice of words - "You can avoid the
> 'problem' ...".  I think that it is a "problem" if such a common, simple
> datatype cannot be expressed directly, and one is forced to resort to
> indirect means of expressing what is desired.  Yes?  /Roger

It also becomes trickier if you want to disallow IP addresses such 
as 127.0.0.1, 0.0.0.0, etc.

L.



From jlapp at webmethods.com  Tue Dec 21 13:42:20 1999
From: jlapp at webmethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <AA4C152BA2F9D211B9DD0008C79F760A67519B@odin.cromwellmedia.co.uk>
Message-ID: <199912211342.FAA00548@penguin.prod.itd.earthlink.net>

At 12:02 PM 12/21/1999 +0000, Miles Sabin wrote:
>This is one of the big problems with intern'ing IMO. The cost
>of intern'ing those args (either done externally by the client 
>app, or internally by a library) wipes out the potential
>benefits.

If you can intern a name once and repeatedly use a ref to that
intern, you do gain benefits.  SAX itself might suffer a hit
doing all the interning, but it saves the app from taking a hit
doing all the string compares, assuming the app can repeatedly
do string compares using the same intern ref of its own naming.
Whether savings are realized depends on the application.

I'm just questioning the use of intern in document APIs.  We
use a special name object instead and force the app to select
the appropriate name object to hand to the API.  By using
String args you aren't forcing the app to use only interned
objects, so you set yourself up for having to deal with both
interned and non-interned Strings.  This leaves you with a
large performance hit to discover that a requested name is
not there.

Your paragraph would be more accurate to me if it said that the
cost of interning potentially wipes out its benefits.
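
To make the trade-off concrete, here is a minimal sketch of the case where interning pays off: the application interns its own name once, the reporting side interns each name it hands over, and every later comparison is a reference check. The class and method names are hypothetical, and the sketch assumes the parser side really does intern what it reports.

```java
// Hypothetical sketch of "intern once, compare by reference".
final class InternDemo {
    // Interned once and reused for every comparison. (A literal is already
    // interned; the explicit call only makes the intent visible.)
    static final String TITLE = "title".intern();

    // Simulates a name arriving from a parser that interns what it reports.
    static String reportedName(char[] buffer) {
        return new String(buffer).intern();
    }

    // Identity comparison is valid only because both sides are interned.
    static boolean isTitle(String internedName) {
        return internedName == TITLE;
    }
}
```

If a caller passes a String that was not interned, isTitle() quietly returns false even for an equal value, which is the hazard of accepting plain String arguments.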
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From david at megginson.com  Tue Dec 21 14:22:58 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: Exceptions
Message-ID: <14431.36102.720933.248648@localhost.localdomain>

Now that the Namespace-support discussion is pretty-much wrapped up
(aside from some aesthetic questions), I'd like to move on to the
question of exceptions, which I hope will be considerably less
controversial.

For SAX2, I'm proposing the following major changes in exceptions:

1. SAXException extends IOException.

2. Add SAXNotRecognizedException, for a feature or property name
   that's not recognized.  Extends SAXException.

3. Add SAXNotSupportedException, for a request that the parser doesn't 
   support.  Extends SAXException.

4. Modify SAXParseException so that it also has an integer value
   containing the error number.  If there ever is a standard catalog
   of XML (and related) errors, we can use this to hold the number.

5. Have all callbacks that formerly threw SAXException throw
   IOException instead.  This should help to avoid a lot of exception
   tunneling.

Any issues?  I know that a few people suggested extending
IOException back in the SAX 1.0 design a year and a half ago, and
after doing a lot of implementation on top of SAX, I now agree with
them.
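
A rough sketch of the hierarchy described in points 1 to 4, for readers who want to see its shape. The classes carry a "Proposed" prefix so they are not mistaken for the real org.xml.sax classes; the constructors and the getErrorNumber() accessor are assumptions, not part of any agreed API.

```java
import java.io.IOException;

// Point 1: the base exception extends IOException.
class ProposedSAXException extends IOException {
    ProposedSAXException(String message) { super(message); }
}

// Point 2: unknown feature or property name.
class ProposedSAXNotRecognizedException extends ProposedSAXException {
    ProposedSAXNotRecognizedException(String message) { super(message); }
}

// Point 3: recognized request that the parser does not support.
class ProposedSAXNotSupportedException extends ProposedSAXException {
    ProposedSAXNotSupportedException(String message) { super(message); }
}

// Point 4: parse exception carrying an integer error number.
class ProposedSAXParseException extends ProposedSAXException {
    private final int errorNumber;

    ProposedSAXParseException(String message, int errorNumber) {
        super(message);
        this.errorNumber = errorNumber;
    }

    int getErrorNumber() { return errorNumber; }
}

final class ExceptionSketch {
    // Point 5: code written against IOException sees both kinds of failure
    // without wrapping one exception inside the other.
    static String describe(IOException e) {
        if (e instanceof ProposedSAXParseException) {
            return "parse error #" + ((ProposedSAXParseException) e).getErrorNumber();
        }
        if (e instanceof ProposedSAXException) {
            return "SAX error";
        }
        return "I/O error";
    }
}
```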


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Dec 21 14:32:12 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:46 2004
Subject: Why internalize? (was Re: SAX2: Namespace proposal)
In-Reply-To: Joe Lapp's message of "Tue, 21 Dec 1999 08:45:47 -0500"
References: <199912211342.FAA00548@penguin.prod.itd.earthlink.net>
Message-ID: <m3puw0gyc4.fsf@localhost.localdomain>

Joe Lapp <jlapp@webmethods.com> writes:

> I'm just questioning the use of intern in document APIs.  We
> use a special name object instead and force the app to select
> the appropriate name object to hand to the API. 

That's another type of interning.  It's important to remember that
while fast comparisons are a nice side-benefit, the main purpose of
interning strings or other objects is to guarantee that there is never
more than one equivalent object allocated -- otherwise, you can waste
an awful lot of memory.

To take one example, consider an attribute "security-level", allowed
for every element in a document, and with a default value.  If your
document has 5,000 elements (not an unusually large document), then
without some kind of internalization mechanism, you will end up
allocating 5,000 separate String objects, all with the value
"security-level".  If you internalize (somehow), then you have only
one String object (or compound Name object in Joe's case) that is
shared throughout the tree.

Internalizing can be tricky with mutable objects, but with immutable
objects like java.lang.String, it's a big win in this problem domain.
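
The memory argument can be sketched with a small pool that hands back one canonical instance per distinct value. NamePool and its methods are hypothetical names; String.intern() would give the same sharing through the JVM's own table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical pool: one shared String instance per distinct name.
final class NamePool {
    private final Map<String, String> pool = new HashMap<>();

    // Returns the canonical instance for this value, registering the
    // argument itself the first time the value is seen.
    String canonical(String name) {
        String existing = pool.putIfAbsent(name, name);
        return existing != null ? existing : name;
    }

    // Number of distinct names held by the pool.
    int size() {
        return pool.size();
    }
}
```

A tree builder that passes every attribute name through canonical() keeps a single "security-level" object however many elements carry the attribute; the duplicates become garbage as soon as they have been looked up.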


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Dec 21 14:35:17 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: ErrorHandler mods
Message-ID: <14431.36842.362320.404738@localhost.localdomain>

Following up on a long-ago suggestion from (I think) David Brownell,
I'm considering adding an extra method, validationError(), to the SAX2
ErrorHandler.  That way, we'd have the following methods:

  warning - any kind of non-fatal warning
  error - an optionally-reportable error of some kind, or an error
          from outside the XML 1.0 domain (such as a Namespace error
          that doesn't affect well-formedness)
  validationError - an error in schema validation
  fatalError - a well-formedness error

Does this work?
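
As a sketch only, the four callbacks might look like this in Java. ProposedErrorHandler is a hypothetical interface, not the real org.xml.sax.ErrorHandler, and the policy in CollectingHandler is just one example of what an application could do with the extra method.

```java
import java.util.ArrayList;
import java.util.List;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical interface mirroring the four proposed callbacks.
interface ProposedErrorHandler {
    void warning(SAXParseException e) throws SAXException;
    void error(SAXParseException e) throws SAXException;
    void validationError(SAXParseException e) throws SAXException;
    void fatalError(SAXParseException e) throws SAXException;
}

// Example policy: print warnings and errors, collect validation errors,
// and stop on anything fatal.
final class CollectingHandler implements ProposedErrorHandler {
    final List<String> validationProblems = new ArrayList<>();

    @Override public void warning(SAXParseException e) {
        System.err.println("warning: " + e.getMessage());
    }

    @Override public void error(SAXParseException e) {
        System.err.println("error: " + e.getMessage());
    }

    @Override public void validationError(SAXParseException e) {
        validationProblems.add(e.getMessage());
    }

    @Override public void fatalError(SAXParseException e) throws SAXException {
        throw e; // well-formedness errors end the parse
    }
}
```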


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david-b at pacbell.net  Tue Dec 21 14:59:20 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2 Namespace Support
References: <14430.46481.974707.922192@localhost.localdomain>
	 <385ED118.76BD2756@pacbell.net> <m3yaapgjjf.fsf@localhost.localdomain>
Message-ID: <385F95C0.B3C9DC30@pacbell.net>

David Megginson wrote:
> 
> Ray Waldin <rwaldin@pacbell.net> writes:
> 
> > Overall, very nice!
> >
> > My only concern is that LexicalHandler should be required by all parsers.
> > Otherwise you will get "fully compliant" SAX2 parsers which cannot be used to
> > resolve QNames found in attribute values against in-scope namespace
> > declarations. There are many examples where that's critical:
> >
> > - evaluate XPath expression (XSLT, XPointer, etc.)
> > - resolve XLink locator role (see http://www.w3.org/TR/xlink/#link-semantics)
> > - follow XML Schema references (see
> > http://www.w3.org/TR/xmlschema-1/#refSchemaConstructs)
> >
> > and probably more to come...
> >
> > IMHO, LexicalHandler must be supported by all SAX2 parsers.
> 
> Or else we can put the callbacks back into a separate
> NamespaceHandler, so that parsers are not forced to report comments,
> CDATA section boundaries, and other noise as well.

I certainly prefer to see marginally relevant stuff like
comments and CDATA boundaries remain marginal.

The core data models of XML are elements, text, and (for
some) PIs.  Namespace-aware element (and attribute) processing
won't change that; most of the rest is noise.


> Do others agree that the scope of NS declarations is essential
> (i.e. shouldn't be optional)?  I knew that XSLT needed it, but I
> hadn't realized that so many other apps were now relying on resolving
> prefixes in attribute values and character data -- I need to keep more
> up to date on the specs.

Those three specs listed above seem to be a convincing argument
for exposing the capability if it's already in the parser.

- Dave



From Alex_Garrett at coregis.com  Tue Dec 21 15:11:43 1999
From: Alex_Garrett at coregis.com (Alex Garrett)
Date: Mon Jun  7 17:18:46 2004
Subject: Websites using XML
Message-ID: <8525684E.0053CFA4.00@smtp.apprise.com>


I realise this is probably the wrong place to ask this, but I've come to rely on
the concentrated expertise of the list members and I ask your indulgence. What I
need to know is the names (and possibly URLs) of companies that have web sites
that are implemented in XML. I need to provide a case for designing a site in
XML, and a crucial element for that case is to demonstrate that it's been done
before and it works. There's no need to clutter the list any more than I already
have with this -- please send responses straight to me. If there's interest,
I'll compile a list of responses and post it. Thanks,

Alex
alex_garrett@coregis.com




From dave at userland.com  Tue Dec 21 15:45:12 1999
From: dave at userland.com (Dave Winer)
Date: Mon Jun  7 17:18:46 2004
Subject: Websites using XML
References: <8525684E.0053CFA4.00@smtp.apprise.com>
Message-ID: <104801bf4bca$e63df3c0$1918ccce@murphy>

We use XML extensively on UserLand.Com.

The uses are summarized here:

http://backend.userland.com/

Dave


----- Original Message -----
From: "Alex Garrett" <Alex_Garrett@coregis.com>
To: <xml-dev@ic.ac.uk>
Sent: Tuesday, December 21, 1999 7:12 AM
Subject: Websites using XML


>
>
> I realise this is probably the wrong place to ask this, but I've come to
> rely on the concentrated expertise of the list members and I ask your
> indulgence. What I need to know is the names (and possibly URLs) of
> companies that have web sites that are implemented in XML. I need to
> provide a case for designing a site in XML, and a crucial element for
> that case is to demonstrate that it's been done before and it works.
> There's no need to clutter the list any more than I already have with
> this -- please send responses straight to me. If there's interest, I'll
> compile a list of responses and post it. Thanks,
>
> Alex
> alex_garrett@coregis.com




From david-b at pacbell.net  Tue Dec 21 16:03:27 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2 Namespace Support
References: <14430.46481.974707.922192@localhost.localdomain>
	 <012701bf4b44$0c68b030$4a5eedc1@arp01> <14430.55483.433692.943811@localhost.localdomain>
Message-ID: <385FA4C9.2B054683@pacbell.net>

There's been way too much email on this topic -- I should have weighed
in earlier.  In all honesty I'd prefer to see all namespace support be
cleanly layered on top of SAX1.  It's easy to do it that way; just add
some optional code to postprocess a SAX event stream.
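
To illustrate what such a layered postprocessor would need at its core, here is a minimal sketch of the prefix-scope bookkeeping. It is deliberately independent of the real SAX interfaces: NamespaceScope and its methods are hypothetical, and attributes are passed as a plain name-to-value map rather than an AttributeList.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for a filter sitting on top of a SAX1-style event
// stream: tracks xmlns declarations and resolves prefixed names.
final class NamespaceScope {
    private final Deque<Map<String, String>> scopes = new ArrayDeque<>();

    // Call for each startElement with the element's raw attributes.
    void push(Map<String, String> rawAttributes) {
        Map<String, String> declared = new HashMap<>();
        for (Map.Entry<String, String> attr : rawAttributes.entrySet()) {
            String name = attr.getKey();
            if (name.equals("xmlns")) {
                declared.put("", attr.getValue());            // default namespace
            } else if (name.startsWith("xmlns:")) {
                declared.put(name.substring(6), attr.getValue());
            }
        }
        scopes.push(declared);
    }

    // Call for each endElement.
    void pop() {
        scopes.pop();
    }

    // URI bound to the prefix in the innermost declaring scope, or null.
    String lookup(String prefix) {
        for (Map<String, String> scope : scopes) { // innermost scope first
            String uri = scope.get(prefix);
            if (uri != null) {
                return uri;
            }
        }
        return null;
    }

    // Splits "prefix:local" (or "local") into { uri-or-null, local }.
    String[] resolve(String qName) {
        int colon = qName.indexOf(':');
        String prefix = colon < 0 ? "" : qName.substring(0, colon);
        String local = colon < 0 ? qName : qName.substring(colon + 1);
        return new String[] { lookup(prefix), local };
    }
}
```

A null URI from resolve() on a prefixed name is the point at which a filter would have to decide how to report an undeclared prefix.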


With respect to this particular proposal, I have several comments.

First, it's unclear to me what's happened to our old friend, the
org.xml.sax.DocumentHandler.startElement callback:

    public void startElement (String name, AttributeList attrs)
    throws SAXException;

If that call is gone, I anticipate migration problems to SAX2.

If it's still there, then it must be the application's choice to use
the new sax2.DocumentHandler interface or the original ... presumably
it would use Configurable.setProperty() with some ID for the new
namespace-aware sax2.DocumentHandler to identify its choice.


Second, it's unclear how to report violations of namespace conformance.

I'd asked that the namespace spec resolve this issue, by using the same
reporting terminology that the XML spec uses ("warning", "error", and
of course "fatal error"), but instead it got even more vague.  So I'll
have to ask how SAX will address this ... keeping in mind that if W3C
gets around to answering those questions, it might pick different answers.

That is, faced with this document

	<?xml version="1.0"?>
	<html:p>Hello again! :-)</html:p>
	<?at-end-of-document?>

Two reporting issues arise:  (a) How does one know that namespaces are
to be used at all?  It's a legal XML 1.0 document, so inherently there
is no error.  (b) If one knows that namespaces are to be used, is
the undeclared "html" prefix to generate a warning, recoverable error,
or fatal error through sax.ErrorHandler?  Is it reported some other way?

I think that using ErrorHandler.error() is the best solution, but then
that leads to the issue of how to report namespace URIs that aren't
available.  (And as I recall, there were more errors to deal with than
just unresolved namespace prefixes.)


David Megginson wrote:
> 
> Richard Anderson writes:
> 
>  > >   public void startElement (String nsPrefix, String ns, String name,
>  > >                             AttributeList atts)
>  > >     throws SAXException;
>  > >
>  > >   public void endElement (String nsPrefix,String ns, String name)
>  > >     throws SAXException;
>  >
>  > We can build our DOM more easily this way; we don't have to buffer the
>  > other namespace events.  I would also expect that at least 80% of SAX2
>  > users a) wouldn't mind this being present and b) would probably use it
> 
> ...  James's suggestion was that, at
> user option, the parser leave the original prefix on the name:

A problem with this approach is that it expects that what's generating
those SAX callbacks is a parser.  If namespace support is added by a
filter layer, then anything generating SAX callbacks can be combined
with the filter.


>   startElement("http://www.w3c.org/1999/xhtml", "html:p", atts);
>   characters("Hello.");
>   endElement("http://www.w3c.org/1999/xhtml", "html:p");
> 
> This would never be enabled by default, but for the relatively small
> class of apps that needed to know the original prefix, the prefix
> would be available simply by splitting the name argument.

Clearly that class includes "DOM-using applications", which for better
or worse (opinions do vary :-) isn't a small class.

DOM L2 applications explicitly have the same option that I noted above:
use (or non-use) of namespace information is the choice of the application,
not the choice of some version of an XML infrastructure.


> I like this approach because it doesn't throw the prefix in the face
> of apps that don't need it -- to paraphrase Larry Wall, it makes common
> tasks easy and uncommon tasks possible.

A third issue:  building a DOM is quite "common" though, and it needs
those prefixes.

- Dave



From James.Anderson at mecomnet.de  Tue Dec 21 16:48:06 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:46 2004
Subject: interning [Re: SAX2: Namespace proposal]
References: <AA4C152BA2F9D211B9DD0008C79F760A67519B@odin.cromwellmedia.co.uk>
Message-ID: <385FB056.51A907C@mecomnet.de>

My experience is exactly the opposite. Where the app is written in a language
like Java, it will likely know the names of significant things at the time it
is coded, in which case they can be statically interned. Even if it is necessary
to push this off to runtime, a binding to "behaviour" can be done on the basis
of interned symbols. For something like an XSL processor the story _may_ be
different, but I'd like to see numbers first, before I would believe it.

Miles Sabin wrote:
> 
> Joe Lapp wrote,
> > Using Java String interning, how do you guys guarantee
> > performance in any of the DOM Element get*() methods that take
> > Strings?  Do you require that the app intern the string before
> > passing it in?  

As I've noted elsewhere, in lisp, symbols suffice: they're interned when the
program source is read, that is, even before the program is compiled.

...




From david-b at pacbell.net  Tue Dec 21 17:17:20 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: Exceptions
References: <14431.36102.720933.248648@localhost.localdomain>
Message-ID: <385FB61D.157D0C90@pacbell.net>

David Megginson wrote:
> 
> 1. SAXException extends IOException.

Not my preferred model ... it _isn't_ an I/O exception, and the
recovery mechanisms applicable to I/O problems won't typically
apply, or even act reasonably given a data format error (which
is what a SAXException basically indicates).

What's the rationale behind this proposal?

- Dave



From david-b at pacbell.net  Tue Dec 21 17:26:17 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:46 2004
Subject: SAX2: Exceptions
References: <14431.36102.720933.248648@localhost.localdomain>
Message-ID: <385FB834.82BF526@pacbell.net>

David Megginson wrote:
> 
> 4. Modify SAXParseException so that it also has an integer value
>    containing the error number.  If there ever is a standard catalog
>    of XML (and related) errors, we can use this to hold the number.

That one I sort of like ... except that it shouldn't be done
without having such a catalogue, and a means to update it!  In my
experience, API hooks which are incomplete are _always_ trouble;
and identifier namespaces (like error numbers) need to have ways
to evolve over time.

I'd start such a catalogue by identifying every error case in the
XML specification.  Syntax errors might need subcases, since each
grammar production could generate multiple errors, but for each
WFC and VC it's easy to justify separate error identifiers.

However:  do we have experience with applications which _can_ usefully
do more with fatal errors than throw up their hands?  Or with nonfatal
ones (including validation errors) than optionally report them?  Use
cases there should IMHO drive this issue.

- Dave



From tbray at textuality.com  Tue Dec 21 17:27:17 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:47 2004
Subject: Khare's paper on WAP
Message-ID: <3.0.32.19991221092815.028ac500@pop.intergate.ca>

This is 9 months old now, but I just ran across it.  I suspect that many in
this community will find it interesting, whether they agree with the
premise or not.

 http://www.4k-associates.com/IEEE-L7-WAP-BIG.html

 -Tim




From david-b at pacbell.net  Tue Dec 21 17:27:21 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: ErrorHandler mods
References: <14431.36842.362320.404738@localhost.localdomain>
Message-ID: <385FB874.1EF10BE3@pacbell.net>

David Megginson wrote:
> 
> Following up on a long-ago suggestion from (I think) David Brownell,

[ not that I recall, FWIW ]

> I'm considering adding an extra method, validationError(), to the SAX2
> ErrorHandler.  That way, we'd have the following methods:
> 
>   warning - any kind of non-fatal warning
>   error - an optionally-reportable error of some kind, or an error
>           from outside the XML 1.0 domain (such as a Namespace error
>           that doesn't affect well-formedness)
>   validationError - an error in schema validation
>   fatalError - a well-formedness error
> 
> Does this work?

Hmm, if it's schema-specific, then shouldn't it be "schemaError" ?

Were I to ask for more specific indications of DTD (or schema) related
errors, I'd use subclasses of SAXParseException, passed to error().
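
A minimal sketch of that alternative, using the existing org.xml.sax.ErrorHandler unchanged. ValidityException is a hypothetical subclass invented for this example; SAX defines no such class.

```java
import org.xml.sax.ErrorHandler;
import org.xml.sax.Locator;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical subclass marking a DTD or schema validity error.
class ValidityException extends SAXParseException {
    ValidityException(String message, Locator locator) {
        super(message, locator);
    }
}

// Handler that separates validity errors from other recoverable errors
// by exception type, with no extra callback on the interface.
final class TypeDispatchingHandler implements ErrorHandler {
    int validityErrors;
    int otherErrors;

    @Override public void warning(SAXParseException e) {
        // ignored in this sketch
    }

    @Override public void error(SAXParseException e) {
        if (e instanceof ValidityException) {
            validityErrors++;
        } else {
            otherErrors++;
        }
    }

    @Override public void fatalError(SAXParseException e) throws SAXException {
        throw e;
    }
}
```

Handlers that do not care about the distinction keep working as before, since a ValidityException is still a SAXParseException.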

Note that this addresses the same issue that the integer code in
exceptions addresses, but it does it in a different way.  I'd really
go for only one solution to such problems.

- Dave



From smuench at us.oracle.com  Tue Dec 21 17:31:16 1999
From: smuench at us.oracle.com (Steve Muench)
Date: Mon Jun  7 17:18:47 2004
Subject: Websites using XML
References: <8525684E.0053CFA4.00@smtp.apprise.com>
Message-ID: <009c01bf4bd8$cdb87420$5f7a1990@us.oracle.com>

Alex,

At XML99 Adam Bosworth and Charlie Heinemann from
Microsoft repeatedly cited their MSDN site as a showcase
reference for a big site using lots of XML/XSLT
behind the scenes (presumably in combination with
ASP pages).

Maybe someone from Microsoft can send some pointers.

_________________________________________________________
Steve Muench, Consulting Product Manager & XML Evangelist
Business Components for Java Development Team
http://technet.oracle.com/tech/java
http://technet.oracle.com/tech/xml
----- Original Message -----
From: "Alex Garrett" <Alex_Garrett@coregis.com>
To: <xml-dev@ic.ac.uk>
Sent: Tuesday, December 21, 1999 7:12 AM
Subject: Websites using XML


|
|
| I realise this is probably the wrong place to ask this, but I've come to
| rely on the concentrated expertise of the list members and I ask your
| indulgence. What I need to know is the names (and possibly URLs) of
| companies that have web sites that are implemented in XML. I need to
| provide a case for designing a site in XML, and a crucial element for
| that case is to demonstrate that it's been done before and it works.
| There's no need to clutter the list any more than I already have with
| this -- please send responses straight to me. If there's interest, I'll
| compile a list of responses and post it. Thanks,
|
| Alex
| alex_garrett@coregis.com




From sharris at primus.com  Tue Dec 21 17:35:42 1999
From: sharris at primus.com (Steve Harris)
Date: Mon Jun  7 17:18:47 2004
Subject: Request for Discussion: SAX 1.0 in C++
Message-ID: <EE5F339A2558D311B7360008C73BFD001687BB@exchange1.primus.com>

Steinar Bang wrote:
> Sure, but how about this, then...?
> 
> typedef unsigned short SAXChar;
> class SAXString : public basic_string<SAXChar> {
> public:
>     SAXString(const SAXString& s, size_type pos = 0, size_type n = npos)
> 	: basic_string<SAXChar>(s,pos,n) {}
>     SAXString(const SAXChar* p, size_type n)
> 	: basic_string<SAXChar>(p,n) {}
>     SAXString(const SAXChar* p)
> 	: basic_string<SAXChar>(p) {}
>     SAXString(size_type n, SAXChar c)
> 	: basic_string<SAXChar>(n,c) {}
>     SAXString(const char* p)
> 	: basic_string<SAXChar>()
>     {
> 	// do UTF-8 to UTF-16 decoding of p here
>     }
>     
> };

This UTF-8/UTF-16 representation translation seems like a job for Standard
C++'s <locale>/codecvt facility, not something to be embedded in a string
class.  Here are a couple of somewhat relevant Usenet postings that may
encourage you to investigate a more appropriate solution:

http://www.deja.com/[ST_rn=ps]/getdoc.xp?AN=530606635&fmt=text
http://www.deja.com/[ST_rn=ps]/getdoc.xp?AN=530837402&fmt=text


Steven E. Harris
Senior Software Engineer
PRIMUS



From david-b at pacbell.net  Tue Dec 21 18:06:01 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
References: <3.0.32.19991218163458.01500100@pop.intergate.ca> <14430.36362.589577.199567@localhost.localdomain>
Message-ID: <385FC180.8AFC226E@pacbell.net>

David Megginson wrote:
> 
> Tim Bray writes:
>  >
>  > Is there an application scenario where you get
>  >
>  > <a xmlns:a="http://x.com" xmlns:b="http://x.com">
>  >   <a:foo /><b:bar />
>  >   </a>
>  >
>  > ...and you actually care that the foo and bar had different prefixes?
>  > I'd find that really hard to believe.

Anything that is, to use Tim's term, "namespace oblivious".

Which essentially means all XML software written to date,
as well as anything written using the widely available
namespace-oblivious APIs -- an issue that'll take some time
to fix, since when namespace-aware APIs become moderately
available in six months or so, developers still won't be
using them all that widely.
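
For concreteness, here is a minimal Java sketch of the two views of Tim's
example.  It assumes the JAXP and SAX2 APIs as they later shipped with the
JDK (javax.xml.parsers, org.xml.sax), which postdate this thread; the class
name PrefixViews and the method describe are made up for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of any SAX draft discussed here.
class PrefixViews {
    /** Returns one "{uri}localName qName" entry per element, in document order. */
    static List<String> describe(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true); // ask the parser for the namespace view
        SAXParser parser = factory.newSAXParser();
        final List<String> seen = new ArrayList<>();
        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes atts) {
                seen.add("{" + uri + "}" + localName + " " + qName);
            }
        });
        return seen;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<a xmlns:a='http://x.com' xmlns:b='http://x.com'>"
                   + "<a:foo/><b:bar/></a>";
        for (String entry : describe(xml)) {
            System.out.println(entry);
        }
    }
}
```

A namespace-aware consumer compares only the {uri}localName part, so foo
and bar land in the same namespace whatever their prefixes.  Code that
looks only at the qName still sees "a:foo" and "b:bar" and treats the
prefixes as significant.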


> Yeah, I do too, but so many people (some from big companies) have
> sworn up and down on every holy book in Creation that they need that
> kind of thing -- and managed to convince the DOM WG to include it in
> DOM2 -- that I simply have to assume myself (and Tim) wrong.

It's the distinction between new code and old code.  XML isn't a
"clean slate" design any more, it's got to deal with some history.
One of them is that a DOM L2 implementation MUST NOT (!!!) break
existing applications -- L2 is there to add features, not sacrifice
compatibility.

If XML 1.0 had included namespaces there would be no backward
compatibility issues in this area.  (There are lots of other
areas to worry about though!)  But with over two realtime years
of lag between the XML 1.0 spec and the DOM Level 2 spec, there
is a significant chunk of code that must still work.

- Dave



From david-b at pacbell.net  Tue Dec 21 18:20:53 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: Representing IP addresses in XML Schema
References: <000801bf4bae$ee397980$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <385FBE1F.B6B1DD8C@pacbell.net>

Leigh Dodds wrote:
> 
> > > > However, this is not satisfactory - it allows each field in the IP to
> > > > have values from 000-999.  I want to restrict the possible values to:
> > > > [0-255].[0-255].[0-255].[0-255]
> > >
> > > You can avoid the problem by storing the IP as a hex address, as in:
> > >
> > > ff.ff.ff.ff
> >
> > Thanks Ray.  An interesting choice of words - "You can avoid the
> > 'problem' ...". 

And create a worse one.  IP addresses are "dotted DECIMAL"
not "dotted hex".  

- Dave



From david-b at pacbell.net  Tue Dec 21 18:28:42 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
References: <AA4C152BA2F9D211B9DD0008C79F760A675199@odin.cromwellmedia.co.uk> <m3k8m9iorz.fsf@localhost.localdomain>
Message-ID: <385FC6D3.EA1B96BC@pacbell.net>

David Megginson wrote:
> 
> Miles Sabin <msabin@cromwellmedia.co.uk> writes:
> 
> > > If I were doing it over, though, I would actually call
> > > java.lang.String.intern once for each of the strings in the
> > > intern table so that they were == to the regular intern'ed
> > > versions.
> >
> > Try it, but I think you'll be more likely to lose than gain.

Having done this, I'll disagree.  The cost wasn't observable.

Except ... that processing some documents blew up a fixed-size
table in at least one incarnation of the JDK 1.1.x JVM.  I saw
that in a pathological stress test case (which generated random
documents), never once "in the wild", but it does provide yet
another reason to use a better JVM (like JDK 1.2.x) on servers
and in other cases where you expect the JVM to live a long time.


> I'd appreciate more information here -- if I call
> java.lang.String.intern the first time I add a string to the intern
> table, then the cost is proportional to the number of entries in the
> table, not the number of accesses.

Right.  The main cost being mapping from a portion of a buffer
to the String ... a "local" intern.  Going one step beyond that
to a "global" intern (using the JVM) is cheap, since it's hardly
ever done.
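
A rough Java sketch of that two-level scheme follows.  It is an
illustration under assumptions, not code from any parser mentioned in this
thread: the class LocalInternTable, its bucket layout, and the call counter
are all invented here.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical "local" intern table that falls back to String.intern() on a miss.
class LocalInternTable {
    private final Map<Integer, List<String>> buckets = new HashMap<>();
    private int globalInterns = 0;

    /** Maps buf[start .. start+length) to one canonical String instance. */
    String intern(char[] buf, int start, int length) {
        int hash = 0;
        for (int i = 0; i < length; i++) {
            hash = 31 * hash + buf[start + i];
        }
        List<String> bucket = buckets.computeIfAbsent(hash, k -> new ArrayList<>());
        for (String candidate : bucket) {
            if (sameChars(candidate, buf, start, length)) {
                return candidate; // hit: no allocation, no JVM-wide lookup
            }
        }
        // Miss: allocate once and do the "global" intern once per distinct name.
        String canonical = new String(buf, start, length).intern();
        globalInterns++;
        bucket.add(canonical);
        return canonical;
    }

    int globalInternCalls() {
        return globalInterns;
    }

    private static boolean sameChars(String s, char[] buf, int start, int length) {
        if (s.length() != length) {
            return false;
        }
        for (int i = 0; i < length; i++) {
            if (s.charAt(i) != buf[start + i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        LocalInternTable table = new LocalInternTable();
        char[] buf = "<div2><div2>".toCharArray();
        String first = table.intern(buf, 1, 4);
        String second = table.intern(buf, 7, 4);
        System.out.println(first == second);           // same instance both times
        System.out.println(first == "div2");           // also == the JVM's interned literal
        System.out.println(table.globalInternCalls()); // one global intern for two lookups
    }
}
```

The JVM-wide intern runs once per distinct name, so its cost tracks the
size of the table rather than the number of lookups.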

- Dave



From david at megginson.com  Tue Dec 21 18:44:14 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
In-Reply-To: <385FB61D.157D0C90@pacbell.net>
References: <14431.36102.720933.248648@localhost.localdomain>
	<385FB61D.157D0C90@pacbell.net>
Message-ID: <14431.51777.724090.430167@localhost.localdomain>

David Brownell writes:

 > David Megginson wrote:
 > > 
 > > 1. SAXException extends IOException.
 > 
 > Not my preferred model ... it _isn't_ an I/O exception, and the
 > recovery mechanisms applicable to I/O problems won't typically
 > apply, or even act reasonably given a data format error (which
 > is what a SAXException basically indicates).
 > 
 > What's the rationale behind this proposal?

Consider higher-level APIs that happen to allow XML input.  It seems
(to me, at least) that the following makes sense:

  public interface Gizmo
  {
    ...
    public void import (String uri) throws IOException;
  }

You don't want to hardwire SAXException into the Gizmo interface,
because it really has nothing to do with the problem domain, and
people may choose to implement other kinds of readers (say, over the
DOM); you could, of course, create a new kind of IOException that
allows a SAXException to be embedded in it, but that hides the fact
that an XML parsing error really is an I/O error for most
applications -- XML's just a high-level read format.
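
For comparison, the wrapping alternative mentioned above might look
something like the following Java sketch.  Several things in it are
assumptions: "import" is a reserved word in Java, so the method is called
load here; it takes an InputSource rather than a URI string so it can be
run without a network; and XmlParseIOException and SaxGizmo are
hypothetical names.  The sketch targets the org.xml.sax API as it was
eventually released, where SAXException extends java.lang.Exception.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Higher-level API: callers see only IOException.
interface Gizmo {
    void load(InputSource source) throws IOException;
}

// Hypothetical IOException subclass that carries the SAXException as its cause.
class XmlParseIOException extends IOException {
    XmlParseIOException(SAXException cause) {
        super(cause.getMessage(), cause);
    }
}

// One possible reader; a DOM-based Gizmo could implement the same interface.
class SaxGizmo implements Gizmo {
    @Override
    public void load(InputSource source) throws IOException {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(source, new DefaultHandler());
        } catch (SAXException e) {
            throw new XmlParseIOException(e); // tunnel the parse error
        } catch (ParserConfigurationException e) {
            throw new IOException("no usable SAX parser", e);
        }
    }

    public static void main(String[] args) {
        Gizmo gizmo = new SaxGizmo();
        try {
            gizmo.load(new InputSource(new StringReader("<a><b></a>")));
        } catch (IOException e) {
            System.out.println("load failed: " + e.getCause());
        }
    }
}
```

The caller has to dig the parse error out with getCause(), which is the
extra layer the proposal is trying to avoid.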


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Dec 21 18:51:45 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
In-Reply-To: David Brownell's message of "Tue, 21 Dec 1999 10:05:52 -0800"
References: <3.0.32.19991218163458.01500100@pop.intergate.ca> <14430.36362.589577.199567@localhost.localdomain> <385FC180.8AFC226E@pacbell.net>
Message-ID: <m3n1r4gmbj.fsf@localhost.localdomain>

David Brownell <david-b@pacbell.net> writes:

> > Yeah, I do too, but so many people (some from big companies) have
> > sworn up and down on every holy book in Creation that they need that
> > kind of thing -- and managed to convince the DOM WG to include it in
> > DOM2 -- that I simply have to assume myself (and Tim) wrong.
> 
> It's the distinction between new code and old code.  XML isn't a
> "clean slate" design any more, it's got to deal with some history.
> One of them is that a DOM L2 implementation MUST NOT (!!!) break
> existing applications -- L2 is there to add features, not sacrifice
> compatibility.

That misses Tim's original argument, though -- he was arguing that
people who don't want Namespace processing can use SAX1, and people
who want it can use SAX2, and wondered (aloud) why there would be
anyone who would need both in the same API.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rhanson at blast.net  Tue Dec 21 18:52:06 1999
From: rhanson at blast.net (Robert Hanson)
Date: Mon Jun  7 17:18:47 2004
Subject: Representing IP addresses in XML Schema
References: <000801bf4bae$ee397980$ab20268a@pc-lrd.bath.ac.uk> <385FBE1F.B6B1DD8C@pacbell.net>
Message-ID: <042401bf4be3$7487cee0$0cb919ce@INTERNETDEPT>

How about this...

((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])

I took a quick look at the new version of the spec, and this seems to conform
to the syntax.  This will not check for IP addresses in the 127.?.?.? range or
the 0.?.?.? range, but that could be added.
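
As a quick sanity check, the expression can be exercised outside a schema
processor.  The Java sketch below uses java.util.regex, not an XML Schema
engine, so it only approximates schema behaviour; matches() is used because
schema patterns are implicitly anchored.  The dots are escaped in STRICT,
since an unescaped "." matches any character; LOOSE keeps them unescaped
for comparison.  The class and field names are invented for the example.

```java
import java.util.regex.Pattern;

// Hypothetical checker for the dotted-decimal pattern discussed above.
class DottedQuad {
    private static final String OCTET = "(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])";

    // "\\." in Java source is the regex \. (a literal dot).
    static final Pattern STRICT = Pattern.compile("(" + OCTET + "\\.){3}" + OCTET);

    // Same expression with an unescaped dot, which accepts any separator.
    static final Pattern LOOSE = Pattern.compile("(" + OCTET + ".){3}" + OCTET);

    static boolean isDottedQuad(String s) {
        return STRICT.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"192.168.0.1", "255.255.255.255", "256.1.1.1",
                            "999.1.1.1", "1a2b3c4"};
        for (String s : samples) {
            System.out.println(s + " strict=" + isDottedQuad(s)
                    + " loose=" + LOOSE.matcher(s).matches());
        }
    }
}
```

Note that the octet pattern still accepts a leading zero in two-digit
fields such as "01"; whether that matters depends on the application.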

Robert

> However, this is not satisfactory - it allows each field in the IP to
> have values from 000-999.  I want to restrict the possible values to:
> [0-255].[0-255].[0-255].[0-255]




From david-b at pacbell.net  Tue Dec 21 19:23:39 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
References: <3.0.32.19991220084131.014e93c0@pop.intergate.ca>
Message-ID: <385FD3B6.8F378E5C@pacbell.net>

Tim Bray wrote:
> 
> At 08:21 AM 12/20/99 -0500, David Megginson wrote:
> >> - a pure namespaces view
> >> - a simultaneous namespaces and XML 1.0 view
> >> - a pure XML 1.0 view
> >
> >I agree -- I think that this is the cleanest approach.
> 
> I have a great deal of trouble imagining a situation in which the
> "simultaneous" view is desirable or even safe.  Could someone help out
> with a use-case please?
>
> If I'm right, then given that SAX1 already does the pure XML1.0 view, why do
> we need more than one view? -Tim

If there's going to be just one view in the system (advantageous
overall) then IMHO it needs to be simultaneous ... unless you can
afford to punt on namespace-oblivious software, which few folk
really can afford.  (DOM couldn't, as one example.)

I think that any "single view" approach is cleaner than any kind
of "selectable view" approach.  Modes at low levels just percolate
on up the stack.  While there are places that modes are the right
solution, I don't think this is one.

- Dave



From Robin_Dahlstedt at pml.com  Tue Dec 21 19:31:12 1999
From: Robin_Dahlstedt at pml.com (Robin Dahlstedt)
Date: Mon Jun  7 17:18:47 2004
Subject: Appending a Doctype node using Microsoft's XML COM interfaces
Message-ID: <8E6C9AEA17A8D2118D6E00A0C9986940F52EFF@hermes.pml.com>

I'm building an XML document in C++ using the COM interfaces provided by
Microsoft. Does anybody know of a way to append the DOCTYPE declaration to
the XML file? I've tried to get around the problem by creating an element
with the tag name "!DOCTYPE", but apparently the boys and girls down at
Microsoft already thought of that (i.e., it didn't work). From the looks of
things, I believe that Microsoft purposely left this ability out...help!



From david-b at pacbell.net  Tue Dec 21 19:37:17 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
References: <3.0.32.19991220133428.01f6ac80@nexus.webmethods.com> <m3bt7li9n7.fsf@localhost.localdomain>
Message-ID: <385FD6E3.A43A00B8@pacbell.net>

[ interning, not sax2 or namespaces ] 

David Megginson wrote:
> 
> Joe Lapp <jlapp@webMethods.com> writes:
> 
> > Using Java String interning, how do you guys guarantee performance
> > in any of the DOM Element get*() methods that take Strings?  Do you
> > require that the app intern the string before passing it in?  Do you
> > try to make the methods smart so that if they're interned, you get
> > performance, and if they aren't, you get a bit more of a penalty
> > (for having done the intern check first)? 

Lotsa questions ... the optimization isn't _specifically_ for DOM.

Note that starting sometime around JDK 1.1.6 or so, String.equals()
tests for identity (==) before it does much else; a classic trick that
someone omitted in JDK 1.0, perhaps to ensure that the other code
paths got fully debugged (or perhaps just a performance bug).

What this means is that if you consistently use a string returned by
a SAX parser (interned either locally or, for a bigger win, globally)
you'll detect the "equals == true" case much more quickly, on average.


> >  Do you make them dumb so
> > that if you forgot to intern, you don't get anything?  Or would one
> > always intern these externally provided Strings within the method?
> 
> With the DOM, I think, the biggest issue is not performance but memory
> usage -- you do not want 500 separate "div2" strings floating around
> in the same tree.  Interning is much more obviously essential for a
> tree API than it is for a streaming API.

And the CPU performance improvement is "on average"; yes, the memory
performance improvement is a bit more direct in that scenario!
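
To put a number on the memory point, here is a small Java sketch.  It
assumes a tree builder that creates each element name from the parser's
character buffer; NameSharing and its methods are hypothetical names.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical demonstration of how many String objects a tree would hold.
class NameSharing {
    /** Counts distinct String objects by identity, not by content. */
    static int distinctObjects(String[] names) {
        Map<String, Boolean> byIdentity = new IdentityHashMap<>();
        for (String name : names) {
            byIdentity.put(name, Boolean.TRUE);
        }
        return byIdentity.size();
    }

    /** Builds the element names a tree of `count` div2 elements would store. */
    static String[] elementNames(int count, boolean interned) {
        String[] names = new String[count];
        for (int i = 0; i < count; i++) {
            // A fresh String per element, as if copied out of a parse buffer.
            String fresh = new String("div2".toCharArray());
            names[i] = interned ? fresh.intern() : fresh;
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(distinctObjects(elementNames(500, false))); // 500 objects
        System.out.println(distinctObjects(elementNames(500, true)));  // 1 shared object
    }
}
```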

- Dave



From david-b at pacbell.net  Tue Dec 21 19:45:01 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
References: <3.0.32.19991218163458.01500100@pop.intergate.ca>
	 <14430.36362.589577.199567@localhost.localdomain>
	 <385FC180.8AFC226E@pacbell.net> <m3n1r4gmbj.fsf@localhost.localdomain>
Message-ID: <385FD8B8.C027C016@pacbell.net>

David Megginson wrote:
> 
> David Brownell <david-b@pacbell.net> writes:
> 
> > > Yeah, I do too, but so many people (some from big companies) have
> > > sworn up and down on every holy book in Creation that they need that
> > > kind of thing -- and managed to convince the DOM WG to include it in
> > > DOM2 -- that I simply have to assume myself (and Tim) wrong.
> >
> > It's the distinction between new code and old code.  XML isn't a
> > "clean slate" design any more, it's got to deal with some history.
> > One of them is that a DOM L2 implementation MUST NOT (!!!) break
> > existing applications -- L2 is there to add features, not sacrifice
> > compatibility.
> 
> That misses Tim's original argument, though -- he was arguing that
> people who don't want Namespace processing can use SAX1, and people
> who want it can use SAX2, and wondered (aloud) why there would be
> anyone who would need both in the same API.

For the non-namespace features that SAX2 will provide ... it will
have some, right?  Alpha certainly does.

Examples include lexical noise (CDATA, comments, etc), explicit
access to validation capabilities, and ability to tell if a parser
is reading the entire document (i.e. does it read external entities).

Oh, and that namespace support that seems to have gotten no comments
in this discussion ... ;-)

The current published SAX2 API makes it easy for apps to use such
features if they exist.  The notion of an incompatible API evolution
does not have that feature ... the notion of a "pluggable" parser
(typically SAX1 for now, later SAX2) wouldn't work.

- Dave



From hendy at tedopres.nl  Tue Dec 21 19:58:26 1999
From: hendy at tedopres.nl (hendy)
Date: Mon Jun  7 17:18:47 2004
Subject: &eacute
Message-ID: <016601bf4c39$94465f70$1900000a@intag.tedopres.nl>

I use the sx parser, and it generates an error like this:
"E:\Test3\sx.exe:D:\IBBL\uit1227.sgm:32:57:W: reference to internal SDATA
entity "eacute" not allowed in XML"
-------------------------
Uit1227.sgml contains:
"...
 ingeval van knelpunten MWFO adviseren om &eacute;&eacute;n
..."

As far as I know, &eacute; is a convention for writing special marks and
punctuation. In the catalog file for my parser, I have already declared the
necessary entity sets for special marks like &eacute;:
....

PUBLIC "ISO 8879-1986//ENTITIES Russian Cyrillic//EN"
"entities/iso-cyr1.gml"
PUBLIC "ISO 8879-1986//ENTITIES Non-Russian Cyrillic//EN"
"entities/iso-cyr2.gml"
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 1//EN" "entities/iso-lat1.gml"
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 1//EN" "entities/iso-lat1.gml"
PUBLIC "ISO 8879-1986//ENTITIES Added Latin 2//EN" "entities/iso-lat2.gml"
PUBLIC "ISO 8879:1986//ENTITIES Added Latin 2//EN" "entities/iso-lat2.gml"
PUBLIC "ISO 8879-1986//ENTITIES Greek Letters//EN" "entities/iso-grk1.gml"
PUBLIC "ISO 8879:1986//ENTITIES Greek Letters//EN" "entities/iso-grk1.gml"
PUBLIC "ISO 8879-1986//ENTITIES Monotoniko Greek//EN"
"entities/iso-grk2.gml"
PUBLIC "ISO 8879:1986//ENTITIES Monotoniko Greek//EN"
"entities/iso-grk2.gml"
PUBLIC "ISO 8879-1986//ENTITIES Greek Symbols//EN" "entities/iso-grk3.gml"
PUBLIC "ISO 8879:1986//ENTITIES Greek Symbols//EN" "entities/iso-grk3.gml"
PUBLIC "ISO 9573-13:1991//ENTITIES Greek Symbols file://EN"
"entities/isogrk3.pen"
PUBLIC "ISO 9573-13:1991//ENTITIES Greek Symbols//EN" "entities/isogrk3.pen"
PUBLIC "ISO 8879:1986//ENTITIES Alternative Greek Symbols//EN"
"entities/iso-grk4.gml"
PUBLIC "ISO 8879-1986//ENTITIES Alternative Greek Symbols//EN"
"entities/iso-grk4.gml"
PUBLIC "ISO 9573-13:1991//ENTITIES Alternative Greek Symbols file://EN"
"entities/isogrk4.pen"
PUBLIC "ISO 9573-13:1991//ENTITIES Alternative Greek Symbols//EN"
"entities/isogrk4.pen"

....

----------------------------------------
But the result, Uit1227.xml, still contains &eacute; instead of converting it
to a form that can be read by XML tools / IE 5.0:
...
ingeval van knelpunten MWFO adviseren om &eacute;&eacute;n
...


Am I missing a character entity declaration for &eacute;, &oacute;, &euml;, etc.?
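
One possible way to check this from the XML side is sketched below in
Java.  It assumes the goal is an XML file that a generic parser accepts:
XML predefines only lt, gt, amp, apos and quot, so &eacute; has to be
declared in the DTD (or written as the character reference &#233;).  The
class EacuteCheck and the sample documents are made up for the example and
say nothing about how sx itself should be configured.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical check: does a document using &eacute; parse, and to what text?
class EacuteCheck {
    /** Parses xml and returns the concatenated character data. */
    static String textOf(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParserFactory.newInstance().newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void characters(char[] ch, int start, int length) {
                        text.append(ch, start, length);
                    }
                });
        return text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Declared in the internal subset: the reference expands to U+00E9.
        String declared = "<!DOCTYPE p [<!ENTITY eacute \"&#233;\">]>"
                        + "<p>&eacute;&eacute;n</p>";
        System.out.println(textOf(declared).equals("\u00e9\u00e9n"));

        // Not declared: a conforming parser reports a fatal error.
        try {
            textOf("<p>&eacute;&eacute;n</p>");
        } catch (org.xml.sax.SAXParseException e) {
            System.out.println("undeclared entity: " + e.getMessage());
        }
    }
}
```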


H
-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991221/63758f4c/attachment.htm
From david-b at pacbell.net  Tue Dec 21 20:26:31 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: The per-element-type namespace partition
References: <000001bf4b50$d7844520$d1940e18@smateo1.sfba.home.com>
Message-ID: <385FE26D.CE45E36F@pacbell.net>

Don Park wrote:
> 
> Well, I find it rather counter-intuitive.

That makes _at least_ two of us ... ;-)

With both Tim and Andrew posting on this thread ... perhaps
one of them could provide some of the use cases that were
proposed where 

	<foo:a foo:href="bar">
	<foo:a href="bar">

should NOT be equivalent?  Lacking such use cases, I'm sure
Don and I aren't the only ones who will forever remain puzzled.

I noticed the latest version of the DOM L2 spec had to have a
specific warning on this issue, which suggests something.

- Dave



From mda at discerning.com  Tue Dec 21 21:04:17 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
In-Reply-To: <14431.36102.720933.248648@localhost.localdomain>
Message-ID: <888534362.945781339@MDAXKE>

> 5. Have all callbacks that formerly threw SAXException throw
>    IOException instead.  This should help to avoid a lot of exception
>    tunneling.

when you say "throw", do you mean a native C++ raising of an exception?
or you mean an error handler?

will there be both?

which ones are continuable?

there are also potential MT issues which i mentioned on a SAX2 thread
a few weeks ago.

-mda




From james.britt at rez.com  Tue Dec 21 21:06:33 1999
From: james.britt at rez.com (Britt, James)
Date: Mon Jun  7 17:18:47 2004
Subject: Appending a Doctype node using Microsoft's XML COM interfaces
Message-ID: <EBEFB50D4AD6D111846B00A0C9B3CDB704E86E9F@pheexh01.anasazi.com>


>-----Original Message-----
>From: Robin Dahlstedt [mailto:Robin_Dahlstedt@pml.com]
>Im building an XML document in C++ using the COM interfaces provided by
>Microsoft. Does anybody know of a way to append the DOCTYPE 
>declaration to
>the xml file? Ive tried to get around the problem by creating 
>an element
>with the tagname "!DOCTYPE", but apparently, the boys and girls down at
>Microsoft already thought of that (ie: it didn't work). From 
>the looks of
>things, I believe that Microsoft purposefully left this 
>ability out...help!

And well they should, since the doctype attribute is read-only.

See http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#i-Document

where it purposefully states:
"The DOM Level 1 does not support editing the Document Type Declaration, 
therefore docType cannot be altered in any way."
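
For what it's worth, DOM Level 2 later added a way to supply the doctype
when the document is created, rather than editing it afterwards.  The Java
sketch below uses the W3C bindings shipped with the JDK (org.w3c.dom); it
says nothing about what the Microsoft COM interfaces offer, and the class
name DoctypeAtCreation and the sample names are invented.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.DOMImplementation;
import org.w3c.dom.Document;
import org.w3c.dom.DocumentType;

// Hypothetical helper showing the DOM Level 2 creation-time route.
class DoctypeAtCreation {
    static Document newDocument(String rootName, String systemId) throws Exception {
        DOMImplementation impl = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().getDOMImplementation();
        // publicId is left null; only a SYSTEM identifier is set here.
        DocumentType doctype = impl.createDocumentType(rootName, null, systemId);
        // The doctype is attached at creation; it stays read-only afterwards.
        return impl.createDocument(null, rootName, doctype);
    }

    public static void main(String[] args) throws Exception {
        Document doc = newDocument("report", "report.dtd");
        System.out.println(doc.getDoctype().getName());
        System.out.println(doc.getDoctype().getSystemId());
    }
}
```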


James



From david-b at pacbell.net  Tue Dec 21 21:28:21 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
References: <888534362.945781339@MDAXKE>
Message-ID: <385FF0F0.4332B03C@pacbell.net>

"Mark D. Anderson" wrote:
> 
> > 5. Have all callbacks that formerly threw SAXException throw
> >    IOException instead.  This should help to avoid a lot of exception
> >    tunneling.
> 
> when you say "throw", do you mean a native C++ raising of an exception?
> or you mean an error handler?
> 
> will there be both?
> 
> which ones are continuable?

Neither Java nor C++ has "continuable" exceptions; what are you
talking about?


> there are also potential MT issues which i mentioned on a SAX2 thread
> a few weeks ago.

Perhaps you could give a pointer to the archive entry.

There should be no MT issues, beyond the fact that multithreaded C++
isn't standardized.  Any "catch" clauses clearly need to operate in
exactly the context of the thread whose stack is being unwound, and
be able to release locks owned by that thread.

- Dave



From andrewl at microsoft.com  Tue Dec 21 21:30:23 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:47 2004
Subject: The per-element-type namespace partition
Message-ID: <33D189919E89D311814C00805F1991F7F4AA77@RED-MSG-08>

Maybe some day Tim and I or some other member of the Namespaces WG will
write a nice prose document explaining in lengthy terms what and why the
Namespaces Specification says so tersely and occasionally obscurely.  It
would save us going through this every-three-month exercise in hermeneutics.


However, the present lack of such a document explaining why the various
choices were made does not affect the actual specification, which, for
better or for worse, says that the following two things are not necessarily
the same:

	<foo:a foo:href="bar">
	<foo:a href="bar">

Tim wrote a nice appendix, A, that does a decent job of describing this
(though, as Murray Maloney has pointed out, the 

That disclaimer aside, I recall that a major part of the motivation for the
distinction was the desire to allow for "global attributes," where a
qualified attribute such as "foo:href" could have a definition and meaning
independent of the element within which it appeared, and at the same time
continue the current practice fostered by DTDs in which an unqualified
attribute may have a definition and meaning local to the enclosing element.
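
The distinction shows up directly in what a namespace-aware parser
reports.  The Java sketch below assumes the JAXP/SAX2 APIs as later
shipped with the JDK; the namespace URI urn:example:foo and the class name
AttributePartitions are made up for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical probe: which namespace does the parser assign to href?
class AttributePartitions {
    /** Returns the namespace URI reported for the attribute with local name href. */
    static String hrefNamespace(String xml) throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setNamespaceAware(true);
        final String[] found = new String[1];
        factory.newSAXParser().parse(new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String localName,
                                             String qName, Attributes atts) {
                        for (int i = 0; i < atts.getLength(); i++) {
                            if ("href".equals(atts.getLocalName(i))) {
                                found[0] = atts.getURI(i);
                            }
                        }
                    }
                });
        return found[0];
    }

    public static void main(String[] args) throws Exception {
        String ns = "xmlns:foo='urn:example:foo'";
        // Qualified ("global") attribute: reported in the foo namespace.
        System.out.println("[" + hrefNamespace("<foo:a " + ns + " foo:href='bar'/>") + "]");
        // Unqualified attribute: reported with an empty namespace URI.
        System.out.println("[" + hrefNamespace("<foo:a " + ns + " href='bar'/>") + "]");
    }
}
```

The unprefixed href does not inherit the element's namespace; its meaning
is left to the enclosing element type.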

I hope this is helpful,
Andrew Layman



From david at megginson.com  Tue Dec 21 21:45:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
In-Reply-To: "Mark D. Anderson"'s message of "Tue, 21 Dec 1999 13:02:19 -0800"
References: <888534362.945781339@MDAXKE>
Message-ID: <m3aen4rmtl.fsf@localhost.localdomain>

"Mark D. Anderson" <mda@discerning.com> writes:

> > 5. Have all callbacks that formerly threw SAXException throw
> >    IOException instead.  This should help to avoid a lot of exception
> >    tunneling.
> 
> when you say "throw", do you mean a native C++ raising of an exception?
> or you mean an error handler?

Right now, I'm talking just about Java -- I need advice on C++
exceptions, since I don't understand them nearly as well.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Tue Dec 21 21:47:00 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Namespace proposal
In-Reply-To: David Brownell's message of "Tue, 21 Dec 1999 11:44:56 -0800"
References: <3.0.32.19991218163458.01500100@pop.intergate.ca> <14430.36362.589577.199567@localhost.localdomain> <385FC180.8AFC226E@pacbell.net> <m3n1r4gmbj.fsf@localhost.localdomain> <385FD8B8.C027C016@pacbell.net>
Message-ID: <m37li8rmqw.fsf@localhost.localdomain>

David Brownell <david-b@pacbell.net> writes:

> The current published SAX2 API makes it easy for apps to use such
> features if they exist.  The notion of an incompatible API evolution
> does not have that feature ... the notion of a "pluggable" parser
> (typically SAX1 for now, later SAX2) wouldn't work.

I currently plan to keep get/setFeature and get/setProperty in the new
SAX2, but haven't had an opportunity to post that point yet.
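
As an illustration of how an application might probe for optional
capabilities, here is a Java sketch against the XMLReader interface as it
eventually shipped in org.xml.sax.  The alpha API under discussion may
differ in detail; the feature URI is the one from the released SAX2, and
FeatureProbe and the deliberately bogus URI are invented for the example.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

// Hypothetical helper for asking a pluggable parser what it can do.
class FeatureProbe {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";

    /** Returns the feature's state, or null if the parser does not know or support it. */
    static Boolean probe(XMLReader reader, String featureUri) {
        try {
            return reader.getFeature(featureUri);
        } catch (SAXNotRecognizedException | SAXNotSupportedException e) {
            return null; // the application can fall back instead of failing
        }
    }

    public static void main(String[] args) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        System.out.println("namespaces: " + probe(reader, NAMESPACES));
        System.out.println("bogus: "
                + probe(reader, "http://example.invalid/no-such-feature"));
    }
}
```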


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From stele at fxtech.com  Tue Dec 21 21:52:36 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
References: <888534362.945781339@MDAXKE> <m3aen4rmtl.fsf@localhost.localdomain>
Message-ID: <385FF6E4.1FC0B8C0@fxtech.com>

> Right now, I'm talking just about Java -- I need advice on C++
> exceptions, since I don't understand them nearly as well.

When used well, they make everything nicer (just like in Java). I'm not
sure what advice you are seeking, but there shouldn't be any reason not
to use them for a C++ implementation of your work.

-Paul

--
Paul Miller - stele@fxtech.com



From mda at discerning.com  Tue Dec 21 21:53:36 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
References: <888534362.945781339@MDAXKE> <385FF0F0.4332B03C@pacbell.net>
Message-ID: <00eb01bf4bfd$00cda730$0200a8c0@mdaxke>

> > which ones are continuable?
> 
> Neither Java nor C++ has "continuable" exceptions; what are you
> talking about?

sorry, i didn't mean it in the technical sense.
but instead asking whether all exceptions are going to be "fatal".
personally i prefer that -- i don't like using exceptions as a
message sending mechanism. but then there does need to be a way
to signal warnings that don't mean that parsing can't be continued,
if you follow my double-negative drift. 

> 
> 
> > there are also potential MT issues which i mentioned on a SAX2 thread
> > a few weeks ago.
> 
> Perhaps you could give a pointer to the archive entry.

this was intermixed in the "SAX/C++: C++-specific design principles" thread
at http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Dec-1999/0368.html

> There should be no MT issues, beyond the fact that multithreaded C++
> isn't standardized.  Any "catch" clauses clearly need to operate in
> exactly the context of the thread whose stack is being unwound, and
> be able to release locks owned by that thread.

i didn't mean anything as deep as that. actually it isn't an MT issue
so much as a multiple use issue -- the last draft i examined appeared
not to have enough information in the exception data structure for
me to recover which of possibly multiple parsing activities it is
related to (in case i've got a single catch at top level, whether ST or MT).

-mda




From david-b at pacbell.net  Tue Dec 21 22:26:25 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
References: <888534362.945781339@MDAXKE> <385FF0F0.4332B03C@pacbell.net> <00eb01bf4bfd$00cda730$0200a8c0@mdaxke>
Message-ID: <385FFE8C.E96EA6BF@pacbell.net>

"Mark D. Anderson" wrote:
> 
> > > which ones are continuable?
> >
> > Neither Java nor C++ has "continuable" exceptions; what are you
> > talking about?
> 
> sorry, i didn't mean it in the technical sense.
> but instead asking whether all exceptions are going to be "fatal".

If you catch an exception, you get to decide how to continue
processing starting at the point you caught it, including the
optional rethrow of that exception after local cleanup.  Same
in Java and C++ -- once you throw, the stack unwinds till some
handler says to stop unwinding, continue from there.  Right?
(My C++ has gotten rusty.)
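
A minimal C++ sketch of the behaviour described above: an intermediate
handler does local cleanup and rethrows, and an outer handler stops the
unwinding so that processing continues from there. The names ParseError,
parseChunk, parseWithCleanup and run are hypothetical and not part of any
SAX binding.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type standing in for a SAX-style parse error.
struct ParseError : std::runtime_error {
    explicit ParseError(const std::string &msg) : std::runtime_error(msg) {}
};

// Hypothetical worker: throws when asked to, otherwise succeeds.
void parseChunk(bool fail) {
    if (fail) throw ParseError("not well-formed");
}

// Catches at an intermediate level, does local cleanup, then rethrows
// the same exception object with a bare `throw;`.
void parseWithCleanup(bool fail, std::vector<std::string> &log) {
    try {
        parseChunk(fail);
        log.push_back("parsed");
    } catch (const ParseError &) {
        log.push_back("cleanup");
        throw;  // unwinding resumes toward the next enclosing handler
    }
}

// The outermost handler stops the unwinding; execution continues after it.
std::vector<std::string> run(bool fail) {
    std::vector<std::string> log;
    try {
        parseWithCleanup(fail, log);
    } catch (const ParseError &e) {
        log.push_back(std::string("handled: ") + e.what());
    }
    log.push_back("continued");
    return log;
}

// Usage example.
void demo() {
    assert(run(false).size() == 2);  // "parsed", "continued"
    assert(run(true).size() == 3);   // "cleanup", "handled: ...", "continued"
}
```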


> personally i prefer that -- i don't like using exceptions as a
> message sending mechanism. but then there does need to be a way
> to signal warnings that don't mean that parsing can't be continued,
> if you follow my double-negative drift.

Like the ErrorHandler does, right?  Am I missing something, this
seems like an obvious answer, no reason for Java or C++ to differ.


> > > there are also potential MT issues which i mentioned on a SAX2 thread
> > > a few weeks ago.
> >
> > Perhaps you could give a pointer to the archive entry.
> 
> this was intermixed in the "SAX/C++: C++-specific design principles" thread
> at http://www.lists.ic.ac.uk/hypermail/xml-dev/xml-dev-Dec-1999/0368.html

Re pointer-v-ref, I'm used to throwing refs -- fewer opportunities
to leak exception objects, no heap interactions.  That'd imply (only
for C++) that ErrorHandler also gets a ref, and if it chose to report
an exception it'd be throwing that ref.  Assuming (for sake of example)
that the error handler's a non-null object pointer:

	errorHandler->error (SAXParseException (... constructor args...));

and (if memory serves)

	void error (const SAXParseException &exception)
	// may throw SAXException
	{
		if ( ... it's evil enough ... ) {
			throw exception;
		}
		...
	}
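
A compilable sketch of the same idea, under stated assumptions: the
SAXException and SAXParseException classes below are hypothetical stand-ins
(no C++ binding had been fixed at this point), and the "evil enough" test is
an arbitrary placeholder. The handler takes a const reference because
standard C++ does not let a temporary bind to a non-const reference.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-ins for the SAX exception classes.
class SAXException : public std::runtime_error {
public:
    explicit SAXException(const std::string &msg) : std::runtime_error(msg) {}
};

class SAXParseException : public SAXException {
public:
    SAXParseException(const std::string &msg, int line)
        : SAXException(msg), line_(line) {}
    int line() const { return line_; }
private:
    int line_;
};

// Handler interface taking the exception by const reference, so a
// temporary can be passed without any heap allocation.
class ErrorHandler {
public:
    virtual ~ErrorHandler() {}
    virtual void error(const SAXParseException &exception) = 0;
};

// Placeholder policy (an assumption): errors on line 1 are "evil enough".
class StrictHandler : public ErrorHandler {
public:
    void error(const SAXParseException &exception) override {
        if (exception.line() == 1) {
            throw exception;  // throws a copy of the referenced object
        }
        // otherwise the error is ignored and parsing would continue
    }
};

// What a parser might do when it hits a recoverable error.
void reportError(ErrorHandler *errorHandler, int line) {
    errorHandler->error(SAXParseException("recoverable error", line));
}

// Usage example: line 7 is tolerated, line 1 is thrown.
void demo() {
    StrictHandler handler;
    reportError(&handler, 7);
    bool thrown = false;
    try {
        reportError(&handler, 1);
    } catch (const SAXException &) {
        thrown = true;
    }
    assert(thrown);
}
```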

[ The discussions on the C++ binding have been largely ignored by yours
truly, particularly the suggestions to ignore/reinvent standard APIs, or
otherwise to encourage use of retro nonconformant C++ implementations now
that the C++ spec has finally been "blessed".  Most of the relevant features
have been stable for a very long time, as I understand things. ]


I'll confess I didn't quite notice any MT issues in that post, but as you
stated it was really a "what Parser/InputSource is in use" issue that
isn't MT-specific at all.

I can't see a way confusion could arise there unless one parser callback
needs to invoke some other parser, and is sloppy about letting exceptions
from that invocation appear as if they were exceptions from the current
invocation.  There are always ways to create bugs if code isn't careful.

- Dave



From mda at discerning.com  Tue Dec 21 23:00:13 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:47 2004
Subject: SAX2: Exceptions
In-Reply-To: <385FFE8C.E96EA6BF@pacbell.net>
Message-ID: <895304557.945788109@MDAXKE>

> If you catch an exception, you get to decide how to continue
> processing starting at the point you caught it, including the
> optional rethrow of that exception after local cleanup.  Same
> in Java and C++ -- once you throw, the stack unwinds till some
> handler says to stop unwinding, continue from there.  Right?
> (My C++ has gotten rusty.)

sure. that doesn't mean your parser wants you to go calling
random functions while you are in your exception handler.
this is mostly a documentation issue, about the underlying state
machine in the principal objects of the api.

>> personally i prefer that -- i don't like using exceptions as a
>> message sending mechanism. but then there does need to be a way
>> to signal warnings that don't mean that parsing can't be continued,
>> if you follow my double-negative drift.
> 
> Like the ErrorHandler does, right?  Am I missing something, this
> seems like an obvious answer, no reason for Java or C++ to differ.

yes, that is fine.

> Re pointer-v-ref, I'm used to throwing refs -- fewer opportunities
> to leak exception objects, no heap interactions.

Yes well, this gets into religious issues that are about as strong
as how to represent strings. As long as the API makes clear what its
intended usage model is i'm happy, since any can be made safe with
sufficient work.

> I'll confess I didn't quite notice any MT issues in that post, but as you
> stated it was really a "what Parser/InputSource is in use" issue that
> isn't MT-specific at all.
> 
> I can't see a way confusion could arise there unless one parser callback
> needs to invoke some other parser, and is sloppy about letting exceptions
> from that invocation appear as if they were exceptions from the current
> invocation.  There are always ways to create bugs if code isn't careful.

if i have a single catch which is "above" multiple simultaneous parsing
activities, then how can i determine from the exception object alone
which parser is involved? or is the answer to not do that?
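
One possible answer, sketched in C++ under assumptions: the exception itself
carries an identifier for the input it came from, in the spirit of the system
identifier that the Java SAXParseException reports. TaggedParseException,
parse and parseAll are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical exception that records which input it came from.
class TaggedParseException : public std::runtime_error {
public:
    TaggedParseException(const std::string &msg, const std::string &systemId)
        : std::runtime_error(msg), systemId_(systemId) {}
    const std::string &systemId() const { return systemId_; }
private:
    std::string systemId_;
};

// Hypothetical parse routine: fails for inputs whose id contains "bad".
void parse(const std::string &systemId) {
    if (systemId.find("bad") != std::string::npos) {
        throw TaggedParseException("parse failed", systemId);
    }
}

// A single catch above several parsing activities can still tell them
// apart by asking the exception for its source. Returns the id of the
// input that failed, or an empty string if both parsed cleanly.
std::string parseAll(const std::string &first, const std::string &second) {
    try {
        parse(first);
        parse(second);
    } catch (const TaggedParseException &e) {
        return e.systemId();
    }
    return "";
}

// Usage example.
void demo() {
    assert(parseAll("a.xml", "b.xml").empty());
    assert(parseAll("a.xml", "bad.xml") == "bad.xml");
}
```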

-mda




From donpark at docuverse.com  Tue Dec 21 23:18:37 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:47 2004
Subject: The per-element-type namespace partition
In-Reply-To: <33D189919E89D311814C00805F1991F7F4AA77@RED-MSG-08>
Message-ID: <000901bf4c09$e05ef780$d1940e18@smateo1.sfba.home.com>

>That disclaimer aside, I recall that a major part of the 
>motivation for the distinction was the desire to allow for
>"global attributes," where a qualified attribute such as
>"foo:href" could have a definition and meaning independent
>of the element within which it appeared, and at the same time
>continue the current practice fostered by DTDs in which an
>unqualified attribute may have a definition and meaning local
>to the enclosing element.

But namespaces are not about semantics but about
avoiding name clashes.  In other words, the namespace
spec is about telling 'foo:href' apart from 'html:href'
and not about what 'foo:href' means within a certain
element.  Am I mistaken?

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From andrewl at microsoft.com  Tue Dec 21 23:26:52 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:47 2004
Subject: The per-element-type namespace partition
Message-ID: <33D189919E89D311814C00805F1991F7F4AA79@RED-MSG-08>

The namespaces specification is about telling names apart.  There may be
semantics associated with the names, and in that case, distinguishing names
is a precondition of distinguishing semantics.

-----Original Message-----
From: Don Park [mailto:donpark@docuverse.com]
Sent: Tuesday, December 21, 1999 3:20 PM
To: xml-dev@ic.ac.uk
Subject: RE: The per-element-type namespace partition


>That disclaimer aside, I recall that a major part of the 
>motivation for the distinction was the desire to allow for
>"global attributes," where a qualified attribute such as
>"foo:href" could have a definition and meaning independent
>of the element within which it appeared, and at the same time
>continue the current practice fostered by DTDs in which an
>unqualified attribute may have a definition and meaning local
>to the enclosing element.

But namespaces are not about semantics but about
avoiding name clashes.  In other words, the namespace
spec is about telling 'foo:href' apart from 'html:href'
and not about what 'foo:href' means within a certain
element.  Am I mistaken?

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From rwaldin at pacbell.net  Tue Dec 21 23:33:22 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:47 2004
Subject: Representing IP addresses in XML Schema
References: <000801bf4bae$ee397980$ab20268a@pc-lrd.bath.ac.uk> <385FBE1F.B6B1DD8C@pacbell.net>
Message-ID: <38600EE3.439A598@pacbell.net>

David Brownell wrote:
> Leigh Dodds wrote:
> > > > You can avoid the problem by storing the IP as a hex address, as in:
> > > >
> > > > ff.ff.ff.ff
> > >
> > > Thanks Ray.  An interesting choice of words - "You can avoid the
> > > 'problem' ...".
> 
> And create a worse one.  IP addresses are "dotted DECIMAL"
> not "dotted hex".

Yeah, I guess you're right.  Where would dotted hex ever be useful?

My &#xBAD;!

-Ray



From Robin_Dahlstedt at pml.com  Wed Dec 22 01:41:33 1999
From: Robin_Dahlstedt at pml.com (Robin Dahlstedt)
Date: Mon Jun  7 17:18:48 2004
Subject: Appending a Doctype node using Microsoft's XML COM interfaces
Message-ID: <8E6C9AEA17A8D2118D6E00A0C9986940F52F02@hermes.pml.com>

>I'm building an XML document in C++ using the COM interfaces provided by
>Microsoft. Does anybody know of a way to append the DOCTYPE
>declaration to the XML file? I believe that Microsoft purposefully left
>this ability out.

And well they should, since the doctype attribute is read-only.

See http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#i-Document

where it purposefully states:
"The DOM Level 1 does not support editing the Document Type Declaration, 
therefore docType cannot be altered in any way."


James

Is my only option to manually open the file and put the DOCTYPE declaration
in it!?



From clovett at microsoft.com  Wed Dec 22 02:22:51 1999
From: clovett at microsoft.com (Chris Lovett)
Date: Mon Jun  7 17:18:48 2004
Subject: Websites using XML
Message-ID: <2F2DC5CE035DD1118C8E00805FFE354C19A24AFB@RED-MSG-56>

It is www.microsoft.com, not msdn.  msdn does use some XML, but only for
managing the table of contents.  We are in the process of writing an article
all about it which will contain more details.


-----Original Message-----
From: Steve Muench [mailto:smuench@us.oracle.com]
Sent: Tuesday, December 21, 1999 9:29 AM
To: Alex Garrett
Cc: XML-DEV LIST
Subject: Re: Websites using XML


Alex,

At XML99 Adam Bosworth and Charlie Heinemann from
Microsoft repeatedly cited their MSDN site as a showcase
reference for a big site using lots of XML/XSLT
behind the scenes (presumably in combination with
ASP pages).

Maybe someone from Microsoft can send some pointers.

_________________________________________________________
Steve Muench, Consulting Product Manager & XML Evangelist
Business Components for Java Development Team
http://technet.oracle.com/tech/java
http://technet.oracle.com/tech/xml
----- Original Message -----
From: "Alex Garrett" <Alex_Garrett@coregis.com>
To: <xml-dev@ic.ac.uk>
Sent: Tuesday, December 21, 1999 7:12 AM
Subject: Websites using XML


|
|
| I realise this is probably the wrong place to ask this, but I've come to rely on
| the concentrated expertise of the list members and I ask your indulgence. What I
| need to know is the names (and possibly URLs) of companies that have web sites
| that are implemented in XML. I need to provide a case for designing a site in
| XML, and a crucial element for that case is to demonstrate that it's been done
| before and it works. There's no need to clutter the list any more than I already
| have with this -- please send responses straight to me. If there's interest,
| I'll compile a list of responses and post it. Thanks,
|
| Alex
| alex_garrett@coregis.com




From chq at softlab.nju.edu.cn  Wed Dec 22 02:52:58 1999
From: chq at softlab.nju.edu.cn (Chen Hong Qiang)
Date: Mon Jun  7 17:18:48 2004
Subject: Which class.method() in xml4j.jar(version xml4j_2_0_15) can update File??
Message-ID: <001601bf4c27$eb6714a0$f72477ca@nju.edu.cn>

Hello,
    I am using xml4j.jar to work with an XML file. I can now access and
change the data, but when I want to write the changes back to the XML file
I can't find a method, such as saveXMLfile(String filename, Node rootnode),
to do it. Do you know which method in which class can do this?
        
Thanks.
    
From tpassin at idsonline.com  Wed Dec 22 05:23:35 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: ErrorHandler mods
References: <14431.36842.362320.404738@localhost.localdomain>
Message-ID: <00b801bf4c3d$5f2c0660$b2fbb1cd@tomshp>


----- Original Message ----- 
From: David Megginson <david@megginson.com>

> Following up on a long-ago suggestion from (I think) David Brownell,
> I'm considering adding an extra method, validationError(), to the SAX2
> ErrorHandler.  That way, we'd have the following methods:
> 
>   warning - any kind of non-fatal warning
>   error - an optionally-reportable error of some kind, or an error
>           from outside the XML 1.0 domain (such as a Namespace error
>           that doesn't affect well-formedness)
>   validationError - an error in schema validation
>   fatalError - a well-formedness error
> 
> Does this work?
> 
> 
A yes vote here.
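
For readers following the C++ side of the SAX threads, here is a rough sketch
of how the four proposed callbacks might look as a C++ interface. This is an
assumption for illustration only: the proposal above is about the Java
ErrorHandler, and the std::string parameter and the RecordingHandler class
are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical C++ rendering of the four proposed callbacks.
class ErrorHandler {
public:
    virtual ~ErrorHandler() {}
    virtual void warning(const std::string &message) = 0;          // non-fatal warning
    virtual void error(const std::string &message) = 0;            // optionally-reportable error
    virtual void validationError(const std::string &message) = 0;  // schema validation error
    virtual void fatalError(const std::string &message) = 0;       // well-formedness error
};

// Example handler that simply records which callback fired.
class RecordingHandler : public ErrorHandler {
public:
    void warning(const std::string &m) override { add("warning", m); }
    void error(const std::string &m) override { add("error", m); }
    void validationError(const std::string &m) override { add("validationError", m); }
    void fatalError(const std::string &m) override { add("fatalError", m); }
    const std::vector<std::string> &events() const { return events_; }
private:
    void add(const std::string &kind, const std::string &m) {
        events_.push_back(kind + ": " + m);
    }
    std::vector<std::string> events_;
};

// Usage example.
void demo() {
    RecordingHandler handler;
    handler.validationError("element E not declared");
    assert(handler.events().size() == 1);
    assert(handler.events()[0] == "validationError: element E not declared");
}
```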

Tom Passin




From curta_ontheroad at yahoo.com  Wed Dec 22 05:39:29 1999
From: curta_ontheroad at yahoo.com (Curt Arnold)
Date: Mon Jun  7 17:18:48 2004
Subject: Representing IP addresses in XML Schema
Message-ID: <19991222053855.19989.qmail@web3105.mail.yahoo.com>

Sorry to harp on it, but this is another example of
the general usefulness of support for lists in XML
Schema as I tried to strongly lobby (as much as you
can outside the W3C) earlier this month.  See my
comments on the schema comments archive,
http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0038.html.

Lists would also be useful for DNS names.
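
Until a schema language can express the constraint directly, the check can
be done in application code. The following is a minimal C++ sketch, not a
schema feature: isDottedDecimal is a hypothetical helper, and accepting
leading zeros (up to three digits per field) is an assumption the thread did
not settle.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true if `text` is four decimal fields separated by dots,
// each with one to three digits and a value in the range 0-255.
bool isDottedDecimal(const std::string &text) {
    int fields = 0;
    std::string::size_type pos = 0;
    while (true) {
        int value = 0;
        int digits = 0;
        while (pos < text.size() &&
               std::isdigit(static_cast<unsigned char>(text[pos]))) {
            value = value * 10 + (text[pos] - '0');
            ++digits;
            ++pos;
            if (digits > 3) return false;  // too many digits in one field
        }
        if (digits == 0 || value > 255) return false;
        ++fields;
        if (pos == text.size()) break;     // consumed the whole string
        if (text[pos] != '.') return false;
        ++pos;                             // skip the dot
    }
    return fields == 4;
}

// Usage example.
void demo() {
    assert(isDottedDecimal("127.0.0.1"));
    assert(!isDottedDecimal("999.0.0.1"));
    assert(!isDottedDecimal("ff.ff.ff.ff"));
}
```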

--- Leigh Dodds <ldodds@ingenta.com> wrote:
> > > > However, this is not satisfactory - it allows each field in the IP to
> > > > have values from 000-999.  I want to restrict the possible values to:
> > > > [0-255].[0-255].[0-255].[0-255]
> > >
> > > You can avoid the problem by storing the IP as a hex address, as in:
> > >
> > > ff.ff.ff.ff
> > >
> > Thanks Ray.  An interesting choice of words - "You can avoid the
> > 'problem' ...".  I think that it is a "problem" if such a common, simple
> > datatype cannot be expressed directly, and one is forced to resort to
> > indirect means of expressing what is desired.  Yes?  /Roger
>
> It also becomes trickier if you want to disallow IP addresses such
> as 127.0.0.1, 0.0.0.0, etc.
> 
> L.




From kent.fitch at its.csiro.au  Wed Dec 22 05:40:39 1999
From: kent.fitch at its.csiro.au (Kent Fitch)
Date: Mon Jun  7 17:18:48 2004
Subject: Topic Maps and RDF
References: <00ff01bf46bf$859733c0$420a5398@cbr.its.csiro.au> <199912150818.CAA00892@bruno.techno.com>
Message-ID: <00ba01bf4c3f$206b5960$420a5398@cbr.its.csiro.au>

Thanks Steven and Didier for your responses last week,
and to Didier for his "A bit of synergy this morning"
which seemed to me to provide some good arguments for
blurring RDF and XLink.

I'm looking at this area to try to understand how we
could better structure metadata, or move to more
advanced usage of metadata to support more flexible
linking and browsing on our site (http://www.csiro.au),
which is already heavily driven by metadata contained 
within XML resources.

I've attempted to summarise why I'm interested, and some
of the issues for me at:
http://www.csiro.au/itsb/kent/csiroOnlineMetaDataDirections.html

Feedback is welcome!

Kent Fitch                           Ph: +61 2 6276 6711
ITS  CSIRO  Canberra  Australia      kent.fitch@its.csiro.au




From sb at metis.no  Wed Dec 22 07:58:46 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:48 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: Steve Harris's message of "Tue, 21 Dec 1999 09:32:32 -0800"
References: <EE5F339A2558D311B7360008C73BFD001687BB@exchange1.primus.com>
Message-ID: <whk8m7flub.fsf@viffer.metis.no>

>>>>> Steve Harris <sharris@primus.com>:

> This UTF-8/UTF-16 representation translation seems like a job for
> Standard C++'s <locale>/codecvt facility, not something to be
> embedded in a string class.

Well, maybe... if the C++ standardization committee hadn't dropped
the ball on sizeof(wchar_t)...:-/

I fear that this will make the entire std::wstring stuff unusable for
multiplatform development.
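
To make the portability concern concrete: the C++ standard leaves the size of
wchar_t implementation-defined, and it is commonly 16 bits with MSVC++ on
Win32 and 32 bits with gcc on Linux. The small sketch below only reports what
the compiling platform does. The helper names are hypothetical, and the
surrogate-pair assumption for 16-bit platforms is mine.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>

// Number of bits in one wchar_t on the platform this is compiled for.
std::size_t wcharBits() {
    return sizeof(wchar_t) * CHAR_BIT;
}

// Number of wchar_t units a std::wstring would need for one character
// above U+FFFF, assuming a 16-bit wchar_t holds UTF-16 code units
// (two units, a surrogate pair) and a wider one holds the code point.
std::size_t unitsForSupplementaryChar() {
    return wcharBits() >= 21 ? 1 : 2;
}

// Usage example: the two helpers always agree with each other,
// whatever the platform's wchar_t width happens to be.
void demo() {
    assert(wcharBits() >= 8);
    assert(unitsForSupplementaryChar() == (wcharBits() >= 21 ? 1u : 2u));
}
```

Code that indexes a std::wstring by character would therefore behave
differently on the two kinds of platform for such characters.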

In any case it's not available as an option for me at this point.

My current platforms are gcc/linux and MSVC++/Win32.  gcc doesn't have
locale yet (I am on the mailing list for libstdc++-v3, so there is no
need to inform me of this work).  The MSVC++ Standard C++ Library
support is seriously broken in a multitude of ways, so we're getting
support for the parts of the library that we use, from Objectspace
Standards<ToolKit>, which doesn't support the locale stuff or
templatized streams.



From georg_edelmann at gmx.net  Wed Dec 22 08:52:01 1999
From: georg_edelmann at gmx.net (Georg Edelmann)
Date: Mon Jun  7 17:18:48 2004
Subject: XML Rendering problem
Message-ID: <26405.945852717@www3.gmx.net>

Hi all,

I am not sure if this is the right mailinglist to send this problem to. If
not, it would be great if you could point me to the correct one.

So here is my problem:

The following XSL file does not work with either the IBM or the Sun XML
parser (using either Xalan or Saxon as the XSL renderer):

----------------------------------------- stylesheet start
<xsl:stylesheet
     xmlns:xsl="http://www.w3.org/TR/WD-xsl"
     xmlns="hhtp://www.w3.org/TR/REC-html40"
     result-ns="">
     
<xsl:template match="text()">
</xsl:template>               

<xsl:template match="image">
    <a href="/NASApp/portal/home?tmpl=browse&url=next">
        <xsl:value-of select="imageurl"/>
    </a>
</xsl:template>

</xsl:stylesheet>
----------------------------------------- stylesheet end

The problem lies in the line with the href parameter. The parser
interprets '&url' as an html command and wants to have a trailing ';'. It does not
understand that the '&' separates two parameters in the URL. 
In my opinion that is a serious bug in all the parsers i tested so far.

Georg Edelmann



From Chetan_Sangoram at BMG.Satyam.com  Wed Dec 22 09:29:48 1999
From: Chetan_Sangoram at BMG.Satyam.com (Chetan_Sangoram)
Date: Mon Jun  7 17:18:48 2004
Subject: XML Convertor
Message-ID: <212C7E060072D111809E00805FA67EF2016988AF@BMGNT001>

Hi All,
I am currently looking for a way to convert WordPerfect, MS Word and ASCII
files into XML.

So far I have only been able to find an RTF to XML converter, which uses
OmniMark technology.

Is there anything more general that would handle all (or most) types
of binary and ASCII files?

Is it possible to run these converters from a command-line interface?

There are loads of these for HTML but not for XML.

Thanks In Advance
Chetan


Regards & Thanks
Chetan




From Rajiv.Mordani at eng.sun.com  Wed Dec 22 09:44:56 1999
From: Rajiv.Mordani at eng.sun.com (Rajiv Mordani)
Date: Mon Jun  7 17:18:48 2004
Subject: XML Rendering problem
In-Reply-To: <26405.945852717@www3.gmx.net>
Message-ID: <Pine.SOL.3.96.991222014122.19744B-100000@nine>

& indicates entities.. So if you need to show the & you should put &amp;
in place of the &.
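
Applied to the stylesheet quoted below, the attribute would read:

    <a href="/NASApp/portal/home?tmpl=browse&amp;url=next">

If such URLs are generated by a program, the escaping can be done in code. A
minimal C++ sketch follows. escapeAttribute is a hypothetical helper, and it
assumes its input is raw, unescaped text (it would double-escape a string
that already contains &amp;).

```cpp
#include <cassert>
#include <string>

// Escapes the characters that may not appear literally in a
// double-quoted XML attribute value.
std::string escapeAttribute(const std::string &raw) {
    std::string out;
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
        switch (raw[i]) {
            case '&': out += "&amp;";  break;
            case '<': out += "&lt;";   break;
            case '"': out += "&quot;"; break;
            default:  out += raw[i];   break;
        }
    }
    return out;
}

// Usage example with the URL from this thread.
void demo() {
    assert(escapeAttribute("/NASApp/portal/home?tmpl=browse&url=next") ==
           "/NASApp/portal/home?tmpl=browse&amp;url=next");
}
```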

On Wed, 22 Dec 1999, Georg Edelmann wrote:

> Hi all,
> 
> I am not sure if this is the right mailinglist to send this problem to. If
> not, it would be great if you could point me to the correct one.
> 
> So here is my problem:
> 
> The following XSL file does not work with either the IBM or the Sun XML
> parser (using either Xalan or Saxon as the XSL renderer):
> 
> ----------------------------------------- stylesheet start
> <xsl:stylesheet
>      xmlns:xsl="http://www.w3.org/TR/WD-xsl"
>      xmlns="hhtp://www.w3.org/TR/REC-html40"
>      result-ns="">
>      
> <xsl:template match="text()">
> </xsl:template>               
> 
> <xsl:template match="image">
>     <a href="/NASApp/portal/home?tmpl=browse&url=next">
>         <xsl:value-of select="imageurl"/>
>     </a>
> </xsl:template>
> 
> </xsl:stylesheet>
> ----------------------------------------- stylesheet end
> 
> The problem lies in the line with the href parameter. The parser
> interprets '&url' as an html command and wants to have a trailing ';'. It does not
> understand that the '&' separates two parameters in the URL. 
> In my opinion that is a serious bug in all the parsers i tested so far.
> 
> Georg Edelmann


From heiko.grussbach at crpht.lu  Wed Dec 22 10:11:17 1999
From: heiko.grussbach at crpht.lu (heiko.grussbach@crpht.lu)
Date: Mon Jun  7 17:18:48 2004
Subject: DTD design
Message-ID: <OFD51A8AB2.5799A032-ONC125684F.00373657@crpht.lu>

Hi,

I have the following problem, I want to define an element E that may
contain elements A,B,C. Order should be insignificant and A,B and C are all
optional. Furthermore, A,B and C may each be replaced by X.
The first example was simply like this:

<!ELEMENT E (
     ( A?, B? ) |
     ( B?, A? ) |
     ( A?, X? ) |
     ( B?, X? ) |
     ( X?, A? ) |
     ( X?, B? ) |
     ( X?, X? )
) >

My XML editor (XMetaL 1.2) reports an ambiguous content model error. After
carefully studying the appendix of the XML rec, and with some help
from XMetaL support, I came up with the following:

<!ELEMENT E (
     EMPTY |
     (A,(B|X)?) |
     (B,(A|X)?) |
     (X,(A|B|X)?)
)>

Is this the best solution? What if there are more children, like A, B, C, D
etc.? Wouldn't the DTD just explode?
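
One commonly suggested workaround when "any order, each at most once" gets
too large for a DTD is to declare a looser model, such as (A|B|X)*, and
enforce the rest in application code. The C++ sketch below shows such a
check. The rule it encodes is my reading of the second content model above
(each named element at most once, X standing in for any of them, never more
children than named elements), and childrenAllowed is a hypothetical helper.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Checks a list of child element names against the intended rule:
// each slot name (e.g. A, B) appears at most once, in any order,
// X may stand in for any slot, and there are never more children
// than slots.
bool childrenAllowed(const std::vector<std::string> &children,
                     const std::set<std::string> &slots) {
    if (children.size() > slots.size()) return false;
    std::set<std::string> seen;
    for (std::vector<std::string>::size_type i = 0; i < children.size(); ++i) {
        const std::string &name = children[i];
        if (name == "X") continue;                    // placeholder for any slot
        if (slots.count(name) == 0) return false;     // unknown element
        if (!seen.insert(name).second) return false;  // slot used twice
    }
    return true;
}

// Usage example with the two slots A and B.
void demo() {
    std::set<std::string> slots;
    slots.insert("A");
    slots.insert("B");
    std::vector<std::string> children;
    assert(childrenAllowed(children, slots));   // empty content
    children.push_back("B");
    children.push_back("A");
    assert(childrenAllowed(children, slots));   // B, A
    children.push_back("X");
    assert(!childrenAllowed(children, slots));  // three children, two slots
}
```

Adding a further element such as C or D then only means adding a name to the
set, rather than enumerating more alternatives in the content model.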

Regards

Heiko Grussbach




From pvd_2000 at yahoo.fr  Wed Dec 22 10:38:17 1999
From: pvd_2000 at yahoo.fr (Vincent PHAM)
Date: Mon Jun  7 17:18:48 2004
Subject: DTD design
Message-ID: <19991222103813.25010.qmail@web702.mail.yahoo.com>

 Does it match your need ?

 <!ELEMENT E ( EMPTY | ((A|B|C|X)?, (A|B|C|X)?) )>

  Vincent PHAM.

--- heiko.grussbach@crpht.lu wrote:
> Hi,
> 
> I have the following problem, I want to define an element E that may
> contain elements A,B,C. Order should be insignificant and A,B and C are all
> optional. Furthermore, A,B and C may each be replaced by X.
> The first example was simply like this:
> 
> <!ELEMENT E (
>      ( A?, B? ) |
>      ( B?, A? ) |
>      ( A?, X? ) |
>      ( B?, X? ) |
>      ( X?, A? ) |
>      ( X?, B? ) |
>      ( X?, X? )
> ) >
> 
> My XML-editor (XMetal 1.2) reports an ambiguous content model error. After
> carefully studying the appendix of the XML rec, and with some help
> by the support of XMetaL , I came up with the following:
> 
> <!ELEMENT E (
>      EMPTY |
>      (A,(B|X)?) |
>      (B,(A|X)?) |
>      (X,(A|B|X)?)
> )>
> 
> Is this the best solution? What if there are more children, like A, B, C, D, etc.?
> Wouldn't the DTD just explode?
> 
> Regards
> 
> Heiko Grussbach




From Sajeev_M1 at verifone.com  Wed Dec 22 10:37:40 1999
From: Sajeev_M1 at verifone.com (Sajeev M.)
Date: Mon Jun  7 17:18:48 2004
Subject: DTD design
Message-ID: <F9FBA0D1187BD11188B200A0C9979DF902DB56BA@blrmail.india.hp.com>

Hi,

It can be

        <!ELEMENT E (EMPTY | ((A|B|X)?, (A|B|X)?))>

since you wanted A, B and X to be optional.
> ----------
> From: 	heiko.grussbach@crpht.lu[SMTP:heiko.grussbach@crpht.lu]
> Reply To: 	heiko.grussbach@crpht.lu
> Sent: 	Wednesday, December 22, 1999 3:41 PM
> To: 	xml-dev@ic.ac.uk
> Subject: 	DTD design
> 
> Hi,
> 
> I have the following problem, I want to define an element E that may
> contain elements A,B,C. Order should be insignificant and A,B and C are all
> optional. Furthermore, A,B and C may each be replaced by X.
> The first example was simply like this:
> 
> <!ELEMENT E (
>      ( A?, B? ) |
>      ( B?, A? ) |
>      ( A?, X? ) |
>      ( B?, X? ) |
>      ( X?, A? ) |
>      ( X?, B? ) |
>      ( X?, X? )
> ) >
> 
> My XML-editor (XMetal 1.2) reports an ambiguous content model error. After
> carefully studying the appendix of the XML rec, and with some help
> by the support of XMetaL , I came up with the following:
> 
> <!ELEMENT E (EMPTY |(A,(B|X)?) |(B,(A|X)?) |(X,(A|B|X)?)
> )>
> 
> Is this the best solution? What if there are more children, like A, B, C, D, etc.?
> Wouldn't the DTD just explode?
> 
> Regards
> 
> Heiko Grussbach
> 
> 



From hendy at tedopres.nl  Wed Dec 22 11:07:46 1999
From: hendy at tedopres.nl (hendy)
Date: Mon Jun  7 17:18:48 2004
Subject: XSL- XML
Message-ID: <010201bf4cb8$9cb0d030$1900000a@intag.tedopres.nl>


Hi,
I have a problem with an XSL file.
My XML file contains these entity references: &ldquo; &rdquo; &iuml; &amp;
Does anyone know how to write an XSL stylesheet for this XML file so
that IE 5.0 displays &ldquo; &rdquo; &iuml; &amp; correctly?


 Hendy
From costello at mitre.org  Wed Dec 22 12:33:34 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:48 2004
Subject: Solution to Representing IP addresses in XML Schema
Message-ID: <3860C56F.819B6D00@mitre.org>

Hi Folks,

Below is the solution to creating an XML Schema datatype to represent IP
numbers.  Thanks to all those who responded, especially Robert Hanson
who came up with the solution.  /Roger

<datatype name="IP" source="string">
   <pattern value="((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}
                    (1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])">
      <annotation>
         <info>
             Datatype for representing IP addresses.  Examples:
                129.83.64.255, 64.128.2.71, etc.
             This datatype restricts each field of the IP address
             to a value between 0 and 255, i.e.,
                [0-255].[0-255].[0-255].[0-255]
             Note: in the value attribute (above) the regular
             expression has been split over two lines.  This is
             for readability only.  In practice the R.E.
             would all be on one line.
         </info>
      </annotation>
   </pattern>
</datatype>
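
The pattern can be tried out with any regular-expression engine before it goes into a schema. Below is a minimal Java sketch using java.util.regex; the class name IpPattern is invented for the example. It writes the separator as an escaped dot so that only a literal period matches (an unescaped dot matches any character), and it relies on matches() to test the whole string, which mirrors the implicit anchoring of schema patterns.

```java
import java.util.regex.Pattern;

class IpPattern {
    // One field of the address: 0-199 (optionally without leading digits), 200-249, or 250-255.
    private static final String OCTET = "(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])";

    // Three "octet plus literal dot" groups followed by a final octet.
    private static final Pattern IP = Pattern.compile("(" + OCTET + "\\.){3}" + OCTET);

    /** Returns true if the whole string is a dotted quad with fields in 0..255. */
    static boolean isValid(String candidate) {
        return IP.matcher(candidate).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"129.83.64.255", "64.128.2.71", "256.1.1.1", "1.2.3", "1x2x3x4"};
        for (String sample : samples) {
            System.out.println(sample + " -> " + isValid(sample));
        }
    }
}
```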




From david at megginson.com  Wed Dec 22 14:36:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: DeclHandler
Message-ID: <14432.57765.516268.430263@localhost.localdomain>

Here's the DeclHandler that we designed for SAX2alpha, with
IOException replacing SAXException in the throws clauses:

  public interface DeclHandler
  {
    public void elementDecl (String name, String model) throws IOException;
    public void attributeDecl (String eName, String name, String type,
			       String valueDefault, String value)
      throws IOException;
    public void internalEntityDecl (String name, String value)
      throws IOException;

    public void externalEntityDecl (String name, String publicId,
				    String systemId)
      throws IOException;
  }

Notes:

1. Unparsed entity and notation declarations are reported by the (now
   confusingly-named) DTDHandler.  The distinction is that the XML 1.0 
   REC requires parsers to report unparsed-entity and notation
   declarations, but not other DTD-based declarations.

2. The model argument in elementDecl is a normalized string
   representation of a content model.  It's not ideal, but everyone
   agreed last time that it was workable.
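
To make the model argument concrete, here is a minimal Java sketch that collects the content-model strings a parser reports. It is written against the interface as it later shipped in SAX2 (org.xml.sax.ext.DeclHandler in the JDK), not against the draft above: there the callbacks declare SAXException, and the handler is registered through the http://xml.org/sax/properties/declaration-handler property. The class name ElementModels is invented for the example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;

class ElementModels {
    /** Maps each declared element name to the model string the parser reports. */
    static Map<String, String> collect(String xml) {
        final Map<String, String> models = new LinkedHashMap<>();
        DeclHandler declarations = new DeclHandler() {
            public void elementDecl(String name, String model) {
                models.put(name, model);
            }
            public void attributeDecl(String eName, String aName, String type,
                                      String mode, String value) { }
            public void internalEntityDecl(String name, String value) { }
            public void externalEntityDecl(String name, String publicId, String systemId) { }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            // Optional handlers are registered as properties in the released SAX2 API.
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", declarations);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not read declarations", e);
        }
        return models;
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE E [<!ELEMENT E (A,B?)><!ELEMENT A EMPTY>"
                   + "<!ELEMENT B EMPTY>]><E><A/></E>";
        collect(doc).forEach((name, model) -> System.out.println(name + " : " + model));
    }
}
```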

This interface seems hopelessly anachronistic, and I'm not willing to
invest too much time in it -- after all, while DTDs are useful in
themselves, the declarations should hardly form part of downstream
processing -- but enough people want it that it's useful to include it
as an optional feature.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Dec 22 14:31:43 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: LexicalHandler
Message-ID: <14432.57481.527212.886471@localhost.localdomain>

Just a quick review -- here's the LexicalHandler from SAX2 alpha,
modified so that the callbacks throw IOException rather than
SAXException:

  public interface LexicalHandler
  {
    public void startDTD (String name, String publicId, String systemId)
      throws IOException;
    public void endDTD () throws IOException;

    public void startEntity (String name) throws IOException;
    public void endEntity (String name) throws IOException;

    public void startCDATA () throws IOException;
    public void endCDATA () throws IOException;

    public void comment (char ch[], int start, int length)
      throws IOException;
  }

Notes:

1. start/endDTD refer to the start and end of the DOCTYPE declaration, 
   not to the start and end of the external DTD subset.  The
   parameters are those given in the DOCTYPE declaration (if present).

2. The start/endEntity callbacks provide the entity name, prefixed
   with '%' if it is a parameter entity, or the special names
   [document] for the document entity or [dtd] for the external DTD
   subset.
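
For readers who want to see these callbacks fire, the following minimal Java sketch records them for a small document. It uses the interface as later released (org.xml.sax.ext.LexicalHandler in the JDK), where the callbacks declare SAXException rather than IOException and the handler is set through the http://xml.org/sax/properties/lexical-handler property. The class name LexicalEvents is invented for the example, and the entity callbacks are deliberately left empty.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.LexicalHandler;
import org.xml.sax.helpers.DefaultHandler;

class LexicalEvents {
    /** Parses the document and returns the lexical events in the order they were reported. */
    static List<String> record(String xml) {
        final List<String> events = new ArrayList<>();
        LexicalHandler lexical = new LexicalHandler() {
            public void startDTD(String name, String publicId, String systemId) {
                events.add("startDTD " + name);
            }
            public void endDTD() { events.add("endDTD"); }
            public void startEntity(String name) { }   // not recorded in this sketch
            public void endEntity(String name) { }     // not recorded in this sketch
            public void startCDATA() { events.add("startCDATA"); }
            public void endCDATA() { events.add("endCDATA"); }
            public void comment(char[] ch, int start, int length) {
                events.add("comment " + new String(ch, start, length).trim());
            }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty("http://xml.org/sax/properties/lexical-handler", lexical);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return events;
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE doc [<!-- in DTD --><!ELEMENT doc (#PCDATA)>]>"
                   + "<doc><!-- in body --><![CDATA[x < y]]></doc>";
        record(doc).forEach(System.out::println);
    }
}
```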

During the original design a few months back, we decided not to
provide the public and system identifiers in start/endEntity, because
that information would be available through the DeclHandler (which
I'll include in a following message).  Since both the LexicalHandler
and the DeclHandler are optional, however, I wonder if a little
redundancy would make sense:

    public void startEntity (String name, String publicId,
                             String systemId) throws IOException;
    public void endEntity (String name) throws IOException;

That way, if the parser supports the LexicalHandler but not the
DeclHandler, the public and system identifiers for entities will still 
be available.

Comments?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From DuCharmR at moodys.com  Wed Dec 22 14:41:01 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:18:48 2004
Subject: XML Convertor
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2782B2@MDYNYCMSX1>

>I am currently looking out for converting Word Perfect, MS Word and ASCII
>files into XML. 
>So Far I was just able to find out only RTF to XML convertor, which uses
>omnimark technology.

Converting something to XML means converting it to a text file in which
start and end tags show the beginning and end of structural elements (and,
maybe storing certain pieces of information as attributes in the
start-tags). There has to be some way for the converter to identify the
beginning and end of these structural elements. Rick Geimer's Omnimark-based
rtf2xml (see http://www.omnimark.com/develop/contributed/) does this by
looking at RTF codes.

A program that reads proprietary binary formats (WordPerfect or MS Word) and
does this would be difficult enough that no one I know of has bothered--they
just save as RTF and either write something customized to convert that RTF
to their own DTD or use Rick's program and then convert its output to their
own DTD. WordPerfect and Word 2000 have some XML-related features, so you
might want to look at those. 

To convert an ASCII file to XML, you could put "<myDocument>" at the
beginning and "</myDocument>" at the end, but this wouldn't do you much
good. To put additional tags in places where they would be useful requires a
program that knows what to look for. People often use perl, python, awk,
etc. to write scripts that look for patterns in their input that give them
clues as to which tags should go where.
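
As an illustration of that kind of pattern-driven script, here is a minimal Java sketch. The conventions it looks for are assumptions made up for the example, not a general rule: blocks separated by blank lines become para elements, and a block written entirely in capitals becomes a title. The class name TextToXml and the element names title and para are hypothetical.

```java
class TextToXml {
    /** Escapes the characters that are markup-significant in element content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Wraps blank-line-separated blocks of plain text in title or para elements. */
    static String convert(String text) {
        StringBuilder out = new StringBuilder("<myDocument>\n");
        for (String block : text.trim().split("\\n\\s*\\n")) {
            // Join wrapped lines of one block into a single line.
            String body = block.trim().replaceAll("\\s+", " ");
            if (body.isEmpty()) {
                continue;
            }
            // Clue used here: an all-capitals block is treated as a heading.
            boolean heading = body.equals(body.toUpperCase()) && body.matches(".*[A-Z].*");
            String tag = heading ? "title" : "para";
            out.append("  <").append(tag).append('>')
               .append(escape(body))
               .append("</").append(tag).append(">\n");
        }
        return out.append("</myDocument>\n").toString();
    }

    public static void main(String[] args) {
        System.out.print(convert("INTRODUCTION\n\nTags & text\nwrap here.\n"));
    }
}
```

Real input rarely follows one convention consistently, so a script like this usually needs a rule per structural clue and a manual check of the output.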

>Is there anything generalised which would take care of all (or most) types
>of Binary & ASCII files.

To find and identify the structure of the input, the processing program has
to know its structure intimately, so a generalized program that takes care
of all types of binary and ASCII files is impossible. Having spent too much
time studying RTF, I applaud Rick for studying it even harder so that others
wouldn't have to. It would be difficult to do any better.

Bob DuCharme       www.snee.com/bob       <bob@  
snee.com>  see www.snee.com/bob/xmlann for "XML:
The Annotated Specification" from Prentice Hall.



From david at megginson.com  Wed Dec 22 15:02:31 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: Renaming DTDHandler
Message-ID: <14432.59332.741656.299651@localhost.localdomain>

In SAX 1.0, the DTDHandler reports only those declarations that the
XML 1.0 REC requires to be reported: notation declarations and
unparsed entity declarations.

In SAX2, we will be providing optional support for other types of DTD
declarations in the DeclHandler.  Unfortunately, that creates a lot of 
opportunity for confusion, because of the way DTDHandler is named.

Since we're implementing SAX2 in a new package anyway, I'm interested
in hearing suggestions for renaming DTDHandler so that the name more
accurately describes its purpose (reporting notation and unparsed
entity declarations).  Here are some ideas off the top of my head:

DataHandler
DataTypeHandler
  Since unparsed entities and notations both have to do with non-XML
  data, this name is technically accurate.  Unfortunately, it will
  collide in the future with any support for the different data-type
  approach in XML Schemas.

NonXMLHandler
  Klunky, but usable.

BinaryHandler
  A little more confusing.

I know that the perfect name's out there somewhere.  Personally, I'd
make these declarations optional and dump them into DeclHandler with
the rest if I could, but the XML 1.0 REC is quite explicit on this
point.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Dec 22 15:12:08 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: NamespaceDeclHandler
Message-ID: <14432.59910.577845.823030@localhost.localdomain>

Here is my current draft of the NamespaceDeclHandler for SAX2.  Note
that the purpose of this handler is to let the application know what
Namespace prefix mappings are currently in scope, *not* to tell
anything about a specific name.

The purpose of this handler is to allow prefix-qualified names to be
used in attribute values and character data.  The Namespaces REC makes 
no explicit provision for this sort of usage, but as others have
pointed out, it is required for several other specs.

  public interface NamespaceDeclHandler
  {
    public void startNamespaceDeclScope (String prefix, String uri)
      throws IOException;

    public void endNamespaceDeclScope (String prefix)
      throws IOException;
  }
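
A short sketch of the intended use case, resolving a prefixed name that appears in an attribute value. It is written against the API as later released, where these two callbacks ended up on ContentHandler as startPrefixMapping and endPrefixMapping rather than on a separate handler. The class name QNameInContent, the attribute name "type" and the sample namespace URI are invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class QNameInContent {
    /** Returns the first "type" attribute value expanded to {namespace-uri}local-name. */
    static String resolveTypeAttribute(String xml) {
        // One stack of URIs per prefix, so inner declarations can shadow outer ones.
        final Map<String, Deque<String>> scopes = new HashMap<>();
        final String[] result = new String[1];
        DefaultHandler handler = new DefaultHandler() {
            @Override public void startPrefixMapping(String prefix, String uri) {
                scopes.computeIfAbsent(prefix, k -> new ArrayDeque<>()).push(uri);
            }
            @Override public void endPrefixMapping(String prefix) {
                scopes.get(prefix).pop();
            }
            @Override public void startElement(String uri, String localName,
                                               String qName, Attributes atts) {
                String value = atts.getValue("", "type");
                if (value == null || result[0] != null) {
                    return;
                }
                int colon = value.indexOf(':');
                String prefix = colon < 0 ? "" : value.substring(0, colon);
                Deque<String> stack = scopes.get(prefix);
                String namespace = (stack == null || stack.isEmpty()) ? "" : stack.peek();
                result[0] = "{" + namespace + "}" + value.substring(colon + 1);
            }
        };
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);   // prefix-mapping events need Namespace processing
            factory.newSAXParser().parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse document", e);
        }
        return result[0];
    }

    public static void main(String[] args) {
        String doc = "<doc xmlns:x='urn:example:x'><item type='x:thing'/></doc>";
        System.out.println(resolveTypeAttribute(doc));
    }
}
```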

There are two important questions here:

1. Is this adequate for its goal?

2. Should support for this handler be required (assuming that
   Namespace processing is required)?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Dec 22 15:21:20 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: Parser interface
Message-ID: <14432.60462.189392.681294@localhost.localdomain>

Here is my current draft of the SAX2 Parser interface:

  public interface Parser
  {
    public void setLocale (Locale locale)
      throws SAXNotSupportedException;

    public void setFeature (String feature, boolean state)
      throws SAXNotRecognizedException, SAXNotSupportedException;
    public boolean getFeature (String feature)
      throws SAXNotRecognizedException;

    public void setProperty (String property, Object value)
      throws SAXNotRecognizedException, SAXNotSupportedException;
    public Object getProperty (String property)  
      throws SAXNotRecognizedException;

    public void setEntityResolver (EntityResolver resolver);
    public void setDTDHandler (DTDHandler handler);
    public void setNamespaceDeclHandler (NamespaceDeclHandler handler);
    public void setDocumentHandler (DocumentHandler handler);
    public void setErrorHandler (ErrorHandler handler);

    public void setLexicalHandler (LexicalHandler handler)
      throws SAXNotSupportedException;
    public void setDeclHandler (DeclHandler handler)
      throws SAXNotSupportedException;

    public void parse (String systemId)
      throws IOException;

    public void parse (InputSource input)
      throws IOException;  
  }


Notes:

1. LexicalHandler and DeclHandler now have explicit setters, but the
   parser may throw a SAXNotSupportedException if it does not support
   them.

2. Extended handler types (for schemas, or what-have-you) can be set
   using the setProperty method with the appropriate URI identifier.

3. The first arguments to get/setFeature and get/setProperty are
   fully-qualified URIs (to be included in a future message).


Questions:

1. Should there be getters as well as setters for the handlers and the
   locale?

2. Should the parser be allowed to throw SAXNotSupportedException for
   NamespaceDeclHandler as well?

3. Should we just use setProperty to set the optional handlers?

4. Should we explicitly allow the systemId argument to parse() to be a 
   relative URI?  If so, what should it be relative to?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From kyeefung at extend.com  Wed Dec 22 15:31:26 1999
From: kyeefung at extend.com (Khun Yee Fung)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: DeclHandler
Message-ID: <E09B8717558DD211BF0300609793FBB0018F29DD@MAILSERVER>

I have a question. Right now, the Xerces SAX implementation calls the
comment() method when a comment is encountered in a DTD. Is this the
intended behaviour?

As to whether element and attribute declarations are useful for downstream
processing: I did find a use. In XPath, there is a function called 'id()'
that returns the node with a given ID. Without access to the DTD,
it is quite difficult to find out which attribute is the ID of an
element.
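
A minimal Java sketch of that use: it listens for attribute declarations and remembers which attribute of each element is typed ID. It is written against the interface as later released (org.xml.sax.ext.DeclHandler in the JDK, registered through the declaration-handler property), not the draft quoted below. The class name IdAttributes and the sample document are invented for the example.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;

class IdAttributes {
    /** Maps element name to the name of its ID-typed attribute, as declared in the DTD. */
    static Map<String, String> find(String xml) {
        final Map<String, String> ids = new LinkedHashMap<>();
        DeclHandler declarations = new DeclHandler() {
            public void attributeDecl(String eName, String aName, String type,
                                      String mode, String value) {
                if ("ID".equals(type)) {
                    ids.put(eName, aName);
                }
            }
            public void elementDecl(String name, String model) { }
            public void internalEntityDecl(String name, String value) { }
            public void externalEntityDecl(String name, String publicId, String systemId) { }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setProperty("http://xml.org/sax/properties/declaration-handler", declarations);
            reader.setContentHandler(new DefaultHandler());
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("could not read declarations", e);
        }
        return ids;
    }

    public static void main(String[] args) {
        String doc = "<!DOCTYPE list [<!ELEMENT list (item*)><!ELEMENT item (#PCDATA)>"
                   + "<!ATTLIST item key ID #REQUIRED note CDATA #IMPLIED>]>"
                   + "<list><item key='a1'>x</item></list>";
        System.out.println(find(doc));
    }
}
```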

Regards,
Khun Yee Fung

> -----Original Message-----
> From:    David Megginson [mailto:david@megginson.com]
> Sent:    Wednesday, December 22, 1999 9:35 AM
> To:      XMLDev list
> Subject: SAX2: DeclHandler
>
> Here's the DeclHandler that we designed for SAX2alpha, with
> IOException replacing SAXException in the throws clauses:
>
>   public interface DeclHandler
>   {
>     public void elementDecl (String name, String model) throws IOException;
>     public void attributeDecl (String eName, String name, String type,
>                                String valueDefault, String value)
>       throws IOException;
>     public void internalEntityDecl (String name, String value)
>       throws IOException;
>
>     public void externalEntityDecl (String name, String publicId,
>                                     String systemId)
>       throws IOException;
>   }
>
> Notes:
>
> 1. Unparsed entity and notation declarations are reported by the (now
>    confusingly-named) DTDHandler.  The distinction is that the XML 1.0
>    REC requires parsers to report unparsed-entity and notation
>    declarations, but not other DTD-based declarations.
>
> 2. The model argument in elementDecl is a normalized string
>    representation of a content model.  It's not ideal, but everyone
>    agreed last time that it was workable.
>
> This interface seems hopelessly anachronistic, and I'm not willing to
> invest too much time in it -- after all, while DTDs are useful in
> themselves, the declarations should hardly form part of downstream
> processing -- but enough people want it that it's useful to include it
> as an optional feature.
>
>
> All the best,
>
>
> David
>
> --
> David Megginson                 david@megginson.com
>            http://www.megginson.com/





From david at megginson.com  Wed Dec 22 15:38:41 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: Features and Properties
Message-ID: <14432.61502.356079.908984@localhost.localdomain>

Here is my current draft of core SAX2 features and properties.

Notes:

1. Implementors are free to define their own features and properties;
   these are meant to serve only as a common core.
2. Parsers can throw exceptions for unrecognized features or properties
   and for unsupported states; the only feature states that must be
   supported are 'true' for namespace processing and 'false' for
   namespace prefixes.


Features
--------

I've removed http://xml.org/sax/features/normalize-text, because its
interaction with optional handler types is too tricky.  I've added
http://xml.org/sax/features/namespace-prefixes to preserve the
original prefixes on the local parts of names if desired.

http://xml.org/sax/features/validation
  Default value: unknown
  true: validate the document
  false: do not validate the document

http://xml.org/sax/features/external-general-entities
  Default value: unknown
  true: include external text entities
  false: do not include external text entities

http://xml.org/sax/features/external-parameter-entities
  Default value: unknown
  true: include external parameter entities
  false: do not include external parameter entities

http://xml.org/sax/features/namespaces
  Default value: true (*must* be supported)
  true: perform Namespace processing
  false: do not perform Namespace processing

http://xml.org/sax/features/namespace-prefixes
  Default value: false (*must* be supported)
  true: leave prefixes attached to the local parts of names
  false: do not leave prefixes attached to the local parts of names
  Note: will have no effect unless the 'namespaces' feature is true.

http://xml.org/sax/features/use-locator
  Default value: unknown
  true: always provide a Locator
  false: it's OK not to provide a Locator (but the parser still may)
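
Because most defaults are listed as unknown, an application has to ask the parser what it supports. The following minimal Java sketch does that with the released SAX2 API, where XMLReader took the place of this draft's Parser interface and getFeature may throw either SAXNotRecognizedException or SAXNotSupportedException. The class name FeatureProbe and the example.invalid feature URI are invented for the illustration.

```java
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

class FeatureProbe {
    static final String NAMESPACES = "http://xml.org/sax/features/namespaces";
    static final String NAMESPACE_PREFIXES = "http://xml.org/sax/features/namespace-prefixes";

    /** Creates a reader from the JDK's default SAX parser factory. */
    static XMLReader newReader() {
        try {
            return SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        } catch (Exception e) {
            throw new IllegalStateException("no SAX parser available", e);
        }
    }

    /** Tries to set a feature; returns false if the reader rejects the request. */
    static boolean trySet(XMLReader reader, String feature, boolean state) {
        try {
            reader.setFeature(feature, state);
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    /** Returns "true", "false", "not recognized" or "not supported" for a feature URI. */
    static String probe(XMLReader reader, String feature) {
        try {
            return String.valueOf(reader.getFeature(feature));
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    public static void main(String[] args) {
        XMLReader reader = newReader();
        trySet(reader, NAMESPACES, true);
        trySet(reader, NAMESPACE_PREFIXES, false);
        System.out.println("namespaces: " + probe(reader, NAMESPACES));
        System.out.println("namespace-prefixes: " + probe(reader, NAMESPACE_PREFIXES));
        System.out.println("unknown: " + probe(reader, "http://example.invalid/no-such-feature"));
    }
}
```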


Properties
----------

I've removed http://xml.org/sax/properties/namespace-sep, since
Namespace-qualified names are no longer reported as a single string.
I've removed the properties for LexicalHandler, DeclHandler, and
NamespaceDeclHandler because they have their own, explicit setters
now.


http://xml.org/sax/properties/dom-node
  Read-only.  Valid only during a callback (null otherwise).
  The DOM node currently being visited if SAX is being used as a DOM
  iterator and is visiting a DOM node.

http://xml.org/sax/properties/xml-string
  Read-only.  Valid only during a callback (null otherwise).
  The string of characters associated with the current event.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From abisheks at india.hp.com  Wed Dec 22 15:44:21 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun  7 17:18:48 2004
Subject: XML Indexing Engine
Message-ID: <00bd01bf4c93$4d048ab0$252f0a0f@india.hp.com>

Hi,

Can anyone tell me what an XML indexing engine is?

regards,
Abhishek.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    _/               Abhishek Srivastava
   _/                Hewlett Packard ISO
  _/_/_/   _/_/_/    -------------------
 _/    /   _/   _/     (Work)   +91-80-2251554 x1190
_/  _/   _/_/_/      (Ip)     15.10.47.37
        _/           (Url)    http://sites.netscape.net/abhishes/index.html
       _/
                     Work like you don't need the money.
                     Dance like no one is watching.
                     And love like you've never been hurt.
                     --Mark Twain

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~




From ldodds at ingenta.com  Wed Dec 22 15:43:54 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: Parser interface
In-Reply-To: <14432.60462.189392.681294@localhost.localdomain>
Message-ID: <001101bf4c93$5ac61e20$ab20268a@pc-lrd.bath.ac.uk>

>     public void setProperty (String property, Object value)
>       throws SAXNotRecognizedException, SAXNotSupportedException;
>     public Object getProperty (String property)
>       throws SAXNotRecognizedException;
<snip!>
> 2. Extended handler types (for schemas, or what-have-you) can be set
>    using the setProperty method with the appropriate URI identifier.
<snip!>
> 3. Should we just use setProperty to set the optional handlers?

Couldn't there be a SAX2 Handler interface

public interface Handler
{
}

which marks a SAX2 Handler? The other handler interfaces would be
subinterfaces of this, giving a generic:

public void setHandler(Handler handler, String identifier)
	throws SAXNotRecognizedException, SAXNotSupportedException;

- identifiers for the standard handlers can be provided (e.g. as
public final constants)

- additional extended handlers use the same method to register, only
using the appropriate URI identifier as you suggest.

IMHO this is a cleaner way of registering handlers. I don't have to
remember that one Handler is registered explicitly (setDTDHandler) and
another through a setProperty call. A Handler isNotA property.
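
To make the proposal concrete, here is a minimal Java sketch of a marker interface plus a single registration method. Nothing in it is part of SAX: the names CommentHandler and HandlerRegistry, the identifier URI and the trySetHandler helper are all hypothetical, and only the two exception classes come from the real org.xml.sax package.

```java
import java.util.HashMap;
import java.util.Map;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

/** Marker interface: every handler type would extend this. */
interface Handler { }

/** Hypothetical handler type, used only for this illustration. */
interface CommentHandler extends Handler {
    void comment(String text);
}

class HandlerRegistry {
    /** Hypothetical identifier; the proposal does not define any. */
    static final String COMMENT_HANDLER = "http://example.org/sax/handlers/comment";

    private final Map<String, Class<? extends Handler>> supported = new HashMap<>();
    private final Map<String, Handler> installed = new HashMap<>();

    HandlerRegistry() {
        supported.put(COMMENT_HANDLER, CommentHandler.class);
    }

    /** Generic registration: one method for core and extended handlers alike. */
    public void setHandler(Handler handler, String identifier)
            throws SAXNotRecognizedException, SAXNotSupportedException {
        Class<? extends Handler> type = supported.get(identifier);
        if (type == null) {
            throw new SAXNotRecognizedException(identifier);
        }
        if (!type.isInstance(handler)) {
            throw new SAXNotSupportedException("wrong handler type for " + identifier);
        }
        installed.put(identifier, handler);
    }

    /** Convenience wrapper that reports the outcome as a string instead of throwing. */
    String trySetHandler(Handler handler, String identifier) {
        try {
            setHandler(handler, identifier);
            return "ok";
        } catch (SAXNotRecognizedException e) {
            return "not recognized";
        } catch (SAXNotSupportedException e) {
            return "not supported";
        }
    }

    public static void main(String[] args) {
        HandlerRegistry registry = new HandlerRegistry();
        CommentHandler comments = text -> System.out.println("comment: " + text);
        System.out.println(registry.trySetHandler(comments, COMMENT_HANDLER));
        System.out.println(registry.trySetHandler(comments, "urn:example:unknown"));
    }
}
```

One trade-off the sketch makes visible: the compiler can no longer check that the handler matches the identifier, so a mismatch only shows up at run time.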

Cheers,

L.




From david-b at pacbell.net  Wed Dec 22 15:48:17 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:48 2004
Subject: SAX2: DeclHandler
References: <14432.57765.516268.430263@localhost.localdomain>
Message-ID: <3860F2B5.87C0155F@pacbell.net>

David Megginson wrote:
> 
> Here's the DeclHandler that we designed for SAX2alpha, with
> IOException replacing SAXException in the throws clauses:

I'm still not keen on having IOException there.

However, there are two other declarations that should show
up in a DeclHandler:

	- The root element name.

	- A flag saying whether it's standalone.

If those show up, then I think it'll be possible to use the
DeclHandler (and to-be-renamed DTDHandler) to provide a
cleanly layered XML validation module.  Else ...

- Dave



From david at megginson.com  Wed Dec 22 15:55:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: Parser interface
In-Reply-To: <001101bf4c93$5ac61e20$ab20268a@pc-lrd.bath.ac.uk>
References: <14432.60462.189392.681294@localhost.localdomain>
	<001101bf4c93$5ac61e20$ab20268a@pc-lrd.bath.ac.uk>
Message-ID: <14432.62515.162817.957791@localhost.localdomain>

Leigh Dodds writes:

 > Couldn't there be a SAX2 Handler interface
 > 
 > public interface Handler
 > {
 > }
 > 
 > which marks a SAX2 Handler. The other handler interfaces are subclasses
 > of this.

I proposed this originally last spring and it was roundly shot down.
If people are more comfortable with the idea, I'd be happy to add it
back in.


 > Giving a generic:
 > 
 > public void setHandler(Handler handler, String identifier)
 > 	throws SAXNotRecognizedException, SAXNotSupportedException;
 > 
 > - identifiers for the standard handlers can be provided (e.g. as
 > public final constants)
 > 
 > - additional extended handlers use the same method to register, only
 > using the appropriate URI identifier as you suggest.
 > 
 > IMHO this is a cleaner way of registering handlers. I don't have to
 > remember that one Handler is registered explicitly (setDTDHandler) and
 > another through
 > a setProperty call. A Handler isNotA property.

Hmm -- I don't know about this part.  I think that it's still
convenient to have explicit setters for the core handler types.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Dec 22 15:56:40 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: suggestion for renaming DTDHandler
Message-ID: <14432.62581.903179.533162@localhost.localdomain>

Forwarded with permission -- I like the suggestion.


-------------- next part --------------
An embedded message was scrubbed...
From: Leigh Dodds <ldodds@ingenta.com>
Subject: RE: SAX2: Renaming DTDHandler
Date: Wed, 22 Dec 1999 15:28:33 -0000
Size: 1842
Url: http://mailman.ic.ac.uk/pipermail/xml-dev/attachments/19991222/7a2ddbdf/attachment.eml
-------------- next part --------------


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/
From david at megginson.com  Wed Dec 22 15:58:51 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: DeclHandler
In-Reply-To: <3860F2B5.87C0155F@pacbell.net>
References: <14432.57765.516268.430263@localhost.localdomain>
	<3860F2B5.87C0155F@pacbell.net>
Message-ID: <14432.62712.408140.13233@localhost.localdomain>

David Brownell writes:
 > David Megginson wrote:
 > > 
 > > Here's the DeclHandler that we designed for SAX2alpha, with
 > > IOException replacing SAXException in the throws clauses:
 > 
 > I'm still not keen on having IOException there.
 > 
 > However, there are two other declarations that should show
 > up in a DeclHandler:
 > 
 > 	- The root element name.

You can get that from the LexicalHandler start/endDTD methods already.
Do you imagine validation modules will require DeclHandler but not
LexicalHandler?
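
A minimal sketch of picking the declared root element name out of the
startDTD callback. It assumes the LexicalHandler interface as it later
shipped in org.xml.sax.ext (not the alpha draft quoted in this thread) and
the parser bundled with the JDK; the class name RootNameSniffer and the
sample document are made up for illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: remembers the name given in <!DOCTYPE name ...>.
class RootNameSniffer extends DefaultHandler2 {
    String declaredRoot;  // stays null when the document has no DOCTYPE

    @Override
    public void startDTD(String name, String publicId, String systemId) {
        declaredRoot = name;
    }

    // Parses the XML text and returns the declared root name, or null.
    static String declaredRootOf(String xml) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            RootNameSniffer sniffer = new RootNameSniffer();
            // Lexical events are delivered only to a handler registered
            // under this property.
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", sniffer);
            reader.parse(new InputSource(new StringReader(xml)));
            return sniffer.declaredRoot;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String xml =
            "<!DOCTYPE memo [<!ELEMENT memo (#PCDATA)>]><memo>hi</memo>";
        System.out.println(declaredRootOf(xml));
    }
}
```

A validation layer built this way has to register a lexical handler just
to learn one name, which is the coupling being questioned here.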

 > 	- Flag saying if it's standalone.

Slippery slope, domino effect, etc., but perhaps we do need an xmlDecl 
callback.  What do others think?

 > If those show up, then I think it'll be possible to use the
 > DeclHandler (and to-be-renamed DtdHandler) to provide a
 > cleanly layered XML validation module.  Else ...


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Dec 22 16:01:11 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: DeclHandler
In-Reply-To: Khun Yee Fung's message of "Wed, 22 Dec 1999 10:31:12 -0500"
References: <E09B8717558DD211BF0300609793FBB0018F29DD@MAILSERVER>
Message-ID: <m37li7c6f2.fsf@localhost.localdomain>

Khun Yee Fung <kyeefung@extend.com> writes:

> I have a question. Right now, the Xerces SAX implementation calls the
> comment() method when a comment is encountered in a DTD. Is this the
> intended behaviour?

It makes sense to use the proposed comment and the existing
processingInstruction callbacks for all comments and
processingInstructions in a document.

> As to whether element and attribute declarations are useful for downstream
> processing. I did find a use. In XPath, there is a function called 'id()'
> which returns a node with a certain ID. Without getting access to the DTD,
> it is actually quite difficult to find out which attribute is the ID of an
> element.

Well, you can also iterate through the AttributeList using getType().
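
A rough sketch of that loop against the SAX1 AttributeList interface. The
class name IdAttributeFinder and the sample attributes are invented for
illustration, and it assumes the parser has read the relevant ATTLIST
declarations; a parser that has not may report "CDATA" for every attribute.

```java
import org.xml.sax.AttributeList;
import org.xml.sax.helpers.AttributeListImpl;

// Hypothetical helper: scans a SAX1 AttributeList for the ID attribute.
@SuppressWarnings("deprecation")
class IdAttributeFinder {
    // Returns the name of the first attribute whose declared type is "ID",
    // or null when none of the attributes is typed as an ID.
    static String findIdAttribute(AttributeList atts) {
        for (int i = 0; i < atts.getLength(); i++) {
            if ("ID".equals(atts.getType(i))) {
                return atts.getName(i);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // AttributeListImpl stands in for the list a parser would pass
        // to startElement().
        AttributeListImpl atts = new AttributeListImpl();
        atts.addAttribute("lang", "CDATA", "en");
        atts.addAttribute("key", "ID", "sec-1");
        System.out.println(findIdAttribute(atts));
    }
}
```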


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david-b at pacbell.net  Wed Dec 22 16:06:18 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: LexicalHandler
References: <14432.57481.527212.886471@localhost.localdomain>
Message-ID: <3860F6EB.95B8FCE1@pacbell.net>

David Megginson wrote:
> 
> Just a quick review -- here's the LexicalHandler from SAX2 alpha,
> modified so that the callbacks throw IOException rather than
> SAXException:

[ again, overloading IOException seems quite wrong to me ]


>     public void startDTD (String name, String publicId, String systemId)
>       throws IOException;

The "name" is a declaration, affecting validity, and belongs
with other DTD declarations.  Else parsers that expose the rest
of the DTD declarations, but not lexical events, can't support
the full set of validity checks in application code.

For DOM Level 2 support, the literal text of the internal subset
needs to be provided.


>     public void startEntity (String name) throws IOException;
>     public void endEntity (String name) throws IOException;

A bunch of restrictions to this were identified as being essential,
such as the fact that entities expanded within other constructs
mustn't be exposed.  For example:

	<!ATTLIST foo %std-attrs; %i18n-attrs; %gooey-attrs;>

	<element foo="&entity1;" bar="&entity2;" />

I'm hoping the full spec for those callbacks makes clear that
in such situations the entities MUST NOT be reported.  (And
would strongly prefer that parameter entities never show up
in any context whatsoever.)

The reason was briefly that applications can't see inside the
structure of those constructs -- they'll just see some start/end
entity calls, FOLLOWED (oops!) by the callback of which they're
a part.  Just like they would if the entities preceded that
construct.


>	 I wonder if a little
> redundancy would make sense:
> 
>     public void startEntity (String name, String publicId,
>                              String systemId) throws IOException;
>     public void endEntity (String name) throws IOException;
> 
> That way, if the parser supports the LexicalHandler but not the
> DeclHandler, the public and system identifiers for entities will still
> be available.

That wouldn't handle internal entities, though.

I have fundamental issues with the notion of exposing the entity
structure of documents beyond that needed to recreate the DOCTYPE
declaration (DTD).  Not just in SAX; DOM does it pretty poorly too
(children of entity refs must be readonly, making them impossible
to manipulate in typical ways).

So I'd really rather not see that particular thing done ... if
any substantial change is to be made to entity reporting, my vote
is to just drop it entirely.  It's too messy a notion (IMHO) to
show up in any API offering higher level notions than lexical
tokens. (angle bracket, name, space, name token, space, equals,
double quote, text, entity ref, text, double quote, angle bracket,
text ... you get the idea.)

- Dave



From david-b at pacbell.net  Wed Dec 22 16:15:11 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: DeclHandler
References: <14432.57765.516268.430263@localhost.localdomain>
	 <3860F2B5.87C0155F@pacbell.net> <14432.62712.408140.13233@localhost.localdomain>
Message-ID: <3860F8FC.473CB4DD@pacbell.net>

>  > However, there are two other declarations that should show
>  > up in a DeclHandler:
>  >
>  >      - The root element name.
> 
> You can get that from the LexicalHandler start/endDTD methods already.
> Do you imagine validation modules will require DeclHandler but not
> LexicalHandler?

Absolutely; nothing else in the LexicalHandler interface relates
to validity constraints.  It's annoying to need half a dozen
methods in the filter that do nothing more than pass irrelevant
lexical events to the next stage.

 
>  >      - Flag saying if it's standalone.
> 
> Slippery slope, domino effect, etc., but perhaps we do need an xmlDecl
> callback.  What do others think?

My slippery slope had brakes on it!  _Only_ the standalone flag
shows up in a validity constraint, so that's all that's necessary.

The real slippery slope comes in when exposing encoding decls,
which may be inside external entities.  XML decls and text decls
have different rules re whether "encoding=..." and "version=..."
are optional.

Were there an interest in encoding decls (there is -- how strong?)
as well as version decls (XML 1.1?) I'd say those belong in the
LexicalHandler -- they really don't relate to validation.

- Dave


>  > If those show up, then I think it'll be possible to use the
>  > DeclHandler (and to-be-renamed DtdHandler) to provide a
>  > cleanly layered XML validation module.  Else ...
>



From david at megginson.com  Wed Dec 22 16:24:48 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: LexicalHandler
In-Reply-To: <3860F6EB.95B8FCE1@pacbell.net>
References: <14432.57481.527212.886471@localhost.localdomain>
	<3860F6EB.95B8FCE1@pacbell.net>
Message-ID: <14432.64264.86507.674059@localhost.localdomain>

David Brownell writes:

 > For DOM Level 2 support, the literal text of the internal subset
 > needs to be provided.

You're kidding!  That's disgusting -- I'm strongly tempted just to
leave the DOM people dangling on that one.  After all, the proposed
SAX2 interfaces provide enough information to construct an equivalent
internal subset.

 > 
 > >     public void startEntity (String name) throws IOException;
 > >     public void endEntity (String name) throws IOException;
 > 
 > A bunch of restrictions to this were identified as being essential,
 > such as the fact that entities expanded within other constructs
 > mustn't be exposed.  For example:
 > 
 > 	<!ATTLIST foo %std-attrs; %i18n-attrs; %gooey-attrs;>
 > 
 > 	<element foo="&entity1;" bar="&entity2;" />

Agreed.

 > I'm hoping the full spec for those callbacks makes clear that
 > in such situations the entities MUST NOT be reported.  (And
 > would strongly prefer that parameter entities never show up
 > in any context whatsoever.)

To tell the truth, I don't think that many people really need any of
this stuff, so it's hard for me to distinguish one type of noise from
another.  If I were dictator, the only things I'd put in SAX2 would be
property/feature queries and Namespace support.

 > The reason was briefly that applications can't see inside the
 > structure of those constructs -- they'll just see some start/end
 > entity calls, FOLLOWED (oops!) by the callback of which they're
 > a part.  Just like they would if the entities preceded that
 > construct.

Agreed -- entity boundaries inside attribute values are forever lost.
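
The restriction can be observed with a small logging handler. This sketch
assumes the LexicalHandler interface as it later shipped in
org.xml.sax.ext and the parser bundled with the JDK; the class name
EntityBoundaryLog and the sample document are invented. The expectation is
that only the reference in element content produces a startEntity call,
while the one inside the attribute value is expanded silently.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DefaultHandler2;

// Hypothetical helper: records the name of every entity the parser reports.
class EntityBoundaryLog extends DefaultHandler2 {
    static final String SAMPLE =
        "<!DOCTYPE doc [\n"
        + "  <!ENTITY inAttr \"attribute text\">\n"
        + "  <!ENTITY inContent \"content text\">\n"
        + "]>\n"
        + "<doc note=\"&inAttr;\">&inContent;</doc>";

    final List<String> started = new ArrayList<>();

    @Override
    public void startEntity(String name) {
        started.add(name);
    }

    // Parses the XML text and returns the entity names seen by startEntity.
    static List<String> entitiesReportedFor(String xml) {
        try {
            XMLReader reader =
                SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            EntityBoundaryLog log = new EntityBoundaryLog();
            reader.setProperty(
                "http://xml.org/sax/properties/lexical-handler", log);
            reader.parse(new InputSource(new StringReader(xml)));
            return log.started;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(entitiesReportedFor(SAMPLE));
    }
}
```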

 > >	 I wonder if a little
 > > redundancy would make sense:
 > > 
 > >     public void startEntity (String name, String publicId,
 > >                              String systemId) throws IOException;
 > >     public void endEntity (String name) throws IOException;
 > > 
 > > That way, if the parser supports the LexicalHandler but not the
 > > DeclHandler, the public and system identifiers for entities will still
 > > be available.
 > 
 > That wouldn't handle internal entities, though.

For internal entities, both publicId and systemId would be null, and
the value would be the text that appears before the corresponding
endEntity callback.
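
As a sketch only, a handler written against the three-argument form could
tell the two kinds apart by checking for null identifiers. The interface
ProposedEntityHandler below merely mirrors the signatures quoted above; it
is hypothetical and is not part of any SAX release, and the class and
entity names are invented.

```java
import java.io.IOException;

// Hypothetical interface mirroring the signatures proposed in this thread.
interface ProposedEntityHandler {
    void startEntity(String name, String publicId, String systemId)
        throws IOException;
    void endEntity(String name) throws IOException;
}

// Classifies each reported entity as internal or external.
class EntityKindReporter implements ProposedEntityHandler {
    final StringBuilder log = new StringBuilder();

    @Override
    public void startEntity(String name, String publicId, String systemId) {
        // Under the proposal an internal entity carries no identifiers.
        boolean internal = (publicId == null && systemId == null);
        log.append(name)
           .append(internal ? ": internal" : ": external " + systemId)
           .append('\n');
    }

    @Override
    public void endEntity(String name) {
        // Nothing to do; the replacement text has already been delivered.
    }

    public static void main(String[] args) {
        EntityKindReporter reporter = new EntityKindReporter();
        reporter.startEntity("copyright", null, null);
        reporter.endEntity("copyright");
        reporter.startEntity("chapter1", null, "http://example.org/ch1.xml");
        reporter.endEntity("chapter1");
        System.out.print(reporter.log);
    }
}
```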

 > I have fundamental issues with the notion of exposing the entity
 > structure of documents beyond that needed to recreate the DOCTYPE
 > declaration (DTD).  Not just in SAX; DOM does it pretty poorly too
 > (children of entity refs must be readonly, making them impossible
 > to manipulate in typical ways).

Yes, I know -- that's why I want (at least) to make all of this mess
optional.  XML is simple at heart, but not when they start letting API 
writers loose on it.

 > So I'd really rather not see that particular thing done ... if
 > any substantial change is to be made to entity reporting, my vote
 > is to just drop it entirely.  It's too messy a notion (IMHO) to
 > show up in any API offering higher level notions than lexical
 > tokens. (angle bracket, name, space, name token, space, equals,
 > double quote, text, entity ref, text, double quote, angle bracket,
 > text ... you get the idea.)

I'd like to leave it out as well.  Personally, I think that the XML
community would be better served if purely lexical items like
Namespace prefixes, the DOCTYPE declaration, comments, element type
declarations, entity boundaries, etc. were simply inaccessible through
any standard API -- that way, the APIs would be easier to learn and
the obfuscators of the world would be less likely to abuse them.

I am tired, however, from all the e-mails from DOM implementors who
want comments (for example) in SAX so that they can bloat their DOM
trees with them.  They're wrong, of course, but I'm too tired to fight 
any more.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From eliot at isogen.com  Wed Dec 22 16:25:28 1999
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun  7 17:18:49 2004
Subject: Survey: Catalysis Templates
References: <F9FBA0D1187BD11188B200A0C9979DF902DB56BA@blrmail.india.hp.com>
Message-ID: <3860FC95.D96A4F59@isogen.com>

I'm wondering if anyone in the XML_DEV community is familiar with the
concept of "model templates" or "Frameworks" as defined by the Catalysis
modeling methodology (www.catalysis.org). Catalysis is defined largely
through the use of UML for capturing analysis, design, and
implementation models, but its definition of "framework" is somewhat
different from the more general notion of frameworks in UML.

I ask because I think that Frameworks as defined by Catalysis are
similar to, if not identical to the concept of document architectures as
defined in the HyTime standard. I'm wondering whether or not talking
about "frameworks" or "model templates" would mean more to the typical
XML developer than does talking about "architectures".

Thanks,

Eliot



From david-b at pacbell.net  Wed Dec 22 16:26:48 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: Parser interface
References: <14432.60462.189392.681294@localhost.localdomain>
Message-ID: <3860FBBD.139196C3@pacbell.net>

David Megginson wrote:
> 
> Questions:
> 
> 1. Should there be getters as well as setters for the handlers and the
>    locale?

I've certainly needed them often enough; "write-only" properties
are a model that's best avoided.  (They push value coordination up
a level, where it's error prone; and errors can't easily be caught.)


> 4. Should we explicitly allow the systemId argument to parse() to be a
>    relative URI?  If so, what should it be relative to?

I'd vote for requiring it to be absolute.  If relative URIs are allowed,
the base URI should be a read/write property, with a meaningful default.
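
If relative identifiers were allowed, the resolution step might look
roughly like this. It is a sketch using java.net.URI; the class name
SystemIdResolver, the method toAbsolute, and the example URIs are all
invented for illustration.

```java
import java.net.URI;

// Hypothetical helper: turns a possibly relative system identifier into an
// absolute one by resolving it against an explicit base URI.
class SystemIdResolver {
    static String toAbsolute(String baseUri, String systemId) {
        URI id = URI.create(systemId);
        if (id.isAbsolute()) {
            return id.toString();  // already absolute, leave it alone
        }
        return URI.create(baseUri).resolve(id).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.org/docs/index.xml";
        System.out.println(toAbsolute(base, "chapters/one.xml"));
        System.out.println(toAbsolute(base, "http://other.example/a.xml"));
    }
}
```

The caller has to supply the base explicitly, which is the argument for
making it a readable and writable property.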

- Dave



From ldodds at ingenta.com  Wed Dec 22 16:32:11 1999
From: ldodds at ingenta.com (Leigh Dodds)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: Parser interface
In-Reply-To: <14432.62515.162817.957791@localhost.localdomain>
Message-ID: <001301bf4c9a$192fc9a0$ab20268a@pc-lrd.bath.ac.uk>

>  > public interface Handler
>  > {
>  > }
>  >
>  > which marks a SAX2 Handler. The other handler interfaces are subclasses
>  > of this.
>
> I proposed this originally last spring and it was roundly shot down.
> If people are more comfortable with the idea, I'd be happy to add it
> back in.

I wasn't paying close attention last time SAX2 came up, I'll try and
dig back and see what the objections were.

>
>  Giving a generic:
>  >
>  > public void setHandler(Handler handler, String identifier)
>  > 	throws SAXNotRecognizedException, SAXNotSupportedException;
>
> Hmm -- I don't know about this part.  I think that it's still
> convenient to have explicit setters for the core handler types.

parser.setNamespaceDeclHandler(
		new NameSpaceDeclHandlerImpl()) ;
parser.setProperty("http://my.org/schema/handlers/schema",
		new MySchemaHandler()) ;

versus

parser.setHandler(new NameSpaceDeclHandlerImpl(),
		"http://xml.org/sax/handlers/namespacedecl") ;
parser.setHandler(new MySchemaHandler(),
		"http://my.org/schema/handlers/schema") ;

I like the second, mainly for consistency reasons (all handlers are treated
alike), and there's only a single method to remember. Although I can
see that I'd then have to make sure I used the correct URI identifier,
and I might well experience problems if I misspelt the URI, although
constants would mitigate this.

You could support both (with an anonymous/inner/private class?). So
setNamespaceDeclHandler just called setHandler with the correct URI.
Not 100% on this, I'll have to check.
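
Supporting both might be sketched as below, with no inner class needed.
Handler, NamespaceDeclHandler and SketchParser are hypothetical stand-ins
for the interfaces discussed in this thread, not real SAX types, and the
sketch accepts any identifier instead of throwing
SAXNotRecognizedException.

```java
import java.util.HashMap;
import java.util.Map;

// Marker interface from the proposal; every handler type would extend it.
interface Handler { }

// Stand-in for one of the core handler types.
interface NamespaceDeclHandler extends Handler { }

// Hypothetical parser showing an explicit setter that delegates to the
// generic one.
class SketchParser {
    // A constant guards against mistyping the identifier URI.
    static final String NAMESPACE_DECL_HANDLER =
        "http://xml.org/sax/handlers/namespacedecl";

    private final Map<String, Handler> handlers = new HashMap<>();

    // Generic registration: every handler goes through this one method.
    void setHandler(Handler handler, String identifier) {
        handlers.put(identifier, handler);
    }

    Handler getHandler(String identifier) {
        return handlers.get(identifier);
    }

    // Convenience setter for a core handler type.
    void setNamespaceDeclHandler(NamespaceDeclHandler handler) {
        setHandler(handler, NAMESPACE_DECL_HANDLER);
    }

    public static void main(String[] args) {
        SketchParser parser = new SketchParser();
        NamespaceDeclHandler h = new NamespaceDeclHandler() { };
        parser.setNamespaceDeclHandler(h);
        System.out.println(
            parser.getHandler(SketchParser.NAMESPACE_DECL_HANDLER) == h);
    }
}
```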

All in all probably a stylistic point.

L.




From simonstl at simonstl.com  Wed Dec 22 16:39:21 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <m3n1r4gmbj.fsf@localhost.localdomain>
References: <David Brownell's message of "Tue, 21 Dec 1999 10:05:52 -0800">
 <3.0.32.19991218163458.01500100@pop.intergate.ca>
 <14430.36362.589577.199567@localhost.localdomain>
 <385FC180.8AFC226E@pacbell.net>
Message-ID: <199912221639.LAA17135@hesketh.net>

At 01:50 PM 12/21/99 -0500, David Megginson wrote:
>That misses Tim's original argument, though -- he was arguing that
>people who don't want Namespace processing can use SAX1, and people
>who want it can use SAX2, and wondered (aloud) why there would be
>anyone who would need both in the same API.

Because switching from flat-head to Phillips screwdrivers is a big pain in
the neck, and I'd rather be able to flip the head of the thing over without
having to reach back into my toolbox (careful of that hacksaw!) and find a
different screwdriver when I'm working on a simple project.

Never mind using a screwdriver as a hammer - just don't make me use
different screwdrivers on similar projects or even within the same project.

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From david-b at pacbell.net  Wed Dec 22 16:47:07 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: LexicalHandler
References: <14432.57481.527212.886471@localhost.localdomain>
	 <3860F6EB.95B8FCE1@pacbell.net> <14432.64264.86507.674059@localhost.localdomain>
Message-ID: <38610082.73C98483@pacbell.net>

David Megginson wrote:
> 
> I am tired, however, from all the e-mails from DOM implementors who
> want comments (for example) in SAX so that they can bloat their DOM
> trees with them.  They're wrong, of course, but I'm too tired to fight
> any more.

That's exactly why you get email from DOM implementors; they're
just as tired of listening to users asking for such features!!
"If only" DOM L1 didn't expose them, many folk would be happier.

My DOM construction still defaults to disabling such bloat, for
what it's worth.  

- Dave



From tbray at textuality.com  Wed Dec 22 16:59:26 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: LexicalHandler
Message-ID: <3.0.32.19991222085614.01506490@pop.intergate.ca>

At 11:23 AM 12/22/99 -0500, David Megginson wrote:
>David Brownell writes:
>
> >if
> > any substantial change is to be made to entity reporting, my vote
> > is to just drop it entirely.  It's too messy a notion (IMHO) to
...
>I'd like to leave it out as well.  

Me too.

>I am tired, however, from all the e-mails from DOM implementors who
>want comments (for example) in SAX so that they can bloat their DOM
>trees with them.  They're wrong, of course, but I'm too tired to fight 
>any more.

"Just say no". -Tim




From xml at searchtools.com  Wed Dec 22 17:03:19 1999
From: xml at searchtools.com (Avi Rappoport)
Date: Mon Jun  7 17:18:49 2004
Subject: XML Indexing Engine
In-Reply-To: <00bd01bf4c93$4d048ab0$252f0a0f@india.hp.com>
References: <00bd01bf4c93$4d048ab0$252f0a0f@india.hp.com>
Message-ID: <a04300c01b486ab2c4b5d@[171.66.196.146]>

I'm a search person, so my guess is that it should be a program to 
create an inverted index of the contents of a bunch of XML files for 
later retrieval.

<http://www.XMLindex.com>, which has been advertising in 
InternetWorld, claims it's a "portal application that uses Xdex 
software for context - sensitive searching. The Sequoia XML-based 
Interactive Portal -- Index, Search and Retrieve XML in Any File ... 
"  But oddly enough, their web site is down right now.

Hope that helps,

Avi

At 9:13 PM +0530 12/22/1999, Abhishek Srivastava wrote:
>Hi,
>
>Can anyone tell me, what is an XML Indexing Engine ?
-- 
_______________________________________________________
Guide to Local Site, Intranet, and Portal Search Engines: 
<http://www.searchtools.com> 



From kyeefung at extend.com  Wed Dec 22 17:13:31 1999
From: kyeefung at extend.com (Khun Yee Fung)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: LexicalHandler
Message-ID: <E09B8717558DD211BF0300609793FBB0018F2A5A@MAILSERVER>

		David Megginson [david@megginson.com] wrote:

		... Personally, I think that the XML
		community would be better served if purely lexical items like
		Namespace prefixes, the DOCTYPE declaration, comments, element type
		declarations, entity boundaries, etc. were simply inaccessible through
		any standard API -- that way, the APIs would be easier to learn and
		the obfuscators of the world would be less likely to abuse them.

I hope you are not talking about people who use the APIs. If the people who
have to use APIs did not have to look at these things, they would not want
them in any APIs. Personally, I do not believe people who write
specifications will look at XML APIs to limit their imagination. They will
look at what is in an XML document. And API writers will then have to make
sure their APIs can do everything under the Sun. Or the people who have to
implement the specifications will just do it themselves or not implement the
specifications at all.

If SAX2 had not given me the comment nodes in XML documents, I would not
have switched to SAX. I would have stayed in DOM for my little
implementation of XPath. Or, more likely, I would have started implementing
my own XML parser. I did not specify XPath. As an implementor, I will choose
the tool that allows me to do my work.

And in general, it is not a problem we can avoid. How many people need to
get the comments of a Java program in source form? I can think of a few:
people who have to implement something like JavaDoc. The ordinary Joe would
not think of implementing JavaDoc because not that many people can handle
it. I see it as a strength rather than a weakness for everybody in XML to be
able to access anything in an XML document. And SAX is doing a wonderful
job. And I thank you very much for that.

		I am tired, however, from all the e-mails from DOM implementors who
		want comments (for example) in SAX so that they can bloat their DOM
		trees with them.  They're wrong, of course, but I'm too tired to fight 
		any more.

And if both DOM and SAX had not provided access to comment nodes and
specifications like XSLT and XPath allow manipulation of these nodes, we
would have seen XML parsers left and right that did not support any standard
APIs. I consider that a worse scenario than the current situation.

The issue about why some features got into various W3C specifications is too
big for me to know. :-)

Regards,
Khun Yee Fung




From kyeefung at extend.com  Wed Dec 22 17:21:51 1999
From: kyeefung at extend.com (Khun Yee Fung)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: DeclHandler
Message-ID: <E09B8717558DD211BF0300609793FBB0018F2A6B@MAILSERVER>

Thanks for the info. The getType() solution is actually a better solution
for me. I should have checked SAX better before going for DeclHandler, I
guess.

Regards,
Khun Yee


		-----Original Message-----
		From:	David Megginson [mailto:david@megginson.com]
		Sent:	Wednesday, December 22, 1999 11:00 AM
		To:	XMLDev list
		Subject:	Re: SAX2: DeclHandler

		Khun Yee Fung <kyeefung@extend.com> writes:

		> I have a question. Right now, the Xerces SAX implementation calls the
		> comment() method when a comment is encountered in a DTD. Is this the
		> intended behaviour?

		It makes sense to use the proposed comment and the existing
		processingInstruction callbacks for all comments and
		processingInstructions in a document.

		> As to whether element and attribute declarations are useful for downstream
		> processing. I did find a use. In XPath, there is a function called 'id()'
		> which returns a node with a certain ID. Without getting access to the DTD,
		> it is actually quite difficult to find out which attribute is the ID of an
		> element.

		Well, you can also iterate through the AttributeList using getType().


		All the best,


		David

		-- 
		David Megginson                 david@megginson.com
		           http://www.megginson.com/




From seh at speakeasy.org  Wed Dec 22 17:42:10 1999
From: seh at speakeasy.org (Steve Harris)
Date: Mon Jun  7 17:18:49 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: Steinar Bang's message of "22 Dec 1999 08:58:36 +0100"
References: <EE5F339A2558D311B7360008C73BFD001687BB@exchange1.primus.com> <whk8m7flub.fsf@viffer.metis.no>
Message-ID: <sv7li698n8.fsf@hodge.primus.com>

Steinar Bang <sb@metis.no> writes:

> >>>>> Steve Harris <sharris@primus.com>:
> 
> > This UTF-8/UTF-16 representation translation seems like a job for
> > Standard C++'s <locale>/codecvt facility, not something to be
> > embedded in a string class.
> 
> Well, maybe... if the C++ standardization committee hadn't dropped
> the ball on sizeof(wchar_t)...:-/
> 
> I fear that this will make the entire std::wstring stuff unusable for
> multiplatform development.

[perhaps off-topic, but...]
Can you elaborate a bit here? Do you mean that the problem is that we
can't know the size of a wchar_t? Aren't there some guarantees to the
effect of, "A wchar_t will be at least as big as two chars," or
whatever would be appropriate? I can see that if you're writing the
low-level UTF-8 translation, you need to know what bits to shift
where. It seems that so long as the compiler will guarantee that you
can fit _at least_ 16 bits in a wchar_t, then your translation code
would be sufficiently portable.
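
The bit shifting in question can be sketched as follows. This is only an illustration, written in Java, whose char is exactly 16 bits and stands in here for a wchar_t of at least 16 bits. The class and method names are hypothetical, only one- to three-byte UTF-8 sequences are handled, and malformed or truncated input is not validated.

```java
class Utf8ToUtf16 {
    /** Decodes 1- to 3-byte UTF-8 sequences into 16-bit code units. */
    static char[] decodeBmp(byte[] utf8) {
        char[] out = new char[utf8.length]; // never more code units than bytes
        int n = 0;
        for (int i = 0; i < utf8.length; ) {
            int b0 = utf8[i] & 0xFF;
            if (b0 < 0x80) {                      // 0xxxxxxx: 7 payload bits
                out[n++] = (char) b0;
                i += 1;
            } else if ((b0 & 0xE0) == 0xC0) {     // 110xxxxx 10xxxxxx: 11 payload bits
                out[n++] = (char) (((b0 & 0x1F) << 6) | (utf8[i + 1] & 0x3F));
                i += 2;
            } else if ((b0 & 0xF0) == 0xE0) {     // 1110xxxx 10xxxxxx 10xxxxxx: 16 payload bits
                out[n++] = (char) (((b0 & 0x0F) << 12)
                        | ((utf8[i + 1] & 0x3F) << 6)
                        | (utf8[i + 2] & 0x3F));
                i += 3;
            } else {
                // Four-byte sequences need more than 16 bits (or a surrogate pair).
                throw new IllegalArgumentException("not a 1- to 3-byte sequence");
            }
        }
        return java.util.Arrays.copyOf(out, n);
    }
}
```

Everything up to three bytes fits in 16 bits; it is the four-byte case where the actual width of the target type starts to matter.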

[...]

> The MSVC++ Standard C++ Library support is seriously broken in a
> multitude of ways

[...]

Right. Do anything aggressive with templates and "Internal Compiler
Error" will become the stuff of nightmares.

I know we need to get work done today, but it's sad that we can't use
more of the Standard C++ pieces in a project like this. If we're
successful, this API will outlast the current rev of the lagging
compilers. I'm still in favor of planning an API that may not work for
everyone today. The C++ specification provides a road map (and
hopefully a guarantee) of where the compilers and libraries are going
to. We shouldn't have to ignore the generalized facilities that solve
our specific problem here. Targeting use of a fully-compliant
compiler/library pairing keeps us close to "The C++ Way," but I
concede that it may also keep us from using SAX in the near future.

-- 
Steven E. Harris
Primus Knowledge Solutions, Inc.
http://www.primus.com



From mda at discerning.com  Wed Dec 22 17:47:46 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:49 2004
Subject: xlink, xinclude, show, and actuate
Message-ID: <018701bf4ca3$e62d4550$0200a8c0@mdaxke>

(sent to both www-xml-linking-comments@w3.org and xml-dev;
please follow up just to xml-dev.)

I just read the latest (dec 99) revs of xlink and xinclude, and have
some thoughts.

I think everyone agrees that the "show" and "actuate" attributes feel
uncomfortably like display-specific hacks. 

Paraphrasing, as currently drafted the show attribute is one of:
- "embed": stick it here
- "new": make a new window
- "replace": replace this window

I would maintain that "new" and "replace" are actually hiding complex
links. They imply a target which is different from the current
location: they specify that the identified resource should be
loaded to either a new child of the "all windows" collection,
or to replace the current window in that collection. Those are
different locations in the "GUI tree".

I would suggest that the "show" attribute should be broken down into
two separate attributes:
- "target": the location that the href resource should be loaded to,
if different from the current location. It could be a relative or
absolute url, in either case utilizing xpath.
- "operation": NMTOKEN, one of:
-- "insert", insert as new child of target
-- "replace", replace the target
-- perhaps others such as "merge", which would merge descendants with target

If the currently proposed "show" is retained, it should probably
be placed in a different html namespace, since it is display specific.
It is syntactic sugar for something like:
  html:show="embed" -> xlink:target="." xlink:operation="replace"
  html:show="new" -> xlink:target="//dom:windows" xlink:operation="insert"
  html:show="replace" -> xlink:target="ancestor::dom:window" xlink:operation="replace"
(Presumably the target could be expressed in a suitable mapping from DOM
to xpath, when/if that exists, in the html usage case.)

Note that the 4th case missing from the existing show attribute
is IMHO still useful: xlink:target="." xlink:operation="insert".
That corresponds to an inclusion case where I want to leave a "marker",
and not replace the element.
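
As a rough sketch of the proposed split, here is the desugaring written out in Java. The class, field, and method names are hypothetical, and the operation chosen for each show value follows the prose description of that value (embed replaces the element, new adds a window, replace replaces the current window).

```java
import java.util.Map;

class ShowSugar {
    /** A (target, operation) pair, the two attributes proposed in place of "show". */
    static final class Link {
        final String target;
        final String operation;
        Link(String target, String operation) {
            this.target = target;
            this.operation = operation;
        }
    }

    static final Map<String, Link> SUGAR = Map.of(
        "embed",   new Link(".", "replace"),                     // stick it here
        "new",     new Link("//dom:windows", "insert"),          // new child of "all windows"
        "replace", new Link("ancestor::dom:window", "replace")); // replace the current window

    static Link desugar(String show) {
        Link link = SUGAR.get(show);
        if (link == null) {
            throw new IllegalArgumentException("unknown show value: " + show);
        }
        return link;
    }
}
```

The inclusion-with-marker case (target "." with operation "insert") has no show value, so it can only be written in the two-attribute form.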

Now with respect to "actuate", which can currently be "onLoad" and "onRequest":
these two are very display specific. And while there are just two,
in fact a document might go through any number of layers/tiers of processing
(some on the "client" and some on the "server"), and it might be desired for
any one of them to be the one to actually act on the declared link. 

If the current actuate were to be preserved, I would suggest it
be moved to the html namespace. I might note that the two current
choices are far less than what the more ambitious dhtml sites are using
today in load timing (actuate 10 seconds after my stupid flash demo,
actuate with a fade, actuate when i roll over this gif, etc.).

Since the semantics of actuate have to do with timing, I would think
that some tie-in should be specified to SMIL or at least to DOM2.

Some of the important dimensions to consider with respect to activation:
- the id of the processor (or layer) that is examining the content
- the event(s) that should trigger an actuation
- the role of the link
- the type of the located resource

The actuate attribute could probably be done away with. 
An "activation-event" attribute might be added. Then the html group
could define things like:
  html:a -> activation-event="html:onRequest" 
  html:img -> activation-event="html:onLoad"

But I'm not real happy with that either. On the client side, I'd want
a tighter integration with the DOM events. On the server side, I have
no need of actuate at all -- the fully qualified role name should be
sufficient for me.

With respect to xinclude, at first I was pleased with the separation
from xlink, but now upon further reflection I believe it will lead
to a confusing and artificial separation of capabilities.

In my view, xinclude is just a case of the "actuate" being done
at an earlier stage than is currently allowed for in the simplistic
set of 2 choices in xlink actuate. xinclude has potential need for
the "target" attribute described above, and perhaps for further
control over who should expand it.

For its part, xlink could benefit from the "parse" attribute that
xinclude has. Speaking of "parse", I think that both would benefit from an
even more powerful capability of indicating what processor
to use to load the resource (which might be implied by type),
as well as processor directives and/or a processor stylesheet url
(which might be done somehow as a complex xlink?).
This could be used to provide something like architectural forms, for example.

I think the "steps" attribute should just be ditched. xslt seems
to be doing fine without a "steps" attribute to control how far
it should follow possibly infinite recursion in calling templates
or importing stylesheets. It is an implementation level issue.
After all, suppose that I want to allow an infinite number of steps.
Will the processor really do that?

There is still the pernicious confusion in xinclude about where
in layering it stands with respect to validation and the DTD, which I
sympathize with and have no good suggestions for at the moment, beyond
those mentioned in the spec. That could be an obstacle to the (re)unification
of xlink and xinclude.

-mda




From david at megginson.com  Wed Dec 22 17:57:22 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:49 2004
Subject: SAX2: Namespace proposal
In-Reply-To: <199912221639.LAA17135@hesketh.net>
References: <David Brownell's message of "Tue, 21 Dec 1999 10:05:52 -0800">
	<3.0.32.19991218163458.01500100@pop.intergate.ca>
	<14430.36362.589577.199567@localhost.localdomain>
	<385FC180.8AFC226E@pacbell.net>
	<199912221639.LAA17135@hesketh.net>
Message-ID: <14433.4286.695602.900882@localhost.localdomain>

Simon St.Laurent writes:

 > At 01:50 PM 12/21/99 -0500, David Megginson wrote:
 > >That misses Tim's original argument, though -- he was arguing that
 > >people who don't want Namespace processing can use SAX1, and people
 > >who want it can use SAX2, and wondered (aloud) why there would be
 > >anyone who would need both in the same API.
 > 
 > Because switching from flat-head to Phillips screwdrivers is a big pain in
 > the neck, and I'd rather be able to flip the head of the thing over without
 > having to reach back into my toolbox (careful of that hacksaw!) and find a
 > different screwdriver when I'm working on a simple project.
 > 
 > Never mind using a screwdriver as a hammer - just don't make me use
 > different screwdrivers on similar projects or even within the same project.

I think that James Clark's suggestion, which I've adopted (for now),
gives us a good multi-head screwdriver -- I just wanted to make
certain that Tim Bray's argument to the contrary was represented
fairly.

I think that Tim's concern is a very good one: essentially, I'd sum it 
up like this:

  Any compromises that we make now for political expediency will stay
  long after the political need has passed.

OK, that's a little more portentous than anything Tim wrote, but he
knows whereof he speaks.  Remember the SML discussion from a while
back?  A lot of the stuff that people wanted to remove from XML is
stuff that people originally wanted for the same kinds of reasons that
people want the pre-Namespace stuff in SAX2.

Now SGML compatibility doesn't really matter any more (I wasn't at the
Philly show, but how much new SGML software was on the exhibit
floor?), but we're still stuck with DOCTYPE declarations,
less-than-useful attribute types (with varying whitespace-handling
rules), notations, unparsed entities, and a lot of other awkward stuff
that makes XML look at first glance far more complicated than it truly
is.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From mda at discerning.com  Wed Dec 22 18:09:26 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:50 2004
Subject: attribute values as qnames?
Message-ID: <018d01bf4ca7$4b885ae0$0200a8c0@mdaxke>

i noticed that xmlschema is using qnames in attribute values:
<schema targetNamespace="http://www.myco.com/MYPO"
        xmlns="http://www.w3.org/TR/1999/WD-xmlschema-1-19991217"
        xmlns:po="http://www.myco.com/MYPO">

 <element name="PurchaseOrder" type="po:PurchaseOrderType"/>

This is something I've wanted to do, and thought was not allowed,
so I dug up xml-names, and found only this, in section 6:
  "Strictly speaking, attribute values declared to be of types ID,
  IDREF(S), ENTITY(IES), and NOTATION are also Names, and thus
  should be colon-free."

Of course, other types of attributes can have a colon, but regardless
there is no intimation that the prefixes would be expanded (and in
fact they shouldn't be, for an arbitrary attribute).

xmlschema introduces an ab initio datatype of QName (4.2.2).
but that then means that xmlschema aware processors will produce
a different infoset (if i can correctly use that word in a sentence)
than a mere run of the mill namespace-aware processor.

It seems that the "right" thing here is actually an extension to xml-names,
to make ID/IDREF/ENTITY/NOTATION names be qnames, and expanded at the
lower layer?

otherwise, for example, someone parsing two different schemas which
differ only in prefixes will conclude that the schemas are different,
which to me is inimical to what namespaces are supposed to accomplish.
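
To make the concern concrete, here is a small Java sketch of expanding a qname-valued attribute against the in-scope prefix bindings. The class and method names are hypothetical, and treating an unprefixed value as using the default binding (if any) is an assumption. javax.xml.namespace.QName compares namespace URI and local part only, so the prefix drops out of the comparison.

```java
import javax.xml.namespace.QName;
import java.util.Map;

class QNameValues {
    /** Expands a "prefix:local" attribute value using the given prefix-to-URI bindings. */
    static QName expand(String value, Map<String, String> bindings) {
        int colon = value.indexOf(':');
        String prefix = colon < 0 ? "" : value.substring(0, colon);
        String local = value.substring(colon + 1);
        String uri = bindings.get(prefix);
        if (uri == null) {
            if (!prefix.isEmpty()) {
                throw new IllegalArgumentException("unbound prefix: " + prefix);
            }
            uri = ""; // unprefixed value and no default binding: no namespace
        }
        return new QName(uri, local, prefix);
    }

    public static void main(String[] args) {
        QName a = expand("po:PurchaseOrderType", Map.of("po", "http://www.myco.com/MYPO"));
        QName b = expand("x:PurchaseOrderType", Map.of("x", "http://www.myco.com/MYPO"));
        System.out.println(a.equals(b)); // compares URI and local part, not the prefix
    }
}
```

A processor that compares the raw attribute strings instead sees "po:PurchaseOrderType" and "x:PurchaseOrderType" as different values.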

-mda




From costello at mitre.org  Wed Dec 22 18:19:11 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:50 2004
Subject: Bug in Solution to Representing IP addresses in XML Schema
References: <3860C56F.819B6D00@mitre.org>
Message-ID: <38611676.C9AF7881@mitre.org>

I just realized that there is a bug in the solution that I mailed this
morning.  The period (.) is a special character meaning "any
character".  To indicate that we want a period and not "any character"
the period must be escaped with a backslash, i.e., \. 

Here's the fixed solution:

<datatype name="IP" source="string">
    <pattern value="((1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])\.){3}
                     (1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])">
       <annotation>
          <info>
              Datatype for representing IP addresses.  Examples,
                 129.83.64.255, 64.128.2.71, etc.
              This datatype restricts each field of the IP address
              to have a value between zero and 255, i.e.,
                 [0-255].[0-255].[0-255].[0-255]
              Note: in the value attribute (above) the regular
              expression has been split over two lines.  This is
              for readability purposes only.  In practice the R.E.
              would all be on one line.
          </info>
       </annotation>
    </pattern>
</datatype>
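
The effect of the escape can be checked outside a schema processor with java.util.regex. This is only a sketch: the class name is hypothetical, the unescaped variant is assumed to be the same expression with a bare period, and matches() is used because it requires the whole string to match.

```java
import java.util.regex.Pattern;

class IpPatternDot {
    static final String FIELD = "(1?[0-9]?[0-9]|2[0-4][0-9]|25[0-5])";

    // A bare "." accepts any character between the fields.
    static final Pattern UNESCAPED = Pattern.compile("(" + FIELD + "." + "){3}" + FIELD);

    // "\." accepts only a literal period.
    static final Pattern ESCAPED = Pattern.compile("(" + FIELD + "\\." + "){3}" + FIELD);

    public static void main(String[] args) {
        for (String s : new String[] {"129.83.64.255", "129x83x64x255"}) {
            System.out.println(s + " unescaped=" + UNESCAPED.matcher(s).matches()
                    + " escaped=" + ESCAPED.matcher(s).matches());
        }
    }
}
```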




From Rajiv.Mordani at eng.sun.com  Wed Dec 22 19:00:55 1999
From: Rajiv.Mordani at eng.sun.com (Rajiv Mordani)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: DeclHandler
In-Reply-To: <14432.57765.516268.430263@localhost.localdomain>
Message-ID: <Pine.SOL.3.96.991222105955.20108C-100000@nine>

I am not sure what the intent is of having IOException. I think
SAXException seems appropriate to me..

- Rajiv

On Wed, 22 Dec 1999, David Megginson wrote:

> Here's the DeclHandler that we designed for SAX2alpha, with
> IOException replacing SAXException in the throws clauses:
> 
>   public interface DeclHandler
>   {
>     public void elementDecl (String name, String model) throws IOException;
>     public void attributeDecl (String eName, String name, String type,
> 			       String valueDefault, String value)
>       throws IOException;
>     public void internalEntityDecl (String name, String value)
>       throws IOException;
> 
>     public void externalEntityDecl (String name, String publicId,
> 				    String systemId)
>       throws IOException;
>   }
> 
> Notes:
> 
> 1. Unparsed entity and notation declarations are reported by the (now
>    confusingly-named) DTDHandler.  The distinction is that the XML 1.0 
>    REC requires parsers to report unparsed-entity and notation
>    declarations, but not other DTD-based declarations.
> 
> 2. The model argument in elementDecl is a normalized string
>    representation of a content model.  It's not ideal, but everyone
>    agreed last time that it was workable.
> 
> This interface seems hopelessly anachronistic, and I'm not willing to
> invest too much time in it -- after all, while DTDs are useful in
> themselves, the declarations should hardly form part of downstream
> processing -- but enough people want it that it's useful to include it
> as an optional feature.
> 
> 
> All the best,
> 
> 
> David
> 
> -- 
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 
> 
> 
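
For anyone who wants to experiment with these callbacks, here is a minimal sketch that collects element declarations. It uses the org.xml.sax.ext.DeclHandler interface bundled with current Java runtimes, whose methods declare SAXException rather than the IOException in the quoted proposal. The collector class and its collect method are hypothetical names, and the parser is assumed to support the declaration-handler property.

```java
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.ext.DeclHandler;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

class DeclCollector extends DefaultHandler implements DeclHandler {
    // element name -> content model string as reported by the parser
    final Map<String, String> models = new LinkedHashMap<>();

    @Override public void elementDecl(String name, String model) {
        models.put(name, model);
    }
    @Override public void attributeDecl(String eName, String aName, String type,
                                        String mode, String value) { }
    @Override public void internalEntityDecl(String name, String value) { }
    @Override public void externalEntityDecl(String name, String publicId, String systemId) { }

    /** Parses xml and returns the element declarations found in its DTD. */
    static Map<String, String> collect(String xml) throws Exception {
        XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
        DeclCollector collector = new DeclCollector();
        // Declaration events are optional and are registered through a property.
        reader.setProperty("http://xml.org/sax/properties/declaration-handler", collector);
        reader.setContentHandler(collector);
        reader.parse(new InputSource(new StringReader(xml)));
        return collector.models;
    }
}
```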




From david at megginson.com  Wed Dec 22 19:08:53 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: <Pine.SOL.3.96.991222105955.20108C-100000@nine>
References: <14432.57765.516268.430263@localhost.localdomain>
	<Pine.SOL.3.96.991222105955.20108C-100000@nine>
Message-ID: <14433.8570.811500.337122@localhost.localdomain>

Rajiv Mordani writes:

 > I am not sure what the intent is of having IOException. I think
 > SAXException seems appropriate to me..

I'd like to hear as many opinions on this point as possible.  What is
the rationale for *not* deriving SAXException from IOException when
(for example) java.net.MalformedURLException and
java.util.zip.ZipException are derived from IOException?

I especially like the idea that higher-level libraries could have

  void importXML (String uri) throws IOException;

without the application's having any direct dependency on SAX
interfaces.  It could accomplish the same thing by having a
SAXExceptionAdapter that embeds the SAXException and extends
IOException, but that seems like a lot of unnecessary fuss for a very
common case.
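
For comparison, the adapter route might look roughly like this. It is a sketch only: SAXExceptionAdapter is the hypothetical class named above, XmlImporter and the InputSource overload are made up for illustration, and the JAXP SAXParserFactory is assumed as the way to obtain a parser.

```java
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import java.io.IOException;

/** Hypothetical adapter: an IOException that embeds the original SAXException. */
class SAXExceptionAdapter extends IOException {
    SAXExceptionAdapter(SAXException cause) {
        super(cause.getMessage(), cause);
    }
}

class XmlImporter {
    /** Callers see only IOException and need no SAX types of their own. */
    static void importXML(String uri) throws IOException {
        importXML(new InputSource(uri));
    }

    static void importXML(InputSource source) throws IOException {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(source, new DefaultHandler());
        } catch (SAXException e) {
            throw new SAXExceptionAdapter(e); // the wrapping step on every call path
        } catch (ParserConfigurationException e) {
            throw new IOException(e);
        }
    }
}
```

Every library that wants an IOException-only signature has to repeat that catch-and-wrap step, which is the fuss in question.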

Opinions?


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From nikita.ogievetsky at csfb.com  Wed Dec 22 19:11:53 1999
From: nikita.ogievetsky at csfb.com (Ogievetsky, Nikita)
Date: Mon Jun  7 17:18:50 2004
Subject: xml:base
Message-ID: <9C998CDFE027D211B61300A0C9CF9AB4016D6A45@SNYC11309>


2 basic questions about XML Base :-) :

1. If I specify xml:base on an XML fragment,
 is there no way to reference local resources within that fragment?
 That seems poor...
2. What if I want 2 or 3 bases? Is that impossible?
Like this:
<my:a base="mylinks" href="some.html">
	<my:img base="myimages" src="img.gif"/>
</my:a>

I think that a reasonable solution would be
to allow aliasing xml:base, as is done with
namespaces. Maybe something like this:
<AAA	xmlns="yyy.yyy.yyy"
	xmlns:xxx="xxx.xxx.xxx"
	base:w3tr="http://www.w3.org/TR"
	base:cot="http://www.cogx.com/xml99">
<slink base="cot" href="xity" title="XLink visualiser"/>
<slink base="w3tr" href="xmlbase" title="XML Base Working Draft"/>
<slink base="w3tr" href="xlink" title="XML Link Working Draft"/>
...
</AAA>

Actually, I have been sinning like this for quite some time already.
It works, it is easy, I enjoy it.

If somebody (and probably quite a few people) has problems with this,
base elements can be specified individually:

<AAA>
	<base alias="w3tr" href="http://www.w3.org/TR"/>
	<base alias="cot" href="http://www.cogx.com/xml99"/>
	<slink base="cot" href="xity" title="XLink visualiser"/>
	<slink base="w3tr" href="xmlbase" title="XML Base Working Draft"/>
	<slink base="w3tr" href="xlink" title="XML Link Working Draft"/>
</AAA>
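
How a processor might resolve such aliased links can be sketched in a few lines of Java. The class and method names are hypothetical, and so is the rule that an alias names a directory: java.net.URI resolution replaces the last path segment unless the base ends in a slash, so one is appended.

```java
import java.net.URI;
import java.util.Map;

class BaseAliases {
    /** Resolves href against the base URI registered under the given alias. */
    static URI resolve(Map<String, String> bases, String alias, String href) {
        String base = bases.get(alias);
        if (base == null) {
            throw new IllegalArgumentException("unknown base alias: " + alias);
        }
        // Assumption: the alias names a directory, so keep its last path segment.
        if (!base.endsWith("/")) {
            base += "/";
        }
        return URI.create(base).resolve(href);
    }

    public static void main(String[] args) {
        Map<String, String> bases = Map.of(
            "w3tr", "http://www.w3.org/TR",
            "cot", "http://www.cogx.com/xml99");
        System.out.println(resolve(bases, "w3tr", "xmlbase"));
        System.out.println(resolve(bases, "cot", "xity"));
    }
}
```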

With best regards,

Nikita Ogievetsky
nogievet@offsight.com
http://www.cogx.com






From costello at mitre.org  Wed Dec 22 19:18:25 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:50 2004
Subject: Solution2: Representing IP addresses in XML Schema
Message-ID: <38612458.3FBBBEB9@mitre.org>

Andrew Greene pointed out that the regular expression was allowing
things like 08 and 09 to appear as a field in the IP, and many (all?)
implementations treat a number beginning with "0" as an octal value. 
Here's the latest solution, containing the R.E. that Andrew sent to me
which does not have this deficiency: (thanks Andrew!)

<datatype name="IP" source="string">
    <pattern value="(([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])\.){3}
                     ([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])">
       <annotation>
          <info>
              Datatype for representing IP addresses.  Examples,
                 129.83.64.255, 64.128.2.71, etc.
              This datatype restricts each field of the IP address
              to have a value between zero and 255, i.e.,
                 [0-255].[0-255].[0-255].[0-255]
              Note: in the value attribute (above) the regular
              expression has been split over two lines.  This is
              for readability purposes only.  In practice the R.E.
              would all be on one line.
          </info>
       </annotation>
    </pattern>
</datatype>
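
A quick way to try this expression outside a schema processor is java.util.regex. This is only a sketch with a hypothetical class name; matches() is used because it requires the whole string to match.

```java
import java.util.regex.Pattern;

class IpPatternNoLeadingZero {
    // One field: 0-99 without a leading zero, 100-199, 200-249, or 250-255.
    static final String FIELD = "([1-9]?[0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5])";
    static final Pattern IP = Pattern.compile("(" + FIELD + "\\.){3}" + FIELD);

    static boolean isValid(String s) {
        return IP.matcher(s).matches();
    }

    public static void main(String[] args) {
        for (String s : new String[] {"129.83.64.255", "64.128.2.71", "08.1.1.1", "256.1.1.1"}) {
            System.out.println(s + " -> " + isValid(s));
        }
    }
}
```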




From martind at netfolder.com  Wed Dec 22 19:18:48 1999
From: martind at netfolder.com (Didier PH Martin)
Date: Mon Jun  7 17:18:50 2004
Subject: Survey: Catalysis Templates
In-Reply-To: <3860FC95.D96A4F59@isogen.com>
Message-ID: <NBBBJPGDLPIHJGEHAKBAGEFPEKAA.martind@netfolder.com>

Hi Eliot,

Eliot said:
I ask because I think that Frameworks as defined by Catalysis are
similar to, if not identical to the concept of document architectures as
defined in the HyTime standard. I'm wondering whether or not talking
about "frameworks" or "model templates" would mean more to the typical
XML developer than does talking about "architectures".

Didier replies:
I am currently studying Catalysis, so before saying anything about it I'll
have to master it. However, concerning architectures, we can say that there
is more of an inheritance relationship between our particular document
structure and the "Architecture". Because of this kind of relationship, we
can say that we have here something that closely resembles a particular
pattern (ref: Gamma et al.): the interface pattern. For some, a framework is a
particular assembly of patterns targeted to a particular domain. So I guess
that it depends on the scope of the architecture whether we call it a
"pattern", a "base structure" (one that we inherit from, which makes the
architecture resemble the notion of an interface), or a "framework". I guess
that because of the scope of HyTime we can probably call it a framework
applied to the linkage domain.

That said, thanks for bringing up the subject because, for me, "architectures"
are closer to "interfaces" as found in C++ base classes or Java interfaces.
And what is an interface after all? In its simplest form, we can say that it
is a contract that assures the client that, even if one implementation does
something totally different from another, we still have the same way
to interact with all the implementations. So, for example, if I have a
particular document structure (all the elements have names totally different
from HyTime's), a HyTime application can still interact with my document
structure. Thus, the concept is similar to the notion of an interface.

Thinking aloud, I would say that a particular HyTime link is like an
interface, and if I want to interact with a varlink, I interact with the
"interface" instead of the implementation (which could be, for instance, a
topic map "assoc" element). However, the collection of interfaces that
compose the linkage domain could probably be called a framework. But because
we deal here more with data than methods, "data template framework" seems
a good name for this. We have to be cautious: for object-oriented people a
framework also implies behavior, because a framework is most of the time
an aggregate of interfaces, and these interfaces are objects. Therefore
access to the object is a mix of properties and methods. So, finally, "data
template framework" seems a good candidate here (no implicit notion of
behavior).

Simple opinion of a guy with a bad cold who hates to be sick :-)

Cheers
Didier PH Martin
mailto:martind@netfolder.com
http://www.netfolder.com




From Rajiv.Mordani at eng.sun.com  Wed Dec 22 19:35:05 1999
From: Rajiv.Mordani at eng.sun.com (Rajiv Mordani)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: <14433.8570.811500.337122@localhost.localdomain>
Message-ID: <Pine.SOL.3.96.991222112505.20108F-100000@nine>

java.util.zip.ZipException deriving from IOException is appropriate. It is
related to I/O, so it makes sense to have that. In the case of
java.net.MalformedURLException there could be an argument for why it derives
from IOException. However, SAX is a callback mechanism. It
doesn't do the I/O operations; the parser code, or the person
implementing the XML file handling, does that. So if any I/O errors occur,
they will be detected earlier on, before the callback happens. Hence I don't
see the rationale for making the SAX handlers throw IOException. Since it is
just a callback/event mechanism that runs after the I/O operations have been
done, it should derive from RuntimeException rather than IOException IMHO.
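
To illustrate the callback point, here is a small sketch in which the failure is raised by application code inside a handler and then travels out through the parser. The class name, the rejected element name, and the message are made up, and the JAXP SAXParserFactory is assumed as the way to obtain a parser.

```java
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;

class CallbackFailure {
    /** Parses xml with a handler that rejects <forbidden>; returns the failure message, or null. */
    static String parse(String xml) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName,
                                     Attributes atts) throws SAXException {
                if ("forbidden".equals(qName)) {
                    // Raised by the application inside the callback; no I/O is involved.
                    throw new SAXException("application rejected <" + qName + ">");
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
            return null;
        } catch (SAXException e) {
            return e.getMessage();
        }
    }
}
```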

- Rajiv

On Wed, 22 Dec 1999, David Megginson wrote:

> Rajiv Mordani writes:
> 
>  > I am not sure what the intent is of having IOException. I think
>  > SAXException seems appropriate to me..
> 
> I'd like to hear as many opinions on this point as possible.  What is
> the rationale for *not* deriving SAXException from IOException when
> (for example) java.net.MalformedURLException and
> java.util.zip.ZipException are derived from IOException?
> 
> I especially like the idea that higher-level libraries could have
> 
>   void importXML (String uri) throws IOException;
> 
> without the application's having any direct dependency on SAX
> interfaces.  It could accomplish the same thing by having a
> SAXExceptionAdapter that embeds the SAXException and extends
> IOException, but that seems like a lot of unnecessary fuss for a very
> common case.
> 
> Opinions?
> 
> 
> All the best,
> 
> 
> David
> 
> -- 
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 
> 
> 




From dwshin at nlm.nih.gov  Wed Dec 22 19:44:33 1999
From: dwshin at nlm.nih.gov (Dongwook Shin)
Date: Mon Jun  7 17:18:50 2004
Subject: XML Indexing Engine
References: <00bd01bf4c93$4d048ab0$252f0a0f@india.hp.com> <a04300c01b486ab2c4b5d@[171.66.196.146]>
Message-ID: <386125DA.9F8B4310@nlm.nih.gov>

Let me add a couple of indexing engines:


(1) XRS: XML Retrieval System

It creates an inverted index of XML files and saves it to
the file system. It provides a variety of structural search functions.
Demo and download:
http://dlb2.nlm.nih.gov/~dwshin/xrs.html


(2) XML Query Engine

It creates an in-memory index and performs XQL queries.
Download:
http://www.fatdog.com
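
For readers new to the term, here is a toy in-memory inverted index over XML text content. It is a sketch only and is not how either engine above is implemented: the class and method names are hypothetical, there is no structural search, and a word that the parser happens to split across two characters() calls would be indexed as two terms.

```java
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParserFactory;
import java.io.StringReader;
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class TinyXmlIndex {
    // term -> ids of the documents whose text content contains it
    final Map<String, Set<String>> postings = new TreeMap<>();

    /** Parses xml and records every word of its character data under docId. */
    void add(String docId, String xml) throws Exception {
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).toLowerCase();
                for (String term : text.split("\\W+")) {
                    if (!term.isEmpty()) {
                        postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
                    }
                }
            }
        };
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
    }

    /** Returns the ids of all documents containing term (case-insensitive). */
    Set<String> lookup(String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }
}
```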

Avi Rappoport wrote:

> I'm a search person, so my guess is that it should be a program to
> create an inverted index of the contents of a bunch of XML files for
> later retrieval.
>
> <http://www.XMLindex.com>, which has been advertising in
> InternetWorld, claims it's a "portal application that uses Xdex
> software for context-sensitive searching. The Sequoia XML-based
> Interactive Portal -- Index, Search and Retrieve XML in Any File ...
> "  But oddly enough, their web site is down right now.
>
> Hope that helps,
>
> Avi
>
> At 9:13 PM +0530 12/22/1999, Abhishek Srivastava wrote:
> >Hi,
> >
> >Can anyone tell me, what is an XML Indexing Engine ?
> --
> _______________________________________________________
> Guide to Local Site, Intranet, and Portal Search Engines:
> <http://www.searchtools.com>
>

--
Dongwook Shin
Visiting Scholar
Lister Hill National Center for Biomedical Communications
National Library of Medicine,
8600 Rockville Pike Bethesda 20894, MD
E-mail: dwshin@nlm.nih.gov
Tel: (301) 435-3257
FAX: (301) 480-3035
URL: http://dlb2.nlm.nih.gov/~dwshin




From James.Anderson at mecomnet.de  Wed Dec 22 19:51:00 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:50 2004
Subject: The per-element-type namespace partition
References: <33D189919E89D311814C00805F1991F7F4AA77@RED-MSG-08>
Message-ID: <38612A33.742F4D1@mecomnet.de>

apologies, in advance, for flogging a grandfather for the crimes of his
progeny, but...

Andrew Layman wrote:

> That disclaimer aside, I recall that a major part of the motivation for the
> distinction was the desire to allow for "global attributes," where a
> qualified attribute such as "foo:href" could have a definition and meaning
> independent of the element within which it appeared, and at the same time
> continue the current practice fostered by DTDs in which an unqualified
> attribute may have a definition and meaning local to the enclosing element.

from which it would appear that the real issue is that it was held to
be imperative that the following be XML-1.0+namespaces valid:

>         <foo:a foo:href="bar" href="bar" xmlns="x" xmlns:foo="x">

gee, there's no reason to object to that. on the other hand, when working to
implement this,  if a spec which can only be characterized as

> better or for worse, [saying] that the following two things are not necessarily
> the same:
> 
>         <foo:a foo:href="bar">
>         <foo:a href="bar">

and leaves it up to each implementation - at this point SAX2 - to say what they
are, the spec is not serving its purpose. hermeneutics aside, that something
is, finally, in a position to resolve this quandary is welcome.




From orchard at pacificspirit.com  Wed Dec 22 20:16:04 1999
From: orchard at pacificspirit.com (David Orchard)
Date: Mon Jun  7 17:18:50 2004
Subject: base
In-Reply-To: <9C998CDFE027D211B61300A0C9CF9AB4016D6A45@SNYC11309>
Message-ID: <002801bf4cb9$5a97c5e0$9730e620@n54wntw.vancouver.can.ibm.com>

The concept of multiple bases came up during discussions on Base.  At the
time, I characterized your proposal as a proposal for what I called a
"Basespace".  This is very similar to namespaces in the naming and scoping
of identifiers.  It's an interesting notion.
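
To make the "Basespace" idea concrete, here is a rough sketch of what alias
resolution could look like. The BaseSpace class and its declare/resolve
methods are hypothetical names invented for this illustration; nothing like
them appears in the XML Base or XLink drafts.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical "basespace" resolver: maps an alias to a base URI, in the
// spirit of the base:w3tr="..." proposal quoted below.
class BaseSpace {
    private final Map<String, URI> bases = new HashMap<>();

    // Registers a base URI under an alias, much as xmlns:prefix binds a namespace.
    void declare(String alias, String baseUri) {
        bases.put(alias, URI.create(baseUri));
    }

    // Resolves href against the base registered for the given alias.
    String resolve(String alias, String href) {
        URI base = bases.get(alias);
        if (base == null) {
            throw new IllegalArgumentException("undeclared base alias: " + alias);
        }
        return base.resolve(href).toString();
    }
}
```

Note that standard relative-URI resolution replaces the last path segment of
the base, so a base written as "http://www.w3.org/TR" (no trailing slash)
would resolve "xmlbase" to "http://www.w3.org/xmlbase". The sketch assumes
bases are declared with a trailing slash.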

Cheers,
Dave Orchard
XLink co-editor

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Ogievetsky, Nikita
> Sent: Wednesday, December 22, 1999 11:09 AM
> To: 'xml-dev@ic.ac.uk'
> Subject: xml:base
>
>
>
> 2 basic questions about XML Base :-) :
>
> 1. If I specify xml:base on an XML fragment,
>  is there no way to reference local resources within this fragment?
>  This is poor...
> 2. What if I want 2 or 3 bases? Impossible?
> like this:
> <my:a base="mylinks" href="some.html">
> 	<my:img base="myimages" src="img.gif"/>
> </my:a>
>
> I think that a reasonable solution would be
> to allow aliasing xml:base as is done with
> namespaces. Maybe something like this:
> <AAA	xmlns="yyy.yyy.yyy"
> 	xmlns:xxx="xxx.xxx.xxx"
> 	base:w3tr="http://www.w3.org/TR"
> 	base:cot="http://www.cogx.com/xml99">
> <slink base="cot" href="xity" title="XLink visualiser"/>
> <slink base="w3tr" href="xmlbase" title="XML Base Working Draft"/>
> <slink base="w3tr" href="xlink" title="XML Link Working Draft"/>
> ...
> </AAA>
>
> Actually I have been sinning like this for quite some time.
> It works, it is easy, I enjoy it.
>
> If somebody (and probably quite a few people) has problems with this,
> base elements can be specified individually:
>
> <AAA>
> 	<base alias="w3tr" href="http://www.w3.org/TR"/>
> 	<base alias="cot" href="http://www.cogx.com/xml99"/>
> 	<slink base="cot" href="xity" title="XLink visualiser"/>
> 	<slink base="w3tr" href="xmlbase" title="XML Base Working Draft"/>
> 	<slink base="w3tr" href="xlink" title="XML Link Working Draft"/>
> </AAA>
>
> With best regards,
>
> Nikita Ogievetsky
> nogievet@offsight.com
> http://www.cogx.com




From donpark at docuverse.com  Wed Dec 22 20:30:42 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: <14433.8570.811500.337122@localhost.localdomain>
Message-ID: <000c01bf4cbb$948b0f80$d1940e18@smateo1.sfba.home.com>

David,

I don't see any real problem with deriving SAXException
from IOException.  It is not clear cut whether parsing is
i/o or not.  Having SAXException derive from IOException
makes things easier for me, so I vote for less pain.

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From david at megginson.com  Wed Dec 22 21:29:03 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: DeclHandler
In-Reply-To: Rajiv Mordani's message of "Wed, 22 Dec 1999 11:00:34 -0800 (PST)"
References: <Pine.SOL.3.96.991222105955.20108C-100000@nine>
Message-ID: <m34sdad5t7.fsf@localhost.localdomain>

Rajiv Mordani <Rajiv.Mordani@eng.sun.com> writes:

> I am not sure what the intent is of having IOException. I think
> SAXException seems appropriate to me..

My plan is to have

  public class SAXException extends IOException
  {
  }

We could still have the callbacks throw SAXException, if that's what
people prefer.
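
As a rough illustration of what this would mean for calling code, the sketch
below uses a stand-in class. ProposedSAXException and CatchDemo are
hypothetical names chosen for this example; the real org.xml.sax.SAXException
is not touched.

```java
import java.io.IOException;

// Hypothetical stand-in for the proposed class hierarchy.
class ProposedSAXException extends IOException {
    ProposedSAXException(String message) {
        super(message);
    }
}

class CatchDemo {
    // Simulates a parse that fails either in the XML itself or in the transport.
    static void parse(boolean malformed) throws IOException {
        if (malformed) {
            throw new ProposedSAXException("element not closed");
        }
        throw new IOException("connection reset");
    }

    // One catch clause handles both failures; instanceof still tells them apart.
    static String describe(boolean malformed) {
        try {
            parse(malformed);
            return "ok";
        } catch (IOException e) {
            String kind = (e instanceof ProposedSAXException) ? "parse error: " : "i/o error: ";
            return kind + e.getMessage();
        }
    }
}
```

The trade-off is that code which catches IOException without checking the
subtype can no longer distinguish a malformed document from a failed read.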


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From andrewl at microsoft.com  Wed Dec 22 21:32:12 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:50 2004
Subject: The per-element-type namespace partition
Message-ID: <33D189919E89D311814C00805F1991F7F4AA95@RED-MSG-08>

There may be a misunderstanding engendered by my wording.  When I wrote that


> the following two things are not necessarily
> the same:
> 
>         <foo:a foo:href="bar">
>         <foo:a href="bar">

I was recognizing that the two names may actually have the same referent or
might have different referents, but that one cannot tell which simply by
unaided inspection of the names, in the same way that two distinct URLs
might actually refer to the same document, but one cannot generally
determine that fact by mere inspection of the two URLs.  This point is not
actually an invention of the namespaces specification, but simply a
recognition that names whose interpretation is not part of the specification
may behave that way, as URIs do.

No API can claim that the two attributes are necessarily equivalent or
conversely distinct without being sometimes flat wrong.  I do not mean this
as a criticism of the good intentions of the SAX contributors, but just a
caution that if SAX is to faithfully reflect what a document author wrote,
for all documents, it cannot add peculiar interpretation to namespaces.
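
A small sketch may help show what "faithfully reflecting what the author
wrote" can look like at the API level. It uses the JAXP and SAX2 interfaces as
they later shipped with Java (an assumption relative to the draft under
discussion), and the AttributeNames class and the urn:example:x namespace name
are placeholders invented for the example.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class AttributeNames {
    // Returns "{namespace name}localName" for every attribute the parser reports.
    static List<String> expandedNames(String xml) {
        List<String> names = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.newSAXParser().parse(
                new InputSource(new StringReader(xml)),
                new DefaultHandler() {
                    @Override
                    public void startElement(String uri, String local, String qName, Attributes atts) {
                        for (int i = 0; i < atts.getLength(); i++) {
                            names.add("{" + atts.getURI(i) + "}" + atts.getLocalName(i));
                        }
                    }
                });
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
        return names;
    }
}
```

The parser reports the prefixed attribute with a namespace name and the
unprefixed one with none. Whether the two then mean the same thing is left to
the application, which is the point made above.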

-----Original Message-----
From: james anderson [mailto:James.Anderson@mecomnet.de]
Sent: Wednesday, December 22, 1999 11:45 AM
To: xml-dev@ic.ac.uk
Subject: Re: The per-element-type namespace partition


apologies, in advance, for flogging a grandfather for the crimes of his
progeny, but...

Andrew Layman wrote:

> That disclaimer aside, I recall that a major part of the motivation for the
> distinction was the desire to allow for "global attributes," where a
> qualified attribute such as "foo:href" could have a definition and meaning
> independent of the element within which it appeared, and at the same time
> continue the current practice fostered by DTDs in which an unqualified
> attribute may have a definition and meaning local to the enclosing element.

from which it would appear that the real issue is that it was held to
be imperative that the following be XML-1.0+namespaces valid:

>         <foo:a foo:href="bar" href="bar" xmlns="x" xmlns:foo="x">

gee, there's no reason to object to that. on the other hand, when working to
implement this,  if a spec which can only be characterized as

> better or for worse, [saying] that the following two things are not necessarily
> the same:
> 
>         <foo:a foo:href="bar">
>         <foo:a href="bar">

and leaves it up to each implementation - at this point SAX2 - to say what they
are, the spec is not serving its purpose. hermeneutics aside, that something
is, finally, in a position to resolve this quandary is welcome.




From david at megginson.com  Wed Dec 22 21:31:23 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: Rajiv Mordani's message of "Wed, 22 Dec 1999 11:34:53 -0800 (PST)"
References: <Pine.SOL.3.96.991222112505.20108F-100000@nine>
Message-ID: <m31z8ed5pn.fsf@localhost.localdomain>

Rajiv Mordani <Rajiv.Mordani@eng.sun.com> writes:

> java.util.zip.ZipException deriving from IOException is
> appropriate. It is related to i/o so it makes sense to have that.

I'd be interested in a clear statement of the criteria for this
distinction -- you get a ZipException, presumably, because of an error
in the format of the zip file you're reading from; you get a
SAXException because of an error in the format of the XML file you're
reading from.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From rja at arpsolutions.demon.co.uk  Wed Dec 22 21:46:37 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: DeclHandler
References: <Pine.SOL.3.96.991222105955.20108C-100000@nine> <m34sdad5t7.fsf@localhost.localdomain>
Message-ID: <00cf01bf4cc5$a822e270$4a5eedc1@arp01>

> We could still have the callbacks throw SAXException, if that's what
> people prefer.

Yes please.  Exceptions really don't work well in VB and several other
languages.

Actually, anything that is language specific should be chopped initially in
my view.

Regards,

Rich.






From donpark at docuverse.com  Wed Dec 22 22:25:35 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Parser interface
In-Reply-To: <14432.60462.189392.681294@localhost.localdomain>
Message-ID: <000001bf4ccb$a44ac9a0$d1940e18@c1033339-a.smateo1.sfba.home.com>

David,

Any chance of adding parseFragment?  It could come in
real handy for applications that primarily deal with
fragments instead of documents.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From simonstl at simonstl.com  Wed Dec 22 22:32:07 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Parser interface
In-Reply-To: <000001bf4ccb$a44ac9a0$d1940e18@c1033339-a.smateo1.sfba.hom
 e.com>
References: <14432.60462.189392.681294@localhost.localdomain>
Message-ID: <199912222232.RAA00755@hesketh.net>

At 02:26 PM 12/22/99 -0800, Don Park wrote:
>Any chance of adding parseFragment?  It could come in
>real handy for applications that primarily deal with
>fragments instead of documents.

I'll second that motion....

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From mrc at allette.com.au  Wed Dec 22 23:13:22 1999
From: mrc at allette.com.au (Marcus Carr)
Date: Mon Jun  7 17:18:50 2004
Subject: DTD design
References: <OFD51A8AB2.5799A032-ONC125684F.00373657@crpht.lu>
Message-ID: <38615B2C.EF7A98A1@allette.com.au>


heiko.grussbach@crpht.lu wrote:

> I have the following problem, I want to define an element E that may
> contain elements A,B,C. Order should be insignificant and A,B and C are all
> optional. Furthermore, A,B and C may each be replaced by X.

If more elements were to be added as children of E, the exponential growth in
combinations would make maintenance of the content model untenable, so it's
reasonable to conclude that this is not an appropriate approach from the
outset. I would use something like:

<!ELEMENT E   (A | B | C | X)*>

and use a schema or XSL to report when elements behave in a way that is
contrary to your intentions but not expressed by the content model.

BTW, this question is probably more suited to 'General discussion of Extensible Markup Language <XML-L@listserv.heanet.ie>'.


--
Regards,

Marcus Carr                      email:  mrc@allette.com.au
___________________________________________________________________
Allette Systems (Australia)      www:    http://www.allette.com.au
___________________________________________________________________
"Everything should be made as simple as possible, but not simpler."
       - Einstein




From nogievet at offsight.com  Thu Dec 23 02:02:42 1999
From: nogievet at offsight.com (Nikita Ogievetsky)
Date: Mon Jun  7 17:18:50 2004
Subject: base
Message-ID: <01b101bf4cbf$6eae89a0$0101010a@COGITECH>


Basespace - sounds good.
By the way, a slightly off-topic question:
is there any active xlink-oriented mailing list?
(xlxp-dev is silent...)
Thanks,

Nikita Ogievetsky
http://www.cogx.com
nogievet@offsight.com


----------------------------------------------------------------
Dave Orchard wrote:

The concept of multiple bases came up during discussions on Base.  At the
time, I characterized your proposal as a proposal for what I called a
"Basespace".  This is very similar to namespaces in the naming and scoping
of identifiers.  It's an interesting notion.

Cheers,
Dave Orchard
XLink co-editor

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Ogievetsky, Nikita
> Sent: Wednesday, December 22, 1999 11:09 AM
> To: 'xml-dev@ic.ac.uk'
> Subject: xml:base
>
>
>
> 2 basic questions about XML Base :-) :
>
> 1. If I specify xml:base on an XML fragment,
>  is there no way to reference local resources within this fragment?
>  This is poor...
> 2. What if I want 2 or 3 bases? Impossible?
> like this:
> <my:a base="mylinks" href="some.html">
>  <my:img base="myimages" src="img.gif"/>
> </my:a>
>
> I think that a reasonable solution would be
> to allow aliasing xml:base as is done with
> namespaces. Maybe something like this:
> <AAA xmlns="yyy.yyy.yyy"
>  xmlns:xxx="xxx.xxx.xxx"
>  base:w3tr="http://www.w3.org/TR"
>  base:cot="http://www.cogx.com/xml99">
> <slink base="cot" href="xity" title="XLink visualiser"/>
> <slink base="w3tr" href="xmlbase" title="XML Base Working Draft"/>
> <slink base="w3tr" href="xlink" title="XML Link Working Draft"/>
> ...
> </AAA>
>
> Actually I have been sinning like this for quite some time.
> It works, it is easy, I enjoy it.
>
> If somebody (and probably quite a few people) has problems with this,
> base elements can be specified individually:
>
> <AAA>
>  <base alias="w3tr" href="http://www.w3.org/TR"/>
>  <base alias="cot" href="http://www.cogx.com/xml99"/>
>  <slink base="cot" href="xity" title="XLink visualiser"/>
>  <slink base="w3tr" href="xmlbase" title="XML Base Working Draft"/>
>  <slink base="w3tr" href="xlink" title="XML Link Working Draft"/>
> </AAA>
>
> With best regards,
>
> Nikita Ogievetsky
> nogievet@offsight.com
> http://www.cogx.com




From jenghan_hsieh at mail.hgc.com.tw  Thu Dec 23 05:02:26 1999
From: jenghan_hsieh at mail.hgc.com.tw (=?Big5?B?wcKsRr+r?=)
Date: Mon Jun  7 17:18:50 2004
Subject: schematron-frame.html
Message-ID: <51C38DDEF19ED211AAA10008C71E612FC3062B@gigaexchange.hgc.com.tw>

  
 Interesting XML Document Validation ...

 http://www.ascc.net/xml/resource/schematron/DavidCarlisle/schematron-frame.html

From vita96 at se.its-sby.edu  Thu Dec 23 06:16:42 1999
From: vita96 at se.its-sby.edu (Vita Prihatoni Purnomo)
Date: Mon Jun  7 17:18:50 2004
Subject: Asking...
Message-ID: <001b01bf4d0d$2d5bae80$f50a7e0a@betengan.its-sby.edu>

Dear all,

I'm new to XML. Could you point me to some online publications that I can read, so that I can follow your comments here?

Thank you,

Vita Prihatoni Purnomo
From ricko at allette.com.au  Thu Dec 23 06:20:00 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:50 2004
Subject: DTD design
Message-ID: <004c01bf4d11$546262e0$1cf96d8c@NT.JELLIFFE.COM.AU>


From: heiko.grussbach@crpht.lu <heiko.grussbach@crpht.lu>

>I have the following problem, I want to define an element E that may
>contain elements A,B,C. Order should be insignificant and A,B and C are all
>optional. Furthermore, A,B and C may each be replaced by X.

This may be a kind of "variant GI" issue.  Most schema languages do not
support it well.  Grammar-based schema languages that do not
provide some explicit support may have their content models explode,
as you mention.

XML Schemas allows "&" at the top level, rather like SGML. But that does
not help your problem. You may find that your problem is actually one
of subclassing, in which case XML Schemas may help when they are
implemented.

In general, when you have complex problems like this, you may find
the "architecture" approach useful.  In this approach, you make up
a DTD for exactly what you want to validate:
   ( (A, (B | C) ?) | ( B, (C | A)?) | ( C, (A | B)? ))
Then you define a transformation (e.g. using XSL) to create a version
of your document which uses these structures.  One document may
have multiple architectures like this.  This is a very powerful method
of validating many structures that are unavailable to normal
validation or modeling.

If you are more interested in validation rather than modeling, then
try Schematron
http://www.ascc.net/xml/resource/schematron/schematron.html
An error browser is available free.

The appropriate pattern for your model is this:

<pattern name="Heiko's Problem">
    <rule context="E">
        <assert test="count(*) = count(A | B | C | X)"
        >The only subelements of E are A, B, C, or X.</assert>
        <assert test="count(*) = 2"
        >The element E must have 2 subelements.</assert>
        <assert test="count(A) &lt; 2"
        >The element E can only have zero or one of subelement A</assert>
        <assert test="count(B) &lt; 2"
        >The element E can only have zero or one of subelement B</assert>
        <assert test="count(C) &lt; 2"
        >The element E can only have zero or one of subelement C</assert>
        <assert test="count(X) &lt;= 2"
        >The element E can only have zero, one or two of subelement X</assert>
    </rule>
</pattern>

To add a new element takes only an extra assert statement (and an update of
the counts).  Compare this to a content model, where each new element
may double the size of the content model (depending on what constraints
you have).  Note that this is not at all a grammatical view of what you
are doing in your document: for some types of documents, the grammatical
abstraction is not helpful or appropriate.
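
For anyone who wants to try the assertions without a Schematron tool, the
same XPath tests can be evaluated directly. The following is only a sketch
using the javax.xml.xpath API from later Java releases; the ECheck class is a
hypothetical helper, and the tests are copied from the pattern above with
the &lt; escapes written out.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class ECheck {
    // The assertion tests from the pattern, evaluated with E as the context node.
    static final String[] TESTS = {
        "count(*) = count(A | B | C | X)",
        "count(*) = 2",
        "count(A) < 2",
        "count(B) < 2",
        "count(C) < 2",
        "count(X) <= 2"
    };

    // Returns true when every assertion holds for the root element of the document.
    static boolean valid(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            XPath xpath = XPathFactory.newInstance().newXPath();
            for (String test : TESTS) {
                Boolean ok = (Boolean) xpath.evaluate(
                    test, doc.getDocumentElement(), XPathConstants.BOOLEAN);
                if (!ok) {
                    return false; // first failed assertion rejects the document
                }
            }
            return true;
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

Adding a new permitted element means adding one string to TESTS and
adjusting the first test, which mirrors the maintenance argument made here.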

Rick Jelliffe




From ricko at allette.com.au  Thu Dec 23 06:27:32 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:50 2004
Subject: attribute values as qnames?
Message-ID: <005201bf4d12$685e8c00$1cf96d8c@NT.JELLIFFE.COM.AU>

 
From: Mark D. Anderson <mda@discerning.com> 

>i noticed that xmlschema is using qnames in attribute values:
><schema targetNamespace="http://www.myco.com/MYPO"
>        xmlns="http://www.w3.org/TR/1999/WD-xmlschema-1-19991217"
>        xmlns:po="http://www.myco.com/MYPO">
>
> <element name="PurchaseOrder" type="po:PurchaseOrderType"/>
>
>This is something I've wanted to, and thought was not allowed,
>so I dug up xml-names, and found only this, in section 6:
>  "Strictly speaking, attribute values declared to be of types ID,
>  IDREF(S), ENTITY(IES), and NOTATION are also Names, and thus
>  should be colon-free."
>
>Of course, other types of attributes can have a colon, but regardless
>there is no intimation that the prefixes would be expanded (and in
>fact they shouldn't be, for an arbitrary attribute).
 
To reference the name of an element type in an
attribute, one can use the namespace prefix: this is what
XPaths do, for example.  So the XML schema processor 
may indeed have to have the xmlns prefix->URI mappings 
available. A namespace processor will not resolve values
of attributes, merely names of elements and attributes (AFAIK,
but I am easily confusable.)

That type attribute is not an ID, IDREF, ENTITY or NOTATION
but a %QName;  (i.e., CDATA) so it conforms to the XML NS Spec.
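
As a sketch of what "having the prefix->URI mappings available" could mean
for an application, the code below expands a QName found in an attribute
value by asking the DOM for the in-scope binding. It assumes the DOM Level 3
lookupNamespaceURI method available in later Java releases; the QNameValue
class is a hypothetical helper, not part of any schema processor.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class QNameValue {
    // Expands the "type" attribute of the first <element> element in the
    // document to "{namespace name}localName", using in-scope declarations.
    static String expandType(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Element el = (Element) factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getElementsByTagNameNS("*", "element").item(0);
            String value = el.getAttribute("type");
            int colon = value.indexOf(':');
            // A null prefix asks for the default namespace.
            String prefix = colon < 0 ? null : value.substring(0, colon);
            String local = value.substring(colon + 1);
            return "{" + el.lookupNamespaceURI(prefix) + "}" + local;
        } catch (Exception e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

The parser itself leaves the attribute value as the plain string
"po:PurchaseOrderType"; the expansion is done by the application.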

Rick Jelliffe




From Rameshcs at India.mastech.com  Thu Dec 23 06:56:22 1999
From: Rameshcs at India.mastech.com (Ramesh C S)
Date: Mon Jun  7 17:18:50 2004
Subject: Learning XML - Tools needed
Message-ID: <F2A18CA8D306D3118BF900805F1593BE106113@MASCHNEXC01>

Dear Friends,

I am new to XML. I know HTML, Cascading Style Sheets (CSS) and DHTML using
JavaScript.

I want to learn XML. I have read articles about the XML 1.0 spec, among
others.

They talk about DTDs, XSL, etc.

I am still stuck on how to proceed: which tools I need to write XML and how
to implement it.

So, can somebody please guide me on XML with some sample code, and tell me
which tools I need to write and view an XML document?

All suggestions are welcome.

Thanks in advance.


Regards,

(S.Ramesh)
rameshcs@india.mastech.com <mailto:rameshcs@india.mastech.com> 




From Bruce.Durling at equifax.com  Thu Dec 23 09:20:02 1999
From: Bruce.Durling at equifax.com (Bruce.Durling@equifax.com)
Date: Mon Jun  7 17:18:50 2004
Subject: psgml namespaces and schemas
Message-ID: <85256850.0032FAA1.00@noteswetc15.fin.equifax.com>


Hello,

I've been using psgml for a couple of weeks now and I have to say it is
absolutely fantastic.

Does anyone know if it supports XML schemas and namespaces? I'm trying to do
some XSL work and it would be very useful. If psgml is the wrong package for
doing it can anyone suggest another?

cheers,
bld




From Daniel.Brickley at bristol.ac.uk  Thu Dec 23 10:30:20 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:50 2004
Subject: A weird question?
In-Reply-To: <E19A882C6CD5D211A8A70008C75B6AF40122CC96@SFOEXCH_MAIL1>
Message-ID: <Pine.GHP.4.21.9912231019220.20207-100000@mail.ilrt.bris.ac.uk>

On Wed, 22 Dec 1999, Jeff Sussna wrote:

> I'm not entirely sure myself this makes sense, but I think it does. Has
> anyone thought about an RDF vocabulary for describing XML Schema documents? 

This is pretty much what the DCD proposal (proof of concept?) 
from IBM/Microsoft shows -- http://www.w3.org/TR/NOTE-dcd  
Or rather, it makes explicit some of the assertions about
elements/attributes etc that XML Schema documents make. Or do you mean
the use of RDF to describe administrative and resource-discovery
metadata about the schemas? (eg. title/description/subject/creator etc?)

There are also quite a few people interested in reflecting the
datatypes component of XML Schema into the RDF data model; I think that should be
a reasonably straightforward task. That too would in a sense be 
'using RDF for describing XML Schemas', ie. given some XML Schema
datatype descriptions, re-describe that info as a set of RDF statements.

Dan




From rwaldin at pacbell.net  Thu Dec 23 11:05:40 1999
From: rwaldin at pacbell.net (Ray Waldin)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: Should SAXException extend IOException?
References: <14432.57765.516268.430263@localhost.localdomain>
	 <Pine.SOL.3.96.991222105955.20108C-100000@nine> <14433.8570.811500.337122@localhost.localdomain>
Message-ID: <386202AB.A458BD6C@pacbell.net>

from David's SAX2 Exceptions proposal:
> 5. Have all callbacks that formerly threw SAXException throw
>    IOException instead.  This should help to avoid a lot of exception
>    tunneling.

I don't see the point in a Handler throwing an IOException to a Parser in most
cases.  That is what this item implies, right?  What could a DocumentHandler
mean by throwing an IOException during a call to startElement?  The I/O has
already occurred before the handler gets involved.  The only cases I can think
of where it *does* make sense are:

- EntityResolver.resolveEntity(), where the handler is taking part in IO
- ErrorHandler.*(), where a handler is passed an exception and may reflect it
back to the parser

-Ray



From david at megginson.com  Thu Dec 23 12:06:44 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:50 2004
Subject: SAX2: DeclHandler
In-Reply-To: "Richard Anderson"'s message of "Wed, 22 Dec 1999 21:43:59 -0000"
References: <Pine.SOL.3.96.991222105955.20108C-100000@nine> <m34sdad5t7.fsf@localhost.localdomain> <00cf01bf4cc5$a822e270$4a5eedc1@arp01>
Message-ID: <m37li5n9ph.fsf@localhost.localdomain>

"Richard Anderson" <rja@arpsolutions.demon.co.uk> writes:

> > We could still have the callbacks throw SAXException, if that's what
> > people prefer.
> 
> Yes please.  Exceptions really don't work well in VB and several other
> languages.

I'm not quite sure I understand -- if exceptions don't work well in a
language, then neither a general IOException nor a specific
SAXException would be appropriate.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec 23 12:05:36 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: Ray Waldin's message of "Thu, 23 Dec 1999 03:08:27 -0800"
References: <14432.57765.516268.430263@localhost.localdomain> <Pine.SOL.3.96.991222105955.20108C-100000@nine> <14433.8570.811500.337122@localhost.localdomain> <386202AB.A458BD6C@pacbell.net>
Message-ID: <m3aen1n9rg.fsf@localhost.localdomain>

Ray Waldin <rwaldin@pacbell.net> writes:

> from David's SAX2 Exceptions proposal:
> > 5. Have all callbacks that formerly threw SAXException throw
> >    IOException instead.  This should help to avoid a lot of exception
> >    tunneling.
> 
> I don't see the point in a Handler throwing an IOException to a
> Parser in most cases.  That is what this item implies, right?  What
> could a DocumentHandler mean by throwing an IOException during a
> call to startElement?  The I/O has already occurred before the
> handler gets involved.

Well, that's really domain-specific.  It could be that a handler in an
XML I/O library does additional I/O (such as retrieving an external
bitmap) which, from the top-level application's point of view, is
still part of the same I/O process.

In the end, though, this is a relatively minor point.  The important
point, for me, is that SAXException extend IOException -- I think that 
it would be convenient to have the callbacks throw IOException rather
than SAXException (otherwise, other IOExceptions will have to tunnel), 
but it's not a show-stopper if everyone else thinks it's a bad idea.
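
To make the tunneling concern concrete, here is a minimal Java sketch. It is
not part of the SAX2 proposal; it assumes the SAX2 DefaultHandler class and
the JAXP SAXParserFactory that ship with current JDKs, and the names
TunnelDemo, FetchingHandler and parseAndUnwrap are hypothetical. Because the
callback may only throw SAXException, the handler has to wrap its IOException,
and the caller has to unwrap it again:

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TunnelDemo {
    // Hypothetical handler: pretends that fetching an external image fails.
    static class FetchingHandler extends DefaultHandler {
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (qName.equals("image")) {
                IOException cause = new IOException("cannot fetch " + atts.getValue("src"));
                // The callback signature only allows SAXException,
                // so the I/O failure is tunneled inside one.
                throw new SAXException(cause);
            }
        }
    }

    // Parses the text; returns the message of a tunneled IOException,
    // or null when the handler raised nothing.
    static String parseAndUnwrap(String xml) throws Exception {
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)),
                new FetchingHandler());
            return null;
        } catch (SAXException e) {
            Exception embedded = e.getException(); // the wrapped exception, if any
            if (embedded instanceof IOException) {
                return embedded.getMessage();
            }
            throw e; // a genuine parsing problem, not a tunneled one
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parseAndUnwrap("<doc><image src='logo.png'/></doc>"));
    }
}
```

If the callbacks were declared to throw IOException instead, the wrapping and
unwrapping steps in this sketch would not be needed.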


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec 23 12:07:53 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Parser interface
In-Reply-To: "Don Park"'s message of "Wed, 22 Dec 1999 14:26:50 -0800"
References: <000001bf4ccb$a44ac9a0$d1940e18@c1033339-a.smateo1.sfba.home.com>
Message-ID: <m34sd9n9nk.fsf@localhost.localdomain>

"Don Park" <donpark@docuverse.com> writes:

> Any chance of adding parseFragment?  It could come in
> real handy for applications that primarily deals with
> fragments instead of documents.

I see no problem with people designing SAX parsers that happen to
parse fragments, just as I see no problem with designing SAX parsers
that happen to parse LaTeX or RTF, but I wouldn't want to push that
feature onto all of them in the general case.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec 23 12:11:06 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:51 2004
Subject: psgml namespaces and schemas
In-Reply-To: Bruce.Durling@equifax.com's message of "Thu, 23 Dec 1999 09:19:13 +0000"
References: <85256850.0032FAA1.00@noteswetc15.fin.equifax.com>
Message-ID: <m31z8dn9i8.fsf@localhost.localdomain>

Bruce.Durling@equifax.com writes:

> I've been using psgml for a couple of weeks now and I have to say it is
> absolutely fantastic.
> 
> Does anyone know if it supports XML schemas and namespaces? I'm trying to do
> some XSL work and it would be very useful. If psgml is the wrong package for
> doing it can anyone suggest another?

PSGML does not do validation against XML schemas, and to tell the truth,
I wouldn't encourage anyone to do it.  Emacs is a wonderful tool -- I
use it to do all of my e-mail, news, programming, and technical
writing -- but it's a single-threaded monolithic mess, and it might be
nice to get a more modern open-source XML editing tool under
development.

PSGML allows you to edit XML documents that happen to contain
Namespaces, but it doesn't know anything about the Namespaces view.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From l-arcini at uniandes.edu.co  Thu Dec 23 12:27:54 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun  7 17:18:51 2004
Subject: Survey: Catalysis Templates
Message-ID: <Pine.GSO.3.96.991223070521.28777A-100000@isis>

[Sorry if this arrives more than once, I've had problems with latest
postings]

Hi,

> <snip/>....., but its definition of "framework" is somewhat
> different from the more general notion of frameworks in UML.

Right, it is different from most definitions of "framework". Basically
what it says is that a framework is a template model of something (anything:
a spec, a design, ...) plus information about assumptions on the
replacement of parameters, all inside a package.

> I ask because I think that Frameworks as defined by Catalysis are
> similar to, if not identical to the concept of document architectures as
> defined in the HyTime standard. 

Right. Nice observation. A particular subset of "frameworks", namely
document model templates, is quite similar to what is defined by HyTime.

>I'm wondering whether or not talking
> about "frameworks" or "model templates" would mean more to the typical
> XML developer than does talking about "architectures".

*finally* getting to your question :), I think the term "model templates"
could mean more to the typical XML programmer with no architectures
experience than the term "architectures", because "architectures" is far more
commonly used to mean a "high level structural/functional view of a system". As
for the term "framework", IMHO the way it's used in Catalysis is too
restrictive... anyway, I don't think it would do as well as "model
templates", also because other meanings are more strongly rooted.

Best,
	Fabio 

> Thanks,
> 
> Eliot
> 

--
Fabio Arciniegas A.		        Viaduct Technologies, Inc.
fabio@viaduct.com			Software Engineer
Interests: XML, Wittgenstein and just about everything in between.
Oblique Strategy of the day: 	      "Abandon normal instruments"


--
Fabio Arciniegas Arjona              
l-arcini@uniandes.edu.co            
                                



From rja at arpsolutions.demon.co.uk  Thu Dec 23 13:46:49 1999
From: rja at arpsolutions.demon.co.uk (Richard Anderson)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: DeclHandler
References: <Pine.SOL.3.96.991222105955.20108C-100000@nine> <m34sdad5t7.fsf@localhost.localdomain> <00cf01bf4cc5$a822e270$4a5eedc1@arp01> <m37li5n9ph.fsf@localhost.localdomain>
Message-ID: <002c01bf4d4b$b8f9d550$b6010180@p197>

> > > We could still have the callbacks throw SAXException, if that's what
> > > people prefer.
> >
> > Yes please.  Exceptions really don't work well in VB and several other
> > languages.
>
> I'm not quite sure I understand -- if exceptions don't work well in a
> language, then neither a general IOException nor a specific
> SAXException would be appropriate.

Provided we just have a class that represents an exception, which is passed
via a callback and doesn't depend on the usual Java/C++ throw() mechanisms,
I'll be happy.

Also, if we don't derive any of the SAX interfaces from anything that isn't
available in all languages, I'll be even happier.

Cheers.

Rich.




From Daniel.Brickley at bristol.ac.uk  Thu Dec 23 13:54:10 1999
From: Daniel.Brickley at bristol.ac.uk (Dan Brickley)
Date: Mon Jun  7 17:18:51 2004
Subject: A weird question?
In-Reply-To: <004501bf4d48$915e8ca0$e6ea7392@honeybee>
Message-ID: <Pine.GHP.4.21.9912231344520.21487-100000@mail.ilrt.bris.ac.uk>

On Thu, 23 Dec 1999, Sankar Virdhagriswaran wrote:

> > There are also quite a few people interested in reflecting the
> > datatypes component of XML Schema into the RDF data model; I think that
> > should be
> 
> This is an interesting idea. The flow I would see would be the other way
> around however. One would describe the XML-Schema descriptions in RDF from
> which the XML-Schema descriptions are generated. This approach would fit
> with the way folks today use modeling tools such as UML.

This would also be interesting, but is at the instance-data level,
i.e. we're talking slightly at cross purposes. When I talk about reflecting
from XML into RDF, I meant the constructs defined in the XML datatype
spec (facets etc) not particular application schemas or data valid by
those schemas.

At the W3C spec level, RDF deferred datatyping issues so they could be
dealt with once and for all across all XML apps by XML Schema. Now we
(more or less) have this, it is natural to explore a mapping of the
concepts defined in XML datatypes spec into RDF data graphs. Once some
mapping has been established, applications should be able to go both
ways, i.e. first we reflect the XML datatype machinery into an
RDF-processable representation, _then_ we can (hopefully) reflect
datatyped XML information into RDF and vice-versa.  

Dan




From tpassin at idsonline.com  Thu Dec 23 14:08:24 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:51 2004
Subject: XML Rendering problem
References: <Pine.SOL.3.96.991222014122.19744B-100000@nine>
Message-ID: <007101bf4d4f$d19f5ae0$41fbb1cd@tomshp>


Rajiv Mordani wrote:

> & indicates entities.. So if you need to show the & you should put &amp;
> in place of the &.
>

The HTML standard discusses using &amp; in a URL and says it is legal:

"The URI that is constructed when a form is submitted may be used as an
anchor-style link (e.g., the href attribute for the A element).
Unfortunately, the use of the "&" character to separate form fields
interacts with its use in SGML attribute values to delimit character entity
references. For example, to use the URI "http://host/?x=1&y=2" as a linking
URI, it must be written <A href="http://host/?x=1&#38;y=2"> or <A
href="http://host/?x=1&amp;y=2">.

We recommend that HTTP server implementors, and in particular, CGI
implementors support the use of ";" in place of "&" to save authors the
trouble of escaping "&" characters in this manner. "

Spelling it out, you would have in the stylesheet:

<xsl:template match="image">
    <a href="/NASApp/portal/home?tmpl=browse&amp;url=next">
        <xsl:value-of select="imageurl"/>
    </a>
</xsl:template>
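
The same behaviour can be checked outside XSL with a small Java sketch. It
assumes the JAXP DocumentBuilder API bundled with current JDKs; AmpersandDemo
and hrefOf are hypothetical names used only for illustration. The escaped
form parses and yields a plain "&" in the attribute value, while the raw form
is rejected as not well-formed:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.SAXParseException;

class AmpersandDemo {
    // Parses a one-element document and returns its href attribute value,
    // or null when the parser rejects the input as not well-formed.
    static String hrefOf(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        try {
            Document doc = builder.parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            return doc.getDocumentElement().getAttribute("href");
        } catch (SAXParseException e) {
            // A bare '&' starts an entity reference that never ends with ';'.
            return null;
        }
    }

    public static void main(String[] args) throws Exception {
        String escaped = "<a href=\"/NASApp/portal/home?tmpl=browse&amp;url=next\"/>";
        String raw = "<a href=\"/NASApp/portal/home?tmpl=browse&url=next\"/>";
        System.out.println(hrefOf(escaped)); // attribute value with a plain '&'
        System.out.println(hrefOf(raw));     // null: input was rejected
    }
}
```

So the parsers are behaving as the XML specification requires; the escaping
happens only in the source text, and the application sees the unescaped URL.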

> On Wed, 22 Dec 1999, Georg Edelmann wrote:
>
<snip/>
> > So here is my problem:
> >
> > The following XSL file does not work when rendered with either the IBM or
> > the Sun XML parsers (using either Xalan or Saxon as XSL renderer):
> >
> > ----------------------------------------- stylesheet start
> > <xsl:stylesheet
> >      xmlns:xsl="http://www.w3.org/TR/WD-xsl"
> >      xmlns="hhtp://www.w3.org/TR/REC-html40"
> >      result-ns="">
> >
> > <xsl:template match="text()">
> > </xsl:template>
> >
> > <xsl:template match="image">
> >     <a href="/NASApp/portal/home?tmpl=browse&url=next">
> >         <xsl:value-of select="imageurl"/>
> >     </a>
> > </xsl:template>
> >
> > </xsl:stylesheet>
> > ----------------------------------------- stylesheet end
> >
> > The problem lies in the line with the href parameter. The parser
> > interprets '&url' as an HTML command and wants to have a trailing ';'.
> > It does not understand that the '&' separates two parameters in the URL.
> > In my opinion that is a serious bug in all the parsers I tested so far.
> >

> > Georg Edelmann
> >

Tom Passin




From stefan.haustein at trantor.de  Thu Dec 23 14:12:49 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Parser interface (fragments)
References: <000001bf4ccb$a44ac9a0$d1940e18@c1033339-a.smateo1.sfba.home.com>
Message-ID: <38622B13.1F0E7273@trantor.de>

> Any chance of adding parseFragment?  It could come in
> real handy for applications that primarily deals with
> fragments instead of documents.

In Java, I currently use a special reader that adds an additional root
element to the stream (see below). My life would become much easier if I
could tell the parser that it is reading a fragment (without root
element) only...

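// Wrap the log file in an artificial <root> element so that the
// parser sees a single well-formed document.
Vector streams = new Vector ();

streams.add (new ByteArrayInputStream ("<root>".getBytes ()));
streams.add (new FileInputStream (new File (path, "changes.log")));
streams.add (new ByteArrayInputStream ("</root>".getBytes ()));

// SequenceInputStream concatenates the three streams in order.
parser.parse (new InputSource
              (new SequenceInputStream (streams.elements ())));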
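
For readers who want to try the wrapping trick on its own, the following is a
self-contained sketch of the same idea. It swaps in the SAX2 DefaultHandler
and the JAXP SAXParserFactory from current JDKs rather than the SAX1
DocumentHandler used above, and FragmentDemo, Counter and
countFragmentElements are hypothetical names. It only works when the fragment
carries no XML declaration or DOCTYPE of its own:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.SequenceInputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

class FragmentDemo {
    // Counts every element except the artificial wrapper at depth 0.
    static class Counter extends DefaultHandler {
        int depth = 0;
        int count = 0;

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            if (depth > 0) {
                count++;
            }
            depth++;
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            depth--;
        }
    }

    static InputStream bytes(String s) {
        return new ByteArrayInputStream(s.getBytes(StandardCharsets.UTF_8));
    }

    // Surrounds the fragment with <root>...</root> and parses the result.
    static int countFragmentElements(InputStream fragment) throws Exception {
        List<InputStream> parts = Arrays.asList(bytes("<root>"), fragment, bytes("</root>"));
        Counter counter = new Counter();
        SAXParserFactory.newInstance().newSAXParser().parse(
            new SequenceInputStream(Collections.enumeration(parts)), counter);
        return counter.count;
    }

    public static void main(String[] args) throws Exception {
        // A fragment with two top-level elements and one nested element.
        System.out.println(countFragmentElements(bytes("<a/><b><c/></b>")));
    }
}
```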

I have another fragment-related problem:

Imagine I have an XML parser for a particular type of content. Now, I
want to send this type of content, and for sending, it is wrapped in a
kind of "XML envelope". 

e.g.
<envelope>
  <sender>agent1</sender>
  <receiver>agent2</receiver>
  <content>
     <someRealContent/>
  </content>
</envelope>

Currently, the envelope handler delegates all content related events to
the content handler.  

It would be much nicer if I could switch the DocumentHandler by calling
"setDocumentHandler", and the documentHandler would switch back
automatically when back at the corresponding nesting level.  However,
the suggested semantics requires maintenance of a stack of
DocumentHandlers in the parser.
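
As a point of comparison, the delegation that is done today can track the
nesting level with a simple depth counter inside the envelope handler, which
avoids any stack in the parser. This is only a sketch under assumptions: it
uses the SAX2 ContentHandler and DefaultHandler types from current JDKs
instead of the SAX1 DocumentHandler, and EnvelopeDemo, Recorder,
EnvelopeHandler and run are hypothetical names:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class EnvelopeDemo {
    // Stand-in for the real content handler: records element names it receives.
    static class Recorder extends DefaultHandler {
        final List<String> seen = new ArrayList<>();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            seen.add(qName);
        }
    }

    static class EnvelopeHandler extends DefaultHandler {
        final List<String> envelopeElements = new ArrayList<>();
        private final ContentHandler content;
        private int contentDepth = 0; // 0 means "outside <content>"

        EnvelopeHandler(ContentHandler content) {
            this.content = content;
        }

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (contentDepth > 0) {
                contentDepth++;
                content.startElement(uri, local, qName, atts); // delegate
            } else {
                envelopeElements.add(qName);
                if (qName.equals("content")) {
                    contentDepth = 1; // start delegating from the next event
                }
            }
        }

        @Override
        public void endElement(String uri, String local, String qName) throws SAXException {
            if (contentDepth > 1) {
                contentDepth--;
                content.endElement(uri, local, qName); // delegate
            } else if (contentDepth == 1) {
                contentDepth = 0; // closing </content>: switch back automatically
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) throws SAXException {
            if (contentDepth > 0) {
                content.characters(ch, start, length);
            }
        }
    }

    // Returns "<envelope-level names>|<names delegated to the content handler>".
    static String run(String xml) throws Exception {
        Recorder recorder = new Recorder();
        EnvelopeHandler envelope = new EnvelopeHandler(recorder);
        SAXParserFactory.newInstance().newSAXParser().parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), envelope);
        return envelope.envelopeElements + "|" + recorder.seen;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("<envelope><sender>agent1</sender><receiver>agent2</receiver>"
            + "<content><someRealContent/></content></envelope>"));
    }
}
```

The counter plays the role that a handler stack inside the parser would play
if handlers could be swapped mid-parse.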

Best regards
 
Stefan

-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From stefan.haustein at trantor.de  Thu Dec 23 15:05:01 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Should SAXException extend IOException?
References: <14432.57765.516268.430263@localhost.localdomain> <Pine.SOL.3.96.991222105955.20108C-100000@nine> <14433.8570.811500.337122@localhost.localdomain> <386202AB.A458BD6C@pacbell.net> <m3aen1n9rg.fsf@localhost.localdomain>
Message-ID: <386239D4.2CF0CEF0@trantor.de>

> > I don't see the point in a Handler throwing an IOException to a
> > Parser in most cases.  That is what this item implies, right?  What
> > could a DocumentHandler mean by throwing an IOException during a
> > call to startElement?  The I/O has already occurred before the
> > handler gets involved.
> 
> Well, that's really domain-specific.  It could be that a handler in an
> XML I/O library does additional I/O (such as retrieving an external
> bitmap) which, from the top-level application's point of view, is
> still part of the same I/O process.
> 

What about SAXException extends RuntimeException, wouldn't that also
solve the problem?

Best regards

Stefan



From clark.evans at manhattanproject.com  Thu Dec 23 16:56:03 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <007101bf4d4f$d19f5ae0$41fbb1cd@tomshp>
Message-ID: <Pine.LNX.4.10.9912231152400.27464-100000@cauchy.clarkevans.com>

Yet another fundamental question... any insight
would be greatly appreciated!

Clark




From david at megginson.com  Thu Dec 23 17:10:49 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: <386239D4.2CF0CEF0@trantor.de>
References: <14432.57765.516268.430263@localhost.localdomain>
	<Pine.SOL.3.96.991222105955.20108C-100000@nine>
	<14433.8570.811500.337122@localhost.localdomain>
	<386202AB.A458BD6C@pacbell.net>
	<m3aen1n9rg.fsf@localhost.localdomain>
	<386239D4.2CF0CEF0@trantor.de>
Message-ID: <14434.18169.589820.998828@localhost.localdomain>

Stefan Haustein writes:

 > > Well, that's really domain-specific.  It could be that a handler in an
 > > XML I/O library does additional I/O (such as retrieving an external
 > > bitmap) which, from the top-level application's point of view, is
 > > still part of the same I/O process.
 > > 
 > 
 > What about SAXException extends RuntimeException, wouldn't that also
 > solve the problem?

I don't think that it's fair to cast an XML parsing error as a
RuntimeException -- it's a common enough problem, and application
writers need to know to catch it.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From lhill at excelergy.com  Thu Dec 23 17:50:51 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Should SAXException extend IOException?
Message-ID: <776DC00B49ECD21189750090273F729130B373@EROS>

David Megginson writes:
> In the end, though, this is a relatively minor point.  The important
> point, for me, is that SAXException extend IOException -- I think that
> it would be convenient to have the callbacks throw IOException rather
> than SAXException (otherwise, other IOExceptions will have to tunnel),
> but it's not a show-stopper if everyone else thinks it's a bad idea.

I'll agree it is a relatively minor point, but if the only reason is
"otherwise, other IOExceptions will have to tunnel" then it is a truly
horrible idea which at its extreme boils down to 'Let's just throw Exception
so that no exceptions are tunneled'.  Perhaps there is a more cogent
argument to be made for it?

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================



From lhill at excelergy.com  Thu Dec 23 17:57:27 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Should SAXException extend IOException?
Message-ID: <776DC00B49ECD21189750090273F729130B374@EROS>

Stefan Haustein writes:
> What about SAXException extends RuntimeException, wouldn't that also
> solve the problem?

Perhaps too well :)  You aren't required to catch any RuntimeExceptions,
which allows the inexperienced programmer to blame XML for crashing their
wonderfully crafted UI.

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================



From lhill at excelergy.com  Thu Dec 23 18:09:04 1999
From: lhill at excelergy.com (Hill, Les)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: DeclHandler
Message-ID: <776DC00B49ECD21189750090273F729130B375@EROS>

Richard Anderson writes:
> Provided we just have a class that represents an exception, which is passed
> via a callback and doesn't depend on the usual Java/C++ throw() mechanisms,
> I'll be happy.

Gee, I guess I'd be unhappy :)  The ability to catch an exception in place
and continue as appropriate is worth losing direct API mappings to older
languages.

Regards,

Les Hill
Senior Architect
Excelergy

=======================================================
Excelergy is hiring Java/C++ XML developers, all levels
   send resume (and mention me :) to jobs@excelergy.com
=======================================================



From david-b at pacbell.net  Thu Dec 23 18:43:34 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Exceptions/C++
References: <895304557.945788109@MDAXKE>
Message-ID: <38626D12.B1A4A6CB@pacbell.net>

To summarize some off-line discussions, it's not "necessary" to use a
new exception model due to any MT issues in C++ programs; state kept on
the stack, easily accessible to applications, suffices for reasonably
written chunks of code.  As in Java, so in C++.  (Although historically
it goes the other way around ... C++/MT/Exceptions illuminated the
design of Java exceptions, along with experience in other languages
that offered OO, MT and exceptions well before C++ got its clue!)

- Dave


"Mark D. Anderson" wrote:
> 
> > I'll confess I didn't quite notice any MT issues in that post, but as you
> > stated it was really a "what Parser/InputSource is in use" issue that
> > isn't MT-specific at all.
> >
> > I can't see a way confusion could arise there unless one parser callback
> > needs to invoke some other parser, and is sloppy about letting exceptions
> > from that invocation appear as if they were exceptions from the current
> > invocation.  There are always ways to create bugs if code isn't careful.
> 
> if i have a single catch which is "above" multiple simultaneous parsing
> activities, then how can i determine from the exception object alone
> which parser is involved? or is the answer to not do that?
> 
> -mda



From ricko at allette.com.au  Thu Dec 23 18:48:29 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:51 2004
Subject: A weird question?
Message-ID: <002101bf4d79$e71ef310$44f96d8c@NT.JELLIFFE.COM.AU>


From: Dan Brickley <Daniel.Brickley@bristol.ac.uk>

>On Thu, 23 Dec 1999, Sankar Virdhagriswaran wrote:

>At the W3C spec level, RDF deferred datatyping issues so they could be
>dealt with once and for all across all XML apps by XML Schema. Now we
>(more or less) have this, it is natural to explore a mapping of the
>concepts defined in XML datatypes spec into RDF data graphs. Once some
>mapping has been established, applications should be able to go both
>ways, i.e. first we reflect the XML datatype machinery into an
>RDF-processable representation, _then_ we can (hopefully) reflect
>datatyped XML information into RDF and vice-versa.

The schematron-rdf application (beta at
    http://www.ascc.net/xml/resource/schematron/schematron.html)
may do something similar soon.  It takes a schema and an instance and
generates an RDF document labelling each part of the instance according
to which pattern was found.  An XML-schema-to-schematron converter
is definitely on the cards; with that you could generate an RDF document
that shows which parts of a schema apply to which parts of a document.

If anyone is interested in this, a good first stage would be an XSL or
Perl script to resolve all type references so that we don't need to trace
back along multiple type references to figure out the facets of a datatype.
I think that might be a useful module for other applications too, for
example for converting XML-schemas-to-DTDs to allow validation using
current systems.
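
The type-reference resolution step can be sketched roughly as follows.
This is only an illustration in Java rather than XSL or Perl, and it
assumes the schema has already been read into a simple map: the TypeDef
class, the type names and the facet names are all hypothetical, not
taken from any schema draft.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class TypeFlattener {
    // Hypothetical in-memory form of one datatype: its base type plus its own facets.
    static final class TypeDef {
        final String base;
        final Map<String, String> facets;
        TypeDef(String base, Map<String, String> facets) {
            this.base = base;
            this.facets = facets;
        }
    }

    // Returns every facet that applies to 'name', following the chain of base types.
    static Map<String, String> resolve(Map<String, TypeDef> types, String name) {
        return resolve(types, name, new HashSet<String>());
    }

    private static Map<String, String> resolve(Map<String, TypeDef> types, String name,
                                               Set<String> seen) {
        TypeDef def = types.get(name);
        if (def == null) {
            return new LinkedHashMap<String, String>(); // built-in or unknown type: nothing recorded
        }
        if (!seen.add(name)) {
            throw new IllegalArgumentException("circular type reference: " + name);
        }
        Map<String, String> merged = def.base == null
                ? new LinkedHashMap<String, String>()
                : resolve(types, def.base, seen);
        merged.putAll(def.facets); // facets on the derived type override inherited ones
        return merged;
    }

    // Small worked example: percent -> smallInt -> integer (built-in).
    static String demo() {
        Map<String, String> smallIntFacets = new LinkedHashMap<String, String>();
        smallIntFacets.put("minInclusive", "0");
        smallIntFacets.put("maxInclusive", "32767");
        Map<String, String> percentFacets = new LinkedHashMap<String, String>();
        percentFacets.put("maxInclusive", "100");

        Map<String, TypeDef> types = new HashMap<String, TypeDef>();
        types.put("smallInt", new TypeDef("integer", smallIntFacets));
        types.put("percent", new TypeDef("smallInt", percentFacets));
        return resolve(types, "percent").toString();
    }
}
```

Once every type carries its flattened facet set, a later stage (a
schema-to-DTD converter, say) no longer has to walk the chain itself.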

Rick Jelliffe




From david at megginson.com  Thu Dec 23 19:18:54 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: "Clark C. Evans"'s message of "Thu, 23 Dec 1999 11:55:54 -0500 (EST)"
References: <Pine.LNX.4.10.9912231152400.27464-100000@cauchy.clarkevans.com>
Message-ID: <m3wvq5lb4q.fsf@localhost.localdomain>

"Clark C. Evans" <clark.evans@manhattanproject.com> writes:

> Yet another fundamental question... any insight would be greatly
> appreciated!

There are a lot of answers to this question, but in the end, the real
argument was that PIs cause display problems in level-3 and level-4
HTML browsers, and some influential parties [1] had a strong interest
in being able to write HTML+XML documents that, by various sorts of
lexical trickery, could still be displayed in XML-oblivious browsers
like Netscape 3.

Yes, I know everything you're going to say, and I probably agree with
all of it.  I've written a couple of Namespace filters, and they'd be
*much* easier if all Namespace declarations appeared in the prolog.


All the best,


David

[1] Who those parties were is confidential, but they were not the Big 
    Evil Companies.

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Thu Dec 23 19:17:46 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:51 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: <776DC00B49ECD21189750090273F729130B373@EROS>
References: <776DC00B49ECD21189750090273F729130B373@EROS>
Message-ID: <14434.28033.594048.829051@localhost.localdomain>

Hill, Les writes:

 > David Megginson writes:
 > > In the end, though, this is a relatively minor point.  The important
 > > point, for me, is that SAXException extend IOException -- I think that
 > > it would be convenient to have the callbacks throw IOException rather
 > > than SAXException (otherwise, other IOExceptions will have to tunnel),
 > > but it's not a show-stopper if everyone else thinks it's a bad idea.
 > 
 > I'll agree it is a relatively minor point, but if the only reason is
 > "otherwise, other IOExceptions will have to tunnel" then it is a truly
 > horrible idea which at its extreme boils down to 'Let's just throw Exception
 > so that no exceptions are tunneled'.  Perhaps there is a more cogent
 > argument to be made about it?

Well, no, that's just an aside.  There are two main points:

1. Is SAXException logically a kind of I/O exception?

   After a lot of thought (and practical experience writing apps and
   libraries that use SAX), I'm certain that the answer to this
   question is 'yes'.  I know that to many of us on the list, XML is
   the sun, the moon, and the stars, but for the rest of the world
   it's just a fancy format that you can write information to or read
   it from -- in other words, XML is almost never the point, just a
   means.

   From that point of view, reading information from an XML document
   is a kind of I/O, so it makes sense for SAXException to be a kind
   of IOException.


2. Should SAX2 callbacks throw IOException or SAXException?

   This is the trickier one of the two, and it really depends on your
   perspective.  From a pure XML perspective, it would be best to
   have the handlers throw SAXException, because any errors should be
   strictly XML-syntax-related.  

   For someone using XML as an interchange format, however, the goal
   is to get the information that the XML document represents, and the
   niceties of the distinction between XML syntax errors and other
   kinds of I/O errors are not really significant.  If the XML markup
   points to another file, and I cannot read that file, it's just as
   much an I/O problem as if the document itself is malformed: in both
   cases, I've failed to load the information I want, so it's an I/O
   failure.  I'm not as confident of this point, but that's the best
   argument I can put forward for it.

   The tunnelling point can be paraphrased as "if we're dealing with
   I/O anyway, why force IOExceptions to be tunnelled".  As an
   interesting historical note, though, the very first prototype SAX
   handlers I posted, about two years ago, did throw Exception.


So, to summarize, I'm pretty sure that a SAXException is a kind of
IOException, but I'm still trying to figure out whether it's best for
SAX callbacks to throw SAXException (as in SAX1) or IOException.
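
For anyone who hasn't run into the tunnelling problem, here is a
minimal sketch of what it looks like with the SAX API as shipped in the
JDK, where SAXException does not extend IOException.  The class name
TunnelDemo, the "include" element and the simulated read failure are
assumptions made up for the example.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class TunnelDemo {
    // A handler that hits an I/O problem while processing an element.
    // The callback may only throw SAXException, so the IOException is wrapped.
    static class IncludingHandler extends DefaultHandler {
        @Override
        public void startElement(String uri, String local, String qName, Attributes atts)
                throws SAXException {
            if (qName.equals("include")) {
                // Simulated failure; a real handler would be opening the file here.
                IOException cause = new IOException("cannot read " + atts.getValue("href"));
                throw new SAXException(cause); // tunnel the I/O error through SAX
            }
        }
    }

    // The caller has to unwrap the SAXException to find out what really happened.
    static String parse(String xml) {
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), new IncludingHandler());
            return "ok";
        } catch (SAXException e) {
            Exception inner = e.getException();
            if (inner instanceof IOException) {
                return "I/O failure: " + inner.getMessage();
            }
            return "XML failure: " + e.getMessage();
        } catch (Exception e) { // IOException from the source, or parser configuration trouble
            return "other failure: " + e.getMessage();
        }
    }
}
```

If the callbacks could throw IOException directly, the wrap in the
handler and the unwrap in the caller would both disappear.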


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From mda at discerning.com  Thu Dec 23 19:15:15 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:51 2004
Subject: attribute values as qnames?
In-Reply-To: <005201bf4d12$685e8c00$1cf96d8c@NT.JELLIFFE.COM.AU>
Message-ID: <1061992063.945954797@[192.168.0.2]>

> To reference the name of an element type in an
> attribute, one can use the namespace prefix: this is what
> XPaths do, for example.  So the XML schema processor 
> may indeed have to have the xmlns prefix->URI mappings 
> available. A namespace processor will not resolve values
> of attributes, merely names of elements and attributes (AFAIK,
> but I am easily confusable.)

Just to make this concrete...

Suppose (in the future) I have a document which indicates that a particular
attribute is of the qname type, via xmlschema.

Now suppose I process this via a SAX or XSLT processor. If that
processor is "xmlschema-aware", then it will expand those qnames
in attribute values in the same way it would element type names and
attribute names. I would write xslt patterns using whatever prefix
I like.

But if the SAX or XSLT processor is not "xmlschema-aware", then I
would have to match against them using a lexical pattern that happens
to have a colon in it (i.e. choose the same prefix).

Is that right?
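
To show what the "aware" case would have to do, here is a rough sketch
using the DOM API in the JDK.  The application, not the namespace
processor, expands the prefix found in the attribute value against the
bindings in scope.  The class name, the "kind" attribute and the
urn:x-example:shapes namespace are invented for the example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

class QNameAttr {
    // Expands "prefix:local" in an attribute value to "{uri}local",
    // using the namespace declarations in scope at the element.
    static String expand(Element e, String attrName) {
        String value = e.getAttribute(attrName).trim();
        int colon = value.indexOf(':');
        String prefix = colon < 0 ? null : value.substring(0, colon);
        String local = value.substring(colon + 1);
        // A null prefix asks for the default namespace, if one is declared.
        String uri = e.lookupNamespaceURI(prefix);
        return uri == null ? value : "{" + uri + "}" + local;
    }

    // Convenience wrapper: parse a document and expand an attribute on its root element.
    static String expandOnRoot(String xml, String attrName) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true); // needed so the xmlns declarations are recognised
            Element root = f.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
            return expand(root, attrName);
        } catch (Exception ex) {
            throw new IllegalStateException(ex);
        }
    }
}
```

With this, kind='s:circle' and kind='t:circle' compare equal whenever s
and t are bound to the same URI, whereas a purely lexical match on the
attribute value would treat them as different.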

It also seems like this would have implications for xsl:key, in
ways I can't quite suss out.

This is of interest to me not just for use of an attribute value
as a reference. I would like to have an "enum" list where the values
are qualified. These are basically URNs. Maybe that is an abuse of
the namespace mechanism though?

-mda




From andrewl at microsoft.com  Thu Dec 23 19:30:51 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
Message-ID: <33D189919E89D311814C00805F1991F7F4AA9D@RED-MSG-08>

Clark Evans asks why PIs are not the mechanism for namespace declaration.
That option was extensively debated during the design process (see the
archives for details).  The short answer is that PIs do not have tree scope,
so are unsuitable for modular document construction.



From simonstl at simonstl.com  Thu Dec 23 19:42:30 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <m3wvq5lb4q.fsf@localhost.localdomain>
References: <"Clark C. Evans"'s message of "Thu, 23 Dec 1999 11:55:54 -0500 (EST)">
 <Pine.LNX.4.10.9912231152400.27464-100000@cauchy.clarkevans.com>
Message-ID: <199912231942.OAA08747@hesketh.net>

At 02:17 PM 12/23/99 -0500, David Megginson wrote:
>There are a lot of answers to this question, but in the end, the real
>argument was that PIs cause display problems in level-3 and level-4
>HTML browsers, and some influential parties [1] had a strong interest
>in being able to write HTML+XML documents that, by various sorts of
>lexical trickery, could still be displayed in XML-oblivious browsers
>like Netscape 3.
>
>Yes, I know everything you're going to say, and I probably agree with
>all of it.  I've written a couple of Namespace filters, and they'd be
>*much* easier if all Namespace declarations appeared in the prolog.

And here I thought it was that processing instructions give the W3C a
mysterious rash.  Good to know, though!

Happy Holidays, all!

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From clark.evans at manhattanproject.com  Thu Dec 23 20:01:08 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <33D189919E89D311814C00805F1991F7F4AA9D@RED-MSG-08>
Message-ID: <Pine.LNX.4.10.9912231442150.27464-100000@cauchy.clarkevans.com>

On Thu, 23 Dec 1999, Andrew Layman wrote:
> Clark Evans asks why PIs are not the mechanism for namespace declaration.
> That option was extensively debated during the design process (see the
> archives for details).  The short answer is that PIs do not have tree scope,
> so are unsuitable for modular document construction.

This specific problem points to a flaw in the PI 
mechanism that could have been fixed... rather than 
creating a "work-around" -- as David pointed out, not
many tools were compliant anyway!  For example, this 
could have been easily fixed by altering the XML syntax 
to allow for PI's to occur within elements...

<parent>
  <child>
    <?pi?>
    <grandchild/>
    <!-- pi's scope ends here -->
  </child>
</parent>

I'm sure the "backward compatible" drum was used,
however, in the XML world, unlike the bulk of
programming tradition, the data outlives the
program, not the other way around.   Thus, this
would have been backward-data compatible, which
is the only concern.  Specific versions of programs
typically have a shelf life of less than 2-3 years,
where data can last for decades.  On the other hand, 
how is having xmlns:prefix="uri" going to mess up
programs that expect processing instructions to
appear in PIs -- ones that show attributes directly
to end-users?

So, is a long-term fix in the works?  Or are we going 
to keep using attributes for processing instructions
and deprecate the unappreciated PI mechanism?

Best,

Clark




From andrewl at microsoft.com  Thu Dec 23 20:37:17 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
Message-ID: <33D189919E89D311814C00805F1991F7F4AAA0@RED-MSG-08>

Clark Evans asks whether there is ongoing work to introduce a new form of
Processing Instruction.  I am not aware of any such work, but perhaps others
are.

In any case, I am not attempting to debate the merits of the namespaces
design, but to achieve a precondition of reasonable debate, namely an
understanding of what the specification actually does and does not say.

I also recommend reading some other postings on this subject,
particularly those of Tim Bray, Jon Bosak and David Megginson.

Best wishes,
Andrew Layman



From andrewl at microsoft.com  Thu Dec 23 20:46:29 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:51 2004
Subject: attribute values as qnames?
Message-ID: <33D189919E89D311814C00805F1991F7F4AAA2@RED-MSG-08>

The namespaces specification does not take a stand on qnames in attribute
values, or, more exactly, the specification leaves it up to an application
to interpret attribute values, and up to an attribute's creator to set the
rules for its expression and interpretation.

The draft schema specifications define a qname datatype, presumably
instructing a generic application (or savvy processor) to interpret a
qualified name according to the sort of rules that the namespace
specification uses for element and attribute names. 



From david-b at pacbell.net  Thu Dec 23 21:17:17 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
References: <Pine.LNX.4.10.9912231442150.27464-100000@cauchy.clarkevans.com>
Message-ID: <3862913B.67D2E25C@pacbell.net>

"Clark C. Evans" wrote:
> 
> On Thu, 23 Dec 1999, Andrew Layman wrote:
> > Clark Evans asks why PIs are not the mechanism for namespace declaration.
> > That option was extensively debated during the design process (see the
> > archives for details).  The short answer is that PIs do not have tree scope,
> > so are unsuitable for modular document construction.

Bad short answer; see below.

More accurate is that a certain person (or persons) disliked
PIs extremely, if even a tenth of what I've heard is accurate.
The debate on this was comparable to the recent "how many XHTMLs
are there" namespace debate.


> 	 For example, this
> could have been easily fixed by altering the XML syntax
> to allow for PI's to occur within elements...

But XML _does_ permit this.  Check the grammar, or almost
any XML processor ... this was an easy bit to get right!

What you're suggesting is that PIs be lexically scoped.
(That's what Andrew seems to mean by "tree" scope.)

And in fact, there's nothing in the world preventing the
definition of a particular PI from using lexical scope.
One doesn't need all PIs to work that way; only one.
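
As a rough illustration of that last point, here is a sketch of a SAX
handler (using the SAX classes in the JDK) that gives a PI lexical
scope on its own: the PI applies from where it appears until its parent
element closes.  The class name PiScope and the "pi" target are
placeholders, not a proposal for real syntax.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PiScope extends DefaultHandler {
    // One frame per open element, holding the PI targets seen directly inside it.
    private final Deque<List<String>> frames = new ArrayDeque<List<String>>();
    private final List<String> log = new ArrayList<String>();

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts) {
        log.add(qName + " sees " + inScope());
        frames.push(new ArrayList<String>());
    }

    @Override
    public void processingInstruction(String target, String data) {
        // PIs in the prolog arrive before any element is open and are ignored here.
        if (!frames.isEmpty()) {
            frames.peek().add(target);
        }
    }

    @Override
    public void endElement(String uri, String local, String qName) {
        frames.pop(); // everything declared inside this element goes out of scope
    }

    // All PI targets currently in scope, outermost element first.
    private List<String> inScope() {
        List<String> all = new ArrayList<String>();
        for (Iterator<List<String>> it = frames.descendingIterator(); it.hasNext();) {
            all.addAll(it.next());
        }
        return all;
    }

    // Parses the document and reports which PIs each element saw.
    static String trace(String xml) {
        PiScope handler = new PiScope();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return String.join("; ", handler.log);
    }
}
```

Run over Clark's example with an extra sibling after <child>, the
grandchild sees the PI and the sibling does not.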

> 
> <parent>
>   <child>
>     <?pi?>
>     <grandchild/>
>     <!-- pi's scope ends here -->
>   </child>
> </parent>

I've no intention of reopening the debate on this topic
(we're stuck with attributes), but I've got this strange
belief that truth should be told, so I couldn't let this
one slip by.

- Dave



From clark.evans at manhattanproject.com  Thu Dec 23 21:32:13 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <3862913B.67D2E25C@pacbell.net>
Message-ID: <Pine.LNX.4.10.9912231628520.27464-100000@cauchy.clarkevans.com>

On Thu, 23 Dec 1999, David Brownell wrote:
> > On Thu, 23 Dec 1999, Andrew Layman wrote:
> > > Clark Evans asks why PIs are not the mechanism for namespace declaration.
> > > That option was extensively debated during the design process (see the
> > > archives for details).  The short answer is that PIs do not have tree scope,
> > > so are unsuitable for modular document construction.
> 
> Bad short answer; see below.

Oh ya!  Stupid me.  Very sorry.

content ::= (element | CharData | Reference | CDSect | PI | Comment)*

> What you're suggesting is that PIs be lexically scoped.
> (That's what Andrew seems to mean by "tree" scope.)
> 
> And in fact, there's nothing in the world preventing the
> definition of a particular PI from using lexical scope.
> One doesn't need all PIs to work that way; only one.
> 
> > 
> > <parent>
> >   <child>
> >     <?pi?>
> >     <grandchild/>
> >     <!-- pi's scope ends here -->
> >   </child>
> > </parent>




From bckman at ix.netcom.com  Thu Dec 23 21:41:16 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:51 2004
Subject: Why not PIs for namespace declarations?
References: <Pine.LNX.4.10.9912231152400.27464-100000@cauchy.clarkevans.com>
Message-ID: <006501bf4d90$561e02e0$abacdccf@prioritynetworks.net>

In fact if you look at the working Drafts you will see that namespaces used
PIs right up to the final Recommendation!

1. My understanding is that they were abandoned to allow scoping.
2. A mechanism to not scope attributes was wanted.
3. there was a general feeling that PI's were 'broken'

Frank

----- Original Message -----
From: Clark C. Evans <clark.evans@manhattanproject.com>
To: <xml-dev@ic.ac.uk>
Sent: Thursday, December 23, 1999 11:55 AM
Subject: Why not PIs for namespace declarations?


> Yet another fundamental question... any insight
> would be greatly appreciated!
>
> Clark
>
>




From tmmet at hotmail.com  Thu Dec 23 21:54:36 1999
From: tmmet at hotmail.com (tmmet tvp)
Date: Mon Jun  7 17:18:52 2004
Subject: Using count in xsl
Message-ID: <19991223215358.33618.qmail@hotmail.com>

Hi,
I want to get the count of the total number of nodes in my XML file. How can
I do this in XSL? Since IE5 does not support XSLT, I can't use XSLT functions
like position, count(), from-descendants() etc.
In MSXML, we have <xsl:script>, in which I can call a function for
incrementing my count variable. I would be glad if there is any other method
to do this (getting the count of total nodes, subnodes of a particular
type/tag etc.) using XSL which will work in IE5 also. Any ideas will be
greatly helpful to me.
Thanks in advance.



From clark.evans at manhattanproject.com  Thu Dec 23 21:57:33 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:52 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <006501bf4d90$561e02e0$abacdccf@prioritynetworks.net>
Message-ID: <Pine.LNX.4.10.9912231644030.27464-100000@cauchy.clarkevans.com>

On Thu, 23 Dec 1999, Frank Boumphrey wrote:
> In fact if you look at the working Drafts you will see that 
> namespaces used PIs right up to the final Recommendation!

Yes, sounds like a fast change of heart.  The PI version even
made it into the XML Companion by Neil Bradley...

 <?xml:namespace name="http://www.vehicles.org/"
                 href="http://www.vehicles.org/dtds/v.dtd"
                 as="veh"
 ?>

> 1. My understanding is that they were abandoned to allow scoping.

Yes.  I had assumed (due to attributes being used) that 
PIs were only valid in the prolog... now that David pointed
out that this was a bad assumption -- I'm quite puzzled.

> 2. A mechanism to not scope attributes was wanted.

Is this related to that "three name space partitions" stuff?

> 3. there was a general feeling that PI's were 'broken'

Thanks Frank!

...

Now I'm wondering if it wouldn't be all that hard for parsers
to support both methods, as the PI version seems far cleaner.

  <pre:element xmlns:pre="http://www.vehicles.org/dtds/v.dtd" />

becomes

  <?xml:namespace href="http://www.vehicles.org/dtds/v.dtd" as="pre" ?>
  <pre:element/>
  
Oh humm.  Thank you all for your feedback!

;) Clark




From tmmet at hotmail.com  Thu Dec 23 21:58:33 1999
From: tmmet at hotmail.com (tmmet tvp)
Date: Mon Jun  7 17:18:52 2004
Subject: Search using XSL
Message-ID: <19991223215758.6224.qmail@hotmail.com>

Hi,
I'm creating a tree view form using XSL by reading from my XML file and
traversing through all the nodes. I have a list box with nodes (say, title,
author etc.). That is, the list box contains the tag names of my XML file.
When I select "title" from the list box, my XSL file should search for nodes
containing that tag (that is, title) in my XML file and create a tree view
using them. How can this be done? I know how to handle this using 2 XSL
files... But I want these operations to be done in a single XSL file which
creates a tree view, does sorting and filtering etc.
Thanks in advance.



From stefan.haustein at trantor.de  Thu Dec 23 22:31:25 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:52 2004
Subject: SAXPP / fragment processing
Message-ID: <3862A2D2.1C89BBB1@trantor.de>

Hi,

does anybody know about an implementation of a pulling parser on top of
SAX? A pulling parser seems better suitable if interpretation of a
document is distributed over several different fragment processing
entities. The pulling parser could just be handed over from one part to
another and back...
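
To make the idea concrete, here is a very rough sketch of what I mean,
written against the SAX classes in the JDK.  It cheats by letting SAX
run to completion and buffering the events, so it is not streaming; a
real version would need a second thread or a parser that can be
suspended.  All names (PullOverSax, the START/END/TEXT event strings,
the list/item elements) are made up for the example.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Queue;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

class PullOverSax {
    private final Queue<String> events = new ArrayDeque<String>();

    // Runs the SAX parse up front and records each event as a simple string.
    PullOverSax(String xml) {
        DefaultHandler recorder = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                events.add("START " + qName);
            }
            @Override
            public void endElement(String uri, String local, String qName) {
                events.add("END " + qName);
            }
            @Override
            public void characters(char[] ch, int start, int length) {
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("TEXT " + text);
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), recorder);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    // The pull interface: the caller asks for the next event; null means end of document.
    String next() {
        return events.poll();
    }

    // Fragment processor: called just after "START item", consumes up to "END item".
    static String readItem(PullOverSax parser) {
        StringBuilder text = new StringBuilder();
        for (String ev = parser.next(); ev != null && !ev.equals("END item"); ev = parser.next()) {
            if (ev.startsWith("TEXT ")) {
                text.append(ev.substring(5));
            }
        }
        return text.toString();
    }

    // Outer processor: hands the parser to readItem for each item, then carries on.
    static String readList(String xml) {
        PullOverSax parser = new PullOverSax(xml);
        StringBuilder out = new StringBuilder();
        for (String ev = parser.next(); ev != null; ev = parser.next()) {
            if (ev.equals("START item")) {
                out.append(readItem(parser)).append('|');
            }
        }
        return out.toString();
    }
}
```

The point is the hand-over in readList: the same parser object is
passed to readItem and comes back positioned just after the fragment.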

Best regards

Stefan

-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From clark.evans at manhattanproject.com  Thu Dec 23 23:07:09 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:52 2004
Subject: SAXPP / fragment processing
In-Reply-To: <3862A2D2.1C89BBB1@trantor.de>
Message-ID: <Pine.LNX.4.10.9912231806241.27464-100000@cauchy.clarkevans.com>

On Thu, 23 Dec 1999, Stefan Haustein wrote:
> does anybody know about an implementation of a pulling parser on top of
> SAX? A pulling parser seems better suitable if interpretation of a
> document is distributed over several different fragment processing
> entities. The pulling parser could just be handed over from one part to
> another and back...

We're busy working on one on the SML list.  Paul Miller has
written a C version; I'm struggling with defining interfaces
for a Java version.

Clark




From peter at ursus.demon.co.uk  Thu Dec 23 23:36:08 1999
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:18:52 2004
Subject: WISH! Open Source XML Editor [was Re: psgml namespaces and
  schemas]
In-Reply-To: <m31z8dn9i8.fsf@localhost.localdomain>
References: <Bruce.Durling@equifax.com's message of "Thu, 23 Dec 1999 09:19:13 +0000">
 <85256850.0032FAA1.00@noteswetc15.fin.equifax.com>
Message-ID: <3.0.1.32.19991224135813.009b9380@pop3.demon.co.uk>

At 07:10 AM 12/23/99 -0500, David Megginson wrote:
>Bruce.Durling@equifax.com writes:
>
[...]
>and it might be
>nice to get a more modern open-source XML editing tool under
>development.
>
If I had a single wish as departing moderator it would be that we have a
project to develop an open source editor for XML. There is, in any case, a
shortage of editors at present, and those that do exist are (not
unreasonably) usually tied to a single author's point of view (e.g.
streamed text, hierarchical content, etc.). As far as I know, none of them
are easily extensible at API level, and those that do have APIs will differ
enormously from each other.

The lack of an API for an editor effectively makes it impossible for people
to develop a modular approach. Many of the "non-textual" DTD/schemas will
require specialist editors (my own interest is chemistry, but Math,
Geography/maps, SVG, etc are all similar). We need to be able to
concentrate *just* on the domain-specific parts of our subject, and not to
be concerned with general structural or technical editing.

I have raised this subject from time to time over the last year or two and
haven't found it easy to get interest. Now that XML is really here, editing
is a key requirement for creating documents. For example, I know that there
is pressure to create graduate theses in electronic form - but this is not
easy in XML if there is anything other than text in the thesis. Are there
other readers of this list that understand the problem and do we have a way
forward?

	P.


>All the best,
>
>
>David
>
>-- 
>David Megginson                 david@megginson.com
>           http://www.megginson.com/
>



From stefan.haustein at trantor.de  Fri Dec 24 00:38:37 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:52 2004
Subject: SAXPP / fragment processing
References: <Pine.LNX.4.10.9912231806241.27464-100000@cauchy.clarkevans.com>
Message-ID: <3862C0A0.3D963B02@trantor.de>

> written a C version; I'm struggling with defining interfaces
> for a Java version.

Oops, meanwhile I have the thing working... Maybe we can harmonize the
interfaces; you'll find my suggestion at
http://www.trantor.de/saxpp/doc

Best regards

Stefan



From stele at fxtech.com  Fri Dec 24 01:42:47 1999
From: stele at fxtech.com (Paul Miller)
Date: Mon Jun  7 17:18:52 2004
Subject: SAXPP / fragment processing
References: <Pine.LNX.4.10.9912231806241.27464-100000@cauchy.clarkevans.com> <3862C0A0.3D963B02@trantor.de>
Message-ID: <3862CFDF.AEFAE0D5@fxtech.com>

> > written a C version; I'm struggling with defining interfaces
> > for a Java version.

> Oops, meanwhile I have the thing working... Maybe we can harmonize the
> interfaces; you'll find my suggestion at
> http://www.trantor.de/saxpp/doc

My design is somewhat different. You can see how it works in the list
archives (or check out the sample C++ code at
http://www.fxtech.com/xmlio/). Basically you register interest in certain
elements, and can specify a callback for each, where you can then
register new handlers for the sub-elements at that scope, and so on. You
can also register element handlers for intrinsic types (such as ints,
lists, booleans, strings, floats, etc) where it just plugs the
(validated) value into your own data structures. More of an XML
interpreter than a parser (as Clark pointed out to me). The nice thing
about my design is it consumes no additional memory over the buffering
(which is independent of the reader through a stream interface anyway).
It doesn't actually process any elements that there is no handler for,
so it can quickly skip large chunks of a document (that you are not
interested in) without consuming any memory or generating any events.
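
A minimal sketch of this registration style in Java, layered on a stock
SAX parser. All names here (ScopedHandlers, Scope, ElementCallback,
Dispatcher) are hypothetical and are not taken from the xmlio code.
Because it sits on top of SAX, it shows only the scoping and skipping
logic; it does not reproduce the memory behaviour described above.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical sketch: per-scope element handlers layered on SAX. */
class ScopedHandlers {

    /** Callback for an element of interest; it may register handlers for child elements. */
    interface ElementCallback {
        void onElement(Scope childScope, Attributes attrs);
    }

    /** The handlers registered for the children of one element. */
    static final class Scope {
        private final Map<String, ElementCallback> handlers = new HashMap<>();
        void on(String name, ElementCallback cb) { handlers.put(name, cb); }
    }

    /** Dispatches start tags to the innermost scope and skips unhandled subtrees. */
    static final class Dispatcher extends DefaultHandler {
        private final Deque<Scope> scopes = new ArrayDeque<>();
        private int skipDepth = 0; // > 0 while inside an element nobody registered for

        Dispatcher(Scope root) { scopes.push(root); }

        @Override
        public void startElement(String uri, String local, String qName, Attributes attrs) {
            if (skipDepth > 0) { skipDepth++; return; }
            ElementCallback cb = scopes.peek().handlers.get(qName);
            if (cb == null) { skipDepth = 1; return; }
            Scope child = new Scope();
            cb.onElement(child, attrs); // the callback registers sub-element handlers here
            scopes.push(child);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (skipDepth > 0) { skipDepth--; return; }
            scopes.pop();
        }
    }

    /** Collects the title attribute of every <book> directly under <library>. */
    static String titles(String xml) {
        StringBuilder out = new StringBuilder();
        Scope root = new Scope();
        root.on("library", (libScope, a) ->
            libScope.on("book", (bookScope, attrs) ->
                out.append(attrs.getValue("title")).append(';')));
        try {
            SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), new Dispatcher(root));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<library><book title='A'><note><book title='ignored'/></note></book>"
                   + "<junk><book title='skipped'/></junk><book title='B'/></library>";
        System.out.println(titles(xml)); // prints "A;B;"
    }
}
```

Only the two books directly under <library> reach a callback; the
subtrees under <note> and <junk> have no handler in their scope and are
passed over.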

I'll have more details (and a completely "C" implementation) next week
when I return from Florida.

-Paul

--
Paul Miller - stele@fxtech.com



From stefan.haustein at trantor.de  Fri Dec 24 02:09:52 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:52 2004
Subject: SAXPP / fragment processing
References: <Pine.LNX.4.10.9912231806241.27464-100000@cauchy.clarkevans.com> <3862C0A0.3D963B02@trantor.de> <3862CFDF.AEFAE0D5@fxtech.com>
Message-ID: <3862D600.40DF5653@trantor.de>

Paul Miller wrote:
> 
> > > written a C version; I'm stuggling with defining interfaces
> > > for a Java version.
> 
> > Ups, meanwhile I have the thing working... Maybe we can harmonize the
> > interfaces, you'll find my suggestion
> > http://www.trantor.de/saxpp/doc
> 
> My design is somewhat different. You can see how it works in the list
> archives (or check out the sample C++ code at
> http://www.fxtech.com/xmlio/. Basically you register interest in certain
> elements, and can specify a callback for each, where you can then
> register new handlers for the sub-elements at that scope, and so on. You
> can also register element handlers for intrinsic types (such as ints,
> lists, booleans, strings, floats, etc) where it just plugs the

So if I understand you right, your parser is still a (more elaborate)
push parser. My intention was to build something more similar to a Java
reader: something I can hand over to another part of the program, which
reads its part, and after that the main program continues pulling the
events that are left.
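
A rough sketch of that hand-over idea in Java. It uses the StAX
XMLStreamReader from javax.xml.stream as a stand-in pull API; that API
is an assumption made for illustration and is not the SAXPP interface
proposed above. The element names are hypothetical.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

class PullDemo {

    /** A separate part of the program: consumes one <header> element and returns. */
    static String readHeader(XMLStreamReader r) throws XMLStreamException {
        // getElementText pulls events up to the matching end tag and stops there.
        return r.getElementText();
    }

    /** The main loop: pulls events itself and hands the reader over for <header>. */
    static String process(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            XMLStreamReader r = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
            while (r.hasNext()) {
                if (r.next() != XMLStreamConstants.START_ELEMENT) continue;
                if (r.getLocalName().equals("header")) {
                    out.append("header=").append(readHeader(r)).append(';');
                } else if (r.getLocalName().equals("item")) {
                    out.append("item;");
                }
            }
            r.close();
        } catch (XMLStreamException e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints "header=hi;item;item;"
        System.out.println(process("<msg><header>hi</header><item/><item/></msg>"));
    }
}
```

The caller keeps control of the loop: after readHeader returns, the same
reader simply continues with whatever events are left.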

best regards

Stefan

-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From jjc at jclark.com  Fri Dec 24 02:33:44 1999
From: jjc at jclark.com (James Clark)
Date: Mon Jun  7 17:18:52 2004
Subject: SAX2: Should SAXException extend IOException?
References: <776DC00B49ECD21189750090273F729130B373@EROS> <14434.28033.594048.829051@localhost.localdomain>
Message-ID: <3862DB61.319FF5E@jclark.com>

David Megginson wrote:
>
> 1. Is SAXException logically a kind of I/O exception?
> 
>    After a lot of thought (and practical experience writing apps and
>    libraries that use SAX), I'm certain that the answer to this
>    question is 'yes'.  I know that to many of us on the list, XML is
>    the sun, the moon, and the stars, but for the rest of the world
>    it's just a fancy format that you can write information to or read
>    it from -- in other words, XML is almost never the point, just a
>    means.
> 
>    From that point of view, reading information from an XML document
>    is a kind of I/O, so it makes sense for SAXException to be a kind
>    of IOException.

But SAXExceptions do not just represent exceptions in reading
information from an XML document.  They also represent arbitrary
exceptions thrown by callbacks during the course of processing an XML
document.  I cannot see any argument on the basis of which these can be
considered I/O Exceptions.

> 2. Should SAX2 callbacks throw IOException or SAXException?

This seems like a false dichotomy to me.  Why not keep SAXException as
it is, not derived from IOException, but change the handler functions to
throw both IOException and SAXException?  This avoids tunnelling in the
common case.

Note that in Java if I have an interface that declares a method as
throwing exceptions A and B,  an implementation of that method can be
declared as throwing both A and B, or just A or just B or nothing at
all.
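
A small illustration of that rule. TextHandler is a hypothetical
interface, not one of the SAX handler interfaces; it declares both
exception types, and each implementation narrows the throws clause.

```java
import java.io.IOException;
import org.xml.sax.SAXException;

class ThrowsDemo {

    /** Hypothetical callback interface that declares both exception types. */
    interface TextHandler {
        void characters(String text) throws IOException, SAXException;
    }

    /** Declares nothing: callers holding this concrete type need no try/catch. */
    static class Ignore implements TextHandler {
        @Override public void characters(String text) { }
    }

    /** Declares only IOException, so an I/O failure needs no tunnelling. */
    static class Echo implements TextHandler {
        private final Appendable out;
        Echo(Appendable out) { this.out = out; }
        @Override public void characters(String text) throws IOException {
            out.append(text);
        }
    }

    /** Declares only SAXException. */
    static class NonEmpty implements TextHandler {
        @Override public void characters(String text) throws SAXException {
            if (text.isEmpty()) throw new SAXException("empty text");
        }
    }

    static String run() {
        StringBuilder sb = new StringBuilder();
        new Ignore().characters("x"); // compiles without a try block
        TextHandler[] handlers = { new Ignore(), new Echo(sb), new NonEmpty() };
        try {
            // Through the interface type, both exceptions must be handled.
            for (TextHandler h : handlers) h.characters("abc");
            new NonEmpty().characters("");
        } catch (IOException e) {
            return "io: " + e.getMessage();
        } catch (SAXException e) {
            return sb + "|" + e.getMessage();
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "abc|empty text"
    }
}
```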

James




From Steve.Ball at zveno.com  Fri Dec 24 05:27:34 1999
From: Steve.Ball at zveno.com (Steve Ball)
Date: Mon Jun  7 17:18:52 2004
Subject: WISH! Open Source XML Editor [was Re: psgml namespaces andschemas]
References: <Bruce.Durling@equifax.com's message of "Thu, 23 Dec 1999 09:19:13 +0000">
	 <85256850.0032FAA1.00@noteswetc15.fin.equifax.com> <3.0.1.32.19991224135813.009b9380@pop3.demon.co.uk>
Message-ID: <386304A8.7F459F1D@zveno.com>

Peter Murray-Rust wrote:
> 
> At 07:10 AM 12/23/99 -0500, David Megginson wrote:
> >Bruce.Durling@equifax.com writes:
> >and it might be
> >nice to get a more modern open-source XML editing tool under
> >development.
> >
> If I had a single wish as departing moderator it would be that we have a
> project to develop an open source editor for XML.

Wish no longer... an Open Source XML Editor is here right now:

Swish!

My company, Zveno, recently decided to release our editor as an
Open Source Software project.  We haven't announced it yet, because
we've been pretty busy with other projects, but the source code
is available right now at:

ftp://ftp.zveno.com/swish/Swish-1.0b5.tar.gz	or
ftp://ftp.zveno.com/swish/Swish-1.0b5.zip

You need to have Tcl/Tk 8.1 (or better) installed to run it.

> There is anyway a
> shortage of  editors at present, and those that do exist are (not
> unreasonably) usually tied to a single author's point of view (e.g.
> streamed text, hierarchical content, etc.) As far as I know, none of them
> are easily extensible at API level, and those that do have APIs will differ
> enormously from each other.

All quite true.  I'll mention my points-of-view and then discuss the
Swish project.

Firstly, Swish is written in Tcl/Tk.  Some people like Tcl, some don't...
but one has to choose an implementation language and Tcl provides a
number of practical advantages - simplicity, good GUI toolkit, 
extensibility.  One of the most important advantages is packaging;
it is quite easy to get Tcl/Tk installed on a platform, and it is
relatively easy to create a single executable for folks to download.

If you want to argue the relative merits of Tcl/Tk, then perhaps it
would be best to email me offline from the list.

Another feature of Tcl is that it plays nicely with other languages.
For example, using Tcl Blend (the Tcl interface to Java) we could 
incorporate calls to Java classes such as Java (validating) parsers,
XSL processors, etc.  The idea here is that Tcl provides a high-level
glue language for components provided in Java (or Python, or C++, or...)

As far as design choices go, I have modelled some simple UIs (tree view,
XML source view) but I am very keen to explore alternatives.
That's a major reason for making this package open source.
Swish has a plugin system to allow for this kind of extension.

APIs are very, very important and it is early days for Swish.
Developing a comprehensive plugin API is on my TODO list.
Perhaps the Tcl Plugin API should have a corresponding Java API?

> The lack of an API for an editor effectively makes it impossible for people
> to develop a modular approach. Many of the "non-textual" DTD/schemas will
> require specialist editors (my own interest is chemistry, but Math,
> Geography/maps, SVG, etc are all similar). We need to be able to
> concentrate *just* on the domain-specific parts of our subject, and not to
> be concerned with general structural or technical editing.

I have recognised that application/domain-specific UIs are going to be
extremely important.  The interfaces that I have supplied (as for 
other existing XML editors) are too general-purpose.  Again, the
plugin mechanism is there to cater for new interfaces.

> I have raised this subject from time to time over the last year or two and
> haven't found it easy to get interest. Now that XML is really here, editing
> is a key requirement for creating documents. For example, I know that there
> is pressure to create graduate theses in electronic form - but this is not
> easy in XML if there is anything other than text in the thesis. Are there
> other readers of this list that understand the problem and do we have a way
> forward?

Well, I'm "putting my code where my mouth is" to find a way forward.
Anyone who is interested is very welcome to contact me.

We here at Zveno are working hard (despite the Summer holidays) to
get the website updated to support Swish's new status.  Please bear
with us while we catch up on our workload!

Have a great Christmas everyone, and a New Year that's a blast!

Cheers,
Steve Ball

-- 
Steve Ball            |   Swish XML Editor    | Training & Seminars
Zveno Pty Ltd         |   Web Tcl Complete    |      XML XSL
http://www.zveno.com/ |    TclXML TclDOM      | Tcl, Web Development
Steve.Ball@zveno.com  +-----------------------+---------------------
Ph. +61 2 6242 4099   | Mobile (0413) 594 462 | Fax +61 2 6242 4099



From aray at q2.net  Fri Dec 24 05:49:33 1999
From: aray at q2.net (Arjun Ray)
Date: Mon Jun  7 17:18:52 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <33D189919E89D311814C00805F1991F7F4AA9D@RED-MSG-08>
Message-ID: <Pine.LNX.4.10.9912240115310.10551-100000@mail.q2.net>


On Thu, 23 Dec 1999, Andrew Layman wrote:

> Clark Evans asks why PIs are not the mechanism for namespace
> declaration. That option was extensively debated during the design
> process (see the archives for details).

What archives?  The index at

  http://lists.w3.org/Archives/Public/

does not carry the xml-sig list.  Even though that list has been defunct
for over a year, it's only W3C process rules or somesuch that "justify"
its non-publication.

Any number of questions and debates about namespaces on this list and
others could have been answered if not resolved with much less traffic had
the archives been available for scrutiny. 

> The short answer is that PIs do not have tree scope, so are unsuitable
> for modular document construction.

Um, no.  Check the archives for details:-)

As David Megginson writes:

: There are a lot of answers to this question, but in the end, the real
: argument was that PIs cause display problems in level-3 and level-4
: HTML browsers, and some influential parties [1] had a strong interest
: in being able to write HTML+XML documents that, by various sorts of
: lexical trickery, could still be displayed in XML-oblivious browsers
: like Netscape 3.

and David Brownell comments:

: More accurate is that certain person (or persons) disliked
: PIs extremely, if even a tenth of what I've heard is accurate.

Yep, a part of W3C Canon.  Keeping The Web Safe For Netploder.


Arjun




From fujisawa at the.canon.co.jp  Fri Dec 24 05:55:48 1999
From: fujisawa at the.canon.co.jp (Jun Fujisawa)
Date: Mon Jun  7 17:18:52 2004
Subject: SAX2: Renaming DTDHandler
In-Reply-To: <14432.59332.741656.299651@localhost.localdomain>
Message-ID: <v04010109b488bb237b02@the.canon.co.jp>

At 10:01 AM -0500 99.12.22, David Megginson wrote:
> Since we're implementing SAX2 in a new package anyway, I'm interested
> in hearing suggestions for renaming DTDHandler so that the name more
> accurately describes its purpose (reporting notation and unparsed
> entity declarations).  Here are some ideas off the top of my head:

How about simply calling it NotationDeclHandler or UnparsedEntityHandler?

I think both describe the purpose of the interface reasonably well, since
notation and unparsed entity declarations are almost always used together.
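
For reference, a short Java sketch of the two callbacks the interface
carries today, using the existing org.xml.sax.DTDHandler. The DtdDecls
class and the sample document are hypothetical and serve only as an
illustration.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.DTDHandler;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;

class DtdDecls {

    /** Records the notation and unparsed entity declarations reported by the parser. */
    static String declarations(String xml) {
        StringBuilder out = new StringBuilder();
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            reader.setDTDHandler(new DTDHandler() {
                @Override
                public void notationDecl(String name, String publicId, String systemId) {
                    out.append("notation:").append(name).append(';');
                }
                @Override
                public void unparsedEntityDecl(String name, String publicId,
                                               String systemId, String notationName) {
                    out.append("entity:").append(name).append('/')
                       .append(notationName).append(';');
                }
            });
            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String xml = "<!DOCTYPE doc ["
                   + "<!NOTATION gif SYSTEM \"image/gif\">"
                   + "<!ENTITY logo SYSTEM \"logo.gif\" NDATA gif>"
                   + "]><doc/>";
        System.out.println(declarations(xml));
    }
}
```

The unparsed entity callback carries the notation name, which is one
reason the two declarations tend to travel together.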

--
Jun Fujisawa
<mailto:fujisawa@the.canon.co.jp>



From aray at q2.net  Fri Dec 24 06:06:02 1999
From: aray at q2.net (Arjun Ray)
Date: Mon Jun  7 17:18:52 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <3862913B.67D2E25C@pacbell.net>
Message-ID: <Pine.LNX.4.10.9912240130320.10551-100000@mail.q2.net>


On Thu, 23 Dec 1999, David Brownell wrote:

> What you're suggesting is that PIs be lexically scoped.
> (That's what Andrew seems to mean by "tree" scope.)

Especially "local".  The "requirement" that had to be met, apparently, was
that the syntactic device announcing a "local" lexical scope had to be
"locally" available itself (thus ruling out, e.g., stuff in the internal
subset that would be indefinitely "far away".)

> And in fact, there's nothing in the world preventing the definition of
> a particular PI from using lexical scope. One doesn't need all PIs to
> work that way; only one.

Yes.  There are only two natural scoping constructs in XML: elements and
marked sections.  There was no consensus on how MS syntax could be
extended (if at all), so the issue effectively became one of working with
the element structure.  A PI pointing to an ID could have been enough.

> I've no intention of reopening the debate on this topic (we're stuck
> with attributes), but I've got this strange belief that truth should
> be told, so I couldn't let this one slip by.

On the archive we've been referred to for the details, it was quoted:

"The making of laws, and of sausages, should be hidden from children"


Arjun




From ashish_agarwal at msdc.hcltech.com  Fri Dec 24 12:13:13 1999
From: ashish_agarwal at msdc.hcltech.com (Ashish Agarwal,AMB Chennai)
Date: Mon Jun  7 17:18:52 2004
Subject: No subject
Message-ID: <21FCEFDE42DFD211A1A10007250603B2E6EC36@PLUTO>

Hi All,
I have this problem as explained below.... 

I want to pass XML as a string from my ASP program to my COM object.
The problem is that I prepare the XML using the XMLDOM's "set"
methods.
I don't understand how I can assign my XMLDOM object to a
string and then pass it to my COM object.
Any tips, please?

Rgds,
Ashish



From ashish_agarwal at msdc.hcltech.com  Fri Dec 24 12:14:06 1999
From: ashish_agarwal at msdc.hcltech.com (Ashish Agarwal,AMB Chennai)
Date: Mon Jun  7 17:18:52 2004
Subject: Problems with XML
Message-ID: <21FCEFDE42DFD211A1A10007250603B2E6EC39@PLUTO>

> Hi All,
> I have this problem as explained below.... 
> 
> I want to pass XML as string... from my ASP program to my COM object.
> but now the problem is, I prepare the XML using the XMLDOMs "set"
> methods. 
> I dont understand, as to, how can I assign my XMLDOM object to a
> string and then pass it to my COM object.
> Any tips please....
> 
> Rgds,
> Ashish



From James.Anderson at mecomnet.de  Fri Dec 24 13:59:51 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:52 2004
Subject: Why not PIs for namespace declarations?
References: <Pine.LNX.4.10.9912240130320.10551-100000@mail.q2.net>
Message-ID: <38637D70.FB4B731B@mecomnet.de>

Arjun Ray wrote:
> 
> On Thu, 23 Dec 1999, David Brownell wrote:
> 
> > What you're suggesting is that PIs be lexically scoped.
> > (That's what Andrew seems to mean by "tree" scope.)
> 
> Especially "local".  The "requirement" that had to be met, apparently, was
> that the syntactic device announcing a "local" lexical scope had to be
> "locally" available itself (thus ruling out, e.g., stuff in the internal
> subset that would be indefinitely "far away".)

I surmise that "stuff" here refers to a PI which would have preceded or
followed the respective element tag. If this was the reason which, in fact,
swayed the decision, the cited quotation obtains a remarkable irony.

> 
> > And in fact, there's nothing in the world preventing the definition of
> > a particular PI from using lexical scope. One doesn't need all PIs to
> > work that way; only one.
> 
> Yes.  There are only two natural scoping constructs in XML: elements and
> marked sections.  There was no consensus on how MS syntax could be
> extended (if at all), so the issue effectively became one of working with
> the element structure.  A PI pointing to an ID could have been enough.
> 

As XML had, to that point in time, neither a storage nor a processing model,
any arguments regarding "natural" would have been most suspect. A claim, for
example, that the present encoding does not place the encoding for the
namespace binding "indefinitely far away" from the encoding for the respective
element type depends on the presumption of a processing structure akin to that
proposed in the recent strawman SAX2. Namely one in which interning the type
name is deferred until the attributes have been read. A PI encoding with a
lexical scope covering the immediately succeeding element would not have made
this presumption.

> > I've no intention of reopening the debate on this topic (we're stuck
> > with attributes), but I've got this strange belief that truth should
> > be told, so I couldn't let this one slip by.
> 
> On the archive we've been refered to for the details, it was quoted:
> 
> "The making of laws, and of sausages, should be hidden from children"
>




From ricko at allette.com.au  Fri Dec 24 14:13:23 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:52 2004
Subject: Why not PIs for namespace declarations?
Message-ID: <002601bf4e1c$a49d0d30$30f96d8c@NT.JELLIFFE.COM.AU>

 From: Arjun Ray <aray@q2.net>

>Yes.  There are only two natural scoping constructs in XML: elements and
>marked sections.

I think there were three scopes in XML:
    * element over content
    * element over attribute
    * entities (e.g. documents)

If we need element scope (either over content or over attributes) then
elements are the appropriate mechanism to produce the effect.

If we need document scope for semantics that are particular to
a document type, then again elements are appropriate.

If we need document scope independent of the document type,
or parsed-entity scope, then PIs are appropriate.

If we need a point only, and the point is related to the document
type, then an element is appropriate.

If we need a point, and the point is related to the main document
type and we have a schema that will not be disrupted by the presence of
elements from another namespace, then an empty element from
another namespace is appropriate. (In SGML, we can use
global inclusions for this, too.)

If we have a point, and that point is unrelated to the main document
type, and we are using DTDs or are processing using XPaths or
their equivalent that may fall over if the element structure contains
elements in unexpected places, then PIs are appropriate.

So I think the xmlns was a good call, even though I originally
strongly supported PIs.  Once it was decided that namespaces
should have some element scope (to allow easier cut-and-paste without
needing to add explicit namespace prefixes everywhere), then
PIs became inappropriate for them.

Unfortunately, it has greatly complicated XML.  Working through
my recent confusion on XSL and namespaces, I have been shocked that
namespaces represent a third stream independent of content or
attributes.  I had thought they were merely a construct built on
top of attributes, but it seems that the way some W3C specs have
used them is to keep them in splendid isolation from other attributes.

So we have PIs that are deemed not PIs (i.e., XML header) and
attributes that are deemed not attributes (i.e., namespace attributes).
What is next: elements that are not elements?  (I suppose this is
what XInclude is.)

So now we have four scopes in XML:
    * element over content
    * element over attribute
    * entities (e.g. documents)
    * namespace

And XML Schema may allow some other scoping too, based on supertype.

Since all these scopes hang off the basic element graph, none of them
provide a substitute for PIs when the rare set of events above are met.
One good use of a PI is this: many C/C++ editors save their settings in a
comment at the bottom of the document. In XML documents one could use
comments or PIs to append arbitrary settings at the end of a document.  I
think PIs are the better choice for that.

Rick




From abisheks at india.hp.com  Fri Dec 24 15:37:28 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun  7 17:18:52 2004
Subject: XSL for C++
Message-ID: <017301bf4e24$a02b6190$252f0a0f@india.hp.com>

Hi,

I need an XSL translator utility for C++. Is there one ?

Another question: I know that using XSL style sheets we can transform XML into HTML.
However, if I want to transform one XML vocabulary into another, can I still use XSL style sheets?
The utility should take in a stream of one xml vocabulary and spit out a stream of another type.

regards,
Abhishek.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    _/               Abhishek Srivastava
   _/                Hewlett Packard ISO       
  _/_/_/   _/_/_/    -------------------   
 _/    /   _/   _/     (Work)   +91-80-2251554 x1190
_/  _/   _/_/_/      (Ip)     15.10.47.37            
        _/           (Url)    http://sites.netscape.net/abhishes/index.html                        
       _/            
                     Work like you don't need the money.
                     Dance like no one is watching.
                     And love like you've never been hurt.
                     --Mark Twain                       
                     
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From tpassin at idsonline.com  Fri Dec 24 17:01:38 1999
From: tpassin at idsonline.com (Thomas B. Passin)
Date: Mon Jun  7 17:18:52 2004
Subject: XSL for C++
Message-ID: <004201bf4e31$3d0d2960$3cfbb1cd@tomshp>

Abhishek Srivastava wrote:

>Hi,

>I need an XSL translator utility for C++. Is there one ?

>Another question, I know that using XSL style sheets we can transform xml
>into html.
>However, If I want to transform one xml vocabulary into another one can I
>still use xsl style sheets ?
>The utility should take in a stream of one xml vocabulary and spit out a
>stream of another type

XSLT transforms one xml document into another.  The output may also be
slightly non-xml if it is HTML, or if it is just text.  So yes, an xslt
stylesheet can transform an xml document into one with a different
vocabulary.
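
As a small illustration, here is a sketch in Java using the standard
javax.xml.transform API (the question asked about C++, so the host
language is an assumption; the stylesheet itself is independent of it).
The <people> and <contacts> vocabularies are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class VocabularyTransform {

    // Stylesheet mapping a hypothetical <people> vocabulary to <contacts>.
    static final String STYLESHEET =
          "<xsl:stylesheet version='1.0'"
        + "    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
        + "  <xsl:output method='xml' omit-xml-declaration='yes'/>"
        + "  <xsl:template match='/people'>"
        + "    <contacts><xsl:apply-templates select='person'/></contacts>"
        + "  </xsl:template>"
        + "  <xsl:template match='person'>"
        + "    <contact label='{name}'/>"
        + "  </xsl:template>"
        + "</xsl:stylesheet>";

    /** Reads one XML vocabulary as a stream and writes the other one out. */
    static String transform(String xml) {
        try {
            Transformer t = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(STYLESHEET)));
            StringWriter out = new StringWriter();
            t.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String people = "<people><person><name>Ada</name></person>"
                      + "<person><name>Bob</name></person></people>";
        System.out.println(transform(people));
    }
}
```

Each <person> element in the input becomes a <contact> element whose
label attribute carries the person's name.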

Tom Passin




From simonstl at simonstl.com  Fri Dec 24 21:41:44 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:52 2004
Subject: Wish lists for the Holidays
Message-ID: <199912242141.QAA20410@hesketh.net>

It's Christmas Eve, and the new year is coming up.  It seems like a good
time to start pondering what we'd like to see over the next year's XML
development.  While the holidays are a hard time to carry on discussion, it
seems like wish lists might be something good for exactly this time of year.

While XML-Dev usually talks about how to achieve particular results, it
might be fun to talk about the results we'd like to get, and see if other
folks are interested as well.  Maybe vendors will listen, maybe
organizations will listen, and maybe this'll end up in some form that goes
beyond XML-Dev.
 
Peter Murray-Rust already posted a wish for an open XML editor API, and
hopefully he won't be getting coal in his stocking.  I'd love to see
something like that emerge in the next year, though it'll take some work.

Personally, I'd like to see some kind of XML-aware data storage that I can
use in small (workgroup-size, not enterprise) environments - a way to
store, manage, and search my data.  I'd love to use WebDAV or some other
open protocol to get information in and out of the repository, with support
for things like versioning, fragment addressing through XPath and XPointer,
easy replication and backup, and cross-platform capabilities.  (Yes, Zope
may get there, though I'd love to see a Java implementation.)

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From mda at discerning.com  Fri Dec 24 22:09:04 1999
From: mda at discerning.com (Mark D. Anderson)
Date: Mon Jun  7 17:18:52 2004
Subject: attribute values as qnames?
In-Reply-To: <33D189919E89D311814C00805F1991F7F4AAA2@RED-MSG-08>
Message-ID: <1158847968.946051653@[192.168.0.2]>

This issue is now freaking me out less, now that I've concluded
that any processor will either:

- expand prefixes for *both* its control info (stylesheet, etc.)
and the document instance(s)
- not expand any, in either place

So if I just make sure I use the same prefix as the source
document, I stand a chance of being immune if I switch from
a processor that does not treat them as qnames, to
a different processor (or version) which decides it will.

The other trick I've seen in a few places, which I might
use, is to use an entity instead of a prefix:
  <el myattr="&someprefix;keyword"/>
I can just set in one place whether that someprefix is a prefix
or a full URI.
That approach also would allow me to be both forward and
backward looking, my favorite pastime.
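
A minimal Java sketch of the entity trick, assuming an internal DTD
subset that declares the entity. The EntityPrefix class and the example
URI are hypothetical; the element and attribute names follow the
snippet above.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EntityPrefix {

    /** Parses <el myattr="&someprefix;keyword"/> with the entity set to the given text. */
    static String myattr(String expansion) {
        String xml = "<!DOCTYPE el [<!ENTITY someprefix \"" + expansion + "\">]>"
                   + "<el myattr=\"&someprefix;keyword\"/>";
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            // The parser has already expanded the entity in the attribute value.
            return doc.getDocumentElement().getAttribute("myattr");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(myattr("p:"));                     // prints "p:keyword"
        System.out.println(myattr("http://example.org/ns#")); // prints "http://example.org/ns#keyword"
    }
}
```

The application sees only the expanded value, so switching between a
prefix and a full URI means changing the single entity declaration.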

-mda



From clark.evans at manhattanproject.com  Fri Dec 24 23:03:13 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:52 2004
Subject: attribute values as qnames?
In-Reply-To: <1158847968.946051653@[192.168.0.2]>
Message-ID: <Pine.LNX.4.10.9912241801050.31497-100000@cauchy.clarkevans.com>

On Fri, 24 Dec 1999, Mark D. Anderson wrote:
> The other trick I've seen in a few places, which I might
> use, is to use an entity instead of a prefix:
>   <el myattr="&someprefix;keyword"/>
> I can just set in one place whether that someprefix is a prefix
> or a full URI.

Now... using entities in this way is the best 
namespace solution I've seen yet.  The prefix 
is, after all, just a variable to be expanded.

Clark




From donpark at docuverse.com  Sat Dec 25 00:30:21 1999
From: donpark at docuverse.com (Don Park)
Date: Mon Jun  7 17:18:52 2004
Subject: Wish lists for the Holidays
In-Reply-To: <199912242141.QAA20410@hesketh.net>
Message-ID: <000c01bf4e6f$6aab2280$d1940e18@c1033339-a.smateo1.sfba.home.com>

My 'XML' wish for year 2000 is: atomic and molecular XML
standards.

Atomic XML standards are one page specs each of which defines
a single 'power word', a tag name or an attribute name.  An
example is 'xmlns' or 'table'.

Molecular XML standards are small specs each of which defines
a single 'power phrase', a micro-schema involving just a few
elements.  An example is an 'address' molecule that consists of
a small number of elements that make up an address.

These 'micro-standards' will allow us to create more coherent
XML document standards as well as XML software that can 'learn'
to handle new standards by plugging in new power words or phrases.

Merry Christmas and a Happy New Millennium to you all.

Best,

Don Park    -   mailto:donpark@docuverse.com
Docuverse   -   http://www.docuverse.com




From stefan.haustein at trantor.de  Sat Dec 25 03:00:18 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:52 2004
Subject: java pull parser / fragment processing
Message-ID: <38643357.3D106C97@trantor.de>

Pull model based parsers offer several advantages 
when processing document fragments. The main
advantage is probably that a pull based parser
can be handed over between different fragment
processing entities without problems. Also,
the processing state can be encoded more naturally
in local variables etc. instead of having to
"find yourself" each time the handler is called.
Another advantage could be that namespaces (SAX2)
can be added without losing compatibility since
all events need to be objects anyway. In contrast
to extending parameter lists, adding new methods
to objects does not destroy compatibility.

I have implemented a java xml parser following 
the pull model on top of a normal (push) SAX parser. 
If you are interested, please take a look at it.
It is available at http://www.trantor.de/saxpp
I am very interested if you think the interface is 
OK or if you have suggestions for improvements.
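
To make the "pull on top of push" idea concrete, here is a minimal sketch
in Python rather than Java. It is not the saxpp interface; the names
EventCollector, pull_events and read_text_until_end are invented for this
illustration. A SAX handler queues the pushed callbacks, and a generator
hands them out one at a time, so the same event stream can be passed from
one fragment processor to another:

```python
import xml.sax
from collections import deque


class EventCollector(xml.sax.ContentHandler):
    """Push side: records each SAX callback as an event tuple."""

    def __init__(self):
        super().__init__()
        self.events = deque()

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name, None))

    def characters(self, content):
        self.events.append(("text", content, None))


def pull_events(xml_text, chunk_size=16):
    """Pull side: feed the push parser in small chunks, yield events on demand."""
    parser = xml.sax.make_parser()
    handler = EventCollector()
    parser.setContentHandler(handler)
    for start in range(0, len(xml_text), chunk_size):
        parser.feed(xml_text[start:start + chunk_size])
        while handler.events:
            yield handler.events.popleft()
    parser.close()
    while handler.events:
        yield handler.events.popleft()


def read_text_until_end(events, name):
    """A fragment processor: consumes events up to the matching end tag."""
    parts = []
    for kind, value, _ in events:
        if kind == "end" and value == name:
            break
        if kind == "text":
            parts.append(value)
    return "".join(parts)


events = pull_events("<doc><title>Hello pull parsers</title><body/></doc>")
next(events)  # start of <doc>
next(events)  # start of <title>
print(read_text_until_end(events, "title"))
print(next(events)[:2])  # the caller resumes where the helper stopped
```

The helper keeps its state in ordinary local variables and simply returns
when its fragment ends, which is the "no need to find yourself" property
described above.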

Best regards

Stefan

-- 
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From aray at q2.net  Sat Dec 25 04:31:09 1999
From: aray at q2.net (Arjun Ray)
Date: Mon Jun  7 17:18:52 2004
Subject: Why not PIs for namespace declarations?
In-Reply-To: <38637D70.FB4B731B@mecomnet.de>
Message-ID: <Pine.LNX.4.10.9912242342230.12898-100000@mail.q2.net>


On Fri, 24 Dec 1999, james anderson wrote:
> Arjun Ray wrote:
> > On Thu, 23 Dec 1999, David Brownell wrote:

> > The "requirement" that had to be met, apparently, was that the
> > syntactic device announcing a "local" lexical scope had to be 
> > "locally" available itself (thus ruling out, e.g., stuff in the
> > internal subset that would be indefinitely "far away".)
> 
> I surmise that "stuff" here refers to a PI which would have preceded
> or followed the respective element tag. 

Not just a PI.  ArchForms, for instance, work with attributes which have
to be declared in <!ATTLIST...> declarations.  (New-fangled PIs are one
way of working around the fact that declarations can't appear within the
instance.)  The *general* idea - to use "special" attributes - can be
considered well-accepted; the issue is how these special attributes are to
be recognized.

> > There are only two natural scoping constructs in XML: elements and
> > marked sections.  

> As XML had, to that point in time, neither a storage nor a processing
> model, any arguments regarding "natural" would have been most
> suspect.

Natural in the syntactic sense: both "start" and "end" lexically separate
and explicit.  "Natural scoping constructs", not "natural scopes":)

> A claim, for example, that the present encoding does not place the
> encoding for the namespace binding "indefinitely far away" from the
> encoding for the respective element type depends on the presumption of
> a processing structure akin to that proposed in the recent strawman
> sax2. Namely one in which interning the type name is deferred until
> the attributes have been read. A PI encoding with a lexical scope
> covering the immediately succeeding element would not have made this
> presumption.

Yes.  The inherent chicken-and-egg problem is normally solved by
separating declaration and use (and sometimes the declaration can be
"indefinitely far away" enough to have to be assumed - e.g. in some
block-structured languages, a new block *mandates* a new lexical scope, so
there's no need to "declare" this fact.)


Arjun




From denis at o3m.com  Sat Dec 25 04:31:01 1999
From: denis at o3m.com (Denis Voitenko)
Date: Mon Jun  7 17:18:53 2004
Subject: IE5 examples please...
Message-ID: <001b01bf4e90$761359a0$1000a8c0@adubn1.nj.home.com>

I've been working with PHP3 for a long time and now plan to integrate XML in
my projects. For right now I do not want to use PHP3 to parse XML, I'd like
to see how IE5 does it. Whatever I tried did not give me good results, I
could not parse my .xml document thru XSL. Could someone show me a "Hello,
World!" type of an example?




From sb at metis.no  Sat Dec 25 10:07:23 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:53 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: Steve Harris's message of "22 Dec 1999 09:40:11 -0800"
References: <EE5F339A2558D311B7360008C73BFD001687BB@exchange1.primus.com> <whk8m7flub.fsf@viffer.metis.no> <sv7li698n8.fsf@hodge.primus.com>
Message-ID: <whn1qzml3n.fsf@viffer.metis.no>

>>>>> Steve Harris <seh@speakeasy.org>:

> I know we need to get work done today, but it's sad that we can't
> use more of the Standard C++ pieces in a project like this. If we're
> successful, this API will outlast the current rev of the lagging
> compilers. I'm still in favor of planning an API that may not work
> for everyone today.

I'm very much in agreement with this, even if it
won't work as-is with my current set of platforms.

<position-statement>
In any case: no matter what objections I raise on this list, and no
matter how large or small a feature set of Standard C++ is used, I will
move my own little SAX implementation as close to what is finally
decided as my platforms will allow.
</position-statement>




From sb at metis.no  Sat Dec 25 13:24:47 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:53 2004
Subject: Request for Discussion: SAX 1.0 in C++
In-Reply-To: Steve Harris's message of "22 Dec 1999 09:40:11 -0800"
References: <EE5F339A2558D311B7360008C73BFD001687BB@exchange1.primus.com> <whk8m7flub.fsf@viffer.metis.no> <sv7li698n8.fsf@hodge.primus.com>
Message-ID: <whr9gbmlev.fsf@viffer.metis.no>

>>>>> Steve Harris <seh@speakeasy.org>:

> ... It seems that so long as the compiler will guarantee that you
> can fit _at least_ 16 bits in a wchar_t, then your translation code
> would be sufficiently portable.

If you store a lot of strings (as we do), I'm afraid using twice the
amount of space that we actually need will be a serious
performance killer.

I'm also worried that basic_string<> seems to really be meant for
uniform width character codings, and I'm not sure what happens once we
run into the surrogates that extend the coding beyond UCS-2.  Will we
be able to use it?  Will it create trouble of some sort?
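
Both worries can be put in numbers with a short sketch. It is in Python
rather than C++, and it says nothing about any particular basic_string<>
implementation; it only counts storage units. The helper names are made up
for the illustration:

```python
def utf16_code_units(text):
    """Number of 16-bit code units needed to store text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def utf8_bytes(text):
    """Number of bytes needed to store text as UTF-8."""
    return len(text.encode("utf-8"))


ascii_name = "element"
clef = "\U0001D11E"  # MUSICAL SYMBOL G CLEF, outside the 16-bit range

# Storage cost of an ASCII-only name: bytes in UTF-8 vs. bytes in UTF-16.
print(utf8_bytes(ascii_name), 2 * utf16_code_units(ascii_name))

# One character, but a surrogate pair (two code units) in UTF-16.
print(len(clef), utf16_code_units(clef))
```

For ASCII-heavy markup a 16-bit representation doubles the space, and once
surrogates appear the number of 16-bit units no longer equals the number
of characters, so indexing by position stops meaning "the n-th character".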



From sb at metis.no  Sat Dec 25 19:50:37 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:53 2004
Subject: psgml namespaces and schemas
In-Reply-To: David Megginson's message of "23 Dec 1999 07:10:07 -0500"
References: <85256850.0032FAA1.00@noteswetc15.fin.equifax.com> <m31z8dn9i8.fsf@localhost.localdomain>
Message-ID: <whbt7fj9rg.fsf@viffer.metis.no>

>>>>> David Megginson <david@megginson.com>:

> PSGML allows you to edit XML documents that happen to contain
> Namespaces, but it doesn't know anything about the Namespaces view.

It does?  psgml 1.2.1 on GNU emacs 20.3, croaked when parsing 
	<!ELEMENT xlink:simple EMPTY>
in a DTD, with the following error message:
	Recompiling DTD file /home/sb/apps/lib/sgml/metis.dtd...
	/home/sb/apps/lib/sgml/metis.dtd line 181 col 15 
	- line 0 col 0 entity METIS
	/home/sb/models/training.xml line 3 col 51 
	Name expected; at: :simple EMPT

Is there something I've forgotten to turn on in psgml or is there
something syntactically wrong with the <!ELEMENT> declaration above, I 
wonder?
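
As a cross-check of the syntax question only: XML 1.0 permits a colon in a
Name, so a plain XML 1.0 parser with namespace processing switched off
should accept the declaration. The sketch below tries that with Python's
expat binding. It does not diagnose psgml or its configuration, and the
tiny test document is made up:

```python
import xml.parsers.expat

DOC = """<!DOCTYPE xlink:simple [
  <!ELEMENT xlink:simple EMPTY>
]>
<xlink:simple/>"""


def is_well_formed(xml_text):
    """Parse with expat (no namespace processing) and report success."""
    parser = xml.parsers.expat.ParserCreate()
    try:
        parser.Parse(xml_text, True)
        return True
    except xml.parsers.expat.ExpatError:
        return False


print(is_well_formed(DOC))
```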



From denis at o3m.com  Sun Dec 26 03:14:06 1999
From: denis at o3m.com (Denis Voitenko)
Date: Mon Jun  7 17:18:53 2004
Subject: Need advise on a book about XML.
Message-ID: <002501bf4f4e$f2f2b920$1000a8c0@adubn1.nj.home.com>

Hello,

I am trying to jump into XML development and have learned many things from
the examples I have seen so far, yet I think I could learn more from a good
book. Could someone point me to "THE ONE" ?

Denis
denis@o3m.com




From ed.nixon at LynnParkPlace.org  Sun Dec 26 04:06:48 1999
From: ed.nixon at LynnParkPlace.org (Ed Nixon)
Date: Mon Jun  7 17:18:53 2004
Subject: Need advise on a book about XML.
In-Reply-To: <002501bf4f4e$f2f2b920$1000a8c0@adubn1.nj.home.com>
Message-ID: <000001bf4f56$8f30e030$0100a8c0@lynnparknt1.iprimus.ca>

I like The XML Bible by Elliotte Rusty Harold. Check out some samples via
his XML news site, Café con Leche: http://metalab.unc.edu/xml/

	...edN

> -----Original Message-----
> From: owner-xml-dev@ic.ac.uk
> [mailto:owner-xml-dev@ic.ac.uk]On Behalf Of
> Denis Voitenko
> Sent: Saturday, December 25, 1999 10:12 PM
> To: xml-dev@ic.ac.uk
> Subject: Need advise on a book about XML.
>
>
> Hello,
>
> I am trying to jump into XML development and have learned
> many things from
> the examples I have seen so far, yet I think I could learn
> more from a good
> book. Could someone point me to "THE ONE" ?
>
> Denis
> denis@o3m.com
>
>
>




From prb at uic.edu  Sun Dec 26 07:00:56 1999
From: prb at uic.edu (Paul R. Brown)
Date: Mon Jun  7 17:18:53 2004
Subject: XML Convertor
In-Reply-To: <212C7E060072D111809E00805FA67EF2016988AF@BMGNT001>
Message-ID: <LOBBKHPNFLHCMNDAGPBLKEFBDIAA.prb@uic.edu>


> I am currently looking out for converting Word Perfect, MS
> Word and ASCII files into XML.

As Robert DuCharme suggested, you're better off dealing with RTF, but even
that is a moving target.  (Microsoft says that the spec is subject to
change...)  Writing an RTF-to-TXT conversion in Perl is a good exercise (all
you really need are regular expressions), and then it's up to you how much
of the RTF markup you want to preserve or not.
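
A deliberately naive sketch of the regular-expression approach, in Python
rather than Perl. The function name and the sample input are made up, and
the patterns ignore much of real RTF (font tables, escaped braces, Unicode
escapes), so treat it as a starting point only:

```python
import re

# A control word: backslash, letters, optional numeric parameter, and the
# single space that may terminate it.
CONTROL_WORD = re.compile(r"\\[a-zA-Z]+-?\d* ?")
# A hex-escaped character such as \'e9; simply dropped in this sketch.
HEX_ESCAPE = re.compile(r"\\'[0-9a-fA-F]{2}")


def rtf_to_text(rtf):
    """Strip control words, hex escapes and group braces from RTF."""
    text = HEX_ESCAPE.sub("", rtf)
    text = CONTROL_WORD.sub("", text)
    text = text.replace("{", "").replace("}", "")
    return text.strip()


print(rtf_to_text(r"{\rtf1\ansi Hello \b world\b0 !}"))
```

To preserve some of the markup instead of discarding it, the substitution
for CONTROL_WORD could map selected control words (bold, paragraph breaks)
to XML tags and drop the rest.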

For marking ASCII into XML, you're going to deal with a host of problems.
(Capturing tables into markup is an interesting exercise, for instance.)

What is it that you want to accomplish?

	- Paul




From abisheks at india.hp.com  Sun Dec 26 09:16:52 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun  7 17:18:53 2004
Subject: XSL for C++
References: <5D0478DD31B2D21194E90090273C41D3CEA8E3@az15m06.iac.honeywell.com>
Message-ID: <000f01bf4f81$dee24710$252f0a0f@india.hp.com>

Hi,

I am using IBM's XML4C++ on HP-UX.
I am not using XSL for B2B translations.
We have a distributed system in which XML is used as a carrier for messages.
However, at some points we need to transform the XML message into a native
system format.
For example, if I have a component that uses IBM's MQ Series, then I want
to transform the XML message into an MQ message format that this component
can understand. Since this component is a legacy component, I do not wish
to alter its code to make it understand XML; instead I intend to write an
adapter which will translate the XML message into an MQ Series message and
serve it to this component. To build such an XML-to-MQ adapter I need an
XSL Transformation utility in C++ on HP-UX.
I checked out Xalan; currently, it's only available for windoze.


regards,
Abhishek.
----- Original Message -----
From: "Srinivasan, Veeraraghavan (OH35)"
<Veeraraghavan.Srinivasan@iac.honeywell.com>
To: "Abhishek Srivastava" <abisheks@india.hp.com>
Sent: Saturday, December 25, 1999 10:44 PM
Subject: RE: XSL for C++


> I do not understand what you mean by XML translator for C++. Which parser
> are you using?
>
> XSL can be used to transform from one XML to another XML. If it is a B2B
> transformation, I would recommend using standard Middle tier B2B products
> such as Biztalk rather than writing your own that handles transformation
> and messaging.
>
> Hope this helps.
>
>               Honeywell
> Veeraraghavan Srinivasan
> Senior Principal Engineer
> Honeywell Hi-Spec Solutions
> 1280, Kemper Meadow Drive,
> Cincinnati, OH 45240
> Phone: (513) 595-8913
>
>
>
> > -----Original Message-----
> > From: Abhishek Srivastava [SMTP:abisheks@india.hp.com]
> > Sent: Friday, December 24, 1999 10:36 AM
> > To: 'xml-dev'
> > Subject: XSL for C++
> >
> > Hi,
> >
> > I need an XSL translator utility for C++. Is there one ?
> >
> > Another question, I know that using XSL style sheets we can transform
> > xml into html.
> > However, If I want to transform one xml vocabulary into another one can
> > I still use xsl style sheets ?
> > The utility should take in a stream of one xml vocabulary and spit out a
> > stream of another type.
> >
> > regards,
> > Abhishek.
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >     _/               Abhishek Srivastava
> >    _/                Hewlett Packard ISO
> >   _/_/_/   _/_/_/    -------------------
> >  _/    /   _/   _/     (Work)   +91-80-2251554 x1190
> > _/  _/   _/_/_/      (Ip)     15.10.47.37
> >         _/           (Url)
> > <http://sites.netscape.net/abhishes/index.html>
> >        _/
> >                      Work like you don't need the money.
> >                      Dance like no one is watching.
> >                      And love like you've never been hurt.
> >                      --Mark Twain
> >
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> >
>




From francis at redrice.com  Sun Dec 26 11:56:57 1999
From: francis at redrice.com (Francis Norton)
Date: Mon Jun  7 17:18:53 2004
Subject: Wish lists for the Holidays
References: <199912242141.QAA20410@hesketh.net>
Message-ID: <386602D4.73E1D651@redrice.com>

Nice challenge, Simon.

What I'd like to see by way of results is:

[1] tools to bring real-life HTML into XML, so it can be manipulated via
DOM and SAX.

[2] visual xpath expression editors, so that the expression can be
edited in the context of a document, DTD or schema; and with a good
visual metaphor for clarifying xpath's implicit type conversions.

[3] tools which use xpath expressions for stress-testing and processing
XML- and HTML-based service results.



From chq at softlab.nju.edu.cn  Sun Dec 26 12:49:52 1999
From: chq at softlab.nju.edu.cn (Chen Hong Qiang)
Date: Mon Jun  7 17:18:53 2004
Subject: Question: how to save file using xerces.jar method!
Message-ID: <3867611D.90FA16E1@softlab.nju.edu.cn>

Hello:
    This may be a stupid question.
    How can I save XML files using xerces.jar (Java)? More
specifically, I cannot find a method, such as class.save(String
filename, Document doc), to save the file. Can anybody help me?
Thanks.




From bckman at ix.netcom.com  Sun Dec 26 14:03:10 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:53 2004
Subject: 
References: <21FCEFDE42DFD211A1A10007250603B2E6EC36@PLUTO>
Message-ID: <008c01bf4fab$e0a561c0$baacdccf@prioritynetworks.net>

To pass a string to the XML DOM, use the 'loadXML' method:

dim oXMLdoc, strXML

strXML="<greeting>hello world</greeting>"

set oXMLdoc=server.createObject("Microsoft.XMLDOM")

oXMLdoc.loadXML(strXML)

Of course any XML doc can be opened and manipulated as a string using the
file objects.
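
For comparison, the same string-to-DOM-and-back round trip can be sketched
with Python's xml.dom.minidom. This is not the MSXML API; the helper names
are made up, and the snippet only illustrates the general idea of moving
between a string and a DOM tree:

```python
from xml.dom import minidom


def dom_from_string(xml_text):
    """Build a DOM document from a string of XML."""
    return minidom.parseString(xml_text)


def string_from_dom(document):
    """Serialize the document element back to a string."""
    return document.documentElement.toxml()


doc = dom_from_string("<greeting>hello world</greeting>")
doc.documentElement.setAttribute("lang", "en")  # modify through the DOM
print(string_from_dom(doc))
```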

HTH

Frank


----- Original Message -----
From: Ashish Agarwal,AMB Chennai <ashish_agarwal@msdc.hcltech.com>
To: <xml-dev@ic.ac.uk>
Sent: Friday, December 24, 1999 7:20 AM


> Hi All,
> I have this problem as explained below....
>
> I want to pass XML as string... from my ASP program to my COM object.
> but now the problem is, I prepare the XML using the XMLDOMs "set"
> methods.
> I dont understand, as to, how can I assign my XMLDOM object to a
> string and then pass it to my COM object.
> Any tips please....
>
> Rgds,
> Ashish
>
>




From uche.ogbuji at fourthought.com  Sun Dec 26 20:14:29 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class 
In-Reply-To: Your message of "15 Dec 1999 13:06:21 EST."
             <m3r9gohyea.fsf@localhost.localdomain> 
Message-ID: <199912262014.NAA02668@localhost.localdomain>

> > So I think it would be cleaner to deal with the fact that names can have
> > two parts, and not kludge them together with {} marks.  -Tim
> 
> So, in other words, we'd have something like this:
> 
>   public interface DocumentHandler2 extends DocumentHandler {
>     public void startElement (String ns, String name, AttributeList2 atts);
>     public void endElement (String ns, String name);
>   }
> 
>   public interface AttributeList2 extends AttributeList {
>     public String [] getName (int i);
>     public String getType (int i);
>     public String getValue (int i);
>     public String getType (String ns, String name);
>     public String getValue (String ns, String name);
>   }
> 
> We talked about this a few months ago, but I'd be happy to hear what
> people think now.

I'm sorry I'm late to the discussion, but I'm emphatically with Tim on this one.
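
For a feel of what two-part names look like to a handler, here is a sketch
using Python's xml.sax with namespace processing turned on. It is not the
Java interface proposed above; NameCollector and collect_names are names
invented for the illustration. The handler receives the name as a
(namespace URI, local name) pair instead of one glued string:

```python
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class NameCollector(ContentHandler):
    """Records the (namespace URI, local name) pair of every start tag."""

    def __init__(self):
        super().__init__()
        self.names = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name  # the name arrives already split in two parts
        self.names.append((uri, local))


def collect_names(xml_text):
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = NameCollector()
    parser.setContentHandler(handler)
    parser.feed(xml_text)
    parser.close()
    return handler.names


print(collect_names('<h:a xmlns:h="http://www.w3.org/1999/xhtml"/>'))
print(collect_names('<a xmlns="http://www.w3.org/1999/xhtml"/>'))
```

Both documents report the same pair, so code written against one
vocabulary can test the URI once and then work with local names only.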


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sun Dec 26 20:27:21 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class 
In-Reply-To: Your message of "Wed, 15 Dec 1999 15:37:30 EST."
             <199912152037.PAA30480@hesketh.net> 
Message-ID: <199912262027.NAA02713@localhost.localdomain>

> At 11:57 AM 12/15/99 -0800, Tim Bray wrote:
> >i.e. the namespace processing is highly decoupled from the name
> >processing.  Another way to say it is that much name processing will
> >be written to deal with one particular vocabulary, and want to just
> >deal with names, assuming the NS to have been checked already.
> 
> I'm afraid my experience is rather different - that in building XML
> applications, people are reading the namespaces spec as providing a new and
> more sophisticated name, not a multi-level architecture.  While the
> multi-level architecture is intriguing architecturally, I'm not sure that
> requiring every application to support it is even worth contemplating.

My experience differs greatly from yours.  In _every_ XML application I have 
been involved with in the last six months or so, and that accounts for many, 
XML namespaces have been used to differentiate processing, and in that case, 
gluing the URI to the local name would make life uglier for us.

However, I have an idea I might disagree with Tim on one point.  I think that 
the name that SAX2 signals to the application should include the prefix.  I 
know that the prefix is nothing more than syntactic sugar, but it can be very 
useful sugar when trying not to surprise the end-user.  Often prefixes are 
chosen as a useful mnemonic, for example, using "xsl" as a common prefix for 
'http://www.w3.org/1999/XSL/Transform'.  In this case, it is nice to be 
courteous enough to maintain the prefix through transformations if possible.

> It's fine as an option, but for many many use cases - especially smaller
> use cases where SAX is being used for its quick-and-dirty nature, I think
> I'd much rather have the big kludged string.  If I as a programmer have to
> deal with this every time I write a handler, or even have to track down
> filters, I'll waste a lot of time complaining on XML-dev about what an
> utterly idiotic notion namespaces in XML were to begin with.  If I can just
> tell the parser my preference, and not be forced into extra work, I'll be a
> lot more productive.

But you've already made all these complaints, and I daresay we've all learned 
how to delete certain posts in a hurry.  I don't mean any personal offense, 
but I think Namespaces did an excellent job of addressing a difficult problem, 
and the fact that some people disagree is hardly grounds for excluding them 
from every following standard.  It is part of XML, deal with it.

> Until schemas/DTDs are capable of doing real work with namespaces, we are
> living in the pre-namespace era.  The W3C dropped the ball on validation
> and namespaces, and we've been living with the consequences - life between
> 'eras' - ever since.

Interestingly enough, we've lately been using Schematron for 
validation, and it handles namespaces pretty neatly through the awesome 
namespace-crunching power of XPath and XSLT.
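
A small sketch of why XPath-style matching copes well with namespaces. It
uses the limited XPath subset in Python's ElementTree, not Schematron or a
full XPath engine, and the sample documents and the count_links helper are
made up. The prefix in the path is bound to a URI by the caller, so the
prefixes used inside the document do not matter:

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"


def count_links(xml_text):
    """Count descendant <a> elements in the XHTML namespace."""
    root = ET.fromstring(xml_text)
    # "h" is bound here, independently of any prefix in the document.
    return len(root.findall(".//h:a", {"h": XHTML}))


with_prefix = '<doc xmlns:x="%s"><x:a/><x:p><x:a/></x:p></doc>' % XHTML
with_default = '<doc><p xmlns="%s"><a/></p><a/></doc>' % XHTML

print(count_links(with_prefix))
print(count_links(with_default))
```

In the second document the trailing <a/> is in no namespace, so it is not
counted even though its local name matches.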

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sun Dec 26 20:51:22 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class 
In-Reply-To: Your message of "16 Dec 1999 09:37:21 EST."
             <m34sdj54v2.fsf@localhost.localdomain> 
Message-ID: <199912262051.NAA02771@localhost.localdomain>

> To make this really useful, however, we should add equals(), intern(),
> and hashCode() methods, and that leads to a different (and trickier)
> question: should equals() and hashCode() consider the prefix, or not?  People
> will get really surprising results if 
> 
>   {"http://www.w3.org/1999/xhtml", "a", ""}
> 
> equals()
> 
>   {"http://www.w3.org/1999/xhtml", "a", "html"}
> 
> but it is counterintuitive that the two are not equal from a normal
> processing perspective.  Nasty business, really.

I think that though the prefix is maintained for convenience, it should _not_ 
be considered part of the name in any comparison at the semantic level.

DOM L2 exposes the prefix as well, but users who need to compare node names at 
the semantic level should use node.localName and node.namespaceURI strictly, 
and leave node.nodeName alone.
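
A short sketch of that comparison rule, using Python's xml.dom.minidom as
a stand-in for a namespace-aware DOM. The semantic_name helper is made up
for the illustration:

```python
from xml.dom import minidom

XHTML = "http://www.w3.org/1999/xhtml"


def semantic_name(element):
    """The part of the name that matters for comparison: (URI, local name)."""
    return (element.namespaceURI, element.localName)


a1 = minidom.parseString('<a xmlns="%s"/>' % XHTML).documentElement
a2 = minidom.parseString('<html:a xmlns:html="%s"/>' % XHTML).documentElement

print(a1.nodeName, a2.nodeName)                  # differ: prefix is included
print(semantic_name(a1) == semantic_name(a2))    # same element type
```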

So, in short, I don't see a problem with {"http://www.w3.org/1999/xhtml", "a", 
""} == {"http://www.w3.org/1999/xhtml", "a", "html"}.  It's a very 
well-documented consequence of XML Namespaces 1.0, and users should be aware 
of it.
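
One way to get exactly this behaviour in a name object, sketched in Python
rather than Java: keep the prefix as a field, but leave it out of equality
and hashing. The class ExpandedName is hypothetical and is not part of any
SAX2 proposal:

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ExpandedName:
    namespace_uri: str
    local_name: str
    # Kept for round-tripping and display, ignored by == and hash().
    prefix: str = field(default="", compare=False)

    def qname(self):
        """The name as it would be written in the document."""
        if self.prefix:
            return "%s:%s" % (self.prefix, self.local_name)
        return self.local_name


plain = ExpandedName("http://www.w3.org/1999/xhtml", "a", "")
prefixed = ExpandedName("http://www.w3.org/1999/xhtml", "a", "html")

print(plain == prefixed, hash(plain) == hash(prefixed))
print(plain.qname(), prefixed.qname())
```

Because hashing agrees with equality, the two names also collapse to one
key in a set or dictionary, while qname() still lets a serializer
reproduce the author's choice of prefix.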


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sun Dec 26 20:53:30 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class 
In-Reply-To: Your message of "Thu, 16 Dec 1999 11:29:55 EST."
             <805C62F55FFAD1118D0800805FBB428D02BC01C7@cc20exch2.mobility.com> 
Message-ID: <199912262053.NAA02785@localhost.localdomain>

> > And the namespace-oblivious world is just no longer interesting.
> > 
> > -Tim
> 
> Interesting or not, and that depends on point of view :-), I'm not convinced
> that the namespace-oblivious world is going to go away.  Ever.  If one wants
> to use XML only in the context of one's own application, namespaces aren't
> useful or needed.  One of the reasons XML is so great is that you don't
> *need* a DTD to process XML documents - are we just going to replace the DTD
> with namespaces, and require that all XML documents use namespaces of some
> sort?

Is it not sufficient for the namespace-oblivious world to simply use SAX 1.0?

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sun Dec 26 21:03:39 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: Xpath and DOM 
In-Reply-To: Your message of "Thu, 16 Dec 1999 11:50:30 PST."
             <001001bf47fe$cec9f330$63511c09@n54wntw.vancouver.can.ibm.com> 
Message-ID: <199912262103.OAA02824@localhost.localdomain>

[Cross-posted to xml-dev@ic.ac.uk and www-dom@w3.org]

> Leigh, I completely agree with you.  I have been in some discussions about
> this already so I'll try to relay what I've heard.
> 
> About 6 weeks ago, I asked Lauren Wood about DOM implementing XPath.  My
> version of her answer is "Nobody asked for it for Level 2 or Level 3, and
> Level 2 is too late now.  Nobody volunteered to write it for the DOM Spec".
> There's no technical reason why getElementByXPathExpr couldn't be added.
> 
> I asked some of the other IBM XML standards reps about this and my version
> of their answer is "XPath is a query language, and we've got a better query
> language coming.  Why support an inferior query language now when we'll have
> to support the better one soon.  Additionally, why should the DOM be the
> bucket for all API gorp?  Query should be built on top of the DOM so we can
> have layered parsers".
> 
> On one hand, I want an interoperable getElementByXPathExpr, but I understand
> the political and technical reasons why the DOM group isn't rushing to
> implement it.

Am I entirely missing something?  Why are there all these proposals for adding 
the trappings of separate recommendations into the core DOM?  Namespaces make 
sense because they are a fundamental part of XML, and will probably be 
incorporated into the XML 2.0 spec, but why should DOM define a 
getElementByXPathExpr or the like?  I would see it as the task of the 
XPath spec to define an interface for XPath processing given a DOM.

For instance, we just added an "Evaluate" function to 4XPath, which takes an 
XPath string and a DOM Node (for context) and returns a NodeList.  4DOM users 
don't have to worry about all the machinery of XPath, but 4XPath users, who 
already need 4DOM anyway, now have XPath query support in addition.  It would 
be nice to have a standard interface for this, but I don't think DOM level N 
is the place for it.
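
To make the layering concrete, here is a minimal sketch of an "evaluate" function that sits on top of a tree API rather than inside it. It is an assumption-laden stand-in, not 4XPath: it uses Python's xml.etree.ElementTree, which is not a W3C DOM and supports only a limited XPath subset, and the function name evaluate is chosen just for the example.

```python
import xml.etree.ElementTree as ET


def evaluate(expression, context):
    """Return the elements matching `expression`, relative to `context`.

    Illustrative only: delegates to ElementTree's limited XPath subset.
    """
    return context.findall(expression)


doc = ET.fromstring(
    "<catalog>"
    "<book id='1'><title>A</title></book>"
    "<book id='2'><title>B</title></book>"
    "</catalog>"
)

# The tree API knows nothing about queries; the query layer takes a
# context node and hands back a plain list of nodes.
titles = [t.text for t in evaluate("./book/title", doc)]
second = evaluate("./book[@id='2']", doc)
```

The tree object needs no query method of its own, which is the separation argued for above.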

And if it is, how do you choose which specs get grafted into the DOM?  Is it 
only W3C specs?  If not, what are the criteria?

IMHO, The DOM should be a simple, low-level interface for XML and no more.  
Other specs should refer to the DOM, but not vice-versa.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sun Dec 26 21:06:45 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: Xpath and DOM 
In-Reply-To: Your message of "Thu, 16 Dec 1999 15:58:44 EST."
             <00a601bf4808$58e4ef80$9caddccf@oemcomputer> 
Message-ID: <199912262106.OAA02849@localhost.localdomain>

> <david>On one hand, I want an interoperable getElementByXPathExpr, but I
> understand
> the political and technical reasons why the DOM group isn't rushing to
> implement it.</david>
> 
> It should be a simple enough matter to write such a function oneself and
> just call it!  Maybe a week's project for a student! (Hint!)

A week?!  Took me about 15 minutes, and I'd hardly expect it to take anyone 
else much longer given a good XPath implementation.

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From david-b at pacbell.net  Sun Dec 26 21:09:40 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <199912262053.NAA02785@localhost.localdomain>
Message-ID: <386683DB.9F8B58A7@pacbell.net>

uche.ogbuji@fourthought.com wrote:
> 
> Is it not sufficient for the namespace-oblivious world to simply use SAX 1.0?

Nope.  SAX 1.0 is missing more features than namespaces, and
one of the most bothersome notions in the latest round of SAX2
discussions is that it would give up SAX 1.0 compatibility.

- Dave



From uche.ogbuji at fourthought.com  Sun Dec 26 21:10:27 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: Xpath and DOM 
In-Reply-To: Your message of "Fri, 17 Dec 1999 09:50:27 GMT."
             <000901bf4874$257e3ea0$ab20268a@pc-lrd.bath.ac.uk> 
Message-ID: <199912262110.OAA02868@localhost.localdomain>

> Hmm, I'd suggest:
> 
> interface QueryFactory {
> 
>   DocumentQuery getDocumentQuery(String queryType);
> }
> 
> And have the query type, "xpath" or "xsql" passed in to the 
> factory method. Solves the plugging in of new query types, and 
> avoids having to prepend the query string with "xpath:" or whatever.

This is a _very_ astute idea, and a much more sensible way to partition the 
problem-spaces.
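
A minimal sketch of how such a factory might look, written in Python rather than as the Java interface quoted above. Everything here is hypothetical: the register method, the ElementPathQuery class and its execute method are illustration only, and the "xpath" type is backed by ElementTree's limited XPath subset rather than a full XPath engine.

```python
import xml.etree.ElementTree as ET


class ElementPathQuery:
    """Hypothetical DocumentQuery backed by ElementTree's XPath subset."""

    def execute(self, query, context):
        return context.findall(query)


class QueryFactory:
    """Maps a query type name ("xpath", "xsql", ...) to a query object."""

    def __init__(self):
        self._types = {}

    def register(self, query_type, query_class):
        # New query languages plug in here without touching the tree API.
        self._types[query_type.lower()] = query_class

    def get_document_query(self, query_type):
        try:
            return self._types[query_type.lower()]()
        except KeyError:
            raise ValueError(f"unsupported query type: {query_type!r}") from None


factory = QueryFactory()
factory.register("xpath", ElementPathQuery)

doc = ET.fromstring("<root><item/><item/></root>")
hits = factory.get_document_query("xpath").execute("./item", doc)
```

Because the query type is a separate argument, the query string itself stays free of any "xpath:" style marker.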

Mrs. Wood and co., are you listening?

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From uche.ogbuji at fourthought.com  Sun Dec 26 22:18:39 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2 Namespace Support 
In-Reply-To: Your message of "Mon, 20 Dec 1999 18:02:41 EST."
             <14430.46481.974707.922192@localhost.localdomain> 
Message-ID: <199912262218.PAA03078@localhost.localdomain>

> Unless someone shows a catastrophic problem with all this (not a
> purely aesthetic one), I plan to go ahead to other SAX2 problems now.

It's a slam-dunk from my POV.  Off to other problems.  And thank you, as ever.


-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From cowan at locke.ccil.org  Sun Dec 26 23:26:20 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:18:53 2004
Subject: Wish lists for the Holidays
In-Reply-To: <386602D4.73E1D651@redrice.com> from "Francis Norton" at Dec 26, 99 11:58:12 am
Message-ID: <199912262328.SAA21025@locke.ccil.org>

Francis Norton scripsit:

> [1] tools to bring real-life HTML into XML, so it can be manipulated via
> DOM and SAX.

See HTML Tidy at http://www.w3.org/People/Raggett/tidy
This is a program which cleans up crufty HTML, making it clean HTML.
The option "-asxml" will force output to be XML-compatible.

-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin



From clark.evans at manhattanproject.com  Mon Dec 27 00:08:58 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class 
In-Reply-To: <199912262051.NAA02771@localhost.localdomain>
Message-ID: <Pine.LNX.4.10.9912261900330.8065-100000@cauchy.clarkevans.com>

On Sun, 26 Dec 1999 uche.ogbuji@fourthought.com wrote:
> I think that though the prefix is maintained for convenience, it should _not_ 
> be considered part of the name in any comparison at the semantic level.
>
> So, in short, I don't see a problem with
> {"http://www.w3.org/1999/xhtml", "a", ""} ==
> {"http://www.w3.org/1999/xhtml", "a", "html"}.  It's a very
> well-documented consequence of XML Namespaces 1.0, and users 
> should be aware of it.

Just testing edge cases...

	    {"","a",""} != {"","a","html"}
	and {"","a",""} == {"","a",""}

Right?

Clark




From eisen at pobox.com  Mon Dec 27 09:13:53 1999
From: eisen at pobox.com (Jonathan Eisenzopf)
Date: Mon Jun  7 17:18:53 2004
Subject: ANNOUNCE: XML::RSS 0.8
Message-ID: <38673041.6B2B3774@pobox.com>

DLSI=adpO

This release fixes a bug that causes problems when
working with multiple instances of XML::RSS. Thanks
to Randal Schwartz for explaining the problem and
rjp for sending me a patch.

This is an alpha release because the API has not been
finalized. The module will be available at your local
CPAN archive. Alternatively, try this URL:

http://www.perlxml.com/modules/XML-RSS-0.8.tar.gz

This Perl module provides a basic framework for creating and
maintaining Rich Site Summary (RSS) files. RSS is mostly used
for distributing news headlines, commonly called channels,
and is found primarily on Netscape's Netcenter,
http://my.netscape.com, and Userland Software's
http://my.userland.com.

More information on RSS can be found at:
http://my.netscape.com/publish/help/mnn20/quickstart.html

Please send comments, flames, etc. to eisen@pobox.com.

-- 
Jonathan Eisenzopf    |  http://motherofperl.com    
eisen@pobox.com       |  http://perlxml.com
Perl Hacker           |  http://dc.pm.org



From James.Anderson at mecomnet.de  Mon Dec 27 11:09:47 1999
From: James.Anderson at mecomnet.de (james anderson)
Date: Mon Jun  7 17:18:53 2004
Subject: SAX2: Namespace Processing and NSUtils helper class
References: <Pine.LNX.4.10.9912261900330.8065-100000@cauchy.clarkevans.com>
Message-ID: <38674A1E.6B0885@mecomnet.de>

strictly speaking, not only, as

Clark C. Evans wrote:
> 
> 
> Just testing edge cases...
> 
>             {"","a",""} != {"","a","html"}
>         and {"","a",""} == {"","a",""}
> 
> Right?

but, in addition,

             {"","a",""} != {"","a",<any value>}

unless the specification "not in any namespace" is to be taken some other way?




From mparul at in.ibm.com  Mon Dec 27 11:20:47 1999
From: mparul at in.ibm.com (mparul@in.ibm.com)
Date: Mon Jun  7 17:18:53 2004
Subject: XML in databases
Message-ID: <CA256854.003E428A.00@d73mta05.au.ibm.com>

In my application there are some XML files that I want to store in a
database, say DB2. Is there a utility available to store an XML file as per
a DTD in a database and then retrieve it later?

Thanks
Parul




From Sajeev_M1 at verifone.com  Mon Dec 27 11:46:37 1999
From: Sajeev_M1 at verifone.com (Sajeev M.)
Date: Mon Jun  7 17:18:53 2004
Subject: XML in databases
Message-ID: <F9FBA0D1187BD11188B200A0C9979DF902DB5797@blrmail.india.hp.com>


Oracle has provisions for XML data handling, including indexing and
searching XML documents of up to 4 GB in size. They have several other
utilities as well.
> ----------
> From: 	mparul@in.ibm.com[SMTP:mparul@in.ibm.com]
> Reply To: 	mparul@in.ibm.com
> Sent: 	Monday, December 27, 1999 4:52 PM
> To: 	xml-dev@ic.ac.uk
> Subject: 	XML in databases
> 
> In my application there are some XML files that I want to store in a
> database say DB2. Is there a utility available to store an XML file as per
> a DTD in a database and then retrieve it later.
> 
> Thanks
> Parul
> 
> 
> 
> 



From mparul at in.ibm.com  Mon Dec 27 11:56:04 1999
From: mparul at in.ibm.com (mparul@in.ibm.com)
Date: Mon Jun  7 17:18:53 2004
Subject: XML in databases
Message-ID: <CA256854.00417A32.00@d73mta03.au.ibm.com>

Do you know if DB2 has something similar? Or if there are third-party APIs
available that can work with any ODBC-compliant database?

parul




From Sajeev_M1 at verifone.com  Mon Dec 27 12:23:12 1999
From: Sajeev_M1 at verifone.com (Sajeev M.)
Date: Mon Jun  7 17:18:53 2004
Subject: XML in databases
Message-ID: <F9FBA0D1187BD11188B200A0C9979DF902DB5799@blrmail.india.hp.com>

  
I don't know exactly, but I have read that DB2 version 6.1 does support XML.
It has a utility called XML Extender.


> ----------
> From: 	mparul@in.ibm.com[SMTP:mparul@in.ibm.com]
> Sent: 	Monday, December 27, 1999 5:27 PM
> To: 	Sajeev M.
> Cc: 	xml-dev@ic.ac.uk
> Subject: 	RE: XML in databases
> 
> Do you know if DB2 has something similar? Or if there are third party apis
> available that can work with any ODBC compliant database?
> 
> parul
> 
> 



From pgrosso at arbortext.com  Mon Dec 27 16:49:38 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:18:53 2004
Subject: java pull parser / fragment processing
Message-ID: <3.0.32.19991227104907.010510dc@pophost.arbortext.com>

At 04:00 1999 12 25 +0100, Stefan Haustein wrote:
>Pull model based parsers offer several advantages 
>when processing document fragments. ...
>
>I have implemented a java xml parser following 
>the pull model on top of a normal (push) SAX parser. 
>If you are interested, please take a look at it.
>It is available at http://www.trantor.de/saxpp
>I am very interested if you think the interface is 
>OK or if you have suggestions for improvements.

I couldn't seem to find the relevant info at that URL.

I hear people using the term "fragment" on xml-dev,
and I wonder how this relates to the term as defined
by the W3C XML Fragment Interchange spec [1].  As editor
of this spec and chair of the XML Core WG that is now in
charge of this spec, I would like to hear about any implementations
of the W3C XML Fragment Interchange spec.  Does your 
implementation conform to it?  If there are any others out
there with implementations of this spec, please send info
to the public XML Fragment comment list [2]. 

thanks,

paul

[1] http://www.w3.org/TR/WD-xml-fragment
[2] mailto:www-xml-fragment-comments@w3.org



From stefan.haustein at trantor.de  Mon Dec 27 17:45:08 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:53 2004
Subject: java pull parser / fragment processing
References: <3.0.32.19991227104907.010510dc@pophost.arbortext.com>
Message-ID: <3867A5C4.9FB67BE6@trantor.de>

> >I have implemented a java xml parser following
> >the pull model on top of a normal (push) SAX parser.
> >If you are interested, please take a look at it.
> >It is available at http://www.trantor.de/saxpp
> >I am very interested if you think the interface is
> >OK or if you have suggestions for improvements.

I am sorry, I moved the page to http://www.trantor.de/xml 
because saxpp sounds more like sax++ than "sax pull parser". 
I forgot to exchange one of the addresses in the forwarder.

> I hear people using the term "fragment" on xml-dev,
> and I wonder how this relates to the term as defined
> by the W3C XML Fragment Interchange spec [1].  As editor
> of this spec and chair of the XML Core WG that is now in
> charge of this spec, I would like to hear about any implementations
> of the W3C XML Fragment Interchange spec.  Does your
> implementation conform to it? 

No, it has nothing to do with that. It's just about handing
the parse stream from one fragment processor to another and back while 
parsing, which seems difficult with SAX.  

Best regards

Stefan



From pgrosso at arbortext.com  Mon Dec 27 19:27:09 1999
From: pgrosso at arbortext.com (Paul Grosso)
Date: Mon Jun  7 17:18:53 2004
Subject: java pull parser / fragment processing
Message-ID: <3.0.32.19991227131438.010d2464@pophost.arbortext.com>

At 18:45 1999 12 27 +0100, Stefan Haustein wrote:
>Paul Grosso had written:
>> I hear people using the term "fragment" on xml-dev,
>> and I wonder how this relates to the term as defined
>> by the W3C XML Fragment Interchange spec [1].  As editor
>> of this spec and chair of the XML Core WG that is now in
>> charge of this spec, I would like to hear about any implementations
>> of the W3C XML Fragment Interchange spec.  Does your
>> implementation conform to it? 
>
>No, it has nothing to do with that. It's just about handing
>the parse stream from one fragment processor to another and back while 
>parsing, which seems difficult with SAX.  

Perhaps, then, you could define what you mean by "fragment"
if it doesn't match the definition in the W3C spec.  I don't
find it a word with a necessarily obvious precise definition
when used in relation to XML.



From KenNorth at email.msn.com  Mon Dec 27 20:01:32 1999
From: KenNorth at email.msn.com (KenNorth)
Date: Mon Jun  7 17:18:53 2004
Subject: XML in databases
References: <CA256854.003E428A.00@d73mta05.au.ibm.com>
Message-ID: <004101bf50a5$1d11e080$0b00a8c0@grissom>

There is an XML Extender for IBM DB2 Universal Database. It enables you to
store and retrieve an XML document as a single column, or multiple columns:

http://www-4.ibm.com/software/data/db2/extenders/xmlext/index.html


================== Ken North =============================
http://ourworld.compuserve.com/homepages/Ken_North
ken_north@compuserve.com  71301.1306@compuserve.com  KenNorth@msn.com
===========================================================

----- Original Message -----
From: <mparul@in.ibm.com>
To: <xml-dev@ic.ac.uk>
Sent: Monday, December 27, 1999 3:22 AM
Subject: XML in databases


> In my application there are some XML files that I want to store in a
> database say DB2. Is there a utility available to store an XML file as per
> a DTD in a database and then retrieve it later.
>
> Thanks
> Parul
>
>
>
>




From stefan.haustein at trantor.de  Mon Dec 27 20:56:21 1999
From: stefan.haustein at trantor.de (Stefan Haustein)
Date: Mon Jun  7 17:18:54 2004
Subject: java pull parser / fragment processing
References: <3.0.32.19991227131438.010d2464@pophost.arbortext.com>
Message-ID: <3867D29D.1B77C01D@trantor.de>

> Perhaps, then, you could define what you mean by "fragment"
> if it doesn't match the definition in the W3C spec.  I don't
> find it a word with a necessarily obvious precise definition
> when used in relation to XML.

What I mean by "fragment" is exactly what is defined in
http://www.w3.org/TR/WD-xml-fragment. But in contrast to the spec I am
not addressing general fragment interchange problems, but how to hand
over a parser between different fragment processors. For example, if I
nest xhtml into an xml message envelope:

<message>
  <sender>agent1</sender> 
  <receiver>agent2</receiver>
  <content type="xhtml">
   <xhtml>
    ....
   </xhtml>
  </content>
</message>

In the SAX push world, I cannot simply switch the DocumentHandler to my
xhtml processor when I reach the content start tag (how do I switch
back? how do I get the result?). With a pull model, I can hand over the
parser to the xhtml processor. The xhtml processor "pulls" out its
events and returns. Then I can continue parsing the envelope.
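
The hand-over can be sketched roughly as follows. This is not the Java parser announced in this thread; it is a minimal illustration using Python's standard xml.dom.pulldom, and the function names process_xhtml and process_envelope, as well as the <p> content inside the xhtml element, are made up for the example.

```python
from xml.dom import pulldom

MESSAGE = """<message>
  <sender>agent1</sender>
  <receiver>agent2</receiver>
  <content type="xhtml">
   <xhtml><p>hello</p><p>world</p></xhtml>
  </content>
</message>"""


def process_xhtml(events):
    """Hypothetical sub-processor: pulls events until </content> is seen
    and returns the text of each <p> it encountered."""
    paragraphs = []
    current = None
    for event, node in events:
        if event == pulldom.END_ELEMENT and node.tagName == "content":
            return paragraphs          # hand the stream back to the caller
        if event == pulldom.START_ELEMENT and node.tagName == "p":
            current = []
        elif event == pulldom.CHARACTERS and current is not None:
            current.append(node.data)
        elif event == pulldom.END_ELEMENT and node.tagName == "p":
            paragraphs.append("".join(current))
            current = None
    return paragraphs


def process_envelope(xml_text):
    """Envelope processor: reads sender/receiver itself and delegates
    everything inside <content> to process_xhtml."""
    events = pulldom.parseString(xml_text)
    result = {}
    field = None
    for event, node in events:
        if event == pulldom.START_ELEMENT:
            if node.tagName == "content":
                # Same event stream, different consumer.
                result["content"] = process_xhtml(events)
            elif node.tagName in ("sender", "receiver"):
                field = node.tagName
                result[field] = ""
        elif event == pulldom.CHARACTERS and field is not None:
            result[field] += node.data
        elif event == pulldom.END_ELEMENT and node.tagName == field:
            field = None
    return result


envelope = process_envelope(MESSAGE)
```

The sub-processor returns its result as an ordinary return value, and the envelope loop simply resumes with the next event, which is the part that is awkward to express by swapping handlers in a push parser.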

Best regards

Stefan 


-- 
KJAVA AWT project: www.trantor.de/kawt
SAX-based access to WBXML and WML: www.trantor.de/wbxml



From costello at mitre.org  Mon Dec 27 22:13:03 1999
From: costello at mitre.org (Roger Costello)
Date: Mon Jun  7 17:18:54 2004
Subject: (Many) XML Schema Questions
Message-ID: <3867E4EC.5534A5C@mitre.org>

Hi Folks,

I have been making my way through the new XML Schema spec
and have numerous questions.

1. My first question is on how XML Schemas are to be 
referenced by XML instance documents.

It is my understanding that in an XML Schema you 
use the targetNamespace attribute to indicate
the namespace of the XML Schema document, and 
then in an XML instance document you reference
the XML Schema using this namespace.

For example, suppose that I create an XML Schema
for BookCatalogues:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue">
     ...
</schema>

Now, in my XML instance document I refer to it using
a namespace declaration:

<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
    <Book>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
    </Book>
        ...
</BookCatalogue>

The namespace declaration in this XML instance document
is to be interpreted as: "All the stuff between <BookCatalogue> 
and </BookCatalogue> conforms to the schema defined at 
this namespace."

Thus, my first question is: do I have a correct understanding
of the purpose of targetNamespace, and of how an XML instance
document is to indicate that it conforms to an XML Schema?

2. The namespace spec clearly indicates that there is
no guarantee that there is anything at the URL referenced
by a namespace.  However, with XML Schemas, it seems
that the namespace referenced in an XML instance document
must necessarily reference an XML Schema.  Is this a violation
of the namespace spec, or is it an application-specific 
usage of namespaces?

3. In question #1 I showed how an XML instance document
may reference the XML Schema that it conforms to:

<?xml version="1.0"?>
<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
   ...
</BookCatalogue>

How does an XML Parser know that it is to go to
BookCatalogue.xsd at this URL?  Should, instead, the
namespace declaration be:

<BookCatalogue
xmlns="http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
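
One way to picture questions 2 and 3: if the namespace name is only an identifier, a processor needs some separate way to get from that name to a schema document. The sketch below assumes a hypothetical application-maintained lookup table; the table, the .xsd location in it, and the function names are illustrative only and are not taken from the XML Schema draft.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: namespace name -> where this application keeps
# the corresponding schema. Nothing has to exist at the namespace URL.
SCHEMA_LOCATIONS = {
    "http://www.somewhere.org/BookCatalogue":
        "http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd",
}


def root_namespace(xml_text):
    """Return the namespace name of the root element ('' if none)."""
    tag = ET.fromstring(xml_text).tag      # "{uri}local" or "local"
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else ""


def locate_schema(namespace_name, catalog=SCHEMA_LOCATIONS):
    """Look the namespace name up in the catalog; None if unknown."""
    return catalog.get(namespace_name)


instance = '<BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue"/>'
namespace = root_namespace(instance)
schema_url = locate_schema(namespace)
```

Under this reading the instance document keeps the plain namespace name, and the choice of schema file stays outside the namespace declaration.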

4. With XML Schemas is it possible for an XML instance document to
be composed of fragments, each conforming to a different
XML Schema?  For example:

<?xml version="1.0"?>
<Library>
    <BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
        <Book>
            <Title>Illusions The Adventures of a Reluctant Messiah</Title>
            <Author>Richard Bach</Author>
            <Date>1977</Date>
            <ISBN>0-440-34319-4</ISBN>
            <Publisher>Dell Publishing Co.</Publisher>
        </Book>
            ...
    </BookCatalogue>
    <Employees xmlns="http://www.somewhere-else.org/Personnel">
        <Name>John Doe</Name>
        <Position>Reference Manager</Position>
    </Employees>
</Library>

Here we see the BookCatalogue fragment conforming to one XML Schema
and the Employees fragment conforming to another XML Schema.  Is this
capability the intent of the XML Schema spec?
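
For what it is worth, a namespace-aware parser already keeps the two fragments apart, whatever the schema processor then does with them. A minimal sketch with Python's xml.etree.ElementTree follows; it performs no schema validation, the instance is abbreviated, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

LIBRARY = """<Library>
  <BookCatalogue xmlns="http://www.somewhere.org/BookCatalogue">
    <Book><Title>Illusions</Title></Book>
  </BookCatalogue>
  <Employees xmlns="http://www.somewhere-else.org/Personnel">
    <Name>John Doe</Name>
  </Employees>
</Library>"""


def split_name(tag):
    """Split ElementTree's "{uri}local" form into (uri, local)."""
    if tag.startswith("{"):
        uri, local = tag[1:].split("}", 1)
        return uri, local
    return "", tag


def fragments_by_namespace(root):
    """Group the root's child elements by their namespace name."""
    groups = {}
    for child in root:
        uri, local = split_name(child.tag)
        groups.setdefault(uri, []).append(local)
    return groups


groups = fragments_by_namespace(ET.fromstring(LIBRARY))
```

Each group could then be handed to whichever schema the application associates with that namespace name.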


5. Section 3.6 of the XML Schema spec deals with derived types.
Below is an example of a Book type being derived (by extension)
from a Publication type:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
               targetNamespace="http://www.xfront.org/BookCatalogue"
               xmlns:cat="http://www.xfront.org/BookCatalogue">
    <type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
    </type>
    <type name="Book" source="cat:Publication" derivedBy="extension">
        <element name="ISBN" type="string"/>
        <element name="Publisher" type="string"/>
    </type>
    <element name="Catalogue">
        <type>
            <element name="CatalogueEntry" minOccurs="0" maxOccurs="*"
                     type="cat:Publication"/>
        </type>
    </element>
</schema>

The CatalogueEntry element is of type Publication.  Thus, in an
XML instance document a CatalogueEntry element can contain
either a Publication or a Book (since a Book is a Publication).
Here's an example XML instance document:

<?xml version="1.0"?>
<Catalogue xmlns="http://www.xfront.org/BookCatalogue"
           xmlns:xsi="http://www.w3.org/1999/XMLSchema">
        <CatalogueEntry>
                <Title>Staying Young Forever</Title>
                <Author>Karin Granstrom Jordan, M.D.</Author>
                <Date>December, 1999</Date>
        </CatalogueEntry>
        <CatalogueEntry xsi:type="Book">
                <Title>Illusions The Adventures of a Reluctant Messiah</Title>
                <Author>Richard Bach</Author>
                <Date>1977</Date>
                <ISBN>0-440-34319-4</ISBN>
                <Publisher>Dell Publishing Co.</Publisher>
        </CatalogueEntry>
        <CatalogueEntry xsi:type="Book">
                <Title>The First and Last Freedom</Title>
                <Author>J. Krishnamurti</Author>
                <Date>1954</Date>
                <ISBN>0-06-064831-7</ISBN>
                <Publisher>Harper &amp; Row</Publisher>
        </CatalogueEntry>
</Catalogue>

Here we see that the first CatalogueEntry is a Publication.
The next two CatalogueEntry elements are Books.  Note the
use of an attribute (xsi:type) to indicate that an entry is a Book.
The XML Schema spec says that when an element in an XML instance
document uses a type other than the source type (Publication),
we must indicate which derived type is being used (Book).  Why?
Surely an XML parser would be able to figure out the type, just
as compilers are able to do.
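
To make the role of xsi:type concrete, here is a minimal Python sketch of
what a processor can read straight off the instance.  It assumes the two
namespace URIs used in the instance document above and a cut-down copy of
that document; the function name entry_types is invented for illustration,
and nothing here is schema validation.

```python
import xml.etree.ElementTree as ET

# Namespace URIs as they appear in the instance document above.
XSI = "http://www.w3.org/1999/XMLSchema"
CAT = "http://www.somewhere.org/BookCatalogue"

# Cut-down copy of the instance: one plain entry, one marked as a Book.
INSTANCE = f"""<Catalogue xmlns="{CAT}" xmlns:xsi="{XSI}">
  <CatalogueEntry>
    <Title>Staying Young Forever</Title>
  </CatalogueEntry>
  <CatalogueEntry xsi:type="Book">
    <Title>The First and Last Freedom</Title>
    <ISBN>0-06-064831-7</ISBN>
  </CatalogueEntry>
</Catalogue>"""


def entry_types(xml_text, declared_type="Publication"):
    """Return the type name claimed by each CatalogueEntry.

    An entry without xsi:type is taken to have the declared (source) type.
    """
    root = ET.fromstring(xml_text)
    return [entry.get(f"{{{XSI}}}type", declared_type)
            for entry in root.findall(f"{{{CAT}}}CatalogueEntry")]


print(entry_types(INSTANCE))
```

With the attribute present the processor knows which content model to
apply before it reads any child element; without it, it would have to try
each derived type in turn.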

6. What is the default value for the content attribute of
the type element?  Is it elementOnly?

7. Consider this example of type deriving (by restriction):

<type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
</type>
<type name="SingleAuthorPublication" source="cat:Publication"
      derivedBy="restriction">
        <element name="Author" type="string" maxOccurs="1"/>
</type>

Here we see the type SingleAuthorPublication is a Publication
which is restricted to a single author.

My question is, do you need a namespace qualifier
for the element that you are restricting?  e.g.,

     <element name="cat:Author" type="string" maxOccurs="1"/>

8. Type deriving allows us to create a new type that is an
extension of another type.  It also allows us to create a 
new type that is a restriction of another type.  What if you 
want to create a new type that is both an extension and a 
restriction?  How do you do that?  For example, suppose that 
I would like to create a Magazine type from the Publication type
where the Author is restricted to zero occurrences, and we 
extend by adding an editor.  How do I do that?  Create an intermediate
type that is a restriction and then extend that?  Not very 
elegant, I would say.

9.  I guess that I don't understand what equivClass is all about.
How does it differ from extension and restriction types?

10. In section 3.7 of the XML Schema spec it talks about creating
unique identifiers using keys.  Here's an example of using
key and keyref:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:lib CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.xfront.org/Library"
        xmlns:lib="http://www.xfront.org/Library">
    <element name="Library">
        <type>
            <element name="BookCatalogue">
                <type>
                     <element name="Book" minOccurs="0" maxOccurs="*">
                          <type>
                              <element name="Title" type="string"/>
                              <element name="Author" type="string"/>
                              <element name="Date" type="string"/>
                              <element name="ISBN" type="string"/>
                              <element name="Publisher" type="string"/>
                              <attribute name="Category" minOccurs="1">
                                  <datatype source="string">
                                      <enumeration value="autobiography"/>
                                      <enumeration value="non-fiction"/>
                                      <enumeration value="fiction"/>
                                  </datatype>
                              </attribute>
                              <attribute name="InStock" type="boolean"
                                         default="false"/>
                              <attribute name="Reviewer" type="string"
                                         default=""/>
                          </type>
                     </element>
                </type>
            </element>
            <element name="CheckoutRegister">
                <type>
                    <element name="Person" type="string"/>
                    <element name="Book">
                        <type content="empty">
                            <attribute name="titleRef" type="string"/>
                            <attribute name="categoryRef" type="string"/>
                        </type>
                    </element>
                </type>
            </element>
        </type>
    </element>
    <key name="bookKey">
        <selector>Library/BookCatalogue/Book</selector>
        <field>Title</field>
        <field>@Category</field>
    </key>
    <keyref name="bookRef" refer="bookKey">
        <selector>Library/CheckoutRegister/Book</selector>
        <field>@titleRef</field>
        <field>@categoryRef</field>
    </keyref>
</schema>

Note how the key element is declaring that the combination 
of the contents of Title plus the value of the attribute Category 
is unique and represents the key.  Note the XPath expression
which locates the relevant nodes.  My question is this: where
is the XPath expression relative to?  What's the current node? 
Is it the root of the document?  In the XML Schema spec 
their examples used an XPath expression that begins with .//  
What's the current node being indicated by the dot?
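
Why the context node matters can be shown with a small Python sketch.
It evaluates the same selector path from two candidate context nodes and
then checks the (Title, @Category) key for uniqueness.  The sample data,
the wrapper element standing in for the document node, and the function
name key_values are assumptions made for illustration; namespaces are left
out to keep the paths short, and this is not how a schema processor is
specified to work.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: same title twice, different categories.
LIBRARY = """<Library>
  <BookCatalogue>
    <Book Category="fiction"><Title>Illusions</Title></Book>
    <Book Category="non-fiction"><Title>Illusions</Title></Book>
  </BookCatalogue>
</Library>"""

SELECTOR = "Library/BookCatalogue/Book"

root = ET.fromstring(LIBRARY)        # the <Library> element
document = ET.Element("document")    # stand-in for the document node
document.append(root)


def key_values(context, selector=SELECTOR):
    """Collect (Title, @Category) for every node the selector reaches."""
    return [(book.findtext("Title"), book.get("Category"))
            for book in context.findall(selector)]


from_document = key_values(document)  # path starts above <Library>
from_library = key_values(root)       # path starts at <Library> itself

print(from_document)
print(from_library)
# The key holds when no (Title, @Category) pair repeats.
print(len(set(from_document)) == len(from_document))
```

Starting from the stand-in document node the selector reaches both Book
elements; starting from Library it reaches nothing, because Library has
no Library child.  The answer to "relative to what?" therefore decides
whether the key constrains anything at all.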

11. What does maxOccurs="*" mean with respect to the <any/> element? 
e.g., what's the difference between:

   <element name="free-form">
      <type>
            <any/>
      </type>
   </element>

and

   <element name="free-form">
      <type>
            <any maxOccurs="*"/>
      </type>
   </element>

I ask this because the any element can contain
any well-formed XML document, so maxOccurs seems
to have no meaning.

Thanks for any answers you can provide.  /Roger


xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From twleung at sauria.com  Mon Dec 27 22:59:23 1999
From: twleung at sauria.com (twleung@sauria.com)
Date: Mon Jun  7 17:18:54 2004
Subject: SAX2: Should SAXException extend IOException?
References: <14432.57765.516268.430263@localhost.localdomain> <Pine.SOL.3.96.991222105955.20108C-100000@nine> <14433.8570.811500.337122@localhost.localdomain> <386202AB.A458BD6C@pacbell.net> <m3aen1n9rg.fsf@localhost.localdomain>
Message-ID: <03fd01bf50be$5bc48f80$0a00a8c0@orconet.com>

What about a SAX driver that spits out events by walking a DOM?  
If the DOM was created programmatically, it doesn't really seem like an
IOException is the right thing to throw.
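
For illustration, such a driver might look like the Python sketch below,
which uses the standard xml.dom.minidom and xml.sax modules rather than
the Java SAX interfaces under discussion.  The names dom_to_sax and
Recorder are invented here.  The point is only that every event comes
from an in-memory tree, so no I/O takes place at any step.

```python
from xml.dom import Node, minidom
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


def dom_to_sax(node, handler):
    """Fire SAX events for a DOM subtree; nothing is read from a stream."""
    if node.nodeType == Node.DOCUMENT_NODE:
        handler.startDocument()
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endDocument()
    elif node.nodeType == Node.ELEMENT_NODE:
        attrs = dict(node.attributes.items())
        handler.startElement(node.tagName, AttributesImpl(attrs))
        for child in node.childNodes:
            dom_to_sax(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType == Node.TEXT_NODE:
        handler.characters(node.data)


class Recorder(ContentHandler):
    """Hypothetical handler that just remembers the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("text", content))


# Build the DOM programmatically; no file or socket is involved.
doc = minidom.Document()
book = doc.createElement("Book")
book.setAttribute("Category", "fiction")
book.appendChild(doc.createTextNode("Illusions"))
doc.appendChild(book)

recorder = Recorder()
dom_to_sax(doc, recorder)
print(recorder.events)
```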

Ted
----- Original Message ----- 
From: David Megginson <david@megginson.com>
To: <xml-dev@ic.ac.uk>
Sent: Thursday, December 23, 1999 4:04 AM
Subject: Re: SAX2: Should SAXException extend IOException?


> Ray Waldin <rwaldin@pacbell.net> writes:
> 
> > from David's SAX2 Exceptions proposal:
> > > 5. Have all callbacks that formerly threw SAXException throw
> > >    IOException instead.  This should help to avoid a lot of exception
> > >    tunneling.
> > 
> > I don't see the point in a Handler throwing an IOException to a
> > Parser in most cases.  That is what this item implies, right?  What
> > could a DocumentHandler mean by throwing an IOException during a
> > call to startElement?  The I/O has already occurred before the
> > handler gets involved.
> 
> Well, that's really domain-specific.  It could be that a handler in an
> XML I/O library does additional I/O (such as retrieving an external
> bitmap) which, from the top-level application's point of view, is
> still part of the same I/O process.
> 
> In the end, though, this is a relatively minor point.  The important
> point, for me, is that SAXException extend IOException -- I think that 
> it would be convenient to have the callbacks throw IOException rather
> than SAXException (otherwise, other IOExceptions will have to tunnel), 
> but it's not a show-stopper if everyone else thinks it's a bad idea.
> 
> 
> All the best,
> 
> 
> David
> 
> -- 
> David Megginson                 david@megginson.com
>            http://www.megginson.com/
> 




From simonstl at simonstl.com  Tue Dec 28 02:46:32 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
In-Reply-To: <000c01bf4e6f$6aab2280$d1940e18@c1033339-a.smateo1.sfba.hom
 e.com>
References: <199912242141.QAA20410@hesketh.net>
Message-ID: <199912280246.VAA24033@hesketh.net>

At 04:31 PM 12/24/99 -0800, Don Park wrote:
>Atomic XML standards are one page specs each of which defines
>a single 'power word', a tag name or an attribute name.  An
>example is 'xmlns' or 'table'.
>
>Molecular XML standards are small specs each of which defines
>a single 'power phrase', a micro-schema involving just a few
>elements.  An example is 'address' molecule that consists of
>small number of elements that make up an address.
>
>These 'micro-standards' will allow us to create a more coherent
>XML document standards as well as XML software that can 'learn'
>to handle new standards by plugging in new power words or phrases.

This is beautiful!  I'd love to see more projects that assemble smaller
pieces, rather than trying to create anew within gigantic frameworks.

This is the kind of approach that I think gives namespaces a good reason
for being, helping developers cope with lots of little fragments rather
than assuming that only their own vocabularies are worth using.  I'd love
to see these small parts standardized, and then reused as appropriate.  I
think it might make development of both XML document structures and XML
processing software a lot easier.

Based on these wish lists, it sounds like 2000 is pretty promising.  Are
there more folks with dreams for the next year?  XML-dev is a great place
to find people with similar needs and interests.
Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From simonstl at simonstl.com  Tue Dec 28 02:42:48 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
In-Reply-To: <199912262328.SAA21025@locke.ccil.org>
References: <386602D4.73E1D651@redrice.com>
Message-ID: <199912280242.VAA23956@hesketh.net>

At 06:28 PM 12/26/99 -0500, John Cowan wrote:
>Francis Norton scripsit:
>
>> [1] tools to bring real-life HTML into XML, so it can be manipulated via
>> DOM and SAX.
>
>See HTML Tidy at http://www.w3.org/People/Raggett/tidy
>This is a program which tidies up crufty HTML, making it clean HTML.
>The option "-asxml" will force output to be XML-compatible.

While I like Tidy a lot, I'd love to have a parser that tidies up the HTML
structure and then spits it out as SAX events or a DOM tree, rather than
the kind of document-to-document work that Tidy does.  Seems like that
shouldn't be much more difficult than the work Tidy does.

I'd like to add to my wish list: more development tools that recognize the
power of chaining together multiple processors.  SAX filters are already
there (and MDSAX takes that to the limit), and the same is possible with
DOM trees, but I'd love to see chaining made into a basic paradigm of XML
processing.

It's not something anyone can mandate, though.
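
As a rough picture of the chaining idea, here is a sketch using Python's
standard xml.sax.saxutils.XMLFilterBase, which plays a role similar to a
SAX filter.  The two filters, the Recorder handler and run_chain are
invented for this example, and the input is a made-up fragment that is
already well-formed.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler
from xml.sax.saxutils import XMLFilterBase


class LowerCaseNames(XMLFilterBase):
    """Filter 1: fold element names to lower case."""

    def startElement(self, name, attrs):
        super().startElement(name.lower(), attrs)

    def endElement(self, name):
        super().endElement(name.lower())


class DropElement(XMLFilterBase):
    """Filter 2: strip one element's tags but keep its content."""

    def __init__(self, parent, unwanted):
        super().__init__(parent)
        self.unwanted = unwanted

    def startElement(self, name, attrs):
        if name != self.unwanted:
            super().startElement(name, attrs)

    def endElement(self, name):
        if name != self.unwanted:
            super().endElement(name)


class Recorder(ContentHandler):
    """End of the chain: rebuild a string from the events received."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append("<" + name + ">")

    def endElement(self, name):
        self.events.append("</" + name + ">")

    def characters(self, content):
        self.events.append(content)


def run_chain(data):
    """parser -> LowerCaseNames -> DropElement -> Recorder"""
    recorder = Recorder()
    chain = DropElement(LowerCaseNames(xml.sax.make_parser()), "font")
    chain.setContentHandler(recorder)
    chain.parse(io.BytesIO(data))
    return "".join(recorder.events)


print(run_chain(b"<P><FONT>Hello</FONT> <B>world</B></P>"))
# prints: <p>Hello <b>world</b></p>
```

Each stage knows only its parent and its handler, so stages can be
reordered or reused without touching one another.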

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From cowan at locke.ccil.org  Tue Dec 28 03:14:15 1999
From: cowan at locke.ccil.org (John Cowan)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
In-Reply-To: <199912280242.VAA23956@hesketh.net> from "Simon St.Laurent" at Dec 27, 99 09:42:49 pm
Message-ID: <199912280317.WAA16531@locke.ccil.org>

Simon St.Laurent scripsit:

> While I like Tidy a lot, I'd love to have a parser that tidies up the HTML
> structure and then spits it out as SAX events or a DOM tree, rather than
> the kind of document-to-document work that Tidy does.  Seems like that
> shouldn't be much more difficult than the work Tidy does.

It isn't, and in fact the Java version of Tidy (linked from Dave Raggett's
page) provides a mini-DOM.  With my DOMParser, you can generate SAX events
from the mini-DOM as well.

-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin



From simonstl at simonstl.com  Tue Dec 28 03:39:25 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
In-Reply-To: <199912280317.WAA16531@locke.ccil.org>
References: <199912280242.VAA23956@hesketh.net>
Message-ID: <199912280339.WAA25487@hesketh.net>

At 10:17 PM 12/27/99 -0500, John Cowan wrote:
>Simon St.Laurent scripsit:
>
>> While I like Tidy a lot, I'd love to have a parser that tidies up the HTML
>> structure and then spits it out as SAX events or a DOM tree, rather than
>> the kind of document-to-document work that Tidy does.  Seems like that
>> shouldn't be much more difficult than the work Tidy does.
>
>It isn't, and in fact the Java version of Tidy (linked from Dave Raggett's
>page) provides a mini-DOM.  With my DOMParser, you can generate SAX events
>from the mini-DOM as well.

Looks promising!  For those who want to enjoy it, see:

http://www3.sympatico.ca/ac.quick/jtidy.html

Simon St.Laurent
XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Sharing Bandwidth / Cookies
http://www.simonstl.com



From tbray at textuality.com  Tue Dec 28 04:00:04 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:54 2004
Subject: (Many) XML Schema Questions
Message-ID: <3.0.32.19991227200049.00f4da78@192.168.0.1>

At 05:15 PM 12/27/99 -0500, Roger Costello wrote:
>2. The namespace spec clearly indicates that there is
>no guarantee that there is anything at the URL referenced
>by a namespace.  However, with XML Schemas, it seems
>that the namespace referenced in an XML instance document
>must necessarily reference an XML Schema.  Is this a violation
>of the namespace spec

I haven't read a recent schema draft, but if it says what you say it says,
then in my opinion that would be an egregious design error.  It is of course 
not a violation of the namespace spec, but I'm beginning to think that we 
should have written into the spec an express prohibition against land-grab 
attempts on the address function of the namespace name.  The notion that a
single URL can address the One True Schema Which Will Meet All Needs is 
demonstrably, empirically, absolutely wrong.  -T.



From dbox at develop.com  Tue Dec 28 04:47:50 1999
From: dbox at develop.com (Box, Don)
Date: Mon Jun  7 17:18:54 2004
Subject: (Many) XML Schema Questions
Message-ID: <824EAE80328AD311B2590090276267820AE03C@mail.develop.com>

> -----Original Message-----
> From: Tim Bray [mailto:tbray@textuality.com]
> Sent: Monday, December 27, 1999 8:01 PM
> To: xml-dev@ic.ac.uk
> Subject: Re: (Many) XML Schema Questions
> 
> 
> At 05:15 PM 12/27/99 -0500, Roger Costello wrote:
> >2. The namespace spec clearly indicates that there is
> >no guarantee that there is anything at the URL referenced
> >by a namespace.  However, with XML Schemas, it seems
> >that the namespace referenced in an XML instance document
> >must necessarily reference an XML Schema.  Is this a violation
> >of the namespace spec
> 
> I haven't read a recent schema draft, but if it says what you 
> say it says,
> then in my opinion that would be an egregious design error.  

Agreed, since I have been using URNs almost exclusively as namespace URIs.
Fortunately, that's not what the schema spec states. Check out section 4.3.2
of the Dec. 17 draft (http://www.w3.org/TR/xmlschema-1/#schema-loc), which
describes the 'xsi:schemaLocation' attribute. Also, see
http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0054.html
for James Clark's comments on this attribute.


> It is of course 
> not a violation of the namespace spec, 

Yes, but if schemas required xmlns to be a URL that refers to the schema
definition, the utility of XML schemas would be greatly diminished given the
current XMLNS recommendation.
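
A small Python sketch of the distinction: the namespace name is treated
as an opaque identifier, and the schema is located through a separate
channel.  The URN, the lookup table SCHEMA_LOCATIONS and the file path
are all invented for illustration; the table merely stands in for
whatever mechanism an application uses and is not the draft's
xsi:schemaLocation syntax.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace name and schema location, invented for this sketch.
ORDERS_NS = "urn:example:orders"
SCHEMA_LOCATIONS = {ORDERS_NS: "schemas/orders.xsd"}

DOC = f'<order xmlns="{ORDERS_NS}"><item>widget</item></order>'


def namespace_of(element):
    """Return the namespace name of an ElementTree element ('' if none)."""
    tag = element.tag
    return tag[1:tag.index("}")] if tag.startswith("{") else ""


root = ET.fromstring(DOC)         # parsing never dereferences the name
ns = namespace_of(root)
print(ns)                         # prints: urn:example:orders
print(SCHEMA_LOCATIONS.get(ns))   # prints: schemas/orders.xsd
```

The parser compares the namespace name as a string and nothing more, so
a URN that cannot be retrieved works as well as a URL.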

DB
http://www.develop.com/dbox



From ricko at allette.com.au  Tue Dec 28 06:21:28 1999
From: ricko at allette.com.au (Rick Jelliffe)
Date: Mon Jun  7 17:18:54 2004
Subject: (Many) XML Schema Questions
Message-ID: <001201bf50ff$602a9830$4ff96d8c@NT.JELLIFFE.COM.AU>

Please cross-post questions to www-xml-schema-comments@w3c.org  !

From: Tim Bray <tbray@textuality.com>

>At 05:15 PM 12/27/99 -0500, Roger Costello wrote:
>>2. The namespace spec clearly indicates that there is
>>no guarantee that there is anything at the URL referenced
>>by a namespace.  However, with XML Schemas, it seems
>>that the namespace referenced in an XML instance document
>>must necessarily reference an XML Schema.  Is this a violation
>>of the namespace spec
>
>I haven't read a recent schema draft, but if it says what you say it says,
>then in my opinion that would be an egregious design error.

Good news for the people concerned about this issue.

The new (Dec.) XML Schema WD does not assume that the NS directly
references an XML Schema. Rather, it says the schema "is about" a
namespace (3.1): it does not say namespace is a schema or a schema is
a namespace.  (At one stage there was wording that a namespace
"guides the interpretation of an element", which I thought was a nice
way to put it.)

Instead, the WD allows an attribute, currently called targetNamespace,
on the schema element, which "associates" the definitions and
declarations in the schema with a single namespace URI.

Then (s4.3.2) there is an attribute xsi:schemaLocation that can be
put on any instance element. It allows the location of the schema to be
declared.  Actually, it allows several URLs to be declared, so that
there are fallbacks.  Thus there is no policy that the namespace
URI must directly retrieve a schema: an alternative mechanism
is provided and the warning "Experience suggests it is not in general
safe or desirable from a performance point of view to directly
dereference NS URIs as a matter of course" is given.

<?Skip if not interested in Schematron?>
Interestingly, this mechanism can also be used to invoke schema
in other languages, since an XML schema does not require any
declarations: for example, we could just have the XML Schema
    <schema ns="http://www.w3.org/1999/XMLSchema">
        <annotation>
            <appinfo>
              <schema ns="http://www.ascc.net/xml/schematron">
                ... schematron schema
                (needs namespace-aware schematron available next week)
              </schema>
          </appinfo>
        </annotation>
    </schema>

<?Resume about XML Schemas?>

Note also the following:
    * the subelements of an element can be in different namespaces
    * so therefore a document may be composed from components from
      many different schemas
    * there is a wildcard mechanism (3.5) to say "any element" or "any
      element of a particular namespace" or "any element from
      the current namespace" or "any element from another namespace"
      which allows frameworks.

For example, here is, I think, an XML schema that validates
that foo:y can contain any element from any other namespace
and that bar:z is a date.  But it declares them as separate
components (there are other mechanisms available):
    <x
        xmlns:schema="http://www.w3.org/1999/XMLSchema"
        xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
        xmlns:bar="http://www.bar.com/bar"
        xmlns:foo="http://www.foo.com/foo" >

        <schema:schema targetNS="http://www.foo.com/foo">
                <schema:element name="y" >
                    <schema:type>
                        <schema:any namespace="##other" />
                    </schema:type>
                </schema:element>
        </schema:schema>

        <schema:schema targetNS="http://www.bar.com/bar">
                <schema:element name="z">
                    <schema:datatype source="date" />
                </schema:element>
        </schema:schema>

        <foo:y   xsi:schemaLocation=
            "#xpointer(//schema:schema[@targetNS='http://www.foo.com/foo'])">
            <bar:z    xsi:schemaLocation=
                "#xpointer(//schema:schema[@targetNS='http://www.bar.com/bar'])"
                >1999-12-29</bar:z>
        </foo:y>
    </x>

The XML Schema validator now knows how to validate y and z.
(The draft is not explicit about how x is treated, but it is consistent
that it would not make the document invalid if there are no
declarations available. )
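
The "any element from another namespace" wildcard can be pictured with a
short Python sketch.  It checks only that one rule for the children of a
given element, and rejecting children that are in no namespace is an
assumption made here.  The function names are invented, and this is not
a schema validator.

```python
import xml.etree.ElementTree as ET

FOO = "http://www.foo.com/foo"
BAR = "http://www.bar.com/bar"


def namespace_of(element):
    """Return the namespace name of an ElementTree element ('' if none)."""
    tag = element.tag
    return tag[1:tag.index("}")] if tag.startswith("{") else ""


def children_in_other_namespace(element):
    """True if every child lies in a namespace other than its parent's.

    Assumption: a child in no namespace at all is rejected too.
    """
    own = namespace_of(element)
    return all(namespace_of(child) not in ("", own) for child in element)


good = ET.fromstring(
    f'<foo:y xmlns:foo="{FOO}" xmlns:bar="{BAR}">'
    '<bar:z>1999-12-29</bar:z></foo:y>')
bad = ET.fromstring(f'<foo:y xmlns:foo="{FOO}"><foo:z/></foo:y>')

print(children_in_other_namespace(good))  # prints: True
print(children_in_other_namespace(bad))   # prints: False
```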


Rick Jelliffe




From sb at metis.no  Tue Dec 28 07:18:53 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:54 2004
Subject: SAX/C++: First interface draft
In-Reply-To: James Clark's message of "Fri, 03 Dec 1999 11:48:47 +0700"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com>
Message-ID: <wh4sd484bc.fsf@viffer.metis.no>

>>>>> James Clark <jjc@jclark.com>:

> One interesting issue is whether to provide a virtual destructor.  I
> think the safest solution is not to provide a virtual destructor but
> instead to declare but not define a private operator delete.  This makes
> it a compile time error to do:

>   DTDHandler *p;
>   // ...
>   delete p;

I'm in the process of making a dispatching DocumentHandler (one that
looks at the top level element of the document and maps from this
element's tag name to the appropriate DocumentHandler).

For this I would like to use a DocumentHandler factory class, and just 
delete the DocumentHandler when I've finished parsing.

But this gets hard to do if the DocumentHandler base class destructor
is private.

Of course there is an easy workaround by wrapping the DocumentHandler
subclass in another class that knows about the subclass, ie. something 
like this:

class MyAdapter : public AdapterBase {
public:
    MyAdapter();
    virtual ~MyAdapter();

    virtual DocumentHandler& handler() { return handler_; }
private:
    MyDocumentHandler handler_;
};

(where handler() is defined as a virtual function in AdapterBase, and
AdapterBase is the class known by the dispatching DocumentHandler).

But it seems like a bit of unnecessary indirection.
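
The dispatching part of this design, leaving the C++ ownership question
aside, can be sketched in Python with the standard xml.sax module.  In
Python the garbage collector owns the delegate, so the delete problem
does not arise; the sketch only shows the factory lookup keyed on the
top-level element.  All class names, the FACTORIES table and the sample
documents are invented for illustration.

```python
import xml.sax
from xml.sax.handler import ContentHandler


class BookCounter(ContentHandler):
    """Hypothetical handler for documents rooted at BookCatalogue."""

    def __init__(self):
        super().__init__()
        self.books = 0

    def startElement(self, name, attrs):
        if name == "Book":
            self.books += 1


class NameCollector(ContentHandler):
    """Hypothetical handler for documents rooted at Employees."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._in_name = False

    def startElement(self, name, attrs):
        self._in_name = (name == "Name")
        if self._in_name:
            self.names.append("")

    def endElement(self, name):
        self._in_name = False

    def characters(self, content):
        if self._in_name:
            self.names[-1] += content


class DispatchingHandler(ContentHandler):
    """Pick a delegate from the top-level element, then forward to it."""

    def __init__(self, factories):
        super().__init__()
        self.factories = factories
        self.delegate = None

    def startElement(self, name, attrs):
        if self.delegate is None:          # first element is the root
            self.delegate = self.factories[name]()
        self.delegate.startElement(name, attrs)

    def endElement(self, name):
        self.delegate.endElement(name)

    def characters(self, content):
        if self.delegate is not None:
            self.delegate.characters(content)


FACTORIES = {"BookCatalogue": BookCounter, "Employees": NameCollector}


def parse(data):
    """Parse a document and return the delegate that handled it."""
    dispatcher = DispatchingHandler(FACTORIES)
    xml.sax.parseString(data, dispatcher)
    return dispatcher.delegate


print(parse(b"<BookCatalogue><Book/><Book/></BookCatalogue>").books)
print(parse(b"<Employees><Name>John Doe</Name></Employees>").names)
```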



From david-b at pacbell.net  Tue Dec 28 21:41:24 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
References: <199912242141.QAA20410@hesketh.net> <386602D4.73E1D651@redrice.com>
Message-ID: <38692E68.34360A6C@pacbell.net>

Francis Norton wrote:
> 
> [1] tools to bring real-life HTML into XML, so it can be manipulated via
> DOM and SAX.

Like the utilities at http://home.pacbell.net/david-b/xml/ perhaps?

The focus there is HTML --> XHTML.  Since it's built on top of
the HTML parser in Swing, it's got a few glitches ... but it's
hooked up to an XHTML validator so you can find those easily,
and newer versions of Swing have fixed some of them.  And if you
want, the "jTidy" HTML parser should work too.

If you get both the utilities package and the DOM (L2.latest)
stuff, you've got most of the tools you'll need.

New, and still with some rough edges, is an XML validator that
is driven using SAX2 callbacks ... that is, it's a clean layer,
can work with anything that produces SAX2 events (including the
declaration callbacks).  Does pretty well except for the stuff
that requires layering violations (see the javadoc).  That is
hooked up to the conformance-enhanced AElfred in that package.

I think it gets confused about the "table" model in XHTML, but
that'll get fixed.

- Dave



From david-b at pacbell.net  Tue Dec 28 22:17:07 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:54 2004
Subject: SAX/C++: GCJ and CNI
References: <14406.59198.949047.2487@localhost.localdomain>
	 <38474BAF.AF4CFF2D@jclark.com> <wh4sd484bc.fsf@viffer.metis.no>
Message-ID: <386936CB.F91680ED@pacbell.net>

I've not been closely following this discussion, but I don't think
that one interesting point got raised yet.

That point being that, as a side effect of the "GNU Compiler for Java"
(GCJ -- see http://sourceware.cygnus.com), such a binding is more or
less defined automatically from the Java binding.  For some folk this
may suffice ... there _will_ be multiple SAX/C++ bindings.

Look at the GCJ and LIBGCJ pages for more information if this sounds
at all interesting.  "CNI" is the binding; it's much lighter weight
than JNI (which has been known to kill all performance advantages of
using native code due to its call overhead).  It's a goal to have C++
and Java code interwork as smoothly as C++ and C do.

A sample CNI interface is included below.

If you go to my XML page (http://home.pacbell.net/david-b/xml/) you'll
see everything is set up to work with GCJ.  To try it now, you should feel
comfortable compiling GCJ (compiler and runtime!) from source, have the
bandwidth to download ~15Mb source snapshots, and cope with the bugs.

For the record, when I've run GCJ it's been mostly faster than Sun's
JVM (including x86 JIT), but not always.  Yes, folk are hard at work
making this faster, and fixing the bugs.  (And as always, help is much
appreciated.)  GCC/EGCS 2.96 should be very interesting!!

- Dave


// DO NOT EDIT THIS FILE - it is machine generated -*- c++ -*-

#ifndef __org_xml_sax_DocumentHandler__
#define __org_xml_sax_DocumentHandler__

#pragma interface

#include <java/lang/Object.h>
#include <gcj/array.h>

extern "Java"
{
  namespace org
  {
    namespace xml
    {
      namespace sax
      {
        class DocumentHandler;
        class AttributeList;
        class Locator;
      }
    }
  }
};

class ::org::xml::sax::DocumentHandler : public ::java::lang::Object
{
public:
  virtual void characters (jcharArray, jint, jint) = 0;
  virtual void endDocument () = 0;
  virtual void endElement (::java::lang::String *) = 0;
  virtual void ignorableWhitespace (jcharArray, jint, jint) = 0;
  virtual void processingInstruction (::java::lang::String *,
::java::lang::String *) = 0;
  virtual void setDocumentLocator (::org::xml::sax::Locator *) = 0;
  virtual void startDocument () = 0;
  virtual void startElement (::java::lang::String *,
::org::xml::sax::AttributeList *) = 0;
};

#endif /* __org_xml_sax_DocumentHandler__ */



From mrossi at crusher.jcals.csc.com  Tue Dec 28 22:16:15 1999
From: mrossi at crusher.jcals.csc.com (Michael Rossi)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
Message-ID: <472EF0A38796D21185810000F807DD1E01A98CC1@crusher.jcals.csc.com>

Simon St.Laurent wrote:
> 
> At 04:31 PM 12/24/99 -0800, Don Park wrote:
> >Atomic XML standards are one page specs each of which defines
> >a single 'power word', a tag name or an attribute name.  An
> >example is 'xmlns' or 'table'.
> >
> >Molecular XML standards are small specs each of which defines
> >a single 'power phrase', a micro-schema involving just a few
> >elements.  An example is 'address' molecule that consists of
> >small number of elements that make up an address.
> >
> >These 'micro-standards' will allow us to create a more coherent
> >XML document standards as well as XML software that can 'learn'
> >to handle new standards by plugging in new power words or phrases.
> 
> This is beautiful!  I'd love to see more projects that assemble smaller
> pieces, rather than trying to create anew within gigantic frameworks.

   While not an exact analogy, this concept brings ontologies to mind.
<Caveat>I claim to know nothing about ontologies except what I've read on
http://www.Ontology.org. </Caveat> Again, the concept seems to be that
defining fundamental notions as building blocks can serve as a catalyst for
more advanced interoperability.

> 
<snip/>
> 
> Based on these wish lists, it sounds like 2000 is pretty promising.  Are
> there more folks with dreams for the next year?  XML-dev is a great place
> to find people with similar needs and interests.

   One thing that springs to mind would be a finalization of some
specification for rendering XML on the web (presumably XSL FOs). That isn't
to say that focusing the W3C's efforts on transformation was a bad thing. On
the contrary, it arguably paved the way for the next step. But I think it's
time to move on and get some groundswell for more things like FOP. If we
could get XML onto actual web pages, instead of just being converted to
HTML, maybe we'd start to see more of the "promises" being kept. Accurate
search results you say? More interactive and intelligent web-apps you say?
Not if XML is only on the back-end. I'd like to see XML step up to the front
in 2000. But I guess the real trick will be getting MS and NS (ah, I mean
AOL) to buy in. Yeah. :-)

Michael A. Rossi
mailto:mrossi@jcals.csc.com
856-983-4400 x4911



From cbullard at hiwaay.net  Wed Dec 29 00:15:28 1999
From: cbullard at hiwaay.net (Len Bullard)
Date: Mon Jun  7 17:18:54 2004
Subject: (Many) XML Schema Questions
References: <824EAE80328AD311B2590090276267820AE03C@mail.develop.com>
Message-ID: <386951C6.3D7@hiwaay.net>

Box, Don wrote:
> 
> Also, see
> http://lists.w3.org/Archives/Public/www-xml-schema-comments/1999OctDec/0054.html
> for James Clark's comments on this attribute.

Thanks for the references, Don.

What James said.  The catalog is much cleaner to understand. The schema 
location attribute is 'ghastly'.

BTW:  If any of the XMLers are interested in a design situation where 
the basis of XML as a lexical design only is used and the application 
language (what some of you call an ML) is itself the basis for 
extensibility (contains its own self-describing means of extending
its namespace and element set), you should be following the
X3D contributors design debate.  In X3D, they are assuming an 
object design from jump, so two different DTDs are being proposed.  
One maps directly to the node names of the VRML97 design; the 
other maps nodes and fields and hangs the names off of the 
attributes of same.   IOW, if you abandon the DOM, abandon the 
HTML browser, and think inside a standalone object framework, 
what do you design? This is XML MMTT in classic form: since
you can't use XML for validation of any real depth, why
bother with it except to pick the shapes of the brackets?

[Note for a select group of elders:  they are 
at about meeting three of the MID work: which are the 
outermost objects and what are their names].

The design arguments should be of interest to those who wish to 
understand how XML and its lack of inline extensibility falls 
apart when designing object frameworks.  It is fascinating to read.  
Having sat through a decade of different teams trying to do this with
markup, starting with the IETM nodes and fields, I have watched talented
and reasonable designers keep trying the same solution and generally
going away from the worlds of DTDs, schemas, architectural forms, etc.

len




From francis at redrice.com  Wed Dec 29 01:05:42 1999
From: francis at redrice.com (Francis Norton)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays
References: <199912242141.QAA20410@hesketh.net> <386602D4.73E1D651@redrice.com> <38692E68.34360A6C@pacbell.net>
Message-ID: <38695EAA.2365D31C@redrice.com>

David Brownell wrote:
> 
> Francis Norton wrote:
> >
> > [1] tools to bring real-life HTML into XML, so it can be manipulated via
> > DOM and SAX.
> 
> Like the utilities at http://home.pacbell.net/david-b/xml/ perhaps?

Thank you David, and John Cowan, for the links. I had heard of Tidy and
thought it was too file-oriented for transactional use, but jTidy and
David's link show that things have moved further and faster than I
realised.

I get the impression that my desire to have xpath included in the DOM
may be naive, but I haven't yet formed a clear picture of the
alternatives. Is there an architectural solution which would allow one
to plug any arbitrary (but compatible) xpath processor in to a DOM
processor? And then use it to select nodes from a document that has been
opened in the DOM, and then update those nodes using the DOM?

Thanks -

Francis.



From david at megginson.com  Wed Dec 29 02:35:37 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:54 2004
Subject: SAX2: Should SAXException extend IOException?
In-Reply-To: <03fd01bf50be$5bc48f80$0a00a8c0@orconet.com>
References: <14432.57765.516268.430263@localhost.localdomain>
	<Pine.SOL.3.96.991222105955.20108C-100000@nine>
	<14433.8570.811500.337122@localhost.localdomain>
	<386202AB.A458BD6C@pacbell.net>
	<m3aen1n9rg.fsf@localhost.localdomain>
	<03fd01bf50be$5bc48f80$0a00a8c0@orconet.com>
Message-ID: <14440.52804.249843.754831@localhost.localdomain>

twleung@sauria.com writes:

 > What about a SAX driver that spits out events by walking a DOM?  
 > If the DOM was created programmatically, it doesn't really seem like an
 > IOException is the right thing to throw.

Well, you can get an IO exception by reading a character stream from a 
String, for example, which is a closely parallel example; that said,
I'm still considering how radically different SAX2 should be.
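
A minimal sketch of the kind of driver being discussed, written in Python
rather than Java purely for brevity. It assumes a hypothetical recorder
class and helper functions (EventRecorder, walk, dom_to_sax); none of these
names come from SAX itself. The point it illustrates is only that the
events are produced from an in-memory tree, so no stream is read at all.

```python
from xml.dom import minidom, Node
from xml.sax.handler import ContentHandler
from xml.sax.xmlreader import AttributesImpl


class EventRecorder(ContentHandler):
    """Hypothetical handler that just records the events it receives."""

    def __init__(self):
        super().__init__()
        self.events = []

    def startElement(self, name, attrs):
        self.events.append(("start", name, dict(attrs.items())))

    def endElement(self, name):
        self.events.append(("end", name))

    def characters(self, content):
        self.events.append(("chars", content))


def walk(node, handler):
    """Emit SAX-style events for a DOM subtree (elements and text only)."""
    if node.nodeType == Node.ELEMENT_NODE:
        attrs = {}
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            attrs[attr.name] = attr.value
        handler.startElement(node.tagName, AttributesImpl(attrs))
        for child in node.childNodes:
            walk(child, handler)
        handler.endElement(node.tagName)
    elif node.nodeType == Node.TEXT_NODE:
        handler.characters(node.data)


def dom_to_sax(document, handler):
    """Drive a handler from a DOM document instead of from a parser."""
    handler.startDocument()
    walk(document.documentElement, handler)
    handler.endDocument()


# Build the DOM programmatically, as in the case raised above.
doc = minidom.Document()
root = doc.createElement("greeting")
root.setAttribute("lang", "en")
root.appendChild(doc.createTextNode("hello"))
doc.appendChild(root)

recorder = EventRecorder()
dom_to_sax(doc, recorder)
```

Whether such a driver should still be allowed to signal an I/O-style
exception is exactly the open design question; the sketch takes no side.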


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From david at megginson.com  Wed Dec 29 02:35:39 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:54 2004
Subject: SAX2: Namespace Processing and NSUtils helper class 
In-Reply-To: <Pine.LNX.4.10.9912261900330.8065-100000@cauchy.clarkevans.com>
References: <199912262051.NAA02771@localhost.localdomain>
	<Pine.LNX.4.10.9912261900330.8065-100000@cauchy.clarkevans.com>
Message-ID: <14440.52595.283247.36221@localhost.localdomain>

Clark C. Evans writes:

 > On Sun, 26 Dec 1999 uche.ogbuji@fourthought.com wrote:
 > > I think that though the prefix is maintained for convenience, it should _not_ 
 > > be considered part of the name in any comparison at the semantic level.
 > >
 > > So, in short, I don't see a problem with
 > > {"http://www.w3.org/1999/xhtml", "a", ""} ==
 > > {"http://www.w3.org/1999/xhtml", "a", "html"}.  It's a very
 > > well-documented consequence of XML Namespaces 1.0, and users 
 > > should be aware of it.
 > 
 > Just testing edge cases...
 > 
 > 	    {"","a",""} != {"","a","html"}
 > 	and {"","a",""} == {"","a",""}

As I understood it, the suggestion was that

  {"", "a", ""} == {"", "a", "html"}

To tell the truth, I am not comfortable with either alternative, and
that's why I don't want to create a Name class.
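
To make the two alternatives concrete, here is a rough sketch in Python.
The Name triple and the same_name helper are hypothetical, introduced only
for illustration; they are not part of SAX2 or of any proposed NSUtils
class. Plain tuple equality compares the prefix too, while same_name
implements the "prefix is not part of the name" reading.

```python
from typing import NamedTuple


class Name(NamedTuple):
    """Hypothetical expanded name: (namespace URI, local part, prefix)."""
    uri: str
    local: str
    prefix: str


def same_name(a: Name, b: Name) -> bool:
    # Semantic comparison: the prefix is carried along but never compared.
    return (a.uri, a.local) == (b.uri, b.local)


XHTML = "http://www.w3.org/1999/xhtml"

# The case from the quoted message: same URI and local part, different prefix.
qualified = same_name(Name(XHTML, "a", ""), Name(XHTML, "a", "html"))

# The edge case under discussion: empty namespace URI, different prefix.
edge_case = same_name(Name("", "a", ""), Name("", "a", "html"))

# Ordinary tuple equality takes the prefix into account as well.
strict = Name("", "a", "") == Name("", "a", "html")
```

Under these assumptions qualified and edge_case are both true and strict is
false, which is the disagreement in a nutshell.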


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/



From uche.ogbuji at fourthought.com  Wed Dec 29 02:54:48 1999
From: uche.ogbuji at fourthought.com (uche.ogbuji@fourthought.com)
Date: Mon Jun  7 17:18:54 2004
Subject: Wish lists for the Holidays 
In-Reply-To: Your message of "Wed, 29 Dec 1999 01:06:50 GMT."
             <38695EAA.2365D31C@redrice.com> 
Message-ID: <199912290254.TAA00954@localhost.localdomain>

> I get the impression that my desire to have xpath included in the DOM
> may be naive, but I haven't yet formed a clear picture of the
> alternatives. Is there an architectural solution which would allow one
> to plug any arbitrary (but compatible) xpath processor in to a DOM
> processor? And then use it to select nodes from a document that has been
> opened in the DOM, and then update those nodes using the DOM?

I don't know whether it's what you term an "architectural solution", but 
4XPath and 4DOM allow such manipulation right now:

http://FourThought.com/4Suite
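
For a feel of the select-then-update pattern, here is a minimal sketch.
Note that it does NOT use the 4XPath or 4DOM APIs; it assumes Python's
standard xml.etree.ElementTree, which supports only a small XPath-like
subset, and the sample document is made up for the illustration.

```python
import xml.etree.ElementTree as ET

CATALOGUE = """\
<BookCatalogue>
  <Book><Title>First</Title><Date>1977</Date></Book>
  <Book><Title>Second</Title><Date>1999</Date></Book>
</BookCatalogue>"""

root = ET.fromstring(CATALOGUE)

# Select nodes with a path expression (ElementTree's limited XPath subset).
old_books = root.findall("./Book[Date='1977']")

# Update the selected nodes through the ordinary tree API.
for book in old_books:
    book.set("reviewed", "yes")
    book.find("Title").text = "First (revised)"

result = ET.tostring(root, encoding="unicode")
```

The selection step hands back ordinary tree nodes, so any path processor
that returns nodes of the same tree could be swapped in without touching
the update code.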

-- 
Uche Ogbuji
FourThought LLC, IT Consultants
uche.ogbuji@fourthought.com	(970)481-0805
Software engineering, project management, Intranets and Extranets
http://FourThought.com		http://OpenTechnology.org




From vita96 at se.its-sby.edu  Wed Dec 29 04:05:49 1999
From: vita96 at se.its-sby.edu (Vita Prihatoni Purnomo)
Date: Mon Jun  7 17:18:54 2004
Subject: XML Converter
Message-ID: <00d901bf51b1$fd358e70$f70a7e0a@bekel>

Hi Chetan,
You wrote that you had found an RTF-to-XML converter. I'm interested in this converter. Could you tell me where you found it?

Thanks,
Vita
From sb at metis.no  Wed Dec 29 09:14:36 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:54 2004
Subject: SAX/C++: GCJ and CNI
In-Reply-To: David Brownell's message of "Tue, 28 Dec 1999 14:16:43 -0800"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <wh4sd484bc.fsf@viffer.metis.no> <386936CB.F91680ED@pacbell.net>
Message-ID: <whiu1i153b.fsf@viffer.metis.no>

>>>>> David Brownell <david-b@pacbell.net>:

> ... there _will_ be multiple SAX/C++ bindings.

Well,... personally I'll follow the binding that David Megginson and
James Clark finally come up with, no matter what it looks like
(ie. no matter whether I like everything in it or not).



From costello at mitre.org  Wed Dec 29 12:56:58 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:55 2004
Subject: XML Schema Question: How to indicate an XML document conforms to an XML 
 Schema
Message-ID: <386A0567.4ECA0C06@mitre.org>

Hi Folks,

Thanks for the pointer to the location in the XML Schema spec where they
discuss how an XML instance document is to reference an XML Schema
(4.3.2).  I have read it over and wish to confirm my understanding.

I would like to first see if I understand the simple case of how to
indicate in an XML document that it conforms to a single XML Schema. 

Example.  Suppose that I create an XML Schema for BookCatalogues 
(called BookCatalogue.xsd):

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd">
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue">
     ...
</schema>

In my XML document I indicate that it conforms to this XML Schema using
the schemaLocation attribute:

<?xml version="1.0"?>
<BookCatalogue 
          xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
          xsi:schemaLocation=
              "http://www.somewhere.org/BookCatalogue
              http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
    <Book>
        <Title>Illusions The Adventures of a Reluctant Messiah</Title>
        <Author>Richard Bach</Author>
        <Date>1977</Date>
        <ISBN>0-440-34319-4</ISBN>
        <Publisher>Dell Publishing Co.</Publisher>
    </Book>
    ...
</BookCatalogue>

In the BookCatalogue element (the root element) I declare that the
schemaLocation attribute comes from the XML Schema Instance namespace
(xsi).  The value of the schemaLocation attribute is a pair of values -
a namespace and the URI to a schema.  When the XML Parser processes this
XML document it will use the schemaLocation pair of values to determine
the XML Schema that it conforms to.  It will retrieve the schema at the
URI specified in schemaLocation (in this example, BookCatalogue.xsd) and
then it will open up this schema document to confirm that its
targetNamespace value matches the namespace value shown in
schemaLocation.  In this example it does.  Thus, the XML Parser knows
that "All the stuff between <BookCatalogue> and </BookCatalogue>
conforms to the schema defined at this URI with this namespace."

Is this a correct understanding of how to indicate in an XML document
that it conforms to a particular XML Schema?  /Roger
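
The steps described above can be sketched as follows. This is only an
illustration of my reading, not what any particular parser does: the
helper names (schema_locations, target_matches) are hypothetical, the
schema text is inlined instead of being fetched from its URI, and Python's
xml.etree.ElementTree is assumed as the XML library.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/1999/XMLSchema/instance"

INSTANCE = """\
<BookCatalogue
    xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
    xsi:schemaLocation=
        "http://www.somewhere.org/BookCatalogue
         http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
  <Book><Title>Illusions</Title></Book>
</BookCatalogue>"""

# Stand-in for the document that would be retrieved from the schema URI.
SCHEMA = """\
<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"/>"""


def schema_locations(root):
    """Return {namespace: schema URI} from the root's xsi:schemaLocation."""
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    if len(tokens) % 2:
        raise ValueError("schemaLocation must hold namespace/URI pairs")
    return dict(zip(tokens[0::2], tokens[1::2]))


def target_matches(schema_text, namespace):
    """Check that the schema's targetNamespace equals the paired namespace."""
    return ET.fromstring(schema_text).get("targetNamespace") == namespace


pairs = schema_locations(ET.fromstring(INSTANCE))
confirmed = all(target_matches(SCHEMA, ns) for ns in pairs)
```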




From DuCharmR at moodys.com  Wed Dec 29 15:39:56 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:18:55 2004
Subject: XML Converter
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2782EB@MDYNYCMSX1>

See rtf2xml at http://www.omnimark.com/develop/contributed. It requires
the use of Omnimark, which is free.

Bob DuCharme       www.snee.com/bob        <bob@  
snee.com>  see www.snee.com/bob/xmlann for
"XML: The Annotated Specification" from Prentice Hall.

 


From eliot at isogen.com  Wed Dec 29 16:46:23 1999
From: eliot at isogen.com (W. Eliot Kimber)
Date: Mon Jun  7 17:18:55 2004
Subject: Survey: Catalysis Templates
References: <NBBBJPGDLPIHJGEHAKBAGEFPEKAA.martind@netfolder.com>
Message-ID: <386A3C0A.C2D7BBC2@isogen.com>

Didier PH Martin wrote:

> This said, thanks for bringing the subject because, for me, "architectures"
> are closer to "interface" like found in C++ base classes or Java interfaces.
> And what is an interface after all? In its simplest form, we can say that it
> is a contract that assure the client that, even if an implementation does
> something totally differently than an other one, we still have the same way
> to interact with all the implementations. So, for example, if I have a
> particular document structure (all the elements have names totally different
> than Hytime) an Hytime apps can still interact with my document structure.
> Thus, the concept is similar to the notion of interface.

Yes, this is I think the closest programming analog to document
architectures. When an element in a document is mapped to some
architectural element, you are saying "my element 'foo' conforms to the
rules for architectural element 'bar', in addition to whatever else I
might say a 'foo' is."  That lets a user or processor of the document
know something about the elements in it without having to know
everything.  I think that is analogous to saying "objects that implement
the 'foo' interface must provide properties A, B, and C, implement
methods D, E, and F, and accept messages G, H, and I".

I like the interface analogy better than the "inheritance" analogy
because it stresses the contract aspect of the relationship and does not
imply any of the things that inheritance implies (which architectures do
not provide, not being about program objects but about inert data
elements).
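
As a loose illustration of the analogy (and nothing more), here is a
sketch in Python using a structural Protocol, so that no inheritance is
involved. The names Bar, Foo, title, and describe are all hypothetical,
chosen to echo the 'foo' and 'bar' in the paragraph above.

```python
from typing import Protocol


class Bar(Protocol):
    """Hypothetical 'architectural element': a contract, not an implementation."""

    def title(self) -> str:
        ...


class Foo:
    """A concrete element type that satisfies the 'bar' contract."""

    def __init__(self, heading, local_extra):
        self.heading = heading
        # "whatever else I might say a 'foo' is"
        self.local_extra = local_extra

    def title(self) -> str:
        return self.heading


def describe(element: Bar) -> str:
    # A processor that knows only the contract, not the concrete type.
    return "bar titled " + element.title()


summary = describe(Foo("Chapter 1", local_extra={"revision": 3}))
```

Foo never names Bar as a base class; it conforms simply by providing what
the contract asks for, which is the property I find attractive in the
interface analogy.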

Cheers,

E.



From tlainevool at yahoo.com  Wed Dec 29 17:37:24 1999
From: tlainevool at yahoo.com (Toivo Lainevool)
Date: Mon Jun  7 17:18:55 2004
Subject: Survey: Catalysis Templates
Message-ID: <19991229173720.29337.qmail@web2105.mail.yahoo.com>

--- "W. Eliot Kimber" <eliot@isogen.com> wrote:
> Yes, this is I think the closest programming analog to document
> architectures. When an element in a document is mapped to some
> architectural element, you are saying "my element 'foo' conforms to the
> rules for architectural element 'bar', in addition to whatever else I
> might say a 'foo' is."  That lets a user or processor of the document
> know something about the elements in it without having to know
> everything.  I think that is analogous to saying "objects that implement
> the 'foo' interface must provide properties A, B, and C, implement
> methods D, E, and F, and accept messages G, H, and I".
> 
> I like the interface analogy better than the "inheritance" analogy
> because it stresses the contract aspect of the relationship and does not
> imply any of the things that inheritance implies (which architectures do
> not provide, not being about program objects but about inert data
> elements).
> 
> Cheers,
> 
> E.

I think an even better match from programming is the "Adapter" pattern from
GoF[1].  To quote from the book an Adapter is used to "Convert the interface of
a class into another interface clients expect.  Adapter lets classes work
together that couldn't otherwise because of incompatible interfaces."
If you change "class" to "document type" in the above, I think this gives a
perfect description of the intent of document architectures.
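
A small sketch of the Adapter reading, under stated assumptions: the memo
document type, the mapping table, and the ArchitecturalView class are all
invented for the example, and Python's xml.etree.ElementTree stands in for
whatever processor would really be used. It is not how HyTime or any
architecture engine is implemented.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one document type's element names to the
# names an architecture-aware client expects.
ARCHITECTURAL_NAMES = {"memo": "document", "subject": "title", "body": "content"}


class ArchitecturalView:
    """Adapter: presents an element under its architectural name."""

    def __init__(self, element, mapping):
        self._element = element
        self._mapping = mapping

    @property
    def name(self):
        # Unmapped elements keep their own name.
        return self._mapping.get(self._element.tag, self._element.tag)

    @property
    def text(self):
        return self._element.text

    def children(self):
        return [ArchitecturalView(child, self._mapping)
                for child in self._element]


def client_titles(view):
    # The client only knows the architectural vocabulary.
    return [child.text for child in view.children() if child.name == "title"]


memo = ET.fromstring("<memo><subject>Budget</subject><body>Details</body></memo>")
titles = client_titles(ArchitecturalView(memo, ARCHITECTURAL_NAMES))
```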

Toivo Lainevool

[1] Design Patterns: Elements of Reusable Object-Oriented Software.  Gamma,
Helm, Johnson, Vlissides.




From DuCharmR at moodys.com  Wed Dec 29 18:35:20 1999
From: DuCharmR at moodys.com (DuCharme, Robert)
Date: Mon Jun  7 17:18:55 2004
Subject: IE+XML+XSL and html namespace
Message-ID: <01BA10F0CD20D3119B2400805FD40F9F2782F4@MDYNYCMSX1>

When I feed an XML document to Internet Explorer 5.0 using a CSS2 stylesheet
and declare html as a namespace in the document element's start-tag, IE
recognizes html namespace element types and displays them properly. For
example, the following shows up as a working form:

  <html:form action="http://search.yahoo.com/bin/search">
    <html:input type="submit" value="Yahoo"/>
    <html:input size="22" name="p"/>
  </html:form>

However, if I'm using an XSL stylesheet, it doesn't work. Does anyone know
how I can get html-specific elements to work in an XML document displayed in
IE using XSL? (I'm actually interested in using html:object, html:script,
and  html:xml "data island" elements to pass some XML markup to an ActiveX
control. This also works using CSS and not XSL, but an html:form example was
easier to include in this message.)

Bob DuCharme          www.snee.com/bob           <bob@  
snee.com>  "The elements be kind to thee, and make thy
spirits all of comfort!" Anthony and Cleopatra, III ii



From l-arcini at uniandes.edu.co  Wed Dec 29 18:49:42 1999
From: l-arcini at uniandes.edu.co (Fabio Arciniegas A.)
Date: Mon Jun  7 17:18:55 2004
Subject: Survey Catalysis
Message-ID: <Pine.GSO.3.96.991229134056.24374A-100000@isis>

Well, I think there are three points of view here and basically three 
analogies with traditional patterns:

1. The relation between the result of the mapping and the third party 
interested in the final product: As you said this would be an "Adapter":
The third party expects a certain interface and an existing product is adapted
to conform to it.

2. The relation between the mapped element and the architectural form: this
is, as Didier noted, an "Interface" ("Marker Interface" [Larman] might
be more appropriate, since the purpose of the mapping is to mark the
mapped product as something).

3. The Architecture structure as a "Model Template". (This is the way I
originally understood the survey, because of the reference to the Catalysis
notion of framework.) The Architecture provides a structure that is
specialized via the mapping of other constructs to the template.

Best,
	Fabio

> I think an even better match from programming is the "Adapter" pattern
> from GoF[1].  To quote from the book an Adapter is used to "Convert the
> interface of a class into another interface clients expect.  Adapter lets
> classes work together that couldn't otherwise because of incompatible
> interfaces."  If you change "class" to "document type" in the above, I
> think this gives a perfect description of the intent of document
> architectures.
>
> Toivo Lainevool
>
> [1] Design Patterns: Elements of Reusable Object-Oriented Software.
> Gamma, Helm, Johnson, Vlissides.

--
Fabio Arciniegas A.		        Viaduct Technologies, Inc.
fabio@viaduct.com			Software Engineer
Interests: XML, Wittgenstein and just about everything in between.
Oblique Strategy of the day: 	      "Abandon normal instruments"

--
Fabio Arciniegas Arjona              
l-arcini@uniandes.edu.co            
                                



From david-b at pacbell.net  Wed Dec 29 20:00:35 1999
From: david-b at pacbell.net (David Brownell)
Date: Mon Jun  7 17:18:55 2004
Subject: SAX/C++: GCJ and CNI
References: <14406.59198.949047.2487@localhost.localdomain>
	 <38474BAF.AF4CFF2D@jclark.com> <wh4sd484bc.fsf@viffer.metis.no>
	 <386936CB.F91680ED@pacbell.net> <whiu1i153b.fsf@viffer.metis.no>
Message-ID: <386A6864.C27EBB5@pacbell.net>

Steinar Bang wrote:
> 
> >>>>> David Brownell <david-b@pacbell.net>:
> 
> > ... there _will_ be multiple SAX/C++ bindings.
> 
> Well,... personally I'll follow the binding that David Megginson and
> James Clark finally come up with, no matter what it looks like
> (ie. no matter whether I like everything in it or not).

Lots of folk will make similar choices.  I heard voices already
pointing out they couldn't use a binding that didn't work on their
old funky nonstandard "C++" system, for example, regardless whether
it's a "blessed" API or not.

That's all I meant when I said there "_will_" be multiple bindings.
SAX/Java hit a window of opportunity.  SAX/C++ missed one, IMHO,
so it's guaranteed plenty of competition.  Bindings generated right
from the Java code are not, and will not be, the only other options.

- Dave

p.s. No comments on the CNI binding itself?



From andrewl at microsoft.com  Wed Dec 29 20:11:12 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:55 2004
Subject: (Many) XML Schema Questions
Message-ID: <33D189919E89D311814C00805F1991F7F4AADD@RED-MSG-08>

What was said by Rick Jelliffe regarding the current schema draft is true
(and anyone who is interested is encouraged to read the actual XML Schema WD
at http://www.w3.org/TR/xmlschema-1/ and http://www.w3.org/TR/xmlschema-2/).

However, I would like to correct a possible misimpression that might arise
from the turgid wording in the current public draft and also from Rick's
statement "Then (s4.3.2) there is an attribute xsi:schemaLocation that can
be put on any instance element. It allows the location of the schema to be
declared. ..."  

After extensive debate, the XML Schemas WG decided that the
xsi:schemaLocation attribute serves as a hint, not a mandatory directive.
That is, the processor of an instance is welcome to look at the URI
referenced by the value of xsi:schemaLocation, but is not required to.  It
may process an instance document using a different schema set (or no schemas
at all).  The relevant phrase is "unless directed otherwise" in the
following passage from the 1999-12-17 structures draft:

"Again, unless directed otherwise general-purpose schema-aware processors
must attempt to dereference each schema URI in the value of "schemaLocation"
to obtain a schema..."

This is in recognition of the fact that, ultimately, the processor of a
document determines what processing is done.
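
A rough sketch of what "hint, not directive" can mean for a processor. The
choose_schemas function, its parameters, and the BookCatalogue URIs are
hypothetical, used only to illustrate the idea; nothing here is taken from
the draft's text or from any real schema processor.

```python
def choose_schemas(hinted, overrides=None, use_hints=True):
    """Pick one schema URI per namespace.

    hinted    -- {namespace: URI} taken from xsi:schemaLocation
    overrides -- {namespace: URI} supplied by whoever runs the processor
    use_hints -- False means "directed otherwise": ignore the document's hints
    """
    chosen = dict(hinted) if use_hints else {}
    chosen.update(overrides or {})
    return chosen


NS = "http://www.somewhere.org/BookCatalogue"
hinted = {NS: "http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd"}

# Default behaviour: follow the hint in the instance document.
default = choose_schemas(hinted)

# The processor's owner substitutes a local copy for the hinted location.
local_copy = choose_schemas(
    hinted, overrides={NS: "file:///schemas/BookCatalogue.xsd"})

# The processor is told to use no schemas at all.
no_schemas = choose_schemas(hinted, use_hints=False)
```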




From sb at metis.no  Thu Dec 30 00:19:59 1999
From: sb at metis.no (Steinar Bang)
Date: Mon Jun  7 17:18:55 2004
Subject: SAX/C++: GCJ and CNI
In-Reply-To: David Brownell's message of "Wed, 29 Dec 1999 12:00:36 -0800"
References: <14406.59198.949047.2487@localhost.localdomain> <38474BAF.AF4CFF2D@jclark.com> <wh4sd484bc.fsf@viffer.metis.no> <386936CB.F91680ED@pacbell.net> <whiu1i153b.fsf@viffer.metis.no> <386A6864.C27EBB5@pacbell.net>
Message-ID: <whiu1hqw3b.fsf@viffer.metis.no>

>>>>> David Brownell <david-b@pacbell.net>:

> Steinar Bang wrote:
>> 
>> >>>>> David Brownell <david-b@pacbell.net>:

>> Well,... personally I'll follow the binding that David Megginson and
>> James Clark finally come up with, no matter what it looks like
>> (ie. no matter whether I like everything in it or not).

> Lots of folk will make similar choices.  I heard voices already
> pointing out they couldn't use a binding that didn't work on their
> old funky nonstandard "C++" system, for example,

I've made a lot of noises like that, which is why I've felt the need
to make my position clear a couple of times.  I wouldn't want my
objections to slow down convergence towards a standard.

[snip!]
> p.s. No comments on the CNI binding itself?

Umm... I think I would prefer something more native C++-like as the
final standard.



From jlapp at webMethods.com  Thu Dec 30 01:43:44 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
Message-ID: <3.0.32.19991229204452.01a4c260@nexus.webmethods.com>

I've been thinking about the utility of naming all elements and most
attributes using a namespace-URI/local-name pair.  Let's denote such a
name as (namespace, local-name).  (I say "most" attributes because it
won't name anything in the per-element-type partition.)  Seems to me
that filtering operations would commonly extract names belonging to a
particular namespace, so requests for (namespace, *) might be pretty
common.  Let's look at this more closely...

Suppose I'm defining elements that describe electronics parts.  I'm
going to want to organize them hierarchically.  For example:

  www.parts.com/computer/memory/sram
  www.parts.com/computer/memory/dram
  www.parts.com/computer/cpus/intel
  www.parts.com/computer/cpus/amd
  www.parts.com/stereo/speaker/surround
  www.parts.com/stereo/speaker/subwoofer

etc.

It may make sense for one application to examine all computer parts,
another to examine all computer memory parts, and so on.  If I want all
memory parts I have to know all the pertinent namespace URIs.  If I
know that the URIs are structured hierarchically, I could do a wildcard
search on the URI itself -- assuming I had a tool that let me do so (do
any yet?).

But because URIs allow this, the next guy organizes his namespaces
differently:

  www.nextguy.com/computer-memory-sram
  www.nextguy.com/computer-memory-dram
  www.nextguy.com/computer-cpus-intel
  www.nextguy.com/computer-cpus-amd
  www.nextguy.com/stereo-speaker-surround
  www.nextguy.com/stereo-speaker-subwoofer

And the next next guy does so as follows:

  www.nextnextguy.com/computer?memory=true+type=sram
  www.nextnextguy.com/computer?memory=true+type=dram
  www.nextnextguy.com/computer?cputype=intel
  www.nextnextguy.com/computer?cputype=amd
  www.nextnextguy.com/stereo/speaker?surround
  www.nextnextguy.com/stereo/speaker?subwoofer

To make namespace filtering work for the general case requires
regex-like matching capabilities.  And regex matching isn't very easy
to optimize for performance (such as via indexing).  It also isn't the
kind of thing we want the average XML user to have to learn -- seems to
me that it would have to bubble up to the user interface, at least on
generic XML tools.
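
A minimal sketch of the difference, assuming the three layouts above.
The helper names (under, matching) and the regex patterns are made up
for illustration; nothing here is an existing tool.  With the path-style
layout a single prefix test is enough, while the other two layouts need
a pattern written for each site:

```python
import re

# Namespace names copied from the three example layouts above.
PARTS = [
    "www.parts.com/computer/memory/sram",
    "www.parts.com/computer/memory/dram",
    "www.parts.com/computer/cpus/intel",
    "www.parts.com/stereo/speaker/surround",
]
NEXTGUY = [
    "www.nextguy.com/computer-memory-sram",
    "www.nextguy.com/computer-memory-dram",
    "www.nextguy.com/computer-cpus-intel",
]
NEXTNEXTGUY = [
    "www.nextnextguy.com/computer?memory=true+type=sram",
    "www.nextnextguy.com/computer?memory=true+type=dram",
    "www.nextnextguy.com/computer?cputype=intel",
]


def under(namespaces, parent):
    """Path-style hierarchy: one prefix test on whole path segments."""
    prefix = parent.rstrip("/") + "/"
    return [ns for ns in namespaces if ns.startswith(prefix)]


def matching(namespaces, pattern):
    """Any other layout: the caller must supply a site-specific regex."""
    rx = re.compile(pattern)
    return [ns for ns in namespaces if rx.search(ns)]


memory_parts = under(PARTS, "www.parts.com/computer/memory")
memory_nextguy = matching(NEXTGUY, r"/computer-memory-")
memory_nextnext = matching(NEXTNEXTGUY, r"\?memory=true\b")
```

The prefix test could be served by an ordinary sorted index; the two
regex cases cannot, and their patterns have to be known in advance.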

So I'm thinking that we need a *standard* way to organize namespaces
hierarchically, and that we need one before namespace usage is so
widespread that we absolutely have to provide regex support.

But maybe I'm jumping the gun.  I haven't yet heard anyone scream out
in pain, though I'm not sure we should be waiting for pain to come.

--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From tbray at textuality.com  Thu Dec 30 01:58:54 1999
From: tbray at textuality.com (Tim Bray)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
Message-ID: <3.0.32.19991229180016.00ea9bfc@192.168.0.1>

At 08:44 PM 12/29/99 -0500, Joe Lapp wrote:
>So I'm thinking that we need a *standard* way to organize namespaces
>hierarchically, and that we need one before namespace usage is so
>widespread that we absolutely have to provide regex support.

Sounds like a good idea.  How about a proposal?  Shorter is better.
 -Tim




From bckman at ix.netcom.com  Thu Dec 30 02:22:44 1999
From: bckman at ix.netcom.com (Frank Boumphrey)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
References: <3.0.32.19991229204452.01a4c260@nexus.webmethods.com>
Message-ID: <01c401bf526e$b7c6c300$15addccf@prioritynetworks.net>

I think that we need to remember that a namespace is just a unique name; it
is not a URI. URIs are used because they happen to be unique names
presumably under the control of the namespace author. There is nothing to
stop me from using 'superfragilisticexpialidocious' as a namespace.

Thus
xmlns='www.parts.com/computer/memory/sram'

and

xmlns= 'www.nextguy.com/computer-memory-sram'

may be technically the same URI, but they are different namespaces!

Frank





From jlapp at webMethods.com  Thu Dec 30 02:33:07 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
Message-ID: <3.0.32.19991229213414.01b607a0@nexus.webmethods.com>

At 09:36 PM 12/29/99 -0500, Frank Boumphrey wrote:
>thus
>xmlns='www.parts.com/computer/memory/sram'
>
>and
>
>xmlns= 'www.nextguy.com/computer-memory-sram'
>
>may be technically the same URI, but they are different namespaces!

Oh, I perfectly intended that!  My apologies for using the electronic
parts hierarchy in all three examples; I needn't have done so.  I'm
concerned with addressing any one of these namespace sets individually,
not with unifying the different sets.
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From chq at softlab.nju.edu.cn  Thu Dec 30 03:38:33 1999
From: chq at softlab.nju.edu.cn (Chen Hong Qiang)
Date: Mon Jun  7 17:18:55 2004
Subject: Is there any good browser in Unix?
Message-ID: <386C25DA.87B72C95@softlab.nju.edu.cn>

Hello,
    I downloaded Mozilla, but it cannot display an XML file the way I
wished.
    For example, the XML and XSL files look like this:
    XML source:
     <?xml version="1.0"?>
    <?xml-stylesheet type="text/xsl" href="result.xsl"?>
    <xslTutorial >
     <employee>
     <firstName>Joe</firstName>
     <surname>Smith</surname>
     </employee>
     </xslTutorial>

result.xsl:
     <xsl:stylesheet xmlns:xsl='http://www.w3.org/XSL/Transform/1.0'>
     <xsl:template match="firstName|surname">
     <DIV><xsl:text> [template: </xsl:text>
     <xsl:value-of select="name()"/>
     <xsl:text> outputs </xsl:text>
     <xsl:apply-templates/>
     <xsl:text> ]</xsl:text> </DIV>
     </xsl:template>
    </xsl:stylesheet>

It doesn't display this:
    [template: firstName outputs Joe ]
    [template: surname outputs Smith ]
but instead this:
            John Smith.

What's wrong?

FURTHERMORE
        Is there any other browser on Unix? (I have IE 5.0 on
Windows, and it's OK!)
From clark.evans at manhattanproject.com  Thu Dec 30 03:49:52 1999
From: clark.evans at manhattanproject.com (Clark C. Evans)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
In-Reply-To: <3.0.32.19991229180016.00ea9bfc@192.168.0.1>
Message-ID: <Pine.LNX.4.10.9912292224520.15049-100000@cauchy.clarkevans.com>

On Wed, 29 Dec 1999, Tim Bray wrote:
> At 08:44 PM 12/29/99 -0500, Joe Lapp wrote:
> >So I'm thinking that we need a *standard* way to organize namespaces
> >hierarchically, and that we need one before namespace usage is so
> >widespread that we absolutely have to provide regex support.
> 
> Sounds like a good idea.  How about a proposal?  Shorter is better.

I don't remember where, but I remember reading an example where
a URL was given as the xmlns URI -- pointing to an XmlSchema; 
thus, not only uniquely defining the namespace, but also 
identifying the element definitions in that namespace. So, 
this is how I had seen namespaces being powerful, as a pointer
to meta-data.

That being said, I had never expected a set of namespaces 
to have value by organizing them hierarchically... so I'm 
wondering exactly what value a hierarchy of namespaces would 
provide?  Would it be a sequence of ever-so-much-more-specific
schemas?  Where the most specific definition is the binding 
one? I can't think of any other reasons why I'd have more
than one namespace for a given "domain", let alone a 
hierarchically organized set.  What am I missing?

Thanks!

Clark




From anderst at toolsmiths.se  Thu Dec 30 09:44:33 1999
From: anderst at toolsmiths.se (Anders W. Tell)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
References: <Pine.LNX.4.10.9912292224520.15049-100000@cauchy.clarkevans.com>
Message-ID: <386A50D2.6371A10A@toolsmiths.se>


"Clark C. Evans" wrote:

> That being said, I had never expected a set of namespaces
> to have value by organizing them hierarchically... so I'm
> wondering exactly what value a hierarchy of namespaces would
> provide?  Would it be a sequence of ever-so-much-more-specific
> schemas?  Where the most specific definition is the binding
> one? I can't think of any other reasons why I'd have more
> than one namespace for a given "domain", let alone a
> hierarchically organized set.  What am I missing?

What you are missing is a "possible" correlation to other namespace
and scoping structures such as C++ namespaces, CORBA IDL modules
and Java packages.


/anders
--
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/
/  Financial Toolsmiths AB  /
/  Anders W. Tell           /
/_/_/_/_/_/_/_/_/_/_/_/_/_/_/




From digitome at iol.ie  Thu Dec 30 11:39:20 1999
From: digitome at iol.ie (Sean Mc Grath)
Date: Mon Jun  7 17:18:55 2004
Subject: Hierarchical namespaces?
Message-ID: <3.0.6.32.19991230112047.00999100@gpo.iol.ie>

[Joe Lapp]
>
>So I'm thinking that we need a *standard* way to organize namespaces
>hierarchically, and that we need one before namespace usage is so
>widespread that we absolutely have to provide regex support.
>
>But maybe I'm jumping the gun.  I haven't yet heard anyone scream out
>in pain, though I'm not sure we should be waiting for pain to come.
>
I am in violent agreement with this.
I believe I heard a scream some time ago - the multiple HTML namespace
debate. If a hierarchical HTML namespace mechanism existed,
developers simply wanting to match "p" in the sense of good-old HTML
could use a regexp to find them. Developers interested in
HTML 4.0 "p" would provide more leaves in the hierarchical
namespace specification.

regards,

Sean,




From costello at mitre.org  Thu Dec 30 12:03:10 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:55 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML Schema 
 Questions
References: <33D189919E89D311814C00805F1991F7F4AADD@RED-MSG-08>
Message-ID: <386B4A25.1D34E681@mitre.org>

Hi Folks,

I gotta tell ya, this ol' country boy is having a mighty difficult time
figuring out how an XML document is to indicate that it conforms to
a particular XML Schema.  It seems to me that this should be one area
that should be made crystal clear.  Instead, I am finding this to be one
of the murkiest parts of the XML Schema spec.

These statements really throw me for a loop:

"xsi:schemaLocation attribute serves as a hint, not a mandatory
directive. That is, the processor of an instance is welcome to look at
the URI referenced by the value of xsi:schemaLocation, but is not
required to."

"The means used to locate appropriate schema document(s) are processor
and application dependent"

I read these statements as saying that there is no standard way for
specifying in an XML document what XML Schema it conforms to - every XML
Parser will have its own way of doing things.  Really???   If this is
so, please, please tell me why this is a good thing.  I am struggling to
appreciate its beauty.    /Roger

Andrew Layman wrote:
> 
> What was said by Rick Jelliffe regarding the current schema draft is true
> (and anyone who is interested is recommended to read the actual XML Schema WD
> at http://www.w3.org/TR/xmlschema-1/ and http://www.w3.org/TR/xmlschema-2/.)
> 
> However, I would like to correct a possible misimpression that might arise
> from the turgid wording in the current public draft and also from Rick's
> statement "Then (s4.3.2) there is an attribute xsi:schemaLocation that can
> be put on any instance element. It allows the location of the schema to be
> declared. ..."
> 
> After extensive debate, the XML Schemas WG decided that the
> xsi:schemaLocation attribute serves as a hint, not a mandatory directive.
> That is, the processor of an instance is welcome to look at the URI
> referenced by the value of xsi:schemaLocation, but is not required to.  It
> may process an instance document using a different schema set (or no schemas
> at all).  The relevant phrase is "unless directed otherwise" in the
> following passage from the 1999-12-17 structures draft:
> 
> "Again, unless directed otherwise general-purpose schema-aware processors
> must attempt to dereference each schema URI in the value of "schemaLocation"
> to obtain a schema..."
> 
> This is in recognition of the fact that, ultimately, the processor of a
> document determines what processing is done.
> 




From bkline at rksystems.com  Thu Dec 30 13:02:29 1999
From: bkline at rksystems.com (Bob Kline)
Date: Mon Jun  7 17:18:55 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML
 Schema  Questions
In-Reply-To: <386B4A25.1D34E681@mitre.org>
Message-ID: <Pine.LNX.4.10.9912300736350.15103-100000@rksystems.com>

On Thu, 30 Dec 1999, Roger L. Costello wrote:

> I read these statements as saying that there is no standard way for
> specifying in an XML document what XML Schema it conforms to - every
> XML Parser will have its own way of doing things.  Really???  If
> this is so, please, please tell me why this is a good thing.  I am
> struggling to appreciate its beauty.  /Roger

I am as puzzled as you are.  Yes, it's true, as Andrew writes, that
"ultimately, the processor of a document determines what processing is
done" [cited as the rationale for the decision by the XML Schemas WG to
demote the xsi:schemaLocation attribute to a "hint"].  The same could be
said of any software, which behaves in direct response to the
instructions written by its creators, rather than the prescriptions of
standards.  The role of a standard is to assist in the processes of
predicting how software which claims conformance to it will behave, and
of determining which products are actually conformant.

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com




From ht at cogsci.ed.ac.uk  Thu Dec 30 13:26:33 1999
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:18:55 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML  Schema  Questions
In-Reply-To: Bob Kline's message of "Thu, 30 Dec 1999 08:02:36 -0500 (EST)"
References: <Pine.LNX.4.10.9912300736350.15103-100000@rksystems.com>
Message-ID: <f5bbt785zls.fsf@cogsci.ed.ac.uk>

[Thread wrt instance->schema connections, lack of rigidity thereof]

With apologies for the 'turgid prose' of the draft in this area, let
me try to explain why flexibility IN THE REC in this area is a Good
Thing:

Schemas are a powerful and useful mechanism, with a wide range of
possible deployment scenarios.  Different schemas may usefully be
employed with respect to the same instance document for different
purposes, all legitimate.  'xsi:schemaLocation' is a means by which a
document author can signal A location for A schema with respect to
which s/he warrants the instance at hand is schema-valid.  It will
often be appropriate for schema-aware processors to exploit this
information.  But it may not always be possible (the processor may be
offline) or appropriate (the processor may have other schema-based
processing in view) to do so.  We have tried in the current draft to
indicate that 'xsi:schemaLocation' is the preferred, inter-operable
means by which instances signal schemas to processors, WITHOUT making
this connection make-or-break mandatory.
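
A rough sketch of what "hint, not directive" could look like inside a
processor.  The function names and the override/offline parameters are
hypothetical, and the sketch assumes the attribute value is a
whitespace-separated list of namespace/location pairs; the draft leaves
the actual policy to each processor.

```python
import xml.etree.ElementTree as ET

# Instance namespace as written in the drafts discussed in this thread.
XSI = "http://www.w3.org/1999/XMLSchema/instance"


def schema_hints(root):
    """Return {namespace: location} from xsi:schemaLocation, or {} if absent."""
    value = root.get("{%s}schemaLocation" % XSI, "")
    tokens = value.split()
    # Assumed layout: namespace1 location1 namespace2 location2 ...
    return dict(zip(tokens[0::2], tokens[1::2]))


def choose_schema(root, namespace, override=None, offline=False):
    """Hypothetical policy: explicit override first, then the document's hint."""
    if override is not None:
        return override          # the application has other processing in view
    if offline:
        return None              # cannot dereference anything; skip the hint
    return schema_hints(root).get(namespace)
```

Nothing in the instance has to be edited to validate against a
different schema: the caller passes an override and the hint is simply
not consulted.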

A moment's thought about experience with XML's instance->DTD linkage
will perhaps suggest some benefits of this approach:  as it stands, if 
I wish to validate an XML instance which references no external DTD, I 
have to edit it to incorporate a suitable DOCTYPE declaration.  Even
if the document has a DOCTYPE, if the URL it references is unavailable 
or out-of-date, I again must have recourse to a text editor to fix
this.  We've tried to do better for XML Schema.  Another experience
we've tried to learn from is the instance->stylesheet one, with
similar lessons we believe.

Hope this helps,

ht
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/



From costello at mitre.org  Thu Dec 30 13:31:40 1999
From: costello at mitre.org (Roger L. Costello)
Date: Mon Jun  7 17:18:55 2004
Subject: XML Schema Question: understanding equivClass and abstract
Message-ID: <386B5EE3.8511AD91@mitre.org>

Hi Folks,

I am not sure that I understand equivClass types and abstract elements. 
If I may, I will describe my understanding, and then you can tell me
where I am mistaken.  As a point of discussion, consider this XML
Schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
               targetNamespace="http://www.xfront.org/BookCatalogue"
               xmlns:cat="http://www.xfront.org/BookCatalogue">
    <type name="Publication">
        <element name="Title" type="string" maxOccurs="*"/>
        <element name="Author" type="string" maxOccurs="*"/>
        <element name="Date" type="date"/>
    </type>
    <element name="Publication" type="cat:Publication" 
             abstract="true"/>
    <element name="Book" equivClass="cat:Publication">
        <type source="cat:Publication" derivedBy="extension">
            <element name="ISBN" type="string"/>
            <element name="Publisher" type="string"/>
        </type>
    </element>
    <element name="Magazine" equivClass="cat:Publication">
        <type source="cat:Publication" derivedBy="restriction">
            <element name="Author" type="string" maxOccurs="0"/>
        </type>
    </element>
    <element name="Catalogue">
        <type>
            <element ref="cat:Publication" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>
</schema>

I have defined a Publication type and a Publication element which is of
type Publication.  The Publication element I declared to be abstract by
setting the abstract attribute equal to 'true'.  I then defined a Book
and Magazine element and made them equivalence classes of the
Publication type.  Lastly, I defined a Catalogue element.  It contains
zero or more Publication elements.

My understanding is that by making the Publication element abstract,
wherever it is used (as it is in the Catalogue element) it must be
replaced in the XML instance document by an equivalent element.  What I
mean by this is best explained by showing an XML instance document:

<?xml version="1.0"?>
<Catalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
          xsi:schemaLocation=
              "http://www.somewhere.org/Catalogue
              http://www.somewhere.org/Catalogue/Catalogue.xsd">
        <Magazine>
                <Title>Natural Health</Title>
                <Date>December, 1999</Date>
        </Magazine>
        <Book>
                <Title>Illusions The Adventures of a Reluctant
                       Messiah</Title>
                <Author>Richard Bach</Author>
                <Date>1977</Date>
                <ISBN>0-440-34319-4</ISBN>
                <Publisher>Dell Publishing Co.</Publisher>
        </Book>
        <Book>
                <Title>The First and Last Freedom</Title>
                <Author>J. Krishnamurti</Author>
                <Date>1954</Date>
                <ISBN>0-06-064831-7</ISBN>
                <Publisher>Harper &amp; Row</Publisher>
        </Book>
</Catalogue>

Notice how the Catalogue element contains Magazine and Book elements. 
The abstract Publication element that was defined to be the content of
Catalogue has been replaced by elements that are of an equivalent class
with Publication.  The Magazine and Book elements were declared to be of
an equivalent class, thus they can be used as the replacements for the
abstract Publication element.

Is this a correct understanding of abstract elements and of equivClass
types?  /Roger
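
The reading described above can be modelled in a few lines.  This is
only a sketch of that reading, not of the draft's normative rules; the
table and function names are invented, and following equivClass links
transitively is an assumption.

```python
# Element declarations from the schema above, reduced to what matters here.
ABSTRACT = {"Publication"}
EQUIV_CLASS = {"Book": "Publication", "Magazine": "Publication"}


def may_stand_in(actual, declared):
    """True if element `actual` may appear where `declared` is referenced."""
    if actual == declared:
        # An abstract element may never appear in an instance itself.
        return declared not in ABSTRACT
    # Follow equivClass links upward (Book -> Publication -> ...).
    head = EQUIV_CLASS.get(actual)
    while head is not None:
        if head == declared:
            return True
        head = EQUIV_CLASS.get(head)
    return False


def check_catalogue(children):
    """Catalogue allows zero or more references to cat:Publication."""
    return all(may_stand_in(name, "Publication") for name in children)
```

Under this model the instance document shown earlier (one Magazine, two
Books) is accepted, while a Catalogue containing a literal Publication
element is rejected.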




From david at megginson.com  Thu Dec 30 13:43:34 1999
From: david at megginson.com (David Megginson)
Date: Mon Jun  7 17:18:55 2004
Subject: SAX2 Namespace Support
In-Reply-To: <385FA4C9.2B054683@pacbell.net>
References: <14430.46481.974707.922192@localhost.localdomain>
	<012701bf4b44$0c68b030$4a5eedc1@arp01>
	<14430.55483.433692.943811@localhost.localdomain>
	<385FA4C9.2B054683@pacbell.net>
Message-ID: <14442.32320.722924.870701@localhost.localdomain>

David Brownell writes:

 > There's been way too much email on this topic -- I should have
 > weighed in earlier.  In all honesty I'd prefer to see all namespace
 > support be cleanly layered on top of SAX1.  It's easy to do it that
 > way; just add some optional code to postprocess a SAX event stream.

The argument against that is efficiency: I have found that even the
most efficient Namespace post-processor that I can write adds about
25% to parsing time.  The reason, I think, is that there is a high
cost to iterating through every attribute list and examining every
attribute name, and copying or wrapping the attribute lists to give a
Namespace view.  

If Namespace processing is done in the parser itself, on the other
hand, the overhead should be relatively close to 0.  Since most new
XML-related standards require Namespace processing, this is an obvious
place to optimize by allowing the parser to pass on information
directly.
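
To make the cost concrete, here is a minimal sketch of such a
post-processing layer.  The class and method names are hypothetical and
it is not the filter that was measured; it only shows where the work
goes: every attribute name is examined once to find declarations, and
the remaining attributes are copied under expanded names.

```python
def split_qname(qname):
    """Split 'prefix:local' into (prefix, local); prefix is '' if absent."""
    prefix, sep, local = qname.partition(":")
    return (prefix, local) if sep else ("", qname)


class NamespaceFilter:
    """Hypothetical layer turning prefixed names into (uri, local) pairs."""

    def __init__(self):
        # One prefix->URI map per open element; 'xml' is always bound.
        self.scopes = [{"xml": "http://www.w3.org/XML/1998/namespace"}]

    def start_element(self, name, attrs):
        scope = dict(self.scopes[-1])
        plain = {}
        # Pass 1: examine every attribute name to find declarations.
        for key, value in attrs.items():
            if key == "xmlns":
                scope[""] = value or None
            elif key.startswith("xmlns:"):
                scope[key[6:]] = value
            else:
                plain[key] = value
        self.scopes.append(scope)
        # Pass 2: copy the remaining attributes under expanded names.
        ns_attrs = {}
        for key, value in plain.items():
            prefix, local = split_qname(key)
            # Unprefixed attributes are in no namespace.
            uri = scope[prefix] if prefix else None
            ns_attrs[(uri, local)] = value
        prefix, local = split_qname(name)
        # An undeclared prefix raises KeyError in this sketch.
        uri = scope[prefix] if prefix else scope.get("")
        return (uri, local), ns_attrs

    def end_element(self, name):
        self.scopes.pop()
```

A parser that already tokenizes attribute names as it reads them can do
the same bookkeeping without the second walk and the copy.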

 > With respect to this particular proposal, I have several comments.
 > 
 > First, it's unclear to me what's happened to our old friend, the
 > org.xml.sax.DocumentHandler.startElement callback:
 > 
 >     public void startElement (String name, AttributeList attrs)
 >     throws SAXException;
 > 
 > If that call is gone, I anticipate migration problems to SAX2.

There have been so many proposals that I'm starting to lose track.
The idea, I think, is that this would be replaced by

  public void startElement (String namespaceURI, String localName,
                            String prefixedName, AttributeList atts)

or by

  public void startElement (String namespaceURI, String localName,
                            AttributeList atts)

with an option to leave the prefix on the local name if the parser
supports it.
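
For comparison only: Python's standard xml.sax module delivers the
split name through a callback of roughly this shape.  It is not one of
the Java signatures proposed in this thread; the Collector class and
element_names function are invented for the illustration.

```python
import io
import xml.sax
from xml.sax.handler import ContentHandler, feature_namespaces


class Collector(ContentHandler):
    """Records the (namespace URI, local name) of every start tag."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def startElementNS(self, name, qname, attrs):
        uri, local = name          # the element name arrives already split
        self.seen.append((uri, local))


def element_names(xml_bytes):
    parser = xml.sax.make_parser()
    # Ask the parser itself to do the Namespace processing.
    parser.setFeature(feature_namespaces, True)
    handler = Collector()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_bytes))
    return handler.seen
```

The prefixed name travels as a separate qname argument, which matches
the first of the two signatures above in spirit.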


 > If it's still there, then it must be the application's choice to use
 > the new sax2.DocumentHandler interface or the original ... presumably
 > it would use Configurable.setProperty() with some ID for the new
 > namespace-aware sax2.DocumentHandler to identify its choice.

One option that no one has suggested yet is to create the
NamespaceHandler a little differently:

  public interface NamespaceHandler
  {
    public void startElement (String namespaceURI, String localName,
                              NSAttributeList atts)
      throws Whatever;

    public void endElement (String namespaceURI, String localName)
      throws Whatever;

    // and the original NS decl events as well...
  }

That way, SAX parsers could still use the original DocumentHandler to
report the XML 1.0 view (with prefixed names), and the
NamespaceHandler to report the Namespace view of elements and
attributes, which is the only place the view differs.

We would simply make a rule that, with Namespace support, the NS
startElement event always comes just before (or just after?) the SAX1
event, and that the attributes in the two lists must be in the same
order.

Personally, I find this approach a little brittle: I don't like
depending on ordering like that (and, of course, having to allow for
NS decl attributes), and I don't like the fact that the app might have 
to copy either or both of the attribute lists before using them.
Still, I'm surprised that this suggestion hasn't come up.
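
To make the ordering dependency concrete, here is a rough sketch of an
application pairing the two events.  Everything in it is hypothetical:
PairedEvents, nsStartElement, and the choice that the NS event arrives
just after the SAX1 event are assumptions, not settled design.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an app listening to both handlers;
// neither method is a real SAX callback.
final class PairedEvents {
    // XML 1.0 view (prefixed name) waiting for its Namespace view.
    private String pendingPrefixedName;
    final List<String> merged = new ArrayList<>();

    // SAX1-style event: reports the prefixed name, e.g. "html:p".
    void startElement(String prefixedName) {
        pendingPrefixedName = prefixedName;
    }

    // Namespace-view event, assumed to arrive just after the SAX1 event.
    void nsStartElement(String namespaceURI, String localName) {
        if (pendingPrefixedName == null) {
            throw new IllegalStateException(
                "NS event arrived without a preceding SAX1 event");
        }
        merged.add(pendingPrefixedName + " -> {" + namespaceURI + "}" + localName);
        pendingPrefixedName = null;
    }
}
```

If a parser delivered the NS event first, the pairing above would fail,
which is the kind of brittleness described here.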

 > Second, it's unclear how to report violations of namespace conformance.
 > 
 > I'd asked that the namespace spec resolve this issue, by using the
 > same reporting terminology that the XML spec uses ("warning",
 > "error", and of course "fatal error"), but instead it got even more
 > vague.  So I'll have to ask how SAX will address this ... keeping
 > in mind that if W3C gets around to answering those questions, it
 > might pick different answers.

Don't hold your breath -- last I was involved, the W3C groups were
swamped.

 > That is, faced with this document
 > 
 > 	<?xml version="1.0"?>
 > 	<html:p>Hello again! :-)</html:p>
 > 	<?at-end-of-document?>
 > 
 > Two reporting issues arise:  (a) How does one know that namespaces are
 > to be used at all?  It's a legal XML 1.0 document, so inherently there
 > is no error.  

That's a big problem.  My SAX2 proposal is for XML+Namespaces by
default, but it's possible to try to disable Namespace support.  That
means that, by default, you would get an error for this document.

 > (b) If one knows that namespaces are to be used, is the undeclared
 > "html" prefix to generate a warning, recoverable error, or fatal
 > error through sax.ErrorHandler?  Is it reported some other way?

I think that it would be wrong to use fatalError to report Namespace
violations, but others may disagree.  I think that OASIS or some other
body should take a stab at this problem -- we shouldn't wait for the
W3C to solve everything.  I enjoyed my time with the W3C XML Activity,
but I'd like to think that XML will outlive the organization that
specified it.

 > I think that using ErrorHandler.error() is the best solution, but then
 > that leads to the issue of how to report namespace URIs that aren't
 > available.  (And as I recall, there were more errors to deal with than
 > just unresolved namespace prefixes.)

Error numbers would be helpful, if someone were willing to invent some.
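
As a sketch only, reporting an undeclared prefix through
ErrorHandler.error() might look like the following.  ErrorHandler,
SAXException and SAXParseException are the existing org.xml.sax types;
PrefixCheck, resolve, the map of declared prefixes and the message text
are assumptions made up for illustration.

```java
import java.util.Map;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper showing one way to report a Namespace violation.
final class PrefixCheck {
    // Returns the namespace URI for a possibly prefixed name, or null
    // after reporting an undeclared prefix as a recoverable error.
    static String resolve(String name, Map<String, String> declared,
                          ErrorHandler handler) throws SAXException {
        int colon = name.indexOf(':');
        String prefix = colon < 0 ? "" : name.substring(0, colon);
        String uri = declared.get(prefix);
        if (uri == null && colon >= 0) {
            // No Locator is available in this sketch, so null is passed.
            handler.error(new SAXParseException(
                "Undeclared namespace prefix: " + prefix, null));
        }
        return uri;
    }
}
```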

 > > This would never be enabled by default, but for the relatively small
 > > class of apps that needed to know the original prefix, the prefix
 > > would be available simply by splitting the name argument.
 > 
 > Clearly that class includes "DOM-using applications", which for better
 > or worse (opinions do vary :-) isn't a small class.
 >
 > DOM L2 applications explicitly have the same option that I noted above:
 > use (or non-use) of namespace information is the choice of the application,
 > not the choice of some version of an XML infrastructure.

Is DOM2 more explicit about processing than DOM1, then?  There's
nothing in DOM1 that says (for example) that you have to include
comments and other stuff from the original XML document, if in fact
there is an original XML document.

Even in DOM2, I wonder if you'd have to have the *original* prefixes
or just some prefixes?  After all, the DOM won't always be built from
an XML document; it might be a wrapper around a bunch of DB tables
(for example) where there are no original prefixes available.

 > > I like this approach because it doesn't throw the prefix in the face
 > > of apps that don't need it -- to paraphrase Larry Wall, it makes common
 > > tasks easy and uncommon tasks possible.
 > 
 > A third issue:  building a DOM is quite "common" though, and it needs
 > those prefixes.

I'll have to read the latest DOM2 before I comment in detail, but in
general, I'm not convinced that you have to include everything in a
DOM2 tree that DOM2 happens to support.


All the best,


David

-- 
David Megginson                 david@megginson.com
           http://www.megginson.com/

xml-dev: A list for W3C XML Developers. To post, mailto:xml-dev@ic.ac.uk
Archived as: http://www.lists.ic.ac.uk/hypermail/xml-dev/ and on CD-ROM/ISBN 981-02-3594-1
To unsubscribe, mailto:majordomo@ic.ac.uk the following message;
unsubscribe xml-dev
To subscribe to the digests, mailto:majordomo@ic.ac.uk the following message;
subscribe xml-dev-digest
List coordinator, Henry Rzepa (mailto:rzepa@ic.ac.uk)


From bkline at rksystems.com  Thu Dec 30 14:20:13 1999
From: bkline at rksystems.com (Bob Kline)
Date: Mon Jun  7 17:18:55 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML 
 Schema  Questions
In-Reply-To: <f5bbt785zls.fsf@cogsci.ed.ac.uk>
Message-ID: <Pine.LNX.4.10.9912300846450.15103-100000@rksystems.com>

On 30 Dec 1999, Henry S. Thompson wrote:

> [Thread wrt instance->schema connections, lack of rigidity thereof]
> 
> With apologies for the 'turgid prose' of the draft in this area, let
> me try to explain why flexibility IN THE REC in this area is a Good
> Thing:
> 
> Schemas are a powerful and useful mechanism, with a wide range of
> possible deployment scenarios.  Different schemas may usefully be
> employed with respect to the same instance document for different
> purposes, all legitimate.  'xsi:schemaLocation' is a means by which
> a document author can signal A location for A schema with respect to
> which s/he warrants the instance at hand is schema-valid.  It will
> often be appropriate for schema-aware processors to exploit this
> information.  But it may not always be possible (the processor may
> be offline) or appropriate (the processor may have other
> schema-based processing in view) to do so.  We have tried in the
> current draft to indicate that 'xsi:schemaLocation' is the
> preferred, inter-operable means by which instances signal schemas to
> processors, WITHOUT making this connection make-or-break mandatory.
> 
> A moment's thought about experience with XML's instance->DTD linkage
> will perhaps suggest some benefits of this approach:  as it stands,
> if I wish to validate an XML instance which references no external
> DTD, I have to edit it to incorporate a suitable DOCTYPE
> declaration.  Even if the document has a DOCTYPE, if the URL it
> references is unavailable or out-of-date, I again must have recourse
> to a text editor to fix this.  We've tried to do better for XML
> Schema.  Another experience we've tried to learn from is the
> instance->stylesheet one, with similar lessons we believe.

I guess I don't see a conflict between the desire for flexibility and
the need to specify standard behavior for the primary purpose of the
schema mechanism.  I don't think anyone is arguing that other uses of
schemas should be prohibited, and surely no one expects a processor to
validate against a schema if the schema isn't made available to it.  I
see no problem with a processor providing options which mean "process
this XML document without validating it against its schema."  I see no
problem with a processor providing options which mean "use some
non-standard mechanism for processing this document against (or
identifying or locating) its schema."  How would any of this preclude
saying, in effect, here is the standard mechanism for identifying the
schema against which a given XML instance is to be validated; when the
mechanism is used as described herein a processor which conforms to this
specification will behave as follows, in the absence of explicit
instructions to the contrary?  I don't expect the spec to require that
the processor burst into flames if it can't locate the schema, but I
would like the spec to describe predictable behavior of a conformant
processor if I use the standard mechanism for identifying a schema which
is available.
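
The behavior asked for here could be sketched roughly as follows.
SchemaChoice and choose are hypothetical names, and the precedence
shown (an explicit instruction first, then the instance's own hint) is
one reading of the paragraph above, not anything the draft specifies.

```java
import java.util.Optional;

// Hypothetical illustration of "standard behavior unless told otherwise".
final class SchemaChoice {
    // An explicit processor instruction wins; otherwise the location the
    // instance itself signals is used; otherwise nothing is selected and
    // the processor would skip schema validation.
    static Optional<String> choose(Optional<String> explicitInstruction,
                                   Optional<String> instanceHint) {
        return explicitInstruction.isPresent() ? explicitInstruction : instanceHint;
    }
}
```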

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com




From Lucio.Piccoli at one2one.co.uk  Thu Dec 30 14:24:26 1999
From: Lucio.Piccoli at one2one.co.uk (Lucio Piccoli)
Date: Mon Jun  7 17:18:55 2004
Subject: Element_node.getValue()
Message-ID: <E394D68F1B4BD31185520008C791B39C6CC994@elt02mbx.one2one.co.uk>

Hi all,
I am getting very frustrated with the Sun XML Java parser. I am using the
DOM. When I retrieve an ELEMENT_NODE, I call getValue() on the node and it
returns null. Yet when I call toString() I get all the data.
The element is defined as

<!ELEMENT body (PCDATA) >

How do I get the value from the element?


adios

-lucio

---------------------------------------------------------------------
 One2One 				         LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower			 		tel : +44 181 214 3847
 Elstree Way
 Borehamwood	             			fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From prashantwalke at usa.net  Thu Dec 30 14:38:03 1999
From: prashantwalke at usa.net (Prashant Walke)
Date: Mon Jun  7 17:18:55 2004
Subject: unsubcribe
Message-ID: <19991230143457.4343.qmail@nw171.netaddress.usa.net>




From simonstl at simonstl.com  Thu Dec 30 14:47:32 1999
From: simonstl at simonstl.com (Simon St.Laurent)
Date: Mon Jun  7 17:18:55 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many)
  XML  Schema  Questions
In-Reply-To: <f5bbt785zls.fsf@cogsci.ed.ac.uk>
References: <Bob Kline's message of "Thu, 30 Dec 1999 08:02:36 -0500 (EST)">
 <Pine.LNX.4.10.9912300736350.15103-100000@rksystems.com>
Message-ID: <199912301447.JAA28582@hesketh.net>

At 01:26 PM 12/30/99 +0000, Henry S. Thompson wrote:
>A moment's thought about experience with XML's instance->DTD linkage
>will perhaps suggest some benefits of this approach:  as it stands, if 
>I wish to validate an XML instance which references no external DTD, I 
>have to edit it to incorporate a suitable DOCTYPE declaration.  Even
>if the document has a DOCTYPE, if the URL it references is unavailable 
>or out-of-date, I again must have recourse to a text editor to fix
>this.  We've tried to do better for XML Schema.  Another experience
>we've tried to learn from is the instance->stylesheet one, with
>similar lessons we believe.

As glad as I am to see the Schema folks wrestling with these issues, it
seems like they belong someplace else - with the instance->stylesheet
problem, perhaps.

Has anything ever come of the 'plans for a Working Group on XML Packaging'
mentioned in the XML Activity Statement
(http://www.w3.org/XML/Activity.html), or is this just too dull for
people/companies to get excited about?

These issues are pretty much at the core of XML processing, and not just
schema processing...

Simon St.Laurent
XML Elements of Style / XML: A Primer, 2nd Ed.
Building XML Applications
Inside XML DTDs: Scientific and Technical
Cookies / Sharing Bandwidth
http://www.simonstl.com



From mike.champion at softwareag-usa.com  Thu Dec 30 15:54:04 1999
From: mike.champion at softwareag-usa.com (Michael Champion)
Date: Mon Jun  7 17:18:55 2004
Subject: Element_node.getValue()
References: <E394D68F1B4BD31185520008C791B39C6CC994@elt02mbx.one2one.co.uk>
Message-ID: <00fd01bf52de$171172a0$0ef8fea9@mcc>


----- Original Message -----
From: "Lucio Piccoli" <Lucio.Piccoli@one2one.co.uk>
To: <xml-dev@ic.ac.uk>
Sent: Thursday, December 30, 1999 9:23 AM
Subject: Element_node.getValue()


> Hi all,
> I am getting very frustrated with the Sun XML Java parser. I am using the
> DOM. When I retrieve an ELEMENT_NODE, I call getValue() on the node and it
> returns null. Yet when I call toString() I get all the data.
> The element is defined as
>
> <!ELEMENT body (PCDATA) >
>
> How do I get the value from the element?

Blame the DOM working group (to which I must confess membership), not Sun.
The node value of an Element (getNodeValue() in the Java binding) is always
null according to the DOM spec.  The standard way to get the value of an
element is to get the child Text node and get its value.  Most DOM
implementations have some sort of convenience method to get the text value
of a single element ... but sometimes you have to get the text of the entire
subtree under the element (which is apparently what Sun's toString() method
is doing).

In the DOM WG's defense, there was *exactly* this kind of question behind
making the value of an Element null -- Should it return the value of the
entire subtree?  Should the text come back with markup embedded or not?
There was no consensus as to the best way to do it, so we essentially punted.
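
For anyone who wants the "entire subtree, no markup" answer, here is a
minimal sketch using the standard org.w3c.dom interfaces.  SubtreeText,
parseRoot and textOf are hypothetical names, and the JAXP
DocumentBuilderFactory used for parsing is just one convenient way to
get a DOM, not necessarily what the Sun parser discussed here exposes.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; only the org.w3c.dom calls are standard DOM.
final class SubtreeText {
    static Element parseRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }

    // Concatenates every Text and CDATA node under the given node,
    // in document order, dropping all markup.
    static String textOf(Node node) {
        StringBuilder sb = new StringBuilder();
        for (Node c = node.getFirstChild(); c != null; c = c.getNextSibling()) {
            short type = c.getNodeType();
            if (type == Node.TEXT_NODE || type == Node.CDATA_SECTION_NODE) {
                sb.append(c.getNodeValue());
            } else if (type == Node.ELEMENT_NODE) {
                sb.append(textOf(c));
            }
        }
        return sb.toString();
    }
}
```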




From dpotter at mitre.org  Thu Dec 30 16:10:54 1999
From: dpotter at mitre.org (Daniel Potter)
Date: Mon Jun  7 17:18:56 2004
Subject: XML Schema Datatype Questions
Message-ID: <386B8477.B04521D9@mitre.org>

I am currently developing a type checker designed to validate data in an
instance document corresponding to an XML schema.  I have a few
questions about the datatype spec I am hoping someone can answer.

First question: what is Not A Number with respect to the float/double
datatypes?  Or, rather, what is it in relationship to a min/max range? 
If I specify that mininclusive=0 and maxinclusive=INF, then is NAN in
that range?  What about mininclusive=-3 and maxexclusive=4?  Or does NAN
need to be specified in the enumeration?  (On that note, can the
enumeration facet be used to allow values outside the range specified by
min/max to be included?  Or do enumeration values need to fall within
the range?  There is no constraint saying that enumeration and min/max
values cannot be set together, which leads me to believe that they are
combined to describe legal values.)

In other words, if I specify a range of values, does NAN ever fall in
that range?  Or does it need to be specified as a legal value through
the enumeration facet?
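
As a point of comparison only: under IEEE 754 comparison rules, which
Java follows for double, every ordered comparison involving NaN is
false, so a plain min/max test never admits NaN.  Whether the datatypes
draft intends the same is exactly the open question; the sketch below
(RangeCheck is a made-up name) shows only the Java behavior.

```java
// Hypothetical range test; illustrates IEEE 754 semantics in Java,
// not the rules of the XML Schema datatypes draft.
final class RangeCheck {
    // True when min <= value <= max. Any comparison with NaN is false,
    // so NaN is never reported as inside the range.
    static boolean inInclusiveRange(double value, double min, double max) {
        return value >= min && value <= max;
    }
}
```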

Also relating to float, are the characters case sensitive?  Can I use
"inf" as a value or does it need to be "INF"?  Is "6.22e22" legal?  Or
does it need to be "6.22E22"?

One last question:  For the binary datatype, what is the default
encoding?  According to the encoding facet, it must be either hex or
base64, but what happens if the user doesn't specify?  If I specify that
an element is of type binary, but leave out the encoding facet, what
encoding is then used?  As far as I can tell, there is no default value
for encoding.  I would assume hex is default because it is listed first,
but it would be nice to know for certain which encoding is truly
intended as default.
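
One reason the default matters: some lexical values are legal in both
encodings but decode to different bytes.  A small sketch, assuming
Java's java.util.Base64 and a hand-written hex decoder (BinaryLexical
and its methods are hypothetical names):

```java
import java.util.Base64;

// Hypothetical helper contrasting the two encodings named by the facet.
final class BinaryLexical {
    // Decodes pairs of hex digits into bytes.
    static byte[] decodeHex(String s) {
        if (s.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of hex digits");
        }
        byte[] out = new byte[s.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(s.charAt(2 * i), 16);
            int lo = Character.digit(s.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex digit");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    // Decodes standard base64 text into bytes.
    static byte[] decodeBase64(String s) {
        return Base64.getDecoder().decode(s);
    }
}
```

The string "CAFE" is accepted by both decoders: as hex it yields two
bytes, as base64 it yields three, so a type checker cannot guess the
encoding from the value alone.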

Thanks for any help.




From abisheks at india.hp.com  Thu Dec 30 17:08:36 1999
From: abisheks at india.hp.com (Abhishek Srivastava)
Date: Mon Jun  7 17:18:56 2004
Subject: Element_node.getValue()
References: <E394D68F1B4BD31185520008C791B39C6CC994@elt02mbx.one2one.co.uk>
Message-ID: <016301bf52e8$5e4da3a0$252f0a0f@india.hp.com>

Hi,

Let me see if I understand your problem correctly...

You have an element such as
<NAME> James bond </NAME>
and you want to extract the value of NAME, that is, "James bond".

First traverse down the DOM tree until you reach a node whose name is NAME
(this can be checked using getNodeName()).
To get the text that this node contains, call getFirstChild() on it. This
gives you a node named "#text"; calling getNodeValue() on that node returns
the actual content of the element (" James bond ", surrounding whitespace
included).

Basically, the text value of an element is one node below the node that
carries the name of the element.
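
A minimal sketch of those steps in Java, using the standard
org.w3c.dom calls named above.  FirstChildText and firstChildOfRoot are
made-up names, and parsing through JAXP's DocumentBuilderFactory is an
assumption for the sake of a self-contained example.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper; getFirstChild(), getNodeName() and getNodeValue()
// are the standard DOM methods.
final class FirstChildText {
    // Parses the document and returns the first child of its root
    // element, or null if the root element is empty.
    static Node firstChildOfRoot(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement()
                    .getFirstChild();
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
    }
}
```

This only covers the simple case where the element's first child is
its single Text node; mixed content needs a walk over all children.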

all the best,
Abhishek.
----- Original Message -----
From: "Lucio Piccoli" <Lucio.Piccoli@one2one.co.uk>
To: <xml-dev@ic.ac.uk>
Sent: Thursday, December 30, 1999 7:53 PM
Subject: Element_node.getValue()


> Hi all,
> I am getting very frustrated with the Sun XML Java parser. I am using the
> DOM. When I retrieve an ELEMENT_NODE, I call getValue() on the node and it
> returns null. Yet when I call toString() I get all the data.
> The element is defined as
>
> <!ELEMENT body (PCDATA) >
>
> How do I get the value from the element?
>
>
> adios
>
> -lucio
>
> ---------------------------------------------------------------------
>  One2One          LUCIO.PICCOLI@one2one.co.uk
>  Elstree Tower tel : +44 181 214 3847
>  Elstree Way
>  Borehamwood              fax :+44 181 214 2325
>  LONDON WD6 1DT
>  __________ http://www.one2one.co.uk _____________
>


From mrossi at crusher.jcals.csc.com  Thu Dec 30 17:47:20 1999
From: mrossi at crusher.jcals.csc.com (Michael Rossi)
Date: Mon Jun  7 17:18:56 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML  
	Schema  Questions
Message-ID: <472EF0A38796D21185810000F807DD1E01A98CC6@crusher.jcals.csc.com>

ht@cogsci.ed.ac.uk wrote:
> 
> Schemas are a powerful and useful mechanism, with a wide range of
> possible deployment scenarios.  Different schemas may usefully be
> employed with respect to the same instance document for different
> purposes, all legitimate.

   Could you elaborate on this, Henry, maybe with a brief example? Either my
brain's on holiday a little early or, not being privy to the working group
discussions, maybe I've missed some helpful comments.

> 'xsi:schemaLocation' is a means by which a
> document author can signal A location for A schema with respect to
> which s/he warrants the instance at hand is schema-valid.  It will
> often be appropriate for schema-aware processors to exploit this
> information.  But it may not always be possible (the processor may be
> offline) or appropriate (the processor may have other schema-based
> processing in view) to do so.

   Again, I'd like an elaboration if you would. Are you saying that even
though an instance claims validity against a given schema, a processing
application may wish to validate or process it against a different schema?
Validation against a different schema may be explained by your response to
the above, but for what other processing would an application hope to use a
non-instance specified schema besides validation? I'm just not seeing the
practicality yet. Thanks.

> We have tried in the current draft to
> indicate that 'xsi:schemaLocation' is the preferred, inter-operable
> means by which instances signal schemas to processors, WITHOUT making
> this connection make-or-break mandatory.

   I'd have to agree with Bob Kline's comments on this:

> How would any of this preclude saying, in effect, here is the standard
> mechanism for identifying the schema against which a given XML instance
> is to be validated; when the mechanism is used as described herein a
> processor which conforms to this specification will behave as follows,
> in the absence of explicit instructions to the contrary?  I don't expect
> the spec to require that the processor burst into flames if it can't
> locate the schema, but I would like the spec to describe predictable
> behavior of a conformant processor if I use the standard mechanism for
> identifying a schema which is available.

   We're having similar debates in a working group of the Workflow
Management Coalition right now: flexibility vs. standardization. Personally,
I'd say if you've decided to standardize something then standardize it. If
all you're saying is "feel free to do this part however you like" then you
don't have a standard. Besides, as Bob also alluded to, if vendors want to
MS a spec they're going to do it anyway. It's then their responsibility to
push the value of the proprietary "extensions".

Michael A. Rossi
mailto:mrossi@jcals.csc.com
856-983-4400 x4911



From Vlashua at rsgsystems.com  Thu Dec 30 18:10:26 1999
From: Vlashua at rsgsystems.com (Vane Lashua)
Date: Mon Jun  7 17:18:56 2004
Subject: Element_node.getValue()
Message-ID: <E9EB65078F9BD31193FE009027B100E10CBF95@RSGMAIL01>

#PCDATA

-----Original Message-----
From: Lucio Piccoli [mailto:Lucio.Piccoli@one2one.co.uk]
Sent: Thursday, December 30, 1999 9:24 AM
To: xml-dev@ic.ac.uk
Subject: Element_node.getValue()


Hi all,
I am getting very frustrated with the Sun XML Java parser. I am using the
DOM. When I retrieve an ELEMENT_NODE, I call getValue() on the node and it
returns null. Yet when I call toString() I get all the data.
The element is defined as

<!ELEMENT body (PCDATA) >

How do I get the value from the element?


adios

-lucio

---------------------------------------------------------------------
 One2One 				         LUCIO.PICCOLI@one2one.co.uk
 Elstree Tower			 		tel : +44 181 214 3847
 Elstree Way
 Borehamwood	             			fax :+44 181 214 2325
 LONDON WD6 1DT
 __________ http://www.one2one.co.uk _____________




From andrewl at microsoft.com  Thu Dec 30 18:27:38 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:56 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML  
	Schema  Questions
Message-ID: <33D189919E89D311814C00805F1991F7F4AAEF@RED-MSG-08>

I must side here with Henry's argument.  The schemas specification provides
a standard way that a document author can indicate a specific schema set
that he warrants the document conforms to, and it also provides a standard
set of rules related to the application of several schemas comprising that
schema set.  This is an entirely useful facility to have.

However, the existence of such a warrant in a document does not obligate a
processing application to use the asserted schemas.  As Henry says, "But it
may not always be possible (the processor may be offline) or appropriate
(the processor may have other schema-based processing in view) to do so."

Consider the case in which I have an e-commerce site that accepts purchase
orders. The site advertises that it processes them according to a specific
schema, one that supplies a default value of "US Dollar" for the currency
units.  If I managed that site, I would not want to process purchase orders
that were similar, except that they specified a different schema in which
the default currency units were "Greek Drachma".  My site would need to
either (a) reject such purchase orders or (b) publish as part of the site's
description and/or legal conditions that it processes purchase orders
according to the specific schema, regardless of any schemaLocation
attributes in the document. Note that the second policy is actually more
friendly towards the use of schemaLocation.
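
The two site policies can be sketched roughly as below.  OrderIntake,
Policy and schemaFor are hypothetical names, and schemas are reduced to
plain identifier strings purely for illustration.

```java
// Hypothetical sketch of policies (a) and (b) for a purchase-order site.
final class OrderIntake {
    enum Policy { REJECT_FOREIGN_SCHEMA, IGNORE_SCHEMA_LOCATION }

    // Returns the schema the site will process the order with, or null
    // when the order is rejected. declaredSchemaLocation is whatever the
    // document signalled, or null if it signalled nothing.
    static String schemaFor(String siteSchema, String declaredSchemaLocation,
                            Policy policy) {
        boolean foreign = declaredSchemaLocation != null
                && !declaredSchemaLocation.equals(siteSchema);
        if (foreign && policy == Policy.REJECT_FOREIGN_SCHEMA) {
            return null; // policy (a): refuse the purchase order
        }
        return siteSchema; // policy (b), or no conflict: the site's schema applies
    }
}
```

In both branches the reader, not the writer, decides which schema is
applied.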

The bottom line is that, in processing a document, either the writer or the
reader makes the final determination of what processing happens.  Actually,
only the reader does.

I hope this is helpful,
Andrew Layman

-----Original Message-----
From: Bob Kline [mailto:bkline@rksystems.com]
Sent: Thursday, December 30, 1999 6:20 AM
To: Henry S. Thompson
Cc: Roger L. Costello; xml-dev@ic.ac.uk;
www-xml-schema-comments@w3c.org; Schneider,John C.; Cokus,Michael S.
Subject: Re: No Standard way to reference XML Schema? Was Re: (Many) XML
Schema Questions


On 30 Dec 1999, Henry S. Thompson wrote:

> [Thread wrt instance->schema connections, lack of rigidity thereof]
> 
> With apologies for the 'turgid prose' of the draft in this area, let
> me try to explain why flexibility IN THE REC in this area is a Good
> Thing:
> 
> Schemas are a powerful and useful mechanism, with a wide range of
> possible deployment scenarios.  Different schemas may usefully be
> employed with respect to the same instance document for different
> purposes, all legitimate.  'xsi:schemaLocation' is a means by which
> a document author can signal A location for A schema with respect to
> which s/he warrants the instance at hand is schema-valid.  It will
> often be appropriate for schema-aware processors to exploit this
> information.  But it may not always be possible (the processor may
> be offline) or appropriate (the processor may have other
> schema-based processing in view) to do so.  We have tried in the
> current draft to indicate that 'xsi:schemaLocation' is the
> preferred, inter-operable means by which instances signal schemas to
> processors, WITHOUT making this connection make-or-break mandatory.
> 
> A moment's thought about experience with XML's instance->DTD linkage
> will perhaps suggest some benefits of this approach:  as it stands,
> if I wish to validate an XML instance which references no external
> DTD, I have to edit it to incorporate a suitable DOCTYPE
> declaration.  Even if the document has a DOCTYPE, if the URL it
> references is unavailable or out-of-date, I again must have recourse
> to a text editor to fix this.  We've tried to do better for XML
> Schema.  Another experience we've tried to learn from is the
> instance->stylesheet one, with similar lessons we believe.

I guess I don't see a conflict between the desire for flexibility and
the need to specify standard behavior for the primary purpose of the
schema mechanism.  I don't think anyone is arguing that other uses of
schemas should be prohibited, and surely no one expects a processor to
validate against a schema if the schema isn't made available to it.  I
see no problem with a processor providing options which mean "process
this XML document without validating it against its schema."  I see no
problem with a processor providing options which mean "use some
non-standard mechanism for processing this document against (or
identifying or locating) its schema."  How would any of this preclude
saying, in effect, here is the standard mechanism for identifying the
schema against which a given XML instance is to be validated; when the
mechanism is used as described herein a processor which conforms to this
specification will behave as follows, in the absence of explicit
instructions to the contrary?  I don't expect the spec to require that
the processor burst into flames if it can't locate the schema, but I
would like the spec to describe predictable behavior of a conformant
processor if I use the standard mechanism for identifying a schema which
is available.

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com




From rhanson at blast.net  Thu Dec 30 19:05:37 1999
From: rhanson at blast.net (Robert Hanson)
Date: Mon Jun  7 17:18:56 2004
Subject: What happened to XSL?
References: <33D189919E89D311814C00805F1991F7F4AAEF@RED-MSG-08>
Message-ID: <003a01bf52f7$c77ed100$0cb919ce@INTERNETDEPT>

When I say "XSL" I don't mean "XSLT" or "XPath", but instead the XSL spec
itself.  Looking at an article on XML.com today which was written in May of
this year, Michael Leventhal wrote a rather convincing piece saying that XSL
is not needed, and maybe he is right (at least partially).  And looking at
what has happened since he wrote it, it seems that XSLT moved on without
XSL... So what happened?  Will there ever be an XSL?

Thanks, and happy new year.

Robert




From robnic at expedia.com  Thu Dec 30 19:44:33 1999
From: robnic at expedia.com (Rob Nichols)
Date: Mon Jun  7 17:18:56 2004
Subject: What happened to XSL?
Message-ID: <D5922CA42F8CD31189D800805F19A16C6900C5@RED-MSG-48>

I think XSL-List spent quite a bit of time chewing on that article... check
out the archive if you want to review the threads:
http://www.mulberrytech.com/xsl/xsl-list/archive/index.html

I think I remember one thread, 'Leventhal's challenge misses the point' or
something like that. No hidden comment, just a pointer.  No really.

-rob

-----Original Message-----
From: Robert Hanson [mailto:rhanson@blast.net]
Sent: Thursday, December 30, 1999 10:58 AM
To: xml-dev@ic.ac.uk
Subject: What happened to XSL?


When I say "XSL" I don't mean "XSLT" or "XPath", but instead the XSL spec
itself.  Looking at an article on XML.com today which was written in May of
this year, Michael Leventhal wrote a rather convincing piece saying that XSL
is not needed, and maybe he is right (at least partially).  And looking at
what has happened since he wrote it, it seems that XSLT moved on without
XSL... So what happened?  Will there ever be an XSL?

Thanks, and happy new year.

Robert




From mrossi at crusher.jcals.csc.com  Thu Dec 30 21:30:43 1999
From: mrossi at crusher.jcals.csc.com (Michael Rossi)
Date: Mon Jun  7 17:18:56 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML  
	 Schema  Questions
Message-ID: <472EF0A38796D21185810000F807DD1E01A98CC7@crusher.jcals.csc.com>

Andrew Layman wrote:
> 
<snip/>
> 
> Consider the case in which I have an e-commerce site that accepts purchase
> orders. The site advertises that it processes them according to a specific
> schema, one that supplies a default value of "US Dollar" for the currency
> units.  If I managed that site, I would not want to process purchase orders
> that were similar, except that they specified a different schema in which
> the default currency units were "Greek Drachma".  My site would need to
> either (a) reject such purchase orders or (b) publish as part of the site's
> description and/or legal conditions that it processes purchase orders
> according to the specific schema, regardless of any schemaLocation
> attributes in the document. Note that the second policy is actually more
> friendly towards the use of schemaLocation.
> 
> The bottom line is that, in processing a document, either the writer or the
> reader makes the final determination of what processing happens.  Actually,
> only the reader does.
> 
> I hope this is helpful,
> Andrew Layman

   It certainly is helpful, thanks. But I'd still contend that we haven't
gained anything here. In this particular situation, the writer decided that
their purchase order was going to be written against a schema that specified
Greek Drachma as the unit of currency. If our e-commerce site chooses
(whether stated or not) to process this PO against its own schema that uses
US Dollars, the consumer is going to end up paying the wrong amount for
their order. So our only choice is (a) above unless we either retrieve the
instance-specified schema or provide for some form of translation or
coercion. Maybe a stylesheet that would exchange the Drachma for Dollars and
vice-versa would be appropriate.
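
To make the mismatch concrete, here is a minimal sketch in Python.  The
element and attribute names (PurchaseOrder, Amount, currency) are
hypothetical, and the two dictionaries merely stand in for the attribute
defaults that the two schemas would supply; no real schema processor is
involved.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchase order; element and attribute names are assumptions.
PO = "<PurchaseOrder><Amount>100</Amount></PurchaseOrder>"

# Stand-ins for the attribute defaults two different schemas would supply.
SITE_SCHEMA_DEFAULTS = {"currency": "US Dollar"}
WRITER_SCHEMA_DEFAULTS = {"currency": "Greek Drachma"}


def read_amount(xml_text, defaults):
    """Return (amount, currency), using the schema default when absent."""
    amount = ET.fromstring(xml_text).find("Amount")
    currency = amount.get("currency", defaults["currency"])
    return amount.text, currency


# Same bytes on the wire, two different readings of the order.
print(read_amount(PO, SITE_SCHEMA_DEFAULTS))
print(read_amount(PO, WRITER_SCHEMA_DEFAULTS))
```

The document itself never mentions a currency, so which schema the reader
applies decides what the number 100 means.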

   Now it's still possible that there may be situations where applying some
non-instance-specified schema would be useful, although I can't think of one
at the moment. But in most, if not all, cases I think it's safe to assume
that if an instance has been written to conform to a particular schema,
there's likely a good reason why.

Michael A. Rossi
mailto:mrossi@jcals.csc.com
856-983-4400 x4911



From andrewl at microsoft.com  Thu Dec 30 22:06:56 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:56 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML Schema Questions
Message-ID: <33D189919E89D311814C00805F1991F7F4AAF4@RED-MSG-08>

A standard does not necessarily preclude choices or optional processing
decisions by an application; it frequently sets bounds within which options
may be chosen; often the freedoms permitted by a standard are as important
as the freedoms removed.  For example, HTML allows different user agents to
render the same document differently.

I agree with your intuitive suspicion that readers of a document ought to be
reading what the author intended, and that a large part of an author's
intention is revealed in his warrants about schema conformance.  But the
question is: How does this get expressed in the rules of the schema
standard?  The schema WG debated this extensively (often with me taking the
side you are now arguing!) but in the end decided that applications need the
flexibility, in some cases, to ignore the schemas recommended by the
document and use either none or others.  In fact, if a (probably
namespace-qualified) information item is associated by an application with
certain semantics, then the document's claims about schema conformance may
be simply irrelevant.

-----Original Message-----
From: Bob Kline [mailto:bkline@rksystems.com]
Sent: Thursday, December 30, 1999 5:03 AM
To: Roger L. Costello
Cc: xml-dev@ic.ac.uk; www-xml-schema-comments@w3c.org; Schneider,John
C.; Cokus,Michael S.
Subject: Re: No Standard way to reference XML Schema? Was Re: (Many) XML
Schema Questions


On Thu, 30 Dec 1999, Roger L. Costello wrote:

> I read these statements as saying that there is no standard way for
> specifying in an XML document what XML Schema it conforms to - every
> XML Parser will have its own way of doing things.  Really???  If
> this is so, please, please tell me why this is a good thing.  I am
> struggling to appreciate its beauty.  /Roger

I am as puzzled as you are.  Yes, it's true, as Andrew writes, that
"ultimately, the processor of a document determines what processing is
done" [cited as the rationale for the decision by the XML Schemas WG to
demote the xsi:schemaLocation attribute to a "hint"].  The same could be
said of any software, which behaves in direct response to the
instructions written by its creators, rather than the prescriptions of
standards.  The role of a standard is to assist in the processes of
predicting how software which claims conformance to it will behave, and
of determining which products are actually conformant.

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com




From andrewl at microsoft.com  Thu Dec 30 22:14:00 1999
From: andrewl at microsoft.com (Andrew Layman)
Date: Mon Jun  7 17:18:56 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML Schema Questions
Message-ID: <33D189919E89D311814C00805F1991F7F4AAF5@RED-MSG-08>

You misread.  There _is_ a standard way for a document to indicate the
schema set (perhaps several schemas) its element information items conform
to.  That is the function of the schemaLocation attribute.  

What may be confusing you is that applications have the option to process a
document in ways not strictly mandated by those asserted schemas.  For
example, even today XML processors may elect to ignore a document's DOCTYPE
declaration. (See my other mails and also those of Henry Thompson for more
details.)
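
A minimal sketch of that division of labour, in Python.  It assumes the
schemaLocation value is a whitespace-separated list of namespace/location
pairs and uses the 1999 draft instance namespace; choose_schemas and its
parameters are hypothetical names for an application's own policy, not
anything defined by the draft.

```python
import xml.etree.ElementTree as ET

# Instance namespace of the 1999 working draft (an assumption of this sketch).
XSI = "http://www.w3.org/1999/XMLSchema/instance"

DOC = """<BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
    xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
                        http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd"/>"""


def schema_hints(root):
    """Return {namespace: schema URI} as asserted by the document."""
    tokens = root.get("{%s}schemaLocation" % XSI, "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


def choose_schemas(root, trusted=None, honor_hints=True):
    """Application policy: trusted schemas win; hints are used only if allowed."""
    chosen = schema_hints(root) if honor_hints else {}
    chosen.update(trusted or {})
    return chosen


root = ET.fromstring(DOC)
print(schema_hints(root))
print(choose_schemas(root, trusted={"http://www.somewhere.org/BookCatalogue": "local.xsd"}))
```

The document states its claim in one standard place; whether the reader acts
on it is a separate decision made in choose_schemas.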

-----Original Message-----
From: Roger L. Costello [mailto:costello@mitre.org]
Sent: Thursday, December 30, 1999 4:04 AM
To: xml-dev@ic.ac.uk
Cc: www-xml-schema-comments@w3c.org; Schneider,John C.; Costello,Roger
L.; Cokus,Michael S.
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML
Schema Questions


Hi Folks,

I gotta tell ya, this ol' country boy is having a mighty difficult time
figuring out how an XML document is to indicate that it conforms to
a particular XML Schema.  It seems to me that this should be one area
that should be made crystal clear.  Instead, I am finding this to be one
of the murkiest parts of the XML Schema spec.

These statements really throw me for a loop:

"xsi:schemaLocation attribute serves as a hint, not a mandatory
directive. That is, the processor of an instance is welcome to look at
the URI referenced by the value of xsi:schemaLocation, but is not
required to."

"The means used to locate appropriate schema document(s) are processor
and application dependent"

I read these statements as saying that there is no standard way for
specifying in an XML document what XML Schema it conforms to - every XML
Parser will have its own way of doing things.  Really???   If this is
so, please, please tell me why this is a good thing.  I am struggling to
appreciate its beauty.    /Roger

Andrew Layman wrote:
> 
> What was said by Rick Jelliffe regarding the current schema draft is true
> (and anyone who is interested is recommended to read the actual XML Schema
> WD at http://www.w3.org/TR/xmlschema-1/ and
> http://www.w3.org/TR/xmlschema-2/.)
> 
> However, I would like to correct a possible misimpression that might arise
> from the turgid wording in the current public draft and also from Rick's
> statement "Then (s4.3.2) there is an attribute xsi:schemaLocation that can
> be put on any instance element. It allows the location of the schema to be
> declared. ..."
> 
> After extensive debate, the XML Schemas WG decided that the
> xsi:schemaLocation attribute serves as a hint, not a mandatory directive.
> That is, the processor of an instance is welcome to look at the URI
> referenced by the value of xsi:schemaLocation, but is not required to.  It
> may process an instance document using a different schema set (or no
> schemas at all).  The relevant phrase is "unless directed otherwise" in the
> following passage from the 1999-12-17 structures draft:
> 
> "Again, unless directed otherwise general-purpose schema-aware processors
> must attempt to dereference each schema URI in the value of
> "schemaLocation" to obtain a schema..."
> 
> This is in recognition of the fact that, ultimately, the processor of a
> document determines what processing is done.
> 


From bkline at rksystems.com  Thu Dec 30 22:32:44 1999
From: bkline at rksystems.com (Bob Kline)
Date: Mon Jun  7 17:18:56 2004
Subject: No Standard way to reference XML Schema? Was Re: (Many) XML  
 Schema  Questions
In-Reply-To: <33D189919E89D311814C00805F1991F7F4AAEF@RED-MSG-08>
Message-ID: <Pine.LNX.4.10.9912301705200.15312-100000@rksystems.com>

On Thu, 30 Dec 1999, Andrew Layman wrote:

> I must side here with Henry's argument.  The schemas specification
> provides a standard way that a document author can indicate a
> specific schema set that he warrants the document conforms to, and
> it also provides a standard set of rules related to the application
> of several schemas comprising that schema set.  This is an entirely
> useful facility to have.
> 
> However, the existence of such a warrant in a document does not
> obligate a processing application to use the asserted schemas. [....]

It seems that some confusion has slipped into this discussion about the
roles of the application and the XML processor.  The same distinction
between these two applies in this context as for the XML specification
itself.  In Tim Bray's words:

    "While this spec constrains some behaviors of an XML processor,
    it places no constraints on the application.  This is an
    important point; it would be inappropriate (not to mention
    futile) for this document to try to enforce what other people
    *do* with XML."

      - "The Annotated XML Specification" (c) 1998 Tim Bray

No one is looking for constraints on what applications do with XML.  No
one is trying to prevent extensions to XML processors for doing useful
processing of which the authors of the specification may not have even
dreamed.  What we are hoping for is a standard mechanism for specifying
the schema against which a conforming XML processor will validate an XML
instance (which is, after all, the original reason the XML schema
specification was created).

-- 
Bob Kline
mailto:bkline@rksystems.com
http://www.rksystems.com




From peter at ursus.demon.co.uk  Fri Dec 31 00:00:08 1999
From: peter at ursus.demon.co.uk (Peter Murray-Rust)
Date: Mon Jun  7 17:18:56 2004
Subject: Millennium greetings
In-Reply-To: <14442.32320.722924.870701@localhost.localdomain>
References: <385FA4C9.2B054683@pacbell.net>
 <14430.46481.974707.922192@localhost.localdomain>
 <012701bf4b44$0c68b030$4a5eedc1@arp01>
 <14430.55483.433692.943811@localhost.localdomain>
 <385FA4C9.2B054683@pacbell.net>
Message-ID: <3.0.1.32.20000101000327.00994e30@pop3.demon.co.uk>

Just a note to wish everyone a safe transition over the next few days and
to hope that the server stays up - there isn't much Henry can do about it!
In this country a lot of things - including humans - shut down over a
protracted period, either voluntarily or otherwise.

	Best wishes

	Peter




From tmmet at hotmail.com  Fri Dec 31 01:55:09 1999
From: tmmet at hotmail.com (tmmet tvp)
Date: Mon Jun  7 17:18:56 2004
Subject: Filtering and sorting using xsl
Message-ID: <19991231015434.54704.qmail@hotmail.com>

Hi,
Can anyone help me out or suggest an idea?  I would be glad if anyone could
at least mail me an idea for working out this problem.

I have a list box with all the tag name attributes in my XML file.
For example:
Filter using:
Title,
Author,
name,
Id

When I select an entry from this list box, I should filter my XML file using
the attribute (text) that I selected and create a tree view (as in Explorer).
I have to do this filtering and tree view display using XSL.  Filtering is
not working for me.
In my JavaScript:
var source;
var style;
function OnLoad()
{
	// Load the XML source document synchronously.
	source = new ActiveXObject("Microsoft.XMLDOM");
	source.async = false;
	source.load("test.xml");
	if (source.parseError.errorCode != 0)
		alert("XML File Load Error: " + source.parseError.errorCode +
		      " " + source.parseError.reason);
	// Load the stylesheet the same way.
	style = new ActiveXObject("Microsoft.XMLDOM");
	style.async = false;
	style.load("style.xsl");
	if (style.parseError.errorCode != 0)
		alert("XSL File Load Error: " + style.parseError.errorCode +
		      " " + style.parseError.reason);
	// Look up the stylesheet attributes to patch before transforming.
	filterField = style.documentElement.selectSingleNode("//@match[0]");
	sortField = style.documentElement.selectSingleNode("//@order-by");
	filterField.value = "SUBCHAPTER[@ID = '1']";
	document.all.TreeViewDisplay.innerHTML =
		source.transformNode(style.documentElement);
}

Can you help me out?  This is very urgent.  If you have time, can you help me?
In my xsl file,

<?xml version="1.0" ?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/TR/WD-xsl">
  <!-- Identity transformation template -->

  <xsl:template>
  <xsl:copy>
  <xsl:apply-templates select="@* | * | comment() | pi() | text()"/>
  </xsl:copy>
  </xsl:template>
  <xsl:template match="/">
	<xsl:apply-templates select="//CHAPTER"/>
  </xsl:template>

<xsl:template match="SUBCHAPTER[@ID = '2']" />

<xsl:template match="CHAPTER">
//do the tree view code here
</xsl:template>

<xsl:template match="SUBCHAPTER">
//do the tree view code here
</xsl:template>
</xsl:stylesheet >

Filtering is not working.
I don't know where the exact error is.
It's displaying the entire contents.  It's not filtering.
Can you please help me out?
Thanks in advance.
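
For reference, the intended result (keep only the SUBCHAPTER elements whose
chosen attribute has the chosen value) can be sketched outside XSL.  This is
Python rather than the XSL and JScript above, and the sample document is a
hypothetical stand-in for test.xml; only the element names CHAPTER and
SUBCHAPTER and the ID attribute come from the post.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for test.xml; only the element names come from the post.
SOURCE = """<BOOK>
  <CHAPTER Title="One">
    <SUBCHAPTER ID="1" Title="Intro"/>
    <SUBCHAPTER ID="2" Title="Details"/>
  </CHAPTER>
</BOOK>"""


def filter_subchapters(xml_text, attr, value):
    """Return the tree with only the SUBCHAPTERs where attr == value kept."""
    root = ET.fromstring(xml_text)
    for chapter in root.iter("CHAPTER"):
        # Copy the child list first so removal does not disturb iteration.
        for sub in list(chapter.findall("SUBCHAPTER")):
            if sub.get(attr) != value:
                chapter.remove(sub)
    return root


kept = filter_subchapters(SOURCE, "ID", "1")
print([s.get("Title") for s in kept.iter("SUBCHAPTER")])
```

Comparing a stylesheet's output against a reference filter like this one may
help narrow down whether the match pattern or the template is at fault.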



From bruce at nilo.com  Fri Dec 31 03:07:03 1999
From: bruce at nilo.com (Bruce D. Nilo)
Date: Mon Jun  7 17:18:56 2004
Subject: Namespaces and Validating XML Parsers Question
Message-ID: <003501bf533b$30e17100$120c9fcd@banach>

Apologies if this is a well understood question. I am recently slogging
through the XML specs and tools and have not really found anything
that sheds light on this question. I hope the answer is a simple one.

I believe I understand the rationale and syntax of namespaces for XML
application instances. However I can find nowhere any explanation
as to how a validating parser knows how to associate an element or
attribute that is in a specific namespace with the elements and
attributes of either an external or internal DTD reference. Am I
missing something here, or is the specification punting entirely on
the issue? It seems to me that either the application instance must
associate a namespace declaration with a specific DTD or the DTD
itself must declare its name. If this is not the case the XML parser
has a difficult job determining how to interpret the names in a namespace
with those of a number of referenced DTDs which have name collisions.
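
A small Python sketch of the gap being described.  The two documents and the
namespace URI are hypothetical.  The first function reads the root name as
written, which is the form a DTD declaration would be matched against, since
XML 1.0 validation predates namespaces; the second reads it after namespace
processing.

```python
import xml.etree.ElementTree as ET
import xml.parsers.expat

# Two hypothetical documents: same namespace URI, different prefixes.
DOC_A = '<a:part xmlns:a="http://www.example.org/parts"/>'
DOC_B = '<b:part xmlns:b="http://www.example.org/parts"/>'


def raw_root_name(xml_text):
    """Root element name as written, without namespace processing."""
    names = []
    parser = xml.parsers.expat.ParserCreate()  # no namespace separator given
    parser.StartElementHandler = lambda name, attrs: names.append(name)
    parser.Parse(xml_text, True)
    return names[0]


def expanded_root_name(xml_text):
    """Root element name after namespace processing: '{uri}local'."""
    return ET.fromstring(xml_text).tag


print(raw_root_name(DOC_A), raw_root_name(DOC_B))
print(expanded_root_name(DOC_A) == expanded_root_name(DOC_B))
```

The namespace-aware view treats the two roots as the same name, while the
as-written view sees two different names, so a DTD written for one prefix
would not cover the other.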

- Bruce




From ashish_agarwal at msdc.hcltech.com  Fri Dec 31 04:55:25 1999
From: ashish_agarwal at msdc.hcltech.com (Ashish Agarwal,AMB Chennai)
Date: Mon Jun  7 17:18:56 2004
Subject: Problems with XML
Message-ID: <21FCEFDE42DFD211A1A10007250603B2EB693B@PLUTO>

> Switch from current encoding to specified encoding not supported. 
> Hi all,
> I am trying to send an XML string to a VB component from my ASP code and
> then write it to a file.
> The problems that I face are:
> 
> 1) 
> the line : 
> <?xml version="1.0" encoding="UTF-8"?>
> is written on the file as 
> "<?xml version=""1.0"" encoding=""UTF-8""?>"
> 
> I do not want the inverted commas to appear. Any suggestions?
> 
> 2) I read an XML file in my VB component and pass it as a string to my ASP
> program.
> Here, I get the string in the right format, but when loading it to the DOM
> using 
> 
> doc1.loadXML(resultFromVBCom), I receive an error mentioning: 
> Switch from current encoding to specified encoding not supported
> Please help me...
> 
> 
> Regards,
> Ashish Agarwal
> 
> 
> 



From liamquin at interlog.com  Fri Dec 31 05:13:01 1999
From: liamquin at interlog.com (Liam R. E. Quin)
Date: Mon Jun  7 17:18:56 2004
Subject: Typography tutorial slides on the web
Message-ID: <Pine.BSI.3.96r.991231001701.12306B-100000@shell1.interlog.com>

I've made the slides from my Typography tutorial available at
http://www.valinor.sorcery.net/~liam/papers/
The slides for the Information Retrieval talk are also there,
but they are 90% content-free, unless you are interested in
gooseberries.

You need the Mrs Eaves font set (including JustLigatures) from
Emigre, www.emigre.com, to be able to read these slides.

I've not had a chance to see if I can make a pdf file, but
intend to look into it.  Emigre does not in general allow
embedding of their fonts, especially for posting on the net,
since it's possible to steal the fonts out of the PDF.

Lee

-- 
Liam Quin, Barefoot Computing, Toronto;  The barefoot agitator
l i a m    at    h o l o w e b    dot    n e t
Ankh on irc.sorcery.net, http://www.valinor.sorcery.net/~liam/




From ashish_agarwal at msdc.hcltech.com  Fri Dec 31 05:56:50 1999
From: ashish_agarwal at msdc.hcltech.com (Ashish Agarwal,AMB Chennai)
Date: Mon Jun  7 17:18:56 2004
Subject: Problems with XML
Message-ID: <21FCEFDE42DFD211A1A10007250603B2EB6950@PLUTO>

> Switch from current encoding to specified encoding not supported. 
> Hi all,
> I am trying to send an XML string to a VB component from my ASP code and
> then write it to a file.
> The problems that I face are:
> 
> 1) 
> the line : 
> <?xml version="1.0" encoding="UTF-8"?>
> is written on the file as 
> "<?xml version=""1.0"" encoding=""UTF-8""?>"
> 
> I do not want the inverted commas to appear. Any suggestions?
> 
> 2) I read an XML file in my VB component and pass it as a string to my ASP
> program.
> Here, I get the string in the right format, but when loading it to the DOM
> using 
> 
> doc1.loadXML(resultFromVBCom), I receive an error mentioning: 
> Switch from current encoding to specified encoding not supported
> Please help me...
> 
> 
> Regards,
> Ashish Agarwal
> 
> 



From jlapp at webMethods.com  Thu Dec 30 01:43:44 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:56 2004
Subject: Hierarchical namespaces?
Message-ID: <3.0.32.19991229204452.01a4c260@nexus.webmethods.com>

I've been thinking about the utility of naming all elements and most
attributes using a namespace-URI/local-name pair.  Let's denote such a
name as (namespace, local-name).  (I say "most" attributes because it
won't name anything in the per-element-type partition.)  Seems to me
that filtering operations would commonly extract names belonging to a
particular namespace, so requests for (namespace, *) might be pretty
common.  Let's look at this more closely...

Suppose I'm defining elements that describe electronics parts.  I'm
going to want to organize them hierarchically.  For example:

  www.parts.com/computer/memory/sram
  www.parts.com/computer/memory/dram
  www.parts.com/computer/cpus/intel
  www.parts.com/computer/cpus/amd
  www.parts.com/stereo/speaker/surround
  www.parts.com/stereo/speaker/subwoofer

etc.

It may make sense for one application to examine all computer parts,
another to examine all computer memory parts, and so on.  If I want all
memory parts I have to know all the pertinent namespace URIs.  If I
know that the URIs are structured hierarchically, I could do a wildcard
search on the URI itself -- assuming I had a tool that let me do so (do
any yet?).

But because URIs allow this, the next guy organizes his namespaces
differently:

  www.nextguy.com/computer-memory-sram
  www.nextguy.com/computer-memory-dram
  www.nextguy.com/computer-cpus-intel
  www.nextguy.com/computer-cpus-amd
  www.nextguy.com/stereo-speaker-surround
  www.nextguy.com/stereo-speaker-subwoofer

And the next next guy does so as follows:

  www.nextnextguy.com/computer?memory=true+type=sram
  www.nextnextguy.com/computer?memory=true+type=dram
  www.nextnextguy.com/computer?cputype=intel
  www.nextnextguy.com/computer?cputype=amd
  www.nextnextguy.com/stereo/speaker?surround
  www.nextnextguy.com/stereo/speaker?subwoofer

To make namespace filtering work for the general case requires
regex-like matching capabilities.  And regex matching isn't very easy
to optimize for performance (such as via indexing).  It also isn't the
kind of thing we want the average XML user to have to learn -- seems to
me that it would have to bubble up to the user interface, at least on
generic XML tools.

So I'm thinking that we need a *standard* way to organize namespaces
hierarchically, and that we need one before namespace usage is so
widespread that we absolutely have to provide regex support.

But maybe I'm jumping the gun.  I haven't yet heard anyone scream out
in pain, though I'm not sure we should be waiting for pain to come.

--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From costello at mitre.org  Fri Dec 31 12:56:34 1999
From: costello at mitre.org (Roger Costello)
Date: Mon Jun  7 17:18:56 2004
Subject: XML Schemas and Namespaces
References: <33D189919E89D311814C00805F1991F7F4AAF5@RED-MSG-08>
Message-ID: <386CA878.67743310@mitre.org>

Hi Folks,

I have a couple of questions with regards to the use of namespaces in
XML Schemas.

1.  As has been recently discussed, the method for an XML instance
document to indicate the XML Schema that it conforms to is with the
schemaLocation attribute.  For example:

<?xml version="1.0"?>
<BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
               xsi:schemaLocation=
                "http://www.somewhere.org/BookCatalogue
                 http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
        ...
</BookCatalogue>

At the root element (BookCatalogue) of this XML instance document I am
using schemaLocation to indicate the XML Schema that it conforms to.

The problem is this: when I defined BookCatalogue (in BookCatalogue.xsd)
I didn't define any attributes for it.  I certainly didn't define
xmlns:xsi nor xsi:schemaLocation as attributes.  Thus, this XML
instance document is invalid, right?

The nice thing about DOCTYPE was that it separated the mechanism for
declaring the associated schema (i.e., the DTD) from the information
items (i.e., the elements).  With schemaLocation the mechanism for
declaring the associated schema is intertwined with the information
items.  

Thus, it seems that when an XML Schema is written the author must try to
anticipate how instance documents will use it and add in xmlns:xsi and
xsi:schemaLocation attributes to the elements being defined in the
schema.  For my example, I would need to define BookCatalogue as:

    <element name="BookCatalogue">
        <type>
             <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
             <attribute name="xmlns:xsi" type="URI"/>
             <attribute name="xsi:schemaLocation" type="string"/>
        </type>
    </element>

I must be misunderstanding something fundamental.  This is obviously
ridiculous.
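
One data point that may help, sketched in Python: what a namespace-aware
parser reports for that root element.  This shows only what the parser
exposes to an application; it does not settle what the draft requires of a
validator.

```python
import xml.etree.ElementTree as ET

DOC = """<BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
    xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
        http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd"/>"""

root = ET.fromstring(DOC)

# xmlns:xsi is consumed as a namespace declaration and is not reported as an
# attribute; schemaLocation is reported under the instance namespace rather
# than under the namespace of BookCatalogue's own vocabulary.
print(root.tag)
print(sorted(root.attrib))
```

So at this level xmlns:xsi is not an attribute of BookCatalogue at all, and
xsi:schemaLocation is set apart from the catalogue's own attributes by its
namespace.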


2. My second question has to do with referencing elements within an XML
Schema.   Consider this schema:

<?xml version="1.0"?>
<!DOCTYPE schema SYSTEM "xml-schema.dtd"[
<!ATTLIST schema xmlns:cat CDATA #IMPLIED>
]>
<schema xmlns="http://www.w3.org/1999/XMLSchema"
               targetNamespace="http://www.somewhere.org/BookCatalogue"
               xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <type>
             <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>
    <element name="Book">
        <type>
            <element ref="cat:Title"/>
            <element ref="cat:Author"/>
            <element ref="cat:Date"/>
            <element ref="cat:ISBN"/>
            <element ref="cat:Publisher"/>
        </type>
    </element>
    <element name="Title" type="string"/>
    <element name="Author" type="string"/>
    <element name="Date" type="date"/>
    <element name="ISBN" type="string"/>
    <element name="Publisher" type="string"/>
</schema>

Note that we define the Book element, and in the BookCatalogue element it
is referenced using cat:Book:

    <element name="BookCatalogue">
        <type>
             <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>

My understanding is that the reason for prefixing Book with cat: is to
indicate "the Book element that we are referencing comes from the cat:
namespace".   The cat: namespace is defined at the top of the schema to
be the same as the targetNamespace.  Thus, the cat: namespace refers to
this schema document.

Here's my question:  it appears to me that namespaces are being used
here to "point" to things.  In this case, cat: is "pointing" to the
current document (the XML Schema).  Isn't this a violation of the
namespace spec, which says that there is no guarantee that there is
anything at the URI referenced by a namespace?

/Roger




From Sophie.Mabilat at apitech.fr  Fri Dec 31 14:32:30 1999
From: Sophie.Mabilat at apitech.fr (Sophie MABILAT)
Date: Mon Jun  7 17:18:56 2004
Subject: Beginner's questions...
Message-ID: <B1C8643B3AB0D21180250000C0B179CD05B6DA@JUPITER>

Hello!
Sorry, but I am just beginning with XML and I am having some
difficulties. I use IE5.

1- I would like to transform an XML document into another XML
document.

Initial document:
<doc>
    <e att1="val1" att2="val2" .../>
    <e att1="val3" att2="val4" .../>
    ...
</doc>

Final document to produce:
<docother>
    <e>
        <att1>val1</att1>
        <att2>val2</att2>
        ...
    </e>
    <e>
        <att1>val3</att1>
        <att2>val4</att2>
        ...
    </e>
    ...
</docother>

I think I should use an XSL transformation, but all my attempts fail.

2- I would like to know how to produce an XML document from a given
schema and a recordset containing the data to structure.

3- Is it possible to update linked tables in a database by saving rows
in an XML document and then updating the database with the XML data? If
so, as I hope, is someone able to help me do this?

4- Could someone tell me how I can model a foreign key between 2 tables
in a model like:
<database>
    <table>
        <row>
            <column1>...</column1>
            <column2>...</column2>
            ...
        </row>
        ...
    </table>
    ...
</database>

Thank you for your help!
S. MABILAT
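
The mapping asked about in point 1 (each attribute of an <e> element
becomes a child element of the same name) can be sketched as follows.
This is not XSL and not an IE5 solution; it is only a minimal Python
illustration of the same mapping, assuming the standard
xml.etree.ElementTree module. The function name attributes_to_elements
and the sample input are made up for the example.

```python
import xml.etree.ElementTree as ET


def attributes_to_elements(doc_root, new_root_tag="docother"):
    """Build a new tree in which every attribute of each <e> child
    becomes a child element holding the attribute value as text."""
    out = ET.Element(new_root_tag)
    for e in doc_root.findall("e"):
        new_e = ET.SubElement(out, e.tag)
        for name, value in e.attrib.items():
            ET.SubElement(new_e, name).text = value
    return out


# Hypothetical sample input modelled on the "initial document" above.
source = ET.fromstring(
    '<doc><e att1="val1" att2="val2"/><e att1="val3" att2="val4"/></doc>'
)
result = ET.tostring(attributes_to_elements(source), encoding="unicode")
print(result)
```

The same idea carries over to a stylesheet: one rule for <e> that loops
over its attributes and emits an element named after each one.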




From jlapp at webMethods.com  Fri Dec 31 15:07:50 1999
From: jlapp at webMethods.com (Joe Lapp)
Date: Mon Jun  7 17:18:56 2004
Subject: Hierarchical namespaces?
Message-ID: <3.0.32.19991231100901.0369eb60@nexus.webmethods.com>

Err.  How did that happen?  I only sent this bugger once.
The Y2K cataclysm has begun.

At 08:44 PM 12/29/99 -0500, Joe Lapp wrote:
>I've been thinking about the utility of naming all elements and most
>attributes using a namespace-URI/local-name pair.  Let's denote such a
>name as (namespace, local-name).  (I say "most" attributes because it
>won't name anything in the per-element-type partition.)  Seems to me
>that filtering operations would commonly extract names belonging to a
>particular namespace, so requests for (namespace, *) might be pretty
>common.  Let's look at this more closely...
>
>Suppose I'm defining elements that describe electronics parts.  I'm
>going to want to organize them hierarchically.  For example:
>
>  www.parts.com/computer/memory/sram
>  www.parts.com/computer/memory/dram
>  www.parts.com/computer/cpus/intel
>  www.parts.com/computer/cpus/amd
>  www.parts.com/stereo/speaker/surround
>  www.parts.com/stereo/speaker/subwoofer
>
>etc.
>
>It may make sense for one application to examine all computer parts,
>another to examine all computer memory parts, and so on.  If I want all
>memory parts I have to know all the pertinent namespace URIs.  If I
>know that the URIs are structured hierarchically, I could do a wildcard
>search on the URI itself -- assuming I had a tool that let me do so (do
>any yet?).
>
>But because URIs allow this, the next guy organizes his namespaces
>differently:
>
>  www.nextguy.com/computer-memory-sram
>  www.nextguy.com/computer-memory-dram
>  www.nextguy.com/computer-cpus-intel
>  www.nextguy.com/computer-cpus-amd
>  www.nextguy.com/stereo-speaker-surround
>  www.nextguy.com/stereo-speaker-subwoofer
>
>And the next next guy does so as follows:
>
>  www.nextnextguy.com/computer?memory=true+type=sram
>  www.nextnextguy.com/computer?memory=true+type=dram
>  www.nextnextguy.com/computer?cputype=intel
>  www.nextnextguy.com/computer?cputype=amd
>  www.nextnextguy.com/stereo/speaker?surround
>  www.nextnextguy.com/stereo/speaker?subwoofer
>
>To make namespace filtering work for the general case requires
>regex-like matching capabilities.  And regex matching isn't very easy
>to optimize for performance (such as via indexing).  It also isn't the
>kind of thing we want the average XML user to have to learn -- seems to
>me that it would have to bubble up to the user interface, at least on
>generic XML tools.
>
>So I'm thinking that we need a *standard* way to organize namespaces
>hierarchically, and that we need one before namespace usage is so
>widespread that we absolutely have to provide regex support.
>
>But maybe I'm jumping the gun.  I haven't yet heard anyone scream out
>in pain, though I'm not sure we should be waiting for pain to come.
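
To make the contrast above concrete, here is a minimal sketch, assuming
namespace names are compared as plain strings and that "/" is the agreed
hierarchy separator (that agreement is exactly the standard being asked
for; nothing in the namespace spec provides it). The helper name
in_subtree is hypothetical.

```python
import re


def in_subtree(namespace, root):
    """True if namespace equals root or lies below it, treating '/'
    as the hierarchy separator."""
    root = root.rstrip("/")
    return namespace == root or namespace.startswith(root + "/")


PARTS = [
    "www.parts.com/computer/memory/sram",
    "www.parts.com/computer/memory/dram",
    "www.parts.com/computer/cpus/intel",
    "www.parts.com/stereo/speaker/surround",
]

# With a path hierarchy, a prefix test is enough.
memory = [ns for ns in PARTS if in_subtree(ns, "www.parts.com/computer/memory")]
computer = [ns for ns in PARTS if in_subtree(ns, "www.parts.com/computer")]

NEXTGUY = [
    "www.nextguy.com/computer-memory-sram",
    "www.nextguy.com/computer-cpus-intel",
]

# Without one, each naming scheme needs its own hand-written pattern.
nextguy_memory = [
    ns for ns in NEXTGUY
    if re.fullmatch(r"www\.nextguy\.com/computer-memory-.+", ns)
]
```

A prefix test like in_subtree can be served by an ordinary sorted index,
which is the performance point made above; the per-site regular
expression cannot.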
>
>--
>Joe Lapp              (Looking for some good people to help design
>Principal Architect    and build the Internet's business-to-business
>webMethods, Inc.       XML infrastructure.  We are 100% Java.)
>jlapp@webMethods.com           http://www.webMethods.com
>
--
Joe Lapp              (Looking for some good people to help design
Principal Architect    and build the Internet's business-to-business
webMethods, Inc.       XML infrastructure.  We are 100% Java.)
jlapp@webMethods.com           http://www.webMethods.com



From ht at cogsci.ed.ac.uk  Fri Dec 31 15:32:18 1999
From: ht at cogsci.ed.ac.uk (Henry S. Thompson)
Date: Mon Jun  7 17:18:57 2004
Subject: XML Schemas and Namespaces
In-Reply-To: Roger Costello's message of "Fri, 31 Dec 1999 07:58:32 -0500"
References: <33D189919E89D311814C00805F1991F7F4AAF5@RED-MSG-08> <386CA878.67743310@mitre.org>
Message-ID: <f5byaab3z40.fsf@cogsci.ed.ac.uk>

Roger Costello <costello@mitre.org> writes:

> Hi Folks,
> 
> I have a couple of questions with regards to the use of namespaces in
> XML Schemas.
> 
> 1.  As has been recently discussed, the method for an XML instance
> document to indicate the XML Schema that it conforms to is with the
> schemaLocation attribute.  For example:
> 
> <?xml version="1.0"?>
> <BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
                  xmlns="http://www.somewhere.org/BookCatalogue"
>               xsi:schemaLocation=
>                "http://www.somewhere.org/BookCatalogue
>                 http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
>         ...
> </BookCatalogue>
> 
> At the root element (BookCatalogue) of this XML instance document I am
> using schemaLocation to indicate the XML Schema that it conforms to.
> 
> The problem is this: when I defined BookCatalogue (in BookCatalogue.xsd)
> I didn't define any attributes for it.  I certainly didn't define
> xmlns:xsi nor xsi:schemaLocation as attributes.  Thus, this XML
> instance document is invalid, right?

No, it's fine.  Note it has no DTD, so validity (an XML 1.0 concept)
is not relevant.  Defining a DTD for it, which appropriately allowed
for namespace prefixes, would be possible but tedious.

It's SCHEMA-valid (or at least it's not obviously NOT schema-valid,
given the addition of a default namespace declaration as above)
because
  a) xmlns:xsi and xmlns are not attributes, they are namespace
     declarations, and they're just fine as such:  no declarations for them 
     are required in BookCatalogue.xsd;
  b) xsi:schemaLocation is an attribute, but by definition such an
     attribute is always schema-valid provided its contents are coherent,
     which they are in this case.
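
As a rough check of points (a) and (b), here is a minimal sketch using
a namespace-aware parser.  It assumes Python's standard
xml.etree.ElementTree, which is not a schema processor; the names
INSTANCE, XSI and attribute_names are made up for this illustration.
It only shows that the xmlns declarations are consumed as namespace
bindings, so the one attribute reported on BookCatalogue is
xsi:schemaLocation:

```python
import xml.etree.ElementTree as ET

# Namespace URI of the 1999 draft "xsi" namespace used in this thread.
XSI = "http://www.w3.org/1999/XMLSchema/instance"

INSTANCE = """<?xml version="1.0"?>
<BookCatalogue xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
               xmlns="http://www.somewhere.org/BookCatalogue"
               xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
                http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd">
</BookCatalogue>"""


def attribute_names(xml_text):
    """Return the attribute names the parser reports on the root element.

    ElementTree writes a namespaced name as "{namespace-URI}local-name".
    """
    root = ET.fromstring(xml_text)
    return list(root.attrib)


names = attribute_names(INSTANCE)
print(names)
```

Whether that remaining attribute is schema-valid is a separate question,
answered by (b) above, not by the parser.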

> The nice thing about DOCTYPE was that it separated the mechanism for
> declaring the associated schema (i.e., the DTD) from the information
> items (i.e., the elements).  With schemaLocation the mechanism for
> declaring the associated schema is intertwined with the information
> items.  
> 
> Thus, it seems that when an XML Schema is written the author must try to
> anticipate how instance documents will use it and add in xmlns:xsi and
> xsi:schemaLocation attributes to the elements being defined in the
> schema.  For my example, I would need to define BookCatalogue as:
> 
>     <element name="BookCatalogue">
>         <type>
>              <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
>              <attribute name="xmlns:xsi" type="URI"/>
>              <attribute name="xsi:schemaLocation" type="string"/>
>         </type>
>     </element>
> 
> I must be misunderstanding something fundamental.  This is obviously
> ridiculous.

I hope the comments above clarify that you don't need either of those
attribute declarations.

> 2. My second question has to do with referencing elements within an XML
> Schema.   Consider this schema:
> 
> <?xml version="1.0"?>
> <!DOCTYPE schema SYSTEM "xml-schema.dtd"[
> <!ATTLIST schema xmlns:cat CDATA #IMPLIED>
> ]>
> <schema xmlns="http://www.w3.org/1999/XMLSchema"
>                targetNamespace="http://www.somewhere.org/BookCatalogue"
>                xmlns:cat="http://www.somewhere.org/BookCatalogue">
>     <element name="BookCatalogue">
>         <type>
>              <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
>         </type>
>     </element>
>     <element name="Book">
>         <type>
>             <element ref="cat:Title"/>
>             <element ref="cat:Author"/>
>             <element ref="cat:Date"/>
>             <element ref="cat:ISBN"/>
>             <element ref="cat:Publisher"/>
>         </type>
>     </element>
>     <element name="Title" type="string"/>
>     <element name="Author" type="string"/>
>     <element name="Date" type="date"/>
>     <element name="ISBN" type="string"/>
>     <element name="Publisher" type="string"/>
> </schema>
> 
> Note that we define the Book element, and in the BookCatalogue element it
> is referenced using cat:Book:
> 
>     <element name="BookCatalogue">
>         <type>
>              <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
>         </type>
>     </element>
> 
> My understanding is that the reason for prefixing Book with cat: is to
> indicate "the Book element that we are referencing comes from the cat:
> namespace".   The cat: namespace is defined at the top of the schema to
> be the same as the targetNamespace.  Thus, the cat: namespace refers to
> this schema document.

I'd say, more carefully: "The prefix cat denotes a namespace URI which 
is the same as the namespace URI identifying the target namespace of
this schema.  Thus references to schema components in that namespace
refer to components defined in this schema."
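
A minimal sketch of that gloss, assuming Python's standard
xml.etree.ElementTree and an abbreviated copy of the schema (DOCTYPE
dropped, Book reduced to a simple element).  The helper names
prefix_bindings and expand are hypothetical, and treating an unprefixed
name as belonging to the default namespace is an assumption of the
sketch.  Nothing here dereferences a namespace URI; the URIs are only
compared as strings:

```python
import io
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/1999/XMLSchema}"

SCHEMA = """<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.somewhere.org/BookCatalogue"
        xmlns:cat="http://www.somewhere.org/BookCatalogue">
    <element name="BookCatalogue">
        <type>
            <element ref="cat:Book" minOccurs="0" maxOccurs="*"/>
        </type>
    </element>
    <element name="Book" type="string"/>
</schema>"""


def prefix_bindings(xml_text):
    """Collect prefix -> namespace URI bindings (scoping is ignored)."""
    bindings = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, (prefix, uri) in ET.iterparse(source, events=("start-ns",)):
        bindings[prefix] = uri
    return bindings


def expand(qname, bindings):
    """Turn 'prefix:local' into a (namespace URI, local name) pair."""
    prefix, sep, local = qname.partition(":")
    if not sep:  # unprefixed: assume the default namespace, if any
        return (bindings.get(""), qname)
    return (bindings[prefix], local)


root = ET.fromstring(SCHEMA)
bindings = prefix_bindings(SCHEMA)
target = root.get("targetNamespace")

# Components this schema defines, named in its target namespace.
declared = {(target, el.get("name")) for el in root.findall(XS + "element")}
# References made from inside content models.
refs = [expand(el.get("ref"), bindings)
        for el in root.iter(XS + "element") if el.get("ref")]
```

Here refs holds one pair whose URI equals the target namespace, and that
pair is a member of declared; the prefix itself plays no further part.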

> Here's my question:  it appears to me that namespaces are being used
> here to "point" to things.  In this case, cat: is "pointing" to the
> current document (the XML Schema).  Isn't this a violation of the
> namespace spec, which says that there is no guarantee that there is
> anything at the URI referenced by a namespace?

The fact that you can't depend on dereferencing a namespace URI is
fundamental to our design.  I hope the above gloss helps clarify that
we're not cheating here.  It may be helpful to consider the
intermediate case of the <import> concept.  Here are some excerpts
from the schema for schemas, but they could be from BookCatalogue.xsd:

<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.w3.org/1999/XMLSchema"
        xmlns:x="http://www.w3.org/XML/1998/namespace">

 <import namespace="http://www.w3.org/XML/1998/namespace"
         schemaLocation="http://www.w3.org/XML/1998/xml.xsd"/>

  <element name="info">
    <type content="mixed">
      <any minOccurs="0" maxOccurs="*"/>
      <attribute name="source" type="uri"/>
      <attributeGroup ref="x:lang"/>
    </type>
  </element>
</schema>

The <attributeGroup> element references a group named 'lang' in a
namespace with the namespace URI
"http://www.w3.org/XML/1998/namespace", which we recognise as the
namespace for XML itself.  The import statement tells us we can find a
schema for the namespace with that namespace URI at
http://www.w3.org/XML/1998/xml.xsd, and indeed if you look there you
will find a schema with a declaration of an attributeGroup named
'lang'.  In other words, <import> establishes the connection between a
namespace URI used in explicit schema references and a schema which
discharges those references, in much the same way that
'xsi:schemaLocation' establishes the connection between the namespace
URI used in IMPLICIT schema references in an instance and a schema
which discharges them.
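
The parallel can be sketched in a few lines, assuming Python's standard
xml.etree.ElementTree and assuming the xsi:schemaLocation value is a
whitespace-separated list of (namespace URI, location) pairs as in the
BookCatalogue instance.  The function names are hypothetical, and the
schema text is trimmed to its <import>.  Both routes end in the same
kind of table, from namespace URI to a schema that discharges
references into it:

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/1999/XMLSchema}"
XSI = "{http://www.w3.org/1999/XMLSchema/instance}"

SCHEMA = """<schema xmlns="http://www.w3.org/1999/XMLSchema"
        targetNamespace="http://www.w3.org/1999/XMLSchema">
  <import namespace="http://www.w3.org/XML/1998/namespace"
          schemaLocation="http://www.w3.org/XML/1998/xml.xsd"/>
</schema>"""

INSTANCE = """<BookCatalogue
    xmlns:xsi="http://www.w3.org/1999/XMLSchema/instance"
    xmlns="http://www.somewhere.org/BookCatalogue"
    xsi:schemaLocation="http://www.somewhere.org/BookCatalogue
      http://www.somewhere.org/BookCatalogue/BookCatalogue.xsd"/>"""


def locations_from_imports(schema_root):
    """Namespace URI -> location, from explicit <import> elements."""
    return {imp.get("namespace"): imp.get("schemaLocation")
            for imp in schema_root.findall(XS + "import")}


def locations_from_instance(instance_root):
    """Namespace URI -> location, from xsi:schemaLocation pairs."""
    tokens = instance_root.get(XSI + "schemaLocation", "").split()
    return dict(zip(tokens[0::2], tokens[1::2]))


from_schema = locations_from_imports(ET.fromstring(SCHEMA))
from_instance = locations_from_instance(ET.fromstring(INSTANCE))
```

Neither function fetches anything; each only records where a schema for
a given namespace URI is said to be found.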

To close the conceptual loop, you can think of the 'targetNamespace'
attribute on a schema as providing the wherewithal for an implied
<import> statement, e.g.

  <import namespace="http://www.somewhere.org/BookCatalogue"
          schemaLocation=""/>

This is just what is meant by saying that every schema is taken to be
defining components in its target namespace.

Hope this helps,

ht

Note I've tried to be careful to distinguish four things in my answers 
above:

namespaces;
schemas;
namespace URIs;
prefixes.

Although doing this makes things more prolix, it avoids
misunderstandings, and I commend it to you in messages on this topic.
-- 
  Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
     2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
	    Fax: (44) 131 650-4587, e-mail: ht@cogsci.ed.ac.uk
		     URL: http://www.ltg.ed.ac.uk/~ht/
